MRUnit Should Sort Reduce Input
-------------------------------

Key: MAPREDUCE-1216
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1216
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.20.1
Environment: Cloudera Distribution for Hadoop 0.20.1 + 133
Reporter: Ed Kohlwey

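MRUnit should sort the input for a reduce task, the same way Hadoop does. This is useful if you have a reduce task that, for instance, removes duplicate key/value pairs by comparing each value with the previous one.

Example:

{code:java}
class BadReducer extends Reducer<Text, Text, Text, Text> {
  @Override
  public void reduce(Text key, Iterable<Text> values, Context context)
      throws IOException, InterruptedException {
    // Remember the previous value so that adjacent duplicates are skipped.
    Text last = new Text();
    for (Text text : values) {
      if (!text.equals(last)) {
        context.write(key, text);
        last.set(text);
      }
    }
  }
}
{code}

{code:java}
ReduceDriver<Text, Text, Text, Text> driver =
    new ReduceDriver<Text, Text, Text, Text>(new BadReducer());
driver.setInputKey(new Text("foo"));
driver.addInputValue(new Text("bar"));
driver.addInputValue(new Text("bar"));
driver.addInputValue(new Text("foo"));
{code}

produces different results than

{code:java}
ReduceDriver<Text, Text, Text, Text> driver =
    new ReduceDriver<Text, Text, Text, Text>(new BadReducer());
driver.setInputKey(new Text("foo"));
driver.addInputValue(new Text("bar"));
driver.addInputValue(new Text("foo"));
driver.addInputValue(new Text("bar"));
{code}

The first driver emits "bar" and "foo" for the key; the second emits "bar", "foo", and "bar" again, because the duplicate "bar" values are not adjacent. If the driver sorted the values before calling the reducer, both would emit "bar" and "foo".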
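
The order dependence can be reproduced without Hadoop or MRUnit on the classpath. The following is a minimal sketch in plain Java, not MRUnit code: it uses String lists in place of Text values, and the class OrderSensitivityDemo and the helpers dedupeAdjacent and dedupeSorted are hypothetical names introduced only for illustration. It assumes that sorting the values before the reducer sees them is the behavior being requested.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class OrderSensitivityDemo {

    // Hypothetical helper: the same adjacent-duplicate logic as BadReducer,
    // applied to plain strings instead of Hadoop Text values.
    static List<String> dedupeAdjacent(List<String> values) {
        List<String> out = new ArrayList<String>();
        String last = "";
        for (String value : values) {
            if (!value.equals(last)) {
                out.add(value);
                last = value;
            }
        }
        return out;
    }

    // Hypothetical helper: sort a copy of the values first, which is what
    // the issue asks the test driver to do before invoking the reducer.
    static List<String> dedupeSorted(List<String> values) {
        List<String> sorted = new ArrayList<String>(values);
        Collections.sort(sorted);
        return dedupeAdjacent(sorted);
    }

    public static void main(String[] args) {
        List<String> first = Arrays.asList("bar", "bar", "foo");
        List<String> second = Arrays.asList("bar", "foo", "bar");

        System.out.println(dedupeAdjacent(first));   // [bar, foo]
        System.out.println(dedupeAdjacent(second));  // [bar, foo, bar]
        System.out.println(dedupeSorted(second));    // [bar, foo]
    }
}
```

With unsorted input the two orderings disagree; once the values are sorted, both orderings give the same output, so a test written against the driver would no longer depend on the order in which values were added.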