[ https://issues.apache.org/jira/browse/HADOOP-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Owen O'Malley resolved HADOOP-1827.
-----------------------------------

    Resolution: Won't Fix

> Reducer.reduce method's OutputCollector is too strict, it shouldn't need the
> key to be WritableComparable
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1827
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1827
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Arun C Murthy
>
> The output of the {{Reducer}}'s reduce method is *not* sorted, hence the
> {{OutputCollector}} passed to it shouldn't require the *key* to be
> {{WritableComparable}}; passing a {{Writable}} should suffice.
> Thus
> {code:title=Reducer.java}
> public interface Reducer<K2 extends WritableComparable, V2 extends Writable,
>                          K3 extends WritableComparable, V3 extends Writable>
>     extends JobConfigurable, Closeable {
>   void reduce(K2 key, Iterator<V2> values, OutputCollector<K3, V3> output,
>               Reporter reporter)
>     throws IOException;
> }
> {code}
> should, technically, be:
> {code:title=Reducer.java}
> public interface Reducer<K2 extends WritableComparable, V2 extends Writable,
>                          K3 extends Writable, V3 extends Writable>
>     extends JobConfigurable, Closeable {
>   void reduce(K2 key, Iterator<V2> values, OutputCollector<K3, V3> output,
>               Reporter reporter)
>     throws IOException;
> }
> {code}
> Pros:
> It removes an artificial limitation that forces applications to emit a
> <{{WritableComparable}}, {{Writable}}> pair rather than a <{{Writable}},
> {{Writable}}> pair, thereby simplifying some applications (I ran into a few
> recently... admittedly trivial ones).
> Cons:
> 1. We now need a separate {{Combiner}} interface, since the combiner's
> {{OutputCollector}} *needs* to be able to sort keys and hence requires a
> {{WritableComparable}}, the same as the {{Mapper}}.
> 2. We need a separate {{SortableOutputCollector}} (for
> {{Mapper}}/{{Combiner}}) and a {{NonSortableOutputCollector}} (for
> {{Reducer}}).
> 3. Alas! As a consequence of (1) & (2), we cannot use the same class as both a
> {{Reducer}} and a {{Combiner}} anymore, a serious compatibility issue.
> The purpose of this issue is two-fold:
> 1. Spark a discussion among folks, both hadoop-dev & hadoop-users, to figure
> out whether this really is a problem, i.e. do folks really care about this
> anomaly in the existing {{Reducer}} interface? Also, is it worth the pain
> (@see 'Cons') to fix it?
> 2. Even if we decide to live with it, this issue could record for posterity
> why we love hadoop, warts and all. *smile*
> Let's discuss...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
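
For anyone reading this in the archives, the trade-off in the description can be sketched in plain Java without Hadoop on the classpath. Everything below is a hypothetical stand-in, not the real org.apache.hadoop API: the {{Writable}}, {{WritableComparable}} and {{OutputCollector}} interfaces are simplified (generic rather than raw, no serialization methods, no {{Reporter}}), and {{RelaxedReducer}}, {{Combiner}}, {{WordKey}}, {{Count}}, {{Summary}} and {{SumToSummary}} are names invented for illustration. It only shows that a relaxed {{K3}} bound lets a reducer emit a non-comparable key, while a combiner-style interface would have to keep the stricter bound.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Driver for the sketch; all types below are hypothetical stand-ins.
final class ReducerBoundsSketch {
    // Runs the relaxed reducer once and returns what it emitted.
    static List<String> run() {
        List<String> out = new ArrayList<>();
        OutputCollector<Summary, Count> collector =
            (k, v) -> out.add(k.label + "=" + v.n);
        Iterator<Count> values = Arrays.asList(new Count(1), new Count(2)).iterator();
        new SumToSummary().reduce(new WordKey("hadoop"), values, collector);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}

// Simplified stand-ins for the Hadoop marker types (assumption: no I/O methods).
interface Writable {}
interface WritableComparable<T> extends Writable, Comparable<T> {}

interface OutputCollector<K extends Writable, V extends Writable> {
    void collect(K key, V value);
}

// The relaxed signature from the issue: the output key K3 only needs to be Writable.
interface RelaxedReducer<K2 extends WritableComparable<K2>, V2 extends Writable,
                         K3 extends Writable, V3 extends Writable> {
    void reduce(K2 key, Iterator<V2> values, OutputCollector<K3, V3> output);
}

// Con (1): a combiner's output is sorted again, so its K3 keeps the strict bound.
// A class whose output key is only Writable cannot implement this interface.
interface Combiner<K2 extends WritableComparable<K2>, V2 extends Writable,
                   K3 extends WritableComparable<K3>, V3 extends Writable>
        extends RelaxedReducer<K2, V2, K3, V3> {}

final class WordKey implements WritableComparable<WordKey> {
    final String word;
    WordKey(String word) { this.word = word; }
    @Override public int compareTo(WordKey other) { return word.compareTo(other.word); }
}

final class Count implements Writable {
    final int n;
    Count(int n) { this.n = n; }
}

// An output key that is Writable but deliberately not comparable.
final class Summary implements Writable {
    final String label;
    Summary(String label) { this.label = label; }
}

// Legal only under the relaxed bound: emits a <Writable, Writable> pair.
final class SumToSummary implements RelaxedReducer<WordKey, Count, Summary, Count> {
    @Override
    public void reduce(WordKey key, Iterator<Count> values,
                       OutputCollector<Summary, Count> output) {
        int total = 0;
        while (values.hasNext()) {
            total += values.next().n;
        }
        output.collect(new Summary("total(" + key.word + ")"), new Count(total));
    }
}
```

In this sketch, declaring {{SumToSummary}} as a {{Combiner}} would fail to compile because {{Summary}} is not a {{WritableComparable}}, which is the compatibility cost listed under con (3).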