[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805732#action_12805732
 ] 

Jeff Hammerbacher commented on MAPREDUCE-1126:
----------------------------------------------

bq. Especially for frameworks written on top of MapReduce, less restrictive 
interfaces here would surely be fertile ground for performance improvements.

bq. Writing wrappers can be irritating, but for the MR API, I'd rather make it 
easier on common cases and users than on advanced uses and framework authors.

Great points, Chris. Yahoo! has stated that a significant majority of their 
MapReduce jobs are written in Pig, and Facebook says the same of Hive. Among 
our many customers at Cloudera, it's far more common to target the MapReduce 
execution engine with a higher-level language than with the Java API. What 
you propose as the common case, then, appears to be uncommon in practice. 
Perhaps we should adjust our design criteria to match the usage data reported 
by the users of the project?

Thanks,
Jeff

> shuffle should use serialization to get comparator
> --------------------------------------------------
>
>                 Key: MAPREDUCE-1126
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1126
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: task
>            Reporter: Doug Cutting
>            Assignee: Aaron Kimball
>             Fix For: 0.22.0
>
>         Attachments: MAPREDUCE-1126.2.patch, MAPREDUCE-1126.3.patch, 
> MAPREDUCE-1126.4.patch, MAPREDUCE-1126.5.patch, MAPREDUCE-1126.6.patch, 
> MAPREDUCE-1126.patch, MAPREDUCE-1126.patch
>
>
> Currently the key comparator is defined as a Java class.  Instead we should 
> use the Serialization API to create key comparators.  This would permit, 
> e.g., Avro-based comparators to be used, permitting efficient sorting of 
> complex data types without having to write a RawComparator in Java.
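
For readers unfamiliar with what "writing a RawComparator in Java" involves, here is a minimal sketch of a hand-written byte-level comparator of the kind the issue description wants to make unnecessary. It is illustrative only: RawBytesComparator is a hypothetical local interface that mirrors the shape of Hadoop's org.apache.hadoop.io.RawComparator compare method so the example compiles without Hadoop, and the 4-byte big-endian int key encoding and the IntKeyComparator and serialize names are assumptions, not anything taken from the patches attached to this issue.

```java
import java.nio.ByteBuffer;

public class RawComparatorSketch {

    // Hypothetical stand-in that mirrors the byte-range compare method of
    // Hadoop's RawComparator, so this sketch has no Hadoop dependency.
    interface RawBytesComparator {
        int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2);
    }

    // Compares keys that are assumed to be serialized as 4-byte big-endian
    // signed ints, reading them straight from the buffers instead of
    // deserializing them into key objects first.
    static final class IntKeyComparator implements RawBytesComparator {
        @Override
        public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
            int left = ByteBuffer.wrap(b1, s1, l1).getInt();
            int right = ByteBuffer.wrap(b2, s2, l2).getInt();
            return Integer.compare(left, right);
        }
    }

    // Assumed key encoding for this sketch: 4 bytes, big-endian.
    static byte[] serialize(int key) {
        return ByteBuffer.allocate(4).putInt(key).array();
    }

    public static void main(String[] args) {
        RawBytesComparator cmp = new IntKeyComparator();
        byte[] a = serialize(-5);
        byte[] b = serialize(42);
        int result = cmp.compare(a, 0, a.length, b, 0, b.length);
        System.out.println(result < 0 ? "a sorts before b" : "a does not sort before b");
    }
}
```

Every key type with a custom binary layout needs its own comparator along these lines, which is the per-type work the proposal aims to hand to the serialization framework (for example, an Avro-based one) instead.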

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
