[ https://issues.apache.org/jira/browse/MAPREDUCE-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805618#action_12805618 ]
Alan Gates commented on MAPREDUCE-1126:
---------------------------------------

bq. [From Owen] My assertion is that leaving the type as the primary instrument of the user in defining the job is correct. I haven't talked to any users that care about using a non-default serializer for a given type.

Pig would like to. For scalar types Pig uses Java String, Long, Integer, etc. But default Java serialization is slow, so currently we convert these to and from Writables as we cross the Map and Reduce boundaries to get the faster Writable serialization. If we could instead define an alternate serializer and avoid these conversions, it would make our code simpler and should perform better.

> shuffle should use serialization to get comparator
> --------------------------------------------------
>
>                 Key: MAPREDUCE-1126
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1126
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: task
>            Reporter: Doug Cutting
>            Assignee: Aaron Kimball
>             Fix For: 0.22.0
>
>         Attachments: MAPREDUCE-1126.2.patch, MAPREDUCE-1126.3.patch,
>                      MAPREDUCE-1126.4.patch, MAPREDUCE-1126.5.patch,
>                      MAPREDUCE-1126.6.patch, MAPREDUCE-1126.patch
>
>
> Currently the key comparator is defined as a Java class. Instead we should
> use the Serialization API to create key comparators. This would permit,
> e.g., Avro-based comparators to be used, permitting efficient sorting of
> complex data types without having to write a RawComparator in Java.
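
For illustration only, here is a rough sketch of the idea being discussed: a serialization that hands out both the byte encoding and the comparator for a plain java.lang.Long key, so no Writable wrapper is needed and the sort runs on serialized bytes. The KeySerialization interface, LongSerialization class, and sortViaSerialization method are hypothetical names invented for this sketch. They are not the Hadoop Serialization API and not what any attached patch does.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SerializationComparatorSketch {

    /** Hypothetical stand-in: a serialization that also supplies the sort order. */
    interface KeySerialization<T> {
        byte[] serialize(T key);
        T deserialize(byte[] bytes);
        /** Comparator over serialized bytes; keys are not deserialized to compare. */
        Comparator<byte[]> rawComparator();
    }

    /** Encodes java.lang.Long directly, with no wrapper type in between. */
    static final class LongSerialization implements KeySerialization<Long> {
        public byte[] serialize(Long key) {
            // 8-byte big-endian encoding (ByteBuffer's default byte order).
            return ByteBuffer.allocate(Long.BYTES).putLong(key).array();
        }

        public Long deserialize(byte[] bytes) {
            return ByteBuffer.wrap(bytes).getLong();
        }

        public Comparator<byte[]> rawComparator() {
            // Read the primitive long out of each buffer and compare as signed values.
            return (a, b) -> Long.compare(
                    ByteBuffer.wrap(a).getLong(),
                    ByteBuffer.wrap(b).getLong());
        }
    }

    /** Serialize each key once, sort the raw bytes, then decode the result. */
    static <T> List<T> sortViaSerialization(List<T> keys, KeySerialization<T> ser) {
        List<byte[]> raw = new ArrayList<>();
        for (T key : keys) {
            raw.add(ser.serialize(key));
        }
        raw.sort(ser.rawComparator());
        List<T> sorted = new ArrayList<>();
        for (byte[] bytes : raw) {
            sorted.add(ser.deserialize(bytes));
        }
        return sorted;
    }

    public static void main(String[] args) {
        List<Long> sorted = sortViaSerialization(
                Arrays.asList(42L, -7L, 3L), new LongSerialization());
        System.out.println(sorted); // prints [-7, 3, 42]
    }
}
```

The point of the sketch is only that the job would name a serialization, and the framework would ask it for the comparator instead of looking one up by key class.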