[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805625#action_12805625
 ] 

Ted Dunning commented on MAPREDUCE-1126:
----------------------------------------

{quote}
> [From Owen] My assertion is that leaving the type as the primary instrument
> of the user in defining the job is correct. I haven't talked to any users
> that care about using a non-default serializer for a given type.

Pig would like to. For scalar types Pig uses Java String, Long, Integer, etc. 
But default Java serialization is slow. So currently we convert these to and 
from Writables as we go across the Map and Reduce boundaries to get the faster 
Writable serialization. If we could instead define an alternate serializer and 
avoid these conversions it would make our code simpler and should perform 
better.
{quote}

I would like to as well.  I want to start using Avro as soon as possible for 
its greater expressive power.  I can't change all of my legacy code right away, 
though, so I will have some code that implements both Writable and Avro 
serialization.  I need to be able to use Writable for old code and Avro for 
new code.
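
To make the requirement concrete, here is a minimal sketch, in plain Java, of choosing a serializer by a per-job setting rather than by the Java type alone. Everything in it is hypothetical: RecordSerializer, LegacyRecord, SerializerRegistry and the names "writable-style" and "avro-style" are illustrative stand-ins, not Hadoop or Avro APIs, and the two encodings are placeholders rather than the real Writable or Avro wire formats.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a serializer; this is NOT Hadoop's Serialization API.
interface RecordSerializer<T> {
    byte[] serialize(T value);
}

// A legacy type that can be written in more than one wire format.
final class LegacyRecord {
    final String name;
    final int count;

    LegacyRecord(String name, int count) {
        this.name = name;
        this.count = count;
    }
}

// Maps a configured serializer name to an implementation for the same Java type.
final class SerializerRegistry {
    private final Map<String, RecordSerializer<LegacyRecord>> byName = new HashMap<>();

    void register(String name, RecordSerializer<LegacyRecord> serializer) {
        byName.put(name, serializer);
    }

    // The job names the serializer it wants; the type alone does not decide.
    RecordSerializer<LegacyRecord> forJob(String configuredName) {
        RecordSerializer<LegacyRecord> serializer = byName.get(configuredName);
        if (serializer == null) {
            throw new IllegalArgumentException("No serializer registered as " + configuredName);
        }
        return serializer;
    }

    // Placeholder encodings only: tab-separated text and a JSON-like string.
    static SerializerRegistry withDefaults() {
        SerializerRegistry registry = new SerializerRegistry();
        registry.register("writable-style",
            rec -> (rec.name + "\t" + rec.count).getBytes(StandardCharsets.UTF_8));
        registry.register("avro-style",
            rec -> ("{\"name\":\"" + rec.name + "\",\"count\":" + rec.count + "}")
                .getBytes(StandardCharsets.UTF_8));
        return registry;
    }
}
```

An old job would ask for forJob("writable-style") and a new one for forJob("avro-style"), both passing the same LegacyRecord class. A lookup keyed only on the type could not tell the two jobs apart.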



> shuffle should use serialization to get comparator
> --------------------------------------------------
>
>                 Key: MAPREDUCE-1126
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1126
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: task
>            Reporter: Doug Cutting
>            Assignee: Aaron Kimball
>             Fix For: 0.22.0
>
>         Attachments: MAPREDUCE-1126.2.patch, MAPREDUCE-1126.3.patch, 
> MAPREDUCE-1126.4.patch, MAPREDUCE-1126.5.patch, MAPREDUCE-1126.6.patch, 
> MAPREDUCE-1126.patch
>
>
> Currently the key comparator is defined as a Java class.  Instead we should 
> use the Serialization API to create key comparators.  This would permit, 
> e.g., Avro-based comparators to be used, permitting efficient sorting of 
> complex data types without having to write a RawComparator in Java.
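
The idea in the description above can be sketched in plain Java as a serialization that also supplies the comparator for its own byte format. This is an illustration under stated assumptions, not the patch attached to this issue: SortableSerialization and NonNegativeIntSerialization are hypothetical names, and the encoding (4-byte big-endian, non-negative ints only) was chosen so that unsigned byte order matches numeric order.

```java
import java.nio.ByteBuffer;
import java.util.Comparator;

// Hypothetical interface: the serialization, not a separately configured
// class, is the source of the comparator used for sorting serialized keys.
interface SortableSerialization<T> {
    byte[] serialize(T value);

    // Compares two serialized keys without deserializing them.
    Comparator<byte[]> rawComparator();
}

final class NonNegativeIntSerialization implements SortableSerialization<Integer> {
    @Override
    public byte[] serialize(Integer value) {
        if (value < 0) {
            throw new IllegalArgumentException("only non-negative values are supported");
        }
        // ByteBuffer is big-endian by default, so the most significant byte comes first.
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    @Override
    public Comparator<byte[]> rawComparator() {
        // Lexicographic comparison of unsigned bytes; shorter input sorts first on a tie.
        return (a, b) -> {
            int n = Math.min(a.length, b.length);
            for (int i = 0; i < n; i++) {
                int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
                if (cmp != 0) {
                    return cmp;
                }
            }
            return Integer.compare(a.length, b.length);
        };
    }
}
```

Because the comparator comes from the same object that defines the byte layout, a sort over serialized keys needs no hand-written comparator class for the key type.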

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
