[ 
https://issues.apache.org/jira/browse/CRUNCH-280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13797716#comment-13797716
 ] 

Chao Shi commented on CRUNCH-280:
---------------------------------

One difficulty is that MR requires a RawComparator, which compares two buffers 
of serialized records, and that is not easy to use. It would be nice to 
support:
1) RawComparator: this is the most efficient way, but users must know the 
serialization format
2) a normal Comparator class (with extra record (de)serialization overhead)
3) a serializable Comparator object, whose in-memory state is serialized and 
sent to MR workers (with serialization overhead)
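
To make the difference between 1) and 2) concrete, here is a rough sketch in plain Java. It does not use Crunch or Hadoop classes: RawBytesComparator is a local stand-in with the same method shape as Hadoop's RawComparator, and the record format (one 4-byte big-endian int per record) is only an assumption for illustration.

```java
import java.nio.ByteBuffer;
import java.util.Comparator;

final class ComparatorSketch {

    /** Local stand-in with the same method shape as Hadoop's RawComparator. */
    interface RawBytesComparator {
        int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2);
    }

    /** Assumed record format: one 4-byte big-endian int per record. */
    static byte[] serialize(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    static int deserialize(byte[] buf, int start, int length) {
        return ByteBuffer.wrap(buf, start, length).getInt();
    }

    /** Option 1: compare the bytes directly, which requires knowing the format. */
    static final RawBytesComparator RAW = (b1, s1, l1, b2, s2, l2) -> {
        for (int i = 0; i < 4; i++) {
            int x = b1[s1 + i] & 0xff;
            int y = b2[s2 + i] & 0xff;
            if (i == 0) {
                // Flip the sign bit so negative ints sort before positive ones.
                x ^= 0x80;
                y ^= 0x80;
            }
            if (x != y) {
                return x < y ? -1 : 1;
            }
        }
        return 0;
    };

    /** Option 2: deserialize both records, then delegate to a normal Comparator. */
    static RawBytesComparator adapt(Comparator<Integer> cmp) {
        return (b1, s1, l1, b2, s2, l2) ->
                cmp.compare(deserialize(b1, s1, l1), deserialize(b2, s2, l2));
    }

    public static void main(String[] args) {
        byte[] a = serialize(-5);
        byte[] b = serialize(3);
        RawBytesComparator wrapped = adapt(Comparator.naturalOrder());
        System.out.println("raw:     " + Integer.signum(RAW.compare(a, 0, 4, b, 0, 4)));
        System.out.println("wrapped: " + Integer.signum(wrapped.compare(a, 0, 4, b, 0, 4)));
    }
}
```

The wrapped version pays for two deserializations per comparison, which is the overhead mentioned in 2). The raw version avoids that cost but breaks as soon as the record format changes.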

I found that 2) and 3) are not easy, because I don't know how to deserialize 
the data at runtime. Is it possible, [~jwills]?
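
For 3), the comparator's own state could be shipped with standard Java serialization. The sketch below only shows the round trip through a byte array. DirectionComparator and the helper names are hypothetical, and how the bytes would actually reach the MR workers (for example via the job configuration) is left out. It does not address deserializing the records themselves, which is the open question above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Comparator;

final class SerializableComparatorSketch {

    /** Hypothetical comparator whose in-memory state is a single flag. */
    static final class DirectionComparator implements Comparator<Integer>, Serializable {
        private static final long serialVersionUID = 1L;
        private final boolean descending;

        DirectionComparator(boolean descending) {
            this.descending = descending;
        }

        @Override
        public int compare(Integer a, Integer b) {
            return descending ? Integer.compare(b, a) : Integer.compare(a, b);
        }
    }

    /** Client side: turn the comparator (including its state) into bytes. */
    static byte[] toBytes(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Worker side: rebuild the comparator from the bytes. */
    @SuppressWarnings("unchecked")
    static <T> T fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (T) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] shipped = toBytes(new DirectionComparator(true));
        Comparator<Integer> restored = fromBytes(shipped);
        System.out.println(Integer.signum(restored.compare(1, 2)));
    }
}
```

The comparator class still has to be on the workers' classpath for the deserialization step to succeed.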

> Specify Comparator for total order sort
> ---------------------------------------
>
>                 Key: CRUNCH-280
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-280
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Chao Shi
>            Assignee: Chao Shi
>
> It seems that Sort#sort can only use the default comparator. It would be 
> nice to let clients specify the comparator. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)
