[ https://issues.apache.org/jira/browse/CRUNCH-329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889602#comment-13889602 ]

Gabriel Reid commented on CRUNCH-329:
-------------------------------------

As far as I can see, only the custom (i.e. non-primitive/String) types need 
to be stored in the Configuration, because changes made to 
Writables.WRITABLE_CODES in the client JVM won't be visible in a remote JVM.

What I had in mind was to have the Writables.registerComparable method take a 
Configuration object as well. Instead of updating Writables.WRITABLE_CODES, it 
would add to or update a map of serialization codes stored in that 
Configuration. The assumption is that this Configuration is the conf used by 
the Pipeline, so the codes would be available to all tasks. Any code that 
currently reads Writables.WRITABLE_CODES would instead use the union of the 
configured serialization codes and Writables.WRITABLE_CODES.
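
To make the idea concrete, here is a rough sketch of what I mean. It is not 
the patch: java.util.Properties stands in for Hadoop's Configuration so that 
the example is self-contained, and the class name, the configuration key, the 
code numbers and the string encoding are all made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Sketch only. Properties stands in for Hadoop's Configuration, and every
// name here (CodeRegistry, CODES_KEY, BUILT_IN_CODES) is hypothetical.
class CodeRegistry {
    // Hypothetical key under which custom "code=className" pairs are stored.
    static final String CODES_KEY = "crunch.writable.codes";

    // Stand-in for the static Writables.WRITABLE_CODES map. The code numbers
    // are illustrative, not the ones Crunch actually uses.
    static final Map<Integer, String> BUILT_IN_CODES = new HashMap<>();
    static {
        BUILT_IN_CODES.put(1, "org.apache.hadoop.io.IntWritable");
        BUILT_IN_CODES.put(2, "org.apache.hadoop.io.Text");
    }

    // Registers a custom type in the configuration instead of mutating the
    // static map, so the mapping travels with the job to remote JVMs.
    static void registerComparable(Properties conf, String className, int code) {
        String existing = allCodes(conf).get(code);
        if (existing != null && !existing.equals(className)) {
            throw new IllegalArgumentException(
                "Code " + code + " is already used by " + existing);
        }
        if (existing != null) {
            return; // this exact mapping is already known
        }
        String current = conf.getProperty(CODES_KEY, "");
        String entry = code + "=" + className;
        conf.setProperty(CODES_KEY, current.isEmpty() ? entry : current + "," + entry);
    }

    // Returns the union of the built-in codes and the configured codes.
    static Map<Integer, String> allCodes(Properties conf) {
        Map<Integer, String> codes = new HashMap<>(BUILT_IN_CODES);
        String configured = conf.getProperty(CODES_KEY, "");
        if (!configured.isEmpty()) {
            for (String entry : configured.split(",")) {
                String[] parts = entry.split("=", 2);
                codes.put(Integer.parseInt(parts[0]), parts[1]);
            }
        }
        return codes;
    }
}
```

A task in a remote JVM would only ever call something like allCodes(conf) on 
the configuration it was launched with, so it never depends on static state 
that was modified on the client side.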

> Re-add type info to TupleWritable to make fields sort correctly
> ---------------------------------------------------------------
>
>                 Key: CRUNCH-329
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-329
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.10.0, 0.8.3
>            Reporter: Josh Wills
>            Assignee: Josh Wills
>             Fix For: 0.10.0, 0.8.3
>
>         Attachments: CRUNCH-329.patch, fix-ss-writables.patch
>
>
> Secondary sorts aren't currently working correctly for Writable types after 
> we hacked the TupleWritable impl to make all of the fields BytesWritables 
> (e.g., secondary IntWritable values will no longer be sorted correctly, even 
> though everything is still grouped correctly.)
> The least-bad way that I came up with to fix this is to use integer codes for 
> each possible WritableComparable type in a pipeline that we can use to decode 
> what Writable type each tuple field corresponds to. This allows us to keep 
> the various fields sortable while still doing a reasonable job of minimizing 
> the serialization required to pass the type information along.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)