[
https://issues.apache.org/jira/browse/CRUNCH-329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881094#comment-13881094
]
Gabriel Reid commented on CRUNCH-329:
-------------------------------------
I was thinking that this could be done in
TupleWritableComparator.configureOrdering. The custom serialization codes
will need to be set in the Configuration no matter what, and
configureOrdering receives both a Configuration and the array of
WritableTypes used for sorting, so it could verify that each of those
WritableTypes is either a built-in type or has a custom serialization code
in the conf.
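
A rough sketch of the kind of check described above, written against plain
Java collections rather than the real Hadoop/Crunch classes so that it runs
on its own. Everything here is an assumption made for illustration: the conf
key name, the "className=code" value format, the list of built-in types, and
the use of class-name strings in place of WritableType instances. It is not
the actual configureOrdering implementation.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class OrderingCheck {
    // Hypothetical conf key; the real key name is not given in this thread.
    static final String CODES_KEY = "example.tuple.writable.codes";

    // Stand-in for the built-in Writable types that would already have codes.
    static final Set<String> BUILT_INS = Set.of(
        "org.apache.hadoop.io.IntWritable",
        "org.apache.hadoop.io.LongWritable",
        "org.apache.hadoop.io.Text");

    /** Parses "className=code,className=code" into the set of registered class names. */
    static Set<String> registeredTypes(Map<String, String> conf) {
        Set<String> names = new HashSet<>();
        String raw = conf.getOrDefault(CODES_KEY, "");
        for (String entry : raw.split(",")) {
            int eq = entry.indexOf('=');
            if (eq > 0) {
                names.add(entry.substring(0, eq).trim());
            }
        }
        return names;
    }

    /** Throws if any sort type is neither built-in nor registered in the conf. */
    static void verify(List<String> sortTypes, Map<String, String> conf) {
        Set<String> registered = registeredTypes(conf);
        for (String type : sortTypes) {
            if (!BUILT_INS.contains(type) && !registered.contains(type)) {
                throw new IllegalStateException(
                    "No serialization code configured for " + type);
            }
        }
    }

    public static void main(String[] args) {
        // A plain Map stands in for the Hadoop Configuration.
        Map<String, String> conf = Map.of(CODES_KEY, "com.example.MyKey=100");
        verify(List.of("org.apache.hadoop.io.IntWritable", "com.example.MyKey"), conf);
        System.out.println("all sort types have codes");
    }
}
```

Failing at configuration time, before the job is submitted, would surface a
missing code as a clear error instead of a silently wrong sort order.
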
> Re-add type info to TupleWritable to make fields sort correctly
> ---------------------------------------------------------------
>
> Key: CRUNCH-329
> URL: https://issues.apache.org/jira/browse/CRUNCH-329
> Project: Crunch
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.10.0, 0.8.3
> Reporter: Josh Wills
> Assignee: Josh Wills
> Fix For: 0.10.0, 0.8.3
>
> Attachments: fix-ss-writables.patch
>
>
> Secondary sorts aren't currently working correctly for Writable types after
> we hacked the TupleWritable impl to make all of the fields BytesWritables
> (e.g., secondary IntWritable values will no longer be sorted correctly, even
> though everything is still grouped correctly).
> The least-bad way that I came up with to fix this is to use integer codes for
> each possible WritableComparable type in a pipeline that we can use to decode
> what Writable type each tuple field corresponds to. This allows us to keep
> the various fields sortable while still doing a reasonable job of minimizing
> the serialization required to pass the type information along.
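
To illustrate the description above, here is a simplified, self-contained
sketch of why treating every field as opaque bytes can break ordering and how
a per-field type code can restore it. It assumes 4-byte big-endian ints and
an unsigned lexicographic byte comparison; the code value INT_CODE and all
class and method names are hypothetical, and the real TupleWritable wire
format (length prefixes and so on) is not modeled.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class TypedFieldCompare {
    // Hypothetical integer code identifying an int-valued field.
    static final int INT_CODE = 1;

    /** Encodes an int as 4 big-endian bytes (ByteBuffer's default byte order). */
    static byte[] encodeInt(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    /** Compares fields as opaque bytes: unsigned, lexicographic. */
    static int compareRaw(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    /** Uses the type code to decode the field before comparing. */
    static int compareTyped(int code, byte[] a, byte[] b) {
        if (code == INT_CODE) {
            return Integer.compare(ByteBuffer.wrap(a).getInt(),
                                   ByteBuffer.wrap(b).getInt());
        }
        // Unknown code: fall back to the byte-wise comparison.
        return compareRaw(a, b);
    }

    public static void main(String[] args) {
        byte[] minusOne = encodeInt(-1); // FF FF FF FF
        byte[] one = encodeInt(1);       // 00 00 00 01
        // Byte-wise, -1 sorts after 1; with the type code it sorts before.
        System.out.println("raw:   " + Integer.signum(compareRaw(minusOne, one)));
        System.out.println("typed: " + Integer.signum(compareTyped(INT_CODE, minusOne, one)));
    }
}
```

Equal values still produce equal bytes in both cases, which is consistent
with grouping continuing to work while the sort order does not.
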