[ https://issues.apache.org/jira/browse/HADOOP-5779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719503#action_12719503 ]
ZhuGuanyin commented on HADOOP-5779:
------------------------------------

Thanks very much. I was busy last month and did not follow this issue. I'll soon attach an example dataset that makes key.toString().getBytes("UTF-8") throw an exception. Thanks again!

> KeyFieldBasedPartitioner would lose data if the specified field does not exist, and it should be encoding-free, not only support UTF-8
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5779
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5779
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: ZhuGuanyin
>             Fix For: 0.21.0
>
>         Attachments: encode-free-KeyFieldBasedPartitioner-v1.patch, encode-free-KeyFieldBasedPartitioner.patch, HADOOP-5779-partial.patch, HADOOP-5779-v1.0.patch.patch
>
>
> 1) Currently, KeyFieldBasedPartitioner only supports UTF-8 encoded records; we should use the Text or BytesWritable data types.
> 2) When using KeyFieldBasedPartitioner, if the record doesn't contain the specified field, endChar equals array.length, which throws an ArrayIndexOutOfBoundsException and loses that record!
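The two points in the description can be illustrated with a minimal, standalone sketch. It is not taken from any of the attached patches or from Hadoop's KeyFieldBasedPartitioner; the class name FieldPartitionSketch, the methods hashRange and partitionFor, and the choice to send a record without the requested field to partition 0 are all assumptions made for illustration. The sketch works on raw bytes, so no charset decoding is involved, and it clamps the hashed range to the array bounds so a short record cannot index past the end.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not Hadoop code: partition a record by one
// separator-delimited field, operating on raw bytes only.
public class FieldPartitionSketch {

    // Hashes the bytes in [start, end) of b, clamping both ends to the array bounds.
    static int hashRange(byte[] b, int start, int end, int seed) {
        int hash = seed;
        int safeEnd = Math.min(end, b.length);
        for (int i = Math.max(start, 0); i < safeEnd; i++) {
            hash = 31 * hash + b[i];
        }
        return hash;
    }

    // Returns a partition in [0, numPartitions) for the fieldIndex-th (0-based) field.
    // Assumption: a record that lacks the field goes to partition 0 instead of failing.
    static int partitionFor(byte[] record, int fieldIndex, byte separator, int numPartitions) {
        int start = 0;
        int field = 0;
        for (int i = 0; i <= record.length; i++) {
            boolean atEnd = (i == record.length);
            if (atEnd || record[i] == separator) {
                if (field == fieldIndex) {
                    int hash = hashRange(record, start, i, 0);
                    // Mask the sign bit so the modulo result is never negative.
                    return (hash & Integer.MAX_VALUE) % numPartitions;
                }
                field++;
                start = i + 1;
            }
        }
        return 0; // requested field is missing; the record is still kept
    }

    public static void main(String[] args) {
        byte[] full = "a\tb\tc".getBytes(StandardCharsets.UTF_8);
        byte[] missingField = "a".getBytes(StandardCharsets.UTF_8);
        // Bytes that are not valid UTF-8; they are hashed as-is.
        byte[] nonUtf8 = {(byte) 0xD6, (byte) 0xD0, (byte) '\t', (byte) 0xFF};

        System.out.println(partitionFor(full, 1, (byte) '\t', 4));
        System.out.println(partitionFor(missingField, 1, (byte) '\t', 4));
        System.out.println(partitionFor(nonUtf8, 0, (byte) '\t', 4));
    }
}
```

Whether a record without the field should go to a fixed partition or be hashed on the whole key is a design choice; the sketch only shows that the record need not be dropped.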