[ https://issues.apache.org/jira/browse/HADOOP-5779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710191#action_12710191 ]
Jothi Padmanabhan commented on HADOOP-5779:
-------------------------------------------

Some minor comments:

* I think it would be better to do the instanceof checks on the key in a single if block:
{code}
if (key instanceof BytesWritable) {
  // Handle BytesWritable
} else if (key instanceof Text) {
  // Handle Text
} else {
  // error
}
{code}
* A test case covering Text and BytesWritable keys would be good to have for this patch. It could either be a new test case or a modification of TestStreamDataProtocol. Also, it would be really nice if the test case could demonstrate the fix for the ArrayIndexOutOfBoundsException: it should fail without this patch and pass with it.

> KeyFieldBasedPartitioner would lost data if specifed field not exist, and it should encode free not only support utf8
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5779
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5779
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: ZhuGuanyin
>             Fix For: 0.21.0
>
>         Attachments: encode-free-KeyFieldBasedPartitioner-v1.patch, encode-free-KeyFieldBasedPartitioner.patch
>
>
> 1) Currently, KeyFieldBasedPartitioner only supports UTF-8 encoded records; we should use the Text or BytesWritable data types.
> 2) When using KeyFieldBasedPartitioner, if the record doesn't contain the specified field, endChar equals array.length, which throws an ArrayIndexOutOfBoundsException and loses that record!
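
For illustration only, the single-if dispatch suggested in the comment and the out-of-range end index from the issue description might be sketched as below. This is not the code from the attached patch. BytesKey and TextKey are hypothetical stand-ins for Hadoop's BytesWritable and Text so the sketch compiles without Hadoop on the classpath, and keyBytes, hashRange and partition are names invented here.

```java
import java.nio.charset.StandardCharsets;

// Sketch only: hypothetical stand-in types, not the Hadoop classes.
final class KeyFieldSketch {
    static final class BytesKey {
        final byte[] bytes;
        BytesKey(byte[] bytes) { this.bytes = bytes; }
    }

    static final class TextKey {
        final String value;
        TextKey(String value) { this.value = value; }
    }

    // One if/else-if/else chain covers every supported key type.
    static byte[] keyBytes(Object key) {
        if (key instanceof BytesKey) {
            return ((BytesKey) key).bytes;
        } else if (key instanceof TextKey) {
            return ((TextKey) key).value.getBytes(StandardCharsets.UTF_8);
        } else {
            throw new IllegalArgumentException("unsupported key type: " + key);
        }
    }

    // Hashes b[start..end] inclusive. Both indexes are clamped to the array,
    // so an end equal to b.length (missing field) does not read past the end.
    static int hashRange(byte[] b, int start, int end) {
        int first = Math.max(start, 0);
        int last = Math.min(end, b.length - 1);
        int hash = 0;
        for (int i = first; i <= last; i++) {
            hash = 31 * hash + b[i];
        }
        return hash;
    }

    // Maps the hash of the selected byte range to a partition number.
    static int partition(Object key, int start, int end, int numPartitions) {
        int hash = hashRange(keyBytes(key), start, end);
        return (hash & Integer.MAX_VALUE) % numPartitions;
    }
}
```

A test along the lines requested in the comment would pass the same bytes once as a Text-like key and once as a BytesWritable-like key, use an end index at or beyond the array length, and check that both calls return the same partition without throwing.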