[ https://issues.apache.org/jira/browse/HADOOP-5779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710191#action_12710191 ]
Jothi Padmanabhan commented on HADOOP-5779:
-------------------------------------------
Some minor comments:
* I think it would be better to do the instanceof checks on the key in a
single if / else-if chain:
{code}
if (key instanceof BytesWritable) {
  // Handle BytesWritable
} else if (key instanceof Text) {
  // Handle Text
} else {
  // error
}
{code}
* A test case covering both Text and BytesWritable keys would be good to have
for this patch. It could be either a new test case or a modification of
TestStreamDataProtocol. It would also be really nice if the test case could
demonstrate the fix for the ArrayIndexOutOfBoundsException: it should fail
without this patch and pass with it.
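
A minimal sketch of the single if / else-if chain suggested in the first
comment. To stay runnable without Hadoop on the classpath it uses byte[] and
String as stand-ins for BytesWritable and Text; the class name, the method
name, and the choice of IllegalArgumentException for the error branch are
assumptions for illustration, not part of the patch.

```java
import java.nio.charset.StandardCharsets;

public class KeyDispatchSketch {
    /**
     * Returns the raw bytes of a supported key type.
     * byte[] stands in for BytesWritable, String stands in for Text.
     */
    static byte[] keyBytes(Object key) {
        if (key instanceof byte[]) {
            // Stand-in for the BytesWritable branch: bytes are used as-is.
            return (byte[]) key;
        } else if (key instanceof String) {
            // Stand-in for the Text branch: encode the characters as UTF-8.
            return ((String) key).getBytes(StandardCharsets.UTF_8);
        } else {
            // Error branch: fail loudly rather than silently dropping the record.
            String type = (key == null) ? "null" : key.getClass().getName();
            throw new IllegalArgumentException("Unsupported key type: " + type);
        }
    }
}
```

Keeping all type checks in one chain means every key reaches exactly one
branch, and an unsupported type cannot fall through unnoticed.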
> KeyFieldBasedPartitioner would lose data if the specified field does not exist,
> and it should be encoding-free rather than supporting only UTF-8
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-5779
> URL: https://issues.apache.org/jira/browse/HADOOP-5779
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.20.0
> Reporter: ZhuGuanyin
> Fix For: 0.21.0
>
> Attachments: encode-free-KeyFieldBasedPartitioner-v1.patch,
> encode-free-KeyFieldBasedPartitioner.patch
>
>
> 1) Currently, KeyFieldBasedPartitioner only supports UTF-8 encoded records;
> it should work with both the Text and BytesWritable data types.
> 2) When using KeyFieldBasedPartitioner, if a record doesn't contain the
> specified field, endChar equals array.length, which throws an
> ArrayIndexOutOfBoundsException and loses that record!
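
A minimal sketch of the failure described in point 2 and one possible guard.
It assumes the key bytes are hashed over an inclusive [start, end] range with
start >= 0; the class and method names are hypothetical and the code is not
taken from the attached patch.

```java
public class KeyFieldRangeSketch {
    /**
     * Unguarded hash over bytes[start..end], inclusive.
     * If end == bytes.length, the read of bytes[end] throws
     * ArrayIndexOutOfBoundsException.
     */
    static int hashUnchecked(byte[] bytes, int start, int end, int seed) {
        int hash = seed;
        for (int i = start; i <= end; i++) {
            hash = 31 * hash + bytes[i];
        }
        return hash;
    }

    /**
     * Guarded variant: clamps end to the last valid index first.
     * If the clamped range is empty, the seed is returned unchanged.
     */
    static int hashClamped(byte[] bytes, int start, int end, int seed) {
        int last = Math.min(end, bytes.length - 1);
        return hashUnchecked(bytes, start, last, seed);
    }
}
```

With a two-byte record, calling hashUnchecked(record, 0, record.length, 0)
throws, while hashClamped with the same arguments returns the same value as
hashing the range [0, 1]. A regression test could assert exactly that pair of
behaviours.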