[
https://issues.apache.org/jira/browse/HADOOP-5528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689849#action_12689849
]
Chris Douglas commented on HADOOP-5528:
---------------------------------------
Klaas,
Thanks for clarifying the Python syntax; I hadn't realized that an absent left
or right offset would default to the beginning or end of the list. That's what
I get for not reading the Javadoc or that section carefully enough. I was
expecting Python-like syntax, per Owen's original
[comment|https://issues.apache.org/jira/browse/HADOOP-5528?focusedCommentId=12684013&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12684013].
The one point still unaddressed is whether the partitioner should honor the
indices when they exceed the key length. On this, I continue to disagree with
the current patch. Particularly when dealing with raw bytes, it is better to
error out and require user intervention than to proceed silently. Proceeding
silently will too often lead to unexpected and undetected results.
The class hierarchy still seems unnecessary to me. Since it's clear that what
it effects can be mapped to the aforementioned expression, adjusting the
user-provided indices to match the python semantics within that mapping takes
less code, is more readily understandable, and is more efficient.
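To make the suggestion concrete, here is a rough sketch of what I have in mind. It is not taken from the patch: the class and method names ({{OffsetSketch}}, {{resolve}}, {{hashRange}}, {{partition}}) are hypothetical, the right offset is assumed to be inclusive, and 0 and -1 are assumed to stand in for an absent left and right offset. The hash function is a placeholder, not Hadoop's. Negative offsets count from the end of the key, and any offset that falls outside the key raises an error instead of being adjusted silently.

```java
// Hypothetical sketch, not the code in HADOOP-5528.patch.
final class OffsetSketch {

    // Assumed defaults standing in for "absent" offsets.
    static final int DEFAULT_LEFT = 0;
    static final int DEFAULT_RIGHT = -1;

    /** Maps a Python-style offset (negative counts from the end) to an absolute index. */
    static int resolve(int offset, int length) {
        int index = offset < 0 ? offset + length : offset;
        if (index < 0 || index >= length) {
            // Fail loudly rather than clamp: the user has to fix the configuration.
            throw new IllegalArgumentException(
                "offset " + offset + " is out of range for a key of " + length + " bytes");
        }
        return index;
    }

    /** Hashes bytes[left..right], both ends inclusive, after resolving the offsets. */
    static int hashRange(byte[] bytes, int length, int leftOffset, int rightOffset) {
        int left = resolve(leftOffset, length);
        int right = resolve(rightOffset, length);
        if (left > right) {
            throw new IllegalArgumentException(
                "left offset " + leftOffset + " resolves past right offset " + rightOffset);
        }
        int hash = 1; // placeholder polynomial hash
        for (int i = left; i <= right; i++) {
            hash = 31 * hash + bytes[i];
        }
        return hash;
    }

    /** Picks a partition in [0, numPartitions) from the hashed byte range. */
    static int partition(byte[] bytes, int length, int leftOffset, int rightOffset,
                         int numPartitions) {
        return (hashRange(bytes, length, leftOffset, rightOffset) & Integer.MAX_VALUE)
            % numPartitions;
    }
}
```

With this mapping, offsets (0, -1) and (-4, 3) select the same bytes of a four-byte key, while an offset of 4 on that key is rejected. The index adjustment lives in one small method, so no class hierarchy is needed.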
> Binary partitioner
> ------------------
>
> Key: HADOOP-5528
> URL: https://issues.apache.org/jira/browse/HADOOP-5528
> Project: Hadoop Core
> Issue Type: New Feature
> Components: mapred
> Reporter: Klaas Bosteels
> Assignee: Klaas Bosteels
> Attachments: HADOOP-5528.patch, HADOOP-5528.patch, HADOOP-5528.patch,
> HADOOP-5528.patch
>
>
> It would be useful to have a {{BinaryPartitioner}} that partitions
> {{BinaryComparable}} keys by hashing a configurable part of the bytes array
> corresponding to each key.