[jira] Commented: (HADOOP-5528) Binary partitioner

Chris Douglas (JIRA) Fri, 27 Mar 2009 01:35:15 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-5528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689849#action_12689849
 ]


Chris Douglas commented on HADOOP-5528:
---------------------------------------

Klaas,

Thanks for clarifying the Python syntax; I hadn't realized that an absent left 
or right offset would use the beginning/end of the list. It's what I get for 
not reading the Javadoc or that section carefully enough. I was expecting 
Python-like syntax, per Owen's original 
[comment|https://issues.apache.org/jira/browse/HADOOP-5528?focusedCommentId=12684013&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12684013].

The one point unaddressed is whether the partitioner should honor the indices 
when they exceed the key length. On this, I continue to disagree with the 
current patch. Particularly when dealing with raw bytes, it is better to error 
out and require user intervention than to silently proceed. Too often, this 
will lead to unexpected and undetected results.
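
To illustrate the concern, here is a minimal sketch that contrasts a lenient 
range check with a strict one. It is not code from the patch: the class 
{{RangeCheckSketch}}, both method names, and the half-open {{[left, right)}} 
convention are assumptions made for this example.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the patch under review.
final class RangeCheckSketch {
    private RangeCheckSketch() {}

    /** Lenient variant: clamps [left, right) to the key, so a bad range goes unnoticed. */
    static int lenientHash(byte[] key, int left, int right) {
        int from = Math.min(Math.max(left, 0), key.length);
        int to = Math.min(Math.max(right, from), key.length);
        return Arrays.hashCode(Arrays.copyOfRange(key, from, to));
    }

    /** Strict variant: rejects any range that does not fit inside the key. */
    static int strictHash(byte[] key, int left, int right) {
        if (left < 0 || right > key.length || left > right) {
            throw new IllegalArgumentException(
                "Range [" + left + ", " + right + ") does not fit a key of "
                + key.length + " bytes");
        }
        return Arrays.hashCode(Arrays.copyOfRange(key, left, right));
    }
}
```

With the lenient variant, every key shorter than {{left}} is hashed over an 
empty slice, so all such keys land in the same partition and nothing reports 
it. The strict variant surfaces the misconfiguration on the first such key.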

The class hierarchy still seems unnecessary to me. Since it's clear that what 
it effects can be mapped to the aforementioned expression, adjusting the 
user-provided indices to match the Python semantics within that mapping takes 
less code, is more readily understandable, and is more efficient.
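
As a rough sketch of what a single mapping could look like, the following 
resolves Python-style bounds and hashes the resulting range in one place, with 
no class hierarchy. The expression referred to above is not reproduced in this 
message, so this is only one possible shape. The class {{SliceMappingSketch}}, 
its method names, the use of {{null}} for an absent offset, and the exclusive 
right bound are all assumptions for illustration.

```java
// Hypothetical helper, not part of the patch under review.
final class SliceMappingSketch {
    private SliceMappingSketch() {}

    /**
     * Resolves one Python-style slice bound against a key of the given length.
     * A null bound is "absent" and takes the fallback; a negative bound counts
     * from the end. Unlike a Python slice, an out-of-range bound is rejected
     * rather than clamped.
     */
    static int resolve(Integer bound, int length, int fallback) {
        if (bound == null) {
            return fallback;
        }
        int index = bound < 0 ? bound + length : bound;
        if (index < 0 || index > length) {
            throw new IndexOutOfBoundsException(
                "Offset " + bound + " is out of range for a key of " + length + " bytes");
        }
        return index;
    }

    /** Hashes key[left:right] in place, without copying the slice. */
    static int hashSlice(byte[] key, Integer left, Integer right) {
        int from = resolve(left, key.length, 0);
        int to = resolve(right, key.length, key.length);
        int hash = 1;
        // As in Python, from >= to denotes an empty slice; the loop is skipped.
        for (int i = from; i < to; i++) {
            hash = 31 * hash + key[i];
        }
        return hash;
    }

    /** Maps the slice hash to a partition number in [0, numPartitions). */
    static int partition(byte[] key, Integer left, Integer right, int numPartitions) {
        return (hashSlice(key, left, right) & Integer.MAX_VALUE) % numPartitions;
    }
}
```

For a four-byte key, {{hashSlice(key, 1, 3)}} and {{hashSlice(key, -3, -1)}} 
cover the same two bytes, and {{hashSlice(key, null, null)}} covers the whole 
key.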

> Binary partitioner
> ------------------
>
>                 Key: HADOOP-5528
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5528
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Klaas Bosteels
>            Assignee: Klaas Bosteels
>         Attachments: HADOOP-5528.patch, HADOOP-5528.patch, HADOOP-5528.patch, 
> HADOOP-5528.patch
>
>
> It would be useful to have a {{BinaryPartitioner}} that partitions 
> {{BinaryComparable}} keys by hashing a configurable part of the bytes array 
> corresponding to each key.

