[
https://issues.apache.org/jira/browse/HBASE-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706409#comment-13706409
]
Nick Dimiduk commented on HBASE-8865:
-------------------------------------
That's a little clunky but should expose the feature [~dinghaifeng] is looking
for. I wonder if the shell could use a heuristic: look for strings that "look
like" binary strings and act accordingly. I hesitate to put too much effort
into making the shell "smart", though. I think your listed workaround is
actually the most correct solution, because neither you nor the shell is
confused about your intention to use a {{byte[]}}.
What if we patch the help message on the shell's {{split}} command to explicitly
advise the user about this scenario?
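
One way to read the "looks like a binary string" heuristic is to treat a key as binary whenever its bytes do not form valid UTF-8. The Python sketch below illustrates only that idea; the function name is hypothetical and nothing here is taken from the HBase shell.

```python
def looks_like_binary(key: bytes) -> bool:
    """Guess that a key is raw binary if it is not valid UTF-8.

    Hypothetical helper, shown only to illustrate the heuristic.
    """
    try:
        key.decode("utf-8")  # strict decoding raises on invalid sequences
    except UnicodeDecodeError:
        return True
    return False


# The key from the report is not valid UTF-8, so it is flagged as binary.
print(looks_like_binary(b"\x00\x00\xc3"))  # True
# A pure-ASCII key is valid UTF-8 and is treated as text.
print(looks_like_binary(b"\x00\x00\x7f"))  # False
# Limitation: these two bytes are valid UTF-8 (the letter é), so a user
# who meant them as raw bytes would not be detected.
print(looks_like_binary(b"\xc3\xa9"))  # False
```

The last case shows why such a guess can misfire: a binary key that happens to be valid UTF-8 is indistinguishable from text, which is one argument for requiring an explicit {{byte[]}} instead.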
> HBase shell split command acts incorrectly with hex split keys.
> ---------------------------------------------------------------
>
> Key: HBASE-8865
> URL: https://issues.apache.org/jira/browse/HBASE-8865
> Project: HBase
> Issue Type: Bug
> Components: shell, Usability
> Affects Versions: 0.94.8
> Environment: Linux
> Reporter: Ding Haifeng
> Attachments: 8865.txt
>
>
> When I tried to do a manual region split from the HBase shell, I found that the
> split command handles hex split keys incorrectly.
> Here is an example.
> I executed hbase(main):003:0> split 'tsdb', "\x00\x00\xC3" .
> I expected it to split at the 3-byte key "\x00\x00\xC3", but it actually
> split at the 5-byte key "\x00\x00\xEF\xBF\xBD".
> I tested more split keys and found some patterns:
> * If all bytes in the split key are between "\x00" and "\x7F", it works as
> expected and splits at exactly the key specified.
> * If any byte is between "\x80" and "\xFF", it works incorrectly. Whatever
> the byte is, it is interpreted as "\xEF\xBF\xBD". Here is another
> example. Specifying split key "\x00\xA0\x00\xB0" actually splits at
> "\x00\xEF\xBF\xBD\x00\xEF\xBF\xBD".
> I'm running HBase 0.94.8, r1485407, both server-side and client-side.
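
The substitution pattern in the report matches what happens when key bytes are decoded as UTF-8 with invalid sequences replaced by U+FFFD (whose UTF-8 encoding is EF BF BD) and then re-encoded. The Python sketch below reproduces only that round trip. It is an assumption about the likely cause, not a trace of the shell's actual code path, and the helper name is hypothetical.

```python
def utf8_round_trip(key: bytes) -> bytes:
    """Decode as UTF-8, replacing invalid sequences with U+FFFD, then re-encode.

    Hypothetical helper; it imitates a lossy bytes -> string -> bytes conversion.
    """
    return key.decode("utf-8", errors="replace").encode("utf-8")


def as_hex(key: bytes) -> str:
    """Format bytes in the \\xNN style used in the report."""
    return "".join(f"\\x{b:02X}" for b in key)


for key in (b"\x00\x00\x7f", b"\x00\x00\xc3", b"\x00\xa0\x00\xb0"):
    print(as_hex(key), "->", as_hex(utf8_round_trip(key)))
# \x00\x00\x7F -> \x00\x00\x7F
# \x00\x00\xC3 -> \x00\x00\xEF\xBF\xBD
# \x00\xA0\x00\xB0 -> \x00\xEF\xBF\xBD\x00\xEF\xBF\xBD
```

Bytes up to "\x7F" are valid single-byte UTF-8 and survive unchanged, while the lone high bytes in these keys are invalid sequences and each becomes the three-byte replacement character, matching both examples above.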