[ 
https://issues.apache.org/jira/browse/HBASE-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ding Haifeng updated HBASE-8865:
--------------------------------

    Description: 
When I tried to do a manual region split from the HBase shell, I found that the 
split command handles hex split keys incorrectly. 

Here is an example.

I executed hbase(main):003:0> split 'tsdb', "\x00\x00\xC3" .

I expected it to split at the 3-byte key "\x00\x00\xC3", but it actually split 
at the 5-byte key "\x00\x00\xEF\xBF\xBD". 

I tested more split keys and found a pattern:
* If all bytes in the split key are between "\x00" and "\x7F", it works as 
expected and splits at exactly the key specified.
* If any byte is between "\x80" and "\xFF", it works incorrectly. Whatever 
the byte is, it is replaced with "\xEF\xBF\xBD". Here is another 
example: specifying the split key "\x00\xA0\x00\xB0" actually splits at 
"\x00\xEF\xBF\xBD\x00\xEF\xBF\xBD".

I'm running HBase 0.94.8, r1485407, on both the server side and the client side. 
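
The bytes "\xEF\xBF\xBD" are the UTF-8 encoding of U+FFFD, the Unicode 
replacement character, so the observed keys look like what a lossy 
bytes-to-string-to-bytes conversion would produce. The sketch below is only an 
illustration of that hypothesis in Python, not the shell's actual code path; 
the helper names round_trip_utf8 and show are made up for this example.

```python
def round_trip_utf8(key: bytes) -> bytes:
    """Decode raw key bytes as UTF-8, replacing invalid sequences with U+FFFD,
    then encode the resulting string back to UTF-8."""
    return key.decode("utf-8", errors="replace").encode("utf-8")


def show(key: bytes) -> str:
    """Render bytes in the \\xNN notation used in this report."""
    return "".join("\\x%02X" % b for b in key)


# Keys taken from the report, plus one key that stays within \x00..\x7F.
for key in (b"\x00\x00\xC3", b"\x00\xA0\x00\xB0", b"\x00\x7F"):
    print(show(key), "->", show(round_trip_utf8(key)))
```

In this sketch the two keys from the report come out as the 5-byte and 8-byte 
keys described above, and the pure 7-bit key is unchanged. Note that a byte 
pair forming a valid UTF-8 sequence would survive such a round trip, so this 
matches the reported pattern only for the keys shown here.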



  was:
When I tried to do a manual region split from the HBase shell, I found that the 
split command handles hex split keys incorrectly. 

Here is an example.

I executed hbase(main):003:0> split 'tsdb', "\x00\x00\xC3" .

I expected it to split at the 3-byte key "\x00\x00\xC3", but it actually split 
at the 5-byte key "\x00\x00\xEF\xBF\xBD". 

I tested more split keys and found a pattern:
* If all bytes in the split key are between "\x00" and "\x9F", it works as 
expected and splits at exactly the key specified.
* If any byte is between "\xA0" and "\xFF", it works incorrectly. Whatever 
the byte is, it is replaced with "\xEF\xBF\xBD". Here is another 
example: specifying the split key "\x00\xA0\x00\xB0" actually splits at 
"\x00\xEF\xBF\xBD\x00\xEF\xBF\xBD".

I'm running HBase 0.94.8, r1485407, on both the server side and the client side. 



    
> HBase shell split command acts incorrectly with hex split keys.
> ---------------------------------------------------------------
>
>                 Key: HBASE-8865
>                 URL: https://issues.apache.org/jira/browse/HBASE-8865
>             Project: HBase
>          Issue Type: Bug
>          Components: shell
>    Affects Versions: 0.94.8
>         Environment: Linux
>            Reporter: Ding Haifeng
