Travis Hegner created HBASE-21852:
-------------------------------------
Summary: Cannot get rows from hbase-rest when the rowkey contains
any bytes above 0x7f
Key: HBASE-21852
URL: https://issues.apache.org/jira/browse/HBASE-21852
Project: HBase
Issue Type: Bug
Components: REST
Reporter: Travis Hegner
I have a table that stores its records with big-endian long (8-byte integer)
rowkeys. I'd like to access this data via the hbase-rest api, but have come
across an issue where I can't access every row that exists. For example:
{{$ curl -v -H "Accept: application/json"
"http://hbase-rest:8080/emps/%00%00%00%00%00%00%04%00/"}}
Returns the expected row without issue. However
{{$ curl -v -H "Accept: application/json"
"http://hbase-rest:8080/emps/%00%00%00%00%00%00%03%FF/"}}
Returns a {{404 Not Found}}, though I'm certain the record exists. The broken
query also generates a log message on the rest server like this:
{{WARN [qtp1473981203-37561] util.URIUtil: /emps/%00%00%00%00%00%00%03%FF/
org.eclipse.jetty.util.Utf8Appendable$NotUtf8Exception: Not valid UTF8! byte Ff
in state 0}}
Some troubleshooting and testing suggests that the error happens when any query
contains an encoded byte above {{0x7f}}.
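
To make the two cases concrete, here is a minimal Java sketch that builds the percent-encoded rowkeys used above (1024 and 1023 as big-endian longs) and then pushes them through a UTF-8 based decode. The class and method names are hypothetical and are not taken from hbase-rest; the round trip uses {{java.net.URLDecoder}} only to illustrate what any Charset-based decoding does to a byte above {{0x7f}}, and it is not a claim about the exact code path behind the Jetty warning.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; nothing here is part of hbase-rest.
class RowKeyUrlDemo {

    // Serialize a long as 8 big-endian bytes (ByteBuffer's default order).
    static byte[] toKey(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // Percent-encode every byte of the key, as in the curl examples.
    static String encodeLongKey(long value) {
        StringBuilder sb = new StringBuilder(Long.BYTES * 3);
        for (byte b : toKey(value)) {
            sb.append(String.format("%%%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Decode the escapes into a String as UTF-8, then turn that String
    // back into bytes. Bytes that are not valid UTF-8 do not survive this.
    static byte[] decodeViaUtf8(String encoded) throws UnsupportedEncodingException {
        return URLDecoder.decode(encoded, "UTF-8").getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        for (long value : new long[] {1024L, 1023L}) {
            String encoded = encodeLongKey(value);
            boolean intact = Arrays.equals(toKey(value), decodeViaUtf8(encoded));
            System.out.println(encoded + " survives UTF-8 round trip: " + intact);
        }
    }
}
```

The key for 1024 contains only bytes at or below {{0x7f}} and comes back unchanged, while the key for 1023 ends in {{0xFF}}, which is not valid UTF-8 on its own, so the decoded bytes no longer match the stored rowkey.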
I've [read|https://stackoverflow.com/a/31772127/2639647] that hbase-rest
supports hex-escaped representation, like the shell, but that has not worked
for me, and when looking through {{RowSpec.java}}, I don't see any indication
that the {{parseRowKeys()}} method is attempting to parse the hex-escaped
representation. Am I missing something here? Is the rest server supposed to
support hex-escaped representation, and I'm not querying correctly?
I've looked at version 0.98 and the current master branch, and the
{{RowSpec.java}} source looks largely the same in both, so I don't believe
this is a regression.
I believe the error to be caused by {{java.net.URLDecoder}}. I can only
speculate, but would it be more appropriate to have a generic function that
converts {{%XX}} strings directly to bytes, without relying on a specific
{{Charset}}? Or perhaps some logic should be put into the parser to truly
support the hex-escaped representation, perhaps with a URL parameter to
indicate parsing as such, much like the shell requires double quotes to
indicate byte parsing.
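
A generic function of the kind suggested above could look roughly like the following. This is only a sketch under stated assumptions: the class name {{PercentBytes}} and its {{decode}} method are hypothetical, it is not the hbase-rest implementation or a proposed patch, and it assumes that unescaped characters in the rowkey segment are plain ASCII.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper; not part of hbase-rest.
class PercentBytes {

    // Convert each %XX escape directly to one byte, with no Charset involved.
    static byte[] decode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(s.length());
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '%') {
                if (i + 2 >= s.length()) {
                    throw new IllegalArgumentException("Incomplete escape at index " + i);
                }
                int hi = Character.digit(s.charAt(i + 1), 16);
                int lo = Character.digit(s.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Invalid hex escape at index " + i);
                }
                out.write((hi << 4) | lo);
                i += 3;
            } else {
                // Assumption: anything not escaped must be ASCII.
                if (c > 0x7f) {
                    throw new IllegalArgumentException("Unescaped non-ASCII char at index " + i);
                }
                out.write(c);
                i++;
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] key = decode("%00%00%00%00%00%00%03%FF");
        System.out.println(java.nio.ByteBuffer.wrap(key).getLong());
    }
}
```

Because the escapes never pass through a {{String}} decoded with a {{Charset}}, a byte such as {{0xFF}} is preserved as-is, and wrapping the result in a {{ByteBuffer}} recovers the original long.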