Sorry for the long email, but hopefully it provides enough detail to
understand the problem and to judge whether there is a different way we can
work around it.


*Problem*



With Knox 0.6.x (HDP 2.3), the Knox WebHBase call returns results
correctly. With Knox 0.9.x (HDP 2.5), the Knox WebHBase call returns a 404
not found. If we hit WebHBase directly then there is no issue.



An example call:



curl -i -k -u USER 'https://HOST:8443/gateway/TOPOLOGY/hbase/ns:table/%2frkpart1%2frkpart2'



*Analysis*



Looked closer at gateway-audit and noticed the dispatch urls were being
encoded differently between the two versions.



Works – Knox 0.6.x (HDP 2.3)



17/05/23 16:54:13 ||7c4131fc-8638-4a1a-9228-d9a67a312a40|audit|WEBHBASE|USER|||dispatch|uri|http://HOST:8084/ns:table/%2frkpart1%2frkpart2?doAs=USER|success|Response status: 200



Doesn’t work – Knox 0.9.x (HDP 2.5)



17/05/23 17:23:13 ||4244f242-6694-40bb-914d-8dc7e222f074|audit|WEBHBASE|USER|||dispatch|uri|http://HOST:8084/ns%3Atable/rkpart1/rkpart2?doAs=USER|success|Response status: 404
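
For comparing many audit entries across the two versions, something like the following Python sketch could pull out the dispatch URL and response status. The function name dispatch_entry is made up, and the field positions are an assumption inferred only from the two sample lines above, not from any documented gateway-audit format.

```python
def dispatch_entry(line):
    """Return (uri, status) from a gateway-audit dispatch line.

    Assumption: the last five '|'-separated fields are
    'dispatch', 'uri', the URL, the outcome, and 'Response status: NNN',
    as in the two sample lines in this email.
    """
    fields = line.strip().split("|")
    action, kind, uri, outcome, message = fields[-5:]
    if (action, kind) != ("dispatch", "uri"):
        raise ValueError("not a dispatch line")
    status = int(message.rsplit(":", 1)[1])
    return uri, status


old = ("17/05/23 16:54:13 ||7c4131fc-8638-4a1a-9228-d9a67a312a40|audit|WEBHBASE"
       "|USER|||dispatch|uri|http://HOST:8084/ns:table/%2frkpart1%2frkpart2"
       "?doAs=USER|success|Response status: 200")
new = ("17/05/23 17:23:13 ||4244f242-6694-40bb-914d-8dc7e222f074|audit|WEBHBASE"
       "|USER|||dispatch|uri|http://HOST:8084/ns%3Atable/rkpart1/rkpart2"
       "?doAs=USER|success|Response status: 404")

for line in (old, new):
    print(dispatch_entry(line))
```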



The 404 comes from WebHBase itself: it cannot find the row because the row
key has been split at the slashes. The difference is that each %2f (an
encoded /) is being decoded, with the leading one then dropped, instead of
being left as %2f in the URL. This changes the meaning of the URL and causes
issues for WebHBase on the backend.
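
To make the change in meaning concrete, here is a minimal Python sketch (not Knox or WebHBase code) that splits the two dispatch URLs from the audit log into path segments the way a path-based REST service typically would: split on the literal / first, then decode each segment. The helper name segments is made up for illustration.

```python
from urllib.parse import unquote, urlsplit


def segments(url):
    """Split the raw path on '/' first, then percent-decode each segment."""
    raw_path = urlsplit(url).path  # urlsplit leaves percent-escapes untouched
    return [unquote(seg) for seg in raw_path.split("/") if seg]


works = "http://HOST:8084/ns:table/%2frkpart1%2frkpart2?doAs=USER"
broken = "http://HOST:8084/ns%3Atable/rkpart1/rkpart2?doAs=USER"

print(segments(works))   # ['ns:table', '/rkpart1/rkpart2']
print(segments(broken))  # ['ns:table', 'rkpart1', 'rkpart2']
```

With the 0.6.x URL the row key arrives as the single segment /rkpart1/rkpart2. With the 0.9.x URL it arrives as two separate segments, so the backend no longer sees the original row key.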



At first the culprit seemed to be
https://issues.apache.org/jira/browse/KNOX-709, but this wasn’t the case.
It looks like KNOX-709 may itself have been caused by KNOX-690.



I pulled down Knox 0.8.0 and 0.9.0 and found that 0.8.0 is not affected.
I then pulled down the code from
https://github.com/hortonworks/knox-release/tree/HDP-2.5.3.77-tag and did a
git bisect to find the offending commit using this test case:
https://gist.github.com/risdenk/afecc66d6fc0c9d665abd1ae5466f341. The
commit is https://git-wip-us.apache.org/repos/asf?p=knox.git;h=c28224c and
the related JIRA is https://issues.apache.org/jira/browse/KNOX-690.



*Resolution*



I rebuilt Knox from
https://github.com/hortonworks/knox-release/tree/HDP-2.5.3.77-tag with
commit c28224c (KNOX-690) reverted. The adjusted code is here:
https://github.com/risdenk/knox-release/tree/hdp25_revert_KNOX-690. The
change is only a single commit:
https://github.com/risdenk/knox-release/commit/dc452126de99f6f1d15938f7294e95e3b7c89328



I rebuilt Knox with mvn -DskipTests package and copied the two affected
jars (gateway-provider-rewrite and gateway-util-urltemplate) to
/usr/hdp/current/knox-server/lib/.



I moved the two old jars to /root. The affected jars were:

- gateway-provider-rewrite-0.9.0.2.5.3.0-37.jar
- gateway-util-urltemplate-0.9.0.2.5.3.0-37.jar



I then restarted Knox on hdpr05en02. This made the following curl call work:



curl -i -k -u USER 'https://KNOXHOST:8443/gateway/TOPOLOGY/hbase/ns:table/%2frkpart1%2frkpart2'



*Conclusion*


I'm not convinced that KNOX-690 is a good idea. It basically made it so that
URL-encoded paths are checked by the templates/parser, whereas URL encoding
should be left alone in many cases. Reverting the KNOX-690 change shouldn't
affect us much, other than that an upgrade to HDP could break this again. I
think we should really avoid using characters that need URL encoding in the
rowkey, especially for WebHBase; / is a bad character to try to pass through
web services. Since having the customer change the rowkey design will be
painful, we will be using a workaround in the short term.

Kevin Risden
