[
https://issues.apache.org/jira/browse/KNOX-754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15551157#comment-15551157
]
Alexandre Linte commented on KNOX-754:
--------------------------------------
Hi [~lmccay],
Yes, you're right: the special characters are in the file name, so they are
visible in the URL.
I tried accessing my file directly through WebHDFS instead of Knox, and I can
successfully get its content:
{noformat}
[shfs3453@spark01 ~]$ curl -i -L --negotiate -u :
"http://namenode01:50070/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?op=OPEN"
HTTP/1.1 401 Authentication required
Cache-Control: must-revalidate,no-cache,no-store
Date: Thu, 06 Oct 2016 06:44:26 GMT
Pragma: no-cache
Date: Thu, 06 Oct 2016 06:44:26 GMT
Pragma: no-cache
WWW-Authenticate: Negotiate
Set-Cookie: hadoop.auth=; Path=/; Expires=Thu, 01-Jan-1970 00:00:00 GMT;
HttpOnly
Content-Type: text/html; charset=iso-8859-1
Content-Length: 1462
Server: Jetty(6.1.26)
HTTP/1.1 307 TEMPORARY_REDIRECT
Cache-Control: no-cache
Expires: Thu, 06 Oct 2016 06:44:26 GMT
Date: Thu, 06 Oct 2016 06:44:26 GMT
Pragma: no-cache
Expires: Thu, 06 Oct 2016 06:44:26 GMT
Date: Thu, 06 Oct 2016 06:44:26 GMT
Pragma: no-cache
Set-Cookie:
hadoop.auth="u=shfs3453&p=shfs3453@SANDBOX&t=kerberos&e=1475772266290&s=utA8s/id27FTN6tREF647hQKYjg=";
Path=/; Expires=Thu, 06-Oct-2016 16:44:26 GMT; HttpOnly
Content-Type: application/octet-stream
Location:
http://datanode01:1006/webhdfs/v1/user/shfs3453/WORK/datasets/test_%C3%A9lectronique_embarqu%C3%A9.pdf?op=OPEN&delegation=KAAIc2hmczM0NTMIc2hmczM0NTMAigFXmLxmNYoBV7zI6jWOL0uOBfgUWpLKkGUukx6cuEuOJdZQsKMxlZASV0VCSERGUyBkZWxlZ2F0aW9uEzE5Mi4xNjguMjAwLjIzOjgwMjA&namenoderpcaddress=sandbox&offset=0
Content-Length: 0
Server: Jetty(6.1.26)
HTTP/1.1 200 OK
Access-Control-Allow-Methods: GET
Access-Control-Allow-Origin: *
Content-Type: application/octet-stream
Connection: close
Content-Length: 6
hello
{noformat}
The direct WebHDFS request succeeds, but the same GET operation fails when it
goes through Knox.
Below are the DEBUG logs from the curl request through Knox:
{noformat}
Oct 6 09:18:21 knox01 knox DEBUG - org.apache.hadoop.gatewayReceived request:
GET /webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf
Oct 6 09:18:21 localhost 16/10/06 09:18:21
||c8a33790-ea9f-442a-ac8c-6b1d8589ab87|audit|WEBHDFS||||access|uri|/gateway/bigdata/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?OP=OPEN|unavailable|Request
method: GET
Oct 6 09:18:21 knox01 knox INFO - org.apache.hadoop.gatewayComputed userDn:
cn=shfs3453,ou=users,ou=kerberos,dc=rouen,dc=francetelecom.fr using dnTemplate
for principal: shfs3453
Oct 6 09:18:21 localhost 16/10/06 09:18:21
||c8a33790-ea9f-442a-ac8c-6b1d8589ab87|audit|WEBHDFS|shfs3453|||authentication|uri|/gateway/bigdata/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?OP=OPEN|success|
Oct 6 09:18:21 localhost 16/10/06 09:18:21
||c8a33790-ea9f-442a-ac8c-6b1d8589ab87|audit|WEBHDFS|shfs3453|||authentication|uri|/gateway/bigdata/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?OP=OPEN|success|Groups:
[]
Oct 6 09:18:21 knox01 knox DEBUG - org.apache.hadoop.gatewayRewrote URL:
https://knox01:8443/gateway/bigdata/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?OP=OPEN,
direction: IN via explicit rule: WEBHDFS/webhdfs/inbound/namenode/file to URL:
http://namenode01:50070/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?OP=OPEN
Oct 6 09:18:21 knox01 knox DEBUG - org.apache.hadoop.gatewayDispatch request:
GET
http://namenode01:50070/webhdfs/v1/user/shfs3453/WORK/datasets/test_%C3%A9lectronique_embarqu%C3%A9.pdf?doAs=shfs3453&OP=OPEN
Oct 6 09:18:21 localhost 16/10/06 09:18:21
||c8a33790-ea9f-442a-ac8c-6b1d8589ab87|audit|WEBHDFS|shfs3453|||dispatch|uri|http://namenode01:50070/webhdfs/v1/user/shfs3453/WORK/datasets/test_%C3%A9lectronique_embarqu%C3%A9.pdf?doAs=shfs3453&OP=OPEN|unavailable|Request
method: GET
Oct 6 09:18:21 knox01 knox DEBUG - org.apache.hadoop.gatewayDispatch response
status: 307
Oct 6 09:18:21 localhost 16/10/06 09:18:21
||c8a33790-ea9f-442a-ac8c-6b1d8589ab87|audit|WEBHDFS|shfs3453|||dispatch|uri|http://namenode01:50070/webhdfs/v1/user/shfs3453/WORK/datasets/test_%C3%A9lectronique_embarqu%C3%A9.pdf?doAs=shfs3453&OP=OPEN|success|Response
status: 307
Oct 6 09:18:21 knox01 knox DEBUG - org.apache.hadoop.gatewayRewrote URL:
http://datanode05:1006/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?op=OPEN&delegation=KAAIc2hmczM0NTMEa25veARrbm94igFXmNt1TIoBV7zn-UyOL1GOBfgU_4Rrb_m7FUvmXc1StZhxQz_uw7cSV0VCSERGUyBkZWxlZ2F0aW9uEzE5Mi4xNjguMjAwLjIzOjgwMjA&namenoderpcaddress=sandbox&offset=0,
direction: OUT via explicit rule:
WEBHDFS/webhdfs/outbound/namenode/headers/location to URL: https://...
Oct 6 09:18:21 localhost ...10.170.45.30:
8443/gateway/bigdata/webhdfs/data/v1/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?_=AAAACAAAABAAAAEAI5_F8dSFrv2IHOnhme_uecgUne1TgIj23yg16fSf_bXoDWaYVvumymktcLmhOMI8V3UDWQO_TTfA_R5JmNsjYWh85UsE9evQQv3hFW20Cp6Uf_0uzpPKbcb_zbOuv1IYS15SUiHQlraDdd7yo0ABlNwgeASPqexp9mnGC2UCxmMWHkdv-uLyfVr8kjGwzSxjh2yuKh_ThF5sLuUoaaVYCjKJu-882OmpTv75z7CFWbKJQYrLTvrJbiAn_8lPrLfTggv49RbRDzvsSXtzg4RNR7WFW3bIbg1WYQnmGYbsxHine7ei0eAT5y-RJr-EU93uMIo1cLPnbUjcde0Eg_ughl_0KexyqTCVe4L0_f9-i5C3GAPtIV5ScQ
Oct 6 09:18:21 knox01 knox DEBUG - org.apache.hadoop.gatewayInbound response
entity content type: application/octet-stream
Oct 6 09:18:21 localhost 16/10/06 09:18:21
||c8a33790-ea9f-442a-ac8c-6b1d8589ab87|audit|WEBHDFS|shfs3453|||access|uri|/gateway/bigdata/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?OP=OPEN|success|Response
status: 307
Oct 6 09:18:21 knox01 knox DEBUG - org.apache.hadoop.gatewayReceived request:
GET
/webhdfs/data/v1/webhdfs/v1/user/shfs3453/WORK/datasets/test_�lectronique_embarqu�.pdf
Oct 6 09:18:21 localhost 16/10/06 09:18:21
||1fa63f4e-0d7c-4792-ae93-6bb9ce82228d|audit|WEBHDFS||||access|uri|/gateway/bigdata/webhdfs/data/v1/webhdfs/v1/user/shfs3453/WORK/datasets/test_�lectronique_embarqu�.pdf?_=AAAACAAAABAAAAEAI5_F8dSFrv2IHOnhme_uecgUne1TgIj23yg16fSf_bXoDWaYVvumymktcLmhOMI8V3UDWQO_TTfA_R5JmNsjYWh85UsE9evQQv3hFW20Cp6Uf_0uzpPKbcb_zbOuv1IYS15SUiHQlraDdd7yo0ABlNwgeASPqexp9mnGC2UCxmMWHkdv-uLyfVr8kjGwzSxjh2yuKh_ThF5sLuUoaaVYCjKJu-882OmpTv75z7CFWbKJQYrLTvrJbiAn_8lPrLfTggv49RbRDzvsSXtzg4RNR7WFW3bIbg1WYQnmGYbsxHine7ei0eAT5y-RJr-EU93uMIo1cLPnbUjcde0Eg_ughl_0KexyqTCVe4L0_f9-i5C3GAPtIV5ScQ|unavailable|Request
method: GET
Oct 6 09:18:21 localhost 16/10/06 09:18:21
||1fa63f4e-0d7c-4792-ae93-6bb9ce82228d|audit|WEBHDFS||||access|uri|/gateway/bigdata/webhdfs/data/v1/webhdfs/v1/user/shfs3453/WORK/datasets/test_�lectronique_embarqu�.pdf?_=AAAACAAAABAAAAEAI5_F8dSFrv2IHOnhme_uecgUne1TgIj23yg16fSf_bXoDWaYVvumymktcLmhOMI8V3UDWQO_TTfA_R5JmNsjYWh85UsE9evQQv3hFW20Cp6Uf_0uzpPKbcb_zbOuv1IYS15SUiHQlraDdd7yo0ABlNwgeASPqexp9mnGC2UCxmMWHkdv-uLyfVr8kjGwzSxjh2yuKh_ThF5sLuUoaaVYCjKJu-882OmpTv75z7CFWbKJQYrLTvrJbiAn_8lPrLfTggv49RbRDzvsSXtzg4RNR7WFW3bIbg1WYQnmGYbsxHine7ei0eAT5y-RJr-EU93uMIo1cLPnbUjcde0Eg_ughl_0KexyqTCVe4L0_f9-i5C3GAPtIV5ScQ|success|Response
status: 401
{noformat}
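For illustration only: the garbled name in the second request above is what one
would see if the redirect carried "é" as a single ISO-8859-1 byte and that byte
were later decoded as UTF-8. This is an assumption about the mechanism, not
something the logs confirm. The Python sketch below uses only the standard
library and the file name from this report.

```python
from urllib.parse import quote, unquote

name = "test_électronique_embarqué.pdf"

# Percent-encoding the UTF-8 bytes gives the form seen in the
# Location header returned directly by WebHDFS.
encoded = quote(name)
print(encoded)

# Decoding with the same charset round-trips cleanly.
round_trip = unquote(encoded, encoding="utf-8")
print(round_trip == name)

# Assumption: the name is emitted as raw ISO-8859-1 bytes (one byte per "é")
# and then read back as UTF-8. A lone 0xE9 byte is not valid UTF-8, so each
# "é" becomes the replacement character U+FFFD.
raw = name.encode("iso-8859-1")
garbled = raw.decode("utf-8", errors="replace")
print(garbled)
```

If that assumption holds, a Location header that keeps the path percent-encoded
(as the direct WebHDFS response does) would avoid the mismatch.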
The curl request fails because Knox seems to look for the file
"test_�lectronique_embarqu�.pdf", which doesn't exist.
> curl requests fail when dealing with special characters
> -------------------------------------------------------
>
> Key: KNOX-754
> URL: https://issues.apache.org/jira/browse/KNOX-754
> Project: Apache Knox
> Issue Type: Bug
> Components: ClientDSL, Server
> Affects Versions: 0.9.1
> Environment: Apache Knox 0.9.1, Apache Hadoop 2.7.2
> Reporter: Alexandre Linte
> Priority: Critical
>
> Since Knox 0.9.1, Knox can't work with files whose names contain special
> characters such as é, ù, ü, è, etc. This is 100% reproducible. It worked well
> with Knox 0.7.0, so it's a regression.
> This happens when doing a GET or a PUT on such a file, more specifically at
> the "Location" redirect step of the request. You can find an
> example below:
> {noformat}
> [shfs3453@spark01 Pig]$ curl -Iikv -u shfs3453 -X GET
> 'https://knox-gateway.fr/gateway/bigdata/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?OP=OPEN'
> Enter host password for user 'shfs3453':
> * About to connect() to knox-gateway port 443 (#0)
> * Trying 10.117.41.12... connected
> * Connected to knox-gateway (10.117.41.12) port 443 (#0)
> * Initializing NSS with certpath: sql:/etc/pki/nssdb
> * warning: ignoring value of ssl.verifyhost
> * skipping SSL peer certificate verification
> * SSL connection using TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA
> * Server certificate:
> * subject:
> E=********@*****,CN=knox-gateway,OU=*****,O=****,L=*****,ST=*****,C=***
> * start date: Nov 07 11:33:05 2014 GMT
> * expire date: Nov 06 11:33:05 2019 GMT
> * common name: knox-gateway
> * issuer: CN=***************,OU=*******,OU=********,O=******,C=***
> * Server auth using Basic with user 'shfs3453'
> > GET
> > /gateway/bigdata/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?OP=OPEN
> > HTTP/1.1
> > Authorization: Basic c2hmczM0NTM6UGIxOTkxMTAh
> > User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.19.1
> > Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> > Host: knox-gateway
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> HTTP/1.1 307 Temporary Redirect
> < Date: Wed, 05 Oct 2016 07:19:55 GMT
> Date: Wed, 05 Oct 2016 07:19:55 GMT
> < Set-Cookie:
> JSESSIONID=4zv7v1911q5vvcg6r1tqxe77;Path=/gateway/bigdata;Secure;HttpOnly
> Set-Cookie:
> JSESSIONID=4zv7v1911q5vvcg6r1tqxe77;Path=/gateway/bigdata;Secure;HttpOnly
> < Expires: Thu, 01 Jan 1970 00:00:00 GMT
> Expires: Thu, 01 Jan 1970 00:00:00 GMT
> < Set-Cookie: rememberMe=deleteMe; Path=/gateway/bigdata; Max-Age=0;
> Expires=Tue, 04-Oct-2016 07:19:56 GMT
> Set-Cookie: rememberMe=deleteMe; Path=/gateway/bigdata; Max-Age=0;
> Expires=Tue, 04-Oct-2016 07:19:56 GMT
> < Cache-Control: no-cache
> Cache-Control: no-cache
> < Expires: Wed, 05 Oct 2016 07:19:56 GMT
> Expires: Wed, 05 Oct 2016 07:19:56 GMT
> < Date: Wed, 05 Oct 2016 07:19:56 GMT
> Date: Wed, 05 Oct 2016 07:19:56 GMT
> < Pragma: no-cache
> Pragma: no-cache
> < Expires: Wed, 05 Oct 2016 07:19:56 GMT
> Expires: Wed, 05 Oct 2016 07:19:56 GMT
> < Date: Wed, 05 Oct 2016 07:19:56 GMT
> Date: Wed, 05 Oct 2016 07:19:56 GMT
> < Pragma: no-cache
> Pragma: no-cache
> < Location:
> https://knox-gateway/gateway/bigdata/webhdfs/data/v1/webhdfs/v1/user/shfs3453/WORK/datasets/test_▒lectronique_embarqu▒.pdf?_=AAAACAAAABAAAAEAl_jkRL_c3Tzm7hoXMR1KPge4OClEqM4hfs3eslFzfdY5CBbrfaMzOa--NXb08Xjw2O11CkOtyUX5kXwh2IgZmxjw_TNHqQUvVAfFkeXiMDiBXxpbhulsVx3o_NLn9pCLsp09xJ9r1utCHrueYOAvuxY_ksQWuHld2WWGEPyWRubcgb4e6xO2F4jo96NSZhuAP8iarY5LiCtTydLPBXcEbbD146jLD7S83Mhij4VS5sO1asESNH5y8_5Z2PvLcZE11WiTS9alu-9AUqXNixw1t9Y5Em6xDle7s8-oiF3nPVM80RIdbJel4LoeCZuB2zgddLaJAYx5tSb03-QGNzupOPQ5UQ0_7ybPwmAsgiFfFNuvMbj9sKgxLg
> Location:
> https://knox-gateway/gateway/bigdata/webhdfs/data/v1/webhdfs/v1/user/shfs3453/WORK/datasets/test_▒lectronique_embarqu▒.pdf?_=AAAACAAAABAAAAEAl_jkRL_c3Tzm7hoXMR1KPge4OClEqM4hfs3eslFzfdY5CBbrfaMzOa--NXb08Xjw2O11CkOtyUX5kXwh2IgZmxjw_TNHqQUvVAfFkeXiMDiBXxpbhulsVx3o_NLn9pCLsp09xJ9r1utCHrueYOAvuxY_ksQWuHld2WWGEPyWRubcgb4e6xO2F4jo96NSZhuAP8iarY5LiCtTydLPBXcEbbD146jLD7S83Mhij4VS5sO1asESNH5y8_5Z2PvLcZE11WiTS9alu-9AUqXNixw1t9Y5Em6xDle7s8-oiF3nPVM80RIdbJel4LoeCZuB2zgddLaJAYx5tSb03-QGNzupOPQ5UQ0_7ybPwmAsgiFfFNuvMbj9sKgxLg
> < Content-Type: application/octet-stream
> Content-Type: application/octet-stream
> < Server: Jetty(6.1.26)
> Server: Jetty(6.1.26)
> < Content-Length: 0
> Content-Length: 0
> <
> * Connection #0 to host knox-gateway left intact
> * Issue another request to this URL:
> 'https://knox-gateway/gateway/bigdata/webhdfs/data/v1/webhdfs/v1/user/shfs3453/WORK/datasets/test_▒lectronique_embarqu▒.pdf?_=AAAACAAAABAAAAEAl_jkRL_c3Tzm7hoXMR1KPge4OClEqM4hfs3eslFzfdY5CBbrfaMzOa--NXb08Xjw2O11CkOtyUX5kXwh2IgZmxjw_TNHqQUvVAfFkeXiMDiBXxpbhulsVx3o_NLn9pCLsp09xJ9r1utCHrueYOAvuxY_ksQWuHld54aYH365vWu3V6u-_BaX3E1Ax5puYXZkEypdB2SOKVFW5bqIi5JFkgUr_XV8bXdcFTcdbohr82pKVBqmK-OvSZnCAVSdy4Yjyf51fSLo_n07ElHK84zqsXEMLU1zF5DHbSKC_jwpGahsm5VlYK7H5Ppwt0SNFHIx50O9yBpPLHYNe-ALlqTOlq6UT3ifFufhmKsY6chP6IxLw7ZyCHrBSA'
> * Re-using existing connection! (#0) with host knox-gateway
> * Connected to knox-gateway (10.117.41.12) port 443 (#0)
> * Server auth using Basic with user 'shfs3453'
> > GET
> > /gateway/bigdata/webhdfs/data/v1/webhdfs/v1/user/shfs3453/WORK/datasets/test_▒lectronique_embarqu▒.pdf?_=AAAACAAAABAAAAEAl_jkRL_c3Tzm7hoXMR1KPge4OClEqM4hfs3eslFzfdY5CBbrfaMzOa--NXb08Xjw2O11CkOtyUX5kXwh2IgZmxjw_TNHqQUvVAfFkeXiMDiBXxpbhulsVx3o_NLn9pCLsp09xJ9r1utCHrueYOAvuxY_ksQWuHld54aYH365vWu3V6u-_BaX3E1Ax5puYXZkEypdB2SOKVFW5bqIi5JFkgUr_XV8bXdcFTcdbohr82pKVBqmK-OvSZnCAVSdy4Yjyf51fSLo_n07ElHK84zqsXEMLU1zF5DHbSKC_jwpGahsm5VlYK7H5Ppwt0SNFHIx50O9yBpPLHYNe-ALlqTOlq6UT3ifFufhmKsY6chP6IxLw7ZyCHrBSA
> > HTTP/1.1
> > Authorization: Basic c2hmczM0NTM6UGIxOTkxMTAh
> > User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.19.1
> > Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> > Host: knox-gateway
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> HTTP/1.1 404 Not Found
> < Date: Wed, 05 Oct 2016 07:32:07 GMT
> Date: Wed, 05 Oct 2016 07:32:07 GMT
> < Set-Cookie:
> JSESSIONID=1kp671ikau2cieuzdlw84yeh;Path=/gateway/bigdata;Secure;HttpOnly
> Set-Cookie:
> JSESSIONID=1kp671ikau2cieuzdlw84yeh;Path=/gateway/bigdata;Secure;HttpOnly
> < Expires: Thu, 01 Jan 1970 00:00:00 GMT
> Expires: Thu, 01 Jan 1970 00:00:00 GMT
> < Set-Cookie: rememberMe=deleteMe; Path=/gateway/bigdata; Max-Age=0;
> Expires=Tue, 04-Oct-2016 07:32:07 GMT
> Set-Cookie: rememberMe=deleteMe; Path=/gateway/bigdata; Max-Age=0;
> Expires=Tue, 04-Oct-2016 07:32:07 GMT
> < Content-Type: application/json; charset=utf-8
> Content-Type: application/json; charset=utf-8
> < Connection: close
> Connection: close
> < Server: Jetty(9.2.15.v20160210)
> Server: Jetty(9.2.15.v20160210)
> <
> * Closing connection #0
> {noformat}