[ 
https://issues.apache.org/jira/browse/KNOX-754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15551157#comment-15551157
 ] 

Alexandre Linte commented on KNOX-754:
--------------------------------------

Hi [~lmccay],

Yes, you're right: the special characters are in the file name, so they are 
visible in the URL.

I tried accessing my file directly through WebHDFS instead of Knox, and I can 
successfully get its content:

{noformat}
[shfs3453@spark01 ~]$ curl -i -L --negotiate -u : "http://namenode01:50070/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?op=OPEN"
HTTP/1.1 401 Authentication required
Cache-Control: must-revalidate,no-cache,no-store
Date: Thu, 06 Oct 2016 06:44:26 GMT
Pragma: no-cache
Date: Thu, 06 Oct 2016 06:44:26 GMT
Pragma: no-cache
WWW-Authenticate: Negotiate
Set-Cookie: hadoop.auth=; Path=/; Expires=Thu, 01-Jan-1970 00:00:00 GMT; 
HttpOnly
Content-Type: text/html; charset=iso-8859-1
Content-Length: 1462
Server: Jetty(6.1.26)

HTTP/1.1 307 TEMPORARY_REDIRECT
Cache-Control: no-cache
Expires: Thu, 06 Oct 2016 06:44:26 GMT
Date: Thu, 06 Oct 2016 06:44:26 GMT
Pragma: no-cache
Expires: Thu, 06 Oct 2016 06:44:26 GMT
Date: Thu, 06 Oct 2016 06:44:26 GMT
Pragma: no-cache
Set-Cookie: 
hadoop.auth="u=shfs3453&p=shfs3453@SANDBOX&t=kerberos&e=1475772266290&s=utA8s/id27FTN6tREF647hQKYjg=";
 Path=/; Expires=Thu, 06-Oct-2016 16:44:26 GMT; HttpOnly
Content-Type: application/octet-stream
Location: 
http://datanode01:1006/webhdfs/v1/user/shfs3453/WORK/datasets/test_%C3%A9lectronique_embarqu%C3%A9.pdf?op=OPEN&delegation=KAAIc2hmczM0NTMIc2hmczM0NTMAigFXmLxmNYoBV7zI6jWOL0uOBfgUWpLKkGUukx6cuEuOJdZQsKMxlZASV0VCSERGUyBkZWxlZ2F0aW9uEzE5Mi4xNjguMjAwLjIzOjgwMjA&namenoderpcaddress=sandbox&offset=0
Content-Length: 0
Server: Jetty(6.1.26)

HTTP/1.1 200 OK
Access-Control-Allow-Methods: GET
Access-Control-Allow-Origin: *
Content-Type: application/octet-stream
Connection: close
Content-Length: 6

hello
{noformat}

As you can see, direct access works, whereas the same GET operation fails 
through Knox.
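
For reference, the Location header returned by the NameNode above carries the 
file name percent-encoded as UTF-8 (%C3%A9 for é). Here is a minimal Python 
sketch that reproduces that encoding. The function name is hypothetical and 
this is not Knox or Hadoop code. I have not verified whether sending a 
pre-encoded URL to Knox avoids the failure.

```python
from urllib.parse import quote


def encode_webhdfs_path(path: str) -> str:
    """Percent-encode a path as UTF-8, keeping '/' separators intact.

    Hypothetical helper for illustration only.
    """
    return quote(path, safe="/", encoding="utf-8")


if __name__ == "__main__":
    raw = "/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf"
    # Each "é" becomes the two UTF-8 bytes C3 A9, written as %C3%A9.
    print(encode_webhdfs_path(raw))
```

The result has the same form as the path in the datanode01 Location header 
shown in the output above.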

Below are the DEBUG logs for the same curl request going through Knox:

{noformat}
Oct  6 09:18:21 knox01 knox DEBUG - org.apache.hadoop.gatewayReceived request: 
GET /webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf
Oct  6 09:18:21 localhost 16/10/06 09:18:21 
||c8a33790-ea9f-442a-ac8c-6b1d8589ab87|audit|WEBHDFS||||access|uri|/gateway/bigdata/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?OP=OPEN|unavailable|Request
 method: GET
Oct  6 09:18:21 knox01 knox INFO - org.apache.hadoop.gatewayComputed userDn: 
cn=shfs3453,ou=users,ou=kerberos,dc=rouen,dc=francetelecom.fr using dnTemplate 
for principal: shfs3453
Oct  6 09:18:21 localhost 16/10/06 09:18:21 
||c8a33790-ea9f-442a-ac8c-6b1d8589ab87|audit|WEBHDFS|shfs3453|||authentication|uri|/gateway/bigdata/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?OP=OPEN|success|
Oct  6 09:18:21 localhost 16/10/06 09:18:21 
||c8a33790-ea9f-442a-ac8c-6b1d8589ab87|audit|WEBHDFS|shfs3453|||authentication|uri|/gateway/bigdata/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?OP=OPEN|success|Groups:
 []
Oct  6 09:18:21 knox01 knox DEBUG - org.apache.hadoop.gatewayRewrote URL: 
https://knox01:8443/gateway/bigdata/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?OP=OPEN,
 direction: IN via explicit rule: WEBHDFS/webhdfs/inbound/namenode/file to URL: 
http://namenode01:50070/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?OP=OPEN
Oct  6 09:18:21 knox01 knox DEBUG - org.apache.hadoop.gatewayDispatch request: 
GET 
http://namenode01:50070/webhdfs/v1/user/shfs3453/WORK/datasets/test_%C3%A9lectronique_embarqu%C3%A9.pdf?doAs=shfs3453&OP=OPEN
Oct  6 09:18:21 localhost 16/10/06 09:18:21 
||c8a33790-ea9f-442a-ac8c-6b1d8589ab87|audit|WEBHDFS|shfs3453|||dispatch|uri|http://namenode01:50070/webhdfs/v1/user/shfs3453/WORK/datasets/test_%C3%A9lectronique_embarqu%C3%A9.pdf?doAs=shfs3453&OP=OPEN|unavailable|Request
 method: GET
Oct  6 09:18:21 knox01 knox DEBUG - org.apache.hadoop.gatewayDispatch response 
status: 307
Oct  6 09:18:21 localhost 16/10/06 09:18:21 
||c8a33790-ea9f-442a-ac8c-6b1d8589ab87|audit|WEBHDFS|shfs3453|||dispatch|uri|http://namenode01:50070/webhdfs/v1/user/shfs3453/WORK/datasets/test_%C3%A9lectronique_embarqu%C3%A9.pdf?doAs=shfs3453&OP=OPEN|success|Response
 status: 307
Oct  6 09:18:21 knox01 knox DEBUG - org.apache.hadoop.gatewayRewrote URL: 
http://datanode05:1006/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?op=OPEN&delegation=KAAIc2hmczM0NTMEa25veARrbm94igFXmNt1TIoBV7zn-UyOL1GOBfgU_4Rrb_m7FUvmXc1StZhxQz_uw7cSV0VCSERGUyBkZWxlZ2F0aW9uEzE5Mi4xNjguMjAwLjIzOjgwMjA&namenoderpcaddress=sandbox&offset=0,
 direction: OUT via explicit rule: 
WEBHDFS/webhdfs/outbound/namenode/headers/location to URL: https://...
Oct  6 09:18:21 localhost ...10.170.45.30: 
8443/gateway/bigdata/webhdfs/data/v1/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?_=AAAACAAAABAAAAEAI5_F8dSFrv2IHOnhme_uecgUne1TgIj23yg16fSf_bXoDWaYVvumymktcLmhOMI8V3UDWQO_TTfA_R5JmNsjYWh85UsE9evQQv3hFW20Cp6Uf_0uzpPKbcb_zbOuv1IYS15SUiHQlraDdd7yo0ABlNwgeASPqexp9mnGC2UCxmMWHkdv-uLyfVr8kjGwzSxjh2yuKh_ThF5sLuUoaaVYCjKJu-882OmpTv75z7CFWbKJQYrLTvrJbiAn_8lPrLfTggv49RbRDzvsSXtzg4RNR7WFW3bIbg1WYQnmGYbsxHine7ei0eAT5y-RJr-EU93uMIo1cLPnbUjcde0Eg_ughl_0KexyqTCVe4L0_f9-i5C3GAPtIV5ScQ
Oct  6 09:18:21 knox01 knox DEBUG - org.apache.hadoop.gatewayInbound response 
entity content type: application/octet-stream
Oct  6 09:18:21 localhost 16/10/06 09:18:21 
||c8a33790-ea9f-442a-ac8c-6b1d8589ab87|audit|WEBHDFS|shfs3453|||access|uri|/gateway/bigdata/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?OP=OPEN|success|Response
 status: 307
Oct  6 09:18:21 knox01 knox DEBUG - org.apache.hadoop.gatewayReceived request: 
GET 
/webhdfs/data/v1/webhdfs/v1/user/shfs3453/WORK/datasets/test_�lectronique_embarqu�.pdf
Oct  6 09:18:21 localhost 16/10/06 09:18:21 
||1fa63f4e-0d7c-4792-ae93-6bb9ce82228d|audit|WEBHDFS||||access|uri|/gateway/bigdata/webhdfs/data/v1/webhdfs/v1/user/shfs3453/WORK/datasets/test_�lectronique_embarqu�.pdf?_=AAAACAAAABAAAAEAI5_F8dSFrv2IHOnhme_uecgUne1TgIj23yg16fSf_bXoDWaYVvumymktcLmhOMI8V3UDWQO_TTfA_R5JmNsjYWh85UsE9evQQv3hFW20Cp6Uf_0uzpPKbcb_zbOuv1IYS15SUiHQlraDdd7yo0ABlNwgeASPqexp9mnGC2UCxmMWHkdv-uLyfVr8kjGwzSxjh2yuKh_ThF5sLuUoaaVYCjKJu-882OmpTv75z7CFWbKJQYrLTvrJbiAn_8lPrLfTggv49RbRDzvsSXtzg4RNR7WFW3bIbg1WYQnmGYbsxHine7ei0eAT5y-RJr-EU93uMIo1cLPnbUjcde0Eg_ughl_0KexyqTCVe4L0_f9-i5C3GAPtIV5ScQ|unavailable|Request
 method: GET
Oct  6 09:18:21 localhost 16/10/06 09:18:21 
||1fa63f4e-0d7c-4792-ae93-6bb9ce82228d|audit|WEBHDFS||||access|uri|/gateway/bigdata/webhdfs/data/v1/webhdfs/v1/user/shfs3453/WORK/datasets/test_�lectronique_embarqu�.pdf?_=AAAACAAAABAAAAEAI5_F8dSFrv2IHOnhme_uecgUne1TgIj23yg16fSf_bXoDWaYVvumymktcLmhOMI8V3UDWQO_TTfA_R5JmNsjYWh85UsE9evQQv3hFW20Cp6Uf_0uzpPKbcb_zbOuv1IYS15SUiHQlraDdd7yo0ABlNwgeASPqexp9mnGC2UCxmMWHkdv-uLyfVr8kjGwzSxjh2yuKh_ThF5sLuUoaaVYCjKJu-882OmpTv75z7CFWbKJQYrLTvrJbiAn_8lPrLfTggv49RbRDzvsSXtzg4RNR7WFW3bIbg1WYQnmGYbsxHine7ei0eAT5y-RJr-EU93uMIo1cLPnbUjcde0Eg_ughl_0KexyqTCVe4L0_f9-i5C3GAPtIV5ScQ|success|Response
 status: 401
{noformat}

The curl request fails because Knox seems to request the file 
"test_�lectronique_embarqu�.pdf" (with the accented characters mangled), which 
doesn't exist.
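
One possible explanation, which is an assumption on my side and not confirmed 
from the Knox code, is a charset mismatch: if the "é" is written as a single 
ISO-8859-1 byte (0xE9) and later read back as UTF-8, that byte is invalid and 
gets replaced by U+FFFD (�). Here is a minimal Python sketch of that effect. 
The function name is hypothetical.

```python
def mangle_latin1_as_utf8(name: str) -> str:
    """Encode as ISO-8859-1, then decode as UTF-8 with replacement.

    Hypothetical illustration of a charset mismatch; not Knox code.
    """
    raw = name.encode("iso-8859-1")  # "é" becomes the single byte 0xE9
    # 0xE9 is not followed by UTF-8 continuation bytes, so it is invalid
    # and is replaced by U+FFFD.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    original = "test_électronique_embarqué.pdf"
    print(mangle_latin1_as_utf8(original))
```

The mangled name produced this way has the same shape as the one in the second 
"Received request" log line, but this only shows that the symptom is consistent 
with such a mismatch, not where it happens.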

> curl requests fail when dealing with special characters
> -------------------------------------------------------
>
>                 Key: KNOX-754
>                 URL: https://issues.apache.org/jira/browse/KNOX-754
>             Project: Apache Knox
>          Issue Type: Bug
>          Components: ClientDSL, Server
>    Affects Versions: 0.9.1
>         Environment: Apache Knox 0.9.1, Apache Hadoop 2.7.2
>            Reporter: Alexandre Linte
>            Priority: Critical
>
> Since Knox 0.9.1, Knox can't handle files whose names contain special 
> characters such as é, ù, ü, è. This is 100% reproducible. It worked well 
> with Knox 0.7.0, so it's a regression. 
> This happens when doing a GET or a PUT of such a file, more specifically 
> at the redirect ("Location") step of the request. You can find an 
> example below:
> {noformat}
> [shfs3453@spark01 Pig]$ curl -Iikv -u shfs3453 -X GET 
> 'https://knox-gateway.fr/gateway/bigdata/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?OP=OPEN'
> Enter host password for user 'shfs3453':
> * About to connect() to knox-gateway port 443 (#0)
> *   Trying 10.117.41.12... connected
> * Connected to knox-gateway (10.117.41.12) port 443 (#0)
> * Initializing NSS with certpath: sql:/etc/pki/nssdb
> * warning: ignoring value of ssl.verifyhost
> * skipping SSL peer certificate verification
> * SSL connection using TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA
> * Server certificate:
> *       subject: 
> E=********@*****,CN=knox-gateway,OU=*****,O=****,L=*****,ST=*****,C=***
> *       start date: Nov 07 11:33:05 2014 GMT
> *       expire date: Nov 06 11:33:05 2019 GMT
> *       common name: knox-gateway
> *       issuer: CN=***************,OU=*******,OU=********,O=******,C=***
> * Server auth using Basic with user 'shfs3453'
> > GET 
> > /gateway/bigdata/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?OP=OPEN
> >  HTTP/1.1
> > Authorization: Basic c2hmczM0NTM6UGIxOTkxMTAh
> > User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.19.1 
> > Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> > Host: knox-gateway
> > Accept: */*
> >
> < HTTP/1.1 307 Temporary Redirect
> HTTP/1.1 307 Temporary Redirect
> < Date: Wed, 05 Oct 2016 07:19:55 GMT
> Date: Wed, 05 Oct 2016 07:19:55 GMT
> < Set-Cookie: 
> JSESSIONID=4zv7v1911q5vvcg6r1tqxe77;Path=/gateway/bigdata;Secure;HttpOnly
> Set-Cookie: 
> JSESSIONID=4zv7v1911q5vvcg6r1tqxe77;Path=/gateway/bigdata;Secure;HttpOnly
> < Expires: Thu, 01 Jan 1970 00:00:00 GMT
> Expires: Thu, 01 Jan 1970 00:00:00 GMT
> < Set-Cookie: rememberMe=deleteMe; Path=/gateway/bigdata; Max-Age=0; 
> Expires=Tue, 04-Oct-2016 07:19:56 GMT
> Set-Cookie: rememberMe=deleteMe; Path=/gateway/bigdata; Max-Age=0; 
> Expires=Tue, 04-Oct-2016 07:19:56 GMT
> < Cache-Control: no-cache
> Cache-Control: no-cache
> < Expires: Wed, 05 Oct 2016 07:19:56 GMT
> Expires: Wed, 05 Oct 2016 07:19:56 GMT
> < Date: Wed, 05 Oct 2016 07:19:56 GMT
> Date: Wed, 05 Oct 2016 07:19:56 GMT
> < Pragma: no-cache
> Pragma: no-cache
> < Expires: Wed, 05 Oct 2016 07:19:56 GMT
> Expires: Wed, 05 Oct 2016 07:19:56 GMT
> < Date: Wed, 05 Oct 2016 07:19:56 GMT
> Date: Wed, 05 Oct 2016 07:19:56 GMT
> < Pragma: no-cache
> Pragma: no-cache
> < Location: 
> https://knox-gateway/gateway/bigdata/webhdfs/data/v1/webhdfs/v1/user/shfs3453/WORK/datasets/test_▒lectronique_embarqu▒.pdf?_=AAAACAAAABAAAAEAl_jkRL_c3Tzm7hoXMR1KPge4OClEqM4hfs3eslFzfdY5CBbrfaMzOa--NXb08Xjw2O11CkOtyUX5kXwh2IgZmxjw_TNHqQUvVAfFkeXiMDiBXxpbhulsVx3o_NLn9pCLsp09xJ9r1utCHrueYOAvuxY_ksQWuHld2WWGEPyWRubcgb4e6xO2F4jo96NSZhuAP8iarY5LiCtTydLPBXcEbbD146jLD7S83Mhij4VS5sO1asESNH5y8_5Z2PvLcZE11WiTS9alu-9AUqXNixw1t9Y5Em6xDle7s8-oiF3nPVM80RIdbJel4LoeCZuB2zgddLaJAYx5tSb03-QGNzupOPQ5UQ0_7ybPwmAsgiFfFNuvMbj9sKgxLg
> Location: 
> https://knox-gateway/gateway/bigdata/webhdfs/data/v1/webhdfs/v1/user/shfs3453/WORK/datasets/test_▒lectronique_embarqu▒.pdf?_=AAAACAAAABAAAAEAl_jkRL_c3Tzm7hoXMR1KPge4OClEqM4hfs3eslFzfdY5CBbrfaMzOa--NXb08Xjw2O11CkOtyUX5kXwh2IgZmxjw_TNHqQUvVAfFkeXiMDiBXxpbhulsVx3o_NLn9pCLsp09xJ9r1utCHrueYOAvuxY_ksQWuHld2WWGEPyWRubcgb4e6xO2F4jo96NSZhuAP8iarY5LiCtTydLPBXcEbbD146jLD7S83Mhij4VS5sO1asESNH5y8_5Z2PvLcZE11WiTS9alu-9AUqXNixw1t9Y5Em6xDle7s8-oiF3nPVM80RIdbJel4LoeCZuB2zgddLaJAYx5tSb03-QGNzupOPQ5UQ0_7ybPwmAsgiFfFNuvMbj9sKgxLg
> < Content-Type: application/octet-stream
> Content-Type: application/octet-stream
> < Server: Jetty(6.1.26)
> Server: Jetty(6.1.26)
> < Content-Length: 0
> Content-Length: 0
> <
> * Connection #0 to host knox-gateway left intact
> * Issue another request to this URL: 
> 'https://knox-gateway/gateway/bigdata/webhdfs/data/v1/webhdfs/v1/user/shfs3453/WORK/datasets/test_▒lectronique_embarqu▒.pdf?_=AAAACAAAABAAAAEAl_jkRL_c3Tzm7hoXMR1KPge4OClEqM4hfs3eslFzfdY5CBbrfaMzOa--NXb08Xjw2O11CkOtyUX5kXwh2IgZmxjw_TNHqQUvVAfFkeXiMDiBXxpbhulsVx3o_NLn9pCLsp09xJ9r1utCHrueYOAvuxY_ksQWuHld54aYH365vWu3V6u-_BaX3E1Ax5puYXZkEypdB2SOKVFW5bqIi5JFkgUr_XV8bXdcFTcdbohr82pKVBqmK-OvSZnCAVSdy4Yjyf51fSLo_n07ElHK84zqsXEMLU1zF5DHbSKC_jwpGahsm5VlYK7H5Ppwt0SNFHIx50O9yBpPLHYNe-ALlqTOlq6UT3ifFufhmKsY6chP6IxLw7ZyCHrBSA'
> * Re-using existing connection! (#0) with host knox-gateway
> * Connected to knox-gateway (10.117.41.12) port 443 (#0)
> * Server auth using Basic with user 'shfs3453'
> > GET 
> > /gateway/bigdata/webhdfs/data/v1/webhdfs/v1/user/shfs3453/WORK/datasets/test_▒lectronique_embarqu▒.pdf?_=AAAACAAAABAAAAEAl_jkRL_c3Tzm7hoXMR1KPge4OClEqM4hfs3eslFzfdY5CBbrfaMzOa--NXb08Xjw2O11CkOtyUX5kXwh2IgZmxjw_TNHqQUvVAfFkeXiMDiBXxpbhulsVx3o_NLn9pCLsp09xJ9r1utCHrueYOAvuxY_ksQWuHld54aYH365vWu3V6u-_BaX3E1Ax5puYXZkEypdB2SOKVFW5bqIi5JFkgUr_XV8bXdcFTcdbohr82pKVBqmK-OvSZnCAVSdy4Yjyf51fSLo_n07ElHK84zqsXEMLU1zF5DHbSKC_jwpGahsm5VlYK7H5Ppwt0SNFHIx50O9yBpPLHYNe-ALlqTOlq6UT3ifFufhmKsY6chP6IxLw7ZyCHrBSA
> >  HTTP/1.1
> > Authorization: Basic c2hmczM0NTM6UGIxOTkxMTAh
> > User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.19.1 
> > Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> > Host: knox-gateway
> > Accept: */*
> >
> < HTTP/1.1 404 Not Found
> HTTP/1.1 404 Not Found
> < Date: Wed, 05 Oct 2016 07:32:07 GMT
> Date: Wed, 05 Oct 2016 07:32:07 GMT
> < Set-Cookie: 
> JSESSIONID=1kp671ikau2cieuzdlw84yeh;Path=/gateway/bigdata;Secure;HttpOnly
> Set-Cookie: 
> JSESSIONID=1kp671ikau2cieuzdlw84yeh;Path=/gateway/bigdata;Secure;HttpOnly
> < Expires: Thu, 01 Jan 1970 00:00:00 GMT
> Expires: Thu, 01 Jan 1970 00:00:00 GMT
> < Set-Cookie: rememberMe=deleteMe; Path=/gateway/bigdata; Max-Age=0; 
> Expires=Tue, 04-Oct-2016 07:32:07 GMT
> Set-Cookie: rememberMe=deleteMe; Path=/gateway/bigdata; Max-Age=0; 
> Expires=Tue, 04-Oct-2016 07:32:07 GMT
> < Content-Type: application/json; charset=utf-8
> Content-Type: application/json; charset=utf-8
> < Connection: close
> Connection: close
> < Server: Jetty(9.2.15.v20160210)
> Server: Jetty(9.2.15.v20160210)
> <
> * Closing connection #0
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
