[ https://issues.apache.org/jira/browse/KNOX-949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115870#comment-16115870 ]

Kevin Risden commented on KNOX-949:
-----------------------------------

[~lmccay] - sorry, I never got back to update this ticket. The patch looks 
like it should resolve the issue. I tested it a bit and everything looked OK. 
Thanks for the work on this!

> WebHDFS proxy replaces %20 encoded spaces in URL with + encoding
> ----------------------------------------------------------------
>
>                 Key: KNOX-949
>                 URL: https://issues.apache.org/jira/browse/KNOX-949
>             Project: Apache Knox
>          Issue Type: Bug
>    Affects Versions: 0.11.0
>            Reporter: Alex Willmer
>            Assignee: Larry McCay
>            Priority: Blocker
>             Fix For: 0.13.0
>
>         Attachments: knox-0.13-with-KNOX-949-001-patch.log
>
>
> If a file with spaces in its name (e.g. {{foo bar.txt}}) is requested from 
> HDFS through WebHDFS and Knox, Knox rewrites the {{%20}} encoding in the URL 
> sent by the client as {{+}} encoding (e.g. {{foo%20bar.txt}} -> 
> {{foo+bar.txt}}). WebHDFS then returns an HTTP 404, which Knox passes back 
> to the client. Requesting the same file directly from WebHDFS works. Example
> Client request
> {noformat}
> curl "https://<hostname>:18443/gateway/<cluster>/webhdfs/v1/docs/filename%20with%20spaces.pdf?op=OPEN" \
>      -u <username>:<password> -k -s
> {noformat}
> Knox response body
> {noformat}
> {"exception":"FileNotFoundException",
>  "javaClassName":"java.io.FileNotFoundException",
>  "message":"File /docs/filename+with+spaces.pdf not found."}
> {noformat}
> Knox logs
> {noformat}
> ==> /var/log/hadoop/knox/gateway-audit.log <==
> 17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS||||access|uri|/gateway/<cluster>/webhdfs/v1/docs/filename with spaces.pdf?op=OPEN|unavailable|Request method: GET
> 17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<username>|||authentication|uri|/gateway/<cluster>/webhdfs/v1/docs/filename with spaces.pdf?op=OPEN|success|
> 17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<username>|||authentication|uri|/gateway/<cluster>/webhdfs/v1/docs/filename with spaces.pdf?op=OPEN|success|Groups: []
> 17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<username>|||authorization|uri|/gateway/<cluster>/webhdfs/v1/docs/filename with spaces.pdf?op=OPEN|success|
> 17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<username>|||dispatch|uri|http://<namenode>.<cluster>:50070/webhdfs/v1/docs/filename+with+spaces.pdf?op=OPEN&doAs=<username>|unavailable|Request method: GET
> 17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<username>|||dispatch|uri|http://<namenode>.<cluster>:50070/webhdfs/v1/docs/filename+with+spaces.pdf?op=OPEN&doAs=<username>|success|Response status: 404
> 17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<username>|||access|uri|/gateway/<cluster>/webhdfs/v1/docs/filename with spaces.pdf?op=OPEN|success|Response status: 404
> ==> /var/log/hadoop/knox/gateway.log <==
> 2017-05-24 15:51:05,254 INFO  hadoop.gateway (KnoxLdapRealm.java:getUserDn(691)) - Computed userDn: uid=<username>,cn=users,cn=accounts,dc=<cluster> using dnTemplate for principal: <username>
> 2017-05-24 15:51:05,259 INFO  hadoop.gateway (AclsAuthorizationFilter.java:doFilter(85)) - Access Granted: true
> {noformat}
> Direct WebHDFS request for the same file
> {noformat}
> # curl -si -u: "http://<namenode>:50070/webhdfs/v1/docs/filename%20with%20spaces.pdf?op=OPEN" --negotiate -L | head -n40
> HTTP/1.1 401 Authentication required
> Cache-Control: must-revalidate,no-cache,no-store
> Date: Wed, 24 May 2017 19:01:41 GMT
> Pragma: no-cache
> Date: Wed, 24 May 2017 19:01:41 GMT
> Pragma: no-cache
> X-FRAME-OPTIONS: SAMEORIGIN
> WWW-Authenticate: Negotiate
> Set-Cookie: hadoop.auth=; Path=/; HttpOnly
> Content-Type: text/html; charset=iso-8859-1
> Content-Length: 1533
> Server: Jetty(6.1.26.hwx)
> HTTP/1.1 307 TEMPORARY_REDIRECT
> Cache-Control: no-cache
> Expires: Wed, 24 May 2017 19:01:42 GMT
> Date: Wed, 24 May 2017 19:01:42 GMT
> Pragma: no-cache
> Expires: Wed, 24 May 2017 19:01:42 GMT
> Date: Wed, 24 May 2017 19:01:42 GMT
> Pragma: no-cache
> X-FRAME-OPTIONS: SAMEORIGIN
> WWW-Authenticate: Negotiate YGkGCSqGSIb3EgECAgIAb1owWKADAgEFoQMCAQ+iTDBKoAMCARKiQwRBQM/auuLcl2xey6wMp6EjCPJFSqK3snscxMzW7RvfgxOo7182GzD5N9jf+OWGr+tjpvlRX0c/7iTBfYKSetf4ekU=
> Set-Cookie: hadoop.auth="u=admin&p=admin@CYSAFA&t=kerberos&e=1495688502002&s=b7p35TgaxItAUTkKJuSXuynoq9E="; Path=/; HttpOnly
> Content-Type: application/octet-stream
> Location: http://<datanode3>:1022/webhdfs/v1/docs/filename%20with%20spaces.pdf?op=OPEN&delegation=HgAFYWRtaW4FYWRtaW4AigFcO9YJ8ooBXF_ijfJFAxSBYFUnsXY3up11ZNIi4hIi__5RvRJXRUJIREZTIGRlbGVnYXRpb24PMTcyLjE4LjAuOTo4MDIw&namenoderpcaddress=<namenode>:8020&offset=0
> Content-Length: 0
> Server: Jetty(6.1.26.hwx)
> HTTP/1.1 200 OK
> Access-Control-Allow-Methods: GET
> Access-Control-Allow-Origin: *
> Content-Type: application/octet-stream
> Connection: close
> Content-Length: 13365618
> %PDF-1.6
> <</Filter/FlateDecode/First 157/Length 5350/N 16/Type/ObjStm>>stream
> ...
> {noformat}
> See also
>  - http://mail-archives.apache.org/mod_mbox/knox-user/201705.mbox/%3C335C4DD06CF6C24EAA7A73F44D43D7CB4E6EB300%40SE-EX021.groupinfra.com%3E
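
The report above turns on the difference between two encodings of a space. The following is a minimal sketch of that difference using only JDK classes. It is not Knox's rewrite code and makes no claim about where in Knox the substitution happens; the class name SpaceEncodingDemo and its helper methods are hypothetical, introduced only for illustration. It shows that form-style encoding yields {{+}}, that path encoding yields {{%20}}, and that a {{+}} in a URL path stays a literal plus sign when the path is decoded, which would explain why WebHDFS looks for a file named filename+with+spaces.pdf.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

// Hypothetical demo class; not part of Knox or Hadoop.
final class SpaceEncodingDemo {

    // Form-style (application/x-www-form-urlencoded) encoding: space becomes '+'.
    static String formEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // URI path encoding: the multi-argument URI constructor quotes illegal
    // path characters, so a space becomes "%20".
    static String pathEncode(String path) {
        try {
            return new URI(null, null, path, null).getRawPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // URI path decoding: percent-escapes are decoded, but '+' is an ordinary
    // path character and is left untouched.
    static String pathDecode(String rawPath) {
        return URI.create(rawPath).getPath();
    }

    public static void main(String[] args) {
        String name = "filename with spaces.pdf";
        System.out.println(formEncode(name));           // filename+with+spaces.pdf
        System.out.println(pathEncode("/docs/" + name)); // /docs/filename%20with%20spaces.pdf
        System.out.println(pathDecode("/docs/filename%20with%20spaces.pdf"));
        System.out.println(pathDecode("/docs/filename+with+spaces.pdf"));
    }
}
```

The last two lines of main print "/docs/filename with spaces.pdf" and "/docs/filename+with+spaces.pdf" respectively: only the {{%20}} form round-trips to the original file name when it appears in the path component.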



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
