Hi Alex - I notice from the audit log that the 404 is actually coming from WebHDFS, not from Knox. Can you confirm that direct access to WebHDFS, without going through Knox, works with the same URL?
thanks,

--larry

On Wed, May 24, 2017 at 12:32 PM, Willmer, Alex (UK Defence) <[email protected]> wrote:

> How should I encode spaces characters in the URL when I make a request to
> WebHDFS through Knox? Or should be enabling/configuring something in Knox
> to handle them?
>
> I'm making the following (redacted values in <>) request to WebHDFS,
> through Knox
>
> curl "https://<hostname>:18443/gateway/<cluster>/webhdfs/v1/
> docs/filename%20with%20spaces.pdf?op=OPEN" \
> -<username>:<password> -k -s
>
> However Knox is returning HTTP 404 with the following body
> (whitespace/formatting added by me)
>
> {"exception":"FileNotFoundException",
> "javaClassName":"java.io.FileNotFoundException",
> "message":"File /docs/filename+with+spaces.pdf not found."}}
>
> I've tried encoding the spaces as + (same result), and not encoding them
> (HTTP 400 Unknown Version).
> If I request a file for which the path does not contain spaces then it
> works.
>
> Any ideas?
>
> With thanks, Alex
>
> PS In anticipation of queries: I'm using Knox 0.11.0 with OpenJDK
> 1.8.0_131 on CentOS 7, with an HDP 2.6 (Hadoop 2.7.x) cluster. Kerberos is
> enabled in the cluster.
>
> The (redacted) response headers for the %20 encoded request
>
> < HTTP/1.1 404 Not Found
> < Date: Wed, 24 May 2017 15:34:26 GMT
> < Set-Cookie: JSESSIONID=15acwo8gt9qr8gdbvk48y9yjh;
> Path=/gateway/<cluster>;Secure;HttpOnly
> < Expires: Thu, 01 Jan 1970 00:00:00 GMT
> < Set-Cookie: rememberMe=deleteMe; Path=/gateway/cysafa; Max-Age=0;
> Expires=Tue, 23-May-2017 15:34:26 GMT
> < Cache-Control: no-cache
> < Expires: Wed, 24 May 2017 15:34:26 GMT
> < Date: Wed, 24 May 2017 15:34:26 GMT
> < Pragma: no-cache
> < Expires: Wed, 24 May 2017 15:34:26 GMT
> < Date: Wed, 24 May 2017 15:34:26 GMT
> < Pragma: no-cache
> < X-FRAME-OPTIONS: SAMEORIGIN
> < Content-Type: application/json; charset=UTF-8
> < Server: Jetty(6.1.26.hwx)
> < Content-Length: 252
>
> The (redacted) Knox logs for the %20 encoded request
>
> ==> /var/log/hadoop/knox/gateway-audit.log <==
> 17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS||||
> access|uri|/gateway/<cluster>/webhdfs/v1/docs/filename with
> spaces.pdf?op=OPEN|unavailable|Request method: GET
> 17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<
> username>|||authentication|uri|/gateway/<cluster>/webhdfs/v1/docs/filename
> with spaces.pdf?op=OPEN|success|
> 17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<
> username>|||authentication|uri|/gateway/<cluster>/webhdfs/v1/docs/filename
> with spaces.pdf?op=OPEN|success|Groups: []
> 17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<
> username>|||authorization|uri|/gateway/<cluster>/webhdfs/v1/docs/filename
> with spaces.pdf?op=OPEN|success|
> 17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<
> username>|||dispatch|uri|http://<namenode>.<cluster>:50070/
> webhdfs/v1/docs/filename+with+spaces.pdf?op=OPEN&doAs=<username>|unavailable|Request
> method: GET
> 17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<
> username>|||dispatch|uri|http://<namenode>.<cluster>:50070/
> webhdfs/v1/docs/filename+with+spaces.pdf?op=OPEN&doAs=<username>|success|Response
> status: 404
> 17/05/24 15:51:05 ||88ce58ea-d7c5-46cd-a87a-c2f96b38130e|audit|WEBHDFS|<
> username>|||access|uri|/gateway/<cluster>/webhdfs/v1/docs/filename with
> spaces.pdf?op=OPEN|success|Response status: 404
>
> ==> /var/log/hadoop/knox/gateway.log <==
> 2017-05-24 15:51:05,254 INFO hadoop.gateway
> (KnoxLdapRealm.java:getUserDn(691))
> - Computed userDn: uid=<username>,cn=users,cn=accounts,dc=<cluster> using
> dnTemplate for principal: <username>
> 2017-05-24 15:51:05,259 INFO hadoop.gateway
> (AclsAuthorizationFilter.java:doFilter(85))
> - Access Granted: true
>
> The (redacted) topology
>
> <topology>
> <gateway>
> <provider>
> <role>authentication</role>
> <name>ShiroProvider</name>
> <enabled>true</enabled>
> <param>
> <name>sessionTimeout</name>
> <value>30</value>
> </param>
> <param>
> <name>main.ldapRealm</name>
> <value>org.apache.hadoop.gateway.shirorealm.
> KnoxLdapRealm</value>
> </param>
> <param>
> <name>main.ldapContextFactory</name>
> <value>org.apache.hadoop.gateway.shirorealm.
> KnoxLdapContextFactory</value>
> </param>
> <param>
> <name>main.ldapRealm.contextFactory</name>
> <value>$ldapContextFactory</value>
> </param>
> <param>
> <name>main.ldapRealm.userDnTemplate</name>
> <value>uid={0},cn=users,cn=accounts,dc=<cluster></value>
> </param>
> <param>
> <name>main.ldapRealm.contextFactory.url</name>
> <value>ldap://<freeipa_node>:389</value>
> </param>
> <param>
> <name>main.ldapRealm.contextFactory.
> authenticationMechanism</name>
> <value>simple</value>
> </param>
> <param>
> <name>urls./**</name>
> <value>authcBasic</value>
> </param>
> </provider>
> <provider>
> <role>authorization</role>
> <name>AclsAuthz</name>
> <enabled>true</enabled>
> <param>
> <name>knox.acl</name>
> <value>admin;*;*</value>
> </param>
> </provider>
> <provider>
> <role>identity-assertion</role>
> <name>Default</name>
> <enabled>true</enabled>
> </provider>
> <provider>
> <role>hostmap</role>
> <name>static</name>
> <enabled>false</enabled>
> <param><name>localhost</name><value>sandbox,sandbox.
> hortonworks.com</value></param>
> </provider>
> </gateway>
>
> <service>
> <role>WEBHDFS</role>
> <url>http://<namenode>:50070/webhdfs</url>
> </service>
>
> <service>
> <role>SOLRAPI</role>
> <url>http://<solrnode>:6083/solr</url>
> </service>
> </topology>
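
For reference, here is a small sketch of the two space encodings that show up in this thread: %20 in the request sent to Knox, and + in the dispatch URL and in the WebHDFS error message. It uses only Python's standard urllib.parse and illustrates the general URL rule that "+" means a space in form-encoded data but is a literal plus sign in a path. It is not a statement about how Knox or WebHDFS decode URLs internally, although the "filename+with+spaces.pdf not found" message would be consistent with the path being read literally. The helper webhdfs_open_url and the host namenode.example.com are hypothetical names made up for the example; the port 50070 and the /webhdfs/v1 prefix are taken from the logs above.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# Path from the original request, before any encoding.
path = "/docs/filename with spaces.pdf"

# Path-style percent-encoding: a space becomes %20 ("/" is kept by default).
path_encoded = quote(path)

# Form-style encoding (application/x-www-form-urlencoded): a space becomes "+".
form_encoded = quote_plus(path, safe="/")

# Decoded as a URL path, "+" stays a literal plus sign.
decoded_as_path = unquote(form_encoded)

# Decoded as form data, "+" turns back into a space.
decoded_as_form = unquote_plus(form_encoded)


def webhdfs_open_url(host: str, hdfs_path: str, port: int = 50070) -> str:
    """Build a direct (no Knox) WebHDFS OPEN URL with a %20-encoded path.

    Hypothetical helper for illustration only; host is a placeholder.
    """
    return f"http://{host}:{port}/webhdfs/v1{quote(hdfs_path)}?op=OPEN"


direct_url = webhdfs_open_url("namenode.example.com", path)

print(path_encoded)
print(form_encoded)
print(decoded_as_path)
print(decoded_as_form)
print(direct_url)
```

Requesting the URL printed last directly from the NameNode (with whatever authentication the cluster requires) and comparing the result with the Knox request is one way to run the check asked for at the top of this message.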
