Very helpful, Larry! Thanks.
Now the picture is becoming a little clearer. I'm going to check WebHDFS right now.

________________________________
From: larry mccay <lmc...@apache.org>
Sent: Tuesday, September 10, 2019 4:22 PM
To: user@knox.apache.org <user@knox.apache.org>
Subject: Re: Adding a web.xml to gateway.jar

That is coming from WEBHDFS itself. The error message tells you exactly what the problem is.

A 400 is usually going to come from application code, whether that is a backend service that we dispatch to or a Knox API. It indicates that the request is bad: in this case the operation is not valid for the HTTP method used; in other cases it may be the addresses used or some other aspect of the request.

If you are looking for the spot in the Knox code where this 400 is thrown, you will not find it, since it is coming from the NameNode.

On Tue, Sep 10, 2019 at 7:07 PM jeff saremi <jeffsar...@hotmail.com> wrote:

Do we know where/when/how a 400 error is returned like this?

root@clustertest:/tests/knox# curl -iku root:goodpassword -X GET https://gateway-svc:8443/gateway/default/webhdfs/v1/?op=STATUS
HTTP/1.1 400 Bad Request
Date: Fri, 06 Sep 2019 23:42:42 GMT
Set-Cookie: KNOXSESSIONID=node01q5krk3jp1c9dzv3fc3t5tkgh4.node0;Path=/gateway/default;Secure;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Set-Cookie: rememberMe=deleteMe; Path=/gateway/default; Max-Age=0; Expires=Thu, 05-Sep-2019 23:42:43 GMT
Date: Fri, 06 Sep 2019 23:42:43 GMT
Cache-Control: no-cache
Expires: Fri, 06 Sep 2019 23:42:45 GMT
Date: Fri, 06 Sep 2019 23:42:45 GMT
Pragma: no-cache
X-FRAME-OPTIONS: SAMEORIGIN
Content-Type: application/json;charset=utf-8
Transfer-Encoding: chunked
Server: Jetty(9.4.12.v20180830)

{"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Invalid value for webhdfs parameter \"op\": STATUS is not a valid GET operation."}}

I want to see if the same logic applies to 500.

________________________________
From: larry mccay <lmc...@apache.org>
Sent: Tuesday, September 10, 2019 2:14 PM
To: user@knox.apache.org <user@knox.apache.org>
Subject: Re: Adding a web.xml to gateway.jar

Interesting....

Almost the entirety of Knox request/response processing is done in servlet filters that represent each Provider along the chain, as well as the dispatch to backend services or the JAX-RS service of a Knox API.
We could do this inside of a Provider filter, but I would think that would limit the details provided to the application logic of the GatewayServlet rather than being more container managed.
That may also be the case for custom error pages in the web.xml, but those could at least be pattern matched across multiple servlets in the webapp if need be.

I will give it some more thought and maybe find some time to try out the web.xml change.

On Tue, Sep 10, 2019 at 2:36 PM jeff saremi <jeffsar...@hotmail.com> wrote:

So, continuing with the assumption that the problem must be me, I decided to create a small app which uses Jetty and a web.xml plus two servlets to demonstrate the intent. It worked perfectly the first time!
Here's the link to the application. The web.xml is modeled after what I last modified in Knox. I tested the app with the following URLs and each time I got the expected result:

http://localhost:8080/?error=503
{"error": {"code":503,"uri":"/"}}

https://1drv.ms/u/s!AmK9GYfTgrv-psxYrFLkiSd-Qpiv4Q?e=oge5qY

So currently I'm at the point where I cannot seem to inject any additional servlets or any error redirections into Knox.
If someone can tell me the first/last entry point in the application where I have the chance to intercept all errors/messages before they go back to the user, I would really appreciate it. That way I can modify the source code and be done with this, with no messy web.xml and additional servlets.

thanks

________________________________
From: jeff saremi <jeffsar...@hotmail.com>
Sent: Monday, September 9, 2019 9:53 PM
To: user@knox.apache.org <user@knox.apache.org>
Subject: Re: Adding a web.xml to gateway.jar

I got further by making sure that the startup scripts do not touch the topology file anymore. I can see the deployment staying the same as I modified it.
However, there must be something wrong with my web.xml or the servlet, since all requests now return a 503, which does not seem to be initiated by the gateway logic itself.
It could be that my web.xml is invalid, or that my servlet creates an infinite loop, or something like that. I've got some investigating to do.

Basically this is the last version of the web.xml (truncated):

  <servlet>
    <servlet-name>default-knox-gateway-servlet</servlet-name>
    <servlet-class>org.apache.knox.gateway.GatewayServlet</servlet-class>
    <init-param>
      <param-name>gatewayDescriptorLocation</param-name>
      <param-value>/WEB-INF/gateway.xml</param-value>
    </init-param>
  </servlet>
  <servlet>
    <servlet-name>exception-handler-servlet</servlet-name>
    <servlet-class>ExceptionHandlerServlet</servlet-class>
  </servlet>
  <servlet-mapping>
    <servlet-name>exception-handler-servlet</servlet-name>
    <url-pattern>/ExceptionHandler</url-pattern>
  </servlet-mapping>
  <servlet-mapping>
    <servlet-name>default-knox-gateway-servlet</servlet-name>
    <url-pattern>/*</url-pattern>
  </servlet-mapping>
  ...
    <param-name>rewriteDescriptorLocation</param-name>
    <param-value>/WEB-INF/rewrite.xml</param-value>
  </context-param>
  <error-page>
    <error-code>503</error-code>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <error-code>400</error-code>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <error-code>401</error-code>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <error-code>404</error-code>
    <location>/ExceptionHandler</location>
  </error-page>
</web-app>

The Servlet file:
------------------------------
try {
    // Analyze the servlet exception using the standard error-dispatch attributes
    Throwable throwable = (Throwable) request.getAttribute("javax.servlet.error.exception");
    Integer statusCode = (Integer) request.getAttribute("javax.servlet.error.status_code");
    String servletName = (String) request.getAttribute("javax.servlet.error.servlet_name");
    if (servletName == null) {
        servletName = "Unknown";
    }
    String requestUri = (String) request.getAttribute("javax.servlet.error.request_uri");
    if (requestUri == null) {
        requestUri = "Unknown";
    }
    // Set response content type
    response.setContentType("application/json");
    PrintWriter out = response.getWriter();
    // Note: the values below are written without JSON escaping
    out.write("{");
    // Fall back to the plain error form when the status code or the exception
    // attribute is missing, to avoid a NullPointerException
    if (statusCode == null || statusCode != 500 || throwable == null) {
        out.write("\"error\": {");
        out.write("\"code\":" + statusCode + ",");
        out.write("\"uri\":\"" + requestUri + "\"}");
    } else {
        out.write("\"exception\":{");
        out.write("\"servletName\":\"" + servletName + "\",");
        out.write("\"exceptionClass\":\"" + throwable.getClass().getName() + "\",");
        out.write("\"uri\":\"" + requestUri + "\",");
        out.write("\"message\":\"" + throwable.getMessage() + "\"}");
    }
    out.write("}");
} catch (Throwable e) {
    log("Error inside ExceptionHandler: " + e.getMessage());
}

Sample error returned by knox:
------------------------------------------
root@clustertest:~# curl -iku user:password -X GET https://gateway-svc:8443/gateway/default/webhdfs/v1/?op=LISTSTATUS
HTTP/1.1 503 Service Unavailable
Date: Tue, 10 Sep 2019 04:40:19 GMT
Cache-Control: must-revalidate,no-cache,no-store
Content-Type: text/html;charset=iso-8859-1
Content-Length: 364
Server: Jetty(9.4.12.v20180830)

<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 503 Service Unavailable</title>
</head>
<body><h2>HTTP ERROR 503</h2>
<p>Problem accessing /gateway/default/webhdfs/v1/. Reason:
<pre> Service Unavailable</pre></p><hr><a href="http://eclipse.org/jetty">Powered by Jetty:// 9.4.12.v20180830</a><hr/>
</body>
</html>

________________________________
From: jeff saremi <jeffsar...@hotmail.com>
Sent: Monday, September 9, 2019 8:12 PM
To: user@knox.apache.org <user@knox.apache.org>
Subject: Re: Adding a web.xml to gateway.jar

Good pointers. We're not using Ambari or any GUI to do this. It is all command line. This is the supervisord config file:

root@gateway-0:/opt/supervisor/conf.d# ls
knox.conf
root@gateway-0:/opt/supervisor/conf.d# cat knox.conf
[program:knoxgateway]
command=/opt/knox/bin/gateway-launcher.sh
stopasgroup=true
startretries=3

What causes a deployment to take place? What command lines should I be looking for?
The only relevant thing I see is that conf/topologies/default.xml is touched (re-created from a template file). Could that be causing a redeployment?

________________________________
From: larry mccay <lmc...@apache.org>
Sent: Monday, September 9, 2019 4:53 PM
To: user@knox.apache.org <user@knox.apache.org>
Subject: Re: Adding a web.xml to gateway.jar

Interesting. I'm not really familiar with Supervisord, but it seems to be not only restarting the process but also causing the redeployment of the topologies. Is it making an Ambari API call to restart Knox or something like that?

The short-term goal needs to be a restart that doesn't redeploy and that reuses the web.xml.
This would be just to verify that what you want to do works as expected. Then we would need to add the error handling to the ShrinkWrap code that generates the web.xml in the Knox deployment machinery.

If you are using a default.xml topology, that sounds like you are using Ambari and HDP?
Are you able to remove Knox from the Supervisord monitoring temporarily, or even just turn it off?
If so, we can then stop the Knox process via Ambari and restart Knox manually.
If you restart from Ambari it will push out configuration again and you will get a new deployment - so don't do that.
A simple restart without config being changed will just load the previous web.xml and webapp without deploying a new one.

On Mon, Sep 9, 2019 at 7:10 PM jeff saremi <jeffsar...@hotmail.com> wrote:

Larry,
I created a jar file out of a single ExceptionServlet and placed it in GATEWAY_HOME/ext:

root@gateway-0:/opt/knox# ls -l ext
total 12
-rw-r--r-- 1 root root   83 Sep  6 00:23 README
-rwxr-xr-- 1 root root 1704 Sep  9 22:41 error-servlet.jar
drwxr--r-- 2 root root 4096 Sep  6 00:23 native

root@gateway-0:/opt/knox# jar tvf ext/error-servlet.jar
     0 Mon Sep 09 15:24:44 UTC 2019 META-INF/
    64 Mon Sep 09 15:24:44 UTC 2019 META-INF/MANIFEST.MF
  2382 Mon Sep 09 15:23:54 UTC 2019 ExceptionHandlerServlet.class

Since I appear to be using 'default' in my URLs, I'm guessing that would be the right topology.
So I went to data/deployments and modified the following web.xml:
data/deployments/default.topo.16d08ec29f0

updated web.xml:

root@gateway-0:/opt/knox# tail -25 data/deployments/default.topo.16d08ec29f0/%2F/WEB-INF/web.xml
    <param-value>/WEB-INF/rewrite.xml</param-value>
  </context-param>
  <error-page>
    <error-code>400</error-code>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <error-code>401</error-code>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <error-code>404</error-code>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <exception-type>java.lang.Throwable</exception-type>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <exception-type>javax.servlet.ServletException</exception-type>
    <location>/ExceptionHandler</location>
  </error-page>
</web-app>

I then killed the running process (since it was started by supervisord) and it restarted itself.
The responses were identical to before, meaning that my servlet was never hit.
Looking at the deployment folder, I see that another 'default*' folder has been created with an updated timestamp. And the new web.xml there does not have any of my changes, of course.

If I were to make these changes to the web.xml permanent, where would be the right location, given that the web.xml seems to be created on the fly?

________________________________
From: jeff saremi <jeffsar...@hotmail.com>
Sent: Saturday, September 7, 2019 4:28 PM
To: user@knox.apache.org <user@knox.apache.org>
Subject: Re: Adding a web.xml to gateway.jar

Great suggestions! Thanks, Larry.
I will work on getting the web.xml and the servlet integrated.
Completely agreed on the vulnerability side. We may expose this in a DEBUG version and not the release, or provide a config value...

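As a rough illustration of the "config value" idea, here is a minimal sketch of a helper that builds the JSON error body and only includes exception details when a flag is set. Everything in it is an assumption for illustration: the class name `ErrorBody`, the `exposeDetails` parameter, and the `gateway.error.details` system property are hypothetical and are not part of Knox. It also escapes values before writing them, which the servlet in this thread does not do.

```java
/** Hypothetical helper, not part of Knox: builds a JSON error body as a string. */
final class ErrorBody {

    /** Escapes a value so it can be placed inside a JSON string literal. */
    static String escape(String s) {
        if (s == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters become 4-digit hex escapes
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /**
     * Renders the error body. The exception class and message are included
     * only when exposeDetails is true and an exception is present.
     */
    static String render(int statusCode, String requestUri, Throwable t, boolean exposeDetails) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"error\":{\"code\":").append(statusCode);
        sb.append(",\"uri\":\"").append(escape(requestUri)).append("\"");
        if (exposeDetails && t != null) {
            sb.append(",\"exceptionClass\":\"").append(escape(t.getClass().getName())).append("\"");
            sb.append(",\"message\":\"").append(escape(t.getMessage())).append("\"");
        }
        sb.append("}}");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumption: the switch is a JVM system property; a real gateway
        // would more likely read it from its own configuration.
        boolean expose = Boolean.getBoolean("gateway.error.details");
        Throwable t = new IllegalStateException("backend \"nn1\" refused connection");
        System.out.println(render(500, "/gateway/default/webhdfs/v1/", t, expose));
    }
}
```

With the flag off, the body carries only the status code and URI, so a release build would not leak exception text; the error servlet could call `render` and write the result to the response.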
________________________________
From: larry mccay <lmc...@apache.org>
Sent: Friday, September 6, 2019 7:25 PM
To: user@knox.apache.org <user@knox.apache.org>
Subject: Re: Adding a web.xml to gateway.jar

Hi Jeff -

This is an interesting idea and we should consider discussing this as a feature of Knox rather than just something that you are trying to hack into an existing release/deployment.

In order to get this to work, I would first change the web.xml in the deployments directory for a given topology and add the servlet in a jar within the {GATEWAY_HOME}/ext directory.
Stop and start the server and it should hopefully pick up the changed web.xml file.

In order to cause a 500, I think just dispatching to an invalid URL would result in a 500 with a connection exception.

See if that web.xml will work and we can take it from there.

It should be noted that surfacing the details of a webappexception may expose sensitive information about the server and you may not want to always have this enabled.

HTH.

--larry

On Fri, Sep 6, 2019 at 9:37 PM jeff saremi <jeffsar...@hotmail.com> wrote:

Ultimately I am trying to make sure that when an HTTP 500 error happens in the gateway, the exception message and stack trace are returned in the response.
So I decided to add a web.xml to the gateway project and override parts of the error handling there (added to gateway-server-launcher/src/main/resources/META-INF/web.xml):

root@gateway-0:/opt/knox/bin# cat META-INF/web.xml
<?xml version="1.0" encoding="UTF-8"?>
<!--
  Licensed to ...
-->
<web-app xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns="http://java.sun.com/xml/ns/javaee"
         xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_3_0.xsd"
         version="3.0">
  <error-page>
    <error-code>400</error-code>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <error-code>401</error-code>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <error-code>404</error-code>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <exception-type>java.lang.Throwable</exception-type>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <exception-type>javax.servlet.ServletException</exception-type>
    <location>/ExceptionHandler</location>
  </error-page>
</web-app>

And then I added the following servlet (added it to gateway-util-common/src/main/java/org/apache/knox/gateway/servlet/ExceptionHandlerServlet.java):

import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet("/ExceptionHandler")
public class ExceptionHandlerServlet extends HttpServlet {
    private static final long serialVersionUID = 1L;

    @Override
    protected void service(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        // Analyze the servlet exception using the standard error-dispatch attributes
        Throwable throwable = (Throwable) request.getAttribute("javax.servlet.error.exception");
        Integer statusCode = (Integer) request.getAttribute("javax.servlet.error.status_code");
        String servletName = (String) request.getAttribute("javax.servlet.error.servlet_name");
        if (servletName == null) {
            servletName = "Unknown";
        }
        String requestUri = (String) request.getAttribute("javax.servlet.error.request_uri");
        if (requestUri == null) {
            requestUri = "Unknown";
        }
        // Set response content type
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        out.write("<html><head><title>Exception/Error Details</title></head><body>");
        // Fall back to the plain error form when the status code or the exception
        // attribute is missing, to avoid a NullPointerException
        if (statusCode == null || statusCode != 500 || throwable == null) {
            out.write("<h3>Error Details</h3>");
            out.write("<strong>Status Code</strong>:" + statusCode + "<br>");
            out.write("<strong>Requested URI</strong>:" + requestUri);
        } else {
            out.write("<h3>Exception Details</h3>");
            out.write("<ul><li>Servlet Name:" + servletName + "</li>");
            out.write("<li>Exception Name:" + throwable.getClass().getName() + "</li>");
            out.write("<li>Requested URI:" + requestUri + "</li>");
            out.write("<li>Exception Message:" + throwable.getMessage() + "</li>");
            out.write("</ul>");
        }
        out.write("<br><br>");
        out.write("</body></html>");
    }
}

I see that the application is launched using gateway.jar, and I also see my web.xml inside that jar. However, I'm not able to get anything returned from this servlet!
I honestly don't know how to repro a 500, but I could do a 400, 401, and 404. None of them got intercepted by the exception servlet I wrote.
Here are some examples I ran. Note that in the first one, a 400 is returned along with an exception message. That's what I want to do for 500, or verify that it's already being done. However, I haven't been able to find (using text search) where in the code this response is formed.

root@clustertest:/tests/knox# curl -iku root:goodpassword -X GET https://gateway-svc:8443/gateway/default/webhdfs/v1/?op=STATUS
HTTP/1.1 400 Bad Request
Date: Fri, 06 Sep 2019 23:42:42 GMT
Set-Cookie: KNOXSESSIONID=node01q5krk3jp1c9dzv3fc3t5tkgh4.node0;Path=/gateway/default;Secure;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Set-Cookie: rememberMe=deleteMe; Path=/gateway/default; Max-Age=0; Expires=Thu, 05-Sep-2019 23:42:43 GMT
Date: Fri, 06 Sep 2019 23:42:43 GMT
Cache-Control: no-cache
Expires: Fri, 06 Sep 2019 23:42:45 GMT
Date: Fri, 06 Sep 2019 23:42:45 GMT
Pragma: no-cache
X-FRAME-OPTIONS: SAMEORIGIN
Content-Type: application/json;charset=utf-8
Transfer-Encoding: chunked
Server: Jetty(9.4.12.v20180830)

{"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Invalid value for webhdfs parameter \"op\": STATUS is not a valid GET operation."}}

root@clustertest:/tests/knox# curl -iku root:goodpassword -X GET https://gateway-svc:8443/gateway/default/webhdfs/v1/?op=LISTSTATUS
HTTP/1.1 200 OK
Date: Fri, 06 Sep 2019 23:43:29 GMT
Set-Cookie: KNOXSESSIONID=node0iz11bxvbn318h7zow5z977pc5.node0;Path=/gateway/default;Secure;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Set-Cookie: rememberMe=deleteMe; Path=/gateway/default; Max-Age=0; Expires=Thu, 05-Sep-2019 23:43:30 GMT
Date: Fri, 06 Sep 2019 23:43:30 GMT
Cache-Control: no-cache
Expires: Fri, 06 Sep 2019 23:43:30 GMT
Date: Fri, 06 Sep 2019 23:43:30 GMT
Pragma: no-cache
X-FRAME-OPTIONS: SAMEORIGIN
Content-Type: application/json;charset=utf-8
Transfer-Encoding: chunked
Server: Jetty(9.4.12.v20180830)

{"FileStatuses":{"FileStatus":[{"accessTime":0,"blockSize":0,"childrenNum":0,"fileId":16411,"group":"supergroup","length":0,"modificationTime":1567812978306,"owner":"root","pathSuffix":"jar","permission":"755","replication":0,"storagePolicy":0,"type":"DIRECTORY"},{"accessTime":0,"blockSize":0,"childrenNum":6,"fileId":16389,"group":"supergroup","length":0,"modificationTime":1567812975255,"owner":"root","pathSuffix":"livy","permission":"755","replication":0,"storagePolicy":0,"type":"DIRECTORY"},{"accessTime":0,"blockSize":0,"childrenNum":1,"fileId":16386,"group":"supergroup","length":0,"modificationTime":1567812943856,"owner":"root","pathSuffix":"spark","permission":"775","replication":0,"storagePolicy":0,"type":"DIRECTORY"},{"accessTime":0,"blockSize":0,"childrenNum":1,"fileId":16387,"group":"supergroup","length":0,"modificationTime":1567813293988,"owner":"root","pathSuffix":"spark-events","permission":"733","replication":0,"storagePolicy":0,"type":"DIRECTORY"},{"accessTime":0,"blockSize":0,"childrenNum":2,"fileId":16395,"group":"supergroup","length":0,"modificationTime":1567813273907,"owner":"root","pathSuffix":"tmp","permission":"1777","replication":0,"storagePolicy":0,"type":"DIRECTORY"},{"accessTime":0,"blockSize":0,"childrenNum":1,"fileId":16412,"group":"supergroup","length":0,"modificationTime":1567813267540,"owner":"root","pathSuffix":"user","permission":"777","replication":0,"storagePolicy":0,"type":"DIRECTORY"}]}}

root@clustertest:/tests/knox# curl -iku root:goodpassword -X GET https://gateway-svc:8443/gateway/default/webhdfs/v2/?op=LISTSTATUS
HTTP/1.1 404 Not Found
Date: Fri, 06 Sep 2019 23:43:53 GMT
Content-Length: 0
Server: Jetty(9.4.12.v20180830)

root@clustertest:/tests/knox# curl -iku root:badpassword -X GET https://gateway-svc:8443/gateway/default/webhdfs/v1/?op=LISTSTATUS
HTTP/1.1 401 Unauthorized
Date: Fri, 06 Sep 2019 23:44:17 GMT
Set-Cookie: rememberMe=deleteMe; Path=/gateway/default; Max-Age=0; Expires=Thu, 05-Sep-2019 23:44:17 GMT
WWW-Authenticate: BASIC realm="application"
Content-Length: 0
Server: Jetty(9.4.12.v20180830)
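
The error responses in this thread come in a few recognizable shapes: a JSON body with a `RemoteException` key (the 400, which comes from the NameNode), an HTML page carrying the "Powered by Jetty" footer (the 503), and bodies with `Content-Length: 0` (the 404 and 401). The sketch below is only a rough heuristic for telling these apart when debugging, assuming those shapes hold. The class and enum names are hypothetical, it is not part of Knox, and it makes no claim about which component produced the empty bodies.

```java
import java.util.Locale;

/** Hypothetical debugging heuristic, not part of Knox. */
final class ErrorOrigin {

    enum Origin { BACKEND_WEBHDFS, JETTY_CONTAINER, EMPTY_BODY, UNKNOWN }

    /** Guesses the origin of an error response from its Content-Type and body. */
    static Origin classify(String contentType, String body) {
        if (body == null || body.trim().isEmpty()) {
            return Origin.EMPTY_BODY;
        }
        String type = contentType == null ? "" : contentType.toLowerCase(Locale.ROOT);
        // WebHDFS reports errors as a JSON object keyed by "RemoteException"
        if (type.startsWith("application/json") && body.contains("\"RemoteException\"")) {
            return Origin.BACKEND_WEBHDFS;
        }
        // Jetty's default error page ends with a "Powered by Jetty" link
        if (type.startsWith("text/html") && body.contains("Powered by Jetty")) {
            return Origin.JETTY_CONTAINER;
        }
        return Origin.UNKNOWN;
    }

    public static void main(String[] args) {
        String json = "{\"RemoteException\":{\"exception\":\"IllegalArgumentException\"}}";
        System.out.println(classify("application/json;charset=utf-8", json));
        System.out.println(classify(null, ""));
    }
}
```

A check like this can help decide where to look first: a `RemoteException` body points at the backend service rather than at the gateway's own code.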