Very helpful, Larry! Thanks.
Now the picture is becoming a little clearer. I'm going to check WebHDFS 
right now.
________________________________
From: larry mccay <lmc...@apache.org>
Sent: Tuesday, September 10, 2019 4:22 PM
To: user@knox.apache.org <user@knox.apache.org>
Subject: Re: Adding a web.xml to gateway.jar

That is coming from WEBHDFS itself.
The error message tells you exactly the problem with it.

A 400 usually comes from application code, whether that is a backend 
service that we dispatch to or a Knox API.
It indicates that the request is bad in some respect: in this case the 
operation is not valid for the HTTP method used, but it could also be an 
invalid address or some other aspect of the request.

If you are looking for the spot in the Knox code where this 400 is thrown, 
you will not find it, since it is coming from the NameNode.
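
One rough way to tell a backend-originated error from a gateway-originated one is to look for the WebHDFS "RemoteException" envelope in the response body, as seen in the 400 response quoted below. The following is only a minimal sketch under that assumption; RemoteExceptionSniffer and backendExceptionName are hypothetical names and are not part of Knox or Hadoop.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Knox or Hadoop): guesses whether an error
// body was produced by WebHDFS by looking for its "RemoteException" envelope.
final class RemoteExceptionSniffer {
    // Matches {"RemoteException":{ ... "exception":"<name>" and captures <name>.
    private static final Pattern ENVELOPE = Pattern.compile(
        "\\{\\s*\"RemoteException\"\\s*:\\s*\\{.*?\"exception\"\\s*:\\s*\"([^\"]+)\"",
        Pattern.DOTALL);

    // Returns the backend exception name, or empty if the body has no envelope.
    static Optional<String> backendExceptionName(String body) {
        if (body == null) {
            return Optional.empty();
        }
        Matcher m = ENVELOPE.matcher(body);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String fromWebHdfs = "{\"RemoteException\":{\"exception\":\"IllegalArgumentException\","
            + "\"javaClassName\":\"java.lang.IllegalArgumentException\",\"message\":\"...\"}}";
        String fromJetty = "<html><title>Error 503 Service Unavailable</title></html>";
        System.out.println(backendExceptionName(fromWebHdfs).isPresent());
        System.out.println(backendExceptionName(fromJetty).isPresent());
    }
}
```

A regular expression is used only to keep the sketch dependency-free; a real check would more likely parse the JSON properly.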

On Tue, Sep 10, 2019 at 7:07 PM jeff saremi 
<jeffsar...@hotmail.com<mailto:jeffsar...@hotmail.com>> wrote:
Do we know where/when/how a 400 error is returned like this?

root@clustertest:/tests/knox# curl -iku root:goodpassword -X GET https://gateway-svc:8443/gateway/default/webhdfs/v1/?op=STATUS
HTTP/1.1 400 Bad Request
Date: Fri, 06 Sep 2019 23:42:42 GMT
Set-Cookie: KNOXSESSIONID=node01q5krk3jp1c9dzv3fc3t5tkgh4.node0;Path=/gateway/default;Secure;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Set-Cookie: rememberMe=deleteMe; Path=/gateway/default; Max-Age=0; Expires=Thu, 05-Sep-2019 23:42:43 GMT
Date: Fri, 06 Sep 2019 23:42:43 GMT
Cache-Control: no-cache
Expires: Fri, 06 Sep 2019 23:42:45 GMT
Date: Fri, 06 Sep 2019 23:42:45 GMT
Pragma: no-cache
X-FRAME-OPTIONS: SAMEORIGIN
Content-Type: application/json;charset=utf-8
Transfer-Encoding: chunked
Server: Jetty(9.4.12.v20180830)

{"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Invalid value for webhdfs parameter \"op\": STATUS is not a valid GET operation."}}


I want to see if the same logic applies to 500
________________________________
From: larry mccay <lmc...@apache.org<mailto:lmc...@apache.org>>
Sent: Tuesday, September 10, 2019 2:14 PM
To: user@knox.apache.org<mailto:user@knox.apache.org> 
<user@knox.apache.org<mailto:user@knox.apache.org>>
Subject: Re: Adding a web.xml to gateway.jar

Interesting....

Almost the entirety of Knox request/response processing is done in 
ServletFilters that represent each Provider along the chain as well as the 
dispatch to backend services or the JAXRS service of a Knox API.

We could do this inside of a Provider filter but that would limit the details 
provided to the application logic of the GatewayServlet rather than being more 
container managed - I would think.
That may also be the case for custom error pages in the web.xml - but it could 
at least be pattern matched across multiple servlets in the webapp if need be.

I will give it some more thought and maybe find some time to try out the 
web.xml change.
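
To make the filter-chain idea above concrete, here is a toy model in plain Java of how an outermost filter could intercept a failure raised further down the chain (for example by the dispatch). It is only a sketch: ToyFilterChain, Handler, Filter, ERROR_REPORTER and PASS_THROUGH are hypothetical names, and none of this is Knox's or the Servlet API's actual code.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of a filter chain; all names here are hypothetical and this is
// neither the Servlet API nor Knox code.
final class ToyFilterChain {
    interface Handler { String handle(String request); }
    interface Filter { String apply(String request, Handler next); }

    // Wraps the dispatch handler so that filters.get(0) runs first (outermost).
    static Handler build(List<Filter> filters, Handler dispatch) {
        Handler current = dispatch;
        for (int i = filters.size() - 1; i >= 0; i--) {
            Filter f = filters.get(i);
            Handler next = current;
            current = req -> f.apply(req, next);
        }
        return current;
    }

    // Outermost filter: turns any downstream runtime failure into a small
    // JSON-like error body instead of letting it propagate.
    static final Filter ERROR_REPORTER = (req, next) -> {
        try {
            return next.handle(req);
        } catch (RuntimeException e) {
            return "{\"exception\":\"" + e.getClass().getName() + "\"}";
        }
    };

    // Stand-in for a provider filter that simply passes the request along.
    static final Filter PASS_THROUGH = (req, next) -> next.handle(req);

    public static void main(String[] args) {
        Handler failingDispatch = req -> {
            throw new IllegalStateException("backend unreachable");
        };
        Handler chain = build(Arrays.asList(ERROR_REPORTER, PASS_THROUGH), failingDispatch);
        System.out.println(chain.handle("/webhdfs/v1/"));
    }
}
```

The point of the sketch is only the ordering: whatever sits first in the chain sees every error produced after it, but nothing produced before it or by the container itself.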

On Tue, Sep 10, 2019 at 2:36 PM jeff saremi 
<jeffsar...@hotmail.com<mailto:jeffsar...@hotmail.com>> wrote:
So continuing with the assumption that the problem must be me, I decided to 
create a small app which uses jetty and a web.xml plus two servlets to 
demonstrate the intent.
It worked perfectly the first time!
Here's the link to the application. The web.xml is modeled after what I 
modified last in knox.
I tested the app with the following URLs and each time I got the expected 
result:
http://localhost:8080/?error=503

{"error": {"code":503,"uri":"/"}}

https://1drv.ms/u/s!AmK9GYfTgrv-psxYrFLkiSd-Qpiv4Q?e=oge5qY

So currently I'm stuck: I cannot seem to inject any additional servlets or 
error redirections into Knox.

If someone could tell me the first/last entry point in the application where 
I can intercept all errors/messages before they go back to the user, I 
would really appreciate it. That way I can modify the source code and be done 
with this, with no messy web.xml or additional servlets.

thanks

________________________________
From: jeff saremi <jeffsar...@hotmail.com<mailto:jeffsar...@hotmail.com>>
Sent: Monday, September 9, 2019 9:53 PM
To: user@knox.apache.org<mailto:user@knox.apache.org> 
<user@knox.apache.org<mailto:user@knox.apache.org>>
Subject: Re: Adding a web.xml to gateway.jar

I got further by making sure that the startup scripts do not touch the topology 
file anymore.
I can see the deployment staying the same as I modified it.
However, there must be something wrong with my web.xml or the servlet, since 
all requests now return a 503 which does not seem to be initiated by the 
gateway logic itself. It could be that my web.xml is invalid or my servlet 
creates an infinite loop or something like that.
I've got some investigating to do.
Basically this is the last version of the web.xml (truncated):

  <servlet>
    <servlet-name>default-knox-gateway-servlet</servlet-name>
    <servlet-class>org.apache.knox.gateway.GatewayServlet</servlet-class>
    <init-param>
      <param-name>gatewayDescriptorLocation</param-name>
      <param-value>/WEB-INF/gateway.xml</param-value>
    </init-param>
  </servlet>
  <servlet>
    <servlet-name>exception-handler-servlet</servlet-name>
    <servlet-class>ExceptionHandlerServlet</servlet-class>
  </servlet>
  <servlet-mapping>
    <servlet-name>exception-handler-servlet</servlet-name>
    <url-pattern>/ExceptionHandler</url-pattern>
  </servlet-mapping>
  <servlet-mapping>
    <servlet-name>default-knox-gateway-servlet</servlet-name>
    <url-pattern>/*</url-pattern>
  </servlet-mapping>
...
    <param-name>rewriteDescriptorLocation</param-name>
    <param-value>/WEB-INF/rewrite.xml</param-value>
  </context-param>
  <error-page>
    <error-code>503</error-code>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <error-code>400</error-code>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <error-code>401</error-code>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <error-code>404</error-code>
    <location>/ExceptionHandler</location>
  </error-page>
</web-app>


The Servlet file:
------------------------------

    try {
      // Analyze the servlet exception
      Throwable throwable = (Throwable) request.getAttribute("javax.servlet.error.exception");
      Integer statusCode = (Integer) request.getAttribute("javax.servlet.error.status_code");
      String servletName = (String) request.getAttribute("javax.servlet.error.servlet_name");
      if (servletName == null) {
        servletName = "Unknown";
      }
      String requestUri = (String) request.getAttribute("javax.servlet.error.request_uri");
      if (requestUri == null) {
        requestUri = "Unknown";
      }

      // Set response content type
      response.setContentType("application/json");

      PrintWriter out = response.getWriter();
      out.write("{");
      // The error attributes are null when the servlet is requested directly, and
      // there is no exception when the status was set via sendError(), so check
      // before unboxing statusCode or dereferencing throwable.
      // NOTE: the values written below are not JSON-escaped.
      if (throwable == null || statusCode == null || statusCode != 500) {
        out.write("\"error\": {");
        out.write("\"code\":" + statusCode + ",");
        out.write("\"uri\":\"" + requestUri + "\"}");
      } else {
        out.write("\"exception\":{");
        out.write("\"servletName\":\"" + servletName + "\",");
        out.write("\"exceptionClass\":\"" + throwable.getClass().getName() + "\",");
        out.write("\"uri\":\"" + requestUri + "\",");
        out.write("\"message\":\"" + throwable.getMessage() + "\"}");
      }
      out.write("}");
    } catch (Throwable e) {
      log("Error inside ExceptionHandler: " + e.getMessage());
    }
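
Since the servlet above concatenates the request URI and exception message straight into a JSON body, a quote or newline in either value would produce invalid JSON. Below is a minimal sketch of an escaping helper that could be applied to those values; JsonText and quote are hypothetical names introduced here for illustration, and a JSON library would be the more robust choice.

```java
// Hypothetical helper for illustration: renders a Java string as a JSON
// string literal (including the surrounding quotes), or null as JSON null.
final class JsonText {
    static String quote(String s) {
        if (s == null) {
            return "null";
        }
        StringBuilder sb = new StringBuilder(s.length() + 2).append('"');
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters must be \\u-escaped in JSON.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        String message = "Invalid value for webhdfs parameter \"op\"";
        System.out.println("{\"message\":" + quote(message) + "}");
    }
}
```

With such a helper, a line like out.write("\"message\":" + JsonText.quote(throwable.getMessage()) + "}") would replace the hand-built quoting, and a null message would come out as JSON null rather than the string "null".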

Sample error returned by knox:
------------------------------------------

root@clustertest:~# curl -iku user:password -X GET https://gateway-svc:8443/gateway/default/webhdfs/v1/?op=LISTSTATUS
HTTP/1.1 503 Service Unavailable
Date: Tue, 10 Sep 2019 04:40:19 GMT
Cache-Control: must-revalidate,no-cache,no-store
Content-Type: text/html;charset=iso-8859-1
Content-Length: 364
Server: Jetty(9.4.12.v20180830)

<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 503 Service Unavailable</title>
</head>
<body><h2>HTTP ERROR 503</h2>
<p>Problem accessing /gateway/default/webhdfs/v1/. Reason:
<pre>    Service Unavailable</pre></p><hr><a href="http://eclipse.org/jetty">Powered by Jetty:// 9.4.12.v20180830</a><hr/>

</body>
</html>

________________________________
From: jeff saremi <jeffsar...@hotmail.com<mailto:jeffsar...@hotmail.com>>
Sent: Monday, September 9, 2019 8:12 PM
To: user@knox.apache.org<mailto:user@knox.apache.org> 
<user@knox.apache.org<mailto:user@knox.apache.org>>
Subject: Re: Adding a web.xml to gateway.jar

Good pointers.
We're not using Ambari or any gui to do this. All command line.
This is the supervisord config file:

root@gateway-0:/opt/supervisor/conf.d# ls
knox.conf
root@gateway-0:/opt/supervisor/conf.d# cat knox.conf
[program:knoxgateway]
command=/opt/knox/bin/gateway-launcher.sh
stopasgroup=true
startretries=3

What causes a deployment to take place?
What command lines should I be looking for?
The only relevant thing I see is that conf/topologies/default.xml is 
touched (re-created from a template file).
Could that be causing a re-deployment?

________________________________
From: larry mccay <lmc...@apache.org<mailto:lmc...@apache.org>>
Sent: Monday, September 9, 2019 4:53 PM
To: user@knox.apache.org<mailto:user@knox.apache.org> 
<user@knox.apache.org<mailto:user@knox.apache.org>>
Subject: Re: Adding a web.xml to gateway.jar

Interesting.
I'm not really familiar with Supervisord but it seems to be not only restarting 
the process but causing the redeployment of the topologies.
Is it making an Ambari API call to restart Knox or something like that?

The short term goal needs to be a restart that doesn't redeploy and the reuse 
of the web.xml.
This would be just to verify that what you want to do works as expected then we 
would need to add the error handling to the ShrinkWrap code that generates the 
web.xml in the Knox deployment machinery.

If you are using a default.xml topology that sounds like you are using Ambari 
and HDP?
Are you able to remove Knox from the Supervisord monitoring temporarily or even 
just turn it off?

If so, we can then stop the Knox process via Ambari then restart Knox manually.
If you restart from Ambari it will push out configuration again and you will 
get a new deployment - so don't do that.

A simple restart without config being changed will just load the previous 
web.xml and webapp without deploying a new one.

On Mon, Sep 9, 2019 at 7:10 PM jeff saremi 
<jeffsar...@hotmail.com<mailto:jeffsar...@hotmail.com>> wrote:
Larry
I created a jar file out of a single ExceptionServlet and placed it in 
GATEWAY_HOME/ext:

root@gateway-0:/opt/knox# ls -l ext
total 12
-rw-r--r-- 1 root root   83 Sep  6 00:23 README
-rwxr-xr-- 1 root root 1704 Sep  9 22:41 error-servlet.jar
drwxr--r-- 2 root root 4096 Sep  6 00:23 native

root@gateway-0:/opt/knox# jar tvf ext/error-servlet.jar
     0 Mon Sep 09 15:24:44 UTC 2019 META-INF/
    64 Mon Sep 09 15:24:44 UTC 2019 META-INF/MANIFEST.MF
  2382 Mon Sep 09 15:23:54 UTC 2019 ExceptionHandlerServlet.class

Since I appear to be using 'default' in my URLs, I'm guessing that would be the 
right topology. So I went to data/deployments and modified the following 
web.xml:
data/deployments/default.topo.16d08ec29f0

updated web.xml:


root@gateway-0:/opt/knox# tail -25 data/deployments/default.topo.16d08ec29f0/%2F/WEB-INF/web.xml
    <param-value>/WEB-INF/rewrite.xml</param-value>
  </context-param>

  <error-page>
    <error-code>400</error-code>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <error-code>401</error-code>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <error-code>404</error-code>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <exception-type>java.lang.Throwable</exception-type>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <exception-type>javax.servlet.ServletException</exception-type>
    <location>/ExceptionHandler</location>
  </error-page>

</web-app>

I then killed the running process (since it was started by supervisord) 
and it restarted itself.

The responses were identical to before, meaning that my servlet was never hit.

Looking at the deployments folder, I see that another 'default*' folder 
has been created with an updated timestamp. And the new web.xml there does not 
have any of my changes, of course.
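
To check which deployment directory is the current one after a restart, something like the following sketch could help. It assumes the layout seen in this thread (directories named <topology>.topo.<suffix> under data/deployments) and simply picks the most recently modified match; LatestDeployment and find are hypothetical names, and using the modification time is an assumption of this sketch, not necessarily how Knox itself chooses.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical helper: finds the most recently modified deployment directory
// for a topology, assuming names of the form "<topology>.topo.<suffix>".
final class LatestDeployment {
    static Optional<Path> find(Path deploymentsDir, String topology) {
        String prefix = topology + ".topo.";
        try (Stream<Path> entries = Files.list(deploymentsDir)) {
            return entries
                .filter(Files::isDirectory)
                .filter(p -> p.getFileName().toString().startsWith(prefix))
                .max(Comparator.comparingLong((Path p) -> p.toFile().lastModified()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Defaults mirror the paths mentioned in this thread; adjust as needed.
        Path dir = Paths.get(args.length > 0 ? args[0] : "data/deployments");
        String topology = args.length > 1 ? args[1] : "default";
        if (!Files.isDirectory(dir)) {
            System.out.println("No such directory: " + dir);
            return;
        }
        System.out.println(find(dir, topology).map(Path::toString).orElse("no deployment found"));
    }
}
```

Comparing the result before and after a restart shows whether a new deployment was generated, which would explain hand edits to the old web.xml being lost.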

If I were to make these changes to the web.xml permanent, where would be the 
right location, given that the web.xml seems to be created on the fly?



________________________________
From: jeff saremi <jeffsar...@hotmail.com<mailto:jeffsar...@hotmail.com>>
Sent: Saturday, September 7, 2019 4:28 PM
To: user@knox.apache.org<mailto:user@knox.apache.org> 
<user@knox.apache.org<mailto:user@knox.apache.org>>
Subject: Re: Adding a web.xml to gateway.jar

Great suggestions! Thanks Larry
I will work on getting the web.xml and the servlet integrated
Completely agreed on the vulnerability side. We may expose this in a DEBUG 
version and not the release or provide a config value...
________________________________
From: larry mccay <lmc...@apache.org<mailto:lmc...@apache.org>>
Sent: Friday, September 6, 2019 7:25 PM
To: user@knox.apache.org<mailto:user@knox.apache.org> 
<user@knox.apache.org<mailto:user@knox.apache.org>>
Subject: Re: Adding a web.xml to gateway.jar

Hi Jeff -

This is an interesting idea and we should consider discussing this as a feature 
of Knox rather than just something that you are trying to hack into an existing 
release/deployment.

In order to get this to work, I would first change the web.xml in the 
deployments directory for a given topology and add the servlet in a jar 
within the {GATEWAY_HOME}/ext directory.
Stop and start the server and it should hopefully pick up the changed web.xml 
file.

In order to cause a 500, I think just dispatching to an invalid URL would 
result in a 500 with a connection exception.

See if that web.xml will work and we can take it from there.

It should be noted that surfacing the details of a webappexception may expose 
sensitive information about the server and you may not want to always have this 
enabled.

HTH.

--larry

On Fri, Sep 6, 2019 at 9:37 PM jeff saremi 
<jeffsar...@hotmail.com<mailto:jeffsar...@hotmail.com>> wrote:

Ultimately I am trying to make sure that when an HTTP 500 error happens in the 
gateway, the exception message and stack trace are returned in the response.
So I decided to add a web.xml to the gateway project and override parts of the 
error handling there (added to 
gateway-server-launcher/src/main/resources/META-INF/web.xml).

root@gateway-0:/opt/knox/bin# cat META-INF/web.xml
<?xml version="1.0" encoding="UTF-8"?>
<!--
  Licensed to ...
-->
<web-app xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns="http://java.sun.com/xml/ns/javaee"
         xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_3_0.xsd"
         version="3.0">
  <error-page>
    <error-code>400</error-code>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <error-code>401</error-code>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <error-code>404</error-code>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <exception-type>java.lang.Throwable</exception-type>
    <location>/ExceptionHandler</location>
  </error-page>
  <error-page>
    <exception-type>javax.servlet.ServletException</exception-type>
    <location>/ExceptionHandler</location>
  </error-page>
</web-app>

And then I added the following Servlet (added it to 
gateway-util-common/src/main/java/org/apache/knox/gateway/servlet/ExceptionHandlerServlet.java)


import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet("/ExceptionHandler")
public class ExceptionHandlerServlet extends HttpServlet {
  private static final long serialVersionUID = 1L;

  @Override
  protected void service(HttpServletRequest request,
    HttpServletResponse response) throws ServletException, IOException {
    // Analyze the servlet exception
    Throwable throwable = (Throwable) request.getAttribute("javax.servlet.error.exception");
    Integer statusCode = (Integer) request.getAttribute("javax.servlet.error.status_code");
    String servletName = (String) request.getAttribute("javax.servlet.error.servlet_name");
    if (servletName == null) {
      servletName = "Unknown";
    }
    String requestUri = (String) request.getAttribute("javax.servlet.error.request_uri");
    if (requestUri == null) {
      requestUri = "Unknown";
    }

    // Set response content type
    response.setContentType("text/html");

    PrintWriter out = response.getWriter();
    out.write("<html><head><title>Exception/Error Details</title></head><body>");
    // The error attributes are null when the servlet is requested directly, and
    // there is no exception when the status was set via sendError(), so check
    // before unboxing statusCode or dereferencing throwable.
    if (throwable == null || statusCode == null || statusCode != 500) {
      out.write("<h3>Error Details</h3>");
      out.write("<strong>Status Code</strong>:" + statusCode + "<br>");
      out.write("<strong>Requested URI</strong>:" + requestUri);
    } else {
      out.write("<h3>Exception Details</h3>");
      out.write("<ul><li>Servlet Name:" + servletName + "</li>");
      out.write("<li>Exception Name:" + throwable.getClass().getName() + "</li>");
      out.write("<li>Requested URI:" + requestUri + "</li>");
      out.write("<li>Exception Message:" + throwable.getMessage() + "</li>");
      out.write("</ul>");
    }

    out.write("<br><br>");
    out.write("</body></html>");
  }
}


I see that the application is launched using gateway.jar, and I also see my 
web.xml inside that jar. However, I'm not able to get anything returned from 
this servlet!

I honestly don't know how to repro a 500. But I could do a 400, 401, and 404. 
None of them got intercepted by the exception servlet I wrote.

Here are some examples I ran. Note that in the first one, a 400 is returned 
along with some exception message. That's what I want to do for a 500, or at 
least verify that it's being done. However, I haven't been able to find (using 
text search) where in the code this response is formed.

root@clustertest:/tests/knox# curl -iku root:goodpassword -X GET https://gateway-svc:8443/gateway/default/webhdfs/v1/?op=STATUS
HTTP/1.1 400 Bad Request
Date: Fri, 06 Sep 2019 23:42:42 GMT
Set-Cookie: KNOXSESSIONID=node01q5krk3jp1c9dzv3fc3t5tkgh4.node0;Path=/gateway/default;Secure;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Set-Cookie: rememberMe=deleteMe; Path=/gateway/default; Max-Age=0; Expires=Thu, 05-Sep-2019 23:42:43 GMT
Date: Fri, 06 Sep 2019 23:42:43 GMT
Cache-Control: no-cache
Expires: Fri, 06 Sep 2019 23:42:45 GMT
Date: Fri, 06 Sep 2019 23:42:45 GMT
Pragma: no-cache
X-FRAME-OPTIONS: SAMEORIGIN
Content-Type: application/json;charset=utf-8
Transfer-Encoding: chunked
Server: Jetty(9.4.12.v20180830)

{"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"Invalid value for webhdfs parameter \"op\": STATUS is not a valid GET operation."}}


root@clustertest:/tests/knox# curl -iku root:goodpassword -X GET https://gateway-svc:8443/gateway/default/webhdfs/v1/?op=LISTSTATUS
HTTP/1.1 200 OK
Date: Fri, 06 Sep 2019 23:43:29 GMT
Set-Cookie: KNOXSESSIONID=node0iz11bxvbn318h7zow5z977pc5.node0;Path=/gateway/default;Secure;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Set-Cookie: rememberMe=deleteMe; Path=/gateway/default; Max-Age=0; Expires=Thu, 05-Sep-2019 23:43:30 GMT
Date: Fri, 06 Sep 2019 23:43:30 GMT
Cache-Control: no-cache
Expires: Fri, 06 Sep 2019 23:43:30 GMT
Date: Fri, 06 Sep 2019 23:43:30 GMT
Pragma: no-cache
X-FRAME-OPTIONS: SAMEORIGIN
Content-Type: application/json;charset=utf-8
Transfer-Encoding: chunked
Server: Jetty(9.4.12.v20180830)

{"FileStatuses":{"FileStatus":[{"accessTime":0,"blockSize":0,"childrenNum":0,"fileId":16411,"group":"supergroup","length":0,"modificationTime":1567812978306,"owner":"root","pathSuffix":"jar","permission":"755","replication":0,"storagePolicy":0,"type":"DIRECTORY"},{"accessTime":0,"blockSize":0,"childrenNum":6,"fileId":16389,"group":"supergroup","length":0,"modificationTime":1567812975255,"owner":"root","pathSuffix":"livy","permission":"755","replication":0,"storagePolicy":0,"type":"DIRECTORY"},{"accessTime":0,"blockSize":0,"childrenNum":1,"fileId":16386,"group":"supergroup","length":0,"modificationTime":1567812943856,"owner":"root","pathSuffix":"spark","permission":"775","replication":0,"storagePolicy":0,"type":"DIRECTORY"},{"accessTime":0,"blockSize":0,"childrenNum":1,"fileId":16387,"group":"supergroup","length":0,"modificationTime":1567813293988,"owner":"root","pathSuffix":"spark-events","permission":"733","replication":0,"storagePolicy":0,"type":"DIRECTORY"},{"accessTime":0,"blockSize":0,"childrenNum":2,"fileId":16395,"group":"supergroup","length":0,"modificationTime":1567813273907,"owner":"root","pathSuffix":"tmp","permission":"1777","replication":0,"storagePolicy":0,"type":"DIRECTORY"},{"accessTime":0,"blockSize":0,"childrenNum":1,"fileId":16412,"group":"supergroup","length":0,"modificationTime":1567813267540,"owner":"root","pathSuffix":"user","permission":"777","replication":0,"storagePolicy":0,"type":"DIRECTORY"}]}}


root@clustertest:/tests/knox# curl -iku root:goodpassword -X GET https://gateway-svc:8443/gateway/default/webhdfs/v2/?op=LISTSTATUS
HTTP/1.1 404 Not Found
Date: Fri, 06 Sep 2019 23:43:53 GMT
Content-Length: 0
Server: Jetty(9.4.12.v20180830)

root@clustertest:/tests/knox# curl -iku root:badpassword -X GET https://gateway-svc:8443/gateway/default/webhdfs/v1/?op=LISTSTATUS
HTTP/1.1 401 Unauthorized
Date: Fri, 06 Sep 2019 23:44:17 GMT
Set-Cookie: rememberMe=deleteMe; Path=/gateway/default; Max-Age=0; Expires=Thu, 05-Sep-2019 23:44:17 GMT
WWW-Authenticate: BASIC realm="application"
Content-Length: 0
Server: Jetty(9.4.12.v20180830)





