Interesting! AFAIK there is no way to disable compression as of now, and I would expect 'gateway.gzip.compress.mime.types' to work [1].
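For reference, that property lives in conf/gateway-site.xml on the Knox gateway host. A sketch of what I'd try (the property name is from the Knox docs; the value list here is illustrative, not the shipped default):

```
<!-- conf/gateway-site.xml: MIME types Knox will gzip.
     Leaving application/json out of this list is the thing to test. -->
<property>
    <name>gateway.gzip.compress.mime.types</name>
    <value>text/html,text/plain,text/xml,text/css,application/javascript</value>
</property>
```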
So you are looking to turn off Gzip compression for the Livy service, correct? Just want to make sure I understand the problem.

Best,
Sandeep

[1] https://github.com/apache/knox/blob/master/gateway-server/src/main/java/org/apache/hadoop/gateway/GatewayServer.java#L422

On Thu, Aug 24, 2017 at 3:53 AM, Johan Wärlander <jo...@snowflake.nu> wrote:
> Hello,
>
> A colleague and I have been working on setting up a Knox service for
> Livy, so that we can allow an external Jupyter setup to manage Spark
> sessions without handling Kerberos auth; basically following this guide:
>
> https://community.hortonworks.com/articles/70499/adding-livy-server-as-service-to-apache-knox.html
>
> However, Livy doesn't seem to accept the calls coming through Knox,
> whereas if we POST directly to Livy using 'curl', all is good.
>
> From a quick 'tcpdump' session, a difference seems to be that Knox uses
> chunked transfers and compression, so I decided to try out some options
> (see details further down), and there definitely appears to be a problem
> with compressing the request.
>
> Is there a way to disable compression for a particular service in Knox?
>
> NOTE: I know about 'gateway.gzip.compress.mime.types', but according to
> the docs it only affects compression when sending data to the browser;
> we tried it nonetheless, and it didn't seem to help.
>
> TESTING DETAILS
>
> First, create some JSON to send to Livy:
>
> $ cat > session_johwar.json
> {"proxyUser":"johwar","kind": "pyspark"}
> $ gzip -n session_johwar.json
>
> Next, try a chunked and compressed POST request to /sessions:
>
> $ curl -u : --negotiate -v -s --trace-ascii http_trace_chunked_gz.log --data-binary @session_johwar.json.gz -H "Content-Type: application/json" -H 'Content-Encoding: gzip' -H 'Transfer-Encoding: chunked' http://myserver:8999/sessions
> "Illegal character ((CTRL-CHAR, code 31)): only regular white space (\\r, \\n, \\t) is allowed between tokens\n at [Source: HttpInputOverHTTP@756a5d6c; line: 1, column: 2]"
>
> Nope.. log excerpt:
>
> 040e: User-Agent: curl/7.47.0
> 0427: Accept: */*
> 0434: Content-Type: application/json
> 0454: Content-Encoding: gzip
> 046c: Transfer-Encoding: chunked
> 0488:
> 048a: 3d
> => Send data, 68 bytes (0x44)
> 0000: ...........V*(....-N-R.R...(O,R.Q...KQ.RP*.,.H,.V.....7..)...
> 003f: 0
> 0042:
> == Info: upload completely sent off: 68 out of 61 bytes
> <= Recv header, 26 bytes (0x1a)
> 0000: HTTP/1.1 400 Bad Request
> <= Recv header, 37 bytes (0x25)
> 0000: Date: Thu, 24 Aug 2017 07:20:57 GMT
> <= Recv header, 362 bytes (0x16a)
> 0000: WWW-Authenticate: Negotiate ...
> <= Recv header, 132 bytes (0x84)
> 0000: Set-Cookie: hadoop.auth="u=johwar&..."; HttpOnly
> <= Recv header, 47 bytes (0x2f)
> 0000: Content-Type: application/json; charset=UTF-8
> <= Recv header, 21 bytes (0x15)
> 0000: Content-Length: 172
> <= Recv header, 33 bytes (0x21)
> 0000: Server: Jetty(9.2.16.v20160414)
> <= Recv header, 2 bytes (0x2)
> 0000:
> <= Recv data, 172 bytes (0xac)
> 0000: "Illegal character ((CTRL-CHAR, code 31)): only regular white sp
> 0040: ace (\\r, \\n, \\t) is allowed between tokens\n at [Source: Http
> 0080: InputOverHTTP@583564e8; line: 1, column: 2]"
>
> Ok, so let's try with just compression:
>
> $ curl -u : --negotiate -v -s --trace-ascii http_trace_gz.log --data-binary @session_johwar.json.gz -H "Content-Type: application/json" -H 'Content-Encoding: gzip' http://myserver:8999/sessions
> "Illegal character ((CTRL-CHAR, code 31)): only regular white space (\\r, \\n, \\t) is allowed between tokens\n at [Source: HttpInputOverHTTP@188893c9; line: 1, column: 2]"
>
> Ok, no luck.. log is mostly the same, except for no chunking:
>
> 040e: User-Agent: curl/7.47.0
> 0427: Accept: */*
> 0434: Content-Type: application/json
> 0454: Content-Encoding: gzip
> 046c: Content-Length: 61
> 0480:
> => Send data, 61 bytes (0x3d)
> 0000: ...........V*(....-N-R.R...(O,R.Q...KQ.RP*.,.H,.V.....7..)...
> == Info: upload completely sent off: 61 out of 61 bytes
> <= Recv header, 26 bytes (0x1a)
> 0000: HTTP/1.1 400 Bad Request
>
> Decompress the file again:
>
> $ gunzip session_johwar.json.gz
>
> Then.. just a plain old request, "known" to work already:
>
> $ curl -u : --negotiate -v -s --trace-ascii http_trace.log --data @session_johwar.json -H "Content-Type: application/json" http://myserver:8999/sessions
> {"id":5,"appId":null,"owner":"johwar","proxyUser":"johwar","state":"starting","kind":"pyspark","appInfo":{"driverLogUrl":null,"sparkUiUrl":null},"log":[]}
>
> Yep.
> Log is looking a lot better:
>
> 040e: User-Agent: curl/7.47.0
> 0427: Accept: */*
> 0434: Content-Type: application/json
> 0454: Content-Length: 40
> 0468:
> => Send data, 40 bytes (0x28)
> 0000: {"proxyUser":"johwar","kind": "pyspark"}
> == Info: upload completely sent off: 40 out of 40 bytes
> <= Recv header, 22 bytes (0x16)
> 0000: HTTP/1.1 201 Created
>
> And with chunking?
>
> $ curl -u : --negotiate -v -s --trace-ascii http_trace_chunked.log --data @session_johwar.json -H "Content-Type: application/json" -H 'Transfer-Encoding: chunked' http://myserver:8999/sessions
> {"id":6,"appId":null,"owner":"johwar","proxyUser":"johwar","state":"starting","kind":"pyspark","appInfo":{"driverLogUrl":null,"sparkUiUrl":null},"log":[]}
>
> Still works.
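Incidentally, "CTRL-CHAR, code 31" in that Jetty error is 0x1f, the first byte of the gzip magic number (0x1f 0x8b, per RFC 1952), so the request body is evidently reaching Livy's JSON parser still compressed rather than being decompressed first. A quick local check of the framing, no Knox or Livy involved:

```shell
# Compress a JSON snippet and inspect the first two bytes:
# every gzip stream starts with the magic number 1f 8b.
printf '{"proxyUser":"johwar","kind": "pyspark"}' \
  | gzip -n \
  | od -An -tx1 -N2 \
  | tr -d ' '
# → 1f8b
```

That 0x1f is exactly the "illegal character" Jackson complains about at line 1, column 2 of the 400 responses above.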