Hi Sandeep,

Thanks for the reply. I see that this exception coming for connections from UI 
Client to Knox Gateway:

2019-08-11 04:43:27,432 DEBUG ssl.SslConnection 
(SslConnection.java:handshakeFailed(764)) - handshake failed 
SslConnection@1d3697f3::SocketChannelEndPoint@6d000fcf{/192.168.141.33:35102<->/192.168.141.31:8443,OPEN,fill=-,flush=-,to=2/300000}{io=0/0,kio=0,kro=1}->SslConnection@1d3697f3{NEED_UNWRAP,eio=0/-1,di=-1,fill=IDLE,flush=IDLE}~>DecryptedEndPoint@20e999e9{/192.168.141.33:35102<->/192.168.141.31:8443,OPEN,fill=-,flush=-,to=99/300000}=>HttpConnection@264e99b9[p=HttpParser{s=START,0
 of 
-1},g=HttpGenerator@717c736d{s=START}]=>HttpChannelOverHttp@570d89da{r=0,c=false,a=IDLE,uri=null,age=0}
 javax.net.ssl.SSLException: Received fatal alert: certificate_unknown

Now to fix this, do I have to add Knox certificate in my UI Client machine i.e. 
my laptop’s cacert  or what else ?

One more query: For any SSL enabled service to be integrated with Knox, the 
same step i.e. adding Knox certificate in UI Client’s truststore is mandatory ?

Regards,
Rajat

From: Sandeep Moré <moresand...@gmail.com>
Reply to: "user@knox.apache.org" <user@knox.apache.org>
Date: Wednesday, 14 August 2019 at 7:36 PM
To: "user@knox.apache.org" <user@knox.apache.org>
Subject: Re: Knox jetty threads stuck and Seeing ConnectionPoolTimeoutException 
while proxying a custom service

Hello Rajat,

I see "avax.net.ssl.SSLException: Received fatal alert: certificate_unknown" 
errors in the logs, this appears to be the root cause.
Appears to be a certificate issue.

On Mon, Aug 12, 2019 at 1:51 PM Rajat Goel 
<rajat.g...@thalesgroup.com<mailto:rajat.g...@thalesgroup.com>> wrote:
Attaching full debug log as well.

From: Rajat Goel <rajat.g...@thalesgroup.com<mailto:rajat.g...@thalesgroup.com>>
Reply to: "user@knox.apache.org<mailto:user@knox.apache.org>" 
<user@knox.apache.org<mailto:user@knox.apache.org>>
Date: Monday, 12 August 2019 at 10:50 PM
To: "user@knox.apache.org<mailto:user@knox.apache.org>" 
<user@knox.apache.org<mailto:user@knox.apache.org>>
Subject: Re: Knox jetty threads stuck and Seeing ConnectionPoolTimeoutException 
while proxying a custom service

Hi Sandeep,

Thanks for your reply.

Yes, the websocket traffic works as expected initially. Connection is 
established and data flows correctly. Websocket backend is secured as well. To 
check if the issue is due to SSL or not, I disabled SSL on My service and the 
issue got reproduced with no secure (non SSL) setup as well.

Attaching instance logs from one cycle of reproduction. This instance uses 
SSOCookieProvider (for SSO) and had DEBUG logs enabled only for 
org.apache.knox.gateway package. Full debug logs file instance is large in size 
so cannot send via mail but let me know if you need that as well. Also 
attaching thread dump of Knox for one particular instance when Knox threads 
were stuck.

I don’t see any errors in service log files. The issue is reproducible every 
time so if you need any other information, please do let me know.

Thanks & Regards,
Rajat

From: Sandeep Moré <moresand...@gmail.com<mailto:moresand...@gmail.com>>
Reply to: "user@knox.apache.org<mailto:user@knox.apache.org>" 
<user@knox.apache.org<mailto:user@knox.apache.org>>
Date: Monday, 12 August 2019 at 7:37 PM
To: "user@knox.apache.org<mailto:user@knox.apache.org>" 
<user@knox.apache.org<mailto:user@knox.apache.org>>
Subject: Re: Knox jetty threads stuck and Seeing ConnectionPoolTimeoutException 
while proxying a custom service

Hello Rajat,

Do you see Websocket traffic work as expected the first time when the UI works 
? can you check using the developer console if the initial Websocket connection 
was established and data is flowing correctly?

Looks like you are connecting to a secure backend, is the Websocket backend 
secure as well ?
Can you also post a redacted gateway.log file with DEBUG logging, it would be 
helpful to see the entire log file with the stacktrace.

Also, do you see any errors in your service log files ?

Best,
Sandeep


On Mon, Aug 12, 2019 at 1:08 AM Rajat Goel 
<rajat.g...@thalesgroup.com<mailto:rajat.g...@thalesgroup.com>> wrote:
Hello,

I need some help while trying to integrate my custom service with Knox. Here’s 
how the service is deployed:

                                                        _   Knox (Instance 1) _ 
                                          _  My Service1 (Master).
                                                     |                          
               |                                       |
UI Client -> Load Balancer  --> |                                         | --  
For Availability --> |
                     (HA Proxy)             |                                   
      |       (HA Proxy)             |
                                                     | _  Knox (Instance 2) _  
|                                       | _ My Service2 (Standby)


My custom service UI client uses both web sockets as well as REST calls to talk 
to My Service backend. I have written service definition xml and rewrite xml 
files and deployed them on a test setup as per the above architecture. My Knox 
version is 1.0 (HDP 3.1). I have added the service in Knox default topology and 
 default topology uses Shiro Provider for authentication and 
‘gateway.websocket.feature.enabled’ is set to true.

When Knox service is started, I see the following behavior:

  *   When I launch the UI from browser, I get Shiro authentication dialog 
popup, I enter my credentials and get redirected to my Service UI. Initially, 
for the first couple of minutes, UI works fine. All calls from UI are getting 
routed properly to backend.
  *   After first few minutes, I start seeing following exception in Knox 
gateway logs:

                       2019-08-10 15:37:32,053 ERROR gateway.websockets 
(ProxyWebSocketAdapter.java:onWebSocketConnect(105)) - Unable to connect to 
websocket server: java.io.IOException: Connect failure

java.io.IOException: Connect failure

Caused by: org.eclipse.jetty.websocket.api.UpgradeException: 0 null

        at 
org.eclipse.jetty.websocket.client.WebSocketUpgradeRequest.onComplete(WebSocketUpgradeRequest.java:515)

Caused by: java.io.EOFException: 
HttpConnectionOverHTTP@1e8a6128::SocketChannelEndPoint@5526bc43{rafint001-mgt-01.cloud.in.guavus.com/192.168.141.33:9443<http://rafint001-mgt-01.cloud.in.guavus.com/192.168.141.33:9443><->/192.168.141.31:49860<http://192.168.141.31:49860>,ISHUT,fill=-,flush=-,to=3/0}{io=0/0,kio=0,kro=1}->HttpConnectionOverHTTP@1e8a6128(l:/192.168.141.31:49860<http://192.168.141.31:49860>
 <-> 
r:rafint001-mgt-01.cloud.in.guavus.com/192.168.141.33:9443,closed=false)=<http://rafint001-mgt-01.cloud.in.guavus.com/192.168.141.33:9443,closed=false)=>>HttpChannelOverHTTP@500cd5d(exchange=HttpExchange@75bf9768
 req=TERMINATED/null@null 
res=PENDING/null@null)[send=HttpSenderOverHTTP@6678aae8(req=QUEUED,snd=COMPLETED,failure=null)[HttpGenerator@6dcd6c0{s=START}],recv=HttpReceiverOverHTTP@5358963(rsp=IDLE,failure=null)[HttpParser{s=CLOSED,0
 of -1}]]

        ... 13 more


  *   UI starts slowing down and response times increase.
  *   I also see DEBUG exceptions such as:

DEBUG io.FillInterest (FillInterest.java:onFail(134)) - onFail 
FillInterest@1d803962{null}

java.util.concurrent.TimeoutException: Idle timeout expired: 300000/300000 ms

        at 
org.eclipse.jetty.io.IdleTimeout.checkIdleTimeout(IdleTimeout.java:166)

        at org.eclipse.jetty.io.IdleTimeout$1.run(IdleTimeout.java:50)

        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

        at java.util.concurrent.FutureTask.run(FutureTask.java:266)

        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)

        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)

        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

        at java.lang.Thread.run(Thread.java:748)





DEBUG io.WriteFlusher (WriteFlusher.java:onFail(471)) - ignored: 
WriteFlusher@3a2f4fc0{IDLE}->null

java.nio.channels.ClosedChannelException

        at org.eclipse.jetty.io.WriteFlusher.onClose(WriteFlusher.java:502)

        at 
org.eclipse.jetty.io.AbstractEndPoint.onClose(AbstractEndPoint.java:353)

        at 
org.eclipse.jetty.io.ChannelEndPoint.onClose(ChannelEndPoint.java:216)

        at 
org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:225)

        at 
org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:192)

        at 
org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:175)

        at 
org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.doClose(SslConnection.java:1132)

        at 
org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:220)



DEBUG ssl.SslConnection (SslConnection.java:flush(950)) - 
SslConnection@1cb9d1f9::SocketChannelEndPoint@6fa560e8{/192.168.141.31:44258<http://192.168.141.31:44258><->/192.168.141.31:8443<http://192.168.141.31:8443>,ISHUT,fill=-,flush=-,to=8/300000}{io=0/0,kio=0,kro=1}->SslConnection@1cb9d1f9{NEED_UNWRAP,eio=-1/-1,di=-1,fill=IDLE,flush=IDLE}~>DecryptedEndPoint@205f9957{/192.168.141.31:44258<http://192.168.141.31:44258><->/192.168.141.31:8443<http://192.168.141.31:8443>,CLOSED,fill=-,flush=-,to=9/300000}=>HttpConnection@11b5199c[p=HttpParser{s=CLOSED,0
 of 
-1},g=HttpGenerator@40706618{s=START}]=>HttpChannelOverHttp@655d0b98{r=0,c=false,a=IDLE,uri=null,age=0}

java.io.IOException: Broken pipe

        at 
org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.flush(SslConnection.java:847)

        at 
org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.doShutdownOutput(SslConnection.java:1076)

        at 
org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.doClose(SslConnection.java:1131)

        at 
org.eclipse.jetty.io.AbstractEndPoint.doOnClose(AbstractEndPoint.java:220)

        at 
org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:192)

        at 
org.eclipse.jetty.io.AbstractEndPoint.close(AbstractEndPoint.java:175)

        at 
org.eclipse.jetty.io.AbstractConnection.close(AbstractConnection.java:248)

        at 
org.eclipse.jetty.server.HttpChannelOverHttp.earlyEOF(HttpChannelOverHttp.java:234)

        at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:1551)

        at 
org.eclipse.jetty.server.HttpConnection.parseRequestBuffer(HttpConnection.java:360)

        at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:250)

        at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)

        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)

        at 
org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.onFillable(SslConnection.java:411)



  *   Few minutes later, UI hangs. I start seeing following Exceptions in Knox 
gateway log:

                              2019-08-10 15:47:41,366 WARN  knox.gateway 
(DefaultDispatch.java:executeOutboundRequest(147)) - Connection exception 
dispatching request: 
https://rafint001-mgt-01.cloud.in.guavus.com:9443/_sock/iframe.html 
org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for 
connection from pool

        at 
org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:313)

        at 
org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:279)



                               2019-08-10 15:47:41,371 ERROR knox.gateway 
(AbstractGatewayFilter.java:doFilter(63)) - Failed to execute filter: 
java.io.IOException: Service connectivity error.

java.io.IOException: Service connectivity error.

        at 
org.apache.knox.gateway.dispatch.DefaultDispatch.executeOutboundRequest(DefaultDispatch.java:148)

        at 
org.apache.knox.gateway.dispatch.DefaultDispatch.executeRequest(DefaultDispatch.java:116)

        at 
org.apache.knox.gateway.dispatch.DefaultDispatch.doGet(DefaultDispatch.java:278)



  *   Few minutes later, when I refresh UI page (after clearing any cookies) I 
see HTTP 401 Error on UI. This time no Shiro authentication dialog popup. 
Nothing seems to work hereafter. I see following logs only in gateway.log:

                     2019-08-10 15:40:10,667 DEBUG knox.gateway 
(GatewayFilter.java:doFilter(119)) - Received request: GET /pdie/cdap/

2019-08-10 15:40:10,667 DEBUG authc.BasicHttpAuthenticationFilter 
(BasicHttpAuthenticationFilter.java:sendChallenge(274)) - Authentication 
required: sending 401 Authentication challenge response.



  *   When I restart Knox service from Ambari or run a ‘touch’ command on 
topology file (/etc/knox/conf/topologies/default.xml), UI starts working again 
and the above set of issues repeats.
  *   I tried increasing gateway threads (gateway.threadpool.max to 500) and 
HTTP connections (gateway.httpclient.maxConnections to 100). With these 
settings, UI works fine for a little more time as compared to scenario with 
default values but eventually hit the same issues as above.
  *   When UI hangs, I tried to thread dump of Knox service. I see that many of 
the threads in Knox are stuck in socket read calls with following trace:
                                  "qtp2099051403-139" #139 prio=5 os_prio=0 
tid=0x00007f4d7c002000 nid=0x6ed8 runnable [0x00007f4dee0e0000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
        at java.net.SocketInputStream.read(SocketInputStream.java:171)
        at java.net.SocketInputStream.read(SocketInputStream.java:141)
        at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
        at sun.security.ssl.InputRecord.read(InputRecord.java:503)
        at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
        - locked <0x00000006f471e218> (a java.lang.Object)
        at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:933)
        at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
        - locked <0x00000006f471edc0> (a sun.security.ssl.AppInputStream)
        at 
org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
        at 
org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
        at 
org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
        at 
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
        at 
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
        at 
org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
        at 
org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
        at 
org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
        at 
org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
        at 
org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
        at 
org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
        at 
org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
        at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
        at 
org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
        at 
org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
        at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
        at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
        at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
        at 
org.apache.knox.gateway.dispatch.DefaultDispatch.executeOutboundRequest(DefaultDispatch.java:130)
        at 
org.apache.knox.gateway.dispatch.DefaultDispatch.executeRequest(DefaultDispatch.java:116)
        at 
org.apache.knox.gateway.dispatch.DefaultDispatch.doPost(DefaultDispatch.java:305)
        at 
org.apache.knox.gateway.dispatch.GatewayDispatchFilter$PostAdapter.doMethod(GatewayDispatchFilter.java:177)
        at 
org.apache.knox.gateway.dispatch.GatewayDispatchFilter.doFilter(GatewayDispatchFilter.java:122)
        at 
org.apache.knox.gateway.filter.AbstractGatewayFilter.doFilter(AbstractGatewayFilter.java:61)
        at 
org.apache.knox.gateway.GatewayFilter$Holder.doFilter(GatewayFilter.java:372)
        at 
org.apache.knox.gateway.GatewayFilter$Chain.doFilter(GatewayFilter.java:272)

Can some please help on resolving the above set of issues:

  1.  With the deployment architecture as described above: UI -> HAProxy -> 
Knox -> HAProxy -> Service backend, are there any specific configurations 
required in HA Proxy and Knox to make this work. Is this right way to deploy ?
  2.  ‘org.eclipse.jetty.websocket.api.UpgradeException’ and how can this be 
fixed?
  3.  How to debug and fix Knox Jetty threads hang ?

Thanks & Regards,
Rajat

Reply via email to