[jira] [Reopened] (CXF-7122) Infinite loop due to AsyncHTTPConduit read timeout with exhausted connection pool

William Montaz (JIRA) Fri, 04 Nov 2016 00:06:13 -0700

     [ 
https://issues.apache.org/jira/browse/CXF-7122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


William Montaz reopened CXF-7122:
---------------------------------

Hi Fang,

thank you for your answer. I disagree on several points:
- For the sync case, we had no problem, so there is no need to remove the 
wait(scPolicy.getReceiveTimeout()). I actually had to put it back because 
removing it broke the syncTimeout test.
- For the async case, the old code was doing the wait anyway, so the race 
condition you describe would already apply to all previous versions of CXF? 
Your analysis seems wrong: if the HttpResponse is handed back by the callback, 
then when the thread enters getHttpResponse it will find that httpResponse is 
not null (line 619) and thus will never wait. If httpResponse is null, it will 
wait. Since both setHttpResponse and getHttpResponse are synchronized, we have 
a correct happens-before relationship. The behavior here is correct.
- The solution you describe does not help with the fact that a thread will 
time out even if no real TCP traffic occurs. Such behavior can be a pain to 
debug: you see a SocketReadTimeoutException in the log without any outgoing 
request.
- With the solution you describe, the IOReactor thread can still try to write 
some data to the socket, such as headers (that’s actually what happened when 
we discovered the problem), and is then told not to do anything more. This is 
because the thread shutting down the buffer races with the thread writing to 
the socket from the buffer. Such behavior can lead to real trouble, since the 
server will get an incomplete HTTP request.
Finally, the solution I provide has been tested on our preproduction platform.
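
To illustrate the second point above, here is a minimal sketch of a 
synchronized handoff between a callback thread and a worker thread. The class 
and method names are hypothetical stand-ins, not the actual AsyncHTTPConduit 
code, and the loop around wait() is an addition to guard against spurious 
wakeups. It assumes a positive timeout.

```java
import java.io.UncheckedIOException;
import java.net.SocketTimeoutException;

// Hypothetical stand-in for the response handoff discussed above.
class ResponseHandoff {
    private Object response; // stand-in for the HTTP response object

    // Called by the callback thread once the response is available.
    synchronized void setResponse(Object r) {
        response = r;
        notifyAll(); // wake up any worker blocked in getResponse
    }

    // Called by the worker thread. Both methods lock the same monitor,
    // so a response set before this call is always visible here and the
    // worker never enters wait() in that case.
    synchronized Object getResponse(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (response == null) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                throw new UncheckedIOException(
                        new SocketTimeoutException("Read timed out"));
            }
            try {
                wait(remaining); // releases the monitor while waiting
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
        return response;
    }

    public static void main(String[] args) {
        ResponseHandoff handoff = new ResponseHandoff();
        handoff.setResponse("response");                // callback fires first
        System.out.println(handoff.getResponse(1000L)); // returns without waiting
    }
}
```

If the callback runs first, the getter sees a non-null response and returns 
immediately. If the getter runs first, it waits until the callback notifies it 
or the timeout elapses.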

Could you dig further?

Thank you

> Infinite loop due to AsyncHTTPConduit read timeout with exhausted connection 
> pool
> ---------------------------------------------------------------------------------
>
>                 Key: CXF-7122
>                 URL: https://issues.apache.org/jira/browse/CXF-7122
>             Project: CXF
>          Issue Type: Bug
>          Components: Transports
>            Reporter: William Montaz
>            Assignee: Freeman Fang
>            Priority: Critical
>             Fix For: 3.2.0, 3.1.9
>
>
> Using AsyncHTTPConduit, when the underlying connection pool gets exhausted, 
> requests waiting for a connection will lead to an infinite loop if they reach 
> the receive timeout.
> The problem occurred on all versions of CXF above 3.0.5 (we did not test 
> other ones). 
> Let's imagine a backend that's broken and leads to timeout for all requests.
> When handling requests, the cxf worker thread will eventually go into a wait 
> state (AsyncHTTPConduit:618), with a timeout that matches the 
> HTTPClientPolicy.setReceiveTimeout() value, waiting for the NIO stack to 
> complete and call notifyAll via responseCallback (AsyncHTTPConduit:455). 
> The timeout on the wait is the big problem:
> With our broken backend, the connection pool is exhausted waiting for other 
> requests to time out. When a new request is made by cxf against this backend, 
> the following happens once the timeout elapses:
>  - on the one side, the reactor threads will get a connection from the pool 
> and try to write to the output stream. Waiting in the pool does not count 
> toward the receive timeout.
>  - on the other side, the cxf worker thread will wake up (because of the 
> timed-out wait) and shut down SharedOutputBuffer and SharedInputBuffer 
> (AsyncHTTPClient:624)
>  - reactor threads will go into an infinite loop because they will try to 
> produceContent from a shut-down buffer (SharedOutputBuffer:120)
>  
>  From there, application recovery is compromised.
>   
>  To fix that, timeout should be handled only via the client callback 
> (AsyncHTTPConduit:463).
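
The proposal quoted above, where the timeout is reported only through the 
client callback, can be sketched roughly as follows. This is an illustration 
under stated assumptions, not the CXF implementation: the class and its 
methods are hypothetical, and CompletableFuture is swapped in for the 
wait/notifyAll handoff to keep the example short.

```java
import java.net.SocketTimeoutException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Hypothetical exchange in which only the I/O callback ends the wait.
class CallbackOnlyTimeout {
    // Completed exactly once, by whichever callback method fires first.
    private final CompletableFuture<String> result = new CompletableFuture<>();

    // Invoked by the I/O layer when the response has been received.
    void completed(String response) {
        result.complete(response);
    }

    // Invoked by the I/O layer on failure, including a socket timeout.
    void failed(Exception cause) {
        result.completeExceptionally(cause);
    }

    // The worker blocks with no timer of its own, so it cannot wake up
    // and shut anything down while the I/O layer is still writing.
    String awaitResponse() {
        return result.join(); // throws CompletionException on failure
    }

    public static void main(String[] args) {
        CallbackOnlyTimeout exchange = new CallbackOnlyTimeout();
        // Simulate the I/O layer reporting a socket timeout.
        new Thread(() -> exchange.failed(
                new SocketTimeoutException("Read timed out"))).start();
        try {
            exchange.awaitResponse();
        } catch (CompletionException e) {
            System.out.println("failed via callback: " + e.getCause());
        }
    }
}
```

With this shape, a request that is still queued for a pooled connection has no 
socket yet, so no timeout is raised on it until real I/O starts.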



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
