[ 
https://issues.apache.org/jira/browse/CXF-7122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15644164#comment-15644164
 ] 

Freeman Fang edited comment on CXF-7122 at 11/7/16 1:24 PM:
------------------------------------------------------------

Hi William,

Your last comment is very interesting.

That was not the case when I worked on CXF-6910. I just checked the ahc code 
base and its latest commit log, and found that HTTPASYNC-105 introduced the 
behavior of resetting the SocketTimeout when a connection is released to the 
pool. This is in ahc 4.1.2, released in June this year. With the fix for 
CXF-6939 we already upgraded to 4.1.2, so for CXF 3.1.x and 3.2 we should be 
able to use this new behavior from ahc (this is good, as we can use a more 
universal way to handle the timeout!).

I will do more tests to confirm that this new behavior works as expected, and 
will look at your patch again based on it. But I'm not sure we can upgrade CXF 
3.0.x to ahc 4.1.2. I'm also wondering: if you use CXF 3.0.x, how did you see 
this new behavior from ahc 4.1.2? Or did you just dig into the ahc source?

Thanks
Freeman




was (Author: ffang):
Hi William,

Your last comment is very interesting.

That was not the case when I worked on CXF-6910. I just checked the ahc code 
base and its latest commit log, and found that HTTPASYNC-105 introduced the 
behavior of resetting the SocketTimeout when a connection is released to the 
pool. This is in ahc 4.1.2, released in June this year. With the fix for 
CXF-6939 we already upgraded to 4.1.2, so for CXF 3.1.x and 3.2 we should be 
able to use this new behavior from ahc (this is good!).

I will do more tests and look at your patch again based on this new behavior. 
But I'm not sure we can upgrade CXF 3.0.x to ahc 4.1.2. I'm also wondering: if 
you use CXF 3.0.x, how did you see this new behavior from ahc 4.1.2? Or did 
you just dig into the ahc source?

Thanks
Freeman



> Infinite loop due to AsyncHTTPConduit read timeout with exhausted connection 
> pool
> ---------------------------------------------------------------------------------
>
>                 Key: CXF-7122
>                 URL: https://issues.apache.org/jira/browse/CXF-7122
>             Project: CXF
>          Issue Type: Bug
>          Components: Transports
>            Reporter: William Montaz
>            Assignee: Freeman Fang
>            Priority: Critical
>             Fix For: 3.2.0, 3.1.9
>
>         Attachments: AsyncHTTPConduitTest.java
>
>
> Using AsyncHTTPConduit, when the underlying connection pool gets exhausted, 
> requests waiting for a connection will lead to an infinite loop if they reach 
> the receive timeout.
> The problem occurred on all versions of CXF above 3.0.5 (we did not test 
> other ones). 
> Let's imagine a broken backend that causes all requests to time out.
> When handling requests, the cxf worker thread will eventually go into a wait 
> state (AsyncHTTPConduit:618), with a timeout that matches the 
> HTTPClientPolicy.setReceiveTimeout() value, waiting for the NIO stack to 
> complete and call notifyAll via responseCallback (AsyncHTTPConduit:455). 
> The timeout on the wait is the main problem:
> With our broken backend, the connection pool is exhausted, waiting for other 
> requests to time out. When cxf makes a new request against this backend, the 
> following happens once the timeout elapses:
>  - on the one side, the reactor threads will get a connection from the pool 
> and try to write to the output stream. Time spent waiting in the pool is not 
> counted as receive timeout.
>  - on the other side, the cxf worker thread will wake up (because the wait 
> timed out) and shut down SharedOutputBuffer and SharedInputBuffer 
> (AsyncHTTPClient:624)
>  - reactor threads will enter an infinite loop because they will try to 
> produceContent from a shut-down buffer (SharedOutputBuffer:120)
>  
>  From there, application recovery is compromised.
>   
>  To fix that, the timeout should be handled only via the client callback 
> (AsyncHTTPConduit:463).
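
The proposed fix above (let only the client callback decide that a request 
has timed out) could be sketched roughly as follows. This is a minimal, 
self-contained illustration using only the JDK, not the actual CXF or ahc 
code: the names CallbackTimeoutSketch, Exchange, await and demo are 
hypothetical, and a scheduled task stands in for the I/O layer reporting a 
read timeout. The worker thread blocks without a deadline of its own, so it 
never shuts the buffers down while the I/O side may still be writing.

```java
import java.net.SocketTimeoutException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class CallbackTimeoutSketch {

    /** Hypothetical stand-in for one request/response exchange (not a CXF type). */
    static final class Exchange {
        final CompletableFuture<String> response = new CompletableFuture<>();
        volatile boolean buffersShutDown;

        // Invoked only by the I/O side, i.e. the "client callback" in the report.
        void completed(String body) {
            response.complete(body);
        }

        void failed(Exception cause) {
            // The I/O side has stopped producing, so closing the buffers here
            // cannot race with a reactor thread that is still writing.
            buffersShutDown = true;
            response.completeExceptionally(cause);
        }
    }

    /** Worker side: blocks until the callback fires; it owns no deadline itself. */
    static String await(Exchange exchange) throws Exception {
        try {
            return exchange.response.get();
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        }
    }

    /** Simulates an I/O layer that reports a read timeout after the given delay. */
    public static String demo(long socketTimeoutMillis) {
        ScheduledExecutorService reactor = Executors.newSingleThreadScheduledExecutor();
        Exchange exchange = new Exchange();
        reactor.schedule(
            () -> exchange.failed(new SocketTimeoutException("read timed out")),
            socketTimeoutMillis, TimeUnit.MILLISECONDS);
        try {
            return await(exchange);
        } catch (SocketTimeoutException e) {
            return "timeout reported by callback, buffersShutDown=" + exchange.buffersShutDown;
        } catch (Exception e) {
            return "unexpected: " + e;
        } finally {
            reactor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(100));
    }
}
```

The design choice in this sketch is that exactly one side (the I/O callback) 
both detects the timeout and releases the buffers, which removes the window 
in which a worker thread and a reactor thread disagree about whether the 
exchange is still alive.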



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
