[
https://issues.apache.org/jira/browse/HTTPASYNC-156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16983282#comment-16983282
]
Anurag Agarwal commented on HTTPASYNC-156:
------------------------------------------
Yes it is going to resolve the deadlock, but I have other concerns here.
So I am taking these two cases from where the cancel is being called:
1. Before the managedConn is created / locked. In that case just after entering
the resultFuture.completed -> connectionAllocated method it will check if the
exchange handler is closed and will discard this connection correctly.
2. Another is when just after getting the lock on the newly created managedConn
cancel is being called again the above case will trigger and it should work
fine as well.
3. Now if the cancel is called when the first thread is already in
connectionAllocated, let us say after this.closed.get() is checked, I am not
sure if from there things will go well. If here the cancel is called,
responseConsumers will be closed via exectionCancelled call. Then in discard
connection, this managed connection will be shutdown. This will in turn close
the session, but the call will stop while releasing connection as there we
require a managedConn lock which is currently held by the first thread. The
first thread can then proceed from there to requestOutput. There is a stale
check and I think it can handle things well, but I am not able to think of
every scenario where things can fail / pass here. This is the case when I have
doubts if the resources are going to be correctly cleared or not.
My main concern here is that even for the third scenario if the connection
again end up in the connection pool as a valid and available connection from
leased then the other requests will start failing on this connection as the
session is closed for this session. Just recheck once if that scenario is
handled.
> Deadlock while releasing connection
> -----------------------------------
>
> Key: HTTPASYNC-156
> URL: https://issues.apache.org/jira/browse/HTTPASYNC-156
> Project: HttpComponents HttpAsyncClient
> Issue Type: Bug
> Reporter: Anurag Agarwal
> Priority: Major
>
> There is a deadlock scenario which leads to blocking threads and can make the
> whole application stuck.
> Following is the thread dump:
> {code:java}
> "I/O dispatcher 3" - Thread t@31
> java.lang.Thread.State: BLOCKED
> at
> org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.isRouteComplete(PoolingNHttpClientConnectionManager.java:482)
> - waiting to lock <5acdd0f1> (a
> org.apache.http.impl.nio.conn.CPoolProxy) owned by "http-nio-8080-exec-35"
> t@194
> at
> org.apache.http.impl.nio.client.AbstractClientExchangeHandler.connectionAllocated(AbstractClientExchangeHandler.java:323)
> at
> org.apache.http.impl.nio.client.AbstractClientExchangeHandler.access$000(AbstractClientExchangeHandler.java:62)
> at
> org.apache.http.impl.nio.client.AbstractClientExchangeHandler$1.completed(AbstractClientExchangeHandler.java:387)
> at
> org.apache.http.impl.nio.client.AbstractClientExchangeHandler$1.completed(AbstractClientExchangeHandler.java:383)
> at
> org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:122)
> at
> org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager$1.completed(PoolingNHttpClientConnectionManager.java:306)
> at
> org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager$1.completed(PoolingNHttpClientConnectionManager.java:297)
> at
> org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:122)
> at
> org.apache.http.nio.pool.RouteSpecificPool.completed(RouteSpecificPool.java:146)
> at
> org.apache.http.nio.pool.AbstractNIOConnPool.requestCompleted(AbstractNIOConnPool.java:563)
> at
> org.apache.http.nio.pool.AbstractNIOConnPool$InternalSessionRequestCallback.completed(AbstractNIOConnPool.java:883)
> at
> org.apache.http.impl.nio.reactor.SessionRequestImpl.completed(SessionRequestImpl.java:154)
> - locked <7e72f906> (a
> org.apache.http.impl.nio.reactor.SessionRequestImpl)
> at
> org.apache.http.impl.nio.reactor.AbstractIOReactor.processNewChannels(AbstractIOReactor.java:426)
> at
> org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:287)
> at
> org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
> at
> org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
> at java.lang.Thread.run(Thread.java:748)
> Locked ownable synchronizers:
> - locked <5d82b8b0> (a
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
> "I/O dispatcher 2" - Thread t@30
> java.lang.Thread.State: WAITING
> at sun.misc.Unsafe.park(Native Method)
> - waiting to lock <5d82b8b0> (a
> java.util.concurrent.locks.ReentrantLock$NonfairSync) owned by "I/O
> dispatcher 3" t@31
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
> at
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
> at
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
> at
> org.apache.http.nio.pool.AbstractNIOConnPool.release(AbstractNIOConnPool.java:344)
> at
> org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.releaseConnection(PoolingNHttpClientConnectionManager.java:391)
> - locked <6720e46f> (a org.apache.http.impl.nio.conn.CPoolProxy)
> at
> org.apache.http.impl.nio.client.AbstractClientExchangeHandler.releaseConnection(AbstractClientExchangeHandler.java:245)
> at
> org.apache.http.impl.nio.client.MainClientExec.responseCompleted(MainClientExec.java:387)
> at
> org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:172)
> at
> org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:448)
> at
> org.apache.http.nio.protocol.HttpAsyncRequestExecutor.responseReceived(HttpAsyncRequestExecutor.java:321)
> at
> org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:255)
> at
> org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
> at
> org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
> at
> org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
> at
> org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
> at
> org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
> at
> org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
> at
> org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
> at
> org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
> at
> org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
> at java.lang.Thread.run(Thread.java:748)
> Locked ownable synchronizers:
> - None
> {code}
> {code:java}
> "http-nio-8080-exec-35" - Thread t@194
> java.lang.Thread.State: WAITING
> at sun.misc.Unsafe.park(Native Method)
> - waiting to lock <5d82b8b0> (a
> java.util.concurrent.locks.ReentrantLock$NonfairSync) owned by "I/O
> dispatcher 3" t@31
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
> at
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
> at
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
> at
> org.apache.http.nio.pool.AbstractNIOConnPool.release(AbstractNIOConnPool.java:344)
> at
> org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.releaseConnection(PoolingNHttpClientConnectionManager.java:391)
> - locked <5acdd0f1> (a org.apache.http.impl.nio.conn.CPoolProxy)
> at
> org.apache.http.impl.nio.client.AbstractClientExchangeHandler.discardConnection(AbstractClientExchangeHandler.java:276)
> at
> org.apache.http.impl.nio.client.AbstractClientExchangeHandler.cancel(AbstractClientExchangeHandler.java:447)
> at
> org.apache.http.impl.nio.client.FutureWrapper.cancel(FutureWrapper.java:51)
> Locked ownable synchronizers:
> - locked <3adeb158> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> {code}
> Summary of the stack traces above:
> 1. Code followed till from callbacks till
> https://github.com/ok2c/httpasyncclient/blob/466ee63a471775973b4fc0f58a368123d677d94b/httpasyncclient/src/main/java/org/apache/http/impl/nio/client/AbstractClientExchangeHandler.java#L324.
>
> 2. As from the first stack trace, it reached to connectionAllocated from
> https://github.com/apache/httpcomponents-core/blob/37b35f407570e2b9c261326d5960edaa3c6f42a1/httpcore-nio/src/main/java/org/apache/http/nio/pool/AbstractNIOConnPool.java#L563
> above which the ReentrantLock to the connection pool is acquired.
> 3. Now at the line
> {code:java}
> this.connmgr.isRouteComplete(managedConn)
> {code}
> inside the method isRouteComplete we will find that we are trying to take a
> lock over managedConn.
> 4. But just before this call if AbstractClientExchangeHandler.cancel (or
> similar methods to this) which is itself trying to acquire a lock on
> managedConn and does acquire over it, will try to call
> org.apache.http.nio.pool.AbstractNIOConnPool.release but it won't be able to
> complete the call because this would require the ReentrantLock to be free
> which is currently blocked.
> This leads to a race condition as one of the threads has a lock over
> ReentrantLock and requires managedConn while the thread from where cancel is
> called has a lock over managedConn and requires ReentrantLock.
> Since the ReentrantLock lock won't be released until the first thread is
> finished, the pool will be blocked for other threads / IO Dispatchers as well
> leading to all requests failing / timing out / stuck.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]