Hello,

I have an app that needs to make concurrent HTTP requests to a web service
using persistent (keepalive) connections.  I'm using
ThreadSafeClientConnManager.  I ran into a performance bottleneck, and I
believe I've pinpointed the issue...

Affects Version(s): HttpCore 4.1.3, HttpClient 4.1.2

I construct my connection manager and client like this:

        connMgr = new ThreadSafeClientConnManager(
                SchemeRegistryFactory.createDefault(), -1, TimeUnit.MILLISECONDS);
        connMgr.setMaxTotal(400);
        connMgr.setDefaultMaxPerRoute(400);

        httpClient = new DefaultHttpClient(connMgr);

Note that this app only talks to a single URI on a single server, so there
is only one route.  That's why I set defaultMaxPerRoute == maxTotal.  I
think that's correct, but please let me know if it isn't!

Anyway, my app has a pool of 400 threads and generally performs quite
well.  But when all 400 threads need a connection concurrently, performance
suffers.  I've narrowed it down to lock contention in the connection
manager's pool.  Here is what a thread dump shows.

About half my threads are "stuck" (well, not stuck, but slow & waiting)
here, trying to release a connection back to the pool:

"catalina-exec-347" daemon prio=10 tid=0x00007f3a54065000 nid=0x6b73 waiting on condition [0x00007f3a29b9a000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000006147c8318> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:842)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1178)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
    at org.apache.http.impl.conn.tsccm.ConnPoolByRoute.freeEntry(ConnPoolByRoute.java:438)
    at org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager.releaseConnection(ThreadSafeClientConnManager.java:276)
    - locked <0x000000062048ebc8> (a org.apache.http.impl.conn.tsccm.BasicPooledConnAdapter)
    at org.apache.http.impl.conn.AbstractClientConnAdapter.releaseConnection(AbstractClientConnAdapter.java:308)
    - locked <0x000000062048ebc8> (a org.apache.http.impl.conn.tsccm.BasicPooledConnAdapter)
    at org.apache.http.conn.BasicManagedEntity.releaseManagedConnection(BasicManagedEntity.java:181)
    at org.apache.http.conn.BasicManagedEntity.eofDetected(BasicManagedEntity.java:142)
    at org.apache.http.conn.EofSensorInputStream.checkEOF(EofSensorInputStream.java:211)
    at org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:139)
    ...

The other half are "stuck" here, waiting on the same lock
(<0x00000006147c8318>) while trying to get a connection from the pool:

"catalina-exec-346" daemon prio=10 tid=0x00007f3a4c05d000 nid=0x6b72 waiting on condition [0x00007f3a29c9b000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000006147c8318> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:842)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1178)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
    at org.apache.http.impl.conn.tsccm.ConnPoolByRoute.getEntryBlocking(ConnPoolByRoute.java:337)
    at org.apache.http.impl.conn.tsccm.ConnPoolByRoute$1.getPoolEntry(ConnPoolByRoute.java:300)
    at org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager$1.getConnection(ThreadSafeClientConnManager.java:224)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:401)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:941)
    ...

It's not a deadlock per se.  It's just a bottleneck, and it is causing very
high latency in my app.  Below a certain threshold, i.e. when not all 400
threads need a connection concurrently, things are fine.  But when all 400
need a connection at once, that's when it gets painful.

I'm wondering if it might be feasible to switch to using non-blocking calls
for this, i.e. with ConcurrentHashMap and/or ConcurrentLinkedQueue, or
something of that nature?  I haven't dived into the source code yet, so
don't slap me too hard if that suggestion was way out of line.  :-)
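
To make the idea a bit more concrete, here is a rough sketch of the kind of
structure I have in mind.  This is NOT HttpClient code and I haven't tried
it against the real pool: LockFreePool, reserve(), pollIdle() and release()
are names I made up for illustration, and it ignores per-route limits,
connection expiry and waiter ordering entirely.  A Semaphore caps the number
of leased entries, and idle entries sit in a ConcurrentLinkedQueue, so
neither the acquire path nor the release path takes a shared ReentrantLock.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a pool without a single shared lock. */
final class LockFreePool<T> {

    // Idle entries; offer() and poll() are non-blocking.
    private final Queue<T> idle = new ConcurrentLinkedQueue<T>();

    // One permit per entry that may be leased at the same time.
    private final Semaphore permits;

    LockFreePool(int maxTotal) {
        this.permits = new Semaphore(maxTotal);
    }

    /**
     * Claims one of the maxTotal slots, waiting up to the given timeout.
     * Returns false on timeout or if the calling thread is interrupted.
     */
    boolean reserve(long timeout, TimeUnit unit) {
        try {
            return permits.tryAcquire(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return false;
        }
    }

    /**
     * Call after a successful reserve().  Returns an idle entry, or null
     * if none is available and the caller should open a new connection.
     */
    T pollIdle() {
        return idle.poll();
    }

    /**
     * Gives the slot back.  Pass the entry to keep it for reuse, or null
     * if the connection was closed and should not be pooled.
     */
    void release(T entry) {
        if (entry != null) {
            idle.offer(entry); // make it visible before freeing the slot
        }
        permits.release();
    }
}
```

A caller would do reserve(), then pollIdle() and open a new connection if it
got null, and finally release() when the response has been consumed.  The
semaphore still makes threads wait once all slots are taken, but threads that
are only returning a connection never have to queue behind threads that are
asking for one.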

Do you have any suggestions, in terms of ways I might be able to work
around this bottleneck otherwise?

Thanks!

Dan Checkoway
