How about something like a BlockingQueue for each route, with a per-route limit?
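Very rough, untested sketch of what I mean -- this is not HttpClient code, and RouteSlotPool/Slot are just names I made up to show the shape of it. Each route gets its own bounded BlockingQueue, pre-filled with one "slot" per allowed connection, so leasing and releasing for a route only ever touches that route's queue; a shared Semaphore still enforces maxTotal:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class RouteSlotPool<R, C> {

    /** One slot per allowed connection on a route; conn stays null until first use. */
    static final class Slot<C> {
        volatile C conn;
    }

    private final ConcurrentMap<R, BlockingQueue<Slot<C>>> perRoute =
            new ConcurrentHashMap<R, BlockingQueue<Slot<C>>>();
    private final Semaphore total;        // enforces maxTotal across all routes
    private final int maxPerRoute;

    RouteSlotPool(int maxTotal, int maxPerRoute) {
        this.total = new Semaphore(maxTotal);
        this.maxPerRoute = maxPerRoute;
    }

    private BlockingQueue<Slot<C>> queueFor(R route) {
        BlockingQueue<Slot<C>> q = perRoute.get(route);
        if (q == null) {
            BlockingQueue<Slot<C>> fresh =
                    new LinkedBlockingQueue<Slot<C>>(maxPerRoute);
            for (int i = 0; i < maxPerRoute; i++) {
                fresh.add(new Slot<C>());          // pre-fill with empty slots
            }
            BlockingQueue<Slot<C>> prev = perRoute.putIfAbsent(route, fresh);
            q = (prev != null) ? prev : fresh;
        }
        return q;
    }

    /** Blocks until both the total and the per-route limits allow a lease. */
    Slot<C> lease(R route, long timeout, TimeUnit unit) throws InterruptedException {
        if (!total.tryAcquire(timeout, unit)) {
            return null;                           // maxTotal reached
        }
        Slot<C> slot = queueFor(route).poll(timeout, unit);
        if (slot == null) {
            total.release();                       // per-route limit reached, back out
        }
        return slot;                               // slot.conn == null -> open a new connection
    }

    void release(R route, Slot<C> slot) {
        queueFor(route).add(slot);                 // never blocks: capacity == maxPerRoute
        total.release();
    }
}

For Dan's single-route case this mainly replaces the pool-wide ReentrantLock with the queue's own internal locking, which should be noticeably cheaper; for multi-route workloads the queues are fully independent of each other.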
On 7 January 2012 00:20, sebb <seb...@gmail.com> wrote:
> On 6 January 2012 22:07, Ken Krugler <kkrugler_li...@transpac.com> wrote:
>>
>> On Jan 6, 2012, at 1:01pm, Oleg Kalnichevski wrote:
>>
>>> On Fri, 2012-01-06 at 11:06 -0500, Dan Checkoway wrote:
>>>> Hello,
>>>>
>>>> I have an app that needs to make concurrent HTTP requests to a web service
>>>> using persistent (keepalive) connections. I'm using
>>>> ThreadSafeClientConnManager. I ran into a performance bottleneck, and I
>>>> believe I've pinpointed the issue...
>>>>
>>>> Affects Version(s): HttpCore 4.1.3, HttpClient 4.1.2
>>>>
>>>> I construct my connection manager and client like this:
>>>>
>>>> connMgr = new ThreadSafeClientConnManager(SchemeRegistryFactory.createDefault(), -1, TimeUnit.MILLISECONDS);
>>>> connMgr.setMaxTotal(400);
>>>> connMgr.setDefaultMaxPerRoute(400);
>>>>
>>>> httpClient = new DefaultHttpClient(connMgr);
>>>>
>>>> Note that this app only talks to a single URI on a single server -- thus
>>>> defaultMaxPerRoute == maxTotal, which I think is correct...please let me
>>>> know if that's bad!
>>>>
>>>> Anyway, my app has a pool of 400 threads and generally performs quite
>>>> well. But when all 400 threads need a connection concurrently, performance
>>>> suffers. I've narrowed it down to contention caused by blocking calls in
>>>> the connection manager. For example...a thread dump shows...
>>>>
>>>> About half my threads are "stuck" (well, not stuck, but slow & waiting)
>>>> here:
>>>>
>>>> "catalina-exec-347" daemon prio=10 tid=0x00007f3a54065000 nid=0x6b73 waiting on condition [0x00007f3a29b9a000]
>>>>    java.lang.Thread.State: WAITING (parking)
>>>>      at sun.misc.Unsafe.park(Native Method)
>>>>      - parking to wait for <0x00000006147c8318> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
>>>>      at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>>>>      at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
>>>>      at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:842)
>>>>      at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1178)
>>>>      at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
>>>>      at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
>>>>      at org.apache.http.impl.conn.tsccm.ConnPoolByRoute.freeEntry(ConnPoolByRoute.java:438)
>>>>      at org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager.releaseConnection(ThreadSafeClientConnManager.java:276)
>>>>      - locked <0x000000062048ebc8> (a org.apache.http.impl.conn.tsccm.BasicPooledConnAdapter)
>>>>      at org.apache.http.impl.conn.AbstractClientConnAdapter.releaseConnection(AbstractClientConnAdapter.java:308)
>>>>      - locked <0x000000062048ebc8> (a org.apache.http.impl.conn.tsccm.BasicPooledConnAdapter)
>>>>      at org.apache.http.conn.BasicManagedEntity.releaseManagedConnection(BasicManagedEntity.java:181)
>>>>      at org.apache.http.conn.BasicManagedEntity.eofDetected(BasicManagedEntity.java:142)
>>>>      at org.apache.http.conn.EofSensorInputStream.checkEOF(EofSensorInputStream.java:211)
>>>>      at org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:139)
>>>>      ...
>>>>
>>>> While the other half are "stuck" here:
>>>>
>>>> "catalina-exec-346" daemon prio=10 tid=0x00007f3a4c05d000 nid=0x6b72 waiting on condition [0x00007f3a29c9b000]
>>>>    java.lang.Thread.State: WAITING (parking)
>>>>      at sun.misc.Unsafe.park(Native Method)
>>>>      - parking to wait for <0x00000006147c8318> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
>>>>      at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>>>>      at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
>>>>      at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:842)
>>>>      at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1178)
>>>>      at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
>>>>      at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
>>>>      at org.apache.http.impl.conn.tsccm.ConnPoolByRoute.getEntryBlocking(ConnPoolByRoute.java:337)
>>>>      at org.apache.http.impl.conn.tsccm.ConnPoolByRoute$1.getPoolEntry(ConnPoolByRoute.java:300)
>>>>      at org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager$1.getConnection(ThreadSafeClientConnManager.java:224)
>>>>      at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:401)
>>>>      at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
>>>>      at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:941)
>>>>      ...
>>>>
>>>> It's not a deadlock per se. It's just a bottleneck, and it is causing very
>>>> high latency in my app. Below a certain threshold, i.e. when not all 400
>>>> threads need a connection concurrently, things are fine. But when all 400
>>>> need a connection at once, that's when it gets painful.
>>>>
>>>> I'm wondering if it might be feasible to switch to using non-blocking calls
>>>> for this, i.e. with ConcurrentHashMap and/or ConcurrentLinkedQueue, or
>>>> something of that nature? I haven't dived into the source code yet, so
>>>> don't slap me too hard if that suggestion was way out of line. :-)
>>>>
>>>> Do you have any suggestions, in terms of ways I might be able to work
>>>> around this bottleneck otherwise?
>>>>
>>>> Thanks!
>>>>
>>>> Dan Checkoway
>>>
>>> Hi Dan
>>>
>>> Yes, your observation is correct. The problem is that the connection
>>> pool is guarded by a global lock. Naturally, if you have 400 threads
>>> trying to obtain a connection at about the same time, all of them end up
>>> contending for one lock. The trouble is that I can't think of a
>>> different way to guarantee that the max limits (per route and total) are
>>> not exceeded. If anyone can think of a better algorithm, please do let
>>> me know. One possibility might be a more lenient implementation, less
>>> prone to lock contention, which under stress may occasionally allocate
>>> a few more connections than the max limits allow.
>>
>> I'd also run into a similar situation during web crawling, when I had 300+
>> threads sharing one connection pool.
>>
>> Would it work to go for finer-grained locking, by using atomic counters to
>> track & enforce limits on per-route/total connections?
>
> If the per-route limit is likely to be reached, it might help to have
> a lock per route.
> If the route limit has not been reached, only then grab the global lock.
>
> However this won't help unless the per-route limits are reached
> sufficiently often.
>
>> -- Ken
>>
>> --------------------------
>> Ken Krugler
>> http://www.scaleunlimited.com
>> custom big data solutions & training
>> Hadoop, Cascading, Mahout & Solr
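For what it's worth, here is how I'd picture the finer-grained approach Ken and sebb describe above -- again just a hand-wavy illustration with made-up names (AtomicLimits, tryLease), not a patch against ConnPoolByRoute. The counters are bumped optimistically and rolled back on failure, so the common case (limits not yet reached) never takes a lock at all; a thread that fails tryLease() still needs somewhere to wait for a release, which is where a per-route lock/condition, as sebb suggests, would come in:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

class AtomicLimits<R> {

    private final ConcurrentMap<R, AtomicInteger> perRoute =
            new ConcurrentHashMap<R, AtomicInteger>();
    private final AtomicInteger total = new AtomicInteger(0);
    private final int maxTotal;
    private final int maxPerRoute;

    AtomicLimits(int maxTotal, int maxPerRoute) {
        this.maxTotal = maxTotal;
        this.maxPerRoute = maxPerRoute;
    }

    private AtomicInteger counterFor(R route) {
        AtomicInteger c = perRoute.get(route);
        if (c == null) {
            AtomicInteger fresh = new AtomicInteger(0);
            AtomicInteger prev = perRoute.putIfAbsent(route, fresh);
            c = (prev != null) ? prev : fresh;
        }
        return c;
    }

    /** Lock-free fast path: true if a connection may be opened for this route. */
    boolean tryLease(R route) {
        AtomicInteger routeCount = counterFor(route);
        if (routeCount.incrementAndGet() > maxPerRoute) {
            routeCount.decrementAndGet();      // back out: per-route limit hit
            return false;
        }
        if (total.incrementAndGet() > maxTotal) {
            total.decrementAndGet();           // back out: total limit hit
            routeCount.decrementAndGet();
            return false;
        }
        return true;
    }

    void release(R route) {
        counterFor(route).decrementAndGet();
        total.decrementAndGet();
    }
}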