Re: Pool blocking but not hitting max size?
On 4/13/15 2:39 PM, Dan wrote:

repeatedly seen this happen and would appreciate some input or advice. We're using version 1.6 of the commons pool, I don't believe we could upgrade without good reason.

From the line numbers in the stack trace, it looks like you are actually running pool 1.3, which is, well, ancient. You should verify the version and if it is as I suspect, you should definitely upgrade. Just have a look at the change log for the many, many issues that have been resolved since 1.3. That version should be deadlock-free, though it achieves that by extreme over-synchronization. Both borrow and return are fully synchronized (threads are waiting on the pool monitor in the dump below). Version 1.3 is the least performant version of commons pool.

My apologies, I just searched the .ear we deploy for what I thought was the commons pool and found commons-pool-1.6.jar, so I figured that was the active version. Now that I look again, there are 20 .jar files all containing the word commons in the file name, with varying version numbers. Like I said, I'm just a lowly server admin watching this app crash.

Additionally, the maxIdle setting limits the number of instances that can be idle in the pool to just 8. That means that when instances are returned and there are already 8 idle, the returning instances are destroyed. When more load arrives, you then have to wait for them to be created, which in v 1.3 causes all threads to wait on the factory. If you can afford to have more instances idle, you should increase that number.

In what sense do you say 'afford'? The servers have more than enough CPU/RAM available when this happens. Would you expect any meaningful overhead from bumping this number up to 50 or 100 during normal operations? We have test environments, but have struggled to create similar loads to production in them, so it may be difficult to fully test this change.
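With that many commons jars in one .ear, the file names alone will not tell you which copy the classloader actually picked. One way to check, as a minimal sketch: ask the loaded class where it came from. The class name WhichJar and its two helper methods are made up for this illustration; the JDK calls (getProtectionDomain, getCodeSource, getImplementationVersion) are standard. Whether the jar manifest carries an Implementation-Version entry is an assumption, so the version may come back as unknown.

```java
import java.security.CodeSource;

public class WhichJar {

    /** Returns the jar or directory a class was loaded from, or a note if the JVM does not expose it. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes that ship with the JDK typically have no code source.
        if (source == null || source.getLocation() == null) {
            return "(no code source; likely a JDK class)";
        }
        return source.getLocation().toString();
    }

    /** Returns the Implementation-Version from the jar manifest, or "unknown" if it is absent. */
    static String versionOf(Class<?> type) {
        Package pkg = type.getPackage();
        String version = (pkg == null) ? null : pkg.getImplementationVersion();
        return (version == null) ? "unknown" : version;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the class name is passed as an argument. For this thread it would be
        // org.apache.commons.pool.impl.GenericObjectPool, and the check has to run inside
        // the deployed application so that the same classloader is used.
        String name = (args.length > 0) ? args[0] : "java.lang.String";
        Class<?> type = Class.forName(name);
        System.out.println(name + " loaded from " + locationOf(type)
                + ", version " + versionOf(type));
    }
}
```

Run standalone it only reports on whatever is on its own classpath, so the two helper calls are probably most useful dropped into a diagnostic page or startup log of the application itself.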
You will likely get immediate relief by increasing maxIdle to several hundred or even the maxActive number; but you really should upgrade to a more recent version. See the pool web page for JDK requirements and version compatibility.

Thanks, I was very surprised when I saw how many versions behind our code looked from my naive glancing. I will suggest the maxIdle change and go from there. Just to clarify, the reason these threads are (likely) blocking is that we have 8 idle pool members, so every single request thereafter will cause a synchronized construct/destruct which blocks the entire pool until it completes, effectively limiting throughput to one request at a time until load goes down. I'd just like something meaningful to pass onto the developers.

Let me try to explain a little better so you can make the right decisions before and after upgrade. The maxActive setting (renamed maxTotal in v. 2) governs the total number of instances in circulation - checked out or idle waiting to be checked out - at a given time. The maxIdle setting limits the number of instances that can sit idle in the pool. Given your settings, a spike in demand could cause 2048 instances to be created and handed out to clients. On return, any arriving back when there were more than 8 instances idle would get destroyed.

In general maxIdle << maxTotal under variable load will result in a lot of object creation / destruction. If your environment can allow maxIdle == maxTotal (or at least not hugely less), you can avoid this churn. Sometimes people want to reallocate resources after load subsides so they set maxIdle < maxActive.

Now, pool 1.3 makes the pain associated with object churn much worse because the factory create and destroy methods are executed while holding *global* locks on the pool. That means all threads waiting on borrow or return have to wait for the factory methods to complete.

Morals of the story:

1. Upgrade to a modern version of pool (1.5.7+) where the factory methods don't block the pool
2. Consider setting maxIdle closer or equal to maxActive.

Phil

I appreciate the input!

- To unsubscribe, e-mail: user-unsubscr...@commons.apache.org
For additional commands, e-mail: user-h...@commons.apache.org
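To give the developers a feel for the churn described above, here is a small single-threaded model. It is a toy, not the commons-pool code: ToyPool, ChurnSketch and run are invented for this sketch, and the workload (10 bursts of 250 borrows, each followed by 250 returns) is an assumption loosely based on the numActive values of 200 to 300 seen in the heap dumps. It only counts how often the factory would be asked to create or destroy an instance.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class ChurnSketch {

    /** Toy pool that only counts factory calls. Not the commons-pool implementation. */
    static final class ToyPool {
        private final int maxActive;
        private final int maxIdle;
        private final Deque<Object> idle = new ArrayDeque<>();
        private int numActive;
        int created;
        int destroyed;

        ToyPool(int maxActive, int maxIdle) {
            this.maxActive = maxActive;
            this.maxIdle = maxIdle;
        }

        Object borrow() {
            if (idle.isEmpty()) {
                if (numActive >= maxActive) {
                    throw new IllegalStateException("pool exhausted");
                }
                created++;               // stands in for a factory create call
                idle.push(new Object());
            }
            numActive++;
            return idle.pop();
        }

        void giveBack(Object instance) {
            numActive--;
            if (idle.size() >= maxIdle) {
                destroyed++;             // stands in for a factory destroy call
            } else {
                idle.push(instance);
            }
        }
    }

    /** Runs `spikes` bursts of `burst` borrows, each followed by returning everything. */
    static int[] run(int maxActive, int maxIdle, int burst, int spikes) {
        ToyPool pool = new ToyPool(maxActive, maxIdle);
        for (int s = 0; s < spikes; s++) {
            List<Object> checkedOut = new ArrayList<>();
            for (int i = 0; i < burst; i++) {
                checkedOut.add(pool.borrow());
            }
            for (Object instance : checkedOut) {
                pool.giveBack(instance);
            }
        }
        return new int[] {pool.created, pool.destroyed};
    }

    public static void main(String[] args) {
        int[] small = run(4096, 8, 250, 10);
        int[] large = run(4096, 4096, 250, 10);
        System.out.println("maxIdle=8:    created=" + small[0] + " destroyed=" + small[1]);
        System.out.println("maxIdle=4096: created=" + large[0] + " destroyed=" + large[1]);
    }
}
```

Under these assumptions the maxIdle=8 run creates 2428 instances and destroys 2420, while the maxIdle=4096 run creates 250 and destroys none. In pool 1.3 each of those extra factory calls would also happen while holding the pool lock, which is where the blocking comes from.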
Re: Pool blocking but not hitting max size?
On 4/13/15 9:36 AM, Dan wrote:

Hello, I'm not a developer on the project I'm supporting, but I have repeatedly seen this happen and would appreciate some input or advice. We're using version 1.6 of the commons pool, I don't believe we could upgrade without good reason.

From the line numbers in the stack trace, it looks like you are actually running pool 1.3, which is, well, ancient. You should verify the version and if it is as I suspect, you should definitely upgrade. Just have a look at the change log for the many, many issues that have been resolved since 1.3. That version should be deadlock-free, though it achieves that by extreme over-synchronization. Both borrow and return are fully synchronized (threads are waiting on the pool monitor in the dump below). Version 1.3 is the least performant version of commons pool.

Additionally, the maxIdle setting limits the number of instances that can be idle in the pool to just 8. That means that when instances are returned and there are already 8 idle, the returning instances are destroyed. When more load arrives, you then have to wait for them to be created, which in v 1.3 causes all threads to wait on the factory. If you can afford to have more instances idle, you should increase that number.

You will likely get immediate relief by increasing maxIdle to several hundred or even the maxActive number; but you really should upgrade to a more recent version. See the pool web page for JDK requirements and version compatibility.

Phil

We're running WebLogic and periodically see thread count shoot up to the work manager maximum, and throughput grinds to a halt. All threads are blocked waiting on the commons pool, which I thought was strange as heap dumps show output like:

    Type     Name                             Value
    int      evictLastIndex                   -1
    ref      _evictor                         null
    int      _numActive                       206
    ref      _factory                         org.springframework.aop.target.CommonsPoolTargetSource @ 0x610...
    ref      _pool                            java.util.LinkedList @ 0x610...
    long     _softMinEvictableIdleTimeMillis  -1
    long     _minEvictableIdleTimeMillis      180
    int      _numTestsPerEvictionRun          3
    long     _timeBetweenEvictionRunsMillis   -1
    boolean  _testWhileIdle                   FALSE
    boolean  _testOnReturn                    FALSE
    boolean  _testOnBorrow                    FALSE
    byte     _whenExhaustedAction             1
    long     _maxWait                         -1
    int      _maxActive                       4096
    int      _minIdle                         10
    int      _maxIdle                         8
    boolean  closed                           FALSE

So from that, I would have expected numActive to be much closer to 4096 which is the maxActive we set. Am I looking at the wrong place? Why is this pool blocking? This has happened multiple times, and each time numActive is between 200 and 300. The pool is being used (I believe) to limit the number of concurrent security authorizations, so each borrow then pings an external service, and that external service does not appear overloaded at all.

I did notice that when I drilled down into the pool's linked list, it only ever has 8 members in it, I would have expected numActive members at least. I don't fully understand the pool though.

Here's a partial thread dump showing this behavior (addresses truncated, but they all blocked on the same pool object):

    [ACTIVE] ExecuteThread: '160' for queue: 'weblogic.kernel.Default (self-tuning)' daemon prio=10 tid=0x000... nid=0x4b1f waiting for monitor entry [0x000...]
       java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.commons.pool.impl.GenericObjectPool.returnObject(GenericObjectPool.java:916)
        - waiting to lock 0x000... (a org.apache.commons.pool.impl.GenericObjectPool)
        at org.springframework.aop.target.CommonsPoolTargetSource.releaseTarget(CommonsPoolTargetSource.java:252)
    --
    [ACTIVE] ExecuteThread: '159' for queue: 'weblogic.kernel.Default (self-tuning)' daemon prio=10 tid=0x... nid=0x4b1e waiting for monitor entry [0x000...]
       java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:781)
        - waiting to lock 0x000... (a org.apache.commons.pool.impl.GenericObjectPool)
        at org.springframework.aop.target.CommonsPoolTargetSource.getTarget(CommonsPoolTargetSource.java:244)
    --
    [ACTIVE] ExecuteThread: '158' for queue: 'weblogic.kernel.Default (self-tuning)' daemon prio=10 tid=0x000... nid=0x4b1d waiting for monitor entry [0x000...]
       java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.commons.pool.impl.GenericObjectPool.returnObject(GenericObjectPool.java:916)
        - waiting to lock 0x000... (a org.apache.commons.pool.impl.GenericObjectPool)
        at org.springframework.aop.target.CommonsPoolTargetSource.
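The dump shows borrowing and returning threads all queued on the same GenericObjectPool monitor. A rough sketch of why that hurts when the factory is slow: the code below is not the pool 1.3 source, just a stand-in where a slow "create" is run either while holding one shared monitor or outside it, and the peak number of threads inside the factory at the same time is recorded. GlobalLockSketch, slowCreate and peakConcurrentCreates are names invented for this sketch, and the 20 ms sleep is an arbitrary placeholder for a call to an external service.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class GlobalLockSketch {
    private final Object poolMonitor = new Object();
    private final AtomicInteger inFactory = new AtomicInteger();
    private final AtomicInteger peak = new AtomicInteger();

    /** Stand-in for a slow factory call; records how many threads are inside at once. */
    private void slowCreate(long millis) throws InterruptedException {
        int now = inFactory.incrementAndGet();
        peak.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(millis);
        } finally {
            inFactory.decrementAndGet();
        }
    }

    /** Returns the peak number of threads that were inside slowCreate at the same time. */
    static int peakConcurrentCreates(boolean holdPoolLock, int threads, long millis)
            throws InterruptedException {
        GlobalLockSketch sketch = new GlobalLockSketch();
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        for (int i = 0; i < threads; i++) {
            executor.submit(() -> {
                start.await();
                if (holdPoolLock) {
                    // Factory call made while holding the single pool-wide monitor.
                    synchronized (sketch.poolMonitor) {
                        sketch.slowCreate(millis);
                    }
                } else {
                    // Factory call made outside the monitor.
                    sketch.slowCreate(millis);
                }
                return null;
            });
        }
        start.countDown();
        executor.shutdown();
        executor.awaitTermination(1, TimeUnit.MINUTES);
        return sketch.peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("holding pool lock: peak = " + peakConcurrentCreates(true, 8, 20));
        System.out.println("outside pool lock: peak = " + peakConcurrentCreates(false, 8, 20));
    }
}
```

With the lock held the peak can never exceed 1, so eight 20 ms creates take at least 160 ms end to end. Without the lock the peak depends on scheduling, but it will usually be close to the thread count.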