Re: Pool blocking but not hitting max size?

2015-04-13 Thread Phil Steitz
On 4/13/15 2:39 PM, Dan wrote:
 Hello, I'm not a developer on the project I'm supporting, but I have
 repeatedly seen this happen and would appreciate some input or advice.
 We're using version 1.6 of the commons pool, I don't believe we could
 upgrade without good reason.
 From the line numbers in the stack trace, it looks like you are
 actually running pool 1.3, which is, well, ancient.  You should
 verify the version and if it is as I suspect, you should definitely
 upgrade.  Just have a look at the change log for the many, many
 issues that have been resolved since 1.3.  That version should be
 deadlock-free, though it achieves that by extreme
 over-synchronization.  Both borrow and return are fully synchronized
 (threads are waiting on the pool monitor in the dump below).
 Version 1.3 is the least performant version of commons pool.
 My apologies, I just searched the .ear we deploy for what I thought
 was the commons pool and found commons-pool-1.6.jar, so I figured that
 was the active version. Now that I look again, there are 20 .jar files
 all containing the word commons in the file name, with varying version
 numbers. Like I said, I'm just a lowly server admin watching this app
 crash.

 Additionally, the maxIdle setting limits the number of instances
 that can be idle in the pool to just 8.  That means that when
 instances are returned and there are already 8 idle, the returning
 instances are destroyed.  When more load arrives, you then have to
 wait for them to be created, which in v 1.3 causes all threads to
 wait on the factory.  If you can afford to have more instances idle,
 you should increase that number.
 In what sense do you say 'afford'? The servers have more than enough
 CPU/RAM available when this happens. Would you expect any meaningful
 overhead from bumping this number up to 50 or 100 during normal
 operations? We have test environments, but have struggled to create
 similar loads to production in them, so it may be difficult to fully
 test this change.

 You will likely get immediate relief by increasing maxIdle to
 several hundred or even the maxActive number; but you really should
 upgrade to a more recent version.  See the pool web page for JDK
 requirements and version compatibility.
 Thanks, I was very surprised when I saw how many versions behind our
 code looked from my naive glancing. I will suggest the maxIdle change
 and go from there.

 Just to clarify, the reason these threads are (likely) blocking is
 that we have 8 idle pool members, so every single request thereafter
 will cause a synchronized construct/destruct which blocks the entire
 pool until it completes, effectively limiting throughput to one
 request at a time until load goes down. I'd just like something
 meaningful to pass onto the developers.

Let me try to explain a little better so you can make the right
decisions before and after upgrade.

The maxActive setting (renamed maxTotal in v. 2) governs the total
number of instances in circulation - checked out or idle waiting to
be checked out - at a given time.  The maxIdle setting limits the
number of instances that can sit idle in the pool.  Given your
settings, a spike in demand could cause 2048 instances to be created
and handed out to clients.  On return, any arriving back when there
were more than 8 instances idle would get destroyed.  In general
maxIdle < maxTotal under variable load will result in a lot of
object creation / destruction.  If your environment can allow
maxIdle == maxTotal (or at least not hugely less), you can avoid
this churn.  Sometimes people want to reallocate resources after
load subsides so they set maxIdle < maxActive.
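To make the churn concrete, here is a small self-contained Java sketch. It does not use commons-pool itself; the pool, factory calls, and counters are all illustrative stand-ins. It simulates bursts of borrows and returns with an idle cap of 8 and counts how many creates and destroys result:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of maxIdle churn: every return beyond MAX_IDLE destroys the
// instance, so the next burst has to re-create it.  Names here are
// illustrative, not the commons-pool API.
public class ChurnDemo {
    static final int MAX_IDLE = 8;
    static int created = 0, destroyed = 0;
    static final Deque<Object> idle = new ArrayDeque<>();

    static Object borrow() {
        if (!idle.isEmpty()) return idle.pop();
        created++;                  // stands in for factory makeObject()
        return new Object();
    }

    static void giveBack(Object o) {
        if (idle.size() < MAX_IDLE) idle.push(o);
        else destroyed++;           // stands in for factory destroyObject()
    }

    static void simulate(int bursts, int burstSize) {
        for (int b = 0; b < bursts; b++) {
            Object[] out = new Object[burstSize];
            for (int i = 0; i < burstSize; i++) out[i] = borrow();
            for (int i = 0; i < burstSize; i++) giveBack(out[i]);
        }
    }

    public static void main(String[] args) {
        simulate(3, 100);
        // First burst creates 100; each later burst re-creates the 92
        // that were destroyed on return (only 8 survived as idle).
        System.out.println("created=" + created + " destroyed=" + destroyed);
    }
}
```

With three bursts of 100, only the first 100 creations are unavoidable; the other 184 creates and 276 destroys are pure churn that maxIdle == maxTotal would eliminate.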

Now, pool 1.3 makes the pain associated with object churn much worse
because the factory create and destroy methods are executed while
holding *global* locks on the pool.  That means all threads waiting
on borrow or return have to wait for the factory methods to complete.

Morals of the story:

1.  Upgrade to a modern version of pool (1.5.7+) where the factory
methods don't block the pool.
2.  Consider setting maxIdle closer to, or equal to, maxActive.
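For reference, a configuration along those lines against the commons-pool 1.x GenericObjectPool API might look like the sketch below (the anonymous factory is a placeholder for whatever the application actually pools; treat this as a configuration sketch, not a drop-in):

```java
import org.apache.commons.pool.BasePoolableObjectFactory;
import org.apache.commons.pool.impl.GenericObjectPool;

// Placeholder factory: the real application would create its own
// poolable objects here.
GenericObjectPool pool = new GenericObjectPool(new BasePoolableObjectFactory() {
    public Object makeObject() { return new Object(); }
});
pool.setMaxActive(4096);
pool.setMaxIdle(4096);   // maxIdle == maxActive: no destroy-on-return churn
pool.setMinIdle(10);
pool.setMaxWait(-1);     // block indefinitely when exhausted
```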

Phil

 I appreciate the input!

 -
 To unsubscribe, e-mail: user-unsubscr...@commons.apache.org
 For additional commands, e-mail: user-h...@commons.apache.org





Re: Pool blocking but not hitting max size?

2015-04-13 Thread Phil Steitz
On 4/13/15 9:36 AM, Dan wrote:
 Hello, I'm not a developer on the project I'm supporting, but I have
 repeatedly seen this happen and would appreciate some input or advice.
 We're using version 1.6 of the commons pool, I don't believe we could
 upgrade without good reason.
From the line numbers in the stack trace, it looks like you are
actually running pool 1.3, which is, well, ancient.  You should
verify the version and if it is as I suspect, you should definitely
upgrade.  Just have a look at the change log for the many, many
issues that have been resolved since 1.3.  That version should be
deadlock-free, though it achieves that by extreme
over-synchronization.  Both borrow and return are fully synchronized
(threads are waiting on the pool monitor in the dump below). 
Version 1.3 is the least performant version of commons pool.

Additionally, the maxIdle setting limits the number of instances
that can be idle in the pool to just 8.  That means that when
instances are returned and there are already 8 idle, the returning
instances are destroyed.  When more load arrives, you then have to
wait for them to be created, which in v 1.3 causes all threads to
wait on the factory.  If you can afford to have more instances idle,
you should increase that number.

You will likely get immediate relief by increasing maxIdle to
several hundred or even the maxActive number; but you really should
upgrade to a more recent version.  See the pool web page for JDK
requirements and version compatibility.
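If the pool is wired up through Spring's CommonsPoolTargetSource, as the heap dump below suggests, raising maxIdle might look like the following bean definition. This is illustrative only: the bean id and target bean name are placeholders, and the property names assume the setters Spring exposes on CommonsPoolTargetSource.

```xml
<!-- Illustrative only: bean names are placeholders.  maxSize corresponds
     to the pool's maxActive; maxIdle is the idle cap discussed above. -->
<bean id="authTargetSource"
      class="org.springframework.aop.target.CommonsPoolTargetSource">
  <property name="targetBeanName" value="authService"/>
  <property name="maxSize" value="4096"/>
  <property name="maxIdle" value="4096"/>
</bean>
```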

Phil

 We're running WebLogic and periodically see thread count shoot up to
 the work manager maximum, and throughput grinds to a halt. All threads
 are blocked waiting on the commons pool, which I thought was strange
 as heap dumps show output like:

 Type     Name                             Value
 int      evictLastIndex                   -1
 ref      _evictor                         null
 int      _numActive                       206
 ref      _factory                         org.springframework.aop.target.CommonsPoolTargetSource @ 0x610...
 ref      _pool                            java.util.LinkedList @ 0x610...
 long     _softMinEvictableIdleTimeMillis  -1
 long     _minEvictableIdleTimeMillis      180
 int      _numTestsPerEvictionRun          3
 long     _timeBetweenEvictionRunsMillis   -1
 boolean  _testWhileIdle                   FALSE
 boolean  _testOnReturn                    FALSE
 boolean  _testOnBorrow                    FALSE
 byte     _whenExhaustedAction             1
 long     _maxWait                         -1
 int      _maxActive                       4096
 int      _minIdle                         10
 int      _maxIdle                         8
 boolean  closed                           FALSE

 So from that, I would have expected numActive to be much closer to
 4096 which is the maxActive we set. Am I looking at the wrong place?
 Why is this pool blocking? This has happened multiple times, and each
 time numActive is between 200 and 300. The pool is being used (I believe)
 to limit the number of concurrent security authorizations, so each borrow
 then pings an external service, and that external service does not appear
 overloaded at all.

 I did notice that when I drilled down into the pool's linked list, it
 only ever has 8 members in it, I would have expected numActive members
 at least. I don't fully understand the pool though.

 Here's a partial thread dump showing this behavior (addresses truncated,
 but they all blocked on the same pool object):

 [ACTIVE] ExecuteThread: '160' for queue: 'weblogic.kernel.Default
 (self-tuning)' daemon prio=10 tid=0x000... nid=0x4b1f
 waiting for monitor entry [0x000...]
java.lang.Thread.State: BLOCKED (on object monitor)
 at org.apache.commons.pool.impl.GenericObjectPool.returnObject(
 GenericObjectPool.java:916)
 - waiting to lock 0x000... (a
 org.apache.commons.pool.impl.GenericObjectPool)
 at org.springframework.aop.target.CommonsPoolTargetSource.
 releaseTarget(CommonsPoolTargetSource.java:252)
 --
 [ACTIVE] ExecuteThread: '159' for queue: 'weblogic.kernel.Default
 (self-tuning)' daemon prio=10 tid=0x... nid=0x4b1e
 waiting for monitor entry [0x000...]
java.lang.Thread.State: BLOCKED (on object monitor)
 at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(
 GenericObjectPool.java:781)
 - waiting to lock 0x000... (a
 org.apache.commons.pool.impl.GenericObjectPool)
 at org.springframework.aop.target.CommonsPoolTargetSource.getTarget(
 CommonsPoolTargetSource.java:244)
 --
 [ACTIVE] ExecuteThread: '158' for queue: 'weblogic.kernel.Default
 (self-tuning)' daemon prio=10 tid=0x000... nid=0x4b1d
 waiting for monitor entry [0x000...]
java.lang.Thread.State: BLOCKED (on object monitor)
 at org.apache.commons.pool.impl.GenericObjectPool.returnObject(
 GenericObjectPool.java:916)
 - waiting to lock 0x000... (a
 org.apache.commons.pool.impl.GenericObjectPool)
 at org.springframework.aop.target.CommonsPoolTargetSource.