Re: [DBCP] Apparent deadlock related to getConnection

2015-04-12 Thread Shawn Heisey
On 4/11/2015 8:48 AM, Phil Steitz wrote:
 What is going on in that thread is that it is blocked waiting for a
 connection to become available in the pool.  There is no evidence of
 deadlock there - just pool exhaustion.
 
 The most likely culprit is application code that is failing to close
 connections that have been borrowed from the pool.  If you don't
 close the connections on all execution paths, they will leak as
 the pool has no way of knowing that your code is finished with them
 unless and until you close them.

I've done a review of all my database code for leaks in the past, and
everything looked good at that time.  I can repeat that check.  The
database portions of my code have not had any changes for quite some time.

Would I see evidence of a connection leak in the show full processlist
output at the MySQL server?  I don't see that happening.  There are only
a handful of open connections when I check there, even when the program
has been running for several days or weeks, running through its cycle
once a minute.  Once all parts of the program have been run at least
once, the connection count is stable.

Thanks,
Shawn


-
To unsubscribe, e-mail: user-unsubscr...@commons.apache.org
For additional commands, e-mail: user-h...@commons.apache.org



Re: [DBCP] Apparent deadlock related to getConnection

2015-04-12 Thread Kristian Rosenvold
I once had a library that allocated 2 connections to complete a
transaction. Every now and then we'd deadlock the pool if number of
concurrent transactions  pool size, which could happen in load
bursts. If you do a thread dump of your process and see all threads
are waiting for a connection this is most probably happening. The
non-solution solution is to increase pool size, the real solution was
never to allocate 2 connections to complete a transaction.

Kristian


2015-04-12 11:58 GMT+02:00 Shawn Heisey apa...@elyograg.org:
 On 4/11/2015 8:48 AM, Phil Steitz wrote:
 What is going on in that thread is that it is blocked waiting for a
 connection to become available in the pool.  There is no evidence of
 deadlock there - just pool exhaustion.

 The most likely culprit is application code that is failing to close
 connections that have been borrowed from the pool.  If you don't
 close the connections on all execution paths, they will leak as
 the pool has no way of knowing that your code is finished with them
 unless and until you close them.

 I've done a review of all my database code for leaks in the past, and
 everything looked good at that time.  I can repeat that check.  The
 database portions of my code have not had any changes for quite some time.

 Would I see evidence of a connection leak in the show full processlist
 output at the MySQL server?  I don't see that happening.  There are only
 a handful of open connections when I check there, even when the program
 has been running for several days or weeks, running through its cycle
 once a minute.  Once all parts of the program have been run at least
 once, the connection count is stable.

 Thanks,
 Shawn


 -
 To unsubscribe, e-mail: user-unsubscr...@commons.apache.org
 For additional commands, e-mail: user-h...@commons.apache.org


-
To unsubscribe, e-mail: user-unsubscr...@commons.apache.org
For additional commands, e-mail: user-h...@commons.apache.org



Re: [DBCP] Apparent deadlock related to getConnection

2015-04-12 Thread Shawn Heisey
On 4/12/2015 9:16 AM, Phil Steitz wrote:
 Can you post your full pool config and info on versions of DBCP and
 POOL?
 
 Do you call isClosed() before closing a connection (not recommended
 - that is really the issue in the ticket you commented)?
 
 To make sure there is not an actual deadlock somewhere, you should
 examine a full thread dump.  The thread you posted in the ticket is
 waiting on the pool, not holding any DBCP or POOL related locks; but
 it does hold this application lock:
 at com.REDACTED.idxbuild.solr.Chain.doReinsert(Chain.java:1332)
 - locked 0xdb0011c8 (a java.lang.Object)
 Make sure no other threads in process of doing things that will
 return connections to the pool aren't waiting on this lock.
 
 Given that you are using DBCP 2.1, if you have access to a JMX
 console it would also be good to look at what is happening in the
 pool when your application hangs.  In particular, numActive, numIdle
 and listAllObjects would be good to look at.  Looks like you are
 using PoolingDataSource directly, so you need to look at the
 underlying GenericObject's JMX properties.

My code does not contain the string isClosed at all.

I thought I had saved the full thread dump, but the stdout logfile was
wiped when I restarted the program, and I didn't think to save a copy.
I will be sure to save it next time there's a problem.

Here is the entire java file for my Database class, edited to redact
the company top-level domain:

http://apaste.info/6pX

The object that I am locking on (0xdb0011c8) is a global lock
for my whole program which I synchronize on for a handful of
normally-short-lived db-related operations just to be sure that they
aren't happening simultaneously, and to help make sure that the outcome
of certain operations is visible across threads.  I'm reasonably sure
that there are no operations in those synchronization blocks that are
expected to be slow, though I haven't verified that to 100% certainty.
One of the operations handled in that kind of synchronization lock is
the optimization of certain DB tables, which only happens if the size of
those tables is below a certain threshold -- part of making sure that
the operation completes quickly and doesn't deadlock the program.

Here are all the jar files in my program's lib directory.  My program
uses javamail, jrobin, dbcp, and solrj:

commons-dbcp2-2.1.jar
commons-io-2.4.jar
commons-pool2-2.3.jar
httpclient-4.4.1.jar
httpcore-4.4.1.jar
httpmime-4.4.1.jar
javax.mail.jar
jcl-over-slf4j-1.7.12.jar
jrobin-1.5.14.jar
jul-to-slf4j-1.7.12.jar
log4j-1.2.17.jar
mysql-connector-java-5.1.35-bin.jar
noggit-0.7.jar
slf4j-api-1.7.12.jar
slf4j-log4j12-1.7.12.jar
solr-solrj-5.0.0.jar
stax2-api-3.1.1.jar
woodstox-core-asl-4.2.0.jar
zookeeper-3.4.6.jar

I do not have remote JMX enabled for this application, and the server
where it runs does not have a gui.  I will add some code that spits out
the three requested pool stats every time the watchdog thread detects a
problem.  I cannot predict when there will be a problem ... this code
can run for weeks with nothing going wrong, or the deadlock might happen
only a few minutes after startup.

Thanks,
Shawn


-
To unsubscribe, e-mail: user-unsubscr...@commons.apache.org
For additional commands, e-mail: user-h...@commons.apache.org



Re: [DBCP] Apparent deadlock related to getConnection

2015-04-12 Thread Shawn Heisey
On 4/12/2015 5:18 AM, Kristian Rosenvold wrote:
 I once had a library that allocated 2 connections to complete a
 transaction. Every now and then we'd deadlock the pool if number of
 concurrent transactions  pool size, which could happen in load
 bursts. If you do a thread dump of your process and see all threads
 are waiting for a connection this is most probably happening. The
 non-solution solution is to increase pool size, the real solution was
 never to allocate 2 connections to complete a transaction.

It will take a code review to be sure, but I'm pretty sure I would
remember if there was anything odd like that in the code.

Thanks,
Shawn


-
To unsubscribe, e-mail: user-unsubscr...@commons.apache.org
For additional commands, e-mail: user-h...@commons.apache.org



Re: [DBCP] Apparent deadlock related to getConnection

2015-04-12 Thread Phil Steitz
On 4/12/15 2:58 AM, Shawn Heisey wrote:
 On 4/11/2015 8:48 AM, Phil Steitz wrote:
 What is going on in that thread is that it is blocked waiting for a
 connection to become available in the pool.  There is no evidence of
 deadlock there - just pool exhaustion.

 The most likely culprit is application code that is failing to close
 connections that have been borrowed from the pool.  If you don't
 close the connections on all execution paths, they will leak as
 the pool has no way of knowing that your code is finished with them
 unless and until you close them.
 I've done a review of all my database code for leaks in the past, and
 everything looked good at that time.  I can repeat that check.  The
 database portions of my code have not had any changes for quite some time.

Also try turning on removeAbandoned and logAbandoned.

 Would I see evidence of a connection leak in the show full processlist
 output at the MySQL server?  I don't see that happening.  There are only
 a handful of open connections when I check there, even when the program
 has been running for several days or weeks, running through its cycle
 once a minute.  Once all parts of the program have been run at least
 once, the connection count is stable.

Can you post your full pool config and info on versions of DBCP and
POOL?

Do you call isClosed() before closing a connection (not recommended
- that is really the issue in the ticket you commented)?

To make sure there is not an actual deadlock somewhere, you should
examine a full thread dump.  The thread you posted in the ticket is
waiting on the pool, not holding any DBCP or POOL related locks; but
it does hold this application lock:
at com.REDACTED.idxbuild.solr.Chain.doReinsert(Chain.java:1332)
- locked 0xdb0011c8 (a java.lang.Object)
Make sure no other threads in process of doing things that will
return connections to the pool aren't waiting on this lock.

Given that you are using DBCP 2.1, if you have access to a JMX
console it would also be good to look at what is happening in the
pool when your application hangs.  In particular, numActive, numIdle
and listAllObjects would be good to look at.  Looks like you are
using PoolingDataSource directly, so you need to look at the
underlying GenericObject's JMX properties.

Phil

 Thanks,
 Shawn


 -
 To unsubscribe, e-mail: user-unsubscr...@commons.apache.org
 For additional commands, e-mail: user-h...@commons.apache.org





-
To unsubscribe, e-mail: user-unsubscr...@commons.apache.org
For additional commands, e-mail: user-h...@commons.apache.org



Re: [DBCP] Apparent deadlock related to getConnection

2015-04-12 Thread Phil Steitz
On 4/12/15 10:31 AM, Shawn Heisey wrote:
 On 4/12/2015 9:16 AM, Phil Steitz wrote:
 Can you post your full pool config and info on versions of DBCP and
 POOL?

 Do you call isClosed() before closing a connection (not recommended
 - that is really the issue in the ticket you commented)?

 To make sure there is not an actual deadlock somewhere, you should
 examine a full thread dump.  The thread you posted in the ticket is
 waiting on the pool, not holding any DBCP or POOL related locks; but
 it does hold this application lock:
 at com.REDACTED.idxbuild.solr.Chain.doReinsert(Chain.java:1332)
 - locked 0xdb0011c8 (a java.lang.Object)
 Make sure no other threads in process of doing things that will
 return connections to the pool aren't waiting on this lock.

 Given that you are using DBCP 2.1, if you have access to a JMX
 console it would also be good to look at what is happening in the
 pool when your application hangs.  In particular, numActive, numIdle
 and listAllObjects would be good to look at.  Looks like you are
 using PoolingDataSource directly, so you need to look at the
 underlying GenericObject's JMX properties.
 My code does not contain the string isClosed at all.

 I thought I had saved the full thread dump, but the stdout logfile was
 wiped when I restarted the program, and I didn't think to save a copy.
 I will be sure to save it next time there's a problem.

Full thread dump will be helpful.

 Here is the entire java file for my Database class, edited to redact
 the company top-level domain:

 http://apaste.info/6pX

Several methods, getSingleObject, getResultSet, closeResultSet,
doUpdate can abandon connections on exception paths.  You should
insert finally blocks in these methods to ensure that connections
are closed. 

Have a look at the PoolingDataSource example referenced [1] on the
DBCP web page for a model to use with try-catch-finally.   Not
closing connections on exception paths will cause leaks if
exceptions are thrown.  If you are seeing BuildExceptions generated
by this code, each of them is leaking a connection.

 The object that I am locking on (0xdb0011c8) is a global lock
 for my whole program which I synchronize on for a handful of
 normally-short-lived db-related operations just to be sure that they
 aren't happening simultaneously, and to help make sure that the outcome
 of certain operations is visible across threads.  I'm reasonably sure
 that there are no operations in those synchronization blocks that are
 expected to be slow, though I haven't verified that to 100% certainty.

They don't have to be slow to deadlock, if you hit the right timing
and the following kind of scenario:

Thread 1 grab the lock and hold it while waiting to get a connection
(what the thread you showed in the ticket is doing)
Threads 2, ..., n try to get the lock before returning connections
they hold

Key is to get a full thread dump and see which threads are waiting
on / holding which locks.

Phil
[1] http://s.apache.org/0Nl

 One of the operations handled in that kind of synchronization lock is
 the optimization of certain DB tables, which only happens if the size of
 those tables is below a certain threshold -- part of making sure that
 the operation completes quickly and doesn't deadlock the program.

 Here are all the jar files in my program's lib directory.  My program
 uses javamail, jrobin, dbcp, and solrj:

 commons-dbcp2-2.1.jar
 commons-io-2.4.jar
 commons-pool2-2.3.jar
 httpclient-4.4.1.jar
 httpcore-4.4.1.jar
 httpmime-4.4.1.jar
 javax.mail.jar
 jcl-over-slf4j-1.7.12.jar
 jrobin-1.5.14.jar
 jul-to-slf4j-1.7.12.jar
 log4j-1.2.17.jar
 mysql-connector-java-5.1.35-bin.jar
 noggit-0.7.jar
 slf4j-api-1.7.12.jar
 slf4j-log4j12-1.7.12.jar
 solr-solrj-5.0.0.jar
 stax2-api-3.1.1.jar
 woodstox-core-asl-4.2.0.jar
 zookeeper-3.4.6.jar

 I do not have remote JMX enabled for this application, and the server
 where it runs does not have a gui.  I will add some code that spits out
 the three requested pool stats every time the watchdog thread detects a
 problem.  I cannot predict when there will be a problem ... this code
 can run for weeks with nothing going wrong, or the deadlock might happen
 only a few minutes after startup.


 Thanks,
 Shawn


 -
 To unsubscribe, e-mail: user-unsubscr...@commons.apache.org
 For additional commands, e-mail: user-h...@commons.apache.org





-
To unsubscribe, e-mail: user-unsubscr...@commons.apache.org
For additional commands, e-mail: user-h...@commons.apache.org



Re: [DBCP] Apparent deadlock related to getConnection

2015-04-12 Thread Shawn Heisey
On 4/12/2015 12:43 PM, Phil Steitz wrote:
 On 4/12/15 10:31 AM, Shawn Heisey wrote:
 http://apaste.info/6pX
 
 Several methods, getSingleObject, getResultSet, closeResultSet,
 doUpdate can abandon connections on exception paths.  You should
 insert finally blocks in these methods to ensure that connections
 are closed. 

I have modified the code to put close operations in finally blocks in
getSingleObject, doUpdate, and getColumnSet, and I have done the same
thing on all uses of the other two methods, with the close method called
in a finally block.

The updated version of the Database class is here:

http://apaste.info/6XU

I've engaged in some renaming.  The getResultSet method is now called
createConnectionAndResultSet, and the closeResultSet method is now
closeAllFromResultSet.

 They don't have to be slow to deadlock, if you hit the right timing
 and the following kind of scenario:
 
 Thread 1 grab the lock and hold it while waiting to get a connection
 (what the thread you showed in the ticket is doing)
 Threads 2, ..., n try to get the lock before returning connections
 they hold
 
 Key is to get a full thread dump and see which threads are waiting
 on / holding which locks.

If the finally changes I've made do not fix the problem, I will gather
a new full thread dump for analysis.  You've given me an idea of some
additional bad coding that I can look for, so thank you.  I've tried to
make all synchronized code use the database in a self-contained manner,
but I might have made mistakes.

Thanks,
Shawn


-
To unsubscribe, e-mail: user-unsubscr...@commons.apache.org
For additional commands, e-mail: user-h...@commons.apache.org