[jira] [Commented] (DBCP-595) Connection pool can be exhausted when connections are killed on the DB side
[ https://issues.apache.org/jira/browse/DBCP-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818815#comment-17818815 ] Dénes Bodó commented on DBCP-595:

[~psteitz] I executed the ReproOneThread class with these settings:
{code:java}
connectionProperties.append("MaxActive=").append(1).append(",");
connectionProperties.append("MaxTotal=").append(2).append(",");
//connectionProperties.append("MaxIdle=").append(2).append(",");
connectionProperties.append("fastFailValidation=").append("false").append(",");
connectionProperties.append("TestOnBorrow=").append("true").append(",");
connectionProperties.append("TestOnReturn=").append("true").append(",");
connectionProperties.append("TestWhileIdle=").append("true").append(",");
//connectionProperties.append("ValidationQuery=").append("SELECT 1").append(",");
connectionProperties.append("timeBetweenEvictionRunsMillis=").append(10_000).append(",");
connectionProperties.append("numTestsPerEvictionRun=").append(10).append(",");
{code}
The program got stuck when org.apache.commons.pool2.impl.GenericObjectPool#create failed to create a new connection because newCreateCount > localMaxTotal was true. See this screenshot:
!ReproOneThread-screenshot_when_create-newCreateCount_gt_localMaxTotal.png|width=576,height=386!
I took a jstack at the time of the screenshot, and another one after I let the debugger continue and the program was definitively stuck: [^ReproOneThread-jstack_when_create.txt] and [^ReproOneThread-jstack_when_stuck.txt]. Setting fastFailValidation to true had no effect; the program still got stuck. When I set MinIdle to 1, one thread was still running after the program got stuck and kept trying to create new connections, but the variables showed the same values as in the screenshot above:
[^ReproOneThread-jstack-minIdle.txt]
*Based on this I confirm that the program got stuck when numActive == maxActive.*

Regarding validation and closing the connection when an exception occurs, I played around with my repro code (ReproDBCP):
* Closed the connection obtained from DataSource::getConnection() when it was not null
** when an exception occurred
** in a finally block
* maxActive=1, maxTotal=2, validation turned off completely
* 4 threads

There was no sign of any deadlock during testing. This confirms your theory: when the issue happens in Oozie, the "client" does not close the connection after it notices the exception about the connection being closed.

My questions:
1. As OpenJPA is the client from Oozie's perspective, does this mean I have to check whether the OpenJPA code closes/releases the connection when it catches an exception from _org.apache.commons.dbcp2.PoolableConnection#handleException_? Is it the right approach for the client to drop/close the connection? Shouldn't the client be notified about *fatalSqlExceptionThrown* instead of receiving a plain SQLException?
{code:java}
@Override
protected void handleException(final SQLException e) throws SQLException {
    fatalSqlExceptionThrown |= isFatalException(e);
    super.handleException(e);
}
{code}
2. If DBCP is aware that this is a fatal SQLException, shouldn't it handle the situation by closing the connection automatically? I know, this is what I suggested in my patch; just curious.
Thank you.
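The client-side pattern tested above (closing the connection on exception paths and in a finally block) is what keeps the pool from starving. A minimal sketch of that pattern using try-with-resources; the counting proxy is a hypothetical stand-in for a pooled connection, not DBCP or Oozie code:

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.atomic.AtomicInteger;

public class ClientCloseExample {
    static final AtomicInteger closeCount = new AtomicInteger();

    // A stand-in Connection (dynamic proxy) that only counts close() calls,
    // so the pattern can be demonstrated without a real database or pool.
    static Connection fakeConnection() {
        return (Connection) Proxy.newProxyInstance(
                Connection.class.getClassLoader(),
                new Class<?>[] { Connection.class },
                (proxy, method, args) -> {
                    if (method.getName().equals("close")) {
                        closeCount.incrementAndGet();
                    }
                    return null;
                });
    }

    // try-with-resources calls close() on both the normal and the exception
    // path; for a pooled connection, close() is what returns it to the pool.
    static void doWork(boolean fail) throws SQLException {
        try (Connection conn = fakeConnection()) {
            if (fail) {
                throw new SQLException("connection killed on the DB side");
            }
            // ... normal JDBC work would go here ...
        }
    }

    public static void main(String[] args) {
        try {
            doWork(true);
        } catch (SQLException expected) {
            // the handle was still closed before the exception propagated
        }
        try {
            doWork(false);
        } catch (SQLException unexpected) {
            throw new AssertionError(unexpected);
        }
        System.out.println("closes=" + closeCount.get()); // prints closes=2
    }
}
```

If OpenJPA (or any client) skips this close on some exception path, the pool's active count never decreases, which matches the numActive == maxActive starvation confirmed above.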
[ https://issues.apache.org/jira/browse/DBCP-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817059#comment-17817059 ] Dénes Bodó commented on DBCP-595:

Thank you Phil for your input. I'll go back to testing your suggestions and will provide answers to your questions within one or two days.

> Connection pool can be exhausted when connections are killed on the DB side
> ---
>
> Key: DBCP-595
> URL: https://issues.apache.org/jira/browse/DBCP-595
> Project: Commons DBCP
> Issue Type: Bug
> Affects Versions: 2.11.0
> Reporter: Dénes Bodó
> Priority: Critical
> Labels: deadlock, robustness
> Attachments: ReproOneThread-jstack-minIdle.txt, ReproOneThread-jstack_when_create.txt, ReproOneThread-jstack_when_stuck.txt, ReproOneThread-screenshot_when_create-newCreateCount_gt_localMaxTotal.png
>
> Apache Oozie 5.2.1 uses OpenJPA 2.4.2, commons-dbcp 1.4 and commons-pool 1.5.4. These are ancient versions, I know.
> h1. Description
> The issue is that network issues or "maintenance work" on the DB side (especially PostgreSQL) can cause the DB connection to be closed, which results in an exhausted pool on the client side. Many threads are waiting at this point:
> {noformat}
> "pool-2-thread-4" #20 prio=5 os_prio=31 tid=0x7faf7903b800 nid=0x8603 waiting on condition [0x00030f3e7000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for <0x00066aca8e70> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
>         at org.apache.commons.pool2.impl.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:1324)
> {noformat}
> According to my observation this is because the JDBC driver connection does not get closed on the client side, nor does the wrapping DBCP connection _org.apache.commons.dbcp2.PoolableConnection_.
> h1. Repro
> (Un)Fortunately I can reproduce the issue using the latest and greatest commons-dbcp 2.11.0 and commons-pool 2.12.0 along with OpenJPA 3.2.2.
> I've created a Java application to reproduce the issue: [https://github.com/dionusos/pool_exhausted_repro] . See README.md for detailed repro steps.
> h1. Kind of solution?
> To be honest I am not really familiar with DBCP, but with this change I managed to make my application more robust:
> {code:java}
> diff --git a/src/main/java/org/apache/commons/dbcp2/PoolableConnection.java b/src/main/java/org/apache/commons/dbcp2/PoolableConnection.java
> index 440cb756..678550bf 100644
> --- a/src/main/java/org/apache/commons/dbcp2/PoolableConnection.java
> +++ b/src/main/java/org/apache/commons/dbcp2/PoolableConnection.java
> @@ -214,6 +214,10 @@ public class PoolableConnection extends DelegatingConnection impleme
>      @Override
>      protected void handleException(final SQLException e) throws SQLException {
>          fatalSqlExceptionThrown |= isFatalException(e);
> +        if (fatalSqlExceptionThrown && getDelegate() != null) {
> +            getDelegate().close();
> +            this.close();
> +        }
>          super.handleException(e);
>      }
> {code}
> What do you think about this approach? Is it a complete dead-end, or can we start working in this direction? Do you agree that the reported and reproduced issue is a real one and not just some kind of misconfiguration?
> I am lost at this point and need to move forward, so I am asking for guidance here.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[ https://issues.apache.org/jira/browse/DBCP-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816813#comment-17816813 ] Phil Steitz commented on DBCP-595:

I just realized there is another scenario that can lead to this, which is due to a limitation in Commons Pool. If you look at the currently disabled test case, testLivenessOnTransientFactoryFailure in TestGenericObjectPool, it illustrates the following scenario:
# Threads enter borrowObject when there is capacity to create, but due to a transient factory outage, the creates fail.
# The factory starts working again, but the threads remain blocked waiting on the pool.

If minIdle > 0 and pool maintenance is enabled (timeBetweenEvictionRunsMillis > 0), the pool will create instances when the evictor runs to serve the blocked threads.
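The minIdle/evictor mitigation described above corresponds to two pool settings. A hedged configuration sketch against the commons-pool2 2.12.x API (this is an illustration, not the repro project's code; it needs commons-pool2 on the classpath):

```java
import java.time.Duration;
import org.apache.commons.pool2.impl.GenericObjectPoolConfig;

public class EvictorConfigSketch {
    public static void main(String[] args) {
        GenericObjectPoolConfig<Object> config = new GenericObjectPoolConfig<>();
        // Keep at least one idle instance; when the evictor runs it will
        // create replacements, which can unblock threads parked in
        // borrowObject() after a transient factory outage.
        config.setMinIdle(1);
        // Enable pool maintenance so the evictor actually runs.
        config.setTimeBetweenEvictionRuns(Duration.ofSeconds(10));
    }
}
```

Without the eviction interval set, minIdle has no effect, since only the evictor thread enforces it.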
[ https://issues.apache.org/jira/browse/DBCP-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816685#comment-17816685 ] Phil Steitz commented on DBCP-595:

Thanks for the info on validation. That info and the partial thread dump above (can you maybe include a fuller trace?) make it look likely to me that the problem is that the application is not closing connections on some exception paths. Can you check to make sure that, even on exception paths, the code is calling close on the connection handles returned by DBCP? The fact that forcing close on fatal exceptions improves things supports the hypothesis that the client code is not closing connections on some exception paths. Also, can you confirm that when the starvation happens, the pool is reporting numActive == maxActive?

With validation turned on, DBCP will reliably close connections when validation fails. If any kind of exception occurs during the close, DBCP will destroy the connection handle and reduce its active count. You can see that in the validateObject method of PoolableConnectionFactory. If you set fastFailValidation to true, DBCP will fail validation without attempting a validation query after a fatal exception occurs, again destroying the handle and adjusting counters.
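For reference, the validation behavior described above is driven by a handful of BasicDataSource properties. A hedged sketch against the commons-dbcp2 2.11.x API (the JDBC URL is a placeholder; this needs commons-dbcp2 on the classpath):

```java
import org.apache.commons.dbcp2.BasicDataSource;

public class ValidationConfigSketch {
    public static void main(String[] args) {
        BasicDataSource ds = new BasicDataSource();
        ds.setUrl("jdbc:postgresql://localhost:5432/test"); // placeholder
        ds.setValidationQuery("SELECT 1"); // used by the Test* checks below
        ds.setTestOnBorrow(true);          // validate before handing out
        ds.setTestOnReturn(true);          // validate when returned
        ds.setTestWhileIdle(true);         // validate during evictor runs
        // After a fatal SQLException has been seen on a connection, fail
        // validation immediately instead of running the validation query,
        // so the broken connection is destroyed on its next borrow/return.
        ds.setFastFailValidation(true);
    }
}
```

Note that all of these only take effect when the connection actually passes back through the pool; a handle the client never closes is never validated.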
[ https://issues.apache.org/jira/browse/DBCP-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816520#comment-17816520 ] Dénes Bodó commented on DBCP-595:

[~psteitz] In the "Oozie-like repro steps" the following is configured:
{code:java}
connectionProperties.append("TestOnBorrow=").append("true").append(",");
connectionProperties.append("TestOnReturn=").append("true").append(",");
connectionProperties.append("TestWhileIdle=").append("true").append(",");
{code}
Even with these settings the program gets stuck, and threads wait indefinitely at org.apache.commons.pool2.impl.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:1324). The only difference between running with and without connection validation is how long we have to wait until the program gets stuck. My observation is that there is some kind of timing issue in the background: we experience an exhausted pool no matter what, and connection validation only delays the issue, it does not solve it.
[ https://issues.apache.org/jira/browse/DBCP-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816463#comment-17816463 ] Phil Steitz commented on DBCP-595:

I am still working through the repro code, but the symptoms don't make sense to me yet. The stack trace indicates pool exhaustion, which would mean clients are holding or abandoning connections. If connections go bad, then since all validation is turned off in the repro code, clients will keep getting bad connections and their operations will fail; but that does not explain the pool exhaustion, unless clients are not returning (closing) connections when they hit the exceptions. Have you tried turning on, say, testOnReturn? That will cause bad connections to be closed when they are returned to the pool.
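The reasoning above, that only un-returned handles can exhaust the pool, can be illustrated with a toy model: a pool's capacity behaves like a semaphore, where borrow is acquire and close/return is release. This is a hypothetical sketch, not commons-pool code:

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class ExhaustionSketch {
    public static void main(String[] args) throws InterruptedException {
        int maxTotal = 2;
        Semaphore capacity = new Semaphore(maxTotal);

        // Two "clients" hit a fatal SQLException and never call close(),
        // so their permits are never released back:
        capacity.acquire(); // borrow #1, leaked
        capacity.acquire(); // borrow #2, leaked

        // The next borrower now blocks forever (here we time out instead);
        // this is the LinkedBlockingDeque.takeFirst wait in the thread dumps.
        boolean gotOne = capacity.tryAcquire(100, TimeUnit.MILLISECONDS);
        System.out.println("third borrow succeeded: " + gotOne); // prints false
    }
}
```

A bad connection that is returned (even if it then fails validation and is destroyed) releases its capacity, which is why validation alone cannot cause this starvation.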
[ https://issues.apache.org/jira/browse/DBCP-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17815776#comment-17815776 ] Gary D. Gregory commented on DBCP-595:

Hello [~dionusos]
Thank you for your report. To get an idea of the impact of your change, you could create a PR on GitHub against the master branch. This will cause builds to run on Java LTS platforms. That's a first step. It's possible that we do not have sufficient code coverage in tests to detect problems in this area, though. IOW, the builds could be green and still cause issues for existing apps. This will need careful review.