On 1/9/18 1:56 PM, Phil Steitz wrote:
> On 1/9/18 11:50 AM, Phil Steitz wrote:
>> On 1/8/18 4:23 PM, Shawn Heisey wrote:
>>> On 11/22/2017 5:00 PM, Phil Steitz wrote:
>>>> If the problem is the evictor closing a connection and having that
>>>> connection delivered to a client, the problem is almost certainly in
>>>> pool. The thread-safety of the pool in this regard is engineered in
>>>> DefaultPooledObject, which is the wrapper that pool manages and
>>>> delivers to DBCP. When the evictor visits a PooledObject (in
>>>> GenericObjectPool#evict) it tries to start the eviction test on the
>>>> object by calling its startEvictionTest method. This method is
>>>> synchronized on the DefaultPooledObject. Look at the code in that
>>>> method. It checks to make sure that the object is in fact idle in
>>>> the pool. The other half of the protection here is in
>>>> GenericObjectPool#borrowObject, which is what PoolingDataSource
>>>> calls to get a connection. That method tries to get a PooledObject
>>>> from the pool and before handing it out (or validating it), it calls
>>>> the PooledObject's allocate method. Look at the code for that in
>>>> DefaultPooledObject. That method (also synchronized on the
>>>> PooledObject) checks that the object is not under eviction and sets
>>>> its state to allocated. That is the core sync protection that
>>>> *should* make it impossible for the evictor to do anything to an
>>>> object that has been handed out to a client.
>>> I see the synchronization you're talking about here. It appears that
>>> all of the critical methods in DefaultPooledObject are synchronized (on
>>> the object).
>>>
>>> If you're absolutely certain that DefaultPooledObject is involved with
>>> all of the implementation my code is using, it all looks pretty complete
>>> to me.
>> Yes, the code you posted at the top of the thread uses a
>> PoolableConnectionFactory as the object factory for the pool.
>> You can see that PCF's makeObject returns a DefaultPooledObject, so that
>> much is certain.
>>> So I'm really curious as to why the connection is getting
>>> closed. I have seen the problem only minutes after restarting my
>>> program, so it seems unlikely that the server side is closing the
>>> connection, since the timeout for that is 8 hours.
>> I looked back at the initial stack trace and I noticed something
>> that I had not noticed before.
>>
>> This line
>>
>> org.apache.commons.dbcp2.DelegatingConnection.createStatement(DelegatingConnection.java:262)
>>
>> means that checkOpen() succeeded. That, combined with your
>> statement above that isClosed() returns true on a failed connection,
>> means that there might be concurrent access to the
>> DelegatingConnections happening. It looks like the sequence might
>> have been:
>>
>> thread 1: checkOpen - sees true
>> thread 2: close the DelegatingConnection (there is no sync to
>>           prevent this)
>> thread 1: createStatement - bang!
>> thread 1: isClosed() returns true
>>
>> DBCP is not really safe to use that way - i.e., really the intended
>> setup is that individual connection handles are not concurrently
>> accessed by multiple threads. Is it possible something like this is
>> going on? Note that what I am talking about here is two different
>> threads holding references to the same connection handle - i.e., no
>> trips back through the pool.
> I just noticed another thing in [pool] that might have something to
> do with this. It's probably best to investigate what I have in mind
> on the dev list. I will post a summary / ticket reference here if
> it turns out this is a bug.
Sorry for the noise. Bug idea evaporated when I dug into it.

Phil
>
> Phil
>> Phil
>>> I did add the code a while back to test on create, borrow, return, and
>>> while idle, but it turns out that I hadn't actually pulled it down to
>>> the test server and recompiled. That is now done, so we'll see if that
>>> makes any difference.
>>>
>>> If testing the connection on pool actions does make a difference, then
>>> what is your speculation about what was happening when I ran into the
>>> closed connection only minutes after restart, and would it be worthy of
>>> an issue in Jira? The only theory I had was a race condition between
>>> eviction and borrowing, but unless there's something amiss in how all
>>> the object inheritance works out, it looks like that's probably not it.
>>> Some kind of issue with the TCP stack in Linux (either on the machines
>>> running my code or the MySQL server) is the only other idea I can think
>>> of. Or maybe a hardware/firmware issue, since it's likely that at least
>>> one of the NICs involved is doing TCP offload. I think that virtually
>>> every NIC in our infrastructure has that feature and that Linux enables it.
>>> Thanks,
>>> Shawn
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscr...@commons.apache.org
>>> For additional commands, e-mail: user-h...@commons.apache.org
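
For readers following the thread, the eviction/borrow guard described in the quoted 11/22/2017 message can be sketched roughly as below. This is a minimal illustration, not the Commons Pool source: the class name SketchPooledObject, its three-value State enum, and the no-argument endEvictionTest are all hypothetical simplifications. The only points taken from the thread are that startEvictionTest and allocate are both synchronized on the wrapper, that the first succeeds only for an idle object, and that the second refuses an object under eviction.

```java
/**
 * Hypothetical, simplified stand-in for a pooled-object wrapper.
 * NOT the real DefaultPooledObject; names and states are assumptions
 * made for illustration only.
 */
class SketchPooledObject {
    enum State { IDLE, ALLOCATED, EVICTION }

    private State state = State.IDLE;

    /** Evictor side: the test may start only if the object is idle in the pool. */
    synchronized boolean startEvictionTest() {
        if (state == State.IDLE) {
            state = State.EVICTION;
            return true;
        }
        return false; // already borrowed (or already under test): leave it alone
    }

    /** Evictor side: release the object back to idle once the test is over. */
    synchronized void endEvictionTest() {
        if (state == State.EVICTION) {
            state = State.IDLE;
        }
    }

    /** Borrower side: hand the object out only if it is idle, i.e. not under eviction. */
    synchronized boolean allocate() {
        if (state == State.IDLE) {
            state = State.ALLOCATED;
            return true;
        }
        return false; // under eviction or already allocated: caller must try another object
    }

    synchronized State getState() {
        return state;
    }
}
```

Because both transitions run under the same monitor and both start from the idle state, at most one of the evictor and the borrower can win for a given object. Note that this guard covers only trips through the pool; it does nothing about the other scenario raised in the thread, where two threads share one connection handle after it has been borrowed.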