Dan, you are doing well to keep up -- we keep throwing a lot at you. I think we all find this issue so bizarre that we really want to figure it out. We are, even if it doesn't look like it, slowly eliminating some potential causes.
I've put a few comments inline, and then a larger update on some testing I was doing this morning to try to reproduce what you are observing in your live system. Questions / suggestions for Dan are marked with @Dan.

On Sat, Aug 16, 2025 at 2:31 PM Chuck Caldarale <n82...@gmail.com> wrote:
>
> On 2025 Aug 16, at 12:44, Daniel Schwartz <d...@danielgschwartz.com> wrote:
> >
> > From: Chuck Caldarale <n82...@gmail.com>
> > Sent: Saturday, August 16, 2025 1:33 PM
> > To: Tomcat Users List <users@tomcat.apache.org>
> > Subject: Re: [EXTERNAL EMAIL] How to access a REST service
> >
> >> Immediately after doing this, the message copied below started appearing repeatedly in the server.log and my website stopped working---the DB evidently became inaccessible. However, Glassfish was still running, and as soon as I set the maximum pool size back to 1000, these messages stopped coming and the website started working again.
> >
> > Indicative of an orphaned connection (resource leak).
> >
> > DGS: Why is this? What do those messages have to do with a resource leak?
>
> When the pool is at its maximum and all connections are in use, GlassFish will hold requests for a configurable amount of time (max-wait, defaults to 60 seconds), and only abort the request and log the situation after the caller has been suspended for that amount of time. Since your web service's response time is very quick, this indicates the only allocated connection has stayed in-use for over a minute - something leaked it.
>
> When the maximum pool size is set to your usual value, GlassFish eventually cleans things up because it periodically discards all existing connections and acquires new ones just in case they have become unusable from the DB point of view.

Just to add that there is another situation, though it is unlikely given the default HTTP thread count maximums. It could be that requests are arriving faster than they can be processed -- e.g. 100 requests every 10 ms would overflow the connection pool if the pool limit were 50 and your average response time were 10 ms. This was why I asked about your settings for HTTP thread limits.

I will check Glassfish 4.1 a bit later today, but with Glassfish 7.0.25 ("out of the box") the limit on the HTTP processing threads was 5, and without increasing that, it's not possible to have more than 5 typical requests being processed at once. That means that if each request never acquires more than 1 connection (which is the case in all the examples you have provided), the connection pool usage should never grow beyond 5.

So, unless you have increased the HTTP processing thread limits from the defaults to a value larger than the default connection pool maximum size, or the deployment you are using does so, you shouldn't have seen an issue originally if all the connections are being released to the pool. The "peak value", as tracked in Glassfish, should never exceed this thread count unless a connection is not being returned to the pool.

Hence the conclusion that there is a leak somewhere -- but where, I'm not sure at this point. This is what I think we need to focus on -- possibly by dialing back your settings, and also by looking at what's going on with the robot requests -- are they causing faults that aren't visible? (More on this at the end.)

@Dan: It would be a really good idea (as has been said, and acknowledged) to wrap all the connection / statement / result-set acquisitions with try-with-resources. Not only will it rule out leaks from those going forward, it will also simplify your code and make it easier to read. I would do this sooner rather than later so we can try to rule out leaks from those activities.
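Roughly the pattern I have in mind is sketched below -- the class name, method and SQL are just placeholders, not your actual code, and it assumes a container-managed DataSource from the GlassFish pool:

    // Sketch only -- names and SQL are illustrative, not Dan's code.
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.List;
    import javax.sql.DataSource;

    public class CountryDao {

        private final DataSource dataSource;  // injected / looked up via JNDI from the GlassFish pool

        public CountryDao(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        public List<String> getCountryList() throws SQLException {
            List<String> countries = new ArrayList<>();
            // Connection, PreparedStatement and ResultSet are all closed
            // automatically, in reverse order, even if an exception is thrown.
            try (Connection con = dataSource.getConnection();
                 PreparedStatement ps = con.prepareStatement(
                         "SELECT country_name FROM countries ORDER BY country_name");
                 ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    countries.add(rs.getString(1));
                }
            }
            return countries;
        }
    }

The key point is that everything is declared inside the try (...) header, so it all gets closed even when an exception or early return happens -- exactly the paths that robot requests with bad input are most likely to hit.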
> > > So this is very interesting, and points to a possible line of investigation. If the web crawlers are throwing garbage at your app, that could possibly be resulting in losing track of connections, statements, result sets, etc. You should carefully check your error paths and make sure that all DB-related resources are being properly disposed of when invalid input is received.
> >
> > DGS: I'm fairly sure that all connections and prepared statements are being closed, but will look again. This would be happening in the code fragment that tries to retrieve the list of holidays with calculated dates (not the countries). This is somewhat complex and could have problems.
>
> Your GetCountryList code from a previous message doesn't dispose of the ResultSet object; depending on the implementation details of the JDBC driver and pool, closing the result set may not be required, but it's always good practice to do so. Since the ResultSet object contains a reference to the Connection (and vice-versa), it may be interfering with reuse of the connection until a garbage collection happens.

Okay, so while I was in between things this morning I did a whole bunch of testing. With Glassfish 7.0.25, JDK 21, and the latest PostgreSQL JDBC driver, I was unable to reproduce the "leak" problem using code very similar to Dan's. I tried all of the following:

- enabling and disabling the JDBC object wrapping in Glassfish
- using the "singleton" data source, or retrieving one each request
- closing or not closing the Statement and the ResultSet
- closing or not closing the Connection

The only situation where I could reproduce a leak (and very quickly) was failing to close the Connection (i.e. removing it from the try-with-resources block). I could not get failing to close the ResultSet or the Statement to trigger a "leak" in the connection pool -- but that doesn't mean it can be ignored, only that it wasn't, by itself, causing the issue in my test setup.

I was not, however, running any "real" queries. I was just running "SELECT 1" each time and returning the integer to the caller, so no data was coming from the client that could be triggering other potential failures.

My next plan is to switch to the MySQL JDBC driver (as I noted, this is what Dan is using) and run the same tests to see if there is something different about that driver.

@Dan: Could you share the exact version of the JDBC driver you are using?

After that, I'll try rolling back to Glassfish 4.1.x and see if that makes a difference.

@Dan: Is there a minor version, or is it 4.1.latest?

@Dan: It would also be a good idea to "verify" the incoming parameters for valid values without using a DB query (if that is possible), so the robots can't trigger DB connections at all. Ultimately this is a risk of a DoS-type attack. You can also employ rate limiting to help avoid such problems. (A rough sketch of what I mean is further down.)

@Dan: I think we also need to change your "printing to Stdout" to printing to the Glassfish logger. The problem with printing to Stdout is that only the HTTP client making the request that triggers these logs will see them. This could be "masking" some errors where you aren't doing the testing yourself. (I could be wrong about where Stdout output goes, though -- so if you do see those messages in the logs, then ignore this item. I will validate that shortly myself in Glassfish 7.0.25.)
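For the logging point, here is roughly what I mean -- the class and messages are made up, but GlassFish routes java.util.logging output into server.log, so anything logged this way will be visible to you rather than only to the caller:

    // Sketch only -- class name and messages are illustrative.
    import java.util.logging.Level;
    import java.util.logging.Logger;

    public class HolidayService {

        // One logger per class; GlassFish writes java.util.logging output to server.log.
        private static final Logger LOG = Logger.getLogger(HolidayService.class.getName());

        public void lookupHolidays(String country) {
            // Instead of: System.out.println("looking up holidays for " + country);
            LOG.log(Level.INFO, "Looking up holidays for country {0}", country);

            try {
                // ... DB work ...
            } catch (Exception e) {
                // Failures logged this way end up in server.log even when the request
                // comes from a robot rather than from your own testing.
                LOG.log(Level.SEVERE, "Holiday lookup failed for " + country, e);
            }
        }
    }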
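And, going back to the parameter-validation suggestion above, a rough sketch of the kind of cheap check I mean -- purely illustrative, assuming the parameter is something like a two-letter country code; substitute whatever your legal values really are:

    // Sketch only -- substitute the real set of legal parameter values.
    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Locale;
    import java.util.Set;

    public final class RequestValidation {

        // Example values only; your app may use full country names instead of codes.
        private static final Set<String> KNOWN_COUNTRIES =
                new HashSet<>(Arrays.asList("US", "CA", "GB", "DE", "FR"));

        private RequestValidation() {
        }

        // Returns true only for input we are willing to spend a DB connection on.
        public static boolean isValidCountry(String country) {
            return country != null
                    && KNOWN_COUNTRIES.contains(country.trim().toUpperCase(Locale.ROOT));
        }
    }

In the resource method you would run this check first and return a 400 immediately for anything that fails, so a robot probing with junk never checks a connection out of the pool.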
@Dan: Have you been able to get a rough count of the request rate you are seeing on your server? That would give us some idea of whether an overload situation is even possible, or whether we can rule that out.

Robert

(who is clearly spending way too much time on Dan's issue, but is very puzzled and wants to get to the bottom of it, and as a result has now used Glassfish for the first time... :) )