This message is long. Lots of details, a fair amount of history.
The primary Tomcat version we've got is 7.0.42. Specifically, it is the
Tomcat that's included with Liferay 6.2. This is why we haven't
attempted an upgrade even though the version we're running is five years
old -- we don't want to rock the boat too hard and risk everything not
working. Liferay is battle-tested with that version of Tomcat.
Background:
Every now and then, we find that our MySQL master has reached the
connection limit of 600 connections, and websites stop working, because
they are trying to open new connections and failing. When I look at the
DB server, it shows a HUGE number of connections (300 to 500 out of the
600 max) that have idle times between one hour and eight hours. The
MySQL server-side idle timeout is at the default of 28800 seconds --
eight hours. The idle connections are coming from the webservers.
There are five main webservers, two of which run exclusively Liferay
6.2, and three that are running older apps on Tomcat version that's
probably ANCIENT, as well as Liferay 6.1 with Tomcat 7.0.23. I can't
seem to figure out what version the really ancient one is. There are no
version numbers that I can see in the files in the program directory.
We also have a staging server, a dev server, and a few ancillary
systems. The staging and dev servers mainly use a dev/staging database
server, but do need to connect to the main DB server for some things.
When we run out of connections, each of these five webservers has nearly
100 (and sometimes MORE than 100) open connections to the database
server. And the majority of them have been idle for a LONG time.
A number of JNDI database resources are configured in conf/context.xml.
So we have connection pooling. But even though there are TONS of
connections already established from Tomcat, and completely idle from
the DB server's perspective, the application is STILL trying to open a
NEW connection when somebody visits a page. This is the precise issue
that connection pooling SHOULD be preventing.
I've discussed this with the commons-user mailing list, from a DBCP
perspective. I started there because there is some different code that
I wrote, which doesn't run in Tomcat, and uses DBCP2 natively. I'm a
lot more familiar with my code than the code running in Tomcat.
Based on my discussions with commons-user, I really think that the
reason that the DB pools are trying to create more connections even
though there's already a lot of them open is because the connections are
actually abandoned. I suspect they were never closed by the
application, so the pool still has them as active, and thinks it can't
use them. Now that the discussion directly involves configuring pools
using Tomcat's own DBCP implementation, they're not really able to help
me any further.
If we are dealing with abandoned connections as I suspect, then I need
to ask why abandoned connection removal isn't actually working, and how
to configure it so that it DOES work.
We have figured out how to log the number of idle and active connections
on the datasource, but it's going to take some time to get that logging
into the production servers, so I don't have definitive proof that the
connections are actually active.
Here's a redacted configuration that we have in place for one of the DB
connection pools on the production servers:
<Resource name="jdbc/REDACTED" auth="Container"
factory="org.apache.tomcat.dbcp.dbcp.BasicDataSourceFactory"
driverClassName="com.mysql.jdbc.Driver" type="javax.sql.DataSource"
maxActive="60" maxIdle="10" maxWait="30000" removeAbandoned="true"
removeAbandonedTimeout="30" username="REDACTED" password="REDACTED"
testOnBorrow="true" validationQuery="select 1"
url="jdbc:mysql://REDACTED.REDACTED.com:3306/REDACTED?autoReconnect=true&zeroDateTimeBehavior=round"
/>
This is the Tomcat documentation I'm using as a reference:
https://tomcat.apache.org/tomcat-7.0-doc/jdbc-pool.html
Looking at Tomcat 7.0 jdbc documentation, the first thing that I notice
is that the factory we have configured is not even mentioned in the
documentation.
One person (who replied privately and not to commons-user) said that
BOTH factory and type attribute values mentioned in the Tomcat 7.0 docs
are completely wrong! And another (who did reply via the list) said
that the factory mentioned in the documentation is "the alternative
pool" and that I should be using the one documented here instead. But
this page is REALLY lean on details of how to actually set up the resource:
https://tomcat.apache.org/tomcat-7.0-doc/jndi-datasource-examples-howto.html#Database_Connection_Pool_(DBCP)_Configurations
=====================================
So now, with the sordid past covered, I have three main questions:
1) I think this is the really burning question in my mind: Why is the
server opening NEW connections when there are dozens of them already
open? Does this mean that (as I suspect) all those connections are
abandoned? If so, why are they not getting cleaned up, when we have
removeAbandoned set to true and removeAbandonedTimeout set to a
RIDICULOUSLY low value of 30 seconds? (We MUST increase this value.
The plan is one hour.)
2, background) We tried to reconfigure all the pools on our staging
server to a new config. They currently have a config much like what I
shared above, but with smaller pool size numbers. Here's an example of
what we tried to put in place:
<Resource name="jdbc/REDACTED"
auth="Container"
factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"
driverClassName="com.mysql.jdbc.Driver"
type="javax.sql.DataSource"
maxActive="10"
maxIdle="2"
minIdle="1"
maxAge="7200000"
removeAbandoned="true"
removeAbandonedTimeout="3600"
logAbandoned="true"
username="REDACTED"
password="REDACTED"
testOnBorrow="true"
testOnConnect="true"
testOnReturn="true"
testWhileIdle="true"
validationQuery="SELECT 1"
validationQueryTimeout="5"
validationInterval="5000"
minEvictableIdleTimeMillis="900000"
allowMultiQueries="false"
url="jdbc:mysql://REDACTED.REDACTED.com:3306/REDACTED?autoReconnect=true&zeroDateTimeBehavior=round"
/>
When we tried to start the service with a config like this for all of
the pools (there are half a dozen of them), it failed to correctly
start, getting too many connection errors.
The size of the pools on this staging server is a LOT smaller than what
we have configured in production. The DB server it connects to is a lot
less beefy, and has a limit of 400 connections. I am told by the
developer that did the restart that when this service started up, it had
made 140 connections to the dev/staging DB server before failing. Which
shouldn't be possible, with half a dozen pool configs that all have a
maxActive set to 10.
2, question) Have I got something wrong in the new config? If I have,
and there are recommendations for what I should change, please tell me
WHY each change is recommended, and if possible, point me at relevant
documentation. If you can, tell me how I may have misread the Tomcat
documentation, which I used extensively in coming up with the new config.
Side note/question: One of the replies I got on the DBCP list referred
me to this Javadoc:
https://commons.apache.org/proper/commons-dbcp/api-1.4/org/apache/commons/dbcp/BasicDataSource.html#getRemoveAbandoned()
Which basically says that the 1.4 version of the standard DBCP library
won't even try to eliminate abandoned connections on a pool of size 60
until there are more than 57 connections open on the one pool. This
sounds like a REALLY bad idea. The Tomcat jdbc documentation lists
"abandonWhenPercentageFull" as an attribute, and the description there
hints that Tomcat's jdbc implementation doesn't suffer from the same
problem. But with the unusual factory we have configured, I don't know
if our configuration is even properly honored. Does the tomcat pool
have that same issue?
Because we have multiple pools defined, and some of those pools have a
maxActive number that's pretty large, I'm betting that we run out of
connections long before that one pool has opened 57 of them.
Thanks,
Shawn
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org