Hi, we had been running Tomcat 9.0.17 for quite some time on our
high-load production servers, using the attached server.xml configuration.
Upon upgrading to 9.0.21 we started experiencing many random deadlocks.
We run performance advertising campaigns, and our conversion rates
dropped to below half of what they usually are, which was an obvious
consequence of our servers randomly locking up. The deadlocks were also
very easy to reproduce: a locked request seemed to "magically unlock" as
soon as we opened the same URL in a second tab/window. Doing this
unlocked both windows/tabs at once, immediately.
We found this was only happening on HTTPS, NOT on HTTP. Furthermore,
it only happened when the browser negotiated an upgrade to HTTP/2.
Once we found this, we temporarily removed the <UpgradeProtocol
className="org.apache.coyote.http2.Http2Protocol" /> configuration, and
all was back to normal.
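In server.xml terms, that temporary workaround amounts to something like
the following on the HTTPS connector. This is only a sketch abbreviated
from the attached configuration: several attributes and the SSLHostConfig
elements are omitted here and would stay as they are.

```xml
<!-- Sketch only: HTTPS connector with the HTTP/2 upgrade disabled. -->
<Connector protocol="org.apache.coyote.http11.Http11NioProtocol"
           executor="threadPool" port="443"
           scheme="https" secure="true" SSLEnabled="true"
           defaultSSLHostConfigName="*.mydomain.com">
  <!-- Commented out so that clients stay on HTTP/1.1 over TLS:
  <UpgradeProtocol className="org.apache.coyote.http2.Http2Protocol" />
  -->
  <!-- SSLHostConfig elements unchanged (omitted in this sketch) -->
</Connector>
```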
However, we need HTTP/2, so we continued to look for a proper solution.
Looking at the Tomcat changelog, we found there have been many changes
since 9.0.17 related to useAsyncIO and HTTP/2. One particular change for
9.0.22 caught our attention:
"Remove a source of potential deadlocks when using HTTP/2 when the
Connector is configured with useAsyncIO as true. (markt)"
We also found the following discussion thread, which describes issues
similar to what we were experiencing:
http://mail-archives.apache.org/mod_mbox/tomcat-dev/201906.mbox/%3c20190606204631.bab6c8a...@gitbox.apache.org%3e
So we upgraded to 9.0.22 thinking that the deadlock would be gone. But
alas, it was not. The deadlocks remained.
We found that 9.0.20 changed the default for useAsyncIO from "false" to
"true". So we changed useAsyncIO back to what it was when we were
running 9.0.17 (false), and all is back to normal on 9.0.22.
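Setting useAsyncIO explicitly on the HTTPS connector would look roughly
like this. It is a minimal sketch: useAsyncIO is the only attribute not
present in the attached server.xml, and several attributes and the
SSLHostConfig elements are omitted for brevity.

```xml
<!-- Sketch only: HTTP/2 stays enabled, asynchronous IO is switched off. -->
<Connector protocol="org.apache.coyote.http11.Http11NioProtocol"
           executor="threadPool" port="443"
           scheme="https" secure="true" SSLEnabled="true"
           useAsyncIO="false"
           defaultSSLHostConfigName="*.mydomain.com">
  <UpgradeProtocol className="org.apache.coyote.http2.Http2Protocol" />
  <!-- SSLHostConfig elements unchanged (omitted in this sketch) -->
</Connector>
```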
So the conclusion is: there are still deadlock bugs in the NIO connector
with useAsyncIO="true" when connections are upgraded to HTTP/2.
Besides fixing them, we believe that the useAsyncIO default should be
reverted to "false".
We could find no deadlocked Java threads at all using jconsole
(neither with the automatic "find deadlocks" functionality nor by
inspecting a thread dump). We took several thread dumps WHILE
the deadlock was clearly visible on screen (it was very easily
reproducible).
The deadlock is definitely there though and goes away as soon as we turn
off "useAsyncIO".
Since we could not find Java-level deadlocks, we believe the problem
probably lies in the interaction with native code.
We are using org.apache.catalina.core.AprLifecycleListener as well as
Tomcat Native 1.2.23 on Linux.
We could not find any pointers in the Tomcat Native changelog dealing
with similar issues.
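One way to test the native-code hypothesis might be to leave the APR
listener in place but force the HTTPS connector onto the JSSE TLS
implementation, so that OpenSSL via Tomcat Native is taken out of the
request path. The sketch below assumes the Connector's
sslImplementationName attribute is the right switch for this. It is an
untested suggestion, not part of the attached configuration, and
several attributes and the SSLHostConfig elements are omitted.

```xml
<!-- Sketch only: same connector, but with TLS handled by JSSE instead of
     the OpenSSL-backed implementation enabled through Tomcat Native. -->
<Connector protocol="org.apache.coyote.http11.Http11NioProtocol"
           executor="threadPool" port="443"
           scheme="https" secure="true" SSLEnabled="true"
           sslImplementationName="org.apache.tomcat.util.net.jsse.JSSEImplementation"
           defaultSSLHostConfigName="*.mydomain.com">
  <UpgradeProtocol className="org.apache.coyote.http2.Http2Protocol" />
  <!-- SSLHostConfig elements unchanged (omitted in this sketch) -->
</Connector>
```

If the lockups persisted with JSSE and useAsyncIO="true", that would
point away from Tomcat Native and toward the Java-side NIO/HTTP/2 code.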
Any ideas? Thanks,
--
Manuel Dominguez Sarmiento
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Server>
<Server port="8005" shutdown="SHUTDOWN">
  <Listener className="org.apache.catalina.startup.VersionLoggerListener" />
  <Listener className="org.apache.catalina.core.AprLifecycleListener" />
  <Listener className="org.apache.catalina.core.ThreadLocalLeakPreventionListener" />
  <Listener className="org.apache.catalina.core.JreMemoryLeakPreventionListener" AWTThreadProtection="true" />
  <Service name="Catalina">
    <Executor name="threadPool" namePrefix="tomcat-http-"
              maxThreads="1000" minSpareThreads="100" threadPriority="10" />
    <Connector protocol="org.apache.coyote.http11.Http11NioProtocol"
               executor="threadPool" port="80" redirectPort="443"
               acceptCount="100" enableLookups="false"
               connectionTimeout="30000" maxConnections="-1" />
    <Connector protocol="org.apache.coyote.http11.Http11NioProtocol"
               executor="threadPool" port="443"
               acceptCount="100" enableLookups="false"
               connectionTimeout="30000" maxConnections="-1"
               scheme="https" secure="true" SSLEnabled="true"
               defaultSSLHostConfigName="*.mydomain.com">
      <UpgradeProtocol className="org.apache.coyote.http2.Http2Protocol" />
      <SSLHostConfig hostName="*.mydomain.com" protocols="all" certificateVerification="none">
        <Certificate certificateKeyAlias="server"
                     certificateKeystoreFile="/path/to/keystore_star_mydomain_com.jks"
                     certificateKeystorePassword="changeit" />
      </SSLHostConfig>
      <SSLHostConfig hostName="ren.ac" protocols="all" certificateVerification="none">
        <Certificate certificateKeyAlias="server"
                     certificateKeystoreFile="/path/to/keystore_ren_ac.jks"
                     certificateKeystorePassword="changeit" />
      </SSLHostConfig>
    </Connector>
    <Engine name="Catalina" defaultHost="localhost"
            backgroundProcessorDelay="1" startStopThreads="0">
      <Realm className="org.apache.catalina.realm.MemoryRealm" />
      <Host name="localhost" appBase="webapps" unpackWARs="true"
            autoDeploy="false" deployOnStartup="true" deployIgnore="manager|host-manager"
            deployXML="true" copyXML="false">
        <Context docBase="manager" path="/manager" reloadable="false" privileged="true">
          <Valve className="org.apache.catalina.valves.RemoteAddrValve" allow="127\.0\.0\.1|192\.168\.\d+\.\d+|10\.\d+\.\d+\.\d+|172\.16\.\d+\.\d+|172\.17\.\d+\.\d+|172\.18\.\d+\.\d+|172\.19\.\d+\.\d+|172\.20\.\d+\.\d+|172\.21\.\d+\.\d+|172\.22\.\d+\.\d+|172\.23\.\d+\.\d+|172\.24\.\d+\.\d+|172\.25\.\d+\.\d+|172\.26\.\d+\.\d+|172\.27\.\d+\.\d+|172\.28\.\d+\.\d+|172\.29\.\d+\.\d+|172\.30\.\d+\.\d+|172\.31\.\d+\.\d+" />
        </Context>
        <Context docBase="host-manager" path="/host-manager" reloadable="false" privileged="true">
          <Valve className="org.apache.catalina.valves.RemoteAddrValve" allow="127\.0\.0\.1|192\.168\.\d+\.\d+|10\.\d+\.\d+\.\d+|172\.16\.\d+\.\d+|172\.17\.\d+\.\d+|172\.18\.\d+\.\d+|172\.19\.\d+\.\d+|172\.20\.\d+\.\d+|172\.21\.\d+\.\d+|172\.22\.\d+\.\d+|172\.23\.\d+\.\d+|172\.24\.\d+\.\d+|172\.25\.\d+\.\d+|172\.26\.\d+\.\d+|172\.27\.\d+\.\d+|172\.28\.\d+\.\d+|172\.29\.\d+\.\d+|172\.30\.\d+\.\d+|172\.31\.\d+\.\d+" />
        </Context>
      </Host>
    </Engine>
  </Service>
</Server>