Hi, We encountered an issue where Tomcat suddenly and randomly stops responding to HTTP requests.
Tomcat startup logs: INFO: Server version: Mar 15, 2019 7:59:35 AM org.apache.catalina.startup.VersionLoggerListener log INFO: Server built: unknown Mar 15, 2019 7:59:35 AM org.apache.catalina.startup.VersionLoggerListener log INFO: Server number: 8.5.x Mar 15, 2019 7:59:35 AM org.apache.catalina.startup.VersionLoggerListener log INFO: OS Name: Linux Mar 15, 2019 7:59:35 AM org.apache.catalina.startup.VersionLoggerListener log INFO: OS Version: 3.10.0-693.17.1.el7.x86_64 Mar 15, 2019 7:59:35 AM org.apache.catalina.startup.VersionLoggerListener log INFO: Architecture: amd64 Mar 15, 2019 7:59:35 AM org.apache.catalina.startup.VersionLoggerListener log INFO: Java Home: /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre Mar 15, 2019 7:59:35 AM org.apache.catalina.startup.VersionLoggerListener log INFO: JVM Version: 1.8.0_161-b14 Tomcat is configured to use APRConnector Mar 15, 2019 7:59:35 AM org.apache.catalina.core.AprLifecycleListener lifecycleEvent INFO: Loaded APR based Apache Tomcat Native library [1.2.14] using APR version [1.4.8]. Mar 15, 2019 7:59:35 AM org.apache.catalina.core.AprLifecycleListener lifecycleEvent INFO: APR capabilities: IPv6 [true], sendfile [true], accept filters [false], random [true]. Mar 15, 2019 7:59:35 AM org.apache.catalina.core.AprLifecycleListener lifecycleEvent INFO: APR/OpenSSL configuration: useAprConnector [true], useOpenSSL [true] Mar 15, 2019 7:59:35 AM org.apache.catalina.core.AprLifecycleListener initializeSSL INFO: OpenSSL successfully initialized [OpenSSL 1.0.2k-fips 26 Jan 2017] Mar 15, 2019 7:59:35 AM org.apache.coyote.AbstractProtocol init INFO: Initializing ProtocolHandler ["http-apr-8081"] Mar 15, 2019 7:59:35 AM org.apache.coyote.AbstractProtocol init INFO: Initializing ProtocolHandler ["https-openssl-apr-0:0:0:0:0:0:0:0-8443"] After some troubleshooting, we found that when this happens, Java worker threads are idle and waiting for tasks, while sockets are in ESTABLISHED and CLOSE_WAIT state. All the TCP sockets at the OS level indicate they have bytes in the TCP buffer that is not been read by the application. so looks like something went wrong, and the APR connector library is no more reading from the socket, or some issue between java socket Acceptor call and APR library. Has anyone encountered any problem like this or have any suggestion on what should be checked that may help to isolate the problem. Below the lsof output with TCP Q data and thread dump for threads for port 8443. Thanks, -- Indrajeet java 25399 tomcatuser 55u IPv6 485303682 0t0 TCP *:8443 (LISTEN QR=0 QS=0) java 25399 tomcatuser 60u IPv6 542452576 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.212:62472 (CLOSE_WAIT QR=880 QS=0) java 25399 tomcatuser 63u IPv6 542319858 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.110:56315 (CLOSE_WAIT QR=704 QS=0) java 25399 tomcatuser 64u IPv6 542365742 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.227:65114 (CLOSE_WAIT QR=704 QS=0) java 25399 tomcatuser 73u IPv6 542271640 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.234:53983 (CLOSE_WAIT QR=848 QS=0) java 25399 tomcatuser 78u IPv6 542313271 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.212:61847 (CLOSE_WAIT QR=832 QS=0) java 25399 tomcatuser 80u IPv6 542312628 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.110:56277 (CLOSE_WAIT QR=704 QS=0) java 25399 tomcatuser 87u IPv6 542322416 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.110:56350 (CLOSE_WAIT QR=704 QS=0) java 25399 tomcatuser 92u IPv6 542301688 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.110:56251 (CLOSE_WAIT QR=784 QS=0) java 25399 tomcatuser 94u IPv6 542344204 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.212:61954 (CLOSE_WAIT QR=816 QS=0) java 25399 tomcatuser 95u IPv6 542301753 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.227:64780 (CLOSE_WAIT QR=784 QS=0) java 25399 tomcatuser 98u IPv6 542313111 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.110:56292 (CLOSE_WAIT QR=704 QS=0) java 25399 tomcatuser 109u IPv6 542312567 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.130:51038 (ESTABLISHED QR=447 QS=0) java 25399 tomcatuser 111u IPv6 542301754 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.129:54443 (CLOSE_WAIT QR=800 QS=0) java 25399 tomcatuser 112u IPv6 542312764 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.212:61834 (CLOSE_WAIT QR=816 QS=0) java 25399 tomcatuser 123u IPv6 542290291 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.110:56221 (CLOSE_WAIT QR=800 QS=0) java 25399 tomcatuser 125u IPv6 542355374 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.110:56421 (CLOSE_WAIT QR=704 QS=0) java 25399 tomcatuser 127u IPv6 542320066 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.212:61878 (CLOSE_WAIT QR=704 QS=0) java 25399 tomcatuser 135u IPv6 542391483 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.110:56532 (CLOSE_WAIT QR=816 QS=0) java 25399 tomcatuser 144u IPv6 542333802 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.163:63742 (CLOSE_WAIT QR=768 QS=0) java 25399 tomcatuser 158u IPv6 542333223 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.110:56352 (CLOSE_WAIT QR=720 QS=0) java 25399 tomcatuser 162u IPv6 542313272 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.163:63615 (CLOSE_WAIT QR=752 QS=0) java 25399 tomcatuser 5196u IPv6 542451367 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.234:55358 (ESTABLISHED QR=570 QS=0) java 25399 tomcatuser 5263u IPv6 542557088 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.130:55096 (ESTABLISHED QR=650 QS=0) java 25399 tomcatuser 5397u IPv6 542464143 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.234:55370 (ESTABLISHED QR=586 QS=0) java 25399 tomcatuser 5452u IPv6 543082702 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.55:50754 (ESTABLISHED QR=618 QS=0) java 25399 tomcatuser 5590u IPv6 542898860 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.88:65110 (ESTABLISHED QR=618 QS=0) java 25399 tomcatuser 5664u IPv6 542858261 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.199:19422 (ESTABLISHED QR=602 QS=0) java 25399 tomcatuser 6124u IPv6 543082704 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.196:50054 (ESTABLISHED QR=852 QS=0) java 25399 tomcatuser 6159u IPv6 542459540 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.110:51141 (ESTABLISHED QR=746 QS=0) java 25399 tomcatuser 6311u IPv6 542499860 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.130:54370 (ESTABLISHED QR=634 QS=0) java 25399 tomcatuser 6383u IPv6 542512620 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.212:57794 (ESTABLISHED QR=634 QS=0) java 25399 tomcatuser 6385u IPv6 543065773 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.130:58026 (ESTABLISHED QR=746 QS=0) java 25399 tomcatuser 6387u IPv6 542500653 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.212:57777 (ESTABLISHED QR=650 QS=0) java 25399 tomcatuser 6392u IPv6 542500684 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.212:57778 (ESTABLISHED QR=618 QS=0) java 25399 tomcatuser 6396u IPv6 542946274 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.163:50919 (ESTABLISHED QR=554 QS=0) java 25399 tomcatuser 6485u IPv6 543011874 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.199:20529 (ESTABLISHED QR=666 QS=0) java 25399 tomcatuser 6487u IPv6 542801192 0t0 TCP 10.77.xx.xx:8443->10.xx.xx.129:51349 (ESTABLISHED QR=634 QS=0) Looking at the thread dump taken around the same time: "https-openssl-apr-0:0:0:0:0:0:0:0-8443-exec-74" #52250 daemon prio=5 os_prio=0 tid=0x00007fe7a027e600 nid=0x1a5f waiting on condition [0x00007fe6354db000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x000000059b61d528> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:103) at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:31) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.lang.Thread.run(Thread.java:748) "https-openssl-apr-0:0:0:0:0:0:0:0-8443-exec-71" #52247 daemon prio=5 os_prio=0 tid=0x00007fe7a01af350 nid=0x1a54 waiting on condition [0x00007fe71d3de000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x000000059b61d528> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:103) at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:31) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.lang.Thread.run(Thread.java:748) "https-openssl-apr-0:0:0:0:0:0:0:0-8443-exec-67" #52243 daemon prio=5 os_prio=0 tid=0x00007fe7a024cc70 nid=0x1a50 waiting on condition [0x00007fe65ac01000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x000000059b61d528> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:103) at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:31) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.lang.Thread.run(Thread.java:748) "https-openssl-apr-0:0:0:0:0:0:0:0-8443-AsyncTimeout" #406 daemon prio=5 os_prio=0 tid=0x00007fe93816b4c0 nid=0x2993 waiting on condition [0x00007fe660e61000] java.lang.Thread.State: TIMED_WAITING (sleeping) at java.lang.Thread.sleep(Native Method) at org.apache.coyote.AbstractProtocol$AsyncTimeout.run(AbstractProtocol.java:1211) at java.lang.Thread.run(Thread.java:748) "https-openssl-apr-0:0:0:0:0:0:0:0-8443-Acceptor-0" #405 daemon prio=5 os_prio=0 tid=0x00007fe938169bf0 nid=0x2992 runnable [0x00007fe660f62000] java.lang.Thread.State: RUNNABLE at org.apache.tomcat.jni.Socket.accept(Native Method) at org.apache.tomcat.util.net.AprEndpoint$Acceptor.run(AprEndpoint.java:961) at java.lang.Thread.run(Thread.java:748) "https-openssl-apr-0:0:0:0:0:0:0:0-8443-Poller" #404 daemon prio=5 os_prio=0 tid=0x00007fe938168320 nid=0x2991 runnable [0x00007fe661063000] java.lang.Thread.State: RUNNABLE at org.apache.tomcat.util.net.AprEndpoint$Poller.run(AprEndpoint.java:1642) at java.lang.Thread.run(Thread.java:748)