Re: JNDIRealm thread blocking issue
On Fri, Oct 29, 2021 at 2:02 PM Rémy Maucherat wrote: > > On Fri, Oct 29, 2021 at 9:28 AM Suvendu Sekhar Mondal > wrote: > > > > Hello Chris, > > > > On Fri, Oct 29, 2021 at 2:46 AM Christopher Schultz > > wrote: > > > > > > Suvendu, > > > > > > On 10/28/21 12:55, Suvendu Sekhar Mondal wrote: > > > > Hello Everyone, > > > > > > > > I was investigating one thread pool exhaustion issue. Thread dump > > > > analysis showed that all HTTP threads were waiting for a ReentrantLock > > > > object. Object address 0x00066d727f28 were same for all of the > > > > waiting threads: > > > > > > > > "http-nio-18100-exec-86" #32808 daemon prio=5 os_prio=0 > > > > tid=0x51835800 nid=0x29bc waiting on condition > > > > [0x7a5be000] > > > > java.lang.Thread.State: WAITING (parking) > > > > at sun.misc.Unsafe.park(Native Method) > > > > - parking to wait for <0x00066d727f28> (a > > > > java.util.concurrent.locks.ReentrantLock$NonfairSync) > > > > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > > > > at > > > > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > > > > at > > > > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) > > > > at > > > > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) > > > > at > > > > java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209) > > > > at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285) > > > > at org.apache.catalina.realm.JNDIRealm.get(JNDIRealm.java:2385) > > > > at org.apache.catalina.realm.JNDIRealm.authenticate(JNDIRealm.java:1274) > > > > > > > > There was no hint in the thread dump about which thread was owning the > > > > lock. Luckily, one heap dump was taken before generating thread dump. > > > > When I queried the heap dump for that ReentrantLock object, I saw that > > > > another thread(http-nio-18100-exec-4) was holding the > > > > lock(exclusiveOwnerThread). There was NO trace of > > > > http-nio-18100-exec-4 thread in any of the thread dumps! So it was a > > > > "lock without an owner" case. > > > > > > I think you are looking at several pieces of evidence that may or may > > > not correlate to each other at all. The fact that the thread wasn't in > > > the thread dump indicates that the thread (or even the whole JVM) had > > > terminated between the time you took the heap-dump and the thread dump. > > > Most likely, the monitor was owned by another thread when you took your > > > thread-dump. Try using other tools which *do* reveal the lock-holders > > > identity. > > > > > > > This issue has happened a few times. "Busy Thread Count" was high > > during the problem period. JVM was up and running when I collected > > heap and thread dumps - pid was not changed in-between. I used jstack, > > visualvm, jcmd - nothing revealed owing thread details. Only heap > > dumps had some information on that object and which thread was holding > > onto it. Here is a snap: https://pasteboard.co/D7dV3jej6zId.jpg > > > > I can simulate similar blocking without Tomcat with dummy code. There > > also nothing reveals the owner's identity except the heap dump. Here > > is sample: https://gist.github.com/suv3ndu/2ec9fe660d2b833996817ed62186eac2 > > > > > > After glancing through the Tomcat’s JNDIRealm.get() code and > > > > beyond[1], I can see lock is being acquired on singleConnectionLock. > > > > That lock is getting released either in the close() or release() > > > > method. So, if something bad happens to the thread which is trying to > > > > establish a connection, then lock will be held without a proper owner > > > > and a thread blocking situation will be created. Am I interpreting the > > > > code correctly? Should we not handle any failure inside get()? > > > > > > > > Also, I still have not got the reason why the thread got terminated. > > > > Any suggestions on how I can enable any specific logging? > > > > > > > > My setup is: > > > > Tomcat version: 9.0.39 > > > > Connector: NIO > > > > JDK: AdoptOpenJDK: 1.8.192 > > > > OS: Windows 2016 > > > > > > Looks like you need a whole bunch of upgrades. Search the Tomcat 9.x > > > changelog for "JNDIRealm" and you'll see there have been changes since > > > 9.0.39 that may have already resolved this issue. Are you able to > > > re-test with Tomcat 9.0.54? > > > > > > > It will not be easy for me to upgrade it and test it. Lots of approval > > is required to get that done. :( > > > > > > [1] > > > https://github.com/apache/tomcat/blob/57a6a40fc9f995e4d449358bbde047aab6d9f39a/java/org/apache/catalina/realm/JNDIRealm.java#L2553 > > > > > > Note that you are looking at the current version of JNDIRealm.java. The > > > version you are running is 17 commits behind that. > > > > > > The line of code calling ReentrantLock.lock in your code would be > > > https://github.com/apac
Re: JNDIRealm thread blocking issue
On Fri, Oct 29, 2021 at 9:28 AM Suvendu Sekhar Mondal wrote: > > Hello Chris, > > On Fri, Oct 29, 2021 at 2:46 AM Christopher Schultz > wrote: > > > > Suvendu, > > > > On 10/28/21 12:55, Suvendu Sekhar Mondal wrote: > > > Hello Everyone, > > > > > > I was investigating one thread pool exhaustion issue. Thread dump > > > analysis showed that all HTTP threads were waiting for a ReentrantLock > > > object. Object address 0x00066d727f28 were same for all of the > > > waiting threads: > > > > > > "http-nio-18100-exec-86" #32808 daemon prio=5 os_prio=0 > > > tid=0x51835800 nid=0x29bc waiting on condition > > > [0x7a5be000] > > > java.lang.Thread.State: WAITING (parking) > > > at sun.misc.Unsafe.park(Native Method) > > > - parking to wait for <0x00066d727f28> (a > > > java.util.concurrent.locks.ReentrantLock$NonfairSync) > > > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > > > at > > > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > > > at > > > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) > > > at > > > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) > > > at > > > java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209) > > > at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285) > > > at org.apache.catalina.realm.JNDIRealm.get(JNDIRealm.java:2385) > > > at org.apache.catalina.realm.JNDIRealm.authenticate(JNDIRealm.java:1274) > > > > > > There was no hint in the thread dump about which thread was owning the > > > lock. Luckily, one heap dump was taken before generating thread dump. > > > When I queried the heap dump for that ReentrantLock object, I saw that > > > another thread(http-nio-18100-exec-4) was holding the > > > lock(exclusiveOwnerThread). There was NO trace of > > > http-nio-18100-exec-4 thread in any of the thread dumps! So it was a > > > "lock without an owner" case. > > > > I think you are looking at several pieces of evidence that may or may > > not correlate to each other at all. The fact that the thread wasn't in > > the thread dump indicates that the thread (or even the whole JVM) had > > terminated between the time you took the heap-dump and the thread dump. > > Most likely, the monitor was owned by another thread when you took your > > thread-dump. Try using other tools which *do* reveal the lock-holders > > identity. > > > > This issue has happened a few times. "Busy Thread Count" was high > during the problem period. JVM was up and running when I collected > heap and thread dumps - pid was not changed in-between. I used jstack, > visualvm, jcmd - nothing revealed owing thread details. Only heap > dumps had some information on that object and which thread was holding > onto it. Here is a snap: https://pasteboard.co/D7dV3jej6zId.jpg > > I can simulate similar blocking without Tomcat with dummy code. There > also nothing reveals the owner's identity except the heap dump. Here > is sample: https://gist.github.com/suv3ndu/2ec9fe660d2b833996817ed62186eac2 > > > > After glancing through the Tomcat’s JNDIRealm.get() code and > > > beyond[1], I can see lock is being acquired on singleConnectionLock. > > > That lock is getting released either in the close() or release() > > > method. So, if something bad happens to the thread which is trying to > > > establish a connection, then lock will be held without a proper owner > > > and a thread blocking situation will be created. Am I interpreting the > > > code correctly? Should we not handle any failure inside get()? > > > > > > Also, I still have not got the reason why the thread got terminated. > > > Any suggestions on how I can enable any specific logging? > > > > > > My setup is: > > > Tomcat version: 9.0.39 > > > Connector: NIO > > > JDK: AdoptOpenJDK: 1.8.192 > > > OS: Windows 2016 > > > > Looks like you need a whole bunch of upgrades. Search the Tomcat 9.x > > changelog for "JNDIRealm" and you'll see there have been changes since > > 9.0.39 that may have already resolved this issue. Are you able to > > re-test with Tomcat 9.0.54? > > > > It will not be easy for me to upgrade it and test it. Lots of approval > is required to get that done. :( > > > > [1] > > https://github.com/apache/tomcat/blob/57a6a40fc9f995e4d449358bbde047aab6d9f39a/java/org/apache/catalina/realm/JNDIRealm.java#L2553 > > > > Note that you are looking at the current version of JNDIRealm.java. The > > version you are running is 17 commits behind that. > > > > The line of code calling ReentrantLock.lock in your code would be > > https://github.com/apache/tomcat/blob/57a6a40fc9f995e4d449358bbde047aab6d9f39a/java/org/apache/catalina/realm/JNDIRealm.java#L2385 > > which is "return null" indicating that there is a version mismatch > > between the code you are running and the code you are reading. > > > > Yeah
Re: JNDIRealm thread blocking issue
Hello Chris, On Fri, Oct 29, 2021 at 2:46 AM Christopher Schultz wrote: > > Suvendu, > > On 10/28/21 12:55, Suvendu Sekhar Mondal wrote: > > Hello Everyone, > > > > I was investigating one thread pool exhaustion issue. Thread dump > > analysis showed that all HTTP threads were waiting for a ReentrantLock > > object. Object address 0x00066d727f28 were same for all of the > > waiting threads: > > > > "http-nio-18100-exec-86" #32808 daemon prio=5 os_prio=0 > > tid=0x51835800 nid=0x29bc waiting on condition > > [0x7a5be000] > > java.lang.Thread.State: WAITING (parking) > > at sun.misc.Unsafe.park(Native Method) > > - parking to wait for <0x00066d727f28> (a > > java.util.concurrent.locks.ReentrantLock$NonfairSync) > > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > > at > > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > > at > > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) > > at > > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) > > at > > java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209) > > at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285) > > at org.apache.catalina.realm.JNDIRealm.get(JNDIRealm.java:2385) > > at org.apache.catalina.realm.JNDIRealm.authenticate(JNDIRealm.java:1274) > > > > There was no hint in the thread dump about which thread was owning the > > lock. Luckily, one heap dump was taken before generating thread dump. > > When I queried the heap dump for that ReentrantLock object, I saw that > > another thread(http-nio-18100-exec-4) was holding the > > lock(exclusiveOwnerThread). There was NO trace of > > http-nio-18100-exec-4 thread in any of the thread dumps! So it was a > > "lock without an owner" case. > > I think you are looking at several pieces of evidence that may or may > not correlate to each other at all. The fact that the thread wasn't in > the thread dump indicates that the thread (or even the whole JVM) had > terminated between the time you took the heap-dump and the thread dump. > Most likely, the monitor was owned by another thread when you took your > thread-dump. Try using other tools which *do* reveal the lock-holders > identity. > This issue has happened a few times. "Busy Thread Count" was high during the problem period. JVM was up and running when I collected heap and thread dumps - pid was not changed in-between. I used jstack, visualvm, jcmd - nothing revealed owing thread details. Only heap dumps had some information on that object and which thread was holding onto it. Here is a snap: https://pasteboard.co/D7dV3jej6zId.jpg I can simulate similar blocking without Tomcat with dummy code. There also nothing reveals the owner's identity except the heap dump. Here is sample: https://gist.github.com/suv3ndu/2ec9fe660d2b833996817ed62186eac2 > > After glancing through the Tomcat’s JNDIRealm.get() code and > > beyond[1], I can see lock is being acquired on singleConnectionLock. > > That lock is getting released either in the close() or release() > > method. So, if something bad happens to the thread which is trying to > > establish a connection, then lock will be held without a proper owner > > and a thread blocking situation will be created. Am I interpreting the > > code correctly? Should we not handle any failure inside get()? > > > > Also, I still have not got the reason why the thread got terminated. > > Any suggestions on how I can enable any specific logging? > > > > My setup is: > > Tomcat version: 9.0.39 > > Connector: NIO > > JDK: AdoptOpenJDK: 1.8.192 > > OS: Windows 2016 > > Looks like you need a whole bunch of upgrades. Search the Tomcat 9.x > changelog for "JNDIRealm" and you'll see there have been changes since > 9.0.39 that may have already resolved this issue. Are you able to > re-test with Tomcat 9.0.54? > It will not be easy for me to upgrade it and test it. Lots of approval is required to get that done. :( > > [1] > https://github.com/apache/tomcat/blob/57a6a40fc9f995e4d449358bbde047aab6d9f39a/java/org/apache/catalina/realm/JNDIRealm.java#L2553 > > Note that you are looking at the current version of JNDIRealm.java. The > version you are running is 17 commits behind that. > > The line of code calling ReentrantLock.lock in your code would be > https://github.com/apache/tomcat/blob/57a6a40fc9f995e4d449358bbde047aab6d9f39a/java/org/apache/catalina/realm/JNDIRealm.java#L2385 > which is "return null" indicating that there is a version mismatch > between the code you are running and the code you are reading. > Yeah, that's correct. Sorry for the confusion. Our version is running: https://github.com/apache/tomcat/blob/95658dfd868216db0773c38aad8eebf544024b09/java/org/apache/catalina/realm/JNDIRealm.java#L2385 That get() has not changed since then. That's why I asked about
Re: JNDIRealm thread blocking issue
Suvendu, On 10/28/21 12:55, Suvendu Sekhar Mondal wrote: Hello Everyone, I was investigating one thread pool exhaustion issue. Thread dump analysis showed that all HTTP threads were waiting for a ReentrantLock object. Object address 0x00066d727f28 were same for all of the waiting threads: "http-nio-18100-exec-86" #32808 daemon prio=5 os_prio=0 tid=0x51835800 nid=0x29bc waiting on condition [0x7a5be000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00066d727f28> (a java.util.concurrent.locks.ReentrantLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285) at org.apache.catalina.realm.JNDIRealm.get(JNDIRealm.java:2385) at org.apache.catalina.realm.JNDIRealm.authenticate(JNDIRealm.java:1274) There was no hint in the thread dump about which thread was owning the lock. Luckily, one heap dump was taken before generating thread dump. When I queried the heap dump for that ReentrantLock object, I saw that another thread(http-nio-18100-exec-4) was holding the lock(exclusiveOwnerThread). There was NO trace of http-nio-18100-exec-4 thread in any of the thread dumps! So it was a "lock without an owner" case. I think you are looking at several pieces of evidence that may or may not correlate to each other at all. The fact that the thread wasn't in the thread dump indicates that the thread (or even the whole JVM) had terminated between the time you took the heap-dump and the thread dump. Most likely, the monitor was owned by another thread when you took your thread-dump. Try using other tools which *do* reveal the lock-holders identity. After glancing through the Tomcat’s JNDIRealm.get() code and beyond[1], I can see lock is being acquired on singleConnectionLock. That lock is getting released either in the close() or release() method. So, if something bad happens to the thread which is trying to establish a connection, then lock will be held without a proper owner and a thread blocking situation will be created. Am I interpreting the code correctly? Should we not handle any failure inside get()? Also, I still have not got the reason why the thread got terminated. Any suggestions on how I can enable any specific logging? My setup is: Tomcat version: 9.0.39 Connector: NIO JDK: AdoptOpenJDK: 1.8.192 OS: Windows 2016 Looks like you need a whole bunch of upgrades. Search the Tomcat 9.x changelog for "JNDIRealm" and you'll see there have been changes since 9.0.39 that may have already resolved this issue. Are you able to re-test with Tomcat 9.0.54? > [1] https://github.com/apache/tomcat/blob/57a6a40fc9f995e4d449358bbde047aab6d9f39a/java/org/apache/catalina/realm/JNDIRealm.java#L2553 Note that you are looking at the current version of JNDIRealm.java. The version you are running is 17 commits behind that. The line of code calling ReentrantLock.lock in your code would be https://github.com/apache/tomcat/blob/57a6a40fc9f995e4d449358bbde047aab6d9f39a/java/org/apache/catalina/realm/JNDIRealm.java#L2385 which is "return null" indicating that there is a version mismatch between the code you are running and the code you are reading. -chris - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
JNDIRealm thread blocking issue
Hello Everyone, I was investigating one thread pool exhaustion issue. Thread dump analysis showed that all HTTP threads were waiting for a ReentrantLock object. Object address 0x00066d727f28 were same for all of the waiting threads: "http-nio-18100-exec-86" #32808 daemon prio=5 os_prio=0 tid=0x51835800 nid=0x29bc waiting on condition [0x7a5be000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00066d727f28> (a java.util.concurrent.locks.ReentrantLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285) at org.apache.catalina.realm.JNDIRealm.get(JNDIRealm.java:2385) at org.apache.catalina.realm.JNDIRealm.authenticate(JNDIRealm.java:1274) There was no hint in the thread dump about which thread was owning the lock. Luckily, one heap dump was taken before generating thread dump. When I queried the heap dump for that ReentrantLock object, I saw that another thread(http-nio-18100-exec-4) was holding the lock(exclusiveOwnerThread). There was NO trace of http-nio-18100-exec-4 thread in any of the thread dumps! So it was a "lock without an owner" case. After glancing through the Tomcat’s JNDIRealm.get() code and beyond[1], I can see lock is being acquired on singleConnectionLock. That lock is getting released either in the close() or release() method. So, if something bad happens to the thread which is trying to establish a connection, then lock will be held without a proper owner and a thread blocking situation will be created. Am I interpreting the code correctly? Should we not handle any failure inside get()? Also, I still have not got the reason why the thread got terminated. Any suggestions on how I can enable any specific logging? My setup is: Tomcat version: 9.0.39 Connector: NIO JDK: AdoptOpenJDK: 1.8.192 OS: Windows 2016 [1] https://github.com/apache/tomcat/blob/57a6a40fc9f995e4d449358bbde047aab6d9f39a/java/org/apache/catalina/realm/JNDIRealm.java#L2553 Thanks & Regards, Suvendu - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org