On Fri, Oct 29, 2021 at 9:28 AM Suvendu Sekhar Mondal <suv3...@gmail.com> wrote:
>
> Hello Chris,
>
> On Fri, Oct 29, 2021 at 2:46 AM Christopher Schultz
> <ch...@christopherschultz.net> wrote:
> >
> > Suvendu,
> >
> > On 10/28/21 12:55, Suvendu Sekhar Mondal wrote:
> > > Hello Everyone,
> > >
> > > I was investigating a thread pool exhaustion issue. Thread dump
> > > analysis showed that all HTTP threads were waiting for a ReentrantLock
> > > object. The object address 0x000000066d727f28 was the same for all of
> > > the waiting threads:
> > >
> > > "http-nio-18100-exec-86" #32808 daemon prio=5 os_prio=0 tid=0x0000000051835800 nid=0x29bc waiting on condition [0x000000007a5be000]
> > >     java.lang.Thread.State: WAITING (parking)
> > > at sun.misc.Unsafe.park(Native Method)
> > > - parking to wait for  <0x000000066d727f28> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
> > > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> > > at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> > > at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
> > > at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
> > > at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
> > > at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
> > > at org.apache.catalina.realm.JNDIRealm.get(JNDIRealm.java:2385)
> > > at org.apache.catalina.realm.JNDIRealm.authenticate(JNDIRealm.java:1274)
> > >
> > > There was no hint in the thread dump about which thread owned the
> > > lock. Luckily, a heap dump had been taken before the thread dump was
> > > generated. When I queried the heap dump for that ReentrantLock object,
> > > I saw that another thread (http-nio-18100-exec-4) was holding the
> > > lock (exclusiveOwnerThread). There was NO trace of the
> > > http-nio-18100-exec-4 thread in any of the thread dumps! So it was a
> > > "lock without an owner" case.
> >
> > I think you are looking at several pieces of evidence that may or may
> > not correlate to each other at all. The fact that the thread wasn't in
> > the thread dump indicates that the thread (or even the whole JVM) had
> > terminated between the time you took the heap dump and the thread dump.
> > Most likely, the monitor was owned by another thread when you took your
> > thread dump. Try using other tools which *do* reveal the lock holder's
> > identity.
> >
>
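As one example of a tool that can name a lock holder, the JDK's own
java.lang.management API reports the owner of the lock a parked thread is
waiting for. The sketch below is only an illustration: the class name
LockOwnerProbe and the thread names are hypothetical, and it covers only
the case where the owning thread is still alive. It makes no claim about
what is reported once the owner has terminated, which is the situation
described in this thread.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper, not part of Tomcat or the JDK.
final class LockOwnerProbe {

    /** Name of the thread owning the lock that `waiter` is waiting for, or null. */
    static String lockOwnerName(Thread waiter) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        ThreadInfo info = threads.getThreadInfo(waiter.getId());
        return info == null ? null : info.getLockOwnerName();
    }

    /** Parks one thread behind a lock held by a live thread and reports the owner. */
    static String demo() {
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(1);

        Thread owner = new Thread(() -> {
            lock.lock();
            try {
                held.countDown();   // signal that the lock is now held
                done.await();       // keep holding it until the probe is finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        }, "owner-hypothetical");

        Thread waiter = new Thread(() -> {
            lock.lock();            // parks here while the owner holds the lock
            lock.unlock();
        }, "waiter-hypothetical");

        try {
            owner.start();
            held.await();
            waiter.start();
            // Wait until the waiter is actually parked on the lock.
            while (waiter.getState() != Thread.State.WAITING) {
                Thread.sleep(5);
            }
            return lockOwnerName(waiter);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } finally {
            done.countDown();       // let both threads finish
        }
    }

    public static void main(String[] args) {
        System.out.println("lock owner: " + demo());
    }
}
```

Inside a running Tomcat the same ThreadMXBean data is reachable over JMX,
so a probe like this does not have to be compiled into the application.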
> This issue has happened a few times. "Busy Thread Count" was high
> during the problem period. The JVM was up and running when I collected
> the heap and thread dumps; the pid did not change in between. I used
> jstack, VisualVM, and jcmd, and none of them revealed the owning
> thread's details. Only the heap dumps had information on that object
> and on which thread was holding it. Here is a snapshot:
> https://pasteboard.co/D7dV3jej6zId.jpg
>
> I can simulate similar blocking without Tomcat using dummy code. There,
> too, nothing reveals the owner's identity except the heap dump. Here
> is a sample: https://gist.github.com/suv3ndu/2ec9fe660d2b833996817ed62186eac2
>
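For readers without access to the gist, here is a minimal sketch of how a
"lock without an owner" can arise. It is not the code from the gist; the
class, method, and thread names are hypothetical. A short-lived thread
takes a ReentrantLock and exits without unlocking, after which the lock
stays held and every other thread that wants it has to wait.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class, unrelated to Tomcat's own code.
final class OrphanedLockDemo {

    // ReentrantLock.getOwner() is protected, so a small subclass exposes it.
    static final class InspectableLock extends ReentrantLock {
        Thread owner() {
            return getOwner();
        }
    }

    /** Locks from a short-lived thread that exits without ever unlocking. */
    static InspectableLock orphanLock() {
        InspectableLock lock = new InspectableLock();
        Thread worker = new Thread(lock::lock, "exec-4-hypothetical");
        worker.start();
        try {
            worker.join();          // the worker is gone once join() returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return lock;
    }

    /** Returns true if the calling thread could take the lock within the timeout. */
    static boolean canAcquire(ReentrantLock lock, long millis) {
        try {
            boolean acquired = lock.tryLock(millis, TimeUnit.MILLISECONDS);
            if (acquired) {
                lock.unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        InspectableLock lock = orphanLock();
        Thread owner = lock.owner();
        System.out.println("locked=" + lock.isLocked());
        System.out.println("owner=" + owner.getName() + " alive=" + owner.isAlive());
        System.out.println("acquired=" + canAcquire(lock, 200));
    }
}
```

The lock still records the terminated thread as its owner, which matches
the exclusiveOwnerThread reference seen in the heap dump.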
> > > After glancing through Tomcat’s JNDIRealm.get() code and
> > > beyond[1], I can see that the lock is acquired on
> > > singleConnectionLock. That lock is released in either the close() or
> > > the release() method. So, if something bad happens to the thread that
> > > is trying to establish a connection, the lock will be held without a
> > > proper owner and the other threads will block. Am I interpreting the
> > > code correctly? Should we not handle any failure inside get()?
> > >
> > > Also, I still have not found the reason why the thread terminated.
> > > Any suggestions on how I can enable any specific logging?
> > >
> > > My setup is:
> > > Tomcat version: 9.0.39
> > > Connector: NIO
> > > JDK: AdoptOpenJDK: 1.8.192
> > > OS: Windows 2016
> >
> > Looks like you need a whole bunch of upgrades. Search the Tomcat 9.x
> > changelog for "JNDIRealm" and you'll see there have been changes since
> > 9.0.39 that may have already resolved this issue. Are you able to
> > re-test with Tomcat 9.0.54?
> >
>
> It will not be easy for me to upgrade and test it. Many approvals
> are required to get that done. :(
>
> > > [1]
> > > https://github.com/apache/tomcat/blob/57a6a40fc9f995e4d449358bbde047aab6d9f39a/java/org/apache/catalina/realm/JNDIRealm.java#L2553
> >
> > Note that you are looking at the current version of JNDIRealm.java. The
> > version you are running is 17 commits behind that.
> >
> > The line of code calling ReentrantLock.lock in your code would be
> > https://github.com/apache/tomcat/blob/57a6a40fc9f995e4d449358bbde047aab6d9f39a/java/org/apache/catalina/realm/JNDIRealm.java#L2385
> > which is "return null" indicating that there is a version mismatch
> > between the code you are running and the code you are reading.
> >
>
> Yeah, that's correct. Sorry for the confusion. The version we are
> running is:
> https://github.com/apache/tomcat/blob/95658dfd868216db0773c38aad8eebf544024b09/java/org/apache/catalina/realm/JNDIRealm.java#L2385
> That get() has not changed since then. That's why I asked about
> handling failure inside get().

This should be handled, so I recommend that you update.
In nearly all cases you'll get a NamingException in authenticate(), and
that case works now. Execution at:
https://github.com/apache/tomcat/blob/main/java/org/apache/catalina/realm/JNDIRealm.java#L1235
goes to:
https://github.com/apache/tomcat/blob/main/java/org/apache/catalina/realm/JNDIRealm.java#L1288

After review, I can see that NamingExceptions are caught, which should
be enough, but it would be safer to catch Exception. Also, getPassword
was missing handling for an exception. I will tighten both up, but I
doubt you were affected.
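To illustrate the kind of guard being discussed, here is a minimal sketch
of a get() that never leaves the lock held when opening the connection
fails. It is not the JNDIRealm code: GuardedConnection, opener, and
release() are hypothetical names, and a Callable stands in for opening
the directory connection. Instead of a catch block it uses a success flag
with finally, a variant that also covers Errors.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a realm that shares one connection under a lock.
final class GuardedConnection<T> {

    private final ReentrantLock lock = new ReentrantLock();
    private final Callable<T> opener;   // assumed: opens the real connection
    private T connection;

    GuardedConnection(Callable<T> opener) {
        this.opener = opener;
    }

    /**
     * Returns the shared connection with the lock held. On success the
     * caller must call release(); on failure the lock is already released.
     */
    T get() throws Exception {
        lock.lock();
        boolean success = false;
        try {
            if (connection == null) {
                connection = opener.call();
            }
            success = true;
            return connection;
        } finally {
            if (!success) {
                lock.unlock();          // never leave the lock behind on failure
            }
        }
    }

    /** Releases the lock taken by a successful get(). */
    void release() {
        if (lock.isHeldByCurrentThread()) {
            lock.unlock();
        }
    }

    boolean isLocked() {
        return lock.isLocked();
    }

    public static void main(String[] args) {
        GuardedConnection<String> realm = new GuardedConnection<String>(() -> {
            throw new IllegalStateException("directory unreachable");
        });
        try {
            realm.get();
        } catch (Exception e) {
            System.out.println("get() failed: " + e.getMessage());
        }
        System.out.println("still locked: " + realm.isLocked());
    }
}
```

With this shape, a failure while establishing the connection propagates
to the caller, but the next request thread can still acquire the lock.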

Rémy

>
> I am also trying to find out why it's failing in the first place. We
> may have intermittent connection problems that trigger this. Is there
> any way to get more info about the failure from Tomcat? Please share
> your thoughts.
>
> > -chris
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > For additional commands, e-mail: users-h...@tomcat.apache.org
> >
>

