Re: The failure

2018-10-24 Thread serguei.spit...@oracle.com
Leonid confirmed this deadlock is not reproducible if the Kitchensink agent_sampler is disabled. Also, applying the patch from Robbin (with agent_sampler enabled) hit a new assert that has caught another case in JvmtiEnv::GetStackTrace with the same pattern: With proposed patch issue reproduced

Re: The failure

2018-10-24 Thread Robbin Ehn
Hi sorry, the assert should be assert(!t->have_threads_list(),) We should not have a threads list :) /Robbin On 24/10/2018 11:18, Robbin Ehn wrote: Hi Serguei, On 24/10/2018 11:00, serguei.spit...@oracle.com wrote: Hi Robbin and David, There is no JvmtiEnv::SuspendThreadList call in th

Re: The failure

2018-10-24 Thread Robbin Ehn
Hi Serguei, On 24/10/2018 11:00, serguei.spit...@oracle.com wrote: Hi Robbin and David, There is no JvmtiEnv::SuspendThreadList call in the dumped stack traces. But there is an instance of the JvmtiEnv::SuspendThread which seems to be supporting your theory: Sorry, I did mean any place we ta

Re: The failure

2018-10-24 Thread Robbin Ehn
On 24/10/2018 09:46, David Holmes wrote: Thanks Robbin! So you're not allowed to request a VM operation if you hold a ThreadsListHandle? I suppose that is no different to not being able to request a VM operation whilst holding the Threads_lock. Yes, exactly. /Robbin I suspect before Thread
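
The rule confirmed in the exchange above (no VM operation request while a ThreadsListHandle is held) can be illustrated with a small toy model. This is only a sketch under stated assumptions: `sketch::ListHandle`, `held_list_handles`, and `try_request_vm_operation` are hypothetical names invented for illustration and are not HotSpot APIs. The model refuses the request instead of blocking, so the pattern can be observed without reproducing a real deadlock.

```cpp
#include <cassert>

// Toy model only: every name in this namespace is hypothetical and is
// NOT part of HotSpot. It mimics the rule "do not request a VM operation
// while holding a threads list handle".
namespace sketch {

// Number of list handles currently held by the calling thread.
thread_local int held_list_handles = 0;

// RAII guard standing in for a ThreadsListHandle: counts itself on
// construction and uncounts itself on destruction.
class ListHandle {
 public:
  ListHandle() { ++held_list_handles; }
  ~ListHandle() { --held_list_handles; }
  ListHandle(const ListHandle&) = delete;
  ListHandle& operator=(const ListHandle&) = delete;
};

// Stand-in for requesting a VM operation. In this model the request is
// refused (returns false) when the caller still holds a list handle,
// rather than blocking and risking a deadlock.
bool try_request_vm_operation() {
  return held_list_handles == 0;
}

// Requests the operation while a handle is still in scope.
bool request_while_holding_handle() {
  ListHandle handle;
  return try_request_vm_operation();
}

// Releases the handle first, then requests the operation.
bool request_after_releasing_handle() {
  {
    ListHandle handle;  // released at the end of this inner scope
  }
  return try_request_vm_operation();
}

}  // namespace sketch
```

In this model the first helper is refused and the second succeeds, which mirrors the intent of an assert that the requesting thread has no threads list at the point where it asks for a VM operation.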

Re: The failure

2018-10-24 Thread serguei.spit...@oracle.com
Hi Robbin and David, There is no JvmtiEnv::SuspendThreadList call in the dumped stack traces. But there is an instance of the JvmtiEnv::SuspendThread which seems to be supporting your theory: Thread 136 (Thread 0x2ae494100700 (LWP 28023)): #0  0x2ae3927b5945 in pthread_cond_wait@@GLIBC_2.3

Re: The failure

2018-10-24 Thread David Holmes
Thanks Robbin! So you're not allowed to request a VM operation if you hold a ThreadsListHandle? I suppose that is no different to not being able to request a VM operation whilst holding the Threads_lock. I suspect before ThreadSMR this may have been a case where we weren't ensuring a target th

Re: The failure

2018-10-24 Thread Robbin Ehn
Hi, truncate: On 24/10/2018 02:00, serguei.spit...@oracle.com wrote: One thing I noticed which Robbin should be able to expand upon is that Thread 101 is terminating and has called ThreadsSMRSupport::smr_delete and is blocked here:   // Wait for a release_stable_list() call before we check ag

Re: The failure

2018-10-23 Thread serguei . spitsyn
Okay, thanks! Serguei On 10/23/18 4:58 PM, David Holmes wrote: I should have looked further before sending this. Many threads are in smr_delete. David On 24/10/2018 9:56 AM, David Holmes wrote: Hi Serguei, Robbin, One thing I noticed which Robbin should be able to expand upon is that Threa

Re: The failure

2018-10-23 Thread David Holmes
I should have looked further before sending this. Many threads are in smr_delete. David On 24/10/2018 9:56 AM, David Holmes wrote: Hi Serguei, Robbin, One thing I noticed which Robbin should be able to expand upon is that Thread 101 is terminating and has called ThreadsSMRSupport::smr_delete

Re: The failure

2018-10-23 Thread David Holmes
Hi Serguei, Robbin, One thing I noticed which Robbin should be able to expand upon is that Thread 101 is terminating and has called ThreadsSMRSupport::smr_delete and is blocked here: // Wait for a release_stable_list() call before we check again. No // safepoint check, no timeout, and not a

Re: The failure

2018-10-23 Thread serguei . spitsyn
Please, skip it - sorry for the noise. It is hard to prove anything with current dump. Thanks, Serguei On 10/23/18 9:09 AM, serguei.spit...@oracle.com wrote: Hi David and Robbin, I have an idea that needs to be checked. It can be almost the same deadlock scenario that I've already explained b

Re: The failure

2018-10-23 Thread serguei . spitsyn
Hi David and Robbin, On 10/23/18 7:38 AM, Robbin Ehn wrote: Hi, On 10/23/18 10:34 AM, David Holmes wrote: Hi Serguei, The VMThread is executing VM_HandshakeAllThreads which is not a safepoint operation. There's no real way to tell from the stacks what it's stuck on. Good point. We agreed

Re: The failure

2018-10-23 Thread serguei.spit...@oracle.com
Hi David and Robbin, I have an idea that needs to be checked. It can be almost the same deadlock scenario that I've already explained but more sophisticated. I suspect a scenario with JvmtiThreadState_lock where the flag Monitor::_safepoint_check_always does not help much. It can be verified by

Re: The failure

2018-10-23 Thread Robbin Ehn
Hi, On 10/23/18 10:34 AM, David Holmes wrote: Hi Serguei, The VMThread is executing VM_HandshakeAllThreads which is not a safepoint operation. There's no real way to tell from the stacks what it's stuck on. I cannot find a thread that is not considered safepoint safe or is_ext_suspended (th

Re: The failure

2018-10-23 Thread David Holmes
Hi Serguei, The VMThread is executing VM_HandshakeAllThreads which is not a safepoint operation. There's no real way to tell from the stacks what it's stuck on. David On 23/10/2018 5:58 PM, serguei.spit...@oracle.com wrote: Hi David, You are right, thanks. It means, this deadlock needs mor

Re: The failure

2018-10-23 Thread David Holmes
Hi Serguei, The JvmtiThreadState_lock is always acquired with safepoint checks enabled, so all JavaThreads blocked trying to acquire it will be _thread_blocked and so safepoint-safe and so won't be holding up the safepoint. David On 23/10/2018 5:21 PM, serguei.spit...@oracle.com wrote: Hi,

Re: The failure

2018-10-23 Thread serguei.spit...@oracle.com
Hi, I've added the serviceability-dev mailing list. It can be interesting for the SVC folks. :) On 10/22/18 22:14, Leonid Mesnik wrote: Hi Seems last version also crashes with 2 other diffe