This is evil :) There might be more cases like this, e.g.
frame_x86.cpp frame::is_interpreted_frame_valid(): if (locals > thread->stack_base() || locals < (address) fp()) return false; Also, I would have thought the little alloca() dance we do at the start of thread_native_entry() would push the first real frame down the stack. The fix looks good. Cheers, Thomas On Mon, Nov 18, 2019 at 3:31 AM David Holmes <[email protected]> wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8215355 > webrev: http://cr.openjdk.java.net/~dholmes/8215355/webrev/ > > This was a very difficult bug to track down and I want to publicly > acknowledge and thank the jemalloc folk (users and developers) for > continuing to investigate this issue from their side. Without their > persistence this issue would have languished. > > The thread stack_base() is the first address above the thread's stack. > However, the "in stack" checks performed by Thread::on_local_stack and > Thread::is_in_stack allowed the checked address to be equal to the > stack_base() - which is not correct. Here's how this manifests as the bug: > > - Let a JavaThread instance, T2, be allocated at the end of thread T1's > stack i.e. at T1->stack_base() > [This seems to be why this only reproduced with jemalloc.] > - Let T2 lock an inflated monitor > - Let T1 try to lock the same monitor > - T1 would consider the _owner field value (T2) as being in its stack > and so consider the monitor stack-locked by T1 > - And so both T1 and T2 would have ownership of the monitor allowing > the monitor state (and application state) to be corrupted. This results > in a range of hangs and crashes depending on the exact interleaving. > > Interestingly Thread::is_in_usable_stack does not have this bug. > > The bug can be tracked way back to JDK-6699669 as explained in the bug > report. That issue also showed that the same bug existed in the SA > implementations of these "on stack" checks. > > Testing: > - The reproducer from the bug report, using jemalloc, ran over 5000 > times without failing in any way. > - tiers 1-3 on all Oracle platforms > - serviceability/sa tests > > Thanks, > David > ----- >
