Re: RFR: 8215355: Object monitor deadlock with no threads holding the monitor (using jemalloc 5.1)

David Holmes Mon, 18 Nov 2019 18:01:36 -0800

Hi Dan,

Thanks for taking a look at this.


On 19/11/2019 3:36 am, Daniel D. Daugherty wrote:

Hi David,


On 11/17/19 9:30 PM, David Holmes wrote:

Bug: https://bugs.openjdk.java.net/browse/JDK-8215355
webrev: http://cr.openjdk.java.net/~dholmes/8215355/webrev/


src/hotspot/share/runtime/thread.hpp
     Nice catch!

src/hotspot/share/runtime/thread.cpp
     Nice catch!

     Not your issue, but these two lines feel strange/wrong:

         L1008:   // Allow non Java threads to call this without stack_base
         L1009:   if (_stack_base == NULL) return true;

     When _stack_base is NULL, any 'adr' is in the caller's stack? The
     comment is not helping understand why this is so...

src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/JavaThread.java
     Nice catch!

     Again, not your issue, but these four lines are questionable:

         L383     Address sp      = lastSPDbg();
         L384     Address stackBase = getStackBase();
         L385     // Be robust
         L386     if (sp == null) return false;

     I can see why a NULL sp would cause a "false" return since obviously
     something is a amiss in the frame. However, the C++ code doesn't make
     this check so why does the SA code?

     And this code doesn't check stackBase == NULL so it's not matching
     the C++ code either.


Thumbs up on the change itself. My queries above and below might warrant
new bugs or RFEs to be filed.


I have filed a bug to examine this and the issue Thomas flagged:

https://bugs.openjdk.java.net/browse/JDK-8234372

"Investigate use of Thread::stack_base() and queries for "in stack""

This was a very difficult bug to track down and I want to publiclyacknowledge and thank the jemalloc folk (users and developers) forcontinuing to investigate this issue from their side. Without theirpersistence this issue would have languished.
You also deserve thanks for sticking with this bug: Thanks David!!

Thanks, but I had written this off as a jemalloc issue until theyprovided the additional data.

The thread stack_base() is the first address above the thread's stack.However, the "in stack" checks performed by Thread::on_local_stack andThread::is_in_stack allowed the checked address to be equal to thestack_base() - which is not correct. Here's how this manifests as thebug:
- Let a JavaThread instance, T2, be allocated at the end of threadT1's stack i.e. at T1->stack_base()
  [This seems to be why this only reproduced with jemalloc.]
- Let T2 lock an inflated monitor
- Let T1 try to lock the same monitor
- T1 would consider the _owner field value (T2) as being in itsstack and so consider the monitor stack-locked by T1 - And so both T1 and T2 would have ownership of the monitor allowingthe monitor state (and application state) to be corrupted. Thisresults in a range of hangs and crashes depending on the exactinterleaving.
Ouch!

So I was wondering how this bug could happen with the thread alignment
logic that we have in place... search for the _real_malloc_address stuff...
And then I noticed that the logic only kicks in when UseBiasedLocking ==true
(and this bug says it doesn't happen with -XX:-UseBiasedLocking):

Actually that is a false claim. As per my comment on "2019-10-09 14:09"it does reproduce with biased-locking disabled but much more rarely.

src/hotspot/share/runtime/thread.cpp:

// ======= Thread ========
// Support for forcing alignment of thread objects for biased locking
void* Thread::allocate(size_t size, bool throw_excpt, MEMFLAGS flags) {
   if (UseBiasedLocking) {
     const size_t alignment = markWord::biased_lock_alignment;
     size_t aligned_size = size + (alignment - sizeof(intptr_t));

void* real_malloc_addr = throw_excpt? AllocateHeap(aligned_size,flags, CURRENT_PC) : AllocateHeap(aligned_size,flags, CURRENT_PC,

AllocFailStrategy::RETURN_NULL);
     void* aligned_addr     = align_up(real_malloc_addr, alignment);
     assert(((uintptr_t) aligned_addr + (uintptr_t) size) <=
            ((uintptr_t) real_malloc_addr + (uintptr_t) aligned_size),
            "JavaThread alignment code overflowed allocated storage");
     if (aligned_addr != real_malloc_addr) {

log_info(biasedlocking)("Aligned thread " INTPTR_FORMAT " to "INTPTR_FORMAT,

                               p2i(real_malloc_addr),
                               p2i(aligned_addr));
     }
     ((Thread*) aligned_addr)->_real_malloc_address = real_malloc_addr;
     return aligned_addr;
   } else {
     return throw_excpt? AllocateHeap(size, flags, CURRENT_PC)

: AllocateHeap(size, flags, CURRENT_PC,AllocFailStrategy::RETURN_NULL);

   }
}


The logging logic above:

     if (aligned_addr != real_malloc_addr) {

log_info(biasedlocking)("Aligned thread " INTPTR_FORMAT " to "INTPTR_FORMAT,

                               p2i(real_malloc_addr),
                               p2i(aligned_addr));
     }

allows for real_malloc_addr to be the same as aligned_addr sometimes
(and no log message is issued), but I'm not sure from spelunking in
code whether it's really possible for:

     void* aligned_addr     = align_up(real_malloc_addr, alignment);

to return aligned_addr == real_malloc_addr. In other words, if
real_malloc_addr is already aligned perfectly, does align_up() still
change that value?

If it is possible for (aligned_addr == real_malloc_addr), then it is
possible for this bug to happen without jemalloc.

I've convinced myself that this is possible because of this line:

     size_t aligned_size = size + (alignment - sizeof(intptr_t));

If real_malloc_addr is already aligned perfectly and align_up()
always changed the input address, then the aligned_size would be
too small by sizeof(intptr_t) and we would have seen a buffer
overwrite like that over the many, many years.

So my conclusion is that it should be possible for this bug to
happen without jemalloc, but it would have to be rare.

I'm a little surprised that we specialize this way as I thought the128/256 byte alignment was necessary regardless of biased-locking.Further even if running without biased-locking we still have alignmentrequirements for the lock-bits, age-bits etc, that do not seem to becaptured by the above code unless AllocateHeap somehow already providessuch alignment by default. (I'm also unclear why this doesn't fail indebug builds but just assume the allocation patterns are different.)

Anyway, if the allocator already returns a suitably aligned block ofmemory then I am assuming the above code doesn't actually need to doanything.

So theoretically, without having advance knowledge of the details of theallocator, yes this bug could happen for any allocator.


Thanks,
David
-----

Interestingly Thread::is_in_usable_stack does not have this bug.
So we have Thread::is_in_usable_stack(), Thread::on_local_stack() and
Thread::is_in_stack()? I haven't compared all three side by side, but
there might be some cleanup work that can be done here (in a different
bug).
The bug can be tracked way back to JDK-6699669 as explained in the bugreport. That issue also showed that the same bug existed in the SAimplementations of these "on stack" checks.
Ouch! JDK-6699669 was fixed in jdk7-B56 and looks like it was pushed
to the jdk6u train... so this bug goes back quite a ways...

Outstanding hunt David!

Dan
Testing:
- The reproducer from the bug report, using jemalloc, ran over 5000times without failing in any way.
  - tiers 1-3 on all Oracle platforms
  - serviceability/sa tests

Thanks,
David
-----

Re: RFR: 8215355: Object monitor deadlock with no threads holding the monitor (using jemalloc 5.1)

Reply via email to