Greetings,

My fix for the following bug:

    JDK-8047720 Xprof hangs on Solaris

that was pushed to JDK9 last June needs to be cleaned up.

Thanks to Alex Garthwaite (agarthwa...@twitter.com) and Carsten
Varming (varm...@gmail.com) for reporting the mess that I made
in WatcherThread::stop() and for suggesting fixes.

This code review is for a general cleanup pass on PeriodicTask_lock
and some of the surrounding code. This is a targeted review in that
I would like to hear from three groups of people:

1) The author and reviewers for:

   JDK-7127792 Add the ability to change an existing PeriodicTask's
               execution interval

   Rickard, David H, and Markus G.

2) The reviewers for:

   JDK-8047720 Xprof hangs on Solaris

   Markus G and Coleen

3) Alex and Carsten


Here's the webrev URL:

http://cr.openjdk.java.net/~dcubed/8072439-webrev/0-for_jdk9_hs_rt/

I've attached the original RFR for JDK-8047720, which explains
the deadlock that was being fixed. I will do similar testing
with this fix.

Dan
--- Begin Message ---
Greetings,

I have a fix ready for the following bug:

    8047720 Xprof hangs on Solaris
    https://bugs.openjdk.java.net/browse/JDK-8047720

Here is the webrev URL:

http://cr.openjdk.java.net/~dcubed/8047720-webrev/0-jdk9-hs-rt/

This deadlock occurred between the following threads:

    Main thread   - Trying to stop the WatcherThread as part of
                    shutting down the VM; this thread is blocked
                    on the PeriodicTask_lock which keeps it from
                    reaching a safepoint.
    WatcherThread - Requested a VM_ForceSafepoint to complete
                    a JavaThread::java_suspend() call as part
                    of a FlatProfiler record_thread_ticks()
                    call; this thread owns the PeriodicTask_lock
                    since it is processing a periodic task.
    VMThread      - Trying to start a safepoint; this thread is
                    blocked waiting for the Main thread to reach
                    a safepoint.
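
To make the cycle above explicit, here is a minimal sketch that
models the three threads as a wait-for graph and checks for a cycle.
This is not HotSpot code: WaitForGraph, in_wait_cycle() and
xprof_deadlock_graph() are hypothetical names invented for
illustration, and the edges simply restate the description above.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical model (not HotSpot code): each blocked thread maps to
// the thread it is waiting on.
using WaitForGraph = std::map<std::string, std::string>;

// Returns true if following wait-for edges from 'start' leads back
// to 'start', i.e. 'start' is part of a deadlock cycle.
bool in_wait_cycle(const WaitForGraph& g, const std::string& start) {
  std::set<std::string> seen;
  std::string cur = start;
  while (true) {
    auto it = g.find(cur);
    if (it == g.end()) return false;   // cur is not blocked; chain ends
    cur = it->second;
    if (cur == start) return true;     // walked back to where we began
    if (!seen.insert(cur).second) return false;  // cycle that excludes start
  }
}

// Wait-for edges taken from the deadlock description in this message.
WaitForGraph xprof_deadlock_graph() {
  return {
    // Blocked on PeriodicTask_lock, which the WatcherThread owns.
    {"Main", "WatcherThread"},
    // Waiting for its VM_ForceSafepoint request to complete.
    {"WatcherThread", "VMThread"},
    // Waiting for the Main thread to reach a safepoint.
    {"VMThread", "Main"},
  };
}
```

In this model, in_wait_cycle(xprof_deadlock_graph(), "Main") reports
a cycle. Removing the "Main" edge (Main no longer blocked on the
lock) breaks the cycle, which is the general shape any fix has to
take.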

The PeriodicTask_lock is one of the VM internal locks and is
typically managed using Mutex::_no_safepoint_check_flag to
avoid deadlocks. Yes, the irony is dripping on the floor... :-)

The interesting part of this deadlock is that I think that it
is possible for other periodic tasks to hit it. Anything that
causes the WatcherThread to start a safepoint while processing
a periodic task should be susceptible to this race. Think about
the -XX:+DeoptimizeALot option and how it causes VM_Deopt
requests on thread state transitions... Interesting...

Testing:
    - I found a way to add delays to the right spots in the
      VM to make the deadlock reproduce in just about every
      run of the test associated with the bug. The new
      os::naked_short_sleep() function is your friend. Thanks
      to Fred for adding that! See the bug report for the
      debugging diffs.
    - 72 hours of running the test in the bug report with
      delays enabled for product, fastdebug and jvmg bits
      in parallel on my Solaris X86 server.
    - JPRT test run
    - Aurora Adhoc results are in progress; we're having issues
      with both a broken testbase build and infra problems
      with results not being uploaded.

--- End Message ---
