We can't hold a mutex across an event posting. The event posting
code calls out of the VM to the user agent and holding a VM mutex
across that boundary is not allowed.

I'll also take a look at Swamy's comment about us posting the
VM_DEATH event too soon. These VM shutdown cases are certainly
interesting...

Dan


Paul Hohensee wrote:
If that's true, then you do indeed need a mutex. The OrderAccess methods only guarantee visibility ordering. They don't guarantee exclusive access to shared variables.

Paul

Hiroshi Yamauchi wrote:
Dan,

On Fri, Feb 13, 2009 at 11:39 AM, Daniel D. Daugherty
<[email protected]> wrote:

I'm leaning toward making the two fields volatile and adding the
appropriate OrderAccess::XXX calls. I vaguely remember an existing
bug for events posting late. I'm going to see if I can find it.

Making the fields volatile and adding the OrderAccess:xxx calls isn't
enough because we could see the dead phase change while a loop that
calls the list of callbacks is executing (eg the for loop in
JvmtiExport::post_compiled_method_load). We need to guard at least the
loop and the preceding EVT_TRIG_TRACE in a mutex. That's my take.

Hiroshi


Hiroshi Yamauchi wrote:
Hi Dan,

On Wed, Feb 11, 2009 at 12:43 PM, Daniel D. Daugherty
<[email protected] <mailto:[email protected]>> wrote:

   I'll chime in on parts of this thread below.




   Tim Bell wrote:

       I don't know the precise answer to this question, so I am
forwarding it to the Serviceability list to get a wider audience.

       -------- Original Message --------
       Subject:     Data visibility between threads in Hotspot
       Date:     Tue, 10 Feb 2009 11:14:24 -0800

       Hi Tim,

       I have a Hotspot question. Chuck Rasbold pointed me to you.
       Feel free to
       forward this message to other experts at Sun, if needed.

       If one Hotspot engineer wants to guarantee the immediate
       visibility of a
       write to a static variable to reads in other threads (assuming
       the reads
       and the writes are properly synchronized via a mutex), what
       does s/he do?

       Does the use of MutexLocker guarantee the visibility? It
       probably does not.


   I believe that MutexLocker does guarantee the visibility:

   src/share/vm/runtime/mutexLocker.hpp:

    133  // See orderAccess.hpp.  We assume throughout the VM that
   MutexLocker's
    134  // and friends constructors do a fence, a lock and an
   acquire *in that
    135  // order*.  And that their destructors do a release and
   unlock, in *that*
    136  // order.  If their implementations change such that these
   assumptions
    137  // are violated, a whole lot of code will break.


OK. That's handy.



   See src/share/vm/runtime/orderAccess.hpp for the gory details.




       There are several volatile variables in Hotpot, but it may not
       really
       work because C++ compilers do not guarantee the visibility or
       limit the
       instruction reordering.

   See the "C++ Volatility" section in
   src/share/vm/runtime/orderAccess.hpp.


It's an interesting header file to read.




       For example, there is the static variable JvmtiEnvBase:_phase
       which
       indicates the JVMTI phase in which the JVM is currently in.
       But AFAICT,
there is no synchronization used by the callers of get_phase() and
       set_phase() and apparently they can be called by multiple
       threads. Also,
       the JvmtiEventEnabled::_enabled_bits which is a jlong variable
       that
       indicates which events are enabled for a single jvmtiEnv. The
       writes and
       reads to it are not synchronized, either. These are race
       conditions,
       which implies that some JVMTI event can go off in the dead
       phase when no
       events are supposed to go off. I'm actually looking into a bug
       in a
       JVMTI agent that I suspect is due to this.


   It certainly looks like JvmtiEnvBase::_phase should minimally be a
   volatile; it may also require some sort of synchronization to force
   the updated values to be visible...

   JvmtiEventEnabled::_enabled_bits also looks problematic. I'm going
   to have to mull on both of these and talk to some other folks.


The bug that I ran across was a memory corruption crash because a
CompiledMethodLoad event happened to be sent (due to this race condition) after Agent_OnUnload call where I had already deallocated the agent's data objects. This happens very rarely (once in every more than 100k runs of a
more-or-less javac invocation) but often enough to fix.

I have a patch attached, which I used to suppress the bug by avoiding the events to be sent during the dead phase by adding synchronization. Basically
I make the _phase variable volatile and I use a mutex to guard event
callbacks in the all the post_xxx functions and the phase transition to the dead phase in post_vm_death(). Probably we don't need the volatile as long
as the mutex guarantees the visibility. The other phase transitions
(onload->primodial->start->live) may not need the same degree of
synchronization because events that can be sent in a earlier phase can be
sent in the next phase as well.

The _enabled_bits and the callback function table will probably need a
similar synchronization as well. But the attached patch does not implement it and only ensures that no events are sent during the dead phase. I'd like
to get the fix into openjdk in one form or another.

Thanks,
Hiroshi




   Dan



       Thanks,
       Hiroshi


Reply via email to