On Wed, Feb 18, 2009 at 8:43 PM, Swamy Venkataramanappa <[email protected]> wrote: > Hiroshi Yamauchi wrote: >> >> Thanks for chipping in. >> >> On Fri, Feb 13, 2009 at 9:39 PM, Swamy Venkataramanappa >> <[email protected]> wrote: >> >>> >>> jvmti spec says no events should be generated after the VM_DEATH. >>> >>> http://java.sun.com/javase/6/docs/platform/jvmti/jvmti.html#VMDeath >>> >>> It says: "No event will occur of after this event". >>> >>> I suspect we are posting vm death event too soon. I think we should >>> post vm death after vm thread is destroyed or to the place where all java >>> threads including daemon thread has stopped running. >>> >> >> That may be a good fix for the issue of events sent after the VM death. >> Since I couldn't suppress my bug with a SetEventCallbacks call that >> nulls out the callback function table or a SetEventNotificationMode >> call to disable all the events, I think there still is some race >> condition/visibility issue there, independent of the above fix. >> > > Yes I agree that there is still some race condition/visibility issue. We do > need your fix for that. My suggestion is just to avoid using mutex locks in > event posting.
There are two bugs: 1. A CompiledMethodLoad event is sent after the VM death event. 2. SetEventCallbacks and SetEventNotificationMode calls may not be respected because of the race condition/visibility issue. Delaying the posting of the VM death event will probably fix bug 1 and we can probably avoid using mutex locks in event posting. But I suspect that fixing bug 2 will need mutex locks after all. Is there a way around mutex locks here? Thanks, Hiroshi > > Thanks, > -Swamy >>> >>> -Swamy >>> >>> Hiroshi Yamauchi wrote: >>> >>>> >>>> Dan, >>>> >>>> On Fri, Feb 13, 2009 at 11:39 AM, Daniel D. Daugherty >>>> <[email protected]> wrote: >>>> >>>> >>>> >>>>> >>>>> I'm leaning toward making the two fields volatile and adding the >>>>> appropriate OrderAccess::XXX calls. I vaguely remember an existing >>>>> bug for events posting late. I'm going to see if I can find it. >>>>> >>>>> >>>> >>>> Making the fields volatile and adding the OrderAccess:xxx calls isn't >>>> enough because we could see the dead phase change while a loop that >>>> calls the list of callbacks is executing (eg the for loop in >>>> JvmtiExport::post_compiled_method_load). We need to guard at least the >>>> loop and the preceding EVT_TRIG_TRACE in a mutex. That's my take. >>>> >>>> Hiroshi >>>> >>>> >>>> >>>> >>>>> >>>>> Hiroshi Yamauchi wrote: >>>>> >>>>> >>>>>> >>>>>> Hi Dan, >>>>>> >>>>>> On Wed, Feb 11, 2009 at 12:43 PM, Daniel D. Daugherty >>>>>> <[email protected] <mailto:[email protected]>> wrote: >>>>>> >>>>>> I'll chime in on parts of this thread below. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Tim Bell wrote: >>>>>> >>>>>> I don't know the precise answer to this question, so I am >>>>>> forwarding it to the Serviceability list to get a wider audience. >>>>>> >>>>>> -------- Original Message -------- >>>>>> Subject: Data visibility between threads in Hotspot >>>>>> Date: Tue, 10 Feb 2009 11:14:24 -0800 >>>>>> >>>>>> Hi Tim, >>>>>> >>>>>> I have a Hotspot question. Chuck Rasbold pointed me to you. >>>>>> Feel free to >>>>>> forward this message to other experts at Sun, if needed. >>>>>> >>>>>> If one Hotspot engineer wants to guarantee the immediate >>>>>> visibility of a >>>>>> write to a static variable to reads in other threads (assuming >>>>>> the reads >>>>>> and the writes are properly synchronized via a mutex), what >>>>>> does s/he do? >>>>>> >>>>>> Does the use of MutexLocker guarantee the visibility? It >>>>>> probably does not. >>>>>> >>>>>> >>>>>> I believe that MutexLocker does guarantee the visibility: >>>>>> >>>>>> src/share/vm/runtime/mutexLocker.hpp: >>>>>> >>>>>> 133 // See orderAccess.hpp. We assume throughout the VM that >>>>>> MutexLocker's >>>>>> 134 // and friends constructors do a fence, a lock and an >>>>>> acquire *in that >>>>>> 135 // order*. And that their destructors do a release and >>>>>> unlock, in *that* >>>>>> 136 // order. If their implementations change such that these >>>>>> assumptions >>>>>> 137 // are violated, a whole lot of code will break. >>>>>> >>>>>> >>>>>> OK. That's handy. >>>>>> >>>>>> >>>>>> >>>>>> See src/share/vm/runtime/orderAccess.hpp for the gory details. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> There are several volatile variables in Hotpot, but it may not >>>>>> really >>>>>> work because C++ compilers do not guarantee the visibility or >>>>>> limit the >>>>>> instruction reordering. >>>>>> >>>>>> See the "C++ Volatility" section in >>>>>> src/share/vm/runtime/orderAccess.hpp. >>>>>> >>>>>> >>>>>> It's an interesting header file to read. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> For example, there is the static variable JvmtiEnvBase:_phase >>>>>> which >>>>>> indicates the JVMTI phase in which the JVM is currently in. >>>>>> But AFAICT, >>>>>> there is no synchronization used by the callers of get_phase() >>>>>> and >>>>>> set_phase() and apparently they can be called by multiple >>>>>> threads. Also, >>>>>> the JvmtiEventEnabled::_enabled_bits which is a jlong variable >>>>>> that >>>>>> indicates which events are enabled for a single jvmtiEnv. The >>>>>> writes and >>>>>> reads to it are not synchronized, either. These are race >>>>>> conditions, >>>>>> which implies that some JVMTI event can go off in the dead >>>>>> phase when no >>>>>> events are supposed to go off. I'm actually looking into a bug >>>>>> in a >>>>>> JVMTI agent that I suspect is due to this. >>>>>> >>>>>> >>>>>> It certainly looks like JvmtiEnvBase::_phase should minimally be a >>>>>> volatile; it may also require some sort of synchronization to force >>>>>> the updated values to be visible... >>>>>> >>>>>> JvmtiEventEnabled::_enabled_bits also looks problematic. I'm going >>>>>> to have to mull on both of these and talk to some other folks. >>>>>> >>>>>> >>>>>> The bug that I ran across was a memory corruption crash because a >>>>>> CompiledMethodLoad event happened to be sent (due to this race >>>>>> condition) >>>>>> after Agent_OnUnload call where I had already deallocated the agent's >>>>>> data >>>>>> objects. This happens very rarely (once in every more than 100k runs >>>>>> of >>>>>> a >>>>>> more-or-less javac invocation) but often enough to fix. >>>>>> >>>>>> I have a patch attached, which I used to suppress the bug by avoiding >>>>>> the >>>>>> events to be sent during the dead phase by adding synchronization. >>>>>> Basically >>>>>> I make the _phase variable volatile and I use a mutex to guard event >>>>>> callbacks in the all the post_xxx functions and the phase transition >>>>>> to >>>>>> the >>>>>> dead phase in post_vm_death(). Probably we don't need the volatile as >>>>>> long >>>>>> as the mutex guarantees the visibility. The other phase transitions >>>>>> (onload->primodial->start->live) may not need the same degree of >>>>>> synchronization because events that can be sent in a earlier phase can >>>>>> be >>>>>> sent in the next phase as well. >>>>>> >>>>>> The _enabled_bits and the callback function table will probably need a >>>>>> similar synchronization as well. But the attached patch does not >>>>>> implement >>>>>> it and only ensures that no events are sent during the dead phase. I'd >>>>>> like >>>>>> to get the fix into openjdk in one form or another. >>>>>> >>>>>> Thanks, >>>>>> Hiroshi >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Dan >>>>>> >>>>>> >>>>>> >>>>>> Thanks, >>>>>> Hiroshi >>>>>> >>>>>> >>>>>> >>>>>> >>> >>> > >
