On 8/19/07, Weldon Washburn <[EMAIL PROTECTED]> wrote:
> All,
>
> Four things.
>
> 1)
> I put a description of thread suspension race conditions in
> harmony-4644, HarmonyJVM_Thread_State_Diagrams.ppt. If the analysis
> is correct, there is a race condition where the suspend request thread
> can see a stale value of the target thread's disable_count. In other
> words, the suspend request thread can set the target thread->request
> field to true then (accidentally) read a stale value of
> thread->disable_count. The stale value misleads the requestor thread
> to think it is OK to enumerate the target's stack.
>
> 2)
> Below is rough code that shows an algorithm that might fix the above
> problem. The concept is to attempt to repair the existing code base
> rather than a complete redesign. Basically the idea is that the
> target thread will set disable_count = 1 *before* it checks
> thread->request. And the suspension requestor thread needs to wait
> for the HW store buffer to drain. I do not know of any HW
> architectural guarantees on the maximum clock cycles to drain the
> store buffer. I made an intial guess of 300Msec. It may be that
> 30msec is sufficient for many generations of future HW. I simply do
> not know at this point. Please see the below code for more details.
> (Even more details are in harmony-4644, AAA_thread_suspension.cpp.)
I guess it's not a good idea to sleep explicitly for 300ms. A minor
collection can be as fast as tens of ms.
> 3)
> A question for GC folks. What is the maximum sleep() time in the
> below code that is acceptable? Note that for GC suspending a single
> thread, we always need to wait for the HW store buffer to drain.
> However for stop-the-world GC, we do not neccessarily have to wait for
> each target thread's store buffer to drain. We can set thread->request
> on all threads, then do a single sleep(xx msec) to ensure the store
> buffers have drained.
To force the store buffer to drain, I think a safer (and more elegant)
way is to kill a signal to the target thread, so that the target
thread is interrupted.
I am afraid this kind of explicit timing control as 300msec sleep
should always be avoided.
> 4)
> Next steps -- clean up the below suspension kernel and run experiments
> on 4-way SMP to look for race conditions. I hope to do this today.
>
>
>
> ///////////////////////////////////////////////
>
> void mutator__hythread_suspend_disable()
> {
> hythread_t self = get_TLS_thread();
>
> // the following write will slowly dribble out to the coherency domain
> // we could add an mfence but this would slow down
> hythread_suspend_disable() execution (bad idea)
> // instead of an mfence() in the target thread, we sleep 300msec
> in the requestor thread to
> // allow the mutator's HW store buffer to empty
> self->disable_count++;
>
> if (self->disable_count == 1) {
> if (readLocation(&self->request) ) {
>
> self->disable_count = 0;
> self->now_blocked_at_disable_count_0_to_1 = true;
> mfence();
>
> while (self->j_l_t_suspend_request || self->gc_suspend_request) {
> // put a timeout to ensure a missed semaphore set does not
> // hang the system (graceful degradation)
> status = hysem_wait(self->resume_event, timout_3000msec);
> if (status == timeout && self->gc_suspend_request) log
> a warning/diagnostic somewhere
> }
>
> self->disable_count = 1;
> self->now_blocked_at_disable_count_0_to_1 = false;
> mfence();
> }
> else {
> // intentionally do nothing
> // eventually disable_count going from 0 to 1 will make it
> into the coherency domain
> }
> }
>
> void hythread_suspend_other(hythread_t thread, suspend_t kind)
> {
> if (kind == J_L_T_SUSPEND) {
> if (thread->j_l_t_suspend_request == true) return; //its
> already been dealt with
> thread->j_l_t_suspend_request = true;
> } else {
> assert(kind == GCSUSPEND);
> // only allow one GC at a time for now
> assert(thread->gc_suspend_request = false);
> thread->gc_suspend_request = true;
> }
> apr_atomic_inc32(&thread->request);
>
> // if the target thread just did a "disable_count = 1", the below
> sleep will allow
> // enough time so that the target thread's HW store buffer will
> empty into the
> // coherency domain
> sleep(300msec);
>
> //we now know we are not fetching stale version of disable_count
> from coherency domain
> while (readLocation(&thread->disable_count) != 0) {
> yield();
> }
> if ( (thread->now_blocked_at_back_branch == 0) &&
> (thread->now_blocked_at_disable_count_0_to_1 == 0) ) {
> assert(0);
> }
> }
>
--
http://xiao-feng.blogspot.com