Re: RFR 8059533: (process) Make exiting process wait for exiting threads [win]

Ivan Gerasimov Mon, 27 Oct 2014 01:36:41 -0700


On 27.10.2014 3:36, David Holmes wrote:

On 27/10/2014 1:15 AM, Ivan Gerasimov wrote:
David, would you approve this fix?
Sorry Ivan I'm having trouble following the logic this time - could you add some comments about what we are checking at each step.


Yes, sure.

The main idea is to make the thread that ends the process wait for the threads that have started exiting so far.

Thus, we have an array for storing the thread handles.

Any thread that is on the thread-exit path first tries to remove the completed threads from the array (to keep the list smaller), and then adds its own handle to the end of the array.

The thread that is on the process-exit path calls exit (or _exit) while still owning the critical section.
This way we make sure no other threads execute any exit-related code at the same time.


Here's a typical scenario:

1) The first thread that decides to end itself calls exit_process_or_thread() -- let's assume it is on the thread-exit path.
It initializes the critical section.
2) Grabs the ownership of the crit. section

3) The list of thread handles is initially empty, so the thread adds a duplicate of its handle to the array.

4) Releases the crit. section
5) Calls _endthreadex() to terminate itself

6) Another thread enters exit_process_or_thread() -- let it be on the thread-exit path as well.

7) Grabs the ownership of the crit. section
8) In a loop checks if any previously ended thread has completed.
Here we call WaitForSingleObject with zero timeout, so we don't block.
All the handles of completed threads are closed.

9) If there is a free slot in the array, the thread adds its handle to the end.

10) If the array is full (which is very unlikely), the thread waits for ANY thread to complete, and then adds itself to the array.

11) Releases the crit. section
12) Calls _endthreadex() to terminate itself

13) Some thread enters exit_process_or_thread() in order to end the whole process.

14) Grabs the ownership of the crit. section

15) Waits on all the threads that have added their handles to the array (typically there will be only one such thread handle).
Since the ownership of the critical section is held, no other threads will execute any exit-related code at this time.

16) Once all the threads from the list have completed, the thread closes the handles and calls exit() (or _exit()), holding the crit. section ownership.


We're done.
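To make the scenario above easier to follow, here is a rough portable model of the bookkeeping. It is only a sketch, not the code from the webrev: ExitRegistry, Handle, register_exiting_thread(), exit_process() and pending() are hypothetical names, a shared atomic flag stands in for a duplicated Win32 thread handle, std::mutex stands in for the critical section, and polling stands in for WaitForSingleObject / waiting for ANY thread. The time-outs are left out here.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a duplicated thread handle: the flag becomes
// true once the thread it represents has fully completed.
using Handle = std::shared_ptr<std::atomic<bool>>;

class ExitRegistry {
public:
    explicit ExitRegistry(std::size_t capacity) : capacity_(capacity) {
        assert(capacity_ > 0);
    }

    // Thread-exit path (steps 7-11): drop completed threads, then add self.
    void register_exiting_thread(Handle self) {
        std::lock_guard<std::mutex> guard(lock_);   // grab the "crit. section"
        prune_completed();                          // non-blocking check
        while (handles_.size() >= capacity_) {      // array full: wait for ANY
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
            prune_completed();
        }
        handles_.push_back(std::move(self));
        assert(handles_.size() <= capacity_);
    }                                               // lock released here

    // Process-exit path (steps 14-16): wait for all recorded threads, then
    // run the exit action while still holding the lock. In the real code
    // exit() does not return; here exit_fn returns so the model is testable.
    template <typename ExitFn>
    void exit_process(ExitFn exit_fn) {
        std::lock_guard<std::mutex> guard(lock_);
        for (const Handle& h : handles_) {
            while (!h->load()) {
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
            }
        }
        handles_.clear();                           // "close" the handles
        exit_fn();
    }

    // Number of handles currently recorded (for inspection only).
    std::size_t pending() {
        std::lock_guard<std::mutex> guard(lock_);
        return handles_.size();
    }

private:
    // Remove every handle whose thread has already completed.
    void prune_completed() {
        handles_.erase(std::remove_if(handles_.begin(), handles_.end(),
                                      [](const Handle& h) { return h->load(); }),
                       handles_.end());
    }

    std::mutex lock_;
    std::vector<Handle> handles_;
    std::size_t capacity_;
};
```

In this model an exiting thread would call register_exiting_thread() just before terminating, and something outside the model would set its flag once the thread is really gone; with real handles the OS signals that for us.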

Error handling: in case of errors, we report them and proceed with exiting as usual.
- If initialization of the critical section fails, we'll just call the corresponding exit routine.
- If we fail while waiting for an exiting thread to complete, we close its handle as if it had completed.
- If we fail while waiting for any thread to complete within a time-out (array is full), we close all the handles and continue as if no threads had exited before.
- If we couldn't duplicate the handle, we ignore it (don't add it to the array), so no one will wait for it later.
- If the thread on the process-exit path fails to wait for the threads to complete within the time-out, we proceed to the exit anyway.

All these errors should never happen during normal execution, but if they do, we still try to end threads/process in a way it's done now.

In this latter case, we are at risk of observing a race condition.

However, the chances of this happening are much lower, and in addition we'll have a warning message to analyze.


Possible bottlenecks.

1) All the threads have to obtain the ownership of the critical section, which effectively serializes all the exiting threads.
However, this doesn't appear to make things much slower, as all the threads already do a similar thing in _endthreadex().

2) Normally, the threads don't block having ownership of the crit. section.
The block can only happen if there's no free slot in the array of handles.

This can only happen if MAX_EXIT_HANDLES (== 16) threads have just called _endthreadex(), and none of them has completed.

3) When the thread on the process-exit path waits for all the exiting threads to complete, a time-out of 1 second is specified.
If any of those threads does not complete, the application's exit can be delayed.
However, we don't block forever, and the delay can only be observed upon a failure.
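The bounded wait on the process-exit path could be modelled roughly as below. Again this is only a sketch under assumptions: wait_all_with_timeout() and Handle are hypothetical names, an atomic flag stands in for a thread handle, and polling against one shared deadline stands in for the real wait call with its 1 second time-out.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical stand-in for a thread handle: true once the thread completed.
using Handle = std::shared_ptr<std::atomic<bool>>;

// Waits until every recorded thread has completed or the deadline passes,
// whichever comes first. Returns how many threads had NOT completed, so the
// caller can print a warning and proceed to exit anyway.
inline std::size_t wait_all_with_timeout(const std::vector<Handle>& handles,
                                         std::chrono::milliseconds timeout) {
    assert(timeout.count() >= 0);
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    for (;;) {
        std::size_t unfinished = 0;
        for (const Handle& h : handles) {
            if (!h->load()) {
                ++unfinished;
            }
        }
        if (unfinished == 0 || std::chrono::steady_clock::now() >= deadline) {
            return unfinished;   // all done, or we give up waiting
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
}
```

A non-zero result corresponds to the failure case: the exit is delayed by at most the time-out, a warning is reported, and the process still exits.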

Also we seem to exit while still holding the critical section - how does that work?

Right.

We make the thread on the process-exit path call exit() from within the critical section block.
This way it is ensured that no other exit-related code is executed at the same moment, and a race is avoided.


Sincerely yours,
Ivan

Thanks,
David

Sincerely yours,
Ivan

On 26.10.2014 19:01, Daniel D. Daugherty wrote:

On 10/25/14 12:23 PM, Ivan Gerasimov wrote:


On 25.10.2014 3:06, Daniel D. Daugherty wrote:

On 10/1/14 3:07 AM, Ivan Gerasimov wrote:

Hello!

The tests that continue to fail with wrong exit codes suggest that
the fix for JDK-8057744 wasn't sufficient.
Here's another proposal, which expands the synchronized portion of
the code.
It is proposed to make the exiting process wait for the threads
that have already started exiting.
This should help to make sure that no thread is executing any
potentially racy code concurrently with the exiting process.

BUGURL: https://bugs.openjdk.java.net/browse/JDK-8059533
WEBREV: http://cr.openjdk.java.net/~igerasim/8059533/0/webrev/


Finally got a chance to look at the official version of fix.

Thumbs up!

src/os/windows/vm/os_windows.cpp
    No comments.

Thank you Daniel!

I assume the change needs the second hotspot reviewer?


Yes, HotSpot changes always need two reviewers. David Holmes
chimed in on this thread. You should ask him if he can be
counted as a reviewer.

What would be the best time for pushing this fix?


Let's go for Wednesday again so we have a full week of testing
to evaluate this latest tweak.

Dan


Sincerely yours,
Ivan

Dan

P.S.
We had another sighting of an exit_code == 60115 test failure
this past week so while your previous fix greatly reduced the
odds of this race, I'm looking forward to seeing this new
version in action...


Comments and suggestions are welcome!

Sincerely yours,
Ivan
