Dear David,
I took a look at all occurrences of PeriodicTask_lock. We are in luck:
PeriodicTask::enroll and PeriodicTask::disenroll check for safepoints when
acquiring PeriodicTask_lock.
PeriodicTask::time_to_wait and PeriodicTask::real_time_tick are called only
by the watcher thread.
WatcherThread::unpark is used only in contexts where PeriodicTask_lock is
owned by the caller.
WatcherThread::sleep is called only by the watcher thread.
I took another look at WatcherThread::sleep and realized that there is a
path from acquiring PeriodicTask_lock to waiting on the lock where
_should_terminate is not checked. I added that to the minimal fix attached.
Looking at these methods made me want to clean up a little more. See
better.patch attached.
I feel pretty good about the patch with the limited usage of
no_safepoint_check_flag with PeriodicTask_lock.
Carsten
On Tue, Feb 3, 2015 at 11:28 PM, David Holmes <[email protected]>
wrote:
> On 4/02/2015 1:28 PM, Carsten Varming wrote:
>
>> Dear David Holmes,
>>
>> Please see inline response,
>>
>> On Tue, Feb 3, 2015 at 9:38 PM, David Holmes <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>> On 4/02/2015 5:00 AM, Carsten Varming wrote:
>>
>> Greetings all,
>>
>> I was recently introduced to the deadlock described in
>> JDK-8047720 and
>> fixed in JDK9. The fix seems a little messy to me, and it looks
>> like it
>> left some very short races in the code. So I decided to clean up
>> the
>> code. See attached diff.
>>
>> The change takes a step back and acquires PeriodicTask_lock in
>> WatcherThread::stop (like before the fix in JDK9), but this time
>> safepoints are allowed to proceed when acquiring
>> PeriodicTask_lock,
>> preventing the deadlock.
>>
>>
>> It isn't obvious that blocking for a safepoint whilst in this method
>> will always be a safe thing to do. That would need to be examined in
>> detail - potential interactions can be very subtle.
>>
>>
>> Absolutely true. For reference, the pattern is repeated with the
>> Terminator_lock a few lines below. The pattern is also used in
>> Threads::destroy_vm before and after calling before_exit, and the java
>> shutdown hooks are called right before the call to before_exit. So there
>> is a lot of evidence that safepoints are allowed to happen in this
>> context.
>>
>
> The thread calling before_exit is a JavaThread so of course it
> participates in safepoints. The issue is whether the interaction between
> that thread and the WatcherThread, via the PeriodicTask_lock, can allow for
> the JavaThread blocking at a safepoint whilst owning that lock. If another
> JavaThread can try to lock it without a safepoint check then we can get a
> deadlock.
>
> Cheers,
> David
>
>
> Thank you for taking your time to look at this,
>> Carsten
>>
>>
>> David
>>
>>
>> Let me know what you think,
>> Carsten
>>
>>
>>
diff --git a/src/share/vm/runtime/task.cpp b/src/share/vm/runtime/task.cpp
--- a/src/share/vm/runtime/task.cpp
+++ b/src/share/vm/runtime/task.cpp
@@ -74,8 +74,7 @@
}
int PeriodicTask::time_to_wait() {
- MutexLockerEx ml(PeriodicTask_lock->owned_by_self() ?
- NULL : PeriodicTask_lock, Mutex::_no_safepoint_check_flag);
+ assert(PeriodicTask_lock->owned_by_self(), "precondition");
if (_num_tasks == 0) {
return 0; // sleep until shutdown or a task is enrolled
diff --git a/src/share/vm/runtime/thread.cpp b/src/share/vm/runtime/thread.cpp
--- a/src/share/vm/runtime/thread.cpp
+++ b/src/share/vm/runtime/thread.cpp
@@ -1198,6 +1198,10 @@
int WatcherThread::sleep() const {
MutexLockerEx ml(PeriodicTask_lock, Mutex::_no_safepoint_check_flag);
+ if (_should_terminate) {
+ return 0; // we did not sleep.
+ }
+
// remaining will be zero if there are no tasks,
// causing the WatcherThread to sleep until a task is
// enrolled
@@ -1252,7 +1256,7 @@
this->initialize_thread_local_storage();
this->set_native_thread_name(this->name());
this->set_active_handles(JNIHandleBlock::allocate_block());
- while (!_should_terminate) {
+ while (true) {
assert(watcher_thread() == Thread::current(), "thread consistency check");
assert(watcher_thread() == this, "thread consistency check");
@@ -1288,6 +1292,10 @@
}
}
+ if (_should_terminate) {
+ break;
+ }
+
PeriodicTask::real_time_tick(time_waited);
}
@@ -1318,24 +1326,20 @@
}
void WatcherThread::stop() {
- // Get the PeriodicTask_lock if we can. If we cannot, then the
- // WatcherThread is using it and we don't want to block on that lock
- // here because that might cause a safepoint deadlock depending on
- // what the current WatcherThread tasks are doing.
- bool have_lock = PeriodicTask_lock->try_lock();
-
_should_terminate = true;
- OrderAccess::fence(); // ensure WatcherThread sees update in main loop
-
- if (have_lock) {
+ {
+ // Let safepoints happen to avoid the deadlock:
+ // The VM thread has Threads_lock and waits for Java threads to report back for a safepoint.
+ // The watcher thread has PeriodicTask_lock and tries to acquire Threads_lock.
+ // A Java thread executes WatcherThread::stop, tries to acquire PeriodicTask_lock,
+ // and holds up the VM thread by not reaching a safepoint.
+ MutexLocker mu(PeriodicTask_lock);
+
WatcherThread* watcher = watcher_thread();
if (watcher != NULL) {
- // If we managed to get the lock, then we should unpark the
- // WatcherThread so that it can see we want it to stop.
+ // Unpark the WatcherThread so that it can see that it should terminate.
watcher->unpark();
}
-
- PeriodicTask_lock->unlock();
}
// it is ok to take late safepoints here, if needed
@@ -1358,9 +1362,7 @@
}
void WatcherThread::unpark() {
- MutexLockerEx ml(PeriodicTask_lock->owned_by_self()
- ? NULL
- : PeriodicTask_lock, Mutex::_no_safepoint_check_flag);
+ assert(PeriodicTask_lock->owned_by_self(), "precondition");
PeriodicTask_lock->notify();
}
diff --git a/src/share/vm/runtime/thread.cpp b/src/share/vm/runtime/thread.cpp
--- a/src/share/vm/runtime/thread.cpp
+++ b/src/share/vm/runtime/thread.cpp
@@ -1198,6 +1198,10 @@
int WatcherThread::sleep() const {
MutexLockerEx ml(PeriodicTask_lock, Mutex::_no_safepoint_check_flag);
+ if (_should_terminate) {
+ return 0; // we did not sleep.
+ }
+
// remaining will be zero if there are no tasks,
// causing the WatcherThread to sleep until a task is
// enrolled
@@ -1318,24 +1322,20 @@
}
void WatcherThread::stop() {
- // Get the PeriodicTask_lock if we can. If we cannot, then the
- // WatcherThread is using it and we don't want to block on that lock
- // here because that might cause a safepoint deadlock depending on
- // what the current WatcherThread tasks are doing.
- bool have_lock = PeriodicTask_lock->try_lock();
-
_should_terminate = true;
- OrderAccess::fence(); // ensure WatcherThread sees update in main loop
-
- if (have_lock) {
+ {
+ // Let safepoints happen to avoid the deadlock:
+ // The VM thread has Threads_lock and waits for Java threads to report back for a safepoint.
+ // The watcher thread has PeriodicTask_lock and tries to acquire Threads_lock.
+ // A Java thread executes WatcherThread::stop, tries to acquire PeriodicTask_lock,
+ // and holds up the VM thread by not reaching a safepoint.
+ MutexLocker mu(PeriodicTask_lock);
+
WatcherThread* watcher = watcher_thread();
if (watcher != NULL) {
- // If we managed to get the lock, then we should unpark the
- // WatcherThread so that it can see we want it to stop.
+ // Unpark the WatcherThread so that it can see that it should terminate.
watcher->unpark();
}
-
- PeriodicTask_lock->unlock();
}
// it is ok to take late safepoints here, if needed