On Tue, 29 Sep 2020 14:39:23 GMT, Zhengyu Gu <z...@openjdk.org> wrote:
>>> Hi Erik, >>> >>> I have been playing with this patch for past a few days. Great work! >>> >>> I found that this patch seems to break an early assumption. >>> We have a comment in JavaThread::exit() says: >>> >>> // We need to cache the thread name for logging purposes below as once >>> // we have called on_thread_detach this thread must not access any oops. >>> >>> Then in method : >>> >>> ``` >>> >>> void Threads::remove(JavaThread* p, bool is_daemon) { >>> ... >>> BarrierSet::barrier_set()->on_thread_detach(p); >>> >>> // Extra scope needed for Thread_lock, so we can check >>> // that we do not remove thread without safepoint code notice >>> { MonitorLocker ml(Threads_lock); >>> .. >>> } >>> ``` >>> >>> It calls barrier's on_thread_detach(), acquires Threads_lock. >>> The lock acquisition triggers stack processing, that potential accesses >>> oops. >>> >>> ``` >>> >>> V [libjvm.so+0x10c6f5e] StackWatermark::start_processing()+0x6a >>> V [libjvm.so+0x10c77e8] StackWatermarkSet::start_processing(JavaThread*, >>> StackWatermarkKind)+0x82 >>> V [libjvm.so+0xfd7757] SafepointMechanism::process(JavaThread*)+0x37 >>> V [libjvm.so+0xfd796b] >>> SafepointMechanism::process_if_requested_slow(JavaThread*)+0x1d >>> V [libjvm.so+0x4b3683] >>> SafepointMechanism::process_if_requested(JavaThread*)+0x2b >>> V [libjvm.so+0xe87f0d] >>> ThreadBlockInVMWithDeadlockCheck::~ThreadBlockInVMWithDeadlockCheck()+0x5f >>> V [libjvm.so+0xe86700] Mutex::lock_contended(Thread*)+0x12c >>> V [libjvm.so+0xe867d8] Mutex::lock(Thread*)+0x96 >>> V [libjvm.so+0xe86823] Mutex::lock()+0x23 >>> V [libjvm.so+0x29b4bc] MutexLocker::MutexLocker(Mutex*, >>> Mutex::SafepointCheckFlag)+0xe2 >>> V [libjvm.so+0x29b533] MonitorLocker::MonitorLocker(Monitor*, >>> Mutex::SafepointCheckFlag)+0x29 >>> V [libjvm.so+0x119f2ce] Threads::remove(JavaThread*, bool)+0x56 >>> V [libjvm.so+0x1198a2b] JavaThread::exit(bool, JavaThread::ExitType)+0x905 >>> V [libjvm.so+0x1197fde] JavaThread::post_run()+0x22 >>> V [libjvm.so+0x1193eae] Thread::call_run()+0x230 >>> V [libjvm.so+0xef3e38] thread_native_entry(Thread*)+0x1e4 >>> ``` >>> >>> This is a problem for Shenandoah, as it flushes SATB buffers during >>> on_thread_detach() and does not expect to see any >>> more SATB traffic. >>> Thanks. >> >> What oop are you encountering here? You should have no frames left at this >> point, and all oops should have been >> cleared. At least that is the theory, and that was why the thread oop moved >> out from the thread (to enforce that). So I >> am curious what oop you have found to still be around at this point. >> Anyway, you can try moving the GC hook into the >> critical section. That should help. > >> > Hi Erik, >> > I have been playing with this patch for past a few days. Great work! >> > I found that this patch seems to break an early assumption. >> > We have a comment in JavaThread::exit() says: >> > // We need to cache the thread name for logging purposes below as once >> > // we have called on_thread_detach this thread must not access any oops. >> > Then in method : >> > ``` >> > >> > void Threads::remove(JavaThread* p, bool is_daemon) { >> > ... >> > BarrierSet::barrier_set()->on_thread_detach(p); >> > >> > // Extra scope needed for Thread_lock, so we can check >> > // that we do not remove thread without safepoint code notice >> > { MonitorLocker ml(Threads_lock); >> > .. >> > } >> > ``` >> > >> > >> > It calls barrier's on_thread_detach(), acquires Threads_lock. >> > The lock acquisition triggers stack processing, that potential accesses >> > oops. >> > ``` >> > >> > V [libjvm.so+0x10c6f5e] StackWatermark::start_processing()+0x6a >> > V [libjvm.so+0x10c77e8] StackWatermarkSet::start_processing(JavaThread*, >> > StackWatermarkKind)+0x82 >> > V [libjvm.so+0xfd7757] SafepointMechanism::process(JavaThread*)+0x37 >> > V [libjvm.so+0xfd796b] >> > SafepointMechanism::process_if_requested_slow(JavaThread*)+0x1d >> > V [libjvm.so+0x4b3683] >> > SafepointMechanism::process_if_requested(JavaThread*)+0x2b >> > V [libjvm.so+0xe87f0d] >> > ThreadBlockInVMWithDeadlockCheck::~ThreadBlockInVMWithDeadlockCheck()+0x5f >> > V [libjvm.so+0xe86700] Mutex::lock_contended(Thread*)+0x12c >> > V [libjvm.so+0xe867d8] Mutex::lock(Thread*)+0x96 >> > V [libjvm.so+0xe86823] Mutex::lock()+0x23 >> > V [libjvm.so+0x29b4bc] MutexLocker::MutexLocker(Mutex*, >> > Mutex::SafepointCheckFlag)+0xe2 >> > V [libjvm.so+0x29b533] MonitorLocker::MonitorLocker(Monitor*, >> > Mutex::SafepointCheckFlag)+0x29 >> > V [libjvm.so+0x119f2ce] Threads::remove(JavaThread*, bool)+0x56 >> > V [libjvm.so+0x1198a2b] JavaThread::exit(bool, >> > JavaThread::ExitType)+0x905 >> > V [libjvm.so+0x1197fde] JavaThread::post_run()+0x22 >> > V [libjvm.so+0x1193eae] Thread::call_run()+0x230 >> > V [libjvm.so+0xef3e38] thread_native_entry(Thread*)+0x1e4 >> > ``` >> > >> > >> > This is a problem for Shenandoah, as it flushes SATB buffers during >> > on_thread_detach() and does not expect to see any >> > more SATB traffic. Thanks. >> >> What oop are you encountering here? You should have no frames left at this >> point, and all oops should have been >> cleared. At least that is the theory, and that was why the thread oop moved >> out from the thread (to enforce that). So I >> am curious what oop you have found to still be around at this point. > > They are handles: > <pre><code> > V [libjvm.so+0x98d693] chunk_oops_do(OopClosure*, Chunk*, char*)+0xce > V [libjvm.so+0x98d6d3] HandleArea::oops_do(OopClosure*)+0x37 > V [libjvm.so+0x1194ad0] Thread::oops_do_no_frames(OopClosure*, > CodeBlobClosure*)+0x88 > V [libjvm.so+0x119b0c8] JavaThread::oops_do_no_frames(OopClosure*, > CodeBlobClosure*)+0x88 > </code></pre> > >> >> Anyway, you can try moving the GC hook into the critical section. That >> should help. > > This seems to work. > > > Hi Erik, > > > I have been playing with this patch for past a few days. Great work! > > > I found that this patch seems to break an early assumption. > > > We have a comment in JavaThread::exit() says: > > > // We need to cache the thread name for logging purposes below as once > > > // we have called on_thread_detach this thread must not access any oops. > > > Then in method : > > > ``` > > > > > > void Threads::remove(JavaThread* p, bool is_daemon) { > > > ... > > > BarrierSet::barrier_set()->on_thread_detach(p); > > > > > > // Extra scope needed for Thread_lock, so we can check > > > // that we do not remove thread without safepoint code notice > > > { MonitorLocker ml(Threads_lock); > > > .. > > > } > > > ``` > > > > > > > > > It calls barrier's on_thread_detach(), acquires Threads_lock. > > > The lock acquisition triggers stack processing, that potential accesses > > > oops. > > > ``` > > > > > > V [libjvm.so+0x10c6f5e] StackWatermark::start_processing()+0x6a > > > V [libjvm.so+0x10c77e8] > > > StackWatermarkSet::start_processing(JavaThread*, StackWatermarkKind)+0x82 > > > V [libjvm.so+0xfd7757] SafepointMechanism::process(JavaThread*)+0x37 > > > V [libjvm.so+0xfd796b] > > > SafepointMechanism::process_if_requested_slow(JavaThread*)+0x1d > > > V [libjvm.so+0x4b3683] > > > SafepointMechanism::process_if_requested(JavaThread*)+0x2b > > > V [libjvm.so+0xe87f0d] > > > ThreadBlockInVMWithDeadlockCheck::~ThreadBlockInVMWithDeadlockCheck()+0x5f > > > V [libjvm.so+0xe86700] Mutex::lock_contended(Thread*)+0x12c > > > V [libjvm.so+0xe867d8] Mutex::lock(Thread*)+0x96 > > > V [libjvm.so+0xe86823] Mutex::lock()+0x23 > > > V [libjvm.so+0x29b4bc] MutexLocker::MutexLocker(Mutex*, > > > Mutex::SafepointCheckFlag)+0xe2 > > > V [libjvm.so+0x29b533] MonitorLocker::MonitorLocker(Monitor*, > > > Mutex::SafepointCheckFlag)+0x29 > > > V [libjvm.so+0x119f2ce] Threads::remove(JavaThread*, bool)+0x56 > > > V [libjvm.so+0x1198a2b] JavaThread::exit(bool, > > > JavaThread::ExitType)+0x905 > > > V [libjvm.so+0x1197fde] JavaThread::post_run()+0x22 > > > V [libjvm.so+0x1193eae] Thread::call_run()+0x230 > > > V [libjvm.so+0xef3e38] thread_native_entry(Thread*)+0x1e4 > > > ``` > > > > > > > > > This is a problem for Shenandoah, as it flushes SATB buffers during > > > on_thread_detach() and does not expect to see any > > > more SATB traffic. Thanks. > > > > > > What oop are you encountering here? You should have no frames left at this > > point, and all oops should have been > > cleared. At least that is the theory, and that was why the thread oop moved > > out from the thread (to enforce that). So I > > am curious what oop you have found to still be around at this point. > > They are handles: > > ``` > > V [libjvm.so+0x98d693] chunk_oops_do(OopClosure*, Chunk*, char*)+0xce > V [libjvm.so+0x98d6d3] HandleArea::oops_do(OopClosure*)+0x37 > V [libjvm.so+0x1194ad0] Thread::oops_do_no_frames(OopClosure*, > CodeBlobClosure*)+0x88 > V [libjvm.so+0x119b0c8] JavaThread::oops_do_no_frames(OopClosure*, > CodeBlobClosure*)+0x88 > ``` > > > Anyway, you can try moving the GC hook into the critical section. That > > should help. > > This seems to work. Okay, I will move it. ------------- PR: https://git.openjdk.java.net/jdk/pull/296