Re: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6]

Thomas Stuefe Thu, 10 Mar 2022 10:08:12 -0800

On Thu, 10 Mar 2022 12:31:06 GMT, Anton Kozlov <akoz...@openjdk.org> wrote:


> > Signal handlers are per-process not per-thread, so a thread need not be 
> > attached to the VM for our signal handlers to get involved - that is why 
> > they call Thread::current_or_null_safe().
> 
> Oh, right, thanks. I was too concentrated on thinking another platforms like 
> windows, that missed the example will work for *nix.
> 
> A general question to the bug. The signal mask is per-thread, and a native 
> thread may block the JVM signal. I think safefetch will fail, if somehow we 
> manage to call it on this thread (without jsig). So I'm not sure the 
> safefetch is really safe on all platforms and in all contexts, that is, it 
> always recovers from the read failure if called on a random thread.

To expand on @dholmes-ora answer: blocking SIGSEGV and SIGBUS - or other 
synchronous error signals like SIGFPE - and then triggering said signal is UB. 
What happens is OS-dependent. I saw processes vanishing, or hang, or core. It 
makes sense, since what is the kernel supposed to do. It cannot deliver the 
signal, and deferring it would require returning to the faulting instruction, 
that would just re-fault. 

For some more details see e.g. https://bugs.openjdk.java.net/browse/JDK-8252533

> 
> Is there a crash that is fixed by the change? I just spotted it is an 
> enhancement, not a bug. Just trying to understand the problem.

Yes, this issue is a breakout from 
https://bugs.openjdk.java.net/browse/JDK-8282306, where we'd like to use 
SafeFetch to make stack walking in AsyncGetCallTrace more robust. AGCT is 
called from the signal handler, and it may run in any number of situations 
(e.g. in foreign threads, or threads which are in the process of getting 
dismantled, etc).

Another situation is error handling itself. When writing an hs-err file, we use 
SafeFetch to do carefully tiptoe around the possibly corrupt VM state. If the 
original crash happened in a foreign thread, we still want some of these 
reports to work (e.g. dumping register content or printing stacks). So 
SafeFetch should be as robust as possible.

> I'm not opposing the refactoring, it has some advantages, but I'd want to 
> separate functional change from the refactoring. This would aid backporting, 
> at least.

I agree that a minimal patch would be good. I feel partly guilty for this 
expanding discussion. I'm fine with a minimal change, without any refactoring, 
in whatever form @parttimenerd choses - be it OS thread local or the approach 
proposed by you, @AntonKozlov.

Cheers Thomas

-------------

PR: https://git.openjdk.java.net/jdk/pull/7727

Re: RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6]

Reply via email to