Re: RFR: 8279124: VirtualMachineImpl should check signal handler has installed before sending SIGQUIT

David Holmes Tue, 18 Jan 2022 00:03:06 -0800

On Mon, 17 Jan 2022 09:00:25 GMT, Kevin Walls <kev...@openjdk.org> wrote:


>> Hi @navyxliu,
>> 
>> nice catch. I can see how this can be annoying.
>> 
>> I propose a simpler and more robust way to fix it though. We (1) set up 
>> general hotspot signal handling very early, then (2) proceed to initialize 
>> the heap - which you have shown can take some time - then (3) set up SIGQUIT 
>> handling. We core if we get a quit signal before (3).
>> 
>> I would add SIGQUIT handling to the general signal handler too, just to 
>> cover the time frame between (1) and (3). At (3), it would be overwritten, 
>> but that would be fine. Before (3), we could just ignore SIGQUIT, or print 
>> out some generic information (I assume thread dumps are not yet possible at 
>> this stage).
>> 
>> Since the documented behavior for the JVM is to threaddump on SIGQUIT unless 
>> we run with -Xrs, I think this is acceptable behavior. Not printing thread 
>> dump or only printing partial information is preferable to quitting with 
>> core.
>> 
>> Then, this would work for any client that sends sigquit to a JVM, not just 
>> those using the attach framework. And it would work on all Posix platforms, 
>> not just Linux. And we'd would not have to rely on parsing the proc fs.
>> 
>> Als note that a solution implemented in the client as you did has the 
>> disadvantage that I need a modern jcmd for it to work. However, I often just 
>> use whatever jcmd is in the path. Better to handle this in the receiving 
>> hotspot.
>> 
>> I sketched out a simple patch to test if what I propose can work:
>> 
>> 
>> diff --git a/src/hotspot/os/posix/signals_posix.cpp 
>> b/src/hotspot/os/posix/signals_posix.cpp
>> index 2c020a79408..3f0dd91e03b 100644
>> --- a/src/hotspot/os/posix/signals_posix.cpp
>> +++ b/src/hotspot/os/posix/signals_posix.cpp
>> @@ -615,6 +615,11 @@ int JVM_HANDLE_XXX_SIGNAL(int sig, siginfo_t* info,
>>  #endif // ZERO
>>    }
>>  
>> +if (sig == BREAK_SIGNAL) {
>> +  printf("too early for thread dumps...\n");
>> +  signal_was_handled = true;
>> +}
>> +
>>    // Ignore SIGPIPE and SIGXFSZ (4229104, 6499219).
>>    if (!signal_was_handled &&
>>        (sig == SIGPIPE || sig == SIGXFSZ)) {
>> @@ -1279,6 +1284,7 @@ void install_signal_handlers() {
>>    set_signal_handler(SIGFPE);
>>    PPC64_ONLY(set_signal_handler(SIGTRAP);)
>>    set_signal_handler(SIGXFSZ);
>> +  set_signal_handler(BREAK_SIGNAL); // just for initialization phase
>> 
>> It still misses a number of things (I did not check signal mask setup and 
>> ReduceSignalUsage must be handled too), but it shows the general direction 
>> and worked as expected.
>> 
>> Cheers, Thomas
>
>> I propose a simpler and more robust way to fix it though
> 
> Great, this is the kind of thing I was heading towards with the conversation 
> in the bug text.  Although not sure why I could not reproduce the problem, 
> with various different JDK versions.

Apologies @kevinjwalls  as I also missed the discussion in the JBS issue.

Ideally we would know if the target VM is ready before we send the SIGQUIT to 
attach - e.g. writing a well-known file that the attach mechanism looks for 
before trying to attach. That seems feasible but perhaps costly and not always 
possible(?) ...

What has been presented here is a side-channel way of knowing. In that respect 
I like this. It is a pity it is Linux only.

The alternative suggestions of just making the window during which SIGQUIT 
terminates the VM process small enough to be un-hittable, also has merit. I 
don't think we have to try and accommodate extreme code that just looks up a 
process in the process table and throws a signal at it. Ignoring SIGQUIT during 
the early VM startup seems a reasonable solutions (we can't produce a thread 
dump at that time anyway). It seems to me that we can simply install 
UserHandler for SIGQUIT very early in the VM initialization process and it will 
be a no-op until the JDK signal handling is properly initialized (just need to 
fix one assert).

Cheers,
David

-------------

PR: https://git.openjdk.java.net/jdk/pull/7003

Re: RFR: 8279124: VirtualMachineImpl should check signal handler has installed before sending SIGQUIT

Reply via email to