On Wed, 30 Apr 2025 16:11:14 GMT, Aleksey Shipilev <sh...@openjdk.org> wrote:

>> When jspawnhelper fails for whatever reason, but more prominently due to 
>> [JDK-8325621](https://bugs.openjdk.org/browse/JDK-8325621), it will report 
>> the errors into stdout, but not to the relevant `IOException`. So, if the 
>> application is configured to only capture the exception logs (e.g. through 
>> `java.util.logging`), we will miss any output from `jspawnhelper`, and user 
>> would be left without a clue what have happened. We have seen customers 
>> spending weeks trying to figure out what went wrong.
>> 
>> It would be good to provide useful `IOException` when `jspawnhelper` fails. 
>> We already have a precedent with vmError [printing helpful 
>> suggestions](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/utilities/vmError.cpp#L422)
>>  when VM fails, we can do a similar thing here. 
>> 
>> I am very open to bike-shedding about the actual message :)
>> 
>> Additional testing:
>>  - [x] Ad-hoc experiments with breaking jspawnhelper
>>  - [x] macos-aarch64-server-fastdebug, `java/lang/Process 
>> java/lang/ProcessBuilder`
>
> Aleksey Shipilev has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains six additional 
> commits since the last revision:
> 
>  - Add bug ID
>  - Merge branch 'master' into JDK-8352533-jspawnhelper-ioexceptions
>  - Pull message into a field
>  - Test cases
>  - Merge branch 'master' into JDK-8352533-jspawnhelper-ioexceptions
>  - Initial fix
>    
>    Good spawnhelper failure message
>    
>    Trying tests
>    
>    Print helper message on all IOExceptions
>    
>    Final

Makes sense.

I recently looked into this because of signal handling issues. I planned to add 
logging to jspawnhelper. I had done similar work at SAP, and we found it really 
useful.

My very pragmatic plan was to define an environment variable that would name 
the name of a file; the variable's existence would then signal logging=on to 
jspawnhelper and it would log into this file. Since we can have process trees, 
every file name would get the pid of the process attached. Of course, care must 
be taken around exec (close the file before exec as to not inherit it; re-open 
it for some last logging after exec errors) and when passing around this 
variable (must be excluded from any environment var cleansing we do).

Bit tricky but no rocket science. I will do it when I find the time. 

Of course, all of that does not help customers who don't contact support; for 
them, the exception text is the sole info. This would be more of a tool for us 
than for them.

src/java.base/unix/native/libjava/ProcessImpl_md.c line 346:

> 344:         if (ret != EINVAL)
> 345:             detail = tmpbuf;
> 346:     }

Pre-existing, possibly as follow-up: 

I wonder whether we can do more. I don't like how we either only print out the 
errno string, or only print out the "Default" detail (I don't like the 
"default" prefix since it's really useful context information about this 
errno). 

For a problem on this side, we mostly pass in errno, then swallow the 
"defaultdetail". E.g. for posix_spawn failure, we'd print e.g. "12:Out of 
memory\nPossible reasons: ... " (or whatever localized string strerror returns).

For errors that happen inside the jspawnhelper (see line 790ff where we pass 0 
for the errno), we print out "0:Bad code from spawn helper\nPossible reasons: 
... ", so we swallow the errno we got from the jspawnhelper.

Could we print out instead the default, then the errno string, then the 
Possible reasons text? E.g.
"posix_spawn failed (12:Out of memory)\nPossible reasons: ..."
or
"Bad code from spawn helper (12:Out of memory)\nPossible reasons: ..." 

There is an issue of errnum values from the far side possibly intersecting with 
valid errno. Could be solved simply by assigning negative values to ERR_MALLOC, 
ERR_PIPE and ERR_ARGS. I am not aware of any errno values being negative. The 
errno printing would need to recognise these three values in addition to errno 
values. We also could remove the +3 workaround in the test.

The only argument I see against is some vague form of backward compatibility, 
in case people get confused about the new information, or that existing tools 
parse this output.

src/java.base/unix/native/libjava/ProcessImpl_md.c line 348:

> 346:     }
> 347:     /* ASCII Decimal representation uses 2.4 times as many bits as 
> binary. */
> 348:     fmtsize = sizeof(IOE_FORMAT) + strlen(internalErrorDetail) + 
> strlen(detail) + 3 * sizeof(errnum);

Suggestion:

    fmtsize = sizeof(IOE_FORMAT) + 3 * sizeof(errnum) + strlen(detail) +  
strlen(internalErrorDetail) + 1;

... just to appease my OCD.

src/java.base/unix/native/libjava/ProcessImpl_md.c line 351:

> 349:     errmsg = NEW(char, fmtsize);
> 350:     if (errmsg == NULL)
> 351:         return;

Preexisting: seems to be weird to just quietly swallow the allocation error.

src/java.base/unix/native/libjava/ProcessImpl_md.c line 377:

> 375:       throwIOExceptionImpl(env, errnum, defaultDetail, "");
> 376:   }
> 377: }

Why only for POSIX_SPAWN? We use jspawnhelper also for fork/vfork+exec.

-------------

PR Review: https://git.openjdk.org/jdk/pull/24149#pullrequestreview-2817819896
PR Review Comment: https://git.openjdk.org/jdk/pull/24149#discussion_r2075257935
PR Review Comment: https://git.openjdk.org/jdk/pull/24149#discussion_r2075220275
PR Review Comment: https://git.openjdk.org/jdk/pull/24149#discussion_r2075224053
PR Review Comment: https://git.openjdk.org/jdk/pull/24149#discussion_r2075211949

Reply via email to