On Wed, 30 Apr 2025 16:11:14 GMT, Aleksey Shipilev <sh...@openjdk.org> wrote:
>> When jspawnhelper fails for whatever reason, but more prominently due to >> [JDK-8325621](https://bugs.openjdk.org/browse/JDK-8325621), it will report >> the errors into stdout, but not to the relevant `IOException`. So, if the >> application is configured to only capture the exception logs (e.g. through >> `java.util.logging`), we will miss any output from `jspawnhelper`, and user >> would be left without a clue what have happened. We have seen customers >> spending weeks trying to figure out what went wrong. >> >> It would be good to provide useful `IOException` when `jspawnhelper` fails. >> We already have a precedent with vmError [printing helpful >> suggestions](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/utilities/vmError.cpp#L422) >> when VM fails, we can do a similar thing here. >> >> I am very open to bike-shedding about the actual message :) >> >> Additional testing: >> - [x] Ad-hoc experiments with breaking jspawnhelper >> - [x] macos-aarch64-server-fastdebug, `java/lang/Process >> java/lang/ProcessBuilder` > > Aleksey Shipilev has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains six additional > commits since the last revision: > > - Add bug ID > - Merge branch 'master' into JDK-8352533-jspawnhelper-ioexceptions > - Pull message into a field > - Test cases > - Merge branch 'master' into JDK-8352533-jspawnhelper-ioexceptions > - Initial fix > > Good spawnhelper failure message > > Trying tests > > Print helper message on all IOExceptions > > Final Makes sense. I recently looked into this because of signal handling issues. I planned to add logging to jspawnhelper. I had done similar work at SAP, and we found it really useful. My very pragmatic plan was to define an environment variable that would name the name of a file; the variable's existence would then signal logging=on to jspawnhelper and it would log into this file. Since we can have process trees, every file name would get the pid of the process attached. Of course, care must be taken around exec (close the file before exec as to not inherit it; re-open it for some last logging after exec errors) and when passing around this variable (must be excluded from any environment var cleansing we do). Bit tricky but no rocket science. I will do it when I find the time. Of course, all of that does not help customers who don't contact support; for them, the exception text is the sole info. This would be more of a tool for us than for them. src/java.base/unix/native/libjava/ProcessImpl_md.c line 346: > 344: if (ret != EINVAL) > 345: detail = tmpbuf; > 346: } Pre-existing, possibly as follow-up: I wonder whether we can do more. I don't like how we either only print out the errno string, or only print out the "Default" detail (I don't like the "default" prefix since it's really useful context information about this errno). For a problem on this side, we mostly pass in errno, then swallow the "defaultdetail". E.g. for posix_spawn failure, we'd print e.g. "12:Out of memory\nPossible reasons: ... " (or whatever localized string strerror returns). For errors that happen inside the jspawnhelper (see line 790ff where we pass 0 for the errno), we print out "0:Bad code from spawn helper\nPossible reasons: ... ", so we swallow the errno we got from the jspawnhelper. Could we print out instead the default, then the errno string, then the Possible reasons text? E.g. "posix_spawn failed (12:Out of memory)\nPossible reasons: ..." or "Bad code from spawn helper (12:Out of memory)\nPossible reasons: ..." There is an issue of errnum values from the far side possibly intersecting with valid errno. Could be solved simply by assigning negative values to ERR_MALLOC, ERR_PIPE and ERR_ARGS. I am not aware of any errno values being negative. The errno printing would need to recognise these three values in addition to errno values. We also could remove the +3 workaround in the test. The only argument I see against is some vague form of backward compatibility, in case people get confused about the new information, or that existing tools parse this output. src/java.base/unix/native/libjava/ProcessImpl_md.c line 348: > 346: } > 347: /* ASCII Decimal representation uses 2.4 times as many bits as > binary. */ > 348: fmtsize = sizeof(IOE_FORMAT) + strlen(internalErrorDetail) + > strlen(detail) + 3 * sizeof(errnum); Suggestion: fmtsize = sizeof(IOE_FORMAT) + 3 * sizeof(errnum) + strlen(detail) + strlen(internalErrorDetail) + 1; ... just to appease my OCD. src/java.base/unix/native/libjava/ProcessImpl_md.c line 351: > 349: errmsg = NEW(char, fmtsize); > 350: if (errmsg == NULL) > 351: return; Preexisting: seems to be weird to just quietly swallow the allocation error. src/java.base/unix/native/libjava/ProcessImpl_md.c line 377: > 375: throwIOExceptionImpl(env, errnum, defaultDetail, ""); > 376: } > 377: } Why only for POSIX_SPAWN? We use jspawnhelper also for fork/vfork+exec. ------------- PR Review: https://git.openjdk.org/jdk/pull/24149#pullrequestreview-2817819896 PR Review Comment: https://git.openjdk.org/jdk/pull/24149#discussion_r2075257935 PR Review Comment: https://git.openjdk.org/jdk/pull/24149#discussion_r2075220275 PR Review Comment: https://git.openjdk.org/jdk/pull/24149#discussion_r2075224053 PR Review Comment: https://git.openjdk.org/jdk/pull/24149#discussion_r2075211949