Hi, I am currently working on removing (eventually) the vfork mode. Before we can do this, we need a bit better error diagnostics. I do this by gently improving the error handling throughout the code, so that we can generate IOExceptions based on more exact knowledge.
While working at this, I re-examined the "send aliveness ping from jspawnhelper to parent" logic again I introduced back in 2019 as a workaround for an obscure glibc bug with posix_spawn (see https://bugs.openjdk.org/browse/JDK-8223777). I found that it was not needed anymore since the glibc was fixed, so I started removing that workaround (https://bugs.openjdk.org/browse/JDK-8362257). But then it occurred to me that an obscure part of the jspawnhelper diagnostics introduced with https://bugs.openjdk.org/browse/JDK-8226242 piggy-backs on the aliveness ping for its "detect-abnormal-process-termination-before-exec" capabilities. Works like this: A jspawnhelper starts B jspawnhelper enters childProcess() and sends alivenes ping C jspawnhelper does a bunch of other things D jspawnhelper exec's In the parent, we count abnormal child terminations that occur before the aliveness ping (B) as "spawn error" and print the signal number. Without the aliveness ping we could not tell apart "jspawnhelper ends abnormally due to a signal" from "jspawnhelper exec()'s the user program successfully, user program runs and ends abnormally due to signal". However, the question is how important or even useful this part really is: - for externally sent abnormal termination signals (SIGTERM etc), from the caller's point of view it probably does not matter when it comes : before or after exec(). - it only matters for synchronous termination signals (crashes) we ourselves cause; but here it only covers crashes in a rather small piece of code, before the liveness ping (B). Basically, just the first part of jspawnhelper main(). Any crashes after the liveness ping are still unrecognised by ProcessBuilder.start, and are instead handled by the caller calling Process.waitFor(). There are two ways to deal with this: We could do without the crash protection in point (A), which would allow us to remove the liveness ping. I would very much prefer that. It would simplify the jspawnhelper protocol and make it more robust. Because we now don't have any communication in case no error happens - there would be only a single bit of information sent back via fail pipe, and only in case of an error. Fail pipe would stay quiet in case of successful exec(). Abnormal child process termination in the first part of jspawnhelper would be covered by the same path that detects abnormal child process termination in user programs - Process.waitFor(). If we determine we really need this crash detection, we could at least improve it: move the liveness ping to just-before the exec() call, so that we cover all the area from A-D. Also, do it for all modes (FORK, too), to simplify coding. Bottomline: remove an obscure and complex small mechanism that only helps a bit with detecting program errors (sigsegv etc) inside the first part of jspawnhelper main() ? Thanks, Thomas