Hi Martin,

On Mon, May 13, 2019 at 2:08 PM Martin Buchholz <marti...@google.com> wrote:

>
>
>> I am happy this is resolved and the intermittent behavior explained. Yes,
>> we could improve exception messages, especially since analyzing fork
>> scenarios is cumbersome.
>>
>
> I tried hard back in 2005 to provide pretty good Java-level diagnostics
> when subprocess starting failed somehow (see WhyCantJohnnyExec). At least
> the errno did get reported.
>
>
I know your code. For many years I wondered who Johnny is :)

We have a very similar solution in our port: we have our own error codes
(plus errno mixed in where it makes sense) for the many things that can go
wrong in the forkhelper. Maybe we can improve upon your solution a bit.
And/or add tracing for environment etc.

But here is one thing that I still do not understand about Remi's problem:

The theory is that the first exec(), starting jspawnhelper, went wrong with
NOACCESS, yes?

Man page for posix_spawn() states:

<quote>
       Upon successful completion, posix_spawn() and posix_spawnp() place
       the PID of the child process in pid, and return 0.  If there is an
       error before or during the fork(2), then no child is created, the
       contents of *pid are unspecified, and these functions return an error
       number as described below.

       Even when these functions return a success status, the child process
       may still fail for a plethora of reasons related to its pre-exec()
       initialization.  In addition, the exec(3) may fail.  In all of these
       cases, the child process will exit with the exit value of 127.
</quote>

To me this looks as if what should have happened is: posix_spawn() should
have returned with success, since the fork() went through. Then, the child
process (still inside posix_spawn()) attempts the exec and gets a NOACCESS.
Then, the child process should have ended with exit code 127. Your fail pipe
would never read an error code since we never entered the main function of
jspawnhelper. To the Java caller it should have looked like a very
short-lived process with exit code 127.

Obviously this is not what happened, since Remi reported an IOException
with an errno. So, where am I misunderstanding this?
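
A small experiment can show which of the two outcomes a given platform
produces. The sketch below is only an illustration, not JDK code:
spawn_and_classify and demo are hypothetical names, and Python's
os.posix_spawn stands in for the C call. It creates a file without the
execute bit and reports whether the failure surfaces as an error return
from posix_spawn() or as a child that exits with 127. Which one is observed
may depend on the libc in use.

```python
import os
import stat
import tempfile


def spawn_and_classify(path, argv):
    """Spawn `path` and report how a failure (if any) became visible."""
    try:
        pid = os.posix_spawn(path, argv, dict(os.environ))
    except OSError as e:
        # posix_spawn() itself returned an error number to the caller.
        return ("spawn_error", e.errno)
    # posix_spawn() reported success; any exec failure shows up in the
    # child's wait status instead.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return ("exit", os.WEXITSTATUS(status))
    return ("signal", os.WTERMSIG(status))


def demo():
    # Hypothetical stand-in for a helper binary that lost its execute bit.
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"#!/bin/sh\nexit 0\n")
        os.close(fd)
        os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)  # rw-------, no x bit
        return spawn_and_classify(path, [path])
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

A result of ("exit", 127) would match the man page text quoted above; a
("spawn_error", ...) result would mean the exec failure was handed back to
the caller directly.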


> I've had this little script around for ages:
>
> #!/bin/bash
> # -v: Print unabbreviated versions of environment, etc
>
> exec /usr/bin/strace -f -v -s 256 -e signal=none -e trace=process "$@"
>
>
We had all this as part of spawn traces. But this is a nice and neat idea.
Does it print the current directory?

Cheers, Thomas
