Re: How to report a death by signal ?

2015-02-18 Thread Olivier Brunel
On 02/18/15 14:20, Laurent Bercot wrote:
 On 18/02/2015 14:04, Olivier Brunel wrote:
 I don't follow, what's wrong with using a fd?
 
  It needs a convention between G and P. And I can't do that, because
 G and P are not necessarily both execline commands. They are normal
 Unix programs, and the whole point of execline is to have commands
 that work transparently in any environment, with only the Unix argv
 and envp as conventions.

But isn't the whole "anything >= 128 will be reported as 128, and
anything higher is actually 128+signum" also a convention that both need
to agree upon?

P will do something to report info about C to G, and P and G need to
agree about how said info will be reported. Using 128+signum is one way,
using an fd for the full/correct info is another.

The latter being an option, it wouldn't change what P returns, but would
be an additional means to provide accurate information to grandparent
process G should it be needed.
Or just, like shells, assume it's not needed and only use the
128+signum convention.

Noting that shells do not actually clamp the exit code to 128. As
illustrated by Peter's example, shells return the exit code (up to 255
included), or 128+signum.
So assuming no signal, you get the accurate exit code. But of course,
with anything higher than 128 there's no way of knowing if it was an
exit code or a signal (unless you know exit codes don't go that high).
(Clamping provides better results though, so I'm not saying don't do it;
just that the difference should probably be pointed out/documented.)
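
The ambiguity described above can be sketched in Python. This is only an illustration, assuming a Unix system; `shell_style_status` is a hypothetical helper name that mimics what a shell puts in `$?`, and `subprocess` reports death by signal N as a negative return code -N.

```python
import signal
import subprocess
import sys


def shell_style_status(returncode: int) -> int:
    """Mimic a shell's $?: the plain exit code, or 128+signum after a signal."""
    if returncode < 0:  # subprocess encodes "killed by signal N" as -N
        return 128 + (-returncode)
    return returncode


# Child 1 exits normally, but with a code equal to 128 + SIGKILL.
exit_code = 128 + int(signal.SIGKILL)
exited = subprocess.run([sys.executable, "-c", f"import sys; sys.exit({exit_code})"])

# Child 2 is actually killed by SIGKILL.
killed = subprocess.run(
    [sys.executable, "-c", "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"]
)

# Both children look identical once reduced to a shell-style status.
print(shell_style_status(exited.returncode), shell_style_status(killed.returncode))
```

Both values printed are the same number, so a caller that only sees the shell-style status cannot tell the normal exit from the signal.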


 Cause that was my idea as well: return the exit code or 255.
 
  I was considering it for a while, then figured that the signal number
 is interesting information to have, if G remotely cares about
 C crashing. I prefer to reserve the whole range of 128+ for
 "something went very wrong, most likely a crash at some point", and
 if you get 129+, it was directly below you and you get the signal
 number.
 
 
 Though if you want shell compatibility you could also have an option
 to return exit code, or 128+signum when signaled, and similarly one
 would either be fine with that, or have to use the fd for full/complete
 info.
 
  Programs that can establish a convention between one another are easy
 to deal with. If I remember to document the convention (finish scripts
 *whistle*)
 



Re: How to report a death by signal ?

2015-02-18 Thread Laurent Bercot

On 18/02/2015 14:55, Olivier Brunel wrote:

But isn't the whole "anything >= 128 will be reported as 128, and
anything higher is actually 128+signum" also a convention that both need
to agree upon?


 Sure, but most commands exit < 128 so that's reliable enough, and it's
a lot easier to follow than the whole pipe shebang. It's much, much
simpler to exit with a given code than to write stuff to a pipe (what
do you do if it blocks ? what do you do if you're fd-constrained ?
what do you do if setting up the plumbing in the parent fails for
whatever reason ? etc. etc.)



Noting that shells do not actually clamp the exit code to 128.


 Indeed, but it comes at the price of uncertainty - you get
accurate information if you're lucky, and complete misinformation
if you're not. It works for shells most of the time because you
don't manually nest shells - it's much riskier for execline.



Just the difference should probably be pointed out/documented.)


 Definitely.

--
 Laurent



Re: How to report a death by signal ?

2015-02-18 Thread Laurent Bercot

On 18/02/2015 14:20, Peter Pentchev wrote:

[roam@straylight ~]$ perl -e 'die("foo!\n");'; echo $?
foo!
255


 I think you should be ok, for the same reason why a shell is ok:
if you're using Perl, you're most likely writing your whole script
with it, especially control flow and error/crash checking.
You're not playing with an inner interpreter reporting a code to an
outer interpreter. So the weird 255 should not be a problem in
practice.

 If I'm wrong and your use case precisely involves a perl script
running as P or C with G being an execline command, please mention
it! Just because I'd be curious. :)

--
 Laurent



Re: How to report a death by signal ?

2015-02-18 Thread Laurent Bercot

On 18/02/2015 11:58, Peter Pentchev wrote:

OK, so the "not using the whole range of valid exit codes" point rules
out my obvious reply - "do what the shell does - exit 128 + signum".


 Well the shell is happily ignoring the problem, but it doesn't mean
it has solved it. The shell reserves a few exit codes, then does some
best effort, hoping its invoked commands do not step on its feet.
It works because most commands will avoid exiting something > 125,
but it's still a convention, and most importantly, the shell itself
does not follow that convention (it obviously cannot!)
 So, something like sh -c "sh -c foobar" does not report errors
properly: for 126 and 127, there's no way to know if the code belongs
to the inner shell or the outer shell, and for 128+, there's no way
to know if the inner shell or the foobar process got killed.
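
A small Python sketch of the 127 case, assuming a POSIX `sh` is on the PATH. The paths `/nonexistent/sh` and `/nonexistent/foobar` are made-up names chosen so that the lookup fails, and `status_of` is a hypothetical helper.

```python
import subprocess


def status_of(script: str) -> int:
    """Run `script` under an outer `sh -c` and return the outer shell's exit code."""
    return subprocess.run(
        ["sh", "-c", script], stderr=subprocess.DEVNULL
    ).returncode


# The OUTER shell cannot find the program it was asked to run.
outer_failed = status_of("/nonexistent/sh -c true")

# The outer shell starts an INNER shell fine, and the inner shell
# cannot find foobar; the outer shell passes the code along.
inner_failed = status_of("sh -c /nonexistent/foobar")

print(outer_failed, inner_failed)
```

Both calls report 127, so the caller cannot tell which level of nesting failed.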

 Shells get away with it because when they're nested, it's usually
auto-subshell magic and users don't want to know about the inner
shell; but here, I'm trying to solve the problem for execline commands,
and those tend to be nested a lot - so I definitely cannot reserve codes
for the outer command, because the inner command may very well use the
same ones too.



Now the question is, do you want to solve this problem in general, or do
you want to solve it for a particular combination of programs, even if
new programs may be added to that combination in the future, but only
under certain rules?  If it's the former (in general), then, sorry, I
don't have a satisfactory answer for you, and the fact that the POSIX
shell still keeps the "exit 128 + signum" behavior mostly means that
nobody else has come up with a better one, either (or it might be
available at least as some kind of an option).


 It just means that nobody cares about shell exit codes. Error handling,
if any, is done inside of shell scripts anyway; and in most scripts, a
random signal killing a running command isn't even something people think
about, and I'm sure there are hilarious behaviours hiding in dark corners
of very popular shell scripts, that fortunately remain asleep to this day.

 For execline, however, I cannot use the same casual approach. Execline
scripts live and die by proper exit code reporting, and carelessness may
lead to very obvious breakage.



Personally, I quite like the idea of some kind of a pipe (be it a
pipe(2) pair of file descriptors or an AF_UNIX/PF_UNSPEC socketpair or
some other kind of communication channel based on file descriptors),
even if it is only unidirectional:


 Oh, don't get me wrong, I'm a fan of child-to-parent communication via
pipes, and I use it wherever applicable. Unfortunately, the child may
be anything here, so I need something generic.
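
For reference, the pipe-based reporting discussed here might look like the following Python sketch. It is only an illustration: the two-byte message format (`b"s"` or `b"e"` followed by one byte) and the function names `run_p` and `run_g` are invented for this example. It shows why the approach needs an agreement between G and P: both must know which fd is used and how the message is laid out.

```python
import os
import signal
import struct


def run_p(report_fd: int) -> None:
    """Role P: run child C, report C's fate on report_fd, then exit."""
    pid = os.fork()
    if pid == 0:
        # Role C: die from a signal that cannot be caught.
        os.kill(os.getpid(), signal.SIGKILL)
        os._exit(0)  # not reached
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        signum = os.WTERMSIG(status)
        message = struct.pack("cB", b"s", signum)  # "s" = killed by signal
        code = 128 + signum
    else:
        code = os.WEXITSTATUS(status)
        message = struct.pack("cB", b"e", code)  # "e" = exited normally
    os.write(report_fd, message)
    os._exit(code)


def run_g():
    """Role G: start P with a pipe; return (P's exit code, kind, value)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(read_fd)
        run_p(write_fd)  # never returns
    os.close(write_fd)
    data = os.read(read_fd, 2)
    os.close(read_fd)
    _, status = os.waitpid(pid, 0)
    kind, value = struct.unpack("cB", data)
    return os.WEXITSTATUS(status), kind.decode(), value


exit_code, kind, value = run_g()
print(exit_code, kind, value)
```

Here G gets the unambiguous pair (kind, value) from the pipe in addition to P's exit code, but only because G and P were written together. A P that is an arbitrary Unix program would not know about the pipe.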

 Thanks for your input !

--
 Laurent



Re: How to report a death by signal ?

2015-02-18 Thread Laurent Bercot

On 18/02/2015 14:04, Olivier Brunel wrote:

I don't follow, what's wrong with using a fd?


 It needs a convention between G and P. And I can't do that, because
G and P are not necessarily both execline commands. They are normal
Unix programs, and the whole point of execline is to have commands
that work transparently in any environment, with only the Unix argv
and envp as conventions.



Cause that was my idea as well: return the exit code or 255.


 I was considering it for a while, then figured that the signal number
is interesting information to have, if G remotely cares about
C crashing. I prefer to reserve the whole range of 128+ for
"something went very wrong, most likely a crash at some point", and
if you get 129+, it was directly below you and you get the signal
number.



Though if you want shell compatibility you could also have an option
to return exit code, or 128+signum when signaled, and similarly one
would either be fine with that, or have to use the fd for full/complete
info.


 Programs that can establish a convention between one another are easy
to deal with. If I remember to document the convention (finish scripts
*whistle*)

--
 Laurent



Re: How to report a death by signal ?

2015-02-18 Thread Peter Pentchev
On Wed, Feb 18, 2015 at 01:58:34PM +0100, Laurent Bercot wrote:
 
  I'm leaning more and more towards the following approach:
 
  - child crashed: exit 128 + signal number
  - child exited with 128 or more: exit 128
  - else: exit the child's exit code.
 
  Assuming normal commands never exit more than 127, that
 reports the whole information to the immediate parent, and
 correct information, if incomplete, higher up. That should
 be enough to make things work in all cases.
 
  Thoughts ?

Hm, just a point here; not saying it's very important, but people
do sometimes use Perl for stuff :) [0]

[roam@straylight ~]$ perl -e 'die("foo!\n");'; echo $?
foo!
255
[roam@straylight ~]$

Yes, I know, I know, I myself have no idea why die() exits with 255
instead of, say, 1... or rather, yes, I do have an idea (make it
different from any code that a usual program would exit with), but I do
recognize it as weird :(

So, hm, I don't know...  Clamp the "killed by a signal" range to
128-191?  Unfortunately, I'm not sure (couldn't find it in a quick
browsing of the POSIX specs) whether there is actually a defined maximum
signal number; "anything higher than 63 is reported as 63" might be
an option here.  Or, of course, just ignore Perl's die()'s weirdness or
rather wrap it up in "any exit code larger than 128 is unusual enough to
be lumped together with other exit codes larger than 128", as your
proposal already states... so apologies for the wasted electrons, I
guess :)

G'luck,
Peter

[0] ...and, yes, at $RealJob we do use a Perl loader/watcher for several
important daemons... but then we have to keep compatibility with various
init systems, we can't force a choice upon our customers :)

-- 
Peter Pentchev  r...@ringlet.net r...@freebsd.org p.penc...@storpool.com
PGP key:http://people.FreeBSD.org/~roam/roam.key.asc
Key fingerprint 2EE7 A7A5 17FC 124C F115  C354 651E EFB0 2527 DF13



Re: How to report a death by signal ?

2015-02-18 Thread Laurent Bercot


 I'm leaning more and more towards the following approach:

 - child crashed: exit 128 + signal number
 - child exited with 128 or more: exit 128
 - else: exit the child's exit code.

 Assuming normal commands never exit more than 127, that
reports the whole information to the immediate parent, and
correct information, if incomplete, higher up. That should
be enough to make things work in all cases.
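
The three rules above can be written down as a small Python function. This is a sketch of the proposal as stated in this message, not code from execline; `report_code` is a hypothetical name, and the input follows the `subprocess` convention where death by signal N is given as -N.

```python
def report_code(child_returncode: int) -> int:
    """Exit code P would report to G under the proposed rules."""
    if child_returncode < 0:
        # child crashed: exit 128 + signal number
        return 128 + (-child_returncode)
    if child_returncode >= 128:
        # child exited with 128 or more: exit 128
        return 128
    # else: exit the child's exit code
    return child_returncode


print(report_code(3))    # normal exit, passed through unchanged
print(report_code(-9))   # killed by signal 9
print(report_code(255))  # e.g. an exit code of 255, clamped
print(report_code(137))  # a lower level already reported a crash, clamped
```

With these rules, 129 and above always means "my direct child was killed by a signal", while 128 means "something at 128 or above was reported further down".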

 Thoughts ?

--
 Laurent