Re: Segfault on recursive trap/kill

Mike Gerwitz Sat, 06 Oct 2018 22:58:24 -0700

Hey, Bob!

On Sat, Oct 06, 2018 at 22:44:17 -0600, Bob Proulx wrote:
> Let me give the discussion this way and I think you will be
> convinced. :-)


Well, thanks for taking the time for such a long reply. :)

> How is your example any different from a C program?  Or Perl, Python,
> Ruby, and so forth?  All of those also allow infinite recursion and
> the kernel will terminate them with a segfault.  Because all of those
> also allow infinite recursion.  A program that executes an infinite
> recursion would use infinite stack space.  But real machines have a
> finite amount of stack available and therefore die when the stack is
> exceeded.

I expect this behavior when writing in C, certainly.  But in languages
where the user does not deal with memory management, I'm used to a more
graceful abort when the stack gets out of control.  A segfault means
something to a C hacker.  It means very little to users who are
unfamiliar with the concepts that you were describing.

I don't have enough experience with Perl, Python, or Ruby to know how
they handle stack issues.  But, out of interest, I gave it a try:

  $ perl -e 'sub foo() { foo(); }; foo()'
  Out of memory!

  $ python <<< 'def foo():
  >  foo()
  >foo()' |& tail -n1
  RuntimeError: maximum recursion depth exceeded

  $ ruby -e 'def foo()
  >   foo()
  > end
  > foo()'
  -e:2: stack level too deep (SystemStackError)

Some languages I'm more familiar with:

  $ node -e '(function foo() { foo(); })()'
  [eval]:1
  (function foo() { foo(); })()
               ^

  RangeError: Maximum call stack size exceeded

  $ php -r 'function foo() { foo(); } foo();'

  Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to
  allocate 262144 bytes) in Command line code on line 1

  $ guile -e '(let x () (+ (x)))'
  allocate_stack failed: Cannot allocate memory
  Warning: Unwind-only `stack-overflow' exception; skipping pre-unwind handler.

  $ emacs --batch --eval '(message (defun foo () (foo)) (foo))'
  Lisp nesting exceeds ‘max-lisp-eval-depth’

And so on.

I understand that in C you usually don't manage your own stack and,
consequently, you can't say that it falls under "memory management" in
the sense of malloc(3) and brk(2) and such.  But C programmers are aware
of the mechanisms behind the stack (or at least had better be) and won't be
surprised when they get a segfault in this situation.

But if one of my coworkers who knows some web programming and not much
about system programming gets a segfault, that's not a friendly
error.  If Bash instead said something like the above languages, then
that would be useful.

When I first saw the error, I didn't know that my trap was
recursing.  My immediate reaction was "shit, I found a bug".  Once I saw
it was the trap, I _assumed_ it was just exhausting the stack, but I
wanted to report it regardless, just in case; I didn't have the time to
dig deeper, and even so, I wasn't sure if it was intended behavior to
just let the kernel handle it.


> This following complete C program recurses infinitely.  Or at least
> until the stack is exhausted.  At which time it triggers a segfault
> because it tries to use memory beyond the page mapped stack.

[...]

> Would you say that is a bug in the C language?  A bug in gcc that
> compiled it?  A bug in the Unix/Linux kernel for memory management
> that trapped the error?  The parent shell that reported the exit code
> of the program?  Or in the program source code?  I am hoping that we
> will all agree that it is a bug in the program source code and not
> either gcc or the kernel. :-)

I agree, yes.

> Shell script code is program source code.  Infinite loops or infinite
> recursion are bugs in the shell script source code not the interpreter
> that is executing the code as written.

I also agree.  But the context is very different.  Shell is a very,
very high-level language.

> This feels to me to be related to The Halting Problem.

Knowing in advance whether there may be a problem certainly is, but we
don't need to do that; we'd just need to detect it at runtime to provide
a more useful error message.
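
A rough sketch of what such a runtime check amounts to, written at the
script level rather than inside the interpreter.  The names `handler`,
`depth`, and `max_depth` are made up for illustration, the limit of 50 is
arbitrary, and plain function recursion stands in for the trap
re-triggering itself; this is not how Bash does (or would) implement it.

```shell
max_depth=50   # assumed limit, chosen arbitrarily for the sketch
depth=0
result=0

handler() {
    depth=$((depth + 1))
    if [ "$depth" -gt "$max_depth" ]; then
        # Report the runaway recursion instead of exhausting the stack.
        echo "handler: recursion deeper than $max_depth levels, giving up" >&2
        return 1
    fi
    handler    # stands in for whatever re-enters the handler by accident
}

# The || keeps a non-zero return from aborting a script run with `set -e`.
handler || result=$?
echo "stopped at depth $depth with status $result"
```

The counter costs one increment per call, and the failure is reported
in terms the script author can act on.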

> Other shells are also fun to check:
>
>   $ dash -c 'trap "kill 0" TERM; kill 0'
>   Segmentation fault
>
>   $ ash -c 'trap "kill 0" TERM; kill 0'
>   Segmentation fault
>
>   $ mksh -c 'trap "kill 0" TERM; kill 0'
>   Segmentation fault

Heh, interesting!

>   $ ksh93 -c 'trap "kill 0" TERM; kill 0'
>   $ echo $?
>   0

This is not the behavior I'd want.

>   $ posh -c 'trap "kill 0" TERM; kill 0'
>   Terminated
>   Terminated
>   Terminated
>   ...
>   Terminated
>   ^C

:x

> This finds what look like bugs in posh and ksh93.

That's a fair assessment.

>> it's just that most users assume that a segfault represents a
>> problem with the program
>
> Yes.  And here it indicates a bug too.  It is indicating a bug in the
> shell program code which sets up the infinite recursion.  Programs
> should avoid doing that. :-)

Indeed they should, but inevitably, such bugs do happen, and mine was a
particularly common pitfall: I was trying to set the variable from
within a subprocess.  But it wasn't in a subprocess when I originally
wrote it.
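
The general shape of that pitfall, as a minimal sketch.  This is not the
original script; `in_handler` is a hypothetical guard variable used only
to show that an assignment made in a subshell never reaches the parent
shell.

```shell
in_handler=0                  # hypothetical guard variable

( in_handler=1 )              # subshell: the assignment is lost on exit
after_subshell=$in_handler

{ in_handler=1; }             # brace group: runs in the current shell
after_group=$in_handler

echo "after subshell: $after_subshell"   # prints "after subshell: 0"
echo "after group: $after_group"         # prints "after group: 1"
```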

> The proper way for a program to terminate itself upon catching a
> signal is to set the signal handler back to the default value and then
> send the signal to itself so that it will be terminated as a result of
> the signal and therefore the exit status will be set correctly.

That's good advice.  Thank you.
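
A minimal sketch of that advice, assuming Bash is available as `bash`.
The handler name `on_term` is made up, and the demo signals the child's
own PID (`$$`) rather than the process group (`kill 0`), so that it does
not take the calling shell down with it.

```shell
# Run the pattern in a child shell so the demo does not kill the caller.
bash -c '
    on_term() {
        # Any cleanup would go here.
        trap - TERM          # restore the default disposition: no recursion
        kill -s TERM "$$"    # re-send the signal; the shell now dies from it
    }
    trap on_term TERM
    kill -s TERM "$$"
    echo "not reached"
'
status=$?
echo "child exit status: $status"   # 143, i.e. 128 + SIGTERM (15)
```

Because the child is terminated by the signal itself rather than by a
plain `exit`, the parent can tell from the status that it was killed by
SIGTERM.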

>   https://www.cons.org/cracauer/sigint.html

Thanks.

> Hope this helps!

There was useful information, yes.  I hope I was able to further clarify
my concerns as well.

-- 
Mike Gerwitz
