Re: Segfault on recursive trap/kill

2018-10-06 Thread Mike Gerwitz
Hey, Bob!

On Sat, Oct 06, 2018 at 22:44:17 -0600, Bob Proulx wrote:
> Let me give the discussion this way and I think you will be
> convinced. :-)

Well, thanks for taking the time for such a long reply. :)

> How is your example any different from a C program?  Or Perl, Python,
> Ruby, and so forth?  All of those also allow infinite recursion and
> the kernel will terminate them with a segfault.  Because all of those
> also allow infinite recursion.  A program that executes an infinite
> recursion would use infinite stack space.  But real machines have a
> finite amount of stack available and therefore die when the stack is
> exceeded.

I expect this behavior when writing in C, certainly.  But in languages
where the user does not deal with memory management, I'm used to a more
graceful abort when the stack gets out of control.  A segfault means
something to a C hacker.  It means very little to users who are
unfamiliar with the concepts that you were describing.

I don't have enough experience with Perl, Python, or Ruby to know how
they handle stack issues.  But, out of interest, I gave it a try:

  $ perl -e 'sub foo() { foo(); }; foo()'
  Out of memory!

  $ python <<< 'def foo():
  >  foo()
  >foo()' |& tail -n1
  RuntimeError: maximum recursion depth exceeded

  $ ruby -e 'def foo()
  >   foo()
  > end
  > foo()'
  -e:2: stack level too deep (SystemStackError)

Some languages I'm more familiar with:

  $ node -e '(function foo() { foo(); })()'
  [eval]:1
  (function foo() { foo(); })()
   ^

  RangeError: Maximum call stack size exceeded

  $ php -r 'function foo() { foo(); } foo();'

  Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to
  allocate 262144 bytes) in Command line code on line 1

  $ guile -e '(let x () (+ (x)))'
  allocate_stack failed: Cannot allocate memory
  Warning: Unwind-only `stack-overflow' exception; skipping pre-unwind handler.

  $ emacs --batch --eval '(message (defun foo () (foo)) (foo))'
  Lisp nesting exceeds ‘max-lisp-eval-depth’

And so on.

I understand that in C you usually don't manage your own stack and,
consequently, you can't say that it falls under "memory management" in
the sense of malloc(3) and brk(2) and such.  But C programmers are aware
of the mechanisms behind the stack (or at least better be) and won't be
surprised when they get a segfault in this situation.

But if one of my coworkers who knows some web programming and not much
about system programming gets a segfault, that's not a friendly
error.  If Bash instead said something like the above languages, then
that would be useful.

When I first saw the error, I didn't know that my trap was
recursing.  My immediate reaction was "shit, I found a bug".  Once I saw
it was the trap, I _assumed_ it was just exhausting the stack, but I
wanted to report it regardless, just in case; I didn't have the time to
dig deeper, and even so, I wasn't sure if it was intended behavior to
just let the kernel handle it.


> This following complete C program recurses infinitely.  Or at least
> until the stack is exhausted.  At which time it triggers a segfault
> because it tries to use memory beyond the page mapped stack.

[...]

> Would you say that is a bug in the C language?  A bug in gcc that
> compiled it?  A bug in the Unix/Linux kernel for memory management
> that trapped the error?  The parent shell that reported the exit code
> of the program?  Or in the program source code?  I am hoping that we
> will all agree that it is a bug in the program source code and not
> either gcc or the kernel. :-)

I agree, yes.

> Shell script code is program source code.  Infinite loops or infinite
> recursion are bugs in the shell script source code not the interpreter
> that is executing the code as written.

I also agree.  But the context is very different.  Shell is a very,
very high-level language.

> This feels to me to be related to The Halting Problem.

Knowing in advance whether there may be a problem certainly is, but we
don't need to do that; we'd just need to detect it at runtime to provide
a more useful error message.

> Other shells are also fun to check:
>
>   $ dash -c 'trap "kill 0" TERM; kill 0'
>   Segmentation fault
>
>   $ ash -c 'trap "kill 0" TERM; kill 0'
>   Segmentation fault
>
>   $ mksh -c 'trap "kill 0" TERM; kill 0'
>   Segmentation fault

Heh, interesting!

>   $ ksh93 -c 'trap "kill 0" TERM; kill 0'
>   $ echo $?
>   0

This is not the behavior I'd want.

>   $ posh -c 'trap "kill 0" TERM; kill 0'
>   Terminated
>   Terminated
>   Terminated
>   ...
>   Terminated
>   ^C

:x

> This finds what look like bugs in posh and ksh93.

That's a fair assessment.

>> it's just that most users assume that a segfault represents a
>> problem with the program
>
> Yes.  And here it indicates a bug too.  It is indicating a bug in the
> shell program code which sets up the infinite recursion.  Programs
> should avoid doing that. :-)

Indeed they should, but inevitably, such bugs do 

Re: Segfault on recursive trap/kill

2018-10-06 Thread Bob Proulx
Hi Mike,

Mike Gerwitz wrote:
> ... but are you saying that terminating with a segfault is the
> intended behavior for runaway recursion?

Let me give the discussion this way and I think you will be
convinced. :-)

How is your example any different from a C program?  Or Perl, Python,
Ruby, and so forth?  All of those also allow infinite recursion and
the kernel will terminate them with a segfault.  Because all of those
also allow infinite recursion.  A program that executes an infinite
recursion would use infinite stack space.  But real machines have a
finite amount of stack available and therefore die when the stack is
exceeded.

This following complete C program recurses infinitely.  Or at least
until the stack is exhausted.  At which time it triggers a segfault
because it tries to use memory beyond the page mapped stack.

  int main() {
    return main();
  }

  $ gcc -o forever forever.c
  $ ./forever
  Segmentation fault
  $ echo $?
  139  # Signal 11 + 128

   The return value of a simple command is its exit status, or 128+n if
   the command is terminated by signal n.
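
To make the 128+n convention concrete, here is a minimal sketch that
turns an exit status back into a signal name.  The helper name
describe_status is made up for illustration.  It relies on "kill -l"
accepting an exit status, which POSIX specifies.  The check is only a
heuristic, since a program can also call exit with a value above 128
on its own.

```shell
# describe_status is a hypothetical helper, not part of any shell.
# It assumes a status above 128 means "terminated by signal (status - 128)".
describe_status() {
  status=$1
  if [ "$status" -gt 128 ] && name=$(kill -l "$status" 2>/dev/null); then
    printf 'status %s: killed by SIG%s\n' "$status" "$name"
  else
    printf 'status %s: normal exit\n' "$status"
  fi
}

describe_status 139
describe_status 143
describe_status 1
```

On Linux, where SIGSEGV is signal 11 and SIGTERM is signal 15, the
first two calls should name SEGV and TERM and the third should report
a normal exit.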

Would you say that is a bug in the C language?  A bug in gcc that
compiled it?  A bug in the Unix/Linux kernel for memory management
that trapped the error?  The parent shell that reported the exit code
of the program?  Or in the program source code?  I am hoping that we
will all agree that it is a bug in the program source code and not
either gcc or the kernel. :-)

Shell script code is program source code.  Infinite loops or infinite
recursion are bugs in the shell script source code not the interpreter
that is executing the code as written.

This feels to me to be related to The Halting Problem.

> As long as there is no exploitable flaw here, then I suppose this isn't
> a problem;

It's not a privilege escalation.  Nor a buffer overflow.  Whether this
is otherwise exploitable depends upon the surrounding environment usage.

> I haven't inspected the code to see if this is an access violation
> or if Bash is intentionally signaling SIGSEGV.

It is the kernel that manages memory, maps pages, detects page faults,
kills the program.  The parent bash shell is only reporting the exit
code that resulted.  The interpreting shell executed the shell script
source code as written.

  
Other shells are also fun to check:

  $ dash -c 'trap "kill 0" TERM; kill 0'
  Segmentation fault

  $ ash -c 'trap "kill 0" TERM; kill 0'
  Segmentation fault

  $ mksh -c 'trap "kill 0" TERM; kill 0'
  Segmentation fault

  $ ksh93 -c 'trap "kill 0" TERM; kill 0'
  $ echo $?
  0

  $ posh -c 'trap "kill 0" TERM; kill 0'
  Terminated
  Terminated
  Terminated
  ...
  Terminated
  ^C

Testing zsh is interesting because it seems to keep the interpreter
stack in data space and can therefore consume a large amount of memory
if it is available.  It can then trap the resulting out-of-memory
condition and kill itself with a SIGTERM.  Note that in my testing
I have Linux memory overcommit disabled.

This finds what look like bugs in posh and ksh93.

> it's just that most users assume that a segfault represents a
> problem with the program

Yes.  And here it indicates a bug too.  It is indicating a bug in the
shell program code which sets up the infinite recursion.  Programs
should avoid doing that. :-)

  bash -c 'trap "kill 0" TERM; kill 0'

The trap handler was not set back to the default before the program
sent the signal to itself.  The way to fix this is:

  $ bash -c 'trap "trap - TERM; kill 0" TERM; kill 0'
  Terminated
  $ echo $?
  143  # killed on SIGTERM as desired, good

If ARG is absent (and a single SIGNAL_SPEC is supplied) or `-',
each specified signal is reset to its original value.

The proper way for a program to terminate itself upon catching a
signal is to set the signal handler back to the default value and then
send the signal to itself so that it will be terminated as a result of
the signal and therefore the exit status will be set correctly.

For example the following is useful boilerplate:

  unset tmpfile
  cleanup() {
    test -n "$tmpfile" && rm -f "$tmpfile" && unset tmpfile
  }
  trap "cleanup" EXIT
  trap "cleanup; trap - HUP; kill -HUP $$" HUP
  trap "cleanup; trap - INT; kill -INT $$" INT
  trap "cleanup; trap - QUIT; kill -QUIT $$" QUIT
  trap "cleanup; trap - TERM; kill -TERM $$" TERM
  tmpfile=$(mktemp) || exit 1

If a program traps a signal then it should restore the default signal
handler for that signal and send the signal back to itself.  Otherwise
the exit code will be incorrect and parent programs won't know that
the child was killed by a signal.
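
The difference is visible from the parent's side.  The following is a
minimal sketch, assuming bash is on PATH; the variable names are
chosen only for illustration.  It runs two children that both trap
TERM and compares the status the parent sees.

```shell
# Handler exits on its own: the parent sees an ordinary exit status.
bash -c 'trap "exit 1" TERM; kill -TERM $$'
plain_exit=$?

# Handler resets the trap and re-sends the signal: the parent sees 128+15.
bash -c 'trap "trap - TERM; kill -TERM $$" TERM; kill -TERM $$'
reraised=$?

echo "exit in handler:    $plain_exit"
echo "reset and re-raise: $reraised"
```

Only the second child is reported as killed by SIGTERM (status 143).
The first looks like a program that simply failed with status 1.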

For a highly recommended deep dive into this:

  https://www.cons.org/cracauer/sigint.html

Hope this helps!
Bob




Re: Segfault on recursive trap/kill

2018-10-06 Thread Robert Elz
Date:        Sat, 06 Oct 2018 19:53:25 -0400
From:        Mike Gerwitz
Message-ID:  <874ldy1vka@gnu.org>

  | I haven't inspected the code to see if this is an access
  | violation or if Bash is intentionally signaling SIGSEGV.

I expect that if you did look, you'd probably find that while
technically the former, it isn't a reference to some wild pointer,
but rather simply growing the stack until the OS says "no more"
and returns a SIGSEGV instead of allocating a new stack page.

kre




Re: Segfault on recursive trap/kill

2018-10-06 Thread Mike Gerwitz
On Sat, Oct 06, 2018 at 12:33:22 -0400, Chet Ramey wrote:
> On 10/5/18 9:33 PM, Mike Gerwitz wrote:
>> The following code will cause a segfault on bash-4.4.19(1) on
>> GNU Guix.  I reproduced the issue on an old Ubuntu 14.04 LTS running
>> bash-4.3.11(1) as well as a Trisquel system running the same version.
>> 
>>   bash -c 'trap "kill 0" TERM; kill 0'
>> 
>> Also segfaults when replacing `0' with `$$', and presumably in any other
>> situation that would trigger the trap recursively.
>
> Yes. Bash has allowed recursive trap handlers since early 2014 (pre-4.3)
> due to requests for the feature and compatibility with other shells that
> allow it.
>
> If you manage to create infinite recursion, bash won't stop you.

Sure, I agree that the feature is useful, but are you saying that
terminating with a segfault is the intended behavior for runaway
recursion?  Upon further inspection, it does look like
`foo() { foo; }; foo' also causes a segfault, so the behavior is
consistent with trap recursion.
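
For the plain function case, bash does have an opt-in guard: the
FUNCNEST variable (bash 4.2 and later) sets a maximum function nesting
level, and exceeding it aborts the current command with an error
message.  A minimal sketch follows.  The limit of 100 is an arbitrary
choice, and FUNCNEST is documented only for function invocations, so
it should not be assumed to catch the recursive trap case.

```shell
# FUNCNEST=100 is an arbitrary limit picked for this sketch.
bash -c 'FUNCNEST=100; foo() { foo; }; foo'
status=$?

# With the limit set, bash should complain that the nesting level was
# exceeded and return a failure status instead of dying from SIGSEGV (139).
echo "status: $status"
```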

As long as there is no exploitable flaw here, then I suppose this isn't
a problem; it's just that most users assume that a segfault represents a
problem with the program (unless they're dealing with their own memory
management).  I haven't inspected the code to see if this is an access
violation or if Bash is intentionally signaling SIGSEGV.

In any case, thanks for the reply.

-- 
Mike Gerwitz




Re: Auto-update program cache feature

2018-10-06 Thread Bob Proulx
Jeffrey Walton wrote:
> I think a useful feature for Bash would be to automatically update the
> program cache after an install.

Put this in your ~/.bashrc file and I believe your use case will be
much happier.

  shopt -s checkhash

In the bash manual:

  checkhash
  If set, bash checks that a command found in the hash
  table exists before trying to execute it.  If a hashed
  command no longer exists, a normal path search is
  performed.

I would prefer that to be the default.  But it changed existing
behavior when it was added and deviated from the previous csh
behavior, so making it optional made sense at the time.  I don't see
a downside to defaulting to it now.  In any case I always
add it to my bashrc file.  (Along with "shopt -s checkwinsize" too.)
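
A minimal sketch of the effect, assuming bash and mktemp are
available.  The scratch directories and the "hello" command are made
up for the demonstration.  The same command exists in two PATH
directories, and the hashed copy is removed between two invocations.

```shell
# Scratch directories and the "hello" command are made up for this demo.
demo=$(mktemp -d)
mkdir "$demo/a" "$demo/b"
printf '#!/bin/sh\necho from-a\n' > "$demo/a/hello"
printf '#!/bin/sh\necho from-b\n' > "$demo/b/hello"
chmod +x "$demo/a/hello" "$demo/b/hello"

out=$(DEMO=$demo PATH="$demo/a:$demo/b:$PATH" bash -c '
  shopt -s checkhash
  hello                  # found in a/, and that location is now hashed
  rm "$DEMO/a/hello"     # simulate the program moving after an install
  hello                  # stale hash entry is noticed; PATH is searched again
')
echo "$out"
rm -rf "$demo"
```

With checkhash set, the second call should fall through to the copy in
b/.  Without it, bash would typically try the remembered path and fail
with "No such file or directory" until "hash -r" is run.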

Bob



Re: Segfault on recursive trap/kill

2018-10-06 Thread Chet Ramey
On 10/5/18 9:33 PM, Mike Gerwitz wrote:
> The following code will cause a segfault on bash-4.4.19(1) on
> GNU Guix.  I reproduced the issue on an old Ubuntu 14.04 LTS running
> bash-4.3.11(1) as well as a Trisquel system running the same version.
> 
>   bash -c 'trap "kill 0" TERM; kill 0'
> 
> Also segfaults when replacing `0' with `$$', and presumably in any other
> situation that would trigger the trap recursively.

Yes. Bash has allowed recursive trap handlers since early 2014 (pre-4.3)
due to requests for the feature and compatibility with other shells that
allow it.

If you manage to create infinite recursion, bash won't stop you.

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/