Re: SIGINT handling during async functions

2023-02-10 Thread Chet Ramey

On 2/6/23 10:26 PM, Martin D Kealey wrote:


> By orthogonal, I meant these things should ideally be managed by separate
> controls:
>   1. ignoring signals (or not)
>   2. redirecting filedescriptors
>   3. immediately waiting on the process (or not)
>   4. creating new process groups
>   5. sending a signal to about-to-be orphaned children when the shell exits

> In particular I'm thinking of options along the lines of:
>
> nohup --no-redir --[block/default/keep]=[INT,QUIT,HUP,...]
>
> (exact names not important; hopefully --long-options are self-explanatory)


I feel like this will turn into something like daemon(1), but if someone
wants to take a shot -- using a new name, obviously -- let's talk about it.

Chet

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




Re: SIGINT handling during async functions

2023-02-06 Thread Martin D Kealey
On Fri, 3 Feb 2023 at 07:17, Chet Ramey wrote:

> On 1/28/23 5:56 AM, Martin D Kealey wrote:
> > Firstly, let's just leave aside "POSIX requires this" for a bit.
> Be that as it may, POSIX exists and this is a requirement. It's also how
> other shells behave.
>

Of course. I'm only contemplating making changes in extended mode, not
POSIX mode.


> > I contend that it's inconsistent for the actions of "nohup" and "&" to
> > NOT be fully orthogonal.
>
> Maybe, but their historical behavior has always differed: `nohup' ignores
> SIGHUP, and background processes ignore SIGINT/SIGQUIT. You could say those
> are "fully orthogonal," setting aside the sometimes-confusing manipulation
> of input and output FDs. Is the latter what you mean by orthogonality?
>

Sorry, nohup was a terrible way to illustrate this, since it conflates other
things that I wasn't considering; rather I meant something more along the
lines of "some hypothetical command structured like nohup but which affects
SIGINT & SIGQUIT instead of SIGHUP, and which doesn't redirect stdio" (and
which doesn't make the job automatically backgrounded, though that already
applies to nohup).

By orthogonal, I meant these things should ideally be managed by separate
controls:
 1. ignoring signals (or not)
 2. redirecting filedescriptors
 3. immediately waiting on the process (or not)
 4. creating new process groups
 5. sending a signal to about-to-be orphaned children when the shell exits

The problem, as I see it, is that there's no shell syntax that *only* does
#3.

Yes, one could write a shell function, but it'd be pretty contorted, and
waste a lot of effort on things that would be undone when using other
features at the same time, leading to a situation where people would only
bother to use it when they're *not* going to give explicit signal
dispositions; so it's still not *practically* orthogonal.

> > In the meantime, « shopt -s background_without_magic » (*2) gets my vote,
>
> I don't see any advantage over the mechanism above.
>

The value proposition in making changes isn't that "this can't already be
done *somehow*", but rather the unorthogonality of the current features is
suboptimal language design, and poor for user understanding.

Rather than a global shopt setting that stops "&" from blocking SIGINT &
SIGQUIT, which I'll grant is a hard sell, perhaps an entirely new notation
would be possible, using some combination of "&" with other punctuation
that isn't already defined, such as "&;" or "&|".

> > along with incorporating « nohup » as a built-in (so that Bash can
> > guarantee its behaviour, and add options to improve its internal
> > orthogonality.).
>
> What guarantees would you like?


I put those the wrong way round. To add extensions, and guarantee that
they're available to every Bash script, nohup would have to be a built-in.

In particular I'm thinking of options along the lines of:

nohup --no-redir --[block/default/keep]=[INT,QUIT,HUP,...]

(exact names not important; hopefully --long-options are self-explanatory)

> > *1: I have very occasionally had interactive single-user shell running on
> > /dev/console, which doesn't appear to count as a tty because it doesn't
> > respond to tcsetpgrp.
>
> Try running something in a Docker container; that doesn't guarantee a
> controlling terminal.
>

That's a very good point, and I suspect it's for the same underlying
reason: that inside the container, the "top" process has pid 1 or pgrp 0 or
somesuch, and somewhere this is interpreted as "set the terminal so that it
has no pgrp".

-Martin


Re: SIGINT handling during async functions

2023-02-02 Thread Chet Ramey

On 1/28/23 5:56 AM, Martin D Kealey wrote:


> Firstly, let's just leave aside "POSIX requires this" for a bit. I know
> that the requirement is there, and I think it is one of those broken things
> that ought to have been dropped from POSIX, or at least reduced to optional
> rather than required.


Be that as it may, POSIX exists and this is a requirement. It's also how
other shells behave.



> On 1/21/23 7:55 AM, Tycho Kirchner wrote:
> > Please consider a script launching several commands in background
> > and waiting for their completion:
>
> (note these last 4 words; I suspect they exclude the "common case
> experience" of people who think that it's only natural to want to insulate
> new daemons from tty signals)


Sure.

> I contend that it's inconsistent for the actions of "nohup" and "&" to NOT
> be fully orthogonal.


Maybe, but their historical behavior has always differed: `nohup' ignores
SIGHUP, and background processes ignore SIGINT/SIGQUIT. You could say those
are "fully orthogonal," setting aside the sometimes-confusing manipulation
of input and output FDs. Is the latter what you mean by orthogonality?



> And I contend that a daemon that unexpectedly dies is much more obvious
> than a bunch of internal processes that are unexpectedly left running; you
> have to proactively check for orphaned processes, and their continued
> action may cause weird bugs.


You can get background processes that have SIGINT and SIGQUIT set to
SIG_DFL today with the (buggy) existing bash behavior. The same POSIX-
blessed technique will work in fixed future versions:

{ trap - SIGINT SIGQUIT ; program; } &  # `exec program' if you prefer

instead of

program &

and this has the advantage -- or not -- of granularity.

> I haven't encountered an interactive shell in a tty without job control in
> the last 35 years. (*1)
>
> I contend that it's past time this POSIX misfeature was retired.


You absolutely can have that discussion with the POSIX group; since job
control remains an optional POSIX feature, you might want to incorporate a
proposal to make it mandatory.

> In the meantime, « shopt -s background_without_magic » (*2) gets my vote,


I don't see any advantage over the mechanism above.

> along with incorporating « nohup » as a built-in (so that Bash can
> guarantee its behaviour, and add options to improve its internal
> orthogonality.).


What guarantees would you like? Or, what do you consider the essential
parts of nohup's behavior that should be guaranteed that are not now?
nohup's been around for what, 40+ years now; its behavior is pretty
well known. There's little advantage to making it a builtin other than
to nohup builtins, and you basically can do that already.


> *1: I have very occasionally had interactive single-user shell running on
> /dev/console, which doesn't appear to count as a tty because it doesn't
> respond to tcsetpgrp.


Try running something in a Docker container; that doesn't guarantee a
controlling terminal.





Re: SIGINT handling during async functions

2023-01-28 Thread Martin D Kealey
Firstly, let's just leave aside "POSIX requires this" for a bit. I know
that the requirement is there, and I think it is one of those broken things
that ought to have been dropped from POSIX, or at least reduced to optional
rather than required.

On Tue, 24 Jan 2023 at 07:35, Chet Ramey wrote:

> On 1/21/23 7:55 AM, Tycho Kirchner wrote:
> > Please consider a script launching several commands in background
> > and waiting for their completion:
>

(note these last 4 words; I suspect they exclude the "common case
experience" of people who think that it's only natural to want to insulate
new daemons from tty signals)


> >
> > cmd1 &
> > cmd2 &
> > wait
> >
> > [...] In my experience, what the user usually wants in such a case is
> > to abort cmd1, cmd2 as well as the script having launched them.
>
> Odd, my experience is the opposite. I have run commands asynchronously
> from scripts quite often in my previous lives, with the intent of
> insulating  them from signals.
>

Which behaviour seems "intuitive" probably depends on which of two patterns
one uses more often:
1. create a daemon that is intended to continue running after the script
finishes; or
2. create a number of cooperating parallel processes that are entirely
internal to the script, which *should* exit at or before the end of the
script.

I'm old enough to have used the Bourne shell before it had tty job control,
so I can see why *in the interactive case* it makes sense to prevent tty
signals from affecting any "background" process launched directly from the
interactive shell.

I contend that it's inconsistent for the actions of "nohup" and "&" to NOT
be fully orthogonal.

And I contend that a daemon that unexpectedly dies is much more obvious
than a bunch of internal processes that are unexpectedly left running; you
have to proactively check for orphaned processes, and their continued
action may cause weird bugs.


I haven't encountered an interactive shell in a tty without job control in
the last 35 years. (*1)
I contend that it's past time this POSIX misfeature was retired.

In the meantime, « shopt -s background_without_magic » (*2) gets my vote,
along with incorporating « nohup » as a built-in (so that Bash can
guarantee its behaviour, and add options to improve its internal
orthogonality.).

-Martin

*1: I have very occasionally had interactive single-user shell running on
/dev/console, which doesn't appear to count as a tty because it doesn't
respond to tcsetpgrp.

*2: or perhaps with finer granularity « shopt -u bg_block_signals
bg_null_stdin »


Re: SIGINT handling during async functions

2023-01-23 Thread Chet Ramey

On 1/21/23 7:55 AM, Tycho Kirchner wrote:



> On 16.01.23 at 18:26, Chet Ramey wrote:
>
> > The fix is to add enough state machinery to detect this situation and
> > behave in a way that can satisfy both the standard and the later
> > interpretation, while being careful not to undo this work later. This is
> > obviously not how bash worked in the past.
>
> Thanks for the explanation. While editing the state machinery I would like
> to suggest adding a new shopt, let's call it keepsigint, which a user may
> set to preserve the SIGINT trap set in the parent shell for all
> asynchronous commands.


Is this really what you want, since none of the scenarios you describe use
it?

I suggest that what you are asking for is a way to set the signal
disposition to SIG_DFL instead of SIG_IGN.

> While the POSIX behavior to ignore SIGINT for background processes if job
> control is disabled makes total sense for interactive shells, for scripts
> to me it often appears not constructive.


How often do you have job control disabled in interactive shells? It seems
to me that scripts are the primary motivation for this behavior.

> Please consider a script launching several commands in background and
> waiting for their completion:
>
> cmd1 &
> cmd2 &
> wait
>
> If the user having launched this script from the interactive terminal
> aborts it by hitting Ctrl+C, by default, the shell sends SIGINT to the
> process group (pgid) of the script. However, while cmd1 and cmd2 get their
> signal, they usually (if they don't override it) ignore it due to above
> POSIX requirement. In my experience, what the user usually wants in such a
> case is to abort cmd1, cmd2 as well as the script having launched them.


Odd, my experience is the opposite. I have run commands asynchronously from
scripts quite often in my previous lives, with the intent of insulating
them from signals.

It's pretty clear that the historical bash behavior, which ends up the way
you want, is not correct.

Anyway, if your goal is to allow CMD to have SIG_DFL for SIGINT and
SIGQUIT, the POSIX way to do it is similar to your fourth option.

{ trap - SIGINT ; exec cmd1; } &
{ trap - SIGINT ; exec cmd2; } &

which interp 751 says has to work (you can pick and choose your use of
`exec' depending on what you're running, of course).

It obviously doesn't work in bash-5.2, but bash-5.2 doesn't have that new
option, either, and already makes such asynchronous commands interruptible.
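
One way to see which behaviour a particular shell exhibits is to probe the
disposition directly rather than read `trap -p' output. The following is only
a sketch under assumptions, not code from this thread: the function name
probe_reset_in_group is made up, and `sh -c' stands in for cmd1. The child
sends SIGINT to itself; if it survives, it inherited SIG_IGN, and if it prints
nothing it was killed, meaning it had SIG_DFL. The answer is expected to
differ between shells and between bash versions.

```shell
# Hypothetical helper (the name is illustrative only). Runs an asynchronous
# group that resets SIGINT and then starts a child, and reports what the
# child inherited.
probe_reset_in_group() {
    out=$(
        { trap - INT; sh -c 'kill -s INT "$$"; echo ignored'; } &
        wait "$!"
    )
    # No output means the child was killed by its own SIGINT (SIG_DFL).
    echo "${out:-default}"
}

group_result=$(probe_reset_in_group)
echo "after 'trap - INT' in an async group the child sees SIGINT as: $group_result"
```

A result of "default" means the reset took effect for the child; "ignored"
means the child still started with SIGINT ignored.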

As to `sanity': I would argue that expecting any behavior other than to
have asynchronous commands with SIGINT and SIGQUIT set to SIG_IGN is not a
reasonable expectation. The Bourne family of shells has always behaved
that way. Now, you might have to work around it, but the workaround should
be possible, not the default.





Re: SIGINT handling during async functions

2023-01-21 Thread Greg Wooledge
On Sat, Jan 21, 2023 at 01:55:27PM +0100, Tycho Kirchner wrote:
> cmd1 &
> cmd2 &
> wait
> 
> If the user having launched this script from the interactive terminal aborts 
> it by hitting Ctrl+C, by default, the shell sends SIGINT to the process group 
> (pgid) of the script. However, while cmd1 and cmd2 get their signal, they 
> usually (if they don't override it) ignore it due to above POSIX requirement. 
> In my experience, what the user usually wants in such a case is to abort 
> cmd1, cmd2 as well as the script having launched them.

A given user might *want* that, but that's not what is going to happen,
nor what is supposed to happen.

If a user wants that behavior, they will need to set up a trap of their
own, store the PIDs of the background processes, and kill them in the
trap.

There is nothing that should be changed in bash with regard to this.
Bash is *already* doing a better job than POSIX requires, with its
EXIT traps that actually do what one expects.
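
A minimal sketch of the approach described above (a trap, stored PIDs, and a
kill inside the trap) might look like the following. The helper names
start_bg and kill_children are invented for illustration, `sleep 30' stands
in for cmd1 and cmd2, and TERM is used because the background children ignore
SIGINT when job control is off. The last two lines call the cleanup directly
so the example finishes on its own; in a real script they would just be
`wait', and the cleanup would run from the INT trap.

```shell
# PIDs of background children, kept as a space-separated list so this also
# runs in a plain POSIX sh.
pids=""

# Hypothetical helper: start a command in the background and remember its PID.
start_bg() {
    "$@" &
    pids="$pids $!"
}

# Hypothetical helper: ask every recorded child to terminate.
kill_children() {
    for pid in $pids; do
        kill -s TERM "$pid" 2>/dev/null
    done
}

# On Ctrl+C: stop the children, then exit with the conventional 128+2 status.
trap 'kill_children; exit 130' INT

start_bg sleep 30    # stand-in for cmd1
start_bg sleep 30    # stand-in for cmd2

# Demonstration only: clean up immediately instead of waiting for Ctrl+C.
kill_children
wait
```

Note that this converts an INT into a TERM for the children, which may not be
what every program expects.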



Re: SIGINT handling during async functions

2023-01-21 Thread Tycho Kirchner




On 16.01.23 at 18:26, Chet Ramey wrote:

> The fix is to add enough state machinery to detect this situation and
> behave in a way that can satisfy both the standard and the later
> interpretation, while being careful not to undo this work later. This is
> obviously not how bash worked in the past.


Thanks for the explanation. While editing the state machinery I would like to
suggest adding a new shopt, let's call it keepsigint, which a user may set to
preserve the SIGINT trap set in the parent shell for all asynchronous commands.

While the POSIX behavior to ignore SIGINT for background processes if job
control is disabled makes total sense for interactive shells, for scripts to
me it often appears not constructive. Please consider a script launching
several commands in background and waiting for their completion:

cmd1 &
cmd2 &
wait

If the user having launched this script from the interactive terminal aborts it 
by hitting Ctrl+C, by default, the shell sends SIGINT to the process group 
(pgid) of the script. However, while cmd1 and cmd2 get their signal, they 
usually (if they don't override it) ignore it due to above POSIX requirement. 
In my experience, what the user usually wants in such a case is to abort cmd1, 
cmd2 as well as the script having launched them.

Of course there are ways to kill cmd1 and cmd2 (and possible grandchildren)
explicitly, e.g. by sending an additional TERM signal to the process group,
e.g. (at the top of the script)

trap 'trap "" TERM; env kill -TERM -- -$$; exit 130' INT

However, this is usually only safe when we are the process group leader
(otherwise we might kill our parent as well!), so we need an additional

[ $$ -eq $(($(ps -o pgid= -p "$$"))) ] || exec setsid --wait "${BASH_SOURCE[0]}" "$@"

at the top of our script to create a new process group if necessary. Further,
applications may react differently to TERM and INT, making the "signal
conversion" undesirable in the general case. Finally, asynchronously running
bash scripts may print "Terminated" messages, which are usually not of
interest to a user who aborted the command manually.

Another option would be to enable job control within the script and kill the
commands that way, e.g.

set -m; cmd1 & jobs=($(jobs -p)); env kill -INT -- "${jobs[@]/#/-}"

However, job control disables the possibility to suspend the "whole script"
with Ctrl+Z and bears the risk of eventually losing some jobs, while without
job control, killing the single pgid kills all leftovers with high certainty.

A third way is to launch cmd1 and cmd2 with
env --default-signal=SIGINT,SIGQUIT cmd1 &
so they do not ignore SIGINT. That's fine, but has to be repeated for every 
command. Further, process substitutions and functions cannot be called that way.

A fourth way is to explicitly set the INT trap within an async command group 
before executing the command, like
{ trap 'true' INT; exec cmd1; } &
Personally I regularly use the __async__ function below for async commands,
command groups, process substitutions and functions, and I'm fine with that.
But all four of these options require some typing (and reading) overhead and
just don't feel "sane". I think bash would really benefit from a 'keepsigint'
option. What are your thoughts about that?

Thanks and kind regards
Tycho

_


__async__ bash -c 'echo first; trap -p; sleep 6'; wait
{ __async__; exec bash -c 'echo second; trap -p; sleep 6'; } & wait
foofunc(){ bash -c 'echo foofunc; trap -p; sleep 6'; };  __async__ foofunc; wait
cat <(__async__; exec bash -c 'echo psub; trap -p; sleep 6';)


__async__(){
    local int_trap
    int_trap="$(trap -p INT)"
    # No INT trap set by the caller: fall back to exiting with status 130.
    [ -z "$int_trap" ] && int_trap="trap -- 'exit 130' SIGINT"
    if [ "$#" -eq 0 ]; then
        # Already running async, just set parent's INT handler.
        eval "$int_trap"
        return
    fi
    if [[ $(type -t "$1") == file ]]; then
        # exec into external file so pid is same as if called like 'cmd &'
        { eval "$int_trap"; exec "$@"; } &
    else
        { eval "$int_trap"; "$@"; } &
    fi
}




Re: SIGINT handling during async functions

2023-01-16 Thread Chet Ramey

On 1/12/23 6:34 PM, Tycho Kirchner wrote:

> Hi,
> we found quite some inconsistency and weirdness in the handling of SIGINT's
> during async function calls and were wondering, whether those are expected.
> All calls were executed from a script with jobcontrol turned off (set +m)
> while pressing Ctrl+C shortly afterwards.


Thanks for the report. The basic issue is that the process started to
execute the background command (`asynchronous list') does have the
SIGINT and SIGQUIT dispositions set to SIG_IGN, but the processes it
creates don't. The issue is that the processes in this list have to ignore
SIGINT ("the commands in the list shall inherit from the shell a signal
action of ignored (SIG_IGN) for the  SIGINT and SIGQUIT signals" from the
normative standard text) but they have to be allowed to use trap to change
the signal dispositions (POSIX interp 751).

The first problem is that if, say, a shell forked to run the shell function
tries to initialize its signals and finds SIGINT ignored, it will assume
that SIGINT should be `hard ignored', and non-interactive shells are not
allowed to change that ("Signals that were ignored on entry to a non-
interactive shell cannot be trapped or reset"). This is what happens with
the `trap -p' and why the trap seems to change. We had a rousing discussion
about precisely what "on entry" means a few years back.
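
The `hard ignored' rule can be observed from outside with a small experiment.
This is an illustrative sketch rather than anything taken from bash itself:
the variable name hard_ignore_result is made up, and `sh' is assumed to be a
POSIX shell (bash, dash or similar). The enclosing $(...) subshell ignores
SIGINT, so the child shell finds it ignored on entry; its own trap command
then has no effect, and the SIGINT it sends to itself is discarded.

```shell
# Start a non-interactive child shell with SIGINT already ignored.
# The $(...) subshell sets SIG_IGN, and the child inherits it on entry.
hard_ignore_result=$(
    trap '' INT
    sh -c 'trap "echo caught" INT; kill -s INT "$$"; echo "still alive"'
)
echo "$hard_ignore_result"
```

If the rule applies, the child should report only that it is still alive,
without ever running its "caught" handler.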

The second problem is figuring out how to set the SIGINT disposition in
the child, since it's no longer a simple "what did I inherit from my
parent?"

So what do we do about that? Well, you want to preserve the original
disposition of SIGINT in the child process that sets the handler to
SIG_IGN, or figure out a different way that the child process can change
that disposition, so an inherited value of SIG_IGN doesn't prevent a shell
from setting a new trap. You also want to prevent the shell from setting
the SIGINT disposition of processes it forks to this preserved previous
disposition, since they're all supposed to get SIG_IGN by default (but see
below!).

The fix is to add enough state machinery to detect this situation and
behave in a way that can satisfy both the standard and the later
interpretation, while being careful not to undo this work later. This is
obviously not how bash worked in the past.

It gets tricky. Say a shell forked to run this asynchronous list runs trap
to change the SIGINT disposition. It inherited SIG_IGN from its parent, but
now the processes it forks need SIGINT to be set to SIG_DFL instead of
SIG_IGN ("traps caught by the shell shall be set to the default values").
So this new state sort of ripples across different operations.


> The main INT handler is never executed in foofunc (is that expected?)

Yes, subshells never inherit traps.

> while the new (default) handler either aborts command execution in case of
> 'foofunc &' or continues execution in case of '{ foofunc; } &'.

Inconsistent handling of the above requirements.

> While on 'foofunc &' 'trap -p' at the beginning of foofunc (wrongly) prints
> the main handler,

That's not wrong; the shell has to preserve the trap strings while changing
the disposition and only change the string if a new trap is set. It's all
very ad-hoc.

> in case of '{ foofunc; } &' it suddenly prints the ignore handler
> "trap -- '' SIGINT" and remains indeed uninterruptible. Thus printing the
> trap apparently changes bash's behavior.


Kind of. Reinitializing the signals reveals the real handler, and `trap -p'
just displays it. It's setting SIGINT to be `hard ignored' that is the
problem here.





Re: SIGINT handling during async functions

2023-01-13 Thread Robert Elz
    Date:        Fri, 13 Jan 2023 08:29:25 +0100
    From:        Tycho Kirchner
    Message-ID:  <6df2fd46-18e8-775d-a670-bd29ffdf3...@mail.de>

  | However, did you actually put the short snippets into a script,

No, I didn't; now that I have, I see what you mean: bash does look to
be doing something wrong wrt the state of the signals in the subshell
it forks (sometimes).   That's weird.

  | __
  | bash -c 'echo first; trap -p' & wait
  | { bash -c 'echo second; trap -p'; } & wait
  | { trap -p >/dev/null; bash -c 'echo third; trap -p'; } & wait
  | __

  | $ ./test.sh
  | first
  | trap -- '' SIGINT
  | trap -- '' SIGQUIT

In that case, you're running bash -c asynchronously (in the background),
so SIGINT and SIGQUIT are ignored, the child bash starts in that environment,
the trap -p shows that those signals remain ignored, all is behaving as it
should.

  | second

But that one is wrong, running the same thing, inside a group, should
change nothing at all.   The group (if it is actually run at all, since
it contains only one command in this case, it could simply be optimised
away, which would produce identical code to execute as the first case,
though that's not required) is run as a subshell (async), which means it
(the subshell created) should have SIGINT and SIGQUIT ignored, just the
same as in the first case.  Nothing should be changing that when
that group invokes bash -c, so those signals should be remaining ignored
when that process is invoked (that it happens to be another instance of
bash is irrelevant for that), so that bash should start in the same state
as the one in the first test.   Yet it clearly doesn't.

Note that it is the bash running the script that is doing odd things, not
the "bash -c" invoked within it.   Run the script with a different shell
(bosh, zsh, ksh93, dash, the FreeBSD and NetBSD shells) and everything
acts the same (the script still running "bash -c ...") for all 3 tests
(though some shells require removing the '-p' arg to trap in the 3rd case,
as the versions I have do not (yet) support "trap -p").   That changes
nothing when the script is run with bash (using just "trap" there instead
of "trap -p"), so I mostly left it that way.   [Aside: bosh also ignores
SIGTTIN in an async command (when job control is disabled), which is
probably a good idea but isn't required by anything; that difference is
irrelevant here, since it ignores it in all 3 cases, along with SIGINT and
SIGQUIT.]

  | third
  | trap -- '' SIGINT

And that one is even stranger.   The trap -p sending its results to
/dev/null in your script is actually writing that same output: only SIGINT
is being ignored there, which explains why only SIGINT is ignored inside
the "bash -c".   But why having a trap command there makes that kind of
difference (it does mean that the group cannot simply be optimised away,
however), apparently causing only SIGINT to be ignored (it wasn't in the
2nd case, though both it and SIGQUIT should have been), I can't guess.

You're quite correct, this is all badly broken.   And note, it has been
broken (just not quite the same way) for a very long time:

jacaranda$ bash2 /tmp/test.sh
first
trap -- '' SIGINT
trap -- '' SIGQUIT
second
third

That's different, but still broken, but actually better than bash 5,
since at least the results from the 2nd and 3rd tests are the same; the
added trap command in the 3rd test is changing nothing.   (In all cases,
for all tests, the "bash -c" invoked inside the script is bash5, but
since that simple code is doing exactly what it should, that's irrelevant;
it could be replaced by any shell that supports "trap -p".)

  | So, even in this simple case, differences are observable.

Yes, they are.   Apologies for my hasty response, I was concentrating on
the wrong issues (as some kind of explanation - it was the early hours of
the morning, for me, I should have been asleep, but I just had to read mail
one more time...)

And just for the record, I'm running bash 5.2.15(1)-release on NetBSD 10.99.1
(amd64 processor - or x86_64 if you prefer - same as yours, just different OS).

The bash2 I ran was 2.05b.0(1)-release

kre




Re: SIGINT handling during async functions

2023-01-12 Thread Tycho Kirchner




On 13.01.23 at 03:02, Robert Elz wrote:

>     Date:        Fri, 13 Jan 2023 00:34:02 +0100
>     From:        Tycho Kirchner
>     Message-ID:  <7d59c17d-792e-0ac7-fd86-b3b2e7d4b...@mail.de>
>
>   | we found quite some inconsistency and weirdness
>   | in the handling of SIGINT's during async function calls
>
> Not inconsistent or weird, and has nothing to do with
> function calls.
>
>   | and were wondering, whether those are expected.
>
> Expected and required.
>
>   | The main INT handler is never executed in foofunc [...]
>   | Thus printing the trap apparently changes bash's behavior.
>
> Nonsense (the conclusion).
>
> When an async command (any command, not just functions,
> or blocks enclosed in { } ) is run with job control
> disabled, SIGINT is ignored for that async command.
> (SIGQUIT too).
>
> That has been the way shells work since before either
> the Bourne shell (and all later shells based upon it,
> like bash) or job control, were invented.
>
> That is all you are seeing here.
>
> kre


Dear Robert Elz,
thanks for the quick response. However, did you actually put the short
snippets into a script, execute it and verify that their behavior is the
same? In particular, did you check whether the respective 'sleep' commands
kept running after hitting Ctrl+C? On my test system, the 'sleep 3' within
foofunc **is** killed in the first three code snippets, proving your
statements wrong. **Only** in case of the 4th snippet, where the trap is
printed at the beginning of foofunc, the 'sleep 3' command keeps running after
hitting Ctrl+C.

Let me give another example. Put the following commands into a script test.sh
and execute it.
__
bash -c 'echo first; trap -p' & wait
{ bash -c 'echo second; trap -p'; } & wait
{ trap -p >/dev/null; bash -c 'echo third; trap -p'; } & wait
__

$ ./test.sh
first
trap -- '' SIGINT
trap -- '' SIGQUIT
second
third
trap -- '' SIGINT
__

So, even in this simple case, differences are observable.
Kind regards
Tycho



Re: SIGINT handling during async functions

2023-01-12 Thread Robert Elz
Oh, the differences in what trap -p is printing is because
of special case handling for trap in a subshell environment,
when the trap command is the first (maybe only) command
executed (details vary between shells).  That is mostly
intended to allow T=$(trap -p) to work, but is usually applied
to any subshell environment (it is simpler that way).
An async command is a subshell environment.
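
The save-and-restore idiom that this special case exists for can be sketched
as follows. This is an illustration under assumptions, not code from the
thread: the variable names saved and restored are made up, plain `trap' is
used instead of `trap -p' so that shells without -p can run it, and USR1 is
used instead of INT only so the example does not depend on how SIGINT
happened to be inherited.

```shell
# Install a trap in the current shell.
trap 'echo hi' USR1

# Although $(...) is a subshell, a trap command run first in it reports the
# traps of the parent, which is what makes this capture useful.
saved=$(trap)

# Temporarily drop the trap, then reinstate everything that was captured.
trap - USR1
eval "$saved"

# Capture again to compare with the first snapshot.
restored=$(trap)

# Clean up so the example leaves no trap behind.
trap - USR1
```

The exact text printed by `trap' (for example whether the signal is written
as USR1 or SIGUSR1) varies between shells, so compare the two snapshots with
each other rather than against a fixed string.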

When you do foofunc& the trap command thus prints the
trap from the parent's environment, but when you embed
that in a group, the traps get reset to those for the
subshell before the trap command gets to run, so you see
that instead.

Everything is working as intended.

kre




Re: SIGINT handling during async functions

2023-01-12 Thread Robert Elz
    Date:        Fri, 13 Jan 2023 00:34:02 +0100
    From:        Tycho Kirchner
    Message-ID:  <7d59c17d-792e-0ac7-fd86-b3b2e7d4b...@mail.de>

  | we found quite some inconsistency and weirdness
  | in the handling of SIGINT's during async function calls

Not inconsistent or weird, and has nothing to do with
function calls.

  | and were wondering, whether those are expected.

Expected and required.

  | The main INT handler is never executed in foofunc [...]
  | Thus printing the trap apparently changes bash's behavior.

Nonsense (the conclusion).

When an async command (any command, not just functions,
or blocks enclosed in { } ) is run with job control
disabled, SIGINT is ignored for that async command.
(SIGQUIT too).
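
That rule can be checked behaviourally, without relying on what `trap -p'
prints. The sketch below is an assumption-laden illustration: the helper name
probe_async is invented, `sh' is assumed to be a POSIX shell, and the script
is assumed to run non-interactively with job control off. The asynchronous
child sends the named signal to itself and prints a word only if it survives.

```shell
# Hypothetical helper: run a simple command asynchronously and let it send
# the given signal (INT or QUIT) to itself. Output appears only if the
# signal was ignored in the child.
probe_async() {
    sh -c 'kill -s "$1" "$$"; echo ignored' sh "$1" &
    wait "$!"
}

int_result=$(probe_async INT)
quit_result=$(probe_async QUIT)

echo "SIGINT in async command:  ${int_result:-killed}"
echo "SIGQUIT in async command: ${quit_result:-killed}"
```

With job control disabled, both lines should report the signal as ignored;
a "killed" result would mean the child ran with the default disposition.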

That has been the way shells work since before either
the Bourne shell (and all later shells based upon it,
like bash) or job control, were invented.

That is all you are seeing here.

kre



SIGINT handling during async functions

2023-01-12 Thread Tycho Kirchner

Hi,
we found quite some inconsistency and weirdness in the handling of SIGINT's 
during async function calls and were wondering, whether those are expected. All 
calls were executed from a script with jobcontrol turned off (set +m) while 
pressing Ctrl+C shortly afterwards. In summary:
The main INT handler is never executed in foofunc (is that expected?) while the new (default) handler 
either aborts command execution in case of 'foofunc &' or continues execution in case of '{ foofunc; 
} &'. While on 'foofunc &' 'trap -p' at the beginning of foofunc (wrongly) prints the main 
handler, in case of '{ foofunc; } &' it suddenly prints the ignore handler "trap -- '' 
SIGINT" and remains indeed uninterruptible. Thus printing the trap apparently changes bash's 
behavior.

Tested bash versions:
GNU bash, Version 5.1.4(1)-release (x86_64-pc-linux-gnu)
GNU bash, Version 5.2.2(1)-release (x86_64-pc-linux-gnu)
on Debian Bullseye.

Thanks and kind regards
Tycho


t='echo INT ${FUNCNAME[0]-main} >&2'
trap "$t" INT
foofunc(){ sleep 3; echo foo >&2; }
foofunc &
sleep 5
--> INT main
# foofunc INT-handler is reset to default ('foo' is not printed).
# Note that 'trap -p' within foofunc wrongly prints above INT handler.



t='echo INT ${FUNCNAME[0]-main} >&2'
trap "$t" INT
foofunc(){ trap "$t" INT; sleep 3; echo foo >&2; }
foofunc &
sleep 5
--> INT main
INT foofunc
foo
# foofunc custom INT-handler works.


t='echo INT ${FUNCNAME[0]-main} >&2'
trap "$t" INT
foofunc(){ sleep 3; echo foo >&2; }
{ foofunc; } &
sleep 5
--> INT main
foo
# As opposed to 'foofunc &', foo _is_ printed, so apparently we have a
# different default trap handler here.


t='echo INT ${FUNCNAME[0]-main} >&2'
trap "$t" INT
foofunc(){ trap -p; sleep 3; echo foo >&2; }
{ foofunc; } &
sleep 5
--> trap -- '' SIGINT
^CINT main
$ foo
# Here, when the trap is printed, INT is reported as "ignored" and foofunc
# becomes indeed uninterruptible. So, 'trap -p' changes bash's behavior.




Re: SIGINT handling

2015-09-24 Thread Chet Ramey
On 9/21/15 5:07 PM, Stephane Chazelas wrote:

> The problem is that here the parent's SIGINT handler is run upon
> the return from waitpid(), just after. My patch doesn't rely on
> EINTR from waitpid() (which doesn't happen here, waitpid() returns
> with the pid of the child that did an exit() upon receiving
> SIGINT), just on the "status" returned by the child, so doesn't
> have the problem.

I wonder if the kernel is restarting the waitpid() even though the
signal handler was installed without SA_RESTART.

> What do you suggest we do to fix that issue?

I think your additional test for wait_sigint_received coupled with a
check that the child died for some other reason than SIGINT, in addition
to the EINTR test, is a reasonable fix.

>> This still counts as catching and handling the SIGINT, and the shell
>> should not act as if the foreground process died as a result of one.
> 
> That's the point I'm arguing on.
> 
> If the command handled SIGINT and returned with 130, I argue it
> is considering itself and telling its parent as having been
> "interrupted"

No, it's not.  If a shell exits with status 130, it's saying that the last
command it executed was killed by SIGINT, or happened to exit with status
130 for some random reason.  It's very possible for a non-interactive
shell to restore its original SIGINT handler and resend SIGINT to itself,
if it's concerned about telling its parent that it's been interrupted.
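
The ambiguity of status 130 can be sketched from the calling shell's side. This assumes SIGINT has its default disposition in the child; the variable names are made up for the illustration:

```shell
# A child that really dies from SIGINT (default disposition assumed).
sh -c 'kill -s INT "$$"'
killed_status=$?

# A child that merely exits with the same number.
sh -c 'exit 130'
plain_status=$?

# kill -l maps an exit status back to a signal name, but it has no way
# of knowing whether a signal was actually involved.
signal_name=$(kill -l "$plain_status")

echo "killed=$killed_status plain=$plain_status name=$signal_name"
```

From `$?` alone the two cases look the same to a script, which is why a child that wants its parent to see a real interruption has to be killed by the signal rather than exit with the number.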

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: SIGINT handling

2015-09-24 Thread Stephane Chazelas
2015-09-24 14:53:16 -0400, Chet Ramey:
> On 9/24/15 9:57 AM, Stephane Chazelas wrote:
> 
> > IMO, the best approach would be to give up on WCE altogether
> > which is more source of frustration anyway than it has ever
> > helped. I live very well with a /bin/sh (dash) and interactive
> > shell (zsh) that don't do it.
> 
> We'll agree to disagree.
[...]


Now that we're settled on WCE,

would you agree that

a=$(cmd-that-catches-sigint)

should behave like

(cmd-that-catches-sigint)

(as in, not exit the shell as per WCE)?

What about $PIPESTATUS?

In:

cmd-that-catches-sigint | cmd-that-does-not
or
cmd-that-does-not | cmd-that-catches-sigint

Should we exit on SIGINT or leave that command run in
background?

Should pipefail have an influence on the behaviour? What about
lastpipe?

What about when using the wait builtin?

Why should:

cmd & wait "$!"

be treated differently from

cmd

?

Because cmd's stdin is /dev/null and so is unlikely to be an
interactive command?

So we admit WCE is a kludge 

-- 
Stephane



Re: SIGINT handling

2015-09-24 Thread Chet Ramey
On 9/24/15 9:57 AM, Stephane Chazelas wrote:

> IMO, the best approach would be to give up on WCE altogether
> which is more source of frustration anyway than it has ever
> helped. I live very well with a /bin/sh (dash) and interactive
> shell (zsh) that don't do it.

We'll agree to disagree.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: SIGINT handling

2015-09-24 Thread Chet Ramey
On 9/22/15 8:18 AM, Greg Wooledge wrote:
> On Mon, Sep 21, 2015 at 10:07:55PM +0100, Stephane Chazelas wrote:
>> Maybe the test scenario was not clear:
>>
>> bash -c 'cmd; echo hi'
>>
>> is run from an interactive shell, cmd is a long running
>> application (the problem that sparked this discussion was with
>> ping and I showed examples with an inline-script calling sleep)
> 
> Just for the record, ping is the *classic* example of an incorrectly
> written application that traps SIGINT but doesn't kill itself with
> SIGINT afterward.  (This seems to be true on multiple systems -- at
> the very least, HP-UX and Linux pings both suffer from it.)
> 
> A loop like this works as expected:
> 
> while true; do
>   sleep 1
> done
> 
> A loop like this does not:
> 
> while true; do
>   ping -c 1 some.host # or on HP-UX, ping some.host -n 1
> done

If you decide, as bash has, to allow the foreground job to determine what
to do with SIGINT, you have to cope with software like ping.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: SIGINT handling

2015-09-24 Thread Stephane Chazelas
Given that the bug was introduced by Linus' patch (to fix a bug
that anyway is in all shell implementations that do WCE) and
that it's caused by a behaviour that seems to be specific to the
Linux kernel (that the kernel seems to be messing up with the
order of delivery of the SIGCHLD (or return from waitpid()) and
SIGINT), we may want to bring the issue up to him.

Here, the behaviour could be seen as a kernel bug, since the
child should clearly die *after* the SIGINT has been issued to
the parent (since the ^C should insert the SIGINT in the signal
queue of both parent and child processes at the same time) so
it's wrong for SIGINT to be handled *after* waitpid() returns.

But of course one can also argue that the order of signal
delivery is not guaranteed in general anyway.

IMO, the best approach would be to give up on WCE altogether
which is more source of frustration anyway than it has ever
helped. I live very well with a /bin/sh (dash) and interactive
shell (zsh) that don't do it.

WCE may be good in a perfect world where everything does it
(everything that calls waitpid() without using system(3)), but
if not, I hardly see the point.

What's the point of bash doing it when sh, find -exec, xargs,
watch, git (like in that emacs bug report) don't do it?

It seems to me that finding another way to address it (like
emacs' approach of putting itself on its own in a new foreground
job if it's not already a process group leader) for the rare
cases where it's useful (like the vi -> :! case) would be
better.

-- 
Stephane



Re: SIGINT handling

2015-09-24 Thread Stephane Chazelas
2015-09-24 09:36:08 +0100, Pádraig Brady:
[...]
> > (gdb) handle SIGINT nostop pass
[...]
> 
> In case it's relevant, I'm not entirely sure of gdb's signal handling:
> https://sourceware.org/bugzilla/show_bug.cgi?id=18364

Yes, I wondered about that.

I'd expect the "handle SIGINT nostop pass", to take gdb out of
the loop, but I've not verified it and I suspect ptracing could
have side effects.

It's easy to corroborate with printfs though here which I just
did:

$ ./bash -c './a; echo x'
^Cwait_sigint_received=1 pid=-1
wait_sigint_received=1 pid=956
x
$ ./bash -c './a; echo x'
^Cwait_sigint_received=1 pid=958

$ diff -pu jobs.c\~ jobs.c
--- jobs.c~ 2015-09-20 20:03:14.692119372 +0100
+++ jobs.c  2015-09-24 11:49:03.963122465 +0100
@@ -3262,6 +3262,7 @@ itrace("waitchld: waitpid returns %d blo
 require the child to actually die due to SIGINT to act on the
 SIGINT we received; otherwise we assume the child handled it and
 let it go. */
+  fprintf(stderr, "wait_sigint_received=%d pid=%d\n", wait_sigint_received, pid);
   if (pid < 0 && errno == EINTR && wait_sigint_received)
child_caught_sigint = 1;

-- 
Stephane



Re: SIGINT handling

2015-09-24 Thread Pádraig Brady
On 24/09/15 07:20, Stephane Chazelas wrote:
> 2015-09-24 07:01:23 +0100, Stephane Chazelas:
>> 2015-09-23 21:27:00 -0400, Chet Ramey:
>>> On 9/19/15 5:31 PM, Stephane Chazelas wrote:
>>>
>>>> In case it was caused by some Debian patch, I recompiled the
>>>> code of 4.3.42 from gnu.org and the one from the devel branch on
>>>> the git repository (commit bash-20150911 snapshot) and still:
>>>>
>>>> $ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
>>>> ^Chi
>>>> $ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
>>>> ^Chi
>>>> $ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
>>>> ^C
>>>> $ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
>>>> ^Chi
>>>>
>>>> Sometimes (and the frequency of occurrences is erratic,
>>>> generally roughly 80% of "hi"s but at times, I don't see a "hi"
>>>> in a while), the "hi" doesn't show up. Note that I press ^C well
>>>> after sleep has started.
>>>
>>> It would be nice to see a system call trace for this so we can check
>>> what's going on with the timing.
>>
>> I don't have them logged but I did several tests in gdb
>> with "handle SIGINT nostop pass" and as I said before, 
>> Upon the test that sets child_caught_sigint, waitpid() has not
>> returned with EINTR and wait_sigint_received has been set.
>>
>> If I break on the SIGINT handler, I see the call trace at the
>> return of the "syscall".
>>
>> I can try and get you a call trace later today.
> [...]
> 
> (gdb) handle SIGINT nostop pass
> SIGINT is used by the debugger.
> Are you sure you want to change it? (y or n) y
> Signal        Stop      Print   Pass to program Description
> SIGINT        No        Yes     Yes             Interrupt
> (gdb) break wait_sigint_handler
> Breakpoint 1 at 0x443a70: file jobs.c, line 2241.
> (gdb) run
> Starting program: bash-4.3/bash -c ./a\;\ echo\ x
> ^C
> Program received signal SIGINT, Interrupt.
> 
> Breakpoint 1, wait_sigint_handler (sig=2) at jobs.c:2241
> 2241{
> (gdb) bt
> #0  wait_sigint_handler (sig=2) at jobs.c:2241
> #1  <signal handler called>
> #2  0x776bc31c in __libc_waitpid (pid=pid@entry=-1, 
> stat_loc=stat_loc@entry=0x7fffdbc8, options=options@entry=0) at 
> ../sysdeps/unix/sysv/linux/waitpid.c:31
> #3  0x00445f3d in waitchld (block=block@entry=1, wpid=5337) at 
> jobs.c:3224
> #4  0x0044733b in wait_for (pid=5337) at jobs.c:2485
> #5  0x00437992 in execute_command_internal 
> (command=command@entry=0x70bb88, asynchronous=asynchronous@entry=0, 
> pipe_in=pipe_in@entry=-1, pipe_out=pipe_out@entry=-1,
> fds_to_close=fds_to_close@entry=0x70bde8) at execute_cmd.c:829
> #6  0x00437b0e in execute_command (command=0x70bb88) at 
> execute_cmd.c:390
> #7  0x00435f23 in execute_connection (fds_to_close=0x70bdc8, 
> pipe_out=-1, pipe_in=-1, asynchronous=0, command=0x70bd88) at 
> execute_cmd.c:2494
> #8  execute_command_internal (command=0x70bd88, 
> asynchronous=asynchronous@entry=0, pipe_in=pipe_in@entry=-1, 
> pipe_out=pipe_out@entry=-1, fds_to_close=fds_to_close@entry=0x70bdc8)
> at execute_cmd.c:945
> #9  0x0047955b in parse_and_execute (string=<optimized out>,
> from_file=from_file@entry=0x4b5f96 "-c", flags=flags@entry=4) at
> evalstring.c:387
> #10 0x004205d7 in run_one_command (command=<optimized out>) at
> shell.c:1348
> #11 0x0041f524 in main (argc=3, argv=0x7fffe258, 
> env=0x7fffe278) at shell.c:695
> (gdb) frame 2
> #2  0x776bc31c in __libc_waitpid (pid=pid@entry=-1, 
> stat_loc=stat_loc@entry=0x7fffdbc8, options=options@entry=0) at 
> ../sysdeps/unix/sysv/linux/waitpid.c:31
> 31  ../sysdeps/unix/sysv/linux/waitpid.c: No such file or directory.
> (gdb) disassemble
> Dump of assembler code for function __libc_waitpid:
>    0x776bc300 <+0>:     mov    0x2f14cd(%rip),%r9d        # 0x779ad7d4 <__libc_multiple_threads>
>    0x776bc307 <+7>:     test   %r9d,%r9d
>    0x776bc30a <+10>:    jne    0x776bc336 <__libc_waitpid+54>
>    0x776bc30c <+12>:    xor    %r10d,%r10d
>    0x776bc30f <+15>:    movslq %edx,%rdx
>    0x776bc312 <+18>:    movslq %edi,%rdi
>    0x776bc315 <+21>:    mov    $0x3d,%eax
>    0x776bc31a <+26>:    syscall
> => 0x776bc31c <+28>:    cmp    $0xf000,%rax
>    0x776bc322 <+34>:    ja     0x776bc325 <__libc_waitpid+37>
>    0x776bc324 <+36>:    retq
>    0x776bc325 <+37>:    mov    0x2ebb3c(%rip),%rdx        # 0x779a7e68
>    0x776bc32c <+44>:    neg    %eax
>    0x776bc32e <+46>:    mov    %eax,%fs:(%rdx)
>    0x776bc331 <+49>:    or     $0x,%rax
> (gdb) fin
> Run till exit from #2  0x776bc31c in __libc_waitpid 
> (pid=pid@entry=-1, stat_loc=stat_loc@entry=0x7fffdbc8, 
> options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:31
> 0x00445f3d in waitchld (block=block@entry=1, wpid=5481) at jobs.c:3224
> 3224  pid = WAITPID (-1, &status, waitpid_flags);
> V

Re: SIGINT handling

2015-09-23 Thread Stephane Chazelas
2015-09-24 07:01:23 +0100, Stephane Chazelas:
> 2015-09-23 21:27:00 -0400, Chet Ramey:
> > On 9/19/15 5:31 PM, Stephane Chazelas wrote:
> > 
> > > In case it was caused by some Debian patch, I recompiled the
> > > code of 4.3.42 from gnu.org and the one from the devel branch on
> > > the git repository (commit bash-20150911 snapshot) and still:
> > > 
> > > $ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
> > > ^Chi
> > > $ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
> > > ^Chi
> > > $ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
> > > ^C
> > > $ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
> > > ^Chi
> > > 
> > > Sometimes (and the frequency of occurrences is erratic,
> > > generally roughly 80% of "hi"s but at times, I don't see a "hi"
> > > in a while), the "hi" doesn't show up. Note that I press ^C well
> > > after sleep has started.
> > 
> > It would be nice to see a system call trace for this so we can check
> > what's going on with the timing.
> 
> I don't have them logged but I did several tests in gdb
> with "handle SIGINT nostop pass" and as I said before, 
> Upon the test that sets child_caught_sigint, waitpid() has not
> returned with EINTR and wait_sigint_received has been set.
> 
> If I break on the SIGINT handler, I see the call trace at the
> return of the "syscall".
> 
> I can try and get you a call trace later today.
[...]

(gdb) handle SIGINT nostop pass
SIGINT is used by the debugger.
Are you sure you want to change it? (y or n) y
Signal        Stop      Print   Pass to program Description
SIGINT        No        Yes     Yes             Interrupt
(gdb) break wait_sigint_handler
Breakpoint 1 at 0x443a70: file jobs.c, line 2241.
(gdb) run
Starting program: bash-4.3/bash -c ./a\;\ echo\ x
^C
Program received signal SIGINT, Interrupt.

Breakpoint 1, wait_sigint_handler (sig=2) at jobs.c:2241
2241{
(gdb) bt
#0  wait_sigint_handler (sig=2) at jobs.c:2241
#1  <signal handler called>
#2  0x776bc31c in __libc_waitpid (pid=pid@entry=-1, 
stat_loc=stat_loc@entry=0x7fffdbc8, options=options@entry=0) at 
../sysdeps/unix/sysv/linux/waitpid.c:31
#3  0x00445f3d in waitchld (block=block@entry=1, wpid=5337) at 
jobs.c:3224
#4  0x0044733b in wait_for (pid=5337) at jobs.c:2485
#5  0x00437992 in execute_command_internal 
(command=command@entry=0x70bb88, asynchronous=asynchronous@entry=0, 
pipe_in=pipe_in@entry=-1, pipe_out=pipe_out@entry=-1,
fds_to_close=fds_to_close@entry=0x70bde8) at execute_cmd.c:829
#6  0x00437b0e in execute_command (command=0x70bb88) at 
execute_cmd.c:390
#7  0x00435f23 in execute_connection (fds_to_close=0x70bdc8, 
pipe_out=-1, pipe_in=-1, asynchronous=0, command=0x70bd88) at execute_cmd.c:2494
#8  execute_command_internal (command=0x70bd88, 
asynchronous=asynchronous@entry=0, pipe_in=pipe_in@entry=-1, 
pipe_out=pipe_out@entry=-1, fds_to_close=fds_to_close@entry=0x70bdc8)
at execute_cmd.c:945
#9  0x0047955b in parse_and_execute (string=<optimized out>,
from_file=from_file@entry=0x4b5f96 "-c", flags=flags@entry=4) at
evalstring.c:387
#10 0x004205d7 in run_one_command (command=<optimized out>) at
shell.c:1348
#11 0x0041f524 in main (argc=3, argv=0x7fffe258, 
env=0x7fffe278) at shell.c:695
(gdb) frame 2
#2  0x776bc31c in __libc_waitpid (pid=pid@entry=-1, 
stat_loc=stat_loc@entry=0x7fffdbc8, options=options@entry=0) at 
../sysdeps/unix/sysv/linux/waitpid.c:31
31  ../sysdeps/unix/sysv/linux/waitpid.c: No such file or directory.
(gdb) disassemble
Dump of assembler code for function __libc_waitpid:
   0x776bc300 <+0>:     mov    0x2f14cd(%rip),%r9d        # 0x779ad7d4 <__libc_multiple_threads>
   0x776bc307 <+7>:     test   %r9d,%r9d
   0x776bc30a <+10>:    jne    0x776bc336 <__libc_waitpid+54>
   0x776bc30c <+12>:    xor    %r10d,%r10d
   0x776bc30f <+15>:    movslq %edx,%rdx
   0x776bc312 <+18>:    movslq %edi,%rdi
   0x776bc315 <+21>:    mov    $0x3d,%eax
   0x776bc31a <+26>:    syscall
=> 0x776bc31c <+28>:    cmp    $0xf000,%rax
   0x776bc322 <+34>:    ja     0x776bc325 <__libc_waitpid+37>
   0x776bc324 <+36>:    retq
   0x776bc325 <+37>:    mov    0x2ebb3c(%rip),%rdx        # 0x779a7e68
   0x776bc32c <+44>:    neg    %eax
   0x776bc32e <+46>:    mov    %eax,%fs:(%rdx)
   0x776bc331 <+49>:    or     $0x,%rax
(gdb) fin
Run till exit from #2  0x776bc31c in __libc_waitpid (pid=pid@entry=-1, 
stat_loc=stat_loc@entry=0x7fffdbc8, options=options@entry=0) at 
../sysdeps/unix/sysv/linux/waitpid.c:31
0x00445f3d in waitchld (block=block@entry=1, wpid=5481) at jobs.c:3224
3224  pid = WAITPID (-1, &status, waitpid_flags);
Value returned is $5 = 5337
(gdb) p wait_sigint_received
$6 = 1

In the other (working) cases, the difference is that waitpid()
returns -1 EINTR instead.

Note that Bart on the zsh mailing

Re: SIGINT handling

2015-09-23 Thread Stephane Chazelas
2015-09-23 21:27:00 -0400, Chet Ramey:
> On 9/19/15 5:31 PM, Stephane Chazelas wrote:
> 
> > In case it was caused by some Debian patch, I recompiled the
> > code of 4.3.42 from gnu.org and the one from the devel branch on
> > the git repository (commit bash-20150911 snapshot) and still:
> > 
> > $ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
> > ^Chi
> > $ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
> > ^Chi
> > $ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
> > ^C
> > $ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
> > ^Chi
> > 
> > Sometimes (and the frequency of occurrences is erratic,
> > generally roughly 80% of "hi"s but at times, I don't see a "hi"
> > in a while), the "hi" doesn't show up. Note that I press ^C well
> > after sleep has started.
> 
> It would be nice to see a system call trace for this so we can check
> what's going on with the timing.

I don't have them logged but I did several tests in gdb
with "handle SIGINT nostop pass" and as I said before, 
Upon the test that sets child_caught_sigint, waitpid() has not
returned with EINTR and wait_sigint_received has been set.

If I break on the SIGINT handler, I see the call trace at the
return of the "syscall".

I can try and get you a call trace later today.

> 
> Can you reproduce this on anything other than Debian?  I'm wondering
> whether it's a Linux-4 kernel phenomenon.  Plus I don't have any
> Debian machines laying around.


It's hard to reproduce on an idle system. It's relatively easy
to reproduce on a busy one and if the "cmd" exits shortly after
receiving its SIGINT. I can reproduce on a Ubuntu 14.04 with an
older kernel (3.13). I can't reproduce on FreeBSD (in a VM
though).


cmd ==
#include <signal.h>
#include <unistd.h>
int main(void) { signal(SIGINT, _exit); pause(); return 0; }

$ tar zcf - / >& /dev/null &
[1] 4417
$ tar zcf - / >& /dev/null &
[2] 4419
$ tar zcf - / >& /dev/null &
[3] 4421
$ bash -c './a.out; echo x'
^Cx
$ bash -c './a.out; echo x'
^C

Works on second attempt.

-- 
Stephane



Re: SIGINT handling

2015-09-23 Thread Chet Ramey
On 9/20/15 11:52 AM, Stephane Chazelas wrote:

> When the above code exits without printing "hi", we see this
> call stack for instance (breakpoint on kill() in gdb):
> 
> #0  kill () at ../sysdeps/unix/syscall-template.S:81
> #1  0x0045dd8e in termsig_handler (sig=<optimized out>) at sig.c:588
> #2  0x0045ddef in termsig_handler (sig=<optimized out>) at sig.c:554
> #3  0x004466bb in set_job_status_and_cleanup (job=0) at jobs.c:3539
> #4  waitchld (block=block@entry=1, wpid=20802) at jobs.c:3316
> #5  0x0044733b in wait_for (pid=20802) at jobs.c:2485
> #6  0x00437992 in execute_command_internal 
> (command=command@entry=0x70aa48, asynchronous=asynchronous@entry=0, 
> pipe_in=pipe_in@entry=-1, pipe_out=pipe_out@entry=-1,
> fds_to_close=fds_to_close@entry=0x70bb68) at execute_cmd.c:829
> #7  0x00437b0e in execute_command (command=0x70aa48) at 
> execute_cmd.c:390
> #8  0x00435f23 in execute_connection (fds_to_close=0x70bb48, 
> pipe_out=-1, pipe_in=-1, asynchronous=0, command=0x70bb08) at 
> execute_cmd.c:2494
> #9  execute_command_internal (command=0x70bb08, 
> asynchronous=asynchronous@entry=0, pipe_in=pipe_in@entry=-1, 
> pipe_out=pipe_out@entry=-1, fds_to_close=fds_to_close@entry=0x70bb48)
> at execute_cmd.c:945
> #10 0x0047955b in parse_and_execute (string=<optimized out>,
> from_file=from_file@entry=0x4b5f96 "-c", flags=flags@entry=4) at
> evalstring.c:387
> #11 0x004205d7 in run_one_command (command=<optimized out>) at
> shell.c:1348
> #12 0x0041f524 in main (argc=3, argv=0x7fffe198, 
> env=0x7fffe1b8) at shell.c:695
> 
> That is, SIGINT is being handled *after* the SIGINT handler has
> been restored to its default of exiting the shell.

An alternate explanation is that somehow the shell is forgetting that
SIGINT is trapped.  I don't see how or why that would happen, but I
don't have enough information to determine whether that's the case.

> Now, I'm not sure how to best fix that as I suppose we don't get
> any guarantee of when SIGINT will be delivered (it may be why
> ksh93 ignores SIGINT altogether and relies solely on
> WIFSIGNALED)
> 
> The above scenario suggests SIGCHLD is being delivered before
> SIGINT which is strange. I'd expect SIGINT to be inserted by the
> kernel in both cmd and bash queues upon CTRL-C, and the SIGCHLD
> would necessarily come after those SIGINT. Could it be that
> SIGCHLD jumps the queue?

The above scenario doesn't suggest that SIGCHLD is being delivered at
all.  The shell is doing a blocking waitpid for a specific pid.


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: SIGINT handling

2015-09-23 Thread Chet Ramey
On 9/19/15 5:31 PM, Stephane Chazelas wrote:

> In case it was caused by some Debian patch, I recompiled the
> code of 4.3.42 from gnu.org and the one from the devel branch on
> the git repository (commit bash-20150911 snapshot) and still:
> 
> $ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
> ^Chi
> $ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
> ^Chi
> $ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
> ^C
> $ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
> ^Chi
> 
> Sometimes (and the frequency of occurrences is erratic,
> generally roughly 80% of "hi"s but at times, I don't see a "hi"
> in a while), the "hi" doesn't show up. Note that I press ^C well
> after sleep has started.

It would be nice to see a system call trace for this so we can check
what's going on with the timing.

Can you reproduce this on anything other than Debian?  I'm wondering
whether it's a Linux-4 kernel phenomenon.  Plus I don't have any
Debian machines laying around.

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: SIGINT handling

2015-09-22 Thread Stephane Chazelas
2015-09-22 12:04:45 -0600, Bob Proulx:
> Greg Wooledge wrote:
> > Just for the record, ping is the *classic* example of an incorrectly
> > written application that traps SIGINT but doesn't kill itself with
> > SIGINT afterward.  (This seems to be true on multiple systems -- at
> > the very least, HP-UX and Linux pings both suffer from it.)
> 
> The command I run into the problem most with is 'rsync' in a loop.
> 
>   EXIT VALUES
>0  Success
>   ...
>20 Received SIGUSR1 or SIGINT
> 
> Which forces me to write such things this way.
> 
>   rsync ...
>   rc=$?
>   if [ $rc -eq 20 ]; then
> kill -INT $$
>   fi
>   if [ $rc -ne 0 ]; then
> echo "Error: failed: ..." 1>&2
> exit 1
>   fi
[...]

Another (generic) work-around as mentioned at
http://unix.stackexchange.com/a/230568
and here is to add:

trap '
  trap - INT
  kill -s INT "$$"
' INT

That doesn't work properly if there are subshells though.

That basically turns a WCE shell to WUE (for very simple scripts).

For SIGQUIT, you'd probably want to disable core dumps as well:
  
trap '
  trap - QUIT
  ulimit -c 0
  kill -s QUIT "$$"
' QUIT
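
One likely reason the trap work-around misbehaves in subshells, sketched under the assumption of a POSIX-like sh: `$$` keeps the value of the main shell's PID inside a subshell, so `kill -s INT "$$"` run there targets the main shell rather than the subshell itself. The variable names are only illustrative:

```shell
# PID of the main shell.
main_pid=$$

# Inside a subshell (here a command substitution), $$ is not updated:
# it still expands to the main shell's PID.
sub_view=$(echo "$$")

echo "main shell: $main_pid, \$\$ as seen in subshell: $sub_view"
```

In bash specifically, `$BASHPID` gives the PID of the current subshell and could be used in place of `$$` in such a trap, at the cost of portability.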

-- 
Stephane



Re: SIGINT handling

2015-09-22 Thread Bob Proulx
Greg Wooledge wrote:
> Just for the record, ping is the *classic* example of an incorrectly
> written application that traps SIGINT but doesn't kill itself with
> SIGINT afterward.  (This seems to be true on multiple systems -- at
> the very least, HP-UX and Linux pings both suffer from it.)

The command I run into the problem most with is 'rsync' in a loop.

  EXIT VALUES
   0  Success
  ...
   20 Received SIGUSR1 or SIGINT

Which forces me to write such things this way.

  rsync ...
  rc=$?
  if [ $rc -eq 20 ]; then
kill -INT $$
  fi
  if [ $rc -ne 0 ]; then
echo "Error: failed: ..." 1>&2
exit 1
  fi

Bob



Re: SIGINT handling

2015-09-22 Thread Stephane Chazelas
2015-09-22 15:18:32 +0100, Stephane Chazelas:
> 2015-09-22 09:41:35 -0400, Chet Ramey:
> [...]
> > > AFAICT emacs starts a new process group (and makes it the
> > > foreground process group).
> > 
> > Maybe, if it's being run from an interactive shell or in a separate
> > X window.  On the other hand, run this script with `dash':
> [...]
> 
> It does that unconditionally (since 94 at least), but that's
> under a #ifdef BSD_PGRPS in the emacs source. Strangely enough,
> that BSD_PGRPS is not defined anymore for freebsd or netbsd
> though it is for gnu-linux
[...]

To add on that, the code was removed at some point altogether
http://git.savannah.gnu.org/cgit/emacs.git/commit/?id=58eb6cf0f77547d29f4fddca922eb6f98c0ffb28
in emacs-24.0.96 and then added back without the #ifdef
BSD_PGRPS
http://git.savannah.gnu.org/cgit/emacs.git/commit/?id=322aea6ddf7ec7fd71410d98ec1de69f219aff3e
in emacs-24.2.90

So versions 24.0.96 to 24.2 must have been broken under
gnu-linux as well, and newer versions (24.2.90 and above) should
be OK including on FreeBSD|OS/X (so no need to report it as a
bug to the emacs maintainers).

-- 
Stephane



Re: SIGINT handling

2015-09-22 Thread Chet Ramey
On 9/22/15 11:28 AM, Stephane Chazelas wrote:
> 2015-09-22 15:18:32 +0100, Stephane Chazelas:
>> 2015-09-22 09:41:35 -0400, Chet Ramey:
>> [...]
>>>> AFAICT emacs starts a new process group (and makes it the
>>>> foreground process group).
>>>
>>> Maybe, if it's being run from an interactive shell or in a separate
>>> X window.  On the other hand, run this script with `dash':
>> [...]
>>
>> It does that unconditionally (since 94 at least), but that's
>> under a #ifdef BSD_PGRPS in the emacs source. Strangely enough,
>> that BSD_PGRPS is not defined anymore for freebsd or netbsd
>> though it is for gnu-linux
> [...]
> 
> To add on that, the code was removed at some point altogether
> http://git.savannah.gnu.org/cgit/emacs.git/commit/?id=58eb6cf0f77547d29f4fddca922eb6f98c0ffb28
> in emacs-24.0.96 and then added back without the #ifdef
> BSD_PGRPS
> http://git.savannah.gnu.org/cgit/emacs.git/commit/?id=322aea6ddf7ec7fd71410d98ec1de69f219aff3e
> in emacs-24.2.90
> 
> So versions 24.0.96 to 24.2 must have been broken under
> gnu-linux as well, and newer versions (24.2.90 and above) should
> be OK including on FreeBSD|OS/X (so no need to report it as a
> bug to the emacs maintainers).

I don't use GNU emacs; it's not that big a deal.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: SIGINT handling

2015-09-22 Thread Stephane Chazelas
2015-09-22 16:28:16 +0100, Stephane Chazelas:
[...]
> To add on that, the code was removed at some point altogether
> http://git.savannah.gnu.org/cgit/emacs.git/commit/?id=58eb6cf0f77547d29f4fddca922eb6f98c0ffb28
> in emacs-24.0.96 and then added back without the #ifdef
> BSD_PGRPS
> http://git.savannah.gnu.org/cgit/emacs.git/commit/?id=322aea6ddf7ec7fd71410d98ec1de69f219aff3e
> in emacs-24.2.90
[...]

And here's the bug that prompted for reinserting that code,
which is relevant to this discussion:

https://debbugs.gnu.org/cgi/bugreport.cgi?bug=12697

-- 
Stephane



Re: SIGINT handling

2015-09-22 Thread Stephane Chazelas
2015-09-22 09:41:35 -0400, Chet Ramey:
[...]
> > AFAICT emacs starts a new process group (and makes it the
> > foreground process group).
> 
> Maybe, if it's being run from an interactive shell or in a separate
> X window.  On the other hand, run this script with `dash':
[...]

It does that unconditionally (since 94 at least), but that's
under a #ifdef BSD_PGRPS in the emacs source. Strangely enough,
that BSD_PGRPS is not defined anymore for freebsd or netbsd
though it is for gnu-linux

It seems it's because the meaning of that macro has changed over
time.

I suspect it used to mean "whether job control was available",
but now it's to decide whether to use setpgrp or setpgid. The
part that puts emacs on its own foreground process group
(narrow_foreground_group) does use setpgrp() (and calls
tcsetpgrp()) but after a:

#ifdef HAVE_SETPGID
#if !defined (USG) || defined (BSD_PGRPS)
#undef setpgrp
#define setpgrp setpgid
#endif
#endif

So in any case, it is calling setpgid()

Just seems like a bug/oversight that narrow_foreground_group is
not done on BSD and causes the problem you observe.

-- 
Stephane



Re: SIGINT handling

2015-09-22 Thread Stephane Chazelas
2015-09-22 09:41:35 -0400, Chet Ramey:
[...]
> > AFAICT emacs starts a new process group (and makes it the
> > foreground process group).
> 
> Maybe, if it's being run from an interactive shell or in a separate
> X window.  On the other hand, run this script with `dash':
> 
> echo before
> emacs -nw /tmp/qux
> echo after
> 
> If you use ^G to abort an editing command in emacs, you won't see `after'
> displayed and the script will exit with status 130, even though emacs
> clearly doesn't die due to SIGINT.
[...]

It works for me (on Debian, displays both before and after) as
emacs starts in a new process group.

The problem seems to be with some ports of emacs to OS/X and was
already discussed at
http://www.zsh.org/mla/workers/2009/msg00926.html about the
MacPorts version of Emacs that doesn't seem to be starting the
new process group.

-- 
Stephane



Re: SIGINT handling

2015-09-22 Thread Chet Ramey
On 9/21/15 5:24 PM, Stephane Chazelas wrote:
> 2015-09-21 15:34:28 -0400, Chet Ramey:
>> On 9/21/15 5:48 AM, Stephane Chazelas wrote:
>>
>>> I'm not sure I prefer that WCE approach over WUE. Wouldn't it be
>>> preferable that applications that intercept SIGINT/QUIT/TSTP for
>>> anything other than clean-up before exit/suspend implement job
>>> control themselves instead (like vi's :! should create a process
>>> group and make that the foreground process group of the
>>> terminal so pressing ^C in sh -c vi, :!sleep 10, only sends the
>>> SIGINT to sleep)?
>>
>> The classic example is emacs remapping the terminal intr key to ^G
>> and using SIGINT as its internal abort-command signal.
> [...]
> 
> AFAICT emacs starts a new process group (and makes it the
> foreground process group).

Maybe, if it's being run from an interactive shell or in a separate
X window.  On the other hand, run this script with `dash':

echo before
emacs -nw /tmp/qux
echo after

If you use ^G to abort an editing command in emacs, you won't see `after'
displayed and the script will exit with status 130, even though emacs
clearly doesn't die due to SIGINT.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: SIGINT handling

2015-09-22 Thread Stephane Chazelas
2015-09-22 08:18:08 -0400, Greg Wooledge:
[...]
> You might already have been aware of this; I'm not sure.  But in any case,
> it makes a tremendous difference what "cmd" is in your example.  You
> can't generalize it.

Hi Greg,

Yes, this whole thread is about the behaviour of non-interactive
bash with commands that call exit() upon SIGINT. It was
initially a follow-up on
https://unix.stackexchange.com/questions/230421/unable-to-stop-a-bash-script-with-ctrlc/230568#230568

which was about ping specifically.

It's true that with shells implementing WCE, the behaviour of
ping is unfortunate, but I don't think we can say that ping is
to blame, more WCE.

ping cannot exit other than on error or when killed. It seems
reasonable for it to exit (after printing the statistics)
if there was no error upon CTRL-C.

Note that the iputils version does a
exit(!nreceived || (deadline && nreceived < npackets));

It is returning information to the caller, which it couldn't do
if it killed itself.

That allows system("ping something") for instance to make use of
the return status (system(3) ignores SIGINT in the parent).
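
To make the difference concrete, here is a minimal sh sketch (not ping's actual code): one child catches SIGINT and picks its own exit status, the other keeps the default disposition and dies of the signal. The run_* function names and the status value 3 are made up for illustration, and the signal is sent with kill rather than from the terminal so the outcome doesn't depend on timing.

```sh
#!/bin/sh
# Child that catches SIGINT and reports a status of its own choosing
# (3 is arbitrary here, standing in for ping's computed exit value).
run_catching_child() {
  sh -c 'trap "exit 3" INT; kill -INT $$; exit 0'
  echo "$?"
}

# Child that keeps the default disposition and so dies of SIGINT.
run_dying_child() {
  sh -c 'kill -INT $$; exit 0'
  echo "$?"
}

caught_status=$(run_catching_child)
killed_status=$(run_dying_child)
echo "caught and exited: $caught_status"
echo "killed by SIGINT:  $killed_status"
```

With dash or bash as sh, the second line should report 130 (128 + SIGINT), which is how those shells encode a death by signal in $?. A C caller would see WIFSIGNALED() instead, and the information the first child wanted to pass back is only available in the first case.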

The WCE behaviour is cause for a number of bugs like that so I'm
not sure it's such a great idea.

-- 
Stephane



Re: SIGINT handling

2015-09-22 Thread Greg Wooledge
On Mon, Sep 21, 2015 at 10:07:55PM +0100, Stephane Chazelas wrote:
> Maybe the test scenario was not clear:
> 
> bash -c 'cmd; echo hi'
> 
> is run from an interactive shell, cmd is a long running
> application (the problem that sparked this discussion was with
> ping and I showed examples with an inline-script calling sleep)

Just for the record, ping is the *classic* example of an incorrectly
written application that traps SIGINT but doesn't kill itself with
SIGINT afterward.  (This seems to be true on multiple systems -- at
the very least, HP-UX and Linux pings both suffer from it.)

A loop like this works as expected:

while true; do
  sleep 1
done

A loop like this does not:

while true; do
  ping -c 1 some.host # or on HP-UX, ping some.host -n 1
done

You might already have been aware of this; I'm not sure.  But in any case,
it makes a tremendous difference what "cmd" is in your example.  You
can't generalize it.
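
For comparison, a handler that cooperates with the calling shell does its clean-up, restores the default disposition and then kills itself with SIGINT, so the parent sees a death by signal. A minimal sh sketch of that idiom follows; the clean-up message and the function name are placeholders, and the signal is sent with kill instead of ^C so the example can run unattended.

```sh
#!/bin/sh
# Run a child that cleans up on SIGINT and then re-raises the signal
# against itself with the default disposition restored.
run_cooperative_child() {
  sh -c '
    trap "echo cleaning up; trap - INT; kill -INT \$\$" INT
    kill -INT $$        # stand-in for the user pressing ^C
    echo "not reached"
  '
  echo "status=$?"
}

result=$(run_cooperative_child)
echo "$result"
```

The child prints its clean-up message and never reaches the final echo; the status the caller sees is the signal-death encoding (130 in dash and bash) rather than a plain exit code.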



Re: SIGINT handling

2015-09-21 Thread Stephane Chazelas
2015-09-22 07:41:09 +0100, Stephane Chazelas:
[...]
> I wonder how FreeBSD sh addresses that.
> 
> BTW, ksh93 has the problem (the 2011 one) as well as in:
> 
> ksh93 -c 'while :; do /bin/true; done'
> 
> That is sometimes not interrupted by the first ^C (same with bash
> with my patch applied).
[...]

Looks like FreeBSD sh doesn't address it either, ^C also fails to
interrupt at times there as well.

-- 
Stephane



Re: SIGINT handling

2015-09-21 Thread Stephane Chazelas
2015-09-21 22:07:55 +0100, Stephane Chazelas:
[...]
> Can you please clarify why the check for EINTR was needed?
> 
> What do you suggest we do to fix that issue?
[...]
> The thing is that thread was about the opposite problem at the
> other end of the spectrum so we need to find the right way to do
> it so we don't cause one problem or the other.
[...]

OK, I get it now, that other thread was about a totally
different scenario where ^C is pressed in between waitpid()
returning for a normal exit and bash restoring the normal
handler for SIGINT which explains the check for EINTR which is
intended as a race-free check that SIGINT was received before
the child died.

Now, that check for EINTR is wrong as well as it introduces that
other bug, so it could very well be that the only thing we can
do is reduce that window above to a minimum or give up on WCE.
Unless there's a clever thing that can be done in SIGCHLD and
SIGINT handlers. I wonder how FreeBSD sh addresses that.

BTW, ksh93 has the problem (the 2011 one) as well as in:

ksh93 -c 'while :; do /bin/true; done'

That is sometimes not interrupted by the first ^C (same with bash
with my patch applied).

Note that the WCE/WUE was discussed in 2009 on the zsh mailing
list: http://www.zsh.org/mla/workers/2009/msg00943.html
where the order of delivery for SIGCHLD and SIGINT was already
noted.

It looks like the zsh maintainers are no big fans of WCE either.

-- 
Stephane



Re: SIGINT handling

2015-09-21 Thread Stephane Chazelas
2015-09-21 22:24:03 +0100, Stephane Chazelas:
[...]
> If it didn't, we could not use it in scripts of shells that
> don't do WCE *but also in non-shell scripts* (perl, python,
> ruby...) or non-scripts.
[...]

For completeness

perl's and python's system() like system(3) ignore SIGINT, so
it's a WUNE (wait and unconditionally not exit).

python's subprocess.call() does "IUE" (for "immediate
unconditional exit")

-- 
Stephane



Re: SIGINT handling

2015-09-21 Thread Stephane Chazelas
2015-09-21 15:34:28 -0400, Chet Ramey:
> On 9/21/15 5:48 AM, Stephane Chazelas wrote:
> 
> > I'm not sure I prefer that WCE approach over WUE. Wouldn't it be
> > preferable that applications that intercept SIGINT/QUIT/TSTP for
> > anything other than clean-up before exit/suspend implement job
> > control themselves instead (like vi's :! should create a process
> > group and make that the foreground process group of the
> > terminal so pressing ^C in sh -c vi, :!sleep 10, only sends the
> > SIGINT to sleep)?
> 
> The classic example is emacs remapping the terminal intr key to ^G
> and using SIGINT as its internal abort-command signal.
[...]

AFAICT emacs starts a new process group (and makes it the
foreground process group).

UID        PID  PPID  PGID   SID  C STIME TTY          TIME CMD
chazelas 12232  5595 12232 12232  0 15:00 pts/13   00:00:00 /bin/zsh
chazelas 13609 12232 13609 12232  0 22:14 pts/13   00:00:00   sh -c emacs; echo test
chazelas 13610 13609 13610 12232  0 22:14 pts/13   00:00:00     emacs

From strace:

13766 setpgid(0, 0) = 0
13766 ioctl(3, TIOCSPGRP, [13766])  = 0

If it didn't, we could not use it in scripts of shells that
don't do WCE *but also in non-shell scripts* (perl, python,
ruby...) or non-scripts.

A real-life problem though is things like:

sh -c 'vi; echo hi'

Where if you run :!sleep 10 and interrupt it with Ctrl-C, the
"echo hi" is not run in shells that don't do WCE (and non-shell
scripts and non-scripts that don't do it either).

-- 
Stephane



Re: SIGINT handling

2015-09-21 Thread Stephane Chazelas
2015-09-21 15:04:46 -0400, Chet Ramey:
> On 9/20/15 3:45 PM, Stephane Chazelas wrote:
> > 2015-09-20 17:12:45 +0100, Stephane Chazelas:
> > [...]
> >> I thought the termsig_handler was being invoked upon SIGINT as
> >> the SIGINT handler, but it is being called explicitly by
> >> set_job_status_and_cleanup so the problem is elsewhere.
> >>
> >> child_caught_sigint is 0 while if I understand correctly it
> >> should be 1 for a cmd that calls exit() upon SIGINT. So that's
> >> probably where we should be looking.
> > [...]
> > 
> > I had another look.
> > 
> > If we're to believe gdb, child_caught_sigint is 0 because
> > waitpid() returns without EINTR even though wait_sigint_received
> > is 1.
> > 
> > The only reasonable explanation I can think of is that the child
> > handles its SIGINT first, exits which updates its state and
> > causes bash the parent to be scheduled, and waitpid() returns
> > (without EINTR) and after that bash's SIGINT handler kicks in too
> > late.
> 
> Absent kernel problems, there are four scenarios for the child process
> reacting to SIGINT:
> 
> 1.  The SIGINT arrives before the child begins executing.
> 
> 2.  The SIGINT arrives while the child is executing.
> 
> 3.  The SIGINT arrives while the child is exiting successfully.
> 
> 4.  The SIGINT arrives after the child has exited but before the
> parent's waitpid() returns.
> 
> In the first two cases, the shell's waitpid() should return -1, but the
> first case will probably return ECHILD while the second returns EINTR.
> In the third case, there's not really anything the shell can do, since
> there's nothing to distinguish that case from one where the child catches
> SIGINT and exits successfully, and your patch doesn't change things.
> The fourth case will, in practice, be indistinguishable from the third
> case, since the kernel is usually `greedy' and will not return EINTR if
> there is something to report.

The problem is that here the parent's SIGINT handler is run upon
the return from waitpid(), just after. My patch doesn't rely on
EINTR from waitpid() (which doesn't happen here, waitpid() returns
with the pid of the child that did an exit() upon receiving
SIGINT), just on the "status" returned by the child, so doesn't
have the problem.

There would still be a problem if SIGINT was handled even
later (after we test for it), but I could not reproduce that.

Given that the SIGCHLD should come before SIGINT, it would seem
reasonable to assume SIGINT should be handled at the latest upon
the return of waitpid().

Can you please clarify why the check for EINTR was needed?

What do you suggest we do to fix that issue?

> In all these cases, I assume that bash has called waitchld() and
> waiting_for_child == 1.  If it's not, the signal handler treats the
> signal as it would normally, if it were not waiting for a child to
> exit.

Maybe the test scenario was not clear:

bash -c 'cmd; echo hi'

is run from an interactive shell, cmd is a long running
application (the problem that sparked this discussion was with
ping and I showed examples with an inline-script calling sleep)
that has a handler on SIGINT that calls exit().

Upon pressing ^C, a few seconds after starting that, so at a
time where bash is doing waitpid() and cmd is doing something
like sleep(), the tty line discipline sends SIGINT to the
foreground process group, so both bash and cmd.

Now, the whole problem is caused by cmd calling exit() straight
upon receiving that SIGINT. So everything happens at the same
time.

In the case where "hi" is not output, we have the events in this
order:
- SIGINT is sent to bash and cmd
- cmd handles its SIGINT and calls exit()
- bash's waitpid() returns without being interrupted with the
child's status being 0 (in any case not WIFSIGNALED() with
SIGINT).
- Straight upon return of that syscall (gdb shows a call trace
for the SIGINT handler in the __waitpid() libc wrapper), bash's
SIGINT handler is executed.
- we return from the handler in __waitpid(), and then:
  if (pid < 0 && errno == EINTR && wait_sigint_received)
   child_caught_sigint = 1;
  because waitpid() did *not* return with EINTR, we have
  child_caught_sigint = 0 (even though the child clearly caught
  sigint as it returned with WIFEXITED), so even though
  wait_sigint_received is 1, bash makes the wrong decision,
  decides the child did not catch SIGINT and kills itself with
  SIGINT (and so doesn't run echo hi).
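
That sequence can be walked through with a small sh model of the check quoted above. This is only a sketch of the logic, not bash's code: the function name, the argument encoding and the hard-coded SIGINT number 2 are assumptions made for the illustration.

```sh
#!/bin/sh
# Model of the pre-patch decision in waitchld().
#   $1: 1 if some waitpid() call returned -1/EINTR, else 0
#   $2: 1 if bash's own SIGINT handler ran (wait_sigint_received)
#   $3: "signaled" or "exited" (how the child terminated)
#   $4: terminating signal number or exit status
# Prints the resulting value of child_caught_sigint.
old_child_caught_sigint() {
  caught=0
  if [ "$1" -eq 1 ] && [ "$2" -eq 1 ]; then
    caught=1
  fi
  # A child that really died of SIGINT (assumed to be signal 2) resets it.
  if [ "$3" = signaled ] && [ "$4" -eq 2 ]; then
    caught=0
  fi
  echo "$caught"
}

# Handler ran while waitpid() was still blocked: EINTR seen, child exited 0.
echo "EINTR seen:  $(old_child_caught_sigint 1 1 exited 0)"
# The race described above: child exits first, waitpid() returns its pid.
echo "no EINTR:    $(old_child_caught_sigint 0 1 exited 0)"
# Child killed by SIGINT: bash should act on the interrupt.
echo "died of INT: $(old_child_caught_sigint 1 1 signaled 2)"
```

The second call yields 0 for a child that in fact caught SIGINT and exited normally, which is the case where "hi" goes missing.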

Other things that don't look right to me in the code are where
those conditions are asserted and tested, and where the handler is
set and reset, but then again I don't have the full picture.

> > Anyway, this patch makes the problem go away for me (and
> > addresses my problem #2 about exit code 130 not being treated
> > as an interrupted child). It might break things though if there
> > was a real reason for bash to check for waitpid()'s EINTR.
> 
> You should read
> 
> http://lists.gnu.org/archive/html/bug-bash/2011-02/msg00088.html
> 
> for a 

Re: SIGINT handling

2015-09-21 Thread Chet Ramey
On 9/21/15 5:48 AM, Stephane Chazelas wrote:

> I'm not sure I prefer that WCE approach over WUE. Wouldn't it be
> preferable that applications that intercept SIGINT/QUIT/TSTP for
> anything other than clean-up before exit/suspend implement job
> control themselves instead (like vi's :! should create a process
> group and make that the foreground process group of the
> terminal so pressing ^C in sh -c vi, :!sleep 10, only sends the
> SIGINT to sleep)?

The classic example is emacs remapping the terminal intr key to ^G
and using SIGINT as its internal abort-command signal.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: SIGINT handling

2015-09-21 Thread Chet Ramey
On 9/20/15 3:45 PM, Stephane Chazelas wrote:
> 2015-09-20 17:12:45 +0100, Stephane Chazelas:
> [...]
>> I thought the termsig_handler was being invoked upon SIGINT as
>> the SIGINT handler, but it is being called explicitly by
>> set_job_status_and_cleanup so the problem is elsewhere.
>>
>> child_caught_sigint is 0 while if I understand correctly it
>> should be 1 for a cmd that calls exit() upon SIGINT. So that's
>> probably where we should be looking.
> [...]
> 
> I had another look.
> 
> If we're to believe gdb, child_caught_sigint is 0 because
> waitpid() returns without EINTR even though wait_sigint_received
> is 1.
> 
> The only reasonable explanation I can think of is that the child
> handles its SIGINT first, exits which updates its state and
> causes bash the parent to be scheduled, and waitpid() returns
> (without EINTR) and after that bash's SIGINT handler kicks in too
> late.

Absent kernel problems, there are four scenarios for the child process
reacting to SIGINT:

1.  The SIGINT arrives before the child begins executing.

2.  The SIGINT arrives while the child is executing.

3.  The SIGINT arrives while the child is exiting successfully.

4.  The SIGINT arrives after the child has exited but before the
parent's waitpid() returns.

In the first two cases, the shell's waitpid() should return -1, but the
first case will probably return ECHILD while the second returns EINTR.
In the third case, there's not really anything the shell can do, since
there's nothing to distinguish that case from one where the child catches
SIGINT and exits successfully, and your patch doesn't change things.
The fourth case will, in practice, be indistinguishable from the third
case, since the kernel is usually `greedy' and will not return EINTR if
there is something to report.

In all these cases, I assume that bash has called waitchld() and
waiting_for_child == 1.  If it's not, the signal handler treats the
signal as it would normally, if it were not waiting for a child to
exit.

> 
> Anyway, this patch makes the problem go away for me (and
> addresses my problem #2 about exit code 130 not being treated
> as an interrupted child). It might break things though if there
> was a real reason for bash to check for waitpid()'s EINTR.

You should read

http://lists.gnu.org/archive/html/bug-bash/2011-02/msg00088.html

for a summary of why the test for waitpid() returning -1/EINTR exists.
Linus's posts, at least the ones where there's more light than heat, are
good reading.

> With that patch applied,
> 
> ./bash -c 'sh -c "trap exit INT; sleep 120; :"; echo hi'
> ./bash -c 'mksh -c "sleep 120; :"; echo hi'
> 
> Does *not* output "hi" (as mksh or sh do an exit(130) which is
> regarded as them being "interrupted by that SIGINT", or at least
> reporting that the child they want to report the status of
> (sleep) has been killed by a SIGINT).

This still counts as catching and handling the SIGINT, and the shell
should not act as if the foreground process died as a result of one.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: SIGINT handling

2015-09-21 Thread Jilles Tjoelker
On Mon, Sep 21, 2015 at 10:48:07AM +0100, Stephane Chazelas wrote:
> 2015-09-19 21:36:28 +0100, Stephane Chazelas:
> > 2015-09-18 16:14:39 +0100, Stephane Chazelas:
> > [...]
> > > In:
> > > 
> > > bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
> > > 
> > > If I press Ctrl-C, I still see "hi".
> > [...]
> > 
> > Jilles provided the explanation at
> > http://unix.stackexchange.com/a/230731

> > with a link to:
> > http://www.cons.org/cracauer/sigint.html
> [...]

> Note that bash (and ksh, contrary to FreeBSD sh) is not
> consistent in its handling of that "WCE" (for "wait and
> cooperative exit") approach in that pressing ^C in:

> bash -c '
>   var=$(sh -c "trap \"\" INT; sleep 3; echo result")
>   echo "$var"
> '

> kills bash, leaving the "sh" and "sleep" running unattended in
> background.

> Same for:

> bash -O lastpipe -c '
>   sh -c "trap \"\" INT; sleep 3; echo test" | read var; echo done'

> One could also argue, that to be consistent, SIGTSTP and SIGQUIT
> should be treated similarly (strangely enough
> http://www.cons.org/cracauer/sigint.html doesn't mention SIGTSTP).

Agreed for SIGQUIT, but not for SIGTSTP. For SIGTSTP, either the shell
has job control enabled or it does not. If it does, SIGTSTP stops the
job and continues the shell; if it does not, SIGTSTP stops the whole job
including the shell.

> I'm not sure I prefer that WCE approach over WUE. Wouldn't it be
> preferable that applications that intercept SIGINT/QUIT/TSTP for
> anything other than clean-up before exit/suspend implement job
> control themselves instead (like vi's :! should create a process
> group and make that the foreground process group of the
> terminal so pressing ^C in sh -c vi, :!sleep 10, only sends the
> SIGINT to sleep)?

This kind of job control manipulation is very hard to get right in the
general case. FreeBSD's su does it, and it needed various iterations to
fix hanging processes or unexpected logouts, some of which only occur
when the application is started from certain shells.

Also, it is not possible to fix generally cases like
  su SOMEUSER -c 'while sleep 0.1; do echo @@@; done' | less
where there are other processes in the same process group as the one
doing job control manipulations. If su changes the tty's foreground
process group, it will prevent less from reconfiguring terminal modes.

-- 
Jilles Tjoelker



Re: SIGINT handling

2015-09-21 Thread Stephane Chazelas
2015-09-21 17:35:36 +0200, Jilles Tjoelker:
[...]
> This kind of job control manipulation is very hard to get right in the
> general case. FreeBSD's su does it, and it needed various iterations to
> fix hanging processes or unexpected logouts, some of which only occur
> when the application is started from certain shells.
> 
> Also, it is not possible to fix generally cases like
>   su SOMEUSER -c 'while sleep 0.1; do echo @@@; done' | less
> where there are other processes in the same process group as the one
> doing job control manipulations. If su changes the tty's foreground
> process group, it will prevent less from reconfiguring terminal modes.
[...]

What was the rationale for adding that to "su"? I'd have
expected job control to be only done by interactive
applications. 

-- 
Stephane



Re: SIGINT handling

2015-09-21 Thread Stephane Chazelas
2015-09-21 17:35:36 +0200, Jilles Tjoelker:
[...]
> > One could also argue, that to be consistent, SIGTSTP and SIGQUIT
> > should be treated similarly (strangely enough
> > http://www.cons.org/cracauer/sigint.html doesn't mention SIGTSTP).
> 
> Agreed for SIGQUIT, but not for SIGTSTP. For SIGTSTP, either the shell
> has job control enabled or it does not. If it does, SIGTSTP stops the
> job and continues the shell; if it does not, SIGTSTP stops the whole job
> including the shell.
[...]

Not sure what you mean, we may not be talking of the same thing.

What I meant:

In:

sh -c '(trap "" INT; sleep 10); echo done'

If you send ^C, nothing happens.

In:

sh -c '(trap "" TSTP; sleep 10); echo done'

If you press ^Z, the "sh" is suspended, but "sleep" keeps
running in background.

One could argue they should be treated the same (that sh
shouldn't suspend itself if the process it's currently waiting
for has not been suspended, just like for ^C it should not die
of SIGINT if the process it's currently waiting for has not died
of SIGINT).

You may be talking of:

sh -mc 'sleep 10; echo "$?"'

where SIGINT upon ^C, SIGTSTP upon ^Z is sent to sleep only (not
sh).

-- 
Stephane



Re: SIGINT handling

2015-09-21 Thread Stephane Chazelas
2015-09-19 21:36:28 +0100, Stephane Chazelas:
> 2015-09-18 16:14:39 +0100, Stephane Chazelas:
> [...]
> > In:
> > 
> > bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
> > 
> > If I press Ctrl-C, I still see "hi".
> [...]
> 
> Jilles provided the explanation at
> http://unix.stackexchange.com/a/230731
> 
> with a link to:
> http://www.cons.org/cracauer/sigint.html
[...]

Note that bash (and ksh, contrary to FreeBSD sh) is not
consistent in its handling of that "WCE" (for "wait and
cooperative exit") approach in that pressing ^C in:

bash -c '
  var=$(sh -c "trap \"\" INT; sleep 3; echo result")
  echo "$var"
'

kills bash, leaving the "sh" and "sleep" running unattended in
background.

Same for:

bash -O lastpipe -c '
  sh -c "trap \"\" INT; sleep 3; echo test" | read var; echo done'

One could also argue, that to be consistent, SIGTSTP and SIGQUIT
should be treated similarly (strangely enough
http://www.cons.org/cracauer/sigint.html doesn't mention SIGTSTP).

I'm not sure I prefer that WCE approach over WUE. Wouldn't it be
preferable that applications that intercept SIGINT/QUIT/TSTP for
anything other than clean-up before exit/suspend implement job
control themselves instead (like vi's :! should create a process
group and make that the foreground process group of the
terminal so pressing ^C in sh -c vi, :!sleep 10, only sends the
SIGINT to sleep)?

-- 
Stephane



Re: SIGINT handling

2015-09-20 Thread Stephane Chazelas
2015-09-20 17:12:45 +0100, Stephane Chazelas:
[...]
> I thought the termsig_handler was being invoked upon SIGINT as
> the SIGINT handler, but it is being called explicitly by
> set_job_status_and_cleanup so the problem is elsewhere.
> 
> child_caught_sigint is 0 while if I understand correctly it
> should be 1 for a cmd that calls exit() upon SIGINT. So that's
> probably where we should be looking.
[...]

I had another look.

If we're to believe gdb, child_caught_sigint is 0 because
waitpid() returns without EINTR even though wait_sigint_received
is 1.

The only reasonable explanation I can think of is that the child
handles its SIGINT first, exits which updates its state and
causes bash the parent to be scheduled, and waitpid() returns
(without EINTR) and after that bash's SIGINT handler kicks in too
late.

Anyway, this patch makes the problem go away for me (and
addresses my problem #2 about exit code 130 not being treated
as an interrupted child). It might break things though if there
was a real reason for bash to check for waitpid()'s EINTR.

With that patch applied,

./bash -c 'sh -c "trap exit INT; sleep 120; :"; echo hi'
./bash -c 'mksh -c "sleep 120; :"; echo hi'

Does *not* output "hi" (as mksh or sh do an exit(130) which is
regarded as them being "interrupted by that SIGINT", or at least
reporting that the child they want to report the status of
(sleep) has been killed by a SIGINT).

And 

./bash -c 'sh -c "trap exit\ 0 INT; sleep 120; :"; echo hi'

*consistently* outputs "hi" (the zero exit status cancels the
aborting of bash).

--- jobs.c~ 2015-09-20 20:03:14.692119372 +0100
+++ jobs.c  2015-09-20 20:37:01.510892045 +0100
@@ -3257,21 +3257,15 @@ itrace("waitchld: waitpid returns %d blo
   CHECK_TERMSIG;
   CHECK_WAIT_INTR;
 
-  /* If waitpid returns -1/EINTR and the shell saw a SIGINT, then we
-assume the child has blocked or handled SIGINT.  In that case, we
-require the child to actually die due to SIGINT to act on the
-SIGINT we received; otherwise we assume the child handled it and
-let it go. */
-  if (pid < 0 && errno == EINTR && wait_sigint_received)
-   child_caught_sigint = 1;
-
   if (pid <= 0)
continue;   /* jumps right to the test */
 
-  /* If the child process did die due to SIGINT, forget our assumption
-that it caught or otherwise handled it. */
-  if (WIFSIGNALED (status) && WTERMSIG (status) == SIGINT)
-child_caught_sigint = 0;
+  /* If we received a SIGINT, but the child did not die of a SIGINT and
+ did not report a 128+SIGINT exit status, we assume the child handled
+ it and let it go. */
+  child_caught_sigint = wait_sigint_received &&
+   ! ((WIFSIGNALED (status) && WTERMSIG (status) == SIGINT) ||
+  (WIFEXITED (status) && WEXITSTATUS (status) == 128 + SIGINT));
 
   /* children_exited is used to run traps on SIGCHLD.  We don't want to
  run the trap if a process is just being continued. */
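
The rule this patch introduces can be modelled in a few lines of sh. It is a sketch of the decision only, under the assumption that SIGINT is signal 2; the function and variable names are invented for the illustration and do not exist in bash.

```sh
#!/bin/sh
# Model of the patched test: the child is considered to have handled
# SIGINT unless it died of SIGINT or exited with status 128 + SIGINT.
#   $1: 1 if bash's own SIGINT handler ran (wait_sigint_received)
#   $2: "signaled" or "exited"
#   $3: terminating signal number or exit status
SIGINT_NUM=2   # assumption: SIGINT is signal 2 on this system
new_child_caught_sigint() {
  if [ "$1" -ne 1 ]; then
    echo 0
  elif [ "$2" = signaled ] && [ "$3" -eq "$SIGINT_NUM" ]; then
    echo 0
  elif [ "$2" = exited ] && [ "$3" -eq $((128 + SIGINT_NUM)) ]; then
    echo 0
  else
    echo 1
  fi
}

echo "exited with 130:  $(new_child_caught_sigint 1 exited 130)"
echo "exited with 0:    $(new_child_caught_sigint 1 exited 0)"
echo "killed by SIGINT: $(new_child_caught_sigint 1 signaled 2)"
```

Only the middle case counts as a child that handled the interrupt, which matches the three command lines above: the first two do not print "hi", the "exit 0" one does. The result no longer depends on whether waitpid() was interrupted.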

-- 
Stephane



Re: SIGINT handling

2015-09-20 Thread Stephane Chazelas
[...]
> When the above code exits without printing "hi", we see this
> call stack for instance (breakpoint on kill() in gdb):
> 
> #0  kill () at ../sysdeps/unix/syscall-template.S:81
> #1  0x0045dd8e in termsig_handler (sig=) at sig.c:588
> #2  0x0045ddef in termsig_handler (sig=) at sig.c:554
> #3  0x004466bb in set_job_status_and_cleanup (job=0) at jobs.c:3539
> #4  waitchld (block=block@entry=1, wpid=20802) at jobs.c:3316
> #5  0x0044733b in wait_for (pid=20802) at jobs.c:2485
> #6  0x00437992 in execute_command_internal 
> (command=command@entry=0x70aa48, asynchronous=asynchronous@entry=0, 
> pipe_in=pipe_in@entry=-1, pipe_out=pipe_out@entry=-1,
> fds_to_close=fds_to_close@entry=0x70bb68) at execute_cmd.c:829
> #7  0x00437b0e in execute_command (command=0x70aa48) at 
> execute_cmd.c:390
> #8  0x00435f23 in execute_connection (fds_to_close=0x70bb48, 
> pipe_out=-1, pipe_in=-1, asynchronous=0, command=0x70bb08) at 
> execute_cmd.c:2494
> #9  execute_command_internal (command=0x70bb08, 
> asynchronous=asynchronous@entry=0, pipe_in=pipe_in@entry=-1, 
> pipe_out=pipe_out@entry=-1, fds_to_close=fds_to_close@entry=0x70bb48)
> at execute_cmd.c:945
> #10 0x0047955b in parse_and_execute (string=, 
> from_file=from_file@entry=0x4b5f96 "-c", flags=flags@entry=4) at 
> evalstring.c:387
> #11 0x004205d7 in run_one_command (command=) at 
> shell.c:1348
> #12 0x0041f524 in main (argc=3, argv=0x7fffe198, 
> env=0x7fffe1b8) at shell.c:695
> 
> That is, SIGINT is being handled *after* the SIGINT handler has
> been restored to its default of exiting the shell.
[...]

Sorry, please disregard that.

I thought the termsig_handler was being invoked upon SIGINT as
the SIGINT handler, but it is being called explicitly by
set_job_status_and_cleanup so the problem is elsewhere.

child_caught_sigint is 0 while if I understand correctly it
should be 1 for a cmd that calls exit() upon SIGINT. So that's
probably where we should be looking.

-- 
Stephane



Re: SIGINT handling

2015-09-20 Thread Stephane Chazelas
2015-09-19 21:28:24 -0400, Chet Ramey:
> On 9/19/15 5:31 PM, Stephane Chazelas wrote:
> > 2015-09-19 16:42:28 -0400, Chet Ramey:
> > [...]
> >> I'm surprised you've managed to avoid the dozen or so discussions on the
> >> topic.
> >>
> >> http://lists.gnu.org/archive/html/bug-bash/2014-03/msg00108.html
> > [...]
> > 
> > Thanks for the links. I still think the comments on the second
> > article I sent
> > (http://thread.gmane.org/gmane.comp.shells.bash.bugs/24178/focus=24183)
> > still hold though and from a quick read I don't see those points
> > being mentioned in the past discussions (but that was a quick
> > read).
> > 
> > I notice that you mention the race conditions have been fixed,
> > but I'm still seeing some non-deterministic behaviour.
> 
> I can't reproduce this on Mac OS X and RHEL 6 and 7, the systems I have
> readily available today.
> 
> The shell notes when it sees SIGINT and whether or not waitpid returns
> -1/EINTR.  If the sleep exits due to SIGINT, even after the waitpid
> returns -1, the shell assumes it didn't catch and handle the SIGINT and
> the shell calls the trap handler.
[...]

To clarify,

In

bash -c 'sh -c "trap exit INT; sleep 99; :"; echo hi'

The command under test is "bash", not "sh". The "sh" is just
there as a cmd that does exit() upon receiving SIGINT.

It's just:

bash -c 'cmd; echo hi'

You can replace "cmd" with:

perl -e '$SIG{INT}= sub{exit}; sleep'

(or

mksh -c 'sleep 10; :'

(which does an exit(130) upon receiving SIGINT))

The problem here is that when you press CTRL-C, SIGINT is sent
to all the processes in the process group, so to "bash" and
"cmd".

Now, bash works as expected only if it handles its own SIGINT
before the child has caught its own one and exited.

When the above code exits without printing "hi", we see this
call stack for instance (breakpoint on kill() in gdb):

#0  kill () at ../sysdeps/unix/syscall-template.S:81
#1  0x0045dd8e in termsig_handler (sig=) at sig.c:588
#2  0x0045ddef in termsig_handler (sig=) at sig.c:554
#3  0x004466bb in set_job_status_and_cleanup (job=0) at jobs.c:3539
#4  waitchld (block=block@entry=1, wpid=20802) at jobs.c:3316
#5  0x0044733b in wait_for (pid=20802) at jobs.c:2485
#6  0x00437992 in execute_command_internal 
(command=command@entry=0x70aa48, asynchronous=asynchronous@entry=0, 
pipe_in=pipe_in@entry=-1, pipe_out=pipe_out@entry=-1,
fds_to_close=fds_to_close@entry=0x70bb68) at execute_cmd.c:829
#7  0x00437b0e in execute_command (command=0x70aa48) at 
execute_cmd.c:390
#8  0x00435f23 in execute_connection (fds_to_close=0x70bb48, 
pipe_out=-1, pipe_in=-1, asynchronous=0, command=0x70bb08) at execute_cmd.c:2494
#9  execute_command_internal (command=0x70bb08, 
asynchronous=asynchronous@entry=0, pipe_in=pipe_in@entry=-1, 
pipe_out=pipe_out@entry=-1, fds_to_close=fds_to_close@entry=0x70bb48)
at execute_cmd.c:945
#10 0x0047955b in parse_and_execute (string=, 
from_file=from_file@entry=0x4b5f96 "-c", flags=flags@entry=4) at 
evalstring.c:387
#11 0x004205d7 in run_one_command (command=) at 
shell.c:1348
#12 0x0041f524 in main (argc=3, argv=0x7fffe198, 
env=0x7fffe1b8) at shell.c:695

That is, SIGINT is being handled *after* the SIGINT handler has
been restored to its default of exiting the shell.

Now, I'm not sure how to best fix that as I suppose we don't get
any guarantee of when SIGINT will be delivered (it may be why
ksh93 ignores SIGINT altogether and relies solely on
WIFSIGNALED)

The above scenario suggests SIGCHLD is being delivered before
SIGINT which is strange. I'd expect SIGINT to be inserted by the
kernel in both cmd and bash queues upon CTRL-C, and the SIGCHLD
would necessarily come after those SIGINT. Could it be that
SIGCHLD jumps the queue?

Note that I'm not seeing that as often on every system. It seems
I can make it more likely by making the system busier.

-- 
Stephane



Re: SIGINT handling

2015-09-19 Thread Chet Ramey
On 9/19/15 5:31 PM, Stephane Chazelas wrote:
> 2015-09-19 16:42:28 -0400, Chet Ramey:
> [...]
>> I'm surprised you've managed to avoid the dozen or so discussions on the
>> topic.
>>
>> http://lists.gnu.org/archive/html/bug-bash/2014-03/msg00108.html
> [...]
> 
> Thanks for the links. I still think the comments on the second
> article I sent
> (http://thread.gmane.org/gmane.comp.shells.bash.bugs/24178/focus=24183)
> still hold though and from a quick read I don't see those points
> being mentioned in the past discussions (but that was a quick
> read).
> 
> I notice that you mention the race conditions have been fixed,
> but I'm still seeing some non-deterministic behaviour.

I can't reproduce this on Mac OS X and RHEL 6 and 7, the systems I have
readily available today.

The shell notes when it sees SIGINT and whether or not waitpid returns
-1/EINTR.  If the sleep exits due to SIGINT, even after the waitpid
returns -1, the shell assumes it didn't catch and handle the SIGINT and
the shell calls the trap handler.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: SIGINT handling

2015-09-19 Thread Stephane Chazelas
2015-09-19 16:42:28 -0400, Chet Ramey:
[...]
> I'm surprised you've managed to avoid the dozen or so discussions on the
> topic.
> 
> http://lists.gnu.org/archive/html/bug-bash/2014-03/msg00108.html
[...]

Thanks for the links. I still think the comments on the second
article I sent
(http://thread.gmane.org/gmane.comp.shells.bash.bugs/24178/focus=24183)
still hold though and from a quick read I don't see those points
being mentioned in the past discussions (but that was a quick
read).

I notice that you mention the race conditions have been fixed,
but I'm still seeing some non-deterministic behaviour.

In case it was caused by some Debian patch, I recompiled the
code of 4.3.42 from gnu.org and the one from the devel branch on
the git repository (commit bash-20150911 snapshot) and still:

$ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
^Chi
$ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
^Chi
$ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
^C
$ ./bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
^Chi

Sometimes the "hi" doesn't show up. The frequency is erratic:
generally "hi" appears roughly 80% of the time, but at times I
don't see a "hi" for a while. Note that I press ^C well
after sleep has started.

On Linux 4.1.0-1-amd64 core2 duo, bash compiled with gcc (Debian
5.2.1-16) 5.2.1 20150903 linked with
GNU C Library (Debian GLIBC 2.19-19) stable release version 2.19, by Roland McGrath et al.
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 4.8.5.
Compiled on a Linux 4.0.7 system on 2015-07-09.
Available extensions:
crypt add-on version 2.1 by Michael Glad and others
GNU Libidn by Simon Josefsson
Native POSIX Threads Library by Ulrich Drepper et al
BIND-8.2.3-T5B
libc ABIs: UNIQUE IFUNC

-- 
Stephane



Re: SIGINT handling

2015-09-19 Thread Chet Ramey
On 9/18/15 11:14 AM, Stephane Chazelas wrote:
> Hello.
> 
> In:
> 
> bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
> 
> If I press Ctrl-C, I still see "hi".
> 
> On Solaris with 4.1.11(2)-release (i386-pc-solaris2.11), that
> seems to be consistent.
> 
> On Debian with 4.3.42(1)-release (x86_64-pc-linux-gnu), that
> seems to happen only in something like 80% of the time.
> 
> For bash to exit upon receiving that SIGINT, the currently
> running process has to die itself as well of SIGINT (or the
> currently running command to be builtin).
> 
> That sounds like a bad idea, especially considering that it
> doesn't exit either if the process returns with exit code 130
> upon receiving that SIGINT. For instance:
> 
> For instance, in:
> 
> bash -c 'mksh -c "sleep 10; :"; echo hi'
> 
> Upon pressing Ctrl-C, mksh handles the SIGINT and exits with
> 130 (as opposed to dying of a SIGINT), so bash doesn't exit
> (sometimes only on Debian).

I'm surprised you've managed to avoid the dozen or so discussions on the
topic.

http://lists.gnu.org/archive/html/bug-bash/2014-03/msg00108.html

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: SIGINT handling

2015-09-19 Thread Stephane Chazelas
2015-09-18 16:14:39 +0100, Stephane Chazelas:
[...]
> In:
> 
> bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'
> 
> If I press Ctrl-C, I still see "hi".
[...]

Jilles provided the explanation at
http://unix.stackexchange.com/a/230731

with a link to:
http://www.cons.org/cracauer/sigint.html

Which makes sense.
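
A minimal sketch of the idiom described there: a program that catches
SIGINT in order to clean up then restores the default action and kills
itself with SIGINT, so that its parent sees a death by signal rather
than a normal exit. The `cleanup` function is hypothetical, the child
sends SIGINT to itself as a stand-in for Ctrl-C, and the sketch assumes
SIGINT is not already ignored when the script starts.

```sh
#!/bin/sh
# Sketch only: `cleanup` is a hypothetical function, and the child sends
# SIGINT to itself as a stand-in for the user pressing Ctrl-C.
# Assumes SIGINT is not already ignored when this script starts.
child='
cleanup() { echo "cleaning up"; }
# Catch SIGINT, clean up, then restore the default action and re-raise.
trap "cleanup; trap - INT; kill -s INT $$" INT
kill -s INT $$
echo "not reached"
'
tmp=$(mktemp)
rc=0
# Output goes to a file rather than a command substitution, so that the
# parent shell only ever sees the status of a plain foreground command.
sh -c "$child" >"$tmp" || rc=$?
out=$(cat "$tmp")
rm -f "$tmp"
echo "child printed: $out"
echo "child status as seen by the parent shell: $rc"
```

The cleanup output should appear, "not reached" should not, and the
parent should see the status it reports for a SIGINT death.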

Now, IMO a few things could be improved:

1- it would be nice if it could be clearly documented

2- if the shell received SIGINT, then I'd argue the currently
running process returning with a "status" such that
WIFEXITED(status) && WEXITSTATUS(status) == SIGINT + 0200
should be another case where bash (and AT&T ksh and FreeBSD sh)
should exit as well (by killing themselves with SIGINT or
exit(SIGINT + 0200)).

That's my:

> That sounds like a bad idea, especially considering that it
> doesn't exit either if the process returns with exit code 130
> upon receiving that SIGINT. For instance:
> 
> For instance, in:
> 
> bash -c 'mksh -c "sleep 10; :"; echo hi'
> 
> Upon pressing Ctrl-C, mksh handles the SIGINT and exits with
> 130 (as opposed to dying of a SIGINT), so bash doesn't exit
> (sometimes only on Debian).

3- There still seems to be a bug in bash in that

> On Debian with 4.3.42(1)-release (x86_64-pc-linux-gnu), that
> seems to happen only in something like 80% of the time.
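
To illustrate point 2, a small sketch of why the two cases are easy to
confuse from the command line: the shell's `$?` is the same whether the
child caught SIGINT and exited with SIGINT + 0200 or was killed by
SIGINT, and only the wait status seen by the parent process tells them
apart. This assumes SIGINT is signal 2 (as on Linux and the BSDs), that
`sh` is bash or dash, and that SIGINT is not ignored on entry; each
child sends the signal to itself in place of Ctrl-C.

```sh
#!/bin/sh
# 0200 is octal for 128; SIGINT is assumed to be signal 2 here.
expected=$((0200 + 2))

# Child 1: catches SIGINT and exits normally with status 130.
rc_exit=0
sh -c 'trap "exit 130" INT; kill -s INT $$; :' || rc_exit=$?

# Child 2: keeps the default disposition and is killed by SIGINT.
rc_kill=0
sh -c 'kill -s INT $$; :' || rc_kill=$?

echo "expected=$expected exited=$rc_exit killed=$rc_kill"
# kill -l maps an exit status above 128 back to a signal name.
echo "signal name for $rc_kill: $(kill -l "$rc_kill")"
```

Both children should report the same value in `$?`, which is why a
script cannot make this distinction itself and the waiting shell has to.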

Cheers,
Stephane



SIGINT handling

2015-09-18 Thread Stephane Chazelas
Hello.

In:

bash -c 'sh -c "trap exit INT; sleep 10; :"; echo hi'

If I press Ctrl-C, I still see "hi".

On Solaris with 4.1.11(2)-release (i386-pc-solaris2.11), that
seems to be consistent.

On Debian with 4.3.42(1)-release (x86_64-pc-linux-gnu), that
seems to happen only something like 80% of the time.

For bash to exit upon receiving that SIGINT, the currently
running process has to die of SIGINT itself as well (or the
currently running command has to be a builtin).

That sounds like a bad idea, especially considering that it
doesn't exit either if the process returns with exit code 130
upon receiving that SIGINT.

For instance, in:

bash -c 'mksh -c "sleep 10; :"; echo hi'

Upon pressing Ctrl-C, mksh handles the SIGINT and exits with
130 (as opposed to dying of a SIGINT), so bash doesn't exit
(sometimes only on Debian).

ksh93 seems to be doing something similar (even worse).
http://unix.stackexchange.com/a/230568/22565

Why? What's the rationale behind that? It seems it's not
documented and contradicts the documentation.

-- 
Stephane