On Fri, Jun 7, 2013 at 11:05 PM, Roland Mainz <[email protected]> wrote:
> Hi!
>
> ----
>
> While testing the signal issues we stumbled over an issue with the
> "wait" builtin - it seems doing a $ while ! wait ; do true ; done #
> (the loop is necessary since "wait" can be interrupted by signals)
> doesn't wait for all child processes launched by the shell.
>
> The following testcase...
> -- snip --
> $ cat ksh_waitjobs1.sh
>
> typeset -i i
>
> for (( i=0 ; i < 256 ; i++ )) ; do
>     {
>         sleep 10
>         exit 0
>     } &
> done
>
> while ! wait ; do
>     true
> done
>
> jl="$(LC_ALL='C' jobs -l | fgrep Running)"
>
> if [[ "$jl" == '' ]] ; then
>     printf '# success.\n'
>     exit 0
> else
>     printf '# error: job list not empty:\n%s\n' "$jl"
>     exit 1
> fi
>
> # notreached
> -- snip --
>
> ... works flawlessly with bash4 (bash 4.2.42(1)) on SuSE 12.3/AMD64/64bit:
> -- snip --
> $ bash ksh_waitjobs1.sh
> # success.
> -- snip --
>
> ... but the same script fails with ast-ksh.2013-05-24:
> -- snip --
> $ ~/bin/ksh ksh_waitjobs1.sh
> # error: job list not empty:
> [256] + 8094 Running <command unknown>
> [255] - 8093 Running <command unknown>
> [254] 8092 Running <command unknown>
> [253] 8091 Running <command unknown>
> [252] 8090 Running <command unknown>
> [251] 8089 Running <command unknown>
> [250] 8088 Running <command unknown>
> [249] 8087 Running <command unknown>
> [snip]
> [6] 7844 Running <command unknown>
> [5] 7843 Running <command unknown>
> [4] 7842 Running <command unknown>
> [3] 7841 Running <command unknown>
> [2] 7840 Running <command unknown>
> [1] 7839 Running <command unknown>
> -- snip --
>
> Uhm... is this a bug or am I doing something wrong ?
AFAIK I found the root cause for the issue...
... the following modification of the original script...
-- snip --
typeset -i i
for (( i=0 ; i < 256 ; i++ )) ; do
    {
        sleep 10
        exit 0
    } &
done
while ! wait ; do
    true
done
jl="$(LC_ALL='C' jobs -l | fgrep 'Running')"
if [[ "$jl" != '' ]] ; then
    printf '# error: job list not empty:\n%s\n' "$jl"
    (( i=0 ))
    while true ; do
        (( i++ ))
        jl="$(LC_ALL='C' jobs -l | fgrep 'Running')"
        [[ "$jl" == '' ]] && break
        sleep 0.0001
    done
    printf '# took %d cycles to drain the queue.\n' i
fi
exit 0
-- snip --
... shows that it *ALWAYS* takes exactly one loop cycle ("# took 1
cycles to drain the queue.") until "jobs -l" reports that no jobs are
left:
-- snip --
$ ksh ksh_waitjobs1.sh
# error: job list not empty:
[256] + 13139 Running <command unknown>
[255] - 13138 Running <command unknown>
[254] 13137 Running <command unknown>
[253] 13136 Running <command unknown>
[252] 13135 Running <command unknown>
[251] 13134 Running <command unknown>
[250] 13133 Running <command unknown>
[249] 13132 Running <command unknown>
[248] 13131 Running <command unknown>
[247] 13130 Running <command unknown>
[246] 13129 Running <command unknown>
[245] 13128 Running <command unknown>
[244] 13127 Running <command unknown>
[243] 13126 Running <command unknown>
[242] 13125 Running <command unknown>
[241] 13124 Running <command unknown>
[snip]
[5] 12888 Running <command unknown>
[4] 12887 Running <command unknown>
[3] 12886 Running <command unknown>
[2] 12885 Running <command unknown>
[1] 12884 Running <command unknown>
# took 1 cycles to drain the queue.
-- snip --
It looks like calling $ jobs -l # doesn't check whether the state of
the child processes has changed... but something does so _after_ $
jobs -l # has printed its output (e.g. calling $ jobs -l # again or
running any external process does the checking...).
IMO $ jobs -l # should always reflect the latest status of the child
processes as the system reports it via the SIGCHLD handler's siginfo
structure (anyone who wants to listen to all the state changes of a
child process has to set up a CHLD trap and look at the
.sh.sig.code/.sh.sig.pid variables ...).
The original testcase with a workaround applied (and with the
"fgrep" pipe removed, so that it cannot count as the external
process) looks like this:
-- snip --
typeset -i i
for (( i=0 ; i < 256 ; i++ )) ; do
    {
        sleep 10
        exit 0
    } &
done
while ! wait ; do
    true
done
# run external process (to keep jobs -l output updated) as
# workaround for ksh93 <= ast-ksh.2013-05-24
/usr/bin/true >'/dev/null'
jl="${ LC_ALL='C' jobs -l ; }"
if [[ "$jl" != *Running* ]] ; then
    printf '# success.\n'
    exit 0
else
    printf '# error: job list not empty:\n%s\n' "$jl"
    exit 1
fi
# notreached
-- snip --
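For comparison, a possible alternative to the external-process
workaround is to record each child's PID from $! and wait for every
PID individually, so that the result does not depend on what
"jobs -l" prints. The following is only a minimal sketch under stated
assumptions: it has NOT been tested against ast-ksh.2013-05-24, the
child count is reduced to 8 and the sleep shortened to 1 second to
keep it quick, and the variable names "pids" and "failed" are made up
for the illustration.

```shell
# Sketch only (untested on ast-ksh.2013-05-24): wait for each
# background child by PID instead of a bare "wait" plus "jobs -l".
# "pids" and "failed" are hypothetical names for this example.
pids=''
failed=0

i=0
while [ "$i" -lt 8 ] ; do
    {
        sleep 1
        exit 0
    } &
    pids="$pids $!"      # $! is the PID of the last background job
    i=$((i + 1))
done

for pid in $pids ; do
    # "wait <pid>" returns that child's exit status. This sketch does
    # not retry when wait itself is interrupted by a trapped signal;
    # such a case is simply counted as a failure.
    wait "$pid" || failed=$((failed + 1))
done

if [ "$failed" -eq 0 ] ; then
    printf '# success.\n'
else
    printf '# error: %d children failed or were not reaped.\n' "$failed"
fi
```

Since every PID is waited for explicitly, the script knows all
children are gone without consulting the job table at all; whether
"jobs -l" still shows stale "Running" entries afterwards on the
affected ksh93 versions is a separate question that this sketch does
not answer.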
----
Bye,
Roland
--
__ . . __
(o.\ \/ /.o) [email protected]
\__\/\/__/ MPEG specialist, C&&JAVA&&Sun&&Unix programmer
/O /==\ O\ TEL +49 641 3992797
(;O/ \/ \O;)
_______________________________________________
ast-developers mailing list
[email protected]
http://lists.research.att.com/mailman/listinfo/ast-developers