Re: [ast-developers] Hang because job management _within_ the SIGCHLD signal handler... / was: Re: AT&T Software Technology ast alpha software download update

Glenn Fowler Tue, 30 Jul 2013 07:12:50 -0700

On Tue, 30 Jul 2013 15:40:48 +0200 Roland Mainz wrote:
> --047d7b6d8d7cb1934204e2bac182
> Content-Type: text/plain; charset=ISO-8859-1


> On Mon, Jul 29, 2013 at 3:34 PM, Roland Mainz <[email protected]> 
> wrote:
> > On Sat, Jul 27, 2013 at 6:17 PM, Glenn Fowler <[email protected]> wrote:
> [snip]
> > I hit (again) one of the nasty and hard to reproduce problems (which
> > means: No... I don't have a testcase... ;-( ) related to job
> > management:
> > -- snip --
> [snip]
> > -- snip --
> >
> > ... this hang occured because ksh93 tries to do job management from
> > within the signal handler... which is IMO _far_ from being safe to
> > do...
> >
> > David: Is there *ANY* way the job management can be moved out of the
> > signal handler and just rely on the siginfo data ?

> Attached (as 
> "astksh20130727_prototype_process_sigchld_outside_signalhandler001.diff.txt")
> is a prototype (hacked) patch for discussion.

> * The good news are:
> - SIGCHLD processing now seems to be reliable. I can't find a simple
> testcase to break it (but see the item about the testsuite below)
> - SIGCHLD processing now works without problems under "valgrind" control
> - The patch seems to fix a lot of Cedric's issues with signals being
> lost. Even the testcase where a $ ... while ! wait ; sleep 8 ; done #
> was loosing signals now works... if I replace the "sleep 8" with
> "compound -a dummy ; poll -t 8 dummy" (so there seems to be a seperate
> problem with "sleep" ... somehow)

> * The bad news are:
> - The patch doesn't work very well for interactive shells
> - The "sigchld.sh" testsuite module fails in several places... AFAIK
> the reason is that we don't check the list of queued CHLD signals in
> all places where they should be checked (e.g. each time before we
> execute a shell statement and before and after running external
> processes and each time wait(1) internally wakes up from a signal)
> - $ jobs -l # reports already purged processes... erm... no clue why... ;-(

> * Notes about interactive mode:
> An interactive shell AFAIK requires to report terminating childs
> "immediately" if some shell options are set (erm... David: Which
> option is this again ?). AFAIK this can be archived by two ways:
> 1. wake up from time-to-time (e.g. polling) ... but this has a bad
> effect in CPU usage (imagine 1000 shell sessions in a multi-user
> system waking-up 10 times per second... or imagine the impact for
> battery life in a laptop)
> 2. add some glue to the editor code to wake up for each signald and
> process it when |sfpkrd()| returns...
> -- snip --
> (gdb) where
> #0  0x00007fc2be588860 in __poll_nocancel () at
> ../sysdeps/unix/syscall-template.S:81
> #1  0x000000000050280d in sfpkrd (fd=0, argbuf=0x7fff33545080, n=80,
> rc=0, tm=-1, action=1)
>     at 
> /home/test001/work/ast_ksh_20130727/build_i386_64bit_debug_patched/src/lib/libast/sfio/sfpkrd.c:142
> #2  0x0000000000410fb9 in ed_read (context=0x7fc2bf063960, fd=0,
> buff=0x7fff33545080 "\240PT3\377\177", size=80, reedit=0)
>     at 
> /home/test001/work/ast_ksh_20130727/build_i386_64bit_debug_patched/src/cmd/ksh93/edit/edit.c:884
> -- snip --
> ... that's the strategy my patch attempts...

for the current code do you see failures when the vmalloc debug method is not 
used?

_______________________________________________
ast-developers mailing list
[email protected]
http://lists.research.att.com/mailman/listinfo/ast-developers

Re: [ast-developers] Hang because job management _within_ the SIGCHLD signal handler... / was: Re: AT&T Software Technology ast alpha software download update

Reply via email to