On Fri, Mar 29, 2013 at 4:02 PM, Roland Mainz <[email protected]> wrote:
> On Wed, Mar 27, 2013 at 9:48 PM, Irek Szczesniak <[email protected]> 
> wrote:
>> On Wed, Mar 27, 2013 at 6:10 PM, Roland Mainz <[email protected]> 
>> wrote:
>>> On Wed, Mar 27, 2013 at 11:31 AM, Irek Szczesniak <[email protected]> 
>>> wrote:
>>> [snip]
>>>>>> BTW: The patch currently doesn't cover passing SIGCHLD siginfo data to
>>>>>> the shell traps (this needs to be done in |job_waitsafe()| ... but
>>>>>> that may be tricky).
>>>>>
>>>>> Could you add .sh.status (Exit value or signal) and .sh.pid for CHLD
>>>>> traps, please?
>>>>
>>>> Appending to this RFE:
>>>> We would like to have .sh.code for CHLD traps too, with .sh.code being a
>>>> STRING returning one of the CLD_* codes defined in
>>>> http://pubs.opengroup.org/onlinepubs/7908799/xsh/signal.h.html (i.e.
>>>> this is defined by X/OPEN and POSIX and therefore should be applicable
>>>> as shell extension).
>>>
>>> Erm... do you mean that you want to be able to write a ksh93 version
>>> of the following C code ?
>>> -- snip --
>>> #include <stdlib.h>
>>> #include <stdio.h>
>>> #include <signal.h>
>>> #include <sys/types.h>
>>> #include <unistd.h>
>>> #include <errno.h>
>>>
>>> static
>>> const char *sicode2str(int si_code)
>>> {
>>>         const char *str;
>>>         switch(si_code)
>>>         {
>>> #define STRSYM(sym) \
>>>         case (sym): str = #sym ; break;
>>>                 STRSYM(CLD_EXITED)
>>>                 STRSYM(CLD_KILLED)
>>>                 STRSYM(CLD_DUMPED)
>>>                 STRSYM(CLD_TRAPPED)
>>>                 STRSYM(CLD_STOPPED)
>>>                 STRSYM(CLD_CONTINUED)
>>>                 default:
>>>                         str="<unknown>";
>>>                         break;
>>>         }
>>>         return str;
>>> }
>>>
>>>
>>> static
>>> void chld_handler (int signum, siginfo_t *si,
>>>         void *context)
>>> {
>>>         printf("#SIGCHLD, si_code=%d/%s handler\n",
>>>                 si->si_code,
>>>                 sicode2str(si->si_code));
>>> }
>>>
>>>
>>> static
>>> void chsleep(void)
>>> {
>>>         int i;
>>>         for (i=0 ; i < 120 ; i++)
>>>                 usleep(10000);
>>> }
>>>
>>>
>>> int main(int ac, char *av[])
>>> {
>>>         struct sigaction new_action;
>>>         pid_t pid;
>>>
>>>         /* Setting-up the sigchld handler */
>>>         new_action.sa_sigaction = chld_handler;
>>>         sigemptyset (&new_action.sa_mask);
>>>         new_action.sa_flags = SA_SIGINFO;
>>>         sigaction (SIGCHLD, &new_action, NULL);
>>>
>>>         pid = fork();
>>>         if (pid == -1)
>>>         {
>>>                 /* fork() failed, no child was created */
>>>                 perror("fork");
>>>                 return EXIT_FAILURE;
>>>         }
>>>
>>>         if (pid == 0)
>>>         {
>>>                 /* Child process */
>>>
>>>                 puts("# child process "
>>>                         "(stopping now...)");
>>>                 raise(SIGSTOP);
>>>
>>>                 puts("# child continues...");
>>>
>>>                 _exit(0);
>>>         }
>>>         else
>>>         {
>>>                 /* Parent process */
>>>
>>>                 chsleep();
>>>                 puts("# parent waking child...");
>>>                 kill(pid, SIGCONT);
>>>                 chsleep();
>>>         }
>>>
>>>         return EXIT_SUCCESS;
>>> }
>>> -- snip --
>>>
>>> The output should look like this:
>>> -- snip --
>>> # child process (stopping now...)
>>> #SIGCHLD, si_code=5/CLD_STOPPED handler
>>> # parent waking child...
>>> # child continues...
>>> #SIGCHLD, si_code=6/CLD_CONTINUED handler
>>> #SIGCHLD, si_code=1/CLD_EXITED handler
>>> -- snip --
>>>
>>> This should be technically possible to implement in ksh93... but for
>>> which purpose are you interested in the |CLD_STOPPED|&co. codes ?
>>
>> Until now there has been no scalable way to monitor only a single
>> process in a pool of many (20000+) worker processes. The usual way of
>> listening to the CHLD trap and then taking a sample of the job list
>> using jobs -l works but scales very poorly. Using the information solely
>> from the siginfo data passed to the CHLD handler would avoid that
>> scalability bottleneck.
>> Looking at si_code reveals whether the child process completed
>> successfully, crashed or changed state in some other way, and is thus
>> IMO mandatory to replace jobs -l as the information source.
>
> Ok... thanks for the explanation.
>
> Attached (as "astksh20130318_shsig_chld001.diff.txt") is a dumb 10min
> hack which fills the .sh.sig data for CHLD traps.
>
> * Changes:
> - .sh.sig data are now filled for CHLD traps
> - ".sh.sig.name" was renamed to ".sh.sig.signame" to avoid naming
> collisions on platforms which pass names around as part of |siginfo_t|
> - ".sh.sig.code" is now a _string_ since |siginfo_t|'s |si_code|
> numbers are not portable... nor are they self-explaining (the strings
> returned should be...)
> - .sh.sig.status now works and returns even exit codes like 97 or 121 ...
>
>
> * Open issues:
> - ".sh.sig.code" code should be more robust (SIGRT* signals come to mind)
> - .sh.sig.* should only have variables available which match the
> siginfo_t data returned
> - Only |CLD_EXITED| is currently supported since the code is mostly a
> 10min hack... the real code should save the |siginfo_t| data in a list
> and replay it when the CHLD shell traps are called.
>
>
> * Example usage looks like this:
> -- snip --
> $ cat test1.sh
>
> function chld_trap
> {
>         printf "child done, code=%q, exit code=%d\n" \
>                 "${.sh.sig.code}" \
>                 ${.sh.sig.status}
> }
>
> trap 'chld_trap' CHLD
>
> exit 120 &
>
> wait
>
> print '#done.'
> $ ksh ./test1.sh
> child done, code=CLD_EXITED, exit code=120
> #done.
> -- snip --

Thank you. I'd like to sync with David (Korn) first because he said he's
making a patch for this, too.

In any case, having .sh.sig.code return the strings defined by POSIX
instead of numbers is IMO very useful and should make it into the final
version of the .sh.sig code.
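
To make the "save the |siginfo_t| data in a list and replay it" open issue
a bit more concrete, here is a minimal C sketch of how such a list might
look. It is not taken from the patch or from ksh93 itself; the names
chld_queue_push(), chld_queue_pop(), chld_code_name() and CHLD_QUEUE_LEN
are hypothetical and only for illustration. It assumes a single reader
outside the handler, and it does not address coalesced SIGCHLD signals
(standard signals are not queued, so a real implementation would likely
also have to poll with waitid()).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

#define CHLD_QUEUE_LEN 64 /* hypothetical capacity, not from the patch */

/* Fixed-size ring buffer: the handler appends, the trap code drains. */
static siginfo_t chld_queue[CHLD_QUEUE_LEN];
static volatile sig_atomic_t chld_head; /* next slot the handler writes */
static volatile sig_atomic_t chld_tail; /* next slot the replay code reads */

/* Map a SIGCHLD si_code to the name of its POSIX CLD_* symbol. */
const char *chld_code_name(int si_code)
{
        switch (si_code)
        {
                case CLD_EXITED:    return "CLD_EXITED";
                case CLD_KILLED:    return "CLD_KILLED";
                case CLD_DUMPED:    return "CLD_DUMPED";
                case CLD_TRAPPED:   return "CLD_TRAPPED";
                case CLD_STOPPED:   return "CLD_STOPPED";
                case CLD_CONTINUED: return "CLD_CONTINUED";
                default:            return "<unknown>";
        }
}

/* Append one record; returns 0 on success, -1 if the queue is full.
 * Uses only memcpy() and plain stores, so it can run in a handler. */
int chld_queue_push(const siginfo_t *si)
{
        int next = (chld_head + 1) % CHLD_QUEUE_LEN;

        if (next == chld_tail)
                return -1; /* full: the record is dropped */
        memcpy(&chld_queue[chld_head], si, sizeof(*si));
        chld_head = next;
        return 0;
}

/* Remove the oldest record; returns 0 on success, -1 if empty.
 * Meant to be called outside the handler, before running CHLD traps. */
int chld_queue_pop(siginfo_t *out)
{
        assert(out != NULL);
        if (chld_tail == chld_head)
                return -1;
        memcpy(out, &chld_queue[chld_tail], sizeof(*out));
        chld_tail = (chld_tail + 1) % CHLD_QUEUE_LEN;
        return 0;
}

/* SIGCHLD handler (install with SA_SIGINFO): only records the data. */
void chld_handler(int signum, siginfo_t *si, void *context)
{
        (void)signum;
        (void)context;
        (void)chld_queue_push(si);
}
```

The handler would be installed with sigaction() and SA_SIGINFO as in
Roland's C example above. The code that runs the CHLD traps would then
call chld_queue_pop() in a loop and fill the trap variables from
si_pid, si_status and chld_code_name(si_code) for each saved record.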

Irek
_______________________________________________
ast-developers mailing list
[email protected]
http://lists.research.att.com/mailman/listinfo/ast-developers
