Using strace as you (Simon) suggested, I confirmed that not all of my
signals are being delivered. In my test, I install a sigCHLD handler and
fork two processes. Both processes terminate as expected, but only one
signal is delivered. `ps` confirms that the other child process has
terminated: it is shown as "<defunct>". (Well, actually, all of that is
what usually happens. Occasionally both signals are delivered and my test
finishes correctly.)
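
The test described above can be made repeatable in C. The following is only a sketch, not the original test program: the helper name `sigchld_runs_for_two_children` is hypothetical, and SIGCHLD is deliberately kept blocked until both children have exited so that the two terminations always overlap. It assumes a POSIX system that provides `waitid` with `WNOWAIT`.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Number of times the handler has run. */
static volatile sig_atomic_t sigchld_runs = 0;

static void count_sigchld(int signo) {
    (void)signo;
    sigchld_runs++;
}

/* Hypothetical helper: forks two children that exit immediately while
 * SIGCHLD is blocked, then unblocks it and returns how many times the
 * handler ran. Returns -1 if a system call fails. */
int sigchld_runs_for_two_children(void) {
    struct sigaction sa;
    sigset_t chld;
    pid_t kids[2];
    siginfo_t info;
    int i;

    sigchld_runs = 0;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_sigchld;
    sa.sa_flags = 0;                 /* plain handler, no SA_SIGINFO */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, NULL) < 0) return -1;

    sigemptyset(&chld);
    sigaddset(&chld, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &chld, NULL) < 0) return -1;

    for (i = 0; i < 2; i++) {
        kids[i] = fork();
        if (kids[i] < 0) return -1;
        if (kids[i] == 0) _exit(0);  /* child: terminate at once */
    }

    /* WNOWAIT waits for each child to exit but leaves it as a zombie,
     * so both terminations have happened before SIGCHLD is unblocked. */
    for (i = 0; i < 2; i++)
        if (waitid(P_PID, (id_t)kids[i], &info, WEXITED | WNOWAIT) < 0)
            return -1;

    /* A pending SIGCHLD is delivered before this call returns. */
    if (sigprocmask(SIG_UNBLOCK, &chld, NULL) < 0) return -1;

    for (i = 0; i < 2; i++)
        waitpid(kids[i], NULL, 0);   /* reap both zombies */

    return (int)sigchld_runs;
}
```

On Linux I would expect this to return 1 even though two children terminated, which matches the "usual" outcome reported above. Other systems may behave differently.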

The fine points of Unix signal semantics have always been somewhat
mysterious to me. However, after digging around in man pages for a while,
I have a theory as to what's going "wrong"...

At the dawn of Unix time, signals were much simpler, and the general
notion was that a simple set of pending signals was kept, with the
implication that multiple concurrent occurrences of a given signal were
merged into one. Since then, signals have grown up, and many signals
carry (the possibility of) auxiliary information; to avoid losing this
occurrence-specific information, multiple concurrent occurrences of a
given signal need to be kept separate. However, it appears that to
obtain the newer behavior, one must set the SA_SIGINFO flag in the
sigaction structure. From the Solaris signal(5) man page:

    When a signal is generated by sigqueue(3R) or any
    signal-generating function which supports the specification
    of an application defined value, the signal is marked pending
    and, if the SA_SIGINFO flag is set for that signal, the
    signal is queued to the process along with the application
    specified signal value. Multiple occurrences of signals so
    generated are queued in FIFO order. If the SA_SIGINFO flag
    is not set for that signal, later occurrences of that
    signal's generation, when a signal is already queued, are
    silently discarded.

(I couldn't find any similar wording among the terser Linux man pages.)
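
For reference, this is roughly what setting SA_SIGINFO looks like at the C level, and what the "auxiliary information" amounts to for SIGCHLD. It is a sketch under assumptions: the struct `child_report` and the function `run_child_and_report` are hypothetical names, and the handler stores the pid in a `sig_atomic_t`, which assumes a pid fits in that type (true on common Linux systems). It does not settle whether the flag also makes a given system queue SIGCHLD; that part appears to be system-specific.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result type for this sketch. */
struct child_report {
    pid_t forked_pid;    /* pid returned by fork(), or -1 on failure */
    pid_t reported_pid;  /* si_pid as seen by the handler */
    int code;            /* si_code, e.g. CLD_EXITED */
    int status;          /* si_status: exit code when code is CLD_EXITED */
};

static volatile sig_atomic_t captured = 0;
static volatile sig_atomic_t seen_pid = 0, seen_code = 0, seen_status = 0;

/* Three-argument handler form, selected by SA_SIGINFO. */
static void record_child_info(int signo, siginfo_t *si, void *ctx) {
    (void)signo;
    (void)ctx;
    seen_pid = (sig_atomic_t)si->si_pid;
    seen_code = si->si_code;
    seen_status = si->si_status;
    captured = 1;
}

struct child_report run_child_and_report(int exit_code) {
    struct child_report r = { -1, -1, 0, 0 };
    struct sigaction sa;
    sigset_t chld, waitmask;
    pid_t pid;

    captured = 0;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = record_child_info;
    sa.sa_flags = SA_SIGINFO;        /* request the siginfo_t argument */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, NULL) < 0) return r;

    /* Block SIGCHLD so it can only arrive inside sigsuspend(). */
    sigemptyset(&chld);
    sigaddset(&chld, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &chld, &waitmask) < 0) return r;
    sigdelset(&waitmask, SIGCHLD);

    pid = fork();
    if (pid < 0) return r;
    if (pid == 0) _exit(exit_code);  /* child */

    while (!captured)
        sigsuspend(&waitmask);       /* returns after the handler has run */

    sigprocmask(SIG_UNBLOCK, &chld, NULL);
    waitpid(pid, NULL, 0);           /* reap the child */

    r.forked_pid = pid;
    r.reported_pid = (pid_t)seen_pid;
    r.code = (int)seen_code;
    r.status = (int)seen_status;
    return r;
}
```

Calling `run_child_and_report(7)` should yield a report whose `reported_pid` matches the forked child and whose `code` is `CLD_EXITED`.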

Now, if what I see in fptools/ghc/rts/Signals.c is apropos, GHC doesn't
set SA_SIGINFO. (I never have confidence that I've found the right code
in the CVS repository. Is there some overview documentation I've
overlooked?) This is not too surprising, given that a Haskell signal
handler doesn't get access to the auxiliary information. (Why shouldn't
it?) But I think this means that I can't be sure of getting exactly one
sigCHLD signal for each child termination. Alas. Any constructive ideas?
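
One common workaround, sketched here in C rather than Haskell, is to stop relying on one signal per child: treat each SIGCHLD as "at least one child has changed state" and reap in a non-blocking loop until nothing is left. The function name `reap_children_with_loop` and the limit `MAX_KIDS` are hypothetical, and the setup (blocking SIGCHLD until all children have exited) is only there to force the overlapping case.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_KIDS 16   /* arbitrary limit for this sketch */

static volatile sig_atomic_t reaped = 0;

/* Reap every child that is ready, not just one per signal. */
static void reap_all_ready(int signo) {
    int saved_errno = errno;         /* waitpid may overwrite errno */
    (void)signo;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    errno = saved_errno;
}

/* Hypothetical helper: forks nkids children that exit immediately, lets
 * the handler run once they have all terminated, and returns how many
 * children the handler reaped. Returns -1 on failure. */
int reap_children_with_loop(int nkids) {
    struct sigaction sa;
    sigset_t chld;
    pid_t kids[MAX_KIDS];
    siginfo_t info;
    int i;

    if (nkids < 0 || nkids > MAX_KIDS) return -1;

    reaped = 0;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = reap_all_ready;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, NULL) < 0) return -1;

    sigemptyset(&chld);
    sigaddset(&chld, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &chld, NULL) < 0) return -1;

    for (i = 0; i < nkids; i++) {
        kids[i] = fork();
        if (kids[i] < 0) return -1;
        if (kids[i] == 0) _exit(0);  /* child */
    }

    /* Wait until every child has exited, leaving the zombies in place. */
    for (i = 0; i < nkids; i++)
        if (waitid(P_PID, (id_t)kids[i], &info, WEXITED | WNOWAIT) < 0)
            return -1;

    /* The pending SIGCHLD is delivered here; the handler's loop then
     * collects all of the zombies in a single invocation. */
    if (sigprocmask(SIG_UNBLOCK, &chld, NULL) < 0) return -1;

    return (int)reaped;
}
```

With this handler the number of signals delivered no longer matters, only that at least one arrives after the last unreaped child exits. A Haskell handler could presumably follow the same pattern by looping over a non-blocking `getAnyProcessStatus` call, though that is untested here.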

Dean

On Tue, 9 Jul 2002, Simon Marlow wrote:
> > After much study I have a new theory. It appears that the pipe
> > machinery is working fine, but that sometimes my program fails to
> > "reap" all of its terminated child processes. I'm using a `sigCHLD`
> > signal handler that does `getAnyProcessStatus True False` each time
> > it's invoked. It seems that one or more `sigCHLD` signals are
> > occurring while an instance of the handler is already running (having
> > presumably blocked `sigCHLD` during its execution), and hence the
> > "contemporaneous" signals are getting lost. (Oh, and it seems this
> > unfortunate behavior occurs for me under Linux but not Solaris.)
> >
> > How can I avoid losing `sigCHLD` signals?
> >
> > It seems that use of the `SA_NODEFER` flag on `sigaction` might do the
> > trick, but that flag is not accessible via `installHandler`.
>
> I can't see a reason why SIGCHLD signals might be lost, but the handler
> might be deferred in the way that Volker described if you have blocking
> C calls.
>
> Perhaps you could investigate with strace and see if the signal is
> actually being delivered?
>
> Cheers,
> Simon