The bug report below might explain some of the reported signal weirdness on FreeBSD and derivatives.
How far are we anyway with signal reliability vs volatile of the ksh93 struct jobs? Ced ---------- Forwarded message ---------- From: Konstantin Belousov <kostik...@gmail.com> Date: 21 November 2013 22:15 Subject: Re: Problem with signal 0 being delivered to SIGUSR1 handler To: Vitaly Magerya <vmage...@gmail.com> Cc: freebsd-hack...@freebsd.org, davi...@freebsd.org, thre...@freebsd.org On Thu, Nov 21, 2013 at 02:39:02PM +0200, Vitaly Magerya wrote: > Hi, folks. I'm investigating a test case failure that devel/boehm-gc > has on recent FreeBSD releases. The problem is that a signal > handler registered for SIGUSR1 is sometimes called with signum=0, > which should not be possible under any conditions. > > Here's a simple test case that demonstrates this behavior: > > /* Compile with 'c99 -o example example.c -pthread' > */ > #include <pthread.h> > #include <signal.h> > #include <stdio.h> > #include <stdlib.h> > > void signal_handler(int signum, siginfo_t *si, void *context) { > if (signum != SIGUSR1) { > printf("bad signal, signum=%d\n", signum); > exit(1); > } > } > > void *thread_func(void *arg) { > return arg; > } > > int main(void) { > struct sigaction sa = { 0 }; > sa.sa_flags = SA_SIGINFO; > sa.sa_sigaction = signal_handler; > if (sigfillset(&sa.sa_mask) != 0) abort(); > if (sigaction(SIGUSR1, &sa, NULL) != 0) abort(); > for (int i = 0; i < 10000; i++) { > pthread_t t; > pthread_create(&t, NULL, thread_func, NULL); > pthread_kill(t, SIGUSR1); Side note. pthread_kill(3) call behaviour is undefined if pthread_create(3) in the line before failed. > } > return 0; > } > > Under FreeBSD 9.2-RELEASE amd64 I pretty consistently get > "signum=0" from this program, but you may need to run it a few > times or increase the number of iterations to see the same. > > Interestingly enough, I don't see this behavior under 9.0-RELEASE. > > So, any ideas what the problem here is? It happens when libthr deferred signal handling path is taken for signal delivery and for some reason the code inside the deferred path called into rtld for symbol binding. Than, rtld lock is locked, some code in rtld is executed, and rtld lock is unlocked. Unlock causes _thr_ast() run, which results in the nested check_deferred_signal() execution. The check_deferred_signal() clearks si_signo, so on return the same signal is delivered one more time, but is advertized as signo zero. The _thr_rtld_init() approach of doing dummy calls does not really work, since it is not practically possible to enumerate the symbols needed during signal delivery. My first attempt to fix this was to increment curthread->critical_count around the calls to check_* functions in the _thr_ast(), but it causes reverse problem of losing _thr_ast() runs on unlock. I ended up with the flag to indicate that deferred delivery is running, so check_deferred_signal() should avoid doing anything. A delicate moment is that user signal handler is allowed to modify the passed machine context to result the return from the signal handler to cause arbitrary jump, or just do longjmp(). For this case, I also clear the flag in thr_sighandler(), since kernel signal delivery means that nested delivery code should not run right now. Please try this. diff --git a/lib/libthr/thread/thr_private.h b/lib/libthr/thread/thr_private.h index 83a02b5..c6651cd 100644 --- a/lib/libthr/thread/thr_private.h +++ b/lib/libthr/thread/thr_private.h @@ -433,6 +433,9 @@ struct pthread { /* the sigaction should be used for deferred signal. */ struct sigaction deferred_sigact; + /* deferred signal delivery is performed, do not reenter. */ + int deferred_run; + /* Force new thread to exit. */ int force_exit; diff --git a/lib/libthr/thread/thr_sig.c b/lib/libthr/thread/thr_sig.c index 415ddb0..57c9406 100644 --- a/lib/libthr/thread/thr_sig.c +++ b/lib/libthr/thread/thr_sig.c @@ -162,6 +162,7 @@ thr_sighandler(int sig, siginfo_t *info, void *_ucp) act = _thr_sigact[sig-1].sigact; _thr_rwl_unlock(&_thr_sigact[sig-1].lock); errno = err; + curthread->deferred_run = 0; /* * if a thread is in critical region, for example it holds low level locks, @@ -320,14 +321,18 @@ check_deferred_signal(struct pthread *curthread) siginfo_t info; int uc_len; - if (__predict_true(curthread->deferred_siginfo.si_signo == 0)) + if (__predict_true(curthread->deferred_siginfo.si_signo == 0 || + curthread->deferred_run)) return; + curthread->deferred_run = 1; uc_len = __getcontextx_size(); uc = alloca(uc_len); getcontext(uc); - if (curthread->deferred_siginfo.si_signo == 0) + if (curthread->deferred_siginfo.si_signo == 0) { + curthread->deferred_run = 0; return; + } __fillcontextx2((char *)uc); act = curthread->deferred_sigact; uc->uc_sigmask = curthread->deferred_sigmask; -- Cedric Blancher <cedric.blanc...@gmail.com> Institute Pasteur
pgpNdVKKpwCjc.pgp
Description: PGP signature
_______________________________________________ ast-developers mailing list ast-developers@lists.research.att.com http://lists.research.att.com/mailman/listinfo/ast-developers