I have two threads, and one thread uses pthread_kill() to asynchronously send
signals to the other thread. The handler is installed with the
SA_SIGINFO | SA_ONSTACK flags, and the receiving thread has set up an alternate
signal stack. Under some circumstances, the signal is delivered on the wrong
stack.
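For reference, the setup described above can be sketched roughly as follows. This is a minimal sketch, not the code from my program: install_alt_stack(), install_handler() and ALT_STACK_SIZE are hypothetical names, and the 64 KB size is an assumption.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <signal.h>

static const size_t ALT_STACK_SIZE = 64 * 1024;   // assumed size, not from the post

// Installs an alternate signal stack for the calling thread.
// Returns the base of the allocated stack, or NULL on failure.
static char *install_alt_stack()
{
    char *base = static_cast<char *>(malloc(ALT_STACK_SIZE));
    if (base == NULL)
        return NULL;

    stack_t ss;
    memset(&ss, 0, sizeof ss);
    ss.ss_sp = base;
    ss.ss_size = ALT_STACK_SIZE;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0) {
        free(base);
        return NULL;
    }
    return base;
}

// Registers a three-argument handler that should run on the alternate stack.
// Returns 0 on success, -1 on failure (as sigaction() does).
static int install_handler(int signo, void (*handler)(int, siginfo_t *, void *))
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = handler;
    sa.sa_flags = SA_SIGINFO | SA_ONSTACK;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

The receiving thread would call install_alt_stack() itself, since the alternate stack is a per-thread setting, while the handler registration is process-wide. The sending thread then calls pthread_kill() with the receiver's thread id and the signal number.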
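My signal handler begins with:
void runner::do_signal(int signo, siginfo_t *psi, void *pv)
{
#ifndef NDEBUG
    // The address of an argument lies in the current stack frame.
    char *_probe = (char*)&pv;
    // Bounds of this thread's alternate signal stack.
    char *_low = (char*)sigstack_;
    char *_high = (char*)(sigstack_ + STACK_SIZE);
#endif
    // The handler must be running on the alternate signal stack.
    assert((_probe > _low) && (_probe < _high));
    ..
}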
On certain occasions, the assertion fails (meaning that the handler's arguments
are not within the bounds of the alternate signal stack), and sigaltstack() also
reports that the handler is NOT executing on the alternate stack. This is the
backtrace at that point:
current thread: [EMAIL PROTECTED]
[1] __lwp_kill(0x2, 0x6), at 0xcbd78e65
[2] _thr_kill(0x2, 0x6), at 0xcbd75a1e
[3] raise(0x6), at 0xcbd32102
[4] abort(0xcbdb4d80, 0xcbdb0000, 0x65737341, 0x6f697472, 0x6166206e,
0x64656c69), at 0xcbd10dad
[5] __assert(0x8079fe4, 0x807a008, 0xde), at 0xcbd10fce
=>[6] sched::runner::do_signal(signo = 41, psi = 0x80aa600, pv = 0x80aa400),
line 222 in "src/shdif.cc"
[7] __sighndlr(0x29, 0x80aa600, 0x80aa400, 0x80721f0), at 0xcbd7795f
---- called from signal handler with signal 41 (SIGRTMIN) ------
[8] take_deferred_signal(0x29), at 0xcbd6d01e
[9] do_exit_critical(0xcbfb2400, 0xcbdb0000, 0x804784c, 0x8047798, 0x80aa704,
0xcbd6d7a0), at 0xcbd75ab4
[10] block_all_signals(0xcbfb2400), at 0xcbd6d626
[11] _thr_sigsetmask(0x1, 0x809035c, 0x0), at 0xcbd6d7a0
[12] _sigprocmask(0x1, 0x809035c, 0x0), at 0xcbd6d921
[13] sched::runner::enter_critical(this = 0x8093f30), line 398 in
"src/shdif.cc"
[14] sched::enter_critical(), line 256 in "src/shd.h"
[15] comms::send<int>(chn = CLASS, msg = CLASS), line 252 in "src/chn.h"
[16] sender::behavior(this = 0x80a8620), line 67 in "test/bvt_sim4.cc"
[17] sched::process::start(ppr = 0x80a8620), line 67 in "src/shdif.cc"
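The two checks mentioned before the backtrace (the address-range test and the sigaltstack() query) can be written as small helpers, roughly like this. A minimal sketch: in_stack_range() and on_alt_stack() are hypothetical names, and it assumes that querying sigaltstack() from inside the handler is acceptable, as my handler already does.

```cpp
#include <cassert>
#include <cstddef>
#include <signal.h>

// Returns true if p lies within [base, base + size).
static bool in_stack_range(const void *p, const void *base, size_t size)
{
    const char *cp = static_cast<const char *>(p);
    const char *lo = static_cast<const char *>(base);
    return cp >= lo && cp < lo + size;
}

// Asks the system whether the calling thread is currently executing on
// its alternate signal stack.
// Returns 1 if it is, 0 if it is not, and -1 if the query itself failed.
static int on_alt_stack()
{
    stack_t cur;
    if (sigaltstack(NULL, &cur) != 0)
        return -1;
    return (cur.ss_flags & SS_ONSTACK) ? 1 : 0;
}
```

Inside a handler, in_stack_range(&pv, sigstack_, STACK_SIZE) corresponds to the assertion shown earlier, and on_alt_stack() gives the system's own view. In the failing case both say that the handler is not on the alternate stack.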
It seems that the take_deferred_signal function does not arrange a switch to
the alternate stack. The error is consistently reproducible in my program when
it runs on a 2-CPU machine (thus, truly in parallel), but I have not yet
succeeded in writing a small test case.
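A reduced test might be structured as below. This is only a sketch of the idea and is not confirmed to reproduce the failure: one thread keeps blocking and unblocking the signal (to hit the window where libc defers it, as in frames [8] to [12] of the backtrace) while another thread keeps sending it with pthread_kill(). All names (probe_handler, receiver, run_probe, the g_ variables) are hypothetical, the 64 KB stack size is an assumption, and pthread_sigmask() is used in place of the sigprocmask() call in my program.

```cpp
#include <atomic>
#include <cassert>
#include <cstring>
#include <pthread.h>
#include <signal.h>

static char g_alt_stack[64 * 1024];             // assumed size
static volatile sig_atomic_t g_on_stack = 0;    // handler runs on the alternate stack
static volatile sig_atomic_t g_off_stack = 0;   // handler runs on any other stack
static std::atomic<bool> g_stop(false);

// Counts whether each delivery happened on the alternate stack.
static void probe_handler(int, siginfo_t *, void *)
{
    stack_t cur;
    if (sigaltstack(NULL, &cur) == 0 && (cur.ss_flags & SS_ONSTACK))
        g_on_stack = g_on_stack + 1;
    else
        g_off_stack = g_off_stack + 1;
}

// Receiver: installs its alternate stack, then toggles the signal mask
// until told to stop. It starts with the signal blocked (mask inherited).
static void *receiver(void *arg)
{
    int signo = *static_cast<int *>(arg);

    stack_t ss;
    memset(&ss, 0, sizeof ss);
    ss.ss_sp = g_alt_stack;
    ss.ss_size = sizeof g_alt_stack;
    if (sigaltstack(&ss, NULL) != 0)
        return NULL;

    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signo);
    while (!g_stop.load()) {
        pthread_sigmask(SIG_BLOCK, &set, NULL);
        pthread_sigmask(SIG_UNBLOCK, &set, NULL);
    }
    pthread_sigmask(SIG_UNBLOCK, &set, NULL);   // let anything still pending in
    return NULL;
}

// Sends `sends` signals to a receiver thread. Returns the number of handler
// runs that were NOT on the alternate stack, or -1 if setup failed.
static int run_probe(int signo, int sends)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = probe_handler;
    sa.sa_flags = SA_SIGINFO | SA_ONSTACK;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signo, &sa, NULL) != 0)
        return -1;

    // Block the signal here so the new thread starts with it blocked and
    // cannot take it before its alternate stack is installed.
    sigset_t set, old;
    sigemptyset(&set);
    sigaddset(&set, signo);
    pthread_sigmask(SIG_BLOCK, &set, &old);

    pthread_t tid;
    if (pthread_create(&tid, NULL, receiver, &signo) != 0) {
        pthread_sigmask(SIG_SETMASK, &old, NULL);
        return -1;
    }
    for (int i = 0; i < sends; ++i)
        pthread_kill(tid, signo);
    g_stop.store(true);
    pthread_join(tid, NULL);
    pthread_sigmask(SIG_SETMASK, &old, NULL);
    return g_off_stack;
}
```

Calling run_probe(SIGRTMIN, 100000) and getting a non-zero result would indicate a delivery on the wrong stack; a correct implementation should return 0. Whether this loop is tight enough to trigger the problem on a 2-CPU machine is untested.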
Opinions?
This message posted from opensolaris.org