Re: [Libevent-users] Sigprocmask vs pthread_sigprocmask

2007-02-24 Thread Scott Lamb


On Feb 24, 2007, at 12:49 PM, Niels Provos wrote:


 I know that
it's difficult to get complete coverage, but some tests for each new
change are a very good practice.


Well, at least the test_immediatesignal() fails with the old code and  
passes with the new code.


I would love to get coverage of the race condition as well...but I  
don't have a good way to do it. I had this same discussion with the  
Twisted people not too long ago. I kind of think full coverage is a  
good goal to shoot for, but essentially impossible to achieve.


--
Scott Lamb 
___
Libevent-users mailing list
Libevent-users@monkey.org
http://monkey.org/mailman/listinfo/libevent-users


Re: [Libevent-users] Sigprocmask vs pthread_sigprocmask

2007-02-24 Thread Niels Provos

On 2/24/07, Scott Lamb <[EMAIL PROTECTED]> wrote:

There are two regression tests for this now:


Great.


But you're referring to something more...proving that there would
have been an evsignal_process() race if not for the reordering,
right? That's hard to do through regression tests. I can think of
only a couple ways to do that:


I am really just looking for something that all of us can run every
time we make a change to the library.  Otherwise, it's really hard to
be sure that one did not introduce unwanted side effects.  I know that
it's difficult to get complete coverage, but some tests for each new
change are a very good practice.

I'll have a look at your changes tomorrow.

Thank you,
Niels.
___
Libevent-users mailing list
Libevent-users@monkey.org
http://monkey.org/mailman/listinfo/libevent-users


Re: [Libevent-users] Sigprocmask vs pthread_sigprocmask

2007-02-24 Thread Scott Lamb


On Feb 22, 2007, at 6:32 PM, Niels Provos wrote:


Do you have an addition to the regression test that would allow me to
verify that the new code works as intended?   If not, it would be most
appreciated :-)


Happy Weekend, everyone. Looking at this again.

There are two regression tests for this now:

* test_simplesignal() tests a signal delivered while within select()/ 
poll()/kevent()/epoll()/whatever.
* test_immediatesignal() tests a signal delivered between signal_add 
() and event_loop().


But you're referring to something more...proving that there would  
have been an evsignal_process() race if not for the reordering,  
right? That's hard to do through regression tests. I can think of  
only a couple ways to do that:


(1) loop trying it many times, scheduling later delivery with alarms  
or through an additional thread (tests would depend on pthreads).  
Probably have to add some instrumentation to evsignal_process() to  
make it sleep at particular times to make it likely enough to happen  
on demand.


(2) use debugging mechanisms to step through and deliver the signal  
at a particular time. I actually have done this programmatically  
before when proving a signal-handling library [*], but it was a lot  
of code and not exactly the most portable. Maybe it could be done  
more easily with gdb and expect, on at least gdb platforms.


Would it help if I showed you a way to test it by hand?

I've attached a simple test program that registers a handler for  
SIGUSR1 and SIGUSR2 which simply notes the signal was received. You  
can see the races I mention with gdb as follows:


$ gcc -Wall sigtest.c -o sigtest -L.libs -levent
$ LD_LIBRARY_PATH=.libs gdb ./sigtest
...
(gdb) break evsignal_process
Function "evsignal_process" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y

Breakpoint 1 (evsignal_process) pending.
(gdb) run
Starting program: /home/slamb/svn/libevent/sigtest
(no debugging symbols found)
<<>>
Breakpoint 2 at 0x2aabf4a0: file signal.c, line 165.
Pending breakpoint "evsignal_process" resolved

Program received signal SIGINT, Interrupt.
0x00318a4cd9d3 in __epoll_wait_nocancel () from /lib64/libc.so.6
(gdb) signal SIGUSR1
Breakpoint 2, evsignal_process () at signal.c:165 (my line numbers  
are off because I've #ifdefed the two code blocks)

<<>>
(gdb) n
179 memset(evsigcaught, 0, sizeof(evsigcaught));
(gdb) signal SIGUSR2
Continuing with signal SIGUSR2.
Received signal 10

Here you can see that it only acknowledged receipt of SIGUSR1;  
SIGUSR2 was lost.


If you do the same thing at any point in my modified evsignal_process 
(), you find instead:


(gdb) signal SIGUSR2
Continuing with signal SIGUSR2.
Received signal 10

Breakpoint 2, evsignal_process () at signal.c:165
165 {
(gdb) c
Continuing.
Received signal 12

[*] http://www.slamb.org/svn/repos/trunk/projects/sigsafe/tests/ 
race_checker/


--
Scott Lamb 



sigtest.c
Description: Binary data
___
Libevent-users mailing list
Libevent-users@monkey.org
http://monkey.org/mailman/listinfo/libevent-users


Re: [Libevent-users] Sigprocmask vs pthread_sigprocmask

2007-02-22 Thread Nick Mathewson
On Thu, Feb 22, 2007 at 05:52:10PM -0800, Scott Lamb wrote:
 [...]
> What about just never using sigprocmask() or pthread_sigmask()? The  
> patch I sent to the list a while back fixes two bugs:
>
> * prevents Anton Povarov's reported race in which a signal delivered  
> quickly after event_signal_add() isn't handled correctly
> * avoids sigprocmask(), which is not pthread-safe.

I like your approach; it seems a lot more elegant than the
block-and-unblock approach in the current code.  It looks at least
superficially okay to me.

My only libevent thread needs (for Tor) are to be able to have other
threads running besides the thread the uses libevent.  I don't need to
be able to access libevent at all from more than one of them, but I'd
really like to be able to have the subthreads not get killed when a
signal comes in.

If this goes in, it could IMO warrant a 1.3a; we really need this bug
fixed, especially for BSD and OSX users.

> There are still several caveats with threads. Ignoring the complex  
> idea of sharing an event_base between threads (need still has to be  
> proven...I've been way too busy with other things to run the  
> benchmarks I've been wanting to), these ones remain:
> 
> 2. If you forget event_base_set on an event, it's associated with the  
> latest base created. This will probably work most of the time. It'd  
> be much less confusing if it consistently broke.

How about an "event_clear_default_base()" function to force
current_base to NULL, so that things will consistently break if called
with the default base?

> 3. Each new base created leaks the existing ev_signal_pair descriptors.
>
> 4. Signals are delivered to whatever event loop happens to see them  
> first.
>
> 6. You can't destroy an event_base.

event_base_free() doesn't work?

[...]
> There's one subtlety to it. With my change, evsignal_process() runs
> with signals unblocked, so timing is critical. The original code is
> then racy:

I agree that this is a subtle point; maybe the code needs a comment
explaining the possible race issues.

yrs,
-- 
Nick Mathewson


pgpE0eafu9rax.pgp
Description: PGP signature
___
Libevent-users mailing list
Libevent-users@monkey.org
http://monkey.org/mailman/listinfo/libevent-users


Re: [Libevent-users] Sigprocmask vs pthread_sigprocmask

2007-02-22 Thread Scott Lamb


On Feb 22, 2007, at 6:32 PM, Niels Provos wrote:


Do you have an addition to the regression test that would allow me to
verify that the new code works as intended?   If not, it would be most
appreciated :-)


There is one that verifies the quickly received signal case. Are you  
talking about the race condition I describe? In that case, no...I  
could try something that pounds on it in a loop for a bit or  
something, if you like.


Also realized I sent an older version of the patch just now...newer  
one attached. (Why can I never send a patch to a mailing list  
correctly the first time? It's a bizarre handicap.)




event-signal.patch
Description: Binary data
___
Libevent-users mailing list
Libevent-users@monkey.org
http://monkey.org/mailman/listinfo/libevent-users


Re: [Libevent-users] Sigprocmask vs pthread_sigprocmask

2007-02-22 Thread Niels Provos

Do you have an addition to the regression test that would allow me to
verify that the new code works as intended?   If not, it would be most
appreciated :-)

Niels.

On 2/22/07, Scott Lamb <[EMAIL PROTECTED]> wrote:


On Feb 21, 2007, at 1:29 AM, William Ahern wrote:

> On Wed, Feb 21, 2007 at 03:44:58AM -0500, Nick Mathewson wrote:
> 
>> libevent, "I'm going to use pthreads; use pthread_sigmask()
>> instead of
>> sigprocmask()."  I don't know what that interface should be, but the
>> corresponding code should be pretty simple to write.
>
> Probably should be a [global] function pointer, since an autoconf
> check
> doesn't address the linking issue.

What about just never using sigprocmask() or pthread_sigmask()? The
patch I sent to the list a while back fixes two bugs:

* prevents Anton Povarov's reported race in which a signal delivered
quickly after event_signal_add() isn't handled correctly
* avoids sigprocmask(), which is not pthread-safe.

There are still several caveats with threads. Ignoring the complex
idea of sharing an event_base between threads (need still has to be
proven...I've been way too busy with other things to run the
benchmarks I've been wanting to), these ones remain:

2. If you forget event_base_set on an event, it's associated with the
latest base created. This will probably work most of the time. It'd
be much less confusing if it consistently broke.

3. Each new base created leaks the existing ev_signal_pair descriptors.

4. Signals are delivered to whatever event loop happens to see them
first.

6. You can't destroy an event_base.

My old patch and notes are below.

This patch does not get rid of ncalls, which I'm also thinking of
doing, as I mentioned in an earlier message to the list.

devpoll.c  |   15 +++
epoll.c|   15 +++
evport.c   |   17 +++--
evsignal.h |8 ++
poll.c |   15 +++
select.c   |   14 +++---
signal.c   |   74 +++
+-
test/regress.c |   32 
8 files changed, 76 insertions(+), 114 deletions(-)

There's one subtlety to it. With my change, evsignal_process() runs
with signals unblocked, so timing is critical. The original code is
then racy:

 TAILQ_FOREACH(ev, &signalqueue, ev_signal_next) {
 ncalls = evsigcaught[EVENT_SIGNAL(ev)];<-- 
point A
 if (ncalls) {
 if (!(ev->ev_events & EV_PERSIST))
 event_del(ev);
 event_active(ev, EV_SIGNAL, ncalls);
 }
 }

 memset(evsigcaught, 0, sizeof(evsigcaught));   <-- point B
 evsignal_caught = 0;   <-- point C

For any signal that is not already pending (ncalls == 0 when reaching
its point A), if it arrives between then and its ncalls being cleared
in point B, it will be lost. If it arrives after then but before
evsignal_caught in point C, it will be arbitrarily delayed (until
another signal comes along to set   evsignal_caught).

I solved this by

* moving the evsignal_caught to before checking the individual
signals for delivery (but after evsignal_caught is checked itself, in
a different function).
* setting "evsigcaught[x] = 0" only when we are setting that signal
active.

--
Scott Lamb 




___
Libevent-users mailing list
Libevent-users@monkey.org
http://monkey.org/mailman/listinfo/libevent-users




___
Libevent-users mailing list
Libevent-users@monkey.org
http://monkey.org/mailman/listinfo/libevent-users


Re: [Libevent-users] Sigprocmask vs pthread_sigprocmask

2007-02-22 Thread Scott Lamb


On Feb 21, 2007, at 1:29 AM, William Ahern wrote:


On Wed, Feb 21, 2007 at 03:44:58AM -0500, Nick Mathewson wrote:

libevent, "I'm going to use pthreads; use pthread_sigmask()  
instead of

sigprocmask()."  I don't know what that interface should be, but the
corresponding code should be pretty simple to write.


Probably should be a [global] function pointer, since an autoconf  
check

doesn't address the linking issue.


What about just never using sigprocmask() or pthread_sigmask()? The  
patch I sent to the list a while back fixes two bugs:


* prevents Anton Povarov's reported race in which a signal delivered  
quickly after event_signal_add() isn't handled correctly

* avoids sigprocmask(), which is not pthread-safe.

There are still several caveats with threads. Ignoring the complex  
idea of sharing an event_base between threads (need still has to be  
proven...I've been way too busy with other things to run the  
benchmarks I've been wanting to), these ones remain:


2. If you forget event_base_set on an event, it's associated with the  
latest base created. This will probably work most of the time. It'd  
be much less confusing if it consistently broke.


3. Each new base created leaks the existing ev_signal_pair descriptors.

4. Signals are delivered to whatever event loop happens to see them  
first.


6. You can't destroy an event_base.

My old patch and notes are below.

This patch does not get rid of ncalls, which I'm also thinking of  
doing, as I mentioned in an earlier message to the list.


devpoll.c  |   15 +++
epoll.c|   15 +++
evport.c   |   17 +++--
evsignal.h |8 ++
poll.c |   15 +++
select.c   |   14 +++---
signal.c   |   74 +++ 
+-

test/regress.c |   32 
8 files changed, 76 insertions(+), 114 deletions(-)

There's one subtlety to it. With my change, evsignal_process() runs  
with signals unblocked, so timing is critical. The original code is  
then racy:


TAILQ_FOREACH(ev, &signalqueue, ev_signal_next) {
ncalls = evsigcaught[EVENT_SIGNAL(ev)]; <-- point A
if (ncalls) {
if (!(ev->ev_events & EV_PERSIST))
event_del(ev);
event_active(ev, EV_SIGNAL, ncalls);
}
}

memset(evsigcaught, 0, sizeof(evsigcaught));<-- point B
evsignal_caught = 0;<-- point C

For any signal that is not already pending (ncalls == 0 when reaching  
its point A), if it arrives between then and its ncalls being cleared  
in point B, it will be lost. If it arrives after then but before  
evsignal_caught in point C, it will be arbitrarily delayed (until  
another signal comes along to set 	evsignal_caught).


I solved this by

* moving the evsignal_caught to before checking the individual  
signals for delivery (but after evsignal_caught is checked itself, in  
a different function).
* setting "evsigcaught[x] = 0" only when we are setting that signal  
active.


--
Scott Lamb 



event-signal.patch
Description: Binary data


___
Libevent-users mailing list
Libevent-users@monkey.org
http://monkey.org/mailman/listinfo/libevent-users


Re: [Libevent-users] Sigprocmask vs pthread_sigprocmask

2007-02-21 Thread William Ahern
On Wed, Feb 21, 2007 at 03:44:58AM -0500, Nick Mathewson wrote:

> libevent, "I'm going to use pthreads; use pthread_sigmask() instead of
> sigprocmask()."  I don't know what that interface should be, but the
> corresponding code should be pretty simple to write.

Probably should be a [global] function pointer, since an autoconf check
doesn't address the linking issue.

___
Libevent-users mailing list
Libevent-users@monkey.org
http://monkey.org/mailman/listinfo/libevent-users


[Libevent-users] Sigprocmask vs pthread_sigprocmask

2007-02-21 Thread Nick Mathewson
Hi, all.  There's been a recent exchange on the Tor development list
about the intersection of signals with pthreads with libevent.  Both
posts are (I hope) concise, so I'll link to them here:
  http://archives.seul.org/or/dev/Feb-2007/msg00028.html
  http://archives.seul.org/or/dev/Feb-2007/msg00030.html

In summary: when pthreads are in use, it's wrong to call
sigprocmask(), and right to call its identical cousin,
pthread_sigmask().  This isn't just a cosmetic issue; it can cause
real bugs with signal delivery.  Right now, libevent only calls
sigprocmask.  It would be good if there were some way to tell
libevent, "I'm going to use pthreads; use pthread_sigmask() instead of
sigprocmask()."  I don't know what that interface should be, but the
corresponding code should be pretty simple to write.

yrs,
-- 
Nick Mathewson


pgpjhJl3Tsp4y.pgp
Description: PGP signature
___
Libevent-users mailing list
Libevent-users@monkey.org
http://monkey.org/mailman/listinfo/libevent-users