Hi,

A colleague of mine was having problems with the pth library dying because
it couldn't schedule any threads.  I put together test cases that reproduce
the failure, along with a patch against v1.4.0.  To link the test program
with pth in softcall mode, compile it with:

    gcc -DUSEPTH -o test_fail testfail.c -lpth

To see what the results should be without pth, compile it with:

    gcc -o test_ok testfail.c

The problem is that pth doesn't handle errors correctly during select and
while waiting on fd events (actually, it ignores them).  When an error such
as EINVAL or EBADF occurs, pth ignores it, so the waiting thread is never
reawakened and eventually no threads can be scheduled.  The patch goes
through each event and sets ev_occurred to true for every event that might
have caused the select error.  This is slow, since it has to do a new select
for each event and compare the resulting errno against the one it expects.
A more elegant approach would probably be to return EINTR for all events and
reawaken all threads so that they retry the call.  The past few hours have
been my first look at using pth and at its source code, so I wouldn't be
surprised if there are problems with the patch.  I only looked at select and
briefly touched on the fd event, so there might be similar problems with
other events.
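
To illustrate the per-event probing idea outside of pth, here is a minimal
standalone sketch.  It is not taken from the patch: the helper name
find_failing_fds and its parameters are made up for illustration, and it
uses only plain select().  It checks each watched descriptor with its own
zero-timeout select and flags the ones that fail with the expected errno.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper (not part of pth): probe every descriptor in
 * `watched` with its own zero-timeout select() and record in `bad` the
 * ones that fail with `expected_errno` (for example EBADF).
 * Returns the number of descriptors flagged. */
static int find_failing_fds(int nfds, fd_set *watched,
                            int expected_errno, fd_set *bad)
{
    int count = 0;
    int fd;

    assert(nfds >= 0 && nfds <= FD_SETSIZE);
    FD_ZERO(bad);

    for (fd = 0; fd < nfds; fd++) {
        fd_set probe;
        struct timeval zero = {0, 0};   /* poll only, never block */

        if (!FD_ISSET(fd, watched))
            continue;

        FD_ZERO(&probe);
        FD_SET(fd, &probe);

        /* One select() per descriptor: this is what makes the approach
         * slow, but it pins the error on a specific descriptor. */
        if (select(fd + 1, &probe, NULL, NULL, &zero) < 0 &&
            errno == expected_errno) {
            FD_SET(fd, bad);
            count++;
        }
    }
    return count;
}
```

In a scheduler, the descriptors flagged in `bad` would correspond to the
events to mark as occurred, so that the waiting thread wakes up and sees the
error instead of sleeping forever.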

- Thanh

test_fail.c

pth_sched.c.diff

pth_high.c.diff
