Hi Salusa,

Would you or Max be able to construct a unit test that demonstrates this failure condition, and then success once the patch is applied? There should be some example tests in the t/ directory which you can draw on for inspiration.
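As a possible starting point, the interleaving from the "Broken behavior" trace in the quoted report below can be replayed deterministically, without real threads. The following is only a rough sketch in C under stated assumptions: pool_model, model_signal, model_putback, model_run_woken and stuck_waiters are hypothetical names invented for illustration and are not part of mod_perl. The model treats the "broadcast" as a signal that wakes a single waiter, as the report notes, and it mirrors the condition from the patch context (in_use == max - 1 after the putback).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the pool state; NOT the real modperl_tipool_t. */
typedef struct {
    int max;      /* total interpreters in the pool */
    int in_use;   /* interpreters currently checked out */
    int sleeping; /* threads blocked on the condition, not yet signaled */
    int woken;    /* threads signaled but not yet scheduled */
} pool_model;

/* Wake exactly one sleeping waiter, if there is one. */
static void model_signal(pool_model *p)
{
    if (p->sleeping > 0) {
        p->sleeping--;
        p->woken++;
    }
}

/*
 * Put an interpreter back.
 * always_signal == false: signal only if the pool was full before this
 *                         putback (the pre-patch condition).
 * always_signal == true:  signal on every putback (the proposed fix).
 */
static void model_putback(pool_model *p, bool always_signal)
{
    assert(p->in_use > 0);
    p->in_use--;
    if (always_signal || p->in_use == p->max - 1) {
        model_signal(p);
    }
}

/* Each woken waiter takes the mutex, acquires, and puts back in turn. */
static void model_run_woken(pool_model *p, bool always_signal)
{
    while (p->woken > 0 && p->in_use < p->max) {
        p->woken--;
        p->in_use++;
        model_putback(p, always_signal);
    }
}

/*
 * Replay the "broken" schedule: threads 1 and 2 hold both interpreters,
 * threads 3 and 4 wait, and both putbacks happen before any woken
 * waiter gets to run. Returns how many waiters are never woken.
 */
int stuck_waiters(bool always_signal)
{
    pool_model p = { .max = 2, .in_use = 2, .sleeping = 2, .woken = 0 };

    model_putback(&p, always_signal);   /* thread 1 */
    model_putback(&p, always_signal);   /* thread 2 */
    model_run_woken(&p, always_signal); /* thread 3, and 4 if woken */
    return p.sleeping;
}
```

In this model, stuck_waiters(false) returns 1 (thread 4 is never signaled) and stuck_waiters(true) returns 0. A real test under t/ would still need to exercise the actual tipool code with threads; this only pins down the schedule that such a test has to provoke.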
On Sun, Mar 4, 2012 at 10:29 AM, SalusaSecondus <sal...@nationstates.net> wrote:
> (Patch and system details at bottom)
>
> Hi all. I've root-caused and written a patch for the children stuck on
> futex problem described by both Sean Thorne in 2009 and Max Barry (who I
> work with) in 2011.
>
> The core of the problem is that modperl_tipool_putback_base only
> broadcasts that there are more interpreters available when there were no
> available interpreters prior to this putback. While this makes sense, it
> can create a problem.
>
> Notation:
> A: Acquire an interpreter
> P: Putback an interpreter
> B: Broadcast a free interpreter (really a signal)
> W: Wait on condition tipool->available (for free interpreter)
> (x,y): x is the number of interpreters in use at this point. y is the
> number free.
> The number at the beginning of a line is the thread number
> Each line occurs within a single critical section (on mutex tipool->tiplock)
>
> Expected behavior:
> 4 threads, 2 free interpreters
> 1: A (1,1)
> 2: A (2,0)
> 3: W
> 4: W
> 1: P (1,1) B
> 3: A (2,0)
> 2: P (1,1) B
> 4: A (2,0)
> 3: P (1,1) B
> 4: P (0,2) <-- No broadcast because there was an available interpreter
> prior to this putback.
>
> Broken behavior:
> 4 threads, 2 free interpreters
> 1: A (1,1)
> 2: A (2,0)
> 3: W
> 4: W
> 1: P (1,1) B
> 2: P (0,2) <-- No broadcast because there was an available interpreter
> prior to this putback.
> 3: A (1,1)
> 3: P (0,2) <-- No broadcast because there was an available interpreter
> prior to this putback.
> (Broken)
>
> Thread 4 will never be signaled to pick up an interpreter. This results
> in the thread getting stuck on futex because sooner or later, apache
> will tell this worker to die (due to MaxRequestsPerChild). So, the
> parent thread will wait on the child threads joining, but one or more
> child threads will never wake up due to this problem.
>
> My proposed fix is to always broadcast the availability of an
> interpreter, regardless of whether there were already any free. This
> change passes all tests that I have found to throw at it as well as no
> longer deadlocking when reproducing the problem according to Max's
> instructions (http://pastebin.com/YDbmq84w).
>
> My System Details:
> uname -a: Linux modperl 2.6.38-8-server #42-Ubuntu SMP Mon Apr 11
> 03:49:04 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
> Apache: Custom build of 2.2.20 with ubuntu patches
> (http://packages.ubuntu.com/source/oneiric/apache2)
> modperl: Custom build of 2.0.5 with ubuntu patches
> (http://packages.ubuntu.com/source/oneiric/libapache2-mod-perl2)
> Build process: Standard ubuntu build process with following flags set:
> DEB_BUILD_OPTIONS="nostrip parallel=2 debug"
> CFLAGS="-g -O2 -DMP_TRACE=1 -DPERL_DESTRUCT_LEVEL=2 -DMP_DEBUG=1
> -UMP_USE_GTOP -I/usr/include/libgtop-2.0/ -I/usr/include/glib-2.0/
> -I/usr/lib/x86_64-linux-gnu/glib-2.0/include/"
>
> Patch:
> --- src/modules/perl/modperl_tipool.c.old 2012-03-03 19:43:57.112152297 -0800
> +++ src/modules/perl/modperl_tipool.c 2012-03-03 04:28:31.000000000 -0800
> @@ -328,9 +328,9 @@
>      MP_TRACE_i(MP_FUNC, "0x%lx now available (%d in use, %d running)",
>                 (unsigned long)listp->data, tipool->in_use, tipool->size);
>
> +    modperl_tipool_broadcast(tipool);
>      if (tipool->in_use == (tipool->cfg->max - 1)) {
>          /* hurry up, another thread may be blocking */
> -        modperl_tipool_broadcast(tipool);
>          modperl_tipool_unlock(tipool);
>          return;
>      }
>
>
> Please let me know how best to get this checked in and out.
> As you might imagine, this futex problem has been causing us quite a few
> headaches :-)
>
> Greg Rubin
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@perl.apache.org
> For additional commands, e-mail: dev-h...@perl.apache.org