On 24/03/11 19:43, Ulf Magnusson wrote:
> On Thu, Mar 24, 2011 at 7:06 PM, Ulf Magnusson <ulfali...@gmail.com> wrote:
>> Hi,
>>
>> My DirectFB application kept hanging on shutdown whenever I used the
>> Linux Input driver, so I did some investigation to figure out why.
>> Here's what happens:
>>
>> 1. On shutdown, the threads processing /dev/event/X are all
>>    pthread_cancel()'ed.
>> 2. For some reason[1] one or more of the threads reach the D_PERROR
>>    ("linux_input thread died\n") at the end of linux_input_EventThread().
>> 3. One of the threads acquires the log->lock mutex in
>>    direct_log_printf(), writes to log->fd, and then dies before releasing
>>    the lock, as write() is a cancellation point.
>> 4. The next thread that tries to write to the log gets stuck on log->lock.
>>
>> Here's my proposed fix, which temporarily changes the cancellation
>> state of the thread inside direct_log_printf() to prevent it from
>> being canceled while holding the lock (I'm by no means a Pthreads guru,
>> so there might very well be a better solution):
>>
>> --- a/lib/direct/log.c 2010-11-15 22:12:08.000000000 +0100
>> +++ b/lib/direct/log.c 2011-03-24 17:58:38.259808355 +0100
>> @@ -167,14 +167,19 @@
>>       else {
>>            int  len;
>>            char buf[512];
>> +          int  old_cancellation_state;
>>
>>            len = vsnprintf( buf, sizeof(buf), format, args );
>>
>> -          pthread_mutex_lock( &log->lock );
>> +          /* Ensure the thread does not get canceled at the write(), which
>> +           * would prevent the log lock from being released. */
>> +          pthread_setcancelstate( PTHREAD_CANCEL_DISABLE, &old_cancellation_state );
>>
>> +          pthread_mutex_lock( &log->lock );
>>            write( log->fd, buf, len );
>> -
>>            pthread_mutex_unlock( &log->lock );
>> +
>> +          pthread_setcancelstate( old_cancellation_state, NULL );
>>       }
>>
>>       va_end( args );
>>
>> With the above patch my application no longer hangs on shutdown.
>>
>> You probably also ought to make sure a thread can never die inside a
>> direct_log_lock()/direct_log_unlock() pair.
>>
>> [1] This seems to be due to a uClibc bug that causes the select() on
>> linux_input.c:902 (DirectFB 1.4.11) to return -1 while errno remains
>> 0. Seems to depend on subtle details in how the application was
>> compiled.
>
> To clarify: In this case the error was due to uClibc, but the same
> thing would happen any time a thread is pthread_cancel()'ed and then
> writes to the log, which seems like a bug.
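
For reference, the pattern in the patch can be tried outside DirectFB. The
following is only a minimal sketch, not the DirectFB code: safe_log_printf(),
worker(), run_demo(), log_lock and log_fd are hypothetical names made up for
this illustration, and the log simply goes to stderr. It disables cancellation
around the lock/write/unlock sequence, cancels a thread that is busy logging,
and then checks that the lock is still free.

```c
#define _POSIX_C_SOURCE 200809L

#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-ins for log->lock and log->fd. */
static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;
static int             log_fd   = STDERR_FILENO;

/* Format a message and write it while holding log_lock. Cancellation is
 * disabled for the whole critical section, so a pending pthread_cancel()
 * cannot take effect at write() and leave the mutex locked. */
static int safe_log_printf(const char *format, ...)
{
    char    buf[512];
    va_list args;
    int     len, old_state;
    ssize_t written;

    va_start(args, format);
    len = vsnprintf(buf, sizeof(buf), format, args);
    va_end(args);

    if (len < 0)
        return -1;
    if ((size_t)len >= sizeof(buf))
        len = (int)sizeof(buf) - 1;      /* output was truncated */

    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state);

    pthread_mutex_lock(&log_lock);
    written = write(log_fd, buf, (size_t)len);
    pthread_mutex_unlock(&log_lock);

    /* Restore the caller's state; a pending cancel stays pending. */
    pthread_setcancelstate(old_state, NULL);

    return written < 0 ? -1 : (int)written;
}

/* Logs in a loop until it is cancelled. */
static void *worker(void *arg)
{
    struct timespec pause = { 0, 1000000 };   /* 1 ms */

    (void)arg;
    for (;;) {
        safe_log_printf("worker: still running\n");
        nanosleep(&pause, NULL);   /* cancellation point outside the lock */
    }
    return NULL;
}

/* Returns 0 if the log lock is usable after the worker was cancelled. */
int run_demo(void)
{
    pthread_t       tid;
    void           *result;
    struct timespec wait = { 0, 20000000 };   /* 20 ms */

    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;

    nanosleep(&wait, NULL);
    pthread_cancel(tid);
    pthread_join(tid, &result);
    if (result != PTHREAD_CANCELED)
        return -1;

    /* Had the worker died while holding the lock, this would fail. */
    if (pthread_mutex_trylock(&log_lock) != 0)
        return -1;
    pthread_mutex_unlock(&log_lock);

    return safe_log_printf("main: log lock is free after cancel\n") > 0 ? 0 : -1;
}
```

Note that disabling cancellation does not discard a cancel request. The
request stays pending and is acted on at the next cancellation point the
thread reaches after the old state has been restored, here the nanosleep()
in worker().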

Thanks, we'll integrate the changes!

Not sure if we should prevent threads from logging when they are cancelled.

-- 
Best regards,
  Denis Oliver Kropp

.------------------------------------------.
| DirectFB - Hardware accelerated graphics |
| http://www.directfb.org/                 |
"------------------------------------------"