Quite possible. Basically, one thread was telling the other thread that its work was done by sending a pipe signal, then unlocking the mutex. The other thread was catching the pipe signal, exiting, and then destroying the mutex. Leon caught the fact that it should first lock the mutex, then unlock and clean up. (Or I could have used non-detached threads.)
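To make the ordering concrete, here is a minimal sketch of the lock/unlock-before-destroy handshake described above. It is not the owserver code: `struct handler`, `data_thread`, `wait_and_cleanup` and `run_demo` are hypothetical names made up for illustration, and a plain blocking read() stands in for the select() call. It also assumes a pthread implementation where a mutex may be destroyed as soon as its last holder has unlocked it.

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical stand-in for the shared handler state (not the owserver struct). */
struct handler {
    pthread_mutex_t to_client; /* plays the role of the TOCLIENT mutex */
    int pipe_fd[2];            /* worker writes here to signal completion */
    int complete;              /* guarded by to_client */
};

/* Worker: flags completion and wakes the waiter while still holding the lock. */
static void *data_thread(void *arg)
{
    struct handler *hd = arg;

    pthread_mutex_lock(&hd->to_client);
    hd->complete = 1;
    if (write(hd->pipe_fd[1], "x", 1) != 1) {
        perror("write");
    }
    /* The waiter may already be awake here, so it must not destroy the mutex yet. */
    pthread_mutex_unlock(&hd->to_client);
    return NULL;
}

/* Waiter: returns 0 if the worker finished and cleanup succeeded. */
static int wait_and_cleanup(struct handler *hd)
{
    char byte;
    int done;

    /* Blocks until the worker has written its wake-up byte. */
    if (read(hd->pipe_fd[0], &byte, 1) != 1) {
        return -1;
    }

    /* The handshake: this lock cannot be taken until the worker has unlocked. */
    pthread_mutex_lock(&hd->to_client);
    done = hd->complete;
    pthread_mutex_unlock(&hd->to_client);

    /* Only now are the shared resources torn down. */
    close(hd->pipe_fd[0]);
    close(hd->pipe_fd[1]);
    if (pthread_mutex_destroy(&hd->to_client) != 0) {
        return -1;
    }
    return done ? 0 : -1;
}

/* Starts one detached worker and waits for it using the handshake above. */
int run_demo(void)
{
    struct handler hd;
    pthread_attr_t attr;
    pthread_t tid;
    int rc;

    hd.complete = 0;
    if (pipe(hd.pipe_fd) != 0) {
        return -1;
    }
    if (pthread_mutex_init(&hd.to_client, NULL) != 0) {
        close(hd.pipe_fd[0]);
        close(hd.pipe_fd[1]);
        return -1;
    }

    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    rc = pthread_create(&tid, &attr, data_thread, &hd);
    pthread_attr_destroy(&attr);
    if (rc != 0) {
        pthread_mutex_destroy(&hd.to_client);
        close(hd.pipe_fd[0]);
        close(hd.pipe_fd[1]);
        return -1;
    }

    return wait_and_cleanup(&hd);
}
```

Calling run_demo() from a main() and linking with -lpthread should return 0. Dropping the lock/unlock pair in wait_and_cleanup() gives the racy ordering, where the mutex can be destroyed while the worker still holds it. With a joinable (non-detached) thread, a pthread_join() before cleanup would give the same guarantee.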
Can you give it a try?

Paul Alfille

(Sorry for the blank messages, Gmail seems a little flaky.)

________________________________
From: Paul Alfille [mailto:[email protected]]
Sent: Thursday, August 19, 2010 6:45 AM
To: OWFS (One-wire file system) discussion and help
Subject: Re: [Owfs-developers] owserver mutext bug solved.

On Wed, Aug 18, 2010 at 7:00 PM, Alex Shepherd <[email protected]> wrote:

Hmmm... does this relate to the issues that I was seeing where owserver would crash with the Assertion:

"...tpp.c:66: __pthread_tpp_change_priority: Assertion..."

If it is, then I'll rebuild from the latest source and see if it makes a difference. Currently I'm running an older version that doesn't have the Assertion issue.

Alex

________________________________
From: Paul Alfille [mailto:[email protected]]
Sent: Thursday, 19 August 2010 7:35 a.m.
To: owfs-developers
Subject: [Owfs-developers] owserver mutext bug solved.

Leon did some careful sleuthing and found an architecture-dependent bug in the locking model for owserver. The cause is most likely some difference in the mutex implementation on that platform, but the problem seems solved.

In his words, the problem:

-------
Submitted By: Leon (nleonard671)
Assigned to: Nobody/Anonymous (nobody)
Summary: owserver crashes on client disconnection on Mips CPU

Initial Comment:
I'm using OWFS with an OpenWrt OS on a MIPS cpu (exact model: MIPS 24Kc V7.4) and I have a 100% reproducible crash problem with owserver.

Steps to reproduce:
1. Launch owserver, no matter if you use fake device or real device (DS9490 in my case)
2. Launch "owget /"
=> Segmentation fault in the server
-----------

And the solution:

--------------
Ok, I've finally managed to find the bug. It's a race condition between the 'ping loop' (owserver/loop.c) and the 'data handler' (owserver/data.c).

Possible crash explanation:
1. [data.c:DataHandler()] Data thread locks the TOCLIENT mutex, sets the hd->toclient to toclient_complete, then writes some dummy data to the pipe to wake up the PingLoop thread (line 215)
2. [loop.c:Ping_or_Send()] Ping thread select is 'awakened' by the pipe data (line 92) and returns without locking the TOCLIENT mutex (line 96)
3. [loop.c:PingLoop()] Exits the loop and executes LoopCleanup(), which destroys the TOCLIENT mutex
4. [data.c:DataHandler()] Data thread unlocks the TOCLIENT mutex (line 218), and crashes

I didn't notice earlier, but in my output log files the statement "Finished with client request" (data.c:222) was never written, confirming that the data thread crashes before that point.

I tried unsuccessfully to reproduce it on my development hardware (x86). Maybe the uClibc pthread implementation used by OpenWrt behaves differently from glibc's.

A working fix: instead of returning from loop.c:Ping_or_Send() directly after the select(), a lock/unlock step is executed, allowing the data thread to finish properly before the shared resources are destroyed.

--- owfs-2.8p0-ORIG/module/owserver/src/c/loop.c 2010-05-08 20:47:10.000000000 +0200
+++ owfs-2.8p0/module/owserver/src/c/loop.c 2010-08-18 02:49:27.000000000 +0200
@@ -93,7 +93,7 @@
 // read pipe shows final data was sent
 if ( select_value == 1 ) {
- return toclient_complete ;
+ next_toclient = toclient_complete ;
 }
 TOCLIENTLOCK(hd);
---------------------

Thanks!
Paul Alfille

------------------------------------------------------------------------------
This SF.net email is sponsored by
Make an app they can't live without
Enter the BlackBerry Developer Challenge
http://p.sf.net/sfu/RIM-dev2dev
_______________________________________________
Owfs-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/owfs-developers
