Quite possible.

Basically one thread was telling the other thread that its work was done by
sending a pipe signal, then unlocking the mutex. The other thread was catching
the pipe signal, exiting, and then destroying the mutex. Leon caught the
fact that it should first lock the mutex, then unlock and clean up. (Or I could
have used non-detached threads.)
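For reference, the non-detached alternative might look roughly like the sketch below. This is not owserver code: `struct job`, `job_worker()` and `run_joined_job()` are made-up names, and the pipe is only there to mimic the "work is done" signal. The point is that pthread_join() guarantees the worker has finished its unlock before the mutex is destroyed.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical stand-in for the state shared by the two threads. */
struct job {
    pthread_mutex_t lock;
    int pipe_fd[2];          /* [0] = read end, [1] = write end */
    int done;
};

/* Worker: flag completion and signal through the pipe while holding the lock. */
static void *job_worker(void *arg)
{
    struct job *j = arg;
    char byte = 0;
    ssize_t n;

    assert(j != NULL);
    pthread_mutex_lock(&j->lock);
    j->done = 1;
    n = write(j->pipe_fd[1], &byte, 1);   /* wake the waiting thread */
    (void)n;
    pthread_mutex_unlock(&j->lock);
    return NULL;
}

/* Waiter: returns 0 if the worker finished and cleanup was orderly. */
int run_joined_job(void)
{
    struct job j;
    pthread_t tid;
    char byte;

    j.done = 0;
    if (pipe(j.pipe_fd) != 0)
        return -1;
    if (pthread_mutex_init(&j.lock, NULL) != 0)
        return -1;
    if (pthread_create(&tid, NULL, job_worker, &j) != 0)   /* joinable by default */
        return -1;

    /* Block until the worker signals; it may still be holding the lock here. */
    if (read(j.pipe_fd[0], &byte, 1) != 1)
        return -1;

    /* Joining waits for the worker to return, so its unlock has completed. */
    pthread_join(tid, NULL);

    pthread_mutex_destroy(&j.lock);
    close(j.pipe_fd[0]);
    close(j.pipe_fd[1]);
    return j.done ? 0 : -1;
}
```

Calling `run_joined_job()` from a test program (linked with the pthread library) should return 0.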

Can you give it a try?

Paul Alfille

 

(Sorry for the blank messages, Gmail seems a little flaky).

 

________________________________

From: Paul Alfille [mailto:[email protected]] 
Sent: Thursday, August 19, 2010 6:45 AM
To: OWFS (One-wire file system) discussion and help
Subject: Re: [Owfs-developers] owserver mutext bug solved.

 

 

On Wed, Aug 18, 2010 at 7:00 PM, Alex Shepherd <[email protected]> wrote:

Hmmm... does this relate to the issues that I was seeing where owserver
would crash with the Assertion:

"...tpp.c:66: __pthread_tpp_change_priority: Assertion..."

If it is then I'll rebuild from the latest source and see if it makes a
difference. Currently I'm running an older version that doesn't have the
Assertion issue.

Alex


________________________________

       From: Paul Alfille [mailto:[email protected]]
       Sent: Thursday, 19 August 2010 7:35 a.m.
       To: owfs-developers
       Subject: [Owfs-developers] owserver mutext bug solved.



       Leon did some careful sleuthing and found an architecture-dependent
bug in the locking model for owserver. The cause is most likely some
quirk of the mutex implementation on that platform, but the problem
seems solved.

       In his words, the problem:
       -------
       Submitted By: Leon (nleonard671)
       Assigned to: Nobody/Anonymous (nobody)
       Summary: owserver crashes on client disconnection on Mips CPU

       Initial Comment:
       I'm using OWFS with an OpenWrt OS on a MIPS CPU (exact model: MIPS
       24Kc V7.4) and I have a 100% reproducible crash problem with owserver.
       Steps to reproduce:
        1. Launch owserver, no matter whether you use a fake device or a
           real device (DS9490 in my case)
        2. Launch "owget /"
       => Segmentation fault in the server
       -----------


       And the solution:
       --------------
       Ok, I've finally managed to track down the bug. It's a race condition
       between the 'ping loop' (owserver/loop.c) and the 'data handler'
       (owserver/data.c).

       Possible crash explanation:
        1. [data.c:DataHandler()] The data thread locks the TOCLIENT mutex,
           sets hd->toclient to toclient_complete, then writes some dummy
           data to the pipe to wake up the PingLoop thread (line 215)
        2. [loop.c:Ping_or_Send()] The ping thread's select() is woken by the
           pipe data (line 92) and returns without locking the TOCLIENT
           mutex (line 96)
        3. [loop.c:PingLoop()] The ping thread exits the loop and executes
           LoopCleanup(), which destroys the TOCLIENT mutex
        4. [data.c:DataHandler()] The data thread unlocks the TOCLIENT mutex
           (line 218), and crashes
       I didn't notice earlier, but in my output log files the statement
       "Finished with client request" (data.c:222) was never written,
       confirming that the data thread crashes before reaching it.

       I tried unsuccessfully to reproduce it on my development hardware
       (x86); maybe the uClibc pthread implementation used by OpenWrt
       behaves differently from glibc's.

       A working fix: instead of returning from loop.c:Ping_or_Send()
       directly after the select(), a lock/unlock step is executed, allowing
       the data thread to finish properly before the shared resources are
       destroyed.

       --- owfs-2.8p0-ORIG/module/owserver/src/c/loop.c        2010-05-08 20:47:10.000000000 +0200
       +++ owfs-2.8p0/module/owserver/src/c/loop.c     2010-08-18 02:49:27.000000000 +0200
       @@ -93,7 +93,7 @@

              // read pipe shows final data was sent
              if ( select_value == 1 ) {
       -               return toclient_complete ;
       +               next_toclient = toclient_complete ;
              }

              TOCLIENTLOCK(hd);

       ---------------------
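The lock/unlock step described in the fix can be illustrated with a small standalone sketch. It is not the owserver source: `struct handoff`, `handoff_worker()` and `run_handoff()` are hypothetical names, and the real code uses select() and the TOCLIENTLOCK macro rather than a plain read(). The sketch assumes a detached worker that signals through a pipe while still holding the mutex, which is the situation in the crash explanation above.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical stand-in for the per-client state shared by both threads. */
struct handoff {
    pthread_mutex_t lock;
    int pipe_fd[2];          /* [0] = read end, [1] = write end */
    int done;
};

/* Plays the data thread: lock, flag completion, signal, then unlock. */
static void *handoff_worker(void *arg)
{
    struct handoff *h = arg;
    char byte = 0;
    ssize_t n;

    assert(h != NULL);
    pthread_mutex_lock(&h->lock);
    h->done = 1;
    n = write(h->pipe_fd[1], &byte, 1);   /* waiter can wake up from here on */
    (void)n;
    pthread_mutex_unlock(&h->lock);       /* last access to the shared state */
    return NULL;
}

/* Plays the ping thread: returns 0 if cleanup happened only after the unlock. */
int run_handoff(void)
{
    struct handoff h;
    pthread_t tid;
    pthread_attr_t attr;
    char byte;
    int done;

    h.done = 0;
    if (pipe(h.pipe_fd) != 0)
        return -1;
    if (pthread_mutex_init(&h.lock, NULL) != 0)
        return -1;

    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (pthread_create(&tid, &attr, handoff_worker, &h) != 0)
        return -1;
    pthread_attr_destroy(&attr);

    /* Woken by the pipe data; the worker may still hold the mutex. */
    if (read(h.pipe_fd[0], &byte, 1) != 1)
        return -1;

    /* The fix: take and release the lock once. This cannot succeed until
     * the worker has unlocked, so the destroy below no longer races. */
    pthread_mutex_lock(&h.lock);
    done = h.done;
    pthread_mutex_unlock(&h.lock);

    pthread_mutex_destroy(&h.lock);
    close(h.pipe_fd[0]);
    close(h.pipe_fd[1]);
    return done ? 0 : -1;
}
```

Without the lock/unlock pair before pthread_mutex_destroy(), the worker could still be inside its critical section when the mutex is destroyed, which matches step 4 of the crash explanation.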


       Thanks!
       Paul Alfille







------------------------------------------------------------------------------
This SF.net email is sponsored by

Make an app they can't live without
Enter the BlackBerry Developer Challenge
http://p.sf.net/sfu/RIM-dev2dev
_______________________________________________
Owfs-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/owfs-developers

 


