Re: Locking bug in NSMessagePortWin32

Richard Frith-Macdonald Sat, 09 Sep 2006 10:01:45 -0700


On 5 Sep 2006, at 13:43, Wim Oudshoorn wrote:

Debugging under Windows is a little tricky, but in our application I observe
the following deadlock:

Thread 8:
NSMessagePort _setupSendPort line 145 self = 0x19a5770 Block on lock: this->lock
NSMessagePort newWithName line 208 Grabs lock: messagePortLock
       ...
NSMessagePort receivedEventRead line 638 self = 0x2a45c90
Thread 1:
NSMessagePort newWithName line 200 Block on lock: messagePortLock
       ...
NSMessagePort receivedEventRead line 638 self = 0x19a5770 Grabs lock: 0x19a5770->lock
Consequence:  DEADLOCK!


So here is a scenario how we can end up in this situation.

1 - Thread 8 sends a message to thread 1.
2 - Thread 1 replies to thread 8
3 - Thread X sends a message to thread 1.
4 - Thread 1 starts handling the message from Thread X and grabs
    the 0x19a5770->lock

5 - Thread 8 starts handling the reply of thread 1
6 - Thread 8 reads the send port of the reply and tries to
    get the port that was used to send the reply.
    For this it calls newWithName.

7 - Thread 8 grabs the messagePortLock in newWithName
8 - Thread 8 calls _setupSendPort on the message port 0x19a5770
    which was used for sending
9 - Thread 8 tries to grab 0x19a5770->lock but fails (held by
    thread 1 in step 4)
10 - Thread 1 continues and wants to deduce the port that thread X
     used for sending,
11 - Thread 1 calls newWithName and blocks on messagePortLock (held
     by thread 8 in step 7)

So an obvious fix is to try to make the locks non-nesting in
newPortName:  and initWithName:.

But:

A - I don't know if that is wrong


Seems plausible though.

B - I don't know if it is enough to fix the problem

I'm not sure either ... but there is no obvious way that this would happen if the call to _setupSendPort is moved outside the region protected by the messagePortLock ... so I've restructured the code that way.
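
As a rough illustration of that restructuring (plain C with pthreads, not the actual NSMessagePortWin32 code), the lookup below holds the table lock only while searching and calls the send-side setup after dropping it, so the table lock and the per-port lock are never held together. Port, table_lock, register_port, setup_send_port and port_with_name are all hypothetical names standing in for the real class, messagePortLock, _setupSendPort and newWithName.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a message port; not the GNUstep class. */
typedef struct Port {
    pthread_mutex_t lock;  /* per-port lock (the role of 0x19a5770->lock) */
    const char *name;
    int send_ready;        /* nonzero once the sending side is set up */
} Port;

/* Hypothetical stand-ins for the global name table and messagePortLock. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static Port *table[16];
static size_t table_count;

static void port_init(Port *p, const char *name) {
    pthread_mutex_init(&p->lock, NULL);
    p->name = name;
    p->send_ready = 0;
}

static void register_port(Port *p) {
    pthread_mutex_lock(&table_lock);
    assert(table_count < sizeof table / sizeof table[0]);
    table[table_count++] = p;
    pthread_mutex_unlock(&table_lock);
}

/* Takes only the per-port lock; safe to call repeatedly. */
static void setup_send_port(Port *p) {
    pthread_mutex_lock(&p->lock);
    if (!p->send_ready) {
        p->send_ready = 1;
    }
    pthread_mutex_unlock(&p->lock);
}

/* Search under the table lock, but run the send-side setup only after
 * the table lock has been released. */
static Port *port_with_name(const char *name) {
    Port *found = NULL;
    pthread_mutex_lock(&table_lock);
    for (size_t i = 0; i < table_count; i++) {
        if (strcmp(table[i]->name, name) == 0) {
            found = table[i];
            break;
        }
    }
    pthread_mutex_unlock(&table_lock);
    if (found != NULL) {
        setup_send_port(found);  /* table_lock is no longer held here */
    }
    return found;
}
```

With this shape, a thread blocked on a port's own lock no longer keeps every other thread out of the name table. The sketch ignores object lifetime; real code would also have to retain the port before dropping the table lock.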

C - I just have this nagging feeling that _setupSendPort is
    useless anyway.  Why is it called on a port that already exists?

I think it's because the port may exist only for receiving and need to be set up for sending too.

This code also suffered from the bug that we could potentially get double deallocation of a port if one thread searched the table and found it while another thread was performing a final release on it. I've added an implementation of -release which should fix that.
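
The general idea behind such a release can be sketched in plain C with pthreads (this is not the code that was committed, and UPort, uniq_lock, uport_get and uport_release are made-up names): the final decrement and the removal from the table happen under the same lock that lookups take, so a lookup can never find and retain an object whose count has already reached zero.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical uniqued object kept in a global list. */
typedef struct UPort {
    char name[32];
    int refs;
    struct UPort *next;
} UPort;

static pthread_mutex_t uniq_lock = PTHREAD_MUTEX_INITIALIZER;
static UPort *uniq_head;
static int freed_count;  /* illustration only: counts deallocations */

/* Return the unique object for name, creating it if needed.
 * The retain of an existing object happens under the table lock. */
static UPort *uport_get(const char *name) {
    pthread_mutex_lock(&uniq_lock);
    UPort *p = uniq_head;
    while (p != NULL && strcmp(p->name, name) != 0) {
        p = p->next;
    }
    if (p != NULL) {
        p->refs++;
    } else {
        p = calloc(1, sizeof *p);
        if (p != NULL) {
            strncpy(p->name, name, sizeof p->name - 1);
            p->refs = 1;
            p->next = uniq_head;
            uniq_head = p;
        }
    }
    pthread_mutex_unlock(&uniq_lock);
    return p;
}

/* Drop one reference. The decrement and the unlinking are done under
 * the table lock; the memory is freed only after unlinking. */
static void uport_release(UPort *p) {
    int last;
    pthread_mutex_lock(&uniq_lock);
    assert(p->refs > 0);
    last = (--p->refs == 0);
    if (last) {
        UPort **link = &uniq_head;
        while (*link != p) {
            link = &(*link)->next;
        }
        *link = p->next;  /* no longer reachable from the table */
    }
    pthread_mutex_unlock(&uniq_lock);
    if (last) {
        free(p);
        freed_count++;
    }
}
```

Once the last reference is gone, a later uport_get for the same name simply creates a fresh object instead of reviving one that is being torn down.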

We need to review all the places where objects are 'uniqued' in a global table but are not permanently cached ... they probably all suffer from the same problem and need fixing.




_______________________________________________
Bug-gnustep mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/bug-gnustep
