Bugs item #2270819, was opened at 2008-11-12 19:43
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=756076&aid=2270819&group_id=143636

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Pekka Pessi (ppessi)
Assigned to: Pekka Pessi (ppessi)
Summary: Automatic NOTIFY uses stale handle

Initial Comment:
The nua handle gets recycled and reference counting fails if a NOTIFY is 
generated by the stack after the handle is destroyed.

--

Due to a slightly non-standard SIP flow in an attended transfer implementation 
I encountered, I've found a way to crash the Sofia stack.  It's a race 
condition, and is difficult to reproduce, but I can usually get it to happen 
after about an hour of execution on our test system here.  I am willing to run 
some tests of fixes on our set-up here, if need be.

I have the transferer and transferee running on the same host, sharing the same 
Sofia nua stack instance.  The transfer destination is running on another host.  
The transferee's handle for its dialog with the transferer (D1 below) is getting 
deallocated (ref count reaches zero) while it is still on the handle list.  The 
deallocation memsets the entire handle to 0xaa.  The next time nua_stack_timer 
runs, it calls nh_call_pending for the deallocated handle, which dereferences a 
pointer found inside the handle, and BOOM!  I get a bus error due to an 
unaligned memory address access (it's trying to load a 32-bit word from 
0xaaaaaaaa + 0x20).
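
To make the arithmetic behind that bus error concrete, here is a small sketch.  
The struct layout and field names are hypothetical (the real nua handle is 
laid out differently), and it assumes 32-bit pointer-sized fields, which is 
what the 0xaaaaaaaa value suggests.  It only computes the address; it never 
dereferences it.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a handle; NOT the real nua handle layout. */
struct toy_handle {
    uint32_t next;   /* assumed 32-bit pointer-sized field */
    uint32_t owner;  /* assumed 32-bit pointer-sized field */
};

/* Poison a "freed" handle with 0xaa, as the deallocation does, and return
 * the address a later load at byte offset 0x20 from `owner` would use. */
static uint32_t poisoned_load_address(void)
{
    struct toy_handle h;
    memset(&h, 0xaa, sizeof h);  /* every byte becomes 0xaa */
    return h.owner + 0x20u;      /* 0xaaaaaaaa + 0x20 */
}

/* A 32-bit load needs a 4-byte aligned address on strict-alignment CPUs. */
static int is_word_aligned(uint32_t addr)
{
    return (addr & 0x3u) == 0;
}
```

poisoned_load_address() yields 0xaaaaaaca, whose low two bits are not zero, 
so is_word_aligned() reports 0.  On a CPU that enforces alignment, a word load 
from such an address raises SIGBUS rather than SIGSEGV.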

Further investigation shows that the stack is getting into this situation as 
follows:

   1. Transferer has sent INVITE to transfer destination and received a 100 
Trying and 180 Ringing, but transfer destination hasn't answered the call.
   2. Transferer sends REFER to transferee, immediately followed by a BYE (This 
is the non-"standard" part.  The transferer should really wait to send the BYE 
until it gets the NOTIFY [200 OK]).
   3. Transferee receives the REFER and sends an INVITE to the transfer 
destination (starts dialog D3).
   4. Transferee receives BYE indications, etc., with the last being a 
"Terminated" from Sofia for D1, which the transferee responds to by calling 
nua_handle_destroy.
   5. Transferee's nua "protocol thread" processes an r_destroy signal for D1's 
handle, thereby removing that handle from the handle list.
   6. Transfer destination sends 100 Trying and 180 Ringing to transferee for 
D3.
   7. Transferee's nua "protocol thread" processes an r_notify signal in 
nua_stack_signal.  At line 549, in nua_stack.c, the handle is seen to not be on 
the handle list, and is added back onto the list.
   8. 180 Ringing arrives for D3 from the transfer destination just as the 
transferee hangs up (calls nua_bye for) D3, because the automated test system 
ended the call.
   9. Transferee receives a bunch of indications from nua for D3 (as a result 
of receiving the 487 Request Terminated), the last of which is a "Terminated" 
indication.  The transferee responds to the "Terminated" indication by calling 
nua_handle_destroy for D3's handle.
  10. Transferee's nua "protocol thread" processes an r_notify signal for D1's 
handle.  The handle is unrefd, and the ref count reaches zero.  The handle is 
deallocated, but it's still on the handle list.
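
The handle-related steps above (5, 7 and 10) can be replayed in a toy model.  
Everything here is hypothetical: the struct, the function names, and the 
assumption that exactly one reference is still outstanding on D1's handle 
after r_destroy.  It is not Sofia's code, and a `freed` flag stands in for the 
real deallocation so the sketch stays well-defined.

```c
#include <stdbool.h>

/* Toy model only; names and fields are hypothetical, not Sofia's. */
struct toy_handle {
    int  refs;     /* outstanding references */
    bool on_list;  /* is the handle on the stack's handle list? */
    bool freed;    /* stands in for free() plus the 0xaa poisoning */
};

/* Dropping the last reference frees the handle but leaves on_list alone. */
static void toy_unref(struct toy_handle *h)
{
    if (--h->refs == 0)
        h->freed = true;
}

/* Step 5: r_destroy takes the handle off the handle list. */
static void toy_r_destroy(struct toy_handle *h)
{
    h->on_list = false;
}

/* Steps 7 and 10: r_notify re-adds an unlisted handle to the list, and
 * (in step 10) drops a reference afterwards. */
static void toy_r_notify(struct toy_handle *h, bool drop_ref)
{
    if (!h->on_list)
        h->on_list = true;
    if (drop_ref)
        toy_unref(h);
}

/* What the timer's list walk would run into. */
static bool toy_timer_would_touch_freed(const struct toy_handle *h)
{
    return h->on_list && h->freed;
}

/* Replay steps 5, 7 and 10 for D1's handle. */
static bool replay_steps(void)
{
    struct toy_handle d1 = { .refs = 1, .on_list = true, .freed = false };
    toy_r_destroy(&d1);        /* step 5: off the list */
    toy_r_notify(&d1, false);  /* step 7: back on the list */
    toy_r_notify(&d1, true);   /* step 10: ref count hits zero */
    return toy_timer_would_touch_freed(&d1);
}
```

replay_steps() returns true: the handle ends up both freed and on the list, 
which is the state the timer trips over.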


Of course, the next time the timer expires and runs nua_stack_timer, the handle 
list is traversed, and my thread catches the SIGBUS to hell.  ;)

It seems like the NOTIFY code needs to remove the handle from the handle list 
when the subscription terminates, but I don't really know what I'm talking 
about here.  Maybe there's some other reason why the handle needs to remain on 
the handle list, in which case, there must be a missing call to nua_handle_ref 
somewhere.
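
As a rough illustration of the first idea, here is a sketch of an unref that 
takes the handle off the list before freeing it.  All names are made up for 
the example, and I am not claiming this is how the fix should look inside 
nua; it only shows the invariant "a freed handle is never on the list".

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical intrusive list node; not Sofia's handle structure. */
struct toy_handle {
    struct toy_handle  *next;
    struct toy_handle **prev;  /* address of the pointer that points at us;
                                  NULL while the handle is unlinked */
    int refs;
};

/* Allocate an unlinked handle holding one reference. */
static struct toy_handle *toy_new(void)
{
    struct toy_handle *h = calloc(1, sizeof *h);
    if (h)
        h->refs = 1;
    return h;
}

/* Push the handle on the front of the list unless it is already linked. */
static void toy_link(struct toy_handle **head, struct toy_handle *h)
{
    if (h->prev)
        return;
    h->next = *head;
    if (*head)
        (*head)->prev = &h->next;
    h->prev = head;
    *head = h;
}

/* Remove the handle from whatever list it is on; a no-op if unlinked. */
static void toy_unlink(struct toy_handle *h)
{
    if (!h->prev)
        return;
    *h->prev = h->next;
    if (h->next)
        h->next->prev = h->prev;
    h->next = NULL;
    h->prev = NULL;
}

/* Drop one reference; on the last one, unlink first and then free.
 * Returns 1 if the handle was freed, 0 otherwise. */
static int toy_unref(struct toy_handle *h)
{
    if (--h->refs > 0)
        return 0;
    toy_unlink(h);
    free(h);
    return 1;
}

/* Count the handles a timer walk would visit. */
static size_t toy_count(const struct toy_handle *head)
{
    size_t n = 0;
    for (; head; head = head->next)
        n++;
    return n;
}
```

The alternative (taking an extra reference whenever the handle is put back on 
the list) would keep the handle alive instead of unlinking it; which of the 
two is right depends on why the handle is re-added in the first place.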

I have a workaround that I'm testing right now.  I've changed the transferee 
such that it waits until after D3 has either become active or terminated before 
it calls nua_handle_destroy for the terminated dialog D1.  This seems to be 
working, but I'll run it for a few days to be sure.
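
The shape of that workaround, as a minimal sketch: the state struct, enum and 
callback names below are invented for illustration and are not part of the nua 
API.  The only real call involved is nua_handle_destroy, and it appears just 
as a comment at the point where the application would make it.

```c
#include <stdbool.h>

/* Hypothetical application-side bookkeeping for one transfer. */
enum toy_call_state { TOY_CALLING, TOY_ACTIVE, TOY_TERMINATED };

struct toy_transfer {
    bool d1_terminated;            /* "Terminated" indication seen for D1 */
    bool d1_destroyed;             /* D1's handle has been destroyed */
    enum toy_call_state d3_state;  /* state of the new dialog D3 */
};

/* D3 is settled once it is either active or terminated. */
static bool d3_settled(const struct toy_transfer *t)
{
    return t->d3_state == TOY_ACTIVE || t->d3_state == TOY_TERMINATED;
}

/* Destroy D1's handle only when D1 has terminated AND D3 is settled. */
static void maybe_destroy_d1(struct toy_transfer *t)
{
    if (t->d1_terminated && !t->d1_destroyed && d3_settled(t)) {
        /* The application would call nua_handle_destroy() for D1 here. */
        t->d1_destroyed = true;
    }
}

/* Called on the "Terminated" indication for D1: record it, do not
 * destroy immediately. */
static void on_d1_terminated(struct toy_transfer *t)
{
    t->d1_terminated = true;
    maybe_destroy_d1(t);
}

/* Called whenever D3 changes state. */
static void on_d3_state(struct toy_transfer *t, enum toy_call_state s)
{
    t->d3_state = s;
    maybe_destroy_d1(t);
}
```

With D3 still in TOY_CALLING, on_d1_terminated() leaves the handle alone; the 
destroy happens on the later on_d3_state() call that reports TOY_ACTIVE or 
TOY_TERMINATED.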

Cheers.

--Jen 

_______________________________________________
Sofia-sip-devel mailing list
Sofia-sip-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sofia-sip-devel
