Bugs item #2940556, was opened at 2010-01-27 00:34
Message generated for change (Tracker Item Submitted) made by apogrebennyk
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=1086410&aid=2940556&group_id=232389

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: modules
Group: 1.5.x
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Andrew Pogrebennyk (apogrebennyk)
Assigned to: Nobody/Anonymous (nobody)
Summary: Timer bug in TM handler

Initial Comment:
I have got stuck trying to implement a sort of serial forking (forwarding) 
scenario with OpenSIPS. The thing is that when fr_inv_timer hits, OpenSIPS 
(prematurely) sends INVITE on the next branch and only after that CANCELs the 
previous one. (And if the gateway receives different branch on transaction to 
which no final reply has been sent yet - it can merge the requests.) It can be 
clearly seen in the attached trace that contains messages from the OpenSIPS to 
the b2bua on the same machine port 5061. I've found this behavior to be 
consistent in the versions 1.3.2 - 1.5.3, but the logs that follow are from 
OpenSER 1.3.2. Similar bug has been reported for Kamailio: 
http://lists.kamailio.org/pipermail/devel/2009-May/018982.html

Another thing I've found is that OpenSIPS resets the fr_timer in 
retransmission_handler() if no provisional response to INVITE has been 
received. So if the fr_timer hits OpenSIPS silently fails over to the next 
forwarding destination without sending CANCEL to this branch (again 1.5.3 is 
affected). Here we see that it forwards the INVITE and sets FR_TIMER as per 
script:

Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: Request leaving server, 
D-URI='<null>' - M=INVITE RURI=sip:40084956288...@195.128.80.211:5061 
F=sip:000...@sip2.d-tel.ru T=sip:000...@sip2.d-tel.ru IP=77.122.227.73 
ID=YmM1NGE4NWQ1ZWZhZjVjNDBkNzljODZjYTk5MDVlN2Q.
[...skipped...]
Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: DBG:tm:_set_fr_retr: 
FR_TIMER = 10
Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: DBG:tm:set_timer: relative 
timeout is 10
Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: DBG:tm:set_timer: calling 
insert_timer_unsafe: timeout=10, final value=46
...
Here it hits the retransmission_handler() - and this "relative timeout 
is 1000000" messages appears:

Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: 
DBG:tm:retransmission_handler: retransmission_handler : request resending 
(t=0x28701ad0, INVITE si ... )
Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: DBG:tm:set_timer: relative 
timeout is 1000000
Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: DBG:tm:set_timer: calling 
insert_timer_unsafe: timeout=1000000, final value=37500000
Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: 
DBG:tm:insert_timer_unsafe: [5]: 0x28701d30 (37500000)
Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: 
DBG:tm:retransmission_handler: retransmission_handler : done

I think this timeout somehow interferes with fr_timer because when fr_timer 
hits in 20 seconds, OpenSIPS _silently_ fails over to the next 
forwarding destination without sending CANCEL to _this_ branch. This occurs 
only when no provisional response has been got so I tend to think 
this is somehow related to the timeout thing.

I have added extra debug print to insert_timer_unsafe() like:
  LM_DBG("calling insert_timer_unsafe: timeout=%lld, final 
value=%lld\n",timeout,timeout + get_ticks());

so you could see the sequence of timer updates:
Jan 27 05:04:15 sip2 /usr/local/sbin/openser[48744]: DBG:tm:set_timer: calling 
insert_timer_unsafe: timeout=2000000, final value=39500000
Jan 27 05:04:17 sip2 /usr/local/sbin/openser[48744]: DBG:tm:set_timer: calling 
insert_timer_unsafe: timeout=4000000, final value=43500000
Jan 27 05:04:21 sip2 /usr/local/sbin/openser[48744]: DBG:tm:set_timer: calling 
insert_timer_unsafe: timeout=4000000, final value=47500000

Then OpenSIPS fails over to the next forwarding destination on locally 
generated 408 - but note there is no real "sending cancel..." message 
from cancel_branch() between these messages:

Jan 27 05:04:24 sip2 /usr/local/sbin/openser[48744]: 
DBG:tm:insert_timer_unsafe: [0]: 0x28701e60 (56)
Jan 27 05:04:24 sip2 /usr/local/sbin/openser[48744]: DBG:tm:relay_reply: 
branch=1, save=1, relay=-1
Jan 27 05:04:24 sip2 /usr/local/sbin/openser[48744]: 
DBG:tm:final_response_handler: done

Please check.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=1086410&aid=2940556&group_id=232389

_______________________________________________
Devel mailing list
Devel@lists.opensips.org
http://lists.opensips.org/cgi-bin/mailman/listinfo/devel

Reply via email to