Hi Cinthia,

Although the t_check_trans() enhancement solves one corner case, there are plenty more which could lead to lingering grabbed locks or too many lock releases:

* what if the UAS (original UAC in your scenario) goes offline due to a network event for 1 minute? The Re-INVITE will time out (having received no reply), and your onreply_route will not get triggered, causing a lingering lock. You might get away with releasing it inside the failure_route, but it still looks error-prone to me.

* how do we handle call forking? You are grabbing the lock once, but releasing it for each reply of a forked outgoing branch.

Although I wrote the cfgutils locking support, I would advise against using it in such a SIP-centered, complex use case. I suggest simply storing/testing for a marker at dialog level, combined with t_check_trans() to detect retransmissions.

Upon an in-dialog request arrival:

* if the marker is not there -> put it, and proceed normally (e.g. if (!$dlg_val(request_pending)) $dlg_val(request_pending) = 1;)

* if the marker is there (e.g. if ($dlg_val(request_pending))) -> drop the request, let the UAC retransmit it. Hopefully, by the time the retransmission arrives, the marker is invalidated, otherwise we will have to keep on dropping them.

Upon an in-dialog reply arrival, we invalidate the marker ($dlg_val(request_pending) = 0).
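
Put together, the marker logic above might look roughly like the untested sketch below. It assumes the tm and dialog modules are loaded, that the dialog was created on the initial INVITE, and that in-dialog requests are already matched to their dialog by the time $dlg_val is read. The route name "clear_pending" is made up for illustration, and the exact truth-test and integer/string behavior of $dlg_val may differ between OpenSIPS versions.

```
route {
    # ... initial (out-of-dialog) request handling goes here ...

    if (has_totag() && loose_route()) {
        # ACK and CANCEL are left out: they should never be held back
        if (!is_method("ACK|CANCEL")) {
            # a retransmission of a request we are already relaying
            # must not be mistaken for a second, competing request
            if (t_check_trans())
                exit;

            # marker is set: another request is still pending, so drop
            # this one silently and let the sender retransmit it
            if ($dlg_val(request_pending))
                exit;

            # marker is not set: put it and arm the reply route
            $dlg_val(request_pending) = 1;
            t_on_reply("clear_pending");
        }

        t_relay();
        exit;
    }
}

# hypothetical route name; invalidates the marker on reply arrival
onreply_route[clear_pending] {
    $dlg_val(request_pending) = 0;
}
```

Note that a request which times out without any reply never reaches the onreply_route, so the marker would most likely have to be invalidated from a failure_route as well.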

Cheers,

Liviu Chircu
OpenSIPS Developer
http://www.opensips-solutions.com

On 22.01.2018 22:47, Cinthia Leung wrote:
Hello all,

We're trying to use get_dynamic_lock as a solution to a race condition.  OpenSIPS receives an in-dialog INVITE and UPDATE almost at the same time from a UAS and passes both to the SIP client immediately.  The client responds to whichever one arrives first and sends a SIP 500 to the second packet.

I know we should look into fixing this behavior in the UAS.  But this wonderful cfgutils feature looks like something that may help us in the meantime.  We call get_dynamic_lock for the first packet in the in-dialog route.  release_dynamic_lock is called when we receive a final response, which should not take long because there's no human interaction involved.  And then the second packet gets the lock.

We have since learned that this approach does not work well when there's network latency that causes the UAS to retransmit, as you can imagine.  We are planning to use t_check_trans() before calling get_lock.  At the same time we're wondering if there's a way to keep track of these locks, and potentially purge them periodically as needed.

If you have any other ideas on how to solve our original problem, then even better.

TYVMIA!


Cindy


_______________________________________________
Users mailing list
[email protected]
http://lists.opensips.org/cgi-bin/mailman/listinfo/users
