Hi Pekka and all,

We are having an issue with sofia-sip in a multi-threaded
environment. I am not sure whether this is a bug in sofia-sip or a flaw
in our multi-threading model, but either way I would like to know your
opinion and whether my suggestion on how to solve it makes sense.
As background, we receive all SIP messages in a "listener" thread A, but
the messages are processed by different "worker" threads (B, C, ...). Of
course, we try to make sure that the root object of the NTA is properly
acquired/released when we operate on the nta_agent_t.

The issue is the following:

[t] We create an INVITE outgoing transaction and we send it to the UAS.

[t+3s] We receive 100 Trying in thread A

[t+4s] We receive 200 OK in thread A, and the callback set in
orq->orq_callback is called. We start processing the 200 OK reply in a
DIFFERENT thread B (not the one running the nta_agent).

[t+7s] We receive a retransmitted 200 OK in thread A. Thread B has not
finished processing the first 200 OK, so orq->orq_completed is still
not TRUE. Thus, outgoing_recv() does not call outgoing_duplicate(), as
the message is not yet treated as a retransmission. The problem is that,
before calling orq->orq_callback() again, the previous response msg is
destroyed with msg_destroy():
  /* Previous orq response is destroyed */
  if (orq->orq_response)
    msg_destroy(orq->orq_response);
  /* New orq response is set */
  orq->orq_response = msg;
  /* Call callback */
  orq->orq_callback(orq->orq_magic, orq, sip);

[t+8s] We finish processing the first 200 OK in thread B, and we want
to generate the ACK. BUT the sip_t we received in the callback is NO
LONGER valid, as it belongs to the first msg_t (which was destroyed
right after the second 200 OK arrived). Thus, we end up with lots of
invalid reads reported by valgrind, and potentially a segfault.

Of course, in a single-threaded application this would not happen, as
the retransmitted 200 OK would always be received after the first
200 OK had been fully processed.


Now, in order to avoid this, the idea is to make sure that at least
one reference to the msg_t stays alive while we process the reply in
thread B. We could do this by calling msg_ref_create() inside our
callback (the one stored in orq->orq_callback) and calling msg_destroy()
ourselves once we no longer need the associated sip_t.
The steps would be:

 * We receive the SIP response in the listener thread A. Sofia-SIP calls
our nta_response_f callback stored in orq->orq_callback.

 * Our nta_response_f makes sure a new reference to the msg_t is
obtained. We don't have a pointer to the msg_t, but we have the orq and
we can get the msg_t from the orq:
  /* get msg_t from the orq */
  msg_t *msg = nta_outgoing_getresponse (orq);
  /* new reference in the msg_t */
  msg_ref_create (msg);
  /* Now, we 'forward' the reply to one of the workers as before... */

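 * In the worker thread, when we finish processing the response, we
just make sure our reference is released. This has to be the msg_t
that the callback handed over to the worker; asking the orq again at
this point could return the retransmitted 200 OK instead of the message
we are holding:
  /* msg is the pointer handed over by our callback */
  /* destroy our reference */
  msg_destroy(msg);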
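
To make the lifetime argument concrete, here is a small self-contained
toy model in C. It does not use sofia-sip at all: toy_msg, toy_orq and
every toy_* function are made-up stand-ins for msg_t, the orq,
msg_ref_create() and msg_destroy(), and the whole walk-through runs in a
single thread. It is only a sketch of the reference-holding idea under
those assumptions, not how the library is implemented.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for msg_t: a payload plus an atomic reference count. */
typedef struct toy_msg {
    atomic_int refs;
    char body[32];
} toy_msg;

/* Messages allocated and not yet freed; used only to check the demo. */
static int toy_live_msgs = 0;

static toy_msg *toy_msg_new(const char *body)
{
    toy_msg *m = calloc(1, sizeof *m);
    if (m == NULL)
        return NULL;
    atomic_init(&m->refs, 1);                   /* creator owns one reference */
    strncpy(m->body, body, sizeof m->body - 1); /* calloc keeps it terminated */
    toy_live_msgs++;
    return m;
}

/* Hypothetical counterpart of msg_ref_create(): take one more reference. */
static toy_msg *toy_msg_ref(toy_msg *m)
{
    assert(m != NULL);
    atomic_fetch_add(&m->refs, 1);
    return m;
}

/* Hypothetical counterpart of msg_destroy(): drop one reference and
 * free the message when the last reference goes away. */
static void toy_msg_unref(toy_msg *m)
{
    if (m != NULL && atomic_fetch_sub(&m->refs, 1) == 1) {
        toy_live_msgs--;
        free(m);
    }
}

/* Toy transaction: owns one reference to the latest response. */
typedef struct toy_orq {
    toy_msg *response;
} toy_orq;

/* Same shape as the excerpt quoted earlier: release the previous
 * response, then store the new one. */
static void toy_orq_set_response(toy_orq *orq, toy_msg *m)
{
    toy_msg_unref(orq->response);
    orq->response = m;
}

/* Returns 1 if the message held for the worker survived the
 * retransmission and nothing leaked, 0 otherwise. */
static int toy_demo(void)
{
    toy_orq orq = { NULL };
    toy_msg *first = toy_msg_new("200 OK #1");
    toy_msg *second = toy_msg_new("200 OK #2");
    if (first == NULL || second == NULL) {
        toy_msg_unref(first);
        toy_msg_unref(second);
        return 0;
    }

    toy_orq_set_response(&orq, first);        /* [t+4s] first 200 OK        */
    toy_msg *held = toy_msg_ref(first);       /* callback keeps a reference */
    toy_orq_set_response(&orq, second);       /* [t+7s] retransmission      */

    int ok = strcmp(held->body, "200 OK #1") == 0; /* [t+8s] still readable */

    toy_msg_unref(held);                      /* worker is done             */
    toy_orq_set_response(&orq, NULL);         /* transaction goes away      */
    return ok && toy_live_msgs == 0;
}
```

Without the toy_msg_ref() call, the second toy_orq_set_response() would
free the first message while the worker still points at it, which is the
invalid read we see in valgrind. Note that the worker releases the
pointer it was handed rather than re-reading it from the transaction.
For the real code it is also worth checking whether
nta_outgoing_getresponse() already returns a new reference (its
documentation suggests it does), in which case the extra
msg_ref_create() would be one reference too many.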


Of course, this solution only avoids the invalid reads in
valgrind. Retransmissions will still arrive at the nta_agent_t without
being treated as retransmissions, because thread B has not yet
'completed' the outgoing request when thread A receives the new
message. Is there any way of avoiding this?

Sorry for the long email, btw.

Cheers,
-Aleksander




_______________________________________________
Sofia-sip-devel mailing list
Sofia-sip-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sofia-sip-devel
