Hi,

Ovidiu Sas wrote:
> It is a race between the 200ok being sent out and the update of the
> dialog state machine.  The ACK is received while the dialog state
> machine was not updates for the previously sent 200ok.
> IIRC, in opensips this was fixed by introducing a new tm callback in
> order to update the dialog state machine as soon as the 200ok was sent
> out.

Taking a look at dlg_handlers.c from 3.0.2, one can see that the
function which transitions the dialog state (dlg_onreply) is called back
from TMCB_RESPONSE_FWDED or TMCB_RESPONSE_READY. While the former is
called just before a response is forwarded (but hasn't been so yet), the
latter is mapped to TMCB_RESPONSE_OUT which, according to t_hooks.h and
the code in t_reply.c, is called *after* the response has been delivered.

In 1.5, however, dialog's reply handler function already relies on
TMCB_RESPONSE_PRE_OUT which properly avoids the possibility of receiving
an ACK prior to the complete processing of a previous OK response, and
in turn race conditions.

Consequently, it seems to me that 3.0.x introduced a bug by not using
the proper tm callback anymore, specifically not one that finishes for
sure before any message has been sent out. I'm not quite sure whether
TMCB_RESPONSE_READY has simply been given the wrong value (its name
indicates to me that the response is still waiting to be transmitted) or
the tm module uses the wrong callback macros: Apparently, 1.5's *_PRE_*
callbacks have gone and *_SENT introduced instead in 3.0.2. Not sure if
that circumstance is truly reflected by the code.


Cheers,

--Timo



> For now, the only way to avoid this issue is to run with one single
> listener.  This will keep the ACK into the received queue until the
> listener finishes the processing of received 200ok.
> 
> 
> Regards,
> Ovidiu Sas
> 
> On Mon, Jun 14, 2010 at 9:01 AM, Klaus Feichtinger
> <klaus.feichtin...@gmx.net> wrote:
>> Hello Ovidiu,
>>
>> I've been testing reproducablility of the described error and found that it 
>> is reproducable in "standard" mode onyl. This should mean: forked 
>> application with child processes. As soon as I start the application even as 
>> non-forked or with only one child (children=1), the error was not 
>> reproducable until now. I've fastened reproduction with SIPp the call 
>> generating tool and in standard mode this error messages are visible after 
>> about 1-2 minutes (the call rate is one call per second with a pause of 
>> 1500ms between ACK and the following BYE).
>>
>> Under these test circumstances no subscription for presence or dialog state 
>> was active; it was tested on a "isolated" server where no other UAs were 
>> active; only SIPp and one other UA were active.
>>
>> I found that this problem is UserAgent-independent, too. Do you have an idea 
>> how to find out any more details about this problem?
>>
>> regards,
>>
>> Klaus Feichtinger
>>
>>> Datum: Thu, 10 Jun 2010 22:24:52 +0200
>>> Von: Klaus Feichtinger <klaus.feichtin...@gmx.net>
>>> An: Ovidiu Sas <o...@voipembedded.com>
>>> CC: sr-dev@lists.sip-router.org
>>> Betreff: Re: [sr-dev] Dialog module: "CRITICAL bogus event 6 in state 2
>>> Hello,
>>>
>>> according tcpdump-traces on the linux machine and log entries in syslog
>>> (see below) the 200 OK passed the SIP Router (this means - the call was
>>> accounted and afterwards the bogus event was noted):
>>>
>>> Jun 10 18:29:09 debian /usr/sbin/kamailio[20026]: INFO: <script>:
>>> ---xlog: a call with sip:116...@10.16.10.99 is coming in
>>> Jun 10 18:29:12 debian /usr/sbin/kamailio[20027]: NOTICE: acc
>>> [acc.c:275]: ACC: transaction answered:
>>> timestamp=1276208952;method=INVITE;from_tag=3120233282-36939680-1276187634579;to_tag=3338003860-56654088-1276187447718;call_id=741466613-36939680-1276187634...@10.16.10.152;code=200;reason=OK;src_user=116201;src_domain=10.16.10.99;dst_user=116202;dst_domain=10.16.10.90;src_ip=10.16.10.152
>>> Jun 10 18:29:12 debian /usr/sbin/kamailio[20025]: CRITICAL: dialog
>>> [dlg_hash.c:591]: bogus event 6 in state 2 for dlg 0xa7eb3ea8
>>> [2285:1225504260] with clid
>>> '741466613-36939680-1276187634...@10.16.10.152' and tags
>>> '3120233282-36939680-1276187634579' ''
>>>
>>>
>>> However, this is not the one and only event / state combination I found
>>> in syslog. There is also an entry with event 6 in state 1 present:
>>>
>>> Jun 10 19:35:49 debian /usr/sbin/kamailio[20026]: INFO: <script>:
>>> ---xlog: a call with sip:116...@10.16.10.99 is coming in
>>> Jun 10 19:35:50 debian /usr/sbin/kamailio[20027]: NOTICE: acc
>>> [acc.c:275]: ACC: transaction answered:
>>> timestamp=1276212950;method=INVITE;from_tag=3805046041-37017968-1276191635112;to_tag=befc3f7326747dei0;call_id=789390150-37017968-1276191635...@10.16.10.152;code=200;reason=OK;src_user=116201;src_domain=10.16.10.99;dst_user=116002;dst_domain=10.16.10.205;src_ip=10.16.10.152
>>> Jun 10 19:35:50 debian /usr/sbin/kamailio[20026]: CRITICAL: dialog
>>> [dlg_hash.c:591]: bogus event 6 in state 1 for dlg 0xa7e979b8
>>> [1648:1273002173] with clid
>>> '789390150-37017968-1276191635...@10.16.10.152' and tags
>>> '3805046041-37017968-1276191635112' ''
>>>
>>> But tomorrow I will test the setting - as recommended by you - to reduce
>>> kamailio to a single worker and check if the problem will keep
>>> repreduceable.
>>>
>>> Regards,
>>>
>>> Klaus Feichtinger
>>>
>>>> Bogus event 6 in state 2 means ACK received in early state.
>>>> This is a little bit bizarre, because it seems that the dialog did not
>>>> see the 200ok passing by.
>>>> Try to use a single worker (children=1) and see if you can reproduce
>>>> the problem.
>>>> Also, check that the 200ok is really going through the proxy.
>>>>
>>>> Regards,
>>>> Ovidiu Sas
>>>>
>>>> On Thu, Jun 10, 2010 at 6:44 AM, Klaus Feichtinger
>>>> <klaus.feichtin...@gmx.net> wrote:
>>>>
>>>>> Hello list,
>>>>>
>>>>> under special circumstances I have a problem with the DIALOG module of
>>> SIP-Router/Kamailio version 3.0.2. The dialog module is used in combination
>>> with presence, presence_xml, pua, pua_usrloc and pua_dialoginfo modules +
>>> db_mysql modules. The db_mode is set to "write_through", because of
>>> redundancy requirements.
>>>>> From my point of view this problem has something to do with timing in
>>> general. E.g. a so called "data call" has a duration of about 50 ms only;
>>> INVITE....BYE). That problem occurs on this short calls only. Because of the
>>> problems in the dialog state machine, NOTIFY messages are incorrect. Even
>>> when the call is already finished, the NOTIFY message includes the state
>>> "confirmed" and causes a wrong status indication....
>>>>> The detailed error message can be seen here:
>>>>> "Jun  9 16:00:39 debian /usr/sbin/kamailio[14992]: CRITICAL: dialog
>>> [dlg_hash.c:591]: bogus event 6 in state 2 for dlg [dialog-ID] with clid
>>> [Call-ID] and tags '1299370188-28358304-1276092068837' ''"
>>>>>
>>>>> I found in old mails of the developer list another error that looked
>>> nearly the same, but it differed in the event and state of the dialog state
>>> machine (the thread can be found under the link
>>> http://www.mail-archive.com/de...@lists.kamailio.org/msg03234.html). That 
>>> bug should - according the
>>> information that I've found in the list - be solved. Therefore I will ask
>>> you: is the error as displayed above another well known error / bug, which
>>> (maybe) should already be solved? Is it a new bug?
>>>>> Thanks for any information.
>>>>>
>>>>> regards,
>>>>>
>>>>> Klaus Feichtinger
>>>>>
>>>>> _______________________________________________
>>>>> sr-dev mailing list
>>>>> sr-dev@lists.sip-router.org
>>>>> http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
>>>>>
>>>>>
>>>
>>> _______________________________________________
>>> sr-dev mailing list
>>> sr-dev@lists.sip-router.org
>>> http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
>> --
>> GMX DSL: Internet-, Telefon- und Handy-Flat ab 19,99 EUR/mtl.
>> Bis zu 150 EUR Startguthaben inklusive! http://portal.gmx.net/de/go/dsl
>>
> 
> _______________________________________________
> sr-dev mailing list
> sr-dev@lists.sip-router.org
> http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev


-- 
Timo Reimann
Software-Developer

1&1 Internet AG - Consumer Core
Ernst-Frey-Straße 9 · DE-76135 Karlsruhe
Telefon: +49 721 91374-8122
timo.reim...@1und1.de

Amtsgericht Montabaur / HRB 6484
Vorstände: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Thomas
Gottschlich, Robert Hoffmann, Markus Huhn, Hans-Henning Kettler, Dr.
Oliver Mauss, Gert Nowotny, Jan Oetjen
Aufsichtsratsvorsitzender: Michael Scheeren

_______________________________________________
sr-dev mailing list
sr-dev@lists.sip-router.org
http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev

Reply via email to