XX-6065 is a can of worms:  There are a number of problems that can
produce the symptom (a phone rings forever), and a number of symptoms
that can arise from the principal cause (a phone does not respond to an
INVITE with a 1xx response quickly enough).


Background:  The following mailing-list threads discuss some of the
underlying design questions regarding the request resend schedule.  For
each thread, the message linked is the one I consider most relevant:

http://list.sipfoundry.org/archive/sipx-dev/msg15757.html
Subject: [sipX-dev] Improving the resend schedule


http://list.sipfoundry.org/archive/sipxecs-commit/msg20957.html
Subject: [SFtrack] Issue Comment Edited: (XECS-1589) SipXecs does not
gracefully handle broken TCP streams

http://list.sipfoundry.org/archive/sipx-dev/msg15640.html
Subject: Re: [sipX-dev] Retransmission of messages over TCP


Component problems/solutions:

1.  As far as I know, all SIP phones will ring forever if they receive
an INVITE but no CANCEL.  This is really silly, as there is no situation
where a phone should ring for more than, say, 15 minutes.  So we should
suggest to phone vendors that phones have a "dead-man" expiration of
15 minutes or so.

2.  To work around (1), *every* INVITE generated by sipXecs should have
an Expires header.  Fortunately, every transaction in the proxy has an
expiration time, so we can have the proxy add an Expires header to every
INVITE that does not already have one, without changing behavior in any
non-error case.  To avoid having both ends of the transaction cancel it
at the same time (causing clutter in traces), the added Expires value
should be 1 or 2 seconds longer than the expiration in the proxy
transaction.
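
As a rough illustration of (2), here is a minimal Python sketch (not
sipXecs code, which is C++).  Headers are modeled as a plain dict, and
the function name add_expires_if_missing and the 2-second margin
constant are assumptions made up for this example:

```python
import math

# Hypothetical margin so the far end's timer fires after the proxy's own.
EXPIRES_MARGIN_SECS = 2


def add_expires_if_missing(headers, transaction_expiry_secs):
    """Return a copy of headers with an Expires header guaranteed present.

    headers: dict of header name -> value for an outgoing INVITE.
    transaction_expiry_secs: seconds until the proxy transaction expires.
    """
    result = dict(headers)
    # SIP header names are case-insensitive, so compare in lower case.
    if not any(name.lower() == "expires" for name in result):
        # Expires carries whole seconds; round up so it is never shortened.
        expires = math.ceil(transaction_expiry_secs) + EXPIRES_MARGIN_SECS
        result["Expires"] = str(expires)
    return result
```

An INVITE that already carries an Expires header passes through
unchanged, so only requests that lack the header are affected.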

3.  If the proxy receives a 1xx response to an INVITE that it has sent,
but the transaction in the proxy has already (internally) canceled that
leg, the proxy should send a CANCEL.  Currently, if the leg is canceled
before any 1xx has been received, the proxy (correctly) does not send a
CANCEL at that point, but if a 1xx arrives later, the proxy still never
sends the CANCEL.  (Kathy E. tells me that at least part of the
machinery needed for this is present in SipTransaction; it is possible
that I am requesting the intended behavior, and that there is an
outright bug in sipXtack about this.)
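
The behavior requested in (3) can be sketched as a small state machine.
This is an illustrative Python model only; the class InviteLeg and its
method names are hypothetical and do not correspond to SipTransaction:

```python
class InviteLeg:
    """Tracks just enough state for one outgoing INVITE leg (illustrative)."""

    def __init__(self):
        self.provisional_seen = False  # has any 1xx arrived?
        self.canceled = False          # has the proxy given up on this leg?
        self.cancel_sent = False       # has a CANCEL gone on the wire?

    def cancel(self):
        """Proxy abandons the leg. Returns True if a CANCEL is sent now."""
        self.canceled = True
        return self._maybe_send_cancel()

    def on_provisional(self):
        """A 1xx arrived. Returns True if it triggers a deferred CANCEL."""
        self.provisional_seen = True
        return self._maybe_send_cancel()

    def _maybe_send_cancel(self):
        # A CANCEL goes out only after a 1xx has been seen, and at most once.
        if self.canceled and self.provisional_seen and not self.cancel_sent:
            self.cancel_sent = True
            return True
        return False
```

With this model, calling cancel() before any 1xx sends nothing, and the
first later on_provisional() call is what emits the CANCEL.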

4.  The timeout for resending to a non-responding destination should be
increased.  The current timeout is 1.5 seconds.  This appears to have
been chosen only because we thought it was sufficient for the slowest
networks that we expected sipXecs to be used on.  (See the discussions
referenced above.)  The new value needs to be discussed, but values up
to 6 seconds have been suggested.  The constraints are that the value
should be high enough to avoid incorrectly marking destinations as
unresponsive, but low enough to not annoy users in HA systems with
non-responding components.  (IIRC, the typical HA worst case is 3 times
the time-out:  once for a proxy, once for a registrar, and once for a
target application server.)

Note that this would be done by increasing the number of resend cycles,
rather than increasing the length of the first resend cycle (the T1
value), because T1 is set based on the most common case (where RTT < 100
msec).
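
To make the arithmetic concrete, here is a small Python sketch.  It
assumes T1 = 100 msec and that each resend cycle is double the previous
one with no cap; both are assumptions for illustration (they happen to
reproduce the 1.5 second figure with 4 cycles), and the function names
are hypothetical:

```python
def total_resend_timeout_ms(t1_ms, cycles):
    """Total wait before giving up, if each cycle doubles the previous one."""
    return sum(t1_ms * 2 ** i for i in range(cycles))


def ha_worst_case_ms(t1_ms, cycles, hops=3):
    """Worst case when every one of `hops` components times out in turn."""
    return hops * total_resend_timeout_ms(t1_ms, cycles)


if __name__ == "__main__":
    for cycles in (4, 5, 6):
        total = total_resend_timeout_ms(100, cycles)
        worst = ha_worst_case_ms(100, cycles)
        print(f"{cycles} cycles: {total} ms per hop, {worst} ms HA worst case")
```

Under these assumptions, 4 cycles give 1500 msec, 5 give 3100 msec, and
6 give 6300 msec per hop, so the 3-hop HA worst case grows from 4.5
seconds to roughly 19 seconds at 6 cycles.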

5.  If phones take more than 100 msec to respond to an INVITE, they are
out of specification.  If response times are much more than that, we
should prod the phone vendors to investigate the problem.  (This is
really a "soft real-time" problem, and that is a known problem domain.)

Dale

