I have written a fix for XECS-1589, and I would like to have people
review it.

The fundamental problem in XECS-1589 is that sipXtack did not resend
requests that were sent over TCP.  As a result, a request would not
reach its destination if the near end believed it had a TCP connection
to the destination but the destination had forgotten about the
connection (e.g., the destination was a phone that had rebooted after a
power failure).  The first send would go over the broken TCP connection
and be rejected by the destination, and there would be no resend.

This patch causes TCP to use the same resend schedule as UDP.  Resends
are quenched when the sender receives a (provisional or final)
response, or when the attempt times out.  In the case of a broken TCP
connection, the first send fails, but the second send causes a new TCP
connection to be established, and so the message gets through.
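
To illustrate the recovery described above, here is a small simulation.
It is only a sketch and is not taken from the patch: FakeTcpPeer and
send_with_resends are hypothetical names, the timing of the resend
schedule is omitted, and a successful delivery stands in for receiving
a response.

```python
class FakeTcpPeer:
    """Hypothetical destination that only accepts connections it knows about."""

    def __init__(self):
        self.known_connections = set()
        self.received = []

    def connect(self):
        # Establish a fresh connection that the peer remembers.
        conn_id = object()
        self.known_connections.add(conn_id)
        return conn_id

    def deliver(self, conn_id, message):
        # A peer that rebooted has forgotten old connections and rejects them.
        if conn_id not in self.known_connections:
            raise ConnectionResetError("peer does not know this connection")
        self.received.append(message)


def send_with_resends(peer, stale_conn, message, max_sends=4):
    """Send message, resending until it is delivered or sends run out.

    Returns the number of sends used, or None if every send failed.
    """
    conn = stale_conn
    for attempt in range(1, max_sends + 1):
        try:
            if conn is None:
                conn = peer.connect()  # a resend opens a new connection
            peer.deliver(conn, message)
            return attempt
        except ConnectionResetError:
            conn = None  # drop the broken connection before the next send
    return None
```

With max_sends=1 (no resends, as before the patch) the request is lost;
with resends enabled, the second send gets through on a new connection.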

I've revised the default resend schedule to match the properties of real
networks:  Resends are by default sent at 100 msec, 300 msec, and 700
msec, with the final timeout at 1500 msec.  Given the RTTs in real
networks, this is more reasonable than the previous default (500 msec,
1500 msec, 3500 msec, up to 32000 msec).  The 1.5 sec final timeout is
long enough for satellite links, yet short enough not to be onerous
when failing over to an alternative destination.
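
The resend times quoted above are consistent with an interval that
doubles after each send (100, 200, 400 msec between sends).  That
doubling rule is my inference from the numbers, not something read out
of the patch, and resend_schedule is a hypothetical helper:

```python
def resend_schedule(first_interval_ms, final_timeout_ms):
    """Return the resend times in msec, measured from the first send.

    Assumes the interval between sends doubles each time and that no
    resend is scheduled at or after the final timeout.
    """
    times = []
    elapsed = 0
    interval = first_interval_ms
    while elapsed + interval < final_timeout_ms:
        elapsed += interval
        times.append(elapsed)
        interval *= 2
    return times


print(resend_schedule(100, 1500))  # new default: [100, 300, 700]
```

The same rule with a 500 msec first interval reproduces the start of the
previous default (500, 1500, 3500 msec).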

Previously, the stack sent a 100 response only on the initial receipt of
an INVITE request.  This patch corrects that by also sending 100
responses to resends of INVITE requests.  In addition, it sends 100
responses to resends (but not initial sends) of non-INVITE requests.
The latter is contrary to RFC 3261, but RFC 3261 assumes that non-INVITE
requests will complete quickly (<100 msec), and that is not true if
there are transport problems.
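
The rule above can be summarized as a small decision function.  This is
a sketch of the rule as stated here, not the code in the patch;
should_send_100 is a hypothetical name, and ACK (which never receives a
response) is not considered.

```python
def should_send_100(method, is_resend):
    """Decide whether to answer a received request with a 100 response.

    INVITE: always, for the initial receipt and for every resend.
    Non-INVITE: only for resends, never for the initial receipt.
    """
    if method == "INVITE":
        return True
    return is_resend
```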

Instructions are included in XECS-1589 for verifying a number of the
changed behaviors.

One thing this patch does *not* provide is proper feedback from
transport errors (detected in the kernel and passed back to user-space)
to SIP transport failures (reported to SipTransaction, causing immediate
fallback to alternative destinations).  This code attempts
retransmission until timeout even if the kernel provides a definitive
failure indication.  Implementing this feedback would require
considerable additional work.

Dale

