________________________________________
From: [email protected] 
[[email protected]] On Behalf Of Irena Dolovčak 
[[email protected]]

I have a two server HA sipx cluster.
When I take down the primary server, everything still works with the secondary 
server but it takes much longer to establish a call (any call).

On the primary server it normally takes 1-2 seconds between placing a call and 
the start of the ringing tone, no matter what the endpoint device is.
In the failover mode I get much higher times:

X-Lite – 5-7 seconds
Yealink – 5-7 seconds (just want to mention that Yealink now works great with 
sipx. In newest revision they fully support DNS SRV records and all the 
relevant options)
Snom – 37 seconds

These are constant, repeatable times.
Servers are configured with strict failover, not load balancing. We took a 
couple of different approaches with DNS and DHCP to make sure they are fine.
_______________________________________

In all of these cases, it is the phone that performs the failover operation.  
It sends (and re-sends) the INVITE to the primary.  After not getting any 
response from the primary for a certain length of time, it selects the next 
destination (the secondary) from the DNS results and sends the INVITE to it.  
So it should be the phone that is controlling the failover time.
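
As a rough illustration of the behaviour described above, here is a minimal 
Python sketch of how the delay accumulates when a phone walks through its DNS 
SRV targets in priority order.  The host names, the SrvTarget type and the 
failover_delay function are hypothetical, and a real phone's logic may differ 
(SRV weights are ignored here, for instance).

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class SrvTarget(NamedTuple):
    priority: int  # lower value is tried first, as with DNS SRV records
    host: str      # hypothetical host name
    alive: bool    # whether this server answers the INVITE


def failover_delay(
    targets: Sequence[SrvTarget], per_target_timeout_s: float
) -> Tuple[Optional[str], float]:
    """Return (host that answered, seconds spent waiting on dead targets)."""
    waited = 0.0
    for target in sorted(targets, key=lambda t: t.priority):
        if target.alive:
            return target.host, waited
        # The phone gives up on this target only after its own timeout expires.
        waited += per_target_timeout_s
    return None, waited


if __name__ == "__main__":
    cluster = [
        SrvTarget(1, "sipx1.example.com", alive=False),  # primary is down
        SrvTarget(2, "sipx2.example.com", alive=True),   # secondary answers
    ]
    print(failover_delay(cluster, per_target_timeout_s=32.0))
```

The point of the sketch is that the extra call setup time in failover mode is 
the phone's own per-destination timeout, not anything the secondary server 
adds.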

To check that this is what is happening, run a Wireshark capture and watch for 
the INVITE messages sent by the phone.

The official recommendations for failover timing are in RFC 3261 (figure 5 
and table 4).  The failover time is named "Timer B" and is defined as 64*T1.  
Practice may differ from those figures, though.  With the default T1 of 500 
msec, Timer B is 32 seconds, which is too long for most practical situations.  
The recommendations could be amended by noting that T1 is supposed to be an 
estimate of the round-trip time in your network, and that for most LANs, 500 
msec is far too long.  Reducing T1 to a more realistic 100 msec reduces Timer 
B to 6.4 seconds.  In many corporate networks, the RTT may be as small as 10 
msec, reducing Timer B to 0.64 seconds.
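
To make the arithmetic concrete, here is a small Python sketch based on the 
RFC 3261 definitions: Timer B is 64*T1, and over UDP the INVITE is re-sent at 
intervals that start at T1 and double each time (Timer A) until Timer B 
fires.  The function names are my own, and the sketch assumes a phone that 
follows the RFC exactly, which real phones may not.

```python
from typing import List


def timer_b_ms(t1_ms: int) -> int:
    """RFC 3261 Timer B (INVITE transaction timeout) is 64 * T1."""
    return 64 * t1_ms


def invite_send_times_ms(t1_ms: int) -> List[int]:
    """Offsets (in msec) at which the INVITE is sent over UDP.

    The first send is at 0; Timer A starts at T1 and doubles after each
    retransmission.  Sending stops once Timer B would have fired.
    """
    deadline = timer_b_ms(t1_ms)
    times = [0]
    elapsed = 0
    interval = t1_ms
    while elapsed + interval < deadline:
        elapsed += interval
        times.append(elapsed)
        interval *= 2
    return times


if __name__ == "__main__":
    for t1 in (500, 100, 10):
        print(f"T1={t1} msec  Timer B={timer_b_ms(t1)} msec  "
              f"INVITE sent at {invite_send_times_ms(t1)}")
```

In a Wireshark capture, the gaps between successive INVITEs to the primary 
should follow this doubling pattern if the phone implements the RFC timers, 
and the first gap gives a hint of the T1 value the phone is using.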

Of course, none of this matters unless the phones can be configured to use the 
shorter times.  Hopefully adjusting the phones' configuration will solve 
that.

IIRC, sipX uses a failover time of 1.5 seconds when sending to a destination, 
which is long enough for almost all networks we've encountered and yet short 
enough that users don't notice it as a failure.

Dale
_______________________________________________
sipx-users mailing list
[email protected]
List Archive: http://list.sipfoundry.org/archive/sipx-users/
