________________________________________ From: [email protected] [[email protected]] On Behalf Of Irena Dolovčak [[email protected]]
I have a two-server HA sipx cluster. When I take down the primary server, everything still works with the secondary server, but it takes much longer to establish a call (any call). On the primary server it normally takes 1-2 seconds between placing a call and the start of the ringing tone, no matter what the endpoint device is. In failover mode I get much higher times:

X-Lite – 5-7 seconds
Yealink – 5-7 seconds (just want to mention that Yealink now works great with sipx; in the newest revision they fully support DNS SRV records and all the relevant options)
Snom – 37 seconds

These are constant, repeatable times. The servers are configured with strict failover, not load balancing. We took a couple of different approaches with DNS and DHCP to make sure they are fine.
_______________________________________

In all of these cases, it is the phone that performs the failover operation. It sends (and re-sends) the INVITE to the primary. After not getting any response from the primary for a certain length of time, it selects the next destination (the secondary) from the DNS results and sends the INVITE to it. So it should be the phone that is controlling the failover time. To check that this is what is happening, run a Wireshark capture and watch for the INVITE messages sent by the phone.

The official recommendations for doing the failover are in RFC 3261 (figure 5 and table 4). The failover time is named "Timer B" and is defined there as 64*T1. But practice may differ from those figures. Without change (T1 = 500 msec), Timer B would be 32 seconds, which is too long for most practical situations. The recommendations could be amended by noting that T1 is supposed to be an estimate of the round-trip time in your network, and that for most LANs, 500 msec is far too long. Reducing T1 to a more realistic 100 msec reduces Timer B to 6.4 seconds. In many corporate networks, the RTT may be as small as 10 msec, reducing Timer B to 0.64 seconds.
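To make the arithmetic concrete, here is a rough Python sketch of the INVITE timing over UDP as RFC 3261 describes it: the retransmit interval (Timer A) starts at T1 and doubles after each send, and the transaction gives up when Timer B (64*T1) fires. With that definition, T1 = 100 msec gives a Timer B of 6.4 seconds. The function names are made up for illustration, and real phones may well deviate from this schedule, so treat it as a way to know what to look for in the Wireshark capture rather than as a description of any particular phone.

```python
# Sketch of RFC 3261 INVITE client transaction timing over UDP.
# Function names are hypothetical; values are in milliseconds to keep
# the arithmetic exact.

def timer_b_ms(t1_ms):
    """Timer B (INVITE transaction timeout) is 64*T1 per RFC 3261 table 4."""
    return 64 * t1_ms


def invite_send_times_ms(t1_ms):
    """Offsets at which the INVITE is sent before Timer B fires.

    Timer A starts at T1 and doubles after every retransmission.
    """
    deadline = timer_b_ms(t1_ms)
    times = []
    offset = 0
    interval = t1_ms
    while offset < deadline:
        times.append(offset)
        offset += interval
        interval *= 2
    return times


if __name__ == "__main__":
    for t1 in (500, 100, 10):
        sends = invite_send_times_ms(t1)
        print(f"T1={t1} ms: Timer B={timer_b_ms(t1)} ms, "
              f"{len(sends)} sends at {sends}")
```

If the capture shows the phone's INVITEs to the primary spaced roughly like this, and the first INVITE to the secondary appearing about 64*T1 after the first one, the phone is using the RFC timers unchanged.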
Of course, none of this matters unless the phones can be configured to use the shorter times. Hopefully adjusting the timer configuration on the phones will solve that. IIRC, sipX uses a failover time of 1.5 seconds when sending to a destination, which is long enough for almost all networks we've encountered and yet short enough that users don't notice it as a failure.

Dale
_______________________________________________
sipx-users mailing list
[email protected]
List Archive: http://list.sipfoundry.org/archive/sipx-users/
