Luke Marsden <luke-li...@hybrid-logic.co.uk> writes:

> We're actually using it to provide redundancy in this instance. In our
> application any request for any site can be made to any (live) server,
> so having dead servers in the pool of A records doesn't matter so long
> as real web browsers failover to some other A record within a second,
> which they do! http://crypto.stanford.edu/dns/dns-rebinding.pdf

Be aware that failing over to an alternate A record need not be that fast; it depends on the sort of failure involved. Failover can only happen quickly when the outage (network unreachable, port no longer active on the host, etc.) is one where the connection attempt is explicitly rejected by the target host or by a router along the way. If it's a more complicated outage (e.g., a routing loop or total machine failure) for which the client receives no explicit failure response, you'll be subject to the client's connect timeout, once for each failing address and for each attempt it makes to that address. These timeouts vary by client and platform, but can easily be 30-60s, certainly long enough for the human involved to want to give up.

Also, since web browsers typically cache DNS responses, if a bad address is early in the list, a timeout will be encountered for each and every individual request the browser generates.

I did a quick test with a stock Firefox 3.6 under OS X: with a bad initial A record (non-existent host), it took about 75s to fail over to the next A record. In my test case even that was unusable, since the home page I was loading referenced other resources on the same host, and each of those references took another 75 seconds to time out. So it took more than 2 minutes before I saw the page I wanted, by which point I presume most people would have given up.

That's not to say that using multiple A records isn't a helpful practice for many sorts of outages (especially to permit controlled maintenance). Just don't expect it to be sufficient in every failure mode; that depends on the behavior you want clients to experience.

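To make the per-address cost concrete, here is a minimal Python sketch using only the standard library (it is not Twisted-specific, and it is not how any browser actually does this). It tries each address in order with an explicit connect timeout and records how long each attempt took and how it ended. The names `resolve_a_records` and `connect_first_reachable`, and the 3-second default, are my own illustrative choices.

```python
import socket
import time


def resolve_a_records(host, port):
    """Return the distinct IPv4 (address, port) pairs the resolver gives for host."""
    infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    endpoints = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        if sockaddr not in endpoints:
            endpoints.append(sockaddr)
    return endpoints


def connect_first_reachable(endpoints, timeout=3.0):
    """Try each (address, port) in order.

    Returns (sock, attempts): sock is a connected socket, or None if every
    endpoint failed; attempts is a list of (endpoint, outcome, seconds).
    """
    attempts = []
    for endpoint in endpoints:
        started = time.monotonic()
        try:
            sock = socket.create_connection(endpoint, timeout=timeout)
        except ConnectionRefusedError:
            # Explicit rejection by the target host: this comes back quickly.
            outcome, sock = "refused", None
        except socket.timeout:
            # No answer at all: we paid the full connect timeout.
            outcome, sock = "timeout", None
        except OSError:
            # Other explicit failures, e.g. network or host unreachable.
            outcome, sock = "error", None
        else:
            outcome = "connected"
        attempts.append((endpoint, outcome, time.monotonic() - started))
        if sock is not None:
            return sock, attempts
    return None, attempts
```

Something like `connect_first_reachable(resolve_a_records("example.com", 80))` would exercise it against a real name. With silent failures, the worst-case delay before reaching a live address is roughly the timeout multiplied by the number of dead addresses ahead of it, which is why a bad record early in the list hurts so much.
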
If this is strictly limited to a client you control, it's much less of an issue, since you can set the TCP connect timeout much lower than its default. You still probably can't match how quickly failover happens for rejected connections, since you'll want to leave enough room for occasional latency or response-time issues without immediately failing over. But you can do a lot better than the system defaults.

-- David

_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python