I have run into an issue with a client that I think is of general interest.

The issue arises when a set of origin servers shares a fully qualified domain 
name but each server has a distinct IP address. In this case the DNS query 
returns all of the IP addresses, each one corresponding to a specific server. 
The desired behavior is for the connection load to be spread across the 
servers. However, if you rely on round robin support from the DNS server, ATS 
caches the DNS response, so the servers are hit hard one after another, with 
the load moving to the next server each time the cached DNS data times out.
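To illustrate the effect, here is a small simulation sketch. It is not ATS 
code. It assumes a nameserver that rotates the address order on every query 
and a proxy that caches each response for a fixed TTL and always uses the 
first cached address. The function name, the TTL, and the one request per 
tick model are all made up for the example.

```python
from collections import deque


def simulate(addresses, ttl, total_requests):
    """Return the origin address used for each request (one request per tick)."""
    dns_order = deque(addresses)      # order the nameserver will return next
    cached, expires_at = None, 0
    used = []
    for tick in range(total_requests):
        if cached is None or tick >= expires_at:
            cached = list(dns_order)  # fresh DNS answer replaces the cache
            dns_order.rotate(-1)      # nameserver rotates for the next query
            expires_at = tick + ttl
        # Assumption: the proxy always picks the first cached address.
        used.append(cached[0])
    return used


hits = simulate(["A", "B", "C"], ttl=4, total_requests=12)
print(hits)
# ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C']
```

Each server takes all of the traffic for one TTL period and then none until 
its turn comes around again.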

If you enable the ATS internal DNS round robin 
(proxy.config.hostdb.strict_round_robin), you can run into another problem if 
you also have server session sharing enabled. For each transaction on a 
session, a HostDB lookup is done on the FQDN of the origin server. If there is 
a server session associated with the client session, the FQDN and the IP 
address are checked against those in the server session and, if both match, 
the server session is kept with the client session. This makes server sessions 
"sticky": a server session stays associated with a client as long as that 
client continues to make requests to the same FQDN.
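As a rough sketch of that check (not the actual ATS code; ServerSession and 
keep_attached_session are hypothetical names used only for illustration):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServerSession:
    """Hypothetical stand-in for a server session attached to a client session."""
    fqdn: str
    ip: str


def keep_attached_session(attached: Optional[ServerSession],
                          fqdn: str, resolved_ip: str) -> bool:
    """True if the attached session matches both the FQDN and the looked-up IP."""
    return (attached is not None
            and attached.fqdn == fqdn
            and attached.ip == resolved_ip)


session = ServerSession("origin.example.com", "192.0.2.10")
print(keep_attached_session(session, "origin.example.com", "192.0.2.10"))  # True
print(keep_attached_session(session, "origin.example.com", "192.0.2.11"))  # False
```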

This changes if strict round robin is used, because the HostDB lookup for the 
same FQDN will generally return a different IP address. In this case the server 
session associated with the client session is rarely reused, leading to a lot 
of additional server sessions and reconnects.
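A toy model of the cost, assuming a single client and a lookup that either 
always returns the same address or rotates on every call. count_reconnects is 
a hypothetical helper, and real traffic with many clients will not rotate this 
regularly from any one client's point of view.

```python
from itertools import cycle


def count_reconnects(lookups, requests):
    """Count requests that cannot reuse the attached server session.

    lookups yields the IP address the lookup returns for each transaction.
    """
    attached_ip = None
    reconnects = 0
    for _ in range(requests):
        ip = next(lookups)
        if ip != attached_ip:   # IP mismatch: drop the session and reconnect
            reconnects += 1
            attached_ip = ip
    return reconnects


ADDRS = ["192.0.2.10", "192.0.2.11", "192.0.2.12", "192.0.2.13"]
fixed = count_reconnects(cycle(ADDRS[:1]), 8)  # same address on every lookup
strict = count_reconnects(cycle(ADDRS), 8)     # new address on every lookup
print(fixed, strict)  # 1 8
```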

There are a couple of approaches to dealing with this, if server session 
sharing is desired.

1) Change the associated session check to look only at the FQDN. If that 
matches, keep the session. [*]

2) Enable the HostDB lookup to take a hint IP address. If that address is valid 
for the FQDN, use it regardless of round robin settings. If not, select an IP 
address as if no hint were provided. The IP address from an associated server 
session (if any) is passed to the HostDB lookup. [#]
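A minimal sketch of both approaches, to make the difference concrete. This is 
not the HostDB interface; RoundRobinEntry, select, and keep_session_fqdn_only 
are invented names, and the rotation here is a plain counter.

```python
from typing import List, Optional


def keep_session_fqdn_only(session_fqdn: str, request_fqdn: str) -> bool:
    """Approach (1): keep the attached session whenever the FQDN matches."""
    return session_fqdn == request_fqdn


class RoundRobinEntry:
    """Approach (2): a round robin record whose lookup accepts a hint address."""

    def __init__(self, addrs: List[str]):
        self.addrs = list(addrs)
        self._next = 0

    def select(self, hint: Optional[str] = None) -> str:
        # A hint that is still valid for the FQDN wins over round robin.
        if hint is not None and hint in self.addrs:
            return hint
        # Otherwise select as if no hint had been provided.
        ip = self.addrs[self._next % len(self.addrs)]
        self._next += 1
        return ip


entry = RoundRobinEntry(["192.0.2.10", "192.0.2.11"])
first = entry.select()                     # no session yet: 192.0.2.10
print(entry.select(hint=first))            # 192.0.2.10 (hint is valid, kept)
print(entry.select(hint="198.51.100.1"))   # 192.0.2.11 (hint is not valid)
```

Note that a lookup satisfied by the hint does not advance the rotation in 
this sketch, so other clients still see the addresses in turn.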

These two approaches can have different operational characteristics. One case 
that came up is a nameserver that rotates IP addresses, for example returning 
only 2 of 4 addresses but (over time) changing the pair returned. With 
approach (1), this can lead to a server session remaining in use with an 
FQDN / IP address pair that HostDB no longer considers valid (because the DNS 
response timed out and the new response does not contain that IP address). In 
the case of my particular client, this is considered a feature and not a bug, 
but others might see it differently.
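The rotating-pair case can be written out as follows. The two functions are 
hypothetical simplifications of approaches (1) and (2), and the addresses are 
made up.

```python
def option1_keeps(session_fqdn, session_ip, fqdn, current_addrs):
    """Approach (1): only the FQDN is compared; current_addrs is ignored."""
    return session_fqdn == fqdn


def option2_keeps(session_fqdn, session_ip, fqdn, current_addrs):
    """Approach (2): the session IP is a hint, honored only if still valid."""
    return session_fqdn == fqdn and session_ip in current_addrs


FQDN = "origin.example.com"
old_pair = ["192.0.2.10", "192.0.2.11"]   # first DNS response
new_pair = ["192.0.2.11", "192.0.2.12"]   # response after the first timed out

# The client's server session was opened to 192.0.2.10 under the old response.
print(option1_keeps(FQDN, "192.0.2.10", FQDN, new_pair))  # True
print(option2_keeps(FQDN, "192.0.2.10", FQDN, new_pair))  # False
```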

What we would like is to have DNS round robin effects *between* clients while 
keeping the stickiness of the client / server session sharing that exists for 
FQDNs without multiple addresses. Does this seem useful to anyone else, and 
what style of implementation would seem best? Should the choice between (1) 
and (2) be configurable?

* This is a bit trickier to implement than it sounds, because the HostDB lookup 
has been done and the resulting IP address stored in various places. That value 
becomes wrong if the currently associated server session is kept because it has 
a different IP address. Currently I think only one additional place needs to be 
updated but I am still researching that.

# It turns out the HostDB lookup already checks for an associated server 
session for unrelated reasons, so picking out the IP address would be a simple 
matter.
