On Sep 30, 2013, at 2:17 PM, Alan M. Carroll <a...@network-geographics.com> wrote:
> I have run in to an issue with a client which I think is of general interest.
>
> The issue arises in cases where you have a set of origin servers that share a
> fully qualified domain name but have distinct IP addresses. In this case the
> DNS query returns all of the IP addresses, each one corresponding to a
> specific server. The desired behavior is for the connection load to be spread
> across the servers. However, if you rely on round robin support from the DNS
> server, because ATS caches the DNS response you get the servers being hit
> hard one after another, as the DNS data times out.
>
> If you set the ATS internal DNS round robin
> (proxy.config.hostdb.strict_round_robin) you can have another problem if you
> also have server session sharing enabled. For each transaction on a session,
> a HostDB lookup is done on the FQDN of the origin server. If there is a
> server session associated with the client session, the FQDN and the IP
> address are checked against those in the server session and if they match,
> the server session is kept with the client session. This makes server
> sessions "sticky" and kept associated with a client as long as that client
> continues to make requests to the same FQDN.
>
> This changes if strict round robin is used because the HostDB lookup for the
> same FQDN will generally return a different IP address. In this case the
> server session associated with the client session is rarely re-used, leading
> to a lot of additional server sessions and re-connects.
>
> There are a couple of approaches to dealing with this, if server session
> sharing is desired.
>
> 1) Change the associated session check to look only at the FQDN. If that
> matches, keep the session. [*]
>
> 2) Enable the HostDB lookup to take a hint IP address. If that address is
> valid for the FQDN, use it regardless of round robin settings. If not, select
> an IP address as if no hint were provided. The IP address from an associated
> server session (if any) is passed to the HostDB lookup. [#]

How about matching on FQDN, but frequently(-ish) expiring sessions from the
pool? This would cause the pool to slowly traverse the full set of origins.

> These turn out to potentially have different operational characteristics. One
> case that came up was if the nameserver was rotating IP addresses, such as
> returning only 2 of 4 but (over time) changing the pair returned. This can
> lead to the case where a server session remains valid with an FQDN / IP
> address pair that is not considered valid by HostDB (because the DNS response
> timed out and a new one was retrieved which does not contain that IP
> address). In the case of my particular client, this is considered a feature
> and not a bug, but others might see it differently.
>
> What we would like is to be able to have DNS round robin effects *between*
> clients while keeping the stickiness of the client / server session sharing
> that exists without multiple address FQDNs. Does this seem useful to anyone
> else, and what style of implementation would seem best? Should the choices of
> (1) and (2) be configurable?
>
> * This is a bit trickier to implement than it sounds, because the HostDB
> lookup has been done and the resulting IP address stored in various places.
> That value becomes wrong if the currently associated server session is kept
> because it has a different IP address. Currently I think only one additional
> place needs to be updated but I am still researching that.
>
> # It turns out the HostDB lookup already checks for an associated server
> session for unrelated reasons, so picking out the IP address would be a
> simple matter.
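
To make the difference between the quoted options (1) and (2) concrete, here
is a rough toy model in Python. It is only a sketch under stated assumptions
and not ATS code: ToyHostDB, update(), lookup(), keep_session() and the
example addresses are all hypothetical names invented for illustration. It
models a strict round robin pick with an optional hint IP, plus the two
flavours of the "keep the server session?" check.

```python
class ToyHostDB:
    """Toy stand-in for HostDB (hypothetical; not the ATS API)."""

    def __init__(self):
        self._ips = {}   # fqdn -> IPs from the latest simulated DNS response
        self._next = {}  # fqdn -> index of the next round-robin pick

    def update(self, fqdn, ips):
        """Simulate a fresh DNS response replacing the cached one."""
        self._ips[fqdn] = list(ips)
        self._next[fqdn] = 0

    def lookup(self, fqdn, hint_ip=None):
        ips = self._ips[fqdn]
        # Option (2): use the hint if it is still valid for this FQDN,
        # regardless of the round-robin position.
        if hint_ip in ips:
            return hint_ip
        # Otherwise select as if no hint were provided (strict round robin).
        i = self._next[fqdn]
        self._next[fqdn] = (i + 1) % len(ips)
        return ips[i]


def keep_session(session, fqdn, ip, fqdn_only):
    """Decide whether an associated server session is kept.

    session is a (fqdn, ip) tuple or None. fqdn_only=True models
    option (1); fqdn_only=False models the FQDN + IP check.
    """
    if session is None:
        return False
    if fqdn_only:
        return session[0] == fqdn
    return session == (fqdn, ip)


FQDN = "origin.example.com"  # placeholder name
db = ToyHostDB()
db.update(FQDN, ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"])

# Two clients with no server session yet are spread by round robin.
client_a = db.lookup(FQDN)
client_b = db.lookup(FQDN)

# Client A's next transaction passes its session IP as the hint and sticks.
sticky_a = db.lookup(FQDN, hint_ip=client_a)

# The nameserver rotates to a pair that no longer contains client A's IP.
db.update(FQDN, ["10.0.0.3", "10.0.0.4"])
after_rotation = db.lookup(FQDN, hint_ip=client_a)

kept_fqdn_only = keep_session((FQDN, client_a), FQDN, after_rotation, fqdn_only=True)
kept_fqdn_and_ip = keep_session((FQDN, client_a), FQDN, after_rotation, fqdn_only=False)
```

In this toy model, once the rotation drops client A's address, the FQDN-only
check (1) still keeps the old session, while the hinted lookup (2) falls back
to a fresh round robin pick and the FQDN + IP check lets the session go. As I
read it, that is the operational difference described in the quoted text.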