TTL-based DNS resolution ?

Ben Tisdall Fri, 15 Apr 2016 08:54:24 -0700

Hi,

Are there any plans to support DNS resolution based on TTL, a la NGINX? This
would be helpful for use cases where the upstream is an ELB or similar
system. I've pasted a reply from AWS support based on some observations
from a couple of our services that use HAProxy 1.6 in front of ELBs. Note
that I am not contending that the issue of uneven distribution of upstream
IPs is HAProxy's fault (that is a consequence of our design), but the
cycling of ELB nodes when retirement occurs is something that NGINX would
seem to handle in a more satisfactory way.
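
For illustration, TTL-based resolution in the sense asked about here could
look roughly like the sketch below. It is only a minimal sketch and not how
NGINX or HAProxy actually implement it: `TTLResolver`, the `lookup` callable
and the injected `clock` are hypothetical names made up for this example.
Python's standard-library `socket.getaddrinfo` does not return a TTL, so
`lookup` is assumed to come from a DNS client that does expose it.

```python
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical lookup signature: given a name, return (addresses, ttl_seconds).
Lookup = Callable[[str], Tuple[List[str], float]]


class TTLResolver:
    """Cache DNS answers and re-resolve once the record's TTL has expired."""

    def __init__(self, lookup: Lookup,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lookup = lookup
        self._clock = clock
        # name -> (addresses, time at which the cached answer expires)
        self._cache: Dict[str, Tuple[List[str], float]] = {}

    def resolve(self, name: str) -> List[str]:
        now = self._clock()
        cached = self._cache.get(name)
        if cached is not None and now < cached[1]:
            # Answer is still within its TTL: reuse it.
            return cached[0]
        # No answer yet, or the TTL has expired: ask DNS again so that
        # added or retired upstream IPs are picked up.
        addresses, ttl = self._lookup(name)
        self._cache[name] = (addresses, now + ttl)
        return addresses
```

With a 60 second TTL on the ELB record, a proxy calling `resolve()` before
each connection (or on a timer) would see a changed record set at most about
60 seconds after the change, independently of whether health-checks against
the old IPs still pass.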


"
 I think an explanation of what happens when ELB scales will be helpful as
background at this point. ELB employs what we term "Graceful Scaling". When
a scaling trigger is breached, let's say for the sake of argument this is a
scale-up event, the ELB's controller immediately begins the process of
provisioning new, more performant ELB nodes. This usually takes a few
minutes, and once these new nodes pass the controller health-checks, we
remove the old node IPs from the DNS record set and add the new ELB
node IPs to the DNS set. Since the TTL published with this DNS record is
60 seconds, after about 2 - 3 minutes most traffic will migrate over to
the new nodes. We do not, however, de-provision the old ELB nodes at this
point; instead we begin to monitor them to determine when traffic received by
these nodes drops below a certain threshold level, or a maximum age has
expired (this is several days). This caters for the case where
some clients are caching DNS longer than the TTL value.

Given the way that HA proxy works, when it starts up, it resolves the ELB
name, and obtains the current IPs. HealthChecks are a requirement for the
resolver clause, so HA also begins to perform the configured health-check
on the nodes it learned about at startup.

If the ELB were to scale now, the new nodes would come online but HA proxy
would never learn of them, as the old nodes will continue to pass
health-checks. If traffic continues to increase, at some point the older
ELB nodes will become overwhelmed and will fail a health-check on HA proxy,
at which point the HA proxy node on which the health-check failed will
learn of the new ELB nodes from DNS and start to send traffic to the new
ones.

Should traffic not increase sufficiently to cause the old nodes to fail a
health-check, then only new HA proxy instances in your fleet will learn of
the new nodes. Eventually the maximum graceful node lifetime will be
reached and we will terminate the old nodes; at this point all your HA
proxy instances will fail health-checks on their upstream and learn of the
new nodes at the same time.

This process happens in such a fashion that, over time, it's conceivable
that each of your HA proxy nodes may know of different back-end IPs. As a
result, traffic on the inside ELB nodes will not be symmetrically
distributed by the HA proxy nodes over time. This is somewhat mitigated on
the back-end by the use of cross-zone load-balancing, so the asymmetry is
not propagated to the back-ends. We do monitor each ELB node individually,
thus the ELB will scale on the monitoring of a single node, rather than the
entire ELB, which further mitigates the effects of any asymmetry on your
ELB nodes.

There is no easy way to make HA proxy work perfectly in front of an ELB,
due to the nature of how HA proxy has implemented DNS resolution.

We often recommend that customers using a reverse proxy in front of ELB
use Nginx instead, as it has the ability to follow the DNS TTLs of its
upstreams perfectly. In this case, the way you have implemented it
means that HA proxy will learn of failed ELB nodes, and eventually learn of
scaling, and the ELB mitigates the imbalance to your back-ends through
cross zone. So, in summary, as of now the only possible way to overcome
this behavior would be to consider using a different reverse proxy solution
between the two ELB tiers instead of HA proxy. I apologize for any
inconvenience. I hope the above information was helpful. Please let us know
if you have any other questions or concerns and we will be happy to assist
you.
"

Regards,

-- 
Ben

