Hi List,

I just upgraded one of our HAProxy installations to 2.2.10 (on Debian, using the 
package from the HAProxy-maintained apt repo). It appears the changes made to 
how SRV records are expired are causing issues, at least with short-lived TTLs 
in the SRV records.

The issue I'm seeing is this: the record resolves, and the servers stay properly 
set (and serving requests) until the SRV TTL expires (which in our case could be 
any value between 0 and 60 seconds). At that point the servers are set to no 
address, but this happens *before* a new record is fetched to reset the TTLs, 
since this timeout is based on the values defined in resolvers. I can play with 
the timeout settings in the resolvers section to improve the situation, but that 
never completely fixes it: the TTL on the SRV record can be quite low because 
I'm not fetching directly from our origin NS.

Code snippet below:

---
resolvers default
  nameserver …
  accepted_payload_size 8192
  resolve_retries 4

  hold valid     30s    <---- Adjusting this really low helps, but adds undue load on DNS, and may still end up being expired by the new "watchdog".
  hold obsolete  60s    <---- Adjusting this higher means it stays up longer, but still fails to load the new record set in time.

  hold timeout    1s
  timeout resolve 5s
  timeout retry   1s
---

If the TTL returned for the SRV records is less than hold valid or hold 
obsolete, the servers will lose their address before it is updated.
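
To make the timing concrete, here is a rough model of the behaviour described 
above, sketched in Python. It is not HAProxy's code. It assumes the address is 
dropped when the SRV TTL runs out, and that the next answer only arrives after 
a fixed refresh interval governed by the resolvers settings. The function name 
and both parameters are made up for illustration.

```python
def address_gap(ttl_s: float, refresh_interval_s: float) -> float:
    """Seconds a server spends without an address under the assumed model.

    Assumed model (from observed behaviour, not from the HAProxy source):
    the address is dropped ttl_s seconds after a response, while the next
    response only arrives refresh_interval_s seconds after the previous one.
    """
    return max(0.0, refresh_interval_s - ttl_s)


# Assumed refresh interval of 30s, against TTLs across the 0-60s range.
for ttl in (60, 30, 10, 0):
    print(f"TTL {ttl:>2}s -> gap {address_gap(ttl, 30):.0f}s")
```

Under these assumptions, any TTL shorter than the refresh interval leaves a 
window with no address, and that window grows as the TTL shrinks.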

I would consider this a serious regression for short-lived SRV records.

Thanks! Happy to provide more details if this isn't easily reproducible.

Luke

—
Luke Seelenbinder
Stadia Maps | Founder
stadiamaps.com
