We have a setup in which ECS services registered with AWS Service Discovery are load balanced by HAProxy, which gives us sticky sessions for WebSocket handshakes. To make it more efficient, we're upgrading to 1.8.9 and taking advantage of seamless reloads and DNS service discovery. We almost have this working; however, we're seeing an issue during scaling once the DNS response crosses a certain size.
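For reference, the general shape of the relevant parts is sketched below. This is only a sketch: the names, addresses, port, and server count are placeholders for illustration, not the values in the anonymized gist linked in the next paragraph.

    # Minimal sketch of the setup -- all names/addresses/counts are placeholders
    defaults
        mode http
        timeout connect 5s
        timeout client  50s
        timeout server  50s
        # long tunnel timeout for established WebSocket connections
        timeout tunnel  1h

    resolvers awsdns
        # VPC-provided resolver (address is a placeholder)
        nameserver vpc 10.0.0.2:53
        resolve_retries       3
        timeout retry         1s
        hold valid            10s
        # allow EDNS responses larger than the 512-byte UDP default
        accepted_payload_size 8192

    frontend fe_ws
        bind *:80
        default_backend be_ws

    backend be_ws
        balance roundrobin
        # (cookie-based stickiness for the WebSocket handshake omitted here)
        # populate up to 10 server slots from the Service Discovery DNS name
        server-template ws 10 myservice.namespace.local:8080 check resolvers awsdns resolve-prefer ipv4 init-addr none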
We're using the following config (anonymized): https://gist.github.com/jredville/523de951d5ab6b60a0d345516bcf46d4

What we're seeing is:

* If we bring up 3 target servers, they come up as healthy and traffic is routed appropriately. If we restart haproxy, it comes back up healthy.
* If we then scale to 4 or more servers, the 4th and any additional servers are never recognized; the first 3 stay healthy.
* If we restart haproxy with 4 or more servers running, no servers come up healthy.

We've tried modifying the init-addr setting, the accepted_payload_size, and the check options, and we've tried with and without server-template; this is the behavior we see consistently. If we run strace against haproxy, we see it making the DNS requests but never updating the state of the servers.

At this point we're not sure whether we have something wrong in the config or whether there is a bug in how haproxy parses responses from AWS. Johnathan (cc'd) has pcaps if those would be helpful as well.

Thanks,
Jim