Re: DNS Resolver Issues

2019-03-25 Thread Baptiste
>
> A reload of the HAProxy instance also forces the instances to query all
> records from the resolver.
>
>
Hi Bruno,

Actually, this is true only when you don't use the 'resolvers' section, or
for parameters that don't benefit from the resolvers section, in this case
the 'addr' parameter.
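
To make that concrete with the server line from Daniel's config (an
illustrative sketch of which lookups go where, not an exhaustive description):

  listen regular
    bind 127.0.0.1:9300
    # the address right after the server name is re-resolved at run time via
    # 'resolvers default'; the hostname given to 'addr' is only resolved when
    # the configuration is loaded, outside the resolvers section, so it never
    # follows later DNS changes
    server lb-internal loadbalancer-internal.xxx.yyy:9300 resolvers default check addr loadbalancer-internal.xxx.yyy port 9300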

Baptiste


Re: DNS Resolver Issues

2019-03-25 Thread Baptiste
Hi all,

Thanks @Daniel for your very detailed report and @Piba for your help.
As Piba pointed out, the issue is related to the 'addr' parameter.
Currently, the only component in HAProxy which can benefit from dynamic
resolution at run time is the 'server', which means any other object using
a DNS hostname which does not resolve at start up may trigger an error,
like you discovered with 'addr'.
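
Applied to the 'regular' listener from the original post, that suggests
something like the following (a sketch only, showing the two obvious ways to
keep 'addr' from depending on a name that is never re-resolved):

  listen regular
    bind 127.0.0.1:9300
    option dontlog-normal
    # option A: drop 'addr' so the check targets the server address, which is
    # kept up to date by 'resolvers default'
    server lb-internal loadbalancer-internal.xxx.yyy:9300 resolvers default check port 9300
    # option B (alternative): keep 'addr', but give it a literal IP so it
    # needs no DNS lookup (10.205.100.52 is just the example address from
    # this thread)
    #server lb-internal loadbalancer-internal.xxx.yyy:9300 resolvers default check addr 10.205.100.52 port 9300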

@Piba, feel free to file a feature request on GitHub and Cc me there, so
we can discuss this point.

Baptiste



On Sat, Mar 23, 2019 at 2:53 PM PiBa-NL  wrote:

> Hi Daniel, Baptiste,
>
> @Daniel, can you remove the 'addr loadbalancer-internal.xxx.yyy' from
> the server check? It seems to me that name is not being resolved by the
> 'resolvers' section. Even if it were, it would be somewhat redundant in
> this example, since it is the same as the server name. I'm not sure how
> many of the scenarios below are explained by this, though.
>
> @Baptiste, is it intentional that a wrong 'addr' DNS name makes haproxy
> fail to start despite the supposedly never-failing
> 'default-server init-addr last,libc,none'? Would it be a good feature
> request to support re-resolving a DNS name for the addr setting as well?
>
> Regards,
> PiBa-NL (Pieter)
>
> Op 21-3-2019 om 20:37 schreef Daniel Schneller:
> > Hi!
> >
> > Thanks for the response. I had looked at the "hold" directives, but
> since they all seem to have reasonable defaults, I did not touch them.
> > I specified 10s explicitly, but it did not make a difference.
> >
> > I did some more tests, however, and it seems to have more to do with the
> number of responses for the initial(?) DNS queries.
> > Hopefully these three tables make sense and don't get mangled in the
> mail. The "templated"
> > proxy is defined via "server-template" with 3 "slots". The "regular" one
> just as "server".
> >
> >
> > Test 1: Start out  with both "valid" and "broken" DNS entries. Then
> comment out/add back
> > one at a time as described in (1)-(5).
> > Each time after changing /etc/hosts, restart dnsmasq and check haproxy
> via hatop.
> > Haproxy started fresh once dnsmasq was set up to (1).
> >
> > |  state   state
> >  /etc/hosts |  regular templated
> > |-
> > (1) BRK|  UP/L7OK DOWN/L4TOUT
> >  VALID  |  MAINT/resolution
> > |  UP/L7OK
> > |
> >
> > (2) BRK|  DOWN/L4TOUT DOWN/L4TOUT
> >  #VALID |  MAINT/resolution
> > |  MAINT/resolution
> > |
> > (3) #BRK   |  UP/L7OK UP/L7OK
> >  VALID  |  MAINT/resolution
> > |  MAINT/resolution
> > |
> > (4) BRK|  UP/L7OK UP/L7OK
> >  VALID  |  DOWN/L4TOUT
> > |  MAINT/resolution
> > |
> > (5) BRK|  DOWN/L4TOUT DOWN/L4TOUT
> >  #VALID |  MAINT/resolution
> > |  MAINT/resolution
> >
> > This all looks normal and as expected. As soon as the "VALID" DNS entry
> is present, the
> > UP state follows within a few seconds.
> >
> >
> >
> > Test 2: Start out "valid only" (1) and proceed as described in (2)-(5),
> again restarting
> > dnsmasq each time, and haproxy reloaded after dnsmasq was set up to (1).
> >
> > |  state   state
> >  /etc/hosts |  regular templated
> > |
> > (1) #BRK   |  UP/L7OK MAINT/resolution
> >  VALID  |  MAINT/resolution
> > |  UP/L7OK
> > |
> > (2) BRK|  UP/L7OK DOWN/L4TOUT
> >  VALID  |  MAINT/resolution
> > |  UP/L7OK
> > |
> > (3) #BRK   |  UP/L7OK MAINT/resolution
> >  VALID  |  MAINT/resolution
> > |  UP/L7OK
> > |
> > (4) BRK|  UP/L7OK DOWN/L4TOUT
> >  VALID  |  MAINT/resolution
> > |  UP/L7OK
> > |
> > (5) BRK|  DOWN/L4TOUT 

Re: DNS Resolver Issues

2019-03-24 Thread Daniel Schneller
Hello!

I am currently on vacation for two weeks, but I'll see to it when I get back.
There is no particular reason for the specific check address here, as you 
correctly figured. It is just an artefact of the template used to create the 
configuration. I can remove that, but there might be cases where it matters 
(though I don't think we have any ATM AFAIR). I would not have guessed there 
would be different resolution paths; if this is intentional, a note in the 
documentation would be helpful. I can provide that when I am back and once 
there is clarity on why it works this way.

Thank you very much for your help!

Cheers,
Daniel


> On 23. Mar 2019, at 14:53, PiBa-NL  wrote:
> 
> Hi Daniel, Baptiste,
> 
> @Daniel, can you remove the 'addr loadbalancer-internal.xxx.yyy' from the 
> server check? It seems to me that name is not being resolved by the 
> 'resolvers' section. Even if it were, it would be somewhat redundant in this 
> example, since it is the same as the server name. I'm not sure how many of 
> the scenarios below are explained by this, though.
> 
> @Baptiste, is it intentional that a wrong 'addr' DNS name makes haproxy fail 
> to start despite the supposedly never-failing 'default-server 
> init-addr last,libc,none'? Would it be a good feature request to support 
> re-resolving a DNS name for the addr setting as well?
> 
> Regards,
> PiBa-NL (Pieter)
> 
> Op 21-3-2019 om 20:37 schreef Daniel Schneller:
>> Hi!
>> 
>> Thanks for the response. I had looked at the "hold" directives, but since 
>> they all seem to have reasonable defaults, I did not touch them.
>> I specified 10s explicitly, but it did not make a difference.
>> 
>> I did some more tests, however, and it seems to have more to do with the 
>> number of responses for the initial(?) DNS queries.
>> Hopefully these three tables make sense and don't get mangled in the mail. 
>> The "templated"
>> proxy is defined via "server-template" with 3 "slots". The "regular" one 
>> just as "server".
>> 
>> 
>> Test 1: Start out  with both "valid" and "broken" DNS entries. Then comment 
>> out/add back
>> one at a time as described in (1)-(5).
>> Each time after changing /etc/hosts, restart dnsmasq and check haproxy via 
>> hatop.
>> Haproxy started fresh once dnsmasq was set up to (1).
>> 
>>|  state   state
>> /etc/hosts |  regular templated
>>|-
>> (1) BRK|  UP/L7OK DOWN/L4TOUT
>> VALID  |  MAINT/resolution
>>|  UP/L7OK
>>|
>> 
>> (2) BRK|  DOWN/L4TOUT DOWN/L4TOUT
>> #VALID |  MAINT/resolution
>>|  MAINT/resolution
>>|
>> (3) #BRK   |  UP/L7OK UP/L7OK
>> VALID  |  MAINT/resolution
>>|  MAINT/resolution
>>|
>> (4) BRK|  UP/L7OK UP/L7OK
>> VALID  |  DOWN/L4TOUT
>>|  MAINT/resolution
>>|
>> (5) BRK|  DOWN/L4TOUT DOWN/L4TOUT
>> #VALID |  MAINT/resolution
>>|  MAINT/resolution
>>
>> This all looks normal and as expected. As soon as the "VALID" DNS entry is
>> present, the UP state follows within a few seconds.
>>
>> 
>> Test 2: Start out "valid only" (1) and proceed as described in (2)-(5), 
>> again restarting
>> dnsmasq each time, and haproxy reloaded after dnsmasq was set up to (1).
>> 
>>|  state   state
>> /etc/hosts |  regular templated
>>|
>> (1) #BRK   |  UP/L7OK MAINT/resolution
>> VALID  |  MAINT/resolution
>>|  UP/L7OK
>>|
>> (2) BRK|  UP/L7OK DOWN/L4TOUT
>> VALID  |  MAINT/resolution
>>|  UP/L7OK
>>|
>> (3) #BRK   |  UP/L7OK MAINT/resolution
>> VALID  |  MAINT/resolution
>>|  UP/L7OK
>>|
>> (4) BRK|  UP/L7OK DOWN/L4TOUT
>> VALID  |  MAINT/resolution
>>|  UP/L7OK

Re: DNS Resolver Issues

2019-03-23 Thread PiBa-NL

Hi Daniel, Baptiste,

@Daniel, can you remove the 'addr loadbalancer-internal.xxx.yyy' from 
the server check? It seems to me that name is not being resolved by the 
'resolvers' section. Even if it were, it would be somewhat redundant in 
this example, since it is the same as the server name. I'm not sure how 
many of the scenarios below are explained by this, though.


@Baptiste, is it intentional that a wrong 'addr' DNS name makes haproxy 
fail to start despite the supposedly never-failing 
'default-server init-addr last,libc,none'? Would it be a good feature 
request to support re-resolving a DNS name for the addr setting as well?
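
For reference, the fallback in question is configured like this (a sketch
only; per Baptiste's reply earlier in this thread it governs how the server
address itself is initialised, so a hostname that only appears in 'addr' can
still abort startup when it does not resolve):

  defaults
    # try the address from the state file, then a libc lookup, then start
    # with no address at all instead of failing
    default-server init-addr last,libc,none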


Regards,
PiBa-NL (Pieter)

Op 21-3-2019 om 20:37 schreef Daniel Schneller:

Hi!

Thanks for the response. I had looked at the "hold" directives, but since they 
all seem to have reasonable defaults, I did not touch them.
I specified 10s explicitly, but it did not make a difference.

I did some more tests, however, and it seems to have more to do with the number 
of responses for the initial(?) DNS queries.
Hopefully these three tables make sense and don't get mangled in the mail. The 
"templated"
proxy is defined via "server-template" with 3 "slots". The "regular" one just as 
"server".


Test 1: Start out  with both "valid" and "broken" DNS entries. Then comment 
out/add back
one at a time as described in (1)-(5).
Each time after changing /etc/hosts, restart dnsmasq and check haproxy via 
hatop.
Haproxy started fresh once dnsmasq was set up to (1).

|  state   state
 /etc/hosts |  regular templated
|-
(1) BRK|  UP/L7OK DOWN/L4TOUT
 VALID  |  MAINT/resolution
|  UP/L7OK
|

(2) BRK|  DOWN/L4TOUT DOWN/L4TOUT
 #VALID |  MAINT/resolution
|  MAINT/resolution
|
(3) #BRK   |  UP/L7OK UP/L7OK
 VALID  |  MAINT/resolution
|  MAINT/resolution
|
(4) BRK|  UP/L7OK UP/L7OK
 VALID  |  DOWN/L4TOUT
|  MAINT/resolution
|
(5) BRK|  DOWN/L4TOUT DOWN/L4TOUT
 #VALID |  MAINT/resolution
|  MAINT/resolution
   
This all looks normal and as expected. As soon as the "VALID" DNS entry is present, the

UP state follows within a few seconds.
   



Test 2: Start out "valid only" (1) and proceed as described in (2)-(5), again 
restarting
dnsmasq each time, and haproxy reloaded after dnsmasq was set up to (1).

|  state   state
 /etc/hosts |  regular templated
|
(1) #BRK   |  UP/L7OK MAINT/resolution
 VALID  |  MAINT/resolution
|  UP/L7OK
|
(2) BRK|  UP/L7OK DOWN/L4TOUT
 VALID  |  MAINT/resolution
|  UP/L7OK
|
(3) #BRK   |  UP/L7OK MAINT/resolution
 VALID  |  MAINT/resolution
|  UP/L7OK
|
(4) BRK|  UP/L7OK DOWN/L4TOUT
 VALID  |  MAINT/resolution
|  UP/L7OK
|
(5) BRK|  DOWN/L4TOUT DOWN/L4TOUT
 #VALID |  MAINT/resolution
|  MAINT/resolution
   
Everything good here, too. Adding the broken DNS entry does not bring the proxies down

until only the broken one is left.



Test 3: Start out "broken only" (1).
Again, same as before, haproxy restarted once dnsmasq was initialized to (1).

|  state   state
 /etc/hosts |  regular templated
|
(1) BRK|  DOWN/L4TOUT DOWN/L4TOUT
 #VALID |  MAINT/resolution
|  MAINT/resolution

Re: DNS Resolver Issues

2019-03-21 Thread Daniel Schneller
Hi!

Thanks for the response. I had looked at the "hold" directives, but since they 
all seem to have reasonable defaults, I did not touch them.
I specified 10s explicitly, but it did not make a difference.

I did some more tests, however, and it seems to have more to do with the number 
of responses for the initial(?) DNS queries.
Hopefully these three tables make sense and don't get mangled in the mail. The 
"templated"
proxy is defined via "server-template" with 3 "slots". The "regular" one just 
as "server".


Test 1: Start out  with both "valid" and "broken" DNS entries. Then comment 
out/add back
one at a time as described in (1)-(5). 
Each time after changing /etc/hosts, restart dnsmasq and check haproxy via 
hatop.
Haproxy started fresh once dnsmasq was set up to (1).

   |  state   state
/etc/hosts |  regular templated
   |-
(1) BRK|  UP/L7OK DOWN/L4TOUT
VALID  |  MAINT/resolution
   |  UP/L7OK
   |

(2) BRK|  DOWN/L4TOUT DOWN/L4TOUT
#VALID |  MAINT/resolution
   |  MAINT/resolution
   |  
(3) #BRK   |  UP/L7OK UP/L7OK
VALID  |  MAINT/resolution
   |  MAINT/resolution
   |
(4) BRK|  UP/L7OK UP/L7OK
VALID  |  DOWN/L4TOUT
   |  MAINT/resolution
   |
(5) BRK|  DOWN/L4TOUT DOWN/L4TOUT
#VALID |  MAINT/resolution
   |  MAINT/resolution  
  
  
This all looks normal and as expected. As soon as the "VALID" DNS entry is 
present, the
UP state follows within a few seconds.
  


Test 2: Start out "valid only" (1) and proceed as described in (2)-(5), again 
restarting
dnsmasq each time, and haproxy reloaded after dnsmasq was set up to (1).

   |  state   state
/etc/hosts |  regular templated
   |
(1) #BRK   |  UP/L7OK MAINT/resolution
VALID  |  MAINT/resolution
   |  UP/L7OK
   |
(2) BRK|  UP/L7OK DOWN/L4TOUT
VALID  |  MAINT/resolution
   |  UP/L7OK
   |
(3) #BRK   |  UP/L7OK MAINT/resolution
VALID  |  MAINT/resolution
   |  UP/L7OK
   |
(4) BRK|  UP/L7OK DOWN/L4TOUT
VALID  |  MAINT/resolution
   |  UP/L7OK
   |
(5) BRK|  DOWN/L4TOUT DOWN/L4TOUT
#VALID |  MAINT/resolution
   |  MAINT/resolution  
  
  
Everything good here, too. Adding the broken DNS entry does not bring the 
proxies down
until only the broken one is left.



Test 3: Start out "broken only" (1).
Again, same as before, haproxy restarted once dnsmasq was initialized to (1).

   |  state   state
/etc/hosts |  regular templated
   |
(1) BRK|  DOWN/L4TOUT DOWN/L4TOUT
#VALID |  MAINT/resolution
   |  MAINT/resolution
   |  
(2) BRK|  DOWN/L4TOUT UP/L7OK
VALID  |  MAINT/resolution
   |  MAINT/resolution
   |  
(3) #BRK   |  UP/L7OK MAINT/resolution
VALID  |  UP/L7OK
   |  MAINT/resolution
   |  
(4) BRK|  UP/L7OK DOWN/L4TOUT
VALID  |  UP/L7OK
   |  MAINT/resolution
   

Re: DNS Resolver Issues

2019-03-21 Thread Bruno Henc
Hello Daniel,


You might be missing the 'hold valid' directive in your resolvers section: 
https://www.haproxy.com/documentation/hapee/1-9r1/onepage/#5.3.2-timeout

This should force HAProxy to fetch the DNS record values from the resolver.
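
Something along these lines, based on the resolvers section you posted (a
sketch; the 10s period is only an example value):

  resolvers default
    nameserver local 127.0.0.1:53
    # period during which the last valid resolution is kept before it is
    # refreshed from the nameserver
    hold valid 10s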

A reload of the HAProxy instance also forces the instances to query all records 
from the resolver.

Can you please retest with the updated configuration and report back the 
results?


Best regards,

Bruno Henc

‐‐‐ Original Message ‐‐‐
On Thursday, March 21, 2019 12:09 PM, Daniel Schneller 
 wrote:

> Hello!
>
> Friendly bump :)
> I'd be willing to amend the documentation once I understand what's going on :D
>
> Cheers,
> Daniel
>
> > On 18. Mar 2019, at 20:28, Daniel Schneller 
> > daniel.schnel...@centerdevice.com wrote:
> > Hi everyone!
> > I assume I am misunderstanding something, but I cannot figure out what it 
> > is.
> > We are using haproxy in AWS, in this case as sidecars to applications so 
> > they need not
> > know about changing backend addresses at all, but can always talk to 
> > localhost.
> > Haproxy listens on localhost and then forwards traffic to an ELB instance.
> > This works great, but there have been two occasions now, where due to a 
> > change in the
> > ELB's IP addresses, our services went down, because the backends could not 
> > be reached
> > anymore. I don't understand why haproxy sticks to the old IP address 
> > instead of going
> > to one of the updated ones.
> > There is a resolvers section which points to the local dnsmasq instance 
> > (there to send
> > some requests to consul, but that's not used here). All other traffic is 
> > forwarded on
> > to the AWS DNS server set via DHCP.
> > I managed to get timely updates and updated backend servers when using
> > server-template, but from what I understand this should not really be
> > necessary.
> > This is the trimmed down sidecar config. I have not made any changes to dns 
> > timeouts etc.
> > resolvers default
> >   # dnsmasq
> >   nameserver local 127.0.0.1:53
> >
> > listen regular
> >   bind 127.0.0.1:9300
> >   option dontlog-normal
> >   server lb-internal loadbalancer-internal.xxx.yyy:9300 resolvers default check addr loadbalancer-internal.xxx.yyy port 9300
> >
> > listen templated
> >   bind 127.0.0.1:9200
> >   option dontlog-normal
> >   option httpchk /haproxy-simple-healthcheck
> >   server-template lb-internal 2 loadbalancer-internal.xxx.yyy:9200 resolvers default check port 9299
> > To simulate changing ELB addresses, I added entries for
> > loadbalancer-internal.xxx.yyy to /etc/hosts so I could control them via
> > dnsmasq.
> > I tried different scenarios, but could not reliably predict what would 
> > happen in all cases.
> > The address ending in 52 (marked as "valid" below) is a currently (as of 
> > the time of testing)
> > valid IP for the ELB. The one ending in 199 (marked "invalid") is an unused 
> > private IP address
> > in my VPC.
> > Starting with /etc/hosts:
> > 10.205.100.52 loadbalancer-internal.xxx.yyy # valid
> > 10.205.100.199 loadbalancer-internal.xxx.yyy # invalid
> > haproxy starts and reports:
> > regular: lb-internal UP/L7OK
> > templated: lb-internal1 DOWN/L4TOUT
> > lb-internal2 UP/L7OK
> > That's expected. Now when I edit /etc/hosts to only contain the invalid 
> > address
> > and restart dnsmasq, I would expect both proxies to go fully down. But only 
> > the templated
> > proxy behaves like that:
> > regular: lb-internal UP/L7OK
> > templated: lb-internal1 DOWN/L4TOUT
> > lb-internal2 MAINT (resolution)
> > Reloading haproxy in this state leads to:
> > regular: lb-internal DOWN/L4TOUT
> > templated: lb-internal1 MAINT (resolution)
> > lb-internal2 DOWN/L4TOUT
> > After fixing /etc/hosts to include the valid server again and restarting 
> > dnsmasq:
> > regular: lb-internal DOWN/L4TOUT
> > templated: lb-internal1 UP/L7OK
> > lb-internal2 DOWN/L4TOUT
> > Shouldn't the regular proxy also recognize the change and bring the backend 
> > up or down
> > depending on the DNS change? I have waited for several health check rounds 
> > (seeing
> > "* L4TOUT" and "L4TOUT") toggle, but it still never updates.
> > I also tried to have only the invalid address in /etc/hosts, then 
> > restarting haproxy.
> > The regular backends will never recognize it when I add the valid one back 
> > in.
> > The templated one does, unless I set it up to have only 1 instead of 2 
> > server slots.
> > In that case it will also only pick up the valid server when reloaded.
> > On the other hand, it will recognize on the next health check when I remove
> > the valid server without a reload, but it will not bring it back in and mark
> > the proxy UP when it comes back.
> > I assume my understanding of something here is broken, and I would gladly 
> > be told
> > about it :)
> > Thanks a lot!
> > Daniel
> >
> > Version Info:
> >
> > --
> >
> > $ haproxy -vv
> > HA-Proxy version 1.8.19-1ppa1~trusty 2019/02/12
> 

Re: DNS Resolver Issues

2019-03-21 Thread Daniel Schneller
Hello!

Friendly bump :)
I'd be willing to amend the documentation once I understand what's going on :D

Cheers,
Daniel


> On 18. Mar 2019, at 20:28, Daniel Schneller 
>  wrote:
> 
> Hi everyone!
> 
> I assume I am misunderstanding something, but I cannot figure out what it is.
> We are using haproxy in AWS, in this case as sidecars to applications so they 
> need not
> know about changing backend addresses at all, but can always talk to 
> localhost.
> 
> Haproxy listens on localhost and then forwards traffic to an ELB instance. 
> This works great, but there have been two occasions now, where due to a 
> change in the
> ELB's IP addresses, our services went down, because the backends could not be 
> reached
> anymore. I don't understand why haproxy sticks to the old IP address instead 
> of going
> to one of the updated ones.
> 
> There is a resolvers section which points to the local dnsmasq instance 
> (there to send
> some requests to consul, but that's not used here). All other traffic is 
> forwarded on
> to the AWS DNS server set via DHCP.
> 
> I managed to get timely updates and updated backend servers when using
> server-template, but from what I understand this should not really be
> necessary.
> 
> This is the trimmed down sidecar config. I have not made any changes to dns 
> timeouts etc.
> 
> resolvers default
>  # dnsmasq
>  nameserver local 127.0.0.1:53
> 
> listen regular
>  bind 127.0.0.1:9300
>  option dontlog-normal
>  server lb-internal loadbalancer-internal.xxx.yyy:9300 resolvers default 
> check addr loadbalancer-internal.xxx.yyy port 9300
> 
> listen templated
>  bind 127.0.0.1:9200
>  option dontlog-normal
>  option httpchk /haproxy-simple-healthcheck
>  server-template lb-internal 2 loadbalancer-internal.xxx.yyy:9200 resolvers 
> default check  port 9299
> 
> 
> To simulate changing ELB addresses, I added entries for
> loadbalancer-internal.xxx.yyy to /etc/hosts so I could control them via
> dnsmasq.
> 
> I tried different scenarios, but could not reliably predict what would happen 
> in all cases.
> 
> The address ending in 52 (marked as "valid" below) is a currently (as of the 
> time of testing) 
> valid IP for the ELB. The one ending in 199 (marked "invalid") is an unused 
> private IP address
> in my VPC.
> 
> 
> Starting with /etc/hosts:
> 
> 10.205.100.52  loadbalancer-internal.xxx.yyy# valid
> 10.205.100.199 loadbalancer-internal.xxx.yyy# invalid
> 
> haproxy starts and reports:
> 
> regular:   lb-internal UP/L7OK
> templated: lb-internal1  DOWN/L4TOUT
>   lb-internal2UP/L7OK
> 
> That's expected. Now when I edit /etc/hosts to _only_ contain the _invalid_ 
> address
> and restart dnsmasq, I would expect both proxies to go fully down. But only 
> the templated
> proxy behaves like that:
> 
> regular:   lb-internal UP/L7OK
> templated: lb-internal1  DOWN/L4TOUT
>   lb-internal2  MAINT (resolution)
> 
> Reloading haproxy in this state leads to:
> 
> regular:   lb-internal   DOWN/L4TOUT
> templated: lb-internal1  MAINT (resolution)
>   lb-internal2  DOWN/L4TOUT
> 
> After fixing /etc/hosts to include the valid server again and restarting 
> dnsmasq:
> 
> regular:   lb-internal   DOWN/L4TOUT
> templated: lb-internal1UP/L7OK
>   lb-internal2  DOWN/L4TOUT
> 
> 
> Shouldn't the regular proxy also recognize the change and bring the backend 
> up or down
> depending on the DNS change? I have waited for several health check rounds 
> (seeing 
> "* L4TOUT" and "L4TOUT") toggle, but it still never updates.
> 
> I also tried to have _only_ the invalid address in /etc/hosts, then 
> restarting haproxy.
> The regular backends will never recognize it when I add the valid one back in.
> 
> The templated one does, _unless_ I set it up to have only 1 instead of 2 
> server slots.
> In that case it will also only pick up the valid server when reloaded.
> 
> On the other hand, it _will_ recognize on the next health check when I remove
> the valid server without a reload, but it will _not_ bring it back in and mark
> the proxy UP when it comes back.
> 
> 
> I assume my understanding of something here is broken, and I would gladly be 
> told
> about it :)
> 
> 
> Thanks a lot!
> Daniel
> 
> 
> Version Info:
> --
> $ haproxy -vv
> HA-Proxy version 1.8.19-1ppa1~trusty 2019/02/12
> Copyright 2000-2019 Willy Tarreau 
> 
> Build options :
>  TARGET  = linux2628
>  CPU = generic
>  CC  = gcc
>  CFLAGS  = -O2 -g -O2 -fPIE -fstack-protector --param=ssp-buffer-size=4 
> -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -fno-strict-aliasing 
> -Wdeclaration-after-statement -fwrapv -Wno-unused-label
>  OPTIONS = USE_GETADDRINFO=1 USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 
> USE_PCRE=1 USE_PCRE_JIT=1 USE_NS=1
> 
> Default settings :
>  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
> 
> Built with OpenSSL version : OpenSSL 1.0.1f 6 Jan 2014
> Running on OpenSSL 

DNS Resolver Issues

2019-03-18 Thread Daniel Schneller
Hi everyone!

I assume I am misunderstanding something, but I cannot figure out what it is.
We are using haproxy in AWS, in this case as sidecars to applications so they 
need not
know about changing backend addresses at all, but can always talk to localhost.

Haproxy listens on localhost and then forwards traffic to an ELB instance. 
This works great, but there have been two occasions now, where due to a change 
in the
ELB's IP addresses, our services went down, because the backends could not be 
reached
anymore. I don't understand why haproxy sticks to the old IP address instead of 
going
to one of the updated ones.

There is a resolvers section which points to the local dnsmasq instance (there 
to send
some requests to consul, but that's not used here). All other traffic is 
forwarded on
to the AWS DNS server set via DHCP.

I managed to get timely updates and updated backend servers when using
server-template, but from what I understand this should not really be necessary.

This is the trimmed down sidecar config. I have not made any changes to dns 
timeouts etc.

resolvers default
  # dnsmasq
  nameserver local 127.0.0.1:53
  
listen regular
  bind 127.0.0.1:9300
  option dontlog-normal
  server lb-internal loadbalancer-internal.xxx.yyy:9300 resolvers default check 
addr loadbalancer-internal.xxx.yyy port 9300

listen templated
  bind 127.0.0.1:9200
  option dontlog-normal
  option httpchk /haproxy-simple-healthcheck
  server-template lb-internal 2 loadbalancer-internal.xxx.yyy:9200 resolvers 
default check  port 9299


To simulate changing ELB addresses, I added entries for
loadbalancer-internal.xxx.yyy to /etc/hosts so I could control them via dnsmasq.

I tried different scenarios, but could not reliably predict what would happen 
in all cases.

The address ending in 52 (marked as "valid" below) is a currently (as of the 
time of testing) 
valid IP for the ELB. The one ending in 199 (marked "invalid") is an unused 
private IP address
in my VPC.


Starting with /etc/hosts:

10.205.100.52  loadbalancer-internal.xxx.yyy  # valid
10.205.100.199 loadbalancer-internal.xxx.yyy  # invalid

haproxy starts and reports:

regular:   lb-internal   UP/L7OK
templated: lb-internal1  DOWN/L4TOUT
           lb-internal2  UP/L7OK

That's expected. Now when I edit /etc/hosts to _only_ contain the _invalid_ 
address
and restart dnsmasq, I would expect both proxies to go fully down. But only the 
templated
proxy behaves like that:

regular:   lb-internal   UP/L7OK
templated: lb-internal1  DOWN/L4TOUT
           lb-internal2  MAINT (resolution)

Reloading haproxy in this state leads to:

regular:   lb-internal   DOWN/L4TOUT
templated: lb-internal1  MAINT (resolution)
           lb-internal2  DOWN/L4TOUT

After fixing /etc/hosts to include the valid server again and restarting 
dnsmasq:

regular:   lb-internal   DOWN/L4TOUT
templated: lb-internal1  UP/L7OK
           lb-internal2  DOWN/L4TOUT


Shouldn't the regular proxy also recognize the change and bring the backend up 
or down
depending on the DNS change? I have waited for several health check rounds 
(seeing 
"* L4TOUT" and "L4TOUT") toggle, but it still never updates.

I also tried to have _only_ the invalid address in /etc/hosts, then restarting 
haproxy.
The regular backends will never recognize it when I add the valid one back in.

The templated one does, _unless_ I set it up to have only 1 instead of 2 server 
slots.
In that case it will also only pick up the valid server when reloaded.

On the other hand, it _will_ recognize on the next health check when I remove
the valid server without a reload, but it will _not_ bring it back in and mark
the proxy UP when it comes back.


I assume my understanding of something here is broken, and I would gladly be 
told
about it :)


Thanks a lot!
Daniel


Version Info:
--
$ haproxy -vv
HA-Proxy version 1.8.19-1ppa1~trusty 2019/02/12
Copyright 2000-2019 Willy Tarreau 

Build options :
  TARGET  = linux2628
  CPU = generic
  CC  = gcc
  CFLAGS  = -O2 -g -O2 -fPIE -fstack-protector --param=ssp-buffer-size=4 
-Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -fno-strict-aliasing 
-Wdeclaration-after-statement -fwrapv -Wno-unused-label
  OPTIONS = USE_GETADDRINFO=1 USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 
USE_PCRE=1 USE_PCRE_JIT=1 USE_NS=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.0.1f 6 Jan 2014
Running on OpenSSL version : OpenSSL 1.0.1f 6 Jan 2014
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2
Built with Lua version : Lua 5.3.1
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT 
IP_FREEBIND
Encrypted password support via crypt(3): yes
Built with multi-threading support.
Built with PCRE version : 8.31 2012-07-06
Running on PCRE