Re: 1.9 BUG: redispatch broken

2018-12-24 Thread Olivier Houchard
Hi Lukas,

On Sat, Dec 22, 2018 at 11:16:09PM +0100, Lukas Tribus wrote:
> Hello Olivier,
> 
> 
> redispatch is broken since commit 25b401536 ("BUG/MEDIUM: connection:
> Just make sure we closed the fd on connection failure"). It simply
> fails to connect to the next server.
> 
> 1.9 is affected.
> 
> 
> Repro:
> 
> global
> log 10.0.0.4:514 len 65535 local1 debug
> maxconn 1000
> 
> defaults
> log global
> mode http
> option httplog
> timeout connect 1s
> timeout client 30s
> timeout server 30s
> retries 3
> option redispatch 1
> 
> frontend http-in
> bind :80
> default_backend mybak
> 
> backend mybak
> mode http
> balance first
> server primary-fail 10.0.0.199:80 # this server is unreachable
> server backup-ok 10.0.0.254:80
> 
> 
> 
> 
> cheers,

Oops, you're right indeed. The attached patch should fix it.

Thanks a lot for reporting!

Regards,

Olivier
From 2276c53dac820d0079525730e9bd7abfd3ea408c Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Mon, 24 Dec 2018 13:32:13 +0100
Subject: [PATCH] BUG/MEDIUM: servers: Don't try to reuse connection if we
 switched server.

In connect_server(), don't attempt to reuse the old connection if it's
targeting a different server than the one we're supposed to access, or
we will never be able to connect to a server if the first one we tried failed.

This should be backported to 1.9.
---
 src/backend.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend.c b/src/backend.c
index 2407f8a32..bc38c5710 100644
--- a/src/backend.c
+++ b/src/backend.c
@@ -1129,7 +1129,7 @@ int connect_server(struct stream *s)
 	srv_cs = objt_cs(s->si[1].end);
 	if (srv_cs) {
 		old_conn = srv_conn = cs_conn(srv_cs);
-		if (old_conn) {
+		if (old_conn && (!old_conn->target || old_conn->target == s->target)) {
 			old_conn->flags &= ~(CO_FL_ERROR | CO_FL_SOCK_RD_SH | CO_FL_SOCK_WR_SH);
 			srv_cs->flags &= ~(CS_FL_ERROR | CS_FL_EOS | CS_FL_REOS);
 			reuse = 1;
-- 
2.19.2
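
(For readers skimming the archive, the change above boils down to the following
reuse check in connect_server(). This is only a sketch reconstructed from the
hunk itself; the closing brace is implied by the surrounding 1.9 code, and all
field and flag names are exactly those shown in the diff:)

	/* Only reuse the previous connection if it has no target yet, or if it
	 * already points at the server we are about to connect to (s->target).
	 * Otherwise a redispatch after a failed connect would keep trying to
	 * reuse the connection to the dead server. */
	if (old_conn && (!old_conn->target || old_conn->target == s->target)) {
		/* clear error/shutdown flags so the connection can be reused */
		old_conn->flags &= ~(CO_FL_ERROR | CO_FL_SOCK_RD_SH | CO_FL_SOCK_WR_SH);
		srv_cs->flags &= ~(CS_FL_ERROR | CS_FL_EOS | CS_FL_REOS);
		reuse = 1;
	}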



Re: DNS resolution problem since 1.8.14

2018-12-24 Thread Patrick Valsecchi

Hi Jonathan,

I've built the 1.8.16 image myself and the problem is indeed fixed. Any 
plans to include that fix in a 1.9.1 release?
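
(For reference, building such a test image essentially comes down to compiling
the fixed release and dropping the resulting binary into an image. A minimal
sketch, assuming a standard source build of the 1.8 branch; the download URL,
TARGET and USE_* flags below are assumptions, not details taken from this thread:)

    # assumption: build haproxy 1.8.16 from the official source tarball
    curl -fsSLO https://www.haproxy.org/download/1.8/src/haproxy-1.8.16.tar.gz
    tar xzf haproxy-1.8.16.tar.gz && cd haproxy-1.8.16
    # linux2628 is the usual build TARGET for the 1.8 branch on recent kernels
    make TARGET=linux2628 USE_OPENSSL=1 USE_ZLIB=1 USE_PCRE=1
    make install-bin    # installs only the haproxy binary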


Thanks.

On 23.12.18 18:20, Jonathan Matthews wrote:

Hey Patrick,

Have you looked at the fixes in 1.8.16? They sound kinda-sorta related 
to your problem ...


J

On Sun, 23 Dec 2018 at 16:17, Patrick Valsecchi wrote:


I did a tcpdump. My config is modified to point to a local container
(www) in a docker-compose setup (I'm trying to simplify my setup). You
can see the DNS server answering correctly:

16:06:00.181533 IP (tos 0x0, ttl 64, id 63816, offset 0, flags [DF], proto UDP (17), length 68)
    127.0.0.11.53 > localhost.40994: 63037 1/0/0 www. A 172.20.0.17 (40)
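
(The capture command itself isn't shown in the thread; something along these
lines, run inside the container's network namespace, produces this kind of
output. The interface and options are assumptions:)

    # assumption: capture DNS traffic towards Docker's embedded resolver
    tcpdump -v -i any udp port 53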

Could it be related to that?

https://github.com/haproxy/haproxy/commit/8d4e7dc880d2094658fead50dedd9c22c95c556a

On 23.12.18 13:59, Patrick Valsecchi wrote:


Hi,

Since haproxy version 1.8.14, and including the latest 1.9 release,
haproxy puts all my backends in MAINT after around 31s. They work fine
at first, but are then put in MAINT.

The logs look like this:

<149>Dec 23 12:45:11 haproxy[1]: Proxy www started.
<149>Dec 23 12:45:11 haproxy[1]: Proxy plain started.
[NOTICE] 356/124511 (1) : New worker #1 (8) forked
<150>Dec 23 12:45:13 haproxy[8]: 89.217.194.174:49752 [23/Dec/2018:12:45:13.098] plain www/linked 0/0/16/21/37 200 4197 - -  1/1/0/0/0 0/0 "GET / HTTP/1.1"
[WARNING] 356/124542 (8) : Server www/linked is going DOWN for maintenance (DNS timeout status). 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
<145>Dec 23 12:45:42 haproxy[8]: Server www/linked is going DOWN for maintenance (DNS timeout status). 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT] 356/124542 (8) : backend 'www' has no server available!
<144>Dec 23 12:45:42 haproxy[8]: backend www has no server available!

I run haproxy using docker:

docker run --name toto -ti --rm -v /home/docker-compositions/web/proxy/conf.test:/etc/haproxy/:ro -p 8080:80 haproxy:1.9 haproxy -f /etc/haproxy/

And my config is this:

global
    log stderr local2
    chroot  /tmp
    pidfile /run/haproxy.pid
    maxconn 4000
    max-spread-checks 500

    master-worker

    user    nobody
    group   nogroup

resolvers dns
  nameserver docker 127.0.0.11:53 
  hold valid 1s

defaults
    mode    http
    log global
    option  httplog
    option  dontlognull
    option http-server-close
    option forwardfor   except 127.0.0.0/8

    option  redispatch
    retries 3
    timeout http-request    10s
    timeout queue   1m
    timeout connect 10s
    timeout client  10m
    timeout server  10m
    timeout http-keep-alive 10s
    timeout check   10s
    maxconn 3000
    default-server init-addr last,libc,none

    errorfile 400 /usr/local/etc/haproxy/errors/400.http
    errorfile 403 /usr/local/etc/haproxy/errors/403.http
    errorfile 408 /usr/local/etc/haproxy/errors/408.http
    errorfile 500 /usr/local/etc/haproxy/errors/500.http
    errorfile 502 /usr/local/etc/haproxy/errors/502.http
    errorfile 503 /usr/local/etc/haproxy/errors/503.http
    errorfile 504 /usr/local/etc/haproxy/errors/504.http

backend www
    option httpchk GET / HTTP/1.0\r\nUser-Agent:\ healthcheck
    http-check expect status 200
    default-server inter 60s fall 3 rise 1
    server linked www.topin.travel:80 check resolvers dns

frontend plain
    bind :80

    http-request set-header X-Forwarded-Proto   http
    http-request set-header X-Forwarded-Host    %[req.hdr(host)]
    http-request set-header X-Forwarded-Port    %[dst_port]
    http-request set-header X-Forwarded-For %[src]
    http-request set-header X-Real-IP   %[src]

    compression algo gzip
    compression type text/css text/html text/javascript application/javascript text/plain text/xml application/json

    # Forward to the main linked container by default
    default_backend www


Any idea what is happening?
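
(A side note while debugging: the resolvers section above relies on defaults for
everything except "hold valid 1s", which is quite aggressive. Purely as an
illustration of the knobs available in 1.8/1.9, a more tolerant section could
look like the sketch below; the values are examples, not a recommendation taken
from this thread:)

    resolvers dns
        nameserver docker 127.0.0.11:53
        resolve_retries 3      # retries before a resolution attempt is considered failed
        timeout retry   1s     # delay between two retries
        hold valid      10s    # how long a valid answer is kept before re-resolving
        hold timeout    30s    # keep using the last address this long after a DNS timeout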