That makes perfect sense. Thank you very much. -Patrick
------------------------------------------------------------------------
*From:* Willy Tarreau <w...@1wt.eu>
*Sent:* 2014-04-02 15:38:04 E
*To:* Patrick Hemmer <hapr...@stormcloud9.net>
*CC:* haproxy@formilux.org
*Subject:* Re: haproxy intermittently not connecting to backend

> Hi Patrick,
>
> On Tue, Apr 01, 2014 at 03:20:15PM -0400, Patrick Hemmer wrote:
>> We have an issue with haproxy (1.5-dev22-1a34d57) where it is
>> intermittently not connecting to the backend server. However the
>> behavior it is exhibiting seems strange.
>> The reason I say strange is that in one example, it logged that the
>> client disconnected after ~49 seconds with a connection flags of "CC--".
>> However our config has "timeout connect 5000", so it should have timed
>> out connecting to the backend server after 5 seconds. Additionally we
>> have "retries 3" in the config, so upon timing out, it should have tried
>> another backend server, but it never did (the retries counter in the log
>> shows "0").
> No, retries impacts only retries to the same server, it's "option redispatch"
> which allows the last retry to be performed on another server. But you have
> it anyway.
>
>> At the time of this log entry, the backend server is responding
>> properly. For the ~49 seconds prior to the log entry, the backend server
>> has taken other requests. The backend server is also another haproxy
>> (same version).
>>
>> Here's an example of one such log entry:
> [fixed version pasted here]
>
>> 198.228.211.13:60848 api~ platform-push/i-84d931a5 49562/0/-1/-1/49563 0/0/0/0/0 0/0 691/212 503 CC-- 4F8E-4624 + GET /1/sync/notifications/subscribe?sync_box_id=12496&sender=D7A9F93D-F653-4527-A022-383AD55A1943 HTTP/1.1
> OK in fact the client did not wait 49 seconds. If you look closer, you'll
> see that the client remained silent for 49 seconds (typically a connection
> pool or a preconnect) and closed immediately after sending the request (in
> the same millisecond).
> Since you have "option abortonclose", the connection
> was aborted before the server had a chance to respond.
>
> So I can easily imagine that you randomly get this error, you're in a race
> condition, if the server responds immediately, you win the race and the
> request is handled, otherwise it's aborted.
>
> Please start by removing "option abortonclose", I think it will fix the issue.
> Second thing you can do is to remove "option httpclose" or replace it with
> "option http-server-close" which is active and not just passive. The
> connections will last less time on your servers which is always appreciated.
>
> I'm not seeing any other issue, so with just this you should be fine.
>
>> The log format is defined as:
>> %ci:%cp\ %ft\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\ %ac/%fc/%bc/%sc/%rc\ %sq/%bq\ %U/%B\ %ST\ %tsc\ %ID\ +\ %r
>>
>> Running a "show errors" on the stats socket did not return any relevant
>> results.
>>
>> Here's the relevant portions of the haproxy config. It is not the entire
>> thing as the whole config is 1,513 lines long.
>>
>> global
>>     log 127.0.0.1 local0
>>     maxconn 20480
>>     user haproxy
>>     group haproxy
>>     daemon
>>     stats socket /var/run/hapi/haproxy/haproxy.sock level admin
>>
>> defaults
>>     log global
>>     mode http
>>     option httplog
>>     option dontlognull
>>     option log-separate-errors
>>     retries 3
>>     option redispatch
>>     timeout connect 5000
>>     timeout client 60000
>>     timeout server 170000
>>     option clitcpka
>>     option srvtcpka
>>     option abortonclose
>>     option splice-auto
>>     monitor-uri /haproxy/ping
>>     stats enable
>>     stats uri /haproxy/stats
>>     stats refresh 15
>>     stats auth user:pass
>>
>> frontend api
>>     bind *:80
>>     bind *:443 ssl crt /etc/haproxy/server.pem
>>     maxconn 20000
>>     option httpclose
>>     option forwardfor
>>     acl internal src 10.0.0.0/8
>>     acl have_request_id req.fhdr(X-Request-Id) -m found
>>     http-request set-nice -100 if internal
>>     http-request add-header X-API-URL %[path] if !internal
>>     http-request add-header X-Request-Timestamp %Ts.%ms
>>     http-request add-header X-Request-Id %[req.fhdr(X-Request-Id)] if internal have_request_id
>>     http-request set-header X-Request-Id %{+X}o%pid-%rt if !internal || !have_request_id
>>     http-request add-header X-API-Host i-4a3b1c6a
>>     unique-id-format %{+X}o%pid-%rt
>>     log-format %ci:%cp\ %ft\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\ %ac/%fc/%bc/%sc/%rc\ %sq/%bq\ %U/%B\ %ST\ %tsc\ %ID\ +\ %r
>>     default_backend DEFAULT_404
>>
>>     acl rewrite-found req.hdr(X-Rewrite-ID,1) -m found
>>
>>     acl nqXn_path path_reg ^/1/sync/notifications/subscribe/([^\ ?]*)$
>>     acl nqXn_method method OPTIONS GET HEAD POST PUT DELETE TRACE CONNECT PATCH
>>     http-request set-header X-Rewrite-Id nqXn if !rewrite-found nqXn_path nqXn_method
>>     acl rewrite-nqXn req.hdr(X-Rewrite-Id) -m str nqXn
>>     use_backend platform-push if rewrite-nqXn
>>     reqrep ^(OPTIONS|GET|HEAD|POST|PUT|DELETE|TRACE|CONNECT|PATCH)\ /1/sync/notifications/subscribe/([^\ ?]*)([\ ?].*|$) \1\ /1/sync/subscribe/\2\3 if rewrite-nqXn
>>
>>
>> backend platform-push
>>     option httpchk GET /ping
>>     default-server inter 15s fastinter 1s
>>     server i-6eaf724d 10.230.23.64:80 check observe layer4
>>     server i-84d931a5 10.230.42.8:80 check observe layer4
>>
> Regards,
> Willy
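
For anyone finding this thread later, here is a rough sketch of the two changes Willy suggests, applied to the quoted config. Only the affected lines are shown; everything else is assumed to stay as posted, and the comments are annotations rather than part of the original config. The timers in the quoted log line (Tq/Tw/Tc/Tr/Tt = 49562/0/-1/-1/49563) show why: the request took 49562 ms to arrive and the whole session lasted 49563 ms, so the client closed within about a millisecond of sending it, and no connection to the server was ever established (Tc = -1).

```haproxy
defaults
    # "option abortonclose" removed, per Willy's first suggestion.
    # With it set, a client that closes right after sending its request
    # can have the request aborted before the server responds.
    retries 3
    option redispatch
    timeout connect 5000
    timeout client 60000
    timeout server 170000

frontend api
    # Replaces "option httpclose" (passive) with the active variant,
    # per Willy's second suggestion.
    option http-server-close
    option forwardfor
```

Per Willy's reply, removing "option abortonclose" is the change expected to fix the intermittent 503s; switching to "option http-server-close" is a secondary improvement that shortens how long connections stay open on the servers.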