That makes perfect sense. Thank you very much.

-Patrick

------------------------------------------------------------------------
*From: *Willy Tarreau <w...@1wt.eu>
*Sent: * 2014-04-02 15:38:04 E
*To: *Patrick Hemmer <hapr...@stormcloud9.net>
*CC: *haproxy@formilux.org
*Subject: *Re: haproxy intermittently not connecting to backend

> Hi Patrick,
>
> On Tue, Apr 01, 2014 at 03:20:15PM -0400, Patrick Hemmer wrote:
>> We have an issue with haproxy (1.5-dev22-1a34d57) where it is
>> intermittently not connecting to the backend server. However the
>> behavior it is exhibiting seems strange.
>> The reason I say strange is that in one example, it logged that the
>> client disconnected after ~49 seconds with a connection flags of "CC--".
>> However our config has "timeout connect 5000", so it should have timed
>> out connecting to the backend server after 5 seconds. Additionally we
>> have "retries 3" in the config, so upon timing out, it should have tried
>> another backend server, but it never did (the retries counter in the log
>> shows "0").
> No, retries impacts only retries to the same server, it's "option redispatch"
> which allows the last retry to be performed on another server. But you have
> it anyway.
>
>> At the time of this log entry, the backend server is responding
>> properly. For the ~49 seconds prior to the log entry, the backend server
>> has taken other requests. The backend server is also another haproxy
>> (same version).
>>
>> Here's an example of one such log entry:
> [fixed version pasted here]
>
>> 198.228.211.13:60848 api~ platform-push/i-84d931a5 49562/0/-1/-1/49563 
>> 0/0/0/0/0 0/0 691/212 503 CC-- 4F8E-4624 + GET 
>> /1/sync/notifications/subscribe?sync_box_id=12496&sender=D7A9F93D-F653-4527-A022-383AD55A1943
>>  HTTP/1.1
> OK in fact the client did not wait 49 seconds. If you look closer, you'll
> see that the client remained silent for 49 seconds (typically a connection
> pool or a preconnect) and closed immediately after sending the request (in
> the same millisecond). Since you have "option abortonclose", the connection
> was aborted before the server had a chance to respond.
>
> So I can easily imagine that you randomly get this error, you're in a race
> condition, if the server responds immediately, you win the race and the
> request is handled, otherwise it's aborted.
>
> Please start by removing "option abortonclose", I think it will fix the issue.
> Second thing you can do is to remove "option httpclose" or replace it with
> "option http-server-close" which is active and not just passive. The 
> connections
> will last less time on your servers which is always appreciated.
>
> I'm not seeing any other issue, so with just this you should be fine.
>
>> The log format is defined as:
>> %ci:%cp\ %ft\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\ %ac/%fc/%bc/%sc/%rc\ %sq/%bq\
>> %U/%B\ %ST\ %tsc\ %ID\ +\ %r
>>
>> Running a "show errors" on the stats socket did not return any relevant
>> results.
>>
>> Here's the relevant portions of the haproxy config. It is not the entire
>> thing as the whole config is 1,513 lines long.
>>
>> global
>>   log 127.0.0.1 local0
>>   maxconn 20480
>>   user haproxy
>>   group haproxy
>>   daemon
>>   stats socket /var/run/hapi/haproxy/haproxy.sock level admin
>>
>> defaults
>>   log global
>>   mode http
>>   option httplog
>>   option dontlognull
>>   option log-separate-errors
>>   retries 3
>>   option redispatch
>>   timeout connect 5000
>>   timeout client 60000
>>   timeout server 170000
>>   option clitcpka
>>   option srvtcpka
>>   option abortonclose
>>   option splice-auto
>>   monitor-uri /haproxy/ping
>>   stats enable
>>   stats uri /haproxy/stats
>>   stats refresh 15
>>   stats auth user:pass
>>
>> frontend api
>>   bind *:80
>>   bind *:443 ssl crt /etc/haproxy/server.pem
>>   maxconn 20000
>>   option httpclose
>>   option forwardfor
>>   acl internal src 10.0.0.0/8
>>   acl have_request_id req.fhdr(X-Request-Id) -m found
>>   http-request set-nice -100 if internal
>>   http-request add-header X-API-URL %[path] if !internal
>>   http-request add-header X-Request-Timestamp %Ts.%ms
>>   http-request add-header X-Request-Id %[req.fhdr(X-Request-Id)] if
>> internal have_request_id
>>   http-request set-header X-Request-Id %{+X}o%pid-%rt if !internal ||
>> !have_request_id
>>   http-request add-header X-API-Host i-4a3b1c6a
>>   unique-id-format %{+X}o%pid-%rt
>>   log-format %ci:%cp\ %ft\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\
>> %ac/%fc/%bc/%sc/%rc\ %sq/%bq\ %U/%B\ %ST\ %tsc\ %ID\ +\ %r
>>   default_backend DEFAULT_404
>>
>>   acl rewrite-found req.hdr(X-Rewrite-ID,1) -m found
>>
>>   acl nqXn_path path_reg ^/1/sync/notifications/subscribe/([^\ ?]*)$
>>   acl nqXn_method method OPTIONS GET HEAD POST PUT DELETE TRACE CONNECT
>> PATCH
>>   http-request set-header X-Rewrite-Id nqXn if !rewrite-found nqXn_path
>> nqXn_method
>>   acl rewrite-nqXn req.hdr(X-Rewrite-Id) -m str nqXn
>>   use_backend platform-push if rewrite-nqXn
>>   reqrep ^(OPTIONS|GET|HEAD|POST|PUT|DELETE|TRACE|CONNECT|PATCH)\
>> /1/sync/notifications/subscribe/([^\ ?]*)([\ ?].*|$) \1\
>> /1/sync/subscribe/\2\3 if rewrite-nqXn
>>
>>
>> backend platform-push
>>   option httpchk GET /ping
>>   default-server inter 15s fastinter 1s
>>   server i-6eaf724d 10.230.23.64:80 check observe layer4
>>   server i-84d931a5 10.230.42.8:80 check observe layer4
>>
> Regards,
> Willy
>
>

Reply via email to