I'm seeing an issue with haproxy 1.5.3 against a Solr server: during the health check, haproxy appears to drop the TCP session to the target host very quickly, before the server has even finished sending its response. It doesn't happen on every check, but the failure rate seems to rise faster than proportionally (i.e. not just by a constant multiplier) as the health check frequency goes up.

I've tried different combinations of build parameters with both 1.4.25 and 1.5.3, and I see the same behavior. There is no change with the -de/-dp/-ds options, or with generic builds instead of linux26 or linux2628. The symptom also doesn't change if I craft an HTTP check with tcp-check instead of using httpchk.


I've attached two example captures: one of a valid/correct health check, and one where the TCP session was cut off.

In the failing request, I see the GET request, its ACK, then the HTTP/1.1 200 OK response from Solr, a single ACK, and then an almost instantaneous reset sent by the client.

In a "working" case, I see two packets immediately back to back (one with the headers, the next a continuation with the content) with no ACK in between, followed by ACK, RST+ACK, RST.

Another person on #haproxy says he's seeing the same symptom on his deployment: occasional timeouts and connection resets without warning.


defaults:
    timeout server 120s
    timeout connect 10s
    timeout client 120s
    timeout check 5s
    timeout http-request 120s

backend:
    mode http
    option httpchk GET /solr/admin/ping
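
To compare against a known-good client, the same request can be replayed by hand outside haproxy. The following is only a minimal Python sketch, not what haproxy does internally: the function name probe is made up for illustration, and it assumes the check goes out as a plain HTTP/1.0 GET and that the server closes the connection once it has sent its response.

```python
import socket


def probe(host, port, path="/solr/admin/ping", timeout=5.0):
    """Send one HTTP/1.0 GET and read until the server closes the socket.

    Returns (status_line, body_len, expected_len, complete), where
    expected_len is the Content-Length value or None if absent.
    Hypothetical helper for manual testing only.
    """
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        while True:
            try:
                data = sock.recv(4096)
            except socket.timeout:
                break  # server kept the connection open; stop waiting
            if not data:
                break  # orderly close from the server
            chunks.append(data)

    raw = b"".join(chunks)
    # Split headers from body at the first blank line.
    head, sep, body = raw.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    status_line = lines[0].decode("latin-1")

    expected_len = None
    for line in lines[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            expected_len = int(value.strip())

    # Complete means: headers terminated, and the body is at least as
    # long as Content-Length announced (when the header is present).
    complete = bool(sep) and (expected_len is None or len(body) >= expected_len)
    return status_line, len(body), expected_len, complete
```

Calling probe("solr-host", 8983) in a loop (host name and port are placeholders) and counting the runs where complete is False would show whether a client that waits for the full response ever sees a truncated reply from Solr.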


I'm open to any suggestions on what to try.

-- Nathan

------------------------------------------------------------
Nathan Neulinger                       [email protected]
Neulinger Consulting                   (573) 612-1412

Attachment: solr-working.cap
Description: application/vnd.tcpdump.pcap

Attachment: solr-cutoff2.cap
Description: application/vnd.tcpdump.pcap
