On Wed, Mar 31, 2010 at 01:06:05AM -0400, Geoffrey Mina wrote:
> Willy,
> Thanks for the response.  I have attached a pcap file here off the webserver
> we are trying to load balance to.  It is full of errors... but honestly, I
> don't know enough about this level of TCP to know what the problem is.

Those are not real errors. It is just because the outgoing packets' TCP
checksum is computed by the network card, so it is not yet correct in
the network stack where tcpdump gets the packets. You can safely ignore
that.
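
To illustrate, here is a minimal Python sketch of the standard Internet
checksum (the ones-complement sum described in RFC 1071) applied to the IP
header of the packet quoted further down. The header is rebuilt from the
fields tcpdump printed, assuming there are no IP options; the function name
is only illustrative. With the checksum field left at 0, as the stack
hands it to the card, the computed value is the one tcpdump expected.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """Ones-complement sum of 16-bit big-endian words, then inverted."""
    if len(data) % 2:
        data += b"\x00"          # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:           # fold the carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# IPv4 header rebuilt from the tcpdump line (assumption: no IP options).
header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45,                        # version 4, header length 5 words
    0x00,                        # tos 0x0
    1500,                        # total length
    7675,                        # id
    0x4000,                      # flags [DF], offset 0
    128,                         # ttl
    6,                           # proto TCP
    0,                           # checksum still empty in the capture
    bytes([173, 203, 224, 217]), # source
    bytes([173, 203, 208, 131]), # destination
)

print(hex(internet_checksum(header)))  # 0xca2c, as in "bad cksum 0 (->ca2c)"
```

The network card fills in that value on the wire, so the receiver sees a
valid packet even though the capture on the sender flags it as bad.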

> Again, the attached file is a tcpdump of port 80 on the webserver.

Your trace shows that the server announces that it is going to send
1213090 bytes :

HTTP/1.1 206 Partial Content
Content-Length: 1213090
Content-Type: application/x-shockwave-flash
Content-Location: http://173.203.208.131/adminui/IntelliQueueAdmin.swf
Content-Range: bytes 412748-1625837/1625838

Unfortunately, it stops sending anything after 263938 bytes, after
which haproxy's timeout finally strikes.
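
As a quick cross-check of those figures, here is a small Python sketch.
The helper name range_length is made up for illustration; the header
value and the byte counts are the ones from the trace.

```python
import re

def range_length(content_range: str) -> int:
    """Number of bytes covered by a 'bytes first-last/total' value."""
    m = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", content_range.strip())
    if m is None:
        raise ValueError(f"unsupported Content-Range: {content_range!r}")
    first, last = int(m.group(1)), int(m.group(2))
    return last - first + 1      # both ends of the range are inclusive

announced = range_length("bytes 412748-1625837/1625838")
sent = 263938                    # bytes seen before the transfer stalled
missing = announced - sent

print(announced, missing)        # 1213090 949152
```

The range length matches the announced Content-Length, so the headers are
consistent and roughly 949 kB of the body never arrived.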

Looking in more detail, we see that the server sends one segment
which never manages to reach the haproxy server :

07:02:21.547517 IP (tos 0x0, ttl 128, id 7675, offset 0, flags [DF], proto TCP (6), length 1500, bad cksum 0 (->ca2c)!)
    173.203.224.217.80 > 173.203.208.131.45278: Flags [.], cksum 0x0cfb (incorrect -> 0xcb49), seq 4060117381:4060118829, ack 1003968144, win 65038, options [nop,nop,TS val 21159278 ecr 11067241], length 1448

This segment is retransmitted multiple times, and the only ACKs
sent back to the server are for the previous segment. This indicates
that the connection is still alive but that this specific packet
cannot pass through, as it is never received by the other side.
According to your TTL, you seem to have one component between
haproxy and the web server. Is it a firewall ? Maybe it's a bit
buggy (that once was very common, it's less common these days).
It would be nice to attempt the capture on it again to see if
the same packet passes through it or not. Maybe there is some
form of IDS or pattern matching which believes it has found an
attack or invalid content in that packet (pattern matching is
the worst thing to do, it will always do such nasty things).
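
The TTL reasoning can be sketched as follows. It assumes the sender
started from one of the usual initial TTL values (64, 128 or 255), which
is only a heuristic. The observed value of 63 is a made-up example of
what a packet coming from haproxy might show, not a figure quoted here,
and the function name is illustrative.

```python
COMMON_INITIAL_TTLS = (64, 128, 255)   # typical defaults, an assumption

def estimate_hops(observed_ttl: int) -> int:
    """Guess how many devices decremented the TTL before the capture."""
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    # Pick the smallest common initial value the packet could have had.
    initial = min(t for t in COMMON_INITIAL_TTLS if t >= observed_ttl)
    return initial - observed_ttl

print(estimate_hops(63))    # hypothetical incoming packet: 1 hop
print(estimate_hops(128))   # locally generated packet: 0 hops
```

One hop means one routing device in the path; a bridging firewall or
inline IDS would not decrement the TTL and so would not show up this way.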

Also, when the packet was retransmitted, the server also tried
to reduce its size to 576 bytes, but this did not work either.
That means that if something is wrong, it's in the first 576
bytes of the packet.
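
To put numbers on that, here is a rough Python sketch. It assumes a
20-byte IPv4 header without options, a TCP header carrying only the
timestamp option shown in the trace (20 + 12 bytes), and that the 576
bytes refer to the IP total length of the retransmission.

```python
IP_HEADER = 20            # IPv4 header without options (assumption)
TCP_HEADER = 20 + 12      # base header + nop,nop,timestamp option

def tcp_payload(ip_total_length: int) -> int:
    """TCP payload bytes carried by an IP packet of the given length."""
    return ip_total_length - IP_HEADER - TCP_HEADER

full = tcp_payload(1500)                  # the original segment
small = tcp_payload(576)                  # the reduced retransmission
seq_span = 4060118829 - 4060117381        # from the seq range in the trace

print(full, seq_span, small)              # 1448 1448 524
```

Under those assumptions the small retransmission carries only the first
524 bytes of payload starting at seq 4060117381, which narrows down where
an offending pattern would have to be.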

Regards,
Willy

