I was also getting a lot 502s from Varnish, I had hoped by ignoring them they
would go away. I don't want to check if they are still there because I think
it might be too soon.
Let me know if you need more tcpdumps, I might be able to supply them (if, of
course, they are still there).
On 8/26/09 7:50 AM, Miguel Pilar Vilagran wrote:
Hello,
On 8/26/09 3:45 AM, "Willy Tarreau" <[email protected]> wrote:
Hello,
On Tue, Aug 25, 2009 at 03:26:57PM -0400, Miguel Pilar Vilagran wrote:
> Hello all,
>
> I have tried to diagnose our HAProxy install to see if we can get
it solved. It seems that the HAProxy will randomly return 502s for
existing items in the end servers.
>
> The configuration is as follows:
> Ubuntu Server 9.04 ( Linux valkyrie 2.6.28-15-server #49-Ubuntu
SMP Tue Aug 18 20:09:37 UTC 2009 x86_64 GNU/Linux )
> HAProxy 1.3.20 (compiled from source with `make TARGET=linux26
USE_PCRE=1` after `apt-get build-dep haproxy` )
>
> HAProxy Config can be found here: http://pastebin.com/f9014ec9
>
> I am using HAProxy to load balance a varnish install on two
servers which share virtual Ips through spread and wackamole.
HAProxy runs on the same server running varnish and then load
balances the web servers.
>
> The http path is:
> (browser) -> [ (HAProxy:public) -> (Varnish) ->
(HAProxy:varnish_front) ] -> (IIS Cluster)
>
> Where everything between square brackets is (optimally) in the
same host. Varnish will only talk with the local HAProxy, but
HAProxy will fail over to the other varnish in the cluster (this
allows for varnish to fail on a single host, which is necessary
because varnish has a fairly long restart time).
>
> All the servers are located in the same VLAN inside the same blade
enclosure and average latency is 0.150ms, .026ms mdev.
>
> A snippet of the log (filtered for 502 responses) can be found
here: http://pastebin.com/f6db9a351
>
> Previously (haproxy 1.3.15.5, the ubuntu default) the 502s ended
with SH- but now they are SL--.
Interesting. I think this means that the server aborts before
sending the whole
response. If that's so, 1.3.20 should not log it SL (L means last
packet of the
data part). Please try to get a tcpdump trace between haproxy and
IIS, since
the behaviour is random, it is the only way to understand better what is
happening. Use tcpdump -s0 to get full packets capture.
Regards,
Willy
The dump has been sent to Willy directly. If anyone else needs it let me
know.
--
Miguel Pilar