Hi John,

On Thu, May 15, 2014 at 05:49:59PM +0000, JDzialo John wrote:
> Hi Guys,
>
> We have been using haproxy and have loved it and appreciate all the hard work
> you have put into this great product.
>
> I am new to the project and still trying to grasp its complexities so
> forgive me in advance for any ignorance.
>
> We have been running haproxy in front of a web farm since January. HAProxy
> 1.4.24 on a Debian 7 server performing a round robin balance across 6 web
> servers using IIS 7.5 and hosting a .NET application. All hosted in AWS.
>
> Recently we have started to see a strange issue arise. Every once in a while
> a browser request or a web service call will hang until our 10 minute client
> timeout is hit and the request fails.
>
> Using fiddler for testing once in a while I see a request initiate but never
> make the connection to a back end server. The request hangs and eventually
> fiddler reports a content length mismatch as our header declared a certain
> amount of data but the client only received a fraction of it.

So it sounds like your server is sending a wrong content-length.

> The issue is random but happens pretty consistently throughout the day.
>
> This just started a few weeks ago and there were no changes on our HAProxy
> config made since February.

Which would be consistent with a recent change on the server :-)

> Below find our configuration.
>
> # Global config
> global
> log 127.0.0.1 local0
> log 127.0.0.1 local1 notice
> maxconn 100000
> stats socket /var/run/haproxy.sock mode 0600 level admin
> user haproxy
> group haproxy
> daemon
>
> # Default config
> defaults
> log global
> mode http
> option httplog
> option dontlognull
> option redispatch
> option forwardfor
> option httpclose
> option abortonclose
> retries 1
> timeout connect 5000
> timeout client 50000
> timeout server 50000
>
> listen stats
> #disabled
> bind *:8888
> stats enable
> stats uri /haproxy?stats
> stats realm Strictly\ Private
> stats auth xxxx:xxxxxxxxxx
>
> frontend unsecured *:80
> timeout client 600000
>
> default_backend web
>
> backend web
> timeout server 600000
> balance roundrobin
>
> server web1 xxx.xxx.xxx.xxx:80 check
> server web2 xxx.xxx.xxx.xxx:80 check
> server web3 xxx.xxx.xxx.xxx:80 check
> server web4 xxx.xxx.xxx.xxx:80 check
> server web5 xxx.xxx.xxx.xxx:80 check
> server web6 xxx.xxx.xxx.xxx:80 check
>
> Is there anything in our configuration that could cause this weird behavior?
> Or anything I could add?

No, haproxy will use content-length just like your client to know the
response length, but will not modify it.

> How about kernel settings in sysctl? What are the optimal settings to run a
> haproxy server?

These are totally unrelated. Here you're having a problem with less data
being returned than advertised, so very likely your server is sending
some abnormal contents. Have you tried taking a network capture between
haproxy and the server to verify this?

Note that another explanation could be totally unrelated to content length:
it could simply be a bug causing haproxy to actually stop receiving or
emitting data, then closing the connection after the timeout. That would
also explain why your client receives less data than indicated in the
content-length header. But if so, you should be able to tell whether or
not the contents have been truncated.

However I'm not seeing any known bug looking like this in 1.4.24, so while
possible, this seems very strange since most people are using 1.4 :-/

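To check for truncation outside the browser, a rough sketch like the one below could compare the advertised content-length with the number of body bytes actually received. It assumes you have the raw bytes of a single HTTP/1.x response (for example one TCP stream saved from a capture); since your config uses "option httpclose", each connection should carry only one response. The function name check_content_length is made up for illustration and is not part of haproxy.

```python
def check_content_length(raw):
    """Return (declared, received) for one raw HTTP/1.x response.

    declared is the value of the Content-Length header, or None when the
    header is absent (for instance with a chunked response).
    received is the number of body bytes that follow the headers.
    """
    head, separator, body = raw.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("end of headers not found")

    declared = None
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            declared = int(value.strip())
    return declared, len(body)


if __name__ == "__main__":
    # Hypothetical truncated response: 10 bytes advertised, 4 delivered.
    sample = b"HTTP/1.1 200 OK\r\nContent-Length: 10\r\n\r\nabcd"
    declared, received = check_content_length(sample)
    if declared is not None and received < declared:
        print("truncated: got %d of %d bytes" % (received, declared))
```

If the server-side stream is already short, the problem is upstream of haproxy; if only the client-side stream is short, haproxy itself becomes the suspect.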
> Any help you can give I would really appreciate it?
>
> Please let me know if there is anything else I can provide.

Please try to take a capture of the response from the server to haproxy so
that we can see whether the response size is correct, and/or whether any
special event happens. Please use something like this (assuming all your
traffic flows through interface eth0):

    tcpdump -s0 -npi eth0 -w trace-server.cap src <server-ip>

If you have enough space on your disk, it would be even better to also
capture the haproxy-to-client traffic:

    tcpdump -s0 -npi eth0 -w trace-client.cap dst <client-ip>
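
Once a trace exists, one way to spot the moment data stopped flowing is to look for long silences between consecutive packets. The sketch below is only an illustration under some assumptions: the file is in the classic libpcap format that tcpdump normally writes with -w (not pcapng), and the trace has been narrowed to a single connection, otherwise traffic from other connections will hide the gap. The name find_gaps and the 5 second threshold are arbitrary choices.

```python
import struct
import sys

# Magic number bytes -> (struct byte order, size of one timestamp unit in seconds)
MAGICS = {
    b"\xd4\xc3\xb2\xa1": ("<", 1e-6),  # little-endian, microseconds
    b"\xa1\xb2\xc3\xd4": (">", 1e-6),  # big-endian, microseconds
    b"\x4d\x3c\xb2\xa1": ("<", 1e-9),  # little-endian, nanoseconds
    b"\xa1\xb2\x3c\x4d": (">", 1e-9),  # big-endian, nanoseconds
}


def find_gaps(data, threshold=5.0):
    """Return [(packet_index, gap_seconds), ...] for silences > threshold.

    data is the full content of a classic pcap file; packet_index is the
    0-based index of the packet that arrived after the silence.
    """
    if len(data) < 24 or data[:4] not in MAGICS:
        raise ValueError("not a classic pcap file")
    endian, unit = MAGICS[data[:4]]

    gaps = []
    previous = None
    offset = 24  # skip the 24-byte global header
    index = 0
    while offset + 16 <= len(data):
        # Per-packet header: seconds, sub-seconds, captured length, original length
        sec, frac, caplen, _origlen = struct.unpack_from(endian + "IIII", data, offset)
        timestamp = sec + frac * unit
        if previous is not None and timestamp - previous > threshold:
            gaps.append((index, timestamp - previous))
        previous = timestamp
        offset += 16 + caplen  # jump over the header and the captured bytes
        index += 1
    return gaps


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as capture:
        for index, gap in find_gaps(capture.read()):
            print("packet %d arrived after %.1fs of silence" % (index, gap))
```

It reads the whole file into memory, which is fine for a short trace but not for a capture left running all day.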
Best regards,
Willy