Re: option httpchk is reporting servers as down when they're not

2009-03-06 Thread Willy Tarreau
Hi Thomas,

On Thu, Mar 05, 2009 at 08:45:20AM -0500, Allen, Thomas wrote:
 Hi Jeff,
 
 The thing is that if I don't include the health check, the load balancer 
 works fine and each server receives equal distribution. I have no idea why 
 the servers would be reported as down but still work when unchecked.

It is possible that your servers expect the Host: header to
be set during the checks. There's a trick to do it right now
(don't forget to escape spaces) :

option httpchk GET /index.php HTTP/1.0\r\nHost:\ www.mydomain.com
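
For example, in a minimal backend this would look like the following (the
backend name, addresses and check timings below are only placeholders, not
taken from your setup) :

backend app
   option httpchk GET /index.php HTTP/1.0\r\nHost:\ www.mydomain.com
   server web1 192.168.0.11:80 check inter 2000 rise 2 fall 3
   server web2 192.168.0.12:80 check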

Also, you should check the server's logs to see why it is reporting
the service as down. And as a last resort, a tcpdump of the traffic
between haproxy and a failed server will show you both the request
and the complete error from the server.
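
For example, something like this will show both the check request and the
complete response (the interface and server address are placeholders) :

  tcpdump -i eth0 -nn -s 0 -A host 192.168.0.11 and port 80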

Regards,
Willy




RE: option httpchk is reporting servers as down when they're not

2009-03-06 Thread Allen, Thomas
Thanks, once I figure out logging I'll let you guys know what I discover
:^) 

Thomas Allen
Web Developer, ASCE
703.295.6355

-Original Message-
From: Willy Tarreau [mailto:w...@1wt.eu] 
Sent: Friday, March 06, 2009 1:39 PM
To: Allen, Thomas
Cc: Jeffrey 'jf' Lim; haproxy@formilux.org
Subject: Re: option httpchk is reporting servers as down when they're
not

Hi Thomas,

On Thu, Mar 05, 2009 at 08:45:20AM -0500, Allen, Thomas wrote:
 Hi Jeff,
 
 The thing is that if I don't include the health check, the load
balancer works fine and each server receives equal distribution. I have
no idea why the servers would be reported as down but still work when
unchecked.

It is possible that your servers expect the Host: header to
be set during the checks. There's a trick to do it right now
(don't forget to escape spaces) :

 option httpchk GET /index.php HTTP/1.0\r\nHost:\ www.mydomain.com

Also, you should check the server's logs to see why it is reporting
the service as down. And as a last resort, a tcpdump of the traffic
between haproxy and a failed server will show you both the request
and the complete error from the server.

Regards,
Willy




Re: load balancer and HA

2009-03-06 Thread Willy Tarreau
On Wed, Mar 04, 2009 at 12:12:21AM +0100, Alexander Staubo wrote:
 On Tue, Mar 3, 2009 at 11:44 PM, Martin Karbon martin.kar...@asbz.it wrote:
  just wanted to know if anyone knows an open-source solution for a so-called
  transparent failover: what I mean by that is, I installed two machines
  with haproxy on them which communicate with each other via heartbeat. If one
  fails, the other one goes from passive to active, but all sessions are lost
  and users have to reconnect.
 
 We use Heartbeat (http://www.keepalived.org/) for this. Heartbeat lets
 us set up virtual service IPs which are reassigned to another box if
 the box goes down. Works like a charm. Current connections are lost,
 but new ones go to the new IP.
 
 Note that there are two current versions of Heartbeat. There's the old
 1.x series, which is simple and stable, but which has certain
 limitations such as only supporting two nodes, if I remember
 correctly. Then there's 2.x, which is much more complex and less
 stable.
 
 We run 2.0.7 today, and we have had some situations where the
 Heartbeat processes have run wild. It's been running quietly for over
 a year now, so recent patches may have fixed the issues. I would still
 recommend sticking with 1.x if at all possible.

I still don't understand why people stick to heartbeat for things
as simple as moving an IP address. Heartbeat is more of a clustering
solution, with abilities to perform complex tasks.

When it comes to just moving an IP address between two machines and doing
nothing else, the VRRP protocol is really better. It's what is
implemented in keepalived. Simple, efficient and very reliable.

I've been told that ucarp was good at that too, though I've never
tried it yet.
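
Just as an illustration, a minimal keepalived setup to float one address
between two boxes looks roughly like this (interface, password and address
are placeholders; the backup node uses state BACKUP and a lower priority) :

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass secret
    }
    virtual_ipaddress {
        192.168.0.100/24
    }
}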

 While there are solutions out there that preserve connections on
 failover, my gut feeling is that they introduce a level of complexity
 and computational overhead that necessarily puts a restraint on
 performance.

In fact it's useless to synchronise TCP sessions between load-balancers
for fast-moving connections (eg: HTTP traffic). Some people require that
for long sessions (terminal server, ...) but this cannot be achieved in
a standard OS, you need to synchronise every minor progress of the TCP
stack with the peer. And that also prevents true randomness from being
used at TCP and IP levels. It also causes trouble when some packets are
lost between the peers, because they can quickly get out of sync.

In practice, in order to synchronise TCP between two hosts, you need
more bandwidth than that of the traffic you want to forward.

There are intermediate solutions which synchronise at layer 4 only,
without taking into account the data or the sequence numbers. Those
present the advantage of being able to take over a connection without
too much overhead, but no layer 7 processing can be done there, and
those cannot be system sockets. That's typically what you find in some
firewalls or layer4 load balancers which just forward packets between
two sides and maintain a vague context.

Regards,
Willy




Re: measuring haproxy performance impact

2009-03-06 Thread Michael Fortson
On Fri, Mar 6, 2009 at 8:43 AM, Willy Tarreau w...@1wt.eu wrote:
 Hi Michael,

 On Thu, Mar 05, 2009 at 01:04:06PM -0800, Michael Fortson wrote:
 I'm trying to understand why our proxied requests have a much greater
 chance of significant delay than non-proxied requests.

 The server is an 8-core (dual quad) Intel machine. Making requests
 directly to the nginx backend is just far more reliable. Here's a
 shell script output that just continuously requests a blank 0k image
 file from nginx directly on its own port, and spits out a timestamp if
 the delay isn't 0 or 1 seconds:

 Thu Mar 5 12:36:17 PST 2009
 beginning continuous test of nginx port 8080
 Thu Mar 5 12:38:06 PST 2009
 Nginx Time is 2 seconds



 Here's the same test running through haproxy, simultaneously:

 Thu Mar 5 12:36:27 PST 2009
 beginning continuous test of haproxy port 80
 Thu Mar 5 12:39:39 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:39:48 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:39:55 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:40:03 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:40:45 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:40:48 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:40:55 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:40:58 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:41:55 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:42:01 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:42:08 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:42:29 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:42:38 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:43:05 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:43:15 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:44:08 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:44:25 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:44:30 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:44:33 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:44:39 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:44:46 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:44:54 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:45:07 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:45:16 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:45:45 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:45:54 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:45:58 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:46:05 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:46:08 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:46:32 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:46:48 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:46:53 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:46:58 PST 2009
 Nginx Time is 3 seconds
 Thu Mar 5 12:47:40 PST 2009
 Nginx Time is 3 seconds

 3 seconds is typically a TCP retransmit. You have network losses somewhere
 from/to your haproxy. Would you happen to be running on a gigabit port
 connected to a 100 Mbps switch ? What type of NIC is this ? I've seen
 many problems with broadcom netxtreme 2 (bnx2) caused by buggy firmwares,
 but it seems to work fine for other people after a firmware upgrade.

 My sanitized haproxy config is here (mongrel backend was omitted for 
 brevity) :
 http://pastie.org/408729

 Are the ACLs just too expensive?

 Not at all. Especially in your case. To reach 3 seconds of latency, you would
 need hundreds of thousands of ACLs, so this is clearly unrelated to your 
 config.

 Nginx is running with 4 processes, and the box shows mostly idle.

 ... which indicates that you aren't burning CPU cycles processing ACLs ;-)

 It is also possible that some TCP settings are too low for your load, but
 I don't know what your load is. Above a few hundred to a few thousand sessions
 per second, you will need to do some tuning, otherwise you can end up with
 similar situations.

 Regards,
 Willy



Hmm. I think it is gigabit connected to 100 Mb (all Dell rack-mount
servers and switches). The nginx backend runs on the same machine as
haproxy and is referenced via 127.0.0.1 -- does that still involve a
real network port? Should I try the test all on localhost to isolate
it from any networking retransmits?
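
For the localhost test, a simple loop like this would do -- the image path is
only a stand-in for the 0k file the script actually requests:

  while true; do
    curl -s -o /dev/null -w '%{time_total}\n' http://127.0.0.1:80/blank.gif
    sleep 1
  done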

Here's a peek at the stats page after about a day of running (this
should help demonstrate current loading)
http://pastie.org/409632



RE: load balancer and HA

2009-03-06 Thread John Lauro
 I still don't understand why people stick to heartbeat for things
 as simple as moving an IP address. Heartbeat is more of a clustering
 solution, with abilities to perform complex tasks.
 
 When it comes to just moving an IP address between two machines and doing
 nothing else, the VRRP protocol is really better. It's what is
 implemented in keepalived. Simple, efficient and very reliable.

One reason: heartbeat is standard in many distributions (e.g. RHEL, CentOS)
and vrrp and keepalived are not.  It might be overkill for just moving IP
addresses, but being supported in the base OS is a plus that shouldn't be
discounted.  If you already have to support heartbeat on other servers, using
heartbeat everywhere you have to share resources is easier than using vrrp
for some and heartbeat for others.






Re: measuring haproxy performance impact

2009-03-06 Thread Willy Tarreau
On Fri, Mar 06, 2009 at 11:23:02AM -0800, Michael Fortson wrote:
 On Fri, Mar 6, 2009 at 8:43 AM, Willy Tarreau w...@1wt.eu wrote:
  Hi Michael,
 
  On Thu, Mar 05, 2009 at 01:04:06PM -0800, Michael Fortson wrote:
  I'm trying to understand why our proxied requests have a much greater
  chance of significant delay than non-proxied requests.
 
  The server is an 8-core (dual quad) Intel machine. Making requests
  directly to the nginx backend is just far more reliable. Here's a
  shell script output that just continuously requests a blank 0k image
  file from nginx directly on its own port, and spits out a timestamp if
  the delay isn't 0 or 1 seconds:
 
  Thu Mar 5 12:36:17 PST 2009
  beginning continuous test of nginx port 8080
  Thu Mar 5 12:38:06 PST 2009
  Nginx Time is 2 seconds
 
 
 
  Here's the same test running through haproxy, simultaneously:
 
  Thu Mar 5 12:36:27 PST 2009
  beginning continuous test of haproxy port 80
  (...)
 
  3 seconds is typically a TCP retransmit. You have network losses somewhere
  from/to your haproxy. Would you happen to be running on a gigabit port
  connected to a 100 Mbps switch ? What type of NIC is this ? I've seen
  many problems with broadcom netxtreme 2 (bnx2) caused by buggy firmwares,
  but it seems to work fine for other people after a firmware upgrade.
 
  My sanitized haproxy config is here (mongrel backend was omitted for 
  brevity) :
  http://pastie.org/408729
 
  Are the ACLs just too expensive?
 
  Not at all. Especially in your case. To reach 3 seconds of latency, you 
  would
  need hundreds of thousands of ACLs, so this is clearly unrelated to your 
  config.
 
  Nginx is running with 4 processes, and the box shows mostly idle.
 
  ... which indicates that you aren't burning CPU cycles processing ACLs ;-)
 
  It is also possible that some TCP settings are too low for your load, but
  I don't know what your load is. Above a few hundred to a few thousand sessions
  per second, you will need to do some tuning, otherwise you can end up with
  similar situations.
 
  Regards,
  Willy
 
 
 
 Hmm. I think it is gigabit connected to 100 Mb (all Dell rack-mount
 servers and switches).

OK so then please check with ethtool if your port is running in half
or full duplex :

# ethtool eth0

Most often, 100 Mbps switches are forced to 100-full without autoneg,
and gig ports in front of them fall back to half duplex, assuming they are hubs.
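
For example (interface name assumed) :

  ethtool eth0 | egrep 'Speed|Duplex|Auto-negotiation'

and if it turns out to be half, forcing both sides generally fixes it :

  ethtool -s eth0 speed 100 duplex full autoneg off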

 The nginx backend runs on the same machine as
 haproxy and is referenced via 127.0.0.1 -- does that still involve a
 real network port? Should I try the test all on localhost to isolate
 it from any networking retransmits?

Yes if you can do that, that would be nice. If the issue persists,
we'll have to check the network stack tuning, but that's getting
harder as it depends on the workload. Also, please provide the
output of netstat -s.
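
A quick filter like this is enough to spot retransmits and listen queue
drops at a glance, though the full output is still preferable :

  netstat -s | egrep -i 'retrans|syn|listen|overflow'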

 Here's a peek at the stats page after about a day of running (this
 should help demonstrate current loading)
 http://pastie.org/409632

I'm seeing something odd here. A lot of mongrel servers experience
connection retries. Are they located on 

Re: measuring haproxy performance impact

2009-03-06 Thread Willy Tarreau
On Fri, Mar 06, 2009 at 11:49:39AM -0800, Michael Fortson wrote:
 Oops, looks like it's actually Gb -> Gb:
 http://pastie.org/409653

ah nice !

 Here's a netstat -s:
 http://pastie.org/409652

Oh there are interesting things there :

  - 513607 failed connection attempts
    => let's assume it was for dead servers

  - 34784881 segments retransmitted
    => this is huge, maybe your outgoing bandwidth is limited
       by the provider, causing lots of drops ?

  - 8325393 SYN cookies sent
    => either you've been experiencing a SYN flood attack, or
       the backlog on one of your listening sockets is extremely small

  - 1235433 times the listen queue of a socket overflowed
    1235433 SYNs to LISTEN sockets ignored
    => up to 1.2 million times some client socket experienced
       a drop, causing at least a 3 second delay to establish.
       The errors your scripts detect certainly account for a small
       part of those.

  - 2962458 times recovered from packet loss due to SACK data
    => many losses, related to the second point above.

Could you post the output of sysctl -a |grep ^net ? I think that
your TCP syn backlog is very low. Your stats page indicates an average
of about 300 sessions/s over the last 24 hours. If your external
bandwidth is capped and causes drops, you can nearly saturate the
default backlog of 1024 with 300 sessions/s each taking 3s to
complete. If you're interested, the latest snapshot will report
the number of sess/s in the stats.
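
As a rough check with the figures above : 300 sessions/s delayed by 3s means
roughly 900 connections pending at any instant, which is already close to
that default backlog of 1024.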

 Haproxy and nginx are currently on the same box. Mongrels are all on a
 private network accessed through eth1 (public access is via eth0).

OK.

 stats page attached (backend everything is not currently in use;
 it'll be a use-when-full option for fast_mongrels once we upgrade to
 the next haproxy).

According to the stats, your avg output bandwidth is around 10 Mbps.
Would this match your external link ?

Regards,
Willy




Re: measuring haproxy performance impact

2009-03-06 Thread Michael Fortson
On Fri, Mar 6, 2009 at 12:53 PM, Willy Tarreau w...@1wt.eu wrote:
 On Fri, Mar 06, 2009 at 11:49:39AM -0800, Michael Fortson wrote:
 Oops, looks like it's actually Gb -> Gb:
 http://pastie.org/409653

 ah nice !

 Here's a netstat -s:
 http://pastie.org/409652

 Oh there are interesting things there :

  - 513607 failed connection attempts
    => let's assume it was for dead servers

  - 34784881 segments retransmitted
    => this is huge, maybe your outgoing bandwidth is limited
       by the provider, causing lots of drops ?

  - 8325393 SYN cookies sent
    => either you've been experiencing a SYN flood attack, or
       the backlog on one of your listening sockets is extremely small

  - 1235433 times the listen queue of a socket overflowed
    1235433 SYNs to LISTEN sockets ignored
    => up to 1.2 million times some client socket experienced
       a drop, causing at least a 3 second delay to establish.
       The errors your scripts detect certainly account for a small
       part of those.

  - 2962458 times recovered from packet loss due to SACK data
    => many losses, related to the second point above.

 Could you post the output of sysctl -a |grep ^net ? I think that
 your TCP syn backlog is very low. Your stats page indicates an average
 of about 300 sessions/s over the last 24 hours. If your external
 bandwidth is capped and causes drops, you can nearly saturate the
 default backlog of 1024 with 300 sessions/s each taking 3s to
 complete. If you're interested, the latest snapshot will report
 the number of sess/s in the stats.

 Haproxy and nginx are currently on the same box. Mongrels are all on a
 private network accessed through eth1 (public access is via eth0).

 OK.

 stats page attached (backend everything is not currently in use;
 it'll be a use-when-full option for fast_mongrels once we upgrade to
 the next haproxy).

 According to the stats, your avg output bandwidth is around 10 Mbps.
 Would this match your external link ?

 Regards,
 Willy


Thanks Willy -- here's the sysctl -a |grep ^net output:
http://pastie.org/409735

Our outbound cap is 400 Mb



Re: measuring haproxy performance impact

2009-03-06 Thread Willy Tarreau
On Fri, Mar 06, 2009 at 01:00:38PM -0800, Michael Fortson wrote:
 Thanks Willy -- here's the sysctl -a |grep ^net output:
 http://pastie.org/409735

after a quick check, I see two major things :

  - net.ipv4.tcp_max_syn_backlog = 1024
    => far too low, increase it to 10240 and check if it helps

  - net.netfilter.nf_conntrack_max = 265535
  - net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120
    => this proves that netfilter is indeed running on this machine
       and might be responsible for session drops. 265k sessions is
       very low for the large time_wait. It limits to about 2k
       sessions/s, including local connections on loopback, etc...

You should then increase nf_conntrack_max and nf_conntrack_buckets
to about nf_conntrack_max/16, and reduce nf_conntrack_tcp_timeout_time_wait
to about 30 seconds.
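
As a sketch of the above (the exact values are only examples, and should be
made permanent in /etc/sysctl.conf once validated) :

  sysctl -w net.ipv4.tcp_max_syn_backlog=10240
  sysctl -w net.netfilter.nf_conntrack_max=1048576
  sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=30
  # the bucket count is usually the nf_conntrack 'hashsize' module parameter
  # rather than a writable sysctl ; 1048576/16 = 65536
  echo 65536 > /sys/module/nf_conntrack/parameters/hashsize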

 Our outbound cap is 400 Mb

OK so I think you're still far away from that.

Regards,
Willy




Re: question about queue and max_conn = 1

2009-03-06 Thread Willy Tarreau
Hi Greg,

On Fri, Mar 06, 2009 at 03:54:13PM -0500, Greg Gard wrote:
 hi willy and all,
 
 wondering if i can expect haproxy to queue requests when max conn per
 backend is set to 1. running nginx > haproxy > mongrel/rails2.2.2.

yes, it works fine and is even the recommended way of setting it for
use with mongrel. There has been trouble in the past with some old
versions, where a request could starve in the queue for too long.
What version are you using ?

 all seems ok, but i am getting a few users complaining of connection
 problems and never see anything other than zeros in the
 queue columns.

Have you correctly set the maxconn on the server lines ? I suspect
you have changed it in the frontend instead, which would be a disaster.
Could you please post your configuration ?

Regards,
Willy




Dropped HTTP Requests

2009-03-06 Thread Timothy Olson
I'm using HAProxy 1.3.15.7 to load-balance three Tomcat instances, and to
fork requests for static content to a single Apache instance.  I've found
that after the initial HTML page is loaded from Tomcat, the browser's
subsequent first request for a static image from Apache gets dropped
(neither HAProxy nor Apache logs the request, but I can sniff it).  The rest
of the images after the first one load fine.  If I create a small, static, test
HTML page on Tomcat (making the images come from a different backend), it
shows the first image on the page as broken.  If I put the exact same HTML
page on Apache (no backend switch required), it works fine.  I wonder if we
have a configuration problem, or perhaps this is a bug in the way HAProxy
deals with an HTTP keepalive request that spreads to a second backend?
Here is our simple test HTML page:

<html><body><img src=/header/logo.jpg><br><img src=/header/play_demo.jpg></body></html>

Both images come from Apache.  Again, if we request this HTML from the same
backend as the images, it works.  If we request this HTML from the Tomcat
backend, the request to Apache for logo.jpg gets dropped.  Our config
follows:


global
  log 127.0.0.1 local0 notice
  maxconn 4096
  chroot /usr/share/haproxy-jail
  user apache
  group apache
  daemon
  spread-checks 5

defaults
  log global
  mode    http
  option  httplog
  option  dontlognull
  option  forwardfor
  option  redispatch
  retries 3
  timeout connect 4s
  timeout server 20s
  timeout client 10s
  timeout http-request 10s

frontend all
  bind :80
  acl rp path_beg /rp
  acl rp_adserving path_beg /rp/javascript.js or path_beg /rp/do/adserving
  use_backend apache if !rp
  use_backend tomcat_nosession if rp_adserving
  default_backend tomcat_session

backend apache
  server LocalApache localhost:81
  balance roundrobin
  option httpchk /
  stats enable
  stats hide-version
  stats uri /hapstat
  stats realm   Haproxy\ Statistics
  stats auth    x:xxx
  stats refresh 5s

backend tomcat_session
  server TomcatA out1:8080 cookie A check slowstart 10s
  server TomcatB out2:8080 cookie B check slowstart 10s
  server TomcatC out3:8080 cookie C check slowstart 10s
  balance roundrobin
  option httpchk /rp/index.html
  timeout server 2m
  cookie JSESSIONID prefix

backend tomcat_nosession
  server TomcatA out1:8080 check slowstart 10s
  server TomcatB out2:8080 check slowstart 10s
  server TomcatC out3:8080 check slowstart 10s
  balance roundrobin
  option httpchk /rp/index.html


Re: Dropped HTTP Requests

2009-03-06 Thread Willy Tarreau
On Fri, Mar 06, 2009 at 04:55:21PM -0500, Timothy Olson wrote:
 I'm using HAProxy 1.3.15.7 to load-balance three Tomcat instances, and to
 fork requests for static content to a single Apache instance.  I've found
 that after the initial HTML page is loaded from Tomcat, the browser's
 subsequent first request for a static image from Apache gets dropped
 (neither HAProxy nor Apache logs the request, but I can sniff it).  The rest
  of the images after the first one load fine. If I create a small, static, test
 HTML page on Tomcat (making the images come from a different backend), it
 shows the first image on the page as broken.  If I put the exact same HTML
 page on Apache (no backend switch required), it works fine.  I wonder if we
 have a configuration problem, or perhaps this is a bug in the way HAProxy
 deals with an HTTP keepalive request that spreads to a second backend?

Haproxy does not support HTTP keepalive yet. However it can work around it
using option httpclose, which you should set in your defaults section.
What you describe is typically what happens without the option.
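
Applied to the defaults section you posted, that is simply (abridged; only the
added line changes, the rest stays as it is) :

defaults
  log global
  mode    http
  option  httplog
  option  httpclose   # work around keep-alive by closing after each request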

Regards,
Willy




RE: measuring haproxy performance impact

2009-03-06 Thread John Lauro
   - net.netfilter.nf_conntrack_max = 265535
   - net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120
     => this proves that netfilter is indeed running on this machine
        and might be responsible for session drops. 265k sessions is
        very low for the large time_wait. It limits to about 2k
        sessions/s, including local connections on loopback, etc...
 
 You should then increase nf_conntrack_max and nf_conntrack_buckets
 to about nf_conntrack_max/16, and reduce
 nf_conntrack_tcp_timeout_time_wait
 to about 30 seconds.
 

Minor nit...
He has:  net.netfilter.nf_conntrack_count = 0
Which, if I am not mistaken, indicates that connection tracking is in the
kernel but is not being used.  (No firewall rules triggering it.)






Re: load balancer and HA

2009-03-06 Thread Alexander Staubo
On Fri, Mar 6, 2009 at 7:48 PM, Willy Tarreau w...@1wt.eu wrote:
 When it comes to just moving an IP address between two machines and doing
 nothing else, the VRRP protocol is really better. It's what is
 implemented in keepalived. Simple, efficient and very reliable.

Actually, it seems that my information is out of date, and we (that
is, our IT management company that we outsource our system
administration to) are in fact using Keepalived these days. I was
confused by the presence of ha_logd on our boxes, which is part of the
Heartbeat package; I don't know what it is doing there. So, yeah,
you're right. Stick with Keepalived. :-)

 In fact it's useless to synchronise TCP sessions between load-balancers
 for fast-moving connections (eg: HTTP traffic). Some people require that
 for long sessions (terminal server, ...) but this cannot be achieved in
 a standard OS, you need to synchronise every minor progress of the TCP
 stack with the peer.

A less ambitious scheme would have the new proxy take over the client
connection and retry the request with the next available backend. This
depends on a couple of factors: For one, it only works if nothing has
yet been sent back to the client. Secondly, it assumes the request
itself is repeatable without side effects. The latter, of course, is
application-dependent; but following the REST principle, in a
well-designed app GET requests are supposed to have no side effects,
so they can be retried, whereas POST, PUT etc. cannot. Still expensive
and error-prone, of course, but much more pragmatic and limited in
scope.

Alexander.



Re: measuring haproxy performance impact

2009-03-06 Thread Willy Tarreau
On Fri, Mar 06, 2009 at 02:36:59PM -0800, Michael Fortson wrote:
 On Fri, Mar 6, 2009 at 1:46 PM, Willy Tarreau w...@1wt.eu wrote:
  On Fri, Mar 06, 2009 at 01:00:38PM -0800, Michael Fortson wrote:
  Thanks Willy -- here's the sysctl -a |grep ^net output:
  http://pastie.org/409735
 
  after a quick check, I see two major things :
   - net.ipv4.tcp_max_syn_backlog = 1024
      => far too low, increase it to 10240 and check if it helps
 
   - net.netfilter.nf_conntrack_max = 265535
   - net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120
      => this proves that netfilter is indeed running on this machine
        and might be responsible for session drops. 265k sessions is
        very low for the large time_wait. It limits to about 2k
        sessions/s, including local connections on loopback, etc...
 
  You should then increase nf_conntrack_max and nf_conntrack_buckets
  to about nf_conntrack_max/16, and reduce nf_conntrack_tcp_timeout_time_wait
  to about 30 seconds.
 
  Our outbound cap is 400 Mb
 
  OK so I think you're still far away from that.
 
  Regards,
  Willy
 
 
 
 Hmm; I did these (John is right, netfilter is down at the moment
 because I dropped iptables to help troubleshoot this),

What did you unload precisely ? You don't need any iptables rules
for the conntrack to take effect.

 so I guess the
 syn backlog is the only net change. No difference so far -- still
 seeing regular 3s responses.
 
 It's weird, but I actually see better results testing mongrel than
 nginx; haproxy => mongrel heartbeat is more reliable than the haproxy
 => nginx request.

mongrel is on another machine ? You might be running out of some
resource on the local one making it difficult to reach accept().
Unfortunately I don't see what :-(

Have you checked with dmesg that you don't have network stack
errors or any type of warning ?
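
For instance (adjust the interface names; this is only a quick filter) :

  dmesg | egrep -i 'eth0|eth1|drop|error|overrun' | tail -50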

Willy




Re: load balancer and HA

2009-03-06 Thread Willy Tarreau
On Fri, Mar 06, 2009 at 11:47:14PM +0100, Alexander Staubo wrote:
 On Fri, Mar 6, 2009 at 7:48 PM, Willy Tarreau w...@1wt.eu wrote:
  When it comes to just moving an IP address between two machines and doing
  nothing else, the VRRP protocol is really better. It's what is
  implemented in keepalived. Simple, efficient and very reliable.
 
 Actually, it seems that my information is out of date, and we (that
 is, our IT management company that we outsource our system
 administration to) are in fact using Keepalived these days. I was
 confused by the presence of ha_logd on our boxes, which is part of the
  Heartbeat package; I don't know what it is doing there. So, yeah,
 you're right. Stick with Keepalived. :-)

Ah nice! The author will be pleased to read this, he's subscribed to the
list :-)

  In fact it's useless to synchronise TCP sessions between load-balancers
  for fast-moving connections (eg: HTTP traffic). Some people require that
  for long sessions (terminal server, ...) but this cannot be achieved in
  a standard OS, you need to synchronise every minor progress of the TCP
  stack with the peer.
 
 A less ambitious scheme would have the new proxy take over the client
 connection and retry the request with the next available backend.

Will not work because the connection from the client to the proxy will
have been broken during the take-over. The second proxy cannot inherit
the primary one's sockets.

 This
 depends on a couple of factors: For one, it only works if nothing has
 yet been sent back to the client. Secondly, it assumes the request
 itself is repeatable without side effects. The latter, of course, is
 application-dependent; but following the REST principle, in a
 well-designed app GET requests are supposed to have no side effects,
 so they can be retried, whereas POST, PUT etc. cannot. Still expensive
 and error-prone, of course, but much more pragmatic and limited in
 scope.

What you're talking about are idempotent HTTP requests, which are quite
well documented in RFC2616. Those are important to consider because
idempotent requests are the only ones a proxy may retry upon a connection
error when sending a request on a keep-alive session. IIRC, HEAD, PUT,
GET and DELETE were supposed to be idempotent methods. But we all know
that GET is not that much when used with CGIs.

Willy




Re: load balancer and HA

2009-03-06 Thread Alexander Staubo
On Sat, Mar 7, 2009 at 12:07 AM, Willy Tarreau w...@1wt.eu wrote:
 A less ambitious scheme would have the new proxy take over the client
 connection and retry the request with the next available backend.

 Will not work because the connection from the client to the proxy will
 have been broken during the take-over. The second proxy cannot inherit
 the primary one's sockets.

Unless you have some kind of shared-memory L4 magic like the original
poster talked about, that allows taking over an existing TCP
connection.

 What you're talking about are idempotent HTTP requests, which are quite
 well documented in RFC2616.

That was the exact word I was looking for. I didn't know that PUT was
idempotent, but the others make sense.

Alexander.



Re: load balancer and HA

2009-03-06 Thread Willy Tarreau
On Sat, Mar 07, 2009 at 12:14:44AM +0100, Alexander Staubo wrote:
 On Sat, Mar 7, 2009 at 12:07 AM, Willy Tarreau w...@1wt.eu wrote:
  A less ambitious scheme would have the new proxy take over the client
  connection and retry the request with the next available backend.
 
  Will not work because the connection from the client to the proxy will
  have been broken during the take-over. The second proxy cannot inherit
  the primary one's sockets.
 
 Unless you have some kind of shared-memory L4 magic like the original
 poster talked about, that allows taking over an existing TCP
 connection.

in this case of course I agree. But that means kernel-level changes.

  What you're talking about are idempotent HTTP requests, which are quite
  well documented in RFC2616.
 
 That was the exact word I was looking for. I didn't know that PUT was
 idempotent, but the others make sense.

in fact it also makes sense for PUT because you're supposed to use
this method to send a file. Normally, you can send it as many times
as you want, the result will not change.

Willy




Re: question about queue and max_conn = 1

2009-03-06 Thread Greg Gard
thanks for taking a look willy. let me know if there's anything else i
should change.

global
maxconn 4096
user haproxy
group haproxy
daemon
log 127.0.0.1 local0 notice

# http
defaults
log global
retries3
timeout connect 5000
timeout client  60
timeout server  60
stats enable
stats auth ggard:buddycat
mode http
option httplog
balance roundrobin
# option httpclose
option httpchk HEAD /check.txt HTTP/1.0

#stable sites run win2k/iis5/asp
listen stable 192.168.1.5:10301
option forwardfor
server stable1 192.168.1.10:10300 weight 4 check
server stable2 192.168.1.11:10300 weight 6 check

# beta sites running mongrel/rails2.2.2
listen beta 192.168.1.5:8089
   server 7-8091  192.168.1.22:8091 weight 4 maxconn 1 check
   server 7-8092  192.168.1.22:8092 weight 4 maxconn 1 check
   server 7-8093  192.168.1.22:8093 weight 4 maxconn 1 check
   server 7-8094  192.168.1.22:8094 weight 4 maxconn 1 check
   server 7-8095  192.168.1.22:8095 weight 4 maxconn 1 check

   server 3-8091  192.168.1.23:8091 weight 6 maxconn 1 check
   server 3-8092  192.168.1.23:8092 weight 6 maxconn 1 check
   server 3-8093  192.168.1.23:8093 weight 6 maxconn 1 check
   server 3-8094  192.168.1.23:8094 weight 6 maxconn 1 check
   server 3-8095  192.168.1.23:8095 weight 6 maxconn 1 check







On Fri, Mar 6, 2009 at 4:51 PM, Willy Tarreau w...@1wt.eu wrote:

 Hi Greg,

 On Fri, Mar 06, 2009 at 03:54:13PM -0500, Greg Gard wrote:
  hi willy and all,
 
  wondering if i can expect haproxy to queue requests when max conn per
   backend is set to 1. running nginx > haproxy > mongrel/rails2.2.2.

 yes, it works fine and is even the recommended way of setting it for
  use with mongrel. There has been trouble in the past with some old
 versions, where a request could starve in the queue for too long.
 What version are you using ?

  all seems ok, but i am getting a few users complaining of connection
  problems and never see anything other than zeros in the
  queue columns.

 Have you correctly set the maxconn on the server lines ? I suspect
 you have changed it in the frontend instead, which would be a disaster.
 Could you please post your configuration ?

 Regards,
 Willy




-- 
greg gard, psyd
www.carepaths.com


Re: question about queue and max_conn = 1

2009-03-06 Thread Willy Tarreau
On Fri, Mar 06, 2009 at 10:02:03PM -0500, Greg Gard wrote:
 thanks for taking a look willy. let me know if there's anything else i
 should change.
 
(...)
 defaults
(...)
 # option httpclose

This one above should not be commented out. Otherwise, clients doing keepalive
will artificially maintain a connection to a mongrel when they don't use it,
thus preventing another client from using it.

 #stable sites run win2k/iis5/asp
 listen stable 192.168.1.5:10301
 option forwardfor
 server stable1 192.168.1.10:10300 weight 4 check
 server stable2 192.168.1.11:10300 weight 6 check

You can also set a maxconn on your iis sites if you think you sometimes
hit their connection limit. Maybe maxconn 200 or something like this.
The stats will tell you how high you go and if there are errors.

The rest looks fine.
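
For illustration, those two changes would look like this (maxconn 200 is only
an example value to start from) :

# in the defaults section, uncomment :
option httpclose

# and on the iis servers, add a per-server maxconn :
listen stable 192.168.1.5:10301
option forwardfor
server stable1 192.168.1.10:10300 weight 4 maxconn 200 check
server stable2 192.168.1.11:10300 weight 6 maxconn 200 check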

Regards,
Willy