Re: How to configure HAProxy to modify response headers

2010-11-15 Thread Larry Root
Thanks for the information on common practices. We are in fact trying to
work around a misbehaving application. In the case of 304s it's returning the
wrong Content-Type and causing issues with browsers.

Thank you for the help!

larry


---
Larry Root la...@armorgames.com
Head of Web Development | 949.207.6063
Armor Games Inc. | http://armorgames.com


On Thu, Nov 11, 2010 at 10:22 PM, Willy Tarreau w...@1wt.eu wrote:

 On Thu, Nov 11, 2010 at 04:28:46PM -0800, Larry Root wrote:
  I'm struggling to figure out how to modify the *Response* headers within
  haproxy. Here is the logic I would like to implement:
 
  IF response_status_code == 304 THEN remove header Content-Type
 
  I believe I need to set up an ACL rule to capture the status code part, and
  then use rspidel to conditionally remove the Content-Type header based on
  the ACL rule. However I'm struggling with how to specify this ACL rule for
  the response status code. I'm also not sure where to place this logic, in the
  frontend or backend? Any support would be greatly appreciated.

 You could place it in either the frontend or the backend. The rule of thumb is
 to consider that what is relevant to the server farm should be done in the
 backend, and what is relevant to the access point should be done in the
 frontend. So if you have to remove that header because of a buggy client
 it would be better to do it in the frontend, and if it's because the server
 is stupid, then better in the backend.

 I think the following rule should work:

rspidel ^Content-type  if { status 304 }

 However, I don't really understand why you'd want to remove the header in
 such a case, as a 304 is supposed to return the exact same headers as the
 200!

 Regards,
 Willy
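
A minimal sketch of the placement rule of thumb above, reusing Willy's suggested rule (haproxy 1.3/1.4 syntax assumed; the proxy names and the X-Broken-Header example are illustrative, and note that Larry's follow-up later in this digest reports the response condition not being honoured on his build):

# server-side quirk: fix it in the backend, next to the faulty farm
backend bk_buggy_app
    rspidel ^Content-type  if { status 304 }

# client-side quirk: fix it in the frontend, next to the faulty clients
frontend ft_public
    bind *:80
    reqidel ^X-Broken-Header: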




stats page enhancement idea

2010-11-15 Thread Corin Langosch

Hi all,

I'd like to suggest adding min/max/avg response time for each backend on the
stats page. I think it'd help a lot in finding the proper settings for
load balancing the backends, and in easily checking that everything is
working as it should. What do you think?


Corin



Re: stats page enhancement idea

2010-11-15 Thread Willy Tarreau
Hi Corin,

On Mon, Nov 15, 2010 at 04:55:55PM +, Corin Langosch wrote:
 Hi all,
 
 I'd like to suggest adding min/max/avg response time for each backend on the
 stats page. I think it'd help a lot in finding the proper settings for
 load balancing the backends, and in easily checking that everything is
 working as it should. What do you think?

It's in the pipe, along with much more useful information (e.g. last server
activity, and possibly per-URL stats). The main issue right now is finding out
how to display that information on the page. Someone had posted some examples
showing how to display a box under the mouse pointer, which could provide more
detailed information. We should probably do something like that. But as you
see, it's much more a matter of knowing how to report the information than of
computing it.

Regards,
Willy




(haproxy) How-To get HAProxy to balance 2 SSL encrypted webservers?

2010-11-15 Thread toms
So we have 2 webservers on the backend with SSL encryption.
We want to keep this the way it is.
Is there a way for HAProxy to balance these 2 servers with sticky
sessions enabled?

How can this be done?

Currently, when trying it this way:

defaults
log global
mode http
option httplog
option dontlognull
retries 3
option redispatch
maxconn 2000
contimeout 5000
clitimeout 5
srvtimeout 5
stats enable
stats uri /stats


frontend http-in
bind *:80
acl is_ww2_test1_com hdr_end(host) -i ww2.test1.com
use_backend ww2_test1_com if is_ww2_test1_com

backend ww2_test1_com
balance roundrobin
cookie SERVERID insert nocache indirect
option httpchk
option httpclose
option forwardfor
server Server1 10.10.10.11:80 cookie Server1
server Server2 10.10.10.12:80 cookie Server2

Since the 2 servers are encrypted on port 443 (with the main front 
page on port 80 not encrypted),
the above setup works until it hits 443, where I get the error 
Error 310 (net::ERR_TOO_MANY_REDIRECTS): There were too many 
redirects..
Port 443 on the HAProxy frontend is using Pound for the encryption.
However both backend servers have a Tomcat keystore (signed through 
Thawte) which I doubt will be compatible with Pound.  (And I don't 
want to re-sign the cert or get a new cert.)
Can I somehow get HAProxy to balance these 2 servers with proper 
sticky session handling?

TIA!




Re: Get real source IP

2010-11-15 Thread Graeme Donaldson
On 15 November 2010 21:09, Maxime Ducharme m...@techboom.com wrote:

 Hi guys

 We are looking for a way to get the real source IP that is connecting to our
 web services.

 We currently use option forwardfor, but some people are using this to
 bypass our checks.

 Is there another way to send the real IP to our web servers?


Another way to do this is to use HAProxy in transparent proxy mode. I have
not used it personally, but unless I'm mistaken it functions more like a
NAT/routing device than a proxy.

Here's a short howto if you'd like to try it out:
http://blog.loadbalancer.org/configure-haproxy-with-tproxy-kernel-for-full-transparent-proxy/

Regards,
Graeme.
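
A minimal sketch of the transparent mode that howto describes, assuming a TPROXY-capable kernel and haproxy built with TPROXY support (names and addresses are illustrative):

backend bk_web
    mode http
    option forwardfor
    # connect to the servers using the client's own source IP
    source 0.0.0.0 usesrc clientip
    server web1 10.0.0.10:80 check

The servers then have to route their return traffic back through the haproxy box (e.g. use it as their default gateway), since replies are addressed to the real client.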


Re: (haproxy) How-To get HAProxy to balance 2 SSL encrypted webservers?

2010-11-15 Thread Hank A. Paulson
Where is the rest of your haproxy config? If you are talking to port 443 on 
your Tomcat servers...


If you have the 2 backend servers and you want haproxy to talk to the 
encrypted/ssl ports on them (and you want your end users to see the certs that 
are on the Tomcat servers), then the only thing haproxy can see is the source 
IP and source port, and it can only try to create stickiness with the source 
IP. So you have to think in those terms: what is unencrypted at the time each 
request and response passes through haproxy.


In this case the end user sees the cert installed on pound and haproxy can use 
all the layer 7/http capabilities:

ssl/443 -> pound -> non-ssl -> haproxy -> non-ssl -> tomcat(s)

you can't do (AFAIK):

ssl/443 -> pound -> non-ssl -> haproxy -> ssl -> tomcat(s)

because the user would still see only the pound cert and I don't think haproxy 
can initiate ssl sessions on its own.
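
For reference, a hedged sketch of the passthrough variant without pound: haproxy in pure TCP mode on 443, so end users see the Tomcat certs, but stickiness can only use the source IP and no layer 7/http features apply (haproxy 1.3/1.4 syntax assumed; names and addresses are illustrative):

listen https_passthrough
    bind *:443
    mode tcp
    balance source               # stickiness on the client source IP
    option ssl-hello-chk         # health check with an SSL hello instead of plain HTTP
    server tomcat1 10.10.10.11:443 check
    server tomcat2 10.10.10.12:443 check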


On 11/15/10 11:08 AM, t...@hush.com wrote:

So we have 2 webservers on the backend with SSL encryption.
We want to keep this the way it is.
Is there a way for HAProxy to balance these 2 servers with sticky
sessions enabled?

How can this be done?

Currently, when trying it this way:

defaults
 log global
 mode http
 option httplog
 option dontlognull
 retries 3
 option redispatch
 maxconn 2000
 contimeout 5000
 clitimeout 5
 srvtimeout 5
 stats enable
 stats uri /stats


frontend http-in
 bind *:80
 acl is_ww2_test1_com hdr_end(host) -i ww2.test1.com
 use_backend ww2_test1_com if is_ww2_test1_com

backend ww2_test1_com
 balance roundrobin
 cookie SERVERID insert nocache indirect
 option httpchk
 option httpclose
 option forwardfor
 server Server1 10.10.10.11:80 cookie Server1
 server Server2 10.10.10.12:80 cookie Server2

Since the 2 servers are encrypted on port 443 (with the main front
page on port 80 not encrypted),
the above setup works until it hits 443, where I get the error
Error 310 (net::ERR_TOO_MANY_REDIRECTS): There were too many
redirects..
Port 443 on the HAProxy frontend is using Pound for the encryption.
However both backend servers have a Tomcat keystore (signed through
Thawte) which I doubt will be compatible with Pound.  (And I don't
want to re-sign the cert or get a new cert.)
Can I somehow get HAProxy to balance these 2 servers with proper
sticky session handling?

TIA!






Re: (haproxy) How-To get HAProxy to balance 2 SSL encrypted webservers?

2010-11-15 Thread toms
Thanks.
Are there any config examples I can take a look at?
Specifically, having HAProxy load balance 2 backend SSL encrypted 
Tomcat servers.
As per your message, I would not be able to use Pound.
How can I configure HAProxy to only balance the 2 servers' port 443 
and apply stickiness to the source IPs?
Are there any examples I can look at?

How can I modify the below config to also pass through, balance and 
create sticky sessions for SSL?
Currently our port 80 load balancing looks like this (entire 
config):

global
log 127.0.0.1:514 local7 # only send important events
maxconn 4096
user haproxy
group haproxy
daemon
defaults
log global
mode http
option httplog
option dontlognull
retries 3
option redispatch
maxconn 2000
contimeout 5000
clitimeout 5
srvtimeout 5
stats enable
stats uri /stats
frontend http-in
bind *:80
acl is_ww2_test1_com hdr_end(host) -i ww2.test1.com
use_backend ww2_test1_com if is_ww2_test1_com
backend ww2_test1_com
balance roundrobin
cookie SERVERID insert nocache indirect
option httpchk
option httpclose
option forwardfor
server Server1 10.10.10.11:80 cookie Server1
server Server2 10.10.10.12:80 cookie Server2
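
One hedged way to add this, under the constraints explained in the previous message, is a TCP-mode section for port 443 alongside the HTTP one; the section name is illustrative, and cookie-based stickiness cannot work on an encrypted stream, only source-IP stickiness:

listen https_ww2_test1_com
    bind *:443
    mode tcp                     # pass the SSL stream through untouched
    balance source               # stick a client to a server by its source IP
    server Server1 10.10.10.11:443 check
    server Server2 10.10.10.12:443 check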

thanks again.

ts

On Mon, 15 Nov 2010 14:39:13 -0500 Hank A. Paulson 
h...@spamproof.nospammail.net wrote:
Where is the rest of your haproxy config? If you are talking to 
port 443 on your Tomcat servers...

If you have the 2 backend servers and you want haproxy to talk to 
the encrypted/ssl ports on them (and you want your end users to 
see the certs that are on the Tomcat servers), then the only thing 
haproxy can see is the source IP and source port, and it can only 
try to create stickiness with the source IP. So you have to think 
in those terms: what is unencrypted at the time each request and 
response passes through haproxy.

In this case the end user sees the cert installed on pound and 
haproxy can use all the layer 7/http capabilities:
ssl/443 -> pound -> non-ssl -> haproxy -> non-ssl -> tomcat(s)

you can't do (AFAIK):

ssl/443 -> pound -> non-ssl -> haproxy -> ssl -> tomcat(s)
because the user would still see only the pound cert and I don't 
think haproxy can initiate ssl sessions on its own.

On 11/15/10 11:08 AM, t...@hush.com wrote:
 So we have 2 webservers on the backend with SSL encryption.
 We want to keep this the way it is.
 Is there a way for HAProxy to balance these 2 servers with sticky
 sessions enabled?

 How can this be done?

 Currently, when trying it this way:

 defaults
  log global
  mode http
  option httplog
  option dontlognull
  retries 3
  option redispatch
  maxconn 2000
  contimeout 5000
  clitimeout 5
  srvtimeout 5
  stats enable
  stats uri /stats


 frontend http-in
  bind *:80
  acl is_ww2_test1_com hdr_end(host) -i ww2.test1.com
  use_backend ww2_test1_com if is_ww2_test1_com

 backend ww2_test1_com
  balance roundrobin
  cookie SERVERID insert nocache indirect
  option httpchk
  option httpclose
  option forwardfor
  server Server1 10.10.10.11:80 cookie Server1
  server Server2 10.10.10.12:80 cookie Server2

 Since the 2 servers are encrypted on port 443 (with the main front
 page on port 80 not encrypted),
 the above setup works until it hits 443, where I get the error
 Error 310 (net::ERR_TOO_MANY_REDIRECTS): There were too many
 redirects..
 Port 443 on the HAProxy frontend is using Pound for the encryption.
 However both backend servers have a Tomcat keystore (signed through
 Thawte) which I doubt will be compatible with Pound.  (And I don't
 want to re-sign the cert or get a new cert.)
 Can I somehow get HAProxy to balance these 2 servers with proper
 sticky session handling?

 TIA!






RE: Get real source IP

2010-11-15 Thread Angelo Höngens
Or you could remove the client's XFF header and always add your own. Then you 
are sure you can trust your own XFF header, and the client can't bypass your checks.
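
A minimal sketch of that approach, assuming haproxy 1.3/1.4 syntax (the frontend name is illustrative):

frontend ft_web
    bind *:80
    # drop any X-Forwarded-For the client supplied...
    reqidel ^X-Forwarded-For:
    # ...then let haproxy append its own, carrying the real source IP
    option forwardfor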

--

With kind regards,

Angelo Höngens
Systems Administrator
--
NetMatch
tourism internet software solutions
Ringbaan Oost 2b
5013 CA Tilburg
T: +31 (0)13 5811088
F: +31 (0)13 5821239
mailto:a.hong...@netmatch.nl
http://www.netmatch.nl
--

From: Graeme Donaldson [mailto:gra...@donaldson.za.net]
Sent: maandag 15 november 2010 20:17
To: Maxime Ducharme
Cc: haproxy@formilux.org
Subject: Re: Get real source IP

On 15 November 2010 21:09, Maxime Ducharme m...@techboom.com wrote:
Hi guys

We are looking for a way to get the real source IP that is connecting to our
web services.

We currently use option forwardfor, but some people are using this to
bypass our checks.

Is there another way to send the real IP to our web servers?

Another way to do this is to use HAProxy in transparent proxy mode. I have not 
used it personally, but unless I'm mistaken it functions more like a 
NAT/routing device than a proxy.

Here's a short howto if you'd like to try it out: 
http://blog.loadbalancer.org/configure-haproxy-with-tproxy-kernel-for-full-transparent-proxy/

Regards,
Graeme.


Re: Limiting throughput with a cold cache

2010-11-15 Thread Dmitri Smirnov

Willy,

thank you for taking the time to respond. This is always thought provoking.

On 11/13/2010 12:18 AM, Willy Tarreau wrote:


Why reject instead of redispatching? Do you fear that non-cached requests
will have a domino effect on all your squids?


Yes, this indeed happens. Also, the objective is not to exceed the 
number of connections from the squids to the backend database. In the case 
of a cold cache, a redispatch will cause a cache entry to be brought into the 
wrong shard, where it is unlikely to be reused. Thus this would use up a 
valuable connection just to satisfy one request.


However, even this is an optimistic scenario. These cold cache 
situations happen due to external factors like an AWS issue or our home-grown 
DNS getting messed up (AWS does not provide DNS), etc., which causes not all 
of the squids to be reported to the proxies and messes up the distribution. 
This is because haproxy is restarted after the config file is regenerated.


I have been thinking about preserving some of the distribution using 
server IDs when the set of squids partially changes but that's another 
story, let's not digress.


Thus even with redispatch enabled, the other squid is unlikely to have 
free connection slots, because when one goes cold, most of them do.


Needless to say, most of the other components in the system are also in 
distress in case something happens on a large scale. So I choose the 
stability of the system to be the priority even though some of the 
clients will be refused service which happens to be the least of the evils.




I don't see any http-server-close there, which makes me think that your
squids are selected upon the first request of a connection and will still
have to process all subsequent requests, even if their URLs don't hash to
them.


Good point. This is not the case, however; forceclose takes care of it, 
and I can see that most of the time the number of concurrently open 
connections to any particular squid changes very quickly in a range of 
0-3, even though each of them handles a chunk of requests per second.




Your client timeout at 100ms seems way too low to me. I'm sure you'll
regularly get truncated responses due to packet losses. In my opinion,
both your client and server timeouts must cover at least a TCP retransmit
(3s), so that makes sense to set them to at least 4-5 seconds. Also, do
not forget that sometimes your cache will have to fetch an object before
responding, so it could make sense to have a server timeout which is at
least as big as the squid timeout.


Agree. Right now the server timeout in prod is 5s, according to what is 
recommended in the docs. In fact, I will probably revert my timeout 
changes to be in line with your recommendations.


Having slept on the problem, I came up with a fairly simple idea which is 
not perfect but I think delivers most of the bang for such a simple change.


It revolves around adding a maxconn restriction for every individual 
squid in the backend.


And the number can be easily calculated and then tuned after a loadtest.

Let's assume I have 1 haproxy in front of a single squid.

Furthermore, HIT latency: 5ms, MISS latency 200ms for simplicity.

Incoming traffic 1,000 rps at peak.

From squids to backend let's have 50 connections max, i.e. 250 rps max.

So through a single-connection allowance we will be able to process 
200 rps for hot caches; for a cold cache we will do only 5 rps.


This means that to support hot traffic we need at least 5 connections.
At the same time this will throttle MISS requests to a max of 25 rps.

Because we have 250 rps max at the backend we can raise maxconn to 50 
for the squid. This creates a range of 250-10,000 rps (50 connections at 
5 rps all-MISS, up to 50 at 200 rps all-HIT).


As caches warm up, the traffic becomes mixed and drifts towards the hot 
model, so the same number of connections will process more and more 
requests until it reaches the 99.6% hit rate in our case.


I chose to leave the queue size unlimited but put a fast expiration time 
on the queue entries so they are rejected with a 503, unless you have other 
recommendations.


I also chose to impose an individual maxconn rather than a backend 
maxconn. This is to prevent MISS requests from using up the whole 
connection limit, and to allow HITs to be served quickly from hot shards.

I am still pondering over this point though.

The situation would be more complicated if the maxconn were too big for 
MISS and too small for HIT, but this is not the case.


The biggest problem remaining: clients stop seeing rejections when at 
least 5 connections are available for HIT traffic. This means that MISS 
traffic should be at 225 rps at the most, i.e. caches must be > 77% hot.


I will test experimentally if this takes too long.
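
A sketch of the scheme described above with the example's numbers, assuming haproxy 1.4 syntax (names and addresses are illustrative):

backend bk_squids
    balance uri                  # shard requests across squids by URL hash
    timeout queue 100ms          # expire queued entries fast: 503 instead of piling up
    # 50 conns x 5 rps (all-MISS) = 250 rps, up to 50 x 200 rps (all-HIT) = 10,000 rps
    server squid1 10.0.0.1:3128 maxconn 50 check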

thanks,
--
Dmitri Smirnov




Re: How to configure HAProxy to modify response headers

2010-11-15 Thread Larry Root
So I played around with this today and I am able to delete the Content-Type
header, but I can't seem to get the condition to behave. Here is example code
from my backend:

backend http_www
server srv_name 0.0.0.0:80 maxconn 32 weight 1
acl is_status_304 status eq 304
rspidel ^Content-type if is_status_304

No matter what I do with the rspidel command, it always deletes the
Content-Type header and never follows the condition. For example, the
following also deletes the Content-Type header:

rspidel ^Content-type if FALSE

Perhaps I am not using conditions and ACLs properly?

Thanks

larry

---
Larry Root la...@armorgames.com
Head of Web Development | 949.207.6063
Armor Games Inc. | http://armorgames.com


On Thu, Nov 11, 2010 at 10:22 PM, Willy Tarreau w...@1wt.eu wrote:

 On Thu, Nov 11, 2010 at 04:28:46PM -0800, Larry Root wrote:
  I'm struggling to figure out how to modify the *Response* headers within
  haproxy. Here is the logic I would like to implement:
 
  IF response_status_code == 304 THEN remove header Content-Type
 
  I believe I need to set up an ACL rule to capture the status code part, and
  then use rspidel to conditionally remove the Content-Type header based on
  the ACL rule. However I'm struggling with how to specify this ACL rule for
  the response status code. I'm also not sure where to place this logic, in the
  frontend or backend? Any support would be greatly appreciated.

 You could place it in either the frontend or the backend. The rule of thumb is
 to consider that what is relevant to the server farm should be done in the
 backend, and what is relevant to the access point should be done in the
 frontend. So if you have to remove that header because of a buggy client
 it would be better to do it in the frontend, and if it's because the server
 is stupid, then better in the backend.

 I think the following rule should work:

rspidel ^Content-type  if { status 304 }

 However, I don't really understand why you'd want to remove the header in
 such a case, as a 304 is supposed to return the exact same headers as the
 200!

 Regards,
 Willy




Re: Limiting throughput with a cold cache

2010-11-15 Thread Bedis 9
Hi Dmitri,

First, let me summarize your issue; tell me if I'm wrong.
You have haproxy balancing traffic to squids in reverse-proxy mode,
using URL hashing as the metric.
The problem you have is that when a cache comes up after a crash, it's in
trouble because it gets too many MISS requests.

Are your HTTP backend servers slow?

Have you tried some tiered caching?
I mean, having a backend squid server that the edge ones would query
before going to the HTTP backend.
The advantage: the backend squid will be HOT for all your objects. If
a squid goes down, haproxy would balance its traffic to the other squids,
but the backend one will keep on learning objects. When the failed edge
squid comes back, haproxy will balance traffic to it
again, whatever the number of requests is. That squid will learn objects
from the backend one.

I would do this configuration using ICP:
all your edge squids would first ask your backend one for the object.
If the backend has the object, then the edge will learn it; otherwise
it has to go to the origin.
All the most-accessed objects would remain in the backend squid's memory.
If the backend squid does not work, you can configure your edge squids
to get content from the origin server directly.

That way, you can limit the extra load on the edge squids when doing a
cold start.
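
A minimal squid.conf sketch of that tiered setup (the parent hostname is illustrative; 3128/3130 are assumed to be the parent's HTTP and ICP ports):

# on each edge squid: query the parent via ICP and prefer it over going direct
cache_peer parent-cache.example.com parent 3128 3130
prefer_direct off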

my 2 cents.

If you have a lot of traffic from your edges to your backend, you can
load balance that traffic too :)

cheers


On Tue, Nov 16, 2010 at 1:43 AM, Dmitri Smirnov dsmir...@netflix.com wrote:
 Willy,

 thank you for taking the time to respond. This is always thought provoking.

 On 11/13/2010 12:18 AM, Willy Tarreau wrote:

 Why reject instead of redispatching? Do you fear that non-cached requests
 will have a domino effect on all your squids?

 Yes, this indeed happens. Also, the objective is not to exceed the number of
 connections from the squids to the backend database. In the case of a cold
 cache, a redispatch will cause a cache entry to be brought into the wrong
 shard, where it is unlikely to be reused. Thus this would use up a valuable
 connection just to satisfy one request.

 However, even this is an optimistic scenario. These cold cache situations
 happen due to external factors like an AWS issue or our home-grown DNS getting
 messed up (AWS does not provide DNS), etc., which causes not all of the squids
 to be reported to the proxies and messes up the distribution. This is because
 haproxy is restarted after the config file is regenerated.

 I have been thinking about preserving some of the distribution using server
 IDs when the set of squids partially changes but that's another story, let's
 not digress.

 Thus even with redispatch enabled, the other squid is unlikely to have free
 connection slots, because when one goes cold, most of them do.

 Needless to say, most of the other components in the system are also in
 distress in case something happens on a large scale. So I choose the
 stability of the system to be the priority even though some of the clients
 will be refused service which happens to be the least of the evils.


 I don't see any http-server-close there, which makes me think that your
 squids are selected upon the first request of a connection and will still
 have to process all subsequent requests, even if their URLs don't hash to
 them.

 Good point. This is not the case, however; forceclose takes care of it, and I
 can see that most of the time the number of concurrently open connections to
 any particular squid changes very quickly in a range of 0-3, even though each
 of them handles a chunk of requests per second.


 Your client timeout at 100ms seems way too low to me. I'm sure you'll
 regularly get truncated responses due to packet losses. In my opinion,
 both your client and server timeouts must cover at least a TCP retransmit
 (3s), so that makes sense to set them to at least 4-5 seconds. Also, do
 not forget that sometimes your cache will have to fetch an object before
 responding, so it could make sense to have a server timeout which is at
 least as big as the squid timeout.

 Agree. Right now the server timeout in prod is 5s, according to what is
 recommended in the docs. In fact, I will probably revert my timeout changes
 to be in line with your recommendations.

 Having slept on the problem, I came up with a fairly simple idea which is not
 perfect but I think delivers most of the bang for such a simple change.

 It revolves around adding a maxconn restriction for every individual
 squid in the backend.

 And the number can be easily calculated and then tuned after a loadtest.

 Let's assume I have 1 haproxy in front of a single squid.

 Furthermore, HIT latency: 5ms, MISS latency 200ms for simplicity.

 Incoming traffic 1,000 rps at peak.

 From squids to backend let's have 50 connections max, i.e. 250 rps max.

 So through a single-connection allowance we will be able to process 200 rps
 for hot caches; for a cold cache we will do only 5 rps.

 This means that to support hot traffic we need at least 5 connections.
 At the