RE: HAProxy, multicores and EC2

2011-10-10 Thread Erik Torlen
Thank you. 

I tried it using taskset, which was pretty easy (taskset -pc 2,3 345), where
2,3 are the CPUs to use and 345 is the PID.
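
For reference, a minimal sketch of the same approach wrapped in a shell function, so that several PIDs can be pinned in one go. The function name pin_pids and the DRY_RUN switch are illustrative assumptions, not part of taskset; only the "taskset -pc CPULIST PID" call comes from the command above.

```shell
# pin_pids CPULIST PID...
# Runs one "taskset -pc" call per PID; with DRY_RUN=1 it only prints
# the commands it would run.
pin_pids() {
    cpus="$1"
    shift
    for pid in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "taskset -pc $cpus $pid"
        else
            taskset -pc "$cpus" "$pid"
        fi
    done
}

# Example (not run here): pin all haproxy processes to CPUs 2 and 3.
# pin_pids 2,3 $(pidof haproxy)
```

With nbproc=2, pidof returns two PIDs, and both end up allowed on the same two CPUs.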

/E

-----Original Message-----
From: Vincent Bernat [mailto:ber...@luffy.cx] 
Sent: 9 October 2011 01:56
To: Erik Torlen
Cc: Willy Tarreau; haproxy@formilux.org
Subject: Re: HAProxy, multicores and EC2

OoO In the middle of this starry night of Sunday, October 09, 2011, around
04:24, Erik Torlen said:

> I have read about a lot of people who have tried stud. This example is
> interesting in this case because he assigns the
> different processes to different cores with cpuset:
> http://vincent.bernat.im/en/blog/2011-ssl-benchmark.html

> In my case, would cpuset be the same as taskset? 

taskset is lower level than cpuset. You won't be able to escape from a
cpuset with taskset. But if you don't use cpuset (or cgroups), taskset
should work just fine.

Here is how I do it with cpuset:

mkdir /dev/cpuset
mount -t cpuset cpuset /dev/cpuset
cd /dev/cpuset

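# All system processes on CPU 7
mkdir system
cd system
echo 7 > cpus
echo 0 > mems
# Move every existing task into this cpuset, one PID per write
while read i; do /bin/echo $i; done < ../tasks > tasks
cd ..
# One cpuset per CPU, each restricted to a single core
for i in $(seq 0 7); do
  mkdir cpu$i
  cd cpu$i
  echo $i > cpus
  echo 0 > mems
  cd ..
done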

[...]

# Stud on CPU 3-6
PID=stud
i=0
for pid in $(pidof $PID); do
 echo $pid > /dev/cpuset/cpu$(($i + 3))/tasks
 i=$(( ($i+1) % 4))
done

At the end, just check that each process is properly pinned to the wanted
CPU with /proc/PID/status:

Cpus_allowed_list:  5
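
A small sketch of that check as a shell function, assuming the usual "Cpus_allowed_list:" line format of /proc/PID/status shown above. The function name allowed_cpus is an illustrative choice.

```shell
# allowed_cpus STATUS_FILE
# Prints the Cpus_allowed_list value from a /proc/PID/status style file.
allowed_cpus() {
    awk '$1 == "Cpus_allowed_list:" { print $2 }' "$1"
}

# Example (not run here): show the allowed CPUs of every stud process.
# for pid in $(pidof stud); do
#     echo "$pid: $(allowed_cpus /proc/$pid/status)"
# done
```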
-- 
Vincent Bernat ☯ http://vincent.bernat.im

Don't comment bad code - rewrite it.
- The Elements of Programming Style (Kernighan & Plauger)


Re: HAProxy and IIS 6

2011-10-10 Thread Ricky Boone

On 10/10/2011 02:53 PM, Karthik Iyer wrote:

> I am new to haproxy, but I think I can help you here.
>
> You can use a custom health check aspx page and make haproxy do health
> checks at a fixed interval using "http-check expect".
> HAProxy will mark the node down if the expected reply is not returned
> within the specified period.


That looks useful, but I'm not sure I can use it in my scenario.  I 
would like to have the health check call a URL on the backend server 
that is relevant, however there are a couple of problems:


1) I'm not an ASP.NET developer.  I don't know the first thing about 
performing any specific checks within an ASP.NET application.  :(
2) Not my application, just front-ending it with a load balancer (rather 
use HAProxy than NLB or fork out the cash for an F5 or similar).


This is a snippet of the configuration I have in production now: 
http://pastebin.com/fssNNkqf


Obviously calls to a static html file aren't going to cut it, but it's 
the only option I've had for the time being.  It's also the only way I 
can have my admins take the bad server "out" of the load balancer.


I have a test config, but I want to be sure I'm going about the check 
the right way.  I just want to be sure that the load balancer doesn't 
send traffic to a server that is not responding quickly (still warming 
up or recycling), and that connections don't queue up in the Current 
Session counter.


http://pastebin.com/Zum0RVfH

I have this config running on the secondary load balancer (LB2), and 
when the backend server is having a problem LB2 marks it as down.  I 
just want to be sure this is the right way of going about this, or if 
there are any other recommendations.




RE: HAProxy, multicores and EC2

2011-10-10 Thread Erik Torlen
Hi,

I made some more tests, this time with taskset to see how the performance is 
affected.

I noticed during the tests that the connections to the backends (3 of them)
were not divided equally but were instead very different. The 1st had ~2200,
the 2nd ~150 and the 3rd ~60. The numbers also jumped around a lot: on the 1st
it could be 2000, then down to 200, up to 3000 and so on.

I'm guessing that this could have to do with the backends being in different
availability zones (which would be different datacenters), so that network
latency is causing a delay on connections to the machine that has the longest
route? (The load on the backends was equal, around 60% CPU.)

FYI, haproxy is located in Amazon East, availability zone 1D. The three
backends are in zones B, C and D. Looking at the stats from HAProxy (attached)
you can see that corr conns to the backend in zone D is fairly low compared to
the other zones, where zone B is worst and then zone C. Zone B is on average
4-5 times worse than zone C.

This is with nbproc=2 and using taskset to bind both haproxy processes to
CPUs 2 and 3.
I managed to push ~7500 req/s.

top - 14:32:45 up 4 days, 19:44,  1 user,  load average: 0.79, 0.86, 0.59
Tasks:  82 total,   3 running,  79 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.0%sy,  0.0%ni, 93.2%id,  0.0%wa,  1.5%hi,  3.8%si,  1.5%st
Cpu1  :  0.0%us,  0.5%sy,  0.0%ni, 99.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  : 23.9%us, 41.3%sy,  0.0%ni, 16.5%id,  0.0%wa,  0.0%hi, 18.3%si,  0.0%st
Cpu3  : 21.3%us, 42.6%sy,  0.0%ni, 16.7%id,  0.0%wa,  0.0%hi, 19.4%si,  0.0%st
Mem:  15374136k total,  1172536k used, 14201600k free,    52024k buffers
Swap:        0k total,        0k used,        0k free,   242540k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1862 haproxy   20   0  148m  68m  652 R 93.3  0.5   5:42.97 haproxy
 1861 haproxy   20   0  135m  62m  656 R 90.4  0.4   5:35.57 haproxy


Using only nbproc=2 without taskset gave me this. Look at %si: the majority
of it is on cpu0.
I managed to push ~6500 req/s, less than with taskset.

top - 14:51:56 up 4 days, 20:03,  1 user,  load average: 1.76, 1.53, 0.98
Tasks:  82 total,   3 running,  79 sleeping,   0 stopped,   0 zombie
Cpu0  : 16.2%us, 21.2%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi, 62.6%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  : 22.4%us, 39.3%sy,  0.0%ni, 15.9%id,  0.0%wa,  0.0%hi, 22.4%si,  0.0%st
Mem:  15374136k total,  1216348k used, 14157788k free,    52064k buffers
Swap:        0k total,        0k used,        0k free,   242556k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1915 haproxy   20   0  181m  83m  656 R 99.3  0.6   8:19.49 haproxy
 1916 haproxy   20   0  193m  88m  652 R 90.4  0.6   8:16.89 haproxy


Using nbproc=3 and taskset=1,2,3 gave worse results compared to nbproc=2 and
taskset=2,3.

I will make more tests with your suggestions (tcp-smart-connect + 
tcp-smart-accept).

/E

-----Original Message-----
From: Willy Tarreau [mailto:w...@1wt.eu] 
Sent: 8 October 2011 23:09
To: Erik Torlen
Cc: haproxy@formilux.org
Subject: Re: HAProxy, multicores and EC2

On Sun, Oct 09, 2011 at 02:24:27AM +, Erik Torlen wrote:
> Thanks for the response, Willy.
> 
> I agree with what you are saying.
> I have load tested a lot of different machines/systems and the VMs never
> perform as well as a physical machine. However, in this case we have to use
> Amazon, so the focus is more on getting the most out of a single instance
> and then scaling with more machines to get more performance.

I see.

> Extra large EC2 instances are "supposed" to be dedicated machines in the
> cloud; no one else should use them except you. But if I can't get HAProxy
> to use the XL instance properly, it could be better to have more Large
> instances instead (2 cores). That would reduce cost and make better use of
> the instances.

I agree. What is important in EC2 is to reduce the number of packets as much
as possible, as we noticed in the past that every packet has a huge cost.
Using keep-alive with the client (option http-server-close) saves some
packets on the client side and allows haproxy to use TCP RST to close the
server connection and save another packet on this side. Using both
"option tcp-smart-accept" and "option tcp-smart-connect" saves another
packet on each side. You should notice an improvement with these.
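
As an illustration only, the options named above could be grouped like this in a 1.4 configuration. The section layout is an assumption and not taken from Erik's actual file; nbproc 2 mirrors the value used in his tests.

```haproxy
global
    nbproc 2

defaults
    mode http
    # Keep-alive toward the client, close the server side after each response
    option http-server-close
    # Save one ACK packet during the accept sequence (client side)
    option tcp-smart-accept
    # Save one ACK packet during the connect sequence (server side)
    option tcp-smart-connect
```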

> And make stud use one of the cores and HAProxy the other?

Yes, possibly. If you need to run a lot of SSL on the machine, then I
suggest that you keep your XL machine. Recently, stud merged the patches
provided by our dev team at Exceliance, allowing it to scale using
multiple processes. In your case, you should stick all interrupts to
core #0, haproxy to core #1 and stud to all remaining cores. That way
you should get optimal performance.
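
A rough sketch of that layout on a 4-core machine, under stated assumptions: the helper name cpu_mask and the IRQ number 24 are hypothetical, and whether interrupt affinity can be changed at all inside an EC2 guest is not confirmed in this thread. The helper only computes the hexadecimal bitmask that /proc/irq/N/smp_affinity expects for a single CPU.

```shell
# cpu_mask CPU
# Prints the hexadecimal affinity bitmask selecting one CPU,
# in the format used by /proc/irq/N/smp_affinity.
cpu_mask() {
    printf '%x\n' $((1 << $1))
}

# Example (not run here, needs root; IRQ 24 is a hypothetical number):
# cpu_mask 0 > /proc/irq/24/smp_affinity                     # interrupts on core 0
# taskset -pc 1 "$(pidof haproxy)"                           # haproxy on core 1
# for pid in $(pidof stud); do taskset -pc 2,3 "$pid"; done  # stud on the rest
```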

> I read a lot of people that hav

Re: HAProxy and IIS 6

2011-10-10 Thread Karthik Iyer
On Mon, Oct 10, 2011 at 10:32 PM, Ricky Boone  wrote:

> I am trying to troubleshoot an issue with our load balancer, and how it
> considers a backend server alive or dead.
>
> The servers are running IIS 6 (Win2K3), running an ASP.NET web service in
> its own application pool.  The pool is set with multiple (4, at the moment)
> worker processes.
>
> The problem occurs when the worker processes are starting, or when they
> recycle/refresh due to memory or other thresholds set in the application
> pool.  The load balancer keeps throwing traffic at the server, even though
> it isn't ready.  It shows as an increasing number of Current Connections on
> the backend server.  Where the count normally never exceeds 10-15, it
> usually increases to a few dozen through a couple hundred before the worker
> processes finally warm up on their own.
>
> I'm aware of ASP.NET and how it caches on the first hit (per worker). We
> have a process (using ApacheBench) to warm-up the worker processes, however
> if there are unexpected refresh/recycle events, we have to disable the
> backend server, manually warm-up the worker processes, then add it back.
>  Quite hectic.
>
> The issues with the application cannot be resolved (not our application).
>  We want the load balancer to stop sending traffic to a server that is not
> responding to requests promptly, but still provide a way for the load
> balancer to assist with the warm-up process.
>


I am new to haproxy, but I think I can help you here.

You can use a custom health check aspx page and make haproxy do health
checks at a fixed interval using "http-check expect". HAProxy will mark
the node down if the expected reply is not returned within the specified
period.


Ex :

backend web-backend
    balance leastconn
    option httpchk GET /check.aspx HTTP/1.0
    http-check expect string Success
    server node1 192.168.8.1:80 check inter 3000 rise 2 fall 3 maxconn 250
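
To try the check page by hand before relying on it, something along these lines could be used. This is only a sketch: the function name check_body is made up for illustration, /check.aspx and the "Success" marker are the placeholders from the example above, and the 3 second limit is an arbitrary choice.

```shell
# check_body MARKER
# Reads a response body on stdin; succeeds only if it contains MARKER.
check_body() {
    grep -qF -- "$1"
}

# Example (not run here): fetch the check page with a 3 second limit and
# test the body the way "http-check expect string Success" would.
# curl -s --max-time 3 http://192.168.8.1/check.aspx | check_body Success \
#     && echo UP || echo DOWN
```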

- Karthik Iyer


HAProxy and IIS 6

2011-10-10 Thread Ricky Boone
I am trying to troubleshoot an issue with our load balancer, and how it 
considers a backend server alive or dead.


The servers are running IIS 6 (Win2K3), running an ASP.NET web service 
in its own application pool.  The pool is set with multiple (4, at the 
moment) worker processes.


The problem occurs when the worker processes are starting, or when they 
recycle/refresh due to memory or other thresholds set in the application 
pool.  The load balancer keeps throwing traffic at the server, even 
though it isn't ready.  It shows as an increasing number of Current 
Connections on the backend server.  Where the count normally never 
exceeds 10-15, it usually increases to a few dozen through a couple 
hundred before the worker processes finally warm up on their own.


I'm aware of ASP.NET and how it caches on the first hit (per worker). 
We have a process (using ApacheBench) to warm-up the worker processes, 
however if there are unexpected refresh/recycle events, we have to 
disable the backend server, manually warm-up the worker processes, then 
add it back.  Quite hectic.


The issues with the application cannot be resolved (not our 
application).  We want the load balancer to stop sending traffic to a 
server that is not responding to requests promptly, but still provide a 
way for the load balancer to assist with the warm-up process.


I have our current haproxy.cfg file, and one that I'm trying to test 
with (on a secondary system).  If this is the correct forum to address 
this issue, I can forward it as soon as I sanitize it a bit.


Thanks in advance for any help you might be able to provide.

--
Ricky Boone



Re: Haproxy stats page incomplete (1.4.17)

2011-10-10 Thread kristof . alentijns
Hi Cyril, 

I removed the nolinger option but I still seem to have the same problem. 

Thanks, 




From:   Cyril Bonté 
To: kristof.alenti...@numius.eu
Cc: haproxy@formilux.org
Date:   10/10/2011 12:07
Subject:Re: Haproxy stats page incomplete (1.4.17)



Hi Kristof, 

On Monday, 10 October 2011 at 11:47:19, kristof.alenti...@numius.eu wrote:
> Hey, 
> 
> I am having some problems with the HAProxy stats page. I often have to 
> refresh a couple of times before it displays. Also it seems like it is 
> incomplete as there is no table beneath the "General process information" 
> part and no "Display option" and "External ressources" links. When looking 
> at the source code of the page, the html seems to stop in the middle of a 
> line before the page is finished. When comparing with another HAProxy 
> server we use, there definitely seems to be some html missing. We are 
> using following settings:

I'd suggest removing the "option nolinger" line, which can produce 
such side effects.

-- 
Cyril Bonté

__
This email has been scanned by the MessageLabs Email Security System.
For more information please visit http://www.messagelabs.com/email 
__


Kristof Alentijns - consultant 

Greenhill Campus
Interleuvenlaan 15D - 3001 Heverlee - Belgium 
[M] +32 479 09 30 48  [T] +32 16 20 29 05  [F] +32 16 22 58 95 

[W] www.numius.eu

Sent by mobile phone


Re: Haproxy stats page incomplete (1.4.17)

2011-10-10 Thread kristof . alentijns
No, the other server (for a different environment) uses 1.4.10. 



From:   Baptiste 
To: kristof.alenti...@numius.eu
Cc: haproxy@formilux.org
Date:   10/10/2011 12:04
Subject:Re: Haproxy stats page incomplete (1.4.17)




Hi,

Are both HAProxy instances running the same version?

cheers


Re: Haproxy stats page incomplete (1.4.17)

2011-10-10 Thread Cyril Bonté
Hi Kristof, 

On Monday, 10 October 2011 at 11:47:19, kristof.alenti...@numius.eu wrote:
> Hey, 
> 
> I am having some problems with the HAProxy stats page. I often have to 
> refresh a couple of times before it displays. Also it seems like it is 
> incomplete as there is no table beneath the "General process information" 
> part and no "Display option" and "External ressources" links. When looking 
> at the source code of the page, the html seems to stop in the middle of a 
> line before the page is finished. When comparing with another HAProxy 
> server we use, there definitely seems to be some html missing. We are 
> using following settings:

I'd suggest removing the "option nolinger" line, which can produce such 
side effects.
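
One way to compare before and after the change is to test whether the page arrives whole. The sketch below assumes a complete stats page ends with a closing </html> tag; the function name page_complete is illustrative, and the address and credentials are the ones from the posted configuration.

```shell
# page_complete
# Reads an HTML page on stdin; succeeds only if a closing </html> tag is present.
page_complete() {
    grep -qi -- '</html>'
}

# Example (not run here): fetch the stats page a few times and report the result.
# for n in 1 2 3 4 5; do
#     curl -s -u myuser:mypass 'http://172.10.15.43/haproxy?stats' | page_complete \
#         && echo complete || echo truncated
# done
```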

> 
> global
>     maxconn 4096
>     daemon
>     pidfile /var/run/haproxy.pid
>     stats socket /tmp/haproxy
> defaults
>     mode        http
>     retries     3
>     option      redispatch
>     option      httpclose
>     option      abortonclose
>     maxconn     2000
>     contimeout  5000
>     clitimeout  5
>     srvtimeout  5
> 
> listen  REPLICON_HTTP 172.10.15.43:80
>     mode    http
>     cookie  Replicon insert
>     balance roundrobin
>     option  httpclose
>     option  nolinger
>     stats   enable
>     stats   auth myuser:mypass
>     stats   uri /haproxy?stats
>     reqadd  X-Forwarded-Proto:\ http
>     server  webserver1 172.10.15.41:80 cookie w1 check inter 2000 rise 2 fall 5
>     server  webserver2 172.10.15.42:80 cookie w2 check inter 2000 rise 2 fall 5
> 
> Any ideas?
> 
> Thanks, 
> 
> Kristof

-- 
Cyril Bonté



Re: Haproxy stats page incomplete (1.4.17)

2011-10-10 Thread Baptiste
Hi,

Are both HAProxy instances running the same version?

cheers


Haproxy stats page incomplete (1.4.17)

2011-10-10 Thread kristof . alentijns
Hey, 

I am having some problems with the HAProxy stats page. I often have to 
refresh a couple of times before it displays. Also it seems like it is 
incomplete as there is no table beneath the "General process information" 
part and no "Display option" and "External ressources" links. When looking 
at the source code of the page, the html seems to stop in the middle of a 
line before the page is finished. When comparing with another HAProxy 
server we use, there definitely seems to be some html missing. We are 
using following settings:

global
    maxconn 4096
    daemon
    pidfile /var/run/haproxy.pid
    stats socket /tmp/haproxy
defaults
    mode        http
    retries     3
    option      redispatch
    option      httpclose
    option      abortonclose
    maxconn     2000
    contimeout  5000
    clitimeout  5
    srvtimeout  5

listen  REPLICON_HTTP 172.10.15.43:80
    mode    http
    cookie  Replicon insert
    balance roundrobin
    option  httpclose
    option  nolinger
    stats   enable
    stats   auth myuser:mypass
    stats   uri /haproxy?stats
    reqadd  X-Forwarded-Proto:\ http
    server  webserver1 172.10.15.41:80 cookie w1 check inter 2000 rise 2 fall 5
    server  webserver2 172.10.15.42:80 cookie w2 check inter 2000 rise 2 fall 5

Any ideas?

Thanks, 

Kristof