RE: HAProxy, multicores and EC2
Thank you.

I tried it using taskset, which was pretty easy (taskset -pc 2,3 345), where 2,3 are the CPUs to use and 345 is the PID.

/E

-----Original Message-----
From: Vincent Bernat [mailto:ber...@luffy.cx]
Sent: 9 October 2011 01:56
To: Erik Torlen
Cc: Willy Tarreau; haproxy@formilux.org
Subject: Re: HAProxy, multicores and EC2

OoO In the middle of this starry night of Sunday, 9 October 2011, around 04:24, Erik Torlen said:

> I read a lot of people that have tried stud. This example is
> interesting in this case because he assigns the
> different processes to different cores with cpuset:
> http://vincent.bernat.im/en/blog/2011-ssl-benchmark.html

> In my case, would cpuset be the same as taskset?

taskset is lower level than cpuset. You won't be able to "evade" a cpuset with taskset. But if you don't use cpuset (or cgroups), taskset should work just fine.

Here is how I do it with cpuset:

    mkdir /dev/cpuset
    mount -t cpuset cpuset /dev/cpuset
    cd /dev/cpuset

    # All system processes on CPU 7
    mkdir system
    cd system
    echo 7 > cpus
    echo 0 > mems
    # Move every existing task into the "system" cpuset
    while read i; do /bin/echo $i; done < ../tasks > tasks
    cd ..

    # One cpuset per CPU (cpu0 .. cpu7)
    for i in $(seq 0 7); do
        mkdir cpu$i
        cd cpu$i
        echo $i > cpus
        echo 0 > mems
        cd ..
    done

    [...]

    # Stud on CPU 3-6 (processes spread round-robin over cpu3 .. cpu6)
    PID=stud
    i=0
    for pid in $(pidof $PID); do
        echo $pid > /dev/cpuset/cpu$(($i + 3))/tasks
        i=$(( ($i+1) % 4))
    done

At the end, just check that the process is properly pinned to the wanted CPU with /proc/PID/status:

    Cpus_allowed_list: 5

--
Vincent Bernat ☯ http://vincent.bernat.im

Don't comment bad code - rewrite it.
 - The Elements of Programming Style (Kernighan & Plauger)
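The final check described above (reading Cpus_allowed_list from /proc/PID/status) can be scripted. What follows is only a minimal sketch, assuming a Linux /proc layout. The function name allowed_cpus and its optional second argument (an alternate status file, there only so the function can be tried on sample data) are illustrative and do not come from the original messages.

```shell
#!/bin/sh
# Print the CPU list a process may run on, as reported by the
# Cpus_allowed_list field of /proc/PID/status (Linux only).
# Usage: allowed_cpus PID [STATUS_FILE]
# STATUS_FILE is a hypothetical override for trying the function on sample data.
allowed_cpus() {
    status_file="${2:-/proc/$1/status}"
    # The field looks like "Cpus_allowed_list:<TAB>2-3"; print only the value.
    awk '$1 == "Cpus_allowed_list:" { print $2 }' "$status_file"
}

# Example: report the affinity of every running haproxy process.
# pidof prints nothing if haproxy is not running, so the loop is then skipped.
# for pid in $(pidof haproxy); do
#     echo "$pid: $(allowed_cpus "$pid")"
# done
```

Run after taskset or after writing the PID into a cpuset's tasks file, this should show the same list that was requested, for instance 2-3 for the taskset call quoted at the top of this message.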
Re: HAProxy and IIS 6
On 10/10/2011 02:53 PM, Karthik Iyer wrote:

> I am new to haproxy, But i think I can help you here. You can use a
> custom health check aspx page and make haproxy do health checks within
> certain interval of time using "http-check expect". Haproxy will take
> the node down if, reply is not returned within specified period.

That looks useful, but I'm not sure I can use it in my scenario. I would like to have the health check call a URL on the backend server that is relevant, however there's a slight problem:

1) I'm not an ASP.NET developer. I don't know the first thing about performing any specific checks within an ASP.NET application. :(

2) Not my application, just front-ending it with a load balancer (rather use HAProxy than NLB or fork out the cash for an F5 or similar).

This is a snippet of the configuration I have in production now:

http://pastebin.com/fssNNkqf

Obviously calls to a static html file aren't going to cut it, but it's the only option I've had for the time being. It's also the only way I can have my admins take the bad server "out" of the load balancer.

I have a test config, but I want to be sure I'm going about the check the right way. I just want to be sure that the load balancer doesn't send traffic to a server that is not responding quickly (still warming up or recycling), and that connections don't queue up in the Current Session counter.

http://pastebin.com/Zum0RVfH

I have this config running on the secondary load balancer (LB2), and when the backend server is having a problem LB2 marks it as down. I just want to be sure this is the right way of going about this, or if there are any other recommendations.
RE: HAProxy, multicores and EC2
Hi,

I ran some more tests, this time with taskset, to see how performance is affected.

During the tests I noticed that the connections to the backends (3 backends) were not divided equally; the counts were very different. The 1st was ~2200, the 2nd ~150 and the 3rd ~60. They also jumped a lot: on the 1st it could be 2000, down to 200, up to 3000 and so on.

I'm guessing that this could have to do with the backends being in different availability zones (which would be different datacenters), so that network latency delays connections to the machine with the longest route? (The load on the backends was equal, around 60% CPU.)

FYI, haproxy is located in Amazon East, availability zone 1D. The three backends are in B, C and D. Looking at the stats from HAProxy (attached) you can see that the current connections to the backend in zone D are fairly low compared to the other zones, where zone B is worst and then zone C. Zone B is on average 4-5 times worse than zone C.

This is with nbproc=2 and using taskset to bind both haproxy processes to CPUs 2 and 3. I managed to push ~7500 req/s.

top - 14:32:45 up 4 days, 19:44,  1 user,  load average: 0.79, 0.86, 0.59
Tasks:  82 total,   3 running,  79 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.0%sy,  0.0%ni, 93.2%id,  0.0%wa,  1.5%hi,  3.8%si,  1.5%st
Cpu1  :  0.0%us,  0.5%sy,  0.0%ni, 99.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  : 23.9%us, 41.3%sy,  0.0%ni, 16.5%id,  0.0%wa,  0.0%hi, 18.3%si,  0.0%st
Cpu3  : 21.3%us, 42.6%sy,  0.0%ni, 16.7%id,  0.0%wa,  0.0%hi, 19.4%si,  0.0%st
Mem:  15374136k total,  1172536k used, 14201600k free,    52024k buffers
Swap:        0k total,        0k used,        0k free,   242540k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1862 haproxy   20   0  148m  68m  652 R 93.3  0.5   5:42.97 haproxy
 1861 haproxy   20   0  135m  62m  656 R 90.4  0.4   5:35.57 haproxy

Using only nbproc=2 without taskset gave me the result below. Look at %si: most of it is on cpu0. I managed to push ~6500 req/s, less than with taskset.

top - 14:51:56 up 4 days, 20:03,  1 user,  load average: 1.76, 1.53, 0.98
Tasks:  82 total,   3 running,  79 sleeping,   0 stopped,   0 zombie
Cpu0  : 16.2%us, 21.2%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi, 62.6%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  : 22.4%us, 39.3%sy,  0.0%ni, 15.9%id,  0.0%wa,  0.0%hi, 22.4%si,  0.0%st
Mem:  15374136k total,  1216348k used, 14157788k free,    52064k buffers
Swap:        0k total,        0k used,        0k free,   242556k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1915 haproxy   20   0  181m  83m  656 R 99.3  0.6   8:19.49 haproxy
 1916 haproxy   20   0  193m  88m  652 R 90.4  0.6   8:16.89 haproxy

Using nbproc=3 and taskset=1,2,3 gave worse results than nbproc=2 and taskset=2,3.

I will run more tests with your suggestions (tcp-smart-connect + tcp-smart-accept).

/E

-----Original Message-----
From: Willy Tarreau [mailto:w...@1wt.eu]
Sent: 8 October 2011 23:09
To: Erik Torlen
Cc: haproxy@formilux.org
Subject: Re: HAProxy, multicores and EC2

On Sun, Oct 09, 2011 at 02:24:27AM +, Erik Torlen wrote:
> Thanks for the response Willy
>
> I agree with what you are saying.
> I have loadtested a lot of different machines/systems and the VMs never
> have as good performance as a physical machine. However, in this case we
> have to use Amazon so the focus is more on getting the most out of 1 single
> instance and then scaling with more machines to get more performance.

I see.

> Xtra large EC2's are "supposed" to be dedicated machines in the cloud, no
> one else should use them except for you. But if I can't get HAProxy to use
> the XL EC2 properly it could be better to have more Large instances instead
> (2 cores). That would reduce cost and make better use of the instances.

I agree. What is important in EC2 is to reduce the number of packets as much as possible, as we noticed in the past that every packet has a huge cost.
Using keep-alive with the client (option http-server-close) saves some packets on the client side and allows haproxy to use TCP RST to close the server connection and save another packet on this side. Using both "option tcp-smart-accept" and "option tcp-smart-connect" saves another packet on each side. You should notice an improvement with these.

> And make stud use one of the cores and HAProxy the other?

Yes, possibly. If you need to run a lot of SSL on the machine, then I suggest that you keep your XL machine. Recently, stud merged the patches provided by our dev team at Exceliance, allowing it to scale using multiple processes. In your case, you should stick all interrupts to core #0, haproxy to core #1 and stud to all remaining cores. That way you should get optimal performance.

> I read a lot of people that hav
Re: HAProxy and IIS 6
On Mon, Oct 10, 2011 at 10:32 PM, Ricky Boone wrote:

> I am trying to troubleshoot an issue with our load balancer, and how it
> considers a backend server alive or dead.
>
> The servers are running IIS 6 (Win2K3), running an ASP.NET web service in
> its own application pool. The pool is set with multiple (4, at the moment)
> worker processes.
>
> The problem occurs when the worker processes are starting, or when they
> recycle/refresh due to memory or other thresholds set in the application
> pool. The load balancer keeps throwing traffic at the server, even though
> it isn't ready. It shows as an increasing number of Current Connections on
> the backend server. Where the count normally never exceeds 10-15, it
> usually increases to a few dozen through a couple hundred before the worker
> processes finally warm up on their own.
>
> I'm aware of ASP.NET and how it caches on the first hit (per worker). We
> have a process (using ApacheBench) to warm-up the worker processes, however
> if there are unexpected refresh/recycle events, we have to disable the
> backend server, manually warm-up the worker processes, then add it back.
> Quite hectic.
>
> The issues with the application cannot be resolved (not our application).
> We want the load balancer to stop sending traffic to a server that is not
> responding to requests promptly, but still provide a way for the load
> balancer to assist with the warm-up process.

I am new to haproxy, but I think I can help you here. You can use a custom health check aspx page and have haproxy run health checks at a fixed interval using "http-check expect". Haproxy will take the node down if the expected reply is not returned within the specified period.

Ex:

    backend web-backend
        balance leastconn
        option httpchk GET /check.aspx HTTP/1.0
        http-check expect string Success
        server node1 192.168.8.1:80 check inter 3000 rise 2 fall 3 maxconn 250

- Karthik Iyer
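To see what the check in the example above would observe, the same request can be issued by hand and the response body tested for the expected string. This is only a rough sketch: body_matches is an illustrative helper name, and the curl command in the comment assumes the node address and the /check.aspx page from the example, with a 3-second limit chosen to mirror "inter 3000".

```shell
#!/bin/sh
# Succeed (exit 0) if the response body read from stdin contains the
# expected string, mirroring "http-check expect string Success".
# Usage: body_matches EXPECTED < body
body_matches() {
    # -F: treat EXPECTED as a fixed string, not a regex
    # -q: no output, only the exit status matters
    grep -qF -- "$1"
}

# Example (assumed address and page, taken from the example config):
# curl -s --http1.0 --max-time 3 http://192.168.8.1/check.aspx \
#     | body_matches "Success" && echo "check would pass" || echo "check would fail"
```

Note that with "rise 2 fall 3" a single failed check does not remove the server: it is marked down only after three consecutive failures, and brought back after two consecutive successes.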
HAProxy and IIS 6
I am trying to troubleshoot an issue with our load balancer, and how it considers a backend server alive or dead.

The servers are running IIS 6 (Win2K3), running an ASP.NET web service in its own application pool. The pool is set with multiple (4, at the moment) worker processes.

The problem occurs when the worker processes are starting, or when they recycle/refresh due to memory or other thresholds set in the application pool. The load balancer keeps throwing traffic at the server, even though it isn't ready. It shows as an increasing number of Current Connections on the backend server. Where the count normally never exceeds 10-15, it usually increases to a few dozen through a couple hundred before the worker processes finally warm up on their own.

I'm aware of ASP.NET and how it caches on the first hit (per worker). We have a process (using ApacheBench) to warm-up the worker processes, however if there are unexpected refresh/recycle events, we have to disable the backend server, manually warm-up the worker processes, then add it back. Quite hectic.

The issues with the application cannot be resolved (not our application). We want the load balancer to stop sending traffic to a server that is not responding to requests promptly, but still provide a way for the load balancer to assist with the warm-up process.

I have our current haproxy.cfg file, and one that I'm trying to test with (on a secondary system). If this is the correct forum to address this issue, I can forward it as soon as I sanitize it a bit.

Thanks in advance for any help you might be able to provide.

--
Ricky Boone
Re: Haproxy stats page incomplete (1.4.17)
Hi Cyril,

I removed the nolinger option but I still seem to have the same problem.

Thanks,

From: Cyril Bonté
To: kristof.alenti...@numius.eu
Cc: haproxy@formilux.org
Date: 10/10/2011 12:07
Subject: Re: Haproxy stats page incomplete (1.4.17)

Hi Kristof,

On Monday 10 October 2011 11:47:19, kristof.alenti...@numius.eu wrote:
> Hey,
>
> I am having some problems with the HAProxy stats page. I often have to
> refresh a couple of times before it displays. Also it seems like it is
> incomplete as there is no table beneath the "General process information"
> part and no "Display option" and "External ressources" links. When looking
> at the source code of the page, the html seems to stop in the middle of a
> line before the page is finished. When comparing with another HAProxy
> server we use, there definitely seems to be some html missing. We are
> using following settings:

I'd suggest you remove the "option nolinger" line, which can produce such side effects.

> global
>     maxconn 4096
>     daemon
>     pidfile /var/run/haproxy.pid
>     stats socket /tmp/haproxy
>
> defaults
>     mode http
>     retries 3
>     option redispatch
>     option httpclose
>     option abortonclose
>     maxconn 2000
>     contimeout 5000
>     clitimeout 5
>     srvtimeout 5
>
> listen REPLICON_HTTP 172.10.15.43:80
>     mode http
>     cookie Replicon insert
>     balance roundrobin
>     option httpclose
>     option nolinger
>     stats enable
>     stats auth myuser:mypass
>     stats uri /haproxy?stats
>     reqadd X-Forwarded-Proto:\ http
>     server webserver1 172.10.15.41:80 cookie w1 check inter 2000 rise 2 fall 5
>     server webserver2 172.10.15.42:80 cookie w2 check inter 2000 rise 2 fall 5
>
> Any ideas?
>
> Thanks,
>
> Kristof

--
Cyril Bonté

Kristof Alentijns - consultant
Greenhill Campus
Interleuvenlaan 15D - 3001 Heverlee - Belgium
[M] +32 479 09 30 48 [T] +32 16 20 29 05 [F] +32 16 22 58 95
[W] www.numius.eu
Sent by mobile phone
Re: Haproxy stats page incomplete (1.4.17)
No, the other server (for a different environment) uses 1.4.10.

From: Baptiste
To: kristof.alenti...@numius.eu
Cc: haproxy@formilux.org
Date: 10/10/2011 12:04
Subject: Re: Haproxy stats page incomplete (1.4.17)

Hi,

Are both HAProxy instances on the same version?

cheers

Kristof Alentijns - consultant
Greenhill Campus
Interleuvenlaan 15D - 3001 Heverlee - Belgium
[M] +32 479 09 30 48 [T] +32 16 20 29 05 [F] +32 16 22 58 95
[W] www.numius.eu
Sent by mobile phone
Re: Haproxy stats page incomplete (1.4.17)
Hi Kristof,

On Monday 10 October 2011 11:47:19, kristof.alenti...@numius.eu wrote:
> Hey,
>
> I am having some problems with the HAProxy stats page. I often have to
> refresh a couple of times before it displays. Also it seems like it is
> incomplete as there is no table beneath the "General process information"
> part and no "Display option" and "External ressources" links. When looking
> at the source code of the page, the html seems to stop in the middle of a
> line before the page is finished. When comparing with another HAProxy
> server we use, there definitely seems to be some html missing. We are
> using following settings:

I'd suggest you remove the "option nolinger" line, which can produce such side effects.

> global
>     maxconn 4096
>     daemon
>     pidfile /var/run/haproxy.pid
>     stats socket /tmp/haproxy
>
> defaults
>     mode http
>     retries 3
>     option redispatch
>     option httpclose
>     option abortonclose
>     maxconn 2000
>     contimeout 5000
>     clitimeout 5
>     srvtimeout 5
>
> listen REPLICON_HTTP 172.10.15.43:80
>     mode http
>     cookie Replicon insert
>     balance roundrobin
>     option httpclose
>     option nolinger
>     stats enable
>     stats auth myuser:mypass
>     stats uri /haproxy?stats
>     reqadd X-Forwarded-Proto:\ http
>     server webserver1 172.10.15.41:80 cookie w1 check inter 2000 rise 2 fall 5
>     server webserver2 172.10.15.42:80 cookie w2 check inter 2000 rise 2 fall 5
>
> Any ideas?
>
> Thanks,
>
> Kristof

--
Cyril Bonté
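Before removing the directive it can help to confirm every place where it is set, since "option nolinger" may appear in a defaults section as well as in individual listen sections. A small sketch follows; find_nolinger is an illustrative name and the config path in the example is an assumption, not something stated in the thread.

```shell
#!/bin/sh
# Print "line-number:line" for every active "option nolinger" directive
# in an HAProxy configuration file. Commented-out lines are ignored.
# Usage: find_nolinger CONFIG_FILE
find_nolinger() {
    grep -nE '^[[:space:]]*option[[:space:]]+nolinger([[:space:]]|$)' "$1"
}

# Example (assumed path):
# find_nolinger /etc/haproxy/haproxy.cfg
```

The function exits non-zero when the directive is absent, so it can also be used as a quick guard after editing the file and before reloading haproxy.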
Re: Haproxy stats page incomplete (1.4.17)
Hi,

Are both HAProxy instances on the same version?

cheers
Haproxy stats page incomplete (1.4.17)
Hey,

I am having some problems with the HAProxy stats page. I often have to refresh a couple of times before it displays. Also it seems like it is incomplete as there is no table beneath the "General process information" part and no "Display option" and "External ressources" links. When looking at the source code of the page, the html seems to stop in the middle of a line before the page is finished. When comparing with another HAProxy server we use, there definitely seems to be some html missing. We are using following settings:

global
    maxconn 4096
    daemon
    pidfile /var/run/haproxy.pid
    stats socket /tmp/haproxy

defaults
    mode http
    retries 3
    option redispatch
    option httpclose
    option abortonclose
    maxconn 2000
    contimeout 5000
    clitimeout 5
    srvtimeout 5

listen REPLICON_HTTP 172.10.15.43:80
    mode http
    cookie Replicon insert
    balance roundrobin
    option httpclose
    option nolinger
    stats enable
    stats auth myuser:mypass
    stats uri /haproxy?stats
    reqadd X-Forwarded-Proto:\ http
    server webserver1 172.10.15.41:80 cookie w1 check inter 2000 rise 2 fall 5
    server webserver2 172.10.15.42:80 cookie w2 check inter 2000 rise 2 fall 5

Any ideas?

Thanks,

Kristof

Kristof Alentijns - consultant
Greenhill Campus
Interleuvenlaan 15D - 3001 Heverlee - Belgium
[M] +32 479 09 30 48 [T] +32 16 20 29 05 [F] +32 16 22 58 95
[W] www.numius.eu
Sent by mobile phone
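The truncation described in this message can be detected mechanically by checking whether the fetched page still contains its closing </html> tag. A minimal sketch: page_is_complete is an illustrative name, and the curl command in the comment reuses the address, stats URI and credentials from the configuration in this message.

```shell
#!/bin/sh
# Succeed (exit 0) if the HTML document read from stdin contains a closing
# </html> tag, i.e. the page was not cut off before its end.
page_is_complete() {
    # -i: match the tag regardless of case; -q: exit status only
    grep -qi '</html>'
}

# Example: fetch the stats page a few times and report truncated replies.
# for n in 1 2 3 4 5; do
#     curl -s -u myuser:mypass 'http://172.10.15.43/haproxy?stats' \
#         | page_is_complete || echo "attempt $n: truncated"
# done
```

Repeating the fetch matters here because the problem is intermittent: a single complete page does not show that the issue is gone.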