500s with 1.4.18 and 1.5d7
I am not sure if these counts are exceeding the "never" threshold - the docs say a 500 is returned "when haproxy encounters an unrecoverable internal error, such as a memory allocation failure, which should never happen". I am not sure what I can do to troubleshoot this since it is in prod :( Is there a way to set it to core dump and die when it has a 500?

Past few days:

today so far:
      12  -1
  513153  200
     137  206
    8051  302
    1277  304
     127  400
      22  403
     790  404
      35  408
      32  500
       1  503
       7  504

yesterday:
     3456  -1
 14697297  200
     4243  206
        1  301
   257865  302
    54130  304
     1579  400
     1002  403
    27800  404
     1438  408
      130  416
     1138  500
        5  501
       18  502
      140  503
     1788  504

day before:

1.4.18:
      514  -1
  3221607  200
     1032  206
    55671  302
     3514  304
      283  400
      165  403
     5691  404
      196  408
      198  500
      329  502
    22603  503
    38185  504

1.5d7:
     3704  -1
 12350739  200
     3795  206
   220736  302
    31129  304
     1013  400
     1124  403
    27887  404
     1141  408
       17  416
      950  500
       33  502
     1206  503
    39343  504

$ uname -a
Linux filbert 2.6.32.26-175.fc12.x86_64 #1 SMP Wed Dec 1 21:39:34 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux

$ free -m
             total       used       free     shared    buffers     cached
Mem:          4934        626       4307          0         63        216
-/+ buffers/cache:         347       4587
Swap:            0          0          0

$ /usr/sbin/haproxy1418 -vv
HA-Proxy version 1.4.18 2011/09/16
Copyright 2000-2011 Willy Tarreau w...@1wt.eu

Build options :
  TARGET  = linux26
  CPU     = native
  CC      = gcc
  CFLAGS  = -O2 -march=native -g -fno-strict-aliasing
  OPTIONS = USE_LINUX_SPLICE=1 USE_REGPARM=1 USE_STATIC_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200

Encrypted password support via crypt(3): yes

Available polling systems :
     sepoll : pref=400,  test result OK
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 4 (4 usable), will use sepoll.
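(For reference, a counting sketch: with the stock syslog + httplog prefix, the HTTP status code lands in the 11th whitespace-separated field of each log line, so per-status tallies like the above can be produced with a one-liner; the field position is an assumption based on the default log layout and the log path is illustrative:)

  # count requests per HTTP status code from an haproxy httplog file,
  # sorted numerically by status; $11 assumes the default httplog prefix
  awk '{print $11}' /var/log/haproxy.log | sort | uniq -c | sort -k2,2n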
Re: 500s with 1.4.18 and 1.5d7
On 10/3/11 12:19 PM, Brane F. Gračnar wrote: On Monday 03 of October 2011 20:09:17 Hank A. Paulson wrote: I am not sure if these counts are exceeding the "never" threshold - the docs say a 500 is returned "when haproxy encounters an unrecoverable internal error, such as a memory allocation failure, which should never happen". I am not sure what I can do to troubleshoot this since it is in prod :( Is there a way to set it to core dump and die when it has a 500? Are you sure that these are not upstream server 500 errors? Best regards, Brane Good point, I don't know how to differentiate from the haproxy logs which 500s originate from haproxy and which are passed through from the backend servers. I wish there was an easy way to tell, since haproxy 500s are much more worrisome. Maybe I am missing something...
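(A hedged suggestion for the question above: the per-proxy hrsp_5xx counters on the stats socket can narrow this down - if the servers' 5xx counts add up to the frontend's total, nothing originated inside haproxy. A sketch, assuming a stats socket is configured at /tmp/hap.sock:)

  # dump the CSV stats and compare the hrsp_5xx column for the frontend
  # against the per-server rows; a surplus on the frontend side would
  # point at responses generated by haproxy itself
  echo "show stat" | socat unix-connect:/tmp/hap.sock stdio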
Re: Caching
You can get weird results like this sometimes if you don't use httpclose or any other http closing option on http backends. You should paste your config. Maybe there should be a warning for that situation, if there is not already - maybe just when running -c. On 9/19/11 5:46 AM, Christophe Rahier wrote: I don't use Apache but IIS. I tried to disable caching on IIS but the problem is still there. There's no proxy, all requests are sent from pfSense. Christophe On 19/09/11 13:45, Baptiste bed...@gmail.com wrote: hi Christophe, HAProxy is *only* a reverse proxy. No caching functions in it. Have you tried to browse your backend servers directly? Can it be related to your browser's cache? cheers On Mon, Sep 19, 2011 at 1:39 PM, Christophe Rahier christo...@qualifio.com wrote: Hi, Is there a caching system in HAProxy? In fact, we find that when we put new files online (CSS, for example), they are not picked up directly; it usually takes about ten minutes. Thank you in advance for your help. Christophe
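(For what it's worth, a minimal sketch of the classic closing option mentioned above; option http-server-close is the keep-alive-friendly variant:)

  # close each HTTP transaction instead of tunnelling the rest of the
  # connection; without a closing option, 1.4-era haproxy only analyzes
  # the first request of a keep-alive connection
  defaults
      mode http
      option httpclose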
http_req_first
can you provide some valid examples of using http_req_first?

  acl aclX http_req_first

or

  use_backend beX if http_req_first

does not seem to work for me in 1.4.17. Thanks.
Re: (in)sanity check on hdr_cnt
On 9/9/11 2:13 AM, Willy Tarreau wrote: Hi Hank, On Thu, Sep 08, 2011 at 07:12:29PM -0700, Hank A. Paulson wrote: Whether I have the rules in the backend or the front does not seem to make a difference - I tried some rules in front and back and neither worked. Maybe I am missing something obvious. You're using rspadd, not reqadd ! You're trying to add a header in the response based on contents from the request. I think you copy-pasted this part from another config ! Use reqadd instead of rspadd and it will work (it does here). Regards, Willy Argh - I do actually want to send a certain rsp header based on the req hdrs, I thought I could do that :( So there is no link between req and rsp? Is there a trick to do it in 1.5? Thanks.
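(To make the reqadd-vs-rspadd point above concrete, a minimal sketch of the direction that works - a request-side ACL feeding a request-header addition; the names are illustrative:)

  # request-side ACL driving a request header (works); the same ACL
  # driving rspadd was the broken combination described above
  acl has_host hdr_cnt(Host) ge 1
  reqadd X-Has-Host:\ YES if has_host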
(in)sanity check on hdr_cnt
does hdr_cnt not work or am I just completely unable to get an example that works? I can't imagine it doesn't work, but I have tried _many_ combinations and nothing seems to work (maybe it is the 40+ hrs):

  acl hdrcnttest hdr_cnt gt 0
  reqadd x-has-host:\ YES if hdrcnttest

  acl hdrcnttest hdr_cnt(host) gt 0
  reqadd x-has-host:\ YES if hdrcnttest

  acl hdrcnttest hdr_cnt(Host) gt 0
  reqadd x-has-host:\ YES if hdrcnttest

  acl hdrcnttest hdr_cnt(Host) 1
  reqadd x-has-host:\ YES if hdrcnttest

  reqadd x-has-host:\ YES if { hdr_cnt(Host) gt 0 }
  reqadd x-has-host:\ YES if { hdr_cnt(Host:) gt 0 }

Nothing seems to work. I tried 1.4.15 and 1.4.17, and I recompiled 1.4.17 without any options at all for make except linux26. Other acl criteria seem to work as normal, just hdr_cnt... Thanks.
Re: (in)sanity check on hdr_cnt
Whether I have the rules in the backend or the front does not seem to make a difference - I tried some rules in front and back and neither worked. Maybe I am missing something obvious. Thanks. Example with config:

[haproxy]# wget -S -O - http://10.1.1.251:82/blank.gif
--2011-09-08 19:00:59-- http://10.1.1.251:82/blank.gif
Connecting to 10.1.1.251:82... connected.
HTTP request sent, awaiting response...

T 10.1.1.251:12427 - 10.1.1.251:82 [AP]
GET /blank.gif HTTP/1.0..User-Agent: Wget/1.12 (linux-gnu)..Accept: */*..Host: 10.1.1.251:82..Connection: Keep-Alive

HTTP/1.0 200 OK
Server: thttpd
Content-Type: image/gif
Date: Fri, 09 Sep 2011 02:00:59 GMT
Last-Modified: Wed, 07 Sep 2011 17:17:06 GMT
Accept-Ranges: bytes
Content-Length: 43
X-nohdrsub: 1          (the only rsp hdr added is the negation of a hdr* acl)
Connection: keep-alive
Length: 43 [image/gif]
Saving to: "STDOUT"
2011-09-08 19:00:59 (8.57 MB/s) - written to stdout [43/43]

config file:

defaults
    #option splice-auto
    option tcp-smart-connect
    option http-server-close
    timeout queue 27s
    timeout http-request 5s
    timeout client 33s
    timeout connect 8s
    timeout server 33s
    timeout http-keep-alive 77s
    timeout tarpit 190s

global
    node hdr_cnt
    description hdr_cnt
    log localhost local1
    # log localhost local1 err
    maxconn 32768
    uid 99
    gid 99
    chroot /var/empty
    pidfile /var/run/haproxy.pid
    stats socket /tmp/hap.sock
    daemon
    quiet
    spread-checks 6

frontend hdr_cnt
    bind 10.0.1.251:82
    bind 10.0.1.252:82
    bind 10.0.1.253:82
    mode http
    log global
    option httplog
    option http-server-close
    option log-separate-errors
    maxconn 32768
    capture request header Host len 32
    capture request header User-Agent len 256
    capture request header Content-Length len 10
    capture request header Referer len 384
    capture request header Via len 64
    capture request header Cookie len 128
    capture response header Content-Length len 10
    default_backend www

backend www
    mode http
    balance roundrobin
    server www1 127.0.0.1:81 maxconn 10
    option http-server-close
    acl hashosthdr_via_hdrcntge1 hdr_cnt(Host) ge 1
    acl hashosthdr_via_hdrcntlt9 hdr_cnt(Host) lt 9
    acl hashosthdr_via_hdrsub hdr_sub(host) -i 10.1
    acl hasuahdr_via_hdrcntge1 hdr_cnt(User-Agent) ge 1
    acl hasuahdr_via_hdrcnt1 hdr_cnt(User-Agent) 1
    rspadd X-gothdrcntge1:\ 1 if hashosthdr_via_hdrcntge1
    rspadd X-gothdrcntlt9:\ 1 if hashosthdr_via_hdrcntlt9
    rspadd X-gothdrsub:\ 1 if hashosthdr_via_hdrsub
    rspadd X-nohdrsub:\ 1 if !hashosthdr_via_hdrsub
    rspadd X-gotuahdrcntge1:\ 1 if hasuahdr_via_hdrcntge1
    rspadd X-gotuahdrcnt1:\ 1 if hasuahdr_via_hdrcnt1

On 9/8/11 6:49 AM, Baptiste wrote: hi, where are you doing your ACLs? Frontend or backend? cheers On Thu, Sep 8, 2011 at 3:06 PM, Hank A. Paulson h...@spamproof.nospammail.net wrote: does hdr_cnt not work or am I just completely unable to get an example that works? I can't imagine it doesn't work, but I have tried _many_ combinations and nothing seems to work (maybe it is the 40+ hrs):

  acl hdrcnttest hdr_cnt gt 0
  reqadd x-has-host:\ YES if hdrcnttest

  acl hdrcnttest hdr_cnt(host) gt 0
  reqadd x-has-host:\ YES if hdrcnttest

  acl hdrcnttest hdr_cnt(Host) gt 0
  reqadd x-has-host:\ YES if hdrcnttest

  acl hdrcnttest hdr_cnt(Host) 1
  reqadd x-has-host:\ YES if hdrcnttest

  reqadd x-has-host:\ YES if { hdr_cnt(Host) gt 0 }
  reqadd x-has-host:\ YES if { hdr_cnt(Host:) gt 0 }

Nothing seems to work. I tried 1.4.15 and 1.4.17, and I recompiled 1.4.17 without any options at all for make except linux26. Other acl criteria seem to work as normal, just hdr_cnt... Thanks.
Re: cookie-less sessions
On 8/5/11 3:01 PM, Baptiste wrote: Hi Hank Actually stick on URL param should work with clients which do not support cookies. is the first reply a 30[12] ? So you are saying that stick on URL param reads the outgoing 302 and saves the URL param from that in the stick table on 1.5? If so, great, then problem solved. If it doesn't save it on the way out from the initial redirect then it won't help. Is the same supposed to happen with balance url_param on 1.4? If not, I will switch to 1.5. If it is supposed to, it doesn't, afaict. How is the user aware of the jsid, or how is he supposed to send his jsid to the server? 302 to the URL with the jsid URL param. Thanks Do you have an X-Forwarded-For on your proxy or can you set one up? cheers
Re: cookie-less sessions
On 8/6/11 12:32 AM, Willy Tarreau wrote: Hi Baptiste, On Sat, Aug 06, 2011 at 09:24:08AM +0200, Baptiste wrote: On Sat, Aug 6, 2011 at 8:51 AM, Hank A. Paulson h...@spamproof.nospammail.net wrote: On 8/5/11 3:01 PM, Baptiste wrote: Hi Hank Actually stick on URL param should work with clients which do not support cookies. is the first reply a 30[12] ? So you are saying that stick on URL param reads the outgoing 302 and saves the URL param from that in the stick table on 1.5? If so, great, then problem solved. If it doesn't save it on the way out from the initial redirect then it won't help. Is the same supposed to happen with balance url_param on 1.4? If not, I will switch to 1.5. If it is supposed to, it doesn't, afaict. How is the user aware of the jsid, or how is he supposed to send his jsid to the server? 302 to the URL with the jsid URL param. Thanks Do you have an X-Forwarded-For on your proxy or can you set one up? cheers Well, I'm thinking of something, let me run some tests and I'll come back to you with good or bad news. Right now I see no way to do that. We'd need to extract the url_param from the Location header; this would be a new pattern. I think it's not too hard to implement. We already have url_param for the request, we could have hdr_url_param(header_name) or something like this. Regards, Willy but it would have to be a combo thing, right? - set up a stick table entry on the outgoing 302 with url_param blah in the Location header, then check incoming requests for url_param blah and continue to stick those based on the entry created by the initial outgoing response. I don't know if you'd do that as 2 rules or one. This would be a great and wondrous thing for cookie-less clients and decoupled servers - not to mention it may be a unique feature among existing load balancing products.
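(Purely to illustrate the "combo" idea above - this is hypothetical syntax: hdr_url_param did not exist at the time, and the store/match split is only sketched from 1.5's stick directives:)

  # hypothetical: learn the jsid from the outgoing 302's Location header,
  # then stick incoming requests that carry the same jsid url_param
  stick-table type string len 32 size 200k expire 30m
  stick store-response hdr_url_param(Location,jsid)   # hypothetical fetch
  stick match url_param(jsid)                         # request side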
Re: nbproc1, ksoftirqd and response time - just can't get it
On 7/18/11 5:25 PM, Dmitriy Samsonov wrote: My final task is to handle DDoS attacks with a flexible and robust filter. Haproxy is already helping me to stay alive under ~8-10k DDoS bots (I'm using two servers and DNS RR in production), but attackers are not sleeping and I'm expecting attacks to continue with more bots. I bet they will stop at 20-25k bots. Such a botnet will generate approx. 500k session rate and ~1Gbps bandwidth, so I was dreaming to handle it on this one server with two NICs bonded giving me 2Gbps for traffic :) I think if that is your goal then you should definitely move to the Intel NICs; people seem to have problems with those bnx NICs on Linux. Since you are using a new-ish kernel, you might also want to look at the splice options and the smart accept/smart* options for haproxy. Since DDoS mitigation is your goal, if you have the money you may want to try the 10Gb NICs, since as Willy said they seem to perform better even at lower levels. If you have a non-Dell machine with fewer cores and a faster processor, you might want to test that to see if it will work better in this scenario. Also, on all machines, try with hyperthreading on/off at the BIOS level to see if that makes a difference. And you can reduce the cores/cpus used in the BIOS and grub level settings, so you might try going down to 2 cores, 1 cpu, no hyperthreading and see if that makes a difference. Also, if you do use an Intel card/Intel onboard NIC, there are some settings (IT/AO) that may affect performance. If this is for DDoS mitigation, for the majority of the connections are you going to be tarpitting, blocking, or passing them on to a real backend server? You may be testing a scenario that does not map well to your real world usage. I would suggest putting keepalived on the current machine (if there is one) and any new machine you are thinking of using to replace the existing one; then you can switch to the new one easily and switch back if you find any show stopper issues. Also, for DDoS mitigation you probably want to increase these:

  net.ipv4.tcp_max_syn_backlog
  net.ipv4.ip_local_port_range

Here is a facebook note about scaling memcache connections per second: http://www.facebook.com/note.php?note_id=39391378919
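(Illustrative values for the two sysctls named above - tune for your own workload:)

  # /etc/sysctl.conf -- widen the SYN backlog and the ephemeral port range
  net.ipv4.tcp_max_syn_backlog = 65536
  net.ipv4.ip_local_port_range = 1024 65535

  # apply without rebooting
  sysctl -p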
Re: more than one haproxy instance on one host/IP
On 7/11/11 11:22 AM, James Bardin wrote: On Mon, Jul 11, 2011 at 2:18 PM, Alexander Hollerith alex.holler...@gmail.com wrote: Thank you very much for pointing me in that direction. I think that definitely answers my question. Since haproxy itself might keep more than one process alive after dealing with an -sf (at least for as long as it takes to finish the work), I assume that keeping more than one process alive, in principle, can't be a problem :) Another FYI: the included init script does this automatically on reload, and prefaces it with a config check to prevent killing the process altogether. -jim AFAIK, if there are config problems haproxy won't start a new process, but it also won't kill the old one when using -sf.
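(The check-then-reload sequence described above boils down to something like this; the config and pidfile paths are the usual ones, adjust as needed:)

  # validate the new config first; only then ask the old process to
  # finish its connections and hand the ports to the new one
  haproxy -c -f /etc/haproxy/haproxy.cfg && \
      haproxy -f /etc/haproxy/haproxy.cfg -sf $(cat /var/run/haproxy.pid)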
Re: HAProxy - 504 Gateway Timeout error.
Try adding:

  option httplog

under your listen. I am not sure what haproxy does if you say tcplog after saying httplog, so you want to make sure you have httplog, since those log entries provide more info. Run with option httplog on the listen during the busy time and post some examples of the full log entries for the 504s - obfuscated as needed. There are 6 or 8 fields that should give some clues to loading, timing, tcp connection disposition and other potential issues. If you switch to a frontend/backend config, I think the haproxy stats page provides slightly more info, but I don't use listen so I am not positive. If you have a heavy / page, even HEADs every 2 seconds might be some load (because AFAIK php has to spin the whole page to know if it has changed, depending on the frameworks used), maybe not. Remember, load can be low on the machines/jails and they might still be near or at their limit for sockets, file descriptors, etc - so be sure to check those. Also you can obviously watch for the errors as they happen with something like:

  tail -f /var/run/log | fgrep 504 | more

On 7/6/11 2:44 AM, Gi Dot wrote: Hi, We have recently migrated our game servers from Linux to FreeBSD. We have 8 web servers running in jails, with HAProxy as load balancer. We also have CARP configured in case of network failover. carp is running as master on the 1st server (webm01), and backup on the 2nd server (webm02). haproxy on both servers is actively running, though only one is working at a time, depending on which server carp is acting as master on. Both servers have pf running as well. We are running FreeBSD 8.2-RELEASE, haproxy-1.4.15, apache-2.2.19 and the game is php coded. Our network architecture is as follows. There is a backend database running as well on a jail in a different server, which I excluded from the diagram (hope the ascii diagram will be displayed well in the mail):

                                         +-- wj01
          (webm01)                       |-- wj02
        +-- carp haproxy --+             |-- wj03
  user -+                  +-------------|-- wj04
        +-- carp haproxy --+             |-- wj05
          (webm02)                       |-- wj06
                                         |-- wj07
                                         +-- wj08

Our main problem at the moment is that a lot of users (more than a hundred) have complained that they are getting a 504 Gateway Timeout error. This normally happens at night (CEST), when most players start playing the game. However, the load of our servers is consistently low at all times. At the moment there is no obvious pattern as to when this error occurs.
Here is our haproxy.conf:

global
    log /var/run/log local0 notice
    maxconn 4096
    daemon
    chroot /var/run/haproxy
    user haproxy
    group haproxy
    stats socket /var/run/haproxy/haproxy.sock uid 1005 gid 1005

defaults
    log global
    mode http
    option httpclose
    option forwardfor
    option httplog
    option tcplog
    option dontlognull
    option tcpka
    retries 3
    option redispatch
    maxconn 2000
    timeout connect 5000
    timeout client 5
    timeout server 5

listen webjailfarm 78.xx.xx.xx:80
    mode http
    cookie SERVERID insert nocache indirect
    balance roundrobin
    option httpclose
    option forwardfor
    option httpchk HEAD / HTTP/1.0
    stats uri /haproxy-status
    stats enable
    stats auth admin:password
    server wj01 192.168.30.10:80 cookie A weight 10 check inter 2000 rise 2 fall 2
    server wj02 192.168.30.20:80 cookie B weight 10 check inter 2000 rise 2 fall 2
    server wj03 192.168.30.30:80 cookie C weight 10 check inter 2000 rise 2 fall 2
    server wj04 192.168.30.40:80 cookie D weight 10 check inter 2000 rise 2 fall 2
    server wj05 192.168.30.50:80 cookie E weight 10 check inter 2000 rise 2 fall 2
    server wj06 192.168.30.60:80 cookie F weight 10 check inter 2000 rise 2 fall 2
    server wj07 192.168.30.70:80 cookie G weight 10 check inter 2000 rise 2 fall 2
    server wj08 192.168.30.80:80 cookie H weight 10 check inter 2000 rise 2 fall 2

## And here is our pf.conf (the exact same
can't figure out why this is causing a CQ
I can't see what I am missing here. Any help is appreciated.

Jun 14 02:00:00 localhost haproxy[3052]: 10.101.1.2:2892 [14/Jun/2011:02:00:00.088] w wi/wi-9 35/111/-1/-1/146 503 212 W=9 - CQ-- 202/202/27/18/0 1/0 {w.x.y.z|Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/534.7 (KHTML, like Gecko) Chrome/1.0.11.4 Safari/534.7|1549|http://w.x.y.z/w/web/w.jsp?ID=1&sid=42||lang=en; auth=5/z/Z; W=9||} {} POST /w/web/w.jsp?ID=1&sid=42 HTTP/1.1

config:

defaults
    #option splice-auto
    option tcp-smart-connect
    option http-server-close
    timeout queue 21s
    timeout http-request 5s
    timeout client 38s
    timeout connect 8s
    timeout server 38s
    timeout http-keep-alive 8s
    timeout tarpit 120s

global
    node w1
    log localhost local0
    # log localhost local0 err
    maxconn 32768
    uid 99
    gid 99
    chroot /var/empty
    pidfile /var/run/haproxy.pid
    daemon
    quiet
    spread-checks 6

frontend w
    bind 10.1.1.1:80
    mode http
    log global
    option httplog
    option http-server-close
    option log-separate-errors
    maxconn 32768
    capture request header Host len 32
    capture request header User-Agent len 256
    capture request header Content-Length len 10
    capture request header Referer len 384
    capture request header Via len 64
    capture request header Cookie len 128
    capture request header X-xy len 64
    capture response header Content-Length len 10
    capture request header X-xx len 32
    # block any unwanted source IP addresses or networks
    acl forbidden_src src 0.0.0.0/7 224.0.0.0/3
    # acl forbidden_src src_port 0:1023
    block if forbidden_src
    default_backend wi
    capture cookie W= len 13

backend wi
    mode http
    balance roundrobin
    cookie W
    server wi-7 10.1.1.7:80 cookie 7 check observe layer7 inter 90s fastinter 12s downinter 18s rise 2 fall 4 slowstart 360s weight 100 maxconn 18 maxqueue 64
    server wi-9 10.1.1.9:80 cookie 9 check observe layer7 inter 90s fastinter 12s downinter 18s rise 2 fall 4 slowstart 360s weight 100 maxconn 18 maxqueue 64
    server wi-0 10.1.1.0:80 cookie 0 check observe layer7 inter 90s fastinter 12s downinter 18s rise 2 fall 4 slowstart 360s weight 100 maxconn 18 maxqueue 64
    server wi-1 10.1.1.1:80 cookie 1 check observe layer7 inter 90s fastinter 12s downinter 18s rise 2 fall 4 slowstart 360s weight 100 maxconn 18 maxqueue 64
    server wi-2 10.1.1.2:80 cookie 2 check observe layer7 inter 90s fastinter 12s downinter 18s rise 2 fall 4 slowstart 360s weight 100 maxconn 18 maxqueue 64
    server wi-3 10.1.1.3:80 cookie 3 check observe layer7 inter 90s fastinter 12s downinter 18s rise 2 fall 4 slowstart 360s weight 100 maxconn 18 maxqueue 64
    server wi-8 10.1.1.8:80 cookie 8 check observe layer7 inter 90s fastinter 12s downinter 18s rise 2 fall 4 slowstart 360s weight 100 maxconn 18 maxqueue 64
    #server wi-2 10.1.1.2:80 cookie 2 check observe layer7 inter 90s fastinter 12s downinter 18s rise 2 fall 4 slowstart 360s backup weight 100 maxconn 24
    option redispatch
    retries 3
    option forwardfor
    option http-server-close
    #option forceclose
    option abortonclose
country/ip database website, needs donations to keep going
I recently found this resource: http://www.countryipblocks.net/ on the day they say they are closing due to lack of donations. :( I thought other hap users might be interested in this use case and will hopefully think about donating, too. For one site targeting users in several countries (au, nz, etc.), to be able to do per-country stats I add a header based on the source IP:

  acl ipcc_au src -f /etc/haproxy/au_ips.txt
  acl ipcc_nz src -f /etc/haproxy/nz_ips.txt
  reqadd X-Country:\ au if ipcc_au
  reqadd X-Country:\ nz if ipcc_nz

With haproxy and these lists, it is fast and easy to add the country info to requests. Is anyone else doing something similar? Or other sources for this or other similar types of info?
Re: Bench of haproxy
Not necessarily a solution to this performance issue, but I was thinking about how to get to that next level of performance for haproxy. Here is an idea I had that is a bit far out. Supermicro and others now have GPU servers - TESLA from NVIDIA, etc. A project from Korea has used these GPGPUs to create a high speed Linux based router - they have achieved around 40 Gbps. They have a related package for packet I/O: "Packet I/O Engine is a high-performance device driver for Intel 82598/82599-based network interface cards. This program is based on the Intel IXGBE driver (version 2.0.38.2), but heavily modified for throughput-oriented user-level applications." They take a different approach to memory for packet processing: rather than handing out memory like a grandma slowly digging out her coin purse and handing out a few pennies each time the kids ask, they pre-allocate healthy amounts of memory. Details here: http://shader.kaist.edu/packetshader/ http://shader.kaist.edu/packetshader/io_engine/index.html Could this be a future for high speed dedicated haproxy machines - some interfaces are taken over by haproxy and dedicated to it using something similar to the above, while other interfaces are left to the normal kernel for management, logging, etc.? Maybe not, but there might be some usable ideas there. 12 Core CPU, 2 x16 PCIe, one for GPU, one for Intel 10GB X520-T2 Card: http://www.supermicro.com/Aplus/system/1U/1122/AS-1122GG-TF.cfm On 5/6/11 11:25 PM, Jason J. W. Williams wrote: Generally the Caviums are used for SSL offload. The CPUs in F5s generally do the bulk of the L7 + iRules application. -J Sent via iPhone Is your e-mail Premiere? On May 7, 2011, at 0:06, Baptiste bed...@gmail.com wrote: On Sat, May 7, 2011 at 12:14 AM, Vincent Bernat ber...@luffy.cx wrote: On this well-advanced Friday evening of May 6, 2011, around 22:46, Baptiste bed...@gmail.com said: It seems that the CPU speed of your F5 3900 is 2.4GHz with 8G of memory. The F5 is using some Cavium chip to forward requests. The main processor is mainly used for the web interface, which can be pretty slow. ;-) -- mmm... I thought the Cavium would be used for L4 balancing only. But it seems they can do layer 7 as well within the chip: http://www.caviumnetworks.com/processor_NITROX-DPI.html Must be quite expensive :D To come back to haproxy, since it's event driven, the faster the CPU, the more requests it will handle :) and the more memory you have in your chassis, the more TCP connections you'll be able to maintain. Good luck with your testing. Baptiste I WILL NOT BARF UNLESS I'M SICK I WILL NOT BARF UNLESS I'M SICK I WILL NOT BARF UNLESS I'M SICK -+- Bart Simpson on chalkboard in episode 8F15
Re: [ANNOUNCE] haproxy 1.4.12
I got a segfault at start up when parsing a config that uses pattern files. Same config runs under 1.4.10. Commenting out that line prevents the segfault. Sending more info directly to Willy. On 3/8/11 2:18 PM, Willy Tarreau wrote: Hi, I'm announcing haproxy 1.4.12. I know I did not take the time to announce 1.4.11 to the list when I released it one month ago, but now spare time seems to get available again, so here are the two announcements at once. First, here's the short changelog between 1.4.10 and 1.4.11 :

  - [MINOR] cfgparse: Check whether the path given for the stats socket
  - [DOC] fix a minor typo
  - [DOC] fix ignore-persist documentation
  - [BUG] http: fix http-pretend-keepalive and httpclose/tunnel mode
  - [MINOR] add warnings on features not compatible with multi-process mode
  - [MINOR] acl: add be_id/srv_id to match backend's and server's id
  - [MINOR] log: add support for passing the forwarded hostname
  - [MINOR] log: ability to override the syslog tag
  - [DOC] fix minor typos in the doc
  - [DOC] fix another typo in the doc
  - [BUG] http chunking: don't report a parsing error on connection errors
  - [BUG] stream_interface: truncate buffers when sending error messages
  - [BUG] http: fix incorrect error reporting during data transfers
  - [CRITICAL] session: correctly leave turn-around and queue states on abort
  - [BUG] session: release slot before processing pending connections
  - [MINOR] stats: report HTTP message state and buffer flags in error dumps
  - [MINOR] http: support wrapping messages in error captures
  - [MINOR] http: capture incorrectly chunked message bodies
  - [MINOR] stats: add global event ID and count
  - [OPTIM] http: don't send each chunk in a separate packet
  - [BUG] acl: fix handling of empty lines in pattern files
  - [BUG] ebtree: fix ebmb_lookup() with len smaller than the tree's keys
  - [OPTIM] ebtree: ebmb_lookup: reduce stack usage by moving the return code out of the loop

Second, here's the same changelog for 1.4.11 to 1.4.12 :

  - [MINOR] stats: add support for several packets in stats admin
  - [BUG] stats: admin commands must check the proxy state
  - [BUG] stats: admin web interface must check the proxy state
  - [BUG] http: update the header list's tail when removing the last header
  - [DOC] fix typos (http-request instead of http-check)
  - [BUG] http: use correct ACL pointer when evaluating authentication
  - [BUG] cfgparse: correctly count one socket per port in ranges
  - [BUG] startup: set the rlimits before binding ports, not after.
  - [BUG] acl: srv_id must return no match when the server is NULL
  - [BUG] acl: fd leak when reading patterns from file
  - [DOC] fix minor typo in usesrc
  - [BUG] http: fix possible incorrect forwarded wrapping chunk size
  - [BUG] http: fix computation of message body length after forwarding has started
  - [BUG] http: balance url_param did not work with first parameters on POST
  - [TESTS] update the url_param regression test to test check_post too

As you can see, those are quite a number of bugs. It does not mean the product is getting worse, rather that users are exploiting more and more of its possibilities, that the quality of bug reports really increases, and that we're discovering bugs in the code when developing on new branches. Many of the issues above are really just minor annoyances (mainly wrong flags reported in the logs for some errors, for instance). They're still worth upgrading for, at least to avoid wasting time trying to debug wrong issues.
However, there are three in 1.4.11 that I consider more important depending on the use cases :

  - 1.4.11 : http-pretend-keepalive was definitely fixed for httpclose and tunnel modes. Prior to 1.4.11, it was still possible to see some requests wait for a timeout with some setups and combinations of clients/servers. Cyril did an extensive work at testing all imaginable combinations and all happened to work as expected.

  - 1.4.11 : fix for correctly leaving the turn-around state. While working on 1.5, I discovered a scary bug which could cause some sessions to remain hung forever in the debugger. In theory it is possible to trigger this bug with a faulty server that regularly dies, if the client knows exactly when the connection will abort. The effect is that such sessions will remain present until the process is restarted. The risk that it happens is very low and in fact nobody has ever reported such a situation, still this is something to care about.

  - 1.4.11 : fix ebtrees in stick tables. A bug in the ebtree code made it possible to have the same binary key at multiple places in the tree, causing it to
Re: Acl url_sub doesn't seems to match
I think this covers the most cases, I am not sure if the -i is needed or not:

  acl acl_aspc url_dom -i autos-prestige-collection HTTP_URL_ABS
  acl acl_aspc hdr_dom(Host) -i autos-prestige-collection
  use_backend aspc if acl_aspc

On 1/12/11 11:38 AM, Bryan Talbot wrote: I think the problem is that url_dom operates on the URL found in the request line, but in your case that URL is a relative URI (/) which does not contain a host name. I think if you use hdr_dom(Host) it'll do what you want. -Bryan On Wed, Jan 12, 2011 at 8:39 AM, Contact Dowhile cont...@dowhile.fr wrote: Hello, I have a pretty simple HAProxy configuration, but it can't match a super-simple acl.. Here is the config:

global
    daemon
    user haproxy
    group haproxy
    maxconn 5000

defaults
    mode http
    maxconn 4950
    retries 2
    timeout client 60s       # Client and server timeout must match the longest
    timeout server 60s       # time we may wait for a response from the server.
    timeout queue 60s        # Don't queue requests too long if saturated.
    timeout connect 4s       # There's no reason to change this one.
    timeout http-request 5s  # A complete request may never take that long.

frontend web :80
    option forwardfor
    acl acl_aspc url_dom autos-prestige-collection
    use_backend aspc if acl_aspc
    default_backend webfarm

backend aspc
    balance source
    server webC 10.1.0.26:80 check

backend webfarm
    balance source
    server webA 10.1.0.20:80 check

What I want is for every website to go to webfarm, and http://*.autos-prestige-collection.com/* to go to aspc. This is because this site is located on a Windows IIS server... If I'm in debug mode, here is what happens:

myhost etc # haproxy -d -f haproxy.cfg
Available polling systems :
     sepoll : pref=400,  test result OK
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 4 (4 usable), will use sepoll.
Using sepoll() as the polling mechanism.
:web.accept(0004)=0005 from [xx.xx.xx.xx:27615]
:web.clireq[0005:]: GET / HTTP/1.1
:web.clihdr[0005:]: Host: www.autos-prestige-collection.com
:web.clihdr[0005:]: User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.12) Gecko/20101027 Firefox/3.6.12
:web.clihdr[0005:]: Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
:web.clihdr[0005:]: Accept-Language: fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3
:web.clihdr[0005:]: Accept-Encoding: gzip,deflate
:web.clihdr[0005:]: Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
:web.clihdr[0005:]: Keep-Alive: 115
:web.clihdr[0005:]: Connection: keep-alive
:webfarm.srvrep[0005:0006]: HTTP/1.1 200 OK
:webfarm.srvhdr[0005:0006]: Date: Wed, 12 Jan 2011 16:31:13 GMT
:webfarm.srvhdr[0005:0006]: Server: Apache/2.2.17 (Fedora)
:webfarm.srvhdr[0005:0006]: X-Powered-By: PHP/5.3.4
:webfarm.srvhdr[0005:0006]: Content-Length: 3546
:webfarm.srvhdr[0005:0006]: Connection: close
:webfarm.srvhdr[0005:0006]: Content-Type: text/html; charset=UTF-8
:webfarm.srvcls[0005:0006]
:webfarm.clicls[0005:0006]
:webfarm.closed[0005:0006]

We see "Host: www.autos-prestige-collection.com" (so it should match my acl, shouldn't it???), but we see that haproxy redirected this query to webfarm (":webfarm.srvhdr[0005:0006]: Server: Apache/2.2.17 (Fedora)"). My IIS server is working OK; if I put "default_backend aspc" in "frontend web :80", I'm redirected to my IIS server (but then ALL my websites are redirected to the IIS, which I don't want...)
I tried with url_dom and url_sub; nothing changes, it never catches the acl rule... I'm running haproxy 1.4.10 on gentoo. Thanks for reading. Guillaume
Re: precedence of if conditions (again)
On 6/30/10 9:50 PM, Willy Tarreau wrote: On Wed, Jun 30, 2010 at 08:53:19PM -0700, Bryan Talbot wrote: See section 7.7: AND is implicit. 7.7. Using ACLs to form conditions -- Some actions are only performed upon a valid condition. A condition is a combination of ACLs with operators. 3 operators are supported : - AND (implicit) - OR (explicit with the or keyword or the || operator) - Negation with the exclamation mark (!) I'm realizing that that's not enough to solve Hank's question, because the precedence is not explained in the doc (it was so obvious to me that it was like in other languages that it's not explained), so : reqirep blah if a b or c is evaluated like this : (a and b) or c and : reqirep blah if a b or c d is evaluated like this : (a and b) or (c and d) Regards, Willy I have a more complex grouping and I am still not sure how to create it. I have one required condition A and one of 4 other conditions B1-B4 so I need something like: if A and (B1 or B2 or B3 or B4) is there a way to do that?
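(Given the precedence Willy describes above, where AND binds tighter than "or", the grouping "A and (B1 or B2 or B3 or B4)" can be expressed by distributing A over each alternative; the backend name is illustrative:)

  # evaluates as (A and B1) or (A and B2) or (A and B3) or (A and B4)
  use_backend be_special if A B1 or A B2 or A B3 or A B4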
Re: Virus warnings originating from the list
Me too, 1 or 2 per day usually - but my server rejects them and then the maillist server complains that msgs to me are bouncing: Some messages to you could not be delivered. If you're seeing this message it means things are back to normal, and it's merely for your information. Here is the list of the bounced messages: 6772 On 12/30/10 1:53 PM, Karl Kloppenborg wrote: Hi Willy, I receive roughly two per day from the same place, which is this formilux servers.. at least two, sometimes (but rarely) three emails a day. --Karl ;) *Karl Kloppenborg* Head of Development *Phone:*1300 884 839 /(AU Only - Business Hours)/ *Website:*AU http://www.crucial.com.au http://www.crucial.com.au/| US http://www.crucialp.com http://www.crucialp.com/ On 31/12/2010, at 2:22, Willy Tarreau wrote: Hi Karl, On Thu, Dec 30, 2010 at 05:24:08PM +1100, Karl Kloppenborg wrote: Hey guys, Our mailserver keeps popping its head up and crying about someone on the list with a virus infection: -- VIRUS ALERT Our content checker found viruses: Suspect.DoubleExtension-zippwd-9, Worm.Mydoom.M in an email to you from probably faked sender: ?...@[88.191.124.161] claiming to be: haproxy+bounces-6752-karl=crucialp@formilux.org mailto:haproxy+bounces-6752-karl=crucialp@formilux.org Content type: Virus Our internal reference code for your message is 15320-02/7TgmtDhTpGW9 First upstream SMTP client IP address: [88.191.124.161] flx02.formilux.org http://flx02.formilux.org According to a 'Received:' trace, the message apparently originated at: [88.191.124.161], flx02.formilux.org http://flx02.formilux.org flx02.formilux.org http://flx02.formilux.org [127.0.0.1] (...) Strange, I don't recall having noticed any such message. Maybe they're simply deleted before reaching me, but I don't think so as I'm not performing any filtering on the ML at home. How many of them do you get a day ? Willy
Re: ACL to use a single Server
On 12/19/10 9:46 PM, Willy Tarreau wrote: Hi Craig, On Thu, Dec 16, 2010 at 11:47:51PM +0100, Craig wrote: A typical use-case is a special server from your cluster that fulfills a special maintenance task, I guess it's a common use-case. Any opinions on this? How would this work if you have several servers in a backend and you have redispatch or other options that could re-route the request to a different server?
Re: Multiple Load Balancers, stick table and url-embedded session support
Please see the thread: "need help figuring out a sticking method". I asked about this; Willy says there are issues figuring out a workable config syntax for "regex to pull the URL/URI substring", but (I think) coding the functionality is not technically super-difficult - just not enough hands, maybe, plus the config syntax question. I have a feeling this would be a fairly commonly used feature, so it is good to see others asking the same question :) How are you planning to distribute the traffic to the different haproxy instances? LVS? Some hardware? On 12/8/10 8:58 PM, David wrote: Hi there, I have been asked to design an architecture for our load-balancing needs, and it looks like haproxy can do almost everything needed in a fairly straightforward way. Two of the requirements are stickiness support (always send a request for a given session to the same backend) as well as multiple load balancers running at the same time to avoid a single point of failure (hot backup with only one haproxy running at a time is not considered acceptable). Using multiple HAproxy instances in parallel with stickiness support looks relatively easy if cookies are allowed (through e.g. cookie prefixing), since no information needs to be shared. Unfortunately, we also need to support session ids embedded in the URL (e.g. http://example.com/foo?sess=someid), and I was hoping that the new sticky table replication in 1.5 could help for that, but I am not sure it is the case. As far as I understand, I need to first define a table with string type, and then use store-request to store the necessary information. I cannot see a way to get some information embedded in the URL using the existing query extraction methods. Am I missing something, or is it difficult to do this with haproxy ? regards, David
Re: bug? reqidel and pcre
Looks good in my limited test cases, headers are gone regardless of ordering of the del statements, but in your notes: "but since headers were the last header processing, the issue remained unnoticed." You mean cookies were the last, right? On 11/27/10 10:14 PM, Willy Tarreau wrote: Hi Cyril and Hank, On Thu, Nov 25, 2010 at 08:01:33AM +0100, Cyril Bonté wrote: If I remove only one of the first X-truc headers, it's OK too. I suspect that the removal of more than one header from the beginning prevents the remaining first header from being matched by the next rule :-/ Not only the first one. With this order :

  reqidel ^Via:
  reqidel ^X

and this request :

  printf "GET / HTTP/1.0\r\nX-truc1: blah\r\nX-truc2: blah\r\nVia: here\r\nVia: here\r\nX-truc3: blah\r\nX-truc4: blah\r\nX-truc5: blah\r\nX-truc6: blah\r\nVia: there\r\nNon-X: blah\r\n\r\n"

X-truc3, X-truc4, X-truc5, X-truc6 are not removed. Indeed, the list gets corrupted after exactly two consecutive removals, because we do this to remove a header :

  last->next = cur->next;
  cur->len = 0;

then we iterate this way :

  last = cur;
  cur = cur->next;

If we do that exactly twice, we leave an empty header in the list, because the first removed header's next is updated instead of updating the last non-deleted header! We were just missing a "cur = last" when deleting. Note that the bug was also present when removing duplicate headers, but 1) browsers don't send them on two distinct lines, and 2) no more header processing is performed after cookies, leaving the case unnoticed. Interesting, I'll have to check some server configurations today. You may discover scary things, as usual when we spot a bug :-) Hank, please apply this patch : http://git.1wt.eu/web?p=haproxy-1.4.git;a=commitdiff_plain;h=1337ca I'll release 1.4.10 with the pending fixes after a few more tests. Cheers, Willy
Feature Requests: march native and cwnd setting param
1 - With recent CPUs (Intel 5300/5400/5500/5600 and AMD 6100) the set of optimal compiler settings for optimizations :) is not something anyone can keep up with - not to mention different versions of gcc that understand none, some or all of the features of these CPUs. "-march=native" allows gcc to take on the burden of optimizing the compile time settings, so if that could be added as one of the options in the makefile, it would be helpful, because then I could use the same make... line on every machine but it would self-adjust for that machine. Obviously, this is not a setting that distros would use to spin package binaries, but it is great for getting the optimal settings for a given machine. Examples:

model name : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz

  # cc -march=native -E -v - </dev/null 2>&1 | fgrep cc1
  /usr/libexec/gcc/x86_64-redhat-linux/4.4.5/cc1 -E -quiet -v - -march=core2 -mcx16 -msahf -mpopcnt -msse4.2 --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=8192 -mtune=core2

model name : AMD Opteron(tm) Processor 6172

  [r...@hesj3-m41 cron.d]# cc -march=native -E -v - </dev/null 2>&1 | fgrep cc1
  /usr/libexec/gcc/x86_64-redhat-linux/4.5.1/cc1 -E -quiet -v - -march=amdfam10 -mcx16 -msahf -mpopcnt -mabm --param l1-cache-size=64 --param l1-cache-line-size=64 --param l2-cache-size=512 -mtune=amdfam10

2 - Google has pushed, via both tcp related RFCs and patches to the networking code for the Linux kernel, to allow the initial cwnd to be set as a socket option - this would be a huge help to sites that communicate with the same clients over and over and/or with many small requests, allowing a full response in one (or at least fewer) round trips. For one site that I work on that is over 250 ms away with a very reliable gateway on the other end, I burn through several round trips to deliver an icon/small gif/etc - an icon that could have all the necessary packets in flight before the first ack. It turns out the small initial cwnd creates more traffic across the undersea cables than an initial cwnd of 8 or 10 or 12. http://www.amailbox.org/mailarchive/linux-netdev/2010/5/26/6278007

I also wanted to see if you were aware of two other recent kernel changes that could be helpful to haproxy performance. The first could be helpful for the new UNIX socket connections in recent haproxy versions - the implementation of recvmmsg: "recvmmsg() is a new syscall that allows to receive with a single syscall multiple messages that would require multiple calls to recvmsg(). For high-bandwidth, small packet applications, throughput and latency are improved greatly." http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a2e2725541fad72416326798c2d7fa4dafb7d337

The second is RPS from Google, to improve network processing performance with multiple CPUs - similar to MSI-X, but Google found that both together had even more performance than just MSI-X: http://kernelnewbies.org/Linux_2_6_35#head-94daf753b96280181e79a71ca4bb7f7a423e302a http://lwn.net/Articles/362339/
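(Two sketches of what the requests above amount to in practice. First, the build line the -march=native request would enable; CPU=native later became a stock option in haproxy's Makefile, so treat its presence here as an assumption for older trees. Second, the per-route initcwnd knob that the Google patches exposed through iproute2 - only available on kernels carrying them, and the gateway/device are illustrative:)

  # 1 - let gcc pick the optimal flags for the build machine
  make TARGET=linux26 CPU=native USE_STATIC_PCRE=1

  # 2 - raise the initial congestion window on the default route
  ip route change default via 192.168.0.1 dev eth0 initcwnd 10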
bug? reqidel and pcre
I have a possible bug. I have a backend where I want to strip all the X-* headers off the requests. But I found that if I did:

  reqidel ^X
  reqidel ^Via:\ 

or

  reqdel ^x-.*:\ 
  reqdel ^Via

or similar, haproxy [1.4.8 (Fedora package version) and hand compiled 1.4.9, both using pcre] would not remove the Via: header (the Via header was the first header after Host, so initially I thought it was header order). I also had some other reqidels, but I removed all of them except ^X and ^Via and the problem remained. I tried many combinations of reqidel/reqdel and removed all the other reqdel/rspadd lines during testing, and I still see the same results: X- or x- headers are removed but the Via header is not. The only other thing is that I am capturing an x- header - but why not an explicit error message if that is the cause of the problem?
Re: haproxy gives 502 on links with utf-8 chars?!
Accept-* headers talk about what the ends of the connection want in terms of page content. What is allowed in the headers themselves is a different part of the spec - not spec'd by the content of a header but by the spec itself: "Many HTTP/1.1 header field values consist of words separated by LWS or special characters. These special characters MUST be in a quoted string to be used within a parameter value (as defined in section 3.6)." "Unrecognized header fields [anything like X-*] are treated as entity-header fields." So X-GSS-Metadata is considered an entity-header, AFAICT. "The extension-header mechanism allows additional entity-header fields to be defined without changing the protocol, but these fields cannot be assumed to be recognizable by the recipient. Unrecognized header fields SHOULD be ignored by the recipient and MUST be forwarded by transparent proxies." Section 7.2.1 talks about encoding the entity body but not entity headers. I didn't know about trailing headers (trailers) - Willy, is haproxy coded to watch for those? As is the answer here: http://stackoverflow.com/questions/1361604/how-to-encode-utf8-filename-for-http-headers-python-django it looks like you can't do that. On 11/19/10 6:13 AM, Jakov Sosic wrote: On 11/19/2010 03:07 PM, German Gutierrez :: OLX Operation Center wrote: Looks like the field X-GSS-Metadata: has utf-8 encoded characters. I don't know if that's valid or not; I think not. From wikipedia: http://en.wikipedia.org/wiki/List_of_HTTP_header_fields Accept-Charset: Character sets that are acceptable, e.g. "Accept-Charset: utf-8". So I guess I need to somehow force the server to set this HTTP header option?
Re: Limiting throughput with a cold cache
A few ideas that you might or might not want to consider: * As another poster just mentioned, you might consider ICP, but they suggested having all your squids talk to one master squid. I would instead maybe do this. Currently, my understanding of your layout:

  haproxy -> hashed_url -> squid X of Y -> db shard for X content

If you wanted a probably more robust architecture, you might want to try 1 of a couple different things:

  haproxy -> hashed_url -> squid X/Y
                             | single hidden squid peer for X/Y -> small haproxy -> db shard

or if you can afford the AWS instances:

  haproxy -> hashed_url -> squid Xa/Y
                             | squid Xb/Y -> small haproxy -> db shard for X content
                             | squid Xc/Y -> small haproxy -> db shard for X content
                             | squid Xd/Y -> small haproxy -> db shard for X content

Only one of Xa-d has the IP address used in the haproxy config; the others are hidden peers that never talk to haproxy or the end users directly. They would share data using ICP or maybe cache digests - cache digests are supposed to be faster depending on workload mix. These might be more resistant to outages. If it was me, I would go further and create a set of peer squids for each hash value, have each of those load balanced by a small haproxy, and manage their connections to their db shard via another haproxy instance:

  haproxy -> hashed_url -> ...
                                | squid Xa/Y |
                                | squid Xb/Y |
    ...small haproxy for X/Y -> |            | -> haproxy for db shard X
                                | squid Xc/Y |
                                | squid Xd/Y |

In this set up the lines in your main haproxy backend would be pointing to a small haproxy (I don't think there is a way to do this in one single overall haproxy instance) for hash value X of Y. That small haproxy would have N squids all answering user queries and talking to each other as a flat group of peers using ICP or cache digests. That way you would have N squids that are hot for hash value X, and if one dies, you can have a longer slow start period; by using ICP/cache digests and having N-1 hot caches you would be able to very quickly heat up your new squid instance N_new without overloading your db or significantly slowing your ability to answer queries for hash value X. I would think you want to evolve your config towards allowing some queueing of requests - so that it can absorb some amount of request spikes but without detrimental effect on the backend db, etc. Having the set up above, with a haproxy in front of and behind your squid group, would allow you to have long slow start times so a new squid can warm up slowly without adversely affecting the overall throughput/performance of the system much. 100ms seems like a very short time to wait for clients - if those are real end users and not some internal system that you know is very fast. For instance, if you have people using apps on iPhone and iPad, I know that I see bigger delays from wap gates than 100ms, so you might want to reconsider some of those timeouts. On 11/15/10 4:43 PM, Dmitri Smirnov wrote: Willy, thank you for taking time to respond. This is always thought provoking. On 11/13/2010 12:18 AM, Willy Tarreau wrote: Why reject instead of redispatching ? Do you fear that non-cached requests will have a domino effect on all your squids ? Yes, this indeed happens. Also, the objective is not to exceed the number of connections from squids to the backend database. In case of a cold cache, the redispatch will cause a cache entry to be brought into the wrong shard, which is unlikely to be reused. Thus this would use up a valuable connection just to satisfy one request.
However, even this is an optimistic scenario. These cold cache situations happen due to external factors like an AWS issue, our home grown DNS getting messed up (AWS does not provide DNS), etc., which cause not all of the squids to be reported to the proxies and mess up the distribution. This is because haproxy is restarted after the config file is regenerated. I have been thinking about preserving some of the distribution using server IDs when the set of squids partially changes, but that's another story, let's not digress. Thus even with redispatch enabled the other squid is unlikely to have free connection slots, because when it goes cold, most of them do. Needless to say, most of the other components in the system are also in distress in case something happens on a large scale. So I choose the stability of the system to be the priority, even though some of the clients will be refused service, which happens to be the least of the evils. I don't see any http-server-close there, which makes me think that
Re: (haproxy) How-TO get HAPROXY to balanace 2 SSL encypted Webservers ?
Where is the rest of your haproxy config - if you are talking to port 443 on your tomcat servers... If you have the 2 backend servers and you want haproxy to talk to the encrypted/ssl ports on them (and you want your end users to see the certs that are on the tomcat servers), then the only thing haproxy can see is the source IP and source port, and it can only try to create stickiness with the source IP. So you have to think in those terms - what is unencrypted at the time each request and response passes through haproxy. In this case the end user sees the cert installed on pound, and haproxy can use all the layer 7/http capabilities:

  ssl/443 -> pound -> non-ssl -> haproxy -> non-ssl -> tomcat(s)

you can't do (AFAIK):

  ssl/443 -> pound -> non-ssl -> haproxy -> ssl -> tomcat(s)

because the user would still see only the pound cert, and I don't think haproxy can initiate ssl sessions on its own. On 11/15/10 11:08 AM, t...@hush.com wrote: So we have 2 webservers on the backend with SSL encryption. We want to keep this the way it is. Is there a way for HAPROXY to balance these 2 servers with sticky sessions enabled? How can this be done? Currently when trying it this way:

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    option redispatch
    maxconn 2000
    contimeout 5000
    clitimeout 5
    srvtimeout 5
    stats enable
    stats uri /stats

frontend http-in
    bind *:80
    acl is_ww2_test1_com hdr_end(host) -i ww2.test1.com
    use_backend ww2_test1_com if is_ww2_test1_com

backend ww2_test1_com
    balance roundrobin
    cookie SERVERID insert nocache indirect
    option httpchk
    option httpclose
    option forwardfor
    server Server1 10.10.10.11:80 cookie Server1
    server Server1 10.10.10.12:80 cookie Server2

Since the 2 servers are encrypted on port 443 (with the main front page on port 80 not encrypted), the above setup works until it hits 443, where I get the error "Error 310 (net::ERR_TOO_MANY_REDIRECTS): There were too many redirects.". Port 443 on the HAPROXY frontend is using Pound for the encryption. However, both backend servers have a Tomcat keystore (signed through thawte) which I doubt will be compatible with Pound. (and I don't want to re-sign the cert or get a new cert) Can I somehow get HAPROXY to balance these 2 servers with proper sticky session handling? TIA!
cookie request-learn
Can someone give a request/response level example of how request-learn works? I can't understand how, if haproxy has not seen the cookie going out, it can tell where to direct the request when it comes in. Thanks.
Re: VM benchmarks
I don't have benchmarks, but I have sites running haproxy on Xen VMs with apache on Xen VMs, and can pump 120 Mbps and 80 million hits a day through one haproxy VM - and that is with haproxy doing rsyslog logging of all requests to 2 remote rsyslog servers on top of serving the requests, with some layer 7 acls to route requests to different backends. Only 50-75 backend servers total though. http keepalive helped a lot with the type of requests that haproxy serves, so it reduced the work load some from the non-keepalive version. I also use splice-auto on there to reduce overhead somewhat. On 10/26/10 7:38 AM, Ariel wrote: Does anyone know of studies done comparing haproxy on dedicated hardware vs virtual machine? Or perhaps some virtual machine specific considerations? -a
Re: Strange latency
Just a guess, but is there something that might be doing reverse dns lookups for each request when using haproxy? I find that when I turn on tcpdump on port 53 on a firewall or router, I and others are surprised at how much reverse lookup traffic there is going on in any given environment. On 10/26/10 2:02 PM, Simon Green - Centric IT Ltd wrote: I don't think there's been any traffic on this thread, so I thought I'd just chip in and say we run HAProxy on ESX4.1 with Stunnel in front on the same server and Apache servers behind, and don't experience anything like the latency you mention below. -----Original Message----- From: Ariel [mailto:ar...@bidcactus.com] Sent: 25 October 2010 18:45 To: haproxy Subject: Strange latency I am using Rackspace cloud servers and trying to convince my boss that we should be using haproxy instead of apache at our frontend doing load balancing. For the most part I have set up what I consider a fairly successful staging environment (I have working ACLs and cookie based routing). The problem however is that when I use haproxy as my load balancer, my round-trip time for a request goes up by about 50ms. With apache as the proxy every request has a RTT of ~50ms, but now they are at over 100ms. I am using the same backend servers to test both apache and haproxy, with all configuration rules as close to the same as I could make them (client side keep-alive enabled). Also for comparison I set up a quick nginx server to do its (very dumb) load balancing, and its results are at the same speed as apache or better. Also, even when apache is terminating SSL and forwarding it on, the RTT does not go up. All three pieces of software were run (one at a time) on the same virtual server, so I don't think it is that I got a bad VPS slice or something like that. Also, when I use stunnel in front of haproxy to terminate https requests, it adds another ~50ms to the total RTT. And if I have to make the request go through another stunnel to the backend (a requirement for PCI compliance), it adds another ~50ms again. So now using the site with SSL is over 300ms per request just from the start. That may not be *terrible*, but the site is very interactive and makes one AJAX request per second to keep lots of things updated. For general users around the internet the site is going to appear unresponsive and slow... I was wondering if anyone using haproxy in a virtualized environment has ever experienced something like this? Or maybe some configuration options to try to debug this? -a
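(The reverse-DNS check suggested at the top of this reply is a one-liner - watch for a burst of PTR lookups correlated with requests:)

  # watch DNS traffic from the proxy box; a PTR query per request
  # would confirm the reverse-lookup theory
  tcpdump -ni any port 53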
weird error unknown option 'splice-auto'.
Copied a working 1.4.8 config to a Fedora 14 box with Fedora compiled 1.4.8 haproxy and it says unknown option 'splice-auto'. Is that correct?

# service haproxy restart
[ALERT] 292/190016 (3644) : parsing [/etc/haproxy/haproxy.cfg:2] : unknown option 'splice-auto'.
[ALERT] 292/190016 (3644) : Error(s) found in configuration file : /etc/haproxy/haproxy.cfg
[ALERT] 292/190016 (3644) : Fatal errors found in configuration.
Errors in configuration file, check with haproxy check.

# which haproxy
/usr/sbin/haproxy
# md5sum /usr/sbin/haproxy
6bf6c61a436f3e909be2903dbd702b79  /usr/sbin/haproxy

# haproxy -vv
HA-Proxy version 1.4.8 2010/06/16
Copyright 2000-2010 Willy Tarreau w...@1wt.eu

Build options :
  TARGET  = linux26
  CPU     = generic
  CC      = gcc
  CFLAGS  = -O2 -g
  OPTIONS = USE_LINUX_TPROXY=1 USE_REGPARM=1 USE_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200

Encrypted password support via crypt(3): yes

Available polling systems :
     sepoll : pref=400,  test result OK
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 4 (4 usable), will use sepoll.

# uname -a
Linux xxyyzz 2.6.35.6-46.fc14.x86_64 #1 SMP Tue Oct 19 02:58:56 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
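(One observation: the -vv output above lists no USE_LINUX_SPLICE in the Fedora build's OPTIONS, which would explain the unknown option. Assuming that is the cause, a rebuild with splice compiled in would look something like:)

  # rebuild with splice support enabled, keeping the Fedora build's
  # other options; USE_LINUX_SPLICE is what makes splice-auto available
  make TARGET=linux26 USE_LINUX_SPLICE=1 USE_LINUX_TPROXY=1 USE_REGPARM=1 USE_PCRE=1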
need help figuring out a sticking method
I have a site that is using a url based stickiness: example.com/something/:delim:K:delim:unique_session_info_here:delim::delim:/more_path_stuff... right now the delim is ( and as far as I know that doesn't need to be encoded. The problem is haproxy 1.4 can't seem to see that in the URL for appsession use. I am wondering how/if I can use the new stick stuff - it seems perfect for this and the table type of string is there, but there is no way I can see to put a string extracted from the url into such a table. Or if there is any other way... since some of the important client users can't use cookies, I am stuck :( Thanks for any ideas. BTW, it looks like this error msg is wrong in 1.4.8: [ALERT] 286/044915 (5233) : parsing [/etc/haproxy/haproxy.cfg:57]: 'observe' expects one of 'none', 'l4events', 'http-responses' but get 'layer' It gets cut off there, but the correct string seems to be 'layer7', not 'http-responses'
Re: Performance Question
What did the haproxy stats web page show during the test? How long was each test run? Many people seem to run ab for only a few seconds. Was tomcat doing anything for the test URLs? I am a bit shocked you got 3700 rps from tomcat; most apps I have seen on it fail at much lower rps. Raise the maxconn for each server and for the front end and see if you get better results. On 10/6/10 7:11 PM, Les Stroud wrote: I did a little more digging and found several blogs that suggest that I will take a performance hit on virtual platforms. In fact, this guy (http://www.mail-archive.com/haproxy@formilux.org/msg03119.html) seems to have the same problem. The part that is concerning me is not the overall performance, but that I am getting worse performance with 4 servers than I am with 1 server. I realize there are a lot of complications, but I have to be doing something very wrong to get a decrease. I have even tried putting haproxy on the same server with 2 tomcat servers and used 127.0.0.1 to take as much of the network out as possible. I still get a lower number of requests per second when going through haproxy to the 2 tomcats (as opposed to going directly to one of the tomcats). This test is using ab locally on the same machine. I have tried all of the sysctl settings that I have found listed on the board. Is there anything I am missing?? I appreciate the help, Les Stroud On Oct 6, 2010, at 3:56 PM, Les Stroud wrote: I figured I would find answers to this in the archive, but have been unable to. So, I appreciate the time. I am setting up an haproxy instance in front of some tomcat instances. As a test, I ran ab against one of the tomcat instances directly with an increasing number of concurrent connections. I then repeated the same test with haproxy fronting 4 tomcat servers. I was hoping to see that the haproxy setup would perform a higher number of requests per second and hold that higher number with increasingly high traffic. Unfortunately, it did not. Hitting the tomcat servers directly, I was able to get in excess of 3700 rqs/s. With haproxy in front of that tomcat instance and three others (using roundrobin), I never surpassed 2500. I also did not find that I was able to handle an increased amount of concurrency (both started giving errors around 2). I have tuned the tcp params on the linux side per the suggestions I have seen on here. Are there any other places I can start to figure out what I have wrong in my configuration?? Thanx, LES ——— haproxy.cfg global #log loghost local0 info maxconn 500 nbproc 4 stats socket /tmp/haproxy.sock level admin defaults log global clitimeout 60000 srvtimeout 30000 contimeout 4000 retries 3 option redispatch option httpclose option abortonclose listen stats 192.168.60.158:8081 mode http stats uri /stat #Comment this if you need to specify diff stat path for viewing stat page stats enable listen erp_cluster_https 0.0.0.0:81 mode http balance roundrobin option forwardfor except 0.0.0.0 reqadd X-Forwarded-Proto:\ https cookie SERVERID insert indirect server tomcat01-instance1 192.168.60.156:8080 cookie A check server tomcat01-instance2 192.168.60.156:18080 cookie A check server tomcat02-instance1 192.168.60.157:8080 cookie A check server tomcat02-instance2 192.168.60.157:18080 cookie A check
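To make that last suggestion concrete, a sketch of where the three maxconn limits live; the numbers are illustrative only, not recommendations. With the global maxconn left at 500 as above, the whole process is capped well below what four tomcats can absorb:

  global
      maxconn 20000        # process-wide cap; connections beyond this wait in the kernel queue
  listen erp_cluster_https 0.0.0.0:81
      maxconn 10000        # frontend cap
      server tomcat01-instance1 192.168.60.156:8080 maxconn 500 check   # per-server cap; excess is queued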
Re: 'haproxy' in AIX OS for C++ Applications
You are going to need something like mysql proxy for that. haproxy can only look at the attributes of a connection (port, whether data is available from the client (aka did the client speak first), etc.), and in the case of HTTP the headers, but it can't look at the full contents of the whole data stream and make decisions based on that. I don't think the filename of a sql file/script is sent as part of the mysql stream: if you do something like mysql -u blah < a.sql then the file name a.sql is not sent to the server, AFAIK. You'd have to write some custom lua and put something like select 'starting a.sql'; at the top of the files for your lua code to look for, to direct the requests based on your logic. There are examples of lua code for load balancing read slaves which could be a basis to start with. On 8/17/10 11:52 AM, Turlapati, Sreenivasa wrote: Hi, As part of the request, the front end sends the user name and some sort of sql script name that the back end needs to execute. Here is a snapshot of our requirement: If (sql script name = 'a.sql') Send the request to group 'a' servers If the user already got one earlier request - send the request to the same backend server Else send the request to a free backend server Else if (sql script name = 'b.sql') Send the request to group 'b' servers If the user already got one earlier request - send the request to the same backend server Else send the request to a free backend server Could you help us with how we can achieve the above requirement through the haproxy.cfg file. Thank You, Sreeni T Work : +1 781-302-6143 Cell : +1 617-955-3736 sturlap...@statestreet.com -Original Message- From: Willy Tarreau [mailto:w...@1wt.eu] Sent: Tuesday, August 17, 2010 10:47 AM To: Turlapati, Sreenivasa Cc: haproxy@formilux.org Subject: Re: 'haproxy' in AIX OS for C++ Applications On Tue, Aug 17, 2010 at 10:27:04AM -0400, Turlapati, Sreenivasa wrote: Hi, Thxs a lot. Could you kindly let us know how we can read and understand the incoming request. Just do as you would with other rules, however if you want to match some content data (eg: protocol or HTTP request), you must first ensure that you get a full request before applying the use_backend rule. For instance : frontend xxx1 # use backend yyy if destination port is 12345 acl is_port1 dst_port 12345 use_backend yyy if is_port1 frontend xxx2 # use backend yyy if client talks HTTP tcp-request inspect-delay 30s tcp-request content accept if HTTP ... use_backend yyy if HTTP I read we need to carry out the below changes to capture the HAProxy log. edit the value of SYSLOGD in /etc/default/syslogd SYSLOGD=-r Then set up syslog facility local0 and direct it to file /var/log/haproxy.log or your desired location by editing /etc/syslog.conf: # Save HA-Proxy logs local0.* /var/log/haproxy_0.log local1.* /var/log/haproxy_1.log yes that's it. Is there any way we can get the log without making the above changes, as we don't have permission to modify syslog.conf. Then the best solution is to have your own syslogd on your own port. That's something very common, and people generally use syslog-ng for that because it's fast, light and very flexible. It adds the benefit that people who manage the system keep their syslog.conf intact and people who manage their applications have their own syslog.conf and their own port. If it's just for testing purposes, then use netcat (nc). You make it listen for UDP traffic on the port you want, then configure haproxy to log on localhost on that port.
That's frequently used on testing platforms. Alternatively you can log over a unix socket, but if someone restarts the syslogd (eg: logrotate), the connection will break and remain broken. And the loss rate over the unix socket is generally high. Regards, Willy
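A minimal sketch of that netcat idea; the flag spelling varies between netcat implementations (traditional GNU netcat shown), and the port is arbitrary:

  $ nc -u -l -p 5140            # dump incoming UDP syslog datagrams to the terminal
  # and in haproxy.cfg:
  #   log 127.0.0.1:5140 local0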
Re: 'haproxy' in AIX OS for C++ Applications
On 8/17/10 2:16 PM, Turlapati, Sreenivasa wrote: Thxs a lot. Sorry if I am misguiding you. I am just curious to know: when HAProxy is set to TCP mode, we want to scan or glance over the incoming request for a particular string, say 'XYZ'. If the incoming request contains the 'XYZ' string, route the request to the backend 'xyz' group, else route to the backend 'ABC' group. Current (and past) versions of haproxy can't do that. Thank You, Sreeni T Work : +1 781-302-6143 Cell : +1 617-955-3736 sturlap...@statestreet.com -Original Message- From: Willy Tarreau [mailto:w...@1wt.eu] Sent: Tuesday, August 17, 2010 4:16 PM To: Turlapati, Sreenivasa Cc: haproxy@formilux.org Subject: Re: 'haproxy' in AIX OS for C++ Applications On Tue, Aug 17, 2010 at 03:50:19PM -0400, Turlapati, Sreenivasa wrote: Hi, We are not trying to use an sql proxy; indeed we are using the TCP proxy. We just need to read the incoming request and, based on the sql file name, route the request to a backend server. We don't need to care what the sql file contains or how to execute it; we only need to care which group to route the request to. I understand what you want to do, but you have to understand that the sql file name designates something which does not exist at the TCP level. This means that an SQL parser is required to be able to extract that from the requests, and in my opinion the best way to find one is to check an SQL proxy. Regards, Willy
Re: Anyone know what software uses MT-Proxy-ID?
On 6/27/10 9:55 PM, Willy Tarreau wrote: Hi Hank, On Sun, Jun 27, 2010 at 02:12:35PM -0700, Hank A. Paulson wrote: I got this error hit via the haproxy socket, I noticed that there are a few hits when searching for it, all related to corrupt headers with lighttpd, and people seem to be assuming it is lighttpd's fault, but in the case I received, it is clear that there are some junk characters at the beginning of the request. (Perhaps lighttpd needs an option to print errors with hex encoding in order to see the characters causing the problems there) There is also this proxy blocking module for nginx that lists it when searching for signs of a proxy: http://www.linuxboy.net/nginx/ngx_http_proxyblock_module.c.txt I am wondering if this is some kind of web fuzzer software, or if it is just poorly coded proxy software, or if other people have seen problems with requests with a MT-Proxy-ID. (All the listings that I have seen, locally and on the web, that include the MT-Proxy-ID header have the same 1804289383 value.) Thanks for any insights. Don't you think this could simply be some discovery attack or bypass attempts ? The strangest part is the \x00, which, if intentionally left here, may be present to try to fool some HTTP parsers. Perhaps it targets a very specific product and was just blocked here. Anyway, if it's normally encountered with lighttpd, you may want to share that with the lighttpd guys so that they for once get a full dump of the abnormal request. Sorry, I was not clear - the only substantive search results where I find MT-Proxy-ID have been in some lighttpd discussions. I think they are mistakenly thinking there is a problem with lighttpd; my guess is that they are not seeing the junk characters at the beginning of the request, and I am wondering if the software that adds the MT-Proxy-ID header also adds the junk characters due to poor coding, bugs, malicious purpose, etc. My one error hit has nothing to do with lighttpd. I just find it odd that the only references to MT-Proxy-ID are in a few headers in discussions of problem requests. Normally with unusual headers/user-agents you will find some search results with discussions asking about them and discussions of which software or websites use those headers or user-agent strings, etc. With MT-Proxy-ID I found none of that; maybe the web hits for that string have been removed by google for some reason :) [04/Jun/2010:01:40:10.550] frontend abc (#1): invalid request src w.x.y.z, session #25252051, backend NONE (#-1), server NONE (#-1) request length 327 bytes, error at position 0: 0 \x04\x02\x00POST /a/b/c/d HTTP/1.0\r\n 00054 User-Agent: Mozilla/5.0 (compatible; MSIE 6.0;)\r\n 00118 Host: foo.bar\r\n 00137 Accept: */*\r\n 00150 Content-Length: 8\r\n 00169 Content-Type: application/x-www-form-urlencoded\r\n 00218 MT-Proxy-ID: 1804289383\r\n 00243 X-Forwarded-For: x.y.z.w\r\n 00276 Connection: Keep-Alive\r\n 00300 Keep-Alive: 300\r\n 00317 \r\n 00319 xa=23123 Best regards, Willy
Anyone know what software uses MT-Proxy-ID?
I got this error hit via the haproxy socket, I noticed that there are a few hits when searching for it, all related to corrupt headers with lighttpd and people seem to be assuming it is lighttpd's fault but in the case I received, it is clear that there are some junk characters at the beginning of the request. (Perhaps lighttpd needs an option to print errors with hex encoding in order to see the characters causing the problems there) There is also this proxy blocking module for nginx that lists it when searching for signs of a proxy: http://www.linuxboy.net/nginx/ngx_http_proxyblock_module.c.txt I am wondering if this is some kind of web fuzzer software or if it is just poorly coded proxy software or if other people have seen problems with requests with a MT-Proxy-ID. (All the listings that I have seen, locally and on the web, that include the MT-Proxy-ID header have the same 1804289383 value.) Thanks for any insights. [04/Jun/2010:01:40:10.550] frontend abc (#1): invalid request src w.x.y.z, session #25252051, backend NONE (#-1), server NONE (#-1) request length 327 bytes, error at position 0: 0 \x04\x02\x00POST /a/b/c/d HTTP/1.0\r\n 00054 User-Agent: Mozilla/5.0 (compatible; MSIE 6.0;)\r\n 00118 Host: foo.bar\r\n 00137 Accept: */*\r\n 00150 Content-Length: 8\r\n 00169 Content-Type: application/x-www-form-urlencoded\r\n 00218 MT-Proxy-ID: 1804289383\r\n 00243 X-Forwarded-For: x.y.z.w\r\n 00276 Connection: Keep-Alive\r\n 00300 Keep-Alive: 300\r\n 00317 \r\n 00319 xa=23123
Re: about acl in ssl
That is because haproxy does not _yet_ parlez ssl, so it can't see the http level attributes to route requests with them. But there is good news and bad news related to that - since I am from the future I can tell you that haproxy will have ssl encrypt/decrypt capabilities added in version 1.8.3; the bad news is that it will be released on December 19, 2012, and it turns out the Mayans were right, so the world does, in fact, end 2 days later. On 6/11/10 2:04 AM, hapr...@serverphorums.com wrote: hi, I have a config like this : frontend ssl localhost:443 mode tcp acl content_htm path_end .htm use_backend secure_dynamic if content_htm default_backend secure_static # for css, js etc backend secure_static ... #omit detail backend secure_dynamic ... # omit detail the problem is the acl content_htm does not work; my intention is to use secure_static when the resource is css, js, and secure_dynamic when the path ends with .htm. Not sure if my configuration is wrong, because when I use http, it works. Any idea? kiwi happy hacking ! --- posted at http://www.serverphorums.com/read.php?10,161415,161415#msg-161415
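For the record, the usual workaround in this era was to terminate SSL in front of haproxy (stunnel, for instance, as elsewhere in this archive) and run haproxy in http mode behind it, where path ACLs do work. A sketch with made-up addresses:

  # stunnel listens on :443 and forwards decrypted HTTP to 127.0.0.1:8443
  frontend ssl_decrypted
      bind 127.0.0.1:8443
      mode http
      acl content_htm path_end .htm
      use_backend secure_dynamic if content_htm
      default_backend secure_static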
Re: A (Hopefully Not too Generic) Question About HAProxy
You do have a bunch of services in http mode that don't seem to have any type of http close, and some I don't understand why they are not http mode when they probably should be. Just a note: you may be able to greatly simplify (and possibly speed up) your config using the new capabilities for tables of IPs added in 1.4.6. solr should probably be http mode, and anywhere else that you have http mode you probably want an http close option turned on. I am not sure why they chose dispatch for the prod glassfish server; my guess is they are running apache and mod_jk or something and then forwarding the requests to different glassfish servers - are there really more than one prod glassfish servers? I am wondering if the previous admin set up more than one copy of haproxy and that is why several services are redirected to the same machine - like glassfish prod: there is no other reference to port 4850 in this config, so what is running on port 4850? haproxy/apache/heaven forbid - glassfish itself? netstat -antope | fgrep LIST | fgrep 4850 I think one of the problems is the inter_server section: it doesn't have http mode set, so if more than one hit/request comes in on an open connection then your request parsing rules are not run on any requests except the first one (as Willy keeps reminding people). That might work ok for most things, since you are mostly breaking things up by service - liferay goes to the liferay servers, etc. - the problem comes in if you have a portal that people sign into which then has a menu/navbar from which they can choose different services that should be going to different front/backends. On 5/18/10 3:49 PM, Chih Yin wrote: On Mon, May 17, 2010 at 11:11 PM, Hank A. Paulson h...@spamproof.nospammail.net wrote: On 5/17/10 10:24 PM, Willy Tarreau wrote: On Mon, May 17, 2010 at 07:42:03PM -0700, Hank A. Paulson wrote: I have some sites running a similar set up - Xen domU, keepalived, fedora not RHEL - and they get 50+ million hits per day with pretty fast response. you might want to use the log-separate-errors (sp?) option and review those 50X errors carefully, you might see a pattern - do you have http-close* in all your configs? Missing it once got me weird, slow results. Indeed, that *could* be a possibility if combined with a server maxconn, because connections would be kept for a long time on the server (waiting for either the client or the server to close) and during that time nobody else could connect. The typical problem with keep-alive to the servers in fact. The 503 could be caused by requests waiting too long in the queue then. My example was just to assure Chih Yin that haproxy on xen should be able to handle his current load, depending, of course, on the glassfish servers. I meant some kind of httpclose option (httpclose/forceclose/http-server-close/etc) turned on regardless of keep-alive status - you know, like you are always reminding people :) I noticed when I forgot it on a section (that was not keepalive related) it caused wacky results - hanging browsers, images/icons/css not showing up, etc. Obviously it should not affect single requests like you would assume Akamai would be sending; it was a pure guess. Thank you everyone for your feedback. I really appreciate your help. Sorry for taking so long to respond. I had to get permission from my director to post some of the log data and our haproxy configuration file.
I also had to hide a bit more of the configuration than was suggested because of concerns about making the issues we're encountering too public. I hope you understand. From my research on HAProxy and high availability websites in general, it seemed to me that compared to other websites, our traffic volume is actually rather light. In addition to how we have configured HAProxy for our infrastructure, I'm definitely also taking a look at our application servers and our content as well. I started looking at the log files and the HAProxy configuration file more closely today. I attached the (poorly) cleaned HAProxy configuration file. Looking at it, I can already see that the httpclose option isn't consistently included in all the sections, both the frontend and the backend. I will make sure this option is in all sections. Should I also add this to the global settings for HAProxy? Is it okay if this option is listed more than once in a section (I noticed that this happened a couple of times)? Chih Yin, Xani was right, please take a look at your logs. Also, sending us your config would help a lot. Replace IP addresses and passwords with XXX if you want, we'll comment on the rest. BTW
Re: A (Hopefully Not too Generic) Question About HAProxy
On 5/18/10 7:45 PM, Hank A. Paulson wrote: I am wondering if the previous admin set up more than one copy of haproxy and that is why several services are redirected to the same machine - like glassfish prod there is no other reference to port 4850 in this config, so what is running on port 4850? haproxy/apache/heaven forbid - glassfish itself? netstat -antope | fgrep LIST | fgrep 4850 oops, my bad glassfish prod is on a different server - never mind... 172.16.163.1:4850
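On the question above about where httpclose belongs: it is a proxy-level option rather than a global one, so the natural place is the defaults section, which every frontend/backend declared after it inherits; repeating it inside a section is generally harmless but redundant. A sketch:

  defaults
      mode http
      option httpclose    # or option http-server-close on 1.4+, which keeps client-side keep-alive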
Re: Binding by Hostname
Hi, A few more troubleshooting ideas: if you do dig www.wildfalcon.com/wildfalcon.com, does it resolve to the IPs on the haproxy box? If you telnet to www.wildfalcon.com 80 on the haproxy box, does it work? If so, then if you run tcpdump for port 53 and watch it during the haproxy start up, are any queries for www.wildfalcon.com getting resolved correctly? Lastly, if you strace haproxy as it starts, do you see resolver lib calls that are being answered correctly? I test both because dig seems to search slightly differently from the c resolver libs on RedHat Linux. I had a case with boa where it refused to start up because the hostname was not resolvable even though I specified an IP address for it to bind to, so just a thought. On 4/16/10 8:35 AM, Guillaume Bourque wrote: Hi Laurie, are the website IPs available on the machine where haproxy runs? What OS is used for your haproxy server? Bye Laurie Young wrote: Hi I hope someone can help me here... I'm trying to set up HAproxy to bind two different listeners to different hostnames. I found this in the docs for the bind command: address is optional and can be a host name, so I set up my config file like this defaults mode http frontend www bind wildfalcon.com:80 timeout client 5000 frontend test bind www.wildfalcon.com:80 timeout client 86400000 And I get the following error message Available polling systems : poll : pref=200, test result OK select : pref=150, test result OK Total: 2 (2 usable), will use poll. Using poll() as the polling mechanism. [ALERT] 105/160114 (10091) : Starting frontend www: cannot bind socket [ALERT] 105/160114 (10091) : Starting frontend test: cannot bind socket Why can the socket not be bound (I'm starting with sudo to ensure I have permissions)? Thanks in advance Laurie -- Dr Laurie Young Scrum Master New Bamboo Follow me on twitter: @wildfalcon Follow us on twitter: @newbamboo Creating fresh, flexible and fast-growing web applications is our passion. 3rd Floor, Gensurco House, 46A Rosebery Avenue, London, EC1R 4RP http://www.new-bamboo.co.uk
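Two more possibilities worth checking, since both frontends try to bind port 80: if the two names resolve to the same address, the two frontends race for the same ip:port; and if the names resolve to addresses not configured on this box, or something else already holds port 80, every bind fails. A quick sanity pass, using the hostnames from the thread:

  $ dig +short wildfalcon.com www.wildfalcon.com   # do the two names share an IP?
  $ ip addr | grep -F 'a.b.c.d'                    # is that IP configured locally? (a.b.c.d = the answer above)
  $ netstat -ltnp | grep ':80 '                    # is something already listening on :80?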
Re: External script
You probably have to monitor the log with a log watching tool and have it run the script. Or use the haproxy socket to monitor and trigger the script. On 4/10/10 7:31 AM, Gullin, Daniel wrote: Yes I know, but I mean that I got active/backup on the webfarm. I have one webserver that is active and one backup webserver. When the first webserver fails and the failover is done to the backup webserver, I need HAProxy to run an external script... My conf: listen x.x.x.x:80 mode http balance roundrobin option ssl-hello-chk server web1 192.168.1.10 check server web2 192.168.2.10 check backup Thanks Daniel 2010/4/10 Bernhard Krieger b...@noremorze.at Hi, you can use keepalived to install an active/passive loadbalancer. Look at this howto: http://www.howtoforge.com/haproxy_loadbalancer_debian_etch_p2 regards Bernhard On 10.04.2010 11:39, Gullin, Daniel wrote: Hi, I'm wondering if it's possible to let HAProxy /execute/ an external script when a failover is done in an active/passive configuration? For example, I use one active and one backup server and I want to execute a script when the active server fails and HAProxy is doing a failover to the backup server... ? Thanks Daniel
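A rough sketch of the socket-monitoring idea; it assumes a stats socket is configured in haproxy, that socat is installed, and that the server and script names below are made up:

  #!/bin/sh
  # fire failover.sh once, the first time server web1 stops reporting UP
  prev=UP
  while sleep 5; do
      cur=$(echo "show stat" | socat stdio /var/run/haproxy.sock | \
            awk -F, '$2 == "web1" { print $18; exit }')
      case "$cur" in
          UP*) prev=UP ;;
          *)   [ "$prev" = "UP" ] && /usr/local/bin/failover.sh
               prev=DOWN ;;
      esac
  done

The status column was field 18 of the show stat CSV in versions of this era; check the header line your build prints before relying on that number.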
Re: Slow loading
On 3/30/10 11:49 PM, Willy Tarreau wrote: On Wed, Mar 31, 2010 at 02:17:37AM -0400, Geoffrey Mina wrote: There was nothing between the two but a switch... although, disabling the Windows firewall on the IIS server seems to have fixed the problem! I don't have much experience with the built in windows firewall... but apparently it's not happy about something. well then either the windows firewall is terribly buggy or the switch is having fun with the TTL (layer3 switch maybe ?), because it is not normal to have the TTL decrease by one if nothing sits between the two machines. I think we'll switch over to a third party firewall application. That's a safer bet :-) Thanks for the help! You guys rock. You're welcome! Willy 2.6.18-164.el5xen: if they are using a domU on Xen then there is either a bridge or other forwarding mechanism on the dom0 routing traffic to the VM. That might be causing the ttl decrement; the default is a bridge and I don't know if bridges normally decrement the ttl. iptables and/or conntrack on the dom0 and/or the domU might be culprits in the disappearing packet? I guess not in this case, but I'd watch them... I turn off iptables completely on the dom0 and domU, esp. when trying to troubleshoot. Some people find slow IO with Xen: http://lists.xensource.com/archives/html/xen-users/2009-11/msg00206.html
Re: Haproxy monitoring with munin
On 1/16/10 5:46 PM, Bart van der Schans wrote: Hi, A few days ago there was some interest in munin plugins for haproxy. I have written a few plugins in perl. The code is fairly straightforward and should be quite easy to adjust to your needs. The four attached plugins are: - haproxy_check_duration: monitors the duration of the health checks per server - haproxy_errors: monitors the rate of 5xx response headers per backend - haproxy_sessions: monitors the rate of (tcp) sessions per backend - haproxy_volume: monitors the bps in and out per backend To use them you'll have to add something like the following to your munin-node config: [haproxy*] user haproxy env.socket /var/run/haproxy.sock The user should have rights to read and write to the unix socket, and env.socket should point to the haproxy stats socket. For debugging, the dump command line option can be quite useful. It prints the complete %hastats data structure containing all the info read from the socket with show stat. I can set up some sourceforge/github thingie which will make it easier to share patches/updates/additions/etc. if people are interested. Regards, Bart I noticed the 1.4.4 version of Munin complains: Service 'haproxy_errors' exited with status 1/0. The normal (non-error) exit paths seem to require exit 0, not exit 1
Re: Question on logging, debug mode
For reference, one of the sites I have is F12, 60-90 million hits/day, 60-120+ Mbps, logging via rsyslogd to 2 different logging servers (with separate error logging turned on), and it runs fine on a Xen VM with 5 GB and 1 VCPU - always below 80% CPU load. Fedora 8 is hundreds of years old, but it along with rsyslog (logging locally or remotely) and haproxy with a reasonable amount of RAM should be fine for anything below facebook level loads. There are a lot of network stack and NIC driver fixes from F8 to F12. You could even do the remote logging via a separate NIC. You say: This is a bit of a problem for us to automate launching of a machine in our environment since we need to edit syslog config files. but if you used syslog-ng you would have to create a customized config file for that, too - so it doesn't seem like less work than adding a few lines to your existing rsyslog config. I'm confused why that creates a problem... Willy - from Fedora 8 on, they have been using rsyslog by default, not sysklogd. rsyslog is similar in goals and features to syslog-ng: http://www.rsyslog.com/doc-rsyslog_ng_comparison.html For instance, it can cache to a local disk if your remote logging server is not reachable. On 3/11/10 10:11 AM, Praveen Patnala wrote: Hi Willy, Thanks for the clarification. On my fedora distribution, if I try to install syslog-ng, I see the following errors and the install fails. yum install syslog-ng Loaded plugins: fastestmirror Loading mirror speeds from cached hostfile * updates-newkey: kdeforge.unl.edu * updates: kdeforge.unl.edu * fedora: kdeforge.unl.edu Setting up Install Process Parsing package install arguments Resolving Dependencies -- Running transaction check --- Package syslog-ng.i386 0:2.0.10-1.fc8 set to be updated -- Processing Dependency: libevtlog.so.0 for package: syslog-ng -- Running transaction check --- Package eventlog.i386 0:0.2.7-1.fc8 set to be updated -- Processing Conflict: rsyslog conflicts syslog-ng -- Finished Dependency Resolution rsyslog-2.0.2-3.fc8.i386 from installed has depsolving problems -- rsyslog conflicts with syslog-ng Error: rsyslog conflicts with syslog-ng Do you have any other suggestion? Thanks, Praveen. On Tue, Mar 9, 2010 at 9:49 PM, Willy Tarreau w...@1wt.eu wrote: On Tue, Mar 09, 2010 at 06:16:07PM -0800, Praveen Patnala wrote: HI Willy, I had a question on the debug mode. I got the syslogd working (using -r options) and we do get the logs in a different location than /var/log/messages by modifying the various syslog config files. This is a bit of a problem for us to automate launching of a machine in our environment since we need to edit syslog config files. If we run haproxy in debug mode and redirect the logs, do you recommend that? Huh, not at all ! Debug mode dumps *all headers*, which is very verbose. It will also not include important information such as IP addresses, timers, flags, etc... As its name implies, debug mode is for debugging. Can you share any information in terms of performance slowdown or stability in this mode in a production environment? It will be very expensive in terms of CPU and storage, and useless for logging. Some people do that on development machines to get full captures of requests and responses, but that's on development machines only. What I really recommend you to do is to install syslog-ng with just one instance dedicated to local UDP logging.
It is very fast, will not interfere at all with existing syslog and will not require any system config change. I regularly recommend this basic configuration : options { sync (0); time_reopen (10); log_fifo_size (10000); long_hostnames (off); use_dns (no); use_fqdn (no); create_dirs (no); keep_hostname (yes); }; source s_udp { udp(ip(127.0.0.1) port(514)); }; destination d_haproxy { file(/var/log/haproxy); }; filter f_local0 { facility(local0); }; log { source(s_udp); filter(f_local0); destination(d_haproxy); }; You can even start it on a non-privileged port, or have one instance per haproxy instance, etc... It's the easiest solution to deploy and probably the cleanest. Regards, Willy
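For completeness, since the thread above runs into the rsyslog/syslog-ng package conflict on Fedora: a roughly equivalent local-UDP setup in stock rsyslog, in the legacy directive syntax of that era (on very old rsyslog the module may be named imudp.so and the snippet belongs in /etc/rsyslog.conf rather than a drop-in file):

  # /etc/rsyslog.d/haproxy.conf
  $ModLoad imudp
  $UDPServerAddress 127.0.0.1
  $UDPServerRun 514
  local0.* /var/log/haproxy.log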
Re: Fwd: Site running slow
You have selinux on, so it may be unhappy with some part of haproxy - the directory it uses, the socket listeners, etc. Turn it off (if you can) until you get everything working ok. Turning it off requires a reboot. To see if it is on: # sestatus google for how to turn it off Also, when you say free mem going down to 45Mb, are you looking at the first line of free or the second line? Ignore the first line, it is designed to cause panic. eg: $ free -m total used free shared buffers cached Mem: 32244 32069 174 0 0 19578 -/+ buffers/cache: 12490 19753 Swap: 4095 0 4095 OMG, I only have 174MB of my 32GB of memory available!?! - no, really 19.75 GB is still available. On your haproxy config, if you log errors separately then you can tail -f that error-only log and watch it as you start up haproxy. And why not do http logging if you are doing http mode? Maybe I am missing something. I would back off the check inter to 30s or so and make it an http check of a file that you know exists, if you can have any static files on your servers. This will allow you to see that haproxy is able to find that file, get a 200 response, and verify that the server really is up and responding fully, not just opening a socket. If you can switch to 1.4rc1 then you get a lot more info about the health check/health status on the stats page, and you can turn on option log-health-checks as an additional aid to troubleshooting. global log 127.0.0.1 local0 log 127.0.0.1 local1 notice #log loghost local0 info option log-separate-errors maxconn 4096 chroot /var/lib/haproxy user haproxy group haproxy daemon # debug #quiet defaults log global mode http # option httplog option dontlognull retries 3 option redispatch maxconn 4096 contimeout 5s clitimeout 30s srvtimeout 30s listen loadbalancer :80 mode http balance roundrobin option forwardfor except 10.0.1.50 option httpclose option httplog option httpchk HEAD /favicon.ico cookie SERVERID insert indirect nocache server WEB01 10.0.1.108:80 cookie A check inter 30s server WEB05 10.0.1.109:80 cookie B check inter 30s listen statistics 10.0.1.50:8080 stats enable stats auth stats:stats stats uri / [BTW, did you do a yum upgrade - not yum update - after your install of F12? yum update misses certain kinds of packaging changes; yum upgrade covers all updates, even if the name of a package changes - yum upgrade should be the default used in yum examples - I ask because many people don't do this and there are many security fixes and other package bug fixes that have been posted] On 2/6/10 6:59 AM, Peter Griffin wrote: Hi Will, Yes X-Windows is installed, but the default init is runlevel 3 and I have not started X for the past couple of days. The video card is an addon card so I rule out shared memory. With regards to eth1, I ran iptraf and can see that there is no traffic on eth1 so I'd rule this out as well. I thought about listening for stunnel requests on eth1 10.0.1.51 and connecting to haproxy on 10.0.1.50, but maybe this will cause more problems... I had already ftp'd a file some 70MB to another machine on the same Vlan and I did not see any problems whatsoever. What I'm planning to do now is to setup the LB in another environment with another 2 Web servers and 1 DB server and stress the hell out of it.
Then I can also test the network traffic using Iperf. Will report back in a few days, thank you once more. On 6 February 2010 14:29, Willy Tarreau w...@1wt.eu wrote: On Sat, Feb 06, 2010 at 01:16:00PM +0100, Peter Griffin wrote: Both http and https. Also, both web servers started to take it in turns to report as DOWN, but more frequently the second one than the first. I ran ethtool eth0 and can verify that it's full-duplex 1Gbps: OK. I'm attaching dmesg, I don't understand most of it. well, it shows some video driver issues, which are unrelated (did you start a graphics environment on your LB ?). It seems it's reserving some memory (64 or 512MB, I don't understand well) for the video. I hope it's not a card with shared memory, as the higher the resolution, the lower the remaining memory bandwidth for normal work. But I don't see any
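A footnote on the SELinux advice above: a full disable (SELINUX=disabled in /etc/selinux/config) does need a reboot, but you can drop to permissive mode at runtime to test whether SELinux is the culprit at all:

  # sestatus         # show the current mode
  # setenforce 0     # permissive: violations are logged to audit.log but allowed
  # setenforce 1     # back to enforcing when done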
Re: Failover to side B without switch back to side A when it's reachable again.
On 1/27/10 9:42 PM, Willy Tarreau wrote: Hi, On Wed, Jan 27, 2010 at 04:15:30PM +0100, Franco Imboden wrote: Hi, I have a question concerning a failover scenario of a webservice where no cookies are supported. as long as nothing happens, all requests should be routed to side A. If side A is not reachable anymore, the proxy should switch to side B. so far, everything works fine. The problem now is, that if side A becomes reachable again, in this specific scenario all requests should still be routed to side B (and not to side A) until side B fails. Is this possible to solve with haproxy without using cookies ? No I don't see any way to do this because it requires an internal variable to keep state of who was alone last. I got this question 6 or 7 years ago, and I wanted to implement a sticky failover, until I realized that it can make sense only in active-backup setups, which are the least common ones. More recently I wanted to implement a feature I called buddy servers. The idea is that a server could designate which one must take its traffic when it's down. Most often this will be a backup server. It might be a good starting point for the feature you need, because we could declare in the first server that even if it's up again, it must not take traffic as long as its buddy is there. Regards, Willy You could do it with a separate health check process. Have it detect when the traffic to its web server stops; from then on it returns down - whatever response haproxy requires to understand the health check as down - and it never changes back to an up response until reset by someone/something.
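A sketch of that latch idea using a plain file served by side A's web server; every path, interval, and URL here is made up:

  # haproxy side: check a latch file instead of the app itself
  backend sides
      option httpchk GET /latch/up.txt
      server sideA 10.0.0.1:80 check
      server sideB 10.0.0.2:80 check backup

  # on side A: the first time the app fails a probe, remove the latch file and
  # never restore it; an operator recreates up.txt to put side A back in rotation
  while sleep 5; do
      curl -sf http://127.0.0.1/app/ping >/dev/null || rm -f /var/www/latch/up.txt
  done

Note that the check on side A then only proves its web server can serve a static file; the watchdog loop is what folds real application health into it.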
Re: [ANNOUNCE] haproxy 1.4-dev6 : many fixes
I wanted to report after using 1.4-dev6 for several sites for a couple days that the results seem very good. One site was peaking at over 150 Mbps and over 65 million hits past couple of days, during that time memory use stayed steady between 1.5-2.5 GB and went down when load went down. On 1/7/10 11:05 PM, Willy Tarreau wrote: Hi all, well, some of you have encountered issues with 1.4-dev5 with sessions left in CLOSE_WAIT state or with memory leaks.
Re: [ANNOUNCE] haproxy 1.4-dev5 with keep-alive :-) memory consumption
Definitely the haproxy process, nothing else runs on there and the older version remains stable for days/weeks: F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD 1 S nobody 15547 1 18 80 0 - 1026097 epoll_ 10:54 ? 00:54:30 /usr/sbin/haproxy14d5 -D -f /etc/haproxy/haproxyka.cfg -p /var/run/haproxy.pid -sf 15536 1 S nobody 20631 1 29 80 0 - 17843 epoll_ 13:48 ? 00:33:37 /usr/sbin/haproxy -D -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -sf 15547 On 1/5/10 11:10 PM, Willy Tarreau wrote: On Tue, Jan 05, 2010 at 11:00:30PM -0800, Hank A. Paulson wrote: Using git 034550b7420c24625a975f023797d30a14b80830 [BUG] stats: show UP/DOWN status also in tracking servers 6 hours ago... I am still seeing continuous memory consumption (about 1+ GB/hr) at 50-60 Mbps even after the number of connections has stabilized: OK. Is this memory used by the haproxy process itself ? If so, could you please send me your exact configuration so that I may have a chance to spot something in the code related to what you use ? A memory leak is something very unlikely in haproxy, though it's not impossible. Everything works with pools which are released when the session closes. But maybe something in this area escaped from my radar (eg: header captures in keep-alive, etc...). 69 CLOSE_WAIT 9 CLOSING 4807 ESTABLISHED 35 FIN_WAIT1 4 FIN_WAIT2 255 LAST_ACK 10 LISTEN 3410 SYN_RECV This one is really impressive. 3410 SYN_RECV basically means you're under a SYN flood, or your network stack is not correctly tuned and you're slowing down your users a lot because they need to wait 3s before retransmitting. Regards, Willy Thanks, we pride ourselves on our huge SYN queue... :)
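For anyone landing here via the SYN_RECV remark: the usual first knobs to look at on a Linux box of that vintage are below; the values are illustrative, not recommendations:

  sysctl -w net.ipv4.tcp_syncookies=1            # survive genuine SYN floods
  sysctl -w net.ipv4.tcp_max_syn_backlog=16384   # room for half-open connections
  sysctl -w net.core.somaxconn=16384             # accept-queue cap (pair with a matching listen backlog)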
Re: [ANNOUNCE] haproxy 1.4-dev5 with keep-alive :-) memory consumption
On 1/4/10 9:15 PM, Willy Tarreau wrote: On Mon, Jan 04, 2010 at 07:05:48PM -0800, Hank A. Paulson wrote: On 1/4/10 2:43 PM, Willy Tarreau wrote: - Maybe this new timeout should have a default value to prevent infinite keep-alive connections. - For this timeout, haproxy could display a warning (at startup) if the value is greater than the client timeout. In fact I think that using http-request by default is fine and even desired. After all, it's the time we accept to keep a connection waiting for a request, which exactly matches that purpose. The ability to have a distinct value for keep-alive is just a bonus. But please do have it as a separate settable timeout for situations like I have on a few servers where 80% or more of the traffic comes from a few IPs and if they have keep alive capability, I am willing to wait a relatively long time (longer than the http request time out) for that connection to send more requests - because the server spent time opening up the tcp window to a good value for decent throughput and I don't want to have to start that process over again unnecessarily. That's an interesting point. So basically you're confirming that we don't want a min() of the two values, but rather an override. Hank, if you're interested in trying keep-alive again, please use snapshot 20100105 from here : http://haproxy.1wt.eu/download/1.4/src/snapshot/ The only suspected remaining issue reported by Cyril seems, after some tests, not to be one after all. I could reproduce the same behaviour, but the close_wait connections were the ones pending in the system which got delayed due to SYN_SENT retries and were processed long after the initial one, and they all eventually resorbed (at least in my situation). And all the cases you reported with stuck sessions and memory increasing seem to be gone right now. Regards, Willy Using git 034550b7420c24625a975f023797d30a14b80830 [BUG] stats: show UP/DOWN status also in tracking servers 6 hours ago... I am still seeing continuous memory consumption (about 1+ GB/hr) at 50-60 Mbps even after the number of connections has stabilized: # w;date;free -m;netstat -ant | fgrep -v connections | fgrep -v Proto | awk '{print $6}' | sort | uniq -c 12:51:16 up 10 days, 19:33, 1 user, load average: 0.00, 0.00, 0.00 USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT root pts/1 10.x.y.z 09:25 0.00s 0.23s 0.00s w Wed Jan 6 12:51:16 WIT 2010 total used free shared buffers cached Mem: 5129 3718 1411 0 94 185 -/+ buffers/cache: 3438 1691 Swap: 0 0 0 69 CLOSE_WAIT 9 CLOSING 4807 ESTABLISHED 35 FIN_WAIT1 4 FIN_WAIT2 255 LAST_ACK 10 LISTEN 3410 SYN_RECV 2493 TIME_WAIT # w;date;free -m;netstat -ant | fgrep -v connections | fgrep -v Proto | awk '{print $6}' | sort | uniq -c 13:40:58 up 10 days, 20:23, 1 user, load average: 0.00, 0.00, 0.00 USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT root pts/1 10.x.y.z 09:25 0.00s 0.26s 0.01s w Wed Jan 6 13:40:58 WIT 2010 total used free shared buffers cached Mem: 5129 4831 298 0 94 185 -/+ buffers/cache: 4550 578 Swap: 0 0 0 86 CLOSE_WAIT 10 CLOSING 4510 ESTABLISHED 40 FIN_WAIT1 7 FIN_WAIT2 390 LAST_ACK 10 LISTEN 3062 SYN_RECV 2256 TIME_WAIT
Re: [ANNOUNCE] haproxy 1.4-dev5 with keep-alive :-)
On 1/4/10 2:43 PM, Willy Tarreau wrote: - Maybe this new timeout should have a default value to prevent infinite keep-alive connections. - For this timeout, haproxy could display a warning (at startup) if the value is greater than the client timeout. In fact I think that using http-request by default is fine and even desired. After all, it's the time we accept to keep a connection waiting for a request, which exactly matches that purpose. The ability to have a distinct value for keep-alive is just a bonus. But please do have it as a separate settable timeout for situations like I have on a few servers where 80% or more of the traffic comes from a few IPs and if they have keep alive capability, I am willing to wait a relatively long time (longer than the http request time out) for that connection to send more requests - because the server spent time opening up the tcp window to a good value for decent throughput and I don't want to have to start that process over again unnecessarily.
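For readers following this thread years later: the override being discussed did land in 1.4 as a distinct timeout. A sketch of the resulting knobs (values illustrative):

  defaults
      mode http
      option http-server-close
      timeout http-request    10s   # time allowed for a complete request to arrive
      timeout http-keep-alive 75s   # separate idle time allowed between requests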
Re: anyone doing this: internet - http proxy with KA - haproxy?
I tried the version from git and it worked ok - not sure if the client side has KA running, so it may not help the problem. The throughput was fine and stayed high during the run, but the memory use increase was linear until all RAM was consumed - recompiled again and tried with splice-auto on to see if that would help, and same result after about 3 hours: all RAM used. On 1/2/10 3:34 PM, Willy Tarreau wrote: [ for an unknown reason, this mail failed to reach the list, trying again ] Hi Hank, I should have read the ML before responding to you privately :-) On Sat, Jan 02, 2010 at 02:24:27AM -0800, Hank A. Paulson wrote: I have a site with 90% of the traffic from a few client IPs that are 300ms or so away, their gateway software doesn't seem to be dealing with thousands of connections very well and we can't take advantage of large tcp windows because the connection is over after one response. So I am thinking of trying to put something that will maintain a few long connections with that far away client IP and see if that improves things. Anyone have any suggestions for http proxies with keep alive that I can put in front of haproxy? Anyone doing this? config suggestions? you can simply download the very latest snapshot (not yet available in the snapshot directory, you'll have to extract it from GIT) : http://haproxy.1wt.eu/git?p=haproxy.git;a=snapshot;sf=tgz Then replace option httpclose with option http-server-close and you'll have keep-alive on the client side. It also supports pipelining, which further reduces latency when your clients support it too. Best regards, Willy
anyone doing this: internet - http proxy with KA - haproxy?
I have a site with 90% of the traffic from a few client IPs that are 300ms or so away, their gateway software doesn't seem to be dealing with thousands of connections very well and we can't take advantage of large tcp windows because the connection is over after one response. So I am thinking of trying to put something that will maintain a few long connections with that far away client IP and see if that improves things. Anyone have any suggestions for http proxies with keep alive that I can put in front of haproxy? Anyone doing this? config suggestions? Thanks for any info.
Re: at what point does haproxy begin queueing requests for a backend?
On 10/19/09 12:46 PM, Willy Tarreau wrote: On Mon, Oct 19, 2009 at 12:25:00PM -0700, Hank A. Paulson wrote: at what point does haproxy begin queuing requests for a backend? Is it at sum(maxconn * wght) or sum(maxconn) ignoring wght ^^^ This one precisely or is it at fullconn even if fullconn is less than the above 2 values or some other point? No, fullconn determines when maxconn is applied in case of dynamic limitation (when using minconn too), don't use this unless you know you want it. It's too hard to explain how it works, and it seems even harder to understand :-) Under what circumstances does haproxy queue directly to a certain server for a given backend vs a global queue for that whole backend, if ever? It does so when the request contains a cookie indicating it must be processed by that server and not by any other one. The backend's queue is dequeued when a server releases a connection and finds a connection in the backend queue that has been there for more time than the next connection in its own queue. This ensures very fair queuing. So then, in the case of a system with cookies, you could have a request with an existing cookie wait longer than a request that is new? a connection in the backend queue that has been here for more time than the next connection in its own [a given server for that backend] queue. You can take a look at that diagram for more information : http://haproxy.1wt.eu/download/1.3/doc/queuing.pdf Thanks, I had seen that before but wasn't sure exactly what calculated value triggered the queuing. But that diagram brings up another question: If your maxqueue is 8192, then the queue for a backend and the individual server queues can each be 8192? And at what level does the global queue sit - is it truly global, or per backend? Regards, Willy
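A small sketch tying the keywords above together; the numbers are made up for illustration:

  backend app
      fullconn 1000                 # only meaningful when servers also set minconn
      server s1 10.0.0.1:80 minconn 50 maxconn 200 maxqueue 100 check
      server s2 10.0.0.2:80 minconn 50 maxconn 200 maxqueue 100 check
  # with no cookie pinning, requests queue on the backend once every server is at
  # its effective maxconn; a cookie pinning a request to s1 puts it in s1's own
  # queue (bounded by maxqueue) instead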
Re: dynamic weights based on actual server load
For the code you are developing, if you make the interface general enough so that parameters can be added or removed that would be good. Telnet/text/memcached style protocols seem popular to allow easy debugging/monitoring. So if your protocol says a machine has to send a load info bundle like: SS:8cbed340118ddf87e2d8ca4352006572 SYSID: blah1 SAMPLETIME: 2009-10-14-22-00-03 CPU: 83.23343455 NETI: 134238.0232 NETO: 492283.6549 DISK: 433.232 ES:8cbed340118ddf87e2d8ca4352006572 that would give you a generic record format that anyone could create a client or modify your client and add/remove load parameter info fields. Then they just take their list of load parameters and wrap them in the header and footer with the id of that sample record being the md5sum of the included data. SS = start sample record/ES = end sample record Then you could have a separate/pluggable module or process that takes that info and maybe pulls other data from system history files, etc and decides what to set the weight to for that server. You could provide a default weight setter engine that uses a simple algo based on just CPU load or something and others could fill in more complex/custom engines if desired. You might want to check out feedbackd: http://ozlabs.org/~jk/projects/feedbackd/ and his paper on Using Dynamic Feedback to Optimise Load Balancing Decisions http://redfishsoftware.com.au/projects/feedbackd/lca-paper.pdf to get some ideas. It is probably possible to just modify feedbackd to emit haproxy set weight commands. More interesting, I think would be to combine a multiple load parameter (active connections, CPU, net in/out bytes, net in/out packets, disk io, etc) feedback system with the ideas from the NetBSD neural network scheduler, creating an ai based dynamic load balancing system. http://softlayer.dl.sourceforge.net/project/nnsched/docs/thesis/nnsched.pdf This is more possible now that we have multi core systems that would have some idle CPU resources available for the ai compute load. On 10/16/09 10:29 AM, Craig wrote: Hi, a patch (set weight/get weight) I imagined some days ago was integrated just 6hrs after I had thought about it (Willy must be reading them!). I've written a simple (exchangable) interface that prints out a servers load and a client to read it. I plan to read the load from all servers and adjust the weight dynamically according to the load so that a very busy server gets less queries and the whole farm is more balanced. I plan to smoothen the increasing/decreasing a bit so that there aren't too great jumps with the weight, I want to implement a policy of something like oh that server can do 50% more, lets just increase the weight by 25% and check again in a minute. I hope this will autobalance servers with different hardware quite well, so that you don't have to guess or do performance tests to get the weights properly. Some python code is already finished (partly because I'd like to practise a bit) but I didn't continue yet, because I'd like to hear your opionions about this. Am I mad? ;) Best Regards, Craig
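For reference, the two socket commands that patch set introduced, assuming a backend named www_farm, a server named web1, and a stats socket at /var/run/haproxy.sock (all names are placeholders):

  $ echo 'get weight www_farm/web1' | socat stdio /var/run/haproxy.sock
  $ echo 'set weight www_farm/web1 25' | socat stdio /var/run/haproxy.sock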
Re: 502 errors continue
On 10/7/09 3:21 PM, Ben Fyvie wrote: So what our problem really comes down to is why doesn't mongrel quietly stop receiving requests after monit issues the initial kill. (FYI - it is our understanding that calling mongrel stop also issues a kill command so there is no nicer way to ask it to shut down) I think you are confusing monit and the script that monit runs when doing a mongrel stop in response to a threshold exceeded condition. Your monitrc file has something like: check mongrelX with pid... start=/some/shell/script.sh stop=/some/shell/script.sh if bad things happen then restart blah blah You could change that shell script to do a kill -9 immediately rather than being nice. But you'll still get a 50X if the mongrel was in the middle of a request. Another way, if you are doing such frequent health checks, is to have the mongrel stop script set a page/file that tells haproxy that the server is now down (described in haproxy mailing list discussions), then wait a couple seconds before issuing the first nice kill command - thus letting haproxy stop sending requests and allowing that last request to finish (depending on how long your avg request response takes on your mongrels when they are bloated - if it takes 10 seconds for a mongrel to complete a response when it is using a ton of memory then you are probably not going to be able to get all the timing to work out).
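A sketch of such a drain-then-kill stop script; every path and delay below is hypothetical:

  #!/bin/sh
  # 1. fail haproxy's httpchk first, so no new requests arrive
  rm -f /var/www/health/up.txt
  sleep 5                                  # a couple of check intervals, to drain
  # 2. then stop the mongrel, politely first
  kill "$(cat /var/run/mongrel.pid)"
  sleep 5
  kill -9 "$(cat /var/run/mongrel.pid)" 2>/dev/null   # only if it ignored the first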
Re: no session failover on cluster
On 9/21/09 1:28 PM, Stefan wrote: Hello On Monday 21 September 2009 20:59:01, Hank A. Paulson wrote: I think you are getting a new cookie because of how your openais/etc is failing over, not haproxy. haproxy doesn't create the cookie, it just passes it along. drbd/openais doesn't have a way to maintain the same tcp connection id across failovers (AFAIK), so when you fail over you get a new tcp connection. You have to ask the openais/pacemaker/etc experts if the php sessions/cookies are being replicated across all the servers and if so, how phpmyadmin would get the list of cookies from the failed server. I understood that haproxy can handle session failover. Am I wrong about this? tia stefan Let's say the app (phpmyadmin) on server A issues a cookie with a value of phpcookie42132334595. haproxy forwards that cookie back and forth with each request. All is good. later that same day... Server A fails; haproxy forwards the cookie phpcookie42132334595 to server B, which has taken over the IP address of server A. Server B says to the browser: I have never heard of cookie phpcookie42132334595 or a session with that cookie. haproxy can't fix that. You say the php or app level session data is there on the drbd device and that it is still there after failover. So you need to make sure haproxy is still delivering the cookie after failover - you can log the cookie using the examples from the manual: capture request header Cookie len 512 Confirm that it is delivering the cookie on the _first_ request delivered to the new server (because after the first request, the new server will probably issue a new cookie if it can't find the old session). It probably is working, so the next step is to check why the php/app on server B is not seeing the failed-over cookies/sessions that are living on the drbd device. Do you have two servers running haproxy with the other stuff running locally on those same servers?
Re: no session failover on cluster
I think you are getting a new cookie because of how your openais/etc is failing over, not haproxy. haproxy doesn't create the cookie, it just passes it along. drbd/openais doesn't have a way to maintain the same tcp connection id across failovers (AFAIK), so when you fail over you get a new tcp connection. You have to ask the openais/pacemaker/etc experts if the php sessions/cookies are being replicated across all the servers and if so, how phpmyadmin would get the list of cookies from the failed server. On 9/21/09 2:46 AM, Stefan wrote: Hello all :) I have pacemaker, openais, drbd, mysql, haproxy running but don't get the session failover working. Haproxy also runs on the cluster; I want to access phpmyadmin without having to re-login when there is a failover. I set the session path in php.ini to a directory on the drbd device. The session data is there. Also after failover. But I have to re-login when haproxy fails over and I get a new cookie. my config: global log 127.0.0.1 local0 log 127.0.0.1 local1 notice #log loghost local0 info maxconn 4096 #debug #quiet user haproxy group haproxy defaults log global mode http option httplog option dontlognull retries 3 redispatch maxconn 2000 contimeout 5000 clitimeout 50000 srvtimeout 50000 listen webfarm 10.100.100.200:80 mode http #cookie webfarm insert stats enable stats auth someuser:somepassword balance roundrobin cookie cluster-phpmyadmin prefix option httpclose option forwardfor option httpchk HEAD /check.txt HTTP/1.0 server cluster 10.100.100.200:81 cookie cluster check can someone help? tia stefan
Re: clarification on hdr* matching?
On 9/17/09 1:17 PM, David Birdsong wrote: I'm having trouble with the syntax of the hdr* matching for creating ACLs. I know that my control flow is correct: by changing the acl definition to a simple src ip, I get the desired change of backends. i.e. this will send traffic to img backends (desired): # acl from_vnsh hdr_sub -i nginx acl from_vnsh src 208.94.1.44 use_backend varnish if METH_GET ! from_vnsh default_backend img this will send it to varnish: acl from_vnsh hdr_sub -i nginx # acl from_vnsh src 208.94.1.44 use_backend varnish if METH_GET ! from_vnsh default_backend img I think the problem is you shouldn't have the METH_GET in there; use_backend varnish if ! from_vnsh works for me. furthermore, is there a way to match on a header field and not the field's value? I have a field: X-Varnish: 751121622 and I want to create the acl simply based on X-Varnish being present, irrespective of the value of this field. acl varn_present hdr_cnt(X-Varnish) gt 0
Re: reqidel in backend not working 1.3.20?
Yes, I have httpclose everywhere since I forgot it once. Doing some testing, it seems to freak out when I do reqidel Accept or something like that; I made a bunch of changes and now it is working, but there is something about one of the Accept regexes that it didn't seem to like. I am not sure why. On 9/14/09 12:07 AM, Jean-Baptiste Quenot wrote: Did you set httpclose?
reqidel in backend not working 1.3.20?
I have: reqidel ^Cookie or reqidel ^Cookie: or reqidel ^Cookie:\ or reqidel ^Cookie:.* in one backend, but the requests are arriving at the servers in that backend with Cookie headers. I tried the following: * I recompiled with and without PCRE * changed the reqidel lines with and without ^, :\ , etc * checked for stray weird characters with od -c I can't figure out why it is not working - should it work? Platform is Linux x86_64
Re: HAProxy randomly returning 502s when balancing IIS
It was .18, then .19, and I just switched to .20 a couple days ago, so I will check later tonight. On 8/31/09 12:01 PM, Willy Tarreau wrote: Hello, On Mon, Aug 31, 2009 at 11:30:16AM -0700, Hank A. Paulson wrote: I was also getting a lot of 502s from Varnish; I had hoped that by ignoring them they would go away. I don't want to check if they are still there because I think it might be too soon. which version ? One such issue was found and fixed in 1.3.20. Also, in Miguel's case, it happens that some link load-balancing was causing packets to be reordered and the server on the other side to send an RST too fast (it closes then sends an RST, but the RST is received before the last packet due to the link LB). Willy
Re: Balancing bytes in/out
FYI, I think the default unit for times in haproxy is ms not seconds, so these are not correct, AFAIK: clitimeout 60000 is 60 seconds, not 16.6 Hrs.; srvtimeout 30000 is 30 seconds, not 8.33 Hrs.; contimeout 4000 is 4 seconds, not 1.11 Hrs. On 8/12/09 10:22 PM, Nelson Serafica wrote: clitimeout 60000 # 16.6 Hrs. srvtimeout 30000 # 8.33 Hrs. contimeout 4000 # 1.11 Hrs.
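A closing footnote for this archive: explicit unit suffixes, already visible elsewhere in these threads as 5s and 30s values, sidestep the ms-vs-seconds confusion entirely. A sketch mirroring what the poster apparently intended:

  defaults
      contimeout 4s     # 4000 ms
      clitimeout 60s    # 60000 ms
      srvtimeout 30s    # 30000 ms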