RE: CPU spike after restarting with -sf pid
That's sound advice. We'll upgrade soon and see whether anything changes. Is the latest snapshot really what you recommend for running in production? I was nervous about using a dev release, but we desperately needed support for the proxy protocol.

On Fri, Jun 07, 2013 at 03:16 PM, Lukas Tribus <luky...@hotmail.com> wrote:
> Hi Malcolm,
>
>> This works but we find that the new haproxy process uses a lot of CPU
>> (100% of one core) for about 20 seconds after the restart. During this
>> time it looks as if various queues fill up and haproxy logs fewer
>> requests than normal. Once the CPU load drops we get a surge of
>> requests (which causes a spike in connections to our db, and a raft of
>> other problems if we do this during heavy traffic).
>
> I strongly suggest updating your code to a more recent release; there
> have been a lot of bug fixes since dev15. Grab dev18 or, better yet, the
> latest snapshot. It doesn't make sense to start troubleshooting on those
> obsolete releases, especially in the development branch.
>
> Lukas
CPU spike after restarting with -sf pid
We restart our haproxy instances with the following command line:

    haproxy -f /etc/haproxy.cfg -p /var/run/haproxy-private.pid -D -sf <contents of pid file>

This works, but we find that the new haproxy process uses a lot of CPU (100% of one core) for about 20 seconds after the restart. During this time it looks as if various queues fill up and haproxy logs fewer requests than normal. Once the CPU load drops we get a surge of requests (which causes a spike in connections to our db, and a raft of other problems if we do this during heavy traffic).

I've read about -sf and -st in the man page, but I'd like to understand more about what happens when starting haproxy with one of these options. For example, one theory we have is that the old process immediately closes its connections and stops listening for new connections. This would cause all of our clients (browsers) to create a new connection for their next HTTP request, which might force haproxy to do a lot of CPU work for each new connection (we are using haproxy 1.5-dev15 to do SSL in haproxy). Is this plausible? Is there anything else we should investigate?
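The soft-reload command above can be sketched end to end. This is only an illustration of how the -sf argument is built from the old process's pid; haproxy itself is not invoked, and the /tmp path is a stand-in for the real pid file:

```shell
# Sketch of the soft-reload sequence described above: -sf receives the
# pid(s) of the old process, which the new process asks to finish up and exit.
PIDFILE=/tmp/haproxy-private.pid   # stand-in for /var/run/haproxy-private.pid
echo "12345" > "$PIDFILE"          # pretend an old haproxy wrote its pid here
CMD="haproxy -f /etc/haproxy.cfg -p $PIDFILE -D -sf $(cat "$PIDFILE")"
echo "$CMD"
```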
Re: ACLs that depend on cookie values
On Tue, May 8, 2012 at 1:24 AM, Willy Tarreau <w...@1wt.eu> wrote:
> Hi Malcolm,
>
> On Mon, May 07, 2012 at 06:19:36PM -0700, Malcolm Handley wrote:
>> I'd like to write an ACL that compares the integer value of a cookie
>> with a constant. (My goal is to be able to block percentiles of our
>> users if we have more traffic than we can handle, so I want to block a
>> request if the cookie's value is, say, less than 25.) I understand that
>> I can do something like hdr_sub(cookie) -i <regular expression>, but
>> that doesn't let me treat the value as an integer and compare it. I
>> also know about hdr_val(header), but that gives me the entire value of
>> the cookie header, not just the value of a particular cookie. Is there
>> any way that I can do this?
>
> In the next snapshot I hope to be able to push today, there is a new
> "cookie" pattern fetch method which brings a number of cook_* ACL
> keywords. It does not have cook_val at the moment, but I can check if
> that's hard to add or not.

cook_val sounds great if you happen to add that. How long do snapshots take to become the stable version, generally? We've had some outages (nothing to do with haproxy, which works great) and we definitely don't want to put bleeding-edge code into production at the moment.

> In the mean time, I think that if you manage to rewrite your cookie
> header to replace it with a header holding only the value, it might
> work, though it's dirty and quite tricky.

This is a great suggestion. Can you confirm that header rewriting happens before other calls to hdr_val? (Do the commands happen in order?)

(One thing that's great about this is that it would also let me avoid creating a new header. My goal is to write an ACL of the form [block if cook_value(user_id) % 1000 < 250], but ACLs don't support much math. Your suggestion would get around this.)

> Instead, with regex you can actually match integer expressions. It's
> just a bit complicated, but doable. For instance, a value below 25
> might be defined like this (not tested right now but you get the idea):
>
>     COOK=([0-9]|1[0-9]|2[0-4])([^0-9]|$)
>
> I've been doing this for a long time to extract requests by response
> times in logs, until I got fed up and wrote halog.

Yeah, I thought of this too. I know that I could do it, but we are creating a tool to use in emergencies and I'd be frightened of messing it up in some small but important way. :-)

Thanks for the help.
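Willy's "value below 25" regex can be sanity-checked quickly with grep -E. This assumes the cookie is literally named COOK, as in his sketch:

```shell
# Check the "value below 25" regex: single digits, 1x, and 20-24 match,
# while 25 and longer numbers like 240 must not.
RE='COOK=([0-9]|1[0-9]|2[0-4])([^0-9]|$)'
echo "COOK=24; other=x" | grep -Eq "$RE" && echo "24 matches"
echo "COOK=25; other=x" | grep -Eq "$RE" || echo "25 does not match"
echo "COOK=240"         | grep -Eq "$RE" || echo "240 does not match"
```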
Re: ACLs that depend on cookie values
Oh, one more question: if I use reqrep to modify the Cookie header, that's going to destroy the original header, I suspect, which would cause problems for the web server that wants to read those cookies. Is there any way around that?

On Wed, May 9, 2012 at 3:51 PM, Malcolm Handley <malc...@asana.com> wrote:
> [full quote of the previous message trimmed]
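For what it's worth, haproxy versions newer than the ones discussed here (1.6+) can sidestep the problem entirely: the value of a single cookie can be copied into a scratch header without rewriting the Cookie header at all. A sketch, where the cookie name user_id comes from the thread and the header name X-User-Id is made up:

    # haproxy 1.6+ sketch (not available in the 1.5-dev builds discussed
    # here): copy just the user_id cookie's value into a new header,
    # leaving the original Cookie header intact for the web server.
    http-request set-header X-User-Id %[req.cook(user_id)]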
ACLs that depend on cookie values
I'd like to write an ACL that compares the integer value of a cookie with a constant. (My goal is to be able to block percentiles of our users if we have more traffic than we can handle, so I want to block a request if the cookie's value is, say, less than 25.) I understand that I can do something like hdr_sub(cookie) -i <regular expression>, but that doesn't let me treat the value as an integer and compare it. I also know about hdr_val(header), but that gives me the entire value of the cookie header, not just the value of a particular cookie. Is there any way that I can do this?
Why would haproxy send a request to a server that is down?
Hi, everyone. I'm having some trouble with the routing of requests to servers within a backend.

Firstly, although I have "retries 3" in the defaults section of my config file, I'm not seeing any evidence of retries. If a server is down but has not been detected as down by haproxy, then a request may still get sent to it and a failure returned to the client. (This is the first bold line in the log below.)

Second, occasionally haproxy seems to route a request to a server that it knows is down. (This is the second bolded section below.)

I could understand both of these if I were using cookies for routing and had not enabled redispatching. But I'm using "balance leastconn" with no mention of cookies in the config file. What else might I be doing that would force haproxy to use a downed backend and not retry requests?

May 14 04:03:03 prod_lb0 haproxy[28398]: 67.112.125.46:61758 [14/May/2010:04:02:44.517] ws_in ws_in/NOSRV -1/-1/-1/-1/18609 400 187 - - CR-- 7/7/0/0/0 0/0 BADREQ
May 14 04:03:17 prod_lb0 haproxy[28398]: *127.0.0.1:58921 [14/May/2010:04:03:08.486] ws_in lists_ws/ws_2 51/0/0/-1/9049 502 204 - - SH-- 7/7/2/1/0 0/0 GET /-/ping HTTP/1.1*
May 14 04:03:17 prod_lb0 haproxy[28074]: 67.112.125.46:59661 [14/May/2010:03:04:22.220] ws_in lists_ws/ws_2 55/0/0/31/3535558 101 27943453 - - 2/2/2/2/0 0/0 GET /app/etherlist/socket?session_id=9680378170&profiler=1 HTTP/1.1
May 14 04:03:17 prod_lb0 haproxy[28398]: 99.66.213.198:58624 [14/May/2010:03:52:18.455] ws_in lists_ws/ws_2 455/0/0/2/659334 101 3348145 - - 6/6/1/0/0 0/0 GET /app/etherlist/socket?session_id=9385884222&profiler=1 HTTP/1.1
May 14 04:03:17 prod_lb0 haproxy[28074]: 98.210.108.197:43537 [14/May/2010:03:04:22.477] ws_in lists_ws/ws_2 165/0/0/2/3535454 101 8809800 - - 1/1/1/1/0 0/0 GET /-/socket?session_id=9646753365 HTTP/1.1
May 14 04:03:18 prod_lb0 haproxy[28074]: 70.36.139.123:56872 [14/May/2010:03:04:22.511] ws_in lists_ws/ws_2 88/0/0/2/3535503 101 25165709 - - 0/0/0/0/0 0/0 GET /-/socket?session_id=9669242145 HTTP/1.1
May 14 04:03:21 prod_lb0 haproxy[28398]: *Server lists_ws/ws_2 is DOWN, reason: Layer4 connection problem, info: Connection refused, check duration: 0ms.*
May 14 04:03:23 prod_lb0 haproxy[28398]: *127.0.0.1:58965 [14/May/2010:04:03:20.475] ws_in lists_ws/ws_2 10/0/-1/-1/3032 503 212 - - SC-- 8/8/2/0/3 0/0 GET /-/ping HTTP/1.1*
May 14 04:03:29 prod_lb0 haproxy[28398]: *Server lists_ws/ws_2 is UP, reason: Layer4 check passed, check duration: 0ms.*
May 14 04:03:34 prod_lb0 haproxy[28398]: 99.66.213.198:59005 [14/May/2010:04:02:43.444] ws_in ws/ws_1 15/0/1/30/50569 304 296 - - cD-- 9/9/4/2/0 0/0 GET /-/static/luna/browser/images/loading.gif HTTP/1.1
Re: Why would haproxy send a request to a server that is down?
Thanks for the fast and helpful replies, Willy. I hadn't realized that a request was connected to a server even before the server had responded successfully. This all makes sense now. I'll try setting "option redispatch"; I assume that will solve my problems. In that case I won't have any need to force an early redispatch if the server state changes, though I guess it would make things slightly faster.

On Fri, May 14, 2010 at 2:30 PM, Willy Tarreau <w...@1wt.eu> wrote:
> Hi Malcolm,
>
> On Fri, May 14, 2010 at 12:07:53PM -0700, Malcolm Handley wrote:
>> Hi, everyone. I'm having some trouble with the routing of requests to
>> servers within a backend. Firstly, although I have "retries 3" in the
>> defaults section of my config file, I'm not seeing any evidence of
>> retries. If a server is down but has not been detected as down by
>> haproxy, then a request may still get sent to it and a failure
>> returned to the client. (This is the first bold line in the log below.)
>
> Yes, this is expected if your retries value is not large enough to
> cover the time to detect that the server is down. Also, the retries are
> only performed on the same server. If you want the request to be
> redispatched to another server after the last attempt, you should use
> "option redispatch".
>
>> Second, occasionally haproxy seems to route a request to a server that
>> it knows is down. (This is the second bolded section below.)
>
> No, if you look more closely, you'll see that the request was received
> at 04:03:20.475, *before* the server was marked down (04:03:21), and
> failed its last attempt at 04:03:23. Since the request did not switch
> to another server on the last retry, I think you did not have "option
> redispatch" enabled.
>
>> I could understand both of these if I were using cookies for routing
>> and had not enabled redispatching. But I'm using "balance leastconn"
>> with no mention of cookies in the config file. What else might I be
>> doing that would force haproxy to use a downed backend and not retry
>> requests?
>
> Well, be careful: retries are always performed on the same server,
> except the last one, which can be redispatched. I will study whether we
> could force an early redispatch in case the server changes state during
> retries, but there's nothing certain in this area.
>
> Regards,
> Willy
>
>> [quoted log extract trimmed; see the original message above]
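Putting Willy's advice into config form, a minimal sketch (only the backend name lists_ws and "balance leastconn" come from the thread; the server lines and addresses are illustrative):

    # Sketch of the fix discussed above: keep retries, and let the last
    # retry be redispatched to another live server instead of failing.
    defaults
        mode http
        retries 3
        option redispatch

    backend lists_ws
        balance leastconn
        # addresses are made up for the example
        server ws_1 10.0.0.1:8080 check
        server ws_2 10.0.0.2:8080 check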
Not all requests getting logged
I'm having a problem with an haproxy setup where not all of the requests are getting logged (even in debug mode). Specifically, I have an ajax app that periodically POSTs to the server to find out about changes. I know that these requests are going to the proxy, because if I kill the proxy the requests start failing. However, these requests are not logged to syslog or printed to the console when haproxy is run in debug mode. Nor are they shown in the stats.

But the requests *are* sent to my web server and the responses are forwarded back to the client, just without the addition of the cookie to indicate which server should receive the next request from this client. This is happening with 1.3.23 and 1.4.

I'm still scouring the docs and the code trying to figure out what would cause this, but any pointers would be great.
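One plausible explanation (a guess, not confirmed in this thread): haproxy of this era analyzes only the first request of a keep-alive connection in HTTP mode and then switches to tunnel mode, so follow-up requests on the same connection are forwarded but never logged, counted in the stats, or given the persistence cookie. If that is the cause, forcing one request per connection should make every request visible:

    # Hedged sketch: close the connection after each request so every
    # request is analyzed (and therefore logged and cookie-tagged).
    defaults
        mode http
        option httpclose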