RE: CPU spike after restarting with -sf pid

2013-06-10 Thread Malcolm Handley
That's sound advice. We'll upgrade soon and see whether anything changes.

Is the latest snapshot really what you recommend for running in production? I 
was nervous about using a dev release but we desperately needed support for the 
proxy protocol.


On Fri, Jun 07, 2013 at 03:16 PM, Lukas Tribus luky...@hotmail.com wrote:
 Hi Malcolm,
 
 
  This works but we find that the new haproxy process uses a lot of cpu
  (100% of one core) for about 20 seconds after the restart. During this
  time it looks as if various queues fill up and haproxy logs fewer
  requests than normal. Once the cpu load drops we get a surge of
  requests (which cause a spike in connections to our db, and a raft of
  other problems if we do this during heavy traffic).
 
 I strongly suggest updating your code to a more recent release; there
 have been a lot of bug fixes since dev15.
 
 Grab dev18 or better yet, the latest snapshot.
 
 
 It doesn't make sense to start troubleshooting on those obsolete releases,
 especially in the development branch.
 
 
 Lukas   
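
For reference, a quick way to confirm which build is actually running after
upgrading is haproxy's standard version flag:

    # Prints the version, release date and compiled-in options (useful to
    # confirm the binary in use supports what you expect).
    haproxy -vv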

CPU spike after restarting with -sf pid

2013-06-07 Thread Malcolm Handley
We restart our haproxy instances with the following command line:

haproxy -f /etc/haproxy.cfg -p /var/run/haproxy-private.pid -D -sf <contents of pid file>

This works but we find that the new haproxy process uses a lot of cpu
(100% of one core) for about 20 seconds after the restart. During this
time it looks as if various queues fill up and haproxy logs fewer
requests than normal. Once the cpu load drops we get a surge of
requests (which cause a spike in connections to our db, and a raft of
other problems if we do this during heavy traffic).

I've read about -sf and -st in the man page but I'd like to understand
more about what happens when starting haproxy with one of these
options.
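
For reference, a minimal sketch of the two variants as the man page describes
them (the pid file path is reused from the command above):

    # -sf: the new process tells the old ones to stop listening and to exit
    #      once their existing connections have finished ("soft" reload).
    haproxy -f /etc/haproxy.cfg -p /var/run/haproxy-private.pid -D \
            -sf $(cat /var/run/haproxy-private.pid)

    # -st: the old processes are asked to terminate immediately, dropping
    #      any connections they were still serving ("hard" reload).
    haproxy -f /etc/haproxy.cfg -p /var/run/haproxy-private.pid -D \
            -st $(cat /var/run/haproxy-private.pid)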

For example, one theory we have is that the old process immediately
closes its connections and stops listening for new connections. This
would cause all of our clients (browsers) to create a new connection
for their next HTTP request, which might force haproxy to do a lot of
CPU work for each new connection (we are using haproxy 1.5-dev15 to
terminate SSL in haproxy).

Is this plausible? Is there anything else we should investigate?
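
One hedged way to investigate: watch the connection counters on the new
process across a reload; a burst of new connections right after the switch
would support the reconnect theory above. This assumes a stats socket is
configured (the socket path below is a placeholder):

    # In haproxy.cfg, global section:
    #     stats socket /var/run/haproxy.sock
    # Then poll the process-wide counters while reloading:
    while true; do
        echo "show info" | socat stdio unix-connect:/var/run/haproxy.sock \
            | grep -E '^(Pid|Process_num|CurrConns):'
        sleep 1
    done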



Re: ACLs that depend on cookie values

2012-05-09 Thread Malcolm Handley
On Tue, May 8, 2012 at 1:24 AM, Willy Tarreau w...@1wt.eu wrote:
 Hi Malcolm,

 On Mon, May 07, 2012 at 06:19:36PM -0700, Malcolm Handley wrote:
 I'd like to write an ACL that compares the integer value of a cookie
 with a constant. (My goal is to be able to block percentiles of our
 users if we have more traffic than we can handle, so I want to block a
 request if the cookie's value is, say, less than 25.)

 I understand that I can do something like
     hdr_sub(cookie) -i regular expression
 but that doesn't let me treat the value as an integer and compare it.

 I also know about
     hdr_val(header)
 but that gives me the entire value of the cookie header, not just the
 value of a particular cookie.

 Is there any way that I can do this?

 In the next snapshot I hope to be able to push today, there is a new
 cookie pattern fetch method which brings a number of cook_* ACL keywords.
 It does not have cook_val at the moment, but I can check if that's hard
 to add or not.

Cook_val sounds great if you happen to add that. How long do snapshots
take to become the stable version, generally? We've had some outages
(nothing to do with haproxy, which works great) and definitely don't
want to put bleeding-edge code into production at the moment.

 In the mean time, I think that if you manage to rewrite your cookie header
 to replace it with a header holding only the value, it might work, though
 it's dirty and quite tricky.

This is a great suggestion. Can you confirm that header rewriting
happens before other calls to hdr_val? (Do the commands happen in
order?) (One thing that's great about this is it would also let me
avoid creating a new header. My goal is to write an ACL of the form
[block if cook_value(user_id) % 1000 < 250] but ACLs don't support
much math. But your suggestion would get around this.)
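
If a cook_val() fetch does get added, a hedged sketch of the simple threshold
case (the cookie name "user_id" and the cut-off come from the examples above;
the modulo step is left out because ACLs offer no arithmetic):

    acl low_percentile cook_val(user_id) lt 25
    block if low_percentile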

 Instead, with regex you can actually match integer expressions, it's just
 a bit complicated but doable. For instance, a value below 25 might be
 defined like this (not tested right now but you get the idea) :

      COOK=([0-9]|1[0-9]|2[0-4])([^0-9]|$)

 I've been doing this for a long time to extract requests by response times
 in logs until I got fed up and wrote halog.
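
For illustration, a hedged, untested sketch of wiring that regex into an ACL
today; the cookie name "user_id" and the regex-matching hdr_reg() criterion
are assumptions for the example (a stricter pattern would also anchor the
cookie name):

    acl user_id_low hdr_reg(cookie) user_id=([0-9]|1[0-9]|2[0-4])([^0-9]|$)
    block if user_id_low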

Yeah. I thought of this too. I know that I could do it but we are
creating a tool to use in emergencies and I think that I'd be
frightened of messing it up in some small but important way. :-)

Thanks for the help.



Re: ACLs that depend on cookie values

2012-05-09 Thread Malcolm Handley
Oh, one more question: if I use reqrep to modify the Cookie header,
I suspect that will destroy the original header, which would cause
problems for the web server that needs to read those cookies. Is
there any way around that?



ACLs that depend on cookie values

2012-05-07 Thread Malcolm Handley
I'd like to write an ACL that compares the integer value of a cookie
with a constant. (My goal is to be able to block percentiles of our
users if we have more traffic than we can handle, so I want to block a
request if the cookie's value is, say, less than 25.)

I understand that I can do something like
hdr_sub(cookie) -i regular expression
but that doesn't let me treat the value as an integer and compare it.

I also know about
hdr_val(header)
but that gives me the entire value of the cookie header, not just the
value of a particular cookie.

Is there any way that I can do this?



Why would haproxy send a request to a server that is down?

2010-05-14 Thread Malcolm Handley
Hi, everyone.

I'm having some trouble with the routing of requests to servers within a
backend.

Firstly, although I have retries 3 in the defaults section of my config
file I'm not seeing any evidence of retries. If a server is down but has not
been detected as down by haproxy then a request may still get sent to it and
a failure returned to the client. (This is the first bold line in the log
below.)

Second, occasionally haproxy seems to route a request to a server that it
knows is down. (This is the second bolded section below.)

I could understand both of these if I were using cookies for routing and had
not enabled redispatching. But I'm using balance leastconn with no mention
of cookies in the config file. What else might I be doing that would force
haproxy to use a downed backend and not retry requests?




May 14 04:03:03 prod_lb0 haproxy[28398]: 67.112.125.46:61758
[14/May/2010:04:02:44.517]
ws_in ws_in/NOSRV -1/-1/-1/-1/18609 400 187 - - CR-- 7/7/0/0/0 0/0
BADREQ
May 14 04:03:17 prod_lb0 haproxy[28398]: *127.0.0.1:58921
[14/May/2010:04:03:08.486]
ws_in lists_ws/ws_2 51/0/0/-1/9049 502 204 - - SH-- 7/7/2/1/0 0/0 GET
/-/ping HTTP/1.1*
May 14 04:03:17 prod_lb0 haproxy[28074]: 67.112.125.46:59661
[14/May/2010:03:04:22.220]
ws_in lists_ws/ws_2 55/0/0/31/3535558 101 27943453 - -  2/2/2/2/0 0/0
GET /app/etherlist/socket?session_id=9680378170&profiler=1 HTTP/1.1
May 14 04:03:17 prod_lb0 haproxy[28398]: 99.66.213.198:58624
[14/May/2010:03:52:18.455]
ws_in lists_ws/ws_2 455/0/0/2/659334 101 3348145 - -  6/6/1/0/0 0/0 GET
/app/etherlist/socket?session_id=9385884222&profiler=1 HTTP/1.1
May 14 04:03:17 prod_lb0 haproxy[28074]: 98.210.108.197:43537
[14/May/2010:03:04:22.477]
ws_in lists_ws/ws_2 165/0/0/2/3535454 101 8809800 - -  1/1/1/1/0 0/0
GET /-/socket?session_id=9646753365 HTTP/1.1
May 14 04:03:18 prod_lb0 haproxy[28074]: 70.36.139.123:56872
[14/May/2010:03:04:22.511]
ws_in lists_ws/ws_2 88/0/0/2/3535503 101 25165709 - -  0/0/0/0/0 0/0
GET /-/socket?session_id=9669242145 HTTP/1.1
May 14 04:03:21 prod_lb0 haproxy[28398]: *Server lists_ws/ws_2 is DOWN,
reason: Layer4 connection problem, info: Connection refused, check
duration: 0ms.*
May 14 04:03:23 prod_lb0 haproxy[28398]: *127.0.0.1:58965
[14/May/2010:04:03:20.475]
ws_in lists_ws/ws_2 10/0/-1/-1/3032 503 212 - - SC-- 8/8/2/0/3 0/0 GET
/-/ping HTTP/1.1*
May 14 04:03:29 prod_lb0 haproxy[28398]: *Server lists_ws/ws_2 is UP,
reason: Layer4 check passed, check duration: 0ms.*
May 14 04:03:34 prod_lb0 haproxy[28398]: 99.66.213.198:59005
[14/May/2010:04:02:43.444]
ws_in ws/ws_1 15/0/1/30/50569 304 296 - - cD-- 9/9/4/2/0 0/0 GET
/-/static/luna/browser/images/loading.gif HTTP/1.1


Re: Why would haproxy send a request to a server that is down?

2010-05-14 Thread Malcolm Handley
Thanks for the fast and helpful replies, Willy.

I hadn't realized that a request was connected to a server even before the
server had responded successfully. This all makes sense now. I'll try
setting option redispatch. I assume that that will solve my problems. In
that case I won't have any need to force an early redispatch if the server
state changes, though I guess it would make things slightly faster.
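
A hedged sketch of the settings under discussion (the backend and server
names and addresses below are placeholders, not taken from the original
configuration):

    defaults
        mode http
        retries 3
        option redispatch    # let the last retry go to another server

    backend lists_ws
        balance leastconn
        server ws_1 10.0.0.1:8080 check
        server ws_2 10.0.0.2:8080 check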

On Fri, May 14, 2010 at 2:30 PM, Willy Tarreau w...@1wt.eu wrote:

 Hi Malcolm,

 On Fri, May 14, 2010 at 12:07:53PM -0700, Malcolm Handley wrote:
  Hi, everyone.
 
  I'm having some trouble with the routing of requests to servers within a
  backend.
 
  Firstly, although I have retries 3 in the defaults section of my config
  file I'm not seeing any evidence of retries. If a server is down but has
  not been detected as down by haproxy then a request may still get sent
  to it and a failure returned to the client. (This is the first bold line
  in the log below.)

 Yes, this is expected if your retries value is not large enough to cover
 the time to detect that the server is down. Also, the retries are only
 performed on the same server. If you want the request to be redispatched
 to another server after the last attempt, you should use option
 redispatch.

  Second, occasionally haproxy seems to route a request to a server that it
  knows is down. (This is the second bolded section below.)

 No, if you look more closely, you'll see that the request was received at
 04:03:20.475, *before* the server was marked down (04:03:21), and failed
 last attempt at 04:03:23. Since the request did not switch to another
 server on the last retry, I think you did not have option redispatch
 enabled.

  I could understand both of these if I were using cookies for routing and
  had not enabled redispatching. But I'm using balance leastconn with no
  mention of cookies in the config file. What else might I be doing that
  would force haproxy to use a downed backend and not retry requests?

 Well, be careful, retries are always performed on the same server, except
 the last one which can be redispatched. I will study if we could force an
 early redispatch in case the server changes state during retries, but
 there's nothing certain in this area.

 Regards,
 Willy





Not all requests getting logged

2010-03-08 Thread Malcolm Handley
I'm having a problem with an haproxy setup where not all of the requests are
getting logged (even in debug mode). Specifically, I have an ajax app that
periodically POSTs to the server to find out about changes. I know that
these requests are going to the proxy because if I kill the proxy the
requests start failing. However, these requests are not logged to syslog or
printed to the console when haproxy is run in debug mode. Nor are they shown
in the stats. But the requests *are* sent to my web server and the responses
are forwarded back to the client, just without the addition of the cookie to
indicate which server should receive the next request from this client. This
is happening with 1.3.23 and 1.4.

I'm still scouring the docs and the code trying to figure out what would
cause this but any pointers would be great.
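
One hedged guess that is consistent with this symptom: haproxy 1.3 and 1.4
switch a keep-alive connection to tunnel mode after the first request, so
later requests on the same connection are forwarded but not inspected,
logged, counted in the stats, or given a persistence cookie. If that is the
cause here, forcing per-request processing should make every request
visible, e.g.:

    defaults
        mode http
        option httplog
        option httpclose    # or "option http-server-close" on 1.4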