add dynamic header to http response?

2013-05-07 Thread Patrick Hemmer
With haproxy 1.5, is there any way to add a dynamic header to the http
response (like the `http-request add-header` option for request headers)?
I'm adding an X-Request-Id header to requests before forwarding them on
to the back end, but would also like to be able to send this same header
back in the response to the client. Something like the `http-request
add-header` or `unique-id-header` options would be great. The former
might be more flexible as it can be used for other things, but the
latter would also work.

Reading through the docs I don't see any way this could be done. If not,
can it be a feature request?
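For illustration, a hypothetical sketch of the behaviour being requested. Note this is not valid 1.5-dev syntax at the time of writing: the `http-response set-header` directive and a `unique-id` sample fetch are assumptions about what such a feature might look like.

```
frontend api
    bind *:80
    unique-id-format %{+X}o%pid-%rt
    unique-id-header X-Request-Id      # adds the ID to the request only

    # hypothetical: echo the same ID back to the client on the response
    http-response set-header X-Request-Id %[unique-id]
```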

Thanks

-Patrick



syslog timestamp with millisecond

2013-05-10 Thread Patrick Hemmer
The current syslog implementation (via UDP) sends log entries with the
millisecond portion of the timestamp stripped off. Our log collector is
capable of handling timestamps with millisecond accuracy and I would
like to have it do so. Is there any way to accomplish this?

I know you can add an additional timestamp into the log message, but the
log collector uses the syslog timestamp as _the_ timestamp.

Haproxy 1.5 custom log format has access to the date/time, hostname, and
pid, so it would be nice to just have haproxy not add anything to the
log entries other than facility & priority, and then let the custom log
format add the date, host, program, and pid.
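For reference, the traditional BSD syslog timestamp (RFC 3164) has no sub-second field at all, while RFC 5424 permits fractional seconds; a small Python sketch (illustrative only, not haproxy code) showing the two formats side by side:

```python
from datetime import datetime, timezone

ts = datetime(2013, 5, 10, 14, 30, 15, 123456, tzinfo=timezone.utc)

# RFC 3164 (traditional BSD syslog): one-second resolution, no year
rfc3164 = ts.strftime("%b %d %H:%M:%S")

# RFC 5424: ISO 8601 timestamp, fractional seconds permitted
rfc5424 = ts.strftime("%Y-%m-%dT%H:%M:%S.") + f"{ts.microsecond // 1000:03d}Z"

print(rfc3164)  # May 10 14:30:15
print(rfc5424)  # 2013-05-10T14:30:15.123Z
```

The millisecond portion simply has nowhere to go in the RFC 3164 form, which is why it gets stripped.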

-Patrick



haproxy duplicate http_request_counter values

2013-08-11 Thread Patrick Hemmer
I'm using the %rt field in the unique-id-format config parameter (the
full value is %{+X}o%pid-%rt), and am getting lots of duplicates. In
one specific case, haproxy added the same http_request_counter value to
70 different http requests within a span of 61 seconds (from various
client hosts too). Does the http_request_counter only increment under
certain conditions, or is this a bug?
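With that format both fields are emitted in hex. A quick Python helper (written for this note, not part of haproxy) to decode an ID back into its PID and counter when checking for duplicates; the example ID is hypothetical:

```python
def parse_unique_id(uid: str) -> tuple[int, int]:
    """Split a '%{+X}o%pid-%rt' style ID into (pid, request_counter)."""
    pid_hex, counter_hex = uid.split("-")
    return int(pid_hex, 16), int(counter_hex, 16)

pid, counter = parse_unique_id("6FE5-1A2B")
print(pid, counter)  # 28645 6699
```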

This is with haproxy 1.5-dev19

-Patrick


Re: haproxy duplicate http_request_counter values (BUG)

2013-08-13 Thread Patrick Hemmer

On 2013/08/11 15:45, Patrick Hemmer wrote:
 I'm using the %rt field in the unique-id-format config parameter
 (the full value is %{+X}o%pid-%rt), and am getting lots of
 duplicates. In one specific case, haproxy added the same
 http_request_counter value to 70 different http requests within a span
 of 61 seconds (from various client hosts too). Does the
 http_request_counter only increment under certain conditions, or is
 this a bug?

 This is with haproxy 1.5-dev19

 -Patrick


This appears to be part of a bug. I just experienced a scenario where
haproxy stopped responding. When I went into the log I found binary
garbage in place of the request ID. I have haproxy configured to route
certain URLs, and to respond with an `errorfile` when a request comes in
that doesn't match any of the configured paths. It seems whenever I
request an invalid URL and get the `errorfile` response, the request ID
gets screwed up and becomes jumbled binary data.

For example: haproxy[28645]: 207.178.167.185:49560 api bad_url/<NOSRV>
71/-1/-1/-1/71 3/3/0/0/3 0/0 127/242 403 PR-- Á + GET / HTTP/1.1
Notice the Á, that's supposed to be the process ID and request ID
separated by a hyphen. When I pipe it into xxd, I get this:

000: 6861 7072 6f78 795b 3238 3634 355d 3a20  haproxy[28645]:
010: 3230 372e 3137 382e 3136 372e 3138 353a  207.178.167.185:
020: 3439 3536 3020 6170 6920 6261 645f 7572  49560 api bad_ur
030: 6c2f 3c4e 4f53 5256 3e20 3731 2f2d 312f  l/<NOSRV> 71/-1/
040: 2d31 2f2d 312f 3731 2033 2f33 2f30 2f30  -1/-1/71 3/3/0/0
050: 2f33 2030 2f30 2031 3237 2f32 3432 2034  /3 0/0 127/242 4
060: 3033 2050 522d 2d20 90c1 8220 2b20 4745  03 PR-- ... + GE
070: 5420 2f20 4854 5450 2f31 2e31 0a T / HTTP/1.1.


I won't post my entire config as it's over 300 lines, but here's the
juicy stuff:


global
log 127.0.0.1   local0
maxconn 20480
user haproxy
group haproxy
daemon

defaults
log global
mode    http
option  httplog
option  dontlognull
retries 3
option  redispatch
timeout connect 5000
timeout client 60000
timeout server 170000
option  clitcpka
option  srvtcpka

stats   enable
stats   uri /haproxy/stats
stats   refresh 5
stats   auth my:secret

listen stats
bind 0.0.0.0:90
mode http
stats enable
stats uri /
stats refresh 5

frontend api
  bind *:80
  bind *:81 accept-proxy

  option httpclose
  option forwardfor
  http-request add-header X-Request-Timestamp %Ts.%ms
  unique-id-format %{+X}o%pid-%rt
  unique-id-header X-Request-Id
  rspadd X-Api-Host:\ i-a22932d9

  reqrep ^([^\ ]*)\ ([^\?\ ]*)(\?[^\ ]*)?\ HTTP.*  \0\r\nX-API-URL:\ \2


  acl is_1_1 path_dir /1/my/path
  use_backend 1_1 if is_1_1

  acl is_1_2 path_dir /1/my/other_path
  use_backend 1_2 if is_1_2

  ...

  default_backend bad_url

  log-format %ci:%cp\ %ft\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\
%ac/%fc/%bc/%sc/%rc\ %sq/%bq\ %U/%B\ %ST\ %tsc\ %ID\ +\ %r

backend bad_url
  block if TRUE
  errorfile 403 /etc/haproxy/bad_url.http


content based routing with rewrite (reqrep)

2013-08-26 Thread Patrick Hemmer
So I'm trying to come up with the best way of doing this, but am having
a heck of a time. Basically I have several different backend service
pools, and I have one externally facing haproxy router. I want to take a
map of public URLs and route them to specific backend URLs.
For example

public.example.com/foo/bar -> foo.internal.example.com/one
public.example.com/foo/baz -> foo.internal.example.com/two
public.example.com/another -> more.internal.example.com/three

So the first 2 public URLs go to the same backend, but need different
rewrite rules.

I've tried doing the following config:
frontend public
  acl foo_bar path_dir /foo/bar
  reqrep ^([^\ ]*\ )/foo/bar(.*) \1/one\2 if foo_bar
  use_backend foo if foo_bar

Except it seems that the foo_bar acl isn't cached, and gets
re-evaluated after doing the reqrep, and so the use_backend fails.

The only way I can think of doing this is to put the acl and the
use_backend in the frontend, and then put the acl again with the reqrep
in the backend. Is there any cleaner way (if it works since I haven't
tried it yet)?
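A sketch of that workaround, untested, with the routing ACL in the frontend and the same ACL repeated in the backend to guard the rewrite (names taken from the example above):

```
frontend public
    acl foo_bar path_dir /foo/bar
    use_backend foo if foo_bar

backend foo
    # the path is still unmodified when the backend rules run, so the
    # ACL can be re-evaluated here before the rewrite is applied
    acl foo_bar path_dir /foo/bar
    reqrep ^([^\ ]*\ )/foo/bar(.*) \1/one\2 if foo_bar
```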

-Patrick


Client timeout on http put shows as a server timeout with error 504

2013-09-17 Thread Patrick Hemmer
We have this case with haproxy 1.5-dev19 where when a client is
uploading data via an HTTP PUT request, the client will fail to send all
its data and haproxy will time out the connection. The problem is that
haproxy is reporting this as an error 504 with connection flags of sH--,
meaning it timed out waiting for the server.

Now I've analyzed the http headers, and the PUT request has a
content-length header, so would it be possible to have haproxy report
these as a client side timeout instead of a server side timeout (when
the amount of data after headers is less than the amount indicated in
the content-length header)? And with a 4XX status code as well.
We have monitoring in place which looks for server errors, and I'd love
for it not to pick up client problems.
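The proposed rule can be stated as a tiny sketch. This is purely illustrative of the suggested classification, not haproxy code, and the status/flag strings are assumptions:

```python
def classify_timeout(content_length: int, body_bytes_received: int) -> str:
    """If the client never delivered its full declared body, blame the
    client with a 4xx; otherwise keep blaming the server with a 504."""
    if body_bytes_received < content_length:
        return "408 cH--"  # hypothetical client-side timeout report
    return "504 sH--"      # the current server-side timeout report

print(classify_timeout(1024, 512))   # 408 cH--
print(classify_timeout(1024, 1024))  # 504 sH--
```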

-Patrick


Re: Client timeout on http put shows as a server timeout with error 504

2013-09-18 Thread Patrick Hemmer
*From: *Willy Tarreau w...@1wt.eu
*Sent: * 2013-09-18 01:46:50 E
*To: *Patrick Hemmer hapr...@stormcloud9.net
*CC: *haproxy@formilux.org haproxy@formilux.org
*Subject: *Re: Client timeout on http put shows as a server timeout with
error 504

 Hi Patrick,

 On Tue, Sep 17, 2013 at 06:29:13PM -0400, Patrick Hemmer wrote:
 We have this case with haproxy 1.5-dev19 where when a client is
 uploading data via an HTTP PUT request, the client will fail to send all
 its data and haproxy will time out the connection. The problem is that
 haproxy is reporting this as an error 504 with connection flags of sH--,
 meaning it timed out waiting for the server.

 Now I've analyzed the http headers, and the PUT request has a
 content-length header, so would it be possible to have haproxy report
 these as a client side timeout instead of a server side timeout (when
 the amount of data after headers is less than the amount indicated in
 the content-length header)? And with a 4XX status code as well.
 We have monitoring in place which looks for server errors, and I'd love
 for it not to pick up client problems.
 I remember having checked for this in the past. I agree that ideally
 we should have a cH--. It's a bit trickier, because in practice it
 is permitted for the server to respond before the end. In fact we'd
 need another state before the Headers state, which is the client's
 body, so that we can report exactly what we were waiting for.

 I could check if it's easier to implement now. A first step could be
 to disable the server-side timeout as long as we're waiting for the
 client. That might do the trick. Probably that you could already check
 for this using a slightly larger server timeout than the client's (eg:
 21s for the server, 20s for the client). If that works, it would
 confirm that we could do this by just disabling the server timeout
 in this situation.
Seems like it's not going to be that simple. We currently have the
server timeout set at 170s, and the client timeout at 60s (and the
connection closes with 504 sH-- after 170s). Though this does seem like
it'd be the right approach; if the client hasn't finished sending all
its data, the client timeout should kick in.

-Patrick


Re: AW: GA Release of 1.5

2013-09-24 Thread Patrick Hemmer

*From: *Jinn Ko hapr...@mx.ixido.net
*Sent: * 2013-09-24 10:22:49 E
*To: *haproxy@formilux.org
*Subject: *Re: AW: GA Release of 1.5

 Hi,

 It's good to get a better idea of what's needed to see a GA release of
 1.5.  We've been keenly awaiting the GA release, and I certainly
 understand the need to ensure the high performance that we've all come
 to expect of HAProxy.

 In this case, however, I would propose the features are significant
 enough to stabilize what's there and manage expectations.

 An important reason to release the SSL feature set as is could be for
 the potential to release timely security fixes when vulnerabilities or
 exploits are discovered.

 With the knowledge that server-side keep-alives aren't currently
 supported together with SSL we could plan our production deployment to
 take this into account until a future release does support the
 combination.  Documentation could very well reflect these limitations
 and serve to manage users' expectations.
I cannot disagree with this argument more. You're requesting that
incomplete software be released because you can work around the
deficiencies in it. People expect that when new versions of software are
released, the software is complete and functional.

 On 2013-09-23 11:56, Lukas Tribus wrote:
  Hi!


  If you feel SSL support being stable I would really like to see
  a release. This is THE main reason for 1.5.

  I understand your point, but server side keep alives for example
  are important when you run SSL on the backend side, otherwise you
  end up establishing a new SSL session for each and every HTTP
  request.

  I doubt that would scale very well.


 In our case we're deploying a single service and are using the 1.5
 branch primarily to support websockets over SSL, something that was
 fiddly, if not difficult, prior to having SSL support within HAProxy.

 One of the main reasons for the difficulties with websockets over SSL
 is the treatment of the Connection header, for which the alternative
 SSL terminator, nginx, in fact observes the correct behaviour defined
 in the relevant RFC.  Other means of terminating SSL are also fiddly
 and we haven't tested them with websockets.

 The good news is haproxy-1.5-dev has been working great for the past
 few months that we've been using it, albeit only in pre-production and
 performance testing environments for now.  The performance aspect
 hasn't been a bottleneck for us so far, so is a minor concern.


  Please don't put the burden of patching relevant fixes to the
  current users. (It's not patching, but filtering which patches
  are relevant).

  That was just a proposal; whether its achievable or not depends on
  you.


 Unfortunately we don't have the resources to maintain a branch to
 which we could backport relevant fixes, not to mention the overhead of
 managing any security related fixes.
But by requesting that 1.5 be released, you're essentially asking the
people maintaining HAProxy to do the same thing.
While you might argue that maintaining haproxy is their duty, they are
doing that duty by releasing 1.5 when they feel it is ready.

  Now if you are a multi national, multi billion dollar company
  implementing haproxy in a commercial product, you can probably
  justify the effort (or the risk of an unstable component in your
  product - in the end this is just a numbers game for a big
  company).

  If you are a haproxy end-user, I don't see why using a current
  snapshot of the code would hurt (*if you have the time to deploy
  an OSS solution*). Sure, you should not blindly upgrade to more
  recent code without extensively testing it first, but that may be a
  good thing to do with stable releases as well.

 We'd certainly fall into this category, however while we're a start-up
 (with no funding), this isn't a great situation to be in.  The concern
 for potential security issues is also valid.


  If you don't have the time and need those bleeding edge features
  today, then you should probably stick to a commercial solution,
  like those from exceliance.fr or loadbalancer.org.

 I'm not sure if either of these offer websocket+SSL support, but
 certainly worth investigating.


  I don't think releasing new stable branches every 6 months is a
  good thing, because in the end, you need long term support for
  this deployments. You don't want to upgrade from one stable major
  release to another every 12 months because of their deprecation,
  right?

 While the release cycles are a topic in themselves, supporting
 relatively major developments such as SSL and websocket support is
 important for the community.
I'm also going to have to voice my opinion against rapid release cycles.
One significant result is that distributions end up shipping old,
unsupported versions, leaving users of those distros unable to get help
from the community and forced to roll their own packages.

Haproxy has been in existence for years without SSL support. Just
because the 

Re: Client timeout on http put shows as a server timeout with error 504

2013-09-30 Thread Patrick Hemmer
*From: *Patrick Hemmer hapr...@stormcloud9.net
*Sent: * 2013-09-18 10:26:36 E
*To: *haproxy@formilux.org
*Subject: *Re: Client timeout on http put shows as a server timeout with
error
504

 *From: *Willy Tarreau w...@1wt.eu
 *Sent: * 2013-09-18 01:46:50 E
 *To: *Patrick Hemmer hapr...@stormcloud9.net
 *CC: *haproxy@formilux.org haproxy@formilux.org
 *Subject: *Re: Client timeout on http put shows as a server timeout
 with error 504

 Hi Patrick,

 On Tue, Sep 17, 2013 at 06:29:13PM -0400, Patrick Hemmer wrote:
 We have this case with haproxy 1.5-dev19 where when a client is
 uploading data via an HTTP PUT request, the client will fail to send all
 its data and haproxy will time out the connection. The problem is that
 haproxy is reporting this as an error 504 with connection flags of sH--,
 meaning it timed out waiting for the server.

 Now I've analyzed the http headers, and the PUT request has a
 content-length header, so would it be possible to have haproxy report
 these as a client side timeout instead of a server side timeout (when
 the amount of data after headers is less than the amount indicated in
 the content-length header)? And with a 4XX status code as well.
 We have monitoring in place which looks for server errors, and I'd love
 for it not to pick up client problems.
 I remember having checked for this in the past. I agree that ideally
 we should have a cH--. It's a bit trickier, because in practice it
 is permitted for the server to respond before the end. In fact we'd
 need another state before the Headers state, which is the client's
 body, so that we can report exactly what we were waiting for.

 I could check if it's easier to implement now. A first step could be
 to disable the server-side timeout as long as we're waiting for the
 client. That might do the trick. Probably that you could already check
 for this using a slightly larger server timeout than the client's (eg:
 21s for the server, 20s for the client). If that works, it would
 confirm that we could do this by just disabling the server timeout
 in this situation.
 Seems like it's not going to be that simple. We currently have the
 server timeout set at 170s, and the client timeout at 60s (and the
 connection closes with 504 sH-- after 170s). Though this does seem
 like it'd be the right approach; if the client hasn't finished sending
 all its data, the client timeout should kick in.

 -Patrick

I'm also seeing a lot of connections being closed by the client and
showing up as 503 (connection flags are CC--), and in my opinion the
client closing the connection shouldn't be a 500 level error.
In this specific case, nginx uses code 499. Perhaps haproxy should adopt
this code as well.

For this and the timeout issue, if this isn't something that will be
fixed any time soon, I'm willing to try and dig into it myself. However
I don't know the haproxy source, so it will likely take me quite some time.

-Patrick


handling hundreds of reqrep statements

2013-10-22 Thread Patrick Hemmer
I'm currently using haproxy (1.5-dev19) as a content based router. It
takes an incoming request, looks at the url, rewrites it, and sends it
on to the appropriate back end.
The difficult part is that we need to stop all parsing and rewriting after
the first match. This is because we might have a url such as '/foo/bar'
which rewrites to '/foo/baz', and another rewrite from '/foo/b' to
'/foo/c'. As you can see both rules would try to trigger a rewrite on
'/foo/bar/shot', and we'd end up with '/foo/caz/shot'.
Additionally there are hundreds of these rewrites (the config file is
generated from a mapping).
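The ordering hazard can be reproduced in a few lines of Python (illustrative only): applying every matching rule mangles the path, while stopping after the first match gives the intended result.

```python
import re

# the two example rules from above, in config order
RULES = [(r"^/foo/bar", "/foo/baz"), (r"^/foo/b", "/foo/c")]

def rewrite_all(path: str) -> str:
    """What happens when every matching rule fires (the problem)."""
    for pattern, replacement in RULES:
        path = re.sub(pattern, replacement, path, count=1)
    return path

def rewrite_first(path: str) -> str:
    """Stop after the first matching rule (the desired behaviour)."""
    for pattern, replacement in RULES:
        new, n = re.subn(pattern, replacement, path, count=1)
        if n:
            return new
    return path

print(rewrite_all("/foo/bar/shot"))    # /foo/caz/shot
print(rewrite_first("/foo/bar/shot"))  # /foo/baz/shot
```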

There are 2 questions here:

1) I currently have this working using stick tables (it's unpleasant but
it works).
It basically looks like this:
frontend frontend1
acl foo_bar path_reg ^/foo/bar
use_backend backend1 if foo_bar

acl foo_b path_reg ^/foo/b
use_backend backend1 if foo_b

backend backend1
stick-table type integer size 1 store gpc0 # create a stick table to
store one entry
tcp-request content track-sc1 always_false # enable tracking on sc1.
The `always_false` doesn't matter, it just requires a key, so we give it one
acl rewrite-init sc1_clr_gpc0 ge 0 # ACL to clear gpc0
tcp-request content accept if rewrite-init # clear gpc0 on the start
of every request
acl rewrite-empty sc1_get_gpc0 eq 0 # ACL to check if gpc0 has been set
acl rewrite-set sc1_inc_gpc0 ge 0 # ACL to set gpc0 when a rewrite
has matched

acl foo_bar path_reg ^/foo/bar
reqrep ^(GET|POST)\ /foo/bar(.*) \1\ /foo/baz\2 if rewrite-empty
foo_bar rewrite-set # the conditional first checks if another rewrite
has matched, then checks the foo_bar acl, and then performs the
rewrite-set only if foo_bar matched

acl foo_b path_reg ^/foo/b
reqrep ^(GET|POST)\ /foo/b(.*) \1\ /foo/c\2 if rewrite-empty foo_b
rewrite-set # same procedure as above

(my actual rules are a bit more complicated, but those examples exhibit
all the problem points I have).

The cleaner way I thought of handling this was to instead do something
like this:
backend backend1
acl rewrite-found req.hdr(X-Rewrite-ID,1) -m found

acl foo_bar path_reg ^/foo/bar
reqrep ^(GET|POST)\ /foo/bar(.*) \1\ /foo/baz\2\r\nX-Rewrite-ID:\
foo_bar if !rewrite-found foo_bar

acl foo_b path_reg ^/foo/b
reqrep ^(GET|POST)\ /foo/b(.*) \1\ /foo/c\2\r\nX-Rewrite-ID:\ foo_b
if !rewrite-found foo_b

But this doesn't work. The rewrite-found acl never finds the header and
so both reqrep commands run. Is there any better way of doing this than
the nasty stick table?


2) I would also like to add a field to the log indicating which rule
matched. I can't figure out a way to accomplish this bit.
Since the config file is automatically generated, I was hoping to just
assign a short numeric ID and stick that in the log somehow. The only
way I can think that this could work is by adding a header conditionally
using an acl (or use the header created by the alternate idea above),
and then using `capture request header` to add that to the log. But it
does not appear haproxy can capture headers added by itself.

-Patrick


Re: handling hundreds of reqrep statements

2013-10-22 Thread Patrick Hemmer



*From: *Patrick Hemmer hapr...@stormcloud9.net
*Sent: * 2013-10-22 19:13:08 E
*To: *haproxy@formilux.org
*Subject: *handling hundreds of reqrep statements

 I'm currently using haproxy (1.5-dev19) as a content based router. It
 takes an incoming request, looks at the url, rewrites it, and sends it
 on to the appropriate back end.
 The difficult part is that we need to stop all parsing and rewriting after
 the first match. This is because we might have a url such as
 '/foo/bar' which rewrites to '/foo/baz', and another rewrite from
 '/foo/b' to '/foo/c'. As you can see both rules would try to trigger a
 rewrite on '/foo/bar/shot', and we'd end up with '/foo/caz/shot'.
 Additionally there are hundreds of these rewrites (the config file is
 generated from a mapping).

 There are 2 questions here:

 1) I currently have this working using stick tables (it's unpleasant
 but it works).
 It basically looks like this:
 frontend frontend1
 acl foo_bar path_reg ^/foo/bar
 use_backend backend1 if foo_bar

 acl foo_b path_reg ^/foo/b
 use_backend backend1 if foo_b

 backend backend1
 stick-table type integer size 1 store gpc0 # create a stick table
 to store one entry
 tcp-request content track-sc1 always_false # enable tracking on
 sc1. The `always_false` doesn't matter, it just requires a key, so we
 give it one
 acl rewrite-init sc1_clr_gpc0 ge 0 # ACL to clear gpc0
 tcp-request content accept if rewrite-init # clear gpc0 on the
 start of every request
 acl rewrite-empty sc1_get_gpc0 eq 0 # ACL to check if gpc0 has
 been set
 acl rewrite-set sc1_inc_gpc0 ge 0 # ACL to set gpc0 when a rewrite
 has matched

 acl foo_bar path_reg ^/foo/bar
 reqrep ^(GET|POST)\ /foo/bar(.*) \1\ /foo/baz\2 if rewrite-empty
 foo_bar rewrite-set # the conditional first checks if another rewrite
 has matched, then checks the foo_bar acl, and then performs the
 rewrite-set only if foo_bar matched

 acl foo_b path_reg ^/foo/b
 reqrep ^(GET|POST)\ /foo/b(.*) \1\ /foo/c\2 if rewrite-empty foo_b
 rewrite-set # same procedure as above

 (my actual rules are a bit more complicated, but those examples
 exhibit all the problem points I have).

 The cleaner way I thought of handling this was to instead do something
 like this:
 backend backend1
 acl rewrite-found req.hdr(X-Rewrite-ID,1) -m found

 acl foo_bar path_reg ^/foo/bar
 reqrep ^(GET|POST)\ /foo/bar(.*) \1\ /foo/baz\2\r\nX-Rewrite-ID:\
 foo_bar if !rewrite-found foo_bar

 acl foo_b path_reg ^/foo/b
 reqrep ^(GET|POST)\ /foo/b(.*) \1\ /foo/c\2\r\nX-Rewrite-ID:\
 foo_b if !rewrite-found foo_b

 But this doesn't work. The rewrite-found acl never finds the header
 and so both reqrep commands run. Is there any better way of doing this
 than the nasty stick table?


 2) I would also like to add a field to the log indicating which rule
 matched. I can't figure out a way to accomplish this bit.
 Since the config file is automatically generated, I was hoping to just
 assign a short numeric ID and stick that in the log somehow. The only
 way I can think that this could work is by adding a header
 conditionally using an acl (or use the header created by the alternate
 idea above), and then using `capture request header` to add that to
 the log. But it does not appear haproxy can capture headers added by
 itself.

 -Patrick

Ok, so I went home and resumed trying to figure this out, starting from
scratch on a whole new machine. Well guess what, the "cleaner way"
worked. After many proclamations of "WTF?" out loud (my dog was getting
concerned), I think I found a bug. And I cannot begin to describe just
how awesome this bug is.

Here's how you can duplicate this awesomeness:

Start a haproxy with the following config:
defaults
mode http
timeout connect 1000
timeout client 1000
timeout server 1000

frontend frontend
bind *:2082

maxconn 2

  acl rewrite-found req.hdr(X-Header-ID) -m found

reqrep ^(GET)\ /foo/(.*) \1\ /foo/\2\r\nX-Header-ID:\ bar if
!rewrite-found
reqrep ^(GET)\ /foo/(.*) \1\ /foo/\2\r\nX-Header-ID:\ pop if
!rewrite-found
reqrep ^(GET)\ /foo/(.*) \1\ /foo/\2\r\nX-Header-ID:\ tart if
!rewrite-found

default_backend backend

backend backend
server server 127.0.0.1:2090



Start up a netcat:
while true; do nc -l -p 2090; done


Create a file with the following contents (I'll presume we call it data):
GET /foo/ HTTP/1.1
Accept: */*
User-Agent: Agent
Host: localhost:2082


(with the empty line on the bottom)

And now run:
nc localhost 2082 < data

In your listening netcat, notice you got 3 X-Header-ID headers.

Now in your data file, move the "Accept: */*" line down one line, so it's
after the "User-Agent" header, and retry. Notice you only get 1
X-Header-ID back. It works!

But wait, it gets even better. Put the Accept: */* line back where it
was, and in the haproxy config, replace all X-Header-ID with
X-HeaderID (just remove

Re: handling hundreds of reqrep statements

2013-10-23 Thread Patrick Hemmer
 



*From: *Patrick Hemmer hapr...@stormcloud9.net
*Sent: * 2013-10-22 23:32:31 E
*CC: *haproxy@formilux.org
*Subject: *Re: handling hundreds of reqrep statements



 
 *From: *Patrick Hemmer hapr...@stormcloud9.net
 *Sent: * 2013-10-22 19:13:08 E
 *To: *haproxy@formilux.org
 *Subject: *handling hundreds of reqrep statements

 I'm currently using haproxy (1.5-dev19) as a content based router. It
 takes an incoming request, looks at the url, rewrites it, and sends
 it on to the appropriate back end.
 The difficult part is that we need to stop all parsing and rewriting after
 the first match. This is because we might have a url such as
 '/foo/bar' which rewrites to '/foo/baz', and another rewrite from
 '/foo/b' to '/foo/c'. As you can see both rules would try to trigger
 a rewrite on '/foo/bar/shot', and we'd end up with '/foo/caz/shot'.
 Additionally there are hundreds of these rewrites (the config file is
 generated from a mapping).

 There are 2 questions here:

 1) I currently have this working using stick tables (it's unpleasant
 but it works).
 It basically looks like this:
 frontend frontend1
 acl foo_bar path_reg ^/foo/bar
 use_backend backend1 if foo_bar

 acl foo_b path_reg ^/foo/b
 use_backend backend1 if foo_b

 backend backend1
 stick-table type integer size 1 store gpc0 # create a stick table
 to store one entry
 tcp-request content track-sc1 always_false # enable tracking on
 sc1. The `always_false` doesn't matter, it just requires a key, so we
 give it one
 acl rewrite-init sc1_clr_gpc0 ge 0 # ACL to clear gpc0
 tcp-request content accept if rewrite-init # clear gpc0 on the
 start of every request
 acl rewrite-empty sc1_get_gpc0 eq 0 # ACL to check if gpc0 has
 been set
 acl rewrite-set sc1_inc_gpc0 ge 0 # ACL to set gpc0 when a
 rewrite has matched

 acl foo_bar path_reg ^/foo/bar
 reqrep ^(GET|POST)\ /foo/bar(.*) \1\ /foo/baz\2 if rewrite-empty
 foo_bar rewrite-set # the conditional first checks if another rewrite
 has matched, then checks the foo_bar acl, and then performs the
 rewrite-set only if foo_bar matched

 acl foo_b path_reg ^/foo/b
 reqrep ^(GET|POST)\ /foo/b(.*) \1\ /foo/c\2 if rewrite-empty
 foo_b rewrite-set # same procedure as above

 (my actual rules are a bit more complicated, but those examples
 exhibit all the problem points I have).

 The cleaner way I thought of handling this was to instead do
 something like this:
 backend backend1
 acl rewrite-found req.hdr(X-Rewrite-ID,1) -m found

 acl foo_bar path_reg ^/foo/bar
 reqrep ^(GET|POST)\ /foo/bar(.*) \1\ /foo/baz\2\r\nX-Rewrite-ID:\
 foo_bar if !rewrite-found foo_bar

 acl foo_b path_reg ^/foo/b
 reqrep ^(GET|POST)\ /foo/b(.*) \1\ /foo/c\2\r\nX-Rewrite-ID:\
 foo_b if !rewrite-found foo_b

 But this doesn't work. The rewrite-found acl never finds the header
 and so both reqrep commands run. Is there any better way of doing
 this than the nasty stick table?


 2) I would also like to add a field to the log indicating which rule
 matched. I can't figure out a way to accomplish this bit.
 Since the config file is automatically generated, I was hoping to
 just assign a short numeric ID and stick that in the log somehow. The
 only way I can think that this could work is by adding a header
 conditionally using an acl (or use the header created by the
 alternate idea above), and then using `capture request header` to add
 that to the log. But it does not appear haproxy can capture headers
 added by itself.

 -Patrick

 Ok, so I went home and resumed trying to figure this out, starting
 from scratch on a whole new machine. Well guess what, the "cleaner
 way" worked. After many proclamations of "WTF?" out loud (my dog was
 getting concerned), I think I found a bug. And I cannot begin to
 describe just how awesome this bug is.

 Here's how you can duplicate this awesomeness:

 Start a haproxy with the following config:
 defaults
 mode http
 timeout connect 1000
 timeout client 1000
 timeout server 1000

 frontend frontend
 bind *:2082

 maxconn 2

   acl rewrite-found req.hdr(X-Header-ID) -m found

 reqrep ^(GET)\ /foo/(.*) \1\ /foo/\2\r\nX-Header-ID:\ bar if
 !rewrite-found
 reqrep ^(GET)\ /foo/(.*) \1\ /foo/\2\r\nX-Header-ID:\ pop if
 !rewrite-found
 reqrep ^(GET)\ /foo/(.*) \1\ /foo/\2\r\nX-Header-ID:\ tart if
 !rewrite-found

 default_backend backend

 backend backend
 server server 127.0.0.1:2090



 Start up a netcat:
 while true; do nc -l -p 2090; done


 Create a file with the following contents (I'll presume we call it
 data):
 GET /foo/ HTTP/1.1
 Accept: */*
 User-Agent: Agent
 Host: localhost:2082


 (with the empty line on the bottom)

 And now run:
 nc localhost 2082 < data

 In your listening netcat, notice you got 3 X-Header-ID headers.

 Now in your data file

Re: handling hundreds of reqrep statements

2013-10-23 Thread Patrick Hemmer
 



*From: *hushmeh...@hushmail.com
*Sent: * 2013-10-23 01:06:24 E
*To: *hapr...@stormcloud9.net
*CC: *haproxy@formilux.org
*Subject: *Re: handling hundreds of reqrep statements


 On Wed, 23 Oct 2013 05:33:38 +0200 Patrick Hemmer 
 hapr...@stormcloud9.net wrote:
reqrep ^(GET)\ /foo/(.*) \1\ /foo/\2\r\nX-Header-ID:\ bar if
 !rewrite-found
 What about reqadd? Clumsy fiddling with \r\n (or \n\r) in regexp 
 seems awkward to me.
 reqadd X-Header-ID:\ bar unless rewrite-found

Ya, I think I figured out the issue. It had to do with haproxy
pre-allocating buffers for each header, and not expecting them to be
moved around.
Unfortunately I can't use reqadd to add a header as reqadd happens too
late in the process. All reqrep statements happen before reqadd. So if I
put an acl on reqrep to skip it if the header has been added, it'll
always run the reqrep because the header gets added afterwards.
However I think I can use http-request set-header instead of reqadd.
It's not as simple as the reqrep \r\n idea, but still better than the
nasty stick table.
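A sketch of that idea, untested, and assuming `http-request` rules are indeed evaluated before `reqrep` within the same section (per the reasoning above):

```
backend backend1
    acl rewrite-found req.hdr(X-Rewrite-ID) -m found

    # the first matching rule claims the request by setting a marker header
    acl foo_bar path_reg ^/foo/bar
    http-request set-header X-Rewrite-ID foo_bar if !rewrite-found foo_bar

    acl foo_b path_reg ^/foo/b
    http-request set-header X-Rewrite-ID foo_b if !rewrite-found foo_b

    # each rewrite then fires only for the rule that claimed the request
    acl id-foo_bar req.hdr(X-Rewrite-ID) -m str foo_bar
    reqrep ^(GET|POST)\ /foo/bar(.*) \1\ /foo/baz\2 if id-foo_bar

    acl id-foo_b req.hdr(X-Rewrite-ID) -m str foo_b
    reqrep ^(GET|POST)\ /foo/b(.*) \1\ /foo/c\2 if id-foo_b
```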


disable backend through socket

2013-12-20 Thread Patrick Hemmer
Simple question: Is there any way to disable a backend through the socket?
I see you can disable both frontends, and servers through the socket,
but I don't see a way to do a backend.

-Patrick


Re: disable backend through socket

2013-12-22 Thread Patrick Hemmer
No. As I said, I want to disable the backend.
http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#4.2-disabled


-Patrick



*From: *Jonathan Matthews cont...@jpluscplusm.com
*Sent: * 2013-12-22 16:23:18 E
*To: *haproxy@formilux.org
*Subject: *Re: disable backend through socket

 On 22 Dec 2013 20:32, Patrick Hemmer hapr...@stormcloud9.net wrote:
 
  That disables a server. I want to disable a backend.

 No, you want to disable all the servers in a backend. I'm not sure
 there's a shortcut that's better than just doing them one by one.
 Others may be able to advise about alternatives, but is that an option
 for you?

 Jonathan
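
The one-by-one approach suggested above is easy to script. A minimal
sketch of building the per-server socket commands (the backend name,
server names, and socket path here are all hypothetical):

```python
# Hypothetical sketch: haproxy has no "disable backend" socket command,
# so build one "disable server" command per server in the backend.
# The backend name, server names, and socket path are assumptions.
backend = "app"
servers = ["s1", "s2", "s3"]

commands = [f"disable server {backend}/{srv}" for srv in servers]
for cmd in commands:
    # each line would be piped to the stats socket, e.g.:
    #   echo "disable server app/s1" | socat stdio /var/run/haproxy.sock
    print(cmd)
```

Re-enabling is the mirror image with "enable server", which is why this
doesn't solve Patrick's automation concern: each server remains
individually toggleable.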




Re: disable backend through socket

2013-12-23 Thread Patrick Hemmer
 On Sun, Dec 22, 2013 at 05:05:16PM -0500, Patrick Hemmer wrote:
 No. As I said, I want to disable the backend.
 http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#4.2-disabled
 That doesn't really work for backends since they don't decide to get
 traffic. At least if a config accepts to start with the disabled
 keyword in a backend and this backend is referenced in a frontend, I
 have no idea what it does behind the scenes. I'm not even sure the
 backend is completely initialized.

Ah, ok. I can live with that :-)

 What do you want to do exactly ? Do you just want to disable the
 health checks ? It's unclear what result you're seeking in fact.

I was just looking to disable backends without restarting the service.
Nothing more. Nothing less.
Currently when I want to disable a backend I just update the config and
reload haproxy. Not a big deal. Was just hoping that since frontends and
servers could both be enabled/disabled through the socket, that backends
could too.

The reason why I don't want to disable individual servers is that we
have an automated process which enables & disables servers. If a backend
is disabled, then I don't want a server to automatically get enabled and
start taking traffic. By disabling the backend, we prevent this scenario.

 Willy

Thank you

-Patrick


Re: disable backend through socket

2013-12-26 Thread Patrick Hemmer
*From: *Gabriel Sosa sosagabr...@gmail.com
*Sent: * 2013-12-26 09:41:21 E
*To: *Patrick Hemmer hapr...@stormcloud9.net
*CC: *haproxy@formilux.org
*Subject: *Re: disable backend through socket




 On Mon, Dec 23, 2013 at 12:21 PM, Patrick Hemmer hapr...@stormcloud9.net wrote:

 On Sun, Dec 22, 2013 at 05:05:16PM -0500, Patrick Hemmer wrote:
 No. As I said, I want to disable the backend.
 
 http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#4.2-disabled
 That doesn't really work for backends since they don't decide to get
 traffic. At least if a config accepts to start with the disabled
 keyword in a backend and this backend is referenced in a frontend, I
 have no idea what it does behind the scenes. I'm not even sure the
 backend is completely initialized.

 Ah, ok. I can live with that :-)


 What do you want to do exactly ? Do you just want to disable the
 health checks ? It's unclear what result you're seeking in fact.

 I was just looking to disable backends without restarting the
 service. Nothing more. Nothing less.
 Currently when I want to disable a backend I just update the config
 and reload haproxy. Not a big deal. Was just hoping that since
 frontends and servers could both be enabled/disabled through the
 socket, that backends could too.

 The reason why I don't want to disable individual servers is that
 we have an automated process which enables & disables servers. If
 a backend is disabled, then I don't want a server to automatically
 get enabled and start taking traffic. By disabling the backend, we
 prevent this scenario.

 Willy

 Thank you

 -Patrick



 Patrick,

 did you take a look to the load balancer feedback feature? [1] I think
 this might help you.

 Saludos

 [1]
 http://blog.loadbalancer.org/open-source-windows-service-for-reporting-server-load-back-to-haproxy-load-balancer-feedback-agent/


I have seen this yes, but unfortunately it still operates on a
per-server basis. I would have to reach out to every server and tell the
feedback agent to advertise itself as in maintenance. The goal is to
be able to put the entire backend in maintenance, regardless of what the
status of the individual servers are.

This isn't that big of a deal. I currently have a haproxy controller
daemon which adjusts the haproxy.cfg (sets backend disable) and reloads.
I'd just like to avoid reloading as much as possible.

-Patrick


Re: Just a simple thought on health checks after a soft reload of HAProxy....

2014-01-21 Thread Patrick Hemmer

*From: *Malcolm Turnbull malc...@loadbalancer.org
*Sent: * 2014-01-14 07:13:27 E
*To: *haproxy@formilux.org haproxy@formilux.org
*Subject: *Just a simple thought on health checks after a soft reload of
HAProxy

 Just a simple though on health checks after a soft reload of HAProxy

 If for example you had several backend servers one of which had crashed...
 Then you make make a configuration change to HAProxy and soft reload,
 for instance adding a new backend server.

 All the servers are instantly brought up and available for traffic
 (including the crashed one).
 So traffic will possibly be sent to a broken server...

 Obviously its only a small problem as it is fixed as soon as the
 health check actually runs...

 But I was just wondering is their a way of saying don't bring up a
 server until it passes a health check?
I was just thinking of this issue myself and google turned up your post.
Personally I would not like every server to be considered down until
after the health checks pass. Basically this would result in things
being down after a reload, which defeats the point of the reload being
non-disruptive.

I can think of 2 possible solutions:
1) When the new process comes up, do an initial check on all servers
(just one) which have checks enabled. Use that one check as the verdict
for whether each server should be marked 'up' or 'down'. After each
server has been checked once, then signal the other process to shut down
and start listening.
2) Use the stats socket (if enabled) to pull the stats from the previous
process. Use its health check data to pre-populate the health data of
the new process. This one has a few drawbacks though. The server &
backend names must match between the old and new config, and the stats
socket has to be enabled. It would probably be harder to code as well,
but I really don't know about that.
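
Solution 2 could be sketched roughly like this: a toy parser for the
CSV that "show stat" emits on the stats socket, keeping only the
up/down state per (backend, server) pair. The column set is trimmed way
down here; real "show stat" output has many more columns.

```python
import csv
import io

# Toy sketch of solution 2: read the old process's "show stat" CSV and
# keep only the status per (backend, server) pair to seed the new
# process. Column names follow the real output; rows are made up.
csv_text = (
    "# pxname,svname,status\n"
    "app,s1,UP\n"
    "app,s2,DOWN\n"
)
rows = csv.reader(io.StringIO(csv_text.lstrip("# ")))
header = next(rows)                          # ["pxname", "svname", "status"]
state = {(px, sv): st for px, sv, st in rows}
print(state)
```

The new process would then mark each known (backend, server) pair with
the inherited state instead of starting from "almost dead".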

-Patrick


determine size of http headers

2014-01-23 Thread Patrick Hemmer
What I'd like to do is add a few items to the log line which contain the
size of the headers, and then the value of the Content-Length header.
This way if the connection is broken for any reason, we can determine if
the client sent all the data they were supposed to.

Logging the Content-Length header is easy, but I can't find a way to get
the size of the headers.
The only way that pops into mind is to look for the first occurrence of
\r\n\r\n and get its offset (and preferably add 4 as to include the size
of the \r\n\r\n in the calculation). But I don't see a way to accomplish
this.
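
The \r\n\r\n idea can be sketched in a few lines. This is a toy
illustration on a hand-built request, not an existing haproxy fetch:

```python
# Toy sketch of the idea above: header size = offset of the first
# CRLFCRLF plus 4, so the terminating blank line is counted too.
raw = (b"GET /foo HTTP/1.1\r\n"
       b"Host: example.com\r\n"
       b"Content-Length: 5\r\n"
       b"\r\n"
       b"hello")
header_size = raw.find(b"\r\n\r\n") + 4
body = raw[header_size:]
print(header_size, body)
```

Comparing len(body) against the Content-Length header then tells you
whether the client sent everything it declared.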

Any ideas?

Thanks

-Patrick


Re: haproxy duplicate http_request_counter values

2014-01-25 Thread Patrick Hemmer
This patch does appear to have solved the issue reported, but it
introduced another.
If I use `http-request add-header` with %rt in the value to add the
request ID, and then I also use it in `unique-id-format`, the 2 settings
get different values. the value used for`http-request add-header` will
be one less than the value used for `unique-id-format` (this applies to
both using %ID in the log format and using `unique-id-header`).

Without this patch, all values are the same.

-Patrick


*From: *Willy Tarreau w...@1wt.eu
*Sent: * 2013-08-13 11:53:16 E
*To: *Patrick Hemmer hapr...@stormcloud9.net
*CC: *haproxy@formilux.org haproxy@formilux.org
*Subject: *Re: haproxy duplicate http_request_counter values

 Hi Patrick,

 On Sun, Aug 11, 2013 at 03:45:36PM -0400, Patrick Hemmer wrote:
 I'm using the %rt field in the unique-id-format config parameter (the
 full value is %{+X}o%pid-%rt), and am getting lots of duplicates. In
 one specific case, haproxy added the same http_request_counter value to
 70 different http requests within a span of 61 seconds (from various
 client hosts too). Does the http_request_counter only increment under
 certain conditions, or is this a bug?
 Wow, congrats, you found a nice ugly bug! Here's how the counter is
 retrieved at the moment of logging :

   iret = snprintf(tmplog, dst + maxsize - tmplog, "%04X", global.req_count);

 As you can see, it uses a global variable which holds the global number of
 requests seen at the moment of logging (or assigning the header) instead of
 a unique value assigned to each request!

 So all the requests that are logged in the same time frame between two
 new requests get the same ID :-(

 The counter should be auto-incrementing so that each retrieval is unique.

 Please try with the attached patch.

 Thanks,
 Willy




Re: haproxy duplicate http_request_counter values

2014-01-25 Thread Patrick Hemmer
Actually I sent that prematurely. The behavior is actually even simpler.
With `http-request add-header`, %rt is one less than when used in a
`log-format` or `unique-id-header`. I'm guessing incrementing the value
happens after `http-request` is processed, but before log-format or
unique-id-header.

-Patrick



*From: *Patrick Hemmer hapr...@stormcloud9.net
*Sent: * 2014-01-25 03:40:38 E
*To: *Willy Tarreau w...@1wt.eu
*CC: *haproxy@formilux.org haproxy@formilux.org
*Subject: *Re: haproxy duplicate http_request_counter values

 This patch does appear to have solved the issue reported, but it
 introduced another.
 If I use `http-request add-header` with %rt in the value to add the
 request ID, and then I also use it in `unique-id-format`, the 2
 settings get different values. the value used for`http-request
 add-header` will be one less than the value used for
 `unique-id-format` (this applies to both using %ID in the log format
 and using `unique-id-header`).

 Without this patch, all values are the same.

 -Patrick

 
 *From: *Willy Tarreau w...@1wt.eu
 *Sent: * 2013-08-13 11:53:16 E
 *To: *Patrick Hemmer hapr...@stormcloud9.net
 *CC: *haproxy@formilux.org haproxy@formilux.org
 *Subject: *Re: haproxy duplicate http_request_counter values

 Hi Patrick,

 On Sun, Aug 11, 2013 at 03:45:36PM -0400, Patrick Hemmer wrote:
 I'm using the %rt field in the unique-id-format config parameter (the
 full value is %{+X}o%pid-%rt), and am getting lots of duplicates. In
 one specific case, haproxy added the same http_request_counter value to
 70 different http requests within a span of 61 seconds (from various
 client hosts too). Does the http_request_counter only increment under
 certain conditions, or is this a bug?
 Wow, congrats, you found a nice ugly bug! Here's how the counter is
 retrieved at the moment of logging :

   iret = snprintf(tmplog, dst + maxsize - tmplog, "%04X", global.req_count);

 As you can see, it uses a global variable which holds the global number of
 requests seen at the moment of logging (or assigning the header) instead of
 a unique value assigned to each request!

 So all the requests that are logged in the same time frame between two
 new requests get the same ID :-(

 The counter should be auto-incrementing so that each retrieval is unique.

 Please try with the attached patch.

 Thanks,
 Willy





Re: haproxy duplicate http_request_counter values

2014-01-25 Thread Patrick Hemmer
*From: *Willy Tarreau w...@1wt.eu
*Sent: * 2014-01-25 04:43:28 E
*To: *Patrick Hemmer hapr...@stormcloud9.net
*CC: *haproxy@formilux.org haproxy@formilux.org
*Subject: *Re: haproxy duplicate http_request_counter values

 Hi Patrick,

 On Sat, Jan 25, 2014 at 03:40:38AM -0500, Patrick Hemmer wrote:
 This patch does appear to have solved the issue reported, but it
 introduced another.
 If I use `http-request add-header` with %rt in the value to add the
 request ID, and then I also use it in `unique-id-format`, the 2 settings
 get different values. the value used for`http-request add-header` will
 be one less than the value used for `unique-id-format` (this applies to
 both using %ID in the log format and using `unique-id-header`).
 You're damn right! I forgot this case where the ID could be used twice :-(

 So we have no other choice but copying the ID into the session or HTTP
 transaction, since it's possible to use it several times. At the same
 time, I'm wondering if we should not also increment it for new sessions,
 because for people who forward non-HTTP traffic, there's no unique counter.

 What I'm thinking about is the following then :

   - increment the global counter on each new session and store it into
 the session.
   - increment it again when dealing with a new request over an existing
 session.

 That way it would count each transaction, either TCP connection or HTTP
 request. And since the ID would be assigned to the session, it would
 remain stable for all the period where it's needed.

 What do you think ?

Sounds reasonable. Running through it in my head, I can't conjure up any
scenario where that approach wouldn't work.
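
Willy's proposal — bump the global counter once per session (and once
more per extra request on a kept-alive session), then store the value
on the session — can be modeled with a toy sketch (not haproxy code):

```python
# Toy model of the proposal: the counter is bumped and stored once per
# transaction, so every consumer of the ID reads the same stored copy.
class Global:
    req_count = 0

def new_transaction(g):
    g.req_count += 1       # bumped per TCP session and per HTTP request
    return g.req_count     # stored on the session for later reuse

g = Global()
sess_id = new_transaction(g)   # assigned when the transaction starts
hdr = sess_id                  # read by http-request add-header %rt
uid = sess_id                  # read by unique-id-format %rt
print(hdr, uid)
```

Because both consumers read the stored copy rather than re-reading the
live counter, the off-by-one between add-header and unique-id-format
disappears.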


-Patrick




Re: haproxy duplicate http_request_counter values

2014-01-25 Thread Patrick Hemmer
Confirmed. Testing various scenarios, and they all work.

Thanks for the quick patch :-)

-Patrick


*From: *Willy Tarreau w...@1wt.eu
*Sent: * 2014-01-25 05:09:09 E
*To: *Patrick Hemmer hapr...@stormcloud9.net
*CC: *haproxy@formilux.org haproxy@formilux.org
*Subject: *Re: haproxy duplicate http_request_counter values

 On Sat, Jan 25, 2014 at 05:05:07AM -0500, Patrick Hemmer wrote:
 Sounds reasonable. Running through it in my head, I can't conjure up any
 scenario where that approach wouldn't work.
 Same here. And it works fine for me with the benefit of coherency
 between all reported unique IDs.

 I'm about to merge the attached patch, if you want to confirm that
 it's OK for you as well, feel free to do so :-)

 Willy




Re: Real client IP address question

2014-01-27 Thread Patrick Hemmer
You can use the proxy protocol for this. Haproxy doesn't allow
manipulation of the TCP stream itself as it could be any number of
protocols which haproxy doesn't support. However the proxy protocol
sends a line at the very beginning of the stream containing the client
source IP, port, destination, & destination port, then it starts sending
the data. As such, whatever you're sending to has to be capable of
handling the proxy protocol header (and be configured to do so).

See
http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#5.2-send-proxy
and http://haproxy.1wt.eu/download/1.5/doc/proxy-protocol.txt
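
For reference, a PROXY protocol v1 header is a single human-readable
line sent before any payload. A minimal sketch of parsing one (the
addresses are illustrative):

```python
# Sketch: parse a PROXY protocol v1 line, which precedes the payload:
#   PROXY TCP4 <src-ip> <dst-ip> <src-port> <dst-port>\r\n
line = "PROXY TCP4 192.0.2.10 198.51.100.7 56324 443\r\n"
sig, proto, src_ip, dst_ip, src_port, dst_port = line.strip().split(" ")
assert sig == "PROXY" and proto in ("TCP4", "TCP6")
print(src_ip, int(src_port), "->", dst_ip, int(dst_port))
```

The receiving end must strip this line before handing the rest of the
stream to the application protocol, which is why it has to be
explicitly configured to expect it.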

-Patrick



*From: *Semenov, Evgeny ev.seme...@brokerkf.ru
*Sent: * 2014-01-27 09:06:59 E
*To: *haproxy@formilux.org haproxy@formilux.org
*Subject: *Real client IP address question

 Hi,

  

 There is a setting('forward for' option)in haproxy allowing to forward
 the traffic with the real client IP address to the end server. This
 setting works only for HTTP traffic. Is there a way to make a similar
 setting for TCP?

 I run haproxy on Linux OS.

  

  

  

 Best regards,

 Evgeny Semenov

  




capture.req.hdr

2014-02-06 Thread Patrick Hemmer
I really like this feature, and it was something actually on my todo
list of things to look into adding to haproxy.
However there is one thing I would consider supporting. Instead of
requiring the index of the capture keyword in the config, which is very
cumbersome and awkward in my opinion, support using the header name.

Now I imagine the immediate response to this is going to be that this
would require searching for the header by name every time
capture.req.hdr is used, and the captured headers are stored in a simple
array not maintaining the header names. This would complicate the code
and possibly slow haproxy down.
But, an alternate idea would be to transform the header name into its
index at the time of parsing configuration. This would let the user use
a header name, but the actual haproxy code which translates
capture.req.hdr wouldn't change at all.
It would be a lot less fragile when someone updates their config to
capture an additional header, but forgets to update all indexes (plus
having to keep track of indexes in the first place).
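
The parse-time translation suggested here is cheap to model: resolve
the name against the declared capture order once, and leave the runtime
index lookup untouched. Names below are hypothetical:

```python
# Toy sketch of the proposal: the config parser turns a header name
# into its capture-slot index once, so runtime code still uses the
# cheap integer index exactly as it does today.
declared_captures = ["X-Request-Id", "User-Agent", "Host"]  # config order

def resolve_capture(name):
    # would run while parsing the config; a typo fails at startup
    try:
        return declared_captures.index(name)
    except ValueError:
        raise SystemExit(f"no such captured header: {name}")

idx = resolve_capture("User-Agent")
print(idx)  # the runtime fetch remains an index lookup
```

A side benefit is that a misspelled header name becomes a startup error
instead of silently fetching the wrong capture slot.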


-Patrick


Re: Just a simple thought on health checks after a soft reload of HAProxy....

2014-02-22 Thread Patrick Hemmer
 



*From: *Sok Ann Yap sok...@gmail.com
*Sent: * 2014-02-21 05:11:48 E
*To: *haproxy@formilux.org
*Subject: *Re: Just a simple thought on health checks after a soft
reload of HAProxy

 Patrick Hemmer haproxy@... writes:

   From: Willy Tarreau w at 1wt.eu

   Sent:  2014-01-25 05:45:11 E

 Till now that's exactly what's currently done. The servers are marked
 almost dead, so the first check gives the verdict. Initially we had
 all checks started immediately. But it caused a lot of issues at several
 places where there were a high number of backends or servers mapped to
 the same hardware, because the rush of connection really caused the
 servers to be flagged as down. So we started to spread the checks over
 the longest check period in a farm. 

 Is there a way to enable this behavior? In my
 environment/configuration, it causes absolutely no issue that all
 the checks be fired off at the same time.
 As it is right now, when haproxy starts up, it takes it quite a
 while to discover which servers are down.
 -Patrick

 I faced the same problem in http://thread.gmane.org/
 gmane.comp.web.haproxy/14644

 After much contemplation, I decided to just patch away the initial spread 
 check behavior: https://github.com/sayap/sayap-overlay/blob/master/net-
 proxy/haproxy/files/haproxy-immediate-first-check.diff



I definitely think there should be an option to disable the behavior. We
have an automated system which adds and removes servers from the config,
and then bounces haproxy. Every time haproxy is bounced, we have a
period where it can send traffic to a dead server.


There's also a related bug on this.
The bug is that when I have a config with inter 30s fastinter 1s and
no httpchk enabled, when haproxy first starts up, it spreads the checks
over the period defined as fastinter, but the stats output says UP 1/3
for the full 30 seconds. It also says L4OK in 30001ms, when I know it
doesn't take the server 30 seconds to simply accept a connection.
Yet you get different behavior when using httpchk. When I add option
httpchk, it still spreads the checks over the 1s fastinter value, but
the stats output goes full UP immediately after the check occurs, not
UP 1/3. It also says L7OK/200 in 0ms, which is what I expect to see.

-Patrick



Re: Just a simple thought on health checks after a soft reload of HAProxy....

2014-02-24 Thread Patrick Hemmer
Unfortunately retry doesn't work in our case as we run haproxy on 2
layers, frontend servers and backend servers (to distribute traffic
among multiple processes on each server). So when an app on a server
goes down, the haproxy on that server is still up and accepting
connections, but the layer 7 http checks from the frontend haproxy are
failing. But since the backend haproxy is still accepting connections,
the retry option does not work.

-Patrick


*From: *Baptiste bed...@gmail.com
*Sent: * 2014-02-24 07:18:00 E
*To: *Malcolm Turnbull malc...@loadbalancer.org
*CC: *Neil n...@iamafreeman.com, Patrick Hemmer
hapr...@stormcloud9.net, HAProxy haproxy@formilux.org
*Subject: *Re: Just a simple thought on health checks after a soft
reload of HAProxy

 Hi Malcolm,

 Hence the retry and redispatch options :)
 I know it's a dirty workaround.

 Baptiste


 On Sun, Feb 23, 2014 at 8:42 PM, Malcolm Turnbull
 malc...@loadbalancer.org wrote:
 Neil,

 Yes, peers are great for passing stick tables to the new HAProxy
 instance and any current connections bound to the old process will be
 fine.
 However  any new connections will hit the new HAProxy process and if
 the backend server is down but haproxy hasn't health checked it yet
 then the user will hit a failed server.



 On 23 February 2014 10:38, Neil n...@iamafreeman.com wrote:
 Hello

 Regarding restarts, rather that cold starts, if you configure peers the
 state from before the restart should be kept. The new process haproxy
 creates is automatically a peer to the existing process and gets the state
 as was.

 Neil

 On 23 Feb 2014 03:46, Patrick Hemmer hapr...@stormcloud9.net wrote:



 
 From: Sok Ann Yap sok...@gmail.com
 Sent: 2014-02-21 05:11:48 E
 To: haproxy@formilux.org
 Subject: Re: Just a simple thought on health checks after a soft reload of
 HAProxy

 Patrick Hemmer haproxy@... writes:

   From: Willy Tarreau w at 1wt.eu

   Sent:  2014-01-25 05:45:11 E

 Till now that's exactly what's currently done. The servers are marked
 almost dead, so the first check gives the verdict. Initially we had
 all checks started immediately. But it caused a lot of issues at several
 places where there were a high number of backends or servers mapped to
 the same hardware, because the rush of connection really caused the
 servers to be flagged as down. So we started to spread the checks over
 the longest check period in a farm.

 Is there a way to enable this behavior? In my
 environment/configuration, it causes absolutely no issue that all
 the checks be fired off at the same time.
 As it is right now, when haproxy starts up, it takes it quite a
 while to discover which servers are down.
 -Patrick

 I faced the same problem in http://thread.gmane.org/
 gmane.comp.web.haproxy/14644

 After much contemplation, I decided to just patch away the initial spread
 check behavior: https://github.com/sayap/sayap-overlay/blob/master/net-
 proxy/haproxy/files/haproxy-immediate-first-check.diff



 I definitely think there should be an option to disable the behavior. We
 have an automated system which adds and removes servers from the config, 
 and
 then bounces haproxy. Every time haproxy is bounced, we have a period where
 it can send traffic to a dead server.


 There's also a related bug on this.
 The bug is that when I have a config with inter 30s fastinter 1s and no
 httpchk enabled, when haproxy first starts up, it spreads the checks over
 the period defined as fastinter, but the stats output says UP 1/3 for the
 full 30 seconds. It also says L4OK in 30001ms, when I know it doesn't 
 take
 the server 30 seconds to simply accept a connection.
 Yet you get different behavior when using httpchk. When I add option
 httpchk, it still spreads the checks over the 1s fastinter value, but the
 stats output goes full UP immediately after the check occurs, not UP
 1/3. It also says L7OK/200 in 0ms, which is what I expect to see.

 -Patrick



 --
 Regards,

 Malcolm Turnbull.

 Loadbalancer.org Ltd.
 Phone: +44 (0)870 443 8779
 http://www.loadbalancer.org/




Re: AW: Keeping statistics after a reload

2014-02-28 Thread Patrick Hemmer
I have seen feature requests in the past that when haproxy reloads, to
pull the health status of the servers so that haproxy knows their state
without having to health check them. Willy has said he liked the idea
(http://marc.info/?l=haproxym=139064677914723). If this gets
implemented, it would probably be a minor detail to not only dump the
up/down state, but all stats.

-Patrick




*From: *PiBa-NL piba.nl@gmail.com
*Sent: * 2014-02-28 11:15:19 E
*To: *Andreas Mock andreas.m...@drumedar.de, haproxy@formilux.org
haproxy@formilux.org
*Subject: *Re: AW: Keeping statistics after a reload

 Hi Andreas,

 It's not like your question was wrong, but probably there is no
 good/satisfying short answer to this, and it was overrun by other
 mails...

 As far as I know it is not possible to keep this kind of information
 persisted in haproxy itself when a config restart is needed.

 The -sf only makes sure old connections will nicely be closed when
 they are 'done'.

 I have 'heard' of statistics-gathering tools that use the haproxy unix
 stats socket to query the stats and store the information in a
 separate database; that way you could get continued statistics after
 the config is changed. I don't have any examples of how to do this,
 nor the name of such a tool in mind, though googling for haproxy
 monitoring quickly shows some commercial tools that have haproxy
 plugins and would probably provide answers to the questions you have.

 Maybe others on the list do use programs/scripts/tools to also keep
 historical/cumulative data for haproxy and can share their experience
 with it?

 Greets PiBa-NL

 Andreas Mock schreef op 28-2-2014 16:33:
 Hi all,

 the list is normally really responsive. In this case nobody
 gave an answer. So, I don't know whether my question was such a
 stupid one that nobody wanted to answer.

 So, I bring it up again in the hope someone is answering:
 Is there a way to reload the configuration without losing
 current statistics? Or is this conceptually not possible?

 Best regards
 Andreas Mock

 -----Original message-----
 From: Andreas Mock [mailto:andreas.m...@drumedar.de]
 Sent: Monday, 24 February 2014 16:36
 To: haproxy@formilux.org
 Subject: Keeping statistics after a reload

 Hi all,

 is there a way to reload a haproxy config without resetting the
 statistics shown on the stats page?

 I used

 haproxy -p /var/run/haproxy.pid -sf $(cat /var/run/haproxy.pid)

 to make such a reload. But after that all statistics are reset.

 Best regards
 Andreas Mock








Re: rewrite URI help

2014-03-04 Thread Patrick Hemmer
The haproxy log contains the original request, not the rewritten one. If
you want to see the rewritten URL you need to look at the backend server
which is receiving the request.

-Patrick



*From: *Steve Phillips stw...@gmail.com
*Sent: * 2014-03-04 19:54:44 E
*To: *HAProxy haproxy@formilux.org
*Subject: *rewrite URI help

 Trying to reverse proxy all requests to 

 /slideshare 

 to 

 www.slideshare.net/api/2/get_slideshow

 my front-end config:

  acl url_slideshare   path_dir   slideshare
  use_backend slideshare if url_slideshare

 and back-end:

 backend slideshare
   option http-server-close
   option httpclose
   reqrep ^([^\ ]*)\ /slideshare(.*)  \1\ /api/2/get_slideshow\2
   server slideshare www.slideshare.net:443 ssl verify none

 requests to /slideshow however, are not being rewritten:

  173.11.67.214:60821
  [04/Mar/2014:19:49:03.257] main slideshare/slideshare
  6142/0/289/121/6552 404 9299 - -  0/0/0/0/0 0/0 {} GET
  /slideshare?slideshow_url=http%3A%2F%2Fwww.slideshare.net%2FAaronKlein1%2Foptimizing-aws-economics&detailed=1&api_key=msCpLON8&hash=a7fe5fd52cc86e4a4a3d1022cb7c63476b79e044&ts=1393980574
  HTTP/1.1

 Is my regex incorrect?  Am I missing something else?  

 Thanks.

 Steve



tcp-request content track

2014-03-11 Thread Patrick Hemmer
2 related questions:

I'm trying to find a way to concat multiple samples to use in a stick table.
Basically in my frontend I pattern match on the request path to
determine which backend to send a request to. The client requests also
have a client ID header. I want to rate limit based on a combination of
this pattern that matched, and the client ID. Currently the way I do
this is an http-request set-header rule that adds a new header
combining a unique ID for the pattern that matched along with the
client-ID header. Then in the backend I have a tcp-requst content
track-sc2 on that header. This works, but I'm wondering if there's a
better way.
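
The workaround described above boils down to building one composite
stick-table key; a trivial sketch (the IDs below are hypothetical):

```python
# Sketch of the header-concatenation workaround: combine the matched
# pattern's ID with the client ID into one stick-table key.
pattern_id = "route-42"        # unique ID of the path pattern that matched
client_id = "client-abc"       # value of the X-Client-Id request header
limit_key = f"{pattern_id}:{client_id}"  # set as X-Limit-Id, then tracked
print(limit_key)
```

The tracked header then carries exactly this concatenated value, which
is what "tcp-request content track-sc2" keys the rate limit on.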


Secondly, the above works, but when I do a show table mybackend on the
stats socket, both the conn_cur and use counters never decrease.
They seem to be acting as the total number of requests, not the number
of active connections. Is this a bug, or am I misunderstanding something?


-Patrick


Re: tcp-request content track

2014-03-12 Thread Patrick Hemmer
Created a new config as an example. My existing config is huge, and hard
to read (generated programmatically).

In regards to the bug, it appears it was a bug. I was using 1.5-dev19.
After upgrading to 1.5-dev22 it started behaving as expected.

Below is the config I'm using to accomplish what I want. As mentioned,
I'm basically rate limiting on a combination of the X-Client-Id header
and the matching URL. And as you can see, it's quite ugly and complex to
accomplish it :-(
For example, the same X-Client-Id should be able to hit /foo/bar 3 times
every 15 seconds, with only 1 open connection (the  rules). It
should be able to hit /asdf at 5 times every 15 seconds with 3 open
connections (the  rules).


global
log 127.0.0.1:514 local1 debug
maxconn 4096
daemon
stats socket /tmp/haproxy.sock level admin

defaults
log global
mode http
option httplog
option dontlognull
retries 3
option redispatch
maxconn 2000
timeout connect 200
timeout client 6
timeout server 17
option clitcpka
option srvtcpka
option abortonclose
stats enable
stats uri /haproxy/stats

frontend f1
bind *:1500
option httpclose

acl internal dst 127.0.0.2
acl have_request_id req.fhdr(X-Request-Id) -m found

http-request set-header X-API-URL %[path] if !internal
http-request add-header X-Request-Timestamp %Ts.%ms
http-request set-header X-Request-Id %[req.fhdr(X-Request-Id)] if
internal have_request_id
http-request set-header X-Request-Id %{+X}o%pid-%rt if !internal ||
!have_request_id
http-request set-header X-API-Host i-12345678
http-response set-header X-API-Host i-12345678

unique-id-format %{+X}o%pid-%rt
log-format %ci:%cp\ %ft\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\
%ac/%fc/%bc/%sc/%rc\ %sq/%bq\ %U/%B\ %ST\ %tsc\ %ID\ +\ %r


acl rewrite-found req.hdr(X-Rewrite-ID,1) -m found

acl _path path_reg ^/foo/([^\ ?]*)$
acl _method method GET
http-request set-header X-Rewrite-Id  if !rewrite-found
_path _method
acl -rewrite req.hdr(X-Rewrite-Id) -m str 
http-request set-header X-Limit-Id %[req.hdr(X-Client-Id)] if
-rewrite
use_backend b1 if -rewrite
reqrep ^(GET)\ /foo/([^\ ?]*)([\ ?].*|$) \1\ /echo/bar/\2\3 if
-rewrite

acl _path path_reg ^/([^\ ?]*)$
acl _method method GET
http-request set-header X-Rewrite-Id  if !rewrite-found
_path _method
acl -rewrite req.hdr(X-Rewrite-Id) -m str 
http-request set-header X-Limit-Id %[req.hdr(X-Client-Id)] if
-rewrite
use_backend b1 if -rewrite
reqrep ^(GET)\ /([^\ ?]*)([\ ?].*|$) \1\ /echo/\2\3 if -rewrite

backend b1
stick-table type string len 12 size 1000 expire 1h store
http_req_rate(15000),conn_cur
tcp-request content track-sc2 req.hdr(X-Limit-ID)

acl -rewrite req.hdr(X-Rewrite-Id) -m str 
acl _req_rate sc2_http_req_rate gt 3
acl _conn_cur sc2_conn_cur gt 1
tcp-request content reject if -rewrite _req_rate
tcp-request content reject if -rewrite _conn_cur

acl -rewrite req.hdr(X-Rewrite-Id) -m str 
acl _req_rate sc2_http_req_rate gt 5
acl _conn_cur sc2_conn_cur gt 3
tcp-request content reject if -rewrite _req_rate
tcp-request content reject if -rewrite _conn_cur


server s1 127.0.0.1:2700
server s2 127.0.0.1:2701
server s3 127.0.0.1:2702



-Patrick



*From: *Baptiste bed...@gmail.com
*Sent: * 2014-03-12 06:26:32 E
*To: *Patrick Hemmer hapr...@stormcloud9.net
*CC: *haproxy@formilux.org haproxy@formilux.org
*Subject: *Re: tcp-request content track

 It would be easier to help you if you share your configuration!

 Baptiste

 On Wed, Mar 12, 2014 at 1:36 AM, Patrick Hemmer hapr...@stormcloud9.net 
 wrote:
 2 related questions:

 I'm trying to find a way to concat multiple samples to use in a stick table.
 Basically in my frontend I pattern match on the request path to determine
 which backend to send a request to. The client requests also have a client
 ID header. I want to rate limit based on a combination of this pattern that
 matched, and the client ID. Currently the way I do this is an http-request
 set-header rule that adds a new header combining a unique ID for the
 pattern that matched along with the client-ID header. Then in the backend I
 have a tcp-requst content track-sc2 on that header. This works, but I'm
 wondering if there's a better way.


 Secondly, the above works, but when I do a show table mybackend on the
 stats socket, both the conn_cur and use counters never decrease. They
 seem to be acting as the total number of requests, not the number of active
 connections. Is this a bug, or am I misunderstanding something?


 -Patrick



module/plugin support?

2014-03-18 Thread Patrick Hemmer
I was wondering if there were ever any thoughts about adding
module/plugin support to haproxy.

The plugin would be used for adding features to haproxy that are beyond
the scope of haproxy's core focus (fast simple load balancing).
Reading the recent radius authentication thread surprised me. I never
would have expected that to be something haproxy would support. But I
think it would make sense as a plugin.

-Patrick


Re: Radius authentication

2014-03-18 Thread Patrick Hemmer
I'm assuming it'll be generic authentication. What information will be
made available to the auth daemon? Just the Authorization header?

I would love a feature that allowed any/multiple header to be passed
through. We use haproxy on an API service, which all incoming requests
must pass in a key and signature. The signature is a hash of a secret
token, the URI and several headers. Currently each backend application
that receives the request has to perform the authentication, but it
would be awesome if we could leverage this auth daemon to perform the
authentication before passing the request through.

-Patrick


*From: *Baptiste bed...@gmail.com
*Sent: * 2014-03-18 11:03:56 E
*To: *Roel Cuppen r...@cuppie.com
*CC: *HAProxy haproxy@formilux.org
*Subject: *Re: Radius authentication

 Well, I'm currently writing a piece of code which stands behind
 HAProxy and whose purpose is to authenticate a user.
 Once authenticated, it updates HAProxy who, in turn, let the user
 browse the application and sets authentication requirement on the fly.

 I think OTP will be possible :)

 Still a lot of work to do on this project and HAProxy needs some
 patches as well, so I can't say more for now.
 Just stay tuned, I'll update the ML once done :)

 That said, if you have some requirements, this is the moment :)

 Baptiste


 On Tue, Mar 18, 2014 at 2:04 PM, Roel Cuppen r...@cuppie.com wrote:
 Hi Baptiste,

 Many thanks for your explanation.
 What kind of daemon is it?


 OTP = One Time Password.

 Kind regards,

 Roel


 2014-03-18 11:03 GMT+01:00 Baptiste bed...@gmail.com:

 Hi Roel,

 Let say there are currently some developments in that way.
 It won't be part of HAProxy, but rather a third party daemon
 interacting deeply with HAProxy.

 What do you mean by OTP?

 Baptiste



 On Mon, Mar 17, 2014 at 9:43 PM, Roel Cuppen r...@cuppie.com wrote:
 Hi,

 I would like to know if it is possible to add radius authentication, so
 that the http authentication users can exist in a radius database.

 Whenever a radius authentication feature is active, it's possible to
 add OTP authentication.

 Kind regards,

 Cuppie




Re: No ssl or crt in bind when compiled with USE_OPENSSL=1

2014-03-30 Thread Patrick Hemmer
1.4 does not support SSL. SSL was added in 1.5-dev12

-Patrick



*From: *Juan Jimenez jjime...@electric-cloud.com
*Sent: * 2014-03-30 02:44:42 E
*To: *haproxy@formilux.org haproxy@formilux.org
*Subject: *No ssl or crt in bind when compiled with USE_OPENSSL=1

 I am trying to figure out why haproxy 1.4.25 does not like crt and ssl in
 bind…

 I recompiled with:

   make TARGET=linux2628 USE_OPENSSL=1
   make install

 The cfg file looks like this:

 global 
 log 127.0.0.1   local0
 log 127.0.0.1   local1 notice
 #log loghostlocal0 info
 maxconn 4096
 #chroot /usr/share/haproxy
 user skytap
 group skytap
 daemon
 #debug
 #quiet

 defaults
 log global
 option  dontlognull
 retries 3
 option redispatch
 maxconn 2000
 contimeout  5000
 clitimeout  5
 srvtimeout  5

 listen stats *:1936
mode http
stats enable
stats realm Haproxy\ Statistics
stats uri /
stats refresh 30
stats show-legends

 frontend commander-server-frontend-insecure
  mode http
  bind 0.0.0.0:8000
  default_backend commander-server-backend

 frontend commander-stomp-frontend
  mode tcp
  bind 0.0.0.0:61613 ssl crt /home/skytap/server.pem
  default_backend commander-stomp-backend
  option tcplog
  log global

 frontend commander-server-frontend-secure
  mode tcp
  bind 0.0.0.0:8443 ssl crt /home/skytap/server.pem
  default_backend commander-server-backend

 backend commander-server-backend
 mode http
 server node1 10.0.0.7:8000 check
 server node2 10.0.0.9:8000 check
   server node3 10.0.0.10:8000 check
 stats enable
 option httpchk GET /commanderRequest/health

 backend commander-stomp-backend
 mode tcp
 server node1 10.0.0.7:61613 check
 server node2 10.0.0.9:61613 check
   server node3 10.0.0.10:61613 check
 option tcplog
 log global



 And the error messages are:

 [skytap@haproxy haproxy-1.4.25]$ haproxy -c -f /etc/haproxy/haproxy.cfg
 [ALERT] 087/233343 (3964) : parsing [/etc/haproxy/haproxy.cfg:38] : 'bind'
 only supports the 'transparent', 'defer-accept', 'name', 'id', 'mss' and
 'interface' options.
 [ALERT] 087/233343 (3964) : parsing [/etc/haproxy/haproxy.cfg:45] : 'bind'
 only supports the 'transparent', 'defer-accept', 'name', 'id', 'mss' and
 'interface' options.
 [ALERT] 087/233343 (3964) : Error(s) found in configuration file :
 /etc/haproxy/haproxy.cfg
 [ALERT] 087/233343 (3964) : Fatal errors found in configuration.


 ??


 Juan Jiménez
 Electric Cloud, Inc.
 Sr. Solutions Engineer - US Northeast Region
 Mobile +1.787.464.5062 | Fax +1.617-766-6980
 jjime...@electric-cloud.com
 www.electric-cloud.com http://www.electric-cloud.com/





haproxy intermittently not connecting to backend

2014-04-01 Thread Patrick Hemmer
We have an issue with haproxy (1.5-dev22-1a34d57) where it is
intermittently not connecting to the backend server. However the
behavior it is exhibiting seems strange.
The reason I say strange is that in one example, it logged that the
client disconnected after ~49 seconds with a connection flags of CC--.
However our config has timeout connect 5000, so it should have timed
out connecting to the backend server after 5 seconds. Additionally we
have retries 3 in the config, so upon timing out, it should have tried
another backend server, but it never did (the retries counter in the log
shows 0).
At the time of this log entry, the backend server is responding
properly. For the ~49 seconds prior to the log entry, the backend server
has taken other requests. The backend server is also another haproxy
(same version).

Here's an example of one such log entry:

198.228.211.13:60848 api~ platform-push/i-84d931a5 49562/0/-1/-1/49563
0/0/0/0/0 0/0 691/212 503 CC-- 4F8E-4624 + GET
/1/sync/notifications/subscribe?sync_box_id=12345&sender=27B9A93C-F473-4385-A662-352AD34A2453 HTTP/1.1


The log format is defined as:
%ci:%cp\ %ft\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\ %ac/%fc/%bc/%sc/%rc\ %sq/%bq\
%U/%B\ %ST\ %tsc\ %ID\ +\ %r

Running a show errors on the stats socket did not return any relevant
results.

Here's the relevant portions of the haproxy config. It is not the entire
thing as the whole config is 1,513 lines long.

global
  log 127.0.0.1 local0
  maxconn 20480
  user haproxy
  group haproxy
  daemon
  stats socket /var/run/hapi/haproxy/haproxy.sock level admin

defaults
  log global
  mode http
  option httplog
  option dontlognull
  option log-separate-errors
  retries 3
  option redispatch
  timeout connect 5000
  timeout client 6
  timeout server 17
  option clitcpka
  option srvtcpka
  option abortonclose
  option splice-auto
  monitor-uri /haproxy/ping
  stats enable
  stats uri /haproxy/stats
  stats refresh 15
  stats auth user:pass

frontend api
  bind *:80
  bind *:443 ssl crt /etc/haproxy/server.pem
  maxconn 2
  option httpclose
  option forwardfor
  acl internal src 10.0.0.0/8
  acl have_request_id req.fhdr(X-Request-Id) -m found
  http-request set-nice -100 if internal
  http-request add-header X-API-URL %[path] if !internal
  http-request add-header X-Request-Timestamp %Ts.%ms
  http-request add-header X-Request-Id %[req.fhdr(X-Request-Id)] if
internal have_request_id
  http-request set-header X-Request-Id %{+X}o%pid-%rt if !internal ||
!have_request_id
  http-request add-header X-API-Host i-4a3b1c6a
  unique-id-format %{+X}o%pid-%rt
  log-format %ci:%cp\ %ft\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\
%ac/%fc/%bc/%sc/%rc\ %sq/%bq\ %U/%B\ %ST\ %tsc\ %ID\ +\ %r
  default_backend DEFAULT_404

  acl rewrite-found req.hdr(X-Rewrite-ID,1) -m found

  acl nqXn_path path_reg ^/1/sync/notifications/subscribe/([^\ ?]*)$
  acl nqXn_method method OPTIONS GET HEAD POST PUT DELETE TRACE CONNECT
PATCH
  http-request set-header X-Rewrite-Id nqXn if !rewrite-found nqXn_path
nqXn_method
  acl rewrite-nqXn req.hdr(X-Rewrite-Id) -m str nqXn
  use_backend platform-push if rewrite-nqXn
  reqrep ^(OPTIONS|GET|HEAD|POST|PUT|DELETE|TRACE|CONNECT|PATCH)\
/1/sync/notifications/subscribe/([^\ ?]*)([\ ?].*|$) \1\
/1/sync/subscribe/\2\3 if rewrite-nqXn


backend platform-push
  option httpchk GET /ping
  default-server inter 15s fastinter 1s
  server i-6eaf724d 10.230.23.64:80 check observe layer4
  server i-84d931a5 10.230.42.8:80 check observe layer4



Re: haproxy intermittently not connecting to backend

2014-04-01 Thread Patrick Hemmer
Apologies, my mail client went stupid. Here's the log entry unmangled:

198.228.211.13:60848 api~ platform-push/i-84d931a5 49562/0/-1/-1/49563
0/0/0/0/0 0/0 691/212 503 CC-- 4F8E-4624 + GET
/1/sync/notifications/subscribe?sync_box_id=12496&sender=D7A9F93D-F653-4527-A022-383AD55A1943
HTTP/1.1

-Patrick



*From: *Patrick Hemmer hapr...@stormcloud9.net
*Sent: * 2014-04-01 15:20:15 E
*To: *haproxy@formilux.org
*Subject: *haproxy intermittently not connecting to backend

 We have an issue with haproxy (1.5-dev22-1a34d57) where it is
 intermittently not connecting to the backend server. However the
 behavior it is exhibiting seems strange.
 The reason I say strange is that in one example, it logged that the
 client disconnected after ~49 seconds with a connection flags of
 CC--. However our config has timeout connect 5000, so it should
 have timed out connecting to the backend server after 5 seconds.
 Additionally we have retries 3 in the config, so upon timing out, it
 should have tried another backend server, but it never did (the
 retries counter in the log shows 0).
 At the time of this log entry, the backend server is responding
 properly. For the ~49 seconds prior to the log entry, the backend
 server has taken other requests. The backend server is also another
 haproxy (same version).

 Here's an example of one such log entry:

 198.228.211.13:60848 api~ platform-push/i-84d931a5 49562/0/-1/-1/49563
 0/0/0/0/0 0/0 691/212 503 CC-- 4F8E-4624 + GET
 /1/sync/notifications/subscribe?sync_box_id=12345&sender=27B9A93C-F473-4385-A662-352AD34A2453 HTTP/1.1

 The log format is defined as:
 %ci:%cp\ %ft\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\ %ac/%fc/%bc/%sc/%rc\
 %sq/%bq\ %U/%B\ %ST\ %tsc\ %ID\ +\ %r

 Running a show errors on the stats socket did not return any
 relevant results.

 Here's the relevant portions of the haproxy config. It is not the
 entire thing as the whole config is 1,513 lines long.

 global
   log 127.0.0.1 local0
   maxconn 20480
   user haproxy
   group haproxy
   daemon
   stats socket /var/run/hapi/haproxy/haproxy.sock level admin

 defaults
   log global
   mode http
   option httplog
   option dontlognull
   option log-separate-errors
   retries 3
   option redispatch
   timeout connect 5000
   timeout client 6
   timeout server 17
   option clitcpka
   option srvtcpka
   option abortonclose
   option splice-auto
   monitor-uri /haproxy/ping
   stats enable
   stats uri /haproxy/stats
   stats refresh 15
   stats auth user:pass

 frontend api
   bind *:80
   bind *:443 ssl crt /etc/haproxy/server.pem
   maxconn 2
   option httpclose
   option forwardfor
   acl internal src 10.0.0.0/8
   acl have_request_id req.fhdr(X-Request-Id) -m found
   http-request set-nice -100 if internal
   http-request add-header X-API-URL %[path] if !internal
   http-request add-header X-Request-Timestamp %Ts.%ms
   http-request add-header X-Request-Id %[req.fhdr(X-Request-Id)] if
 internal have_request_id
   http-request set-header X-Request-Id %{+X}o%pid-%rt if !internal ||
 !have_request_id
   http-request add-header X-API-Host i-4a3b1c6a
   unique-id-format %{+X}o%pid-%rt
   log-format %ci:%cp\ %ft\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\
 %ac/%fc/%bc/%sc/%rc\ %sq/%bq\ %U/%B\ %ST\ %tsc\ %ID\ +\ %r
   default_backend DEFAULT_404

   acl rewrite-found req.hdr(X-Rewrite-ID,1) -m found

   acl nqXn_path path_reg ^/1/sync/notifications/subscribe/([^\ ?]*)$
   acl nqXn_method method OPTIONS GET HEAD POST PUT DELETE TRACE
 CONNECT PATCH
   http-request set-header X-Rewrite-Id nqXn if !rewrite-found
 nqXn_path nqXn_method
   acl rewrite-nqXn req.hdr(X-Rewrite-Id) -m str nqXn
   use_backend platform-push if rewrite-nqXn
   reqrep ^(OPTIONS|GET|HEAD|POST|PUT|DELETE|TRACE|CONNECT|PATCH)\
 /1/sync/notifications/subscribe/([^\ ?]*)([\ ?].*|$) \1\
 /1/sync/subscribe/\2\3 if rewrite-nqXn


 backend platform-push
   option httpchk GET /ping
   default-server inter 15s fastinter 1s
   server i-6eaf724d 10.230.23.64:80 check observe layer4
   server i-84d931a5 10.230.42.8:80 check observe layer4




Re: modifing default haproxy emit codes

2014-04-02 Thread Patrick Hemmer
You want the errorfile config param.
http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#errorfile
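
A sketch of that approach, assuming the error file lives under /etc/haproxy/errors (the file path and backend name are illustrative). errorfile serves the file's raw bytes verbatim, so the 204 status line inside the file is exactly what the client receives, even though the internal trigger is haproxy's 503/504:

```
# /etc/haproxy/errors/204.http must contain a complete raw HTTP response:
#   HTTP/1.0 204 No Content
#   Cache-Control: no-cache
#   Connection: close
#   (followed by an empty line)

backend app
    errorfile 503 /etc/haproxy/errors/204.http
    errorfile 504 /etc/haproxy/errors/204.http
```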

-Patrick



*From: *Piavlo lolitus...@gmail.com
*Sent: * 2014-04-02 15:16:22 E
*To: *haproxy@formilux.org
*Subject: *modifing default haproxy emit codes

  Hi,

 According to the docs:

 Haproxy may emit the following status codes by itself :
503  when no server was available to handle the request, or in
 response to
 monitoring requests which match the monitor fail condition
504  when the response timeout strikes before the server responds

 Instead, what I need is for haproxy to respond with 204 if no server
 is available, or if the server does not send an http response within a
 defined timeout. Is that possible?
 I assume I can define a fallback backend that will redispatch the
 requests to a dumb http server that always answers with 204? It would
 be much better, though, if haproxy itself could reply with 204.

 tnx




Re: haproxy intermittently not connecting to backend

2014-04-02 Thread Patrick Hemmer
That makes perfect sense. Thank you very much.

-Patrick


*From: *Willy Tarreau w...@1wt.eu
*Sent: * 2014-04-02 15:38:04 E
*To: *Patrick Hemmer hapr...@stormcloud9.net
*CC: *haproxy@formilux.org
*Subject: *Re: haproxy intermittently not connecting to backend

 Hi Patrick,

 On Tue, Apr 01, 2014 at 03:20:15PM -0400, Patrick Hemmer wrote:
 We have an issue with haproxy (1.5-dev22-1a34d57) where it is
 intermittently not connecting to the backend server. However the
 behavior it is exhibiting seems strange.
 The reason I say strange is that in one example, it logged that the
 client disconnected after ~49 seconds with a connection flags of CC--.
 However our config has timeout connect 5000, so it should have timed
 out connecting to the backend server after 5 seconds. Additionally we
 have retries 3 in the config, so upon timing out, it should have tried
 another backend server, but it never did (the retries counter in the log
 shows 0).
 No, retries impacts only retries to the same server, it's option redispatch
 which allows the last retry to be performed on another server. But you have
 it anyway.

 At the time of this log entry, the backend server is responding
 properly. For the ~49 seconds prior to the log entry, the backend server
 has taken other requests. The backend server is also another haproxy
 (same version).

 Here's an example of one such log entry:
 [fixed version pasted here]

 198.228.211.13:60848 api~ platform-push/i-84d931a5 49562/0/-1/-1/49563 
 0/0/0/0/0 0/0 691/212 503 CC-- 4F8E-4624 + GET 
 /1/sync/notifications/subscribe?sync_box_id=12496&sender=D7A9F93D-F653-4527-A022-383AD55A1943
  HTTP/1.1
 OK in fact the client did not wait 49 seconds. If you look closer, you'll
 see that the client remained silent for 49 seconds (typically a connection
 pool or a preconnect) and closed immediately after sending the request (in
 the same millisecond). Since you have option abortonclose, the connection
 was aborted before the server had a chance to respond.

 So I can easily imagine that you randomly get this error, you're in a race
 condition, if the server responds immediately, you win the race and the
 request is handled, otherwise it's aborted.

 Please start by removing option abortonclose, I think it will fix the issue.
 Second thing you can do is to remove option httpclose or replace it with
 option http-server-close which is active and not just passive. The 
 connections
 will last less time on your servers which is always appreciated.

 I'm not seeing any other issue, so with just this you should be fine.

 The log format is defined as:
 %ci:%cp\ %ft\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\ %ac/%fc/%bc/%sc/%rc\ %sq/%bq\
 %U/%B\ %ST\ %tsc\ %ID\ +\ %r

 Running a show errors on the stats socket did not return any relevant
 results.

 Here's the relevant portions of the haproxy config. It is not the entire
 thing as the whole config is 1,513 lines long.

 global
   log 127.0.0.1 local0
   maxconn 20480
   user haproxy
   group haproxy
   daemon
   stats socket /var/run/hapi/haproxy/haproxy.sock level admin

 defaults
   log global
   mode http
   option httplog
   option dontlognull
   option log-separate-errors
   retries 3
   option redispatch
   timeout connect 5000
   timeout client 6
   timeout server 17
   option clitcpka
   option srvtcpka
   option abortonclose
   option splice-auto
   monitor-uri /haproxy/ping
   stats enable
   stats uri /haproxy/stats
   stats refresh 15
   stats auth user:pass

 frontend api
   bind *:80
   bind *:443 ssl crt /etc/haproxy/server.pem
   maxconn 2
   option httpclose
   option forwardfor
   acl internal src 10.0.0.0/8
   acl have_request_id req.fhdr(X-Request-Id) -m found
   http-request set-nice -100 if internal
   http-request add-header X-API-URL %[path] if !internal
   http-request add-header X-Request-Timestamp %Ts.%ms
   http-request add-header X-Request-Id %[req.fhdr(X-Request-Id)] if
 internal have_request_id
   http-request set-header X-Request-Id %{+X}o%pid-%rt if !internal ||
 !have_request_id
   http-request add-header X-API-Host i-4a3b1c6a
   unique-id-format %{+X}o%pid-%rt
   log-format %ci:%cp\ %ft\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\
 %ac/%fc/%bc/%sc/%rc\ %sq/%bq\ %U/%B\ %ST\ %tsc\ %ID\ +\ %r
   default_backend DEFAULT_404

   acl rewrite-found req.hdr(X-Rewrite-ID,1) -m found

   acl nqXn_path path_reg ^/1/sync/notifications/subscribe/([^\ ?]*)$
   acl nqXn_method method OPTIONS GET HEAD POST PUT DELETE TRACE CONNECT
 PATCH
   http-request set-header X-Rewrite-Id nqXn if !rewrite-found nqXn_path
 nqXn_method
   acl rewrite-nqXn req.hdr(X-Rewrite-Id) -m str nqXn
   use_backend platform-push if rewrite-nqXn
   reqrep ^(OPTIONS|GET|HEAD|POST|PUT|DELETE|TRACE|CONNECT|PATCH)\
 /1/sync/notifications/subscribe/([^\ ?]*)([\ ?].*|$) \1\
 /1/sync/subscribe/\2\3 if rewrite-nqXn


 backend platform-push
   option httpchk GET /ping
   default-server inter 15s fastinter 1s
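
The two changes suggested in the reply above amount to a small config delta (only the affected lines are shown; everything else in the original config stays as-is):

```
defaults
    # removed: option abortonclose
    # (a client close no longer aborts the request before the server
    #  has a chance to respond, avoiding the race described above)

frontend api
    # replaced the passive "option httpclose" with the active variant,
    # so connections toward the servers are closed sooner:
    option http-server-close
```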

suppress reqrep / use_backend warning

2014-04-08 Thread Patrick Hemmer
Would it be possible to get an option to suppress the warning when a
reqrep rule is placed after a use_backend rule?
[WARNING] 097/205824 (4777) : parsing
[/var/run/hapi/haproxy/haproxy.cfg:1443] : a 'reqrep' rule placed after
a 'use_backend' rule will still be processed before.

I prefer keeping my related rules grouped together, and so this message
pops up every time haproxy is (re)started. Currently it logs out 264
lines each start (I have a lot of rules), and is thus fairly annoying. I
am well aware of what the message means and my configuration is not
affected by it.

-Patrick


haproxy mis-reporting layer 4 checks

2014-04-10 Thread Patrick Hemmer
I've brought up this bug before
(http://marc.info/?l=haproxy&m=139312718801838), but it seems to not
have gotten any attention, so I'm raising it again.

There is an issue with haproxy mis-reporting layer 4 checks. There are
2, likely related, issues.
1) When haproxy first starts up, it will report the server as UP 1/3
for however long the check interval is set to. If the interval is 30
seconds, it will say UP 1/3 for 30 seconds.
2) Haproxy is adding the check interval time to the time of the check
itself. For example, if I have a check interval of 30 seconds, the
statistics output reports a check completion time of 30001ms.

Attached is a simple configuration that can be used to demonstrate this
issue. Launch haproxy, and then go to http://localhost/haproxy/stats

-Patrick
global
log 127.0.0.1   local0

defaults
log global
modehttp
option  httplog
timeout connect 5000
timeout client 6
timeout server 17

stats   enable
stats   uri /haproxy/stats

frontend f1
bind 0.0.0.0:9000

default_backend b1

backend b1
server s1 localhost:9001 check inter 1

frontend f2
bind 0.0.0.0:9001


Re: haproxy mis-reporting layer 4 checks

2014-04-11 Thread Patrick Hemmer


*From: *Willy Tarreau w...@1wt.eu
*Sent: * 2014-04-11 08:29:15 E
*To: *Patrick Hemmer hapr...@stormcloud9.net
*CC: *haproxy@formilux.org
*Subject: *Re: haproxy mis-reporting layer 4 checks

 Hi Patrick,

 On Thu, Apr 10, 2014 at 02:17:02PM -0400, Patrick Hemmer wrote:
 I've brought up this bug before
 (http://marc.info/?l=haproxy&m=139312718801838), but it seems to not
 have gotten any attention, so I'm raising it again.

 There is an issue with haproxy mis-reporting layer 4 checks. There are
 2, likely related, issues.
 1) When haproxy first starts up, it will report the server as UP 1/3
 for however long the check interval is set to. If the interval is 30
 seconds, it will say UP 1/3 for 30 seconds.
 2) Haproxy is adding the check interval time to the time of the check
 itself. For example, if I have a check interval of 30 seconds, the
 statistics output reports a check completion time of 30001ms.
 We used to have this bug in a certain version (I don't remember which
 one), but I fail to reproduce it anymore with latest master using your
 configuration. What version are you running ?

 Willy

Ah, you're right. I was testing against 1.5-dev22. Using the latest
master this is fixed.

Sorry for the noise.

-Patrick



Re: haproxy intermittently not connecting to backend

2014-04-11 Thread Patrick Hemmer
This just keeps coming back to bug me. I don't think the client closing
the connection should result in a 5XX code. 5XX should indicate a server
issue, and the client closing the connection before the server has a
chance to respond isn't a server issue. Only if the server doesn't
respond within the configured timeout should it be a 5XX.

Nginx uses 499 for client closed connection. Perhaps haproxy could use
that status code as well when `option abortonclose` is used.

-Patrick



*From: *Patrick Hemmer hapr...@stormcloud9.net
*Sent: * 2014-04-02 15:50:22 E
*To: *haproxy@formilux.org
*Subject: *Re: haproxy intermittently not connecting to backend

 That makes perfect sense. Thank you very much.

 -Patrick

 
 *From: *Willy Tarreau w...@1wt.eu
 *Sent: * 2014-04-02 15:38:04 E
 *To: *Patrick Hemmer hapr...@stormcloud9.net
 *CC: *haproxy@formilux.org
 *Subject: *Re: haproxy intermittently not connecting to backend

 Hi Patrick,

 On Tue, Apr 01, 2014 at 03:20:15PM -0400, Patrick Hemmer wrote:
 We have an issue with haproxy (1.5-dev22-1a34d57) where it is
 intermittently not connecting to the backend server. However the
 behavior it is exhibiting seems strange.
 The reason I say strange is that in one example, it logged that the
 client disconnected after ~49 seconds with a connection flags of CC--.
 However our config has timeout connect 5000, so it should have timed
 out connecting to the backend server after 5 seconds. Additionally we
 have retries 3 in the config, so upon timing out, it should have tried
 another backend server, but it never did (the retries counter in the log
 shows 0).
 No, retries impacts only retries to the same server, it's option redispatch
 which allows the last retry to be performed on another server. But you have
 it anyway.

 At the time of this log entry, the backend server is responding
 properly. For the ~49 seconds prior to the log entry, the backend server
 has taken other requests. The backend server is also another haproxy
 (same version).

 Here's an example of one such log entry:
 [fixed version pasted here]

 198.228.211.13:60848 api~ platform-push/i-84d931a5 49562/0/-1/-1/49563 
 0/0/0/0/0 0/0 691/212 503 CC-- 4F8E-4624 + GET 
 /1/sync/notifications/subscribe?sync_box_id=12496&sender=D7A9F93D-F653-4527-A022-383AD55A1943
  HTTP/1.1
 OK in fact the client did not wait 49 seconds. If you look closer, you'll
 see that the client remained silent for 49 seconds (typically a connection
 pool or a preconnect) and closed immediately after sending the request (in
 the same millisecond). Since you have option abortonclose, the connection
 was aborted before the server had a chance to respond.

 So I can easily imagine that you randomly get this error, you're in a race
 condition, if the server responds immediately, you win the race and the
 request is handled, otherwise it's aborted.

 Please start by removing option abortonclose, I think it will fix the 
 issue.
 Second thing you can do is to remove option httpclose or replace it with
 option http-server-close which is active and not just passive. The 
 connections
 will last less time on your servers which is always appreciated.

 I'm not seeing any other issue, so with just this you should be fine.

 The log format is defined as:
 %ci:%cp\ %ft\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\ %ac/%fc/%bc/%sc/%rc\ %sq/%bq\
 %U/%B\ %ST\ %tsc\ %ID\ +\ %r

 Running a show errors on the stats socket did not return any relevant
 results.

 Here's the relevant portions of the haproxy config. It is not the entire
 thing as the whole config is 1,513 lines long.

 global
   log 127.0.0.1 local0
   maxconn 20480
   user haproxy
   group haproxy
   daemon
   stats socket /var/run/hapi/haproxy/haproxy.sock level admin

 defaults
   log global
   mode http
   option httplog
   option dontlognull
   option log-separate-errors
   retries 3
   option redispatch
   timeout connect 5000
   timeout client 6
   timeout server 17
   option clitcpka
   option srvtcpka
   option abortonclose
   option splice-auto
   monitor-uri /haproxy/ping
   stats enable
   stats uri /haproxy/stats
   stats refresh 15
   stats auth user:pass

 frontend api
   bind *:80
   bind *:443 ssl crt /etc/haproxy/server.pem
   maxconn 2
   option httpclose
   option forwardfor
   acl internal src 10.0.0.0/8
   acl have_request_id req.fhdr(X-Request-Id) -m found
   http-request set-nice -100 if internal
   http-request add-header X-API-URL %[path] if !internal
   http-request add-header X-Request-Timestamp %Ts.%ms
   http-request add-header X-Request-Id %[req.fhdr(X-Request-Id)] if
 internal have_request_id
   http-request set-header X-Request-Id %{+X}o%pid-%rt if !internal ||
 !have_request_id
   http-request add-header X-API-Host i-4a3b1c6a
   unique-id-format %{+X}o%pid-%rt
   log-format %ci:%cp\ %ft\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\
 %ac/%fc/%bc

Re: suppress reqrep / use_backend warning

2014-04-13 Thread Patrick Hemmer


*From: *Cyril Bonté cyril.bo...@free.fr
*Sent: * 2014-04-13 11:15:26 E
*To: *Patrick Hemmer hapr...@stormcloud9.net
*CC: *haproxy@formilux.org
*Subject: *Re: suppress reqrep / use_backend warning

 Hi Patrick,

 On 08/04/2014 23:04, Patrick Hemmer wrote:
 Would it be possible to get an option to suppress the warning when a
 reqrep rule is placed after a use_backend rule?
 [WARNING] 097/205824 (4777) : parsing
 [/var/run/hapi/haproxy/haproxy.cfg:1443] : a 'reqrep' rule placed after
 a 'use_backend' rule will still be processed before.

 I prefer keeping my related rules grouped together, and so this message
 pops up every time haproxy is (re)started. Currently it logs out 264
 lines each start (I have a lot of rules), and is thus fairly annoying. I
 am well aware of what the message means and my configuration is not
 affected by it.


 Do you want to ignore all warnings or only some of them?
I would think ignoring only some warnings would be preferable. Ignoring
all warnings might lead to people disabling them all, and then when a
new warning comes up that hasn't been seen before, it'll be missed.


 For the first case you can use the global keyword quiet (or its
 command line equivalent -q).
Ah, didn't know `quiet` would suppress warnings as well. This might be
acceptable.


 For the second one, there is nothing available yet, but I was thinking
 of something like annotations in configuration comments.
 For example :
 - @ignore-warnings to ignore the warnings of the current line
 - @BEGIN ignore-warnings to start a block of lines where warnings will
 be ignored
 - @END ignore-warnings to stop ignoring warnings.

 frontend test :
   mode http
   reqrep ^([^\ :]*)\ /static/(.*) \1\ /\2
   block if TRUE   # @ignore-warnings
   block if FALSE  # @ignore-warnings
   block if TRUE
   block if TRUE
   block if TRUE
   # @BEGIN ignore-warnings
   block if TRUE
   block if TRUE
   block if TRUE
   block if TRUE
   block if TRUE
   # @END ignore-warnings
   block if TRUE
   block if TRUE
   block if TRUE

 Please find a quick and dirty patch to illustrate. Is this something
 that could be useful?
Hadn't really thought about the best way to solve it until now. I like
the per-line suppression more than the @BEGIN/@END one. The only other
way I can think of doing this is by having a config directive such as:
ignore-warnings reqrep_use_backend
Which would suppress all occurrences of that specific warning. But then
the warning message itself would need some sort of identifier on it so
we knew what argument to pass to 'ignore-warnings'

I'll play with the patch tomorrow, see how manageable it is.

But really, this is a trivial matter. I'd be OK with whatever is decided.


-Patrick
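
Until a finer-grained mechanism exists, the only stock workaround is the one mentioned above: the global quiet keyword (equivalent to -q on the command line). Note it suppresses all warnings, not just this one:

```
global
    quiet   # hides every startup warning, including ones you may want to see
```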


Re: haproxy intermittently not connecting to backend

2014-04-14 Thread Patrick Hemmer
*From: *Willy Tarreau w...@1wt.eu
*Sent: * 2014-04-14 11:27:59 E
*To: *Patrick Hemmer hapr...@stormcloud9.net
*CC: *haproxy@formilux.org
*Subject: *Re: haproxy intermittently not connecting to backend

 Hi Patrick,

 On Sat, Apr 12, 2014 at 01:38:54AM -0400, Patrick Hemmer wrote:
 This just keeps coming back to bug me. I don't think the client closing
 the connection should result in a 5XX code. 5XX should indicate a server
 issue, and the client closing the connection before the server has a
 chance to respond isn't a server issue. Only if the server doesn't
 respond within the configured timeout should it be a 5XX.

 Nginx uses 499 for client closed connection. Perhaps haproxy could use
 that status code as well when `option abortonclose` is used.
 It's wrong to invent new status codes, because they'll sooner or later
 conflict with one officially assigned (or worse, they'll become so much
 used that they'll make it harder to improve the standards).
RFC2616 says HTTP status codes are extensible and even gives a
specific scenario for how the client should handle an unregistered code
(look for the "if an unrecognized status code of 431 is received by the
client" example).

 I get your point though. I'm used to say that 5xx is an error that the
 client should not be able to induce. This is not totally right nowadays
 due to 501, nor due to web services which like to return 500 when they
 want to return false... But in general that's the idea.

 However here it's not as black and white. If a client manages to close
 before the connection to the server is opened, it's generally because
 the server takes time to accept the connection. The longer it takes,
 the higher the number of 503 due to client aborts. What we should try
 to avoid in my opinion is to return 503 immediately. I think that the
 semantics of 408 are the closest to what we're expecting in terms of
 asking the client to retry if needed, eventhough that's a different
 technical issue. I'd rather not use plain 400 to avoid polluting the
 real 400 that admins have a hard time trying to fix sometimes.
I disagree with the statement that we should avoid an immediate response
when the connection is closed. Going back to RFC1945 (HTTP 1.0), we have
this:
"In any case, the closing of the connection by either or both parties
always terminates the current request, regardless of its status."
But that is HTTP 1.0, so its validity in this case is tenuous. I
couldn't find a similar statement in RFC2616, or anything which states
how it should be handled when the client closes its connection prior to
the response. I guess this is why it's a configurable option :-)

If we want to use a registered status code, I would argue in favor of
417, which has the following in its description:
"if the server is a proxy, the server has unambiguous evidence that the
request could not be met by the next-hop server"

Would it be difficult to add a parameter to the option? Such as "option
httpclose 417" to control how haproxy responds?

 Any opinion on this?

 Willy





haproxy incorrectly reporting connection flags

2014-04-16 Thread Patrick Hemmer
With 1.5-dev22, we have a scenario where haproxy is saying the client
closed the connection, but really the server is the one that closed it.

Here is the log entry from haproxy:
haproxy[12540]: 10.230.0.195:33580 storage_upd storage_upd/storage_upd_2
0/0/0/522/555 0/0/0/0/0 0/0 412/271 200 CD-- 73E3-20FF5 + GET
/1/public_link/1BMcSfqg3OM4Ng HTTP/1.1

The log format is defined as:
capture request header X-Request-Id len 64
log-format %ci:%cp\ %ft\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\
%ac/%fc/%bc/%sc/%rc\ %sq/%bq\ %U/%B\ %ST\ %tsc\ %hrl\ +\ %r

Attached is the haproxy config, along with a packet capture between
haproxy and the backend server.
The packet capture shows that the backend server listening on port 4001
sent a TCP FIN packet to haproxy first. Therefore haproxy shouldn't have
logged it with C---

-Patrick
global
log 127.0.0.1   local0
maxconn 20480
user haproxy
group haproxy
daemon
stats socket /var/run/haproxy.sock

defaults
log global
modehttp
option  httplog
option  dontlognull
retries 3
option  redispatch
timeout connect 5000
timeout client 6
timeout server 17
option  clitcpka
option  srvtcpka
option  abortonclose
option  splice-auto

stats   enable
stats   uri /haproxy/stats
stats   refresh 5
stats   auth user:pass

frontend storage_upd
bind 0.0.0.0:80
bind 0.0.0.0:81 accept-proxy
default_backend storage_upd
maxconn 2

capture request header X-Request-Id len 64

http-request add-header X-Request-Timestamp %Ts.%ms

log-format %ci:%cp\ %ft\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\ 
%ac/%fc/%bc/%sc/%rc\ %sq/%bq\ %U/%B\ %ST\ %tsc\ %hrl\ +\ %r

backend storage_upd
fullconn 2
server storage_upd_1 127.0.0.1:4000 check  
server storage_upd_2 127.0.0.1:4001 check  
server storage_upd_3 127.0.0.1:4002 check  
server storage_upd_4 127.0.0.1:4003 check  


haproxy.pcap
Description: Binary data


Re: haproxy incorrectly reporting connection flags

2014-04-22 Thread Patrick Hemmer

*From: *Patrick Hemmer hapr...@stormcloud9.net
*Sent: * 2014-04-16 17:38:54 E
*To: *haproxy@formilux.org haproxy@formilux.org
*Subject: *haproxy incorrectly reporting connection flags

 With 1.5-dev22, we have a scenario where haproxy is saying the client
 closed the connection, but really the server is the one that closed it.

 Here is the log entry from haproxy:
 haproxy[12540]: 10.230.0.195:33580 storage_upd
 storage_upd/storage_upd_2 0/0/0/522/555 0/0/0/0/0 0/0 412/271 200 CD--
 73E3-20FF5 + GET /1/public_link/1BMcSfqg3OM4Ng HTTP/1.1

 The log format is defined as:
 capture request header X-Request-Id len 64
 log-format %ci:%cp\ %ft\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\
 %ac/%fc/%bc/%sc/%rc\ %sq/%bq\ %U/%B\ %ST\ %tsc\ %hrl\ +\ %r

 Attached is the haproxy config, along with a packet capture between
 haproxy and the backend server.
 The packet capture shows that the backend server listening on port
 4001 sent a TCP FIN packet to haproxy first. Therefore haproxy
 shouldn't have logged it with C---

 -Patrick

Any feedback on this?
I can happily provide any additional information if needed.

-Patrick


Re: haproxy incorrectly reporting connection flags

2014-04-23 Thread Patrick Hemmer
 

*From: *Cyril Bonté cyril.bo...@free.fr
*Sent: * 2014-04-23 02:37:07 E
*To: *Patrick Hemmer hapr...@stormcloud9.net
*CC: *haproxy@formilux.org haproxy@formilux.org
*Subject: *Re: haproxy incorrectly reporting connection flags

 Hi Patrick,

 On 23/04/2014 03:25, Patrick Hemmer wrote:
 Any feedback on this?
 I can happily provide any additional information if needed.

 Didn't you see Lukas' mail ? That's exactly what he asked for ;-)


Sorry about that. I see it on the mailing list archive, but not in my
client :-(

In response to Lukas' email:
Yes, I can reliably reproduce the issue. Here's another one with pcaps
of the eth0 and lo interfaces.

2014-04-23T13:52:35.438+00:00 local0/info(6) haproxy[12631]:
10.230.1.210:45631 storage_upd storage_upd/storage_upd_2 0/0/0/932/1004
0/0/0/0/0 0/0 299/270 200 CD-- 738E-8D38 + GET
/1/public_link/1BMcSfqg3OM4Ng HTTP/1.1

-Patrick


haproxy-eth0.pcap
Description: application/vnd.tcpdump.pcap


haproxy-lo.pcap
Description: application/vnd.tcpdump.pcap


Re: haproxy incorrectly reporting connection flags

2014-04-23 Thread Patrick Hemmer
*From: *Lukas Tribus luky...@hotmail.com
*Sent: * 2014-04-23 12:16:01 E
*To: *Patrick Hemmer hapr...@stormcloud9.net
*CC: *haproxy@formilux.org haproxy@formilux.org
*Subject: *RE: haproxy incorrectly reporting connection flags


 Sorry about that. I see it on the mailing list archive, but not in my 
 client :-(
 Probably caught by a spam filter, I did respond directly to you and the
 mailing list.




 Yes, I can reliably reproduce the issue. Here's another one with pcaps 
 of the eth0 and lo interfaces.
 Can you also provide ./haproxy -vv output please.



 Thanks,

 Lukas

HA-Proxy version 1.5-dev22-1a34d57 2014/02/03
Copyright 2000-2014 Willy Tarreau w...@1wt.eu

Build options :
  TARGET  = linux2628
  CPU = generic
  CC  = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing
  OPTIONS = USE_LINUX_SPLICE=1 USE_LINUX_TPROXY=1 USE_ZLIB=1
USE_OPENSSL=1 USE_PCRE=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200

Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.3.4
Compression algorithms supported : identity, deflate, gzip
Built with OpenSSL version : OpenSSL 1.0.1 14 Mar 2012
Running on OpenSSL version : OpenSSL 1.0.1 14 Mar 2012
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.12 2011-01-15
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with transparent proxy support using: IP_TRANSPARENT
IPV6_TRANSPARENT IP_FREEBIND

Available polling systems :
  epoll : pref=300,  test result OK
   poll : pref=200,  test result OK
 select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.


-Patrick


Re: please check

2014-05-02 Thread Patrick Hemmer
*From: *Willy Tarreau w...@1wt.eu
*Sent: * 2014-05-02 02:02:11 E
*To: *Rachel Chavez rachel.chave...@gmail.com
*CC: *haproxy@formilux.org
*Subject: *Re: please check

 On Thu, May 01, 2014 at 03:44:46PM -0400, Rachel Chavez wrote:
 The problem is:

 when client sends a request with incomplete body (it has content-length but
 no body) then haproxy returns a 5XX error when it should be a client issue.
 It's a bit more complicated than that. When the request body flows from the
 client to the server, at any moment the server is free to respond (either
 with an error, a redirect, a timeout or whatever). So as soon as we start
 to forward a request body from the client to the server, we're *really*
 waiting for the server to send a verdict about that request.
"At any moment the server is free to respond", yes, but the server cannot
respond *properly* until it gets the complete request.
If the response depends on the request payload, the server doesn't know
whether to respond with a 200 or with a 400.

RFC2616 covers this behavior in depth. See section 8.2.3, "Use of the 100
(Continue) Status". This section indicates that the server should not be
expected to respond before receiving the request body unless the
client explicitly sends an "Expect: 100-continue" header.



 In the session.c file starting in 2404 i make sure that if I haven't
 received the entire body of the request I continue to wait for it by
 keeping AN_REQ_WAIT_HTTP as part of the request analyzers list as long as
 the client read timeout hasn't fired yet.
 It's unrelated unfortunately and it cannot work. AN_REQ_WAIT_HTTP is meant
 to wait for a *new* request. So if the client doesn't send a complete
 request, it's both wrong and dangerous to expect a new request inside the
 body. When the body is being forwarded, the request flows through
 http_request_forward_body(). This one already tests for the client timeout
 as you can see. I'm not seeing any error processing there though, maybe
 we'd need to set some error codes there to avoid them getting the default
 ones.

 In the proto_http.c file what I tried to do is avoid getting a server
 timeout when the client had timed-out already.
 I agree that it's always the *first* timeout which strikes which should
 indicate the faulty side, because eventhough they're generally set to the
 same value, people who want to enforce a specific processing can set them
 apart.

 Regards,
 Willy



Re: please check

2014-05-02 Thread Patrick Hemmer
*From: *Willy Tarreau w...@1wt.eu
*Sent: * 2014-05-02 11:15:07 E
*To: *Patrick Hemmer hapr...@stormcloud9.net
*CC: *Rachel Chavez rachel.chave...@gmail.com, haproxy@formilux.org
*Subject: *Re: please check

 Hi Patrick,

 On Fri, May 02, 2014 at 10:57:38AM -0400, Patrick Hemmer wrote:
 *From: *Willy Tarreau w...@1wt.eu
 *Sent: * 2014-05-02 02:02:11 E
 *To: *Rachel Chavez rachel.chave...@gmail.com
 *CC: *haproxy@formilux.org
 *Subject: *Re: please check

 On Thu, May 01, 2014 at 03:44:46PM -0400, Rachel Chavez wrote:
 The problem is:

 when client sends a request with incomplete body (it has content-length but
 no body) then haproxy returns a 5XX error when it should be a client issue.
 It's a bit more complicated than that. When the request body flows from the
 client to the server, at any moment the server is free to respond (either
 with an error, a redirect, a timeout or whatever). So as soon as we start
 to forward a request body from the client to the server, we're *really*
 waiting for the server to send a verdict about that request.
 At any moment the server is free to respond yes, but the server cannot
 respond *properly* until it gets the complete request.
 Yes it can, redirects are the most common anticipated response, as the
 result of a POST to a page with an expired cookie. And the 302 is a
 clean response, it's not even an error.
I should have clarified what I meant by "properly". I didn't mean
that the server can't respond at all, as there are many cases where it
can, some of which you point out. I meant that if the server is
expecting a request body, it can't respond with a 200 until it verifies
that request body.

 If the response depends on the request payload, the server doesn't know
 whether to respond with 200 or with a 400.
 With WAFs deployed massively on server infrastructures, 403 are quite
 common long before the whole data. 413 request entity too large appears
 quite commonly as well. 401 and 407 can also happen when authentication
 is needed.

 RFC2616 covers this behavior in depth. See 8.2.3 Use of the 100
 (Continue) Status. This section indicates that it should not be
 expected for the server to respond without a request body unless the
 client explicitly sends a Expect: 100-continue
 Well, 2616 is 15-years old now and pretty obsolete, which is why the
 HTTP-bis WG is working on refreshing this. New wording is clearer about
 how a request body is used :

o  A server MAY omit sending a 100 (Continue) response if it has
   already received some or all of the message body for the
   corresponding request, or if the framing indicates that there is
   no message body.

 Note the some or all.
I'm assuming you're quoting from:
http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-26#section-5.1.1

This only applies if "Expect: 100-continue" was sent. "Expect:
100-continue" was meant to solve the issue where the client has a large
body and wants to make sure that the server will accept the body before
sending it (and wasting bandwidth). Meaning that without sending
"Expect: 100-continue", it is expected that the server will not send a
response until the body has been sent.
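
To make those mechanics concrete, here is a self-contained sketch of the exchange (the toy server and all names are mine, standing in for an origin behind the proxy): the client sends its headers with "Expect: 100-continue" and holds the body back until the interim response arrives.

```python
import socket
import threading

def vetting_server(srv):
    """Toy origin implementing the handshake: read the headers, emit an
    interim 100 Continue, read the body, then send the final 200."""
    srv.settimeout(5)
    conn, _ = srv.accept()
    conn.settimeout(5)
    with conn:
        buf = b""
        while b"\r\n\r\n" not in buf:
            buf += conn.recv(4096)
        head, _, body = buf.partition(b"\r\n\r\n")
        length = 0
        for line in head.split(b"\r\n"):
            if line.lower().startswith(b"content-length:"):
                length = int(line.split(b":", 1)[1])
        conn.sendall(b"HTTP/1.1 100 Continue\r\n\r\n")
        while len(body) < length:           # drain the body before the 200
            body += conn.recv(4096)
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n")

def expect_100_roundtrip(body):
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    t = threading.Thread(target=vetting_server, args=(srv,))
    t.start()
    with socket.create_connection(srv.getsockname(), timeout=5) as c:
        c.sendall(b"PUT /upload HTTP/1.1\r\n"
                  b"Host: localhost\r\n"
                  b"Content-Length: %d\r\n"
                  b"Expect: 100-continue\r\n\r\n" % len(body))
        interim = c.recv(4096)              # wait for the interim response
        if interim.startswith(b"HTTP/1.1 100"):
            c.sendall(body)                 # only now send the (large) body
        final = c.recv(4096)
    t.join()
    srv.close()
    return interim, final
```

The point of the pattern is visible in the client: not a byte of the body leaves before the server has vetted the headers.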


 It's very tricky to find which side is responsible for a stalled upload.
 I've very commonly found that frozen servers, or those with deep request
 queues will stall during body transfers because they still didn't start
 to consume the part of the request that's queued into network buffers.

 All I mean is that it's unfortunately not *that* white and black. We
 *really* need to make a careful difference between what happens on the
 two sides. The (hard) goal I'm generally seeking is to do my best so
 that a misbehaving user doesn't make us believe that a server is going
 badly. That's not easy, considering for example the fact that the 501
 message could be understood as a server error while it's triggered by
 the client.

 In general (unless there's something wrong with the way client timeouts
 are reported in http_request_forward_body), client timeouts should be
 reported as such, and same for server timeouts. It's possible that there
 are corner cases, but we need to be extremely careful about them and not
 try to generalize.
I agree, a client timeout should be reported as such, and that's what
this is all about. If the client sends half the body (or no body), and
then freezes, the client timeout should kick in and send back a 408, not
the server timeout resulting in a 504.

I think in this regard it is very clear:
* The server may respond with the HTTP response status code any time it
  feels like it.
* Enable the server timeout and disable the client timeout upon any of
  the following:
  * The client sent "Expect: 100-continue" and has completed all headers.
  * The complete client request has been sent, including the body if
    Content-Length > 0.
  * Writing to the server socket would result in a blocking write
    (indicating that the remote end is not processing).
* Otherwise, enable the client timeout.
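
The policy above can be written out as a small decision function; this is purely illustrative (the flag names are mine, not haproxy internals):

```python
def active_timeout(headers_complete, body_complete,
                   expect_100_continue, server_write_blocked):
    """Which side's timeout should be armed while a request is in
    flight, per the rules listed above. Illustrative only."""
    if not headers_complete:
        return "client"      # still waiting for the request headers
    if expect_100_continue or body_complete or server_write_blocked:
        return "server"      # the ball is in the server's court
    return "client"          # still waiting for the rest of the body
```

Under this policy a stalled upload trips the client timeout (408) and a stuck backend trips the server timeout (504), which is the distinction being argued for.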

Re: please check

2014-05-02 Thread Patrick Hemmer
*From: *Willy Tarreau w...@1wt.eu
*Sent: * 2014-05-02 12:56:16 E
*To: *Patrick Hemmer hapr...@stormcloud9.net
*CC: *Rachel Chavez rachel.chave...@gmail.com, haproxy@formilux.org
*Subject: *Re: please check

 On Fri, May 02, 2014 at 12:18:43PM -0400, Patrick Hemmer wrote:
 At any moment the server is free to respond yes, but the server cannot
 respond *properly* until it gets the complete request.
 Yes it can, redirects are the most common anticipated response, as the
 result of a POST to a page with an expired cookie. And the 302 is a
 clean response, it's not even an error.
 I should have clarified what I meant by properly more. I didn't mean
 that the server can't respond at all, as there are many cases it can,
 some of which you point out. I meant that if the server is expecting a
 request body, it can't respond with a 200 until it verifies that request
 body.
 OK, but from a reverse-proxy point of view, all of them are equally valid,
 and there's even no way to know if the server is interested in receiving
 these data at all. The only differences are that some of them are considered
 precious (ie those returning 200) and other ones less since they're
 possibly ephemeral.

 If the response depends on the request payload, the server doesn't know
 whether to respond with 200 or with a 400.
 With WAFs deployed massively on server infrastructures, 403 are quite
 common long before the whole data. 413 request entity too large appears
 quite commonly as well. 401 and 407 can also happen when authentication
 is needed.

 RFC2616 covers this behavior in depth. See 8.2.3 Use of the 100
 (Continue) Status. This section indicates that it should not be
 expected for the server to respond without a request body unless the
 client explicitly sends a Expect: 100-continue
 Well, 2616 is 15-years old now and pretty obsolete, which is why the
 HTTP-bis WG is working on refreshing this. New wording is clearer about
 how a request body is used :

o  A server MAY omit sending a 100 (Continue) response if it has
   already received some or all of the message body for the
   corresponding request, or if the framing indicates that there is
   no message body.

 Note the some or all.
 I'm assuming you're quoting from:
 http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-26#section-5.1.1
 Yes indeed. Ah in fact I found the exact part I was looking for, it's in
 the same block, two points below :

o  A server that responds with a final status code before reading the
   entire message body SHOULD indicate in that response whether it
   intends to close the connection or continue reading and discarding
   the request message (see Section 6.6 of [Part1]).

 This only applies if the Expect: 100-continue was sent. Expect:
 100-continue was meant to solve the issue where the client has a large
 body, and wants to make sure that the server will accept the body before
 sending it (and wasting bandwidth). Meaning that without sending
 Expect: 100-continue, it is expected that the server will not send a
 response until the body has been sent.
 No, it is expected that it will need to consume all the data before the
 connection may be reused for sending another request. That is the point
 of 100. And the problem is that if the server closes the connection when
 responding early (typically a 302) and doesn't drain the client's data,
 there's a high risk that the TCP stack will send an RST that can arrive
 before the actual response, making the client unaware of the response.
 That's why the server must consume the data even if it responds before
 the end.
 A 100-continue expectation informs recipients that the client is
   about to send a (presumably large) message body in this request and
   wishes to receive a 100 (Continue) interim response if the request-
   line and header fields are not sufficient to cause an immediate
   success, redirect, or error response.  This allows the client to wait
   for an indication that it is worthwhile to send the message body
   before actually doing so, which can improve efficiency when the
   message body is huge or when the client anticipates that an error is
   likely


 (...)
 In general (unless there's something wrong with the way client timeouts
 are reported in http_request_forward_body), client timeouts should be
 reported as such, and same for server timeouts. It's possible that there
 are corner cases, but we need to be extremely careful about them and not
 try to generalize.
 I agree, a client timeout should be reported as such, and that's what
 this is all about. If the client sends half the body (or no body), and
 then freezes, the client timeout should kick in and send back a 408, not
 the server timeout resulting in a 504.
 Yes, I agree with this description.

 I think in this regards it is very clear.
 * The server may respond with the HTTP response status code any time it
 feels like it.
 OK

 * Enable the server timeout and disable the client

Re: please check

2014-05-02 Thread Patrick Hemmer
*From: *Willy Tarreau w...@1wt.eu
*Sent: * 2014-05-02 14:00:24 E
*To: *Patrick Hemmer hapr...@stormcloud9.net
*CC: *Rachel Chavez rachel.chave...@gmail.com, haproxy@formilux.org
*Subject: *Re: please check

 On Fri, May 02, 2014 at 01:32:30PM -0400, Patrick Hemmer wrote:
 I've set up a test scenario, and the only time haproxy responds with 408
 is if the client times out in the middle of request headers. If the
 client has sent all headers, but no body, or partial body, it times out
 after the configured 'timeout server' value, and responds with 504.
 OK that's really useful. I'll try to reproduce that case. Could you please
 test again with a shorter client timeout than server timeout, just to ensure
 that it's not just a sequencing issue ?
I have. In my test setup, timeout client 1000 and timeout server 2000.

With incomplete headers I get:
haproxy[8893]: 127.0.0.1:41438 [02/May/2014:14:11:26.373] f1 f1/NOSRV
-1/-1/-1/-1/1001 408 212 - - cR-- 0/0/0/0/0 0/0 BADREQ

With no body I get:
haproxy[8893]: 127.0.0.1:41439 [02/May/2014:14:11:29.576] f1 b1/s1
0/0/0/-1/2002 504 194 - - sH-- 1/1/1/1/0 0/0 GET / HTTP/1.1

With incomplete body I get:
haproxy[8893]: 127.0.0.1:41441 [02/May/2014:14:11:29.779] f1 b1/s1
0/0/0/-1/2002 504 194 - - sH-- 0/0/0/0/0 0/0 GET / HTTP/1.1
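
For reference, the client side of these three cases can be generated with a short script. The probe helper, address and port are assumptions (point it at whatever frontend is under test); only the request bytes matter:

```python
import socket

def build_request(body=b"", content_length=None):
    """Raw HTTP/1.1 GET with an explicit Content-Length; passing a
    content_length larger than len(body) yields the incomplete-body case."""
    if content_length is None:
        content_length = len(body)
    head = ("GET / HTTP/1.1\r\n"
            "Host: localhost\r\n"
            "Content-Length: %d\r\n\r\n" % content_length)
    return head.encode("ascii") + body

def probe(host, port, data, timeout=10):
    """Send a (possibly truncated) request, then just wait; the bytes
    that come back show which timeout fired (408 vs 504)."""
    with socket.create_connection((host, port), timeout=timeout) as s:
        s.sendall(data)
        return s.recv(65536)

# The three cases from the log excerpts above:
incomplete_headers = b"GET / HTTP/1.1\r\nHost: localhost\r\n"   # -> 408, cR--
no_body      = build_request(content_length=10)                 # -> 504, sH--
partial_body = build_request(body=b"12345", content_length=10)  # -> 504, sH--
# e.g. probe("127.0.0.1", 80, partial_body)
```
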




 Applying the patch solves this behavior. But my test scenario is very
 simple, and I'm not sure if it has any other consequences.
 It definitely has, which is why I'm trying to find the *exact* problem in
 order to fix it.

 Thanks,
 Willy



-Patrick


Re: please check

2014-05-02 Thread Patrick Hemmer
 



*From: *Willy Tarreau w...@1wt.eu
*Sent: * 2014-05-02 15:06:13 E
*To: *Patrick Hemmer hapr...@stormcloud9.net
*CC: *Rachel Chavez rachel.chave...@gmail.com, haproxy@formilux.org
*Subject: *Re: please check

 On Fri, May 02, 2014 at 02:14:41PM -0400, Patrick Hemmer wrote:
 *From: *Willy Tarreau w...@1wt.eu
 *Sent: * 2014-05-02 14:00:24 E
 *To: *Patrick Hemmer hapr...@stormcloud9.net
 *CC: *Rachel Chavez rachel.chave...@gmail.com, haproxy@formilux.org
 *Subject: *Re: please check

 On Fri, May 02, 2014 at 01:32:30PM -0400, Patrick Hemmer wrote:
 I've set up a test scenario, and the only time haproxy responds with 408
 is if the client times out in the middle of request headers. If the
 client has sent all headers, but no body, or partial body, it times out
 after the configured 'timeout server' value, and responds with 504.
 OK that's really useful. I'll try to reproduce that case. Could you please
 test again with a shorter client timeout than server timeout, just to ensure
 that it's not just a sequencing issue ?
 I have. In my test setup, timeout client 1000 and timeout server 2000.

 With incomplete headers I get:
 haproxy[8893]: 127.0.0.1:41438 [02/May/2014:14:11:26.373] f1 f1/NOSRV
 -1/-1/-1/-1/1001 408 212 - - cR-- 0/0/0/0/0 0/0 BADREQ

 With no body I get:
 haproxy[8893]: 127.0.0.1:41439 [02/May/2014:14:11:29.576] f1 b1/s1
 0/0/0/-1/2002 504 194 - - sH-- 1/1/1/1/0 0/0 GET / HTTP/1.1

 With incomplete body I get:
 haproxy[8893]: 127.0.0.1:41441 [02/May/2014:14:11:29.779] f1 b1/s1
 0/0/0/-1/2002 504 194 - - sH-- 0/0/0/0/0 0/0 GET / HTTP/1.1
 Great, thank you. I think that it tends to fuel the theory that the
 response error is not set where it should be in the forwarding path.

 I'll check this ASAP. BTW, it would be nice if you could check this
 as well with 1.4.25, I guess it does the same.

 Best regards,
 Willy

Confirmed. Exact same behavior with 1.4.25

-Patrick



Re: please check

2014-05-06 Thread Patrick Hemmer
*From: *Willy Tarreau w...@1wt.eu
*Sent: * 2014-05-06 17:41:18 E
*To: *Patrick Hemmer hapr...@stormcloud9.net, Rachel Chavez
rachel.chave...@gmail.com
*CC: *haproxy@formilux.org
*Subject: *Re: please check

 Hi Patrick, hi Rachel,

 I might have fixed half of the issue, I'd like you to test the attached patch.
 It ensures that the client-side timeout is only disabled after transmitting
 the whole body and not during the transmission. It will report cD in the
 flags, but does not affect the status code yet. It does not abort when the
 client timeout strikes, but still when the server timeout strikes, which is
 another tricky thing to do properly. That's why I would be happy if you could
 at least confirm that you correctly get cD or sH (or even sD) depending on
 who times out first.

 Thanks,
 Willy

So good news, bad news, and strange news.

The good news: It is reporting cD-- as it should

The bad news: It's not reporting any return status at all. Before it
would log 504 and send a 504 response back. Now it logs -1 and doesn't
send anything back. It's just closing the connection.

The strange news: contrary to your statement, the client connection is
closed after the 1 second timeout. It even logs this. The only thing
that doesn't happen properly is that no response is sent at all; the
connection is just closed immediately.


Before patch:
haproxy[26318]: 127.0.0.1:51995 [06/May/2014:18:55:33.002] f1 b1/s1
0/0/0/-1/2001 504 194 - - sH-- 0/0/0/0/0 0/0 GET / HTTP/1.1

After patch:
haproxy[27216]: 127.0.0.1:52027 [06/May/2014:18:56:34.165] f1 b1/s1
0/0/0/-1/1002 -1 0 - - cD-- 0/0/0/0/0 0/0 GET / HTTP/1.1

-Patrick


Re: please check

2014-05-07 Thread Patrick Hemmer

*From: *Willy Tarreau w...@1wt.eu
*Sent: * 2014-05-07 09:45:47 E
*To: *Patrick Hemmer hapr...@stormcloud9.net, Rachel Chavez
rachel.chave...@gmail.com
*CC: *haproxy@formilux.org
*Subject: *Re: please check

 Hi Patrick, hi Rachel,

 so with these two patches applied on top of the previous one, I get the
 behaviour that we discussed here.

 Specifically, we differentiate client-read timeout, server-write timeouts
 and server read timeouts during the data forwarding phase. Also, we disable
 server read timeout until the client has sent its whole request. That way
 I'm seeing the following flags in the logs :

   - cH when client does not send everything before the server starts to
 respond, which is OK. Status=408 there.

   - cD when client stops sending data after the server starts to respond,
 or if the client stops reading data, which in both case is a clear
 client timeout. In both cases, the status is unaltered and nothing
 is emitted since the beginning of the response was already transmitted ;

   - sH when the server does not respond, including if it stops reading the
 message body (eg: process stuck). Then we have 504.

   - sD if the server stops reading or sending data during the data phase.

 The changes were a bit tricky, so any confirmation from any of you would
 make me more comfortable merging them into mainline. I'm attaching these
 two extra patches, please give them a try.

 Thanks,
 Willy

Works beautifully. I had created a little test suite to test a bunch of
conditions around this, and they all pass.
Will see about throwing this in our development environment in the next
few days if a release doesn't come out before then.

Thank you much :-)

-Patrick


Re: unique-id-header with capture request header

2014-05-13 Thread Patrick Hemmer
*From: *Bryan Talbot bryan.tal...@playnext.com
*Sent: * 2014-05-13 11:52:32 E
*To: *HAProxy haproxy@formilux.org
*Subject: *unique-id-header with capture request header

 We have more than 1 proxy tier. The edge proxy generates a unique ID
 and the other tiers (and apps in between) log the value and pass it
 around as a per-request id.

 Middle tier haproxy instances capture and log the unique id using
 capture request header which works fine; however, for the edge proxy
 this doesn't work since the ID doesn't seem to be available as a
 request header yet and a custom log-format must be used instead.

 This means that the logs generated by edge and non-edge proxies have a
 different format and must be parsed specially. Also, a custom log-format
 must be maintained just for this purpose.

 What's the best way to capture the unique-id generated in a proxy or
 from a request and log it in a consistent way?

 -Bryan


We do this exact same thing. Unique ID generated on the outside, and
passed around on the inside.
However we use a custom log format on both the internal and external
haproxy so that the format is exactly the same.

External:
unique-id-format %{+X}o%pid-%rt
capture request header X-Request-Id len 12
log-format %ci:%cp\ %ft\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\
%ac/%fc/%bc/%sc/%rc\ %sq/%bq\ %U/%B\ %ST\ %tsc\
%ID,%[capture.req.hdr(0)]\ +\ %r

Internal:
capture request header X-Request-Id len 12
log-format %ci:%cp\ %ft\ %b/%s\ %Tq/%Tw/%Tc/%Tr/%Tt\
%ac/%fc/%bc/%sc/%rc\ %sq/%bq\ %U/%B\ %ST\ %tsc\ %[capture.req.hdr(0)]\ +\ %r

We capture on the external proxy as well because occasionally an
internal service will re-route back through the external one, so the
request gets a new ID and we can correlate the two. But the fields are
space delimited, so even when showing 2 IDs it's still a single field,
with the IDs separated by a comma (assuming some client doesn't pass an
invalid value in, but that's also why this is at the end of the log line).

The other advantage of doing it this way is that the default HTTP log
format is missing %U, which we find useful to have.

-Patrick


Disable TLS renegotiation

2014-05-16 Thread Patrick Hemmer
While going through the Qualys SSL test
(https://www.ssllabs.com/ssltest), one of the items it mentions is a DoS
vulnerability in regards to client-side initiated SSL renegotiation
(https://community.qualys.com/blogs/securitylabs/2011/10/31/tls-renegotiation-and-denial-of-service-attacks).
While researching the subject, it seems that the only reliable way to
mitigate the issue is in the server software. Apache has implemented
code to disable renegotiation. Would it be possible to add an option in
haproxy to disable it?

-Patrick


Re: Disable TLS renegotiation

2014-05-16 Thread Patrick Hemmer

*From: *Lukas Tribus luky...@hotmail.com
*Sent: * 2014-05-16 13:23:43 E
*To: *Patrick Hemmer hapr...@stormcloud9.net, haproxy@formilux.org
haproxy@formilux.org
*Subject: *RE: Disable TLS renegotiation

 Hi Patrick,


 While going through the Qualys SSL test  
 (https://www.ssllabs.com/ssltest), one of the items it mentions is a  
 DoS vulnerability in regards to client-side initiated SSL renegotiation  
 (https://community.qualys.com/blogs/securitylabs/2011/10/31/tls-renegotiation-and-denial-of-service-attacks).
   
 While researching the subject, it seems that the only reliable way to  
 mitigate the issue is in the server software. Apache has implemented  
 code to disable renegotiation. Would it be possible to add an option in  
 haproxy to disable it?
 Looks like its already disabled by default?

 https://www.ssllabs.com/ssltest/analyze.html?d=demo.1wt.eu

 --- Secure Client-Initiated Renegotiation:
   No
 --- Insecure Client-Initiated Renegotiation:
   No



 Regards,

 Lukas

 
Doh!

You're right, I screwed up the test when I ran it. Yes, it is disabled.
Sorry for the noise.

-Patrick


Re: Error 408 with Chrome

2014-05-26 Thread Patrick Hemmer

*From: *Willy Tarreau w...@1wt.eu
*Sent: * 2014-05-26 12:07:09 EDT
*To: *Arnall arnall2...@gmail.com
*CC: *haproxy@formilux.org
*Subject: *Re: Error 408 with Chrome

 On Mon, May 26, 2014 at 05:52:15PM +0200, Arnall wrote:
 On 26/05/2014 16:13, Willy Tarreau wrote:
 Hi Arnall,

 On Mon, May 26, 2014 at 11:56:52AM +0200, Arnall wrote:
 Hi Willy,

 same problem here with Chrome version 35.0.1916.114 m and :
 HA-Proxy version 1.4.22 2012/08/09 (Debian 6) Kernel 3.8.13-OVH
 HA-Proxy version 1.5-dev24-8860dcd 2014/04/26 (Debian GNU/Linux 7.5)
 Kernel 3.10.13-OVH

<html><body><h1>408 Request Time-out</h1>
Your browser didn't send a complete request in time.
</body></html>

 Timing : Blocking 2ms /  Receiving : 1ms
 Where are you measuring this ? I suspect on the browser, right ? In
 this case it confirms the malfunction of the preconnect. You should
 take a network capture which will be usable as a reliable basis for
 debugging. I'm pretty sure that what you'll see in fact is the following
 sequence :

browser                 haproxy
   ----- connect ----->
     ... long pause ...
   <--- 408 + FIN -----
     ... long pause ...
   --- send request -->
   <------ RST --------

 And you see the error in the browser immediately. The issue is then
 caused by the browser not respecting this specific rule :

  
 Yes it was measured on the browser (Chrome network monitor)
 I 've made a network capture for you.(in attachment)
 Thank you. If you looked at the connection from port 62691, it's exactly
 the sequence I described above. So that clearly explains why Chrome is
 the only one affected!

 Best regards,
 Willy


Has anyone opened a bug against Chrome for this behavior (did a brief
search and didn't see one)? I'd be interested in following it as this
behavior will likely have an impact on an upcoming project I've got.

-Patrick


Re: HAProxy 1.5 release?

2014-06-18 Thread Patrick Hemmer
Haproxy 1.6 is very close to release.
See http://marc.info/?l=haproxy&m=140129354705695 and
http://marc.info/?l=haproxy&m=140085816115800

-Patrick


*From: *Stephen Balukoff sbaluk...@bluebox.net
*Sent: * 2014-06-18 08:40:55 EDT
*To: *haproxy@formilux.org
*Subject: *HAProxy 1.5 release?

 Hey Willy!

 I'm involved in a group that is building a highly-scalable open source
 virtual appliance-based load balancer for use with cloud operating
 systems like OpenStack. We are planning on making haproxy the core
 component of the solution we're building.

 At my company we've actually been using haproxy 1.5 for a couple years
 now in production to great effect, and absolutely love it. But I'm
 having trouble getting the rest of the members of my team to go along
 with the idea of using 1.5 in our solution simply because of its
 official status as a development branch. There are just so many
 useful new features in 1.5 that I'd really rather not have to go back
 to 1.4 in our solution...

 So! My question is: What can we do to help y'all bring the 1.5 branch
 far enough along such that y'all are comfortable releasing it as the
 official stable branch of haproxy? (Note we do have people in our
 group with connections in some of the major linux distros who can help
 to fast-track its adoption into official releases of said distros.)

 Thanks,
 Stephen

 -- 
 Stephen Balukoff
 Blue Box Group, LLC
 (800)613-4305 x807



Re: HAProxy 1.5 release?

2014-06-18 Thread Patrick Hemmer
Err, pardon the typo, 1.5 :-)

-Patrick


*From: *Patrick Hemmer hapr...@stormcloud9.net
*Sent: * 2014-06-18 08:49:27 EDT
*To: *Stephen Balukoff sbaluk...@bluebox.net, haproxy@formilux.org
*Subject: *Re: HAProxy 1.5 release?

 Haproxy 1.6 is very close to release.
 See http://marc.info/?l=haproxy&m=140129354705695 and
 http://marc.info/?l=haproxy&m=140085816115800

 -Patrick

 
 *From: *Stephen Balukoff sbaluk...@bluebox.net
 *Sent: * 2014-06-18 08:40:55 EDT
 *To: *haproxy@formilux.org
 *Subject: *HAProxy 1.5 release?

 Hey Willy!

 I'm involved in a group that is building a highly-scalable open
 source virtual appliance-based load balancer for use with cloud
 operating systems like OpenStack. We are planning on making haproxy
 the core component of the solution we're building.

 At my company we've actually been using haproxy 1.5 for a couple
 years now in production to great effect, and absolutely love it. But
 I'm having trouble getting the rest of the members of my team to go
 along with the idea of using 1.5 in our solution simply because of
 its official status as a development branch. There are just so many
 useful new features in 1.5 that I'd really rather not have to go back
 to 1.4 in our solution...

 So! My question is: What can we do to help y'all bring the 1.5 branch
 far enough along such that y'all are comfortable releasing it as the
 official stable branch of haproxy? (Note we do have people in our
 group with connections in some of the major linux distros who can
 help to fast-track its adoption into official releases of said
 distros.)

 Thanks,
 Stephen

 -- 
 Stephen Balukoff
 Blue Box Group, LLC
 (800)613-4305 x807




Re: 3rd regression : enough is enough!

2014-06-23 Thread Patrick Hemmer

*From: *Willy Tarreau w...@1wt.eu
*Sent: * 2014-06-23 10:23:44 EDT
*To: *haproxy@formilux.org
*CC: *Patrick Hemmer hapr...@stormcloud9.net, Rachel Chavez
rachel.chave...@gmail.com
*Subject: *3rd regression : enough is enough!

 Hi guys,

 today we got our 3rd regression caused by the client-side timeout changes
 introduced in 1.5-dev25. And this one is a major one, causing FD leaks
 and CPU spins when servers do not advertise a content-length and the
 client does not respond to the FIN.  And the worst of it, is I have no
 idea how to fix this at all.

 I had that bitter feeling when doing these changes a month ago that
 they were so much tricky that something was obviously going to break.
 It has broken twice already and we could fix the issues. The second
 time was quite harder, and we now see the effect of the regressions
 and their workarounds spreading like an oil stain on paper, with
 workarounds becoming more and more complex and less under control.

 So in the end I have reverted all the patches responsible for these
 regressions. The purpose of these patches was to report cD instead
 of sD in the logs in the case where a client disappears during a
 POST and haproxy has a shorter timeout than the server's.

 I'll issue 1.5.1 shortly with the fix before everyone gets hit by busy
 loops and lacks of file descriptors. If we find another way to do it
 later, we'll try it in 1.6 and may consider backporting to 1.5 if the
 new solution is absolutely safe. But we're very far away from that
 situation now.

 I'm sorry for this mess just before the release, next time I'll be
 stricter about such dangerous changes that I don't feel at ease with.

 Willy



This is unfortunate. I'm guessing a lot of the issue was in ensuring the
client timeout was observed. Would it at least be possible to change the
response, so that even if the server timeout is what kills the request,
that the client gets sent back a 408 instead of a 503?

-Patrick


Re: 3rd regression : enough is enough!

2014-06-24 Thread Patrick Hemmer
*From: *Willy Tarreau w...@1wt.eu
*Sent: * 2014-06-24 01:33:41 EDT
*To: *Patrick Hemmer hapr...@stormcloud9.net
*CC: *haproxy@formilux.org, Rachel Chavez rachel.chave...@gmail.com
*Subject: *Re: 3rd regression : enough is enough!

 Hi Patrick,

 On Mon, Jun 23, 2014 at 09:30:11PM -0400, Patrick Hemmer wrote:
 This is unfortunate. I'm guessing a lot of the issue was in ensuring the
 client timeout was observed. Would it at least be possible to change the
 response, so that even if the server timeout is what kills the request,
 that the client gets sent back a 408 instead of a 503?
 For now I have no idea. All the mess came from the awful changes that
 were needed to ignore the server-side timeout and pretend it came from
 the client despite the server triggering first. This required to mess
 up with these events in a very dangerous way :-(

 So right now I'd suggest to try with a shorter client timeout than the
 server timeout. 
That's what we're doing. The 'timeout client' is set to 6, 'timeout
server' is set to 17

 I can try to see how to better *report* this specific
 event if needed, but I don't want to put the brown paper bag on
 timeouts anymore.

 Regards,
 Willy




Re: 3rd regression : enough is enough!

2014-06-24 Thread Patrick Hemmer

*From: *Lukas Tribus luky...@hotmail.com
*Sent: * 2014-06-24 06:44:44 EDT
*To: *Willy Tarreau w...@1wt.eu, Patrick Hemmer hapr...@stormcloud9.net
*CC: *haproxy@formilux.org haproxy@formilux.org, Rachel Chavez
rachel.chave...@gmail.com
*Subject: *RE: 3rd regression : enough is enough!


 
 Date: Tue, 24 Jun 2014 07:33:41 +0200
 From: w...@1wt.eu
 To: hapr...@stormcloud9.net
 CC: haproxy@formilux.org; rachel.chave...@gmail.com
 Subject: Re: 3rd regression : enough is enough!

 Hi Patrick,

 On Mon, Jun 23, 2014 at 09:30:11PM -0400, Patrick Hemmer wrote:
 This is unfortunate. I'm guessing a lot of the issue was in ensuring the
 client timeout was observed. Would it at least be possible to change the
 response, so that even if the server timeout is what kills the request,
 that the client gets sent back a 408 instead of a 503?
 For now I have no idea. All the mess came from the awful changes that
 were needed to ignore the server-side timeout and pretend it came from
 the client despite the server triggering first. This required to mess
 up with these events in a very dangerous way :-(

 So right now I'd suggest to try with a shorter client timeout than the
 server timeout. I can try to see how to better *report* this specific
 event if needed, but I don't want to put the brown paper bag on
 timeouts anymore.
 I fully agree.


 But perhaps we can document the current behavior in those particular
 conditions better, so that this is better known until we find a good
 code-based solution.


 What is the issue here exactly? When the client uploads large POST
 requests and the server timeout is larger than the client timeout,
 we will see a sD flag in the log? Is that all, or are there other
 conditions in which a client timeout triggers a sD log?
The issue is that the 'timeout client' parameter isn't being observed
once the client goes into the data phase. If the server is waiting for
data (http body), it won't send anything back until the client sends in
a body. Since 'timeout client' isn't being observed, the 'timeout
server' kicks in and haproxy responds with a 503 because the server took
too long to respond when it was really the client's issue because the
client didn't send a body. This is supposed to be a 408.

 Can it be workarounded completely by configuring the server timeout
 larger then the client timeout?





 Regards,

 Lukas

 



Re: Getting size of response

2014-08-26 Thread Patrick Hemmer
*From:* Nick Jennings n...@silverbucket.net
*Sent:* 2014-08-26 19:55:34 EDT
*To:* haproxy haproxy@formilux.org
*Subject:* Getting size of response

 Hi all, is there a way to get the size of a response as it's being
 sent out through haproxy during logging? The node.js app (restify) is
 sending gzip'd responses but not including a Content-Length due to
 some bug. I was wondering if I could get the size with haproxy and
 side-step the whole issue.

 Thanks,
 Nick


If you're using `option httplog`, the 8th field is bytes sent to client.
If you're using `option tcplog`, it's the 7th field.
See http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#8.2.3

-Patrick




Re: Spam to this list?

2014-09-05 Thread Patrick Hemmer
*From: *Willy Tarreau w...@1wt.eu
*Sent: * 2014-09-05 11:19:22 EDT
*To: *Ghislain gad...@aqueos.com
*CC: *Mark Janssen maniac...@gmail.com, david rene comba lareu
shadow.of.sou...@gmail.com, Colin Ingarfield co...@ingarfield.com,
haproxy@formilux.org haproxy@formilux.org
*Subject: *Re: Spam to this list?

 On Fri, Sep 05, 2014 at 04:32:55PM +0200, Ghislain wrote:
 hi,

  this is not spam but the bad behavior of a person who is 
 subscribing this mailing list's address to newsletters just to annoy 
 people.
  This guy must be laughing like mad at what a loser he is, but 
 no spam filter will prevent this, there is no filter against human 
 stupidity that is legal in our country.
 That's precisely the point unfortunately :-/

 And the other annoying part are those recurring claims from people who
 know better than anyone else and who pretend that they can magically run
 a mail server with no spam. That's simply nonsense and utterly false. Or
 they have such aggressive filters that they can't even receive the complaints
 from their users when mails are eaten. Everyone can do that, it's enough
 to alias haproxy@formilux.org to /dev/null to get the same effect!

 But the goal of the ML is not to block the maximum amount of spam but to
 ensure optimal delivery to its subscribers. As soon as you add some level
 of filtering, you automatically get an increasing amount of false positives.

 We even had to drop one filter a few months ago because some gmail users
 could not post anymore.

 I'm open to suggestions, provided that :

1) it doesn't add *any* burden on our side (scalability means that the
   processing should be distributed, not centralized)

2) it doesn't block any single valid e-mail, even from non-subscriber
Has enabling a spam filter in dry-run mode ever been tried? Run it
for a year, and just have it add a header indicating whether it would
have blocked the message. Then see if any legitimate messages would have
been blocked.

I also want to point out that the mailing list itself sometimes lands on
various blacklists because of the amount of spam coming from it. So now
users using mail providers that subscribe to these blacklists are not
just losing a few messages, they're losing every message.


3) it doesn't require anyone to resubscribe nor change their ingress
   filters to get the mails into the same box.

4) it doesn't add extra delays to posts (eg: no grey-listing) because
   that's really painful for people who post patches and are impatient
   to see them reach the ML.
In the past you stated that you have grey-listing enabled (
http://marc.info/?l=haproxy&m=139748200027362&w=2 ), and here you're
stating that you don't want it, so now I'm confused about which is really the case.
If indeed grey-listing is not enabled, why not enable it for
non-subscribers? I'd bet that all the people sending patches are subscribed.

 I'm always amazed how people are annoyed by spam in 2014. Spam is part
 of the internet experience and is so ubiquitous that I think these people
 have been living under a rock. Probably those people consider that we
 should also run blood tests on people who want to jump into a bus to
 ensure that they don't come in with any minor disease in hope that all
 diseases will finally disappear. I'm instead in the camp of those who
 consider that training the population is the best resistance, and I think
 that the history of all living beings has already proved me right.
I would argue the opposite, this is 2014, we should have capable spam
handling technologies. And indeed we do!
The thing is that spam handling has to be handled on the original
recipient of the email (haproxy@formilux.org). Once the message has been
sent through a relay (the mailing list), many spam filtering
capabilities no longer work (DNSBL, greylisting, SPF, etc). Thus it is
the responsibility of the relay to do the filtering.


 I probably received 5 more spams while writing this, and who cares!
Obviously quite a few people care.
This is your list, and I respect that, but your opinion seems to be the
minority.

You've stated in the past that you don't like it when actions result in
people unsubscribing from the list. How many people unsubscribe because
they are tired of the spam?

I know I barely pay as much attention to the mailing list as I used to
because of the amount of spam. Oh look, a message. SPAM. Oh look, a
message. SPAM again...

-Patrick


Re: Spam to this list?

2014-09-05 Thread Patrick Hemmer
*From: *Cyril Bonté cyril.bo...@free.fr
*Sent: * 2014-09-05 15:50:21 EDT
*To: *Patrick Hemmer hapr...@stormcloud9.net, Willy Tarreau
w...@1wt.eu, Ghislain gad...@aqueos.com
*CC: *Mark Janssen maniac...@gmail.com, david rene comba lareu
shadow.of.sou...@gmail.com, Colin Ingarfield co...@ingarfield.com,
haproxy@formilux.org haproxy@formilux.org
*Subject: *Re: Spam to this list?

 Hi,

 Le 05/09/2014 20:39, Patrick Hemmer a écrit :
 Obviously quite a few people care.
 This is your list, and I respect that, but your opinion seems to be the
 minority.

 Without facts, this is as true as if I argued that the majority doesn't
 care that much but is more annoyed by the amount of mails containing
 the subject "Spam to this list?".



Ah, but your facts are in the discussion thread. I've seen very few
people supporting the current state of things. Yes it is possible there
are other people who haven't replied, but I think we can make a couple
deductions:
* Those who have strong feelings on the matter have already reported in
* Those who haven't either:
   * don't have a strong opinion
   * feel their stance is sufficiently represented.
   * haven't checked their mail
In all cases, I think it would be reasonable to assume that the sample
set already provided would reflect the general trend of further
responses. Meaning that the majority opinion would remain the majority
opinion.

-Patrick



mailing list archives dead

2016-04-04 Thread Patrick Hemmer
It looks like the mailing list archives stopped working mid-December.

https://marc.info/?l=haproxy

-Patrick


Re: unique-id-header and req.hdr

2017-01-27 Thread Patrick Hemmer


On 2017/1/27 14:38, Cyril Bonté wrote:
> Le 27/01/2017 à 20:11, Ciprian Dorin Craciun a écrit :
>> On Fri, Jan 27, 2017 at 9:01 PM, Cyril Bonté 
>> wrote:
>>> Instead of using "unique-id-header" and temporary headers, you can
>>> use the
>>> "unique-id" fetch sample [1] :
>>>
>>> frontend public
>>> bind *:80
>>> unique-id-format %{+X}o\ %ci:%cp_%fi:%fp_%Ts_%rt:%pid
>>> default_backend ui
>>>
>>> backend ui
>>> http-request set-header X-Request-Id %[unique-id] unless {
>>> req.hdr(X-Request-Id) -m found }
>>
>>
>> Indeed this might be one version of ensuring that a `X-Request-Id`
>> exists, however it doesn't serve a second purpose
>
> And that's why I didn't reply to your anwser but to the original
> question ;-)
>

Something that might satisfy both requests: why not just append to the
existing request-id?

unique-id-format %[req.hdr(X-Request-ID)],%{+X}o\
%ci:%cp_%fi:%fp_%Ts_%rt:%pid

This does result in a leading comma if X-Request-ID is unset. If that's
unpleasant, you could do something like write tiny LUA sample converter
to append a comma if the value is not empty.

-Patrick


Re: unique-id-header and req.hdr

2017-01-27 Thread Patrick Hemmer


On 2017/1/27 15:31, Ciprian Dorin Craciun wrote:
> On Fri, Jan 27, 2017 at 10:24 PM, Patrick Hemmer
> <hapr...@stormcloud9.net> wrote:
>> Something that might satisfy both requests, why not just append to the
>> existing request-id?
>>
>> unique-id-format %[req.hdr(X-Request-ID)],%{+X}o\
>> %ci:%cp_%fi:%fp_%Ts_%rt:%pid
>>
>> This does result in a leading comma if X-Request-ID is unset. If that's
>> unpleasant, you could do something like write tiny LUA sample converter to
>> append a comma if the value is not empty.
>
> However, just setting the `unique-id-format` is not enough, as we
> should also send that ID to the backend, thus there is a need of
> `http-request set-header X-Request-Id %[unique-id] if !...`.  (By not
> using the `http-request`, we do get the ID from the header in the log,
> but not to the backend.)

That's what the `unique-id-header` config parameter is for.
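
For reference, a minimal sketch of how the two directives combine (the frontend name and the exact format string are illustrative, taken from the examples above rather than a tested config):

```haproxy
frontend public
    bind *:80
    # Reuse an incoming X-Request-ID if present, then append our own ID.
    # Note the leading-comma caveat mentioned above when the header is unset.
    unique-id-format %[req.hdr(X-Request-ID)],%{+X}o\ %ci:%cp_%fi:%fp_%Ts_%rt:%pid
    # unique-id-header adds the computed ID to the request sent to the
    # backend, so no extra http-request set-header rule is needed.
    unique-id-header X-Request-ID
```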

>
> But now -- I can't say with certainty, but I remember trying various
> variants -- I think the evaluation order of `unique-id-format` is
> after all the `http-request` rules, thus the header will always be
> empty (if not explicitly set in the request), although in the log we
> would have a correct ID.
>
>
> (This is why I settled with a less optimal solution of having two
> headers, but with identical values, and working correctly in all
> instances.)
>
> Ciprian.
>



haproxy consuming 100% cpu - epoll loop

2017-01-16 Thread Patrick Hemmer
So on one of my local development machines haproxy started pegging the
CPU at 100%
`strace -T` on the process just shows:

...
epoll_wait(0, {}, 200, 0)   = 0 <0.03>
epoll_wait(0, {}, 200, 0)   = 0 <0.03>
epoll_wait(0, {}, 200, 0)   = 0 <0.03>
epoll_wait(0, {}, 200, 0)   = 0 <0.03>
epoll_wait(0, {}, 200, 0)   = 0 <0.03>
epoll_wait(0, {}, 200, 0)   = 0 <0.03>
...

Opening it up with gdb, the backtrace shows:

(gdb) bt
#0  0x7f4d18ba82a3 in __epoll_wait_nocancel () from /lib64/libc.so.6
#1  0x7f4d1a570ebc in _do_poll (p=, exp=-1440976915)
at src/ev_epoll.c:125
#2  0x7f4d1a4d3098 in run_poll_loop () at src/haproxy.c:1737
#3  0x7f4d1a4cf2c0 in main (argc=, argv=) at src/haproxy.c:2097

This is haproxy 1.7.0 on CentOS/7


Unfortunately I'm not sure what triggered it.

-Patrick


missing documentation on 51degrees samples

2016-10-07 Thread Patrick Hemmer
The documentation doesn't mention the sample fetcher `51d.all`, nor the
converter `51d.single`. The only place they're mentioned is the repo README.

Also the documentation for `51degrees-property-name-list` indicates it
takes an optional single string argument (`[]`), rather than
multiple string arguments (`...`). This led me to expect it was
comma delimited, which ended up not working.

-Patrick


format string fetch method?

2016-10-06 Thread Patrick Hemmer
While working with the `http-request set-var` (and a few other places,
but primarily here), it would be very useful to be able to use haproxy
format strings to define the variable.

For example
  http-request set-var(txn.foo) fmt(%ci:%cp:%Ts)
Or even
  http-request set-var(txn.foo) fmt(%ci:%cp:%Ts),crc32()

I don't currently see a way to do this, but I could be missing something.
If it's not possible, any chance of getting it added?

-Patrick


configure peer namespace

2016-10-09 Thread Patrick Hemmer
Can we get the ability to configure the peer namespace?
Right now haproxy uses the default namespace, but in our system we have
an "internal" interface which is able to talk to the other haproxy
nodes, and this interface is in another network namespace.

Additionally, the error output for failure to bind to a peer address is
lacking. I had to do an `strace` to figure out what the error was:
[ALERT] 282/214021 (2725) : [haproxy.main()] .
[ALERT] 282/214021 (2725) : [haproxy.main()] Some protocols failed to
start their listeners! Exiting.

That's on haproxy 1.6.9

Anyway, I can change the namespace that haproxy is launched in, and then
manually override the namespace for every `bind` and `server` parameter,
but it's rather cumbersome to do so. Would be much nicer to be able to
control the peer binding namespace, just like any other bind.

If this would be a simple change, I might be willing to attempt it. But
I've never worked in the haproxy source before, so not sure how involved
it would be.

Thanks

-Patrick


trouble with sc0_conn_cur

2016-11-27 Thread Patrick Hemmer
I'm trying to limit concurrent connections but having trouble getting it
working with sc0_conn_cur.
The relevant portion of my config looks like:

frontend www
  log-format %ac\ %tsc\ %[sc0_conn_cur]
  stick-table type ip size 1 expire 10s peers cluster store conn_cur
  tcp-request connection track-sc0 src
  tcp-request connection reject if { sc0_conn_cur gt 16 }

And then I fire off 50 simultaneous requests to test. None of the
requests are closed, and the resulting log entries look like this:
50  -

What am I missing here? Why does it seem like the track isn't working?

This is with haproxy 1.6.9

-Patrick


Re: using comma in argument to sample fetch & converter

2016-12-08 Thread Patrick Hemmer


On 2016/12/7 19:15, Cyril Bonté wrote:
> Hi,
>
> On 07/12/2016 21:40, Patrick Hemmer wrote:
>> How do you use a comma inside an argument to a sample fetcher or
>> converter?
>> For example, the sample fetch str, if I try to do `str(foo,bar)` I get
>> the error
>>
>> fetch method 'str' : end of arguments expected at position 2, but
>> got ',bar'
>>
>> All variations such as `str('foo,bar')`, `str(foo\,bar)`, etc, result in
>> the same error.
>
> Commas are not supported in converters and sample fetches.
> There may be several workaround but without any context, it's
> difficult to provide one.
>
> For example, you can provide an "urlencoded" string and use url_dec as
> a converter :
>   http-response add-header X-test %[str("foo%2Cbar"),url_dec]
>
The use case is that I have the DeviceAtlas library setting a variable.
But if there is no User-Agent header, the variable is unset. In this case
I want to set it to a default value with the same format, just empty
fields (e.g. "" or "-,-,-,-").
The url_dec trick seems sufficient to accomplish this. A little hackish,
but it works.
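
For instance, a hedged sketch of that default-value workaround (the variable name `txn.da_csv` is hypothetical; the real variable is whatever the DeviceAtlas converter sets elsewhere in the config):

```haproxy
# %2C decodes to a comma via url_dec, working around the no-commas-in-
# arguments limitation. Only applied when the variable was never set.
http-request set-var(txn.da_csv) str(-%2C-%2C-%2C-),url_dec unless { var(txn.da_csv) -m found }
```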
Thanks

-Patrick


Re: Define path of configuration files in systemd unit

2016-12-13 Thread Patrick Hemmer
On 2016/12/13 11:14, Ricardo Fraile wrote:
> Hello Jarno,
>
>
> Yes, you are right, this is not an elegant solution, and reloading
> doesn't work. This is the systemd report:
>
>
> # systemctl status haproxy.service -l
> ● haproxy.service - HAProxy Load Balancer
>Loaded: loaded (/etc/systemd/system/haproxy.service; enabled)
>Active: active (running) since Tue 2016-12-13 09:25:13 CET; 1s ago
>   Process: 28736 ExecReload=/bin/kill -USR2 $MAINPID (code=exited,
> status=0/SUCCESS)
>   Process: 28764 ExecStartPre=/bin/sh -c /usr/local/sbin/haproxy -c -q
> -- /etc/haproxy/* (code=exited, status=0/SUCCESS)
>  Main PID: 28766 (sh)
>CGroup: /system.slice/haproxy.service
>├─28766 /bin/sh -c /usr/local/sbin/haproxy-systemd-wrapper
> -p /run/haproxy.pid -- /etc/haproxy/*
>├─28769 /usr/local/sbin/haproxy-systemd-wrapper
> -p /run/haproxy.pid
> -- /etc/haproxy/haproxy.conf /etc/haproxy/z.conf /etc/haproxy/zz.conf
>├─28770 /usr/local/sbin/haproxy -Ds -p /run/haproxy.pid
> -- /etc/haproxy/haproxy.conf /etc/haproxy/z.conf /etc/haproxy/zz.conf
>└─28771 /usr/local/sbin/haproxy -Ds -p /run/haproxy.pid
> -- /etc/haproxy/haproxy.conf /etc/haproxy/z.conf /etc/haproxy/zz.conf
>
>
> Thanks,
>
>
> El lun, 12-12-2016 a las 19:36 +0200, Jarno Huuskonen escribió:
>> Hi Ricardo,
>>
>> On Mon, Dec 12, Ricardo Fraile wrote:
>>> Yes, shell expansion did the trick, this is the working systemd unit:
>>>
>>>
>>> [Unit]
>>> Description=HAProxy Load Balancer
>>> After=network.target
>>>
>>> [Service]
>>> ExecStartPre=/bin/sh -c "/usr/local/sbin/haproxy -c -q
>>> -- /etc/haproxy/*"
>>> ExecStart=/bin/sh -c "/usr/local/sbin/haproxy-systemd-wrapper
>>> -p /run/haproxy.pid -- /etc/haproxy/*"
>>> ExecReload=/bin/kill -USR2 $MAINPID
>> Does the /bin/sh -c add extra process to haproxy process tree ?
>> Does systemctl status haproxy that "Main PID:" belongs to
>> haproxy-systemd-wrapper process and reloading config works ?
>>
>> -Jarno
>>

You can solve that specific issue easily by adding `exec` to the command.

ExecStart=/bin/sh -c "exec /usr/local/sbin/haproxy-systemd-wrapper
-p /run/haproxy.pid -- /etc/haproxy/*"

-Patrick


using comma in argument to sample fetch & converter

2016-12-07 Thread Patrick Hemmer
How do you use a comma inside an argument to a sample fetcher or converter?
For example, the sample fetch str, if I try to do `str(foo,bar)` I get
the error

fetch method 'str' : end of arguments expected at position 2, but
got ',bar'

All variations such as `str('foo,bar')`, `str(foo\,bar)`, etc, result in
the same error.

-Patrick



Re: [PATCH] MINOR: systemd unit works with cfgdir and cfgfile

2017-01-12 Thread Patrick Hemmer


On 2017/1/12 06:42, Ricardo Fraile wrote:
> Hello,
>
>
> As 1.7 release allow to load multiple files from a directory:
>
>
> https://cbonte.github.io/haproxy-dconv/1.7/management.html
>
>  -f <cfgfile|cfgdir> : adds <cfgfile> to the list of configuration files
> to be loaded. If <cfgdir> is a directory, all the files (and only files)
> it contains are added in lexical order (using LC_COLLATE=C) to the list
> of configuration files to be loaded ; only files with ".cfg" extension
> are added, only non hidden files (not prefixed with ".") are added.
>
>
> I think that the systemd unit can have the configuration directory
> instead of the path to the file for allow the same behaviour that the
> "-f" option provides.
>
>
> Thanks in advance,
>
>
>
> Regards,
This change is rather dangerous. It's not unlikely that people will have
multiple config files in their `/etc/haproxy` directory. Such might
happen if users keep backups of previous versions when they make a
change, or if they have alternate configs for different purposes. In
general when applications support loading multiple config files, the
configs are typically put in a `.d` directory, such as
`/etc/haproxy/conf.d`.
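
A sketch of such a layout (paths and the `conf.d` name are illustrative, not from the patch): since 1.7, `-f` accepts a directory directly, so no shell glob is needed and stray files in `/etc/haproxy` itself are left alone.

```ini
[Service]
# Validate then start; only *.cfg files under conf.d are loaded, in
# lexical order, per the -f directory semantics quoted above.
ExecStartPre=/usr/local/sbin/haproxy -c -q -f /etc/haproxy/conf.d
ExecStart=/usr/local/sbin/haproxy-systemd-wrapper -p /run/haproxy.pid -f /etc/haproxy/conf.d
ExecReload=/bin/kill -USR2 $MAINPID
```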

-Patrick


deviceatlas issues

2016-12-01 Thread Patrick Hemmer
After using the addon, I've run across a few issues trying to get it
running.
The first are mostly documentation issues:
1. The example for `da-csv-conv` has `da-csv()` instead of `da-csv-conv()`.
2. The documentation lists the parameter `deviceatlas-separator`. The
param is really `deviceatlas-property-separator`.
3. The deviceatlas library seems to provide a `da-csv-fetch`, which
isn't in the documentation at all.
4. Documentation on `deviceatlas-log-level` doesn't indicate the
maximum level, and is not very descriptive in general. I tried
changing the value and saw no change in any logging.

But I also ran across an issue when I try to use a personalized
enterprise data file in that all lookups seem to fail. The resulting
`da-csv-conv()` just returns empty fields. The fields I'm using are
basic ones like browserName,browserVersion,osName,osVersion. The free
file (20160203_compact.json) works fine. The web UI for controlling the
file indicates the fields are selected, and when I grep the json file, I
get field names like "sbrowserName","sbrowserVersion". So not sure why
it doesn't work.

-Patrick


capture header VS set-var

2016-12-04 Thread Patrick Hemmer
I was mostly just wondering about differences between using things like
`capture request header`/`http-request capture` and `http-request set-var`.

set-var seems to have all the capabilities of captures, and are much
easier to work with. You don't have to pre-declare them, you don't have
to set their size, you don't have to remember the order in which you
declared them and reference them by an index.

So what's the benefit of captures? Are they more performant or something?

-Patrick


Re: ALERT:sendmsg logger #1 failed: Resource temporarily unavailable (errno=11)

2017-01-05 Thread Patrick Hemmer


On 2017/1/5 02:15, Igor Cicimov wrote:
> Hi all,
>
> On one of my haproxy's I get the following message on reload:
>   
>   
>   
> [ALERT] 004/070949 (21440) : sendmsg logger #1 failed: Resource
> temporarily unavailable (errno=11)
>
> Has anyone seen this before or any pointers where to look for to
> correct this?
>
> Thanks,
> Igor
>
Google has several entries on the subject:
https://bugs.launchpad.net/kolla/+bug/1549753
http://comments.gmane.org/gmane.comp.web.haproxy/4716

-Patrick


Re: Is it possible to avoid 503 error when one backend server has down and health check hasn't been launched yet

2016-12-24 Thread Patrick Hemmer

On 2016/12/24 10:42, Alex.Chen wrote:
> For my scenario, I need to use "balance source" to keep the
> persistence of haproxy's balancing. I find that when one of my backend
> servers (s1) has been killed, and the next round of health checks has
> not run yet, s1 is still marked as UP. After 3 retries, the redispatch
> option does not work and I still get a 503 error. After a while, a
> health check runs, s1 is marked as DOWN, and my request is forwarded
> to another backend server; everything is OK then.
>
> My question is: is there any config that can help me avoid the 503
> error when 3 retries have failed but s1 is still marked as UP,
> before the next round of health checks?
>
> I debugged haproxy (1.6.10) and found that when using "balance source",
> the redispatch option does not actually work after 3 retries. I guess
> that is because "balance source" is deterministic, based on the source
> IP and the server state info (UP/DOWN and weight) (from:
> http://blog.haproxy.com/2013/04/22/client-ip-persistence-or-source-ip-hash-load-balancing/
> ), so if the server still looks UP, balance source will keep assigning
> the redispatched connection to the dead server s1.
>
>
I would think the "observe" option should handle this issue.
https://cbonte.github.io/haproxy-dconv/1.7/configuration.html#5.2-observe
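
A hedged sketch of how that might look here (server addresses are hypothetical; `observe layer7` demotes a server after `error-limit` consecutive failed real requests, without waiting for the next periodic health check):

```haproxy
backend app
    balance source
    server s1 10.0.0.1:80 check observe layer7 error-limit 10 on-error mark-down
    server s2 10.0.0.2:80 check observe layer7 error-limit 10 on-error mark-down
```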

-Patrick


Re: ssl offloading and send-proxy-v2-ssl

2016-12-26 Thread Patrick Hemmer


On 2016/12/23 09:28, Arnall wrote:
> Hi everyone,
>
> i'm using a nbproc > 1 configuration for ssl offloading :
>
> listen web_tls
> mode http
> bind *:443 ssl crt whatever.pem process 2
> bind *:443 ssl crt whatever.pem process 3
>
> ../..
> server web_plain u...@plain.sock send-proxy-v2-ssl
>
> frontend web_plain
> bind*:80 process 1
> bind u...@plain.sock process 1 accept-proxy
>
> ../..
>
> And I'm looking for a secure solution in the web_plain frontend to
> know if the request comes from web_tls or not (in fact I want to know
> if the connection was initially made over SSL/TLS transport).
>
> I though that send-proxy-v2-ssl could help but i have no idea how ...
> src and src_port are OK with the proxy protocol but ssl_fc in
> web_plain keeps answering false  ( 0 ) even the request come from
> web_tls.
>
> I could set and forward a secret header set in web_tls but i don't
> like the idea ... (have to change the header each time an admin sys
> leave the enterprise... )
>
> Thanks.
>
>
>

This use case has come up a few times:
https://www.mail-archive.com/haproxy@formilux.org/msg23882.html
My crude solution is an ACL check on the port the client connected to
(dst_port eq 443).
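A sketch of that workaround (ACL name and socket path are illustrative;
with `accept-proxy`, `dst_port` reflects the destination carried in the
PROXY protocol header, so the plain frontend still sees the original
:443):

    frontend web_plain
        bind *:80 process 1
        bind unix@plain.sock process 1 accept-proxy
        acl came_via_tls dst_port eq 443
        http-request set-header X-Forwarded-Proto https if came_via_tls
        http-request set-header X-Forwarded-Proto http  if !came_via_tls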

-Patrick



Re: capture header VS set-var

2016-12-07 Thread Patrick Hemmer


On 2016/12/5 03:35, thierry.fourn...@arpalert.org wrote:
> On Sun, 4 Dec 2016 09:17:00 -0500
> Patrick Hemmer <hapr...@stormcloud9.net> wrote:
>
>> I was mostly just wondering about differences between using things like
>> `capture request header`/`http-request capture` and `http-request set-var`.
>>
>> set-var seems to have all the capabilities of captures, and are much
>> easier to work with. You don't have to pre-declare them, you don't have
>> to set their size, you don't have to remember the order in which you
>> declared them and reference them by an index.
>>
>> So what's the benefit of captures? Are they more performant or something?
>
> Hi Patrick,
>
> This is a good question !
>
> Header captures are faster. They use a memory pool of predefined size,
> and access to a captured header is fast because it is indexed by an
> integer.
>
> set-var uses the standard system memory allocator, plus a memory
> accounting system to prevent excessive consumption. This memory
> allocation is slow. In addition, each variable is stored in a list,
> and HAProxy must walk this list on every variable access.
>
> So, you're right: header captures are more performant; set-var is
> easier to use.
>
> Note that variables containing an IP or an integer don't use the
> system memory allocator.
>
> Finally, I ran a quick benchmark on my computer. I captured the
> User-Agent header containing 127 bytes (configured to capture 128)
> with the two methods:
>
>set-var : 90262 req/s
>capture : 92257 req/s
>
> So, on my computer, "set-var" is about 2% slower than "capture".
>
> Thierry
>

Thanks for the info. That kinda confirms some of my thoughts on the
subject. The bit about ip/integer is useful to know.

-Patrick



Re: Odd behaviour with option forwardfor.

2017-07-22 Thread Patrick Hemmer
On 2017/7/22 11:11, Claus Strommer wrote:
> Hi all, I'm seeing some odd behaviour with our haproxy balancer and am
> looking for some insights.
>
> The setup:
>
> I have a webserver that is behind two haproxy balancers (version
> 1.5.18 on EL7), which are behind CloudFlare.   In effect the request goes
>
> client->CF->haproxy1->haproxy2->server. 
>
> On both haproxy balancers I have "option forwardfor" and "capture
> request header X-Forwarded-For len 128" set.  On the server I also
> capture X-Forwarded-For
>
> Now here is where the odd behaviour (*highlighted*) happens:
>
> * haproxy1 logs the full X-Forwarded-For header.
> * *haproxy2 only logs the IP of the CF proxy (the last address in
> X-Forwarded-For)*
> * server logs the full X-Forwarded-For header.
> * If I turn off "option forwardfor" on haproxy1, then haproxy2 logs
> the full header as received by CF. 
> * Changing the length of the capture request does not seem to make a
> difference.
> * I noticed that haproxy uses spaces after the comma between the
> header entries, but CF does not.  I tried replicating this issue with
> a direct curl request to haproxy2 replicating the x-forwarded-for
> header that haproxy1 would have sent, and I cannot reproduce the issue.
>
> The only thing that I notice is that CF
>
> Am I missing something obvious here?  Below are the full options I'm
> using on haproxy1 and haproxy2.  Everything after that is ACLs
>
> defaults
> modehttp
> log global
> option  httplog
> option  dontlognull
> option http-server-close
> option forwardfor   except 127.0.0.0/8 
> option  redispatch
> retries 3
>
> frontend  http *:80
> mode http
> reqadd X-Forwarded-Proto:\ https
> redirect scheme https code 301
>
> frontend https
> bind *:443 ssl crt /etc/pki/tls/certs/hacert.pem
> mode http
> capture request header Host len 50
>

The "option forwardfor" setting appends a complete new header; it does
not append the value to an existing header. From the docs on "option
forwardfor":
> Enable insertion of the X-Forwarded-For header to requests sent to servers
...
> this header is always appended at the end of the existing header list

Your header capture is grabbing the last X-Forwarded-For header.
On issues like this, you should perform a packet capture. It would make
the issue immediately apparent.

Personally I use 2 rules similar to the following to append to
X-Forwarded-For:

  http-request set-header X-Forwarded-For %[req.fhdr(X-Forwarded-For)],\ %[src] if { req.fhdr(X-Forwarded-For) -m found }
  http-request set-header X-Forwarded-For %[src] if !{ req.fhdr(X-Forwarded-For) -m found }

-Patrick


Looking for a way to limit simultaneous connections per IP

2017-06-28 Thread Patrick Hemmer
So as the subject indicates, I'm looking to limit concurrent connections
to a backend by the source IP. The behavior I'm trying for is that if
the client has more than 6 connections, we sit on the request for a
second, and then send back a 302 redirect to the same resource that was
just requested.

I was able to accomplish this using a stick table for tracking
connection count, and a Lua script for doing the sleep ("sit on the
request" part), but it has a significant flaw. Once the >6 connection
limit is hit, and we start redirecting with 302, the client can't leave
this state. When they come back in after the redirect, they'll still
have >6 connections, and will hit the rate limit rule again.

We instead need a way to differentiate (count) connections held open and
sitting in the Lua delay function, and connections being processed by a
server.

I'd be open to other ways of accomplishing the end goal as well. We want
to use the 302 redirect so the rate limit is transparent to the client.
And we want the delay so that the client just doesn't hammer haproxy
with request after request, and so the browser doesn't report it as a
redirect loop
(a brief delay will allow the existing connections to finish processing
so that after the 302, it can be handled). And we're trying for a
per-client limit (as opposed to a simple "maxconn" setting and a FIFO
queue) to prevent a single client from monopolizing the backend resource.

-Patrick


Re: Looking for a way to limit simultaneous connections per IP

2017-06-28 Thread Patrick Hemmer


On 2017/6/28 17:40, Mark Staudinger wrote:
> Hi Patrick,
>
> Where are you using the stick table and lua script call?  Frontend or
> backend?
>
> Perhaps this would work:
>
> * In the frontend, check the connection count from the "real backend"
> stick table
> * if the count is > 6, set ACL for the source
> *Use this ACL to steer the conection to the "redirect backend" which
> will call the lua script to sleep/redirect
>
> In this way, redirected requests won't add to the backend count for
> the stick table counting such things, because they go to a different
> backend that doesn't actually talk to the resource you are protecting.
>
I think I found the solution. It's very similar to what you proposed.

frontend foofront
  http-request lua.delay_request if { src,table_conn_cur(fooback) ge 6 }
  http-request redirect prefix / code 302 if { src,table_conn_cur(fooback) ge 6 }

backend fooback
  stick-table type ip size 1 expire 10s peers cluster store conn_cur
  http-request track-sc1 src


I didn't see this solution at first as I didn't see the `table_conn_cur`
converter. I thought the only way to get a value from a stick table was
to track the connection.
The documentation is also a little confusing, as it seems to imply it'll
use the string form of the IP address, when I expect the table stores
the binary form of the IP address. But it seems to work in my testing.
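The `lua.delay_request` action used above is not shown in the thread; a
minimal sketch of what such a script might look like, assuming HAProxy's
Lua action API (`core.register_action`, `core.msleep`) and loading via
`lua-load` in the global section:

    -- delay.lua (filename illustrative)
    core.register_action("delay_request", { "http-req" }, function(txn)
        -- hold the request for one second so in-flight connections can
        -- finish before the 302 sends the client back in
        core.msleep(1000)
    end)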


> Best,
> -Mark
>
> On Wed, 28 Jun 2017 16:56:03 -0400, Patrick Hemmer
> <hapr...@stormcloud9.net> wrote:
>
> So as the subject indicates, I'm looking to limit concurrent
> connections to a backend by the source IP. The behavior I'm trying
> for is that if the client has more than 6 connections, we sit on
> the request for a second, and then send back a 302 redirect to the
> same resource that was just requested.
>
> I was able to accomplish this using a stick table for tracking
> connection count, and a Lua script for doing the sleep ("sit on
> the request" part), but it has a significant flaw. Once the >6
> connection limit is hit, and we start redirecting with 302, the
> client can't leave this state. When they come back in after the
> redirect, they'll still have >6 connections, and will hit the rate
> limit rule again.
>
> We instead need a way to differentiate (count) connections held
> open and sitting in the Lua delay function, and connections being
> processed by a server.
>
> I'd be open to other ways of accomplishing the end goal as well.
> We want to use the 302 redirect so the rate limit is transparent
> to the client. And we want the delay so that the client just
> doesn't hammer haproxy with request after request, and the browser
> report it as a redirect loop (a brief delay will allow the
> existing connections to finish processing so that after the 302,
> it can be handled). And we're trying for a per-client limit (as
> opposed to a simple "maxconn" setting and a FIFO queue) to prevent
> a single client from monopolizing the backend resource.
>
> -Patrick
>
>
>
>



Re: Logging SSL pre-master-key

2017-06-30 Thread Patrick Hemmer


On 2017/6/30 01:00, Willy Tarreau wrote:
> Hi Patrick, sorry for the delay :-/
>
> On Mon, Jun 19, 2017 at 01:54:36PM -0400, Patrick Hemmer wrote:
>> Well my argument for keeping the name starting with `ssl_fc_session_` is
>> that there is also `ssl_fc_session_id`. These 2 fetches pull their
>> attribute from the same "session" structure. They are also closely
>> related as using `ssl_fc_session_key` almost requires the
>> `ssl_fc_session_id` value (you could technically get by without it, but
>> it'll make using the key rather difficult unless you have some other way
>> of correlating a key with a specific SSL session).
> OK, that totally makes sense then.
>
>>>>  static int
>>>> +smp_fetch_ssl_fc_session_key(const struct arg *args, struct sample *smp, 
>>>> const char *kw, void *private)
>>>> +{
>>>> +#if OPENSSL_VERSION_NUMBER >= 0x1010L || defined(OPENSSL_IS_BORINGSSL)
>>> Here I'd put the ifdef around the whole function, like we have for
>>> ALPN for example, so that there's no risk it could be used in a non-working
>>> config.
>> My objection to this is that most of the other sample fetches don't do
>> this, and so it makes the user experience inconsistent. For example the
>> `ssl_fc_session_id` fetch, which `ssl_fc_session_key` is strongly
>> related to, behaves the same way.
> But this has already been causing problems, because people build with their
> lib, then configure their logs, and suddenly realize that the field is empty
> and report the bug here. The issue I'm having is that there's no notification
> that this will not work. Using #ifdef ensures that what is not supported will
> report an error. Then the user looks at the keyword in the doc and reads
> "requires openssl 1.1 or above" and understands why there's this problem.
I can include an additional patch to change the existing SSL fetches to
the desired behavior. Would that be acceptable?
>
>> The ALPN/NPN fetches are the only ones
>> that make using the fetch a config error if the underlying support is
>> missing.
> For SSL indeed, but if you look at other places like TCP, bind keywords
> or server keywords, it's exactly the opposite. There are other places
> which are cleaner, it's only the arg resolver which has the ifdef so
> that instead of not parsing, it's parsed and reported as "not supported"
> or "requires version blah". I just can't find such an example but I'm
> pretty sure we do have some.
>
> Cheers,
> Willy



Re: Logging SSL pre-master-key

2017-06-12 Thread Patrick Hemmer


On 2017/6/12 15:14, Lukas Tribus wrote:
> Hello,
>
>
> Am 12.06.2017 um 19:35 schrieb Patrick Hemmer:
>> Would we be able to get a new sample which provides the SSL session
>> master-key?
>> This is so that when performing packet captures with ephemeral ciphers
>> (DHE), we can decrypt the traffic in the capture.
> There is no master key. What you need is the key for the symmetric
> crypto, and you cannot extract it from haproxy currently.
>
> More importantly, OpenSSL implements this functionality only the master
> branch (see [1] and [2]), none of the release branches actually have
> this functionality.
> So we need OpenSSL to release a new branch with this functionality
> (1.1.1), we have to implement it in haproxy and then still it will only
> work for <=TLSv1.2.
>
> TLSv1.3 will need additional secrets and a different key logging API [3].
>
>
> I suggest you use SSLKEYLOGFILE features in the browsers at this point,
> as the functionality is far from being ready for any OpenSSL based
> application.
>
>
> Regards,
> Lukas
>
> [1]
> https://github.com/openssl/openssl/commit/2faa1b48fd6864f6bb8f992fd638378202fdd416
> [2]
> https://www.openssl.org/docs/manmaster/man3/SSL_CTX_set_keylog_callback.html
> [3] https://github.com/openssl/openssl/pull/2287
>

Maybe there's some misunderstanding, because we seem to be talking about
different things, as there definitely is a master key.

I patched my haproxy to add a ssl_fc_session_key fetch, and with the
value I was able to decrypt my test sessions encrypted with
TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256.
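As a usage note (my own sketch, not part of the patch): pairing this
fetch with `ssl_fc_session_id` produces lines in the NSS key log format
that Wireshark can consume directly, e.g. via a log-format routed to a
dedicated log target:

    # illustrative; send this frontend's logs to their own key log file
    log-format "RSA Session-ID:%[ssl_fc_session_id,hex] Master-Key:%[ssl_fc_session_key,hex]"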

Since the implementation was fairly easy, I've included a patch for it.
But I've never submitted anything before, so there's a good chance of
something being wrong.

The only thing is that the function to do the extraction was added in
1.1.0
(https://github.com/openssl/openssl/commit/858618e7e037559b75b0bfca4d30440f9515b888)
The underlying vars are still there, and when I looked have been there
since as early as I could find (going back to 1998). But I'm not sure
how you feel about extracting the values without the helper function.

-Patrick
From a6fa01c65f615887b08f86bb67ac7ef6231dbc34 Mon Sep 17 00:00:00 2001
From: Patrick Hemmer <hapr...@stormcloud9.net>
Date: Mon, 12 Jun 2017 18:03:48 -0400
Subject: [PATCH] MINOR: ssl: add fetch 'ssl_fc_session_key' and
 'ssl_bc_session_key'

These fetches return the SSL master key of the front/back connection.
This is useful to decrypt traffic encrypted with ephemeral ciphers.
---
 doc/configuration.txt | 10 ++++++++++
 src/ssl_sock.c        | 45 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 55 insertions(+)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 49bfd85..e7cfd85 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -13930,6 +13930,11 @@ ssl_bc_session_id : binary
   made over an SSL/TLS transport layer. It is useful to log if we want to know
   if session was reused or not.
 
+ssl_bc_session_key : binary
+  Returns the SSL master key of the back connection when the outgoing
+  connection was made over an SSL/TLS transport layer. It is useful to decrypt
+  traffic sent using ephemeral ciphers.
+
 ssl_bc_use_keysize : integer
   Returns the symmetric cipher key size used in bits when the outgoing
   connection was made over an SSL/TLS transport layer.
@@ -14185,6 +14190,11 @@ ssl_fc_session_id : binary
   a server. It is important to note that some browsers refresh their session ID
   every few minutes.
 
+ssl_fc_session_key : binary
+  Returns the SSL master key of the front connection when the incoming
+  connection was made over an SSL/TLS transport layer. It is useful to decrypt
+  traffic sent using ephemeral ciphers.
+
 ssl_fc_sni : string
   This extracts the Server Name Indication TLS extension (SNI) field from an
   incoming connection made via an SSL/TLS transport layer and locally
diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index af09cfb..2fe7e2f 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
 smp_fetch_ssl_fc_session_id(const struct arg *args, struct sample *smp, const char *kw, void *private)
 }
 
 static int
+smp_fetch_ssl_fc_session_key(const struct arg *args, struct sample *smp, const char *kw, void *private)
+{
+   struct connection *conn = objt_conn((kw[4] != 'b') ? smp->sess->origin :
+   smp->strm ? smp->strm->si[1].end : NULL);
+
+   SSL_SESSION *ssl_sess;
+   int data_len;
+   struct chunk *data;
+
+   smp->flags = SMP_F_CONST;
+   smp->data.type = SMP_T_BIN;
+
+   if (!conn || !conn->xprt_ctx || conn->xprt != &ssl_sock)
+   return 0;
+
+   ssl_sess = SSL_get_session(conn->xprt_ctx);
+   if (!ssl_sess)
+   return 0;
+
+
+#if OPENSSL_VERSION_NUMBER >= 0x10100000L
+   data = get_trash_chunk();

Logging SSL pre-master-key

2017-06-12 Thread Patrick Hemmer
Would we be able to get a new sample which provides the SSL session
master-key?
This is so that when performing packet captures with ephemeral ciphers
(DHE), we can decrypt the traffic in the capture.

-Patrick


Re: Logging SSL pre-master-key

2017-06-22 Thread Patrick Hemmer

On 2017/6/19 13:54, Patrick Hemmer wrote:
>
>
> On 2017/6/17 00:00, Willy Tarreau wrote:
>> Hi Patrick,
>>
>> On Fri, Jun 16, 2017 at 09:36:30PM -0400, Patrick Hemmer wrote:
>>> The main reason I had for supporting the older code is that it seems
>>> many (most?) linux distros, such as the one we use (CentOS/7), still
>>> ship with 1.0.1 or 1.0.2. However since this is a minor change and a
>>> feature enhancement, I doubt this will get backported to 1.7, meaning
>>> we'll have to manually patch it into the version we use. And since we're
>>> doing that, we can just use the patch that supports older OpenSSL.
>> Yes in this case I think it makes sense. And given the proliferation
>> of *ssl implementations, I really prefer to avoid adding code touching
>> directly internal openssl stuff.
>>
>>> Anyway, here is the updated patch with the support for <1.1.0 dropped,
>>> as well as the BoringSSL support Emmanuel requested.
>> OK cool.
>>
>>> One other minor thing I was unsure about was the fetch name. Currently I
>>> have it as "ssl_fc_session_key", but "ssl_fc_session_master_key" might
>>> be more accurate. However this is rather verbose and would make it far
>>> longer than any other sample fetch name, and I was unsure if there were
>>> rules around the naming.
>> There are no particular rules but it's true that long names are painful,
>> especially for those with "fat fingers". I'd suggest that given that it
>> starts with "ssl_fc_" and "ssl_bc_", the context is already quite
>> restricted and you could probably call them "smk" or "smkey". One would
>> argue that looking at the doc will be necessary, but with long names you
>> also have to look at the doc to find how to spell them anyway, the
>> difference being that short names don't mangle your configs too much,
>> especially in logformat strings.
> Well my argument for keeping the name starting with `ssl_fc_session_`
> is that there is also `ssl_fc_session_id`. These 2 fetches pull their
> attribute from the same "session" structure. They are also closely
> related as using `ssl_fc_session_key` almost requires the
> `ssl_fc_session_id` value (you could technically get by without it,
> but it'll make using the key rather difficult unless you have some
> other way of correlating a key with a specific SSL session).
>>> +ssl_bc_session_key : binary
>>> +  Returns the SSL session master key of the back connection when the 
>>> outgoing
>>> +  connection was made over an SSL/TLS transport layer. It is useful to 
>>> decrypt
>>> +  traffic sent using ephemeral ciphers.
>> Here it would be nice to add a short sentence like "Not available before
>> openssl 1.1.0" so that users don't waste too much time on something not
>> working.
>>
>>>  static int
>>> +smp_fetch_ssl_fc_session_key(const struct arg *args, struct sample *smp, 
>>> const char *kw, void *private)
>>> +{
>>> +#if OPENSSL_VERSION_NUMBER >= 0x10100000L || defined(OPENSSL_IS_BORINGSSL)
>> Here I'd put the ifdef around the whole function, like we have for
>> ALPN for example, so that there's no risk it could be used in a non-working
>> config.
> My objection to this is that most of the other sample fetches don't do
> this, and so it makes the user experience inconsistent. For example
> the `ssl_fc_session_id` fetch, which `ssl_fc_session_key` is strongly
> related to, behaves the same way. The ALPN/NPN fetches are the only
> ones that make using the fetch a config error if the underlying
> support is missing.
>>> +   struct connection *conn = objt_conn((kw[4] != 'b') ? smp->sess->origin :
>>> +   smp->strm ? smp->strm->si[1].end : 
>>> NULL);
>>> +
>>> +   SSL_SESSION *ssl_sess;
>>> +   int data_len;
>>> +   struct chunk *data;
>>> +
>>> +   if (!conn || !conn->xprt_ctx || conn->xprt != &ssl_sock)
>>> +   return 0;
>>> +
>>> +   ssl_sess = SSL_get_session(conn->xprt_ctx);
>>> +   if (!ssl_sess)
>>> +   return 0;
>>> +
>>> +   data = get_trash_chunk();
>>> +   data_len = SSL_SESSION_get_master_key(ssl_sess, (unsigned char 
>>> *)data->str, data->size);
>>> +   if (!data_len)
>>> +   return 0;
>>> +
>>> +   smp->flags = SMP_F_CONST;
>>> +   smp->data.type = SMP_T_BIN;
>>> +   data->len = data_len;

Re: Logging SSL pre-master-key

2017-06-19 Thread Patrick Hemmer


On 2017/6/17 00:00, Willy Tarreau wrote:
> Hi Patrick,
>
> On Fri, Jun 16, 2017 at 09:36:30PM -0400, Patrick Hemmer wrote:
>> The main reason I had for supporting the older code is that it seems
>> many (most?) linux distros, such as the one we use (CentOS/7), still
>> ship with 1.0.1 or 1.0.2. However since this is a minor change and a
>> feature enhancement, I doubt this will get backported to 1.7, meaning
>> we'll have to manually patch it into the version we use. And since we're
>> doing that, we can just use the patch that supports older OpenSSL.
> Yes in this case I think it makes sense. And given the proliferation
> of *ssl implementations, I really prefer to avoid adding code touching
> directly internal openssl stuff.
>
>> Anyway, here is the updated patch with the support for <1.1.0 dropped,
>> as well as the BoringSSL support Emmanuel requested.
> OK cool.
>
>> One other minor thing I was unsure about was the fetch name. Currently I
>> have it as "ssl_fc_session_key", but "ssl_fc_session_master_key" might
>> be more accurate. However this is rather verbose and would make it far
>> longer than any other sample fetch name, and I was unsure if there were
>> rules around the naming.
> There are no particular rules but it's true that long names are painful,
> especially for those with "fat fingers". I'd suggest that given that it
> starts with "ssl_fc_" and "ssl_bc_", the context is already quite
> restricted and you could probably call them "smk" or "smkey". One would
> argue that looking at the doc will be necessary, but with long names you
> also have to look at the doc to find how to spell them anyway, the
> difference being that short names don't mangle your configs too much,
> especially in logformat strings.
Well my argument for keeping the name starting with `ssl_fc_session_` is
that there is also `ssl_fc_session_id`. These 2 fetches pull their
attribute from the same "session" structure. They are also closely
related as using `ssl_fc_session_key` almost requires the
`ssl_fc_session_id` value (you could technically get by without it, but
it'll make using the key rather difficult unless you have some other way
of correlating a key with a specific SSL session).
>
>> +ssl_bc_session_key : binary
>> +  Returns the SSL session master key of the back connection when the 
>> outgoing
>> +  connection was made over an SSL/TLS transport layer. It is useful to 
>> decrypt
>> +  traffic sent using ephemeral ciphers.
> Here it would be nice to add a short sentence like "Not available before
> openssl 1.1.0" so that users don't waste too much time on something not
> working.
>
>>  static int
>> +smp_fetch_ssl_fc_session_key(const struct arg *args, struct sample *smp, 
>> const char *kw, void *private)
>> +{
>> +#if OPENSSL_VERSION_NUMBER >= 0x10100000L || defined(OPENSSL_IS_BORINGSSL)
> Here I'd put the ifdef around the whole function, like we have for
> ALPN for example, so that there's no risk it could be used in a non-working
> config.
My objection to this is that most of the other sample fetches don't do
this, and so it makes the user experience inconsistent. For example the
`ssl_fc_session_id` fetch, which `ssl_fc_session_key` is strongly
related to, behaves the same way. The ALPN/NPN fetches are the only ones
that make using the fetch a config error if the underlying support is
missing.
>
>> +struct connection *conn = objt_conn((kw[4] != 'b') ? smp->sess->origin :
>> +smp->strm ? smp->strm->si[1].end : 
>> NULL);
>> +
>> +SSL_SESSION *ssl_sess;
>> +int data_len;
>> +struct chunk *data;
>> +
>> +if (!conn || !conn->xprt_ctx || conn->xprt != &ssl_sock)
>> +return 0;
>> +
>> +ssl_sess = SSL_get_session(conn->xprt_ctx);
>> +if (!ssl_sess)
>> +return 0;
>> +
>> +data = get_trash_chunk();
>> +data_len = SSL_SESSION_get_master_key(ssl_sess, (unsigned char 
>> *)data->str, data->size);
>> +if (!data_len)
>> +return 0;
>> +
>> +smp->flags = SMP_F_CONST;
>> +smp->data.type = SMP_T_BIN;
>> +data->len = data_len;
> If you want, you can even get rid of your data_len variable by directly
> assigning SSL_SESSION_get_master_key() to data->len.
>
>> +static int
>>  smp_fetch_ssl_fc_sni(const struct arg *args, struct sample *smp, const char 
>> *kw, void *private)
>>  {
>>  #ifdef SSL_CTRL_SET_TLSEXT_HOSTNAME
>> @@ -7841,6 +7875,7 @@ static struct samp

HAProxy won't shut down

2017-05-23 Thread Patrick Hemmer
We've been running across a fair amount of haproxy processes lately that
won't shut down. We're currently using 1.7.5, but have also experienced
the issue with earlier versions, 1.7.2 for sure, but likely back even
further.
The processes are getting signaled to shut down by the
haproxy-systemd-wrapper after sending it a SIGHUP.
The last thing logged by the process was all the "Stopping frontend"
"Stopping backend" and "Proxy XXX stopped" messages.

When I do an `lsof -p XXX` I get:

# lsof -p 28856
COMMAND   PID USER   FD  TYPE DEVICE SIZE/OFF   NODE
NAME
haproxy 28856 root  cwd   DIR  253,0 4096128 /
haproxy 28856 root  rtd   DIR  253,0 4096128 /
haproxy 28856 root  txt   REG  253,0  1562240   25168059
/usr/sbin/haproxy
haproxy 28856 root  DEL   REG0,4   420037375
/dev/zero
haproxy 28856 root  mem   REG  253,062184   26659777
/usr/lib64/libnss_files-2.17.so
haproxy 28856 root  mem   REG  253,0   155744   25213445
/usr/lib64/libselinux.so.1
haproxy 28856 root  mem   REG  253,0   111080   26659787
/usr/lib64/libresolv-2.17.so
haproxy 28856 root  mem   REG  253,015688   25315637
/usr/lib64/libkeyutils.so.1.5
haproxy 28856 root  mem   REG  253,062744   25394528
/usr/lib64/libkrb5support.so.0.1
haproxy 28856 root  mem   REG  253,0   143944   26659785
/usr/lib64/libpthread-2.17.so
haproxy 28856 root  mem   REG  253,0   202568   25300495
/usr/lib64/libk5crypto.so.3.1
haproxy 28856 root  mem   REG  253,015848   25213462
/usr/lib64/libcom_err.so.2.1
haproxy 28856 root  mem   REG  253,0   959008   25394526
/usr/lib64/libkrb5.so.3.3
haproxy 28856 root  mem   REG  253,0   324888   25300491
/usr/lib64/libgssapi_krb5.so.2.2
haproxy 28856 root  mem   REG  253,011384   25167850
/usr/lib64/libfreebl3.so
haproxy 28856 root  mem   REG  253,0  2118128   25167885
/usr/lib64/libc-2.17.so
haproxy 28856 root  mem   REG  253,0   398264   25195400
/usr/lib64/libpcre.so.1.2.0
haproxy 28856 root  mem   REG  253,02   25195408
/usr/lib64/libpcreposix.so.0.0.1
haproxy 28856 root  mem   REG  253,0  1141928   26148751
/usr/lib64/libm-2.17.so
haproxy 28856 root  mem   REG  253,0  2025472   25300659
/usr/lib64/libcrypto.so.1.0.1e
haproxy 28856 root  mem   REG  253,0   454024   25300661
/usr/lib64/libssl.so.1.0.1e
haproxy 28856 root  mem   REG  253,019776   26148750
/usr/lib64/libdl-2.17.so
haproxy 28856 root  mem   REG  253,090664   25213451
/usr/lib64/libz.so.1.2.7
haproxy 28856 root  mem   REG  253,041080   25167891
/usr/lib64/libcrypt-2.17.so
haproxy 28856 root  mem   REG  253,0   155464   26148745
/usr/lib64/ld-2.17.so
haproxy 28856 root    0u  a_inode        0,9    0       5823 [eventpoll]
haproxy 28856 root    1u     IPv4  420797940    0t0        TCP 10.0.33.145:35754->10.0.33.147:1029 (CLOSE_WAIT)
haproxy 28856 root    2u     IPv4  420266351    0t0        TCP 10.0.33.145:52898->10.0.33.147:1029 (CLOSE_WAIT)
haproxy 28856 root    3r      REG        0,3    0 4026531956 net
haproxy 28856 root    4u     IPv4  422150834    0t0        TCP 10.0.33.145:38874->10.0.33.147:1029 (CLOSE_WAIT)
haproxy 28856 root    5r      REG        0,3    0 4026532437 net
haproxy 28856 root    6r      REG        0,3    0 4026531956 net
haproxy 28856 root   13u     unix 0x88009af6e800 0t0  420037384 socket

All those sockets have been sitting there like that for a long time.
The :1029 sockets are "peer" sync connections.
File descriptor 13 is likely one of:
* The syslog connection to /dev/log
* A dead connection from an SSL worker process. We use nbproc>1 with
dedicated processes handling SSL termination, and then unix domain
sockets to forward to the main haproxy process. PID 28856 is the main
process, not an SSL terminator. The SSL terminator processes are already
shut down, so there's nothing on the other end of that socket.
I'm not sure what the other "net" sockets are.



When I `strace -p XXX` I get:

# strace -p 28856
Process 28856 attached
epoll_wait(0, {}, 200, 319) = 0
epoll_wait(0, {}, 200, 0)   = 0
epoll_wait(0, {}, 200, 362) = 0
epoll_wait(0, {}, 200, 0)   = 0
epoll_wait(0, {}, 200, 114) = 0
epoll_wait(0, {}, 200, 0)   = 0
epoll_wait(0, {}, 200, 203) = 0
epoll_wait(0, {}, 200, 0)   = 0
epoll_wait(0, {}, 200, 331) = 0
epoll_wait(0, {}, 200, 0)   = 0



When I do `bt full` in gdb I get:

(gdb) bt full
#0  0x7f5f3efdacf3 in __epoll_wait_nocancel () from 

Re: HAProxy won't shut down

2017-05-29 Thread Patrick Hemmer

On 2017/5/29 08:22, Frederic Lecaille wrote:
>
> Hi Patrick,
>
> First thank you for this nice and helpful report.
>
> Would it be possible to have an output of this command the next time
> you reproduce such an issue please?
>
> echo "show sess" | socat stdio 

Unfortunately this would not be possible. When the issue occurs, the
haproxy process has stopped accepting connections on all sockets. If I
were to run this command, it would be sent to the new process, not the
one that won't shut down.

>
> I have only one question (see below).
>
> On 05/24/2017 10:40 AM, Willy Tarreau wrote:
>> Hi Patrick,
>>
>> On Tue, May 23, 2017 at 01:49:42PM -0400, Patrick Hemmer wrote:
>> (...)
>>> haproxy 28856 root1u IPv4  420797940  0t0   
>>> TCP 10.0.33.145:35754->10.0.33.147:1029 (CLOSE_WAIT)
>>> haproxy 28856 root2u IPv4  420266351  0t0   
>>> TCP 10.0.33.145:52898->10.0.33.147:1029 (CLOSE_WAIT)
>>> haproxy 28856 root3r  REG0,30
>>> 4026531956 net
>>> haproxy 28856 root4u IPv4  422150834  0t0   
>>> TCP 10.0.33.145:38874->10.0.33.147:1029 (CLOSE_WAIT)
>>
>> These ones are very interesting.
>
> These traces also seem interesting to me.
>
> # strace -p 28856
> Process 28856 attached
> epoll_wait(0, {}, 200, 319) = 0
> epoll_wait(0, {}, 200, 0)   = 0
> epoll_wait(0, {}, 200, 362) = 0
> epoll_wait(0, {}, 200, 0)   = 0
> epoll_wait(0, {}, 200, 114) = 0
> epoll_wait(0, {}, 200, 0)   = 0
> epoll_wait(0, {}, 200, 203) = 0
> epoll_wait(0, {}, 200, 0)   = 0
> epoll_wait(0, {}, 200, 331) = 0
> epoll_wait(0, {}, 200, 0)
>
>
> Were such "epoll_wait(0, 0, 200, 0)" calls infinitively displayed?
Yes

>
>
> In fact I am wondering if it is normal to see so many epoll_wait(0,
> {}, 200, 0) calls for a haproxy process which has shut down.
>
> I suspect they are in relation with peer tasks (which have obviously
> expired).
>
> If this is the case, and with configurations with only peer tasks,
> haproxy would definitely hang, consuming a lot of CPU resources.
HAProxy was not consuming high CPU. Note that in every other call to
`epoll_wait`, the 4th value was >0. If every single timeout value were
0, then yes, it would spin consuming CPU.

>
> So, I had a look at the peer struct task 'expire' member handling
> code, and I have just found a situation where pollers in relation with
> peer tasks are often called with an expired timeout leading haproxy to
> consume a lot of CPU resources. In fact this happens each time the
> peer task has expired for a fraction of a second.
>
> It is easy to reproduce this issue with a sort of peer simulator ;):
>
> strace -ttf socat TCP4-LISTEN:,reuseaddr,fork SYSTEM:"echo
> 200;sleep 10"
>
> This peer must be started *before* the other remote haproxy process
> with only peers as backends.
>
> strace is here only to have an idea of the moment where the remote
> haproxy peer has just connected.
>
> The sleep command is here to have enough time to block (ctrl + s) our
> peer simulator process after the haproxy peer has just connected.
>
> So this peer accepts any remote peer sessions sending "200" status
> messages (and that's all).
>
> A haproxy peer which connects to such a peer which does not reply to a
> synchronization request would endlessly consume high CPU resources
> until you unblock (ctrl + q) the peer simulator process.
>
> *Unhappily, I do not see any relation between this bug and the
> "CLOSE_WAIT peer state issue" which prevents haproxy from correctly
> shutting down.*
>
> I have attached a patch to this mail which fixes this issue.
Again, we're not seeing high CPU usage in this specific case. We have
reported a completely different scenario where haproxy starts consuming
CPU doing `epoll_wait(x,x,x,0)`, but this is not that. Every time this
shutdown issue occurs, the process is not consuming CPU.
However it is possible the 2 issues might have the same root cause. I
will try out the patch and see what happens.

Thanks

-Patrick
>
> Regards,
>
> Fred.



Re: haproxy consuming 100% cpu - epoll loop

2017-05-18 Thread Patrick Hemmer

On 2017/1/17 17:02, Willy Tarreau wrote:
> Hi Patrick,
>
> On Tue, Jan 17, 2017 at 02:33:44AM +, Patrick Hemmer wrote:
>> So on one of my local development machines haproxy started pegging the
>> CPU at 100%
>> `strace -T` on the process just shows:
>>
>> ...
>> epoll_wait(0, {}, 200, 0)   = 0 <0.03>
>> epoll_wait(0, {}, 200, 0)   = 0 <0.03>
>> epoll_wait(0, {}, 200, 0)   = 0 <0.03>
>> epoll_wait(0, {}, 200, 0)   = 0 <0.03>
>> epoll_wait(0, {}, 200, 0)   = 0 <0.03>
>> epoll_wait(0, {}, 200, 0)   = 0 <0.03>
>> ...
> Hmm not good.
>
>> Opening it up with gdb, the backtrace shows:
>>
>> (gdb) bt
>> #0  0x7f4d18ba82a3 in __epoll_wait_nocancel () from /lib64/libc.so.6
>> #1  0x7f4d1a570ebc in _do_poll (p=<optimized out>, exp=-1440976915)
>> at src/ev_epoll.c:125
>> #2  0x7f4d1a4d3098 in run_poll_loop () at src/haproxy.c:1737
>> #3  0x7f4d1a4cf2c0 in main (argc=<optimized out>, argv=<optimized out>) at src/haproxy.c:2097
> Ok so an event is not being processed correctly.
>
>> This is haproxy 1.7.0 on CentOS/7
> Ah, that could be a clue. We've had 2 or 3 very ugly bugs in 1.7.0
> and 1.7.1. One of them is responsible for the few outages on haproxy.org
> (last one happened today, I left it running to get the core to confirm).
> One of them is an issue with the condition to wake up an applet when it
> failed to get a buffer first and it could be what you're seeing. The
> other ones could possibly cause some memory corruption resulting in
> anything.
>
> Thus I'd strongly urge you to update this one to 1.7.2 (which I'm going
> to do on haproxy.org now that I could get a core). Continue to monitor
> it but I'd feel much safer after this update.
>
> Thanks for your report!
> Willy
>
So I just had this issue recur, this time on version 1.7.2.

-Patrick


haproxy doesn't restart after segfault on systemd

2017-05-18 Thread Patrick Hemmer
So we had an incident today where haproxy segfaulted and our site went
down. Unfortunately we did not capture a core, and the segfault message
logged to dmesg just showed it inside libc. So there's likely not much
we can do here. We'll be making changes to ensure we capture a core in
the future.

However the issue I am reporting that is reproducible (on version 1.7.5)
is that haproxy did not auto restart, which would have minimized the
downtime to the site. We use nbproc > 1, so we have multiple haproxy
processes running, and when one of them dies, neither the
"haproxy-master" process nor the "haproxy-systemd-wrapper" process exits,
which prevents systemd from starting the service back up.

While I think this behavior (exiting so systemd restarts the whole
service) would be fine, a possible alternative would be for the
"haproxy-master" process to restart the dead worker without having to
kill all the other processes.

Another possible action would be to leave the workers running, but
signal them to stop accepting new connections, and then let the
"haproxy-master" exit so systemd will restart it.

But in any case, I think we need some way of handling this so that site
interruption is minimal.

-Patrick
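For reference, systemd can only take restart action when the unit's main process actually exits, which is why the wrapper staying alive defeats the restart logic. A drop-in like the following (path and values are illustrative, not an official fix) is what one would normally rely on:

```ini
# /etc/systemd/system/haproxy.service.d/restart.conf (hypothetical drop-in)
# Only effective if haproxy-systemd-wrapper actually exits on failure,
# which is exactly what does not happen in the scenario described above.
[Service]
Restart=on-failure
RestartSec=1s
```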


Re: HAProxy won't shut down

2017-05-30 Thread Patrick Hemmer


On 2017/5/29 16:04, Frederic Lecaille wrote:
> On 05/29/2017 06:12 PM, Patrick Hemmer wrote:
>>
>> On 2017/5/29 08:22, Frederic Lecaille wrote:
>>>
>>> Hi Patrick,
>>>
>>> First thank you for this nice and helpful report.
>>>
>>> Would it be possible to have an output of this command the next time
>>> you reproduce such an issue please?
>>>
>>> echo "show sess" | socat stdio 
>>
>> Unfortunately this would not be possible. When the issue occurs, the
>> haproxy process has stopped accepting connections on all sockets. If I
>> were to run this command, it would be sent to the new process, not the
>> one that won't shut down.
>
>
> If you send a SIGHUP to haproxy-systemd-wrapper it asks the old
> process to gracefully stop.
Yes, that is what my issue report is about. When sent a SIGHUP, the new
process comes up, but the old process won't shut down.

>
> Please have a look to this documentation:
>
> https://cbonte.github.io/haproxy-dconv/1.7/management.html#4
>
> So you are right: if everything goes well, no more connections are
> accept()'ed by the old process (the sockets have been unbound). But in
> your reported case the peer sockets are not closed because they are
> still in CLOSE_WAIT state, so they are still being processed, and stats
> information is still available from the stats socket.
The information might still be tracked within the process, but there is
no way to query the information because the process is no longer
accepting new connections. The new process has taken over control of the
admin socket.

>
> If I have missed something please do not hesitate to yell at me ;) .
>
> I have been told that "show sess *all*" give more information.
>
>>>
>>> I have only one question (see below).
>>>
>>> On 05/24/2017 10:40 AM, Willy Tarreau wrote:
>>>> Hi Patrick,
>>>>
>>>> On Tue, May 23, 2017 at 01:49:42PM -0400, Patrick Hemmer wrote:
>>>> (...)
>>>>> haproxy 28856 root  1u  IPv4  420797940  0t0  TCP 10.0.33.145:35754->10.0.33.147:1029 (CLOSE_WAIT)
>>>>> haproxy 28856 root  2u  IPv4  420266351  0t0  TCP 10.0.33.145:52898->10.0.33.147:1029 (CLOSE_WAIT)
>>>>> haproxy 28856 root  3r  REG  0,30  4026531956 net
>>>>> haproxy 28856 root  4u  IPv4  422150834  0t0  TCP 10.0.33.145:38874->10.0.33.147:1029 (CLOSE_WAIT)
>>>>
>>>> These ones are very interesting.
>>>
>>> These traces also seem interesting to me.
>>>
>>> # strace -p 28856
>>> Process 28856 attached
>>> epoll_wait(0, {}, 200, 319) = 0
>>> epoll_wait(0, {}, 200, 0)   = 0
>>> epoll_wait(0, {}, 200, 362) = 0
>>> epoll_wait(0, {}, 200, 0)   = 0
>>> epoll_wait(0, {}, 200, 114) = 0
>>> epoll_wait(0, {}, 200, 0)   = 0
>>> epoll_wait(0, {}, 200, 203) = 0
>>> epoll_wait(0, {}, 200, 0)   = 0
>>> epoll_wait(0, {}, 200, 331) = 0
>>> epoll_wait(0, {}, 200, 0)
>>>
>>>
>>> Were such "epoll_wait(0, 0, 200, 0)" calls displayed indefinitely?
>> Yes
>>
>>>
>>>
>>> In fact I am wondering if it is normal to have so many epoll_wait(0,
>>> {}, 200, 0) calls for a haproxy process which has shut down.
>>>
>>> I suspect they are in relation with peer tasks (which have obviously
>>> expired).
>>>
>>> If this is the case, and with configurations with only peer tasks,
>>> haproxy would definitely hang, consuming a lot of CPU resources.
>> HAProxy was not consuming high CPU. Note that in every other call to
>> `epoll_wait`, the 4th value was >0. If every single timeout value were
>> 0, then yes, it would spin consuming CPU.
>>
>
> agreed... but perhaps your configuration does not use only peer tasks,
> contrary to my configuration... it is your traces which led me to
> check how the peer task expiration is handled with configurations with
> only peers as backends.
>
> In my case with only two peers I see such following traces, after a
> peer has sent a synchronization request:
>
> epoll_wait(0, {}, 200, 1000)
> epoll_wait(0, {}, 200, 1000)
> epoll_wait(0, {}, 200, 1000)
> epoll_wait(0, {}, 200, 1000)
> epoll_wait(0, {}, 200, X)    # with X < 1000
>
> followed by a big loop of
>
>

Re: Logging SSL pre-master-key

2017-06-16 Thread Patrick Hemmer


On 2017/6/16 09:34, Willy Tarreau wrote:
> Hi Patrick,
>
> On Mon, Jun 12, 2017 at 07:31:36PM -0400, Patrick Hemmer wrote:
>> I patched my haproxy to add a ssl_fc_session_key fetch, and with the
>> value I was able to decrypt my test sessions encrypted with
>> TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256.
>>
>> Since the implementation was fairly easy, I've included a patch for it.
>> But I've never submitted anything before, so there's a good chance of
>> something being wrong.
> No problem, that's what public review is made for. BTW at first glance
> your patch looks clean ;-)
>
>> The only thing is that the function to do the extraction was added in
>> 1.1.0
>> (https://github.com/openssl/openssl/commit/858618e7e037559b75b0bfca4d30440f9515b888)
>> The underlying vars are still there, and when I looked have been there
>> since as early as I could find (going back to 1998). But I'm not sure
>> how you feel about extracting the values without the helper function.
> I'd then suggest to proceed differently (if that's OK for you), which
> is to only expose this sample fetch function in 1.1.0 and above. If
> you're fine with running on 1.1 you won't feel any difference. Others
> who don't need this sample fetch right now will not feel any risk of
> build problem.
The main reason I had for supporting the older code is that it seems
many (most?) linux distros, such as the one we use (CentOS/7), still
ship with 1.0.1 or 1.0.2. However since this is a minor change and a
feature enhancement, I doubt this will get backported to 1.7, meaning
we'll have to manually patch it into the version we use. And since we're
doing that, we can just use the patch that supports older OpenSSL.

Anyway, here is the updated patch with the support for <1.1.0 dropped,
as well as the BoringSSL support Emmanuel requested.

One other minor thing I was unsure about was the fetch name. Currently I
have it as "ssl_fc_session_key", but "ssl_fc_session_master_key" might
be more accurate. However this is rather verbose and would make it far
longer than any other sample fetch name, and I was unsure if there were
rules around the naming.

-Patrick
From 217f02d6cd39cb0f40cae74098acf3b586442194 Mon Sep 17 00:00:00 2001
From: Patrick Hemmer <hapr...@stormcloud9.net>
Date: Mon, 12 Jun 2017 18:03:48 -0400
Subject: [PATCH] MINOR: ssl: add fetch 'ssl_fc_session_key' and
 'ssl_bc_session_key'

These fetches return the SSL master key of the front/back connection.
This is useful to decrypt traffic encrypted with ephemeral ciphers.
---
 doc/configuration.txt | 10 ++
 src/ssl_sock.c        | 36 
 2 files changed, 46 insertions(+)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 49bfd85..3b9d96e 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -13930,6 +13930,11 @@ ssl_bc_session_id : binary
   made over an SSL/TLS transport layer. It is useful to log if we want to know
   if session was reused or not.
 
+ssl_bc_session_key : binary
+  Returns the SSL session master key of the back connection when the outgoing
+  connection was made over an SSL/TLS transport layer. It is useful to decrypt
+  traffic sent using ephemeral ciphers.
+
 ssl_bc_use_keysize : integer
   Returns the symmetric cipher key size used in bits when the outgoing
   connection was made over an SSL/TLS transport layer.
@@ -14185,6 +14190,11 @@ ssl_fc_session_id : binary
   a server. It is important to note that some browsers refresh their session ID
   every few minutes.
 
+ssl_fc_session_key : binary
+  Returns the SSL session master key of the front connection when the incoming
+  connection was made over an SSL/TLS transport layer. It is useful to decrypt
+  traffic sent using ephemeral ciphers.
+
 ssl_fc_sni : string
   This extracts the Server Name Indication TLS extension (SNI) field from an
   incoming connection made via an SSL/TLS transport layer and locally
diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 3680515..73fbc31 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -6170,6 +6170,40 @@ smp_fetch_ssl_fc_session_id(const struct arg *args, struct sample *smp, const char *kw, void *private)
 }
 
 static int
+smp_fetch_ssl_fc_session_key(const struct arg *args, struct sample *smp, const char *kw, void *private)
+{
+#if OPENSSL_VERSION_NUMBER >= 0x10100000L || defined(OPENSSL_IS_BORINGSSL)
+   struct connection *conn = objt_conn((kw[4] != 'b') ? smp->sess->origin :
+   smp->strm ? smp->strm->si[1].end : NULL);
+
+   SSL_SESSION *ssl_sess;
+   int data_len;
+   struct chunk *data;
+
+   if (!conn || !conn->xprt_ctx || conn->xprt != &ssl_sock)
+   return 0;
+
+   ssl_sess = SSL_get_session(conn->xprt_ctx);
+   if (!ssl_sess)
+   return 0;
+
+   data = get_trash_chunk();

clarification on documentation for sticky counter tracking duration

2017-05-04 Thread Patrick Hemmer
I'm looking to get some clarification around the documentation for the
duration of sticky counter tracking. There are 2 specific points I'm a
little confused on.

1. Under the documentation for `tcp-request content`, it says:
> In case of HTTP keep-alive with the client, all tcp-request content
> rules are evaluated again, so haproxy keeps a record of what sticky
> counters were assigned by a "tcp-request connection" versus a
> "tcp-request content" rule, and flushes all the content-related ones
> after processing an HTTP request

This sounds like if I were to use `tcp-request content track-sc0 src`,
that after the http request is finished processing, the sticky counter
for the source IP will be stop being tracked. However this does not
appear to be the case as when I log sc0_conn_cur, it is counting clients
that are sitting idle between http requests (using http keep-alive).

2. Under the documentation for `http-request track-sc0`, it says:
> Once a "track-sc*" rule is executed, the key is looked up in the table
> and if it is not found, an entry is allocated for it. Then a pointer to
> that entry is kept during all the session's life

This sounds like the sticky counter is tracked for the duration of the
http session: "entry is kept during all the **session's** life". However
this does not seem to be the case as when I log sc0_conn_cur, clients
that are sitting idle between http requests are not counted.


Am I interpreting the documentation incorrectly, or is the documentation
incorrect?


-Patrick
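For context, a minimal config sketch of the two tracking variants whose lifetimes are being compared (table name, sizes and log format are illustrative, not from a real deployment):

```
# Sketch only: the two rule types whose tracking lifetimes differ.
frontend fe_http
    bind :8080
    stick-table type ip size 100k expire 30m store conn_cur,http_req_rate(10s)
    # content rule: re-evaluated on each request in a keep-alive session
    tcp-request content track-sc0 src
    # alternative: tracking assigned at the HTTP layer
    # http-request track-sc1 src
    log-format "%ci sc0_conn_cur=%[sc0_conn_cur]"
```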

