Re: [PATCH] MINOR: dns: support advertising UDP message size.

2016-12-14 Thread Conrad Hoffmann
Hello Willy, Baptiste,

sorry to revive this very old thread, but I was wondering if there is still
interest in this, now that the DNS subsystem has been refactored?

I was looking at implementing SRV records and noticed that the default UDP
message size of 512 has even become a build-time constant now, so the patch
would need some additional work, but I'd be happy to do so if you'd still
be open to merging it.

As another point of interest, many of our production systems' DNS responses
don't fit into a UDP packet anymore, so we have to rely on TCP fallback a
lot. Would support for such a thing have any chance of getting merged? Or
would there be fundamental concerns with that approach? It could of course
be made opt-in only, for example...
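
For illustration, here is roughly what such a knob could look like in a
resolvers section (a sketch; the directive name accepted_payload_size is
borrowed from what eventually landed in HAProxy's resolvers section, and
the section name, nameserver address and value are all illustrative):

  resolvers mydns
    nameserver dns1 192.168.0.1:53
    # advertise a larger UDP payload size via EDNS0 instead of the
    # hard-coded 512-byte default (value is illustrative)
    accepted_payload_size 8192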

Thanks a lot,
Conrad


On 07/03/2016 09:05 PM, Baptiste wrote:
>> It's very nice having support for EDNS0, but IMHO it shouldn't be
>> enabled by default if it doesn't fallback.
> 
> Hi Remi,
> 
> My intention was to not enable this feature by default.
> 
> Baptiste
> 

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: Admin socket ACL's

2017-01-26 Thread Conrad Hoffmann


Hi Alex,

you can totally do something like that with some preparations. We use the
following setup:

1. We have a "tarpit" backend:

backend tarpit
  timeout tarpit 30s
  reqtarpit

2. In our config, we have a permanent rules like this:

acl blacklisted-cidr req.hdr_ip(X-Real-IP) -f tarpit-cidr.lst
use_backend tarpit if blacklisted-cidr

The file tarpit-cidr.lst can contain a list of IP addresses to block;
ours, however, is usually empty. Instead, we apply blocks via the admin
socket by running an admin socket command like:

add acl tarpit-cidr.lst <ip-or-cidr>
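
For example, sent through the socket with socat (a sketch; the socket path
and the address are placeholders for whatever your setup uses):

$ echo 'add acl tarpit-cidr.lst 192.0.2.0/24' | socat - /var/run/haproxy.sock
$ echo 'show acl tarpit-cidr.lst' | socat - /var/run/haproxy.sock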


Note that this might be a little different from what you require. First, if
you want to block right away, you might want to use something other than a
tarpit as the backend, though using a dedicated backend probably makes sense
for you as well: it gives you stats and everything.

Also, we look at the X-Real-IP header because we get the traffic from our
CDN (which we can trust to set that header). You might want to change that
to look at `src` like in your example. But otherwise, it should work the same.

Also, please note that when using the admin interface to add IPs to the
ACL, these will get lost during a restart (unless you also write them to
the file itself).

Hope that helps,
Conrad

On 01/25/2017 07:06 PM, Alexey Zilber wrote:
> Hi All,
> 
>  Is there a way to do something like this from the admin socket:
> 
> acl bad_ip src 184.66.248.33
> 
> tcp-request connection reject if bad_ip
> 
> 
> Thanks!
> 
> Alex
> 

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: [RFC][PATCHES] seamless reload

2017-04-12 Thread Conrad Hoffmann
Hi Olivier,

I was very eager to try out your patch set, thanks a lot! However, after
applying all of them (including the last three), it seems that using
environment variables in the config is broken (i.e. ${VARNAME} does not get
replaced with the value of the environment variable anymore). I am using
the systemd-wrapper, but not systemd.

Just from looking at the patches I couldn't make out what exactly might be
the problem. I guess it could be either the wrapper not passing on its
environment properly or haproxy not using it properly anymore. If you have
any immediate ideas, let me know. I'll try to fully debug this when I can
spare the time.

Thanks a lot,
Conrad

On 04/10/2017 08:09 PM, Olivier Houchard wrote:
> 
> Hi,
> 
> On top of those patches, here a 3 more patches.
> The first one makes the systemd wrapper check for a HAPROXY_STATS_SOCKET
> environment variable. If set, it will use that as an argument to -x, when
> reloading the process.
> The second one sends listening unix sockets, as well as IPv4/v6 sockets.
> I see no reason not to, and that means we no longer have to wait until
> the old process closes the socket before being able to accept new connections
> on it.
> The third one adds a new global option, nosockettransfer: if set, we assume
> we will never try to transfer listening sockets through the stats socket,
> and close any socket not bound to our process, to save a few file
> descriptors.
> 
> Regards,
> 
> Olivier
> 

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: [RFC][PATCHES] seamless reload

2017-04-12 Thread Conrad Hoffmann
On 04/12/2017 03:37 PM, Olivier Houchard wrote:
> On Wed, Apr 12, 2017 at 03:16:31PM +0200, Conrad Hoffmann wrote:
>> Hi Olivier,
>>
>> I was very eager to try out your patch set, thanks a lot! However, after
>> applying all of them (including the last three), it seems that using
>> environment variables in the config is broken (i.e. ${VARNAME} does not get
>> replaced with the value of the environment variable anymore). I am using
>> the systemd-wrapper, but not systemd.
>>
>> Just from looking at the patches I couldn't make out what exactly might be
>> the problem. I guess it could be either the wrapper not passing on its
>> environment properly or haproxy not using it properly anymore. If you have
>> any immediate ideas, let me know. I'll try to fully debug this when I can
>> spare the time.
>>
>> Thanks a lot,
>> Conrad
>>
> 
> Hi Conrad,
> 
> You're right, that was just me being an idiot :)
> Please replace 0007-MINOR-systemd-wrapper-add-support-for-passing-the-x-.patch
> with the one attached, or just replace in src/haproxy-systemd-wrapper.c :
> 
> argv = calloc(4 + main_argc + nb_pid + 1 +
>   stats_socket != NULL ? 2 : 0, sizeof(char *));
> by :
> argv = calloc(4 + main_argc + nb_pid + 1 +
>   (stats_socket != NULL ? 2 : 0), sizeof(char *));
> 
> Regards,
> 
> Olivier


Works indeed, hard to spot :)

Thanks a lot, I'll now get back to testing the actual reloads!

Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: [RFC][PATCHES] seamless reload

2017-04-12 Thread Conrad Hoffmann
Hi again,

so I tried to get this to work, but didn't manage yet. I also don't quite
understand how this is supposed to work. The first haproxy process is
started _without_ the -x option, is that correct? Where does that instance
ever create the socket for transfer to later instances?

I have it working now insofar as, on reload, subsequent instances are
spawned with the -x option, but they'll just complain that they can't get
anything from the unix socket (because, for all I can tell, it's not
there?). I also can't see the relevant code path where this socket gets
created, but I didn't have time to read all of it yet.

Am I doing something wrong? Did anyone get this to work with the
systemd-wrapper so far?

Also, but this might be a coincidence, my test setup takes a huge
performance penalty just by applying your patches (without any reloading
whatsoever). Did this happen to anybody else? I'll send some numbers and
more details tomorrow.

Thanks a lot,
Conrad

On 04/12/2017 03:47 PM, Conrad Hoffmann wrote:
> On 04/12/2017 03:37 PM, Olivier Houchard wrote:
>> On Wed, Apr 12, 2017 at 03:16:31PM +0200, Conrad Hoffmann wrote:
>>> Hi Olivier,
>>>
>>> I was very eager to try out your patch set, thanks a lot! However, after
>>> applying all of them (including the last three), it seems that using
>>> environment variables in the config is broken (i.e. ${VARNAME} does not get
>>> replaced with the value of the environment variable anymore). I am using
>>> the systemd-wrapper, but not systemd.
>>>
>>> Just from looking at the patches I couldn't make out what exactly might be
>>> the problem. I guess it could be either the wrapper not passing on its
>>> environment properly or haproxy not using it properly anymore. If you have
>>> any immediate ideas, let me know. I'll try to fully debug this when I can
>>> spare the time.
>>>
>>> Thanks a lot,
>>> Conrad
>>>
>>
>> Hi Conrad,
>>
>> You're right, that was just me being an idiot :)
>> Please replace 
>> 0007-MINOR-systemd-wrapper-add-support-for-passing-the-x-.patch
>> with the one attached, or just replace in src/haproxy-systemd-wrapper.c :
>>
>> argv = calloc(4 + main_argc + nb_pid + 1 +
>>  stats_socket != NULL ? 2 : 0, sizeof(char *));
>> by :
>> argv = calloc(4 + main_argc + nb_pid + 1 +
>>  (stats_socket != NULL ? 2 : 0), sizeof(char *));
>>
>> Regards,
>>
>> Olivier
> 
> 
> Works indeed, hard to spot :)
> 
> Thanks a lot, I'll now get back to testing the actual reloads!
> 
> Conrad
> 

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: [RFC][PATCHES] seamless reload

2017-04-13 Thread Conrad Hoffmann
Hi Olivier,

On 04/12/2017 06:09 PM, Olivier Houchard wrote:
> On Wed, Apr 12, 2017 at 05:50:54PM +0200, Olivier Houchard wrote:
>> On Wed, Apr 12, 2017 at 05:30:17PM +0200, Conrad Hoffmann wrote:
>>> Hi again,
>>>
>>> so I tried to get this to work, but didn't manage yet. I also don't quite
>>> understand how this is supposed to work. The first haproxy process is
>>> started _without_ the -x option, is that correct? Where does that instance
>>> ever create the socket for transfer to later instances?
>>>
>>> I have it working now insofar that on reload, subsequent instances are
>>> spawned with the -x option, but they'll just complain that they can't get
>>> anything from the unix socket (because, for all I can tell, it's not
>>> there?). I also can't see the relevant code path where this socket gets
>>> created, but I didn't have time to read all of it yet.
>>>
>>> Am I doing something wrong? Did anyone get this to work with the
>>> systemd-wrapper so far?
>>>
>>> Also, but this might be a coincidence, my test setup takes a huge
>>> performance penalty just by applying your patches (without any reloading
>>> whatsoever). Did this happen to anybody else? I'll send some numbers and
>>> more details tomorrow.
>>>
>>
>> Ok I can confirm the performance issues, I'm investigating.
>>
> 
> Found it, I was messing with SO_LINGER when I shouldn't have been.



thanks a lot, I can confirm that the performance regression seems to be gone!

I am still having the other (conceptual) problem, though. Sorry if this is
just me holding it wrong or something, it's been a while since I dug
through the internals of haproxy.

So, as I mentioned before, we use nbproc (12) and the systemd-wrapper,
which in turn starts haproxy in daemon mode, giving us a process tree like
this (path and file names shortened for brevity):

\_ /u/s/haproxy-systemd-wrapper -f ./hap.cfg -p /v/r/hap.pid
   \_ /u/s/haproxy-master
      \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
      \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
      \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
      \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
      \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
      \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
      \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
      \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
      \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
      \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
      \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
      \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds

Now, in our config file, we have something like this:

# expose admin socket for each process
  stats socket ${STATS_ADDR}   level admin process 1
  stats socket ${STATS_ADDR}-2 level admin process 2
  stats socket ${STATS_ADDR}-3 level admin process 3
  stats socket ${STATS_ADDR}-4 level admin process 4
  stats socket ${STATS_ADDR}-5 level admin process 5
  stats socket ${STATS_ADDR}-6 level admin process 6
  stats socket ${STATS_ADDR}-7 level admin process 7
  stats socket ${STATS_ADDR}-8 level admin process 8
  stats socket ${STATS_ADDR}-9 level admin process 9
  stats socket ${STATS_ADDR}-10 level admin process 10
  stats socket ${STATS_ADDR}-11 level admin process 11
  stats socket ${STATS_ADDR}-12 level admin process 12

Basically, we have a dedicated admin socket for each ("real") process, as we
need to be able to talk to each process individually. So I was wondering:
which admin socket should I pass as HAPROXY_STATS_SOCKET? I initially
thought it would have to be a special stats socket in the haproxy-master
process (which we currently don't have), but as far as I can tell from the
output of `lsof` the haproxy-master process doesn't even hold any FDs
anymore. Will this setup currently work with your patches at all? Do I need
to add a stats socket to the master process? Or would this require a list
of stats sockets to be passed, similar to the list of PIDs that gets passed
to new haproxy instances, so that each process can talk to the one from
which it is taking over the socket(s)? In case I need a stats socket for
the master process, what would be the directive to create it?

Thanks a lot,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: [RFC][PATCHES] seamless reload

2017-04-13 Thread Conrad Hoffmann
On 04/13/2017 11:31 AM, Olivier Houchard wrote:
> On Thu, Apr 13, 2017 at 11:17:45AM +0200, Conrad Hoffmann wrote:
>> Hi Olivier,
>>
>> On 04/12/2017 06:09 PM, Olivier Houchard wrote:
>>> On Wed, Apr 12, 2017 at 05:50:54PM +0200, Olivier Houchard wrote:
>>>> On Wed, Apr 12, 2017 at 05:30:17PM +0200, Conrad Hoffmann wrote:
>>>>> Hi again,
>>>>>
>>>>> so I tried to get this to work, but didn't manage yet. I also don't quite
>>>>> understand how this is supposed to work. The first haproxy process is
>>>>> started _without_ the -x option, is that correct? Where does that instance
>>>>> ever create the socket for transfer to later instances?
>>>>>
>>>>> I have it working now insofar that on reload, subsequent instances are
>>>>> spawned with the -x option, but they'll just complain that they can't get
>>>>> anything from the unix socket (because, for all I can tell, it's not
>>>>> there?). I also can't see the relevant code path where this socket gets
>>>>> created, but I didn't have time to read all of it yet.
>>>>>
>>>>> Am I doing something wrong? Did anyone get this to work with the
>>>>> systemd-wrapper so far?
>>>>>
>>>>> Also, but this might be a coincidence, my test setup takes a huge
>>>>> performance penalty just by applying your patches (without any reloading
>>>>> whatsoever). Did this happen to anybody else? I'll send some numbers and
>>>>> more details tomorrow.
>>>>>
>>>>
>>>> Ok I can confirm the performance issues, I'm investigating.
>>>>
>>>
>>> Found it, I was messing with SO_LINGER when I shouldn't have been.
>>
>> 
>>
>> thanks a lot, I can confirm that the performance regression seems to be gone!
>>
>> I am still having the other (conceptual) problem, though. Sorry if this is
>> just me holding it wrong or something, it's been a while since I dug
>> through the internals of haproxy.
>>
>> So, as I mentioned before, we use nbproc (12) and the systemd-wrapper,
>> which in turn starts haproxy in daemon mode, giving us a process tree like
>> this (path and file names shortened for brevity):
>>
>> \_ /u/s/haproxy-systemd-wrapper -f ./hap.cfg -p /v/r/hap.pid
>> \_ /u/s/haproxy-master
>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>
>> Now, in our config file, we have something like this:
>>
>> # expose admin socket for each process
>>   stats socket ${STATS_ADDR}   level admin process 1
>>   stats socket ${STATS_ADDR}-2 level admin process 2
>>   stats socket ${STATS_ADDR}-3 level admin process 3
>>   stats socket ${STATS_ADDR}-4 level admin process 4
>>   stats socket ${STATS_ADDR}-5 level admin process 5
>>   stats socket ${STATS_ADDR}-6 level admin process 6
>>   stats socket ${STATS_ADDR}-7 level admin process 7
>>   stats socket ${STATS_ADDR}-8 level admin process 8
>>   stats socket ${STATS_ADDR}-9 level admin process 9
>>   stats socket ${STATS_ADDR}-10 level admin process 10
>>   stats socket ${STATS_ADDR}-11 level admin process 11
>>   stats socket ${STATS_ADDR}-12 level admin process 12
>>
>> Basically, we have a dedicate admin socket for each ("real") process, as we
>> need to be able to talk to each process individually. So I was wondering:
>> which admin socket should I pass as HAPROXY_STATS_SOCKET? I initially
>> thought it would have to be a special stats socket in the haproxy-master
>> process (which we currently don't have), but as far as I can tell from the
>> output of `lsof` the haproxy-master process doesn't even hold any FDs
>> anymore. Will this setup currently work with your patches at all? Do I need
>> to add a stats so

Re: [RFC][PATCHES] seamless reload

2017-04-13 Thread Conrad Hoffmann


On 04/13/2017 02:28 PM, Olivier Houchard wrote:
> On Thu, Apr 13, 2017 at 12:59:38PM +0200, Conrad Hoffmann wrote:
>> On 04/13/2017 11:31 AM, Olivier Houchard wrote:
>>> On Thu, Apr 13, 2017 at 11:17:45AM +0200, Conrad Hoffmann wrote:
>>>> Hi Olivier,
>>>>
>>>> On 04/12/2017 06:09 PM, Olivier Houchard wrote:
>>>>> On Wed, Apr 12, 2017 at 05:50:54PM +0200, Olivier Houchard wrote:
>>>>>> On Wed, Apr 12, 2017 at 05:30:17PM +0200, Conrad Hoffmann wrote:
>>>>>>> Hi again,
>>>>>>>
>>>>>>> so I tried to get this to work, but didn't manage yet. I also don't 
>>>>>>> quite
>>>>>>> understand how this is supposed to work. The first haproxy process is
>>>>>>> started _without_ the -x option, is that correct? Where does that 
>>>>>>> instance
>>>>>>> ever create the socket for transfer to later instances?
>>>>>>>
>>>>>>> I have it working now insofar that on reload, subsequent instances are
>>>>>>> spawned with the -x option, but they'll just complain that they can't 
>>>>>>> get
>>>>>>> anything from the unix socket (because, for all I can tell, it's not
>>>>>>> there?). I also can't see the relevant code path where this socket gets
>>>>>>> created, but I didn't have time to read all of it yet.
>>>>>>>
>>>>>>> Am I doing something wrong? Did anyone get this to work with the
>>>>>>> systemd-wrapper so far?
>>>>>>>
>>>>>>> Also, but this might be a coincidence, my test setup takes a huge
>>>>>>> performance penalty just by applying your patches (without any reloading
>>>>>>> whatsoever). Did this happen to anybody else? I'll send some numbers and
>>>>>>> more details tomorrow.
>>>>>>>
>>>>>>
>>>>>> Ok I can confirm the performance issues, I'm investigating.
>>>>>>
>>>>>
>>>>> Found it, I was messing with SO_LINGER when I shouldn't have been.
>>>>
>>>> 
>>>>
>>>> thanks a lot, I can confirm that the performance regression seems to be 
>>>> gone!
>>>>
>>>> I am still having the other (conceptual) problem, though. Sorry if this is
>>>> just me holding it wrong or something, it's been a while since I dug
>>>> through the internals of haproxy.
>>>>
>>>> So, as I mentioned before, we use nbproc (12) and the systemd-wrapper,
>>>> which in turn starts haproxy in daemon mode, giving us a process tree like
>>>> this (path and file names shortened for brevity):
>>>>
>>>> \_ /u/s/haproxy-systemd-wrapper -f ./hap.cfg -p /v/r/hap.pid
>>>> \_ /u/s/haproxy-master
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>
>>>> Now, in our config file, we have something like this:
>>>>
>>>> # expose admin socket for each process
>>>>   stats socket ${STATS_ADDR}   level admin process 1
>>>>   stats socket ${STATS_ADDR}-2 level admin process 2
>>>>   stats socket ${STATS_ADDR}-3 level admin process 3
>>>>   stats socket ${STATS_ADDR}-4 level admin process 4
>>>>   stats socket ${STATS_ADDR}-5 level admin process 5
>>>>   stats socket ${STATS_ADDR}-6 level admin process 6
>>>>   stats socket ${STATS_ADDR}-7 level admin process 7
>>>>   stats socket ${STATS_ADDR}-8 level admin process 8
>>>>   stats socket ${STATS_ADDR}-9 level admin process 9
>>

Re: [RFC][PATCHES] seamless reload

2017-04-13 Thread Conrad Hoffmann


On 04/13/2017 03:50 PM, Olivier Houchard wrote:
> On Thu, Apr 13, 2017 at 03:06:47PM +0200, Conrad Hoffmann wrote:
>>
>>
>> On 04/13/2017 02:28 PM, Olivier Houchard wrote:
>>> On Thu, Apr 13, 2017 at 12:59:38PM +0200, Conrad Hoffmann wrote:
>>>> On 04/13/2017 11:31 AM, Olivier Houchard wrote:
>>>>> On Thu, Apr 13, 2017 at 11:17:45AM +0200, Conrad Hoffmann wrote:
>>>>>> Hi Olivier,
>>>>>>
>>>>>> On 04/12/2017 06:09 PM, Olivier Houchard wrote:
>>>>>>> On Wed, Apr 12, 2017 at 05:50:54PM +0200, Olivier Houchard wrote:
>>>>>>>> On Wed, Apr 12, 2017 at 05:30:17PM +0200, Conrad Hoffmann wrote:
>>>>>>>>> Hi again,
>>>>>>>>>
>>>>>>>>> so I tried to get this to work, but didn't manage yet. I also don't 
>>>>>>>>> quite
>>>>>>>>> understand how this is supposed to work. The first haproxy process is
>>>>>>>>> started _without_ the -x option, is that correct? Where does that 
>>>>>>>>> instance
>>>>>>>>> ever create the socket for transfer to later instances?
>>>>>>>>>
>>>>>>>>> I have it working now insofar that on reload, subsequent instances are
>>>>>>>>> spawned with the -x option, but they'll just complain that they can't 
>>>>>>>>> get
>>>>>>>>> anything from the unix socket (because, for all I can tell, it's not
>>>>>>>>> there?). I also can't see the relevant code path where this socket 
>>>>>>>>> gets
>>>>>>>>> created, but I didn't have time to read all of it yet.
>>>>>>>>>
>>>>>>>>> Am I doing something wrong? Did anyone get this to work with the
>>>>>>>>> systemd-wrapper so far?
>>>>>>>>>
>>>>>>>>> Also, but this might be a coincidence, my test setup takes a huge
>>>>>>>>> performance penalty just by applying your patches (without any 
>>>>>>>>> reloading
>>>>>>>>> whatsoever). Did this happen to anybody else? I'll send some numbers 
>>>>>>>>> and
>>>>>>>>> more details tomorrow.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Ok I can confirm the performance issues, I'm investigating.
>>>>>>>>
>>>>>>>
>>>>>>> Found it, I was messing with SO_LINGER when I shouldn't have been.
>>>>>>
>>>>>> 
>>>>>>
>>>>>> thanks a lot, I can confirm that the performance regression seems to be 
>>>>>> gone!
>>>>>>
>>>>>> I am still having the other (conceptual) problem, though. Sorry if this 
>>>>>> is
>>>>>> just me holding it wrong or something, it's been a while since I dug
>>>>>> through the internals of haproxy.
>>>>>>
>>>>>> So, as I mentioned before, we use nbproc (12) and the systemd-wrapper,
>>>>>> which in turn starts haproxy in daemon mode, giving us a process tree 
>>>>>> like
>>>>>> this (path and file names shortened for brevity):
>>>>>>
>>>>>> \_ /u/s/haproxy-systemd-wrapper -f ./hap.cfg -p /v/r/hap.pid
>>>>>> \_ /u/s/haproxy-master
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./hap.cfg -p /v/r/hap.pid -Ds
>>>>>> \_ /u/s/haproxy -f ./h

Re: [RFC][PATCHES] seamless reload

2017-04-13 Thread Conrad Hoffmann
On 04/13/2017 05:10 PM, Olivier Houchard wrote:
> On Thu, Apr 13, 2017 at 04:59:26PM +0200, Conrad Hoffmann wrote:
>> Sure, here it is ;P
>>
>> I now get a segfault (on reload):
>>
>> *** Error in `/usr/sbin/haproxy': corrupted double-linked list:
>> 0x05b511e0 ***
>>
>> Here is the backtrace, retrieved from the core file:
>>
>> (gdb) bt
>> #0  0x7f4c92801067 in __GI_raise (sig=sig@entry=6) at
>> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
>> #1  0x7f4c92802448 in __GI_abort () at abort.c:89
>> #2  0x7f4c9283f1b4 in __libc_message (do_abort=do_abort@entry=1,
>> fmt=fmt@entry=0x7f4c92934210 "*** Error in `%s': %s: 0x%s ***\n") at
>> ../sysdeps/posix/libc_fatal.c:175
>> #3  0x7f4c9284498e in malloc_printerr (action=1, str=0x7f4c929302ec
>> "corrupted double-linked list", ptr=<optimized out>) at malloc.c:4996
>> #4  0x7f4c92845923 in _int_free (av=0x7f4c92b71620 <main_arena>,
>> p=<optimized out>, have_lock=0) at malloc.c:3996
>> #5  0x00485850 in tcp_find_compatible_fd (l=0xaaed20) at
>> src/proto_tcp.c:812
>> #6  tcp_bind_listener (listener=0xaaed20, errmsg=0x7ffccc774e10 "",
>> errlen=100) at src/proto_tcp.c:878
>> #7  0x00493ce1 in start_proxies (verbose=0) at src/proxy.c:793
>> #8  0x004091ec in main (argc=21, argv=0x7ffccc775168) at
>> src/haproxy.c:1942
> 
> Ok, yet another stupid mistake, hopefully the attached patch fixes this :)
> 
> Thanks !
> 
> Olivier


It does indeed! Not only does it work now, the result is impressive! Not a
single dropped request even when aggressively reloading under substantial load!

You certainly gave me an unexpected early easter present here :)

I will now head out, but I am planning on installing a 1.8 version with
your patches on our canary pool (which receives a small amount of
production traffic to test changes) after the holidays. I will happily test
anything else that might be helpful for you. I will also set up a proper
load test inside our data center then, but I expect no surprises there (my
current tests were done over a VPN link, somewhat limiting the achievable
throughput).

Once more, thank you so much! This will greatly simplify much of our
operations. If there is anything else we can help test, let me know :)

Cheers,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: [RFC][PATCHES] seamless reload

2017-04-19 Thread Conrad Hoffmann


On 04/19/2017 11:22 AM, Olivier Houchard wrote:
[... very long conversation snipped ...]
>> Joining this again a bit late, do you still want me to test it?
>> I would like to know if there is any performance impact, but I see that
>> Conrad Hoffmann has done all the hard work on this. So, we can conclude
>> that we don't expect any performance impact.
>>
>> Once again thanks a lot for your hard work on this.
>> @Conrad Hoffmann, thanks a lot for testing this and checking if there is any
>> performance impact.
>>
> 
> 
> Hi Pavlos,
> 
> More testing is always appreciated :)
> I don't expect any performance impact, but that wouldn't be the first time
> I say that and am proven wrong, though as you said with all the testing
> Conrad did, I'm fairly confident it should be OK.
> 
> Thanks !
> 
> Olivier

I also think more testing is always very welcome, especially as there are
so many different configurations; certain code paths might, for example,
never be executed with the configuration we are running here!

Cheers,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: [PATCHES] 3 patches for DNS SRV records

2017-08-11 Thread Conrad Hoffmann
Hi,

first of all: great to see that this is making progress! I am very excited
about everything related to SRV records and also server-templates. I tested
a fresh master build with these patches applied; here are my observations:

On 08/11/2017 11:10 AM, Baptiste Assmann wrote:
> Hi All
> 
> So, I enabled latest (brilliant) contribution from Olivier into my
> Kubernetes cluster and I discovered it did not work as expected.
> After digging into the issues, I found 3 bugs directly related to the
> way SRV records must be read and processed by HAProxy.
> It was clearly hard to spot them outside a real orchestrator :)
> 
> Please find in attachment 3 patches to fix them.
> 
> Please note that I might have found another bug, which I'll dig into
> later.
> When "scaling in" (reducing an app footprint in kubernetes), HAProxy
> considers some servers (pods in kubernetes) in error "no dns
> resolution". This is normal. What is not normal is that those servers
> never ever come back to life, even when I scale up again.
> Note that thanks to (Salut) Fred's contribution about server-templates
> some time ago, we can do some very cool fancy configurations like the
> one below: (I have a headless service called 'red' in my kubernetes, it
> points to my 'red' application)
> 
> backend red
>   server-template red 20 _http._tcp.red.default.svc.cluster.local:8080
> inter 1s resolvers kube check
> 
> In one line, we can enable automatic "scaling follow-up" in HAProxy.

I tried a very similar setup, like this:

>  resolvers servicediscovery
>nameserver dns1 10.33.60.31:53
>nameserver dns2 10.33.19.32:53
>nameserver dns3 10.33.25.28:53
>
>resolve_retries   3
>timeout retry 1s
>hold valid   10s
>hold obsolete 5s
>
>  backend testbackend
>server-template test 20 http.web.production.:80 check

This is the first time I am testing the server-template keyword at all, but
I immediately noticed that I sometimes get a rather uneven distribution of
pods, e.g. this (with the name resolving to 5 addresses):

> $ echo "show servers state testbackend" | \
>nc localhost 2305 | grep testbackend | \
>awk '{print $5}' | sort | uniq -c
>  7 10.146.112.130
>  6 10.146.148.92
>  3 10.146.172.225
>  4 10.146.89.208

This uses only four of the five servers, with a quite uneven distribution.
Other attempts do use all five servers, but the distribution still
seems pretty uneven most of the time. Is that intentional? Is the list
populated randomly?

Then, nothing changed when I scaled up or down (except the health checks
taking some servers offline), but the addresses were never updated. Is that
the bug you mentioned, or am I doing it wrong?

Also, as more of a side note, we do use SRV records, but not underscores
in the names, which I realize is not very common, but also not exactly
forbidden (as far as I understand the RFC, it's more of a suggestion). It
would be great if this could be indicated in some way in the config, maybe.

And lastly, I know this isn't going to be solved on a Friday afternoon, but
I'll let you know that our infrastructure has reached a scale where DNS
over UDP almost never cuts it anymore (due to the amount of records
returned), and I think many people who are turning to e.g. Kubernetes do so
because they have to operate at such scale, so my guess is this might be
one of the more frequently requested features at some point :)

This is just some "quick" feedback; depending on the time I'll have, I'll
try to take a closer look at a few things and provide more details if possible.

Again, thanks a lot for working on this, let me know if you are interested
in any specific details.

Thanks a lot,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: [PATCHES] 3 patches for DNS SRV records

2017-08-11 Thread Conrad Hoffmann


On 08/11/2017 02:56 PM, Conrad Hoffmann wrote:
> Hi,
> 
> first of all: great to see that this is making progress! I am very excited
> about everything related to SRV records and also server-templates. I tested
> a fresh master build with these patches applied, here are my observations:
> 
> On 08/11/2017 11:10 AM, Baptiste Assmann wrote:
>> Hi All
>>
>> So, I enabled latest (brilliant) contribution from Olivier into my
>> Kubernetes cluster and I discovered it did not work as expected.
>> After digging into the issues, I found 3 bugs directly related to the
>> way SRV records must be read and processed by HAProxy.
>> It was clearly hard to spot them outside a real orchestrator :)
>>
>> Please find in attachment 3 patches to fix them.
>>
>> Please note that I might have found an other bug, that I'll dig into
>> later.
>> When "scalling in" (reducing an app footprint in kubernetes), HAProxy
>> considers some servers (pods in kubernetes) in error "no dns
>> resolution". This is normal. What is not normal is that those servers
>> never ever come back to live, even when I scale up again>
>> Note that thanks to (Salut) Fred contribution about server-templates
>> some time ago, we can do some very cool fancy configurations like the
>> one below: (I have a headless service called 'red' in my kubernetes, it
>> points to my 'red' application)
>>
>> backend red
>>   server-template red 20 _http._tcp.red.default.svc.cluster.local:8080
>> inter 1s resolvers kube check
>>
>> In one line, we can enable automatic "scalling follow-up" in HAProxy.
> 
> I tried a very similar setup, like this:
> 
>>  resolvers servicediscovery
>>nameserver dns1 10.33.60.31:53
>>nameserver dns2 10.33.19.32:53
>>nameserver dns3 10.33.25.28:53
>>
>>resolve_retries   3
>>timeout retry 1s
>>hold valid   10s
>>hold obsolete 5s
>>
>>  backend testbackend
>>server-template test 20 http.web.production.:80 check
> 
> This is the first time I am testing the server-template keyword at all, but
> I immediately noticed that I sometimes get a rather uneven distribution of
> pods, e.g. this (with the name resolving to 5 addresses):
> 
>> $ echo "show servers state testbackend" | \
>>nc localhost 2305 | grep testbackend | \
>>awk '{print $5}' | sort | uniq -c
>>  7 10.146.112.130
>>  6 10.146.148.92
>>  3 10.146.172.225
>>  4 10.146.89.208
> 
> This uses only four of the five servers, with a quite uneven distribution.
> Other attempts do you use all five servers, but the distribution still
> seems pretty uneven most of the time. Is that intentional? Is the list
> populated randomnly?
> 
> Then, nothing changed when I scaled up or down (except the health checks
> taking some serves offline), but the addresses were never updated. Is that
> the bug you mentioned, or am I doing it wrong?

Ok, I am an idiot. I realized I forgot the `resolvers` keyword for the
`server-template` directive. So scaling and DNS updates actually work now,
which is already amazing. However, the distribution thing is still somewhat
of an issue.
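
For reference, the corrected directive (a sketch combining the snippets
above; the hostname and the resolvers-section name are the ones from my
earlier config):

  backend testbackend
    server-template test 20 http.web.production.:80 resolvers servicediscovery check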


> Also, as more of a side node, we do use SRV records, but not underscores
> int the names, which I realize is not very common, but also not exactly
> forbidden (as far as I understand the RFC it's more of a suggestion). Would
> be great if this could be indicated in some way in the config maybe.
> 
> And lastly, I know this isn't going to be solved on a Friday afternoon, but
> I'll let you know that our infrastructure has reached a scale where DNS
> over UDP almost never cuts it anymore (due to the amount of records
> returned), and I think many people who are turning to e.g. Kubernetes do so
> because they have to operate at such scale, so my guess is this might be
> one of the more frequently requested features at some point :)
> 
> These just as "quick" feedback, depending on the time I'll have I'll try to
> take a closer look at a few things and provide more details if possible.
> 
> Again, thanks a lot for working on this, let me know if you are interested
> in any specific details.
> 
> Thanks a lot,
> Conrad
> 

Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: RFC uuid for log-format

2019-09-03 Thread Conrad Hoffmann
Hi,

On Tue, 2019-09-03 at 14:07 +, Schimweg, Luca wrote:
> thanks for mentioning rand, I didn't know about it... With some
> rand()s I was able to generate a UUID, I'll have to do some
> performance checks, but I think it will be fair enough. Then we don't
> need a %uuid or similar from my point of view.

FWIW, we use (among others) %Ts (timestamp), %pid (PID b/c multi-process
setup), %rc (retry count), and %rt (session request counter), which
produces a reasonably unique ID, though our constraint is only for it to
be (mostly) unique across one server. You can also throw in the server
IP or name, of course.
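
As a config sketch (the directives are standard HAProxy ones; the exact
format string and the header name are illustrative):

  # compose a mostly-unique request ID from timestamp, PID, retry count
  # and the per-session request counter
  unique-id-format %Ts-%pid-%rc-%rt
  unique-id-header X-Request-ID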

Cheers,
Conrad


> 
> Thanks,
> Luca
> 
> On 03.09.19, 10:34, "li...@ltri.eu on behalf of Lukas Tribus" <
> li...@ltri.eu> wrote:
> 
> Hello Luca,
> 
> On Tue, 3 Sep 2019 at 09:18, Schimweg, Luca <
> luca.schim...@sap.com> wrote:
> >
> > Hey,
> >
> >
> >
> > for one use case I have, I would need a variable like %uuid in
> > log-formats, which just generates a random UUID. The use-case would be
> > to be able to set the unique-id-format to this uuid, so that we can
> > have a random uuid set as request-id for any incoming request. Right
> > now, it's quite difficult to get a random and unique request-id; the
> > random uuid approach would definitely help with that.
> 
> I assume the rand [1] does not suffice. In this case, I'd suggest to
> use LUA for this, maybe by using some library like lua-resty-jit-uuid
> [2].
> 
> Lukas
> 
> [1] 
> https://cbonte.github.io/haproxy-dconv/1.9/configuration.html#7.3.2-rand
> [2] https://github.com/thibaultcha/lua-resty-jit-uuid
> 
> 
-- 
Conrad Hoffmann
Systems and Traffic Engineering

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Artem Fishman | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG
Charlottenburg  | HRB 110657B  




2.0 regression? Control socket and ACL patterns with spaces

2019-09-04 Thread Conrad Hoffmann
Hi there,

I just started some testing with haproxy 2.0 and noticed something that
I guess could be considered a regression? Not sure, I also may have
missed something in the release announcements or such. 
Anyways: we use the control socket to dynamically update ACLs. With
haproxy 2.0, escaped spaces in ACL patterns are no longer recognized.
Here is an example from a recent 2.0 repo build:

$ ../haproxy-2.0/haproxy -v
HA-Proxy version 2.0.5-4db294-10 2019/08/26 - https://haproxy.org/
$ ../haproxy-2.0/haproxy -f ./min.cfg &
[1] 4128
<5>Proxy http-in started.
$ echo 'show acl tarpit-ua.lst' | socat - /tmp/haproxy.sock

$ echo 'add acl tarpit-ua.lst spaces\ test' | socat - /tmp/haproxy.sock

$ echo 'show acl tarpit-ua.lst' | socat - /tmp/haproxy.sock
0x7f3d3c01f960 spaces

I think you get the idea, but happy to supply more details. Looking at
the code (cli.c), it seems to me that this never got implemented (in
the rewrite?). This worked fine before and I dare say it is a tad
dangerous, because it might silently shorten your regular expressions
to match _a lot_ of user agents :)

More for illustrative purposes than anything else I attached a patch
that fixes the issue for me (against 2.0 repo). However, I have not yet
spent a lot of time with the 2.0 code base, so I'll leave it up to you
to judge the quality thereof. No offense taken if you'd rather
implement this differently. However, also happy to work on it with some
pointers to the right direction.

I hope I am not just missing something really obvious here :)

As usual, thanks for all the fish,
Conrad
-- 
Conrad Hoffmann
Systems and Traffic Engineering

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Artem Fishman | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG
Charlottenburg  | HRB 110657B  
diff --git a/src/cli.c b/src/cli.c
index 9a9f80f9..43550067 100644
--- a/src/cli.c
+++ b/src/cli.c
@@ -533,8 +533,11 @@ static int cli_parse_request(struct appctx *appctx)
 			break;
 
 		args[i] = p;
-		p += strcspn(p, " \t");
-		*p++ = 0;
+		do {
+			/* skip to next unescaped space, also accounting for potential '\\ ' */
+			p += strcspn(p, " \t") + 1;
+		} while ((p > appctx->chunk->area + 2) && *(p-2) == '\\' && *(p-3) != '\\');
+		*(p - 1) = 0;
 
 		/* unescape backslashes (\) */
 		for (j = 0, k = 0; args[i][k]; k++) {


Re: 2.0 regression? Control socket and ACL patterns with spaces

2019-09-05 Thread Conrad Hoffmann
On Wed, 2019-09-04 at 21:41 +0200, Tim Düsterhus wrote:
> Conrad,
> 
> Am 04.09.19 um 10:17 schrieb Conrad Hoffmann:
> > More for illustrative purposes than anything else I attached a
> > patch
> > that fixes the issue for me (against 2.0 repo). However, I have not
> > yet
> > spent a lot of time with the 2.0 code base, so I'll leave it up to
> > you
> > to judge the quality thereof. No offense taken if you'd rather
> > implement this differently. However, also happy to work on it with
> > some
> > pointers to the right direction.
> > 
> > I hope I am not just missing something really obvious here :)
> > 
> 
> I don't use this feature, but I believe your patch is incorrectly
> handling backslashes. Consider this:
> 
> foo\\\ bar
> 
> Then both the char in front of the space and the char in front of
> that
> are backslashes. But the space is an escaped space.

Indeed, thanks for pointing that out! And I briefly felt clever for at
least considering the escaped backslashes :)

This is probably hard to get right without some sort of proper parser.
I'll think about it some more, but also wait for feedback from the
maintainers.

Thanks again,
Conrad

> 
> Best regards
> Tim Düsterhus
-- 
Conrad Hoffmann
Systems and Traffic Engineering

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Artem Fishman | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG
Charlottenburg  | HRB 110657B  




Re: 2.0 regression? Control socket and ACL patterns with spaces

2019-09-11 Thread Conrad Hoffmann
Hey again,

sorry for reviving an old(ish) thread, but I would love to get some
kind of feedback, even if it's just that this will stay this way and I
need to deal with it.

I also tried with the payload syntax in the meantime, but to no avail.
Is there any supported way of adding patterns with spaces to an ACL
over the command socket at all in 2.0?
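
(A sketch of such an attempt, using the CLI's '<<' multi-line payload
syntax with the socket path from my earlier example; this did not help
here:)

$ echo -e 'add acl tarpit-ua.lst <<\nspaces test\n' | socat - /tmp/haproxy.sock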

Thanks a lot,
Conrad

On Wed, 2019-09-04 at 10:17 +0200, Conrad Hoffmann wrote:
> Hi there,
> 
> I just started some testing with haproxy 2.0 and noticed something
> that
> I guess could be considered a regression? Not sure, I also may have
> missed something in the release announcements or such. 
> Anyways: we use the control socket to dynamically update ACLs. With
> haproxy 2.0, escaped spaces in ACL patterns are no longer recognized.
> Here is an example from a recent 2.0 repo build:
> 
> $ ../haproxy-2.0/haproxy -v
> HA-Proxy version 2.0.5-4db294-10 2019/08/26 - https://haproxy.org/
> $ ../haproxy-2.0/haproxy -f ./min.cfg &
> [1] 4128
> <5>Proxy http-in started.
> $ echo 'show acl tarpit-ua.lst' | socat - /tmp/haproxy.sock
> 
> $ echo 'add acl tarpit-ua.lst spaces\ test' | socat -
> /tmp/haproxy.sock
> 
> $ echo 'show acl tarpit-ua.lst' | socat - /tmp/haproxy.sock
> 0x7f3d3c01f960 spaces
> 
> I think you get the idea, but happy to supply more details. Looking
> at
> the code (cli.c), it seems to me that this never got implemented (in
> the rewrite?). This worked fine before and I dare say it is a tad
> dangerous, because it might silently shorten your regular expressions
> to match _a lot_ of user agents :)
> 
> More for illustrative purposes than anything else I attached a patch
> that fixes the issue for me (against 2.0 repo). However, I have not
> yet
> spent a lot of time with the 2.0 code base, so I'll leave it up to
> you
> to judge the quality thereof. No offense taken if you'd rather
> implement this differently. However, also happy to work on it with
> some
> pointers to the right direction.
> 
> I hope I am not just missing something really obvious here :)
> 
> As usual, thanks for all the fish,
> Conrad
-- 
Conrad Hoffmann
Systems and Traffic Engineering

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Artem Fishman | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG
Charlottenburg  | HRB 110657B  




Re: 2.0 regression? Control socket and ACL patterns with spaces

2019-09-12 Thread Conrad Hoffmann
Hi, 

On Thu, 2019-09-12 at 08:12 +0200, Willy Tarreau wrote:
> Hi Conrad,
> 
> On Wed, Sep 11, 2019 at 04:53:18PM +0200, Conrad Hoffmann wrote:
> > Hey again,
> > 
> > sorry for reviving an old(ish) thread, but I would love to get some
> > kind of feedback, even if it's just that this will stay this way
> > and I
> > need to deal with it.
> > 
> > I also tried with the payload syntax in the meantime, but to no
> > avail.
> > Is there any supported way of adding patterns with spaces to an ACL
> > over the command socket at all in 2.0?
> 
> I didn't notice this thread last week, thanks for reviving it. Well
> your report and your attempt at fixing it perfectly illustrate why
> I absolutely never use such functions like strcspn(). The reason is
> that such functions tend to make you write context-unaware parsers.
> It's like looking for a closing parethesis without counting the
> number of opening ones.
> 
> Instead we should use the same tokenizer as the config parser uses,
> it's quite simple and not prone to such issues. Plus we'd even get
> support for quotes and hex characters as a bonus (though we could
> imagine not supporting some of them in the backported version, but
> I'm not sure of the benefit of *not* supporting them).

That's great to hear! I trust your judgement there, and it would
probably be the cleaner solution (as it re-enables the previous
behavior), but I'll just mention that I'd be perfectly fine with using
the payload syntax if that worked for "add acl", in case that's easier
to implement.

Anyways, thanks for picking that up, looking forward to whatever
solution :)

Thanks a lot,
Conrad


> 
> Maybe we should do that before issuing 2.0.6 so that we don't keep
> the CLI broken for too long.
> 
> Willy
-- 
Conrad Hoffmann
Systems and Traffic Engineering

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Artem Fishman | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG
Charlottenburg  | HRB 110657B  




[PATCH] epoll: avoid possible CPU hogging of process waiting for connection termination before restart

2014-05-15 Thread Conrad Hoffmann
Hi everyone,

I am still somewhat new to haproxy, so I may be missing a few bits and
pieces here. If so, don't hesitate to educate me :)

First of all, we are using HAProxy here at SoundCloud, so a big thanks
to everyone who invested time in this wonderful project!

We are very keen on using SSL termination with HAProxy, so we have been
running the 1.5-dev builds on non-critical infrastructure. We kept
running into one problem though. Sorry for this rather long description,
put reproduction is kind of difficult, so I want to mention everything
that might be related.

We use "nbproc 12". To restart HAProxy, we send USR1 to the worker
processes. Usually, as expected, they stop accepting connections, shut
down once all connections are terminated, then the parent process
terminates and gets restarted by the systemd-wrapper (I am aware we
could/should send USR2 to the systemd-wrapper, but I think the use case
is still valid).

If haproxy has been running for a certain amount of time (this part
makes reproduction cumbersome), we quite often run into the following
issue when restarting haproxy: if some of the workers have active
connections that they must wait on to terminate, a fraction of
those would start hogging CPU like crazy. Attaching a debugger showed
that epoll_wait() would immediately return with events that they did not
process because they had already closed the file descriptor (and thus
owner == 0x0). This led me to the epoll(7) man page [1], which in its
Q&A section states for Q6 that just closing a fd may not be sufficient
if the inode referred to by that fd is still open in another process,
which is the case when you fork() with open fds like haproxy does.

The scenario I constructed from this information (and here I am still
missing some knowledge) is that if a process has a connection going on
when the signal hits, it can happen that it closes a file descriptor
that it doesn't have any connections on but then, while polling for
events on the remaining connection, receives events for the closed fd
because that one is still open in another process. Sorry for this being
kind of vague, but I am still trying to understand the relevant code paths.

However, based on the above assumption, I came up with the attached
patch as a remedy. Basically it just implements the poller.clo(fd)
handler for epoll, which is already implemented for other pollers. I
can't say with certainty, but while we had a bug hit rate of at least 80%
before, I have not been able to trigger the bug in 5 attempts with the
patch applied, so I currently think it actually works.

I would love to hear some feedback on this issue, maybe someone can give
insight on the actual code path that might be at hand here, or possibly
any side effects this patch might introduce that I have not thought of, etc.

Thanks a lot and keep being awesome!
Conrad

[1] http://linux.die.net/man/7/epoll
diff --git a/src/ev_epoll.c b/src/ev_epoll.c
index 2849ec6..9a704a3 100644
--- a/src/ev_epoll.c
+++ b/src/ev_epoll.c
@@ -46,6 +46,17 @@ static struct epoll_event ev;
 #endif
 
 /*
+ * Immediately remove file descriptor from epoll set upon close.
+ * Since we forked, some fds share inodes with the other process, and epoll may
+ * send us events even though this process closed the fd (see man 7 epoll,
+ * "Questions and answers", Q 6).
+ */
+REGPRM1 static void __fd_clo(int fd)
+{
+	epoll_ctl(epoll_fd, EPOLL_CTL_DEL, fd, &ev);
+}
+
+/*
  * Linux epoll() poller
  */
 REGPRM2 static void _do_poll(struct poller *p, int exp)
@@ -267,7 +278,7 @@ static void _do_register(void)
 	p->pref = 300;
 	p->private = NULL;
 
-	p->clo  = NULL;
+	p->clo  = __fd_clo;
 	p->test = _do_test;
 	p->init = _do_init;
 	p->term = _do_term;


Re: [PATCH] epoll: avoid possible CPU hogging of process waiting for connection termination before restart

2014-05-16 Thread Conrad Hoffmann
Hi,

I just figured out some of the missing pieces. When they receive USR1, the
workers just close() their listening sockets instead of shutdown()ing them
(like they do for TTOU), causing the file descriptor to stay open in the
parent process, which is why there are still epoll events coming in for it.

Does this make sense?

But I guess the patch remains valid. Or maybe one could use shutdown()
for the USR1 path, too?
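
To illustrate the difference (a minimal sketch; the function names are
mine, not haproxy's):

#include <sys/socket.h>
#include <unistd.h>

/* fd is a listening socket inherited from the parent at fork() time */
static void pause_shared_listener(int fd)
{
	/* shutdown() disables the listening socket itself, for every
	 * process sharing it -- this is what the TTOU path does */
	shutdown(fd, SHUT_RDWR);
}

static void close_shared_listener(int fd)
{
	/* close() only drops this process's reference; the parent's copy
	 * keeps the inode alive, so epoll may keep reporting events for
	 * it (man 7 epoll, Q&A 6) -- this is what the USR1 path does */
	close(fd);
}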

Any input appreciated :)

Cheers,
Conrad


On 05/15/2014 11:26 AM, Conrad Hoffmann wrote:
> Hi everyone,
> 
> I am still somewhat new to haproxy, so I maybe missing a few bits and
> pieces here. If so, don't hesitate to educate me :)
> 
> First of all, we are using HAProxy here at SoundCloud, so a big thanks
> to everyone who invested time in this wonderful project!
> 
> We are very keen on using SSL termination with HAProxy, so we have been
> running the 1.5-dev builds on non-critical infrastructure. We kept
> running into one problem though. Sorry for this rather long description,
> put reproduction is kind of difficult, so I want to mention everything
> that might be related.
> 
> We use "nbproc 12". To restart HAProxy, we send USR1 to the worker
> processes. Usually, as expected, they stop accepting connections, shut
> down once all connections are terminated, then the parent process
> terminates and gets restarted by the systemd-wrapper (I am aware we
> could/should send USR2 to the systemd-wrapper, but I think the use case
> is still valid).
> 
> If haproxy has been running for a certain amount of time (this part
> makes reproduction cumbersome), we quite often run into the following
> issue when restarting haproxy: if some of the workers have active
> connections that they must wait for to be terminated, a fraction of
> those would start hogging CPU like crazy. Attaching a debugger showed
> that epoll_wait() would immediately return with events that they did not
> process because they had already closed the file descriptor (and thus
> owner == 0x0). This led me to the epoll(7) man page [1], which in its
> Q&A section states for Q6 that just closing a fd may not be sufficient
> if the inode referred to by that fd is still open in another process,
> which is the case when you fork() with open fds like haproxy does.
> 
> The scenario I constructed from this information (and here I am still
> missing some knowledge) is that if a processes has a connection going on
> when the signal hits, it can happen that it closes a file descriptor
> that it doesn't have any connections on but then, while polling for
> events on the remaining connection, receives events for the closed fd
> because that one is still open in another process. Sorry for this being
> kind of vague, but I am still trying to understand the relevant code paths.
> 
> However, based on the above assumption, I came up with the attached
> patch as a remedy. Basically it just implements the poller.clo(fd)
> handler for epoll, which is already implemented for other pollers. I
> can't say with certainty, but while we had bug hit rate of at least 80%
> before, I have not been able to trigger the bug in 5 attempts with the
> patch applied, so I currently think it actually works.
> 
> I would love to hear some feedback on this issue, maybe someone can give
> insight on the actual code path that might be at hand here, or possibly
> any side effects this patch might introduce that I have not thought of, etc.
> 
> Thanks a lot and keep being awesome!
> Conrad
> 
> [1] http://linux.die.net/man/7/epoll
> 



Re: [PATCH] epoll: avoid possible CPU hogging of process waiting for connection termination before restart

2014-05-17 Thread Conrad Hoffmann
Hi Willy,

On 05/17/2014 04:39 AM, Willy Tarreau wrote:
> On 05/15/2014 11:26 AM, Conrad Hoffmann wrote:
>> If haproxy has been running for a certain amount of time (this part
>> makes reproduction cumbersome), we quite often run into the following
>> issue when restarting haproxy: if some of the workers have active
>> connections that they must wait for to be terminated, a fraction of
>> those would start hogging CPU like crazy.
> 
> I had already observed that case myself and thought it was gone because
> as you explain it's very hard to reproduce.

After sleeping a night over it, I came up with a sure-fire way to
reproduce it. See below.

>> Attaching a debugger showed
>> that epoll_wait() would immediately return with events that they did not
>> process because they had already closed the file descriptor (and thus
>> owner == 0x0).
> 
> Could you please confirm that it happens with the listening sockets ?

Yes. I ran lsof on all processes before sending USR1, then attached a
debugger to the one gone haywire and checked the fd in the epoll event. It
was one of the original listening sockets.

Here is a nice way to trigger it:
 - run haproxy in daemon mode, but with nbproc 1
 - open a HTTP keep-alive connection (and keep it open)
 - send USR1 to the only worker (it should stay running, because of the
   ongoing connection)
 - open a new connection -> the worker goes to 100%

Works all the time for me. Running lsof before and after the signal
neatly shows how the worker closed the listening socket but the parent
still has it.
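
For anyone who wants to see the man-page behavior in isolation, here is a
minimal, self-contained sketch (illustrative only, not part of the patch;
the port is arbitrary and error handling is mostly omitted):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
	struct sockaddr_in sa;
	int lfd = socket(AF_INET, SOCK_STREAM, 0);

	memset(&sa, 0, sizeof(sa));
	sa.sin_family = AF_INET;
	sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sa.sin_port = htons(8089);
	if (bind(lfd, (struct sockaddr *)&sa, sizeof(sa)) < 0 || listen(lfd, 8) < 0)
		exit(1);

	if (fork() == 0) {
		int ep = epoll_create(1);
		struct epoll_event ev, out;

		memset(&ev, 0, sizeof(ev));
		ev.events = EPOLLIN;
		ev.data.fd = lfd;
		epoll_ctl(ep, EPOLL_CTL_ADD, lfd, &ev);
		close(lfd); /* close WITHOUT an explicit EPOLL_CTL_DEL */
		/* the parent still holds the inherited listener, so the
		 * registration survives: a client connecting to the port
		 * wakes us up with an event for an fd we no longer own */
		if (epoll_wait(ep, &out, 1, -1) > 0)
			printf("child: event on closed fd %d\n", out.data.fd);
		exit(0);
	}
	pause(); /* parent: keep the listener open, never accept() */
	return 0;
}

Compile it, run it, connect once with e.g. "nc 127.0.0.1 8089", and the
child reports an event for the fd it has already closed - which is exactly
the state the busy-looping workers are stuck in.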

>> However, based on the above assumption, I came up with the attached
>> patch as a remedy. Basically it just implements the poller.clo(fd)
>> handler for epoll, which is already implemented for other pollers. I
>> can't say with certainty, but while we had a bug hit rate of at least 80%
>> before, I have not been able to trigger the bug in 5 attempts with the
>> patch applied, so I currently think it actually works.
> 
> I'm absolutely sure that your patch fixes the issue because it removes the
> event before getting rid of the reference to the FD. However I have a problem
> with this approach, it will add an extra syscall to *every* connection, which
> is quite noticeable in a number of environments. Ideally we should extend the
> poller's API to have a "close-shared" function or something like this which
> would be used only for listening FDs that are shared between processes.
> 
> Maybe we could simply have a flag in the fdtab indicating that the fd is
> shared (basically all FDs opened before the fork would have it), and decide
> in poller.clo() to only call epoll_ctl(DEL) when this flag is set, then clear
> it. It would seam reasonable in my opinion. Do you think that you could extend
> your current patch to do that ?

Good point on the performance issue. I whipped up a new patch, I hope
this is what you meant? I surely haven't yet fully understood all the
architecture and design decisions, so feel free to nitpick :)

Also, I wasn't sure if this should go into the UNIX sockets, too? Now
that I am writing this, I actually suppose so. Let me know if I should
add it there, too!

Enjoy your weekend,
Conrad
diff --git a/include/types/fd.h b/include/types/fd.h
index 1c2c7c8..12d5b6b 100644
--- a/include/types/fd.h
+++ b/include/types/fd.h
@@ -86,6 +86,7 @@ struct fdtab {
 	unsigned char new:1; /* 1 if this fd has just been created */
 	unsigned char updated:1; /* 1 if this fd is already in the update list */
 	unsigned char linger_risk:1; /* 1 if we must kill lingering before closing */
+	unsigned char listener:1;/* 1 if a listening socket, requires EPOLL_CTL_DEL on close */
 };
 
 /* less often used information */
diff --git a/src/ev_epoll.c b/src/ev_epoll.c
index 2849ec6..46e5f30 100644
--- a/src/ev_epoll.c
+++ b/src/ev_epoll.c
@@ -46,6 +46,19 @@ static struct epoll_event ev;
 #endif
 
 /*
+ * Immediately remove file descriptor from epoll set upon close.
+ * Since we forked, some fds share inodes with the other process, and epoll may
+ * send us events even though this process closed the fd (see man 7 epoll,
+ * "Questions and answers", Q 6).
+ */
+REGPRM1 static void __fd_clo(int fd)
+{
+	if (unlikely(fdtab[fd].listener)) {
+		epoll_ctl(epoll_fd, EPOLL_CTL_DEL, fd, &ev);
+	}
+}
+
+/*
  * Linux epoll() poller
  */
 REGPRM2 static void _do_poll(struct poller *p, int exp)
@@ -267,7 +280,7 @@ static void _do_register(void)
 	p->pref = 300;
 	p->private = NULL;
 
-	p->clo  = NULL;
+	p->clo  = __fd_clo;
 	p->test = _do_test;
 	p->init = _do_init;
 	p->term = _do_term;
diff --git a/src/proto_tcp.c b/src/proto_tcp.c
index a672de4..8d4acaf 100644
--- a/src/proto_tcp.c
+++ b/src/proto_tcp.c
@@ -882,6 +882,7 @@ int tcp_bind_l

Re: [PATCH] epoll: avoid possible CPU hogging of process waiting for connection termination before restart

2014-05-19 Thread Conrad Hoffmann
Hey Willy,

On 05/19/2014 06:30 AM, Willy Tarreau wrote:
>> Here is a nice way to trigger it:
>>  - run haproxy in daemon mode, but with nbproc 1
>>  - open a HTTP keep-alive connection (and keep it open)
>>  - send USR1 to the only worker (it should stay running, because of the
>>   ongoing connection)
>>  - open a new connection -> the worker goes to 100%
>>
>> Works all the time for me. Running lsof before and after the signal
>> neatly shows how the worker closed the listening socket but the parent
>> still has it.
> 
> I'm just confused with what the "parent" here can be since you have nbproc 1,
> so there's a single process overall. Or maybe you're using the debugger to
> breakpoint the parent process and prevent it from leaving ?
> 
> I was thinking that starting with nbproc 2 with long connections on both
> would cause the issue to happen as well.

I am not sure what the proper terminology is, here. But when you run in
daemon mode with nbproc 1 you get exactly one fork. The output of ps
looks somewhat like this:

18616 ?S  0:00  |   \_ haproxy-systemd-wrapper
18618 ?S  0:00  |   \_ /usr/sbin/haproxy
18619 ?Ss 0:09  |   \_ /usr/sbin/haproxy

I was referring to 18618 as the "parent" process. It never does
anything, just wait()s.

>>> Maybe we could simply have a flag in the fdtab indicating that the fd is
>>> shared (basically all FDs opened before the fork would have it), and decide
>>> in poller.clo() to only call epoll_ctl(DEL) when this flag is set, then
>>> clear it. It would seem reasonable in my opinion. Do you think that you
>>> could extend your current patch to do that?
>>
>> Good point on the performance issue. I whipped up a new patch, I hope
>> this is what you meant? I surely haven't yet fully understood all
>> architecture and design decisions, so feel free to nitpick :)
> 
> Yes, that's about the principle, though I thought about doing it slightly
> differently : in my opinion the problem is not caused by the FD being a
> listener, but by it being cloned across the fork(), even though for now
> there's a 1:1 relation between the two, that could change in the future
> (eg: a shared inter-process socket for stats or whatever).
> 
> I would have proceeded slightly differently :
> 
>   - in fd_insert(), I'd have added :
> 
> fdtab[fd].cloned = 0;
> 
>   - in ev_epoll.c:_do_fork(), I'd have marked all the opened FDs as cloned :
> 
> for (i = 0; i <= maxfd; i++)
>     if (fdtab[i].owner)
>         fdtab[i].cloned = 1;
> 
> That way, we'd be sure that all open FDs are properly tagged, whatever they
> are used for.
> 
>> Also, I wasn't sure if this should go into the UNIX sockets, too? Now
>> that I am writing this, I actually suppose so. Let me know if I should
>> add it there, too!
> 
> One more reason for doing the job in the poller, otherwise when adding
> new protocols, it's easy to forget about this!

Sorry, I read your suggestions somewhat too hastily. You are absolutely
right. I modified the patch accordingly, with one improvement
suggestion, though: I marked the fds as cloned in fork_poller() instead
of ev_epoll's _do_fork(). This way, the flag is set regardless of the
protocol and the poller used, and any other (future?) pollers could still
use it. Seemed more generic, what do you think? I can also put it into
_do_fork() if you prefer that.

Cheers,
Conrad
diff --git a/include/proto/fd.h b/include/proto/fd.h
index 605dc21..b0a478e 100644
--- a/include/proto/fd.h
+++ b/include/proto/fd.h
@@ -335,6 +335,7 @@ static inline void fd_insert(int fd)
 	fdtab[fd].ev = 0;
 	fdtab[fd].new = 1;
 	fdtab[fd].linger_risk = 0;
+	fdtab[fd].cloned = 0;
 	if (fd + 1 > maxfd)
 		maxfd = fd + 1;
 }
diff --git a/include/types/fd.h b/include/types/fd.h
index 1c2c7c8..057d968 100644
--- a/include/types/fd.h
+++ b/include/types/fd.h
@@ -86,6 +86,7 @@ struct fdtab {
 	unsigned char new:1; /* 1 if this fd has just been created */
 	unsigned char updated:1; /* 1 if this fd is already in the update list */
 	unsigned char linger_risk:1; /* 1 if we must kill lingering before closing */
+	unsigned char cloned:1;  /* 1 if a cloned socket, requires EPOLL_CTL_DEL on close */
 };
 
 /* less often used information */
diff --git a/src/ev_epoll.c b/src/ev_epoll.c
index 2849ec6..9d359b2 100644
--- a/src/ev_epoll.c
+++ b/src/ev_epoll.c
@@ -46,6 +46,19 @@ static struct epoll_event ev;
 #endif
 
 /*
+ * Immediately remove file descriptor from epoll set upon close.
+ * Since we forked, some fds share inodes with the other process, and epoll may
+ * send us events even though this process closed the fd (see man 7 epoll,
+ * "Questions and answers", Q 6).
+ */
+REGPRM1 static void __fd_clo(int fd)
+{
+	if (unlikely(fdtab[fd].cloned)) {
+		epoll_ctl(epoll_fd, EPOLL_CTL_DEL, fd, &ev);
+	}
+}
+
+/*
  * Linux epoll() poller
  */
 REGPRM2 static void _do_poll(struct poller *p, int exp)
@@ -267,7 +280,7

Re: [PATCH] epoll: avoid possible CPU hogging of process waiting for connection termination before restart

2014-05-20 Thread Conrad Hoffmann
Hey Willy,

> That's what I initially thought, because I didn't remember there was this
> common fork_poller() function, but what you did is better, each part does
> its own job, that's really clean this way.
> 
> Can you please provide a commit message summarizing the issue you faced
> and the solution, then I'll merge your work ASAP ?

Awesome. I hope the attached commit is acceptable.

Thanks a bunch,
Conrad
From 88770e22b5165a0584d307dc2805e33600b4ac37 Mon Sep 17 00:00:00 2001
From: Conrad Hoffmann 
Date: Tue, 20 May 2014 14:28:24 +0200
Subject: [PATCH] Fix possible CPU hogging of worker processes after receiving
 SIGUSR1.

When run in daemon mode (i.e. with at least one forked process) and using
the epoll poller, sending USR1 (graceful shutdown) to the worker processes
can cause some workers to start running at 100% CPU. Precondition is having
an established HTTP keep-alive connection when the signal is received.

The cloned (during fork) listening sockets do not get closed in the parent
process, thus they do not get removed from the epoll set automatically
(see man 7 epoll). This can lead to the process receiving epoll events
that it doesn't feel responsible for, resulting in an endless loop around
epoll_wait() delivering these events.

The solution is to explicitly remove these file descriptors from the epoll
set. To not degrade performance, care was taken to only do this when
necessary, i.e. when the file descriptor was cloned during fork.

Signed-off-by: Conrad Hoffmann 
---
 include/proto/fd.h |  1 +
 include/types/fd.h |  1 +
 src/ev_epoll.c     | 15 ++++++++++++++-
 src/fd.c           |  7 +++++++
 4 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/include/proto/fd.h b/include/proto/fd.h
index 605dc21..b0a478e 100644
--- a/include/proto/fd.h
+++ b/include/proto/fd.h
@@ -335,6 +335,7 @@ static inline void fd_insert(int fd)
 	fdtab[fd].ev = 0;
 	fdtab[fd].new = 1;
 	fdtab[fd].linger_risk = 0;
+	fdtab[fd].cloned = 0;
 	if (fd + 1 > maxfd)
 		maxfd = fd + 1;
 }
diff --git a/include/types/fd.h b/include/types/fd.h
index 1c2c7c8..057d968 100644
--- a/include/types/fd.h
+++ b/include/types/fd.h
@@ -86,6 +86,7 @@ struct fdtab {
 	unsigned char new:1; /* 1 if this fd has just been created */
 	unsigned char updated:1; /* 1 if this fd is already in the update list */
 	unsigned char linger_risk:1; /* 1 if we must kill lingering before closing */
+	unsigned char cloned:1;  /* 1 if a cloned socket, requires EPOLL_CTL_DEL on close */
 };
 
 /* less often used information */
diff --git a/src/ev_epoll.c b/src/ev_epoll.c
index 2849ec6..9d359b2 100644
--- a/src/ev_epoll.c
+++ b/src/ev_epoll.c
@@ -46,6 +46,19 @@ static struct epoll_event ev;
 #endif
 
 /*
+ * Immediately remove file descriptor from epoll set upon close.
+ * Since we forked, some fds share inodes with the other process, and epoll may
+ * send us events even though this process closed the fd (see man 7 epoll,
+ * "Questions and answers", Q 6).
+ */
+REGPRM1 static void __fd_clo(int fd)
+{
+	if (unlikely(fdtab[fd].cloned)) {
+		epoll_ctl(epoll_fd, EPOLL_CTL_DEL, fd, &ev);
+	}
+}
+
+/*
  * Linux epoll() poller
  */
 REGPRM2 static void _do_poll(struct poller *p, int exp)
@@ -267,7 +280,7 @@ static void _do_register(void)
 	p->pref = 300;
 	p->private = NULL;
 
-	p->clo  = NULL;
+	p->clo  = __fd_clo;
 	p->test = _do_test;
 	p->init = _do_init;
 	p->term = _do_term;
diff --git a/src/fd.c b/src/fd.c
index 66f1e8b..c238bc8 100644
--- a/src/fd.c
+++ b/src/fd.c
@@ -438,6 +438,13 @@ int list_pollers(FILE *out)
  */
 int fork_poller()
 {
+	int fd;
+	for (fd = 0; fd <= maxfd; fd++) {
+		if (fdtab[fd].owner) {
+			fdtab[fd].cloned = 1;
+		}
+	}
+
 	if (cur_poller.fork) {
 		if (cur_poller.fork(&cur_poller))
 			return 1;
-- 
1.9.2



[PATCH] Improve and simplify systemd-wrapper.

2014-07-24 Thread Conrad Hoffmann
Hey Willy,

I know you're not that much into the systemd stuff, but although we
don't use systemd we actually us the wrapper a lot, and I thought it
could use a little make-over.

There was only one small bug really, but once I got going I thought I
might as well simplify it a little. I pulled everything out of the signal
handlers, switched from signal() to sigaction(), got rid of global state
and got rid of the wrapper exec()'ing itself (not sure if there was some
reason for that, but it seems to be unnecessary).

It is late at night and I might have to revise a few things after some
sleep, but I wanted to get this out for you and others to look at. It is
against tonight's master of the haproxy-1.5 repo; I could also create it
for 1.6 instead (or maybe it even applies out of the box).

Keep it up and merci beaucoup,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B
From 582490ce909040285153a9644530c8913927a462 Mon Sep 17 00:00:00 2001
From: Conrad Hoffmann 
Date: Fri, 25 Jul 2014 02:58:51 +0200
Subject: [PATCH] Improve and simplify systemd-wrapper.

The signal handling did too much work in the context of the signal
handler, which didn't yet cause problems (for me) but is to be
avoided. While at it I switched from the deprecated signal() to
the slightly more modern sigaction(). I also got rid of the
wrapper re-executing itself on SIGUSR2; unless I overlooked
something, this was not needed. Moving stuff out of the signal
handlers also obsoleted all but one global variables, making the
code much more readable.

Finally this also fixes an actual bug, where the -p switch that
the wrapper looks for would not be found if an odd number of
argument-less switches is used before it, e.g.:

  haproxy-systemd-wrapper -de -p mypidfile.pid

would not work this way, but only if -de is placed at the end.

Signed-off-by: Conrad Hoffmann 
---
 src/haproxy-systemd-wrapper.c | 272 ++++++++++++++++++++++-----------------
 1 file changed, 151 insertions(+), 121 deletions(-)

diff --git a/src/haproxy-systemd-wrapper.c b/src/haproxy-systemd-wrapper.c
index ba07ebe..01dd74c 100644
--- a/src/haproxy-systemd-wrapper.c
+++ b/src/haproxy-systemd-wrapper.c
@@ -18,171 +18,201 @@
 #include 
 #include 
 
-#define REEXEC_FLAG "HAPROXY_SYSTEMD_REEXEC"
-#define SD_DEBUG "<7>"
-#define SD_NOTICE "<5>"
+/* Taken from sd-daemon.h to not actually depend on systemd. */
+#define SD_CRIT"<2>"  /* critical conditions */
+#define SD_NOTICE  "<5>"  /* normal but significant condition */
+#define SD_DEBUG   "<7>"  /* debug-level messages */
 
-static char *pid_file = "/run/haproxy.pid";
-static int wrapper_argc;
-static char **wrapper_argv;
+#define WRAPPER_CRIT SD_CRIT "haproxy-systemd-wrapper: "
+#define WRAPPER_NOTICE SD_NOTICE "haproxy-systemd-wrapper: "
+#define WRAPPER_DEBUG SD_DEBUG "haproxy-systemd-wrapper: "
 
-static void locate_haproxy(char *buffer, size_t buffer_size)
-{
-	char *end = NULL;
+#define PATH_BUF_SIZE 512
+#define DEFAULT_PID_FILE "/run/haproxy.pid"
+
+
+static volatile sig_atomic_t sig_caught = 0;
 
-	if (readlink("/proc/self/exe", buffer, buffer_size) > 0)
-		end = strrchr(buffer, '/');
+static void signal_handler(int signum)
+{
+	sig_caught = signum;
+}
 
-	if (end == NULL) {
-		strncpy(buffer, "/usr/sbin/haproxy", buffer_size);
-		return;
+/* Find the absolute path to haproxy executable, result must be free'd. */
+static char *locate_haproxy(void)
+{
+	char buffer[PATH_BUF_SIZE];
+
+	/* Assume haproxy executable is in same directory as the wrapper. */
+	if (readlink("/proc/self/exe", buffer, PATH_BUF_SIZE) > 0) {
+		char *end = strrchr(buffer, '/');
+		if (end) {
+			strncpy(end + 1, "haproxy", buffer + PATH_BUF_SIZE - (end + 1));
+			buffer[PATH_BUF_SIZE - 1] = '\0';
+			return strdup(buffer);
+		}
 	}
-	end[1] = '\0';
-	strncpy(end + 1, "haproxy", buffer + buffer_size - (end + 1));
-	buffer[buffer_size - 1] = '\0';
+	/* Emergency solution: assume some common default. */
+	return strdup("/usr/sbin/haproxy");
 }
 
-static void spawn_haproxy(char **pid_strv, int nb_pid)
+/* Spawn a new instance of haproxy, appending PIDs to command line as required. */
+static void spawn_haproxy(int argc, char **argv, char **pids)
 {
-	char haproxy_bin[512];
-	pid_t pid;
-	int main_argc;
-	char **main_argv;
-
-	main_argc = wrapper_argc - 1;
-	main_argv = wrapper_argv + 1;
-
-	pid = fork();
-	if (!pid) {
-		/* 3 for "haproxy -Ds -sf" */
-		char **argv = calloc(4 + main_argc + nb_pid + 1, sizeof(char *));
-		int i;
-		int argno = 0;
-		l

Re: Client certs on tcp and securing stats socket

2014-07-24 Thread Conrad Hoffmann
> Copyright 2000-2014 Willy Tarreau 
> 
> Build options :
>   TARGET  = linux2628
>   CPU = generic
>   CC  = gcc
>   CFLAGS  = -O2 -g -fno-strict-aliasing
>   OPTIONS = USE_ZLIB=1 USE_OPENSSL=1 USE_PCRE=1
> 
> Default settings :
>   maxconn = 2000, bufsize = 16384, maxrewrite = 8192, maxpollevents = 200
> 
> Encrypted password support via crypt(3): yes
> Built with zlib version : 1.2.3
> Compression algorithms supported : identity, deflate, gzip
> Built with OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
> Running on OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
> OpenSSL library supports TLS extensions : yes
> OpenSSL library supports SNI : yes
> OpenSSL library supports prefer-server-ciphers : yes
> Built with PCRE version : 7.8 2008-09-05
> PCRE library supports JIT : no (USE_PCRE_JIT not set)
> Built with transparent proxy support using: IP_TRANSPARENT
> IPV6_TRANSPARENT IP_FREEBIND
> 
> Available polling systems :
>   epoll : pref=300,  test result OK
>poll : pref=200,  test result OK
>  select : pref=150,  test result OK
> Total: 3 (3 usable), will use epoll.
> 
> Here's how I'm connecting with a client cert:
> 
> socat stdio openssl-connect:domeyers-haproxy-socket-test.gha.usw1.cld.scea.com:7828,cert=/root/haproxy-ssl/haproxy-client.pem,cafile=/root/haproxy-ssl/haproxy-server.crt
> OR (less often - I tried openssl in case there was a problem with my socat)
> openssl s_client -connect
> domeyers-haproxy-socket-test.gha.usw1.cld.scea.com:7828 -cert
> ./haproxy-client.crt -key ./haproxy-client.key
> 
> And without:
> 
> socat stdio openssl-connect:domeyers-haproxy-socket-test.gha.usw1.cld.scea.com:7828,cafile=/root/haproxy-ssl/haproxy-server.crt
> openssl s_client -connect
> domeyers-haproxy-socket-test.gha.usw1.cld.scea.com:7828
> 
> Here is it allowing my connection, without a client cert, despite the
> config above:
> 
> [root@domeyers-haproxy-socket-test haproxy-ssl]# socat stdio openssl-connect:domeyers-haproxy-socket-test.gha.usw1.cld.scea.com:7828,cafile=/root/haproxy-ssl/haproxy-server.crt
> show stat
> # pxname,svname,qcur,qmax,scur,smax,slim,stot,bin,bout,dreq,dresp,ereq,econ,eresp,wretr,wredis,status,weight,act,bck,chkfail,chkdown,lastchg,downtime,qlimit,pid,iid,sid,throttle,lbtot,tracked,type,rate,rate_lim,rate_max,check_status,check_code,check_duration,hrsp_1xx,hrsp_2xx,hrsp_3xx,hrsp_4xx,hrsp_5xx,hrsp_other,hanafail,req_rate,req_rate_max,req_tot,cli_abrt,srv_abrt,comp_in,comp_out,comp_byp,comp_rsp,lastsess,last_chk,last_agt,qtime,ctime,rtime,ttime,
> default_stats,FRONTEND,,,0,0,2000,0,0,0,0,0,0,OPEN,1,2,00,0,0,00,0,0,0,0,0,,0,0,0,,,0,0,0,0
> default_stats,BACKEND,0,0,0,0,200,0,0,0,0,0,,0,0,0,0,UP,0,0,0,,0,6304,0,,1,2,0,,0,,1,0,,00,0,0,0,0,0,0,0,0,0,0,0,-1,,,0,0,0,0,
> main-service,FRONTEND,,,0,0,2000,0,0,0,0,0,0,OPEN,1,3,00,0,0,00,0,0,0,0,0,,0,0,0,,,0,0,0,0
> main-service,BACKEND,0,0,0,0,200,0,0,0,0,0,,0,0,0,0,UP,0,0,0,,0,6304,0,,1,3,0,,0,,1,0,,00,0,0,0,0,0,0,0,0,0,0,0,-1,,,0,0,0,0,
> stats-socket,FRONTEND,,,1,1,2000,10,91,12588,0,0,0,OPEN,1,4,0,,,,0,0,0,1,,,0,0,0,,,0,0,0,0
> local_socket,localhost,0,0,1,1,,10,91,12588,,0,,0,0,0,0,no check,1,1,0,,1,5,1,,10,,2,0,,1,,00,0,3,,,1,1,0,59,
> local_socket,BACKEND,0,0,1,1,200,10,91,12588,0,0,,0,0,0,0,UP,1,1,0,,0,6304,0,,1,5,0,,10,,1,0,,1,,0,0,0,0,0,0,3,,,1,1,0,59,
> 
> Log line:
> 2014-07-22T15:03:10-07:00 localhost haproxy[21598]: 10.108.0.166:60116
> [22/Jul/2014:15:03:06.419] stats-socket~ local_socket/localhost 5/1/3890
> 1352 -- 0/0/0/0/0 0/0 ""
> 
> I've never seen anything logged as ssl_c_s_dn (the last field) -- it's
> always empty. At first I thought I wasn't using the client cert the right
> way for haproxy to receive it, because I was having trouble matching ACLs
> against it. But then I found that
>  I didn't even need a client cert to connect, so now I'm troubleshooting
> that.
> 
> Any help would be appreciated, including alternate options for securing
> this traffic.
> 
> For the record, here are some of the config lines I was trying to use to
> validate fields in the client cert:
> 
>   tcp-request inspect-delay 30s
>   acl client_cert_cn_ok ssl_c_s_dn(CN) -m str my.host.name.com
>   tcp-request content accept if client_cert_cn_ok
>   tcp-request content reject
> 
> 
> Thanks!
> Donovan
> 
> 

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: [PATCH] Improve and simplify systemd-wrapper.

2014-07-25 Thread Conrad Hoffmann
Hey,

On 07/25/2014 08:31 AM, Willy Tarreau wrote:
>> There was only one small bug really, but once I got going I thought I
>> might as well simplify it a little. I pulled everything out of the signal
>> handlers, switched from signal() to sigaction(), got rid of global state
>> and got rid of the wrapper exec()'ing itself (not sure if there was some
>> reason for that, but it seems to be unnecessary).
> 
> From what I remember, the purpose was to be able to upgrade the wrapper
> itself without having to kill it. Typically in order to apply changes
> like you just performed... So I think you should bring that feature back.

Oh, right, I actually knew that before. Another proof of the importance
of sometimes taking a step back. Also, I'll add a comment :)

> You cleanup patch is interesting but it does too many things at once for
> a single patch, I'd strongly prefer if you would cut it into pieces each
> addressing a specific issue. It would make the code more easily reviewable
> and would also help troubleshooting in the event it would cause any minor
> regression.

Sounds very reasonable. I'll see if I can break it into chunks that would
be independently revertible, although I think that may be difficult, but
at least things will be easier to review and merge. Will get back soon!

Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: [PATCH] Improve and simplify systemd-wrapper.

2014-07-28 Thread Conrad Hoffmann
Hello,

attached are the first two patches, one fixing the actual bug I
encountered and one just tidying up the signal handling a little. More
to come.

Are they ok like this?

Cheers,
Conrad

On 07/25/2014 11:04 AM, Conrad Hoffmann wrote:
> Hey,
> 
> On 07/25/2014 08:31 AM, Willy Tarreau wrote:
>>> There was only one small bug really, but once I got going I thought I
>>> might as well simplify it a little. I pulled everything out of the signal
>>> handlers, switched from signal() to sigaction(), got rid of global state
>>> and got rid of the wrapper exec()'ing itself (not sure if there was some
>>> reason for that, but it seems to be unnecessary).
>>
>> From what I remember, the purpose was to be able to upgrade the wrapper
>> itself without having to kill it. Typically in order to apply changes
>> like you just performed... So I think you should bring that feature back.
> 
> Oh, right, I actually knew that before. Another proof of the importance
> of sometimes taking a step back. Also, I'll add a comment :)
> 
>> You cleanup patch is interesting but it does too many things at once for
>> a single patch, I'd strongly prefer if you would cut it into pieces each
>> addressing a specific issue. It would make the code more easily reviewable
>> and would also help troubleshooting in the event it would cause any minor
>> regression.
> 
> Sounds very reasonable. I'll see if I can break it in chunks that would
> be independently revertable, although I think that may be difficult, but
> at least things will be easier to review and merge. Will get back soon!
> 
> Conrad
> 

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B
From 34ac06bae5856d28773a194e7d2b1ee508213c39 Mon Sep 17 00:00:00 2001
From: Conrad Hoffmann 
Date: Mon, 28 Jul 2014 23:22:43 +0200
Subject: [PATCH 1/2] Fix search for -p argument in systemd wrapper.

Searching for the pid file in the list of arguments did not
take flags without parameters into account, like e.g. -de. Because
of this, the wrapper would use a different pid file than haproxy
if such an argument was specified before -p.

The new version can still yield a false positive for some crazy
situations, like your config file name starting with "-p", but
I think this is as good as it gets without using getopt or some
library.

Signed-off-by: Conrad Hoffmann 
---
 src/haproxy-systemd-wrapper.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/src/haproxy-systemd-wrapper.c b/src/haproxy-systemd-wrapper.c
index ba07ebe..529b213 100644
--- a/src/haproxy-systemd-wrapper.c
+++ b/src/haproxy-systemd-wrapper.c
@@ -130,11 +130,8 @@ static void sigint_handler(int signum __attribute__((unused)))
 static void init(int argc, char **argv)
 {
 	while (argc > 1) {
-		if (**argv == '-') {
-			char *flag = *argv + 1;
-			--argc; ++argv;
-			if (*flag == 'p')
-pid_file = *argv;
+		if ((*argv)[0] == '-' && (*argv)[1] == 'p') {
+			pid_file = *(argv + 1);
 		}
 		--argc; ++argv;
 	}
-- 
2.0.2

From da798ecab739c6287290400c42d876f305a56174 Mon Sep 17 00:00:00 2001
From: Conrad Hoffmann 
Date: Mon, 28 Jul 2014 23:52:20 +0200
Subject: [PATCH 2/2] Improve signal handling in systemd wrapper.

Move all code out of the signal handlers, since this is potentially
dangerous. To make sure the signal handlers behave as expected, use
sigaction() instead of signal(). That also obsoletes messing with
the signal mask after restart.

Signed-off-by: Conrad Hoffmann 
---
 src/haproxy-systemd-wrapper.c | 37 ++++++++++++++++++++++++-------------
 1 file changed, 24 insertions(+), 13 deletions(-)

diff --git a/src/haproxy-systemd-wrapper.c b/src/haproxy-systemd-wrapper.c
index 529b213..90a94ce 100644
--- a/src/haproxy-systemd-wrapper.c
+++ b/src/haproxy-systemd-wrapper.c
@@ -22,6 +22,8 @@
 #define SD_DEBUG "<7>"
 #define SD_NOTICE "<5>"
 
+static volatile sig_atomic_t caught_signal;
+
 static char *pid_file = "/run/haproxy.pid";
 static int wrapper_argc;
 static char **wrapper_argv;
@@ -103,7 +105,12 @@ static int read_pids(char ***pid_strv)
 	return read;
 }
 
-static void sigusr2_handler(int signum __attribute__((unused)))
+static void signal_handler(int signum)
+{
+	caught_signal = signum;
+}
+
+static void do_restart(void)
 {
 	setenv(REEXEC_FLAG, "1", 1);
 	fprintf(stderr, SD_NOTICE "haproxy-systemd-wrapper: re-executing\n");
@@ -111,7 +118,7 @@ static void sigusr2_handler(int signum __attribute__((unused)))
 	execv(wrapper_argv[0], wrapper_argv);
 }
 
-static void sigint_handler(int signum __attribute__((un

[PATCH] Remove more global state from systemd wrapper

2014-08-20 Thread Conrad Hoffmann
Hi,

this is the next step in my ongoing quest to give some lovin' to the
systemd wrapper.

It's against 1.6, I guess there is no reason to backport this to 1.5.

Does it look acceptable?

Cheers,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B
From b7f869424c6a75dc0b7ff734b61a98080448e5ad Mon Sep 17 00:00:00 2001
From: Conrad Hoffmann 
Date: Thu, 21 Aug 2014 07:44:12 +0200
Subject: [PATCH] Remove more global state from systemd wrapper.

Now that the signal handling is reduced to a minimum, all state
can be kept locally, making functions easier to read and debug.

Signed-off-by: Conrad Hoffmann 
---
 src/haproxy-systemd-wrapper.c | 54 ++++++++++++++++++++++++------------------------------
 1 file changed, 24 insertions(+), 30 deletions(-)

diff --git a/src/haproxy-systemd-wrapper.c b/src/haproxy-systemd-wrapper.c
index 90a94ce..97e80e5 100644
--- a/src/haproxy-systemd-wrapper.c
+++ b/src/haproxy-systemd-wrapper.c
@@ -19,15 +19,12 @@
 #include 
 
 #define REEXEC_FLAG "HAPROXY_SYSTEMD_REEXEC"
+#define DEFAULT_PID_FILE "/run/haproxy.pid"
 #define SD_DEBUG "<7>"
 #define SD_NOTICE "<5>"
 
 static volatile sig_atomic_t caught_signal;
 
-static char *pid_file = "/run/haproxy.pid";
-static int wrapper_argc;
-static char **wrapper_argv;
-
 static void locate_haproxy(char *buffer, size_t buffer_size)
 {
 	char *end = NULL;
@@ -44,7 +41,7 @@ static void locate_haproxy(char *buffer, size_t buffer_size)
 	buffer[buffer_size - 1] = '\0';
 }
 
-static void spawn_haproxy(char **pid_strv, int nb_pid)
+static void spawn_haproxy(int wrapper_argc, char **wrapper_argv, int pidc, char **pidv)
 {
 	char haproxy_bin[512];
 	pid_t pid;
@@ -57,7 +54,7 @@ static void spawn_haproxy(char **pid_strv, int nb_pid)
 	pid = fork();
 	if (!pid) {
 		/* 3 for "haproxy -Ds -sf" */
-		char **argv = calloc(4 + main_argc + nb_pid + 1, sizeof(char *));
+		char **argv = calloc(4 + main_argc + pidc + 1, sizeof(char *));
 		int i;
 		int argno = 0;
 		locate_haproxy(haproxy_bin, 512);
@@ -65,10 +62,10 @@ static void spawn_haproxy(char **pid_strv, int nb_pid)
 		for (i = 0; i < main_argc; ++i)
 			argv[argno++] = main_argv[i];
 		argv[argno++] = "-Ds";
-		if (nb_pid > 0) {
+		if (pidc > 0) {
 			argv[argno++] = "-sf";
-			for (i = 0; i < nb_pid; ++i)
-argv[argno++] = pid_strv[i];
+			for (i = 0; i < pidc; ++i)
+argv[argno++] = pidv[i];
 		}
 		argv[argno] = NULL;
 
@@ -82,7 +79,7 @@ static void spawn_haproxy(char **pid_strv, int nb_pid)
 	}
 }
 
-static int read_pids(char ***pid_strv)
+static int read_pids(const char *pid_file, char ***pid_strv)
 {
 	FILE *f = fopen(pid_file, "r");
 	int read = 0, allocated = 8;
@@ -110,19 +107,19 @@ static void signal_handler(int signum)
 	caught_signal = signum;
 }
 
-static void do_restart(void)
+static void do_restart(char *const *argv)
 {
 	setenv(REEXEC_FLAG, "1", 1);
 	fprintf(stderr, SD_NOTICE "haproxy-systemd-wrapper: re-executing\n");
 
-	execv(wrapper_argv[0], wrapper_argv);
+	execv(argv[0], argv);
 }
 
-static void do_shutdown(void)
+static void do_shutdown(const char *pid_file)
 {
 	int i, pid;
 	char **pid_strv = NULL;
-	int nb_pid = read_pids(&pid_strv);
+	int nb_pid = read_pids(pid_file, &pid_strv);
 	for (i = 0; i < nb_pid; ++i) {
 		pid = atoi(pid_strv[i]);
 		if (pid > 0) {
@@ -134,25 +131,22 @@ static void do_shutdown(void)
 	free(pid_strv);
 }
 
-static void init(int argc, char **argv)
+static char* get_pid_file(int argc, char **argv)
 {
+	char *pid_file = DEFAULT_PID_FILE;
 	while (argc > 1) {
 		if ((*argv)[0] == '-' && (*argv)[1] == 'p') {
 			pid_file = *(argv + 1);
 		}
 		--argc; ++argv;
 	}
+	return pid_file;
 }
 
 int main(int argc, char **argv)
 {
 	int status;
-
-	wrapper_argc = argc;
-	wrapper_argv = argv;
-
-	--argc; ++argv;
-	init(argc, argv);
+	char *pid_file = get_pid_file(argc, argv);
 
 	struct sigaction sa;
 	memset(&sa, 0, sizeof(struct sigaction));
@@ -163,30 +157,30 @@ int main(int argc, char **argv)
 	if (getenv(REEXEC_FLAG) != NULL) {
 		/* We are being re-executed: restart HAProxy gracefully */
 		int i;
-		char **pid_strv = NULL;
-		int nb_pid = read_pids(&pid_strv);
+		char **pidv = NULL;
+		int pidc = read_pids(pid_file, &pidv);
 
 		unsetenv(REEXEC_FLAG);
-		spawn_haproxy(pid_strv, nb_pid);
+		spawn_haproxy(argc, argv, pidc, pidv);
 
-		for (i = 0; i < nb_pid; ++i)
-			free(pid_strv[i]);
-		free(pid_strv);
+		for (i = 0; i < pidc; ++i)
+			free(pidv[i]);
+		free(pidv);
 	}
 	else {
 		/* Start a fresh copy of HAProxy */
-		spawn_haproxy(NULL, 0);
+		spawn_haproxy(argc, argv, 0, NULL);
 	}
 
 	status = -1;
 	while (-1 != wait(&status

Re: Running multiple haproxy instances to use multiple cores efficiently

2014-10-28 Thread Conrad Hoffmann
Hey Chris,

we've been running haproxy with "nbproc 12" for quite a while now and it
works great for us. We haven't even gotten around to tying interrupts to
certain cores, works pretty well without. No need for multiple config
files either.
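
For reference, the essential part of the config really is just this (sketch
only; the numbers are made up and we don't actually use the pinning part):

global
    nbproc 12          # one worker process per core, adjust to your box
    daemon

# optionally pin a front-end to specific processes (we don't bother):
#frontend app1
#    bind 0.0.0.0:8001
#    bind-process 1-3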

Cheers,
Conrad

On 10/27/2014 07:41 PM, Chris Allen wrote:
> We're running haproxy on a 2x4 core Intel E5-2609 box. At present
> haproxy is running on
> a single core and saturating that core at about 15,000 requests per second.
> 
> Our application has four distinct front-ends (listening on four separate
> ports) so it would be
> very easy for us to run four haproxy instances, each handling one of the
> four front-ends.
> 
> This should then allow us to use four of our eight cores. However we
> won't be able to tie hardware
> interrupts to any particular core.
> 
> Is this arrangement likely to give us a significant performance boost?
> Or are we heading for trouble because
> we can't tie interrupts to any particular core?
> 
> Any advice would be much appreciated. Many thanks,
> 
> Chris.
> 
> 



Re: the order of evaluation of acl's

2014-10-28 Thread Conrad Hoffmann
Hi,

On 10/24/2014 02:12 PM, jeff saremi wrote:
> What is the order of evaluation of 'and's and 'or's in a use_backend clause?
> 
> This is what the docs say:
>  [!]acl1 [!]acl2 ... [!]acln  { or [!]acl1 [!]acl2 ... [!]acln } ...
> 
> and apparently I cannot use parentheses to group them. However I need to
> write something like the following:
> use_backend some_backend if ( ( acl1 acl2) or (acl3 acl4) ) or acl5

Why not just break it down into several lines:

use_backend some_backend if acl1 acl2
use_backend some_backend if acl3 acl4
use_backend some_backend if acl5

Especially if you care about the order of execution, this concern is
much more explicitly expressed this way.

Regards,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Connection pooling and the Connection header

2014-10-28 Thread Conrad Hoffmann
Hey all,

can someone give me a very brief summary of how haproxy handles its
connection pooling when the backend server sends "Connection: close"
and/or HTTP/1.0?

Or, to be more specific, we have (for certain traffic) an haproxy
instance as backend for another haproxy. We are seeing huge increases in
"sessions_cur" when one of the all-the-way-back-ends fail. I noticed
that haproxy itself responds with HTTP/1.0 and "Connection: close" when
a backend is unavailable. Would that mess with the connection pooling of
the front-most haproxy? And, if so, would there be any way to change this
behaviour?

We cannot rule out that this is an issue caused by how clients react to
the errors, but I would nevertheless love to gain some insight into this
scenario. Any hints would be greatly appreciated!

Thanks,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: NOSRV error

2015-10-05 Thread Conrad Hoffmann
Hi,

(comments inline)

On 10/05/2015 03:23 PM, Kevin COUSIN wrote:
> Hi list
> 
> I want to LB an https backend (Layer 4 LB), but I have a lot of NOSRV errors 
> in log : 
> 
> Oct  5 15:09:38 localhost haproxy[13839]: 10.250.0.4:43318 
> [05/Oct/2015:15:09:38.486] fe_pp-portail-https be_pp-xctl-https/ 
> -1/-1/0 0 -- 0/0/0/0/3 0/0
> Oct  5 15:09:43 localhost haproxy[13839]: 10.250.0.4:44851 
> [05/Oct/2015:15:09:43.642] fe_pp-portail-https be_pp-xctl-https/ 
> -1/-1/0 0 -- 0/0/0/0/3 0/0
> Oct  5 15:09:48 localhost haproxy[13839]: 10.250.0.4:29479 
> [05/Oct/2015:15:09:48.761] fe_pp-portail-https be_pp-xctl-https/ 
> -1/-1/0 0 -- 0/0/0/0/3 0/0
> Oct  5 15:09:53 localhost haproxy[13839]: 10.250.0.4:53748 
> [05/Oct/2015:15:09:53.790] fe_pp-portail-https be_pp-xctl-https/ 
> -1/-1/0 0 -- 0/0/0/0/3 0/0
> Oct  5 15:09:58 localhost haproxy[13839]: 10.250.0.4:44828 
> [05/Oct/2015:15:09:58.847] fe_pp-portail-https be_pp-xctl-https/ 
> -1/-1/0 0 -- 0/0/0/0/3 0/0
> Oct  5 15:10:03 localhost haproxy[13839]: 10.250.0.4:51021 
> [05/Oct/2015:15:10:03.937] fe_pp-portail-https be_pp-xctl-https/ 
> -1/-1/0 0 -- 0/0/0/0/3 0/0
> Oct  5 15:10:08 localhost haproxy[13839]: 10.250.0.4:21815 
> [05/Oct/2015:15:10:08.925] fe_pp-portail-https be_pp-xctl-https/ 
> -1/-1/0 0 -- 0/0/0/0/3 0/0
> Oct  5 15:10:13 localhost haproxy[13839]: 10.250.0.4:57069 
> [05/Oct/2015:15:10:13.902] fe_pp-portail-https be_pp-xctl-https/ 
> -1/-1/0 0 -- 0/0/0/0/3 0/0
> Oct  5 15:10:18 localhost haproxy[13839]: 10.250.0.4:42239 
> [05/Oct/2015:15:10:18.873] fe_pp-portail-https be_pp-xctl-https/ 
> -1/-1/0 0 -- 0/0/0/0/3 0/0
> Oct  5 15:10:23 localhost haproxy[13839]: 10.250.0.4:65477 
> [05/Oct/2015:15:10:23.893] fe_pp-portail-https be_pp-xctl-https/ 
> -1/-1/0 0 -- 0/0/0/0/3 0/0
> Oct  5 15:10:28 localhost haproxy[13839]: 10.250.0.4:51091 
> [05/Oct/2015:15:10:28.860] fe_pp-portail-https be_pp-xctl-https/ 
> -1/-1/0 0 -- 0/0/0/0/3 0/0

This usually means that there are no servers left in the backend because
they were either misconfigured or taken out of the rotation, e.g. due to
failed health checks.

> Here is my configuration (works well with http)
> 
> global
>log 127.0.0.1 local4
>maxconn 65535
>user haproxy
>group haproxy
>daemon
>stats socket /var/lib/haproxy/stats user haproxy group haproxy
>ssl-server-verify none
>ssl-default-bind-ciphers 
> ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128:AES256:AES:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5:!PSK
>tune.ssl.default-dh-param 2048

Not sure what exactly you want to achieve here. If you want to load
balance at the TCP level, HAProxy doesn't need to know anything about any
TLS parameters.
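
For pure TCP passthrough, a minimal sketch is all it takes (addresses taken
from your config below; the "check" keyword is my addition, assuming you
also want health checking):

frontend fe_pp-portail-https
    bind 10.250.0.48:443
    mode tcp
    default_backend be_pp-xctl-https

backend be_pp-xctl-https
    mode tcp
    balance source
    server pp-xctl01002-https 172.21.12.8:443 check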

> defaults
> log global
> mode tcp
> option tcplog
> option contstats   # Enable continuous traffic Statistics 
> updates
> option redispatch
> timeout client 2h  #alctl: client inactivity timeout
> maxconn 15000
> timeout client-fin 1m   # When connection are close on one 
> side only
> timeout server  60s
> timeout connect 60s
> timeout tunnel 2h   # Set the maximum inactivity time on the 
> client and server side for tunnels.
> default-server inter 2s  fall 3 rise 2 on-marked-down 
> shutdown-sessions
> 
> 
> frontend fe_pp-portail-http
> bind 10.250.0.48:80
> default_backend be_pp-xctl-http
> 
> frontend fe_pp-portail-https
> bind 10.250.0.48:443
> default_backend be_pp-xctl-https
> 
> backend be_pp-xctl-http
> balance source
> server pp-xctl01002-http 172.21.12.8:80 
> 
> backend be_pp-xctl-https
> balance source
> server pp-xctl01002-https 172.21.12.8:443
> 
> I got the certificate on my server If I use openssl s_client.

Can you elaborate on this? Are you connecting with s_client to haproxy or
to your server?
Can you confirm that you want your web server to do the actual TLS handshake
and not HAProxy?

Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: use part of url - as backend name?

2016-01-28 Thread Conrad Hoffmann
Hi,

I am not sure if there is a better way, but the only one I can think of
would be something like:

  http-request set-header X-Tmp %[path]
  http-request replace-header X-Tmp /([^/]+) \1
  use_backend %[req.hdr(X-Tmp),lower,map(/path/tofile)]

No idea which one is preferable with regard to performance, maybe test both;
I'd assume it depends on your number of backends. The basic string map is
indeed quite performant, so using this variant might help if you do have a
lot of backends.
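
For completeness, the map file for this variant is then a plain string map,
first path segment on the left, backend name on the right, e.g. (backend
names borrowed from your earlier mail):

# /path/tofile
sebseb kki-sebseb
test2  kki-test2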

Cheers,
Conrad


On 01/28/2016 04:27 PM, Klavs Klavsen wrote:
> I got this to work:
> 
> use_backend %[path,lower,map_reg(/path/tofile)]
> 
> and with a map file containing:
> ^\/sebseb/
> ^\/test2/
> 
> and so on..
> 
> Only issue is the performance hit.. it's going to do a regex match with path
> on each line in the map file.. :(
> 
> I'd prefer to find a function to replace path.. which could substract the
> first part ^\/([^/])\/ - and then use \1 to match in map file..
> 
> But perhaps the map_reg isn't that expensive?
> 
> Klavs Klavsen wrote on 01/28/2016 03:39 PM:
>> Hi guys,
>>
>> I figured I could use map feature of 1.5, but I'm coming up short,
>> trying to change this:
>>
>>   use_backend %[req.hdr(host),lower,map(/path/tofile)]
>>
>>
>> To instead take a part of the uri.
>>
>> I can't find any list of functions (such as hdr, hdr_end etc.) in the
>> docs :(
>>
>> I figured I could use "something like" regrep instead of hdr(host)?
>>
>>
>> Klavs Klavsen wrote on 01/26/2016 02:53 PM:
>>> Hi guys,
>>>
>>> we have a long list of backends (want to monitor each application on a
>>> tomcat behind us) - and would like to use part of the url ( first part
>>> between / / ) to identify the backend (in haproxy) to use.
>>>
>>> so if we get a request for /sebseb/api/whatever - we'd like to use
>>> backend named kki-sebseb.
>>>
>>> we figure this should not be dangerous since haproxy seems fairly well
>>> coded - so a request for something bogus, should just result in the
>>> backend being invalid - and returning a 504 (we hope) ?
>>>
>>> I can't seem to find any examples on the net.. and can't figure it out
>>> from the haproxy 1.5 docs..
>>>
>>> I was hoping any of you had some hints :)
>>>
>>
>>
> 
> 

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: Feature Request for log stdout ...

2016-02-18 Thread Conrad Hoffmann
Two more cents from my side:

socklog [1] also works pretty well...

[1] http://smarden.org/socklog/
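
E.g. run under runit, something like this should do (from memory and
untested, so double-check the socklog docs):

socklog inet 0 2000    # accept syslog datagrams on UDP 0.0.0.0:2000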

Conrad

On 02/18/2016 11:28 AM, Baptiste wrote:
> On Thu, Feb 18, 2016 at 10:57 AM, Willy Tarreau  wrote:
>> Hi Aleks,
>>
>> On Wed, Feb 17, 2016 at 04:30:06PM +0100, Aleksandar Lazic wrote:
>>> Hi.
>>>
>>> how difficult is it to be able to add "log stdout;" to haproxy?
>>>
>>> I ask because in some PaaS environment is it difficult to setup a
>>> dedicated user just for haproxy.
>>>
>>> It fits also a little bit better to http://12factor.net/logs
>>
>> It's been discussed a few times in the past. The response is "no".
>> It's totally insane to emit logs to a blocking destination. Your
>> whole haproxy process will run at the speed of the logs consumer
>> and the log processing will incure its latency to the process.
>>
>> If one day we implement an asynchronous stream logging task, this
>> could change, but for now we send immediate logs as datagrams in
>> order never to block.
>>
>> To get an idea about what it can look like with blocking logs,
>> simply run "haproxy -d 2>&1 | more" and don't press any key.
>> You'll quickly see that the system continues to accept new
>> connections and that they will randomly freeze at various steps.
>>
>> Regards,
>> Willy
>>
>>
> 
> My 2 cents: Some tools may be used for this purpose:
> 
> Configure HAProxy to send logs to port 2000, then use:
> 
> - socat:
> socat -u UDP-RECV:2000 -
> <133>Feb 18 11:27:02 haproxy[4134]: Proxy f started.
> <133>Feb 18 11:27:02 haproxy[4134]: Proxy b started.
> <133>Feb 18 11:27:02 haproxy[4134]: Proxy stats started.
> <129>Feb 18 11:27:02 haproxy[4134]: Server b/s is DOWN, reason: Layer4
> connection problem, info: "Connection refused", check duration: 0ms. 0
> active and 0 backup servers left. 0 sessions active, 0 requeued, 0
> remaining in queue.
> <128>Feb 18 11:27:02 haproxy[4134]: backend b has no server available!
> 
> - netcat:
> netcat -l -k -u 2000
> <133>Feb 18 11:28:17 haproxy[4303]: Proxy f started.
> <133>Feb 18 11:28:17 haproxy[4303]: Proxy b started.
> <133>Feb 18 11:28:17 haproxy[4303]: Proxy stats started.
> <129>Feb 18 11:28:17 haproxy[4303]: Server b/s is DOWN, reason: Layer4
> connection problem, info: "Connection refused", check duration: 0ms. 0
> active and 0 backup servers left. 0 sessions active, 0 requeued, 0
> remaining in queue.
> <128>Feb 18 11:28:17 haproxy[4303]: backend b has no server available!
> 
> 
> 
> Baptiste
> 

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: Proliferation of processes under systemd wrapper

2016-02-25 Thread Conrad Hoffmann
Hi Bharat,

On 02/24/2016 03:04 AM, BR Kumar wrote:
> Couple of questions related to the systemd wrapper:
> 
> 1) I noticed that it spawns a 2 level hierarchy of haproxy processes
> instead of a single child process. Can someone help understand why?

It's a little complex, but I can try:

It is probably easier to understand when thinking about a scenario where
nbproc is > 1, let's say 4. When running under supervision of systemd (or
any other service manager, e.g. runit), you need a single application
process that does not daemonize. Systemd monitors this process, e.g. to
restart it if it dies. But most service managers are designed to supervise
one process per service. So if we want 4 processes, the application itself
has to manage this. This is why you would see a process hierarchy like this:

systemd
L haproxy-systemd-wrapper
  L haproxy
    L haproxy \
    L haproxy  } nbproc processes
    L haproxy  }
    L haproxy /

Now, you might say: but there is already the systemd-wrapper, why do we
need one additional haproxy instance? It is related to the special
(carefully designed) restart mechanism that haproxy implements. Going into
details would make this a pretty long mail, but the gist is that _all_
haproxy processes will be replaced by new ones in the process, including
the parent one. Thus, if the parent process were to exit, systemd would
think the service needs to be restarted and fire up yet another instance
of haproxy (even though there is already a replacement, but systemd
doesn't know about it). So the sole purpose of the wrapper is to prevent
systemd from interfering with the restart mechanism implemented in the
application itself, basically.
Sorry, this leaves out a lot of details, but I hope it does make a little
sense.
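
You can see the hierarchy on a live box with something like (illustrative):

ps fax | grep -A5 '[h]aproxy-systemd-wrapper'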

> 2) The problem arises when the intermediate haproxy process dies for any
> reason and the child process is adopted by pid 1. When the systemd wrapper
> is restarted, this results in a total of 3 haproxy processes. Is this
> behavior expected? Are there fixes/workaround to ensure that only
> systemd-wrapper controlled processes are retained?

This is intentional. The restart mechanism is designed to not interrupt
existing connections. As such, the child processes will be detached from
their parent, keep running until all connections they are processing are
closed by the client side and then exit.

Cheers,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: Proliferation of processes under systemd wrapper

2016-02-29 Thread Conrad Hoffmann
Hi,

On 02/27/2016 02:23 AM, BR Kumar wrote:
> Hi Conrad,
> 
> Thanks for responding. Regarding point (2), I'm not sure the last level of
> child processes are awaiting connection closure and would exit afterwards.
> In my test, netstat showed no connection against the haproxy port. Yet, the
> orphaned child process continued indefinitely. To reiterate, if the
> intermediate haproxy process (highlighted through ** in the subsequent
> illustration) were to die, its child processes linger on forever regardless
> of connections. In addition, there doesn't appear to be a way to reap them
> when the wrapper re-spawns the new hierarchy. This can be simulated easily
> in a test environment. I believe this shortcoming needs a fix. Thoughts?

This depends on your definition of "terminate". If you just kill the
process, yes, the child processes will be reparented to init and stick
around. There is not much to be fixed about this, you just shouldn't be
doing this.

But if you send the proper signal then the parent process will in turn send
signals to the child processes before it exits, basically telling the child
processes to finish their existing connections, then terminate. In this
case, they might linger around for a few minutes but will eventually terminate.
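
To be concrete, the "proper signal" for the workers is SIGUSR1 (pid file
path assumed):

kill -USR1 $(cat /run/haproxy.pid)    # graceful: finish connections, then exit
# kill -TERM $(cat /run/haproxy.pid)  # hard stop, for comparison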

If this doesn't happen in your setup then I assume the service handling is
not properly set up, e.g. systemd does not understand how to properly
start/stop/reload the service. These things are configurable and haproxy is
not quite the "default case" here.

Hope that helps,
Conrad

> 
> Thanks,
> Bharath
> 
> 
> systemd
> L haproxy-systemd-wrapper
>   L **haproxy** <--- terminate this
>     L haproxy \
>     L haproxy  } nbproc processes
>     L haproxy  }
>     L haproxy /
> 
> 
> On Thu, Feb 25, 2016 at 3:02 PM, Conrad Hoffmann 
> wrote:
> 
>> Hi Bharat,
>>
>> On 02/24/2016 03:04 AM, BR Kumar wrote:
>>> Couple of questions related to the systemd wrapper:
>>>
>>> 1) I noticed that it spawns a 2 level hierarchy of haproxy processes
>>> instead of a single child process. Can someone help understand why?
>>
>> It's a little complex, but I can try:
>>
>> It is probably easier to understand when thinking about a scenario where
>> nbproc is > 1, let's say 4. When running under supervision of systemd (or
>> any other service manager, e.g. runit), you need a single application
>> process that does not daemonize. Systemd monitors this process, e.g. to
>> restart it if it dies. But most service managers are designed to supervise
>> one process per service. So if we want 4 processes, the application itself
>> has to manage this. This is why you would see a process hierarchy like
>> this:
>>
>> systemd
>> L haproxy-systemd-wrapper
>>   L haproxy
>>     L haproxy \
>>     L haproxy  } nbproc processes
>>     L haproxy  }
>>     L haproxy /
>>
>> Now, you might say: but there is already the systemd-wrapper, why do we
>> need one additional haproxy instance? It is related to the special
>> (carefully designed) restart mechanism that haproxy implements. Going into
>> details would make this a pretty long mail, but the gist is that _all_
>> haproxy processes will be replaced by new ones in the process, including
>> the parent one. Thus, if the parent process were to exit, systemd would
>> think the service needs to be restarted and fire up yet another instance
>> of haproxy (even though there is already a replacement, but systemd
>> doesn't know about it). So the sole purpose of the wrapper is to prevent
>> systemd from interfering with the restart mechanism implemented in the
>> application itself, basically.
>> Sorry, this leaves out a lot of details, but I hope it does make a little
>> sense.
>>
>>> 2) The problem arises when the intermediate haproxy process dies for any
>>> reason and the child process is adopted by pid 1. When the systemd
>> wrapper
>>> is restarted, this results in a total of 3 haproxy processes. Is this
>>> behavior expected? Are there fixes/workaround to ensure that only
>>> systemd-wrapper controlled processes are retained?
>>
>> This is intentional. The restart mechanism is designed to not interrupt
>> existing connections. As such, the child processes will be detached from
>> their parent, keep running until all connections they are processing are
>> closed by the client side and then exit.
>>
>> Cheers,
>> Conrad
>> --
>> Conrad Hoffmann
>> Traffic Engineer
>>
>> SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany
>>
>> Managing Director: Alexander Ljung | Incorporated in England & Wales
>> with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
>> HRB 110657B
>>
> 

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: Only using map file when an entry exists

2016-03-03 Thread Conrad Hoffmann
If you are using haproxy >=1.6, you might be able to do something like this:

acl no_redir var(req.redir) -m str NO_REDIR
http-request set-var(req.redir) hdr(host),map(/etc/haproxy/redirect_host.map,NO_REDIR)
http-request redirect location %[var(req.redir)] code 301 if !no_redir

This is completely made up and untested, but I hope you get the idea. It
avoids a second map lookup altogether, but map lookups are quite fast
anyway, so unless your map is huge you don't really need to worry about
this. Also, double negation, but this is just to give you some idea.

Cheers,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



SO_REUSEPORT and process load distribution

2016-03-24 Thread Conrad Hoffmann
Hello,

I know SO_REUSEPORT has been discussed here a few times and I am aware that
haproxy uses it to make restarts less disruptive, as a new instance can
bind() to the listen ports without the need to stop the old instance first.

But there is another aspect of SO_REUSEPORT that I have not yet seen
discussed here (my apologies if I missed this, pointers welcome). The
original patch for SO_REUSEPORT [1] explicitly mentions an uneven
distribution across threads/processes for the "accept on a single
listener socket from multiple threads" case, which the author intended to
get rid of with SO_REUSEPORT. What I have so far never seen a detailed
explanation of, though, is what the exact criteria are to get this to work.

We have always seen, and still see, this uneven distribution in our haproxy
processes (haproxy 1.6, linux 3.16). My assumption is that this is because
we use haproxy in daemon mode, which (I think) basically means that when a
reload happens, one process is started, it binds all the sockets (using
SO_REUSEPORT, which lets it take over sockets from the previous instance),
but then fork()s the child processes, which thus inherit the already bound
socket(s).

My gut feeling - but I cannot point to any reliable reference - is that the
even load distribution mechanism only kicks in if all processes do the call
to at least bind(), possibly listen() or even socket() themselves. Just
setting SO_REUSEPORT but otherwise using the "accept on a single
listener socket from multiple threads" approach might not actually reap the
intended benefits here.
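
To make it concrete, this is the pattern I mean by all processes doing the
calls themselves (illustrative sketch only, error handling abbreviated; the
alternative being one bind() before fork() with the fd inherited):

#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Each worker calls this for itself, after fork(): every process then
 * owns a separate listening socket bound to the same port, and the
 * kernel hashes incoming connections across all of them instead of
 * waking whoever wins the race on a single shared socket. */
static int own_listener(uint16_t port)
{
	int fd = socket(AF_INET, SOCK_STREAM, 0);
	int one = 1;
	struct sockaddr_in sa;

	setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));
	memset(&sa, 0, sizeof(sa));
	sa.sin_family = AF_INET;
	sa.sin_addr.s_addr = htonl(INADDR_ANY);
	sa.sin_port = htons(port);
	if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0 || listen(fd, 1024) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}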

So my question is basically: do I understand this situation correctly? Can
someone who is more experienced with the kernel networking code comment on
this aspect of SO_REUSEPORT?

[1] https://lwn.net/Articles/542718/

Thanks a lot,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: ssl offloading

2016-04-01 Thread Conrad Hoffmann
I can't really back this up with reliable numbers, but a company I once
worked for experimented with such hardware. The outcome was, and I would
still always recommend this today, to rather throw more regular hardware at
the problem. Modern processors have a lot of special instructions specifically
for cryptographic operations (maybe make sure you are making full use of
that) and are way cheaper than specialized SSL hardware. Stuff like SSL
changes a lot and often needs immediate security fixes, so going with
general purpose hardware where you are not dependent on some vendor support
will likely make your life easier at some point.

That's just an opinion after all, of course.

Cheers,
Conrad

On 04/01/2016 10:06 AM, Gerd Mueller wrote:
> We are experiencing 100% cpu load by this specific haproxy thread during huge 
> ssl load. With haproxy 1.4 we first used stunnel, then apache with mod_ssl. I 
> think haproxy with ssl performance much better than the other 2 but I am 
> thinking about offloading to a specific ssl device. Does anybody know 
> anything about dedicated hardware?
> 
> Thanks Gerd
> 
> -------- Forwarded Message --------
> From: Nathan Williams <nath.e.w...@gmail.com>
> To: Lukas Tribus <luky...@hotmail.com>, Gerd Mueller
> <gerd.muel...@mikatiming.de>, haproxy@formilux.org
> Subject: Re: ssl offloading
> Date: Fri, 1 Apr 2016 01:54:29 +0000
> 
> 
> stunnel's what we used before Haproxy had it built in, which worked fine, but 
> SSL offloading in Haproxy's been excellent in our experience, so my guess 
> would be that you could make it work with some config tuning.
> 
> On Thu, Mar 31, 2016, 12:45 PM Lukas Tribus <luky...@hotmail.com> wrote:
>> Hi list,
>>
>> what are your ideas about offloading of ssl? ssl inside haproxy is nice
>> but is very expensive.
> 
> Why would you think that?
> 
> 
> Lukas
> 
> 
> 

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



[PATCH] BUG/MINOR: dumpstats: fix write to global chunk

2016-04-01 Thread Conrad Hoffmann
Hi,

there is a small bug in dumpstats.c, stats_dump_fields_csv(). One of the
writes is to the global `trash` chunk, which happens to be the right one,
but it should be to what gets passed in as argument (out).

Not even sure this was worth a patch, but it is attached if you like.
Otherwise, feel free to just fix by hand :)

Happy weekend,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B
>From 5ec0353a7c06c51053995bba76e3fd0143e7a8d8 Mon Sep 17 00:00:00 2001
From: Conrad Hoffmann 
Date: Fri, 1 Apr 2016 20:40:58 +0200
Subject: [PATCH] BUG/MINOR: dumpstats: fix write to global chunk

This just happens to work as it is the correct chunk, but should be whatever
gets passed in as argument.

Signed-off-by: Conrad Hoffmann 
---
 src/dumpstats.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/dumpstats.c b/src/dumpstats.c
index 341fd56..3299f1b 100644
--- a/src/dumpstats.c
+++ b/src/dumpstats.c
@@ -3211,7 +3211,7 @@ static int stats_dump_fields_csv(struct chunk *out, const struct field *stats)
 		if (!chunk_strcat(out, ","))
 			return 0;
 	}
-	chunk_strcat(&trash, "\n");
+	chunk_strcat(out, "\n");
 	return 1;
 }
 
-- 
2.7.4



Re: Using socket commands to add a new server to a backend

2016-05-17 Thread Conrad Hoffmann
Hi Tugberk,

On 05/16/2016 05:25 PM, Tugberk Ugurlu wrote:
> Hi,
> Is it possible to register a new server under a backend? This is similar to 
> what is being asked here: 
> http://haproxy.formilux.narkive.com/1OibZABp/using-haproxy-socket-to-add-new-servers
> I cannot see the command available under the documented list of commands: 
> http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#9.2

The response from the thread you linked to still holds true.

> I am not sure what is the best way to handle rolling out new servers and 
> taking down the old ones without any downtime through an automated way.

The reload in haproxy is designed to be very-close-to-zero downtime, so
depending on your use case it might just be good enough. If you do opt for
the solution mentioned in the above thread, do keep in mind that your
config on disk and the state in haproxy might diverge, so your deployments
will have to take that into consideration.

Cheers,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: Using socket commands to add a new server to a backend

2016-05-17 Thread Conrad Hoffmann
On 05/17/2016 12:53 PM, Tugberk Ugurlu wrote:
> Thanks for getting back!
> Is there a place that I can influence this change or do you think that this 
> is something not possible cuz of some internal architectural concerns or 
> similar? I am trying to mimic the AWS ELB's register and deregister features.
> BTW, is HAProxy open source? If so, is it on GitHub or somewhere else?

It most certainly is. You can find links to git repositories and other
resources at http://www.haproxy.org/

Note that the main repos are _not_ on Github, but if you prefer Github
there is an _unofficial_ clone on Github: https://github.com/haproxy/haproxy

For discussions and feature requests this mailing list is the right place.
As for this specific feature, it has been requested several times before
and is, as far as I understand, being worked on, but not with high priority.

> Thanks for getting back again, I do appreciate it!

You're welcome :)

Conrad


> Tugberk
> 
>> Subject: Re: Using socket commands to add a new server to a backend
>> To: tugb...@outlook.com; haproxy@formilux.org
>> From: con...@soundcloud.com
>> Date: Tue, 17 May 2016 12:30:47 +0200
>>
>> Hi Tugberk,
>>
>> On 05/16/2016 05:25 PM, Tugberk Ugurlu wrote:
>>> Hi,
>>> Is it possible to register a new server under a backend? This is similar to 
>>> what is being asked here: 
>>> http://haproxy.formilux.narkive.com/1OibZABp/using-haproxy-socket-to-add-new-servers
>>> I cannot see the command available under the documented list of commands: 
>>> http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#9.2
>>
>> The response from the thread you linked to still holds true.
>>
>>> I am not sure what is the best way to handle rolling out new servers and 
>>> taking down the old ones without any downtime through an automated way.
>>
>> The reload in haproxy is designed to be very-close-to-zero downtime, so
>> depending on your use case it might just be good enough. If you do opt for
>> the solution mentioned in the above thread, do keep in mind that your
>> config on disk and the state in haproxy might diverge, so your deployments
>> will have to take that into consideration.
>>
>> Cheers,
>> Conrad
>> -- 
>> Conrad Hoffmann
>> Traffic Engineer
>>
>> SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany
>>
>> Managing Director: Alexander Ljung | Incorporated in England & Wales
>> with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
>> HRB 110657B
> 
> 

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: Using socket commands to add a new server to a backend

2016-05-18 Thread Conrad Hoffmann
On 05/18/2016 12:35 AM, Tugberk Ugurlu wrote:
> Thanks!
> What do you think about this approach: 
> http://engineeringblog.yelp.com/2015/04/true-zero-downtime-haproxy-reloads.html
> Seems a bit complicated to get it right but wonder if it's worth investing 
> into.

That really depends on your needs. As the article mentions, haproxy only
loses 0.25% of requests while being restarted *10 times a second*. For
their third test (1M requests) they mention that haproxy reloaded over
1500 times during the test. As the first test had "only" 200,000 requests,
assuming 300 restarts during that test is a fair estimate. So while
hammering haproxy with requests (that's what a benchmark does), it loses
1/300th of 0.25% of requests during a single restart, or 0.0008%. By rough
estimates from the details given in the article, this amounts to 1.6
dropped requests - if you have 67,000 req/s!
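
To spell the arithmetic out (numbers from the article, rounding mine):

  0.25% of 200,000 requests   = 500 requests dropped in total
  500 dropped / ~300 restarts ≈ 1.7 dropped per restart
  1.7 / 200,000               ≈ 0.0008% of requests per restart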

I'll leave it up to you how much effort you think is worth putting into
reducing that number.

To give you some ideas, other options are having haproxy running on
multiple hosts and load balancing between these hosts, e.g. with DNS. For
restarts, shift traffic to one instance, restart the other, then vice
versa. Such a setup will save you from much more than packet loss during
restarts, e.g. hardware failure, bad deploys, etc.

Just my two cents.

Cheers,
Conrad


> Tugberk
>> Subject: Re: Using socket commands to add a new server to a backend
>> To: tugb...@outlook.com; haproxy@formilux.org
>> From: con...@soundcloud.com
>> Date: Tue, 17 May 2016 13:40:55 +0200
>>
>> On 05/17/2016 12:53 PM, Tugberk Ugurlu wrote:
>>> Thanks for getting back!
>>> Is there a place that I can influence this change or do you think that this 
>>> is something not possible cuz of some internal architectural concerns or 
>>> similar? I am trying to mimic the AWS ELB's register and deregister 
>>> features.
>>> BTW, is HAProxy open source? If so, is it on GitHub or somewhere else?
>>
>> It most certainly is. You can find links to git repositories and other
>> resources at http://www.haproxy.org/
>>
>> Note that the main repos are _not_ on Github, but if you prefer Github
>> there is an _unofficial_ clone on Github: https://github.com/haproxy/haproxy
>>
>> For discussions and feature requests this mailing list is the right place.
>> As for this specific feature, it has been requested several times before
>> and is, as far as I understand, being worked on, but not with high priority.
>>
>>> Thanks for getting back again, I do appreciate it!
>>
>> You're welcome :)
>>
>> Conrad
>>
>>
>>> Tugberk
>>>
>>>> Subject: Re: Using socket commands to add a new server to a backend
>>>> To: tugb...@outlook.com; haproxy@formilux.org
>>>> From: con...@soundcloud.com
>>>> Date: Tue, 17 May 2016 12:30:47 +0200
>>>>
>>>> Hi Tugberk,
>>>>
>>>> On 05/16/2016 05:25 PM, Tugberk Ugurlu wrote:
>>>>> Hi,
>>>>> Is it possible to register a new server under a backend? This is similar 
>>>>> to what is being asked here: 
>>>>> http://haproxy.formilux.narkive.com/1OibZABp/using-haproxy-socket-to-add-new-servers
>>>>> I cannot see the command available under the documented list of commands: 
>>>>> http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#9.2
>>>>
>>>> The response from the thread you linked to still holds true.
>>>>
>>>>> I am not sure what is the best way to handle rolling out new servers and 
>>>>> taking down the old ones without any downtime through an automated way.
>>>>
>>>> The reload in haproxy is designed to be very-close-to-zero downtime, so
>>>> depending on your use case it might just be good enough. If you do opt for
>>>> the solution mentioned in the above thread, do keep in mind that your
>>>> config on disk and the state in haproxy might diverge, so your deployments
>>>> will have to take that into consideration.
>>>>
>>>> Cheers,
>>>> Conrad
>>>> -- 
>>>> Conrad Hoffmann
>>>> Traffic Engineer
>>>>
>>>> SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany
>>>>
>>>> Managing Director: Alexander Ljung | Incorporated in England & Wales
>>>> with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
>>>> HRB 110657B
>>>   
>>>
>>
>> -- 
>> Conrad Hoffmann
>> Traffic Engineer
>>
>> SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany
>>
>> Managing Director: Alexander Ljung | Incorporated in England & Wales
>> with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
>> HRB 110657B
> 
> 

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: Setting weight through stats socket, nbproc > 1

2016-05-31 Thread Conrad Hoffmann
0,00,0,
> [r...@unknownservers.net  run]# echo show stat | nc -U hapxy.sock  |
> grep host02.unknownservers.net
> pxy_noxff,host02.unknownservers.net,0,0,0,1,,11,17448,7938,,0,,0,0,0,0,UP,20,1,0,0,0,43,0,,3,7,25,,11,,2,0,,1,L7OK,200,0,0,5,6,0,0,0,00,0,
> pxy_xff,host02.unknownservers.net,0,0,0,0,,0,0,0,,0,,0,0,0,0,UP,1,1,0,,3,8,25,,0,pxy_noxff/host02.unknownservers.net,2,0,,00,0,0,0,0,0,00,0,
> 

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



[PATCH] MINOR: dns: support advertising UDP message size.

2016-06-23 Thread Conrad Hoffmann
Hi,

attached is my shot at supporting negotiating UDP message sizes > 512 bytes
in the Haproxy DNS implementation. The default DNS size of 512 bytes can
often lead to truncated responses, which are discarded.

With the attached patch and adding "udp_msg_size 2048" to my resolvers
section, I can successfully use a hostname that yields ~100 A records in our
company network, something that did not work before.
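
For reference, the resolvers section I tested with looks roughly like this
(nameserver name and address are placeholders, of course):

  resolvers mydns
    nameserver dns1 10.0.0.1:53
    udp_msg_size 2048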

This was an approach suggested by Baptiste a long time ago, unfortunately I
got distracted with other things for a while. Nevertheless, I'd also like
to discuss

a) whether this is really a sufficient replacement for DNS over TCP support
b) why truncated responses are immediately discarded, even if they contain
   one or more records that satisfy the original request

But such discussion could optionally be moved to a separate thread.

Thanks for all the hard work,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B
From 460f879bdb669bfe0f389269113fbfbdfbec3e8f Mon Sep 17 00:00:00 2001
From: Conrad Hoffmann 
Date: Thu, 10 Sep 2015 11:52:27 +0200
Subject: [PATCH] MINOR: dns: support advertising UDP message size.

By default, DNS has a maximum UDP message size of 512 bytes. OPT records are an
RFC compliant way of advertising the client's support for larger messages.

This commit adds a new config option "udp_msg_size" to the "resolvers" section.
It takes a single integer argument. If used, the value is advertised as message
size in an OPT record sent with each query.

If unsure, 2048 is a good value to start, as it is the minimum size expected to
be handled by DNS servers supporting the EDNS(0) extensions (of which OPT
records are a part).
---
 doc/configuration.txt | 15 +++
 include/proto/dns.h   |  2 +-
 include/types/dns.h   | 11 +++
 src/cfgparse.c| 10 ++
 src/dns.c | 25 ++---
 5 files changed, 59 insertions(+), 4 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 45e2e06..0382131 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -11126,6 +11126,21 @@ timeout <event> <time>
   <time> : time related to the event. It follows the HAProxy time format.
            <time> is expressed in milliseconds.
 
+udp_msg_size <size>
+  Advertise a maximum UDP message size of <size> bytes to a server when
+  resolving names.
+  Default value: 0 (do not advertise anything)
+
+  Standard DNS uses 512 bytes as maximum UDP message size. This can be too
+  small for answers containing a lot of response records. A client can
+  advertise support for larger message sizes by including an OPT record in
+  the query, specifying a new maximum message size. Setting this to a value
+  > 0 causes haproxy to advertise this value with each query.
+
+  Note that setting this to an arbitrarily large value might not be helpful.
+  If needed, start with 2048, which is the minimum size supported by all
+  servers that support this DNS extension.
+
 Example of a resolvers section (with default values):
 
resolvers mydns
diff --git a/include/proto/dns.h b/include/proto/dns.h
index 170eefa..e121e5f 100644
--- a/include/proto/dns.h
+++ b/include/proto/dns.h
@@ -28,7 +28,7 @@
 char *dns_str_to_dn_label(const char *string, char *dn, int dn_len);
 int dns_str_to_dn_label_len(const char *string);
 int dns_hostname_validation(const char *string, char **err);
-int dns_build_query(int query_id, int query_type, char *hostname_dn, int hostname_dn_len, char *buf, int bufsize);
+int dns_build_query(int query_id, int query_type, char *hostname_dn, int hostname_dn_len, int udp_msg_size, char *buf, int bufsize);
 struct task *dns_process_resolve(struct task *t);
 int dns_init_resolvers(void);
 uint16_t dns_rnd16(void);
diff --git a/include/types/dns.h b/include/types/dns.h
index 50636fd..88f95c7 100644
--- a/include/types/dns.h
+++ b/include/types/dns.h
@@ -44,6 +44,7 @@
 #define DNS_RTYPE_A		1	/* IPv4 address */
 #define DNS_RTYPE_CNAME		5	/* canonical name */
#define DNS_RTYPE_AAAA		28	/* IPv6 address */
+#define DNS_RTYPE_OPT		41	/* OPT record */
 #define DNS_RTYPE_ANY		255	/* all records */
 
 /* dns rcode values */
@@ -85,6 +86,15 @@ struct dns_question {
 	unsigned short	qclass;		/* query class */
 };
 
+/* structure to describe a DNS OPT record */
+struct dns_opt_record {
+	unsigned short	qtype;		/* record type (41) */
+	unsigned short	qclass;		/* requestor's UDP payload size */
+	unsigned int	ttl;		/* extended RCODE and flags */
+	unsigned short	rdlen;		/* length of RDATA */
+	unsigned char	rdata[0];	/* {attribute,value} pairs */
+};
+
 /*
  * resolvers section and parameters. It is linked to the name servers
  * servers points to it.
@@ -100,6 +110,7 @@ struct dns_resolvers {
 	struct list nameser

Re: [PATCH] MINOR: dns: support advertising UDP message size.

2016-06-24 Thread Conrad Hoffmann
Hi Willy,

On 06/24/2016 03:52 PM, Willy Tarreau wrote:
> Hi Conrad,
> 
> On Thu, Jun 23, 2016 at 06:49:19PM +0200, Conrad Hoffmann wrote:
>> Hi,
>>
>> attached is my shot at supporting negotiating UDP message sizes > 512 bytes
>> in the Haproxy DNS implementation. The default DNS size of 512 bytes can
>> often lead to truncated responses, which are discarded.
>>
>> With the attached patch and adding "udp_msg_size 2048" to my resolvers
>> section, I can succesfully use a hostname that yields ~100 A records in our
>> company network, something that did not work before.
>>
>> This was an approach suggested by Baptiste a long time ago, unfortunately I
>> got distracted with other things for a while. Nevertheless, I'd also like
>> to discuss
>>
>> a) whether this is really a sufficient replacement for DNS over TCP support
>> b) why truncated responses are immediately discarded, even if they contain
>>one or more records that satisfy the original request
>>
>> But such discussion could optionally be moved to a seperate thread.
> 
> Thanks for this, it indeed sounds useful and could be even more useful
> once we manage to setup a backend using DNS responses. I'm having one
> question though : what would be the impact of enabling this all the time,
> I mean without the configuration directive ?

Yeah, I was pondering the same thing. DNS servers not capable of that
extension (very few, I think) would ignore it, so always adding the OPT
record would be safe indeed. I just wasn't sure about the value itself. I
guess always setting it rather high would be an option. I didn't quite
finish thinking about whether there are scenarios where a server sending
such a big packet might not make it through your network (e.g. if the
server sets the IP don't-fragment bit). In our setup, IP fragmentation
kicks in and it works nicely.

Maybe you have more thoughts on that? I'd be happy with either solution.

Conrad


> 
> willy
> 

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



[PATCH][RFC] CLEANUP: dns: use struct dns_header for parsing

2016-06-25 Thread Conrad Hoffmann
Hi,

while poking around in the DNS response parsing, I thought this might be a
good idea, but I'm not 100% sure about considerations that may have gone
into the existing code.

I was about to add more variables like `ancount` (e.g. `arcount`), then
realized it's all already there, in the struct dns_header. Downsides would
be having to remember to call ntohX() on integer members and heap access
vs. stack access. But I believe that's negligible. One could even keep the
`ancount` variable and just assign it once by reading the struct member, I
think the code simplification is still nice (especially the `flags` checks).

Just a thought, comments welcome.

Cheers,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B
From 842223e59a219f204848993b3613847b6019eaf1 Mon Sep 17 00:00:00 2001
From: Conrad Hoffmann 
Date: Sat, 25 Jun 2016 16:12:01 +0200
Subject: [PATCH] CLEANUP: dns: use struct dns_header for parsing

Use the already present struct dns_header as wire representation and validate
the response by checking the members of this struct.

Mainly useful to not keep adding additional variables as more parts of the
response become interesting, e.g. the additional record section for SRV
records.
---
 src/dns.c | 53 ++---
 1 file changed, 14 insertions(+), 39 deletions(-)

diff --git a/src/dns.c b/src/dns.c
index 3b3dfc5..345edfb 100644
--- a/src/dns.c
+++ b/src/dns.c
@@ -342,7 +342,8 @@ void dns_update_resolvers_timeout(struct dns_resolvers *resolvers)
 int dns_validate_dns_response(unsigned char *resp, unsigned char *bufend, char *dn_name, int dn_name_len)
 {
 	unsigned char *reader, *cname, *ptr;
-	int i, len, flags, type, ancount, cnamelen, expected_record;
+	int i, len, type, cnamelen, expected_record;
+	struct dns_header *header;
 
 	reader = resp;
 	cname = NULL;
@@ -355,11 +356,12 @@ int dns_validate_dns_response(unsigned char *resp, unsigned char *bufend, char *
 			  * of record is expected to consider the response as valid. (SRV or TXT types)
 			  */
 
-	/* move forward 2 bytes for the query id */
-	reader += 2;
-	if (reader >= bufend)
+	/* Assert response includes at least a valid header. */
+	if (reader + sizeof(struct dns_header) >= bufend)
 		return DNS_RESP_INVALID;
 
+	header = (struct dns_header*)reader;
+
 	/*
 	 * flags are stored over 2 bytes
 	 * First byte contains:
@@ -369,50 +371,23 @@ int dns_validate_dns_response(unsigned char *resp, unsigned char *bufend, char *
 	 *  - truncated (1 bit)
 	 *  - recursion desired (1 bit)
 	 */
-	if (reader + 2 >= bufend)
-		return DNS_RESP_INVALID;
-
-	flags = reader[0] * 256 + reader[1];
-
-	if (flags & DNS_FLAG_TRUNCATED)
+	if (header->tc)
 		return DNS_RESP_TRUNCATED;
 
-	if ((flags & DNS_FLAG_REPLYCODE) != DNS_RCODE_NO_ERROR) {
-		if ((flags & DNS_FLAG_REPLYCODE) == DNS_RCODE_NX_DOMAIN)
+	if (header->rcode != DNS_RCODE_NO_ERROR) {
+		if (header->rcode == DNS_RCODE_NX_DOMAIN)
 			return DNS_RESP_NX_DOMAIN;
-		else if ((flags & DNS_FLAG_REPLYCODE) == DNS_RCODE_REFUSED)
+		else if (header->rcode == DNS_RCODE_REFUSED)
 			return DNS_RESP_REFUSED;
 
 		return DNS_RESP_ERROR;
 	}
 
-	/* move forward 2 bytes for flags */
-	reader += 2;
-	if (reader >= bufend)
-		return DNS_RESP_INVALID;
-
-	/* move forward 2 bytes for question count */
-	reader += 2;
-	if (reader >= bufend)
-		return DNS_RESP_INVALID;
-
-	/* analyzing answer count */
-	if (reader + 2 > bufend)
-		return DNS_RESP_INVALID;
-	ancount = reader[0] * 256 + reader[1];
-
-	if (ancount == 0)
+	if (ntohs(header->ancount) == 0)
 		return DNS_RESP_ANCOUNT_ZERO;
 
-	/* move forward 2 bytes for answer count */
-	reader += 2;
-	if (reader >= bufend)
-		return DNS_RESP_INVALID;
-
-	/* move forward 4 bytes authority and additional count */
-	reader += 4;
-	if (reader >= bufend)
-		return DNS_RESP_INVALID;
+	/* move forward by size of header */
+	reader += sizeof(struct dns_header);
 
 	/* check if the name can stand in response */
 	if (dn_name && ((reader + dn_name_len + 1) > bufend))
@@ -442,7 +417,7 @@ int dns_validate_dns_response(unsigned char *resp, unsigned char *bufend, char *
 		return DNS_RESP_INVALID;
 
 	/* now parsing response records */
-	for (i = 1; i <= ancount; i++) {
+	for (i = 1; i <= ntohs(header->ancount); i++) {
 		if (reader >= bufend)
 			return DNS_RESP_INVALID;
 
-- 
2.9.0



[PATCH] Allow setting server port via admin socket.

2016-06-28 Thread Conrad Hoffmann
Hi,

Attached patch allows setting a server's port in addition to the address
via the admin socket, e.g.:

  set server mybackend/server-1 addr 127.0.0.1:8080

I find this already useful by itself, and furthermore this can be used (one
day) to update a server's address/port via DNS SRV lookups.

However, the implementation has a slight problem: I tried to support the
same port specifications as the server definition, i.e. absolute (:X),
relative mapping (:+X, :-X) and port forwarding (omitting the port).
On the other hand, I wanted to keep the command backwards compatible, which
means omitting the port must not change it. Thus, I ended up with the crude
workaround of requiring ':+65536' (== (ushort)0) to enable port forwarding,
which is a little ugly.

Nevertheless, I used this successfully to change ports on a server and switch
between all modes, which is quite fancy.
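
For the record, this is how I exercised it over the admin socket (the socket
path is whatever your config uses, of course; addresses are made up):

  $ echo "set server mybackend/server-1 addr 192.0.2.10:8080" | \
      socat stdio /var/run/haproxy.sock
  # relative port mapping:
  $ echo "set server mybackend/server-1 addr 192.0.2.10:+100" | \
      socat stdio /var/run/haproxy.sock
  # port forwarding, via the workaround described above:
  $ echo "set server mybackend/server-1 addr 192.0.2.10:+65536" | \
      socat stdio /var/run/haproxy.sock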

So I guess the questions would be:

1. interesting at all?
2. would it maybe be an option to ditch backwards compatibility for the sake
   of a clean implementation (probably less confusing in the long run)?
3. or are there maybe other ideas on how to gracefully handle backwards
   compatibility?

Thanks a lot,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B
From 094680a4f55870993e25636c3cef34e681a353dd Mon Sep 17 00:00:00 2001
From: Conrad Hoffmann 
Date: Wed, 29 Jun 2016 00:44:21 +0200
Subject: [PATCH] Allow setting server port via admin socket.

It is already possible to change a server's IP address, effectively switching
to a new server. However, it is not currently possible to also set the port.

This commit extends the 'set server ... addr' command to understand and change
ports as well. Like the original server definition, it supports absolute port,
relative port mapping and port forwarding. Unfortunately, to keep the command
backwards compatible (no port means keep current port), the workaround to
enable port forwarding is to specify the port as '+65336'.

As mentioned in a comment, this functionality can be used in the future to
update the address and port of a server by means of DNS SRV lookups.
---
 doc/management.txt |  8 --
 include/proto/server.h |  2 +-
 src/server.c   | 70 ++
 3 files changed, 66 insertions(+), 14 deletions(-)

diff --git a/doc/management.txt b/doc/management.txt
index caf7eb0..c7a0076 100644
--- a/doc/management.txt
+++ b/doc/management.txt
@@ -1579,8 +1579,12 @@ set rate-limit ssl-sessions global <value>
   is passed in number of sessions per second sent to the SSL stack. It applies
   before the handshake in order to protect the stack against handshake abuses.
 
-set server <backend>/<server> addr <addr>
-  Replace the current IP address of a server by the one provided.
+set server <backend>/<server> addr <addr>[:<port>]
+  Replace the current IP address and port of a server by the one provided.
+  If port is omitted, the server's current settings are used. The port syntax
+  supports relative port mapping via +<port> and -<port>. To enable port
+  forwarding, it is currently necessary to use the workaround of setting the
+  port to '+65536'.
 
set server <backend>/<server> agent [ up | down ]
   Force a server's agent to a new state. This can be useful to immediately
diff --git a/include/proto/server.h b/include/proto/server.h
index ee45f63..7e5c8c1 100644
--- a/include/proto/server.h
+++ b/include/proto/server.h
@@ -39,7 +39,7 @@ int srv_downtime(const struct server *s);
 int srv_lastsession(const struct server *s);
 int srv_getinter(const struct check *check);
 int parse_server(const char *file, int linenum, char **args, struct proxy *curproxy, struct proxy *defproxy);
-int update_server_addr(struct server *s, void *ip, int ip_sin_family, const char *updater);
+int update_server_addr(struct server *s, void *ip, int ip_sin_family, unsigned short port, int map, const char *updater);
 struct server *server_find_by_id(struct proxy *bk, int id);
 struct server *server_find_by_name(struct proxy *bk, const char *name);
 struct server *server_find_best_match(struct proxy *bk, char *name, int id, int *diff);
diff --git a/src/server.c b/src/server.c
index 1095754..7390ecb 100644
--- a/src/server.c
+++ b/src/server.c
@@ -817,17 +817,46 @@ const char *server_parse_weight_change_request(struct server *sv,
 const char *server_parse_addr_change_request(struct server *sv,
  const char *addr_str, const char *updater)
 {
-	unsigned char ip[INET6_ADDRSTRLEN];
+	void *ip;
+	unsigned short port;
+	int high, low, map;
+	char *fqdn, *err;
+	struct sockaddr_storage *sa;
+
+	err = fqdn = NULL;
+	sa = str2sa_range(addr_str, &low, &high, &err, NULL, &fqdn, 0);
+	// Make sure parsing succeded and the address is numer

Re: [PATCH] Allow setting server port via admin socket.

2016-07-04 Thread Conrad Hoffmann
Hi Baptiste,

On 07/03/2016 09:09 PM, Baptiste wrote:
> Hi Conrad,
> 
> I have a few pending patches to cover this change.
> Your current implementation does not cover all the cases and breaks
> some implementation (when the server port is related the frontend's
> port and the need for the health check port to be set).
> 
> I'll submit my patches to Willy soon.
> Part of this patch allows changing the port through the socket like you did.

this is great news, thanks!

Conrad


> 
> Baptiste
> 
> 
> 
> On Wed, Jun 29, 2016 at 1:08 AM, Conrad Hoffmann  
> wrote:
>> Hi,
>>
>> Attached patch allows setting a server's port in addition to the address
>> via the admin socket, e.g.:
>>
>>   set server mybackend/server-1 addr 127.0.0.1:8080
>>
>> I find this already useful by itself, and furthermore this can be used (one
>> day) to update a server's address/port via DNS SRV lookups.
>>
>> However, the implementation has a slight problem: I tried to support the
>> same port specifications as the server definition, i.e. absolute (:X),
>> relative mapping (:+X, :-X) and port forwarding (omitting the port).
>> On the other hand, I wanted to keep the command backwards compatible, which
>> means omitting the port must not change it. Thus, I ended up with the crude
>> workaround of requiring ':+65536' (== (ushort)0) to enable port forwarding,
>> which is a little ugly.
>>
>> Nevertheless, I used this successfully to change ports on a server and switch
>> between all modes, which is quite fancy.
>>
>> So I guess the questions would be:
>>
>> 1. interesting at all?
>> 2. would it maybe be an option to ditch backwards compatibility for the sake
>>of a clean implementation (probably less confusing in the long run)?
>> 3. or are there maybe other ideas on how to gracefully handle backwards
>>compatibility?
>>
>> Thanks a lot,
>> Conrad
>> --
>> Conrad Hoffmann
>> Traffic Engineer
>>
>> SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany
>>
>> Managing Director: Alexander Ljung | Incorporated in England & Wales
>> with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
>> HRB 110657B

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: Reflexions around timers in the logs

2016-08-23 Thread Conrad Hoffmann
, as the dates will now reflect
> the start of receipt of requests and not the end of last response, and the
> other ones will not be harmed by any change.
> 
> Any opinion on this is welcome. Note: the main work is done, it's only a
> matter of defining what we want to see in the logs, it's easy to adapt
> once we know what we want. BTW, I'm also thinking about adding a field to
> indicate if a request is the first one of a connection or not.
> 
> I'm attaching Thierry's revised patch for reference (the one implementing
> my initial proposal that I'm not happy with anymore).
> 
> Thanks,
> Willy
> 

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: [PATCH] New DNS parser

2016-09-09 Thread Conrad Hoffmann
Hi Baptiste,

I'm super excited :) Unfortunately, I'll be on vacation for three weeks,
beginning this weekend, so I won't be able to provide any feedback before I
get back. But I will give it a test ride immediately after.

Thanks so much,
Conrad

On 09/08/2016 09:50 PM, Baptiste wrote:
> Hi all,
> 
> Please find in attachment 10 patches to cover the following new topic in
> HAProxy:
> 
> 1. a new DNS parser, which stores the DNS response into a DNS structure,
> instead of manipulating a buffer.
> => it doesn't add any feature by itself, but it will make DNS consumer
> tasks much easier when using DNS responses
> 
> 2. when the DNS response finishes with a CNAME, now HAProxy sends a new
> query, changing the query type (from  to A or A to )
> 
> I heavily tested the code, but I'd like more people to test it in their own
> environment.
> 
> We can now move forward on the next big development: filling servers in a
> backend based on records read in a DNS responses.
> 
> Conrad: I have a quick and dirty and not finished patch to read and store
> SRV records. If you want to use it for your own dev, please let me know.
> 
> Baptiste
> 

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: SSL/ECC and nbproc >1

2016-10-21 Thread Conrad Hoffmann
.
> 
> So then I started "ab" parallel from each and it was going down to about
> ~4xx requests/s for ECC on each node which is far below the ~1500 (single
> proc) or ~1300 (multi proc) which is much more than I expected tbh. I
> thought it would scale much better for up to nbproc and getting worse when
>>nbproc. I did some basic checks to figure out the reason/bottleneck and to
> me it looks like a lot of switch/epoll_wait. In (h)top it shows that
> ksoftirqd + one haproxy proc is burning 100% cpu of a single core, it's not
> distributed above multiple cores. I'm not sure yet whether it's related to
> the SSL part, HAProxy or some Kernel foo. HTTP performs better. ~27k total
> on localhost, ~5400 single ab via EC2 and still ~2100 per EC2 with a total
> of 15 instances - and the HTTP proc is just a single proc!
> 
> So I wonder what's the reason for the single blocking core. Is that the
> reason for the rather poor performance because it has an impact on any of
> those processes? Can we distribute that onto multiple cores/process as
> well? Any ideas?
> 
> Oh, and I was using 1.6.5:
> HA-Proxy version 1.6.5 2016/05/10
> Copyright 2000-2016 Willy Tarreau 
> 
> Build options :
>   TARGET  = linux2628
>   CPU = generic
>   CC  = gcc
>   CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement
>   OPTIONS = USE_LIBCRYPT=1 USE_ZLIB=1 USE_OPENSSL=1 USE_PCRE=1
> 
> Default settings :
>   maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
> 
> Encrypted password support via crypt(3): yes
> Built with zlib version : 1.2.7
> Compression algorithms supported : identity("identity"),
> deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
> Built with OpenSSL version : OpenSSL 1.0.1e 11 Feb 2013
> Running on OpenSSL version : OpenSSL 1.0.1t  3 May 2016 (VERSIONS DIFFER!)
> OpenSSL library supports TLS extensions : yes
> OpenSSL library supports SNI : yes
> OpenSSL library supports prefer-server-ciphers : yes
> Built with PCRE version : 8.30 2012-02-04
> PCRE library supports JIT : no (USE_PCRE_JIT not set)
> Built without Lua support
> Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT
> IP_FREEBIND
> 
> Available polling systems :
>   epoll : pref=300,  test result OK
>poll : pref=200,  test result OK
>  select : pref=150,  test result OK
> Total: 3 (3 usable), will use epoll.
> 
> I actually thought I was using 1.6.9 on that host already so I just
> upgraded and tried again some benchmarks but it looks like it's almost
> equal at the first glance.
> 

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Interpreting ttime, rtime & friends correctly

2015-02-05 Thread Conrad Hoffmann
Hello everybody,

I have wondered this several times before and searched the mailing list
to no avail, so I thought I'd just go ahead and ask:

Can somebody shed some more light on how to read the ttime metric that
haproxy supplies?

I am confident I understand qtime, ctime and rtime, but at least ttime I
don't understand. The docs say:

> 61. ttime [..BS]: the average total session time in ms over the 1024
> last requests

Now, we are not using any session tracking at all. When I read this
value for a backend, how are the 1024 last requests mapped to session
times? Does one request map to one session?
If so, I am very concerned about how high the ttime values are compared
to rtime for some of our backends. What other factors, besides [qcr]time,
are making up a session's ttime?
If not, wouldn't some of the last 1024 requests belong to sessions that
are not yet over? If so, is any value for these sessions used to
calculate the final metric?

Any insights greatly appreciated,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: Server IP resolution using DNS in HAProxy

2015-07-17 Thread Conrad Hoffmann
Hello,

On 07/14/2015 10:11 PM, Baptiste wrote:
> 
> I know the message above is very long, but we really need your feedback!
> 

First, many thanks for tackling this! It surely makes many people's lives
much easier. Reading this makes me want two things, one of them being a
little "not-haproxy-like" maybe.
First would be resolution of SRV records and actually using the port
supplied by the SRV record as the port for the server. I looked at the code
and it doesn't seem like too much work, most of it would probably be
changing the config stuff accordingly.
The other one is... well you asked for it ;) so here it goes: it would be
great to express in the config something like "resolve this name and use up
to X servers from the reply". The backend then allocates X servers.
Assuming that the initial lookup returns Y results, the (sorted) records
get assigned to the first Y servers, the other X-Y servers get marked as
down. Upon a new lookup, same procedure for a potentially changing value of Y.
I realize this is a pretty bold feature request for several reasons, but I
have actually spent some thought on it and think it might be doable without
really violating any of HAProxy's design paradigms. I would also be willing
to invest some time (code) into this myself.
If you think this might be at least worth a discussion, I'd be happy to
share some more detailed thoughts and it would be great to hear your
thoughts on that, too.

Cheers,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: Server IP resolution using DNS in HAProxy

2015-07-18 Thread Conrad Hoffmann
On 07/17/2015 10:37 PM, Baptiste wrote:
>> First would be resolution of SRV records and actually using the port
>> supplied by the SRV record as the port for the server. I looked at the code
>> and it doesn't seem like too much work, most of it would probably be
>> changing the config stuff accordingly.
> 
> You're right, this could be an interesting option.
> Actually, we though about the SRV records to populate a full backend
> server list with IP, name, port, weight using a single DNS query.

Awesome! That's exactly what I want :)
It's what I tried to express in the second part of my request, probably in
a slightly weird way.

>> The other one is... well you asked for it ;) so here it goes: it would be
>> great to express in the config something like "resolve this name and use up
>> to X servers from the reply". The backend then allocates X servers.
>> Assuming that the initial lookup returns Y results, the (sorted) records
>> get assinged to the first Y servers, the other X-Y servers get marked as
>> down. Upon a new lookup, same procedure for a potentially changing value of 
>> Y.
>> I realize this a pretty bold feature request for several reasons, but I
>> have actually spent some thought on it think it might be doable without
>> really violating any of HAProxy's design paradigms. I would also be willing
>> to invest some time (code) into this myself.
>> If you think this might be at least worth a discussion, I'd be happy to
>> share some more detailed thoughts and it would be great to hear your
>> thoughts on that, too.
> 
> First, there are some limitations about creating servers on the fly in
> a backend.
> So instead, we propose you to pre-allocate servers by configuration
> and then wait for the DNS to populate it.

It's kind of what I meant, although I had something in mind where you say
once "preallocate 20 servers for this backend" instead of listing 20
servers with the same hostname. But that's syntactic sugar and I would
happily use the solution listing a server for each desired allocation.

> I don't speak about a request per server, I speak here about one
> request for the whole farm :)

Again, awesome! That's what I'm looking for!

> You go one step further than the design we have about SRV records to
> populate the backend.
> We thought using priority to decide whether a server is active or backup.
> The advantage is that you don't need to reload HAProxy to change your X value 
> ;)

Yes, perfect. And, as I tried to express above, HAProxy should mark some
servers down if the number of records received is less than the number of
servers configured.

> I would welcome a contribution about SRV record type.
> That said, before this, I have to rewrite part of the response parser
> to store the response in a real DNS packet structure instead of
> keeping data in a buffer.

Sounds good. I am not sure what you have in mind there or what parts of the
code would be touched by that, so if you let me know when it's done I'll
happily send some patches as a base for further discussion!

Thanks, Baptiste, this gets me even more excited about HAProxy than I am
already :)

Cheers,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: use_server

2015-07-29 Thread Conrad Hoffmann
Hi,

On 07/29/2015 01:07 PM, mlist wrote:
> We have 3 backend servers balanced with haproxy with "cookie insert" option 
> and ssl redirection.
> For our help desk, we need operators can access a specific backend server to 
> check specific server when we have problems on this backend server.
> 
> We try to do that with the following section, so no hosts file nor DNS 
> mapping and special binding on backend servers are needed to access specific 
> backend server, maintaining the right cookie after use_server.
> 
> We have 3 backend servers: web1, web4 and web10. We want to go on:
> web1 if in the URL haproxy find " aps_ass_UNIQUE_web1_"  -> ex: 
> http:///app1/aps_ass_UNIQUE_web1_
> web4 if in the URL haproxy find " aps_ass_UNIQUE_web4_"  -> ex: 
> http:///app1/aps_ass_UNIQUE_web4_
> web4 if in the URL haproxy find " aps_ass_UNIQUE_web10_"  -> ex: 
> http:///app1/aps_ass_UNIQUE_web10_
> 
> Following configuration does not work, can you help to identify a solution ?
> 
>acl aps_ass_web1 path_reg (.*)\/aps_ass_UNIQUE_web1_(.*)

I don't think the "/" should be escaped, so basically just:

acl aps_ass_web1 path_reg (.*)/aps_ass_UNIQUE_web1_(.*)

Also, while I am not really an expert on this, I think this is a pretty
expensive regex, since the .* will initially match the whole path, then has
to back-search to match the end. If the path pattern is always like the one
you mentioned above you might be better off with something like
(/[^/]+)/aps_ass_UNIQUE_web1_(.*) (that's off the top of my head, no
guarantees for correctness). Might not be worth worrying about if you don't
have performance issues, though.

Hope that helps,
Conrad


>acl aps_ass_web4 path_reg (.*)\/aps_ass_UNIQUE_web4_(.*)
>acl aps_ass_web10 path_reg (.*)\/aps_ass_UNIQUE_web10_(.*)
>reqirep (.*)\/aps_ass_UNIQUE_web[0-9]*_(.*) \1\2
>use-server web1 if aps_ass_web1
>use-server web4 if aps_ass_web4
>use-server web10 if aps_ass_web10
> 
> Thank you.
> 
> Roberto
> 
> 
> -Original Message-
> From: Cyril Bonté [mailto:cyril.bo...@free.fr] 
> Sent: martedì 28 luglio 2015 10.24
> To: mlist
> Cc: haproxy@formilux.org
> Subject: Re: Regex
> 
> Hi again,
> 
> On 28/07/2015 09:42, mlist wrote:
>>
>> I wrote in the first mail I used 2 different regex. The (1) probably doesn't 
>> match, but the (2) ?
>>
>>> (2) reqirep (.*)\/demoj(.*) \1/DemoJ\2
>>
>> This regex match https://, I'm Wrong ?
> 
> Good news for you, at this step, no, you are perfectly right (except 
> that there is an unneeded antislash before /demoj).
> 
> And I realize that you are also right about something unclear in the 
> documentation : headers transformations only apply when the request line 
> doesn't match. This is quite well documented in the code, not in the 
> documentation... Here comes the confusion on your side and mine.
> 
> This means that whatever you do, if your request line contains "/demoj"
> a regex like (.*)/demoj(.*) will never apply to headers.
> You have to sepcify 2 different regex which will ensure that will match 
> only one case at a time (request line OR headers).
> 
> But I still don't understand why you want to modify the referer.
> 

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: appending characters to a custom field without intervening spaces

2015-08-10 Thread Conrad Hoffmann
Hey Jose,

to save you some of the pain I just went through, I would like to add the
following:

%Ts does not necessarily give you the time of the processing of the
request, which is what you would want for the typical X-Request-Start
header (are you using NewRelic or a similar service?). This is very hard to
deduce from the documentation, but as far as I can tell it is in fact:

 - the time of the accept() call for the connection when this is the first
   request on this connection (which is usually acceptable, but might not
   be sometimes)
 - something like the end time of the previous request for any subsequent
   requests on that connection.

Thus, if you have long-lived connections that are not constantly churning
out requests, the value of %Ts might significantly deviate from what you
might expect. Maybe Willy can shed some light on when exactly the value of
%Ts is set, but I encourage you to do a small test before trusting it (I used
a simple telnet session that I pasted some requests into, with a pause in
between).
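
Something along these lines is enough (host/port are whatever your frontend
listens on):

  $ telnet my-haproxy 80
  GET / HTTP/1.1
  Host: example.com

  (wait ~10 seconds, paste the same request again, and compare the
  X-Request-Start values that arrive at the backend)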

In the end I resorted to calling date(), which only has second resolution
but is at least never off by more than a second.

Hope that helps,
Conrad

On 08/10/2015 06:24 PM, Jose Nunez wrote:
> Hi Willy,
> 
> thanks for your answer, it works perfectly, thanks!
> 
> Jose
> 
> On Sun, Aug 9, 2015 at 4:55 AM, Willy Tarreau  wrote:
> 
>> Hi Jose,
>>
>> On Fri, Aug 07, 2015 at 01:28:13PM -0400, Jose Nunez wrote:
>>> Hi,
>>>
>>> I need to express something similar to this:
>>>
>>> http-request set-header X-REQUEST-START  t=%[Ts]%[ms]000
>>>
>>> (to append three "0"s at the end of the timestamp with milliseconds).
>>>
>>> I have tried with other ways to append the three "0"s at the end:
>>>
>>> http-request set-header X-REQUEST-START  t=%Ts%[ms]\x30\x30\x30
>>>
>>> and
>>>
>>> http-request set-header X-REQUEST-START  t=%Ts%ms\x30\x30\x30
>>>
>>> and no avail either.
>>
>> You've met the limits of the log format which is not a language and
>> which requires some delimiters to be detected. Unfortunately it doesn't
>> have any delimiter which doesn't appear in the output. I found a way to
>> abuse it using %[] to mark a new word (since %[] detects the end of the
>> current block using the closing bracket). Using just "%[]" emits a warning
>> and does exactly what you want. A cleaner method in 1.6 consists in
>> emitting
>> an empty string as a delimitor : %[str()]. In 1.5, there is no str(), but
>> you can use %[env()] which will retrieve the contents of the environment
>> variable with no name, it doesn't exist so it returns an empty string. Yes
>> I know that's ugly, but the log format has gone far beyond its design goals
>> already!
>>
>> Thus it will give you this in 1.5 :
>>
>>  http-request set-header X-REQUEST-START  t=%Ts%ms%[env()]000
>>
>> In 1.6 you can also do that :
>>
>>  http-request set-header X-REQUEST-START  t=%Ts%ms%[str(000)]
>>
>> Also, please note that what you're doing above only works because %ms is
>> left-padded with zeroes. I'm not seeing this documented anywhere though.
>>
>> Willy
>>
>>
> 

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Fix triggering of runtime DNS resolution?

2015-09-02 Thread Conrad Hoffmann
Hello,

it's kind of late and I am not 100% sure I'm getting this right, so would
be great if someone could double-check this:

Essentially, the runtime DNS resolution was never triggered for me. I
tracked this down to a signed/unsigned problem in the usage of
tick_is_expired() from checks.c:2158.

curr_resolution->last_resolution is being initialized to zero
(server.c:981), which in turn makes it only a few thousand after the value
of hold.valid is added (also checks.c:2158). It is then compared to now_ms,
which is an unsigned integer so large that it is out of the signed integer
range. Thus, the comparison will not get the expected result, as it is done
on integer values (now_ms cast to integer gave e.g. -1875721083 a few
minutes ago, which is undeniably smaller than 3000).
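
A tiny standalone illustration of the cast effect (this is not haproxy's
tick logic, just the signed/unsigned behaviour described above):

  #include <stdio.h>

  int main(void)
  {
      unsigned int now_ms = 2419246213u; /* larger than INT_MAX */
      int deadline = 3000;

      /* prints -1875721083: seen as an int, now_ms is suddenly
       * "smaller" than the deadline, so expiry checks misfire */
      printf("now_ms as int: %d\n", (int)now_ms);
      printf("now_ms > deadline: %s\n",
             (int)now_ms > deadline ? "yes" : "no");
      return 0;
  }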

One way to fix this is to initialize curr_resolution->last_resolution to
now_ms instead of zero (attached "patch"), but then it only works because
both values are converted to negative integers. While I think that this
will reasonably hide the problem for the time being, I do think there is a
deeper problem here, which is the frequent passing of an unsigned integer
into a function that takes signed int as argument.

I see that tick_* is used all over the place, so I thought I would rather
consult someone before spending lots of time creating a patch that would
not be used. Also, I would need some more time to actually figure out what
the best solution would be.

Does anyone have any thoughts on this? Is someone maybe already aware of this?

Thanks a lot,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B
diff --git a/src/server.c b/src/server.c
index f3b0f16..e88302b 100644
--- a/src/server.c
+++ b/src/server.c
@@ -978,7 +978,7 @@ int parse_server(const char *file, int linenum, char **args, struct proxy *curpr
 			curr_resolution->status = RSLV_STATUS_NONE;
 			curr_resolution->step = RSLV_STEP_NONE;
 			/* a first resolution has been done by the configuration parser */
-			curr_resolution->last_resolution = 0;
+			curr_resolution->last_resolution = now_ms;
 			newsrv->resolution = curr_resolution;
 
  skip_name_resolution:


Re: Fix triggering of runtime DNS resolution?

2015-09-03 Thread Conrad Hoffmann
Hi Baptiste (and others),

I can confirm that the two patches applied make this work as expected for
me (first resolution at first health check, then not again until hold.valid
has elapsed). Thanks a lot!

I am still wondering if the signature of tick_* shouldn't also be changed
to unsigned int? I took another look today and it seems like it almost
exclusively used with unsigned int values (mostly derived from now_ms), and
I would expect a tick to be unsigned anyways.

I think having the interface as it is just unnecessarily invites bugs like
that one. Would be willing to take a stab at that, too, if so desired...
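
I.e. roughly this kind of prototype change (sketch only, assuming the
current signatures take plain ints, as described above):

  -static inline int tick_is_expired(int timer, int now);
  +static inline int tick_is_expired(unsigned int timer, unsigned int now);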

Anyways, thanks for all the fish,
Conrad

On 09/03/2015 11:50 AM, Baptiste wrote:
> Hi Conrad,
> 
> Please use the two patches in attachement.
> 
> Baptiste
> 

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



DNS: defaulting resolve-prefer to ipv6 can lead to unexpected results

2015-09-06 Thread Conrad Hoffmann
Hi,

I ran into the following problem: I have a server specified by hostname,
using the new DNS feature. I initially did not specify the "resolve-prefer"
parameter. The initial lookup of the server succeeded and produced an IPv4
address. Unfortunately, that address was then never updated because of the
following situation:

 - In server.c:1034, resolve-prefer silently defaults to ipv6
 - The DNS server gave no records in response to ANY query
 - Resolvers check resolve-prefer, try again with an AAAA query
 - Query yields no results

This was a little unexpected, because the initial resolution works just
fine. I think there are at least two possible options to improve this
behaviour:

 - Document the defaulting to ipv6 and only allow ipv6 for the initial
   lookup as well (in my scenario, this would have led to failure to
   start, leading to me adding "resolve-prefer ipv4" as shown further
   below, an acceptable solution)
 - Do not default to ipv6, leave as unspecified instead. If ANY query
   doesn't produce results, check current address type if no resolve-
   prefer is specified

The attached patch is merely to demonstrate the latter solution. It worked
for me, but I didn't check too hard if leaving resolver_family_priority set
to AF_UNSPEC might lead to other problems elsewhere.
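
For completeness, the server line with the workaround looks like this
(server name, hostname and resolvers name are made up):

  server app1 app.example.com:80 check resolvers mydns resolve-prefer ipv4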

Maybe there are even other/better solutions?

Regards,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B
diff --git a/src/server.c b/src/server.c
index f3b0f16..02085ec 100644
--- a/src/server.c
+++ b/src/server.c
@@ -1030,8 +1030,6 @@ int parse_server(const char *file, int linenum, char **args, struct proxy *curpr
 			newsrv->agent.health	= newsrv->agent.rise;	/* up, but will fall down at first failure */
 			newsrv->agent.server	= newsrv;
 			newsrv->resolver_family_priority = curproxy->defsrv.resolver_family_priority;
-			if (newsrv->resolver_family_priority == AF_UNSPEC)
-				newsrv->resolver_family_priority = AF_INET6;
 
 			cur_arg = 3;
 		} else {
@@ -2138,8 +2136,14 @@ int snr_resolution_error_cb(struct dns_resolution *resolution, int error_code)
 /* let's change the query type */
 if (resolution->resolver_family_priority == AF_INET6)
 	resolution->query_type = DNS_RTYPE_AAAA;
-else
+else if (resolution->resolver_family_priority == AF_INET)
 	resolution->query_type = DNS_RTYPE_A;
+else if (resolution->resolver_family_priority == AF_UNSPEC) {
+	if (s->addr.ss_family == AF_INET)
+		resolution->query_type = DNS_RTYPE_A;
+	else
+		resolution->query_type = DNS_RTYPE_AAAA;
+}
 
 dns_send_query(resolution);
 


Build failure in current master HEAD

2015-09-11 Thread Conrad Hoffmann
Hi,

I cannot build the current master HEAD of the dev branch (ec3c37d) because
of this error:

> In file included from include/proto/proto_http.h:26:0,
>  from src/stick_table.c:26:
> include/types/action.h:102:20: error: field ‘re’ has incomplete type
> struct my_regex re; /* used by replace-header and replace-value */
> ^
> Makefile:771: recipe for target 'src/stick_table.o' failed
> make: *** [src/stick_table.o] Error 1

The struct act_rule defined in action.h embeds a full struct my_regex
without #include-ing regex.h. Neither gcc 5.2.0 nor clang 3.6.2 accepts this.
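
The obvious fix would seem to be to pull in the definition, along these
lines (untested, and assuming struct my_regex is defined in common/regex.h
like the rest of the regex helpers):

/* include/types/action.h */
#include <common/regex.h>	/* for struct my_regex */

An alternative would be to forward-declare struct my_regex and store only
a pointer in struct act_rule, but that would ripple through all users of
the struct.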

Cheers,
Conrad
-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B



Re: Haproxy & Kubernetes, dynamic backend configuration

2015-09-18 Thread Conrad Hoffmann
If I may chime in here: Kubernetes supports service discovery through DNS
SRV records for most use-cases, so the dynamic DNS support that Baptiste is
currently working on would be a perfect fit. No special API support required.
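
For illustration, resolving the endpoints of a service then boils down to
a plain SRV lookup (service name, port name, namespace and cluster domain
below are placeholders following the kubernetes DNS naming scheme):

dig +short SRV _http._tcp.my-service.my-namespace.svc.cluster.local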

Cheers,
Conrad

On 09/18/2015 01:53 PM, Smain Kahlouch wrote:
> Hi Baptiste,
> 
> How do you do? :)
> 
> Anyway, my concern is more about changing parameters dynamically, by
> fetching a source (an API) at regular intervals, than about kubernetes
> itself.
> 
> Actually, I only have one need currently: the list of servers that are
> part of a backend.
> This is the only thing I want to change dynamically; the other
> parameters can be set at startup.
> When the server is configured for the first time, I use Ansible to fetch
> the API and retrieve the necessary information.
> 
> I look forward to the new way to achieve that.
> 
> Regards,
> Smana
> 
> 2015-09-18 13:31 GMT+02:00 Baptiste :
> 
>> On Fri, Sep 18, 2015 at 10:49 AM, Smain Kahlouch 
>> wrote:
>>> Hello all,
>>>
>>> I guess this question has been posted many times, but maybe there is
>>> some new way to achieve this.
>>>
>>> I'm currently testing kubernetes with calico, and I configured fixed
>>> load balancing using kubernetes' "NodePort" load balancing.
>>>
>>> But I wanted to bypass that second, kubernetes-internal load
>>> balancing. The idea would be to load balance directly to the pods
>>> instead of to a kubernetes service address.
>>> To do so, I found the vulcan load balancer, which seems to be well
>>> suited for dynamic configuration. This documentation describes how it
>>> works.
>>>
>>> Is there a way to achieve the same behaviour: listen to the API and
>>> change backends dynamically?
>>>
>>> Thanks for your help,
>>>
>>> Regards,
>>> Smana
>>>
>>
>>
>> Hey Smaine,
>>
>> I'm totally lost with all your buzz keywords!
>>
>> There is currently no way to achieve this.
>> That said, we're aware of this type of requirement and are thinking
>> about different methods to achieve this goal.
>>
>> That said, could you please list here which HAProxy parameters you
>> would like to see dynamically changeable at run time?
>>
>> Baptiste
>>
> 

-- 
Conrad Hoffmann
Traffic Engineer

SoundCloud Ltd. | Rheinsberger Str. 76/77, 10115 Berlin, Germany

Managing Director: Alexander Ljung | Incorporated in England & Wales
with Company No. 6343600 | Local Branch Office | AG Charlottenburg |
HRB 110657B