Re: Connections stuck in CLOSE_WAIT state with h2

2018-05-24 Thread Willy Tarreau
On Thu, May 24, 2018 at 11:20:13PM +0200, Janusz Dziemidowicz wrote:
> 2018-05-24 22:26 GMT+02:00 Willy Tarreau :
> >> This kinda seems like the socket was closed on the writing side, but
> >> the client has already sent something and everything is stuck. I was
> >> not able to reproduce the problem by myself. Any ideas how to debug
> >> this further?
> >
> > For now not much comes to my mind. I'd be interested in seeing the
> > output of "show fd" issued on the stats socket of such a process (it
> > can be large, be careful).
> 
> Will do tomorrow. Forgot to mention: apart from this issue, everything
> seems to work fine. No user reports any problem. Obviously it consumes
> more and more memory, so I can only enable h2 for an hour or two to avoid
> problems.

Great, thank you.

> I've seen this issue also in 1.8.8, which was the first version I've
> used after 1.7.x.

That's pretty useful. I'm realizing that haproxy.org isn't totally
up to date (1.8.5+) so I'll upgrade it to see. If it starts to fail,
it will indicate a regression between 1.8.5 and 1.8.8 (which contain
quite a few h2 fixes).

> My actual config is a bit more complicated (multiple
> processes per socket, some stats, etc.), but I've been stripping it
> down and down and what I've attached is still producing this issue for
> me.

OK that's a good job, thanks for doing this.

> Anyway, I'll do another round of experiments (without tfo) tomorrow.

Much appreciated, thank you.

Willy



Re: Connections stuck in CLOSE_WAIT state with h2

2018-05-24 Thread Janusz Dziemidowicz
2018-05-24 22:26 GMT+02:00 Willy Tarreau :
>> This kinda seems like the socket was closed on the writing side, but
>> the client has already sent something and everything is stuck. I was
>> not able to reproduce the problem by myself. Any ideas how to debug
>> this further?
>
> For now not much comes to my mind. I'd be interested in seeing the
> output of "show fd" issued on the stats socket of such a process (it
> can be large, be careful).

Will do tomorrow. Forgot to mention: apart from this issue, everything
seems to work fine. No user reports any problem. Obviously it consumes
more and more memory, so I can only enable h2 for an hour or two to avoid
problems.

>> haproxy -vv (Debian package rebuilt on stretch with USE_TFO):
>
> Interesting, and I'm seeing "tfo" on your bind line. We don't have it
> on haproxy.org. Could you please re-test without it, just in case ?
> Maybe you're receiving SYN+data+FIN that are not properly handled.

I've spent some time tweaking several settings already. I believe I've
checked without tfo and there was no difference. Will repeat that
tomorrow to be sure.

>> HA-Proxy version 1.8.9-1~tsg9+1 2018/05/21
>
> Is 1.8.9 the first version you tested or is it the first one you saw
> the issue on, or did you notice the issue on another 1.8 version ? If
> it turned out to be a regression it could be easier to spot in fact.
>
> Your config is very clean and shows nothing suspicious at all. Thus at
> first knowing if tfo changes anything would be a good start.

I've seen this issue also in 1.8.8, which was the first version I've
used after 1.7.x. My actual config is a bit more complicated (multiple
processes per socket, some stats, etc.), but I've been stripping it
down and down and what I've attached is still producing this issue for
me.

Anyway, I'll do another round of experiments (without tfo) tomorrow.

-- 
Janusz Dziemidowicz



Re: remaining process after (seamless) reload

2018-05-24 Thread William Dauchy
Hi William,

Thank you for your reply.

On Thu, May 24, 2018 at 12:01:38PM +0200, William Lallemand wrote:
> I managed to reproduce something similar with the 1.8.8 version. It looks like
> leaving a socat connected to the socket helps.
>
> I'm looking into the code to see what's happening.

Indeed, after some more hours, I got the same issue on v1.8.8. However it
seems to be easier to reproduce in v1.8.9, but I might be wrong.
So now I bet on either a thread issue or bind with reuseport.
I'll try to do some more tests.

Best,
-- 
William



Re: Connections stuck in CLOSE_WAIT state with h2

2018-05-24 Thread Willy Tarreau
Hi Janusz,

On Thu, May 24, 2018 at 01:49:52PM +0200, Janusz Dziemidowicz wrote:
> Recently I've moved several servers from haproxy 1.7.x to 1.8.x. I have
> a setup with nghttpx handling h2 (haproxy connects to nghttpx via unix
> socket which handles h2 and connects back to haproxy with plain
> http/1.1 also through unix socket).
> 
> After the upgrade I wanted to switch to native h2 supported by
> haproxy. Unfortunately, it seems that over time haproxy is
> accumulating sockets in CLOSE_WAIT state. Currently, after 12h I have
> 5k connections in this state. All of them have non-zero Recv-Q and
> zero Send-Q. netstat -ntpa shows something like this:
> 
> tcp1  0 IP:443  IP:28032  CLOSE_WAIT  115495/haproxy
> tcp   35  0 IP:443  IP:49531   CLOSE_WAIT  115495/haproxy
> tcp  507  0 IP:443  IP:31938 CLOSE_WAIT  115495/haproxy
> tcp  134  0 IP:443  IP:49672  CLOSE_WAIT  115495/haproxy
> tcp  732  0 IP:443  IP:3180   CLOSE_WAIT  115494/haproxy
> tcp  746  0 IP:443  IP:39731  CLOSE_WAIT  115494/haproxy
> tcp   35  0 IP:443  IP:62986  CLOSE_WAIT  115495/haproxy
> tcp  585  0 IP:443  IP:51318 CLOSE_WAIT  115493/haproxy
> tcp  100  0 IP:443  IP:60449 CLOSE_WAIT  115493/haproxy
> tcp   35  0 IP:443  IP:1274  CLOSE_WAIT  115494/haproxy
> ..

I have never managed to see this happen yet. Even haproxy.org uses H2 and I've
just checked on the server: zero CLOSE_WAIT. What is strange is that they
all have pending data; it means the clients sent some data and closed. It could
correspond to a timeout where the client finally closed after not receiving a
response.

> Those are all frontend connections. Reloading haproxy removes those
> connections, but only after hard-stop-after kicks in and old processes
> are killed. Disabling native h2 support and switching back to nghttpx
> makes the problem disappear.

OK.

> This kinda seems like the socket was closed on the writing side, but
> the client has already sent something and everything is stuck. I was
> not able to reproduce the problem by myself. Any ideas how to debug
> this further?

For now not much comes to my mind. I'd be interested in seeing the
output of "show fd" issued on the stats socket of such a process (it
can be large, be careful).

> haproxy -vv (Debian package rebuilt on stretch with USE_TFO):

Interesting, and I'm seeing "tfo" on your bind line. We don't have it
on haproxy.org. Could you please re-test without it, just in case ?
Maybe you're receiving SYN+data+FIN that are not properly handled.

> HA-Proxy version 1.8.9-1~tsg9+1 2018/05/21

Is 1.8.9 the first version you tested or is it the first one you saw
the issue on, or did you notice the issue on another 1.8 version ? If
it turned out to be a regression it could be easier to spot in fact.

Your config is very clean and shows nothing suspicious at all. Thus at
first knowing if tfo changes anything would be a good start.

Thanks!
Willy



Re: [PATCH] BUG/MEDIUM: stick-tables: Decrement ref_cnt in table_* converters

2018-05-24 Thread Willy Tarreau
Hi Daniel,

On Thu, May 17, 2018 at 02:05:28PM -0400, Daniel Corbett wrote:
> Hello,
> 
> When using table_* converters ref_cnt was incremented
> and never decremented causing entries to not expire.
> 
> The root cause appears to be that stktable_lookup_key()
> was called within all sample_conv_table_* functions which was
> incrementing ref_cnt and not decrementing after completion.
> 
> Added stktable_release() to the end of each sample_conv_table_*
> function.

Interesting one! However it's not correct, it doesn't decrement the
refcount on the return 0 path so the problem remains when a data
type is looked up in a table where it is not stored. For example :

    ts = stktable_lookup_key(t, key);

        ## ref_cnt incremented here only if ts != NULL

    smp->flags = SMP_F_VOL_TEST;
    smp->data.type = SMP_T_SINT;
    smp->data.u.sint = 0;

    if (!ts) /* key not present */
            return 1;

    ptr = stktable_data_ptr(t, ts, STKTABLE_DT_CONN_CUR);
    if (!ptr)

        ## here it's not decremented

            return 0; /* parameter not stored */

    smp->data.u.sint = stktable_data_cast(ptr, conn_cur);
  + stktable_release(t, ts);
    return 1;

Given that all functions seem to be written the same way, I suggest
that we change the end and invert the !ptr condition to centralize
the release call. For the example above, it would give this :

    ptr = stktable_data_ptr(t, ts, STKTABLE_DT_CONN_CUR);
    if (ptr)
            smp->data.u.sint = stktable_data_cast(ptr, conn_cur);
    stktable_release(t, ts);
    return !!ptr;

Could you please rework your patch to do this so that I can merge it ?

Thanks!
Willy



Re: warnings during loading load-server-state, expected?

2018-05-24 Thread Willy Tarreau
On Sat, May 19, 2018 at 08:08:13PM -0400, Daniel Corbett wrote:
> From 24f8a74f490435969c04e2bb5387d396b62850c0 Mon Sep 17 00:00:00 2001
> From: Daniel Corbett 
> Date: Sat, 19 May 2018 19:43:24 -0400
> Subject: [PATCH] BUG/MEDIUM: servers state: Add srv_addr default placeholder

(...)

Merged, thanks!
Willy



Re: gRPC protocol

2018-05-24 Thread Aleksandar Lazic

On 24/05/2018 11:54, Daniel Corbett wrote:

Hello Aleks,

On 05/24/2018 10:54 AM, Aleksandar Lazic wrote:


I remember that Willy mentioned this in one of his mails.
Do you have any rough timeline? This year, next year, something like that?
;-)



We're aiming to have the native internal HTTP representation completed
for 1.9, which is slated for an -rc1 around the end of September with a
potential release between mid-October and the end of November. While
I cannot make any promises, we're hoping to have gRPC added within this
release as well.


Cool, thanks for the answer.


Thanks,
-- Daniel


Regards
aleks



Re: gRPC protocol

2018-05-24 Thread Daniel Corbett

Hello Aleks,


On 05/24/2018 10:54 AM, Aleksandar Lazic wrote:


I remember that Willy mentioned this in one of his mails.
Do you have any rough timeline? This year, next year, something like that?
;-)



We're aiming to have the native internal HTTP representation completed
for 1.9, which is slated for an -rc1 around the end of September with a
potential release between mid-October and the end of November. While I
cannot make any promises, we're hoping to have gRPC added within this
release as well.



Thanks,
-- Daniel





Re: [PATCH][MINOR] config: Implement 'parse-resolv-conf' directive for resolvers

2018-05-24 Thread Jim Freeman
Would that I could gift you time away from lesser things (fix the
plumbing?  make breakfast?) from across the ocean ...

I do have some small sense of how ...
overwhelming/consuming/pressing/stressful/... driving a project the size
and stature (and awesome capability) of haproxy would be.

Huge thanks and kudos to the whole crew !!


On Thu, May 24, 2018 at 9:02 AM, Ben Draut  wrote:

> Willy, I think you've reviewed this one already. :) I fixed a few
> things after your review, then you said you just wanted to wait
> for Baptiste to ACK back on 4/27.
>
> I pinged Baptiste independently, just to make sure he had
> seen your note. He replied, but he's been busy too. (Sorry
> to add to the pile!) My understanding was that we're just
> waiting for him.
>
> Thanks,
>
> Ben
>
> On Thu, May 24, 2018 at 8:58 AM, Willy Tarreau  wrote:
>
>> Hi Jim,
>>
>> On Thu, May 24, 2018 at 08:50:29AM -0600, Jim Freeman wrote:
>> > I'm not seeing any signs of this feature sliding into 1.9 source - any
>> > danger of it not going into the current dev branch?
>> > Are there further concerns/problems/... standing in the way ?  (it
>> > addresses one of my few haproxy gripes)
>>
>> Sorry but it's my fault. I'm totally overwhelmed at the moment with
>> tons of e-mails that take time to process and that I can't cope with
>> anymore. I already have in my todo list to review Ben's patch and
>> Patrick's patches and I cannot find any single hour to do this. I'm
>> spending some time finishing slides, which are totally incompatible
>> with code review, I'll get back to this ASAP.
>>
>> At least it's not lost at all, and indeed it's not yet in 1.9 but
>> I don't see any reason why this wouldn't go there.
>>
>> Thanks,
>> Willy
>>
>
>


Re: [PATCH][MINOR] config: Implement 'parse-resolv-conf' directive for resolvers

2018-05-24 Thread Ben Draut
Willy, I think you've reviewed this one already. :) I fixed a few
things after your review, then you said you just wanted to wait
for Baptiste to ACK back on 4/27.

I pinged Baptiste independently, just to make sure he had
seen your note. He replied, but he's been busy too. (Sorry
to add to the pile!) My understanding was that we're just
waiting for him.

Thanks,

Ben

On Thu, May 24, 2018 at 8:58 AM, Willy Tarreau  wrote:

> Hi Jim,
>
> On Thu, May 24, 2018 at 08:50:29AM -0600, Jim Freeman wrote:
> > I'm not seeing any signs of this feature sliding into 1.9 source - any
> > danger of it not going into the current dev branch?
> > Are there further concerns/problems/... standing in the way ?  (it
> > addresses one of my few haproxy gripes)
>
> Sorry but it's my fault. I'm totally overwhelmed at the moment with
> tons of e-mails that take time to process and that I can't cope with
> anymore. I already have in my todo list to review Ben's patch and
> Patrick's patches and I cannot find any single hour to do this. I'm
> spending some time finishing slides, which are totally incompatible
> with code review, I'll get back to this ASAP.
>
> At least it's not lost at all, and indeed it's not yet in 1.9 but
> I don't see any reason why this wouldn't go there.
>
> Thanks,
> Willy
>


Re: [PATCH][MINOR] config: Implement 'parse-resolv-conf' directive for resolvers

2018-05-24 Thread Willy Tarreau
Hi Jim,

On Thu, May 24, 2018 at 08:50:29AM -0600, Jim Freeman wrote:
> I'm not seeing any signs of this feature sliding into 1.9 source - any
> danger of it not going into the current dev branch?
> Are there further concerns/problems/... standing in the way ?  (it
> addresses one of my few haproxy gripes)

Sorry but it's my fault. I'm totally overwhelmed at the moment with
tons of e-mails that take time to process and that I can't cope with
anymore. I already have in my todo list to review Ben's patch and
Patrick's patches and I cannot find any single hour to do this. I'm
spending some time finishing slides, which are totally incompatible
with code review, I'll get back to this ASAP.

At least it's not lost at all, and indeed it's not yet in 1.9 but
I don't see any reason why this wouldn't go there.

Thanks,
Willy



Re: gRPC protocol

2018-05-24 Thread Aleksandar Lazic

Hi Daniel.

On 24/05/2018 10:09, Daniel Corbett wrote:

Hello Aleks,

gRPC is on our road map. We're currently working on implementing a 
new native internal HTTP representation and that will bring us end to 
end HTTP/2, which is the requirement for us to add gRPC.


I remember that Willy mentioned this in one of his mails.
Do you have any rough timeline? This year, next year, something like that?
;-)

In regards to the gRPC lua script -- thanks for sharing. It's the 
first time I have seen it so can't comment :)


NP, m2 ;-)


Thanks,
-- Daniel


Best regards
aleks



Re: [PATCH][MINOR] config: Implement 'parse-resolv-conf' directive for resolvers

2018-05-24 Thread Jim Freeman
I'm not seeing any signs of this feature sliding into 1.9 source - any
danger of it not going into the current dev branch?
Are there further concerns/problems/... standing in the way ?  (it
addresses one of my few haproxy gripes)

...jfree
[ grateful/impressed haproxy user - thanks to all involved ]

On Fri, Apr 27, 2018 at 10:59 PM, Willy Tarreau  wrote:

> On Fri, Apr 27, 2018 at 08:58:52PM -0600, Ben Draut wrote:
> > > >   newnameserver->addr = *sk;
> > > >   }
> > > > + else if (strcmp(args[0], "parse-resolv-conf") == 0) {
> > >
> > > I think you should register a config keyword and parse this in its own
> > > function if at all possible, but I don't know if resolvers can use
> > > registered config keywords, so if it's not possible, please ignore this
> > > comment.
> > >
> >
> > The resolvers section isn't registering any config keywords at the
> moment,
> > so I'm going to leave it the way it is to be consistent.
>
> OK.
>
> > > > + free(sk);
> > > > + free(resolv_line);
> > > > + if (fclose(f) != 0) {
> > > > + ha_warning("parsing [%s:%d] : failed to close handle to /etc/resolv.conf.\n",
> > > > +file, linenum);
> > > > + err_code |= ERR_WARN;
> > > > + }
> > >
> > > In practice you don't need to run this check on a read-only file, as it
> > > cannot fail, and if it really did, the user couldn't do anything about
> > > it anyway.
> > >
> >
> > Great, removed.
> >
> > I also fixed the memory leaks that you pointed out. (I think) But I did
> > notice that
> > valgrind reports that the 'newnameserver' allocation is being leaked
> > anyway, both
> > when using parse-resolv-conf as well as the regular nameserver
> > directive...Let
> > me know if I should do something about that. To me it seems the resolvers
> > code
> > should be freeing that.
>
> It means there's nothing in the deinit() function to take care of the
> nameservers. It would be better to do it just to avoid the warnings
> you're seeing. Do not hesitate to propose a patch for this if you want,
> and please mark it for backporting.
>
> > +resolv_out:
> > + if (sk != NULL)
> > + free(sk);
>
> Here you don't need the test because free(NULL) is a NOP.
>
> > + if (resolv_line != NULL)
> > + free(resolv_line);
>
> Same here.
>
> If you want I can take care of them when merging. Let's wait for Baptiste's
> ACK now.
>
> Thanks,
> Willy
>
>


Re: gRPC protocol

2018-05-24 Thread Daniel Corbett

Hello Aleks,

gRPC is on our road map.  We're currently working on implementing a new 
native internal HTTP representation and that will bring us end to end 
HTTP/2, which is the requirement for us to add gRPC.


In regards to the gRPC lua script -- thanks for sharing.  It's the first 
time I have seen it so can't comment :)


Thanks,
-- Daniel




Re: [PATCH] lua & threads

2018-05-24 Thread Willy Tarreau
On Thu, May 24, 2018 at 02:38:58PM +0200, Thierry Fournier wrote:
> I do not observe error during runtime, my only one problem is the
> compilation. I don't understand the impact of these modification,
> and so I can't test, because I don't known the impact on the
> polling.
(...)

Don't worry, I've finally fixed it, as it broke the build in 1.8.9
as well when threads were disabled. And indeed setting it to zero is
OK (I tested and that's fine since it's the same thing that is done
when you have a single thread).

Cheers,
Willy



Re: [PATCH] lua & threads

2018-05-24 Thread Thierry Fournier


> On 22 May 2018, at 19:03, Willy Tarreau  wrote:
> 
> Hi Thierry,
> 
> On Mon, May 21, 2018 at 07:58:01PM +0200, Thierry Fournier wrote:
>> Hi,
>> 
>> You will two patches in attachment.
>> 
>> - The first fixes some Lua error messages
> 
> thanks, I've merged this one already.
> 
>> - The second fixes a build error. This one should be reviewed because I'm
>>   not so proud of the solution :-) Note that this build error happens when
>>   compiling without threads on macosx.
> 
> In my opinion this one looks wrong. Apparently there's a special case for
> all_threads_mask when set to zero to indicate that no threads are enabled,
> and it bypasses any such checks, which is better than setting it to ULONG_MAX.
> 
> I *suspect* it doesn't have any impact for now, except that since code relies
> on !all_threads_mask it can progressively spread and break again later. So
> please check by setting it to 0UL and if it works that's OK.


I do not observe errors during runtime; my only problem is the
compilation. I don't understand the impact of these modifications,
and so I can't test, because I don't know the impact on the
polling.

The only function impacted is "done_update_polling()" in proto/fd.h,
which fails to build when compiling without threads.

By the way, I do not like the replacement of a variable by a define.
It can create tricky situations in the future, like code which will only
work in the threaded case:

   long *ptr = &all_threads_mask


My patch's goal is to show the error, not to fix it.

BR,
Thierry


Connections stuck in CLOSE_WAIT state with h2

2018-05-24 Thread Janusz Dziemidowicz
Recently I've moved several servers from haproxy 1.7.x to 1.8.x. I have
a setup with nghttpx handling h2 (haproxy connects to nghttpx via unix
socket which handles h2 and connects back to haproxy with plain
http/1.1 also through unix socket).

After the upgrade I wanted to switch to native h2 supported by
haproxy. Unfortunately, it seems that over time haproxy is
accumulating sockets in CLOSE_WAIT state. Currently, after 12h I have
5k connections in this state. All of them have non-zero Recv-Q and
zero Send-Q. netstat -ntpa shows something like this:

tcp1  0 IP:443  IP:28032  CLOSE_WAIT  115495/haproxy
tcp   35  0 IP:443  IP:49531   CLOSE_WAIT  115495/haproxy
tcp  507  0 IP:443  IP:31938 CLOSE_WAIT  115495/haproxy
tcp  134  0 IP:443  IP:49672  CLOSE_WAIT  115495/haproxy
tcp  732  0 IP:443  IP:3180   CLOSE_WAIT  115494/haproxy
tcp  746  0 IP:443  IP:39731  CLOSE_WAIT  115494/haproxy
tcp   35  0 IP:443  IP:62986  CLOSE_WAIT  115495/haproxy
tcp  585  0 IP:443  IP:51318 CLOSE_WAIT  115493/haproxy
tcp  100  0 IP:443  IP:60449 CLOSE_WAIT  115493/haproxy
tcp   35  0 IP:443  IP:1274  CLOSE_WAIT  115494/haproxy
..

Those are all frontend connections. Reloading haproxy removes those
connections, but only after hard-stop-after kicks in and old processes
are killed. Disabling native h2 support and switching back to nghttpx
makes the problem disappear.

This kinda seems like the socket was closed on the writing side, but
the client has already sent something and everything is stuck. I was
not able to reproduce the problem by myself. Any ideas how to debug
this further?

haproxy -vv (Debian package rebuilt on stretch with USE_TFO):
HA-Proxy version 1.8.9-1~tsg9+1 2018/05/21
Copyright 2000-2018 Willy Tarreau 

Build options :
  TARGET  = linux2628
  CPU = generic
  CC  = gcc
  CFLAGS  = -g -O2 -fdebug-prefix-map=/root/haproxy-1.8.9=.
-fstack-protector-strong -Wformat -Werror=format-security -Wdate-time
-D_FORTIFY_SOURCE=2
  OPTIONS = USE_GETADDRINFO=1 USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1
USE_LUA=1 USE_SYSTEMD=1 USE_PCRE=1 USE_PCRE_JIT=1 USE_TFO=1 USE_NS=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.1.0f  25 May 2017
Running on OpenSSL version : OpenSSL 1.1.0f  25 May 2017
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2
Built with Lua version : Lua 5.3.3
Built with transparent proxy support using: IP_TRANSPARENT
IPV6_TRANSPARENT IP_FREEBIND
Encrypted password support via crypt(3): yes
Built with multi-threading support.
Built with PCRE version : 8.39 2016-06-14
Running on PCRE version : 8.39 2016-06-14
PCRE library supports JIT : yes
Built with zlib version : 1.2.8
Running on zlib version : 1.2.8
Compression algorithms supported : identity("identity"),
deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with network namespace support.

Available polling systems :
  epoll : pref=300,  test result OK
   poll : pref=200,  test result OK
 select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available filters :
[SPOE] spoe
[COMP] compression
[TRACE] trace

config

global
  log /dev/log daemon warning
  log-send-hostname
  chroot /var/lib/haproxy
  maxconn 65536
  user haproxy
  group haproxy
  daemon
  nbproc 4
  stats socket /var/run/haproxy/stats.socket user haproxy mode 0640
level user process 1
  stats socket /var/run/haproxy/stats-1.socket user haproxy mode 0640
level admin process 1
  stats socket /var/run/haproxy/stats-2.socket user haproxy mode 0640
level admin process 2
  stats socket /var/run/haproxy/stats-3.socket user haproxy mode 0640
level admin process 3
  stats socket /var/run/haproxy/stats-4.socket user haproxy mode 0640
level admin process 4
  ssl-default-bind-ciphers
ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES128-SHA256:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES128-SHA:ECDHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA
  ssl-default-bind-options ssl-min-ver TLSv1.0
  tune.ssl.cachesize 20
  tune.ssl.lifetime 24h
  hard-stop-after 2h

  unix-bind prefix /var/lib/haproxy/ mode 600 user haproxy group haproxy


defaults http
  option dontlognull
  option dontlog-normal
  option redispatch
  option tcp-smart-connect
  option httplog

  timeout client 60s
  timeout connect 10s
  timeout server 60s
  timeout tunnel 10m
  timeout client-fin 30s
  timeout http-keep-alive 30s
  timeout http-request 30s

  log global
  retries 3
  backlog 16384

Re: remaining process after (seamless) reload

2018-05-24 Thread William Lallemand
On Thu, May 24, 2018 at 10:07:23AM +0200, William Dauchy wrote:
> On Wed, May 23, 2018 at 08:45:04PM +0200, William Dauchy wrote:
> > More details which could help understand what is going on:
> >
> > ps output:
> >
> > root 15928  0.3  0.0 255216 185268 ?   Ss   May21  10:11 
> > /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 
> > 16988 16912 6340 28271 30590 30334 -x /var/lib/haproxy/stats
> > haproxy   6340  2.0  0.0 526172 225476 ?   Ssl  May22  35:03  \_ 
> > /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 
> > 6328 6315 -x /var/lib/haproxy/stats
> > haproxy  28271  1.8  0.0 528720 229508 ?   Ssl  May22  27:13  \_ 
> > /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 
> > 28258 28207 28232 6340 -x /var/lib/haproxy/stats
> > haproxy  30590  265  0.0 527268 225032 ?   Rsl  04:35 2188:55  \_ 
> > /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 
> > 30578 28271 6340 -x /var/lib/haproxy/stats
> > haproxy  30334  197  0.0 526704 224544 ?   Rsl  09:17 1065:59  \_ 
> > /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 
> > 30322 30295 27095 6340 28271 30590 -x /var/lib/haproxy/stats
> > haproxy  16912  1.7  0.0 527544 216552 ?   Ssl  18:14   0:03  \_ 
> > /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 
> > 16899 28271 30590 30334 6340 -x /var/lib/haproxy/stats
> > haproxy  17001  2.2  0.0 528392 214656 ?   Ssl  18:17   0:00  \_ 
> > /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 
> > 16988 16912 6340 28271 30590 30334 -x /var/lib/haproxy/stats
> >
> >
> > lsof output:
> >
> > haproxy6340haproxy5u unix 0x883feec97000   0t0  
> > 679289634 /var/lib/haproxy/stats.15928.tmp
> > haproxy6340  6341  haproxy5u unix 0x883feec97000   0t0  
> > 679289634 /var/lib/haproxy/stats.15928.tmp
> > haproxy6340  6342  haproxy5u unix 0x883feec97000   0t0  
> > 679289634 /var/lib/haproxy/stats.15928.tmp
> > haproxy6340  6343  haproxy5u unix 0x883feec97000   0t0  
> > 679289634 /var/lib/haproxy/stats.15928.tmp
> > haproxy   17020haproxy5u unix 0x883feec97000   0t0  
> > 679289634 /var/lib/haproxy/stats.15928.tmp
> > haproxy   17020 17021  haproxy5u unix 0x883feec97000   0t0  
> > 679289634 /var/lib/haproxy/stats.15928.tmp
> > haproxy   17020 17022  haproxy5u unix 0x883feec97000   0t0  
> > 679289634 /var/lib/haproxy/stats.15928.tmp
> > haproxy   17020 17023  haproxy5u unix 0x883feec97000   0t0  
> > 679289634 /var/lib/haproxy/stats.15928.tmp
> > haproxy   28271haproxy5u unix 0x883feec97000   0t0  
> > 679289634 /var/lib/haproxy/stats.15928.tmp
> > haproxy   28271 28272  haproxy5u unix 0x883feec97000   0t0  
> > 679289634 /var/lib/haproxy/stats.15928.tmp
> > haproxy   28271 28273  haproxy5u unix 0x883feec97000   0t0  
> > 679289634 /var/lib/haproxy/stats.15928.tmp
> > haproxy   28271 28274  haproxy5u unix 0x883feec97000   0t0  
> > 679289634 /var/lib/haproxy/stats.15928.tmp
> >
> > (So on unhealthy nodes, I find old processes which are still linked to
> > the socket.)
> >
> > The provisioning part is also seeing data which are supposed to be
> > already updated through the runtime API. I suspect I am getting old
> > data when connecting to the unix socket. The later being still attached
> > to an old process?
> > Indeed, if I try
> > for i in {1..500}; do sudo echo "show info" | sudo socat stdio  
> > /var/lib/haproxy/stats  | grep Pid; done
> >
> > I get "Pid: 17001" most of the time, which is the last process
> > but I sometimes get: "Pid: 28271"(!) which is a > 24 hours old
> > process.
> >
> > Is there something we are doing wrongly?
> 
> After some more testing, I don't have this issue using haproxy v1.8.8
> (rollbacked for > 12 hours). I hope I don't speak too fast.
> 

Hi,

I managed to reproduce something similar with the 1.8.8 version. It looks like
letting a socat connected to the socket helps.

I'm looking into the code to see what's happening.

-- 
William Lallemand



Re: Haproxy 1.8 with OpenSSL 1.1.1-pre4 stops working after 1 hour

2018-05-24 Thread Emeric Brun
Hi Lukas,

On 05/24/2018 11:27 AM, Lukas Tribus wrote:
> Hi Emeric,
> 
> 
> On 24 May 2018 at 11:19, Emeric Brun  wrote:
>> in pre6 there is a new wrapping function for getrandom which has different
>> fallback ways to use the syscall.
>>
>> Perhaps the openssl -r output depends on that (whether getrandom was found
>> in glibc or the syscall was loaded a different way and considered
>> os-specific).
> 
> No, openssl version -r output is a verbatim copy of what was passed to
> --with-rand-seed at configure time:
> https://github.com/openssl/openssl/pull/5910#issuecomment-391514494
> 
> 
> @Lukas Which build-workaround did you use?
> 
> No workaround at all, getrandom works for me out of the box in -pre6
> (with libc2.23) and the output of "openssl version -r" is
> "os-specific", which also is expected behavior as per the github
> discussion above. The raw syscall as implemented in-pre6 works for me.

OK, I've initially patched pre3 and ported my patch to pre6 in a different
way, but I didn't check if it works out of the box in pre6 :)

In my case I'm using cross-compilation and there is no access to kernel
includes, only those of the built libc in the sysroot.

So I don't know how the openssl build will be able to retrieve SYS_getrandom
or the syscall definition, because they are not present in my sysroot.

Anyway, it seems there were two issues in that thread; do you still have
one, Lukas?


R,
Emeric



Re: Haproxy 1.8 with OpenSSL 1.1.1-pre4 stops working after 1 hour

2018-05-24 Thread Lukas Tribus
Hi Emeric,


On 24 May 2018 at 11:19, Emeric Brun  wrote:
> in pre6 there is a new wrapping function for getrandom which has different
> fallback ways to use the syscall.
>
> Perhaps the openssl -r output depends on that (whether getrandom was found
> in glibc or the syscall was loaded a different way and considered os-specific).

No, openssl version -r output is a verbatim copy of what was passed to
--with-rand-seed at configure time:
https://github.com/openssl/openssl/pull/5910#issuecomment-391514494


> @Lukas Which build-workaround did you use?

No workaround at all, getrandom works for me out of the box in -pre6
(with libc2.23) and the output of "openssl version -r" is
"os-specific", which also is expected behavior as per the github
discussion above. The raw syscall as implemented in -pre6 works for me.


Lukas



Re: Haproxy 1.8 with OpenSSL 1.1.1-pre4 stops working after 1 hour

2018-05-24 Thread Emeric Brun
Hi Lukas,

On 05/23/2018 09:48 PM, Lukas Tribus wrote:
> Hello,
> 
> 
> On 23 May 2018 at 18:29, Emeric Brun  wrote:
>> This issue was due to openssl-1.1.1, which re-seeds after an elapsed time or
>> number of requests.
>>
>> If /dev/urandom is used as the seeding source when haproxy is chrooted, it
>> fails to re-open /dev/urandom.
>>
>> By default the openssl-1.1.1 configure script uses the getrandom syscall as
>> the seeding source and falls back to /dev/urandom if not available.
>>
>> So you don't face the issue if your openssl-1.1.1 is compiled to use
>> getrandom.
>>
>> But the getrandom syscall is available only since kernel 3.17 and, the main
>> point, only with glibc >= 2.25.
>>
>> With openssl-1.1.1 you can check it this way:
>> # ./openssl-1.1.1/openssl version -r
>> Seeding source: getrandom-syscall
> 
> I have glibc 2.23 (Ubuntu 16.04) and openssl shows "os-specific", even
> if kernel headers are installed while compiling, yet -pre6 does not
> hang for me in chroot (-pre4 did):
> 
> lukas@dev:~/libsslbuild/bin$ uname -r
> 4.4.0-109-generic
> lukas@dev:~/libsslbuild/bin$ ./openssl version
> OpenSSL 1.1.1-pre6 (beta) 1 May 2018
> lukas@dev:~/libsslbuild/bin$ ./openssl version -r
> Seeding source: os-specific
> lukas@dev:~/libsslbuild/bin$
> 
> 
> But, stracing haproxy shows that the library IS ACTUALLY using
> getrandom(). So the "Seeding source" output of the executable is
> wrong. Gonna dig into this as well, but seeing how my haproxy
> executable uses getrandom() calls, this perfectly explains why I did
> not see this in -pre6 (which has the build-workaround for < libc 2.25,
> while pre4 did not, so it did not use the getrandom() call).

In pre6 there is a new wrapper function around getrandom which has different
fallback ways of using the syscall.

Perhaps the openssl -r output depends on that (whether getrandom was found in
glibc, or the syscall was loaded a different way and considered os-specific).

@Lukas Which build-workaround did you use?

In my case I've patched openssl to define SYS_getrandom (depending on the arch)
and I set --with-rand-seed=getrandom.

You may have a more elegant way to do that without a patch.

/*
 * syscall_random(): Try to get random data using a system call
 * returns the number of bytes returned in buf, or <= 0 on error.
 */
int syscall_random(void *buf, size_t buflen)
{
#  if defined(OPENSSL_HAVE_GETRANDOM)
return (int)getrandom(buf, buflen, 0);
#  endif

#  if defined(__linux) && defined(SYS_getrandom)
return (int)syscall(SYS_getrandom, buf, buflen, 0);
#  endif

#  if defined(__FreeBSD__) && defined(KERN_ARND)
return (int)sysctl_random(buf, buflen);
#  endif

   /* Supported since OpenBSD 5.6 */
#  if defined(__OpenBSD__) && OpenBSD >= 201411
return getentropy(buf, buflen);
#  endif

return -1;
}

My patch:
--- openssl-1.1.1-pre6/crypto/rand/rand_unix.c.ori  2018-05-22 14:06:03.490771549 +0200
+++ openssl-1.1.1-pre6/crypto/rand/rand_unix.c  2018-05-22 14:14:33.133237079 +0200
@@ -173,6 +173,26 @@
 #   define OPENSSL_HAVE_GETRANDOM
 #  endif

+# if defined(__linux)
+#   include 
+#if !defined(SYS_getrandom)
+#if !defined(__NR_getrandom)
+#if defined(__powerpc__) || defined(__powerpc64__)
+#define __NR_getrandom 236
+#elif defined(__sparc__) || defined(__sparc64__)
+#define __NR_getrandom 347 
+#elif defined(__x86_64__)
+#define __NR_getrandom 318
+#elif defined (__i386__)
+#define __NR_getrandom 355
+#elif defined (__s390__) || defined(__s390x__)
+#define __NR_getrandom 249
+#endif /* $arch */
+#endif /* __NR_getrandom */
+#   define SYS_getrandom __NR_getrandom
+#endif
+#endif
+
 #  if defined(OPENSSL_HAVE_GETRANDOM)
 #   include 
 #  endif


> 
> 
> @Sander it looks like openssl folks won't change their mind about
> this. You have to either upgrade to a kernel more recent than 3.17 so
> that getrandom() can be used, or make /dev/*random available within
> your chroot.
> 
> 
> 
> Lukas
> 

Emeric



Re: Haproxy 1.8 with OpenSSL 1.1.1-pre4 stops working after 1 hour

2018-05-24 Thread Sander Hoentjen
On 05/23/2018 09:48 PM, Lukas Tribus wrote:
> Hello,
>
>
> On 23 May 2018 at 18:29, Emeric Brun  wrote:
>> This issue was due to openssl-1.1.1, which re-seeds after an elapsed time or
>> number of requests.
>>
>> If /dev/urandom is used as the seeding source when haproxy is chrooted, it
>> fails to re-open /dev/urandom.
>>
>> By default the openssl-1.1.1 configure script uses the getrandom syscall as
>> the seeding source and falls back to /dev/urandom if not available.
>>
>> So you don't face the issue if your openssl-1.1.1 is compiled to use
>> getrandom.
>>
>> But the getrandom syscall is available only since kernel 3.17 and, the main
>> point, only with glibc >= 2.25.
>>
>> With openssl-1.1.1 you can check it this way:
>> # ./openssl-1.1.1/openssl version -r
>> Seeding source: getrandom-syscall
> I have glibc 2.23 (Ubuntu 16.04) and openssl shows "os-specific", even
> if kernel headers are installed while compiling, yet -pre6 does not
> hang for me in chroot (-pre4 did):
>
> lukas@dev:~/libsslbuild/bin$ uname -r
> 4.4.0-109-generic
> lukas@dev:~/libsslbuild/bin$ ./openssl version
> OpenSSL 1.1.1-pre6 (beta) 1 May 2018
> lukas@dev:~/libsslbuild/bin$ ./openssl version -r
> Seeding source: os-specific
> lukas@dev:~/libsslbuild/bin$
>
>
> But, stracing haproxy shows that the library IS ACTUALLY using
> getrandom(). So the "Seeding source" output of the executable is
> wrong. Gonna dig into this as well, but seeing how my haproxy
> executable uses getrandom() calls, this perfectly explains why I did
> not see this in -pre6 (which has the build-workaround for < libc 2.25,
> while pre4 did not, so it did not use the getrandom() call).
>
>
> @Sander it looks like openssl folks won't change their mind about
> this. You have to either upgrade to a kernel more recent than 3.17 so
> that getrandom() can be used, or make /dev/*random available within
> your chroot.
When I make /dev/*random available in the chroot, indeed it works fine.
Thanks guys!
As you all have guessed, I am indeed running an older kernel that
doesn't have the getrandom syscall.
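For reference, the workaround on a pre-3.17 kernel is to expose the device
nodes inside the chroot. A sketch (run as root; /var/lib/haproxy is an assumed
chroot path, use whatever your chroot directive points at):

```
# create the standard Linux random devices (major 1, minors 8 and 9)
# inside the chroot, so openssl can re-open /dev/urandom after chrooting
CHROOT=/var/lib/haproxy
mkdir -p "$CHROOT/dev"
[ -e "$CHROOT/dev/random" ]  || mknod "$CHROOT/dev/random"  c 1 8
[ -e "$CHROOT/dev/urandom" ] || mknod "$CHROOT/dev/urandom" c 1 9
chmod 444 "$CHROOT"/dev/*random
```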

Regards,
Sander



Re: SSL certs loading performance regression

2018-05-24 Thread Emmanuel Hocdet

> Le 24 mai 2018 à 09:21, Hervé Commowick  a 
> écrit :
> 
> I didn't know about the curves parameter, and i don't see performance
> regression with it. I don't really understand why this kind of parameter
> can influence certs loading time.
> 

I don't really know why either.
"ecdhe" uses EC_KEY_new_by_curve_name, SSL_CTX_set_tmp_ecdh and EC_KEY_free;
if openssl really computes a key at load time, that could explain it.

« curves » is the parameter to use for this usage.
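Applied to the bind line from the original report, that would look like this
(untested sketch):

```
# same bind line with "curves" instead of the "ecdhe" keyword
bind 127.0.0.1:1443 ssl crt ssl10k curves secp384r1
```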

Manu.




Re: haproxy=1.8.5 stuck in thread syncing

2018-05-24 Thread Максим Куприянов
Hi, Christopher!

Could you tell if these patches will be backported to haproxy 1.8 or not?

2018-04-11 20:06 GMT+03:00 Максим Куприянов :

> Hi!
>
> Thank you very much for the patches. Looks like they helped.
>
> 2018-03-29 14:25 GMT+05:00 Christopher Faulet :
>
>> Le 28/03/2018 à 14:16, Максим Куприянов a écrit :
>>
>>> Hi!
>>>
>>> I'm sorry, but the configuration is too huge to share (over 100 different
>>> proxy sections). This is also the reason I can't exactly determine the
>>> failing section. Is there a way to get this data from a core file?
>>>
>>> 2018-03-28 11:18 GMT+03:00 Christopher Faulet >> >:
>>>
>>> Le 28/03/2018 à 09:36, Максим Куприянов a écrit :
>>>
>>> Hi!
>>>
>>> Yesterday one of our haproxies (1.8.5) with nbthread=8 set in
>>> its config stuck with 800% CPU usage. Some responses were served
>>> successfully but many of them just timed out. perf top showed
>>> this:
>>>59.19%  [.] thread_enter_sync
>>>32.68%  [.] fwrr_get_next_server
>>>
>>>
>>> Hi,
>>>
>>> Could you share your configuration please? It will help to diagnose
>>> the problem. In your logs, what are the values of the srv_queue and
>>> backend_queue fields?
>>>
>>>
>> Hi,
>>
>> Ok, I partly reproduced your problem using a backend with a hundred
>> servers and a maxconn of 2 for each one. In this case, I observe the same
>> CPU consumption. I have no timeouts (it probably depends on your values) but
>> performance is quite low.
>>
>> I think you're hitting a limitation of the current design. We have no
>> mechanism to migrate entities between threads. So to force threads to wake
>> up, we use the sync point. It was not designed to be called very often. In
>> your case, it eats all the CPU.
>>
>> I attached 3 patches. They add a mechanism to wake up threads selectively
>> without any lock or loop. They must be applied on HAProxy 1.8 (it will not
>> work on the upstream). So you can check if it fixes your problem or not. It
>> will be useful to validate it is a design limitation and not a bug.
>>
>> This is just an experimentation. I hope it works well but I didn't do a
>> lot of testing. If yes, I'll then discuss with Willy if it is pertinent or
>> not to do the threads wakeup this way. But, in all cases, it will probably
>> not be backported in HAProxy 1.8.
>>
>> --
>> Christopher Faulet
>>
>
>
Thank you!
Maksim


Re: subscribe

2018-05-24 Thread Aleksandar Lazic

Hi Stephan.

On 24/05/2018 10:16, Stephan Seitz wrote:

subscribe


Please use haproxy+subscr...@formilux.org as shown in

https://www.haproxy.org/#tact


Mit freundlichen Grüßen,

Stephan Seitz


Best regards
Aleks


--

Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-44
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG: HRB 93818 B / Amtsgericht
Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin




subscribe

2018-05-24 Thread Stephan Seitz
subscribe

Mit freundlichen Grüßen,

Stephan Seitz

--

Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-44
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG: HRB 93818 B / Amtsgericht
Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin






Re: remaining process after (seamless) reload

2018-05-24 Thread William Dauchy
On Wed, May 23, 2018 at 08:45:04PM +0200, William Dauchy wrote:
> More details which could help understand what is going on:
>
> ps output:
>
> root 15928  0.3  0.0 255216 185268 ?   Ss   May21  10:11 
> /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 
> 16988 16912 6340 28271 30590 30334 -x /var/lib/haproxy/stats
> haproxy   6340  2.0  0.0 526172 225476 ?   Ssl  May22  35:03  \_ 
> /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 
> 6328 6315 -x /var/lib/haproxy/stats
> haproxy  28271  1.8  0.0 528720 229508 ?   Ssl  May22  27:13  \_ 
> /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 
> 28258 28207 28232 6340 -x /var/lib/haproxy/stats
> haproxy  30590  265  0.0 527268 225032 ?   Rsl  04:35 2188:55  \_ 
> /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 
> 30578 28271 6340 -x /var/lib/haproxy/stats
> haproxy  30334  197  0.0 526704 224544 ?   Rsl  09:17 1065:59  \_ 
> /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 
> 30322 30295 27095 6340 28271 30590 -x /var/lib/haproxy/stats
> haproxy  16912  1.7  0.0 527544 216552 ?   Ssl  18:14   0:03  \_ 
> /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 
> 16899 28271 30590 30334 6340 -x /var/lib/haproxy/stats
> haproxy  17001  2.2  0.0 528392 214656 ?   Ssl  18:17   0:00  \_ 
> /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -sf 
> 16988 16912 6340 28271 30590 30334 -x /var/lib/haproxy/stats
>
>
> lsof output:
>
> haproxy6340haproxy5u unix 0x883feec97000   0t0  
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy6340  6341  haproxy5u unix 0x883feec97000   0t0  
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy6340  6342  haproxy5u unix 0x883feec97000   0t0  
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy6340  6343  haproxy5u unix 0x883feec97000   0t0  
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy   17020haproxy5u unix 0x883feec97000   0t0  
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy   17020 17021  haproxy5u unix 0x883feec97000   0t0  
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy   17020 17022  haproxy5u unix 0x883feec97000   0t0  
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy   17020 17023  haproxy5u unix 0x883feec97000   0t0  
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy   28271haproxy5u unix 0x883feec97000   0t0  
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy   28271 28272  haproxy5u unix 0x883feec97000   0t0  
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy   28271 28273  haproxy5u unix 0x883feec97000   0t0  
> 679289634 /var/lib/haproxy/stats.15928.tmp
> haproxy   28271 28274  haproxy5u unix 0x883feec97000   0t0  
> 679289634 /var/lib/haproxy/stats.15928.tmp
>
> (So on unhealthy nodes, I find old processes which are still linked to
> the socket.)
>
> The provisioning part is also seeing data which are supposed to be
> already updated through the runtime API. I suspect I am getting old
> data when connecting to the unix socket. The latter being still attached
> to an old process?
> Indeed, if I try
> for i in {1..500}; do sudo echo "show info" | sudo socat stdio  
> /var/lib/haproxy/stats  | grep Pid; done
>
> I get "Pid: 17001" most of the time, which is the last process
> but I sometimes get: "Pid: 28271"(!) which is a > 24 hours old
> process.
>
> Is there something we are doing wrongly?

After some more testing, I don't have this issue using haproxy v1.8.8
(rolled back for > 12 hours). I hope I'm not speaking too soon.

-- 
William



Re: SSL certs loading performance regression

2018-05-24 Thread Hervé Commowick
I didn't know about the curves parameter, and I don't see the performance
regression with it. I don't really understand why this kind of parameter
can influence cert loading time.

Hervé.

Le 23/05/2018 à 15:08, Emmanuel Hocdet a écrit :
> Hi Hervé,
> 
>> Le 22 mai 2018 à 10:31, Hervé Commowick  a 
>> écrit :
>>
>> Hello HAProxy ML,
>>
>> I tracked down a performance regression in loading a bunch of
>> certificates: at least 3x to 5x more time for loading 10 certs since
>> this commit
>> http://git.haproxy.org/?p=haproxy-1.8.git;a=commitdiff;h=f6b37c67be277b5f0ae60438d796ff29ef19be40
>>
>> This regression is 1.8-specific (no issue in the 1.6 or 1.7 branches)
>>
>> my bind line :
>> bind 127.0.0.1:1443 ssl crt ssl10k ecdhe secp384r1
>>
>> After some tests with William, it looks like it is also related to the
>> "ecdhe secp384r1" parameter. I don't really understand why, but without
>> this I don't see any regression (and it looks like secp384r1 was
>> effectively working in the old version)
>>
> 
> can you try with « curves » parameter and not the old « ecdhe » ?
> 
>> Let me know if I can test something; going from 1min30s to 5min has some
>> impact, as you can understand :-)
>>
> 
> Manu.
>