
HAProxy 1.9.6 unresponsive

2019-05-03 Thread Patrick Hemmer
We are running HAProxy 1.9.6 and managed to get into a state where 
HAProxy was completely unresponsive. It was pegged at 100% CPU like many of 
the other reports on the mailing list lately, but in addition it 
wouldn't respond to anything; the stats socket wasn't even responsive.


When I attached an strace, it sat there with no activity. When I 
attached GDB I got the following stack:


        (gdb) bt full
        #0  htx_get_head (htx=0x7fbeb666eba0) at include/common/htx.h:357
        No locals.
        #1  h2s_htx_make_trailers (h2s=h2s@entry=0x7fbeb625f9f0, htx=htx@entry=0x7fbeb666eba0) at src/mux_h2.c:4975
                        list = {{n = {ptr = 0x0, len = 0}, v = {ptr = 0x0, len = 0}} }
                        h2c = 0x7fbeb6372320
                        blk = <optimized out>
                        blk_end = 0x0
                        outbuf = {size = 140722044755807, area = 0x0, data = 140457080712096, head = 140457060939041}
                        h1m = {state = H1_MSG_HDR_NAME, flags = 2056, curr_len = 140457077580664, body_len = 16384, next = 2, err_pos = 0, err_state = -1237668736}
                        type = <optimized out>
                        ret = 0
                        hdr = 0
                        idx = <optimized out>
                        start = <optimized out>
        #2  0x7fbeb50f2ef5 in h2_snd_buf (cs=0x7fbeb63ea9a0, buf=0x7fbeb6127048, count=2, flags=<optimized out>) at src/mux_h2.c:5372
                        h2s = <optimized out>
                        orig_count = <optimized out>
                        total = 15302
                        ret = <optimized out>
                        htx = 0x7fbeb666eba0
                        blk = <optimized out>
                        btype = <optimized out>
                        idx = <optimized out>
        #3  0x7fbeb5180be4 in si_cs_send (cs=0x7fbeb63ea9a0) at src/stream_interface.c:691
                        send_flag = <optimized out>
                        conn = 0x7fbeb6051a70
                        si = 0x7fbeb6127268
                        oc = 0x7fbeb6127040
                        ret = <optimized out>
                        did_send = 0
        #4  0x7fbeb51817c8 in si_update_both (si_f=si_f@entry=0x7fbeb6127268, si_b=si_b@entry=0x7fbeb61272a8) at src/stream_interface.c:850
                        req = 0x7fbeb6126fe0
                        res = <optimized out>
                        cs = <optimized out>
        #5  0x7fbeb50ea2e1 in process_stream (t=<optimized out>, context=0x7fbeb6126fd0, state=<optimized out>) at src/stream.c:2502
                        srv = <optimized out>
                        s = 0x7fbeb6126fd0
                        sess = <optimized out>
                        rqf_last = <optimized out>
                        rpf_last = 3255042562
                        rq_prod_last = <optimized out>
                        rq_cons_last = <optimized out>
                        rp_cons_last = 7
                        rp_prod_last = 7
                        req_ana_back = <optimized out>
                        req = 0x7fbeb6126fe0
                        res = 0x7fbeb6127040
                        si_f = 0x7fbeb6127268
                        si_b = 0x7fbeb61272a8
        #6  0x7fbeb51b20a8 in process_runnable_tasks () at src/task.c:434
                        t = <optimized out>
                        state = <optimized out>
                        ctx = <optimized out>
                        process = <optimized out>
                        t = <optimized out>
                        max_processed = <optimized out>
        #7  0x7fbeb512b6ff in run_poll_loop () at src/haproxy.c:2642
                        next = <optimized out>
                        exp = <optimized out>
        #8  run_thread_poll_loop (data=data@entry=0x7fbeb5d84620) at src/haproxy.c:2707
                        ptif = <optimized out>
                        ptdf = <optimized out>
                        start_lock = 0
        #9  0x7fbeb507d2b5 in main (argc=<optimized out>, argv=0x7ffc677d73b8) at src/haproxy.c:3343
                        tids = 0x7fbeb5d84620
                        threads = 0x7fbeb5eb6d90
                        i = <optimized out>
                        old_sig = {__val = {68097, 0, 511101108338, 0, 140722044760335, 140457059422467, 140722044760392, 140454020513805, 124, 140457064304960, 390842023936, 140457064395072, 48, 140457035994976, 18446603351664791121, 140454020513794}}
        ---Type <return> to continue, or q <return> to quit---
                        blocked_sig = {__val = {1844674406710583, 18446744073709551615 }}
                        err = <optimized out>
                        retry = <optimized out>
                        limit = {rlim_cur = 131300, rlim_max = 131300}
                        errmsg = "\000@\000\000\000\000\000\000\002\366\210\263\276\177\000\000\300\364m\265\276\177\000\000`\227\274\263\276\177\000\000\030\000\000\000\000\000\000\000>\001\000\024\000\000\000\000p$o\265\276\177\000\000@>k\265\276\177\000\000\000\320$\265\276\177\000\000\274\276\177\000\000 t}g\374\177\000\000\000\000\000\000\000\000\000\000P\367m\265"
                        pidfd = <optimized out>

Our config is big and complex, and not something I want to post here (I 
may be able to provide it directly if required). However, I think the 
important bit is that we have a frontend and backend which are used 
for load balancing gRPC traffic (thus h2). The backend servers are 

Re: Zero RTT in backend server side

2019-05-03 Thread Илья Шипицин
LibreSSL is known to report a version number higher than openssl-1.1.1 (while
lacking many features).
Let us wait for the libressl+travis-ci patch approval.

Sat, 4 May 2019 at 00:09, Olivier Houchard :

> Hi Igor,
>
> On Fri, May 03, 2019 at 05:21:50PM +0800, Igor Pav wrote:
> > Just tested with openssl 1.1.1b and haproxy 1.9.7, it appears no
> > success, you are right :)
> >
>
> Indeed :)
> I just pushed commit 010941f87605e8219d25becdbc652350a687d6a2 to master,
> that
> let me do 0RTT both as server and as client. This should be backported to
> 1.8 and 1.9 soon.
> Please note, however, that we will only attempt to connect to a server
> using 0RTT if the client did so, as we have to be sure the client supports
> it, in case it receives a 425.
> This may change in 2.0, if we add the ability to retry failed requests.
>
> Regards,
>
> Olivier
>
>
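For readers wanting to try this, here is a minimal sketch of such a setup. The `allow-0rtt` bind/server keyword is a real HAProxy option (1.8+ on the frontend, with backend support per the commit above), but the proxy names, addresses, and certificate path below are made up for illustration:

```
# Hypothetical haproxy.cfg fragment: enable TLS early data on both sides.
frontend fe_tls
    bind :443 ssl crt /etc/haproxy/site.pem allow-0rtt   # accept client early data
    default_backend be_tls

backend be_tls
    # HAProxy only attempts 0-RTT to the server when the client itself used
    # it, so a 425 (Too Early) response can safely be relayed back.
    server s1 10.0.0.10:443 ssl verify none allow-0rtt
```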


Re: reg-tests are broken when running osx + openssl

2019-05-03 Thread Илья Шипицин
I would even use shorter path, i.e.

mkdir ~/t

Fri, 3 May 2019 at 22:27, Frederic Lecaille :

> On 5/3/19 5:35 PM, Frederic Lecaille wrote:
> > On 5/3/19 3:44 PM, Илья Шипицин wrote:
> >>
> >>
> >> Fri, 3 May 2019 at 18:42, Tim Düsterhus :
> >>
> >> Ilya,
> >>
> >> Am 03.05.19 um 15:39 schrieb Илья Шипицин:
> >>  > when I played with enabling travis-ci, I tried to set TMPDIR
> >> directly,
> >>  > however I was not lucky enough.
> >>  > Later Tim added "sed" magic to .travis.yml
> >>  >
> >>  > personally, I do not understand why "sed" is better than
> >> assigning TMPDIR
> >>  > directly.
> >>
> >> I did not try using TMPDIR=/tmp or something like that, because I
> >> thought there must be a reason why it's that strange long path.
> >>
> >>
> >> I tried /tmp and /var/tmp
> >> it seems that not any filesystem on osx can hold network socket (at
> >> least from my point of view)
> >
> > try to create a working directory owned by the user which run the reg
> > test :
> >
> > $ mkdir -p ~/tmp/
> > $ TMPDIR=~/tmp make reg-tests
>
> I confirm that with such a value everything works on all OSes
> (https://travis-ci.com/haproxyFred/haproxy)
>
> The attached patch should fix this issue.
>
> Thank you Tim, Ilya.
>
> Fred.
>


Re: reg-tests are broken when running osx + openssl

2019-05-03 Thread Frederic Lecaille

On 5/3/19 5:35 PM, Frederic Lecaille wrote:

On 5/3/19 3:44 PM, Илья Шипицин wrote:



Fri, 3 May 2019 at 18:42, Tim Düsterhus:


    Ilya,

    Am 03.05.19 um 15:39 schrieb Илья Шипицин:
 > when I played with enabling travis-ci, I tried to set TMPDIR
    directly,
 > however I was not lucky enough.
 > Later Tim added "sed" magic to .travis.yml
 >
 > personally, I do not understand why "sed" is better than
    assigning TMPDIR
 > directly.

    I did not try using TMPDIR=/tmp or something like that, because I
    thought there must be a reason why it's that strange long path.


I tried /tmp and /var/tmp
it seems that not every filesystem on osx can hold a network socket (at 
least from my point of view)


try to create a working directory owned by the user which runs the reg 
tests:


    $ mkdir -p ~/tmp/
    $ TMPDIR=~/tmp make reg-tests


I confirm that with such a value everything works on all OSes 
(https://travis-ci.com/haproxyFred/haproxy)


The attached patch should fix this issue.

Thank you Tim, Ilya.

Fred.
>From fc9decae9ec679038dc494ad612dd3eb144de408 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20L=C3=A9caille?= 
Date: Fri, 3 May 2019 19:16:02 +0200
Subject: [PATCH] BUILD: travis: TMPDIR replacement.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

TMPDIR default value is too long especially on OSX systems.
We decided to shorten it for all the OS'es.

Thank you to Tim Düsterhus and Ilya for having helped on this issue.
---
 .travis.yml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index f689fe982..7475ad028 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -31,12 +31,12 @@ before_script:
   # This is a fix for the super long TMPDIR on Mac making
   # the unix socket path names exceed the maximum allowed
   # length.
-  - sed -i'.original' '/TESTDIR=.*haregtests/s/haregtests-.*XX/regtest.XXX/' scripts/run-regtests.sh
+  - mkdir ~/tmp
 
 script:
   - make CC=$CC V=1 TARGET=$TARGET $FLAGS
   - ./haproxy -vv
-  - env VTEST_PROGRAM=../vtest/vtest make reg-tests
+  - env TMPDIR=~/tmp VTEST_PROGRAM=../vtest/vtest make reg-tests
 
 after_failure:
   - |
-- 
2.11.0



Re: [External] Re: QAT intermittent healthcheck errors

2019-05-03 Thread Emeric Brun
Hi Marcin,

On 5/3/19 4:56 PM, Marcin Deranek wrote:
> Hi Emeric,
> 
> On 5/3/19 4:50 PM, Emeric Brun wrote:
> 
>> I've a testing platform here but I don't use the usdm_drv but the 
>> qat_contig_mem and I don't reproduce this issue (I'm using QAT 1.5, as the 
>> doc says to use with my chip) .
> 
> I see. I use qat 1.7 and qat-engine 0.5.40.
> 
>> Anyway, could you re-compile a haproxy's binary if I provide you a testing 
>> patch?
> 
> Sure, that should not be a problem.

The patch in attachment.
> 
>> The idea is to perform a deinit in the master to force a close of those 
>> '/dev's at each reload. Perhaps It won't fix our issue but this leak of fd 
>> should not be.
> 
> Hope this will give us at least some more insight..
> Regards,
> 
> Marcin Deranek

R,
Emeric
>From ca57857a492e898759ef211a8fd9714d0f7dd7fa Mon Sep 17 00:00:00 2001
From: Emeric Brun 
Date: Fri, 3 May 2019 17:06:59 +0200
Subject: [PATCH] BUG/MEDIUM: ssl: fix ssl engine's open fds are leaking.

The master didn't call the engine deinit, resulting
in a leak of fd opened by the engine during init. The
workers inherit of these accumulated fds at each reload.

This patch add a call to engine deinit on the master just
before reloading with an exec.
---
 src/haproxy.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/haproxy.c b/src/haproxy.c
index 603f084c..f77eb1b4 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -588,6 +588,13 @@ void mworker_reload()
 	if (fdtab)
 		deinit_pollers();
 
+#if defined(USE_OPENSSL)
+#ifndef OPENSSL_NO_ENGINE
+	/* Engines may have opened fds and we must close them */
+	ssl_free_engines();
+#endif
+#endif
+
 	/* restore the initial FD limits */
 	limit.rlim_cur = rlim_fd_cur_at_boot;
 	limit.rlim_max = rlim_fd_max_at_boot;
-- 
2.17.1
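The mechanism behind the leak that this patch addresses can be illustrated outside HAProxy. The sketch below is not HAProxy code: it uses Python and /dev/null as a stand-in for the engine's /dev/usdm_drv handle, and shows that a file descriptor left inheritable survives into a child process image, just as the engine fds survived the master's re-exec on each reload:

```python
import os
import subprocess
import sys

# An fd opened without close-on-exec survives into a new process image.
# /dev/null stands in for the engine device the master kept re-opening.
fd = os.open("/dev/null", os.O_RDWR)
os.set_inheritable(fd, True)

# Ask a fresh interpreter whether the inherited fd is still open.
child = subprocess.run(
    [sys.executable, "-c", f"import os; print(os.fstat({fd}) is not None)"],
    close_fds=False, capture_output=True, text=True)
print(child.stdout.strip())  # prints: True
```

Closing (or marking close-on-exec) such fds before the re-exec, as the patch does via the engine deinit, is what stops them accumulating.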



Re: reg-tests are broken when running osx + openssl

2019-05-03 Thread Frederic Lecaille

On 5/3/19 3:44 PM, Илья Шипицин wrote:



Fri, 3 May 2019 at 18:42, Tim Düsterhus:


Ilya,

Am 03.05.19 um 15:39 schrieb Илья Шипицин:
 > when I played with enabling travis-ci, I tried to set TMPDIR
directly,
 > however I was not lucky enough.
 > Later Tim added "sed" magic to .travis.yml
 >
 > personally, I do not understand why "sed" is better than
assigning TMPDIR
 > directly.

I did not try using TMPDIR=/tmp or something like that, because I
thought there must be a reason why it's that strange long path.


I tried /tmp and /var/tmp
it seems that not every filesystem on osx can hold a network socket (at 
least from my point of view)


try to create a working directory owned by the user which runs the reg tests:

   $ mkdir -p ~/tmp/
   $ TMPDIR=~/tmp make reg-tests






Re: [External] Re: QAT intermittent healthcheck errors

2019-05-03 Thread Marcin Deranek

Hi Emeric,

On 5/3/19 4:50 PM, Emeric Brun wrote:


I've a testing platform here but I don't use the usdm_drv but the 
qat_contig_mem and I don't reproduce this issue (I'm using QAT 1.5, as the doc 
says to use with my chip) .


I see. I use qat 1.7 and qat-engine 0.5.40.


Anyway, could you re-compile a haproxy's binary if I provide you a testing 
patch?


Sure, that should not be a problem.



The idea is to perform a deinit in the master to force a close of those '/dev's 
at each reload. Perhaps It won't fix our issue but this leak of fd should not 
be.


Hope this will give us at least some more insight..
Regards,

Marcin Deranek


On 5/3/19 4:21 PM, Marcin Deranek wrote:

Hi Emeric,

It looks like on every reload the master leaks a /dev/usdm_drv device:

# systemctl restart haproxy.service
# ls -la /proc/$(cat haproxy.pid)/fd|fgrep dev
lr-x-- 1 root root 64 May  3 15:40 0 -> /dev/null
lrwx-- 1 root root 64 May  3 15:40 7 -> /dev/usdm_drv

# systemctl reload haproxy.service
# ls -la /proc/$(cat haproxy.pid)/fd|fgrep dev
lr-x-- 1 root root 64 May  3 15:40 0 -> /dev/null
lrwx-- 1 root root 64 May  3 15:40 7 -> /dev/usdm_drv
lrwx-- 1 root root 64 May  3 15:40 9 -> /dev/usdm_drv

# systemctl reload haproxy.service
# ls -la /proc/$(cat haproxy.pid)/fd|fgrep dev
lr-x-- 1 root root 64 May  3 15:40 0 -> /dev/null
lrwx-- 1 root root 64 May  3 15:40 10 -> /dev/usdm_drv
lrwx-- 1 root root 64 May  3 15:40 7 -> /dev/usdm_drv
lrwx-- 1 root root 64 May  3 15:40 9 -> /dev/usdm_drv

Obviously workers do inherit this from the master. Looking at workers I see the 
following:

* 1st gen:

# ls -al /proc/36083/fd|awk '/dev/ {print $NF}'|sort
/dev/null
/dev/null
/dev/qat_adf_ctl
/dev/qat_adf_ctl
/dev/qat_adf_ctl
/dev/qat_dev_processes
/dev/uio19
/dev/uio3
/dev/uio35
/dev/usdm_drv

* 2nd gen:

# ls -al /proc/41637/fd|awk '/dev/ {print $NF}'|sort
/dev/null
/dev/null
/dev/qat_adf_ctl
/dev/qat_adf_ctl
/dev/qat_adf_ctl
/dev/qat_dev_processes
/dev/uio23
/dev/uio39
/dev/uio7
/dev/usdm_drv
/dev/usdm_drv

Looks like only /dev/usdm_drv is leaked.

Cheers,

Marcin Deranek

On 5/3/19 2:22 PM, Emeric Brun wrote:

Hi Marcin,

On 4/29/19 6:41 PM, Marcin Deranek wrote:

Hi Emeric,

On 4/29/19 3:42 PM, Emeric Brun wrote:

Hi Marcin,




I've also a contact at intel who told me to try this option on the qat engine:


--disable-qat_auto_engine_init_on_fork/--enable-qat_auto_engine_init_on_fork
    Disable/Enable the engine from being initialized automatically 
following a
    fork operation. This is useful in a situation where you want to tightly
    control how many instances are being used for processes. For instance 
if an
    application forks to start a process that does not utilize QAT currently
    the default behaviour is for the engine to still automatically get 
started
    in the child using up an engine instance. After using this flag either 
the
    engine needs to be initialized manually using the engine message:
    INIT_ENGINE or will automatically get initialized on the first QAT 
crypto
    operation. The initialization on fork is enabled by default.


I tried to build QAT Engine with disabled auto init, but that did not help. Now 
I get the following during startup:

2019-04-29T15:13:47.142297+02:00 host1 hapee-lb[16604]: qaeOpenFd:753 Unable to 
initialize memory file handle /dev/usdm_drv
2019-04-29T15:13:47+02:00 localhost hapee-lb[16611]: 127.0.0.1:60512 
[29/Apr/2019:15:13:47.139] vip1/23: SSL handshake failure


" INIT_ENGINE or will automatically get initialized on the first QAT crypto 
operation"

Perhaps the init appears "with first qat crypto operation" and is delayed after 
the fork so if a chroot is configured, it doesn't allow some accesses
to /dev. Could you perform a test in that case without chroot enabled in the 
haproxy config ?


Removed chroot and now it initializes properly. Unfortunately reload still causes 
"stuck" HAProxy process :-(

Marcin Deranek


Could you check with "ls -l /proc//fd" if the "/dev/" is 
open multiple times after a reload?

Emeric







Re: [External] Re: QAT intermittent healthcheck errors

2019-05-03 Thread Emeric Brun
Hi Marcin,

Good so we progress!

I've a testing platform here but I don't use the usdm_drv but the 
qat_contig_mem and I don't reproduce this issue (I'm using QAT 1.5, as the doc 
says to use with my chip) .

Anyway, could you re-compile a haproxy's binary if I provide you a testing 
patch?

The idea is to perform a deinit in the master to force a close of those '/dev's 
at each reload. Perhaps it won't fix our issue, but this fd leak should not 
exist.

R,
Emeric

On 5/3/19 4:21 PM, Marcin Deranek wrote:
> Hi Emeric,
> 
> It looks like on every reload master leaks /dev/usdm_drv device:
> 
> # systemctl restart haproxy.service
> # ls -la /proc/$(cat haproxy.pid)/fd|fgrep dev
> lr-x-- 1 root root 64 May  3 15:40 0 -> /dev/null
> lrwx-- 1 root root 64 May  3 15:40 7 -> /dev/usdm_drv
> 
> # systemctl reload haproxy.service
> # ls -la /proc/$(cat haproxy.pid)/fd|fgrep dev
> lr-x-- 1 root root 64 May  3 15:40 0 -> /dev/null
> lrwx-- 1 root root 64 May  3 15:40 7 -> /dev/usdm_drv
> lrwx-- 1 root root 64 May  3 15:40 9 -> /dev/usdm_drv
> 
> # systemctl reload haproxy.service
> # ls -la /proc/$(cat haproxy.pid)/fd|fgrep dev
> lr-x-- 1 root root 64 May  3 15:40 0 -> /dev/null
> lrwx-- 1 root root 64 May  3 15:40 10 -> /dev/usdm_drv
> lrwx-- 1 root root 64 May  3 15:40 7 -> /dev/usdm_drv
> lrwx-- 1 root root 64 May  3 15:40 9 -> /dev/usdm_drv
> 
> Obviously workers do inherit this from the master. Looking at workers I see 
> the following:
> 
> * 1st gen:
> 
> # ls -al /proc/36083/fd|awk '/dev/ {print $NF}'|sort
> /dev/null
> /dev/null
> /dev/qat_adf_ctl
> /dev/qat_adf_ctl
> /dev/qat_adf_ctl
> /dev/qat_dev_processes
> /dev/uio19
> /dev/uio3
> /dev/uio35
> /dev/usdm_drv
> 
> * 2nd gen:
> 
> # ls -al /proc/41637/fd|awk '/dev/ {print $NF}'|sort
> /dev/null
> /dev/null
> /dev/qat_adf_ctl
> /dev/qat_adf_ctl
> /dev/qat_adf_ctl
> /dev/qat_dev_processes
> /dev/uio23
> /dev/uio39
> /dev/uio7
> /dev/usdm_drv
> /dev/usdm_drv
> 
> Looks like only /dev/usdm_drv is leaked.
> 
> Cheers,
> 
> Marcin Deranek
> 
> On 5/3/19 2:22 PM, Emeric Brun wrote:
>> Hi Marcin,
>>
>> On 4/29/19 6:41 PM, Marcin Deranek wrote:
>>> Hi Emeric,
>>>
>>> On 4/29/19 3:42 PM, Emeric Brun wrote:
 Hi Marcin,

>
>> I've also a contact at intel who told me to try this option on the qat 
>> engine:
>>
>>> --disable-qat_auto_engine_init_on_fork/--enable-qat_auto_engine_init_on_fork
>>>    Disable/Enable the engine from being initialized automatically 
>>> following a
>>>    fork operation. This is useful in a situation where you want to 
>>> tightly
>>>    control how many instances are being used for processes. For 
>>> instance if an
>>>    application forks to start a process that does not utilize QAT 
>>> currently
>>>    the default behaviour is for the engine to still automatically 
>>> get started
>>>    in the child using up an engine instance. After using this flag 
>>> either the
>>>    engine needs to be initialized manually using the engine message:
>>>    INIT_ENGINE or will automatically get initialized on the first 
>>> QAT crypto
>>>    operation. The initialization on fork is enabled by default.
>
> I tried to build QAT Engine with disabled auto init, but that did not 
> help. Now I get the following during startup:
>
> 2019-04-29T15:13:47.142297+02:00 host1 hapee-lb[16604]: qaeOpenFd:753 
> Unable to initialize memory file handle /dev/usdm_drv
> 2019-04-29T15:13:47+02:00 localhost hapee-lb[16611]: 127.0.0.1:60512 
> [29/Apr/2019:15:13:47.139] vip1/23: SSL handshake failure

 " INIT_ENGINE or will automatically get initialized on the first QAT 
 crypto operation"

 Perhaps the init appears "with first qat crypto operation" and is delayed 
 after the fork so if a chroot is configured, it doesn't allow some accesses
 to /dev. Could you perform a test in that case without chroot enabled in 
 the haproxy config ?
>>>
>>> Removed chroot and now it initializes properly. Unfortunately reload still 
>>> causes "stuck" HAProxy process :-(
>>>
>>> Marcin Deranek
>>
>> Could you check with "ls -l /proc//fd" if the "/dev/" 
>> is open multiple times after a reload?
>>
>> Emeric
>>




Re: [External] Re: QAT intermittent healthcheck errors

2019-05-03 Thread Marcin Deranek

Hi Emeric,

It looks like on every reload the master leaks a /dev/usdm_drv device:

# systemctl restart haproxy.service
# ls -la /proc/$(cat haproxy.pid)/fd|fgrep dev
lr-x-- 1 root root 64 May  3 15:40 0 -> /dev/null
lrwx-- 1 root root 64 May  3 15:40 7 -> /dev/usdm_drv

# systemctl reload haproxy.service
# ls -la /proc/$(cat haproxy.pid)/fd|fgrep dev
lr-x-- 1 root root 64 May  3 15:40 0 -> /dev/null
lrwx-- 1 root root 64 May  3 15:40 7 -> /dev/usdm_drv
lrwx-- 1 root root 64 May  3 15:40 9 -> /dev/usdm_drv

# systemctl reload haproxy.service
# ls -la /proc/$(cat haproxy.pid)/fd|fgrep dev
lr-x-- 1 root root 64 May  3 15:40 0 -> /dev/null
lrwx-- 1 root root 64 May  3 15:40 10 -> /dev/usdm_drv
lrwx-- 1 root root 64 May  3 15:40 7 -> /dev/usdm_drv
lrwx-- 1 root root 64 May  3 15:40 9 -> /dev/usdm_drv

Obviously workers do inherit this from the master. Looking at workers I 
see the following:


* 1st gen:

# ls -al /proc/36083/fd|awk '/dev/ {print $NF}'|sort
/dev/null
/dev/null
/dev/qat_adf_ctl
/dev/qat_adf_ctl
/dev/qat_adf_ctl
/dev/qat_dev_processes
/dev/uio19
/dev/uio3
/dev/uio35
/dev/usdm_drv

* 2nd gen:

# ls -al /proc/41637/fd|awk '/dev/ {print $NF}'|sort
/dev/null
/dev/null
/dev/qat_adf_ctl
/dev/qat_adf_ctl
/dev/qat_adf_ctl
/dev/qat_dev_processes
/dev/uio23
/dev/uio39
/dev/uio7
/dev/usdm_drv
/dev/usdm_drv

Looks like only /dev/usdm_drv is leaked.

Cheers,

Marcin Deranek

On 5/3/19 2:22 PM, Emeric Brun wrote:

Hi Marcin,

On 4/29/19 6:41 PM, Marcin Deranek wrote:

Hi Emeric,

On 4/29/19 3:42 PM, Emeric Brun wrote:

Hi Marcin,




I've also a contact at intel who told me to try this option on the qat engine:


--disable-qat_auto_engine_init_on_fork/--enable-qat_auto_engine_init_on_fork
   Disable/Enable the engine from being initialized automatically following 
a
   fork operation. This is useful in a situation where you want to tightly
   control how many instances are being used for processes. For instance if 
an
   application forks to start a process that does not utilize QAT currently
   the default behaviour is for the engine to still automatically get 
started
   in the child using up an engine instance. After using this flag either 
the
   engine needs to be initialized manually using the engine message:
   INIT_ENGINE or will automatically get initialized on the first QAT crypto
   operation. The initialization on fork is enabled by default.


I tried to build QAT Engine with disabled auto init, but that did not help. Now 
I get the following during startup:

2019-04-29T15:13:47.142297+02:00 host1 hapee-lb[16604]: qaeOpenFd:753 Unable to 
initialize memory file handle /dev/usdm_drv
2019-04-29T15:13:47+02:00 localhost hapee-lb[16611]: 127.0.0.1:60512 
[29/Apr/2019:15:13:47.139] vip1/23: SSL handshake failure


" INIT_ENGINE or will automatically get initialized on the first QAT crypto 
operation"

Perhaps the init appears "with first qat crypto operation" and is delayed after 
the fork so if a chroot is configured, it doesn't allow some accesses
to /dev. Could you perform a test in that case without chroot enabled in the 
haproxy config ?


Removed chroot and now it initializes properly. Unfortunately reload still causes 
"stuck" HAProxy process :-(

Marcin Deranek


Could you check with "ls -l /proc//fd" if the "/dev/" is 
open multiple times after a reload?

Emeric





Re: leak of handle to /dev/urandom since 1.8?

2019-05-03 Thread Lukas Tribus
Hello,


On Fri, 3 May 2019 at 14:15, Emeric Brun  wrote:
> >> Please do not commit this yet.
> >>
> >> We need those random devices open in openssl 1.1.1. We specifically
> >> pushed for this and had very long conversations with openssl folks.
> >>
> >> I don't have time to dig up the entire history right now, will do that
> >> later for context, however, please do not commit this yet.
> >>
> >>
> >
> > Lukas,
> >
> > This is the code of deinitilisation of the master, which is launched before
> > the re-execution of the master, it does not impact the workers.
> >
>
> Indeed if the workers keep the fd open it should work, the master is outside 
> the chroot and doesn't need to keep the fd open.

Ok, thanks for clarifying to both of you, I imagined something like
this but wanted to be sure.


cheers,
lukas



findings of gcc address sanitizer

2019-05-03 Thread Илья Шипицин
Hello,

I run reg-tests on gcc-9 (fedora 30).
I built haproxy the following way

make CC=gcc V=1 TARGET=$TARGET $FLAGS DEBUG_CFLAGS="-fsanitize=address
-ggdb" LDFLAGS="-lasan"

asan found couple of things

***  h10.1 debug|#0 0x6db986 in update_log_hdr src/log.c:1399
***  h10.1 debug|#1 0x6db986 in __do_send_log src/log.c:1547
***  h10.1 debug|#2 0x6db986 in __send_log src/log.c:1764
***  h10.1 debug|#3 0x6e274e in strm_log src/log.c:2959
***  h10.1 debug|#4 0x559753 in process_stream src/stream.c:2665
***  h10.1 debug|#5 0x7b66b6 in process_runnable_tasks
src/task.c:389
***  h10.1 debug|#6 0x6127f9 in run_poll_loop src/haproxy.c:2447
***  h10.1 debug|#7 0x6127f9 in run_thread_poll_loop
src/haproxy.c:2512
***  h10.1 debug|#8 0x42241d in main src/haproxy.c:3183
***  h10.1 debug|#9 0x7f8ebc8aff32 in __libc_start_main
(/lib64/libc.so.6+0x23f32)
***  h10.1 debug|#10 0x4250bd in _start
(/home/ilia/haproxy-1/haproxy+0x4250bd)
***  h10.1 debug|
***  h10.1 debug|0x61903c95 is located 21 bytes inside of 1025-byte
region [0x61903c80,0x61904081)
***  h10.1 debug|freed by thread T0 here:
***  h10.1 debug|#0 0x7f8ebd15c5de in realloc
(/lib64/libasan.so.5+0x10e5de)
***  h10.1 debug|#1 0x6dbd31 in my_realloc2
include/common/standard.h:1432
***  h10.1 debug|#2 0x6dbd31 in init_log_buffers src/log.c:1880
***  h10.1 debug|
***  h10.1 debug|previously allocated by thread T0 here:
***  h10.1 debug|#0 0x7f8ebd15c5de in realloc
(/lib64/libasan.so.5+0x10e5de)
***  h10.1 debug|#1 0x6dbd31 in my_realloc2
include/common/standard.h:1432
***  h10.1 debug|#2 0x6dbd31 in init_log_buffers src/log.c:1880
***  h10.1 debug|
***  h10.1 debug|SUMMARY: AddressSanitizer: heap-use-after-free
src/log.c:1399 in update_log_hdr






***  h10.1
debug|=
***  h10.1 debug|==23684==ERROR: LeakSanitizer: detected memory leaks
***  h10.1 debug|
***  h10.1 debug|Direct leak of 24 byte(s) in 1 object(s) allocated
from:
***  h10.1 debug|#0 0x7f9ac626f1a8 in __interceptor_malloc
(/lib64/libasan.so.5+0x10e1a8)
***  h10.1 debug|#1 0x7f9ac6076b1b  (/lib64/libssl.so.1.1+0x33b1b)
***  h10.1 debug|
***  h10.1 debug|SUMMARY: AddressSanitizer: 24 byte(s) leaked in 1
allocation(s).


Re: reg-tests are broken when running osx + openssl

2019-05-03 Thread Илья Шипицин
Fri, 3 May 2019 at 18:42, Tim Düsterhus :

> Ilya,
>
> Am 03.05.19 um 15:39 schrieb Илья Шипицин:
> > when I played with enabling travis-ci, I tried to set TMPDIR directly,
> > however I was not lucky enough.
> > Later Tim added "sed" magic to .travis.yml
> >
> > personally, I do not understand why "sed" is better than assigning TMPDIR
> > directly.
>
> I did not try using TMPDIR=/tmp or something like that, because I
> thought there must be a reason why it's that strange long path.
>

I tried /tmp and /var/tmp
it seems that not every filesystem on osx can hold a network socket (at least
from my point of view)


>
> Best regards
> Tim Düsterhus
>


Re: reg-tests are broken when running osx + openssl

2019-05-03 Thread Tim Düsterhus
Ilya,

Am 03.05.19 um 15:39 schrieb Илья Шипицин:
> when I played with enabling travis-ci, I tried to set TMPDIR directly,
> however I was not lucky enough.
> Later Tim added "sed" magic to .travis.yml
> 
> personally, I do not understand why "sed" is better than assigning TMPDIR
> directly.

I did not try using TMPDIR=/tmp or something like that, because I
thought there must be a reason why it's that strange long path.

Best regards
Tim Düsterhus



Re: reg-tests are broken when running osx + openssl

2019-05-03 Thread Илья Шипицин
Fri, 3 May 2019 at 18:33, Frederic Lecaille :

> On 5/3/19 1:34 PM, Tim Düsterhus wrote:
> > Fred,
> > Ilya,
>
> Hello Tim,
>
> > Am 03.05.19 um 13:20 schrieb Frederic Lecaille:
> >> About the test which fail, I would say that such errors are not
> >> negligible :
> >>
> >>  Starting frontend GLOBAL: cannot change UNIX socket ownership
> >> [/var/folders/nz/vv4_9tw56nv9k3tkvyszvwg8gn/T//regtest.zHu/
> >
> > I believe this is an issue with the long TMPDIR that I tried to mitigate
> > with this:
> https://github.com/haproxy/haproxy/blob/master/.travis.yml#L34
>
> With your patch, vtest is able to create the LOG files at the same place
> $TMPDIR/ where the UNIX stats socket should be
> created. So this does not interfere with the test.
>
> > While debugging I noticed that the validation did not properly account
> > for the temporary extension of the filename during start-up, causing
> > HAProxy to accept the filename during the check, but fail to set it up.
> > This leads to the misleading error message.
>
> Yes, perhaps the UNIX stats socket filename is too long (I have found a
> 104-byte max length for sun_path on Mac OS X, 108 on Linux).
>
> So, I propose you revert your fix, and try to find another way to set
> TMPDIR with a shorter value than the default one which is too long for
> UNIX sockets. At least this is the correct way to change the working
> directory for vtest.
>
> For instance we have:
>
>
> /var/folders/nz/vv4_9tw56nv9k3tkvyszvwg8gn/T//regtest.zHu/vtc.23058.0fa4d8bc/h1/stats.sock
>
> which is 94 bytes long. Should work only if we do not add an ..tmp
> extension bigger than 10 bytes. I guess this is not the case when the
> PID is big. Now I understand why some test may pass.
>
> I have also noted that there is a missing closing bracket in this log line:
>
> ***  h10.0 debug|[ALERT] 122/093540 (23139) : Starting frontend
> GLOBAL: cannot change UNIX socket ownership
> [/var/folders/nz/vv4_9tw56nv9k3tkvyszvwg8gn/T//regtest.zHu/
>
> which is built like that:
>
>  snprintf(errmsg, errlen, "%s [%s]", msg, path);
>
> with 100 as errlen value: "cannot change UNIX socket ownership
> [/var/folders/nz/vv4_9tw56nv9k3tkvyszvwg8gn/T//regtest.zHu/" is
> exactly a 100 bytes long string. So here the path for the UNIX socket is
> truncated in the log.
>
> So let's try with a shorter TMPDIR variable please. This should fix the
> issue.
>

when I played with enabling travis-ci, I tried to set TMPDIR directly,
however I was not lucky enough.
Later Tim added "sed" magic to .travis.yml

personally, I do not understand why "sed" is better than assigning TMPDIR
directly.

please enable travis-ci.com on your accounts and try your ideas (with osx).


>
> > I did not get around to investigating this further and filing a bug
> > report, however.
> >
> > Best regards
> > Tim Düsterhus
> >
>
>


Re: reg-tests are broken when running osx + openssl

2019-05-03 Thread Frederic Lecaille

On 5/3/19 1:34 PM, Tim Düsterhus wrote:

Fred,
Ilya,


Hello Tim,


Am 03.05.19 um 13:20 schrieb Frederic Lecaille:

About the test which fail, I would say that such errors are not
negligible :

     Starting frontend GLOBAL: cannot change UNIX socket ownership
[/var/folders/nz/vv4_9tw56nv9k3tkvyszvwg8gn/T//regtest.zHu/


I believe this is an issue with the long TMPDIR that I tried to mitigate
with this: https://github.com/haproxy/haproxy/blob/master/.travis.yml#L34


With your patch, vtest is able to create the LOG files at the same place 
$TMPDIR/ where the UNIX stats socket should be 
created. So this does not interfere with the test.



While debugging I noticed that the validation did not properly account
for the temporary extension of the filename during start-up, causing
HAProxy to accept the filename during the check, but fail to set it up.
This leads to the misleading error message.


Yes, perhaps the UNIX stats socket filename is too long (I have found a 
104-byte max length for sun_path on Mac OS X, 108 on Linux).


So, I propose you revert your fix, and try to find another way to set 
TMPDIR with a shorter value than the default one, which is too long for 
UNIX sockets. At least this is the correct way to change the working 
directory for vtest.


For instance we have:

/var/folders/nz/vv4_9tw56nv9k3tkvyszvwg8gn/T//regtest.zHu/vtc.23058.0fa4d8bc/h1/stats.sock

which is 94 bytes long. It should work only if we do not add an ..tmp 
extension bigger than 10 bytes. I guess this is not the case when the 
PID is big. Now I understand why some tests may pass.


I have also noted that there is a missing closing bracket in this log line:

***  h10.0 debug|[ALERT] 122/093540 (23139) : Starting frontend 
GLOBAL: cannot change UNIX socket ownership 
[/var/folders/nz/vv4_9tw56nv9k3tkvyszvwg8gn/T//regtest.zHu/


which is built like that:

snprintf(errmsg, errlen, "%s [%s]", msg, path);

with 100 as the errlen value: "cannot change UNIX socket ownership 
[/var/folders/nz/vv4_9tw56nv9k3tkvyszvwg8gn/T//regtest.zHu/" is 
exactly 100 bytes long. So the path for the UNIX socket is truncated 
in the log.


So let's try with a shorter TMPDIR variable please. This should fix the 
issue.



I did not get around to investigating this further and filing a bug
report, however.

Best regards
Tim Düsterhus






Re: [External] Re: QAT intermittent healthcheck errors

2019-05-03 Thread Emeric Brun
Hi Marcin,

On 4/29/19 6:41 PM, Marcin Deranek wrote:
> Hi Emeric,
> 
> On 4/29/19 3:42 PM, Emeric Brun wrote:
>> Hi Marcin,
>>
>>>
 I've also a contact at intel who told me to try this option on the qat 
 engine:

> --disable-qat_auto_engine_init_on_fork/--enable-qat_auto_engine_init_on_fork
>   Disable/Enable the engine from being initialized automatically following
>   a fork operation. This is useful in a situation where you want to tightly
>   control how many instances are being used for processes. For instance if an
>   application forks to start a process that does not utilize QAT currently
>   the default behaviour is for the engine to still automatically get started
>   in the child using up an engine instance. After using this flag either the
>   engine needs to be initialized manually using the engine message:
>   INIT_ENGINE or will automatically get initialized on the first QAT crypto
>   operation. The initialization on fork is enabled by default.
>>>
>>> I tried to build QAT Engine with disabled auto init, but that did not help. 
>>> Now I get the following during startup:
>>>
>>> 2019-04-29T15:13:47.142297+02:00 host1 hapee-lb[16604]: qaeOpenFd:753 
>>> Unable to initialize memory file handle /dev/usdm_drv
>>> 2019-04-29T15:13:47+02:00 localhost hapee-lb[16611]: 127.0.0.1:60512 
>>> [29/Apr/2019:15:13:47.139] vip1/23: SSL handshake failure
>>
>> " INIT_ENGINE or will automatically get initialized on the first QAT crypto 
>> operation"
>>
>> Perhaps the init appears "with first qat crypto operation" and is delayed 
>> after the fork so if a chroot is configured, it doesn't allow some accesses
>> to /dev. Could you perform a test in that case without chroot enabled in the 
>> haproxy config ?
> 
> Removed chroot and now it initializes properly. Unfortunately reload still 
> causes "stuck" HAProxy process :-(
> 
> Marcin Deranek

Could you check with "ls -l /proc//fd" if the "/dev/" is 
open multiple times after a reload?

Emeric



Re: leak of handle to /dev/urandom since 1.8?

2019-05-03 Thread Emeric Brun
Hi Lukas,

On 5/3/19 1:49 PM, William Lallemand wrote:
> On Fri, May 03, 2019 at 01:38:00PM +0200, Lukas Tribus wrote:
>> Hello everyone,
>>
>>
>> On Fri, 3 May 2019 at 12:50, Robert Allen1  wrote:
>>> +#if defined(USE_OPENSSL) && (OPENSSL_VERSION_NUMBER >= 0x10101000L)
>>> +   if (global.ssl_used_frontend || global.ssl_used_backend)
>>> +   /* close random device FDs */
>>> +   RAND_keep_random_devices_open(0);
>>> +#endif
>>>
>>> and requests a backport to 1.8 and 1.9 where we noticed this issue (and
>>> which
>>> include the re-exec for reload code, if I followed its history
>>> thoroughly).
>>
>> Please do not commit this yet.
>>
>> We need those random devices open in openssl 1.1.1. We specifically
>> pushed for this and had very long conversations with openssl folks.
>>
>> I don't have time to dig up the entire history right now, will do that
>> later for context, however, please do not commit this yet.
>>
>>
> 
> Lukas,
> 
> This is the deinitialisation code of the master, which runs before
> the re-execution of the master; it does not impact the workers.
> 

Indeed, if the workers keep the fd open it should work; the master is outside the 
chroot and doesn't need to keep the fd open.

Emeric



Re: leak of handle to /dev/urandom since 1.8?

2019-05-03 Thread William Lallemand
On Fri, May 03, 2019 at 01:38:00PM +0200, Lukas Tribus wrote:
> Hello everyone,
> 
> 
> On Fri, 3 May 2019 at 12:50, Robert Allen1  wrote:
> > +#if defined(USE_OPENSSL) && (OPENSSL_VERSION_NUMBER >= 0x10101000L)
> > +   if (global.ssl_used_frontend || global.ssl_used_backend)
> > +   /* close random device FDs */
> > +   RAND_keep_random_devices_open(0);
> > +#endif
> >
> > and requests a backport to 1.8 and 1.9 where we noticed this issue (and
> > which
> > include the re-exec for reload code, if I followed its history
> > thoroughly).
> 
> Please do not commit this yet.
> 
> We need those random devices open in openssl 1.1.1. We specifically
> pushed for this and had very long conversations with openssl folks.
> 
> I don't have time to dig up the entire history right now, will do that
> later for context, however, please do not commit this yet.
> 
> 

Lukas,

This is the deinitialisation code of the master, which runs before
the re-execution of the master; it does not impact the workers.

-- 
William Lallemand



Re: reg-tests are broken when running osx + openssl

2019-05-03 Thread Tim Düsterhus
Fred,
Ilya,

On 03.05.19 at 13:20, Frederic Lecaille wrote:
> About the test which fail, I would say that such errors are not
> negligible :
> 
>     Starting frontend GLOBAL: cannot change UNIX socket ownership
> [/var/folders/nz/vv4_9tw56nv9k3tkvyszvwg8gn/T//regtest.zHu/

I believe this is an issue with the long TMPDIR that I tried to mitigate
with this: https://github.com/haproxy/haproxy/blob/master/.travis.yml#L34

While debugging I noticed that the validation did not properly account
for the temporary extension of the filename during start-up, causing
HAProxy to accept the filename during the check, but fail to set it up.
This leads to the misleading error message.

I did not get around to investigating this further and filing a bug
report, however.

Best regards
Tim Düsterhus



Re: leak of handle to /dev/urandom since 1.8?

2019-05-03 Thread Lukas Tribus
Hello everyone,


On Fri, 3 May 2019 at 12:50, Robert Allen1  wrote:
> +#if defined(USE_OPENSSL) && (OPENSSL_VERSION_NUMBER >= 0x10101000L)
> +   if (global.ssl_used_frontend || global.ssl_used_backend)
> +   /* close random device FDs */
> +   RAND_keep_random_devices_open(0);
> +#endif
>
> and requests a backport to 1.8 and 1.9 where we noticed this issue (and
> which
> include the re-exec for reload code, if I followed its history
> thoroughly).

Please do not commit this yet.

We need those random devices open in openssl 1.1.1. We specifically
pushed for this and had very long conversations with openssl folks.

I don't have time to dig up the entire history right now, will do that
later for context, however, please do not commit this yet.


Also CCing Emeric.


Thanks,
Lukas



Re: reg-tests are broken when running osx + openssl

2019-05-03 Thread Frederic Lecaille

On 5/3/19 1:20 PM, Frederic Lecaille wrote:

So on OSX you should try to use/create a temporary working directory 
where you have enough permissions to create a stats UNIX socket with 
0600 as permissions.


I meant you should try to create a temporary working directory for vtest 
using the TMPDIR environment variable, as follows for instance:


  $ mkdir ~/tmp/foo

  $ TMPDIR=~/tmp/foo make reg-tests 
reg-tests/http-capture/multiple_headers.vtc


## Preparing to run tests ##
Testing with haproxy version: 2.0-dev2-a48237-261
Target : linux2628
Options : +EPOLL -KQUEUE -MY_EPOLL -MY_SPLICE +NETFILTER +PCRE -PCRE_JIT 
-PCRE2 -PCRE2_JIT +POLL -PRIVATE_CACHE +THREAD -PTHREAD_PSHARED -REGPARM 
-STATIC_PCRE -STATIC_PCRE2 +TPROXY +LINUX_TPROXY +LINUX_SPLICE +LIBCRYPT 
+CRYPT_H -VSYSCALL -GETADDRINFO +OPENSSL +LUA +FUTEX +ACCEPT4 
-MY_ACCEPT4 +ZLIB -SLZ +CPU_AFFINITY -TFO -NS +DL +RT -DEVICEATLAS 
-51DEGREES -WURFL -SYSTEMD -OBSOLETE_LINKER +PRCTL

## Gathering tests to run ##
  Add test: reg-tests/http-capture/multiple_headers.vtc
## Starting vtest ##
Testing with haproxy version: 2.0-dev2-a48237-261
#top  TEST reg-tests/http-capture/multiple_headers.vtc TIMED OUT 
(kill -9)
#top  TEST reg-tests/http-capture/multiple_headers.vtc FAILED 
(10.010) signal=9

1 tests failed, 0 tests skipped, 0 tests passed
## Gathering results ##
## Test case: reg-tests/http-capture/multiple_headers.vtc ##
## test results in: 
"/home/flecaille/tmp/foo/haregtests-2019-05-03_13-24-59.2P0ECZ/vtc.1327.1fe5daa1"

 c15.0 HTTP rx timeout (fd:9 5000 ms)
Makefile:971: recipe for target 'reg-tests' failed
make: *** [reg-tests] Error 1


As you can see the logs are now in /home/flecaille/tmp/foo (with ~ my 
home directory: /home/flecaille)


The LOG file is here: 
/home/flecaille/tmp/foo/haregtests-2019-05-03_13-24-59.2P0ECZ/vtc.1327.1fe5daa1/LOG


and the UNIX stats socket is here:

$ grep stats 
/home/flecaille/tmp/foo/haregtests-2019-05-03_13-24-59.2P0ECZ/vtc.1327.1fe5daa1/LOG 

 h 0.0 conf|\tstats socket 
"/home/flecaille/tmp/foo/haregtests-2019-05-03_13-24-59.2P0ECZ/vtc.1327.1fe5daa1/h/stats.sock" 
level admin mode 600









Re: reg-tests are broken when running osx + openssl

2019-05-03 Thread Frederic Lecaille

On 5/3/19 11:39 AM, Илья Шипицин wrote:

Hello,

I'm expanding openssl matrix.
here's failing build

https://travis-ci.org/chipitsine/haproxy-1/jobs/527683332


Hello Ilya,

In fact this has nothing to do with openssl. A lot of tests that make no 
use of TLS/SSL also fail.


There are a lot of HTTP rx timeouts.

Only these two tests passed:

   reg-tests/http-capture/multiple_headers.vtc
   reg-tests/spoe/wrong_init.vtc

but in these cases we do not have any log.

About the tests which fail, I would say that such errors are not negligible:

Starting frontend GLOBAL: cannot change UNIX socket ownership 
[/var/folders/nz/vv4_9tw56nv9k3tkvyszvwg8gn/T//regtest.zHu/


I have simulated it with the following patch on Linux:

$ git diff src/proto_uxst.c
diff --git a/src/proto_uxst.c b/src/proto_uxst.c
index 980a22649..b5f945b9f 100644
--- a/src/proto_uxst.c
+++ b/src/proto_uxst.c
@@ -309,6 +309,10 @@ static int uxst_bind_listener(struct listener *listener, char *errmsg, int errle

goto err_unlink_temp;
}

+   err |= ERR_FATAL | ERR_ALERT;
+   msg = "cannot change UNIX socket ownership";
+   goto err_unlink_temp;
+
ready = 0;
ready_len = sizeof(ready);
if (getsockopt(fd, SOL_SOCKET, SO_ACCEPTCONN, &ready, &ready_len) == -1)



I got the same results as yours: lots of HTTP RX timeouts because 
haproxy exited unexpectedly.


But in such a case on my PC only reg-tests/spoe/wrong_init.vtc succeeds.
I do not understand how reg-tests/http-capture/multiple_headers.vtc can 
succeed on your side.


It would be interesting to run it on OSX with this command:

$ make reg-tests reg-tests/http-capture/multiple_headers.vtc -- --debug


So on OSX you should try to use/create a temporary working directory 
where you have enough permissions to create a stats UNIX socket with 
0600 as permissions.


And let's see if that fixes your issue.


Fred.



Re: leak of handle to /dev/urandom since 1.8?

2019-05-03 Thread Robert Allen1
Hi William,

William Lallemand  wrote on 03/05/2019 11:06:41:


> your mailer seems to mess with the whitespaces and tabs in the patch.

Apologies again for the formatting on my last message.

My mailer -- if it deserves the term -- is Lotus Notes, which is
apparently incapable of doing the right thing when it comes to
wrapping. I'll try to fix this before I embarrass myself further... :)

(Or else only write short paragraphs and sentences.)

Rob


Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU




Re: leak of handle to /dev/urandom since 1.8?

2019-05-03 Thread Robert Allen1
Hi William,

William Lallemand  wrote on 03/05/2019 11:06:41:

> Could you send it to us as an attachment or using git-send-email, because
> your mailer seems to mess with the whitespaces and tabs in the patch.
> Also add a line at the end of the commit message indicating in which
> version this patch should be backported. Thanks!

Apologies! I have attached it now, with a backports line.

> > * My reading of RAND_keep_random_devices_open is that it expects OpenSSL
> >   rand_lib initialisation to have occurred already, and it will do it if not.
> >   So it seems possible that this function call could incur some delays if
> >   rand_lib is not yet initialised and the entropy sources cause delay, etc.
> >   However, I don't know how big a concern that is. Any thoughts?
> 
> In this case you could check the variables global.ssl_used_frontend &&
> global.ssl_used_backend to ensure that SSL was used in the configuration.
> When those variables are not set, the random is not initialized.

I did this in the attached patch.

However, I checked the current implementation in OpenSSL and I overstated
the problem before: the initialisation consists of constructing three locks
and initialising a short array of structs, with no obvious usage of random
devices. Therefore, it should not be very expensive, although it is still
unnecessary.

For the sake of the list, the patch now looks like:

+#if defined(USE_OPENSSL) && (OPENSSL_VERSION_NUMBER >= 0x10101000L)
+   if (global.ssl_used_frontend || global.ssl_used_backend)
+   /* close random device FDs */
+   RAND_keep_random_devices_open(0);
+#endif

and requests a backport to 1.8 and 1.9 where we noticed this issue (and
which include the re-exec for reload code, if I followed its history
thoroughly).

Rob


Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU


0001-BUG-MINOR-mworker-close-OpenSSL-FDs-on-reload.patch
Description: Binary data


[PR] BUILD: extend travis-ci matrix

2019-05-03 Thread PR Bot
Dear list!

Author: Ilya Shipitsin 
Number of patches: 1

This is an automated relay of the Github pull request:
   BUILD: extend travis-ci matrix

Patch title(s): 
   BUILD: extend travis-ci matrix

Link:
   https://github.com/haproxy/haproxy/pull/91

Edit locally:
   wget https://github.com/haproxy/haproxy/pull/91.patch && vi 91.patch

Apply locally:
   curl https://github.com/haproxy/haproxy/pull/91.patch | git am -

Description:
   added openssl-1.0.2, 1.1.0, 1.1.1, libressl-2.7.5, 2.8.3, 2.9.1
   added linux-ppc64le image
   
   libressl builds are still broken.
   They will be repaired by a separate patch (already sent to the mailing
   list).

Instructions:
   This github pull request will be closed automatically; patch should be
   reviewed on the haproxy mailing list (haproxy@formilux.org). Everyone is
   invited to comment, even the patch's author. Please keep the author and
   list CCed in replies. Please note that in absence of any response this
   pull request will be lost.



Re: leak of handle to /dev/urandom since 1.8?

2019-05-03 Thread William Lallemand
Hi Robert,

> Hi William,
> 
> Thanks for your input. I've included a patch below against current
> master that I hope conforms to the contribution guidelines well enough. :)
>

Could you send it to us as an attachment or using git-send-email, because
your mailer seems to mess with the whitespaces and tabs in the patch.
Also add a line at the end of the commit message indicating in which version
this patch should be backported. Thanks!

> A couple of thoughts on my work:
> 
> * Having to include a file directly from OpenSSL seems unfortunate, but
>   OK in the context of the preprocessor guard
> * The comment is perhaps redundant, but I don't think the side effect of
>   the OpenSSL function is obvious from its name otherwise

Fine to me.

> * My reading of RAND_keep_random_devices_open is that it expects OpenSSL
>   rand_lib initialisation to have occurred already, and it will do it if not.
>   So it seems possible that this function call could incur some delays if
>   rand_lib is not yet initialised and the entropy sources cause delay, etc.
>   However, I don't know how big a concern that is. Any thoughts?

In this case you could check the variables global.ssl_used_frontend &&
global.ssl_used_backend to ensure that SSL was used in the configuration.
When those variables are not set, the random is not initialized. 

Regards,

-- 
William Lallemand



reg-tests are broken when running osx + openssl

2019-05-03 Thread Илья Шипицин
Hello,

I'm expanding openssl matrix.
here's failing build

https://travis-ci.org/chipitsine/haproxy-1/jobs/527683332


Re: v1.9.6 socket unresponsive with high cpu usage

2019-05-03 Thread William Dauchy
Hi Willy,

> Note that with all the scheduling issues we've fixed over the last
> days, there are multiple candidates which could cause this. Another
> one was the lack of effect of the nice parameter which is normally
> set on the CLI but the lack of which could result in socat timing
> out during the first half second in absence of any response.

we got a similar issue with last v1.9.7+HEAD
(last commit 
http://git.haproxy.org/?p=haproxy-1.9.git;a=commit;h=f3c64c69b1a293ae54db359a2b2a5f9e0c5265dd)

Here are the complete threads backtraces:

(gdb) bt
#0  0x56153d958570 in fwrr_update_server_weight (srv=0x56157f7a8680) at 
src/lb_fwrr.c:198
#1  0x56153d8ae8ac in srv_update_status (s=0x56157f7a8680) at 
src/server.c:4923
#2  0x56153d8adfc2 in server_recalc_eweight (sv=sv@entry=0x56157f7a8680, 
must_update=must_update@entry=1) at src/server.c:1310
#3  0x56153d8b6edd in server_warmup (t=0x5615899c1a20, 
context=0x56157f7a8680, state=) at src/checks.c:1492
#4  0x56153d94d97a in process_runnable_tasks () at src/task.c:390
#5  0x56153d8c5c4f in run_poll_loop () at src/haproxy.c:2661
#6  run_thread_poll_loop (data=data@entry=0x5615893fab00) at src/haproxy.c:2726
#7  0x56153d83b455 in main (argc=, argv=0x7fff630890d8) at 
src/haproxy.c:3388

(gdb) thread apply all bt
Thread 16 (Thread 0x7fe9b6e32700 (LWP 2807)):
#0  0x56153d958459 in fwrr_update_server_weight (srv=0x56157f5b2fc0) at 
src/lb_fwrr.c:198
#1  0x56153d8ae8ac in srv_update_status (s=0x56157f5b2fc0) at 
src/server.c:4923
#2  0x56153d8adfc2 in server_recalc_eweight (sv=sv@entry=0x56157f5b2fc0, 
must_update=must_update@entry=1) at src/server.c:1310
#3  0x56153d8b6edd in server_warmup (t=0x5615899bf2f0, 
context=0x56157f5b2fc0, state=) at src/checks.c:1492
#4  0x56153d94d97a in process_runnable_tasks () at src/task.c:390
#5  0x56153d8c5c4f in run_poll_loop () at src/haproxy.c:2661
#6  run_thread_poll_loop (data=) at src/haproxy.c:2726
#7  0x7fe9bd5e7dd5 in start_thread () from /lib64/libpthread.so.0
#8  0x7fe9bc320ead in clone () from /lib64/libc.so.6
Thread 15 (Thread 0x7fe9b6631700 (LWP 2808)):
#0  0x56153d96d7a0 in __eb_insert_dup (new=0x56157f52f424, 
sub=0x56157f5640a4) at ebtree/ebtree.h:478
#1  eb_insert_dup (sub=, new=0x56157f52f424) at 
ebtree/ebtree.c:31
#2  0x56153d96df10 in __eb32_insert (new=new@entry=0x56157f52f424, 
root=, root@entry=0x56157deb4140) at ebtree/eb32tree.h:337
#3  eb32_insert (root=root@entry=0x56157deb4140, new=new@entry=0x56157f52f424) 
at ebtree/eb32tree.c:27
#4  0x56153d957fcb in fwrr_queue_srv (s=s@entry=0x56157f52f080) at 
src/lb_fwrr.c:371
#5  0x56153d9585e8 in fwrr_update_server_weight (srv=0x56157f52f080) at 
src/lb_fwrr.c:242
#6  0x56153d8ae8ac in srv_update_status (s=0x56157f52f080) at 
src/server.c:4923
#7  0x56153d8adfc2 in server_recalc_eweight (sv=sv@entry=0x56157f52f080, 
must_update=must_update@entry=1) at src/server.c:1310
#8  0x56153d8b6edd in server_warmup (t=0x5615899be8a0, 
context=0x56157f52f080, state=) at src/checks.c:1492
#9  0x56153d94d97a in process_runnable_tasks () at src/task.c:390
#10 0x56153d8c5c4f in run_poll_loop () at src/haproxy.c:2661
#11 run_thread_poll_loop (data=) at src/haproxy.c:2726
#12 0x7fe9bd5e7dd5 in start_thread () from /lib64/libpthread.so.0
#13 0x7fe9bc320ead in clone () from /lib64/libc.so.6
Thread 14 (Thread 0x7fe9b5e30700 (LWP 2809)):
#0  0x56153d958572 in fwrr_update_server_weight (srv=0x56157f625580) at 
src/lb_fwrr.c:198
#1  0x56153d8ae8ac in srv_update_status (s=0x56157f625580) at 
src/server.c:4923
#2  0x56153d8adfc2 in server_recalc_eweight (sv=sv@entry=0x56157f625580, 
must_update=must_update@entry=1) at src/server.c:1310
#3  0x56153d8b6edd in server_warmup (t=0x5615899bfbe0, 
context=0x56157f625580, state=) at src/checks.c:1492
#4  0x56153d94d97a in process_runnable_tasks () at src/task.c:390
#5  0x56153d8c5c4f in run_poll_loop () at src/haproxy.c:2661
#6  run_thread_poll_loop (data=) at src/haproxy.c:2726
#7  0x7fe9bd5e7dd5 in start_thread () from /lib64/libpthread.so.0
#8  0x7fe9bc320ead in clone () from /lib64/libc.so.6
Thread 13 (Thread 0x7fe9b562f700 (LWP 2810)):
#0  fwrr_update_server_weight (srv=0x56157f563d00) at src/lb_fwrr.c:198
#1  0x56153d8ae8ac in srv_update_status (s=0x56157f563d00) at 
src/server.c:4923
#2  0x56153d8adfc2 in server_recalc_eweight (sv=sv@entry=0x56157f563d00, 
must_update=must_update@entry=1) at src/server.c:1310
#3  0x56153d8b6edd in server_warmup (t=0x5615899becc0, 
context=0x56157f563d00, state=) at src/checks.c:1492
#4  0x56153d94d97a in process_runnable_tasks () at src/task.c:390
#5  0x56153d8c5c4f in run_poll_loop () at src/haproxy.c:2661
#6  run_thread_poll_loop (data=) at src/haproxy.c:2726
#7  0x7fe9bd5e7dd5 in start_thread () from /lib64/libpthread.so.0
#8  0x7fe9bc320ead in clone () from /lib64/libc.so.6
Thread 12 (Thread 0x7fe9a7fff700 (LWP 2811)):
#0  

Re: Zero RTT in backend server side

2019-05-03 Thread Igor Pav
Just tested with openssl 1.1.1b and haproxy 1.9.7; it appears there was no
success, you are right :)

On Thu, May 2, 2019 at 8:45 PM Olivier Houchard  wrote:
>
> Hi Igor,
>
> On Thu, May 02, 2019 at 08:39:58PM +0800, Igor Pav wrote:
> > Hello, can we use TLS zero RTT in server-side now? Just want to reduce
> > more latency when using SSL talk to the backend servers(also running
> > haproxy).
> >
> > Thanks in advance. Regards
> >
>
> It should work if you add "allow-0rtt" on your server line. However it hasn't
> been tested for some time, and was written with a development version of
> OpenSSL 1.1.1, so I wouldn't be entirely surprised if it didn't work anymore.
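>
> [Editor's note: for readers wanting to try this, a minimal sketch of such a
> backend follows; the backend name, server name, and address are hypothetical,
> and "verify none" is only for testing:]
>
>     backend app
>         server srv1 192.0.2.10:443 ssl verify none allow-0rtt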
>
> Regards,
>
> Olivier



Re: leak of handle to /dev/urandom since 1.8?

2019-05-03 Thread Robert Allen1
William Lallemand  wrote on 02/05/2019 20:56:32:

> From: William Lallemand 
> To: Robert Allen1 
> Cc: haproxy@formilux.org
> Date: 02/05/2019 20:56
> Subject: Re: leak of handle to /dev/urandom since 1.8?
> 
> On Thu, May 02, 2019 at 03:34:22PM +0100, Robert Allen1 wrote:
> > Hi,
> > 
> > I spent some time digging into the FD leak and I've pinpointed it: it's an
> > interaction between HAProxy re-exec and a change to keep the random
> > devices open by default in OpenSSL 1.1.1 -- specifically
> > https://github.com/openssl/openssl/commit/c7504aeb640a88949dfe3146f7e0f275f517464c
> > 
> > Since a lot of different RAND_* family functions cause initialisation,
> > that explains why hacking out ssl_initialize_random didn't work for us;
> > some other part of reloading just ended up opening the random devices
> > and then leaking the handles on the next exec().
> > 
> > Anyway, from experimentation, we can close the leak with a patch like
> > (but obviously better abstracted than) this one against 1.9.6:
> > 
> > diff --git a/src/haproxy.c b/src/haproxy.c
> > index 1cb10391..d3482f46 100644
> > --- a/src/haproxy.c
> > +++ b/src/haproxy.c
> > @@ -737,6 +737,10 @@ void mworker_reload()
> > ptdf->fct();
> > if (fdtab)
> > deinit_pollers();
> > +   /* Close OpenSSL random devices, if open */
> > +   /* Note: If not already initialised, this may cause OpenSSL 
> > rand_lib initialisation... */
> > +   void RAND_keep_random_devices_open(int keep);
> > +   RAND_keep_random_devices_open(0);
> > /* restore the initial FD limits */
> > limit.rlim_cur = rlim_fd_cur_at_boot;
> > 
> > From reading in the OpenSSL code, I assume that the use of O_RDONLY
> > without O_CLOEXEC or a fcntl is intentional so that it doesn't have to
> > re-open the devices on fork().
> > 
> > So it looks like the best place to solve this is around there ^^, although
> > it needs to be wrapped up in ssl_sock.c with appropriate version guards
> > on the OpenSSL version.
> > 
> > Any thoughts on the above and how to proceed?
> > 
> > Rob
> > 
> 
> Hi Robert,
> 
> I think the right thing to do is probably to just call
> RAND_keep_random_devices_open(0); in the mworker_reload function.
> 
> You just need to check the SSL macros for the build and the minimum
> openssl version (1.1.1), so probably something like this:
> 
> #if defined(USE_OPENSSL) && (OPENSSL_VERSION_NUMBER >= 0x10101000L)
> RAND_keep_random_devices_open(0);
> #endif
> 
> I don't think we need anything in ssl_sock.c though.
> 
> Regards,
> 
> -- 
> William Lallemand
> 

Hi William,

Thanks for your input. I've included a patch below against current master
that I hope conforms to the contribution guidelines well enough. :)

A couple of thoughts on my work:

* Having to include a file directly from OpenSSL seems unfortunate, but
  OK in the context of the preprocessor guard
* The comment is perhaps redundant, but I don't think the side effect of
  the OpenSSL function is obvious from its name otherwise
* My reading of RAND_keep_random_devices_open is that it expects OpenSSL
  rand_lib initialisation to have occurred already, and it will do it if not.
  So it seems possible that this function call could incur some delays if
  rand_lib is not yet initialised and the entropy sources cause delay, etc.
  However, I don't know how big a concern that is. Any thoughts?

Rob



From 7f432956cd7a11837c2657944b5e037c510645c7 Mon Sep 17 00:00:00 2001
From: Rob Allen 
Date: Fri, 3 May 2019 09:11:32 +0100
Subject: [PATCH] BUG/MINOR: mworker: close OpenSSL FDs on reload

From OpenSSL 1.1.1, the default behaviour is to maintain open FDs to any
random devices that get used by the random number library. As a result,
those FDs leak when the master re-execs on reload; since those FDs are
not marked FD_CLOEXEC or O_CLOEXEC, they also get inherited by children.
Eventually both master and children run out of FDs.

OpenSSL 1.1.1 introduces a new function to control whether the random
devices are kept open. When clearing the keep-open flag, it also closes
any currently open FDs, so it can be used to clean-up open FDs too.
Therefore, a call to this function is made in mworker_reload prior to
re-exec.
---
 src/haproxy.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/haproxy.c b/src/haproxy.c
index 603f084c..cc689b62 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -127,6 +127,7 @@
 #include 
 #ifdef USE_OPENSSL
 #include 
+#include 
 #endif
 
 /* array of init calls for older platforms */
@@ -587,6 +588,10 @@ void mworker_reload()
ptdf->fct();
if (fdtab)
deinit_pollers();
+#if defined(USE_OPENSSL) && (OPENSSL_VERSION_NUMBER >=