Re: Linux Builds broken on Travis CI

2019-09-15 Thread Willy Tarreau
On Sat, Sep 14, 2019 at 09:20:53PM +0500, Илья Шипицин wrote:
> It turned out that ASAN is "the root cause" of those failures. Let us
> disable it for a while (patch attached).

OK now merged but I really fail to see the relation between ASAN and
the random "out of memory" errors spewed by *vtest* itself, not even
haproxy. I think you only caught a side effect. Anyway I've merged
your patch so that we can observe with less noise, but I don't expect
to see any difference here.

Willy



Re: Linux Builds broken on Travis CI

2019-09-14 Thread Илья Шипицин
It turned out that ASAN is "the root cause" of those failures. Let us
disable it for a while (patch attached).

On Fri, Sep 13, 2019 at 19:23, Илья Шипицин wrote:

>
>
> On Fri, Sep 13, 2019, 3:49 PM Willy Tarreau wrote:
>
>> On Fri, Sep 13, 2019 at 03:45:21PM +0500, Илья Шипицин wrote:
>> > Now the build fails with
>> >
>> > "** h1 debug|[ALERT] 255/081449 (8721) : failed to allocate resources for
>> > thread 1."
>>
>> Those are exactly the issues I was talking about, which started to happen
>> with increasing frequency over the last few weeks.
>>
>> > No more failures due to leaks.
>>
>> Great! What do you think about leaving the tests only for the cron tasks?
>>
>
> Give me a few days ))
>
>
>
>> Willy
>>
>
From 8254d7dbd00cf9c6f06aa434d1bf156a6b04d7e0 Mon Sep 17 00:00:00 2001
From: Ilya Shipitsin 
Date: Sat, 14 Sep 2019 21:18:49 +0500
Subject: [PATCH] BUILD: CI: temporarily disable ASAN

it turned out that ASAN breaks things. until this is resolved,
let us disable ASAN
---
 .travis.yml | 1 -
 1 file changed, 1 deletion(-)

diff --git a/.travis.yml b/.travis.yml
index a65662496..8fb906409 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -82,7 +82,6 @@ install:
   - scripts/build-ssl.sh > build-ssl.log 2>&1 || (cat build-ssl.log && exit 1)
 
 script:
-  - if [ "${CC}"  = "clang" ]; then export FLAGS="$FLAGS USE_OBSOLETE_LINKER=1" DEBUG_CFLAGS="-g -fsanitize=address" LDFLAGS="-fsanitize=address"; fi
   - make -C contrib/wurfl
   - make -j3 CC=$CC V=1 TARGET=$TARGET $FLAGS DEBUG_CFLAGS="$DEBUG_CFLAGS" LDFLAGS="$LDFLAGS" 51DEGREES_SRC="$FIFTYONEDEGREES_SRC" EXTRA_OBJS="$EXTRA_OBJS"
   - if [ "${TRAVIS_OS_NAME}" = "linux" ]; then export LD_LIBRARY_PATH="${HOME}/opt/lib:${LD_LIBRARY_PATH:-}"; fi
-- 
2.20.1



Re: Linux Builds broken on Travis CI

2019-09-13 Thread Илья Шипицин
On Fri, Sep 13, 2019, 3:49 PM Willy Tarreau wrote:

> On Fri, Sep 13, 2019 at 03:45:21PM +0500, Илья Шипицин wrote:
> > Now the build fails with
> >
> > "** h1 debug|[ALERT] 255/081449 (8721) : failed to allocate resources for
> > thread 1."
>
> Those are exactly the issues I was talking about, which started to happen
> with increasing frequency over the last few weeks.
>
> > No more failures due to leaks.
>
> Great! What do you think about leaving the tests only for the cron tasks?
>

Give me a few days ))



> Willy
>


Re: Linux Builds broken on Travis CI

2019-09-13 Thread Willy Tarreau
On Fri, Sep 13, 2019 at 03:45:21PM +0500, Илья Шипицин wrote:
> Now the build fails with
>
> "** h1 debug|[ALERT] 255/081449 (8721) : failed to allocate resources for
> thread 1."

Those are exactly the issues I was talking about, which started to happen
with increasing frequency over the last few weeks.

> No more failures due to leaks.

Great! What do you think about leaving the tests only for the cron tasks?
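
A rough sketch of the cron-only idea, assuming the TRAVIS_EVENT_TYPE
variable that Travis CI sets to "cron" for scheduled builds. The names
run_ci_tests and REGTEST_CMD are hypothetical, and the default
"make reg-tests" invocation is an assumption about how the reg tests get
started:

```shell
# Sketch only: run_ci_tests and REGTEST_CMD are made-up names for this example.
# TRAVIS_EVENT_TYPE is provided by Travis CI (e.g. "push", "pull_request", "cron").
run_ci_tests() {
  if [ "${TRAVIS_EVENT_TYPE:-}" = "cron" ]; then
    echo "cron build: running reg tests"
    # Assumed invocation; replace with the project's real reg test command.
    ${REGTEST_CMD:-make reg-tests}
  else
    echo "non-cron build: skipping reg tests"
  fi
}
```

Calling run_ci_tests from the script section after the build step would keep
push and pull request builds limited to compilation, so a red build always
means a build error.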

Willy



Re: Linux Builds broken on Travis CI

2019-09-13 Thread Илья Шипицин
Now the build fails with

"** h1 debug|[ALERT] 255/081449 (8721) : failed to allocate resources for
thread 1."


No more failures due to leaks.



On Fri, Sep 13, 2019 at 13:33, Willy Tarreau wrote:

> On Fri, Sep 13, 2019 at 01:23:12PM +0500, Илья Шипицин wrote:
> > The build failed due to a memory leak detected by ASAN:
> >
> > https://github.com/haproxy/haproxy/issues/256
> >
> >
> > I think we can change the way ASAN works, i.e. log errors and do not stop
> > the tests.
>
> I didn't even notice it was this one because we've had too many errors
> reported over the last weeks as I mentioned. At this point I'd rather
> do the opposite and possibly keep asan (if it reports valid things only)
> and drop the tests which randomly fail 50% of the time on this
> infrastructure.
>

I meant "collect and report leaks separately from other failures".
I'll send a patch soon.


>
> Willy
>


Re: Linux Builds broken on Travis CI

2019-09-13 Thread Willy Tarreau
On Fri, Sep 13, 2019 at 01:23:12PM +0500, Илья Шипицин wrote:
> The build failed due to a memory leak detected by ASAN:
>
> https://github.com/haproxy/haproxy/issues/256
>
>
> I think we can change the way ASAN works, i.e. log errors and do not stop
> the tests.

I didn't even notice it was this one because we've had too many errors
reported over the last weeks as I mentioned. At this point I'd rather
do the opposite and possibly keep asan (if it reports valid things only)
and drop the tests which randomly fail 50% of the time on this
infrastructure.

Willy



Re: Linux Builds broken on Travis CI

2019-09-13 Thread Илья Шипицин
The build failed due to a memory leak detected by ASAN:

https://github.com/haproxy/haproxy/issues/256


I think we can change the way ASAN works, i.e. log errors and do not stop
the tests.
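
A minimal sketch of what that could look like, assuming the test binaries
pick up ASAN_OPTIONS from the environment. ASAN_DIR and report_asan are
names made up for this example, and the effect of exitcode=0 on leak
reports should be checked against the clang version in use:

```shell
# Sketch only: ASAN_DIR and report_asan are hypothetical names.
ASAN_DIR="${ASAN_DIR:-/tmp/asan-reports}"
mkdir -p "$ASAN_DIR"

# log_path writes sanitizer reports to $ASAN_DIR/asan.<pid> instead of stderr.
# exitcode=0 is intended to stop a report from changing the exit status
# (assumption: verify this with your toolchain).
export ASAN_OPTIONS="log_path=${ASAN_DIR}/asan:exitcode=0"

# After the test run, print any collected reports separately from test results.
# Returns 1 if at least one report file exists, 0 otherwise.
report_asan() {
  found=0
  for f in "$ASAN_DIR"/asan.*; do
    [ -e "$f" ] || continue
    found=1
    echo "=== $f ==="
    cat "$f"
  done
  return "$found"
}
```

Running report_asan as a separate step after the tests would let leak
reports fail (or merely annotate) the build on their own, without being
mixed into the reg test failures.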



On Fri, Sep 13, 2019, 7:59 AM Willy Tarreau wrote:

> Hi Tim,
>
> On Fri, Sep 06, 2019 at 04:30:24PM +0200, Tim Düsterhus wrote:
> > Dear List
> >
> > something between 02bac85bee664976f6dcecc424864e9fb99975be and
> > f909c91e8a739b9ef7409b399259201fe883771c broke all the Linux builds on
> > Travis CI:
> >
> > - 41 reg tests fail with a timeout
> > - 3 reg tests pass
> >
> > FreeBSD works fine.
> >
> > Somebody really ought to take a look. I might try to bisect if I find a
> > bit of spare time. If someone beats me to it: Go ahead.
>
> I've been quite annoyed with this a number of times and ended up not
> looking at build reports anymore due to this. I've spent some time
> looking at the cause as well and bisecting, coming to the conclusion
> that apparently the Travis VMs are regularly overloaded. Most of the
> time we see TCP connection timeouts on the loopback preventing the
> vtest client from reaching haproxy! I've even seen a number of "out
> of memory" messages hitting the client. It's possible that their
> hypervisor is sometimes running out of memory. Maybe their service
> is abused by other projects which induce a huge load. At some point I
> used to click "build again", which managed to randomly work, but I
> gave up, being used to seeing this constantly red :-(
>
> I think that we should simply disable reg tests and stick to build
> tests only. There's nothing worse than getting used to seeing errors,
> as by not seeing a difference between a build error and a test error
> we get trained to ignore results.
>
> Maybe we can keep the reg tests for cron jobs, but given that they
> similarly fail I don't see the benefit either.
>
> I too would like to see them turn green again :-/
>
> Cheers,
> Willy
>
>


Re: Linux Builds broken on Travis CI

2019-09-12 Thread Willy Tarreau
Hi Tim,

On Fri, Sep 06, 2019 at 04:30:24PM +0200, Tim Düsterhus wrote:
> Dear List
> 
> something between 02bac85bee664976f6dcecc424864e9fb99975be and
> f909c91e8a739b9ef7409b399259201fe883771c broke all the Linux builds on
> Travis CI:
> 
> - 41 reg tests fail with a timeout
> - 3 reg tests pass
> 
> FreeBSD works fine.
> 
> Somebody really ought to take a look. I might try to bisect if I find a
> bit of spare time. If someone beats me to it: Go ahead.

I've been quite annoyed with this a number of times and ended up not
looking at build reports anymore due to this. I've spent some time
looking at the cause as well and bisecting, coming to the conclusion
that apparently the Travis VMs are regularly overloaded. Most of the
time we see TCP connection timeouts on the loopback preventing the
vtest client from reaching haproxy! I've even seen a number of "out
of memory" messages hitting the client. It's possible that their
hypervisor is sometimes running out of memory. Maybe their service
is abused by other projects which induce a huge load. At some point I
used to click "build again", which managed to randomly work, but I
gave up, being used to seeing this constantly red :-(

I think that we should simply disable reg tests and stick to build
tests only. There's nothing worse than getting used to seeing errors,
as by not seeing a difference between a build error and a test error
we get trained to ignore results.

Maybe we can keep the reg tests for cron jobs, but given that they
similarly fail I don't see the benefit either.

I too would like to see them turn green again :-/

Cheers,
Willy