I was actually the one who had the hunch to disable compression. I
suspected that this was the issue because there were a bunch of "abort"
calls in include/common/hathreads.h, which is used by the compression
stuff. However, I just noticed those aborts are only there if
DEBUG_THREAD is defined, which it doesn't seem to be for our build. So
basically, I have no clue whatsoever why disabling compression fixes the
issue.
I can see next week if we can make a build with slz instead of zlib (we
seem to be linked against zlib/libz atm).
On 4/6/2018 14:18, Willy Tarreau wrote:
On Fri, Apr 06, 2018 at 10:53:36AM +0000, Frank Schreuder wrote:
We tested haproxy 1.8.6 with compression enabled today, within the first few
hours it already went wrong:
[ALERT] 095/120526 (12989) : Current worker 5241 exited with code 134
OK thanks, and sorry for that.
Our other balancer running haproxy 1.8.5 with compression disabled is still
running fine after 2 days with the same workload.
So there seems to be a locking issue when compression is enabled.
Well, an issue with compression, but I'm really not seeing what makes
you speak about locking since :
- you don't seem to have threads enabled
- locking issues generally cause deadlocks, not aborts
The other problem is that we noticed already that there are very few
abort() calls in haproxy and none of them in this area. So it's very
possible that it comes from another layer detecting an issue provoked
by compression. Typically the libc's malloc/free can stop the program
using abort() if they detect a corruption.
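For reference, the "code 134" in the alert above is consistent with an abort(): on Linux SIGABRT is signal 6, and a common convention is to report a signal death as 128 + the signal number. The following is only a minimal sketch of that arithmetic, not haproxy code. Both helper names are hypothetical, and the assumption is that the reporting process uses the 128 + signal convention.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that calls abort() and return the
 * raw wait status collected by the parent, or -1 on error. */
int wait_status_of_aborting_child(void)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        abort();     /* raises SIGABRT in the child */
        _exit(127);  /* not reached */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}

/* Hypothetical helper: turn a wait status into a single code using the
 * common "128 + signal number" convention for signal deaths. */
int shell_style_code(int status)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status);     /* normal exit: plain exit code */
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);  /* killed by a signal */
    return -1;                          /* stopped or otherwise unknown */
}
```

Calling shell_style_code(wait_status_of_aborting_child()) from a small main() should give 128 + SIGABRT. This only confirms that the worker died from an abort(), not which layer called it, so a backtrace is still needed.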
It would really help to know where this abort() happens, at least to
get a backtrace.
By the way, are you using zlib or slz? zlib uses a tricky allocator.
I checked it again yesterday and it was made thread safe. But we couldn't
rule out an issue there. slz doesn't need memory however. If you're on
zlib, switching to slz could also indicate if the problem is related to
these memory allocations or not.