On Mon, Dec 4, 2017 at 11:56 AM, Christopher Lane
<[email protected]> wrote:
>
>
>
> On Mon, Dec 4, 2017 at 4:22 AM Lukas Tribus <[email protected]> wrote:
>
>>Hello Christopher,
>
>
>>2017-12-01 20:59 GMT+01:00 Christopher Lane <[email protected]>:
>>>
>>> gist with backtrace, -vv output, and config file. Also strace.
>>>
>>> https://gist.github.com/jayalane/c6dbe7918aa9635b62c874d20f57dfec
>>>
>>> It does all the listens and then right after the first epoll is done it
>>> has this segv. all the local variables are "optimized out"
>>>
>>> (We really want the hard-stop-after -- we get a leak of children with our
>>> super frequent soft-reloads).
>
>>FYI, hard-stop-after has been backported to 1.7 stable and is in all
>>rebuilds starting with 1.7.4:
>
>> https://www.mail-archive.com/[email protected]/msg25494.html
>
>
>
>> Regards,
>> Lukas
>
> Olivier:
>
> The patch worked beautifully to keep the 1.8.0 from crashing.
>
> Lukas:
>
> Thanks for the tip, we'll consider 1.7.9 then (settled on 1.8.0 by starting
> out with reading the release notes for it; we are upgrading from 1.5.0).
>
> --Chris
Unrelated to the prior contents of this thread, I found the root cause
of our issue (the child leak).
haproxy was being started from a Java process using Runtime.exec(),
and the -sf PIDs were all jammed into a single argv element. It killed
the first child but not the later ones, since
atol("13233 13234 13235 13236") returns 13233. We have fixed the Java code.
http://git.haproxy.org/?p=haproxy.git;a=blob;f=src/haproxy.c;h=eb5e65b40e7b8b2a4f8fb04b3552401e42fb0a89;hb=HEAD#l1421
I note that the -sf parsing code uses atol and has no error checking.
Would the project be interested in a patch to use strtol with error
checks? It could log a warning if there are unconsumed bytes in the
argument (safer), or fail to start (unsafe). I'm sure I'm not the only
one to hit this sort of bug, given how tricky shell escaping and the
like can be.
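To make the proposal concrete, here is a minimal sketch of the kind of check
I have in mind. It is not the haproxy code: parse_pid_arg is a hypothetical
helper name, and rejecting the argument outright (rather than warning) is
just one of the two options above.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper, not taken from haproxy: parse one argv element
 * as a PID. Returns 0 and stores the value in *out on success, or -1
 * if the argument is not exactly one positive decimal number. */
int parse_pid_arg(const char *arg, long *out)
{
    char *end;
    long val;

    if (arg == NULL || *arg == '\0')
        return -1;

    errno = 0;
    val = strtol(arg, &end, 10);

    if (end == arg)         /* no digits were consumed */
        return -1;
    if (*end != '\0')       /* unconsumed bytes, e.g. "13233 13234" */
        return -1;
    if (errno == ERANGE)    /* value does not fit in a long */
        return -1;
    if (val <= 0)           /* zero or negative is not a usable PID here */
        return -1;

    *out = val;
    return 0;
}
```

With this, "13233" is accepted, while "13233 13234 13235 13236" is rejected
because strtol stops at the first space and leaves the rest unconsumed. atol
on the same string quietly returns 13233, which is how we ended up leaking
children.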
Thanks again, and we are enjoying 1.8.1 and much lower loadavgs.
--Chris