Hi Hemant,
On Tue, Nov 28, 2017 at 12:15:32PM -0600, Coscend@HAProxy wrote:
> Thank you for giving time after a large 1.8.0 release to provide us vectors
> on testing HAProxy work mode. This community's intensity and laser focus
> are a standout. Below is all the information you asked us to look at. By
> any chance, was there any critical change between 1.8-dev3 and 1.8-rc
> series/1.8.0?
Sure, a lot! Too many to enumerate in fact.
> New checks: Loads stats page and accesses all backends
>
> SUCCESS: 1.8-dev0/dev1/dev2/dev3
> FAILS: 1.8-rc1/rc2/rc3/rc4 and 1.8.0
>
> Browser message:
>
> This site can't be reached
>
> coscend.com took too long to respond.
Ah OK so it's not a "connection refused" as I thought it was, but the
fact that it apparently hangs. That's a different case. I suspect the
process quickly ends up spinning in a loop at 100% CPU due to a bug. There
has been a locking bug affecting the mailers section that causes this; would
you happen to have such a section in your config?
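If it helps, a quick way to check is something like the sketch below. It assumes all your configuration lives under the file or directory you pass to -f, and find_mailers is just a name made up here, not an haproxy tool. It only looks for lines that open a "mailers" section, so "email-alert mailers ..." lines inside a proxy are not reported.

```shell
# Hypothetical helper: print every line that opens a "mailers" section.
# $1 is the file or directory passed to haproxy's -f option.
# Example: find_mailers /usr/local/haproxy-1.8.0/conf
find_mailers() {
    # -r recurse into directories, -n show line numbers,
    # -E extended regex: optional indent, the keyword, then a section name
    grep -rnE '^[[:space:]]*mailers[[:space:]]+[^[:space:]]+' "$1"
}
```

No output (and a non-zero exit status) means no such section was found under that path.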
> (1) check if the process is still present (ps aux)
>
> root 2801 0.0 0.1 74928 11560 ? S 11:28 0:00
> /usr/local/sbin/haproxy -Ws -V -C /usr/local/haproxy-1.8.0/conf -f
> /usr/local/haproxy-1.8.0/conf -f <path to configuration files> -D -p
> /var/run/haproxy.pid
>
> haproxy 2802 99.5 0.1 75620 12408 ? Rs 11:28 10:50
               ^^^^^
Bingo, spinning process!
Thanks for all the elements. If you have a "mailers" section,
then the next nightly snapshot will contain the fix, or you can pick
it yourself: it's commit 8f42b19 ("BUG/MEDIUM: tcp-check: Don't lock
the server in tcpcheck_main").
If you don't have such a section, that's embarrassing, it means there
is another locking bug. In this case I'll ask you for more info off-list.
Regards,
Willy