Hi all,

We carried out an update from dev12 to dev21 as per my previous message to
the list and the specific issue I mentioned before no longer occurred -
good.

Unfortunately we hit a fairly major problem which, summed up, is a 'packet of
death' scenario that affects dev21 but not dev12 (we have not yet built the
intermediate versions to determine the exact 'regression').

I mentioned this briefly in IRC but for archive's sake here's what we spent
the day debugging:

Steps to reproduce:
1) Build a CentOS 6.4 (C6.4) system using vault.centos.org
2) Install haproxy-1.5-dev21.x86_64
3) Set the sysctl properties net.ipv6.conf.default.disable_ipv6 = 1 and
net.ipv6.conf.all.disable_ipv6 = 1
4) Start a basic haproxy configuration with a chroot specified and an
SSL-enabled listener.
5) Use a tool that can specify cipher strength (LOW or MEDIUM crashes but
HIGH does not), such as ApacheBench, to open a connection to the listener,
e.g.:
ab -c 1 -n 1 -Z LOW https://targethost/

Results:
Haproxy is killed by a signal - specifically SIGABRT (an abort() call, not a
segfault) - and the process dies.

Expectation:
Haproxy carries on working without issue.

Workarounds:
As you can see above, there are specific things that need to line up for this
to occur (and unfortunately we hit them all on our production systems as it
turns out).
1) If the glibc from 6.5 is installed (and yes boxes should be updated and
'there is only C6 not C6.X' should prevail) then no crash occurs.
2) If haproxy is not in a chroot then no crash occurs.
3) If ipv6 is not disabled (i.e. sysctl reports disable_ipv6 = 0) then no
crash occurs.
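
For anyone wanting to check quickly whether a box meets the ipv6 criterion,
here is a minimal sketch that reads the two sysctl values straight from
/proc/sys. The function name ipv6_disable_flags and its proc_sys parameter
are just made up for illustration; it does not look at glibc or the chroot.

```python
from pathlib import Path


def ipv6_disable_flags(proc_sys="/proc/sys"):
    """Report whether disable_ipv6 is set for the 'all' and 'default' scopes.

    Returns a dict mapping scope -> True (disabled), False (enabled) or
    None (value not readable, e.g. the ipv6 module is not loaded).
    """
    flags = {}
    for scope in ("all", "default"):
        path = Path(proc_sys) / "net" / "ipv6" / "conf" / scope / "disable_ipv6"
        try:
            flags[scope] = path.read_text().strip() == "1"
        except OSError:
            # Missing or unreadable entry: report "unknown" rather than guess.
            flags[scope] = None
    return flags
```

Calling ipv6_disable_flags() with no arguments reads the live values; if
both scopes come back True the box matches criterion 3 above.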

Specifics:
We spent most of today in gdb, strace and ltrace working through
the specific code paths, and it would appear that SSL makes a call through
SSL_get_hostname, which then calls through libkrb's
krb5int_get_fq_local_hostname (localhost, sizeof(localhost)). The routines
this calls end up passing -5 (EAI_NODATA) to krb5int_translate_gai_error,
which then, if EAI_NODATA has not been defined (which appears to be the
case without __USE_GNU), calls abort().
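
To make the failure mode a bit more concrete, here is a rough model of what
we think is going on. It is NOT the krb5 source, just a sketch of the
pattern: a translation table that only knows about EAI_NODATA when it was
compiled in, and which aborts on anything it does not recognise. The
numeric values are the glibc ones; the function name, the have_eai_nodata
switch, the Aborted exception and the returned labels are all made up for
illustration.

```python
# glibc <netdb.h> values; EAI_NODATA is only declared under __USE_GNU.
EAI_NONAME = -2
EAI_AGAIN = -3
EAI_NODATA = -5


class Aborted(Exception):
    """Stands in for abort() / SIGABRT in this model."""


def translate_gai_error(code, have_eai_nodata):
    """Map a getaddrinfo() error code to a label, aborting if unknown.

    have_eai_nodata models whether EAI_NODATA was visible at compile time.
    """
    table = {
        EAI_NONAME: "name-not-found",  # illustrative labels only
        EAI_AGAIN: "try-again",
    }
    if have_eai_nodata:
        table[EAI_NODATA] = "no-data"
    if code not in table:
        # No case matched: the modelled routine gives up instead of
        # returning a generic error.
        raise Aborted("unhandled getaddrinfo error %d" % code)
    return table[code]
```

With have_eai_nodata=True a code of -5 maps cleanly; with False the same
code falls through and raises Aborted, which is the shape of the crash we
saw.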

As you can see we went into some detail on this ... We're not sure at this
time why dev12 does not crash, or specifically which fixes in glibc for the
el6.5 point release avoid this code path and crash (updating openssl or
krb5-libs does not help; only updating glibc does).
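
If anyone wants to compare what getaddrinfo() hands back on different glibc
builds, something like the following might be a starting point. It is only a
sketch: probe_getaddrinfo is a made-up helper name, and it would have to be
run in an environment that resembles the haproxy chroot to tell us anything
about this particular crash.

```python
import socket


def probe_getaddrinfo(host, flags=0):
    """Return 0 if getaddrinfo() resolves host, else the EAI_* error code."""
    try:
        socket.getaddrinfo(host, None, 0, 0, 0, flags)
    except socket.gaierror as exc:
        # The first argument of gaierror is the numeric EAI_* code
        # reported by the C library.
        return exc.args[0]
    return 0
```

For example, probe_getaddrinfo(socket.gethostname()) shows what a lookup of
the local hostname returns; a result of -5 on glibc would correspond to
EAI_NODATA.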

TL;DR: chroot haproxy, disable_ipv6 and run C6.4 with an SSL front end, and
ab -c 1 -n 1 -Z LOW https://target/ is enough to crash your haproxy with a
SIGABRT.

I'm not sure how much more we are going to deep dive this given the
available workarounds but it's a heads up for anyone else that hits the
three criteria and an interesting problem for why, exactly, it happens ;)

If anyone has any thoughts or insights I'd be intrigued to hear them and if
you want to reproduce and have difficulties doing so I'd be happy to help.

Cheers,

James
