Hi Wietse,

[CCing haproxy ML and responding inline]

First, thanks a lot for your detailed analysis.

On Sat, Jan 11, 2020 at 09:03:19PM -0500, Wietse Venema wrote:
> I found a portability bug in haproxy's CRC32 implementation, while
> adding adding CRC32 support to Postfix's haproxy2 code. The haproxy
> code produces incorrect results on ix86. Which just happens to work
> as long as everyone is using ix86.
> 
> The same bug appears to exist in other hash functions in
> haproxy/src/hash.c.
> 
> My proposed solution: introduce a NEW type of TLV for checksums
> that are generated with a more portable CRC32 implementation.
> 
> Details
> =======
> 
> Haproxy CRC32 support adds a checksum to the haproxy protocol header.
> This protects against random bitflips that happen before a proxy
> header is written to a sender's TCP stack, and after a proxy header
> is read from a receiver's TCP stack (during transport over TCP, the
> proxy header is always protected by the TCP checksum).
> 
> Here's the code from haproxy/src/hash.c:
> 
>     unsigned int hash_crc32(const char *key, int len)
>     {
>         unsigned int hash;
>         ...
>         hash = ~0;
>         while (len--) {
>             hash ^= *key++;
>             ...
>         }
>         ...
>     }
> 
> The bug is that 'key' should be 'const unsigned char *'. Otherwise,
> the result depends on the hardware and/or compiler.

Yes, good catch. It's even why I use ARM a lot, in order to catch such
bugs, but this one managed to slip through it seems :-/

> Below is a small test program to illustrate these differences.
> 
> * On systems where 'char' is a signed type, *key is a signed
>   8-bit value in the range -128..+127. In the code above, this is
>   converted to int -128..+127, then converted to unsigned, where
>   the negative integers become very large positive numbers.
> 
>   Example: the 8-bit value 0xff becomes the 32-bit value 0xffffffff.
>   This breaks the CRC32 computation, due to non-portable programming.
>   See test program below.
> 
> * On systems where 'char' is an unsigned type. *key is an unsigned
>   8-bit value in the range 0..255. In the code above, this is
>   converted to int 0..255, then converned to unsigned 0..255.
> 
>   Example: the 8-bit value 0xff becomes the 32-bit value 0xff. This
>   does not break the CRC32 computation, but that is just the result
>   of dumb luck. Code that is portable would always work this way.
> 
> As luck has it, with ix86 systems, 'char' is a signed type. Therefore
> the CRC32 computation is screwed up (both i386 and x86_64 a.k.a. amd64).
> 
> I wrote a small test program (not included) that compares the result
> from haproxy's hash_crc32() implementation and some other crc32
> implementations that I found on the web, plus an implementation
> that I cobbled together. All portable implementations agree on the
> result. hash_crc32() only some of the time agrees with the portable
> implementations.
> 
> This makes Postfix support for haproxy CRC checks pointless.  It
> would have to compute two versions of CRC32: the correct one and
> the one that is broken, otherwise CRC32 would only work between
> similar systems.

No, there is no reason you have to compute both. I looked at the doc
and the doc explicitly states that these hashes respect the standards,
and for CRC32C the RFC is even mentioned. What the doc claims is true
for ARM/PPC, it's wrong on x86. Even if it can break certain things,
the code is already broken as it means that haproxy will not be able
to properly interoperate between two architectures. Additionally,
someone migrating from ARM to x86 would get caught by this bug. Thus
we have to fix haproxy.

HAProxy doesn't emit the CRC32c TLV field in the proxy protocol but
it does validate it if present. This means that everyone using this
validation will compare it against an external agent provided one,
which has zero chance of using haproxy's bogus code, hence it already
does not work. In addition, the CRC validation is very recent (1.9)
so it's not much spread in versions found in distros, so even if
someone would have created their own implementation indoors, it would
be very limited because in practice I think only one distro ships such
a version.

Last, the CRC32 calculation using converters was added in 1.6 and is
wrong as well since it uses the same function. This can affect logs
and/or stick tables (which are the main locations where users will
place this). But similarly this is already wrong so it needs to be
fixed.

All this shows that neither CRC32 nor CRC32c are that much used today,
or that the usage is limited to situations not sensitive to interop
issues. So I want to see it fixed, and we'll put a prominent warning
in next releases. Do you want to propose a patch or should I do it ?

Many thanks!
Willy

Reply via email to