On Mon, Sep 19, 2016 at 10:08:32AM +0200, Christopher Faulet wrote:
> On 18/09/2016 at 04:17, Bertrand Jacquin wrote:
> > Today I noticed data corruption when haproxy is used for compression
> > offloading. I bisected twice, and it led to this specific commit, but
> > I'm not 100% confident this commit is the actual root cause.
> > 
> > The HTTP body coming from the nginx backend is consistent, but the HTTP
> > headers differ depending on the setup I enable. Data corruption only
> > happens with chunked transfer encoding. The HTTP body sent from
> > haproxy to curl can then be randomly corrupted. I attached a diff
> > (v1.7-dev1-50-gd7c9196ae56e.Transfer-Encoding-chunked.diff) revealing an
> > unrelated blob that looks like a TLS structure in the middle of the
> > JavaScript. For example, you will find my x509 client certificate in there.
> > 
> > I'm also attaching the HTTP headers from haproxy to nginx, in case that
> > helps.
> > 
> > Note that I tested with zlib 1.2.8 and libslz 1.0.0; the result remains
> > the same in both cases.
> 
> I've done some tests. Unfortunately, I'm unable to reproduce the bug, so
> I need more information. First, I need to know how you hit it. Does this
> happen under load, or randomly when you do a single request?

I can reproduce this issue 100% of the time on arm or amd64:

  $ for (( i = 0 ; i < 25 ; i++ )) ; do
      curl -s -H 'Accept-Encoding: gzip' \
        'https://pants-off.xyz/v1.7-dev1-50-gd7c9196ae56e.js' \
      | zcat | md5sum
    done
  01a32fcef0a6894caf112c1a9d5c2a5d  -
  b2a109843f4c43fcde3cb051e4fbf8d2  -
  dedc59fb28ae5d91713234e3e5c08dec  -
  3c8f6d8d53c0ab36bb464b8283570355  -
  e1957e16479bc3106adc68fee2019be8  -
  4cc54367717e5adcdf940f619949ea72  -
  bf637a26e62582c35da6888a4928d4ec  -
  3eeecd478f8e6ea4d690c70f9444954a  -
  79ab805209777ab02bdc6fb829048c74  -
  2aaf9577c1fefdd107a5173aee270c83  -
  .. and so on, shrinking the output here

Note that the md5sum of the file should be a4d8bb8ba2a76d7caf090ab632708d7d.
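
To put a number on the failure rate, the loop output can be piped through
a small tally. This is only a sketch: tally_md5 is a hypothetical helper
name, not part of any tool, and it assumes the md5sum output format shown
above (checksum in the first column).

```shell
# Hypothetical helper: count md5sum lines that match / do not match
# the expected checksum passed as $1. Reads md5sum output on stdin.
tally_md5() {
  awk -v want="$1" '
    NF == 0    { next }          # skip blank lines
    $1 == want { good++; next }  # checksum matches the reference
               { bad++ }         # anything else counts as corrupted
    END        { printf "good=%d bad=%d\n", good, bad }
  '
}
```

It would be used by replacing the end of the loop above:

  $ for (( i = 0 ; i < 25 ; i++ )) ; do ... ; done \
      | tally_md5 a4d8bb8ba2a76d7caf090ab632708d7d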

> Then, do
> you still have the bug when you are not using SSL? Let me also know how
> often the bug appears.

I did not do that test since it was easy for me to track output
containing details about my x509 client certificate.

Running the same test as before over plain HTTP with 100 iterations,
counting identical consecutive outputs:

  $ for (( i = 0 ; i < 100 ; i++ )) ; do
      curl -s -H 'Accept-Encoding: gzip' \
        'http://pants-off.xyz/v1.7-dev1-50-gd7c9196ae56e.js' \
      | zcat | md5sum
    done | uniq -c
      1 6c38ef6556efa9e0fa6825803679b2f2  -
     99 a4d8bb8ba2a76d7caf090ab632708d7d  -

Note that 6c38ef6556efa9e0fa6825803679b2f2 appears on the first
iteration. Second test, after a few seconds:

      1 ffaf62147b43f82d587df59c39b48e54  -
     29 a4d8bb8ba2a76d7caf090ab632708d7d  -
      1 ae6e4404422b93c9fe64bffdea87f36d  -
     41 a4d8bb8ba2a76d7caf090ab632708d7d  -
      1 3e8c507e16733af8b728e229c00f21c3  -
      4 a4d8bb8ba2a76d7caf090ab632708d7d  -
      1 f6195005b050edcb5ca682b1cde9777f  -
     22 a4d8bb8ba2a76d7caf090ab632708d7d  -

Third test:

      1 6c38ef6556efa9e0fa6825803679b2f2  -
     80 a4d8bb8ba2a76d7caf090ab632708d7d  -
      1 3e8c507e16733af8b728e229c00f21c3  -
     18 a4d8bb8ba2a76d7caf090ab632708d7d  -

So it looks a bit more stable. Now if I query HTTP and HTTPS at the
same time, here is what I get:

HTTPS:
      2 17bfe6f7f6296cc5e1d623381afc9e55  -
      1 cbc1779ce5636c31bcf3ea175088da11  -
      1 52ba63995295f5399ddd91b9f9bdf81d  -
      1 5b4115080f35ac5f564b7164a3ada701  -
      1 adfb87fe9efc33e0218a891b2b8b4d42  -
      1 a6f8707556b2f760d20b51dd59b11fb4  -
      .. and so on

HTTP:
      1 3a794f99df4f7a282f822bbaca508852  -
      1 24242f218d9041383c523984d19feddc  -
      2 a4d8bb8ba2a76d7caf090ab632708d7d  -
      1 9987d0621c7fbe4b399e462f421b2157  -
      1 a4d8bb8ba2a76d7caf090ab632708d7d  -
      1 e261d9cdf988c4fd3d75877812fa5028  -
      .. and so on
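
For reference, the concurrent run can be sketched roughly as below. This is
an illustration under assumptions, not the exact commands I typed:
fetch_sums is a hypothetical helper name, and the output file names in the
usage lines are arbitrary.

```shell
# Hypothetical helper: fetch URL ($1) COUNT ($2) times with gzip enabled,
# decompress each body and print run-length counts of the checksums.
fetch_sums() {
  url=$1
  count=$2
  i=0
  while [ "$i" -lt "$count" ]; do
    # gzip -dc does the same job as the zcat used in the loops above
    curl -s -H 'Accept-Encoding: gzip' "$url" | gzip -dc | md5sum
    i=$((i + 1))
  done | uniq -c
}
```

Both schemes are then started in the background and waited for:

  $ fetch_sums 'https://pants-off.xyz/v1.7-dev1-50-gd7c9196ae56e.js' 100 > https.txt &
  $ fetch_sums 'http://pants-off.xyz/v1.7-dev1-50-gd7c9196ae56e.js' 100 > http.txt &
  $ wait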

It still does not look stable. Let's do a test with HTTP only, from 2
different hosts:

HTTP client 1:
      1 64cd299604d1f7fac29ef7b2b623b1d0  -
      6 a4d8bb8ba2a76d7caf090ab632708d7d  -
      1 bd0372d30c564925ebd1866cf2476474  -
     11 a4d8bb8ba2a76d7caf090ab632708d7d  -
      1 64cd299604d1f7fac29ef7b2b623b1d0  -
      9 a4d8bb8ba2a76d7caf090ab632708d7d  -

HTTP client 2:
      1 8749926476d446ead3bd8d81523330eb  -
     16 a4d8bb8ba2a76d7caf090ab632708d7d  -
      1 c533c33a3ff469086bdbffffb6a936e2  -
     14 a4d8bb8ba2a76d7caf090ab632708d7d  -
      1 bd89ab7eab271b2ac13dff42e8e96ba4  -

We are again in a less stable situation.

> And finally, If you can share with me your HA and
> Nginx configurations, this could help.

I'm attaching a stripped-down version of the haproxy/nginx/php-fpm setup
on which I can reproduce this issue.

Cheers,

-- 
Bertrand

Attachment: v1.7-dev1-50-gd7c9196ae56e.tgz
Description: GNU Unix tar archive

Attachment: signature.asc
Description: Digital signature
