Re: Random 502's and instant 504's after upgrading
On 2019-07-22 13:05, Sander Klein wrote:
> On 2019-07-22 10:59, Christopher Faulet wrote:
> > On 20/07/2019 at 19:50, Sander Klein wrote:
> >
> > Sorry, I forgot to mention, I pushed another patch that may help you. In HAProxy 2.0, it is the commit 0bf28f856 ("BUG/MINOR: mux-h1: Close server connection if input data remains in h1_detach()"). I don't know if your HAProxy already includes it or not. If not, please give it a try. If your tests were made with this last commit, it means there is a bug somewhere else.

Just tested with haproxy-ss-20190720 and I do not see any strange 502's anymore.

Thanks!

Greets,

Sander
Re: Random 502's and instant 504's after upgrading
On 2019-07-22 10:59, Christopher Faulet wrote:
> On 20/07/2019 at 19:50, Sander Klein wrote:
>
> Sorry, I forgot to mention, I pushed another patch that may help you. In HAProxy 2.0, it is the commit 0bf28f856 ("BUG/MINOR: mux-h1: Close server connection if input data remains in h1_detach()"). I don't know if your HAProxy already includes it or not. If not, please give it a try. If your tests were made with this last commit, it means there is a bug somewhere else.

Ah, no, I used vanilla 2.0.2 with only your other patch applied. I'll see if I can test again.

Sander
Re: Random 502's and instant 504's after upgrading
On 20/07/2019 at 19:50, Sander Klein wrote:
> I just patched up 2.0.2 and tested it. I still get 502's but a lot less. I'm not sure if this is because I do fewer requests/s or I hit something else. The show errors show:
> ---
> [20/Jul/2019:19:34:45.629] backend cluster1-xx (#11): invalid response
>   frontend cluster1 (#3), server xxx (#1), event #0, src x.x.x.x:52007
>   buffer starts at 0 (including 0 out), 10809 free,
>   len 5575, wraps at 16336, error at position 0
>   H1 connection flags 0x, H1 stream flags 0x4094
>   H1 msg state MSG_RPVER(10), H1 msg flags 0x1404
>   H1 chunk len 0 bytes, H1 body len 0 bytes :
> ---
> ---
> [20/Jul/2019:19:40:32.643] backend cluster1-xx (#11): invalid response
>   frontend webservices (#18), server xxx (#2), event #13, src x:x:x:x:x:x:x:x:59724
>   buffer starts at 0 (including 0 out), 16377 free,
>   len 7, wraps at 16384, error at position 0
>   H1 connection flags 0x, H1 stream flags 0x4094
>   H1 msg state MSG_RPBEFORE(8), H1 msg flags 0x1404
>   H1 chunk len 0 bytes, H1 body len 0 bytes :
>
>   0 :10}]}}
> ---
> There is of course more with the first one, but I do not want to put that on the mailing list. It seems like a partial response body. I can send it to you privately if you want.

Sorry, I forgot to mention, I pushed another patch that may help you. In HAProxy 2.0, it is the commit 0bf28f856 ("BUG/MINOR: mux-h1: Close server connection if input data remains in h1_detach()"). I don't know if your HAProxy already includes it or not. If not, please give it a try. If your tests were made with this last commit, it means there is a bug somewhere else.

--
Christopher Faulet
Re: Random 502's and instant 504's after upgrading
On 2019-07-19 14:05, Christopher Faulet wrote:
> On 19/07/2019 at 09:36, Sander Klein wrote:
> > ---
> > HTTP/1.1 200 OK
> > Server: nginx
> > Date: Fri, 19 Jul 2019 07:32:03 GMT
> > Content-Type: application/json; charset=UTF-8
> > Transfer-Encoding: chunked
> > Vary: Accept-Encoding
> > Vary: Accept-Encoding
> > Cache-Control: private, must-revalidate
> > ETag: "178c3f242b0151fe57e02f6e8817ce3a"
> > Access-Control-Allow-Origin: *
> > Access-Control-Allow-Methods: POST, GET, OPTIONS, PUT, PATCH, DELETE, HEAD
> > Length: unspecified [application/json]
> > ---
> > Maybe the 'Length: unspecified' has something to do with it.
>
> No, this line is reported by wget because there is no "Content-Length" header.

Heh, doh, sorry about that :-)

> So, as I said, I pushed a fix (https://github.com/haproxy/haproxy/commit/03627245). It was backported to 2.0. Could you check if it fixes your issue with 502 errors?

I just patched up 2.0.2 and tested it. I still get 502's but a lot less. I'm not sure if this is because I do fewer requests/s or I hit something else. The show errors show:
---
[20/Jul/2019:19:34:45.629] backend cluster1-xx (#11): invalid response
  frontend cluster1 (#3), server xxx (#1), event #0, src x.x.x.x:52007
  buffer starts at 0 (including 0 out), 10809 free,
  len 5575, wraps at 16336, error at position 0
  H1 connection flags 0x, H1 stream flags 0x4094
  H1 msg state MSG_RPVER(10), H1 msg flags 0x1404
  H1 chunk len 0 bytes, H1 body len 0 bytes :
---
---
[20/Jul/2019:19:40:32.643] backend cluster1-xx (#11): invalid response
  frontend webservices (#18), server xxx (#2), event #13, src x:x:x:x:x:x:x:x:59724
  buffer starts at 0 (including 0 out), 16377 free,
  len 7, wraps at 16384, error at position 0
  H1 connection flags 0x, H1 stream flags 0x4094
  H1 msg state MSG_RPBEFORE(8), H1 msg flags 0x1404
  H1 chunk len 0 bytes, H1 body len 0 bytes :

  0 :10}]}}
---
There is of course more with the first one, but I do not want to put that on the mailing list. It seems like a partial response body. I can send it to you privately if you want.

> For 504 errors, I have no idea for now.

I'm not sure about these 504's either. I had a couple of reports about these and one of our developers had it one time, but I haven't seen it myself or seen any proof of it. But like I said, the logs show nothing. I will keep my eye on this.

Sander
Re: Random 502's and instant 504's after upgrading
On 19/07/2019 at 09:36, Sander Klein wrote:
> The show errors:
> ---
> Total events captured on [19/Jul/2019:08:34:25.093] : 31
>
> [19/Jul/2019:08:34:23.405] backend cluster1-xx (#11): invalid response
>   frontend webservices (#18), server xxx (#2), event #30, src x.x.x.x:63290
>   buffer starts at 0 (including 0 out), 16268 free,
>   len 116, wraps at 16384, error at position 0
>   H1 connection flags 0x, H1 stream flags 0x4094
>   H1 msg state MSG_RPBEFORE(8), H1 msg flags 0x1404
>   H1 chunk len 0 bytes, H1 body len 0 bytes :
>
>   0 {"metadata":{"pagination":{"total":0,"rows":25,"currentPage":1,"pages"
>   00070+ :0},"facets":[],"activeFacets":[]},"media":[]}
> ---

Thanks. So the problem seems to be the same as issue #176 on GitHub (https://github.com/haproxy/haproxy/issues/176). I pushed a fix.

> I also did this request with wget to see what the response should be, and it seems that this is the first part of the 297229-byte body. The response headers are:
> ---
> HTTP/1.1 200 OK
> Server: nginx
> Date: Fri, 19 Jul 2019 07:32:03 GMT
> Content-Type: application/json; charset=UTF-8
> Transfer-Encoding: chunked
> Vary: Accept-Encoding
> Vary: Accept-Encoding
> Cache-Control: private, must-revalidate
> ETag: "178c3f242b0151fe57e02f6e8817ce3a"
> Access-Control-Allow-Origin: *
> Access-Control-Allow-Methods: POST, GET, OPTIONS, PUT, PATCH, DELETE, HEAD
> Length: unspecified [application/json]
> ---
> Maybe the 'Length: unspecified' has something to do with it.

No, this line is reported by wget because there is no "Content-Length" header.

So, as I said, I pushed a fix (https://github.com/haproxy/haproxy/commit/03627245). It was backported to 2.0. Could you check if it fixes your issue with 502 errors?

For 504 errors, I have no idea for now.

--
Christopher Faulet
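The dumps in this thread share a pattern: the H1 parser is in an early response state (MSG_RPBEFORE or MSG_RPVER) with the error at position 0, while the buffer holds what looks like leftover body bytes. A minimal sketch for pulling those fields out of a saved "show errors" dump follows. The regular expressions are inferred only from the dumps pasted in this thread, not from a format specification, and `summarize_show_errors` is a hypothetical helper name.

```python
import re

# Header line of each captured event, e.g.
# "[19/Jul/2019:08:34:23.405] backend cluster1-xx (#11): invalid response"
EVENT_RE = re.compile(
    r"\[(?P<when>[^\]]+)\] (?P<side>backend|frontend) (?P<proxy>\S+) "
    r"\(#\d+\): (?P<reason>invalid (?:request|response))"
)
# Buffer length: anchored on "free," so the later "H1 chunk len" and
# "H1 body len" fields are not picked up by mistake.
LEN_RE = re.compile(r"free,\s+len (\d+)")
POS_RE = re.compile(r"error at position (\d+)")
STATE_RE = re.compile(r"H1 msg state (\w+)\(\d+\)")


def _first(rx, text, cast=str):
    """Return the first capture of rx in text, converted with cast, or None."""
    hit = rx.search(text)
    return cast(hit.group(1)) if hit else None


def summarize_show_errors(dump):
    """Return one dict per captured event found in a 'show errors' dump."""
    starts = list(EVENT_RE.finditer(dump))
    events = []
    for i, m in enumerate(starts):
        # An event's details run until the next event header (or end of dump).
        end = starts[i + 1].start() if i + 1 < len(starts) else len(dump)
        body = dump[m.end():end]
        event = m.groupdict()
        event["len"] = _first(LEN_RE, body, int)
        event["error_pos"] = _first(POS_RE, body, int)
        event["msg_state"] = _first(STATE_RE, body)
        events.append(event)
    return events
```

Fed the 19 July dump quoted above, this reports a single event on backend cluster1-xx with msg_state MSG_RPBEFORE, len 116 and error_pos 0. Grouping events by msg_state and error_pos is one way to check whether all captured 502's follow the same pattern.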
Re: Random 502's and instant 504's after upgrading
Hi Lukas and Christopher,

I've combined the answers to your 2 mails.

On 2019-07-18 17:17, Lukas Tribus wrote:
> Could be related to:
> https://github.com/haproxy/haproxy/issues/176

Probably, but I'm not doing HTTP/1 and I have not found a request to reproduce it with. It happens at random.

> Can you provide the "show errors" output from the admin cli for those requests, and possibly try one of the mentioned workarounds (http-reuse never or http-server-close)?

The show errors:
---
Total events captured on [19/Jul/2019:08:34:25.093] : 31

[19/Jul/2019:08:34:23.405] backend cluster1-xx (#11): invalid response
  frontend webservices (#18), server xxx (#2), event #30, src x.x.x.x:63290
  buffer starts at 0 (including 0 out), 16268 free,
  len 116, wraps at 16384, error at position 0
  H1 connection flags 0x, H1 stream flags 0x4094
  H1 msg state MSG_RPBEFORE(8), H1 msg flags 0x1404
  H1 chunk len 0 bytes, H1 body len 0 bytes :

  0 {"metadata":{"pagination":{"total":0,"rows":25,"currentPage":1,"pages"
  00070+ :0},"facets":[],"activeFacets":[]},"media":[]}
---
I also did this request with wget to see what the response should be, and it seems that this is the first part of the 297229-byte body. The response headers are:
---
HTTP/1.1 200 OK
Server: nginx
Date: Fri, 19 Jul 2019 07:32:03 GMT
Content-Type: application/json; charset=UTF-8
Transfer-Encoding: chunked
Vary: Accept-Encoding
Vary: Accept-Encoding
Cache-Control: private, must-revalidate
ETag: "178c3f242b0151fe57e02f6e8817ce3a"
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: POST, GET, OPTIONS, PUT, PATCH, DELETE, HEAD
Length: unspecified [application/json]
---
Maybe the 'Length: unspecified' has something to do with it.

If I enable http-reuse the problem is still there. Only no option http-use-htx 'fixes' it.
I've stripped my config to the parts that I think are related:
---
global
    master-worker
    log /dev/log local0
    log /dev/log local1 notice
    daemon
    user haproxy
    group haproxy
    maxconn 32768
    spread-checks 3
    nbproc 1
    nbthread 4
    stats socket /var/run/haproxy.stat mode 666 level admin
    ssl-default-bind-ciphers ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES256-SHA:ECDHE-ECDSA-DES-CBC3-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA:!DSS
    ssl-default-bind-options no-sslv3 no-tls-tickets
    ssl-default-server-ciphers ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES256-SHA:ECDHE-ECDSA-DES-CBC3-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA:!DSS
    ssl-default-server-options no-sslv3 no-tls-tickets
    tune.ssl.default-dh-param 2048

###
# Defaults
###
defaults
    log global
    timeout check 2s
    timeout client 60s
    timeout connect 10s
    timeout http-keep-alive 4s
    timeout http-request 15s
    timeout queue 30s
    timeout server 60s
    timeout tarpit 120s
    errorfile 400 /etc/haproxy/errors.loc/400.http
    errorfile 403 /etc/haproxy/errors.loc/403.http
    errorfile 500 /etc/haproxy/errors.loc/500.http
    errorfile 502 /etc/haproxy/errors.loc/502.http
    errorfile 503 /etc/haproxy/errors.loc/503.http
    errorfile 504 /etc/haproxy/errors.loc/504.http

frontend webservices
    bind x.x.x.x:80 transparent
    bind x.x.x.x:443 transparent ssl crt /etc/haproxy/ssl/somecert.pem alpn h2,http/1.1
    bind 2001:xxx:xxx:x::xx:80 transparent
    bind 2001:xxx:xxx:x::xx:443 transparent ssl crt /etc/haproxy/ssl/somecert.pem alpn h2,http/1.1
    mode http
    maxconn 4096
    option httplog
    option dontlog-normal
    option http-ignore-probes
Re: Random 502's and instant 504's after upgrading
On 18/07/2019 at 16:50, Sander Klein wrote:
> On 2019-07-18 09:15, Sander Klein wrote:
> > Hi,
> >
> > Last night I tried upgrading from haproxy 1.9.8 to 2.0.2. After upgrading I get random 502's and random instant 504's when visiting pages.
>
> Just tested with 'no option http-use-htx' in the defaults section and then my problems went away. Seems like a bug in HTX. Any info needed for this one?

Hi,

Could you share your configuration please? And if possible, it would be good to check whether you get the same errors with HTTP/1 requests.

--
Christopher Faulet
Re: Random 502's and instant 504's after upgrading
Hello,

On Thu, 18 Jul 2019 at 16:51, Sander Klein wrote:
>
> On 2019-07-18 09:15, Sander Klein wrote:
> > Hi,
> >
> > Last night I tried upgrading from haproxy 1.9.8 to 2.0.2. After
> > upgrading I get random 502's and random instant 504's when visiting
> > pages.
>
>
> Just tested with 'no option http-use-htx' in the defaults section and
> then my problems went away. Seems like a bug in HTX. Any info needed for
> this one?

Could be related to:
https://github.com/haproxy/haproxy/issues/176

Can you provide the "show errors" output from the admin cli for those requests, and possibly try one of the mentioned workarounds (http-reuse never or http-server-close)?

Lukas
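The "show errors" output requested here can be fetched from the admin socket with a few lines of script. A minimal sketch, assuming the stats socket is at /var/run/haproxy.stat (the path in the configuration posted elsewhere in this thread) and that the CLI is used in its default non-interactive mode, where HAProxy answers one command line and then closes the connection. `haproxy_cli` is a hypothetical helper name, not part of HAProxy.

```python
import socket


def haproxy_cli(command, path="/var/run/haproxy.stat"):
    """Send one command to the HAProxy stats socket and return the reply text.

    The socket path is an assumption taken from the configuration quoted in
    this thread; adjust it to match the local 'stats socket' line.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        # Commands are newline-terminated.
        sock.sendall(command.encode() + b"\n")
        chunks = []
        while True:
            # Read until the peer closes the connection.
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode(errors="replace")
```

Calling `haproxy_cli("show errors")` right after a 502 has been observed should return the most recent captures. The account running the script needs access to the socket file.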
Re: Random 502's and instant 504's after upgrading
On 2019-07-18 09:15, Sander Klein wrote:
> Hi,
>
> Last night I tried upgrading from haproxy 1.9.8 to 2.0.2. After upgrading I get random 502's and random instant 504's when visiting pages.

Just tested with 'no option http-use-htx' in the defaults section and then my problems went away. Seems like a bug in HTX. Any info needed for this one?

Sander
Random 502's and instant 504's after upgrading
Hi,

Last night I tried upgrading from haproxy 1.9.8 to 2.0.2. After upgrading I get random 502's and random instant 504's when visiting pages.

For the 502's I see the following in the log:

Jul 18 08:14:09 HOST haproxy[2003]: xxx:xxx:xxx:xxx:xxx::xxx [18/Jul/2019:08:14:09.133] cluster1-in~ cluster1/BACK1 0/0/0/-1/0 502 1976 - - PH-- 382/129/8/5/0 0/0 {somesite.nl|Mozilla/5.0 (Win|354|https://somesite.nl/stuff/goes/here/xxx} {} "POST /stuff/goes/here/xxx HTTP/2.0"
Jul 18 08:15:08 HOST haproxy[2003]: x.x.x.x:50004 [18/Jul/2019:08:15:08.712] cluster1-in~ cluster1/BACK2 0/0/0/-1/0 502 1976 - - PH-- 365/150/5/2/0 0/0 {somesite.nl|Mozilla/5.0 (Win||https://somesite.nl/other/stuf/here/please/xxx} {} "GET /img/uploads/path/somejpeg.jpg HTTP/2.0"

The 504's are another thing, I do not see them logged at all. The only thing I notice is that they are instant, so no timeout is reached.

Downgrading back to 1.9.8 fixes the problem again. I might try disabling htx later today to see what happens.

The backends are NGINX servers which talk plain http/1.1.

Sander
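The two fields in these log lines that point at the cause are the timers (0/0/0/-1/0) and the termination state (PH--). With the default "option httplog" format the timers are TR/Tw/Tc/Tr/Ta, and a Tr of -1 together with PH generally indicates that HAProxy rejected the server's response before complete, valid headers were parsed. A minimal sketch for extracting both from a log line follows. It assumes the default httplog field order, and `parse_httplog` is a hypothetical helper name.

```python
import re

# Five slash-separated timers, the status code, the byte count, the two
# captured-cookie fields, then the four-character termination state.
# Assumes the default "option httplog" field order.
LOG_RE = re.compile(
    r" (?P<timers>-?\d+(?:/-?\d+){4}) (?P<status>\d{3}) (?P<bytes>\d+) "
    r"\S+ \S+ (?P<term>\S{4}) "
)
TIMER_NAMES = ("TR", "Tw", "Tc", "Tr", "Ta")


def parse_httplog(line):
    """Return timers, status, bytes and termination state, or None if no match."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    values = (int(v) for v in m.group("timers").split("/"))
    return {
        "timers": dict(zip(TIMER_NAMES, values)),
        "status": int(m.group("status")),
        "bytes": int(m.group("bytes")),
        "termination_state": m.group("term"),
    }


# First log line from the message above, with the captured headers shortened.
SAMPLE = (
    "Jul 18 08:14:09 HOST haproxy[2003]: xxx:xxx:xxx:xxx:xxx::xxx "
    "[18/Jul/2019:08:14:09.133] cluster1-in~ cluster1/BACK1 0/0/0/-1/0 502 1976 "
    '- - PH-- 382/129/8/5/0 0/0 {} {} "POST /stuff/goes/here/xxx HTTP/2.0"'
)

if __name__ == "__main__":
    print(parse_httplog(SAMPLE))
```

Filtering a log file for entries where status is 502 and termination_state starts with "PH" gives a quick count of how often the problem occurs before and after a patch is applied.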