Re: Haproxy 1.8 http/2 mode does not pass the h2spec conformance test
Hi Willy,

Most of the tests pass now in 1.8.2. However, the following test still hangs in --strict mode:

  2: Sends a large size DATA frame that exceeds the SETTINGS_MAX_FRAME_SIZE

On Sun, Nov 26, 2017 at 11:54 PM Robin Anil wrote:

> I can get the tests to match yours now. Looks like it was a connection
> issue.
>
> Finished in 70.5089 seconds
> 145 tests, 116 passed, 0 skipped, 29 failed
>
> Also, if you haven't noticed the --strict option, in that mode I can get
> the test to hang on
>
> 4.2. Frame Size
>   ✔ 1: Sends a DATA frame with 2^14 octets in length
>     2: Sends a large size DATA frame that exceeds the SETTINGS_MAX_FRAME_SIZE
>
> On Sun, Nov 26, 2017 at 11:39 PM Robin Anil wrote:
>
>> Sorry, this one, not the alpine build
>>
>> https://github.com/docker-library/haproxy/blob/0c9c27713bfca8505331e0da2664a9454755c7b9/1.8-rc/Dockerfile
>>
>> On Sun, Nov 26, 2017 at 11:36 PM Robin Anil wrote:
>>
>>> I am running HAProxy 1.8 within a docker container, using a fork of the
>>> official build pointed at the latest 1.8.0 instead of rc-4. So it
>>> is built using openssl1.1
>>>
>>> Docker hub link: https://hub.docker.com/_/haproxy/
>>> See the Dockerfile
>>> https://github.com/docker-library/haproxy/blob/0c9c27713bfca8505331e0da2664a9454755c7b9/1.8-rc/alpine/Dockerfile
>>>
>>> Just replaced these two lines
>>> ENV HAPROXY_VERSION 1.8.0
>>> ENV HAPROXY_MD5 6ccea4619b7183fbcc8c98bae1f9823d
>>>
>>> On Sun, Nov 26, 2017 at 11:29 PM Willy Tarreau wrote:
>>>
>>>> On Mon, Nov 27, 2017 at 06:15:48AM +0100, Willy Tarreau wrote:
>>>> > On Mon, Nov 27, 2017 at 05:10:16AM +, Robin Anil wrote:
>>>> > > A very stripped down version of config
>>>> >
>>>> > Thank you, I'll check if anything there can explain this.
>>>>
>>>> So with your config I'm getting this :
>>>>
>>>>   Finished in 52.3802 seconds
>>>>   145 tests, 110 passed, 4 skipped, 31 failed
>>>>
>>>> I had to disable ssl-mode-async as my openssl version doesn't support it.
>>>>
>>>> Of the 31, I'm seeing real bugs compared to what is expected to work,
>>>> such as :
>>>>
>>>>   3.8. GOAWAY
>>>>     × 1: Sends a GOAWAY frame
>>>>       -> The endpoint MUST accept GOAWAY frame.
>>>>          Expected: Connection closed
>>>>            Actual: Timeout
>>>>
>>>> Others are expected for now :
>>>>
>>>>   3.10. CONTINUATION
>>>>     × 1: Sends a CONTINUATION frame
>>>>       -> The endpoint MUST accept CONTINUATION frame.
>>>>          Expected: HEADERS Frame (stream_id:1)
>>>>            Actual: Connection closed
>>>>
>>>>   4. HTTP Message Exchanges
>>>>     × 4: Sends a POST request with trailers
>>>>       -> The endpoint MUST respond to the request.
>>>>          Expected: HEADERS Frame (stream_id:1)
>>>>            Actual: Connection closed
>>>>
>>>> And the other ones are very useful as they likely indicate missing checks.
>>>> So I'll take a look.
>>>>
>>>> Thanks!
>>>> Willy
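
For readers unfamiliar with the 4.2 "Frame Size" case discussed in this thread: RFC 7540 requires a receiver to answer a frame longer than its advertised SETTINGS_MAX_FRAME_SIZE with FRAME_SIZE_ERROR. The following is only a minimal sketch of that check, assuming a raw 9-octet frame header as input. It is not HAProxy's code, and the function and macro names are hypothetical; only the numeric values come from the RFC.

```c
#include <assert.h>
#include <stdint.h>

/* Numeric values from RFC 7540; all names here are hypothetical. */
#define H2_DEFAULT_MAX_FRAME_SIZE 16384u     /* 2^14, initial SETTINGS_MAX_FRAME_SIZE */
#define H2_MAX_ALLOWED_FRAME_SIZE 16777215u  /* 2^24 - 1, upper bound for the setting */
#define H2_ERR_NO_ERROR           0x0u
#define H2_ERR_FRAME_SIZE_ERROR   0x6u

/* Decode the 24-bit big-endian length that starts every 9-octet frame header. */
static uint32_t h2_frame_length(const uint8_t hdr[9])
{
    return ((uint32_t)hdr[0] << 16) | ((uint32_t)hdr[1] << 8) | (uint32_t)hdr[2];
}

/* Return the error code a receiver would report for this frame header,
 * given the SETTINGS_MAX_FRAME_SIZE value it advertised to its peer. */
static uint32_t h2_check_frame_size(const uint8_t hdr[9], uint32_t max_frame_size)
{
    assert(max_frame_size >= H2_DEFAULT_MAX_FRAME_SIZE);
    assert(max_frame_size <= H2_MAX_ALLOWED_FRAME_SIZE);

    if (h2_frame_length(hdr) > max_frame_size)
        return H2_ERR_FRAME_SIZE_ERROR;
    return H2_ERR_NO_ERROR;
}
```

With the default setting, a header announcing 16384 octets passes and one announcing 16385 octets is rejected, which matches the two test cases h2spec lists under 4.2.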
Re: disable-on-404 functionality change in 1.8
> On Dec 23, 2017, at 2:24 AM, Willy Tarreau wrote:
>
> Hi guys,
>
> On Sat, Dec 23, 2017 at 08:58:43AM +0100, Cyril Bonté wrote:
>> It looks to be a code regression.
>>
>> Emeric, can you have a look at commit 5a1335110c ? It seems there was an
>> unwanted change in the function call : srv_set_stopping() was replaced by
>> srv_set_running()
>> [...]
>>  /* Marks the check as valid and tries to set its server into stopping mode
>> @@ -406,7 +371,7 @@ static void check_notify_stopping(struct check *check)
>>  	if ((s->agent.state & CHK_ST_ENABLED) && (s->agent.health < s->agent.rise))
>>  		return;
>>
>> -	srv_set_stopping(s, (!s->track && !(s->proxy->options2 & PR_O2_LOGHCHKS)) ? check_reason_string(check) : NULL);
>> +	srv_set_running(s, NULL, (!s->track && !(s->proxy->options2 & PR_O2_LOGHCHKS)) ? check : NULL);
>
> Nice catch! Thanks Paul for your helpful report and Cyril for spotting
> the bug. Now fixed, I can prepare 1.8.2 :-)
>
> Willy

Thanks all for getting this fixed so dang quickly on a Friday night before a holiday weekend. It looks good to me now. Much appreciated.

-Paul
[ANNOUNCE] haproxy-1.8.2
Hi,

HAProxy 1.8.2 was released on 2017/12/23. It added 64 new commits
after version 1.8.1.

This version fixes all the issues diagnosed since 1.8.1. The most important
ones are :
  - truncated and slow HTTP/2 POST forms
  - abortonclose killing all HTTP/2 requests
  - single server taking all the load in map-based algorithms
  - timeouts and too late connection shutdown on TCP/tunnel
  - cache did not consider cache-control in the request
  - various server state transition issues (down->maint, stopping)
  - email alerts unexpectedly modifying the server state
  - log fd leaks across reloads in master-worker mode
  - deadlocks in variables usage under threads

There are still a few pending reports that need to be analysed, but having
a new reference version without all the problems above will help sort the
bug reports and will save most users from unpleasant surprises.

If you are on 1.8, please upgrade to 1.8.2, at least before reporting a bug.
We'll all save valuable time :-)

Please find the usual URLs below :
   Site index       : http://www.haproxy.org/
   Discourse        : http://discourse.haproxy.org/
   Sources          : http://www.haproxy.org/download/1.8/src/
   Git repository   : http://git.haproxy.org/git/haproxy-1.8.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-1.8.git
   Changelog        : http://www.haproxy.org/download/1.8/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/

Willy
---
Complete changelog :

Aleksandar Lazic (1):
      CONTRIB: halog: Add help text for -s switch in halog program

Bertrand Jacquin (8):
      MINOR: netscaler: respect syntax
      MINOR: netscaler: remove the use of cip_magic only used once
      MINOR: netscaler: rename cip_len to clarify its uage
      BUG/MEDIUM: netscaler: use the appropriate IPv6 header size
      BUG/MAJOR: netscaler: address truncated CIP header detection
      MINOR: netscaler: check in one-shot if buffer is large enough for IP and TCP header
      MEDIUM: netscaler: do not analyze original IP packet size
      MEDIUM: netscaler: add support for standard NetScaler CIP protocol

Christopher Faulet (3):
      BUG/MINOR: action: Don't check http capture rules when no id is defined
      BUG/MEDIUM: threads/vars: Fix deadlock in register_name
      BUG/MEDIUM: mworker: Set FD_CLOEXEC flag on log fd

Cyril Bonté (2):
      BUG: MAJOR: lb_map: server map calculation broken
      BUG: MINOR: http: don't check http-request capture id when len is provided

David Carlier (1):
      BUILD/MINOR: Makefile : enabling USE_CPU_AFFINITY

Davor Ocelic (1):
      DOC/MINOR: intro: typo, wording, formatting fixes

Emeric Brun (3):
      BUG/MEDIUM: ssl engines: Fix async engines fds were not considered to fix fd limit automatically.
      BUG/MEDIUM: checks: a down server going to maint remains definitely stucked on down state.
      BUG/MEDIUM: checks: a server passed in maint state was not forced down.

Eric Salama (1):
      BUG/MEDIUM: lua: fix crash when using bogus mode in register_service()

PiBa-NL (1):
      BUG/MEDIUM: email-alert: don't set server check status from a email-alert task

Ryan O'Hara (2):
      CONTRIB: iprange: Fix compiler warning in iprange.c
      CONTRIB: halog: Fix compiler warnings in halog.c

Thierry FOURNIER (2):
      DOC: notifications: add precisions about thread usage
      BUG/MEDIUM: lua/notification: memory leak

Tim Duesterhus (2):
      MINOR: mworker: Update messages referencing exit-on-failure
      MINOR: mworker: Improve wording in `void mworker_wait()`

Vincent Bernat (1):
      MINOR: systemd: remove comment about HAPROXY_STATS_SOCKET

William Lallemand (1):
      BUG/MINOR: ssl: support tune.ssl.cachesize 0 again

Willy Tarreau (35):
      BUG/MAJOR: hpack: don't pretend large headers fit in empty table
      BUG/MEDIUM: mworker: also close peers sockets in the master
      BUG/MEDIUM: peers: set NOLINGER on the outgoing stream interface
      BUG/MEDIUM: h2: fix handling of end of stream again
      MINOR: conn_stream: add new flag CS_FL_RCV_MORE to indicate pending data
      BUG/MEDIUM: stream-int: always set SI_FL_WAIT_ROOM on CS_FL_RCV_MORE
      BUG/MEDIUM: h2: automatically set CS_FL_RCV_MORE when the output buffer is full
      BUG/MEDIUM: h2: enable recv polling whenever demuxing is possible
      BUG/MEDIUM: h2: work around a connection API limitation
      BUG/MEDIUM: h2: debug incoming traffic in h2_wake()
      MINOR: h2: store the demux padding length in the h2c struct
      BUG/MEDIUM: h2: support uploading partial DATA frames
      MINOR: h2: don't demand that a DATA frame is complete before processing it
      BUG/MEDIUM: h2: don't switch the state to HREM before end of DATA frame
      BUG/MEDIUM: h2: don't close after the first DATA frame on tunnelled responses
      BUG/MEDIUM: http: don't disable lingering on requests with tunnelled responses
      BUG/MEDIUM: h2: fix stream limit enforcement
      BUG/MINOR: stream-int: don't try to
Re: disable-on-404 functionality change in 1.8
Hi guys,

On Sat, Dec 23, 2017 at 08:58:43AM +0100, Cyril Bonté wrote:
> It looks to be a code regression.
>
> Emeric, can you have a look at commit 5a1335110c ? It seems there was an
> unwanted change in the function call : srv_set_stopping() was replaced by
> srv_set_running()
> [...]
>  /* Marks the check as valid and tries to set its server into stopping mode
> @@ -406,7 +371,7 @@ static void check_notify_stopping(struct check *check)
>  	if ((s->agent.state & CHK_ST_ENABLED) && (s->agent.health < s->agent.rise))
>  		return;
>
> -	srv_set_stopping(s, (!s->track && !(s->proxy->options2 & PR_O2_LOGHCHKS)) ? check_reason_string(check) : NULL);
> +	srv_set_running(s, NULL, (!s->track && !(s->proxy->options2 & PR_O2_LOGHCHKS)) ? check : NULL);

Nice catch! Thanks Paul for your helpful report and Cyril for spotting
the bug. Now fixed, I can prepare 1.8.2 :-)

Willy
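
To illustrate why swapping the two calls in the quoted diff breaks disable-on-404, here is a toy model. It is a sketch under strong assumptions: the types, the `toy_` functions and their signatures are hypothetical and far simpler than HAProxy's real `srv_set_stopping()` and `srv_set_running()`. It only shows that a "please stop" notification routed to the "running" setter leaves the server state unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified model; not HAProxy's real structures. */
enum toy_srv_state { TOY_SRV_RUNNING, TOY_SRV_STOPPING, TOY_SRV_STOPPED };

struct toy_server {
    enum toy_srv_state state;
    const char *last_reason;   /* why the state last changed, or NULL */
};

static void toy_srv_set_running(struct toy_server *s, const char *reason)
{
    s->state = TOY_SRV_RUNNING;
    s->last_reason = reason;
}

static void toy_srv_set_stopping(struct toy_server *s, const char *reason)
{
    s->state = TOY_SRV_STOPPING;
    s->last_reason = reason;
}

/* Intended behaviour: a check result meaning "drain this server"
 * moves it to the stopping state. */
static void toy_check_notify_stopping_fixed(struct toy_server *s)
{
    assert(s != NULL);
    toy_srv_set_stopping(s, "check asked to stop");
}

/* The regression pattern: the same notification calls the "running"
 * setter, so a running server never leaves the running state. */
static void toy_check_notify_stopping_regressed(struct toy_server *s)
{
    assert(s != NULL);
    toy_srv_set_running(s, NULL);
}
```

Starting from a running server, the fixed variant ends in the stopping state while the regressed variant stays running, which is consistent with the symptom reported in this thread.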
Re: Issue after upgrade from 1.7 to 1.8 related with active sessions
Hi Ricardo,

On Sat, Dec 23, 2017 at 09:06:36AM +, Ricardo Fraile wrote:
> Hello Willy,
>
> It works perfect! Problem solved :)

Great, thanks!

> The doubt that I have now is related with the trace line "-1 ENOTCONN
> (Transport endpoint is not connected)" and the relationship with the issue...

From what I remember it was on a shutdown(SHUT_WR). It's harmless but
suboptimal. It proves that there is a situation where two different events
lead to an attempt to close and that we don't know we have already closed.
We could save a shutdown() syscall by refining the condition there. Given
that your original issue was caused by a missing call to shutdown() I'd
rather postpone such investigation ;-)

Cheers,
Willy
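
The idea of "refining the condition" so that a second close event does not cost another shutdown() syscall can be sketched as follows. This is an illustration only: the `tracked_sock` type and `tracked_shutw()` helper are hypothetical and do not reflect how HAProxy actually records connection state.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical wrapper type, for illustration only. */
struct tracked_sock {
    int  fd;
    bool shutw_done;      /* write side already shut down by us */
    int  shutw_syscalls;  /* how many times shutdown() was really called */
};

/* Shut the write side at most once. Returns 0 on success or when there is
 * nothing left to do, -1 with errno set if shutdown() fails for a reason
 * other than ENOTCONN. */
static int tracked_shutw(struct tracked_sock *ts)
{
    assert(ts != NULL);
    if (ts->shutw_done)
        return 0;                  /* second event: skip the syscall */

    ts->shutw_syscalls++;
    if (shutdown(ts->fd, SHUT_WR) == -1 && errno != ENOTCONN)
        return -1;                 /* a real failure, leave the flag unset */

    ts->shutw_done = true;         /* ENOTCONN is treated as already closed */
    return 0;
}
```

Calling `tracked_shutw()` twice on the same socket issues a single shutdown() call; the second call returns immediately because the flag is already set.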
Re: Issue after upgrade from 1.7 to 1.8 related with active sessions
Hello Willy,

It works perfectly! Problem solved :)

From my side, yesterday afternoon I was walking through the commits to find when the change was introduced. I ended up at the same commit, "MEDIUM: connection: make conn_sock_shutw() aware of lingering", and the workaround I found was using "option nolinger". Coincidentally, when I was about to write the email, your answer arrived with the right fix.

The doubt I have now is about the trace line "-1 ENOTCONN (Transport endpoint is not connected)" and its relationship with the issue... It still happens even though the problem is solved, so the two do not seem to be linked.

I found that this behaviour was introduced by the commit "3256073976d4f43e12e7ff97d243fdb8eb56165a - MEDIUM: stream: do not forcefully close the client connection anymore", but I can't reproduce it if I run the test by sending the request (a simple curl) from outside the server network over a VPN link.

Since I can't see any other issue, is this within the expected behaviour?

Thanks for your time, Willy and Christopher.

From: Willy Tarreau
Sent: Friday, December 22, 2017 18:57
To: Ricardo Fraile
Cc: haproxy@formilux.org
Subject: Re: Issue after upgrade from 1.7 to 1.8 related with active sessions

Hi Ricardo,

On Fri, Dec 22, 2017 at 12:37:42PM +0100, Ricardo Fraile wrote:
> Continuing with the investigation, I changed the listen only to this:
>
> listen proxy-test-tcp
>     bind *:81
>     option tcplog
>     server test1 192.168.1.101:80
>
> And the difference between 1.7 and 1.8, tracing the process which receives
> only 1 request, is that the shutdown of the socket which receives the
> request fails with an ENOTCONN. In 1.8 it stays in CLOSE_WAIT for a
> while, whereas in 1.7 it moves to TIME_WAIT as usual.
(...)

I finally found it thanks to all your information and to Christopher's
bisect. I've just fixed it now with the attached patch. Feel free to retest
it, but I'm confident I can issue 1.8.2 now.

Many thanks for your very detailed report!
Willy