Re: Haproxy 1.8 http/2 mode does not pass the h2spec conformance test

2017-12-23 Thread Robin Anil
Hi Willy,

Most of the tests pass now in 1.8.2. However, the following test still
hangs in --strict mode:


   2: Sends a large size DATA frame that exceeds the SETTINGS_MAX_FRAME_SIZE
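
For readers unfamiliar with what this h2spec case exercises, here is a minimal sketch of the size check that RFC 7540 section 4.2 asks an endpoint to apply. It is not HAProxy's code; the helper names (build_frame_header, parse_frame_header, check_frame_size) are made up for illustration, and only the 9-octet header layout, the 2^14 default, and the FRAME_SIZE_ERROR code come from the RFC.

```python
import struct

DEFAULT_MAX_FRAME_SIZE = 2 ** 14  # 16384, initial SETTINGS_MAX_FRAME_SIZE (RFC 7540, 6.5.2)
FRAME_SIZE_ERROR = 0x6            # error code defined in RFC 7540, section 7
FRAME_TYPE_DATA = 0x0


def build_frame_header(length, frame_type, flags, stream_id):
    """Pack the 9-octet frame header: 24-bit length, type, flags, 31-bit stream id."""
    return struct.pack(">BHBBI", length >> 16, length & 0xFFFF,
                       frame_type, flags, stream_id & 0x7FFFFFFF)


def parse_frame_header(header):
    """Unpack a 9-octet frame header into (length, type, flags, stream_id)."""
    hi, lo, frame_type, flags, stream_id = struct.unpack(">BHBBI", header[:9])
    return (hi << 16) | lo, frame_type, flags, stream_id & 0x7FFFFFFF


def check_frame_size(header, max_frame_size=DEFAULT_MAX_FRAME_SIZE):
    """Return None if the announced length is acceptable, else the error code."""
    length, _type, _flags, _stream_id = parse_frame_header(header)
    return FRAME_SIZE_ERROR if length > max_frame_size else None
```

With these helpers, a DATA frame header announcing 16384 octets passes the check, while one announcing 16385 octets yields FRAME_SIZE_ERROR, which is the reaction the test waits for instead of a hang.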

On Sun, Nov 26, 2017 at 11:54 PM Robin Anil  wrote:

> I can get the tests to match yours now. Looks like it was a connection
> issue.
> Finished in 70.5089 seconds
>
> 145 tests, 116 passed, 0 skipped, 29 failed
>
>
>
> Also, if you haven't noticed the --strict option: in that mode I can get
> the test to hang on
>
> 4.2. Frame Size
>
>   ✔ 1: Sends a DATA frame with 2^14 octets in length
>
>     2: Sends a large size DATA frame that exceeds the SETTINGS_MAX_FRAME_SIZE
>
> On Sun, Nov 26, 2017 at 11:39 PM Robin Anil  wrote:
>
>> Sorry, this one, not the alpine build
>>
>>
>> https://github.com/docker-library/haproxy/blob/0c9c27713bfca8505331e0da2664a9454755c7b9/1.8-rc/Dockerfile
>>
>> On Sun, Nov 26, 2017 at 11:36 PM Robin Anil  wrote:
>>
>>> I am running HAProxy 1.8 in a Docker container, using a fork of the
>>> official build that I pointed at the latest 1.8.0 instead of rc-4. So it
>>> is built with OpenSSL 1.1.
>>>
>>> Docker hub link: https://hub.docker.com/_/haproxy/
>>> See the Dockerfile
>>> https://github.com/docker-library/haproxy/blob/0c9c27713bfca8505331e0da2664a9454755c7b9/1.8-rc/alpine/Dockerfile
>>>
>>> Just replaced these two lines
>>> ENV HAPROXY_VERSION 1.8.0
>>> ENV HAPROXY_MD5 6ccea4619b7183fbcc8c98bae1f9823d
>>>
>>> On Sun, Nov 26, 2017 at 11:29 PM Willy Tarreau  wrote:
>>>
 On Mon, Nov 27, 2017 at 06:15:48AM +0100, Willy Tarreau wrote:
 > On Mon, Nov 27, 2017 at 05:10:16AM +, Robin Anil wrote:
 > > A very stripped down version of config
 >
 > Thank you, I'll check if anything there can explain this.

 So with your config I'm getting this :

 Finished in 52.3802 seconds
 145 tests, 110 passed, 4 skipped, 31 failed

 I had to disable ssl-mode-async as my openssl version doesn't support
 it.

 Of the 31, I'm seeing real bugs compared to what is expected to work,
 such as :

 3.8. GOAWAY
   × 1: Sends a GOAWAY frame
     -> The endpoint MUST accept GOAWAY frame.
        Expected: Connection closed
          Actual: Timeout

 Others are expected for now :

 3.10. CONTINUATION
   × 1: Sends a CONTINUATION frame
     -> The endpoint MUST accept CONTINUATION frame.
        Expected: HEADERS Frame (stream_id:1)
          Actual: Connection closed

 4. HTTP Message Exchanges
   × 4: Sends a POST request with trailers
     -> The endpoint MUST respond to the request.
        Expected: HEADERS Frame (stream_id:1)
          Actual: Connection closed

 And the other ones are very useful as they likely indicate missing
 checks. So I'll take a look. Thanks!

 Willy

>>>


Re: disable-on-404 functionality change in 1.8

2017-12-23 Thread Paul Lockaby


> On Dec 23, 2017, at 2:24 AM, Willy Tarreau  wrote:
> 
> Hi guys,
> 
> On Sat, Dec 23, 2017 at 08:58:43AM +0100, Cyril Bonté wrote:
>> It looks to be a code regression.
>> 
>> Emeric, can you have a look at commit 5a1335110c ? It seems there was an
>> unwanted change in the function call : srv_set_stopping() was replaced by
>> srv_set_running()
>> [...]
>> /* Marks the check  as valid and tries to set its server into stopping mode
>> @@ -406,7 +371,7 @@ static void check_notify_stopping(struct check *check)
>>  if ((s->agent.state & CHK_ST_ENABLED) && (s->agent.health < s->agent.rise))
>>      return;
>> 
>> -srv_set_stopping(s, (!s->track && !(s->proxy->options2 & PR_O2_LOGHCHKS)) ? check_reason_string(check) : NULL);
>> +srv_set_running(s, NULL, (!s->track && !(s->proxy->options2 & PR_O2_LOGHCHKS)) ? check : NULL);
> 
> Nice catch! Thanks Paul for your helpful report and Cyril for spotting
> the bug. Now fixed, I can prepare 1.8.2 :-)
> 
> Willy

Thanks all for getting this fixed so dang quickly on a Friday night before a 
holiday weekend. It looks good to me now. Much appreciated.

-Paul

[ANNOUNCE] haproxy-1.8.2

2017-12-23 Thread Willy Tarreau
Hi,

HAProxy 1.8.2 was released on 2017/12/23. It added 64 new commits
after version 1.8.1.

This version fixes all the issues diagnosed since 1.8.1. The most
important ones are :
  - truncated and slow HTTP/2 POST forms
  - abortonclose killing all HTTP/2 requests
  - single server taking all the load in map-based algorithms
  - timeouts and too late connection shutdown on TCP/tunnel
  - cache did not consider cache-control in the request
  - various server state transition issues (down->maint, stopping)
  - email alerts unexpectedly modifying the server state
  - log fd leaks across reloads in master-worker mode
  - deadlocks in variables usage under threads

There are still a few pending reports that need to be analysed, but
having a new reference version without all the problems above will
help sort the bug reports and will spare most users unpleasant
surprises.

If you are on 1.8, please upgrade to 1.8.2, at least before reporting
a bug. We'll all save valuable time :-)

Please find the usual URLs below :
   Site index       : http://www.haproxy.org/
   Discourse        : http://discourse.haproxy.org/
   Sources          : http://www.haproxy.org/download/1.8/src/
   Git repository   : http://git.haproxy.org/git/haproxy-1.8.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-1.8.git
   Changelog        : http://www.haproxy.org/download/1.8/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/

Willy
---
Complete changelog :
Aleksandar Lazic (1):
  CONTRIB: halog: Add help text for -s switch in halog program

Bertrand Jacquin (8):
  MINOR: netscaler: respect syntax
  MINOR: netscaler: remove the use of cip_magic only used once
  MINOR: netscaler: rename cip_len to clarify its uage
  BUG/MEDIUM: netscaler: use the appropriate IPv6 header size
  BUG/MAJOR: netscaler: address truncated CIP header detection
  MINOR: netscaler: check in one-shot if buffer is large enough for IP and TCP header
  MEDIUM: netscaler: do not analyze original IP packet size
  MEDIUM: netscaler: add support for standard NetScaler CIP protocol

Christopher Faulet (3):
  BUG/MINOR: action: Don't check http capture rules when no id is defined
  BUG/MEDIUM: threads/vars: Fix deadlock in register_name
  BUG/MEDIUM: mworker: Set FD_CLOEXEC flag on log fd

Cyril Bonté (2):
  BUG: MAJOR: lb_map: server map calculation broken
  BUG: MINOR: http: don't check http-request capture id when len is provided

David Carlier (1):
  BUILD/MINOR: Makefile : enabling USE_CPU_AFFINITY

Davor Ocelic (1):
  DOC/MINOR: intro: typo, wording, formatting fixes

Emeric Brun (3):
  BUG/MEDIUM: ssl engines: Fix async engines fds were not considered to fix fd limit automatically.
  BUG/MEDIUM: checks: a down server going to maint remains definitely stucked on down state.
  BUG/MEDIUM: checks: a server passed in maint state was not forced down.

Eric Salama (1):
  BUG/MEDIUM: lua: fix crash when using bogus mode in register_service()

PiBa-NL (1):
  BUG/MEDIUM: email-alert: don't set server check status from a email-alert task

Ryan O'Hara (2):
  CONTRIB: iprange: Fix compiler warning in iprange.c
  CONTRIB: halog: Fix compiler warnings in halog.c

Thierry FOURNIER (2):
  DOC: notifications: add precisions about thread usage
  BUG/MEDIUM: lua/notification: memory leak

Tim Duesterhus (2):
  MINOR: mworker: Update messages referencing exit-on-failure
  MINOR: mworker: Improve wording in `void mworker_wait()`

Vincent Bernat (1):
  MINOR: systemd: remove comment about HAPROXY_STATS_SOCKET

William Lallemand (1):
  BUG/MINOR: ssl: support tune.ssl.cachesize 0 again

Willy Tarreau (35):
  BUG/MAJOR: hpack: don't pretend large headers fit in empty table
  BUG/MEDIUM: mworker: also close peers sockets in the master
  BUG/MEDIUM: peers: set NOLINGER on the outgoing stream interface
  BUG/MEDIUM: h2: fix handling of end of stream again
  MINOR: conn_stream: add new flag CS_FL_RCV_MORE to indicate pending data
  BUG/MEDIUM: stream-int: always set SI_FL_WAIT_ROOM on CS_FL_RCV_MORE
  BUG/MEDIUM: h2: automatically set CS_FL_RCV_MORE when the output buffer is full
  BUG/MEDIUM: h2: enable recv polling whenever demuxing is possible
  BUG/MEDIUM: h2: work around a connection API limitation
  BUG/MEDIUM: h2: debug incoming traffic in h2_wake()
  MINOR: h2: store the demux padding length in the h2c struct
  BUG/MEDIUM: h2: support uploading partial DATA frames
  MINOR: h2: don't demand that a DATA frame is complete before processing it
  BUG/MEDIUM: h2: don't switch the state to HREM before end of DATA frame
  BUG/MEDIUM: h2: don't close after the first DATA frame on tunnelled responses
  BUG/MEDIUM: http: don't disable lingering on requests with tunnelled responses
  BUG/MEDIUM: h2: fix stream limit enforcement
  BUG/MINOR: stream-int: don't try to 

Re: disable-on-404 functionality change in 1.8

2017-12-23 Thread Willy Tarreau
Hi guys,

On Sat, Dec 23, 2017 at 08:58:43AM +0100, Cyril Bonté wrote:
> It looks to be a code regression.
> 
> Emeric, can you have a look at commit 5a1335110c ? It seems there was an
> unwanted change in the function call : srv_set_stopping() was replaced by
> srv_set_running()
> [...]
>  /* Marks the check  as valid and tries to set its server into stopping mode
> @@ -406,7 +371,7 @@ static void check_notify_stopping(struct check *check)
>   if ((s->agent.state & CHK_ST_ENABLED) && (s->agent.health < s->agent.rise))
>       return;
> 
> - srv_set_stopping(s, (!s->track && !(s->proxy->options2 & PR_O2_LOGHCHKS)) ? check_reason_string(check) : NULL);
> + srv_set_running(s, NULL, (!s->track && !(s->proxy->options2 & PR_O2_LOGHCHKS)) ? check : NULL);

Nice catch! Thanks Paul for your helpful report and Cyril for spotting
the bug. Now fixed, I can prepare 1.8.2 :-)

Willy
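
To make the effect of the swapped call easier to picture, here is a deliberately simplified toy model. It is not HAProxy code: it ignores rise/fall counters, agents and tracking, and every name in it (ServerState, next_state, next_state_regressed) is hypothetical. It only assumes the documented intent of "http-check disable-on-404" (a 404 on the health check puts the server in a stopping state that receives no new load) and that 2xx/3xx responses count as healthy.

```python
from enum import Enum


class ServerState(Enum):
    RUNNING = "running"
    STOPPING = "stopping"  # keeps serving persistent sessions, gets no new load
    DOWN = "down"


def next_state(http_status, disable_on_404=True):
    """Hypothetical model of the intended state after one health check response."""
    if disable_on_404 and http_status == 404:
        return ServerState.STOPPING
    if 200 <= http_status < 400:
        return ServerState.RUNNING
    return ServerState.DOWN


def next_state_regressed(http_status, disable_on_404=True):
    """Same model with the swapped call: the 404 branch marks the server running."""
    if disable_on_404 and http_status == 404:
        return ServerState.RUNNING
    return next_state(http_status, disable_on_404)
```

In this model next_state(404) gives STOPPING while next_state_regressed(404) gives RUNNING, which is presumably why a server answering 404 kept receiving traffic before the fix.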



Re: Issue after upgrade from 1.7 to 1.8 related with active sessions

2017-12-23 Thread Willy Tarreau
Hi Ricardo,

On Sat, Dec 23, 2017 at 09:06:36AM +, Ricardo Fraile wrote:
> Hello Willy,
> 
> 
> It works perfectly! Problem solved :)

Great, thanks!

> The question I still have concerns the trace line "-1 ENOTCONN
> (Transport endpoint is not connected)" and how it relates to the issue...

From what I remember it was on a shutdown(SHUT_WR). It's harmless but
suboptimal. It proves that there is a situation where two different
events lead to an attempt to close and that we don't know we have
already closed. We could save a shutdown() syscall by refining the
condition there. Given that your original issue was caused by a missing
call to shutdown() I'd rather postpone such investigation ;-)

Cheers,
Willy
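
As an illustration of the ENOTCONN result discussed above, the following sketch shows that shutdown(SHUT_WR) on a TCP socket that is not (or no longer) connected fails with ENOTCONN, and how remembering the write-side state can avoid a redundant syscall. It assumes Linux socket semantics; TrackedConn and shutdown_write are hypothetical names and not part of HAProxy.

```python
import errno
import socket


def shutdown_write(sock):
    """Call shutdown(SHUT_WR); return None on success or the errno name on failure."""
    try:
        sock.shutdown(socket.SHUT_WR)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return None


class TrackedConn:
    """Hypothetical wrapper that remembers whether the write side was already shut."""

    def __init__(self, sock):
        self.sock = sock
        self.write_shut = False
        self.syscalls = 0  # number of shutdown() calls actually issued

    def shutw(self):
        if self.write_shut:
            return  # second close event: nothing left to do, skip the syscall
        self.write_shut = True
        self.syscalls += 1
        shutdown_write(self.sock)
```

Calling shutdown_write() on a freshly created, never connected TCP socket returns "ENOTCONN" on Linux. Calling shutw() twice on a TrackedConn issues a single shutdown(), which is the kind of saving mentioned in the message.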



Re: Issue after upgrade from 1.7 to 1.8 related with active sessions

2017-12-23 Thread Ricardo Fraile
Hello Willy,


It works perfectly! Problem solved :)

From my side, yesterday afternoon I was walking through the commits to find
when the change happened. I ended up at the same commit, "MEDIUM: connection:
make conn_sock_shutw() aware of lingering", and the workaround I found was
using "option nolinger". Coincidentally, just as I was about to write the
email, your answer arrived with the right fix.
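
For context on the workaround, the sketch below shows the socket-level mechanism that "option nolinger" is generally understood to rely on: enabling SO_LINGER with a zero timeout, so that close() aborts the connection instead of going through the normal FIN sequence. This is an assumption about the mechanism, not a copy of HAProxy's code; the helper names are hypothetical and the struct layout (two C ints) is the POSIX/Linux one.

```python
import socket
import struct


def set_nolinger(sock):
    """Enable SO_LINGER with a zero timeout so close() resets the connection."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))


def get_linger(sock):
    """Return (l_onoff, l_linger) as currently set on the socket."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                          struct.calcsize("ii"))
    return struct.unpack("ii", raw)
```

After set_nolinger(sock), get_linger(sock) reports a non-zero l_onoff and an l_linger of 0. The trade-off is that pending data may be discarded and the peer sees a reset, so it is better suited as a workaround than as a default.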

The question I still have concerns the trace line "-1 ENOTCONN
(Transport endpoint is not connected)" and how it relates to the issue...

It still happens even though the problem is solved, so there is no link
between the two.

I found that this behaviour was introduced by commit
"3256073976d4f43e12e7ff97d243fdb8eb56165a - MEDIUM: stream: do not forcefully
close the client connection anymore", but I can't reproduce it when I send the
test request (a simple curl) from outside the server network over a VPN link.
Since I can't see any other issue, does it fall within the expected
behaviour?


Thanks for your time Willy and Christopher.





From: Willy Tarreau 
Sent: Friday, December 22, 2017 18:57
To: Ricardo Fraile
Cc: haproxy@formilux.org
Subject: Re: Issue after upgrade from 1.7 to 1.8 related with active sessions

Hi Ricardo,

On Fri, Dec 22, 2017 at 12:37:42PM +0100, Ricardo Fraile wrote:
> Continuing with the investigation, I reduced the listen section to just this:
>
> listen proxy-test-tcp
>     bind *:81
>     option tcplog
>     server test1 192.168.1.101:80
>
>
> The difference between 1.7 and 1.8, when tracing the process while it
> handles a single request, is that the shutdown of the socket that receives
> the request fails with ENOTCONN. In 1.8 the connection stays in CLOSE_WAIT
> for a while, whereas in 1.7 it moves to TIME_WAIT as usual.
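
One way to observe the CLOSE_WAIT versus TIME_WAIT difference described above is to count socket states for the listening port. The sketch below parses text in the Linux /proc/net/tcp format; the function name count_states is hypothetical, port 81 is taken from the test config, and the state codes are the ones the Linux kernel uses.

```python
from collections import Counter

# TCP state codes as they appear in the "st" column of /proc/net/tcp on Linux.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_states(proc_net_tcp_text, local_port):
    """Count sockets per TCP state for one local port."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 4:
            continue
        local_address, state = fields[1], fields[3]
        port = int(local_address.rsplit(":", 1)[1], 16)  # port is hex-encoded
        if port == local_port:
            counts[TCP_STATES.get(state, state)] += 1
    return counts
```

On a Linux host this could be fed with the contents of /proc/net/tcp, for example count_states(open("/proc/net/tcp").read(), 81), once against 1.7 and once against 1.8 after a single request.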

(...)

I finally found it thanks to all your information and to Christopher's
bisect. I've just fixed it now with the attached patch. Feel free to
retest it, but I'm confident I can issue 1.8.2 now.

Many thanks for your very detailed report!

Willy