Hi Willy,
you wrote here, that tunnel mode is normally not recommended.
We understand that, but we are looking for the best configuration for our 
purposes.
We want to use haproxy as an internal loadbalancer between our frontend and 
backend servers.
For some purposes we will need sharding (like loadbalancing by url parameter) 
and for some not.

For the latter we think tunnel mode should be appropriate in the following 
scenario:
- the client (which are our frontend servers) establishes several keepalived 
connections to haproxy
- haproxy establishes several keepalived connections to backend servers
- the client communicates with haproxy using one of available connections
- the request is delivered to the backend assigned to this connection

We wonder why is this impossible when the request needs to be analyzed. Our 
idea is that the client could use keepalived connections to haproxy,
and also haproxy would use keepalived connections to backend servers. It would 
only choose the connection after request analyzis.
Does this make any sense and is such configuration possible? If not, what 
configuration would you recommend? 
The closest to this that we could achieve was setting option http-server-close, 
which at least allows the clients to use keepalived connections.

Best regards,
Mikolaj

----- Oryginalna wiadomość -----
Od: "Willy Tarreau" <[email protected]>
Do: "Jaroslaw Bojar" <[email protected]>
DW: [email protected], "Miko??aj Gierulski" <[email protected]>
Wysłane: sobota, 20 październik 2012 9:08:06
Temat: Re: Problem with chunked transfer encoding and short responses

Hi Jaroslaw,

first, thank you for this detailed report and the traces.

On Fri, Oct 19, 2012 at 03:45:29PM +0200, Jaroslaw Bojar wrote:
> Hi,
> during evaluation of Haproxy for our internal loadbalancer we encountered 
> following problem.

What version was this ?

> Our backend sends very short responses (less than 1000 bytes). 
> Transfer-encoding is set to chunked.
> Client connecting via haproxy experiences unexpectedly long delays (about 2-3 
> seconds) before receiving response.

>From what I'm seeing in your trace, there is exactly a 2 seconds delay
which surprizingly corresponds to the server timeout in the configuration.
So this clearly looks like a bug !

> Larger responses (larger than about 1223 bytes) are received immediately. 
> Obviously, when connecting to the backend directly, there is no such delay.

your network trace is ver clear on this :

14:56:21.186677 IP (tos 0x0, ttl 60, id 63690, offset 0, flags [DF], proto TCP 
(6), length 86)
    172.23.26.210.5280 > 10.205.61.112.57001: Flags [P.], cksum 0xd5d4 
(correct), seq 1714337006:1714337052, ack 3935368426, win 15544, length 46
        0x0000:  4500 0056 f8ca 4000 3c06 36b1 ac17 1ad2  E..V..@.<.6.....
        0x0010:  0acd 3d70 14a0 dea9 662e b4ee ea90 f4ea  ..=p....f.......
        0x0020:  5018 3cb8 d5d4 0000 3233 0d0a 4c69 6e65  P.<.....23..Line
        0x0030:  206e 756d 6265 7220 303c 6272 3e0a 0a3c  .number.0<br>..<
        0x0040:  2f62 6f64 793e 0a3c 2f68 746d 6c3e 0a0d  /body>.</html>..
        0x0050:  0a30 0d0a 0d0a                           .0....

==> This is the last chunk of the response.

14:56:21.186685 IP (tos 0x0, ttl 64, id 56173, offset 0, flags [DF], proto TCP 
(6), length 40)
    10.205.61.112.57001 > 172.23.26.210.5280: Flags [.], cksum 0x997d 
(correct), seq 3935368426, ack 1714337052, win 6432, length 0
        0x0000:  4500 0028 db6d 4000 4006 503c 0acd 3d70  E..(.m@[email protected]<..=p
        0x0010:  ac17 1ad2 dea9 14a0 ea90 f4ea 662e b51c  ............f...
        0x0020:  5010 1920 997d 0000                      P....}..

==> it is ACKed so we're certain it has been delivered to the kernel
    socket buffers

14:56:23.191667 IP (tos 0x0, ttl 64, id 39685, offset 0, flags [DF], proto TCP 
(6), length 355)
    10.205.61.112.13091 > 10.201.54.67.50481: Flags [F.], cksum 0x8a9e 
(incorrect -> 0x425e), seq 3006575672:3006575987, ack 346156374, win 6432, 
length 315
        0x0000:  4500 0163 9b05 4000 4006 1547 0acd 3d70  E..c..@[email protected]..=p
        0x0010:  0ac9 3643 3323 c531 b334 b438 14a1 ed56  ..6C3#.1.4.8...V
        0x0020:  5011 1920 8a9e 0000 4854 5450 2f31 2e31  P.......HTTP/1.1
        0x0030:  2032 3030 204f 4b0d 0a58 2d50 6f77 6572  .200.OK..X-Power
        0x0040:  6564 2d42 793a 204a 5350 2f32 2e32 0d0a  ed-By:.JSP/2.2..
        0x0050:  5365 7276 6572 3a20 476c 6173 7346 6973  Server:.GlassFis
        0x0060:  6820 5365 7276 6572 204f 7065 6e20 536f  h.Server.Open.So

The equivalent part goes out exactly 2 seconds after.

> We experienced this behaviour with three different backends (tomcat, weblogic 
> and glassfish).

Do you know if you can reproduce it when sending only one chunk of
data in response ? Anyway, it's so simlpe that I should be able to
reproduce it with netcat.

> We analyzed the communications with tcpdump and came to following conclusions:
> 
> Haproxy seems to buffer the response received in several TCP packets, before
> sending it to the client (which is understandable).

In fact it reads as much as possible at once and sends as much as
possible at once too. By default, unless http-no-delay is set, it
tells the system that more data will follow (MSG_MORE) so that the
system can merge adjacent segments in less TCP packets, and usually
the system buffers up to 40 ms before sending incomplete data. Here
we're clearly not in this situation.

> Delays described above happen when the whole response is received by haproxy
> (last chunk 0<CR><LF><CR><LF> is received) before it starts sending response
> to the client (probably response still fits some buffer). The whole response
> is sent to the client in one TCP packet.

I agree, and this test is really good because there is no risk of lack of
any resource at any moment.

> With longer responses haproxy sends response to the client in two or more TCP
> packets, and there is no delay before the last packet (when last chunk is
> received).
> 
> We were able to workaround this problem with following solutions:
>  - option http-no-delay
>  - option http-server-close

In fact without http-server-close, haproxy switches to tunnel mode just
after the transfer, allowing data to flow between the two sides just as
if it was a raw TCP connection. Using this mode is normally not recommended
since it will not see subsequent exchanges anymore, and I suspect that
the issue happens when switching to tunnel mode, probably that data are
not scheduled for being sent at this point.

> We read some discussions about similar problems
> (http://www.mail-archive.com/[email protected]/msg05419.html), but we find
> our situation different - we are not abusing http protocol. 
> It's different, because we experience delays after the whole message was sent
> by the backend.

I totally agree with you, your test is clean, I didn't see anything
wrong in the responses.

I'll try to reproduce with latest dev and your config ; even if tunnel
mode is normally not recommended, the bug should be diagnosed in case
it's a larger one.

Thanks!
Willy

--
Niniejsza wiadomość pochodzi z domeny @agora.pl, należącej do Grupy
Kapitałowej Agory. Główne spółki wchodzące w skład Grupy Kapitałowej
Agory to:

Agora SA, ul. Czerska 8/10, 00-732 Warszawa; Numer identyfikacji
podatkowej: PL 526-030-56-44; Miejsce zarejestrowania: Sąd Rejonowy dla
m. st. Warszawy: Numer rejestru KRS: 59944; Kapitał zakładowy:
50.937.386 zł, wpłacony w całości.

Agora - Poligrafia Sp. z o.o., ul. Towarowa 4, 43-110 Tychy; Numer
identyfikacji podatkowej: PL 646-20-72-095; Miejsce zarejestrowania: Sąd
Rejonowy w Katowicach Numer rejestru KRS: 72481; Kapitał zakładowy:
1.000.000,00 zł.

Grupa Radiowa Agory Sp. z o.o., ul. Czerska 8/10, 00-732 Warszawa; Numer
identyfikacji podatkowej: PL 521-289-70-03; Miejsce zarejestrowania: Sąd
Rejonowy dla m. st. Warszawy Numer rejestru KRS: 126767; Kapitał
zakładowy:  25.019.500,00 zł.

Więcej informacji  o spółkach na stronie: www.agora.pl Wiadomość jest
przeznaczona wyłącznie dla zamierzonego adresata i może zawierać
informacje o charakterze poufnym. W razie stwierdzenia, że odbiorcą
miała być inna osoba prosimy poinformować nadawcę oraz niezwłocznie
usunąć wiadomość. Wiadomość może nie stanowić oficjalnego stanowiska
spółki Agora SA i nie być związana z jej działalnością.
------------
This message was sent from domain @agora.pl belonging to Agora Group.
Principal companies in the Agora Group structure are:

Agora SA, ul. Czerska 8/10, 00-732 Warszawa; Polish VAT and tax ID no.:
PL 526-030-56-44; Place of registration: Regional Court for the Capital
City of Warsaw; Registration no.: 59944; Share capital: PLN 50.937.386,
fully paid-up.

Agora - Poligrafia Sp. z o.o., ul. Towarowa 4, 43-110 Tychy; Polish VAT
and tax ID no.: PL 646-20-72-095; Place of registration: Regional Court
in Katowice; Registration no.: 72481; Share capital: PLN 1.000.000,00.

Grupa Radiowa Agory Sp. z o.o., ul. Czerska 8/10, 00-732 Warszawa;
Polish VAT and tax ID no.: PL 521-289-70-03; Place of registration:
Regional Court for the Capital City of Warsaw; Registration no.: 126767;
Share capital: PLN 25.019.500,00.

For more information about our companies see site: www.agora.pl This
message is for the intended recipient only and it may contain
confidential information. If you receive this message in error, please
immediately delete it and notify the sender. This message may not
represent the official views of Agora SA and may not be related to its
business.


Reply via email to