Hi David,

On Wed, Feb 12, 2014 at 01:42:18PM +0000, David Harrold wrote:
> Hi Willy
> 
> Well that took a lot longer than anticipated?

Don't worry, all those who contribute code (including me) know this
symptom quite well. First you think you'll do something simple and
after scratching a bit you have to change a lot more things.

> my first cut only worked for
> small requests. It does not seem possible to use TFO if the application layer
> request is not all going to fit within the SYN packet. A fragmented SYN is
> likely to get dropped by the receiver, and if we hold some of the request
> data back to avoid SYN fragmentation then the remote application just gets
> pushed a partial request, so it still breaks. (At least on Linux) 

Ah, very interesting discovery. It might have explained the issues I
experienced in the past. I feel a bit worried by this because it means
that we don't know what happens if we push extra data in a second send().
Normally the TCP stack ought to take care of the default MSS and only
send a small amount of data so that the outgoing packet is not larger
than the MTU for the route, or the cached MSS, and the rest could be
either buffered or returned as "not consumed".
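
To make the question concrete, here is an untested sketch of what I
believe the client side looks like with the stock Linux MSG_FASTOPEN
interface (function and variable names are only illustrative, this is
not your patch code). The whole point is what the kernel does with the
part of the request that does not fit in the SYN:

  #include <sys/socket.h>
  #include <netinet/in.h>

  #ifndef MSG_FASTOPEN
  #define MSG_FASTOPEN 0x20000000
  #endif

  /* blocking-socket sketch: hand the whole request to the call which
   * initiates the connection, then push whatever was not taken using
   * a regular send() once the connection is established.
   */
  static int tfo_send_request(int fd, const struct sockaddr_in *dst,
                              const char *req, size_t len)
  {
      ssize_t ret = sendto(fd, req, len, MSG_FASTOPEN,
                           (const struct sockaddr *)dst, sizeof(*dst));
      if (ret < 0)
          return -1;

      while ((size_t)ret < len) {
          /* pushing the remainder in a second send() is exactly the
           * case where your tests show the remote application first
           * sees a truncated request. */
          ssize_t n = send(fd, req + ret, len - ret, 0);
          if (n < 0)
              return -1;
          ret += n;
      }
      return 0;
  }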

> I moved the TFO client code back into proto_tcp.c,  this made it easier to
> fallback to regular tcp connection if it is going to be unwise to attempt TFO
> for this request. Trying to catch this in raw_sock was too tricky for me.

OK I understand the difficulty. That said we'll have to put it back there
anyway because now it breaks SSL. The TCP layer must not know the buffers,
which are known only by the connection which links the two together. For
example, among the side effects, the PROXY protocol header could be sent
*after* the beginning of the data.

> I've also added an additional tfo-max-rs parameter to set a limit on the
> maximum request size for which to attempt TFO. If not supplied then uses a
> default of 1420 bytes to allow for headers and tcp options. Looks like it
> might be possible to replace this with a lookup to find out the cached mss
> size for each cached TFO cookie but I did not attempt that yet.

I think we'll have to bring this to [email protected] to know if the
behaviour you're facing is normal. Normally an application should not have
to know the MSS, MTU or such parameters which TCP is responsible for.
Therefore we'll know if we really have to handle this ourselves or if we
just need to use this as a workaround for bogus kernels (just like the
splice workaround we already have).
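
For what it's worth, once a connection is established the negotiated
MSS can be read back with getsockopt(TCP_MAXSEG), but before the
handshake completes it only reports a default value, so it does not
help to size the very first TFO request; and whether the MSS cached
next to the TFO cookie is reachable from userspace is exactly the kind
of question for that list. A quick untested sketch:

  #include <stdio.h>
  #include <sys/socket.h>
  #include <netinet/in.h>
  #include <netinet/tcp.h>

  /* only meaningful on a connected socket <fd> */
  int mss = 0;
  socklen_t optlen = sizeof(mss);

  if (getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &mss, &optlen) == 0)
      printf("negotiated MSS: %d\n", mss);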

> I've tested this patch with debian wheezy using kernel 3.12 from 
> wheezy-backports.  
>  On the client side, set net.ipv4.tcp_fastopen = 1
>  Server side, set net.ipv4.tcp_fastopen = 2
> 
> I'd say this is still Experimental, approach with caution?

Hey that's really great because at least you got it working. My earlier tests
"used to" work for a while then stopped, probably because of the bogus kernel
patches I backported at this time.

I'm realizing that it's sad that openssl uses write() to send data, because
we won't be able to use TFO with it. Maybe the kernel's TFO API could be
improved so that we only *prepare* the socket for a fastopen then later send
the data (eg: sendto(MSG_MORE|MSG_FASTOPEN) + send/write/whatever).
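
To illustrate the constraint (rough sketch, not haproxy code, reusing
the MSG_FASTOPEN define from the sketch above): today the data has to
be handed to the very call which initiates the connection, which a
write()-based library simply cannot do:

  #include <sys/socket.h>
  #include <netinet/in.h>

  /* Path A: what any write()-based library (openssl included) ends up
   * doing. The handshake is already finished by the time the data is
   * written, so nothing rides on the SYN. */
  static void connect_then_write(int fd, struct sockaddr_in *dst,
                                 const char *buf, size_t len)
  {
      connect(fd, (struct sockaddr *)dst, sizeof(*dst));
      send(fd, buf, len, 0);
  }

  /* Path B: what TFO currently requires. */
  static void connect_with_data(int fd, struct sockaddr_in *dst,
                                const char *buf, size_t len)
  {
      sendto(fd, buf, len, MSG_FASTOPEN,
             (struct sockaddr *)dst, sizeof(*dst));
  }

  /* The variant I'd like to see (it does not exist today, it's only an
   * idea): arm the socket with something like
   * sendto(fd, NULL, 0, MSG_MORE | MSG_FASTOPEN, ...), then let any
   * later send()/write() provide the data. */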

> Patch against 1.5-dev22 [RELEASE] follows

Just a comment below (not directly related to TFO) :

>  #if defined(TCP_FASTOPEN)
>       if (listener->options & LI_O_TCP_FO) {
>               /* TFO needs a queue length, let's use the configured backlog */
> -             int qlen = listener->backlog ? listener->backlog : listener->maxconn;
> +             /* Safer to fallback to maxaccept, not maxconn? */
> +             int qlen = listener->backlog ? listener->backlog : listener->maxaccept;

This change is wrong, the default backlog must really be maxconn and is
documented as such. Maxaccept is only the number of consecutive accept()
we're willing to perform in a row. It only aims at reducing processing latency,
and using such a small value as the backlog would totally kill performance.
Some sites already require backlogs > 1000 while this value can be even
smaller than 10 in multi-process mode.
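
For reference, at the socket level the listener side boils down to
something like this (sketch, variable names only illustrative), which
is why the queue length should track the backlog and default to
maxconn:

  #include <sys/socket.h>
  #include <netinet/in.h>
  #include <netinet/tcp.h>

  /* qlen bounds the number of pending TFO requests the kernel will
   * queue on this listener, so it has to be sized like the backlog. */
  int qlen = backlog ? backlog : maxconn;
  setsockopt(fd, IPPROTO_TCP, TCP_FASTOPEN, &qlen, sizeof(qlen));
  listen(fd, backlog ? backlog : maxconn);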

That said, I think it would be nice if some users could test your patch
and report their success or failures. TFO is still not very common and I'm
pretty sure that a number of issues are still to be discovered. The earlier
the better. I think that your patch should work correctly provided that :
  - there is no SSL to the server
  - there is no "send-proxy" on the server line

For health checks, I don't know the impact; I expect them to work (within
the aforementioned limitations).

Thanks!
Willy

