On Tue, 13 Mar 2018, Eric Dumazet wrote:

> On Tue, Mar 13, 2018 at 3:25 AM, Ilpo Järvinen
> <ilpo.jarvi...@helsinki.fi> wrote:
> > Here is a series of fixes to issues that occur when SACK is not
> > enabled for a TCP connection. These are not fixes to just some
> > remote corner cases of recovery but many/almost all recoveries
> > without SACK will be impacted by one (or even more than one) of
> > them. The sender-side changes (1-4) are not mainly, if any, to
> > improve performance (throughput) but address correctness
> > (congestion control responses should not get incorrectly
> > reverted) and burstiness (that may cause additional problems
> > later as some of the packets in such bursts may get dropped
> > needing again to resort to loss recovery that is likely
> > similarly bursty).
> 
> GRO (or similar) adoption a long time ago (not only by linux) had a
> serious impact on non SACK flow.
> Should we also disable GRO by default ?
> (my answer is no, just in case someone wonders)
>
> I am sorry, but I am still not convinced by your patches, trying to
> give a wrong incentive to people keeping their prehistoric stacks,
> that have serious problems on wifi anyway, and or middle boxes
> "aggregating/compressing ACKS"

So now you think that I'm against high perf enhancements (even after 
having written some bits of them)?

However, I think that in case somebody disables something that is enabled 
by default, be it SACK, GRO, timestamps, etc., TCP he/she will get should 
still make sense (and that regardless of some middleboxes trying hard to 
break otherwise working stuff). But I guess you don't agree here(?) and 
would perhaps even want to remove ability to disable them? Although 
obviously that wouldn't even work with non-SACK (unless RST or so starts 
to get used but even that could be worked-around unfortunately).

Also, I'm somewhat confused your position here. On one hand you said that
tests should be added to regression tests to avoid breaking these non-SACK
cases again implying that things should be fixed and on the other hand you 
seem to say that non-SACK must be left broken?

> The last patch is particularly problematic in my opinion, you have not
> provided packetdrill tests to prove there was no danger.

Can that even be done? Proving absence of danger with packetdrill?
AFAIK that kind of proofs are available only with formal verification.
 
> It seems you leave to us the task of dealing with possible issues,
> only added a bold changelog.

Heh, I tried to add details to the changelog because you explicitly said
I should and now you fault me on doing that :-). Also, you're leaving out 
the detail that I tried to understand the condition that worried you and
found out that the code already disallows any shrinking of advertized 
window for duplicate ACKs (but I guess there just might have been some 
miscommunication between us given that your last reply to 5/5
v1 made no sense to me).

> Since the bugs you claim to fix are at least 10 years old, and your
> fixes are far from being trivial,
> please wait our packetdrill regression tests being published.
> This should happen in less than one month I believe, in time for linux-4.17
> 
> Note that the publication of the updated packetdrill and test suite is
> not an easy task,
> since we accumulated a lot of hacks both in kernel to cope with timer
> slacks and in packetdrill
> for various experimental or private features that are not yet in
> upstream kernels.
> 
> So we are cleaning all this, and will be happy to help you if you need.

I slightly disagree on the trivial bit here as I think it's trivial to 
see the changes only affect non-SACK flows (of which you seem to say you 
don't care if they're broken as that gives incentive to use SACK). But 
then I'm not too worried about waiting a month.


-- 
 i.

Reply via email to