Date: Mon, 10 May 1999 08:02:12 +0200
   From: Andi Kleen <[EMAIL PROTECTED]>

   I agree with that, but a few comments:
   - it doesn't handle the case properly when the new pmtu is < 0.5*old_pmtu,
   in this case it still eats a time out because it doesn't retry until 
   all tcp_fragment()s have finished their job. 

Hmmm... can you paint a small picture for me so I can visualize how
this can happen?  I can't see it at the moment.

   - in some cases it can cause quite some packet flood. the new code is 
   even more aggressive than the old one or fast retransmit. would it
   make sense to restrict it to only kick a few packets? 

By the same argument the old code produced a flood too, of
approximately half the size.  And ultimately, the old code generated
_more_ traffic since _everything_ was resent when we hit the timeout
(which, because the frags were never sent, is guaranteed).

In fact, I do not believe we will flood as much as you think.
Here is my analysis:

1) We send out the first packet which exceeds the path MTU of
   some hop along the path, call this event 'A'

2) Consider, in an attempt to generate the worst possible case,
   that the rest of the window followed the packet in 'A' and
   that all of those subsequent packets fit in the MTU.  Thus we
   sent them all, call this event 'B'

3) At the point in which the PMTU message arrives, assume the
   rest of the window has been transmitted already.

4) We resend at this moment, in response to the PMTU event,
   all packets of length > PMTU, and all fragments created
   by the chopping up we needed to do.

   As a side effect of this, we will merge in subsequent packets
   which can be coalesced with the smaller frags we make.  This
   can cause a chain reaction of further coalescing.

   In the worst possible case, we will resend the whole window
   of data.  This can only happen _iff_:

   1) Each post-pmtu-frag packet has not been acknowledged by
      SACK data.

   2) The length of every pmtu-frag packet plus the length of
      the next packet in the queue does not exceed the total
      space available in the pmtu-frag's SKB.
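
As a rough illustration of step 4 and the two worst-case conditions,
here is a toy model in Python.  It is only a sketch, not the kernel
code: the function name resend_on_pmtu, the (length, sacked) tuples,
and the use of the new MSS as the "space available" in a frag's SKB
are all assumptions made for the example.

```python
def resend_on_pmtu(queue, mss):
    """Toy model of the resend done in response to a PMTU event.

    queue: list of (length, sacked) tuples in sequence order
           (hypothetical representation of the retransmit queue).
    mss:   payload size that fits the new path MTU; also used here
           as the room available for coalescing (an assumption).
    Returns the lengths of the packets that get resent.
    """
    resent = []
    tail = None  # index in `resent` of a short frag that can still absorb data
    for length, sacked in queue:
        if sacked:
            # A SACKed packet is never resent and breaks the chain.
            tail = None
            continue
        if tail is not None and resent[tail] + length <= mss:
            # Coalesce the next packet into the short frag (chain reaction).
            resent[tail] += length
            continue
        tail = None
        if length > mss:
            # Chop the oversized packet into frags and resend all of them.
            full, rest = divmod(length, mss)
            resent.extend([mss] * full)
            if rest:
                resent.append(rest)
                tail = len(resent) - 1
        # A packet that fits and was not coalesced is left alone.
    return resent
```

In this model the whole window is resent only when every packet
after the oversized one is un-SACKed and fits into the trailing
frag, e.g. resend_on_pmtu([(1460, False), (100, False), (100, False)],
1000) resends 1000 + 660 bytes.  Marking the first 100-byte packet as
SACKed stops the chain and only the 1000 and 460 byte frags go out.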

Now, the common case in which a PMTU event is triggered is when we
begin sending bulk data in full-sized frames, usually a stream of them.
There will be no "floods" in these cases, we will resend exactly what
could not have reached the other end.

SACK information, when present, puts a hard cap on the worst possible
case, because if we have the "> PMTU packet, many < PMTU packets"
case, the SACKs will come back and keep us from doing anything stupid.

I claim that the new code is no worse than the old code.

Remember, what we really want this code to do is recover quickly from
the single PMTU event case.  And I believe it does so satisfactorily
right now.

Later,
David S. Miller
[EMAIL PROTECTED]