Date: Mon, 10 May 1999 08:02:12 +0200
From: Andi Kleen <[EMAIL PROTECTED]>
> I agree with that, but a few comments:
> - it doesn't handle the case properly when the new pmtu is < 0.5*old_pmtu,
> in this case it still eats a time out because it doesn't retry until
> all tcp_fragment()s have finished their job.
Hmmm... can you paint a small picture for me so I can visualize how
this can happen? I can't see it at the moment.
> - in some cases it can cause quite some packet flood. the new code is
> even more aggressive than the old one or fast retransmit. would it
> make sense to restrict it to only kick a few packets?
By the same argument the old code produced a flood too, of
approximately half the size. And ultimately, the old code generated
_more_ traffic since _everything_ was resent when we hit the timeout
(which, due to the frags not being sent, is guaranteed).
In fact I do not believe we will flood as much as you believe.
Here is my analysis:
1) When we send out the first packet which exceeds the path MTU
along the hop, call this event 'A'
2) Consider, in an attempt to generate the worst possible case,
that the rest of the window follows the packet in 'A' and
that all of those subsequent packets fit in the MTU. Thus we
sent them all; call this event 'B'
3) At the point in which the PMTU message arrives, assume the
rest of the window has been transmitted already.
4) We resend at this moment, in response to the PMTU event,
all packets of length > PMTU, and all fragments created
by the chopping up we needed to do.
As a side effect of this, we will merge in subsequent packets
which can be coalesced with the smaller frags we make. This
can cause a chain reaction of further coalescing.
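
To make steps 1-4 concrete, here is a toy model of the retransmit
queue at the moment the PMTU message arrives. It is only a sketch, not
the kernel code: the name on_pmtu_event is made up for illustration,
segments are reduced to bare payload lengths, and the space available
in a fragment's SKB is assumed to be exactly the new PMTU.

```python
def on_pmtu_event(queue, pmtu):
    """Toy model of the resend triggered by a PMTU message.

    queue: segment lengths in bytes, oldest first (hypothetical model).
    pmtu:  largest segment length that now fits on the path; also used
           as the space available in a fragment's SKB (an assumption).
    Returns (new_queue, resent): the queue after chopping/coalescing,
    and the segments that this event puts back on the wire.
    """
    out = []  # entries are [length, must_resend]
    for length in queue:
        if length > pmtu:
            # Oversized: chop into pmtu-sized pieces, resend every piece.
            while length > 0:
                piece = min(length, pmtu)
                out.append([piece, True])
                length -= piece
        elif out and out[-1][1] and out[-1][0] + length <= pmtu:
            # Fits in the preceding resent fragment: coalesce into it.
            # The merged fragment can absorb the next segment as well,
            # which is the chain reaction.
            out[-1][0] += length
        else:
            # Already fit the old and new MTU and cannot be merged.
            out.append([length, False])
    new_queue = [length for length, _ in out]
    resent = [length for length, flag in out if flag]
    return new_queue, resent
```

With a stream of full sized frames, e.g. on_pmtu_event([1460, 1460,
1460], 1000), only the chopped pieces go out again. With one big frame
followed by small ones, e.g. on_pmtu_event([1460, 100, 100, 100],
1000), the small ones get pulled into the trailing fragment.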
In the worst possible case, we will resend the whole window
of data. This can only happen _iff_:
1) Each post-pmtu-frag packet has not been acknowledged by
SACK data.
2) The length of every pmtu-frag packet plus the length of
the next packet in the queue does not exceed the total
space available in the pmtu-frag's SKB.
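
The two conditions can be written down as a predicate. Again this is
just an illustrative sketch under the same simplifications: the name
whole_window_resent is hypothetical, each queue entry is a (length,
sacked) pair, the first entry is the packet from event 'A', and the
SKB space of a fragment is taken to be the new PMTU.

```python
def whole_window_resent(queue, pmtu):
    """Return True if the PMTU event resends the entire window.

    queue: list of (length, sacked) tuples, oldest first; queue[0] is
           the oversized packet that triggered the PMTU message.
    pmtu:  new path MTU, also assumed to be the fragment SKB space.
    """
    first_len, _ = queue[0]
    if first_len <= pmtu:
        return False  # nothing was oversized, nothing is resent
    # Length of the trailing fragment left after chopping queue[0].
    tail = first_len % pmtu or pmtu
    for length, sacked in queue[1:]:
        if sacked:
            return False  # condition 1 fails: SACK covers this packet
        if length > pmtu:
            # Also oversized: it is chopped and resent in its own right.
            tail = length % pmtu or pmtu
        elif tail + length <= pmtu:
            tail += length  # condition 2 holds: coalesced and resent
        else:
            return False  # condition 2 fails: the chain stops here
    return True
```

For example, [(1460, False), (100, False), (100, False)] with a PMTU of
1000 satisfies both conditions, while marking the middle entry as
SACKed, or making it 900 bytes long, breaks the chain.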
Now, the common case in which a PMTU event is triggered is when we
begin sending bulk data in full sized frames, usually a stream of them.
There will be no "floods" in these cases; we will resend exactly what
could not have reached the other end.
SACK information, when present, puts a hard cap on the worst possible
case, because if we have the "> PMTU packet, many < PMTU packets"
case, the SACKs will come back and keep us from doing anything stupid.
I claim that the new code is no worse than the old code.
Remember, what we really want this code to do is recover quickly from
the single PMTU event case. And I believe it does so satisfactorily
right now.
Later,
David S. Miller
[EMAIL PROTECTED]