Christian,

> On Oct 7, 2016, at 9:57 PM, Christian Huitema <huit...@huitema.net> wrote:
> 
>> On Friday, October 7, 2016 8:48 AM, Fred Templin wrote:
>> ...
>> As per your doc, it is the reassembly buffer size (and not the sizes of any
>> links the tunnel is configured over) that determines the tunnel MTU. So, in
>> this example, the egress could just as well configure a 9KB reassembly
>> buffer size and hence the tunnel would have a 9KB minus ENCAPS MTU.
> 
> Fred, technically, you are correct. But let's go back to the argument 
> developed by Kent and Mogul, "Fragmentation considered harmful." The core of 
> the argument is that if the network loses a fragment, TCP has to retransmit 
> the whole packet. It is always inefficient; if the stars are aligned just 
> wrong, it can be extremely inefficient. In contrast, if there is no 
> fragmentation, the unit of transmission is the unit of control, and TCP 
> performs efficiently.

It is always less efficient than not fragmenting, but the loss issue matters 
only when the probability of losing an entire packet is much lower than the 
probability of losing one or more fragments. If congestion losses are 
correlated (dropping one fragment means dropping the rest of the packet) or 
non-congestion losses are spread out (so the likelihood of losing any one 
fragment of a packet is the same as losing the packet if it were whole), then 
it doesn't matter.
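
A minimal sketch of that comparison, assuming a simple model in which each 
fragment is dropped independently with the same probability. The function 
names and the loss model are illustrative assumptions, not anything taken 
from the draft.

```python
def packet_loss_independent(p_frag: float, n_frags: int) -> float:
    """Packet loss probability when each of n_frags fragments is dropped
    independently with probability p_frag (assumed model). The packet is
    lost if at least one fragment is lost."""
    if not 0.0 <= p_frag <= 1.0:
        raise ValueError("p_frag must be a probability")
    if n_frags < 1:
        raise ValueError("n_frags must be at least 1")
    return 1.0 - (1.0 - p_frag) ** n_frags


def packet_loss_correlated(p_event: float) -> float:
    """Packet loss probability when one loss event drops every fragment of
    the packet together: the fragment count no longer matters."""
    if not 0.0 <= p_event <= 1.0:
        raise ValueError("p_event must be a probability")
    return p_event
```

With independent losses, packet_loss_independent(0.01, 6) is noticeably 
larger than 0.01; with fully correlated losses the fragmented packet is lost 
exactly as often as the whole one would have been.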

> 
> If each link on a path has a fixed MTU, MTU discovery will ensure that 
> packets are not fragmented.
If/when it works. PMTUD depends on ICMP messages that are often blocked. Only 
PLPMTUD will succeed in those cases. 

> If links have a variable MTU, you will want MTU discovery to discover the MTU 
> "of the moment," so as to ensure TCP efficiency.

Yes, and any variation in the IP path can cause that change in path MTU.

> Tunnels are links with variable MTU. A change in route could cause the tunnel 
> MTU to drop from 9K to 1.5K. 

Not necessarily. A tunnel with static endpoints would not have a variable MTU 
because the MTU of the tunnel (not inside the tunnel, which is not visible to 
the transiting packet) is determined by the reassembly limit, not the path 
inside the tunnel.
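
As a rough sketch of that rule: the tunnel MTU depends only on the egress 
reassembly limit and the encapsulation overhead. The function and parameter 
names are hypothetical; the path MTU inside the tunnel is deliberately not 
an input.

```python
def tunnel_mtu(egress_reassembly_limit: int, encaps_overhead: int) -> int:
    """Largest transiting packet the tunnel can carry: whatever the egress
    can reassemble, minus the encapsulation headers added by the ingress.
    The MTU of the path inside the tunnel plays no part here."""
    if encaps_overhead < 0 or egress_reassembly_limit <= encaps_overhead:
        raise ValueError("reassembly limit must exceed encapsulation overhead")
    return egress_reassembly_limit - encaps_overhead
```

For example, assuming a 9000-byte reassembly limit and 40 bytes of 
encapsulation, the tunnel MTU is 8960 bytes whether the route inside the 
tunnel currently supports 9K or 1.5K packets.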

> 
> The two classic strategies for tunnel MTU are to ensure a lowest common 
> denominator, or to use a reassembly buffer and go for some high MTU size.

You are assuming it is critical to avoid fragmentation inside the tunnel. There 
are links that almost never avoid this (ATM). It's not critical unless the 
losses are dominant inside the tunnel and are impacted by the fragmentation 
(e.g., fragment losses result in higher packet losses than there would have 
been with unfragmented packets, as described above).


> Neither is very satisfactory. The lowest common denominator approach foregoes 
> the advantage of large MTU when they are actually available. The reassembly 
> approach relies on intermediate fragmentation, with the corresponding 
> retransmission inefficiency. 

Many links live with this inefficiency. It's provably impossible to avoid this 
for IP over IP because of minimum MTU requirements.
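
The arithmetic behind this can be sketched as follows, assuming IPv6-in-IPv6 
with no extension headers. The 1280-byte minimum link MTU and 40-byte header 
come from the IPv6 specification; the function name is hypothetical.

```python
IPV6_MIN_MTU = 1280     # minimum link MTU every IPv6 link must support
IPV6_HEADER_LEN = 40    # fixed IPv6 header, no extension headers assumed


def must_fragment(inner_len: int, path_mtu: int,
                  encaps_overhead: int = IPV6_HEADER_LEN) -> bool:
    """True if the encapsulated packet no longer fits the path inside the
    tunnel, so the ingress has to fragment it."""
    return inner_len + encaps_overhead > path_mtu
```

A tunnel must itself carry 1280-byte packets, but a 1280-byte packet becomes 
1320 bytes after encapsulation, so must_fragment(1280, 1280) is true on any 
path that offers only the minimum MTU.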

> 
> My preferred solution would have the tunnel endpoints perform MTU discovery.

Sure - to guide the ingress source fragmentation (or not) of encapsulated 
packets.

> They could track the tunnel MTU as it evolves over time, send ICMP "too big" 
> messages when packets are too long, and effectively enable end-to-end MTU 
> discovery.

They should perform PLPMTUD with the egress to figure out both the path MTU 
inside the tunnel and the egress reassembly limit - only the latter determines 
the tunnel MTU.

That tunnel MTU should be used by the node where the tunnel attaches to decide 
what to do - whether to fragment, whether to send an ICMP too-big (which will 
likely be dropped) or whatever. It is never the job of the tunnel to send these 
messages, though. The tunnel ingress is directly equivalent to a network 
interface and its MTU value.
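
A minimal sketch of that division of labor, treating the tunnel as just 
another interface with an MTU. The names and the may_fragment flag are 
hypothetical (for instance, an IPv4 packet with DF clear); this illustrates 
the decision, not code from the draft.

```python
from enum import Enum


class Action(Enum):
    FORWARD = "forward"
    FRAGMENT = "fragment"
    TOO_BIG = "send ICMP too-big"


def handle_at_attachment(packet_len: int, tunnel_mtu: int,
                         may_fragment: bool) -> Action:
    """Decision made by the node the tunnel attaches to, exactly as it
    would for any other interface; the tunnel only exposes its MTU."""
    if packet_len <= tunnel_mtu:
        return Action.FORWARD
    if may_fragment:
        return Action.FRAGMENT
    return Action.TOO_BIG
```

The tunnel ingress never appears as an actor here: it contributes only the 
tunnel_mtu value, and the attached node chooses the action.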

All of this is already in the draft, FWIW.

Joe


> 
> -- Christian Huitema
> 
> 
> 

_______________________________________________
Int-area mailing list
Int-area@ietf.org
https://www.ietf.org/mailman/listinfo/int-area
