Christian,

> On Oct 7, 2016, at 9:57 PM, Christian Huitema <huit...@huitema.net> wrote:
>
>> On Friday, October 7, 2016 8:48 AM, Fred Templin wrote:
>> ...
>> As per your doc, it is the reassembly buffer size (and not the sizes of any
>> links the tunnel is configured over) that determines the tunnel MTU. So, in
>> this example, the egress could just as well configure a 9KB reassembly
>> buffer size and hence the tunnel would have a 9KB minus ENCAPS MTU.
>
> Fred, technically, you are correct. But let's go back to the argument
> developed by Kent and Mogul, "Fragmentation considered harmful." The core of
> the argument is that if the network loses a fragment, TCP has to retransmit
> the whole packet. It is always inefficient; if the stars are aligned just
> wrong, it can be extremely inefficient. In contrast, if there is no
> fragmentation, the unit of transmission is the unit of control, and TCP
> performs efficiently.
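To put rough numbers on the quoted argument, here is a minimal sketch. It assumes each fragment is lost independently with the same probability, which is only one possible loss model; the function name and the sample values are illustrative and do not come from the thread or the draft.

```python
def packet_loss_probability(p_fragment: float, n_fragments: int) -> float:
    """Chance that a packet is lost when losing any one fragment loses the packet.

    Assumes independent, identically likely fragment losses (an assumption
    made for illustration only).
    """
    return 1.0 - (1.0 - p_fragment) ** n_fragments


if __name__ == "__main__":
    p = 0.01  # assumed per-fragment loss probability
    for n in (1, 2, 6):  # e.g. an unfragmented packet vs. 2 or 6 fragments
        print(f"{n} fragment(s): packet loss probability "
              f"{packet_loss_probability(p, n):.4f}")
```

Under this independence assumption the whole-packet loss rate grows with the fragment count. Under other loss models, such as strongly correlated drops, the gap can shrink or disappear.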
It is always less efficient than not fragmenting, but the loss issue matters only when the probability of losing an entire packet is much lower than the probability of losing one or more fragments. If congestion losses are correlated (dropping one fragment means dropping the rest of the packet) or non-congestion losses are spread out (so the likelihood of losing any fragment of a packet is the same as that of losing the packet if it were whole), then it doesn't matter.

>
> If each link on a path has a fixed MTU, MTU discovery will ensure that
> packets are not fragmented.

If/when it works. PMTUD depends on ICMPs that are often blocked. Only PLPMTUD will succeed in those cases.

> If links have a variable MTU, you will want MTU discovery to discover the MTU
> "of the moment," so as to ensure TCP efficiency.

Yes, and any variation in the IP path can cause that change in path MTU.

> Tunnels are links with variable MTU. A change in route could cause the tunnel
> MTU to drop from 9K to 1.5K.

Not necessarily. A tunnel with static endpoints would not have a variable MTU because the MTU of the tunnel (not inside the tunnel, which is not visible to the transiting packet) is determined by the reassembly limit, not the path inside the tunnel.

>
> The two classic strategies for tunnel MTU is to ensure a lowest common
> denominator, or to use a reassembly buffer and go for some high MTU size.

You are assuming it is critical to avoid fragmentation inside the tunnel. There are links that almost never avoid this (ATM). It's not critical unless the losses are dominant inside the tunnel and are impacted by the fragmentation (e.g., fragment losses result in higher packet losses than there would have been with unfragmented packets, as described above).

> Neither is very satisfactory. The lowest common denominator approach foregoes
> the advantage of large MTU when they are actually available.
> The reassembly approach relies on intermediate fragmentation, with the
> corresponding retransmission inefficiency.

Many links live with this inefficiency. It's provably impossible to avoid this for IP over IP because of minimum MTU requirements.

>
> My preferred solution would have the tunnel endpoints perform MTU discovery.

Sure - to guide the ingress source fragmentation (or not) of encapsulated packets.

> They could track the tunnel MTU as it evolves over time, send ICMP "too big"
> messages when packets are too long, and effectively enable end-to-end MTU
> discovery.

They should perform PLPMTUD with the egress to figure out both the path MTU inside the tunnel and the egress reassembly limit - only the latter determines the tunnel MTU.

That tunnel MTU should be used by the node where the tunnel attaches to decide what to do - whether to fragment, whether to send an ICMP too-big (which will likely be dropped), or whatever.

It is never the job of the tunnel to send these messages, though. The tunnel ingress is directly equivalent to a network interface and its MTU value.

All of this is already in the draft, FWIW.

Joe

>
> -- Christian Huitema
>
>

_______________________________________________
Int-area mailing list
Int-area@ietf.org
https://www.ietf.org/mailman/listinfo/int-area
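The relationship argued in the reply above is that the egress reassembly limit, not the path inside the tunnel, sets the tunnel MTU. It can be sketched as follows. This is only an illustration under stated assumptions: the function names, the 40-byte encapsulation overhead, and the sample sizes are hypothetical and are not taken from the draft.

```python
def tunnel_mtu(egress_reassembly_limit: int, encaps_overhead: int) -> int:
    """Largest inner packet the tunnel can carry: reassembly limit minus ENCAPS."""
    return egress_reassembly_limit - encaps_overhead


def ingress_action(inner_len: int, egress_reassembly_limit: int,
                   path_mtu_inside: int, encaps_overhead: int) -> str:
    """Classify an inner packet arriving at the tunnel ingress.

    The path MTU inside the tunnel only decides whether the encapsulated
    packet must be fragmented; it does not change the tunnel MTU.
    """
    if inner_len > tunnel_mtu(egress_reassembly_limit, encaps_overhead):
        # Too large for the interface: the node the tunnel attaches to decides
        # what to do, exactly as it would for any other network interface.
        return "exceeds tunnel MTU"
    if inner_len + encaps_overhead > path_mtu_inside:
        return "encapsulate, then fragment inside the tunnel"
    return "encapsulate and send whole"


if __name__ == "__main__":
    limit, path_mtu, overhead = 9000, 1500, 40  # all assumed values
    print("tunnel MTU:", tunnel_mtu(limit, overhead))
    for size in (1400, 1500, 9000):
        print(size, "->", ingress_action(size, limit, path_mtu, overhead))
```

In this sketch, a route change that lowers `path_mtu_inside` changes how often the ingress fragments, but leaves the value returned by `tunnel_mtu` unchanged.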