On Friday, October 7, 2016 8:48 AM, Fred Templin wrote:
> ...
> As per your doc, it is the reassembly buffer size (and not the sizes of any
> links the tunnel is configured over) that determines the tunnel MTU. So, in
> this example, the egress could just as well configure a 9KB reassembly buffer
> size and hence the tunnel would have a 9KB minus ENCAPS MTU.

Fred, technically, you are correct. But let's go back to the argument developed by Kent and Mogul in "Fragmentation Considered Harmful." The core of the argument is that if the network loses a single fragment, TCP has to retransmit the whole packet. That is always inefficient; if the stars are aligned just wrong, it can be extremely inefficient. In contrast, if there is no fragmentation, the unit of transmission is the unit of control, and TCP performs efficiently.

If each link on a path has a fixed MTU, MTU discovery will ensure that packets are not fragmented. If links have a variable MTU, you will want MTU discovery to discover the MTU "of the moment," so as to ensure TCP efficiency.

Tunnels are links with variable MTU. A change in route could cause the tunnel MTU to drop from 9K to 1.5K. The two classic strategies for tunnel MTU are to settle on a lowest common denominator, or to use a reassembly buffer and go for some high MTU size. Neither is very satisfactory. The lowest common denominator approach forgoes the advantage of large MTUs when they are actually available. The reassembly approach relies on intermediate fragmentation, with the corresponding retransmission inefficiency.

My preferred solution would have the tunnel endpoints perform MTU discovery. They could track the tunnel MTU as it evolves over time, send ICMP "too big" messages when packets are too long, and effectively enable end-to-end MTU discovery.

-- Christian Huitema

_______________________________________________
Int-area mailing list
Int-area@ietf.org
https://www.ietf.org/mailman/listinfo/int-area
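
A minimal sketch, in Python, of the endpoint behaviour proposed above: the tunnel ingress tracks the MTU of the current path to the egress and answers oversized inner packets with a "too big" indication instead of fragmenting. The class name `TunnelIngress`, its methods, and the 40-byte `ENCAPS_OVERHEAD` are all hypothetical, chosen only for illustration; the 9000 and 1500 byte values stand in for the "9K" and "1.5K" of the example. How the endpoint actually learns the path MTU (probing, ICMP from the underlying path) is left out.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumption: encapsulation adds a fixed 40-byte outer header.
ENCAPS_OVERHEAD = 40


@dataclass
class TunnelIngress:
    """Hypothetical tunnel ingress that tracks the current path MTU to the egress."""

    path_mtu: int  # MTU currently discovered between the two tunnel endpoints

    @property
    def tunnel_mtu(self) -> int:
        # Largest inner packet that fits in one unfragmented outer packet.
        return self.path_mtu - ENCAPS_OVERHEAD

    def on_path_mtu_change(self, new_path_mtu: int) -> None:
        # Called when endpoint MTU discovery learns a new value,
        # for example after a route change.
        self.path_mtu = new_path_mtu

    def forward(self, inner_len: int) -> Tuple[str, int]:
        # Returns ("send", outer packet length) when the packet fits, or
        # ("too_big", MTU to report back to the sender) when it does not.
        if inner_len > self.tunnel_mtu:
            return ("too_big", self.tunnel_mtu)
        return ("send", inner_len + ENCAPS_OVERHEAD)


ingress = TunnelIngress(path_mtu=9000)
print(ingress.forward(8000))      # fits while the path offers 9000 bytes
ingress.on_path_mtu_change(1500)  # route change drops the path MTU
print(ingress.forward(8000))      # now rejected, sender is told the new MTU
```

Because the ingress reports its current tunnel MTU to the sender, the sender's own MTU discovery converges on a size that crosses the tunnel unfragmented, which is the end-to-end effect described above.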