Joe,

On May 29, 2013, at 4:31 PM, Joe Touch <[email protected]> wrote:
> On 5/29/2013 1:06 PM, Linda Dunbar wrote:
>> Ron,
>>
>> I do have a few questions and suggestions about the practices
>> documented in the draft:
>>
>> - Section 4.1, second paragraph: why the DF bit "MUST" be set to 1
>> when the payload header has "0"? I would think the default should be
>> the same as the "payload" DF setting.
>
> DF in a tunnel header should be set to match the capability of that
> tunnel mechanism. Either the tunnel wants to find the MTU and match it
> or not; in the former case, it has to set DF=1 all the time.
>
> For this doc, whether to set or copy the DF bit depends on whether
> PMTUD is supported on a given tunnel. If it is (whether PMTUD or
> PLMTUD), then set DF; otherwise, I agree - copy DF.
>
>> - Section 5.1, last sentence: "The GRE egress router will forward the
>> payload fragments to their ultimate destination where they will be
>> reassembled"
>>
>> The GRE egress router should reassemble the fragmented payloads
>> before sending to the destination. In general, the network outside
>> the Egress router should be the same as the network before entering
>> the Ingress router. If the data frame size is less than the MTU of
>> the network before entering the Ingress router, the decapsulated
>> frame size should be less than the MTU of the network outside the
>> Egress router.
>
> I agree; my general recommendation has always been "the egress should
> always clean up any mess created by an ingress" - which means using
> "outer" fragmentation rather than "inner".

Why do you characterize this as a "mess"?

Thinking of the tunnel as a logical link with its own logical link MTU
(LMTU), where this link is unwilling/incapable of performing
fragmentation and reassembly at its "data-link", the appropriate
behavior would be:

* If the incoming IPv4 datagram has DF=1, drop and send a PTB back; and
* if the incoming IPv4 datagram has DF=0, then fragment it and send it
  over the link.

What's different?

> Yes, this can cause repeated frag/reassembly, but the alternative is
> to shift work to the end host, which I think is inappropriate.

I do not understand why it is "inappropriate" -- when thinking of the
tunnel as a link.

There is also a potential mischaracterization when you say "shift work
to the end host", because that can lead to assumptions that there is "a
single" end host (singular). In the case of a p2p tunnel, the challenge
is that there is a single pair of endpoints in the tunnel, but a
multitude of hosts behind and before them. A different view would be
that it's more efficient to distribute reassembly to many endpoints
instead of attacking the tunnel tailend and making it work on behalf of
many hosts.

Thanks,

-- Carlos.

> Further, this sort of clean-up is required by IPv6 and for IPv4 when
> DF=1 on the inner packet anyway.
>
>> In addition, since it is the Ingress node that split the frame, the
>> final destination may not be aware of how to reassemble the data
>> frame.
>
> This is a spurious point; the information is there if the inner packet
> was fragmentable.
>
>> - There should be two options when the encapsulated data frame
>> exceeds MTU:
>>   a) split the data frame into two smaller frames, with each frame
>>      being encapsulated;
>>   b) truncate the frame if the Egress node needs to receive the data
>>      frame but doesn't have the capability to reassemble split frames.
>
> There are two viable options:
>
> - fragment and reassemble (same as your option "a)" above)
> - drop the packet (and presumably send a signal upstream)
>
> Sending a truncated frame serves no useful purpose.
>
> Joe
>
> _______________________________________________
> Int-area mailing list
> [email protected]
> https://www.ietf.org/mailman/listinfo/int-area
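For reference, the two rules discussed in this thread can be written down as a small decision sketch. This is only an illustration of the positions stated above, not text from the draft: the names `InnerPacket`, `outer_df`, `ingress_action`, and `tunnel_mtu` are hypothetical, and `tunnel_mtu` is assumed to mean the largest inner datagram the tunnel can carry without fragmentation (the "LMTU" in the tunnel-as-link view). For IPv4, the PTB signal is the ICMP Destination Unreachable, Fragmentation Needed message.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    FORWARD = "forward unchanged"
    FRAGMENT = "fragment inner datagram, then encapsulate"
    DROP_AND_SEND_PTB = "drop and send PTB to the source"


@dataclass
class InnerPacket:
    length: int  # total length of the inner IPv4 datagram, in bytes
    df: bool     # Don't Fragment bit from the inner IPv4 header


def outer_df(inner: InnerPacket, tunnel_does_pmtud: bool) -> bool:
    """DF policy for the tunnel (outer) header, as described in the thread.

    If the tunnel runs PMTUD or PLMTUD it sets DF on every packet;
    otherwise it copies the DF bit from the inner header.
    """
    return True if tunnel_does_pmtud else inner.df


def ingress_action(inner: InnerPacket, tunnel_mtu: int) -> Action:
    """Tunnel-as-link behavior when the link itself does not fragment.

    tunnel_mtu is an assumed parameter: the largest inner datagram
    that fits in the tunnel after encapsulation overhead.
    """
    if inner.length <= tunnel_mtu:
        return Action.FORWARD
    if inner.df:
        return Action.DROP_AND_SEND_PTB
    return Action.FRAGMENT
```

With an assumed `tunnel_mtu` of 1476, a 1500-byte inner datagram with DF=0 yields `Action.FRAGMENT`, while the same datagram with DF=1 yields `Action.DROP_AND_SEND_PTB`. The alternative argued for in the thread (outer fragmentation with reassembly at the egress) is not modeled here.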
