Hey All,

I took a quick peek at draft-bonica-intarea-gre-mtu-03.txt. This is very interesting, as we have just been dealing with GRE fragmentation issues over a link to a new peer. I'll probably be bringing up some old arguments here (sorry), but here are my observations:
1) From my reading, the draft doesn't really deal with large packets (size exceeding LMTU) that have the DF bit set and come from a host that doesn't support PMTUD itself and/or for whatever reason can't receive ICMP type 3, code 4 (fragmentation needed). It would appear to call for just dropping the traffic. Perhaps I'm misreading 2.2, but for non-default operation I see:

   a. If a large, non-fragmentable packet is received, drop it and send ICMP type 3, code 4 to the host.
   b. If a large, fragmentable packet is received and the transport from ingress to egress is IPv4, encapsulate, then fragment and transmit.
   c. Otherwise, if a large, fragmentable packet is received and the transport is not IPv4, fragment the packet first, then encapsulate.

   Please feel free to correct me if I've missed something.

2) Unfortunately, if 1) is true and this draft is implemented as written, we'd be dooming a lot of communication to a black hole, which is bad, IMHO.

3) Egress reassembly works as a good solution most of the time. The flip side, of course, is that it can put a heavy burden on the egress router, as has been pointed out. A potential compromise would be to implement egress reassembly only if the DF bit is set, and otherwise just fragment prior to encapsulation. Unfortunately this departs from the intended aim of the RFC, since it is not implemented in code (AFAIK) and so doesn't represent currently available best practice. I also can't say exactly how much traffic would fall into either category, or how much it would actually help.

4) We found that TCP MSS clamping on the tunnel interfaces was absolutely vital to getting fairly reliable communication and avoiding issues. I feel it should be included as best practice if you're going to have to deal with a TMTU of less than 1500 bytes. It would be my number one piece of advice anyway.

5) The best best practice we found is simply to ensure you can get a PMTU between ingress and egress of 1524 bytes or more (1500 plus the 24 bytes of outer IPv4 and GRE headers). Reassembly, whether at the egress router or at the client, causes delays in transmission and thus reduced throughput, excessive resource utilization, etc. PMTUD means (small) delays for retransmissions. I think the draft should stress that this is the only real way to make the tunnel transparent.

6) In the last paragraph of 2.1, first sentence, there's a small typo: you compare strategy 1 to itself.

My 0.02.

Thanks,
Joshua Shire
[email protected]
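
P.S. In case it helps the discussion, here is a rough Python sketch of the arithmetic behind points 4) and 5) and of the ingress behavior as I read it in point 1). It is only an illustration, not anything taken from the draft: the function names and action strings are made up, and the header sizes assume GRE over IPv4 with no IP options, no optional GRE fields (key, sequence number, checksum) and no TCP options.

```python
# Assumed header sizes in bytes (basic headers only, no options).
IPV4_HEADER = 20  # outer or inner IPv4 header without options
GRE_HEADER = 4    # GRE header without key/sequence/checksum fields
TCP_HEADER = 20   # TCP header without options


def tunnel_mtu(path_mtu: int) -> int:
    """Largest payload packet that fits in one GRE-over-IPv4 packet."""
    return path_mtu - IPV4_HEADER - GRE_HEADER


def clamped_mss(path_mtu: int) -> int:
    """TCP MSS to clamp to on the tunnel interface (IPv4 payload assumed)."""
    return tunnel_mtu(path_mtu) - IPV4_HEADER - TCP_HEADER


def ingress_action(size: int, df_set: bool, tmtu: int, delivery_is_ipv4: bool) -> str:
    """Non-default ingress behavior as read in point 1); labels are hypothetical."""
    if size <= tmtu:
        return "encapsulate"
    if df_set:
        # Case a: large and non-fragmentable.
        return "drop_and_send_icmp_frag_needed"
    if delivery_is_ipv4:
        # Case b: large, fragmentable, IPv4 between ingress and egress.
        return "encapsulate_then_fragment"
    # Case c: large, fragmentable, non-IPv4 between ingress and egress.
    return "fragment_then_encapsulate"


if __name__ == "__main__":
    for pmtu in (1500, 1524):
        print(pmtu, tunnel_mtu(pmtu), clamped_mss(pmtu))
    print(ingress_action(1500, True, tunnel_mtu(1500), True))
```

With a 1500-byte path between ingress and egress this gives a 1476-byte tunnel MTU and a 1436-byte MSS clamp. With a 1524-byte path the tunnel carries a full 1500-byte packet, which is why point 5) makes the fragmentation question go away.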
