On 2-aug-2007, at 15:30, Joe Touch wrote:
So for IPv4, I don't think copying the DF bit makes sense. This leaves
two options:
1. Do PMTUD for the tunnel
2. Don't do PMTUD for the tunnel
> How can a tunnel do PMTUD if it is:
> a) unidirectional
> b) uses IP or IPsec as encapsulation (4821 describes methods
> for connection-oriented protocols, e.g., TCP and SCTP)
Same as elsewhere: send packets with DF bit set, observe ICMP too bigs.
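For what it's worth, that probe loop is easy to sketch. A toy simulation (my own illustration, not from this thread; RFC 1191-style: send with DF=1, shrink to the MTU echoed in the ICMP error):

```python
# Simulated path: per-hop link MTUs, with a constricting 1476-byte
# (e.g. GRE) hop in the middle. No real sockets involved.
PATH_LINK_MTUS = [1500, 1476, 1500]

def send_df(size):
    """Send `size` bytes with DF=1: return None on delivery, or the
    constricting link's MTU as reported in the ICMP too-big."""
    for link_mtu in PATH_LINK_MTUS:
        if size > link_mtu:
            return link_mtu
    return None

def discover_pmtu(first_hop=1500):
    pmtu = first_hop
    while True:
        reported = send_df(pmtu)
        if reported is None:
            return pmtu          # no too-big: this size fits end to end
        pmtu = reported          # shrink to the reported next-hop MTU

print(discover_pmtu())           # 1476
```

The loop terminates because each ICMP strictly lowers the probe size toward the smallest link MTU on the path.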
> Overall, conventional PMTUD (via ICMP response) SHOULD be supported at
> tunnel endpoints, but even if it is, there's no reason to expect that
> tunnels will receive ICMPs any more than endpoints (i.e., they tend to
> be dropped by firewalls, etc.).
Well, that consideration factors into the decision between option 1 (do
PMTUD) and option 2 (don't). I'm not sure what people use tunnels for.
When I have occasion to use them, obviously the too-bigs aren't
filtered.
> 4821-PMTUD cannot be supported at most tunnel endpoints, AFAICT.
Right, it works at the transport layer.
> tunnel endpoints MUST either:
> 1. set outer DF=0 and allow fragmentation (including at the
> tunnel source)
So far so good...
> 2. set outer DF=1 when their payload fits,
...but this makes no sense at all. The whole point of PMTUD is to
_find_ _out_ whether stuff fits. You can't know that in advance.
> Receipt of a too-big at the tunnel source should not be expected to be
> translated to be sent to the original packet's source;
Not for IPv4. For IPv6, that would be a valid choice but handling
this in the same way as IPv4 would also be fine.
> The primary benefit of receiving such messages is for subsequent
> packets; the tunnel source would decrease its MTU, and then **other**
> packets from that source (or any other source) would correct the
> actions above (#1 would make smaller fragments, #2 would generate
> ICMPs back to the source).
Right. Note that TCP tends to send out two packets at a time, so with
this in effect the first packet will trigger PMTUD in the tunnel, but
by then, the second packet is also on its way, so both packets will
be lost and TCP will probably stall for some time. Then when the
third packet comes, the sending host finally sees the too big.
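A toy timeline of that stall (my own sketch; I'm assuming a 24-byte tunnel header over a 1500-byte path, so full-sized 1500-byte segments no longer fit once encapsulated):

```python
OUTER_HDR, PATH_MTU = 24, 1500   # assumed tunnel overhead, real path MTU

# TCP sends two full-sized segments back to back; both pass the tunnel
# ingress before any ICMP too-big makes it back, so both outer packets
# (1524 bytes, DF=1) are dropped in the network.
in_flight = [1500, 1500]
outcomes = ["lost" if s + OUTER_HDR > PATH_MTU else "delivered"
            for s in in_flight]

tunnel_pmtu = PATH_MTU           # the ICMP too-big now reaches the ingress

# The third segment (a retransmission) no longer fits the ingress's
# cached path MTU, so the sending host finally gets its own too-big.
third = "icmp-to-host" if 1500 + OUTER_HDR > tunnel_pmtu else "delivered"
print(outcomes, third)           # ['lost', 'lost'] icmp-to-host
```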
> These rules apply equally to IPv4 and IPv6; in neither case should
> tunnels fragment the encapsulated packet, IMO.
Why not?
Fragmentation needs to happen in certain cases with IPv4. The only
choice is who is going to reassemble.
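To make that concrete: with outer DF=0 the answer is the tunnel egress, because only the *outer* IPv4 packet is fragmented. A rough sketch of that outer fragmentation (my illustration; assumes option-less 20-byte IPv4 headers):

```python
IPV4_HDR = 20  # bytes, assuming no IP options

def fragment(total_len, link_mtu, hdr=IPV4_HDR):
    """Split an IPv4 packet of total_len bytes so each fragment fits
    link_mtu; all but the last fragment carry a multiple of 8 bytes.
    Returns (offset in 8-byte units, payload bytes) per fragment."""
    payload = total_len - hdr
    per_frag = (link_mtu - hdr) // 8 * 8
    frags, offset = [], 0
    while offset < payload:
        chunk = min(per_frag, payload - offset)
        frags.append((offset // 8, chunk))
        offset += chunk
    return frags

# A 1500-byte inner packet plus a 20-byte outer IPv4 header is a
# 1520-byte outer packet; across a 1500-byte link it becomes two
# outer fragments, reassembled at the tunnel egress rather than at
# the final destination.
print(fragment(1520, 1500))      # [(0, 1480), (185, 20)]
```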
Some choices and the extra headers they allow for:
1492: PPPoE
1480: PPPoE / IPv4
1476: PPPoE / IPv4 / IPv4 + GRE
1472: PPPoE + IPv4 / IPv4 + GRE
1460: PPPoE + IPv4 / IPv4 + GRE / 2 x IPv4 / IPv6
1452: PPPoE + 2 x IPv4 / 2 x IPv4 + GRE / PPPoE + IPv6
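The arithmetic behind the table, for the record: choosing payload MTU X leaves 1500 - X bytes of budget for encapsulation headers. A quick check (assumed header sizes; GRE is 4 bytes base and grows with options, and IPsec changes the numbers entirely):

```python
HEADER = {"PPPoE": 8, "IPv4": 20, "IPv6": 40, "GRE": 4}  # assumed sizes

def fits(mtu, *stack, link_mtu=1500):
    """Does this header stack fit in the budget left by choosing mtu?"""
    return sum(HEADER[h] for h in stack) <= link_mtu - mtu

print(fits(1492, "PPPoE"))            # True:  8 <= 8
print(fits(1476, "IPv4", "GRE"))      # True:  24 <= 24
print(fits(1460, "IPv4", "IPv4"))     # True:  40 <= 40
print(fits(1480, "IPv4", "GRE"))      # False: 24 > 20
```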
> There are many other cases - notably IPsec tunnels, which consume even
> more bytes. Tunnel endpoints may employ header compression, which may
> somewhat compensate for the size inflation. IMO, it's not useful to
> guess these sizes or expected layerings, as the use of layered VPNs
> and overlays is likely to increase over time.
I'm aware that there is a race going on to see who can be the first
to implement 1500 bytes of overhead per packet. Obviously whatever
maximum packet size above 68 bytes a sender of a packet chooses,
there will be some configuration that can't carry packets of that
size. And since datagram based applications can't arbitrarily reduce
their packet size, there will always be _some_ fragmentation. (Or
black holes if people prevent fragmentation from working properly.)
Reducing packet sizes a few percent for applications / transports
that require a one time packet size choice seems like a good idea to
avoid triggering these issues unnecessarily.
_______________________________________________
Int-area mailing list
[email protected]
https://www1.ietf.org/mailman/listinfo/int-area