On Thu, Aug 11, 2016 at 5:53 PM, Alia Atlas <[email protected]> wrote:
> I have been reviewing the three encapsulations and have several comments on
> each.
> One aspect, however, seems to have not been discussed at all and I would
> like to highlight it here.
> This primarily applies to Geneve and GUE (though GUE does add MTU
> fragmentation at the encapsulator as an option), but is also not described
> for VXLAN-GPE although PMTU discovery and setting of the DF bit for IPv4
> packets is required.
>
> I found draft-saum-nvo3-pmtud-over-vxlan-03 to have some useful pieces of
> the problem
> and in specific to articulate the need for the VTEP to translate and relay
> ICMP messages back to the VM so that the MTU can be adjusted.
>
> In draft-ietf-nvo3-geneve-02 Sec 4.1, the use of PMTU discovery is
> recommended.
> Even if a router in the underlay is implementing RFC 4884, this guarantees
> at most 128 bytes of the incoming packet data.  The expectation is that this
> will include all necessary headers.  In this case, that needs to be enough
> for the VTEP to be able to relay and translate the ICMP message back to the
> VM.
>
> However, for Geneve and GUE, there is a real risk that 128 bytes is not
> sufficient to include the encapsulated IP header.  If the underlay is
> IPv6, that takes up 48 bytes for the IPv6 header plus the UDP header.  The
> base Geneve header is 12 bytes and an encapsulated IPv6 header is another
> 40.  That leaves at most 28 bytes for the Geneve options - which is to say
> that at most 3 can fit.  Of course, the situation with IPv4 is better - in
> that case 68 bytes are available for Geneve options.
>
Alia,

There has been a lot of discussion on tunnel MTU and fragmentation
(draft-ietf-intarea-tunnels, RFC 4459). The conclusion seems to be that
support for fragmentation over tunnels is necessary, even though in
most deployments the MTUs can be configured to avoid fragmentation.

For GUE, if PMTUD were run for the tunnel, the ICMP PTB message
could only refer to the tunnel. So a PTB would result in the MTU at
the tunnel ingress point being lowered. Subsequent packets from the
source would then get a PTB from the NVE.

There is another problem with PMTUD, though: with the added headers
of encapsulation, the path MTU for the overlay network could fall
below 1280, which is the minimum MTU for IPv6. We cannot send a PTB
message for IPv6 packets that are already 1280 bytes or less.
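
A rough sketch of that arithmetic in Python may help. The overhead
figures (40-byte IPv6 underlay header, 8-byte UDP header, 4-byte base
GUE header) and the function names are assumptions for illustration;
options, extension fields, and any inner Ethernet header would add to
the overhead. The sketch only models the IPv6 1280-byte floor.

```python
IPV6_MIN_MTU = 1280  # minimum link MTU that IPv6 requires


def overlay_mtu(underlay_path_mtu, encap_overhead):
    """Largest inner packet that fits in one unfragmented underlay packet."""
    return underlay_path_mtu - encap_overhead


def can_signal_ptb(inner_is_ipv6, mtu):
    """Return False when the overlay MTU is below the IPv6 minimum,
    i.e. when a PTB could not usefully be sent to an IPv6 source."""
    if inner_is_ipv6:
        return mtu >= IPV6_MIN_MTU
    return True  # the IPv4 floor is not modelled in this sketch


# Assumed overhead: IPv6 (40) + UDP (8) + base GUE header (4)
ASSUMED_OVERHEAD = 40 + 8 + 4

if __name__ == "__main__":
    for path_mtu in (1500, 1300):
        mtu = overlay_mtu(path_mtu, ASSUMED_OVERHEAD)
        print(path_mtu, mtu, can_signal_ptb(True, mtu))
```

With these assumed numbers, an underlay path MTU of 1300 leaves 1248
bytes for the overlay, which is below what an IPv6 source can be told
to use.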

Also, the GUE payload will often be encrypted anyway, so the NVE
wouldn't be able to access the inner packet in an ICMP message in that
case, even if the headers were small.

Given all this, we don't recommend that PMTUD be run for the underlay
network of tunnels. Instead, we implement encapsulation-level
fragmentation (draft-herbert-gue-extensions). We have real cases where
we can't control the MTUs (like at a POP), so we need to fragment. This
has not become a major issue. Such fragmentation is always exactly two
fragments, packet loss in the backend network is low, and fragments
are delivered in order, so we're not seeing the problem of inordinate
memory consumption for reassembly.
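
To illustrate the two-fragment case, here is a minimal sketch in
Python. The helper name `split_for_tunnel` and the 60-byte per-fragment
overhead are hypothetical; the sketch does not reproduce the
fragmentation option format from draft-herbert-gue-extensions and
ignores any alignment requirements on fragment offsets.

```python
def split_for_tunnel(payload, underlay_mtu, overhead):
    """Split an inner packet into pieces that each fit in one underlay
    packet once `overhead` bytes of encapsulation are added."""
    room = underlay_mtu - overhead
    if room <= 0:
        raise ValueError("overhead leaves no room for payload")
    return [payload[i:i + room] for i in range(0, len(payload), room)]


if __name__ == "__main__":
    inner = bytes(1500)  # a full-size 1500-byte inner packet
    # Assumed: 1500-byte underlay MTU, 60 bytes of outer headers
    # plus fragmentation option per fragment.
    frags = split_for_tunnel(inner, 1500, 60)
    print([len(f) for f in frags])
```

When the inner packet exceeds the available room only by the size of
the added headers, as here, the split yields one large fragment and
one small one.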

Tom

_______________________________________________
nvo3 mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/nvo3
