Re: [Int-area] Fragmentation and Path MTU text in nvo3 dataplane reqts draft

Templin, Fred L Fri, 16 May 2014 12:13:36 -0700

Hi Linda,

The maximum MTU for an IPv4 link is 64KB (per RFC0791) and the maximum
MTU for an IPv6 link is 4GB (per RFC2675). For tunnels over IPv4, the
IPv4 network itself is the link layer for the tunnel; for tunnels over
IPv6, the IPv6 network is the link layer. That means the largest
encapsulated packet that will fit over an IPv4 "link" is 64KB minus
the encapsulation overhead, and the largest over an IPv6 "link" is
4GB minus the overhead. These are upper bounds set by the protocols
themselves; typical values will be much smaller because most
underlying links configure smaller MTUs.
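
The arithmetic can be sketched in a few lines of Python. This is only an
illustration: the function names and the encaps_overhead parameter are
assumptions made for the example, and what counts as overhead depends on
the tunnel type (the IPv4 limit covers the whole outer packet including
its header, while the RFC2675 limit covers what follows the fixed IPv6
header).

```python
# Upper bounds imposed by the length fields of each protocol.
IPV4_MAX_TOTAL_LENGTH = 2**16 - 1   # 16-bit Total Length field (RFC 791)
IPV6_MAX_JUMBO_LENGTH = 2**32 - 1   # 32-bit Jumbo Payload Length (RFC 2675)


def max_tunnel_mtu(outer_version: int, encaps_overhead: int) -> int:
    """Largest inner packet the outer protocol's length field allows.

    encaps_overhead is a caller-supplied assumption: every octet that
    counts against the outer length field but is not part of the inner
    packet.
    """
    if outer_version == 4:
        limit = IPV4_MAX_TOTAL_LENGTH
    elif outer_version == 6:
        limit = IPV6_MAX_JUMBO_LENGTH
    else:
        raise ValueError("outer_version must be 4 or 6")
    return limit - encaps_overhead


def typical_tunnel_mtu(underlying_mtu: int, encaps_overhead: int) -> int:
    """What fits without fragmentation over a given underlying link."""
    return underlying_mtu - encaps_overhead
```

With a 20-octet outer IPv4 header and no other encapsulation,
max_tunnel_mtu(4, 20) gives 65515, whereas typical_tunnel_mtu(1500, 20)
gives 1480 for an ordinary 1500-byte underlying link.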


Thanks - Fred
[email protected]

> -----Original Message-----
> From: Linda Dunbar [mailto:[email protected]]
> Sent: Friday, May 16, 2014 11:56 AM
> To: Templin, Fred L; Black, David; [email protected]; [email protected]
> Cc: Mark Townsley; [email protected]
> Subject: RE: [Int-area] Fragmentation and Path MTU text in nvo3 dataplane 
> reqts draft
> 
> I am confused by the math here. If the MTU of most physical links is 1500
> bytes (or 2000 bytes for some), how do we get "4GB" minus the encapsulation
> overhead for IPv6 as the "tunnel link Maximum Transmission Unit (MTU)"?
> 
> Is 4GB correct?
> 
> Linda
> 
> -----Original Message-----
> From: Int-area [mailto:[email protected]] On Behalf Of Templin, Fred L
> Sent: Friday, May 16, 2014 1:28 PM
> To: Black, David; [email protected]; [email protected]
> Cc: Mark Townsley; [email protected]
> Subject: Re: [Int-area] Fragmentation and Path MTU text in nvo3 dataplane 
> reqts draft
> 
> > That document should be the place to put generic recommendations for
> > tunnel MTU handling that apply to all tunnel types.
> 
> In case you are wondering what I think "generic recommendations for tunnel
> MTU handling" should look like, here is what I think:
> 
>    The tunnel link Maximum Transmission Unit (MTU) is 64KB minus the
>    encapsulation overhead for IPv4 [RFC0791] and 4GB minus the
>    encapsulation overhead for IPv6 [RFC2675].  This is the most that
>    IPv4 and IPv6 (respectively) can convey within the constraints of
>    protocol constants, but actual sizes available for tunneling will
>    frequently be much smaller.
> 
>    The base tunneling specifications for IPv4 and IPv6 typically set a
>    static MTU on the tunnel ingress to 1500 bytes minus the
>    encapsulation overhead or smaller still if the tunnel is likely to
>    incur additional encapsulations on the path.  This can result in path
>    MTU related black holes when packets that are too large to be
>    accommodated over the tunnel are dropped, but the resulting ICMP
>    Packet Too Big (PTB) messages are lost on the return path.  As a
>    result, tunnels use the following MTU mitigations to accommodate
>    larger packets.
> 
>    Tunnels set their ingress MTU to the larger of the
>    underlying interface MTU minus the encapsulation overhead, and 1500
>    bytes.  Tunnels optionally cache per-egress MTU values in
>    the underlying IP path MTU discovery cache initialized to the
>    underlying interface MTU.
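
A minimal sketch of the ingress MTU rule and the optional per-egress
cache described in the paragraph above, in Python. The names
tunnel_ingress_mtu, EgressMtuCache, lookup and record_ptb are
hypothetical, and the choice to let a reported MTU only lower a cached
value is an assumption of this sketch, not something the text requires.

```python
from typing import Dict

BASELINE_MTU = 1500  # the floor named in the text


def tunnel_ingress_mtu(underlying_mtu: int, encaps_overhead: int) -> int:
    """Larger of (underlying interface MTU - overhead) and 1500 bytes."""
    return max(underlying_mtu - encaps_overhead, BASELINE_MTU)


class EgressMtuCache:
    """Per-egress path MTU values, initialized to the underlying MTU."""

    def __init__(self, underlying_mtu: int) -> None:
        self._default = underlying_mtu
        self._entries: Dict[str, int] = {}

    def lookup(self, egress: str) -> int:
        # An egress with no entry falls back to the underlying interface MTU.
        return self._entries.get(egress, self._default)

    def record_ptb(self, egress: str, reported_mtu: int) -> None:
        # Assumption: a reported value may only lower the cached MTU.
        self._entries[egress] = min(self.lookup(egress), reported_mtu)
```

For a 9000-byte underlying interface and 40 bytes of overhead the
ingress MTU comes out as 8960; for a 1500-byte interface it stays at
1500, which is what makes the fragmentation step necessary.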
> 
>    Tunnels admit packets that are no larger than 1280 bytes minus the
>    encapsulation overhead (*) as well as packets that are larger than
>    1500 bytes into the tunnel without fragmentation, i.e., as long as
>    they are no larger than the tunnel ingress MTU before encapsulation
>    and also no larger than the cached per-egress MTU following
>    encapsulation.  For IPv4, the ingress sets the "Don't Fragment" (DF)
>    bit to 0 for packets no larger than 1280 bytes minus the encapsulation
>    overhead (*) and sets the DF bit to 1 for packets larger than 1500
>    bytes.  If a large packet is lost in the path, the ingress may
>    optionally cache the MTU reported in the resulting PTB message or may
>    ignore the message, e.g., if there is a possibility that the message
>    is spurious.
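
The size classes in the paragraph above, together with the middle range
that the following paragraph covers, can be sketched as one decision
function. This is an illustration under stated assumptions: the Action
names and the classify signature are made up for the example, the DF
settings apply only when the outer protocol is IPv4, and NOT_ADMITTED
stands in for whatever the ingress does with a packet that fits neither
limit, which the text does not spell out.

```python
from enum import Enum


class Action(Enum):
    SEND_WHOLE_DF0 = "send whole, DF=0 for IPv4"
    FRAGMENT = "fragment the encapsulated packet"
    SEND_WHOLE_DF1 = "send whole, DF=1 for IPv4"
    NOT_ADMITTED = "too large for the tunnel"


def classify(inner_len: int, encaps_overhead: int, ingress_mtu: int,
             egress_path_mtu: int, min_mtu: int = 1280) -> Action:
    """Pick the handling for an inner packet of inner_len bytes.

    min_mtu defaults to 1280 and may be raised to a known MINMTU, as
    allowed by note (*).
    """
    if inner_len <= min_mtu - encaps_overhead:
        return Action.SEND_WHOLE_DF0
    if inner_len <= 1500:
        return Action.FRAGMENT
    # Larger than 1500: must fit the ingress MTU before encapsulation
    # and the cached per-egress MTU after encapsulation.
    if (inner_len <= ingress_mtu
            and inner_len + encaps_overhead <= egress_path_mtu):
        return Action.SEND_WHOLE_DF1
    return Action.NOT_ADMITTED
```

With 40 bytes of overhead, an ingress MTU of 8960 and a cached egress
MTU of 9000, a 1240-byte packet goes out whole with DF=0, a 1400-byte
packet is fragmented, and an 8960-byte packet goes out whole with DF=1.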
> 
>    For packets admitted into the tunnel that are larger than 1280 bytes
>    minus the encapsulation overhead (*) but no larger than 1500 bytes,
>    the ingress uses IP fragmentation to fragment the encapsulated packet
>    into two pieces (where the first fragment contains 1024 bytes of the
>    fragmented inner packet) then sends the fragments to the egress.
>    If the outer protocol is IPv4, the node sends the fragments with
>    DF set to 0 and subject to rate limiting to avoid
>    reassembly errors [RFC4963][RFC6864].  For both IPv4 and IPv6, the
>    ingress also sends a 1500 byte probe message (**) to the egress,
>    subject to rate limiting. To construct a probe, the ingress prepares
>    an ICMPv6 Neighbor Solicitation (NS) message with trailing padding
>    octets added to a length of 1500 bytes but does not include the
>    length of the padding in the IPv6 Payload Length field.  The ingress
>    then encapsulates the NS in the outer encapsulation headers (while
>    including the length of the padding in the outer length fields), sets
>    DF to 1 (for IPv4) and sends the padded NS message to the neighbor.
>    If the egress returns an NA message, the ingress may then send whole
>    packets within this size range and (for IPv4) relax the rate limiting
>    requirement. (Note that for tunnels that do not perform IPv6 neighbor
>    discovery, an ICMP echo request message can be used instead of NS.)
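
The length bookkeeping for the padded probe described above can be
sketched as follows. This is a rough illustration only: the names
ProbeLengths, probe_lengths and outer_ipv4_total_length are
hypothetical, the 24-octet default assumes an NS message with no
options, and encaps_overhead is assumed to include the outer IPv4
header.

```python
from dataclasses import dataclass

IPV6_HEADER_LEN = 40   # fixed IPv6 header
PROBE_LEN = 1500       # probe size named in the text


@dataclass
class ProbeLengths:
    inner_payload_length: int  # goes in the inner IPv6 Payload Length field
    padding: int               # trailing octets not counted by that field
    encapsulated_len: int      # octets handed to the outer encapsulation


def probe_lengths(ns_len: int = 24, probe_len: int = PROBE_LEN) -> ProbeLengths:
    """Lengths for an NS padded out to probe_len bytes before encapsulation."""
    unpadded = IPV6_HEADER_LEN + ns_len
    if unpadded > probe_len:
        raise ValueError("NS message is already larger than the probe size")
    # The padding is excluded from the inner Payload Length field.
    return ProbeLengths(ns_len, probe_len - unpadded, probe_len)


def outer_ipv4_total_length(encaps_overhead: int,
                            probe_len: int = PROBE_LEN) -> int:
    """Outer length field value; unlike the inner one, it counts the padding."""
    return encaps_overhead + probe_len
```

For an option-less NS the inner Payload Length is 24 and the padding is
1436 octets, while a plain IPv4 encapsulation with a 20-octet header
carries an outer Total Length of 1520.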
> 
>    The egress MUST be capable of reassembling packets up to 1500 bytes
>    plus the encapsulation overhead length.  It is therefore RECOMMENDED
>    that the egress be capable of reassembling at least 2KB.
> 
>    (*) Note that if it is known without probing that the minimum Path
>    MTU to a tunnel egress is MINMTU bytes (where 1280 < MINMTU < 1500)
>    then MINMTU can be used instead of 1280 in the fragmentation threshold
>    considerations listed above.
> 
>    (**) It is RECOMMENDED that no probes smaller than 1500 bytes be used
>    for MTU probing purposes, since smaller probes may be fragmented if
>    there is a nested tunnel somewhere on the path to the egress.
>    Probe sizes larger than 1500 bytes MAY be used, but may be
>    unnecessary since original sources are expected to use [RFC4821]
>    when sending large packets.
> 
> I think this applies to all IP-in-(foo)-in-IP tunnel types, and could go as a
> set of generic recommendations to be cited by other documents.
> 
> Comments?
> 
> Thanks - Fred
> [email protected]
> 
> > -----Original Message-----
> > From: Int-area [mailto:[email protected]] On Behalf Of
> > Templin, Fred L
> > Sent: Thursday, May 15, 2014 3:41 PM
> > To: Black, David; [email protected]; [email protected]
> > Cc: Mark Townsley; [email protected]
> > Subject: Re: [Int-area] Fragmentation and Path MTU text in nvo3
> > dataplane reqts draft
> >
> > Hi,
> >
> > > -----Original Message-----
> > > From: tsv-area [mailto:[email protected]] On Behalf Of
> > > Black, David
> > > Sent: Wednesday, May 14, 2014 1:53 PM
> > > To: [email protected]; [email protected]
> > > Subject: Fragmentation and Path MTU text in nvo3 dataplane reqts
> > > draft
> > >
> > > <WG chair hat off>
> > >
> > > Over in the nvo3 WG, draft-ietf-nvo3-dataplane-requirements-03
> > > contains some text on dealing with the fragmentation and MTU effects
> > > of tunnels. I thought I'd ask for some early review of this text,
> > > given recent IESG excitement around fragmentation and Path MTU topics
> > > in another draft:
> >
> > All tunnels have trouble with path MTU, and in some cases have no
> > choice but to fragment. However, they should strive to tune out
> > fragmentation and forward whole packets whenever possible.
> >
> > Over in the intarea, there have been sporadic ongoing discussions
> > about how to recommend generic MTU mitigations for tunnels. Joe Touch
> > and Mark Townsley have been working for a long time on a document
> > titled "Tunnels in the Internet Architecture":
> >
> > http://tools.ietf.org/id/draft-ietf-intarea-tunnels-00.txt
> >
> > That document should be the place to put generic recommendations for
> > tunnel MTU handling that apply to all tunnel types.
> >
> > Tunnel MTU issues keep popping up in all places, and this is just
> > another example. Is it time to revive Joe and Mark's document?
> >
> > Thanks - Fred
> > fred.l.templin@boeing.
> >
> > > http://datatracker.ietf.org/doc/draft-ietf-ipsecme-ikev2-fragmentation/ballot/
> > >
> > > I believe that the nvo3 draft is in better shape in these areas.
> > > Nonetheless, I've included its current text on fragmentation and
> > > path MTU below, and (on behalf of the draft authors and nvo3 WG
> > > chairs) I'm looking for input on what that text should say and why.
> > >
> > > In nvo3 terminology, an overlay network is an inner network that is
> > > tunneled over an outer underlay network.  The nvo3 WG also uses
> > > "Tenant System" as the term for a sender/receiver of network traffic
> > > because multi-tenancy is an important motivation for the WG's
> > > activities in network virtualization.
> > >
> > > --------------------------------------
> > >
> > > 3.5. Path MTU
> > >
> > >        The tunnel overlay header can cause the MTU of the path to the
> > >        egress tunnel endpoint to be exceeded.
> > >
> > >        IP fragmentation SHOULD be avoided for performance reasons.
> > >
> > >        The interface MTU as seen by a Tenant System SHOULD be adjusted
> > >        such that no fragmentation is needed. This can be achieved by
> > >        configuration or be discovered dynamically.
> > >
> > >        Either of the following options MUST be supported:
> > >
> > >           o Classical ICMP-based MTU Path Discovery [RFC1191] [RFC1981] or
> > >             Extended MTU Path Discovery techniques such as defined in
> > >             [RFC4821]
> > >
> > >           o Segmentation and reassembly support from the overlay layer
> > >             operations without relying on the Tenant Systems to know about
> > >             the end-to-end MTU
> > >
> > >           o The underlay network MAY be designed in such a way that the
> > >             MTU can accommodate the extra tunnel overhead.
> > >
> > > --------------------------------------
> > >
> > > </WG chair hat off>
> > >
> > > Thanks,
> > > --David
> > > ----------------------------------------------------
> > > David L. Black, Distinguished Engineer EMC Corporation, 176 South
> > > St., Hopkinton, MA  01748
> > > +1 (508) 293-7953             FAX: +1 (508) 293-7786
> > > [email protected]        Mobile: +1 (978) 394-7754
> > > ----------------------------------------------------
> >
> > _______________________________________________
> > Int-area mailing list
> > [email protected]
> > https://www.ietf.org/mailman/listinfo/int-area
> 
