Comments inline -- aldrin

On Tue, Sep 25, 2012 at 5:54 PM, Sunny Rajagopalan
<[email protected]> wrote:

> 1) I would suggest *not* altering the semantics of the MPLS label in the BGP
> route. Instead, use the route distinguisher to carry the 24-bit VNID (this
> is arguably better since the semantics of the RD align better with the
> semantics of the VNID). I would suggest encoding this as a type 0 RD, with
> the VNID going into the Assigned number sub-field. In addition, call out
> that an MPLS label value of 0 in the BGP route is a valid value, and will be
> used by PEs which do not support MPLS encap.

The RD was not intended to be used to signal data plane bits. That's
what the label field is for.
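
For concreteness, a type 0 RD (per RFC 4364) is an 8-octet value: a
2-octet type, a 2-octet administrator (AS number) subfield, and a
4-octet assigned-number subfield.  The sketch below (illustrative
values, not from either draft) shows where the proposal above would
place the 24-bit VNID, versus the reply's position that data-plane
bits belong in the label field:

```python
import struct

def type0_rd(admin_asn: int, assigned: int) -> bytes:
    """Type 0 Route Distinguisher (RFC 4364): 2-octet type = 0,
    2-octet AS number, 4-octet assigned-number subfield."""
    return struct.pack("!HHI", 0, admin_asn, assigned)

# The proposal would carry the 24-bit VNID in the assigned-number
# subfield; the reply argues the VNID/label belongs in the label
# field, leaving the RD to its route-distinguishing role.
rd = type0_rd(64512, 0xABCDEF)  # VNID 0xABCDEF as assigned number
assert len(rd) == 8
```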

>
> 2) I've separately mentioned that the draft should call out that the PE
> control plane need not be co-located with the PE forwarding plane, and that
> XMPP could be used as the messaging format between the two. Each endpoint
> update should look like this:{endpoint_mac, {endpoint_ips}, {NVE IPs}, RD,
> label}. This allows for true decoupling of the control plane from the frame
> format.

This only partially covers the case where a MAC is actually known.
There will be more than one type of update needed to implement the
E-VPN application.  I think the starting point is to focus on the
fully distributed implementation (E-VPN co-resident with the NVE) and
then work our way up once that is defined.  The message format for
that would/should match one-for-one with the BGP fields.
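
A message format matching one-for-one with the BGP fields would mirror
the E-VPN MAC Advertisement route (RD, ESI, Ethernet Tag, MAC,
optional IPs, label, next hops).  A minimal sketch, with illustrative
field names that are not taken from either draft:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class MacAdvertisement:
    """Sketch of a control-plane update that matches the BGP E-VPN
    MAC Advertisement route one-for-one (field names illustrative)."""
    rd: bytes            # Route Distinguisher (8 octets)
    esi: bytes           # Ethernet Segment Identifier (10 octets)
    eth_tag: int         # Ethernet Tag ID
    mac: str             # endpoint MAC address
    ips: List[str] = field(default_factory=list)        # optional IP(s)
    label: int = 0       # MPLS label / VNID
    next_hops: List[str] = field(default_factory=list)  # NVE IP(s)
```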

>
> 3) Even though it's not called out explicitly, the model used in both
> draft-drake-nvo3-evpn-control-plane and draft-marques-l3vpn-end-system
> assumes that the PE forwarding plane is interested in having all of the
> endpoint routes in their participating VNIDs. In my view, this puts an
> unnecessary load on NVEs. Instead, we can modify XMPP so that the NVE can
> request end point resolution of specific addresses within a VNID, so it only
> ever needs to cache information about flows that are transiting it.

RT Constrain is used to limit what gets sent to the NVE/PE.
Furthermore, the draft already states that an implementation does not
have to install a route into the FIB until it is needed.

In the directory-based model, are there well-established/understood
ways to suppress lookups that are bound to fail?  If you are proposing
a directory-based model, then the first order of business is to write
a draft that clearly articulates the methodology, pros, cons, safety
valves, experience, tradeoffs, yada yada about such a system so we
don't have to learn the hard way (at the expense of operators).  Until
then we're talking bits and pieces and speculating quite a bit, and
that really has no place in the draft.  In the push-based model, the
NVE just drops the packet if a matching forwarding entry does not
exist -- it's cold, hard, clear, no crossed fingers, always works.
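
The push-based behaviour is simple enough to sketch: the NVE holds a
pre-populated forwarding table and drops anything without a matching
entry.  Table layout and values below are illustrative, not from the
draft:

```python
# mac -> (remote NVE IP, VNID); pre-populated by the control plane
fib = {"00:11:22:33:44:55": ("192.0.2.10", 1000)}

def forward(dst_mac: str, frame: bytes):
    """Push model: encapsulate toward the remote NVE if a forwarding
    entry exists, otherwise drop -- no on-demand resolution."""
    entry = fib.get(dst_mac)
    if entry is None:
        return None               # drop: no matching forwarding entry
    nve_ip, vnid = entry
    return (nve_ip, vnid, frame)  # encapsulate and send
```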

>
> The issue here is how to push endpoint updates to the NVE when endpoints
> move (since having the PE keep track of which update to push to which NVE is
> unreasonable). My proposal here is to place the onus of requesting updates
> on the NVE, possibly triggered by the receipt of an ICMP error message. In
> other words, mandate that when an NVE receives a packet that after
> decapsulation is found to belong to an end-host that is no longer present in
> the attached customer network, it generates an ICMP error (rate controlled)
> back to source, taking care to include in the ICMP error packet enough of
> the payload so that the remote PE can figure out the customer endpoints that
> were attempting to communicate. On receipt of such an ICMP error, the NVE
> can extract the endpoint information from the payload and request the
> control plane for an endpoint update.

MAC moves are already supported quite well.  In an implementation
where the controlling E-VPN application instance is not co-resident
with the NVE, the messages passed between the app instance and the
NVE would need to be defined.  However, your proposal would not be
E-VPN -- it looks more like data-plane signaling.
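
The data-plane-signaling loop proposed above could be sketched as
follows; the cache layout and the `resolve` callback are hypothetical,
and this is an illustration of the proposal, not of E-VPN:

```python
# NVE-side cache of endpoint resolutions: mac -> (NVE IP, VNID)
cache = {}

def on_icmp_error(inner_dst_mac: str, resolve):
    """On an ICMP error whose payload identifies a stale endpoint,
    evict the cached entry and re-request resolution from the
    control plane (resolve() is a hypothetical request hook)."""
    cache.pop(inner_dst_mac, None)            # evict the stale entry
    cache[inner_dst_mac] = resolve(inner_dst_mac)
    return cache[inner_dst_mac]
```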


> 4) I would also suggest not having the NVE keep track of the encapsulation
> used by the remote endpoint. (this means that the tunnel encapsulation
> attribute in the draft would be unnecessary). Instead, the onus of
> translating between encapsulation methods should be on gateways. If you
> define the XMPP format well, you should be able to communicate end point
> information in a way that is agnostic of the encap method used by the NVE,
> allowing it to do the one encap it does best. A gateway can do this
> translation without BGP control plane intervention, because it would be
> configured to have interfaces that are (for example) NVGRE on one arm and VXLAN
> on the other, and it would be obvious as to what encap to put on a packet
> going from one arm to the other. Applying an MPLS label would involve the
> gateway participating in BGP.

Gateways = choke points.  They should be avoided.  Every time I buy
the next guy's NVO3 system (because it's faster, more scalable, etc.)
I'll have to figure out where/how to gateway between it and my other
ten NVO3 systems.  :-/  I prefer Inter-AS option C where I can have
it.

> 5) The multi-homing discussed in the draft only covers the case where the CE
> devices are physically separated from the PE devices by physical links. In
> the case where the PE forwarding plane is implemented in the hypervisor, the
> more important multi-homing question is what to do if the NVE is connected
> to two or more upstream devices, (basically, the NVE has two or more IP
> addresses). What I would like to happen is have a mac address route be
> associated with multiple NVE addresses. I believe this is possible using the
> framework established in the draft-raggarwa-sajassi-l2vpn-evpn, but it might
> be worthwhile to call out this case in the
> draft-drake-nvo3-evpn-control-plane draft. This is useful because
> draft-raggarwa-sajassi-l2vpn-evpn treats multiple PE IP addresses associated
> with the same mac address as belonging to separate MESs, and assumes that
> the CE-PE links will be labelled with an ethernet segment, which is not the
> case for the hypervisor NVE/PE.

When the hosts are VMs, the ESI = 0.  Multi-path support is
considered part of the common capabilities of BGP; where RRs are
present, it would require the use of a different RD for the route
from each NVE IP.
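
The reason a different RD per NVE matters behind a route reflector:
an RR keeps one best path per route key, so two NVE next hops for the
same MAC survive only if their RDs differ.  A minimal sketch of that
selection behaviour (illustrative, not a BGP implementation):

```python
def rr_best_paths(routes):
    """Keep one best path per (RD, MAC) key, as a route reflector
    effectively does; the tie-break here is simply 'last wins'."""
    best = {}
    for rd, mac, nve_ip in routes:
        best[(rd, mac)] = nve_ip
    return best

# Same RD: the RR reflects only one path for the MAC.
same_rd = rr_best_paths([("rd1", "m1", "10.0.0.1"),
                         ("rd1", "m1", "10.0.0.2")])
# Different RDs: both NVE next hops survive reflection.
diff_rd = rr_best_paths([("rd1", "m1", "10.0.0.1"),
                         ("rd2", "m1", "10.0.0.2")])
assert len(same_rd) == 1 and len(diff_rd) == 2
```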

>
> Long email, thanks for reading!
> --
> Sunny
> p.s. John, your email address in the draft is incorrect.
>
_______________________________________________
nvo3 mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/nvo3
