Thank you

Gyan
On Mon, Apr 27, 2020 at 2:13 AM Rabadan, Jorge (Nokia - US/Mountain View)
<jorge.raba...@nokia.com> wrote:

> Gyan,
>
> Yes, the GW redundancy in the dci draft is based on an “Interconnect”
> Ethernet Segment (I-ES), that uses the same DF Election, split-horizon,
> mass withdraw and aliasing/backup procedures as any Ethernet Segment.
>
> Thanks.
> Jorge
>
> From: Gyan Mishra <hayabusa...@gmail.com>
> Date: Monday, April 27, 2020 at 1:50 AM
> To: "Rabadan, Jorge (Nokia - US/Mountain View)" <jorge.raba...@nokia.com>
> Cc: BESS <bess@ietf.org>, Jeff Tantsura <jefftant.i...@gmail.com>,
> "Lukas Krattiger (lkrattig)" <lkrat...@cisco.com>, "saja...@cisco.com"
> <saja...@cisco.com>
> Subject: Re: [bess] VXLAN BGP EVPN Question
>
> Jorge
>
> In the BGP EVPN NVO RFC 8365 there are controls built in for MAC flooding
> related to intra-pod all-active multihomed hosts. With any multihoming
> failure, mass MAC withdrawal lets all NVEs reconverge to a new next hop
> when the ES of the failed gateway is withdrawn. There is also backup path
> aliasing for all-active multihoming, used for load balancing from remote
> NVEs, and split-horizon filtering for BUM traffic to prevent looping back
> through a different ES gateway connected to the host.
>
> So with the DCI overlay draft, those same EVPN procedures that help with
> intra-pod NVE convergence and flooding are now applied to the inter-pod
> stitched NVE via the UMR route for BUM traffic.
>
> So the new UMR route type prevents re-flooding when the routes are all
> known, via aliasing to the redundant gateway, similar to the backup path
> aliasing for load balancing intra-site.
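As a rough illustration of the UMR behaviour discussed above, here is a minimal Python sketch of how a leaf might choose a next hop for an unknown destination MAC. It is a toy model under stated assumptions, not the draft's procedure text: the `Leaf` class, its method names, the CRC-based per-flow hash and the `"FLOOD"` marker are all hypothetical.

```python
import zlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Leaf:
    """Toy model of a leaf NVE inside a POD (hypothetical, illustration only)."""
    # Known MAC -> remote VTEP, as learned from MAC/IP routes.
    mac_table: Dict[str, str] = field(default_factory=dict)
    # ESI -> gateways that currently have a live AD per-ES route for it.
    es_members: Dict[str, Set[str]] = field(default_factory=dict)
    # ESI carried by the received unknown MAC route (UMR), if any.
    umr_esi: Optional[str] = None

    def ad_per_es_update(self, esi: str, gw: str) -> None:
        self.es_members.setdefault(esi, set()).add(gw)

    def ad_per_es_withdraw(self, esi: str, gw: str) -> None:
        # Mass withdraw: a single withdrawal removes the gateway as a next
        # hop for everything resolved through this ES.
        self.es_members.get(esi, set()).discard(gw)

    def next_hops(self, dst_mac: str, flow_key: str) -> List[str]:
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]
        gws = sorted(self.es_members.get(self.umr_esi, set())) if self.umr_esi else []
        if gws:
            # Aliasing: any gateway on the I-ES is valid; pick one per flow
            # (the hash function is an assumption of this sketch).
            return [gws[zlib.crc32(flow_key.encode()) % len(gws)]]
        # No UMR or no gateway left: fall back to flooding.
        return ["FLOOD"]
```

With two gateways on the I-ES, withdrawing one AD per-ES route simply shrinks the candidate set, so unknown unicast keeps going to the surviving gateway rather than being flooded.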
>
> Kind regards
>
> Gyan
>
> On Sun, Apr 26, 2020 at 10:02 AM Rabadan, Jorge (Nokia - US/Mountain View)
> <jorge.raba...@nokia.com> wrote:
>
> Hi Gyan,
>
> Actually we started with the evpn dci draft in 2013 :-)
>
> The way I see the unknown MAC route, it saves flooding if all the MACs in
> the POD/DC are known beforehand. The unknown unicast traffic can be aliased
> to the GWs. In case of failure in one of the GWs, the AD per-ES route for
> the I-ES will be withdrawn (mass withdraw for all EVIs) and the unknown
> traffic can be sent to the redundant GWs. So this failure won’t generate
> any extra flooding.
>
> Thanks.
> Jorge
>
> From: Gyan Mishra <hayabusa...@gmail.com>
> Date: Saturday, April 25, 2020 at 8:45 AM
> To: "Lukas Krattiger (lkrattig)" <lkrat...@cisco.com>,
> "saja...@cisco.com" <saja...@cisco.com>
> Cc: BESS <bess@ietf.org>, Jeff Tantsura <jefftant.i...@gmail.com>,
> "Rabadan, Jorge (Nokia - US/Mountain View)" <jorge.raba...@nokia.com>
> Subject: Re: [bess] VXLAN BGP EVPN Question
>
> + Ali
>
> Lukas
>
> I noticed that Ali was on the multi site draft, which expired in 2017,
> around the same time the DCI overlay draft was submitted. I went through
> the logs but did not go through the mail archives to see what happened to
> the multi site draft. My guess is these were two competing drafts, and
> multi site was geared solely to EVPN procedures for vxlan encapsulation
> and thus did not achieve WG adoption, whereas your DCI overlay draft
> accounts for every encapsulation type using EVPN procedures and is a more
> comprehensive approach to DCI, providing an improved solution to Multisite
> vxlan overlay stitching.
>
> I like the idea of re-originating the VNI and RD using local context on
> the gateway as an additional control mechanism, which prevents Type 2
> MAC-IP routes from being flooded between pods where they should not be,
> without flood filters.
> With the multi site feature there is no such control, and all mobility
> routes are unfortunately flooded, active or not.
>
> With this draft, is it possible to add a feature for conversational
> learning of only active flows? When the Type 1 BGP A-D route is sent for
> the initial BUM advertisement for ARP or ND, there could be a snooping
> mechanism, similar to IGMP snooping, that discovers the active flow and
> thus creates the control-plane Type 2 MAC-IP state, followed by flooding
> in the data-plane NVE tunnel overlay. I think this concept could apply
> intra-site, fabric leaf to leaf, but it would be extremely beneficial for
> inter-pod or inter-site.
>
> This could be a separate feature or an option to the selective
> advertisement.
>
> So the selective advertisement works in conjunction with re-origination of
> RD and locally significant VNI.
>
> So what I would envision with the conversational learning active flow
> detection feature is that you would use a global VNI, and now only the
> active Type 2 MAC-IP routes would be propagated inter-pod or inter-site.
>
> This feature would be a tremendous benefit to operators and help with MAC
> scale.
>
> In our Cisco multisite feature implementations we do use the recommended
> multi site specific BUM traffic suppression applied on the BGW. So that
> definitely helps with the BUM suppression.
>
> In section 3.5.1 UMR - so the route type is like a default MAC route 0/48
> with the ESI set to the DCI gateway I-ESI for all-active multihoming, and
> so instead of flooding all MACs and having to rely on mass MAC withdrawals
> during a failure, now only the UMR is withdrawn. Is that correct?
>
> That’s a huge savings on resources.
>
> Kind regards
>
> Gyan
>
> On Fri, Apr 24, 2020 at 3:25 PM Lukas Krattiger (lkrattig)
> <lkrat...@cisco.com> wrote:
>
> Thanks Jorge and Jeff for guiding all the way thru the features and
> functions we have around, in DCI-overlay and Multi-Site.
>
> Gyan,
>
> Specific to the VNI distribution, BUM handling and the re-origination in
> Multi-Site.
>
> With re-origination, the RDs are changed on the GW node. With this in
> mind, the VNI could be globally or locally significant. In the case of
> local significance, we can stitch VNIs together (i.e., VNI1 - GW - VNI2 -
> GW - VNI3).
>
> Further, MAC- or IP-VRFs that are not supposed to be extended to remote
> Sites will not advertise any MAC or IP routes beyond the local GW. This way
> you will keep the control-plane clean and avoid unnecessary creation of
> flood lists. This is what we call selective advertisement, which is
> different than conversational learning. Conversational learning could be a
> complement to selective advertisement. The unknown MAC approach that Jorge
> mentioned is a different approach for similar optimizations.
>
> In addition to ARP suppression, in the specific Cisco implementation of
> Multi-Site, we provide a BUM traffic policer to rate limit between Sites.
> This policer is located on the GW and acts in the egress direction.
>
> So with the DCI EVPN VNI translation, does that end up netting the desired
> effect: control plane segregation from the data plane, and a reduced-size
> MAC-VRF showing only active, interesting-traffic Type 2 MAC-IP routes
> intra-pod within the DC?
>
> In a certain way, yes
>
> Kind Regards
> -Lukas
>
> On Apr 24, 2020, at 7:21 AM, Rabadan, Jorge (Nokia - US/Mountain View)
> <jorge.raba...@nokia.com> wrote:
>
> Hi Gyan,
>
> The dci evpn overlay draft indeed provides that segmentation. EVPN routes
> are readvertised at the GWs with a change in RD/VNI/Nhop, and this
> certainly optimizes the BUM replication from end leaf nodes. The draft also
> introduces the use of an unknown MAC route that the GWs can advertise to
> their local POD, as opposed to readvertising all the received MAC routes.
> This can be used under the assumption that if a MAC is unknown for a leaf,
> it must be somewhere beyond the GW. Finally, the draft also allows you to
> use an I-ES for multihoming and have all-active to two or more GWs.
>
> Note that this draft has multiple implementations, and the only reason why
> it is not an RFC yet is a normative reference that must be cleared first.
>
> Thanks.
> Jorge
>
> From: Gyan Mishra <hayabusa...@gmail.com>
> Date: Friday, April 24, 2020 at 3:54 PM
> To: "Rabadan, Jorge (Nokia - US/Mountain View)" <jorge.raba...@nokia.com>
> Cc: BESS <bess@ietf.org>, Jeff Tantsura <jefftant.i...@gmail.com>
> Subject: Re: [bess] VXLAN BGP EVPN Question
>
> Hi Jorge
>
> I read through the draft and it sounds like this vxlan segmentation is
> similar to the multi site segmented multi-part LSP used for DCI. How does
> this option compare or contrast with the multi site draft below?
>
> With the DCI evpn overlay you mentioned, the VNIs on the ASBRs are
> translated and not global. Interesting.
>
> With multi site the VNIs are globally significant, inter or intra site,
> and an RT rewrite happens on the BGW to BGW middle segment so that the NVE
> tunnels can be stitched.
>
> So with the DCI EVPN VNI translation, does that end up netting the desired
> effect: control plane segregation from the data plane, and a reduced-size
> MAC-VRF showing only active, interesting-traffic Type 2 MAC-IP routes
> intra-pod within the DC?
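To picture the re-advertisement with RD/VNI/next-hop change and the VNI stitching (VNI1 - GW - VNI2 - GW - VNI3) described in this thread, here is a small Python sketch. It is only a sketch under assumptions: `MacIpRoute`, `Gateway`, `vni_map` and the example RD, VNI and address values are hypothetical, and treating "no VNI mapping" as "do not extend" is a simplified stand-in for selective advertisement.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class MacIpRoute:
    """Simplified Type 2 route: only the fields this sketch rewrites."""
    rd: str
    vni: int
    next_hop: str
    mac: str


@dataclass
class Gateway:
    """Hypothetical gateway that re-originates routes between segments."""
    rd: str
    vtep_ip: str
    vni_map: Dict[int, int]  # received VNI -> VNI used on the next segment

    def readvertise(self, route: MacIpRoute) -> Optional[MacIpRoute]:
        new_vni = self.vni_map.get(route.vni)
        if new_vni is None:
            # Assumption: an unmapped VNI is not extended past this
            # gateway, so nothing is advertised for it.
            return None
        # Re-originate with the gateway's own RD, the translated VNI and
        # the gateway as next hop; the MAC itself is unchanged.
        return replace(route, rd=self.rd, vni=new_vni, next_hop=self.vtep_ip)


# Stitching VNI1 - GW - VNI2 - GW - VNI3 with made-up values.
gw1 = Gateway(rd="192.0.2.1:1", vtep_ip="192.0.2.1", vni_map={10001: 20002})
gw2 = Gateway(rd="192.0.2.2:1", vtep_ip="192.0.2.2", vni_map={20002: 30003})
leaf_route = MacIpRoute(rd="198.51.100.10:1", vni=10001,
                        next_hop="198.51.100.10", mac="00:00:5e:00:53:01")
middle = gw1.readvertise(leaf_route)
remote = gw2.readvertise(middle)
```

Each segment only ever sees the adjacent gateway as next hop, which is what keeps the VXLAN tunnels local to a segment.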
>
> Multi site DCI
>
> https://datatracker.ietf.org/doc/draft-sharma-multi-site-evpn/
>
> Kind regards
>
> Gyan
>
> On Fri, Apr 24, 2020 at 3:07 AM Rabadan, Jorge (Nokia - US/Mountain View)
> <jorge.raba...@nokia.com> wrote:
>
> Hi Gyan,
>
> If I may, note that:
>
> https://tools.ietf.org/html/draft-ietf-bess-dci-evpn-overlay-10#section-4.6
>
> Also provides vxlan segmentation, and while the description is based on
> DCI, you can perfectly use it for inter-pod connectivity.
>
> Thanks.
> Jorge
>
> From: BESS <bess-boun...@ietf.org> on behalf of Gyan Mishra
> <hayabusa...@gmail.com>
> Date: Friday, April 24, 2020 at 5:21 AM
> To: Jeff Tantsura <jefftant.i...@gmail.com>
> Cc: BESS <bess@ietf.org>
> Subject: Re: [bess] VXLAN BGP EVPN Question
>
> Hi Jeff
>
> Yes - Cisco has a draft for multi site, for use cases that need an
> inter-pod or inter-site segmented path between disparate POD fabrics
> intra-DC, or as a DCI option inter-DC without MPLS. The segmentation
> localizes BUM traffic and has border gateway DF election for BUM traffic
> that is segmented and stitched between PODs, as I mentioned, similar to
> inter-AS L3 VPN opt B. There is an extra load, as you said, on the BGW
> border gateway, which performs the network VTEP decap from the leaf and
> then encaps again towards the egress border gateway. Due to that extra
> load on the border gateway, it’s not recommended to have the spine
> function on the BGW, thus an extra layer is needed for multi site to be
> scalable. It definitely requires a proprietary ASIC and not a merchant
> silicon or white box solution. The BUM traffic is much reduced, as you
> stated, compared with a multi-fabric connected super spine or a single
> fabric spine that contains all leafs.
> That decoupling sounds like incongruent control and data planes with
> MAC-only Type 2 routes, which would result in more BUM traffic, but it
> sounds like that may be the trade-off of conversational learning of only
> active flows versus the entire data-center-wide MAC-VRF being learned
> everywhere. I wonder if there is an option to have that real decoupling of
> the EVPN control plane and vxlan data plane overlay that does not impact
> convergence but adds stability, with only active-flow Type 2 MACs learned
> across the fabric.
>
> https://datatracker.ietf.org/doc/draft-sharma-multi-site-evpn/
>
> Kind regards
>
> Gyan
>
> On Thu, Apr 23, 2020 at 6:04 PM Jeff Tantsura <jefftant.i...@gmail.com>
> wrote:
>
> Gyan,
>
> “Multi site” is not really IETF terminology; this is a solution
> implemented by NX-OS, there’s a draft though. Its main functionality is to
> localize VxLAN tunnels and provide a segmented path vs an end2end full
> mesh of VxLAN tunnels (participating in the same EVI). We are talking HER
> here.
>
> The feature is heavily HW dependent as it requires BUM re-encapsulation at
> the boundaries (leaf1->BGW1-BGW2->leaf2..n). So good luck seeing it soon on
> low end silicon.
>
> It doesn’t eliminate BUM traffic but significantly reduces the span of the
> “broadcast domain” and reduces the need for large flood domains (modern HW
> gives you ~512 large flood groups, obviously depending on HW).
>
> Wrt your question about MAC conversational learning - this is an
> implementation issue, nothing in the EVPN specifications precludes you
> from doing so. Moreover, in the implementation I was designing (in my
> previous life) we indeed decoupled data plane learning from control plane
> advertisement so the control plane was aware of “Active” flows. Needless
> to say - this creates an additional layer of complexity and all kinds of
> funky states in the system ;-).
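One way to picture the decoupling of data plane learning from control plane advertisement mentioned above is the Python sketch below. It is an illustration under assumptions, not a description of any shipping implementation: the `ConversationalMacTable` class, its method names and the 300 second idle timeout are all hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ConversationalMacTable:
    """Toy leaf table: learn every local MAC, advertise only active ones."""
    idle_timeout: float = 300.0  # assumed ageing time, in seconds
    # MAC -> time it was first learned in the data plane.
    learned: Dict[str, float] = field(default_factory=dict)
    # MAC -> last time it took part in a flow crossing the fabric.
    last_active: Dict[str, float] = field(default_factory=dict)

    def learn(self, mac: str, now: float) -> None:
        # Data plane learning alone does not trigger an advertisement.
        self.learned.setdefault(mac, now)

    def flow_seen(self, mac: str, now: float) -> None:
        # Only MACs already learned locally can become "active".
        if mac in self.learned:
            self.last_active[mac] = now

    def to_advertise(self, now: float) -> Set[str]:
        # MACs whose last activity is within the idle timeout; these are
        # the ones the control plane would carry as Type 2 routes.
        return {mac for mac, seen in self.last_active.items()
                if now - seen <= self.idle_timeout}
```

The extra state Jeff warns about is visible even here: a MAC can be learned but not advertised, or advertised and then aged out while still present locally, and a real design would have to handle traffic that arrives in those windows.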
>
> Hope this helps
>
> Cheers,
> Jeff
>
> On Apr 23, 2020, 8:30 AM -0700, Gyan Mishra <hayabusa...@gmail.com>,
> wrote:
>
> Slight clarification on the ARP traffic. What I meant was broadcast
> traffic translated into BUM traffic. With the EVPN architecture, is there
> any way to reduce the amount of BUM traffic, given a data center design
> requirement with vlan-anywhere sprawl and 1000s of Type 2 MAC mobility
> routes being learned between all the leaf VTEPs?
>
> The elimination of broadcast is a tremendous gain, and broadcast
> suppression of multicast does help, but it would be nice not to have such
> massive MAC tables and Type 2 route churn, with conversational learning
> where only active flows are in the Type 2 RIB.
>
> Kind regards
>
> Gyan
>
> On Wed, Apr 22, 2020 at 6:47 PM Gyan Mishra <hayabusa...@gmail.com> wrote:
>
> The description of the vxlan BGP evpn scenario has a typo on the multisite
> feature: the segmented LSP inter-pod with the RT auto rewrite is similar
> to MPLS inter-AS option B, not A.
>
> Kind regards
>
> Gyan
>
> On Wed, Apr 22, 2020 at 5:57 PM Gyan Mishra <hayabusa...@gmail.com> wrote:
>
> All
>
> Had a question related to the vxlan BGP EVPN architecture specifications
> defined in BGP EVPN NVO3 overlay RFC 8365 and VXLAN data plane RFC 7348.
>
> Consider a Data Center environment where you have multiple PODs, with
> individual fabrics per POD connected via a super spine extension, using a
> Multi site feature doing auto rewrite of RTs to stitch the NVE tunnel
> between pods, similar to inter-AS option A.
>
> In this scenario you have vlan sprawl everywhere, with L2 and L3 VNIs
> everywhere, as if it were a single L2 domain. The topology is a typical
> vxlan spine leaf topology where the L3 leafs are the TOR switches, so
> there is a very small physical L2 fault domain.
> So I was wondering whether, with the vxlan architecture, the feature
> below is possible, or if there is a way to do so in the current
> specification.
>
> Cisco used to have a DC product called “fabric path” which was based on
> conversational learning.
>
> Is there any way, with the existing vxlan BGP evpn specification or maybe
> a future enhancement, to have a MAC conversational learning capability so
> that only the active MACs that are part of a conversation flow are the
> MACs that are flooded throughout the vxlan fabric? That would help
> tremendously with ARP storms, so that if new ARP entries are generated
> locally on a leaf they are not flooded through the fabric unless there are
> active flows between leafs.
>
> Also, is there a way to filter Type 2 MAC mobility routes between leaf
> switches at the control plane level, based on remote VTEP or maybe other
> parameters? That would also reduce ARP storms and BUM traffic.
>
> Kind regards
>
> Gyan
>
> --
> Gyan Mishra
> Network Engineering & Technology
> Verizon
> Silver Spring, MD 20904
> Phone: 301 502-1347
> Email: gyan.s.mis...@verizon.com
>
> _______________________________________________
> BESS mailing list
> BESS@ietf.org
> https://www.ietf.org/mailman/listinfo/bess

--
Gyan Mishra
Network Engineering & Technology
Verizon
Silver Spring, MD 20904
Phone: 301 502-1347
Email: gyan.s.mis...@verizon.com