Re: [bess] shepherd review of draft-ietf-bess-evpn-etree

Thomas Morin Thu, 05 Jan 2017 01:54:36 -0800

Hi Ali,

2016-12-19, Ali Sajassi (sajassi):

I have modified section 2.2 (copied below) to elaborate why the coloring approach for Leaf/Root MAC addresses is used in this draft. Also, the use of a single RT for this scenario is mentioned just as “MAY”. Please review the text below and let me know if you still have questions/comments:


Thanks for providing text that goes in the right direction.
I still have a few comments below.

-Thomas


2.2 Scenario 2: Leaf OR Root site(s) per AC

   In this scenario, a PE receives traffic from either Root OR Leaf
   sites (but not both) on a given Attachment Circuit (AC) of an EVI. In
   other words, an AC (ES or ES/VLAN) is either a Root AC or a Leaf AC
   (but not both).


                     +---------+            +---------+
                     |   PE1   |            |   PE2   |
    +---+            |  +---+  |  +------+  |  +---+  |            +---+
    |CE1+-----ES1----+--+   |  |  |      |  |  | +--+---ES2/AC1--+CE2|
    +---+    (Leaf)  |  |MAC|  |  | MPLS |  |  |MAC|  |   (Leaf)   +---+
                     |  |VRF|  |  |  /IP |  |  |VRF|  |
                     |  |   |  |  |      |  |  |   |  |            +---+
                     |  |   |  |  |      |  |  | +--+---ES2/AC2--+CE3|
                     |  +---+  |  +------+  |  +---+  |   (Root)   +---+
                     +---------+            +---------+

   Figure 2: Scenario 2

   In this scenario, just like the previous scenario (in section 2.1),
   two Route Targets (one for Root and another for Leaf) can be used.
   However, the difference is that on a PE with both Root and Leaf ACs,
   all remote MAC routes are imported and thus there needs to be a way
   to differentiate remote MAC routes associated with Leaf ACs versus
   the ones associated with Root ACs in order to apply the proper
   ingress filtering.
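
The ingress filtering that this paragraph refers to can be illustrated with a minimal Python sketch. It assumes a single bridge table whose entries carry a Leaf/Root colour; the names `Color`, `MacEntry` and `ingress_forward` are hypothetical and are not taken from the draft, and BUM handling is deliberately left out.

```python
from dataclasses import dataclass
from enum import Enum


class Color(Enum):
    ROOT = "root"
    LEAF = "leaf"


@dataclass(frozen=True)
class MacEntry:
    mac: str
    color: Color   # colour learned with the MAC (hypothetical field name)
    next_hop: str  # local AC name or remote PE


def ingress_forward(bridge_table, ingress_ac_color, dst_mac):
    """Return the next hop, None to drop (Leaf to Leaf), or "flood" on a miss."""
    entry = bridge_table.get(dst_mac)
    if entry is None:
        return "flood"  # unknown unicast; BUM filtering is a separate matter
    if ingress_ac_color is Color.LEAF and entry.color is Color.LEAF:
        return None     # Leaf-to-Leaf traffic is filtered at the ingress PE
    return entry.next_hop


# Illustrative table: one remote Leaf MAC and one remote Root MAC behind PE2.
table = {
    "00:00:00:00:00:02": MacEntry("00:00:00:00:00:02", Color.LEAF, "PE2"),
    "00:00:00:00:00:03": MacEntry("00:00:00:00:00:03", Color.ROOT, "PE2"),
}
```

With this table, a frame arriving on a Leaf AC for the Leaf MAC is dropped, while the same frame arriving on a Root AC is forwarded to PE2. Without the colour on the entry, the PE would have no basis for telling the two cases apart.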

   In order to support such ingress filtering on the ingress PE with
   both Leaf and Root ACs, one of the following two approaches can be used:

Swapping A and B would be more natural, since solution B corresponds to the starting point "what we had before this spec".


   A) Color MAC addresses with Leaf (or Root) color before distributing
   them in BGP to other PEs depending on whether it is learned on a Leaf


s/it is/they are/

   AC (or a Root AC)

I think removing the parentheses is needed for the 'whether' statement to parse.


   B) Use two MAC-VRFs (two bridge tables per VLANs) - one for Root ACs
   and another for Leaf ACs.

I think "(two bridge tables per VLANs)" is inexact: "two bridge tables per VLAN if a given VLAN exists on the PE for both Leaf and Root ACs of a given EVI" ?

Similarly, in the following paragraph, I think "per VLAN" should be replaced by "per E-TREE EVI having both Root and Leaf ACs".


   Maintaining two bridge tables per VLAN requires either two lookups be
   performed per MAC address in either direction in case of a miss, or
   duplicating many MAC addresses between the two bridge tables
   belonging to the same VLAN (same E-TREE instance). The duplication of
   MAC addresses is needed for both locally learned and remotely learned
   MAC addresses.

Since it is said above "Maintaining two bridge tables per VLAN requires either two lookups [...] or duplicating many MAC addresses [...]", saying "The duplication of MAC addresses is needed for [...]" is surprising, so I guess the intent is rather "Unless two lookups are made, duplication of MAC addresses would be needed for [...]".

   Locally learned MAC addresses from Leaf ACs need to be
   duplicated onto Root bridge table and locally learned MAC addresses
   from Root ACs need to be duplicated onto Leaf bridge table. Remotely
   learned MAC addresses from Root ACs need to be copied onto both Root
   and Leaf bridge tables.
   Neither double lookups nor MAC duplications
   are considered viable options; therefore, this draft recommends the
   use of MAC address coloring for this scenario as detailed in section
   3.1.

I think that this explanation is too elliptical compared to the strong ("not viable") conclusion. As soon as we talk about implementation details, a more detailed discussion is required on why, and under which assumptions, some things are impossible -- there can be many different ways to implement a dataplane. Without explaining what "two lookups" exactly means in this context, it's hard to follow why it would be required if duplicating MAC addresses is not done, and why it is later concluded to be "not viable":
- doing multiple lookups is far from uncommon on router platforms
- on software platforms the impact of doing multiple lookups can be reduced to mostly zero (e.g. with Open vSwitch that would only impact the first packets of a flow)
- if the dataplane can leverage the colouring information to avoid doing two lookups, then perhaps this hardware ability can be leveraged to support the two MAC-VRFs approach with only one lookup (building one table, marking MAC entries as Leaf entries if they were learned with routes carrying only the Leaf RT?) --- don't misunderstand me: I'm not claiming that this works (I haven't looked closely enough), but simply that the text provided is not sufficient to exclude this kind of solution

The "duplicating MAC addresses" alternative is explained better, but still, nothing is explained on why this is "not viable". It seems to belong rather to the realm of "having a scalability impact", but even looked at in this respect we are not talking about a change of order of magnitude.
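
To put a rough number on the scalability point, here is a back-of-the-envelope Python sketch of the entry counts on a PE having both Leaf and Root ACs. It follows the duplication rules stated in the quoted paragraph (local Leaf and local Root MACs in both tables, remote Root MACs in both tables). Placing remote Leaf MACs in the Root table only is an assumption, based on the Leaf side importing only the Root RT as in the two-RT scheme. The function names and the example counts are purely illustrative.

```python
def two_table_entries(local_leaf, local_root, remote_root, remote_leaf):
    """Total entries when a Leaf table and a Root table are both kept."""
    # Leaf table: local MACs of both kinds plus remote Root MACs.
    leaf_table = local_leaf + local_root + remote_root
    # Root table: everything, including remote Leaf MACs (assumption).
    root_table = local_leaf + local_root + remote_root + remote_leaf
    return leaf_table + root_table


def single_table_entries(local_leaf, local_root, remote_root, remote_leaf):
    """Total entries with one bridge table whose MACs are coloured."""
    return local_leaf + local_root + remote_root + remote_leaf


# Illustrative numbers only, not measurements.
counts = dict(local_leaf=100, local_root=100, remote_root=1000, remote_leaf=1000)
duplicated = two_table_entries(**counts)
coloured = single_table_entries(**counts)
ratio = duplicated / coloured
```

Under these assumptions the two-table total can never reach twice the single-table total, which is consistent with calling it a scalability impact rather than a change of order of magnitude.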


   For this scenario, if for a given EVI, the vast majority of PEs will
   have both Leaf and Root sites attached, even though they may start as
   Root-only or Leaf-only PEs, then a single RT per EVI MAY be used in
   order to alleviate additional configuration overhead associated with

"to alleviate additional configuration overhead associated with ..." -> "to alleviate the configuration overhead associated with ..." ?

   using two RTs per EVI at the expense of having unwanted MAC addresses
   on the Leaf-only PEs.
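
The trade-off in this last paragraph can be sketched in Python as follows. The import policy is an assumption based on how the two-RT scheme of section 2.1 is described in this thread (a Leaf-only PE imports only the Root RT, a PE with any Root AC imports both); `imported_routes` and the sample routes are hypothetical.

```python
def imported_routes(pe_roles, routes, single_rt):
    """Return the MAC routes a PE imports.

    pe_roles: set containing "root" and/or "leaf" (kinds of AC on the PE)
    routes:   list of (mac, origin) tuples, origin being "root" or "leaf"
    """
    if single_rt or "root" in pe_roles:
        # One shared RT, or a PE with Root ACs: every route is imported.
        return list(routes)
    # Two RTs and a Leaf-only PE: only routes carrying the Root RT.
    return [route for route in routes if route[1] == "root"]


routes = [("mac-1", "root"), ("mac-2", "leaf"), ("mac-3", "leaf")]

leaf_only_two_rt = imported_routes({"leaf"}, routes, single_rt=False)
leaf_only_one_rt = imported_routes({"leaf"}, routes, single_rt=True)
mixed_two_rt = imported_routes({"leaf", "root"}, routes, single_rt=False)

# Routes that a Leaf-only PE holds only because a single RT is used.
unwanted = [r for r in leaf_only_one_rt if r not in leaf_only_two_rt]
```

On a PE with both kinds of AC the two policies import the same set, so the single RT only costs something on PEs that stay Leaf-only.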

From: Thomas Morin
Organization: Orange
Date: Thursday, December 15, 2016 at 4:12 AM
To: Cisco Employee, Loa Andersson, "George Swallow -T (swallow - MBO PARTNERS INC at Cisco)", Eric Rosen, BESS
Cc: Martin Vigoureux
Subject: Re: shepherd review of draft-ietf-bess-evpn-etree

Hi Ali,

2016-12-13, Ali Sajassi (sajassi):
2016-12-10, Ali Sajassi (sajassi):
Your suggestion regarding multiple MAC-VRFs per EVI for E-TREE impacts a lot more sections than just section 2.2, for which you suggested some text. It drastically impacts section 3.1 (known unicast traffic), and it also impacts section 3.2 (BUM traffic) and section 5.1.
Can you detail why ?
The understanding that leads me to this suggestion is that the 2-RT+split-horizon scenario in 2.1, then applied to Root/Leaf PEs in a 2.2.1, would not require new protocol procedures nor changes in the text that, as I understand, provides procedures for 2.2(.2) and 2.3.
2nd try. As my 1st response got truncated for some reason.
The reason that it impacts more sections than just sec. 2 is that the proposed 2.2.1 would be an alternative option for section 3.1. In section 3.1, the root/leaf indication for MAC addresses is done via the flag-bit defined in section 5.1, and it only uses a single MAC-VRF (single bridge table per VLAN) per RFC 7432. If we go with two MAC-VRFs (e.g., two bridge tables) per VLAN, then that is an alternative way of doing the same thing described in section 3.1. This alternative way has big ramifications on the platform as it requires duplicating MACs and managing multiple bridge tables per VLAN.
Since 2.1 and the proposed 2.2.1 do not require new protocol procedures (they only require split-horizon locally in Leaf MAC-VRFs), if you state clearly that the procedures in the document are here to address 2.2.2 and 2.3, then you don't need to modify the content of the document after section 2 (more exactly, you will need a minor update like changing the current "This scheme applies to all scenarios described in section 2." in section 3 into "This scheme applies to scenarios described in 2.2.2 and 2.3").
The "big ramifications" above are then not about section 3, but just the (platform-specific) drawbacks of 2.2.2 that we have already discussed and that can be covered in 2.2.2.
Maybe what you really want is to allow for scenario 2.2 to operate with two RTs, which has the benefits of both 2.2.1 and 2.2.2 and none of the drawbacks. So, maybe we can clarify the current text to make sure that this comes out clearly, i.e., a PE with a single MAC-VRF can have multiple RTs.
You could mention that, but for me the key things are:
- documenting the motivation for the new procedures
- not arbitrarily /restricting/ 2.2.2 to one RT (but why not document the identified drawbacks)
Furthermore, it creates a new paradigm that EVPN was never intended for, because of creating two MAC-VRFs (and two bridge tables) for the same VLAN.
The "<new thing> creates a new paradigm that <RFC xyz> was never intended for" is not a generally valid, or sufficiently detailed, argument: if it were, then you might go as far as challenging the whole E-Tree spec on the same kind of grounds (and many other new things).
So here is where it seems we have a gap to bridge: I still don't understand what in RFC7432 describes an intention of "not supporting two MAC-VRFs for the same VLAN".
I tried to explain the relationship between EVI, MAC-VRF, bridge table, and VLAN in my previous email per RFC 7432. However, let's park this discussion for the time being as I think it is secondary.
Ok, feel free to revisit if you think that RFC7432 would preclude procedures that end up being described in this draft.
I think you agree that if we have a single solution that has all the benefits of your proposed 2.2.1 and 2.2.2 and none of the drawbacks, it is much preferable to having two solutions each with its own advantages and drawbacks, right? If so, then the existing text in 2.2 was intended to convey that. However, we can clarify it further, e.g., make it clear that for a PE with root & leaf in the same EVI, we can use a single MAC-VRF with two RTs (one for leaf and another for root).
As said above, my key concern is having the document clearly spell out the motivation for new specs. If this implies documenting the fact that already existing procedures can be used, but have drawbacks, then so be it; there would be no point in hiding that, right?
The WG LC was completed on 3/29/16 and I am sure it is not your intention to have major changes to the doc at this stage where multiple vendors have already implemented the draft.
As you know, there are different stages at which people do reviews on a doc after WGLC, and which may lead doc editors to introduce significant --editorial or technical-- changes in a document. Sometimes that leads to documents going back to the working group.
However my root intention as doc shepherd, of course, is not to propose a major change, but merely to be able to answer the standard questions of the shepherd review -- on the reviews done, on document readiness, and on the document quality -- in a way as positive and sincere as possible. In particular questions (3) (4) and (6).
So, hopefully the answers to these three questions are now clear. I believe your main concern is to ensure that we can apply the two-RT approach of sec. 2.1 to sec. 2.2 (and we can still do that and still have a single MAC-VRF)
See above.
This draft talks about two kinds of traffic filtering: a) ingress filtering for known unicast and b) egress filtering for BUM traffic. What you are suggesting is an alternate mechanism for ingress filtering.
(well I'm not suggesting the mechanism itself --which section 2.1 already does-- but simply to document that it can still apply without the constraint of avoiding the presence of a Root MAC-VRF and a Leaf MAC-VRF on the same PE)
Although having multiple VRFs (and forwarding tables) is fine for IP-VPNs because the unknown traffic is always dropped, multiple VRFs for the same VLAN are not OK for L2 traffic because of flooding of unknown traffic. That’s why in section 6 of RFC 7432, for all service interface types, the draft talks about a single MAC-VRF per EVI per PE and, in case of VLAN-aware mode, multiple VLANs per MAC-VRF but only a single bridge table per VLAN. In other words, the bottom line is that there can only be a single bridge table per VLAN in order to avoid unnecessary flooding.
When you have two MAC-VRFs per VLAN (one for root ACs and another for Leaf ACs), then you either need to duplicate lots of MAC addresses between these two VRFs, or do a lookup on both of these VRFs. Either way this is not a good option relative to keeping a single VRF table for both root and leaf sites and just having a single-bit indication on whether a MAC is associated with root or leaf (as in the approach currently described in the draft). I
In the above, it seems you agree that it can work, and you are able to offer reasons why it is not the preferred option, so why not just document that it can work and provide these reasons as the motivations that lead to proposing a new spec?
Sure, I can do that. [...]
Ok.
I'll be happy to review a new revision and hopefully post the shepherd review.
Thanks,

-Thomas
(it seems you have an unfinished last sentence: "I [...]" )
(assuming the previous point is resolved:)
With this mechanism above, isn't it possible to have on a given PE, for a single E-TREE EVI, both Leaves and Roots, as long as distinct MAC-VRFs are used (one for Leaves and one for Roots)? (it seems to me that the asymmetric import/export RT would do what is needed to build an E-TREE, we would just have a particular case where a Leaf MAC-VRF and a Root MAC-VRF for a given E-TREE end up on a single PE)
That’s not possible because per definition of an EVI, there is only a single MAC-VRF per EVI for a PE.
Where can I read such a definition? (the Terminology section in RFC7432 does not say that, unless I'm missing something).
And that seems a completely arbitrary restriction.
(just thinking that a given PE device can be split into two logical devices shows that it can work)
Section 6 of RFC7432, where it gives definitions for different service interface types, specifies the relationship between MAC-VRF and VLAN (bridge table) and how many MAC-VRFs (and bridge tables) there can be per EVI.
This section of RFC7432 discusses many different things for the different variants. Can you provide a specific pointer about "how many MAC-VRFs can be per EVI"?
Ali> Section 6 of RFC7432 spells out the relationship between EVI, MAC-VRF, and bridge tables for all service interfaces very clearly. In all service interfaces, the RFC says there is one MAC-VRF per EVI on a given PE. Now, if the service interface is “vlan-aware”, then there are several bridge tables for that single MAC-VRF, i.e., one bridge table per VLAN. In all service interfaces, you can ONLY have one bridge table per VLAN.
This answer is anything but a specific pointer.
If Section 6 of RFC7432 says all this very clearly, I guess it should be possible to extract quotes about "there is one MAC-VRF per EVI on a given PE", right?
In the bridging world, there can only be a single bridge table per VLAN in a device.
I still don't find here anything that would preclude having, on a given PE, for a given E-TREE EVI, one Leaves MAC-VRF and one Roots MAC-VRF: can't these two MAC-VRFs use different internal VLANs (with translation if the external VLANs are constrained)?
Ali> Let's assume we are using vlan-based service and thus there is only a single bridge table per MAC-VRF; then what you are suggesting is to use two MAC-VRFs (two bridge tables) for the same EVI (same VLAN). This results in some duplications of MAC addresses and would only work if flooding is disabled (more on this later).
"results in some duplications of MAC" is perhaps a drawback, but nothing like "just does not work"?
"would only work if flooding is disabled": why? (you wrote "(more on this later)" but I couldn't identify anything recent from you in the rest of the email below)
From a helicopter view, I can't see what fundamentally would become problematic between "two MAC-VRFs on two distinct PEs" and the same "two MAC-VRFs on the same PE": at worst it is as efficient or as inefficient as having them on separate PEs (think logical router without any kind of dataplane optimisation), and we can't exclude that the PE could have local implementation details to do better than that.
Besides, I don’t understand what good does it do to have two MAC-VRFs on the same PE (one for Leafs and another for Roots)
Well, the "what is good for" is pretty simple: it means you can have, just by tailoring the import/export policies like in 2.1, something as useful as the scenario in 2.2.
There can only be a single bridge table per VLAN. Now even if you add some kind of logic to form two logical PEs in a single physical PE, you end up replicating all the MAC addresses associated with the root sites in two bridge tables.
Your point above certainly does not sound to me like "it can't be done": some may think that the above is an acceptable cost, some others may find ways to make this "replication" with a low overhead, on some platforms the cost may be negligible, etc.
because Leafs and Roots need to talk to each other and thus wewant them to be in the same MAC-VRF.
The fact that Leafs and Roots need to talk to each other does not mean that they *have* to be in the same MAC-VRF: you can rely on the local MPLS dataplane inside the PE so that the traffic between Roots and Leaves can be passed between a Leaf MAC-VRF and a Root MAC-VRF (and you can possibly implement a shortcut not involving MPLS encap/decap).
Anything is possible but at what cost.
You know, for cost it is not always obvious to reach conclusions that are true for all implementations and all targets.
The current proposal is very efficient in terms of forwarding path as well as control plane.
Sure, but what I question is not the new solution but the lack of discussion on why using the existing specs was not considered good enough.
I think that my concern of clearly explaining the scenarios and motivations for this new spec could be addressed by splitting section 2.2 into a 2.2.1 describing the approach from 2.1 and its possible drawbacks, and a 2.2.2 having essentially the content of current section 2.2.
Here is a proposal:

2.2 Scenario 2: Leaf OR Root site(s) per AC

In these scenarios, a PE receives traffic from either Root OR Leaf
sites (but not both) on a given Attachment Circuit (AC) of an EVI. In
other words, an AC (ES or ES/VLAN) is either associated with Root(s)
or Leaf(s) (but not both).

2.2.1 Scenario 2a: Leaf OR Root site(s) per AC, separate Leaf/Root MAC-VRFs

                     +---------+            +---------+
                     |   PE1   |            |   PE2   |
    +---+            |  +---+  |  +------+  |  +---+  |            +---+
    |CE1+-----ES1----+--+   |  |  |      |  |  |MAC+--+---ES2/AC1--+CE2|
    +---+    (Leaf)  |  |MAC|  |  | MPLS |  |  |VRF|  |   (Leaf)   +---+
                     |  |VRF|  |  |  /IP |  |  '---'  |
                     |  |   |  |  |      |  |  .---.  |
                     |  |   |  |  |      |  |  |MAC|  |            +---+
                     |  |   |  |  |      |  |  |VRF+--+---ES2/AC2--+CE3|
                     |  +---+  |  +------+  |  +---+  |   (Root)   +---+
                     +---------+            +---------+

Figure 2: Scenario 2a
In this scenario, the RT constraint procedures described in section 2.1 could
also be used. The feasibility and efficiency of this approach depend on
platform specifics.
This approach will lead to duplication of a large proportion of MAC addresses
on PEs having both Leaf and Root sites, and is hence considered less suitable
for deployment contexts where the vast majority of PEs are likely to ultimately
have both Leaf and Root sites attached to them.

2.2.2 Scenario 2b: Leaf OR Root site(s) per AC, single MAC-VRF

                     +---------+            +---------+
                     |   PE1   |            |   PE2   |
    +---+            |  +---+  |  +------+  |  +---+  |            +---+
    |CE1+-----ES1----+--+   |  |  |      |  |  |   +--+---ES2/AC1--+CE2|
    +---+    (Leaf)  |  |MAC|  |  | MPLS |  |  |MAC|  |   (Leaf)   +---+
                     |  |VRF|  |  |  /IP |  |  |VRF|  |
                     |  |   |  |  |      |  |  |   |  |            +---+
                     |  |   |  |  |      |  |  |   +--+---ES2/AC2--+CE3|
                     |  +---+  |  +------+  |  +---+  |   (Root)   +---+
                     +---------+            +---------+

Figure 2: Scenario 2b
This scenario will alleviate key drawbacks of Scenario 2a, in particular
by avoiding duplication of MAC addresses on Leaf/Root PEs and avoiding the
operational overhead of managing more than one RT.
This approach comes at the expense of having routes for unneeded MAC
addresses on Leaf-only PEs, and is hence considered less suitable for
deployment contexts where the vast majority of PEs would remain Leaf-only.
Unlike Scenario 1 and Scenario 2a, this scenario requires additional
procedures provided in this document.


(And this last sentence should be added to section 2.3 as well)
   For this scenario, if for a given
   EVI, the majority of PEs will eventually have both Leaf and Root
   sites attached, even though they may start as Root-only or Leaf-only
   PEs, then it is recommended to use a single RT per EVI and avoid
   additional configuration and operational overhead.
Why this recommendation ?
Even with a majority of PEs having both Leaves and Roots, there can remain (up to 49% of) PEs having only Leaves, which will uselessly have all routes to other Leaves.
So "it is recommended" above, deserves to be explained more, I think.

OK, I changed “majority” to “vast majority” :-)
My point was not to nitpick on "majority", but that you should explain why you recommend that. As the text currently reads, the cost of the recommendation can be identified: having useless routes on the fraction of PEs having only Leaves. But the gain brought by the recommendation is not even mentioned, not to say explained.
Hence: why ?
(Why is it a useful tradeoff to have useless routes on some, even if only one, PE?)
Changed the last sentence from:
"then it is recommended to use a single RT per EVI and avoid additional configuration and operational overhead."
To
"then it is recommended to use a single RT per EVI and avoid additional configuration and operational overhead
at the expense of having unwanted MAC addresses on the Leaf PEs."
Ok. I adapted and incorporated this addition into my proposed text splitting 2.2 into a 2.2.1 and a 2.2.2.
Best,

-Thomas
