This is an interesting and potentially useful draft. The major issue I see is that while the draft does not say that it is applicable only to non-segmented IR P-tunnels, I don't see how it will work if IR P-tunnels are segmented at ASBRs or ABRs. The draft seems to use the term "Upstream Multicast Hop" (UMH) to mean "Upstream PE", which would be okay only if segmented IR P-tunnels are out of scope.
Comments in-line, look for ****.

Abstract

RFC 6513 described a method to support bidirectional C-flow using "Partial Mesh of MP2MP P-Tunnels". This document describes how partial mesh of MP2MP P-Tunnels can be simulated with Ingress Replication, instead of a real MP2MP tunnel.

**** I'd add to the abstract that this enables a Service Provider to use
**** Ingress Replication to offer transparent BIDIR-PIM service to its VPN
**** customers.

1. Introduction

Section 11.2 of RFC 6513, "Partitioned Sets of PEs", describes two methods of carrying bidirectional C-flow traffic over a provider core without using the core as RPL or requiring Designated Forwarder election. With these two methods, all PEs of a particular VPN are separated into partitions, with each partition being all the PEs that elect the same PE as the UMH wrt the C-RPA. A PE must discard bidirectional C-flow traffic from PEs that are not in the same partition as the PE itself.

In particular, Section 11.2.3 of RFC 6513, "Partial Mesh of MP2MP P-Tunnels", guarantees the above discard behavior without using an extra PE Distinguisher label by having all PEs in the same partition join a single MP2MP tunnel dedicated to that partition and use it to transmit traffic. All traffic arriving on the tunnel will be from PEs in the same partition, so it will be always accepted.

RFC 6514 specifies BGP encodings and procedures used to implement MVPN as specified in RFC 6513, while the details related to MP2MP tunnels are specified in [draft-ietf-l3vpn-mvpn-bidir-05].

[draft-ietf-l3vpn-mvpn-bidir-05] assumes that an MP2MP P-tunnel is realized either via PIM-Bidir, or via MP2MP mLDP. Each of them would require signaling and state not just on PEs, but on the P routers as well. This document describes how the MP2MP tunnel can be simulated with a mesh of P2P or MP2P LSPs, i.e. Ingress Replication.
**** What is really being proposed is to simulate a MP2MP P-tunnel with a
**** set of P2MP P-tunnels, and then to use Ingress Replication to
**** instantiate each such P-tunnel. The trick is how to get all the PEs to
**** join all the necessary P2MP P-tunnels without requiring each PE to send
**** a Leaf A-D route for each MP2MP P-tunnel to each other PE.

The advantage is that existing P2P/MP2P LSPs created for unicast can be used for multicast as well w/o introducing additional signaling or state in the core. While there may be concerns with traffic replication in the core, in many situations the traffic could be low-rate and/or sporadic and the advantage of signaling and state savings will outweigh the concerns with traffic replication, making Ingress Replication an applicable and attractive alternative.

**** It might be better simply to say that this scheme has both the
**** advantages and the disadvantages of Ingress Replication in general.

This document specifies the BGP signaling and procedures used to simulate "Partial Mesh of MP2MP P-Tunnels" with Ingress Replication.

...

3. Operation

3.1. Control State

If a PE, say PEx, is connected to a site of a given VPN, and that site hosts the C-RPA for some Bidir-PIM groups, i.e., the route to the C-RPA is through a local PE-CE interface,

**** I think the actual condition is that PEx's next hop interface to some
**** C-RPA is a VRF interface. This is not exactly the same thing as being
**** connected to a site that "hosts a C-RPA".

then PEx MUST advertise a (C-*,C-BIDIR) S-PMSI A-D route, regardless of whether it has any local Bidir-PIM join states corresponding to the C-RPA learned from its CEs. It MAY also advertise a (C-*,C-G-BIDIR) S-PMSI

**** "advertise a" --> "advertise one or more"

A-D route, just like how any other S-PMSI A-D routes are triggered (e.g., when the (C-*,C-G-BIDIR) traffic rate goes above a threshold).
**** It's worth pointing out that applying a traffic rate threshold to a
**** (C-*,C-G-BIDIR) state would require measuring the traffic in both
**** directions, as the sources are not necessarily local. For IR
**** P-tunnels, it might also be necessary to take the fanout into account.

Here the C-G-BIDIR refers to a C-G where G is a Bidir-PIM group, and the corresponding C-RPA is in the site that the PEx connects to. The S-PMSI A-D routes include a Provider Tunnel Attribute (PTA) with

**** "PMSI Tunnel attribute"

tunnel type set to Ingress Replication, with Leaf Information Required flag set, and with a downstream allocated MPLS label that other PEs in the same partition MUST use when sending relevant C-bidir flows to this PE.

**** and with the Tunnel Identifier field in the PTA set to a routable
**** address of the originator?

**** Can the MPLS label be shared with any other P-tunnels? Perhaps all
**** the (C-*,C-BIDIR) and (C-*,C-G-BIDIR) S-PMSI A-D routes originated by a
**** given PE can (optionally) share a label?

If some other PE, PEy, receives and imports into one of its VRFs such a (C-*,C-BIDIR) S-PMSI A-D route,

**** I'm not sure just what is meant by "such a (C-*,C-BIDIR) S-PMSI A-D
**** route". Does this mean "any (C-*,C-BIDIR) S-PMSI A-D route whose PTA
**** specifies an IR P-tunnel"?

and the VRF has any local Bidir-PIM join state that PEy has received from its CEs, and if PEy chooses PEx as its UMH wrt the C-RPA for those states, PEy MUST advertise a Leaf A-D route in response. Or, if PEy has received and imported into one of its VRFs a (C-*,C-BIDIR) S-PMSI A-D route from PEx before, then upon receiving in the VRF any local Bidir-PIM join state from its CEs with PEx being the UMH for those states' C-RPA, PEy MUST advertise a Leaf A-D route.
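**** As I read them, the two sentences above describe the same condition
**** evaluated on two different triggers (route import vs. arrival of local
**** join state). A minimal sketch of that reading follows. The names
**** SPmsiRoute and must_send_leaf_ad are hypothetical, and restricting the
**** check to IR P-tunnels with the Leaf Information Required flag set is
**** my assumption about what "such a" route means, not something the
**** draft states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SPmsiRoute:
    """Hypothetical view of an imported (C-*,C-BIDIR) S-PMSI A-D route."""
    originator: str            # originating router's IP address from the NLRI
    tunnel_type: str           # "IR" stands for Ingress Replication here
    leaf_info_required: bool   # Leaf Information Required flag in the PTA


def must_send_leaf_ad(route: SPmsiRoute,
                      upstream_pe_for_rpa: str,
                      has_local_bidir_join: bool) -> bool:
    """True when PEy MUST originate a Leaf A-D route for `route`.

    The result does not depend on which event happened first, so the same
    check can be re-run on route import and on local join state changes.
    """
    return (
        route.tunnel_type == "IR"                      # assumption, see above
        and route.leaf_info_required                   # assumption, see above
        and has_local_bidir_join                       # join state from CEs
        and upstream_pe_for_rpa == route.originator    # PEx is the chosen one
    )
```

**** If this reading is right, the draft could state the condition once and
**** list the events on which it is re-evaluated.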
The encoding of the Leaf A-D route is as specified in RFC 6514, except that the Route Targets are set to the same value as in the corresponding S-PMSI A-D route so that the Leaf A-D route will be imported by all VRFs that import the corresponding S-PMSI A-D route.

**** I take the "except" clause to mean that RFC 6514's rules for setting
**** the Leaf A-D route's RTs are not followed, and that the RTs are instead
**** just copied from the S-PMSI A-D route. Is that the intention?

**** This means that the Leaf A-D route will not have an RT that is created
**** from the Next Hop or P2MP Segmented Next Hop EC, which essentially
**** means that all the P-tunnels will be non-segmented. Is that the
**** intention?

**** RFC 6514 says that a PE/ASBR should take no action with regard to a
**** Leaf A-D route unless that Leaf A-D route carries an IP Address
**** Specific RT identifying the PE/ASBR. This draft should make it very
**** clear that it is changing the RFC 6514 procedures for the case where the
**** route key of a Leaf A-D route identifies a (C-*,C-BIDIR) or a
**** (C-*,C-G-BIDIR) S-PMSI.

**** It's not clear to me how these procedures would coexist with the
**** segmentation procedures that ordinarily occur at ABRs or ASBRs. Are
**** ABRs/ASBRs supposed to modify the next hop and/or segmented next hop
**** extended communities of the S-PMSI A-D routes that are about BIDIR
**** groups? If not, but if segmentation is to be applied to S-PMSI A-D
**** routes that are not about BIDIR groups, how will the ABRs/ASBRs know
**** which are which? I.e., how exactly do the procedures of this draft
**** coexist with the ABR/ASBR segmentation procedures that apply to
**** non-BIDIR S-PMSIs?

**** Note that there is up to now no requirement that an S-PMSI A-D route
**** originated from a particular VRF carry any of that VRF's import RTs.
**** I think that requirement needs to be added to this draft; otherwise the
**** Leaf A-D routes originated in response to an S-PMSI A-D route won't
**** necessarily be imported into the originating VRF of the S-PMSI A-D
**** route. Alternatively, one could require that the Leaf A-D route carry
**** an IP Address Specific RT identifying the S-PMSI route's originator (as
**** learned from the NLRI) in its Global Administrator field.

This is irrespective of whether from a receiving PE, PEz's perspective PEx (originator of the S-PMSI A-D route) is the UMH PE or

**** Is "PEz" supposed to be "PEy"?

**** It would be better to say "Upstream PE" than "UMH" or "UMH PE", as the
**** UMH could presumably be an ABR or ASBR.

not. The label in the PTA of the Leaf A-D route originated by PEy MUST be allocated specifically for PEx, so that when traffic arrives with that label, the traffic can be associated with the partition (represented by the PEx).

**** This doesn't make it clear whether the label can be shared with other
**** S-PMSIs (e.g., P2MP S-PMSIs) that originate from PEx. I think the
**** answer is yes, at least in non-extranet cases.

**** I think the draft should specify that the originator (the upstream PE)
**** is identified from the "originating router's IP address" field of the
**** NLRI of the S-PMSI A-D route.

With PEy advertising Leaf A-D route only if it chooses the originator of the S-PMSI A-D route as its UMH, it won't receive traffic from PEs

**** "UMH" vs. "Upstream PE" again.

in other partitions, so the label is actually useful only when PEy switches to a different UMH - it will stop accepting traffic before sending PEs stop sending it traffic (upon the receipt of its Leaf A-D route withdrawal).

**** I don't see why it is said that "PEy ... won't receive traffic from PEs
**** in other partitions". PE1, for example, may choose PE2 as the Upstream
**** PE for (C-*,C-G1-BIDIR) while choosing PE3 as the Upstream PE for
**** (C-*,C-G2-BIDIR).
**** PE4 may make the opposite choices. In that case PE1
**** and PE4 may both originate Leaf A-D routes with NLRI <C-*,C-BIDIR, PE2>
**** and <C-*,C-BIDIR,PE3>. As a result, PE2 and PE3 would get the
**** (C-*,C-G1-BIDIR) and (C-*,C-G2-BIDIR) flows from both partitions, and
**** they would need to use the label to determine which copy of each C-flow
**** to drop.

To speed up convergence (so that PEy starts receiving traffic from its new UMH immediately instead of waiting until the new Leaf A-D route corresponding to the new UMH is received by sending PEs), PEy MAY advertise a Leaf A-D route even if it does not choose PEx as its UMH wrt the C-RPA. With that, it will receive traffic from all PEs, but some will arrive with the label corresponding to its choice of UMH while some will arrive with a different label, and the traffic in the latter case will be discarded.

**** This might be useful as a form of live-live redundancy, but I don't
**** think it is a very efficient way to speed up convergence when a
**** receiving PE decides to switch the partition over which it receives a
**** particular C-flow. When the receiving PE originates the Leaf A-D route
**** for the new partition, it just has to keep forwarding the C-flow
**** received from the old partition for a certain period of time, while
**** discarding that C-flow when received from the new partition; when that
**** period of time is up, it can start discarding the C-flow when received
**** from the old partition and start forwarding it when received from the
**** new. (A larger delay should be imposed on transmitters, so they
**** continue transmitting to a particular receiver for a period of time
**** after that receiver withdraws its Leaf A-D route.) This would provide
**** a more efficient "make before break" type of procedure.
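**** To make the "make before break" suggestion concrete, here is a minimal
**** sketch of the timing logic I have in mind. The class and function
**** names, the use of plain numeric timestamps, and the two timer
**** parameters (hold_time, sender_delay) are all illustrative assumptions;
**** nothing here is taken from the draft.

```python
from typing import Optional


class PartitionSwitch:
    """Receiver-side state while moving a C-flow from one partition to another."""

    def __init__(self, old_upstream: str, new_upstream: str,
                 started_at: float, hold_time: float) -> None:
        self.old_upstream = old_upstream
        self.new_upstream = new_upstream
        self.started_at = started_at    # when the new Leaf A-D route was originated
        self.hold_time = hold_time      # how long to keep using the old partition

    def active_upstream(self, now: float) -> str:
        """Partition whose copy of the C-flow is forwarded at time `now`."""
        if now < self.started_at + self.hold_time:
            return self.old_upstream
        return self.new_upstream

    def accepts(self, upstream: str, now: float) -> bool:
        """Forward traffic from `upstream`'s partition, discard everything else."""
        return upstream == self.active_upstream(now)


def sender_keeps_sending(withdrawn_at: Optional[float], now: float,
                         sender_delay: float) -> bool:
    """Transmitter side: keep sending for `sender_delay` after a Leaf A-D withdrawal."""
    return withdrawn_at is None or now < withdrawn_at + sender_delay
```

**** For this to avoid loss, sender_delay would need to be larger than
**** hold_time, so that the old partition keeps delivering traffic until the
**** receiver has switched over.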
Similar to the (C-*,C-BIDIR) case, if PEy receives and imports into one of its VRFs such a (C-*,C-G-BIDIR) S-PMSI A-D route, and PEy chooses PEx as its UMH wrt the C-RPA, and it has corresponding local (C-*,C-G-BIDIR) join state that it has received from its CEs in the VRF, PEy MUST advertise a Leaf A-D route in response. Or, if PEy has received and imported into one of its VRFs a (C-*,C-G-BIDIR) S-PMSI A-D route before, then upon receiving its local (C-*,C-G-BIDIR) join state from its CEs in the VRF, it MUST advertise a Leaf A-D route.

The encoding of the Leaf A-D route is as specified in RFC 6514, except that the Route Targets are set to the same as in the corresponding S-PMSI A-D route so that the Leaf A-D route will be imported by all VRFs that import the corresponding S-PMSI A-D route.

**** See prior comment re RTs.

This is irrespective of whether from the receiving PE, PEz's

**** PEz? Is that supposed to be PEy?

perspective PEx (originator of the S-PMSI A-D route) is the UMH PE or not. The label in the PTA of the Leaf A-D route originated by PEy MUST be allocated specifically for PEx, so that when traffic arrives with that label, the traffic can be associated with the partition (represented by the PEx).

**** See prior comment on label granularity.

Whenever the (C-*,C-BIDIR) or (C-*,C-G-BIDIR) S-PMSI A-D route is withdrawn, or if PEy no longer chooses the originator PEx as its UMH wrt C-RPA and PEy only advertises Leaf A-D routes in response to its UMH's S-PMSI A-D route, or if relevant local join state is pruned, PEy MUST withdraw the corresponding Leaf A-D route.

**** I think this "MUST" is too strong, and in fact it contradicts what is
**** said above "PEy MAY advertise a Leaf A-D route even if it does not
**** choose PEx as its UMH wrt the C-RPA".
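**** For reference, this is how I understand the per-upstream-PE label rule
**** and the resulting discard behavior on a receiving PE. It is only a
**** sketch under my own assumptions: the ReceiverVrf class, the label
**** numbering, and the per-group choice of upstream PE are illustrative,
**** and label sharing with other S-PMSIs is not modeled.

```python
from typing import Dict


class ReceiverVrf:
    """Hypothetical receiver-side VRF state for IR-simulated partitions."""

    def __init__(self, first_label: int = 1000) -> None:
        self._next_label = first_label             # arbitrary starting value
        self._label_to_upstream: Dict[int, str] = {}
        self._upstream_to_label: Dict[str, int] = {}
        self.chosen_upstream: Dict[str, str] = {}  # C-G -> chosen upstream PE

    def label_for(self, upstream_pe: str) -> int:
        """Label advertised in the Leaf A-D route sent in response to `upstream_pe`."""
        if upstream_pe not in self._upstream_to_label:
            label = self._next_label
            self._next_label += 1
            self._upstream_to_label[upstream_pe] = label
            self._label_to_upstream[label] = upstream_pe
        return self._upstream_to_label[upstream_pe]

    def accept(self, label: int, c_group: str) -> bool:
        """Accept a packet only if its label maps to the partition chosen for the group."""
        upstream = self._label_to_upstream.get(label)
        return upstream is not None and self.chosen_upstream.get(c_group) == upstream


# PE1 from the example above: PE2 is upstream for G1, PE3 is upstream for G2.
pe1 = ReceiverVrf()
label_pe2 = pe1.label_for("PE2")
label_pe3 = pe1.label_for("PE3")
pe1.chosen_upstream["G1"] = "PE2"
pe1.chosen_upstream["G2"] = "PE3"
```

**** With this state, PE1 accepts G1 traffic only when it arrives with the
**** label allocated for PE2, and G2 traffic only with the label allocated
**** for PE3, which is the per-flow discard decision the label is needed for.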
