Re: [bess] WG adoption and IPR poll for draft-zzhang-bess-bgp-multicast-03

Jeffrey (Zhaohui) Zhang Fri, 24 Jan 2020 10:33:05 -0800

Hi Gyan,

Please see zzh4> below.

From: Gyan Mishra <[email protected]>
Sent: Thursday, January 9, 2020 7:22 PM
To: Jeffrey (Zhaohui) Zhang <[email protected]>
Cc: [email protected]; [email protected]; 
[email protected]; [email protected]
Subject: Re: [bess] WG adoption and IPR poll for 
draft-zzhang-bess-bgp-multicast-03

In-line comments

On Wed, Jan 8, 2020 at 1:48 PM Jeffrey (Zhaohui) Zhang 
<[email protected]> wrote:
Hi Gyan,

Please see zzh3> below. I trimmed some text.

From: Gyan Mishra <[email protected]>
Sent: Wednesday, January 8, 2020 2:50 AM
To: Jeffrey (Zhaohui) Zhang <[email protected]>
Cc: [email protected]; [email protected]; 
[email protected]; [email protected]
Subject: Re: [bess] WG adoption and IPR poll for 
draft-zzhang-bess-bgp-multicast-03

Hi Jeffrey

Please see inline Gyan>

Gyan> I actually read RFC 7938 when I was redesigning a data center 
architecture for stability using an L3 smaller-fault-domain design.  This BGP 
signaling of trees feature has to be used with eBGP and not iBGP, as iBGP 
requires an IGP, which would then be in the RIB for the RPF path, so it would 
not work; thus the "no IGP" requirement as per RFC 7938.  If you had directly 
connected iBGP peers, not loopback-to-loopback, so that you don't need an IGP, 
could the BGP signaled tree feature still work?  In theory your spine and leaf 
could all be directly connected iBGP peers sitting in one AS without an IGP.  
This would eliminate the need to have ASNs deployed.

Zzh3> This draft does work for both eBGP and iBGP:

   How the BGP peer sessions are provisioned, whether EBGP or IBGP,
   whether statically, automatically (e.g., based on IGP neighbor
   discovery), or programmably via an external controller, is outside
   the scope of this document.

   In case of IBGP, it could be that every router peering with Route
   Reflectors, or hop by hop IBGP sessions could be used to exchange
   C-MCAST NLRIs for joins.  In the latter case, unless desired
   otherwise for reasons outside of the scope of this document, the hop
   by hop IBGP sessions SHOULD only be used to exchange C-MCAST NLRIs.

   Gyan> I am on the same page with you on the draft as it has a lot of merit.  
I like the concept of leveraging BGP for multicast and using the proven MVPN 
procedures to instantiate PMSI trees, now with hard state end to end.

Few comments below:

The main objective of this draft, which should be incorporated in the 
Introduction, is to provide an option for both service providers and 
enterprises that do not want to maintain soft state of multicast trees (native 
PIM ASM or SSM based trees, or MPLS based instantiated S-PMSI trees) to use BGP 
multicast hard state to send joins/prunes using proven service provider MVPN 
procedures, and to provide a means of network based source discovery for both 
ASM and SSM.

Zzh4> The introduction section has the following:

   1.  Introduction
       <https://tools.ietf.org/html/draft-zzhang-bess-bgp-multicast-02#section-1>
   1.1.  Motivation
   1.1.1.  Native/unlabeled Multicast
   1.1.2.  Labeled Multicast

Zzh4> It does cover the above points.

Would Default MDT UI/MI-PMSI instantiated trees (types 1 and 2) be considered 
hard state, since the default MDT tree is always up?  They could serve as an 
example to describe what is meant by hard state.

Zzh4> Hard state vs. soft state means whether the state needs to be refreshed.
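
The distinction can be illustrated with a minimal sketch. It is not taken from
the draft; the class names are hypothetical, and the 210-second holdtime is
only an illustrative default. Soft state expires unless it is refreshed, while
hard state persists until it is explicitly withdrawn.

```python
class SoftStateTable:
    """Entries expire unless refreshed within `holdtime` seconds (PIM-like)."""

    def __init__(self, holdtime=210.0):
        self.holdtime = holdtime   # illustrative value, an assumption
        self._expiry = {}          # (source, group) -> expiry time

    def join(self, key, now):
        # A join and a periodic refresh are the same operation:
        # both push the expiry time forward.
        self._expiry[key] = now + self.holdtime

    def active(self, now):
        return {k for k, t in self._expiry.items() if t > now}


class HardStateTable:
    """Entries persist until explicitly withdrawn (BGP-like)."""

    def __init__(self):
        self._entries = set()

    def join(self, key):
        self._entries.add(key)

    def prune(self, key):
        self._entries.discard(key)

    def active(self):
        return set(self._entries)
```

In the soft-state table a join that is never refreshed silently disappears
after the holdtime; in the hard-state table it stays until `prune` is called.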

What is confusing is the mention in the introduction of the target deployment 
being data center BGP-only RFC 7938 deployments.  Since the BGP multicast 
procedures can use either eBGP or iBGP, with or without RRs, I would put the 
target deployments or use cases in a separate section.  I think the reason why 
7938 is mentioned is that in cases where BGP is solely used and protocol 
elimination is the goal, similar to the P router “BGP free core”, this allows 
for the elimination of PIM as well.  It does add confusion as part of the 
intro, since it makes the reader think that this is an eBGP-only option for 
this feature, which it is not.

Zzh4> The text says “ONE target deployment”. There could be more. But I will 
change the wording to “one deployment scenario”. The following paragraph and 
sub-sections expand to other motivations.

Some operators don’t have an issue with the IP or labeled soft state with tree 
based models

Zzh4> Right – that’s why it says “some Data Center operators have been avoiding 
deploying multicast in their networks”.

Since the objective of this draft is to provide an alternative solution for 
operators, as an “option” if so desired, and not a means to eliminate tree 
based protocols such as PIM or mLDP, I would state it as such.

Zzh4> This document makes no mention of segment routing (where people talk 
about eliminating core state) and has the following in the abstract:

   This document specifies a BGP address family and related procedures
   that allow BGP to be used for setting up multicast distribution
   trees.

I don’t think tree based protocols are going away any time soon but I think 
it’s good to clarify that the direction from a standards perspective is not to 
eliminate but provide the optional flexibility for operators.

I think it’s a good idea to define what is meant by hard state and soft state.

Zzh4> Hard state and soft state are generally well-known, and I do have the 
following:

   o  Periodical protocol state refreshes due to soft state nature.

There may be reasons why operators desire to keep the soft state if they don’t 
have much native or labeled state, or, for telemetry or tracking purposes, 
desire to maintain tree state to see how many trees are active within the 
network.  For scalability, in cases where soft state management is an issue, 
they now have an option.

Zzh4> If they don’t have concerns that are addressed by this solution they 
certainly don’t have to use this. This is just an option.

ASM was built with LHR network based source discovery with MSDP, and to that 
end it did add complexity, but it allowed a lot of flexibility in that the MRIB 
contained all available groups, so any receiver could join any group and any 
source could start streaming if they so desired.  Controls were built into ASM 
for that reason, to limit what groups could be serviced by an RP via ACL and 
what sources could stream with PIM accept register.  Controls were also put in 
place for MSDP SA propagation with SA filtering.  So the added flexibility of 
network based discovery was nice with ASM, however with a major trade off of 
complexity as well as added controls as to what valid sources can stream and 
what receivers can join what stream.

Zzh4> ASM does not require MSDP. The real problem of ASM is the complexity of 
source discovery (which involves RP, register and RPT to SPT switchover). The 
solution in this draft addresses the complexity.

SSM was designed w/o network based discovery, to eliminate the complexity of 
all the knobs that provide those controls built into ASM, for the gain of 
simplicity: thus application based network discovery.  With SSM all the 
controls are now with the application/server layer and removed from the 
network.  With application based source discovery at the application layer, a 
channel list can now be provided, which was done previously by legacy multicast 
apps such as SDR, which detected the active groups on the network.

I think it may be worthwhile mentioning the reasons why SSM was built without 
network based discovery, since this draft will now be adding the capability of 
network based source discovery to SSM.

Zzh4> That is outside the scope of this document. It’s better to be focused.

Regarding the multicast NLRI, I was thinking about how that would work. The 
reason why the group is not needed is that the group is known with the ASM 
model, and if you have an ASM and SSM overlay the group is known by the app 
layer, so the network just needs the source to be learned for the LHR S,G join. 
Maybe worth mentioning in the draft.

Zzh4> The NLRIs do have (*,g) or (s,g) information. The LHR does need to know 
which group it needs to join, and that is a given.

In the introduction I would remove soft state and tree building protocols as a 
reason why data centers avoid enabling multicast on the network.  There may be 
some isolated corner cases, however predominantly PIM soft state, ASM or SSM, 
has never been an issue.  To that end, most data center server deployments rely 
on multicast for clustering and data replication.  Also, server based 
applications that can utilize multicast save on high bandwidth server to server 
east to west intra data center flows.  I do think BGP multicast would be an 
improvement and would add stability for data centers requiring multicast.

Zzh4> It is true some DC operators don’t like multicast trees. It’s true some 
others don’t mind having them. It’s ok to mention that this solution could suit 
whoever needs multicast yet doesn’t want to run PIM.

The draft mentions that earlier versions of this draft used C-multicast routes 
for signaling, which was changed to S-PMSI/Leaf Type 4 routes so we can set up 
the tree from the FEC root instead of the leaves.  This is similar to a P2MP TE 
P-Tunnel, where the root advertises BGP route type 3 with the “leaf info 
required” bit set; when the receiver PE gets the route it responds with a type 
4 leaf-ad.  Is that correct?

Zzh4> S-PMSI routes are used for two purposes in this document – see 2.2.5 and 
2.2.6.

For the labeled use case, can this be used for all P-tunnel types: MP2MP and 
P2MP mLDP, P2MP TE?  Would PIM Rosen GRE still be a valid supported P-tunnel?  
It may be good to mention all the P-tunnels supported for labeled BGP 
multicast.

Zzh4> This is about establishing native/labeled multicast trees. The resulting 
trees can be used for MVPN (BGP-MVPN or Rosen MVPN) as P-tunnels, but in the 
context of this document, you can just forget about P-tunnels.

As far as the MVPN procedures for instantiation of S-PMSI trees, are they being 
completely reused from RFC 6513 and RFC 6514, with all the same route types 
supporting ASM and SSM, types 3-7?  Only types 1 and 2 for default MDT 
instantiation of I-PMSI trees are not supported.

Zzh4> Other than that some NLRI types are similar to MVPN, you can ignore its 
relevance with MVPN.
Zzh4> Thanks!
Zzh4> Jeffrey

Correct?

Kind regards,

Gyan

Zzh3> RFC 7938 uses eBGP which does not require IGP, so the following text is 
perfect (after removing P2MP tunnel wording):

   This section provides some motivation for BGP signaling for native
   and labeled multicast.  One target deployment would be a Data Center
   that requires multicast but uses BGP as its only routing protocol
   [RFC7938].  In such a deployment, it would be desirable to support
   multicast by extending the deployed routing protocol, without
   requiring the deployment of tree building protocols such as PIM,
   mLDP, RSVP-TE P2MP, and without requiring an IGP.

Zzh3> Then the following talks about other scenarios beyond DC:

   Additionally, compared to PIM, BGP based signaling has several
   advantage as described in the following section, and may be desired
   in non-DC deployment scenarios as well.

Zzh3> I will change “compared to PIM” to “compared to PIM/mLDP”.

Gyan> So with this feature the last hop router signals join similar to mLDP 
inband via BGP and the join is sent via BGP signalled tree.

Zzh3> It’s the PIM joins from LHRs or mLDP label mappings from leaves replaced 
with BGP messages, not that “the join is sent via BGP signalled tree”.

 Gyan>  With the BGP trees using the same MVPN mLDP procedures, is there a 
concept of I-PMSI inclusive trees (c-tree, p-tree), so that a single P2MP or 
MP2MP tree can be shared by all groups, or is it a 1-1 mapping of group to 
tree?

Zzh3> MVPN/EVPN is for overlay: multiple C-flows can be transported over a 
single I/S-PMSI, which is instantiated by underlay tunnels. This draft is about 
establishing trees/tunnels, which can instantiate MVPN/EVPN I/S-PMSI (among 
other things).

Gyan>  Is there any way to minimize per GDA state with BGP trees?

Zzh3> What does GDA mean? Group Destination Address?
Zzh3> Anyway, so far the only efficient replication solution w/o incurring 
per-tree/tunnel state is BIER. BGP-signaled multicast does still have 
per-tree/tunnel state just like with PIM/mLDP. The only difference is how the 
state is signaled.

 Gyan> For the BGP trees, is it possible to use the same MVPN BGP A-D and 
c-tree/p-tree Type 6 & 7 routes for BGP multicast c-signalling?

Zzh3> This draft uses a different address family and new route types similar to 
MVPN type-3/4 routes instead of type-6/7 routes:

   The joins are carried in BGP Updates with MCAST-TREE SAFI and S-PMSI/
   Leaf A-D routes defined in this document.  The updates are targeted
   at the upstream neighbor by use of Route Targets.  [Note - earlier
   version of this draft uses C-multicast route to send joins.  We're
   now switching to S-PMSI/Leaf routes for three reasons. a) when the
   routes go through RRs, we have to distinguish different routes based
   on upstream router and downstream router.  This leads to Leaf routes.
   b) for labeled bidirectional trees, we need to signal "upstream fec".
   S-PMSI suits this very well. c) we may want to allow the option of
   setting up trees from the roots instead of from the leaves.  S-PMSI
   suits that very well.]

   Gyan> Doing so, could you leverage the I-PMSI inclusive tree MVPN feature so 
you don't have per GDA state?

Zzh3> As explained earlier, I-PMSI is irrelevant.

    Gyan> Source discovery is only necessary with ASM, not SSM. With SSM the 
receiver is "source" aware, so it does not require any discovery mechanism.
SSM requires IGMPv3 enabled on the receiver last hop router subnets and on the 
source first hop router subnet for both to be "source aware", so that the 
receiver can send the (S,G) join for the channel. The receiver gets that source 
awareness from the server URI that the user connects to, which has the S,G 
information; the server also has to be source aware and have an S,G channel 
available that can be joined. With IGMPv3 the packet accommodates the source 
information in the S,G join sent along the RPF path to the source. You mention 
that SSM deployment has been limited, but in fact the opposite is true, which 
is the reason why ASM is being officially deprecated by the IETF for inter 
domain multicast routing. IPv6 does not even have MSDP support: ASM-style MSDP 
source discovery and propagation are not necessary, and since no RPs exist, all 
disparate ASM multicast domains can now be collapsed into a single SSM domain. 
ASM MSDP/Anycast has its complexities, which is why IPv6 nixed the idea of 
integrating MSDP into the architecture. Thus IPv6 only supports SSM for 
inter-domain multicast routing. I would keep the comment about ASM complexity, 
which is true, but remove the mention of SSM.  I would not mention any gains 
with less state, as you would still have to maintain IGMP join state with BGP, 
with 1-1 mappings of GDA to tree, so the tree state is not being eliminated.

Zzh3> To do SSM you need to know sources ahead of time.
Zzh3> In a true ASM scenario, there are multiple sources sending to the same 
group and receivers don’t necessarily know which sources will be sending. Even 
though for some applications the receivers can get that source information from 
some servers/URIs (which is what I refer to as “application based source 
discovery”), there are still many situations where the receivers just want to 
do a (*,g) IGMP join and leave the source discovery to the network.
Zzh3> As for deprecating inter-domain ASM, please note the following:

   This document does not make any statement on the use of ASM within a
   single domain or organisation, and therefore does not preclude its
   use.  Indeed, there are application contexts for which ASM is
   currently still widely considered well-suited within a single domain.

Zzh3> More importantly, to use SSM you need to know sources first: either the 
receivers somehow learn/know the sources or the network will figure it out. 
This draft provides a way for the LHRs to figure out where the sources are and 
then apply SSM procedures.
Zzh3> Notice that we’re not saying SSM is bad. Rather, SSM is what we want to 
do, but the draft is about BGP-SSM (a step up from PIM-SSM) with BGP-based 
source discovery.
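
A minimal sketch of that idea, not taken from the draft: once the LHR has
learned which sources are active for a group, a receiver's (*,g) join can be
turned into per-source (s,g) joins. The function name and the `discovered`
mapping are hypothetical; how sources are actually advertised in BGP is
defined by the draft and is not modeled here.

```python
from typing import Dict, List, Set, Tuple


def expand_star_g_join(
    group: str, discovered: Dict[str, Set[str]]
) -> List[Tuple[str, str]]:
    """Turn a receiver's (*,g) join into (s,g) joins for known sources.

    `discovered` maps a group address to the set of source addresses the
    LHR has learned for it (assumed to be populated by the network).
    """
    sources = discovered.get(group, set())
    # Sorted only to make the result deterministic.
    return [(source, group) for source in sorted(sources)]
```

If no source has been discovered for the group yet, the result is empty and
the LHR has nothing to join until a source is learned.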

   Gyan> "While PIM-SSM removes the complexity of PIM-ASM, it requires that
   multicast sources are known apriori.  There have not been a good way
   of discovering sources, so its deployment has been limited."

Zzh3> To clarify, the above text is not implying to move away from SSM. Rather, 
it is to explain why we introduce network-based source discovery via BGP in 
this draft so that SSM can be used w/o requiring application-based source 
discovery.

Zzh3> Thanks!
Zzh3> Jeffrey
--
Gyan  Mishra
Network Engineering & Technology
Verizon
Silver Spring, MD 20904
Phone: 301 502-1347
Email: [email protected]

_______________________________________________
BESS mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/bess
