Hi Daniel,
thank you for the review, comments, and helpful suggestions. I'll work on
answering questions and addressing comments and respond soon.


On Fri, Oct 23, 2020 at 5:36 AM Daniel Migault via Datatracker <
nore...@ietf.org> wrote:

> Reviewer: Daniel Migault
> Review result: Has Nits
> Hi,
> I reviewed this document as part of the Security Directorate's ongoing
> effort to
> review all IETF documents being processed by the IESG.  These comments were
> written primarily for the benefit of the Security Area Directors.  Document
> authors, document editors, and WG chairs should treat these comments just
> like
> any other IETF Last Call comments.  Please note also that my expertise in
> BGP is
> limited, so feel free to take these comments with a pinch of salt.
> Review Results: Has Nits
> Please find my comments below.
> Yours,
> Daniel
>                   Multicast VPN Fast Upstream Failover
>                  draft-ietf-bess-mvpn-fast-failover-11
> Abstract
>    This document defines multicast VPN extensions and procedures that
>    allow fast failover for upstream failures, by allowing downstream PEs
>    to take into account the status of Provider-Tunnels (P-tunnels) when
>    selecting the Upstream PE for a VPN multicast flow, and extending BGP
>    MVPN routing so that a C-multicast route can be advertised toward a
>    Standby Upstream PE.
> <mglt>
> Though it might be just a nit, if MVPN
> designates multicast VPN, it might be
> clarifying to specify the acronym in the
> first sentence. This would later make
> the correlation with BGP MVPN clearer.
> </mglt>
> 1.  Introduction
>    In the context of multicast in BGP/MPLS VPNs, it is desirable to
>    provide mechanisms allowing fast recovery of connectivity on
>    different types of failures.  This document addresses failures of
>    elements in the provider network that are upstream of PEs connected
>    to VPN sites with receivers.
> <mglt>
> I am familiar with neither BGP nor
> MPLS. It seems that BGP/MPLS IP VPNs
> and MPLS/BGP IP VPNs are both used. I am
> wondering if there is a distinction
> between the two and a preferred way to
> designate these VPNs.  My understanding
> is that the VPN-IPv4 characterizes the
> VPN while MPLS is used by the backbone
> for the transport.  Since the PEs are
> connected to the backbone the VPN-IPv4
> needs to be labeled.
> </mglt>
>    Section 3 describes local procedures allowing an egress PE (a PE
>    connected to a receiver site) to take into account the status of
>    P-tunnels to determine the Upstream Multicast Hop (UMH) for a given
>    (C-S, C-G).  This method does not provide a "fast failover" solution
> <mglt>
> I understand the limitation is due to
> BGP convergence.
> </mglt>
>    when used alone, but can be used together with the mechanism
>    described in Section 4 for a "fast failover" solution.
>    Section 4 describes protocol extensions that can speed up failover by
>    not requiring any multicast VPN routing message exchange at recovery
>    time.
>    Moreover, section 5 describes a "hot leaf standby" mechanism, that
>    uses a combination of these two mechanisms.  This approach has
>    similarities with the solution described in [RFC7431] to improve
>    failover times when PIM routing is used in a network given some
>    topology and metric constraints.
> [...]
> 3.1.1.  mVPN Tunnel Root Tracking
>    A condition to consider that the status of a P-tunnel is up is that
>    the root of the tunnel, as determined in the x-PMSI Tunnel attribute,
>    is reachable through unicast routing tables.  In this case, the
>    downstream PE can immediately update its UMH when the reachability
>    condition changes.
>    That is similar to BGP next-hop tracking for VPN routes, except that
>    the address considered is not the BGP next-hop address, but the root
>    address in the x-PMSI Tunnel attribute.
>    If BGP next-hop tracking is done for VPN routes and the root address
>    of a given tunnel happens to be the same as the next-hop address in
>    the BGP A-D Route advertising the tunnel, then checking, in unicast
>    routing tables, whether the tunnel root is reachable, will be
>    unnecessary duplication and thus will not bring any specific benefit.
> <mglt>
> It seems to me that the x-PMSI address
> designates a different interface than
> the one used by the tunnel itself. If
> that is correct, such a mechanism seems
> to assume that a device that is up on one
> interface will be up on the other
> interfaces. I have the impression that a
> configuration change in a PE may end up
> in the P-tunnel being down, while the PE
> is still reachable through the x-PMSI
> Tunnel attribute. If that is a possible
> scenario, the current mechanism may not
> be more efficient than those of
> standard BGP.
> Similarly, it is assumed that the tunnel
> is either up or down, so that not being
> up means being down.  I am not convinced
> that these are the only two states.
> Typically, services under DDoS may be
> down for a small amount of time. While
> this affects the network, there is not
> always a clear-cut distinction between
> the PE being up or down.
> </mglt>
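
As an illustration of the root-tracking check discussed above, here is a
minimal sketch, assuming a hypothetical list of candidate upstream PEs in
preference order and a hypothetical set of addresses currently reachable in
the unicast routing table. The names Candidate, select_umh, and reachable
are illustrative only and do not come from the draft.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Candidate:
    pe: str           # hypothetical identifier of the upstream PE
    tunnel_root: str  # root address taken from the x-PMSI Tunnel attribute


def select_umh(candidates: Iterable[Candidate],
               reachable: Set[str]) -> Optional[Candidate]:
    """Return the first candidate, in preference order, whose tunnel
    root is reachable; return None when no root is reachable."""
    for cand in candidates:
        if cand.tunnel_root in reachable:
            return cand
    return None
```

The check is binary by construction, which matches the concern raised in the
comment: a root that is reachable while its P-tunnel is degraded still
counts as up in this sketch.
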
> [...]
> 3.1.6.  BFD Discriminator Attribute
>    P-tunnel status may be derived from the status of a multipoint BFD
>    session [RFC8562] whose discriminator is advertised along with an
>    x-PMSI A-D Route.
>    This document defines the format and ways of using a new BGP
>    attribute called the "BFD Discriminator".  It is an optional
>    transitive BGP attribute.  In Section 7.2, IANA is requested to
>    allocate the codepoint value (TBA2).  The format of this attribute is
>    shown in Figure 1.
> <mglt>
> I feel that the sentence "In Section ...
> TBA2)." should be removed.
> </mglt>
>        0                   1                   2                   3
>        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>       |    BFD Mode   |                  Reserved                     |
>       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>       |                       BFD Discriminator                       |
>       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>       ~                         Optional TLVs                         ~
>       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>             Figure 1: Format of the BFD Discriminator Attribute
>    Where:
>       BFD Mode field is the one octet long.  This specification defines
>       the P2MP BFD Session as value 1 Section 7.2.
>       Reserved field is three octets long, and the value MUST be zeroed
>       on transmission and ignored on receipt.
>       BFD Discriminator field is four octets long.
> Morin, et al.             Expires April 5, 2021                 [Page 7]
> Internet-Draft         mVPN Fast Upstream Failover          October 2020
>       Optional TLVs is the optional variable-length field that MAY be
>       used in the BFD Discriminator attribute for future extensions.
>       TLVs MAY be included in a sequential or nested manner.  To allow
>       for TLV nesting, it is advised to define a new TLV as a variable-
>       length object.  Figure 2 presents the Optional TLV format TLV that
>       consists of:
>       *  one octet-long field of TLV 's Type value (Section 7.3)
>       *  one octet-long field of the length of the Value field in octets
>       *  variable length Value field.
>       The length of a TLV MUST be multiple of four octets.
> <mglt>
> I am wondering why the constraint on the
> length is not mentioned in the paragraph
> associated with the field, as opposed to
> a separate paragraph.
> </mglt>
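
To make the layout in Figure 1 concrete, here is a rough parsing sketch of
the attribute value as quoted above: one octet of BFD Mode, three Reserved
octets, a four-octet BFD Discriminator, then optional TLVs with a one-octet
Type and a one-octet Length. It is not taken from any implementation. The
function and class names are hypothetical, and the sketch assumes that the
"multiple of four octets" rule applies to the whole TLV (Type, Length, and
Value together), which is only one possible reading of the quoted text.

```python
import struct
from dataclasses import dataclass, field
from typing import List, Tuple

P2MP_BFD_SESSION = 1  # BFD Mode value 1, as stated in the quoted draft text


@dataclass
class BfdDiscriminatorAttr:
    mode: int
    discriminator: int
    tlvs: List[Tuple[int, bytes]] = field(default_factory=list)


def parse_bfd_discriminator(data: bytes) -> BfdDiscriminatorAttr:
    """Parse the attribute value laid out as in Figure 1."""
    if len(data) < 8:
        raise ValueError("attribute shorter than the 8-octet fixed part")
    word0, discriminator = struct.unpack("!II", data[:8])
    mode = word0 >> 24  # top octet; the 3 Reserved octets are ignored on receipt
    tlvs = []
    offset = 8
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated TLV header")
        tlv_type, value_len = data[offset], data[offset + 1]
        end = offset + 2 + value_len
        if end > len(data):
            raise ValueError("truncated TLV value")
        # ASSUMPTION: the four-octet rule covers Type + Length + Value.
        if (2 + value_len) % 4 != 0:
            raise ValueError("TLV length is not a multiple of four octets")
        tlvs.append((tlv_type, data[offset + 2:end]))
        offset = end
    return BfdDiscriminatorAttr(mode, discriminator, tlvs)
```

If the rule is instead meant to apply to the Value field alone, only the
modulo check changes, which is part of why stating the constraint next to
the Length field would help.
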
> [..]
> 8.  Security Considerations
>    This document describes procedures based on [RFC6513] and [RFC6514]
>    and hence shares the security considerations respectively represented
>    in these specifications.
>    This document uses p2mp BFD, as defined in [RFC8562], which, in turn,
>    is based on [RFC5880].  Security considerations relevant to each
>    protocol are discussed in the respective protocol specifications.  An
>    implementation that supports this specification MUST use a mechanism
>    to control the maximum number of p2mp BFD sessions that can be active
>    at the same time.
> <mglt>
> At a high level - or at least in my
> interpretation of it - the document
> proposes a mechanism based on BFD to
> detect faults in the path.  Upon fault
> detection, a fail-over operation is
> instructed using BGP. This procedure is
> expected to perform a faster fail-over
> than traditional BGP convergence on
> maintaining routing tables. Once the
> fail-over has been performed, BFD
> confirms the new path is "legitimate"
> and works.
> It seems correct to me that the current
> protocol relies on BGP / BFD security.
> That said, having BFD authentication
> based on MD5 or SHA1 may suggest that
> stronger primitives be recommended.
> While this does not concern the current
> document, it seems to me that the
> information might be relayed to routing
> ADs.
> What remains unclear to me - and I
> assume this might be due to my lack of
> expertise in the routing area - is the impact
> associated with performing a fail-over
> both on 1) the data plane and 2) the
> standard BGP way to establish routing
> tables.
> Regarding the data plane, I am wondering
> if fail-over results in a loss of
> packets for example - I suppose for
> example that at least the packets in the
> process of being forwarded might be
> lost. I believe that providing details
> on this may be good.
> If there are any impacts I would like to
> understand also in which cases the
> decision to perform a failover operation
> may result in more harm than the event
> that has been over-interpreted. A
> hypothetical scenario could be that the
> non-reception of a BFD packet is
> interpreted as a PE being down while
> this may not be correct and the PE might have
> been simply under stress. A "too fast" fail-over
> mechanism may over-interpret it and perform a
> fail-over. If such things could happen,
> an attacker could leverage a micro event
> to trigger network operations that are
> not negligible. Another way to see that
> is that an attacker might not have
> direct access to the control plane, but
> could use the data plane to generate
> stress and, in a sense, control the
> fail-over. It seems to me that some text
> might be welcome to prevent such cases
> from happening. This could be guidance for
> declaring a tunnel down for example.
> Similarly, it would be good to add some
> text regarding the interference with
> the non-fast fail-over when
> performed by standard BGP.
> Typically, my impression is that the
> fast fail-over mechanism is a local
> decision versus the BGP convergence that
> is more global. As a result, even with
> more time these two mechanisms may come
> to different outcomes. One
> example to illustrate my point could
> be the following. Note that this is only
> illustrative of my point, and I let
> you find and pick one that is more
> appropriate.   I am thinking of a case
> where a standby PE is shared among
> multiple PEs - supposing this situation
> could occur.  Typically, suppose PE_1 and PE_2
> are shared by PE_a, ..., PE_z. In case
> PE_a and PE_b are down, we expect PE_a
> to switch to PE_1 and PE_b to switch to
> PE_2. It seems to me that BGP would end
> up in such a situation while a local
> decision may end up in PE_a and PE_b both
> switching to PE_1.
> </mglt>
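
On the "too fast" fail-over concern, the usual BFD guard against reacting to
a single missed packet is the detection time, which is the transmit interval
multiplied by the detect multiplier. The sketch below illustrates that
arithmetic only. The class name, method names, and millisecond clock are
hypothetical, and it does not model the full p2mp BFD state machine of
[RFC8562].

```python
class TailSession:
    """Toy model of a BFD tail that declares the tunnel down only after
    a full detection time has elapsed without any received packet."""

    def __init__(self, tx_interval_ms: int, detect_mult: int) -> None:
        # Detection time = interval advertised by the head * detect multiplier.
        self.detection_time_ms = tx_interval_ms * detect_mult
        self.last_rx_ms = None  # no packet received yet

    def on_packet(self, now_ms: int) -> None:
        """Record the arrival time of a BFD control packet."""
        self.last_rx_ms = now_ms

    def is_up(self, now_ms: int) -> bool:
        """True while the last packet is newer than the detection time."""
        if self.last_rx_ms is None:
            return False
        return now_ms - self.last_rx_ms < self.detection_time_ms
```

With a 10 ms interval and a multiplier of 3, two consecutive losses are
tolerated and the tunnel is declared down 30 ms after the last packet.
Raising either value trades fail-over speed for resistance to short bursts
of stress such as the one described in the comment.
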
BESS mailing list
