Hi!

I have been selected to do a routing directorate “early” review of this
draft.
https://datatracker.ietf.org/doc/draft-ietf-bess-bgp-sdwan-usage/

The routing directorate will, on request from the working group chair,
perform an “early” review of a draft before it is submitted for publication
to the IESG. The early review can be performed at any time during the
draft’s lifetime as a working group document. The purpose of the early
review depends on the stage that the document has reached.

As this document has recently been through a working group last call, my
focus for the review was to determine whether it is ready for publication.
Please consider my comments along with the other working group last call
comments.

For more information about the Routing Directorate, please see
https://wiki.ietf.org/en/group/rtg/RtgDir.


Document: draft-ietf-bess-bgp-sdwan-usage-28
Reviewer: Alvaro Retana
Review Date: December 4, 2025
Intended Status: Informational

Summary:

I have some concerns about this document that should be resolved before it
is submitted to the IESG.

Comments:

The document presents an overview of how BGP can be used in a large-scale
SD-WAN network.

The draft is structured so that the scenarios are initially described
(Section 3), then general provisioning is covered (Section 4), how
BGP-controlled SD-WAN works for the scenarios in Section 3 (Section 5), and
finally, the forwarding models for those scenarios are discussed (Section
6). This structure results in some repetition and disjointness -- the
readability would improve if all aspects of each scenario were explained in
a single section.

Given the draft's focus on the control plane, I believe Section 6
(Forwarding Model) is out of place. The information in it can further
inform the expected behavior, but it could also be moved to an appendix or
eliminated. Note also that the Manageability and Security Considerations
sections focus only on the control plane.

This document already went through WGLC, so I defer to the Chairs/Shepherd
on what should be included and how to structure the text.


I have included in-line comments below. Among the ones I tagged as "major",
I want to highlight the following:

(1) Expectation about the use and placement of the RR/Controller and the BGP
    sessions.

   In general, the document assumes that the RR and the SD-WAN Controller
   are the same. However, this assumption doesn't account for multiple RRs
   or different planes. The assumption should be clearly and explicitly
   articulated.

   §3.1.5 also seems to mention the possibility of having BGP sessions
   between the SD-WAN Edges.


(2) IPSec Tunnel Encapsulation

   The draft repeatedly references RFC9012, but no IPSec Tunnel
   Encapsulation is specified there. The correct reference should be
   draft-ietf-idr-sdwan-edge-discovery.


(3) Focus of the Manageability and Security Considerations sections

   Both sections cover only a small part of what the draft covers.
   Specifically, none of the underlying technologies are mentioned (even
   by reference).

   The Security Considerations section should not just cover expectations,
   but also the risks if those expectations are not followed. For example,
   BGP doesn't have a mandatory "secure communication channel"; what are
   the risks if the expectations are not met in a deployment?


See more details below. This review ends with "[EoR -28]".

Thank you!

Alvaro.


[Line numbers from idnits.]


...
98 1. Introduction
...
127   This document outlines SD-WAN use cases and the complexities of
128   managing large-scale SD-WAN overlay networks, as described in
129   [Net2Cloud-Problem]. It demonstrates how a BGP-based control plane
130   can efficiently manage these networks with minimal manual
131   intervention; additional operational drivers for standardized
132   protocol behavior are summarized in Section 6 of [MPLIFY-119].

[minor] "as described in [Net2Cloud-Problem]"

It is not clear to me what this sentence says about what is described in
[Net2Cloud-Problem].  If [Net2Cloud-Problem] describes use-cases and/or
"the complexities...", what is this draft for?


[major] There's no reference entry for [MPLIFY-119].



134   It's important to distinguish the BGP instance as the control
135   plane for SD-WAN overlay from the BGP instances governing the
136   underlay networks. The document assumes a secure communication
137   channel between the SD-WAN controller and SD-WAN edges for
138   exchanging control plane information.

[minor] "The document assumes a secure communication channel between the
SD-WAN controller and SD-WAN edges for exchanging control plane
information."

What is a "secure communication channel"?

§3.1.5 answers this question:

   An SD-WAN edge must use a secure channel, such as TLS (RFC5246)
   [RFC8446] or IPsec, to its designated RR for exchanging BGP UPDATE
   messages.

But it calls it "secure channel".  §5.1 uses "secure management channel",
and §8 goes back to "secure communication channel". Please be consistent.



140   The need for an RFC documenting SD-WAN use cases lies in ensuring
141   standardization and interoperability. While BGP and IPsec are
142   well-established technologies, their application to SD-WAN
143   introduces challenges such as scalability, traffic segmentation,
144   and multi-homing. This document consolidates best practices and
145   defines guidelines to enable consistent implementations across
146   diverse networks, optimizing existing protocols for SD-WAN
147   scenarios rather than proposing new ones.

[nit] s/The need for an RFC documenting/The need for documenting


[major] "standardization and interoperability...consolidates best practices
and defines guidelines to enable consistent implementations..."

This document is tagged as Informational (which seems the right intended
status to me), but the statements in this paragraph point at it being much
more.

IMO, this justification paragraph is not necessary. If justification is
needed for publication, the Shepherd write-up is a better place to put it.



149 2. Conventions used in this document
...
154   Controller: Used interchangeably with SD-WAN controller to manage
155               SD-WAN overlay networks in this document. In the
156               context of BGP-controlled SD-WAN, the SD-WAN
157               controller functions as or is integrated with the BGP
158               Route Reflector (RR).

[nit] s/as or is/are


[major] In many places the text assumes that the RR is the controller.  Is
that always the case?  Is it a requirement?  What happens in cases where
multiple RRs exist, are all of them controllers?

Note that, for example, §5.1 opens the possibility of the RR and the
controller not being the same: "When the BGP RR is integrated with the
SD-WAN controller..."; which implies that the functionality may not be
integrated.

Please clarify the expectation.



...
163   Client route: A BGP-advertised route originated by an SDWAN edge
164               that represents the reachability of a client-facing
165               service (e.g., IP prefix or VLAN) and includes
166               associated path attributes used by the SDWAN-
167               Controller for policy enforcement and forwarding
168               decisions.

[nit] s/SDWAN/SD-WAN/g


[minor] s/client-facing service/client service/g
Be consistent.



...
180   MP-NLRI:    In this document, the term "MP-NLRI" serves as a
181               concise reference for "MP_REACH_NLRI".

[minor] Even if defined here, please don't make up new terminology. In this
case, "MP-NLRI" shows up only one time in the text.



...
198   SD-WAN IPsec SA: IPsec Security Association between two WAN ports
199               of the SD-WAN edges or between two SD-WAN edges.

[minor] Add a reference.



...
229 3.1. SD-WAN Functional Overview and Requirements

[] More than requirements, this section is a description of the operation.



231 3.1.1. Supporting SD-WAN Segmentation
...
239   This document assumes that SD-WAN VPN configuration on PE devices
240   will, as with MPLS VPN [RFC4364] [RFC4659], make use of VRFs
241   [RFC4364] [RFC4659]. Notably, a single SD-WAN VPN can be mapped to
242   one or multiple virtual topologies governed by the SD-WAN
243   controller's policies.

[nit] s/MPLS VPN [RFC4364] [RFC4659], make use of VRFs [RFC4364]
[RFC4659]./MPLS VPN, make use of VRFs [RFC4364] [RFC4659].



...
250   As SD-WAN is an overlay network arching over multiple types of
251   networks, MPLS L2VPN[RFC4761] [RFC4762]/L3VPN[RFC4364] [RFC4659]
252   or pure L2 underlay can continue using the VPN ID (Virtual Private
253   Network Identifier), VN-ID (Virtual Network Identifier), or VLAN
254   (Virtual LAN) in the data plane to differentiate packets belonging
255   to different SD-WAN VPNs. For packets transported through an IPsec
256   tunnel, additional encapsulation, such as GRE [RFC2784] or VxLAN

[nit] s/L2VPN[RFC4761] [RFC4762]/L3VPN[RFC4364]/L2VPN [RFC4761]
[RFC4762]/L3VPN [RFC4364]



...
261 3.1.2. Client Service Requirement
...
270   In [MEF 70.1], the "SD-WAN client interface" is called SD-WAN UNI
271   (User Network Interface). Section 11 of [MEF 70.1] defines a
272   comprehensive set of attributes for the SD-WAN UNI, detailing the
273   expected behavior and requirements to enable seamless connectivity
274   to the client network.

[major] MEF 70.2 is used elsewhere, are the definitions in MEF 70.1
different?  IOW, do you need both references?



...
279 3.1.3. SD-WAN Traffic Segmentation

[] What's the difference between the segmentation in this section and
§3.1.1?  Both sections talk about the same thing, only the level of the
examples is different.  Consider merging them.



...
293   In the figure below, traffic from the PoS system follows a tree
294   topology (denoted as "----" in the figure below), whereas other
295   traffic can follow a multipoint-to-multipoint topology (denoted as
296   "===").

[] Assuming that the topology below is conceptual, it looks like the "link"
between the "payment gateway" and the "multi-point connection" is not
needed.


298                              +--------+
299              Payment traffic |Payment |
300                +------+----+-+gateway +------+----+-----+
301               /      /     | +----+---+      |     \     \
302              /      /      |      |          |      \     \
303           +-+--+  +-+--+  +-+--+  |   +-+--+  +-+--+  +-+--+
304           |Site|  |Site|  |Site|  |   |Site|  |Site|  |Site|
305           | 1  |  |  2 |  | 3  |  |   |4   |  |  5 |  | 6  |
306           +--+-+  +--+-+  +--|-+  |   +--|-+  +--|-+  +--|-+
307              |       |       |    |      |       |       |
308            ==+=======+=======+====+======+=======+=======+===
309         Figure 1 multi-point connection for non-payment traffic

[minor] Find a more descriptive name of this Figure.



...
318 3.1.4. Zero Touch Provisioning
...
329     - The SD-WAN edge's customer information and unique device
330     identifier (e.g., serial number, MAC address, or factory-
331     assigned ID) are registered with the SD-WAN Central Controller.

[minor] Is the "SD-WAN Central Controller" different from an "SD-WAN
Controller"?  I ask because this is the only place this new term is used.


333     - Upon power-up, the SD-WAN edge can establish the transport
334     layer secure connection [BCP195] to its controller, whose URL
335     (or IP address) and credential for connection request can be
336     preconfigured on the edge device by the manufacture, external
337     USB drive or secure Email given to the installer. The external
338     USB method involves providing the installer with a pre-
339     configured USB flash drive containing the necessary
340     configuration files and settings for the SD-WAN device. The
341     secure Email approach entails sending a secure email containing
342     the configuration details for the SD-WAN device.

[minor] I'm confused about the reference to BCP195.  By "transport layer
secure connection", do you mean a TLS connection?  BCP195 points at general
TLS-related best practices and doesn't define the protocol itself.  If you
meant TLS, I wonder why not use RFC8446.


[nit] s/by the manufacture/by the manufacturer



344     - The SD-WAN Controller authenticates the ZTP request from the
345     remote SD-WAN edge with its configurations. Once the
346     authentication is successful, it can designate a local network
347     controller near the SD-WAN edge to pass down the initial
348     configurations via the secure channel. The local network
349     controller manages and monitors the communication policies for
350     traffic to/from the edge node.

[minor] "local network controller"

Here's another type of controller...which is not the central one mentioned
above.  What is the relationship with the SD-WAN Controller?



352 3.1.5. Constrained Propagation of SD-WAN Edge Properties

354   For an SD-WAN edge to establish an IPsec tunnel to another edge
355   and exchange the attached client routes, both edges need to know
356   each other's network properties, such as the IP addresses of the
357   WAN ports, the edges' loopback addresses, the attached client
358   routes, the supported encryption methods, etc.

360   In many cases, an SD-WAN edge is authorized to communicate with
361   only a subset of other edge nodes. To maintain security and
362   privacy, the property of an SD-WAN edge must not be propagated to
363   unauthorized peers. However, when a remote SD-WAN edge powers up,
364   it may lack the policies to determine which peers are authorized
365   to communicate. Therefore, SD-WAN deployment needs to have a
366   central point to distribute the properties of an SD-WAN edge to
367   its authorized peers.

369   BGP is well suited for this purpose. A Route-Reflector (RR)
370   [RFC4456], integrated into the SD-WAN controller, enforces
371   policies governing the communication among SD-WAN edges. The RR
372   ensures that BGP UPDATE messages from an SD-WAN edge are
373   propagated only to other edges within the same SD-WAN VPN.

[major] The first paragraph talks about "an SD-WAN edge to establish an
IPsec tunnel to another edge and exchange the attached client routes",
which sounds to me like establishing a BGP session.  But this last
paragraph says that the RR "ensures that BGP UPDATE messages from an SD-WAN
edge are propagated only to other edges within the same SD-WAN VPN".  Are
direct BGP peerings between SD-WAN Edges established, or is the
communication only through the RR/controller?
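
To make the question concrete, one reading of the RR-only model in the
quoted text can be sketched as below. This is a minimal illustration, not
anything the draft specifies: the Update type, the vpn_members table, the
reflect_to() helper, and the VPN names are all hypothetical, and no real
BGP implementation is being modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    origin: str  # edge that sent the UPDATE to the RR
    vpn: str     # SD-WAN VPN the route belongs to (hypothetical label)
    prefix: str  # advertised client route


# Hypothetical VPN membership table held by the RR/controller.
vpn_members = {
    "vpn-red": {"C-PE1", "C-PE2"},
    "vpn-blue": {"C-PE2", "C-PE3"},
}


def reflect_to(update: Update) -> set:
    """Return the edges that would receive the update from the RR."""
    members = vpn_members.get(update.vpn, set())
    # An edge that is not a member of the VPN gets nothing reflected.
    if update.origin not in members:
        return set()
    # Reflect to every other member of the same VPN, never to outsiders.
    return members - {update.origin}


red = reflect_to(Update("C-PE1", "vpn-red", "10.1.0.0/16"))
rejected = reflect_to(Update("C-PE3", "vpn-red", "10.3.0.0/16"))
```

In this reading the edges never exchange UPDATEs directly; if direct
edge-to-edge BGP sessions are also intended, the filtering above would not
be sufficient on its own.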



375   An SD-WAN edge must use a secure channel, such as TLS (RFC5246)
376   [RFC8446] or IPsec, to its designated RR for exchanging BGP UPDATE
377   messages.

[major] RFC5246 was obsoleted by RFC8446.  Do you need both references?


[major] Add a reference for IPSec.



...
394 3.2. Scenario #1: Homogeneous Encrypted SD-WAN
...
402   -  A small branch office connecting to its headquarters via the
403   Internet. All traffic to and from this small branch office must be
404   encrypted, usually achieved by IPsec Tunnels [RFC6071].

[major] RFC6071 is an IPSec document roadmap, not the appropriate reference
to be used here.


[minor] This paragraph used "IPsec Tunnels", but the next couple use "IPSec
SAs"...and elsewhere only "IPSec".  Please be consistent.



...
518 3.4. Scenario #3: Private VPN PE based SD-WAN
...
540                           +======>|PE2|
541                         //        +---+
542                        //          ^
543                       //           || VPN
544                      //     VPN    v
545                      |PE1| <====> |RR| <=>   |PE3|
546                      +-+-+        +--+       +-+-+
547                        |                       |
548                        +--- Public Internet -- +
549                                 Offload
550          Figure 5: Additional Internet paths added to the VPN

[nit] The "tops" of the routers in the figure are missing.  The same happens
in other figures.



...
567 4.1. Client Service Provisioning Model

569   Provisioning of client-facing services in an SD-WAN network can
570   leverage approaches similar to those used for VRFs (Virtual
571   Routing and Forwarding) in MPLS based VPNs [RFC4364][RFC4659]. A
572   client VPN can define communication policies by specifying BGP
573   Route Targets for import and export. Alternatively, policy-based
574   filtering using ACLs (Access Control List) can be employed to
575   control which routes are allowed or denied for a given client VPN.

[nit] s/MPLS based VPNs [RFC4364][RFC4659]/MPLS based VPNs [RFC4364]
[RFC4659]



...
597 4.3. IPsec Related Parameters Provisioning
...
606   In a BGP-controlled SD-WAN, BGP UPDATE messages can be extended to
607   propagate IPsec-related attributes for each SD-WAN edge. This
608   approach allows peers to receive and apply compatible
609   cryptographic parameters distributed over a secure channel between
610   the SDWAN edge and its BGP RR, thereby simplifying IPsec tunnel
611   establishment and reducing reliance on traditional IKEv2
612   negotiation [RFC7296].

[minor] "BGP UPDATE messages can be extended"

Include an Informative reference to draft-ietf-idr-sdwan-edge-discovery.



...
620 5.1. Rational for Using BGP as Control Plane for SD-WAN

[minor] s/Rational/Rationale


[] The rest of the draft discusses how BGP is used...I don't think a
justification is needed anymore.



...
634   -  Simplified peer authentication process:

636     With a secure management channel established between each edge
637     node and its RR, the RR can perform peer authentication on
638     behalf of the edge node. The RR has policies on peer
639     communication and the built-in capability to constrain the
640     propagation of the BGP UPDATE messages to the authorized edge
641     nodes only.

[major] "With a secure management channel established between each edge
node and its RR, the RR can perform peer authentication on behalf of the
edge node."

This question is related to the peering model question above (§3.1.5).

I read the sentence as saying that (somehow) the RR is able to authenticate
an edge node on behalf of another edge node.  What does that mean?  The use
of "peer authentication" leads me to believe that the edge nodes will peer
with each other (??).  Is the "peer communication" at the control plane
level or in the dataplane?



643   - Scalable IPsec tunnel management

645     In networks with multiple IPsec tunnels between SD-WAN edges,
646     BGP simplifies tunnel management by using the Tunnel
647     Encapsulation Attribute specified in [RFC9012] to carry
648     information that associates advertised client routes with
649     specific tunnels.

[major] RFC9012 doesn't specify an IPSec tunnel encapsulation.



651     Unlike traditional IPsec VPN where IPsec tunnels between two
652     edge nodes are treated as independent parallel links requiring
653     duplicated control plane messages for load sharing.

[] This sentence seems orphaned...unlike what?



655   - Simplified traffic selection configurations

657     BGP can simplify the configuration of IPsec tunnel associations
658     and related forwarding policies. By leveraging Route Targets to
659     identify SD-WAN VPN membership, administrators can apply
660     import/export policies that control the distribution of client
661     routes. These route attributes, in turn, inform the local
662     configuration of IPsec traffic selectors at each SDWAN edge.

[] This point sounds like tunnel management to me.  Maybe merge with the
last point...



...
678 5.2. BGP Scenario for Homogeneous Encrypted SD-WAN
...
686   For example, in the figure below, the BGP UPDATE message from C-
687   PE2 to RR can have the client routes encoded in the MP-NLRI Path
688   Attribute and the IPsec Tunnel associated parameters encoded in
689   the Tunnel Encapsulation Attribute [RFC9012].

[major] RFC9012 doesn't specify an IPSec tunnel encapsulation.



...
717 5.3. BGP Scenario for Differential Encrypted SD-WAN
...
726  - Update 1: Client Route Advertisement for advertising the
727     prefixes of client services attached to the client facing
728     interfaces. The Color (Section 8 of [RFC9012]) is used to
729     associate each client service with the corresponding WAN ports
730     for the desired underlay paths.

[] "The Color..." what?



...
820 6.1.1. Network and Service Startup Procedures
...
827   For example, in the full mesh scenario in Figure 2 of Section 3.2,
828   where client CN2 is attached to C-PE1, C-PE3, and C-PE4, six uni-
829   directional IPsec SAs must be established: C-PE1 <-> C-PE3; C-PE1
830   <-> C-PE4; C-PE3 <-> C-PE4.

[minor] s/Figure 2/Figure 3
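
For reference, the count of six in the quoted text follows from three edge
pairs with one SA per direction. A minimal sketch, using the edge names
from the quoted example (the variable names are only illustrative):

```python
from itertools import permutations

# Edges attached to client CN2 in the quoted example.
edges = ["C-PE1", "C-PE3", "C-PE4"]

# A unidirectional SA is an ordered (source, destination) pair, so a
# full mesh of n edges needs n * (n - 1) of them.
unidirectional_sas = list(permutations(edges, 2))

print(len(unidirectional_sas))  # 6
```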



...
887 6.2. Forwarding Model for Hybrid Underlay SD-WAN

889   In this scenario, as shown in Figure 3 of Section 3.3, traffic
890   forwarded over the trusted VPN paths can be native (i.e.,
891   unencrypted). The traffic forwarded over untrusted networks need
892   to be protected by IPsec SA.

[minor] s/Figure 3/Figure 4



894 6.2.1. Network and Service Startup Procedures

896   Infrastructure setup: The proper MPLS infrastructure must be
897   configured among the edge nodes, i.e., the C-PE1/C-PE2/C-PE3/C-PE4
898   of Figure 3. The IPsec SA between wAN ports or nodes must be set
899   up as well. IPsec SA related attributes on edge nodes can be
900   distributed by BGP UPDATE messages as described in Section 5.

[nit] s/wAN/WAN



...
906 6.2.2. Packet Walk-Through
...
921     For a c-PE with multiple WAN ports provided by different NSPs,
922     separate IPsec SAs can be established for the WAN ports. In this
923     case, the C-PE have multiple IPsec tunnels in addition to the
924     MPLS path to choose from to forward the packets from the client
925     facing interfaces.

[nit] s/c-PE/C-PE



...
957     For multicast traffic, MPLS multicast [RFC6513, RFC6514, or
958     RFC7988] can be utilized to forward multicast traffic across the
959     network.

[minor] s/[RFC6513, RFC6514, or RFC7988]/[RFC6513], [RFC6514], or [RFC7988]



...
1022 7. Manageability Considerations

1024   A BGP-controlled SD-WAN uses RR to propagate client routes and
1025   underlay tunnel properties among authorized SD-WAN edges. Since
1026   the RR is configured with policies that identify authorized peers,
1027   the peer-wise IPsec IKE (Internet Key Exchange) authentication
1028   process is significantly simplified.

[major] This section only considers a small part of what was covered in the
rest of the document.



1030 8. Security Considerations

[major] Should the security considerations for all the technology mentioned
in the draft be inherited?  At least BGP and IPSec...



1032   In a BGP-controlled SD-WAN network, secure operation replies in
1033   part on the correct configuration and behavior of the RR, which
1034   acts as the central distribution point for BGP routing
1035   information. RR applies preconfigured routing policies to control
1036   the propagation of BGP UPDATE messages to authorized SD-WAN edges,
1037   help minimizing the risk of unintended route exposure or
1038   unauthorized communication.

[nit] s/replies in part/relies in part



1040   The security model for the SD-WAN described in this document is
1041   based on the following principles:

1043   1) Centralized Control: The RR governs all routing and policy
1044     decisions. This centralized architecture simplifies security
1045     management compared to distributed models, as it limits the
1046     potential attack surface to a smaller, more controlled set of
1047     components.

[major] True.  What are the risks associated with misconfiguration?



1048   2) Secure Communication Channels: All communication between SD-WAN
1049     edges and the RR must occur over a secure channel, such as TLS
1050     or IPsec, to ensure the confidentiality and integrity of BGP
1051     UPDATE messages.

[major] What happens if the secure communication channel is not used?

The propagation of BGP UPDATEs is not gated by the transport mechanism. A
peering session could be configured without the required secure
communication channel.  What are the associated risks?

Is the expectation that the RR, or the edge nodes (or both), will not
proceed with the BGP session unless a secure communication channel is used?



1052   3) Policy Enforcement: The RR is responsible for enforcing policies
1053     that restrict the propagation of edge node properties and
1054     routing updates to only authorized peers. This prevents
1055     sensitive information from being exposed to unauthorized nodes.

[major] What are the risks associated with misconfiguration?



1057   4) Mitigation of Internet-Facing Risks: In scenarios where SD-WAN
1058     edges include Internet-facing WAN ports, additional measures
1059     must be taken to mitigate security risks:
1060       - Anti-DDoS mechanisms must be enabled to protect against
1061          potential attacks on Internet-facing ports.

[major] Is there an example of an "Anti-DDoS mechanism" you can point to?
What are the risks of not using one?



1062       - The control plane must avoid learning routes from Internet-
1063          facing WAN ports to prevent unauthorized traffic from being
1064          injected into the SD-WAN.

[major] What are the risks associated with misconfiguration?



...
1084 10. References

[major] IMO, only the reference to MEF70.2 (where the concept of SD-WAN is
introduced) should be Normative, the rest can be Informative.

[EoR -28]