ipvpn authors.
Here is my long-delayed review of draft-ietf-bess-evpn-ipvpn-interworking, using
version 10. To aid in review, I'm using the idnits output for this version of
the draft to help identify line numbers.
Note, idnits is flagging the following, which needs expansion to make the RFC
Editor happy.
** The abstract seems to contain references ([RFC4271]), which it
shouldn't. Please replace those with straight textual mentions of the
documents in question.
35 modifies the BGP best path selection for multiprotocol BGP routes of
36 SAFI 128 and EVPN IP Prefix routes, and therefore this document
37 updates the BGP best path selection in [RFC4271], but only for IPVPN
38 and EVPN families.
Here is the citation to 4271 that it wants expanded according to the nits rules.
419 4. Domain Path Attribute (D-PATH)
[...]
424 Similar to AS_PATH, D-PATH is composed of a sequence of Domain
425 segments. Each Domain segment is comprised of <domain segment
426 length, domain segment value>, where the domain segment value is a
427 sequence of one or more Domains, as illustrated in Figure 6. Each
428 domain is represented by <DOMAIN-ID:ISF_SAFI_TYPE>.
The format for D-PATH being a sequence of Domain segments has been there since
the beginning, so note that I'm not recommending any format changes. A flat
list of Domains probably would have simplified things, since there is no
similar segment type to motivate needing to differentiate the sequence entries
from each other. For example, validation becomes "is the attribute length
divisible by 7". :-)
Having reviewed the full document for the procedure, there's discussion that
Domains are prepended to the D-PATH similar to the AS_PATH. However, part of
BGP's procedure for AS_PATH is, in effect, how to minimize the number of
segments. From RFC 4271, Section 5.1.2.b.1:
: 1) if the first path segment of the AS_PATH is of type
: AS_SEQUENCE, the local system prepends its own AS number as
: the last element of the sequence (put it in the leftmost
: position with respect to the position of octets in the
: protocol message). If the act of prepending will cause an
: overflow in the AS_PATH segment (i.e., more than 255 ASes),
: it SHOULD prepend a new segment of type AS_SEQUENCE and
: prepend its own AS number to this new segment.
The normative text describing prepending Domains to the D-PATH attribute needs
some text describing when new segments are generated.
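
To illustrate the kind of rule I have in mind, here is a minimal sketch that
mirrors the AS_PATH procedure quoted above. It is not text from the draft: the
function name, the tuple layout, and the choice to open a new segment only on
overflow are all my assumptions, as is reusing SAFI numbers 70 and 128 as the
ISF_SAFI_TYPE codes.

```python
# Hypothetical sketch: prepend a Domain to a D-PATH, AS_PATH style.
# A D-PATH is a list of segments (leftmost first); a segment is a list of
# (global_admin, local_admin, isf_safi_type) tuples (leftmost first).

MAX_DOMAINS_PER_SEGMENT = 255  # the segment length field is one octet
EVPN, IPVPN = 70, 128          # assumed ISF_SAFI_TYPE codes


def prepend_domain(d_path, domain):
    """Return a new D-PATH with `domain` in the leftmost position."""
    if d_path and len(d_path[0]) < MAX_DOMAINS_PER_SEGMENT:
        # Room in the first segment: grow it rather than adding a segment.
        return [[domain] + d_path[0]] + d_path[1:]
    # No segment yet, or the first segment is full: start a new one.
    return [[domain]] + d_path


path = prepend_domain([], (6500, 1, EVPN))
path = prepend_domain(path, (6500, 2, IPVPN))
print(path)  # [[(6500, 2, 128), (6500, 1, 70)]]
```

With a rule like this, a D-PATH only ever has more than one segment once 255
Domains have accumulated, which is the "minimize segments" property AS_PATH
gets from the quoted text.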
446 0 1 2 3 4 5 6
447 +-----------------------+-----------+
448 | Global | Local |
449 | Admin | Admin |
450 +-----------------------+-----------+
452 Figure 6: D-PATH Domain Segment
454 * The domain segment length field is a 1-octet field, containing the
455 number of domains in the segment.
457 * DOMAIN-ID is a 6-octet field that represents a domain. It is
458 composed of a 4-octet Global Administrator sub-field and a 2-octet
459 Local Administrator sub-field. The Global Administrator sub-field
460 MAY be filled with an Autonomous System Number (ASN, Public or
461 Private), an IPv4 address, or any value that guarantees the
462 uniqueness of the DOMAIN-ID (when the tenant network is connected
463 to multiple Operators) and helps troubleshooting and debugging of
464 D-PATH in ISF routes. The Local Administrator sub-field is any
465 local 2-octet value, and its allocation or configuration is a
466 local implementation matter.
The intent for the DOMAIN-ID appears to be that the contents of the type are
"structured", in the sense that it has a "global" field and a "local" field.
The related intent here appears to be that the contents are opaque.
Several examples for the Global Admin field are offered. The fact that the
examples are aligned with RFC 4360 Extended Community types is perhaps not a
surprise.
However, the comparison with Extended Communities breaks down somewhat in that
the domain-id does not provide an additional qualifier on the semantics of the
Global Admin field. For Extended Communities, that's the type/subtype.
From a protocol perspective, this is "fine". From an operational perspective,
it's somewhat problematic. Consider two implementations: One which treats
the field as an unsigned 32-bit number; perhaps an AS number, or "just a
number". The other has the configuration semantics of an IPv4 dotted-quad
address.
Clearly it's possible to map each implementation's configuration type to the
other. But it creates operational friction.
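
A small sketch of the friction, assuming nothing beyond the 4-octet/2-octet
layout quoted above. The function name and the sample value are mine; the
point is only that the same six octets on the wire render two different ways
depending on which configuration model an implementation picked.

```python
import ipaddress
import struct


def render_domain_id(domain_id: bytes):
    """Render one 6-octet DOMAIN-ID both ways (hypothetical helper)."""
    # 4-octet Global Administrator, 2-octet Local Administrator.
    global_admin, local_admin = struct.unpack("!IH", domain_id)
    as_number = f"{global_admin}:{local_admin}"
    as_ipv4 = f"{ipaddress.IPv4Address(global_admin)}:{local_admin}"
    return as_number, as_ipv4


print(render_domain_id(bytes.fromhex("c00002010001")))
# ('3221225985:1', '192.0.2.1:1')
```

An operator comparing "show" output from the two implementations has to do
this conversion by hand, since nothing in the attribute says which rendering
the originator intended.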
Another place this ambiguity becomes a challenge will be in future YANG modules.
At this point, I'm unable to find reference to D-PATH in the IETF evpn YANG
module (casual review) or in OpenConfig. In order to represent the domain-id,
a display format will be needed. In the absence of something in the attribute
to qualify the semantics, something generic will have to be selected.
Perhaps that's just opaque unsigned integers for each field. That's fine.
Consider whether the intended opaqueness and flexibility will be worth the
future contention over display formats in interoperable management systems.
487 The BGP D-PATH attribute is supported on ISF routes of type IPVPN and
488 EVPN and MUST NOT be advertised along with routes different from
489 IPVPN and EVPN routes. By default, the BGP D-PATH attribute is not
490 advertised and MUST be explicitly enabled by configuration on the
491 Gateway PEs. In addition, D-PATH:
493 a. Identifies the sequence of domains, each identified by a <DOMAIN-
494 ID:ISF_SAFI_TYPE> through which a given ISF route of type IPVPN
495 or EVPN has passed.
[...]
508 * As an example, an ISF IPVPN or EVPN route received with a
509 D-PATH attribute containing a domain segment of {length=2,
510 <6500:2:IPVPN>,<6500:1:EVPN>} indicates that the route was
511 originated in EVPN domain 6500:1, and propagated into IPVPN
512 domain 6500:2.
Style comment that may be flagged by others: Normative behaviors and examples
are mixed together in this list. In my opinion, they're clear. However,
some reviewers have strong preferences about not mixing them.
546 c. For a local ISF route, i.e., a configured route or a route
547 learned from a local attachment circuit, a gateway PE has three
548 choices:
550 1. It MAY advertise that ISF route without a D-PATH attribute
551 into one or more of its configured domains, in which case the
552 D-PATH attribute will be added by the other gateway PEs in
553 each of those domains.
555 2. It MAY advertise that ISF route with a D-PATH attribute into
556 one or more of its configured domains, in which case the
557 D-PATH attribute in each copy of the ISF route is initialized
558 with an ISF_SAFI_TYPE of 0 and the DOMAIN-ID of the domain
559 with which the ISF route is associated.
561 3. It MAY advertise that ISF route with a D-PATH attribute that
562 contains a configured domain specific to its local ISF routes
563 into one or more of its configured domains, in which case the
564 D-PATH attribute in each copy of the ISF route is initialized
565 with a ISF_SAFI_TYPE of 0 and the DOMAIN-ID for the local ISF
566 routes. This DOMAIN-ID MUST be globally unique and MAY be
567 shared by two or more gateway PEs.
There are three MAYs above. Is there a motivation to not make one of these the
RECOMMENDED case? Some of the normative text later on, e.g. Section 8, has
explicit procedure for this.
For comparison to BGP's loop prevention mechanisms, normally you'd always want
to append that mechanism when advertising to the next node. This means you will
consistently know where the thing (AS, cluster-id) is at and there's no
possibility that the device attaching it would be missed in loop detection.
I'm also having some difficulty understanding the difference between c.2 and c.3
above.
629 g. The following error-handling rules apply to the D-PATH attribute:
631 1. A received D-PATH attribute is considered malformed if it
632 contains a malformed Domain Segment.
634 2. A Domain Segment is considered malformed in any of the
635 following cases:
637 * The Domain Segment length of the last Domain Segment
638 causes the D-PATH attribute length to be exceeded.
Is the intention of this sentence to say "the remaining content is longer than
the D-PATH path attribute length"? (I generally refer to such overrun issues as
an "enveloping" problem.)
640 * After the last successfully parsed Domain Segment there
641 are less than eight octets remaining.
Consider some additional text:
The D-PATH attribute MUST be at least 8 octets in length or it is malformed.
For each contained Domain Segment, the Domain Segment length is one octet
containing the number of Domains in this segment, each of which is 7 octets in
length. If the total length of the Domain Segment in octets
(1 + 7 * number of Domains) exceeds the remaining length of the D-PATH
attribute, the Domain Segment is malformed.
643 * The Domain Segment has a Domain Segment Length of zero.
This is folded into the "at least 8" above, if accepted.
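
The suggested text can be sketched as a parser. This is illustrative only: the
function names are mine, and the 7-octet Domain is taken to be the 6-octet
DOMAIN-ID followed by a 1-octet ISF_SAFI_TYPE.

```python
DOMAIN_LEN = 7  # 6-octet DOMAIN-ID + 1-octet ISF_SAFI_TYPE


def parse_d_path(value: bytes):
    """Return a list of segments, or raise ValueError if malformed."""
    if len(value) < 1 + DOMAIN_LEN:
        raise ValueError("D-PATH shorter than 8 octets")
    segments = []
    offset = 0
    while offset < len(value):
        count = value[offset]  # Domain Segment length, in Domains
        if count == 0:
            raise ValueError("Domain Segment length of zero")
        seg_len = 1 + DOMAIN_LEN * count
        if seg_len > len(value) - offset:
            raise ValueError("Domain Segment overruns the attribute")
        body = value[offset + 1:offset + seg_len]
        segments.append(
            [body[i:i + DOMAIN_LEN] for i in range(0, len(body), DOMAIN_LEN)]
        )
        offset += seg_len
    return segments


def is_malformed(value: bytes) -> bool:
    """True if parse_d_path would reject the attribute value."""
    try:
        parse_d_path(value)
    except ValueError:
        return True
    return False
```

The sketch keeps an explicit zero-count check. The 8-octet minimum covers a
zero-length first segment only when nothing follows it; a zero-length segment
after a valid one would still need its own rule.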
651 5. In case a PE receives more than one D-PATH attribute with a
652 route, the PE SHALL process the first one in the list and not
653 store and propagate the others.
This point goes against RFC 4271's requirement to not duplicate path attributes.
RFC 7606 doesn't loosen this stricture.
766 5.3. Aggregation of Routes and Path Attribute Propagation
[...]
790 * An ISF aggregate route SHOULD NOT be advertised unless all the
791 contributing ISF routes have the same D-PATH value. If there is
792 at least one contributing ISF route that has different D-PATH, the
793 gateway PE SHOULD advertise each contributing ISF route with its
794 own D-PATH (prepended with the gateway's domain). An
795 implementation MAY override this behavior, via policy, to
796 advertise an ISF aggregate route without D-PATH in case the
797 contributing routes did not have the same D-PATH value.
While the D-PATH is built as a vector rather than as a set, the loop prevention
characteristics are "does the receiving domain exist in the D-PATH attribute".
I.e., it's set membership as an operation.
"the same D-PATH value" implies exact match. Would it be more precise to
compare D-PATH attributes that do not have the same members?
A secondary consideration is that the ISF_SAFI_TYPE is irrelevant to loop
detection and is only informational. Thus, for aggregation purposes, are the
following D-PATHs equivalent?
<65001:1:0>
<65001:1:EVPN>
<65001:1:IPVPN>
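
To make the question concrete, here is a sketch of what a set-membership
comparison would look like. Whether the draft intends this equivalence is
exactly what I'm asking; the function names, the flattened list-of-tuples
representation, and the numeric type codes (0, 70, 128) are my assumptions.

```python
# A D-PATH is flattened here to a list of
# (global_admin, local_admin, isf_safi_type) tuples.


def domain_ids(d_path):
    """DOMAIN-IDs in a D-PATH, ignoring order and ISF_SAFI_TYPE."""
    return {(g, l) for (g, l, _isf_safi_type) in d_path}


def is_looped(d_path, local_domain):
    """Loop check as set membership of the receiving domain."""
    return local_domain in domain_ids(d_path)


def same_for_aggregation(a, b):
    """True if two D-PATHs carry the same set of DOMAIN-IDs."""
    return domain_ids(a) == domain_ids(b)


print(same_for_aggregation([(65001, 1, 0)], [(65001, 1, 70)]))  # True
print(is_looped([(6500, 2, 128), (6500, 1, 70)], (6500, 1)))    # True
```

Under this reading the three D-PATHs above would all be equivalent for
aggregation, while an exact-match reading of "the same D-PATH value" would
treat them as three different values.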
799 * The Community, Extended Community and Large Community attributes
800 of the aggregate ISF route MUST contain all the Communities/
801 Extended Communities/Large Communities from all of the aggregated
802 ISF routes, with the exceptions of the extended communities listed
803 in Section 5.2 that are not propagated.
Note that while this practice was established in RFC 1997, modern
implementations do not always provide such aggregation of communities.
Minimally, I recommend this be changed from MUST to SHOULD.
There are two general motivations for this lack of community aggregation:
1. The resulting aggregated community attributes may churn as contributing route
membership to the aggregate itself churns. This can cause unwanted BGP UPDATEs
for the aggregate route and decreases its stability.
2. Communities are more often than not control signaling for policy.
Indiscriminately aggregating them leads to confusing signaling.
It's a fair criticism that RFC 1997 needs an update on these points.
812 6. Route Selection Process for ISF Routes
[...]
875 Example 1 - PE1 receives the following routes for IP1/32, that are
876 candidate to be imported into IP-VRF-1:
878 {SAFI=EVPN, RT-2, Local-Pref=100, AS-Path=(100,200)}
879 {SAFI=EVPN, RT-5, Local-Pref=100, AS-Path=(100,200)}
880 {SAFI=128, Local-Pref=100, AS-Path=(100,200)}
882 Selected route: {SAFI=EVPN, RT-2, Local-Pref=100, AS-Path=100,200]
883 (due to step 3, and no ECMP).
I suspect this example, and the following one, intend "RT-" to be "RD-".
The RT in question should be the same for the candidate routes for the VRF and
the RD is used to distinguish those routes.
894 7. Composite PE Procedures
933 In a composite domain with composite and regular PEs:
935 1. The composite PEs MUST advertise the same IP prefixes in each ISF
936 SAFI to the Route Reflector (RR). For example, in Figure 7, the
937 prefix IP1/24 is advertised by PE1 and PE2 to the Route Reflector
938 in two separate NLRIs, one for AFI/SAFI 1/128 and another one for
939 EVPN.
While the procedure here is clear, it may be worth adding a note about
prioritizing the announcement of the EVPN route ahead of the IPVPN route.
According to the route selection rules, the EVPN route will win vs. the same
ISF carried in an IPVPN route. As noted in the following section:
976 6. As an informative note, in composite domains, such as the one in
977 Figure 7, the EVPN advanced forwarding features will only be
978 available to composite and EVPN PEs (assuming they select an EVPN
979 IP Prefix route to forward packets for a given IP prefix), and
980 not to IPVPN PEs. For example, assuming PE1 sends IP1/24 in an
If the EVPN route arrives first, the selected route is not replaced by the IPVPN
route and the PE can take advantage of the above "advanced forwarding features".
The opposite case is that the IPVPN route arrives first, forwarding is
installed, and must then be updated when the EVPN route arrives.
Note that some implementations do not support route prioritization. Similarly,
BGP's convergence properties are such that, even if prioritized at the
originating PE, the two routes may not arrive in a specific order at the
destination PE. Thus, such prioritization MUST be optional but may still be a
good optimization.
1178 10. BGP Error Handling on Interworking PEs
[...]
1190 * The Interworking PEs do not introduce any new error-handling rules
1191 for UPDATES received with NLRIs and BGP Path Attributes defined in
1192 other specifications. Interworking PEs follow the error-handling
1193 defined in the specification for the specific NLRI or BGP Path
1194 Attribute. In other words, UPDATES for BGP IP routes MUST follow
1195 the error-handling procedures of [RFC4760] [RFC8950], UPDATES for
1196 IPVPN routes MUST follow the error-handling rules of [RFC4364]
1197 [RFC4659], UPDATES for EVPN MAC/IP routes MUST follow the error-
1198 handling of [RFC7432] [RFC8365] and UPDATES for EVPN IP Prefix
1199 routes MUST follow the error-handling in [RFC9136].
1201 * Received UPDATE messages to be programmed in IP-VRFs supporting
1202 Segment Routing for IPv6 data path (SRv6) follow the error-
1203 handling rules defined in [RFC9252].
What's the motivation for the statements above? In general, when new BGP
attributes are used, the error handling for those attributes is expected also to
be used. The above text seems to imply, "don't do that".
1265 12. Security Considerations
[...]
1270 Section 4 introduces the use of the D-PATH attribute, which provides
1271 a security tool against control plane loops that may be introduced by
1272 the use of gateway PEs that propagate ISF IPVPN/EVPN routes between
1273 domains. A correct use of the D-PATH will prevent control plane and
I don't know that the security directorate would consider D-PATH a "security"
tool. I'd suggest stating instead:
"Section 4 introduces the use of the D-PATH attribute, which provides a loop
prevention mechanism that is used by gateway PEs that propagate ISF IPVPN/EVPN
routes between domains."
1274 data plane loops in the network, however an incorrect configuration
1275 of the DOMAIN-IDs or an inconsistent support of D-PATH on the Gateway
1276 PEs may lead to the detection of false route loops, the blackholing
1277 of the traffic or may result in inconsistent and sub-optimal routing.
... and the text above follows mis-use of D-PATH as a security consideration.
-- Jeff