[spring] AD Review of draft-ietf-spring-segment-routing-12

Alvaro Retana (aretana) Thu, 10 Aug 2017 12:01:46 -0700

Dear authors:

I just finished reviewing this document – sorry for the delay in processing.  
Thanks for all the work you’ve but into this document!

I have some significant concerns (see below for details). In general, the
document presents an incomplete view of the architecture: details about
important pieces are barely mentioned or not fully described, as is the case of
the role of a central controller (PCE), and the BGP and Binding Segments.

Also, some of the architectural details are predicated on specific bits defined
in the extensions, where this document should describe the general operation
and leave the details (like bit names) to the extensions. Note that I’m not
asking that you don’t mention the extensions – pointing to them is fine --, I’m
asking you to define the functionality in general and not as a function of the
extensions. For an example, see point M5.1. below.

I will wait for at least the Major comments to be addressed before starting the
IETF Last Call.

Thanks!

Alvaro.

Major:

M1. The Introduction mentions several types of segments, and it says that the
LDP LSP, RSVP-TE LSP, and BGP LSP segments are “illustrated in
[I-D.ietf-spring-segment-routing-mpls]”. But that is only true for the first
two, for which examples are shown. Where are these segment types defined? The
definition, and not the examples, is the Major issue here. This document being
the main architecture document should include the complete description. BTW,
the list is only about the “MPLS instantiation”, are there similar types of
segments for IP?

M2. From Section 2. (Terminology): “Using the same SRGB on all nodes within the
SR domain ease operations and troubleshooting and is expected to be a
deployment guideline.” It is “expected to be a deployment guideline”
where/when?? Given that this document is the general architecture, I figured
that maybe draft-ietf-spring-segment-routing-mpls contained that deployment
recommendation, but all that document says is: “As described in
[I-D.ietf-spring-segment-routing], using the same SRGB on all nodes within the
SR domain eases operations and troubleshooting and is expected to be a
deployment guideline.” So…where are the Deployment/Operational considerations
related to the SRGB? I note that neither document
(draft-ietf-spring-segment-routing or draft-ietf-spring-segment-routing-mpls)
include them. I would expect some information to be in this (general) document
and other more specific information (like the considerations about using the
same SRGB) to be in the more specific document
(draft-ietf-spring-segment-routing-mpls, in this case).

M2.1. The example illustrated in Figure 2 would be a great place to
demonstrate the value of having the same SRGB. In fact, the text now shows
things not working and it even warns by saying that “using anycast segment
without configuring the same SRGB on nodes belonging to the same device group
may lead to misrouting”, but no explanation of how it should be done the right
way.

M3. From Section 2. (Terminology): “…a global segment is represented by a
globally-unique index.”

M3.1. I couldn’t find anywhere a discussion about the use of the index. When
it is discussed in 3.1.2, it seems to be an understood concept: “A Prefix-SID
is allocated in the form of an index in the SRGB…” Even if straightforward, I
think the concept of the index should be explained (maybe with an example) and
not assumed. I again went to look at draft-ietf-spring-segment-routing-mpls,
but that document just points back to this one: “The notion of indexed global
segment, defined in [I-D.ietf-spring-segment-routing]…”

M3.2. I assume that “globally-unique index” is really unique to the SR Domain,
right? I ask this question because “globally-unique IPv6 address” has a
different connotation (as in unique world-wide, not just inside a domain).
Please clarify.

M3.3. Note that the latest version (-10) of
draft-ietf-spring-segment-routing-mpls introduced the SR Local Block (SRLB)
which is not defined in this document. I mention it here because it was
introduced right after the pointer about the index…

M4. Section 3.1.1. (Prefix-SID Algorithm)

M4.1. This section starts by justifying the use of an algorithm based on the
IGP protocol extensions: the extensions have an algorithm field, so we should
use it. The justification should be the other way around: this document (the
architecture) defines the concept of an algorithm and the extensions implement
it. Please re-word the first 3 paragraphs – note that the instance/topology
references seem superfluous to me.

M4.2. The algorithms are not really defined – there are mentions of a “well
known ECMP-aware SPF algorithm”, but no specific reference to what that is.
The last sentence in the Section mentions that the “details of the two defined
algorithms are defined in…” pointing to the extension drafts – but those drafts
just offer an additional piece of information in (from the OSPF document): “the
standard shortest path algorithm as computed by the OSPF protocol”. Ideally
the algorithms would be defined here (even if it is just to say: “the standard
algorithm used in the corresponding IGP”), and the extensions would just
reference it.

M4.3. When should the packets be dropped?? The following text points to at
least 3 different places where it should be dropped, one marked with a MUST.
Please clarify.

M4.3.1. “A router MUST drop any SR traffic associated with the SR algorithm to
the adjacent router, if the adjacent router has not advertised support for such
SR algorithm.” Ok, this sounds as if the traffic is dropped one hop before the
router not supporting the algorithm – and it has a MUST in it.

M4.3.2. “The ingress node of an SR domain validates that the path to a
prefix…includes nodes all supporting the advertised algorithm. As a
consequence, if a node on the path does not support algorithm X, the IGP-Prefix
segment will be interrupted and will drop packet on that node.” This sounds
like the traffic would be dropped on the node not supporting the algorithm.

M4.3.3. “It's the responsibility of the ingress node using a segment to check
that all downstream nodes support the algorithm of the segment.” This last
sentence hints at the fact that the ingress node should maybe even stop the
traffic there, or maybe not sending it unless all nodes in the path support the
same algorithm…

M5. Section 3.1.2. (MPLS Dataplane).

M5.1. The first bullet (again!) explains the architecture in function of the
extensions. The architecture should explain what needs to be done and the
extensions would then do it… Suggestion:

NEW>
In order to achieve a behavior equivalent to Penultimate Hop Popping
in MPLS [Reference], a Node N advertising a Prefix-SID SID-R for its
attached prefix R MUST be able to instruct its connected neighbors to
perform the NEXT operation while processing SID-R. Upon receiving
this instruction from Node N, its neighbors of N MUST perform the NEXT
operation while processing SID-R. Alternatively, the neighbors of N
MUST perform the CONTINUE operation while processing SID-R.

M5.1.1. The penultimate bullet refers again to the extensions (but it only
mentions the P-bit this time). Please explain what the architecture intends to
do, but not from the point of view of the extensions. This and the last bullet
seem to be the result of the first one (above). I think that the first bullet
would have enough text for the FIB entries to be obvious, but if you want to
include this corollary, please do it right after.

M5.2. “A Prefix-SID is allocated in the form of an index in the SRGB (or as a
local MPLS label) according to a process similar to IP address allocation.”
The SRGB is defined (in the Terminology section) as “the set of local labels
reserved for global segments”, what is the difference between “an index in the
SRGB” and “a local MPLS label”? I’m hoping that the explanation of the concept
of index would clarify this.

M5.2.1. Three bullets down… “If a node learns a Prefix-SID having a value that
falls outside the locally configured SRGB range, then the node MUST NOT use the
Prefix-SID…” Going back to my previous question: it looks like “a local MPLS
label” would have to be included in the SRGB – again, clarifying upfront would
help a lot. Note that this piece of text falls into the recommendation of
having the same SRGB configured in all nodes.

M5.2.2. Related to the concept of index and its relationship to the SRGB…the
next bullet says: “…the segment is global (the SID is allocated from the SRGB
or as an index)”. We now have “globally-unique index”, “an index in the SRGB”,
and “the SRGB or as an index” (separately).

M6. Section 3.3. (IGP-Anycast Segment, Anycast SID): “…the value of the SID
following the anycast SID MUST be understood by all nodes advertising the same
anycast segment.” It seems to me that this is really a statement of fact: all
nodes, not just ones advertising an anycast segment must understand what to do
with the next SID… IOW, s/MUST/must I would even take this sentence out
because it is redundant and obvious.

M7. Section 3.4. (IGP-Adjacency Segment, Adj-SID) also tries to explain
functionality starting from the extensions.

M7.1. “The encodings of the Adj-SID include the a set of flags among which
there is the B-flag. When set, the Adj-SID refers to an adjacency that is
eligible for protection (e.g.: using IPFRR or MPLS-FRR).” If the Adjacency
Segment is one that is locally significant to the node advertising it, what is
the purpose of signaling that it is eligible for protection? Wouldn’t that be
a local decision as well? Maybe an example of how the architecture is expected
to work would help.

M7.2. “The encodings of the Adj-SID also include the L-flag. When set, the
Adj-SID has local significance. By default, the L-flag is set.” The
definition of IGP-Adjacency Segment already says that it is “local (unless
explicitly advertised otherwise)”, which makes this statement unnecessary. If
you want to keep it, please figure out a way that doesn’t justify it based on a
bit in the extensions.

M8. Section 3.5. (Binding Segment). A binding segment is not described
anywhere in this document – please do so! Section 3.5.1. (Mapping Server)
mentions a “Remote-Binding SID S advertised by the mapping server”, and says
that more “details are described in the SR/LDP interworking procedures
([I-D.ietf-spring-segment-routing-ldp-interop]”, but that draft doesn’t mention
a Binding Segment. Section 5. (IGP Mirroring Context Segment) says that “the
binding segment [is] defined in SR IGP protocol extensions (
[I-D.ietf-isis-segment-routing-extensions],
[I-D.ietf-ospf-segment-routing-extensions] and
[I-D.ietf-ospf-ospfv3-segment-routing-extensions])”; however, those documents
don’t mention a Binding Segment, they just (except for the OSPFv2 draft) define
TLVs. Note that I-D.ietf-ospf-segment-routing-extensions points back to this
document when referring to the definition of a Binding SID, completing a
circular reference.

M8.1. BTW, Section 5. (IGP Mirroring Context Segment) says that the “Mirror SID
is advertised using the binding segment”, which then looks like the Mirror
Segment is an application of the Binding Segment (+ an explicit indication of
it). I think that Section 5 should then be moved to be a sub-section of 3.5.

M8.2. Part of the Security Considerations (in Section 8.1) are related to the
Binding SID, which is another reason for clearly explaining that part of the
architecture here.

M9. Section 3.5.1. (Mapping Server) mentions a Mapping Server, but it punts to
I-D.ietf-spring-segment-routing-ldp-interop for further details. While the
SRMS can be used for interoperability cases, I think that it is an important
part of the overall architecture and as such it should be described and
discussed in this document instead…including any Manageability Considerations
(as mentioned in Section 9).

M10. Section 3.6. (Inter-Area Considerations)

M10.1. This section shows an example of the behavior: maintain the SID across
area boundaries. But it doesn’t actually say how the architecture is expected
to work. IOW, in the example the SID is maintained, but should that always
happen (MUST, SHOULD)? Or is it just an example (MAY)?

M10.2. Another case of explaining the architecture based on the extension
functionality: “When an ABR propagates a prefix from one area to another it
MUST set the R-Flag.” Maybe try something like this instead: “The
re-advertisement of an SID that originated on a different IGP area MUST be
indicated.”

M10.3. [minor] As the advertisement moves to the left (from Area 2 to the
Backbone…), the ABRs change the Node-SID for a Prefix-SID, right? The
description in the last 3 paragraphs uses Node-SID: “node S…pushes
Node-SID(150)”.

M10.4. [minor] Going back to global vs local significance, it would be nice to
remind the readers at this point that the SR domain is really the whole IGP
network, and not just one flooding domain…which is why SID 150 can be used
throughout.

M11. Section 4. (BGP Peering Segments) is the only non-IGP focused section in
this document. The architecture of the EPE solution (as described in
draft-ietf-spring-segment-routing-central-epe) goes beyond what has already
been discussed here because it incorporates elements such as a mandatory
centralized controller, which was optional before (a PCE server is mentioned
only casually in the Terminology section). This leaves us with an incomplete
architecture for the BGP case. I would prefer it if the whole architecture was
defined in this single document (i.e. expand this section by potentially moving
parts from draft-ietf-spring-segment-routing-central-epe – which might leave
that document without enough content to stand alone).

M12. Section 8. (Security Considerations). The main part of this section talks
about the instructions on the packets – is adding that meta-data a security
concern? I think it could be because someone watching could tell which
“forwarding path elements (e.g.: nodes, links, services, etc.)” are used for
specific flows, etc. But I also think that the risk is mitigated by the fact
that the information MUST NOT be exposed outside the SR domain. Mentioning
that, and the fact that the SR domain will usually be under a single
administration would be a good thing.

M13. Section 9. (Manageability Considerations) says that “an implementation
SHOULD protect the network in case of conflict detection by providing a
deterministic resolution approach…addressed in
[I-D.ietf-spring-conflict-resolution].” That “SHOULD” is in conflict with
I-D.ietf-spring-conflict-resolution (in WGLC) where it says that “All protocols
which support SR MUST adhere to the policies defined in this document.” It
seems like the easy way forward is to s/SHOULD/MUST. In either case, I think
that I-D.ietf-spring-conflict-resolution should be a Normative reference.

Minor:

P1. Introduction

P1.1. The term “service chain” is used in the Abstract and in the Introduction.
Given that the concept is not vital to the architecture and that there might
be unnecessary confusion with SFC, I would suggest taking it out.

P1.2. Related… Section 1.1 says that this document “defines…the service
segments”, but there’s no specific mention of “service segments” anywhere,
except for a very quick mention in Section 5. (IGP Mirroring Context Segment).
What am I missing? Consider taking “service segments” off the list of things
this document defines.

P1.3. The last paragraph says that “This document defines a set of instructions
(called segments) that are required to fulfill the described use-cases.” The
use cases are not mentioned until 1.1. Suggestion: merge Section 1 and 1.1 to
provide proper flow to the text.

P1.4. Speaking of use cases, I-D.ietf-spring-oam-usecase doesn’t actually
include use cases that will affect the architecture. It includes use case of
how to use monitoring system defined in it. Same comment for the mention in
Section 9.

P2. From Section 2. (Terminology)

P2.1. SID is not defined, just extended. Besides the examples, please provide
an actual definition.

P2.2. “The CONTINUE instruction is implemented as the SWAP instruction in the
MPLS dataplane.” Please include a reference to where the “SWAP instruction” is
defined. I assume you’re really referring to “label swapping” in rfc3031, are
you? If so, please be consistent as “MPLS dataplane” and “MPLS architecture”
are used in different places.

P2.3. The “Local Segment” definition says that for IPv6 it “is not advertised
in any routing protocol”. But for MPLS it is advertised, right? Please
clarify. I think that some of the vocabulary is a little confusing, for
example, the definition of IGP-Adjacency Segment says that it “is local (unless
explicitly advertised otherwise) to the node that advertises it” – this means
that for IPv6 this piece of the architecture wouldn’t apply because a Local
Segment is not advertised. To add to the confusion,
draft-filsfils-spring-srv6-network-programming talks about local SIDs… IOW, in
some places it sounds as if what should be the general architecture only really
applies to MPLS – or maybe we need a more explicit local qualifier.

P2.4. “The PCEP discovery capability is described in
[I-D.ietf-pce-segment-routing].” That document doesn’t even mention the word
“discovery”. Please use the proper name for easier cross-reference.

P2.5. “unless explicitly advertised otherwise” is used to qualify the
global/local nature of a segment. How do the characteristics of the segment
change if “explicitly advertised otherwise”? For example, if an IGP-Prefix
Segment is advertised as local, how do its characteristics change, or do they?

P2.5.1. In Section 3.4. (IGP-Adjacency Segment, Adj-SID), the use of an Adj-SID
is illustrated for both the local and global cases, but both explanations say
the same thing: “forwarded along the shortest-path to N and then be switched by
N, without any IP shortest-path consideration, towards link L” – I don’t see a
difference in behavior between the local and global nature of this SID. There
is some additional text related to global: “The use of global Adj-SID allows to
reduce the size of the segment list when expressing a path at the cost of
additional state (i.e.: the global Adj-SID will be inserted by all routers
within the area in their forwarding table).” If node N is the one executing on
the Adj-SID, how does including it in other FIBs help?

P2.6. None of the segments after Section 3.4 (binding, BGP, mirroring) are
defined in the Terminology. Please do so for completeness.

P3. Referring to Figure 1: “…each PE device is connected to two routers in each
of the groups A and B.” The routers connected to the PEs (Rx) are not in
either group.

P4. s/Obviously//

P5. Section 3.4. (IGP-Adjacency Segment, Adj-SID): “The remote node MAY be an
adjacent IGP neighbor or a non-adjacent neighbor…” The “MAY” is out of place
as there isn’t another option, it is one or the other. s/MAY/may

P6. Maybe it’s because the rest of 3.5 is incomplete, but Section 3.5.2.
(Tunnel Head-end) has no context to speak of – without that, it feels like an
orphan section.

P7. Section 8.1. (MPLS Data Plane)

P7.1. You included a couple of references at the end of this section, which is
good. I would however not explicitly mention specific sections, as the whole
RFCs (rfc4381 and rfc5920) are about MPLS-related security.

P7.2. You compare the use of a single segment to RSVP-TE – if the security
characteristics are similar, please provide a reference.

P8. References

P8.1. RFC2460 was obsoleted by RFC8200 / RFC6822 was obsoleted by RFC8202

P8.2. I think that the reference to RFC4206 should be Informative.

Nits:

N1. The Abstract is very long – I would even say too long. If it was up to me,
I would leave just the first paragraph.

N2. The second to last paragraph in the Introduction (“Numerous use-cases…”)
references the “marketing style” content of I-D.ietf-spring-oam-usecase. I
have asked the authors of that document to please focus their content. Please
consider taking this paragraph out.

N3. s/participating into the source based/participating in the source based

N4. References. Please add Informative references to netconf (Section 2), PCEP
(2), FRR (3.1.1), “parallel adjacency suppression” (3.4).

N5. Section 3: “IGP segments require extensions in link-state IGP protocols.
IGP extensions are required in order to advertise the IGP segments.” These 2
sentences say the exact same thing.

N6. Please avoid using “we” in the text – except for the Acknowledgements.

N7. s/Regardless Segment Routing/Independent of Segment Routing

N8. s/and not to each other individual nodes in the LAN/and not to individual
nodes in the LAN

N9.
s/([I-D.ietf-spring-segment-routing-ldp-interop]/[I-D.ietf-spring-segment-routing-ldp-interop]

N10. “IGP deployed using areas” An area is an OSPF-specific construct –
flooding domain is more generic.

N11. It would be nice if the examples used IPv6 instead of IPv4.

_______________________________________________
spring mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/spring

[spring] AD Review of draft-ietf-spring-segment-routing-12

Reply via email to