Hi Alvaro,
Thanks for your review. See inline with [PC].
(Note that we've just posted rev21 of the draft.)
Thanks,
Pablo.
-----Original Message-----
From: Alvaro Retana via Datatracker <[email protected]>
Sent: Wednesday, September 23, 2020 22:58
To: The IESG <[email protected]>
Cc: [email protected]; [email protected];
[email protected]; Bruno Decraene <[email protected]>; Joel Halpern
<[email protected]>
Subject: Alvaro Retana's Discuss on
draft-ietf-spring-srv6-network-programming-20: (with DISCUSS and COMMENT)
Alvaro Retana has entered the following ballot position for
draft-ietf-spring-srv6-network-programming-20: Discuss
When responding, please keep the subject line intact and reply to all email
addresses included in the To and CC lines. (Feel free to cut this introductory
paragraph, however.)
Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.
The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-spring-srv6-network-programming/
----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------
I am balloting DISCUSS because I find the document unclear and lacking proper
technical details around significant functionality, as reflected in my first 3
points. The fourth point is related to the registration policy (which doesn't
match the definition in rfc8126), and my last point is for the IESG to consider.
(1) Pseudocode and normative behavior
The use of pseudocode was chosen as the mechanism to specify behavior, as
explained in §4:
An implementation of the pseudocode is compliant as long as the externally
observable wire protocol is as described by the pseudocode.
Clarity in the pseudocode is essential because it is used to determine
compliance. Several places need improvement:
(1a) In §4.1/§4.13/§4.15, the pseudocode is missing an ELSE after S04, to
include the error conditions if SL != 0. A check for an error condition when
SL is decremented is also needed. As written, the pseudocode could process the
packet (SL == 0) *and* send an ICMP time exceeded message... :-(
[PC] Please check the pseudocode in rev 21. We have included the missing break
statement in S03. I believe the current pseudocode is good.
I'm using as a reference the pseudocode in §4.3.1.1/rfc8754, which includes the
same initial statement.
(1b) It would be nice if the behavior in §4.1.1 were also specified using pseudocode. As
written, I am not sure if the intent is to process any upper-layer header or only
specific ones. Is the objective for this operation to be similar to the one in
§4.3.1.2/rfc8754? Please be specific on what is meant by "allowed by local
configuration".
[PC] Yes, we can structure the text in 4.1.1 in pseudocode form. The processing
is not the same as RFC8754/Section 4.3.1.2. The “allowed by local
configuration” is to enable the processing of only specific types of
Upper-layer Headers for packets addressed to an SRv6 SID of the specific
behaviors. For example, an operator may not wish to have BGP sessions (or in general
any TCP traffic) destined to a local SID, but may want to enable ICMPv6 packet
processing for OAM purposes.
<OLD>
4.1.1. Upper-Layer Header
When processing the Upper-layer Header of a packet matching a FIB
entry locally instantiated as an SRv6 End SID, if Upper-layer Header
processing is allowed by local configuration (e.g. ICMPv6), then
process the upper-layer header. Otherwise, send an ICMP parameter
problem message to the Source Address and discard the packet. Error
code 4 (SR Upper-layer Header Error) and Pointer set to the offset of
the upper-layer header.
</OLD>
<NEW>
4.1.1. Upper-Layer Header
When processing the Upper-layer Header of a packet matching a FIB entry locally
instantiated as an SRv6 End SID do the following:
S01. If (Upper-Layer Header type is allowed by local configuration) {
S02. Process the Upper-layer Header
S03. } Else {
S04. Send an ICMP Parameter Problem to the Source Address,
Code 4 (SR Upper-layer Header Error),
Pointer set to the offset of the Upper-layer Header,
Interrupt packet processing and discard the packet.
S05. }
Notes.
S01. An operator may not wish to have any TCP traffic destined to a local SID,
but may want to enable ICMPv6 packet processing for OAM purposes.
</NEW>
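For illustration only, the proposed S01-S05 logic can be sketched in Python. The allow-list contents, the function name `process_upper_layer` and the returned tuples are assumptions made for this sketch and are not taken from the draft; only the Parameter Problem code 4 and the Pointer semantics come from the text above.

```python
# Hypothetical sketch of the proposed Section 4.1.1 pseudocode (S01-S05).
ICMPV6 = 58  # IANA protocol number for ICMPv6
TCP = 6      # IANA protocol number for TCP

# Assumption: local configuration is modelled as a set of allowed
# upper-layer protocol numbers; here only ICMPv6 is enabled (OAM).
ALLOWED_UPPER_LAYER = {ICMPV6}

# ICMP Parameter Problem code quoted in the text above.
SR_UPPER_LAYER_HEADER_ERROR = 4


def process_upper_layer(next_header: int, ulh_offset: int):
    """Return the action taken for a packet matching a local End SID."""
    if next_header in ALLOWED_UPPER_LAYER:        # S01
        return ("process", next_header)           # S02
    # S04: report the error to the source, then discard the packet.
    # The Pointer is the offset of the upper-layer header.
    return ("icmp_param_problem", SR_UPPER_LAYER_HEADER_ERROR, ulh_offset)
```

With this configuration, an ICMPv6 packet is processed, while a TCP segment triggers the Parameter Problem message and is dropped.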
[Note: this point by itself is not DISCUSS-worthy, but §4.1.1 is used, for
different reasons, in some of the other items I point to below. That is why I
include it here.]
(1c) §4.4/§4.6: S01 of the second piece of pseudocode is an instruction for
processing a non-IPv6 upper header. However, earlier in that section, it is
specified that the SID "is associated with one or more L3 IPv6 adjacencies/an
IPv6 FIB table". How can the upper header not be IPv6 if the specification
explicitly says it has to be?
[PC] The pseudocode is convoluted. I propose to turn it around for 4.4, 4.5,
4.6 and 4.7. As an example with 4.4:
<OLD>
When processing the Upper-layer header of a packet matching a FIB
entry locally instantiated as an SRv6 End.DX6 SID, the following is
done:
S01. If (Upper-Layer Header type != 41(IPv6) ) {
S02. Process as per Section 4.1.1
S03. }
S04. Remove the outer IPv6 Header with all its extension headers
S05. Forward the exposed IPv6 packet to the L3 adjacency J
</OLD>
<NEW>
When processing the Upper-layer header of a packet matching a FIB
entry locally instantiated as an SRv6 End.DX6 SID, the following is
done:
S01. If (Upper-Layer Header type == 41(IPv6) ) {
S02. Remove the outer IPv6 Header with all its extension headers
S03. Forward the exposed IPv6 packet to the L3 adjacency J
S04. }
S05. Else {
S06. Process as per Section 4.1.1
S07. }
</NEW>
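As a rough sketch of the restructured branch (not the draft's normative text), the same control flow reads as follows in Python. The function name, the adjacency argument and the returned tuples are hypothetical and exist only for this example.

```python
# Hypothetical sketch of the restructured End.DX6 upper-layer processing.
IPV6_ENCAP = 41  # Upper-layer header type 41 (IPv6), as in S01


def end_dx6_upper_layer(next_header: int, inner_packet: bytes, adjacency: str):
    """Decapsulate and forward only when the payload is an IPv6 packet."""
    if next_header == IPV6_ENCAP:                     # S01
        # S02-S03: the outer IPv6 header and its extension headers are
        # dropped, so only the exposed inner packet is forwarded.
        return ("forward", adjacency, inner_packet)
    # S05-S06: anything else falls back to Section 4.1.1 processing.
    return ("section_4_1_1", next_header)
```

Written this way, the non-IPv6 case can no longer fall through into the decapsulation steps.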
(1d) §4.5/§4.7 have the same issue but related to IPv4.
[PC] We’ve clarified the pseudocode along the same lines as 1c.
(1e) §4.9 also has the same issue when it specifies that "End.DX2 SID...is
associated with one outgoing interface I", but allows for the processing of
non-ethernet payloads which could then be forwarded through a different outgoing
interface.
[PC] We’ve clarified the pseudocode along the same lines as 1c.
(1f) §4.11/§4.12 allows the processing of non-ethernet payloads, which will not be
"associated with an L2 Table T" as described.
[PC] We’ve clarified the pseudocode along the same lines as 1c.
(2) §4.12 describes the only behavior that can carry an ARG. I don't
understand how it works:
Arg.FE2 is encoded in the SID as an (k*x)-bit value. These bits
represent a list of up to k OIFs, each identified with an x-bit
value. Values k and x are defined on a per End.DT2M SID basis. The
interface identifier 0 indicates an empty entry in the interface
list.
Let's assume a router has 10 possible OIFs, and the operator uses 4-bit values
to identify them; then, the ARG would take 40 bits of the SID. Is that how the
math works?
Assuming my interpretation is correct, for 20 OIFs and 5-bit values we would
need 100 bits. Considering the examples in §3.2, where a /64 is allocated to a
router, this behavior wouldn't have enough bits! I realize that maybe a better
encoding would be to use a 20-bit field, each representing an interface.
However, there would still be a limit of < 64 OIFs. Am I missing something?
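The arithmetic in the question above can be checked with a few lines of Python. This is only a sketch: the helper names are made up for the example, and `func_bits=0` is an optimistic assumption (any FUNC bits reduce the room left for the argument further).

```python
# Sketch of the bit budget for Arg.FE2 under the quoted (k*x)-bit encoding.
SID_BITS = 128  # an SRv6 SID is a 128-bit IPv6 address


def list_encoding_bits(k: int, x: int) -> int:
    """Bits needed for a list of up to k OIFs, each an x-bit identifier."""
    return k * x


def bitmap_encoding_bits(num_oifs: int) -> int:
    """Bits needed for the alternative: one bit per interface."""
    return num_oifs


def fits(arg_bits: int, locator_len: int = 64, func_bits: int = 0) -> bool:
    """True if LOC + FUNC + ARG fit in a 128-bit SID."""
    return locator_len + func_bits + arg_bits <= SID_BITS


print(list_encoding_bits(10, 4), fits(list_encoding_bits(10, 4)))
print(list_encoding_bits(20, 5), fits(list_encoding_bits(20, 5)))
print(bitmap_encoding_bits(20), fits(bitmap_encoding_bits(20)))
```

With a /64 locator, the 10-OIF case (40 bits) fits, the 20-OIF case (100 bits) does not, and a per-interface bitmap is bounded by the 64 bits that remain after the locator.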
[PC] For the End.DT2M behavior, Arg.FE2 is a locally allocated Ethernet Segment
Identifier that is used for split-horizon filtering as described in RFC7432.
The text that you have quoted above needs to be removed since it is trying to
describe one way of allocation (which obviously has its limitations). Instead,
we will update this text to clarify the semantics and purpose of Arg.FE2;
its allocation method will be left implementation-specific (similar to an
ESI label).
[PC] Please see the updated text in rev21.
I'm trying to ultimately get to the fact that there are limits to this
behavior, but they are not described in the document. Please clearly explain
any limitations and any possible workaround.
(3) The description of the flavors in §4.16 is also unclear.
The section starts with this introduction:
The PSP, USP and USD flavors are variants of the End, End.X and End.T
behaviors. For each of these behaviors these flavors MAY be
supported for a SID either individually or in combinations.
By being "variants", I interpret that the behavior is different than what is
specified in §4.1.
(3a) Some of the behaviors, as listed in Table 4, include an indication of the
flavors. How are the values interpreted? For example, the Table lists 8
different behaviors related to End:
| 1 | 0x0001 | End (no PSP, no USP) | [This.ID] |
| 2 | 0x0002 | End with PSP | [This.ID] |
| 3 | 0x0003 | End with USP | [This.ID] |
| 4 | 0x0004 | End with PSP&USP | [This.ID] |
...
| 28 | 0x001C | End with USD | [This.ID] |
| 29 | 0x001D | End with PSP&USD | [This.ID] |
| 30 | 0x001E | End with USP&USD | [This.ID] |
| 31 | 0x001F | End with PSP, USP & USD | [This.ID] |
Is value 1 what is specified in §4.1? Or does it include USD (which is not
explicitly excluded)?
[PC] This has been corrected in revision 20. Can you please recheck and let me
know? E.g. 0x0001 does not include any of the flavors.
[PC] For the remainder of the comments in this section, let's recall that a
single Segment Endpoint Behavior codepoint is bound to a SID at a segment
endpoint node. A node computing a segment list with a particular SID knows
its associated behavior. A segment endpoint node receiving a packet destined
to a locally instantiated SID performs only the processing associated with the
behavior bound to that SID. A behavior is only bound to a SID, never to a node.
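To make the "one codepoint per SID" point concrete, here is a small Python sketch. The flavor sets mirror only the End rows quoted from Table 4 above; the SID value and the `LOCAL_SIDS` table are hypothetical and exist only for this example.

```python
# Sketch: each local SID is bound to exactly one endpoint behavior codepoint,
# and the codepoint alone determines which flavors apply.
END_FLAVORS_BY_CODEPOINT = {
    0x0001: frozenset(),                       # End (no PSP, no USP)
    0x0002: frozenset({"PSP"}),
    0x0003: frozenset({"USP"}),
    0x0004: frozenset({"PSP", "USP"}),
    0x001C: frozenset({"USD"}),
    0x001D: frozenset({"PSP", "USD"}),
    0x001E: frozenset({"USP", "USD"}),
    0x001F: frozenset({"PSP", "USP", "USD"}),
}

# Hypothetical local SID table on a segment endpoint node.
LOCAL_SIDS = {"2001:db8:0:1:e000::": 0x0004}


def flavors_for(destination: str):
    """Return the flavor set bound to a local SID, or None if not local."""
    codepoint = LOCAL_SIDS.get(destination)
    if codepoint is None:
        return None  # not a locally instantiated SID
    return END_FLAVORS_BY_CODEPOINT[codepoint]
```

A packet destined to the example SID is processed with the PSP and USP flavors only, because that is the single behavior bound to it.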
(3b) If a behavior with more than one flavor is signaled, how should the
receiving node determine which one to apply? I guess that the application of
behaviors 4 or 29 depends on the number of SLs -- the expected behavior should
be clearly specified.
[PC] The segment endpoint node receiving a packet destined to a SID with
behavior 4 applies only the processing associated with that SID (i.e.,
behavior 4).
(3c) Is it assumed that all nodes support all behaviors? Are there mandatory
to implement behaviors? Should the behavior be advertised before it is used?
[PC] The answer to the first two questions is no. For the third, if a node
computing a segment list does not know of a SID (and its behavior) it will not
be able to 'use' the SID in a segment list.
(3d) §4.16.1.2:
When a SID of PSP-flavor is processed at a non-penultimate SR Segment
Endpoint Node, the PSP behavior is not performed as described in the
pseudocode below since Segments Left would not be zero.
For example, for the End behavior, I'm assuming that behavior 1 is performed
instead of 2 (or 4, or 29, or 31) if SL != 0. Should this be done even if the
node did not advertise the non-PSP flavor?
[PC] If a SID of End behavior (1) is instantiated at a segment endpoint node, a
packet destined to that SID will only ever be processed with behavior 1.
If the node is not known to support the PSP flavor, should it be an error to
receive a packet requesting that behavior?
[PC] If a node does not support PSP, then it has not instantiated any SID with
a segment endpoint behavior including PSP, and it is not possible for it to
receive a packet destined to a local SID it has not instantiated.
If only the PSP flavor is advertised, can the Source assume that the node also
supports the non-PSP flavor?
[PC] If a SID with PSP flavor is advertised (i.e., segment endpoint behavior
codepoint 2) by a segment endpoint node, an SR source node can only expect that
the SID has that behavior bound to it.
[BTW, I'm asking about advertisement because §4.16.1.1 makes the statement
that the nodes "advertise the SIDs instantiated on them via control plane
protocols as described in Section 9". Even though §9 says that control
plane protocols are "not necessary for an SDN control plane" because "one
expects the controller to explicitly provision the SIDs".]
(3e) §4.16.2 describes the USP flavor, which is one where the endpoint consumes
the packet by processing the next header. I don't understand how the outcome
due to the extended process is different from the original one in §4.1. Can
you please explain? It seems to me that the externally observable result is
the same.
[PC] We have use-cases where packets with an SRH may be destined to
applications or host implementations running in containers. The USP flavor is
useful to remove the consumed SRH from the extension header chain before
handing the packet over to the application stack; we’ve seen this with
smartNICs. As such, the perspective on external observability differs, and
hence we believe this needs to be specified.
I have the same question about the USD flavor and the externally observable
behavior related to §4.1.
[PC] The USD flavor specifically enables the decapsulation of the inner IP
packet and its further forwarding (consider a use-case like TI-LFA, where
encapsulation is done on the PLR and decapsulation has to be done on the
last node of the repair list). In this case the PLR node that is crafting the
SID list wants to ensure that the last segment in the repair list is able to
perform decapsulation.
In general, the observable behavior of §4.1, USP, and USD seem the same to me.
The next two points are related.
(3f) §4.16.3 describes the USD flavor, which assumes that the decapsulation
results in a packet that can be forwarded. Can the FIB lookup result in a
local destination?
[PC] Please refer to the previous comment about the use-case. Yes, we
normally expect that the decapsulation results in a packet that is forwarded
out. However, the inner packet may also be destined to a local address.
(3g) Does the USD flavor mean that, for the End behavior (as described in §4.1), the
action of "process the next header in the packet" cannot result in a forwarded
packet? Same question for the USP behavior?
[PC] Please refer to the previous comments. There is no such assumption for
either the base End behavior or End with USP.
(3h) The last paragraph in §4.16.3:
An implementation that supports the USD flavor in conjunction with
the USP flavor MAY optimize the packet processing by first looking
whether the conditions for the USD flavor are met, in which case it
can proceed with USD processing else do USP processing.
What are the "conditions for the USD flavor"? As far as I can tell from the
document, the only condition is for the specific behavior to be signaled. What else?
[PC] I've removed this paragraph. This is an implementation optimization and
provides no value to the pseudocode preceding it.
Going back to the questions above... When is the option to optimize possible?
Does a specific behavior have to be used? Behavior 30 (End with USP&USD)? Or
can it also optimize if behavior 3 (End with USP) is signaled?
(4) §10.2 creates a new registry with an "FCFS" registration procedure. I am assuming
that this is the same as the "First Come First Served" (no
abbreviation!) policy from rfc8126; please add a reference if that is the case.
[PC] Ack. We will change FCFS to “First Come First Served [RFC8126]”.
The description used is not the same as what rfc8126 specifies:
- "Requests for allocation...must include a...preferably also a brief
description of how the value will be used." Using "preferably" indicates
that a description is optional. However, it is not optional in rfc8126.
- "...brief description...may be provided with a reference to an Internet
Draft or an RFC or in some other documentation that is permanently and
readily available." There is no such requirement in rfc8126. For example,
the "Specification Required" policy requires "a permanent and readily
available public specification". Is that what you want instead?
[PC] Indeed, the current text is wrong. My bad. I've updated the text with this
diff below. Also, I’m not sure whether that paragraph is really needed. Maybe
just putting in the table “First Come First Served [RFC8126]” is sufficient as
RFC8126 already describes what is written in the text below. If it can be
removed please let me know.
<OLD>
Requests for allocation from within the FCFS range must include a
point of contact and preferably also a brief description of how the
value will be used. This information may be provided with a
reference to an Internet Draft or an RFC or in some other
documentation that is permanently and readily available.
</OLD>
<NEW>
Requests for allocation from within the First Come First Served range must
include a point of contact and a brief description of how the value will be
used.
</NEW>
(5) This point is for the IESG to discuss.
§4.16.1.2:
The End, End.X and End.T behaviors with PSP do not contravene
Section 4 of [RFC8200] because the destination address of the
incoming packet is the address of the node executing the behavior.
The spring WG's interpretation of rfc8200 was a central point in the appeal
presented against the WG consensus on this document. The text above, I
believe, reflects that consensus.
However, given that the document relies on the spring WG's interpretation of
rfc8200, I think it would be better if the text is explicit.
Suggestion: add at the end of the paragraph:
This conclusion represents the consensus interpretation of the spring WG.
----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------
(1) §3.1:
An SRv6 endpoint behavior MAY require additional information for its
processing (e.g. related to the flow or service). This information
may be encoded in the ARG bits of the SID.
The sentence is simply stating a fact, not a normative behavior. s/MAY/may
[PC] This is fixed in rev20.
(2) All the examples in §3.2 have a /48 prefix allocated to the SRv6 deployment
(and then /64s per node). Is it possible to start with a different SRv6
infrastructure allocation, a /64, or /96 maybe? If so, please include an
example. If not, please explain any limitations.
[PC] The examples are based on real deployments and as such reflect the
practical aspects of those operators' SRv6 infrastructure allocation designs. It
would be counter-productive and misleading to provide artificially manufactured
examples (and then why just /96 and not something else?). The document does not
pose any limitations.
(3) §4 starts by saying that "Each FIB entry indicates the behavior associated with
a SID instance and its parameters." But the previous section (§3.3. SID
Reachability) says that "node N would advertise IPv6 prefix(es) matching the LOC
parts covering its SIDs or shorter-mask prefix" (not the behavior). IOW,
§3.3 sets the expectation of an advertisement covering just the LOC, but §4
seems to expect entries that cover the LOC+FUNC. Which one is correct?
[PC] Section 3.3 talks about the Prefix Reachability advertisement for the SRv6
Locator. Section 4 talks about FIB entries instantiated on the SR Segment
Endpoint Node that has allocated those local SIDs; please note that they are
local entries. There is no discussion of their advertisement in Section 4. Can
you please check whether the updated text in Section 4 of rev21 clarifies this?
In the end, it may not matter which entry is in the FIB, as long as the SID is
reachable. However, the specification of the behavior feels sloppy.
(4) §4.9/§4.10: For the S04 step, perhaps decompose it into individual actions
(similar to S04-S06 in §4.7).
[PC] Fixed in rev21.
(5) §4.11/§4.12 "S05. Learn the exposed MAC Source Address..."
The note related to this step says that in "EVPN, the learning...is done via the
control plane"...but here it is done via the data plane. What, if any, is the
effect on EVPN operation? Are there issues with learning conflicting information from
different sources? It seems to me that it could be relatively easy to spoof the source
and create unexpected entries in the L2 table. Please point to the EVPN documents where
this type of operation is considered.
[PC] Indeed, text is inaccurate. I've updated the note to the following:
<OLD>
S05. In EVPN, the learning of the exposed MAC Source Address is done
via the control plane.
</OLD>
<NEW>
S03. In EVPN [RFC7432], the learning of the exposed MAC Source Address is done
via the control plane. In L2VPN VPLS [RFC4761] [RFC4762], reachability is
obtained by standard learning bridge functions in the data plane.
</NEW>
(6) §4.10/§4.11/§4.12 don't have references to the example applications
mentioned. Please add Informative references.
[PC] Ack. I’ve added references to RFC7432, RFC4761, RFC4762 and RFC8317.
(7) §4.13/§4.15 instantiate a Binding SID, but only in the case where SL != 0.
What about the case where a Binding SID wouldn't require an extra encapsulation
(SL == 0)? Is there a reason that it is not supported in this document?
[PC] Such a requirement has not yet arisen. If need arises in the future, a new
behavior may be defined by a future document. This document provides the
framework and the extensibility to do so.
(8) §5.1: I'm assuming that the last line in this section (the one starting with S03)
should be preceded by "Note:".
[PC] Fixed.
(9) §5.1: "The H.Encaps behavior is valid for any kind of Layer-3 traffic."
While it may be used for any kind of traffic, I'm assuming that there will be a
policy that determines which traffic is encapsulated using a specific SRv6
policy, right? Please be specific about that.
[PC] This document does not describe how traffic is steered into an SR Policy;
it may be steered by a route installed by BGP, a static route, some
application-specific selection, etc. The steering of a packet into an SR Policy
is out of scope of this document.
(10) §5.3: "Ethernet [IEEE.802.3_2012]" Please use the reference when Ethernet
is first used in the document. [I have the same question as Rob related to the version
of the 802.3 spec.]
[PC] Fixed in rev20.
(11) §5.3: "...MUST remove the preamble or frame check sequence (FCS) from the
Ethernet frame upon encapsulation and the decapsulating node MUST regenerate
the preamble or FCS before forwarding Ethernet frame." Which one? The
preamble can be easily recreated by the receiver, while removing the FCS may be more
problematic -- even if the FCS is not checked in transit, it seems that it would be
important to carry it. In any case, the real question here is: why use "or"?
Is it left at the discretion of the encapsulating node? Are there any considerations
when selecting?
[PC] You are correct. Corrected to the following:
<OLD>
The encapsulating node MUST remove the preamble or frame check sequence (FCS)
from the Ethernet frame upon encapsulation and the decapsulating node MUST
regenerate the preamble or FCS before forwarding Ethernet frame.
</OLD>
<NEW>
The encapsulating node MUST remove the preamble (if any) and frame check
sequence (FCS) from the Ethernet frame upon encapsulation and the decapsulating
node MUST regenerate, as required, the preamble and FCS before forwarding
Ethernet frame.
</NEW>
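As an illustration of "remove on encapsulation, regenerate on decapsulation", here is a minimal Python sketch for the FCS part. It assumes the frame bytes carry no preamble, that the FCS is the standard CRC-32 (which `zlib.crc32` computes) appended least significant byte first, and it ignores minimum-frame padding. The function names are made up for the example.

```python
import struct
import zlib

FCS_LEN = 4  # the Ethernet frame check sequence is 4 octets


def compute_fcs(frame_without_fcs: bytes) -> bytes:
    """CRC-32 over the frame, serialized least significant byte first."""
    return struct.pack("<I", zlib.crc32(frame_without_fcs) & 0xFFFFFFFF)


def strip_fcs(frame_with_fcs: bytes) -> bytes:
    """Encapsulating side: drop the trailing FCS before encapsulation."""
    return frame_with_fcs[:-FCS_LEN]


def regenerate_fcs(frame_without_fcs: bytes) -> bytes:
    """Decapsulating side: append a freshly computed FCS before forwarding."""
    return frame_without_fcs + compute_fcs(frame_without_fcs)
```

Because the FCS is recomputed at the decapsulating node, it protects the frame only on the final Ethernet hop, not across the SRv6 encapsulation.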
(12) All the headend behaviors (§5) include this text:
The push of the SRH MAY be omitted when the SRv6 Policy only contains
one segment and there is no need to use any flag, tag or TLV.
If the endpoint behavior indicates the PSP or USP flavors, what should the
receiver do? Clearly there is no SRH to pop. Is this an error or should the
receiver simply ignore the flavor?
[PC] If there is no SRH, then the SRH processing is not executed.
The PSP and USP flavors only make changes in the SRH processing pseudocode,
hence it is not executed.
(13) §6: "counter...for traffic that matched that SID and was processed correctly" Does
"processed correctly" include the case where the result is an ICMP error message? Or should
those be counted separately?
[PC] Packets that result in an ICMP error message or those that are dropped are
not counted as correctly processed. I've updated the text.
<OLD>
A node supporting this document SHOULD implement a pair of traffic
counter (one for packets and one for bytes) per local SID entry, for
traffic that matched that SID and was processed correctly.
</OLD>
<NEW>
A node supporting this document SHOULD implement a pair of traffic
counters (one for packets and one for bytes) per local SID entry, for
traffic that matched that SID and was processed successfully (i.e. packets
which generate ICMP Error Messages or are dropped are not counted).
</NEW>
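A minimal sketch of such counters, assuming a hypothetical `account` hook that is called once per packet with the processing outcome (the outcome strings and all names here are illustrative, not from the draft):

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class SidCounters:
    packets: int = 0
    octets: int = 0


# One pair of counters per local SID entry.
counters = defaultdict(SidCounters)


def account(sid: str, length: int, outcome: str) -> None:
    """Count a packet only if it was processed successfully."""
    if outcome != "ok":
        # Packets that generate an ICMP error or are dropped are not counted.
        return
    entry = counters[sid]
    entry.packets += 1
    entry.octets += length


account("2001:db8:0:1:e000::", 100, "ok")
account("2001:db8:0:1:e000::", 200, "icmp_error")
account("2001:db8:0:1:e000::", 300, "ok")
```

After these three calls the example SID shows 2 packets and 400 octets; the packet that produced an ICMP error is not reflected in either counter.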
(14) §7: I'm guessing that "flow-based hash" and "load-balancing hash" are the
same thing, is that correct? It would be nice to use consistent terminology.
[PC] Fixed.
(15) §8: A rogue node inside the SR domain may (on purpose) signal the wrong
behavior for a flow, which may result in the delivery of the traffic to the
wrong destination (potentially including destinations outside the domain),
among other things. Note that this action is possible even if the rogue node
is authenticated and authorized to generate an SRH. I didn't find this threat
mentioned in rfc8402/rfc8754.
[PC] The control plane protocol specifics are outside the scope of this
document. I am not able to parse this comment or determine what needs to be
addressed in this document.
(16) §9.4: I'm not sure what the purpose of §9 is, as a whole. But the summary
in §9.4 puzzles me more; what is the intent? Does Table 1 indicate that, for
example, an IGP implementation should not advertise the End.B6.Encaps behavior?
Does Table 2 indicate that only BGP-LS should signal the ability to
H.Encaps.L2? I am confused about the value/intent because the text clearly
says that the control plane is outside the scope of this document.
[PC] The section provides an overview of the role of control plane routing
protocols in the advertisements of the SRv6 Locator and the SIDs along with
their behaviors – all new aspects that have been introduced in this document.
Based on the SRv6 solutions developed around the behaviors introduced in this
document, it indicates what information is expected to be advertised via which
protocol. It does not describe “how” since that is clearly outside the scope of
this document and part of the individual routing protocol extensions.
(17) [nits]
s/an network operator/a network operator
s/one billionth and one millionth of the assigned address space/one billionth
and one millionth of the available address space
s/packet's header Section 7/packet's header (Section 7)/g
[PC] Those three have been fixed.
s/bundle(LAG)/bundle (LAG)
Please expand LAG.
[PC] "(LAG)" has been removed as per the comments of another AD.
Thank you for your time Alvaro!