Hi Alvaro,
Thanks for your review. See inline with [PC].
(Note that we've just posted rev21 of the draft.)
Thanks,
Pablo.
-----Original Message-----
From: Alvaro Retana via Datatracker <[email protected]>
Sent: Wednesday, 23 September 2020 22:58
To: The IESG <[email protected]>
Cc: [email protected];
[email protected]; [email protected]; Bruno Decraene
<[email protected]>; Joel Halpern <[email protected]>
Subject: Alvaro Retana's Discuss on
draft-ietf-spring-srv6-network-programming-20: (with DISCUSS and COMMENT)
Alvaro Retana has entered the following ballot position for
draft-ietf-spring-srv6-network-programming-20: Discuss
When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut
this introductory paragraph, however.)
Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.
The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-spring-srv6-network-programming/
----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------
I am balloting DISCUSS because I find the document unclear and lacking
proper technical details around significant functionality, as
reflected in my first 3 points. The fourth point is related to the
registration policy (which doesn't match the definition in rfc8126),
and my last point is for the IESG to consider.
(1) Pseudocode and normative behavior
The use of pseudocode was chosen as the mechanism to specify behavior,
as explained in §4:
An implementation of the pseudocode is compliant as long as the
externally observable wire protocol is as described by the pseudocode.
Clarity in the pseudocode is essential because it is used to determine
compliance. Several places need improvement:
(1a) In §4.1/§4.13/§4.15, the pseudocode is missing an ELSE after S04,
to include the error conditions if SL != 0. A check for an error
condition when SL is decremented is also needed. As written, the
pseudocode could process the packet (SL == 0) *and* send an ICMP time
exceeded message... :-(
[PC] Please check the pseudocode in rev 21. We have included the
missing break statement in S03. I believe the current pseudocode is good.
I'm using as a reference the pseudocode in §4.3.1.1/rfc8754, which
includes the same initial statement.
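To illustrate the control-flow point in (1a), here is a minimal Python sketch. It is not the draft's text: the function name, the returned strings, and the simplified Hop Limit check are assumptions made for illustration only. The point is that the SL == 0 branch returns early (the "break"), so a packet is never both handed to the next header and answered with an ICMP Time Exceeded message.

```python
def process_srh(segments_left: int, hop_limit: int) -> str:
    """Hypothetical, simplified control flow for an End-like behavior."""
    if segments_left == 0:
        # The "break": stop SRH processing and move on to the next header.
        return "process-next-header"
    if hop_limit <= 1:
        # Only reachable when SL != 0 because of the early return above.
        return "icmp-time-exceeded"
    # Remaining steps (decrement SL, update DA, FIB lookup) are elided.
    return "forward-to-next-segment"
```

With this shape, a packet with SL == 0 and Hop Limit 1 takes only the first branch.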
(1b) It would be nice if the behavior in §4.1.1 were also specified
using pseudocode. As written, I am not sure if the intent is to
process any upper-layer header or only specific ones. Is the
objective for this operation to be similar to the one in
§4.3.1.2/rfc8754? Please be specific on what is meant by "allowed by
local configuration".
[PC] Yes, we can structure the text in 4.1.1 in pseudocode form. The
processing is not the same as RFC8754/Section 4.3.1.2. The “allowed by
local configuration” condition enables the processing of only specific
types of Upper-layer Headers for packets addressed to an SRv6 SID of
the specific behaviors. E.g., an operator may not wish to have BGP
sessions (or in general any TCP traffic) destined to a local SID, but
may want to enable ICMPv6 packet processing for OAM purposes.
<OLD>
4.1.1. Upper-Layer Header
When processing the Upper-layer Header of a packet matching a FIB
entry locally instantiated as an SRv6 End SID, if Upper-layer Header
processing is allowed by local configuration (e.g. ICMPv6), then
process the upper-layer header. Otherwise, send an ICMP parameter
problem message to the Source Address and discard the packet. Error
code 4 (SR Upper-layer Header Error) and Pointer set to the offset of
the upper-layer header.
</OLD>
<NEW>
4.1.1. Upper-Layer Header
When processing the Upper-layer Header of a packet matching a FIB
entry locally instantiated as an SRv6 End SID do the following:
S01. If (Upper-Layer Header type is allowed by local configuration) {
S02.   Process the Upper-layer Header
S03. } Else {
S04.   Send an ICMP Parameter Problem to the Source Address,
          Code 4 (SR Upper-layer Header Error),
          Pointer set to the offset of the Upper-layer Header,
          Interrupt packet processing and discard the packet.
S05. }
Notes.
S01. An operator may not wish to have any TCP traffic destined to a
local SID, but may want to enable ICMPv6 packet processing for OAM
purposes.
</NEW>
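As a rough illustration of the proposed 4.1.1 logic, here is a Python sketch. The function name, the dictionary return shape, and the representation of "local configuration" as a set of protocol numbers are assumptions for illustration; only the ICMP code (4) and the process/discard split come from the text above.

```python
ICMPV6 = 58  # IANA protocol number for ICMPv6
TCP = 6      # IANA protocol number for TCP

def upper_layer_action(next_header: int, offset: int, allowed: frozenset) -> dict:
    """Decide what to do with the Upper-layer Header of a packet to a local SID."""
    if next_header in allowed:
        # S01-S02: header type allowed by local configuration.
        return {"action": "process"}
    # S04: ICMP Parameter Problem, Code 4, Pointer at the Upper-layer Header.
    return {
        "action": "discard",
        "icmp": "parameter-problem",
        "code": 4,
        "pointer": offset,
    }
```

An operator enabling only OAM would pass allowed=frozenset({ICMPV6}); TCP to the SID would then be discarded with the error above.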
[Note: this point by itself is not DISCUSS-worthy, but §4.1.1 is used,
for different reasons, in some of the other items I point to below.
That is why I include it here.]
(1c) §4.4/§4.6: S01 of the second piece of pseudocode is an
instruction for processing a non-IPv6 upper header. However, earlier
in that section, it is specified that the SID "is associated with one
or more L3 IPv6 adjacencies/an IPv6 FIB table". How can the upper
header not be IPv6 if the specification explicitly says it has to be?
[PC] The pseudocode is convoluted. I propose to turn it around for
4.4, 4.5, 4.6 and 4.7. As an example with 4.4:
<OLD>
When processing the Upper-layer header of a packet matching a FIB
entry locally instantiated as an SRv6 End.DX6 SID, the following is
done:
S01. If (Upper-Layer Header type != 41(IPv6) ) {
S02. Process as per Section 4.1.1
S03. }
S04. Remove the outer IPv6 Header with all its extension headers
S05. Forward the exposed IPv6 packet to the L3 adjacency J
</OLD>
<NEW>
When processing the Upper-layer header of a packet matching a FIB
entry locally instantiated as an SRv6 End.DX6 SID, the following is
done:
S01. If (Upper-Layer Header type == 41(IPv6) ) {
S02. Remove the outer IPv6 Header with all its extension headers
S03. Forward the exposed IPv6 packet to the L3 adjacency J
S04. }
S05. Else {
S06. Process as per Section 4.1.1
S07. }
</NEW>
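The restructured End.DX6 pseudocode can be sketched in Python as follows. The function name and the action strings are hypothetical; the sketch only shows that the expected case (Upper-Layer Header type 41) is now the primary branch and everything else falls through to Section 4.1.1.

```python
IPV6 = 41  # Upper-Layer Header type for an encapsulated IPv6 packet

def end_dx6_upper_layer(next_header: int) -> list:
    """Return the ordered actions for a packet matching an End.DX6 SID."""
    if next_header == IPV6:
        # S02-S03 of the proposed text.
        return [
            "remove-outer-ipv6-header-and-extension-headers",
            "forward-exposed-packet-to-l3-adjacency-J",
        ]
    # S06: any other header type is handled per Section 4.1.1.
    return ["process-per-section-4.1.1"]
```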
(1d) §4.5/§4.7 have the same issue but related to IPv4.
[PC] We’ve clarified the pseudocode along the same lines as 1c.
(1e) §4.9 also has the same issue when it specifies that "End.DX2
SID...is associated with one outgoing interface I", but allows for the
processing of non-ethernet payloads which could then be forwarded
through a different outgoing interface.
[PC] We’ve clarified the pseudocode along the same lines as 1c.
(1f) §4.11/§4.12 allows the processing of non-ethernet payloads, which
will not be "associated with an L2 Table T" as described.
[PC] We’ve clarified the pseudocode along the same lines as 1c.
(2) §4.12 describes the only behavior that can carry an ARG. I don't
understand how it works:
Arg.FE2 is encoded in the SID as an (k*x)-bit value. These bits
represent a list of up to k OIFs, each identified with an x-bit
value. Values k and x are defined on a per End.DT2M SID basis. The
interface identifier 0 indicates an empty entry in the interface
list.
Let's assume a router has 10 possible OIFs, and the operator uses
4-bit values to identify them; then, the ARG would take 40 bits of the
SID. Is that how the math works?
Assuming my interpretation is correct, for 20 OIFs and 5-bit values we
would need 100 bits. Considering the examples in §3.2, where a /64 is
allocated to a router, this behavior wouldn't have enough bits! I
realize that maybe a better encoding would be to use a 20-bit field,
each representing an interface.
However, there would still be a limit of < 64 OIFs. Am I missing
something?
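The arithmetic in this point can be checked with a short Python sketch. The helper names are hypothetical, and the 16-bit function length used as a default is an assumption for illustration (the question above only fixes the /64 locator).

```python
SID_BITS = 128  # an SRv6 SID is an IPv6 address

def list_encoding_bits(k: int, x: int) -> int:
    """ARG size when up to k OIFs are each encoded as an x-bit identifier."""
    return k * x

def bitmap_encoding_bits(num_oifs: int) -> int:
    """ARG size for the alternative of one bit per interface."""
    return num_oifs

def arg_fits(arg_bits: int, locator_bits: int = 64, function_bits: int = 16) -> bool:
    """True if LOC + FUNC + ARG fit in one SID (function_bits is assumed)."""
    return locator_bits + function_bits + arg_bits <= SID_BITS
```

Under these assumptions, 10 OIFs at 4 bits each need 40 bits and fit, 20 OIFs at 5 bits each need 100 bits and do not, and a 20-bit bitmap fits. Even with no function bits at all, a /64 locator leaves at most 64 bits for the argument.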
[PC] For the End.DT2M behavior, Arg.FE2 is a locally allocated
Ethernet Segment Identifier that is used for split-horizon filtering
as described in RFC7432. The text that you have quoted above needs to
be removed, since it describes only one way of allocating the argument
(which obviously has its limitations). Instead, we will update the
text to clarify the semantics and purpose of Arg.FE2; its allocation
method will be left implementation-specific (similar to an ESI label).
[PC] Please see the updated text in rev21.
I'm trying to ultimately get to the fact that there are limits to this
behavior, but they are not described in the document. Please clearly
explain any limitations and any possible workaround.
(3) The description of the flavors in §4.16 is also unclear.
The section starts with this introduction:
The PSP, USP and USD flavors are variants of the End, End.X and End.T
behaviors. For each of these behaviors these flavors MAY be
supported for a SID either individually or in combinations.
By being "variants", I interpret that the behavior is different than
what is specified in §4.1.
(3a) Some of the behaviors, as listed in Table 4, include an
indication of the flavors. How are the values interpreted? For
example, the Table lists 8 different behaviors related to End:
| 1 | 0x0001 | End (no PSP, no USP) | [This.ID] |
| 2 | 0x0002 | End with PSP | [This.ID] |
| 3 | 0x0003 | End with USP | [This.ID] |
| 4 | 0x0004 | End with PSP&USP | [This.ID] |
...
| 28 | 0x001C | End with USD | [This.ID] |
| 29 | 0x001D | End with PSP&USD | [This.ID] |
| 30 | 0x001E | End with USP&USD | [This.ID] |
| 31 | 0x001F | End with PSP, USP & USD | [This.ID] |
Is value 1 what is specified in §4.1? Or does it include USD (which
is not explicitly excluded)?
[PC] This has been corrected in revision 20. Can you please recheck
and let me know? E.g. 0x0001 does not include any of the flavors.
[PC] For the remainder of the comments in this section, let's recall
that a single Segment Endpoint Behavior codepoint is bound to a SID at
a segment endpoint node. A node computing a segment list with a
particular SID knows its associated behavior. A segment endpoint node
receiving a packet destined to a locally instantiated SID performs
only the processing associated with the behavior bound to that SID. A
behavior is only bound to a SID, never to a node.
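A minimal Python sketch of this binding model follows. The codepoint-to-flavor mapping is taken from the Table 4 rows quoted in (3a); the local SID table, the example SID value, and the function name are hypothetical.

```python
# Flavor sets for the End codepoints quoted from Table 4.
END_FLAVORS = {
    0x0001: frozenset(),
    0x0002: frozenset({"PSP"}),
    0x0003: frozenset({"USP"}),
    0x0004: frozenset({"PSP", "USP"}),
    0x001C: frozenset({"USD"}),
    0x001D: frozenset({"PSP", "USD"}),
    0x001E: frozenset({"USP", "USD"}),
    0x001F: frozenset({"PSP", "USP", "USD"}),
}

# Hypothetical local SID table: each SID is bound to exactly one codepoint.
LOCAL_SIDS = {"2001:db8:0:1:4::": 0x0004}

def flavors_for(destination: str):
    """Return the flavor set bound to a local SID, or None if not instantiated."""
    codepoint = LOCAL_SIDS.get(destination)
    if codepoint is None:
        return None  # not a local SID; no endpoint behavior applies
    return END_FLAVORS[codepoint]
```

The lookup is keyed by the SID, not by the node, which mirrors the statement that a behavior is only ever bound to a SID.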
(3b) If a behavior with more than one flavor is signaled, how should
the receiving node determine which one to apply? I guess that the
application of behaviors 4 or 29 depends on the number of SLs -- the
expected behavior should be clearly specified.
[PC] The segment endpoint node receiving a packet destined to a SID
with behavior 4 applies only the processing associated with the SID
(i.e. behavior 4).
(3c) Is it assumed that all nodes support all behaviors? Are there
mandatory to implement behaviors? Should the behavior be advertised
before it is used?
[PC] The answer to the first two questions is no. For the third, if a
node computing a segment list does not know of a SID (and its
behavior), it will not be able to 'use' the SID in a segment list.
(3d) §4.16.1.2:
When a SID of PSP-flavor is processed at a non-penultimate SR Segment
Endpoint Node, the PSP behavior is not performed as described in the
pseudocode below since Segments Left would not be zero.
For example, for the End behavior, I'm assuming that behavior 1 is
performed instead of 2 (or 4, or 29, or 31) if SL != 0. Should this
be done even if the node did not advertise the non-PSP flavor?
[PC] If a SID of End behavior (1) is instantiated at a segment
endpoint node, a packet destined to that SID will only ever be
processed with behavior 1.
If the node is not known to support the PSP flavor, should it be an
error to receive a packet requesting that behavior?
[PC] If a node does not support PSP, then it has not instantiated any
SID with a segment endpoint behavior including PSP, and it is not
possible for it to receive a packet destined to a local SID it has not
instantiated.
If only the PSP flavor is advertised, can the Source assume that the
node also supports the non-PSP flavor?
[PC] If a SID with PSP flavor is advertised (i.e. segment endpoint
behavior codepoint 2) by a segment endpoint node, an SR source node can
only expect that the SID has that behavior bound to it.
[BTW, I'm asking about advertisement because §4.16.1.1 makes the
statement that the nodes "advertise the SIDs instantiated on them via
control plane protocols as described in Section 9". Even though §9
says that control plane protocols are "not necessary for an SDN
control plane" because "one expects the controller to explicitly
provision the SIDs".]
(3e) §4.16.2 describes the USP flavor, which is one where the endpoint
consumes the packet by processing the next header. I don't understand
how the outcome due to the extended process is different from the
original one in §4.1. Can you please explain? It seems to me that
the externally observable result is the same.
[PC] We have use-cases where packets with an SRH may be destined to
applications or host implementations running in containers. The USP
flavor is useful to remove the consumed SRH from the extension header
chain before handing the packet over to the application stack (we’ve
seen this with smartNICs). As such, the perspective on external
observability differs, and hence we believe it needs to be specified.
I have the same question about the USD flavor and the externally
observable behavior related to §4.1.
[PC] The USD flavor specifically enables the decapsulation of the
inner IP packet and its further forwarding (consider a use-case like
TI-LFA, where encapsulation is done on the PLR and decapsulation has
to be done on the last node of the repair list). In this case the PLR
node that is crafting the SID list wants to ensure that the last
segment in the repair list is able to perform decapsulation.
In general, the observable behavior of §4.1, USP, and USD seem the
same to me.
The next two points are related.
(3f) §4.16.3 describes the USD flavor, which assumes that the
decapsulation results in a packet that can be forwarded. Can the FIB
lookup result in a local destination?
[PC] Please refer to the previous comment about the use-case. So yes,
we normally expect that the decapsulation results in a packet that is
forwarded out. However, the inner packet may also be destined to a
local address.
(3g) Does the USD flavor mean that, for the End behavior (as described
in §4.1), the action of "process the next header in the packet" cannot
result in a forwarded packet? Same question for the USP behavior?
[PC] Please refer to the previous comments. There is no such
assumption on either the base End behavior or End with USP.
(3h) The last paragraph in §4.16.3:
An implementation that supports the USD flavor in conjunction with
the USP flavor MAY optimize the packet processing by first looking
whether the conditions for the USD flavor are met, in which case it
can proceed with USD processing else do USP processing.
What are the "conditions for the USD flavor"? As far as I can tell
from the document, the only condition is for the specific behavior to
be signaled. What else?
[PC] I've removed this paragraph. This is an implementation
optimization and provides no value to the pseudocode preceding it.
Going back to the questions above... When is the option to optimize
possible? Does a specific behavior have to be used? Behavior 30 (End
with USP&USD)? Or can it also optimize if behavior 3 (End with USP) is
signaled?
(4) §10.2 creates a new registry with an "FCFS" registration
procedure. I am assuming that this is the same as the "First Come
First Served" (no abbreviation!) policy from rfc8126; please add a
reference if that is the case.
[PC] Ack. We will change “FCFS” to “First Come First Served [RFC8126]”.
The description used is not the same as what rfc8126 specifies:
- "Requests for allocation...must include a...preferably also a brief
  description of how the value will be used." Using "preferably"
  indicates that a description is optional. However, it is not
  optional in rfc8126.
- "...brief description...may be provided with a reference to an
  Internet Draft or an RFC or in some other documentation that is
  permanently and readily available." There is no such requirement in
  rfc8126. For example, the "Specification Required" policy requires
  "a permanent and readily available public specification". Is that
  what you want instead?
[PC] Indeed, the current text is wrong. My bad. I've updated the text
with this diff below. Also, I’m not sure whether that paragraph is
really needed. Maybe just putting in the table “First Come First
Served [RFC8126]” is sufficient as RFC8126 already describes what is
written in the text below. If it can be removed please let me know.
<OLD>
Requests for allocation from within the FCFS range must include a
point of contact and preferably also a brief description of how the
value will be used. This information may be provided with a
reference to an Internet Draft or an RFC or in some other
documentation that is permanently and readily available.
</OLD>
<NEW>
Requests for allocation from within the First Come First Served
range must include a point of contact and a brief description of how
the value will be used.
</NEW>
(5) This point is for the IESG to discuss.
§4.16.1.2:
The End, End.X and End.T behaviors with PSP do not contravene
Section 4 of [RFC8200] because the destination address of the
incoming packet is the address of the node executing the behavior.
The spring WG's interpretation of rfc8200 was a central point in the
appeal presented against the WG consensus on this document. The text
above, I believe, reflects that consensus.
However, given that the document relies on the spring WG's
interpretation of rfc8200, I think it would be better if the text is
explicit.
Suggestion: add at the end of the paragraph:
This conclusion represents the consensus interpretation of the
spring WG.
----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------
(1) §3.1:
An SRv6 endpoint behavior MAY require additional information for its
processing (e.g. related to the flow or service). This information
may be encoded in the ARG bits of the SID.
The sentence is simply stating a fact, not a normative behavior.
s/MAY/may
[PC] This is fixed in rev20.
(2) All the examples in §3.2 have a /48 prefix allocated to the SRv6
deployment (and then /64s per node). Is it possible to start with a
different SRv6 infrastructure allocation, a /64, or /96 maybe? If so,
please include an example. If not, please explain any limitations.
[PC] The examples are based on real deployments and as such reflect
the practical aspects of those operators' SRv6 infrastructure
allocation designs. It would be counter-productive and misleading to
provide artificially manufactured examples (and then why just /96 and
not something else?). The document does not pose any limitations.
(3) §4 starts by saying that "Each FIB entry indicates the behavior
associated with a SID instance and its parameters." But the previous
section (§3.3. SID Reachability) says that "node N would advertise
IPv6 prefix(es) matching the LOC parts covering its SIDs or
shorter-mask prefix" (not the behavior). IOW, §3.3 sets the
expectation of an advertisement covering just the LOC, but §4 seems to
expect entries that cover the LOC+FUNC. Which one is correct?
[PC] Section 3.3 talks about the Prefix Reachability advertisement for
the SRv6 Locator. Section 4 talks about FIB entries instantiated on
the SR Segment Endpoint Node that has allocated those local SIDs;
please note that they are local entries. There is no discussion of
their advertisement in Section 4. Can you please check whether the
updated text in Section 4 of rev21 clarifies this?
In the end, it may not matter which entry is in the FIB, as long as
the SID is reachable. However, the specification of the behavior
feels sloppy.
(4) §4.9/§4.10: For the S04 step, perhaps decompose it into individual
actions (similar to S04-S06 in §4.7).
[PC] Fixed in rev21.
(5) §4.11/§4.12 "S05. Learn the exposed MAC Source Address..."
The note related to this step says that in "EVPN, the learning...is
done via the control plane"...but here it is done via the data plane.
What, if any, is the effect on EVPN operation? Are there issues with
learning conflicting information from different sources? It seems to
me that it could be relatively easy to spoof the source and create
unexpected entries in the L2 table. Please point to the EVPN
documents where this type of operation is considered.
[PC] Indeed, text is inaccurate. I've updated the note to the following:
<OLD>
S05. In EVPN, the learning of the exposed MAC Source Address is done
via the control plane.
</OLD>
<NEW>
S03. In EVPN RFC7432, the learning of the exposed MAC Source Address
is done via control plane. In L2VPN VPLS RFC4761 RFC4762 reachability
is obtained by standard learning bridge functions in the data plane.
</NEW>
(6) §4.10/§4.11/§4.12 don't have references to the example
applications mentioned. Please add Informative references.
[PC] Ack. I’ve added the references to RFC7432, RFC4761, RFC4762 and
RFC8317.
(7) §4.13/§4.15 instantiate a Binding SID, but only in the case where
SL != 0.
What about the case where a Binding SID wouldn't require an extra
encapsulation (SL == 0)? Is there a reason that it is not supported
in this document?
[PC] Such a requirement has not yet arisen. If need arises in the
future, a new behavior may be defined by a future document. This
document provides the framework and the extensibility to do so.
(8) §5.1: I'm assuming that the last line in this section (the one
starting with S03) should be preceded by "Note:".
[PC] Fixed.
(9) §5.1: "The H.Encaps behavior is valid for any kind of Layer-3
traffic."
While it may be used for any kind of traffic, I'm assuming that there
will be a policy that determines which traffic is encapsulated using a
specific SRv6 policy, right? Please be specific about that.
[PC] This document does not describe how traffic is steered into an SR
Policy. It may be steered by a route installed by BGP, a static route,
some application-specific selection, etc. The steering of a packet
into an SR Policy is out of scope of this document.
(10) §5.3: "Ethernet [IEEE.802.3_2012]" Please use the reference when
Ethernet is first used in the document. [I have the same question as
Rob related to the version of the 802.3 spec.]
[PC] Fixed in rev20.
(11) §5.3: "...MUST remove the preamble or frame check sequence (FCS)
from the Ethernet frame upon encapsulation and the decapsulating node
MUST regenerate the preamble or FCS before forwarding Ethernet frame."
Which one? The
preamble can be easily recreated by the receiver, while removing the
FCS may be more problematic -- even if the FCS is not checked in
transit, it seems that it would be important to carry it. In any
case, the real question here is: why use "or"? Is it left at the
discretion of the encapsulating node? Are there any considerations
when selecting?
[PC] You are correct. Corrected to the following:
<OLD>
The encapsulating node MUST remove the preamble or frame check
sequence (FCS) from the Ethernet frame upon encapsulation and the
decapsulating node MUST regenerate the preamble or FCS before
forwarding Ethernet frame.
</OLD>
<NEW>
The encapsulating node MUST remove the preamble (if any) and frame
check sequence (FCS) from the Ethernet frame upon encapsulation, and
the decapsulating node MUST regenerate, as required, the preamble and
FCS before forwarding the Ethernet frame.
</NEW>
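The strip-and-regenerate round trip in the corrected text can be sketched in Python. This is an illustration under stated assumptions, not something the draft specifies: the helper names are hypothetical, the preamble is taken to be seven 0x55 octets followed by the 0xD5 start-of-frame delimiter, and the FCS is computed with zlib's CRC-32 and appended least significant byte first.

```python
import struct
import zlib

PREAMBLE_SFD = b"\x55" * 7 + b"\xd5"  # assumed preamble plus start-of-frame delimiter
FCS_LEN = 4

def compute_fcs(frame: bytes) -> bytes:
    """CRC-32 over the frame, packed least significant byte first (assumed)."""
    return struct.pack("<I", zlib.crc32(frame) & 0xFFFFFFFF)

def strip_for_encap(wire: bytes, has_preamble: bool) -> bytes:
    """Encapsulating node: drop the preamble (if any) and the trailing FCS."""
    if has_preamble:
        wire = wire[len(PREAMBLE_SFD):]
    return wire[:-FCS_LEN]

def regenerate(frame: bytes, need_preamble: bool) -> bytes:
    """Decapsulating node: re-add the FCS and, as required, the preamble."""
    out = frame + compute_fcs(frame)
    return PREAMBLE_SFD + out if need_preamble else out
```

Only the frame between the preamble and the FCS travels inside the SRv6 encapsulation; both removed fields are recomputed from it at the far end.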
(12) All the headend behaviors (§5) include this text:
The push of the SRH MAY be omitted when the SRv6 Policy only contains
one segment and there is no need to use any flag, tag or TLV.
If the endpoint behavior indicates the PSP or USP flavors, what should
the receiver do? Clearly there is no SRH to pop. Is this an error or
should the receiver simply ignore the flavor?
[PC] If there is no SRH, then the SRH processing is not executed.
The PSP and USP flavors only make changes in the SRH processing
pseudocode, hence it is not executed.
(13) §6: "counter...for traffic that matched that SID and was
processed correctly" Does "processed correctly" include the case where
the result is an ICMP error message? Or should those be counted
separately?
[PC] Packets that result in an ICMP error message or those that are
dropped are not counted as correctly processed. I've updated the text.
<OLD>
A node supporting this document SHOULD implement a pair of traffic
counter (one for packets and one for bytes) per local SID entry, for
traffic that matched that SID and was processed correctly.
</OLD>
<NEW>
A node supporting this document SHOULD implement a pair of traffic
counters (one for packets and one for bytes) per local SID entry, for
traffic that matched that SID and was processed successfully (i.e.
packets which generate ICMP Error Messages or are dropped are not
counted).
</NEW>
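A minimal Python sketch of such per-SID counters follows. The class name, method name, and outcome strings are hypothetical; the sketch only shows that packets ending in an ICMP error or a drop leave both counters unchanged.

```python
from collections import defaultdict

class SidCounters:
    """Hypothetical per-local-SID packet and byte counters."""

    def __init__(self):
        self.packets = defaultdict(int)
        self.bytes = defaultdict(int)

    def record(self, sid: str, length: int, outcome: str) -> None:
        # Count only traffic that matched the SID and was processed
        # successfully; "icmp-error" and "dropped" are not counted.
        if outcome != "success":
            return
        self.packets[sid] += 1
        self.bytes[sid] += length
```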
(14) §7: I'm guessing that "flow-based hash" and "load-balancing hash"
are the same thing, is that correct? It would be nice to use
consistent terminology.
[PC] Fixed.
(15) §8: A rogue node inside the SR domain may (on purpose) signal the
wrong behavior for a flow, which may result in the delivery of the
traffic to the wrong destination (potentially including destinations
outside the domain), among other things. Note that this action is
possible even if the rogue node is authenticated and authorized to
generate an SRH. I didn't find this threat mentioned in rfc8402/rfc8754.
[PC] The control plane protocol specifics are outside the scope of
this document. I am not able to parse this comment or see what needs
to be addressed in this document.
(16) §9.4: I'm not sure what the purpose of §9 is, as a whole. But
the summary in §9.4 puzzles me more; what is the intent? Does Table 1
indicate that, for example, an IGP implementation should not advertise
the End.B6.Encaps behavior?
Does Table 2 indicate that only BGP-LS should signal the ability to
H.Encaps.L2? I am confused about the value/intent because the text
clearly says that the control plane is outside the scope of this
document.
[PC] The section provides an overview of the role of control plane
routing protocols in the advertisements of the SRv6 Locator and the
SIDs along with their behaviors – all new aspects that have been
introduced in this document. Based on the SRv6 solutions developed
around the behaviors introduced in this document, it indicates what
information is expected to be advertised via which protocol. It does
not describe “how” since that is clearly outside the scope of this
document and part of the individual routing protocol extensions.
(17) [nits]
s/an network operator/a network operator
s/one billionth and one millionth of the assigned address space/one
billionth and one millionth of the available address space
s/packet's header Section 7/packet's header (Section 7)/g
[PC] Those three have been fixed.
s/bundle(LAG)/bundle (LAG)
Please expand LAG.
[PC] "(LAG)" has been removed as per the comments of another AD.
Thank you for your time, Alvaro!