[Pce] Benjamin Kaduk's Discuss on draft-ietf-pce-stateful-pce-lsp-scheduling-19: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker Tue, 07 Jul 2020 15:27:10 -0700

Benjamin Kaduk has entered the following ballot position for
draft-ietf-pce-stateful-pce-lsp-scheduling-19: Discuss


When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-pce-stateful-pce-lsp-scheduling/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

This being a discuss ballot notwithstanding, the protocol mechanisms
here seem pretty well thought-out; I'm just wanting changes to how they
are described.

There seems to be an internal inconsistency in Section 4.3, between
"[t]he PCE SHOULD add the scheduled LSP into its scheduled LSP-DB and
update its scheduled TED" and "[t]he stateful PCE is required to update
its local scheduled LSP-DB and scheduled TED".  (I think the "SHOULD"
one is wrong, personally.)

Let's also take a closer look at the precise interdependency between the
B bit and PD bit -- Section 5.1 implies that the PD bit itself cannot be
set in the absence of the B bit, referring forward to Section 5.2.2, but
Section 5.2.2 seems to only say that you need both the B and PD bits set
in order to send SCHED-PD-LSP-ATTRIBUTE.  Bits being set as a
prerequisite for sending the TLV is a subtly different condition than
having the one bit itself depend on the other, with correspondingly
different error handling.
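
To make the distinction concrete, here is a rough Python sketch of the
two readings.  The function names are made up for illustration and are
not taken from the draft; the sketch only shows that the two conditions
are checked at different points and so fail in different ways.

```python
def open_bits_valid(b_bit: bool, pd_bit: bool) -> bool:
    """Reading 1 (Section 5.1 as written): the PD bit itself depends on B.

    An Open message carrying PD=1 with B=0 is malformed and would be
    rejected during capability exchange.
    """
    return b_bit or not pd_bit


def may_send_sched_pd_tlv(b_bit: bool, pd_bit: bool) -> bool:
    """Reading 2 (Section 5.2.2 as written): any bit combination is accepted
    in the Open; only sending SCHED-PD-LSP-ATTRIBUTE needs both bits set.
    """
    return b_bit and pd_bit
```

With PD=1 and B=0, the first reading fails at session establishment,
while the second accepts the session and only fails later, if and when
the TLV is sent.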

Section 6.6 refers to the "LSP-ERROR-CODE TLV (Section 7.3.3)", which is
not defined in this document; rather, the reference should be to Section
7.3.3 of RFC 8231.


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

I'm pretty sympathetic to Alvaro's not-quite-Discuss, but also am not
quite prepared to elevate it to Discuss level (and incur the
responsibility to determine what changes are necessary to resolve it).
I note several editorial comments and nits below, but those remarks are
not comprehensive.

Abstract

                                                            so as to
   enable Labeled Switched Path (LSP) scheduling for path computation
   and LSP setup/deletion based on the actual network resource usage and
   the duration of a traffic service in a centralized network
   environment as stated in RFC 8413.

Just looking at this text in isolation, it's not entirely clear if the
event that is scheduled is the LSP activation, creation, calculation, or
something else, and whether it is just the computed path that depends on
the resource usage/traffic duration, or if other things like the
frequency of scheduling or the existence of an LSP at all would be
dependent on those.  Presumably the rest of the document will clarify,
but perhaps there is some wordsmithing possible here.

Section 1

   letting other services use it when this service is not using it.  The
   requirement of scheduled LSP provision is mentioned in [RFC8231] and

nit(?): I think s/provision/provisioning/

   [RFC7399].  A solution for providing more efficient network resource
   usage for traffic engineering is desired.  Also, for deterministic

nit: I don't really follow the connection between these two sentences --
"a solution for [...] is desired" seems to be a general re-statement of
the problem, whereas the previous discussion has been discussing,
essentially, the benefits of the proposed solution.

Section 2.1

Hmm, RFC 8231 says that it itself takes the definitions for "Active
Stateful PCE", "Passive Stateful PCE", and several other terms from RFC
8051; we should probably short-circuit the reference chain(s).

   Scheduled TED:  Traffic engineering database with the awareness of
      scheduled resources for TE.  This database is generated by the PCE
      from the information in TED and scheduled LSP-DB and allows
      knowing, at any time, the amount of available resources (does not
      include failures in the future).

I'd consider "the expected amount of available resources (discounting
the possibility of failures in the future)".

Section 4.1

   The LSP scheduling allows PCEs and PCCs to provide scheduled LSP for
   customers' traffic services at its actual usage time, so as to
   improve the network resource efficient utilization.

nit: s/resource efficient utilization/resource utilization efficiency/

   In case of implementing PCC-initiated scheduled LSPs, before a PCC
   delegates a scheduled LSP, it MAY use the PCReq/PCRep messages to

Just to check: there is some risk that the computed path might change
between this query and when the LSP actually becomes active?

   learn the path for the scheduled LSP.  A PCC MUST delegate a
   scheduled LSP with information of its scheduling parameters,
   including the starting time and the duration using PCRpt message.

I suggest "When delegating a scheduled LSP, a PCC MUST include its
scheduling parameters, including [...]", to be clear about what cases
the "MUST" applies to.  (It might also be worth saying what a PCE should
do if it receives a delegation request for a scheduled LSP that does not
include the requisite parameters.)

   For a multiple PCE environment, a PCE shall synchronize to other PCEs
   within the network, so as to keep their scheduling information
   synchronized.  There are many ways that this could be achieved: one
   such mechanism is described in [I-D.litkowski-pce-state-sync].  Which
   way is used to achieve this is out of scope for this document.  [...]

I'd suggest restructuring how this paragraph is laid out (akin to
Alvaro's comment).  Specifically, it's an intrinsic fact that if you're
in a multi-PCE environment, you have to have inter-PCE synchronization
or the state skew causes problems.  That's not new with this document;
what we are most interested in saying is that, in addition to the
existing need to synchronize the TED and LSP-DB, when scheduled LSPs are
in use you also have to synchronize the SLSP-DB and have each PCE
reconstruct the Scheduled TED (or synchronize the Scheduled TED as
well).  The ways to perform such synchronization are hardly worth
mentioning, except to the extent that existing mechanisms cannot handle
sending the extra information.

   The scheduled LSP can also be initiated by PCE itself.  In case of

nit: missing article (perhaps "by a PCE itself").

   scheduled LSP based on the local policy.  For the former SCHED-LSP-
   ATTRIBUTE TLV (see Section 5.2.1) MUST be included in the message

I suggest s/For the former/In the former case, the/

   where as for the latter SCHED-LSP-ATTRIBUTE TLV SHOULD NOT be

nits: s/where as/whereas/, s/SCHED-LSP-ATTRIBUTE/the SCHED-LSP-ATTRIBUTE/

   included.  Either way the synchronization to other PCEs should be
   done when the scheduled LSP is created.

I recognize that the BCP 14 keywords are not being used, but earlier we
said "shall synchronize" but here it's just "synchronization should be
done"; it's probably worth making these consistent.

   In both modes, for activation of scheduled LSPs, the PCC could
   initiate the setup of scheduled LSP at the start time by itself or
   wait for the PCE to update the PCC to initiate the setup of LSP.

I'm worried about the "could initiate [...] or wait".  While it's true
that either party could take the initiative, doesn't there need to be an
agreement between them about which one it will be, to avoid the risk of
the LSP not actually getting instantiated at the start time?

   Similarly on the scheduling usage expires, the PCC could initiate the

nit: s/expires/expiry/ or s/expires/expiration/

(Same comment about "could" as above.)

Section 4.2.2

   When an LSP is configured with a scheduling interval such as "[Ta,
   Tb] repeats 10 times with a repeat cycle a week" (representing 11
   scheduling intervals), a path satisfying the constraints for the LSP
   in every interval represented by the periodical scheduling interval
   is computed once.  And then the LSP along the path is set up to carry
   traffic in each of the scheduling intervals.  If there is no path
   satisfying the constraints for some of the intervals, the LSP will
   not be set up at all.

This seems to say that the same path must be used for each recurrence of
the scheduled event, precluding some optimizations that might be desired
in the face of other (unscheduled or differently scheduled) load.  Is
that intended?
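
For reference, my reading of the quoted text is sketched here in Python.
The names are hypothetical, and path_ok is a stand-in for whatever
constraint check the PCE actually performs against the scheduled TED.

```python
from typing import Callable, List, Tuple

WEEK = 7 * 24 * 3600  # repeat cycle of one week, in seconds


def expand_intervals(ta: int, tb: int, repeats: int, cycle: int) -> List[Tuple[int, int]]:
    """Expand "[ta, tb] repeats N times" into N + 1 concrete intervals."""
    return [(ta + k * cycle, tb + k * cycle) for k in range(repeats + 1)]


def path_usable(path_ok: Callable[[int, int], bool],
                intervals: List[Tuple[int, int]]) -> bool:
    """All-or-nothing: one path must satisfy the constraints in every interval."""
    return all(path_ok(start, end) for start, end in intervals)
```

Here expand_intervals(ta, tb, 10, WEEK) yields the 11 intervals from the
example, and a single interval for which path_ok fails is enough to make
path_usable return False for that path.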

Section 4.2.2.1

   When an LSP is configured with elastic time interval "[Ta, Tb] within
   -P and Q", a path is computed such that the path satisfies the
   constraints for the LSP in the time period from (Ta+Xv) to (Tb+Xv)
   and |Xv| is the minimum value for Xv from -P to Q.  That is, [Ta+Xv,

To check my understanding, this mention of |Xv| is indicating that the
PCE attempts to limit the deviation from the requested interval, using
an absolute value metric to indicate distance from the requested value?
It might be worth putting in a few more words to indicate that this
optimization is being performed; just "is the minimum value" could be
confusing.
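
If my understanding is right, the selection could be sketched as follows
(Python; the function name, the one-second step, and the tie-breaking
rule are all assumptions of mine, and feasible stands in for the PCE's
path computation over [Ta+Xv, Tb+Xv]).

```python
from typing import Callable, Optional


def pick_shift(feasible: Callable[[int], bool], p: int, q: int) -> Optional[int]:
    """Return the shift Xv in [-p, q] with the smallest |Xv| that is feasible.

    Candidates are tried in order of increasing |Xv|; on a tie the
    earlier (negative) shift is tried first.  Returns None if no shift
    in the elastic range works.
    """
    for xv in sorted(range(-p, q + 1), key=abs):
        if feasible(xv):
            return xv
    return None
```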

Section 4.2.2.2

   During grace periods from (Ta-GB) to Ta and from Tb to (Tb+GA), the
   LSP is up to carry traffic (maybe in best effort).

This point seems pretty key to having grace periods at all.  In
particular, if there is no difference between the traffic-handling
properties for the grace period and the "main interval", then the grace
period is more simply handled by the entity requesting the interval
(i.e., "just ask for a larger interval").  The fact that we propose to
give different traffic-handling behavior during the grace period should
be emphasized, in order to justify the existence of the protocol
element.  In the absence of such justifying text, I would propose to
remove the grace-period feature as needless complexity.
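
A small Python sketch of the distinction I have in mind, with made-up
names and labels: the only reason for phase() to return anything other
than "scheduled" or "down" is that the grace periods get different
treatment (e.g., best effort).

```python
def phase(t: int, ta: int, tb: int, gb: int, ga: int) -> str:
    """Classify time t against the interval [ta, tb] with grace periods gb/ga."""
    if ta <= t <= tb:
        return "scheduled"      # main interval: constraints are guaranteed
    if ta - gb <= t < ta:
        return "grace-before"   # LSP is up, possibly best effort only
    if tb < t <= tb + ga:
        return "grace-after"    # LSP is up, possibly best effort only
    return "down"
```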

Section 4.3

   For PCE-Initiated Scheduled LSP, the stateful PCE can compute a path
   for the scheduled LSP per requests from network management systems
   automatically based on the network resource availability in the
   scheduled TED, send a PCInitiate message with the path information

nit: s/, send/ and send/

   back to the PCC.  Based on the local policy, the PCInitiate message
   could be sent immediately to ask PCC to create a scheduled LSP (as

nit: s/ask PCC/ask the PCC/

   o  Based on the configuration (and the C flag in scheduled TLVs),
      when it is time (i.e., at the start time) for the LSP to be set
      up, either the PCC triggers the LSP to be signaled or the
      delegated PCE sends a PCUpd message to the head end LSR providing
      the updated path to be signaled (with A flag set to indicate LSP
      activation).

We haven't discussed the C flag yet, so a reader is left wondering "how
do I know whether the PCC or PCE is going to take initiative?".  We
could reword, perhaps like "When it is time for the LSP to be set up
(i.e., at the start time), based on the value of the C flag for the
scheduled TLV, either the PCC [...]".  Similar changes would be
applicable in later sections as well.

Section 4.4

Are there any special considerations for modifying a periodic scheduled
LSP after some recurrences have already happened?  What about for
modifying any scheduled LSP that is currently active (whether before the
change, after the change, or both)?

Section 5.1

   After a PCEP session has been established, a PCC and a PCE indicates
   its ability to support LSP scheduling during the PCEP session
   establishment phase.  For a multiple-PCE environment, the PCEs should
   also establish PCEP session and indicate its ability to support LSP
   scheduling among PCEP peers.  The Open Object in the Open message

Does a PCE need to refrain from advertising scheduling support to PCCs
if its PCE peers do not all support scheduling?

   scheduling among PCEP peers.  The Open Object in the Open message
   contains the STATEFUL-PCE-CAPABILITY TLV defined in [RFC8231].  Note
   that the STATEFUL-PCE-CAPABILITY TLV is defined in [RFC8231] and
   updated in [RFC8281] and [RFC8232]".  In this document, we define a
   new flag bit B (SCHED-LSP-CAPABLITY) flag for the STATEFUL-PCE-
   CAPABILITY TLV to indicate the support of LSP scheduling and another
   flag bit PD (PD-LSP-CAPABLITY) to indicate the support of LSP
   periodical scheduling.

I note that (e.g.) RFC 8623 does not seem to give mnemonic names for the
individual bits, so our "bit B" and "bit PD" seem a bit out of place.

Section 5.2

   Only one of these TLV SHOULD be present in the LSP object.  In case
   more than one scheduling TLV is found, the first instance is
   processed and others ignored.

It seems that this wording "more than one scheduling TLV" might apply to
some hypothetical future TLV type for a different variation of scheduled
LSPs.  If that would be undesirable, we could reword to mention the two
TLV types by name.
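
Naming the two types explicitly would amount to something like this
Python sketch.  Representing TLVs as (name, value) pairs is purely for
illustration; a real implementation would match on the assigned type
codes.

```python
from typing import Any, List, Optional, Tuple

# The two scheduling TLVs defined by this document, listed by name.
SCHEDULING_TLVS = ("SCHED-LSP-ATTRIBUTE", "SCHED-PD-LSP-ATTRIBUTE")


def select_scheduling_tlv(tlvs: List[Tuple[str, Any]]) -> Optional[Tuple[str, Any]]:
    """Return the first scheduling TLV in wire order; later ones are ignored."""
    for name, value in tlvs:
        if name in SCHEDULING_TLVS:
            return (name, value)
    return None
```

A TLV of some future type is then simply not considered by this rule,
rather than being silently dropped as "another scheduling TLV".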

Section 5.2.1

Please note that this formulation (number of seconds since a fixed time)
is invariant to leap seconds, but that conversions from current UTC time
to it might need to account for leap seconds.  (Or if you want to ignore
leap seconds, say that.)

      C (1 bit):  Set to 1 to indicate the PCC is responsible to setup
         and remove the scheduled LSP based on the Start-Time and
         duration.

I suggest noting that the PCE holds these responsibilities when the bit
is set to zero.

   Start-Time (32 bits):  This value in seconds, indicates when the
      scheduled LSP is used to carry traffic and the corresponding LSP
      must be setup and activated.  Value of 0 MUST NOT be used in
      Start-Time.  Note that the transmission delay SHOULD be considered
      when R=1 and the value of Start-Time is small.

I don't understand why start-time of 0 is disallowed (for at least the
R=0 case) -- that would disallow requesting a start time that happens to
land on the time when the time counter wraps around, for no reason.
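
To illustrate, here is a Python sketch of how I imagine the field being
filled in.  It assumes the absolute value is simply reduced modulo 2^32
when the counter wraps, which the quoted text does not actually say; the
function name is mine.

```python
FIELD_MODULUS = 2 ** 32  # Start-Time is a 32-bit field


def encode_start_time(start: int, now: int, relative: bool) -> int:
    """Return the 32-bit Start-Time value.

    start and now are seconds since the epoch.  With relative=True (R=1)
    the field carries the offset from now; otherwise (R=0) it carries the
    absolute time, wrapped to 32 bits (an assumption, see above).
    """
    value = (start - now) if relative else start
    if value < 0:
        raise ValueError("start time is in the past")
    return value % FIELD_MODULUS
```

Under that assumption, an absolute start time of exactly 2^32 seconds
encodes to 0, as does a relative start time of "now"; both are
legitimate requests that the "MUST NOT" rules out.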

   The Start-Time indicates a time at or before which the scheduled LSP
   must be set up.  The value of the Start-Time represents the number of
   seconds since the epoch when R bit is set to 0.  When R bit is set to
   1, it represents the number of seconds from the current time.

   In addition, it contains an non zero grace-before and grace-after if

I suggest s/it/the SCHED-LSP-ATTRIBUTE TLV/; it's easy to misread the
"it" as referring to the "Start-Time" from the previous paragraph.

   grace periods are configured.  It includes an non zero elastic range

Are the Grace-Before/Grace-After fields set to zero when grace periods
are not configured?

   lower bound and upper bound if there is an elastic range configured.

(Likewise for elastic-range.)

Section 5.2.2

   Opt: (4 bits)  Indicates options to repeat.  A new registry "Opt"
      under SCHED-PD-LSP-ATTRIBUTE is created.  When a PCE receives a
      TLV with a Opt value not defined, it does not compute any path for
      the LSP.  It generates a PCEP Error (PCErr) with a PCEP-ERROR
      object having Error-type = 4 (Not supported object) and Error-
      value = 4 (Unsupported parameter).

Have we thought about what kind of negotiation might be needed in the
case where a new Opt value is defined?  Though the possibility currently
seems unlikely, this error message does not seem sufficient to indicate
which Opt value is problematic.

   NR: (12 bits)  The number of repeats.  In each of repeats, LSP
      carries traffic.

Maybe say that NR==0 is equivalent to using SCHED-LSP-ATTRIBUTE (to
avoid questions of 0- vs. 1-indexing)?

Section 6.x

We mention in several places "the scheduled TLVs" for the LSP object,
but this seems misleading, since at most one scheduled TLV should be
present in a given object. Perhaps "a scheduled TLV" would be better?

Section 6.2

Perhaps it's worth noting that in the PCE-initiated case there is the
option to avoid using the scheduled LSP TLVs (and, to some extent, PCUpd
at all), since the PCE can just not tell the PCC about the scheduled
path until its start-time occurs.

Section 6.4

   request the path computation based on scheduled TED and LSP-DB.  A
   PCC MAY use PCReq message to obtain the scheduled path before
   delegating the LSP.

[if my previous comment about "subject to change" results in text
changes, similar changes would apply here]

Section 6.5

Just to check: the scheduled TLV should still be included in the
response even for a negative response?
(Also, same comment about "obtain the scheduled path before
delegating".)

Section 8

Since we deal with scheduled events, we should remind implementations to
do something reasonable when their current time jumps.  Jumps can be
forward or backward, and might cross boundaries for when LSPs should be
(in)active.  The presence of a significant time correction may be
indicative of other (configuration) issues, and falling back to a
conservative stance (keep LSPs active?) might be appropriate.
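
As a sketch of what "something reasonable" could start from (Python,
hypothetical names): sample both a wall clock and a monotonic clock,
e.g. time.time() and time.monotonic(), and compare how far each moved
between two samples.

```python
from typing import List


def clock_jump(prev_wall: float, prev_mono: float, wall: float, mono: float) -> float:
    """Seconds the wall clock moved beyond the elapsed monotonic time.

    Positive means a forward jump, negative a backward jump.
    """
    return (wall - prev_wall) - (mono - prev_mono)


def crossed_boundaries(prev_wall: float, wall: float, boundaries: List[float]) -> List[float]:
    """Start/end times that lie between the old and new wall-clock values."""
    lo, hi = sorted((prev_wall, wall))
    return [b for b in boundaries if lo < b <= hi]
```

What to do once a large jump is detected, or when crossed_boundaries is
non-empty, is the policy question the document should address.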

Similarly, some discussion of how things break when there is clock skew
between PCC and PCE might be useful (we already have a requirement for
clock synchronization in discussion of the R flag).

   on the network.  Thus, such deployment should employ suitable PCEP
   security mechanisms like TCP Authentication Option (TCP-AO) [RFC5925]
   or [RFC8253].  The procedure based on Transport Layer Security (TLS)
   in [RFC8253] is considered a security enhancement and thus is much
   better suited for the sensitive information.  PCCs may also need to

nit: TCP-AO would be considered a "security enhancement" as well
(compared to a baseline of unprotected TCP).  Perhaps the intent is to
say that the TLS procedure from RFC 8253 additionally provides
confidentiality protection to the conveyed data?

nit: "such deployments" plural.

Section 9.1

   When configuring the parameters about time, a user SHOULD consider
   leap-years and leap-seconds.

I know I mentioned leap seconds earlier as well, but this feels like a
cop-out.  We can tell the reader in much more detail how leap years and
seconds will affect their calculations, which in aggregate will be much
more efficient than making each reader think it through for themself.

Section 9.2

nit(?) "view the capability" (singular) sounds like it's just seeing
whether the scheduled LSP functionality is enabled or not, a boolean
value.  If the intent is to say that the specific (e.g., per-tunnel)
state should be visible, then this should be reworded accordingly.

Section 9.4

Is there something to say about checking that LSPs are
activated/disabled at the appropriate times for scheduled and periodic
events?

Section 9.5

Are there any requirements on PCE-to-PCE synchronization protocols that
now need to carry the SLSP-DB?

Section 10.1.1

Is there anything to say about why the two reserved values are reserved?



