Re: [Pce] I-D Action: draft-koldychev-pce-operational-05.txt

Mike Koldychev (mkoldych) Mon, 04 Jul 2022 11:04:10 -0700

Hi Adrian,

Thanks for the useful review comments. I will address most of them in the next 
version. See my replies inline with [MK].

Thanks,
Mike.

-----Original Message-----
From: Adrian Farrel <[email protected]> 
Sent: Sunday, February 20, 2022 3:34 PM
To: [email protected]
Cc: [email protected]
Subject: RE: I-D Action: draft-koldychev-pce-operational-05.txt

Hi authors,

I really appreciate the work done through interop to better understand protocol 
specs and revise the protocol. I hope that you are not all talking yourselves 
into an interop mode that changes the specs because that seems to interoperate, 
rather than fixing implementations to conform to the current specs (which were 
thought about quite a bit ;-)

In the end, of course, we should document what is implemented and interoperates 
(provided it works!).

I did a very light skim of the document and I found a number of issues that 
concern me.

Best,
Adrian

---

In section 3 you have:
   We introduce the concept of the LSP-DB, as a database of actual LSP
   state in the network

I don't think you do 😊

You might start with RFC 7399 and RFC 8051.

Possibly you are introducing the concept of the PCC LSP-DB.

[MK] Sure, will change the wording. [/MK]

---

In section 3 you say

   The structure and format
   of the LSP-DB MUST be common among all dataplane types (i.e., RSVP-
   TE/SR-TE/SRv6), all instantiation methods (i.e., PCC-initiated/PCE-
   initiated), all destination types (i.e., point-to-point/point-to-
   multipoint).

This should be self-evidently wrong. The LSP-DB is internal to the PCE 
implementation, so while it is true that the PCE needs to be able to derive 
certain information from the LSP-DB, how it stores that information is 
completely its own business. Now, you may want an abstract representation in 
order to define your state machines, and that is fine, but please don't tell 
implementations how to implement.

[MK] Was simply saying that whatever mechanisms we define for manipulating the 
LSP database must be common among all dataplane types. I can just remove this 
paragraph as it's sort of obvious anyway.[/MK]

---

In 3.2 you have

   The PCC adds/removes entries to/from its LSP-DB based on what LSPs it
   creates/destroys in the network.  There can be many trigger types for
   updating the PCC LSP-DB, some examples include PCUpd messages, local
   computation on the PCC, local configuration on the PCC, etc.  The
   trigger type does not affect the content of the PCC LSP-DB, i.e., the
   content of the PCC LSP-DB is updated identically regardless of the
   trigger type.

But surely a PCUpd message does not immediately affect the state of the LSP in 
the network. Depending on the signaling process and the processing at the PCC, 
that may take some time. So, you need to be careful when you include PCUpd as a 
trigger and then say that all trigger types are to be treated equally.

[MK] Wasn't saying anything about "immediately" here? I was saying "identically 
regardless of the trigger type". But let me try to condense/rework a lot of the 
wording since it may be confusing.[/MK]

Later (in 3.2) you say:
   The LSP-DB on both the PCC and the PCE only stores the actual state
   in the network, it does not store the desired state.
Which seems to say that PCUpd is not a trigger.

In fact, the PCE and PCC need to store both the "currently believed current 
state" and the "state that is being attempted". How this state is stored is a 
very moot point because the LSP-DB is a logical concept. [In previous work] it 
expresses the information stored by the PCE about the active LSPs, but there is 
no such limit placed on what information about the active LSPs may be stored. 
Why not store the shoe size of the network operator's mother, if the 
implementation chooses to do so?

I suspect that all PCCs have always stored the current/desired states (plural) 
because that is how head end devices work. That is why there was never any 
mention of PCC LSP-DBs in the previous literature: they were implicit in "this 
is what you do when you build a router." The LSP-DB was only introduced for the 
stateful work because of the (new) desire to know what the PCC already knew and 
to synchronise PCC-PCE and PCE-PCE.

It seems to me that you are trying to describe how to implement a PCE and a 
PCC: something that may be a fun task, but which surely belongs outside the 
IETF.

[MK] There is nothing preventing an implementation from storing other 
information (desired state, shoe size, etc.) in a parallel-but-separate 
database indexed by the same key as the LSP-DB. That's all implementation 
details that we don't want to talk about in the draft, so we keep them out of 
our logical database. This allows us to make statements like "PCE LSP-DB is 
only modified by PCRpt messages" which sort of clarify where the data comes 
from. [/MK]
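
A minimal sketch of the split described in the reply above, assuming a PCE
that keeps the logical LSP-DB and a separate, implementation-private store
under the same key. The class name, the `(pcc_id, plsp_id)` key, and the
dict-shaped state are all hypothetical and chosen only for illustration; they
are not taken from the draft.

```python
from dataclasses import dataclass, field


@dataclass
class PceState:
    # Logical LSP-DB: reported (actual) state only, keyed by a hypothetical
    # (pcc_id, plsp_id) tuple.
    lsp_db: dict = field(default_factory=dict)
    # Implementation-private store sharing the same key; not part of the LSP-DB.
    desired: dict = field(default_factory=dict)

    def on_pcrpt(self, pcc_id, plsp_id, reported_state):
        """The only path that writes to the logical LSP-DB."""
        self.lsp_db[(pcc_id, plsp_id)] = dict(reported_state)

    def on_pcupd_sent(self, pcc_id, plsp_id, requested_state):
        """Record what was asked for without touching the LSP-DB."""
        self.desired[(pcc_id, plsp_id)] = dict(requested_state)

    def pending(self, pcc_id, plsp_id):
        """True while the last request has not yet been reported back."""
        key = (pcc_id, plsp_id)
        return key in self.desired and self.desired[key] != self.lsp_db.get(key)
```

With this shape, a statement such as "PCE LSP-DB is only modified by PCRpt
messages" holds by construction, while the desired state remains available to
the implementation.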

---

3.2

   Whenever a PCC modifies an entry in its PCC LSP-DB, it MUST send a
   PCRpt message to the PCE (or multiple PCEs), to synchronize this
   change.

This implies that this update must be sent "at once". Why? The network may have 
taken some time to reach this state - why, then, must the update be instant?

Indeed, you may want to smooth flaps in the PCC, and also avoid swamping the 
PCE.

[MK] Not sure why you think this implies "at once", but I shortened a lot of 
this text, to make it more to the point. [/MK]
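
The flap-smoothing concern raised above can be sketched as a PCC-side reporter
that holds a change until the LSP has been stable for a short interval. This
is only an illustration under stated assumptions: the class, the `hold_s`
parameter, and the `send` callback are hypothetical, and time is passed in
explicitly so the example needs no real timers.

```python
class CoalescingReporter:
    """Hold back PCRpt for an LSP until its state has been stable for hold_s seconds."""

    def __init__(self, hold_s, send):
        self.hold_s = hold_s
        self.send = send   # hypothetical callback: send(plsp_id, state)
        self.dirty = {}    # plsp_id -> (latest_state, time_of_last_change)

    def on_change(self, plsp_id, state, now):
        # A newer change overwrites the pending one and restarts the hold timer.
        self.dirty[plsp_id] = (state, now)

    def tick(self, now):
        # Report every LSP whose latest change is at least hold_s old.
        for plsp_id, (state, changed_at) in list(self.dirty.items()):
            if now - changed_at >= self.hold_s:
                self.send(plsp_id, state)
                del self.dirty[plsp_id]
```

The PCE still learns the final state of every change, just not each
intermediate flap. An LSP that never stops flapping would never be reported
with this simple design, so a real implementation would likely also cap the
total delay.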

---

3.2

   Ensuring this synchronization is always in place allows one
   to define behavior as a function of LSP-DB state, instead of defining
   behavior as a function of what PCEP messages were sent or received.

Cough?
The message sets the state (you have just said), so the state is exactly a 
function of what messages were sent or received.

[MK] Sure, I can remove this paragraph, it doesn't really serve any purpose. 
[/MK]

---

3.2

   When the PCC acts on message, it would update its own PCC
   LSP DB and immediately send PCRpt to the PCE to synchronize the
   change.

As before:
- why immediately?
- and why not wait until the change is present in the network?

[MK] Ack, removed the confusing part. [/MK]

---

Section 1 has

   The current document serves to optimize the original procedure in
   [RFC8231] to drop the PCReq and PCReply exchange, which greatly
   simplifies implementation and optimizes the protocol.

But 3.3 says

   In this document, we would like to make
   it clear that sending PCReq is optional.

So, I think Section 1 should s/drop/optionally drop/

[MK] Yes, it is of course optionally drop. [/MK]

---

3.3 has

   In reality, stateless bringup introduces
   overhead and is not possible to enforce from the PCE, because the
   stateless PCE is not supposed to keep any per-LSP state about
   previous PCReq messages.

I think "not supposed to" is a bit strong. What stops a PCE keeping track of 
everything? It is an implementation and it can do what it wants.
Indeed, the concept of "sticky resources" was introduced a long time ago (and 
so described rather late in the day in RFC 7399) to help with the need to 
understand which network resources might be in the process of being provisioned 
but have not been updated in the TED yet. 

Perhaps s/not supposed/not required/

[MK] Yes, "not required" is a better fit. [/MK]

---

3.3

   It was found that many vendors choose to
   ignore this requirement and send the PCRpt directly, without going
   through PCReq.

This is the main point, and it is good to flag it up. 
But, be clear, this is a violation of 8231 (which uses a "MUST NOT" to cover 
this case). 
So this document is not "clarifying", it is "updating", and you need to make 
this very clear in the document.

[MK] Yes, thanks. [/MK]

---

3.3

   Therefore, PCReq messages
   are useful for many OAM ping/traceroute applications where the PCC
   wishes to probe the network without having any effect on the existing
   LSPs.

I don't think the PCC "probes the network" in this case. I think the PCC asks 
the PCE questions. 

[MK] Maybe "probe the topology"? [/MK]

---

3.3

   The PCC MAY delegate an empty LSP to the PCE and then wait for the
   PCE to send PCUpd, without sending PCReq.  We shall refer to this
   process as "stateful bringup".  The PCE MUST support the original
   stateless bringup, for backward compatibility purposes.  Supporting
   stateful bringup should not require introducing any new behavior on
   the PCE, because as mentioned earlier, the PCE MUST NOT modify LSP-DB
   state based on PCReq messages.  So whether the PCE has received a
   PCReq or not, it MUST process the PCRpt all the same.

OK. This is your description of new behaviour. You need to do two things:
1. In this section, turn this into an abstract description of the function.
2. Move the new BCP 14 description into a new section and flag it as "new 
behaviour that updates 8231".

[MK] Yes, thanks. [/MK]

Missing from this description is how the PCE decides to:
- do a path computation
- set this LSP's oper status to up
- send a PCUpd

[MK] I think these are all "same as before", no? [/MK]

I *think* that you are using the "empty" ERO as the trigger. But I would note 
that, per RFC 3209, the ERO is optional. I don't know that optional and 
present-but-empty are easy to distinguish. Perhaps that doesn't matter because 
after delegation the PCE will determine what path it wants and will supply an 
ERO. 

[MK] I will clarify "no ERO/empty ERO". [/MK]
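
One possible reading of the "no ERO/empty ERO" trigger, sketched as a PCE-side
check. This assumes that an absent ERO and a present-but-empty ERO are treated
alike and that delegation is required; the function name and the dict layout
of the report (`d_flag`, `ero`) are hypothetical, not taken from the draft.

```python
def needs_path_computation(report):
    """Return True if a PCRpt (modelled as a dict) looks like a stateful-bringup request.

    Assumption: an absent ERO and a present-but-empty ERO are treated alike.
    """
    delegated = report.get("d_flag", False)
    ero = report.get("ero")  # None when the object is absent
    # None and [] are both falsy, so both count as "no path yet".
    return bool(delegated and not ero)
```

Under this reading the difficulty of telling "optional and absent" from
"present but empty" does not matter, because both cases lead to the same
action.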

I would question why the PCC doesn't send "oper up" with a null ERO to be the 
trigger. This would be more consistent with the PCC's desire to have the LSP 
transition to active.

[MK] I believe the oper state is not "desired oper state", right? [/MK]

---

In 3.4 and 3.5 I am confused about delegation since you don't show the D-flag. 
This is crucial because if LSP 2 has been delegated, it is not the PCC's job to 
perform MBB, but if LSP 2 has not been delegated, why would it be reported to 
the PCE?

[MK] The D-flag plays no role here at all. The process is the same for 
non-delegated LSPs. I.e., when PCE is passive-stateful and simply logs the 
state of the LSPs being reported from the PCCs. It will still get information 
about MBB temporary LSPs even though it's not "driving" those LSPs. [/MK]

---

I'm unclear whether 4.1 etc. is defining any new behaviour (notwithstanding the 
"new" logical ASSO DB). Isn't it the case that when the state of an 
already-delegated LSP changes, the PCC must update the PCE? And surely the 
association is part of the state.

[MK] I don't believe there's new behavior here, just illustrating existing 
behavior. [/MK]

You are using BCP 14 language in an example which is unhelpful.

If you are clarifying existing procedures then:
- you must reference them
- you must not use BCP 14 language
If you are defining new procedures or fixing existing procedures then:
- you must reference them
- you must update the relevant RFCs
- you must make it clear:
    - what the abstract behaviour is
    - what the new/changed procedures are

[MK] I can remove the BCP 14 language, thanks. [/MK]

---

I think 4.2 is just an example for clarification (since everything works as I 
would expect it to). That's fine, but given the confusion in the document about 
what is new/changed and what is clarification, it would be helpful to scope this 
section as "an example for clarification, and not normative."

[MK] Sure, added. [/MK]

---

Section 5 is a big change to RFC 8697 section 6.3.1 that has:

   When an LSP is first reported to the PCE, the PCRpt message MUST
   include all the association groups that it belongs to.  Any
   subsequent PCRpt message SHOULD include only the associations that
   are being modified or removed.

Thus, you are saying that an existing implementation of 8697 will accidentally 
remove an LSP from all of its associations if it doesn't list them in a PCRpt.

So...
- You need to reference 8697
- You need to explicitly update 8697
- You need to use BCP 14 language

[MK] Section 5 actually doesn't apply to RFC 8697, because the ASSOCIATION 
object has an explicit R-flag. Section 5 is specifically for "any PCEP object 
that does not have an explicit removal flag". This is stuff like LSPA, METRIC, 
BW, etc. I believe there were implementations that would only send an object on 
the first PCRpt, but would not send it on subsequent reports assuming that "PCE 
remembers it". I don't think that this is actually updating existing RFCs. 
Maybe it just wasn't obvious or misinterpreted. [/MK]
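
The distinction drawn in the reply above can be sketched as follows: objects
without a removal flag are replaced wholesale by each PCRpt, while
associations are updated incrementally through the R-flag. This is a sketch of
that reading only; the function name and the dict layout (`objects`,
`associations`) are hypothetical.

```python
def apply_pcrpt(entry, report):
    """Update one LSP-DB entry (a dict) from a PCRpt (also a dict); returns the new entry."""
    new = {"associations": set(entry.get("associations", ()))}

    # Objects with no removal flag (e.g. LSPA, METRIC, BW in the thread):
    # whatever the report carries replaces the old set, so an object that is
    # missing from the report is dropped from the entry.
    for name, value in report.get("objects", {}).items():
        new[name] = value

    # ASSOCIATION carries an explicit R-flag, so it can be updated incrementally.
    for assoc_id, r_flag in report.get("associations", []):
        if r_flag:
            new["associations"].discard(assoc_id)
        else:
            new["associations"].add(assoc_id)
    return new
```

For example, a second report that carries only BW drops a previously reported
METRIC but leaves the association membership untouched.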

...and this leads to the big one for the whole document...

---

...you need to describe how to interoperate with "legacy" implementations that 
adhere to the current specifications.

---

Section 6

Again, I'm not clear when you are restating and when you are changing specified 
behaviour.

But...

   A PCE SHOULD interpret the RRO/SR-
   RRO/SRv6-RRO as the actual path the LSP is taking but MAY interpret
   only the ERO/SR-ERO/SRv6-ERO as the actual path.  

No. If the RRO is present, it would be wrong to ever interpret the ERO as the 
actual path taken. That would lead to the LSP-DB being incorrect with the 
result that subsequent computations would potentially be in error. In fact, the 
LSP-DB needs to hold both the intended and actual paths to allow for more 
complex cases.

You can either reduce this to only cover SR, or you fix it for all paths.

[MK] Yes, it's only for SR-TE. [/MK]

Furthermore...

   In the absence of
   an RRO/SR-RRO/SRv6-RRO a PCE SHOULD interpret the ERO/SR-ERO/SRv6-ERO
   (respectively) as the actual path for the LSP.

Why "SHOULD" and not "MUST"? What other recourse does the PCE have? Should it 
update the LSP-DB to say "I've no idea, but there is an LSP somewhere in the 
network"?

[MK] I'm thinking that it might be possible for SR-TE to do signaling similar 
to RSVP-TE to reserve state along the path, so I don't want to make it too 
restrictive. I will add some wording to say this. [/MK]
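
A minimal sketch of the selection rule under discussion, following the
stricter position that a present RRO always wins and the ERO is used only as a
fallback. The function name and the dict layout are hypothetical, and treating
an empty RRO the same as an absent one is an assumption made here.

```python
def actual_path(report):
    """Pick the path to record as 'actual' for an LSP from a PCRpt modelled as a dict."""
    rro = report.get("rro")
    if rro:  # recorded route present and non-empty: use it
        return rro
    # Otherwise fall back to the explicit route, or an empty path if neither exists.
    return report.get("ero", [])
```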

---

-----Original Message-----
From: I-D-Announce <[email protected]> On Behalf Of 
[email protected]
Sent: 19 February 2022 19:22
To: [email protected]
Subject: I-D Action: draft-koldychev-pce-operational-05.txt

A New Internet-Draft is available from the on-line Internet-Drafts directories.

        Title           : PCEP Operational Clarification
        Authors         : Mike Koldychev
                          Siva Sivabalan
                          Shuping Peng
                          Diego Achaval
                          Hari Kotni
        Filename        : draft-koldychev-pce-operational-05.txt
        Pages           : 15
        Date            : 2022-02-19

Abstract:
   This document proposes some important simplifications to the original
   PCEP protocol and also serves to clarify certain aspects of PCEP
   operation.  The content of this document has been compiled based on
   the feedback from several multi-vendor interop exercises.  Several
   constructs are introduced, such as the LSP-DB and the ASSO-DB.

The IETF datatracker status page for this draft is:
https://datatracker.ietf.org/doc/draft-koldychev-pce-operational/

There is also an htmlized version available at:
https://datatracker.ietf.org/doc/html/draft-koldychev-pce-operational-05

A diff from the previous version is available at:
https://www.ietf.org/rfcdiff?url2=draft-koldychev-pce-operational-05

Internet-Drafts are also available by rsync at rsync.ietf.org::internet-drafts

