[netmod] Re: WGLC on system-config-05

Rob Wilton (rwilton) Tue, 21 May 2024 13:02:57 -0700

Hi authors, chairs, WG,

I’m generally supportive of this work, but I think that there are still some 
potential corner cases that are not covered, or it isn’t entirely obvious how 
they are handled.


Comments below.





Moderate level comments:



(1) p 7, sec 2.3.  Inactive-Until-Referenced



   There are some system configuration predefined (e.g., application

   ids, anti-x signatures, trust anchor certs, etc.) as a convenience

   for the clients, which must be referenced to be active.  The clients

   can also define their own configurations for their unique

   requirements.  Inactive-until-referenced system configuration is

   generated in <system> immediately when the device is powered on, but

   it is not active until being referenced.



I'm not sure whether Inactive-Until-Referenced actually needs to be defined, or 
to put it another way, I'm not sure whether this type of configuration is 
special to system datastores at all.  If a configuration (either explicitly in 
<running> or implicitly from <system>) defines a QoS policy that is not 
referenced from anywhere (e.g., not applied to any interfaces), then I think 
that it is up to the server implementation to decide whether that unreferenced 
QoS policy is reported in <operational> or not.





(2) p 9, sec 5.1.  Conceptual Model of Datastores



   When the device is powered on, immediately-active system

   configuration is generated in <system> and active immediately, but

   inactive-until-referenced system configuration only becomes active if

   referenced by client-defined configuration.  However, conditionally-

   active system configuration will only be created and active when

   specific conditions on system resources are met.



I think that it should be "merged with system" not "merged into system" since 
the running configuration never ends up in the system datastore.





(3) p 9, sec 5.1.  Conceptual Model of Datastores



                  additional nodes to a list entry or new list/leaf-

   list entries appearing in <running> extends the list entry or the

   whole list/leaf-list defined in <system> if the server allows the

   list/leaf-list to be updated.



How is this achieved?  This appears to suggest that there are two different 
merging behaviours (one choice is to be additive, the other is to replace), and 
it seems to be down to the server to choose what to do on a case-by-case basis. 
 I think that it would be cleaner to define a single merge behaviour if that is 
feasible (even if it is slightly less flexible).  Also, potentially it is 
appropriate for the merge behaviour to be different for list vs leaf-list 
(e.g., always merge list entries, but do a simple replace on leaf-lists).
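
To make the suggestion concrete, here is a minimal Python sketch of one possible 
single merge behaviour (always merge list entries, simple replace on 
leaf-lists).  It is not taken from the draft: the function names and the 
dict-based representation of datastore contents are assumptions made purely for 
illustration.

```python
def merge_list(system_entries, running_entries):
    """Merge keyed list entries from <system> and <running>.

    Both arguments map a list key to a dict of leaf name -> value
    (a hypothetical in-memory representation). Entries are always
    merged; where both define the same leaf, the <running> value wins.
    """
    merged = {key: dict(leaves) for key, leaves in system_entries.items()}
    for key, leaves in running_entries.items():
        merged.setdefault(key, {}).update(leaves)
    return merged


def merge_leaf_list(system_values, running_values):
    """Simple replace for leaf-lists.

    If <running> configures the leaf-list at all (not None), its values
    are used as-is; otherwise the <system> values apply.
    """
    if running_values is not None:
        return list(running_values)
    return list(system_values)


system = {"lo0": {"mtu": 65536, "enabled": True}}
running = {"lo0": {"mtu": 9216}, "eth0": {"enabled": True}}
intended = merge_list(system, running)
```

With this rule <running> can extend or override a <system> list entry, but it 
cannot remove one, which is the behaviour I ask about in comment (6).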





(4) p 9, sec 5.1.  Conceptual Model of Datastores



                                      If a server implements

   <intended>, <system> MUST be merged into <intended>.



This sentence is just repetition and can be deleted.  The text above is still 
normative without the RFC 2119 MUST.





(5) p 13, sec 5.4.  Modifying (Overriding) System Configuration



   For instance, descendant nodes in a system-defined list entry may be

   modifiable or not, even if some system configuration has been copied

   into <running> earlier.  If a system node is non-modifiable, then

   writing a different value for that node MUST return an error.  The

   immutability of system configuration is defined in

   [I-D.ma-netmod-immutable-flag].



I think that some care is needed here.  E.g., if the modification was being 
done to <candidate>, then it isn't writing a different value to <candidate> 
that would return an error, but instead the <validate> or <commit> operation 
that would fail.
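
A minimal Python sketch of the distinction, under the assumption that the 
immutability check is part of validation.  The `Server` class, its method 
names, and the path-to-value dicts are hypothetical and only illustrate where 
the error would surface.

```python
class ImmutableNodeError(Exception):
    """Raised when a non-modifiable system node is given a different value."""


class Server:
    def __init__(self, system, immutable):
        self.system = dict(system)        # path -> value in <system>
        self.immutable = set(immutable)   # paths that must not be overridden
        self.running = {}
        self.candidate = {}

    def _validate(self, config):
        # Reject a value that differs from <system> on an immutable node.
        for path, value in config.items():
            if path in self.immutable and self.system.get(path) != value:
                raise ImmutableNodeError(path)

    def edit_running(self, path, value):
        # Writing directly to <running>: the edit itself is validated.
        new_running = {**self.running, path: value}
        self._validate(new_running)
        self.running = new_running

    def edit_candidate(self, path, value):
        # Writing to <candidate>: accepted without validation.
        self.candidate[path] = value

    def commit(self):
        # Validation, and hence the error, happens here.
        self._validate(self.candidate)
        self.running = dict(self.candidate)
```

In this sketch `edit_candidate()` never fails; the error only appears when 
`commit()` is called, whereas `edit_running()` fails immediately.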





(6) p 13, sec 5.4.  Modifying (Overriding) System Configuration



   A server may also allow a client to add data nodes to a list entry in

   <system> by writing those additional nodes in <running>.  Those

   additional data nodes may not exist in <system> (i.e., an *addition*

   rather than an override).



Earlier, the text in 5.1 seems to suggest that a list-entry could be 
overwritten.  Is the intention that this is always a merge?  I.e., it is 
possible to override entries, but there is no way that running can remove a 
list entry that is defined in <system>.



Section 5.4 seems to largely repeat what is specified in section 5.1, and 
arguably it would be nice if this text could be co-located and only 
specified once (for brevity and to avoid ambiguity).  I’m wondering if the 
merge behaviour generally needs to be specified more explicitly.







Minor level comments:



(7) p 0, sec



   This document defines how a management client and server handle YANG-

   modeled configuration data that is defined by the server itself.  The

   system-defined configuration can be referenced (e.g. leafref) by

   configuration explicitly created by a client.



Perhaps 'instantiated' by the server itself rather than 'defined' by the server.





(8) p 0, sec



   The Network Management Datastore Architecture (NMDA) defined in RFC

   8342 is updated with a read-only conventional configuration datastore

   called "system" to hold system-defined configuration.



Perhaps 'expose system-defined configuration' to clients rather than 'hold'.





(9) p 0, sec



   As an alternative to clients explicitly copying referenced system-

   defined configuration into the target configuration datastore (e.g.,

   <running>) so that the datastore is valid, a "resolve-system"

   parameter is defined to allow the server acting as a "system client"

   to copy referenced system nodes automatically.  This solution enables

   clients manipulating the target configuration datastore (e.g.,

   <running>) to reference nodes defined in <system>, override system-

   provided values, and configure descendant nodes of system-defined

   configuration.



I think that this paragraph is too detailed to be in the abstract and should be 
removed from the abstract.





(10) p 4, sec 1.1.  Terminology



   The following terms are defined in this document:

   System configuration:  Configuration that is provided by the system

      itself.  System configuration is present in the system

      configuration datastore (regardless of whether it is applied or

      referenced) and appears in <intended> unless explicitly

      overridden.  System configuration that is considered active

      appears in <operational> with origin="system".  It is a different

      and separate concept from factory default configuration defined in

      RFC 8808 (which represents a preset initial configuration that is

      used to initialize the configuration of a server).



RFC 8808 should turn into a proper reference, it looks like it is just text 
here.





(11) p 5, sec 1.4.  Updates to RFC 6241 and RFC 8526



   This document defines a NETCONF protocol capability to indicate

   support for this parameter.  NETCONF server that supports "resolve-

   system" parameter MUST advertise the following capability identifier:



Are we ambiguous as to whether this must be supported, or is optional to 
implement?  Ah, I see that this is specified later in the document (which is 
arguably the right place).  Is the capability really an update to RFC 6241 and 
8526?  I wonder whether this last paragraph (i.e., the capability definition) 
would be better under section 5.3.





(12) p 5, sec 1.5.  Updates to RFC 8040



   This document extends Sections 4.8 and 9.1.1 of [RFC8040] to add a

   new query parameter "resolve-system" and corresponding query

   parameter capability URI.



Again, I think that sections 1.5.1 and 1.5.2 would possibly be better outside 
of the introduction, perhaps as subsections of 5.3.  Section 1.5 could then 
forward-reference those sections.





(13) p 6, sec 2.  Kinds of System Configuration



   Active system configuration refers to system configuration that is

   currently in use.  As per definition of the operational state

   datastore in [RFC8342], if system configuration is inactive, it does

   not appear in <operational>.  However, system configuration is

   present in <system> once it is generated, regardless of whether it is

   active or not.



I'm not sure that calling this "active configuration" is a great choice, 
because it seems to be a slightly different concept to inactive configuration 
defined in RFC 8342.  Specifically, I thought that the inactive configuration 
in RFC 8342 controlled whether or not it would appear in <intended>, but in 
this case, presumably it always turns up in <intended> if it is in <system> and 
instead doesn't appear in <operational>?







(15) p 7, sec 3.  The System Configuration Datastore (<system>)



   *  Management operations: The content of the datastore is set by the

      server in an implementation dependent manner.  The content can not

      be changed by management operations via protocols such as NETCONF,

      RESTCONF, but may change itself by license change, device upgrade

      and/or system-controlled resources change.  The datastore can be

      read using the standard network management protocols such as

      NETCONF and RESCTCONF.



Rather than saying that the contents can change itself, I think that it would 
be better to say that the server may change the contents under various 
conditions, such as ...





(16) p 7, sec 3.  The System Configuration Datastore (<system>)



   *  Origin: This document does not define any new origin identity when

      it interacts with <intended> and flows into <operational>.  The

      "system" origin Metadata Annotation [RFC7952] is used to indicate

      the origin of a data item is system, which is achieved by updating

      the definition of "intended" origin metadata annotation in

      [RFC8342].



If a different value is configured in <running> that overrides a value in 
<system> then it is clear that the origin should be <intended>.  Do we specify 
what the origin should be if the same value exists in both <running> and 
<system> (which could be a very common occurrence if resolve-system is used)?
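
The cases can be sketched in Python as follows.  The function name, the 
path-to-value dicts, and the `tie` parameter are all hypothetical; `tie` stands 
in for exactly the choice that I think the draft needs to specify.

```python
def origin_of(path, system, running, tie="intended"):
    """Return the origin reported in <operational> for one data node.

    `system` and `running` map a path to its configured value. `tie`
    is the origin used when both datastores hold the same value; which
    value is correct here is the open question, so it is a parameter.
    """
    in_system = path in system
    in_running = path in running
    if in_running and not in_system:
        return "intended"          # only configured by a client
    if in_system and not in_running:
        return "system"            # only provided by the system
    if in_system and in_running:
        if system[path] != running[path]:
            return "intended"      # <running> overrides <system>
        return tie                 # same value in both: unspecified
    return None                    # node is not configured at all
```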





(17) p 8, sec 3.  The System Configuration Datastore (<system>)



   The system datastore is defined as a conventional configuration

   datastore and shares a common datastore schema with other

   conventional datastores.



This paragraph should probably move up to "YANG modules".





(18) p 8, sec 4.2.  May Change via Software Upgrades or Resource Changes



   *  Servers rejects the operation to change system configuration

      (e.g., device upgrade fails) and needs the client to correct the

      configuration in <running> as a prerequisite to ensure validity



Should we add a recommendation for servers to document how they handle these 
issues?





(19) p 10, sec 5.1.  Conceptual Model of Datastores



    ct = config true; cf = config false

    rw = read-write; ro = read-only

    boxes denote named datastores



In this diagram, (1) please move the system box 1 line to the left to keep it 
more cleanly separate from the arrow into running.  (2) I think that we should 
discuss whether the running and system arrows should merge at a common point 
rather than running flowing into the side.





(20) p 10, sec 5.1.  Conceptual Model of Datastores



    ct = config true; cf = config false

    rw = read-write; ro = read-only

    boxes denote named datastores



I know that it isn't directly related to this work, but I wonder whether the 
"default configuration" arrow is really in the right place, and whether it 
shouldn't also feed into <intended>, since validation would surely take 
default values into account.  But this is perhaps a question for 
another day ...





(21) p 11, sec 5.1.  Conceptual Model of Datastores



   Any deletable system-provided configuration that is populated as part

   of <running> by the system at boot up, without being part of the

   contents of a <startup> datastore, must be defined in <factory-

   default> [RFC8808], which is used to initialize <running> when the

   device is first-time powered on or reset to its factory default

   condition.



I agree with the sentiment of what is written here, but I'm not sure that it is 
wise to restate it in this document, or whether it would be better to delete 
this paragraph and just reference RFC 8808.



E.g., maybe something like ..

<factory-default> [RFC8808] defines a mechanism for populating <running> at 
system boot up with regular configuration data nodes, that hence can be deleted.





(22) p 12, sec 5.3.  Servers Auto-configuring Referenced System Configuration

      ("resolve-system" parameter)



   The "resolve-system" parameter is optional and has no value.  If it

   is present, and the server supports this capability, the server MUST

   copy referenced system nodes into the target datastore (e.g.,

   <running>) without the client doing the copy/paste explicitly, to

   resolve any references not resolved by the client.  The server acting

   as a "system client" like any other remote clients copies the

   referenced system-defined nodes when triggered by the "resolve-

   system" parameter.  Legacy clients interacting with servers that

   support this parameter don't see any changes in <edit- config>/<edit-

   data> and <copy-config> behaviors.



How does resolve-system interplay with the candidate configuration datastore?  
E.g., should it also be listed in the examples of datastores?  What about the 
<validate> or <commit> operations?  Is there any impact on private-candidate 
datastores, and if so, where should that be documented?





(23) p 12, sec 5.3.  Servers Auto-configuring Referenced System Configuration

      ("resolve-system" parameter)



   The server's copy referenced nodes from <system> to the target

   datastore MUST be enforced at the end of the <edit-config>/<edit-

   data> or <copy-config> operations during the validation processing,

   regardless of which target datastore it is.



This probably means that it isn't a separate "system client", because I would 
expect that to turn up in the commit history as a separate commit.  Instead, 
the update to running via resolve-system is exactly the same as if the client 
had made the modification directly as part of an edit-data (or similar) 
operation.
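
A minimal Python sketch of this reading, in which the copy of referenced system 
nodes is folded into the same edit rather than applied as a second change.  The 
function name, the path-to-value dicts, and the `references` map (path of a 
referencing leaf to the path it refers to) are assumptions for illustration 
only, and chains of references needing several rounds are not handled.

```python
def apply_edit(running, edit, system, references, resolve_system=False):
    """Return the new <running> produced by a single edit operation.

    Referenced nodes missing from the result are copied from <system>
    when `resolve_system` is set; otherwise the edit is rejected. Either
    way there is exactly one resulting change to <running>.
    """
    new_running = {**running, **edit}
    for path, target in references.items():
        if path not in new_running or target in new_running:
            continue  # reference unused or already resolved
        if resolve_system and target in system:
            new_running[target] = system[target]
        else:
            raise ValueError(f"unresolved reference to {target}")
    return new_running


system = {"/qos/policy/gold": {"rate": 100}}
references = {"/if/eth0/policy": "/qos/policy/gold"}
edit = {"/if/eth0/policy": "gold"}
result = apply_edit({}, edit, system, references, resolve_system=True)
```

Here `result` contains both the client's leaf and the copied system node, 
indistinguishable from a client that had supplied both itself.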





(24) p 13, sec 5.3.  Servers Auto-configuring Referenced System Configuration

      ("resolve-system" parameter)



   If the "resolve-system" parameter is not given by the client, the

   server should not modify <running> in any way otherwise not specified

   by the client.  Not using capitalized "SHOULD NOT" in the previous

   sentence is intentional.  The intention is to bring awareness to the

   general need to not surprise clients with unexpected changes.  It is

   desirable for clients to always opt into using mechanisms having

   server-side changes.  This document enables a client to opt into this

   behavior using the "resolve-system" parameter.  An example of this

   type of opt-in behavior can also be found in RFC 7317, which enables

   a client to opt into its behavior using a "$0$" prefix (see

   ianach:crypt-hash type defined in [RFC7317]).



Arguably, the above paragraph isn't needed at all and can just be removed.  
Otherwise, you could argue that it perhaps conflicts with the text in 
4.2?





(25) p 13, sec 5.3.  Servers Auto-configuring Referenced System Configuration

      ("resolve-system" parameter)



   Implementation specifics are beyond the scope of this document,

   however, due to the extra complexity brought by the "resolve-system"

   parameter, clients should be aware that it would cost a reasonable

   amount of time for the server to resolve reference, retrieve and copy

   the referenced system configuration from <system>, which could take

   multiple rounds since some errors may depend on the resolution of

   previous ones.



Suggest changing "it would cost" to "it may take".  But I'm also not really 
sure that this paragraph should be in the document (e.g., what is a reasonable 
amount of time?  Is it 1 second, or a minute, or a few minutes).





(26) p 24, sec 6.2.  Example Usage



   The local port and remote port are used when the BGP peer connection

   is established.  Since both are not supplied explicitly in <running>

   and <intended>, the default value for "bgp/peer/remote-port" is used,

   and there is no default statement for "bgp/peer/local-port", the

   system will select a value for it.  So the contents of <system> are

   shown as follows:



There is some level of interplay here between YANG default values and the 
information present in system.  E.g., depending on how the YANG data model is 
written (i.e., sometimes complex default values are specified in description 
statements rather than as formal YANG defaults), the choice as to whether 
to report a value in system vs a default value in the configuration may be a 
bit ambiguous.  E.g., this example of using the local-port as an example of 
system configuration potentially feels like the weakest of the alternative 
justifications that have been provided.





(27) p 29, sec 7.3.  YANG Module



         description

           "When present, the server is allowed to automatically

            configure referenced system configuration into the

            target configuration datastore.";



Should this be "is allowed to automatically configure", or should it be "must 
automatically configure"?





(28) p 31, sec 9.2.  Regarding the "ietf-netconf-resolve-system" YANG Module



   The security considerations for the base NETCONF protocol operations

   (see Section 9 of [RFC6241] apply to the new extended RPC operations

   defined in this document.



Possibly, this section should say a bit more about the security impacts of 
supporting the resolve-system option, i.e., that there aren't any beyond the 
potential performance impacts of implementing resolve-system, which may mean 
that employing some form of rate limiting of requests specifying this option 
might be a good idea to avoid DoS attacks.
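
As one illustration of what such rate limiting could look like, here is a 
minimal token-bucket sketch in Python.  The class, its parameters, and the idea 
of keeping one bucket per client session are assumptions of mine, not anything 
defined by the draft.

```python
import time


class TokenBucket:
    """Allow at most `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock            # injectable so the limiter can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        # Refill in proportion to elapsed time, capped at the capacity.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A server could call `allow()` only for requests that carry resolve-system and 
reject or delay the request when it returns False, leaving other requests 
unaffected.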





(29) p 35, sec Appendix A.  Key Use Cases



A.1.  Device Powers On



Please provide a short prose description of what the example illustrates.





(30) p 35, sec Appendix A.  Key Use Cases



   <running>:



Please expand these, e.g.  The <running> datastore contains:





(31) p 35, sec Appendix A.  Key Use Cases



   <system>:



Please expand these, e.g.  The <system> datastore contains:





(32) p 35, sec Appendix A.  Key Use Cases



   <intended>:



Please expand these, e.g.  After merging, the <intended> datastore contains:





(33) p 35, sec Appendix A.  Key Use Cases



   <operational>:



Please expand these, e.g.  Once the configuration is applied, the <operational> 
datastore contains:





(34) p 36, sec Appendix A.  Key Use Cases



   <running>:



Please expand these similarly to above for the other examples, A.2 and A.3.





(35) p 36, sec Appendix A.  Key Use Cases



   <interfaces xmlns:or="urn:ietf:params:xml:ns:yang:ietf-origin"

               or:origin="or:intended">



As per a previous comment, I wonder whether the origin of the 'interfaces' 
container itself should be 'intended' or 'system' (given that loopback always 
exists and hence it can never be removed).





(36) p 37, sec Appendix A.  Key Use Cases



     <interface or:origin="or:system">

       <name>lo0</name>

       <ip-address>127.0.0.1</ip-address>

       <ip-address>::1</ip-address>

     </interface>

   </interfaces>

A.3.  Operator Installs Card into a Chassis



Please provide a short prose description of what the example illustrates.







Nit level comments:



(37) p 5, sec 1.3.  Updates to RFC 8342



   Configuration in <running> is merged into <system> to create the

   contents of <intended> after the configuration transformations to

   <running> (e.g., template expansion, removal of inactive

   configuration defined in [RFC8342]) have been performed.  This

   document updates the definition of "intended" origin metadata

   annotation identity to allow a subset of configuration provided by

   <intended> to use "system" as origin value as it flows into

   <operational>.  Applied system configuration appears in <operational>

   with origin value being reported as "system" (Section 5.1).



I think that "<running> is merged into <system>" is confusing.  I would say 
that <running> is merged with the contents of <system> and how that merge is 
performed must be specified.





(38) p 8, sec 3.  The System Configuration Datastore (<system>)



   *  Defining YANG module: "ietf-system-datastore".

   The datastore's content is defined by the server and read-only to

   clients.  Upon the content is created or changed, it will be merged

   into <intended>.  Unlike <factory-default> [RFC8808], it MAY change

   dynamically, e.g., depending on factors like license change, device

   upgrade or system-controlled resources change (e.g., HW available).

   The system configuration datastore doesn't persist across reboots;

   <factory-reset> RPC operation defined in [RFC8808] can reset it to

   its factory default configuration without including configuration

   generated due to the system update or client-enabled functionality.



Upon the content => When the content.  Some of the content here seems to repeat 
the text in "Management operations", I think the examples would be better in 
only a single place.





(39) p 8, sec 4.2.  May Change via Software Upgrades or Resource Changes



   If system configuration changes (e.g., due to device upgrade),

   <running> MAY become invalid.  The server behaviors of migrating

   updated system data into <running> is beyond the scope of this

   document.  That said, the following gives a list of examples of

   server implementations that might be possible:



Suggest rewording to: "That said, here are some examples of how a server might 
handle this scenario:"





(40) p 9, sec 4.3.  No Impact to <operational>



   This work intends to have no impact to <operational>.  System

   configuration appears in <operational> with origin value being

   reported as "system" if not configured or overridden explicitly in

   <running>.  This document enables a subset of those system generated

   nodes to be defined like configuration, i.e., made visible to clients

   in order for being referenced or configurable prior to present in

   <operational>.  "Config false" nodes are out of scope, hence existing

   "config false" nodes are not impacted by this work.



As per above, does "Overridden explicitly" mean "has a different value" in 
running?





(41) p 12, sec 5.3.  Servers Auto-configuring Referenced System Configuration

      ("resolve-system" parameter)



   Note that even an auto-configured node is allowed to be deleted from

   the target datastore by the client, the system may automatically

   configure the deleted node again to make configuration valid, when a

   "resolve-system" parameter is carried.  It is also possible that the

   operation request (e.g., <edit-config>) may not succeed due to

   incomplete referential integrity.



Perhaps "recreate the deleted node" rather than "configure the deleted node".





(42) p 12, sec 5.3.  Servers Auto-configuring Referenced System Configuration

      ("resolve-system" parameter)



   Support for the "resolve-system" parameter is OPTIONAL.  Servers not

   supporting NMDA [RFC8342] MAY also implement this parameter without

   implementing the system configuration datastore, which would only

   eliminate the ability to expose the system configuration via protocol

   operations.  If a server implements <system>, referenced system

   configuration is copied from <system> into the target datastore

   (e.g., <running>) when the "resolve-system" parameter is used;

   otherwise it is an implementation decision where to copy referenced

   system configuration into the target datastore (e.g., <running>).



Perhaps 'examine' rather than 'expose'.





(43) p 21, sec 5.5.3.  Modifying a System-instantiated Leaf's Value



   <interfaces xmlns="urn:example:interface">

     <interface>

       <name>lo0</name>

       <mtu>65536</mtu>

       <ip-address>127.0.0.1</ip-address>

       <ip-address>::1</ip-address>

     </interface>

   </interfaces>

   A client modifies the value of MTU to 65535 and adds the following

   configuration into <running>:



I initially hadn't spotted the subtle change; perhaps use an MTU value that is 
more obviously different from the value in system.  E.g., perhaps 9216.



Regards,

Rob




From: netmod <[email protected]> on behalf of Kent Watsen 
<[email protected]>
Date: Friday, 29 March 2024 at 14:09
To: [email protected] <[email protected]>
Subject: [netmod] WGLC on system-config-05
This email begins a two-week WGLC on:
System-defined Configuration
https://datatracker.ietf.org/doc/draft-ietf-netmod-system-config/

Please take time to review this draft and post comments by April 12.  Favorable 
comments are especially welcomed.
There is no known IPR for this document:
https://mailarchive.ietf.org/arch/msg/netmod/IpzWIAbgifXoKaNfLhEDmYbyXkY/
Kent  // co-chair

