[netmod] Re: 2nd WGLC on system-config-08

Jürgen Schönwälder Thu, 15 Aug 2024 08:11:32 -0700

On Sun, Aug 11, 2024 at 02:10:31PM +0000, Kent Watsen wrote:
> 
> This email begins a two-week WGLC on:
> 
>       System-defined Configuration
>       https://datatracker.ietf.org/doc/html/draft-ietf-netmod-system-config-08


I have reviewed draft-ietf-netmod-system-config-08 as part of the WG
last call. I think the document is _not_ready_, in fact, I find the
document inconsistent. There is text (and a figure) suggesting that
system config is merged after template expansion into intendedn and
then there is text and lots of details how system config is copied
into running. These are two very different things and you either do
copy to running or merge to intended.

Here are my notes as I was going through the document from the
beginning to the end.

* Abstract

  - The abstract is not well written for readers not familiar with the
    details of the work nor is it clear to me what the second sentence
    tries to convey. Here is an attempt to write a replacement
    abstract:

      The Network Management Datastore Architecture (NMDA) defined in
      RFC 8342 defines several configuration datastores holding
      configuration information. The content of these configuration
      datastores is controlled by clients. This document introduces
      the concept of a system configuration datastore holding
      configuration controlled by the system on which a server is
      running.

      This document updates RFC 6241, RFC 8040, RFC 8342, and RFC
      8526.

* Introduction

  - I think the second paragraph in the Introduction is
    handwaving. The system configuration is clearly exposed and the
    structure follows a data model, the new system datastore is no
    different. Also the 3rd paragraph is not good, a server allowing a
    reference from a configuration datastore to operational state is
    broken. I would love to see this text be rewritten to state the
    problem instead of avoiding an elaboration of bad implementation
    practice as a justification for this work. In other words, the
    text should say that YANG does not allow configuration data to
    reference operational state data (and why) and then it should
    motivate why a system configuration datastore helps to improve the
    situation.

  - There is text about a 'resolve-system" parameter in the
    Introduction and it remains unclear where and when this parameter
    is used. It is a parameter of what?

  - The text "If a system-defined node is referenced ..." seems to be
    belong to the terminology section. And the term "system-defined
    node" deserves to be defined.

  - Why is the merging of system into intended happening after any
    expansions of templates or removal of inactive configuration? Is
    there a specific reason for this? Given that we have no
    specification for templates or inactive configuration, is it wise
    to fix the order? Also note that Figure 1 is misleading since it
    kind of indicates a merge _before_ intended. Why do we not have:

      |                              | // configuration transformations,
      |                              | // e.g., removal of nodes marked
      |                              | // as "inactive", expansion of
      +--------------+---------------+ // templates, merge of <system>
                     |
                     |
                     v
               +------------+
               | <intended> |  // subject to validation
               | (ct, ro)   |
               +------------+

* Kinds of System Configuration

  - I did search for immediate and conditional and it seems that the
    two terms defined here are not really used anywhere. If so, do we
    need the definitions? And if we need them, why are these terms not
    defined in section 1.1? My preference is to not defined terms we
    do not use. In section 4.2, it is stated that <system> MAY change
    dynamically and hence all information is conditionally present. I
    suggest to simply remove section 2.

* The System Configuration Datastore (<system>)

* Static Characteristics of <system>

  - I am confused about what is being said there. The text says
    "should not cause the automatic update of <running> and later it
    says "Servers migrate system configuration update in <running>". I
    think this needs more clarity.

    I think you _never_ update running. According to your model,
    <system> feeds into intended. Perhaps you meant <intended>? If so,
    I believe there are situations where updating intended is the
    right thing to do (say the system's default MTU of an interface
    changes and there is no explicit configuration for it) but there
    may also be situations where indeed the upgrade may need to be
    rejected (e.g., if an update would rename all interfaces, which
    may cause for example firewall rules to mean something very
    different). I believe we should have better and contradiction free
    text here. And the text should not talk about updating <running>.

    I am also not sure why this is a static characteristic. Perhaps
    the important portions from section 4 can be merged into section 5
    and we get rid of the distinction of static characteristics and
    dynamic characteristics?

* Dynamic Behaviors

  - The phrase "Clients MAY reference nodes defined in <system>" is
    misleading. A client does not reference nodes, a client controls
    configuration in other datastores and these datastores may contain
    nodes referencing nodes defined in <system>. In other words, it
    should be "Clients may create configuration data nodes that
    reference nodes in the <system> datastore ..." or something like
    that. I also believe the MAY should just be a may.

  - Again, what is the reason for merging after template expansion and
    removal of inactive? Why is this a MUST? What breaks if I would
    say remove inactive after the merge? There may be valid reasons
    and it may be useful to capture them.

  - "Whenever configuration in <system> changes, the server MUST also
    immediately update and validate <intended>."

  - See above on comments on the figure. I would go even further and
    draw it as follows:

 +-----------+         |        +-----------+         |
 | <system>  |         +------->| <running> |<--------+
 | (ct, ro)  |                  | (ct, rw)  |
 +-----------+                  +-----------+
      |                              | 
      |              +---------------+
      |              |
      |              |  // configuration transformations, e.g., removal
      +--------------+  // of nodes marked as "inactive", expansion of
                     |  // templates, merge of <system>
                     v
               +------------+
               | <intended> |  // subject to validation
               | (ct, ro)   |
               +------------+

  - What does it mean to "remove a copied system node from <running>"?
    Copied by whom? Perhaps you mean 'delete and overriding system
    node'?

  - I did not understand what section 5.1.1 tries to tell me. And it
    talks about "configuration copied from <system> into <running>",
    copied by whom? And why would something in <running> be tagged
    origin intended? My attempts to make sense out of 5.1.1 cause
    disagreement with it.

  - I think section 5.2 is not well worded. It talks about 'declaring
    system configuration' but I think we never really used the phrase
    to 'declare configuration'. It seems that all the text in 5.2
    essentially reduces to this statement:

      Clients can create configuration data nodes in <running> that
      match configuration data nodes in <system>. This can be
      necessary, for example, when the client does not support the
      "resolve-system" parameter (Section 5.3) in order to validate
      <intended>.

   - Section 5.3 says that "resolve-system" causes the server to copy
     the entire referenced system configuration, including all
     descendants into the target datastore (e.g., <candidate> and
     <running>). Really? This is not what I got from the text I read
     before and this is clearly not what the figures suggest. I also
     find this "copy into running" problematic; I thought so that that
     "resolve-system" causes <system> content to be merged into
     <intended>?

   - Why is it desirable to allow implementations supporting
     "resolve-system" but not exposing <system>? This means that there
     is no way to predict for a client what an edit-config will
     resolve to. Is this desirable?

   - Section 5.4 tells me that some data in <running> may override
     what is in <system> but some other data may not since system
     config is immutable. "The immutability of system configuration is
     defined in [I-D.ietf-netmod-immutable-flag]." So does this create
     a dependency on the immutability flag? Or is the idea that
     clients do not require to know this since they can go into trial
     and error mode? Is this desirable? And if so, what exactly is the
     error information to expect and react on?

   - Is <system> configuration that is not referenced directly or
     indirectly from <running> applied (or merged into intended) or
     not? In the example, are all applications visible if there is a
     "resolve-system"? Or only if they are referenced by explicit
     config?

   - In the example, can a explicit client express that a specific
     application should not make it into <intended>? Something like
     resolve-system but if that brings in xyz, please fail the
     edit-config since xyz is a no-go. And can I request that a
     specific application should make it into running without having
     to write an ACL or something else that references the
     application? If I do not know what system provides, do I go trial
     and error to explore what the list of system preconfigured
     applications is?

* Default Interactions

  - There are "should not" and MUST rules here but there is no
    explanation why these rules are important. What breaks if a
    <system> data node happens to have the schema's default value?

* The "ietf-system-datastore" Module

  - Why is section 8.2 here? I expected to see the definition of the
    "ietf-system-datastore" module and not a 2+ pages example of
    something relatively unrelated. Perhaps move all lengthy examples
    into the appendix and use only short concise examples where really
    needed in the body of the specification. In this section, I do not
    see the need for an example at all.

* Security Considerations

  - While you follow the default security templates, I wonder whether
    more needs to be said here. If I have a server supporting
    resolve-system but not implementing <system>, I have no way to
    predict how the result produced by an edit-config will look
    like. So I have to fully trust that resolve-system does not lead
    to a modification of running that may have undesirable
    consequences.

I did not go to rewrite my review now that I see that I fundamentally
misunderstood the proposed solution at the beginning. The reason is
that Figure 1 is misleading. The reason is that you seem to have
two models and neither is captured in Figure 1:

 +-----------+         |        +-----------+         |
 | <system>  |         +------->| <running> |<--------+
 | (ct, ro)  |                  | (ct, rw)  |
 +-----------+                  +-----------+
                                     | // configuration transformations,
                                     | // e.g., removal of nodes marked
                                     | // as "inactive", expansion of
                     +---------------+ // templates
                     |
                     |
                     v
               +------------+
               | <intended> |  // subject to validation
               | (ct, ro)   |
               +------------+

This is the situation where the client learns about system config by
reading <system> and then explicitly copying over into <running>
whatever is needed. The other model is really a partial copy of
<system> into <running> via the "resolve-system" magic:

 +-----------+         |        +-----------+         |
 | <system>  |         +------->| <running> |<--------+
 | (ct, ro)  |----------------->| (ct, rw)  |
 +-----------+                  +-----------+
                                     | // configuration transformations,
                                     | // e.g., removal of nodes marked
                                     | // as "inactive", expansion of
                     +---------------+ // templates
                     |
                     |
                     v
               +------------+
               | <intended> |  // subject to validation
               | (ct, ro)   |
               +------------+

But then I do find statements like this:

   Configuration in <running> is merged with <system> to create the
   contents of <intended> after the configuration transformations to
   <running> (e.g., template expansion, removal of inactive
   configuration defined in [RFC8342]) have been performed
   (Section 5.1).

This apparently is conflict with what is really proposed. So my
confusion is very well justified I think. There is a very big
difference between thinking of <intended> as an overlay of <running>
and <system> or thinking of <system> as something that either explicit
or implicitly is copied into <running> as a side-effect of certain
operations.

/js

-- 
Jürgen Schönwälder              Constructor University Bremen gGmbH
Phone: +49 421 200 3587         Campus Ring 1 | 28759 Bremen | Germany

_______________________________________________
netmod mailing list -- [email protected]
To unsubscribe send an email to [email protected]

[netmod] Re: 2nd WGLC on system-config-08

Reply via email to