Re: [netmod] AD review: draft-ietf-netmod-revised-datastores-08

Robert Wilton Tue, 09 Jan 2018 07:34:12 -0800


On 09/01/2018 11:28, Martin Bjorklund wrote:

Robert Wilton <[email protected]> wrote:

Hi Andy,


On 08/01/2018 19:45, Andy Bierman wrote:


On Mon, Jan 8, 2018 at 5:55 AM, Robert Wilton <[email protected]
<mailto:[email protected]>> wrote:

     Hi Andy,

     Regarding your comment below, this intent is captured by this text
     describing the operational datastore in section 5.3:

         <operational> SHOULD conform to any constraints specified in the
         data
         model, but given the principal aim of returning "in use" values, it
         is possible that constraints MAY be violated under some
         circumstances, e.g., an abnormal value is "in use", the structure of
         a list is being modified, or due to remnant configuration (see
         Section 5.3.1).  Note, that deviations SHOULD be used when it is
         known in advance that a device does not fully conform to the
         <operational> schema.

         Only semantic constraints MAY be violated, these are the YANG
         "when",
         "must", "mandatory", "unique", "min-elements", and "max-elements"
         statements; and the uniqueness of key values.

         Syntactic constraints MUST NOT be violated, including hierarchical
         organization, identifiers, and type-based constraints.  If a node in
         <operational> does not meet the syntactic constraints then it MUST
         NOT be returned, and some other mechanism should be used to flag the
         error.


     Do you agree that this is sufficient?



Not really.
It does not address my concern, which is that NMDA is
removing the YANG constraints on config=false data nodes
for no apparent reason.

There is a reason. I don't think that the constraints on config=false
nodes are really being removed, because I don't think that they truly
existed in the first place (despite what RFC 7950 might indicate!).

I agree.  But note that RFC 7950 says:

    o  If the constraint is defined on state data, it MUST be true in a
       valid state data tree.

It is not defined anywhere that <get> must return a "valid state data
tree".

In reality, I suspect that all implementations of <get> call various
instrumentation callback functions in some order, possibly in
parallel, which means that data will be collected at different times
from the backend systems.  I don't think it is feasible to freeze the
operational state of a device, collect all data, and unfreeze, in
order to get a consistent snapshot of the operational state.

I agree.

It is not even just that the management agent may be reading the data
from the backend systems at different times; the operational state in
those backend systems may not have converged at the point that it is
being read (e.g. a route that has been installed in the FIB on some
linecards, but not all).

I think that the operational state is probably best considered as being
eventually consistent. I.e. if the device stops receiving further
updates (in config or state), and if no abnormal conditions have
occurred, then the operational state of the system must end up
conforming to the schema for <operational>, and <operational> would be a
valid state data tree.


Thanks,
Rob



/martin

I think that we all agree on the expected behavior for configuration:
If a client sends configuration to a server that would cause <running>
to become invalid then the server should reject that change, to ensure
that <running> always holds a consistent configuration.  Having a
consistent configuration is the most important property here.
I.e. the server has the right to reject an invalid configuration
request from a client.

However, the flow of operational state data in the opposite direction
cannot be held to the same rules.  If during the processing of a get
request (or YANG push) a server sends operational state data back to a
client then a client has to choose how to process the message:
  - if the message is garbled or not sane then it makes sense to
discard it.
  - however, what should the client do if the message is well formed
but either (i) contains some values outside the permitted schema range
(but representable in the schema datatype), or (ii) applying the
values would cause the client's copy of <operational> to become
invalid?

If the client discards the message because of one bad value, then that
doesn't seem to be helpful, since it allows for a very fragile model
of system management.  I.e. if one small thing is bad then the whole
house of cards collapses.

So I think that the only sensible behaviour here is that the client
has to process the operational state update in a best effort fashion,
keep all the good data and probably flag any values that are outside
the value constraints.  Any reference constraint failures
(i.e. when/must) can similarly be flagged up, but throwing away an
update message that would cause the operational state to become
inconsistent doesn't seem to be helpful.  I.e. it is much better if
the client gets to see the true state of the server, even if that
state isn't good (or consistent).
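
The best-effort client behaviour described above could be sketched
roughly as follows. This is only an illustration, not anything defined
by the draft: RangeRule, ClientView, apply_update, and the example paths
and limits are all hypothetical names chosen for the sketch. The point
is simply that every reported value is kept, and out-of-range values are
flagged rather than causing the whole update to be discarded.

```python
from dataclasses import dataclass, field


@dataclass
class RangeRule:
    """Hypothetical stand-in for a YANG integer range restriction."""
    low: int
    high: int


@dataclass
class ClientView:
    """The client's copy of <operational> plus any flagged paths."""
    values: dict = field(default_factory=dict)
    flagged: dict = field(default_factory=dict)


def apply_update(view, rules, update):
    """Merge an update, keeping every value and flagging range violations."""
    for path, value in update.items():
        view.values[path] = value  # keep what the server reported
        rule = rules.get(path)
        if rule is not None and not (rule.low <= value <= rule.high):
            view.flagged[path] = f"{value} outside {rule.low}..{rule.high}"
        else:
            view.flagged.pop(path, None)  # clear any stale flag
    return view


# Illustrative paths and limits only; not taken from any real module.
rules = {"/interfaces/interface[name='eth0']/mtu": RangeRule(68, 9216)}
update = {
    "/interfaces/interface[name='eth0']/mtu": 70000,
    "/interfaces/interface[name='eth0']/oper-status": "up",
}
view = apply_update(ClientView(), rules, update)
print(view.flagged)
```

The abnormal value stays visible to the operator, and the unrelated
oper-status value in the same update is not lost.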

Similar questions arise on the server itself:
  - what if the real value in use (e.g. that is read from the hardware)
is outside the permitted range (because of a logic defect)?  Is it
really better to suppress that value entirely or return a value that
the server knows to be wrong?
  - can a server even know that its operational view is consistent? For
complex systems where the real operational state is split across
multiple underlying linecards, or remote devices, I think that this is
very hard (if not impossible) to do.

So what the NMDA architecture states is:
  (i) if a server knows that it won't conform to the operational schema
then it must use deviations,
  (ii) a server in a normal steady state should conform to the
operational schema (and be valid),
  (iii) but, if the system is churning (e.g. configuration, route
update, etc) then the operational state of the server might be
transiently inconsistent and this is OK,
  (iv) if the server is in a bad state, then it is better to return
the actual state than to lie or not report a particular value (as long
as it can be encoded).
  (v) a server does not need to explicitly validate that its view of
operational is valid. It is unclear what it would/could do if it
detected that the operational state is invalid, nor is it clear that
servers would generally be able to always perform this operation.
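
One way to picture the split between constraints that may and may not
be violated, following the quoted section 5.3 text, is the sketch below.
The labels in SYNTACTIC and the function names are hypothetical, chosen
only for illustration; the semantic list is the set of statements named
in the draft text. A node is returned unless one of its violated
constraints is syntactic.

```python
# Statements the quoted draft text lists as semantic constraints.
SEMANTIC = {
    "when", "must", "mandatory", "unique",
    "min-elements", "max-elements", "key-uniqueness",
}
# Hypothetical labels for the syntactic categories named in the draft.
SYNTACTIC = {"hierarchy", "identifier", "type"}


def is_semantic(constraint):
    """Return True for semantic constraints, False for syntactic ones."""
    if constraint in SEMANTIC:
        return True
    if constraint in SYNTACTIC:
        return False
    raise ValueError(f"unknown constraint label: {constraint}")


def may_return(violated):
    """A node may be returned only if every violated constraint is semantic."""
    return all(is_semantic(c) for c in violated)


print(may_return([]))                  # no violations at all
print(may_return(["when", "must"]))    # semantic violations only
print(may_return(["when", "type"]))    # includes a syntactic violation
```

This prints True, True, False: a node that breaks a "when" or "must"
can still be reported, while one that cannot be encoded according to its
type is omitted and the error is flagged by some other mechanism.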

The server implementation requirements expressed in YANG constraints
are applicable to any data node, not just config=true data nodes.
The requirement to implement the ancestor nodes (with keys) does not
change.

The draft does not allow this to be violated.  I.e. the following
statement prevents this: "Syntactic constraints MUST NOT be violated,
including hierarchical organization".

The requirement to conform to the YANG constraints defined within
config=false data nodes does not change.

To do otherwise does not make sense.  E.g. "when" conditions that add
ethernet counters only when the interface type is ethernetCsmacd. Why
would it be OK for the server to ignore that when-stmt and add ethernet
counters to every interface?

It is not OK for a server to ignore that and add Ethernet counters to
every interface (without using a deviation).  The draft is not trying
to allow that.

But if an interface could change type (e.g. between Ethernet and ATM
via a different optics module being inserted) then it would be allowed
for a server to transiently report the ethernet counters on the
interface whilst it is in the process of changing the interface type
from ethernet to ATM (e.g. if the counters are maintained by a
separate daemon that is updated asynchronously with respect to the
config or optics change).  Once the change has completed and the
system has reached steady state, the Ethernet counters must no longer
be reported.
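
As a rough illustration of that transient, the sketch below walks
through three snapshots of one interface during an optics swap, with the
counters daemon lagging the type change by one step. The snapshot
layout, the when_holds function, and the "atm" type label are
assumptions made for the example, not anything defined by
ietf-interfaces.

```python
def when_holds(snapshot):
    """Ethernet counters may be present only when the type is ethernetCsmacd."""
    has_counters = "ethernet-counters" in snapshot
    return (not has_counters) or snapshot["type"] == "ethernetCsmacd"


# Hypothetical snapshots of one interface as seen in <operational>.
steps = [
    # Steady state before the swap.
    {"type": "ethernetCsmacd", "ethernet-counters": {"in-frames": 10}},
    # Type already changed, counters daemon not yet updated (transient).
    {"type": "atm", "ethernet-counters": {"in-frames": 10}},
    # Converged: the counters are gone.
    {"type": "atm"},
]

results = [when_holds(s) for s in steps]
print(results)
```

This prints [True, False, True]: the "when" condition is violated only
in the middle snapshot, and holds again once the system has converged.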

Thanks,
Rob

IMO the text above can only apply to the operational values of
config=true nodes.


     Thanks,
     Rob



Andy



     On 21/12/2017 22:49, Andy Bierman wrote:

     Hi,

     It should be clear somehow that server requirements to provide
     config=false data that is valid according to the YANG definitions
     are not affected by NMDA.
     That is not being taken away.  The ability to validate
     operational values
     of configuration data has never been provided, and therefore is
     not being taken away either.

     A constraint on config=true nodes only applies to configuration
     datastores.
     These are the only constraints that should be ignored in
     <operational>.
     Constraints on config=false nodes still apply in <operational>.


     Andy



     On Thu, Dec 21, 2017 at 2:27 PM, Juergen Schoenwaelder
     <[email protected]
     <mailto:[email protected]>> wrote:

         On Thu, Dec 21, 2017 at 07:52:54PM +0100, Vladimir Vassilev
         wrote:
         > On 12/21/2017 02:20 PM, Juergen Schoenwaelder wrote:
         >
         > > On Thu, Dec 21, 2017 at 02:03:45PM +0100, Vladimir
         Vassilev wrote:
         > > > On 12/21/2017 11:34 AM, Robert Wilton wrote:
         > > >
         > > > > Hi Vladimir,
         > > > >
         > > > > First point of clarification is that this is not
         about running/intended
         > > > > at all.  The contents of running/intended do not
         change in any way
         > > > > depending on whether hardware is present or absent.
         > > > >
         > > > > The section is only concerned with how the
         configuration is applied in
         > > > > operational, and basically says that you cannot apply
         configuration for
         > > > > resources that are missing (which seems reasonable).
         E.g. I cannot
         > > > > configure an IP address on a physical interface that
         isn't there.  Or if
         > > > > the physical interface gets removed then the
         configuration associated
         > > > > with that interface is also removed from operational.
         > > > >
         > > > > Operational isn't validated and data model
         constraints are allowed to be
         > > > > broken (ideally transiently).
         > > > I want to focus on this. IMO giving up schema validity
         for any datastore is an
         > > > unacceptable price. Pre-NMDA devices had full model
         support in operational
         > > > data (all YANG constraints part of the model without
         discrimination were
         > > > enforced).
         > > There was a long debate about the value of returning the true
         > > operational state. What do you do if the operational
         state is invalid?
         > > A server can reject configuration changes if they lead to
         invalid
         > > state, a server can not reject reality.
         > IMO if the model can represent reality then data conforming
         to the model
         > can. If not a better model is needed not a hack that breaks
         the datastore
         > conformance to the YANG model. I do not see how
         > /interfaces/interface/oper-status=not-present was not
         representing the
         > reality of a system with removed line card that is
         configured and ready to
         > resume operation as soon as the line card is reconnected.

         I assume this is all system and implementation specific. If your
         system knows about interfaces that are not present (i.e.,
         there is
         operational state about them), you can report these
         interfaces.  But
         'is configured' is confusing here. I am not sure a line card
         that does
         not exist should be considered configured. But yes, this may
         be system
         specific. Anyway, draft-ietf-netmod-rfc7223bis-01.txt still has
         oper-status 'not-present' - so this seems to be a moot point.

         > > > If this is about to change it will compromise
         interoperability
         > > > and a significant portion of the client implementation
         workload that can be
         > > > automated will need to be coded in hand and tested.
         Unresolved leafrefs,
         > > > undefined behaviour of different implementations
         removing different
         > > > configuration nodes in violation of YANG semantic
         constraints (which I do
         > > > not think can be so clearly separated from the
         syntactic constraints when
         > > > one considers types like leafref, instance-identifier
         etc.) and the
         > > > corresponding side effects based on the server
         implementers' own creativity
         > > > is eventually going to create more problems.
         > > >
         > > > 1. IMO the only acceptable solution is to have YANG
         valid operational
         > > > datastore at all times. operational like any other
         datastore MUST be valid
         > > > YANG data tree and it has to be a system implementation
         task to consider all
         > > > complications resulting from the removal of the
         resources leading to any
         > > > data transformations. If this is difficult or
         impossible other mechanisms to
         > > > flag missing resources should be used (e.g.
         > > > /interfaces/interface/oper-status=not-present) This
         sounds like a useful
         > > > contract providing the value of a standard the
         alternative does not.
         > > As said above, it is impossible to report valid
         operational state if
         > > the operational state is not valid according to the models.
         > >
         > > > 2. Even with the change in 1. I do not see the removal
         of intended
         > > > configuration nodes from operational as a solution
         worth implementing on our
         > > > servers. I do not see a real world plug-and-play
         scenario that can be
         > > > automatically solved without specific additions to the
         models e.g.
         > > > /interfaces/interface/oper-status=not-present is
         oversimplified solution but
         > > > it needs to be extended exactly as much as the solution
         provided by the
         > > > removal of config true; nodes without the sacrifice of
         YANG validity of
         > > > operational.
         > > Your thinking is likely wrong. <operational> reports the
         operational
         > > state. It may have little in common with <intended>.
         Trying to derive
         > > operational from intended is unlikely to work well.
         > The proposal for this solution ("derive operational from
         intended" e.g.
         > merge /interfaces-state in /interfaces) comes from the
         revised datastores
         > draft not me.
         >
         > By definition config true; data represents intent. Reusing
         the model of a
         > config true; data to represent state absent of intent (e.g.
         > /interfaces/interface with origin="or:system") is a hack.
         The hack works
         > fine without compromising the conformance of operational to
         the YANG model
         > as long as certain conditions are met. I am pointing out
         that one of the
         > conditions is to keep all of the intended configuration
         data present in
         > 'operational' and handle missing resources with
         conventional means e.g.
         > /interfaces/interface/oper-status=not-present instead of
         adding the straw
         > that breaks the camel's back.

         I fail to see why you believe all objects that appear in intended
         configuration need to exist in applied configuration. In fact,
         operators told us very clearly that they care about the
         distinction
         between intended and applied config.

         > > > 3. Solutions like /interfaces/interface/admin-state
         stop working. With the
         > > > interface removed you can no longer figure if the
         if-mib has or does not
         > > > have the interface enabled so an operator has to use
         SNMP or wait for a
         > > > replacement line card to be connected to figure this
         bit of information.
         > > At least on my boxes, if I remove a line card, the
         interface also
         > > disappears in SNMP tables. Stuff that is operationally
         not present is
         > > simply operationally not present.
         > >
         > > > My
         > > > interpretation of the MAY as requirement level in sec.
         5.3. The Operational
         > > > State Datastore (<operational>) is that plug-and-play
         solutions can be
         > > > implemented without this limited approach that has the
         same problem as the
         > > > pre-NMDA only now we have to have /interfaces-state to
         keep config false;
         > > > data relevant to hardware that is configured but not
         present:
         > > >
         > > >     configuration data nodes supported in a
         configuration datastore
         > > >     MAY be omitted from <operational> if a server is
         not able to
         > > >     accurately report them.
         > > >
         > > > I realize this discussion comes late. I have stated my
         objections to this
         > > > particular part of the NMDA draft earlier.
         > > I believe there is a conceptual misunderstanding. I think
         there never
         > > was a requirement that a server reports the state of
         hardware that is
         > > not present.
         > "Data relevant to hardware that is configured but not
         present" is different
         > from "state of hardware that is not present". For example
         information
         > indicating when the line card became unavailable, what was
         the reason, or
         > other information like how many packets that had this
         interface as egress
         > destination are being dropped as a result of the removal.

         I think that systems handle non-existing interfaces
         differently. It
         seems that ietf-interfaces is flexible enough to accommodate the
         different styles.

         /js

         --
         Juergen Schoenwaelder           Jacobs University Bremen gGmbH
         Phone: +49 421 200 3587         Campus Ring 1 | 28759 Bremen
         | Germany
         Fax:   +49 421 200 3103
          <http://www.jacobs-university.de/
         <http://www.jacobs-university.de/>>

         _______________________________________________
         netmod mailing list
         [email protected] <mailto:[email protected]>
         https://www.ietf.org/mailman/listinfo/netmod
         <https://www.ietf.org/mailman/listinfo/netmod>






