Re: [OPSAWG] AD review of draft-ietf-opsawg-service-assurance-yang-07

Rob Wilton (rwilton) Sun, 06 Nov 2022 07:22:06 -0800

Hi Jean,

Just one further question inline ...


> -----Original Message-----
> From: Jean Quilbeuf <jean.quilb...@huawei.com>
> Sent: 19 October 2022 01:10
> To: Rob Wilton (rwilton) <rwil...@cisco.com>; opsawg@ietf.org; draft-ietf-
> opsawg-service-assurance-yang....@ietf.org
> Subject: RE: AD review of draft-ietf-opsawg-service-assurance-yang-07
> 
> Hi Rob,
> Thank you very much for you detailed review.
> 
> The latest version should address most of the comments. The diff is here:
> https://www.ietf.org/rfcdiff?url2=draft-ietf-opsawg-service-assurance-yang-
> 08
> 
> Answers to some unanswered or added details are inline below.
> 
> Thanks Again
> Best,
> Jean
> 
> 
> > -----Original Message-----
> > From: Rob Wilton (rwilton) [mailto:rwil...@cisco.com]
> > Sent: Monday 17 October 2022 13:06
> > To: opsawg@ietf.org; draft-ietf-opsawg-service-assurance-yang....@ietf.org
> > Subject: AD review of draft-ietf-opsawg-service-assurance-yang-07
> >
> > Dear authors,
> >
> > And here is my corresponding AD review of draft-ietf-opsawg-service-
> > assurance-yang-07.  Again, I found the document to be well-written, with
> > mostly minor comments/suggestions, bar one question about the
> symptom
> > reference.
> >
> >
> > Moderate level comments:
> >
> > (1) p 11, sec 3.3.  YANG Module
> >
> >   grouping symptom {
> >     description
> >       "A grouping for the symptoms for a specific subservice.";
> >     leaf symptom-id {
> >       type leafref {
> >         path "/agents/symptoms-description/symptom-id";
> >       }
> >       description
> >         "Identifier of the symptom, to be interpreted according
> >          to the agent identified by the agent-id.";
> >     }
> >
> > Should this path be constrained to the specific agent instance identified by
> > agent-id?  E.g., similar to how the 'id' path is constrained in subservice-
> > reference below.
> >
> 
> Indeed that was a mistake, it's fixed now.
> 
> >
> > (4) p 5, sec 3.2.  Tree View
> >
> >    module: ietf-service-assurance
> >      +--ro assurance-graph-last-change    yang:date-and-time
> >      +--rw subservices
> >      |  +--rw subservice* [type id]
> >      |     +--rw type                                identityref
> >      |     +--rw id                                  string
> >
> > I don't really like to mess with the structure of YANG modules during AD
> (or
> > IESG) review, but I wonder whether having type as part of the subservice
> > key doesn't end up making it more cumbersome to refer to subservices,
> and
> > whether a flat id space might be better (i.e., removing 'type' from the
> > subservice list keys).  But I also don't mind if the author/WG do not want
> to
> > change the structure at this stage.
> >
> 
> I decided not to change it at the moment.
> 
> >
> > (8) p 5, sec 3.2.  Tree View
> >
> >            +--ro subservices* [type id]
> >               +--ro type    -> /subservices/subservice/type
> >               +--ro id      leafref
> >
> > A couple more consistency questions:
> > 1) I note that subservices has been put into a container, but that agents,
> > assured-services are not.
> > 2) The "subservice" list is the singular tense (presumably because it is
> under a
> > container), but agents, assured-services and subservices are not.  I know
> that
> > you have to optimize for XML or JSON readability.
> >
> 
> According to https://datatracker.ietf.org/doc/html/rfc8407#section-4.14.2 ,
> top level lists with large number of nodes should be avoided. So I have
> wrapped them into containers.

Okay.


> 
> >
> > (9) p 6, sec 3.2.  Tree View
> >
> >            +--ro subservices* [type id]
> >               +--ro type    -> /subservices/subservice/type
> >               +--ro id      leafref
> >    The date of last change "assurance-graph-last-change" is read only.
> >    It must be updated each time the graph structure is changed by
> >    addition or deletion of subservices, dependencies or modification of
> >    their configurable attributes.  Such modifications correspond to a
> >    structural change in the graph.  The date of last change is useful
> >    for a client to quickly check if there is a need to update the graph
> >    structure.  A change in the health-score or symptoms associated to a
> >    service or subservice does not change the structure of the graph and
> >    thus has no effect on the date of last change.
> >
> > Presumably, you considered, and dismissed, having a top-level timestamp
> to
> > track whether any changes in symptoms or health-score have occurred
> > within any sub-service?  Perhaps this doesn't end up being useful in
> practice?
> >
> 
> The idea behind that timestamp was that the graph structure is "hard" to
> maintain using updates from streaming telemetry (need to keep track of
> dependencies and sometimes recursively delete orphan nodes), thus having
> a way to fully synchronize via this timestamp can simplify the algorithm.
> Scores and symptom changes are easier to track via a simple subscription.
> Also our assumption is that graph structure changes less frequently than
> scores and symptoms.

Okay, makes sense.


> 
> >
> > (11) p 6, sec 3.2.  Tree View
> >
> >    The presence of "maintenance-contact" field inhibits the emission of
> >    symptoms for that subservice and subservices that depend on them.
> >    See Section 3.6 of [I-D.ietf-opsawg-service-assurance-architecture]
> >    for a more detailed discussion.
> >
> > This wasn't obvious to me from the leaf name, and I wonder whether it
> > would have been better to have a presence-container "in-maintenance"
> > with a child "contact" leaf.
> 
> Yes, it's more readable.

Thanks.

> 
> >
> > (12) p 6, sec 3.2.  Tree View
> >
> >    The "health-score" contains a value normally between 0 and 100
> >    indicating how healthy the subservice is.  The special value -1 can
> >    be used to specify that no value could be computed for that health-
> >    score, for instance if some metric needed for that computation could
> >    not be collected.
> >
> > Because this is an enum, this would often/normally be encoded as the
> string
> > "missing" rather than as -1.
> 
> I am hoping that YANG-aware deserialization will replace the value -1 by the
> enum name, i.e. "missing" when targeting a human. The reason to use a
> number is for homogeneity with the health score, thinking of time series
> databases which fail if a time series changes type (int -> string for 
> instance)
> over time.

This is okay, but it does mean that the deserialization code will need to 
potentially spot that it may be the string value "missing" and then decide to 
convert that a reserved integer, presumably -1.  E.g., it feels like the union 
type of integer + string just means that you probably need slightly more code 
when streaming it into a time series database?

Given that this leaf is marked as being optional in the schema, then it could 
just be represented as the range 0 to 100, and if the server doesn't know the 
value then it doesn't provide any value at all.
Or alternatively, it could be modelled as a (perhaps mandatory true) leaf, with 
the range -1 to 100, with a default value of -1, and an explanation in the 
description that the value -1 is used to indicate that no health score is being 
provided.

But there are pretty minor comments on the encoding.  Please let me know if you 
would like to change this and post a -09, or if you would like me to last call 
-08 as it is?

> 
> >
> > (13) p 7, sec 3.2.  Tree View
> >
> >    The "health-score" contains a value normally between 0 and 100
> >    indicating how healthy the subservice is.  The special value -1 can
> >    be used to specify that no value could be computed for that health-
> >    score, for instance if some metric needed for that computation could
> >    not be collected.
> >    The "symptoms-history-start" is the cutoff date for reporting
> >    symptoms.  Symptoms that were terminated before that date are not
> >    reported anymore in the model.
> >
> > Presumably it is implementation specific about how/when the symptoms
> get
> > cleared (i.e., when the symptoms-history-start is updated).  I did wonder
> > whether a YANG RPC could be defined to reset this.  But I might be
> confuses
> > as to who is actually providing this timestatmp, e.g., is it the monitored
> > device (or SAIN agent), or is it the SAIN Collector.  This could also be
> deferred
> > as future work.
> >
> 
> For now it is to the discretion of the entity exporting the symptoms (could
> be agent exporting to collector or collector exporting to the rest of the
> world) to decide that cutoff date.
> Indeed a more sophisticated agent or collector might have a rpc to set the
> symptoms retention period, but I think that it is out of scope for this draft.

Ok.

Regards,
Rob



> 
> 
> > Thanks,
> > Rob

_______________________________________________
OPSAWG mailing list
OPSAWG@ietf.org
https://www.ietf.org/mailman/listinfo/opsawg

Re: [OPSAWG] AD review of draft-ietf-opsawg-service-assurance-yang-07

Reply via email to