Hi Benoit,

I have seen your presentations on Service Assurance for Intent-Based Networking 
Architecture and read your drafts with interest 
(draft-claise-opsawg-service-assurance-yang-05 and 
draft-claise-opsawg-service-assurance-architecture-03).  Interesting stuff on 
which I do have a couple of comments.

The basis for the drafts is in essence a proposal for Model-Based Reasoning 
(MBR), in which you capture dependencies between objects and make inferences by 
traversing the corresponding graph.  MBR based on dependency graphs allows one 
to reason about the impact and propagation of the status or health of one 
object on the status or health of dependent objects "downstream" from it.  
Likewise, traversing the same graph in the opposite direction (from the 
"downstream" or dependent objects) allows one to identify potential root causes 
for symptoms observed by those objects, although this does not seem to be your 
main focus.
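
To make the two traversal directions concrete, here is a minimal Python sketch.  
The class and method names (DependencyGraph, impacted_by, 
candidate_root_causes) and the example objects are hypothetical illustrations 
of the general idea, not anything taken from the drafts.

```python
from collections import defaultdict, deque


class DependencyGraph:
    """Hypothetical dependency graph; not the data model from the drafts."""

    def __init__(self):
        self.dependencies = defaultdict(set)  # object -> objects it depends on
        self.dependents = defaultdict(set)    # object -> objects depending on it

    def add_dependency(self, dependent, dependency):
        self.dependencies[dependent].add(dependency)
        self.dependents[dependency].add(dependent)

    @staticmethod
    def _reachable(start, edges):
        # Breadth-first traversal; the "seen" set keeps cycles from looping.
        seen = set()
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        seen.discard(start)
        return seen

    def impacted_by(self, obj):
        """Objects "downstream" of obj, whose health it may affect."""
        return self._reachable(obj, self.dependents)

    def candidate_root_causes(self, obj):
        """Objects "upstream" of obj that could explain its symptoms."""
        return self._reachable(obj, self.dependencies)


# Assumed example topology: a service over a tunnel over two interfaces.
g = DependencyGraph()
g.add_dependency("service", "tunnel")
g.add_dependency("tunnel", "interface-A")
g.add_dependency("tunnel", "interface-B")

print(g.impacted_by("interface-A"))
print(g.candidate_root_causes("service"))
```

Both queries are the same traversal over opposite edge directions, and both 
are only as good as the dependencies that were recorded in the graph.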

While MBR as a concept makes sense and has a long tradition in network 
management, it also comes with a number of considerable issues, and I was 
wondering about your perspective and mitigation strategies for these.  For one, 
its effectiveness depends on the model being "complete".  In most cases, 
there are a myriad of interdependencies that are difficult to capture 
comprehensively.  The model is still useful for many applications as a starting 
point, but rarely captures the full reality.  As long as users are clear about 
that, this is not an issue.  However, my one concern with your model is that 
you use it to draw conclusions about the health of the dependent objects (for 
example, your end-to-end service).  It seems to me that a derived health score 
is no substitute for monitoring the actual health, and should not lull users 
into a false sense of security that, as long as they monitor the components of 
a system or service, they don't need to be concerned with monitoring the system 
or service as a whole.  In reality I believe the value of a derived score, 
while real, is more limited than that.  I believe that this should be clearly 
acknowledged and discussed in the drafts.

A second set of issues concerns the effort of maintaining the graph and of 
continuously updating the dependencies.  In a realistic system you will have 
many objects with even more interdependencies.  Maintaining derived health 
state can become computationally very expensive, which suggests a number of 
mitigation strategies.  First, don't maintain this state continuously, but 
compute it only "on demand".  Second, perhaps don't maintain it on the server 
at all, at least to the extent that you expect the server to be a networking 
device.  It seems much more feasible to perform this type of Model-Based 
Reasoning computation in an Operations Support System (OSS) or application 
outside the network, not within the network.  However, it is not clear that 
YANG models and NETCONF/RESTCONF would be applied there.  It seems to me the 
drafts should clarify where those models are expected to be deployed and how 
they would be kept updated.  As an OSS tool, your proposal makes sense, but 
trying to process this on networking devices strikes me as very heavy, in 
particular given the limitations described in the earlier point.  So, IMHO, you 
may want to consider adding a corresponding section that discusses these 
aspects, specifically in the architecture draft.
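
As a rough illustration of the "on demand" option, the Python sketch below 
computes a derived health score only when someone asks for it, instead of 
updating it on every change.  Everything in it is an assumption made for the 
example: the function name derived_health, the 0-100 score range, the object 
names, and in particular the rule that an object's derived health is the 
minimum of its own score and those of everything it depends on.  The drafts 
may well intend a different aggregation.

```python
def derived_health(obj, own_health, dependencies, _cache=None):
    """Derive the health of obj lazily, at query time.

    own_health:   dict of object -> locally observed score (assumed 0-100)
    dependencies: dict of object -> iterable of objects it depends on
    Aggregation by min() is an assumption for this sketch.
    """
    if _cache is None:
        _cache = {}  # memo valid for this one query only
    if obj in _cache:
        return _cache[obj]
    # Seed the cache with the local score so a dependency cycle terminates.
    _cache[obj] = own_health[obj]
    score = own_health[obj]
    for dep in dependencies.get(obj, ()):
        score = min(score, derived_health(dep, own_health, dependencies, _cache))
    _cache[obj] = score
    return score


# Assumed example data: one degraded interface under an otherwise fine service.
own_health = {
    "service": 100,
    "tunnel": 100,
    "interface-A": 40,
    "interface-B": 90,
}
dependencies = {
    "service": ["tunnel"],
    "tunnel": ["interface-A", "interface-B"],
}

print(derived_health("service", own_health, dependencies))
```

Nothing is stored between queries, so the cost is paid only when a score is 
requested, and the computation can just as well run in an OSS outside the 
network.  The result still says nothing about dependencies that are missing 
from the model.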

Cheers
--- Alex


