Hi Roman,

Thank you very much for the review and comments! Please find my responses 
and proposed modifications inline. The modifications will be reflected in 
version -12. 

Best regards,
Haoyu

-----Original Message-----
From: Roman Danyliw via Datatracker <[email protected]> 
Sent: Tuesday, November 30, 2021 6:10 PM
To: The IESG <[email protected]>
Cc: [email protected]; [email protected]; [email protected]; 
[email protected]; [email protected]
Subject: Roman Danyliw's Discuss on draft-ietf-opsawg-ntf-11: (with DISCUSS and 
COMMENT)

Roman Danyliw has entered the following ballot position for
draft-ietf-opsawg-ntf-11: Discuss

When responding, please keep the subject line intact and reply to all email 
addresses included in the To and CC lines. (Feel free to cut this introductory 
paragraph, however.)


Please refer to 
https://www.ietf.org/blog/handling-iesg-ballot-positions/
for more information about how to handle DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-opsawg-ntf/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

Thank you for being responsive to the SECDIR review thread to improve the 
security considerations text.  Specifically, 
https://mailarchive.ietf.org/arch/msg/secdir/GUvFWXP7n9IjXW8xlIdMS5ZE5u0/.

Even after these edits, there are a few straightforward ambiguities to clear up.

(a) Section 2.  "When a network's endpoints do not represent individual users 
(e.g. in industrial, datacenter, and infrastructure contexts), network 
operations can often benefit from large-scale data collection without breaching 
user privacy."

Is network telemetry architecture being restricted to such a limited 
applicability?  To quote the original SECDIR thread, is this saying "The 
Network Telemetry Framework is not applicable to networks whose endpoints 
represent individual users, such as general-purpose access networks"?  If so, 
I'd recommend being that explicit.

HS> Thank you for pointing that out. We'll make it explicit. 


(b) Section 2.1.  "To preserve user privacy, the user packet content should not 
be collected." This is a great principle, but extremely nuanced and potentially 
complicated to implement.  Is this saying (using the words of this framework), 
"To preserve the privacy of end-users, no user packet content should be 
collected.  Specifically, the data objects generated, exported, and collected 
by the Network Telemetry Framework should not include any packet payload from 
traffic associated with end-users systems"?

HS> We will adopt the new wording you suggested above. 


(c) Section 2.5.  Please use stronger and consistent language.

OLD
Disclaimer: large-scale network data collection is a major threat to user 
privacy [RFC7258].  The network telemetry framework presented in this document 
should not be applied to collect and retain individual user data or any data 
that can identify end users without consent.
Any data collection or retention using the framework must be tightly limited to 
protect user privacy.

NEW
Large-scale network data collection is a major threat to user privacy and may 
be indistinguishable from pervasive monitoring [RFC7258].  The network 
telemetry framework presented in this document must not be applied to 
generating, exporting, collecting, analyzing or retaining individual user data 
or any data that can identify end users or characterize their behavior without 
consent.

The principles described in (a), (b) and (c) seem sufficiently important that 
they shouldn't be scattered across the document.  Please either make an 
applicability statement section early in the document or a dedicated privacy 
consideration section.

HS> Thank you for the suggestion. The applicability statement is now a new 
subsection, 1.1.

----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

(Apologies if any of the below section numbers are wrong.  I conducted most of 
my review on -10 and then -11 was published which renumbered the document)

Thanks to Alexey Melnikov for the SECDIR review.

I'm a bit confused by the framing of this document.  It seems to me to be 
suggesting that "OAM" is tied to a series of static technologies and 
practices, and a set of new practices called "network telemetry" are needed.  I 
don't disagree with the idea that network management practices need to evolve, 
and that the "networks of the future" will look different than today.  Relying 
on BCP 161 (RFC 6291), I took OAM to mean an evolving set of practices and 
technology.  Using Section 3 of BCP 161, O + A + M seemed like a contextual set 
of operations that would be done now and still required in networks of the 
future.  The document acknowledges that there is some ambiguity in "network 
telemetry".  I think it needs to equally acknowledge that the same is true of 
OAM, and that RFC7276 is not OAM.  In the aggregate, I don't think the text 
realizes the clarity that it set out to provide by defining "key 
characteristics of network telemetry which set a clear distinction from the 
conventional network OAM and show that some conventional OAM technologies can 
be considered a subset of the network telemetry technologies.".  To be clear, 
I'm not raising an objection to many of the properties linked to network 
telemetry.  Instead, I think the clarity of message is getting diluted because 
a very particular distinction is trying to be made (OAM vs. network telemetry) 
and it isn't clear.  See below for specifics.

** Section 1.
... using a wide variety of techniques including machine learning, data 
analysis, and correlation.

ML, data analysis and correlation are unlike things.  ML is a particular AI 
technique, data analysis is a generic description of an activity, and is 
correlation intended to be a statistical technique?

HS>  change to "... using a wide variety of data analytical techniques."

** Section 1
   Network telemetry extends beyond the historical network Operations,
   Administration, and Management (OAM) techniques and expects to
   support better flexibility, scalability, accuracy, coverage, and
   performance.

This seems hypothetical depending on the definition of which technologies are 
considered in scope of network telemetry and OAM.

HS> change "historical" to "classical". We admit the term "network telemetry" 
is just an extension of OAM with a bigger scope and more future oriented. So 
many classical OAM techniques are also considered as network telemetry, as 
stated in the document. 

** Section 2.

Today one can access advanced big data analytics capability through a
   plethora of commercial and open source platforms (e.g., Apache
   Hadoop), tools (e.g., Apache Spark), and techniques (e.g., machine
   learning).  Thanks to the advance of computing and storage
   technologies, network big data analytics gives network operators an
   opportunity to gain network insights and move towards network
   autonomy.
In trying to contextualize this observation, where is this capability relative 
to Figure 1?  In general, I would recommend using this reference architecture 
when assessing the ecosystem.

HS> This capability is mainly supported by the "Network Operation Applications" 
block in Figure 1. Add "Once the network operation applications acquire the 
data from these modules, they can apply data analytics and take actions." in 
Section 3 for clarification. 

** Section 2.

However, while the data processing capability is improved and
   applications are hungry for more data ...

What does this mean, and which applications are "hungry for more data"?  Is a 
reference possible here?

HS> Rephrase to "... and applications require more data to function better, ..."

** Section 2.  Editorial.  s/concerned in the context/relevant in this document/

HS> Done

** Section 2.1
Less but higher quality data are often better
   than lots of low quality data.

This seems like a broad generalization that doesn't consider the application 
and the cost of acquisition or processing.

HS> Rephrase to "In some cases, if the cost is acceptable, less but higher 
quality data are preferred over lots of low quality data."

** Section 2.2.

The ultimate goal is to achieve the
      ideal security with no, or only minimal, human intervention.

What is "ideal" security?

HS> removed the word "ideal"

** Section 2.2.
While machine learning technologies can be used for
      root cause analysis, it up to the network to sense and provide the
      relevant diagnostic data which are either actively fed into, or
      passively retrieved by, machine learning applications.

This text is asymmetric with the other bullets since they don't discuss 
specific techniques.  Personally, it also seems odd to include this text as 
there are other ways to do root cause analysis beyond ML (to include other AI 
approaches).

HS> Rephrase to "While technologies such as machine learning can be used for 
root cause analysis, it is up to the network to sense and provide the relevant 
diagnostic data which are either actively fed into, or passively retrieved by, 
the root cause analysis applications."

** Section 2.3
   For a long time, network operators have relied upon SNMP [RFC3416],
   Command-Line Interface (CLI), or Syslog to monitor the network.  Some
   other OAM techniques as described in [RFC7276] are also used to
   facilitate network troubleshooting.
...
   These challenges were addressed by newer standards and techniques
   (e.g., IPFIX/Netflow, PSAMP, IOAM, and YANG-Push) and more are
   emerging.  These standards and techniques need to be recognized and
   accommodated in a new framework.

This section is an exemplar of the disconnect I noted in the definitions of 
OAM.  The first paragraph presents a narrow view of currently used (albeit
older) network monitoring technologies (SNMP, CLI, Syslog).  However, in the 
closing paragraph, the text names more modern technologies I would also 
consider OAM, and these technologies could meet some of the challenges 
mentioned in this section.  Furthermore, some of these "newer standards" are 
framed as things that need to be "recognized".  This is puzzling because my 
understanding was that technologies like IPFIX/Netflow have been very widely 
deployed for quite some time now.  What's the new framework needed?

HS> We expect that more new techniques will emerge and that new applications 
(e.g., interactive telemetry, programmable telemetry) will be used, which goes 
beyond the conventional understanding of network OAM. While we summarize 
all of these under the new term "network telemetry", we admit it is not a 
brand-new concept that came from nowhere; a long line of related techniques and 
standards that one may consider OAM has already been developed. To emphasize 
the new features and new potential, we think the term "network telemetry" is 
better than "OAM".

** Section 2.4
Network telemetry covers the conventional network OAM and
   has a wider scope.

Can the text be more specific about the way in which network telemetry is 
wider?  I thought OAM was rather ambiguous.

HS> The next sentence "It is expected that network telemetry can provide the 
necessary network insight for autonomous networks and address the shortcomings 
of conventional OAM techniques." is meant to provide a specific explanation. Add 
"For instance, ..." to make it clear.

** Section 2.4
Hence, the network telemetry can directly
   trigger the automated network operation, while in contrast some
   conventional OAM tools are designed and used to help human operators
   to monitor and diagnose the networks and guide manual network
   operations.

I'm not sure if this is a fair generalization.  Even "older technologies" like 
SNMP currently trigger automated responses based on the values they return.

HS> We admit that some old technologies can find new uses in an automated 
environment. That's why we say "some" and refer only to the original intent 
behind their design. In other words, any technique that is used in the new 
context is certainly a network telemetry technique. 

** Section 2.4.  Per "data fusion," in which part of Figure 1 is this 
happening?

HS> It should be in the "network operation applications" block. 

** Section 2.5.

Network data analytics and machine-learning technologies are applied
   for network operation automation, relying on abundant and coherent
   data from networks.

-- What is the difference between a network data analytics system and ML 
technologies?  Isn't analytics a superset of ML?

HS> Rephrase to "Network data analytics (e.g., machine learning) is applied ..."

-- What is coherent data?

HS> Here "coherent" means united as or forming a whole.

** Section 2.5.
In detail, such a framework would benefit application
   development for the following reasons:

It might be helpful to level set what an application is in this context.  Is 
this the "network operations application" of Figure 1?

HS> Yes. Expand "application" to "network operation application" for 
clarification. 

** Section 2.5
All the use cases and
      applications are better to be supported uniformly and coherently
      under a single intelligent agent

-- Editorial.  There is a missing word which leads to this sentence not parsing.

-- What's the basis for asserting that a "single intelligent agent" is the best 
approach?

-- Maybe the issue is of semantics, what is an "intelligent agent" in this 
context?

HS> Remove " under a single intelligent agent" to avoid confusion. 

** Section 2.5.

Network visibility presents multiple viewpoints

and

Efficient data fusion is critical for applications to reduce the
      overall quantity of data and improve the accuracy of analysis.

Are these generalizations expected to be true across the broad use cases?

HS> We think the first point is valid.  The second point has been corrected to 
"efficient data aggregation" in -11. 

** Figure 2.  For the management plane, the data model module has MIB and 
syslog listed, but the data encodings are GPB, JSON and XML.  These data models 
and encodings don't line up (i.e., MIBs and syslog typically don't rely on GPB, 
JSON or XML).

HS> We explained that this table is not exhaustive and does not mean to provide 
1-to-1 matching. It only provides some representative examples. 

** Section 3.1.  Where do network security applications such as WAFs, IDS/IPS/ 
NGF, DLP, web-proxies, and pDNS fit into this taxonomy?

HS> Security applications are one branch of telemetry applications. Like the 
other telemetry applications, their parts are spread across all components of 
the framework.

** Section 3.1.* These sections inconsistently describe properties/requirements 
for an architectural element and their challenges (but no solutions or 
requirements for) a given element.  As a result, I had trouble understanding 
what an implementer should understand about these components.  It would have 
been clearer if the different modules had common and module specific 
requirements.

HS> We describe the common challenges/requirements in earlier sections. Here we 
want to summarize some specific requirements pertaining to each element. Of 
course, some requirements are relevant to the other elements as well.

** Section 3.1.1.  Per the requirements of "Convenient Data Subscription", 
"Structured Data", etc. why wouldn't those be desirable requirements for all 
four of the modules?

HS> We note that "the requirements may pertain across all telemetry modules; 
however, we emphasize those that are most pronounced for a  particular plane."

** Section 3.1.3.  Providing "timely data" and "structured data", seem like the 
restatements of Section 4.1.1's "structure data" and "high speed transport". Is 
this a common requirement?

HS> Yes indeed.

** Section 3.1.3.  Why wouldn't it be desirable for all of the modules to 
support the incremental deployment noted here?

HS> We note that "Although not specific to the forwarding plane, these 
challenges are more difficult to the forwarding plane because of the limited 
resource and flexibility."

** Section 3.2.
   *  Data Query, Analysis, and Storage: This component works at the
      application layer.

I need a bit of topological orientation.  What would the application layer of, 
say, a "forwarding plane" or "external data" be?  What are the other layers?

HS> Thanks for the catch. We actually mean "the application block" in 
Figure 1. We will correct it.  

** Section 5.  Recommend explicitly saying that this document doesn't define 
specific technologies to shift the responsibility of specific considerations.

HS> We will make it explicit at the beginning of the document. 

OLD
   Security considerations for networks that use telemetry methods may
   include:

NEW

This document proposes a conceptual architecture for collecting, transporting, 
and analyzing a wide variety of data sources in support of network 
applications.  The protocols, data formats, and configurations chosen to 
implement this framework will dictate the specific Security Considerations. 
These considerations may include:

HS> Done

** Section 5.

OLD
   *  Telemetry data stores, storage encryption and methods of access;

NEW
   *  Telemetry data stores, storage encryption, methods of access, and
   retention practices.

HS> Done

_______________________________________________
OPSAWG mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/opsawg
