Hi Haoyu, Thanks for the prompt updates; they look good to me. I've requested IETF LC.
Regards, Rob > -----Original Message----- > From: Haoyu Song <[email protected]> > Sent: 13 October 2021 16:05 > To: Rob Wilton (rwilton) <[email protected]>; draft-ietf-opsawg- > [email protected] > Cc: [email protected]; 'opsawg-chairs' <[email protected]> > Subject: RE: AD review of draft-ietf-opsawg-ntf-07 [2] > > Hi Rob, > > Thank you very much for your second review! We have made all the > modifications you pointed out. > https://datatracker.ietf.org/doc/draft-ietf-opsawg-ntf/09/ > Please help to move it forward. Thanks again! > > Best regards, > Haoyu > > -----Original Message----- > From: Rob Wilton (rwilton) <[email protected]> > Sent: Tuesday, October 12, 2021 2:08 PM > To: Haoyu Song <[email protected]>; draft-ietf-opsawg- > [email protected] > Cc: [email protected]; 'opsawg-chairs' <[email protected]> > Subject: RE: AD review of draft-ietf-opsawg-ntf-07 [2] > > Hi Haoyu, > > Thanks for applying the markups. > > I've given -08 another read through, and I think that there are still some > tweaks to the grammar that I would recommend. I've also included some > automated warnings from a grammar tool that are probably also worth fixing > (you would get similar warnings during IESG review anyway). I think that > once you have fixed these we should be ready to go. > > 3.2. Use Cases > > * Security: Network intrusion detection and prevention systems need > to monitor network traffic and activities and act upon anomalies. > Given increasingly sophisticated attack vector coupled with > increasingly severe consequences of security breaches, new tools > and techniques need to be developed, relying on wider and deeper > visibility into networks. The ultimate goal is to achieve the > ideal security with no or minimal human intervention. > > RW: suggest > no or minimal human => no, or only minimal, human intervention > > > * Last sentence suggest: > > The ultimate goal is to achieve the ideal security with no, or only minimal, > human intervention. > > networks. 
While a policy or an intent is enforced, the compliance > needs to be verified and monitored continuously relying on > visibility that is provided through network telemetry data, any > violation needs to be reported immediately, and updates need to be > applied to ensure the intent remains in force. > > RW: Suggest: > > While a policy or intent is enforced, the compliance > needs to be verified and monitored continuously by relying on > visibility that is provided through network telemetry data. Any > violation must be notified immediately, potentially resulting in > updates to how the policy or intent is applied in the network to ensure > that it remains in force, or otherwise alerting the network administrator > to the policy or intent violation. > > * ... > overwhelming. While machine learning technologies can be used for > root cause analysis, it up to the network to sense and provide the > relevant diagnostic data which are either actively fed into or > passively retrieved by machine learning applications. > > RW: Suggest: > actively fed into or passively retrieved by => actively fed into, or passively > retrieved, by > > > 4. Network Telemetry Framework > > RW: (Section 4.3)are applied. => (Section 4.3) are applied. > > > 4.1. Top Level Modules, diagram: > > RW: > 1. I'm still not sure that I would list "ACL" under a control plane object. > 2. Thinking about it, I think that this table would be more consistent if the > columns were ordered with management plane before control plane, e.g.,: > +---------+--------------+--------------+---------------+-----------+ > | Module | Management | Control | Forwarding | External | > | | Plane | Plane | Plane | Data | > > > 4.1.1. Management Plane Telemetry > > * Convenient Data Subscription: An application should have the > freedom to choose the data export means such as the data types (as > described in Figure 4) and the export means and frequency (e.g., > on-change or periodic subscription). 
> > RW: > I don't think that the client is really choosing the data types, but > instead choosing which data to export, and how it is exported. How about: > > Convenient Data Subscription: An application should have the > freedom to choose which data is exported (see section 4.3) and the > means and frequency of how that data is exported (e.g., > on-change or periodic subscription). > > * High Speed Data Transport: In order to keep up with the velocity > of information, a server needs to be able to send large amounts of > data at high frequency. Compact encoding formats or data > compression schemes are needed to compress the data and improve > the data transport efficiency. The subscription mode, by > replacing the query mode, reduces the interactions between clients > and servers and helps to improve the server's efficiency. > > RW: > are needed to compress the data => are needed to reduce the quantity of > data > > > 4.1.2. Control Plane Telemetry > > RW: > (e.g., the IGP monitoring => (e.g., IGP monitoring > > > 4.1.3. Forwarding Plane Telemetry > > RW: Perhaps: > between forwarding and telemetry => between forwarding performance and > telemetry > > RW: > described in Appendix => Please add a reference to the section where > postcard telemetry is described, perhaps A3.5? > > Very minor nit: > Search for "e.g. " and replace with "e.g., " > > 4.2. Second Level Function Components > > RW: Sorry, I had a typo in my previous suggested text, correction: > > The telemetry module as each plane => The telemetry module at each plane > > RW: > provisioned in device => provisioned in the device > > 4.3. Data Acquisition Mechanism and Type Abstraction > > * Event-triggered Data: The data are conditionally acquired based on > the occurrence of some events. For example, a network interface > changing its operational state from up to down can be a trigger > event. Such data can be actively pushed through subscription or > passively polled through query. 
There are many ways to model > events, including using Finite State Machine (FSM) or Event > Condition Action (ECA) [I-D.wwx-netmod-event-yang]. > > RW: For example, a network interface changing its operational state from up > to down can be a trigger event. => > > An example of event-triggered data could be an interface changing > operational state between up and down. > > > 4.4. Mapping Existing Mechanisms into the Framework > > RW: Figure 5: Existing Work Mapping II => Figure 5: Existing Work Mapping > > > 6. Security Considerations > > RW: vulnerability. => vulnerabilities. > > Spellings to check: > de-facto, > exensive, > secuirty, > telemtry, > tradeoff, > > Grammar Warnings: > Section: abstract, draft text: > This document clarifies the terminologies and classifies the modules and > components of a network telemetry system from several different > perspectives. > Warning: Consider using several. > Suggested change: "several" > > Section: 1, draft text: > All the modules are internally structured in the same way, including > components that allow to configure data sources with regards to what data > to generate and how to make that available to client applications, > components that instrument the underlying data sources, and components > that perform the actual rendering, encoding, and exporting of the generated > data. > Warning: Use in regard to, with regard to, or more simply regarding. > Suggested change: "in regard to" > > Section: 2, draft text: > - gRPC Remote Procedure Call, a open source high performance RPC > framework that gNMI is based on. > Warning: Use an instead of 'a' if the following word starts with a vowel > sound, e.g. 'an article', 'an hour' > Suggested change: "an" > > Section: 3.2, draft text: > The ultimate goal is to achieve the ideal security with no or minimal human > intervention. > Warning: Did you mean now (=at this moment) instead of 'no' (negation)? 
> Suggested change: "now" > > Section: 3.3, draft text: > - Some of the conventional OAM techniques (e.g., CLI and Syslog) lack a > formal data model. > Warning: If the text is a generality, 'of the' is not necessary. > Suggested change: "Some" > > Section: 3.5, draft text: > A telemetry framework collects together all of the telemetry-related works > from different sources and working groups within IETF. > Warning: Consider using all the. > Suggested change: "all the" > > Section: 4.1, draft text: > Some of the operational states can only be derived from data plane data > sources such as the interface status and statistics. > Warning: If the text is a generality, 'of the' is not necessary. > Suggested change: "Some" > > Section: 4.1.3, draft text: > This raises some challenges to the network data plane devices where the first > hand data originates. > Warning: 'first hand' seems to be a compound adjective in front of a noun. > Use a hyphen: first-hand. > Suggested change: "first-hand" > > Section: 4.1.3, draft text: > While supporting network visibility is important, the telemetry is just an > auxiliary function, and it should strive to not impede normal traffic > processing and forwarding (i.e., the forwarding behavior should not be > altered and the tradeoff between forwarding and telemtry should be well > balanced). > Warning: This word is normally spelled with hyphen. > Suggested change: "well-balanced" > > Section: 6, draft text: > For example, telemetry data can be manipulated to exhaust various network > resources at each plane as well as the data consumer; falsified or tampered > data can mislead the decision making and paralyze networks; wrong > configuration and programming for telemetry is equally harmful. > Warning: This word is normally spelled with hyphen. > Suggested change: "decision-making" > > Section: 6, draft text: > Some of the security considerations highlighted above may be minimized or > negated with policy management of network telemetry. 
> Warning: If the text is a generality, 'of the' is not necessary. > Suggested change: "Some" > > Section: A.1.2, draft text: > gRPC is an [RFC7540] based open source micro service communication > framework. > Warning: This word is normally spelled as one. > Suggested change: "microservice" > > Section: A.2.1, draft text: > The BGP routes (including [RFC7854], [I-D.ietf-grow-bmp-adj-rib-out], and [I- > D.ietf-grow-bmp-local-rib] are encapsulated in the BMP Route Monitoring > Message and the BMP Route Mirroring Message, providing both an initial > table dump and real-time route updates. > Warning: Unpaired symbol: ')' seems to be missing > > Section: A.3.1, draft text: > Since networks offer rich sets of network performance measurement data > (e.g packet counters), traditional approaches run into limitations. > Warning: The abbreviation e.g. (= for example) requires two periods. > Suggested change: "e.g.," > > Section: A.4.1, draft text: > For example, a sports event takes place and some unexpected movement > makes it highly interesting and many people connects to sites that are > reporting on the event. > Warning: Consider using an extreme adjective for 'interesting'. > Suggested change: "fascinating" > > Section: A.4.1, draft text: > For example, a sports event takes place and some unexpected movement > makes it highly interesting and many people connects to sites that are > reporting on the event. > Warning: If 'people' is plural here, don't use the third-person singular > verb. > Suggested change: "connect" > > Section: A.4.1, draft text: > Additional types of detector types can be added to the system but they will > be generally the result of composing the properties offered by these main > classes. > Warning: Use a comma before 'but' if it connects two independent clauses > (unless they are closely connected and short). 
> Suggested change: "system, but" > > Thanks, > Rob > > > > > -----Original Message----- > > From: Haoyu Song <[email protected]> > > Sent: 08 October 2021 00:15 > > To: Rob Wilton (rwilton) <[email protected]>; draft-ietf-opsawg- > > [email protected] > > Cc: [email protected]; 'opsawg-chairs' <[email protected]> > > Subject: RE: AD review of draft-ietf-opsawg-ntf-07 [2] > > > > Hi Rob, > > > > We have updated the draft according to your review suggestions and uploaded > > the -08 version. In the new revision we believe all your suggestions/questions > > have been addressed. Please let me know if you have further questions. Thank > > you very much! > > > > Best regards, > > Haoyu > > > > > > ------------------------------------------------- > > A new version of I-D, draft-ietf-opsawg-ntf-08.txt has been successfully > > submitted by Haoyu Song and posted to the IETF repository. > > > > Name: draft-ietf-opsawg-ntf > > Revision: 08 > > Title: Network Telemetry Framework > > Document date: 2021-10-07 > > Group: opsawg > > Pages: 40 > > URL: https://www.ietf.org/archive/id/draft-ietf-opsawg-ntf-08.txt > > Status: https://datatracker.ietf.org/doc/draft-ietf-opsawg-ntf/ > > Htmlized: https://datatracker.ietf.org/doc/html/draft-ietf-opsawg-ntf > > Diff: https://www.ietf.org/rfcdiff?url2=draft-ietf-opsawg-ntf-08 > > > > > > -----Original Message----- > > From: Haoyu Song > > Sent: Wednesday, October 6, 2021 9:14 AM > > To: Rob Wilton (rwilton) <[email protected]>; draft-ietf-opsawg- > > [email protected] > > Cc: [email protected] > > Subject: RE: AD review of 
draft-ietf-opsawg-ntf-07 [2] > > > > Hi Rob, > > > > Thank you very much for the review! We'll update the draft as you suggested. > > > > Best regards, > > Haoyu > > > > -----Original Message----- > > From: Rob Wilton (rwilton) <[email protected]> > > Sent: Wednesday, October 6, 2021 3:55 AM > > To: [email protected] > > Cc: [email protected] > > Subject: RE: AD review of draft-ietf-opsawg-ntf-07 [2] > > > > Sigh, this also appears to be truncated in my email client. > > > > To be sure that you see all the comments (i.e., to the end of the document), > > please see the previous attachment. The full email can also be seen in > > the archives at > > https://mailarchive.ietf.org/arch/msg/opsawg/WDnVtM_vLm15X28OTEwI9Q6gfx0/ > > > > Regards, > > Rob > > > > > > -----Original Message----- > > From: Rob Wilton (rwilton) <[email protected]> > > Sent: 06 October 2021 11:48 > > To: [email protected] > > Cc: [email protected] > > Subject: AD review of draft-ietf-opsawg-ntf-07 [2] > > > > Hi, > > > > > > > > Here is my belated AD review of draft-ietf-opsawg-ntf-07.txt. [Text file with > > comments attached in case this also gets truncated.] > > > > > > > > I would like to thank you for the effort that you have put into this document, > > and apologise for my long delay in reviewing it. 
> > > > > > > > Broadly, I think that this is a good and useful framework, but in some of the > > latter parts of the document it seems to give prominence to protocols that I > > don't think have IETF consensus behind them yet (particularly DNP). I have > > flagged specific concerns in comments inline within the document, but I think > > that the document will have better accuracy/longevity if text about the potential > > technologies is mostly kept to the appendices. > > > > > > > > There were quite a lot of cases where the text doesn't scan, or read easily, > > particularly in the latter sections of this document, although I acknowledge > > that none of the authors appear to be native English speakers. Ideally, these > > sorts of issues would have been highlighted and addressed during WG LC. > > Although the RFC editor will improve the language of the documents, making > > the improvements now before IESG review will aid its passage, and hopefully > > result in a better document when it is published. I have flagged and proposed > > alternative text/grammar where possible. Once you have made the markups > > and resolved the issues/questions that I have raised then I can run it through a > > grammar checking tool (Lars will run an equivalent tool during IESG review > > anyway ...) > > > > > > > > All of my comments are directly inline, please search for "RW" or "RW:" > > > > > > > > > > > > > > > > > > > > OPSAWG H. Song > > > > Internet-Draft Futurewei > > > > Intended status: Informational F. Qin > > > > Expires: August 23, 2021 China Mobile > > > > P. Martinez-Julia > > > > NICT > > > > L. Ciavaglia > > > > Nokia > > > > A. Wang > > > > China Telecom > > > > February 19, 2021 > > > > > > > > > > > > Network Telemetry Framework > > > > draft-ietf-opsawg-ntf-07 > > > > > > > > Abstract > > > > > > > > Network telemetry is a technology for gaining network insight and > > > > facilitating efficient and automated network management. 
It > > > > encompasses various techniques for remote data generation, > > > > collection, correlation, and consumption. This document describes an > > > > architectural framework for network telemetry, motivated by > > > > challenges that are encountered as part of the operation of networks > > > > and by the requirements that ensue. Network telemetry, as > > > > necessitated by best industry practices, covers technologies and > > > > protocols that extend beyond conventional network Operations, > > > > > > > > Administration, and Management (OAM). The presented network > > > > telemetry framework promises flexibility, scalability, accuracy, > > > > coverage, and performance. In addition, it facilitates the > > > > implementation of automated control loops to address both today's and > > > > tomorrow's network operational needs. This document clarifies the > > > > terminologies and classifies the modules and components of a network > > > > telemetry system from several different perspectives. The framework > > > > and taxonomy help to set a common ground for the collection of > > > > related work and provide guidance for related technique and standard > > > > developments. > > > > > > > > RW: > > > > I would suggest condensing the abstract to the following, and move the > other > > text to the introduction if it is not already covered there. > > > > > > > > Network telemetry is a technology for gaining network insight and > > > > facilitating efficient and automated network management. It > > > > encompasses various techniques for remote data generation, > > > > collection, correlation, and consumption. This document describes an > > > > architectural framework for network telemetry, motivated by > > > > challenges that are encountered as part of the operation of networks > > > > and by the requirements that ensue. 
This document clarifies the > > > > terminologies and classifies the modules and components of a network > > > > telemetry system from several different perspectives. The framework > > > > and taxonomy help to set a common ground for the collection of > > > > related work and provide guidance for related technique and standard > > > > developments. > > > > > > > > > > > > Status of This Memo > > > > > > > > This Internet-Draft is submitted in full conformance with the > > > > provisions of BCP 78 and BCP 79. > > > > > > > > Internet-Drafts are working documents of the Internet Engineering > > > > Task Force (IETF). Note that other groups may also distribute > > > > working documents as Internet-Drafts. The list of current Internet- > > > > Drafts is at https://datatracker.ietf.org/drafts/current/. > > > > > > > > > > > > > > > > > > > > Song, et al. Expires August 23, 2021 [Page 1] > > > > > > > > > > Internet-Draft Network Telemetry Framework February 2021 > > > > > > > > > > > > Internet-Drafts are draft documents valid for a maximum of six months > > > > and may be updated, replaced, or obsoleted by other documents at any > > > > time. It is inappropriate to use Internet-Drafts as reference > > > > material or to cite them other than as "work in progress." > > > > > > > > This Internet-Draft will expire on August 23, 2021. > > > > > > > > Copyright Notice > > > > > > > > Copyright (c) 2021 IETF Trust and the persons identified as the > > > > document authors. All rights reserved. 
> > > > > > > > This document is subject to BCP 78 and the IETF Trust's Legal > > > > Provisions Relating to IETF Documents > > > > (https://trustee.ietf.org/license-info) in effect on the date of > > > > publication of this document. Please review these documents > > > > carefully, as they describe your rights and restrictions with respect > > > > to this document. Code Components extracted from this document must > > > > include Simplified BSD License text as described in Section 4.e of > > > > the Trust Legal Provisions and are provided without warranty as > > > > described in the Simplified BSD License. > > > > > > > > Table of Contents > > > > > > > > 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 > > > > 2. Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . 4 > > > > 3. Background . . . . . . . . . . . . . . . . . . . . . . . . . 6 > > > > 3.1. Telemetry Data Coverage . . . . . . . . . . . . . . . . . 7 > > > > 3.2. Use Cases . . . . . . . . . . . . . . . . . . . . . . . . 7 > > > > 3.3. Challenges . . . . . . . . . . . . . . . . . . . . . . . 9 > > > > 3.4. Network Telemetry . . . . . . . . . . . . . . . . . . . . 10 > > > > 4. The Necessity of a Network Telemetry Framework . . . . . . . 12 > > > > 5. Network Telemetry Framework . . . . . . . . . . . . . . . . . 13 > > > > 5.1. Top Level Modules . . . . . . . . . . . . . . . . . . . . 14 > > > > 5.1.1. Management Plane Telemetry . . . . . . . . . . . . . 17 > > > > 5.1.2. Control Plane Telemetry . . . . . . . . . . . . . . . 17 > > > > 5.1.3. 
Forwarding Plane Telemetry . . . . . . . . . . . . . 18 > > > > 5.1.4. External Data Telemetry . . . . . . . . . . . . . . . 20 > > > > 5.2. Second Level Function Components . . . . . . . . . . . . 21 > > > > 5.3. Data Acquisition Mechanism and Type Abstraction . . . . . 22 > > > > 5.4. Mapping Existing Mechanisms into the Framework . . . . . 24 > > > > 6. Evolution of Network Telemetry Applications . . . . . . . . . 25 > > > > 7. Security Considerations . . . . . . . . . . . . . . . . . . . 26 > > > > 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 27 > > > > 9. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 27 > > > > 10. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 28 > > > > 11. Informative References . . . . . . . . . . . . . . . . . . . 28 > > > > Appendix A. A Survey on Existing Network Telemetry Techniques . 32 > > > > > > > > > > > > > > > > Song, et al. Expires August 23, 2021 [Page 2] > > > > > > > > > > Internet-Draft Network Telemetry Framework February 2021 > > > > > > > > > > > > A.1. Management Plane Telemetry . . . . . . . . . . . . . . . 32 > > > > A.1.1. Push Extensions for NETCONF . . . . . . . . . . . . . 32 > > > > A.1.2. gRPC Network Management Interface . . . . . . . . . . 32 > > > > A.2. Control Plane Telemetry . . . . . . . . . . . . . . . . . 33 > > > > A.2.1. BGP Monitoring Protocol . . . . . . . . . . . . . . . 33 > > > > A.3. Data Plane Telemetry . . . . . . . . . . . . . . . . . . 33 > > > > A.3.1. The Alternate Marking (AM) technology . . . . . . . . 33 > > > > A.3.2. Dynamic Network Probe . . . . . . . . . . . . . . . . 34 > > > > A.3.3. IP Flow Information Export (IPFIX) protocol . . . . . 35 > > > > A.3.4. In-Situ OAM . . . . . . . . . . . . . . . . . . . . . 35 > > > > A.3.5. Postcard Based Telemetry . . . . . . . . . . . . . . 35 > > > > A.4. External Data and Event Telemetry . . . . . . . . . . . . 35 > > > > A.4.1. Sources of External Events . . . . . . . . . . . . . 
36 > > > > A.4.2. Connectors and Interfaces . . . . . . . . . . . . . . 37 > > > > Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 37 > > > > > > > > 1. Introduction > > > > > > > > Network visibility is the ability of management tools to see the > > > > state and behavior of a network, which is essential for successful > > > > network operation. Network Telemetry revolves around network data > > > > that can help provide insights about the current state of the > > > > network, including network devices, forwarding, control, and > > > > management planes, and that can be generated and obtained through a > > > > variety of techniques, including but not limited to network > > > > instrumentation and measurements, and that can be processed for > > > > purposes ranging from service assurance to network security using a > > > > wide variety of techniques including machine learning, data analysis, > > > > and correlation. In this document, Network Telemetry refer to both > > > > the data itself (i.e., "Network Telemetry Data"), and the techniques > > > > and processes used to generate, export, collect, and consume that > > > > data for use by potentially automated management applications. > > > > Network telemetry extends beyond the conventional network Operations, > > > > Administration, and Management (OAM) techniques and expects to > > > > support better flexibility, scalability, accuracy, coverage, and > > > > performance. > > > > > > > > RW: I suggest 'historical' rather than 'conventional' > > > > > > > > > > > > However, the term of network telemetry lacks a solid and unambiguous > > > > definition. The scope and coverage of it cause confusion and > > > > misunderstandings. It is beneficial to clarify the concept and > > > > provide a clear architectural framework for network telemetry, so we > > > > can articulate the technical field, and better align the related > > > > techniques and standard works. 
> > > > > > > > RW: Rather than term of, perhaps 'the term "network telemetry" lacks an > > > > unambiguous definition'. > > > > > > > > > > > > To fulfill such an undertaking, we first discuss some key > > > > characteristics of network telemetry which set a clear distinction > > > > from the conventional network OAM and show that some conventional > OAM > > > > technologies can be considered a subset of the network telemetry > > > > > > > > > > > > > > > > Song, et al. Expires August 23, 2021 [Page 3] > > > > > > > > > > Internet-Draft Network Telemetry Framework February 2021 > > > > > > > > > > > > technologies. We then provide an architectural framework for network > > > > telemetry which includes four modules, each concerned with a > > > > different category of telemetry data and corresponding procedures. > > > > All the modules are internally structured in the same way, including > > > > components that allow to configure data sources with regards to what > > > > data to generate and how to make that available to client > > > > applications, components that instrument the underlying data sources, > > > > and components that perform the actual rendering, encoding, and > > > > exporting of the generated data. We show how the network telemetry > > > > framework can benefit the current and future network operations. > > > > Based on the distinction of modules and function components, we can > > > > map the existing and emerging techniques and protocols into the > > > > framework. The framework can also simplify the tasks for designing, > > > > maintaining, and understanding a network telemetry system. At last, > > > > we outline the evolution stages of the network telemetry system and > > > > discuss the potential security concerns. > > > > > > > > The purpose of the framework and taxonomy is to set a common ground > > > > for the collection of related work and provide guidance for future > > > > technique and standard developments. 
To the best of our knowledge, > > > > this document is the first such effort for network telemetry in > > > > industry standards organizations. > > > > > > > > > > > > 2. Glossary > > > > > > > > Before further discussion, we list some key terminology and acronyms > > > > used in this documents. We make an intended differentiation between > > > > the terms of network telemetry and OAM. However, it should be > > > > understood that there is not a hard-line distinction between the two > > > > concepts. Rather, network telemetry is considered as the extension > > > > of OAM. It covers all the existing OAM protocols but puts more > > > > emphasis on the newer and emerging techniques and protocols > > > > concerning all aspects of network data from acquisition to > > > > consumption. > > > > > > > > > > > > RW: > > > > Nit: "this documents." -> "this document." > > > > Nit: "as an extension" rather than "as the extension". > > > > > > > > AI: Artificial Intelligence. In network domain, AI refers to the > > > > machine-learning based technologies for automated network > > > > operation and other tasks. > > > > > > > > AM: Alternate Marking, a flow performance measurement method, > > > > specified in [RFC8321]. > > > > > > > > BMP: BGP Monitoring Protocol, specified in [RFC7854]. > > > > > > > > DNP: Dynamic Network Probe, referring to programmable in-network > > > > sensors for network monitoring and measurement. > > > > > > > > > > > > > > > > > > > > > > > > Song, et al. Expires August 23, 2021 [Page 4] > > > > > > > > > > Internet-Draft Network Telemetry Framework February 2021 > > > > > > > > > > > > DPI: Deep Packet Inspection, referring to the techniques that > > > > examines packet beyond packet L3/L4 headers. > > > > > > > > gNMI: gRPC Network Management Interface, a network management > > > > protocol from OpenConfig Operator Working Group, mainly > > > > contributed by Google. See [gnmi] for details. 
> > > > > > > > gRPC: gRPC Remote Procedure Call, a open source high performance RPC > > > > framework that gNMI is based on. See [grpc] for details. > > > > > > > > IPFIX: IP Flow Information Export Protocol, specified in [RFC7011]. > > > > > > > > IOAM: In-situ OAM, a dataplane on-path telemetry technique. > > > > > > > > NETCONF: Network Configuration Protocol, specified in [RFC6241]. > > > > > > > > NetFlow: A Cisco protocol for flow record collecting, described in > > > > [RFC3594]. > > > > > > > > Network Telemetry: The process and instrumentation for acquiring and > > > > utilizing network data remotely for network monitoring and > > > > operation. A general term for a large set of network visibility > > > > techniques and protocols, concerning aspects like data generation, > > > > collection, correlation, and consumption. Network telemetry > > > > addresses the current network operation issues and enables smooth > > > > evolution toward future intent-driven autonomous networks. > > > > > > > > NMS: Network Management System, referring to applications that allow > > > > network administrators manage a network. > > > > > > > > RW: referring to => refers to applications that allow network administrators > to > > manage a network. > > > > > > > > > > > > > > > > OAM: Operations, Administration, and Maintenance. A group of > > > > network management functions that provide network fault > > > > indication, fault localization, performance information, and data > > > > and diagnosis functions. Most conventional network monitoring > > > > techniques and protocols belong to network OAM. > > > > > > > > PBT: Postcard-Based Telemetry, a dataplane on-path telemetry > > > > technique. > > > > > > > > SMIv2 Structure of Management Information Version 2, specified in > > > > [RFC2578]. > > > > > > > > RW: > > > > Is SMIv2 a better reference than MIBs, that readers are more likely to be > > familiar with? > > > > > > > > > > > > SNMP: Simple Network Management Protocol. 
Version 1 and 2 are > > > > specified in [RFC1157] and [RFC3416], respectively. > > > > > > > > YANG: The abbreviation of "Yet Another Next Generation". YANG is a > > > > data modeling language for the definition of data sent over > > > > > > > > RW: > > > > Nit: Please drop the first sentence, and add a reference to RFC 7950. > > > > > > > > > > > > > > > > Song, et al. Expires August 23, 2021 [Page 5] > > > > > > > > > > Internet-Draft Network Telemetry Framework February 2021 > > > > > > > > > > > > network management protocols such as the NETCONF and RESTCONF. > > > > YANG is defined in [RFC6020]. > > > > > > > > YANG ECA A YANG model for Event-Condition-Action policies, defined > > > > in [I-D.wwx-netmod-event-yang]. > > > > > > > > YANG PUSH: A method to subscribe pushed data from remote YANG > > > > datastore on network devices. Details are specified in [RFC8641] > > > > and [RFC8639]. > > > > > > > > RW: > > > > Perhaps borrow from the abstract in RFC 8641. > > > > "A mechanism that allows subscriber applications to request a > > > > stream of updates from a YANG datastore on a network device". Details > are > > ... > > > > > > > > > > > > 3. Background > > > > > > > > The term "big data" is used to describe the extremely large volume of > > > > data sets that can be analyzed computationally to reveal patterns, > > > > trends, and associations. Networks are undoubtedly a source of big > > > > data because of their scale and the volume of network traffic they > > > > forward. It is easy to see that network operations can benefit from > > > > network big data. > > > > > > > > RW: > > > > Also need to consider privacy. 
> > > > > > > > I think that we need to be careful not to imply that the intention here is > > to > > read/snoop on the data being carried over the network rather than gather > > insights into flows > > > > > > > > > > > > > > > > Today one can access advanced big data analytics capability through a > > > > plethora of commercial and open source platforms (e.g., Apache > > > > Hadoop), tools (e.g., Apache Spark), and techniques (e.g., machine > > > > learning). Thanks to the advance of computing and storage > > > > technologies, network big data analytics gives network operators an > > > > opportunity to gain network insights and move towards network > > > > autonomy. Some operators start to explore the application of > > > > Artificial Intelligence (AI) to make sense of network data. Software > > > > tools can use the network data to detect and react on network faults, > > > > anomalies, and policy violations, as well as predicting future > > > > events. In turn, the network policy updates for planning, intrusion > > > > prevention, optimization, and self-healing may be applied. > > > > > > > > It is conceivable that an autonomic network [RFC7575] is the logical > > > > next step for network evolution following Software Defined Network > > > > (SDN), aiming to reduce (or even eliminate) human labor, make more > > > > efficient use of network resources, and provide better services more > > > > aligned with customer requirements. Intent-based Networking (IBN) > > > > [I-D.irtf-nmrg-ibn-concepts-definitions] requires network visibility > > > > and telemetry data in order to ensure that the network is behaving as > > > > intended. Although it takes time to reach the ultimate goal, the > > > > journey has started nevertheless. > > > > RW: > > > > It would be helpful for the text to link autonomic networking and Intent > based > > networking, perhaps: > > > > The related technique of Intent-based Networking [...] requires ... 
> > > > > > > > RW: > > > > Not sure that the last sentence of the paragraph is required. > > > > > > > > > > > > However, while the data processing capability is improved and > > > > applications are hungry for more data, the networks lag behind in > > > > extracting and translating network data into useful and actionable > > > > information in efficient ways. The system bottleneck is shifting > > > > from data consumption to data supply. Both the number of network > > > > nodes and the traffic bandwidth keep increasing at a fast pace. The > > > > network configuration and policy change at smaller time slots than > > > > before. More subtle events and fine-grained data through all network > > > > planes need to be captured and exported in real time. In a nutshell, > > > > it is a challenge to get enough high-quality data out of the network > > > > in a manner that is efficient, timely, and flexible. Therefore, we > > > > need to survey the existing technologies and protocols and identify > > > > any potential gaps. > > > > > > > > In the remainder of this section, first we clarify the scope of > > > > network data (i.e., telemetry data) concerned in the context. Then, > > > > we discuss several key use cases for today's and future network > > > > operations. Next, we show why the current network OAM techniques and > > > > protocols are insufficient for these use cases. The discussion > > > > underlines the need of new methods, techniques, and protocols which > > > > we assign under the umbrella term - Network Telemetry. > > > > > > > > RW: > > > > We should also include the possibility of extending existing protocols, methods, and techniques. > > > > > > > > > > > > 3.1.
Telemetry Data Coverage > > > > > > > > Any information that can be extracted from networks (including data > > > > plane, control plane, and management plane) and used to gain > > > > visibility or as basis for actions is considered telemetry data. It > > > > includes statistics, event records and logs, snapshots of state, > > > > configuration data, etc. It also covers the outputs of any active > > > > and passive measurements [RFC7799]. Specially, raw data can be > > > > processed in-network before being sent to a data consumer. Such > > > > processed data is also considered telemetry data. A classification > > > > of telemetry data is provided in Section 5. > > > > > > > > RW: > > > > Specially - I would expand this. Perhaps: "In some cases, raw data is > processed > > before being sent .." > > > > We should also discuss the quality of data, i.e., less, higher quality data > may be > > better than lots of low quality data. > > > > > > > > > > > > 3.2. Use Cases > > > > > > > > The following set of use cases is essential for network operations. > > > > While the list is by no means exhaustive, it is enough to highlight > > > > the requirements for data velocity, variety, volume, and veracity in > > > > networks. > > > > > > > > o Security: Network intrusion detection and prevention systems need > > > > to monitor network traffic and activities and act upon anomalies. > > > > Given increasingly sophisticated attack vector coupled with > > > > increasingly severe consequences of security breaches, new tools > > > > and techniques need to be developed, relying on wider and deeper > > > > visibility into networks. > > > > > > > > RW: > > > > I agree with this, but it might be good to emphasize that the goal is > > > > to get to a place where this can be done without any, or only minimal, > > > > human intervention. 
> > > > > > > > > > > > o Policy and Intent Compliance: Network policies are the rules that > > > > constraint the services for network access, provide service > > > > differentiation, or enforce specific treatment on the traffic. > > > > For example, a service function chain is a policy that requires > > > > the selected flows to pass through a set of ordered network > > > > functions. Intent, as defined in > > > > > > > > RW: > > > > constraint => constrain > > > > > > > > > > > > Song, et al. Expires August 23, 2021 [Page 7] > > > > > > > > > > Internet-Draft Network Telemetry Framework February 2021 > > > > > > > > > > > > [I-D.irtf-nmrg-ibn-concepts-definitions], is a set of operational > > > > goal that a network should meet and outcomes that a network is > > > > supposed to deliver, defined in a declarative manner without > > > > specifying how to achieve or implement them. An intent requires a > > > > complex translation and mapping process before being applied on > > > > networks. While a policy or an intent is enforced, the compliance > > > > needs to be verified and monitored continuously, relying on > > > > visibility that is provided through network telemetry data, and > > > > any violation needs to be reported immediately. > > > > > > > > RW: > > > > Does it not also rely on visibility of the network to potentially modify > > > > the mapping to ensure that the intent remains in force? > > > > > > > > o SLA Compliance: A Service-Level Agreement (SLA) defines the level > > > > of service a user expects from a network operator, which include > > > > the metrics for the service measurement and remedy/penalty > > > > procedures when the service level misses the agreement. Users > > > > need to check if they get the service as promised and network > > > > operators need to evaluate how they can deliver the services that > > > > can meet the SLA based on realtime network telemetry data, > > > > including data from network measurements. 
> > > > > > > > o Root Cause Analysis: Any network failure can be the effect of a > > > > sequence of chained events. Troubleshooting and recovery require > > > > quick identification of the root cause of any observable issues. > > > > However, the root cause is not always straightforward to identify, > > > > especially when the failure is sporadic and the number of event > > > > messages, both related and unrelated to the same cause, is > > > > overwhelming. While machine learning technologies can be used for > > > > root cause analysis, it up to the network to sense and provide the > > > > relevant data to feed into machine learning applications. > > > > > > > > RW: > > > > In these sorts of scenarios, I would expect additional detailed diagnostics > > information to be requested from the device to figure out the root cause. > Or > > specifically, I think that this would contain data that wouldn't normally be > > exported via telemetry. > > > > > > > > > > > > o Network Optimization: This covers all short-term and long-term > > > > network optimization techniques, including load balancing, Traffic > > > > Engineering (TE), and network planning. Network operators are > > > > motivated to optimize their network utilization and differentiate > > > > services for better Return On Investment (ROI) or lower Capital > > > > Expenditures (CAPEX). The first step is to know the real-time > > > > network conditions before applying policies for traffic > > > > manipulation. In some cases, micro-bursts need to be detected in > > > > a very short time-frame so that fine-grained traffic control can > > > > be applied to avoid network congestion. Long-term planning of > > > > network capacity and topology requires analysis of real-world > > > > network telemetry data that is obtained over long periods of time. 
> > > > > > > > o Event Tracking and Prediction: The visibility into traffic path > > > > and performance is critical for services and applications that > > > > rely on healthy network operation. Numerous related network > > > > events are of interest to network operators. For example, Network > > > > operators want to learn where and why packets are dropped for an > > > > application flow. They also want to be warned of issues in > > > > > > > > > > > > > > > > Song, et al. Expires August 23, 2021 [Page 8] > > > > > > > > > > Internet-Draft Network Telemetry Framework February 2021 > > > > > > > > > > > > advance so proactive actions can be taken to avoid catastrophic > > > > consequences. > > > > > > > > 3.3. Challenges > > > > > > > > For a long time, network operators have relied upon SNMP [RFC3416], > > > > Command-Line Interface (CLI), or Syslog to monitor the network. Some > > > > other OAM techniques as described in [RFC7276] are also used to > > > > facilitate network troubleshooting. These conventional techniques > > > > are not sufficient to support the above use cases for the following > > > > reasons: > > > > > > > > o Most use cases need to continuously monitor the network and > > > > dynamically refine the data collection in real-time. The poll- > > > > based low-frequency data collection is ill-suited for these > > > > applications. Subscription-based streaming data directly pushed > > > > from the data source (e.g., the forwarding chip) is preferred to > > > > provide enough data quantity and precision at scale. > > > > > > > > o Comprehensive data is needed from packet processing engine to > > > > traffic manager, from line cards to main control board, from user > > > > flows to control protocol packets, from device configurations to > > > > operations, and from physical layer to application layer. > > > > Conventional OAM only covers a narrow range of data (e.g., SNMP > > > > only handles data from the Management Information Base (MIB)). 
> > > > Traditional network devices cannot provide all the necessary > > > > probes. More open and programmable network devices are therefore > > > > needed. > > > > > > > > o Many application scenarios need to correlate network-wide data > > > > from multiple sources (i.e., from distributed network devices, > > > > different components of a network device, or different network > > > > planes). A piecemeal solution is often lacking the capability to > > > > consolidate the data from multiple sources. The composition of a > > > > complete solution, as partly proposed by Autonomic Resource > > > > Control Architecture(ARCA) > > > > [I-D.pedro-nmrg-anticipated-adaptation], will be empowered and > > > > guided by a comprehensive framework. > > > > > > > > o Some of the conventional OAM techniques (e.g., CLI and Syslog) > > > > lack a formal data model. The unstructured data hinder the tool > > > > automation and application extensibility. Standardized data > > > > models are essential to support the programmable networks. > > > > > > > > o Although some conventional OAM techniques support data push (e.g., > > > > SNMP Trap [RFC2981][RFC3877], Syslog, and sFlow), the pushed data > > > > are limited to only predefined management plane warnings (e.g., > > > > SNMP Trap) or sampled user packets (e.g., sFlow). Network > > > > > > > > > > > > > > > > Song, et al. Expires August 23, 2021 [Page 9] > > > > > > > > > > Internet-Draft Network Telemetry Framework February 2021 > > > > > > > > > > > > operators require the data with arbitrary source, granularity, and > > > > precision which are beyond the capability of the existing > > > > techniques. 
> > > > > > > > o The conventional passive measurement techniques can either consume > > > > excessive network resources and render excessive redundant data, > > > > or lead to inaccurate results; on the other hand, the conventional > > > > active measurement techniques can interfere with the user traffic > > > > and their results are indirect. Techniques that can collect > > > > direct and on-demand data from user traffic are more favorable. > > > > > > > > These challenges were addressed by newer standards and techniques > > > > (e.g., IPFIX/Netflow, PSAMP, IOAM, and YANG-Push) and more are > > > > emerging. These standards and techniques need to be recognized and > > > > accommodated in a new framework. > > > > > > > > 3.4. Network Telemetry > > > > > > > > Network telemetry has emerged as a mainstream technical term to refer > > > > to the network data collection and consumption techniques. Several > > > > network telemetry techniques and protocols (e.g., IPFIX [RFC7011] and > > > > gRPC [grpc]) have been widely deployed. Network telemetry allows > > > > separate entities to acquire data from network devices so that data > > > > can be visualized and analyzed to support network monitoring and > > > > operation. Network telemetry covers the conventional network OAM and > > > > has a wider scope. It is expected that network telemetry can provide > > > > the necessary network insight for autonomous networks and address the > > > > shortcomings of conventional OAM techniques. > > > > > > > > Network telemetry usually assumes machines as data consumers rather > > > > than human operators. Hence, the network telemetry can directly > > > > trigger the automated network operation, while in contrast some > > > > conventional OAM tools are designed and used to help human operators > > > > to monitor and diagnose the networks and guide manual network > > > > operations. Such a proposition leads to very different techniques. 
> > > > > > > > Although new network telemetry techniques are emerging and subject to > > > > continuous evolution, several characteristics of network telemetry > > > > have been well accepted. Note that network telemetry is intended to > > > > be an umbrella term covering a wide spectrum of techniques, so the > > > > following characteristics are not expected to be held by every > > > > specific technique. > > > > > > > > o Push and Streaming: Instead of polling data from network devices, > > > > telemetry collectors subscribe to streaming data pushed from data > > > > sources in network devices. > > > > > > > > > > > > > > > > > > > > > > > > Song, et al. Expires August 23, 2021 [Page 10] > > > > > > > > > > Internet-Draft Network Telemetry Framework February 2021 > > > > > > > > > > > > o Volume and Velocity: The telemetry data is intended to be consumed > > > > by machines rather than by human being. Therefore, the data > > > > volume can be huge and the processing is optimized for the needs > > > > of automation in realtime. > > > > > > > > o Normalization and Unification: Telemetry aims to address the > > > > overall network automation needs. Efforts are made to normalize > > > > the data representation and unify the protocols, so to simplify > > > > data analysis and provide integrated analysis across heterogeneous > > > > devices and data sources across a network. > > > > > > > > o Model-based: The telemetry data is modeled in advance which allows > > > > applications to configure and consume data with ease. > > > > > > > > o Data Fusion: The data for a single application can come from > > > > multiple data sources (e.g., cross-domain, cross-device, and > > > > cross-layer) and needs to be correlated to take effect. 
> > > > > > > > o Dynamic and Interactive: Since the network telemetry means to be > > > > used in a closed control loop for network automation, it needs to > > > > run continuously and adapt to the dynamic and interactive queries > > > > from the network operation controller. > > > > > > > > In addition, an ideal network telemetry solution may also have the > > > > following features or properties: > > > > > > > > o In-Network Customization: The data that is generated can be > > > > customized in network at run-time to cater to the specific need of > > > > applications. This needs the support of a programmable data plane > > > > which allows probes with custom functions to be deployed at > > > > flexible locations. > > > > > > > > o In-Network Data Aggregation and Correlation: Network devices and > > > > aggregation points can work out which events and what data needs > > > > to be stored, reported, or discarded thus reducing the load on the > > > > central collection and processing points while still ensuring that > > > > the right information is ready to be processed in a timely way. > > > > > > > > o In-Network Processing: Sometimes it is not necessary or feasible > > > > to gather all information to a central point to be processed and > > > > acted upon. It is possible for the data processing to be done in > > > > network, allowing reactive actions to be taken locally. > > > > > > > > o Direct Data Plane Export: The data originated from the data plane > > > > forwarding chips can be directly exported to the data consumer for > > > > efficiency, especially when the data bandwidth is large and the > > > > real-time processing is required. > > > > > > > > > > > > > > > > > > > > Song, et al. 
Expires August 23, 2021 [Page 11] > > > > > > > > > > Internet-Draft Network Telemetry Framework February 2021 > > > > > > > > > > > > o In-band Data Collection: In addition to the passive and active > > > > data collection approaches, the new hybrid approach allows to > > > > directly collect data for any target flow on its entire forwarding > > > > path [I-D.song-opsawg-ifit-framework]. > > > > > > > > It is worth noting that a network telemetry system should not be > > > > intrusive to normal network operations by avoiding the pitfall of the > > > > "observer effect". That is, it should not change the network > > > > behavior and affect the forwarding performance. Otherwise, the whole > > > > purpose of network telemetry is compromised. > > > > > > > > Although in many cases a system for network telemetry involves a > > > > remote data collecting and consuming entity, it is important to > > > > understand that there are no inherent assumptions about how a system > > > > should be architected. Telemetry data producers and consumers can > > > > work in distributed or peer-to-peer fashions rather than assuming a > > > > centralized data consuming entity. In such cases, a network node can > > > > be the direct consumer of telemetry data from other nodes. > > > > > > > > 4. The Necessity of a Network Telemetry Framework > > > > > > > > RW: I think that the structure of the document might be better if this was a > > section 3.5 of the background rather than its own top-level section? > > > > > > > > Network data analytics and machine-learning technologies are applied > > > > for network operation automation, relying on abundant and coherent > > > > data from networks. Data acquisition that is limited to a single > > > > source and static in nature will in many cases not be sufficient to > > > > meet an application's telemetry data needs. As a result, multiple > > > > data sources, involving a variety of techniques and standards, will > > > > need to be integrated.
It is desirable to have a framework that > > > > classifies and organizes different telemetry data source and types, > > > > defines different components of a network telemetry system and their > > > > interactions, and helps coordinate and integrate multiple telemetry > > > > approaches across layers. This allows flexible combinations of data > > > > for different applications, while normalizing and simplifying > > > > interfaces. In detail, such a framework would benefit application > > > > development for the following reasons: > > > > > > > > o Future networks, autonomous or otherwise, depend on holistic and > > > > comprehensive network visibility. All the use cases and > > > > applications are better to be supported uniformly and coherently > > > > under a single intelligent agent using an integrated, converged > > > > mechanism and common telemetry data representations wherever > > > > feasible. Therefore, the protocols and mechanisms should be > > > > consolidated into a minimum yet comprehensive set. A telemetry > > > > framework can help to normalize the technique developments. > > > > > > > > o Network visibility presents multiple viewpoints. For example, the > > > > device viewpoint takes the network infrastructure as the > > > > monitoring object from which the network topology and device > > > > > > > > > > > > > > > > Song, et al. Expires August 23, 2021 [Page 12] > > > > > > > > > > Internet-Draft Network Telemetry Framework February 2021 > > > > > > > > > > > > status can be acquired; the traffic viewpoint takes the flows or > > > > packets as the monitoring object from which the traffic quality > > > > and path can be acquired. An application may need to switch its > > > > viewpoint during operation. It may also need to correlate a > > > > service and its impact on user experience to acquire the > > > > comprehensive information. 
> > > > > > > > o Applications require network telemetry to be elastic in order to > > > > make efficient use of network resources and reduce the impact of > > > > processing related to network telemetry on network performance. > > > > For example, routine network monitoring should cover the entire > > > > network with a low data sampling rate. Only when issues arise or > > > > critical trends emerge should telemetry data source be modified > > > > and telemetry data rates boosted as needed. > > > > > > > > o Efficient data fusion is critical for applications to reduce the > > > > overall quantity of data and improve the accuracy of analysis. > > > > > > > > A telemetry framework collects together all of the telemetry-related > > > > works from different sources and working groups within IETF. This > > > > makes it possible to assemble a comprehensive network telemetry > > > > system and to avoid repetitious or redundant work. The framework > > > > should cover the concepts and components from the standardization > > > > perspective. This document describes the modules which make up a > > > > network telemetry framework and decomposes the telemetry system into > > > > a set of distinct components that existing and future work can easily > > > > map to. > > > > > > > > 5. Network Telemetry Framework > > > > > > > > The top level network telemetry framework partitions the network > > > > telemetry into four modules based on the telemetry data object source > > > > and represents their relationship. At the next level, the framework > > > > decomposes each module into separate components. Each of the > modules > > > > follows the same underlying structure, with one component dedicated > > > > to the configuration of data subscriptions and data sources, a second > > > > component dedicated to encoding and exporting data, and a third > > > > component instrumenting the generation of telemetry related to the > > > > underlying resources. 
Throughout the framework, the same set of > > > > abstract data acquiring mechanisms and data types are applied. The > > > > two-level architecture with the uniform data abstraction helps > > > > accurately pinpoint a protocol or technique to its position in a > > > > network telemetry system or disaggregate a network telemetry system > > > > into manageable parts. > > > > > > > > > > > > RW: Relationship of telemetry data vs get requests. I.e., isn't telemetry just push rather than pulling data? > > > > > > > > 5.1. Top Level Modules > > > > > > > > Telemetry can be applied on the forwarding plane, the control plane, > > > > and the management plane in a network, as well as other sources out > > > > of the network, as shown in Figure 1. Therefore, we categorize the > > > > network telemetry into four distinct modules with each having its own > > > > interface to Network Operation Applications. > > > > > > > > +------------------------------+ > > > > | | > > > > | Network Operation |<-------+ > > > > | Applications | | > > > > | | | > > > > +------------------------------+ | > > > > ^ ^ ^ | > > > > | | | | > > > > V | V V > > > > +-----------|---+--------------+ +-----------+ > > > > | | | | | | > > > > | Control Pl|ane| | | External | > > > > | Telemetry | <---> | | Data and | > > > > | | | | | Event | > > > > | ^ V | Management | | Telemetry | > > > > +------|--------+ Plane | | | > > > > | V | Telemetry | +-----------+ > > > > | Forwarding | | > > > > | Plane <---> | > > > > | Telemetry | | > > > > | | | > > > > +---------------+--------------+ > > > > > > > > Figure 1: Modules in Layer Category of NTF > > > > > > > > RW: > > > > In this diagram, for me at least, I think that it would more natural to have > > Management Plane on the left, and Control/ Forwarding Plane on the right.
> > > > > > > > The rationale of this partition lies in the different telemetry data > > > > objects which result in different data source and export locations. > > > > Such differences have profound implications on in-network data > > > > programming and processing capability, data encoding and transport > > > > protocol, and required data bandwidth and latency. > > > > > > > > RW: > > > > Data can be sent directly, or proxied via the control and management > planes
