Hi!
I conducted an AD review of draft-ietf-i2nsf-nsf-monitoring-data-model-08.
Thanks for this detailed info and data model.
My high-level comments are as follows:
** There is a lot of flexibility in this data model in the cardinality of the
fields and the sheer number of free form fields. The positive of this approach
is that it should be able to represent a wide variety of tools/NSF. The
negative of this is that significant profiling or out of band knowledge will be
needed to make many of these fields machine readable. It would be helpful to
discuss this in the text.
** There are a number of taxonomies in this data model. They will be helpful
to parse and triage alarms. However, I have some concerns about the
completeness of those currently specified and the lack of discussion on how
they might be extended. It would also be helpful to cover this in the text.
** (Mentioned below in detail) I found the philosophical framing in Sections 3
and 4 not entirely in sync. Additionally, the taxonomy of different kinds of
data introduced in Section 4 did not clearly align for me against the data
model in Section 10.
** (Mentioned below in detail) I found a few places where there were assumptions
and architectural elements outside of the base I2NSF architecture (RFC8329 and
draft-ietf-i2nsf-applicability-18). Where that occurs more detail would be
helpful, or reconsideration of whether this is necessary.
Now the more specific comments:
** This document provides both an information model and seemingly a YANG module
to implement it. I may have missed it, but it would be helpful to state that
obvious fact.
** Section 1. Editorial. s/Monitoring procedures intent to acquire/Monitoring
procedures acquire/
** Section 1. This sentence didn't parse for me:
OLD
Monitoring procedures
intent to acquire vital types of data with respect to NSFs, (e.g.,
alarms, records, and counters) via data in motion (e.g., queries,
notifications, and events)
NEW
This interface enables the sharing of vital data from the NSFs (e.g., alarms,
records, and counters) to the Security Controller through a variety of
mechanisms (e.g., queries, notifications, and events).
** Section 1. s/for an NSF for an NSF/for an NSF/
** Section 1. Recommend sticking to the named I2NSF architecture of RFC8329 so
s/e.g., Security Controller and NSF Data Analyzer/e.g., Security Controller/
** Section 1. Is the phrase "... provides visibility for an NSF for an NSF
data collector ...", the same thing as saying "...provides visibility into an
NSF for the NSF data collector"?
** Section 1. How important is it to introduce the new architectural element
of the "NSF data controller" as a super-set of the previously defined Security
Controller, and the never previously mentioned or defined "NSF Data Analyzer"?
I ask because the I2NSF reference architectures in RFC8329 and
draft-ietf-i2nsf-applicability don't have it.
I see value in keeping the language consistent with previous drafts by defining
a new role for the Security Controller (from draft-ietf-i2nsf-applicability) or
Network Management Operator System (from RFC8329) instead. If it's imperative
to invent this new architecture term, please explain how it is architecturally
different than these previously defined components.
** Section 1. Editorial. Per "The information model for the NSF monitoring
interface presented in this document is a complementary information model to
the information model ...", recommend against using the term "information
model" three times in the same sentence for readability.
** Section 3. I found these use cases clear. However, after reading Section
4, I had trouble relating the terminology and nuances to this section.
-- There is discussion of "events" and "activity logs" in these use cases.
However, Section 4 notes events, notifications and records. How do
notifications and records align with these use cases?
-- These use cases primarily discuss acting on events and being triggered by
activity. However, in subsequent sections (e.g., Section 7), there is note of
"alerts" and "alarms". How are they related?
** Section 4. Editorial. Per "In order to maintain ...", this sentence has a
double negative (two "not").
** Section 4. Per "Three basic domains about the monitoring information
originating from a system entity [RFC4949] or an NSF are highlighted in this
document", what is the relationship between a "system entity" and NSF? What's
a "system entity" in the I2NSF framework/architecture?
** Section 4.
The Alarm Management Framework in [RFC3877] defines an Event as
something that happens as a thing of of interest.
-- Typo. s/of of/of/
-- The definition from RFC3877 is "Something that happens which may be of
interest". Editorially, the exact words are clearer.
** Section 4. The citation doesn't seem right. Per:
It defines a fault
as a change in status, crossing a threshold, or an external input to
the system.
Section 3.1 of RFC3877 says a fault is "Lasting error or warning condition."
The quoted text of "... as a change in status, crossing a threshold, or an
external input to the system" comes from the definition of an event and it
states them as examples: "A fault, a change in status, crossing a threshold, or
an external input to the system, for example."
** Section 4. Per "... the scope of the Alarm Management Framework's Events is
still applicable due to its broad definition", can you please clarify what is
being invoked from RFC3877 beyond these definitions?
** Section 4.1. The section uses the term "retention" in a way I didn't expect.
I'm most familiar with retention practices in the area of alerts and logs as
discussing what information is kept and for how long. Does that need to be
covered here?
** Section 4.1. Editorial.
Typically, a system entity populates standardized interface, such as
SNMP, NETCONF, RESTCONF or CoMI to provide and emit created
information directly via NSF Monitoring Interface
NEW
Typically, a system entity populates standardized interfaces, such as
SNMP, NETCONF, RESTCONF or CoMI to emit information via the NSF Monitoring
Interface
** Section 4.1. Per "Alternatively, the created information is ...", is it
"alternatively" or "additionally"?
** Section 4.1. Typo. s/ Monistoring/Monitoring/
** Section 4.1. Per the paragraph beginning with "Information retained on a
system entity ...", I had trouble following this guidance -- Is the text
suggesting that the data be emitted in some way other than the standardized
data model described in this document?
** Section 4.1. Per "An I2NSF User is required to process fresh [RFC4949]
records ...",
-- how does an I2NSF user know the records are fresh?
-- what is a "homogenizing function"?
-- per the architecture, how is the I2NSF user "...provid[ing] them to other
I2NSF Components", as the user only interacts with the Controller?
** Section 4.1. Per "When retained or emitted, the information required to
support monitoring processes has to be processed by an I2NSF User at some point
in the workflow. Typical locations of these I2NSF Users are: ...", unless I'm
misunderstanding what is meant by location, this behavior doesn't seem to align
with the reference architecture of Figure 1 of draft-ietf-i2nsf-applicability
or Section 3 of RFC8329, which suggest that only the security controller is
directly interacting with the NSF.
** Section 4.2. There is a distinction being made between an event and a
notification, but that distinction isn't clear to me - are events and
notifications the same things except events come from an I2NSF component, but
notifications do not? If notifications aren't part of the I2NSF architecture,
how are they in scope to this data model? Do multiple notifications aggregate
to be an event?
** Section 4.2. There is a deliberate taxonomy being created around events,
notifications and records with each having significantly different properties
to warrant distinct categories. This distinction is not clear to me when
manifested in the YANG module. Are all top level containers with "*-event-*"
in their names events? Are records the containers with "log" in their name?
What are notifications?
Additionally, "alarms" are later introduced in the text, how do they relate?
** Section 4.4. It isn't clear from the text whether records are shared via
the monitoring interface?
** Section 4.4. Per "Unlike information emitted via notifications and events,
records do not require immediate attention from an analyst but may be
useful for visibility and retroactive cyber forensic":
-- I don't see where in Section 4.2 it is noted that notifications require
immediate attention from an analyst
-- What is the relationship between the analyst and the "I2NSF user"?
** Section 5. This section appears to be restating very similar information to
Section 4.3. Are both needed?
** Section 5. Per the various examples of "data model and interaction model
for data in motion", their applicability to I2NSF isn't clear. This document
is specifying the encoding of data in a particular information and data model.
IPFIX and NetFlow don't seem germane, and represent an alternative to some of
the information described here.
** Section 5.1. This section explicitly calls out YANG Push and YANG
Subscribed Notification. Are these recommended protocols? I ask because the
previous sections (4.3 and 5.0), already described the generic utility of push
and notification capabilities. Here the capabilities are being described again
specifically.
** Section 6 and 7. There is no guidance on whether any of these data items
are mandatory.
** Section 6. Per the Basic Information model:
-- message: is this field identifying the type of message or the message
itself?
-- nsf-name: is the name a FQDN? Or any arbitrary string?
-- Should there be a timestamp?
** Section 7. Editorial. s/as alarm/as an alarm/ and s/with basic
information/with the basic information/
** Section 7.*. The definitions of "acquisition method", "emission-type" and
"dampening-type" aren't clear from the one-word descriptions. Earlier text
hints at acquisition method and emission type, but dampening isn't mentioned.
Recommend up front definition before use here.
** Section 7.1.*.
-- A simple definition of memory, cpu, disk, etc. would be helpful to
explicitly say in the text
-- each of these alarms has a "message" field. Is the proposed text the actual
message? The text seems to read like the definition of that type of alarm.
** Section 7.1.5. interface-state. "up, down and congested" seems to be mixing
different properties. Are the underlying semantics up-not-congested,
up-but-congested, and down?
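To make the question concrete, here is a minimal Python sketch of one possible reading, in which operational state and congestion are separate properties. The names OperState, InterfaceState and legacy_label are hypothetical illustrations, not taken from the draft, and the rule that a down interface cannot be congested is my assumption.

```python
from dataclasses import dataclass
from enum import Enum


class OperState(Enum):
    UP = "up"
    DOWN = "down"


@dataclass(frozen=True)
class InterfaceState:
    oper_state: OperState
    congested: bool = False

    def __post_init__(self):
        # Assumption: an interface that is down cannot also be congested.
        if self.oper_state is OperState.DOWN and self.congested:
            raise ValueError("a down interface cannot be congested")

    def legacy_label(self) -> str:
        """Collapse to the three-value form: up, down or congested."""
        if self.oper_state is OperState.DOWN:
            return "down"
        return "congested" if self.congested else "up"
```

If this reading is right, "congested" is really shorthand for up-but-congested, and the text could say so.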
** Section 7.2.1 and 7.2.2.
-- Can a user only belong to one group?
-- Recommend defining "authentication"
-- per "login-ip-address" - is this the true "login IP" or the IP address of
the action that triggered the alarm?
** Section 7.2.2. Recommend defining the scope of "configuration change"
** Section 7.2.3. Should there be some indicator as to why this flow was
shared?
** Section 7.3 and 7.5. I was expecting to find symmetry between the NSF
events described here and those described by the "content-security-controls"
and "attack-mitigation-control" in draft-ietf-i2nsf-nsf-facing-interface. I
assumed that if the NSF has a certain capability then it would be able to
generate events for it using this info/data model. Instead I found
representations for modeling an alert for which capabilities don't seem to
exist; and capabilities for which there wasn't a corresponding way to create
alarms. Specifically:
-- voip-volte, pkt-capture and mail-filtering don't seem to have an analog here
-- here there is an intrusion event but draft-ietf-i2nsf-nsf-facing-interface
makes a distinction between ids and ips
-- botnet, session and vulnerability scanning are defined here but don't have an
analog in draft-ietf-i2nsf-nsf-facing-interface
** Section 7.3.1. Per attack-type
-- saying "Any one of ..." and then ending it with an "and etc" is confusing
because this is no longer an enumerated list.
-- was it intentional to have this list not align with the types of DDOS
attacks enumerated in draft-ietf-i2nsf-nsf-facing-interface?
** Section 7.3.1. Per dst-ip, should this be expressed as a network mask or
domain name for additional flexibility?
** Section 7.3.1. Rule-name
-- Why is the "rule-name" encoded here but not in any of the other types of
events (as it seems like it would be useful)?
-- Is this the name of the I2NSF Policy Rule or rule specific to the
configuration on the NSF?
** Section 7.3.1. Per "profile", can "security profile" please be defined.
** Section 7.3.3 and 7.3.4 (and same applies to YANG module)
-- What is a src/dst-zone?
-- Should the IP/port/raw_info fields describe the flow in which the malware
was seen rather than a packet? (Per Section 7.3.4) I ask because it would seem
unlikely that a network-based malware inspection tech would operate on a
per-packet basis (or that the file with the malware payload actually fit in a
single packet). Per Section 7.3.4, most modern IDS/IPS also operate on
streams/flows.
** Section 7.3.5. role.
-- The roles seem like they would benefit from further generalization. Given
the diversity of C2 server approaches, would it be clearer to s/IRC and Web/C2/
-- How would peer-to-peer zombie be handled, that is where the compromised
hosts can talk to each other?
-- Is an "other" needed to catch additional cases?
** Section 7.4. What is the relationship of a system log to the event,
notification and record taxonomy introduced earlier?
** Section 7.4.1, Per "Administrator"
-- Editorially, this is one of the few uppercase field names
-- is this a user name?
** Section 7.4.2.
-- Are the CPU, disk, sessions, traffic rates, traffic speed numbers aggregates
across all CPUs and interfaces on a system?
-- How does one provide per CPU/interface/disk stats?
** Section 7.4.3
-- what is "access"? Is this describing the means by which the user accessed
the system? Is PPP the Point-to-Point Protocol? I don't recognize "SVN".
-- What is "online-duration"? and "logout-duration"?
** Section 7.5.2 Editorial. Using the term "victim-id" suggests to me that a
particular system has been exploited. I was under the impression that a
"vulnerability scanning log" merely found a vulnerability.
** Section 7.6.1. What is the temporal frame of reference for these counters?
For example, is the peak computed since the last time the counter was polled?
Since some internal state was reset?
** Section 7.7.1, What are "*-regions"?
** YANG. typedef dpi-type. What is the difference between "data-filtering"
and "application-behavior-control"? Wouldn't a subset of application behavior
be filtering data it sends?
** YANG. typedef operation-type. "Configuration" seems underspecified, unless
it is intended to mean any operation done by a user beyond login or logout.
For example, if I list the contents of a data store, I'm not doing a
configuration. Is that something that would be in scope to log?
** YANG. typedef operation-type. The information model distinguishes between
a privileged and unprivileged user. Does that distinction apply here?
** YANG. typedef login-mode.
-- Saying "root" seems Unix-centric.
-- "mode" doesn't seem like the right term. Root, user and guest are roles,
but the means of logging in is the same.
** YANG. identity periodical. Editorial. Should this be "periodic"?
** YANG. There are a number of enumerations whose descriptions are simply
repetitions of the name. Please scrub the model and add improved descriptions.
For example:
-- YANG. identity dampening-type, no-dampening and on-repetition. The
descriptions for these identities are not meaningful as they simply repeat
their names.
-- YANG. identity authentication-mode and event-type. The specific
authentication modes do not have meaningful descriptions and merely repeat
their names.
-- YANG. identity access-mode. The description does not make it clear what
this or the derived enums are.
** YANG. identity virus-type.
-- Typo. s/caan/can/
-- the taxonomy of virus-type seems to be incomplete. How does one
characterize non-self replicating, non-trojan and pure binary malware?
-- the descriptions of the derived virus-type enums are not meaningful and
simply repeat their names
** YANG. identity req-method. Is there a reason why the HTTP request methods
are incomplete?
** YANG. identity whitelist and blacklist. In the spirit of inclusive
language, please do not use these terms. Consider accept-list/allow-list and deny-list.
** YANG. identity user-defined. What does a "user-defined" list mean when
compared to an accept or deny list?
** YANG. identity malicious-category. How is this different than a deny list
("identity blacklist")?
** YANG. Per the various protocol identities (i.e., identity ftp, icmpv6,
etc.), can you explain how these are used (not saying there is an issue, I
just don't understand)?
** YANG. identity http. s/HTPP/HTTP/
** YANG. grouping common-monitoring-data. In this grouping, the severity type
uses enums of critical, high, middle and low. However, in the information
model, Section 6, the severity is described as a number from 0..3. Shouldn't
they be the same?
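As an illustration of what aligning the two would look like, here is a minimal Python sketch that ties the four names to the values 0..3. Which number corresponds to which name is my assumption (the ordering below is only a guess), and the helper names are hypothetical.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Assumption: 0 is the most severe. The draft may intend the opposite order.
    CRITICAL = 0
    HIGH = 1
    MIDDLE = 2
    LOW = 3


def severity_from_name(name: str) -> Severity:
    """Map an enum name as used in the YANG module to its numeric value."""
    return Severity[name.upper()]


def severity_name(value: int) -> str:
    """Map a 0..3 value as used in the information model to its enum name."""
    return Severity(value).name.lower()
```

Whatever the intended ordering is, stating it once and using it in both Section 6 and the YANG module would remove the ambiguity.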
** YANG. leaf vendor-name. Is this field also free form, like "leaf message"?
** YANG. leaf nsf-name. Is this either an IP or an FQDN (or host name), or is
the "name" here an arbitrary label?
** YANG. grouping i2nsf-nsf-event-type-content-extended. What are "src-zone"
and "dst-zone"?
** YANG. When "grouping log-action" is invoked in various cases in the model,
should there be flexibility to have multiple actions (if I'm reading the YANG
correctly, it's 0..1)?
** YANG. grouping attack-rates. Editorial. Please spell out PPS and BPS in
the description.
** YANG. grouping traffic-rates. What are the units and phenomenon measured
by the "leaf total-traffic"?
** YANG. grouping traffic-rates. For the counters that are *-average-*, how
is the time horizon conveyed?
** YANG. leaf src-user. Is that the I2NSF user? A user name?
** YANG. case i2nsf-traffic-flows.
-- This container represents flows, but all of the leaves seem to talk about
packets. Recommend s/of the packet/of the flow/
-- per "leaf arrival-rate", how is this computed? Is this the average arrival
rate for all packets in the flow?
** YANG. notification i2nsf-log. Case i2nsf-nsf-system-access-log.
-- Per "leaf administrator", is this the username of the "administrator"?
-- Per "leaf result", how would a binary result be encoded?
-- is there some way to convey the format of the input or output (e.g., the
content is a bash shell command)?
** YANG. container i2nsf-system-res-util-log.
-- per "leaf system-status", what kind of text would be extended in this free
form string?
-- per "cpu, memory, disk-usage", what are the units of the uint8 value?
-- per "session, process-num", is uint8 sufficiently large to represent a
session or process count?
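For reference on the uint8 question, a small Python sketch of the range check. The YANG uint8 range is 0..255; the sample counts in the usage note are hypothetical, and the helper name fits is mine.

```python
UINT8_MAX = 2**8 - 1    # YANG uint8 covers 0..255
UINT32_MAX = 2**32 - 1  # YANG uint32 covers 0..4294967295


def fits(value: int, maximum: int) -> bool:
    """Return True if value is representable in an unsigned type with this maximum."""
    return 0 <= value <= maximum
```

A percentage such as 100 fits in uint8, but a hypothetical count of 1000 sessions or processes does not (fits(1000, UINT8_MAX) is False), while it does fit in uint32.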
** YANG. container i2nsf-system-user-activity-log.
-- what do the fields online-duration and logout-duration mean?
-- per additional-info, how should the list of values after the "e.g.,
Successful User ..." be read? It seems to be suggesting a set of enumerated
values.
** YANG. container i2nsf-nsf-detection-ddos.
-- Per "leaf attack-src/dst-ip", can the text "... If there are a large number
of IPv4 (or IPv6) addresses, then pick a certain number of resources according
to different rules" be clarified? I didn't follow the guidance about picking
resources.
** YANG. i2nsf-nsf-detection-virus.
-- per "leaf file-type", could this be expressed as a
https://www.iana.org/assignments/media-types/media-types.xhtml?
-- should a hash of the file also be an option?
-- per "leaf os", what is meant by the "simple" adjective for "Simple OS
information"?
** YANG. container i2nsf-nsf-detection-web-attack
-- What is the "uri-category"?
-- What is the "rsp-code", is that the HTTP response code?
-- Is the req-client-app, the user agent string?
-- recommend that the descriptions reflect the precise HTTP header field names
if appropriate
** YANG. container i2nsf-nsf-log-vuln-scan.
-- is there a reason why there isn't a mechanism for structured vulnerability
information (e.g., CVE) or severity (e.g., CVSS)?
** Section 14. Per the usual YANG template, this text is silent on read
operations and RPC. Please clarify.
** Section 14. Clarifying text.
OLD
... which can be created, modified and deleted ... are considered sensitive
NEW
... which can be created, modified and deleted ... are considered sensitive as
they all could potentially impact security monitoring and mitigation
activities. Write operations (e.g., edit-config) applied to these data nodes
without proper protection could result in missed alarms or incorrect alarm
information being returned to the NSF collector.
** Section 14. Per "The monitoring YANG module should be protected by the
secure communication channel, to ensure its confidentiality and integrity.",
can the intent of this sentence please be clarified. The first paragraph of
this section already established that either SSH or TLS must be used.
** Section 14.
In another side, the NSF and NSF data collector can
all be faked, which lead to undesirable results (i.e., leakage of an
NSF's important operational information, and faked NSF sending false
information to mislead the NSF data collector). The mutual
authentication is essential to protected against this kind of attack.
The current mainstream security technologies (i.e., TLS, DTLS, IPsec,
and X.509 PKI) can be employed appropriately to provide the above
security functions.
There are a few threats here and they should be separated. Mutual
authentication doesn't appear to mitigate all of them.
-- compromised NSF (valid credentials): can send falsified information to the
NSF collector to mislead detection or mitigation activities; and/or to hide
activity. There is no in-framework mechanism to mitigate this, and it is an
issue for all monitoring infrastructures.
-- compromised NSF collector (has valid credentials): has visibility into all
collected security alarms; entire detection and mitigation infrastructure may
be suspect
-- impersonating NSF: system trying to send false information; client
authentication would help the NSF collector identify this invalid NSF in the
"push" model (NSF-to-collector); "pull" model (collector-to-NSF) should already
be addressed
-- impersonating NSF collector: legitimate NSF is tricked into communicating
with a rogue NSF collector; for "push" (NSF-to-collector), without valid
credentials, this should already not work; for "pull" (collector-to-NSF),
mutual auth would mitigate
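The breakdown above can be summarized as a small lookup table. This is only a Python sketch of my reading of the four threats; the names Mitigation, THREATS and unmitigated are hypothetical, and applying the same entry to both "push" and "pull" for the two compromised cases is my assumption.

```python
from typing import NamedTuple


class Mitigation(NamedTuple):
    push: str  # NSF-to-collector
    pull: str  # collector-to-NSF


THREATS = {
    # Assumption: the compromised cases are the same for push and pull.
    "compromised NSF": Mitigation("none in-framework", "none in-framework"),
    "compromised NSF collector": Mitigation("none stated", "none stated"),
    "impersonating NSF": Mitigation("client authentication", "already addressed"),
    "impersonating NSF collector": Mitigation("already addressed", "mutual authentication"),
}


def unmitigated(threats: dict) -> list:
    """List the threats that have no mitigation in either model."""
    return sorted(
        name
        for name, m in threats.items()
        if m.push.startswith("none") and m.pull.startswith("none")
    )
```

Calling unmitigated(THREATS) returns the two compromised cases, which is the point: mutual authentication helps only with impersonation.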
** ID nits returned the following:
== Unused Reference: 'RFC2119' is defined on line 3817, but no explicit
reference was found in the text
[Roman] This means the boilerplate text on RFC2119 is not used correctly
== Unused Reference: 'I-D.ietf-i2nsf-capability' is defined on line 3944,
but no explicit reference was found in the text
[Roman] This should be removed.
** Downref: Normative reference to an Unknown state RFC: RFC 956
[Roman] This document is referenced in the introductory paragraph of Section
10, but doesn't appear to be used.
** Obsolete normative reference: RFC 2616 (Obsoleted by RFC 7230, RFC 7231,
RFC 7232, RFC 7233, RFC 7234, RFC 7235)
** Downref: Normative reference to an Informational RFC: RFC 3954
[Roman] This reference can be informative
** Downref: Normative reference to an Informational RFC: RFC 4949
[Roman] This reference can be informative
** Downref: Normative reference to an Historic RFC: RFC 6587
[Roman] This reference can be informative. However, are you sure it shouldn't
be rfc5425 (syslog over TLS)?
** Downref: Normative reference to an Informational RFC: RFC 8329
[Roman] This reference can be informative
== Outdated reference: draft-ietf-netconf-subscribed-notifications has been
published as RFC 8639
== Outdated reference: draft-ietf-netconf-yang-push has been published as
RFC 8641
[Roman] These two references just need to be replaced by their corresponding RFCs.
Regards,
Roman