[OPSAWG]draft-ietf-opsawg-rfc5706bis-01 early Iotdir review

Dave Thaler via Datatracker Fri, 23 Jan 2026 10:01:54 -0800

Document: draft-ietf-opsawg-rfc5706bis
Title: Guidelines for Considering Operations and Management in IETF 
Specifications
Reviewer: Dave Thaler
Review result: On the Right Track

I am the assigned iotdir reviewer for this draft. For background on iotdir,
please see the [FAQ](https://wiki.ietf.org/en/group/iotdir). Please resolve
these comments along with any other comments you may receive.

A marked-up PDF copy with my comments inline is at
https://1drv.ms/b/c/dc2b364f3f06fea8/IQBuO9rPPwGxRLZZi9kQSJncAT_Aktrn9MuIurcZp88NRjs?e=2boyoh

I found the checklist of key questions in Appendix A to be well written,
useful, and widely applicable to areas across the IETF, including IoT
protocols. Similarly, the content in the body is well written and useful.

I do, however, have a number of comments on the body of the document, aimed at
making it more widely applicable to areas across the IETF. A summary of my main
technical feedback follows; many other minor editorial points (which I don't
expect to need WG discussion) can be found in my marked-up PDF copy.

1) Applicability of requirement: Three different places in the document make
three different, contradictory statements about which RFCs would be
required to have an Operational Considerations section.
a) abstract says all "RFCs in the IETF Stream"
b) 3.1 says all IETF RFCs "that document a technical specification"
c) Appendix B says "all new Standards Track RFCs"
I think section 3.1 is the best and the abstract and Appendix B should
both be changed.

2) Architecture RFCs: Most places in the document are consistent in saying
documents that specify a "New Protocol" or "Protocol Extension", but one
place in section 3.1 throws in "or an architecture". Generally speaking,
an implementation does not claim conformance to an architecture/framework
document, and so depending on how it is written and the content it may not
be considered a “technical specification”, just a roadmap document. In
that case, the previous paragraph would not require it in such an
architecture document. Furthermore, other parts of the document, like the
abstract, focus on requiring it in New Protocols and Protocol Extensions.
As such, I’d remove “or an architecture”. It might be ok in the preceding
paragraph to clarify that “anything an implementation would claim
conformance to is considered a technical specification”, and in my view
that would cover it.

3) Requirements around individual draft -00 submissions: Section 3.1 says
"early revisions of Internet-Drafts are expected to include an
Operational Considerations section". I'd find it a huge process hurdle
to “expect” all -00 versions of individual drafts to have such a section
as that would discourage many new entrants from participating in the IETF.
I might say "encouraged" instead of "expected".

4) Operator: I found much of the document, as currently worded, to be way too
_network_ operator focused, for a document that creates a requirement for
all areas, including IoT. Some places say "network operator" and other
places just say "operator". If you widen the term "operator" to be any
person or organization responsible for managing the protocol
implementations, then "operator" is fine but it should be added to the
Terminology section. E.g., is a cloud hosting service an "operator"? Is
a standalone DNS server admin an "operator"? Is an NTP server admin an
"operator"? In a home network, is the household member who configures
devices an "operator"? I'd want the definition to be such that the answer
to all of those is Yes (or else pick a different term that is generic),
so that the recommendations in the document are as widely applicable and
useful as possible. The checklist in Appendix A certainly is good already.

Similarly there are a bunch of places that only talk about "their network"
(e.g., section 4.6) and "impact ... on the network" (e.g., section 4.7),
rather than about "their devices and bandwidth" or whatever. Impact on
the network is good to talk about, but from an operations perspective the
impact on hosts/devices is also important in my view, and seems largely
missing.

5) Network Operation: Section 4.5 contains a statement:
> If the protocol specification requires changes to end hosts, it
> should also indicate whether safeguards exist to protect networks
> from potential overload.

This statement seems asymmetric and biased in terms of only being from the
perspective of a network operator. Shouldn’t there be a similar statement
that if a protocol specification requires changes to routers it should
indicate whether safeguards exist to protect hosts from potential
overload? My point is really that it seems to be more about protecting
one organization from entities that aren’t under their control. In some
cases the hosts/servers may be more strictly managed than the network
boxes (e.g., in some home networks), and indexing on host vs network is,
in my view, not the right axis here if one is going to be asymmetric in
recommendation. My point is consistent with the wording in 2.1.2 of
RFC 5218 “Protocols that can be deployed by a single group or team … have
a greater chance of success than those that require cooperation across
organizations” (which makes no distinction between network vs host per se).

Section 5.4.4 (Fault Isolation) is ok but seems overly network centric.
Say you have a docker container that is misbehaving in some way… the host
could isolate or quarantine the container. Same for VMs. Or say you have
a process in a host that is misbehaving… the kernel could isolate or
quarantine the process. I’d make the wording here more generic and less
network operator centric. Operations and management is about more than
just network operators per se. The guidance is good and just using more
generic terminology here in terms of stating the principles would make
the section stronger and more impactful in my view.

6) Internationalization: Section 4.8 suggests that English should be the
default language in implementations for human readable messages. I don't
think this document should make any such recommendation. I do, however,
recommend adding that it must also be possible to identify which language
a message intended for humans is in (e.g., via a language tag). Otherwise,
it cannot be reliably displayed correctly.

Section 5.5 also has an internationalization issue. It cites an IAB
workshop RFC (where such RFCs reflect the consensus of workshop
participants, not the IAB or IETF per se), and then makes a blanket
statement about configuration files that "human-readable strings should
utilize UTF-8" which comes across as saying this is now an IETF consensus
statement. There is IETF consensus on UTF-8 _in protocols_, and more
specifically UTF-8 with NFC (see section 2 of RFC 5198, which can be
cited as a normative reference here) but not in _device-local files_.
The IETF has no recommendation about files since they’re outside the
scope of Protocols per se. Different OS's already diverge in terms of
both normalization form and UTF-8 vs UTF-16. Hence either change the
text to be about strings in protocols (not textual configuration files
like the preceding sentence says) or make it clear that it is not an
IETF recommendation, or else be prepared for an IETF-wide discussion that
will never converge.
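
A minimal sketch of the two internationalization points above, in Python. It
assumes a hypothetical TaggedMessage container and that the language tag is a
BCP 47 string; neither comes from the draft. It shows why a string carried in
a protocol needs NFC normalization (per section 2 of RFC 5198) before being
encoded as UTF-8, and how a language tag can travel with the text.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedMessage:
    """Hypothetical container: human-readable text plus its language tag."""
    lang: str  # assumed to be a BCP 47 tag such as "en" or "fr-CA"
    text: str  # always stored in NFC


def make_message(lang: str, text: str) -> TaggedMessage:
    # Normalize to NFC so equivalent strings compare and encode identically.
    return TaggedMessage(lang=lang, text=unicodedata.normalize("NFC", text))


def encode_for_wire(msg: TaggedMessage) -> bytes:
    # UTF-8 encoding of the already-normalized text.
    return msg.text.encode("utf-8")


composed = "caf\u00e9"     # "e with acute" as a single code point
decomposed = "cafe\u0301"  # "e" followed by a combining acute accent

# The two spellings render the same but differ as raw strings.
assert composed != decomposed

# After NFC normalization they produce identical bytes on the wire.
a = make_message("fr", composed)
b = make_message("fr", decomposed)
assert encode_for_wire(a) == encode_for_wire(b)
```

Without the lang field, a receiver has only the bytes and no reliable way to
choose fonts, directionality, or a translation for display.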

7) Information Model Design: The document nicely recommends in point 1 of
5.3.1 to "start with a small set of essential objects", which is great.
I’ve seen cases where someone just exposes everything just because it’s
there, not because there’s any need (“someone might want it”). As a
result, querying all state can be burdensome, since the state can be large
and/or a given value can be expensive to query, and this can also discourage
implementers from supporting the query mechanism at all because it is too
burdensome to implement. To determine what is “essential”, I usually recommend
determining what questions need to be answered to troubleshoot, configure,
etc. and exposing the things that are needed to answer those questions.
It might help to say something like this to help readers understand what
is “essential” here. And I think that's consistent with the purpose of
Appendix A's checklist.

Point 2 in 5.3.1 says "Require that all objects be essential for
management" but I don't follow what that means. Please elaborate.

Point 6 says "Avoid causing critical sections to be heavily instrumented".
I think it’s not just “critical sections” per se, but anything that would
be expensive. E.g., if someone wants to expose a summary object _rather
than the components of it from which the client could do the computation_,
it would still meet criterion 4, but may be expensive to compute.
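
The "start from the questions" approach to deciding what is essential can be
sketched as follows. This is only an illustration: the questions and object
names are hypothetical and are not taken from the draft or from any real
data model.

```python
# Hypothetical troubleshooting/configuration questions, each mapped to the
# objects needed to answer it.
QUESTIONS = {
    "Is the peer reachable?": {"peer_state", "last_keepalive_time"},
    "Why are packets being dropped?": {"drop_count", "drop_reason"},
    "Is the configured timer in effect?": {"keepalive_interval"},
}

# Everything the implementation could expose "because it's there".
AVAILABLE = {
    "peer_state",
    "last_keepalive_time",
    "drop_count",
    "drop_reason",
    "keepalive_interval",
    "internal_hash_seed",
    "alloc_pool_size",
}


def essential_objects(questions: dict) -> set:
    """Objects needed to answer at least one question."""
    return set().union(*questions.values())


def unjustified(available: set, questions: dict) -> set:
    """Objects that no question needs; candidates to leave out."""
    return available - essential_objects(questions)


print(sorted(unjustified(AVAILABLE, QUESTIONS)))
```

Anything returned by unjustified() has no question behind it, which is a
reasonable signal that it does not belong in the initial set of objects.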

8) Liveness Detection: section 5.4.1 says:
> Protocol Designers should always build in basic testing features
> (e.g., ICMP echo, UDP/TCP echo service, NULL RPCs (remote procedure
> calls)) that can be used to test for liveness, with an option to
> enable and disable them.

I’m not convinced there aren’t exceptions, such as maybe for very
constrained IoT devices. Recommend removing “always” and just leaving
it as lower case “should” like other statements in this doc.
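
For reference, the kind of feature the quoted text describes is small. The
sketch below is a hypothetical in-memory echo responder with an enable/disable
switch; the class and function names are illustrative and it deliberately
omits any real transport (ICMP, UDP, TCP, or RPC).

```python
from typing import Optional


class EchoResponder:
    """Hypothetical liveness responder that echoes a probe when enabled."""

    def __init__(self, enabled: bool = True) -> None:
        self.enabled = enabled

    def handle(self, payload: bytes) -> Optional[bytes]:
        # When disabled, stay silent, as a device with the feature off would.
        if not self.enabled:
            return None
        # When enabled, return the probe unchanged.
        return payload


def is_alive(responder: EchoResponder, probe: bytes = b"liveness-probe") -> bool:
    """Treat the responder as live only if it echoes the probe back."""
    return responder.handle(probe) == probe


assert is_alive(EchoResponder(enabled=True))
assert not is_alive(EchoResponder(enabled=False))
```

Even a responder this small costs code space, and answering probes costs
power, which is why a very constrained device might reasonably omit it.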

9) Configuration Management: Continuing my theme of making the document
less network-operator-centric in order to apply more generally, including
to IoT cases... section 5.5 (Configuration Management) comes across to
me as overly network centric for the section title which is nicely
generic. So if you manage a bunch of end hosts, or a bunch of Kubernetes
pods, or a bunch of IoT devices, or a bunch of VMs on a cloud service,
or a bunch of processes on one or more devices, this section should
still apply, but it provides little or no guidance. Either change the
section title and narrow the scope, or else (my preference) broaden the
discussion. For example, it would be remiss to not mention Kubernetes
in a general discussion of configuration management. Similarly, for
IoT devices there are various centralized configuration management
services such as Balena, SocketXP, Golioth, ThingsBoard, etc. One need
not name them (I wouldn't), but simply acknowledging the existence of
popular centralized management platforms would seem appropriate.

10) Operational Consideration section: The rest of the document already
says that this section shouldn't be required in documents that aren't
technical specifications and section 3.1 specifically uses process
documents as an example of when they're not required. Since this
document itself is a process document, it's not required, so why is
it here? If you do keep this section, you could say explicitly that
it's not required in a document of this type, so people don’t try
to use this as a precedent to create barriers that aren’t required.

11) Network Device: This term is used in several places (e.g., section 9
among others) without definition. Is it "a device managed by a
network operator"? Is it "any device on the network, whether
router or end host"? Is it "a device that implements the New Protocol
or Protocol Extension in question"? If you use this term, an entry
in the Terminology section might help.

12) Password-based authentication: Section 9 (Security Considerations)
says "The security implications of password-based authentication should
be taken into account when designing a New Protocol or Protocol
Extension." True, but this should already be stated in other RFCs, and
is not specific to O&M considerations per se. So is this sentence really
needed in _this_ document too? It seems anachronistic to me, even
though it's clearly good advice.

Dave Thaler

_______________________________________________
OPSAWG mailing list -- [email protected]
To unsubscribe send an email to [email protected]
