Eric, WG:

Summary:

This email explains the fixes made in revisions -23/-24, just posted, relative to February's -22 revision. They are primarily a response to Eric Vyncke's review (Eric is now the responsible AD for this document), and also to the discussion we had with the security folks.
Functional changes are quite limited:

- Refined IPsec requirements.
- Lowercase (instead of uppercase) hex in ACP domain info.
- Added explanation of clock requirements (6.1.3.1).
- Explicit SHOULD for TLS 1.3, "desirable" for DTLS 1.3 (as long as there is no RFC).
- Filtering requirement for (currently unused) RPL headers.

All the other changes just result in better text.

If Eric agrees that his points are resolved well enough to move forward, we still have one discuss from Ben Kaduk re. the encoding of ACP domain info (see reply to him), but he said he wouldn't want to hold up the doc for it (maybe check with him), and some small fixes in the IPsec section from the discussion we had in prior weeks (i may just be running out of time before the 2 week shutoff before IETF, so i am first committing without those).

Details:

Thanks a lot, Eric, for your great review.

Sorry for the long time to get back to your reply, but i had no time before the end of January, and since then i have been working hard on the non-trivial 50% of your 72 points and also trying to address the remaining points from Ben's review and the other IPsec discussion.

- Rev -22 from early February attempts to address Ben's remaining point(s), but we did not finalize that discussion with the SEC experts yet; i will start a separate email thread for that.
- Revision -23 addresses 71 of your 72 points. The answer to your point 69 is a swap of two big sections and was therefore committed to -24, to have a useful -22 to -23 rfcdiff.
- I still need to commit (-25) a few paragraph changes for IPsec from the discussion on the IPsec mailing list. I may run out of time before the 2 week downtime for datatracker. Will let you know when i commit. Aka: -25 should be my final offer before i get more feedback.

I have appended your original review points, with the same point numbers as in the response below. Not sure if you want/need to submit your points to datatracker. Just in case.
Your review points with my replies are below, prefixed by {<number>:<status>}, where status can be:

a(nswered) - (30 of 72) no textual change, but a textual answer intended on my side to close the point.
f(ixed) - (41 of 72) textual fixes, in my opinion closing the point.
d(elayed) - your point 1, see below.

You may of course not agree with my solutions, so please check my answers. As always, a summary of the important text diffs is in the changelog section of the document.

RFCdiff (you reviewed -21, but -22 only introduced more changes in response to Ben's review, which didn't overlap with your questions. In addition, it removed all of the changelog and summarized it, so it is easier for you to review against -22):

http://tools.ietf.org//rfcdiff?url1=https://tools.ietf.org/id/draft-ietf-anima-autonomic-control-plane-22.txt&url2=https://tools.ietf.org/id/draft-ietf-anima-autonomic-control-plane-23.txt

And this diff is just the reordering from your point 69:

http://tools.ietf.org//rfcdiff?url1=https://tools.ietf.org/id/draft-ietf-anima-autonomic-control-plane-23.txt&url2=https://tools.ietf.org/id/draft-ietf-anima-autonomic-control-plane-24.txt

Cheers
    Toerless

On Fri, Jan 03, 2020 at 03:45:36PM +0000, Eric Vyncke (evyncke) wrote:

{1:d} [ Wrt. non-technical textual review, shortening sentences etc.: ]

Agreed. I did this once after WG last call, and i will do it again before the document goes to the RFC-editor, but for the time being i would like to keep the diffs focused on the replies to technical issues, to make reading the diffs easier for reviewers. I am probably also the worst of all authors at shortening sentences.

{2:f}
> Please also check the long output of https://tools.ietf.org/idnits?url=https://tools.ietf.org/id/draft-ietf-anima-autonomic-control-plane-21.txt

Yikes. https://trac.tools.ietf.org/tools/ietfdb/ticket/2880

I was so happy that all my prior versions passed the idnits check; you were the first reviewer to note that they really don't AND to provide me with the URL to prove it.
Because to me it always looked fine the way i did it. I tried to fix what IMHO really was an issue. Lots of false positives, i think (e.g.: section number references like 6.7.1.2 seem to be recognized as non-compliant IPv4 addresses ;-). There are some complaints about references to obsoleted RFCs, but those are intentional: i left some older protocols where idnits points to newer versions, e.g.: i am pointing to (insecure) BSD syslog in the list of widely used protocols to be protected by the ACP. There is a newer RFC that would allow combining syslog with TLS without specifying exactly how, and i don't think it is available on any router today. So i kept the widely deployed / insecure protocol references for that part of the text.

> No source of time

Asked and answered below.

>
> Generalized TTL Security Mechanism (GTSM)

Asked and answered below.

>
> IANA consideration for 'type' field of the address

....

>
>
{3:a}
> - *** why not using Generalized TTL Security Mechanism (GTSM) in addition to the use of LLA ?

Interesting question. I thought we discussed this, but i cannot find evidence on the mailing list. RFC5082 does not discuss how relevant GTSM still is in the presence of link-local addresses. I for one have not heard of, nor could i think of, a way to create the attack vector that GTSM intends to protect against when link-local destination addresses are used. Are you aware of any specs arguing that GTSM on top of link-local addresses improves something? I quickly browsed through the RFCs and could not find anything.

Also: the use of TTL=1 with link-local addresses for DULL GRASP is specified in GRASP (section 2.5.2 of the GRASP draft), not in ACP. ACP just uses it. If you wanted to change it, that would be an update to the GRASP spec. Also, DULL GRASP in ACP results in secure channels, yet another line of defense.

You can see from the text that i do in general like to address repeatedly asked questions with explanatory text (e.g.: appendix discussion re.
CDP/LLDP, repeatedly asked), but if i remember correctly, you're the first one who asked this, hence no text added (yet ;-) for this.

{4:f}
> - *** there is very little text about time synchronization except a few words in section 10

See the new subsection "Realtime clock and Time Validation" in the ACP domain membership section. Effectively, we need a followup document describing a standard for how to distribute current time information across the ACP if we want to properly support ACP nodes without a realtime clock (or with realtime clocks with dead batteries).

{5:a}
> - *** I am not too familiar with RPL but AFAIU there is a RPL root, how this RPL root is selected in ACP ?

See 6.11.1.12: automatically selected via the signaled DODAGPreference of the ACP node.

{6:f}
> - while I like the fact that section are indicated as "(informative)" and as I have never seen this used before this I-D, I wonder whether some explanations of this tagging would be welcome

Now in the acronym/terminology section:

<t>This document serves both as a normative specification for how ACP nodes have to behave as well as describing requirements, benefits, architecture and operational aspects to explain the context. Normative sections are labelled "(Normative)" and use <xref target="RFC2119"/>/<xref target="RFC8174"/> keywords. Other sections are labelled "(Informative)" and do not use those normative keywords.</t>

Btw: the reason is that the WG was not allowed to write separate architecture and requirements documents under charter 1. Not sure in hindsight if juggling 3 or more interdependent documents would have been easier than the one giant document approach we ended up with.

{7:f}
> - section 1, "Management and Control" meaning no monitoring ? even if explained later

I have replaced the "Management" in "Management and Control" with "OAM", because we really mean OAM when we say "Management". And OAM of course is inclusive of monitoring.
Given that we were only pointed late in IESG review to prefer the term OAM over "management", i cannot replace "management" with OAM in all places, so i added definitions for Management and OAM to the terminology section, indicating that Management means OAM, and that OAM also includes monitoring.

{8:f}
> - section 1, ACP/OAM are defined twice but ANI is not

fixed.

{9:f}
> - section 1, " An operator can use it to log" should be clarified: SSH ? SNMP? NETCONF ? (even if explained later)

Fixed to:

An operator can use it to access remote devices using protocols such as Secure SHell (SSH) or Network Configuration Protocol (NETCONF) running across the ACP

{10:f}
> - section 1.1, DTLS 1.2 while everyone moves to TLS 1.3 ? I am unsure whether there is a DTLS 1.3 in the cooking but be ready to have comments from the transport area AD

Note that Eric Rescorla and Ben Kaduk did not raise concerns about this. Let me answer this point in the answers to your later points about DTLS.

{11:a}
> - section 2, in " ACP secure channel" is integrity also important? It is usually a side effect of encryption though but worth mentioning ?

Fixed. Btw: integrity protection is not a side effect of encryption but of authentication. E.g.: ESP with NULL encryption or AH would also give integrity protection.

Actually, a followup question to the IPv6 expert: integrity protection not only helps against attackers (which is primarily why we do it), but possibly also against bit/frame errors in a non-integrity-protected L2 underlay. Given that IPv6 removed the header checksum, i wonder if that was ever seen as a downside vs. IPv4 on those types of L2. I for one wouldn't know/remember such an L2 (e.g.: ethernet has a checksum), so i am not sure it exists, but if it does, then we could mention it as a case where ACP secure channels provide protection.

{12:f}
> - section 2, in " BRSKI and GRASP are products of the IETF ANIMA working group" replace "products" by "specifications" ?

fixed.
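
To make the remark under point 11 concrete: integrity protection comes from a keyed authentication tag, with or without encryption. The following is only a sketch using Python's standard hmac module on a cleartext payload. The key, the payload and the helper names protect/verify are made up for illustration, and this is not the ESP or AH wire format.

```python
import hashlib
import hmac

# Assumption: both peers already share this key (e.g. from a key exchange).
KEY = b"hypothetical-shared-key"
TAG_LEN = 32  # length of an HMAC-SHA256 tag in bytes


def protect(payload: bytes) -> bytes:
    """Append an HMAC-SHA256 tag; the payload itself stays in cleartext."""
    tag = hmac.new(KEY, payload, hashlib.sha256).digest()
    return payload + tag


def verify(message: bytes) -> bytes:
    """Return the payload if the tag matches, otherwise raise ValueError."""
    payload, tag = message[:-TAG_LEN], message[-TAG_LEN:]
    expected = hmac.new(KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed")
    return payload
```

Nothing here is encrypted, yet any modified bit in the payload makes verify() fail, which is the same property that NULL-encryption ESP or AH provide.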
{13:f}
> - section 2, in " node: A system, e.g.," please remove "e.g."

fixed.

{14:a}
> - *** section 2, remove " It is the approximate IPv6 counterpart of the IPv4 private address"

This sentence was refined, i think, twice or more through reviews, because i had great success using this comparison with customers when explaining ULAs. And in any discussion where i tried to explain ULA without comparing to IPv4, i always got the question "so this is the IPv6 version of private addresses in IPv4 ?".

If you want to propose a better sentence that explains ULA by comparing it to IPv4 private addresses, i am happy to take new/better text, but i think without such a comparison we would be doing a disservice to readers unless they already are IPv6 geeks.

{15:f}
> - section 6.1.1 it is unclear whether it is "beneficial to copy the device identifying fields of the node's IDevID into the ACP domain certificate," as the same paragraph also says it is a bad idea...

Ok, i hope this is now easier to read; it also adds a hopefully good recipe.

<t>For diagnostic and other operational purposes, it is beneficial to copy the device identifying fields of the node's IDevID into the ACP domain certificate, such as the "serialNumber" (see <xref target="I-D.ietf-anima-bootstrapping-keyinfra"/> section 2.3.1). This can be done for example if it would be acceptable for the devices "serialNumber" to be signalled via the Link Layer Discovery Protocol (LLDP, <xref target="LLDP"/>) because like LLDP signalled information, the ACP certificate information can be retrieved bei neighboring nodes without further authentication and be used either for beneficial diagnostics or for malicious attacks.
Retrieval of the ACP certificate is possible via a (failing) attempt to set up an ACP secure channel, and the "serialNumber" contains usually device type information that may help to faster determine working exploits/attacks against the device.</t>

{16:f}
> - section 6.1.2 remove leading 0 in the IPv6 address of the example

Hope you meant:

given an ACP address of fd89:b714:f3db:0:200:0:6400:0000
now:
given an ACP address of fd89:b714:f3db:0:200:0:6400:0

{17:a}
> - section 6.1.2 "32HEXDIGIT" or "32 HEXDIGIT" ?

"32HEXDIGIT" is correct, see rfc5234, 3.7

{18:a}
> - section 6.1.2 it is unclear whether acp-address is a valid ULA address as the text mentions later "hash to generate ULA"... Also, is there any specification on how to generate this acp-address? Same /48 prefix for example ?

This text is deliberately the way it is. Better explanations would have to point to potential future work, and i had some important IESG reviewers (including Alica) say that text about "future" makes the document look incomplete and that they would rather not have it. I do not agree with this, but i follow this IESG advice.

The mandatory hash that a registrar uses to create the ACP ULA is defined in 6.10.2 (base ACP address scheme). An ACP domain information field where the IPv6 address is NOT a ULA is perfectly compliant with 6.1.2, but the use of non-ULA addresses is outside the scope of what is standardized in this spec because of 6.10{.2}.

Aka: future variations of ACP could perfectly well use a different hash or even non-ULA addresses and should still be compatible with 6.1.2; they would only need to update/ignore/change 6.10.2.

If i put in stronger text to emphasize this distinction between 6.1.2 and 6.10.2, such text might help one type of future extension, but might hurt other types of future extensions. Aka: the text says everything we know to be correct, mandatory and necessary for this spec's target, and tries NOT to say things that would potentially make extensions more difficult.
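
As a quick cross-check of points 16 and 18, the textual form and the ULA property of an address can be tested with Python's standard ipaddress module. This is just a sketch: the helper names normalize and is_ula are made up here, and fc00::/7 is the RFC 4193 ULA range, not something defined by the ACP draft.

```python
import ipaddress

# RFC 4193 unique local address range (fd00::/8 is the locally assigned half).
ULA = ipaddress.IPv6Network("fc00::/7")


def normalize(addr: str) -> str:
    """Return the lowercase text form with leading zeros dropped per group.

    str() on an IPv6Address also collapses a run of two or more all-zero
    groups into "::", which does not occur in the example address.
    """
    return str(ipaddress.IPv6Address(addr))


def is_ula(addr: str) -> bool:
    """True if the address falls inside the ULA range."""
    return ipaddress.IPv6Address(addr) in ULA


print(normalize("fd89:b714:f3db:0:200:0:6400:0000"))
print(is_ula("fd89:b714:f3db:0:200:0:6400:0"))
```

An address for which is_ula() returns False could still be syntactically fine for 6.1.2; as explained above, it would just be outside of what 6.10.2 standardizes.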
{19:f}
> - *** section 6.1.3, suggest to use "MUST" when closing the secure channel upon discovering via CRL/OCSP that the cert was invalid.

fixed.

{20:f}
> I would even suggest to use "SHOULD close all ACP peers connection" to block the wrong path for the benefits of 'downstream ACP nodes'

I added "This applies of course to all ACP secure channels to this peer if there are multiple.", but i find it somewhat redundant. Is that what you meant ?

{21:f}
> - section 6.1.3 about the same numbered points: please move the "Note:" below point 4) as done for point 6)

fixed.

{22:f}
> - *** section 6.1.3 (and others) MUST use normative language, i.e., "MUST" "SHOULD"... in uppercase

Ok, i did a complete normative text scan (sigh ;-). I changed some must to MUST where i felt it added relevant interop requirements; otherwise i changed the text to use words other than must/should.

Example: "In DULL this field is irrelevant but must still be set according to the GRASP specification." I rather feel that it doesn't make sense to repeat normative requirements from other specs, so i changed this to "but is still set...".

Example: "the ACP connect interface and NOC systems connected to it 'needs to be' physically controlled/secured". This was 'must', but i don't want to be challenged on how to implement a MUST or to give an RFC reference for it.

I could not find any missing SHOULD. There are a lot of explanations (not the actual normative requirements) where should is used, and i didn't feel it would help the text to change the language to avoid the word should.

{23:a}
> - section 6.1.4 I am pretty sure that the mechanism of cert chains & trust anchors are well defined in the literature, perhaps easier to refer to rather than describing the mechanism

Well defined, probably, but: it's actually not easier to refer to, because a lot of it is in non-IETF docs like the X.5xx ITU-T docs, and the PKIX architecture of the IETF is also quite "scattered" across RFCs.
And actually there is little operational documentation, mostly only protocol specs that are hard to understand for non-sec experts.

I think it is a big benefit to the target audience of the document to have this summary. It hopefully makes adopting ACP as "standalone" as possible for developers/operators. Aka: if we want to proliferate security architectures, we need more documents like this, which explain enough of how it can be applied but also include all the key aspects that you need to understand/use.

Ultimately, this section also summarizes the security understanding of the authors, and by having it written down and put through SEC AD review there is a better degree of confidence that it is sound/correct.

{24:f}
> - section 6.1.5 " remember the EST server" is unclear... is it the FQDN or IP address or xyz to be remembered ?

fixed to:

ACP nodes SHOULD be able to remember the IPv6 locator (parameters of the O_IPv6_LOCATOR in GRASP) of the EST server...
....

{25:a}
> - section 6.1.5.2 should some randomness be added for the time when cert has to be renewed? I fear flash crowd effect

This text doesn't preclude implementations from doing this, but let's not over-engineer the normative part with options whose benefit is not really clear. I have not seen such randomness in real life deployments of e.g. VPN solutions, especially not when running an IPsec VPN in production (which i did for a while). It would also have made life in operations more difficult, because it makes it harder to recognize precisely when a node that should renew is not doing so. With explicitly known times you can calculate this from the certs on the CA and their lifetimes, and automate/track whether nodes do renew accordingly.

Also note that cert lifetimes typically start when the certs are issued to the nodes, which at least today coincides with physical deployment. So there is no risk of flash crowds.
Even if future, more automated deployments batched the initial BRSKI rollout, e.g. by bringing all new pledges online once a day or so, any possible performance issue would first be seen in BRSKI, and could also easily be fixed by randomly varying the lifetime in certs instead of putting more nerd-knobs into on-node code.

{26:f}
> - section 6.1.5.3 " SHOULD support Certificate Revocation Lists (CRL)" should specify 'processing' or 'retrieval' or ...

fixed to:

SHOULD support revocation through Certificate Revocation Lists (CRL)

{27:a}
> - section 6.1.5.4 while I am a big fan of very short cert lifetime (to avoid CRL), I am less sure for the ANIMA use case... what if the ACP node is disconnected for 1 day? No way to restart the whole process :-( with going through 6.1.5.5 ?

Re-enrollment via BRSKI can be fully automatic, and without having to go back to the MASA, when the BRSKI registrar ignores the client certificate's expiry time. This option is, i think, now documented in ACP and in BRSKI too.

To keep a particular network region alive under a loss of external connectivity that lasts longer than the cert lifetime, you can use a region-local registrar with built-in Sub-CA functionality, which also runs fully automatically. It could be in every branch of an enterprise (WAN edge router). Nowadays you have a lot more complex functionality on those boxes than a subCA/registrar. This is also a reason why i added all this subCA text to the doc.

Overall, this may not result in a more stringent security model than CRL/OCSP under failures and attack, but it's well in line with today's directions of survivability design in typical ACP targets: enterprise, SP, manufacturing,... CRLs also have a lot of silent failures where you just don't get new updates for them, as they're yet another rarely used separate signaling channel.

{28:a}
> - *** section 6.3 unsure to understand why you need to use SLAAC for a link-local IPv6 address.

You need a link-local IPv6 address for DULL GRASP messages and for the ACP secure channel that would use that link-local address. Auto-assigning a link-local IPv6 address requires DAD. DAD is part of SLAAC. Please suggest better text if this is not it.

{29:a}
> - *** section 6.3 most of the implementations that I know do not use MLD for link-local multicast, they simply flood. Especially on a p-2-p link.

Sure, but that is not the point. See the more comprehensive answer to your following point:

{30:a}
> Please reconsider rewriting the section on MLD snooping requiring MLD by some more explanations. Also, the use of a IANA ll mcast should probably render MLD snooping useless (i.e. I am pretty sure that router / nodes do not use MLD for ff02::1 or ff02::2)

No change because:

RFC2710 (MLDv1):
> MLD messages ARE sent for multicast addresses whose scope is 2
> (link-local), including Solicited-Node multicast addresses [ADDR-
> ARCH], except for the link-scope, all-nodes address (FF02::1).

I repeatedly asked about this point in PIM-WG over the last few years, and got (as far as i remember) reconfirmation that this is indeed what we want. I remember that we did this explicitly when we did MLDv1, because of all the problems we had with link-local multicast in IPv4 (and snooping switches not being able to deal with this because IGMP never demanded it).

Alas, i did not have the time to fully review MLDv2 when it came out, and it is indeed missing that sentence. The simple explanation for this is that the authors of MLDv2 (not involved before) did not inherit any MLDv1 text but translated IGMPv3 from IPv4 to IPv6, and at that time there was, i think, not too much review of whether everything we had improved in IPv6 with MLDv1 was completely carried over into the completely separately written MLDv2 RFC.

More (not so) funnily, i think i could not find a statement in any document that you MUST use IGMP/MLD as a listener, except the one above from MLDv1.
So for all intents and purposes it's up to the application to decide if it wants to use MLD/IGMP.

Aka: With the current MLDv2 text, it is the prerogative of the app (GRASP or ACP) to mandate that applications receiving IPv6 multicast packets use MLDv2, whatever the scope of the address is.

Aka: That is the ultimate explanation why this ACP text can mandate this now (without waiting for any MLDv2 text changes).

Better yet: the spec(s) themselves (MLDv2, maybe also IGMPv2) need to be fixed. I opened an erratum against RFC3810 and will discuss it:

https://www.rfc-editor.org/errata/eid5977

I will also work on resolving the missing text in PIM-WG. Given some of the RFC8200/SRv6 discussion i see (no re-interpretation of old group intent), maybe we need a one-page update to MLDv2 instead of an erratum, but i think it wouldn't be contentious in PIM-WG.

I don't really think i want to explain any of this mess in the ACP document, hence no change. Please suggest text if you think there is some that doesn't look too much like "dirt under the MLD carpet" and doesn't become too long.

{31:f}
> - section 6.3 (and possibly others) please use only lowercase in IPv6 address (e.g. fe80...FEED... looks weird)

Ack. I also changed the rfc822 encoding of the address back to lowercase.

{32:a}
> - section 6.3 s/ttl/TTL/

Not fixed. Blame Brian (Carpenter, GRASP author). GRASP defined 'ttl' as a time to live in units of msec, so not only is the name fixed (ttl), but i definitely also do not want any confusion with (IP-) TTL, which is actually used with that meaning in the RPL section.

{33:f}
> - section 6.3 IKEv2 was already expanded before. The very same issue (repeating expansion) occurs quite often in the document... Hence, the doc has an 'amateur' look deserving it (because it is real smart work)

Ack. I wrote a script to find those cases, and fixed them. I hope the script found all of them.

{34:f}
> - section 6.5 please explain notation like " [4:C1]"

Fixed. (C1 is the connection identifier).
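
For readers who want to run a similar check for repeated expansions (point 33) on their own drafts: the following is NOT the script i used, only a minimal sketch of the idea in Python. The regular expression is a rough heuristic that looks for a few capitalized words followed by an acronym in parentheses, and the function name repeated_expansions is made up.

```python
import re
from collections import defaultdict

# Heuristic: one to six capitalized words followed by "(ACRONYM)".
EXPANSION = re.compile(
    r"((?:[A-Z][\w-]*\s+){1,6})\(([A-Z][A-Za-z0-9]{1,9})\)"
)


def repeated_expansions(text: str) -> dict:
    """Map each acronym that is expanded more than once to its text offsets."""
    seen = defaultdict(list)
    for match in EXPANSION.finditer(text):
        seen[match.group(2)].append(match.start())
    return {acr: offsets for acr, offsets in seen.items() if len(offsets) > 1}


sample = (
    "Internet Key Exchange (IKEv2) is used. Later, "
    "Internet Key Exchange (IKEv2) again. Secure SHell (SSH) once."
)
for acronym, offsets in repeated_expansions(sample).items():
    print(acronym, offsets)
```

A heuristic like this produces both false positives and misses (for example expansions that start with a lowercase word), so its output still needs a manual pass.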
{35:f}
> - section 6.5 please expand 'MTI' and why not using IETF "MUST" ?

fixed and fixed.

{36:f}
> - *** section 6.7 about PFS, did you check that DTLS 1.2 support PFS ?

Yes. The ACP spec says MUST support RFC7525, which says:

This document therefore advocates strict use of forward-secrecy-only ciphers.

Ben asked me to change PFS to "forward secrecy". I also changed it to "MUST use forward secrecy".

[ Alas, i find rfc7525 somewhat lacking, as its list of crypto options is not explicit about which ones actually do provide PFS, but that's the IETF BCP for the subject matter, and if that BCP finds it adequate to let the reader figure out by herself which of the hundred crypto algorithms in TLS/DTLS do that, then i want to be the last one to give more explicit guidance in an already way too long ACP spec (rant off). ]

{37:a}
> - *** section 6.7 can ACP really rely on any L2 security mechanism? Or isn't it a catch 22 game ?

I reread it several times, and i think the paragraphs are sound. The paragraph reflects what we brainstormed outside the IETF for MacSec, but it's really a generic template. Think for example of IPsec where you use MacSec instead of ESP. Of course, you need to NOT encrypt IKEv2 packets via MacSec, just like you do not encrypt them through ESP either; that's how you avoid your catch 22.

{38:f}
> - section 6.7.1.1 I do not mind too much, but, I wonder why you put some IANA non-consideration in the text. Suggest to remove

Leftovers from the early days, when we were trying to reconfirm for ourselves what we needed to ask IANA. fixed.

{39:f}
> - section 6.7.2 the text about DTLS 1.3 is unclear.

Fixed. See next point.

{40:f}
> I have really mixed feeling about using DTLS 1.2 as it is soon to be deprecated and ANIMA should use the latest and brightest (OTOH one approved your document can sit in the RFC editor queue for months/years if waiting for DTLS 1.3 to be published)

As part of the parallel discussion with the security folks, there are a few more improvements in the security text than just what you asked for.
Primarily, i pulled out what i think are good common requirements from IPsec/DTLS and put them into a common paragraph at the top of the section.

If i recall all my discussions correctly: the SEC AD had no concerns with mandating only DTLS 1.2, nor even with only requiring TLS 1.2 (even though TLS 1.3 is out).

For DTLS i did rewrite the text to address your points:

DTLS 1.2 is indeed MTI, but with better explanations why in the text (e.g.: the desire to adopt ACP on lower-end devices with an often much slower evolution of firmware, and stricter common ACP secure channel security requirements, aka: going maybe 50% of the way to where DTLS 1.3 is).

"DTLS" in GRASP really means DTLS 1.2 or anything newer/better that can negotiate down to DTLS 1.2. Aka: a DTLS 1.2 + DTLS 1.3 implementation is fine.

There is non-normative text suggesting to also support DTLS 1.3, and an RFC-editor note that this text will change to SHOULD support DTLS 1.3 IF we have a DTLS 1.3 RFC by AUTH48.

Hence we avoid waiting for DTLS 1.3, because the explanations above should explain well enough why there is not enough additional value for this use-case in DTLS 1.3 now to make it a MUST (or to forego the MUST for DTLS 1.2).

In general, i do not agree with your statement "hot off the press is always best"; in particular, i don't think there is a one-size-fits-all, and the TLS recommendations are very centric to "web-software", which has better, more consistent upgrade cycles than we have in other parts of the industry.

Check 6.7.2, and let me know if there is still anything you would like to see improved.

{41:f}
> - *** section 6.8.2 please use the RFC for TLS 1.3 as it now exists

Changed to:

TLS version 1.2 (<xref target="RFC5246"/>) is REQUIRED and TLS 1.3 (<xref target="RFC8446"/> is RECOMMENDED.

Discussed also with Ben: there is no mandate to (only) use TLS 1.3 in solutions like ACP.
Otherwise the argument is similar to the one for DTLS, except that TLS is also used end-to-end, so the lowest-common-denominator problem is stronger (aka: we must have a working MTI across ALL nodes, whereas DTLS would only be on some "low-end" nodes).

{42:f}
> - section 6.8.2 to be honest, the text is easier to read than the picture, so, suggest to move the picture after the text

fixed.

{43:a}
> - section 6.8.2 I did not re-read the ANIMA architecture document but I would assume that this 6.8.2 section is a part of the architecture document. Are they in sync?

It's a reference model, not an architecture, but yes. It does not say which of the specs had to include which text. 6.8.2 effectively resulted from when GRASP was in IESG security review and we concluded that GRASP itself wouldn't have to specify its transport and security layer, but that the solution adopting GRASP would have to do that. This makes GRASP easier to adopt in different solutions (like ACP).

{44:a}
> - section 6.8.2.1 looks more like a security consideration section to me... move it there?

I'd rather not: Eric/Ben did not raise a concern about this. A lot of the doc is about security and explanations thereof; the security section would be very long if we moved all explanations there, and IMHO that wouldn't help readability. And this section is quite long. I tried to have the security section be a high level analysis plus strange/tangential stuff, but otherwise keep security explanations local to where they are needed.

{45:a}
> Also, it is hop-by-hop TLS? Then a hostile ACP node can do a MITM attack

Hop-by-hop is TCP, because it runs 1:1 on top of the hop-by-hop ACP secure channel, so there is no added value from TLS. It's only used for flooding service ("objective") discovery messages. End-to-end/peer2peer GRASP in ACP uses TLS.
You cannot make service discovery more secure at this level, because the worry is not MITM; the problem is that every ACP node is equally trusted to announce a service, and you have no prior knowledge that one node is providing a great instance of the service while another one may look like it's doing the same, but is e.g. re-selling your data. At the ACP/ANI level we can just kick a badly acting node out by certificate revocation or expiry (short-lived certs), pretty much like in any other current "secure solution".

{46:f}
> - *** section 6.10.3 use EITHER 00b or 0x00 but not both for the same field

fixed. I eliminated all b)inary values from the addressing section. I kept them in the RPL section, where Pascal wrote them that way; i guess they may be commonly used there.

{47:a}
> - *** section 6.10.2 and 6.10.3 write about scheme and subscheme but they are not defined (or if they were, then it was pages ago)

I do not understand the concern: 6.10.2 defines the overall (base) scheme, which includes sub-schemes, explanations and a table. Then 6.10.3 ... 6.10.5 define the sub-schemes. Maybe rephrase or propose a specific change ?

{48:f}
> - section 6.10.3 only 15 bits for addressing ACP nodes ? It is only 32.000... not too many for IoT

A registrar can use multiple Registrar-IDs. Networks with distributed Registrars will typically have less than 15k nodes per registrar. Fixed the text here and in the V-Long addressing scheme to:

Registrar-ID (48-bit): A number unique inside the domain that identifies the ACP registrar which assigned the Node-ID to the node. One or more domain-wide unique identifiers of the ACP registrar can be used for this purpose. See <xref target="registrars-unique"/>

{49:a}
> - section 6.10.3.1 route aggregation is always a plus but does RPL support route summarization ?

I am not aware that it does, and i don't think so. This spec does not define all the routing mechanisms needed/desirable for zone based route aggregation, it just carves out the space.
Like we have done in other parts of the IPv6 address architecture.

What was tested in lab testing with a pre-standard implementation was, for example, transitional: ACP RPL metro regions (each with a zone) interconnected via a traditional MPLS/VPN core, where a configured VRF interconnects those metro regions via the zone-prefix routes and ACP-connect. The idea would be to document such a model if/when ACP networks start running into this problem, and then to think of ways to make those setups more autonomic.

{50:a}
> - section 6.10.4 again the explanations on how IID is generated is postponed to a later section, this is frustrating ;-)

The other way around could easily be more frustrating. The registrar section 6.10.7, where this is explained, was written very late in the process, so it could only be appended at the end. But even if it had been written earlier, i think the registrar description would be a lot harder to write without first having defined the addressing: think about a non-ACP network such as an enterprise network. You would first explain the addressing plan of e.g. the enterprise network, and later on describe how you could build a system that generates the addressing.

{51:a}
> - section 6.10 and addressing in general, I wonder whether such a complexity is required to be specified in the normative section... Why not state ULA and that's it ?

Think of the IPv6 address architecture: it also has a standardized addressing plan (what's unicast, scopes, multicast, ULA, etc. pp.) to avoid having to do consistent per-hop configuration of address ranges for different functions.

If we didn't standardize, we would need to provision a lot more addressing config parameters on each node (all those now implied by the addresses: prefix-length, zones, which registrar can assign which suffixes, which addresses are inside the ACP, which ones are on ACP-connect interfaces). Potentially this would be so much that we would raise too many eyebrows with the security folks when trying to put all of that into the certificates.
And if it doesn't fit into certificates, we would need another protocol beside BRSKI, or non-crypto stuff in BRSKI. And configure all distributed registrars. And even signal addressing stuff via more routing protocols to check consistency, set up edge-filtering for internal addresses, etc. pp. Aka: huge simplification. We think we get all the use cases we understand done by ACP instances picking addresses from these options. But if we ever figured out that something cannot be done, we're not stuck the way IPv6 overall is (AFAIK, most address space designed); we could simply flip a bit in the certificate's ACP domain information field in an incompatible way (e.g.: different RFC prefix) and come up with a new addressing plan (or a different approach).

{52:f} > - *** section 6.10.7.2 please do not use MAC address as a source of 46 unique bits for the registrar... Virtual nodes do not always have unique MAC addresses

I don't think it's a good design rule to not exploit a feature that is very beneficial in one domain just because it's not applicable to another domain. Especially when the first domain (physical) is really today the primary domain against which the solution is designed, and the applicability to the second domain (virtual) is today mostly theoretical and has not been verified much. How about demanding that physical routers must not have a power source of their own (cable, batteries) because virtual routers do not need them ? ;-))

Kidding aside, I have thought about how to improve the text to address technical clarity and generalization of the concept. Here is what's in the new revision:

<t>To support such unique address allocation, an ACP registrar MUST have one or more 46-bit identifiers unique across the ACP domain which is called the Registrar-ID.
Allocation of Registrar-ID(s) to an ACP registrar can happen through OAM mechanisms in conjunction with some database / allocation orchestration.</t>

<t>ACP registrars running on physical devices with known globally unique EUI-48 MAC address(es) can use the lower 46 bits of those address(es) as unique Registrar-IDs without requiring any external signaling/configuration. This approach is attractive for distributed, non-centrally administered, lightweight ACP registrar implementations. There is no mechanism to deduce from a MAC address itself whether it is actually uniquely assigned. Implementations need to consult additional offline information before making this assumption. For example by knowing that a particular physical product/MIC-chip is guaranteed to use globally unique assigned EUI-48 MAC address(es).</t>

Hope this solves the discuss.

{53:f} > - section 6.10.7.2 IdevID is used while in the terminology section it was stated that they are not. Or did I read this wrongly?

Changed in terminology:
"IDevID cannot be used for the ACP"
to
"IDevID cannot be used as a node identifier in the ACP"

Changed in 6.10.7.3:
"ACP registrars that can use the the IDevID of a candidate ACP device"
to
"ACP registrars that are aware of the IDevID of a candidate ACP device"

Aka: The ACP domain certificate is locally provided by the domain's registrars, like a membership card for some club/company/whatever, whereas the IDevID could be seen as a primary node identifier, like a passport, that could be used as one of the authenticators when applying for the club/... membership.

{54:f}> - section 6.11.1.1 unsure whether using IPv6 HbH would be an issue as ACP won't probably be HW accelerated, but, I do not mind to err on the safe side

Most of the original text of the RPL section was from Pascal, and was very tersely written, IMHO for RPL experts. I already tried to expand a lot of the introduction for easier digestion by non-RPL experts.
I have added/modified the following text to make the point you are referring to clearer:

6.11.1.1

This RPL profile avoids the use of Data-Plane artefacts (RPL data packet headers, see <xref target="rpl-Data-Plane"/>), because hardware accelerated forwarding planes most likely can not support them today.

6.11.1.13.

<section anchor="rpl-Data-Plane" title= "RPL Data-Plane artifacts">
<t>RPL Packet Information (RPI) defined in <xref target="RFC6550"/>, section 11.2 defines the data packet artefacts required or beneficial in forwarding of those data packets when their routing information is derived from RPL. This profile does not use RPI for better compatibility with accelerated hardware forwarding planes and achieves this for the following reasons.</t>
<t>One RPI option is the RPL Source Routing Header (SRH) <xref target="RFC6554"/> which is not necessary in this profile because it uses storing mode where each hop has the necessary next-hop forwarding information.</t>
<t>The simpler RPL Option header <xref target="RFC6553"/> is also not necessary in this profile, because it uses a single RPL instance and data path validation is also not used.</t>
</section>

That text became so long because I felt that without these explanations it's difficult to get on top of it as a non-RPL expert: RFC6550, which defines RPI, does not even define that abbreviation; it's only used from RFC8138 on. And the fact that RFC6553 and RFC6554 are two different options for RPI is also something you will only figure out after having read a lot more of those RFCs.

{55:a} > - section 6.12.5 I agree that the term 'loopback interface' is becoming really old-fashioned. Time to use another term in this document?

No change: I don't think a 150 page document about a specific solution is a good place to define new terminology to be reused generically. If we can scope a small document to introduce such better terminology, I am all for it.

[long opinions] Another term could cause a lot of political infighting.
It might be worth having that fight, but not in this doc: Logically I think it's what a Node SID is, except that the definition in RFC8402 section 3.2 is fairly weak. And if you wanted to avoid dragging SR into this discussion (but instead OSI), it would be a node instead of a subnet address, but then you're still not sure that the pragmatic folks working in OPS have any interest in investing a lot in new terminology, given how they have used loopback interface addresses forever as node identifiers in IGPs and BGP, and AFAIK, nobody has bothered to bring up the terminology discussion. At best you avoided explaining how you actually achieve a node address in an existing IP stack (by using a loopback interface), so the word "loopback address" didn't show up in the corresponding RFCs. Then again, that only works in the context of solutions where everybody already understands how to implement them. I would not want to expect that for the ACP. [end]

(And remember, ANIMA is an OPS group, not an RTG group, so being more practical should earn brownie points ;-)

{56:f} > - section 6.12.5, perhaps it is only me, but, I had two burning question marks in my head while reading this I-D: what about NBMA and what about DAD... Answered now... The multi-access could be mentioned earlier though as most of the text has an implicit P2P use case

Fixed. The main section "6.7. Security Association (Secure Channel) protocols " didn't have any text (just subsections for IPsec etc), so I added the following text into it:

<t>This section describes how ACP nodes establish secured data connections to automatically discovered or configured peers in the ACP. <xref target="discovery-grasp"/> above described how IPv6 subnet adjacent peers are discovered automatically. <xref target="remote-neighbors"/> describes how non IPv6 subnet adjacent peers can be configured.</t>

<t><xref target="ACP-virtual-interfaces"/> describes how secure channels are mapped to virtual IPv6 subnet interfaces in the ACP.
The simple case is to map every ACP secure channel into a separate ACP point-to-point virtual interface <xref target="ACP-p2p-virtual-interfaces"/>. When a single subnet has multiple ACP peers this results in multiple ACP point-to-point virtual interfaces across that underlying multi-party IPv6 subnet. This can be optimized with ACP multi-access virtual interfaces <xref target="iACP-ma-virtual-interfaces"/> but the benefits of that optimization may not justify the complexity of that option.</t>

And 6.12.5 has now been structured into subsections to enable the new xrefs. Also added a note about non-consideration of multi-party secure associations to 6.12.5.1 (GDOI).

{57:f} > - *** section 7.2 is it really " MLD snooping must be changed to never forward packets" ? Suggest to use "ACP-aware L2 switch MUST never forward packets for ALL_GRASP_NEIGHBORS"

Yes. Unfortunately, I think you may have been the only reviewer who commented on this L2 section, so when I tried to make it more precise, I also stumbled across a bit of other non-ideal text, especially wrt. VLANs (no explicit mention of running GRASP only on untagged ports of VLANs), and in not being clear how the described design is meant to enable ACP on L2/L3 switches without actually changing any of the L2 forwarding plane except for the GRASP message filtering. And of course I revisited this text because it's the reason for the security concerns in your next point. The text changes are a bit longer in this section; please read them in rfcdiff.

{58:f} > - section 7.2 should the discussion about address stealing rather in the security section ?

Yes. Small enough to go there. Fixed (moved verbatim, no changes).

> - section 7.2 suggest the use of normative "SHOULD" and rewriting the
> sentence "Ideally, ACP peering should be built also across ports that are
> blocked in STP"

Fixed... Hope this does not frustrate HW where it's difficult to implement.
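As a rough illustration of the section 7.2 rule from point 57 (not text from the draft), the filtering decision can be sketched as below. The sketch assumes that ALL_GRASP_NEIGHBORS is the link-local multicast address ff02::13 registered for GRASP in RFC 8990; the function name and the "punt"/"forward" action strings are made up for this example.

```python
import ipaddress

# Assumption: ALL_GRASP_NEIGHBORS as registered for GRASP (RFC 8990).
ALL_GRASP_NEIGHBORS = ipaddress.IPv6Address("ff02::13")


def l2_forwarding_action(ipv6_dst: str) -> str:
    """Return what an ACP-aware L2 switch does with a received packet.

    "punt":    hand the packet to the local ACP/GRASP process and never
               forward it to other ports.
    "forward": leave it to the otherwise unchanged L2 forwarding plane.
    """
    if ipaddress.IPv6Address(ipv6_dst) == ALL_GRASP_NEIGHBORS:
        return "punt"
    return "forward"


print(l2_forwarding_action("ff02::13"))
print(l2_forwarding_action("ff02::1"))
```

Everything else in the L2 forwarding plane stays as it is; only packets to this one group are taken out of normal forwarding, which is the point the reworked 7.2 text tries to make.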
{59:a} > - section 8.1.1 should the ACP edge also perform some duplicate address detection ? E.g., if the NMS acp-address is already advertised in the RPL ?

RPL actually does NOT help, because it's optimized to automatically reduce the routing table. So unlike OSPF or ISIS, if you are not the root of the RPL tree you would not see the prefixes routed towards the root, but effectively just the equivalent of a "default" route to it. With our new charter and documents like ACP out of the way, a topology service ASA would be a good thing to define to help here, e.g.: auto-negotiate manual addressing scheme prefixes between all the ACP-connect edge routers. Sheng's auto-addressing draft hanging in the IETF editor queue could be a basis for that.

{60:f} > - *** section 8.1.1 should the ACP edge also block all packets with HbH or routing header?

<t>ACP Edge nodes SHOULD have a configurable option to filter packets with RPI headers (see <xref target="rpl-Data-Plane"/>) across an ACP connect interface. These headers are outside the scope of the RPL profile in this specification but may be used in future extensions of this specification.</t>

Note (again) that the document was beaten up in prior IESG review for its references to "future", and reviewers claimed it was incomplete because of such paragraphs. As a result I removed all paragraphs with "future" in them to pass IESG reviews. So, I would expect you to defend this new instance of "future" ;-) (aka: if we didn't explain the future forward compatibility, we wouldn't have an explanation why the filter would have to be configurable).

{61:f} > - section 8.1.2 sorry but cannot parse " The ACP connect mechanism be only be used to connect physically"

Fixed: The ACP connect mechanism can not only be used to connect physically external systems

{62:f} > - section 8.1.2 possibly because of the above issue, but, I fail to see what this section is all about? The section title is quite vague...
Added new first paragraph to 8.1.2:

<t>The previous section assumed that ACP Edge node and NOC devices are separate physical devices and the ACP connect interface is a physical network connection. This section discusses the implication when these components are instead software components running on a single physical device.</t>

Hope this suffices. Btw: The goal of this section is primarily to make the argument that ACP-connect is a workaround when we talk about physical devices (too difficult to build secure channels into NOC devices), but when everything is co-located as software components on a single physical device and we trust the software orchestration on the physical device, then ACP becomes the gold standard, and cryptographically secure channels between those software components would just be useless waste.

{63:a} > - section 8.1.3 is it worth to define a 3rd RPL profile for ACP edge nodes?

IMHO: No. We already have 5 options within the RPL profile as specified in 6.11.1.12/14. ACP-connect already implies the third highest priority to become RPL root, so it already is distinguished. If this is not satisfactory, pls. propose text or explain what functionality you think is missing.

{64:a} > - *** section 8.1.5 is the ACP edge node really sending RA with PIO for the ACP ULA prefix? Then NMS host will do plain SLAAC and can select an address already in use (DAD will not work across a routed network). Please state that the A bit for the prefix is not set in order to disable SLAAC. There are also route options for RA that could be used.

I think you are talking about 8.1.3. 8.1.3 says ACP edge routers use RFC4191 (RIO, not PIO, hence no bother about the A bit) to announce a 'poison' default route of lifetime 0, because ACP edge routers as defined here do not want to see non-ACP traffic from the ACP-connect interface, plus actual (non-poisoned) RIO routes for the actual ACP prefixes (ACP ULA prefixes).
RIOs AFAIK have no impact on SLAAC/DAD and are meant to indicate prefixes routed across the announcing router and work for multi-homed hosts. NMS nodes would need to be multi-homed (with one ACP connect interface and a separate data-plane interface), except for the "merged" case described in the ACP connect section.

{65:f} > - section 8.2 current title " ACP through Non-ACP L3 Clouds" is confusing about the overloaded 'cloud'. What about "Connecting ACP islands over NON-ACP L2 networks" ?

Fixed to: Connecting ACP islands over Non-ACP L3 networks (Remote ACP neighbors)

(when we started, the other cloud was called "Data-Centers" ;-)

{66:f} > - *** section 8.2.1 please replace "DTLS" by "DTLSv12" in the configuration to allow for DTLS 1.3

See the discussion above for the DTLS logic enhancements. The keywords in 8.2.1 are simply the same as in GRASP. "DTLS" simply means "DTLS support down to version 2" (aka: DTLS v3 welcome, but not mandatory unless we have an RFC for it). Btw: If we later see the need to fully retire DTLS v2, we could do a new profile "DTLS support down to version 3", and we would call that "DTLSv3". Being able to say just DTLS now, rather than DTLSv2, is the benefit of this being the first profile.

{67:a} > - section 8.2.2 did you investigate whether Routing Header (a la MIPv6 or SRv6) could be used as well? Avoiding the double encapsulation

No change: I did investigate when you asked, here is my reply: 8.2.2 is a workaround for platforms that cannot support 8.2.1, which is IMHO always the most header efficient option. I am not aware of any platforms that implement MIPv6 or SRv6 "VTI" (virtual tunnel interfaces); if there were such implementations, then the notion of the first paragraph of this section "or other form of pre-existing tunnel" would apply (aka: such tunnels would be fine).
But: I don't think this would be any more header efficient than the other explicitly mentioned encaps, because AFAIK RFC8200 does not allow adding these headers without a full IPv6 header encap.

{68:a} > - section 9.2.1 may be add NETCONF, RESTCONF ?

No need for the list to be exhaustive. Just long enough to justify the use-case. The list only includes protocols where I am quite confident that I could win the argument that the deployment reality (for lots of reasons) is unencrypted, even if newer versions specify secure transport. AFAIK, secure transports such as TLS or SSL are fairly widely used for NETCONF, RESTCONF. Hence their explicit non-inclusion.

{69:f} > - should section 9.3 be normative ? Like being unable to disable ACP ?

Section 9 is really a summary ("Benefits") that was originally at the end of the document. It does not and is not meant to introduce any new requirements (normative or not) that are not specified in the normative part. At some stage of review I started to move all text that could not be normative beyond all existing big text, which became section 10. This was done to minimize unnecessary rfcdiff delta (avoid renumbering of existing text). So it looks right now as if section 9 "introduces" something new; in reality it's a leftover bug of prior reordering (section 10 introduces it). I will now check in the first version without change to section 9, which has all the technical changes, for easier rfcdiff. I will then check in a second revision that swaps sections 10 and 9, putting the "Benefits" section at the end of the doc and renaming it to "Summary: Benefits".

{70:a} > - section 10.1 also applies to many protocols and their troubleshooting workflow... Unsure whether it belongs in this document ??? or to another one to be created...

I am going to use my joker card for this point: ANIMA is OPS, and therefore operational considerations should not be seen as second class citizens like they often are in RTG/INT.
Of all the operational section 10 points, this is actually the first one because I think it is the most important one. If we had had more of these diagnostics in pre-standard implementations, we could have saved weeks, if not months, of troubleshooting with customers. This was also well received by other reviewers, and I have little hope that I could write about this better in another document: Other sections such as 10.3 can easily translate into later YANG documents. This section really requires more experimentation by implementations first before we could dare to convert it into YANG. So implementations really should start with that experimentation in the first implementation. Or else they'll run into a lot of deployment/interop issues.

{71:a} > - section 10.2.3 should the security discussion be moved to the security considerations section ?

Given my above opinion about what size of security considerations are better localized in their appropriate context versus in the general security considerations, I would prefer to keep 10.2.3 inside of 10.2

{72:a} > - section A.3.1, I do not really buy your discussion about LLDP. Section 7.2 could leverage or co-exist with LLDP

7.2 is about building ACP-enabled L2 switches. This section A.3.1 is about co-existence with non-ACP-enabled L2 switches that support CDP or LLDP. Aka: exactly not the switches considered in 7.2. If the info we now have in GRASP was put as extensions into CDP or LLDP, then non-ACP-enabled L2 switches that do support non-extended CDP/LLDP messages would just ignore those ACP extensions. And it would be a lot of trouble to persuade IEEE to add LLDP options that are propagated across L2 switches. I think they are doing some of this now, though not generically, only for specific use-cases that IEEE is interested in.
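
Illustration for point 52 (Registrar-ID from an EUI-48 MAC address): a minimal sketch only, not text from the draft. It assumes that "lower 46 bits" means the numerically least significant 46 bits of the address taken as a 48-bit big-endian integer; that reading and the function name are assumptions made for this example. As the new text says, nothing in the address itself shows whether it is globally unique, so the caller has to know that from offline information.

```python
def registrar_id_from_eui48(mac: str) -> int:
    """Return the lower 46 bits of an EUI-48 MAC address as an integer.

    Assumption: the address is read as a 48-bit big-endian number and the
    two most significant bits are dropped. Uniqueness is NOT checked here.
    """
    octets = mac.replace("-", ":").split(":")
    if len(octets) != 6:
        raise ValueError("expected an EUI-48 address with six octets")
    # bytes() rejects octet values outside 0..255 with a ValueError.
    value = int.from_bytes(bytes(int(o, 16) for o in octets), "big")
    return value & ((1 << 46) - 1)


# 00:00:5e:00:53:01 is from the MAC range reserved for documentation.
print(hex(registrar_id_from_eui48("00:00:5e:00:53:01")))  # 0x5e005301
```

A registrar with several such MAC addresses could call this once per address to obtain several Registrar-IDs, matching the "one or more" wording in the new text.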