Benjamin Kaduk has entered the following ballot position for draft-ietf-anima-grasp-api-08: No Objection
When responding, please keep the subject line intact and reply to all email addresses included in the To and CC lines. (Feel free to cut this introductory paragraph, however.)

Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html for more information about IESG DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here: https://datatracker.ietf.org/doc/draft-ietf-anima-grasp-api/

----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

I have two comments in particular that I would like to call your attention to: my comment on cache flushing in Section 2.3.4, and my comment on the CBOR data model used for validation in Appendix A.

Section 1

   An ASA runs in an ACP node and therefore inherits all its security properties, i.e., message integrity, message confidentiality and the fact that unauthorized nodes cannot join the ACP. All ASAs within a

I agree with Roman's comment that the "it" whose security properties are inherited is the ACP *node*, not the ACP itself, and thus that some rewording is appropriate.

   The GRASP API library would need to communicate with the GRASP core via an inter-process communication (IPC) mechanism. The details of

Hmm, if the GRASP core is in kernel-space and the API library in userspace, wouldn't we normally refer to that exchange as a system call rather than IPC? (Figure 1 also labels this interaction "IPC".)

Section 2.1

   * Authorization of ASAs is not defined as part of GRASP and is not supported.

Any chance I could interest you in s/not supported/a subject for future work/? It is looking somewhat likely since such a statement is already present in the security considerations...

   * User-supplied explicit locators for an objective are not supported. The GRASP core will supply the locator, using the ACP address of the node concerned.
This would seem to prevent any non-ACP use of GRASP; I suggest adding some language with a caveat about "for example" or similar, unless the intent is to limit the API usage to ACP (or DULL) scenarios.

Section 2.2.1

I think that the possibility for a single outbound message to get a sequence of incoming replies (at different times) further complicates the design of an asynchronous mechanism, and we would do well to discuss how such scenarios (e.g., broadcast discovery messages) would be handled by the implementation and API. (I see that we do end up using a timeout in practice to resolve this topic, but would probably still mention it as an issue that has been resolved, here.)

Section 2.2.2

   ports rather than a separate port per session. Hence the GRASP design includes a session identifier. Thus, when necessary, a 'session_nonce' parameter is used in the API to distinguish simultaneous GRASP sessions from each other, so that any number of sessions may proceed asynchronously in parallel.

I do see that there was previous discussion on the 'nonce' terminology here, and I am unsure why there is need to move away from the "session ID" terminology used in GRASP itself. In particular, the "session_nonce" is not a number used *once*, rather, it is used only for one session (but potentially multiple times within that session). That, to me, makes it a (short-lived) identifier, not a nonce. Roman's proposal of 'handle' would resolve this apparent disparity.

Section 2.2.3

   On the first call in a new GRASP session, the API returns a 'session_nonce' value based on the GRASP session identifier. This

What does "based on" mean? Does there need to be a one-to-one correspondence? Or just in one direction? Are we going to be constrained by the (IMO, too limited) 32 bits of randomness limit of the GRASP Session ID?
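
To put rough numbers on the 32-bit concern, here is a minimal sketch of the standard birthday approximation. It assumes identifiers are drawn uniformly at random from the full space, which is an assumption about the implementation and not something the API draft states; the function name and parameters are hypothetical and exist only for illustration.

```python
import math


def collision_probability(num_sessions: int, id_bits: int) -> float:
    """Approximate chance that at least two of `num_sessions` identifiers,
    each drawn uniformly at random from a space of 2**id_bits values,
    are equal (birthday approximation: 1 - exp(-n(n-1) / 2N)).

    Assumption: uniform, independent draws. A real session ID generator
    may behave differently.
    """
    space = 2.0 ** id_bits
    exponent = num_sessions * (num_sessions - 1) / (2.0 * space)
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    # Compare the identifier widths discussed in this review.
    for bits in (32, 64, 128):
        for sessions in (1_000, 100_000, 10_000_000):
            p = collision_probability(sessions, bits)
            print(f"{bits:3d} bits, {sessions:>10,d} sessions: p ~ {p:.3e}")
```

Under that uniformity assumption, a 32-bit space reaches roughly even odds of a collision at around 77,000 simultaneously tracked identifiers, which is the usual square-root-of-the-space rule of thumb. Whether that matters depends on how many sessions a node ever has to distinguish at once.
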
Section 2.3.2.3

   - Note 3: In a language such as C the preferred implementation may be to represent the Boolean flags as bits in a single byte,

Which aspect(s) of C are relevant for the "such as"?

   An essential requirement for all language mappings and all implementations is that, regardless of what other options exist for a language-specific representation of the value, there is always an option to use a raw CBOR data item as the value. The API will then wrap this with CBOR Tag 24 as an encoded CBOR data item [RFC7049] for transmission via GRASP, and unwrap it after reception.

I'm not sure I understand why the bstr wrapping is mandatory -- I would have thought that the attraction of using a raw encoded CBOR data item would be that it could be used directly, without additional wrapping.

   int loop_count;
   int value_size; // size of value in bytes

Some people might argue for using unsigned types for at least sizes (e.g., size_t), and often for things like loop counts that cannot be negative (though the argument for an unsigned type there is somewhat weaker).

   self.value = 0 # Place holder; any valid Python object

Wouldn't None be a more conventional placeholder in Python?

Section 2.3.2.4

   * The following cover all locator types currently supported by GRASP:
   - is_ipaddress (Boolean) - True if the locator is an IP address
   - is_fqdn (Boolean) - True if the locator is an FQDN
   - is_uri (Boolean) - True if the locator is a URI

Are these mutually exclusive?

Section 2.3.2.6

As for the GRASP session ID, I think that a 32-bit cap is too restrictive. I think we should be in the habit of using 128-bit nonces and needing to justify anything smaller. (64 bits would *probably* be fine here, FWIW, and might make it easier to represent in common language bindings.)

Section 2.3.2.7

   ). Another possible implementation is to hash the name of the ASA with a locally defined secret key.
I recognize that this is a throwaway line, but the naive keyed hash construction is subject to length-extension attacks (for certain hash constructions such as the Merkle-Damgård family that includes SHA-2); HMAC is more robust for this type of usage and can be phrased in a similarly concise manner ("compute an HMAC of the name of the ASA under a locally defined secret key").

Section 2.3.3

   * deregister_asa()
   [...]
   - Note - the ASA name is strictly speaking redundant in this call, but is present for clarity.

So what happens if the wrong name is passed?

   transmit to other ASAs. It is not necessary to register an objective that is only received by GRASP synchronization or
   [...]
   Registration is not needed for "read-only" operations, i.e., the ASA only wants to receive synchronization or flooded data for the objective concerned.

These seem to have high overlap and thus be candidates for deduplication.

   - The 'ttl' parameter is the valid lifetime (time to live) in milliseconds of any discovery response for this objective. The

(nit?) I'd suggest to add "generated", since it would not apply to any hypothetical received discovery response for the objective in question.

   - If the parameter 'overlap' is True, more than one ASA may register this objective in the same GRASP instance.

Do all ASAs registering this objective have to set it to True, or just the first one, in order for the subsequent registrations to succeed?

Section 2.3.4

   - If the parameter 'minimum_TTL' is greater than zero, any locally cached locators for the objective whose remaining time to live in milliseconds is less than or equal to 'minimum_TTL' are deleted first. Thus 'minimum_TTL' = 0 will flush all entries.

Why does one ASA's request flush entries from the cache shared with other ASAs?
I am forced to infer the motivation for including the minimum_TTL parameter in the first place, but it seems like it is useful if the requesting ASA needs to find something that will remain active for a given period of time, but different ASAs may have different needs for the peer's stability, and so flushing the cache in this way could hamper the operation of peer ASAs. If the intent is only to not return those cached locators *for this discovery operation*, then say that, not that they are flushed from the cache entirely.

Section 2.3.5

Thanks for the figure (I probably should have put one into RFC 7546, which is basically this section but for the GSS-API).

I suggest noting in the first paragraph that the negotiation occurs in lockstep, with the initiator starting the negotiation and preparing a message, the responder processing that message and generating a new negotiation message in turn, with at most one negotiation message in flight at any given time. It seems particularly important to note whether this also applies to negotiate_wait() calls/messages, or if those can be made at any time by either entity. (This probably relates to some of the genart reviewer's comments.)

I note that the prospect of the loop count going up (and, thus, risk of infinite looping) was pointed out by the genart review. I share such concerns and am happy to see that improved discussion of this topic (and the related 'lifetime' extension) is planned.

   For this and any other error code, an exponential backoff is recommended before any retry.

Any guidance about whether this should be by doubling vs a different exponent base? I guess the security considerations do say that it's dependent on the semantics of the objective in question, which may be enough (though a pointer or mention here would be appreciated). (Also, any reason to not use the 2119 RECOMMENDED?)
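
To make the "doubling vs a different exponent base" question concrete, here is a minimal sketch of a backoff schedule with a configurable base. Nothing in it comes from the draft: the function name, the parameters, the cap, and the optional jitter are all hypothetical choices for illustration, and a real ASA would pick values to suit the semantics of its objective.

```python
import random


def backoff_delays(initial=1.0, base=2.0, max_delay=60.0, retries=6,
                   jitter=0.0, rng=random):
    """Yield `retries` delays in seconds, growing geometrically by `base`.

    All parameters are illustrative assumptions, not values from the draft:
      initial   - delay before the first retry
      base      - growth factor (2.0 is "doubling")
      max_delay - upper bound applied to every delay
      jitter    - fraction of the delay used as a random +/- spread
                  (0.0 disables randomization)
    """
    delay = initial
    for _ in range(retries):
        spread = delay * jitter
        # Randomize around the nominal delay, then clamp to the cap.
        yield min(max_delay, delay + rng.uniform(-spread, spread))
        # Grow the nominal delay for the next attempt, also capped.
        delay = min(max_delay, delay * base)


if __name__ == "__main__":
    print("base 2.0:", list(backoff_delays(base=2.0, retries=5)))
    print("base 1.5:", list(backoff_delays(base=1.5, retries=5)))
```

A smaller base retries sooner at the cost of more load on the peer; the cap keeps a long outage from pushing the delay out indefinitely. Which trade-off is right seems to be exactly the objective-specific judgment the security considerations allude to.
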
   - This function must be followed by calls to 'negotiate_step' and/or 'negotiate_wait' and/or 'end_negotiate' until the negotiation ends. 'listen_negotiate' may then be called again to await a new negotiation.

We just recommended a few paragraphs previously that listen_negotiate() should be called again *immediately* after the first listen_negotiate() returns; I don't see why it's useful to also say that it might be called again after a given negotiation ends.

   - Executes the next negotation step with the peer. The 'objective' parameter contains the next value being proffered by the ASA in this step. It must also contain the latest 'loop_count' value received from request_negotiate() or negotiate_step().

This is interesting; negotiate_step() must preserve the loop count from the previous call, so only the initial negotiation response (the request_negotiate() 'proffered_objective' output) can increase the loop count, not any arbitrary negotiation step? That seems to limit concerns about infinite looping (as raised by the genart reviewer and apparently acknowledged in the response to the genart review).

   o Threaded implementation: Called in the same thread as the preceding 'request_negotiate' or 'listen_negotiate', with the same value of 'session_nonce'.

IIUC it is *expected* to be called in the same thread as the previous call, but is not strictly speaking *required* to do so, since the session_nonce tracks the library state for the negotiation in question. Or am I mistaken?

   'result' = True for accept (successful negotiation), False for decline (failed negotiation).
   'reason' = optional string describing reason for decline.

What happens if I pass a reason string with result of True?

Section 2.3.6

   - If the 'peer' parameter is null, and the objective is already available in the local cache, the flooded objective is returned immediately in the 'result' parameter. In this case, the 'timeout' is ignored.
   - Otherwise, synchronization with a discovered ASA is performed.
   If successful, the retrieved objective is returned in the 'result' parameter.

From context this 'otherwise' seems to be the "'peer' parameter is null but the objective is not available in the local cache" case (as opposed to also covering the "'peer' parameter is not null" case). It might be possible to clarify this with formatting and/or rewording.

   * synchronize()
   [...]
   - Since this is essentially a read operation, any ASA can do it, unless an authorization model is added to GRASP in future. Therefore the API checks that the ASA is registered, but the objective does not need to be registered by the calling ASA.
   [...]
   - Since this is essentially a read operation, any ASA can use it. Therefore GRASP checks that the calling ASA is registered but the objective doesn't need to be registered by the calling ASA.

These seem redundant and candidates for de-duplication.

   - In the case of failure, an exponential backoff is recommended before retrying.

[same remark as previously]

Section 2.3.7

   'info' = optional diagnostic data. May be raw bytes from the invalid message.

This means it does not have to be well-formed CBOR, and will be wrapped in a bstr by the library? (The GRASP spec suggests that a different CBOR structure would be permitted, though of course the API need not be required to expose such flexibility.)

Section 4

If we're going to keep the 32-bit nonce/handle/etc, it's probably worth a mention of collision/guessing probability.

It might be worth a reference to the RFC 3986 security considerations since we do allow URI locators. This is not really any different than for GRASP itself, but the URI is exposed to the API consumer and so reminding them about it seems worthwhile.

The session_nonce is nominally opaque to (non-ACP, at least) ASAs, but is likely to be implemented in a way that does preserve some state. Is there a risk if an ASA attempts to "peek through the abstraction barrier"? (I am not sure I see one, but you're the expert!)

   GRASP objective concerned.
   These precautions are intended to assist the detection of malicious denial of service attacks.

I suggest to drop the word "malicious"; such denial of service conditions need not be malicious and can occur by accident.

   As a general precaution, all ASAs able to handle multiple negotiation or synchronization requests in parallel may protect themselves against a denial of service attack by limiting the number of requests they can handle simultaneously and silently discarding excess requests.

I think that best practices would also include some limit on the number of objectives registered by a given ASA and possibly the number of ASAs registered, to protect the core library/kernel resources.

(nit?) I suggest dropping 'can'.

Appendix A

There was some discussion with the genart reviewer about the CBORfail error code as being particularly useful. I note that draft-ietf-cbor-7049bis is in AUTH48 and introduces a hierarchy of "levels of validation" (in the form of different data models). CBOR that is valid in the generic data model might not be valid in the extended data model or a data model specific to a given application. I strongly encourage this document to update to referencing 7049bis and giving an indication of what data model is in use for processing both information received from the peer and any CBOR-encoded data received from the ASA.

   'noSecurity' error will be returned to most calls if GRASP is running in an insecure mode (no ACP), except for the specific DULL usage mode

My understanding of the text in the GRASP spec itself was that non-ACP security services were allowed. Is the API intended to be limited to only ACP usage?

   ASAfull     4 "ASA registry full" (register_asa)
   dupASA      5 "Duplicate ASA name" (register_asa)
   noASA       6 "ASA not registered"
   notYourASA  7 "ASA registered but not by you"

Giving this much detail is making things much easier for malicious ASAs ...
but given that the deployment model basically assumes that such things don't exist (even if we do give some small consideration to the possibility in some places), I will not complain about retaining this level of detail in the error messages.

   noDiscReply 17 "No reply to discovery" (req_negotiate)

There is perhaps some explanation to give about the distinction between noReply and noDiscReply, i.e., in the body text. Maybe it is self-explanatory, though, provided that the author of the code notices that noDiscReply exists at all. Likewise for noNegReply, noSynchReply, noValidSynch, and, possibly, noValidStep.
