Re: [Anima] AD review of draft-ietf-anima-grasp-api-06

Brian E Carpenter Tue, 06 Oct 2020 13:47:25 -0700

Thanks Rob. I think things are clear, so we'll work on an update over the next 
few days.


Regards
   Brian

On 07-Oct-20 07:15, Rob Wilton (rwilton) wrote:
> Hi Brian,
> 
> Thanks for the comments, please see replies inline.
> 
> 
>> -----Original Message-----
>> From: Brian E Carpenter <[email protected]>
>> Sent: 30 September 2020 23:28
>> To: Rob Wilton (rwilton) <[email protected]>; draft-ietf-anima-grasp-
>> [email protected]
>> Cc: [email protected]; Sheng Jiang <[email protected]>
>> Subject: Re: AD review of draft-ietf-anima-grasp-api-06
>>
>> Thanks for this thorough review, Rob. Sorry about the previous partial
>> reply. See responses in line.
>>
>> We'll pause for any feedback before updating the draft. In particular, are
>> there any opnions on the example logic flows issue noted below?
>>
>> On 29-Sep-20 04:19, Rob Wilton (rwilton) wrote:
>>> Hi,
>>>
>>> Apologies for the delay for this review.
>>>
>>> Please see comments below.  I've also attached them in case they get
>> truncated (which has happened on other reviews).
>>>
>>> Thank you for this document.  Overall, it is very well written, easy to
>> read and easy to understand.
>>>
>>>
>>> Comments:
>>>
>>>     1.  Introduction
>>>
>>>        As the following figure shows, a GRASP implementation could
>> contain
>>>        two major sub-layers.  The bottom is the GRASP base protocol
>> module,
>>>        which is only responsible for sending and receiving GRASP
>> messages
>>>        and maintaining shared data structures.  The upper layer contains
>>>
>>> I found the text, relative to the diagram, to be somewhat confusing,
>> although the subsequent paragraphs make its intent clearer.
>>>
>>> I think that the top sub-layer is meant to be "GRASP Extended Function
>> Modules", and the "Grasp Modules" is the bottom sub-layer.  However, the
>> diagram seems to have a sub-layer split in "GRASP Modules", and doesn't
>> necessarily equate to "GRASP base protocol module".
>>
>> Perhaps we have confused two aspects: the view that the app writer has of
>> the API, and the way it has to be implemented. More below...
>>
>>>
>>>     1.  Introduction
>>>
>>>        It is desirable that ASAs can be designed as portable user-space
>>>        programs using a system-independent API.  In many
>> implementations,
>>>        the GRASP module will therefore be split into two layers.  The
>> top
>>>        layer is a library that provides the API and communicates
>> directly
>>>        with ASAs.  The lower layer is a daemon, or a set of sub-
>> services,
>>>        providing GRASP core functions that are independent of specific
>> ASAs,
>>>        such as multicast handling and relaying, and common data
>> structures
>>>        such as the discovery cache.  The GRASP API library would need to
>>>        communicate with the GRASP core via an inter-process
>> communication
>>>        (IPC) mechanism.  The details of this are system-dependent.
>>>
>>> I also find this text confusing relative to the diagram.  More
>> naturally, I would assume that the boxes represent different processes and
>> hence you would have IPC between ASA and say the "GRASP API Library".  If
>> I was to draw this, then I would probably have put the GRASP API library
>> co-located with the ASA boxes.
>>
>> Yes. In fact in my demo implementation, when I added an IPC mechanism,
>> that is essentially what I got: stub functions for all the API calls in
>> the app context, then IPC to the core, then the "real" functions and the
>> various daemons/threads in the core context.
>>
>> I can't instantly provide old/new text to respond to this but we will try
>> to clarify all this.
> [RW] 
> 
> Okay.
> 
> 
>>
>>>
>>>
>>>     1.  Introduction
>>>
>>>        Note that a simple autonomic node might contain very few ASAs in
>>>        addition to the autonomic infrastructure components described in
>>>        [I-D.ietf-anima-bootstrapping-keyinfra] and
>>>        [I-D.ietf-anima-autonomic-control-plane].  Such a node might
>> directly
>>>        integrate GRASP in its code and therefore not require this API to
>> be
>>>        installed.  However, the programmer would then need a deeper
>>>        understanding of the GRASP protocol than is needed to use the
>> API.
>>>
>>> Perhaps change "integrate GRASP in its code" to something like
>> "integrate the full GRASP implementation in its code"?
>>
>> Yes (but actually some cases might choose to implement a GRASP subset).
> [RW] 
> 
> Or possibly something like "integrate a GRASP protocol stack in its code"?  
> The aim of my comment was to try and differentiate between just linking 
> against a GRASP API library and directly embedding part of the GRASP protocol 
> stack.
> 
> 
> 
>>
>>>
>>>     2.2.1.  Alternative Asynchronous Mechanisms
>>>
>>>        Thus, some ASAs need to support asynchronous operations, and
>>>        therefore the GRASP core must do so.  Depending on both the
>> operating
>>>        system and the programming language in use, there are three main
>>>        techniques for such parallel operations: multi-threading, an
>> event
>>>        loop structure using polling, and an event loop structure using
>>>        callback functions.
>>>
>>> Rather than "there are three main techniques for such parallel
>> operations", perhaps just "three common techniques are considered for
>> parallel operations".
>>
>> OK
>>
>>> Reasoning, there are other approaches for effectively handling
>> concurrent operations, e.g., actors, futures, cooperative threads backed
>> by a OS thread pool ...
>>
>> [Or of course different names for the same thing. Imagine trying to
>> describe all this in terms of ADA rendezvous.]
>>
>>>
>>> Possibly, change "simple function calls" in the multi-threading
>> paragraph to be "simple synchronous function calls" to emphasize the
>> synchronous nature.
>>
>> OK
>>
>>>
>>>
>>>     2.2.2.  Multiple Negotiation Scenario
>>>
>>>        In the callback model, for the scenario just described, the ASAs
>> "A"
>>>        and "B" will each provide two instances of
>> negotiate_step_received(),
>>>        one for each session.  For this reason, each ASA must be able to
>>>        distinguish the two sessions, and the peer's IP address is not
>>>        sufficient for this.  It is also not safe to rely on transport
>> port
>>>        numbers for this, since future variants of GRASP might use shared
>>>        ports rather than a separate port per session.  This is why the
>> GRASP
>>>        design includes a session identifier.  Thus, when necessary, a
>>>        'session_nonce' parameter is used in the API to distinguish
>>>        simultaneous GRASP sessions from each other, so that any number
>> of
>>>        sessions may proceed asynchronously in parallel.
>>>
>>> It wasn't obvious to me why both ASAs "A" and "B" both provide
>> callbacks, I presume that this is because there are multiple asynchronous
>> steps involved.  If so, would it be helpful to clarify this point?
>>
>> Yes, one could build very complicated interlocking negotiations. (That is
>> a whole can of worms, explored in draft-ciavaglia-anima-coordination, but
>> for present purposes we just need to ensure that ASAs can handle any
>> degree of asynchronosity, if that's a word.)
>>
>>> Also in this paragraph suggest "Hence, " rather than "This is wht"
>>
>> OK
>>
>>>
>>>
>>>     2.2.3.  Overlapping Sessions and Operations
>>>
>>>        In calls where it is used, the 'session_nonce' is an opaque read/
>>>        write parameter.  On the first call, it is set to a null value,
>> and
>>>        the API returns a non-null 'session_nonce' value based on the
>> GRASP
>>>        session identifier.  This value must be used in all subsequent
>> calls
>>>        for the same session, and will be provided as a parameter in the
>>>        callback functions.  By this mechanism, multiple overlapping
>> sessions
>>>        can be distinguished, both in the ASA and in the GRASP core.  The
>>>        value of the 'session_nonce" is opaque to the ASA.
>>>
>>> Query the use of a null value, and whether that is too restrictive.
>>
>> Its value on the first call is of no importance, so we should just call it
>> a placeholder.
> [RW] 
> 
> Sorry, I had intended to expand my original comment.  To me this reads quite 
> like a C API, where the session_nonce is a INOUT parameter, but other 
> languages that say allow a tuple to be returned would probably just return a 
> tuple containing whatever other return parameters along with a session_nonce.
> 
> However, I'm also not quite sure whether this text is entirely consistent 
> with how the session_nonce is actually used in the APIs.  E.g. 
> request_negotiate() options returns a session_nonce.  It doesn't appear to be 
> an "in" parameter at all.
> 
> Perhaps the text should just say something like the session_nonce is provided 
> when the session is established and used in all subsequent calls, and 
> detailed semantics to the specific APIs?
> 
> 
> 
> 
> 
>>
>>>
>>>
>>>     2.3.  API definition
>>>
>>> Possibly this API would benefit from a short section (or even pseudo
>> code) explaining the flow or flows of how the APIs are expected to work
>> together.  This could also be in an appendix.  Perhaps the GRASP
>> specification makes this obvious, in which case references back to the
>> GRASP spec would be sufficient.
>>
>> We have some example logic flows in https://tools.ietf.org/id/draft-
>> carpenter-anima-asa-guidelines-09.html#section-appendix.b
>>
>> Would a reference to that help? The authors would like that draft to be
>> adopted by the WG. We could of course move the examples into the API
>> draft, if people prefer that.
> [RW] 
> 
> Yes, this is what I was thinking of.
> 
> I don’t mind whether it is included in an appendix in this draft or as a 
> reference.
> 
> 
> 
>>
>>>
>>>
>>>     2.3.1.3.  Objective
>>>
>>>        *  name (UTF-8 string) - the objective's name
>>>
>>> Should there be a limit on the length of the name?  This question
>> equivalently applies to the other strings/variable sized types.
>>
>> I think that's implementation dependent. Because GRASP is a CBOR protocol,
>> there is no particular limit on string length. There is a limit on the
>> total size of a GRASP message (although I suspect we might decide to raise
>> or remove that limit one day).
> [RW] 
> 
> That sounds reasonable, and I don't feel too strongly on this issue.
> 
> I think that my general concern with arbitrary length strings in APIs, that 
> are implementation dependent, is that clients may not know what (if any) max 
> length is supported by an implementation, which potentially could make 
> interop harder.  In the YANG world, this can be mitigated via deviations to 
> indicate implementation specific restrictions.
> 
> Perhaps what I am thinking of is defining a minimum supported length.  E.g., 
> the length is implementation specific, but all implementations must support, 
> say, at least 64 characters.  Hence, a client would know that if the name was 
> below this length then it would work with all implementations, but 
> implementations may support a much higher limit if they wish.
> 
> 
>>
>>>        *  value - a specific data structure expressing the value of the
>>>           objective.  The format is language dependent, with the
>> constraint
>>>           that it can be validly represented in CBOR (default integer =
>> 0).
>>>
>>> I didn't understand the "default integer = 0" part.  If this is a byte
>> string then I would thought that the default might be an empty length
>> bytestring, or perhaps a CBOR Null.
>>
>> I'm not even sure now why we need to specify a default in a generic API.
>>
>>>           An essential requirement for all language mappings and all
>>>           implementations is that, regardless of what other options
>> exist
>>>           for a language-specific represenation of the value, there is
>>>           always an option to use a CBOR byte string as the value.  The
>> API
>>>           will then wrap this byte string in CBOR Tag 24 for
>> transmission
>>>           via GRASP, and unwrap it after reception.
>>>
>>> It was unclear whether you just meant an opaque CBOR byte string here
>> (CBOR major type 2), or a CBOR byte string that itself contains CBOR
>> encoded data (i.e. CBOR Tag 24).  Is there any limit on the length of the
>> CBOR byte string that is allowed?
>>
>> It's not intended to be restricted to major type 2; I'll have to check the
>> CBOR RFC to find the correct phrase to use.
> [RW] 
> 
> Perhaps something along the lines of "An encoded CBOR data item, wrapped in 
> CBOR Tag 24, as per section 3.4.5.1 of 7049bis"
> 
> 
> 
>>
>> As above, the only size limit we have is the total GRASP message size.
>>
>>>
>>>
>>>     2.3.2.  Registration
>>>
>>> Some APIs list "Asynchronous Mechanisms:" and others don't.  It might be
>> helpful to clarify that those APIs that don't list asynchronous mechanisms
>> are implicitly synchronous in their behaviour.
>>
>> OK
>>
>>>
>>>
>>>         *  deregister_asa()
>>>
>>>           -  Note - the ASA name is strictly speaking redundant in this
>>>              call, but is present for clarity.
>>>
>>> Presumably it also makes it significantly harder for a rogue client to
>> try and deregister all ASAs by looping through all asa_nonces.
>>
>> True, but there is so much entropy in the nonce that this would be a
>> challenge anyway.
> [RW] 
> 
> Okay.  When I wrote this, I also realized that the other APIs are just 
> relying on the nonce anyway.
> 
> 
>>
>>>
>>>
>>>     2.3.4.  Negotiation
>>>
>>>              2.  The 'session_nonce' parameter is not null.  In this
>> case
>>>                  negotiation must continue.  The returned
>>>                  'proffered_objective' contains the first value
>> proffered by
>>>                  the negotiation peer.  Note that this instance of the
>>>                  objective must be used in the subsequent negotiation
>> call
>>>                  because it also contains the current loop count.  The
>>>                  'session_nonce' must be presented in all subsequent
>>>                  negotiation steps.
>>>
>>> Presumably the loop count in the proffered_objective must be one greater
>> than the loop count in the objective structure?  Would it be helpful to
>> mention this?
>>
>> Right... I'm going to look in my GRASP code before I try to rephrase that.
>> I recall that it was a bit tricky to get it right.
> [RW] 
> 
> Okay.
> 
>>
>>>
>>> I also wasn't sure whether it must be "this instance of the objective
>> must be used", or just that the loop counter must be copied across.  E.g.,
>> functional languages would encourage these structures to be immutable.
>>
>> And because of Python's vagueness about whether it's call-by-reference or
>> call-by-value, there are cases where I've had to explicitly clone a copy
>> of an objective to be sure that I have the right to decrement the loop
>> counter. So yes, it's "this instance or a clone of this instance". Will
>> rephrase.
> [RW] 
> 
> Okay.
> 
>>
>>>
>>>
>>>          -  Asynchronous Mechanisms:
>>>
>>>              o  Event loop implementation: The 'session_nonce' parameter
>> is
>>>                 used in read/write mode.
>>>
>>> It is unclear to me, exactly what you mean by this, particularly because
>> this is a return parameter.  Perhaps this could be further explained in
>> the text, or refer back the previous text.
>>
>> Yes. I think we were trying to express something that will look a bit
>> different in C compared to high-level languages.
> [RW] 
> 
> Possibly even referring it an in/out parameter would be clearer.
> 
>>
>>>
>>>
>>>     2.3.4.  Negotiation
>>>
>>>         *  listen_negotiate()
>>>
>>> I was surprised that the listen_negotate didn't include any timeout,
>> e.g. to allow it to be used in polling fashion.  Same comment applies to
>> listen_synchronize().
>>
>> Well, in a polling solution won't the listen state just go into the event
>> loop? But I'm a bit out of my depth there.
> [RW] 
> 
> Okay, possibly I was thrown by the "Unless there is an unexpected failure,
> this call only returns after an incoming negotiation request."  I.e., it 
> seems to suggest that it always blocks.  But perhaps the description needs to 
> be refined depending on whether it is a synchronous or asynchronous API.
> 
> In a pure event loop scenario, I would expect a registration API to register 
> an event (e.g. an objective) that I am interested in, and optionally a 
> callback function to be invoked when that event is received.  Normally, the 
> thread would then have a loop waiting (blocking) for an event to be received 
> that might either be returned back in the loop (or plausibly dispatched 
> directly).  In the code that I am familiar with there are separate calls to 
> wait (block) for a new event, and then a separate function is invoked to 
> dispatch the event to the registered call back handler.  In this scenario is 
> it sometimes useful for the event block() call to have a timeout, allowing 
> the thread to do other periodic housekeeping work and then make a new 
> event_block() call.
> 
> I'm not particular familiar with the Linux epoll, but this gives an example 
> of a timeout whilst waiting for an event: 
> https://man7.org/linux/man-pages/man2/epoll_wait.2.html
> 
> So, I was interpreting this API as effectively having a GRASP specific event 
> loop, and that listen_negotiate() was acting like epoll_wait() call.  But I 
> might be misunderstanding the intent here.
> 
> 
> 
> 
>>
>>>
>>>
>>>     2.3.4.  Negotiation
>>>
>>>         *  negotiate_wait()
>>>
>>> It wasn't clear to me why you would want to do this, although that may
>> be apparent in the GRASP draft.  Perhaps a cross reference might be
>> helpful?
>>
>> Yes, the idea is simply that one party in the negotiation wants to tell
>> the other party to be patient for a while. It is described in the GRASP
>> spec so a reference will help.
> [RW] 
> 
> Okay.
> 
> 
>>
>>>
>>>
>>>     2.3.4.  Negotiation
>>>
>>>         *  end_negotiate()
>>>
>>> Possibly "successful" might be a better parameter name than "reply".  I
>> think that you reply both if the negotiation is successful and also if it
>> failed.
>>
>> Yes, I don't know where that came from. In the Python implementation it's
>> called 'result' (True for accept, False for decline).
>>
>>>
>>>
>>>     2.3.5.  Synchronization and Flooding
>>>
>>>        *  synchronize()
>>>
>>>           -  If the objective was already flooded, the flooded value is
>>>              returned immediately in the 'result' parameter.  In this
>> case,
>>>              the 'source' and 'timeout' are ignored.
>>>
>>> Should source be quoted here?  It is unclear what it is referring to.
>>
>> Good catch. That should be 'peer'.
>>
>>>
>>>
>>>           -  Otherwise, synchronization with a discovered ASA is
>> performed.
>>>              The 'peer' parameter is an 'ASA_locator' as returned by
>>>              discover().  If 'peer' is null, GRASP discovery is
>> performed
>>>              first.
>>>
>>> How does the GRASP discovery work in this scenario?  Does this require
>> use of the discovery APIs?  This may require further explanation.
>>
>> OK, will fix. The intention is that discovery is launched automatically.
> [RW] 
> 
> Okay.
> 
> 
>>
>>>
>>>
>>>     3.  Implementation Status [RFC Editor: please remove]
>>>
>>>        A prototype open source Python implementation of GRASP, including
>> an
>>>        API similar to this document, has been used to verify the
>> concepts
>>>        for the threaded model.  It may be found at
>>>        https://github.com/becarpenter/graspy with associated
>> documentation
>>>        and demonstration ASAs.
>>>
>>> This section will be removed anyway, but I thought that the GIL (global
>> interpreter lock) effectively prevented Python threads from running
>> concurrently.
>>
>> It's not that simple, because you can get true concurrency during I/O and
>> clock waits. In any case, the Python queue and lock mechanism is valid
>> regardless of this.
>> https://wiki.python.org/moin/GlobalInterpreterLock is informative.
> [RW] 
> 
> Ah, okay.  My knowledge of Python is fairly superficial.
> 
>>
>>>
>>>
>>>
>>> Nits:
>>>
>>> I've noted that you have chosen not to use RFC2119 language.  No issues
>> with this, but this may come up in IESG review ...
>>
>> Well, it's Informational.
> [RW] 
> Ah, yes.
> 
>>
>>>
>>>
>>> Use of the term "nonce".  Note in UK English this word sometimes takes
>> an alternative meaning, depending on who you ask and which dictionary you
>> look at: https://dictionary.cambridge.org/dictionary/english/nonce
>>>
>>> You may wish to consider an alternative term, but please note that this
>> is up to the authors.  I'm not objecting to this term, just making you
>> aware, if you weren't already.
>>
>> I'm not aware of a viable alternative.
> [RW] 
> 
> That's fine.
> 
> 
>>
>>>     Abstract:
>>> Maybe languages => programming languages.
>>
>> OK
>>
>>>
>>>
>>>
>>>     1.  Introduction
>>>         As the following figure shows,
>>>
>>> Given that the figure is a few paragraphs away (and on the next page in
>> the txt version), perhaps give the figure a name and reference it by name.
>>
>> Yes.
>>>
>>>
>>>
>>>     2.3.1.3.  Objective.
>>> For me, I would outdent the "} objective" in the C struct example.
>>
>> Yes.
>>
>>>
>>> represenation -> representation
>>
>> Yes.
>>
>> Thanks much,
>>
>>     Brian
> [RW] 
> 
> Thanks,
> Rob
> 
> 
>>>
>>> Thanks,
>>> Rob
>>>

_______________________________________________
Anima mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/anima

Re: [Anima] AD review of draft-ietf-anima-grasp-api-06

Reply via email to