[Anima] Use of GRASP M_FLOOD vs. M_NEG_SYN for BRSKI registrar discovery

Toerless Eckert Tue, 06 Jun 2017 12:26:21 -0700

In today's BRSKI draft review, i was proposing some version of Brian's
draft-carpenter-anima-ani-objectives for the discovery of the registrar.
It then turned out that not all BRSKI authors were persuaded that the
M_FLOOD approach is the right mechanism to use, and they also found
that the following text from the GRASP draft was insufficient explanation
of when/why to choose M_FLOOD:


   (about M_FLOOD): One application of this is to act as an announcement,
   avoiding the need for discovery of a widely applicable objective.

I therefore wanted to open a thread here to come to conclusions on this issue.
I guess that i have been the prime proponent of M_FLOOD, also for this
(type) of objectives, so please consider my opinions expressed here
as biased in that direction.

The latest draft-ietf-anima-bootstrapping-keyinfra-06 (before any mods i
suggested) describes the proposed mechanism in 3.1.2:

  - proxy to send M_NEG_SYN
  - registrar to answer with M_RESPONSE

(see draft for details of the text).

I am not quite clear what actual legal GRASP sequence of packets is implied by
this current text, because M_NEG_SYN does not seem to be a valid GRASP message
(not even in older versions; Google can only find it in the BRSKI draft). So in
the following i have to guess a bit about the possible valid alternatives, and
then i'll try to judge their efficiency:


A) - Proxies send GRASP (multicast):
        M_DISCOVERY, ... objective("AN_registrar", ... F_SYNC)
   - Registrar unicasts (via TCP connection):
        M_RESPONSE, ... locator-option, objective("AN_registrar", ...)
     objective-value needs to contain the method to indicate the
     BRSKI protocol spoken across the locator-option (BRSKI/TLS, CoAP variants,
     etc.).

   Q: I hope my interpretation is correct, i.e. that the response to an
   M_DISCOVERY with F_SYNC set in the objective is still an M_RESPONSE, and
   not an M_SYNCH.
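
   To make the exchange in A) concrete, here is a minimal Python sketch of the
   two messages as nested lists. It is only an illustration of the notation
   used above, not the GRASP wire format: the field order, the symbolic
   strings, the helper names make_discovery/make_response and the example
   locator and method values are all assumptions.

```python
# Symbolic names follow the notation used in this mail; they are not
# the numeric codes of the GRASP specification.
F_SYNC = "F_SYNC"

def make_discovery(session_id, initiator):
    """Proxy side: multicast M_DISCOVERY for the AN_registrar objective."""
    objective = ["AN_registrar", [F_SYNC]]
    return ["M_DISCOVERY", session_id, initiator, objective]

def make_response(discovery, ttl, locator, method):
    """Registrar side: unicast M_RESPONSE carrying locator and method."""
    _, session_id, initiator, objective = discovery
    # Assumption: the objective-value names the protocol spoken at the locator.
    answered = [objective[0], objective[1], method]
    return ["M_RESPONSE", session_id, initiator, ttl,
            ["locator-option", locator], answered]

disc = make_discovery(session_id=1234, initiator="fe80::2")
resp = make_response(disc, ttl=90, locator="2001:db8::1", method="BRSKI/TLS")
print(resp[0], resp[4][0], resp[5][2])
```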

   Assume a larger network with all routers/switches being ANI devices, i.e.
   running a proxy ASA. The dynamic behavior of this approach is quite
   interesting:

       R3 - R2 - R1 - Registrar
   
   A.1) Maybe first time around, R2 sends M_DISCOVERY, R1 forwards it because
   R1 does not have a GRASP objective cache entry for AN_registrar. The
   Registrar sends a unicast TCP M_RESPONSE to R2 as described above.

   A.2) Now R3 sends M_DISCOVERY. When it hits R2, R2 should have entered the
   response from the previous step (A.1) into its cache, if i read the GRASP
   spec correctly:
   
    > After a GRASP device successfully discovers a locator for a Discovery
    > responder supporting a specific objective, it SHOULD cache this
    > information, including the interface index [RFC3493] via which it was
    > discovered.  This cache record MAY be used for future negotiation or
    > synchronization, and the locator SHOULD be passed on when appropriate
    > as a Divert option to another Discovery Initiator.
   
   Q: Correct ?

   So then R2 would reply with:
     M_RESPONSE, ...divert-option(locator), objective("AN_registrar", ...)

   Pretty much the same as what the Registrar responded with in A.1), except
   that R3 clearly sees this is a cached reply from a third party, because it
   uses the divert-option to carry the locator, and the TTL would be lower
   (because the cache entry has started to expire).
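
   A minimal Python sketch of the cached reply in A.2), assuming a simple
   per-node cache keyed by objective name. The class ObjectiveCache, its
   methods and the dictionary layout of the reply are hypothetical; the point
   is only that the cached answer carries a divert option and the remaining,
   not the original, TTL.

```python
from dataclasses import dataclass

@dataclass
class CacheEntry:
    locator: str       # locator of the objective responder (example value)
    expires_at: int    # absolute expiry time in seconds

class ObjectiveCache:
    """Hypothetical discovery cache of one GRASP node."""

    def __init__(self):
        self._entries = {}

    def store(self, objective, locator, ttl, now):
        self._entries[objective] = CacheEntry(locator, now + ttl)

    def answer(self, objective, now):
        """Return a cached reply with the remaining TTL, or None if expired."""
        entry = self._entries.get(objective)
        if entry is None or entry.expires_at <= now:
            self._entries.pop(objective, None)
            return None
        return {"msg": "M_RESPONSE", "option": "divert-option",
                "locator": entry.locator, "ttl": entry.expires_at - now}

# R2 cached the registrar's answer at t=0 with TTL 90; R3 asks at t=40.
r2_cache = ObjectiveCache()
r2_cache.store("AN_registrar", "2001:db8::1", ttl=90, now=0)
reply = r2_cache.answer("AN_registrar", now=40)
print(reply["option"], reply["ttl"])   # divert-option 50
```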
   
   Note: If we make GRASP operate as described here, then it will be hard to
   have the actual GRASP unicast TCP sockets in separate ASA processes as
   opposed to a GRASP process: If only the ASA sees and processes the
   M_RESPONSE in A.1), then the GRASP process on R2 in step A.2) would not
   have the cached result. Or else the proxy ASA on R2 receives the M_RESPONSE
   in A.1), and then signals that response back to the GRASP process locally
   so that the GRASP process on R2 has the cache information ... in which case
   the GRASP process would need to trust all ASAs to deliver cache information.

   Q: How do existing GRASP implementations deal with this ?
   

   So:
   Either: If the caching as described above would not work as i guess it
   should, then we would have all proxy devices periodically create an
   M_DISCOVERY that gets flooded all the way to a registrar, and replies are
   unicast TCP and yada yada - that does not scale.

   Or: If the caching is meant to work as described above, we have one
   fundamental limitation: in this scheme, R3 will only be able to learn one
   "random" registrar locator, but not all available registrars. Aka: Imagine
   we have two registrars in the network. Because the cache logic would be to
   stop flooding the M_DISCOVERY when there is a cache entry, if R2 has only
   one registrar cached, and if R2 stops flooding the M_DISCOVERY as soon as
   it has one cached entry, then it's quite random which objective
   responder/locator we would see.
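
   The limitation can be reproduced with a small Python simulation. It is a
   sketch under stated assumptions: the topology (R3 - R2 - R1 - RegA plus a
   second branch R2 - R4 - R5 - RegB) is invented for illustration, a node
   caches only the first answer it receives, and a node with a cache entry
   answers and does not relay the discovery further.

```python
from collections import deque

# Hypothetical topology with two registrars behind R2.
LINKS = [("R3", "R2"), ("R2", "R1"), ("R1", "RegA"),
         ("R2", "R4"), ("R4", "R5"), ("R5", "RegB")]
REGISTRARS = {"RegA", "RegB"}

neighbors = {}
for a, b in LINKS:
    neighbors.setdefault(a, []).append(b)
    neighbors.setdefault(b, []).append(a)

cache = {}  # node -> the single registrar it has cached

def discover(initiator):
    """Relay a discovery hop by hop; registrars and caches answer, others relay."""
    seen, queue, answers = {initiator}, deque([initiator]), []
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt in seen:
                continue
            seen.add(nxt)
            if nxt in REGISTRARS:
                answers.append(nxt)
            elif nxt in cache:
                answers.append(cache[nxt])   # cached answer, no further relay
            else:
                queue.append(nxt)
    if answers:
        cache[initiator] = answers[0]        # assumption: keep first answer only
    return answers

first = discover("R2")    # empty caches: both registrars answer
second = discover("R3")   # R2 answers from its cache and stops the flood
print(first)    # ['RegA', 'RegB']
print(second)   # ['RegA']
```

   Under these assumptions R3 never hears about RegB, and which registrar it
   does learn depends only on what R2 happened to cache.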

   Is this discussed anywhere in the GRASP draft? I couldn't find it. I can
   see how it might be sufficient to have one objective responder known
   for many objectives, but the main issue is that with this simple caching
   scheme, there is no way to predict or control which objective responder
   would be cached where.

   Beyond this caching problem, the other issue is that this approach also
   creates unnecessary periodic TCP connections with varying TTL timeouts:

   - When i am on an ANI device with a proxy, i assume that i have an ongoing
     interest in the registrar objective. Which means that whenever some
     M_RESPONSE has a TTL expiry, i would trigger another M_DISCOVERY. And i
     will get a reply from some cache which will not have the maximum TTL
     lifetime, but whatever that cache had left over. And i will send out the
     request to all interfaces and likely get responses from all neighbors:

              R2     R3 -- Registrar
                \  /
                 R1
                /  \
              R4     R5 -- Registrar

     R1's cache expires. It sends the M_DISCOVERY to R2, R3, R4, R5. I guess it
     will get TCP connections with M_RESPONSE from all four as well? E.g.: I
     couldn't find any rule in the GRASP spec saying "If you're on a router like
     R2, and you get an M_DISCOVERY from a neighbor R1 and your incoming
     interface for any cached objective responders is via R1, then do not
     reply" (there may be such a rule, but i can not find it right now).


   - R10 - R9 - R8 - R7 - R6 - R5 - R4 - R3 - R2 - R1 - Registrar

     Consider a more interesting topology, like what you see in mobile RAN and
     other SP aggregation networks. When we are talking about caching
     information that's updated from information in other caches, i am always
     quite anxious to understand how exactly that will work, because i have
     seen this go wrong quite badly in other similar protocols.

     Aka: can you predict the dynamic behavior of the caches in topologies like
     this when every device independently tries to update its cache
     information? There can be problems like synchronization, where all devices
     expire their cache at the same time, then start initiating a multicast
     M_DISCOVERY to their neighbors. Those neighbors might have done the same,
     but how exactly would they limit if/how to forward M_DISCOVERY messages?
     AFAIK, the draft does not say (it only says so for M_FLOOD).

     Or how do you refresh? E.g.: Let's say you send a new M_DISCOVERY
     sufficiently long BEFORE your cache expires, say when your TTL goes down
     to 10 seconds. But then the first neighbor that it hits also only has a
     remaining cache TTL of 5 seconds. Nothing gained. You've got to go through
     phases of the cache actually having expired before you can receive a
     result with a larger TTL.

     Aka: If we wanted to make a multi-hop request/reply caching system work,
     we would need IMHO more timing parameters worked out, such as a minimum
     remaining TTL at which you would need to trigger a cache refresh
     M_DISCOVERY, and below which you would NOT consider sending an M_RESPONSE
     from your own cache, but instead forward the request.
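
     The refresh problem and the kind of timing rule suggested above can be
     sketched in a few lines of Python. Both functions and their parameters
     (refresh_threshold, min_reply_ttl) are hypothetical names for the
     parameters the text says would have to be worked out; nothing like them
     is defined in the GRASP draft as far as i can tell.

```python
def refreshed_ttl(own_remaining, neighbor_remaining, refresh_threshold):
    """TTL a node holds after an early refresh answered from a neighbor cache."""
    if own_remaining > refresh_threshold:
        return own_remaining            # too early, no refresh attempted
    # The neighbor can only hand out the lifetime left in its own cache.
    return max(own_remaining, neighbor_remaining)

def on_discovery(cached_remaining, min_reply_ttl):
    """Decide whether to answer a discovery from cache or relay it onward."""
    if cached_remaining is not None and cached_remaining >= min_reply_ttl:
        return "reply-from-cache"
    return "forward"

# The example from the text: refresh at 10 seconds left, neighbor has 5.
print(refreshed_ttl(10, 5, refresh_threshold=10))   # 10 (nothing gained)
# With a minimum reply TTL, a nearly expired cache relays instead.
print(on_discovery(5, min_reply_ttl=30))            # forward
print(on_discovery(60, min_reply_ttl=30))           # reply-from-cache
```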

B) The simple M_FLOOD that i do understand, where i can easily calculate how
   much traffic there is in the network, where we do not have unnecessary
   periodic TCP connections, fluctuating cache lifetimes across multiple hops,
   etc.:

   Objective responder sends periodic, unsolicited M_FLOOD. Typical approach:
   time-to-live = 90 seconds, periodicity of unsolicited M_FLOOD = 30 seconds.
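
   With these two numbers the behavior is easy to calculate. A minimal Python
   sketch, assuming floods are sent exactly on the period and ignoring
   propagation delay and jitter; the function names are made up for this
   example.

```python
FLOOD_PERIOD = 30  # seconds between unsolicited M_FLOOD messages
FLOOD_TTL = 90     # time-to-live carried in each M_FLOOD

def entry_valid(received_times, now, ttl=FLOOD_TTL):
    """True if at least one received flood is still within its TTL at `now`."""
    return any(t <= now < t + ttl for t in received_times)

def tolerated_losses(period=FLOOD_PERIOD, ttl=FLOOD_TTL):
    """Consecutive floods that can be lost before the entry expires."""
    return max(ttl // period - 1, 0)

def floods_per_hour(registrars, period=FLOOD_PERIOD):
    """Number of flood originations per hour for a given number of registrars."""
    return registrars * (3600 // period)

print(tolerated_losses())    # 2
print(floods_per_hour(2))    # 240
```

   So a receiver keeps a valid entry even if two floods in a row are lost,
   and the origination rate depends only on the number of registrars, not on
   the number of proxies.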

   Apply to objectives where you know that you have a dense population of
   interested objective initiators, such as in the BRSKI proxy case.

   Done!

C) Now, one could try to build a hybrid between A) and B), by which the replies
   are not unicast M_RESPONSE but instead M_FLOOD, but i am not sure if that
   would even be legal in GRASP, nor what else it might buy...

Cheers
    Toerless

