> Noel, here are my comments.
Here is the third (and final) tranche of replies to your comments.
As before, I did not _always_ agree yadda-yadda-yadda.
Noel
----
>> The probing mechanism is rather heavy-weight and expensive
> You said it was simple but then you say heavy-weight.
Those are not incompatible? (Levelling an entire city with an H-bomb to get
rid of some snipers is a simple, but heavy-weight method of dealing with the
problem... :-)
I looked at this for a while, trying to see how to say that differently, but I
can't come up with anything better than what's there already? (I looked in a
thesaurus, but all the words I saw that seemed plausible were no better.)
> What you want to say is if the map-cache is large, a lot of messages
> will be required to be sent. To make it scale, you have to spread the
> sending of RLOC-probes over time. Which then affects the down detection
> convergence.
That is probably worth pointing out, done.
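To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. It is not from the spec: the function name, the cache sizes and the pacing rate are all made-up assumptions. If probes are paced at a fixed rate, each RLOC gets probed once per cycle, so the worst-case time to notice a dead RLOC grows linearly with the size of the map-cache.

```python
def probe_cycle_seconds(cache_entries: int,
                        rlocs_per_entry: int,
                        probes_per_second: float) -> float:
    """Time to send one RLOC-probe to every RLOC in the map-cache.

    Hypothetical model: probes are spread evenly at a fixed rate, so this
    is also (roughly) the worst-case delay before a down RLOC is probed.
    """
    if probes_per_second <= 0:
        raise ValueError("probes_per_second must be positive")
    total_probes = cache_entries * rlocs_per_entry
    return total_probes / probes_per_second


# Illustrative numbers only (assumptions, not measurements).
small_cache = probe_cycle_seconds(100, 2, 50.0)
large_cache = probe_cycle_seconds(100_000, 2, 50.0)
print(f"small cache: {small_cache:.0f} s per probe cycle")
print(f"large cache: {large_cache:.0f} s per probe cycle")
```

Raising the pacing rate shortens the cycle, but only by sending proportionally more control traffic, which is the scaling problem in the first place.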
>> However, it has the advantages of providing information quickly (a
>> single RTT), and being a simple, direct robust way of doing so.
> And note that Echo Noncing in the data-plane can be faster assuming
> bidirectional traffic.
I think what you meant there was not that an Echo Nonce is faster (since both
require a minimum of an RTT), but that _unless what are likely unreasonable
amounts of RLOC Probing control traffic are used_, Echo Nonce will, on average,
provide faster notice of loss of reachability? That is also worth pointing out,
done.
>> Mappings might also be discarded before the TTL expires, depending on
>> what strategies the ITR is using to maintain its cache
> And the RLOC-set can change before the map-cache entry times out.
You mean, the {EID->RLOC} mapping might change? (You're not, for instance,
talking about the LSB bits?) True, but I don't understand the relevance at
this point in the document. There is an earlier section, "Mapping Versioning",
which talks about that circumstance, and the mechanism to detect it, in
detail?
>> Very briefly, the ITR provided a One-Time Key with its query; this key
>> is used by both the MS (to verify the EID block that it has delegated
>> to the ETR), and indirectly by the ETR (to verify the mapping that it
>> is returning to the ITR).
> The ETR uses the MS's OTK to sign the block. The ETR does not know the
> ITR's OTK and doesn't verify anything. The ITR verifies the signed
> Map-Reply by generating the MS OTK from its OTK (like the MS does) and
> uses the MS OTK to verify the signature, since that was the same key the
> ETR used.
First, yes, 'verify' is the wrong word; I have changed them to 'sign' (and
tweaked the wording for the MS slightly).
As to the rest, I'm not trying to give the full details of the mechanism in
that text (note the "Very briefly"); I'm just trying to summarize _what_ it
does, not _how_. And it does say that the ETR uses the ITR's OTK "indirectly".
>> [[X3: Spec is unclear about who reassembles; says "fragments are
>> reassembled at the destination host" but also "reassembly can
>> happen at the ETR" - I would have thought the latter was very
>> undesirable, for all the obvious reasons, unless the dest cannot
>> reassemble?]]
> Right because if a core router fragments, it is fragmenting a packet
> destined to the ETR.
Ah, right: I forgot there might be two kinds of fragmentation (at the ITR,
and by routers on the path between the ITR and ETR).
Does the latter actually happen, or are all LISP-encapsulated packets sent
with DF on? In other words, are ETRs actually supposed to be prepared to
reassemble fragmented packets which they receive?
>> To recap, the mapping system is split into an indexing sub-system,
>> which keeps track of where all the mappings are kept, and the mappings
>> themselves, the authoritative copies of which are always held by ETRs.
>> [[M0: Should we mention, somewhere, the cases where they aren't -
>> i.e. proxy map-replying?]]
> I think you could reference proxy map-replying in the DDT example.
It seems to me that that point fits more logically in the section "Interface
to the Mapping System", which talks about Map-Requests and Map-Replies. I have
added mention of proxy map-replying there.
> That way you can say that the MS either forwards the Map-Request to the
> MS or proxy reply itself.
The DDT detailed description (in "The DDT Indexing Sub-system") already
mentions this.
>> "Solicit-Map-Request" (SMR) messages are actually not another message
>> type, but a sub-type of Map-Reply messages.
> SMRs are Map-Requests because you want to use the long nonce as well. So
> an SMR is a Map-Request that is responded to by another Map-Request with
> the SMR-invoked bit set. And the nonce from the SMR is returned in the
> SMR-invoked Map-Request.
Sorry, my brain isn't firing on all cylinders today. Why is it important that
the second Map-Req (the one with the SMR-invoked bit) use the same nonce?
Is it because SMR is sent when the ETR has a new mapping, and it needs the ITR
to reload the mapping? But if so, I'm not following how the nonce adds
anything. All the ETR needs to know is that i) the ITR tried to update the
mapping (and the second Map-Request tells it that, without either the nonce or
the SMR-invoked), and ii) that the ITR successfully received the Map-Reply - and
I don't see how any of this does that?
> Also note an RLOC can be a multi-tuple encoding meaning it can return,
> for example, a Geo-Tag, or ELP or a RLE (replication list entry for
> multicast).
I don't want to say too much about n-tuple RLOCs, because those are all 'work
in progress', and this document is (mostly) supposed to describe LISP as it
is. (I have made an exception about DDT because ALT is already obsolescent.)
I did consider alluding to n-tuple RLOCs as 'improvements in progress', in a
different place, higher up the document (which would be the appropriate
location for such an observation), but I decided not to, because it just opens
up too big a can of worms. Right at the moment that section's really simple
and straight-forward: EIDs identify the hosts, RLOCs where they are,
yadda-yadda. If I have to put in a * and say 'Well, except when RLOCs
are lists', it's just too hairy.
Look, don't get me wrong, I think n-tuple RLOCs are cool and powerful, but...
>> Map-Notify messages have the exact same contents as Map-Register
>> messages; they are purely acknowledgements.
> They are not just acknowledgements. They are used to tell the old
> RLOC-set that new RLOCs have been registered.
In an extension that is not documented in anything which is available to the
IETF... :-( :-)
But I will allude to their extension for other uses.
>> The interaction between MRs and DDT servers is not complex; the MR
>> sends the DDT server a Map-Request control message (which looks
>> almost exactly like the Map-Request which an ITR sends to an MR).
> Don't need to say the text in the parens.
I don't particularly see the harm in it, but OK.
> And again use "DDT node" and not "DDT server".
See previous comment (in tranche #2) about DDT "node" and "server".
>> If the latter, the MR then interacts with that MS, and usually the
>> block's ETR(s) as well, to cause a mapping to be sent to the ITR which
>> queried the MR for it.
> This is the place to put the public key description in and how the MR
> will verify a Map-Referral.
There is an entire section "Security of the DDT Indexing Sub-system", which
covers that (although maybe it should have a bit more detail than it does).
I will put in a forward reference to it.
>> 10.2.1. Map-Referral Messages
>> Map-Referral messages look almost identical to Map-Reply messages
>> (which is felt to be an advantage by some people, although having a
>> more generic record-based format would probably be better in the
>> long run, as ample experience with DNS has shown), except that the
>> RLOCs potentially name either i) other DDT nodes (children in the
>> delegation tree), or ii) terminal MSs.
> There is no need to have this section I think.
I think it's good to have all the message types show up in the TOC, for people
who want a quick reference. And pointing out how the data returned may differ
is, I think, also useful.
> And don't criticize the architecture you are describing. ;-)
I will take out the comment. But the design is still bad, and I will oppose it
in the WG, and (if necessary) at the IESG, and IETF last call.
>> 10.3. Reliability via Replication
>> Everywhere throughout the mapping system, robustness to operational
>> failures is obtained by replicating data in multiple instances of
>> any particular node (of whatever type). Map-Resolvers, Map-Servers,
>> DDT nodes, ETRs - all of them can be replicated, and the protocol
>> supports this replication.
>> ..
>> There are generally no mechanisms specified yet to ensure coherence
>> between multiple copies of any particular data item, etc - this is
>> currently a manual responsibility. If and when LISP protocol adoption
>> proceeds, an automated layer to perform this functionality can 'easily'
>> be layered on top of the existing mechanisms.
> I do not understand what you mean here.
Re-reading it, it seems pretty clear to me?
I will add some examples in the second paragraph (e.g. the copies of
delegation data for a particular block of namespace, in two DDT sibling
servers), but I don't see how else to make this plainer.
> And therefore, why it is even necessary to have multiple copies.
We _already_ have multiple copies of many kinds of data (e.g. the mappings
in ETRs). This replication, and ensuring the coherence thereof, is
all entirely manual at the moment.
>> The client interface provides only a single model, using the
>> 'canonical' public-private key system
> There isn't any client interface. There is just the Map-Referrals that
> the DDT-nodes use.
That _is_ the client interface.
> Just indicate that a parent will provide the public key of the children
> so when the children sign a Map-Referral, the MR can verify it.
I had already improved this text somewhat, along the lines you indicate, after
a comment above.
>> [[M4: Any more?]]
> There are examples in the TE draft.
That will have to go in the 'Improvements' ID, I'm afraid. (That ID - long
story, I'll send a message to the WG about it.)
>> [[M5: Perhaps this belongs in "Scalability"?]]
> Yes, put all scalability in one place.
Hmmm. If we're going to do that, _and be consistent_, we'd have to move a
bunch of other stuff too. For instance, the cache scalability stuff in the
"Major Functional Subsystems" section ought to be moved, too.
Let's talk about this at the interim.
>> [[M6: What about potential caching in MRs?]]
> Don't document it because you will set an expectation and readers won't
> find it in any of the RFCs.
Should I apply this principle uniformly? ;-) But I agree that this should not
be mentioned here.
>> If an ITR is holding an outdated cached mapping, it may send packets to
>> an ETR which is no longer an ETR for that EID.
>> ...
>> ETRs can easily detect cases where this happens, after they have
>> un-wrapped a user data packet;
>> [[F4: The LISP spec is not clear (at least, on a quick reading) on
>> whether ETRs should check for this
> They need state from the mapping system that tells them an EID-prefix
> has moved. That is what tells an ETR it is the wrong one.
Huh? The ETR has to have all its mappings - it is, after all, authoritative
for them!!!
So if a packet for 1.2.3.4 arrives at an ETR, and the ETR's only mapping
currently is for 1.2.77/24, it can tell instantly that it is _not_ a valid ETR
for that packet. The only question is 'do all ETRs check for this'?
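The check itself is trivial; a minimal sketch, using Python's standard `ipaddress` module, is below. The function name and the idea that the ETR keeps its EID-prefixes as a simple list are illustrative assumptions, not anything the spec defines, and "1.2.77/24" is written out as `1.2.77.0/24`.

```python
import ipaddress
from typing import Iterable


def is_valid_etr_for(dest: str, local_eid_prefixes: Iterable[str]) -> bool:
    """Return True if 'dest' falls inside any EID-prefix this ETR holds.

    Hypothetical helper: an ETR would run a check like this on the inner
    destination address after un-wrapping a user data packet.
    """
    addr = ipaddress.ip_address(dest)
    return any(addr in ipaddress.ip_network(prefix)
               for prefix in local_eid_prefixes)


# The ETR's only mapping is for 1.2.77.0/24.
prefixes = ["1.2.77.0/24"]
print(is_valid_etr_for("1.2.3.4", prefixes))   # not covered by the prefix
print(is_valid_etr_for("1.2.77.9", prefixes))  # covered by the prefix
```

A real ETR would presumably do this as a longest-prefix lookup in whatever table it already uses for forwarding, rather than a linear scan.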
>> 12.3. Erroneous Mappings
>> Again, this 'should not happen', but a good system should deal with
>> it. However, in practice, should this happen, it will produce one
>> of the prior two cases (the wrong ETR, or something that is not an
>> ETR), and will be handled as described there.
>> [[F8: I suppose if one ETR is handing out bad mappings, it might be
>> nice to be able to bypass it. This probably falls under 'Future
>> Work', though.]]
> This section sounds like a requirement and not a description of what can
> happen.
Say what? Human errors 'can't happen'? Of course they can.
> I think it should be removed.
I'm trying to cover all the potential cases here.
>> 12.5. Neighbour Reachability
> I think this section is redundant with the previous. If you think there
> are salient points in it, then move them to the prior paragraph.
Sorry, but I think i) most of the content in this section is not redundant,
and ii) I think it is important to distinguish between liveness and
reachability.
>> 13. On-Going Improvements
Due to a variety of factors, I have split off this entire section as a new
document; I'll deal with comments on these later.
Noel