Short version: The Devil is in the detail, but the I-Ds don't yet
have all the details. Multiple levels of caching
are good in some ways and troublesome in others.
Thanks for your comments Robin.
The main point of draft-fuller-lisp-ms-00.txt is to create an API of
sorts for LISP sites, so they can use a set of primitives regardless
of the mapping database system deployed.
By doing this, the cost of managing an xTR goes way down. No GRE
tunnels, no BGP. Simply Map-Request, Map-Reply, and Map-Register
primitives.
LISP Map Server draft-fuller-lisp-ms-00
Abstract
This draft describes the LISP Map-Server (LISP-MS), a computing
system which provides a simple LISP protocol interface as a "front
end" to the Endpoint-ID (EID) to Routing Locator (RLOC) mapping
database and associated virtual network of LISP protocol elements.
The purpose of the Map-Server is to simplify the implementation
and operation of LISP Ingress Tunnel Routers (ITRs) and Egress
Tunnel Routers (ETRs), the devices that implement the "edge" of the
LISP infrastructure and which connect directly to LISP-capable
Internet end sites.
My understanding of and comments on this are:
Instead of ITRs and ETRs needing to act as routers in the ALT
network, they communicate via the ordinary Internet with Map Servers,
which are routers on the ALT network. This will greatly reduce the
complexity and configuration difficulties of ITRs and ETRs.
Yes, that is right. ITRs send encapsulated Map-Requests to Map-
Resolvers via the Map-Resolver's RLOC address. ETRs get encapsulated
Map-Requests from Map-Servers via the ETR's RLOC address only after
the ETR Map-Registers to the Map-Server.
These Map Server devices are implicitly local to the ITRs and ETRs in
a given network and are intended to be used only by those ITRs and
ETRs. They are always on RLOC (stable, globally reachable, non
LISP-mapped) addresses.
Don't know what you mean by local. But if you meant the Map-Server is
colocated with ITRs, that is not true. The Map-Server would typically
not be at the site but in the Internet infrastructure somewhere. Most
likely in a service provider, an interconnect provider, an RIR, or a
third party.
There are two functions which may be combined in the one device:
Map Resolver (MR)
Accepts a mapping query from an ITR and (usually) sends the
ITR a mapping reply. (The exception is if the MR doesn't
have the information and sends the query verbatim to some
other device, which will answer the query directly to the
ITR.)
Right, we want to experiment with Map-Resolver caching but want to do
that as a second phase in the implementation. So the Map-Resolver gets
the Map-Request from the ITR which now puts it on the LISP-ALT
network. If there is another mapping database service, it could be used.
This way we can make the mapping database service modular and don't
need the sites to participate in it directly.
MRs can be caching or non-caching. More on that below.
ITRs are intended to be configured with a single address for
their local MR. This would raise questions of robustness if
not for the next item:
Multiple MRs in a local network (such as an ISP network or
I guess any end-user network which has ITRs) can be configured
on the one anycast address. This way, the ITR's request will
be forwarded to the nearest currently active MR. All
communication is via single packets, not via TCP. Presumably
the MRs will also have their own unique addresses so they can
be managed via TCP.
Right.
I think the MR is an important improvement to LISP-ALT, since
It's not an improvement to the LISP-ALT mapping database, but a Map-
Resolver can be a LISP-ALT router/system, a NERD system, or a CONS/DHT
system.
it enables an ITR to be a much more casual and unstable concept
than was the case when all ITRs needed to participate in the
ALT network as routers (AFAIK). This means that ITRs can be
added easily, without having to configure anything.
True.
It also means (though this is my suggestion, not from the LISP
team) that an ITR function could easily be implemented in a
sending host, assuming it was not behind NAT. I guess the
sending host would need to be on an RLOC address - which rules
out this idea for sending hosts in end-user networks. Ivip's
ITR in sending host function (ITFH) requires the host to be
on a non-NAT address which can be an ordinary or a Scalable PI
address - RLOC or EID in LISP parlance.
True, however, it would increase the number of locators for a site.
That is, the EID-to-RLOC ratio would be 1-to-1. And the mapping
database would be orders of magnitude larger!
Map Server (MS)
Is a router on the ALT network and accepts secure messages from
one or more ETRs. (Secret key pairs to secure these.) ETRs
are (typically, or always?) the authoritative source of mapping
information in LISP.
Right. The ETRs are registering their EID-prefixes more so than the
mapping. Just an FYI, if that wasn't clear. Map-Servers don't answer
Map-Requests because they wouldn't be authoritative.
ETRs can be on any RLOC address and use ordinary packets to
communicate with the MS.
Yes, they send Map-Register messages from one of their local RLOCs.
My understanding is that the MS announces the appropriate
prefixes on the ALT network - one for every EID the ETR
tells it.
Right, but if the Map-Server is at an aggregation boundary, the
specific EID-prefix won't be announced but the configured aggregate in
the Map-Server would.
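A minimal Python sketch of the aggregation rule just described, assuming
the Map-Server holds a list of configured aggregates. The function name
prefix_to_announce and the example prefixes are hypothetical and do not
come from the draft.

```python
import ipaddress


def prefix_to_announce(registered, aggregates):
    """Return the prefix a Map-Server would announce into the ALT.

    If the registered EID-prefix falls inside a configured aggregate,
    the aggregate is announced; otherwise the specific prefix is.
    (Hypothetical helper, for illustration only.)
    """
    reg = ipaddress.ip_network(registered)
    for agg in map(ipaddress.ip_network, aggregates):
        # subnet_of() raises TypeError across address families,
        # so compare versions first.
        if reg.version == agg.version and reg.subnet_of(agg):
            return agg
    return reg


# A registered /24 covered by a configured /16 is hidden behind the /16.
print(prefix_to_announce("10.1.2.0/24", ["10.1.0.0/16"]))
```

With no covering aggregate configured, the same call returns the
specific EID-prefix unchanged.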
Ignoring MSes for a moment, I have never understood how this
would work with two ETRs in two separate ISPs handling the same
Multiple ETRs reside at the same site, not in the SP network.
EID. Both ETRs would be routers on the ALT network and would
announce the same prefix. So where do packets go to? I guess
Within their aggregation level, there are two paths for Map-Requests
to travel to the site. It's the upstream BGP routers that decide which
path to take. They would take shortest path based on AS-path hop-
count. Recall that each LISP-ALT router is doing "eBGP".
to either. But then the ETRs somehow need to coordinate
themselves, or be coordinated by something else, so they act
in a unified manner. Then, as long as both were reachable and
working properly, it wouldn't matter which ETR got the query.
Right, but they don't need to coordinate. All they need is to be
consistently configured to Map-Register the same EID-prefix.
The same problem seems to apply with MSes. There would be two
ETRs in two separate ISPs and each would presumably (for
robustness in a multihoming situation and probably for security
reasons) have its own MS in its own ISP network.
No, not true.
So now we have two ETRs and two MSes which need to be
coordinated. The two MSes both announce the one EID prefix
on the ALT network. Yet they are supposed to still be
coordinated during outages.
The 2 Map-Servers will converge into a topology that will aggregate
the site's Registered EID-prefix so we can have a smaller ALT core.
Smaller meaning fewer EID-prefixes needing to be stored
in the core of the ALT network.
However this is resolved, I think it is a big improvement for
LISP to have MSes, since it reduces the cost, complexity,
management effort etc. for ETRs similarly to how MRs do the
same for ITRs.
Both these functions can presumably be performed quite adequately by
software devices, such as a COTS server with suitable software.
There doesn't have to be any hardware router FIB etc. AFAIK.
Yep, that is true.
This would enable hardware routers to assume ITR and ETR
responsibilities without them also needing all the software and
configuration, stable address etc. to be an ALT router. Also, by
decreasing the total number of ALT routers, this simplifies the ALT
network.
Yes, we thought so too.
I gather from this new I-D, and from what I read in:
http://www.lisp4.net/docs/lisp-ausnog02.ppt
that the current test network and the intention for the future is not
to send traffic packets on the ALT network. This approach was
initially an option, with the intention that the ALT network would
forward the initial packet(s) to the correct ETR, which would then
forward it to the destination network, while also recognising it as a
map request and so would send a map reply message to the ITR.
Right, that is correct. The implementation supports both sending
Map-Requests and Data-Probes on the ALT network, but we default to
Map-Requests and might possibly deprecate Data-Probes.
I recall from somewhere that the ITR typically sends out a few
mapping requests, just in case one of them is dropped. When the ITR
Well no, we rate-limit Map-Requests but they are triggered when a
source at the site sends data. However, we can play with this to see
what works well.
connects directly to the ALT network, these packets presumably
usually traverse the entire global ALT network until they are
delivered to one or more (probably just all to one) ETR which
responds. I guess the ETR sends multiple replies, but maybe not.
The reply goes to the ITR via the ordinary Internet.
Map-Replies are rate-limited as well.
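The rate-limiting mentioned above could look roughly like the sketch
below. It is only an illustration: the class name MapRequestLimiter, the
per-EID-prefix keying, and the one-second interval are all assumptions,
since this thread gives no actual values or data structures.

```python
class MapRequestLimiter:
    """Allow at most one Map-Request per EID-prefix per interval.

    Hypothetical sketch; the interval and the keying are assumptions.
    """

    def __init__(self, interval=1.0):
        self.interval = interval   # seconds between requests (assumed)
        self.last_sent = {}        # EID-prefix -> time of last request

    def allow(self, eid_prefix, now):
        """Return True if a Map-Request may be sent at time `now`."""
        last = self.last_sent.get(eid_prefix)
        if last is not None and now - last < self.interval:
            return False           # still inside the quiet period
        self.last_sent[eid_prefix] = now
        return True


limiter = MapRequestLimiter(interval=1.0)
print(limiter.allow("10.1.0.0/16", now=0.0))   # first data packet triggers a request
print(limiter.allow("10.1.0.0/16", now=0.5))   # suppressed, too soon
```

The caller passes the clock value in explicitly so the behavior is easy
to reason about; the same shape would apply to Map-Replies at the ETR.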
Removing these potentially long and voluminous traffic packets from
the ALT network seems like a good idea to me. There may well be
security benefits in doing so too. Below, I assume the ALT network
only carries mapping requests, and that the map replies go back from
whatever answers them (an ETR connected to ALT network, or more
likely a Map Server) via a direct ordinary Internet packet to the
device which made the query (perhaps a directly connected ITR or more
likely a Map Resolver).
Yes, this is true.
A Caching Map Resolver?
If the MR caches, then it has the potential to significantly reduce
the traffic on the ALT network. This is due to two or more ITRs in a
given ISP network wanting the same mapping, and the second and
subsequent ones getting it directly from the local caching MR.
Yes, this was Noel's idea with CONS. It is worth experimenting.
This also has the potential to eliminate, for the second and
subsequent ITRs which need this mapping, the major problem of
"LISP-ALT's initial packet delays", so much debated on the RRG in
recent months.
Well, I'm not so sure. If you point an ITR to an RLOC of a Map-
Resolver, you take the shortest path to it. But if you had a GRE
tunnel to the same box, the GRE tunnel destination would be the same
RLOC. So the path would be the same. But you couldn't run an anycast
Map-Resolver service because the eBGP connections that ran over the
GRE tunnels would reset. So I guess this is an improvement.
There is nothing in draft-fuller-lisp-ms-00 to describe this caching
behavior.
The caching time of map replies is specified in units of one minute:
draft-farinacci-lisp-12:
Record TTL: The time in minutes the recipient of the Map-Reply
will store the mapping.
That detail will come in a later draft.
Let's say at time T = 0 minutes, ITR-A sends a map request to MR-1,
which has no mapping for the EID prefix which matches the EID address
in the request message. MR-1 sends its own map request message (with
its own nonce) onto the ALT network which forwards it to either the
Well, that's not the way it works. The ITR sends an encapsulated Map-
Request to the Map-Resolver. The Map-Resolver strips the outer header
and then forwards the Map-Request on the ALT. The source address is
the ITR RLOC address and the destination address is the EID that
caused the map-cache fault on the ITR.
single Map Server which advertises the matching EID prefix on the ALT
network, or to one of the multiple such Map Servers, or perhaps to the
directly ALT-connected ETR(s) which do the same.
Correct.
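The decapsulate-and-forward step described above can be sketched as
follows. The class and field names are hypothetical and are not packet
formats from the drafts. The point of the sketch is that the
Map-Resolver does not originate a new request: the inner Map-Request,
including the ITR's nonce, goes onto the ALT unchanged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapRequest:
    source: str        # ITR RLOC
    destination: str   # EID that caused the map-cache fault on the ITR
    nonce: int         # chosen by the ITR


@dataclass(frozen=True)
class EncapsulatedMapRequest:
    outer_source: str        # ITR RLOC
    outer_destination: str   # Map-Resolver RLOC
    inner: MapRequest


def resolver_forward(packet):
    """Strip the outer header and return the Map-Request for the ALT."""
    return packet.inner


inner = MapRequest(source="192.0.2.1", destination="10.1.2.3", nonce=0x1234)
encap = EncapsulatedMapRequest("192.0.2.1", "198.51.100.53", inner)
forwarded = resolver_forward(encap)
print(forwarded.source, forwarded.destination)
```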
That device sends the mapping reply back to MR-1 directly via the
Internet. The reply is secured by returning MR-1's nonce.
No, it would go to the ITR because in the Map-Request payload there is
an "ITR RLOC" field. This is quite important because if that
Map-Request was an IPv6 Map-Request with an IPv6 outer header, and since
the LISP-ALT network we have deployed is dual-stack, the IPv6
Map-Request is forwarded on the ALT, but the ETR may not (and probably
does not) have an IPv6 path back to the ITR. So if the "ITR RLOC" field
is encoded with an IPv4 RLOC, the ETR sends a Map-Reply back with an
IPv4 header.
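As a rough illustration of that address-family choice: the Map-Reply is
addressed to the "ITR RLOC" carried in the payload, and the ETR needs a
local RLOC of the same family to send it from. The function name
choose_reply_addresses and the example addresses are assumptions made
for this sketch.

```python
import ipaddress


def choose_reply_addresses(itr_rloc, etr_rlocs):
    """Pick (source, destination) addresses for a Map-Reply.

    The destination is the ITR RLOC from the Map-Request payload; the
    source is the first ETR RLOC of the same address family. Returns
    None when the ETR has no RLOC in that family. (Hypothetical helper.)
    """
    dst = ipaddress.ip_address(itr_rloc)
    for rloc in etr_rlocs:
        src = ipaddress.ip_address(rloc)
        if src.version == dst.version:
            return str(src), str(dst)
    return None


# IPv4 ITR RLOC in the payload, so the reply uses an IPv4 header.
print(choose_reply_addresses("192.0.2.1", ["2001:db8::1", "198.51.100.7"]))
```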
In the entire LISP design we treat IPv4 and IPv6 equally and try to
enhance IPv6 connectivity by using IPv4 outer headers or IPv4 RLOCs
when encapsulating.
Today, two IPv6-only sites can open an IPv6 TCP connection to each
other if they run LISP and use IPv4 locators.
Let's say the mapping reply comes back with a 90 minute caching time.
MR-1 sends to ITR-A a map reply, with ITR-A's request's nonce, with
the fresh mapping information and a caching time of 90 minutes. Now
MR-1 can encapsulate packets to its choice of ETRs, based on the
fresh mapping it has received and whatever it has determined about
reachability of those ETRs, and of the ETRs' ability to get packets
to the destination network.
No, no, no. The Map-Resolver does not encapsulate any packets.
Remember the ALT has no data going over it.
If the Map-Resolver is caching Map-Replies and the ITR sends a Map-
Request with A=0, then the Map-Resolver can respond with a Map-Reply.
If the ITR sends a Map-Request with A=1, the Map-Resolver must forward
the Map-Request over the ALT so an authoritative Map-Reply can be
returned by the ETR.
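The A-bit rule above, as a small sketch. The cache here is keyed by the
exact EID for brevity; a real cache would do a longest-prefix match. The
function name and the return convention are hypothetical.

```python
def handle_map_request(eid, a_bit, cache):
    """Decide what a caching Map-Resolver does with a Map-Request.

    Returns ("reply", mapping) when it may answer from its cache, or
    ("forward", None) when the request must go over the ALT.
    """
    if not a_bit and eid in cache:
        # A=0: a non-authoritative answer from the cache is acceptable.
        return ("reply", cache[eid])
    # A=1, or a cache miss: only the ETR can give the answer.
    return ("forward", None)


cache = {"10.1.2.3": ["198.51.100.7", "203.0.113.9"]}
print(handle_map_request("10.1.2.3", a_bit=False, cache=cache))
print(handle_map_request("10.1.2.3", a_bit=True, cache=cache))
```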
Later, at T = 85 minutes, ITR-B sends a mapping request to MR-1 for
an address which matches this same EID prefix. MR-1 can use its
cached information and send a reply within a few milliseconds. This
means ITR-B's traffic will not be delayed by any significant amount.
What caching time will be in that reply to ITR-B? I assume it will
be 5 minutes. If it would be 90 minutes, ITR-B could be running for
a long time to come on stale mapping information.
We haven't figured that out yet. We don't want to create an impression
that a cacher of a Map-Reply can use any TTL it wants. We want to make
it mandatory to respect the ETR's value.
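One possible reading of "respect the ETR's value", offered only as an
assumption since the behavior is undecided here, is that a cacher hands
out whatever is left of the ETR's Record TTL, much as caching DNS
resolvers count TTLs down. Under that reading the 90-minute example
works out to the 5 minutes guessed above:

```python
def remaining_ttl_minutes(record_ttl, cached_at, now):
    """Minutes left on a cached mapping, never below zero.

    record_ttl is the ETR's Record TTL; cached_at and now are in
    minutes. Hypothetical helper; the drafts do not specify this.
    """
    return max(0, record_ttl - (now - cached_at))


# Mapping cached at T=0 with a 90-minute TTL, requested again at T=85.
print(remaining_ttl_minutes(90, cached_at=0, now=85))
```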
Assuming ITR-A no longer needs this EID's mapping, but ITR-B keeps
needing to tunnel packets addressed to this EID, then at T=90
minutes, ITR-B will want mapping information again.
Should ITR-B request the mapping again at T = 88 minutes, in
readiness for probably needing it in 1 minute's time?
It could, but the reasons to time out the map-cache entry are to keep
the cache small and to be resilient, to some extent, to locator-set
changes at the ETR site.
This would seem like a generally reasonable approach if it prompted
MR-1 to get fresh mapping information, but why should MR-1 do this?
Would MR-1 need to look at the original caching time and how much
has expired to decide whether it should, by some algorithm, request
fresh mapping? But what if the mapping hadn't changed in the distant
Map Server, but the ETR was going to change it two minutes later?
One of the problems I see with caching in the Map-Resolver is if the
map-cache entry does have a locator-set change and the ETR asks all
cachers to send Map-Requests (it does this by setting the SMR-bit for
active flows), the Map-Resolvers cannot get updated because they are
not seeing data.
However, I have a solution for this because it will be the ITR that
sends A=1 Map-Requests with an SMR-bit set. That can tell the
Map-Resolver to ask for the Map-Reply back to update its cache. I know
there are security issues with this but it's one way of doing it.
There are also details how a Map-Resolver asks to get the Map-Reply
back. We want to do this in a stateless manner in the Map-Resolver. So
we might have to preserve the ITR RLOC's address in the Map-Request
but instruct the ETR where to send the Map-Reply. We have some ideas
and want to think about it before changing packet formats.
If ITR-B waited until T = 90 or a little later before requesting
fresh mapping, then unless MR-1 had already got fresh mapping in the
last minute or two, then there would presumably be a delay in the ITR
being able to handle traffic for this EID, since it would take some
time for MR-1's second mapping request to traverse the ALT network
and generate a reply to MR-1.
There are various scenarios, but I think there are potential
difficulties with caching times running out in three locations now
rather than one.
Yeah, we can't get too tricky about manipulating TTLs. DNS has been
fraught with problems due to TTL issues. If anyone has advice about
this, it would be nice to hear about it.
Previously, it was simple (despite the scaling problems of lots of
ITRs peppering an ETR for mapping, not to mention them all trying to
decide reachability for this and other ETRs):
ITR    query ----------->    ETR
       <----------- reply
cache
Now we have:
ITR    query ----->    Map        query ----->    Map
       <----- reply    Resolver   <----- reply    Server  <--Register-  ETR
Cache                  Cache                      Cached, in a sense,
                                                  controlled by messages
                                                  from the ETR(s) whenever
                                                  they can reach the
                                                  Map Server and decide
                                                  to send a Map Register
                                                  message.
Right, but what if we used the first case and have the ETR schedule an
update to the Map-Resolver? Or what if the ITR updated the
Map-Resolver? Not sure yet. And not even sure how much RTT will buy us
with Map-Resolver caching.
I think this raises more complex problems with:
1 - How to avoid cache times running out at ITRs
which are going to be tunneling packets addressed
to this EID after the cache time expires.
Such a situation will cause a traffic delay unless
the local Map Resolver has recently got fresh mapping.
But I think this issue has continually been exaggerated. The
Map-Request delay is not for a lot of packets and will be relatively
rare, I imagine.
2 - How to minimise unnecessary map requests by Map
Resolvers trying to anticipate ITRs making such
requests, but actually requesting fresh mapping from
the distant Map Server when the ITR doesn't need it.
Right, Map-Resolver caching could be more trouble than it is worth.
Without further complications, the Map Resolver can't know whether
the one or more ITRs which requested the mapping for an EID are still
handling traffic for that EID. So it can't very well request fresh
mapping towards the end of its expiry, just in case an ITR wants it.
To do so would approximately double the volume of map requests
traversing the ALT network, since it is reasonable to assume, with
longish caching times, that the original caching time will generally
suffice for the needs of the one or more ITRs served by the Map
Resolver. (This would not be true with a busy Map Resolver and
popular EIDs many ITRs are sending packets to.)
Without some elaboration of the request protocol, ITR-B at T = 85
minutes can't ask the Map Resolver to get fresh mapping and send it a
new reply - unless there is some algorithm in the Map Resolver such
as: "If the cached mapping is 90% of the way to its expiry time, do
not answer the new request from the cache, but send a fresh map
request and then answer the query if and when the new reply arrives."
To do so would effectively shorten all the caching times.
Well, if the mappings don't change, longer TTLs will help the
Map-Request load on the ALT. If there are frequent changes and you want
fast convergence to them, then you use more resources.
That is the tradeoff.
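To put a number on the "90% of the way to its expiry" rule quoted above,
here is a small sketch of that check. The function name and the
threshold parameter are hypothetical; nothing like this is specified in
the drafts.

```python
def should_refresh(record_ttl, cached_at, now, threshold=0.9):
    """True once a cached mapping has used `threshold` of its TTL.

    All times are in minutes. Hypothetical helper for illustration.
    """
    age = now - cached_at
    return age >= threshold * record_ttl


# 90-minute TTL cached at T=0: a request at T=85 triggers a fresh
# Map-Request, while one at T=80 is still answered from the cache.
print(should_refresh(90, cached_at=0, now=85))
print(should_refresh(90, cached_at=0, now=80))
```

With a 90-minute TTL the cache stops answering after about 81 minutes,
which is the effective shortening of caching times noted above.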
At present, there is only one kind of map request message from an ITR
to a Map Server - implicitly an urgent request.
If there was a second kind:
"This ITR has mapping for this EID which will expire in some
time period (specified) soon, and requests the Map Resolver
to get fresh mapping from the Map Server now, and to send
a reply once this arrives."
There is no reason why the ITR cannot send a Map-Request directly to
the RLOC of the ETR. It does have a set of them it can try. And the
nonce will protect against ETR spoof attacks.
then I think these problems would be resolvable with less trouble and
less need for choices based on limited information.
- Robin
Thanks again for your comments Robin,
Dino