Hello Noel,
On 12/13/2012 05:09 PM, Noel Chiappa wrote:
> From: "Joel M. Halpern" <[email protected]>
> By assumption, the cache is not big enough to hold the whole mapping
> table (otherwise it would not be subject to overflow).
I'm not sure this is accurate (but I concede it will depend on the
implementation).
The chief reason (at least, in my mind) that LISP uses a pull-based control
plane is not the sheer size of the mapping database (memory is, after all,
cheap and plentiful these days), but rather the control traffic overhead
implications - in particular, the issue of the overhead required to maintain
(in particular, updating, which is not a one-time-only cost) never-used
entries.
Well, it depends. One of the practical reasons that led to LISP was the
DFZ routing table growth. Many wondered then whether both TCAM cache size
and lookup speed would scale fast enough. Fast forwarding with LISP may
require the use of some fast (and maybe large, but more on this below)
caches.
The local copy of mapping information in the ITRs is described as a cache
because it is demand-loaded. Many (most?) caches that people come across are
size-limited, so I think there is an _assumption_ on people's parts that the
LISP cache is also size-limited, but I don't think there is any particular
reason that this must always be so.
If nothing else, size-limiting the cache means one has to code up a discard
algorithm, etc. Now, it may be that the wise programmer will have some
mechanism for running out of memory for the cache (and I wonder what, e.g.,
BGP implementations do for this circumstance), and perhaps forcing a box into
that operating regime is an attack vector.
Both Luigi's work [2,3] and ours [1] show that for today's traffic
loads/patterns one does not need large caches (relative to those
maintained in BGP routers). In fact, with some relatively small caches
and a simple eviction algorithm, performance is acceptable.
[1] http://personals.ac.upc.edu/fcoras/publications/2012-fcoras-networking-lisp-cache.pdf
[2] https://www.net.t-labs.tu-berlin.de/papers/KIF-ADDITLCAWISKAI-11.pdf
[3] https://www.net.t-labs.tu-berlin.de/papers/IB-CCLIM-07.pdf
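To make "a relatively small cache and a simple eviction algorithm" concrete, here is a minimal sketch of a size-limited map-cache with least-recently-used eviction. It is only an illustration: the class and method names are hypothetical, it does exact-match lookups rather than longest-prefix matching, and it is not the policy evaluated in [1], [2] or [3].

```python
from collections import OrderedDict


class LruMapCache:
    """Size-limited EID-prefix -> RLOC cache with LRU eviction (illustrative only)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._entries = OrderedDict()  # oldest entry first, most recent last
        self.misses = 0

    def lookup(self, eid_prefix):
        """Return the cached RLOC, or None on a miss (which would trigger a Map-Request)."""
        rloc = self._entries.get(eid_prefix)
        if rloc is None:
            self.misses += 1
            return None
        self._entries.move_to_end(eid_prefix)  # mark as most recently used
        return rloc

    def install(self, eid_prefix, rloc):
        """Store a mapping, evicting the least recently used entry if the cache is full."""
        self._entries[eid_prefix] = rloc
        self._entries.move_to_end(eid_prefix)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)
```

With a capacity of 2, installing a third mapping drops whichever of the first two was used least recently. That is also why plain LRU is easy to pollute with entries that are never used again.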
Also, there was an extensive discussion on the mailing list about a year or
so ago of cache designs, for size-limited caches, which would avoid knocking
the 'real' entries out (in favour of the attacker's 'not-really-used'
entries); if the document didn't pick up some of that (sorry, haven't had the
time to read it yet), it could probably usefully do so.
We're currently investigating this, and here is where size matters, but
not necessarily in obvious ways. One user, or multiple coordinated users,
could trash an ITR's cache quite fast. Of course, user rate limiting and
control plane rate limiting will help but, as you mention below, they
would also limit the ability of the ITR to learn new, legitimate
destinations. So we think that an elaborate eviction policy should help
here a lot.
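As one possible shape for such a policy (an assumption for illustration, not the design we are investigating nor one from the earlier list discussion), a segmented LRU keeps new mappings in a small probationary segment and moves them to a protected segment only once they are hit again. All names below are hypothetical.

```python
from collections import OrderedDict


class SegmentedMapCache:
    """Two-segment LRU: entries must be re-used before they can displace proven ones."""

    def __init__(self, probation_size, protected_size):
        self.probation_size = probation_size
        self.protected_size = protected_size
        self.probation = OrderedDict()  # newly learned, not yet re-used
        self.protected = OrderedDict()  # hit at least once after being installed

    def install(self, eid_prefix, rloc):
        """Add or refresh a mapping; new mappings always start in probation."""
        if eid_prefix in self.protected:
            self.protected[eid_prefix] = rloc
            return
        self.probation[eid_prefix] = rloc
        self.probation.move_to_end(eid_prefix)
        if len(self.probation) > self.probation_size:
            self.probation.popitem(last=False)  # evict oldest probationary entry

    def lookup(self, eid_prefix):
        """Return the RLOC or None; a hit in probation promotes the entry."""
        if eid_prefix in self.protected:
            self.protected.move_to_end(eid_prefix)
            return self.protected[eid_prefix]
        if eid_prefix in self.probation:
            rloc = self.probation.pop(eid_prefix)
            self.protected[eid_prefix] = rloc
            if len(self.protected) > self.protected_size:
                # Demote the least recently used protected entry back to probation.
                old_key, old_rloc = self.protected.popitem(last=False)
                self.probation[old_key] = old_rloc
                if len(self.probation) > self.probation_size:
                    self.probation.popitem(last=False)
            return rloc
        return None
```

In this sketch a burst of never-reused destinations can only churn the probationary segment. The cost is that a legitimate new destination can be evicted before its second hit if the probationary segment is too small.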
Florin
The other consideration is that even if the box does not hit a memory limit,
an attack of the kind you describe will increase the load on the control
plane (for loading all the extra mappings) above and beyond its normal load.
This may be worth mentioning, and exploring (if it's not already discussed
- again, sorry, haven't read the ID yet).
I seem to recall that there are already rate limits on the control traffic
from any given ITR? If so, one would expect that the primary 'victim' of such
rate limiting would be the 'bad guy' (unless an attack of this sort happens
as a box is starting), since most of the 'new' requests will be those of the
attacker, so it will mostly be their mapping requests which are dropped.
Yes, this may interfere with the operation of the ITR for real users who are
going 'new' places - if that turns out _in practice_ to be a problem, perhaps
there is some way to isolate such traffic? If we assume that only one (or a
small number) of hosts inside the LISP region (served by that ITR) are the
source of bogus traffic (intended to overflow/dilute the cache), then if an
ITR keeps a 'request count' per source host (yes, more state, and more work
to maintain it, but...), it could rate-limit requests per source.
Since this does not require any _protocol_ changes, we can probably put off
implementing such measures until it proves to be an actual problem in service
- but it's probably worth mentioning all this in the I-D.
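A rough sketch of the per-source 'request count' idea, written as a token bucket per source host. This is purely illustrative: the class name, the rate and burst parameters, and the injectable clock are assumptions, not anything specified for LISP.

```python
import time


class PerSourceRequestLimiter:
    """Token bucket per source host, gating which cache misses may trigger a Map-Request."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = float(rate_per_sec)  # tokens added per second, per source
        self.burst = float(burst)        # maximum tokens a source can accumulate
        self.clock = clock               # injectable so the limiter can be tested
        self._buckets = {}               # source host -> (tokens, last update time)

    def allow(self, source_host):
        """Return True if this source may trigger one more Map-Request right now."""
        now = self.clock()
        tokens, last = self._buckets.get(source_host, (self.burst, now))
        # Refill for the elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._buckets[source_host] = (tokens - 1.0, now)
            return True
        self._buckets[source_host] = (tokens, now)
        return False
```

A host that sprays packets at many new destinations exhausts only its own bucket, so other hosts behind the same ITR can still resolve new destinations. The per-source table is itself state that would need bounding, and spoofed source addresses would weaken the scheme.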
Noel