I'm not sure about ILA-Rs, but in typical LISP deployments RTRs/Proxy-ITRs have
enough memory to store most, if not all, of the identity-to-location mappings.
Therefore, once in steady state, most of the requests to the mapping system are
triggered by edge devices (ITRs/ILA-Ns).
This means that rate limiting the ITRs should be enough to avoid DoS-ing
the control plane, and the problem becomes one of avoiding sub-optimal
paths for legitimate traffic under attacker pressure. As
Alberto mentioned, there are a number of ways to identify both the
attackers and the set of destinations that should be protected against cache
eviction. The former determines the set of requests that should
not be punted, while the latter ensures that mappings for popular destinations
cannot be evicted by attacks.
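
To make the two mechanisms concrete, here is a rough Python sketch. It is only
an illustration under my own assumptions, not how any LISP or ILA
implementation actually does it: the names TokenBucket, MapCache, on_packet and
DEFAULT_PATH are hypothetical, the attacker set and the protected destination
set are assumed to come from some external detection step (e.g. heavy-hitter
tracking), and time is passed in explicitly to keep the example deterministic.

```python
from collections import OrderedDict

DEFAULT_PATH = "rtr"  # hypothetical default next hop (RTR / ILA-R)


class TokenBucket:
    """Caps how many Map-Requests an ITR may send per second."""

    def __init__(self, rate, burst):
        self.rate = rate            # tokens added per second
        self.burst = burst          # maximum bucket size
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now):
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class MapCache:
    """LRU map-cache in which protected destinations are never evicted."""

    def __init__(self, capacity, protected=()):
        self.capacity = capacity
        self.protected = set(protected)
        self.entries = OrderedDict()  # destination -> locator, oldest first

    def lookup(self, dst):
        if dst in self.entries:
            self.entries.move_to_end(dst)  # mark as recently used
            return self.entries[dst]
        return None

    def insert(self, dst, locator):
        self.entries[dst] = locator
        self.entries.move_to_end(dst)
        while len(self.entries) > self.capacity:
            # Evict the least recently used entry that is not protected.
            victim = next((d for d in self.entries if d not in self.protected), None)
            if victim is None:  # everything is protected; nothing to evict
                break
            del self.entries[victim]


def on_packet(src, dst, now, cache, bucket, attackers):
    """Return (next_hop, send_map_request) for one data packet."""
    locator = cache.lookup(dst)
    if locator is not None:
        return locator, False
    # Cache miss: the packet still flows via the default path.
    # A Map-Request is punted only for non-attackers and within the rate limit;
    # known attackers are checked first so they do not consume tokens.
    punt = src not in attackers and bucket.allow(now)
    return DEFAULT_PATH, punt
```

In this sketch a cache miss never blocks traffic: the worst an attacker can do
is push legitimate flows onto the default path, and even that is bounded
because protected destinations stay in the cache.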
> On Mar 13, 2018, at 4:27 PM, Tom Herbert <t...@quantonium.net> wrote:
> On Tue, Mar 13, 2018 at 3:50 PM, Alberto Rodriguez Natal (natal)
> <na...@cisco.com <mailto:na...@cisco.com>> wrote:
>> On 3/13/18, 1:05 PM, "Tom Herbert" <t...@quantonium.net> wrote:
>>> This is reflected below in: "While the mapping is being resolved via
>>> the Map-Request/ Map-Reply process, the ILA-N can send the data
>>> packets to the underlay using the SIR address."
>>> I think it should be assumed in ILA that not queuing packets and not
>>> dropping packets because of resolution are requirements (too much
>>> latency hit).
>>> IMHO, these should not be hard requirements. Leveraging ILA-Rs for mapping
>>> resolution has another set of tradeoffs to be considered. An operator
>>> should be able to decide which set of tradeoffs makes sense for his/her
>>> particular scenario.
>> This is a hard requirement because caches are explicitly not required
>> for ILA to operate. They are *only* optimizations. If there is a cache
>> hit then packets presumably get optimized path, on a cache miss they
>> might take a suboptimal route, but packets still flow without being
>> blocked! This means that the worst-case DOS attack on the cache might
>> cause suboptimal routing; however, if resolution is required then the
>> worst-case attack becomes that packets don't flow and it's a much more
>> effective attack.
>> Performing the mapping resolution at the ILA-N doesn't mean that you can't
>> send the packets to the ILA-R to avoid the first-packet-drop. Those are two
>> different things. Traditionally in LISP, a possible deployment model is to
>> have a couple of RTRs with all the mappings in the site, so xTRs can use
>> them as default path while they are resolving mappings. In this scenario,
>> all the mapping resolution is done at the xTRs while the RTRs are only
>> forwarding "first-packets". We have seen this model working really well even
>> for large LISP deployments.
>>> In ILAMP, a redirect method is defined. On a cache miss the packet is
>>> forwarded and no other action is taken. If an ILA-R does
>>> transformation it may send back a mapping redirect informing the ILA-N
>>> of a transformation. The redirects must be completely secure (one
>>> reason I'm partial to TCP) and are only sent to inform an ILA-N about
>>> a positive response. To a large extent this neutralizes the above
>>> random address DOS attack. There are other means of attack on the
>>> cache, but the exposure is narrowed I believe.
>>> That model is supported in LISP via the use of Map-Notifies. However,
>>> moving the mapping resolution to the ILA-R comes at a cost. It's putting
>>> more load (in terms of both data and control plane) into an architectural
>>> component that is not easy to scale out, since it requires (for instance)
>>> reconfiguring the underlay topology.
>> I'm not seeing how this creates more load (i.e. the need for map request
>> packets is eliminated), but I really don't understand what
>> "reconfiguring the underlay topology" means!
>> Happy to try to clarify this. I'm talking about the load in the ILA-R. With
>> a "redirect" model, the ILA-R has to (1) serve as the data-plane default
>> path and (2) provide control-plane mapping resolution. This is centralizing
>> the data-plane and control-plane into a single component, the ILA-R.
>> Moreover, this will also require a lot of punts from the fast path to the
>> slow path in the ILA-R, which also has implications. With a request/reply
>> model, the control-plane resolution is performed at the edges in a
>> distributed fashion and the ILA-R only serves as data-plane default path to
>> avoid dropping traffic. The latter model alleviates the load in the ILA-Rs,
>> which reduces the need to scale them out.
> Yes, but you are ignoring the load on the mapping servers which also
> needs to scale. Additionally, if ILA-N is both forwarding a packet and
> sending a map request then this potentially doubles the packet load on
> the network and exacerbates the potential DOS attack where someone
> floods an ILA-N with packets having bogus destinations. There might be
> mitigations to this DOS attack, like heavy-hitters you mentioned, but
> we really need the details to see exactly how this works and how
> effective they are. On the surface of it, it looks like
> request/response model is susceptible to DOS especially when third
> parties are allowed to drive the process.
> lisp mailing list