On Tue, Mar 13, 2018 at 6:37 PM, Florin Coras <fcoras.li...@gmail.com> wrote:
> Not sure about ILA-R but typically when deploying LISP, RTR/Proxy-ITRs have
> enough memory to store most, if not all, of the identity to location
> mappings. Therefore, once in steady state, most of the requests to the
> mapping system are triggered by edge devices ITR/ILA-N.
ILA-Rs contain all the mappings for the shard they service. If they
don't have a mapping for a packet, then the packet is dropped.
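
A minimal sketch of that behavior, in Python. The class name `IlaRouter`, the dictionary-based table, and the method name are illustrative assumptions, not taken from any ILA draft:

```python
from typing import Dict, Optional


class IlaRouter:
    """Hypothetical ILA-R holding the complete mapping table for one shard."""

    def __init__(self, shard_mappings: Dict[str, str]) -> None:
        # SIR address -> locator; assumed to be complete for the shard.
        self.mappings = dict(shard_mappings)

    def forward(self, sir_address: str) -> Optional[str]:
        """Return the locator to forward to, or None when the packet is dropped."""
        # The table is authoritative for the shard, so a miss means there is
        # no such destination: drop instead of triggering any resolution.
        return self.mappings.get(sir_address)
```

The point is that a miss at the ILA-R never generates control-plane work; it only results in a drop.
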
> This then means that just rate limiting ITRs should be enough to avoid
> DOS-ing the control plane and the problem converts into one of trying to
> avoid providing sub-optimal paths to legitimate traffic due to attacker
> pressure. As Alberto mentioned, there are a number of solutions for
> determining both the attackers and the set of destinations that should be
> protected against cache evictions. The former can be used to determine the
> set of requests that should not be punted, while the latter ensures that
> mappings for popular destinations cannot be evicted by attacks.
Okay, but I still don't know where the details and analysis of these
solutions are. It's not enough to simply say that rate limiting is the
solution to the DOS threat. I looked at RFC7835, for instance, which
gives a nice analysis of the threat, but the suggested mitigations are
"careful deployment and configuration" and "Systematically applying
filters and rate limitation"-- that guidance is not particularly
enlightening or convincing. I am really hoping we can get something
more concrete for dealing with DOS threats in a control plane for ILA.
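
As a starting point for that kind of concreteness, here is a rough sketch of the two mechanisms mentioned above: a token bucket that limits map requests, and a cache in which protected destinations are exempt from eviction. All names and parameters (`TokenBucket`, `ProtectedCache`, the rate, burst and capacity values) are assumptions for illustration; neither LISP nor ILA specifies them.

```python
from collections import OrderedDict
from typing import Iterable, Optional


class TokenBucket:
    """Limits how many map requests may be sent per second (hypothetical)."""

    def __init__(self, rate: float, burst: float) -> None:
        self.rate = rate      # tokens added per second
        self.burst = burst    # maximum number of tokens that can accumulate
        self.tokens = burst
        self.last = 0.0       # timestamp of the previous call to allow()

    def allow(self, now: float) -> bool:
        # Refill for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class ProtectedCache:
    """LRU mapping cache in which protected destinations are never evicted."""

    def __init__(self, capacity: int, protected: Iterable[str]) -> None:
        self.capacity = capacity
        self.protected = set(protected)
        self.entries: "OrderedDict[str, str]" = OrderedDict()

    def get(self, dst: str) -> Optional[str]:
        if dst in self.entries:
            self.entries.move_to_end(dst)  # mark as most recently used
            return self.entries[dst]
        return None

    def put(self, dst: str, locator: str) -> bool:
        """Insert a mapping; return False if nothing could be evicted for it."""
        if dst not in self.entries and len(self.entries) >= self.capacity:
            # Evict the least recently used entry that is not protected.
            victim = next((k for k in self.entries if k not in self.protected), None)
            if victim is None:
                return False
            del self.entries[victim]
        self.entries[dst] = locator
        self.entries.move_to_end(dst)
        return True
```

What the sketch leaves open is exactly the part that needs analysis: how the rate and the protected set are chosen, and how an attacker could influence either.
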
> On Mar 13, 2018, at 4:27 PM, Tom Herbert <t...@quantonium.net> wrote:
> On Tue, Mar 13, 2018 at 3:50 PM, Alberto Rodriguez Natal (natal)
> <na...@cisco.com> wrote:
> On 3/13/18, 1:05 PM, "Tom Herbert" <t...@quantonium.net> wrote:
> This is reflected below in: "While the mapping is being resolved via
> the Map-Request/ Map-Reply process, the ILA-N can send the data
> packets to the underlay using the SIR address."
> I think it should be assumed in ILA that not queuing packets and not
> dropping packets because of resolution are requirements (too much
> latency hit).
> IMHO, these should not be hard requirements. Leveraging ILA-Rs for mapping
> resolution has another set of tradeoffs to be considered. An operator should
> be able to decide which set of tradeoffs makes sense for his/her particular
> This is a hard requirement because caches are explicitly not required
> for ILA to operate. They are *only* optimizations. If there is a cache
> hit then packets presumably get an optimized path; on a cache miss they
> might take a suboptimal route-- but packets still flow without being
> blocked! This means that the worst-case DOS attack on the cache might
> cause suboptimal routing; however, if resolution is required then the
> worst-case attack becomes that packets don't flow, and it's a much more
> effective attack.
> Performing the mapping resolution at the ILA-N doesn't mean that you can't
> send the packets to the ILA-R to avoid the first-packet-drop. Those are two
> different things. Traditionally in LISP, a possible deployment model is to
> have a couple of RTRs with all the mappings in the site, so xTRs can use
> them as default path while they are resolving mappings. In this scenario,
> all the mapping resolution is done at the xTRs while the RTRs are only
> forwarding "first-packets". We have seen this model working really well even
> for large LISP deployments.
> In ILAMP, a redirect method is defined. On a cache miss the packet is
> forwarded and no other action is taken. If an ILA-R performs a
> transformation it may send back a mapping redirect informing the ILA-N
> of a transformation. The redirects must be completely secure (one
> reason I'm partial to TCP) and are only sent to inform an ILA-N about
> a positive response. To a large extent this neutralizes the above
> random address DOS attack. There are other means of attack on the
> cache, but the exposure is narrowed I believe.
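
Roughly, the ILA-N side of that redirect model could look like the sketch below. The names (`IlaNode`, `handle_redirect`) and the `authenticated` flag, which stands in for a secure transport such as TCP, are hypothetical and not taken from the ILAMP draft.

```python
from typing import Dict, Tuple


class IlaNode:
    """Hypothetical ILA-N whose mapping cache is only an optimization."""

    def __init__(self) -> None:
        self.cache: Dict[str, str] = {}  # SIR address -> locator

    def send(self, sir_address: str) -> Tuple[str, str]:
        """Return (path, next-hop address) for a packet; never queues or drops."""
        locator = self.cache.get(sir_address)
        if locator is not None:
            return ("direct", locator)
        # Cache miss: forward the packet unmodified using the SIR address so
        # that it reaches an ILA-R. No map request is generated, so bogus
        # destinations cause no control-plane traffic from this node.
        return ("via-ila-r", sir_address)

    def handle_redirect(self, sir_address: str, locator: str, authenticated: bool) -> bool:
        """Install a mapping from a redirect; ignore unauthenticated ones."""
        if not authenticated:
            return False
        self.cache[sir_address] = locator
        return True
```

Because a redirect is only sent for a destination the ILA-R actually transformed, random bogus destinations never populate the cache in this sketch.
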
> That model is supported in LISP via the use of Map-Notifies. However, moving
> the mapping resolution to the ILA-R comes at a cost. It's putting more load
> (in terms of both data and control plane) into an architectural component
> that is not easy to scale out, since it requires (for instance)
> reconfiguring the underlay topology.
> I don't see how this creates more load (i.e. the need for map request
> packets is eliminated), but I really don't understand what
> "reconfiguring the underlay topology" means!
> Happy to try to clarify this. I'm talking about the load in the ILA-R. With
> a "redirect" model, the ILA-R has to (1) serve as the data-plane default
> path and (2) provide control-plane mapping resolution. This is centralizing
> the data-plane and control-plane into a single component, the ILA-R.
> Moreover, this will also require a lot of punts from the fast path to the
> slow path in the ILA-R, which also has implications. With a request/reply
> model, the control-plane resolution is performed at the edges in a
> distributed fashion and the ILA-R only serves as data-plane default path to
> avoid dropping traffic. The latter model alleviates the load in the ILA-Rs,
> which reduces the need to scale them out.
> Yes, but you are ignoring the load on the mapping servers, which also
> need to scale. Additionally, if an ILA-N is both forwarding a packet and
> sending a map request then this potentially doubles the packet load on
> the network and exacerbates the potential DOS attack where someone
> floods an ILA-N with packets having bogus destinations. There might be
> mitigations to this DOS attack, like heavy-hitters you mentioned, but
> we really need the details to see exactly how this works and how
> effective they are. On the surface of it, it looks like the
> request/response model is susceptible to DOS, especially when third
> parties are allowed to drive the process.
> lisp mailing list