Authors,
Overall this is a really meticulous document; it's clear that the authors have
spent a lot of time thinking about security threats as they relate to the LISP
protocol. Most of my comments below are about wording and organization; there
are only a few technical questions/comments.
Comments to section 1: Introduction:
Might want to add a note that the security evaluation is only from the
perspective of use of LISP on the public Internet for multi-homing (which is
inferred from the fact that the LISP specifications are only for that use
case).
I believe this document would also benefit from some overall statement
regarding the parameters and posture of the analysis, i.e., what standard are
we evaluating the protocol against? I think this intent is already in the
document: that this document is supposed to see how the LISP protocol might
change the posture of networks that deploy it, vs. networks that do not deploy
LISP.
In addition, section 8 doesn't seem to fit into the flow of the document. It
might fit better within the xTR threats section, giving an outline that covers
xTR (control/data), Interworking Components, and Mapping System components in
succession.
Finally, this document could use a summary - either in each of the major
subsections (control and data plane) or at the end, summarizing the threats and
determining if there are any unmitigated vulnerabilities. The recommendation
section is a good section, but doesn't align exactly with the flow of the
draft. So I found it hard to find if there are any serious unmitigated threats
in the protocol or implementation.
Comments to section 3: On Path Attackers
"We do not consider that LISP has to cope with such kind of attackers."
Seems wordy and may hide some meaning. I think you mean that the authors agree
with the LISP protocol's design decision to not consider this type of threat.
Maybe a clear statement like: "As with IP, LISP relies on higher layer
cryptography to secure packet payloads from on path attacks".
The second paragraph of this section describes in detail something you are not
going to consider (a subset of on-path attackers using time-shifted replay
attacks). While interesting, I'm not sure it is important given that you've
already decided not to consider on-path attacks in the first paragraph. I
would consider deleting it - maybe adding a parenthetical reference (including
time-shifted replay attacks) in the first paragraph of section 3.
Comments to 4: Off Path Attackers Reference environment
Is there a reason why LR1 and LR2 connect to ISP-1 and ISP-2, while LR3 and
LR4 connect only to 'Internet'? I don't think it matters but I am just curious
if you did it for a reason.
Comments to 5.x)
A general comment about section 5 - many of these data plane black-holing
issues are no different than IP - if I corrupt the IP data-plane (BGP) then
I can black-hole traffic. I think it's important to note (like you did
in section 4) where LISP is different or introduces new potential
vulnerabilities.
I think the difference from IP+BGP is mainly in its ability to encapsulate and
redirect things to arbitrary destinations. A short statement about this
would help put this section in context.
Further down in the section:
"A key component of the overall LISP architecture is the EID-to-RLOC
Cache. The EID-to-RLOC Cache is the data structure that stores the
bindings between EID and RLOC (namely the "mappings") to be used
later on."
Is a little unclear to me. It might be clearer to say in the second sentence:
"The EID-to-RLOC Cache (also called the Map-Cache) is the data structure that
stores a copy of the mapping retrieved from a remote ETR's mapping database
via the LISP control plane."
So the above may not be great wording either, but the point I am trying to
make is that the map-cache is a copy of the ETR's map-database, which is
generally stored until replaced by a new mapping, or the mapping's TTL
expires.
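To make the "copy with a lifetime" point concrete, here is a minimal sketch
of a map-cache in Python. The class and method names (MapCache, insert,
lookup) are hypothetical and do not come from any LISP implementation; it
only illustrates that an entry lives until it is replaced or its TTL expires.

```python
import ipaddress


class MapCache:
    """Toy ITR map-cache (hypothetical): EID prefix -> (RLOCs, expiry time)."""

    def __init__(self):
        self._entries = {}

    def insert(self, eid_prefix, rlocs, ttl_seconds, now):
        # A newer mapping for the same EID prefix replaces the cached copy.
        net = ipaddress.ip_network(eid_prefix)
        self._entries[net] = (list(rlocs), now + ttl_seconds)

    def lookup(self, eid, now):
        # Longest-prefix match over unexpired entries; expired ones are purged.
        addr = ipaddress.ip_address(eid)
        best = None
        for net, (rlocs, expires) in list(self._entries.items()):
            if now >= expires:
                del self._entries[net]
                continue
            if addr.version == net.version and addr in net:
                if best is None or net.prefixlen > best[0].prefixlen:
                    best = (net, rlocs)
        return None if best is None else best[1]
```

Anything that can write into this structure (that is, anything the ITR
accepts from the control plane) decides where traffic is encapsulated to.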
In general this document uses the word 'can' to describe a possible attack
scenario. I would think it would be better to say 'could', which is a minor
nit but one that kept coming to mind as I read the document.
Comments to section 5.2)
This section, located in the data plane, has many repetitive references to
using the control plane to insert information into the map-cache data
structure. Whenever I see a list of 8 paragraphs all with similar wording at
the end (similarly to above, again, again, etc.) :-) I suggest revising the
structure of these paragraphs.
In this specific case, I think it's appropriate to say that the ITR
implementation can only, in general, trust what it learns from the control
plane. Compromising the control plane therefore will compromise the integrity
and correctness of any copies made of this control plane's data (the
map-cache). You could then serially list the Bad Things (TM) that could happen
once the Control Plane is compromised.
Some further comments on the individual attacks you mention:
(Reachability poisoning:) .... "If reachability information is not verified
through the control-plane this attack can be simply achieved by
sending a spoofed packet with swapped or all locator status
bits reset."
"simply achieved" I think is a qualitative analysis that should not be
here - I would reword to say "this attack could be achieved" .
Instance ID poisoning: The LISP protocol allows using a 24-bit
identifier to select the forwarding table to use on the
decapsulating ETR to forward the decapsulated packet. By
spoofing this attribute the attacker is able to redirect or
blackhole inbound traffic.
I think you mean, above, that by spoofing the IID value the attacker
might be able to cause traffic to either be dropped or be decapsulated
and then placed into the incorrect VRF at the destination ETR... is that
correct?
"If the above listed attacks succeed, the attacker has the means of
controlling the traffic."
This seems an over-broad summary of the above. I suggest something like:
"If the ITR's map-cache is compromised (likely via compromising the LISP
control plane) it is possible that traffic may be redirected (encapsulated
to the wrong destination) or dropped by the ITR."
It might also be good to make a recommendation that if data plane redirection
is a critical concern, then deploying some sort of IPsec or TLS based security
on a layer above LISP (just like you would on top of IP) is a good idea.
Comments on section 5.4.1:
"Rate limitation, as described in
[I-D.ietf-lisp], does not allow sending high number of such a
request, resulting in the attacker saturating the rate with these
spoofed packets."
The above assumes the implementation of the rate-limiter is primitive enough
that map-requests generated by these types of failures are classified, and
limited, in the same bucket as map-requests generated by (legitimate) data
packets. It might be better to make a recommendation that implementations
should consider what instigated the map-request (a data packet, and even from
which source EID; an SMR; a reverse cache check failure) and selectively limit
each accordingly. Section 11 has a nice description of this - and seems to be
in contradiction to the claims in the above sections.
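As a rough illustration of what I mean by limiting each instigator
separately, here is a sketch using one token bucket per trigger. The trigger
names, rates, and class names are all assumptions of mine, not anything
taken from [I-D.ietf-lisp] or an existing implementation.

```python
class TokenBucket:
    """Simple token bucket; `now` is passed in so behaviour is deterministic."""

    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec
        self.burst = burst
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now):
        # Refill according to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class MapRequestLimiter:
    """One bucket per Map-Request trigger (trigger names are hypothetical)."""

    def __init__(self, limits):
        # limits: {trigger: (rate_per_sec, burst)}
        self._buckets = {t: TokenBucket(r, b) for t, (r, b) in limits.items()}

    def allow(self, trigger, now):
        return self._buckets[trigger].allow(now)
```

With separate buckets, an attacker who exhausts the "smr" or
"cache-check-failure" budget with spoofed packets does not starve
Map-Requests triggered by legitimate data packets. Keying the data-packet
bucket additionally by source EID would be a natural extension.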
Comments on section 5.4.2:
Maybe it should be noted that Map-Versioning is an optional part of the LISP
protocol and not all implementations support it (e.g., "If map-versioning has
been implemented by an xTR...").
My comments to section 5.4.1 regarding rate-limiting apply here too, I think.
Comments on section 5.4.4:
I think the title should read "instance id" instead of "id instance".
Comments on Section 6.1:
"The first possible exploitation is the P bit. The P bit is used to
probe the reachability of remote ETRs in the control plane."
Usually map-requests with the probe bit set are sent from the ITR's RLOC to
the ETR's RLOC, since sending them via the control plane might not guarantee
that they arrive on the appropriate ETR.
" Furthermore, appending Map-Records to Map-Request messages represents
a major security risk since an off-path attacker could generate a
(spoofed or not) Map-Request message and include in the Map-Reply
portion of the message mapping for EID prefixes that it does not
serve."
While some sites have chosen to accept those security risks and use gleaning,
I agree that (like your earlier reference to trusting LSBs), gleaning is
best used in more trusted environments, rather than the public Internet. In
addition, I think this paragraph should reference section 6.3.
Comments on Section 6.2:
"Negative Map-Reply messages are used to support PTR and interconnect
the LISP Internet with the legacy Internet."
I think you mean to say that Negative Map-Reply messages are used to indicate
non-LISP prefixes. ITRs can, if needed, be configured to send all traffic
destined for non-LISP prefixes to a Proxy-ETR.
"Note, however, that the nonce only confirms that the Map-Reply was
sent by the ETR that received the Map-Request. It does not validate
the content of the Map-Reply message."
While the above seems correct, I had to read the text three times to make
sure (was it originally written by Dino? :-)). It might be clearer to say
"the nonce only confirms that the Map-Reply received was sent in response to
a Map-Request that was sent; it does not validate the contents of that
Map-Reply".
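A small sketch of what the nonce check does and does not buy you. The names
here (PendingRequests, new_request, accept_reply) are made up for
illustration, and a 64-bit random nonce is assumed.

```python
import secrets


class PendingRequests:
    """Tracks outstanding Map-Requests by nonce (hypothetical helper)."""

    def __init__(self):
        self._pending = {}  # nonce -> EID that was requested

    def new_request(self, eid):
        nonce = secrets.randbits(64)  # assumed 64-bit random nonce
        self._pending[nonce] = eid
        return nonce

    def accept_reply(self, nonce):
        # True only if the nonce matches an outstanding request. The entry
        # is consumed so the same nonce cannot be replayed. Note that this
        # says nothing about whether the mapping in the reply is truthful.
        return self._pending.pop(nonce, None) is not None
```

An off-path attacker has to guess the nonce to get a reply accepted, but
whoever does answer with the right nonce can still put anything in the
mapping records.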
"In addition, an attacker can perform EID-to-RLOC Cache overflow
attack by de-aggregating (i.e., splitting an EID prefix into
artificially smaller EID prefixes) either positive or negative
mappings."
I'm not sure I follow the above, in a couple of different ways. First,
it seems you are suggesting that a malicious ETR might fragment its
eid-to-rloc database and then instigate traffic to its site, therefore
creating a lot of state in the corresponding ITR's map-cache. But this doesn't
follow the theme of the rest of the section, which talks about off-path
attackers sending map-replies. Second, you mention that this type of thing
could occur with negative mappings. I can only imagine this happening with
negative mappings in the case where an ETR, in conjunction with a mapping
service provider's Map-Server, allowed for such de-aggregation into the
control plane (DDT or ALT), or if a map-resolver was compromised and crafting
specifically bogus information. Neither of these scenarios seems in scope
with the rest of the section. I would recommend removing this paragraph.
Comments on section 6.3:
I suggest either an intro or summary (or both) to this very detailed section
that sums up what you want to say, which I think is:
"Appending Map-Data into Map-Requests, and then having an ETR glean that
information on receipt of said map-request, bypasses all security protections
of the LISP control plane. It is slightly less evil if used with verification.
Therefore we recommend you do not do it unless you have a damn good reason to
do so, or are in a trusted (read: not the public Internet) environment."
Comments on section 7:
Unfortunately the Reference topology from section 4 doesn't include
interworking components. Some of this might be easier if it did, but I don't
think it's a critical issue.
"To limit such an issue it is recommended to use the current practice based
on firewalls and ACLs on the machine running the Proxy-ITR service."
This seems a little unclear to me, I suggest something like:
"To limit Proxy-ITRs being used as relays for attacks, Proxy-ITR operators
are encouraged to implement best practices for data plane access control on
the Proxy-ITR and the border of the network, that is the edge of the scope of
the Proxy-ITR's announcement of the EID-Prefix."
Note that Access Control Lists placed in front of Proxy-ETRs could be
populated with the RLOCs of ITRs that are allowed to use said Proxy-ETR, which
will greatly limit the possibility of traffic being injected by 3rd parties.
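The ACL idea above could look roughly like this. It is only a sketch: the
function names and the example prefixes (documentation ranges) are mine.

```python
import ipaddress


def build_acl(allowed_itr_rlocs):
    """Parse the configured ITR RLOCs (addresses or prefixes) once."""
    return [ipaddress.ip_network(p) for p in allowed_itr_rlocs]


def permit(acl, outer_source):
    """Permit an encapsulated packet only if its outer source address
    (the sending ITR's RLOC) is covered by an ACL entry."""
    src = ipaddress.ip_address(outer_source)
    return any(src.version == net.version and src in net for net in acl)


acl = build_acl(["192.0.2.10/32", "198.51.100.0/24", "2001:db8::/32"])
print(permit(acl, "198.51.100.77"), permit(acl, "203.0.113.5"))
```

A source-address filter like this is only as good as the anti-spoofing in
place on the path to the Proxy-ETR, so it limits rather than eliminates
injection by 3rd parties.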
Comments on Section 8:
One general comment, this section has many long paragraphs, and might benefit
from some additional hierarchy.
"In this section, we discuss the threats that could be caused by
malicious xTRs."
This section seems to be somewhat contradicting section 5.1, which implies
that compromised xTRs are out of scope. If I understand your intent,
I think you mean to say that you want to analyze how the LISP architecture
handles parts of it being compromised. I think there is a
disconnect in the flow of the document - to this point it has been a
thorough analysis of the protocol, and now it becomes a review of the LISP
architecture's operational security posture. Now I definitely think a
discussion like this belongs in this document, but I think further
introductory text could frame this either in the introduction or via
organizing the outline differently. In fact, I would consider
changing the wording of the introduction - instead of having two main parts,
the control and data plane, this document has two main parts - a protocol
security analysis and a system security analysis.
Ok, some specific comments about this section's text follow:
"Malicious xTRs are probably the most serious threat to the LISP
control plane from a security viewpoint."
I might suggest a bit more detail here, maybe something talking about
why it's so important, and why any shared control plane is easiest
to attack from within the system (exactly like BGP).
"The impact of a compromised lisp control plane can be severe, and
the most effective way to attack any multi-organizational control
plane is from within the system itself"
Later in section 8:
"The current LISP specification briefly discusses the overclaiming
problem [I-D.ietf-lisp], but does not propose any specific solution
to solve the problem. Nevertheless, [I-D.ietf-lisp-sec] proposes a
solution to protect LISP against overclaiming attacks under the
assumption that the mapping system can be trusted."
Not sure if the above commentary adds anything to the document. The
overclaiming threat was discussed in Stockholm and Hiroshima, and determined
to be important enough that a detailed proposal was developed as a separate
draft. In my opinion it solves the problem it was designed for elegantly and
comprehensively. (Also, what cryptographic solution to security doesn't have a
trust anchor?)
Later in section 8:
"An important point to note about this flooding attack is that it
reveals a limitation of the LISP architecture. A LISP ITR relies on
the received mapping and possible reachability information to select
the RLOC of the ETR that it uses to reach a given EID or block of
EIDs. However, if the ITR made a mistake, e.g., due to
misconfiguration, wrong implementation, or other types of errors and
has chosen a RLOC that does not serve the destination EID, there is
no easy way for the LISP ETR to inform the ITR of its mistake. A
possible solution is to enforce an ETR to perform a reachability test
with the selected ITR as soon as there is LISP encapsulated traffic
between the two."
As you say, an ITR can make a reachability test (probe) to an ETR as soon
as it receives a map-reply (in fact our implementation does this). Also,
an ETR that receives traffic by mistake can respond by sending a
Solicit-Map-Request back to the sending ITR. So I strongly disagree with the
wording above.
Comments on section 9:
Unfortunately, experience with BGP on the global Internet has shown
that BGP is subject to various types of misconfiguration problems and
security attacks. The SIDR working group is developing a more secure
inter-domain routing architecture to solve this problem ([RFC6480]).
And yet, in spite of this, BGP manages to do quite well on the global
Internet for tens of thousands of organizations and hundreds of
thousands of routes. SIDR comes at a complexity and operational cost
which, in the case of ALT, might be low enough to see deployment. In
short, there is no free lunch. The ALT would be at least as secure
as the current global BGP infrastructure - and isn't that the bar we
are trying to meet to see LISP be deployable?
Comments on Section 10.1. Map Server:
"Similarly to the previous case, a malicious ETR can register an
invalid EID-prefix to attract Map-Requests or to redirect them to a
target to mount a DoS attack. To avoid this kind of attack, the Map
Server must check that the prefixes registered by an ETR belong to
that ETR. One method could be to manually configure EID-prefix
ranges that can be announced by ETRs."
In every MS implementation I'm aware of, the MS strictly checks the
EID prefix being registered against a static configuration. We
use operational techniques to validate the ownership of the prefix
by the operator of the ETR.
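In sketch form, the check amounts to something like the following. The site
identifiers, the config layout, and the choice to accept more-specific
prefixes inside a configured range are assumptions for illustration, not a
description of any particular Map-Server.

```python
import ipaddress


def registration_allowed(configured, etr_id, eid_prefix):
    """Accept a Map-Register only if the EID prefix falls inside a range
    statically configured for that ETR. `configured` maps a (hypothetical)
    site/ETR identifier to its list of permitted EID prefixes."""
    requested = ipaddress.ip_network(eid_prefix)
    for allowed in configured.get(etr_id, []):
        net = ipaddress.ip_network(allowed)
        # subnet_of() raises TypeError across IP versions, so check first.
        if requested.version == net.version and requested.subnet_of(net):
            return True
    return False


config = {"site-a": ["10.1.0.0/16"], "site-b": ["10.2.0.0/16"]}
print(registration_allowed(config, "site-a", "10.1.5.0/24"))
print(registration_allowed(config, "site-a", "10.0.0.0/8"))
```

The second call is the overclaiming case: site-a asks for a covering prefix
it was never configured for, and the registration is refused.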
Comments on section 10.2. Map Resolver:
While it is allowed within the architecture to have a caching Map-Resolver,
to my knowledge one has not been implemented. The reason is that,
aside from security, cache coherency in an environment that can support
fast moves (like LISP-MN) is very hard. So some mention of this
might be good - it's not like we have a mixture of caching and non-caching
resolvers deployed...
Comments on section 11
This is a really nice set of recommendations; however, it might be nice
to cross reference the recommendations with the sections (gleaning, etc.)
that they apply to. For example, the rate limiting section applies to 5.4.1,
etc.
I think this would go a long way toward addressing my comment in the beginning
that in this document it's hard to get a concise list of threats and
recommendations for their mitigation, and any outstanding threats that need
further work.
"In order to mitigate flooding attacks it would be worth consider
developing secure mechanisms to allow an ETR to indicate to an ITR
that it does not serve a particular EID or block of EIDs."
We have this, it's called an SMR :-)
-Darrel