Re: [rrg] LISP PMTU - 2 methods in draft-farinacci-lisp-11

Robin Whittle Tue, 27 Jan 2009 05:48:16 -0800

Short version:     Luigi suggests a method by which the ITR
                   can perform PMTUD to an ETR without having
                   to cache the original packet.  This requires
                   three things to be done to achieve the goals
                   without the most obvious DoS vulnerabilities:

                   1 - The intermediate router sends back at least
                       16 bytes of original packet in the PTB -
                       8 bytes more than RFC 1191 requires.

                   2 - It is acceptable for the system not to
                       deliver the first "too long" packet.

                   3 - The sending host sends a second "too long"
                       packet after the ITR has received and
                       processed the PTB from the intermediate
                       router.

                   If the system is supposed to work without the
                   PTB check requiring the LISP nonce in the PTB's
                   original packet fragment, then I think the ITR
                   still needs to store some details of recently
                   sent packets, to force the attacker to set some
                   other parts of the PTB's original packet fragment
                   to hard-to-guess actual values of packets sent.

                   I modify my critique of LISP accordingly - this
                   looks like a reasonable way to do PMTUD without
                   the ITR maintaining a lot of state.  However the
                   smallest amount of state would be the LISP
                   nonce of recently sent packets and this requires
                   all intermediate routers which might have an
                   MTU problem to send back more than the RFC 1191
                   minimum of 8 bytes.

                   The process of checking PTBs in a busy ITR could
                   become so expensive (due to the large number of
                   LISP nonces or other details to be checked, of
                   recently sent packets) that it could become a DoS
                   vulnerability in itself.

                   How I think an off-path attacker could find the
                   addresses of some, many or most ITRs which are
                   tunneling packets to some ETR the attacker wants
                   to DoS.

                   An ITR occasionally needs to allow longer packets
                   to the ETR than its current MTU setting allows -
                   otherwise it has no way of adapting to an
                   increased actual path MTU.

Hi Luigi,

Thanks for your reply, in which you wrote:

> As a member of the OpenLISP team, let me say that we do not
> *accept* or *reject* anything.  AFAICT, that is left to the
> community and to the (future) LISP WG.

OK - "rejected" was my way of stating that you didn't find the
stateless approach suitable for OpenLISP:

http://tools.ietf.org/html/draft-iannone-openlisp-implementation-01#section-6.8.1

   6.8.1. OpenLISP local MTU Management

   During preliminary tests, we observed that the MTU issue is at the
   origin of many problems.  OpenLISP does not (and will not)
   implement the fragmentation mechanism proposed in Sec. 5.4 of
   [I-D.farinacci-lisp].  The reason is because the proposed method
   sounds very primitive and does not appear to be efficient.  The
   original LISP specification is based on an architectural constant
   used by the xTR to limit the MTU of LISP encapsulated packets.
   OpenLISP uses a more advanced solution, based on the real MTU of
   the local RLOCs present on the xTR, as described below.

> What we proposed is just an alternative approach that takes advantage of
> ICMP messages (if they are present).

OK - your approach makes sense.  I am reading it now.

>> Stateful approach for all packets
>> ---------------------------------
>>
>> I am reading the LISP version - I have not looked in detail at the
>> OpenLISP source of this approach.
>>
>> This makes no reference to IPv4 DF=0 packets.  So this approach of
>> the ITR sending a PTB packet to the sending host when a DF=0 packet
>> exceeds some length is not going to result in any action on the part
>> of the sending host.  Such a DF=0 packet will be dropped by the ITR.
>>  That may be OK - Ivip will do much the same - but it needs to be
>> specified clearly.
>   
> The text that we provided is just a first cut. I agree that there are
> several points to clarify.

OK.

>> This approach of determining the MTU to each ITR by receiving ICMP
>> messages from an intermediate router needs to be done securely.
>   
> Am I wrong if I say that this means to change the ICMP protocol?

I meant securely enough to prevent spoofing by an off-path attacker
intent on DoS.  (Full description below.)

The LISP nonce should do the trick - but that requires the
intermediate router to send back to the ITR, in the PTB message, the
8 byte LISP header after the UDP header.  RFC 1191 only requires it
to send back the outer IP header and the next 8 bytes - which in a
LISP packet is the UDP header.
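
The byte arithmetic behind this can be sketched as follows.  This is
only an illustration, assuming an outer IPv4 packet in which the UDP
header is followed directly by the 8 byte LISP header described
above; the constant and function names are made up for the example.

```python
# Sizes in bytes. UDP_HEADER is standard; LISP_HEADER is the 8 byte
# header described in the text above.
UDP_HEADER = 8
LISP_HEADER = 8
RFC1191_MIN_ECHO = 8  # bytes past the IP header a router must echo


def bytes_needed_after_ip_header() -> int:
    """Bytes past the outer IP header a PTB must echo to cover the LISP header."""
    return UDP_HEADER + LISP_HEADER


def ptb_covers_lisp_header(echoed_after_ip: int) -> bool:
    """True if the echoed fragment reaches the end of the LISP header."""
    return echoed_after_ip >= bytes_needed_after_ip_header()
```

With the RFC 1191 minimum, ptb_covers_lisp_header(RFC1191_MIN_ECHO)
is False: the echoed fragment ends exactly where the UDP header ends.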

This was a pretty short-sighted requirement, I think.  I don't know
how DFZ routers behave, but I recall someone (Bill Herrin?) stating
some months ago that it was common to send back more bytes than the
bare minimum of 8.  If LISP or something similar were implemented in 5
years' time, I guess it wouldn't be too much to ask the router
manufacturers to create firmware updates if their routers don't
already send back a decent number of bytes such as 32 or whatever.

Without this, the ITR can't "securely" respond to ICMP PTB messages.

> Our goal was not to introduce new mechanisms in the core, but rather to
> take advantage of what exists.

OK.

>> It requires the ITR to cache significant amounts of information for
>> every packet it sends which might trigger such a PTB.
>   
> Why should the ITR cache information?

Skip the following if you like:

    My understanding of what ITRs would have to do, before I read the
    rest of your message:

    For every encapsulated packet the ITR sends which might generate
    a PTB message in some router en-route to the ETR, the ITR needs
    to cache - I mean store for a few seconds - the start of the
    original packet and the nonce it put in the LISP header of the
    encapsulating header.   I guess the ITR needs to store it for a
    few seconds, since it might take that long for the PTB to come
    back.  Technically, maybe it should hold it for longer, but my
    guess is that 2 seconds should do the trick.

    Then, when a PTB arrives, the ITR needs to compare the LISP nonce
    in the PTB's fragment of the initial packet (see discussion above
    about how this requires more bytes than RFC 1191 requires) with
    its cached information about recently sent packets.  It should
    also check other things such as the destination address in the
    packet fragment, to ensure it is the same as the ETR to which it
    tunnelled the packet.  This will enable the ITR to uniquely
    identify one of the packets it recently sent.  Then it can adjust
    for that ETR its record of path MTU, and generate a PTB packet to
    be sent to the sending host, from its cached fragment of the
    start of the original packet.

    After that, the ITR can use the stored value of MTU for that ETR
    and use it to decide whether to accept or reject with a PTB any
    packet to be tunneled to that ETR.
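
A minimal sketch of the bookkeeping just described, assuming the
nonce alone is used as the lookup key and that nonce collisions can
be ignored.  The class and method names are hypothetical and come
from neither LISP draft; the 2 second lifetime is the guess made
above.

```python
import time
from typing import Dict, Optional, Tuple

NONCE_TTL = 2.0  # seconds; the rough guess made in the text above


class RecentPackets:
    """Short-lived record of encapsulated packets, keyed by LISP nonce."""

    def __init__(self, ttl: float = NONCE_TTL) -> None:
        self.ttl = ttl
        # nonce -> (time sent, ETR address, start of the original packet)
        self._by_nonce: Dict[int, Tuple[float, str, bytes]] = {}

    def remember(self, nonce: int, etr: str, orig_prefix: bytes,
                 now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._by_nonce[nonce] = (now, etr, orig_prefix)

    def match_ptb(self, nonce: int, ptb_dst: str,
                  now: Optional[float] = None) -> Optional[bytes]:
        """Return the stored original-packet prefix if the PTB matches
        a recently sent packet, otherwise None."""
        now = time.monotonic() if now is None else now
        entry = self._by_nonce.get(nonce)
        if entry is None:
            return None              # nonce never sent: likely spoofed
        sent_at, etr, prefix = entry
        if now - sent_at > self.ttl:
            del self._by_nonce[nonce]
            return None              # too old to trust
        if etr != ptb_dst:
            return None              # fragment names a different ETR
        return prefix
```

The returned prefix is what the ITR would use to build the PTB for
the sending host.  A real ITR would also need to expire entries in
the background rather than only on lookup.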

I don't think your I-D mentions this, but the ITR also needs
to occasionally let a longer packet be encapsulated to this ETR - a
packet long enough that once encapsulated it would exceed the ITR's
current notion of path MTU to this ETR.  This will only occur if and
when one of the following occurs:

  1 - The original sending host, in the same communication session,
      tries its luck with a longer packet for the same reason.  RFC
      1191 specifies that a sending host should not try a longer
      packet to a given destination address for 10 minutes after
      receiving a PTB.

  2 - Some other sending host sends a longer packet through this
      ITR to an EID address which is mapped to the same ETR.

  3 - The same sending host, in the same or another application,
      sends a long packet to a different destination address,
      which may or may not be in the same EID as the first
      destination address, and which is mapped to the same ETR.

Then, if the longer packet does not result in a PTB, the ITR should
probably raise its notion of the MTU to that ETR accordingly.

However, packet loss here (loss of the too long packet before the
intermediate router with the limiting MTU, or loss of the PTB message
from that router) could result in a too-long MTU setting in the ITR.
This is not such a problem, since it would probably soon be corrected
if subsequent packets passed through the ITR due to this too-high
value and did result in a PTB being received by the ITR.
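
The occasional upward probe could look roughly like this.  It is a
sketch only: the probe interval, the wait for a PTB and all names are
assumptions of mine, not anything specified in either I-D.

```python
from typing import Optional, Tuple


class ProbingMtu:
    """Per-ETR path MTU which occasionally admits one over-long packet."""

    def __init__(self, mtu: int, probe_interval: float = 600.0,
                 probe_wait: float = 2.0) -> None:
        self.mtu = mtu
        self.probe_interval = probe_interval  # assumed gap between probes
        self.probe_wait = probe_wait          # assumed wait for a PTB
        self.last_probe = float("-inf")
        self.pending: Optional[Tuple[int, float]] = None  # (length, sent at)

    def should_admit(self, encap_len: int, now: float) -> bool:
        if encap_len <= self.mtu:
            return True
        # Over the current MTU: admit only as an occasional probe.
        if self.pending is None and now - self.last_probe >= self.probe_interval:
            self.pending = (encap_len, now)
            self.last_probe = now
            return True
        return False

    def on_ptb(self, reported_mtu: int) -> None:
        """A (validated) PTB arrived: the probe failed."""
        self.pending = None
        self.mtu = min(self.mtu, reported_mtu)

    def tick(self, now: float) -> None:
        """No PTB within probe_wait: assume the longer packet got through."""
        if self.pending is not None and now - self.pending[1] >= self.probe_wait:
            self.mtu = max(self.mtu, self.pending[0])
            self.pending = None
```

As noted above, a lost probe or lost PTB makes tick() raise the MTU
wrongly; the next over-long packet which does draw a PTB corrects it
through on_ptb().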

The Ivip approach can use PTBs from intermediate routers, but does
not rely on them absolutely.  It uses positive and negative
acknowledgement from the ETR for a small subset of packets which,
once encapsulated, are of a length which falls within the ITR's "zone
of uncertainty" for the path MTU to this ETR.  This zone is reduced
by every such packet and would, in general, quickly diminish to zero
- after which the technique is only needed occasionally to test
whether the MTU limit has risen.
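
To illustrate the "zone of uncertainty" idea only (this is not Ivip's
actual design, which is not specified here), the zone can be thought
of as the gap between the longest length known to arrive and the
shortest length known to fail.  All names are hypothetical.

```python
class MtuZone:
    """Range of encapsulated lengths whose deliverability is unknown."""

    def __init__(self, known_ok: int, known_too_big: int) -> None:
        self.known_ok = known_ok            # longest length known to arrive
        self.known_too_big = known_too_big  # shortest length known to fail

    def in_zone(self, length: int) -> bool:
        return self.known_ok < length < self.known_too_big

    def on_ack(self, length: int) -> None:
        """Positive acknowledgement: this length was delivered."""
        self.known_ok = max(self.known_ok, length)

    def on_nack(self, length: int) -> None:
        """Negative acknowledgement: this length did not arrive."""
        self.known_too_big = min(self.known_too_big, length)

    def width(self) -> int:
        """Number of lengths still uncertain."""
        return max(0, self.known_too_big - self.known_ok - 1)
```

Every acknowledged packet whose length falls in the zone moves one of
the two bounds inward, which is why the zone shrinks with use.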

I think it would be best to store the path MTU value in a separate body
of data from however you store the mapping information.  The mapping
information is indexed on EID, I assume.  You might have in the
mapping cache thousands of entries which all use the same ETR.  It
makes no sense to store the ETR's path MTU in every one of those
entries.  I think the encapsulation process needs to check the path
MTU to the ETR after it has determined which ETR address to tunnel
each packet to.
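
A small sketch of that separation, assuming exact-match prefix keys
rather than a real longest-prefix lookup, and an arbitrary default
MTU.  The table names and addresses are invented for the example.

```python
from typing import Dict

DEFAULT_PATH_MTU = 1500  # assumed starting value, not taken from any draft

# Mapping cache, indexed on EID prefix.  Many entries may share one ETR.
eid_to_etr: Dict[str, str] = {
    "10.1.0.0/16": "192.0.2.1",
    "10.2.0.0/16": "192.0.2.1",
    "10.3.0.0/16": "198.51.100.7",
}

# Path MTU state, indexed on ETR address and stored once per ETR.
etr_path_mtu: Dict[str, int] = {}


def record_ptb(etr: str, mtu: int) -> None:
    """Store the path MTU learned for one ETR."""
    etr_path_mtu[etr] = mtu


def path_mtu_for(eid_prefix: str) -> int:
    """Pick the ETR first, then look up the MTU for that ETR."""
    etr = eid_to_etr[eid_prefix]
    return etr_path_mtu.get(etr, DEFAULT_PATH_MTU)
```

One call to record_ptb("192.0.2.1", 1400) then affects every EID
prefix mapped to that ETR, without touching the mapping entries.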

>> The intermediate router would need to send back sufficient of the
>> original packet to ITR to include the LISP nonce.  Otherwise, PTBs
>> spoofed by off-path attackers would be accepted and the whole system
>> could easily be DoSed.
>   
> Probably there is a DoS risk, but IMO is from on-path attackers. How can
> an off-path attacker know that a specific ITR is sending packets to a
> specific ETR and send a fake ICMP message to shrink the MTU toward that
> specific ETR?

It is easy to find out where the ITRs of a particular ISP or end-user
network are.  Attackers could discover this by various means, but
here is one:  They get their own EID space, which is mapped to an
RLOC address of one of their own machines - which pretends to be, or
is, an ETR.  Then, the attacker sends to some host in the ISP or
end-user network a ping packet or any other packet which will elicit
a response.  That packet has a source address in the attacker's EID
range, so the targeted host will send back a response to that
address.  The ITR in the targeted site will look up the mapping for
the attacker's EID address and so tunnel the response packet to the
attacker's "ETR".  With LISP, APT and TRRP, the encapsulation outer
header's source address is that of the ITR, so the attacker's "ETR"
learns the address of the ITR.  (With Ivip, the source address is
that of the sending host.)

So attackers can easily find the address of an ITR in some network of
their choice where there are sending hosts.  They could do this en
masse and so discover the addresses of ITRs in many such networks.

The attacker presumably has some idea of where the sending hosts are
for the destination network it wants to DoS.

The attacker could find the addresses of Proxy Tunnel Routers which
handle the attacker's EID.  The attacker simply sends packets from one
of its own EID addresses to random IP addresses to get responses from
a bunch of hosts, many of which will use a PTR (all those which don't
have an ITR in their own network).  If there were a single set of
PTRs which handled every EID, then this would suffice to locate all
PTRs in the world.

However, at least in the Ivip model, there probably won't be OITRDs
(the Ivip version of LISP's PTR) which handle every micronet.  There
seems to be no business plan for LISP PTRs, so I can't say how they
would be organised.  Assuming that there is one set of PTRs for one
subset of the EID address space, and another set for another subset,
then the above method will only allow the attacker to discover the
PTRs which handle its subset.

I can't easily think of a way an off-path attacker could find the
addresses of the PTRs which handle the EID space of some network
which the attacker wants to DoS.  Nonetheless, attackers in general
could figure out the addresses of PTRs in various PTR networks and
share this information between themselves.

So an attacker could find the addresses of some, many or most ITRs
which might be tunneling packets to the site they want to DoS.

The attacker presumably has a motivation to DoS a particular ETR.  If
the attacker wants to DoS the ETR(s) of a particular organisation, it
can easily look up the mapping for that organisation's EID prefix to
obtain the ETR addresses.

With the ETR address and a list of ITR addresses, the attacker can
inexpensively generate a sequence of spoofed PTB packets.  (But see
discussion below on how much information the ITR needs to store about
recently sent packets, to force the attacker to correctly set length,
checksum and (IPv4 only) Identification bits to match a packet
actually sent, which the attacker, being off-path, can't see.)

Each such spoofed PTB, if recognised by the ITR, can be used to
clobber the MTU the ITR perceives for the targeted ETR, so causing
significant and reasonably persistent DoS for a potentially large
number of sending hosts which rely on that ITR - and a potentially
large number of destination hosts in potentially multiple destination
networks which rely on this ETR.

If the ITR requires the LISP nonce to be in the PTB fragment, then
the system is secure against such spoofing, but then this PMTUD
system will only work if all the intermediate routers which might
cause an MTU problem send back at least 16 bytes rather than the RFC
1191 minimum of 8 bytes after the IP header.

>> The ITR needs to store an initial fragment of each incoming traffic
>> packet for some time, so it can generate a PTB message for the
>> sending host.  It can't rely on enough of the original packet coming
>> back in the PTB from the intermediate router.  The ITR needs to cache
>> this for a second or two at least - while it waits for a possible
>> PTB.  This is an onerous requirement in a high-volume ITR.
>   
> No. This is not the case.
> Assume the following topology:
> 
> H1-----------ITR1-----<DFZ>----------ETR2-----------H2
> 
> where host H1 sends packets to host H2.  If a first packet triggers an
> ICMP PTB in the DFZ, this is sent back to ITR1, which just stores the
> fact that to reach ETR2 a smaller MTU must be used. Nothing is sent back
> to H1.

But that involves data loss.  The Ivip approach is intended to avoid
this.

> H1 sends a second packet. On the ITR1 there will be a check on the MTU
> triggering a second ICMP PTB to be sent back to H1.

Ahh . . . OK, you can make it work this way if you accept data loss
on the first large packet, expecting the sending host to generate
another similarly large packet, which seems reasonable.
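
The two-packet sequence can be sketched like this, assuming IPv4
packets with DF=1 and an encapsulation overhead of 36 bytes (20 outer
IPv4 + 8 UDP + 8 LISP).  The class is hypothetical and the PTB from
the core is taken as already validated.

```python
from typing import Dict, Tuple

ENCAP_OVERHEAD = 36  # assumed: 20 (outer IPv4) + 8 (UDP) + 8 (LISP)


class Itr:
    """ITR which keeps only a per-ETR MTU, not the packets themselves."""

    def __init__(self, default_mtu: int = 1500) -> None:
        self.default_mtu = default_mtu
        self.mtu_to_etr: Dict[str, int] = {}

    def on_ptb_from_core(self, etr: str, reported_mtu: int) -> None:
        """First big packet was dropped in the DFZ: just remember the MTU."""
        current = self.mtu_to_etr.get(etr, self.default_mtu)
        self.mtu_to_etr[etr] = min(current, reported_mtu)

    def on_packet_from_host(self, etr: str, length: int) -> Tuple[str, int]:
        """Return ("encapsulate", outer length) or ("ptb_to_host", MTU)."""
        limit = self.mtu_to_etr.get(etr, self.default_mtu) - ENCAP_OVERHEAD
        if length > limit:
            return ("ptb_to_host", limit)
        return ("encapsulate", length + ENCAP_OVERHEAD)
```

The first 1464 byte packet is encapsulated and lost in the core; once
on_ptb_from_core("ETR2", 1400) has run, the second one gets
("ptb_to_host", 1364) and H1 can shrink its packets.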

I don't think this technique can be used with Ivip, but I will
consider it when I write the PMTUD I-D.

I withdraw my critique of LISP's stateful approach to PMTUD requiring
storage of the initial part of longer packets for a few seconds.
However, I think the text of draft-farinacci-lisp and your own
OpenLISP I-D should be revised to make this process clearer.

Still, if you are to check the nonce, you do need to store each nonce
of recently sent packets.

To do a proper check of the PTB, you need to store some packet
details - to force the attacker to generate the correct
Identification (IPv4 only) and Length bits from the IP header.

Likewise, if you store the UDP checksum of recently sent packets,
then you can force the attacker to get that right, which for data
packets would generally be impossible except by making lots of
attempts.  It would be easier to spoof the checksum and length of TCP
SYN packets
or some other commonly used short packets.  To protect against that,
your ITR would need to store the length of such (generally short)
packets and make sure the potentially spoofed PTB message's MTU limit
was not longer than the actual packet.  This, combined with some
general limit on how low you would accept the MTU to be according to
any PTB, would probably make the system immune to this kind of attack.
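
Those two sanity checks could be expressed roughly as below.  The
floor of 576 is purely an assumed value for "some general limit", and
the function name is made up.

```python
MIN_ACCEPTED_MTU = 576  # assumed floor; the text only says "some general limit"


def ptb_is_plausible(reported_mtu: int, sent_length: int,
                     floor: int = MIN_ACCEPTED_MTU) -> bool:
    """Reject PTBs whose MTU could not have been caused by the packet sent."""
    if reported_mtu < floor:
        return False  # lower than anything the ITR is willing to accept
    # A packet only draws a PTB if it is longer than the reported MTU, so
    # an MTU at or above the stored packet length cannot be genuine.
    return reported_mtu < sent_length
```

A PTB claiming an MTU of 1400 in response to a 76 byte SYN is thrown
away, since a packet that short could never have exceeded that MTU.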

Also, I think you need to either check the LISP nonce (and accept the
system will fail if the PTB original packet fragment doesn't contain
it) or not expect it and so recognise that the system is open to a
powerful DoS attack if you don't store a bunch of details of recent
packets, and use them to carefully check all PTBs.

This PTB checking process itself could represent a DoS vulnerability
for a busy ITR.  If it holds details of a lot of recently sent packets
- potentially hundreds of thousands or millions, I guess - and you are
not requiring LISP nonces, then you still have to search through some
long and constantly changing lists of packet details such as length,
UDP checksum and (IPv4 only) Identification for every PTB received.

This could easily get way out of hand - those cached lists of recent
packet details are expensive enough to add to and clean out, and I
guess they can't inexpensively be optimised for inexpensive searching
to see which one matches a PTB.

PTB messages are pretty short (inexpensive for the attacker to send)
and could chew up a lot of ITR CPU power to process.  Checking only a
fraction of these PTBs (in order to limit the CPU time spent chewing
through a flood of spoofed PTBs) would slow down the ITR's ability to
adapt properly to actual MTU problems.

> So, you need two big packets in order to make the feedback reach the
> original sender, but the ITR does not need to store packets.
> 
> Hope this clearer now.

Yes - thanks.

  - Robin