[rrg] LISP-ALT's long path problem yet again

Robin Whittle Tue, 23 Dec 2008 16:55:25 -0800

I think that some LISP folks do not consider the "long path" critique
a problem.


When we debated this earlier this year, it was my impression that
many people agreed with the critique, and that there was no
substantial defence against it from LISP supporters:

  LISP-ALT's long path problem again
  http://psg.com/lists/rrg/2008/msg01676.html

  K. Sriram's diagram illustrating the problem:
  http://www.antd.nist.gov/~ksriram/strong_aggregation.png

In the hope of prompting a detailed, constructive debate, here is
another account of the critique.

The current, small, LISP ALT network won't have much of a problem
with large numbers of hops through the ALT network from ITR to ETR
since there are so few ALT routers in the network.  Also, unless the
test sites and ALT routers are spread all over the globe, the
distances between ALT routers will be relatively small compared to
the distances in a fully deployed network.

However, if the LISP-ALT network handled 10M ITRs and ETRs, I guess
there would need to be something like 500k to 1M ALT routers.  Let's
say there are 512k at level 1 of the hierarchy - those which directly
connect to ITRs and ETRs.  Then, at level 2, with an aggregation rate
of 8, there are 64k ALT routers.

So the numbers of routers in each level of the hierarchy would be
something like:

L6       16
L5      128
L4     1k
L3     8k
L2    64k
L1   512k
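
As a rough check of these figures, the table can be reproduced by
repeated division by the aggregation ratio.  This is only a sketch: it
assumes "512k" means 512 * 1024, and the names level_counts, L1_ROUTERS,
AGGREGATION and LEVELS are made up for illustration.

```python
# Sketch only: reproduces the per-level ALT router counts above.
# Assumptions (not from the original post): "512k" means 512 * 1024, and
# every level aggregates exactly 8 routers of the level below.

L1_ROUTERS = 512 * 1024   # level-1 ALT routers, as guessed in the text
AGGREGATION = 8           # routers aggregated per router one level up
LEVELS = 6                # L1 .. L6


def level_counts(l1_routers, aggregation, levels):
    """Return [count at L1, count at L2, ...] for the given hierarchy."""
    counts = [l1_routers]
    for _ in range(levels - 1):
        counts.append(counts[-1] // aggregation)
    return counts


counts = level_counts(L1_ROUTERS, AGGREGATION, LEVELS)
for level in range(LEVELS, 0, -1):
    print(f"L{level} {counts[level - 1]:>8}")
print("total ALT routers:", sum(counts))
```

Under these assumptions the total is a little under 600k ALT routers,
which sits inside the 500k to 1M range guessed above.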

If L6 is fully meshed, then for a packet going from any one ITR to
any one ETR, on average (assuming an even spread of ITR and ETR
addresses over the address space) in 15 cases out of 16 the addresses
of the ITR and ETR are so different that the packet will need to
ascend to L6 before it can cross to another L6 ALT router to begin
its descent to the part of the ALT (upside-down) tree where the ETR
resides.

In only 1 of 16 cases will the ITR and ETR be in the same 1/16 of the
address space, so that the packet needs to ascend from the ITR only to
the L5 level before it descends to the ETR.

So the typical path of a packet is:

ITR ->  L1 ALT router
    ->  L2 ALT router
    ->  L3 ALT router
    ->  L4 ALT router
    ->  L5 ALT router
    ->  L6 ALT router (from one 1/16 of the address space to another)
    ->  L6 ALT router
    ->  L5 ALT router
    ->  L4 ALT router
    ->  L3 ALT router
    ->  L2 ALT router
    ->  L1 ALT router -> ETR
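
The path above, and how often it is needed, can be sketched in a few
lines of Python.  This assumes the even spread of ITR and ETR addresses
mentioned earlier; the function names are hypothetical.

```python
# Sketch only: the "typical" ALT path and the fraction of ITR/ETR pairs
# that need it, assuming an even spread of addresses over the address space.
from fractions import Fraction

LEVELS = 6        # L1 .. L6, as in the table above
TOP_ROUTERS = 16  # fully meshed L6 routers


def typical_path(levels):
    """ALT routers visited when the packet must cross the top-level mesh."""
    up = [f"L{n}" for n in range(1, levels + 1)]
    down = list(reversed(up))
    return up + down


def fraction_crossing_top(top_routers):
    """Fraction of ITR/ETR pairs that sit under different top-level routers."""
    return 1 - Fraction(1, top_routers)


path = typical_path(LEVELS)
alt_hops = len(path) - 1   # hops between consecutive ALT routers
print(" -> ".join(path))
print("ALT routers on path:", len(path))
print("ALT-to-ALT hops:", alt_hops)
print("fraction of pairs taking this path:", fraction_crossing_top(TOP_ROUTERS))
```

The hop count here covers only ALT-to-ALT hops; the ITR-to-L1 and
L1-to-ETR legs are left out.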

Then the packet is delivered to the host network, and a mapping reply
goes straight back to the ITR via the Internet.

The problem would be less than depicted above if the ALT network
aggregated more than 8 ALT routers per level in the hierarchy - or if
there were fewer than 10M ITRs and ETRs.
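
To get a feel for how much a higher aggregation ratio helps, here is a
small Python sketch.  It assumes 512 * 1024 level-1 ALT routers and
that levels are added until one has at most 16 routers, which then form
the full mesh.  Both the stopping rule and the function names are my
assumptions, not part of any LISP-ALT document.

```python
# Sketch only: how the aggregation ratio changes the depth of the hierarchy.
# Assumptions (not from the original post): 512 * 1024 level-1 routers, and
# levels are added until a level has at most max_top routers (the full mesh).

def levels_needed(l1_routers, aggregation, max_top=16):
    """Number of levels until a level is small enough to be fully meshed."""
    if aggregation < 2:
        raise ValueError("aggregation ratio must be at least 2")
    levels, count = 1, l1_routers
    while count > max_top:
        count = -(-count // aggregation)  # ceiling division
        levels += 1
    return levels


def alt_hops(levels):
    """ALT-to-ALT hops on a path that climbs to the top mesh and back down."""
    return 2 * levels - 1


for aggregation in (8, 16, 32):
    n = levels_needed(512 * 1024, aggregation)
    print(f"aggregation {aggregation:>2}: {n} levels, {alt_hops(n)} ALT hops")
```

With these assumptions, raising the ratio from 8 to 32 removes two
levels, but the path across the top of the hierarchy still involves
several ALT-to-ALT hops.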

Assuming that the ITR and ETR are pretty close to their local ALT
router, there would still typically be 11 hops in the ALT network.
That is a lot, by any standards.  However, there are two things which
make this number of hops much worse than the same number of hops
between internal or DFZ routers.

1 - Since the ALT network structure is defined by address ("highly
    aggregated") rather than geography or network topology and since
    the organisations who are responsible for various parts of the
    address space are geographically widely dispersed (that is,
    addressing generally does not follow physical network
    topography), the hop from one level ALT router to the next will
    involve multiple physical router hops, most of them probably
    in the DFZ.

    In terms of physical distance and number of physical routers,
    each ALT to ALT hop will involve the packet traversing a tunnel
    involving many physical links and routers, each with its delay
    and risk of packet loss.

2 - Because the organisations who are responsible for the various
    parts of the address space are scattered over continents:

      North America
      Europe
      East Asia - Japan and Korea
      China
      Russia etc.
      South America
      Singapore, Australia etc.
      Africa

    hops between ALT routers will often involve traversing
    intercontinental distances, including across the Pacific
    and Atlantic oceans.


The only way around this problem of typically very long physical
paths would be to concentrate all the ALT routers in one location.
However, then, pretty much every packet from an ITR would need to go
to the central location, and then back to wherever the ETR is.

That would be OK for ITRs and ETRs near the central location, but for
much of the world, this would require a total path length of up to
the circumference of the Earth.  But there's no way the organisations
who are responsible for the address space will want to put all their
ALT routers in any one central location.


I think it is bad enough that ALT relies on a global query server
network.  APT and Ivip avoid this - as does LISP-NERD.

By "long path", I mean not just from Australia to the USA - this is
obvious from the global nature of the ALT query server network.

What I mean is something like from Australia, to Singapore, to China,
to Los Angeles, to Munich, and then to the east coast of the USA.
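
To put rough numbers on such a path, the following Python sketch
compares it with a direct path using great-circle distances.  The
cities standing in for "Australia", "China" and "the east coast of the
USA", and their approximate coordinates, are my assumptions, and
great-circle distance is only a stand-in for real cable routes.

```python
# Sketch only: compares the example ALT path with a direct path.
# Assumptions (not from the original post): the cities chosen for
# "Australia", "China" and the US east coast, their approximate coordinates,
# and great-circle distance as a proxy for real physical paths.
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

WAYPOINTS = [
    ("Melbourne", -37.81, 144.96),    # assumed for "Australia"
    ("Singapore", 1.35, 103.82),
    ("Beijing", 39.90, 116.41),       # assumed for "China"
    ("Los Angeles", 34.05, -118.24),
    ("Munich", 48.14, 11.58),
    ("New York", 40.71, -74.01),      # assumed for the US east coast
]


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def path_km(points):
    """Total great-circle length of a path through the given waypoints."""
    return sum(
        great_circle_km(a[1], a[2], b[1], b[2])
        for a, b in zip(points, points[1:])
    )


alt_path = path_km(WAYPOINTS)
direct = path_km([WAYPOINTS[0], WAYPOINTS[-1]])
print(f"via the example ALT path: {alt_path:,.0f} km")
print(f"direct:                   {direct:,.0f} km")
print(f"ratio:                    {alt_path / direct:.2f}")
```

With these assumed waypoints the ALT path is considerably longer than
the direct one, before counting any extra delay inside each tunnel.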


This "long path" problem will not occur in the current test network,
or even in the first year or two of deployment.

These typically long paths (physically, and in terms of number of ALT
routers and the greater number of physical routers) mean:

1 - Excessive delays in initial packets getting to the ETR.

2 - Higher levels of packet loss for these packets.

3 - Therefore, excessive delays or non-responses in the ITR getting
    the map reply message.

4 - Therefore, the ITR sending not just the initial packet on the
    ALT network, but multiple such packets when one or more sending
    hosts time out and send a second attempt.
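
Points 2 and 4 can be illustrated with a small Python sketch of how
loss compounds over a number of hops.  It assumes each ALT-to-ALT hop
drops a packet independently with the same probability; the loss rates
are made-up values for illustration, not measurements.

```python
# Sketch only. Assumptions (not from the original post): each ALT-to-ALT hop
# drops a packet independently with the same probability, and the loss rates
# used below are illustrative values rather than measurements.

def delivery_probability(per_hop_loss, hops):
    """Probability that one packet survives every hop on the path."""
    if not 0.0 <= per_hop_loss < 1.0:
        raise ValueError("per_hop_loss must be in [0, 1)")
    return (1.0 - per_hop_loss) ** hops


def expected_attempts(per_hop_loss, hops):
    """Mean number of sends until one initial packet reaches the ETR."""
    return 1.0 / delivery_probability(per_hop_loss, hops)


for loss in (0.001, 0.01, 0.05):
    p = delivery_probability(loss, 11)
    print(f"per-hop loss {loss:.3f}: delivery over 11 hops {p:.3f}, "
          f"expected attempts {expected_attempts(loss, 11):.2f}")
```

In practice each ALT-to-ALT hop is a tunnel over many physical routers,
so the per-hop loss to plug in would be that of the whole tunnel.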


It might be possible to have some kind of caching server at some or
all of the ALT routers.

However, this wouldn't solve the problem of the delay in delivery of
the initial packet, unless the ITRs sent a pure map request message
over the ALT network, and buffered the packet while waiting for a reply.

Still, how far up and down the ALT hierarchy would the request need
to go before it reached a caching ALT router which had the mapping
information?  I think the packet would typically need to go quite a
bit of the way, depending on how frequently ITRs with similar
addresses had requested mapping for this particular EID.

With caching ALT routers, there would still be two problems:

1 - The more caching ALT routers are deployed, the more state
    will be stored in the ALT network.  Yet the original design goal
    was to avoid all such state, and to make the ALT network a
    lightweight method of getting mapping information and delivering
    the initial packet(s) ASAP.

    If it is acceptable to have mapping information stored in the
    network, rather than only at the authoritative query servers
    (ETRs) then this is an argument for the hybrid push/pull
    architectures of APT and Ivip, where the full mapping database is
    pushed to query servers which are in the same ISP as the ITRs.

2 - There are scaling problems trying to keep those caches up to
    date.  One major advantage of the ALT network without
    caches is that the query always goes to the authoritative
    query server (the ETR) and so the ITR always gets the latest
    valid mapping, with whatever caching time the ETR decides to
    give that response.   This way, the ETR can set a short
    caching time if it wants to, depending on how much that
    increases the rate of queries and how much it wants ITRs
    to have reasonably fresh mapping information.

    However, if there are lots of caching ALT routers, then each
    one might need to repeatedly query the ETR to keep its cached
    information up to date.  Sometimes, this would not increase the
    load on the ALT network and the ETR compared to doing it without
    caching - this is when the ITR was going to ask again for the
    latest mapping, once the cache time ran out.  However, I think
    there would be many instances where the ITR doesn't actually need
    the mapping again, but the caching ALT routers would (I guess) be
    programmed to keep asking the ETR every caching time for the
    latest mapping in case an ITR wants it.
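
The refresh load described in point 2 can be sketched in Python as
follows.  It assumes every caching ALT router re-queries the ETR once
per caching time, while ITRs re-query only as long as they still have
traffic for the EID.  All of the numbers are invented for illustration.

```python
# Sketch only. Assumptions (not from the original post): every caching ALT
# router refreshes a cached mapping once per caching time, and ITRs re-query
# only while they still have traffic for the EID. All numbers are made up.

def refresh_queries(caching_routers, period_s, cache_time_s):
    """Queries reaching the ETR when caches refresh on every expiry."""
    return caching_routers * (period_s // cache_time_s)


def demand_queries(active_itrs, period_s, cache_time_s):
    """Queries reaching the ETR when only ITRs with live traffic re-ask."""
    return active_itrs * (period_s // cache_time_s)


PERIOD_S = 3600      # one hour
CACHE_TIME_S = 60    # caching time chosen by the ETR (hypothetical)

with_caches = refresh_queries(50, PERIOD_S, CACHE_TIME_S)
without_caches = demand_queries(5, PERIOD_S, CACHE_TIME_S)
print("queries per hour, 50 caching ALT routers refreshing:", with_caches)
print("queries per hour, 5 ITRs asking on demand:          ", without_caches)
```

Whether caching increases or reduces the load on the ETR depends
entirely on how the number of caching routers holding the mapping
compares with the number of ITRs that still need it.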

    - Robin


