This is a walk-through of my understanding of what a CES architecture
aims to do and how it would work.  I am using Ivip as an example, and
I am including global mobility in addition to the basic goals of
scalable routing.

This stuff can be hard to understand.  I hope by reading about it in
various ways that people will understand it better.

  - Robin



In msg05799, Patrick Frejborg wrote:

> In a nutshell, CES is all about taking out the multi-homed PI
> addresses from the DFZ - transforming them into PA-addresses
> (RLOCs, which are aggregated) and provide a multi-homing solution
> for IPv6 so it will not further expand the size of DFZ.


It is not just about transforming an actual address or prefix of
addresses from PI use to a new use - SPI (Scalable PI) for Ivip, or
EID for LISP.  It is certainly not about transforming actual
addresses or prefixes of addresses from PI to PA.

It is, broadly speaking, about generating a growing subset of the
address space consisting of "edge" addresses - SPI space for Ivip -
which is suitable for end-user network use because it can be used for
portability, multihoming and inbound TE, in a manner which scales
well: it does not impact the DFZ control plane, or the DFZ routers'
FIBs, anywhere near as much as the current only way of doing this -
PI space advertised in the DFZ.

Then more and more end-user networks can adopt this type of space,
gaining the benefits, in a scalable fashion.  Some will be new
networks.  Some will have used PA space in the past.  Their SPI
addresses will be different from their PA addresses.  (Though they
may continue to use their small piece of PA space to run an ETR,
through which they can use their SPI space - and maybe get another PA
address from another ISP so they can multihome, with a second ETR on
that address too.)

Others will have PI space now. They could either convert their PI
space into SPI space, or return it to their RIR and hire some SPI
space from a company which provides this.

Because (with Ivip) SPI space can be split up into micronets as small
as a single IPv4 address, or an IPv6 /64, this kind of space can be
used much more efficiently than with the current approach, in which
prefixes of 256 IPv4 addresses, or 512, 1024 etc. must be used.
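To put a number on the efficiency difference, here is a trivial
back-of-envelope calculation (the 6-address figure is the example
used later in this message):

```python
# Address efficiency: a network needing only 6 public IPv4 addresses
# must today advertise at least a /24 (256 addresses) of PI space,
# whereas an Ivip micronet can be exactly the size needed.

needed = 6
smallest_pi_prefix = 256   # a /24, the smallest generally routable block
stranded = smallest_pi_prefix - needed

print(stranded)            # addresses left idle per such network
```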


Here is a fuller description.  The first part is a statement of my
understanding of the goals of scalable routing, extended to include
support for global mobility.

Mobility is not a formal part of the RRG's goals, but we obviously
need to do it.  In the future, I am sure, most Internet hosts are
going to be wireless mobile hand-held devices.  I think global
mobility - a single IPv4 or IPv6 address or range of addresses, no
matter what the point of attachment, including behind NAT - is the
best and most practical approach to mobility.  The TTR Mobility
architecture provides this - and as far as I know this is the only
way to achieve global mobility.

The second part is an Ivip-specific description of how a CES works -
not counting some important details such as PMTUD (Path MTU
Discovery) management if encapsulation tunneling is used, or the
adoption, instead of tunneling, of Modified Header Forwarding -
either initially or in the long-term.


Purpose 1 - general scalable routing and addressing
---------------------------------------------------

There are, or will be, probably 10 million (10^7) or so non-mobile
end-user networks which want or need portability, multihoming and/or
inbound Traffic Engineering (TE).

Brian Carpenter and I have independently suggested this figure in the
past and recently.  I plan to write more about this.

We can't satisfy this want/need with current techniques because the
only way of doing it would be to have each such end-user network
advertise at least one PI prefix in the DFZ.

With IPv4, it could be tricky to find enough such prefixes, unless
they were longer (fewer addresses) than /24 - which is not inconceivable.

Many such networks lack the resources (financial and technical) to
obtain and manage PI prefixes like this.

However, the most important problem is that doing this would
overburden the DFZ routers.  The overall DFZ control plane would have
to handle 10 million or more prefixes, compared to the current 300k.
The route processors, with their RIB data structures, would need to
handle all this - and the RIB & route processor load scales in rough
proportion to the number of neighbours each DFZ router has, because
each neighbour engages in a continual two-way parley about every
advertised prefix.  (Not counting a few which might be aggregated
into one.) Also, each DFZ router would need this number of prefixes
in its FIB too.
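The scaling argument above can be sketched numerically.  This is
only a back-of-envelope illustration - the neighbour count is an
assumption of mine, not a measurement:

```python
# Rough model of per-router DFZ control-plane load: BGP chatter
# scales roughly with (advertised prefixes x BGP neighbours), since
# each neighbour engages in a two-way parley about every prefix.

def rib_workload(prefixes, neighbours):
    return prefixes * neighbours

NEIGHBOURS = 30                      # assumed; varies widely per router

current = rib_workload(300_000, NEIGHBOURS)
future = rib_workload(10_000_000, NEIGHBOURS)

print(future / current)              # roughly a 33x increase in burden
```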

The aim is to meet the wants and needs of these end-user networks,
regarding portability, multihoming and TE, within their resource
constraints, and without excessively burdening the DFZ.

Ideally, existing PI users could migrate to the new system and so
reduce the number of prefixes in the DFZ.

Ideally (though this is not in the RRG goals) the new arrangement
would enable more efficient use of IPv4 space.

While the support for these 10M or so end-user networks will have
some impact on the DFZ control plane, RIBs and FIBs, it is vital that
the impact on average per end-user network is very much less than at
present with PI prefixes.


Purpose 2 - supporting mobility on a massive scale
--------------------------------------------------

In the future there may be 10 billion (10^10) wireless-linked mobile
devices, such as cell-phones, laptops or whatever.

Whatever changes we make to the routing and addressing system should
also support each such device working well with the Internet - though
for the larger numbers this probably means the IPv6 Internet rather
than the IPv4 Internet.

This is not a formal goal of the RRG, but I think many people agree
we should make a single set of architectural changes to solve both
challenges.

It so happens that a CES architecture which works well for non-mobile
end-user networks can also support the TTR Mobility architecture.
The Translating Tunnel Routers appear to be ordinary ETRs and ITRs to
the CES system.  MNs (Mobile Nodes) make two-way tunnels to one or
more (typically) nearby TTRs - and these handle the MN's incoming and
outgoing packets for its globally portable micronet of SPI (Scalable
PI) address space.  Mapping changes are not needed every time the MN
gets a new access-network address - they are only desirable when the
MN moves a long distance, such as 1000km or more.

The other approach to making CES support mobility is to have the MN
be its own ETR and probably ITR.  If so, the MN can't be behind NAT
and each such MN needs a global unicast address of its own to be its
own ETR.  This approach also requires effectively instant changes in
ITR behavior to maintain connectivity in the frequent instances where
the MN gets a new access network address.  No CES architecture can do
this.  TTR Mobility is a much better approach.

Assuming the CES mapping system can handle 10^10 micronets, rather
than the 10^7 needed without ubiquitous mobility, and assuming the
CES network elements (ITRs and ETRs) can handle the extra traffic for
all these mobile devices, then the CES scheme needs no elaboration to
support the TTR Mobility architecture.  A CES architecture with
real-time mapping (Ivip) would be better than LISP (non-real-time) -
but either would work.

So it is possible and desirable to plan a CES architecture to achieve
both goals.


How a CEE architecture might attempt to achieve these goals
-----------------------------------------------------------

Core-Edge Elimination involves creating a new namespace for uniquely
identifying hosts.  Applications request the IP stack to set up
communication with other hosts which are specified by their
Identifier address.  (Some people object to the use of "address" in
this context.)

The stack handles the hosts acquiring and relinquishing their
"physical" addresses - the IP address by which they actually send and
receive packets - to retain session continuity during these changes,
and in the long-term to provide complete portability of the host's
Identifier "address" no matter which one or more ISPs they are using
to connect to the Net.

Therefore, the new stack functions, and the applications which work
entirely in terms of Identifiers, enable hosts (and whole networks of
hosts) to have portability, multihoming and inbound TE.  Mobility is
basically a fast form of multihoming - but it involves the MN
acquiring addresses on all sorts of access networks, on an ad-hoc
basis.  If the CEE architecture can handle mobility (not all attempt
to do so) then it will need some fancy authentication arrangements so
the session can continue with packets being sent to and from a new
access network physical address, rather than the one previously used.

Because applications only use Identifier addresses, the hosts
themselves can use any kind of access network address (though
probably not behind NAT) - any kind of "Locator" or "physical"
address.  If all hosts and all their applications follow the CEE
paradigm, then there is no need for PI addresses at all.  ISPs will
each have a handful of prefixes they advertise in the DFZ, and all
end-user networks will temporarily be given an address or prefix
from these.  This is PA address space, and it will be fine for all
hosts, since all applications select their correspondent hosts
entirely by using Identifier "addresses".

In CEE, there is no separation of a subset of the "core" addresses to
become "edge" addresses - which is what happens in a CES architecture.

In a CEE architecture, all the addresses the hosts physically use
are ordinary IP addresses (or some new construct which replaces IP
addresses).  All hosts physically connect using a single type of
Locator address, and there is no distinction between subsets of this
space to be reserved for "edge" networks, such as those end-user
networks which want portability, multihoming and TE on a scalable
basis.  All hosts use a PA address from each of one or more upstream
ISPs - so there's no need for any end-user network to have its own PI
assignments of Locator address space.

There are now two namespaces: Locator and Identifier.  There is no
core and edge distinction between different classes of Locator
address - they are all PA addresses from the point of view of
end-user networks.  Identifiers are from a separate namespace, which
is global - and each Identifier is unique in the world.  Identifiers
have nothing to do with ISPs and have no relationship with geography
or network topology.  (Some CEE architectures allow non-unique
Identifiers, but this looks like a lot of trouble to me.)

The first trouble with CEE is that it involves changes to all host
stacks, APIs and applications - in order to replace the current
methods by which hosts communicate, with benefits only arising when
two CEE hosts, with CEE applications, communicate.  The second is
that it can be argued that it is undesirable to expect all hosts to
take on more responsibility for routing and addressing - because some
or many of them are on slow and potentially unreliable links.  See:

 http://www.ietf.org/mail-archive/web/rrg/current/msg05745.html
 http://www.firstpr.com.au/ip/ivip/RRG-2009/host-responsibilities/.


How a CES architecture - Ivip - achieves these goals
----------------------------------------------------

Some of the following applies to LISP, though with different terminology.

Ivip creates a new subset of the global unicast address space.
Addresses previously used for conventional purposes are now in this
subset - and the subset grows as more people who control address
space choose to use it in the new way.

Currently we have ISP prefixes.  When part of one (usually a small
section of the actual prefix the ISP advertises) is given
temporarily, or semi-permanently, to one of the ISP's customers, it
is known as a PA prefix.  This is "Provider Assigned" space, and the
end-user network only gets it while using this ISP.  PA space is not
portable, or suitable for multihoming or TE.

We also have PI prefixes - those which belong to some larger end-user
networks, and are the cause of the routing scaling problem.

Most home, SOHO and other small business and organisational end-users
are doing fine with their current PA arrangements, which are
acceptably reliable.  Most of these end-user networks don't need
portability, multihoming or TE.

For the subset of end-user networks which do need these things (10^7
or so) and for the 10^10 mobile end-user networks of the future, Ivip
creates a new kind of address space - a subset of the existing global
unicast address space.

This is neither PI nor PA.  It is "Scalable PI" address space - SPI.

To do this, multiple BGP-advertised prefixes known as MABs (Mapped
Address Blocks) are used.  (LISP has the same thing, but there is no
LISP term for them.)

For instance, an IPv4 MAB might be 12.34.0.0 /16.  This is advertised
in the DFZ by multiple DITRs (Default ITRs in the DFZ). DITRs are
arranged around the Net and work like any other ITR - however a given
DITR may only advertise a subset of all the MABs in the entire Ivip
system.  Overall, all MABs are covered by widely distributed DITRs.

For more explanation of the commercial arrangements for this and for
the mapping system, see:

  http://tools.ietf.org/html/draft-whittle-ivip-arch
  http://psg.com/lists/rrg/2008/msg01158.html

It is possible for an end-user network which already has PI space to
convert some or all of this into one or more MABs.  In this case,
they are the sole users of each MAB - but they could hire out some of
the space.  This does not make them ISPs, because having MAB space
does not give Internet connectivity.

The typical arrangement will be for multiple companies to acquire
MABs and hire out the space within them, in small chunks, to large
numbers of end-user networks which want portability, multihoming, TE
and/or mobility.

This section does not concern mobility - but there is a Mobility
section at the end.


The following description concentrates on end-user networks which
rent their SPI space from a MAB company - a single range of SPI
addresses in a single MAB.  However, an end-user network could rent
space in multiple MABs from the same company or from multiple companies.

These end-user networks may never have had any address space before -
so they are new networks.  They may previously have had PI space, but
have abandoned it.  If they had PA space, such as the single IPv4
address they get with their DSL etc. service from their one ISP, then
they will presumably retain that PA space, or whatever PA address
their ISP gives them.

Renting SPI space on its own is of no use.  The end-user network
needs to pay at least one ISP for Internet access, and it needs an
ETR.  It also needs to either control the mapping of its space
directly, or pay some specialised company to do it for them.

The end-user network gets a range of SPI addresses - any integer
number.  For instance, end-user network AAA gets a User Address Block
(UAB) of 6 IP addresses:

   12.34.5.60
   12.34.5.61
   12.34.5.62
   12.34.5.63
   12.34.5.64
   12.34.5.65

(For IPv6, the unit by which address space can be split up into
micronets is a /64 prefix, so the equivalent would be six contiguous
/64 prefixes.)

AAA can decide to use the whole UAB as a single micronet.  A micronet
is an integer number of IPv4 addresses (or IPv6 /64s) in a single
MAB, which are all covered by the same mapping.

In Ivip, the mapping is a single ETR address.  There is no need to
give ITRs choice between ETRs, because if the currently mapped ETR
should not be used, the end-user network will (directly or
indirectly) cause the mapping to be changed to point to another ETR
which should now be used.  Ivip does this in real-time - a few
seconds at most.  (Less than a second, globally, looks technically
possible but it is best to think of 2 to 3 seconds.)

AAA can divide this 6 address UAB into micronets however it likes -
including into 6 micronets of a single IPv4 address each.  Then it
could map each micronet to any ETR in the world.
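As a concrete illustration of micronets and their mapping, here is a
toy lookup table such as an ITR might hold.  The class and method
names are my own invention for the sketch, not Ivip terminology:

```python
import ipaddress

# Toy version of an ITR's micronet mapping table.  Each micronet is a
# contiguous range of SPI addresses whose entire mapping, under Ivip,
# is a single ETR address.

class MicronetMap:
    def __init__(self):
        self.entries = []                 # list of (start, length, etr)

    def set_mapping(self, start, length, etr):
        s = int(ipaddress.IPv4Address(start))
        # a real-time mapping change replaces the old entry outright
        self.entries = [e for e in self.entries if e[0] != s]
        self.entries.append((s, length, etr))

    def lookup(self, addr):
        a = int(ipaddress.IPv4Address(addr))
        for s, length, etr in self.entries:
            if s <= a < s + length:
                return etr
        return None                   # not SPI space we have mapping for

m = MicronetMap()
m.set_mapping("12.34.5.60", 6, "9.9.9.9")     # whole UAB as one micronet
print(m.lookup("12.34.5.63"))                 # -> 9.9.9.9

# AAA changes ISPs: one mapping change redirects all the world's ITRs
m.set_mapping("12.34.5.60", 6, "7.7.7.7")
print(m.lookup("12.34.5.63"))                 # -> 7.7.7.7
```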

With a single link to an ISP, such as with its existing DSL link, AAA
gets a single PA IPv4 address 9.9.9.9.  It can still use this however
it likes, but if it wants to use some of its SPI space via this link
to the Net, then it needs to create an ETR at this address.  The
router or whatever (router function in a DSL modem, or a server
running *nix) needs to accept IP-in-IP packets and decapsulate them.

Ivip ITRs encapsulate packets with the outer header's source address
being that of the sending host.  If the packet arrives from the DFZ
and passes through the ISP's BR (Border Router) filtering system
(which may be set up to reject any packet with a source address
matching any of the ISP's prefixes) then it will arrive at the ETR
function in AAA's router.

The ETR function strips off the outer header and compares the inner
source address with that of the outer header's source address.  If
the two match, the packet is then ready to be forwarded.  If not,
then the packet is dropped, because it is apparently an attempt by an
attacker to get the ETR to forward a packet with a forged source
address.  (For the purposes of this discussion, I assume the ISP has
its own ITRs so it never sends packets out to the DFZ with SPI source
addresses - otherwise, this BR filtering by source address would
prevent hosts in its own networks being able to successfully send
packets to AAA's micronet address.)

If the packet is not addressed to one of AAA's SPI addresses, then
it should be dropped.  (This could result from an attack or from some
other SPI-using network erroneously mapping their micronet to this
9.9.9.9 ETR address.)
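The two ETR checks just described can be summarised in a short
sketch.  The packet is a plain dict rather than real parsed headers -
a deliberate simplification for illustration:

```python
import ipaddress

# Sketch of the two ETR sanity checks: (1) outer source must match
# inner source, since Ivip ITRs copy the sending host's address into
# the outer header; (2) the inner destination must be one of this
# network's SPI addresses.

AAA_SPI = {int(ipaddress.IPv4Address("12.34.5.60")) + i for i in range(6)}

def etr_decapsulate(pkt):
    """Return the inner packet if it passes both checks, else None (drop)."""
    inner = pkt["inner"]
    if pkt["outer_src"] != inner["src"]:
        return None        # apparent attempt to forge a source address
    if int(ipaddress.IPv4Address(inner["dst"])) not in AAA_SPI:
        return None        # attack, or someone else's erroneous mapping
    return inner

pkt = {"outer_src": "1.2.3.4", "outer_dst": "9.9.9.9",
       "inner": {"src": "1.2.3.4", "dst": "12.34.5.62"}}
print(etr_decapsulate(pkt) is not None)   # -> True: forward the packet
```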

So now the router has the packet, addressed to one of the SPI
addresses, and can do whatever AAA wants with it.

AAA could have all 6 addresses as a single micronet.  Then it could
have 6 hosts, one for each SPI address, at its site.  AAA can still
use its 9.9.9.9 address as before - typically it would have been
used, and may still be used, as the public address of a NAT function,
with a bunch of hosts behind NAT on the LAN.

AAA might use one of the SPI addresses for a NAT box, another for a
web server, another for a mail server or whatever.

AAA could split the space into multiple micronets and map each
micronet to any ETR address in the world.  It could map all the
micronets to 9.9.9.9, which would have the same effect as having all
6 IP addresses in one micronet, and mapping that to 9.9.9.9.

(It is not possible to map the micronets to any address in the SPI
subset, but any other global unicast address will do.  It is AAA's
responsibility to only map its micronets to ETR addresses which it is
using.)

AAA might have a staff member who works from home, and needs for some
reason to have a portable, public address.  So they might have a
single IPv4 address micronet mapped to their DSL home address.

(For simplicity, I am assuming these DSL addresses are fixed or at
least highly stable - which is not necessarily the case in practice.)

AAA might set up branch offices and want each one to get a single
stable, public, global unicast IPv4 address - no matter how those
offices choose different ISPs.  So it ensures each of those offices
has a DSL service (or whatever) with a stable address, and maps a micronet to
each one - each office needs to run an ETR function at its DSL
address, as just described for AAA's main office.

These offices can be anywhere in the world, with any ISP.

Now back to AAA's main office.  AAA can change its ISP so its DSL or
whatever service arrives with a single IPv4 address 7.7.7.7.  So it
changes the mapping of whatever micronets it was using in the main
office from the previous ETR address of 9.9.9.9 to 7.7.7.7.

Let's say AAA wanted to multihome its main office - and for
simplicity of explanation, that the main office only needed a single
IPv4 address.

AAA keeps its DSL service with 9.9.9.9 and gets an HFC service with
an address of 8.8.8.8.

Now it sets up an ETR function for each link, on each address.  It
can send out packets on either link (more on outgoing packets below)
and it can receive packets addressed to its SPI addresses via tunnels
from ITRs to the ETR function on either link.

Let's say the DSL link was superior due to lower latency and/or
better upstream speed.  I will assume a single micronet containing
just one SPI address:

  12.34.5.60

is all that is required for this main office.

AAA sets the mapping for this micronet to 9.9.9.9 and all the ITRs in
the world will tunnel packets addressed to 12.34.5.60 to the ETR
function which attaches to the DSL link.

If the DSL link fails, AAA needs to somehow (ideally quickly and
automatically) change the mapping of this micronet to 8.8.8.8.

Maybe AAA could do this manually once either its DSL ISP, or its DSL
link, dies.  It could use wireless Internet to log into the company
which it uses to send mapping commands, or maybe it could phone them
and give a username and password etc. with the new address to use:
8.8.8.8.

In general, networks like AAA will want to pay a separate company -
a Multihoming Monitoring (MM) company - to do this for them.

The MM company has a bunch of servers all over the world and fancy
software so several of these servers continually probe the
reachability of 12.34.5.60 via both ETRs.  So the router at AAA's
office needs to have some UDP port, or some other mechanism, by which
it can send out replies to these probe packets.  Ping would be
sufficient.

AAA tells MM how often it wants these probe packets to be sent, and
what algorithm to use to decide when to change the mapping to 8.8.8.8
- and likewise what algorithm to use when deciding when to change it
back to 9.9.9.9 once the DSL-link ETR is working properly again.

AAA gives MM the credentials it needs to control the mapping of the
12.34.5.60 micronet.  MM's servers do this via some SSL system, HTTPS
or whatever, with the Update Authorisation Server which AAA uses for
changing mapping.

Now, if the DSL ISP, or the DSL link, fails, within a few seconds
(depending on how frequently the MM servers probe reachability) the
MM system will decide to change the mapping to 8.8.8.8 - and so
connectivity will be restored.
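A toy model of one MM server's decision follows.  The threshold and
the notion of "consecutive misses" are invented for this sketch - in
practice AAA chooses the probe rate and algorithm:

```python
# Toy model of the MM servers' reachability decision for one path:
# after `threshold` consecutive missed probe replies via an ETR, that
# path is declared down and the micronet is remapped.

class PathMonitor:
    def __init__(self, threshold=3):
        self.threshold = threshold
        self.misses = 0

    def probe_result(self, reply_received):
        """Record one probe; return True while the path is considered up."""
        self.misses = 0 if reply_received else self.misses + 1
        return self.misses < self.threshold

dsl = PathMonitor(threshold=3)
for reply in (True, False, False, False):     # DSL dies after first probe
    dsl_up = dsl.probe_result(reply)

print(dsl_up)   # -> False: MM now remaps 12.34.5.60 to 8.8.8.8
```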

These probe packets look like any other packet which has been
encapsulated by an ITR - they are IP-in-IP packets.  However, the MM
servers generate these directly - they don't need an ITR.

The outer and inner source addresses would be the address of
whichever MM company server sent the probe packet.  The inner
destination address will be 12.34.5.60.  The outer destination
address will be 9.9.9.9 if the MM server is probing via the DSL ISP,
link and ETR - or 8.8.8.8 if it is probing via the HFC ISP, link and
ETR.  Typically the MM company would send these probes from multiple
sites, near and far from these ISPs.

The response to the probe packet should go out the link the probe
arrived on.  If a probe arrived via the DSL link to the 9.9.9.9
address, and the ETR function at this address decapsulated and
forwarded it to whatever handles the 12.34.5.60 SPI packets, but the
resulting response was sent out the HFC link, then failure of the
HFC link would make it look like the DSL link had died.

In this case, AAA is running its own ETRs.  The ETR functions are
probably in the same router that handles the whole network, so there
is no concept of an ETR being disconnected from the destination network.

An alternative, and better scaling, arrangement is for ISP1 to
have an ETR on one of its addresses, such as 9.9.99.99, with links
(tunnels, direct connections via some physical means or whatever) to
multiple end-user networks.

Then AAA would receive, via its DSL service, packets from this ETR.
The ETR could serve the needs of many end-user networks, large and
small.  This is more efficient, since a single IPv4 address for a
single ETR serves the needs of many end-user networks.  (This address
efficiency doesn't matter with IPv6.)

In this case, the probing by the MM company proceeds just as before.
The MM company is not probing reachability of the ETR - it is
probing reachability of the end-user network *via* the ETR.  The ETR
decapsulates the packet, sends it to the router at AAA's office, and
the router sends back a response packet, out the DSL link.

If the DSL link dies, then the MM servers will receive no response
packets.  After a few seconds of non-response (as chosen by AAA) the
MM system will change the mapping of the micronet to the address of
the HFC link ETR.  This takes a few seconds at most to get to all the
ITRs which are tunneling packets addressed to this micronet.  So
multihoming service restoration can be achieved in a few seconds.

Note that with Ivip, the probing, and the decisions about how to
change mapping due to multihoming failures, are entirely outside the
Ivip CES.  With LISP and other CES systems, the probing (or however
failures are detected) must be done by the ITRs individually, and
each one must make its own decision about choosing another ETR from
the list of ETRs supplied in the more complex mapping information.

Ivip ITRs have no load-sharing functionality.  (LISP ITRs do.) To do
basic load sharing (the most common form of inbound TE), here is how
AAA would organise its network.

Firstly, the incoming traffic would need to be split in some way over
at least two separate IP addresses.  Some of the incoming packets
need to be addressed to one or more SPI addresses in one micronet X
and the rest need to be for one or more SPI addresses in at least a
second micronet Y.  There may be three or more such micronets, each
with one or more IPv4 addresses (or IPv6 /64s) receiving incoming
packets.   In this example, I assume the incoming traffic is roughly
evenly split over two single SPI address micronets:

   12.34.5.60  X
   12.34.5.61  Y

This could be done in several ways.  Even if to the outside world
there was a single FQDN which was the destination of all this
traffic, DNS could return both these addresses, so on average the
sending hosts would be sending half the packets to the X micronet and
the other half to the Y micronet.

AAA configures its service with the MM company to probe reachability
to hosts on both micronets, via both ETR addresses.  The
decision-making algorithm is configured to achieve:

  If the network is reachable via both ETRs, then:
      Map micronet X to the DSL ETR address:   12.34.5.60 >> 9.9.9.9
      Map micronet Y to the HFC ETR address:   12.34.5.61 >> 8.8.8.8

  If the DSL ETR is working and the HFC one is not:
      Map micronet X to the DSL ETR address:   12.34.5.60 >> 9.9.9.9
      Map micronet Y to the DSL ETR address:   12.34.5.61 >> 9.9.9.9

  If the HFC ETR is working and the DSL one is not:
      Map micronet X to the HFC ETR address:   12.34.5.60 >> 8.8.8.8
      Map micronet Y to the HFC ETR address:   12.34.5.61 >> 8.8.8.8

  If the network is not reachable via either ETR:
      Send an SMS message to the cellphone of AAA's IT manager!

AAA would also configure its service with the MM company so the
mapping would be returned to the normal state (the first of the four
above) for instance 2 minutes after both ETRs were found to be
operating properly again.
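The four-case policy above can be transcribed directly.  The function
and dict layout are mine, as one way AAA might have the MM company
apply its configuration:

```python
# Direct transcription of the four-case mapping policy for two
# single-address micronets, X (12.34.5.60) and Y (12.34.5.61).

DSL, HFC = "9.9.9.9", "8.8.8.8"

def mapping_for(dsl_ok, hfc_ok):
    if dsl_ok and hfc_ok:
        return {"12.34.5.60": DSL, "12.34.5.61": HFC}   # normal load sharing
    if dsl_ok:
        return {"12.34.5.60": DSL, "12.34.5.61": DSL}   # HFC link down
    if hfc_ok:
        return {"12.34.5.60": HFC, "12.34.5.61": HFC}   # DSL link down
    return None   # both down: SMS the IT manager instead

print(mapping_for(True, False))   # both micronets fall back to the DSL ETR
```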


In this example, simple load sharing is achieved, and it can easily
be seen how it could expand to multiple other links.  Each micronet
could have more than one IPv4 address.

The same approach could be used if the HFC service had fluctuating
downstream capacity (which is quite likely) due to the shared nature
of its downstream RF channel - other customers in the area could
demand more and so AAA gets less.  The HFC link might be usable for a
mailserver, but the DSL link is more suitable for HTTP to the web
server at the office.  So this is not so much load sharing, but
splitting traffic over two links for other reasons - which is still
inbound TE.  In this case the first micronet X would be for the one
or more SPI addresses which were used for the web server.  The second
micronet Y would be for the SPI address used by the mailserver.

Let's say AAA had multiple streams of packets coming in, with varying
data rates at different times.  Maybe it receives website traffic in
response to a TV program which screens at different times in
different countries or parts of the country.  Maybe its servers
handle multiple quite different bodies of traffic for purposes which
rise and fall in popularity hour-by-hour, according to all sorts of
unrelated variables.

If it can split the destination SPI addresses for these streams into
different micronets, such as 4, 6 or more, then depending on the
traffic arriving on each at any given point in time, it can map the
micronets between its two or more ETRs, and so dynamically balance
the flows to maximise the utilisation of its links, while reducing
the chance of any one link being congested.

While each Ivip mapping change would incur a fee (I guess 20 cents to
a fraction of a cent), it may still be highly attractive to use this
at peak times, if it meant that it was possible to dynamically
balance these loads over several less expensive links, rather than
paying perhaps a great deal more for a faster link.

In this case, assuming AAA is still using MM to control the mapping
of its micronets, it needs to work out with MM a way its router,
and/or its IT managers, can tell the MM system which micronet should
be mapped to which ETR.  The MM system will pass on these requests
directly, by changing the mapping - as long as all ETRs are working.

None of these MM company activities absolutely needs to be
standardised by the IETF.  They are not part of Ivip.  It's up to AAA
and the MM company how they communicate and how MM probes the
reachability of its network.  IETF standards would probably be
desirable, but even if there were such standards, MM and AAA can do
things differently if it suits them.

If the MM company detects that AAA's network is not reachable via one
or more ETRs then its multihoming service restoration algorithm takes
precedence over whatever AAA last requested in terms of load-sharing
- so the MM system would change the mapping of micronets to use the
ETRs which are working.

In principle, AAA could directly control its mapping itself.  But
doing this in a way which achieves rapid failure detection and
mapping changes for multihoming service restoration looks tricky.  So
I think companies such as MM will be the best approach for most networks.

AAA pays its ISPs for connectivity, and perhaps for use of any ETRs
which the ISPs run.

AAA also pays an annual fee to rent its 6 IPv4 addresses of SPI
space from whichever MAB company it chooses.  This MAB company is
also responsible for accepting the mapping changes AAA orders, or
which MM orders, and sending these out to all the full-database query
servers (QSDs) in the world.  The MAB company also runs, directly or
indirectly, the DITRs which handle this MAB and so which tunnel
packets to AAA's micronets when they are sent from hosts in networks
without ITRs.  The MAB company would sample this DITR activity and
charge AAA and all its other SPI-space renting customers, according
to the traffic for their micronets which the DITRs handled.

The MAB company either is a RUAS (Root Update Authorisation Server)
company or has a contract with a RUAS company to handle the mapping
for its MABs.  The RUAS companies work together to run the fast-push
mapping distribution system which sends all mapping changes to all
the world's full-database mapping query servers (QSDs).

ITRs all over the world get their mapping from local QSDs - directly
or via intermediate, caching QSCs.

Whenever a QSD gets a mapping change for a micronet for which it
recently gave out a mapping, it sends a "map update" message to the
querier.  The querier may be an ITR or a QSC, and if it is a QSC,
the same process occurs.  The result is that ITRs reliably get
mapping changes for micronets they currently have cached mapping for.
This process is secured with a nonce from the original map query and
is explained fully in:

  http://tools.ietf.org/html/draft-whittle-ivip-arch
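A minimal sketch of this query-and-update behaviour, assuming an invented message shape and a 600-second cache lifetime (neither is from the draft):

```python
import time

class QSD:
    """Toy full-database query server.  It remembers recent queriers
    so it can push "map update" messages when a micronet's mapping
    changes, echoing the nonce from the original query so the querier
    can authenticate the update."""

    CACHE_LIFETIME = 600  # seconds a querier is remembered (assumed)

    def __init__(self, mapping):
        self.mapping = mapping   # micronet -> ETR address
        self.recent = {}         # micronet -> {querier: (nonce, time)}

    def query(self, micronet, querier, nonce):
        self.recent.setdefault(micronet, {})[querier] = (nonce, time.time())
        return self.mapping[micronet]

    def mapping_change(self, micronet, new_etr, send):
        self.mapping[micronet] = new_etr
        now = time.time()
        for querier, (nonce, when) in self.recent.get(micronet, {}).items():
            if now - when < self.CACHE_LIFETIME:
                # The querier - an ITR or a caching QSC - checks the
                # nonce against its own earlier query.
                send(querier, micronet, new_etr, nonce)
```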

There's no need for AAA's router, or its hosts, to have any ITR function.

The ISPs which it uses (ISP1 for the DSL link, ISP2 for the HFC link)
will typically have their own ITRs.

It is not absolutely essential that these ISPs have ITRs, since if
there is none, then the packets being sent from AAA's network which
are addressed to SPI addresses will go out to the DFZ and then to the
nearest DITR which is advertising the matching MAB.

However, any ISP such as ISP1 or ISP2 which has customers using SPI
space would be motivated to install its own ITR(s) so that whenever a
host of any of its customers sends a packet to an SPI address which
is in a micronet currently mapped to any of the ETRs of these SPI
using customers, these packets will be tunneled to these ETRs from
within the ISP's network, rather than going out to a DITR and coming
back as a tunneled packet to one of these ETRs.
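The ISP ITR's decision just described boils down to: if the destination falls inside any MAB, it is an SPI address, so look up its mapping and tunnel to the resulting ETR; otherwise forward normally.  A sketch, with made-up MAB and ETR addresses:

```python
import ipaddress

# Assumed MAB prefixes - purely illustrative.
MABS = [ipaddress.ip_network("203.0.113.0/24")]

def itr_forward(dst, lookup_mapping):
    """Return ("tunnel", etr) if dst is an SPI address, else ("forward",).

    lookup_mapping(dst) stands in for asking a local QSD (or QSC) for
    the ETR the destination's micronet is currently mapped to.
    """
    addr = ipaddress.ip_address(dst)
    if any(addr in mab for mab in MABS):
        # SPI destination: encapsulate and tunnel to the mapped ETR,
        # keeping the packet inside the ISP's network when the ETR is
        # local, rather than detouring via a distant DITR.
        return ("tunnel", lookup_mapping(dst))
    return ("forward",)
```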

AAA could put an ITR in its network - and it could put ITR functions
in one or more of its sending hosts.  There's no particular reason
for doing so, as long as the ISP provides an ITR which works fine.

But perhaps in the future an ISP might charge less for customers such
as AAA which don't emit packets addressed to SPI addresses, and so
don't burden the ISP's ITRs.  This seems unlikely, but if it
occurred, it would be an incentive for AAA to install its own ITR(s).

To run ITRs in its network, AAA would need these to be able to
request mapping directly (or indirectly, through QSCs) from one or
ideally two or so QSDs in the ISP's network.

AAA could run its own QSDs, but most small networks would be better
off using their ISP's QSDs, since (with a full deployment, with 10^10
micronets) a QSD will require significant storage (hundreds of
gigabytes) and involve significant inflows of mapping updates.
(Also, at boot time, there is a lot of data to download to the QSD -
though by the time we have 10^10 micronets, a few hundred gigabytes
won't be considered such a large amount of data.)
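As a back-of-envelope check of the "hundreds of gigabytes" figure - the ~30 bytes per micronet entry is my assumption (start address, length, ETR address and overhead), not a number from the Ivip drafts:

```python
micronets = 10**10          # full-deployment scale from the text
bytes_per_entry = 30        # assumed per-micronet storage
total_bytes = micronets * bytes_per_entry
print(total_bytes / 10**9)  # 300.0 GB - i.e. hundreds of gigabytes
```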


Whenever an ISP has a customer such as AAA which is using SPI space,
the hosts of the customer will be sending out packets with a source
address which is an SPI address.  The same will be true if AAA runs
its own ITRs, since the encapsulation makes the outer header have the
source address of the sending host.

ISPs will need to accept packets and forward them when they have
these SPI source addresses.   There's no obvious way the ISP could be
fussy about whether or not this particular customer was authorised to
use these SPI addresses - though I suppose it could monitor the
mapping system and see which micronets were currently mapped to the
ETR address this customer is currently using.  But if the ISP was so
restrictive, then this would force AAA to send out packets from an
SPI address N on the same link as whichever is used for the ETR which
currently has N's micronet mapped to it.   This would restrict the
ability of AAA to freely choose its outgoing paths - its outgoing TE.

I previously wrote that an ISP wouldn't be able to know or prevent a
customer such as AAA using its PA address as an ETR, and so using SPI
addresses.  This is true for the incoming packets (unless the ISP
snooped on the packets going to the customer and saw a bunch of Ivip
IP-in-IP packets).  However, to be useful, AAA really needs its one
or more ISPs to forward packets from these SPI addresses.

Therefore, ISPs should in general forward packets from customers with
any source address matching any MAB.

There's no way the ISP could quickly enough detect that some SPI
address has been mapped to a particular ETR address to alter its
filtering.  It would be perfectly allowable and desirable for some
other end-user network BBB, with some micronet totally unrelated to
AAA's micronets, to map one of its micronets to AAA's ETR - and so
for AAA's hosts to receive these packets and so need to send out
packets with source addresses matching one of BBB's micronets.

The ISP could easily check whether the source address matched any of
the MABs, and therefore was an SPI address - so the ISP could filter
packets emerging from AAA's network to drop any which had source
addresses other than an SPI address, and other than the PA address
the ISP provides with the link.
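That coarse egress filter - accept the link's PA address plus anything inside any MAB - is simple to express.  All prefixes here are illustrative:

```python
import ipaddress

MABS = [ipaddress.ip_network("203.0.113.0/24"),   # assumed MABs
        ipaddress.ip_network("198.51.100.0/24")]
LINK_PA = ipaddress.ip_address("192.0.2.10")      # assumed PA address

def accept_source(src):
    """True if a packet with this source address should be forwarded."""
    addr = ipaddress.ip_address(src)
    if addr == LINK_PA:
        return True
    # Any SPI source is accepted: the ISP cannot track quickly enough
    # which micronets are currently mapped to this customer's ETR.
    return any(addr in mab for mab in MABS)
```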

So it will be trivially easy to spoof SPI addresses.  As far as I
know, it is frequently or typically easy to spoof PI or PA addresses,
so this doesn't represent a significant reduction in security.


I hope these examples can be imagined on larger scales - with whole
corporations and universities using SPI space.

The more end-user networks, of all sizes, we can attract to SPI space
the better for routing scalability.  If the very largest existing PI
using end-user networks keep their PI space, that's OK, because their
prefixes are few in number.  The main aim is to serve the needs of
many networks which currently don't have PI space.

Still, those with PI space could probably get by with less SPI space,
since PI space can only be divided, in IPv4, down to /24s and SPI
space can be divided down to any number of IPv4 addresses.  Hopefully
the largest end-user networks, with large amounts of PI space today,
will find SPI space attractive for some or all their uses - and so be
able to reduce the load on the DFZ by reducing or eliminating their
PI advertisements.


TTR Mobility
------------

This is a brief account of something which is fully documented, with
diagrams, at:

  http://www.firstpr.com.au/ip/ivip/#mobile
  http://www.firstpr.com.au/ip/ivip/TTR-Mobility.pdf

This is an IPv4 explanation, but the same thing goes for IPv6 - with
the mobile device getting one or more /64 prefixes.

Let's say AAA has a mobile device, such as a cellphone, laptop or
whatever which can get IPv4 access by various means, including 3G
wireless, wired Ethernet, WiFi etc.  (Still, this applies even if it
can only use 3G.)

It doesn't matter if this access is behind one or more levels of
NAT, or if the MN's address uses some Mobile IP techniques.  (It
still works if the MN's address is an SPI address of some network,
including another mobile device.)

AAA decides to make one of its SPI addresses into a micronet and use
it for this mobile device, hereafter known as "MN".

The MN has an ordinary IPv4 stack, but it has an extra bit of
tunneling software with some other features specific to the TTR company.

There could be multiple TTR companies in competition - and AAA
chooses one of them.  I will call it "the TTR company".

The TTR company has a bunch of sites around the world, at various
locations - in major data centres, exchange points or whatever.
There might be two or three in Australia, five or ten in North
America, 15 in Europe etc.   The whole system would work OK with fewer
than this - I am just describing a fully deployed, mass-market, TTR
company.

In this case, AAA has its own SPI space, but it is quite possible for
a customer to buy a mobile device (PC, cellphone or whatever) and
choose a TTR company which would provide the micronet of SPI space as
well as the services I am about to describe.  The TTR company does
not provide Internet connectivity.  (An ISP could be a TTR company,
but its TTR activities do not provide connectivity.)

Here I assume that only a single IPv4 address is needed for this MN.
More could be used, and it could use multiple micronets - but one is
used here.

The aim is to give this device this SPI address at all times it has
even a single link to the Internet, no matter where it is.  So this
is a portable, mobile, address.  The MN can still have its
applications send and receive packets from its 3G etc. access network
addresses, but here I assume the applications send and receive
packets using the SPI address.

As soon as the MN has any kind of Internet access, its tunneling
software sets up a two-way, encrypted, tunnel to the last TTR it had
contact with.  (The first time it would start with some central TTR.)

A Translating Tunnel Router (TTR) behaves to the CES system like an
ETR.  The ETR function of the TTR serves multiple MNs at the same
time.  If the MN has a tunnel to a single TTR, then all the packets
addressed to its micronet are tunnelled to the ETR function of the
TTR.  The TTR passes the packets into the bidirectional encrypted
tunnel to the MN.

For now I am assuming a TCP-based tunnel.  In practice, probably a
more sophisticated tunnel would be good.  It would be best to handle
packets of various lengths which may be longer than the TTR to MN
tunnel MTU, by fragmenting them.  This is because the MN might use
various wireless links, with various access networks, and the varying
MTUs of these should not be visible to the sending hosts.
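One way to keep access-network MTUs invisible is a simple fragment-and-reassemble layer inside the tunnel.  The 4-byte fragment header below is invented for this sketch; a real protocol would also need ordering and loss handling:

```python
import struct

def fragment(packet, tunnel_mtu, pkt_id):
    """Split packet bytes into fragments fitting the tunnel MTU."""
    payload_max = tunnel_mtu - 4              # room for the 4-byte header
    frags = []
    for off in range(0, len(packet), payload_max):
        chunk = packet[off:off + payload_max]
        more = 1 if off + payload_max < len(packet) else 0
        # Header: packet id (1 byte), offset (2 bytes), more-flag (1 byte).
        frags.append(struct.pack("!BHB", pkt_id, off, more) + chunk)
    return frags

def reassemble(frags):
    """Rebuild the packet (fragments assumed in order and complete)."""
    return b"".join(f[4:] for f in frags)
```

So a 1500-byte packet crossing a tunnel with a smaller MTU would arrive as two or more fragments and be rebuilt at the far end, with the sending host none the wiser.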

Since the final link is often wireless, the system should be able to
resend packets which do not arrive.  However, some classes of packets
might best be treated like UDP packets, and not resent.

Maybe there is such a tunnel protocol already.

The MN sends its outgoing packets, from its SPI source address, to
the TTR, and the TTR forwards them normally.  Those addressed to
conventional addresses are forwarded to DFZ routers.  Those addressed
to SPI addresses are tunnelled by the TTR's ITR function.  The TTR's
ITR function needs a nearby QSD, and this would probably be a server
in the same rack.

This will work no matter where the MN is physically, and no matter
which access network it is using.  It can be behind any number of
layers of NAT and it can still tunnel out to the TTR.

To make it work well, the MN should tunnel to a topologically "close"
TTR.  Any TTR within 1000 km or so would be fine - or a few thousand km
if there is none closer.   The TTR company would have some means of
guiding its software in the MN to tunnel to nearby TTRs.

If, for instance in a 3G network, the MN is suddenly given a new
address, it tunnels from that new address to the TTR and
authenticates itself by some means.  There would be a brief loss of
connectivity, but all applications would keep their sessions, since
they are running from the SPI address.  There's no need to change the
mapping when this happens, since the mapping is to the ETR function
of the one TTR.

When the MN has two or more links, it has two or more IP addresses -
in different access networks.  From each such address it would make a
tunnel to the TTR.   This means it can still receive and send packets
if either one of the links fails.  (Exactly how the TTR decides to
send packets is a matter for the TTR company.  It could load-share
the downstream packets over both links, or send duplicate streams for
robustness.  Likewise it is up to the TTR-company software whether it
sends packets up one link, the other, spreads them over both or
whatever.  Either way, they go to the one TTR.)

The second address might be topologically distant from the first - so
it might be best to find a different TTR.  Then the MN could make a
tunnel from each address to each TTR.

There could be some tricky software to help the MN find out where its
new address is in the topology and to find the nearest TTR.  But none
of this absolutely needs to be standardised by the IETF, since it is
a matter between the particular TTR company and the software that
company provides for the MN.   Ideally this could be
standardised, but there should be room for innovation by various TTR
companies in how they do all this.

The result is that the MN retains its micronet of SPI space, maybe
just a single IPv4 address, no matter where it is - and that by
choosing closer TTRs and then dropping the tunnels to TTRs which used
to be close, but are now distant, the path lengths are generally not
excessively long.

The mapping of the micronet is controlled by the TTR company.  The
TTR company only changes it when it decides that a new TTR is the
best one to send incoming packets to the MN.

An MN could use the same TTR for months or years if it never moved
more than 1000 km or so.   It could still use that TTR if it moved to
the other side of the world, but there would be longer paths, more
latency and more chance of packets being lost.

So TTR Mobility does not mean frequent mapping changes.  Most MNs
wouldn't need their mapping changed from one month or year to the
next.  Only by taking a cross-continent or other long flight would
the MN move enough to warrant choosing a new TTR.

LISP with its slow mapping would work pretty much as well as Ivip for
the TTR mobility architecture.  Ivip would be better, since a new TTR
could be used within a few seconds, and then the old one wouldn't be
needed any more, and the MN could drop its tunnels to the old TTR.

The SPI space the MN gets is global unicast space.  It can be used to
communicate normally with all other hosts - including those using SPI
space, via ordinary ETRs or via TTRs.  The applications in the MN are
perfectly normal and can use any protocols.  The stack itself is an
ordinary IPv4 stack, with tunneling software providing a new IP
address in addition to those which come from the physical connections.


_______________________________________________
rrg mailing list
[email protected]
http://www.irtf.org/mailman/listinfo/rrg
