Short version:  I think the existing division of labour between
                the network (including the routing and addressing
                system) and hosts is good.  I think we should
                add architectural elements to the network to solve
                its scaling problem when millions of end-user
                networks need portability, multihoming and TE.


                Here is a list of objections to any routing scaling
                solution which pushes work relating to multihoming,
                TE and changing ISPs onto hosts.  This is in
                addition to the objections regarding the
                impracticality of introducing such upgrades to the
                great majority of host stacks and applications.

                   Extra host traffic

                   Host operation more affected by packet loss

                   Increased cost and reliability problems
                   for mobile hosts operating over wireless

                   Extra complexity in the host

                   Slowness of response to multihoming

                   Flurry of activity for each multihoming outage

                   General principle of solving a problem close
                   to its origin

                   Complexifying mobile-IP

                I haven't yet listened to the RRG meeting.  There
                may well be other objections.


My messages:

  http://www.irtf.org/pipermail/rrg/2008-November/000215.html
  http://www.irtf.org/pipermail/rrg/2008-November/000228.html

mention two types of objections to a host-based scalable routing
solution:

  1 - The impossibility of introducing it except perhaps over
      a very long time period, such as 15 years.

  2 - Inherent problems with burdening hosts with the responsibility
      for coping with things which occur in the routing system.

Below I concentrate on the second class of objections.

I assume that the proposed host-based routing scaling solution involves:

  1 - All applications use a purely hostname API to the TCP/IP
      stack, which will establish sessions - or allow something
      equivalent to send and forget UDP-like communications -
      no matter what changes occur to the IP address(es) of this
      host, or of any of the hosts it is communicating with.

  2 - Multihoming is achieved by the end-user network having
      two or more upstream ISPs, with each host having at least
      one IP address from the address space provided by each ISP.

  3 - In order to be able to communicate, every host needs an entry
      in the DNS, which will probably be a souped up version of
      the current DNS system.

  4 - The solution is only for IPv6 or something else - not IPv4.

  5 - Perhaps some system within end-user networks so that every
      host can quickly and securely become aware of (by being
      securely informed, or by the host polling the system):

      a - ISPs being added or deleted from the current ISP list.

      b - Therefore, whether the host should gain or lose one or
          more of its IP addresses.

      c - Real time information about whether each ISP is currently
          to be used - and therefore which of the host's current
          IP addresses should be used.

I will refer to the current host as "A" and to hosts it is
communicating with as "B", "C" etc.
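To make assumption 1 concrete, here is a minimal sketch of what a
hostname-only session API might look like.  All names here are
invented for illustration - no such stack or resolver exists today,
and the real souped-up DNS of assumption 3 would be far richer:

```python
# Hypothetical sketch of a hostname-only session API (assumption 1).
# Class and function names are invented; this is not a real stack.

class HostnameSession:
    """A session keyed by hostname, never by IP address.

    The stack resolves the peer's current set of addresses and may
    switch among them mid-session without telling the application.
    """

    def __init__(self, resolver):
        self.resolver = resolver   # callable: hostname -> list of IPs

    def connect(self, hostname):
        addresses = self.resolver(hostname)
        if not addresses:
            raise ConnectionError("no current address for " + hostname)
        # The application only ever sees the hostname; the address
        # list is internal state the stack can refresh at any time.
        return {"peer": hostname, "addresses": addresses}

def demo_resolver(name):
    # Stand-in for the souped-up DNS of assumption 3: host B is
    # multihomed, so it has one address per upstream ISP.
    table = {"b.example.net": ["2001:db8:a::1", "2001:db8:b::1"]}
    return table.get(name, [])

session = HostnameSession(demo_resolver).connect("b.example.net")
```

The point of the sketch is only that the address list becomes hidden,
mutable stack state - which is exactly what generates the extra
traffic and complexity discussed below.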


Host-based solutions increase host traffic
------------------------------------------

Currently, hosts assume that their one or more IP addresses are
stable (within any limit set by a DHCP lease) and make the same
assumptions about other hosts, potentially subject to caching times
for DNS replies.

Any host-based routing scaling solution such as described above will
involve extra traffic to and from each host, including:

  1 - More DNS requests and responses.

  2 - DNS responses which are longer, such as due to them returning:

      a - More IP addresses.

      b - Extra information about IP addresses, such as priorities
          for using them, individual caching times for each address
          etc.

          There may need to be more than one IP address at the
          same priority, so that traffic can be load-shared across
          two or more ISPs at the top priority, which is the
          priority chosen when no failures are detected.

      c - Any other information required by the new solution.

  3 - Some or all user-data packets sent between hosts (probably
      just the initial packet) carrying the sending host's
      hostname.

      Hostnames are variable length and potentially very long.

  4 - Some or all user data packets may also carry extra
      information such as:

      a - Protocol identifier.

      b - Command, ACK and status bits to implement the protocol.

      c - Any other information, such as caching times for
          information provided or implied - for instance to tell
          the destination host how long to cache the supplied
          hostname with the source IP address of the current
          packet.

      d - Any other stuff to cope with the sending host having a
          different address than the one which appears in the
          source address of the packet when it reaches its
          destination.  For instance due to NAT or some mobile-IP
          arrangements.

  5 - Extra packets to and from hosts to implement the protocol -
      for instance to request acknowledgement of what would
      be a UDP packet with current protocols.

      This includes any ICMP destination unreachable messages
      resulting from the host sending a packet to an IP address
      which is no longer reachable.

      It also includes the possibility that host A may probe host
      B on one or more of its IP addresses to check before sending
      a large quantity of user data to B.

  6 - Extra packets to and from other systems, not counting the
      correspondent hosts and DNS.  For instance, to DHCP servers
      or whatever it is by which A obtains its one or more IP
      addresses and potentially learns about which of these should
      be used at present.

The extra traffic is clearly a cost burden.  However, there are
further reasons to want to minimise or eliminate extra traffic to and
from hosts.
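As an illustration of point 2b, here is a sketch of how a host might
pick a destination address from a DNS reply carrying per-address
priorities.  The record format and the tie-breaking rule are
assumptions for illustration, not any existing standard:

```python
# Sketch of priority-based address selection (point 2b above).
# Record format (ip, priority) is an assumption, not a standard.
import random

def choose_address(records, down=frozenset()):
    """records: list of (ip, priority) pairs; lower priority wins.

    Addresses in `down` are known-unreachable and skipped.  Ties at
    the best remaining priority are load-shared by random choice,
    matching the load-sharing need described in point 2b.
    """
    usable = [(ip, pri) for ip, pri in records if ip not in down]
    if not usable:
        return None
    best = min(pri for _, pri in usable)
    candidates = [ip for ip, pri in usable if pri == best]
    return random.choice(candidates)

# Host B: two addresses via ISP A (priority 10), one via ISP B (20).
records = [("2001:db8:a::1", 10), ("2001:db8:a::2", 10),
           ("2001:db8:b::1", 20)]
```

Every host carrying this logic, plus the longer DNS replies that feed
it, is part of the traffic and complexity cost being argued here.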


Extra host traffic makes operation more vulnerable to packet loss
-----------------------------------------------------------------

The more the proper operation of A depends on its ability to send and
receive these extra packets, or longer packets, the more any lost
packets will contribute to:

  1 - Slower operation.

  2 - Incorrect operation.

  3 - Difficulties managing and debugging A and its interactions with
      B, C etc.


Extra traffic is extremely undesirable for wireless devices
-----------------------------------------------------------

Devices relying on a wireless connection suffer from high financial
costs of sending and receiving packets.  Packet losses are high, and
lost packets generally mean some combination of slow operation and/or
attempts to send and receive still more packets.

Slow operation leads to one or both ends retrying the communication,
perhaps with other hosts, or via other IP addresses to the same hosts.

So the slowness and unreliability of mobile connections has a general
multiplying effect on the costs, slowness and flakiness of the entire
system - including the generation of more traffic to try to cope with
the lost packets or the packets which are not actually lost, but
whose ACKs were lost.

With current protocols, the usability of a wireless device degrades
as a certain function X of packet loss.  With new host requirements
and consequent dependence on extra packets, the degradation of
usability due to packet loss will be at least X and will have some
additional component Y, depending on many factors.

Two such wireless devices with packet loss and delay problems will
suffer worse still losses of usability as a result of the new
host-based protocols.


Extra complexity in the host
----------------------------

Generally, extra complexity anywhere is a bad thing.  So it is with
hosts, which are the most numerous things on the Net.  (However, many
of them have immense CPU and RAM resources, so this complexity can
have almost no incremental cost in many instances.) The simpler host
stacks and applications need to be, the less effort has to go into
writing software for, configuring, managing etc. individual hosts.

I am advocating that the addressing, TE and multihoming problems be
solved in the routing system, not pushed out to each host to have to
cope with.  So I am suggesting that it would be less complex, or that
it would be a healthier, less expensive, more reliable, form of
complexity, to solve these problems with enhancements to the routing
system, rather than to the hosts.

This is a general principle, but it is particularly important for
very lightweight hosts, such as sensors, small battery operated
devices etc.  It is also important for hosts for which it is
difficult or impossible to upgrade their stack and/or applications.
Complexity means the likelihood of bugs, security vulnerabilities,
unanticipated interactions in sub-optimal circumstances, necessity to
update firmware or stack and applications to future versions of the
complex protocols etc.

Hosts which can't easily have their stacks or applications upgraded
include:

  1 - Small, battery operated devices.  They may lack the
      sophistication and user interface to allow secure updating of
      their stack and application code.  Also, for security reasons,
      they may be built to be non-upgradeable.

  2 - Older devices which are still useful, but for which no-one is
      upgrading the stack or applications.

  3 - Devices where there is insufficient awareness, motivation or
      expertise on the part of the end-user to upgrade their
      stack and/or applications.

  4 - Devices which have insufficient bandwidth to receive the
      upgrades - for instance, small, battery operated sensor
      devices.

  5 - Devices which for whatever reasons don't use FLASH or any
      other reprogrammable firmware storage.


Slowness of response to multihoming
-----------------------------------

In a host-based solution as described above, each sending host A has
to figure out for itself what is happening if the IP address it is
using for B is no longer usable.  This could take quite a while, and
the time it takes starts from whenever A sends a packet to B via an
IP address which no longer works.

If the fault occurred at T=0 and A did not need to send a packet to B
until T=60, then any 10 second (for instance) process by which A
discovers it must use another IP address for B will complete at T=70.
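The timing in that example can be written as a trivial model: the
detection clock only starts when A next tries to send into the
outage, so the total outage-to-recovery time is the send delay plus
the detection time (the 10 second figure is illustrative only):

```python
# Toy timing model for the T=0 / T=60 / T=70 example above.
# The 10 second detection time is an illustrative assumption.

def recovery_time(fault_at, first_send_after_fault, detection_secs):
    # Detection cannot begin before the host sends a packet into
    # the outage, so it starts at whichever moment comes later.
    start = max(fault_at, first_send_after_fault)
    return start + detection_secs

# Fault at T=0, A first sends at T=60, 10 second discovery: done at T=70.
t_recovered = recovery_time(fault_at=0, first_send_after_fault=60,
                            detection_secs=10)
```

The same model applies per-ITR for LISP, APT and TRRP, and per-host
for the host-based solution, which is what the next paragraphs compare.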

A similar problem occurs for the non-Ivip core-edge separation
schemes (LISP, APT and TRRP): each ITR has to detect the
unreachability and make a decision to use another ETR address, on
its own.

A potential advantage of these non-Ivip core-edge separation
techniques over the above host-based solution is that the one ITR
serving multiple sending hosts will discover the outage and learn to
cope with it within 10 seconds (for instance) of the first packet any
of the hosts sends after the outage.  So packets sent after the ITR
adapts its encapsulation rule for this EID will not be subject to any
losses or delays, whereas they would in the host-based solution where
every sending host has to do its own outage detection and recovery
operation.

A potential advantage of the host-based solution over LISP etc. is
that the host may be able to detect that the packets it sent did not
arrive, and so will send the same packets to another IP address.
This reduces or eliminates data loss from the point of view of the
application, whereas with LISP etc. and with Ivip, ITRs do not
attempt to resend packets.

The big advantage of Ivip over the host-based solution and LISP etc.
is that in a properly administered system, the end-user network's own
monitoring system (probably run by a specialised company hired by the
end-user network) will detect the outage very quickly (within seconds
ideally) and will update the mapping (again, a few seconds).
Thereafter all ITRs will be sending user packets to an ETR which does
connect to the end-user network.   This has the potential advantage
that no end-user packets will be delayed or lost.  For instance, if
no hosts are sending to the network in the time it takes the
monitoring system to change the mapping, then no host suffers any
delay or loss of packets.


Flurry of activity for each multihoming outage
----------------------------------------------

With Ivip, there is a limited flurry of activity.  The monitoring
system detects the outage, reliably, by whatever criteria the
end-user network administrator chooses - and changes the mapping.
This involves activity in the mapping distribution system, but the
end-user network pays for this - it might cost a few cents.  It also
involves activity in all full-database query servers, and those query
servers sending messages to any ITR which recently requested this
mapping (according to the caching time provided with the response to
that request).

With LISP etc. there is a flurry of activity, since all ITRs sending
packets to the ETR which is no longer reachable (or which can no
longer reach the end-user network) must individually discover this
and choose another ETR address from the set of RLOCs in the mapping
information.

With a host-based solution, the flurry is typically worse than with
LISP etc., since there are generally going to be more hosts involved
than there are ITRs in LISP etc. or Ivip.


For instance, if there are 1000 hosts, 10 of which are in each of 100
"sites" (end-user or ISP networks) sending packets to one or more
hosts in end-user network Z which has its link to ISP A fail, then
here are the three types of response:

  Ivip:     1 mapping change to all full database query servers.

            The query servers in 100 sites send messages to
            the one or more ITRs in each site which are handling
            the traffic - based on those ITRs having asked for the
            mapping for Z recently (according to the caching
            time the query server returned with the reply).
            This can include ITR functions in sending hosts,
            but not behind NAT.

            In many cases, hosts will not suffer lost packets.
            Except for those lost packets, hosts themselves are not
            affected by the failure.


  LISP etc. 100 or so ITRs discover on their own that the
            ISP A ETR RLOC address shouldn't be used, so they
            choose the next RLOC address in the mapping info.

            Each is prompted to discover this only after one
            or more of its sending hosts sends a packet.


  Host-based 1000 hosts individually have to discover that the
            ISP A address they have for their correspondent
            hosts is inoperative, and so they send packets to
            its ISP B address instead.

            Every one of these hosts has extra state and required
            extra or longer packets in order to be able to do
            this.  The outage discovery process begins for each
            sending host whenever it sends the next packet.
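The flurry sizes in this example reduce to a back-of-envelope count
of independent recovery actions per scheme (assuming, for
illustration, roughly one active ITR per site in the LISP case):

```python
# Back-of-envelope count of independent recovery actions for the
# example above: 1000 sending hosts, 10 per site across 100 sites.
# "One ITR per site" for LISP etc. is an illustrative assumption.

def flurry(scheme, hosts=1000, sites=100):
    if scheme == "ivip":
        # One mapping change, pushed out by the query servers;
        # the sending hosts themselves do nothing.
        return 1
    if scheme == "lisp":
        # Each site's ITR independently discovers the dead RLOC.
        return sites
    if scheme == "host-based":
        # Every sending host runs its own detection and recovery.
        return hosts
    raise ValueError("unknown scheme: " + scheme)
```

So for this example the ratio is 1 : 100 : 1000, and the host-based
figure grows with the number of communicating hosts, not sites.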



General principle of solving a problem close to its origin
----------------------------------------------------------

The root cause of the routing scaling problem can be summarised as
being related to difficulties coping with one kind of thing, in
several settings.

The one kind of thing is:

   The desire of the end-user network to use an alternative ISP
   to the one they are currently using.

This has several manifestations:

  1 - Long-term.  The end-user network wants to stop using
      one ISP and to use another instead.   This can be done today
      by two methods:

        PA - the end-user network needs to renumber its
             network.

        PI - the BGP interdomain core (DFZ) needs to cope with the
             changed advertisement of one or more PI prefixes.


  2 - Short-term for multihoming.

        PI - interdomain core needs to cope with change, propagate
             it quickly and reliably.

        PA - can't be done now, but the host-based scalable routing
             system described above can make multihoming work with
             each end-user network having a PA prefix from its
             2 or more upstream ISPs.


  3 - Short term for inbound Traffic Engineering.

      As for short-term multihoming.  However, while multihoming
      failures are presumably quite rare, there is no upper limit
      on how often an end-user network might want to change
      the incoming ISP for all or part of its network in order
      to do TE for load balancing.

Note that none of these scenarios has anything to do with hosts.
They are all concerned with network-related matters:

  1 - What set of address space (or multiple sets) will the
      network (routers and hosts) use?

  2 - To what extent will this range be stable, when there is a
      need to select a new ISP in the long term, or for short-term
      reasons such as multihoming and TE?

The stability of IP addresses for the network in general, and for
individual hosts, is also important in terms of trust, security,
spam-prevention etc. since at present, IP addresses and their
currently assumed stability play an important part in these
mechanisms.  This includes ACLs, spam black-hole lists etc.

These matters - which generally concern trust, security etc. at the
level of the end-user organisation - do not have any natural
connection with individual hosts.


Complexifying mobile-IP
-----------------------

A host-based solution such as I described above needs to be
implemented on all hosts, no matter whether they are running on an
end-user network or not.  This solution implies two classes of
address space:

  IP addresses which are part of provider networks and are not
  used in any way by any other network.

  IP addresses which are used by hosts in end-user networks, but
  which are always part of the address space of an ISP.

    (I am ignoring transitional arrangements where some end-user
     networks have their own PI space.)


A network-based solution, such as the core-edge separation systems
LISP, APT, Ivip and TRRP, creates two classes of address space, again
ignoring the current pattern of some end-user networks having their
own PI prefixes advertised in the DFZ.

  1 - IP addresses which are outside the mapping system.  Assuming
      these are advertised in BGP, they must be prefixes used
      only by ISPs, or by their conventional PA end-user network
      customers.  (ETRs must use this type of address.)

  2 - A new class of address space I call Scalable Provider
      Independent (SPI).  To use Ivip terminology, all SPI space
      is within a Mapped Address Block (MAB) prefix, which is
      advertised in the DFZ.  Each MAB is split into many micronets
      for potentially large numbers of end-user networks, with each
      end-user network having 1 or many micronets.  All packets sent
      to an SPI address (except perhaps those whose source and
      destination is entirely within the one end-user network) need
      to go through an ITR, which will send it to the correct ETR.

(For simplicity, I am leaving Six/One Router out of this analysis.)
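The ITR behaviour implied by point 2 can be sketched in a few lines:
map the destination SPI address to its micronet, then tunnel the
packet to that micronet's single current ETR.  The table and
addresses below are invented for illustration; in Ivip the mapping
comes from query servers in real time, not from a hard-coded table:

```python
# Minimal sketch of ITR forwarding for SPI space (point 2 above).
# Micronets and ETR addresses are invented example values.
import ipaddress

# micronet prefix -> the one ETR currently chosen by the end-user
# network's mapping (no ITR-side choice between ETRs in Ivip).
MAPPING = {
    ipaddress.ip_network("192.0.2.0/28"):  "198.51.100.1",
    ipaddress.ip_network("192.0.2.16/28"): "203.0.113.9",
}

def itr_forward(dst):
    addr = ipaddress.ip_address(dst)
    for micronet, etr in MAPPING.items():
        if addr in micronet:
            # SPI destination: encapsulate and send to the ETR.
            return {"encapsulate_to": etr, "inner_dst": dst}
    # Not SPI space: forward conventionally.
    return {"forward_natively": dst}
```

Note how the ITR's job here is pure lookup-and-tunnel; reachability
testing and ETR selection live outside it, in the mapping system.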

Existing mobile IP techniques will work fine with the core-edge
separation schemes, since at any point in an ISP or end-user network
which uses the new scheme, IP addresses are as stable as they are today.

However, with a host-based solution as described above, the
foundations of current mobile-IP techniques can no longer be assumed.

It is bad enough with present mobile-IP techniques, where a mobile
IP address is created within the stack, implemented by various
host- and router-based techniques on the basis of one or more
stable "care-of" addresses.  But to make all these "care-of"
addresses, and all the addresses of the routers and other parts of
the total mobile-IP system, also subject to the unstable IP address
arrangements required by the above host-based solution, sounds like
a complete nightmare to me.

If such a host-based solution was adopted, I think there would be two
approaches to implementing mobile-IP:

  1 - Completely rewrite the mobile-IP protocols, greatly
      complexifying them by replacing their current roots in
      care-of IP addresses, home-agent IP addresses etc. - and
      whatever cryptographic techniques are built on those
      assumed to be stable IP addresses - with something new
      based on the new hostname-based API and stack.

  2 - Leave mobile-IP protocols as they are, but restrict their
      use to networks where hosts and routers have stable IP
      addresses: to ISP networks and to their single homed PA
      customer networks.

      This precludes the possibility of a home-agent router
      being placed in any end-user network which is multihomed
      with the host-based routing scaling solution.


I think 2 above is completely unworkable.  Mobile-IP is a
tangled-enough prospect as it is, making various demands upon local
networks where mobile hosts or home agent routers attach.  To then
make mobile-IP inapplicable in end-user networks would destroy much
of its utility.  This would make it unusable in any university or
corporate network - yet those large end-user networks, with their
WiFi and other types of WAN and physically mobile device are just
where mobile-IP needs to work well.

Option 1 sounds like a complete nightmare.  I think people who are
seriously contemplating uprooting all hosts from their stable IP
addresses and grafting them on to a new rootstock with a
hostname-based foundation should consider how they are going to
convince the mobile-IP folks to rewrite all their stuff.


What to do?
-----------

I think the existing division of labour is good:

   The network (including the routing and addressing system) provides
   hosts with a reasonably stable IP address.

   The host doesn't have to worry about things which happen in the
   network, such as changes to the core routing system, outages
   in ISPs and in links to ISPs, or (ideally) an end-user network
   choosing to use another ISP.

This minimises the complexity in the hosts and the traffic to and
from the host.

The problem is that the current routing system can't provide
multihoming and full address portability for all the end-user
networks which want it.  Since we all agree BGP can't be souped up to
cope with tens or hundreds of millions of PI end-user networks, we
need to do something.

I suggest we keep the existing division of labour and add some new
architectural structures to the routing and addressing system so it
can provide multihoming, TE and portability in a scalable fashion to
the potentially hundreds of millions (billions maybe?) of end-user
networks which want and need it.

I suggest the best approach is a core-edge separation scheme based on
ITRs, ETRs and a mapping system.  Specifically, I suggest:

   http://www.firstpr.com.au/ip/ivip/

for reasons including:

  Simpler ITRs and ETRs, since they are not required to do any
  reachability testing, or choose between multiple ETR addresses.

  Real time control of mapping enables and requires each end-user
  network to control the mapping.  This means they can use whatever
  decision making processes, reachability testing etc. they like
  to control the behaviour of ITRs.  This is a modular approach,
  in contrast to the monolithic integration of reachability
  testing and multihoming service restoration decision making
  in LISP, APT and TRRP.

  Ivip has minimal encapsulation overhead, and has a scheme for
  coping with the resulting PMTUD problems.  If DFZ routers
  can be upgraded - probably with a firmware update - then
  Ivip has two "forwarding" based approaches which involve
  no overheads or PMTUD problems.

Also, if we solve the routing scaling problem with a system such as
Ivip, a powerful new form of mobility becomes possible, with
genuinely mobile IP addresses, no matter what sort of network the
mobile device uses for its one or more care-of addresses:

  http://www.firstpr.com.au/ip/ivip/#mobile


  - Robin

_______________________________________________
rrg mailing list
[email protected]
https://www.irtf.org/mailman/listinfo/rrg
