Re: [rrg] Summary of architectural solution space - problem definition

Robin Whittle Sun, 21 Dec 2008 20:18:20 -0800

Hi Bill,

I do think the current statement is biased towards seeing the problem
as a deficiency in hosts, and therefore towards solving the problem
in the hosts too.

You wrote:

>>  2 - That current host functionality is deficient, since hosts
>>      themselves are incapable of providing multihoming, TE and
>>      portability.
>>
>>      Assumes: 1 - The routing system is not capable of, or should
>>                   not be required to, provide these things, at
>>                   least to such numbers of end-user networks.
>>
>>               2 - Hosts should be required to do this for
>>                   themselves.
>>
>> Your current definition of the root cause of the routing scaling
>> problem:  http://bill.herrin.us/network/rrgarchitectures.html (5th)
>> is entirely along the lines of 2 above.  However the underlying
>> assumptions are not stated.
> 
> Hi Robin,
> 
> I don't see it. I describe deficiencies with the protocol where
> functionality improperly overlaps. I make no claims in item #1 about
> where in the system the problem should be solved or whether tackling
> it head on is the best choice.
> 
> If you feel that the language is biased towards host-level changes,
> I'm open to making small tweaks to remove that bias. What would you
> suggest?

I can't think of a small tweak which also presents the problem from
the point of view that multihoming, TE and portability (or some
alternative method of painless ISP choice) can and should be provided
by the network, rather than by hosts.

My attempt to rewrite your section is to insert the following text
between the "Item 1: Routing table size problem, root cause:" heading
and your current text.

I make certain statements about RRG consensus, based on my perception
of it.  I guess everything I suggest below needs to be evaluated and
debated.

  Regards

    - Robin

The routing scaling problem involves addressing and the provision of
multihoming, inbound traffic engineering (TE) and portability of
address space for end-user networks.

"Portability of address space" is one term for the problem of not
being able to select another ISP, without the unreasonable costs and
disruption of having to renumber the network.  Currently, the only
solution to this problem is portability, in the form of one or more
PI prefixes.

An alternative position is that this problem can and should be solved
by making it easy to quickly and reliably renumber networks.

A critique of this is that automated renumbering is unlikely to ever
meet the needs of end-user networks, due to factors such as:
addresses appearing in config files which are hard to identify and to
securely and automatically change; addresses appearing in various
forms in hosts and routers outside the network; stability problems in
any network undergoing any kind of renumbering; and difficulties
securely and reliably testing renumbering, without causing disruption.

No techniques come close to making "painless renumbering" possible
for IPv4, and attempts to achieve this for IPv6 are still far from
sufficient to convince most administrators of end-user networks of
any significant size that automated renumbering and PA space suits
their needs.

The routing scaling problem we are trying to solve is due to a
mismatch between the increasingly large numbers of end-user networks
which want or need multihoming, TE and/or address portability and the
capacity of the current Internet architecture to provide this.

At present, the only way of providing any of these, with either IPv4
or IPv6, is for each end-user network to have one or more PI prefixes
for each of its one or more sites.  The current interdomain routing
system, which uses BGP, requires every DFZ router to maintain state,
and engage in communications with its neighbours, for every
advertised prefix, including each PI prefix of any end-user network.
Generally, the FIB needs an entry for each such PI prefix too.
Most concern is about the scaling difficulties of the BGP control
plane, rather than the FIB of each router.  The scaling problem is
due to the increasing number of end-user network PI prefixes placing
unacceptable demands on CPU and RAM resources.  Furthermore, the BGP
network is widely regarded as becoming increasingly unstable and
unresponsive to network outages, as the number of such prefixes grows.
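
The per-prefix burden described above can be illustrated with a toy
model.  This is only a sketch under stated assumptions, not a
measurement: the function name, the prefix counts, and the assumption
of one RIB entry per advertised prefix per BGP neighbour are all
hypothetical simplifications.

```python
def dfz_router_state(isp_prefixes, pi_prefixes, bgp_neighbours):
    """Toy model of the state one DFZ router holds.

    Assumes (hypothetically) one RIB entry per advertised prefix per
    BGP neighbour, and one FIB entry per advertised prefix.
    """
    advertised = isp_prefixes + pi_prefixes
    return {
        "rib_entries": advertised * bgp_neighbours,
        "fib_entries": advertised,
    }


# Illustrative figures only: the same router as end-user PI prefixes grow.
small = dfz_router_state(isp_prefixes=50_000, pi_prefixes=200_000,
                         bgp_neighbours=4)
large = dfz_router_state(isp_prefixes=50_000, pi_prefixes=10_000_000,
                         bgp_neighbours=4)
```

In this model the ISP prefixes stay constant, so all of the growth in
both tables comes from end-user PI prefixes, and every DFZ router
carries that growth.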

Also, the practice of achieving TE by dynamically changing the
advertisement of these prefixes places burdens on potentially all DFZ
routers.

In addition to the technical concerns just mentioned, there is a
fundamental problem of the beneficiaries of the increasingly large
number of end-user PI prefixes (the end-user networks themselves) not
paying for the costs they impose on others: every ISP which runs one
or more DFZ routers.

There is consensus in the RRG that the desires and needs of end-user
networks for multihoming and TE are reasonable.  However, opinions
vary as to how much need there is for these in the smallest networks,
such as a residential or SOHO service via DSL / cable modem / fibre /
WiMax etc. or a mobile device, via wired Ethernet and one or more
radio links.

There is less consensus that the desire or need for portable address
space is reasonable, but it is clear that many or most substantial
end-user networks need freedom of choice of ISP, and that they
believe there is no method of achieving this other than portable
address space.

There is little consensus in the RRG on how many end-user networks
will want or need multihoming, TE and portability in the future, but
it is agreed that a scalable routing solution, to be successful,
should scale to very large numbers of end-user networks, such as
hundreds of millions.  There is a widely held belief that the
scalable routing solution should ideally scale well to a situation
where billions, perhaps 10 billion, separate mobile devices each
constitute an end-user network, with multihoming, TE and address
portability - though this scenario could not develop in IPv4.

There is no clear consensus that a scalable routing solution needs to
appeal to every type and size of end-user network.  For instance, if
some of the larger networks continued with their current PI
practices, the scaling problem could probably be solved as long as
most or all of the medium sized and smaller networks adopted the new
system.  A critique of this is that the new system needs to be
attractive, in the short term, for the vast majority of end-user
networks of all sizes, for reasons including it being necessary to
attract smaller end-user networks which aspire to become "large" in
the future.  So a perception that the scalable routing solution is
not suitable for some class of end-user networks, such as the
largest, will deter many other smaller networks from adopting it.

There is complete consensus that we have no power to force adoption
of the scalable routing solution - and that we need to make it
attractive in the short term, for solving end-user networks'
immediate needs, irrespective of how concerned they are about
worsening or solving the routing scaling problem.

There is no consensus on how rapidly IPv6 will be adopted, or about
the urgency in solving the routing scaling problems for IPv4 or for
IPv6 - which is a long way from having such a problem.  Ideally, one
system would be optimal for both Internets.  However, there are
significant differences between the two networks which might result
in differences between the optimal solutions for each.

The statement of the problem can be developed further in at least two
generally incompatible ways - by locating the problem in the routing
system (1), or in the hosts (2).

1 - That the mismatch is due to a deficiency in the interdomain
routing system which can and should be corrected.  This is based on
the position that it is impractical and/or undesirable to expect
hosts to implement multihoming, TE and address portability (or
something equivalent which makes it easy to choose another ISP).

Reasons for this position include: not wanting to increase the
minimum complexity of hosts; the belief that multihoming, TE and
portability are responses to network-centric changes, and that they
should be handled in the "network" (DFZ, ISP routers and end-user
network PE routers), rather than in hosts or end-user network edge
routers; concern about the extra traffic costs for mobile devices due
to the extra management traffic if hosts have to implement
multihoming, TE and portability; and concerns about further delays
and robustness problems with user communications caused by an
increasing dependency on unreliable carriage of this management
traffic, particularly over mobile radio links.  There are at least
two approaches to defining the problem in this "network" sense, each
with its own implied solution:

1a - The problem is the BGP system's inability to handle millions, or
perhaps billions, of PI prefixes.  Therefore, the solution is to
improve BGP, or to replace it with something else which can scale
well to such numbers.  However, there is complete consensus that
firstly there is no potential for improving the BGP system
significantly in this respect and secondly that there is no prospect
for introducing a replacement for BGP without prohibitive levels of
disruption.

1b - The deficiency is in the interdomain routing system itself, due
to it currently consisting only of the BGP system.  Consequently, the
solution is to progressively augment the BGP system with another
architectural system which handles many, most or ideally all end-user
networks, providing them with portable address space suitable for
multihoming and TE in a manner which burdens the BGP system very
lightly per end-user network prefix, or ideally not at all.  This
gives rise to the core-edge separation class of scalable routing
solutions.  These are mainly based on map-encap (LISP, APT and TRRP).
One core-edge separation scheme (Six-One Router, for IPv6 only)
uses translation, and another (Ivip) uses different types of
forwarding for IPv4 and IPv6, using modified DFZ routers, with the
option of introducing the system with map-encap if it must be
introduced before all the DFZ routers can be upgraded.
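
The map-encap idea behind 1b can be sketched roughly as follows.  This
is a minimal illustration, not the mechanism of LISP, APT, TRRP or any
other proposal: the names ITR and ETR (ingress and egress tunnel
router) are borrowed from LISP terminology, and the mapping table, the
dict-based "packets" and all addresses (documentation prefixes) are
assumptions made for the example.

```python
import ipaddress

# Hypothetical mapping from end-user network prefix to ETR address.
# All addresses come from the RFC 5737 documentation prefixes.
MAPPING = {
    ipaddress.ip_network("192.0.2.0/28"): ipaddress.ip_address("198.51.100.1"),
    ipaddress.ip_network("192.0.2.16/28"): ipaddress.ip_address("203.0.113.1"),
}


def lookup_etr(destination):
    """Return the ETR address for a destination, or None if unmapped."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in MAPPING if addr in net]
    if not matches:
        return None
    # Longest-prefix match among the mapped end-user prefixes.
    best = max(matches, key=lambda net: net.prefixlen)
    return MAPPING[best]


def itr_encapsulate(packet):
    """Wrap a packet in an outer header addressed to the ETR.

    The BGP system only needs a route to the ETR's address, which sits
    in an ISP prefix; it never sees the end-user prefix itself.
    """
    etr = lookup_etr(packet["dst"])
    if etr is None:
        return packet  # not a mapped end-user address: forward as usual
    return {"outer_dst": str(etr), "inner": packet}


def etr_decapsulate(tunnelled):
    """Strip the outer header and recover the original packet."""
    return tunnelled["inner"]
```

Under these assumptions, multihoming or a change of ISP amounts to
pointing a mapping entry at a different ETR address, with nothing
advertised in BGP changing.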

2 - That the mismatch is not due to a deficiency in the interdomain
routing system, but due to the current architecture not expecting
hosts to provide multihoming, TE and portability on their own, with a
routing system similar or identical to today's BGP system.

The argument from this point of view is that it is more desirable for
hosts to provide the functionality required for end-user network
multihoming, TE and portability, than to add a new architectural
layer to the interdomain routing system.

The solutions suggested by this point of view involve changes to host
stacks and usually to APIs and applications.  These solutions
generally work only for a modified form of IPv6, or for a clean-slate
design with fresh protocols - which means that a mass migration to
this new Internet needs to be achieved before such time as the
routing scaling problem in the IPv4 Internet becomes "unacceptable",
however this is defined.

These solutions aim to equip all hosts with the capabilities to
communicate reliably - especially in the presence of failures of
links to one of the multiple ISPs - while using two or more
addresses, each drawn from one of the PA prefixes provided by two or
more upstream ISPs.  This class of solutions solves the routing
scaling problem by ensuring that all participating end-user networks
- ideally all such networks of all sizes - are able to meet their
needs for multihoming, TE and easy selection of new ISPs (termed
"portability" in the alternative viewpoint which considers this can
only be achieved with portable address space) with the only prefixes
being handled by BGP being those of the ISPs.
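
The host-side behaviour this class of solutions relies on can be
sketched as below.  This is a deliberately simplified illustration,
not any specific proposal's protocol: the ISP labels, the addresses
(from the IPv6 documentation prefix) and the `link_up` status
dictionary are all hypothetical, and real schemes also have to detect
failures and keep existing sessions alive.

```python
import ipaddress

# Hypothetical host with one address from each upstream ISP's PA prefix
# (addresses from the RFC 3849 documentation prefix), in preference order.
HOST_ADDRESSES = {
    "isp_a": ipaddress.ip_address("2001:db8:a::10"),
    "isp_b": ipaddress.ip_address("2001:db8:b::10"),
}


def usable_addresses(link_up):
    """Return the host's addresses whose upstream link is reported up."""
    return [addr for isp, addr in HOST_ADDRESSES.items()
            if link_up.get(isp, False)]


def pick_source(link_up):
    """Pick the preferred usable source address, or None if all links are down."""
    candidates = usable_addresses(link_up)
    return candidates[0] if candidates else None
```

Because both addresses sit inside ISP prefixes, BGP carries nothing
specific to this end-user network; the cost is that every host has to
run logic of this kind.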

A description of the problem in these terms is:

_______________________________________________
rrg mailing list
[email protected]
https://www.irtf.org/mailman/listinfo/rrg