Short version: I think the existing division of labour between
the network (including the routing and addressing
system) and hosts is good. I think we should
add architectural elements to the network to solve
its scaling problem when millions of end-user
networks need portability, multihoming and TE.
Here is a list of objections to any routing scaling
solution which pushes work relating to multihoming,
TE and changing ISPs onto hosts. This is in
addition to the objections regarding the
impracticality of introducing such upgrades to the
great majority of host stacks and applications.
Extra host traffic
Host operation more affected by packet loss
Increased cost and reliability problems
for mobile hosts operating over wireless
Extra complexity in the host
Slowness of response to multihoming
Flurry of activity for each multihoming outage
General principle of solving a problem close
to its origin
Complexifying mobile-IP
I haven't yet listened to the RRG meeting. There
may well be other objections.
My messages:
http://www.irtf.org/pipermail/rrg/2008-November/000215.html
http://www.irtf.org/pipermail/rrg/2008-November/000228.html
mention two types of objections to a host-based scalable routing
solution:
1 - The impossibility of introducing it except perhaps over
a very long time period, such as 15 years.
2 - Inherent problems with burdening hosts with the responsibility
for coping with things which occur in the routing system.
Below I concentrate on the second class of objections.
I assume that the proposed host-based routing scaling solution involves:
1 - All applications use a purely hostname API to the TCP/IP
stack, which will establish sessions - or allow something
equivalent to send and forget UDP-like communications -
no matter what changes occur to the IP address(es) of this
host, or of any of the hosts it is communicating with.
2 - Multihoming is achieved by the end-user network having
two or more upstream ISPs, with each host having at least
one IP address from the address space provided by each ISP.
3 - In order to be able to communicate, every host needs an entry
in the DNS, which will probably be a souped up version of
the current DNS system.
4 - The solution is only for IPv6 or something else - not IPv4.
5 - Perhaps some system within end-user networks so that every
host can quickly and securely become aware of (by being
securely informed, or by the host polling the system):
a - ISPs being added or deleted from the current ISP list.
b - Therefore, whether the host should gain or lose one or
more of its IP addresses.
c - Real time information about whether each ISP is currently
to be used - and therefore which of the host's current
IP addresses should be used.
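The hostname-only API assumed in point 1 can be sketched as follows. This is a minimal illustration, not any defined API; the class, the fake DNS table and all names and addresses in it are my invention:

```python
# Hypothetical hostname-based session API: applications never see
# IP addresses; the stack resolves (and can re-resolve) names.

class Session:
    """Sketch of a session pinned to a hostname, not an address."""
    def __init__(self, peer_hostname: str):
        self.peer_hostname = peer_hostname   # the stable identifier
        self.current_addr = None             # may change at any time

    def resolve(self, dns_lookup):
        # dns_lookup is any callable mapping hostname -> address list
        addrs = dns_lookup(self.peer_hostname)
        self.current_addr = addrs[0] if addrs else None
        return self.current_addr

def fake_dns(name):
    # Stand-in for a souped-up DNS: per point 2, host B has one
    # address from each of its network's two upstream ISPs.
    table = {"b.example.net": ["2001:db8:a::b", "2001:db8:b::b"]}
    return table.get(name, [])

s = Session("b.example.net")
print(s.resolve(fake_dns))   # the stack, not the app, picked an address
```

The point of the sketch is that the application only ever holds "b.example.net"; which address is in use is hidden state inside the stack.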
I will refer to the current host as "A" and to hosts it is
communicating with as "B", "C" etc.
Host-based solutions increase host traffic
------------------------------------------
Currently, hosts assume that their one or more IP addresses are
stable (within any limit set by a DHCP lease) and make the same
assumptions about other hosts, potentially subject to caching times
for DNS replies.
Any host-based routing scaling solution such as described above will
involve extra traffic to and from each host, including:
1 - More DNS requests and responses.
2 - DNS responses which are longer, such as due to them returning:
a - More IP addresses.
b - Extra information about IP addresses, such as priorities
for using them, individual caching times for each address
etc.
There may need to be more than one IP address with the
same priority - including at the top priority, which is
used when no failures are detected - to facilitate load
sharing across the links to one or more ISPs.
c - Any other information required by the new solution.
3 - Some user-data packets sent between hosts - probably just
the initial packet of each session - carrying the sending
host's hostname.
Hostnames are variable length and potentially very long.
4 - Some or all user data packets may also carry extra
information such as:
a - Protocol identifier.
b - Command, ACK and status bits to implement the protocol.
c - Any other information, such as caching times for
information provided or implied - for instance to tell
the destination host how long to cache the supplied
hostname with the source IP address of the current
packet.
d - Any other stuff to cope with the sending host having a
different address than the one which appears in the
source address of the packet when it reaches its
destination. For instance due to NAT or some mobile-IP
arrangements.
5 - Extra packets to and from hosts to implement the protocol -
for instance to request acknowledgement of what would
be a UDP packet with current protocols.
This includes any ICMP destination unreachable messages
resulting from the host sending a packet to an IP address
which is no longer reachable.
It also includes the possibility that host A may probe host
B on one or more of its IP addresses to check before sending
a large quantity of user data to B.
6 - Extra packets to and from other systems, not counting the
correspondent hosts and DNS. For instance, to DHCP servers
or whatever it is by which A obtains its one or more IP
addresses and potentially learns about which of these should
be used at present.
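Point 2 above can be made concrete. Here is a sketch of the richer DNS-style response, with per-address priorities and individual caching times; the record layout, field names and addresses are my assumption, not any defined format:

```python
# Hypothetical "souped up" DNS response: each address carries a
# priority and its own caching time, so the reply is necessarily
# longer than a plain AAAA response.

import random

response = [
    # (address, priority, cache_seconds) - all values illustrative
    ("2001:db8:a::b", 1, 30),
    ("2001:db8:b::b", 1, 30),   # same priority -> load sharing
    ("2001:db8:c::b", 2, 300),  # backup via a third ISP
]

def choose_address(records):
    """Pick among the highest-priority (lowest number) addresses,
    spreading load at random across equal-priority entries."""
    best = min(p for _, p, _ in records)
    candidates = [a for a, p, _ in records if p == best]
    return random.choice(candidates)

print(choose_address(response))  # one of the two priority-1 addresses
```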
The extra traffic is clearly a cost burden. However, there are
further reasons to want to minimise or eliminate extra traffic to and
from hosts.
Extra host traffic makes operation more vulnerable to packet loss
-----------------------------------------------------------------
The more the proper operation of A depends on its ability to send and
receive these extra packets, or longer packets, the more any lost
packets will contribute to:
1 - Slower operation.
2 - Incorrect operation.
3 - Difficulties managing and debugging A and its interactions with
B, C etc.
Extra traffic is extremely undesirable for wireless devices
-----------------------------------------------------------
Devices relying on a wireless connection suffer from high financial
costs of sending and receiving packets. Packet losses are high, and
lost packets generally mean some combination of slow operation and/or
attempts to send and receive still more packets.
Slow operation leads to one or both ends retrying the communication,
perhaps with other hosts, or via other IP addresses to the same hosts.
So the slowness and unreliability of mobile connections has a general
multiplying effect on the costs, slowness and flakiness of the entire
system - including the generation of more traffic to try to cope with
the lost packets or the packets which are not actually lost, but
whose ACKs were lost.
With current protocols, the usability of a wireless device degrades
as a certain function X of packet loss. With new host requirements
and consequent dependence on extra packets, the degradation of
usability due to packet loss will be at least X and will have some
additional component Y, depending on many factors.
Two such wireless devices with packet loss and delay problems will
suffer still worse losses of usability as a result of the new
host-based protocols.
Extra complexity in the host
----------------------------
Generally, extra complexity anywhere is a bad thing. So it is with
hosts, which are the most numerous things on the Net. (However, many
of them have immense CPU and RAM resources, so this complexity can
have almost no incremental cost in many instances.) The simpler host
stacks and applications need to be, the less effort has to go into
writing software for, configuring, managing etc. individual hosts.
I am advocating that the addressing, TE and multihoming problems be
solved in the routing system, not pushed out to each host to have to
cope with. So I am suggesting that it would be less complex, or that
it would be a healthier, less expensive, more reliable, form of
complexity, to solve these problems with enhancements to the routing
system, rather than to the hosts.
This is a general principle, but it is particularly important for
very lightweight hosts, such as sensors, small battery operated
devices etc. It is also important for hosts for which it is
difficult or impossible to upgrade their stack and/or applications.
Complexity means the likelihood of bugs, security vulnerabilities,
unanticipated interactions in sub-optimal circumstances, necessity to
update firmware or stack and applications to future versions of the
complex protocols etc.
Hosts which can't easily have their stacks or applications upgraded
include:
1 - Small, battery operated devices. They may lack the
sophistication and user interface to allow secure updating of
their stack and application code. Also, for security reasons,
they may be built to be non-upgradeable.
2 - Older devices which are still useful, but for which no-one is
upgrading the stack or applications.
3 - Devices where there is insufficient awareness, motivation or
expertise on the part of the end-user to upgrade their
stack and/or applications.
4 - Devices which have insufficient bandwidth to receive the
upgrades - for instance, small, battery operated sensor
devices.
5 - Devices which for whatever reasons don't use FLASH or any
other reprogrammable firmware storage.
Slowness of response to multihoming
-----------------------------------
In a host-based solution as described above, each sending host A has
to figure out for itself what is happening if the IP address it is
using for B is no longer usable. This could take quite a while, and
the time it takes starts from whenever A sends a packet to B via an
IP address which no longer works.
If the fault occurred at T=0 and A did not need to send a packet to B
until T=60, then any 10 second (for instance) process by which A
discovers it must use another IP address for B will complete at T=70.
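The per-host detection-and-failover process, and the timing in the example above, can be sketched as follows. The probe function and the 10 second detection figure are the assumptions used in the text; everything else is illustrative:

```python
# Sketch of the outage recovery every sending host must perform
# in the host-based scheme: the discovery clock only starts when
# A next sends to B, not when the fault occurs.

def failover(addresses, send_probe, now, detect_seconds=10):
    """Try each of B's addresses in turn; each failed attempt
    costs detect_seconds before A moves to the next address.
    Returns (working_address_or_None, time_of_completion)."""
    for addr in addresses:
        if send_probe(addr):
            return addr, now
        now += detect_seconds   # time burned discovering the outage
    return None, now

# Fault at T=0; A first sends at T=60; the ISP A path is dead.
dead = {"2001:db8:a::b"}
addr, t = failover(["2001:db8:a::b", "2001:db8:b::b"],
                   lambda a: a not in dead, now=60)
print(addr, t)   # A falls over to the ISP B address at T=70
```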
A similar problem occurs for the non-Ivip core-edge separation
schemes LISP, APT and TRRP: each ITR has to detect the
unreachability and decide to use another ETR address on its
own.
A potential advantage of these non-Ivip core-edge separation
techniques over the above host-based solution is that the one ITR
serving multiple sending hosts will discover the outage and learn to
cope with it within 10 seconds (for instance) of the first packet any
of the hosts sends after the outage. So packets sent after the ITR
adapts its encapsulation rule for this EID will not be subject to any
losses or delays, whereas they would in the host-based solution where
every sending host has to do its own outage detection and recovery
operation.
A potential advantage of the host-based solution over LISP etc. is
that the host may be able to detect that the packets it sent did not
arrive, and so will send the same packets to another IP address.
This reduces or eliminates data loss from the point of view of the
application, whereas with LISP etc. and with Ivip, ITRs do not
attempt to resend packets.
The big advantage of Ivip over the host-based solution and LISP etc.
is that in a properly administered system, the end-user network's own
monitoring system (probably run by a specialised company hired by the
end-user network) will detect the outage very quickly (within seconds
ideally) and will update the mapping (again, a few seconds).
Thereafter all ITRs will be sending user packets to an ETR which does
connect to the end-user network. This has the potential advantage
that no end-user packets will be delayed or lost. For instance, if
no hosts are sending to the network in the time it takes the
monitoring system to change the mapping, then no host suffers any
delay or loss of packets.
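The monitoring arrangement described above can be sketched as a simple loop. The probe and mapping-update calls are hypothetical stand-ins for whatever the hired monitoring company actually runs; the prefixes and ETR addresses are invented for illustration:

```python
# Illustrative Ivip-style monitoring step: a service hired by the
# end-user network probes reachability via each ETR and pushes ONE
# mapping change when the preferred path fails. probe() and
# push_mapping_update() are hypothetical stand-ins.

def monitor_step(micronet, etrs, probe, push_mapping_update, current):
    """etrs is an ordered preference list of ETR addresses.
    One mapping change redirects all ITRs; no per-ITR detection."""
    for etr in etrs:
        if probe(etr):                          # first reachable ETR wins
            if etr != current:
                push_mapping_update(micronet, etr)
            return etr
    return current                              # nothing reachable; keep as-is

updates = []
best = monitor_step("192.0.2.0/28",
                    ["203.0.113.1", "198.51.100.1"],
                    probe=lambda e: e != "203.0.113.1",   # ISP A link down
                    push_mapping_update=lambda m, e: updates.append((m, e)),
                    current="203.0.113.1")
print(best, updates)  # one update moves the whole micronet to ISP B's ETR
```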
Flurry of activity for each multihoming outage
----------------------------------------------
With Ivip, there is a limited flurry of activity. The monitoring
system detects the outage, reliably, by whatever criteria the
end-user network administrator chooses - and changes the mapping.
This involves activity in the mapping distribution system, but the
end-user network pays for this - it might cost a few cents. It also
involves activity in all full-database query servers, and those query
servers sending messages to any ITR which recently requested this
mapping (according to the caching time provided with the response to
that request).
With LISP etc. there is a flurry of activity, since all ITRs sending
packets to the ETR which is no longer reachable (or which can no
longer reach the end-user network) must individually discover this
and choose another ETR address from the set of RLOCs in the mapping
information.
With a host-based solution, the flurry is typically worse than with
LISP etc, since there are generally going to be more hosts involved
than there are ITRs in LISP etc. or Ivip.
For instance, if there are 1000 hosts, 10 of which are in each of 100
"sites" (end-user or ISP networks) sending packets to one or more
hosts in end-user network Z which has its link to ISP A fail, then
here are the three types of response:
Ivip: 1 mapping change to all full database query servers.
The query servers in 100 sites send messages to
the one or more ITRs in each site which are handling
the traffic - based on those ITRs having asked for the
mapping for Z recently (according to the caching
time the query server returned with the reply).
This can include ITR functions in sending hosts,
but not behind NAT.
In many cases, hosts will not suffer lost packets.
Except for those lost packets, hosts themselves are not
affected by the failure.
LISP etc. 100 or so ITRs discover on their own that the
ISP A ETR RLOC address shouldn't be used, so they
choose the next RLOC address in the mapping info.
Each is prompted to discover this only after one
or more of its sending hosts sends a packet.
Host-based 1000 hosts individually have to discover that the
ISP A address they have for their correspondent
hosts is inoperative, and so they send packets to
its ISP B address instead.
Every one of these hosts has extra state and required
extra or longer packets in order to be able to do
this. The outage discovery process begins for each
sending host whenever it sends the next packet.
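Using the figures above (1000 hosts, 100 sites, one failed ISP link), the number of independent outage-discovery events in each scheme can be tallied. This is a back-of-envelope count, not a protocol simulation, and it assumes roughly one active ITR per site:

```python
# Back-of-envelope count of independent outage-discovery events
# for the 1000-host / 100-site example in the text.

HOSTS = 1000
SITES = 100                     # roughly one active ITR per site

discovery_events = {
    "Ivip":       1,            # one mapping change, pushed to all ITRs
    "LISP etc.":  SITES,        # each ITR detects the dead RLOC itself
    "Host-based": HOSTS,        # every sending host detects it itself
}

for scheme, n in sorted(discovery_events.items(), key=lambda kv: kv[1]):
    print(f"{scheme}: {n} independent discovery event(s)")
```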
General principle of solving a problem close to its origin
----------------------------------------------------------
The root cause of the routing scaling problem can be summarised as
being related to difficulties coping with the one kind of thing, in
several settings.
The one kind of thing is:
The desire of the end-user network to use an alternative ISP
to the one they are currently using.
This has several manifestations:
1 - Long-term. The end-user network wants to stop using
one ISP and to use another instead. This can be done today
by two methods:
PA - the end-user network needs to renumber its
network.
PI - the BGP interdomain core (DFZ) needs to cope with the
changed advertisement of one or more PI prefixes.
2 - Short-term for multihoming.
PI - interdomain core needs to cope with change, propagate
it quickly and reliably.
PA - can't be done now, but the host-based scalable routing
system described above can make multihoming work with
each end-user network having a PA prefix from its
2 or more upstream ISPs.
3 - Short term for inbound Traffic Engineering.
As for short-term multihoming. However, while multihoming
failures are presumably quite rare, there is no upper limit
on how often an end-user network might want to change
the incoming ISP for all or part of its network in order
to do TE for load balancing.
Note that none of these scenarios has anything to do with hosts.
They are all concerned with network-related matters:
1 - What set of address space (or multiple sets) will the
network (routers and hosts) use?
2 - To what extent will this range be stable, when there is a
need to select a new ISP in the long-term, or for short-term
reasons such as multihoming and TE?
The stability of IP addresses for the network in general, and for
individual hosts, is also important in terms of trust, security,
spam-prevention etc. since at present, IP addresses and their
currently assumed stability play an important part in these
mechanisms. This includes ACLs, spam black-hole lists etc.
These matters - which generally concern trust, security etc. at the
level of the end-user organisation - do not have any natural
connection with individual hosts.
Complexifying mobile-IP
-----------------------
A host-based solution such as I described above needs to be
implemented on all hosts, no matter whether they are running on an
end-user network or not. This solution implies two classes of
address space:
IP addresses which are part of provider networks and are not
used in any way by any other network.
IP addresses which are used by hosts in end-user networks, but
which are always part of the address space of an ISP.
(I am ignoring transitional arrangements where some end-user
networks have their own PI space.)
A network-based solution, such as the core-edge separation systems
LISP, APT, Ivip and TRRP, creates two classes of address space, again
ignoring the current pattern of some end-user networks having their
own PI prefixes advertised in the DFZ.
1 - IP addresses which are outside the mapping system. Assuming
these are advertised in BGP, they must be prefixes used
only by ISPs, or by their conventional PA end-user network
customers. (ETRs must use this type of address.)
2 - A new class of address space I call Scalable Provider
Independent (SPI). To use Ivip terminology, all SPI space
is within a Mapped Address Block (MAB) prefix, which is
advertised in the DFZ. Each MAB is split into many micronets
for potentially large numbers of end-user networks, with each
end-user network having 1 or many micronets. All packets sent
to an SPI address (except perhaps those whose source and
destination is entirely within the one end-user network) need
to go through an ITR, which will send them to the correct ETR.
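The division of a MAB into micronets can be sketched as a longest-prefix match from destination address to ETR, as an ITR would perform it. The prefixes and ETR addresses below are invented for illustration, and a real ITR would use a far faster lookup structure:

```python
# Sketch of an ITR's mapping lookup: SPI space sits inside MAB
# prefixes advertised in the DFZ; each MAB is divided into
# micronets, each mapped (Ivip-style) to exactly one ETR address,
# so the ITR makes no choice of its own. Values are illustrative.
import ipaddress

micronet_to_etr = {
    "192.0.2.0/28":  "203.0.113.1",   # end-user network Z, via ISP A
    "192.0.2.16/28": "198.51.100.1",  # another network, via ISP B
}

def lookup_etr(dst):
    """Longest-prefix match of dst against the micronet table."""
    dst = ipaddress.ip_address(dst)
    best = None
    for prefix, etr in micronet_to_etr.items():
        net = ipaddress.ip_network(prefix)
        if dst in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, etr)
    # None means the address is not SPI space: forward conventionally.
    return best[1] if best else None

print(lookup_etr("192.0.2.5"))   # encapsulated to ISP A's ETR
```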
(For simplicity, I am leaving Six/One Router out of this analysis.)
Existing mobile IP techniques will work fine with the core-edge
separation schemes, since at any point in an ISP or end-user network
which uses the new scheme, IP addresses are as stable as they are today.
However, with a host-based solution as described above, the
foundations of current mobile-IP techniques can no longer be assumed.
Present mobile-IP techniques are involved enough, with a mobile IP
address created within the stack, implemented by various host- and
router-based techniques on the basis of one or more stable
"care-of" addresses. But to make all these "care-of"
addresses, and all the addresses of the routers and other parts of
the total mobile-IP system, also subject to the unstable IP address
arrangements required by the above host-based solution,
sounds like a complete nightmare to me.
If such a host-based solution was adopted, I think there would be two
approaches to implementing mobile-IP:
1 - Completely rewrite the mobile-IP protocols, greatly
complexifying them by replacing their current roots in
care-of IP addresses, home-agent IP addresses etc. - and
whatever cryptographic techniques are built on those
assumed to be stable IP addresses - with something new
based on the new hostname-based API and stack.
2 - Leave mobile-IP protocols as they are, but restrict their
use to networks where hosts and routers have stable IP
addresses: to ISP networks and to their single homed PA
customer networks.
This precludes the possibility of a home-agent router
being placed in any end-user network which is multihomed
with the host-based routing scaling solution.
I think 2 above is completely unworkable. Mobile-IP is a
tangled-enough prospect as it is, making various demands upon local
networks where mobile hosts or home agent routers attach. To then
make mobile-IP inapplicable in end-user networks would destroy much
of its utility. This would make it unusable in any university or
corporate network - yet those large end-user networks, with their
WiFi and other types of WAN and physically mobile devices are just
where mobile-IP needs to work well.
Option 1 sounds like a complete nightmare. I think people who are
seriously contemplating uprooting all hosts from their stable IP
addresses and grafting them on to a new rootstock with a
hostname-based foundation should consider how they are going to
convince the mobile-IP folks to rewrite all their stuff.
What to do?
-----------
I think the existing division of labour is good:
The network (including the routing and addressing system) provides
hosts with a reasonably stable IP address.
The host doesn't have to worry about things which happen in the
network, such as changes to the core routing system, outages
in ISPs and in links to ISPs, or (ideally) an end-user network
choosing to use another ISP.
This minimises the complexity in the hosts and the traffic to and
from the host.
The problem is that the current routing system can't provide
multihoming and full address portability for all the end-user
networks which want it. Since we all agree BGP can't be souped up to
cope with tens or hundreds of millions of PI end-user networks, we
need to do something.
I suggest we keep the existing division of labour and add some new
architectural structures to the routing and addressing system so it
can provide multihoming, TE and portability in a scalable fashion to
the potentially hundreds of millions (billions maybe?) of end-user
networks which want and need it.
I suggest the best approach is a core-edge separation scheme based on
ITRs, ETRs and a mapping system. Specifically, I suggest:
http://www.firstpr.com.au/ip/ivip/
for reasons including:
Simpler ITRs and ETRs, since they are not required to do any
reachability testing, or choose between multiple ETR addresses.
Real time control of mapping enables and requires each end-user
network to control the mapping. This means they can use whatever
decision making processes, reachability testing etc. they like
to control the behaviour of ITRs. This is a modular approach,
in contrast to the monolithic integration of reachability
testing and multihoming service restoration decision making
in LISP, APT and TRRP.
Ivip has minimal encapsulation overhead and has a scheme for
coping with the resulting PMTUD problems. If DFZ routers
can be upgraded - probably with a firmware update - then
Ivip has two "forwarding" based approaches which involve
no overheads or PMTUD problems.
Also, if we solve the routing scaling problem with a system such as
Ivip, a powerful new form of mobility becomes possible, with
genuinely mobile IP addresses, no matter what sort of network the
mobile device uses for its one or more care-of addresses:
http://www.firstpr.com.au/ip/ivip/#mobile
- Robin
_______________________________________________
rrg mailing list
[email protected]
https://www.irtf.org/mailman/listinfo/rrg