On Sat, 2008-02-16 at 14:18 +1100, Robin Whittle wrote:
> APT Hybrid push-pull. Full database ITRs and query servers
> are integrated into a Default Mapper. All
> participating ISPs have a few of these, with a slow
> push scheme (separate BGP-like flooding system). No
> support for non-upgraded networks (Ivip's anycast ITRs
> in the core == LISP's Proxy Tunnel Routers) - so it is
> not incrementally deployable.
Hey Guys,
So there have been some questions about how APT (and other proposals)
will be incrementally deployed. We have devised a plan for incremental
deployment that we would like you to consider.
The ideas in this plan may be applicable to other proposals that do not
require edge-network modifications.
Please hit us with any comments and questions. We would love feedback
on how ridiculous or perfect you think our ideas are.
Dan Jen and Michael Meisel
APT Incremental Deployment
On the Internet, one simply cannot set a flag day when all sites will switch to
a new design, no matter how great an advance the design offers. As a result,
APT explicitly assumes incremental deployment. We offer backwards compatibility
for sites that are slow to adopt APT and also offer incentives for sites that
do adopt it.
Before we delve into the details, we define the following terms. If a transit
network has adopted APT, it is called an APT network. Otherwise, it is called a
non-APT network. A topologically connected set of APT networks forms an APT
island. We assume that unconnected APT islands do not exchange mapping
information with each other.
Edge Networks
Because the APT design focuses on placing new functionality in transit
networks, all changes go virtually unnoticed by edge networks. The only new
task for an edge network is to provide traffic engineering information to its
providers. If necessary, the providers themselves can generate this traffic
engineering information, and APT can be incrementally deployed with NO CHANGES
TO ANY EDGE NETWORK.
Not only does APT incur minimal cost at edge networks, it also provides
incentives for these networks to connect to APT providers. The mapping entries
provide a powerful tool for traffic engineering. Currently, an edge network
may use AS-path padding or address de-aggregation for load balancing. However,
these techniques provide only rudimentary control over which route is selected
by a traffic source. In APT, an edge network can clearly specify all
providers and traffic preferences. This explicit approach to managing inbound
traffic could greatly simplify existing practices.
Transit Networks
All transit networks will continue to use BGP to reach transit prefixes, even
if all of them adopt APT. However, we note the following differences from
today's Internet. First, APT networks do not run BGP sessions with their
customers in edge space. Second, inside an APT island, the APT networks
exchange their mapping information with each other. This allows their default
mappers to maintain mapping information for the entire island. Let's call this
table the Island Mapping Table. Third, the BGP tables in APT networks do not
contain prefixes that are already in the island mapping table. As APT islands
grow larger and merge with each other, this BGP table can be gradually reduced
and eventually contain only transit prefixes. A smaller BGP table provides an
incentive for transit networks to deploy APT.
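
To make the third difference concrete, here is a minimal sketch of how the
BGP table shrinks once prefixes are covered by the island mapping table. It
is only an illustration: the function name, the example prefixes, and the
exact-match comparison are our assumptions here, not part of the APT design.

```python
import ipaddress

def prune_bgp_table(bgp_prefixes, island_mapping_prefixes):
    """Drop BGP prefixes that already appear in the island mapping table.

    Assumption: prefixes are compared by exact match only.
    """
    mapped = {ipaddress.ip_network(p) for p in island_mapping_prefixes}
    return [p for p in bgp_prefixes if ipaddress.ip_network(p) not in mapped]

# Hypothetical tables, using documentation prefixes.
bgp_table = ["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"]
island_mapping_table = ["192.0.2.0/24"]

remaining = prune_bgp_table(bgp_table, island_mapping_table)
```

As islands merge, the mapping table grows and the list returned here gets
shorter, which is the incentive described above.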
 ________________                            ________________
|  APT Island 1  |  BGP     ______   BGP    |  APT Island 2  |
|     ______     | Routes  / ISP4 \ Routes  |     ______     |
|    / ISP1 \<===|========>\______/<========|===>/ ISP3 \    |
|    \__,___/    |            /\            |    \__,___/    |
|_______|________| BGP Routes ||            |_______|________|
        |                   __\/__                  |
     ___|___               / ISP2 \              ___|___
    / Site1 \              \______/             / Site3 \
    \_______/                 /\                \_______/
                   BGP Routes ||
                           ___\/__
                          / Site2 \
                          \_______/
We now describe how to enable communication between APT and non-APT networks,
or between two different islands, using the topology in the figure above.
Suppose edge networks Site1 and Site2 are customers of ISP1 (an APT network)
and ISP2 (a non-APT network), respectively. First, how can Site1 reach Site2?
When an ITR in ISP1 receives a packet from Site1, it attempts to map the
destination address in Site2 to an ETR address. Assuming no cached mapping is
present for the destination, the packet is forwarded to a default mapper (DM).
Since the destination is not attached to an APT network, the DM will be unable
to find a mapping entry. In this case, the packet is forwarded toward the
destination using the forwarding table generated by BGP. A special-use mapping
entry is returned to the ITR so it can forward future packets using the BGP
route. Note that we assume ITRs and DMs have access to BGP routing information.
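
The lookup steps above can be sketched roughly as follows. This is a toy
model, not our implementation: the class names, the USE_BGP marker standing
in for the special-use mapping entry, and the per-destination cache keying
are all assumptions made for illustration.

```python
import ipaddress

# Hypothetical marker for the special-use entry meaning "follow the BGP route".
USE_BGP = "USE_BGP"

class DefaultMapper:
    def __init__(self, island_mapping_table):
        # Maps an edge prefix to an ETR address (both as strings).
        self.table = {
            ipaddress.ip_network(prefix): etr
            for prefix, etr in island_mapping_table.items()
        }

    def resolve(self, dst):
        """Return the ETR for dst, or USE_BGP if no mapping entry covers it."""
        addr = ipaddress.ip_address(dst)
        matches = [net for net in self.table if addr in net]
        if not matches:
            return USE_BGP
        best = max(matches, key=lambda net: net.prefixlen)  # longest match
        return self.table[best]

class ITR:
    def __init__(self, dm):
        self.dm = dm
        self.cache = {}  # assumption: cache is keyed by destination address

    def forward(self, dst):
        """Return (where the entry came from, forwarding action) for dst."""
        if dst in self.cache:
            source, entry = "cache", self.cache[dst]
        else:
            source, entry = "dm", self.dm.resolve(dst)
            self.cache[dst] = entry  # the DM returns the entry to the ITR
        action = ("bgp", None) if entry == USE_BGP else ("tunnel", entry)
        return source, action

# Hypothetical island: one mapped site prefix behind ETR 198.51.100.1.
itr = ITR(DefaultMapper({"192.0.2.0/24": "198.51.100.1"}))
to_apt_site = itr.forward("192.0.2.7")
first_to_non_apt = itr.forward("203.0.113.9")
second_to_non_apt = itr.forward("203.0.113.9")
```

The second packet to the non-APT destination is answered from the ITR cache
and follows the BGP route without involving the DM again.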
Second, we describe how Site2 can reach Site1. Essentially, we need to
convert the mapping information for Site1 into a BGP route and inject it into
non-APT networks. To accomplish this, those transit networks in ISP1's APT
island that have non-APT neighbors must advertise Site1's prefix to their
non-APT neighbors through BGP. Since the default mappers in those networks
maintain a complete island mapping table, they can do the conversion -- the
converted BGP route will contain only the announcing DM's own AS number (the AS
where traffic will enter the island), ISP1 (the AS where traffic will exit the
island towards Site1) and Site1 (the origin). The details of the path taken
within the APT island are not relevant to the BGP routers in the legacy system.
In addition, the DMs will distribute these BGP routes to their TRs, and the
TRs will advertise these routes to their non-APT neighbors in accordance with
routing policies. Eventually, Site2 will receive the BGP route to Site1.
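
As a rough sketch of the conversion step, the snippet below builds the AS
path described above from a mapping entry. The record layouts, the field
names, and the AS numbers (taken from the documentation range) are
assumptions for illustration; in particular, we assume here that the mapping
entry carries the site's origin AS and its APT provider's AS, and that the
provider AS is not repeated when the announcing AS is itself the provider.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MappingEntry:
    prefix: str
    site_as: int      # origin AS of the edge site (assumed to be known)
    provider_as: int  # APT network where traffic exits the island

@dataclass(frozen=True)
class BgpRoute:
    prefix: str
    as_path: tuple

def mapping_to_bgp_route(entry, announcing_as):
    """Build the route a DM hands to its TRs for non-APT neighbors.

    The path lists the entry AS, the exit AS, and the origin; the path
    taken inside the island is deliberately left out.
    """
    path = [announcing_as]
    if entry.provider_as != announcing_as:
        path.append(entry.provider_as)
    path.append(entry.site_as)
    return BgpRoute(entry.prefix, tuple(path))

# Hypothetical numbers: ISP1 = AS 64496, Site1 = AS 64501,
# and AS 64497 is another network in the island with a non-APT neighbor.
site1 = MappingEntry(prefix="192.0.2.0/24", site_as=64501, provider_as=64496)
via_border = mapping_to_bgp_route(site1, announcing_as=64497)
via_isp1 = mapping_to_bgp_route(site1, announcing_as=64496)
```

Whether a given TR actually advertises the resulting route is still subject
to routing policy, as noted above.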
Third, how do two APT islands communicate with each other? Suppose Site3 is a
customer of an APT network ISP3, but ISP3 is not in the same island as Site1's
provider ISP1 (i.e. there are some non-APT networks in between). Although the
two islands do not exchange mapping information, they will receive each other's
BGP routes injected using the method described previously. As a result, the
packets between Site1 and Site3 will be tunneled through the island that the
destination is connected to.