On 4-Oct-04, at 7:22, Paul Vixie wrote:

It is true that if you turn on load balancing over multiple paths in
BGP and then per packet load balancing between several links, packets
belonging to one session can end up on different anycast instances.

Actually, no. That would be a disaster.

Whether it's possible and whether it's a disaster are two different issues. :-)


I think the explanation in my previous message shows that it is actually possible for this to happen.

The case where a customer connects to a single upstream AS over different links, and these links are terminated on different routers within this upstream AS, isn't a big stretch: this way, the customer is protected against failure of one of the routers they're connected to. (Note that in this case incoming traffic can't be balanced, but if both links terminate on the same router on the customer side, outgoing balancing is still possible.)

The second requirement is that these different routers select different outgoing paths to an anycast instance. This is fairly unlikely, but it could happen in a network design where all components exist in pairs, and the links between routers in the same "half" of the network are preferred over links between both halves, or it takes more links to get to a place connected to the other half.
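
To make the two requirements concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the uplink and instance names are made up, the address is from a documentation prefix, and a CRC32 hash stands in for whatever a real router uses for per-destination balancing. It only shows that round-robin per-packet balancing spreads one session over both anycast instances, while per-destination balancing does not.

```python
import zlib
from itertools import cycle

# Hypothetical setup: two customer uplinks terminate on different routers in
# the upstream AS, and each of those routers prefers a different anycast
# instance. All names here are illustrative.
LINK_TO_INSTANCE = {
    "uplink-1": "anycast-instance-east",
    "uplink-2": "anycast-instance-west",
}
LINKS = sorted(LINK_TO_INSTANCE)


def per_packet(n_packets):
    """Round-robin every packet over the uplinks, ignoring its session."""
    links = cycle(LINKS)
    return [LINK_TO_INSTANCE[next(links)] for _ in range(n_packets)]


def per_destination(dst_addr, n_packets):
    """Hash the destination once, so all packets to it use the same uplink."""
    link = LINKS[zlib.crc32(dst_addr.encode()) % len(LINKS)]
    return [LINK_TO_INSTANCE[link]] * n_packets


# Five packets of one session towards an anycast address.
print(per_packet(5))
print(per_destination("192.0.2.53", 5))
```

With per-packet balancing the session alternates between the two instances; with per-destination balancing all five packets reach the same one.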

No one would ever be able to turn
PPLB on under those conditions, since anycast and multihoming both produce
multiple candidate paths.

I don't follow your reasoning. For multihoming this isn't much of an issue since all packets end up in the same place regardless of how they get there. And if you have two expensive links, one of which sees 99% utilization and the other 10%, you really want that per packet load balancing.


(People are often reluctant to enable per-packet load balancing because it may introduce reordering. However, barring serious congestion, reordering only happens when the entire path towards the load-balanced links is as fast as those links. So when load balancing over two GE links, a host connected at 100 Mbps won't see any reordering: its packets arrive at the bundled link with more time between them than the different propagation times for different-size packets, or random differences in queuing delay, can make up for.)

BGP's default (as I said in my earlier reply) is
to only install the best path, not all of them. (At least on C and J gear.)

Yes, and by default load balancing is per destination rather than per packet. But people have been known to override defaults...


Doing multipath BGP "safely" would require end to end metrics beyond just
aspath-length.

That's why it's required that either the next hop AS (C old behavior) or the entire AS path (C new behavior) is identical.
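
The rule above can be sketched as a filter over candidate paths. This is only an illustration of the two matching modes as I described them, not any vendor's implementation: the function name, the "via" labels and the private-use AS numbers are made up, and real routers compare more attributes than the AS path before considering paths equal.

```python
def multipath_candidates(best, others, mode="full-path"):
    """Return the paths that may share load with the best path.

    mode="next-as":   only the neighbouring (first) AS must match.
    mode="full-path": the entire AS path must match.
    """
    if mode not in ("next-as", "full-path"):
        raise ValueError(f"unknown mode: {mode}")

    def matches(path):
        if mode == "next-as":
            return path["as_path"][:1] == best["as_path"][:1]
        return path["as_path"] == best["as_path"]

    return [p for p in others if matches(p)]


# Hypothetical paths towards the same prefix, using private-use AS numbers.
best = {"via": "r1", "as_path": (64500, 64510)}
others = [
    {"via": "r2", "as_path": (64500, 64510)},  # identical AS path
    {"via": "r3", "as_path": (64500, 64511)},  # same neighbour AS only
    {"via": "r4", "as_path": (64501, 64510)},  # different neighbour AS
]

print([p["via"] for p in multipath_candidates(best, others, "next-as")])
print([p["via"] for p in multipath_candidates(best, others, "full-path")])
```

Under the looser rule r2 and r3 qualify; under the stricter rule only r2 does, which is what keeps balanced traffic from ending up behind a different origin.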


[just two anycasted addresses are authoritative for .org]

End-user impacting issues
with this have been reported (but have predictably been almost
impossible to reproduce) but the situation persists.

If you can make a strong and cohesive argument about this, ICANN would listen.
(I know that UltraDNS would be happy implementing whatever the DNS community
felt was the strongest configuration, and I think ICANN would say the same.)

This has been discussed fairly extensively on NANOG in the past. If you are positive ICANN/UltraDNS are willing to listen, I'll see if I can find the time to write something down. (I don't have any .org domains so it may take me a while to get around to this.)


In the cases where this means a bunch of root servers all anycast in the
same metro, two *great* things happen: (1) attacks and failures against
a single root "letter" don't affect all root servers anycasting in a metro;
and (2) attacks against the entire root server system (all letters) have to
be much stronger in order to upstream-congest every root server instance in
any given metro.

That's great. But this is also the case where unpleasant anycast side effects crop up most easily, as AS paths and metrics towards borders are more easily identical. And if an attacker gets to saturate a network in that metro, all the anycast instances there suffer. It's harder to saturate many links in many places. And from a performance viewpoint, just a few "local" roots are all you need, as good resolvers home in on servers with low RTTs.


In the cases where this means only one root server anycasts from a given
metro, it's still a lot better than no anycast in that metro at all, and it
may reflect the best use of available resources at a particular point in time.

Not sure what you mean here.

Problems such as congestion and BGP blackholes or (temporary) BGP
instability can then impact most or even all of the root servers. (Only
for some places connected to the net, though.) So I feel it's very
important to have a reasonable number of root servers that are NOT
anycast. Preferably, those should be in locations that are far apart.

Without speaking for any of them, and without telling you who they are, I
will say that several rootops feel as you describe, and the likelihood is
very high that your preferences will be followed in this matter.

Excellent. Please tell your anonymous friends that a statement to this effect from them would be highly appreciated.

