Re: Shim6, was: Re: filtering /48 is going to be necessary
On 3/12/12 08:56 , Iljitsch van Beijnum wrote: On 12 Mar 2012, at 16:21 , Leigh Porter wrote: Grass-roots, bottom-up policy process + Need for multihoming + Got tired of waiting = IPv6 PI A perfect summation. Except that it didn't happen in that order. When ARIN approved PI, the shim6 effort was well underway, but it was too early to know to what degree it would solve the multihoming problem. Earlier, when multi6 was stuck, or later, when shim6 could have been evaluated (at least as a specification, but preferably as multiple implementations), would both have been reasonable times to decide to go for PI instead. Recall that from the outset (i.e. long before shim6) some of the very early PI prefixes to be assigned went to organizations which are not internet service providers in any traditional sense. 2001:490::/32 not an ISP... 2001:420::/32 not an ISP... Having received an assignment under the then-existing policy, it was not hard for large corporate or academic network operators to describe themselves as LIRs. Moreover, no one batted an eye when I deaggregated a /32 into /36s. We can hem and haw for a long time about the possible prefix count and where one draws the line, but it's been a consideration since the beginning. If the fundamental distinction for who got a PI prefix and who didn't is scale, well, there are a lot of ISPs that are small. That camel had its nose under the tent from day one.
Re: Shim6, was: Re: filtering /48 is going to be necessary
William Herrin wrote: DV is a distributed computation by intelligent intermediate systems, whereas, with LS, intermediate systems just flood and computation is done by each end. That's basically wrong. Please don't demonstrate your lack of basic knowledge. Both systems perform computation on each router. The difference, as you can see in my sentence above, is whether the computation is done as an intermediate system or as an end. Link State performs much more complex computation to arrive at its export to the forwarding information base. In fact, Distance Vector's calculation is downright trivial in comparison. FYI, DV uses Bellman-Ford while LS can use Dijkstra, which is faster than Bellman-Ford. http://en.wikipedia.org/wiki/Routing Distance vector algorithms use the Bellman-Ford algorithm. http://en.wikipedia.org/wiki/Bellman-Ford The Bellman–Ford algorithm computes single-source shortest paths in a weighted digraph. For graphs with only non-negative edge weights, the faster Dijkstra's algorithm also solves the problem. That should help you gain the basic knowledge. The difference is that Link State shares the original knowledge, which it can do before recomputing its own tables. Distance Vector recomputes its own state first and then shares each router's state with its neighbors rather than sharing the original knowledge. The result is that the knowledge propagates faster with Link State and each router recomputes only once for each change. In some cases, distance vector will have to recompute several times before the system settles into a new stable state, delaying the process even further. That is implied in my statements, so don't repeat it in such a verbose way only to reduce clarity. You failed to deny that the MH knows the layer 3 address of its private HA. Here's a tip for effective written communication: the first time in any document that you use an abbreviation that isn't well known, spell it out. In this case, the document is the thread. 
And a tip for you: remember the past mails in a thread before sending mails to the thread. Like any other host, including the MH in your plan, it already knows its domain name and the IP addresses of its private DNS servers. And, to deny the HA, your assumption must be that the private DNS servers may be mobile. In your home agent architecture, it doesn't matter if they can have multiple addresses. It matters if they can have the same address. That's a totally insane operation. There is no reason to have an anycast HA only to bloat the global routing table. Masataka Ohta
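The Bellman-Ford vs. Dijkstra point argued above can be illustrated with a small sketch (the graph is hypothetical, not from the thread): both algorithms compute the same single-source shortest paths, but Bellman-Ford relaxes every edge up to |V|-1 times (the distance-vector model), while Dijkstra pops each node once from a priority queue (the link-state model), which is why it is faster on graphs with non-negative weights.

```python
import heapq

def bellman_ford(edges, n, src):
    # Relax every edge up to n-1 times: O(V*E), the distance-vector model.
    dist = [float("inf")] * n
    dist[src] = 0
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist

def dijkstra(adj, n, src):
    # Non-negative weights only: O(E log V), the link-state model.
    dist = [float("inf")] * n
    dist[src] = 0
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue               # stale queue entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist

# A tiny directed example graph: (u, v, weight).
edges = [(0, 1, 1), (1, 2, 2), (0, 2, 4), (2, 3, 1)]
adj = [[] for _ in range(4)]
for u, v, w in edges:
    adj[u].append((v, w))

print(bellman_ford(edges, 4, 0))  # [0, 1, 3, 4]
print(dijkstra(adj, 4, 0))        # [0, 1, 3, 4]
```

Both return identical distances; the disagreement in the thread is not about the result but about where in the network the computation happens.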
Re: Shim6, was: Re: filtering /48 is going to be necessary
2012/3/16 Masataka Ohta mo...@necom830.hpcl.titech.ac.jp: William Herrin wrote: As LS requires less intelligence than DV, it converges faster. I do believe that's the first time I've heard anybody suggest that a link state routing protocol requires less intelligence than a distance vector protocol. I mean intelligence as intermediate systems. DV is a distributed computation by intelligent intermediate systems, whereas, with LS, intermediate systems just flood and computation is done by each end. That's basically wrong. Both systems perform computation on each router. Link State performs much more complex computation to arrive at its export to the forwarding information base. In fact, Distance Vector's calculation is downright trivial in comparison. The difference is that Link State shares the original knowledge, which it can do before recomputing its own tables. Distance Vector recomputes its own state first and then shares each router's state with its neighbors rather than sharing the original knowledge. The result is that the knowledge propagates faster with Link State and each router recomputes only once for each change. In some cases, distance vector will have to recompute several times before the system settles into a new stable state, delaying the process even further. Here is an exercise for you insisting on DNS, an intermediate system. What if DNS servers, including root ones, are mobile? DNS' basic bootstrapping issues don't change, nor do the solutions. The resolvers find the roots via a set of static well-known layer 3 addresses You failed to deny that the MH knows the layer 3 address of its private HA. Here's a tip for effective written communication: the first time in any document that you use an abbreviation that isn't well known, spell it out. It's a waste of resources for the MH to have the well-known IP addresses of the root servers, the domain names of its private DNS servers, and security keys for dynamic update, only to avoid knowing the IP address of its private HA. 
There's no reason for the Mobile Host to know the IP addresses of the root servers. Like any other host, including the MH in your plan, it already knows its domain name and the IP addresses of its private DNS servers. That leaves only the security key. So, by your own accounting, I swap knowledge of a topology-independent element (the security key) for a topology-dependent element (an IP address) which may change any time you adjust your home agent's required-to-be-landed network, with all of today's vagaries around the renumbering problem. For that matter, how do you solve the problem with your home agent approach? Is it even capable of having multiple home agents active for each node? How do you keep them in sync? I actually designed and implemented such a system. Multiple home agents may each have multiple addresses. If some address of an HA does not work, the MH tries other addresses of the HA. If some HA cannot communicate with the MH, the CH may try another HA. There is nothing mobility-specific here. Mobile protocols are modified just as other protocols are modified for multiple addresses. In practice, however, handling multiple addresses is not very useful, because selection of the best working address is time consuming unless hosts have default-free routing tables. In your home agent architecture, it doesn't matter if they can have multiple addresses. It matters if they can have the same address. Otherwise you're pushing off the generalized continuity-of-operations problem. One which my DNS add/drop approach handles seamlessly, and at the granularity of individual services on the mobile host. Regards, Bill Herrin -- William D. Herrin her...@dirtside.com b...@herrin.us 3005 Crane Dr. .. Web: http://bill.herrin.us/ Falls Church, VA 22042-3004
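The failover order Ohta describes (try each address of each home agent until one answers) can be sketched in a few lines. This is a toy illustration, not his implementation; the names, addresses, and the reachability probe are all hypothetical.

```python
def reach_home_agent(home_agents, reachable):
    """home_agents: ordered list of (ha_name, [ha_addresses]).
    reachable: probe predicate (e.g. a registration attempt) - hypothetical."""
    for name, addrs in home_agents:
        for addr in addrs:            # if one HA address fails, try its others
            if reachable(addr):
                return name, addr     # first working HA address wins
    return None                       # every address of every HA failed

home_agents = [("ha1", ["192.0.2.1", "198.51.100.1"]),
               ("ha2", ["203.0.113.1"])]
# Suppose ha1 is entirely down and only ha2's address answers:
print(reach_home_agent(home_agents, lambda a: a == "203.0.113.1"))
# -> ('ha2', '203.0.113.1')
```

The nested loop is exactly the cost Ohta concedes: without a default-free routing table, finding the best working address means probing candidates sequentially, which takes time.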
Re: Shim6, was: Re: filtering /48 is going to be necessary
2012/3/15 Masataka Ohta mo...@necom830.hpcl.titech.ac.jp: William Herrin wrote: I've been an IRTF RRG participant and in my day job I build backend systems for mobile messaging devices used in some very challenging and very global IP and non-IP environments. I know the non-IP mobile environment is heavily encumbered. So, I can understand why you insist on using DNS for mobility, only to make IP mobility as encumbered as non-IP mobility. I don't understand your statement. None of the technologies I work with use the word encumbered in a comparable context. Perhaps you could rephrase? Ignoring that DNS does not work so fast, TCP becomes unsure what addresses it should be talking to only after a long timeout. Says who? Our hypothetical TCP can become unsure as soon as the first retransmission if we want it to. It can even become unsure when handed a packet to send after a long delay with no traffic. There's little delay kicking off the recheck either way. That may be an encumbered way of doing things in non-IP, or bell-headed, mobile systems, where 0.05 seconds of voice loss is acceptable but 0.2 seconds of voice loss is significant. However, on the Internet, 0.05 seconds of packet loss can be significant and things work end to end. Get real. Even EAPS takes 0.05 seconds to recover from an unexpected link failure, and that's on a trivial Ethernet ring where knowledge propagation is far less complex than a mobile environment. For expected link failures, you can't get any fewer than zero packets lost, which is exactly what my add/drop approach delivers. In this case, your peer, a mobile host, is the proper end, because it is sure when it has lost or is losing a link. Correct, but... Then, the end establishes a new link with a new IP and initiates update messages for triangle elimination at the proper timing without unnecessary checking. This is where the strategy falls apart every time. 
You know when your address set changes but you don't know the destination endpoint's instant address set unless either (1) he tells you or (2) he tells a 3rd party which you know to ask. Your set and his set are both in motion, so there _will_ be times when your address set changes before he can tell you the changes for his set. Hence #1 alone is an _incomplete_ solution. It was incomplete in SCTP, it was incomplete in Shim6 and it'll be incomplete in MPTCP as well. And oh-by-the-way, if you want to avoid being chatty on every idle connection every time an address set changes, and you want either endpoint to be able to reacquire the other when it next has data to send, then the probability that your destination endpoint has lost all the IP addresses you know about goes way up. Regards, Bill Herrin
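The race Herrin describes can be shown with a toy model (all names illustrative). With direct-only updates (option #1), each endpoint pushes its new address to the peer's last-known address; if both endpoints move at the same time, both pushes go to stale addresses and the session is lost. With a third-party rendezvous (option #2 - DNS, a home agent, etc.), both sides re-register under a stable name, so simultaneous moves still recover.

```python
def recovers_direct(a_moves, b_moves):
    # A's push reaches B only if B is still where A last saw it, and
    # vice versa; one surviving direction is enough to resynchronize.
    a_reaches_b = not b_moves
    b_reaches_a = not a_moves
    return a_reaches_b or b_reaches_a

def recovers_via_rendezvous(a_moves, b_moves):
    # A fixed third party: both sides re-register under stable names,
    # and either side can always look the other up by name.
    registry = {"A": "addr-a", "B": "addr-b"}
    if a_moves:
        registry["A"] = "addr-a2"   # A re-registers its new address
    if b_moves:
        registry["B"] = "addr-b2"   # B re-registers its new address
    return "A" in registry and "B" in registry

print(recovers_direct(True, False))         # True: only A moved
print(recovers_direct(True, True))          # False: both moved at once
print(recovers_via_rendezvous(True, True))  # True: rendezvous survives
```

The single `False` case is the incompleteness claim: however rare simultaneous moves are, a protocol relying only on direct notification has a window in which it cannot recover.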
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Thu, Mar 15, 2012 at 01:18:04PM +0900, Masataka Ohta wrote: As long as we keep using IPv4, we are mostly stopping at /24 and must stop at /32. But, see the subject. It's well above Moore. For high-speed (fixed-time) route lookup with 1M entries, SRAM is cheap at /24 and is fine at /32, but expensive and power-consuming TCAM is required at /48. That's one reason why we should stay away from IPv6. What prevents you from using http://www.nature.com/ncomms/journal/v1/n6/full/ncomms1063.html with IPv6?
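The SRAM-vs-TCAM argument above is back-of-envelope arithmetic: a fixed-time lookup can be a single indexed read into a flat next-hop array, but the array size grows as 2^prefix-length. The sketch below assumes a hypothetical 2-byte next-hop index per entry (real routers use tries, compression, and other tricks, so this only shows the growth that pushes /48 lookups toward TCAM).

```python
ENTRY_BYTES = 2  # assumed next-hop index size per entry (hypothetical)

def flat_table_bytes(prefix_len):
    # A direct-indexed array with one entry per possible prefix value
    # gives a true fixed-time (single memory read) lookup.
    return (2 ** prefix_len) * ENTRY_BYTES

for plen in (24, 32, 48):
    print(f"/{plen}: {flat_table_bytes(plen) / 2**20:,.0f} MiB")
# /24: 32 MiB          -> trivially cheap in SRAM
# /32: 8,192 MiB       -> large but conceivable
# /48: 536,870,912 MiB -> impossible as a flat array; hence TCAM or
#                         multi-level structures with variable lookup time
```

The 1M-entry figure in the thread is about TCAM, which stores only actual routes; the flat-array numbers here show why SRAM's fixed-time trick stops scaling past /32.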
Re: Shim6, was: Re: filtering /48 is going to be necessary
William Herrin wrote: I know the non-IP mobile environment is heavily encumbered. So, I can understand why you insist on using DNS for mobility, only to make IP mobility as encumbered as non-IP mobility. I don't understand your statement. None of the technologies I work with use the word encumbered in a comparable context. Perhaps you could rephrase? OK. You are bell headed. However, on the Internet, 0.05 seconds of packet loss can be significant and things work end to end. Get real. Even EAPS takes 0.05 seconds to recover from an unexpected link failure If you keep two or more links, keep them alive, and let them know each other's IP addresses, which can be coordinated by the mobile host as the end, the links can cooperate to avoid broken links for much faster recovery than 0.05s. and that's on a trivial Ethernet ring where knowledge propagation is far less complex than a mobile environment. The previous statement of mine merely assumes radio links with sudden link failure by, say, phasing. It's spatial diversity arranged by the mobile hosts as the ends. If a link failure is expected several seconds in advance, which is usual with radio links, mobile hosts can smoothly migrate to a new link without any packet losses, because there is much time to resend possibly lost control packets. In this case, your peer, a mobile host, is the proper end, because it is sure when it has lost or is losing a link. Correct, but... Then, the end establishes a new link with a new IP and initiates update messages for triangle elimination at the proper timing without unnecessary checking. This is where the strategy falls apart every time. You know when your address set changes but you don't know the destination endpoint's instant address set unless Certainly, if two communicating mobile hosts, two ends, change their IP addresses simultaneously, they cannot communicate *DIRECTLY* with each other, because they cannot receive the new IP addresses of their peers. The proper end for the issue is the home agent. 
Just send triangle elimination messages through the home agent, to which triangle elimination is not applied. With the new layer of indirection provided by the home agent, control messages for triangle elimination are sent reliably (though best effort). The home agent knows the reachable foreign addresses of mobile hosts, as long as the mobile hosts can tell the home agent their new foreign addresses before they entirely lose their old links. either (1) he tells you or (2) he tells a 3rd party which you know to ask. (3) he tells his home agent, his first party, from which you, his second party, request packet forwarding. Unlike DNS servers, the first party is responsible for its home agent. Your set and his set are both in motion so there _will_ be times when your address set changes before he can tell you the changes for his set. Hence #1 alone is an _incomplete_ solution. A difficulty in understanding the end-to-end principle is properly recognizing the ends. Here, you failed to recognize home agents as the essential ends that support reliable communication to mobile hosts. It was incomplete in SCTP, it was incomplete in Shim6 and it'll be incomplete in MPTCP as well. It is complete, though shim6 is utterly incomplete. And oh-by-the-way, if you want to avoid being chatty on every idle connection every time an address set changes and you want either endpoint to be able to reacquire the other when it next has data to send then the probability your destination endpoint has lost all the IP addresses you know about goes way up. Idle connections may have timeouts for triangle elimination, after which they use the home agents of their peers. That's how the end to end Internet works without any packet losses other than those caused by congestion or unexpected sudden link failures. Masataka Ohta
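The indirection Ohta describes can be sketched as a toy (all names are illustrative, not from any real protocol): the home agent always holds the mobile host's latest foreign address because the MH re-registers before its old link dies, and the correspondent host falls back to home-agent forwarding whenever its triangle-eliminated direct address goes stale.

```python
class HomeAgent:
    def __init__(self):
        self.foreign = {}             # mobile host name -> current foreign address

    def register(self, mh, addr):     # MH re-registers before its old link dies
        self.foreign[mh] = addr

    def forward_addr(self, mh):       # where the HA would relay packets
        return self.foreign[mh]

def next_hop(ch_cache, ha, mh, live):
    """Correspondent host: use the triangle-eliminated direct address if
    it is still live, else fall back to the home agent's knowledge."""
    direct = ch_cache.get(mh)
    if direct in live:
        return direct                 # fast path: no triangle
    return ha.forward_addr(mh)        # slow path: via the home agent

ha = HomeAgent()
ha.register("mh1", "care-of-1")
cache = {"mh1": "care-of-1"}          # CH learned this via triangle elimination
ha.register("mh1", "care-of-2")       # MH moved; the CH cache is now stale
print(next_hop(cache, ha, "mh1", live={"care-of-2"}))  # care-of-2, via the HA
```

This also makes Herrin's objection concrete: `forward_addr` is packet relaying, which is why he calls the HA a router rather than an endpoint.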
Re: Shim6, was: Re: filtering /48 is going to be necessary
Eugen Leitl wrote: For high speed (fixed time) route lookup with 1M entries, SRAM is cheap at /24 and is fine at /32 but expensive and power-consuming TCAM is required at /48. That's one reason why we should stay away from IPv6. What prevents you from using http://www.nature.com/ncomms/journal/v1/n6/full/ncomms1063.html with IPv6? Though I didn't pay $32 to read the full paper, it looks like a proposal for geography-based addressing. So, I should ask: what prevents you from using it with IPv4? Masataka Ohta
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Thu, Mar 15, 2012 at 09:57:10PM +0900, Masataka Ohta wrote: That's one reason why we should stay away from IPv6. What prevents you from using http://www.nature.com/ncomms/journal/v1/n6/full/ncomms1063.html with IPv6? Though I didn't pay $32 to read the full paper, it looks like a proposal for geography-based addressing. You can access the free full text at http://arxiv.org/pdf/1009.0267v2.pdf So, I should ask what prevents you from using it with IPv4? Because IPv4 will be legacy by the time something like this lands, and because IPv6 needs more bits per route, so more pain there.
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Thu, Mar 15, 2012 at 9:58 AM, Eugen Leitl eu...@leitl.org wrote: On Thu, Mar 15, 2012 at 09:57:10PM +0900, Masataka Ohta wrote: That's one reason why we should stay away from IPv6. What prevents you from using http://www.nature.com/ncomms/journal/v1/n6/full/ncomms1063.html with IPv6? Though I didn't pay $32 to read the full paper, it looks like a proposal for geography-based addressing. You can access the free full text at http://arxiv.org/pdf/1009.0267v2.pdf Hi Eugen, Geographic routing strategies have been all but proven to irredeemably violate the recursive commercial payment relationships which create the Internet's topology. In other words, they always end up stealing bandwidth on links for which neither the source of the packet nor its destination has paid for a right to use. This is documented in a 2008 Routing Research Group thread. http://www.ops.ietf.org/lists/rrg/2008/msg01781.html If you have a new geographic routing strategy you'd like to table for consideration, start by proving it doesn't share the problem. Regards, Bill Herrin
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Thu, Mar 15, 2012 at 10:25:46AM -0400, William Herrin wrote: Geographic routing strategies have been all but proven to irredeemably violate the recursive commercial payment relationships which create the Internet's topology. In other words, they always end up stealing bandwidth on links for which neither the source of the packet nor its destination has paid for a right to use. This is documented in a 2008 Routing Research Group thread. http://www.ops.ietf.org/lists/rrg/2008/msg01781.html If you have a new geographic routing strategy you'd like to table for consideration, start by proving it doesn't share the problem. I think the problem can be tackled by implementing this in wireless last-mile networks owned and operated by end users. (Obviously the /64 space is enough to carry that information. Long-range could be done via VPN overlay over the Internet). This will reduce the local chatter for route discovery and remove some of the last-mile load on wired connections, which is in ISPs' interest. I think we'll see some 1-10 Gbit/s effective bandwidth in sufficiently small wireless cells. If this scenario plays out, this will inch up to low-end gear like Mikrotik and eventually move to the core. I don't think this will initially happen in the network core for the reasons you mentioned.
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Thu, Mar 15, 2012 at 10:41, Eugen Leitl eu...@leitl.org wrote: On Thu, Mar 15, 2012 at 10:25:46AM -0400, William Herrin wrote: Geographic routing strategies have been all but proven to irredeemably violate the recursive commercial payment relationships which create the Internet's topology. In other words, they always end up stealing bandwidth on links for which neither the source of the packet nor it's destination have paid for a right to use. This is documented in a 2008 Routing Research Group thread. http://www.ops.ietf.org/lists/rrg/2008/msg01781.html I think the problem can be tackled by implementing this in wireless last-mile networks owned and operated by end users. Interesting point, and the growth in municipal networks could help. But they are still a vast minority. Scott
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Thu, Mar 15, 2012 at 10:41 AM, Eugen Leitl eu...@leitl.org wrote: On Thu, Mar 15, 2012 at 10:25:46AM -0400, William Herrin wrote: Geographic routing strategies have been all but proven to irredeemably violate the recursive commercial payment relationships which create the Internet's topology. In other words, they always end up stealing bandwidth on links for which neither the source of the packet nor its destination has paid for a right to use. I think the problem can be tackled by implementing this in wireless last-mile networks owned and operated by end users. (Obviously the /64 space is enough to carry that information. Long-range could be done via VPN overlay over the Internet). If an endpoint is allowed to have multiple addresses and allowed to rapidly change addresses, then a more optimal last-mile solution is dynamic topological address delegation. Each IP represents a current best path coreward through the ISP's network. When the path changes, so do the downstream addresses. Instead of a routing protocol you have an addressing protocol. In theory, such a thing automatically aggregates into very small routing tables. Very much a work in progress: http://bill.herrin.us/network/name/nr1.gif http://bill.herrin.us/network/name/nr2.gif http://bill.herrin.us/network/name/nr3.gif Regards, Bill Herrin
Re: Shim6, was: Re: filtering /48 is going to be necessary
2012/3/15 Masataka Ohta mo...@necom830.hpcl.titech.ac.jp: William Herrin wrote: I know the non-IP mobile environment is heavily encumbered. So, I can understand why you insist on using DNS for mobility, only to make IP mobility as encumbered as non-IP mobility. I don't understand your statement. None of the technologies I work with use the word encumbered in a comparable context. Perhaps you could rephrase? OK. You are bell headed. If you want to be snippy in English, you should first gain a better command of the language. Neither of your previous statements has a meaning recognized beyond the confines of your own brain. Your set and his set are both in motion so there _will_ be times when your address set changes before he can tell you the changes for his set. Hence #1 alone is an _incomplete_ solution. A difficulty in understanding the end-to-end principle is properly recognizing the ends. Here, you failed to recognize home agents as the essential ends that support reliable communication to mobile hosts. A device which relays IP packets is not an endpoint, it's a router. It may or may not be a worthy part of a network architecture but it is unambiguously not an endpoint. If that isn't clear to you then don't presume to lecture me about the end to end principle. Regards, Bill Herrin
Re: Shim6, was: Re: filtering /48 is going to be necessary
2012/3/14 Masataka Ohta mo...@necom830.hpcl.titech.ac.jp: stuff deleted For high speed (fixed time) route lookup with 1M entries, SRAM is cheap at /24 and is fine at /32 but expensive and power-consuming TCAM is required at /48. That's one reason why we should stay away from IPv6. Masataka Ohta I found this bit of research from 2007 ( http://www.cise.ufl.edu/~wlu/papers/tcam.pdf ). It seems to me there are probably more ways to mix and match different types of RAM to be able to deal with this beast. james
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Thu, 15 Mar 2012 13:31:42 EDT, William Herrin said: 2012/3/15 Masataka Ohta mo...@necom830.hpcl.titech.ac.jp: OK. You are bell headed. If you want to be snippy in English, you should first gain a better command of the language. Neither of your previous statements has a meaning recognized beyond the confines of your own brain. http://www.pcmag.com/encyclopedia_term/0,2542,t=Bellhead&i=38536,00.asp I don't think the term means what Masataka thinks it means, because nobody in this discussion is talking in terms of circuits rather than packet routing.
Re: Shim6, was: Re: filtering /48 is going to be necessary
I don't think the term means what Masataka thinks it means, because nobody in this discussion is talking in terms of circuits rather than packet routing. Geographical addressing can tend towards bellhead thinking, in the sense that it assumes a small number (one?) of suppliers servicing all end users in a geographical area, low mobility, higher traffic volumes towards other end-users in the same or a close geography, relative willingness to renumber when a permanent change of location does occur, and simple, tightly defined interconnects where these single-suppliers can connect to the neighbouring single-supplier and their block of geography. I'm not sure he's right, but I think I understand what he's getting at. Regards, Tim.
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Thu, 15 Mar 2012 21:52:54 +0900, Masataka Ohta said: Get real. Even EAPS takes 0.05 seconds to recover from an unexpected link failure If you keep two or more links, keep them alive, and let them know each other's IP addresses, which can be coordinated by the mobile host as the end, the links can cooperate to avoid broken links for much faster recovery than 0.05s. May work for detecting a dead access point in a wireless mesh, but it doesn't scale to WAN sized connections. Standard systems control theory tells us that you can't control a system in less than 2*RTT across the network. There's *plenty* of network paths where endpoint-homebase-endpoint will be over 50ms. Consider the case where one endpoint is in Austria, the other is in Boston, and the node handling the mobility is in Japan. Now a router fails in Seattle. How long will it take for the endpoints to notice? (Alternatively, explain how you locate a suitable home base node closer than Japan. Remember in your explanation to consider that you may not have a business relationship with the carrier that would be an optimum location)
Re: Shim6, was: Re: filtering /48 is going to be necessary
Eugen Leitl wrote: So, I should ask what prevents you from using it with IPv4? Because IPv4 will be legacy by the time something like this lands, Maybe. But IPv6 will be legacy before IPv4 (or already is, IMHO). and because IPv6 needs more bits/route so more pain there. Feel free to propose filtering everything beyond /32 and get it accepted by the community. Masataka Ohta
Re: Shim6, was: Re: filtering /48 is going to be necessary
William Herrin wrote: A difficulty in understanding the end-to-end principle is properly recognizing the ends. Here, you failed to recognize home agents as the essential ends that support reliable communication to mobile hosts. A device which relays IP packets is not an endpoint, it's a router. If you want to call something which may not participate in routing protocol exchanges a router, that's fine; it's your terminology. But, as long as the HA has the knowledge obtained through control packet exchanges with the MH, it is an end that can help make mobile IP correct and complete. It may or may not be a worthy part of a network architecture but it is unambiguously not an endpoint. Even ordinary routers are ends w.r.t. routing protocols, though they also behave as intermediate systems to other routers. As LS requires less intelligence than DV, it converges faster. If that isn't clear to you then don't presume to lecture me about the end to end principle. Here is an exercise for you insisting on DNS, an intermediate system. What if DNS servers, including root ones, are mobile? Masataka Ohta
Re: Shim6, was: Re: filtering /48 is going to be necessary
james machado wrote: For high speed (fixed time) route lookup with 1M entries, SRAM is cheap at /24 and is fine at /32 but expensive and power-consuming TCAM is required at /48. That's one reason why we should stay away from IPv6. I found this bit of research from 2007 ( http://www.cise.ufl.edu/~wlu/papers/tcam.pdf ). It seems to me there are probably more ways to mix and match different types of RAM to be able to deal with this beast. But it's not fixed time. Worse, it synthesizes an IPv6 table from the current IPv4 ones, which means the number of routing table entries is a lot less than 1M. Masataka Ohta
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Fri, 16 Mar 2012 08:31:07 +0900, Masataka Ohta said: Here is an exercise for you insisting on DNS, an intermediate system. What if DNS servers, including root ones, are mobile? So, is this question more like: What if computers worked in trinary? or What if people show criminal negligence in misdesigning their networks? You're asking a what if for a usage case that nobody sane has suggested.
Re: Shim6, was: Re: filtering /48 is going to be necessary
valdis.kletni...@vt.edu wrote: If you keep two or more links, keep them alive, and let them know each other's IP addresses, which can be coordinated by the mobile host as the end, the links can cooperate to avoid broken links for much faster recovery than 0.05s. May work for detecting a dead access point in a wireless mesh, That's not my point. My point is to avoid dead links. Base stations try sending packets to an MH and, if it fails a few times, they forward the packet to other base stations which may have live links to the MH. but it doesn't scale to WAN sized connections. Regardless of whether links are wireless or wired, the coordination is necessary only within the (small number of) links to which the MH, the end, is attached, which means the coordination is local if coordinated by the end. There is no WAN involved in the coordination. Consider the case where one endpoint is in Austria, the other is in Boston, and the node handling the mobility is in Japan. Now a router fails in Seattle. How long will it take for the endpoints to notice? Huh? (Alternatively, explain how you locate a suitable home base node No home agent is involved in the recovery. Masataka Ohta
Re: Shim6, was: Re: filtering /48 is going to be necessary
valdis.kletni...@vt.edu wrote: You're asking a what if for a usage case that nobody sane has suggested. If you are saying it's insane to use DNS to manage frequently changing locations of mobile hosts instead of relying on immobile home agents, I fully agree with you. Masataka Ohta
Re: Shim6, was: Re: filtering /48 is going to be necessary
2012/3/15 Masataka Ohta mo...@necom830.hpcl.titech.ac.jp: valdis.kletni...@vt.edu wrote: You're asking a what if for a usage case that nobody sane has suggested. If you are saying it's insane to use DNS to manage frequently changing locations of mobile hosts instead of relying on immobile home agents, I fully agree with you. Masataka Ohta Non sequitur. Mobile root DNS servers are what is insane, because the queries must terminate somewhere, and ultimately there must be something that doesn't have a circular dependency -- requiring working DNS to get DNS. As for using DNS to manage frequently changing locations of mobile hosts, DNS is almost perfect for that -- it's just the sort of thing DNS is designed for, depending on what you mean by frequently changing. -- -JH
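The DNS-for-mobility idea debated above rests on one mechanism: a low TTL bounds how stale a cached answer can be after the mobile host moves. A minimal pure-stdlib sketch (all names hypothetical; a real deployment would use RFC 2136 dynamic update against the zone's authoritative servers):

```python
import time

class TinyResolver:
    """Toy caching resolver: illustrates TTL-bounded staleness only."""
    def __init__(self, authoritative):
        self.auth = authoritative     # name -> (address, ttl), the "zone"
        self.cache = {}               # name -> (address, expiry time)

    def lookup(self, name, now=None):
        now = time.monotonic() if now is None else now
        hit = self.cache.get(name)
        if hit and hit[1] > now:
            return hit[0]             # cached answer, not yet expired
        addr, ttl = self.auth[name]   # "query the authoritative server"
        self.cache[name] = (addr, now + ttl)
        return addr

auth = {"mh.example.net": ("192.0.2.10", 5)}     # 5-second TTL
r = TinyResolver(auth)
print(r.lookup("mh.example.net", now=0))          # 192.0.2.10
auth["mh.example.net"] = ("198.51.100.7", 5)      # host moves, zone updated
print(r.lookup("mh.example.net", now=1))          # 192.0.2.10 (stale <= TTL)
print(r.lookup("mh.example.net", now=6))          # 198.51.100.7 (TTL expired)
```

The middle lookup is both sides' argument in one line: Herrin accepts a bounded staleness window as the price of a topology-independent rendezvous; Ohta argues that any such window is too slow compared with a home agent that learns the new address before the old link dies.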
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Fri, 16 Mar 2012 09:29:44 +0900, Masataka Ohta said: valdis.kletni...@vt.edu wrote: You're asking a what if for a usage case that nobody sane has suggested. If you are saying it's insane to use DNS to manage frequently changing locations of mobile hosts instead of relying on immobile home agents, I fully agree with you. I'm specifically saying that "what if the root servers are mobile?" is a stupid question, because nobody sane has proposed that they be mobile. Hope that makes it clearer for you.
Re: Shim6, was: Re: filtering /48 is going to be necessary
2012/3/15 Masataka Ohta mo...@necom830.hpcl.titech.ac.jp: Even ordinary routers are ends w.r.t. routing protocols, though they also behave as intermediate systems to other routers. As LS requires less intelligence than DV, it converges faster. I do believe that's the first time I've heard anybody suggest that a link state routing protocol requires less intelligence than a distance vector protocol. If that isn't clear to you then don't presume to lecture me about the end to end principle. Here is an exercise for you insisting on DNS, an intermediate system. What if DNS servers, including root ones, are mobile? DNS' basic bootstrapping issues don't change, nor do the solutions. The resolvers find the roots via a set of static well-known layer 3 addresses which, more and more these days, are actually anycast destinations matching diverse pieces of equipment. It makes no particular sense to enhance their mobility beyond this level. Before you jump up and down and yell Ah ha!, realize that this is true of a mapping function at any level of the stack. ARP doesn't work without knowing the layer 2 broadcast address and IPv6's ND doesn't work without knowing a static set of multicast addresses. Below the roots, the authoritative zone servers are no different than any other node. If you're willing to tolerate a lowered TTL for your NS server's A and AAAA records when its IP address changes, and your parent zone is willing to tolerate dynamic updates for any glue, then you can make DNS updates to the parent zone like any other mobile node. The clients find the recursing resolvers via whatever process assigns the client's IP address, e.g. DHCP or PPP. If it is for some reason useful for the server's base address to change, then assign a set of VIPs to the DNS service and route them at layer 3. 
On the other side of the wall, the recursing resolvers don't particularly care about their source addresses for originating queries to the authoritative servers and will move to the newly favored address with nary a hitch. If you want an actually hard question, try this one: what do you do when fewer than all of the authoritative DNS servers for your node's name are available to receive an update? What do you do when those servers suffer a split brain where each is reachable to some clients but they can't talk to each other? How do you stop bog-standard outages from escalating into major network partitions? For that matter, how do you solve the problem with your home agent approach? Is it even capable of having multiple home agents active for each node? How do you keep them in sync? Regards, Bill Herrin -- William D. Herrin her...@dirtside.com b...@herrin.us 3005 Crane Dr. .. Web: http://bill.herrin.us/ Falls Church, VA 22042-3004
Re: Shim6, was: Re: filtering /48 is going to be necessary
William Herrin wrote: As LS requires less intelligence than DV, it converges faster. I do believe that's the first time I've heard anybody suggest that a link state routing protocol requires less intelligence than a distance vector protocol. I mean intelligence as intermediate systems. DV is a distributed computation by intelligent intermediate systems, whereas, with LS, intermediate systems just flood and computation is done by each end. Here is an exercise for you insisting on DNS, an intermediate system. What if DNS servers, including root ones, are mobile? DNS' basic bootstrapping issues don't change, nor do the solutions. The resolvers find the roots via a set of static, well-known layer 3 addresses You failed to deny that an MH knows the layer 3 address of its private HA. It's a waste of resources for an MH to hold the well-known IP addresses of the root servers, the domain names of its private DNS server, and the security keys for dynamic update, only to avoid knowing the IP address of its private HA. For that matter, how do you solve the problem with your home agent approach? Is it even capable of having multiple home agents active for each node? How do you keep them in sync? I actually designed and implemented such a system. Multiple home agents may each have multiple addresses. If some address of an HA does not work, the MH tries other addresses of the HA. If some HA cannot communicate with the MH, the CH may try to use another HA. There is nothing mobility specific. Mobile protocols are modified just as other protocols are modified for multiple addresses. In practice, however, handling multiple addresses is not very useful, because selection of the best working address is time consuming unless hosts have default-free routing tables. Masataka Ohta
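Ohta's failover scheme above — try each address of each home agent in turn until one answers — can be sketched roughly as follows. This is a hypothetical illustration, not code from his system; a real mobile host would likely probe in parallel rather than serially:

```python
import socket

def reach_home_agent(home_agents, timeout=2.0):
    """Try every (host, port) address of every home agent in order and
    return the first one that accepts a TCP connection, or None if all
    fail. Purely illustrative serial probing."""
    for ha_addresses in home_agents:        # one address list per HA
        for addr in ha_addresses:
            try:
                with socket.create_connection(addr, timeout=timeout):
                    return addr             # first working address wins
            except OSError:
                continue                    # this address failed; try next
    return None                             # every address of every HA failed
```

The serial probing is exactly the "time consuming" selection Ohta complains about: each dead address costs a full timeout before the next is tried.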
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Tue, 13 Mar 2012 20:13:41 PDT, Owen DeLong said: I expect within my lifetime that multi-gigabit ethernet will become commonplace in the household LAN environment and that when that becomes reality, localized IP Multicast over multi-gigabit ethernet will eventually supplant HDMI as the primary transport for audio/video streams between devices (sources such as BD players, DVRs, computers, etc. and destinations such as receivers/amps, monitors, speaker drivers, etc.). The only reason you got HDMI at all was because the content owners managed to get HDCP included. You won't get a replacement that doesn't do HDCP until we fix the sorry state of copyright in the US. So it's equivalent to asking if we're going to fix copyright within your lifetime... :)
RE: Shim6, was: Re: filtering /48 is going to be necessary
-Original Message- From: valdis.kletni...@vt.edu [mailto:valdis.kletni...@vt.edu] The only reason you got HDMI at all was because the content owners managed to get HDCP included. You won't get a replacement that doesn't do HDCP until we fix the sorry state of copyright in the US. So it's equivalent to asking if we're going to fix copyright within your lifetime... :) When the revolution comes, all will be fixed. -- Leigh
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Wed, Mar 14, 2012 at 04:39:21PM +, Leigh Porter wrote: From: valdis.kletni...@vt.edu [mailto:valdis.kletni...@vt.edu] The only reason you got HDMI at all was because the content owners managed to get HDCP included. You won't get a replacement that doesn't do HDCP until we fix the sorry state of copyright in the US. So it's equivalent to asking if we're going to fix copyright within your lifetime... :) When the revolution comes, all will be fixed. Mhm. Yeah. But until then, it's equivalent to solving the halting problem. -- Mike Andrews, W5EGO mi...@mikea.ath.cx Tired old sysadmin
Re: Shim6, was: Re: filtering /48 is going to be necessary
Mike Andrews mi...@mikea.ath.cx wrote: On Wed, Mar 14, 2012 at 04:39:21PM +, Leigh Porter wrote: From: valdis.kletni...@vt.edu [mailto:valdis.kletni...@vt.edu] The only reason you got HDMI at all was because the content owners managed to get HDCP included. You won't get a replacement that doesn't do HDCP until we fix the sorry state of copyright in the US. So it's equivalent to asking if we're going to fix copyright within your lifetime... :) When the revolution comes, all will be fixed. Mhm. Yeah. But until then, it's equivalent to solving the halting problem. Come the revolution, things will be different. Not necessarily -better-, but _different_. wry grin
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Wed, Mar 14, 2012 at 1:45 PM, Owen DeLong o...@delong.com wrote: I fully expect them to develop an HDCP-or-equivalent enabled protocol to run over IP Multicast. Do you have any reason you believe that won't happen? Owen I'm pretty sure it's already in place for IPTV solutions.
Re: Shim6, was: Re: filtering /48 is going to be necessary
Randy Bush wrote: none of which seem to move us forward. i guess the lesson is that, as long as we are well below moore, we just keep going down the slippery, and damned expensive, slope. As long as we keep using IPv4, we are mostly stopping at /24 and must stop at /32. But, see the subject. It's well above Moore. For high speed (fixed time) route lookup with 1M entries, SRAM is cheap at /24 and is fine at /32, but expensive and power-consuming TCAM is required at /48. That's one reason why we should stay away from IPv6. Masataka Ohta
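A rough back-of-the-envelope for the SRAM-vs-TCAM claim: a flat, direct-indexed table (one memory read per lookup) must cover every possible prefix value, so its size is set by the prefix length, not the route count. The 4-bytes-per-entry figure below is an assumption for illustration, not from the thread:

```python
ENTRY_BYTES = 4  # assumed size of one next-hop entry

def direct_table_bytes(prefix_len, entry_bytes=ENTRY_BYTES):
    """Size of a flat lookup table indexed directly by the first
    prefix_len bits of the destination address."""
    return (2 ** prefix_len) * entry_bytes

for plen in (24, 32, 48):
    gib = direct_table_bytes(plen) / 2 ** 30
    print(f"/{plen}: {gib:,.6g} GiB")
```

At /24 the table is 64 MiB (trivially SRAM), at /32 it is 16 GiB (feasible with tricks), and at /48 it is a petabyte — which is why fixed-time lookup at /48 falls back to TCAM keyed on the actual routes instead of a direct index.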
Re: Shim6, was: Re: filtering /48 is going to be necessary
William Herrin wrote: I've been an IRTF RRG participant and in my day job I build backend systems for mobile messaging devices used in some very challenging and very global IP and non-IP environments. I know the non-IP mobile environment is heavily encumbered. So, I can understand why you insist on using DNS for mobility, only to make IP mobility as encumbered as the non-IP ones. Au contraire. Triangle elimination is a problem because the IP address can't change with session survivability. But that's because TCP and UDP require it. If A follows from B and B follows from C then A follows from C: TCP is at fault. If a correspondent host CH sends packets to a mobile host MH, they may be tunneled by a home agent HA or, with triangle elimination, tunneled by CH itself. In both cases, the IP addresses of the internal packets within the tunnels are those of CH and MH's home address, which TCP handles just normally. Ignoring that DNS does not work so fast, TCP becomes "not sure what addresses it should be talking to" only after a long timeout. Says who? Our hypothetical TCP can become unsure as soon as the first retransmission if we want it to. It can even become unsure when handed a packet to send after a long delay with no traffic. There's little delay kicking off the recheck either way. That may be an encumbered way of doing things in non-IP, or bell-headed, mobile systems, where 0.05 second of voice loss is acceptable but 0.2 second of voice loss is significant. However, on the Internet, 0.05 second of packet loss can be significant and things work end to end. In this case, your peer, a mobile host, is the proper end, because it is sure when it has lost or is losing a link. Then, the end establishes a new link with a new IP address and initiates update messages for triangle elimination at the proper timing, without unnecessary checking. According to the end-to-end argument of Saltzer et al.: The function in question can completely and correctly be implemented only with the knowledge and help of the application standing at the end points of the communication system. the mobility module of the mobile host has the knowledge of the proper timing to update triangle elimination, the function in question. Masataka Ohta
Re: Shim6, was: Re: filtering /48 is going to be necessary
William Herrin wrote: When I ran the numbers a few years ago, a route had a global cost impact in the neighborhood of $8000/year. It's tough to make a case that folks who need multihoming's reliability can't afford to put that much into the system. The cost of a bloated DFZ routing table is not so small and is paid by all the players, including those who use the DFZ but do not multihome. Hi, http://bill.herrin.us/network/bgpcost.html If you believe there's an error in my methodology, feel free to take issue with it. Your estimate of the number of routers in the DFZ: somewhere between 120,000 and 180,000 with the consensus number near 150,000 is a result of the high cost of routers and is inappropriate for estimating the global cost of a routing table entry. Because DFZ-capable routers are so expensive, the actual number of routers is so limited. If the number of routes in the DFZ were, say, 100, many routers and hosts would be default free. Often overlooked is that multihoming through multi-addressing could solve IP mobility too. Not. What is often overlooked is the fact that they are orthogonal problems. I respectfully disagree. My statement is based on my experience implementing a locator/ID separation system with multi-address TCP and IP mobility. They need separate mechanisms and separate coding. Current mobility efforts have gone down a blind alley of relays from a home server and handoffs from one network to the next. And in all fairness, with TCP tightly bound to a particular IP address pair there aren't a whole lot of other options. Nevertheless, it's badly suboptimal. Latency and routing inefficiency rapidly increase with distance from the home node, among other major problems. That is a mobility issue of triangle elimination having nothing to do with TCP. But suppose you had a TCP protocol that wasn't statically bound to the IP address by the application layer.
Suppose each side of the connection referenced each other by name, TCP expected to spread packets across multiple local and remote addresses, and suppose TCP, down at layer 4, expected to generate calls to the DNS any time it wasn't sure what addresses it should be talking to. Ignoring that DNS does not work so fast, TCP becomes "not sure what addresses it should be talking to" only after a long timeout. And if the node gets even moderately good at predicting when it will lose availability for each network it connects to and/or when to ask the DNS again instead of continuing to try the known IP addresses you can What? A node asks the DNS for the IP addresses of its peer, because the node is changing its own IP addresses? The only end-to-end way to handle multiple addresses is to let applications handle them explicitly. For connection-oriented protocols, that's nonsense. Pick an appropriate mapping function and you can handle multiple layer 3 addresses just fine at layer 4. It will require that the applications perform the reverse mapping function when they require raw IP addresses. For connectionless protocols, maybe. I'm afraid you are unaware of connected UDP. However, I'm not convinced that can't be reliably accomplished with a hinting process where the application tells layer 4 its best guess of which send()'s succeeded or failed and lets layer 4 figure out the resulting gory details of address selection. That's annoying, which is partly why shim6 failed. It's a lot easier for UDP-based applications to directly manage multiple IP addresses. Masataka Ohta
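For readers unfamiliar with the term: "connected UDP" means calling connect() on a UDP socket, which fixes the peer address in the kernel so you can use plain send()/recv(), have datagrams from other sources filtered out, and receive asynchronous ICMP errors. A minimal sketch over loopback (addresses and payloads are arbitrary illustration):

```python
import socket

# A UDP "server" bound to an ephemeral loopback port.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))

# A connected UDP socket: connect() fixes the peer address, so we can
# use send()/recv() with no destination argument, and the kernel drops
# datagrams arriving from any other source address.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.connect(server.getsockname())

client.send(b"ping")                 # no sendto() needed once connected
data, peer = server.recvfrom(1024)   # server sees the client's address
server.sendto(b"pong", peer)
reply = client.recv(1024)
print(reply)                         # b'pong'
```

Note that connect() on UDP sends no packets; it only sets kernel state, which is exactly why it binds the application to a single peer address and sits awkwardly with multi-address schemes.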
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Mar 13, 2:21 am, Masataka Ohta mo...@necom830.hpcl.titech.ac.jp wrote: William Herrin wrote: When I ran the numbers a few years ago, a route had a global cost impact in the neighborhood of $8000/year. It's tough to make a case that folks who need multihoming's reliability can't afford to put that much into the system. The cost of a bloated DFZ routing table is not so small and is paid by all the players, including those who use the DFZ but do not multihome. Hi, http://bill.herrin.us/network/bgpcost.html If you believe there's an error in my methodology, feel free to take issue with it. Your estimate of the number of routers in the DFZ: somewhere between 120,000 and 180,000 with the consensus number near 150,000 is a result of the high cost of routers and is inappropriate for estimating the global cost of a routing table entry. Because DFZ-capable routers are so expensive, the actual number of routers is so limited. If the number of routes in the DFZ were, say, 100, many routers and hosts would be default free For quite some time, a sub-$2000 PC running Linux/BSD has been able to cope with DFZ table sizes and handle enough packets per second to saturate two or more of the prevalent LAN interfaces of the day. The reason current routers in the core are so expensive is because of the 40 gigabit interfaces, custom ASICs to handle billions of PPS, esoteric features, and lack of competition. The fact that long-haul fiber is very expensive to run limits the number of DFZ routers more than anything else. Why not take a default route and simplify life when you're at the end of a single coax link? If you're lucky enough to have access to fiber from multiple providers, the cost of a router which can handle a full table is not a major concern compared with your monthly recurring charges. -- RPM
Re: Shim6, was: Re: filtering /48 is going to be necessary
Ryan Malayter wrote: If the number of routes in the DFZ were, say, 100, many routers and hosts would be default free For quite some time, a sub-$2000 PC running Linux/BSD has been able to cope with DFZ table sizes and handle enough packets per second to saturate two or more of the prevalent LAN interfaces of the day. What if you run Windows? The reason current routers in the core are so expensive is because of the 40 gigabit interfaces, custom ASICs to handle billions of PPS, esoteric features, and lack of competition. The point of http://bill.herrin.us/network/bgpcost.html was that routers are more expensive because of the bloated routing table. If you deny it, you must deny its conclusion. The fact that long-haul fiber is very expensive to run limits the number of DFZ routers more than anything else. Given that the global routing table is bloated because of site multihoming, where the site uses multiple ISPs within a city, the cost of long-haul fiber is irrelevant. Why not take a default route and simplify life when you're at the end of a single coax link? That's fine. If you're lucky enough to have access to fiber from multiple providers, the cost of a router which can handle a full table is not a major concern compared with your monthly recurring charges. As it costs less than $100 per month to have fiber from a local ISP, having fiber from multiple ISPs costs a lot less; it is negligible compared to having routers with such a bloated routing table. Masataka Ohta
Re: Shim6, was: Re: filtering /48 is going to be necessary
In a message written on Tue, Mar 13, 2012 at 02:19:00PM +1100, Geoff Huston wrote: On 13/03/2012, at 2:31 AM, Leo Bicknell wrote: It was never clear to me that even if it worked 100% as advertised that it would be cheaper / better in the global sense. I think that's asking too much of the IETF Leo - Shim6 went through much the same process as most of the IETF work these days: bubble of thought, BOF sanity check, requirements work, protocol prototyping, technology specification. I think you took my statement a bit too literally, as if I wanted a proof that shim6 would be cheaper than building larger routers. That would be asking way too much. However, shim6 for me never even passed the theoretical smell test economically. To make routers handle more DFZ routes basically means putting more memory in routers. It may be super fancy super expensive fast TCAM to handle the job, but at the end of the day it's pretty much just more memory, which means more money. There's a wild range of estimates as to how many DFZ routers there are out there, but it seems like the low end is 50,000 and the high end is 500,000. A lot of ram and a lot of money for sure, but as far as we can tell a tractable problem even with a growth rate much higher than we have now. Compare and contrast with shim6, even if you assume it does everything it was billed to do. First, it assumes we migrate everyone to IPv6, because it's not an IPv4 solution. Second, it assumes we update well, basically every single device with an IP stack. I'm guessing we're north of 5 billion IP devices in the world, and wouldn't be surprised if the number is more like 10 billion. Third, because it is a software solution, it will have to be patched/maintained/ported _forever_. I'm hard pressed in my head to rationalize how maintaining software for the next 50 years on a few billion or so boxes is cheaper in the global sense than adding memory to perhaps half a million routers.
-- Leo Bicknell - bickn...@ufp.org - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/ pgpnVPiZxAff6.pgp Description: PGP signature
Re: Shim6, was: Re: filtering /48 is going to be necessary
2012/3/13 Masataka Ohta mo...@necom830.hpcl.titech.ac.jp: William Herrin wrote: http://bill.herrin.us/network/bgpcost.html If you believe there's an error in my methodology, feel free to take issue with it. Your estimate of the number of routers in the DFZ: somewhere between 120,000 and 180,000 with the consensus number near 150,000 is a result of the high cost of routers and is inappropriate for estimating the global cost of a routing table entry. Hi, Please elaborate. In what way is the average cost of routers carrying the DFZ table an inappropriate variable in estimating the cost of the routing system? Because DFZ-capable routers are so expensive, the actual number of routers is so limited. If the number of routes in the DFZ were, say, 100, many routers and hosts would be default free. If wishes were horses, beggars would ride. The number of routes in the DFZ isn't 100 and is trending north, not south. Often overlooked is that multihoming through multi-addressing could solve IP mobility too. Not. What is often overlooked is the fact that they are orthogonal problems. I respectfully disagree. My statement is based on my experience implementing a locator/ID separation system with multi-address TCP and IP mobility. They need separate mechanisms and separate coding. I've been an IRTF RRG participant and in my day job I build backend systems for mobile messaging devices used in some very challenging and very global IP and non-IP environments. If we're done touting our respective qualifications to hold an opinion, let's get back to vetting the idea itself. Current mobility efforts have gone down a blind alley of relays from a home server and handoffs from one network to the next. And in all fairness, with TCP tightly bound to a particular IP address pair there aren't a whole lot of other options. Nevertheless, it's badly suboptimal. Latency and routing inefficiency rapidly increase with distance from the home node, among other major problems.
That is a mobility issue of triangle elimination having nothing to do with TCP. Au contraire. Triangle elimination is a problem because the IP address can't change with session survivability. But that's because TCP and UDP require it. If A follows from B and B follows from C then A follows from C: TCP is at fault. But suppose you had a TCP protocol that wasn't statically bound to the IP address by the application layer. Suppose each side of the connection referenced each other by name, TCP expected to spread packets across multiple local and remote addresses, and suppose TCP, down at layer 4, expected to generate calls to the DNS any time it wasn't sure what addresses it should be talking to. Ignoring that DNS does not work so fast, TCP becomes "not sure what addresses it should be talking to" only after a long timeout. Says who? Our hypothetical TCP can become unsure as soon as the first retransmission if we want it to. It can even become unsure when handed a packet to send after a long delay with no traffic. There's little delay kicking off the recheck either way. And if the node gets even moderately good at predicting when it will lose availability for each network it connects to and/or when to ask the DNS again instead of continuing to try the known IP addresses you can What? A node asks the DNS for the IP addresses of its peer, because the node is changing its own IP addresses? A re-verify by name lookup kicks off in a side thread any time the target threshold for a certainty heuristic is hit. Inputs into that heuristic include things like the TTL expiration of the prior lookup, the time since successful communication with the peer, and the time spent retrying since the last successful communication with the peer. If you have any communication with the peer on any address pair, he can tell you what addresses should still be on your try-me list.
If there's a reasonable chance that you've lost communication with the peer, then you ask the DNS server for the peer's latest information. The only end-to-end way to handle multiple addresses is to let applications handle them explicitly. For connection-oriented protocols, that's nonsense. Pick an appropriate mapping function and you can handle multiple layer 3 addresses just fine at layer 4. It will require that the applications perform the reverse mapping function when they require raw IP addresses. No. The application passes the IP address in a string the same way it passes a name. The layer 4 protocol figures out how it's going to map that to a name. It could do a reverse mapping. It could connect to the raw IP and request that the peer provide a name. There are several other strategies which could be used independently or as a group. But you avoid using them at the application level. Keep that operation under layer 4's control. For connectionless protocols, maybe. I'm afraid you are unaware of connected UDP. Your fears are unfounded. However, I'm not
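Herrin's "certainty heuristic" can be sketched as a simple threshold function over state layer 4 already has. This is a hypothetical illustration: the weights, caps, and threshold below are invented for the sketch, not taken from the thread:

```python
def should_reverify(now, ttl_expiry, last_success, retry_since,
                    threshold=1.0):
    """Decide whether to kick off a side-thread DNS re-lookup of a peer.
    Inputs mirror the heuristic described in the thread: TTL expiration
    of the prior lookup, time since the last successful communication,
    and time spent retrying since that success. All weights arbitrary."""
    score = 0.0
    if now >= ttl_expiry:                              # cached answer expired
        score += 0.5
    score += min((now - last_success) / 30.0, 0.5)     # staleness, capped
    if retry_since is not None:                        # retries piling up
        score += min((now - retry_since) / 5.0, 0.5)
    return score >= threshold
```

A real stack would run this per peer and debounce the lookups; the point is only that the trigger is cheap to compute, so "asking the DNS again" need not wait for a long timeout.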
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Tue, Mar 13, 2012 at 9:48 AM, Leo Bicknell bickn...@ufp.org wrote: I'm hard pressed in my head to rationalize how maintaining software for the next 50 years on a few billion or so boxes is cheaper in the global sense than adding memory to perhaps half a million routers. For a one-order of magnitude increase in routes (upper bound of $30B/year the BGP way) it may or may not be. For a four-orders increase ($30T/year) it's self-evidently cheaper to change software on the billion or so boxes. How many routes would a system improvement that radically reduced the cost per route add? Regards, Bill Herrin
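Herrin's $30B and $30T figures are easy to reproduce from his $8000/route/year estimate, assuming a DFZ table of roughly 375,000 routes (an assumed figure in line with early-2012 table sizes, not stated in the thread):

```python
COST_PER_ROUTE = 8_000    # USD per route per year, Herrin's estimate
DFZ_ROUTES = 375_000      # assumed size of the early-2012 global table

for magnitude in (1, 4):  # one and four orders of magnitude of growth
    routes = DFZ_ROUTES * 10 ** magnitude
    cost = routes * COST_PER_ROUTE
    print(f"10^{magnitude} more routes: ${cost / 1e9:,.0f} billion/year")
```

One order of magnitude lands at $30 billion/year and four orders at $30,000 billion ($30T)/year, matching the bounds quoted above.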
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Mar 13, 2012, at 6:03 AM, Masataka Ohta wrote: Ryan Malayter wrote: If the number of routes in DFZ is, say, 100, many routers and hosts will be default free For quite some time, a sub-$2000 PC running Linux/BSD has been able to cope with DFZ table sizes and handle enough packets per second to saturate two or more if the prevalent LAN interfaces of the day. What if, you run windows? Why would you want to run windows on a box you're trying to use as a router? That's like trying to invade Fort Knox with a bag of plastic soldiers. Leo's point is that you can build/buy a DFZ capable router for less than $2,000. If you run windows, the box will be more expensive, less capable, and less reliable. If that's what you want, knock yourself out, but, it's hardly relevant to the discussion at hand. The reason current routers in the core are so expensive is because of the 40 gigabit interfaces, custom ASICs to handle billions of PPS, esoteric features, and lack of competition. The point of http://bill.herrin.us/network/bgpcost.html was that routers are more expensive because of bloated routing table. If you deny it, you must deny its conclusion. To a certain extent you are right. I believe that Bill's analysis and his conclusions are deeply flawed in many ways. However, he is marginally correct in that the high cost of core DFZ routers is the product of the large forwarding table multiplied by the cost per forwarding entry in a high-pps high-data-rate system. Further adding to this is the fact that high-rate (pps,data) routers generally need to distribute copies of the FIB to each line card so the cost per forwarding entry is further multiplied by the number of line cards (and in some cases, the number of modules installed on each line card). The fact that long-haul fiber is very expensive to run limits the number of DFZ routers more than anything else. 
Given that the global routing table is bloated because of site multihoming, where the site uses multiple ISPs within a city, the cost of long-haul fiber is irrelevant. Long-haul meaning anything that leaves the building. Yes, it's a poor choice of terminology, but, if you prefer, the costs of last-mile fiber apply equally to Leo's point. Why not take a default route and simplify life when you're at the end of a single coax link? That's fine. If you're lucky enough to have access to fiber from multiple providers, the cost of a router which can handle a full table is not a major concern compared with your monthly recurring charges. As it costs less than $100 per month to have fiber from a local ISP, having fiber from multiple ISPs costs a lot less than having routers with such a bloated routing table. $100/month * 2 = $200/month. $200/month pays for a DFZ-capable router every year. That means the cost of 2*fiber is quite a bit more than the cost of the router. There is a difference between a DFZ router and a core router. I personally run a DFZ router for my personal AS. I don't personally own or run a core router for my personal AS. The fact that people conflate the idea of a DFZ router with the idea of a core router is part of the problem and a big part of where Bill's cost structure analysis breaks, as you pointed out. Small to medium businesses that want to multihome can easily do so with relatively small investments in equipment which are actually negligible compared to the telecom costs for the multiple connections. Owen
Re: Shim6, was: Re: filtering /48 is going to be necessary
It's _WAY_ more than a billion boxes at this point. Owen On Mar 13, 2012, at 10:27 AM, William Herrin wrote: On Tue, Mar 13, 2012 at 9:48 AM, Leo Bicknell bickn...@ufp.org wrote: I'm hard pressed in my head to rationalize how maintaining software for the next 50 years on a few billion or so boxes is cheaper in the global sense than adding memory to perhaps half a million routers. For a one-order of magnitude increase in routes, (upper bound of $30B/year the BGP way) it may or may not be. For a four orders increase ($30T/year) it's self-evidently cheaper to change software on the billion or so boxes. How many routes would a system improvement that radically reduced the cost per route add? Regards, Bill Herrin
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Mar 13, 2:18 pm, Owen DeLong o...@delong.com wrote: On Mar 13, 2012, at 6:03 AM, Masataka Ohta wrote: Ryan Malayter wrote: If the number of routes in the DFZ were, say, 100, many routers and hosts would be default free For quite some time, a sub-$2000 PC running Linux/BSD has been able to cope with DFZ table sizes and handle enough packets per second to saturate two or more of the prevalent LAN interfaces of the day. What if you run Windows? Why would you want to run windows on a box you're trying to use as a router? That's like trying to invade Fort Knox with a bag of plastic soldiers. Check your quoting depth... you're attributing Masataka Ohta's comments to me; he brought up running Windows. I am the one who put forward the notion of a sub-$2000 DFZ router.
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Mar 13, 8:03 am, Masataka Ohta mo...@necom830.hpcl.titech.ac.jp wrote: The point of http://bill.herrin.us/network/bgpcost.html was that routers are more expensive because of the bloated routing table. If you deny it, you must deny its conclusion. Bill's analysis is quite interesting, but my initial take is that it is somewhat flawed. It assumes that the difference between what Cisco charges for a 7606 and a 3750G bears some resemblance to the actual bill of materials needed to support the larger routing table. That simply isn't the case: Cisco rightly charges what they think the market will bear for their routers and switches. I think a more realistic approach would be to use the cost differential between a router model X that supports 1M routes and the same model configured to support 2M routes. Or perhaps we could look at the street prices for TCAM expansion modules. Either would be a better indicator of the incremental cost attributable to routing table size. The majority of costs in a mid-to-high-end Cisco/Juniper chassis are sunk and have nothing to do with the size of the routing table. The expensive routers currently used by providers are expensive because the market isn't that big in quantity, so they are not commodity items. They are designed to maximize the utility of very expensive long-haul fibers and facilities to a service provider. This means providing a high density of high-speed interfaces which can handle millions to billions of packets per second. They also provide lots of features that service providers and large enterprises want, sometimes in custom ASICs. These are features which have nothing to do with the size of the DFZ routing table, but significantly impact the cost of the device. Given that the global routing table is bloated because of site multihoming, where the site uses multiple ISPs within a city, the cost of long-haul fiber is irrelevant. I suppose smaller multi-homed sites can and often do take a full table, but they don't *need* to do so.
What they do need is their routes advertised to the rest of the internet, which means they must be in the fancy-and-currently-expensive routers somewhere upstream. This is where the cost of long-haul fiber becomes relevant: until we can figure out how to dig cheaper ditches and negotiate cheaper rights-of-way, there will not be an explosion in the number of full-table provider edge routers, because there are only so many interconnection points where they are needed. Incremental growth, perhaps, but physical infrastructure cannot follow an exponential growth curve. As it costs less than $100 per month to have fiber from a local ISP, having fiber from multiple ISPs costs a lot less than having routers with such a bloated routing table. For consumer connections, a sub-$1000 PC would serve you fine with a full table given the level of over-subscription involved. Even something like Quagga or Vyatta running in a virtual machine would suffice. Or a Linksys with more RAM. Getting your providers to speak BGP with you on such a connection for that same $100/month will be quite a feat. Even in your contrived case, however, the monthly recurring charges exceed a $1000 router cost after a few months. Enterprises pay several thousand dollars per month per link for quality IP transit at Gigabit rates.
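For a sense of what the Quagga approach looks like, a minimal bgpd configuration for a dual-homed site taking full tables might resemble the following sketch. The ASNs and addresses are documentation examples (RFC 5398/5737 ranges), not values from the thread:

```text
! Minimal Quagga bgpd.conf sketch for a small dual-homed site
router bgp 64512
 bgp router-id 192.0.2.1
 ! upstream A
 neighbor 198.51.100.1 remote-as 64496
 ! upstream B
 neighbor 203.0.113.1 remote-as 64499
 ! announce our own prefix
 network 192.0.2.0/24
```

The configuration side really is this small; as the post says, the hard part is getting two providers to run BGP with you at consumer prices, not the software.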
Re: Shim6, was: Re: filtering /48 is going to be necessary
Yes, the economics of routing are strange, and the lack of any real strictures in the routing tables is testament to the observation that despite more than two decades of tossing the idea around we've yet to find the equivalent of a route deaggregation tax or a route advertisement tax or any other mechanism that effectively turns the routing space into a form of market that imposes some economic constraints on the activity. among other things, i suspect that the shadow of telco settlements makes us shy away from this. randy
Re: Shim6, was: Re: filtering /48 is going to be necessary
On 14/03/2012, at 9:16 AM, Randy Bush wrote: Yes, the economics of routing are strange, and the lack of any real strictures in the routing tables is testament to the observation that despite more than two decades of tossing the idea around we've yet to find the equivalent of a route deaggregation tax or a route advertisement tax or any other mechanism that effectively turns the routing space into a form of market that imposes some economic constraints on the activity. among other things, i suspect that the shadow of telco settlements makes us shy away from this. Agreed. It's all ugly! The shadow of telco settlement nonsense, the entire issue of route pull vs route push, and the spectre of any such payments morphing into a coerced money flow towards the so-called tier 1 networks all make this untenable. The topic has been coming up pretty regularly every 2 years since about 1994 to my knowledge, and probably earlier, and has never managed to get anywhere useful. Geoff
Re: Shim6, was: Re: filtering /48 is going to be necessary
Yes, the economics of routing are strange, and the lack of any real strictures in the routing tables is testament to the observation that despite more than two decades of tossing the idea around we've yet to find the equivalent of a route deaggregation tax or a route advertisement tax or any other mechanism that effectively turns the routing space into a form of market that imposes some economic constraints on the activity. among other things, i suspect that the shadow of telco settlements makes us shy away from this. Agreed. It's all ugly! The shadow of telco settlement nonsense, the entire issue of route pull vs route push, and the spectre of any such payments morphing into a coerced money flow towards the so-called tier 1 networks all make this untenable. The topic has been coming up pretty regularly every 2 years since about 1994 to my knowledge, and probably earlier, and has never managed to get anywhere useful. so we are left with o name and shame, and we have seen how unsuccessful that has been. the polluters have no shame. o operational incentives. peers' and general routing filters were the classic disincentive to deaggregate. but the droids cave in the minute the geeks leave the room (ntt/verio caved within a month or two of my departure). o router hacks. we have had tickets open for many years asking for knob variations on 'if it is covered (from same peer, from same origin, ...), drop it.' none of which seem to move us forward. i guess the lesson is that, as long as we are well below moore, we just keep going down the slippery, and damned expensive, slope. randy
Re: Shim6, was: Re: filtering /48 is going to be necessary
In a message written on Wed, Mar 14, 2012 at 07:58:30AM +0900, Randy Bush wrote: none of which seem to move us forward. i guess the lesson is that, as long as we are well below moore, we just keep going down the slippery, and damned expensive, slope. Bill's model for price is too simple, because the number of devices with a full table changes as the price pressure changes, and that causes other costs. Quite simply, if a box that could take a full table were 10x cheaper, more people would take a full table at the edge. More full tables at the edge probably means more BGP speakers. More BGP speakers means more churn, and churn means the core device needs more CPU. TL;DR A savings in RAM may result in an increased need for CPU, based on a change in user behavior. I also think the difference in the BOM to a router vendor is small for most boxes. That is, the actual cost-to-manufacture difference between a 1M-route box and a 2M-route box is noise; on the high end the cost of 40G and 100G optics dominates, and on the low end in a CPU-switching box RAM is super-cheap. The only proof I can offer is the _lack_ of vendors offering different route-holding profiles, and that the few that do are stuck in the mid-range equipment. If route memory was such a big factor you would see more vendors with route memory options. Indeed, the number of boxes with route-memory options has dropped over time, and I think this is due to the fact that memory prices have dropped _much_ faster than CPU or optic prices. TL;DR backbone routers are on a treadmill for faster interfaces, and memory is a small fraction of their cost; edge routers are on a treadmill for more CPU for edge features, and again RAM is a fraction of their cost. It's only boxes in the middle being squeezed. I'll note Bill used the 6509/7600 platform, which is solidly in the middle and does have route-memory options (Sup720-3C vs. Sup720-3CXL). 
If my theory is right, he used pretty much the _worst_ case to arrive at his $8k per route figure. The list price difference between these two cards is $12,000 to go from 256,000 routes to 1,000,000 routes. $12,000 / ~750,000 additional routes = 1.6 cents per route per box. That matches Bill's number (and I think is where he got it): $8000 per route / 1.6 cents per route per box = 500,000 boxes. But that box has a 5-7 year time frame, so it's really more like (being generous) $1600 per route per year. Priced a 100 Gig optic lately, or a long haul DWDM system? I don't think the cost of routes is damned expensive. -- Leo Bicknell - bickn...@ufp.org - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/
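Leo's arithmetic can be checked in a few lines (the figures below are the list prices and lifetimes quoted in the post, used purely for illustration, not current pricing):

```python
# Back-of-envelope check of the per-route cost figures above.
card_price_delta = 12_000                 # Sup720-3C -> Sup720-3CXL upgrade, USD
extra_routes = 750_000                    # ~256k -> 1M route capacity
cost_per_route_per_box = card_price_delta / extra_routes
print(cost_per_route_per_box)             # 0.016 dollars = 1.6 cents

# Bill's $8000-per-route figure implies this many full-table boxes:
boxes = 8_000 / cost_per_route_per_box
print(round(boxes))                       # 500000

# Amortized over a (generous) 5-year box lifetime:
per_route_per_year = 8_000 / 5
print(per_route_per_year)                 # 1600.0
```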
Re: Shim6, was: Re: filtering /48 is going to be necessary
Given that the global routing table is bloated because of site multihoming, where the site uses multiple ISPs within a city, the cost of long-haul fiber is irrelevant. I suppose smaller multi-homed sites can and often do take a full table, but they don't *need* to do so. What they do need is their routes advertised to the rest of the internet, which means they must be in the fancy-and-currently-expensive routers somewhere upstream. This is where the cost of long-haul fiber becomes relevant: Until we can figure out how to dig cheaper ditches and negotiate cheaper rights-of-way, there will not be an explosion of the number of full-table provider edge routers, because there are only so many interconnection points where they are needed. Incremental growth, perhaps, but physical infrastructure cannot follow an exponential growth curve. Not entirely accurate. Most of the reduction in cost/Mbps that has occurred over the last couple of decades has come not from better digging economics (though there has been some improvement there), but rather from more Mbps per dig. As technology continues to increase the Mbps/strand, strands/cable, etc., the cost/Mbps will continue to drop. I expect within my lifetime that multi-gigabit ethernet will become commonplace in the household LAN environment and that when that becomes reality, localized IP Multicast over multi-gigabit ethernet will eventually supplant HDMI as the primary transport for audio/video streams between devices (sources such as BD players, DVRs, computers, etc. and destinations such as receivers/amps, monitors, speaker drivers, etc.). There are already hackish efforts at this capability in the form of TiVO's HTTPS services, Sling Box, and others. As it costs less than $100 per month to have fiber from a local ISP, having fiber from multiple ISPs costs a lot less; it is negligible compared to having routers with such a bloated routing table. 
For consumer connections, a sub-$1000 PC would serve you fine with a full table given the level of over-subscription involved. Even something like Quagga or Vyatta running in a virtual machine would suffice. Or a Linksys with more RAM. Getting your providers to speak BGP with you on such a connection for that same $100/month will be quite a feat. Even in your contrived case, however, the monthly recurring charges exceed a $1000 router cost after a few months. Simpler solution: let the providers speak whatever they will sell you. Ideally, find one that will at least sell you a static address. Then use a tunnel to do your real routing. There are several free tunnel services and I know at least one will do BGP. Enterprises pay several thousand dollars per month per link for quality IP transit at Gigabit rates. Since this isn't a marketing list, I'll let this one slide by. Owen
Re: Shim6, was: Re: filtering /48 is going to be necessary
Doug Barton do...@dougbarton.us writes: On 3/11/2012 3:15 PM, Iljitsch van Beijnum wrote: But ARIN's action meant it never had a chance. I really don't get why they felt the need to start allowing IPv6 PI after a decade Because as far back as 2003 ARIN members (and members from all the other RIRs for that matter) were saying in very clear terms that PI space was a requirement for moving to v6. No one wanted to lose the provider independence that they had gained with v4. Without that, v6 was a total non-starter. ARIN was simply listening to its members. It didn't help that there was initially no implementation of shim6 whatsoever. That later turned into a single prototype implementation of shim6 for linux. As much as I tried to keep an open mind about shim6, eventually it became clear that this was a Gedankenexperiment in protocol design. Somewhere along the line I started publicly referring to it as sham6. I'm sure I'm not the only person who came to that conclusion. Grass-roots, bottom-up policy process + Need for multihoming + Got tired of waiting = IPv6 PI -r
RE: Shim6, was: Re: filtering /48 is going to be necessary
Grass-roots, bottom-up policy process + Need for multihoming + Got tired of waiting = IPv6 PI -r A perfect summation. Also given that people understand what PI space is and how it works and indeed it does pretty much just work for the end users of the space. -- Leigh Porter UK Broadband
Re: Shim6, was: Re: filtering /48 is going to be necessary
On 12-3-2012 16:07, Robert E. Seastrom wrote: Doug Barton do...@dougbarton.us writes: Grass-roots, bottom-up policy process + Need for multihoming + Got tired of waiting = IPv6 PI + Cheap End Users = IPv6 NPt (IPv6 Prefix Translation) Cheers, Seth
Re: Shim6, was: Re: filtering /48 is going to be necessary
In a message written on Mon, Mar 12, 2012 at 11:07:54AM -0400, Robert E. Seastrom wrote: Grass-roots, bottom-up policy process + Need for multihoming + Got tired of waiting = IPv6 PI I'll also add that the Shim6 folks never made a good economic argument. It's true that having routes in the DFZ costs money, and that reducing the number of routes will save the industry money in router upgrades and such to handle more routes. However, it's also true that deploying SHIM6 (or similar solutions) also has a cost in rewritten software, training for network engineers and administrators, and so on. It was never clear to me that even if it worked 100% as advertised that it would be cheaper / better in the global sense. -- Leo Bicknell - bickn...@ufp.org - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/
Re: Shim6, was: Re: filtering /48 is going to be necessary
On 12 Mar 2012, at 16:21 , Leigh Porter wrote: Grass-roots, bottom-up policy process + Need for multihoming + Got tired of waiting = IPv6 PI A perfect summation. Except that it didn't happen in that order. When ARIN approved PI the shim6 effort was well underway, but it was too early to be able to know to what degree it would solve the multihoming problem. Earlier, when multi6 was stuck, or later, when shim6, at least as a specification but preferably as multiple implementations, could have been evaluated, would both have been reasonable times to decide to go for PI instead. Of course, as has been the case over and over, the argument "if you give us feature X we'll implement IPv6" has never been borne out. Also given that people understand what PI space is and how it works and indeed it does pretty much just work for the end users of the space. The trouble is that it doesn't scale. Which is fine right now at the current IPv6 routing table size, but who knows what the next decades bring. We've been living with IPv4 for 30 years now, and IPv6 doesn't have a built-in 32-bit expiry date so it's almost certainly going to be around for much longer.
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Mar 12, 10:07 am, Robert E. Seastrom r...@seastrom.com wrote: It didn't help that there was initially no implementation of shim6 whatsoever. That later turned into a single prototype implementation of shim6 for linux. As much as I tried to keep an open mind about shim6, eventually it became clear that this was a Gedankenexperiment in protocol design. Somewhere along the line I started publicly referring to it as sham6. I'm sure I'm not the only person who came to that conclusion. I thought the IETF required two inter-operable implementations for protocols. Or was that just for standards-track stuff? Anyway, the effort involved in getting Shim6 implemented globally on all devices would have been nearly as large as switching over all applications from TCP to a protocol with a proper session layer, like SCTP. I believe there are libraries that wrap SCTP and make it look like TCP to legacy applications; wouldn't that have been a better approach?
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Mar 12, 2012, at 8:23 AM, Seth Mos wrote: On 12-3-2012 16:07, Robert E. Seastrom wrote: Doug Barton do...@dougbarton.us writes: Grass-roots, bottom-up policy process + Need for multihoming + Got tired of waiting = IPv6 PI + Cheap End Users = IPv6 NPt (IPv6 Prefix Translation) Cheers, Seth I don't get the association between cheap end users and NPT. Can you explain how one relates to the other, given the added costs of unnecessarily translating prefixes? Owen
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Mar 12, 2012, at 8:56 AM, Iljitsch van Beijnum wrote: On 12 Mar 2012, at 16:21 , Leigh Porter wrote: Grass-roots, bottom-up policy process + Need for multihoming + Got tired of waiting = IPv6 PI A perfect summation. Except that it didn't happen in that order. When ARIN approved PI the shim6 effort was well underway, but it was too early to be able to know to what degree it would solve the multihoming problem. Earlier, when multi6 was stuck, or later, when shim6, at least as a specification but preferably as multiple implementations, could have been evaluated, would both have been reasonable times to decide to go for PI instead. Of course, as has been the case over and over, the argument "if you give us feature X we'll implement IPv6" has never been borne out. Except it didn't happen that way. The argument wasn't "If you give us PI, we'll implement IPv6." The argument that carried the day and is, IMHO, quite valid was "If you don't give us PI we definitely WON'T implement IPv6." The inability to obtain PI was a serious deterrent to IPv6 for any organization that already had IPv4 PI. Shim6 showed no promise whatsoever of changing this even in its most optimistic marketing predictions at the time. (As you point out, it was well underway at that point and it's not as if we didn't look at it prior to drafting the policy proposal.) Frankly, I think the long term solution is to implement IDR based on Locators in the native packet header and not using map/encap schemes that reduce MTU, but that doesn't seem to be a popular idea so far. Also given that people understand what PI space is and how it works and indeed it does pretty much just work for the end users of the space. The trouble is that it doesn't scale. Which is fine right now at the current IPv6 routing table size, but who knows what the next decades bring. We've been living with IPv4 for 30 years now, and IPv6 doesn't have a built-in 32-bit expiry date so it's almost certainly going to be around for much longer. 
If IPv6 works out in the 1.6-2:1 prefix:ASN ratio that I expect or even as much as 4:1, we'll get at least another 30 years out of it. Since we've had IPv6 now for about 15 years, it's already half way through that original 30. :p Owen
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Mar 11, 2012, at 3:15 PM, Iljitsch van Beijnum wrote: On 11 Mar 2012, at 20:15 , Joel jaeggli wrote: The IETF and IRTF have looked at the routing scalability issue for a long time. The IETF came up with shim6, which allows multihoming without BGP. Unfortunately, ARIN started to allow IPv6 PI just in time so nobody bothered to adopt shim6. That's a fairly simplistic version of why shim6 failed. A better reason (apart from the fact that building an upper-layer overlay of the whole internet on an IP protocol that's largely undeployed was hard) is that it leaves the destination unable to perform traffic engineering. I'm not saying that shim6 would have otherwise ruled the world by now, it was always an uphill battle because it requires support on both sides of a communication session/association. But ARIN's action meant it never had a chance. I really don't get why they felt the need to start allowing IPv6 PI after a decade, just when the multi6/shim6 effort started to get going but before the work was complete enough to judge whether it would be good enough. That fundamentally is the business we're in when advertising prefixes to more than one provider: ingress path selection. That's the business network operators are in. That's not the business end users who don't want to depend on a single ISP are in. Remember, shim6 was always meant as a solution that addresses the needs of a potential 1 billion basement multihomers with maybe ADSL + cable. The current 25k or so multihomers are irrelevant from the perspective of routing scalability. It's the other 999,975,000 that will kill the routing tables if multihoming becomes mainstream. When discussing 'why shim6 failed' I think it's only fair to include a link to a (well reasoned, imho) network operator's perspective on what it did and did not provide in the way of capabilities that network operators desired. http://www.nanog.org/meetings/nanog35/abstracts.php?pt=NDQ3Jm5hbm9nMzU=nm=nanog35 -Darrel
Re: Shim6, was: Re: filtering /48 is going to be necessary
Hi, On 12 Mar 2012, at 18:09, Owen DeLong wrote: + Cheap End Users = IPv6 NPt (IPv6 Prefix Translation) Cheers, Seth I don't get the association between cheap end users and NPT. Can you explain how one relates to the other, given the added costs of unnecessarily translating prefixes? Well, to explain cheap here I would like to explain it as follows: - The existing yumcha plastic soap box that you can buy at your local electronics store is powerful enough. About as fast in v6 as it is in v4 since it is all software anyhow. It only gets faster from there. - Requires no cooperation from the ISP. This gets excessively worse where n > 1. Some have 8 or more for added bandwidth. - The excessive cost associated with current ISP practices that demand you use a business connection (at reduced bandwidth and increased cost). Somehow there was a decision that you can't have PI on consumer connections. - Traffic engineering is a cinch, since it is all controlled by the single box. For example, round robin the connections for increased download speed. Similar to how we do it in v4 land. - It is mighty cheap to implement in current software; a number of Cisco and Juniper releases support it. The various *bsd platforms do and linux is in development. - Not to underestimate the failover capabilities when almost all routers support 3G dongles for backup internet these days. There are considerable drawbacks of course: - Rewriting prefixes breaks voip/ftp again; without the port rewriting the impact is less, but still significant. I really wish that h323, ftp and voip would go away. Or other protocols that embed local IP information inside the datagram. But I digress. - People balk at the idea of NAT66, not to underestimate a very vocal group here. All for solutions here. :-) - It requires keeping state, so no graceful failover. This means dropping sessions of course, but the people that want this likely won't care given the price they are paying. 
Probably missed a bunch of arguments that people will complain about. It is probably best explained in the current experimental RFC for NPt: http://tools.ietf.org/html/rfc6296 Cheers, Seth
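For the curious, the checksum-neutral prefix rewrite that RFC 6296 describes can be sketched in a few lines of Python for the /48 case. This is a simplified illustration of the idea (a real translator also handles the 0xFFFF result edge case and prefixes longer than /48); the function names are invented here:

```python
import ipaddress

def csum16_add(a, b):
    """One's-complement 16-bit addition."""
    s = a + b
    return (s & 0xFFFF) + (s >> 16)

def npt66_translate(addr, internal, external):
    """Rewrite the /48 prefix of addr from internal to external,
    folding an adjustment into word 3 so transport checksums remain
    valid (the checksum-neutral mapping idea of RFC 6296, simplified)."""
    words = [int.from_bytes(addr.packed[i:i+2], "big") for i in range(0, 16, 2)]

    def prefix_sum(net):
        total = 0
        for i in range(0, 6, 2):  # the three 16-bit words of a /48
            total = csum16_add(total, int.from_bytes(net.network_address.packed[i:i+2], "big"))
        return total

    # adjustment = old prefix sum - new prefix sum, in one's complement
    adj = csum16_add(prefix_sum(internal), 0xFFFF - prefix_sum(external))
    new_words = [int.from_bytes(external.network_address.packed[i:i+2], "big") for i in range(0, 6, 2)]
    new_words.append(csum16_add(words[3], adj))  # fold adjustment into word 3
    new_words.extend(words[4:])                  # interface identifier unchanged
    return ipaddress.IPv6Address(b"".join(w.to_bytes(2, "big") for w in new_words))
```

Because the adjustment exactly cancels the prefix change in the one's-complement sum, a TCP or UDP checksum computed over the original address still verifies over the translated one, so the translator never has to touch the transport payload.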
Re: Shim6, was: Re: filtering /48 is going to be necessary
Ryan Malayter malay...@gmail.com writes: On Mar 12, 10:07 am, Robert E. Seastrom r...@seastrom.com wrote: It didn't help that there was initially no implementation of shim6 whatsoever. That later turned into a single prototype implementation of shim6 for linux. As much as I tried to keep an open mind about shim6, eventually it became clear that this was a Gedankenexperiment in protocol design. Somewhere along the line I started publicly referring to it as sham6. I'm sure I'm not the only person who came to that conclusion. I thought the IETF required two inter-operable implementations for protocols. Or was that just for standards-track stuff? Rough consensus and working code is soo 1993. -r
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Mon, Mar 12, 2012 at 11:31 AM, Leo Bicknell bickn...@ufp.org wrote: In a message written on Mon, Mar 12, 2012 at 11:07:54AM -0400, Robert E. Seastrom wrote: Grass-roots, bottom-up policy process + Need for multihoming + Got tired of waiting = IPv6 PI It was never clear to me that even if it worked 100% as advertised that it would be cheaper / better in the global sense. Hi Leo, When I ran the numbers a few years ago, a route had a global cost impact in the neighborhood of $8000/year. It's tough to make a case that folks who need multihoming's reliability can't afford to put that much into the system. As long as the system is largely restricted to folks who do put that much in, there's really no problem with the current flood-all-routers multihoming strategy: at $8k/year the demand will never again exceed the supply. A *working* multi-addressed end user system (like shim6 attempted) could solve cheap multihoming. That could have a billion dollar a year impact as folks at the leaf nodes decide they don't need the more costly BGP multihoming. But that's not where the real money is. Often overlooked is that multihoming through multi-addressing could solve IP mobility too. Provider-agnostic and media-agnostic mobility without levering off a home router. That's where the money is. Carry your voip call uninterrupted from your home wifi on the cable modem to your cell provider in the car to your employer's wired ethernet and back. Keep your SSH sessions alive on the notebook as you travel from home, to the airport, to London and to the hotel. Let folks access the web server on your notebook as it travels from home, to the airport, to Tokyo and back. The capability doesn't exist today. The potential economic impact of such a capability's creation is unbounded. Unfortunately, shim6 didn't work in some of the boundary cases. 
Since single-homing works pretty well in the ordinary case, there's not much point to a multihoming protocol that fails to deliver in all the boundary cases. IIRC, the main problem was that they tried to bootstrap the layer 3 to layer 2 mapping function instead of externally requesting it. That's like trying to build ARP by making a unicast request to a local router instead of a broadcast/multicast request on the LAN. What happens when the local routers no longer have MAC addresses that you know about? Fail. Also, in complete fairness, shim6 suffered from the general lack of consumer interest in IPv6 that persists even today. Its proponents bought in to the hype that new work should focus on IPv6, and they paid for it. Regards, Bill Herrin -- William D. Herrin her...@dirtside.com b...@herrin.us 3005 Crane Dr. .. Web: http://bill.herrin.us/ Falls Church, VA 22042-3004
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Mar 12, 2012, at 11:53 AM, Seth Mos wrote: Hi, Op 12 mrt 2012, om 18:09 heeft Owen DeLong het volgende geschreven: + Cheap End Users = IPv6 NPt (IPv6 Prefix Translation) Cheers, Seth I don't get the association between cheap end users and NPT. Can you explain how one relates to the other, given the added costs of unnecessarily translating prefixes? Well, to explain cheap here I would like to explain it as following: - The existing yumcha plastic soap box that you can buy at your local electronics store is powerful enough. About as fast in v6 as it does v4 since it is all software anyhow. It only gets faster from there. Right. - Requires no cooperation from the ISP. This gets excessively worse where n 1. Some have 8 or more for added bandwidth. This one doesn't really parse for me. I'm not sure I understand what you are saying. - The excessive cost associated by current ISP practices that demand you use a business connection (at reduced bandwidth and increased cost). Somehow there was a decision that you can't have PI on consumer connections. There's a big gap between PA without NPT and NPT, however. At the consumer level, I'd rather go PA than NPT. For a business, it's a different story, but, for a business, PI seems feasible and I would think that the business connection is sort of a given. - Traffic engineering is a cinch, since it is all controlled by the single box. For example round robin the connections for increased download speed. Similar to how we do it in v4 land. With all the same dysfunction. Further, in v4 land this depends a great deal on support built into applications and ALGs and a lot of other bloat and hacking to glue the broken little pieces back together and make it all work. I'm truly hoping that we can move away from that in IPv6. 
I'd really like to see application developers free to develop robust networking code in their applications instead of having to focus all their resources on dealing with the perils and pitfalls of NAT environments. - It is mighty cheap to implement in current software; a number of Cisco and Juniper releases support it. The various *bsd platforms do and linux is in development. Well, I guess that depends on how and where you measure cost. Sure, if you only count the cost of making the capability available in the feature set on the router, it's cheap and easy. If you count the cost and overhead of the application bloat and complexity and the support costs, the security costs, etc., it adds up pretty quickly. Sort of like it doesn't cost much to send spam, but the cost of dealing with the never ending onslaught of unwanted email seems to go up every year. (Yes, I just compared people using NPT to spammers). - Not to underestimate the failover capabilities when almost all routers support 3G dongles for backup internet these days. If you care that much about failover, PI is a much better solution. I know my view is unpopular, but I really would rather see PI made inexpensive and readily available than see NAT brought into the IPv6 mainstream. However, in my experience, very few residential customers make use of that 3G backup port. There are considerable drawbacks of course: - Rewriting prefixes breaks voip/ftp again; without the port rewriting the impact is less, but still significant. I really wish that h323, ftp and voip would go away. Or other protocols that embed local IP information inside the datagram. But I digress. Yep. - People balk at the idea of NAT66, not to underestimate a very vocal group here. All for solutions here. :-) For good reason! - It requires keeping state, so no graceful failover. This means dropping sessions of course, but the people that want this likely won't care given the price they are paying. 
Probably missed a bunch of arguments that people will complain about. It is probably best explained in the current experimental RFC for NPt: http://tools.ietf.org/html/rfc6296 More than likely. Hopefully we can stop trying so hard to break the internet and start working on ways to make it better soon. Owen
Re: Shim6, was: Re: filtering /48 is going to be necessary
On 12 Mar 2012, at 19:30, Owen DeLong wrote: I know my view is unpopular, but, I really would rather see PI made inexpensive and readily available than see NAT brought into the IPv6 mainstream. However, in my experience, very few residential customers make use of that 3G backup port. So what assumptions do you think future IPv6-enabled homenets might make about the prefixes they receive or can use? Isn't having a PI per residential homenet rather unlikely? It would be desirable to avoid NPTv6 in the homenet scenario. Tim
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Mar 12, 2012, at 12:50 PM, Tim Chown wrote: On 12 Mar 2012, at 19:30, Owen DeLong wrote: I know my view is unpopular, but, I really would rather see PI made inexpensive and readily available than see NAT brought into the IPv6 mainstream. However, in my experience, very few residential customers make use of that 3G backup port. So what assumptions do you think future IPv6-enabled homenets might make about the prefixes they receive or can use? Isn't having a PI per residential homenet rather unlikely? Yes, but, having reasonable and/or multiple PA prefixes is very likely and there is no reason not to use that instead of cobbled solutions based on NPT. It would be desirable to avoid NPTv6 in the homenet scenario. Very much so. (Or any other scenario I can think of as well). Owen
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Mon, Mar 12, 2012 at 3:50 PM, Tim Chown t...@ecs.soton.ac.uk wrote: On 12 Mar 2012, at 19:30, Owen DeLong wrote: I know my view is unpopular, but, I really would rather see PI made inexpensive and readily available than see NAT brought into the IPv6 mainstream. However, in my experience, very few residential customers make use of that 3G backup port. So what assumptions do you think future IPv6-enabled homenets might make about the prefixes they receive or can use? Isn't having a PI per residential homenet rather unlikely? Hi Tim, Not at all. You just build a second tier to the routing system. BGP is at the top tier. The second tier anchors SOHO users' provider independent addresses to a dynamically mapped set of top-tier relay addresses where each address in the relay anchor set can reach the SOHO's IP. Then you put an entry relay at many/most ISPs which receives the unrouted portions of PI space, looks up the exit relay set and relays the packet. The ingress relays have to keep some state but it's all discardable (can be re-looked up at any time). Also, they can be pushed close enough to the network edge that they aren't overwhelmed. The egress relays are stateless. Do it right and you get within a couple percent of the routing efficiency of BGP for SOHOs with only two or three ISPs. There are some issues with dead path detection which get thorny but they're solvable. There's also an origin filtering problem: packets originating from the PI space to BGP routed space aren't relayed and the ISP doesn't necessarily need to know that one of the PA addresses assigned to customer X is acting as an inbound relay for PI space. Again: solvable. If you want to dig in to how such a thing might work, read: http://bill.herrin.us/network/trrp.html Regards, Bill Herrin -- William D. Herrin her...@dirtside.com b...@herrin.us 3005 Crane Dr. .. Web: http://bill.herrin.us/ Falls Church, VA 22042-3004
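Bill's two-tier scheme is easier to see with a toy model. Everything below (prefixes, addresses, class names) is invented for illustration and only loosely follows the trrp.html description he links:

```python
# Toy model of the second-tier lookup: the DFZ routes only PA space;
# PI prefixes resolve to a set of PA relay addresses via a mapping.

MAPPING = {
    # authoritative map: PI prefix -> PA addresses of its egress relays
    "2001:db8:1::/48": ["192.0.2.1", "198.51.100.7"],
}

class IngressRelay:
    """Receives packets destined for unrouted PI space and relays them
    toward an egress relay reachable through ordinary BGP-routed space."""

    def __init__(self, mapping):
        self.mapping = mapping
        self.cache = {}  # discardable state: safe to drop and re-look-up

    def relay(self, pi_prefix, packet):
        relays = self.cache.get(pi_prefix)
        if relays is None:
            relays = self.mapping[pi_prefix]  # the second-tier lookup
            self.cache[pi_prefix] = relays
        # encapsulate toward the first relay; a real system would probe
        # for dead paths and fail over to other addresses in the set
        return relays[0], packet
```

The cache is the "discardable state" Bill mentions: losing it costs only a re-lookup, and the egress side needs no state at all.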
Re: Shim6, was: Re: filtering /48 is going to be necessary
On 12 Mar 2012, at 21:15 , William Herrin wrote: Not at all. You just build a second tier to the routing system. It's so strange how people think a locator/identifier split will solve the scalability problem. We already have two tiers: DNS names and IP addresses. So that didn't solve anything. I don't see any reason a second second tier would.
Re: Shim6, was: Re: filtering /48 is going to be necessary
On 2012-03-12 22:14, Iljitsch van Beijnum wrote: On 12 Mar 2012, at 21:15 , William Herrin wrote: Not at all. You just build a second tier to the routing system. It's so strange how people think a locator/identifier split will solve the scalability problem. We already have two tiers: DNS names and IP addresses. So that didn't solve anything. I don't see any reason a second second tier would. Wrong analogy IMHO. Using it, you'd know how to get to a specific host in the IPv4/IPv6-centric Internet by looking up its name. Knowing a host is 'thishost.org' doesn't give you the information needed to route the IPv4/v6 packets that we still use to this specific system. You still need to look up the IP assigned to this name. For LISP (other solutions may vary, obviously), knowing node 54.100 is available (after lookup) currently at 200.101 makes it possible for core routers to remember only the paths to 200.101/16 and not the thousands of more-specific prefixes under that aggregate. This is aggregation of information at the same level as the lookup execution. The real problems for world-wide LISP adoption are currently:
- nobody sees a FIB explosion for IPv6, because
- only around 8k worth of prefixes are in the global IPv6 table
Hardly a reason for anyone to implement aggregation. If IPv6 were to reach today's IPv4 level of 400k prefixes it would still not be a very compelling reason, apart from those SPs willing to run all their edge without MPLS and with L3 devices that have very tiny FIBs - like 2/4/8k of entries. A typical core router can forward 2-3M IPv4 prefixes in hardware, and around 500k-2M IPv6 prefixes in hardware - today. The ideal LISP use case would be, for example, 4M IPv6 prefixes with steady, clearly visible growth. Aggregating this down to, for example (I've made this number completely up), 200k prefixes while still having the ability to traffic-engineer the paths between source and destination almost at the level of having all 4M prefixes in the FIB is a very compelling reason to deploy LISP.
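A rough sketch of the aggregation being described, with made-up EID and RLOC numbers in the spirit of the 54.100/200.101 example:

```python
import ipaddress

# Sketch of LISP-style aggregation as described above (all addresses
# and mappings invented). The core FIB holds only the RLOC aggregate;
# the per-site EID prefixes live in the mapping system instead.

# Mapping system: EID prefix (identifier) -> RLOC (locator)
eid_to_rloc = {
    ipaddress.ip_network("10.54.100.0/24"): ipaddress.ip_address("200.101.0.1"),
}

# Core FIB: only the aggregate covering the locators, not the
# thousands of individual site prefixes.
core_fib = [ipaddress.ip_network("200.101.0.0/16")]

def route(dest):
    dest = ipaddress.ip_address(dest)
    # Ingress tunnel router: resolve the EID to its current locator...
    for eid, rloc in eid_to_rloc.items():
        if dest in eid:
            # ...then the core only needs a route to the locator's aggregate.
            for agg in core_fib:
                if rloc in agg:
                    return str(rloc), str(agg)
    return None

rloc, agg = route("10.54.100.25")
```

The point of the sketch is the size asymmetry: the mapping system can hold millions of EID entries while the core FIB keeps one aggregate per locator block.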
-- There's no sense in being precise when you don't know what you're talking about. -- John von Neumann
Łukasz Bromirski | jid:lbromir...@jabber.org | http://lukasz.bromirski.net
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Mon, Mar 12, 2012 at 5:14 PM, Iljitsch van Beijnum iljit...@muada.com wrote: On 12 Mar 2012, at 21:15 , William Herrin wrote: Not at all. You just build a second tier to the routing system. We already have two tiers: DNS names and IP addresses. Hi Iljitsch, If only that were true. The DNS doesn't sit to the side of TCP, managing the moment-to-moment layer 4 to layer 3 mapping function the way ARP sits to the side of IP. Instead, the DNS's function is actuated all the way up at layer 7. This was the crux of my complaint about the getaddrinfo/connect APIs last week. Their design makes a future introduction of a transport protocol, something which actually does interact with the name service at the proper layer, needlessly hard. That and the common non-operation of the DNS TTL invalidate DNS's use as a routing tier. Regards, Bill Herrin -- William D. Herrin her...@dirtside.com b...@herrin.us 3005 Crane Dr. .. Web: http://bill.herrin.us/ Falls Church, VA 22042-3004
Re: Shim6, was: Re: filtering /48 is going to be necessary
William Herrin wrote: When I ran the numbers a few years ago, a route had a global cost impact in the neighborhood of $8000/year. It's tough to make a case that folks who need multihoming's reliability can't afford to put that much into the system. The cost of a bloated DFZ routing table is not so small and is paid by all the players, including those who use the DFZ but do not multihome. Those who can't pay the cost silently give up on being multihomed, which is why you overlooked them. Even those who pay the cost are not using the full routing table in their IGP, which makes their multihoming less capable. A *working* multi-addressed end user system (like shim6 attempted) Shim6 is so poorly designed that it does not work. Often overlooked is that multihoming through multi-addressing could solve IP mobility too. Not. What is often overlooked is the fact that they are orthogonal problems. Carry your voip call uninterrupted from your home wifi on the cable modem to your cell provider in the car to your employer's wired ethernet and back. Use mobile IP, which was implemented long before shim6 was designed. Unfortunately, shim6 didn't work in some of the boundary cases. Since single-homing works pretty well in the ordinary case, there's not much point to a multihoming protocol that fails to deliver all the boundary cases. Just like NAT, shim6 is an intelligent intermediate entity trying to hide its existence from applications, which is why it does not work sometimes, just as NAT does not work sometimes. The only end to end way to handle multiple addresses is to let applications handle them explicitly. Masataka Ohta
Re: Shim6, was: Re: filtering /48 is going to be necessary
2012/3/12 Masataka Ohta mo...@necom830.hpcl.titech.ac.jp: William Herrin wrote: When I ran the numbers a few years ago, a route had a global cost impact in the neighborhood of $8000/year. It's tough to make a case that folks who need multihoming's reliability can't afford to put that much into the system. The cost for bloated DFZ routing table is not so small and is paid by all the players, including those who use DFZ but do not multihome. Hi, http://bill.herrin.us/network/bgpcost.html If you believe there's an error in my methodology, feel free to take issue with it. Often overlooked is that multihoming through multi-addressing could solve IP mobility too. Not. What is often overlooked is the fact that they are orthogonal problems. I respectfully disagree. Current mobility efforts have gone down a blind alley of relays from a home server and handoffs from one network to the next. And in all fairness, with TCP tightly bound to a particular IP address pair there aren't a whole lot of other options. Nevertheless, it's badly suboptimal. Latency and routing inefficiency rapidly increase with distance from the home node, among other major problems. However, there's another way to imagine the problem: Networks become available. Networks cease to be available. No handoff. No home server. Just add and drop. Announce a route into the global system to each available network with priority set based on the node's best estimate of the network's bandwidth, likely future availability, etc. Cancel the announcement for any network that has left or is leaving range. Modify the announcement priority as the node's estimate of the network evolves. This is quite impossible with today's BGP core. The update rate would crush the core, as would the prefix count. And if those problems were magically solved, BGP still isn't capable of propagating a change fast enough to be useful for mobile applications.
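The add/drop model in that paragraph might look like this; the network names and priority numbers are invented for illustration:

```python
# Sketch of the add/drop mobility model described above: the mobile
# node maintains a set of currently reachable networks, each with a
# priority reflecting its estimate of bandwidth and likely future
# availability. No handoff, no home server -- just add and drop.

class MobileNode:
    def __init__(self):
        self.announcements = {}  # network name -> priority (higher = preferred)

    def network_available(self, name, priority):
        # "Announce a route" for a network that came into range,
        # or update the priority as the node's estimate evolves.
        self.announcements[name] = priority

    def network_lost(self, name):
        # "Cancel the announcement" for a network leaving range.
        self.announcements.pop(name, None)

    def best(self):
        return max(self.announcements, key=self.announcements.get)

node = MobileNode()
node.network_available("home-wifi", 90)
node.network_available("cell", 40)
node.network_lost("home-wifi")          # drove out of range
node.network_available("office-ethernet", 95)
```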
But suppose you had a TCP protocol that wasn't statically bound to the IP address by the application layer. Suppose each side of the connection referenced each other by name, TCP expected to spread packets across multiple local and remote addresses, and suppose TCP, down at layer 4, expected to generate calls to the DNS any time it wasn't sure what addresses it should be talking to. DNS servers can withstand the update rate. And the prefix count is moot. DNS is a distributed database. It *already* easily withstands hundreds of millions of entries in the in-addr.arpa zone alone. And if the node gets even moderately good at predicting when it will lose availability for each network it connects to and/or when to ask the DNS again instead of continuing to try the known IP addresses you can get to where network drops are ordinarily lossless and only occasionally result in a few packet losses over the course of a single-digit number of seconds. Which would be just dandy for mobile IP applications. The only end to end way to handle multiple addresses is to let applications handle them explicitly. For connection-oriented protocols, that's nonsense. Pick an appropriate mapping function and you can handle multiple layer 3 addresses just fine at layer 4. Just like we successfully handle layer 2 addresses at layer 3. For connectionless protocols, maybe. Certainly layer 7 knowledge is needed to decide whether each path is operational. However, I'm not convinced that can't be reliably accomplished with a hinting process where the application tells layer 4 its best guess of which send()'s succeeded or failed and lets layer 4 figure out the resulting gory details of address selection. Regards, Bill Herrin -- William D. Herrin her...@dirtside.com b...@herrin.us 3005 Crane Dr. .. Web: http://bill.herrin.us/ Falls Church, VA 22042-3004
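A minimal sketch of the "ask the DNS again when unsure" behaviour, using a hypothetical NameBoundConnection wrapper; real multipath transports (e.g. MPTCP) work quite differently, and the retry policy here is a made-up placeholder:

```python
import socket

# Sketch: a connection bound to a name rather than an address. It keeps
# a list of known peer addresses and falls back to a fresh getaddrinfo()
# lookup when none of them respond, picking up any addresses the peer
# has added or dropped in the meantime.

class NameBoundConnection:
    def __init__(self, host, port):
        self.host, self.port = host, port
        self.addresses = self.resolve()

    def resolve(self):
        # Spread across all the remote addresses the DNS returns.
        infos = socket.getaddrinfo(self.host, self.port,
                                   type=socket.SOCK_STREAM)
        return [ai[4][0] for ai in infos]

    def send(self, attempt_fn):
        # Try known addresses first; on total failure, re-resolve once.
        for addr in self.addresses:
            if attempt_fn(addr):
                return addr
        self.addresses = self.resolve()
        for addr in self.addresses:
            if attempt_fn(addr):
                return addr
        raise ConnectionError("no working address for %s" % self.host)

# Example: a connection that re-resolves when its cached addresses fail.
conn = NameBoundConnection("localhost", 80)
```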
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Mon, Mar 12, 2012 at 8:01 PM, William Herrin b...@herrin.us wrote: But suppose you had a TCP protocol that wasn't statically bound to the IP address by the application layer. Suppose each side of the connection referenced each other by name, TCP expected to spread packets across multiple local and remote addresses, and suppose TCP, down at layer 4, expected to generate calls to the DNS any time it wasn't sure what addresses it should be talking to. DNS servers can withstand the update rate. And the prefix count is moot. DNS is a distributed database. It *already* easily withstands hundreds of millions of entries in the in-addr.arpa zone alone. And if the node gets even moderately good at predicting when it will lose availability for each network it connects to and/or when to ask the DNS again instead of continuing to try the known IP addresses you can get to where network drops are ordinarily lossless and only occasionally result in a few packet losses over the course of a single-digit number of seconds. Which would be just dandy for mobile IP applications. DNS handles many millions of records, sure, but that's because it was designed with caching in mind. DNS changes are rarely made at the rapid rate I think you are suggesting, except by those who can stand the brunt of 5-minute time-to-live values. I think it would be insane to try and set a TTL much lower than that, but that would seem to work counter to the idea of sub-10-second loss. If you cut down caching as significantly as I think this idea would suggest, I would expect scaling to take a plunge. Also consider the significant increased load on DNS servers to handle the constant stream of dynamic DNS updates to make this possible, and that you have to find some reliable trust mechanism to handle these updates because without that you just made man in the middle attacks just a little bit easier. That said, I might be misunderstanding something. I would like to see that idea elaborated.
Re: Shim6, was: Re: filtering /48 is going to be necessary
In message camcdhonqqyuzd5cllzmbkw1tjq5h6qmle9lljo4z_h4d3co...@mail.gmail.com , Josh Hoppes writes: Also consider the significant increased load on DNS servers to handle the constant stream of dynamic DNS updates to make this possible, and that you have to find some reliable trust mechanism to handle these updates because without that you just made man in the middle attacks just a little bit easier. The DNS already supports cryptographically authenticated updates. There is a good chance that your DHCP server used one of the methods below when you got your lease. SIG(0), TSIG and GSS_TSIG all scale appropriately for this. Mark -- Mark Andrews, ISC 1 Seymour St., Dundas Valley, NSW 2117, Australia PHONE: +61 2 9871 4742 INTERNET: ma...@isc.org
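The authentication principle behind TSIG (a shared secret plus an HMAC over the update message) can be illustrated like this. This is only the principle, not the DNS wire format or the real TSIG record layout, and the key and record names are invented:

```python
import hmac
import hashlib

# Sketch of TSIG-style authentication: updater and server share a
# secret key, and each update carries an HMAC over its contents, so
# the server can reject forged or tampered updates.

shared_key = b"secret-key-for-demo"  # in practice, a generated TSIG key

def sign_update(message: bytes) -> bytes:
    return hmac.new(shared_key, message, hashlib.sha256).digest()

def verify_update(message: bytes, mac: bytes) -> bool:
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(sign_update(message), mac)

update = b"update add host.example.com. 60 AAAA 2001:db8::1"
mac = sign_update(update)
```

A man in the middle who alters the update without the key produces a MAC mismatch and the server discards it, which is the property Josh was asking for.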
Re: Shim6, was: Re: filtering /48 is going to be necessary
On 13/03/2012, at 2:31 AM, Leo Bicknell wrote: In a message written on Mon, Mar 12, 2012 at 11:07:54AM -0400, Robert E. Seastrom wrote: Grass-roots, bottom-up policy process + Need for multihoming + Got tired of waiting = IPv6 PI I'll also add that Shim6 folks never made a good economic argument. It's true that having routes in the DFZ costs money, and that reducing the number of routes will save the industry money in router upgrades and such to handle more routes. However, it's also true that deploying SHIM6 (or similar solutions) also has a cost in rewritten software, training for network engineers and administrators, and so on. It was never clear to me that even if it worked 100% as advertised that it would be cheaper / better in the global sense. I think that's asking too much of the IETF Leo - Shim6 went through much the same process as most of the IETF work these days: bubble of thought, BOF sanity check, requirements work, protocol prototyping, technology specification. Yes, the economics of routing are strange, and the lack of any real strictures in the routing tables is testament to the observation that despite more than two decades of tossing the idea around we've yet to find the equivalent of a route deaggregation tax or a route advertisement tax or any other mechanism that effectively turns the routing space into a form of market that imposes some economic constraints on the activity. So after so long looking for such a framework in routing, the hope that someday we will figure it out gets smaller and smaller every day. And in some ways the routing explosion problem is one of fear rather than actuality - the growth rates of the IPv4 routing table have been sitting at around 8% - 15% p.a. for many years. While you can't route the Internet on 15 year old hardware, the growth figures are still low enough under Moore's Law that the unit cost of routing is not escalating at levels that are notably higher than other cost elements for an ISP.
It's not the routing table explosion that will cause you to raise your fees or, worse, go bankrupt tomorrow. So in some ways for Shim6 to have a good economic argument I suspect that Shim6 would have to have pulled out of thin air an approach that completely externalised the cost of routing, and made routing completely free for ISPs. And that is simply fantasy land! Geoff
Re: Shim6, was: Re: filtering /48 is going to be necessary
On 13/03/2012, at 8:14 AM, Iljitsch van Beijnum wrote: On 12 Mar 2012, at 21:15 , William Herrin wrote: Not at all. You just build a second tier to the routing system. It's so strange how people think a locator/identifier split will solve the scalability problem. We already have two tiers: DNS names and IP addresses. So that didn't solve anything. I don't see any reason a second second tier would. I think you have encountered an article of faith Iljitsch :-) http://en.wikipedia.org/wiki/Indirection: Any problem can be solved by adding another layer of indirection.
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Mon, Mar 12, 2012 at 11:33 PM, Geoff Huston g...@apnic.net wrote: On 13/03/2012, at 8:14 AM, Iljitsch van Beijnum wrote: On 12 Mar 2012, at 21:15 , William Herrin wrote: Not at all. You just build a second tier to the routing system. It's so strange how people think a locator/identifier split will solve the scalability problem. We already have two tiers: DNS names and IP addresses. So that didn't solve anything. I don't see any reason a second second tier would. I think you have encountered an article of faith Iljitsch :-) http://en.wikipedia.org/wiki/Indirection: Any problem can be solved by adding another layer of indirection. But that usually will create another problem. Then the test must be: does any particular proposed layer of indirection solve more intractable and more valuable problems than it creates, enough more valuable to be worth the cost of implementation? Still, I concede that it would be better to more effectively use the indirection layer we have (DNS) rather than create another. Better, but not necessarily achievable. Regards, Bill Herrin -- William D. Herrin her...@dirtside.com b...@herrin.us 3005 Crane Dr. .. Web: http://bill.herrin.us/ Falls Church, VA 22042-3004
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Mon, Mar 12, 2012 at 10:42 PM, Josh Hoppes josh.hop...@gmail.com wrote: On Mon, Mar 12, 2012 at 8:01 PM, William Herrin b...@herrin.us wrote: Which would be just dandy for mobile IP applications. DNS handles many of millions of records sure, but that's because it was designed with caching in mind. DNS changes are rarely done at the rapid I think you are suggesting except for those who can stand the brunt of 5 minute time to live values. I think it would be insane to try and set a TTL much lower then that, but that would seem to work counter to the idea of sub 10 second loss. If you cut down caching as significantly as I think this idea would suggest I would expect scaling will take a plunge. Hi Josh, Actually, there was a study presented a few years ago. I think it was at a Fall NANOG. At any rate, a gentleman at a university decided to study the impact of adjusting the DNS TTL on the query count hitting his authoritative server. IIRC he tested ranges from 24 hours to 60 seconds. In my opinion he didn't control properly for browser DNS pinning (which would tend to suppress query count) but even with that taken into account, the increase in queries due to decreased TTLs was much less than you might expect. Also consider the significant increased load on DNS servers to handling the constant stream of dynamic DNS updates to make this possible, and that you have to find some reliable trust mechanism to handle these updates because with out that you just made man in the middle attacks a just a little bit easier. That's absolutely correct. We would see a ten-factor increase in load on the naming system and could see as much as a two order of magnitude increase in load. But not on the root -- that load increase is distributed almost exclusively to the leaves. And DNS has long since proven it can scale up many orders of magnitude more than that. By adding servers to be sure... but the DNS job parallelizes trivially and well. Route processing, like with BGP, doesn't. 
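The caching effect under discussion is easy to model back-of-envelope (all numbers invented): with resolver caches in front, authoritative load scales with the number of busy resolvers and the TTL, not with the raw client lookup rate.

```python
# Back-of-envelope model of authoritative query load vs. record TTL.
# Assumes every caching resolver stays busy with the name, so each one
# re-asks the authoritative server at most once per TTL interval.

def auth_queries_per_hour(busy_resolvers, ttl_seconds):
    return busy_resolvers * (3600 / ttl_seconds)

# Dropping the TTL from 24h to 60s multiplies the authoritative load,
# but that load lands on the one zone's servers -- the "leaves" --
# not on the root.
load_24h = auth_queries_per_hour(10_000, 86_400)
load_60s = auth_queries_per_hour(10_000, 60)
```

Even the worst case here grows the load on a single zone's servers, which is exactly the kind of load DNS scales by adding servers.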
And you're right about implementing a trust mechanism suitable for such an architecture. There's quite a bit of cryptographic work already present in DNS updates but I frankly have no idea whether it would hold up here or whether something new would be required. If it can be reduced to hostname and DNS password, and frankly I'd be shocked if it couldn't, then any problem should be readily solvable. That said, I might be misunderstanding something. I would like to see that idea elaborated. From your questions, it sounds like you're basically following the concept. I sketched out the idea a couple years ago, working through some of the permutations. And the MPTCP working group has been chasing some of the concepts for a while too, though last I checked they'd fallen into one of the major architectural pitfalls of shim6, trying to bootstrap the address list instead of relying on a mapper. The main problem is that we can't get there from here. No set of changes modest enough to not be another IPv6 transition gets the job done. We'd need to entrench smaller steps in the direction of such a protocol first. Like enhancing the sockets API with a variant of connect() which expects to take a host name and service name and return a connected protocol-agnostic socket. Today, just some under-the-hood calls to a non-blocking getaddrinfo and some parallelized connect()'s that happens to work better and be an easier choice than what most folks could write for themselves. But in the future, a socket connection call which receives all the knowledge that a multi-addressed protocol needs to get the job done without further changes to the application's code. Or, if I'm being fair about it, doing what the MPTCP folks are doing and then following up later with additional enhancements to call out to DNS from the TCP layer. Regards, Bill Herrin -- William D. Herrin her...@dirtside.com b...@herrin.us 3005 Crane Dr. .. Web: http://bill.herrin.us/ Falls Church, VA 22042-3004
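A sketch of the kind of connect-by-name helper described above, which is under the hood just getaddrinfo plus a per-address connect loop. The function name and behaviour are hypothetical, and a production version would race the connects in parallel rather than trying them in order:

```python
import socket

# Sketch of a connect() variant that takes a host name and service and
# returns a connected, protocol-agnostic socket, hiding getaddrinfo and
# the per-address retry loop from the application.

def connect_by_name(host, service, timeout=5.0):
    last_err = None
    for family, stype, proto, _, sockaddr in socket.getaddrinfo(
            host, service, type=socket.SOCK_STREAM):
        s = socket.socket(family, stype, proto)
        s.settimeout(timeout)
        try:
            # A fuller version would race these connects in parallel
            # (Happy Eyeballs, RFC 8305) instead of trying them in order.
            s.connect(sockaddr)
            return s
        except OSError as err:
            last_err = err
            s.close()
    raise last_err or OSError("no addresses for %s" % host)
```

The point is the information flow: the call receives the name, so a future multi-addressed transport could keep using that name for re-resolution without any change to application code.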
Shim6, was: Re: filtering /48 is going to be necessary
On 11 Mar 2012, at 20:15 , Joel jaeggli wrote: The IETF and IRTF have looked at the routing scalability issue for a long time. The IETF came up with shim6, which allows multihoming without BGP. Unfortunately, ARIN started to allow IPv6 PI just in time so nobody bothered to adopt shim6. That's a fairly simplistic version of why shim6 failed. A better reason (apart from the fact that building an upper-layer overlay of the whole internet on an IP protocol that's largely undeployed was hard) is that it leaves the destination unable to perform traffic engineering. I'm not saying that shim6 would have otherwise ruled the world by now, it was always an uphill battle because it requires support on both sides of a communication session/association. But ARIN's action meant it never had a chance. I really don't get why they felt the need to start allowing IPv6 PI after a decade, just when the multi6/shim6 effort started to get going but before the work was complete enough to judge whether it would be good enough. That fundamentally is the business we're in when advertising prefixes to more than one provider, ingress path selection. That's the business network operators are in. That's not the business end users who don't want to depend on a single ISP are in. Remember, shim6 was always meant as a solution that addresses the needs of a potential 1 billion basement multihomers with maybe ADSL + cable. The current 25k or so multihomers are irrelevant from the perspective of routing scalability. It's the other 999,975,000 that will kill the routing tables if multihoming becomes mainstream.
Re: Shim6, was: Re: filtering /48 is going to be necessary
On 3/11/2012 3:15 PM, Iljitsch van Beijnum wrote: But ARIN's action meant it never had a chance. I really don't get why they felt the need to start allowing IPv6 PI after a decade Because as far back as 2003 ARIN members (and members from all the other RIRs for that matter) were saying in very clear terms that PI space was a requirement for moving to v6. No one wanted to lose the provider independence that they had gained with v4. Without that, v6 was a total non-starter. ARIN was simply listening to its members. Doug -- If you're never wrong, you're not trying hard enough
Re: Shim6, was: Re: filtering /48 is going to be necessary
On Mar 11, 2012, at 3:15 PM, Iljitsch van Beijnum wrote: On 11 Mar 2012, at 20:15 , Joel jaeggli wrote: The IETF and IRTF have looked at the routing scalability issue for a long time. The IETF came up with shim6, which allows multihoming without BGP. Unfortunately, ARIN started to allow IPv6 PI just in time so nobody bothered to adopt shim6. That's a fairly simplistic version of why shim6 failed. A better reason (apart from the fact that building an upper-layer overlay of the whole internet on an IP protocol that's largely undeployed was hard) is that it leaves the destination unable to perform traffic engineering. I'm not saying that shim6 would have otherwise ruled the world by now, it was always an uphill battle because it requires support on both sides of a communication session/association. But ARIN's action meant it never had a chance. I really don't get why they felt the need to start allowing IPv6 PI after a decade, just when the multi6/shim6 effort started to get going but before the work was complete enough to judge whether it would be good enough. As the person who led the charge in that action, I can probably answer that question... First, from my perspective at the time, SHIM6 didn't stand a chance. It was massively complex, required modifying the stack on every single end system to yield useful results and made Windows domain administration look simple by comparison. As such, I just didn't see any probability of SHIM6 becoming operational reality. (I think LISP suffers from many, though not all, of the same problems, frankly.) I remember having this argument with you at the time, so, I'm surprised you don't remember the other side of the argument from the original discussions. However, there was also tremendous pressure in the community: "We're not going to adopt IPv6 when it puts us at a competitive disadvantage by locking us in to our upstream choices while we have portability with IPv4."
Like it or not, that's a reality and it's a reality that is critically important to getting IPv6 adopted on a wider scale. Fortunately, it was a reality we were able to address through policy (though not without significant opposition from purists like yourself and larger providers that like the idea of locking in customers). That fundamentally is the business we're in when advertising prefixes to more than one provider, ingress path selection. That's the business network operators are in. That's not the business end users who don't want to depend on a single ISP are in. Remember, shim6 was always meant as a solution that addresses the needs of a potential 1 billion basement multihomers with maybe ADSL + cable. The current 25k or so multihomers are irrelevant from the perspective of routing scalability. It's the other 999,975,000 that will kill the routing tables if multihoming becomes mainstream. It's not just about depending on a single ISP, it's also about being able to change your mind about which ISPs you are attached to without having to undertake a multi-month corporate-wide project in the process. Let's compare...

BGP multihoming with portable PI prefix:
1. Sign new contract.
2. Make new connection.
3. Bring up new BGP session.
4. Verify routes are working in both directions and seen globally.
5. --
6. --
7. --
8. --
9. Tear down old BGP session.
10. --
11. Terminate old contract.
12. --

PA-based prefix:
1. Sign new contract.
2. Make new connection.
3. Get routing working for new prefix over new connection.
4. Add new prefix to all routers, switches, provisioning systems, databases, etc.
5. Renumber every machine in the company.
6. Renumber all of the VPNs.
7. Deal with all the remote ACL issues.
8. Deal with any other fallout.
9. Turn off old prefix and connection.
10. Deal with the fallout from the things that weren't symptomatic in steps 4-9.
11. Terminate old contract.
12. Remove old prefix from all remaining equipment configurations.
By my count, that's twice as many steps to move a PA end-user organization, and, let's face it, steps 5, 6, and 7 (which don't exist in the PI scenario) take the longest and steps 7, 8, and 10 (again, non-existent in the PI scenario) are the most painful and potentially the most costly. No multihomed business in their right mind is going to accept PA space as a viable way to run their network. Owen