Robert> Could those who claim that that sending /32 or /64 or /128 in BGP
Robert> mainly within contained DC zone environment will not scale be a bit
Robert> more precise and kindly indicate what the real problem is ?

I think the issue is really straightforward, and has nothing specifically to do with BGP. My interpretation of the draft is that it is trying to meet the following requirements:

- Allow the number of VMs to increase without bound.

- Give each VM a host address that is location-independent (and hence not summarizable within routing).

- Provide optimal routing from any client (presumably anywhere on the Internet) to any VM.

I think it is difficult to meet these requirements jointly without running into some problems of scale.

Perhaps I'm just misunderstanding the requirements that the draft is trying to meet. Your mention of "contained DC zone environment" certainly suggests that I'm overlooking some context. Similarly, your remark in another thread:

Robert> Inter-DC or DC to user rather always depends on careful choice and
Robert> when possible aggregation at the gateway.

suggests that summarization at some level is still going to be important for good scaling. However, the draft does not talk about that at all, nor does it have normative references to other drafts that set the context.

I suppose it's possible, as Pedro seems to suggest, that my thinking is hopelessly trapped in the last century ;-) Perhaps new developments in hardware and/or virtualization mean that address summarizability is no longer an important factor in scaling the routing system. If that's the assumption, it would be good to state it.

Note that the draft itself calls attention to these scalability issues by explicitly positioning itself as being more scalable than an L2 solution. While L3 is certainly more scalable than L2, much of that increased scalability comes from L3's ability to summarize addresses on a topological basis.

The second scaling issue has to do with rate of change. Pedro has presented some facts indicating that the time to complete a VM move is quite long compared with routing convergence time. I don't question that.
However, that by itself does not imply anything about the rate of VM movement. It's not clear to me whether the rate of VM movement is supposed to be able to increase without bound, or whether realistically this rate is expected to always remain small compared to what routing can handle. If the latter, that would certainly be worth mentioning.

The third scaling issue is specific to the "ARP snooping" scheme, and to the way that end-user activity (the generation of ARP responses) can lead to the auto-origination of BGP routes. Perhaps MVPN has inured us to this sort of phenomenon. But RFC 6514 does talk of rate limiting the generation of MVPN BGP routes, while this draft does not mention rate limiting at all.

A related area of concern is the feedback loop where ARP responses cause BGP activity, and BGP activity causes ARP responses. (This feedback loop doesn't exist if the PEs get their information from the orchestration system, rather than from snooping ARPs.) In a previous message you said:

Robert> Just presence of learned entry in the ARP table should not trigger
Robert> the host route auto-generation.

So, where are the detailed rules relating ARP snooping to host route auto-generation?

I do like Xiaohu's suggestion to de-emphasize the ARP snooping procedure and to better document its applicability restrictions.

I didn't mean to start a food fight, or to start yet another religious discussion about the use of BGP. I just think that the draft makes claims of scalability that depend upon certain assumptions, and the assumptions are not made explicit. It doesn't bother me if the assumptions are controversial, but I would like to know what they are.
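
P.S. To make the summarization point concrete, here is a minimal sketch in Python using the standard ipaddress module. The addresses and the summarize() helper are made up for illustration and do not come from the draft; the sketch only counts how many prefixes a router outside the site would have to carry under each addressing assumption.

```python
import ipaddress

def summarize(host_routes):
    """Collapse host-route strings into the fewest prefixes that cover exactly them."""
    nets = [ipaddress.ip_network(r) for r in host_routes]
    return list(ipaddress.collapse_addresses(nets))

# Hypothetical case 1: addresses follow topology.
# 256 VMs are numbered out of the single /24 assigned to their location.
topological = [f"10.1.2.{i}/32" for i in range(256)]

# Hypothetical case 2: addresses are location-independent.
# 256 VMs, each with an address from a different /16, so no two are adjacent.
scattered = [f"10.{i}.{(i * 7) % 256}.{(i * 13) % 256}/32" for i in range(256)]

print(len(summarize(topological)))  # 1 (the whole set collapses to 10.1.2.0/24)
print(len(summarize(scattered)))    # 256 (nothing can be aggregated)
```

In the first case the routing state seen from outside stays constant as VMs are added within the /24; in the second it grows by one route per VM, which is the scaling concern raised above.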
