Re: /32s in DC and BGP

Robert Raszuk Wed, 12 Feb 2014 09:40:15 -0800

Hi Eric,

> My interpretation of the draft is that it is trying to meet
> the following requirements:
>
> - Allow the number of VMs to increase without bound.
>
> - Give each VM a host address that is location-independent (and hence not
>   summarizable within routing).
>
> - Provide optimal routing from any client (presumably anywhere on the
>   Internet) to any VM.
>
> I think it is difficult to meet these requirements jointly without running
> into some problems of scale.
>



I think your interpretation is almost correct. Almost, since no one is
claiming to use this "anywhere on the Internet".

Access to data centers or WAN VPNs from the Internet happens via secure
gateways, and this is the natural place where summarization can and will
happen.


Likewise, if you need to join the same subnet between DC and WAN L3VPN (I
personally would recommend avoiding it) you can leak some host routes, but
this is still done within VPN scope.
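
To make the gateway behavior concrete, here is a minimal sketch, assuming a
gateway that advertises one covering aggregate toward the WAN VPN plus only
those host routes that were explicitly selected for leaking. The function
name, parameters and prefixes are hypothetical and not taken from the draft.

```python
import ipaddress


def gateway_export(host_routes, aggregate, leaked=()):
    """Return the prefixes a DC gateway would advertise toward the WAN VPN.

    host_routes: /32s currently present inside the DC zone.
    aggregate:   covering prefix the gateway summarizes to (assumed).
    leaked:      host routes an operator chose to leak (assumed policy).
    """
    agg = ipaddress.ip_network(aggregate)
    present = {ipaddress.ip_network(p) for p in host_routes}
    wanted = {ipaddress.ip_network(p) for p in leaked}
    # Only leak host routes that actually exist in the DC right now.
    return [agg] + sorted(present & wanted)


if __name__ == "__main__":
    routes = ["10.1.0.5/32", "10.1.0.9/32", "10.1.3.7/32"]
    for prefix in gateway_export(routes, "10.1.0.0/16", leaked=["10.1.3.7/32"]):
        print(prefix)
```

The rest of the /32s stay inside the DC zone, so the WAN side sees one
aggregate plus a handful of exceptions.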



> Perhaps I'm just misunderstanding the requirements that the draft is trying
> to meet.  Your mention of "contained DC zone environment" certainly suggests
> that I'm overlooking some context.  Similarly, your remark in another
> thread:
>
> Robert> Inter-DC or DC to user rather always depends on careful choice and
> Robert> when possible aggregation at the gateway.
>
> suggests that summarization at some level is still going to be important
> for good scaling.  However, the draft does not talk about that at all, nor
> does it have normative references to other drafts that set the context.
>


I look at this as an optional enhancement working together with L3VPN. So
L3VPN will be summarized at the zone/DC gateway when extending it to join
WAN L3VPN instances.

At the same time you can inject some host routes when needed. But this
does not mean that in any deployment I would use same-subnet for everything
within DC and WAN. No need.



> I suppose it's possible, as Pedro seems to suggest, that my thinking is
> hopelessly trapped in the last century ;-) Perhaps new developments in
> hardware and/or virtualization mean that address summarizability is no
> longer an important factor in scaling the routing system.  If that's the
> assumption, it would be good to state it.
>
>
I think it all depends on which routing system you are really talking
about. It also depends on whether this is an overlay or not. Last but not
least, it really matters whether your routing system is running on new x86
CPUs or being squeezed to run on some underpowered RP/RE blades of
traditional last century routers ;-). Should we design all protocol
extensions to run on those black sheep?



> Note that the draft itself calls attention to these scalability issues by
> explicitly positioning itself as being more scalable than an L2 solution.
> While L3 is certainly more scalable than L2, much of that increased
> scalability comes from L3's ability to summarize addresses on a topological
> basis.
>


I guess no one will argue with that. But does this mean that we cannot have
solutions which do not use summarization in some of their parts?



> The second scaling issue has to do with rate of change.  Pedro has
> presented some facts indicating that the time to complete a VM move is
> quite long
> compared with routing convergence time.  I don't question that.  However,
> that by itself does not imply anything about the rate of VM movement.  It's
> not clear to me whether the rate of VM movement is supposed to be able to
> increase without bound, or whether realistically this rate is expected to
> always remain small compared to what routing can handle.  If the latter,
> that would certainly be worth mentioning.
>


Eric, let's keep in mind that you inject the /32 when the VM gets created.
When it's ready to take over, you may prefer it over the former /32 if it
was the same address.

So one can argue that starting a VM will always be slower than BGP Update or
BGP Withdraw messages. Wouldn't you agree?
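
A minimal sketch of that sequence, assuming the preference is expressed as
a BGP-style local preference where the higher value wins. The preference
values, PE names and helper functions are hypothetical; the draft may use
a different mechanism to prefer the new /32.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostRoute:
    prefix: str      # e.g. "10.1.3.7/32"
    next_hop: str    # PE advertising the VM (hypothetical names)
    local_pref: int  # higher wins, as in BGP best-path selection


def best_next_hop(rib, prefix):
    """Return the next hop of the most preferred route for prefix, or None."""
    candidates = [r for r in rib if r.prefix == prefix]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.local_pref).next_hop


def move_vm(prefix, old_pe, new_pe):
    """Walk through the move and record which PE wins after each step."""
    winners = []
    rib = [HostRoute(prefix, old_pe, 100)]
    # 1. New VM instance is created: inject its /32 with a low preference.
    rib.append(HostRoute(prefix, new_pe, 50))
    winners.append(best_next_hop(rib, prefix))
    # 2. New instance is ready: re-advertise with a higher preference.
    rib = [r for r in rib if r.next_hop != new_pe]
    rib.append(HostRoute(prefix, new_pe, 200))
    winners.append(best_next_hop(rib, prefix))
    # 3. Old instance is torn down: its route is withdrawn.
    rib = [r for r in rib if r.next_hop != old_pe]
    winners.append(best_next_hop(rib, prefix))
    return winners


if __name__ == "__main__":
    print(move_vm("10.1.3.7/32", "pe-old", "pe-new"))
```

Each step is a single Update or Withdraw, while the VM start between step
1 and step 2 is the slow part.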




>
> The third scaling issue is specific to the "ARP snooping" scheme, and to the
> way that enduser activity (the generation of ARP responses) can lead to the
> auto-origination of BGP routes.  Perhaps MVPN has inured us to this sort of
> phenomenon.  But RFC6514 does talk of rate limiting the generation of MVPN
> BGP routes, while this draft does not mention rate limiting at all.  A
> related area of concern is the feedback loop where ARP responses cause BGP
> activity, and BGP activity causes ARP responses. (This feedback loop
> doesn't exist if the PEs get their information from the orchestration
> system, rather than from snooping ARPs.)
>
> In a previous message you said:
>
> Robert> Just presence of learned entry in the ARP table should not trigger
> Robert> the host route auto-generation.
>
> So, where are the detailed rules relating ARP snooping to host route
> auto-generation?
>


I would perhaps go as far as a GARP trigger. But honestly this solution
works well in concert with an orchestration system.
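
As a rough illustration only, a GARP trigger combined with the kind of rate
limiting you mention could look like the sketch below. It assumes a
gratuitous ARP is recognized by sender and target protocol address being
equal, and it uses a simple per-address hold-down timer. The class name and
the 5 second interval are hypothetical and not taken from the draft.

```python
import time


class GarpTrigger:
    """Originate a /32 only on gratuitous ARP, at most once per interval."""

    def __init__(self, min_interval=5.0, clock=time.monotonic):
        self.min_interval = min_interval  # hold-down in seconds (assumed)
        self.clock = clock                # injectable for testing
        self.last_origination = {}        # sender IP -> last trigger time

    @staticmethod
    def is_gratuitous(sender_ip, target_ip):
        # Gratuitous ARP: the sender announces its own address.
        return sender_ip == target_ip

    def on_arp(self, sender_ip, target_ip):
        """Return the host route to originate, or None if nothing to do."""
        if not self.is_gratuitous(sender_ip, target_ip):
            return None  # ordinary ARP traffic never triggers a route
        now = self.clock()
        last = self.last_origination.get(sender_ip)
        if last is not None and now - last < self.min_interval:
            return None  # still in hold-down for this address
        self.last_origination[sender_ip] = now
        return f"{sender_ip}/32"


if __name__ == "__main__":
    trigger = GarpTrigger()
    print(trigger.on_arp("10.1.3.7", "10.1.3.1"))  # ordinary ARP
    print(trigger.on_arp("10.1.3.7", "10.1.3.7"))  # gratuitous ARP
```

Ordinary ARP requests and responses are ignored here, so a merely learned
ARP entry does not originate anything.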



> I do like Xiaohu's suggestion to de-emphasize the ARP snooping procedure
> and to better document its applicability restrictions.
>
>
+1



> I didn't mean to start a food fight, or to start yet another religious
> discussion about the use of BGP.  I just think that the draft makes claims
> of scalability that depend upon certain assumptions, and the assumptions
> are not made explicit.  It doesn't bother me if the assumptions are
> controversial, but I would like to know what they are.
>

I think your points make perfect sense now. In fact the entire idea of
making this a WG document is not to have it blessed by the IETF, but to get
more feedback within the working group, where people can contribute to
making the document and its applicability clearly articulated.

Best,
R.
