Hello Peter, I'd like to ask a couple of questions regarding the design and to confirm my understanding, please.
What is the recommended fan-out ratio for Tier3-to-Tier2 and Tier2-to-Tier1, please?
- Tier3 to Tier2: would it be 1:4 (a Tier3 device has 4 ECMP paths, so if one Tier2 device fails, 75% of the cluster's bandwidth remains)?
- Tier2 to Tier1: would it be 1:2 (a Tier2 device has 2 ECMP paths, so if one Tier1 device fails, 87.5% of the cluster's bandwidth remains)?
- Or is it more like 1:32 for Tier3 to Tier2 (a Tier3 device has 32 ECMP paths), please?

8.2.1. Collapsing Tier-1 Devices Layer.
- I think that as a result of collapsing the number of Tier1 devices to one half, the impact of a single Tier1 device failure will increase to 50%.
- Wouldn't it therefore be more desirable to keep the same number of Tier1 devices and only add links from a particular Tier2 device to another/neighboring pair of Tier1 devices, please?
- The reduction in port capacity would remain the same.
- However, the impact of a failure of a single link or a single Tier1 device would be unchanged.

8.2. Route Summarization within Clos Topology.
- Since you have mentioned that all the devices are preferably of the same type to accommodate REQ2, I'm thinking they would probably have the same FIB capacity, right?
- So if a Tier1 device can hold all the DC routes, then Tier2 and Tier3 devices can as well, right?
- If the FIB size differs between the devices used in the various tiers, then summarization is indeed beneficial.

If FIB size is a cause for concern, would it be possible to use a scheme where servers are grouped into server groups, and then to define which server groups need to communicate with which other server groups, with everybody, or with the internet, please? That way prefixes could be marked and filters on Tier3 and Tier2 devices set accordingly - to allow only the necessary prefixes to be accepted from a peer or installed from BGP into the RIB/FIB. The drawback is of course the increased operational complexity of maintaining the filters, as well as harder troubleshooting.
Though with a clear server-groups-to-Tier3-devices (or clusters) mapping scheme, the filters would be set once and then maintained only occasionally. Also, with a clear communities scheme, troubleshooting would be straightforward, I believe. I'm thinking of it like MPLS VRFs: if a particular PE (Tier3 device) serves only a subset of VRFs (server groups), it doesn't really need to hold all the DC routes.

8.3. ICMP Unreachable Message Masquerading.
"Another option is to make the network device perform IP address masquerading"
- Does that mean the network device will respond with its RID/loopback IP during a traceroute, please?
- If so, it would then be impossible to pinpoint the link used to forward traffic to the next hop, so if there are two IP paths between directly connected devices, we wouldn't be able to distinguish the failed one.
- But I guess this kind of setup is not going to be used.

And just some nit-picking:

7.1. Fault Detection Timing.
"This feature is sometimes called as 'fast fallover'"
- Do you mean "fast external failover", as it only applies to eBGP sessions?
- Or do you mean the "fast peering session deactivation" functionality that brings the same behavior to iBGP sessions, please?

Thank you very much
adam

_______________________________________________
rtgwg mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/rtgwg
