On 07/12/2012 12:36, Simon Lockhart wrote:
> its knees. The "BGP Router" process takes all the available CPU while it tries
> to re-establish the BGP sessions. While this is happening, the SUP720 seems to
> give up processing other stuff in a timely manner - and I see MPLS LDP drop,
> OSPF neighbours drop, and then BGP sessions drop due to hold timer expires.
> With all these drops, it causes even more CPU load, and the cycle continues.
>
> I've been talking to other SUP720 using ISPs, and it seems that some see this
> same effect, and others don't.
There are two problems here: IOS and a slow CPU; one exacerbates the effect of the other. IOS is a non-preemptive multitasking system, and if a single process decides to suck up all available CPU - particularly a high-priority process like the bgp router - then other processes will suffer. This is why the ldp and ospf sessions drop. The BGP sessions also drop, but for a different reason: scheduling internal to the "BGP Router" process causes the code which generates bgp keepalives not to run as often as necessary. When keepalives are not sent, bgp sessions are torn down, which causes churn, which causes more CPU load, which causes keepalives not to be sent, and so on. This is a classic performance-knee problem brought on by insufficient CPU resources and a poor-quality scheduler, and there's no way of fixing it.

Some versions of the SX train appear to cope slightly better than others, but as I haven't run a sup720 at a large IXP, I'm not going to give advice about which ios versions to try. You could play around with the "scheduler interval" command, but I doubt it would make any difference.

fwiw, ixp route server operators running quagga bgpd ran into almost exactly the same performance knee. The fix for this was to run bgp keepalives in a separate thread in the quagga bgp daemon, but you can only do that on an operating system with pre-emptive threading support. Maybe one day, if Cisco split the bgp router out of the iosd process on XE, that might help things as a long-term approach to dealing with this problem, but I don't think we'll ever see XE on the sup720.

We hit 300k prefixes in the dfz in July 2009 and 400k in Feb 2012, which works out at 33% growth in 2.5 years. However badly a sup720 is handling large IXP operation now, it's not going to get any better.
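For reference, the scheduler knob mentioned above looks like this on classic IOS. The values are purely illustrative - as I said, I doubt it helps much on a sup720, so treat this as a sketch of the syntax rather than a recommendation:

```
! Illustrative only, not a tuning recommendation.
! Guarantee that low-priority processes get a chance to run at least
! every 500 ms, even when a high-priority process is CPU-bound:
scheduler interval 500
```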
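The quagga fix described above - keepalives in their own pre-emptively scheduled thread, so that a CPU-bound main thread can't starve them - can be sketched roughly like this. This is an illustration of the design, not quagga's actual code; the `BgpSpeaker` class and its counters are invented for the example:

```python
import threading
import time

class BgpSpeaker:
    """Toy illustration: keepalive generation in a dedicated thread."""

    def __init__(self, keepalive_interval=1.0):
        self.keepalive_interval = keepalive_interval
        self.sent = 0
        self._stop = threading.Event()
        # Dedicated keepalive thread: the OS pre-empts a busy main
        # thread to run this one on schedule.  This is exactly what a
        # non-preemptive scheduler like monolithic IOS cannot do.
        self._ka_thread = threading.Thread(target=self._keepalive_loop,
                                           daemon=True)

    def _keepalive_loop(self):
        # Event.wait() returns False on timeout, True once stop is set.
        while not self._stop.wait(self.keepalive_interval):
            self.send_keepalive()

    def send_keepalive(self):
        self.sent += 1  # stand-in for writing a KEEPALIVE to the peer's socket

    def start(self):
        self._ka_thread.start()

    def stop(self):
        self._stop.set()
        self._ka_thread.join()

speaker = BgpSpeaker(keepalive_interval=0.05)
speaker.start()
# Simulate the main thread being saturated with route churn for ~0.5s.
# Keepalives continue regardless, because they live in their own thread.
deadline = time.time() + 0.5
while time.time() < deadline:
    pass
speaker.stop()
print(speaker.sent)
```

In a single-threaded, run-to-completion design, the busy loop would have blocked keepalive generation entirely and the session would have hit its hold timer - which is the failure mode on the sup720.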
Unfortunately, the sup720 is no longer suitable for DFZ operation, for this among a variety of other reasons, including poor ipv6 support, difficulties with control-plane policing, a bad netflow implementation and several other things.

> And, as a follow-on question, given that the SUP720 is so under-powered for
> BGP, what other options do I have which would cope better? SUP-2T? Or, if
> I need to move away from the 6500, what's good for BGP routing with about
> 20-40G of throughput (i.e. 4-8 * 10GE ports)? How does the ASR9k or ASR1k
> range fair for BGP performance?

The ASR1k doesn't look to me like a good choice for raw packet forwarding at 10G+ due to high cost and limited performance (although it can do quite smart stuff at lower speeds if you need that instead). If you can live with fewer than 12 x 10GE ports over the expected operational lifetime of the unit, the ASR9001 is ravishing. The SUP2T also looks good and may be a cheaper option than either while providing greater overall port density. Be aware that if you're upgrading from an older supervisor, you're either going to be stuck with the limitations of the 6704-10ge line cards or, if you're using 6708-10ge cards, you will need to replace the lot of them with 6908-10G line cards - the 6708-10ge cards will not work at all with the sup2t. Inexplicably, Cisco have continued their lovefest with X2 optics on the newer sup2t line cards rather than conforming to the industry standard of sfp+ or xfp.

If your budget stretches to it / you're doing green-field stuff, the chassis-based asr9000 is the platform of choice for larger installations these days, but it can work out quite expensive per port.

Nick

_______________________________________________
cisco-nsp mailing list
[email protected]
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/
