On May 27, 2010, at 11:37 PM, matthew zeier wrote:

> Running into performance issue with a couple 6503/Sup720-3BXL routers with 
> about 8 or more peers.  Each peer's sending a full BGP table.

Anemic PPC CPU blues? Performance issues? Say it ain't so!

(FWIW, the RSP720 is not much of an improvement. Maybe the Sup2T will be, but I 
suspect it will not be faster than other platforms available from Cisco today.)

> If a couple peers flap, the box typically stays at 100% long enough to either 
> drop more peers or drop OSPF.  

Yup!

> Cisco's site is vague, only mentioning 1m v4 routes.

In your case, it's not a TCAM/PFC programming issue. It's the interaction 
between the BGP RIB update and CEF FIB update processes (understatement?) that 
is likely soaking the CPU and, for you, causing trouble.

I'd suggest a simple tweak, starting with:

process-max-time 20

then, within the ospf router config:

process-min-time percent 20

If you already have these configured, then I guess it's time for an upgrade to 
an MX80 or MX240 (or an ASR, or a CRS, I guess).

In the unlikely event that your MSFC is somehow getting 'slammed' with punts or 
other let-throughs from the PFC during said flaps (i.e. actually forwarding 
some traffic, versus handling only protocol chatter), you may also want to add 
this (or something like it) to your config:

scheduler allocate 8000 4000
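
To put rough numbers on that line: as I understand the command, the two arguments are microseconds, the first being the most time spent at interrupt level per network interrupt and the second the least time then handed to process level. The sketch below only does the arithmetic. The `process_share` helper is hypothetical, and the 4000/200 default is an assumption that should be checked against the documentation for your release.

```python
def process_share(interrupt_us: int, process_us: int) -> float:
    """Worst-case fraction of CPU time left for process level
    when the box is saturated with interrupt-level work."""
    return process_us / (interrupt_us + process_us)


# Values from the suggested config line: scheduler allocate 8000 4000
suggested = process_share(8000, 4000)

# Assumed IOS default (scheduler allocate 4000 200); verify for your platform.
assumed_default = process_share(4000, 200)

print(f"suggested:       {suggested:.1%} guaranteed to process level")
print(f"assumed default: {assumed_default:.1%} guaranteed to process level")
```

If those assumptions hold, the point of the tweak is that BGP, OSPF and friends (all process-level work) keep roughly a third of the CPU even while the MSFC is being hammered with punted packets.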

FWIW, a test box running 12.2(18)SXF17a on a sup2/msfc2/pfc2 has six active 
peers sending a full table (filtered upon reception to ~200k each). With peer 
overlap/uniques, the FIB ends up holding ~240k routes (just under the pfc2 
max). The timers for all six neighbors are:

 neighbor transit-in peer-group
 neighbor transit-in timers 15 45

I did observe 'cascading bgp flapp-age' when I had bgp session timers set to 
2-second hellos (keepalives) and 10-second dead (hold) intervals.

With 15 hello/45 dead, administratively flapping any/all/subset of neighbors 
doesn't affect anything on the lab box. LDP and OSPF are tuned to short-ish 
values (1 sec hellos, 3 to 4 sec dead for each), and neither flaps when bgp 
does 'stuff.' 
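
A toy way to see why 2/10 cascades and 15/45 does not: a session only survives a CPU soak if the box gets back to servicing that peer before the hold timer runs out. The sketch below is a deliberately crude model, not a measurement. The 12-second stall is a made-up figure, and `session_survives` and `keepalives_per_hold` are hypothetical helpers.

```python
def session_survives(stall_s: float, hold_s: float) -> bool:
    """Toy model: the session stays up only if the CPU returns to
    servicing the peer before the hold timer expires."""
    return stall_s < hold_s


def keepalives_per_hold(keepalive_s: float, hold_s: float) -> float:
    """How many keepalive intervals fit inside one hold time."""
    return hold_s / keepalive_s


# Hypothetical CPU soak while the RIB/FIB churn through a flap (assumed value).
HYPOTHETICAL_STALL_S = 12.0

# (keepalive, hold) pairs taken from the text above.
timers = {"2/10": (2, 10), "15/45": (15, 45)}

results = {
    name: session_survives(HYPOTHETICAL_STALL_S, hold)
    for name, (_keepalive, hold) in timers.items()
}

for name, (keepalive, hold) in timers.items():
    verdict = "stays up" if results[name] else "drops"
    ratio = keepalives_per_hold(keepalive, hold)
    print(f"timers {name}: {ratio:.0f} keepalives per hold time, session {verdict}")
```

Note that 2/10 actually allows more missed keepalives (five versus three). What matters in this model is the absolute hold time, because every extra session that drops means another full table to withdraw and relearn, which lengthens the next stall.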

Even worse, a few BFD sessions are running alongside this mix, and they also 
seem fine. An example of the otherwise cool-runningness on this particular 
low-end platform is:

MinTxInt: 50000, MinRxInt: 100000, Multiplier: 4
Received MinRxInt: 100000, Received Multiplier: 4
Holdown (hits): 400(0), Hello (hits): 100(1981407)
Rx Count: 2032111, Rx Interval (ms) min/max/avg: 80/104/90 last: 76 ms ago
Tx Count: 1981412, Tx Interval (ms) min/max/avg: 80/112/92 last: 20 ms ago
Registered protocols: OSPF
Uptime: 2d03h
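
For anyone wondering where the 100 and 400 in that output come from, the sketch below reproduces them using the asynchronous-mode timer rules from RFC 5880. The local values and the received MinRxInt/multiplier are copied from the output above. The remote side's MinTxInt is not shown there, so the 50000 used here is an assumption (any value at or below 100000 gives the same result). The function names are hypothetical.

```python
def tx_interval_us(local_min_tx_us: int, remote_min_rx_us: int) -> int:
    """We send no faster than the slower of what we want to transmit
    and what the peer says it can receive."""
    return max(local_min_tx_us, remote_min_rx_us)


def detection_time_us(remote_multiplier: int,
                      local_min_rx_us: int,
                      remote_min_tx_us: int) -> int:
    """Time without received packets before the session is declared down."""
    return remote_multiplier * max(local_min_rx_us, remote_min_tx_us)


# From the output above (microseconds).
LOCAL_MIN_TX_US = 50_000
LOCAL_MIN_RX_US = 100_000
RECEIVED_MIN_RX_US = 100_000
RECEIVED_MULTIPLIER = 4

# Not visible in the output: assumed for illustration only.
ASSUMED_REMOTE_MIN_TX_US = 50_000

hello_ms = tx_interval_us(LOCAL_MIN_TX_US, RECEIVED_MIN_RX_US) // 1000
holddown_ms = detection_time_us(
    RECEIVED_MULTIPLIER, LOCAL_MIN_RX_US, ASSUMED_REMOTE_MIN_TX_US
) // 1000

print(f"Hello: {hello_ms} ms, Holddown: {holddown_ms} ms")
```

The observed Tx/Rx averages sitting a little under 100 ms are consistent with the transmit jitter RFC 5880 calls for, and the zero holddown hits over two days say the CPU never starved BFD for 400 ms.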

Best to lab this up, in the hopes you're able to wring out said demons in a 
somewhat more controlled environment.

-Tk
_______________________________________________
cisco-nsp mailing list  [email protected]
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/
