Re: [c-nsp] Huge SP CPU usage spikes 100%
Hi,

all the protocols mentioned are stable, and we have fewer than 10 VLANs on the gear.

Here is the output, taken 3 times at 30-second intervals.

Thanks

xxx#sh mls statistics

Statistics for Earl in Module 1

L2 Forwarding Engine
  Total packets Switched: 1126244033503

L3 Forwarding Engine
  Total packets Processed : 421505439125 @ 14674 pps
  Total packets L3 Switched : 204107242703 @ 8069 pps
  Total Packets Bridged : 3318438612
  Total Packets FIB Switched: 204107242702
  Total Packets ACL Routed : 0
  Total Packets Netflow Switched: 1
  Total Mcast Packets Switched/Routed : 124896376575
  Total ip packets with TOS changed : 2
  Total ip packets with COS changed : 2
  Total non ip packets COS changed : 0
  Total packets dropped by ACL : 3022
  Total packets dropped by Policing : 0
  Total packets exceeding CIR : 0
  Total packets exceeding PIR : 0

Errors
  MAC/IP length inconsistencies : 1
  Short IP packets received : 0
  IP header checksum errors : 0
  TTL failures : 103
  MTU failures : 0

Statistics for Earl in Module 2

L2 Forwarding Engine
  Total packets Switched: 288056446772

L3 Forwarding Engine
  Total packets Processed : 140160213188 @ 8038 pps
  Total packets L3 Switched : 34932291771 @ 1199 pps
  Total Packets Bridged : 656015670
  Total Packets FIB Switched: 34932291770
  Total Packets ACL Routed : 0
  Total Packets Netflow Switched: 1
  Total Mcast Packets Switched/Routed : 484819383
  Total ip packets with TOS changed : 2
  Total ip packets with COS changed : 2
  Total non ip packets COS changed : 0
  Total packets dropped by ACL : 0
  Total packets dropped by Policing : 0
  Total packets exceeding CIR : 0
  Total packets exceeding PIR : 0

Errors
  MAC/IP length inconsistencies : 0
  Short IP packets received : 0
  IP header checksum errors : 0
  TTL failures : 31737163
  MTU failures : 0

Statistics for Earl in Module 3

L2 Forwarding Engine
  Total packets Switched: 16262292535

L3 Forwarding Engine
  Total packets Processed : 6051789421 @ 189 pps
  Total packets L3 Switched : 3336100064 @ 105 pps
  Total Packets Bridged : 18843968
  Total Packets FIB Switched: 3336100063
  Total Packets ACL Routed : 0
  Total Packets Netflow Switched: 1
  Total Mcast Packets Switched/Routed : 2696739936
  Total ip packets with TOS changed : 2
  Total ip packets with COS changed : 2
  Total non ip packets COS changed : 0
  Total packets dropped by ACL : 0
  Total packets dropped by Policing : 0
  Total packets exceeding CIR : 0
  Total packets exceeding PIR : 0

Errors
  MAC/IP length inconsistencies : 0
  Short IP packets received : 0
  IP header checksum errors : 0
  TTL failures : 0
  MTU failures : 0

Statistics for Earl in Module 5

L2 Forwarding Engine
  Total packets Switched: 2610065106448

L3 Forwarding Engine
  Total packets Processed : 1142795976405 @ 32745 pps
  Total packets L3 Switched : 508643600024 @ 19294 pps
  Total Packets Bridged : 17347781752
  Total Packets FIB Switched: 508643600023
  Total Packets ACL Routed : 0
  Total Packets Netflow Switched: 1
  Total Mcast Packets Switched/Routed : 181041209538
  Total ip packets with TOS changed : 2
  Total ip packets with COS changed : 2
  Total non ip packets COS changed : 0
  Total packets dropped by ACL : 2276418
  Total packets dropped by Policing : 0
  Total packets exceeding CIR : 0
  Total packets exceeding PIR : 0

Errors
  MAC/IP length inconsistencies : 1881
  Short IP packets received : 0
  IP header checksum errors : 0
  TTL failures : 8688774
  MTU failures : 0

Total packets L3 Processed by all Modules: 1710513418139 @ 55646 pps

xxx#sh mls statistics

Statistics for Earl in Module 1

L2 Forwarding Engine
  Total packets Switched: 1126244939688

L3 Forwarding Engine
  Total packets Processed : 421505780728 @ 14792 pps
  Total packets L3 Switched : 204107434350 @ 8375 pps
  Total Packets Bridged : 3318439356
  Total Packets FIB Switched: 204107434349
  Total Packets ACL Routed : 0
  Total Packets Netflow Switched: 1
  Total Mcast Packets Switched/Routed : 124896452309
  Total ip packe
Re: [c-nsp] Huge SP CPU usage spikes 100%
On 1 March 2018 at 09:53, james list wrote:
> xxx#show ibc
> Interface information:
> 5 minute rx rate 944000 bits/sec, 793 packets/sec
> 5 minute tx rate 25000 bits/sec, 37 packets/sec
...
> 2467023087 Packets out of 554699386 CEF Switched, 0 Packets out of 0
> Tag CEF Switched
> 3916625157 Packets Fast Switched
...
> Potential/Actual paks copied to process level 228808364/225833216
> (2975148 dropped, 265 spd drops)
...
> MISTRAL ERROR COUNTERS
...
> 2974883 total packets dropped on throttled interfaces (2954630 low,
> 16704 medium, 3549 high)

>> On 1 March 2018 at 08:29, james list wrote:
>> > Dear experts,
>> > has anybody experienced a 100% SP CPU usage on C6500-Sup720
>> > (12.2(33)SXI5)
>> > with a lot of interrupts ?
>> > The main process is Heartbeat.
>> >
>> > Cisco TAC is struggling in having an idea to sorting out the issue, they
>> > are working since 3 days on it..
>> >
>> > STP is stable, no mac moving, no real issue found… maybe somebody
>> > experienced the same due to something in particular?

I've compared this to a similar box I have; it has less control-plane traffic than yours, it would seem. You have a decent amount of dropped packets, which I guess is to be expected if you have sustained 100% SP CPU utilisation.

Do you have a lot of spanning-tree instances, HSRP/VRRP, multicast (various other control-plane stuff) running on this box? Have you confirmed there are no layer 2 loops?

If you run "show mls statistics" several times, do you see any error stats climbing very quickly?

I would also run a netdr capture and check if the traffic in there is normal for your network.

Cheers,
James.
___
cisco-nsp mailing list cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/
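Editor's sketch: one way to answer the "are any stats climbing quickly?" question is to diff two "show mls statistics" snapshots rather than eyeball them. The Python below is only an illustration, not a tool anyone in the thread used. It assumes the output has been pasted with one counter per line, and the function names parse_mls_statistics and counter_deltas are hypothetical. The sample values are the Module 1 counters from the first two runs posted in this thread.

```python
import re

# Matches the per-module header line of "show mls statistics".
MODULE_RE = re.compile(r"Statistics for Earl in Module (\d+)")
# Matches "<counter name> : <integer>", ignoring any trailing "@ N pps".
COUNTER_RE = re.compile(r"^\s*([^:]+?)\s*:\s*(\d+)")


def parse_mls_statistics(text):
    """Return {(module, counter_name): value} for one snapshot."""
    counters = {}
    module = None
    for line in text.splitlines():
        header = MODULE_RE.search(line)
        if header:
            module = int(header.group(1))
            continue
        counter = COUNTER_RE.match(line)
        if counter and module is not None:
            counters[(module, counter.group(1))] = int(counter.group(2))
    return counters


def counter_deltas(before, after):
    """Return only the counters present in both snapshots that changed."""
    return {
        key: after[key] - before[key]
        for key in before
        if key in after and after[key] != before[key]
    }


# Excerpts of two runs taken roughly 30 seconds apart (Module 1 only).
first_run = """Statistics for Earl in Module 1
L2 Forwarding Engine
  Total packets Switched: 1126244033503
L3 Forwarding Engine
  Total packets Processed : 421505439125 @ 14674 pps
  Total packets L3 Switched : 204107242703 @ 8069 pps
  Total Packets ACL Routed : 0
"""

second_run = """Statistics for Earl in Module 1
L2 Forwarding Engine
  Total packets Switched: 1126244939688
L3 Forwarding Engine
  Total packets Processed : 421505780728 @ 14792 pps
  Total packets L3 Switched : 204107434350 @ 8375 pps
  Total Packets ACL Routed : 0
"""

deltas = counter_deltas(
    parse_mls_statistics(first_run), parse_mls_statistics(second_run)
)
for (module, name), delta in sorted(deltas.items()):
    print(f"module {module}: {name} +{delta}")
```

Counters that did not move are left out, so a fast-growing error counter (TTL failures, length inconsistencies and so on) would stand out once the full output of both runs is fed in.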
Re: [c-nsp] Huge SP CPU usage spikes 100%
Around 30% SP CPU is expected, due to the diagnostic tests that run on the LCs.

Excessive SP CPU usage is typically related to Layer 2 operations like STP, IGMP and LACP. But another root cause of SP CPU running wild can be HSRP on multiple SVIs. If this applies: have you set priority? Do you use preempt? Have you applied "no ip proxy-arp" on the SVIs? "no ip redirects"?

You may also try disabling the diagnostic monitoring of the LCs and verifying the outcome.

switch(config)#no diagnostic monitor module all
switch(config)#end
switch#remote command switch show proc cpu hi

Cheers
Sven
Re: [c-nsp] Huge SP CPU usage spikes 100%
You need service internal.

On 1 March 2018 at 11:50, james list wrote:
> Hi
> the mentioned command are not present:
>
>
> xxx#show platform ?
>   acl            Display CWAN ACL commands
>   bridge         Distributed/Hardware-based bridging information
>   buffers        Show buffer allocation
>   cfm            Show CFM Commands
>   eeprom         Show CPU eeprom
>   etherchannel   Platform EtherChannel information
>   fault          Show fault data
>   hardware       Show platform hardware information
>   internal-vlan  Show internal vlan
>   netint         Show platform net interrupt information
>   redundancy     Display bias and CWAN platform redundancy
>   software       Show platform software information
>   stats          Display CWAN statistics
>   supervisor     Show supervisor info
>   tech-support   Show system information for Tech-Support
>   tlb            Show processor TLB registers
>   vfi            Display CWAN VFI commands
>   vlans          Display hidden VLAN to WAN interface mapping
>
> 2018-03-01 9:41 GMT+01:00 Saku Ytti :
>>
>> Hey,
>>
>> Anything in punts?
>>
>> show plat cap buffer asic pinnacle slot 5 port 4 direction out priority lo
>> show plat cap buffer collect for 5
>> show plat cap buffer data filt
>> show plat cap buffer data sample
>>
>>
>> Replace 'slot 5' with your port SUP port number.
>>
>>
>> On 1 March 2018 at 10:29, james list wrote:
>> > Dear experts,
>> > has anybody experienced a 100% SP CPU usage on C6500-Sup720
>> > (12.2(33)SXI5)
>> > with a lot of interrupts ?
>> > The main process is Heartbeat.
>> >
>> > Cisco TAC is struggling in having an idea to sorting out the issue, they
>> > are working since 3 days on it..
>> >
>> > STP is stable, no mac moving, no real issue found… maybe somebody
>> > experienced the same due to something in particular?
>> >
>> > Thanks for any hints.
>> >
>> > Cheers,
>> > James
>> >
>> >
>> > xxx#remote command switch show process cpu sorted
>> >
>> > CPU utilization for five seconds: 91%/83%; one minute: 96%; five minutes: 97%
>> > PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
>> > 10212118224 387324287 31 100.00% 38.16% 32.07% 0 Heartbeat Proces
>> > 258 4104910748 127607878 32168 2.23% 2.01% 2.03% 0 Vlan Statistics
>> > 117 7497040242279235756 0 1.19% 0.60% 0.56% 0 DiagCard2/-1
>> > 114 9372052522290556905 0 1.11% 1.05% 1.06% 0 slcp process
>> > 500 384748832 761210720505 0.47% 0.49% 0.47% 0 DiagCard3/-1
>> > 3 8458075601628030520519 0.47% 0.45% 0.46% 0 DiagCard1/-1
>> > 124 540996344 628393475860 0.39% 0.40% 0.39% 0 DiagCard4/-1
>> > 75 6645542122968378193 0 0.31% 0.20% 0.19% 0 SCP Download Lis
>> >
>> > xxx#remote command switch show proc cpu his
>> >
>> > [histogram: CPU% per second (last 60 seconds)]
>> >
>> > [histogram: CPU% per minute (last 60 minutes)]
>> > * = maximum CPU% # = average CPU%
>> >
Re: [c-nsp] Huge SP CPU usage spikes 100%
Here is the output. I can share the netdr capture privately if you agree.

Thanks

xxx#show platform netint
Network IO Interrupt Throttling:
 throttle count=122323 (0 in multicast fs), timer count=122323
 active=0, configured=1
 netint usec=4000, netint mask usec=800
 resource netint mask usec=400
 inband_throttle_mask_hi = 0x0
 inband_throttle_mask_lo = 0x80
 SPD-drop triggered throttles=265
 SPD triggered unthrottles=192
 Max SPD throttle time=4000 usecs
 Buffer depletion triggered throttles=0
 ARP queue triggered throttles=0
 Low inband activity:throttles=0, active=0, min=5 pkts/6000 bytes, 1000 usec
 Low activity threshold 1500 pps (192)
 Current rx packet rate 742 pps (95)
 Idle hook unthrottles=89404

xxx#remote command switch show platform hardware earl status

Superman 0 interrupt counts : Total 6912472122
 se_one: 0
 se_hwm: 10653
 se_dn: 280184388
 ntfy_one: 0
 ntfy_hwm: 0
 ntfy_ovfl: 0
 ft_b0_corr_ecc: 0
 ft_b0_uncorr_ecc: 0
 ft_b1_corr_ecc: 0
 ft_b1_uncorr_ecc: 0
 ft_b0_multi_pg_hit: 0
 ft_b1_multi_pg_hit: 0
 l2_flush_done: 0
 loop_ntfy: 0
 ntfy_fifo_full: 0
 l2_line_full: 0
 mc_cap: 6707068077
 b0_invld_pg_acc: 0
 b1_invld_pg_acc: 0
 pkt_crc32_err: 0
 cpu_par_err: 0
 cpu_timeout: 0
 rbus_timeout: 0
 dbus_timeout: 0
 ip_chksum_err: 0
 l3_fcs_err: 0
 dbus_seq_err: 0
 dbus_hdr_err: 0
 l2l3_seq_mismatch: 0
 l3_rcv_ovfl_err: 0
 l2_merge_err: 0

Tycho Interrupts: Total - 320837111
CPU interruptblock : Total interrupts - 0
Netflow interruptblock : Total interrupts - 0
 IFIFO_OVF_INT : 0
 SINGLE_BIT_ECC_INT: 0
 MULTI_BIT_ECC_INT : 0
 ECC_DATA_CAPT_INT : 0
 TCAM_THRLD_EXCD_INT : 0
 ICAM_THRLD_EXCD_INT : 0
 TABLE_FULL_INT: 0
 ENTRY_ALIAS_INT : 0
FIB interruptblock : Total interrupts - 0
 FC_FOVR_INT : 0
 FC_FUDR_INT : 0
 BP_FOVR_INT : 0
 BP_FUDR_INT : 0
 RS_FOVR_INT : 0
 RS_FUDR_INT : 0
 AJ_FOVR_INT : 0
 AJ_FUDR_INT : 0
 FB_COR_ECC_INT: 0
 FB_UNCOR_ECC_INT : 0
Search interrupt block : Total interrupts - 320833986
Adj. Statistics tabl block : Total interrupts - 0
Adj. table interface block : Total interrupts - 0
 AT_SEQ_ERR_INT: 0
 AT_FOVR_INT : 0
 AT_FUDR_INT : 0
 AT_IB_ADJ_INT : 0
 AT_BZONE_INT : 0
 AT_CORR_ECC_ERR_INT : 0
 AT_UNCORR_ECC_ERR_INT : 0
 AT_ECC_ERR_DATA_CAPT : 0
Packet Parser block block : Total interrupts - 1881
 IP_LEN_INT: 1881
 IP_SHRT_INT : 0
 IP_CS_INT : 0
 QDBUS_CRC_INT : 0
 QDBUS_LEN_INT : 0
 IB_LEN_INT: 0
 IB_SEC_INT: 0
Decision Engine bloc block : Total interrupts - 0
 NS_ECC_SBE_INT: 0
 NS_ECC_MBE_INT: 0
 NS_ECC_DCE_INT: 0
 ACL_SEQ_ERR_INT : 0
 STAT_FOVF_INT : 0
 RSLT_FOVF_INT : 0
 GEM_MEM_LEAK_INT : 0
Rewrite block interr block : Total interrupts - 0
 RW_VO_INT : 0
 RW_VU_INT : 0
 RW_RO_INT : 0
 RW_RU_INT : 0
Statistics block int block : Total interrupts - 1245
 GLOBAL_OVFL : 0
 BGP_OVFL : 1245
 VLAN_OVFL_0 : 0
 VLAN_OVFL_1 : 0
 VLAN_OVFL_2 : 0
 VLAN_OVFL_3 : 0
 VLAN_OVFL_4 : 0
 VLAN_OVFL_5 : 0
 INIT_DONE : 0
Level 4 Map interfac block : Total interrupts - 0
 L4MAP_OVFL: 0
Key Queue Block inte block : Total interrupts - 0
 KQ_OVFL : 0

Interrupt statistics for Kuma 0
 Kuma cpu parity error: 0
 Kuma K interface: data intf crc error: 0
 Kuma K'interface: data intf crc error: 0
 Kuma K interface: result intf hdr fcs err: 0
 Kuma K interface: result intf data fcs err : 0
 Kuma K'interface: result intf hdr fcs err: 0
 Kuma K'interface: result intf data fcs err : 0
 Kuma K interface: intf freeze err: 0
 Kuma K'interface: intf freeze err: 0
 Kuma stats overflow_0 err: 0
 Kuma stats overflow_1 err: 0
 Kuma stats saturate_0 err: 0
 Kuma stats saturate_1 err: 0
 Kuma K'interface err : 0
 Kuma K interface err : 0
 Kuma E interface: frame crc error on received dbus frame : 0
 Kuma E interface: header crc e
Re: [c-nsp] Huge SP CPU usage spikes 100%
Hi,

the mentioned commands are not present:

xxx#show platform ?
  acl            Display CWAN ACL commands
  bridge         Distributed/Hardware-based bridging information
  buffers        Show buffer allocation
  cfm            Show CFM Commands
  eeprom         Show CPU eeprom
  etherchannel   Platform EtherChannel information
  fault          Show fault data
  hardware       Show platform hardware information
  internal-vlan  Show internal vlan
  netint         Show platform net interrupt information
  redundancy     Display bias and CWAN platform redundancy
  software       Show platform software information
  stats          Display CWAN statistics
  supervisor     Show supervisor info
  tech-support   Show system information for Tech-Support
  tlb            Show processor TLB registers
  vfi            Display CWAN VFI commands
  vlans          Display hidden VLAN to WAN interface mapping

2018-03-01 9:41 GMT+01:00 Saku Ytti :
> Hey,
>
> Anything in punts?
>
> show plat cap buffer asic pinnacle slot 5 port 4 direction out priority lo
> show plat cap buffer collect for 5
> show plat cap buffer data filt
> show plat cap buffer data sample
>
>
> Replace 'slot 5' with your port SUP port number.
>
>
> On 1 March 2018 at 10:29, james list wrote:
> > Dear experts,
> > has anybody experienced a 100% SP CPU usage on C6500-Sup720 (12.2(33)SXI5)
> > with a lot of interrupts ?
> > The main process is Heartbeat.
> >
> > Cisco TAC is struggling in having an idea to sorting out the issue, they
> > are working since 3 days on it..
> >
> > STP is stable, no mac moving, no real issue found… maybe somebody
> > experienced the same due to something in particular?
> >
> > Thanks for any hints.
> >
> > Cheers,
> > James
> >
> >
> > xxx#remote command switch show process cpu sorted
> >
> > CPU utilization for five seconds: 91%/83%; one minute: 96%; five minutes: 97%
> > PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
> > 10212118224 387324287 31 100.00% 38.16% 32.07% 0 Heartbeat Proces
> > 258 4104910748 127607878 32168 2.23% 2.01% 2.03% 0 Vlan Statistics
> > 117 7497040242279235756 0 1.19% 0.60% 0.56% 0 DiagCard2/-1
> > 114 9372052522290556905 0 1.11% 1.05% 1.06% 0 slcp process
> > 500 384748832 761210720505 0.47% 0.49% 0.47% 0 DiagCard3/-1
> > 3 8458075601628030520519 0.47% 0.45% 0.46% 0 DiagCard1/-1
> > 124 540996344 628393475860 0.39% 0.40% 0.39% 0 DiagCard4/-1
> > 75 6645542122968378193 0 0.31% 0.20% 0.19% 0 SCP Download Lis
> >
> > xxx#remote command switch show proc cpu his
> >
> > [histogram: CPU% per second (last 60 seconds)]
> >
> > [histogram: CPU% per minute (last 60 minutes)]
> > * = maximum CPU% # = average CPU%
> >
> > [histogram truncated]
Re: [c-nsp] Huge SP CPU usage spikes 100%
On 1 March 2018 at 08:29, james list wrote:
> Dear experts,
> has anybody experienced a 100% SP CPU usage on C6500-Sup720 (12.2(33)SXI5)
> with a lot of interrupts ?
> The main process is Heartbeat.
>
> Cisco TAC is struggling in having an idea to sorting out the issue, they
> are working since 3 days on it..
>
> STP is stable, no mac moving, no real issue found… maybe somebody
> experienced the same due to something in particular?
>
> Thanks for any hints.
>
> Cheers,
> James
>
>
> xxx#remote command switch show process cpu sorted
>
> CPU utilization for five seconds: 91%/83%; one minute: 96%; five minutes: 97%

When you say a lot of interrupts, what do you get from:

show platform netint
remote command switch show platform hardware earl status
show ibc
show eobc

I don't know what that Heartbeat process is for, e.g. between SP and RP, or SP and DFCs, or SP and line cards etc.

In terms of fixing the issue, perhaps reboot the RSP or line card? That obviously doesn't give you a root cause though :)

It seems like the process is stuck in a loop, if you are saying that forwarding is working without issue. You could run a NetDR capture to see if that is control-plane traffic and maybe where it's coming from or going to:

https://null.53bits.co.uk/index.php?page=netdr-captures

Cheers,
James.
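Editor's sketch: in the "CPU utilization for five seconds: 91%/83%" line, the first figure is total CPU and the second is the share spent at interrupt level, so the difference is what is left for scheduled processes. The Python below pulls the two apart for the value quoted above. It is an illustration only, and the function name split_cpu_utilization is hypothetical.

```python
import re

# Matches the "total%/interrupt%" pair in the five-second field.
CPU_LINE_RE = re.compile(r"CPU utilization for five seconds:\s*(\d+)%/(\d+)%")


def split_cpu_utilization(line):
    """Split the five-second figure into total, interrupt and process level."""
    match = CPU_LINE_RE.search(line)
    if match is None:
        raise ValueError("not a 'show process cpu' utilization line")
    total = int(match.group(1))
    interrupt = int(match.group(2))
    return {"total": total, "interrupt": interrupt, "process": total - interrupt}


line = "CPU utilization for five seconds: 91%/83%; one minute: 96%; five minutes: 97%"
usage = split_cpu_utilization(line)
print(usage)
```

For the line quoted in this thread the interrupt share accounts for most of the total, which fits the "a lot of interrupts" description and is why the punted traffic (netint, ibc, NetDR) is worth looking at.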
Re: [c-nsp] Huge SP CPU usage spikes 100%
Hey,

Anything in punts?

show plat cap buffer asic pinnacle slot 5 port 4 direction out priority lo
show plat cap buffer collect for 5
show plat cap buffer data filt
show plat cap buffer data sample

Replace 'slot 5' with your port SUP port number.

On 1 March 2018 at 10:29, james list wrote:
> Dear experts,
> has anybody experienced a 100% SP CPU usage on C6500-Sup720 (12.2(33)SXI5)
> with a lot of interrupts ?
> The main process is Heartbeat.
>
> Cisco TAC is struggling in having an idea to sorting out the issue, they
> are working since 3 days on it..
>
> STP is stable, no mac moving, no real issue found… maybe somebody
> experienced the same due to something in particular?
>
> Thanks for any hints.
>
> Cheers,
> James
>
>
> xxx#remote command switch show process cpu sorted
>
> CPU utilization for five seconds: 91%/83%; one minute: 96%; five minutes: 97%
> PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
> 10212118224 387324287 31 100.00% 38.16% 32.07% 0 Heartbeat Proces
> 258 4104910748 127607878 32168 2.23% 2.01% 2.03% 0 Vlan Statistics
> 117 7497040242279235756 0 1.19% 0.60% 0.56% 0 DiagCard2/-1
> 114 9372052522290556905 0 1.11% 1.05% 1.06% 0 slcp process
> 500 384748832 761210720505 0.47% 0.49% 0.47% 0 DiagCard3/-1
> 3 8458075601628030520519 0.47% 0.45% 0.46% 0 DiagCard1/-1
> 124 540996344 628393475860 0.39% 0.40% 0.39% 0 DiagCard4/-1
> 75 6645542122968378193 0 0.31% 0.20% 0.19% 0 SCP Download Lis
>
> xxx#remote command switch show proc cpu his
>
> [histogram: CPU% per second (last 60 seconds)]
>
> [histogram: CPU% per minute (last 60 minutes)]
> * = maximum CPU% # = average CPU%
>
> [histogram: CPU% per hour (last 72 hours)]
> * = maximum CPU% # = average CPU%