[c-nsp] ISR4431 memory usage

2016-05-31 Thread CiscoNSP List
Hi Everyone,


Purchased a couple of ISR4431s for a small POP that has a single IP transit 
service (currently handled by an old 2851, taking full table plus default). 
Obviously a full table isn't necessary, but we have a customer at this POP 
who wants the full table advertised to them, so we need to take it from the 
upstream.


The 2851 handles the full table with no problems - it only has 1GB of DRAM, 
and is using ~57% of it.


The 4431s we purchased to replace the 2851 have the default 4GB of RAM, and I 
was a little shocked when I turned them on to see that, with virtually no 
config on them, they are already using ~83-84% of it:


#show platform software status control-processor brief
Load Average
 Slot  Status  1-Min  5-Min 15-Min
  RP0 Healthy   0.00   0.00   0.00

Memory (kB)
 Slot  Status    Total      Used (Pct)     Free (Pct)  Committed (Pct)
  RP0 Healthy  3972052  3317944 (84%)   654108 (16%)   1530296 (39%)


sh platform resources
**State Acronym: H - Healthy, W - Warning, C - Critical
Resource              Usage             Max         Warning  Critical  State

RP0 (ok, active)                                                       H
 Control Processor    5.81%             100%        90%      95%       H
  DRAM                3240MB(83%)       3878MB      90%      95%       H
ESP0 (ok, active)                                                      H
 QFP                                                                   H
  DRAM                1609582KB(76%)    2097152KB   80%      90%       H
  IRAM                0KB(0%)           0KB         80%      90%       H


...and IOSd looks to be the main user:


#monitor platform software process rp active

top - 09:59:58 up 7 days, 23:38,  0 users,  load average: 0.00, 0.00, 0.00
Tasks: 380 total,   4 running, 376 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.7%us,  1.7%sy,  0.0%ni, 97.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   3972052k total,  3324360k used,   647692k free,   211736k buffers
Swap:        0k total,        0k used,        0k free,  1705968k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
30505 root      20   0 9830m 161m 113m R   10  4.2  1226:27 fman_fp_image
23117 root      20   0 2205m 709m 341m S    3 18.3 258:15.06 linux_iosd-imag
20408 root      20   0  288m  73m  30m S    2  1.9 192:48.66 bsm
 2142 root      20   0 72468  24m  18m S    1  0.6  69:33.01 iomd



...Now, my question is: can we "safely" take the full table on the 4431s? 
I've had a read of the following:
http://www.cisco.com/c/en/us/td/docs/routers/access/4400/troubleshooting/memorytroubleshooting/isr4000_mem.html


It mentions that IOSd memory is allocated as "needed", but I'm not clear on 
whether the way the platform allocates memory will let us take a full table 
with 4GB of RAM. I'm really hoping it will, and that we don't have to upgrade 
the RAM on them.
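As a side note, here is a rough sketch (field layout assumed from the paste above) of pulling the numbers out of that memory line - bearing in mind these are Linux-based platforms, so "used" includes buffers/cache and raw "free" understates real headroom; the ~1 GB full-table figure below is my pessimistic guess, not a Cisco number:

```python
import re

# One memory row from "show platform software status control-processor brief"
# (values in kB; column layout assumed from the output quoted above).
row = "RP0 Healthy  3972052  3317944 (84%)   654108 (16%)   1530296 (39%)"

m = re.search(r"(\d+)\s+(\d+)\s+\(\d+%\)\s+(\d+)\s+\(\d+%\)\s+(\d+)", row)
total_kb, used_kb, free_kb, committed_kb = (int(g) for g in m.groups())

# Rough, pessimistic guess at full-table IOSd growth (NOT a Cisco figure):
full_table_estimate_kb = 1024 * 1024  # ~1 GB

print(f"free={free_kb} kB, committed={committed_kb} kB")
print("naive headroom ok:", free_kb > full_table_estimate_kb)
```

Note that "Committed" (39% here) is arguably the better signal than "Used", for exactly the buffers/cache reason above.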


Cheers.







___
cisco-nsp mailing list  cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/


Re: [c-nsp] 6500/7600 TCAM Usage

2016-05-31 Thread Mack McBride
From prior experience: once you hit 100%, bad things happen.
As the device approaches a full TCAM, convergence will get much slower.
Additionally, the table is not static, so you can get bursts of routes 
associated with leaks.
Don't forget there are routes that are not in the BGP and OSPF tables that get 
inserted: connecteds, statics, next hops, VLANs, and only Cisco knows what 
else. Most people forget there is a route for every VLAN on the 6500 platform, 
even if no IPs are associated with it.

I usually use a margin of 10K above what "show ip route summary" reports as a 
'safety net'. Once you get into that range, 'Bad things happen'.

'show mls cef summary' actually shows about 5K less on my devices, but those 
routes are still in there, so don't use that as a measure of what is actually 
getting inserted.
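Mack's rule of thumb above can be expressed as a trivial check (the function name and the example route counts are hypothetical):

```python
def tcam_headroom_ok(route_count, partition_size, safety_margin=10_000):
    """Mack's heuristic: keep ~10K entries of slack above the current
    "show ip route summary" count, to absorb connecteds, statics, next
    hops, per-VLAN routes and bursts from leaks."""
    return route_count + safety_margin <= partition_size

# Hypothetical counts against a 1M-entry (XL) IPv4 partition:
print(tcam_headroom_ok(600_000, 1_048_576))    # True  - comfortable
print(tcam_headroom_ok(1_045_000, 1_048_576))  # False - inside the danger zone
```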



Mack McBride | Senior Network Architect | ViaWest, Inc.
O: 720.891.2502 | C: 303.720.2711 | mack.mcbr...@viawest.com | www.viawest.com

-Original Message-
From: cisco-nsp [mailto:cisco-nsp-boun...@puck.nether.net] On Behalf Of Pete 
Templin
Sent: Tuesday, May 31, 2016 2:53 PM
To: Gert Doering; James Bensley
Cc: cisco-nsp@puck.nether.net
Subject: Re: [c-nsp] 6500/7600 TCAM Usage

+1 on what Gert said. You'll get log entries at the 90% threshold within
a region, but the badness only happens when you tickle the 100% threshold.


On 5/31/2016 11:45 AM, Gert Doering wrote:
> Hi,
>
> On Tue, May 31, 2016 at 07:19:22PM +0100, James Bensley wrote:
>> I have asked TAC and they said the TCAM can be 100% used; not until we
>> have 1,024,000 entries in TCAM will we start to see the syslog
>> messages for failing to install a prefix. I am certain that one CAN
>> NOT use 100% of the TCAM space; I'm sure I read somewhere that at
>> around 90% utilisation we start to process switch / drop packets /
>> fail to install routes.
> You can use 100% of what you have partitioned for - so if you partition for
> 512k IPv4, you'll blow up at 512*1024 IPv4 routes (minus a few, I'd
> assume).  Been there, done that - not at 512k but at something like 200k
> on non-XLs, years ago.
>
> That "at 90% utilization bad things will happen" sounds like an urban
> legend from the BNC ethernet times...  it's TCAM, there is nothing magic
> about 90% - either a route can be poked in there, in which case it will
> work, or not, in which case all excess routes will be process switched
> (and subject to rate-limiting)
>



Re: [c-nsp] 6500/7600 TCAM Usage

2016-05-31 Thread Pete Templin
+1 on what Gert said. You'll get log entries at the 90% threshold within 
a region, but the badness only happens when you tickle the 100% threshold.



On 5/31/2016 11:45 AM, Gert Doering wrote:

Hi,

On Tue, May 31, 2016 at 07:19:22PM +0100, James Bensley wrote:

I have asked TAC and they said the TCAM can be 100% used; not until we
have 1,024,000 entries in TCAM will we start to see the syslog
messages for failing to install a prefix. I am certain that one CAN
NOT use 100% of the TCAM space; I'm sure I read somewhere that at
around 90% utilisation we start to process switch / drop packets /
fail to install routes.

You can use 100% of what you have partitioned for - so if you partition for
512k IPv4, you'll blow up at 512*1024 IPv4 routes (minus a few, I'd
assume).  Been there, done that - not at 512k but at something like 200k
on non-XLs, years ago.

That "at 90% utilization bad things will happen" sounds like an urban
legend from the BNC ethernet times...  it's TCAM, there is nothing magic
about 90% - either a route can be poked in there, in which case it will
work, or not, in which case all excess routes will be process switched
(and subject to rate-limiting)





Re: [c-nsp] 6500/7600 TCAM Usage

2016-05-31 Thread Gert Doering
Hi,

On Tue, May 31, 2016 at 07:19:22PM +0100, James Bensley wrote:
> I have asked TAC and they said the TCAM can be 100% used; not until we
> have 1,024,000 entries in TCAM will we start to see the syslog
> messages for failing to install a prefix. I am certain that one CAN
> NOT use 100% of the TCAM space; I'm sure I read somewhere that at
> around 90% utilisation we start to process switch / drop packets /
> fail to install routes.

You can use 100% of what you have partitioned for - so if you partition for
512k IPv4, you'll blow up at 512*1024 IPv4 routes (minus a few, I'd
assume).  Been there, done that - not at 512k but at something like 200k
on non-XLs, years ago.

That "at 90% utilization bad things will happen" sounds like an urban
legend from the BNC ethernet times...  it's TCAM, there is nothing magic
about 90% - either a route can be poked in there, in which case it will
work, or not, in which case all excess routes will be process switched
(and subject to rate-limiting)
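In concrete numbers, the point here is that the boundary is the partition size itself, not a soft 90% mark - a quick sketch (function name hypothetical):

```python
# A 512k IPv4 partition holds exactly 512 * 1024 entries.
partition_ipv4 = 512 * 1024
print(partition_ipv4)  # 524288

# Either an entry fits in the partition or it doesn't - no soft threshold.
def route_installs_in_tcam(installed_so_far, partition=partition_ipv4):
    """True if the next route still fits; False means it gets process
    switched (and rate-limited) instead."""
    return installed_so_far < partition

print(route_installs_in_tcam(524_287))  # True
print(route_installs_in_tcam(524_288))  # False
```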

> I obviously can't find any Cisco documentation saying we can't use
> 100% of the TCAM. TAC have said we can use 100% of the TCAM. I still
> don't believe this; I'm so certain I have read somewhere that we
> can't.

You can't use 100%, as you'll never get the partitioning right...

gert
-- 
USENET is *not* the non-clickable part of WWW!
   //www.muc.de/~gert/
Gert Doering - Munich, Germany g...@greenie.muc.de
fax: +49-89-35655025g...@net.informatik.tu-muenchen.de



[c-nsp] 6500/7600 TCAM Usage

2016-05-31 Thread James Bensley
Hi All,

In a similar vein to the concurrently running thread about a SUP-2T-XL
thinking it is out of TCAM space...

What are realistic usage levels for TCAM on 6500 and 7600s RSPs? We've
got some 7600s running RSP720-3CXL-10GEs and they are at nearly 80%
utilisation (74%-78% off the top of my head).

Ignoring the fact that one can partition a certain percentage for IPv4
or IPv6 or multicast space. I mean overall can we actually use 1
million TCAM entries?

I have asked TAC and they said the TCAM can be 100% used; not until we
have 1,024,000 entries in TCAM will we start to see the syslog
messages for failing to install a prefix. I am certain that one CAN
NOT use 100% of the TCAM space; I'm sure I read somewhere that at
around 90% utilisation we start to process switch / drop packets /
fail to install routes.

I obviously can't find any Cisco documentation saying we can't use
100% of the TCAM. TAC have said we can use 100% of the TCAM. I still
don't believe this; I'm so certain I have read somewhere that we
can't.

Has anyone had any experiences relating to this they can comment on?

Cheers,

James.

[c-nsp] SUP2T.. TCAM related errors..

2016-05-31 Thread Peter Kranz
I cannot for the life of me figure out why this box seems to think it has
TCAM issues. It's a SUP-2T XL platform, and usage levels look well under TCAM
limits.



May 23 12:06:22: %CFIB-7-CFIB_EXCEPTION: FIB TCAM exception, Some entries
will be software switched
May 31 08:58:51: %CV6_LC-5-FIB_EXCEP_ON: Failed to insert IPv6 prefix in FIB
TCAM because it is full

rtr #show platform hardware capacity forwarding
L2 Forwarding Resources
   MAC Table usage:   Module  Collisions     Total    Used  %Used
                           1           0    131072    1525     1%
                           2           0    131072    1526     1%
                           6           0    131072    1522     1%

L3 Forwarding Resources
 FIB TCAM usage:                             Total    Used  %Used
  72 bits (IPv4, MPLS, EoM)                1048576  555182    53%
 144 bits (IP mcast, IPv6)                  524288   25930     5%
 288 bits (IPv6 mcast)                      262144       1     1%

 detail:      Protocol                        Used  %Used
              IPv4                          555180    53%
              MPLS                               1     1%
              EoM                                1     1%

              IPv6                           25924     5%
              IPv4 mcast                         6     1%
              IPv6 mcast                         1     1%

 Adjacency usage:                            Total    Used  %Used
                                           1048576   33569     3%

rtr #sh mls cef exception status   
Current IPv4 FIB exception state = TRUE
Current IPv6 FIB exception state = TRUE
Current MPLS FIB exception state = FALSE
Current EoM/VPLS FIB TCAM exception state = FALSE
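For anyone scripting a check for this, here is a small sketch (parsing the exact text pasted above) that flags which FIB exception states are latched - note the box shows IPv4/IPv6 TRUE even though raw utilisation looks fine, which is exactly the puzzle:

```python
# Output pasted verbatim from "sh mls cef exception status" above.
output = """\
Current IPv4 FIB exception state = TRUE
Current IPv6 FIB exception state = TRUE
Current MPLS FIB exception state = FALSE
Current EoM/VPLS FIB TCAM exception state = FALSE"""

exceptions = {}
for line in output.splitlines():
    # Each line: "Current <proto> FIB ... exception state = TRUE/FALSE"
    name, _, state = line.partition(" = ")
    proto = name.replace("Current ", "", 1).split(" FIB")[0]
    exceptions[proto] = (state == "TRUE")

print(exceptions)
# {'IPv4': True, 'IPv6': True, 'MPLS': False, 'EoM/VPLS': False}
```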





Re: [c-nsp] udld fail ?

2016-05-31 Thread james list
Yes, in general I see your points; I was wondering if there could be a
reasonable explanation for the behaviour mentioned.

2016-05-31 16:33 GMT+02:00 Nick Hilliard :

> james list wrote:
> > Apparently the Cisco gear has disabled one of the two ten-gig
> > interfaces after some flapping of the other one, due to UDLD, which is
> > currently not configured as aggressive or bidirectional (not supported
> > by the Juniper gear).
> >
> > Between the two devices, LACP fast is running.
> >
> > I'd appreciate any feedback on whether anybody else has experienced
> > this.
>
> udld is proprietary and non-interoperable technology.  One vendor's
> implementation will not work with another's.  Sometimes, a vendor's
> implementation will not interoperate with other equipment from the same
> vendor.  You need to disable udld on the c6500.
>
> Nick
>
>


Re: [c-nsp] udld fail ?

2016-05-31 Thread Nick Hilliard
james list wrote:
> Apparently the Cisco gear has disabled one of the two ten-gig interfaces
> after some flapping of the other one, due to UDLD, which is currently not
> configured as aggressive or bidirectional (not supported by the Juniper
> gear).
> 
> Between the two devices, LACP fast is running.
> 
> I'd appreciate any feedback on whether anybody else has experienced this.

udld is proprietary and non-interoperable technology.  One vendor's
implementation will not work with another's.  Sometimes, a vendor's
implementation will not interoperate with other equipment from the same
vendor.  You need to disable udld on the c6500.

Nick



[c-nsp] udld fail ?

2016-05-31 Thread james list
Dear experts,
I have a Cisco 6500 (12.2(33)) connected to a Juniper EX4200 with a 2 x 10Gb
port channel.

Apparently the Cisco gear has disabled one of the two ten-gig interfaces
after some flapping of the other one, due to UDLD, which is currently not
configured as aggressive or bidirectional (not supported by the Juniper
gear).

Between the two devices, LACP fast is running.

I'd appreciate any feedback on whether anybody else has experienced this.

Cheers
James


Re: [c-nsp] BFD on ME3600/ME3800/7600s

2016-05-31 Thread Adam Vitkovsky
> James Bensley
> Sent: Tuesday, May 31, 2016 9:50 AM
>
> On 28 May 2016 at 10:31, Adam Vitkovsky 
> wrote:
> > Alright then, so indeed nodes participating in echo mode have to do more
> > work than nodes participating in non-echo mode. That's why I assume it
> > performs slower (comparing performance of both modes in SW).
> >
> > To do list of one of the nodes in non-echo mode:
> > Tx at a given rate.
> > Reset dead timer if hello from remote node received.
> >
> > To do list of one of the nodes in echo mode:
> > Tx at a given rate.
> > Reset dead timer if own hello received.
>
> >>   > Loop partner's hellos at a given rate.   <<
>
> This last item is a bit of a misnomer.
>
> It isn't an extra task the CPU has to do. The CPU only has to perform the
> first two actions, which are the same (more or less) in both echo and non-
> echo modes.
>
> In echo mode the sending host sets its own MAC as the source MAC and the
> receiving node's MAC as the destination MAC within the Ethernet headers. In
> the IP headers the source and destination IP are both the sending node's IP.
> So the receiving node receives the BFD packet because it has the destination
> MAC, but it sees the destination IP is the sending node's, so the packet
> should be forwarded in hardware like any other L3 packet. In echo mode the
> remote node that receives the echo frame is unaware it is forwarding BFD
> packets for the node that sent the echo packet, so there is no additional
> work for nodes running echo mode.
>
Ah, good point - I see now where you're coming from.
So the packet is looped via the forwarding path and doesn't have to be punted 
to the BFD control plane.

In echo mode I'd expect some increase in Rx times, because now the hello has 
to travel up and down the link (so twice the propagation delay), and at the 
remote/looping end it also depends on how BFD packets are scheduled.


OK, looking at your initial post, the increase is huge (seconds).
But it's the max Tx that's affected (in addition to Rx), so I guess that's 
because the BFD messages are now generated in the CPU - so if you do "sh run" 
or BGP does its thing, it can negatively impact generation of BFD messages.
Why does Rx show this huge jump as well? I don't know - maybe it's correlated 
with other Rx values rather than a particular Tx value, and since Tx is 
shifted it shifts the Rx as well?
Since it's seconds, I doubt this is a problem with buffers at the looping end.


adam






Adam Vitkovsky
IP Engineer

T:  0333 006 5936
E:  adam.vitkov...@gamma.co.uk
W:  www.gamma.co.uk




Re: [c-nsp] BFD on ME3600/ME3800/7600s

2016-05-31 Thread James Bensley
On 28 May 2016 at 10:31, Adam Vitkovsky  wrote:
> Alright then, so indeed nodes participating in echo mode have to do more work
> than nodes participating in non-echo mode. That's why I assume it performs
> slower (comparing performance of both modes in SW).
>
> To do list of one of the nodes in non-echo mode:
> Tx at a given rate.
> Reset dead timer if hello from remote node received.
>
> To do list of one of the nodes in echo mode:
> Tx at a given rate.
> Reset dead timer if own hello received.

>>   > Loop partner's hellos at a given rate.   <<

This last item is a bit of a misnomer.

It isn't an extra task the CPU has to do. The CPU only has to perform
the first two actions, which are the same (more or less) in both echo
and non-echo modes.

In echo mode the sending host sets its own MAC as the source MAC and
the receiving node's MAC as the destination MAC within the Ethernet
headers. In the IP headers the source and destination IP are both the
sending node's IP. So the receiving node receives the BFD packet
because it has the destination MAC, but it sees the destination IP is
the sending node's, so the packet should be forwarded in hardware like
any other L3 packet. In echo mode the remote node that receives the
echo frame is unaware it is forwarding BFD packets for the node that
sent the echo packet, so there is no additional work for nodes running
echo mode.

So going back to your "to do" list:

> To do list of one of the nodes in non-echo mode:
> Tx at a given rate.
> Reset dead timer if hello from remote node received.

In this mode the remote node sends packets with the local node's IP as
the destination IP, causing the packets to be punted.

> To do list of one of the nodes in echo mode:
> Tx at a given rate.
> Reset dead timer if own hello received.

In this mode the local node sends packets with the destination IP set
to its own IP; since they are looped back, they will be punted upon
return.

So in both cases what I expect to be happening is that the origination
of BFD packets, and the checking of received packets (be they locally
or remotely originated - echo mode and non-echo mode respectively), is
offloaded to the ASIC.

So since we are doing more or less the same amount of work in the ASIC,
I'm not sure why echo mode is not supported in hardware on the
ME3600X/ME3800X/7600 LAN & CFC cards but non-echo mode is?

Whatever the answer, we have reliably lab tested that 50ms x3 works on
the ME3600X/ME3800X, and something like 500ms x3 works fine on 7600 LAN
cards (that could probably go lower for 7600s, tbh!), but some official
word from Cisco would be nice.
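For reference, the timer maths behind those tested values (a trivial sketch; BFD worst-case detection is roughly the negotiated interval times the detect multiplier):

```python
def bfd_detect_time_ms(interval_ms, multiplier):
    """Worst-case BFD failure detection: `multiplier` consecutive packets
    must be missed, so detection takes roughly interval * multiplier."""
    return interval_ms * multiplier

# The two combinations lab-tested above:
print(bfd_detect_time_ms(50, 3))   # ME3600X/ME3800X: 150 ms
print(bfd_detect_time_ms(500, 3))  # 7600 LAN cards: 1500 ms
```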

Cheers,
James.