Re: [c-nsp] Multicast issues on 7600s with WS-6748-sfp blades

2010-07-30 Thread Kevin Hatem
It never worked; when I sent that reply I had mistakenly thought it 
was an internal email.

I feel sorry for you Tim :-) ... you continue to get beat up for a simple 
mistake from many many moons ago!


___
cisco-nsp mailing list  cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/


Re: [c-nsp] Multicast issues on 7600s with WS-6748-sfp blades

2010-07-29 Thread Matthew Huff
I have seen some pretty low-level problems with multicast packets being dropped 
on the floor by the hardware forwarding on 67xx linecards. The bug seems to 
appear after changes that affect the ASIC (turning on mls qos, for example). 
Resetting the sup engine doesn't resolve it. Everything looks good at the 
CEF/MFIB level, and all counters show the packet count increasing, but the 
packets never get forwarded out of the linecard.

Cisco engineering was able to verify this was happening, but had no solution 
other than resetting the linecard. Once it was reset, everything worked.


Matthew Huff   | One Manhattanville Rd
OTA Management LLC | Purchase, NY 10577
http://www.ox.com  | Phone: 914-460-4039
aim: matthewbhuff  | Fax:   914-460-4139


 -Original Message-
 From: cisco-nsp-boun...@puck.nether.net 
 [mailto:cisco-nsp-boun...@puck.nether.net] On Behalf Of John
 Neiberger
 Sent: Wednesday, July 28, 2010 6:04 PM
 To: cisco-nsp@puck.nether.net
 Subject: [c-nsp] Multicast issues on 7600s with WS-6748-sfp blades
 
 We have a weird problem on some 7606s with WS-6748-SFP blades. We have
 a whole bunch of multicast streams running through these routers and
 there are multicast receivers directly attached. We had a problem
 where one particular multicast stream would occasionally have dropped
 packets resulting in MPEG CC errors on the receiver. We were able to
 prove that the source was clean, as were the paths between the source
 and the receiver. The receiver was not seeing MPEG CC errors on any
 other stream, which is really odd.
 
 Here's where it gets even stranger. When we moved the receiver to
 another port, like from 23 to 24, the receiver still saw the errors.
 We moved it to port 25 and the errors apparently went away. Our only
 guess is that this could potentially be an issue with the ASICs on the
 blade. Ports 13-24 are controlled by one ASIC (I believe a Rohini
 ASIC), and the four Rohini ASICs connect to two Janus ASICs. That is
 as I understand it. I may be wrong. When we moved the receiver from
 port 24 to 25, we moved to a different Rohini ASIC and possibly to
 another Janus ASIC. Regardless, according to our video guys the
 problem cleared up after the move.
 
 Have any of you ever experienced anything like this? Could this really
 be a problem on the Rohini or Janus ASICs? What sort of problem would
 only affect certain multicast groups and not others?
 
 Thanks

Re: [c-nsp] Multicast issues on 7600s with WS-6748-sfp blades

2010-07-29 Thread Asbjorn Hojmark - Lists
On Thu, 29 Jul 2010 08:29:18 -0400, you wrote:

 I have seen some pretty low-level problems with multicast packets
 being dropped on the floor by the hardware forwarding on 67xx
 linecards. The bug seems to appear after changes that affect the ASIC
 (turning on mls qos, for example). Resetting the sup engine doesn't
 resolve it. Everything looks good at the CEF/MFIB level, and all
 counters show the packet count increasing, but the packets never get
 forwarded out of the linecard.

Sounds a bit like CSCtc24959, which I saw with a couple of customers.
It was typically seen on changes to topology (i.e. rerouting).

Flipping the replication mode:
  mls ip multicast replication-mode ingress
  no mls ip multicast replication-mode ingress [1]
causes reprogramming of the hardware, which fixed the problem without
having to reset the whole blade. Note that it *is* traffic-affecting
for all active multicast flows.

CSCtc24959 is fixed in SRD4 and will not be integrated in earlier
releases, or so I was told.

HTH,
-A
[1] Or the other way round if you use ingress replication
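
The ordering in footnote [1] can be sketched in a few lines of Python.
This is only an illustration of the "flip it and flip it back" idea;
the function name and the uses_ingress flag are made up for this
example and are not part of any Cisco tool. The only command text used
is the one quoted above.

```python
from typing import List

# The single command quoted in this thread; everything else here is hypothetical.
CMD = "mls ip multicast replication-mode ingress"


def flip_replication_mode_commands(uses_ingress: bool) -> List[str]:
    """Return the two config lines, in order, that flip the mode and flip it back."""
    if uses_ingress:
        # Already forcing ingress: remove the command first, then re-apply it.
        return ["no " + CMD, CMD]
    # Not forcing ingress: apply the command first, then remove it again.
    return [CMD, "no " + CMD]


if __name__ == "__main__":
    for line in flip_replication_mode_commands(uses_ingress=False):
        print(line)
```

Either way the box ends up in the mode it started in; the point is only
to trigger the hardware reprogramming, and it is still traffic-affecting.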



Re: [c-nsp] Multicast issues on 7600s with WS-6748-sfp blades

2010-07-29 Thread John Neiberger

That sounds a lot like what we're seeing, but we're not on the SR
train. We're running 12.2(18)SXF9 at the moment.



Re: [c-nsp] Multicast issues on 7600s with WS-6748-sfp blades

2010-07-29 Thread Benjamin Lovell
John,

Is your 6748 a DFC line card? If so you are correct, moving from port 24 to 25 
would have moved you to the other Janus ASIC. Janus is the fabric/mcast 
replication ASIC. 

If your problem is ONLY with mcast then you can safely ignore the Rohini. We 
have addressed a number of issues with mcast replication on the Janus in recent 
years, so if you are running older 12.2(18)SXF or even earlier SX code you may 
want to consider an IOS update. Also, not knowing your network I can't say for 
sure, but it's not impossible to hit the scale limits with high-rate mcast pps 
and high mroute counts. 

The newer LCs with the 3CXL use a newer replication ASIC that performs 
significantly better than the Janus. 


-Ben




Re: [c-nsp] Multicast issues on 7600s with WS-6748-sfp blades

2010-07-29 Thread John Neiberger
It is a DFC card. We have a TAC case open about this, but we have also
recently seen similar weirdness on another device, and we haven't
opened a TAC case on that one yet. I wonder if we're running into an
issue with the Janus ASIC. That has been suggested by some people
here, but others were waiting for Cisco TAC to confirm it.



Re: [c-nsp] Multicast issues on 7600s with WS-6748-sfp blades

2010-07-29 Thread John Neiberger
Also, I forgot to mention that we have a lot of high-rate multicast
flows. Quite a few of them. Do you have any recommendations on how to
determine if we're running into scale limits for this hardware?

Thanks!



Re: [c-nsp] Multicast issues on 7600s with WS-6748-sfp blades

2010-07-29 Thread Tim Stevenson
Moving from port 24 to 25 moves you to the other Janus regardless of 
whether there's a DFC or not.


John, if as you say moving to the other replication engine (RE) 
solves the problem, it's possible you are simply exceeding the 
replication capacity of the Janus.
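
To make the port grouping concrete, here is a small Python sketch of 
the mapping described in this thread. It assumes a 48-port card where 
ports 1-24 sit behind one Janus and ports 25-48 behind the other, and 
it takes John's description of one Rohini per 12 ports at face value 
(he noted he might be wrong). The function name and the zero-based 
ASIC index numbering are hypothetical, chosen only for illustration.

```python
def asics_for_port(port: int) -> dict:
    """Map a front-panel port (1-48) to assumed Rohini and Janus indexes."""
    if not 1 <= port <= 48:
        raise ValueError("expected a port number between 1 and 48")
    return {
        # Assumption from the thread: 12 ports per Rohini (1-12, 13-24, ...).
        "rohini": (port - 1) // 12,
        # Assumption from the thread: 24 ports per Janus (1-24, 25-48).
        "janus": (port - 1) // 24,
    }


if __name__ == "__main__":
    for p in (23, 24, 25):
        print(p, asics_for_port(p))
```

Under these assumptions ports 23 and 24 map to the same Rohini and the 
same Janus, while port 25 lands on a different Rohini and the other 
Janus, which matches the move that cleared the errors.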


Metro-based cards (6708, 6716) have superior replication capacity, 
but that doesn't help you for GE: there is no shipping GE card for 
c6k that uses Metro. Note that changing the DFC has no impact 
whatsoever on the RE hardware; the RE is on the linecard itself.


More recent code does provide a CLI to monitor drops in the 
replication engine: show platform hardware capacity rewrite-engine. 
It was added in 12.2(33)SXI. That could help you figure out whether 
drops are happening due to oversubscription of the Janus.


Hope that helps,
Tim

Tim Stevenson, tstev...@cisco.com
Routing & Switching CCIE #5561
Distinguished Technical Marketing Engineer, Cisco Nexus 7000
Cisco - http://www.cisco.com
IP Phone: 408-526-6759

The contents of this message may be *Cisco Confidential*
and are intended for the specified recipients only.




Re: [c-nsp] Multicast issues on 7600s with WS-6748-sfp blades

2010-07-29 Thread Tim Stevenson
It never worked; when I sent that reply I had mistakenly thought it 
was an internal email.

The closest public things you'll probably find are the 6500 h/w 
architecture decks from Networkers. That, or talk to your Cisco account team.


Tim

At 01:17 PM 7/29/2010, Mikhail submitted:

Googling it gave me a link to Tim's page, which doesn't work
anymore:
http://bock-bock.cisco.com/~tstevens/FAQs/module-asic-mappings.html




