On Mon, 16 Feb 2015, Brad Fleming wrote:
> We’ve seen it since installing the high-capacity switch fabrics into our
> XMR4000 chassis roughly 4 years ago. We saw it through IronWare 5.4.00d.
> I’m not sure what software we were using when they were first installed;
> probably whatever would have been stable/popular around December 2010.
>
> Command is simply “power-off snm [1-3]” then “power-on snm [1-3]”.
Ah I see it ... I was looking for "SFM"s not "SNM"s!
I also echo the poster's questions about how you notice the corruption.
I have a suspicion I may be seeing similar things, particularly with 
UDP-based transactions like NTP and RADIUS, which could pass through such a 
chassis. But I also suffer CPU spikes with mcast traffic on that chassis, 
which has always been an issue for me.
Thanks.
Jethro.
>
> Note that the power-on process causes your management session to hang
> for a few seconds. The device isn’t broken and packets aren’t getting
> dropped; it’s just going through checks and echoing back status.
>
> -brad
>
>
> > On Feb 16, 2015, at 7:07 AM, Jethro R Binks <jethro.bi...@strath.ac.uk>
> > wrote:
> >
> > On Fri, 13 Feb 2015, Brad Fleming wrote:
> >
> >> Over the years we’ve seen odd issues where one of the
> >> switch-fabric-links will “wigout” and some of the data moving between
> >> cards will get corrupted. When this happens we power cycle each switch
> >> fab one at a time using this process:
> >>
> >> 1) Shutdown SFM #3
> >> 2) Wait 1 minute
> >> 3) Power SFM #3 on again
> >> 4) Verify all SFM links are up to SFM#3
> >> 5) Wait 1 minute
> >> 6) Perform steps 1-5 for SFM #2
> >> 7) Perform steps 1-5 for SFM #1
> >>
> >> Not sure you’re seeing the same issue that we see but the “SFM Dance”
> >> (as we call it) is a once-every-four-months thing somewhere across our
> >> 16 XMR4000 boxes. It can be done with little to no impact if you are
> >> patient and verify status before moving to the next SFM.
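The rotation above can be sketched as a small script. This is only an illustration: the `power-off snm` / `power-on snm` commands are the ones quoted elsewhere in this thread, `show sfm-links all` is assumed to be the verification command, and `send_command` is a hypothetical transport (in practice an SSH or telnet session to the box) that here just records what would be sent.

```python
import time

# Hypothetical transport: records the commands that would be issued,
# standing in for a real SSH/telnet session to the XMR.
def make_recorder(log):
    def send_command(cmd):
        log.append(cmd)
    return send_command

def sfm_dance(send_command, sfms=(3, 2, 1), wait_seconds=60):
    """Power-cycle each switch fabric module one at a time.

    Mirrors the procedure from the thread: power off, wait, power on,
    verify the fabric links, wait again, then move to the next SFM.
    """
    for sfm in sfms:
        send_command(f"power-off snm {sfm}")   # shut this SFM down
        time.sleep(wait_seconds)               # let traffic re-balance
        send_command(f"power-on snm {sfm}")    # bring it back
        send_command("show sfm-links all")     # operator checks links are up
        time.sleep(wait_seconds)               # settle before the next SFM

log = []
sfm_dance(make_recorder(log), wait_seconds=0)
```

The point of the per-SFM waits and link checks is that the remaining fabrics carry all traffic while one is down; skipping the verification step is what turns this into an outage.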
> >
> > That's all interesting. What code version is this? Also, how do you
> > shutdown the SFMs? I don't recall seeing documentation for that.
> >
> > Jethro.
> >
> >
> >>
> >>> On Feb 13, 2015, at 11:41 AM, net...@gmail.com wrote:
> >>>
> >>> We have three switch fabrics installed, all are under 1% utilized.
> >>>
> >>>
> >>> From: Jeroen Wunnink | Hibernia Networks [mailto:jeroen.wunn...@atrato.com]
> >>> Sent: Friday, February 13, 2015 12:27 PM
> >>> To: net...@gmail.com; 'Jeroen Wunnink | Hibernia Networks'
> >>> Subject: Re: [f-nsp] MLX throughput issues
> >>>
> >>> How many switch fabrics do you have in that MLX, and how high is the
> >>> utilization on them?
> >>>
> >>> On 13/02/15 18:12, net...@gmail.com wrote:
> >>>> We also tested with a spare Quanta LB4M we have and are seeing about the
> >>>> same speeds as we are seeing with the FLS648 (around 20MB/s or 160Mbps).
> >>>>
> >>>> I also reduced the number of routes we are accepting down to about 189K
> >>>> and that did not make a difference.
> >>>>
> >>>>
> >>>> From: foundry-nsp [mailto:foundry-nsp-boun...@puck.nether.net]
> >>>> On Behalf Of Jeroen Wunnink | Hibernia Networks
> >>>> Sent: Friday, February 13, 2015 3:35 AM
> >>>> To: foundry-nsp@puck.nether.net
> >>>> Subject: Re: [f-nsp] MLX throughput issues
> >>>>
> >>>> The FLS switches do something weird with packets. I've noticed they
> >>>> somehow interfere with the TCP MSS / window size negotiation,
> >>>> resulting in destinations further away having very poor speed results
> >>>> compared to destinations close by.
> >>>>
> >>>> We got rid of those a while ago.
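The distance-dependent symptom described above is exactly what a clamped TCP window produces: a single flow can never exceed window / RTT. A back-of-the-envelope sketch (the RTT values are illustrative assumptions, not measurements from this thread):

```python
def max_tcp_throughput_mbps(window_bytes, rtt_seconds):
    """Upper bound on single-flow TCP throughput: window / RTT, in Mbit/s."""
    return window_bytes * 8 / rtt_seconds / 1_000_000

# A 64 KiB window (i.e. window scaling broken or stripped) at various RTTs:
window = 65535
for rtt_ms in (2, 20, 100):
    mbps = max_tcp_throughput_mbps(window, rtt_ms / 1000)
    print(f"RTT {rtt_ms:3d} ms -> at most {mbps:6.1f} Mbit/s")
```

With the window pinned at 64 KiB, a 2 ms (nearby) path can still run at a few hundred Mbit/s, while a 100 ms (distant) path is capped in the single digits, regardless of link speed, which matches "close destinations fine, far destinations terrible".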
> >>>>
> >>>>
> >>>> On 12/02/15 17:37, net...@gmail.com wrote:
> >>>>> We are having a strange issue on our MLX running code 5.6.00c. We are
> >>>>> encountering some throughput issues that seem to be randomly impacting
> >>>>> specific networks.
> >>>>>
> >>>>> We use the MLX to handle both external BGP and internal VLAN routing.
> >>>>> Each FLS648 is used for Layer 2 VLANs only.
> >>>>>
> >>>>> From a server connected by 1 Gbps uplink to a Foundry FLS648 switch,
> >>>>> which is in turn connected to the MLX on a 10 Gbps port, a speed test
> >>>>> to an external network gets 20MB/s.
> >>>>>
> >>>>> Connecting the same server directly to the MLX is getting 70MB/s.
> >>>>>
> >>>>> Connecting the same server to one of my customer's Juniper EX3200
> >>>>> (which BGP peers with the MLX) also gets 70MB/s.
> >>>>>
> >>>>> Testing to another external network, all three scenarios get 110MB/s.
> >>>>>
> >>>>> The path to both test network locations goes through the same IP
> >>>>> transit provider.
> >>>>>
> >>>>> We are running an NI-MLX-MR with 2GB RAM; an NI-MLX-10Gx4 connects to
> >>>>> the Foundry FLS648 by XFP-10G-LR, and an NI-MLX-1Gx20-GC was used for
> >>>>> directly connecting the server. A separate NI-MLX-10Gx4 connects to our
> >>>>> upstream BGP providers. Customer’s Juniper EX3200 connects to the same
> >>>>> NI-MLX-10Gx4 as the FLS648. We take default routes plus full tables
> >>>>> from three providers by BGP, but filter out most of the routes.
> >>>>>
> >>>>> The fiber and optics on everything look fine. CPU usage is less than
> >>>>> 10% on the MLX and all line cards and CPU usage at 1% on the FLS648.
> >>>>> ARP table on the MLX is about 12K, and BGP table is about 308K routes.
> >>>>>
> >>>>> Any assistance would be appreciated. I suspect there is a setting that
> >>>>> we’re missing on the MLX that is causing this issue.
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> foundry-nsp mailing list
> >>>>> foundry-nsp@puck.nether.net
> >>>>> http://puck.nether.net/mailman/listinfo/foundry-nsp
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>>
> >>>> Jeroen Wunnink
> >>>> IP NOC Manager - Hibernia Networks
> >>>> Main numbers (Ext: 1011): USA +1.908.516.4200 | UK +44.1704.322.300
> >>>> Netherlands +31.208.200.622 | 24/7 IP NOC Phone: +31.20.82.00.623
> >>>> jeroen.wunn...@hibernianetworks.com
> >>>> www.hibernianetworks.com
> >>>
> >>>
> >>
> >
>
>
. . . . . . . . . . . . . . . . . . . . . . . . .
Jethro R Binks, Network Manager,
Information Services Directorate, University Of Strathclyde, Glasgow, UK
The University of Strathclyde is a charitable body, registered in
Scotland, number SC015263.
_______________________________________________
foundry-nsp mailing list
foundry-nsp@puck.nether.net
http://puck.nether.net/mailman/listinfo/foundry-nsp