Re: [c-nsp] STP over Port-channel issue

2024-05-06 Thread Saku Ytti via cisco-nsp
On Mon, 6 May 2024 at 15:53, james list via cisco-nsp
 wrote:

> The question: since the PO remains up, why do we see this behaviour?
> Are BPDUs sent just over one link (i.e. the higher interface)?

Correct.

> How can we solve this issue while keeping this scenario?
> Could moving to RSTP solve it?

No.

I understand you want the topology to remain intact as long as there is
at least one member up, but I'm not sure we can guarantee that. I think
if you set LACP to fast, it'll fail in at most 3s, and if you ensure STP
fails more slowly (i.e. don't use rapid PVST), you will probably just see
the BPDUs switch to another physical interface instead of an STP convergence.

You'd need something similar to micro-BFD on LAGs, where a BFD PDU is
sent on each member instead of on an unspecified single member, but
AFAIK this does not exist for BPDUs.
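
For comparison, per-member liveness does exist for BFD: a rough Junos micro-BFD
(RFC 7130) sketch, with a made-up bundle name and addresses, and knob names from
memory, so treat the exact hierarchy as an assumption:

  set interfaces ae0 aggregated-ether-options bfd-liveness-detection minimum-interval 300
  set interfaces ae0 aggregated-ether-options bfd-liveness-detection local-address 192.0.2.1
  set interfaces ae0 aggregated-ether-options bfd-liveness-detection neighbor 192.0.2.2

Nothing equivalent exists for BPDUs, which simply follow whichever single member
the sending switch happens to pick.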

So what you really need is an L3/MPLS topology :/.

-- 
  ++ytti


Re: [c-nsp] [j-nsp] Strange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-12 Thread Saku Ytti via cisco-nsp
On Mon, 12 Feb 2024 at 09:44, james list  wrote:

> I'd like to test with LACP slow, then we can see if the physical interface
> still flaps...

I don't think that's a good idea; what would we learn? Would we have
to wait 30 times longer, so one to three months, to hit whatever it is,
before we have confidence?

I would suggest:
 - turn on debugging, to see the Cisco emitting LACP PDUs and the Juniper
receiving them
 - do a packet capture if at all reasonable, ideally with a tap, or in the
absence of a tap with a mirror
 - turn off distributed LACP handling on Junos
 - ping across the link, ideally at a 0.2-0.5s interval, to record how ping
stops in relation to the first syslog entry about LACP going down
 - wait for 4 days


-- 
  ++ytti


Re: [c-nsp] [j-nsp] Strange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread Saku Ytti via cisco-nsp
On Sun, 11 Feb 2024 at 17:52, james list  wrote:

> - why physical interface flaps in DC1 if it is related to lacp ?

16:39:35.813 Juniper reports LACP timeout (so the problem started at
16:39:32; was traffic passing at seconds 32, 33, 34?)
16:39:36.xxx Cisco reports interface down, long after the problem had
already started

Why Cisco reports the physical interface down, I'm not sure. But clearly
the problem was already happening before the interface went down, and the
first log entry is the LACP timeout, which occurs 3s after the problem
starts. Perhaps Juniper asserts RFI for some reason? Perhaps Cisco resets
the physical interface once it is removed from the LACP bundle?

> - why does the same setup in DC2 not report issues?

If this is an LACP-related software issue, there could be a difference
that has not been identified. You need to gather more information, like
how ping looks throughout this event, particularly before the syslog
entries. And if ping still works up until the syslog, you almost certainly
have a software issue with LACP inject at Cisco, or more likely LACP punt
at Juniper.

-- 
  ++ytti


Re: [c-nsp] [j-nsp] Strange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread Saku Ytti via cisco-nsp
On Sun, 11 Feb 2024 at 15:24, james list  wrote:

> While on Juniper when the issue happens I always see:
>
> show log messages | last 440 | match LACPD_TIMEOUT
> Jan 25 21:32:27.948 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp current while timer expired current Receive State: CURRENT

> Feb  9 16:39:35.813 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp current while timer expired current Receive State: CURRENT

OK, so the problem always starts with Juniper seeing 3 seconds without an
LACP PDU, i.e. missing 3 consecutive LACP PDUs. It would be good to ping
while the problem is happening, to see whether ping stops 3s before the
syslog lines or at the same time as the syslog lines.
If ping stops 3s before, it's a link problem from Cisco to Juniper.
If ping stops at syslog time (my guess), it's a software problem.

There is unfortunately a lot of bug surface here, both on the inject and
the punt path. You could be hitting PR1541056 on the Juniper end. You
could test for this by removing distributed LACP handling with 'set
routing-options ppm no-delegate-processing'.
You could also do a packet capture for LACP on both ends, to try to see
whether LACP was sent by Cisco and received by the capture, but not by the
system.
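
For illustration, a rough sketch of those two checks; the interface names are the
ones from the logs in this thread, and the exact ethanalyzer options vary by NX-OS
release, so treat the details as assumptions:

  # Junos: disable distributed (PPM) LACP handling, then watch LACP on the member
  set routing-options ppm no-delegate-processing
  monitor traffic interface et-0/1/5 matching "ether proto 0x8809"

  # NX-OS: capture slow-protocols (LACP, ethertype 0x8809) frames reaching the supervisor
  ethanalyzer local interface inband capture-filter "ether proto 0x8809" limit-captured-frames 0

If the capture sees the PDU but lacpd never logs it, the punt path is the suspect.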


-- 
  ++ytti


Re: [c-nsp] [j-nsp] Strange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread Saku Ytti via cisco-nsp
Hey James,

You shared this off-list, I think it's sufficiently material to share.

2024 Feb  9 16:39:36 NEXUS1
%ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN: Interface
port-channel101 is down (No operational members)
2024 Feb  9 16:39:36 NEXUS1 %ETH_PORT_CHANNEL-5-PORT_DOWN:
port-channel101: Ethernet1/44 is down
Feb  9 16:39:35.813 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5:
lacp current while timer expired current Receive State: CURRENT
Feb  9 16:39:35.813 2024  MX1 lacpd[31632]: LACP_INTF_DOWN: ae49:
Interface marked down due to lacp timeout on member et-0/1/5

We can't know the order of events here, because subsecond precision is not
enabled on the Cisco end.
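
As an aside, NX-OS can log with sub-second timestamps, which would remove this
ambiguity next time; a minimal sketch, assuming the knob exists in this NX-OS
release:

  configure terminal
   logging timestamp milliseconds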

But if the failure started from the interface going down, it would take
3 seconds for Juniper to realise the LACP failure. However, we can see that
it happens in less than 1s, so we can determine the interface was not
down first: the first problem was Juniper not receiving 3 consecutive
LACP PDUs, 1s apart, prior to noticing any interface-state-related
problems.

Is this always the order of events? Does it always happen with Juniper
noticing problems receiving LACP PDU first?


On Sun, 11 Feb 2024 at 14:55, james list via juniper-nsp
 wrote:
>
> Hi
>
> 1) cable has been replaced with a brand new one, they said that to check an
> MPO 100 Gbs cable is not that easy
>
> 3) no errors reported on both side
>
> 2) here the output of cisco and juniper
>
> NEXUS1# sh interface eth1/44 transceiver details
> Ethernet1/44
> transceiver is present
> type is QSFP-100G-SR4
> name is CISCO-INNOLIGHT
> part number is TR-FC85S-NC3
> revision is 2C
> serial number is INL27050TVT
> nominal bitrate is 25500 MBit/sec
> Link length supported for 50/125um OM3 fiber is 70 m
> cisco id is 17
> cisco extended id number is 220
> cisco part number is 10-3142-03
> cisco product id is QSFP-100G-SR4-S
> cisco version id is V03
>
> Lane Number:1 Network Lane
>    SFP Detail Diagnostics Information (internal calibration)
>   ----------------------------------------------------------------------------
>                 Current              Alarms                  Warnings
>                 Measurement     High        Low         High          Low
>   ----------------------------------------------------------------------------
>   Temperature   30.51 C        75.00 C     -5.00 C     70.00 C        0.00 C
>   Voltage        3.28 V         3.63 V      2.97 V      3.46 V        3.13 V
>   Current        6.40 mA       12.45 mA     3.25 mA    12.45 mA       3.25 mA
>   Tx Power       0.98 dBm       5.39 dBm  -12.44 dBm    2.39 dBm     -8.41 dBm
>   Rx Power      -1.60 dBm       5.39 dBm  -14.31 dBm    2.39 dBm    -10.31 dBm
>   Transmit Fault Count = 0
>   ----------------------------------------------------------------------------
>   Note: ++  high-alarm; +  high-warning; --  low-alarm; -  low-warning
>
> Lane Number:2 Network Lane
>    SFP Detail Diagnostics Information (internal calibration)
>   ----------------------------------------------------------------------------
>                 Current              Alarms                  Warnings
>                 Measurement     High        Low         High          Low
>   ----------------------------------------------------------------------------
>   Temperature   30.51 C        75.00 C     -5.00 C     70.00 C        0.00 C
>   Voltage        3.28 V         3.63 V      2.97 V      3.46 V        3.13 V
>   Current        6.40 mA       12.45 mA     3.25 mA    12.45 mA       3.25 mA
>   Tx Power       0.62 dBm       5.39 dBm  -12.44 dBm    2.39 dBm     -8.41 dBm
>   Rx Power      -1.18 dBm       5.39 dBm  -14.31 dBm    2.39 dBm    -10.31 dBm
>   Transmit Fault Count = 0
>   ----------------------------------------------------------------------------
>   Note: ++  high-alarm; +  high-warning; --  low-alarm; -  low-warning
>
> Lane Number:3 Network Lane
>    SFP Detail Diagnostics Information (internal calibration)
>   ----------------------------------------------------------------------------
>                 Current              Alarms                  Warnings
>                 Measurement     High        Low         High          Low
>   ----------------------------------------------------------------------------
>   Temperature   30.51 C        75.00 C     -5.00 C     70.00 C        0.00 C
>   Voltage        3.28 V         3.63 V      2.97 V      3.46 V        3.13 V
>   Current        6.40 mA       12.45 mA     3.25 mA    12.45 mA       3.25 mA
>   Tx Power       0.87 dBm       5.39 dBm  -12.44 dBm    2.39 dBm     -8.41 dBm
>   Rx Power       0.01 dBm       5.39 dBm  -14.31 dBm    2.39 dBm    -10.31 dBm
>   Transmit Fault Count = 0
>   ----------------------------------------------------------------------------
>   Note: ++  high-alarm; +  high-warning; --  low-alarm; -  low-warning
>
> Lane Number:4 Network Lane
>SFP Detail Diagnostics 

Re: [c-nsp] [j-nsp] Strange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread Saku Ytti via cisco-nsp
I want to clarify: I meant this in the context of the original question.

That is, if you have a BGP-specific problem and no FCS errors, then
you can't have link problems.

But in this case the problem is not BGP-specific; in fact it has
nothing to do with BGP, since the problem begins with the observed link
flap.

On Sun, 11 Feb 2024 at 14:14, Saku Ytti  wrote:
>
> I don't think any of these matter. You'd see FCS failure on any
> link-related issue causing the BGP packet to drop.
>
> If you're not seeing FCS failures, you can ignore all link related
> problems in this case.
>
>
> On Sun, 11 Feb 2024 at 14:13, Havard Eidnes via juniper-nsp
>  wrote:
> >
> > > DC technicians states cable are the same in both DCs and
> > > direct, no patch panel
> >
> > Things I would look at:
> >
> >  * Has all the connectors been verified clean via microscope?
> >
> >  * Optical levels relative to threshold values (may relate to the
> >first).
> >
> >  * Any end seeing any input errors?  (May relate to the above
> >two.)  On the Juniper you can see some of this via PCS
> >("Physical Coding Sublayer") unexpected events independently
> >of whether you have payload traffic, not sure you can do the
> >same on the Nexus boxes.
> >
> > Regards,
> >
> > - Håvard
> > ___
> > juniper-nsp mailing list juniper-...@puck.nether.net
> > https://puck.nether.net/mailman/listinfo/juniper-nsp
>
>
>
> --
>   ++ytti



-- 
  ++ytti


Re: [c-nsp] [j-nsp] Strange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread Saku Ytti via cisco-nsp
I don't think any of these matter. You'd see FCS failure on any
link-related issue causing the BGP packet to drop.

If you're not seeing FCS failures, you can ignore all link related
problems in this case.


On Sun, 11 Feb 2024 at 14:13, Havard Eidnes via juniper-nsp
 wrote:
>
> > DC technicians states cable are the same in both DCs and
> > direct, no patch panel
>
> Things I would look at:
>
>  * Has all the connectors been verified clean via microscope?
>
>  * Optical levels relative to threshold values (may relate to the
>first).
>
>  * Any end seeing any input errors?  (May relate to the above
>two.)  On the Juniper you can see some of this via PCS
>("Physical Coding Sublayer") unexpected events independently
>of whether you have payload traffic, not sure you can do the
>same on the Nexus boxes.
>
> Regards,
>
> - Håvard
> ___
> juniper-nsp mailing list juniper-...@puck.nether.net
> https://puck.nether.net/mailman/listinfo/juniper-nsp



-- 
  ++ytti


Re: [c-nsp] [j-nsp] Strange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread Saku Ytti via cisco-nsp
On Sun, 11 Feb 2024 at 13:51, james list via juniper-nsp
 wrote:

> One thing I've omitted to say is that BGP is over a LACP bundle with currently
> just one 100 Gb/s interface.
>
> I see that the issue is triggered on Cisco when the eth interface seems to go
> into Initializing state:

OK, so we can forget BGP entirely and focus on why the LACP is going down.

Is the LACP a single port, Eth1/44?

When the LACP fails, does the Juniper end emit any syslog? Does Juniper
see the interface facing Eth1/44 flapping?

--
  ++ytti


Re: [c-nsp] [j-nsp] Strange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread Saku Ytti via cisco-nsp
Open JTAC and CTAC cases.

The amount of information provided is wildly insufficient.

'BGP flaps': what does that mean? Is it always the same direction? If
so, which direction thinks it's not seeing keepalives? Do you also
observe loss in 'ping' between the links during the period?

Purely stabbing in the dark, I'd say you always observe it in a single
direction, because in that direction you are reliably losing every nth
keepalive, and statistically it takes 1-3 days to lose 3 in a row with
the probability you're seeing. Now why exactly that is: is one end not
sending to the wire, or is one end not receiving from the wire? Again
stabbing in the dark, it is more likely that the problem is in the punt
path rather than the inject path, so I would focus my investigation on
the party that is tearing down the session due to lack of keepalives, on
the thesis that this device has a problem in its punt path and is
dropping BGP packets from the wire with some reliable probability.
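
As a rough back-of-the-envelope illustration (my numbers, not from the report):
with 30 s keepalives and an independent per-keepalive loss probability p, the
expected time to lose 3 in a row is roughly

  E[T] \approx \frac{30\,\mathrm{s}}{p^{3}}

so a p of around 0.05-0.07 would already produce a flap every 1-3 days.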

On Sun, 11 Feb 2024 at 12:09, james list via juniper-nsp
 wrote:
>
> Dear experts
> we have a couple of BGP peers over a 100 Gbs interconnection between
> Juniper (MX10003) and Cisco (Nexus N9K-C9364C) in two different datacenters
> like this:
>
> DC1
> MX1 -- bgp -- NEXUS1
> MX2 -- bgp -- NEXUS2
>
> DC2
> MX3 -- bgp -- NEXUS3
> MX4 -- bgp -- NEXUS4
>
> The issue we see is that sporadically (i.e. every 1 to 3 days) we notice BGP
> flaps only in DC1, on both interconnections (not at the same time). There is
> still no traffic, since once we noticed the flaps we blocked deployment to
> production.
>
> We've already changed SFPs (we moved the ones from DC2 to DC1 and vice versa)
> and cables on both the interconnections at DC1 without any solution.
>
> SFP we use in both DCs:
>
> Juniper - QSFP-100G-SR4-T2
> Cisco - QSFP-100G-SR4
>
> over MPO cable OM4.
>
> Distance is 70 m in DC1 and 80 m in DC2, hence it is shorter where we see the issue.
>
> Any idea or suggestion what to check or to do ?
>
> Thanks in advance
> Cheers
> James
> ___
> juniper-nsp mailing list juniper-...@puck.nether.net
> https://puck.nether.net/mailman/listinfo/juniper-nsp



-- 
  ++ytti


Re: [c-nsp] ASR9000 QoS counters on LAG

2024-01-20 Thread Saku Ytti via cisco-nsp
On Sun, 21 Jan 2024 at 03:17, Ross Halliday
 wrote:

> Moving to qos-group for egress classes got me the result I was looking for. 
> Thank you very much!

I'm happy that you got the results you wanted, but that shouldn't have
fixed it. The 'priority level 3' is the only thing that I can think of
which might cause it to fail to program.

Just to continue on the priority level: if you set it on ingress,
it'll still carry the values on egress. But you get a lot more
protection, because you ensure that the fabric isn't being congested by
less important traffic, which would cause more important traffic to drop
during fabric congestion.


-- 
  ++ytti


Re: [c-nsp] ASR9000 QoS counters on LAG

2024-01-19 Thread Saku Ytti via cisco-nsp
On Fri, 19 Jan 2024 at 05:10, Ross Halliday via cisco-nsp
 wrote:


> We've inherited some older ASR9000 systems that we're trying to support 
> in-place. The software version on this one router is fairly old at 6.1.4. 
> Driving it are a pair of RSP440-SE. The line cards are A9K-MOD160-SE with 
> A9K-MPA-8X10GE in each.
>
> I haven't had any issues until trying to apply a policy map in the egress 
> direction on a LAG. The counters simply don't increase. I'm aware of the 
> complexities of policing, but right now I just want to see packets match a 
> class - any class - even class-default doesn't increment as expected. 
> Everything works as expected on a non-LAG port. Ingress works fine, as well - 
> this is just egress on a LAG.
>
> IOS-XR is not my strong point at all. I'm not sure if I'm missing something 
> very obvious, but this seems so weird that it feels like a display bug.
>
> The LAG members are split between the two linecards.
>
> Any suggestions would be greatly appreciated!


Any syslog messages when you attach it?

I don't think the device supports 'priority level 3'; there is only
default, 2 and 1, default being the worst and 1 the best (well, in the
CLI; in the NPU they are reversed to make debugging less boring).
Practically all the utility of the priority level has already been used
by the time the egress policy is considered, so they don't do much here;
you should set them on ingress.

Not related, but I can't help myself: you shouldn't classify and
schedule on egress; you classify on ingress, and schedule/rewrite on
egress. That is, your 'match dscp/cos' should be on ingress, with 'set
qos-group N', and on egress you do 'match qos-group N'. Not only
will this separation of concerns make things a lot easier to rationalise
and manage, it's also the only way you can do QoS on many other
platforms, so you don't have to re-invent policies for different
platforms.
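
A minimal IOS-XR-style sketch of that split (class/policy names and the match
criteria are made up for illustration):

  class-map match-any CM-NC
   match dscp cs6 cs7
   end-class-map
  !
  policy-map PM-INGRESS
   class CM-NC
    set qos-group 1
   !
   end-policy-map
  !
  class-map match-any CM-QG1
   match qos-group 1
   end-class-map
  !
  policy-map PM-EGRESS
   class CM-QG1
    bandwidth percent 10
   !
   end-policy-map
  !
  interface Bundle-Ether1
   service-policy input PM-INGRESS
   service-policy output PM-EGRESS
  !

The ingress policy is the only place that looks at DSCP/CoS; everything on egress
keys off the qos-group, so the egress policy does not need re-inventing per platform.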

Do remember that by default 'police X' on a LAG in ASR9000 means X on
each member interface, for a total LAG capacity of X*members_up (it varies).
There is no way to configure a shared policer on any platform at all,
but there is a way to tell the platform to divide X by the active member
count for each member, so the aggregate cannot be more than X, but then no
single interface can burst more than X/member_count.

-- 
  ++ytti


Re: [c-nsp] ASR9902 fpd upgrade

2023-12-20 Thread Saku Ytti via cisco-nsp
On Thu, 21 Dec 2023 at 09:21, Hank Nussbacher via cisco-nsp
 wrote:

> It used to be TAC was a main selling card of Cisco vs competitors.  Not
> any longer :-(

Don't remember them ever being relatively or absolutely good.

Having one support channel for all requests doesn't work, because >99%
of cases are useless cases from people who didn't do their homework and
are just wasting resources, so everyone optimises the process around
dealing with those, which makes the experience feel horrible for the
legitimate support cases.

But vendors sell a different experience at a premium cost; Cisco calls it
HTTS. It's not necessarily that you get better people, but you get
named people who quickly learn that the previous optimisation
point doesn't work with you, because your cases are legitimate, so
they don't perform the useless volleys in the hope that the customer
realises their error and doesn't come back (which is otherwise a good
strategy, as it often works).

Unfortunately HTTS is quite expensive and it feels backwards, that the
people who are reporting legitimate problems in your product also have
to pay more to get support. It often feels like no one buys Ciscos or
Junipers, but leases them, as the support contracts are so outrageous,
and you can't go without the support contracts.

I'm not blaming Cisco or any other vendor; I think this outcome is a
market-driven fact. If I imagine someone releasing a new NOS which works
as well as Windows or Linux, in that the OS is almost never the reason
why basic functionality of your product is broken, then I can imagine a
lot of my customers would choose not to buy this lease-level support
contract, and I'd be out of business. The market requires NOSes to be of
really poor quality.
-- 
  ++ytti


Re: [c-nsp] ASR 1000 series replacement

2023-12-16 Thread Saku Ytti via cisco-nsp
On Sat, 16 Dec 2023 at 18:38, Charles Sprickman via cisco-nsp
 wrote:

> > There are hundreds of GRE tunnels.
>
> I have nothing to offer, and I'm mostly out of the ISP game, but I am so 
> curious what the use-case is here, especially the "BGP to each CPE". I 
> understand this might be private info, but I'm just so very curious. The BGP 
> part is where I'm stumped...

Any reason why you'd need a hub-and-spoke topology; there are so many use
cases. I've used it for two things:

Mobile backup termination
OOB termination

In both cases with BGP, because I had 2 hubs for redundancy. But BGP
would be needed in the first case anyhow, as the customer delivers the
IPs. And it helps in the 2nd case even without redundancy, to simplify
configuration and keep the hub configuration static (no touching the hub
when adding or removing spokes, thanks to BGP listen/allow).
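
A minimal IOS-style sketch of the listen/allow part (AS numbers, prefix range and
group name are made up):

  router bgp 65000
   bgp listen limit 200
   bgp listen range 10.255.0.0/16 peer-group SPOKES
   neighbor SPOKES peer-group
   neighbor SPOKES remote-as 65001

New spokes sourced from the range just come up; the hub config never changes.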

I mean this is what got rebranded under SDN but has existed before,
it's just pragmatic and specific.

-- 
  ++ytti


Re: [c-nsp] Netflow vs SNMP

2023-10-02 Thread Saku Ytti via cisco-nsp
On Mon, 2 Oct 2023 at 13:22, Dobbins, Roland via cisco-nsp
 wrote:

> cache timeout inactive 15
> Kentik recommends 15s:
>
> This is an old, out-of-date recommendation from Cisco should be retired.
>
> 5s is plenty of time for inactive flows.

What is the basis for this recommendation? With 1:10k or 1:1k, either
way you'll have 1 packet per cache item. So 15, 5, 1, 0 would allow an
equal amount of cache row re-use, which is none.

--
  ++ytti


Re: [c-nsp] Netflow vs SNMP

2023-10-02 Thread Saku Ytti via cisco-nsp
On Mon, 2 Oct 2023 at 09:14, Hank Nussbacher via cisco-nsp
 wrote:

> Does it make sense to go 1:1, which will only increase the number of
> Netflow records to export?  Everyone that does 1:1000 or 1:1
> sampling, do you also see a discrepancy between Netflow stats and SNMP
> stats?

Both 1:1000 and 1:1 make netflow an expensive sflow. You will see that
almost all records exported contain exactly 1 packet of data. You are
spending a lot of resources storing that data and later exporting it,
when you only ever punch the flow exactly once.
This is because people have run the same configuration for decades,
while traffic has grown exponentially, so the probability of hitting
two packets of the same flow has gone down exponentially. As the
amount of traffic grows, sampling needs to become more and more
aggressive to retain the same resolution. It is basically becoming
massively more expensive over time, and cache-based in-line netflow is
likely dead in the water; it will be replaced by specialised in-line tap
devices for the few who can actually justify the cost.

Juniper has realised this, and PTX no longer uses cache at all, but
exports immediately after sampling.

IPFIX has newer sampling entities, which allow you to communicate that
every N packets you sample C packets. This would allow you to ensure
that once you fire sampling/export, you sample enough packets to
fill the MTU on export, to have an ideal balance of resource use and
data density. Again entirely without a cache, as the cache does nothing
unless you have very, very aggressive sampling.

--
  ++ytti


Re: [c-nsp] Port-channel not working Juniper vs Cisco

2023-06-11 Thread Saku Ytti via cisco-nsp
Your output says the JNPR side is still fast/1s.

I'd actually prefer to look at the LACP PDUs on the wire with 'monitor traffic '.

A flap would be expected if CSCO sends every 30s and JNPR expects one every 1s.
We should expect JNPR to be happy for about 3s after receiving a PDU from
CSCO. I thoroughly recommend setting both ends to fast, always. Because
unless all your ports are lit purely from a local optic, the ability
to know whether a link carries traffic or not is generally really poor, and
LACP will at least guarantee you won't blackhole for longer than 3s when
the link stops carrying traffic for some reason unobservable to you.

On Sun, 11 Jun 2023 at 13:31, james list  wrote:
>
> Hi
> I've deactivated the FAST on Juniper but nothing changes...
>
> As I wrote on Cisco remains down hence you will not see any LACP while on 
> Juniper it flaps, once I shut on cisco it stop to flap
> I'm not sure it is a cabling issue...
> we should try to setup a pure L3 p2p link and test that I guess...
>
> See the output:
>
> JUNIPER
>
> # run show interfaces diagnostics optics ge-0/2/3
> Physical interface: ge-0/2/3
> Laser bias current:  5.686 mA
> Laser output power:  0.2420 mW / -6.16 dBm
> Module temperature:  39 degrees C / 102 degrees F
> Module voltage:  3.3000 V
> Laser receiver power  :  0.2093 mW / -6.79 dBm
> Laser bias current high alarm :  Off
> Laser bias current low alarm  :  Off
> Laser bias current high warning   :  Off
> Laser bias current low warning:  Off
> Laser output power high alarm :  Off
> Laser output power low alarm  :  Off
> Laser output power high warning   :  Off
> Laser output power low warning:  Off
> Module temperature high alarm :  Off
> Module temperature low alarm  :  Off
> Module temperature high warning   :  Off
> Module temperature low warning:  Off
> Module voltage high alarm :  Off
> Module voltage low alarm  :  Off
> Module voltage high warning   :  Off
> Module voltage low warning:  Off
> Laser rx power high alarm :  Off
> Laser rx power low alarm  :  Off
> Laser rx power high warning   :  Off
> Laser rx power low warning:  Off
> Laser bias current high alarm threshold   :  12.000 mA
> Laser bias current low alarm threshold:  2.000 mA
> Laser bias current high warning threshold :  10.000 mA
> Laser bias current low warning threshold  :  2.000 mA
> Laser output power high alarm threshold   :  1.1000 mW / 0.41 dBm
> Laser output power low alarm threshold:  0.0600 mW / -12.22 dBm
> Laser output power high warning threshold :  1. mW / 0.00 dBm
> Laser output power low warning threshold  :  0.0850 mW / -10.71 dBm
> Module temperature high alarm threshold   :  100 degrees C / 212 degrees F
> Module temperature low alarm threshold:  -40 degrees C / -40 degrees F
> Module temperature high warning threshold :  85 degrees C / 185 degrees F
> Module temperature low warning threshold  :  -10 degrees C / 14 degrees F
> Module voltage high alarm threshold   :  3.630 V
> Module voltage low alarm threshold:  2.970 V
> Module voltage high warning threshold :  3.465 V
> Module voltage low warning threshold  :  3.134 V
> Laser rx power high alarm threshold   :  1.8000 mW / 2.55 dBm
> Laser rx power low alarm threshold:  0. mW / - Inf dBm
> Laser rx power high warning threshold :  1. mW / 0.00 dBm
> Laser rx power low warning threshold  :  0.0200 mW / -16.99 dBm
>
>
>
> # run show lacp statistics interfaces ae10
> Aggregated interface: ae10
> LACP Statistics:   LACP Rx LACP Tx   Unknown Rx   Illegal Rx
>   ge-0/2/3   0 23300
>
>
> # run show lacp interfaces ae10 extensive
> Aggregated interface: ae10
> LACP state:   Role   Exp   Def  Dist  Col  Syn  Aggr  Timeout  
> Activity
>   ge-0/2/3   ActorNo   YesNo   No   No   Yes Fast
> Active
>   ge-0/2/3 PartnerNo   YesNo   No   No   Yes Fast   
> Passive
> LACP protocol:Receive State  Transmit State  Mux State
>   ge-0/2/3Defaulted   Fast periodic   Detached
> LACP info:Role System System   Port Port  
>   Port
>  priority identifier   priority   number  
>key
>   ge-0/2/3   Actor127  f8:c1:16:56:0a:001277  
> 11
>   ge-0/2/3 Partner  1  00:00:00:00:00:00  17  
>

Re: [c-nsp] Port-channel not working Juniper vs Cisco

2023-06-11 Thread Saku Ytti via cisco-nsp
You've changed JNPR from 30s to 1s, but not CSCO. I'm not sure if this
is the only problem, as insufficient data is shown about the state and
LACP PDUs.

I believe the command is 'lacp rate fast' or 'lacp period short' (named
differently to reduce the risk of operators getting bored). In your case, the
former.
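
A minimal sketch against the Nexus config quoted below, assuming the knob is
accepted on this platform/release:

  interface Ethernet1/41
    channel-group 41 mode active
    lacp rate fast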

On Sun, 11 Jun 2023 at 10:38, james list via cisco-nsp
 wrote:
>
> Dear expert
> we've an issue in setting up a port-channel between a Juniper EX4400 and a
> Cisco Nexus N9K-C93180YC-EX over an SX 1 Gbs link.
>
> We've implemented the following configuration but on Juniper side it is
> interface flapping while on Cisco side it remains down.
> Light levels seem ok.
>
> Has anyone ever experienced the same ? Any suggestions ?
>
> Thanks in advance for any hint
> Kind regards
> James
>
> JUNIPER *
>
> > show configuration interfaces ae10 | display set
> set interfaces ae10 description "to Cisco leaf"
> set interfaces ae10 aggregated-ether-options lacp active
> set interfaces ae10 aggregated-ether-options lacp periodic fast
> set interfaces ae10 unit 0 family ethernet-switching interface-mode trunk
> set interfaces ae10 unit 0 family ethernet-switching vlan members 301
>
> > show configuration interfaces ge-0/2/3 | display set
> set interfaces ge-0/2/3 description "to Cisco leaf"
> set interfaces ge-0/2/3 ether-options 802.3ad ae10
>
> > show vlans VLAN_301
>
> Routing instanceVLAN name Tag  Interfaces
> default-switch  VLAN_301  301
>ae10.0
>
>
>
>
> CISCO  ***
>
> interface Ethernet1/41
>   description <[To EX4400]>
>   switchport
>   switchport mode trunk
>   switchport trunk allowed vlan 301
>   channel-group 41 mode active
>   no shutdown
>
> interface port-channel41
>   description <[To EX4400]>
>   switchport
>   switchport mode trunk
>   switchport trunk allowed vlan 301
>
>
> # sh vlan id 301
>
> VLAN Name                             Status    Ports
> ---- -------------------------------- --------- -------------------------------
> 301  P2P_xxx                          active    Po1, Po41, Eth1/1, Eth1/41
>
> VLAN Type Vlan-mode
> ---- ---- ---------
> 301  enet CE
>
> Remote SPAN VLAN
> ----------------
> Disabled
>
> Primary  Secondary  Type             Ports
> -------  ---------  ---------------  -------------------------------------------
> ___
> cisco-nsp mailing list  cisco-nsp@puck.nether.net
> https://puck.nether.net/mailman/listinfo/cisco-nsp
> archive at http://puck.nether.net/pipermail/cisco-nsp/



-- 
  ++ytti


Re: [c-nsp] BGP Routes

2023-03-12 Thread Saku Ytti via cisco-nsp
On Sun, 12 Mar 2023 at 20:50, Mark Tinka via cisco-nsp
 wrote:

> ASR9K1 has more routes with better paths toward destinations via its
> upstream than ASR9K2 does.

Or at worst, a race.

You might want add-path or best-external for predictability and
improved convergence time.
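
For example, a rough IOS-XR add-path sketch (AS number and policy name are made
up; check platform and neighbour support before relying on it):

  route-policy ADD-PATH-ALL
    set path-selection all advertise
  end-policy
  !
  router bgp 65000
   address-family ipv4 unicast
    additional-paths send
    additional-paths receive
    additional-paths selection route-policy ADD-PATH-ALL
   !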

-- 
  ++ytti


Re: [c-nsp] NCS IOS-XR rant (was:Re: Internet border router recommendations and experiences)

2023-03-01 Thread Saku Ytti via cisco-nsp
On Wed, 1 Mar 2023 at 02:41, Phil Bedard via cisco-nsp
 wrote:

> With XR7 the idea was to mimic how things are done with Linux repos by having 
> a specific RPM repo for the routers and the patches which is managed similar 
> to Linux and that’s how all software is packaged now.  Dependencies are 
> resolved automatically, etc.   RPMs are installed as atomic operations, there 
> is no more install apply, etc.  Most customers do not want to manage an RPM 
> repo for their routers, so they just use whole images.

I believe this is why people prefer Linux containers to legacy
time-shared mutable boxes; mutable package management is actually an
anti-pattern today.

I wonder why I can upgrade my IRC client while keeping state, but I
can't upgrade my BGP.

There are two paths that consumers would accept:
   a) an immutable NOS: you give it an image, it boots up and converges in <5min
   b) a mutable NOS: process restarts keep state, and if an upgrade is hitful,
the forwarding stoppage should be measured in low seconds

I think a) is far easier to achieve.

-- 
  ++ytti


Re: [c-nsp] Cisco IOS switch SSH connections not working

2023-02-13 Thread Saku Ytti via cisco-nsp
On Tue, 14 Feb 2023 at 02:21, Lee Starnes via cisco-nsp
 wrote:

> So attempted to just remove the ACL and try. Still nothing. Lastly, I
> enabled telnet and tried to connect via telnet. Still nothing. I really
> don't want to restart the switch if there is any other way to resolve this.
>
> Anyone have any suggestions?

I assume you have connectivity to the box by some means, based on the
content of your email.

If packets are getting delivered to the device port, then it seems
like they fail to make it from HW to the control plane, a somewhat
common problem, and this would require a deep dive into how to debug each
step in the punt path. Some starting points follow, by no means a
complete view into the punt path.

You could try ELAM capture to see what PFC thinks of the packet, is it
going to punt it to software.
You could try pinnacle capture to see what the punt looks like.
- show plat cap buffer asic pinnacle  port 4 direction out
priority lo. ## sup => rp path
- show plat cap buffer filter data ipv4 IP_SA=
- show plat cap buffer data filt
- show plat cap buffer data sample 
You could try to look at input buffers on input interface, to see if
buffers are full, very often there are bugs where control-plane
packets get wedged, eventually filling the buffer precluding new
packets from entering software.
 - show buffer input X dump, if so
You could review TCP/TCB for stuck sessions you might need to clear manually
-  clear tcp tcb might help
You could review TTY/VTY session box thinks it is holding
 - clear line might help


You might not be able to fix the problem without a reboot, but you can
absolutely find incriminating evidence of the anatomy of the problem,
assuming you supply enough time on the keyboard.






-- 
  ++ytti


Re: [c-nsp] How can one escalate within Cisco TAC?

2023-02-08 Thread Saku Ytti via cisco-nsp
On Wed, 8 Feb 2023 at 09:48, Hank Nussbacher via cisco-nsp
 wrote:

> So how does one escalate such an issue within TAC?  Is there some secret
> email like escalati...@cisco.com or vp-...@cisco.com that one can contact?

You call your account team, express your grief and set expectations.
Then you have someone in your corner internally, which is far more
effective than externally trying to fix it.

It saddens me greatly, because it shouldn't work in a world full of
responsible adults, but having weekly case review calls works very
well, because then the account team will be embarrassed to say 'OK,
this hasn't moved since last week', and they ensure things move at least a
little bit. It steals 30min-1h per week per vendor of your time, but it
pays dividends. Work would be much more pleasurable if half the
world's white-collar workers weren't unemployed plat-card holders
cruising without output while looking down on people doing 3 jobs
and not qualifying for a mortgage.

-- 
  ++ytti


Re: [c-nsp] MTU and PMTUD

2022-12-08 Thread Saku Ytti via cisco-nsp
On Thu, 8 Dec 2022 at 11:40, Marcin Kurek  wrote:
>
> >  https://www.rfc-editor.org/rfc/rfc8654.html
>
> Thanks!
>
> > But it was [8936, 1240].min - so it was 'negotiated' here to the
> > smallest? If you change the 8936 end to 1239, then that will be used,
> > regardless who starts it.
>
> Yes, but why would XR advertise 1240 if I'm using 'tcp mss 8936' for that 
> neighbor?
> Initially I thought that the whole point of this command is to set MSS to a 
> fixed value of our choice.

I may have misunderstood you. Did you have 8936 configured on both
ends? I thought you had 8936 only on the CSR.

How I understood it:

*Dec  8 11:17:15.453: TCP0: Connection to 12.0.0.7:179, advertising MSS 8936
*Dec  8 11:17:15.456: TCP: tcb 7FFB9A6D64C0 connection to
12.0.0.7:179, peer MSS 1240, MSS is 1240

Local 12.0.0.13 CSR is advertising 8936 to remote 12.0.0.7
Remote 12.0.0.7 is advertising 1240
We settle to 1240

I guess you are saying the remote 12.0.0.7 is also configured to
8936? Then I agree: I wouldn't expect that, and can't explain it.
It almost sounds like a bug where the configuration command is only read
when IOS-XR establishes the connection, but not when it receives one?

-- 
  ++ytti


Re: [c-nsp] MTU and PMTUD

2022-12-08 Thread Saku Ytti via cisco-nsp
On Thu, 8 Dec 2022 at 11:25, Marcin Kurek  wrote:

> Interesting, but why would 'sh run' or 'write' raise an interrupt?
> Isn't this a branch in code that handles the CLI?

Maybe to access NVRAM? I don't know for sure, I didn't pursue it, I
just know I could observe it.

> I'm not sure if I'm reading it right - on the one hand, the interrupts are 
> disabled, but on the other hand, some CLI commands actually raise them?

I don't know if IOS does polling or interrupt for NIC packets, but
there are tons of reasons to raise interrupt, not just NIC.

> Would you mind elaborating on why going above 4k would mean "newish features" 
> and what are they?

https://www.rfc-editor.org/rfc/rfc8654.html

> So here CSR1kv is initiating the connection to XR box advertising MSS 8936 
> (as expected).
> However, peer MSS is 1240, which is not quite expected, considering XR config:

But it was [8936, 1240].min - so it was 'negotiated' here to the
smallest? If you change the 8936 end to 1239, then that will be used,
regardless who starts it.

-- 
  ++ytti


Re: [c-nsp] MTU and PMTUD

2022-12-07 Thread Saku Ytti via cisco-nsp
On Wed, 7 Dec 2022 at 22:20, Marcin Kurek  wrote:

> > I've seen Cisco presentations in the 90s and early 00s showing
> > significant benefit from it. I have no idea how accurate it is
> > today,nor why it would have made a difference in the past, like was
> > the CPU interrupt rate constrained?
>
> I'm sorry, I didn't get that part about constrained CPU interrupt rate?
> My simple way of looking into that is that if we bump up the MTU, we end
> up with fewer packets on the wire, so less processing on both sides.

To handle NIC-received packets you can do two things:

a) the CPU can get an interrupt, and handle the interrupt
b) interrupts can be disabled, and the CPU can poll to see if there are
packets to process

Mechanism a) is the norm and mechanism b) is modernish. It improves
PPS performance under heavy rates, at the cost of increased jitter
and latency, because it takes a variable amount of time to pick up a packet. In
software-based routers, like the VXR, if you had precise enough (thanks
Creanord!) measurements of network performance, you could observe
jitter during RANCID (thanks Heas!) collections, because 'show run'
and 'write' raise interrupts, which stops packet forwarding.

So fewer packets, fewer interrupts, might be one contributing factor. I don't
know what the overhead cost of processing packets is, but intuitively
I don't expect much improvement with large-MTU BGP packets. And at any
rate, going above 4k would mean newish features you don't have. But I
don't have high confidence in being right.

> Testing using XR 7.5.2 and older IOS XE, resulting MSS depends on who is
> passive/active.

MSS is 'negotiated' to the smallest. Much like BGP timers are
'negotiated' to the smallest (so your customer controls your BGP
timers, not you). Does this help to explain what you saw?



-- 
  ++ytti


Re: [c-nsp] MTU and PMTUD

2022-12-06 Thread Saku Ytti via cisco-nsp
Hey Marcin,

> XR without enabled PMTUD (default setting) means ~1240 bytes available
> for TCP payload.
>
> That seems to be a bit small, did you perform any kind of performance
> testing to see the difference between defaults and let's say 9000 for BGP?

I am embarrassed to say that I _just_ found out, literally weeks
ago, that the Junos BGP TCP window is 16kB. I also found a hidden command
(https://github.com/ytti/seeker-junos) to bump it up to 64kB. I am
working with JNPR to get public documentation for the hidden command,
to improve supportability and optics. I am separately working on hopes
of getting TCP window scaling.
I know that we are limited by the TCP window, as the BGP transfer speed is
within 0.5% of the theoretical max, and increasing 16kB to 64kB increases
BGP transfer speed exactly 4 times, still being capped by the TCP window.
I think Cisco can go to 130k, but won't by default.
Maybe after that issue is remedied I will review packet size.
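
To illustrate why the window is the cap (assuming, purely for the example, a
10 ms RTT, which is not a number from this thread), window-limited TCP throughput
is

  \text{throughput} \le \frac{W}{\text{RTT}} = \frac{16\,\text{kB}}{10\,\text{ms}} \approx 13\ \text{Mbit/s}

and quadrupling W to 64 kB quadruples that ceiling to roughly 52 Mbit/s, matching
the observed 4x speedup.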

> I'm thinking about RRs in particular, higher MTU (9000 vs 1200) should
> result in some performance benefit, at least from CPU point of view. I
> haven't tested this though.

I've seen Cisco presentations in the 90s and early 00s showing
significant benefit from it. I have no idea how accurate that is
today, nor why it would have made a difference in the past; was
the CPU interrupt rate constrained, for example?

> Agree. Thing is, PMTUD on XR is a global setting, so it does affect all
> TCP based protocols.

You can do 'tcp mss X' under neighbor stanza.

-- 
  ++ytti


Re: [c-nsp] MTU and PMTUD

2022-12-06 Thread Saku Ytti via cisco-nsp
> Assuming typical MPLS network running L2VPNs and L3VPNs, how do you
> configure MTU values for core interfaces? Do you set interface (L2) MTU,
> MPLS MTU and IP MTU separately?

We set the 'physical' MTU (in IOS-XR and Junos this is L2 but without the CRC)
as high as it went when the decision was made. Today I think you can
do 10k on Cisco 8000 and 16k on Juniper. We do not set MPLS or
IP MTUs separately in the core. On the edge you should always set the L3 MTU,
because you want the ability to add subinterfaces with a large MTU later,
so the physical MTU must already be large, as changing it will affect all
subinterfaces. This way you can later add a big-MTU subinterface without
affecting the other subinterfaces.
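
A minimal Junos-style sketch of that edge pattern (interface names, units and
values are mine, for illustration):

  # physical MTU set high once, so later subinterfaces are unaffected
  set interfaces et-0/0/1 vlan-tagging
  set interfaces et-0/0/1 mtu 9192
  # explicit L3 MTU per subinterface/service
  set interfaces et-0/0/1 unit 100 vlan-id 100
  set interfaces et-0/0/1 unit 100 family inet mtu 1500
  set interfaces et-0/0/1 unit 200 vlan-id 200
  set interfaces et-0/0/1 unit 200 family inet mtu 9000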

> Do you enable PMTUD for TCP based control plane protocols like BGP or LDP?

No, not at this time, our BGP transfer performance is limited by TCP
window-size, so larger packets would not do anything for us.  And LDP
has a trivial amount of stable data so it doesn't matter.

-- 
  ++ytti


Re: [c-nsp] Help with MAC control on redundant L2 links.

2022-11-13 Thread Saku Ytti via cisco-nsp
How would you imagine it works, in a world without any limitations?
This seems like a day-1 question about building Ethernet LANs, unless I'm
missing something.

Comcast now learns the 1701 MAC from every site.
A frame comes in with DMAC 1701.
Where does Comcast send it? Should it balance between the three sites? And
then re-receive it if it happened to send it to B or C?

My imagination doesn't stretch to how this could possibly work. And this
is exactly why Radia invented STP: to automatically remove all
redundancy until such time as it is needed.

What I would do is run MPLS point-to-point links A<->B, A<->C, B<->C (not
using any Comcast- or Zayo-provided multipoint service), and then run
EVPN on my edge ports. EVPN will even allow you to ECMP.

If that is not an option, run standard MSTP with two topologies: one
topology blocks on Zayo, the other on Comcast. Put half the VLANs on
Zayo-first and half on Comcast-first. Now, because STP blocks the
redundant ports, you are guaranteed to learn the [BC] MACs from a single port.
You will lose the RSTP convergence benefits, because your 'core' port is
facing more than one other 'core'; basically you have a 'hub' between your
core ports. So even this topology would be better if you don't buy a
multipoint service from a vendor, but point-to-points (without MAC
learning).
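
For illustration, a minimal IOS-style sketch of the two-instance MSTP split
(region name, VLAN split and costs are made up, and it must be configured
identically on every switch in the region):

  spanning-tree mode mst
  spanning-tree mst configuration
   name THREE-SITES
   revision 1
   instance 1 vlan 1-2047
   instance 2 vlan 2048-4094
  !
  interface TenGigabitEthernet1/1
   description to-Zayo
   spanning-tree mst 1 cost 100
   spanning-tree mst 2 cost 10000
  !
  interface TenGigabitEthernet1/2
   description to-Comcast
   spanning-tree mst 1 cost 10000
   spanning-tree mst 2 cost 100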

I would very strongly encourage you not to go the STP route, and to look towards EVPN.


On Sat, 12 Nov 2022 at 23:54, Howard Leadmon via cisco-nsp
 wrote:
>
>
>   I have an issue I am trying to sort out that I haven't run into
> before, and figured someone might be able to give me a few pointers.
>
>   I have three sites lets say A, B and C, and they all have redundant
> connectivity via two Layer 2 fiber loops with two different carriers.
>
>   We use Comcast and Zayo to reach the various sites, but realized that
> I was having connectivity issues, but after talking with Comcast, they
> are informing me the issue is the MAC being presented from different
> locations at the same time.
>
> So say at Site-A I am presenting a mac ending in 1701, I of course
> present this to both Comcast and Zayo, as expected.   Now at Site-B, I
> am being informed that when my switch receives that 1701 down the loop
> from Zayo, it is of course presenting it back to Comcast as a valid
> MAC.  As such they say they are learning this MAC from multiple
> locations at the same time, and they can only support learning it from
> one point, so they drop the MAC.   Of course Site-C has the same issue,
> also presenting what it knows from the other points.
>
> I thought setting 'spanning-tree link-type shared' allowed it to handle
> this, but I am guessing not well enough.  Well it might let the Cisco
> handle it, but apparently is playing havoc with the Ciena switches that
> Comcast is using.
>
> I looked at setting a mac filter (maybe I am looking at this wrong) to
> say if you saw this coming in, don't resend it back out to any other
> place. The issue I saw, was it only allowed it to be an ingress filter,
> which means I would discard the address completely which doesn't seem
> good either.
>
> I am sure there is a right way to handle this, but honestly not
> something I have encountered before.  If anyone could give me any hints,
> or point me to something that might help it would be appreciated..
>
>
> ---
> Howard Leadmon - how...@leadmon.net
> PBW Communications, LLC
> http://www.pbwcomm.com
>
> ___
> cisco-nsp mailing list  cisco-nsp@puck.nether.net
> https://puck.nether.net/mailman/listinfo/cisco-nsp
> archive at http://puck.nether.net/pipermail/cisco-nsp/



-- 
  ++ytti


Re: [c-nsp] NTP network design considerations

2022-10-15 Thread Saku Ytti via cisco-nsp
On Fri, 14 Oct 2022 at 23:32, Gert Doering via cisco-nsp
 wrote:

> For a true time geek, the time the rPIs provide is just not good
> enough (fluctuates +/- 20 usec if the rPI has work to do and gets
> warm) :-)

Meinberg does not do HW timestamping, nor does NTP. Almost every NIC
out there actually does support HW timestamping, but you'd need chrony
to actually enable the support.
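
A minimal chrony sketch (assuming a Linux host whose NIC and driver support
timestamping, and a chrony version that has the directive):

  # /etc/chrony.conf
  hwtimestamp *                   # enable NIC hardware timestamping where supported
  server ntp1.example.net iburst  # hypothetical upstream server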

When using Meinberg and clocking your host, almost all of the inaccuracy is
due to SW timestamping; the oscillator doesn't matter. Basically, with NTP
you are measuring the host's context-switch times in your jitter. This
is because networks have organically become extremely low-jitter
(because storing packets is expensive).
We see single-digit microsecond jitter across our network (on my M1
computer I see loopback pings having >50us stddev in latency), and
because the timing we use is a SW timestamp, our one-way delay
measurement precision is really measuring the NTP host's kernel/software
context-switch delays.
The most expensive oscillator would do nothing for us, but HW
timestamping and a cheap 2 EUR OCXO would radically improve the quality
of our one-way delay measurements.


-- 
  ++ytti


Re: [c-nsp] storm-control errdisable with no traffic or vlan

2022-08-05 Thread Saku Ytti via cisco-nsp
On Sat, 6 Aug 2022 at 05:27, Paul via cisco-nsp
 wrote:

> Storm control pps is bugged on the 10g ports on the older 4900
> platforms, 4948E , 4900M, sup6 platforms.

When you say 'bugged', what do you specifically mean? Do you mean the
platform can do PPS accounting in HW and there is a DDTS, or do you mean
the platform cannot do PPS accounting?
It is not a given at all that the platform does PPS accounting; for
example, the much more modern JNPR Paradise chip doesn't do it, while their
next-generation Triton does. And I know most Cisco pipeline gear
didn't do it either.

My guess would be that PPS is not supported by the hardware and the
behaviour is undefined, and you would need to poke the hardware to
understand what was actually programmed when you asked it for PPS, i.e.
it hasn't worked as desired on the 1GE either.

-- 
  ++ytti


Re: [c-nsp] storm-control errdisable with no traffic or vlan

2022-08-04 Thread Saku Ytti via cisco-nsp
Are you sure packet-based storm control is even supported on the platform?

What does 'show storm-control' say?

It is interesting that all the packets on the Ten interfaces are
multicast packets, but the packet count is so low that I don't think we can
put a lot of weight on it.

On Thu, 4 Aug 2022 at 13:52, Joe Maimon  wrote:
>
> Thanks for responding. I was looking for a controller like command to
> see maybe there were some malformed frames or something, but couldnt
> find one on this platform.
>
>
>
> Saku Ytti wrote:
> > On Thu, 4 Aug 2022 at 02:06, Joe Maimon via cisco-nsp
> >  wrote:
> >
> >> I have a vendor trying to turn up a 10gb link from their juniper mx to a
> >> cisco 4900M, using typical X2 LR.
> >>
> >> The link was being upgraded from a functioning 1gb. Same traffic.
> GigabitEthernet2/21 is up, line protocol is up (connected)
>Hardware is Gigabit Ethernet Port, address is 588d..a8f8 (bia
> 588d..a8f8)
>Description: Cx 12.KFxxx
>MTU 9198 bytes, BW 100 Kbit/sec, DLY 10 usec,
>   reliability 255/255, txload 1/255, rxload 1/255
>Encapsulation ARPA, loopback not set
>Keepalive set (10 sec)
>Full-duplex, 1000Mb/s, link type is auto, media type is 1000BaseLH
>input flow-control is off, output flow-control is off
>ARP type: ARPA, ARP Timeout 04:00:00
>Last input 00:00:00, output never, output hang never
>Last clearing of "show interface" counters never
>Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 0
>Queueing strategy: fifo
>Output queue: 0/40 (size/max)
>30 second input rate 263000 bits/sec, 147 packets/sec
>30 second output rate 663000 bits/sec, 201 packets/sec
>   5900382712 packets input, 1369541098917 bytes, 0 no buffer
>   Received 100294521 broadcasts (100294050 multicasts)
>   0 runts, 0 giants, 0 throttles
>   0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
>   0 input packets with dribble condition detected
>   7240868168 packets output, 6306357851288 bytes, 0 underruns
>   0 output errors, 0 collisions, 3 interface resets
>   0 unknown protocol drops
>   0 babbles, 0 late collision, 0 deferred
>   0 lost carrier, 0 no carrier
>   0 output buffer failures, 0 output buffers swapped out
>
> interface GigabitEthernet2/21
>   description Cx 12.KFxxx
>   switchport trunk allowed vlan 1660-1679
>   switchport trunk native vlan 1001
>   switchport mode trunk
>   switchport nonegotiate
>   switchport port-security maximum 100
>   switchport port-security aging time 3
>   switchport port-security aging type inactivity
>   switchport port-security
>   mtu 9198
>   load-interval 30
>   no cdp enable
>   storm-control broadcast level pps 10k
>   storm-control action shutdown
>   spanning-tree portfast edge
>
>
> >
> >> Even with switchport mode trunk and switchport allowed vlan none, with
> >> input counters in single digits, storm control immediately takes the
> >> port down after link up. There was negligible traffic on the link before
> >> or after the attempt.
> > Broadcast, multicast or unicast storm-control? What rate? Have you
> > tried increasing the rate? Can you provide the logs of storm-control
> > taking the port down?
>
> Gobs of this
>
> Aug  2 22:18:22: %PM-4-ERR_DISABLE: storm-control error detected on
> Te1/3, putting Te1/3 in err-disable state
> Aug  2 22:18:22: %STORM_CONTROL-3-SHUTDOWN: A packet storm was detected
> on Te1/3. The interface has been disabled.
> Aug  2 22:19:07: %PM-4-ERR_RECOVER: Attempting to recover from
> storm-control err-disable state on Te1/3
> Aug  2 22:19:10: %PM-4-ERR_DISABLE: storm-control error detected on
> Te1/3, putting Te1/3 in err-disable state
>
> Tried another X2
>
> Aug  2 22:31:33: %C4K_IOSINTF-5-TRANSCEIVERREMOVED: Slot=1 Port=3:
> Transceiver has been removed
> Aug  2 22:31:48: %C4K_IOSINTF-5-TRANSCEIVERINSERTED: Slot=1 Port=3:
> Transceiver has been inserted
> Aug  2 22:31:50: %SFF8472-5-THRESHOLD_VIOLATION: Te1/3: Tx power low
> alarm; Operating value: -40.0 dBm, Threshold value: -12.2 dBm.
> Aug  2 22:32:09: %SYS-5-CONFIG_I: Configured from console by joe on vty0
> (216.222.148.103)
> Aug  2 22:32:14: %PM-4-ERR_RECOVER: Attempting to recover from
> storm-control err-disable state on Te1/3
> Aug  2 22:32:17: %PM-4-ERR_DISABLE: storm-control error detected on
> Te1/3, putting Te1/3 in err-disable state
> Aug  2 22:32:17: %STORM_CONTROL-3-SHUTDOWN: A packet storm was detected
> on Te1/3. The interface has been disabled.
>
> Tried another port
>
> Aug  2 22:52:05: %C4K_IOSINTF-5-TRANSCEIVERINSERTED: Slot=3 Port=1:
> Transceiver has been inserted
> Aug  2 22:52:08: %LINK-3-UPDOWN: Interface TenGigabitEthernet3/1,
> changed state to up
> Aug  2 22:52:08: %STORM_CONTROL-3-FILTERED: A Broadcast storm detected
> on Te3/1. A packet filter action has been applied on the interface.
> Aug  2 22:52:09: %LINEPROTO-5-UPDOWN: Line protocol on Interface
> 

Re: [c-nsp] storm-control errdisable with no traffic or vlan

2022-08-03 Thread Saku Ytti via cisco-nsp
On Thu, 4 Aug 2022 at 02:06, Joe Maimon via cisco-nsp
 wrote:

> I have a vendor trying to turn up a 10gb link from their juniper mx to a
> cisco 4900M, using typical X2 LR.
>
> The link was being upgraded from a functioning 1gb. Same traffic.

10G will serialise packets 10x faster. Even if the average packet rate
is the same, you can have a 10x faster traffic rate on smaller
sampling intervals, which may exceed configured protection rates.

> Even with switchport mode trunk and switchport allowed vlan none, with
> input counters in single digits, storm control immediately takes the
> port down after link up. There was negligible traffic on the link before
> or after the attempt.

Broadcast, multicast or unicast storm-control? What rate? Have you
tried increasing the rate? Can you provide the logs of storm-control
taking the port down?

-- 
  ++ytti


Re: [c-nsp] ASR920: egress ACL on BDIs

2020-01-28 Thread Saku Ytti via cisco-nsp
On Tue, 28 Jan 2020 at 17:28, Nathan Lannine  wrote:

> FWIW we are actually using object ACLs.  What's the behavior then? Copy-swap? 
>  Is there a real name for that which I'm not remembering?

Object ACLs should indeed be copy-swap (atomic).

-- 
  ++ytti
___
cisco-nsp mailing list  cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/