Re: [c-nsp] STP over Port-channel issue

2024-05-06 Thread james list via cisco-nsp
Thanks,
good point on LACP fast, we'll test it.
RSTP should in any case be slower than 3 seconds with LACP fast.
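
For reference, a minimal sketch of the change under test, assuming LACP fast rate is applied to the Po12 members on the C9500s at both ends of the inter-DC link (IOS XE 17.x syntax):

  ! sketch only - request 1 s LACP PDUs so a dead member is detected in ~3 s
  interface range TwentyFiveGigE1/0/47 - 48
   lacp rate fast
  ! per Saku's note below, the other half of the suggestion is keeping STP
  ! convergence slower than the 3 s LACP timeout (plain PVST rather than rapid-PVST)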

Cheers
James



On Mon, 6 May 2024 at 15:22, Saku Ytti wrote:

> On Mon, 6 May 2024 at 15:53, james list via cisco-nsp
>  wrote:
>
> > The question: since the PO remains up, why do we see this behaviour?
> > Are BPDUs sent over just one link (i.e. the higher interface)?
>
> Correct.
>
> > how can we solve this issue keeping this scenario ?
> > moving to RSTP could solve ?
>
> No.
>
> I understand you want the topology to remain intact as long as there is
> at least 1 member up, but I'm not sure we can guarantee that. I think
> if you set LACP to fast, it'll fail in at most 3 s, and if you ensure STP
> fails more slowly (i.e. don't use rapid PVST), you'll probably just see
> the BPDUs switch physical interface, instead of an STP convergence.
>
> You'd need something similar to micro-BFD on LAGs, where a BFD PDU is
> sent on each member instead of an unspecified single member, but
> afaik this does not exist.
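
For comparison, the per-member mechanism referenced here does exist for routed LAGs as micro-BFD (RFC 7130); a rough Junos sketch on a hypothetical bundle ae0 with hypothetical addresses, noting that it protects L3 forwarding over the bundle rather than STP:

  # sketch only - RFC 7130 runs an independent BFD session on every LAG member
  set interfaces ae0 aggregated-ether-options bfd-liveness-detection minimum-interval 300
  set interfaces ae0 aggregated-ether-options bfd-liveness-detection local-address 10.0.0.1
  set interfaces ae0 aggregated-ether-options bfd-liveness-detection neighbor 10.0.0.2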
>
> So what you really need is L3/MPLS topology :/.
>
> --
>   ++ytti
>
___
cisco-nsp mailing list  cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/


[c-nsp] STP over Port-channel issue

2024-05-06 Thread james list via cisco-nsp
Dear experts,
a customer of mine has a legacy environment with 4 x Cisco Catalyst 9500 (IOS XE
17.09.03) connected in a square topology with 2 links per connection, where each
pair of links is bundled into a single virtual port (port-channel).
Loops are managed with PVSTP.
Two C9500s are in DC1 while the other two are in DC2.

9500-1 == 9500-3
   || ||
9500-2 == 9500-4

The issue is: when a provider circuit flaps, i.e. one of the two links
connecting the C9500s between the two DCs goes down and up, we see that one port
flaps and the port-channel remains UP, but PVSTP reconverges and there is a
communication failure.

The questions: since the PO remains up, why do we see this behaviour?
Are BPDUs sent over just one link (i.e. the higher interface)?
How can we solve this issue while keeping this topology?
Would moving to RSTP solve it?

Thanks in advance
Cheers
James

Here are some logs:

May  6 11:48:35.590: %LINEPROTO-5-UPDOWN: Line protocol on Interface
TwentyFiveGigE1/0/48, changed state to down
May  6 11:48:36.593: %LINK-3-UPDOWN: Interface TwentyFiveGigE1/0/48,
changed state to down
May  6 11:48:40.343: %LINK-3-UPDOWN: Interface TwentyFiveGigE1/0/48,
changed state to up
May  6 11:48:44.171: %LINEPROTO-5-UPDOWN: Line protocol on Interface
TwentyFiveGigE1/0/48, changed state to up



#sh etherchannel summary
Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
1      Po1(SU)       LACP        Twe1/0/1(P)   Twe1/0/2(P)
12     Po12(SU)      LACP        Twe1/0/47(P)  Twe1/0/48(P)



Port-channel: Po12(Primary Aggregator)



Age of the Port-channel   = 187d:15h:09m:01s
Logical slot/port   = 9/12  Number of ports = 2
HotStandBy port = null
Port state  = Port-channel Ag-Inuse
Protocol            =   LACP
Port security   = Disabled
Fast-switchover = disabled
Fast-switchover Dampening = disabled

Ports in the Port-channel:

Index   Load   Port        EC state        No of bits
------+------+------------+----------------+-----------
  0     00     Twe1/0/47   Active           0
  0     00     Twe1/0/48   Active           0

Time since last port bundled:0d:01h:50m:00s Twe1/0/48
Time since last port Un-bundled: 0d:01h:50m:09s Twe1/0/48


# sh spanning-tree interface port-channel 12 state
VLAN0015            blocking
VLAN0016            forwarding



# sh spanning-tree vlan 15 detail

 VLAN0015 is executing the rstp compatible Spanning Tree protocol
  Bridge Identifier has priority 20480, sysid 15, address 8c84.4236.5fe0
  Configured hello time 2, max age 20, forward delay 15, transmit
hold-count 6
  Current root has priority 8207, address 444c.a830.ff71
  Root port is 2089 (Port-channel1), cost of root path is 2000
  Topology change flag not set, detected flag not set
  Number of topology changes 73 last change occurred 01:52:13 ago
  from Port-channel1
  Times:  hold 1, topology change 35, notification 2
  hello 2, max age 20, forward delay 15
  Timers: hello 0, topology change 0, notification 0, aging 300

 Port 2089 (Port-channel1) of VLAN0015 is root forwarding
   Port path cost 1000, Port priority 128, Port Identifier 128.2089.
   Designated root has priority 8207, address 444c.a830.ff71
   Designated bridge has priority 16399, address cc79.d760.73e0
   Designated port id is 128.2089, designated path cost 1000
   Timers: message age 15, forward delay 0, hold 0
   Number of transitions to forwarding state: 18
   Link type is point-to-point by default
   BPDU: sent 289, received 4527220

 Port 2100 (Port-channel12) of VLAN0015 is alternate blocking
   Port path cost 1000, Port priority 128, Port Identifier 128.2100.
   Designated root has priority 8207, address 444c.a830.ff71
   Designated bridge has priority 12303, address 444c.a830.fc01
   Designated port id is 128.107, designated path cost 1999
   Timers: message age 15, forward delay 0, hold 0
   Number of transitions to forwarding state: 20
   Link type is point-to-point by default
   BPDU: sent 340, received 4525304
___
cisco-nsp mailing list  cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/


Re: [c-nsp] [j-nsp] Strange issue on 100 Gb/s interconnection Juniper - Cisco

2024-02-11 Thread james list via cisco-nsp
Hi,
I'd like to test with LACP slow, then we can see if the physical interface still
flaps...
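
A sketch of what that test could look like on the Junos side, using the bundle and member names from this thread (assumption: only the LACP rate is changed for the test):

  # sketch - ask the peer for slow (30 s) LACP PDUs on ae49 instead of fast (1 s)
  set interfaces ae49 aggregated-ether-options lacp periodic slow
  # the NX-OS member would keep its default rate: interface Ethernet1/44 -> lacp rate normal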

Thanks for your support

On Sun, 11 Feb 2024 at 18:02, Saku Ytti wrote:

> On Sun, 11 Feb 2024 at 17:52, james list  wrote:
>
> > - why does the physical interface flap in DC1 if it is related to LACP?
>
> 16:39:35.813 Juniper reports LACP timeout (so the problem started at
> 16:39:32; was traffic passing at 32, 33, 34 seconds?)
> 16:39:36.xxx Cisco reports interface down, long after the problem has
> already started
>
> Why Cisco reports the physical interface down, I'm not sure. But clearly
> the problem was already happening before the interface went down, and the
> first log entry is the LACP timeout, which occurs 3 s after the problem starts.
> Perhaps Juniper asserts RFI for some reason? Perhaps Cisco resets the
> physical interface once it is removed from the LACP bundle?
>
> > - why does the same setup in DC2 not report issues?
>
> If this is an LACP-related software issue, there could be a difference not
> yet identified. You need to gather more information, such as how ping
> behaves throughout this event, particularly before the syslog entries. And if
> ping still works up until the syslog, you almost certainly have a software
> issue with LACP injection at Cisco, or more likely LACP punt at Juniper.
>
> --
>   ++ytti
>
___
cisco-nsp mailing list  cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/


Re: [c-nsp] [j-nsp] Strange issue on 100 Gb/s interconnection Juniper - Cisco

2024-02-11 Thread james list via cisco-nsp
Hi,
I have a couple of questions related to your idea:
- why does the physical interface flap in DC1 if it is related to LACP?
- why does the same setup in DC2 not report issues?

NEXUS01# sh logging | in  Initia | last 15
2024 Jan 17 22:37:49 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Jan 18 23:54:25 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Jan 19 00:58:13 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Jan 19 07:15:04 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Jan 22 16:03:13 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Jan 25 21:32:29 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Jan 26 18:41:12 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Jan 28 05:07:20 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Jan 29 04:06:52 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Jan 30 03:09:44 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Feb  5 18:13:20 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Feb  6 02:17:25 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Feb  6 22:00:24 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Feb  9 09:29:36 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Feb  9 16:39:36 NEXUS01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)

On Sun, 11 Feb 2024 at 14:36, Saku Ytti wrote:

> On Sun, 11 Feb 2024 at 15:24, james list  wrote:
>
> > While on Juniper when the issue happens I always see:
> >
> > show log messages | last 440 | match LACPD_TIMEOUT
> > Jan 25 21:32:27.948 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5:
> lacp current while timer expired current Receive State: CURRENT
> 
> > Feb  9 16:39:35.813 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5:
> lacp current while timer expired current Receive State: CURRENT
>
> Ok, so the problem always starts with Juniper seeing 3 seconds without LACP
> PDUs, i.e. missing 3 consecutive LACP PDUs. It would be good to ping
> while this problem is happening, to see whether ping stops 3 s before the
> syslog lines, or at the same time as the syslog lines.
> If ping stops 3 s before, it's a link problem from Cisco to Juniper.
> If ping stops at syslog time (my guess), it's a software problem.
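
One hedged way to keep such a ping running on the Nexus side is an IP SLA probe toward the BGP peer address seen in the logs (NX-OS sketch; exact sub-commands may vary by release):

  feature sla sender
  ip sla 10
    icmp-echo 172.16.6.17
    timeout 500
    frequency 1
  ip sla schedule 10 life forever start-time now
  ! correlate afterwards with: show ip sla statistics 10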
>
> There is unfortunately a lot of bug surface here, both on the inject and the
> punt path. You could be hitting PR1541056 on the Juniper end. You
> could test for this by removing distributed LACP handling with 'set
> routing-options ppm no-delegate-processing'.
> You could also do a packet capture for LACP on both ends, to try to see
> if LACP was sent by Cisco and received by the capture, but not by the system.
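
A rough way to take those captures at the control plane on each end (so they show what each system actually receives; option names may differ slightly by release):

  # Juniper - RE-bound traffic only; LACP uses ethertype 0x8809
  monitor traffic interface et-0/1/5 matching "ether proto 0x8809" no-resolve
  # Cisco NX-OS - inband (CPU) capture filtered to LACP
  ethanalyzer local interface inband display-filter lacp limit-captured-frames 0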
>
>
> --
>   ++ytti
>
___
cisco-nsp mailing list  cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/


Re: [c-nsp] [j-nsp] Strange issue on 100 Gb/s interconnection Juniper - Cisco

2024-02-11 Thread james list via cisco-nsp
On the Cisco side I see the physical interface go down (Initializing); what does that mean?

While on Juniper when the issue happens I always see:

show log messages | last 440 | match LACPD_TIMEOUT
Jan 25 21:32:27.948 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT
Jan 26 18:41:12.514 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT
Jan 28 05:07:20.283 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT
Jan 29 04:06:51.768 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT
Jan 30 03:09:43.923 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT
Feb  5 18:13:20.158 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT
Feb  6 02:17:23.703 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT
Feb  6 22:00:23.758 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT
Feb  9 09:29:35.728 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT
Feb  9 16:39:35.813 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT

On Sun, 11 Feb 2024 at 14:10, Saku Ytti wrote:

> Hey James,
>
> You shared this off-list, I think it's sufficiently material to share.
>
> 2024 Feb  9 16:39:36 NEXUS1
> %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN: Interface
> port-channel101 is down (No operational members)
> 2024 Feb  9 16:39:36 NEXUS1 %ETH_PORT_CHANNEL-5-PORT_DOWN:
> port-channel101: Ethernet1/44 is down
> Feb  9 16:39:35.813 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5:
> lacp current while timer expired current Receive State: CURRENT
> Feb  9 16:39:35.813 2024  MX1 lacpd[31632]: LACP_INTF_DOWN: ae49:
> Interface marked down due to lacp timeout on member et-0/1/5
>
> We can't know the order of events here, because no subsecond precision
> is enabled on the Cisco end.
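
If it helps, millisecond timestamps can be enabled on the Nexus so the two logs can be ordered against the MX (single-line NX-OS sketch):

  logging timestamp milliseconds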
>
> But if the failure started from the interface going down, it would take
> 3 seconds for Juniper to realise the LACP failure. However, we can see that it
> happens in less than 1 s, so we can determine the interface was not
> down first; the first problem was Juniper not receiving 3 consecutive
> LACP PDUs, 1 s apart, prior to noticing any kind of interface-state
> related problem.
>
> Is this always the order of events? Does it always happen with Juniper
> noticing problems receiving LACP PDUs first?
>
>
> On Sun, 11 Feb 2024 at 14:55, james list via juniper-nsp
>  wrote:
> >
> > Hi
> >
> > 1) cable has been replaced with a brand new one, they said that to check
> an
> > MPO 100 Gbs cable is not that easy
> >
> > 3) no errors reported on both side
> >
> > 2) here the output of cisco and juniper
> >
> > NEXUS1# sh interface eth1/44 transceiver details
> > Ethernet1/44
> > transceiver is present
> > type is QSFP-100G-SR4
> > name is CISCO-INNOLIGHT
> > part number is TR-FC85S-NC3
> > revision is 2C
> > serial number is INL27050TVT
> > nominal bitrate is 25500 MBit/sec
> > Link length supported for 50/125um OM3 fiber is 70 m
> > cisco id is 17
> > cisco extended id number is 220
> > cisco part number is 10-3142-03
> > cisco product id is QSFP-100G-SR4-S
> > cisco version id is V03
> >
> > Lane Number:1 Network Lane
> >SFP Detail Diagnostics Information (internal calibration)
> >
> >
> 
> > Current  Alarms  Warnings
> > Measurement HighLow High  Low
> >
> >
> 
> >   Temperature   30.51 C75.00 C -5.00 C 70.00 C
> 0.00 C
> >   Voltage3.28 V 3.63 V  2.97 V  3.46 V
> 3.13 V
> >   Current6.40 mA   12.45 mA 3.25 mA12.45 mA
>  3.25
> > mA
> >   Tx Power   0.98 dBm   5.39 dBm  -12.44 dBm2.39 dBm
>  -8.41
> > dBm
> >   Rx Power  -1.60 dBm   5.39 dBm  -14.31 dBm2.39 dBm
> -10.31
> > dBm
> >   Transmit Fault Count = 0
> >
> >
> 
> >   Note: ++  high-alarm; +  high-warning; --  low-alarm; -  low-warning
> >
> > Lane Number:2 Network Lane
> >SFP Detail Diagnostics Information (internal calibration)
> >
> >
> 
> > Current  Alarms  Warnings
> > Measurement HighLow High  Low
> >
> >
> 

Re: [c-nsp] [j-nsp] Strange issue on 100 Gb/s interconnection Juniper - Cisco

2024-02-11 Thread james list via cisco-nsp
Hi

1) the cable has been replaced with a brand-new one; they said that testing an
MPO 100 Gb/s cable is not that easy

3) no errors reported on either side

2) here is the output from Cisco and Juniper

NEXUS1# sh interface eth1/44 transceiver details
Ethernet1/44
transceiver is present
type is QSFP-100G-SR4
name is CISCO-INNOLIGHT
part number is TR-FC85S-NC3
revision is 2C
serial number is INL27050TVT
nominal bitrate is 25500 MBit/sec
Link length supported for 50/125um OM3 fiber is 70 m
cisco id is 17
cisco extended id number is 220
cisco part number is 10-3142-03
cisco product id is QSFP-100G-SR4-S
cisco version id is V03

Lane Number:1 Network Lane
   SFP Detail Diagnostics Information (internal calibration)


                Current       Alarms                  Warnings
                Measurement   High        Low         High        Low
  --------------------------------------------------------------------------
  Temperature   30.51 C       75.00 C     -5.00 C     70.00 C      0.00 C
  Voltage        3.28 V        3.63 V      2.97 V      3.46 V      3.13 V
  Current        6.40 mA      12.45 mA     3.25 mA    12.45 mA     3.25 mA
  Tx Power       0.98 dBm      5.39 dBm  -12.44 dBm    2.39 dBm   -8.41 dBm
  Rx Power      -1.60 dBm      5.39 dBm  -14.31 dBm    2.39 dBm  -10.31 dBm
  Transmit Fault Count = 0


  Note: ++  high-alarm; +  high-warning; --  low-alarm; -  low-warning

Lane Number:2 Network Lane
   SFP Detail Diagnostics Information (internal calibration)


                Current       Alarms                  Warnings
                Measurement   High        Low         High        Low
  --------------------------------------------------------------------------
  Temperature   30.51 C       75.00 C     -5.00 C     70.00 C      0.00 C
  Voltage        3.28 V        3.63 V      2.97 V      3.46 V      3.13 V
  Current        6.40 mA      12.45 mA     3.25 mA    12.45 mA     3.25 mA
  Tx Power       0.62 dBm      5.39 dBm  -12.44 dBm    2.39 dBm   -8.41 dBm
  Rx Power      -1.18 dBm      5.39 dBm  -14.31 dBm    2.39 dBm  -10.31 dBm
  Transmit Fault Count = 0


  Note: ++  high-alarm; +  high-warning; --  low-alarm; -  low-warning

Lane Number:3 Network Lane
   SFP Detail Diagnostics Information (internal calibration)


                Current       Alarms                  Warnings
                Measurement   High        Low         High        Low
  --------------------------------------------------------------------------
  Temperature   30.51 C       75.00 C     -5.00 C     70.00 C      0.00 C
  Voltage        3.28 V        3.63 V      2.97 V      3.46 V      3.13 V
  Current        6.40 mA      12.45 mA     3.25 mA    12.45 mA     3.25 mA
  Tx Power       0.87 dBm      5.39 dBm  -12.44 dBm    2.39 dBm   -8.41 dBm
  Rx Power       0.01 dBm      5.39 dBm  -14.31 dBm    2.39 dBm  -10.31 dBm
  Transmit Fault Count = 0


  Note: ++  high-alarm; +  high-warning; --  low-alarm; -  low-warning

Lane Number:4 Network Lane
   SFP Detail Diagnostics Information (internal calibration)


                Current       Alarms                  Warnings
                Measurement   High        Low         High        Low
  --------------------------------------------------------------------------
  Temperature   30.51 C       75.00 C     -5.00 C     70.00 C      0.00 C
  Voltage        3.28 V        3.63 V      2.97 V      3.46 V      3.13 V
  Current        6.40 mA      12.45 mA     3.25 mA    12.45 mA     3.25 mA
  Tx Power       0.67 dBm      5.39 dBm  -12.44 dBm    2.39 dBm   -8.41 dBm
  Rx Power       0.11 dBm      5.39 dBm  -14.31 dBm    2.39 dBm  -10.31 dBm
  Transmit Fault Count = 0


  Note: ++  high-alarm; +  high-warning; --  low-alarm; -  low-warning



MX1> show interfaces diagnostics optics et-1/0/5
Physical interface: et-1/0/5
Module temperature:  38 degrees C / 100 degrees F
Module voltage:  3.2740 V
Module temperature high alarm :  Off
Module temperature low alarm  :  Off
Module temperature high warning   :  Off
Module temperature low warning:  Off
Module voltage high alarm   

Re: [c-nsp] Strange issue on 100 Gb/s interconnection Juniper - Cisco

2024-02-11 Thread james list via cisco-nsp
Hi,
there are no errors on either interface (Cisco or Juniper).

Here follow the logs of one event from both sides, plus the config and LACP stats.

Logs of one event at 16:39:

CISCO
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN:
Interface port-channel101 is down (No operational members)
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_PARENT_DOWN: Interface
port-channel101.2303 is down (Parent interface is down)
2024 Feb  9 16:39:36 NEXUS1 %BGP-5-ADJCHANGE:  bgp- [xxx] (xxx) neighbor
172.16.6.17 Down - sent:  other configuration change
2024 Feb  9 16:39:36 NEXUS1 %ETH_PORT_CHANNEL-5-FOP_CHANGED:
port-channel101: first operational port changed from Ethernet1/44 to none
2024 Feb  9 16:39:36 NEXUS1 %ETH_PORT_CHANNEL-5-PORT_DOWN: port-channel101:
Ethernet1/44 is down
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel101,bandwidth changed to 10 Kbit
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN:
Interface port-channel101 is down (No operational members)
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-SPEED: Interface port-channel101,
operational speed changed to 100 Gbps
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DUPLEX: Interface
port-channel101, operational duplex mode changed to Full
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_RX_FLOW_CONTROL: Interface
port-channel101, operational Receive Flow Control state changed to off
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_TX_FLOW_CONTROL: Interface
port-channel101, operational Transmit Flow Control state changed to off
2024 Feb  9 16:39:39 NEXUS1 %ETH_PORT_CHANNEL-5-PORT_UP: port-channel101:
Ethernet1/44 is up
2024 Feb  9 16:39:39 NEXUS1 %ETH_PORT_CHANNEL-5-FOP_CHANGED:
port-channel101: first operational port changed from none to Ethernet1/44
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel101,bandwidth changed to 1 Kbit
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_UP: Interface Ethernet1/44 is up
in Layer3
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_UP: Interface port-channel101 is
up in Layer3
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_UP: Interface
port-channel101.2303 is up in Layer3
2024 Feb  9 16:39:43 NEXUS1 %BGP-5-ADJCHANGE:  bgp- [xxx] (xxx) neighbor
172.16.6.17 Up



Feb  9 16:39:35.813 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT
Feb  9 16:39:35.813 2024  MX1 lacpd[31632]: LACP_INTF_DOWN: ae49: Interface
marked down due to lacp timeout on member et-0/1/5
Feb  9 16:39:35.819 2024  MX1 kernel: lag_bundlestate_ifd_change: bundle
ae49: bundle IFD minimum bandwidth or minimum links not met, Bandwidth
(Current : Required) 0 : 1000 Number of links (Current : Required)
0 : 1
Feb  9 16:39:35.815 2024  MX1 lacpd[31632]: LACP_INTF_MUX_STATE_CHANGED:
ae49: et-0/1/5: Lacp state changed from COLLECTING_DISTRIBUTING to
ATTACHED, actor port state : |EXP|-|-|-|IN_SYNC|AGG|SHORT|ACT|, partner
port state : |-|-|DIS|COL|OUT_OF_SYNC|AGG|SHORT|ACT|
Feb  9 16:39:35.869 2024  MX1 rpd[31866]: bgp_ifachange_group:10697:
NOTIFICATION sent to 172.16.6.18 (External AS xxx): code 6 (Cease) subcode
6 (Other Configuration Change), Reason: Interface change for the peer-group
Feb  9 16:39:35.909 2024  MX1 mib2d[31909]: SNMP_TRAP_LINK_DOWN: ifIndex
684, ifAdminStatus up(1), ifOperStatus down(2), ifName ae49
Feb  9 16:39:36.083 2024  MX1 lacpd[31632]: LACP_INTF_MUX_STATE_CHANGED:
ae49: et-0/1/5: Lacp state changed from ATTACHED to
COLLECTING_DISTRIBUTING, actor port state :
|-|-|DIS|COL|IN_SYNC|AGG|SHORT|ACT|, partner port state :
|-|-|DIS|COL|IN_SYNC|AGG|SHORT|ACT|
Feb  9 16:39:36.089 2024  MX1 kernel: lag_bundlestate_ifd_change: bundle
ae49 is now Up. uplinks 1 >= min_links 1
Feb  9 16:39:36.089 2024  MX1 kernel: lag_bundlestate_ifd_change: bundle
ae49: bundle IFD minimum bandwidth or minimum links not met, Bandwidth
(Current : Required) 0 : 1000 Number of links (Current : Required)
0 : 1
Feb  9 16:39:36.085 2024  MX1 lacpd[31632]: LACP_INTF_MUX_STATE_CHANGED:
ae49: et-0/1/5: Lacp state changed from COLLECTING_DISTRIBUTING to
ATTACHED, actor port state : |-|-|-|-|IN_SYNC|AGG|SHORT|ACT|, partner port
state : |-|-|-|-|OUT_OF_SYNC|AGG|SHORT|ACT|
Feb  9 16:39:39.095 2024  MX1 lacpd[31632]: LACP_INTF_MUX_STATE_CHANGED:
ae49: et-0/1/5: Lacp state changed from ATTACHED to
COLLECTING_DISTRIBUTING, actor port state :
|-|-|DIS|COL|IN_SYNC|AGG|SHORT|ACT|, partner port state :
|-|-|-|-|IN_SYNC|AGG|SHORT|ACT|
Feb  9 16:39:39.101 2024  MX1 kernel: lag_bundlestate_ifd_change: bundle
ae49 is now Up. uplinks 1 >= min_links 1
Feb  9 16:39:39.109 2024  MX1 mib2d[31909]: SNMP_TRAP_LINK_UP: ifIndex 684,
ifAdminStatus up(1), ifOperStatus up(1), ifName ae49
Feb  9 16:39:41.190 2024  MX1 rpd[31866]: bgp_recv: read from peer
172.16.6.18 (External AS xxx) failed: Unknown error: 48110976


CONFIG:

CISCO

NEXUS1# sh run int 

Re: [c-nsp] Strange issue on 100 Gb/s interconnection Juniper - Cisco

2024-02-11 Thread james list via cisco-nsp
The DC technicians state that the cables are the same in both DCs and are direct
runs, with no patch panel.

Cheers

On Sun, 11 Feb 2024 at 11:20, nivalMcNd d wrote:

> Could it be that DC1 connects the links over an intermediate patch panel and you
> are facing fibre disturbance? That can be ruled out if the interfaces on the DC1
> links do not go down.
>
> On Sun, Feb 11, 2024, 21:16 Igor Sukhomlinov via cisco-nsp <
> cisco-nsp@puck.nether.net> wrote:
>
>> Hi James,
>>
>> Do you happen to run the same software on all nexuses and all MXes?
>> Do the DC1 and DC2 bgp session exchange the same amount of routing updates
>> across the links?
>>
>>
>> On Sun, Feb 11, 2024, 21:09 james list via cisco-nsp <
>> cisco-nsp@puck.nether.net> wrote:
>>
>> > Dear experts
>> > we have a couple of BGP peers over a 100 Gbs interconnection between
>> > Juniper (MX10003) and Cisco (Nexus N9K-C9364C) in two different
>> datacenters
>> > like this:
>> >
>> > DC1
>> > MX1 -- bgp -- NEXUS1
>> > MX2 -- bgp -- NEXUS2
>> >
>> > DC2
>> > MX3 -- bgp -- NEXUS3
>> > MX4 -- bgp -- NEXUS4
>> >
>> > The issue we see is that sporadically (ie every 1 to 3 days) we notice
>> BGP
>> > flaps only in DC1 on both interconnections (not at the same time),
>> there is
>> > still no traffic since once noticed the flaps we have blocked deploy on
>> > production.
>> >
>> > We've already changed SPF (we moved the ones from DC2 to DC1 and
>> viceversa)
>> > and cables on both the interconnetion at DC1 without any solution.
>> >
>> > SFP we use in both DCs:
>> >
>> > Juniper - QSFP-100G-SR4-T2
>> > Cisco - QSFP-100G-SR4
>> >
>> > over MPO cable OM4.
>> >
>> > Distance is DC1 70 mt and DC2 80 mt, hence is less where we see the
>> issue.
>> >
>> > Any idea or suggestion what to check or to do ?
>> >
>> > Thanks in advance
>> > Cheers
>> > James
>> > ___
>> > cisco-nsp mailing list  cisco-nsp@puck.nether.net
>> > https://puck.nether.net/mailman/listinfo/cisco-nsp
>> > archive at http://puck.nether.net/pipermail/cisco-nsp/
>> >
>> ___
>> cisco-nsp mailing list  cisco-nsp@puck.nether.net
>> https://puck.nether.net/mailman/listinfo/cisco-nsp
>> archive at http://puck.nether.net/pipermail/cisco-nsp/
>>
>
___
cisco-nsp mailing list  cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/


Re: [c-nsp] Strange issue on 100 Gb/s interconnection Juniper - Cisco

2024-02-11 Thread james list via cisco-nsp
Yes, same version.
Currently no traffic exchange is in place, just the BGP peering setup;
no traffic.

On Sun, 11 Feb 2024 at 11:16, Igor Sukhomlinov <dvalinsw...@gmail.com> wrote:

> Hi James,
>
> Do you happen to run the same software on all nexuses and all MXes?
> Do the DC1 and DC2 bgp session exchange the same amount of routing updates
> across the links?
>
>
> On Sun, Feb 11, 2024, 21:09 james list via cisco-nsp <
> cisco-nsp@puck.nether.net> wrote:
>
>> Dear experts
>> we have a couple of BGP peers over a 100 Gbs interconnection between
>> Juniper (MX10003) and Cisco (Nexus N9K-C9364C) in two different
>> datacenters
>> like this:
>>
>> DC1
>> MX1 -- bgp -- NEXUS1
>> MX2 -- bgp -- NEXUS2
>>
>> DC2
>> MX3 -- bgp -- NEXUS3
>> MX4 -- bgp -- NEXUS4
>>
>> The issue we see is that sporadically (ie every 1 to 3 days) we notice BGP
>> flaps only in DC1 on both interconnections (not at the same time), there
>> is
>> still no traffic since once noticed the flaps we have blocked deploy on
>> production.
>>
>> We've already changed SPF (we moved the ones from DC2 to DC1 and
>> viceversa)
>> and cables on both the interconnetion at DC1 without any solution.
>>
>> SFP we use in both DCs:
>>
>> Juniper - QSFP-100G-SR4-T2
>> Cisco - QSFP-100G-SR4
>>
>> over MPO cable OM4.
>>
>> Distance is DC1 70 mt and DC2 80 mt, hence is less where we see the issue.
>>
>> Any idea or suggestion what to check or to do ?
>>
>> Thanks in advance
>> Cheers
>> James
>> ___
>> cisco-nsp mailing list  cisco-nsp@puck.nether.net
>> https://puck.nether.net/mailman/listinfo/cisco-nsp
>> archive at http://puck.nether.net/pipermail/cisco-nsp/
>>
>
___
cisco-nsp mailing list  cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/


Re: [c-nsp] Strange issue on 100 Gb/s interconnection Juniper - Cisco

2024-02-11 Thread james list via cisco-nsp
Hi,
one thing I omitted to say is that BGP runs over a LACP bundle with currently
just one 100 Gb/s interface.

I see that the issue is triggered on the Cisco side when the Ethernet interface
appears to go into the Initializing state:


2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN:
Interface port-channel101 is down (No operational members)
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_PARENT_DOWN: Interface
port-channel101.2303 is down (Parent interface is down)
2024 Feb  9 16:39:36 NEXUS1 %BGP-5-ADJCHANGE:  bgp- [xxx] (xxx) neighbor
172.16.6.17 Down - sent:  other configuration change
2024 Feb  9 16:39:36 NEXUS1 %ETH_PORT_CHANNEL-5-FOP_CHANGED:
port-channel101: first operational port changed from Ethernet1/44 to none
2024 Feb  9 16:39:36 NEXUS1 %ETH_PORT_CHANNEL-5-PORT_DOWN: port-channel101:
Ethernet1/44 is down
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel101,bandwidth changed to 10 Kbit
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN:
Interface port-channel101 is down (No operational members)
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-SPEED: Interface port-channel101,
operational speed changed to 100 Gbps
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DUPLEX: Interface
port-channel101, operational duplex mode changed to Full
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_RX_FLOW_CONTROL: Interface
port-channel101, operational Receive Flow Control state changed to off
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_TX_FLOW_CONTROL: Interface
port-channel101, operational Transmit Flow Control state changed to off
2024 Feb  9 16:39:39 NEXUS1 %ETH_PORT_CHANNEL-5-PORT_UP: port-channel101:
Ethernet1/44 is up
2024 Feb  9 16:39:39 NEXUS1 %ETH_PORT_CHANNEL-5-FOP_CHANGED:
port-channel101: first operational port changed from none to Ethernet1/44
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel101,bandwidth changed to 1 Kbit
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_UP: Interface Ethernet1/44 is up
in Layer3
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_UP: Interface port-channel101 is
up in Layer3
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_UP: Interface
port-channel101.2303 is up in Layer3
2024 Feb  9 16:39:43 NEXUS1 %BGP-5-ADJCHANGE:  bgp- [xxx] (xxx) neighbor
172.16.6.17 Up

Cheers
James

On Sun, 11 Feb 2024 at 11:12, Gert Doering wrote:

> Hi,
>
> On Sun, Feb 11, 2024 at 11:08:29AM +0100, james list via cisco-nsp wrote:
> > we notice BGP flaps
>
> Any particular error message?  BGP flaps can happen due to many different
> reasons, and usually $C is fairly good at logging the reason.
>
> Any interface errors, packet errors, ping packets lost?
>
> "BGP flaps" *can* be related to lower layer issues (so: interface counters,
> error counters, extended pings) or to something unrelated, like "MaxPfx
> exceeded"...
>
> gert
> --
> "If was one thing all people took for granted, was conviction that if you
>  feed honest figures into a computer, honest figures come out. Never
> doubted
>  it myself till I met a computer with a sense of humor."
>  Robert A. Heinlein, The Moon is a Harsh
> Mistress
>
> Gert Doering - Munich, Germany
> g...@greenie.muc.de
>
___
cisco-nsp mailing list  cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/


[c-nsp] Strange issue on 100 Gb/s interconnection Juniper - Cisco

2024-02-11 Thread james list via cisco-nsp
Dear experts,
we have a couple of BGP peers over a 100 Gb/s interconnection between
Juniper (MX10003) and Cisco (Nexus N9K-C9364C) in two different datacenters,
like this:

DC1
MX1 -- bgp -- NEXUS1
MX2 -- bgp -- NEXUS2

DC2
MX3 -- bgp -- NEXUS3
MX4 -- bgp -- NEXUS4

The issue we see is that sporadically (i.e. every 1 to 3 days) we notice BGP
flaps, only in DC1, on both interconnections (not at the same time). There is
still no traffic, because once we noticed the flaps we blocked the deployment to
production.

We've already swapped the SFPs (we moved the ones from DC2 to DC1 and vice versa)
and the cables on both interconnections at DC1, without any improvement.

SFPs we use in both DCs:

Juniper - QSFP-100G-SR4-T2
Cisco - QSFP-100G-SR4

over OM4 MPO cable.

Distances are 70 m in DC1 and 80 m in DC2, so it is shorter where we see the issue.

Any idea or suggestion on what to check or do?

Thanks in advance
Cheers
James
___
cisco-nsp mailing list  cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/


Re: [c-nsp] Port-channel not working Juniper vs Cisco

2023-06-11 Thread james list via cisco-nsp
,
ifAdminStatus up(1), ifOperStatus up(1), ifName ge-0/2/3
Jun 11 12:18:20.621 2023  mib2d[9005]: SNMP_TRAP_LINK_UP: ifIndex 585,
ifAdminStatus up(1), ifOperStatus up(1), ifName ge-0/2/3.0
Jun 11 12:18:25.221 2023  mib2d[9005]: SNMP_TRAP_LINK_DOWN: ifIndex 745,
ifAdminStatus up(1), ifOperStatus down(2), ifName ge-0/2/3
Jun 11 12:18:26.621 2023  mib2d[9005]: SNMP_TRAP_LINK_UP: ifIndex 745,
ifAdminStatus up(1), ifOperStatus up(1), ifName ge-0/2/3
Jun 11 12:18:26.621 2023  mib2d[9005]: SNMP_TRAP_LINK_UP: ifIndex 585,
ifAdminStatus up(1), ifOperStatus up(1), ifName ge-0/2/3.0
Jun 11 12:18:31.221 2023  mib2d[9005]: SNMP_TRAP_LINK_DOWN: ifIndex 745,
ifAdminStatus up(1), ifOperStatus down(2), ifName ge-0/2/3
Jun 11 12:18:32.621 2023  mib2d[9005]: SNMP_TRAP_LINK_UP: ifIndex 745,
ifAdminStatus up(1), ifOperStatus up(1), ifName ge-0/2/3
Jun 11 12:18:32.621 2023  mib2d[9005]: SNMP_TRAP_LINK_UP: ifIndex 585,
ifAdminStatus up(1), ifOperStatus up(1), ifName ge-0/2/3.0
Jun 11 12:18:36.721 2023  mib2d[9005]: SNMP_TRAP_LINK_DOWN: ifIndex 745,
ifAdminStatus up(1), ifOperStatus down(2), ifName ge-0/2/3
Jun 11 12:18:37.721 2023  mib2d[9005]: SNMP_TRAP_LINK_UP: ifIndex 745,
ifAdminStatus up(1), ifOperStatus up(1), ifName ge-0/2/3
Jun 11 12:18:37.721 2023  mib2d[9005]: SNMP_TRAP_LINK_UP: ifIndex 585,
ifAdminStatus up(1), ifOperStatus up(1), ifName ge-0/2/3.0
Jun 11 12:18:42.221 2023  mib2d[9005]: SNMP_TRAP_LINK_DOWN: ifIndex 745,
ifAdminStatus up(1), ifOperStatus down(2), ifName ge-0/2/3
Jun 11 12:18:42.721 2023  mib2d[9005]: SNMP_TRAP_LINK_UP: ifIndex 745,
ifAdminStatus up(1), ifOperStatus up(1), ifName ge-0/2/3
Jun 11 12:18:42.721 2023  mib2d[9005]: SNMP_TRAP_LINK_UP: ifIndex 585,
ifAdminStatus up(1), ifOperStatus up(1), ifName ge-0/2/3.0
Jun 11 12:18:47.721 2023  mib2d[9005]: SNMP_TRAP_LINK_DOWN: ifIndex 745,
ifAdminStatus up(1), ifOperStatus down(2), ifName ge-0/2/3
Jun 11 12:18:48.721 2023  mib2d[9005]: SNMP_TRAP_LINK_UP: ifIndex 745,
ifAdminStatus up(1), ifOperStatus up(1), ifName ge-0/2/3
Jun 11 12:18:48.721 2023  mib2d[9005]: SNMP_TRAP_LINK_UP: ifIndex 585,
ifAdminStatus up(1), ifOperStatus up(1), ifName ge-0/2/3.0
Jun 11 12:18:53.221 2023  mib2d[9005]: SNMP_TRAP_LINK_DOWN: ifIndex 745,
ifAdminStatus up(1), ifOperStatus down(2), ifName ge-0/2/3

CISCO

#sh int eth1/41 transceiver calibrations
Ethernet1/41
transceiver is present
type is 1000base-SX
name is CISCO-FINISAR
part number is FTLF8519P2BCL-CS
revision is 
serial number is FNS11150LN8
nominal bitrate is 1300 MBit/sec
cisco id is 3
cisco extended id number is 4
cisco part number is 30-1301-02

SFP is internally calibrated



# sh int eth1/41
Ethernet1/41 is down (Link not connected)
admin state is up, Dedicated Interface
  Belongs to Po41
  Hardware: 100/1000/1/25000 Ethernet, address: 502f.a8ea.bbb0 (bia
502f.a8ea.bbb0)
  Description: <[To EX4400]>
  MTU 1500 bytes, BW 2500 Kbit, DLY 10 usec
  reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, medium is broadcast
  Port mode is trunk
  auto-duplex, auto-speed, media type is 1G
  Beacon is turned off
  Auto-Negotiation is turned on  FEC mode is Auto
  Input flow-control is off, output flow-control is off
  Auto-mdix is turned off
  Rate mode is dedicated
  Switchport monitor is off
  EtherType is 0x8100
  EEE (efficient-ethernet) : n/a
  Last link flapped never
  Last clearing of "show interface" counters 3d20h
  0 interface resets
  Load-Interval #1: 30 seconds
30 seconds input rate 0 bits/sec, 0 packets/sec
30 seconds output rate 0 bits/sec, 0 packets/sec
input rate 0 bps, 0 pps; output rate 0 bps, 0 pps
  Load-Interval #2: 5 minute (300 seconds)
300 seconds input rate 0 bits/sec, 0 packets/sec
300 seconds output rate 0 bits/sec, 0 packets/sec
input rate 0 bps, 0 pps; output rate 0 bps, 0 pps
  RX
0 unicast packets  0 multicast packets  0 broadcast packets
0 input packets  0 bytes
0 jumbo packets  0 storm suppression bytes
0 runts  0 giants  0 CRC  0 no buffer
0 input error  0 short frame  0 overrun   0 underrun  0 ignored
0 watchdog  0 bad etype drop  0 bad proto drop  0 if down drop
0 input with dribble  0 input discard
0 Rx pause
  TX
0 unicast packets  0 multicast packets  0 broadcast packets
0 output packets  0 bytes
0 jumbo packets
0 output error  0 collision  0 deferred  0 late collision
0 lost carrier  0 no carrier  0 babble  0 output discard
0 Tx pause


On Sun, 11 Jun 2023 at 09:59, Saku Ytti wrote:

> You've changed JNPR from 30 s to 1 s, but not CSCO. I'm not sure if this
> is the only problem, as insufficient data is shown about the state and
> the LACP PDUs.
>
> I believe the command is 'lacp rate fast' or 'lacp period short', to
> reduce the risk of operators getting bored. In your case, the former.
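
A sketch of the Cisco-side change being suggested, applied to the NX-OS member port shown in the configs in this thread:

  interface Ethernet1/41
    channel-group 41 mode active
    lacp rate fast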
>
> On Sun, 11 Jun 2023 at 10:38, james list via cisco-nsp
>  wrote:
> >
> > Dear expert

[c-nsp] Port-channel not working Juniper vs Cisco

2023-06-11 Thread james list via cisco-nsp
Dear experts,
we have an issue setting up a port-channel between a Juniper EX4400 and a
Cisco Nexus N9K-C93180YC-EX over a 1 Gb/s SX link.

We've implemented the following configuration, but on the Juniper side the
interface keeps flapping, while on the Cisco side it stays down.
Light levels look OK.

Has anyone ever experienced the same? Any suggestions?

Thanks in advance for any hint
Kind regards
James

JUNIPER *

> show configuration interfaces ae10 | display set
set interfaces ae10 description "to Cisco leaf"
set interfaces ae10 aggregated-ether-options lacp active
set interfaces ae10 aggregated-ether-options lacp periodic fast
set interfaces ae10 unit 0 family ethernet-switching interface-mode trunk
set interfaces ae10 unit 0 family ethernet-switching vlan members 301

> show configuration interfaces ge-0/2/3 | display set
set interfaces ge-0/2/3 description "to Cisco leaf"
set interfaces ge-0/2/3 ether-options 802.3ad ae10

> show vlans VLAN_301

Routing instanceVLAN name Tag  Interfaces
default-switch  VLAN_301  301
   ae10.0




CISCO  ***

interface Ethernet1/41
  description <[To EX4400]>
  switchport
  switchport mode trunk
  switchport trunk allowed vlan 301
  channel-group 41 mode active
  no shutdown

interface port-channel41
  description <[To EX4400]>
  switchport
  switchport mode trunk
  switchport trunk allowed vlan 301


# sh vlan id 301

VLAN Name        Status    Ports
---- ----------- --------- -------------------------------
301  P2P_xxx     active    Po1, Po41, Eth1/1, Eth1/41

VLAN Type  Vlan-mode
---- ----- ---------
301  enet  CE

Remote SPAN VLAN
----------------
Disabled

Primary  Secondary  Type             Ports
-------  ---------  ---------------  -------------------------------------------
___
cisco-nsp mailing list  cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/