Re: [c-nsp] Stange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread james list via cisco-nsp
Hi,
there are no errors on either interface (Cisco or Juniper).

Below are the logs of one event from both sides, followed by the config and LACP stats.

LOGS of one event at 16:39:

CISCO
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN:
Interface port-channel101 is down (No operational members)
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_PARENT_DOWN: Interface
port-channel101.2303 is down (Parent interface is down)
2024 Feb  9 16:39:36 NEXUS1 %BGP-5-ADJCHANGE:  bgp- [xxx] (xxx) neighbor
172.16.6.17 Down - sent:  other configuration change
2024 Feb  9 16:39:36 NEXUS1 %ETH_PORT_CHANNEL-5-FOP_CHANGED:
port-channel101: first operational port changed from Ethernet1/44 to none
2024 Feb  9 16:39:36 NEXUS1 %ETH_PORT_CHANNEL-5-PORT_DOWN: port-channel101:
Ethernet1/44 is down
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel101,bandwidth changed to 10 Kbit
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN:
Interface port-channel101 is down (No operational members)
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-SPEED: Interface port-channel101,
operational speed changed to 100 Gbps
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DUPLEX: Interface
port-channel101, operational duplex mode changed to Full
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_RX_FLOW_CONTROL: Interface
port-channel101, operational Receive Flow Control state changed to off
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_TX_FLOW_CONTROL: Interface
port-channel101, operational Transmit Flow Control state changed to off
2024 Feb  9 16:39:39 NEXUS1 %ETH_PORT_CHANNEL-5-PORT_UP: port-channel101:
Ethernet1/44 is up
2024 Feb  9 16:39:39 NEXUS1 %ETH_PORT_CHANNEL-5-FOP_CHANGED:
port-channel101: first operational port changed from none to Ethernet1/44
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel101,bandwidth changed to 1 Kbit
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_UP: Interface Ethernet1/44 is up
in Layer3
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_UP: Interface port-channel101 is
up in Layer3
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_UP: Interface
port-channel101.2303 is up in Layer3
2024 Feb  9 16:39:43 NEXUS1 %BGP-5-ADJCHANGE:  bgp- [xxx] (xxx) neighbor
172.16.6.17 Up

JUNIPER



Feb  9 16:39:35.813 2024  MX1 lacpd[31632]: LACPD_TIMEOUT: et-0/1/5: lacp
current while timer expired current Receive State: CURRENT
Feb  9 16:39:35.813 2024  MX1 lacpd[31632]: LACP_INTF_DOWN: ae49: Interface
marked down due to lacp timeout on member et-0/1/5
Feb  9 16:39:35.819 2024  MX1 kernel: lag_bundlestate_ifd_change: bundle
ae49: bundle IFD minimum bandwidth or minimum links not met, Bandwidth
(Current : Required) 0 : 1000 Number of links (Current : Required)
0 : 1
Feb  9 16:39:35.815 2024  MX1 lacpd[31632]: LACP_INTF_MUX_STATE_CHANGED:
ae49: et-0/1/5: Lacp state changed from COLLECTING_DISTRIBUTING to
ATTACHED, actor port state : |EXP|-|-|-|IN_SYNC|AGG|SHORT|ACT|, partner
port state : |-|-|DIS|COL|OUT_OF_SYNC|AGG|SHORT|ACT|
Feb  9 16:39:35.869 2024  MX1 rpd[31866]: bgp_ifachange_group:10697:
NOTIFICATION sent to 172.16.6.18 (External AS xxx): code 6 (Cease) subcode
6 (Other Configuration Change), Reason: Interface change for the peer-group
Feb  9 16:39:35.909 2024  MX1 mib2d[31909]: SNMP_TRAP_LINK_DOWN: ifIndex
684, ifAdminStatus up(1), ifOperStatus down(2), ifName ae49
Feb  9 16:39:36.083 2024  MX1 lacpd[31632]: LACP_INTF_MUX_STATE_CHANGED:
ae49: et-0/1/5: Lacp state changed from ATTACHED to
COLLECTING_DISTRIBUTING, actor port state :
|-|-|DIS|COL|IN_SYNC|AGG|SHORT|ACT|, partner port state :
|-|-|DIS|COL|IN_SYNC|AGG|SHORT|ACT|
Feb  9 16:39:36.089 2024  MX1 kernel: lag_bundlestate_ifd_change: bundle
ae49 is now Up. uplinks 1 >= min_links 1
Feb  9 16:39:36.089 2024  MX1 kernel: lag_bundlestate_ifd_change: bundle
ae49: bundle IFD minimum bandwidth or minimum links not met, Bandwidth
(Current : Required) 0 : 1000 Number of links (Current : Required)
0 : 1
Feb  9 16:39:36.085 2024  MX1 lacpd[31632]: LACP_INTF_MUX_STATE_CHANGED:
ae49: et-0/1/5: Lacp state changed from COLLECTING_DISTRIBUTING to
ATTACHED, actor port state : |-|-|-|-|IN_SYNC|AGG|SHORT|ACT|, partner port
state : |-|-|-|-|OUT_OF_SYNC|AGG|SHORT|ACT|
Feb  9 16:39:39.095 2024  MX1 lacpd[31632]: LACP_INTF_MUX_STATE_CHANGED:
ae49: et-0/1/5: Lacp state changed from ATTACHED to
COLLECTING_DISTRIBUTING, actor port state :
|-|-|DIS|COL|IN_SYNC|AGG|SHORT|ACT|, partner port state :
|-|-|-|-|IN_SYNC|AGG|SHORT|ACT|
Feb  9 16:39:39.101 2024  MX1 kernel: lag_bundlestate_ifd_change: bundle
ae49 is now Up. uplinks 1 >= min_links 1
Feb  9 16:39:39.109 2024  MX1 mib2d[31909]: SNMP_TRAP_LINK_UP: ifIndex 684,
ifAdminStatus up(1), ifOperStatus up(1), ifName ae49
Feb  9 16:39:41.190 2024  MX1 rpd[31866]: bgp_recv: read from peer
172.16.6.18 (External AS xxx) failed: Unknown error: 48110976
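
One detail worth extracting from the MX1 log: the SHORT flag in the actor and partner port state means both ends use the LACP short timeout, which IEEE 802.1AX defines as 3 seconds (three missed LACPDUs at the 1-second fast periodic rate). The sketch below is only a back-of-the-envelope estimate under that assumption. The function name is hypothetical, and the timestamp is copied from the LACPD_TIMEOUT line above. It estimates when MX1 last received an LACPDU from NEXUS1 before the timer expired.

```python
from datetime import datetime, timedelta

# Short_Timeout_Time from IEEE 802.1AX: the current_while timer is
# restarted to this value each time an LACPDU is received.
SHORT_TIMEOUT = timedelta(seconds=3)


def last_lacpdu_before(expiry: str, year: int = 2024) -> datetime:
    """Estimate the arrival time of the last LACPDU before a timeout.

    `expiry` is a Junos-style timestamp such as 'Feb  9 16:39:35.813'.
    """
    # split() collapses the double space syslog uses for one-digit days.
    month, day, clock = expiry.split()
    expired_at = datetime.strptime(
        f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S.%f"
    )
    return expired_at - SHORT_TIMEOUT


# LACPD_TIMEOUT on et-0/1/5 was logged at 16:39:35.813.
estimate = last_lacpdu_before("Feb  9 16:39:35.813")
print(estimate.strftime("%H:%M:%S.%f")[:-3])  # 16:39:32.813
```

If the estimate holds, and assuming the clocks on both devices are in sync, the last LACPDU seen by MX1 arrived roughly three seconds before NEXUS1 logged Ethernet1/44 going down at 16:39:36.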


CONFIG:

CISCO

NEXUS1# sh run int 

Re: [c-nsp] Stange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread james list via cisco-nsp
The DC technicians state that the cables are the same in both DCs and are
direct, with no patch panel.

Cheers

On Sun, 11 Feb 2024 at 11:20, nivalMcNd d wrote:

> Could it be that DC1 connects the links through an intermediary patch panel
> and you are seeing fibre disturbance? That can be ruled out if the
> interfaces on the DC1 links do not go down.
___
cisco-nsp mailing list  cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/


Re: [c-nsp] Stange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread james list via cisco-nsp
Yes, same version.
No traffic is currently exchanged; only the BGP peering is set up.

On Sun, 11 Feb 2024 at 11:16, Igor Sukhomlinov <
dvalinsw...@gmail.com> wrote:

> Hi James,
>
> Do you happen to run the same software on all nexuses and all MXes?
> Do the DC1 and DC2 BGP sessions exchange the same amount of routing updates
> across the links?


Re: [c-nsp] Stange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread james list via cisco-nsp
Hi,
One thing I omitted to say is that BGP runs over an LACP bundle that
currently has just one 100 Gb/s member interface.

I see that the issue is triggered on the Cisco side when the Ethernet
interface appears to go into the Initializing state:


2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN:
Interface port-channel101 is down (No operational members)
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_PARENT_DOWN: Interface
port-channel101.2303 is down (Parent interface is down)
2024 Feb  9 16:39:36 NEXUS1 %BGP-5-ADJCHANGE:  bgp- [xxx] (xxx) neighbor
172.16.6.17 Down - sent:  other configuration change
2024 Feb  9 16:39:36 NEXUS1 %ETH_PORT_CHANNEL-5-FOP_CHANGED:
port-channel101: first operational port changed from Ethernet1/44 to none
2024 Feb  9 16:39:36 NEXUS1 %ETH_PORT_CHANNEL-5-PORT_DOWN: port-channel101:
Ethernet1/44 is down
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel101,bandwidth changed to 10 Kbit
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface
Ethernet1/44 is down (Initializing)
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN:
Interface port-channel101 is down (No operational members)
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-SPEED: Interface port-channel101,
operational speed changed to 100 Gbps
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_DUPLEX: Interface
port-channel101, operational duplex mode changed to Full
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_RX_FLOW_CONTROL: Interface
port-channel101, operational Receive Flow Control state changed to off
2024 Feb  9 16:39:36 NEXUS1 %ETHPORT-5-IF_TX_FLOW_CONTROL: Interface
port-channel101, operational Transmit Flow Control state changed to off
2024 Feb  9 16:39:39 NEXUS1 %ETH_PORT_CHANNEL-5-PORT_UP: port-channel101:
Ethernet1/44 is up
2024 Feb  9 16:39:39 NEXUS1 %ETH_PORT_CHANNEL-5-FOP_CHANGED:
port-channel101: first operational port changed from none to Ethernet1/44
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface
port-channel101,bandwidth changed to 1 Kbit
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_UP: Interface Ethernet1/44 is up
in Layer3
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_UP: Interface port-channel101 is
up in Layer3
2024 Feb  9 16:39:39 NEXUS1 %ETHPORT-5-IF_UP: Interface
port-channel101.2303 is up in Layer3
2024 Feb  9 16:39:43 NEXUS1 %BGP-5-ADJCHANGE:  bgp- [xxx] (xxx) neighbor
172.16.6.17 Up
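
To put numbers on this event, the NX-OS timestamps above can be subtracted directly. This is a minimal sketch: the helper name nxos_time is hypothetical, and the four timestamps are copied from the PORT_DOWN, PORT_UP, and ADJCHANGE lines above.

```python
from datetime import datetime


def nxos_time(stamp: str) -> datetime:
    """Parse an NX-OS syslog timestamp such as '2024 Feb  9 16:39:36'."""
    # split()/join normalises the double space before one-digit days.
    return datetime.strptime(" ".join(stamp.split()), "%Y %b %d %H:%M:%S")


port_down = nxos_time("2024 Feb  9 16:39:36")  # ETH_PORT_CHANNEL-5-PORT_DOWN
port_up = nxos_time("2024 Feb  9 16:39:39")    # ETH_PORT_CHANNEL-5-PORT_UP
bgp_down = nxos_time("2024 Feb  9 16:39:36")   # BGP-5-ADJCHANGE ... Down
bgp_up = nxos_time("2024 Feb  9 16:39:43")     # BGP-5-ADJCHANGE ... Up

link_outage = (port_up - port_down).total_seconds()
bgp_outage = (bgp_up - bgp_down).total_seconds()
print(f"Ethernet1/44 out of the bundle for ~{link_outage:.0f} s")
print(f"BGP session down for ~{bgp_outage:.0f} s")
```

With one-second log resolution these figures are approximate: about 3 s for the member link and about 7 s for the BGP session.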

Cheers
James

On Sun, 11 Feb 2024 at 11:12, Gert Doering wrote:

> Hi,
>
> On Sun, Feb 11, 2024 at 11:08:29AM +0100, james list via cisco-nsp wrote:
> > we notice BGP flaps
>
> Any particular error message?  BGP flaps can happen due to many different
> reasons, and usually $C is fairly good at logging the reason.
>
> Any interface errors, packet errors, ping packets lost?
>
> "BGP flaps" *can* be related to lower layer issues (so: interface counters,
> error counters, extended pings) or to something unrelated, like "MaxPfx
> exceeded"...
>
> gert
> --
> "If was one thing all people took for granted, was conviction that if you
>  feed honest figures into a computer, honest figures come out. Never doubted
>  it myself till I met a computer with a sense of humor."
>  Robert A. Heinlein, The Moon is a Harsh Mistress
>
> Gert Doering - Munich, Germany
> g...@greenie.muc.de
>


Re: [c-nsp] Stange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread nivalMcNd d via cisco-nsp
Could it be that DC1 connects the links through an intermediary patch panel
and you are seeing fibre disturbance? That can be ruled out if the
interfaces on the DC1 links do not go down.

On Sun, Feb 11, 2024, 21:16 Igor Sukhomlinov via cisco-nsp <
cisco-nsp@puck.nether.net> wrote:

> Hi James,
>
> Do you happen to run the same software on all nexuses and all MXes?
> Do the DC1 and DC2 BGP sessions exchange the same amount of routing updates
> across the links?


Re: [c-nsp] Stange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread Igor Sukhomlinov via cisco-nsp
Hi James,

Do you happen to run the same software on all nexuses and all MXes?
Do the DC1 and DC2 BGP sessions exchange the same amount of routing updates
across the links?




[c-nsp] Stange issue on 100 Gbs interconnection Juniper - Cisco

2024-02-11 Thread james list via cisco-nsp
Dear experts,
we have a couple of BGP peerings over 100 Gb/s interconnections between
Juniper (MX10003) and Cisco (Nexus N9K-C9364C) in two different datacenters,
like this:

DC1
MX1 -- bgp -- NEXUS1
MX2 -- bgp -- NEXUS2

DC2
MX3 -- bgp -- NEXUS3
MX4 -- bgp -- NEXUS4

The issue is that sporadically (i.e. every 1 to 3 days) we see BGP flaps,
only in DC1, on both interconnections (not at the same time). There is
still no traffic, because once we noticed the flaps we blocked the
deployment to production.

We've already changed the SFPs (we moved the ones from DC2 to DC1 and vice
versa) and the cables on both interconnections at DC1, without any
improvement.

SFPs we use in both DCs:

Juniper - QSFP-100G-SR4-T2
Cisco - QSFP-100G-SR4

over MPO cable OM4.

The distance is 70 m in DC1 and 80 m in DC2, so it is shorter where we see
the issue.
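
As a quick sanity check on those distances: 100GBASE-SR4 is specified for up to 100 m over OM4 (70 m over OM3), so both runs are within the nominal reach, assuming the cabling really is OM4 end to end. A minimal sketch of that check, with hypothetical names:

```python
# Nominal 100GBASE-SR4 reach in metres per fibre grade (IEEE 802.3bm).
SR4_REACH_M = {"OM3": 70, "OM4": 100}

# Link lengths quoted in this post.
links_m = {"DC1": 70, "DC2": 80}


def margin_m(length_m: int, fibre: str = "OM4") -> int:
    """Return the remaining reach margin in metres (negative if exceeded)."""
    return SR4_REACH_M[fibre] - length_m


for dc, length in links_m.items():
    print(f"{dc}: {length} m, margin {margin_m(length)} m on OM4")
```

DC1 has the larger margin of the two, so link length alone does not explain why only DC1 flaps.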

Any ideas or suggestions on what to check or do?

Thanks in advance
Cheers
James