[c-nsp] MPLS LDP and BGP Neighbor flapping constantly

Justin Shore Wed, 04 Mar 2009 22:38:11 -0800

This afternoon I stumbled across a problem with an LDP session between a 7613 and a 7201. Actually, both LDP and iBGP were flapping every 10 seconds or so. Both interfaces are configured for MPLS, LDP, and IS-IS (with AUTH and BFD, though BFD isn't enabled on the interface itself yet), with an interface MTU of 9000 and a CLNS MTU of 1496. Nothing too fancy. The systems as a whole are configured with MPLS graceful-restart, LDP, no mpls ip propagate-ttl, and the LDP router-ID on a loopback:


# 7201
mpls label protocol ldp
no mpls ip propagate-ttl
mpls ldp graceful-restart
mpls ldp router-id Loopback0 force


# 7613
mls mpls tunnel-recir
mpls traffic-eng tunnels
mpls ldp graceful-restart
no mpls ip propagate-ttl
mpls label protocol ldp
mpls ldp router-id Loopback0 force

This morning at 7:05 the router stopped responding to SNMP queries for about 15m. The load was about 13 before. Cacti shows the load doubling in the 10m prior to the 15m of nothing. When it came back the load was just shy of 50 and stayed there for about 30m. After that it stayed at around 30-35 for the next 7.5hrs before I noticed the BGP flapping issue and shut down the peer for troubleshooting. The load dropped back to around 16, higher than it was before the hiccup this morning. I'm at a loss to adequately explain why the load has been so jacked. I think the 30-35 load was because of the BGP flapping and the slightly higher load now is due to the LDP flapping issue. That's my best guess.

Does anyone know how to troubleshoot an LDP neighbor flapping issue? The 7613 is logging this:

730278: Mar 4 20:43:48.696 CST: LDP GR: Received FT Sess TLV from 10.64.0.34:0 (fl 0x1, rs 0x0, rconn 0, rcov 120000)
730279: Mar 4 20:43:48.696 CST: LDP GR: MFI cutover wait delay = 600000, Forwarding State Hold Timer = 600000
730280: Mar 4 20:43:48.696 CST: LDP GR: searching for down nbr record (10.64.0.34:0, 10.64.0.178)
730281: Mar 4 20:43:48.696 CST: LDP GR: Added FT Sess TLV (Rconn 120000, Rcov 0) to INIT msg to 10.64.0.34:0


The 7201 is logging this:

054705: Mar 5 00:28:19.599 CST: LDP GR: GR session 10.64.0.20:0:: lost
054706: Mar 5 00:28:19.599 CST: LDP GR: down nbr 10.64.0.20:0:: created [1 total]
054707: Mar 5 00:28:19 CST: %LDP-5-GR: GR session 10.64.0.20:0 (inst. 3): interrupted--recovery pending
054708: Mar 5 00:28:19.599 CST: LDP GR: GR session 10.64.0.20:0:: bindings retained
054709: Mar 5 00:28:19.599 CST: LDP GR: down nbr 10.64.0.20:0:: state change (None -> Reconnect-Wait)
054710: Mar 5 00:28:19.599 CST: LDP GR: down nbr 10.64.0.20:0:: reconnect timer started [120000 msecs]
054711: Mar 5 00:28:19.599 CST: LDP GR: down nbr 10.64.0.20:0:: added to bindings task queue [1 entries]
054712: Mar 5 00:28:19 CST: %LDP-5-NBRCHG: LDP Neighbor 10.64.0.20:0 (0) is DOWN (Received error notification from peer: Shut down)

054713: Mar 5 00:28:25.923 CST: LDP GR: searching for down nbr record (10.64.0.20:0, 10.64.0.179)
054714: Mar 5 00:28:25.923 CST: LDP GR: search for down nbr record (10.64.0.20:0, 10.64.0.179) returned 10.64.0.20:0
054715: Mar 5 00:28:25.923 CST: LDP GR: Added FT Sess TLV (Rconn 0, Rcov 120000) to INIT msg to 10.64.0.20:0
054716: Mar 5 00:28:25.947 CST: LDP GR: Received FT Sess TLV from 10.64.0.20:0 (fl 0x1, rs 0x0, rconn 120000, rcov 0)
054717: Mar 5 00:28:25.947 CST: LDP GR: GR session 10.64.0.20:0:: established
054718: Mar 5 00:28:25.947 CST: LDP GR: GR session 10.64.0.20:0:: found down nbr 10.64.0.20:0
054719: Mar 5 00:28:25.947 CST: LDP GR: down nbr 10.64.0.20:0:: reconnect timer stopped
054720: Mar 5 00:28:25.947 CST: LDP GR: down nbr 10.64.0.20:0:: state change (Reconnect-Wait -> Recovering)
054721: Mar 5 00:28:25.947 CST: LDP GR: down nbr 10.64.0.20:0:: recovery timer started [1 msecs]
054722: Mar 5 00:28:25 CST: %LDP-5-GR: GR session 10.64.0.20:0 (inst. 4): starting graceful recovery
054723: Mar 5 00:28:25 CST: %LDP-5-NBRCHG: LDP Neighbor 10.64.0.20:0 (4) is UP
054724: Mar 5 00:28:25.951 CST: LDP GR: down nbr 10.64.0.20:0:: recovery timer expired
054725: Mar 5 00:28:25 CST: %LDP-5-GR: GR session 10.64.0.20:0 (inst. 4): completed graceful recovery
054726: Mar 5 00:28:25.951 CST: LDP GR: down nbr 10.64.0.20:0:: destroying record [0 left]
054727: Mar 5 00:28:25.951 CST: LDP GR: down nbr 10.64.0.20:0:: state change (Recovering -> Delete-Wait)

054728: Mar 5 00:28:28.091 CST: LDP GR: Tagcon querying for up to 12 bindings update tasks [table 0]
054729: Mar 5 00:28:28.091 CST: LDP GR: down nbr 10.64.0.20:0:: requesting bindings DEL for {10.64.0.20:0, 3}
054730: Mar 5 00:28:28.091 CST: LDP GR: down nbr 10.64.0.20:0:: removed from bindings task queue [0 entries]
054731: Mar 5 00:28:28.091 CST: LDP GR: Requesting 1 bindings update tasks [0 left in queue]
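
For anyone wanting to quantify the flapping from logs like the ones above, here is a minimal sketch in Python that pulls the %LDP-5-NBRCHG lines out of a saved log and pairs each DOWN with the following UP for the same neighbor. The function names (ldp_transitions, outage_seconds) and the year default are assumptions made for illustration; IOS syslog lines carry no year, so 2009 is taken from the date of this post. It is not a Cisco tool, just a rough way to measure how long each outage lasts.

```python
import re
from datetime import datetime

# Matches IOS syslog lines such as:
#   054712: Mar 5 00:28:19 CST: %LDP-5-NBRCHG: LDP Neighbor 10.64.0.20:0 (0) is DOWN (...)
NBRCHG = re.compile(
    r"(?P<mon>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+(?P<time>\d{2}:\d{2}:\d{2})(?:\.\d+)?"
    r"\s+\S+:\s+%LDP-5-NBRCHG: LDP Neighbor\s*(?P<nbr>[\d.]+:\d+)\s*\(\d+\) is (?P<state>UP|DOWN)"
)


def ldp_transitions(log_text, year=2009):
    """Return (timestamp, neighbor, state) tuples in log order.

    The year is an assumption: the log lines themselves do not include one.
    """
    events = []
    for m in NBRCHG.finditer(log_text):
        stamp = f"{year} {m['mon']} {m['day']} {m['time']}"
        ts = datetime.strptime(stamp, "%Y %b %d %H:%M:%S")
        events.append((ts, m["nbr"], m["state"]))
    return events


def outage_seconds(events):
    """Pair each DOWN with the next UP for the same neighbor."""
    down_at = {}
    outages = []
    for ts, nbr, state in events:
        if state == "DOWN":
            down_at[nbr] = ts
        elif nbr in down_at:
            outages.append((nbr, (ts - down_at.pop(nbr)).total_seconds()))
    return outages


if __name__ == "__main__":
    sample = (
        "054712: Mar 5 00:28:19 CST: %LDP-5-NBRCHG: LDP Neighbor 10.64.0.20:0 (0) "
        "is DOWN (Received error notification from peer: Shut down)\n"
        "054723: Mar 5 00:28:25 CST: %LDP-5-NBRCHG: LDP Neighbor 10.64.0.20:0 (4) is UP\n"
    )
    print(outage_seconds(ldp_transitions(sample)))
```

Run against the two NBRCHG lines in the 7201 excerpt, this reports a single outage of 6 seconds for 10.64.0.20:0. Feeding it a longer capture would show whether the flap interval is regular, which might hint at a timer expiring rather than a random link event.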

10.64.0.20 is a loopback on the 7613 and 10.64.0.34 is a loopback on the 7201.

I do have some interface errors which I also can't explain. They do not appear to be incrementing, though. 7613:


GigabitEthernet9/1 is up, line protocol is up (connected)
  Hardware is C6k 1000Mb 802.3, address is 001a.3063.0a80 (bia 001a.3063.0a80)
  Description: TO 2821-2.dc Gi0/0
  Internet address is 10.64.0.179/31
  MTU 9000 bytes, BW 1000000 Kbit, DLY 10 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 1000Mb/s
  input flow-control is off, output flow-control is off
  Clock mode is auto
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:02, output 00:00:00, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/75/1936665/7581 (size/max/drops/flushes); Total output drops: 4
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 49000 bits/sec, 17 packets/sec
  5 minute output rate 56000 bits/sec, 24 packets/sec
  L2 Switched: ucast: 52903876 pkt, 3771470311 bytes - mcast: 15056043 pkt, 1653756471 bytes
  L3 in Switched: ucast: 80170438 pkt, 12709078926 bytes - mcast: 0 pkt, 0 bytes mcast
  L3 out Switched: ucast: 185161821 pkt, 36022953056 bytes mcast: 0 pkt, 0 bytes
     150040994 packets input, 30087625055 bytes, 0 no buffer
     Received 15660647 broadcasts (0 IP multicasts)
     30 runts, 4247159 giants, 0 throttles
     1929071 input errors, 68 CRC, 0 frame, 13 overrun, 0 ignored
     0 watchdog, 0 multicast, 0 pause input
     0 input packets with dribble condition detected
     257650143 packets output, 64726258058 bytes, 0 underruns
     2 output errors, 0 collisions, 2 interface resets
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 PAUSE output
     0 output buffer failures, 0 output buffers swapped out

7201:
GigabitEthernet0/0 is up, line protocol is up
  Hardware is MV64460 Internal MAC, address is 0023.5ee9.ac1b (bia 0023.5ee9.ac1b)
  Description: TO 7613-2.clr Gi9/1
  Internet address is 10.64.0.178/31
  MTU 9000 bytes, BW 1000000 Kbit/sec, DLY 10 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 1000Mb/s, media type is RJ45
  output flow-control is XON, input flow-control is unsupported
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:00, output 00:00:00, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/75/3951/0 (size/max/drops/flushes); Total output drops: 6
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 45000 bits/sec, 19 packets/sec
  5 minute output rate 64000 bits/sec, 13 packets/sec
     51466122 packets input, 1916487584 bytes, 0 no buffer
     Received 1891956 broadcasts, 0 runts, 0 giants, 0 throttles
     5 input errors, 0 CRC, 0 frame, 0 overrun, 5 ignored
     0 watchdog, 2247902 multicast, 0 pause input
     0 input packets with dribble condition detected
     32927369 packets output, 1549013167 bytes, 0 underruns
     8 output errors, 0 collisions, 1 interface resets
     23 unknown protocol drops
     23 unknown protocol drops
     0 babbles, 0 late collision, 0 deferred
     8 lost carrier, 0 no carrier, 0 pause output
     0 output buffer failures, 0 output buffers swapped out
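
To check whether these counters really are static, a rough approach is to save two "show interface" captures a few minutes apart and diff the error counters. Below is a minimal sketch in Python; the function names (parse_counters, incrementing) and the choice of counters are assumptions for illustration, and the regex only relies on the "<number> <name>" layout visible in the output above.

```python
import re

# Counter names as they appear in the "show interface" output above.
COUNTERS = (
    "runts", "giants", "input errors", "CRC", "overrun",
    "ignored", "output errors", "interface resets",
)


def parse_counters(show_interface_text):
    """Extract the listed error counters from one 'show interface' capture."""
    found = {}
    for name in COUNTERS:
        m = re.search(r"(\d+)\s+" + re.escape(name) + r"\b", show_interface_text)
        if m:
            found[name] = int(m.group(1))
    return found


def incrementing(before, after):
    """Return {counter: delta} for counters whose value changed between captures."""
    return {
        name: after[name] - before[name]
        for name in before
        if name in after and after[name] != before[name]
    }
```

Usage would be along the lines of incrementing(parse_counters(first_capture), parse_counters(second_capture)); an empty result means none of the listed counters moved between the two captures.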

Any thoughts as to what's going on here? I can't tell for certain which of the 2 routers is causing LDP and BGP to drop. Knowing that would help me narrow my troubleshooting focus. The 7600 is running SRB1 and the 7201 is running 12.4(15)T7.


Thanks
 Justin

_______________________________________________
cisco-nsp mailing list  [email protected]
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/
