Hey Vincent,
Good to see you on the list :)
I think you've already read the threads covering the long history of
the KRT queue issue on Juniper (especially this one:
http://www.gossamer-threads.com/lists/nsp/juniper/40739).
I have to say that, even if the design problem remains, two minutes
isn't that bad. In the early days of the MX80, with flow enabled, it
took ages for the RIB to sync down to the FIB (friends reported 20 min
in the worst case). Juniper has made enhancements (or fixes) in recent
Junos releases. If you can, benchmark Junos 14 or 15.
Is there some way to not advertise the default route in OSPF during the
convergence time? Like a criteria: don't advertise this route when the
KRT queue has 1000+ elements and until it reaches 0 (to avoid flapping).
I've heard that some event scripts have been written to test this and
to dynamically change the configuration of whatever is needed, in your
case the announcement of the default.
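To make the criterion quoted above concrete, here is a minimal sketch
of the hysteresis in Python. It is not an actual Junos event script:
the class name, its parameters and the sample queue depths are all
hypothetical, and how the KRT queue depth is read from the router is
deliberately left out.

```python
class DefaultRouteGate:
    """Hysteresis on KRT queue depth (hypothetical helper, not a Junos API)."""

    def __init__(self, high=1000, low=0):
        self.high = high        # stop advertising at or above this depth
        self.low = low          # resume only once the queue drains to this depth
        self.advertise = True   # assume the default is advertised at start

    def update(self, queue_depth):
        """Feed one queue-depth sample, return whether to advertise."""
        if self.advertise and queue_depth >= self.high:
            self.advertise = False
        elif not self.advertise and queue_depth <= self.low:
            self.advertise = True
        return self.advertise


gate = DefaultRouteGate()
samples = [0, 500, 1200, 800, 10, 0, 300]   # made-up queue depths
decisions = [gate.update(depth) for depth in samples]
print(decisions)
```

Because the two thresholds differ, the default stays withdrawn while
the queue drains from 1200 through 800 and 10, and only comes back at
0, which is what avoids the flapping.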
I hope that Juniper continues to work on this. Even if it is due to a
design flaw which may be very hard to fix, I think there are still some
quick fixes that could mitigate the problem.
For example, I am thinking of a conservative mode, which would
basically kick in on a massive route change and do:
- a quick clear of the entire FIB,
- a quick install of some specific routes (which were flagged),
- a normal update.
Or something else, but give the operators some options.
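The three steps above can be sketched as a toy model in Python. This
only illustrates the ordering I have in mind, not how Junos programs
the FIB: the function name, the dict-based RIB/FIB and the example
prefixes and next hops are all assumptions.

```python
def conservative_resync(fib, rib, flagged):
    """Rebuild fib (prefix -> next hop) from rib in three phases.

    Returns the order in which prefixes were installed.
    """
    order = []

    # 1. Quick clear of the entire FIB: no stale next hop survives.
    fib.clear()

    # 2. Quick install of the flagged routes (e.g. the default) first.
    for prefix in flagged:
        if prefix in rib:
            fib[prefix] = rib[prefix]
            order.append(prefix)

    # 3. Normal update: everything else, in RIB order.
    for prefix, next_hop in rib.items():
        if prefix not in fib:
            fib[prefix] = next_hop
            order.append(prefix)

    return order


# Illustrative values only.
rib = {"138.231.0.0/16": "3.3.3.3", "0.0.0.0/0": "3.3.3.3"}
fib = {"138.231.0.0/16": "1.1.1.1"}   # stale entry, dead next hop
order = conservative_resync(fib, rib, flagged=["0.0.0.0/0"])
print(order)
```

Between steps 2 and 3 only the flagged routes are in the FIB, so
packets follow the default instead of a more specific route that still
points to a dead next hop.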
Regards,
--
Raphael Mazelier
On 05/02/2016 17:15, Brad Fleming wrote:
Welcome to running a full table on the MX104. This is exactly what we found
when lab testing the devices. After months of working with JTAC we never found
a workaround. After several software updates and major configuration changes
there was never a way to resolve the issues. During a major convergence event
impacting a significant amount of the routes in a full table it took many
minutes to get RIB and FIB sync’d. In the meantime traffic was getting
blackholed. In the end we had to give up and roll bigger MX gear with much
bigger REs (and much more expensive).
On Feb 3, 2016, at 3:21 PM, Vincent Bernat <[email protected]> wrote:
Hey!
I have a pair of MX104. Each one is receiving a full view and a default
through an external BGP session. They share an iBGP session. They
redistribute the default in OSPF (with a higher metric when the default
comes through the iBGP session). Nothing fancy.
If I shut the upstream port of one of the MX, the session goes down and
the RIB is quickly updated. Unfortunately, the KRT is quite slow to be
updated. A "show krt queue" shows there are many
deletion/addition/changes queued and they take about 2 minutes to be
processed.
Unfortunately, during this time, I have a lot of more specific routes
still pointing to a non-existent hop:
[email protected]# run show route 138.231.136.1 extensive table public.inet.0 | no-more
public.inet.0: 571546 destinations, 996364 routes (425305 active, 321183 holddown, 571058 hidden)
138.231.0.0/16 (2 entries, 1 announced)
TSI:
KRT queued (pending) change
138.231.0.0/16 -> {1.1.1.1}=>{indirect(1048578)}
Page 0 idx 1, (group v4-IBGP type Internal) Type 3 val 22b9ccb8 (grp rto)
Advertised metrics:
No metrics
(Queued)
Enqueued metrics 1: (for peers 00000001 3.3.3.3)
Flags: Nexthop Change
Nexthop: Self
MED: 10
Localpref: 100
AS path: [61098] 25091 2200 2426 I
Communities: 25091:22413 25091:24115
[...]
Path 138.231.0.0 from 159.100.255.231 Vector len 4. Val: 1
*BGP Preference: 140/-101
Next hop type: Indirect
Address: 0x177743a0
Next-hop reference count: 877603
Source: 3.3.3.3
Next hop type: Router, Next hop index: 1048577
Next hop: 2.2.2.2 via xe-2/0/3.100
Session Id: 0x18
Next hop: 2.2.2.0 via xe-2/0/2.100, selected
Session Id: 0x17
Protocol next hop: 3.3.3.3
Indirect next hop: 0x19ec4b2c 1048578 INH Session ID: 0x1b
State: <Active Int Ext>
Age: 16:57 Metric: 10 Metric2: 0
Validation State: unverified
Task: BGP_61098_61098.3.3.3.3+50640
Announcement bits (3): 2-KRT 3-BGP_RT_Background 4-Resolve tree
2
AS path: 8218 2200 2426 I
Communities: 8218:102 8218:20000 8218:20110
Accepted
Localpref: 100
Router ID: 3.3.3.3
Indirect next hops: 1
Protocol next hop: 3.3.3.3
Indirect next hop: 0x19ec4b2c 1048578 INH Session ID: 0x1b
Indirect path forwarding next hops: 2
Next hop type: Router
Next hop: 2.2.2.2 via xe-2/0/3.100
Session Id: 0x18
Next hop: 2.2.2.0 via xe-2/0/2.100
Session Id: 0x17
3.3.3.3/32 Originating RIB: public.inet.0
Node path count: 1
Forwarding nexthops: 2
Nexthop: 2.2.2.2 via xe-2/0/3.100
So, I have three questions:
Is it expected for a route to be flagged "active" while it is still
queued to KRT?
Is there a way to delete those invalid routes in a speedier manner
to let packets use the default route during the convergence time?
Is there some way to not advertise the default route in OSPF during the
convergence time? Like a criteria: don't advertise this route when the
KRT queue has 1000+ elements and until it reaches 0 (to avoid flapping).
I am running 13.3R8.7.
Thanks!
--
Treat end of file conditions in a uniform manner.
- The Elements of Programming Style (Kernighan & Plauger)
_______________________________________________
juniper-nsp mailing list [email protected]
https://puck.nether.net/mailman/listinfo/juniper-nsp