Hi juniper-nsp readers,
Recently we ran into an issue where L3-incompletes counters started
incrementing on internal backbone links. It began after we added new PE
routers, core routers and route-reflectors.
After quite a long investigation with TAC involved, the problem was
identified: v6 traffic was being sent over RSVP tunnels without the
explicit-null label and was arriving at the egress PE with a v4
Ethertype in the MAC header.
The missing explicit-null label turned out to be caused by running both
inet6 unicast (over IPv6) and inet6 labeled-unicast explicit-null (over
IPv4) BGP sessions in parallel.
The route-reflector receives the same prefix from the originating PE
over both the v4 and the v6 BGP session and installs both paths in the
inet6.0 table.
akostin@rr02> show route 2a03:2880:f10e::/48 receive-protocol bgp X.X.X.130 detail
<<< Received over the v4 BGP session with family inet6 labeled-unicast
explicit-null, and has Label 2 accordingly
inet6.0: 195655 destinations, 1173973 routes (195655 active, 6 holddown,
0 hidden)
* 2a03:2880:f10e::/48 (2 entries, 0 announced)
Accepted Multipath
Route Label: 2
Nexthop: ::ffff:X.X.X.130
MED: 95
Localpref: 106
AS path: 32934 I
Communities: Y:30000 Y:30127
Addpath Path ID: 1
Accepted MultipathContrib MultipathDup
Route Label: 2
Nexthop: ::ffff:X.X.X.140
MED: 95
Localpref: 106
AS path: 32934 I (Originator)
Cluster list: X.X.2.4
Originator ID: X.X.X.140
Communities: Y:30000 Y:30127
Addpath Path ID: 2
akostin@rr02> show route 2a03:2880:f10e::/48 receive-protocol bgp 2607:X:X::1:130 detail
<<<< Received over the v6 BGP session and has a v6 nexthop
inet6.0: 195656 destinations, 1173985 routes (195657 active, 6 holddown,
0 hidden)
2a03:2880:f10e::/48 (1 entry, 0 announced)
Accepted
Nexthop: 2607:X:X::1:130
MED: 95
Localpref: 106
AS path: 32934 I
Communities: Y:30000 Y:30127
So far so good, but when the route-reflector advertises the prefix to an
rr-client, it picks the best path (or several, if add-path is
configured). In this case the RR chooses the path with the IPv4-mapped
nexthop and sends it over the IPv6 BGP session, obviously without the
explicit-null label.
akostin@rr02> show route 2a03:2880:f10e::/48 advertising-protocol bgp X.X.X.237 detail
<<<< Correctly advertised over the v4 BGP session with a mapped v4
nexthop and the explicit-null label
inet6.0: 195756 destinations, 1174580 routes (195756 active, 6 holddown,
0 hidden)
* 2a03:2880:f10e::/48 (6 entries, 0 announced)
BGP group internal-rr-v4 type Internal
Route Label: 2
Nexthop: ::ffff:X.X.X.130
MED: 95
Localpref: 106
AS path: [Y] 32934 I
Communities: Y:30000 Y:30127
Cluster ID: X.X.X.155
Originator ID: X.X.X.130
Addpath Path ID: 1
BGP group internal-rr-v4 type Internal
Route Label: 2
Nexthop: ::ffff:X.X.X.140
MED: 95
Localpref: 106
AS path: [Y] 32934 I
Communities: Y:30000 Y:30127
Cluster ID: X.X.X.155
Originator ID: X.X.X.140
Addpath Path ID: 2
akostin@rr02> show route 2a03:2880:f10e::/48 advertising-protocol bgp 2607:X:X::1:237 detail
<<<< The path received over the v4 BGP session is advertised over the
v6 session. Importantly, this path has a mapped IPv4 nexthop but
doesn't have the explicit-null label.
inet6.0: 195760 destinations, 1174603 routes (195760 active, 7 holddown,
0 hidden)
* 2a03:2880:f10e::/48 (6 entries, 0 announced)
BGP group internal-rr-v6 type Internal
Nexthop: ::ffff:X.X.X.130
MED: 95
Localpref: 106
AS path: [Y] 32934 I
Communities: Y:30000 Y:30127
Cluster ID: X.X.X.155
Originator ID: X.X.X.130
On the receiving router all paths are installed because of BGP
multipath. If that last path is used, v6 packets are sent without the
explicit-null label, arrive at the egress PE with the wrong Ethertype
and are dropped as L3-incompletes.
[email protected]> show route 2a03:2880:f10e::/48 table inet6.0
+ = Active Route, - = Last Active, * = Both
2a03:2880:f10e::/48*[BGP/170] 2d 21:46:57, MED 95, localpref 106, from
X.X.X.154
AS path: 32934 I, validation-state: unverified
to X.X.X.14 via ae0.0, label-switched-path
BE-agg02-to-bdr01-1
> to X.X.X.14 via ae0.0, label-switched-path
BE-agg02-to-bdr01-2
[BGP/170] 2d 21:54:26, MED 95, localpref 106, from
X.X.X.155
AS path: 32934 I, validation-state: unverified
to X.X.X.14 via ae0.0, Push 2, Push 129063(top)
> to X.X.X.14 via ae0.0, Push 2, Push 129001(top)
[BGP/170] 2d 21:47:17, MED 95, localpref 106, from
X.X.X.154
AS path: 32934 I, validation-state: unverified
to X.X.X.14 via ae0.0, Push 2, Push 129314(top)
> to X.X.X.14 via ae0.0, Push 2, Push 128995(top)
[BGP/170] 2d 21:47:17, MED 95, localpref 106, from
X.X.X.155
AS path: 32934 I, validation-state: unverified
to X.X.X.14 via ae0.0, Push 2, Push 129314(top)
> to X.X.X.14 via ae0.0, Push 2, Push 128995(top)
[BGP/170] 2d 21:47:17, MED 95, localpref 106, from
2607:X:X::1:154
AS path: 32934 I, validation-state: unverified
to X.X.X.14 via ae0.0, Push 129314
> to X.X.X.14 via ae0.0, Push 128995
[BGP/170] 2d 21:47:17, MED 95, localpref 106, from
2607:X:X::1:155
AS path: 32934 I, validation-state: unverified
to X.X.X.14 via ae0.0, Push 129314
> to X.X.X.14 via ae0.0, Push 128995
The first four paths are correct, but the last two are missing Label 2
because they were received over the v6 BGP session without
explicit-null. If one of the incorrect paths is used, the mapped IPv4
nexthop is resolved over the MPLS tunnel, but packets are sent with only
the transport label (129314 or 128995 in this case), which is removed at
the penultimate hop. Because label 2 is missing, packets arrive at the
egress PE with the wrong Ethertype and are dropped as L3-incompletes.
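To make the label-stack reasoning above concrete, here is a rough Python
sketch of the forwarding path. It is only a toy model, not anything Junos
runs: the function names are made up for illustration, and the behaviour
of the penultimate hop (writing the IPv4 Ethertype once no label is left)
is an assumption taken from what we observed, not from documentation.
Label value 2 is the reserved IPv6 explicit-null label.

```python
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD
ETHERTYPE_MPLS = 0x8847
IPV6_EXPLICIT_NULL = 2  # reserved label value for IPv6 explicit-null


def penultimate_hop(labels):
    """Pop the transport label; return (remaining labels, Ethertype on the wire).

    Assumption: when no label is left, the router writes the IPv4 Ethertype,
    as observed in this case; it does not inspect the payload.
    """
    remaining = labels[1:]
    ethertype = ETHERTYPE_MPLS if remaining else ETHERTYPE_IPV4
    return remaining, ethertype


def egress_pe(labels, ethertype, payload_version):
    """Return "forwarded" or "L3-incomplete" for a frame arriving at the egress PE."""
    if ethertype == ETHERTYPE_MPLS and labels and labels[0] == IPV6_EXPLICIT_NULL:
        # Explicit-null tells the PE that an IPv6 packet follows the label.
        return "forwarded" if payload_version == 6 else "L3-incomplete"
    expected = {ETHERTYPE_IPV4: 4, ETHERTYPE_IPV6: 6}.get(ethertype)
    return "forwarded" if expected == payload_version else "L3-incomplete"


def send_ipv6(label_stack):
    """Carry one IPv6 packet to the egress PE with the given label stack (top first)."""
    labels, ethertype = penultimate_hop(list(label_stack))
    return egress_pe(labels, ethertype, payload_version=6)


good = send_ipv6([129314, 2])  # transport label on top of explicit-null
bad = send_ipv6([129314])      # transport label only
print(good, bad)
```

With the two-label stack the egress PE still sees label 2 and knows a v6
packet follows; with the transport label alone the frame shows up as
"IPv4" carrying a v6 packet, which is the L3-incompletes case.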
The problem here is that the route-reflector selects a path with an
IPv4-mapped nexthop and advertises it over the IPv6 session. I'm
wondering, has anybody already encountered this problem and found a way
to make an RR advertise paths with the correct nexthop?
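For anyone who wants to check their own routers for the same symptom,
the condition can be written down as a small Python sketch: a path is
suspect when its nexthop is an IPv4-mapped IPv6 address but no route
label came with it. The Path tuple and is_broken() are hypothetical
names, the data is hand-typed rather than parsed from Junos output, and
192.0.2.0/24 and 2001:db8::/32 stand in for the masked addresses. A
native v6 nexthop without a label is treated as fine here, which is an
assumption of the sketch.

```python
import ipaddress
from typing import NamedTuple, Optional


class Path(NamedTuple):
    learned_from: str            # BGP peer the path was learned from
    nexthop: str                 # protocol next hop, as an IPv6 address string
    route_label: Optional[int]   # BGP route label, None if the path is unlabelled


def is_broken(path: Path) -> bool:
    """True if the next hop is IPv4-mapped but no label came with the path."""
    mapped = ipaddress.IPv6Address(path.nexthop).ipv4_mapped is not None
    return mapped and path.route_label is None


# Hypothetical paths modelled on the three cases seen in the output above.
paths = [
    Path("192.0.2.155", "::ffff:192.0.2.130", 2),         # v4 session, labelled
    Path("2001:db8::1:155", "::ffff:192.0.2.130", None),  # v6 session, mapped nexthop
    Path("2001:db8::1:155", "2001:db8::1:130", None),     # v6 session, native nexthop
]

broken = [p for p in paths if is_broken(p)]
for p in broken:
    print(f"path from {p.learned_from} via {p.nexthop} lacks explicit-null")
```

Only the second path is flagged, which matches the two label-less
entries in the inet6.0 output.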
I know that having two sessions for IPv6 adds complexity and that one of
them can be removed, but I'm interested in finding an elegant solution
for this issue.
Kind regards,
Andrey