Hi juniper-nsp readers,

Recently we ran into an issue where the L3-incompletes counters started incrementing on internal backbone links. It began after we added new PE routers, core routers and route reflectors. After a fairly long investigation with TAC involved, the problem was identified: v6 traffic was being sent over RSVP tunnels without the explicit-null label and was arriving at the egress PE with the v4 Ethertype in the MAC header.

The missing explicit-null label turned out to be caused by running both inet6 unicast (over IPv6) and inet6 labeled-unicast explicit-null (over IPv4) BGP sessions in parallel. The route reflector receives the same prefix from the originating PE over both the v4 and the v6 BGP session and installs both paths in the inet6.0 table.

akostin@rr02> show route 2a03:2880:f10e::/48 receive-protocol bgp X.X.X.130 detail <<< Received over v4 BGP session with family inet6 labeled-unicast explicit-null and has Label 2 accordingly

inet6.0: 195655 destinations, 1173973 routes (195655 active, 6 holddown, 0 hidden)

* 2a03:2880:f10e::/48 (2 entries, 0 announced)
     Accepted Multipath
     Route Label: 2
     Nexthop: ::ffff:X.X.X.130
     MED: 95
     Localpref: 106
     AS path: 32934 I
     Communities: Y:30000 Y:30127
     Addpath Path ID: 1
     Accepted MultipathContrib MultipathDup
     Route Label: 2
     Nexthop: ::ffff:X.X.X.140
     MED: 95
     Localpref: 106
     AS path: 32934 I  (Originator)
     Cluster list:  X.X.2.4
     Originator ID: X.X.X.140
     Communities: Y:30000 Y:30127
     Addpath Path ID: 2

akostin@rr02> show route 2a03:2880:f10e::/48 receive-protocol bgp 2607:X:X::1:130 detail <<<< Received over v6 BGP session and has v6 nexthop

inet6.0: 195656 destinations, 1173985 routes (195657 active, 6 holddown, 0 hidden)

  2a03:2880:f10e::/48 (1 entry, 0 announced)
     Accepted
     Nexthop: 2607:X:X::1:130
     MED: 95
     Localpref: 106
     AS path: 32934 I
     Communities: Y:30000 Y:30127

So far so good, but when the route reflector advertises the prefix to an RR client, it picks the best path, or several best paths if add-path is configured. In this case the RR chooses the path with the IPv4-mapped next hop and sends it over the IPv6 BGP session, obviously without the explicit-null label.

akostin@rr02> show route 2a03:2880:f10e::/48 advertising-protocol bgp X.X.X.237 detail <<<< Correctly advertised over v4 BGP session with mapped v4 nexthop and explicit-null label

inet6.0: 195756 destinations, 1174580 routes (195756 active, 6 holddown, 0 hidden)

* 2a03:2880:f10e::/48 (6 entries, 0 announced)
 BGP group internal-rr-v4 type Internal
     Route Label: 2
     Nexthop: ::ffff:X.X.X.130
     MED: 95
     Localpref: 106
     AS path: [Y] 32934 I
     Communities: Y:30000 Y:30127
     Cluster ID: X.X.X.155
     Originator ID: X.X.X.130
     Addpath Path ID: 1
 BGP group internal-rr-v4 type Internal
     Route Label: 2
     Nexthop: ::ffff:X.X.X.140
     MED: 95
     Localpref: 106
     AS path: [Y] 32934 I
     Communities: Y:30000 Y:30127
     Cluster ID: X.X.X.155
     Originator ID: X.X.X.140
     Addpath Path ID: 2

akostin@rr02> show route 2a03:2880:f10e::/48 advertising-protocol bgp 2607:X:X::1:237 detail <<<< The path received over the v4 BGP session is advertised over the v6 session. Importantly, this path has an IPv4-mapped nexthop but no explicit-null label.

inet6.0: 195760 destinations, 1174603 routes (195760 active, 7 holddown, 0 hidden)

* 2a03:2880:f10e::/48 (6 entries, 0 announced)
 BGP group internal-rr-v6 type Internal
     Nexthop: ::ffff:X.X.X.130
     MED: 95
     Localpref: 106
     AS path: [Y] 32934 I
     Communities: Y:30000 Y:30127
     Cluster ID: X.X.X.155
     Originator ID: X.X.X.130
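
The condition visible in the two advertisements above (an IPv4-mapped next hop with no route label) can be checked mechanically. The following is only a minimal Python sketch, assuming the advertised paths have already been parsed into simple records; the `Path` class, the `is_suspect` helper and the documentation-range addresses are hypothetical and are not taken from the routers above.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class Path:
    """One advertised BGP path (hypothetical record, not a Junos structure)."""
    peer: str                    # neighbor the path is advertised to
    nexthop: str                 # protocol next hop as shown in the CLI
    label: Optional[int] = None  # route label, None if the path is unlabeled


def is_suspect(path: Path) -> bool:
    """True if the next hop is IPv4-mapped (::ffff:a.b.c.d) but no label is attached."""
    nexthop = ipaddress.IPv6Address(path.nexthop)
    return nexthop.ipv4_mapped is not None and path.label is None


# Placeholder addresses standing in for the masked ones in the post.
paths = [
    Path("192.0.2.237", "::ffff:192.0.2.130", label=2),  # v4 session, labeled
    Path("2001:db8::1:237", "::ffff:192.0.2.130"),       # v6 session, mapped next hop, no label
    Path("2001:db8::1:237", "2001:db8::1:130"),          # v6 session, native v6 next hop
]

suspect = [p for p in paths if is_suspect(p)]
for p in suspect:
    print(f"suspect path to {p.peer}: nexthop {p.nexthop} has no label")
```

Only the second record is flagged: it is the combination of the mapped next hop and the missing label that causes trouble, not either property on its own.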

On the receiving router all paths are installed because of BGP multipath. If the last path is used, v6 packets are sent without the explicit-null label, arrive at the egress PE with the wrong Ethertype and are dropped as L3-incompletes.

[email protected]> show route  2a03:2880:f10e::/48  table inet6.0

+ = Active Route, - = Last Active, * = Both

2a03:2880:f10e::/48*[BGP/170] 2d 21:46:57, MED 95, localpref 106, from X.X.X.154
                      AS path: 32934 I, validation-state: unverified
                       to X.X.X.14 via ae0.0, label-switched-path BE-agg02-to-bdr01-1
                    >  to X.X.X.14 via ae0.0, label-switched-path BE-agg02-to-bdr01-2
                    [BGP/170] 2d 21:54:26, MED 95, localpref 106, from X.X.X.155
                      AS path: 32934 I, validation-state: unverified
                       to X.X.X.14 via ae0.0, Push 2, Push 129063(top)
                    >  to X.X.X.14 via ae0.0, Push 2, Push 129001(top)
                    [BGP/170] 2d 21:47:17, MED 95, localpref 106, from X.X.X.154
                      AS path: 32934 I, validation-state: unverified
                       to X.X.X.14 via ae0.0, Push 2, Push 129314(top)
                    >  to X.X.X.14 via ae0.0, Push 2, Push 128995(top)
                    [BGP/170] 2d 21:47:17, MED 95, localpref 106, from X.X.X.155
                      AS path: 32934 I, validation-state: unverified
                       to X.X.X.14 via ae0.0, Push 2, Push 129314(top)
                    >  to X.X.X.14 via ae0.0, Push 2, Push 128995(top)
                    [BGP/170] 2d 21:47:17, MED 95, localpref 106, from 2607:X:X::1:154
                      AS path: 32934 I, validation-state: unverified
                       to X.X.X.14 via ae0.0, Push 129314
                    >  to X.X.X.14 via ae0.0, Push 128995
                    [BGP/170] 2d 21:47:17, MED 95, localpref 106, from 2607:X:X::1:155
                      AS path: 32934 I, validation-state: unverified
                       to X.X.X.14 via ae0.0, Push 129314
                    >  to X.X.X.14 via ae0.0, Push 128995

The first four paths are correct, but the last two are missing label 2 because they were received over the v6 BGP session without explicit-null. If one of the incorrect paths is used, the IPv4-mapped next hop is resolved over the MPLS tunnel, but packets are sent with only the transport label (129314 or 128995 in this case), which is removed at the penultimate hop. Because label 2 is missing, packets arrive at the egress PE with the wrong Ethertype and are dropped as L3-incompletes.
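
To make the label-stack reasoning concrete, here is a small Python model of what happens on the last hop. It is a simplified sketch, not a description of Junos internals: the function names are hypothetical, and the fallback to the IPv4 Ethertype once the stack is empty is an assumption taken from the behaviour observed above. Label 2 is the IPv6 Explicit NULL label reserved in RFC 3032.

```python
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD
ETHERTYPE_MPLS = 0x8847
IPV6_EXPLICIT_NULL = 2  # reserved label value from RFC 3032


def penultimate_hop_pop(stack):
    """Pop the top (transport) label; return (remaining stack, Ethertype on the last hop).

    The stack is a list with the top label first.
    """
    remaining = stack[1:]
    if remaining:
        # Labels are still present, so the frame is still an MPLS frame.
        return remaining, ETHERTYPE_MPLS
    # Assumption for this sketch: with no label left, the penultimate hop
    # cannot tell that the payload is IPv6 and marks the frame as IPv4.
    return remaining, ETHERTYPE_IPV4


def egress_pe_accepts(stack, payload_version):
    """Return True if the egress PE can interpret the frame it receives."""
    remaining, ethertype = penultimate_hop_pop(stack)
    if ethertype == ETHERTYPE_MPLS:
        # Label 2 tells the egress PE that the payload is IPv6.
        return remaining == [IPV6_EXPLICIT_NULL] and payload_version == 6
    expected = ETHERTYPE_IPV4 if payload_version == 4 else ETHERTYPE_IPV6
    return ethertype == expected


# Good path: transport label on top of explicit-null label 2.
print(egress_pe_accepts([129314, IPV6_EXPLICIT_NULL], payload_version=6))
# Bad path: transport label only, IPv6 payload behind an IPv4 Ethertype.
print(egress_pe_accepts([128995], payload_version=6))
```

In this model the first call succeeds and the second fails, which corresponds to the packets counted as L3-incompletes on the egress PE.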

The problem here is that the route reflector selects a path with an IPv4-mapped next hop and advertises it over the IPv6 session. Has anybody already encountered this problem and found a way to make an RR advertise paths with the correct next hop? I know that having two sessions for IPv6 adds complexity and that one of them could be removed, but I'm interested in finding an elegant solution to this issue.

Kind regards,
Andrey
_______________________________________________
juniper-nsp mailing list [email protected]
https://puck.nether.net/mailman/listinfo/juniper-nsp
