From: Jiayuan Chen <[email protected]>

When a standalone IPv6 nexthop object is created with a loopback device
(e.g., "ip -6 nexthop add id 100 dev lo"), fib6_nh_init() misclassifies
it as a reject route. This is because nexthop objects have no destination
prefix (fc_dst=::), causing fib6_is_reject() to match any loopback
nexthop. The reject path skips fib_nh_common_init(), leaving
nhc_pcpu_rth_output unallocated. If an IPv4 route later references this
nexthop, __mkroute_output() dereferences the NULL nhc_pcpu_rth_output
pointer and the kernel panics.

The reject classification was designed for regular IPv6 routes to prevent
kernel looping, but nexthop objects should not be subject to this check
because they carry no destination information; loop prevention is handled
separately when the route is created.

An alternative approach of unconditionally calling fib_nh_common_init()
for all reject routes was considered, but on large machines (e.g., 256
CPUs) with many routes, this wastes significant memory since
nhc_pcpu_rth_output allocates a per-CPU pointer for each route.

Since fib6_nh_init() is shared by multiple callers (route creation,
nexthop object creation, IPv4 gateway validation), using fc_dst_len to
implicitly distinguish nexthop objects would be fragile. Add an explicit
fc_is_nh flag to fib6_config to clearly identify nexthop object creation
and skip the reject check for this path.

Fixes: 7dd73168e273 ("ipv6: Always allocate pcpu memory in a fib6_nh")
Reported-by: [email protected]
Closes: https://lore.kernel.org/all/[email protected]/T/
Signed-off-by: Jiayuan Chen <[email protected]>
---
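Note for reviewers (not part of the commit): the decision being changed
can be sketched as a small user-space model. This is only an illustration
under stated assumptions, not the kernel code. It assumes fib6_is_reject()
has roughly the shape "RTF_REJECT set, or loopback device with a
non-loopback destination and neither RTF_ANYCAST nor RTF_LOCAL". Every
model_* name and MODEL_* value below is hypothetical and exists only for
this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in bit values for this model only; not the kernel's definitions. */
#define MODEL_RTF_REJECT    0x1u
#define MODEL_RTF_ANYCAST   0x2u
#define MODEL_RTF_LOCAL     0x4u
#define MODEL_ADDR_LOOPBACK 0x1u	/* destination address is loopback */

/* Hypothetical, reduced view of the inputs the check looks at. */
struct model_cfg {
	uint32_t flags;			/* models fc_flags */
	unsigned int dst_addr_type;	/* models ipv6_addr_type(&fc_dst) */
	bool dev_is_loopback;		/* models dev->flags & IFF_LOOPBACK */
	bool is_nh;			/* models the new fc_is_nh flag */
};

/* Assumed shape of the reject classification (see the note above). */
static bool model_is_reject(const struct model_cfg *cfg)
{
	return (cfg->flags & MODEL_RTF_REJECT) ||
	       (cfg->dev_is_loopback &&
		!(cfg->dst_addr_type & MODEL_ADDR_LOOPBACK) &&
		!(cfg->flags & (MODEL_RTF_ANYCAST | MODEL_RTF_LOCAL)));
}

/* Before the patch: the common init (and its per-CPU allocation) runs
 * only when the reject path is not taken. */
static bool model_allocs_pcpu_before(const struct model_cfg *cfg)
{
	return !model_is_reject(cfg);
}

/* After the patch: nexthop objects never take the reject path. */
static bool model_allocs_pcpu_after(const struct model_cfg *cfg)
{
	return cfg->is_nh || !model_is_reject(cfg);
}
```

In this model, a nexthop object on a loopback device (dev_is_loopback and
is_nh set, dst_addr_type 0 because fc_dst is ::) gets no per-CPU
allocation before the change and does get one after it, while a regular
route with the same inputs and is_nh clear is still treated as a reject
route in both cases.
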
 include/net/ip6_fib.h | 1 +
 net/ipv4/nexthop.c    | 1 +
 net/ipv6/route.c      | 8 +++++++-
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 88b0dd4d8e09..7710f247b8d9 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -62,6 +62,7 @@ struct fib6_config {
        struct nlattr   *fc_encap;
        u16             fc_encap_type;
        bool            fc_is_fdb;
+       bool            fc_is_nh;
 };
 
 struct fib6_node {
diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index 7b9d70f9b31c..efad2dd27636 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -2859,6 +2859,7 @@ static int nh_create_ipv6(struct net *net, struct nexthop *nh,
        struct fib6_config fib6_cfg = {
                .fc_table = l3mdev_fib_table(cfg->dev),
                .fc_ifindex = cfg->nh_ifindex,
+               .fc_is_nh = true,
                .fc_gateway = cfg->gw.ipv6,
                .fc_flags = cfg->nh_flags,
                .fc_nlinfo = cfg->nlinfo,
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index c0350d97307e..347f464ce7fe 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -3628,7 +3628,13 @@ int fib6_nh_init(struct net *net, struct fib6_nh *fib6_nh,
         * they would result in kernel looping; promote them to reject routes
         */
        addr_type = ipv6_addr_type(&cfg->fc_dst);
-       if (fib6_is_reject(cfg->fc_flags, dev, addr_type)) {
+       /*
+        * Nexthop objects have no destination prefix, so fib6_is_reject()
+        * will misclassify loopback nexthops as reject routes, causing
+        * fib_nh_common_init() to be skipped along with its allocation
+        * of nhc_pcpu_rth_output, which IPv4 routes require.
+        */
+       if (!cfg->fc_is_nh && fib6_is_reject(cfg->fc_flags, dev, addr_type)) {
                /* hold loopback dev/idev if we haven't done so. */
                if (dev != net->loopback_dev) {
                        if (dev) {
-- 
2.43.0

