On 20 Aug 2025, at 18:00, Mark Johnston wrote:
On Wed, Aug 20, 2025 at 02:30:20PM +0200, Kristof Provost wrote:
We’re panicking because the V_rtzone zone has been cleaned up (in
vnet_rtzone_destroy()). I explicitly NULL out V_rtzone too, to make this
more obvious.
Note that we failed to completely free all rtentries (`Freed UMA keg
(rtentry) was not empty (2 items). Lost 1 pages of memory.`). Presumably at least one of those two gets freed later, and that’s the panic we see.

rt_free() queues the actual delete as an epoch callback
(`NET_EPOCH_CALL(destroy_rtentry_epoch, &rt->rt_epoch_ctx);`), and that’s what we see here: the zone is removed before we’re done freeing all of the
rtentries.
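
To make the suspected sequence concrete, here is a minimal userspace sketch of a deferred free racing with zone destruction. It is only a model under stated assumptions: every `model_*` name is hypothetical, none of this is kernel code, and the real epoch and UMA machinery is far more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_MAX_PENDING 16

struct model_zone {
	int alive;		/* 0 once the zone has been destroyed */
	int outstanding;	/* items allocated and not yet returned */
};

struct model_epoch {
	struct model_zone *pending[MODEL_MAX_PENDING];	/* zone each queued free targets */
	size_t npending;
	int use_after_destroy;	/* queued frees that ran against a dead zone */
};

/* Allocate one item from the zone. */
static void
model_alloc(struct model_zone *z)
{
	z->outstanding++;
}

/* Stand-in for rt_free(): queue the free instead of doing it now. */
static int
model_deferred_free(struct model_epoch *e, struct model_zone *z)
{
	if (e->npending == MODEL_MAX_PENDING)
		return (-1);
	e->pending[e->npending++] = z;
	return (0);
}

/* Stand-in for draining the callbacks: run every queued free. */
static void
model_drain(struct model_epoch *e)
{
	for (size_t i = 0; i < e->npending; i++) {
		struct model_zone *z = e->pending[i];

		if (!z->alive)
			e->use_after_destroy++;	/* the kernel would panic here */
		else
			z->outstanding--;
	}
	e->npending = 0;
}

/* Stand-in for destroying the zone: returns the number of leaked items. */
static int
model_zone_destroy(struct model_zone *z)
{
	z->alive = 0;
	return (z->outstanding);
}

/*
 * Allocate two items and free both. If free_after_drain is set, the second
 * free is only queued after the drain and the zone destroy have happened.
 * Returns the number of frees that hit the destroyed zone.
 */
static int
model_scenario(int free_after_drain, int *leaked)
{
	struct model_zone z = { .alive = 1, .outstanding = 0 };
	struct model_epoch e = { .npending = 0, .use_after_destroy = 0 };

	model_alloc(&z);
	model_alloc(&z);
	model_deferred_free(&e, &z);
	if (!free_after_drain)
		model_deferred_free(&e, &z);
	model_drain(&e);
	*leaked = model_zone_destroy(&z);
	if (free_after_drain)
		model_deferred_free(&e, &z);
	model_drain(&e);
	return (e.use_after_destroy);
}
```

In this model, `model_scenario(0, &leaked)` frees both items before the zone goes away, while `model_scenario(1, &leaked)` reports a non-empty zone at destroy time and then a free against the dead zone, which is the same shape as the warning and panic above.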

vnet_rtzone_destroy() is called from rtables_destroy(), but that explicitly calls NET_EPOCH_DRAIN_CALLBACKS() first, so I’d expect all of the pending cleanups to have been done at that point. The comment block above does suggest that there may still be nexthop entries pending deletion even after
we drain the callbacks. I think I can see how that’d happen for
nexthops, but I do not see how it can happen for rtentries.

Is it possible that if_detach_internal()->rt_flushifroutes() is running
after the rtentry zone is being destroyed?  That is, maybe we're
destroying interfaces too late in the jail teardown process?

With a little work to pass the calling function and line number through the call stack (and a lot of patience to reproduce the panic) I think I’ve found where we initially rt_free() the relevant rtentry, but it’s left me even more confused.

The call happens from ip6_destroy() -> in6_purgeaddr() -> ifa_del_loopback_route() -> ifa_maintain_loopback_route() -> rib_action() -> rib_del_route() -> rt_free(). rt_free() defers the actual free via NET_EPOCH_CALL(), which should be fine because in rtables_destroy() we NET_EPOCH_DRAIN_CALLBACKS() before we vnet_rtzone_destroy() (which naturally destroys the relevant UMA zone).

ip6_destroy()’s VNET_SYSUNINIT is SI_SUB_PROTO_DOMAIN/SI_ORDER_THIRD and rtables_destroy()’s is SI_SUB_PROTO_DOMAIN/SI_ORDER_FIRST. Given that it’s *un*init that means we call ip6_destroy() first, so that should all just work. The enqueued freeing of the rtentries should all be handled once NET_EPOCH_DRAIN_CALLBACKS() completes, but that appears to not be the case.
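
The ordering argument can be checked with a small userspace sketch. It assumes that uninit handlers are sorted ascending by (subsystem, order) and then run in reverse, as described above. The `model_*` names and the numeric values are hypothetical placeholders, not the kernel's definitions; only their relative order matters.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model values; only their relative order matters here. */
enum { MODEL_SUB_PROTO_DOMAIN = 100 };
enum { MODEL_ORDER_FIRST = 1, MODEL_ORDER_THIRD = 3 };

struct model_sysuninit {
	int subsystem;
	int order;
	const char *name;
};

/* Ascending comparison by subsystem, then by order (the init order). */
static int
model_cmp(const void *a, const void *b)
{
	const struct model_sysuninit *x = a, *y = b;

	if (x->subsystem != y->subsystem)
		return ((x->subsystem > y->subsystem) -
		    (x->subsystem < y->subsystem));
	return ((x->order > y->order) - (x->order < y->order));
}

/*
 * Sort into init order, then walk the table backwards to get the uninit
 * order. Returns the zero-based position at which `name` would run during
 * uninit, or -1 if it is not in the table.
 */
static int
model_uninit_position(struct model_sysuninit *tab, size_t n, const char *name)
{
	qsort(tab, n, sizeof(*tab), model_cmp);
	for (size_t i = 0; i < n; i++) {
		if (strcmp(tab[n - 1 - i].name, name) == 0)
			return ((int)i);
	}
	return (-1);
}
```

With a table holding rtables_destroy at MODEL_ORDER_FIRST and ip6_destroy at MODEL_ORDER_THIRD, `model_uninit_position()` puts ip6_destroy at position 0 and rtables_destroy at position 1, which matches the expectation that the rt_free() calls are queued before the drain.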

—
Kristof
