On 12/05/2026 11:57, Yuan Tan wrote:
On Tue, May 12, 2026 at 1:29 AM Nikolay Aleksandrov <[email protected]> wrote:

On 12/05/2026 07:31, Ren Wei wrote:
From: Nan Li <[email protected]>

The bridge local receive path may be deferred by netfilter and resumed
later. By the time br_handle_local_finish() runs, skb->dev may still be
valid while its bridge port association has already been removed.

br_handle_local_finish() unconditionally looks up the bridge port from
skb->dev and dereferences it for source learning. If the port is no
longer attached to the bridge, the lookup returns NULL and the deferred
local receive path can no longer rely on the port state being present.

Skip the learning step when the bridge port lookup fails. In that case
there is no port state left to learn on, so returning early preserves
the normal behavior for existing ports while avoiding access to stale
state.

Fixes: 8626c56c8279 ("bridge: fix potential use-after-free when hook returns QUEUE 
or STOLEN verdict")

I don't think that is the correct commit, it seems to me this bug
has existed for a very long time. From a quick search I think (Florian
please correct me if I'm wrong, but I think NF_QUEUE existed back then)
it was introduced in 2010 by:
   f350a0a87374 ("bridge: use rx_handler_data pointer to store net_bridge_port
pointer")


After checking the history, I believe f350a0a87374 is indeed the
commit that introduced the underlying root cause.

The 8626c56c8279 commit in the patch is the one that actually made the
bug reachable in practice. I am a bit unsure which commit should be
used in the Fixes: tag. We have run into this situation several times
already, where the commit that introduced the root cause is different
from the commit that actually made the bug triggerable/reachable.


Hmm could you please elaborate? How did that commit make it reachable?
I can see the call was done before it as well:
                /* Deliver packet to local host only */
-               if (NF_HOOK(NFPROTO_BRIDGE, NF_BR_LOCAL_IN,
-                           dev_net(skb->dev), NULL, skb, skb->dev, NULL,
-                           br_handle_local_finish)) {
-                       return RX_HANDLER_CONSUMED; /* consumed by filter */
-               } else {
-                       *pskb = skb;
-                       return RX_HANDLER_PASS; /* continue processing */
-               }
+               NF_HOOK(NFPROTO_BRIDGE, NF_BR_LOCAL_IN, dev_net(skb->dev),
+                       NULL, skb, skb->dev, NULL, br_handle_local_finish);
+               return RX_HANDLER_CONSUMED;

NF_HOOK() would return 1 for NF_QUEUEed packet so essentially it was doing
the same, how would did it make the bug triggerable?

What do you think would be best?

Also, if the Fixes: tag should be changed, would you prefer that we
resend the patch, or would you rather have the committer adjust the
Fixes: tag when applying it in order to reduce traffic on netdev?



You should update the Fixes: tag but also wait 24h before re-posting another
patch version.


because that commit removed the same check for a NULL port.
The patch itself is ok, it restores the check that was there before the commit
I mentioned.

Cc: [email protected]
Reported-by: Yuan Tan <[email protected]>
Reported-by: Yifan Wu <[email protected]>
Reported-by: Juefei Pu <[email protected]>
Reported-by: Xin Liu <[email protected]>
Signed-off-by: Nan Li <[email protected]>
Signed-off-by: Ren Wei <[email protected]>
---
   net/bridge/br_input.c | 3 +++
   1 file changed, 3 insertions(+)

diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 2cbae0f9ae1f..5b0d7450de5f 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -247,6 +247,9 @@ static void __br_handle_local_finish(struct sk_buff *skb)
       struct net_bridge_port *p = br_port_get_rcu(skb->dev);
       u16 vid = 0;

+     if (unlikely(!p))
+             return;
+
       /* check if vlan is allowed, to avoid spoofing */
       if ((p->flags & BR_LEARNING) &&
           nbp_state_should_learn(p) &&




Reply via email to