Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] State machine corruption in mana_detach leads to permanently bricked port.
- [Medium] The commit message falsely claims the patch fixes a NULL pointer dereference, indicating a misdiagnosis of the original bug.
--
commit 9693e5b39c52406711e627ad84a6eb7471551a72
Author: Dipayaan Roy <[email protected]>

net: mana: Skip redundant detach on already-detached port

This commit adds an early exit to mana_detach() to prevent redundant
teardown when a port is already detached due to a previous failed attach.

> When mana_per_port_queue_reset_work_handler() runs after a previous
> detach succeeded but attach failed, the port is left in a detached
> state with apc->tx_qp and apc->rxqs already freed. Calling
> mana_detach() again unconditionally leads to NULL pointer dereferences
> during queue teardown.

[Severity: Medium]
Is this description of the crash accurate?

Looking at the unpatched code, if mana_attach() fails, apc->port_is_up is
left as false. If mana_detach() is called again, apc->port_st_save is set
to apc->port_is_up (which is false). Because mana_dealloc_queues() is
guarded by if (apc->port_st_save), queue teardown would be entirely
skipped.

Could the actual issue with the old code be that apc->port_st_save was
incorrectly cleared, preventing future mana_attach() retries from
allocating queues and recovering the port, rather than a NULL pointer
dereference?

> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index 9e7e4bf526bf6..c9b1df1ed1098 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -3350,6 +3350,12 @@ int mana_detach(struct net_device *ndev, bool from_close)
>
>  	ASSERT_RTNL();
>
> +	/* If already detached (indicates detach succeeded but attach failed
> +	 * previously). Now skip mana detach and just retry mana_attach.
> +	 */
> +	if (!from_close && !netif_device_present(ndev))
> +		return 0;
> +

[Severity: High]
Can this early return cause state machine corruption by bypassing the
updates to apc->port_st_save?

Consider this sequence:

1. queue_reset_work runs, mana_detach() succeeds (apc->port_st_save = true,
   apc->port_is_up = false), but mana_attach() fails.
2. The admin brings the interface down (ip link set dev eth0 down),
   skipping mana_close() since apc->port_is_up is false.
3. The admin changes the MTU, triggering mana_change_mtu() which calls
   mana_detach() followed by mana_attach().
4. mana_detach() hits this new early return, preserving
   apc->port_st_save == true.

When mana_attach() runs, it sees apc->port_st_save == true and allocates
queues, setting apc->vport_use_count = 1 and apc->port_is_up = true, even
though the interface is administratively down.

If the admin then brings the interface up, mana_open() will
unconditionally call mana_alloc_queues(). That function calls
mana_cfg_vport(), which will return -EBUSY because apc->vport_use_count is
already 1. This leaves mana_open() failing and the interface down.

Since the interface is already down, trying to bring it down again is a
no-op, meaning mana_close() is never called to clean up the orphaned
queues.

Does this sequence permanently brick the port until the driver is
reloaded?

>  	apc->port_st_save = apc->port_is_up;
>  	apc->port_is_up = false;

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/[email protected]?part=2
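The [Severity: High] sequence can be replayed in a small user-space model.
This is only a sketch under stated assumptions, not the driver code: the
struct, the model_* helpers, the injected attach failure, and
replay_sequence() are all hypothetical, and the transitions are simplified
from the behavior described in this review (for example, the assumption
that mana_attach() sets port_is_up from port_st_save and that the vport
configuration step fails with -EBUSY when vport_use_count is non-zero).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the per-port state discussed in the review. */
struct port_model {
	bool port_is_up;      /* models apc->port_is_up */
	bool port_st_save;    /* models apc->port_st_save */
	bool present;         /* models netif_device_present(ndev) */
	int vport_use_count;  /* models apc->vport_use_count */
};

/* Stands in for mana_alloc_queues() -> mana_cfg_vport(). */
static int model_alloc_queues(struct port_model *p)
{
	if (p->vport_use_count > 0)
		return -EBUSY;
	p->vport_use_count++;
	return 0;
}

static void model_dealloc_queues(struct port_model *p)
{
	assert(p->vport_use_count > 0);
	p->vport_use_count--;
}

/* 'patched' selects whether the new early return is modelled. */
static int model_detach(struct port_model *p, bool from_close, bool patched)
{
	if (patched && !from_close && !p->present)
		return 0; /* early return: port_st_save is left untouched */

	p->port_st_save = p->port_is_up;
	p->port_is_up = false;
	if (p->port_st_save)
		model_dealloc_queues(p);
	if (!from_close)
		p->present = false;
	return 0;
}

/* 'fail' injects an attach failure before any state is changed. */
static int model_attach(struct port_model *p, bool fail)
{
	if (fail)
		return -ENOMEM;
	if (p->port_st_save) {
		int err = model_alloc_queues(p);

		if (err)
			return err;
	}
	p->port_is_up = p->port_st_save;
	p->present = true;
	return 0;
}

static int model_open(struct port_model *p)
{
	int err = model_alloc_queues(p);

	if (err)
		return err;
	p->port_is_up = true;
	return 0;
}

static int model_close(struct port_model *p, bool patched)
{
	if (!p->port_is_up)
		return 0; /* skipped, as in step 2 */
	return model_detach(p, true, patched);
}

/* Replays steps 1-4 and returns the result of the final open. */
int replay_sequence(bool patched)
{
	struct port_model p = { .present = true };

	model_open(&p);                   /* port starts out up */
	model_detach(&p, false, patched); /* step 1: reset work detaches */
	model_attach(&p, true);           /* step 1: attach fails */
	model_close(&p, patched);         /* step 2: admin down */
	model_detach(&p, false, patched); /* steps 3-4: MTU change */
	model_attach(&p, false);
	return model_open(&p);            /* admin brings the port up */
}
```

Within this model, replay_sequence(true) returns -EBUSY from the final
open, while replay_sequence(false) returns 0 because the second detach
clears port_st_save and the subsequent attach allocates nothing.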
