Hello,

The nipa/ai-review bot found these problems:

https://netdev-ai.bots.linux.dev/ai-review.html?id=b046d75a-2de1-4063-bb96-8f64abc65e7d

commit 0f3cb3b3bc94eaff9b9709b528e498fe6150f9c1
Author: Matias Ezequiel Vara Larsen <[email protected]>

can: virtio: Add virtio CAN driver

This commit adds a virtio CAN driver implementing a complete CAN bus
interface over Virtio transport, supporting both CAN Classic and CAN-FD
frames.

> diff --git a/drivers/net/can/virtio_can.c b/drivers/net/can/virtio_can.c
> new file mode 100644
> index 000000000000..57641c3ed6d0
> --- /dev/null
> +++ b/drivers/net/can/virtio_can.c

[ ... ]

> +static int __maybe_unused virtio_can_restore(struct virtio_device *vdev)
> +{
> +     struct virtio_can_priv *priv = vdev->priv;
> +     struct net_device *ndev = priv->dev;
> +     size_t size;
> +     int err;
> +
> +     napi_enable(&priv->napi);
> +     napi_enable(&priv->napi_tx);
> +
> +     err = virtio_can_find_vqs(priv);
> +     if (err != 0)
> +             return err;

What happens to the enabled NAPI instances when this error path is taken?
virtio_can_restore() enables both NAPI instances at the start, but if
virtio_can_find_vqs() fails here, the function returns without calling
napi_disable().

The corresponding virtio_can_freeze() function disables both NAPIs before
returning, so the restore function leaves the NAPI subsystem in an
inconsistent state on failure.

If the PM framework retries restore after this failure, the second
napi_enable() call on an already-enabled NAPI will hit the BUG_ON in
napi_enable_locked():

net/core/dev.c:napi_enable_locked() {
        BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state));
        ...
}

In contrast, virtio_can_probe() enables NAPI only after all setup
succeeds, avoiding this issue.

> +
> +     size = priv->rpkt_len * priv->vqs[VIRTIO_CAN_QUEUE_RX]->num_free;
> +     priv->rpkt = devm_krealloc(&vdev->dev, priv->rpkt, size, GFP_KERNEL | __GFP_ZERO);
> +     if (!priv->rpkt) {
> +             virtio_can_del_vq(vdev);
> +             return -ENOMEM;
> +     }

The same NAPI state issue occurs on this error path.

> +     virtio_can_populate_rx_vq(vdev);
> +
> +     if (netif_running(ndev)) {
> +             err = virtio_can_start(ndev);
> +             if (err) {
> +                     virtio_can_del_vq(vdev);
> +                     return err;
> +             }

And again on this error path.

Should all three error paths call napi_disable(&priv->napi_tx) and
napi_disable(&priv->napi) before returning the error code?
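If so, one way to unwind them is a pair of goto labels, along these lines
(a sketch assembled only from the quoted hunks, untested; the label names
are mine, and the tail of the function is elided as in the quote):

```c
static int __maybe_unused virtio_can_restore(struct virtio_device *vdev)
{
	struct virtio_can_priv *priv = vdev->priv;
	struct net_device *ndev = priv->dev;
	size_t size;
	int err;

	napi_enable(&priv->napi);
	napi_enable(&priv->napi_tx);

	err = virtio_can_find_vqs(priv);
	if (err != 0)
		goto err_napi;

	size = priv->rpkt_len * priv->vqs[VIRTIO_CAN_QUEUE_RX]->num_free;
	priv->rpkt = devm_krealloc(&vdev->dev, priv->rpkt, size,
				   GFP_KERNEL | __GFP_ZERO);
	if (!priv->rpkt) {
		err = -ENOMEM;
		goto err_del_vq;
	}

	virtio_can_populate_rx_vq(vdev);

	if (netif_running(ndev)) {
		err = virtio_can_start(ndev);
		if (err)
			goto err_del_vq;
	}

	/* ... remainder as in the patch ... */

err_del_vq:
	virtio_can_del_vq(vdev);
err_napi:
	/* Disable in reverse order of enabling, matching virtio_can_freeze(),
	 * so a retried restore finds both NAPIs disabled. */
	napi_disable(&priv->napi_tx);
	napi_disable(&priv->napi);
	return err;
}
```

Alternatively, moving both napi_enable() calls after the last failure
point, as virtio_can_probe() does, would avoid the unwinding entirely.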

-- 
Pengutronix e.K.                 | Marc Kleine-Budde          |
Embedded Linux                   | https://www.pengutronix.de |
Vertretung Nürnberg              | Phone: +49-5121-206917-129 |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-9   |
