The nipa/ai-review bot found these problems: | https://netdev-ai.bots.linux.dev/ai-review.html?id=e2753951-0455-4ce7-ad95-3da80456b40e
commit bab72ffd354779765bf84bc17cc185e0662d3717 Author: Matias Ezequiel Vara Larsen <[email protected]> can: virtio: Add virtio CAN driver This commit adds a new virtio CAN driver implementing a complete CAN bus interface over Virtio transport, supporting both CAN Classic and CAN-FD frames. > diff --git a/drivers/net/can/Kconfig b/drivers/net/can/Kconfig > index e15e320db476..ec941f3c25f6 100644 > --- a/drivers/net/can/Kconfig > +++ b/drivers/net/can/Kconfig > @@ -234,6 +234,18 @@ config CAN_XILINXCAN > Xilinx CAN driver. This driver supports both soft AXI CAN IP and > Zynq CANPS IP. > > +config CAN_VIRTIO_CAN > + depends on VIRTIO > + tristate "Virtio CAN device support" The CAN_VIRTIO_CAN config entry is placed after CAN_XILINXCAN, breaking the alphabetical ordering maintained by all other config entries in this file. Since V comes before X alphabetically, should CAN_VIRTIO_CAN be placed between CAN_TI_HECC and CAN_XILINXCAN? The Makefile entry is correctly ordered (virtio_can.o appears before xilinx_can.o), which makes this inconsistency more evident. > + default n > + help > + Say Y here if you want to support for Virtio CAN. > + > + To compile this driver as a module, choose M here: the > + module will be called virtio-can. > + > + If unsure, say N. > + [ ... ] > diff --git a/drivers/net/can/virtio_can.c b/drivers/net/can/virtio_can.c > new file mode 100644 > index 000000000000..8d60336b3ebe > --- /dev/null > +++ b/drivers/net/can/virtio_can.c > @@ -0,0 +1,934 @@ [ ... ] > +static int virtio_can_alloc_tx_idx(struct virtio_can_priv *priv) > +{ > + int tx_idx; > + > + tx_idx = ida_alloc_max(&priv->tx_putidx_ida, > + priv->can.echo_skb_max - 1, GFP_KERNEL); Is GFP_KERNEL safe here? This function is called from virtio_can_start_xmit(), which is the ndo_start_xmit callback. This callback can be invoked from softirq context with BH disabled, where sleeping is not allowed. Should this use GFP_ATOMIC to match the calling context? > + if (tx_idx >= 0) > + atomic_inc(&priv->tx_inflight); > + > + return tx_idx; > +} [ ... ] > +static int virtio_can_open(struct net_device *ndev) > +{ > + virtio_can_start(ndev); > + > + netif_start_queue(ndev); > + > + return 0; > +} Is open_candev() missing? The CAN framework requires drivers to call open_candev() in their ndo_open handler. open_candev() validates that bittiming has been configured (returns -EINVAL if bitrate is 0), checks CAN FD data bitrate requirements, and manages carrier state. Without this call, can the device be brought up without configuring bitrate? The driver does call close_candev() in virtio_can_close(), showing awareness of the CAN lifecycle. Every other CAN driver in the kernel calls open_candev() in its open function. [ ... ] > +static netdev_tx_t virtio_can_start_xmit(struct sk_buff *skb, > + struct net_device *dev) > +{ [ ... ] > + can_tx_msg = kzalloc(sizeof(*can_tx_msg) + cf->len, GFP_ATOMIC); > + if (!can_tx_msg) { > + dev->stats.tx_dropped++; > + goto kick; /* No way to return NET_XMIT_DROP here */ > + } Does this leak the skb? When can_dev_dropped_skb() returns false (valid frame, skb not consumed) and kzalloc() for can_tx_msg fails, the code jumps to 'kick' without freeing the skb. The skb is still owned by the caller at this point. Should kfree_skb(skb) be called before goto kick? > + > + can_tx_msg->tx_out.msg_type = cpu_to_le16(VIRTIO_CAN_TX); > + can_flags = 0; [ ... ] > + putidx = virtio_can_alloc_tx_idx(priv); > + > + if (unlikely(putidx < 0)) { > + /* -ENOMEM or -ENOSPC here. -ENOSPC should not be possible as > + * tx_inflight >= can.echo_skb_max is checked in flow control > + */ > + WARN_ON_ONCE(putidx == -ENOSPC); > + kfree(can_tx_msg); > + dev->stats.tx_dropped++; > + goto kick; /* No way to return NET_XMIT_DROP here */ > + } Does this also leak the skb? When ida_alloc_max() returns -ENOMEM, can_tx_msg is correctly freed but the skb is not. can_put_echo_skb() has not been called yet at this point, so the skb is still owned by virtio_can_start_xmit. Should kfree_skb(skb) be called before goto kick? > + > + can_tx_msg->putidx = (unsigned int)putidx; > + > + /* Push loopback echo. Will be looped back on TX interrupt/TX NAPI */ > + can_put_echo_skb(skb, dev, can_tx_msg->putidx, 0); [ ... ] > +static void virtio_can_populate_rx_vq(struct virtio_device *vdev) > +{ > + struct virtio_can_priv *priv = vdev->priv; > + struct virtqueue *vq = priv->vqs[VIRTIO_CAN_QUEUE_RX]; > + struct virtio_can_rx *buf = priv->rpkt; > + unsigned int buf_size = priv->rpkt_len; > + int num_elements = vq->num_free; > + unsigned int idx; > + int ret; > + > + for (idx = 0; idx < num_elements; idx++) { > + ret = virtio_can_add_inbuf(vq, buf, buf_size); > + if (ret < 0) { > + dev_dbg(&vdev->dev, "rpkt fill: ret=%d, idx=%u, > size=%u\n", > + ret, idx, buf_size); > + break; > + } > + buf += buf_size; > + } Does this cause an out-of-bounds memory access? 'buf' is declared as 'struct virtio_can_rx *' and 'buf_size' equals 'priv->rpkt_len' (sizeof(struct virtio_can_rx) + CAN[FD]_MAX_DLEN). The expression 'buf += buf_size' advances by buf_size * sizeof(struct virtio_can_rx) bytes instead of the intended buf_size bytes, because C pointer arithmetic scales by the pointed-to type size. For CAN classic (rpkt_len=24, sizeof(struct virtio_can_rx)=16), this advances 384 bytes per iteration instead of 24. The allocation in virtio_can_probe only provides rpkt_len bytes per element, so doesn't the second and all subsequent loop iterations pass out-of-bounds pointers to the virtqueue as RX buffers? When the device writes to these buffers, can this corrupt adjacent kernel memory? virtio_can_probe(): priv->rpkt = devm_kzalloc(&vdev->dev, priv->rpkt_len * priv->vqs[VIRTIO_CAN_QUEUE_RX]->num_free, GFP_KERNEL); Should a 'char *' or 'u8 *' pointer be used for the byte-level iteration instead, e.g. 'u8 *p = (u8 *)priv->rpkt; ... p += buf_size;' and pass 'p' to virtio_can_add_inbuf? > + > + if (!ret) Does this use an uninitialized variable? The variable 'ret' is declared as 'int ret;' without initialization. If vq->num_free equals 0, the for loop body never executes, and 'ret' is read uninitialized here. Should this be initialized to 0? > + virtqueue_kick(vq); > + > + dev_dbg(&vdev->dev, "%u rpkt added\n", idx); > +} [ ... ] > +/* Function must not be called before virtio_can_find_vqs() has been run */ > +static void virtio_can_del_vq(struct virtio_device *vdev) > +{ > + struct virtio_can_priv *priv = vdev->priv; > + int q; > + > + /* Reset the device */ > + if (vdev->config->reset) > + vdev->config->reset(vdev); Should this use virtio_reset_device() instead? When CONFIG_VIRTIO_HARDEN_NOTIFICATION is enabled, virtio_reset_device() calls virtio_break_device() to mark all virtqueues as broken and virtio_synchronize_cbs() to wait for in-flight callbacks to complete before performing the actual reset. Calling config->reset directly skips these hardening steps, meaning virtqueue callbacks may still be executing when del_vqs destroys the queues. All other virtio network drivers (virtio_net.c, caif_virtio.c, mac80211_hwsim.c) use virtio_reset_device(). The NULL check is also unnecessary as this op is always present in virtio devices. > + > + /* From here we have dead silence from the device side so no locks > + * are needed to protect against device side events. > + */ > + > + for (q = 0; q < VIRTIO_CAN_QUEUE_COUNT; q++) > + while (virtqueue_detach_unused_buf(priv->vqs[q])) > + ; > + > + if (vdev->config->del_vqs) > + vdev->config->del_vqs(vdev); > +} > + > +static void virtio_can_remove(struct virtio_device *vdev) > +{ > + struct virtio_can_priv *priv = vdev->priv; > + struct net_device *dev = priv->dev; > + > + unregister_candev(dev); > + > + virtio_can_del_vq(vdev); Are napi_disable() calls missing here? The probe function enables both NAPI instances: virtio_can_probe(): napi_enable(&priv->napi); napi_enable(&priv->napi_tx); And virtio_can_freeze() correctly disables them: virtio_can_freeze(): napi_disable(&priv->napi); napi_disable(&priv->napi_tx); But virtio_can_remove() never disables NAPI. The ndo_stop callback (virtio_can_close) intentionally leaves NAPI active per its comment. Can the NAPI poll callbacks (virtio_can_rx_poll, virtio_can_tx_poll) execute concurrently with virtio_can_del_vq() deleting virtqueues and virtio_can_free_candev() freeing the net_device? Should napi_disable() be called for both instances before unregister_candev()? > + > + virtio_can_free_candev(dev); > +} [ ... ] -- Pengutronix e.K. | Marc Kleine-Budde | Embedded Linux | https://www.pengutronix.de | Vertretung Nürnberg | Phone: +49-5121-206917-129 | Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-9 |
signature.asc
Description: PGP signature
