Thank you for your contribution! Sashiko AI review found 3 potential issue(s) 
to consider:
- [High] Use-after-free and workqueue corruption due to a race between NAPI 
polling and interface teardown.
- [High] Unconditional use of GDMA_MESSAGE_V3 breaks backward compatibility 
with older firmware.
- [Medium] Static interrupt moderation is silently disabled if only one 
parameter is configured.
--

commit ec13708996b121633f56552fac3f0ceba004bf71
Author: Haiyang Zhang <[email protected]>

net: mana: Add Interrupt Moderation support

This commit adds Static and Dynamic Interrupt Moderation (DIM) support for Rx
and Tx. It updates the queue creation procedure with new data structures and
settings, adds statistics collection functions for DIM, and introduces workers
to update DIM settings. It also hooks into ethtool to allow configuration.

> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c 
> b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index db14357d3732e..b1e0c444f4144 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -1551,6 +1551,9 @@ int mana_create_wq_obj(struct mana_port_context *apc,
>  
>       mana_gd_init_req_hdr(&req.hdr, MANA_CREATE_WQ_OBJ,
>                            sizeof(req), sizeof(resp));
> +
> +     req.hdr.req.msg_version = GDMA_MESSAGE_V3;
> +     req.hdr.resp.msg_version = GDMA_MESSAGE_V2;
>       req.vport = vport;

[Severity: High]
Will this break initialization on older firmware? We unconditionally set
msg_version to GDMA_MESSAGE_V3 here without checking if the hardware actually
supports it.

If an older Azure host without the GDMA_PF_CAP_FLAG_1_DYN_INTERRUPT_MODERATION
capability rejects the V3 request, could this cause queue creation to fail
and result in a complete loss of network connectivity?

[ ... ]
> @@ -2274,7 +2344,13 @@ static int mana_cq_handler(void *context, struct 
> gdma_queue *gdma_queue)
>       if (w < cq->budget) {
>               mana_gd_ring_cq(gdma_queue, SET_ARM_BIT);
>               cq->work_done_since_doorbell = 0;
> -             napi_complete_done(&cq->napi, w);
> +
> +             if (napi_complete_done(&cq->napi, w)) {
> +                     if (cq->type == MANA_CQ_TYPE_RX)
> +                             mana_update_rx_dim(cq);
> +                     else
> +                             mana_update_tx_dim(cq);
> +             }

[Severity: High]
Is there a race condition here during teardown? If napi_complete_done()
clears the scheduling state, a concurrent teardown thread in
mana_destroy_rxq() can return from napi_disable_locked() and proceed to call
cancel_work_sync().

Because mana_update_rx_dim() hasn't been called yet to schedule the work,
cancel_work_sync() would return immediately and the queue memory would be
freed. When the NAPI thread resumes and calls mana_update_rx_dim(), could
this lead to a use-after-free or corruption of the workqueue linked list
by scheduling work on freed memory?

[ ... ]
> @@ -2783,6 +2869,10 @@ static struct mana_rxq *mana_create_rxq(struct 
> mana_port_context *apc,
>       cq_spec.queue_size = cq->gdma_cq->queue_size;
>       cq_spec.modr_ctx_id = 0;
>       cq_spec.attached_eq = cq->gdma_cq->cq.parent->id;
> +     cq_spec.req_cq_moderation = apc->rx_dim_enabled ||
> +             (apc->intr_modr_rx_usec && apc->intr_modr_rx_comp);
> +     cq_spec.cq_moderation_usec = apc->intr_modr_rx_usec;
> +     cq_spec.cq_moderation_comp = apc->intr_modr_rx_comp;

[Severity: Medium]
Does this logic unintentionally ignore user configurations where only one
parameter is non-zero? 

In mana_set_coalesce(), ethtool allows setting rx-usecs without setting
rx-frames. If a user sets only time-based moderation (e.g., rx-usecs 50,
rx-frames 0), the condition here evaluates to false and silently disables
static moderation without returning an error via ethtool.

-- 
Sashiko AI review ยท 
https://sashiko.dev/#/patchset/[email protected]?part=1

Reply via email to