Thank you for your contribution! Sashiko AI review found 3 potential issue(s)
to consider:
- [High] Use-after-free and workqueue corruption due to a race between NAPI polling and interface teardown.
- [High] Unconditional use of GDMA_MESSAGE_V3 breaks backward compatibility with older firmware.
- [Medium] Static interrupt moderation is silently disabled if only one parameter is configured.
--
commit ec13708996b121633f56552fac3f0ceba004bf71
Author: Haiyang Zhang <[email protected]>

net: mana: Add Interrupt Moderation support

This commit adds Static and Dynamic Interrupt Moderation (DIM) support for
Rx and Tx. It updates the queue creation procedure with new data structures
and settings, adds statistics collection functions for DIM, and introduces
workers to update DIM settings. It also hooks into ethtool to allow
configuration.

> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c
> b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index db14357d3732e..b1e0c444f4144 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -1551,6 +1551,9 @@ int mana_create_wq_obj(struct mana_port_context *apc,
>
>  	mana_gd_init_req_hdr(&req.hdr, MANA_CREATE_WQ_OBJ,
>  			     sizeof(req), sizeof(resp));
> +
> +	req.hdr.req.msg_version = GDMA_MESSAGE_V3;
> +	req.hdr.resp.msg_version = GDMA_MESSAGE_V2;
>  	req.vport = vport;

[Severity: High]
Will this break initialization on older firmware?

We unconditionally set msg_version to GDMA_MESSAGE_V3 here without checking
if the hardware actually supports it. If an older Azure host without the
GDMA_PF_CAP_FLAG_1_DYN_INTERRUPT_MODERATION capability rejects the V3
request, could this cause queue creation to fail and result in a complete
loss of network connectivity?

[ ... ]

> @@ -2274,7 +2344,13 @@ static int mana_cq_handler(void *context, struct
> gdma_queue *gdma_queue)
>  	if (w < cq->budget) {
>  		mana_gd_ring_cq(gdma_queue, SET_ARM_BIT);
>  		cq->work_done_since_doorbell = 0;
> -		napi_complete_done(&cq->napi, w);
> +
> +		if (napi_complete_done(&cq->napi, w)) {
> +			if (cq->type == MANA_CQ_TYPE_RX)
> +				mana_update_rx_dim(cq);
> +			else
> +				mana_update_tx_dim(cq);
> +		}

[Severity: High]
Is there a race condition here during teardown?
If napi_complete_done() clears the scheduling state, a concurrent teardown
thread in mana_destroy_rxq() can return from napi_disable_locked() and
proceed to call cancel_work_sync(). Because mana_update_rx_dim() hasn't been
called yet to schedule the work, cancel_work_sync() would return immediately
and the queue memory would be freed.

When the NAPI thread resumes and calls mana_update_rx_dim(), could this lead
to a use-after-free or corruption of the workqueue linked list by scheduling
work on freed memory?

[ ... ]

> @@ -2783,6 +2869,10 @@ static struct mana_rxq *mana_create_rxq(struct
> mana_port_context *apc,
>  	cq_spec.queue_size = cq->gdma_cq->queue_size;
>  	cq_spec.modr_ctx_id = 0;
>  	cq_spec.attached_eq = cq->gdma_cq->cq.parent->id;
> +	cq_spec.req_cq_moderation = apc->rx_dim_enabled ||
> +		(apc->intr_modr_rx_usec && apc->intr_modr_rx_comp);
> +	cq_spec.cq_moderation_usec = apc->intr_modr_rx_usec;
> +	cq_spec.cq_moderation_comp = apc->intr_modr_rx_comp;

[Severity: Medium]
Does this logic unintentionally ignore user configurations where only one
parameter is non-zero?

In mana_set_coalesce(), ethtool allows setting rx-usecs without setting
rx-frames. If a user sets only time-based moderation (e.g., rx-usecs 50,
rx-frames 0), the condition here evaluates to false and silently disables
static moderation without returning an error via ethtool.

--
Sashiko AI review · https://sashiko.dev/#/patchset/[email protected]?part=1
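
To make the [Severity: Medium] case above concrete, the following is a minimal userspace sketch, not driver code. It models only the boolean condition from the quoted mana_create_rxq() hunk. The struct modr_cfg type and both function names are hypothetical stand-ins introduced for illustration. The "either parameter" variant is one possible alternative, and it assumes the firmware accepts a zero value for one of the two parameters, which the patch does not confirm. Rejecting such a configuration in mana_set_coalesce() would be another option.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the three moderation fields read from
 * struct mana_port_context in the quoted hunk.
 */
struct modr_cfg {
	bool dim_enabled;	/* models apc->rx_dim_enabled */
	uint32_t usec;		/* models apc->intr_modr_rx_usec */
	uint32_t comp;		/* models apc->intr_modr_rx_comp */
};

/* Condition as posted: static moderation needs BOTH values non-zero. */
bool req_moderation_as_posted(const struct modr_cfg *c)
{
	return c->dim_enabled || (c->usec && c->comp);
}

/*
 * Possible alternative (assumption): request moderation when EITHER
 * value is non-zero, so a time-only or count-only setting is honored.
 */
bool req_moderation_either(const struct modr_cfg *c)
{
	return c->dim_enabled || c->usec || c->comp;
}
```

With DIM off, rx-usecs 50 and rx-frames 0, the posted condition yields false while the alternative yields true, which is the silent-disable behavior the comment asks about.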
