> Subject: [PATCH] net/iavf: fix VF reset race and stale ARQ message handling
>
> During VF reset, multiple issues can lead to initialization instability.
>
> The first issue is a race condition in iavf_check_vf_reset_done(),
> where VFR state VFACTIVE is treated as both "reset not started" and
> "reset completed". This can allow the VF to proceed before the PF
> has acknowledged the reset, resulting in inconsistent initialization.
>
> The second issue is the presence of stale messages in the Admin
> Receive Queue (ARQ) after VF reset. These may include opcode 0
> (VIRTCHNL_OP_UNKNOWN) or responses to commands issued before reset,
> which can interfere with API negotiation and cause command mismatch
> errors.
>
> Additionally, opcode 0 messages generate excessive warning logs,
> causing unnecessary noise during initialization.
>
> The solution involves:
> 1. Introducing a two-phase reset wait mechanism in
> iavf_check_vf_reset_done():
> - Wait for RSTAT to leave VFACTIVE to confirm reset start.
> - Wait for RSTAT to return to VFACTIVE or COMPLETED to confirm
> reset completion.
>
> 2. Draining stale ARQ messages after admin queue initialization
> and before issuing commands in iavf_init_vf().
>
> 3. Downgrading opcode 0 message logging to DEBUG level while
> preserving mismatch detection for other opcodes.
>
> 4. Refactoring reset-start detection and ARQ drain logic into
> helper functions (iavf_wait_for_reset_start() and
> iavf_drain_arq()) to improve readability and reuse.
>
> This ensures reliable VF reset handling and stable initialization.
>
> Fixes: 28a1a72eac26 ("net/iavf: add VF initiated reset")
Is the fix only for the VF initiated reset sequence, or does it aim to
fix other reset sequences as well eg. PF initiated?
> Cc: [email protected]
>
> Signed-off-by: Talluri Chaitanyababu <[email protected]>
> ---
> drivers/net/intel/iavf/iavf_ethdev.c | 54
> ++++++++++++++++++++++++++++
> drivers/net/intel/iavf/iavf_vchnl.c | 16 +++++++--
> 2 files changed, 67 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/intel/iavf/iavf_ethdev.c
> b/drivers/net/intel/iavf/iavf_ethdev.c
> index 1eca20bc9a..692e4455dc 100644
> --- a/drivers/net/intel/iavf/iavf_ethdev.c
> +++ b/drivers/net/intel/iavf/iavf_ethdev.c
> @@ -2002,11 +2002,36 @@ iavf_dev_rx_queue_intr_disable(struct
> rte_eth_dev *dev, uint16_t queue_id)
> return 0;
> }
>
> +/* Wait until PF acknowledges VF reset (RSTAT leaves VFACTIVE) */
> +static int
> +iavf_wait_for_reset_start(struct iavf_hw *hw)
> +{
> + int i;
> + uint32_t rstat;
> +
> + for (i = 0; i < 100; i++) {
> + rte_delay_ms(10);
> +
> + rstat = IAVF_READ_REG(hw, IAVF_VFGEN_RSTAT);
> + rstat &= IAVF_VFGEN_RSTAT_VFR_STATE_MASK;
> + rstat >>= IAVF_VFGEN_RSTAT_VFR_STATE_SHIFT;
> +
> + if (rstat != VIRTCHNL_VFR_VFACTIVE)
> + return 0;
> + }
> +
> + return -1;
> +}
> +
> static int
> iavf_check_vf_reset_done(struct iavf_hw *hw)
> {
> int i, reset;
>
> + /* Phase 1: wait for reset to start (leave VFACTIVE) */
> + if (iavf_wait_for_reset_start(hw) != 0)
> + PMD_DRV_LOG(DEBUG, "VF reset did not start within
> timeout");
There are multiple paths that lead to the iavf_wait_for_reset_start
function being called, including during device initialisation
iavf_init_vf -> iavf_check_vf_reset_done. In this case, we are not
expecting a reset event, just making sure we are not trying to
initialise a VF in the middle of a reset. I think the
iavf_wait_for_reset_start is not needed here, and the associated log
would be incorrect. Can you check that
iavf_wait_for_reset_start is only triggered when absolutely necessary.
Consider these events: device init, pf initiated reset, vf initiated
reset, queue pair change (iavf_queues_req_reset)
> +
> for (i = 0; i < IAVF_RESET_WAIT_CNT; i++) {
> reset = IAVF_READ_REG(hw, IAVF_VFGEN_RSTAT) &
> IAVF_VFGEN_RSTAT_VFR_STATE_MASK;
> @@ -2517,6 +2542,31 @@ iavf_init_proto_xtr(struct rte_eth_dev *dev)
> }
> }
>
> +/* Drain stale Admin Receive Queue messages after reset */
> +static void
> +iavf_drain_arq(struct iavf_hw *hw, struct iavf_info *vf)
> +{
> + struct iavf_arq_event_info event;
> + int drain_count = 0;
> +
> + memset(&event, 0, sizeof(event));
> + event.msg_buf = vf->aq_resp;
> +
> + while (drain_count < IAVF_AQ_LEN) {
> + event.buf_len = IAVF_AQ_BUF_SZ;
> +
> + if (iavf_clean_arq_element(hw, &event, NULL) !=
> IAVF_SUCCESS)
> + break;
> +
> + drain_count++;
> + }
> +
> + if (drain_count > 0)
> + PMD_INIT_LOG(DEBUG,
> + "Drained %d stale ARQ messages",
> + drain_count);
> +}
> +
> static int
> iavf_init_vf(struct rte_eth_dev *dev)
> {
> @@ -2558,6 +2608,10 @@ iavf_init_vf(struct rte_eth_dev *dev)
> PMD_INIT_LOG(ERR, "unable to allocate vf_aq_resp
> memory");
> goto err_aq;
> }
> +
> + /* Drain stale ARQ messages after VF reset */
> + iavf_drain_arq(hw, vf);
Would we expect stale messages in the adminq here, it was only just
allocated a few lines above?
> +
> if (iavf_check_api_version(adapter) != 0) {
> PMD_INIT_LOG(ERR, "check_api version failed");
> goto err_api;
> diff --git a/drivers/net/intel/iavf/iavf_vchnl.c
> b/drivers/net/intel/iavf/iavf_vchnl.c
> index 08dd6f2d7f..a014be8b57 100644
> --- a/drivers/net/intel/iavf/iavf_vchnl.c
> +++ b/drivers/net/intel/iavf/iavf_vchnl.c
> @@ -296,11 +296,21 @@ iavf_read_msg_from_pf(struct iavf_adapter
> *adapter, uint16_t buf_len,
> __func__, vpe->event);
> }
> } else {
> - /* async reply msg on command issued by vf previously */
> + /* Async reply for previously issued VF command.
> + * Stale messages from before reset are ignored, and polling
> + * continues until the expected response is received.
> + */
> result = IAVF_MSG_CMD;
> if (opcode != vf->pend_cmd) {
> - PMD_DRV_LOG(WARNING, "command mismatch,
> expect %u, get %u",
> - vf->pend_cmd, opcode);
> + if (opcode == VIRTCHNL_OP_UNKNOWN)
> + PMD_DRV_LOG(DEBUG,
> + "Ignoring stale msg (opcode
> 0), pending cmd %u",
> + vf->pend_cmd);
> + else
> + PMD_DRV_LOG(WARNING,
> + "command mismatch, expect
> %u, get %u",
> + vf->pend_cmd, opcode);
> +
> result = IAVF_MSG_ERR;
> }
> }
> --
> 2.43.0