> -----Original Message-----
> From: Simon Horman <[email protected]>
> Sent: Friday, May 15, 2026 8:11 PM
> To: Korba, Przemyslaw <[email protected]>
> Cc: 'Simon Horman' <[email protected]>; [email protected];
> [email protected]; Nguyen, Anthony L
> <[email protected]>; Kitszel, Przemyslaw
> <[email protected]>; Loktionov, Aleksandr
> <[email protected]>;
> Kubalewski, Arkadiusz <[email protected]>
> Subject: Re: [PATCH iwl-net] ice: suppress DPLL errors during reset recovery
>
> From: 'Simon Horman' <[email protected]>
>
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> Full review at: https://sashiko.dev
> ---
> ice: suppress DPLL errors during reset recovery
>
> This commit suppresses DPLL error logging and counter increments when the
> admin
> queue returns EBUSY. The intent is to avoid unnecessary warnings and prevent
> the DPLL periodic worker from being disabled during reset recovery.
>
> > diff --git a/drivers/net/ethernet/intel/ice/ice_dpll.c
> > b/drivers/net/ethernet/intel/ice/ice_dpll.c
> > index 0704e92ab0430..78b8836b534b3 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_dpll.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_dpll.c
> > @@ -784,7 +784,7 @@ ice_dpll_pin_state_update(struct ice_pf *pf, struct
> > ice_dpll_pin *pin,
> > ret,
> > libie_aq_str(pf->hw.adminq.sq_last_status),
> > pin_type_name[pin_type], pin->idx);
> > - else
> > + else if (pf->hw.adminq.sq_last_status != LIBIE_AQ_RC_EBUSY)
>
Hi, thank you for review!
> Could this check of sq_last_status race with other admin queue commands?
>
> The sq_last_status field seems to be global state that is modified by any
> admin queue command. If this is checked outside of the admin queue lock,
> could another thread (like an ethtool operation) issue an admin queue
> command and overwrite sq_last_status before this check runs?
>
> If it gets overwritten to LIBIE_AQ_RC_OK, this would falsely evaluate as not
> EBUSY and log the error anyway, incrementing the error counter.
The check happens immediately after the AQ call that set it.
In practice, the window for another thread to issue an AQ command between the
failure and the check is very small,
and even if it happens, the worst case has no functional impact
>
> [ ... ]
>
> > @@ -2955,7 +2957,9 @@ static void ice_dpll_periodic_work(struct
> > kthread_work *work)
> > d->periodic_counter % dp->phase_offset_monitor_period == 0)
> > ret = ice_dpll_pps_update_phase_offsets(pf, &phase_offset_ntf);
>
> Does the reset recovery suppression also need to be applied inside
> ice_dpll_pps_update_phase_offsets()?
>
I agree, will send next version.
> If ice_aq_get_cgu_input_pin_measure() fails with EBUSY during reset recovery,
> it looks like ice_dpll_pps_update_phase_offsets() only checks for
> LIBIE_AQ_RC_EAGAIN before logging an error, which might still spam the log
> with EBUSY errors.
>
> > if (ret) {
> > - d->cgu_state_acq_err_num++;
> > + /* EBUSY is expected during reset recovery */
> > + if (pf->hw.adminq.sq_last_status != LIBIE_AQ_RC_EBUSY)
> > + d->cgu_state_acq_err_num++;
> > /* stop rescheduling this worker */
> > if (d->cgu_state_acq_err_num >
> > ICE_CGU_STATE_ACQ_ERR_THRESHOLD) {
>
> By skipping the increment of the error counter on EBUSY, will this create an
> unbounded polling loop?
>
> Looking at the end of ice_dpll_periodic_work():
>
> resched:
> kthread_queue_delayed_work(d->kworker, &d->work,
> ret ? msecs_to_jiffies(10) :
> msecs_to_jiffies(500));
>
> If the firmware gets wedged and persistently returns EBUSY, the error
> threshold will never be reached, and the worker will continue to hammer the
> admin queue every 10ms forever. Is it expected to poll this frequently
> without a backoff mechanism or an upper bound?
Line 2950: if (ice_is_reset_in_progress(pf->state)) goto resched; - during
reset, no AQ operations are attempted, it just reschedules
EBUSY usually occurs only in the brief window after reset flag clears but
before AQ is fully ready
If FW is truly wedged, there are bigger problems, that should be detected by
other mechanisms.
Thanks again for review