On 2026-05-21 11:16:32+02:00, Paolo Abeni wrote: > From: AI Reviewer <[email protected]> > > This is an AI-generated review of your patch. The human sending this > email has considered the AI review valid, or at least plausible. > Full review at: https://netdev-ai.bots.linux.dev/sashiko/ > --- > net: ena: Add error bound to PHC communication layer > > This commit extends the ENA PHC communication layer to retrieve an > error_bound value from the device alongside the timestamp, and adds a > phc_err_eb statistic to count error_bound retrieval failures. > > > diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c > > b/drivers/net/ethernet/amazon/ena/ena_com.c > > index 257910460bd8..832c4c1d0e5e 100644 > > --- a/drivers/net/ethernet/amazon/ena/ena_com.c > > +++ b/drivers/net/ethernet/amazon/ena/ena_com.c > > [ ... ] > > > @@ -1882,14 +1888,15 @@ int ena_com_phc_get_timestamp(struct ena_com_dev > > *ena_dev, u64 *timestamp) > > } > > > > /* req_id was updated by the device which indicates that > > - * PHC timestamp and error_flags are updated too, > > - * checking errors before retrieving timestamp > > + * PHC timestamp, error_bound and error_flags are updated too, > > + * checking error flags before retrieving timestamp and > > + * error_bound values > > */ > > if (unlikely(resp->error_flags & ENA_PHC_ERROR_FLAGS)) { > > [High] > > Once req_id is observed equal to phc->req_id, the function reads > resp->error_flags, resp->timestamp and (now) resp->error_bound from the > device-shared DMA buffer. The updated comment explicitly states that once > req_id is observed updated, "PHC timestamp, error_bound and error_flags > are updated too" — which is the producer/consumer pattern that normally > requires a read barrier between the index check and the dependent data > loads. > > The buffer is declared volatile and the req_id check uses READ_ONCE(), > but on weakly-ordered architectures such as arm64 (which includes AWS > Graviton, a primary ENA deployment target) neither volatile nor > READ_ONCE() orders dependent loads against DMA writes from the device. > Without a dma_rmb() the CPU may observe the new req_id while still > returning stale or torn values for the 64-bit timestamp, the error_bound, > or the error_flags. > > Could a dma_rmb() be added between the req_id check and the subsequent > reads of resp->error_flags / resp->timestamp / resp->error_bound, similar > to the pattern already used elsewhere in the same driver? > > For example, ena_com_handle_admin_completion() in the same file does: > > /* Do not read the rest of the completion entry before the > * phase bit was validated > */ > dma_rmb(); > > and ena_com_aenq_intr_handler() has: > > /* Make sure the phase bit (ownership) is as expected before > * reading the rest of the descriptor. > */ > dma_rmb(); > > The same shape applies here: device writes data fields, then writes > req_id last; driver checks req_id, then reads data fields. > > This ordering issue is not introduced by this patch — the timestamp and > error_flags reads after the req_id check existed before — but the patch > extends the same unsynchronized read pattern to a third field > (error_bound) and the new comment now documents the assumed ordering > that is not actually enforced. Would it make sense to address the missing > dma_rmb() while touching this code? > > > /* Retrieved invalid PHC timestamp, PHC enters into > > [ ... ] > > > @@ -1897,12 +1904,15 @@ int ena_com_phc_get_timestamp(struct ena_com_dev > > *ena_dev, u64 *timestamp) > > > > /* PHC timestamp value is returned to the caller */ > > *timestamp = resp->timestamp; > > + if (error_bound) > > + *error_bound = resp->error_bound; > > > > /* Update statistic on valid PHC timestamp retrieval */ > > phc->stats.phc_cnt++; > > [ ... ]
Thanks Paolo. Noted, we'll look into addressing this.
