On Mon, May 18, 2026 at 11:44 AM Mark Blasko <[email protected]> wrote: > > On Sun, May 17, 2026 at 4:15 PM Stephen Hemminger > <[email protected]> wrote: > > > > Warning: (unchanged from v1) gve_read_clock and the periodic > > gve_read_nic_clock alarm callback both issue > > GVE_ADMINQ_REPORT_NIC_TIMESTAMP into the single shared DMA buffer > > priv->nic_ts_report, then read it after gve_adminq_execute_cmd > > has released adminq_lock. If gve_read_clock is preempted between > > gve_adminq_report_nic_timestamp returning and the be64_to_cpu > > read, the alarm callback can memset() and reissue its own > > command, so the user thread will read either zero or another > > command's response. The simplest fix is for gve_read_clock to > > return the cached priv->last_read_nic_timestamp instead of > > issuing a fresh adminq command - the 250ms periodic sync keeps > > it fresh enough for .read_clock semantics. Once the dev_op > > registration is restored this race becomes reachable. > > I want to make sure I fully understand the API contract here. Is the > fact that .read_clock does not require a fresh device read documented > in the DPDK specification, or is this based on typical application > use cases? If the latter, what are those typical use cases?
Looking at the documentation for the user-exposed rte_eth_read_clock() API, it seems that this API is intended to be quite accurate on a read, or at least more accurate than 250ms of granularity would allow. This is the example given in the documentation: > * E.g, a simple heuristic to derivate the frequency would be: > * uint64_t start, end; > * rte_eth_read_clock(port, start); > * rte_delay_ms(100); > * rte_eth_read_clock(port, end); > * double freq = (end - start) * 10; > * > * Compute a common reference with: > * uint64_t base_time_sec = current_time(); > * uint64_t base_clock; > * rte_eth_read_clock(port, base_clock); > * > * Then, convert the raw mbuf timestamp with: > * base_time_sec + (double)(*timestamp_dynfield(mbuf) - base_clock) / freq; read_clock() having 250ms granularity would completely nullify this use case, as it is entirely possible for both clock reads to hold the same value. And even if they _didn't_ hold the same value, they would end up being 250ms-worth of NIC cycles apart, instead of a more accurate representation of the delta over that period of time -- not very useful. -- Joshua Washington

