On Sun, 11 Aug 2024 21:29:57 +0530
Rakesh Kudurumalla <[email protected]> wrote:
> Race condition between jobstats and time metrics
> for forwarding and flushing is maintained using spinlock.
> Timer metrics are not displayed properly due to the
> frequent unavailability of the lock.This patch fixes
> the issue by introducing a delay before acquiring
> the lock in the loop. This delay allows for betteravailability
> of the lock, ensuring that show_lcore_stats() can
> periodically update the statistics even when forwarding
> jobs are running.
>
> Fixes: 204896f8d66c ("examples/l2fwd-jobstats: add new example")
> Cc: [email protected]
>
> Signed-off-by: Rakesh Kudurumalla <[email protected]>
> ---
I prodded AI to review this patch and it had some good observations.
You could go further with rte_wait_until_equal, which has even better power
efficiency.
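A rough sketch of what that could look like at the bottom of the main loop
(untested; the cast assumes stats_read_pending stays a 16-bit flag and may or
may not be needed depending on how the field is declared):

```c
#include <rte_pause.h>

/* After releasing the lock, wait until the stats reader clears the
 * flag. On arm64 builds with RTE_ARM_USE_WFE this parks the core on
 * WFE instead of spinning, hence the power benefit.
 */
rte_spinlock_unlock(&qconf->lock);
rte_wait_until_equal_16((uint16_t *)(uintptr_t)&qconf->stats_read_pending,
        0, rte_memory_order_relaxed);
```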
---
## Patch Review: examples/l2fwd-jobstats: fix lock availability
### Commit Message Issues
**Errors:**
- Line 84: missing space in "betteravailability" → should be "better availability"
- Line 82: missing space after the period in "lock.This" → should be "lock. This"
**Style (minor):** The commit message is a bit verbose for what it's doing.
### Technical Analysis
The patch addresses a real problem. Looking at the code flow:
```c
/* Main loop holds the lock almost continuously */
for (;;) {
    rte_spinlock_lock(&qconf->lock);   // Acquire
    do {
        /* ... lots of work ... */
        stats_read_pending = rte_atomic_load_explicit(...);
    } while (likely(stats_read_pending == 0));
    rte_spinlock_unlock(&qconf->lock); // Release
    rte_pause();                       // <-- This window is too short!
    // Immediately re-acquires the lock...
}
```
Meanwhile, `show_lcore_stats()` sets the flag and waits on the lock:
```c
rte_atomic_store_explicit(&qconf->stats_read_pending, 1, ...);
rte_spinlock_lock(&qconf->lock); // Starved - main loop re-acquires too fast
```
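For context, here is the reader's side of the handshake, paraphrased from the
example (the exact point where the flag is cleared may differ in the source;
what matters for the fix below is that it is cleared while the lock is held):

```c
/* show_lcore_stats(), paraphrased: request, lock, snapshot, clear, unlock */
rte_atomic_store_explicit(&qconf->stats_read_pending, 1,
        rte_memory_order_relaxed);
rte_spinlock_lock(&qconf->lock);
/* ... snapshot the per-lcore jobstats ... */
rte_atomic_store_explicit(&qconf->stats_read_pending, 0,
        rte_memory_order_relaxed);
rte_spinlock_unlock(&qconf->lock);
```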
**The fix works** (the change is paraphrased below), but it's suboptimal:
- The 10µs delay is an arbitrary magic number
- It delays **unconditionally**, even when no stats reader is waiting
- It wastes cycles when the system is idle
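For reference, what the patch effectively does at the bottom of the main loop
(paraphrased, not a verbatim quote of the diff):

```c
rte_spinlock_unlock(&qconf->lock);
rte_delay_us(10); /* fixed 10 us window for the reader, paid on every pass */
```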
### Better Option: Wait for stats completion
The `stats_read_pending` flag already exists to coordinate this handshake but
isn't used fully. The cleaner fix is to make the main loop wait until the stats
reader has finished:
```c
rte_spinlock_unlock(&qconf->lock);
/* Wait for stats reader to complete before reacquiring lock */
while (rte_atomic_load_explicit(&qconf->stats_read_pending,
        rte_memory_order_relaxed) != 0)
    rte_pause();
```
This approach:
- Uses the existing synchronization mechanism as intended
- Only yields when stats are actually being read
- Has zero overhead when no stats read is pending
- Doesn't introduce arbitrary magic constants
- Guarantees the stats reader gets the lock (not probabilistic)
### Alternative: Conditional delay
If you want to keep the delay approach, at least make it conditional:
```c
rte_spinlock_unlock(&qconf->lock);
if (stats_read_pending)
    rte_delay_us(10);
else
    rte_pause();
```
This would avoid the 10µs penalty when no reader is waiting. Note, though, that
with the loop structure shown above the inner do/while only exits once
`stats_read_pending` is set, so the condition is effectively always true at
this point; the wait-for-completion pattern remains the stronger fix.
### Verdict
**NAK as-is.** The fix works but the better solution is to properly use the
existing `stats_read_pending` handshake. Would suggest the author revise to use
the wait-for-completion pattern instead of an arbitrary delay.