On Wed, May 20, 2026 at 9:34 PM Muhammad Bilal <[email protected]> wrote:
>
> commit fa747e9f843b ("selftests/bpf: Fix cold_lru producing zero
> batch_hash in XDP LB benchmark") claims the addition ensures the
> multiplier input is "always >= 1". This invariant does not hold after
> wraparound.
> batch_gen is __u32. After 2^32 increments it wraps to 0. On CPU 0,
> bpf_get_smp_processor_id() returns 0:
>
>     batch_gen = 0  (after u32 wraparound)
>     batch_hash = (0 + 0) * KNUTH_HASH_MULT = 0
>     *saddr ^= 0  ->  no-op, cold_lru miss counter stays 0
>
> Setting bit 0 before multiplying guarantees a non-zero odd result for
> all possible values of batch_gen and cpu_id, including after wraparound:
>
>     (any_value | 1) >= 1  always, since bit 0 is always set

You say that batch_gen is a __u32 and wraps to 0 after 2^32 increments.

A single batch runs for 10ms, and batch_gen is incremented once per
batch, so wrapping takes 2^32 * 10ms, i.e. roughly 497 days of running
the benchmark continuously with a single producer.

Even if we did want to run the benchmark that long and fix this,
" | 1 " is not the correct way to do it, because:

On CPU 0, consecutive batches with | 1:
  batch_gen=2:  (2 + 0) | 1 = 3,  batch_hash = 3 * KNUTH
  batch_gen=3:  (3 + 0) | 1 = 3,  batch_hash = 3 * KNUTH
  batch_gen=4:  (4 + 0) | 1 = 5,  batch_hash = 5 * KNUTH
  batch_gen=5:  (5 + 0) | 1 = 5,  batch_hash = 5 * KNUTH

Each even/odd pair of batch_gen values collapses to the same
batch_hash, so half the batches reuse the previous batch's cold
address, which is already warm in the LRU.
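
To make the collision concrete, here is a minimal userspace sketch, not
the selftest code itself. It assumes KNUTH_HASH_MULT is Knuth's 32-bit
multiplicative constant 2654435761; hash_or1() and count_repeats() are
hypothetical helpers introduced only for illustration.

```c
#include <stdint.h>

/* Assumption: Knuth's 32-bit multiplicative hash constant. */
#define KNUTH_HASH_MULT 2654435761u

/* Hypothetical helper: the proposed "| 1" variant of the hash input. */
static uint32_t hash_or1(uint32_t batch_gen, uint32_t cpu)
{
	return ((batch_gen + cpu) | 1u) * KNUTH_HASH_MULT;
}

/*
 * Hypothetical helper: over n consecutive batches starting at
 * first_gen, count how many batches get the same hash as the batch
 * immediately before them.
 */
static unsigned int count_repeats(uint32_t first_gen, unsigned int n,
				  uint32_t cpu)
{
	unsigned int repeats = 0;

	for (unsigned int i = 1; i < n; i++) {
		uint32_t prev = hash_or1(first_gen + i - 1, cpu);
		uint32_t cur = hash_or1(first_gen + i, cpu);

		if (cur == prev)
			repeats++;
	}
	return repeats;
}
```

For the four batches listed above, count_repeats(2, 4, 0) returns 2:
batch_gen=3 repeats the hash of batch_gen=2, and batch_gen=5 repeats
the hash of batch_gen=4. Because the multiplier is odd, multiplication
modulo 2^32 is a bijection, so two batches share a hash exactly when
their "| 1" inputs are equal.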
