On 28/07/2025 16.34, Alexei Starovoitov wrote:
diff --git a/tools/testing/selftests/bpf/progs/lpm_trie_bench.c b/tools/testing/selftests/bpf/progs/lpm_trie_bench.c
new file mode 100644
index 000000000000..522e1cbef490
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/lpm_trie_bench.c
[...]
+
+static void gen_random_key(struct trie_key *key)
+{
+       key->prefixlen = prefixlen;
+       key->data = bpf_get_prandom_u32() % nr_entries;
bpf_get_prandom_u32() is not free
and modulo operation isn't free either.
The benchmark includes their time.
It's ok to have it, but add a mode where the bench
tests linear lookup/update too with simple key.data++

I've extended this bench with a "noop" and "baseline" benchmark[1].

[1] https://lore.kernel.org/all/175509897596.2755384.18413775753563966331.stgit@firesoul/

This allowed us to measure and deduce that:
  bpf_get_prandom_u32() % nr_entries

takes 14.1 ns for the rand + modulo.

The "noop" test shows the harness overhead is 13.402 ns/op, and the
"baseline" test, which adds the randomness on top, takes 27.529 ns/op.
The difference, 27.529 - 13.402 = 14.127 ns/op, is the rand + modulo cost.
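
The subtraction behind the 14.1 ns figure can be sketched in C as below.
The helper name isolated_cost_ns() is made up for illustration and is not
part of the bench; it only assumes that both inputs are ns/op figures and
that the "baseline" run includes the "noop" harness cost.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of the selftest: estimate the cost of the
 * operation under test by subtracting the empty-harness ("noop") time
 * from the "baseline" time. Both arguments are in ns/op.
 */
static double isolated_cost_ns(double noop_ns, double baseline_ns)
{
	/* The baseline includes the harness, so it cannot be cheaper. */
	assert(baseline_ns >= noop_ns);
	return baseline_ns - noop_ns;
}
```

Calling isolated_cost_ns(13.402, 27.529) with the numbers above gives
roughly 14.127 ns/op for the rand + modulo.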

--Jesper
