Mina Almasry <almasrym...@google.com> writes:

> From: Jesper Dangaard Brouer <h...@kernel.org>
>
> We frequently consult with Jesper's out-of-tree page_pool benchmark to
> evaluate page_pool changes.
>
> Import the benchmark into the upstream linux kernel tree so that (a)
> we're all running the same version, (b) pave the way for shared
> improvements, and (c) maybe one day integrate it with nipa, if possible.
>
> Import bench_page_pool_simple from commit 35b1716d0c30 ("Add
> page_bench06_walk_all"), from this repository:
> https://github.com/netoptimizer/prototype-kernel.git
>
> Changes done during upstreaming:
> - Fix checkpatch issues.
> - Remove the tasklet logic not needed.
> - Move under tools/testing
> - Create ksft for the benchmark.
> - Changed slightly how the benchmark gets build. Out of tree, time_bench
>   is built as an independent .ko. Here it is included in
>   bench_page_pool.ko
>
> Steps to run:
>
> ```
> mkdir -p /tmp/run-pp-bench
> make -C ./tools/testing/selftests/net/bench
> make -C ./tools/testing/selftests/net/bench install 
> INSTALL_PATH=/tmp/run-pp-bench
> rsync --delete -avz --progress /tmp/run-pp-bench mina@$SERVER:~/
> ssh mina@$SERVER << EOF
>   cd ~/run-pp-bench && sudo ./test_bench_page_pool.sh
> EOF
> ```
>
> Output:
>
> ```
> (benchmrk dmesg logs)
>
> Fast path results:
> no-softirq-page_pool01 Per elem: 11 cycles(tsc) 4.368 ns
>
> ptr_ring results:
> no-softirq-page_pool02 Per elem: 527 cycles(tsc) 195.187 ns
>
> slow path results:
> no-softirq-page_pool03 Per elem: 549 cycles(tsc) 203.466 ns
> ```
>
> Cc: Jesper Dangaard Brouer <h...@kernel.org>
> Cc: Ilias Apalodimas <ilias.apalodi...@linaro.org>
> Cc: Jakub Kicinski <k...@kernel.org>
> Cc: Toke Høiland-Jørgensen <t...@toke.dk>
>
> Signed-off-by: Mina Almasry <almasrym...@google.com>

Back when you posted the first RFC, Jesper and I chatted about ways to
avoid the ugly "load module and read the output from dmesg" interface to
the test.

One idea we came up with was to make the module include only the "inner"
functions for the benchmark, and expose those to BPF as kfuncs. Then the
test runner can be a BPF program that runs the tests, collects the data
and passes it to userspace via maps or a ringbuffer or something. That's
a nicer and more customisable interface than the printk output. And if
they're small enough, maybe we could even include the functions into the
page_pool code itself, instead of in a separate benchmark module?

WDYT of that idea? :)

-Toke


Reply via email to