Stanisław Kardach <stanislaw.kard...@gmail.com> writes:

> On Thu, May 2, 2024 at 4:44 PM Daniel Gregory
> <daniel.greg...@bytedance.com> wrote:
>>
>> The zawrs extension adds a pair of instructions that stall a core until
>> a memory location is written to. This patch uses one of them to
>> implement RISCV-specific versions of the rte_wait_until_equal_*
>> functions. This is potentially more energy efficient than the default
>> implementation that uses rte_pause/Zihintpause.
>>
>> The technique works as follows:
>>
>> * Create a reservation set containing the address we want to wait on
>>   using an atomic load (lr.dw)
>> * Call wrs.nto - this blocks until the reservation set is invalidated by
>>   someone else writing to that address
>> * Execution can also resume arbitrarily, so we still need to check
>>   whether a change occurred and loop if not
>>
>> Due to RISC-V atomics only supporting naturally aligned word (32 bit)
>> and double word (64 bit) loads, I've used pointer rounding and bit
>> shifting to implement waiting on 16-bit values.
>>
>> This new functionality is controlled by a Meson flag that is disabled by
>> default.
>>
>> Signed-off-by: Daniel Gregory <daniel.greg...@bytedance.com>
>> Suggested-by: Punit Agrawal <punit.agra...@bytedance.com>
>> ---
>>
>> Posting as an RFC to get early feedback and enable testing by others
>> with Zawrs-enabled hardware. Whilst I have been able to test it compiles
>> & passes tests using QEMU, I am waiting on some Zawrs-enabled hardware
>> to become available before I carry out performance tests.
>>
>> Nonetheless, I would be glad to hear any feedback on the general
>> approach.
>> Thanks, Daniel
>>
>>  config/riscv/meson.build          |   5 ++
>>  lib/eal/riscv/include/rte_pause.h | 139 ++++++++++++++++++++++++++++++
>>  2 files changed, 144 insertions(+)
>>
>> diff --git a/config/riscv/meson.build b/config/riscv/meson.build
>> index 07d7d9da23..4cfdc42ecb 100644
>> --- a/config/riscv/meson.build
>> +++ b/config/riscv/meson.build
>> @@ -26,6 +26,11 @@ flags_common = [
>>      # read from /proc/device-tree/cpus/timebase-frequency. This property is
>>      # guaranteed on Linux, as riscv time_init() requires it.
>>      ['RTE_RISCV_TIME_FREQ', 0],
>> +
>> +    # Enable use of RISC-V Wait-on-Reservation-Set extension (Zawrs)
>> +    # Mitigates looping when polling on memory locations
>> +    # Make sure to add '_zawrs' to your target's -march below
>> +    ['RTE_RISCV_ZAWRS', false]
> A bit orthogonal to this patch (or maybe not?)
> Should we perhaps add a Qemu target in meson.build which would have
> the modified -march for what qemu supports now?
> Or perhaps add machine detection logic based either on the "riscv,isa"
> cpu@0 property in the DT or RHCT ACPI table?

Compile-time feature detection doesn't add much benefit: it doesn't work
in cross-build environments, which is the common way things are built for
RISC-V at the moment. It also doesn't work for distros, where a single
build is used across a broad set of machines.

> Or add perhaps some other way we could specify the extension list
> suffix for -march?

Making it easier to specify the required extensions during the build
does make sense. That is an orthogonal change, though, and is better done
in follow-on patches.

[...]
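
For readers without the full patch at hand, the technique in the quoted
description (load-reserved, stall on wrs.nto, re-check, plus the pointer
rounding and bit shifting for 16-bit values) could be sketched roughly as
below. This is not the code from the patch. Every function name is
hypothetical, memory ordering is left relaxed for brevity, a little-endian
target is assumed, and wrs.nto is emitted as its raw encoding (0x00d00073)
so the assembler does not need Zawrs in -march. The non-RISC-V branch is
plain polling and exists only so the sketch builds elsewhere.

```c
#include <stdint.h>

/* All names here are hypothetical; this is an illustration, not the patch. */

/* Round a 16-bit pointer down to the naturally aligned 32-bit word holding it. */
static inline volatile uint32_t *
containing_word(volatile uint16_t *addr)
{
	return (volatile uint32_t *)((uintptr_t)addr & ~(uintptr_t)3);
}

/* Bit offset of the halfword inside that word (assumes little-endian). */
static inline unsigned int
halfword_shift(volatile uint16_t *addr)
{
	return (unsigned int)((uintptr_t)addr & 2) * 8;
}

/* Pull the 16-bit lane out of a loaded 32-bit word. */
static inline uint16_t
extract_halfword(uint32_t word, unsigned int shift)
{
	return (uint16_t)(word >> shift);
}

/* Block until *addr == expected. */
static inline void
wait_until_equal_16_sketch(volatile uint16_t *addr, uint16_t expected)
{
#if defined(__riscv) && defined(RTE_RISCV_ZAWRS)
	volatile uint32_t *word_addr = containing_word(addr);
	unsigned int shift = halfword_shift(addr);
	uint32_t word;

	for (;;) {
		/* lr.w loads the word and registers a reservation set on it. */
		__asm__ volatile("lr.w %0, (%1)"
				 : "=r"(word) : "r"(word_addr) : "memory");
		/* Check before stalling so an earlier write is not missed. */
		if (extract_halfword(word, shift) == expected)
			break;
		/* wrs.nto: stall until the reservation set is invalidated.
		 * It may also return spuriously, hence the enclosing loop. */
		__asm__ volatile(".4byte 0x00d00073" : : : "memory");
	}
#else
	/* Fallback for other targets: poll the 16-bit value directly. */
	while (__atomic_load_n(addr, __ATOMIC_RELAXED) != expected)
		;
#endif
}
```

Because the reservation covers the whole 32-bit word, a store to the
neighbouring halfword also ends the stall. That is harmless here, since
the loop reloads and compares only the lane it cares about.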