Stanisław Kardach <stanislaw.kard...@gmail.com> writes:

> On Thu, May 2, 2024 at 4:44 PM Daniel Gregory
> <daniel.greg...@bytedance.com> wrote:
>>
>> The Zawrs extension adds a pair of instructions that stall a core until
>> a memory location is written to. This patch uses one of them to
>> implement RISC-V-specific versions of the rte_wait_until_equal_*
>> functions. This is potentially more energy efficient than the default
>> implementation that uses rte_pause/Zihintpause.
>>
>> The technique works as follows:
>>
>> * Create a reservation set containing the address we want to wait on
>>   using an atomic load (lr.w or lr.d)
>> * Call wrs.nto - this blocks until the reservation set is invalidated by
>>   someone else writing to that address
>> * Execution can also resume arbitrarily, so we still need to check
>>   whether a change occurred and loop if not
>>
>> Due to RISC-V atomics only supporting naturally aligned word (32 bit)
>> and double word (64 bit) loads, I've used pointer rounding and bit
>> shifting to implement waiting on 16-bit values.
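
The pointer rounding and bit shifting described above can be sketched
roughly as below. This is only an illustration under stated assumptions,
not the code from the patch: the helper names are hypothetical, a
little-endian layout is assumed, and a plain atomic load stands in for
the lr.w + wrs.nto pair so that the sketch builds on any GCC/Clang
target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round a 16-bit pointer down to the naturally
 * aligned 32-bit word that contains it. */
static inline volatile uint32_t *
containing_word(volatile uint16_t *addr)
{
	/* The halfword itself must be naturally aligned. */
	assert(((uintptr_t)addr & 1) == 0);
	return (volatile uint32_t *)((uintptr_t)addr & ~(uintptr_t)3);
}

/* Bit offset of the halfword inside that word (little-endian assumed):
 * 0 for the lower halfword, 16 for the upper one. */
static inline unsigned int
halfword_shift(volatile uint16_t *addr)
{
	return (unsigned int)((uintptr_t)addr & 2) * 8;
}

/* Pull the 16-bit value of interest out of the loaded 32-bit word. */
static inline uint16_t
extract_halfword(uint32_t word, unsigned int shift)
{
	return (uint16_t)(word >> shift);
}

/* Wait until the 16-bit value at addr equals expected. */
static inline void
wait_until_equal_16_sketch(volatile uint16_t *addr, uint16_t expected)
{
	volatile uint32_t *word = containing_word(addr);
	unsigned int shift = halfword_shift(addr);

	/* A Zawrs version would do lr.w on 'word' and then wrs.nto here;
	 * the re-check is needed either way, as wake-ups can be spurious. */
	while (extract_halfword(__atomic_load_n(word, __ATOMIC_RELAXED),
				shift) != expected)
		;
}
```

The comparison is repeated on every iteration because a write to the
other halfword of the same word would also end the stall.
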
>>
>> This new functionality is controlled by a Meson flag that is disabled by
>> default.
>>
>> Signed-off-by: Daniel Gregory <daniel.greg...@bytedance.com>
>> Suggested-by: Punit Agrawal <punit.agra...@bytedance.com>
>> ---
>>
>> Posting as an RFC to get early feedback and enable testing by others
>> with Zawrs-enabled hardware. Whilst I have been able to test it compiles
>> & passes tests using QEMU, I am waiting on some Zawrs-enabled hardware
>> to become available before I carry out performance tests.
>>
>> Nonetheless, I would be glad to hear any feedback on the general
>> approach. Thanks, Daniel
>>
>>  config/riscv/meson.build          |   5 ++
>>  lib/eal/riscv/include/rte_pause.h | 139 ++++++++++++++++++++++++++++++
>>  2 files changed, 144 insertions(+)
>>
>> diff --git a/config/riscv/meson.build b/config/riscv/meson.build
>> index 07d7d9da23..4cfdc42ecb 100644
>> --- a/config/riscv/meson.build
>> +++ b/config/riscv/meson.build
>> @@ -26,6 +26,11 @@ flags_common = [
>>      # read from /proc/device-tree/cpus/timebase-frequency. This property is
>>      # guaranteed on Linux, as riscv time_init() requires it.
>>      ['RTE_RISCV_TIME_FREQ', 0],
>> +
>> +    # Enable use of RISC-V Wait-on-Reservation-Set extension (Zawrs)
>> +    # Mitigates looping when polling on memory locations
>> +    # Make sure to add '_zawrs' to your target's -march below
>> +    ['RTE_RISCV_ZAWRS', false]
> A bit orthogonal to this patch (or maybe not?)
> Should we perhaps add a Qemu target in meson.build which would have
> the modified -march for what qemu supports now?
> Or perhaps add machine detection logic based either on the "riscv,isa"
> cpu@0 property in the DT or RHCT ACPI table?

Compile-time feature detection doesn't add much benefit. It doesn't work
in cross-build environments, which is the common way things are built
for RISC-V at the moment. It also doesn't work for distros, where a
single build is used across a broad set of machines.
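
For a sense of what detection from an ISA string (such as the
"riscv,isa" value) would involve, here is a minimal sketch. The helper
name is hypothetical and not part of DPDK, and it assumes multi-letter
extensions are separated by underscores, as in
"rv64imafdc_zicsr_zawrs". A check like this could run on the target at
startup rather than on the build host, which is what the cross-build
and distro cases would need.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper: return true if the multi-letter extension 'ext'
 * (e.g. "zawrs") appears as a whole '_'-separated token in 'isa'. */
static bool
isa_string_has_ext(const char *isa, const char *ext)
{
	size_t ext_len = strlen(ext);
	const char *p = strchr(isa, '_');

	while (p != NULL) {
		const char *tok = p + 1;
		const char *end = strchr(tok, '_');
		size_t tok_len = end ? (size_t)(end - tok) : strlen(tok);

		/* Whole-token, case-insensitive match only. */
		if (tok_len == ext_len && strncasecmp(tok, ext, ext_len) == 0)
			return true;
		p = end;
	}
	return false;
}
```

Matching whole tokens avoids a false positive on a longer extension
name that merely starts with the one being looked for.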

> Or add perhaps some other way we could specify the extension list
> suffix for -march?

Making it easier to specify the required extensions during the build
does make sense, though it is an orthogonal change and is better done
in follow-on patches.

[...]
