[PATCH v3 0/8] riscv: optimize string functions and add kunit tests

Feng Jiang Mon, 19 Jan 2026 23:21:52 -0800

This series provides optimized implementations of strnlen(), strchr(),
and strrchr() for the RISC-V architecture. The strnlen implementation
is derived from the existing optimized strlen. For strchr and strrchr,
the current versions use simple byte-by-byte assembly logic, which
will serve as a baseline for future Zbb-based optimizations.
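
For reference, the byte-by-byte behaviour described above can be sketched
in C. This is only an illustration of the semantics the assembly has to
preserve, not the code from this series; the name sketch_strrchr() is
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical C rendition of a byte-by-byte strrchr().
 * It walks the string once, remembers the most recent match, and
 * checks for the terminator only after the comparison so that
 * searching for '\0' returns a pointer to the terminator itself.
 */
static char *sketch_strrchr(const char *s, int c)
{
	const char *last = NULL;
	char ch = (char)c;

	assert(s != NULL);

	for (;; s++) {
		if (*s == ch)
			last = s;	/* remember the latest match */
		if (*s == '\0')
			return (char *)last;
	}
}
```

A strchr() variant differs only in returning at the first match instead
of recording it and continuing.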


The patch series is organized into three parts:
1. Correctness Testing: The first three patches add KUnit test cases
   for strlen, strnlen, and strrchr to ensure the baseline and optimized
   versions are functionally correct.
2. Benchmarking Tool: Patches 4 and 5 extend string_kunit to include
   performance measurement capabilities, allowing for comparative
   analysis within the KUnit environment.
3. Architectural Optimizations: The final three patches introduce the
   RISC-V specific assembly implementations.

Following suggestions from Andy Shevchenko, performance benchmarks have
been added to string_kunit.c to provide quantifiable evidence of the
improvements. Andy also gave detailed comments on the benchmark logic,
which is modeled on Eric Biggers' crc_benchmark(). Performance was
measured in a QEMU TCG (rv64) environment, comparing the generic C
implementations with the new RISC-V assembly versions.

Performance Summary (Improvement %):
---------------------------------------------------------------
Function  |  16 B (Short) |  512 B (Mid) |  4096 B (Long)
---------------------------------------------------------------
strnlen   |    +64.0%     |   +346.2%    |    +410.7%
strchr    |    +4.0%      |   +6.4%      |    +1.5%
strrchr   |    +6.6%      |   +2.8%      |    +0.0%
---------------------------------------------------------------

The benchmarks can be reproduced by enabling CONFIG_STRING_KUNIT_BENCH
and running:

  ./tools/testing/kunit/kunit.py run --arch=riscv \
      --cross_compile=riscv64-linux-gnu- \
      --kunitconfig=my_string.kunitconfig --raw_output

The strnlen implementation leverages the Zbb 'orc.b' instruction and
word-at-a-time logic, showing significant gains as the string length
increases. For strchr and strrchr, the handwritten assembly reduces
fixed overhead by eliminating stack frame management. The gain is most
prominent on short strings (1-16B) where function call overhead dominates,
while the performance converges with the C implementation for longer
strings in the TCG environment.
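
To illustrate the word-at-a-time idea behind strnlen, here is a minimal
portable C sketch. It is not the assembly from this series: the classic
"has zero byte" bit trick stands in for the Zbb 'orc.b' test, and the
names sketch_strnlen() and has_zero_byte() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  ((uint64_t)0x0101010101010101ULL)
#define HIGHS ((uint64_t)0x8080808080808080ULL)

/* Nonzero if and only if at least one byte of w is 0x00. */
static inline uint64_t has_zero_byte(uint64_t w)
{
	return (w - ONES) & ~w & HIGHS;
}

/* Hypothetical word-at-a-time strnlen(); assumes 64-bit words. */
static size_t sketch_strnlen(const char *s, size_t maxlen)
{
	size_t i = 0;

	assert(s != NULL);

	/* Head: advance byte by byte until s + i is word aligned. */
	while (i < maxlen && ((uintptr_t)(s + i) % sizeof(uint64_t)) != 0) {
		if (s[i] == '\0')
			return i;
		i++;
	}

	/*
	 * Body: test eight bytes per iteration. Like other
	 * word-at-a-time code, this may read bytes past the terminator
	 * inside the same aligned word when maxlen exceeds the buffer.
	 */
	while (maxlen - i >= sizeof(uint64_t)) {
		uint64_t w;

		memcpy(&w, s + i, sizeof(w));
		if (has_zero_byte(w))
			break;
		i += sizeof(w);
	}

	/* Tail: locate the exact terminator or stop at maxlen. */
	while (i < maxlen && s[i] != '\0')
		i++;

	return i;
}
```

The per-call cost of the head and tail loops is fixed, so the benefit of
the word loop grows with string length, which is consistent with the
trend in the strnlen row of the table above.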

I would like to thank Andy Shevchenko for the suggestion to add benchmarks
and for his detailed feedback on the test framework, and Eric Biggers for
the benchmarking approach. Thanks also to Joel Stanley for testing support
and feedback, and to David Laight for his suggestions regarding performance
measurement.

Changes:
v3:
- Re-implement benchmark logic inspired by crc_benchmark().
- Add 'len - 2' test case to strnlen correctness tests.
- Incorporate detailed benchmark data into individual commit messages.

v2:
- Refactor lib/string.c to export __generic_* functions and add
  corresponding functional/performance tests for strnlen, strchr,
  and strrchr (Andy Shevchenko).
- Replace magic numbers with STRING_TEST_MAX_LEN etc. (Andy Shevchenko).

v1: Initial submission.

---

Feng Jiang (8):
  lib/string_kunit: add correctness test for strlen
  lib/string_kunit: add correctness test for strnlen
  lib/string_kunit: add correctness test for strrchr()
  lib/string_kunit: add performance benchmarks for strlen
  lib/string_kunit: extend benchmarks to strnlen and chr searches
  riscv: lib: add strnlen implementation
  riscv: lib: add strchr implementation
  riscv: lib: add strrchr implementation

 arch/riscv/include/asm/string.h |   9 ++
 arch/riscv/lib/Makefile         |   3 +
 arch/riscv/lib/strchr.S         |  35 +++++
 arch/riscv/lib/strnlen.S        | 164 ++++++++++++++++++++
 arch/riscv/lib/strrchr.S        |  37 +++++
 arch/riscv/purgatory/Makefile   |  11 +-
 lib/Kconfig.debug               |  11 ++
 lib/tests/string_kunit.c        | 258 ++++++++++++++++++++++++++++++++
 8 files changed, 527 insertions(+), 1 deletion(-)
 create mode 100644 arch/riscv/lib/strchr.S
 create mode 100644 arch/riscv/lib/strnlen.S
 create mode 100644 arch/riscv/lib/strrchr.S

-- 
2.25.1
