Re: [PATCH v2 2/2] riscv: Add support for str(n)cmp inline expansion

2023-09-12 Thread Philipp Tomsich
Applied to master. Thanks!
Philipp.

On Tue, 12 Sept 2023 at 05:34, Jeff Law wrote:
>
>
>
> On 9/6/23 10:07, Christoph Muellner wrote:
> OK for the trunk.  Thanks for pushing this along.
>
> jeff


Re: [PATCH v2 2/2] riscv: Add support for str(n)cmp inline expansion

2023-09-11 Thread Jeff Law via Gcc-patches




On 9/6/23 10:07, Christoph Muellner wrote:


OK for the trunk.  Thanks for pushing this along.

jeff


[PATCH v2 2/2] riscv: Add support for str(n)cmp inline expansion

2023-09-06 Thread Christoph Muellner
From: Christoph Müllner 

This patch implements expansions for the cmpstrsi and cmpstrnsi
builtins for RV32/RV64 for xlen-aligned strings if Zbb or XTheadBb
instructions are available.  The expansion basically emits a comparison
sequence which compares XLEN bits per step if possible.

This allows inlining calls to strcmp() and strncmp() if both strings
are xlen-aligned.  For strncmp() the length parameter needs to be known
at compile time.
The benefits over calls to libc are:
* no call/ret instructions
* no stack frame allocation
* no register saving/restoring
* no alignment tests

The inlining mechanism is gated by new switches ('-minline-strcmp' and
'-minline-strncmp') and by the variable 'optimize_size'.
The number of emitted unrolled loop iterations can be controlled by the
parameter '--param=riscv-strcmp-inline-limit=N', which defaults to 64.
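
As an illustration of the kind of call sites this targets, here is a
minimal sketch.  The function names are hypothetical, and the use of
__builtin_assume_aligned to make the alignment visible to the compiler
is an assumption about how a caller might qualify; the options in the
comment are the ones named above, while the toolchain name and -O2 are
assumptions.

```c
/* Hypothetical example; might be built with something like
     riscv64-unknown-linux-gnu-gcc -O2 -march=rv64gc_zbb \
         -minline-strcmp -minline-strncmp \
         --param=riscv-strcmp-inline-limit=64 -S example.c
   (toolchain name and -O2 are assumptions, not taken from the patch).  */
#include <string.h>

/* Both pointers are declared to be 8-byte (XLEN on RV64) aligned, so the
   strcmp call is a candidate for the inline expansion.  The caller must
   really pass pointers with that alignment.  */
int aligned_str_equal(const char *s1, const char *s2)
{
    s1 = __builtin_assume_aligned(s1, 8);
    s2 = __builtin_assume_aligned(s2, 8);
    return strcmp(s1, s2) == 0;
}

/* For strncmp the length is a compile-time constant (4 here).  */
int aligned_prefix_equal(const char *s1, const char *s2)
{
    s1 = __builtin_assume_aligned(s1, 8);
    s2 = __builtin_assume_aligned(s2, 8);
    return strncmp(s1, s2, 4) == 0;
}
```

Whether a given call is actually expanded inline depends on the target
options and the inline limit; the code behaves the same either way.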

The comparison sequence is inspired by the strcmp example
in the appendix of the Bitmanip specification (incl. the fast
result calculation in case the first word does not contain
a NUL byte).  Additional inspiration comes from rs6000-string.c.

The emitted sequence does not trigger any read-ahead page-fault issues,
because only aligned strings are accessed by aligned xlen-loads, which
never cross a page boundary.
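
The word-at-a-time idea can be sketched in portable C as follows.  This
is not the emitted sequence: orc_b_emulated is a hypothetical software
stand-in for the Zbb orc.b instruction, aligned_strcmp_sketch is a
hypothetical name, the word size is fixed at 64 bits (RV64), and the
result is computed with a plain byte loop instead of the fast result
calculation mentioned above.

```c
#include <stdint.h>
#include <string.h>

/* Software stand-in for Zbb orc.b: every non-zero input byte becomes
   0xff in the result, every zero byte stays 0x00.  */
static uint64_t orc_b_emulated(uint64_t x)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++)
        if ((x >> (8 * i)) & 0xff)
            r |= (uint64_t)0xff << (8 * i);
    return r;
}

/* Compare two strings 8 bytes per step.  Assumes both pointers are
   8-byte aligned and that the underlying objects are readable up to
   the next 8-byte boundary after the terminating NUL.  */
static int aligned_strcmp_sketch(const char *a, const char *b)
{
    for (;;) {
        uint64_t wa, wb;
        memcpy(&wa, a, sizeof wa);   /* aligned 8-byte load */
        memcpy(&wb, b, sizeof wb);

        /* Leave the fast path if the words differ or if the word
           contains a NUL byte (orc.b result is not all ones).  */
        if (wa != wb || orc_b_emulated(wa) != UINT64_MAX) {
            for (int i = 0; i < 8; i++) {
                unsigned char ca = (unsigned char)a[i];
                unsigned char cb = (unsigned char)b[i];
                if (ca != cb)
                    return ca < cb ? -1 : 1;
                if (ca == 0)
                    return 0;
            }
        }
        a += 8;
        b += 8;
    }
}
```

Because each load starts at an 8-byte boundary and is 8 bytes wide, it
stays within one page even when it reads past the terminating NUL.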

This patch has been tested using the glibc string tests on QEMU:
* rv64gc_zbb/rv64gc_xtheadbb with riscv-strcmp-inline-limit=64
* rv64gc_zbb/rv64gc_xtheadbb with riscv-strcmp-inline-limit=8
* rv32gc_zbb/rv32gc_xtheadbb with riscv-strcmp-inline-limit=64

Signed-off-by: Christoph Müllner 

gcc/ChangeLog:

* config/riscv/bitmanip.md (*_not): Export INSN name.
(_not3): Likewise.
* config/riscv/riscv-protos.h (riscv_expand_strcmp): New
prototype.
* config/riscv/riscv-string.cc (GEN_EMIT_HELPER3): New helper
macros.
(GEN_EMIT_HELPER2): Likewise.
(emit_strcmp_scalar_compare_byte): New function.
(emit_strcmp_scalar_compare_subword): Likewise.
(emit_strcmp_scalar_compare_word): Likewise.
(emit_strcmp_scalar_load_and_compare): Likewise.
(emit_strcmp_scalar_call_to_libc): Likewise.
(emit_strcmp_scalar_result_calculation_nonul): Likewise.
(emit_strcmp_scalar_result_calculation): Likewise.
(riscv_expand_strcmp_scalar): Likewise.
(riscv_expand_strcmp): Likewise.
* config/riscv/riscv.md (*slt_): Export
INSN name.
(@slt_3): Likewise.
(cmpstrnsi): Invoke expansion function for str(n)cmp.
(cmpstrsi): Likewise.
* config/riscv/riscv.opt: Add new parameter
'-mstring-compare-inline-limit'.
* doc/invoke.texi: Document new parameter
'-mstring-compare-inline-limit'.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/xtheadbb-strcmp-unaligned.c: New test.
* gcc.target/riscv/xtheadbb-strcmp.c: New test.
* gcc.target/riscv/zbb-strcmp-disabled-2.c: New test.
* gcc.target/riscv/zbb-strcmp-disabled.c: New test.
* gcc.target/riscv/zbb-strcmp-unaligned.c: New test.
* gcc.target/riscv/zbb-strcmp.c: New test.
---
 gcc/config/riscv/bitmanip.md                  |   2 +-
 gcc/config/riscv/riscv-protos.h               |   1 +
 gcc/config/riscv/riscv-string.cc              | 411 ++
 gcc/config/riscv/riscv.md                     |  44 +-
 gcc/config/riscv/riscv.opt                    |  12 +
 gcc/doc/invoke.texi                           |  20 +-
 .../gcc.target/riscv/xtheadbb-strcmp.c        |  57 +++
 .../gcc.target/riscv/zbb-strcmp-disabled-2.c  |  38 ++
 .../gcc.target/riscv/zbb-strcmp-disabled.c    |  38 ++
 .../gcc.target/riscv/zbb-strcmp-limit.c       |  57 +++
 .../gcc.target/riscv/zbb-strcmp-unaligned.c   |  38 ++
 gcc/testsuite/gcc.target/riscv/zbb-strcmp.c   |  57 +++
 12 files changed, 772 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadbb-strcmp.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strcmp-disabled-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strcmp-disabled.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strcmp-limit.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strcmp-unaligned.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strcmp.c

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 1544ef4e125..1e90636dd60 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -206,7 +206,7 @@ (define_expand "popcount2"
(popcount:GPR (match_operand:GPR 1 "register_operand")))]
   "TARGET_ZBB")
 
-(define_insn "*_not"
+(define_insn "_not3"
   [(set (match_operand:X 0 "register_operand" "=r")
 (bitmanip_bitwise:X (not:X (match_operand:X 1 "register_operand" "r"))
 (match_operand:X 2 "register_operand" "r")))]
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index b060d047f01..0006fe0564e 100644
--- a/gcc/config/riscv/riscv-protos.h
+++