Re: [PATCH v3] RISC-V: Implement C[LT]Z_DEFINED_VALUE_AT_ZERO

2022-06-03 Thread Philipp Tomsich
Thank you for catching this one.  I'll provide a fix.

On Fri, 3 Jun 2022 at 16:56, H.J. Lu  wrote:

Re: [PATCH v3] RISC-V: Implement C[LT]Z_DEFINED_VALUE_AT_ZERO

2022-06-03 Thread H.J. Lu via Gcc-patches
On Fri, May 13, 2022 at 1:17 PM Philipp Tomsich
 wrote:
Re: [PATCH v3] RISC-V: Implement C[LT]Z_DEFINED_VALUE_AT_ZERO

2022-06-02 Thread Philipp Tomsich
Cherry picked from commit 16f7fcadac19dabd04a5abbe6601df52d22e9685
onto releases/gcc-12.

On Thu, 2 Jun 2022 at 10:49, Kito Cheng  wrote:
>
> OK to back port, thanks!
>
> On Thu, Jun 2, 2022 at 4:46 PM Philipp Tomsich  
> wrote:
> >
> > OK for backport?
> >
> > Thanks,
> > Phil.
> >
> > On Fri, 13 May 2022 at 22:23, Philipp Tomsich 
> > wrote:
> >
> > > Added the two nits from Kito's review and … Applied to trunk!
> > >
> > >
> > > On Fri, 13 May 2022 at 22:16, Philipp Tomsich 
> > > wrote:
> > > >
> > > > The Zbb support has introduced ctz and clz to the backend, but some
> > > > transformations in GCC need to know what the value of c[lt]z at zero
> > > > is. This affects how the optab is generated and may suppress use of
> > > > CLZ/CTZ in tree passes.
> > > >
> > > > Among other things, this is needed for the transformation of
> > > > table-based ctz-implementations, such as in deepsjeng, to work
> > > > (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90838).
> > > >
> > > > Prior to this change, the test case from PR90838 would compile to
> > > > on RISC-V targets with Zbb:
> > > >   myctz:
> > > > lui a4,%hi(.LC0)
> > > > ld  a4,%lo(.LC0)(a4)
> > > > neg a5,a0
> > > > and a5,a5,a0
> > > > mul a5,a5,a4
> > > > lui a4,%hi(.LANCHOR0)
> > > > addia4,a4,%lo(.LANCHOR0)
> > > > srlia5,a5,58
> > > > sh2add  a5,a5,a4
> > > > lw  a0,0(a5)
> > > > ret
> > > >
> > > > After this change, we get:
> > > >   myctz:
> > > > ctz a0,a0
> > > > andia0,a0,63
> > > > ret
> > > >
> > > > Testing this with deepsjeng_r (from SPEC 2017) against QEMU, this
> > > > shows a clear reduction in dynamic instruction count:
> > > >  - before  1961888067076
> > > >  - after   1907928279874 (2.75% reduction)
> > > >
> > > > This also merges the various target-specific test-cases (for x86-64,
> > > > aarch64 and riscv) within gcc.dg/pr90838.c.
> > > >
> > > > This extends the macros (i.e., effective-target keywords) used in
> > > > testing (lib/target-supports.exp) to reliably distinguish between RV32
> > > > and RV64 via __riscv_xlen (i.e., the integer register bitwidth) :
> > > > testing for ILP32 could be misleading (as ILP32 is a valid memory
> > > > model for 64bit systems).
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > * config/riscv/riscv.h (CLZ_DEFINED_VALUE_AT_ZERO): Implement.
> > > > (CTZ_DEFINED_VALUE_AT_ZERO): Same.
> > > > * doc/sourcebuild.texi: add documentation for RISC-V specific
> > > > test target keywords
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > * gcc.dg/pr90838.c: Add additional flags (dg-additional-options)
> > > >   when compiling for riscv64 and subsume
> > > gcc.target/aarch64/pr90838.c
> > > >   and gcc.target/i386/pr95863-2.c.
> > > > * gcc.target/riscv/zbb-ctz.c: New test.
> > > > * gcc.target/aarch64/pr90838.c: Removed.
> > > > * gcc.target/i386/pr95863-2.c: Removed.
> > > > * lib/target-supports.exp: Recognize RV32 or RV64 via XLEN
> > > >
> > > > Signed-off-by: Philipp Tomsich 
> > > > Signed-off-by: Manolis Tsamis 
> > > > Co-developed-by: Manolis Tsamis 
> > > >
> > > > ---
> > > > Changes in v3:
> > > > - Address nit from Kito (use rv64 and rv32 on gcc.dg/pr90838.c
> > > >   consistently.
> > > >
> > > > Changes in v2:
> > > > - Address review comments from Palmer (merging testcases)
> > > > - Merge the different target-specific testcases for CLZ into one
> > > > - Add RV32 tests
> > > > - Fix pr90383.c testcase for x86_64
> > > >
> > > >  gcc/config/riscv/riscv.h   |  5 ++
> > > >  gcc/doc/sourcebuild.texi   | 12 
> > > >  gcc/testsuite/gcc.dg/pr90838.c | 25 +
> > > >  gcc/testsuite/gcc.target/aarch64/pr90838.c | 64 --
> > > >  gcc/testsuite/gcc.target/i386/pr95863-2.c  | 27 -
> > > >  gcc/testsuite/lib/target-supports.exp  | 30 ++
> > > >  6 files changed, 72 insertions(+), 91 deletions(-)
> > > >  delete mode 100644 gcc/testsuite/gcc.target/aarch64/pr90838.c
> > > >  delete mode 100644 gcc/testsuite/gcc.target/i386/pr95863-2.c
> > > >
> > > > diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> > > > index 8a4d2cf7f85..b191606edb4 100644
> > > > --- a/gcc/config/riscv/riscv.h
> > > > +++ b/gcc/config/riscv/riscv.h
> > > > @@ -1004,4 +1004,9 @@ extern void
> > > riscv_remove_unneeded_save_restore_calls (void);
> > > >
> > > >  #define HARD_REGNO_RENAME_OK(FROM, TO) riscv_hard_regno_rename_ok
> > > (FROM, TO)
> > > >
> > > > +#define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
> > > > +  ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)
> > > > +#define CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
> > > > +  ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)
> > > > +
> > > >  #endif /* ! GCC_RISCV_H */
> > > > diff --git a/gcc/doc/sourcebuild.texi 

Re: [PATCH v3] RISC-V: Implement C[LT]Z_DEFINED_VALUE_AT_ZERO

2022-06-02 Thread Kito Cheng via Gcc-patches
OK to back port, thanks!

On Thu, Jun 2, 2022 at 4:46 PM Philipp Tomsich  wrote:
>
> OK for backport?
>
> Thanks,
> Phil.
>
> On Fri, 13 May 2022 at 22:23, Philipp Tomsich 
> wrote:
>
> > Added the two nits from Kito's review and … Applied to trunk!
> >
> >
> > On Fri, 13 May 2022 at 22:16, Philipp Tomsich 
> > wrote:
> > >
> > > The Zbb support has introduced ctz and clz to the backend, but some
> > > transformations in GCC need to know what the value of c[lt]z at zero
> > > is. This affects how the optab is generated and may suppress use of
> > > CLZ/CTZ in tree passes.
> > >
> > > Among other things, this is needed for the transformation of
> > > table-based ctz-implementations, such as in deepsjeng, to work
> > > (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90838).
> > >
> > > Prior to this change, the test case from PR90838 would compile to
> > > on RISC-V targets with Zbb:
> > >   myctz:
> > > lui a4,%hi(.LC0)
> > > ld  a4,%lo(.LC0)(a4)
> > > neg a5,a0
> > > and a5,a5,a0
> > > mul a5,a5,a4
> > > lui a4,%hi(.LANCHOR0)
> > > addia4,a4,%lo(.LANCHOR0)
> > > srlia5,a5,58
> > > sh2add  a5,a5,a4
> > > lw  a0,0(a5)
> > > ret
> > >
> > > After this change, we get:
> > >   myctz:
> > > ctz a0,a0
> > > andia0,a0,63
> > > ret
> > >
> > > Testing this with deepsjeng_r (from SPEC 2017) against QEMU, this
> > > shows a clear reduction in dynamic instruction count:
> > >  - before  1961888067076
> > >  - after   1907928279874 (2.75% reduction)
> > >
> > > This also merges the various target-specific test-cases (for x86-64,
> > > aarch64 and riscv) within gcc.dg/pr90838.c.
> > >
> > > This extends the macros (i.e., effective-target keywords) used in
> > > testing (lib/target-supports.exp) to reliably distinguish between RV32
> > > and RV64 via __riscv_xlen (i.e., the integer register bitwidth) :
> > > testing for ILP32 could be misleading (as ILP32 is a valid memory
> > > model for 64bit systems).
> > >
> > > gcc/ChangeLog:
> > >
> > > * config/riscv/riscv.h (CLZ_DEFINED_VALUE_AT_ZERO): Implement.
> > > (CTZ_DEFINED_VALUE_AT_ZERO): Same.
> > > * doc/sourcebuild.texi: add documentation for RISC-V specific
> > > test target keywords
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.dg/pr90838.c: Add additional flags (dg-additional-options)
> > >   when compiling for riscv64 and subsume
> > gcc.target/aarch64/pr90838.c
> > >   and gcc.target/i386/pr95863-2.c.
> > > * gcc.target/riscv/zbb-ctz.c: New test.
> > > * gcc.target/aarch64/pr90838.c: Removed.
> > > * gcc.target/i386/pr95863-2.c: Removed.
> > > * lib/target-supports.exp: Recognize RV32 or RV64 via XLEN
> > >
> > > Signed-off-by: Philipp Tomsich 
> > > Signed-off-by: Manolis Tsamis 
> > > Co-developed-by: Manolis Tsamis 
> > >
> > > ---
> > > Changes in v3:
> > > - Address nit from Kito (use rv64 and rv32 on gcc.dg/pr90838.c
> > >   consistently.
> > >
> > > Changes in v2:
> > > - Address review comments from Palmer (merging testcases)
> > > - Merge the different target-specific testcases for CLZ into one
> > > - Add RV32 tests
> > > - Fix pr90383.c testcase for x86_64
> > >
> > >  gcc/config/riscv/riscv.h   |  5 ++
> > >  gcc/doc/sourcebuild.texi   | 12 
> > >  gcc/testsuite/gcc.dg/pr90838.c | 25 +
> > >  gcc/testsuite/gcc.target/aarch64/pr90838.c | 64 --
> > >  gcc/testsuite/gcc.target/i386/pr95863-2.c  | 27 -
> > >  gcc/testsuite/lib/target-supports.exp  | 30 ++
> > >  6 files changed, 72 insertions(+), 91 deletions(-)
> > >  delete mode 100644 gcc/testsuite/gcc.target/aarch64/pr90838.c
> > >  delete mode 100644 gcc/testsuite/gcc.target/i386/pr95863-2.c
> > >
> > > diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> > > index 8a4d2cf7f85..b191606edb4 100644
> > > --- a/gcc/config/riscv/riscv.h
> > > +++ b/gcc/config/riscv/riscv.h
> > > @@ -1004,4 +1004,9 @@ extern void
> > riscv_remove_unneeded_save_restore_calls (void);
> > >
> > >  #define HARD_REGNO_RENAME_OK(FROM, TO) riscv_hard_regno_rename_ok
> > (FROM, TO)
> > >
> > > +#define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
> > > +  ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)
> > > +#define CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
> > > +  ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)
> > > +
> > >  #endif /* ! GCC_RISCV_H */
> > > diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
> > > index 613ac29967b..71c04841df2 100644
> > > --- a/gcc/doc/sourcebuild.texi
> > > +++ b/gcc/doc/sourcebuild.texi
> > > @@ -2420,6 +2420,18 @@ PowerPC target pre-defines macro _ARCH_PWR9 which
> > means the @code{-mcpu}
> > >  setting is Power9 or later.
> > >  @end table
> > >
> > > +@subsection RISC-V specific attributes
> > > +
> > > +@table @code
> > > +

Re: [PATCH v3] RISC-V: Implement C[LT]Z_DEFINED_VALUE_AT_ZERO

2022-06-02 Thread Philipp Tomsich
OK for backport?

Thanks,
Phil.

On Fri, 13 May 2022 at 22:23, Philipp Tomsich 
wrote:

> Added the two nits from Kito's review and … Applied to trunk!

Re: [PATCH v3] RISC-V: Implement C[LT]Z_DEFINED_VALUE_AT_ZERO

2022-05-13 Thread Philipp Tomsich
Added the two nits from Kito's review and … Applied to trunk!


On Fri, 13 May 2022 at 22:16, Philipp Tomsich  wrote:

[PATCH v3] RISC-V: Implement C[LT]Z_DEFINED_VALUE_AT_ZERO

2022-05-13 Thread Philipp Tomsich
The Zbb support has introduced ctz and clz to the backend, but some
transformations in GCC need to know what the value of c[lt]z at zero
is. This affects how the optab is generated and may suppress use of
CLZ/CTZ in tree passes.
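
For illustration, "the value at zero" can be modelled in portable C.
Per the Zbb specification, ctz and clz return XLEN for a zero input,
whereas __builtin_ctz/__builtin_clz leave that case undefined.  The
sketch below models the RV64 behaviour only; the names model_ctz64,
model_clz64 and check_model are made up for this example and are not
part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of the RV64 Zbb "ctz" instruction:
   counts trailing zero bits and returns 64 (XLEN) for a zero input.  */
unsigned
model_ctz64 (uint64_t x)
{
  unsigned n = 0;

  if (x == 0)
    return 64;
  while ((x & 1) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Hypothetical software model of the RV64 Zbb "clz" instruction:
   counts leading zero bits and returns 64 (XLEN) for a zero input.  */
unsigned
model_clz64 (uint64_t x)
{
  unsigned n = 0;

  if (x == 0)
    return 64;
  while ((x >> 63) == 0)
    {
      x <<= 1;
      n++;
    }
  return n;
}

/* A few sanity checks for the model above.  */
void
check_model (void)
{
  assert (model_ctz64 (0) == 64);
  assert (model_clz64 (0) == 64);
  assert (model_ctz64 (8) == 3);
  assert (model_clz64 (1) == 63);
}
```

Because the result at zero is well defined, the middle end does not
have to treat a zero operand as a special case when it considers
emitting CLZ/CTZ.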

Among other things, this is needed for the transformation of
table-based ctz-implementations, such as in deepsjeng, to work
(see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90838).

Prior to this change, the test case from PR90838 would compile to
the following on RISC-V targets with Zbb:
  myctz:
        lui     a4,%hi(.LC0)
        ld      a4,%lo(.LC0)(a4)
        neg     a5,a0
        and     a5,a5,a0
        mul     a5,a5,a4
        lui     a4,%hi(.LANCHOR0)
        addi    a4,a4,%lo(.LANCHOR0)
        srli    a5,a5,58
        sh2add  a5,a5,a4
        lw      a0,0(a5)
        ret

After this change, we get:
  myctz:
        ctz     a0,a0
        andi    a0,a0,63
        ret
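
For readers without the PR at hand, the idiom being recognized looks
roughly like the sketch below: isolate the lowest set bit, multiply by
a de Bruijn constant and use the top six bits as a table index.  The
names init_table, table_ctz64 and check_table_ctz64 are hypothetical,
and the table is filled at run time here purely to keep the example
short (the PR90838 test case uses constant tables, which is presumably
what the forwprop matching relies on).  For x == 0 the lookup yields
table[0], which is 0 for this constant; since ctz returns 64 at zero,
the andi with 63 in the output above presumably reproduces that 0.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit de Bruijn constant commonly used for this idiom.  */
#define DEBRUIJN64 0x03f79d71b4ca8b09ULL

static int table[64];

/* Fill the lookup table: the single-bit value 1 << i hashes to a
   unique 6-bit index, and that slot stores i.  */
void
init_table (void)
{
  for (int i = 0; i < 64; i++)
    table[(((uint64_t) 1 << i) * DEBRUIJN64) >> 58] = i;
}

/* Table-based count of trailing zeros; yields table[0] for x == 0.  */
int
table_ctz64 (uint64_t x)
{
  uint64_t lsb = x & -x;   /* isolate the lowest set bit */

  return table[(lsb * DEBRUIJN64) >> 58];
}

/* Sanity checks: every single-bit input maps back to its bit index,
   and the zero input falls through to table[0].  */
void
check_table_ctz64 (void)
{
  init_table ();
  for (int i = 0; i < 64; i++)
    assert (table_ctz64 ((uint64_t) 1 << i) == i);
  assert (table_ctz64 (0) == 0);
  assert (table_ctz64 (0x50) == 4);
}
```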

Testing with deepsjeng_r (from SPEC 2017) under QEMU shows a clear
reduction in dynamic instruction count:
 - before  1961888067076
 - after   1907928279874 (2.75% reduction)

This also merges the various target-specific test-cases (for x86-64,
aarch64 and riscv) within gcc.dg/pr90838.c.

This extends the macros (i.e., effective-target keywords) used in
testing (lib/target-supports.exp) to reliably distinguish between RV32
and RV64 via __riscv_xlen (i.e., the integer register bitwidth):
testing for ILP32 could be misleading (as ILP32 is a valid memory
model for 64-bit systems).
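
To make the distinction concrete, here is a minimal sketch of keying
on __riscv_xlen rather than on the data model.  The names
riscv_xlen_bits and check_xlen are made up for this example; the actual
effective-target checks live in lib/target-supports.exp and are not
reproduced here.

```c
#include <assert.h>

/* Returns the integer register width in bits when compiling for
   RISC-V, or 0 when compiling for any other target.  */
int
riscv_xlen_bits (void)
{
#if defined (__riscv) && defined (__riscv_xlen)
  return __riscv_xlen;   /* 32 on RV32, 64 on RV64 */
#else
  return 0;
#endif
}

/* The result depends only on the register width, not on whether int,
   long and pointers happen to be 32 bits wide.  */
void
check_xlen (void)
{
  int w = riscv_xlen_bits ();

  assert (w == 0 || w == 32 || w == 64);
}
```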

gcc/ChangeLog:

* config/riscv/riscv.h (CLZ_DEFINED_VALUE_AT_ZERO): Implement.
(CTZ_DEFINED_VALUE_AT_ZERO): Same.
* doc/sourcebuild.texi: Add documentation for RISC-V specific
test target keywords.

gcc/testsuite/ChangeLog:

* gcc.dg/pr90838.c: Add additional flags (dg-additional-options)
  when compiling for riscv64 and subsume gcc.target/aarch64/pr90838.c
  and gcc.target/i386/pr95863-2.c.
* gcc.target/riscv/zbb-ctz.c: New test.
* gcc.target/aarch64/pr90838.c: Removed.
* gcc.target/i386/pr95863-2.c: Removed.
* lib/target-supports.exp: Recognize RV32 or RV64 via XLEN.

Signed-off-by: Philipp Tomsich 
Signed-off-by: Manolis Tsamis 
Co-developed-by: Manolis Tsamis 

---
Changes in v3:
- Address nit from Kito (use rv64 and rv32 on gcc.dg/pr90838.c
  consistently).

Changes in v2:
- Address review comments from Palmer (merging testcases)
- Merge the different target-specific testcases for CLZ into one
- Add RV32 tests
- Fix pr90838.c testcase for x86_64

 gcc/config/riscv/riscv.h   |  5 ++
 gcc/doc/sourcebuild.texi   | 12 
 gcc/testsuite/gcc.dg/pr90838.c | 25 +
 gcc/testsuite/gcc.target/aarch64/pr90838.c | 64 --
 gcc/testsuite/gcc.target/i386/pr95863-2.c  | 27 -
 gcc/testsuite/lib/target-supports.exp  | 30 ++
 6 files changed, 72 insertions(+), 91 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.target/aarch64/pr90838.c
 delete mode 100644 gcc/testsuite/gcc.target/i386/pr95863-2.c

diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 8a4d2cf7f85..b191606edb4 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -1004,4 +1004,9 @@ extern void riscv_remove_unneeded_save_restore_calls (void);
 
 #define HARD_REGNO_RENAME_OK(FROM, TO) riscv_hard_regno_rename_ok (FROM, TO)
 
+#define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
+  ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)
+#define CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
+  ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)
+
 #endif /* ! GCC_RISCV_H */
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 613ac29967b..71c04841df2 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -2420,6 +2420,18 @@ PowerPC target pre-defines macro _ARCH_PWR9 which means the @code{-mcpu}
 setting is Power9 or later.
 @end table
 
+@subsection RISC-V specific attributes
+
+@table @code
+
+@item rv32
+Test system has an integer register width of 32 bits.
+
+@item rv64
+Test system has an integer register width of 64 bits.
+
+@end table
+
 @subsubsection Other hardware attributes
 
 @c Please keep this table sorted alphabetically.
diff --git a/gcc/testsuite/gcc.dg/pr90838.c b/gcc/testsuite/gcc.dg/pr90838.c
index 41c5dab9a5c..7502b846346 100644
--- a/gcc/testsuite/gcc.dg/pr90838.c
+++ b/gcc/testsuite/gcc.dg/pr90838.c
@@ -1,5 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-forwprop2-details" } */
+/* { dg-additional-options "-mbmi" { target { { i?86-*-* x86_64-*-* } && { ! { ia32 } } } } } */
+/* { dg-additional-options "-march=rv64gc_zbb" { target { rv64 } } } */
+/* { dg-additional-options "-march=rv32gc_zbb" { target { rv32 } } } */
 
 int ctz1 (unsigned x)
 {
@@ -56,4 +59,26 @@ int ctz4 (unsigned long x)
   return table[(lsb * magic) >> 58];
 }
 
+/* { dg-final { scan-tree-dump-times {= \.CTZ} 4 "forwprop2" { target { { 
i?86-*-* x86_64-*-*