Re: [PATCH v2 05/07] RISC-V: autovec: Add tuning and target vectorization hooks

2023-03-05 Thread Richard Biener via Gcc-patches
On Mon, Mar 6, 2023 at 4:16 AM Michael Collison  wrote:
>
> This patch adds support for registering target hooks for basic
> autovectorization support as well as basic tuning information for the
> vector extension.

Btw, while tuning isn't established and autovect support is still
limited, I would suggest making the costing hooks reject all vectorization,
so that code is vectorized only with -fno-vect-cost-model (that's what the
basic vect.exp testsuite uses).

That allows collaborative development on trunk while not surprising
users with unprofitable vectorization.
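
A minimal sketch of that suggestion, not GCC's actual hook interface: every
name below (toy_cost_model, toy_vector_stmt_cost, toy_should_vectorize) is
hypothetical.  While the target is untuned the cost hook reports a prohibitive
vector cost, so vectorization only happens when the cost model is ignored.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the vectorizer cost model selection;
   TOY_COST_UNLIMITED plays the role of -fno-vect-cost-model.  */
enum toy_cost_model { TOY_COST_DYNAMIC, TOY_COST_UNLIMITED };

/* Hypothetical target cost hook.  While tuning is not established it
   reports a prohibitive cost for every vector statement.  */
unsigned
toy_vector_stmt_cost (bool tuning_established, unsigned tuned_cost)
{
  return tuning_established ? tuned_cost : UINT_MAX;
}

/* Vectorize when costs are ignored, or when the vector version is
   estimated to be cheaper than the scalar one.  */
bool
toy_should_vectorize (enum toy_cost_model model, unsigned scalar_cost,
                      unsigned vector_cost)
{
  if (model == TOY_COST_UNLIMITED)
    return true;
  return vector_cost < scalar_cost;
}
```

With the untuned hook, the default (dynamic) model never vectorizes, while the
unlimited model still does, which keeps the testsuite usable.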

I agree that loads and stores are the first priority for any autovect
attempts because there you learn about the details and you get
pushed on the right track.

Richard.

> gcc/ChangeLog:
>
> 2023-03-02  Michael Collison 
>  Juzhe Zhong 
>
>  * config/riscv/riscv-cores.def (RISCV_TUNE):
>  Add VECTOR_TUNE_INFO parameter and
>  * common/config/riscv/riscv-common.cc (RISCV_TUNE):
>  Add VECTOR_TUNE_INFO parameter.
>  * config/riscv/riscv.cc (riscv_vector_tune_param):
>  New struct for vector tuning information.
>  (riscv_tune_info): add vector_tune_param.
>  (vector_tune_param): New static variable.
>  (riscv_vectorization_factor): New variable.
>  (generic_rvv_insn_scale_table): New struct.
>  (generic_rvv_stmt_scale_table): New struct.
>  (generic_rvv_insn_cost_table): New vector insn cost table.
>  (generic_rvv_stmt_cost_table): New vector statement
> cost table.
>  (generic_rvv_tune_info): New rvv tuning table.
>  (RISCV_TUNE): Add VECTOR_TUNE_INFO parameter.
>  (riscv_rtx_costs): Return vector estimate if vector mode.
>  (riscv_option_override): Set vector_tune_param.
>  (riscv_option_override): Set riscv_vectorization_factor.
>  (riscv_estimated_poly_value): Implement
>  TARGET_ESTIMATED_POLY_VALUE.
>  (riscv_preferred_simd_mode): Implement
>  TARGET_VECTORIZE_PREFERRED_SIMD_MODE.
>  (riscv_autovectorize_vector_modes): Implement
>  TARGET_AUTOVECTORIZE_VECTOR_MODES.
>  (riscv_get_mask_mode): Implement
> TARGET_VECTORIZE_GET_MASK_MODE.
>  (riscv_empty_mask_is_expensive): Implement
>  TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE.
>  (riscv_builtin_vectorization_cost): Implement
>  TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST.
>  (riscv_vectorize_create_costs): Implement
>  TARGET_VECTORIZE_CREATE_COSTS.
>  (TARGET_ESTIMATED_POLY_VALUE): Register target macro.
>  (TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST): Ditto.
> (TARGET_VECTORIZE_PREFERRED_SIMD_MODE): Ditto.
>  (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Ditto.
>  (TARGET_VECTORIZE_GET_MASK_MODE): Ditto.
>  (TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE): Ditto.
>  (TARGET_VECTORIZE_LOOP_LEN_OVERRIDE_MASK): Ditto.
>  (TARGET_VECTORIZE_CREATE_COSTS): Ditto
>
> ---
>   gcc/common/config/riscv/riscv-common.cc |   2 +-
>   gcc/config/riscv/riscv-cores.def|  14 +-
>   gcc/config/riscv/riscv.cc   | 324 +++-
>   3 files changed, 328 insertions(+), 12 deletions(-)
>
> diff --git a/gcc/common/config/riscv/riscv-common.cc
> b/gcc/common/config/riscv/riscv-common.cc
> index ebc1ed7d7e4..6b8d92af986 100644
> --- a/gcc/common/config/riscv/riscv-common.cc
> +++ b/gcc/common/config/riscv/riscv-common.cc
> @@ -246,7 +246,7 @@ static const riscv_cpu_info riscv_cpu_tables[] =
>
>   static const char *riscv_tunes[] =
>   {
> -#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO) \
> +#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO,
> VECTOR_TUNE_INFO)\
>   TUNE_NAME,
>   #include "../../../config/riscv/riscv-cores.def"
>   NULL
> diff --git a/gcc/config/riscv/riscv-cores.def
> b/gcc/config/riscv/riscv-cores.def
> index 2a834cae21d..4feb0366222 100644
> --- a/gcc/config/riscv/riscv-cores.def
> +++ b/gcc/config/riscv/riscv-cores.def
> @@ -30,15 +30,15 @@
>  identifier, reference to riscv.cc.  */
>
>   #ifndef RISCV_TUNE
> -#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO)
> +#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO, VECTOR_TUNE_INFO)
>   #endif
>
> -RISCV_TUNE("rocket", generic, rocket_tune_info)
> -RISCV_TUNE("sifive-3-series", generic, rocket_tune_info)
> -RISCV_TUNE("sifive-5-series", generic, rocket_tune_info)
> -RISCV_TUNE("sifive-7-series", sifive_7, sifive_7_tune_info)
> -RISCV_TUNE("thead-c906", generic, thead_c906_tune_info)
> -RISCV_TUNE("size", generic, optimize_size_tune_info)
> +RISCV_TUNE("rocket", generic, rocket_tune_info, generic_rvv_tune_info)

Re: [PATCH 1/2] gcov: Fix "do-while" structure in case statement leads to incorrect code coverage [PR93680]

2023-03-05 Thread Xionghu Luo via Gcc-patches




On 2023/3/2 18:45, Richard Biener wrote:



   small.gcno:  648:  block 2:`small.c':1, 3, 4, 6
   small.gcno:  688:0145:  36:LINES
   small.gcno:  700:  block 3:`small.c':8, 9
   small.gcno:  732:0145:  32:LINES
   small.gcno:  744:  block 5:`small.c':10
-small.gcno:  772:0145:  32:LINES
-small.gcno:  784:  block 6:`small.c':12
-small.gcno:  812:0145:  36:LINES
-small.gcno:  824:  block 7:`small.c':12, 13
+small.gcno:  772:0145:  36:LINES
+small.gcno:  784:  block 6:`small.c':12, 13
+small.gcno:  816:0145:  32:LINES
+small.gcno:  828:  block 8:`small.c':14
   small.gcno:  856:0145:  32:LINES
-small.gcno:  868:  block 8:`small.c':14
-small.gcno:  896:0145:  32:LINES
-small.gcno:  908:  block 9:`small.c':17
+small.gcno:  868:  block 9:`small.c':17


Looking at the CFG and the instrumentation shows

:
   PROF_edge_counter_17 = __gcov0.f[0];
   PROF_edge_counter_18 = PROF_edge_counter_17 + 1;
   __gcov0.f[0] = PROF_edge_counter_18;
   [t.c:3:7] p_6 = 0;
   [t.c:5:3] switch (s_7(D))  [INV], [t.c:7:5] case 0:
 [INV], [t.c:11:5] case 1:  [INV]>

:
   # n_1 = PHI 
   # p_3 = PHI <[t.c:3:7] p_6(2), [t.c:8:15] p_12(4)>
[t.c:7:5] :
   [t.c:8:15] p_12 = p_3 + 1;
   [t.c:8:28] n_13 = n_1 + -1;
   [t.c:8:28] if (n_13 != 0)
 goto ; [INV]
   else
 goto ; [INV]

:
   PROF_edge_counter_21 = __gcov0.f[2];
   PROF_edge_counter_22 = PROF_edge_counter_21 + 1;
   __gcov0.f[2] = PROF_edge_counter_22;
   [t.c:7:5] goto ; [100.00%]

:
   PROF_edge_counter_23 = __gcov0.f[3];
   PROF_edge_counter_24 = PROF_edge_counter_23 + 1;
   __gcov0.f[3] = PROF_edge_counter_24;
   [t.c:9:16] _14 = p_12;
   [t.c:9:16] goto ; [INV]

so the reason this goes wrong is that gcov associates the "wrong"
counter with the block containing
the 'case' label(s), for the case 0 it should have chosen the counter
from bb5 but it likely
computed the count of bb3?

It might be that ordering blocks differently puts the instrumentation
to different blocks or it
makes gcovs association chose different blocks but that means it's
just luck and not fixing
the actual issue?

To me it looks like the correct thing to investigate is switch
statement and/or case label
handling.  One can also see that  having line number 7 is wrong to
the extent that
the position of the label doesn't match the number of times it
executes in the source.  So
placement of the label is wrong here, possibly caused by CFG cleanup
after CFG build
(but generally labels are not used for anything once the CFG is built
and coverage
instrumentation is late so it might fail due to us moving labels).  It
might be OK to
avoid moving labels for --coverage but then coverage should possibly
look at edges
rather than labels?



Thanks.  I investigated the labels; they already look wrong very early, in the
transition from .gimple to .cfg, much like PR90574:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90574

.gimple:

int f (int s, int n)
[small.c:2:1] {
  int D.2755;
  int p;

  [small.c:3:7] p = 0;
  [small.c:5:3] switch (s) , [small.c:7:5] case 0: , 
[small.c:11:5] case 1: >
  [small.c:7:5] :  <= case label
  :<= loop label
  [small.c:8:13] p = p + 1;
  [small.c:8:26] n = n + -1;
  [small.c:8:26] if (n != 0) goto ; else goto ;
  :
  [small.c:9:14] D.2755 = p;
  [small.c:9:14] return D.2755;
  [small.c:11:5] :
  :
  [small.c:12:13] p = p + 1;
  [small.c:12:26] n = n + -1;
  [small.c:12:26] if (n != 0) goto ; else goto ;
  :
  [small.c:13:14] D.2755 = p;
  [small.c:13:14] return D.2755;
  :
  [small.c:16:10] D.2755 = 0;
  [small.c:16:10] return D.2755;
}
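
For reference, the dump above is consistent with a test case along the
following lines.  This is a reconstruction from the line and column locations
in the dump, not the original small.c, so the exact layout is an assumption.

```c
int f (int s, int n)
{                                   /* line 2 */
  int p = 0;                        /* line 3 */

  switch (s)                        /* line 5 */
    {
    case 0:                         /* line 7 */
      do { p++; } while (--n);      /* line 8 */
      return p;                     /* line 9 */

    case 1:                         /* line 11 */
      do { p++; } while (--n);      /* line 12 */
      return p;                     /* line 13 */
    }

  return 0;                         /* line 16 */
}
```

The case label on line 7 is reached once per call, while the loop body on
line 8 runs n times, so the two need separate counts.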

.cfg:

int f (int s, int n)
{
  int p;
  int D.2755;

   :
  [small.c:3:7] p = 0;
  [small.c:5:3] switch (s)  [INV], [small.c:7:5] case 0:  [INV], 
[small.c:11:5] case 1:  [INV]>

   :
[small.c:7:5] :   <= case 0
  [small.c:8:13 discrim 1] p = p + 1;
  [small.c:8:26 discrim 1] n = n + -1;
  [small.c:8:26 discrim 1] if (n != 0)
goto ; [INV]
  else
goto ; [INV]

   :
  [small.c:9:14] D.2755 = p;
  [small.c:9:14] goto ; [INV]

   :
[small.c:11:5] :  <= case 1
  [small.c:12:13 discrim 1] p = p + 1;
  [small.c:12:26 discrim 1] n = n + -1;
  [small.c:12:26 discrim 1] if (n != 0)
goto ; [INV]
  else
goto ; [INV]


The labels are unexpectedly merged into the loop, so I tried the fix below:
for --coverage, start a new basic block when two labels are not on the same
line:


index 10ca86714f4..b788198ac31 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -2860,6 +2860,13 @@ stmt_starts_bb_p (gimple *stmt, gimple *prev_stmt)
  || !DECL_ARTIFICIAL (gimple_label_label (plabel)))
return true;

+ location_t loc_prev = gimple_location (plabel);
+ location_t locus = gimple_location (label_stmt);
+ expanded_location locus_e = 

Re: [PATCH v2] LoongArch: Stop -mfpu from silently breaking ABI [PR109000]

2023-03-05 Thread Lulu Cheng



On 2023/3/3 at 4:16 PM, Xi Ruoyao wrote:

In the toolchain convention, we describe -mfpu= as:

"Selects the allowed set of basic floating-point instructions and
registers. This option should not change the FP calling convention
unless it's necessary."

Though not explicitly stated, the rationale of this rule is to allow
combinations like "-mabi=lp64s -mfpu=64".  This will be useful for
running applications with LP64S/F ABI on double-float-capable
LoongArch hardware and using a math library with LP64S/F ABI but native
double float HW instructions, for better performance.

And now a case in the Linux kernel has again proven the usefulness of this
kind of combination.  The AMDGPU DCN kernel driver needs to perform some
floating-point operations, but the entire kernel uses LP64S ABI.  So the
translation units of the AMDGPU DCN driver need to be compiled with
-mfpu=64 (the kernel lacks soft-FP routines in libgcc), but -mabi=lp64s
(or you can't link it with the rest of the kernel).

Unfortunately, currently GCC uses TARGET_{HARD,SOFT,DOUBLE}_FLOAT to
determine the floating-point calling convention.  This causes "-mfpu=64" to
silently allow using $fa* to pass parameters and return values EVEN IF
-mabi=lp64s is used.  To make things worse, the generated object file
has SOFT-FLOAT set in the eflags field so the linker will happily link
it with other LP64S ABI object files, but obviously this will lead to
bad results at runtime.  And for now all loongarch64 CPU models (-march
settings) imply -mfpu=64 by default, so the issue makes a single
"-mabi=lp64s" option basically broken (fortunately most projects, e.g.
the Linux kernel, have used -msoft-float, which implies both -mabi=lp64s
and -mfpu=none, as we've recommended in the toolchain convention doc).

The fix is simple: use TARGET_*_FLOAT_ABI instead.

I consider this a bug fix: the behavior difference from the toolchain
convention doc is a bug, and generating object files with SOFT-FLOAT
flag but parameters/return values passed through FPRs is definitely a
bug.
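
A simplified sketch of the distinction, using hypothetical names rather than
the real GCC macros.  The register numbers (GPRs starting at 0, FPRs at 32)
are assumed for illustration only.

```c
#include <stdbool.h>

/* Register numbering assumed for illustration only.  */
enum { TOY_GP_REG_FIRST = 0, TOY_FP_REG_FIRST = 32 };
enum { TOY_GP_RETURN = TOY_GP_REG_FIRST + 4,
       TOY_FP_RETURN = TOY_FP_REG_FIRST + 0 };

/* Hypothetical option state: what the hardware may execute (-mfpu=)
   versus how values cross function boundaries (-mabi=).  */
struct toy_options
{
  bool isa_has_fpu;     /* -mfpu=32 or -mfpu=64 */
  bool abi_soft_float;  /* -mabi=lp64s */
};

/* Before the fix: the FP return register follows the ISA setting.  */
int
toy_fp_return_old (const struct toy_options *o)
{
  return !o->isa_has_fpu ? TOY_GP_RETURN : TOY_FP_RETURN;
}

/* After the fix: the FP return register follows the ABI setting only.  */
int
toy_fp_return_new (const struct toy_options *o)
{
  return o->abi_soft_float ? TOY_GP_RETURN : TOY_FP_RETURN;
}
```

For "-mabi=lp64s -mfpu=64" the old logic picks an FPR for the return value
even though the object is marked soft-float; the new logic picks a GPR.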

Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk and
release/gcc-12 branch?


LGTM!

Thanks!



gcc/ChangeLog:

PR target/109000
* config/loongarch/loongarch.h (FP_RETURN): Use
TARGET_*_FLOAT_ABI instead of TARGET_*_FLOAT.
(UNITS_PER_FP_ARG): Likewise.

gcc/testsuite/ChangeLog:

PR target/109000
* gcc.target/loongarch/flt-abi-isa-1.c: New test.
* gcc.target/loongarch/flt-abi-isa-2.c: New test.
* gcc.target/loongarch/flt-abi-isa-3.c: New test.
* gcc.target/loongarch/flt-abi-isa-4.c: New test.
---
  gcc/config/loongarch/loongarch.h   |  4 ++--
  gcc/testsuite/gcc.target/loongarch/flt-abi-isa-1.c | 14 ++
  gcc/testsuite/gcc.target/loongarch/flt-abi-isa-2.c | 10 ++
  gcc/testsuite/gcc.target/loongarch/flt-abi-isa-3.c |  9 +
  gcc/testsuite/gcc.target/loongarch/flt-abi-isa-4.c | 10 ++
  5 files changed, 45 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-1.c
  create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-2.c
  create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-3.c
  create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-4.c

diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
index f4e903d46bb..f8167875646 100644
--- a/gcc/config/loongarch/loongarch.h
+++ b/gcc/config/loongarch/loongarch.h
@@ -676,7 +676,7 @@ enum reg_class
 point values.  */
  
  #define GP_RETURN (GP_REG_FIRST + 4)

-#define FP_RETURN ((TARGET_SOFT_FLOAT) ? GP_RETURN : (FP_REG_FIRST + 0))
+#define FP_RETURN ((TARGET_SOFT_FLOAT_ABI) ? GP_RETURN : (FP_REG_FIRST + 0))
  
  #define MAX_ARGS_IN_REGISTERS 8
  
@@ -1154,6 +1154,6 @@ struct GTY (()) machine_function

  /* The largest type that can be passed in floating-point registers.  */
  /* TODO: according to mabi.  */
  #define UNITS_PER_FP_ARG  \
-  (TARGET_HARD_FLOAT ? (TARGET_DOUBLE_FLOAT ? 8 : 4) : 0)
+  (TARGET_HARD_FLOAT_ABI ? (TARGET_DOUBLE_FLOAT_ABI ? 8 : 4) : 0)
  
  #define FUNCTION_VALUE_REGNO_P(N) ((N) == GP_RETURN || (N) == FP_RETURN)

diff --git a/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-1.c b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-1.c
new file mode 100644
index 000..1c9490f6a87
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-mabi=lp64d -mfpu=64 -march=loongarch64 -O2" } */
+/* { dg-final { scan-assembler "frecip\\.d" } } */
+/* { dg-final { scan-assembler-not "movgr2fr\\.d" } } */
+/* { dg-final { scan-assembler-not "movfr2gr\\.d" } } */
+
+/* FPU is used for calculation and FPR is used for arguments and return
+   values.  */
+
+double
+t (double x)
+{
+  return 1.0 / x;
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-2.c 
b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-2.c
new file mode 100644

Re: [PATCH 0/2] LoongArch: testsuite: Fix tests related to stack

2023-03-05 Thread Xi Ruoyao via Gcc-patches
On Mon, 2023-03-06 at 11:16 +0800, Xi Ruoyao wrote:

/* snip */

> > > Sorry for the late reply, the first patch I think is fine. But I haven't 
> > > reproduced the problem of the second mail.
> > > 
> > > Is there any special option in the configuration?
> > 
> > Oh some strange thing might be happening... I'll try to figure out what
> > has caused the behavior difference.
> 
> Oh no, the difference is caused by --enable-default-pie.
> 
> Maybe I should just add -fno-PIE for the dg-options.  But now I'm still
> puzzled: why would -fPIE affect code generation on LoongArch?  AFAIK all
> the code we are generating is position independent (at least for now).

Without -fPIE, the compiler stores a register for no reason:

$ cat t.c
int test(int x)
{
char buf[128 << 10];
return buf[x];
}
$ ./gcc/cc1 t.c -nostdinc  -O2 -fdump-rtl-all -o- 2>/dev/null | grep test: -A20
test:
.LFB0 = .
lu12i.w $r13,-135168>>12# 0xfffdf000
ori $r13,$r13,4080
add.d   $r3,$r3,$r13
.LCFI0 = .
lu12i.w $r12,-131072>>12# 0xfffe
lu12i.w $r13,131072>>12 # 0x2
add.d   $r13,$r13,$r12
addi.d  $r12,$r3,16
add.d   $r12,$r13,$r12
lu12i.w $r13,131072>>12 # 0x2
st.d$r12,$r3,8
ori $r13,$r13,16
ldx.b   $r4,$r12,$r4
add.d   $r3,$r3,$r13
.LCFI1 = .
jr  $r1
.LFE0:
.size   test, .-test
.section.eh_frame,"aw",@progbits

Note the "st.d  $r12,$r3,8" instruction is completely meaningless.
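
The address arithmetic in the listing can be checked with a small sketch.
The function names are hypothetical, and offsets are expressed relative to
$r3 after the prologue adjustment.

```c
#include <stdint.h>

/* Net stack adjustment of the prologue.  "lu12i.w $r13,-135168>>12"
   yields -135168 (a multiple of 4096); OR-ing 4080 into a value whose
   low 12 bits are zero is the same as adding it.  */
int64_t
toy_prologue_adjust (void)
{
  int64_t r13 = -135168;   /* lu12i.w $r13,-135168>>12 */
  r13 += 4080;             /* ori     $r13,$r13,4080   */
  return r13;              /* add.d   $r3,$r3,$r13     */
}

/* Offset from $r3 of the address that ends up in $r12.  */
int64_t
toy_r12_offset (void)
{
  int64_t r12 = -131072;   /* lu12i.w $r12,-131072>>12 */
  int64_t r13 = 131072;    /* lu12i.w $r13,131072>>12  */
  r13 = r13 + r12;         /* add.d   $r13,$r13,$r12   */
  r12 = 16;                /* addi.d  $r12,$r3,16      */
  return r13 + r12;        /* add.d   $r12,$r13,$r12   */
}
```

So the frame is 128 KiB for buf plus 16 bytes, $r12 is simply $r3 + 16, and
the value stored at $r3 + 8 is never read back in the listing.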

The t.c.300r.ira dump contains something "interesting":

Pass 0 for finding pseudo/allocno costs

a0 (r87,l0) best GR_REGS, allocno GR_REGS
a1 (r84,l0) best NO_REGS, allocno NO_REGS
a2 (r83,l0) best GR_REGS, allocno GR_REGS

  a0(r87,l0) costs: SIBCALL_REGS:2000,2000 JIRL_REGS:2000,2000 CSR_REGS:2000,2000 GR_REGS:2000,2000 FP_REGS:8000,8000 ALL_REGS:32000,32000 MEM:8000,8000
  a1(r84,l0) costs: SIBCALL_REGS:100,100 JIRL_REGS:100,100 CSR_REGS:100,100 GR_REGS:100,100 FP_REGS:1004000,1004000 ALL_REGS:1016000,1016000 MEM:1004000,1004000
  a2(r83,l0) costs: SIBCALL_REGS:100,100 JIRL_REGS:100,100 CSR_REGS:100,100 GR_REGS:100,100 FP_REGS:1004000,1004000 ALL_REGS:1008000,1008000 MEM:1004000,1004000


Here r84 is the pseudo register for ($frame - 131072).  Any idea why the
compiler selects "NO_REGS" here?

FWIW RISC-V port suffers the same issue:
https://godbolt.org/z/aPorqj73b.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH v2 00/07] RISC-V: autovec: Add auto-vectorization support

2023-03-05 Thread Michael Collison
Thanks for the feedback, will try that next time.

Michael Collison


> On Mar 5, 2023, at 11:06 PM, Xi Ruoyao  wrote:
> 
> On Sun, 2023-03-05 at 22:13 -0500, Michael Collison wrote:
> 
> /* snip */
> 
>> - Fixed ChangeLog email formatting
> 
> Unfortunately it's not fixed.  We expect one tab, but now you have 16
> spaces.
> 
> To me it looks like your email client is being too smart and destroying
> the patch.  Try "git send-email", which is much easier to configure
> correctly.
> 
> -- 
> Xi Ruoyao 
> School of Aerospace Science and Technology, Xidian University


Re: [PATCH v2 00/07] RISC-V: autovec: Add auto-vectorization support

2023-03-05 Thread Xi Ruoyao via Gcc-patches
On Sun, 2023-03-05 at 22:13 -0500, Michael Collison wrote:

/* snip */

> - Fixed ChangeLog email formatting

Unfortunately it's not fixed.  We expect one tab, but now you have 16
spaces.

To me it looks like your email client is being too smart and destroying
the patch.  Try "git send-email", which is much easier to configure
correctly.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH] Always define `WIN32_LEAN_AND_MEAN` before

2023-03-05 Thread Ian Lance Taylor via Gcc-patches
On Fri, Mar 3, 2023 at 10:47 PM Xi Ruoyao  wrote:
>
> On Sat, 2023-01-07 at 06:52 +, Jonathan Yong via Gcc-patches wrote:
> > On 1/6/23 18:10, Jakub Jelinek wrote:
> > > On Sat, Jan 07, 2023 at 02:01:05AM +0800, LIU Hao via Gcc-patches
> > > wrote:
> > > > libgomp/
> > > >
> > > > PR middle-end/108300
> > > > * config/mingw32/proc.c: Define `WIN32_LEAN_AND_MEAN`
> > > > before
> > > > .
> > >
> > > This change is ok for trunk.
> > >
> > > Jakub
> > >
> >
> > Pushed to master branch, thanks LH.
>
> The patch touches libgo (w/o mentioning it in the ChangeLog).  I guess
> you need to contribute the libgo part into the upstream Go runtime or
> the change will be undone when Ian merges libgo next time.

Thanks, I've reverted the part of the patch that applies to libgo.

It's not worth changing upstream because gccgo doesn't support Windows
anyhow, and because that change is gone in the even-more-upstream
sources.

Ian


[PATCH v2 07/07] RISC-V: autovec: Add autovectorization patterns for add & sub

2023-03-05 Thread Michael Collison

This patch adds tests for autovectorization of integer add and subtract.

gcc/testsuite/ChangeLog:

2023-03-02  Michael Collison 
                Vineet Gupta 

                * gcc.target/riscv/rvv/autovec: New directory
            for autovectorization tests.
            * gcc.target/riscv/rvv/autovec/loop-add-rv32.c: New
            test to verify code generation of vector add on rv32.
            * gcc.target/riscv/rvv/autovec/loop-add.c: New
            test to verify code generation of vector add on rv64.
            * gcc.target/riscv/rvv/autovec/loop-sub-rv32.c: New
            test to verify code generation of vector subtract on rv32.
            * gcc.target/riscv/rvv/autovec/loop-sub.c: New
            test to verify code generation of vector subtract on rv64.

---
 .../riscv/rvv/autovec/loop-add-rv32.c | 24 +++
 .../gcc.target/riscv/rvv/autovec/loop-add.c   | 24 +++
 .../riscv/rvv/autovec/loop-sub-rv32.c | 24 +++
 .../gcc.target/riscv/rvv/autovec/loop-sub.c   | 24 +++
 4 files changed, 96 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
new file mode 100644
index 000..bdc3b6892e9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv32gcv -mabi=ilp32d" } */
+
+#include 
+
+#define TEST_TYPE(TYPE)                 \
+  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)    \
+  {                            \
+    for (int i = 0; i < n; i++)                \
+  dst[i] = a[i] + b[i];                \
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()    \
+ TEST_TYPE(int16_t)    \
+ TEST_TYPE(uint16_t)    \
+ TEST_TYPE(int32_t)    \
+ TEST_TYPE(uint32_t)    \
+ TEST_TYPE(int64_t)    \
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvadd\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
new file mode 100644
index 000..d7f992c7d27
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv64gcv -mabi=lp64d" } */
+
+#include 
+
+#define TEST_TYPE(TYPE)                 \
+  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)    \
+  {                            \
+    for (int i = 0; i < n; i++)                \
+  dst[i] = a[i] + b[i];                \
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()    \
+ TEST_TYPE(int16_t)    \
+ TEST_TYPE(uint16_t)    \
+ TEST_TYPE(int32_t)    \
+ TEST_TYPE(uint32_t)    \
+ TEST_TYPE(int64_t)    \
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvadd\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
new file mode 100644
index 000..7d0a40ec539
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv32gcv -mabi=ilp32d" } */
+
+#include 
+
+#define TEST_TYPE(TYPE)                 \
+  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)    \
+  {                            \
+    for (int i = 0; i < n; i++)                \
+  dst[i] = a[i] - b[i];                \
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()    \
+ TEST_TYPE(int16_t)    \
+ TEST_TYPE(uint16_t)    \
+ TEST_TYPE(int32_t)    \
+ TEST_TYPE(uint32_t)    \
+ TEST_TYPE(int64_t)    \
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvsub\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c
new file mode 100644
index 000..c8900884f83
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv64gcv -mabi=lp64d" } */
+
+#include 
+
+#define TEST_TYPE(TYPE)                 \
+  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)    \
+  {                            \
+    for (int i = 0; i < n; i++)                \
+  dst[i] = a[i] - b[i];                \
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()    \
+ TEST_TYPE(int16_t) 
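
The tests above only scan the assembler output.  A run-time check of the same
loop could be sketched as follows; check_vadd_int16 is a hypothetical helper,
and <stdint.h> is an assumption because the header name is elided in the
patch as archived.

```c
#include <stdint.h>   /* assumption: header name is elided in the patch */

/* Same shape as the TEST_TYPE macro in the tests.  */
#define TEST_TYPE(TYPE)                                     \
  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)     \
  {                                                         \
    for (int i = 0; i < n; i++)                             \
      dst[i] = a[i] + b[i];                                 \
  }

TEST_TYPE (int16_t)

/* Hypothetical helper: return 1 if vadd_int16_t matches a scalar
   reference on a small input, 0 otherwise.  */
int
check_vadd_int16 (void)
{
  int16_t a[8], b[8], dst[8];

  for (int i = 0; i < 8; i++)
    {
      a[i] = (int16_t) (i * 3);
      b[i] = (int16_t) (100 - i);
      dst[i] = 0;
    }

  vadd_int16_t (dst, a, b, 8);

  for (int i = 0; i < 8; i++)
    if (dst[i] != (int16_t) (a[i] + b[i]))
      return 0;
  return 1;
}
```

Such a check would complement, not replace, the scan-assembler-times
directives, which are what verify that vadd.vv is actually emitted.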

Re: [PATCH 0/2] LoongArch: testsuite: Fix tests related to stack

2023-03-05 Thread Xi Ruoyao via Gcc-patches
On Mon, 2023-03-06 at 10:48 +0800, Xi Ruoyao wrote:
> On Mon, 2023-03-06 at 09:15 +0800, Lulu Cheng wrote:
> > 
> > On 2023/3/6 at 12:21 AM, Xi Ruoyao wrote:
> > > On Fri, 2023-03-03 at 08:21 -0800, Mike Stump wrote:
> > > > On Mar 3, 2023, at 12:40 AM, Xi Ruoyao via Gcc-patches
> > > >  wrote:
> > > > > Some trivial test case fixes.  Ok for trunk?
> > > > Ok.
> > > Lulu: if you don't object I'll push these two this week.
> > > 
> > > I tried to bisect for the exact point where the test cases broke,
> > > but it turns out they have been broken since the day they were committed
> > > (r13-4401).  As the draft of r13-4401 was sent in Sept 2022 but it was
> > > committed in Nov 2022, I can only guess something changed in those two
> > > months and broke the tests...
> > 
> > Sorry for the late reply, the first patch I think is fine. But I haven't 
> > reproduced the problem of the second mail.
> > 
> > Is there any special option in the configuration?
> 
> Oh some strange thing might be happening... I'll try to figure out what
> has caused the behavior difference.

Oh no, the difference is caused by --enable-default-pie.

Maybe I should just add -fno-PIE for the dg-options.  But now I'm still
puzzled: why would -fPIE affect code generation on LoongArch?  AFAIK all
the code we are generating is position independent (at least for now).
> 

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[PATCH V2 06/07] RISC-V: autovec: Add autovectorization patterns for add & sub

2023-03-05 Thread Michael Collison
This patch adds patterns that provide basic autovectorization support 
for integer adds and subtracts.


gcc/ChangeLog:

2023-03-02  Michael Collison 
                Juzhe Zhong 

                * config/riscv/riscv.md (riscv_vector_preferred_simd_mode): Include
                vector-iterators.md.
                * config/riscv/vector-auto.md: New file containing
                autovectorization patterns.
                * config/riscv/vector-iterators.md (UNSPEC_VADD/UNSPEC_VSUB):
                New unspecs for autovectorization patterns.
                * config/riscv/vector.md: Remove include of vector-iterators.md
                and include vector-auto.md.

---
 gcc/config/riscv/riscv.md    |   1 +
 gcc/config/riscv/vector-auto.md  | 172 +++
 gcc/config/riscv/vector-iterators.md |   2 +
 gcc/config/riscv/vector.md   |   4 +-
 4 files changed, 177 insertions(+), 2 deletions(-)
 create mode 100644 gcc/config/riscv/vector-auto.md

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 6c3176042fb..a504ace72e5 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -131,6 +131,7 @@
 (include "predicates.md")
 (include "constraints.md")
 (include "iterators.md")
+(include "vector-iterators.md")

 ;; 
 ;;
diff --git a/gcc/config/riscv/vector-auto.md b/gcc/config/riscv/vector-auto.md
new file mode 100644
index 000..e5a19663d18
--- /dev/null
+++ b/gcc/config/riscv/vector-auto.md
@@ -0,0 +1,172 @@
+;; Machine description for RISC-V 'V' Extension for GNU compiler.
+;; Copyright (C) 2022-2023 Free Software Foundation, Inc.
+;; Contributed by Juzhe Zhong (juzhe.zh...@rivai.ai), RiVAI Technologies Ltd.
+;; Contributed by Michael Collison (colli...@rivosinc.com), Rivos Inc.
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; .
+
+
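+;; -------------------------------------------------------------------------
+;;  [INT] Addition
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - vadd.vv
+;; - vadd.vx
+;; - vadd.vi
+;; -------------------------------------------------------------------------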
+
+(define_expand "add3"
+  [(match_operand:VI 0 "register_operand")
+   (match_operand:VI 1 "register_operand")
+   (match_operand:VI 2 "vector_arith_operand")]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = gen_rtx_UNSPEC (mode, gen_rtvec (1, const0_rtx), UNSPEC_VUNDEF);
+  rtx vl = emit_vlmax_vsetvl (mode);
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = CONSTM1_RTX(mode);
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_add(operands[0], mask, merge, operands[1], operands[2],
+                vl, tail_policy, mask_policy, vlmax_avl_p));
+
+  DONE;
+})
+
+(define_expand "cond_add"
+  [(match_operand:VI 0 "register_operand")
+   (match_operand: 1 "register_operand")
+   (match_operand:VI 2 "register_operand")
+   (match_operand:VI 3 "vector_reg_or_const_dup_operand")
+   (match_operand:VI 4 "register_operand")]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = operands[4];
+  rtx vl = emit_vlmax_vsetvl (mode);
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = operands[1];
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_add(operands[0], mask, merge, operands[2], operands[3],
+                vl, tail_policy, mask_policy, vlmax_avl_p));
+  DONE;
+})
+
+(define_expand "len_add"
+  [(match_operand:VI 0 "register_operand")
+   (match_operand:VI 1 "register_operand")
+   (match_operand:VI 2 "vector_reg_or_const_dup_operand")
+   (match_operand 3 "p_reg_or_const_csr_operand")]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = gen_rtx_UNSPEC (mode, gen_rtvec (1, const0_rtx), UNSPEC_VUNDEF);
+  rtx vl = operands[3];
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = CONSTM1_RTX(mode);
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_add(operands[0], mask, merge, operands[1], operands[2],
+                vl, tail_policy, mask_policy, vlmax_avl_p));
+  DONE;
+})
+
+
+;; 
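
As a reading aid, the three expanders above differ only in which elements
they compute.  The scalar model below is a sketch under that reading; the
toy_* functions are hypothetical and operate on plain arrays, not on RTL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* "add3": every element is computed (all-ones mask, full length).  */
void
toy_add (uint32_t *dst, const uint32_t *a, const uint32_t *b, size_t nunits)
{
  for (size_t i = 0; i < nunits; i++)
    dst[i] = a[i] + b[i];
}

/* "cond_add": lanes with a set mask bit get the sum; the others take
   the value of operand 4, which the expander passes as the merge.  */
void
toy_cond_add (uint32_t *dst, const bool *mask, const uint32_t *a,
              const uint32_t *b, const uint32_t *merge, size_t nunits)
{
  for (size_t i = 0; i < nunits; i++)
    dst[i] = mask[i] ? a[i] + b[i] : merge[i];
}

/* "len_add": only the first LEN elements are computed.  The expander
   uses an undefined merge value for the rest; this sketch simply
   leaves them untouched.  */
void
toy_len_add (uint32_t *dst, const uint32_t *a, const uint32_t *b,
             size_t len, size_t nunits)
{
  for (size_t i = 0; i < len && i < nunits; i++)
    dst[i] = a[i] + b[i];
}
```

Unsigned elements are used so that wrap-around is well defined in C; the
patterns themselves apply to all integer vector modes in the VI iterator.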

[PATCH v2 05/07] RISC-V: autovec: Add tuning and target vectorization hooks

2023-03-05 Thread Michael Collison
This patch adds support for registering target hooks for basic 
autovectorization support as well as basic tuning information for the 
vector extension.


gcc/ChangeLog:

2023-03-02  Michael Collison 
                Juzhe Zhong 

            * config/riscv/riscv-cores.def (RISCV_TUNE):
            Add VECTOR_TUNE_INFO parameter and
            * common/config/riscv/riscv-common.cc (RISCV_TUNE):
            Add VECTOR_TUNE_INFO parameter.
            * config/riscv/riscv.cc (riscv_vector_tune_param):
            New struct for vector tuning information.
            (riscv_tune_info): add vector_tune_param.
            (vector_tune_param): New static variable.
            (riscv_vectorization_factor): New variable.
            (generic_rvv_insn_scale_table): New struct.
            (generic_rvv_stmt_scale_table): New struct.
            (generic_rvv_insn_cost_table): New vector insn cost table.
            (generic_rvv_stmt_cost_table): New vector statement cost table.
            (generic_rvv_tune_info): New rvv tuning table.
            (RISCV_TUNE): Add VECTOR_TUNE_INFO parameter.
            (riscv_rtx_costs): Return vector estimate if vector mode.
            (riscv_option_override): Set vector_tune_param.
            (riscv_option_override): Set riscv_vectorization_factor.
            (riscv_estimated_poly_value): Implement
            TARGET_ESTIMATED_POLY_VALUE.
            (riscv_preferred_simd_mode): Implement
            TARGET_VECTORIZE_PREFERRED_SIMD_MODE.
        (riscv_autovectorize_vector_modes): Implement
        TARGET_AUTOVECTORIZE_VECTOR_MODES.
        (riscv_get_mask_mode): Implement TARGET_VECTORIZE_GET_MASK_MODE.
        (riscv_empty_mask_is_expensive): Implement
        TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE.
        (riscv_builtin_vectorization_cost): Implement
        TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST.
        (riscv_vectorize_create_costs): Implement
        TARGET_VECTORIZE_CREATE_COSTS.
        (TARGET_ESTIMATED_POLY_VALUE): Register target macro.
        (TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST): Ditto.
           (TARGET_VECTORIZE_PREFERRED_SIMD_MODE): Ditto.
        (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Ditto.
        (TARGET_VECTORIZE_GET_MASK_MODE): Ditto.
        (TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE): Ditto.
        (TARGET_VECTORIZE_LOOP_LEN_OVERRIDE_MASK): Ditto.
        (TARGET_VECTORIZE_CREATE_COSTS): Ditto.

---
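A note on the RISCV_TUNE change in the diff below: every consumer of riscv-cores.def has to accept the new fourth argument, even a consumer that only wants the tune name. The following is a minimal sketch of that X-macro pattern, not code from the patch. TUNE_LIST is a hypothetical stand-in for including the .def file, and the second table is invented purely to show a consumer that uses the new argument.

```cpp
// Hypothetical stand-in for riscv-cores.def: one X(...) entry per tuning.
// The real file is pulled in with #include rather than a list macro.
#define TUNE_LIST(X)                                                        \
  X("rocket", generic, rocket_tune_info, generic_rvv_tune_info)             \
  X("sifive-7-series", sifive_7, sifive_7_tune_info, generic_rvv_tune_info) \
  X("size", generic, optimize_size_tune_info, generic_rvv_tune_info)

// Consumer 1: only the name is used, but the macro must still take all
// four arguments, which is why riscv-common.cc changes as well.
#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO, VECTOR_TUNE_INFO) \
  TUNE_NAME,
constexpr const char *tune_names[] = { TUNE_LIST (RISCV_TUNE) nullptr };
#undef RISCV_TUNE

// Consumer 2 (illustrative only): records which vector tuning table each
// entry names by stringifying the fourth argument.
#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO, VECTOR_TUNE_INFO) \
  #VECTOR_TUNE_INFO,
constexpr const char *vector_tune_names[] = { TUNE_LIST (RISCV_TUNE) nullptr };
#undef RISCV_TUNE

// Number of real entries, excluding the trailing null terminator.
constexpr unsigned tune_count
  = sizeof (tune_names) / sizeof (tune_names[0]) - 1;
```

Because unused macro arguments are simply dropped, the first consumer compiles even though identifiers such as rocket_tune_info are never declared in this sketch.
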
 gcc/common/config/riscv/riscv-common.cc |   2 +-
 gcc/config/riscv/riscv-cores.def    |  14 +-
 gcc/config/riscv/riscv.cc   | 324 +++-
 3 files changed, 328 insertions(+), 12 deletions(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index ebc1ed7d7e4..6b8d92af986 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -246,7 +246,7 @@ static const riscv_cpu_info riscv_cpu_tables[] =

 static const char *riscv_tunes[] =
 {
-#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO) \
+#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO, VECTOR_TUNE_INFO)    \
 TUNE_NAME,
 #include "../../../config/riscv/riscv-cores.def"
 NULL
diff --git a/gcc/config/riscv/riscv-cores.def b/gcc/config/riscv/riscv-cores.def
index 2a834cae21d..4feb0366222 100644
--- a/gcc/config/riscv/riscv-cores.def
+++ b/gcc/config/riscv/riscv-cores.def
@@ -30,15 +30,15 @@
    identifier, reference to riscv.cc.  */

 #ifndef RISCV_TUNE
-#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO)
+#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO, VECTOR_TUNE_INFO)
 #endif

-RISCV_TUNE("rocket", generic, rocket_tune_info)
-RISCV_TUNE("sifive-3-series", generic, rocket_tune_info)
-RISCV_TUNE("sifive-5-series", generic, rocket_tune_info)
-RISCV_TUNE("sifive-7-series", sifive_7, sifive_7_tune_info)
-RISCV_TUNE("thead-c906", generic, thead_c906_tune_info)
-RISCV_TUNE("size", generic, optimize_size_tune_info)
+RISCV_TUNE("rocket", generic, rocket_tune_info, generic_rvv_tune_info)
+RISCV_TUNE("sifive-3-series", generic, rocket_tune_info, generic_rvv_tune_info)
+RISCV_TUNE("sifive-5-series", generic, rocket_tune_info, generic_rvv_tune_info)
+RISCV_TUNE("sifive-7-series", sifive_7, sifive_7_tune_info, generic_rvv_tune_info)
+RISCV_TUNE("thead-c906", generic, thead_c906_tune_info, generic_rvv_tune_info)
+RISCV_TUNE("size", generic, optimize_size_tune_info, generic_rvv_tune_info)

 #undef RISCV_TUNE

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index befb9b498b7..44659062070 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -60,6 +60,16 @@ along with GCC; see the file COPYING3.  If not see
 #include "opts.h"
 #include "tm-constrs.h"
 #include "rtl-iter.h"
+#include "gimple.h"
+#include "cfghooks.h"
+#include "cfgloop.h"

[PATCH v2 04/07] RISC-V: autovec: Add auto-vectorization support functions

2023-03-05 Thread Michael Collison
This patch adds support for functions used in implementing various 
portions of autovectorization support.


gcc/ChangeLog:

2023-03-02  Michael Collison 
                Juzhe Zhong 

            * config/riscv/riscv-v.cc (riscv_classify_vlmul_field):
            New function.
            (riscv_vector_preferred_simd_mode): Ditto.
            (get_mask_policy_no_pred): Ditto.
            (get_tail_policy_no_pred): Ditto.
            (riscv_tuple_mode_p): Ditto.
            (riscv_classify_nf): Ditto.
            (riscv_vlmul_regsize): Ditto.
            (riscv_vector_mask_mode_p): Ditto.
            (riscv_vector_get_mask_mode): Ditto.

---
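A note on riscv_vector_preferred_simd_mode in the diff below: the ternary chains follow a regular pattern, where the lane count N in the VNx<N> mode name is (64 / element width) scaled by the requested factor. The sketch below makes that arithmetic explicit. It is not code from the patch: vnx_lanes is a hypothetical helper, and the 64-bit base is an assumption read off the mode names (VNx8QI, VNx4HI, VNx2SI, VNx1DI for a factor of 1).

```cpp
// Lane count N of the VNx<N> mode picked for a given element width and
// vectorization factor.  Hypothetical helper; the 64-bit minimum chunk is
// inferred from the mode names used in the patch, not stated by it.
constexpr unsigned
vnx_lanes (unsigned sew_bits, unsigned vf)
{
  // Mirror the patch's ternary chains: any factor other than 1, 2 or 4
  // falls through to the largest mode, i.e. is treated as 8.
  unsigned factor = (vf == 1 || vf == 2 || vf == 4) ? vf : 8;
  return (64 / sew_bits) * factor;
}
```

One consequence worth noticing is the fall-through: a factor such as 3 silently selects the largest mode rather than being rejected.
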
 gcc/config/riscv/riscv-v.cc | 176 
 1 file changed, 176 insertions(+)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 2d2de6e4a6c..c9a0d6b4c06 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -38,10 +38,12 @@
 #include "memmodel.h"
 #include "emit-rtl.h"
 #include "tm_p.h"
+#include "targhooks.h"
 #include "target.h"
 #include "expr.h"
 #include "optabs.h"
 #include "tm-constrs.h"
+#include "riscv-vector-builtins.h"
 #include "rtx-vector-builder.h"

 using namespace riscv_vector;
@@ -109,6 +111,41 @@ const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT minval,
   && IN_RANGE (INTVAL (elt), minval, maxval));
 }

+/* Return the vlmul field for a specific machine mode.  */
+unsigned int
+riscv_classify_vlmul_field (enum machine_mode mode)
+{
+  /* Make the decision based on the mode's enum value rather than its
+ properties, so that we keep the correct classification regardless
+ of -mriscv-vector-bits.  */
+  switch (mode)
+    {
+    case E_VNx8BImode:
+  return VLMUL_FIELD_111;
+
+    case E_VNx4BImode:
+  return VLMUL_FIELD_110;
+
+    case E_VNx2BImode:
+  return VLMUL_FIELD_101;
+
+    case E_VNx16BImode:
+  return VLMUL_FIELD_000;
+
+    case E_VNx32BImode:
+  return VLMUL_FIELD_001;
+
+    case E_VNx64BImode:
+  return VLMUL_FIELD_010;
+
+    default:
+  break;
+    }
+
+  /* we don't care about VLMUL for Mask.  */
+  return VLMUL_FIELD_000;
+}
+
 rtx
 emit_vlmax_vsetvl (machine_mode vmode)
 {
@@ -163,6 +200,64 @@ calculate_ratio (unsigned int sew, enum vlmul_type vlmul)
   return ratio;
 }

+/* Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE for RVV.  */
+
+machine_mode
+riscv_vector_preferred_simd_mode (scalar_mode mode, unsigned vf)
+{
+  if (!TARGET_VECTOR)
+    return word_mode;
+
+  switch (mode)
+    {
+    case E_QImode:
+  return vf == 1   ? VNx8QImode
+     : vf == 2 ? VNx16QImode
+     : vf == 4 ? VNx32QImode
+           : VNx64QImode;
+  break;
+    case E_HImode:
+  return vf == 1   ? VNx4HImode
+     : vf == 2 ? VNx8HImode
+     : vf == 4 ? VNx16HImode
+           : VNx32HImode;
+  break;
+    case E_SImode:
+  return vf == 1   ? VNx2SImode
+     : vf == 2 ? VNx4SImode
+     : vf == 4 ? VNx8SImode
+           : VNx16SImode;
+  break;
+    case E_DImode:
+  if (riscv_vector_elen_flags != MASK_VECTOR_ELEN_32
+      && riscv_vector_elen_flags != MASK_VECTOR_ELEN_FP_32)
+    return vf == 1     ? VNx1DImode
+       : vf == 2 ? VNx2DImode
+       : vf == 4 ? VNx4DImode
+             : VNx8DImode;
+  break;
+    case E_SFmode:
+  if (TARGET_HARD_FLOAT && riscv_vector_elen_flags != MASK_VECTOR_ELEN_32
+      && riscv_vector_elen_flags != MASK_VECTOR_ELEN_64)
+    return vf == 1     ? VNx2SFmode
+       : vf == 2 ? VNx4SFmode
+       : vf == 4 ? VNx8SFmode
+             : VNx16SFmode;
+  break;
+    case E_DFmode:
+  if (TARGET_DOUBLE_FLOAT && TARGET_VECTOR_ELEN_FP_64)
+    return vf == 1     ? VNx1DFmode
+       : vf == 2 ? VNx2DFmode
+       : vf == 4 ? VNx4DFmode
+             : VNx8DFmode;
+  break;
+    default:
+  break;
+    }
+
+  return word_mode;
+}
+
 /* Emit an RVV unmask && vl mov from SRC to DEST.  */
 static void
 emit_pred_op (unsigned icode, rtx mask, rtx dest, rtx src, rtx len,
@@ -375,6 +470,87 @@ get_avl_type_rtx (enum avl_type type)
   return gen_int_mode (type, Pmode);
 }

+rtx
+get_mask_policy_no_pred ()
+{
+  return get_mask_policy_for_pred (PRED_TYPE_none);
+}
+
+rtx
+get_tail_policy_no_pred ()
+{
+  return get_tail_policy_for_pred (PRED_TYPE_none);
+}
+
+/* Return true if it is a RVV tuple mode.  */
+bool
+riscv_tuple_mode_p (machine_mode mode ATTRIBUTE_UNUSED)
+{
+  return false;
+}
+
+/* Return nf for a machine mode.  */
+int
+riscv_classify_nf (machine_mode mode)
+{
+  switch (mode)
+    {
+
+    default:
+  break;
+    }
+
+  return 1;
+}
+
+/* Return vlmul register size for a machine mode.  */
+int
+riscv_vlmul_regsize (machine_mode mode)
+{
+  if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL)
+    return 1;
+  switch (riscv_classify_vlmul_field (mode))
+    {
+    case VLMUL_FIELD_001:
+  return 2;
+    case VLMUL_FIELD_010:
+  

[PATCH v2 03/07] RISC-V: autovec: Add vector cost model

2023-03-05 Thread Michael Collison
This patch adds two new files to support the vector cost model and
modifies the Makefile fragment to build the cost model C++ file. Due to
its large size, this patch is provided as an attachment.


gcc/ChangeLog:

2023-03-02  Michael Collison 
                Juzhe Zhong 

            * gcc/config.gcc (riscv-vector-cost.o): New object file
            to build.
            * config/riscv/riscv-vector-cost.cc: New file for riscv
            vector cost model.
            * config/riscv/riscv-vector-cost.h: New header file for
            riscv vector cost model.
            * config/riscv/t-riscv: Add make rule for
            riscv-vector-cost.o.

From c606f674114a362ba0299caf160b23a98f37c898 Mon Sep 17 00:00:00 2001
From: Michael Collison 
Date: Sun, 5 Mar 2023 17:53:42 -0500
Subject: [PATCH] RISC-V: Add vector cost model

---
 gcc/config.gcc|   2 +-
 gcc/config/riscv/riscv-vector-cost.cc | 689 ++
 gcc/config/riscv/riscv-vector-cost.h  | 481 ++
 gcc/config/riscv/t-riscv  |   5 +
 4 files changed, 1176 insertions(+), 1 deletion(-)
 create mode 100644 gcc/config/riscv/riscv-vector-cost.cc
 create mode 100644 gcc/config/riscv/riscv-vector-cost.h

diff --git a/gcc/config.gcc b/gcc/config.gcc
index da3a6d3ba1f..4a260572a3d 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -530,7 +530,7 @@ pru-*-*)
 riscv*)
 	cpu_type=riscv
 	extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o riscv-selftests.o riscv-v.o riscv-vsetvl.o"
-	extra_objs="${extra_objs} riscv-vector-builtins.o riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o"
+	extra_objs="${extra_objs} riscv-vector-cost.o riscv-vector-builtins.o riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o"
 	d_target_objs="riscv-d.o"
 	extra_headers="riscv_vector.h"
 	target_gtfiles="$target_gtfiles \$(srcdir)/config/riscv/riscv-vector-builtins.cc"
diff --git a/gcc/config/riscv/riscv-vector-cost.cc b/gcc/config/riscv/riscv-vector-cost.cc
new file mode 100644
index 000..4abd0e54da0
--- /dev/null
+++ b/gcc/config/riscv/riscv-vector-cost.cc
@@ -0,0 +1,689 @@
+/* Cost model implementation for RISC-V 'V' Extension for GNU compiler.
+   Copyright (C) 2022-2023 Free Software Foundation, Inc.
+   Contributed by Juzhe Zhong (juzhe.zh...@rivai.ai), RiVAI Technologies Ltd.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   .  */
+
+#define INCLUDE_STRING
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "backend.h"
+#include "rtl.h"
+#include "regs.h"
+#include "insn-config.h"
+#include "insn-attr.h"
+#include "recog.h"
+#include "rtlanal.h"
+#include "output.h"
+#include "alias.h"
+#include "tree.h"
+#include "stringpool.h"
+#include "attribs.h"
+#include "varasm.h"
+#include "stor-layout.h"
+#include "calls.h"
+#include "function.h"
+#include "explow.h"
+#include "memmodel.h"
+#include "emit-rtl.h"
+#include "reload.h"
+#include "tm_p.h"
+#include "target.h"
+#include "basic-block.h"
+#include "expr.h"
+#include "optabs.h"
+#include "bitmap.h"
+#include "df.h"
+#include "diagnostic.h"
+#include "builtins.h"
+#include "predict.h"
+#include "tree-pass.h"
+#include "opts.h"
+#include "langhooks.h"
+#include "rtl-iter.h"
+#include "gimple.h"
+#include "cfghooks.h"
+#include "cfgloop.h"
+#include "fold-const.h"
+#include "gimple-iterator.h"
+#include "tree-vectorizer.h"
+#include "tree-ssa-loop-niter.h"
+#include "riscv-vector-builtins.h"
+
+/* This file should be included last.  */
+#include "riscv-vector-cost.h"
+#include "target-def.h"
+
+bool
+vector_insn_cost_table::get_cost (rtx x, machine_mode mode, int *cost,
+  bool speed) const
+{
+  rtx op0, op1, op2;
+  enum rtx_code code = GET_CODE (x);
+  scalar_int_mode int_mode;
+
+  /* By default, assume that everything has equivalent cost to the
+ cheapest instruction.  Any additional costs are applied as a delta
+ above this default.  */
+  *cost = COSTS_N_INSNS (1);
+
+  switch (code)
+{
+case SET:
+  /* The cost depends entirely on the operands to SET.  */
+  *cost = 0;
+  op0 = SET_DEST (x);
+  op1 = SET_SRC (x);
+
+  switch (GET_CODE (op0))
+	{
+	case MEM:
+	  if (speed)
+	{
+	  *cost += store->cost (x, mode);
+	}
+
+	  //*cost += rtx_cost(op1, mode, SET, 1, speed);
+	  return true;
+
+	case 

[PATCH v2 02/07] RISC-V: autovec: Export policy functions to global scope

2023-03-05 Thread Michael Collison
This patch adds foundational support by making two functions that handle
predication policies globally visible.


gcc/ChangeLog:

2023-03-02  Michael Collison 
                Juzhe Zhong 

            * config/riscv/riscv-vector-builtins.cc
            (get_tail_policy_for_pred):
            Remove static declaration to make externally visible.
            (get_mask_policy_for_pred): Ditto.
            * config/riscv/riscv-vector-builtins.h
            (get_tail_policy_for_pred):
            New external declaration.
            (get_mask_policy_for_pred): Ditto.

---
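A note on the two functions exported below: the diff context shows which predication kinds select the "undisturbed" policy, but not the values returned. The following sketch restates that selection logic in isolation. It is not the GCC implementation: the Pred and Policy enums are hypothetical stand-ins, and the default is passed in as a parameter because the "preferred default configuration" mentioned in the comments is not visible in this patch.

```cpp
// Hypothetical stand-ins for the predication kinds named in the patch
// (PRED_TYPE_none, PRED_TYPE_tu, PRED_TYPE_mu, PRED_TYPE_tum, PRED_TYPE_tumu).
enum class Pred { none, tu, mu, tum, tumu };

// Hypothetical policy values; the real ones are rtx constants in the backend.
enum class Policy { undisturbed, agnostic };

// Tail policy: TU, TUM and TUMU ask for tail-undisturbed; everything else
// takes the caller-supplied default.
constexpr Policy
tail_policy_for (Pred pred, Policy dflt)
{
  return (pred == Pred::tu || pred == Pred::tum || pred == Pred::tumu)
	   ? Policy::undisturbed
	   : dflt;
}

// Mask policy: TUMU and MU ask for mask-undisturbed; everything else
// takes the caller-supplied default.
constexpr Policy
mask_policy_for (Pred pred, Policy dflt)
{
  return (pred == Pred::tumu || pred == Pred::mu) ? Policy::undisturbed : dflt;
}
```

With no predication (Pred::none) both functions return the default, which is the case the later autovectorization patches rely on.
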
 gcc/config/riscv/riscv-vector-builtins.cc | 4 ++--
 gcc/config/riscv/riscv-vector-builtins.h  | 2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
index 2d57086262b..352ffd8867d 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -2448,7 +2448,7 @@ use_real_merge_p (enum predication_type_index pred)

 /* Get TAIL policy for predication. If predication indicates TU, return the TU.
    Otherwise, return the prefer default configuration.  */
-static rtx
+rtx
 get_tail_policy_for_pred (enum predication_type_index pred)
 {
   if (pred == PRED_TYPE_tu || pred == PRED_TYPE_tum || pred == PRED_TYPE_tumu)
@@ -2458,7 +2458,7 @@ get_tail_policy_for_pred (enum predication_type_index pred)

 /* Get MASK policy for predication. If predication indicates MU, return the MU.
    Otherwise, return the prefer default configuration.  */
-static rtx
+rtx
 get_mask_policy_for_pred (enum predication_type_index pred)
 {
   if (pred == PRED_TYPE_tumu || pred == PRED_TYPE_mu)
diff --git a/gcc/config/riscv/riscv-vector-builtins.h b/gcc/config/riscv/riscv-vector-builtins.h
index 8464aa9b7e9..d62d2bdab54 100644
--- a/gcc/config/riscv/riscv-vector-builtins.h
+++ b/gcc/config/riscv/riscv-vector-builtins.h
@@ -456,6 +456,8 @@ extern const char *const operand_suffixes[NUM_OP_TYPES];
 extern const rvv_builtin_suffixes type_suffixes[NUM_VECTOR_TYPES + 1];
 extern const char *const predication_suffixes[NUM_PRED_TYPES];
 extern rvv_builtin_types_t builtin_types[NUM_VECTOR_TYPES + 1];
+extern rtx get_tail_policy_for_pred (enum predication_type_index pred);
+extern rtx get_mask_policy_for_pred (enum predication_type_index pred);

 inline tree
 rvv_arg_type_info::get_scalar_type (vector_type_index type_idx) const
--
2.34.1




[PATCH v2 01/07] RISC-V: autovec: Add new predicates and function prototypes

2023-03-05 Thread Michael Collison

This patch adds foundational support in the form of:

1. New predicates

2. New function prototypes

3. Exporting emit_vlmax_vsetvl to global scope

4. Add a new command line option -mriscv_vector_lmul

2023-03-02  Michael Collison 
                Juzhe Zhong 

            * config/riscv/riscv-protos.h (riscv_classify_vlmul_field):
            New external declaration.
            (riscv_vector_preferred_simd_mode): Ditto.
            (riscv_tuple_mode_p): Ditto.
            (riscv_vector_mask_mode_p): Ditto.
            (riscv_classify_nf): Ditto.
            (riscv_vlmul_regsize): Ditto.
            (riscv_vector_preferred_simd_mode): Ditto.
            (riscv_vector_get_mask_mode): Ditto.
            (emit_vlmax_vsetvl): Ditto.
            (get_mask_policy_no_pred): Ditto.
            (get_tail_policy_no_pred): Ditto.
            * config/riscv/riscv-opts.h (riscv_vector_bits_enum):
            New enum.
            (riscv_vector_lmul_enum): Ditto.
            (vlmul_field_enum): Ditto.
            * config/riscv/riscv-v.cc (emit_vlmax_vsetvl):
            Remove static scope.
            * config/riscv/riscv.opt (riscv_vector_lmul):
            New option -mriscv_vector_lmul.
            * config/riscv/predicates.md (p_reg_or_const_csr_operand):
            New predicate.
            (vector_reg_or_const_dup_operand): Ditto.

---
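A note on vlmul_field_enum in the diff below: the enumerator comments list the LMUL value for each 3-bit encoding, with the upper half of the range holding the fractional values. The sketch below decodes a field value into a fraction so the layout is easier to check. It is illustrative only; the Lmul struct and decode_vlmul are hypothetical names, not part of the patch.

```cpp
// LMUL as a fraction num/den; {0, 0} marks the reserved encoding.
// Hypothetical type, used only for this illustration.
struct Lmul
{
  unsigned num;
  unsigned den;
};

// Decode a 3-bit vlmul field following the comments on vlmul_field_enum:
// 000..011 are LMUL 1, 2, 4, 8; 100 is reserved; 101..111 are 1/8, 1/4, 1/2.
constexpr Lmul
decode_vlmul (unsigned field)
{
  field &= 0x7;
  if (field == 0x4)
    return {0, 0};		   /* 100: reserved.  */
  if (field < 0x4)
    return {1u << field, 1};	   /* Integer LMUL.  */
  return {1, 1u << (8 - field)};   /* Fractional LMUL.  */
}
```

Read as a signed 3-bit number, the field is simply the base-2 logarithm of LMUL, which is why 111 (that is, -1) means 1/2.
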
 gcc/config/riscv/predicates.md  | 13 +++
 gcc/config/riscv/riscv-opts.h   | 40 +
 gcc/config/riscv/riscv-protos.h | 15 +
 gcc/config/riscv/riscv-v.cc |  2 +-
 gcc/config/riscv/riscv.opt  | 20 +
 5 files changed, 89 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 0d9d7701c7e..19aa5e12920 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -264,6 +264,14 @@
 })

 ;; Predicates for the V extension.
+(define_special_predicate "p_reg_or_const_csr_operand"
+  (match_code "reg, subreg, const_int")
+{
+  if (CONST_INT_P (op))
+    return satisfies_constraint_K (op);
+  return GET_MODE (op) == Pmode;
+})
+
 (define_special_predicate "vector_length_operand"
   (ior (match_operand 0 "pmode_register_operand")
    (match_operand 0 "const_csr_operand")))
@@ -291,6 +299,11 @@
   (and (match_code "const_vector")
    (match_test "rtx_equal_p (op, riscv_vector::gen_scalar_move_mask (GET_MODE (op)))")))

+(define_predicate "vector_reg_or_const_dup_operand"
+  (ior (match_operand 0 "register_operand")
+   (match_test "const_vec_duplicate_p (op)
+   && !CONST_POLY_INT_P (CONST_VECTOR_ELT (op, 0))")))
+
 (define_predicate "vector_mask_operand"
   (ior (match_operand 0 "register_operand")
    (match_operand 0 "vector_all_trues_mask_operand")))
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index ff398c0a2ae..c6b6d84fce4 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -67,6 +67,46 @@ enum stack_protector_guard {
   SSP_GLOBAL            /* global canary */
 };

+/* RVV vector register sizes.  */
+enum riscv_vector_bits_enum
+{
+  RVV_SCALABLE,
+  RVV_NOT_IMPLEMENTED = RVV_SCALABLE,
+  RVV_64 = 64,
+  RVV_128 = 128,
+  RVV_256 = 256,
+  RVV_512 = 512,
+  RVV_1024 = 1024,
+  RVV_2048 = 2048,
+  RVV_4096 = 4096,
+  RVV_8192 = 8192,
+  RVV_16384 = 16384,
+  RVV_32768 = 32768,
+  RVV_65536 = 65536
+};
+
+/* vectorization factor.  */
+enum riscv_vector_lmul_enum
+{
+  RVV_LMUL1 = 1,
+  RVV_LMUL2 = 2,
+  RVV_LMUL4 = 4,
+  RVV_LMUL8 = 8
+};
+
+enum vlmul_field_enum
+{
+  VLMUL_FIELD_000, /* LMUL = 1.  */
+  VLMUL_FIELD_001, /* LMUL = 2.  */
+  VLMUL_FIELD_010, /* LMUL = 4.  */
+  VLMUL_FIELD_011, /* LMUL = 8.  */
+  VLMUL_FIELD_100, /* RESERVED.  */
+  VLMUL_FIELD_101, /* LMUL = 1/8.  */
+  VLMUL_FIELD_110, /* LMUL = 1/4.  */
+  VLMUL_FIELD_111, /* LMUL = 1/2.  */
+  MAX_VLMUL_FIELD
+};
+
 #define MASK_ZICSR    (1 << 0)
 #define MASK_ZIFENCEI (1 << 1)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 88a6bf5442f..6a486a1cd61 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -217,4 +217,19 @@ const unsigned int RISCV_BUILTIN_SHIFT = 1;
 /* Mask that selects the riscv_builtin_class part of a function code.  */
 const unsigned int RISCV_BUILTIN_CLASS = (1 << RISCV_BUILTIN_SHIFT) - 1;

+/* Routines implemented in riscv-v.cc.  */
+
+namespace riscv_vector {
+extern unsigned int riscv_classify_vlmul_field (enum machine_mode m);
+extern machine_mode riscv_vector_preferred_simd_mode (scalar_mode mode,
+                          unsigned vf);
+extern bool riscv_tuple_mode_p (machine_mode);
+extern bool riscv_vector_mask_mode_p (machine_mode);
+extern int riscv_classify_nf (machine_mode);
+extern int riscv_vlmul_regsize (machine_mode);
+extern opt_machine_mode riscv_vector_get_mask_mode (machine_mode 

[PATCH v2 00/07] RISC-V: autovec: Add auto-vectorization support

2023-03-05 Thread Michael Collison
This series of patches adds foundational support for RISC-V
autovectorization. These patches are based on the current upstream rvv
vector intrinsic support and are not a new implementation. Most of the
implementation consists of adding the new vector cost model, the
autovectorization patterns themselves and target hooks.

This implementation only provides support for integer addition and
subtraction as a proof of concept. This patch set should not be
construed to be feature complete. Based on conversations with the
community these patches are intended to lay the groundwork for feature
completion and collaboration within the RISC-V community.

In version 1 of this patch submission I neglected to indicate that these
patches are largely based off the work of Juzhe Zhong
(juzhe.zh...@rivai.ai) of RiVAI. More specifically the rvv-next branch
at https://github.com/riscv-collab/riscv-gcc.git is the foundation of
this patch set. I want to publicly apologize to Juzhe and RiVAI for not
attributing their work visibly and publicly.

As discussed on this list, if these patches are approved they will be
merged into an "auto-vectorization" branch once gcc-13 branches for
release.

There are two known issues related to crashes (assert failures)
associated with tree vectorization; one of which I have sent a patch for
and have received feedback.


Changes in v2

- Updated ChangeLog entry to include RiVAI contributions

- Fixed ChangeLog email formatting

- Fixed gnu formatting issues in the code





Re: [PATCH 0/2] LoongArch: testsuite: Fix tests related to stack

2023-03-05 Thread Xi Ruoyao via Gcc-patches
On Mon, 2023-03-06 at 09:15 +0800, Lulu Cheng wrote:
> 
> On 2023/3/6 at 12:21 AM, Xi Ruoyao wrote:
> > On Fri, 2023-03-03 at 08:21 -0800, Mike Stump wrote:
> > > On Mar 3, 2023, at 12:40 AM, Xi Ruoyao via Gcc-patches
> > >  wrote:
> > > > Some trivial test case fixes.  Ok for trunk?
> > > Ok.
> > Lulu: if you don't object I'll push these two in this week.
> > 
> > I tried to bisect for the exact point where the test cases are broken,
> > but it turns out they are broken the first day committed (r13-4401).  As
> > the draft of r13-4401 was sent in Sept 2022 but it's committed in Nov
> > 2022, I can only guess something had changed in the two months and broke
> > the tests...
> 
> Sorry for the late reply, the first patch I think is fine. But I haven't 
> reproduced the problem of the second mail.
> 
> Is there any special option in the configuration?

Oh some strange thing might be happening... I'll try to figure out what
has caused the behavior difference.
> 

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Ping [PATCH v3] Add condition coverage profiling

2023-03-05 Thread Jørgen Kvalsvik via Gcc-patches
On 05/12/2022 10:40, Jørgen Kvalsvik wrote:
> This patch adds support in gcc+gcov for modified condition/decision
> coverage (MC/DC) with the -fprofile-conditions flag. MC/DC is a type of
> test/code coverage and it is particularly important in the aviation and
> automotive industries for safety-critical applications. MC/DC is
> required for or recommended by:
> 
> * DO-178C for the most critical software (Level A) in avionics
> * IEC 61508 for SIL 4
> * ISO 26262-6 for ASIL D
> 
>  From the SQLite webpage:
> 
> Two methods of measuring test coverage were described above:
> "statement" and "branch" coverage. There are many other test
> coverage metrics besides these two. Another popular metric is
> "Modified Condition/Decision Coverage" or MC/DC. Wikipedia defines
> MC/DC as follows:
> 
> * Each decision tries every possible outcome.
> * Each condition in a decision takes on every possible outcome.
> * Each entry and exit point is invoked.
> * Each condition in a decision is shown to independently affect
>   the outcome of the decision.
> 
> In the C programming language where && and || are "short-circuit"
> operators, MC/DC and branch coverage are very nearly the same thing.
> The primary difference is in boolean vector tests. One can test for
> any of several bits in bit-vector and still obtain 100% branch test
> coverage even though the second element of MC/DC - the requirement
> that each condition in a decision take on every possible outcome -
> might not be satisfied.
> 
> https://sqlite.org/testing.html#mcdc
> 
> Wahlen, Heimdahl, and De Silva "Efficient Test Coverage Measurement for
> MC/DC" describes an algorithm for adding instrumentation by carrying
> over information from the AST, but my algorithm analyses the control
> flow graph to instrument for coverage. This has the benefit of being
> programming language independent and faithful to compiler decisions
> and transformations, although I have only tested it on constructs in C
> and C++, see testsuite/gcc.misc-tests and testsuite/g++.dg.
> 
> Like Wahlen et al this implementation records coverage in fixed-size
> bitsets which gcov knows how to interpret. This is very fast, but
> introduces a limit on the number of terms in a single boolean
> expression, the number of bits in a gcov_unsigned_type (which is
> typedef'd to uint64_t), so for most practical purposes this would be
> acceptable. This limitation is in the implementation and not the
> algorithm, so support for more conditions can be added by also
> introducing arbitrary-sized bitsets.
> 
> For space overhead, the instrumentation needs two accumulators
> (gcov_unsigned_type) per condition in the program which will be written
> to the gcov file. In addition, every function gets a pair of local
> accumulators, but these accumulators are reused between conditions in the
> same function.
> 
> For time overhead, there is a zeroing of the local accumulators for
> every condition and one or two bitwise operations on every edge taken in
> an expression.
> 
> In action it looks pretty similar to the branch coverage. The -g short
> opt carries no significance, but was chosen because it was an available
> option with the upper-case free too.
> 
> gcov --conditions:
> 
> 3:   17:void fn (int a, int b, int c, int d) {
> 3:   18:if ((a && (b || c)) && d)
> condition outcomes covered 3/8
> condition  0 not covered (true false)
> condition  1 not covered (true)
> condition  2 not covered (true)
> condition  3 not covered (true)
> 1:   19:x = 1;
> -:   20:else
> 2:   21:x = 2;
> 3:   22:}
> 
> gcov --conditions --json-format:
> 
> "conditions": [
> {
> "not_covered_false": [
> 0
> ],
> "count": 8,
> "covered": 3,
> "not_covered_true": [
> 0,
> 1,
> 2,
> 3
> ]
> }
> ],
> 
> Some expressions, mostly those without else-blocks, are effectively
> "rewritten" in the CFG construction making the algorithm unable to
> distinguish them:
> 
> and.c:
> 
> if (a && b && c)
> x = 1;
> 
> ifs.c:
> 
> if (a)
> if (b)
> if (c)
> x = 1;
> 
> gcc will build the same graph for both these programs, and gcov will
> report both as 3-term expressions. It is vital that it is not
> interpreted the other way around (which is consistent with the shape of
> the graph) because otherwise the masking would be wrong for the and.c
> program which is a more severe error. While surprising, users would
> probably expect some minor rewriting of semantically-identical
> expressions.
> 
> and.c.gcov:
> #:2:if (a && b && c)
> condition outcomes covered 6/6
> #:3:x = 1;
> 
> ifs.c.gcov:
> #:
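
The fixed-size bitset bookkeeping described in the quoted message can be sketched as follows. This is a deliberately simplified illustration, not the algorithm from the patch: it records raw condition outcomes in two 64-bit accumulators and omits the masking step that MC/DC requires, so its counts will generally differ from what gcov reports (for instance the 3/8 in the quoted output). CondCoverage, decision and coverage_demo are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical per-decision accumulators.  Bit i of seen_true is set once
// condition i has evaluated to true, and likewise for seen_false.  Using a
// 64-bit word is what limits a decision to 64 conditions.
struct CondCoverage
{
  std::uint64_t seen_true = 0;
  std::uint64_t seen_false = 0;

  // Record one evaluation of condition IDX and pass the value through.
  constexpr bool record (unsigned idx, bool value)
  {
    (value ? seen_true : seen_false) |= std::uint64_t{1} << idx;
    return value;
  }

  // Count the set bits in V.
  static constexpr unsigned popcount (std::uint64_t v)
  {
    unsigned n = 0;
    for (; v != 0; v &= v - 1)
      ++n;
    return n;
  }

  // Number of condition outcomes observed so far (no masking applied).
  constexpr unsigned covered () const
  {
    return popcount (seen_true) + popcount (seen_false);
  }
};

// Instrumented form of: (a && (b || c)) && d
constexpr bool
decision (CondCoverage &cov, bool a, bool b, bool c, bool d)
{
  return (cov.record (0, a) && (cov.record (1, b) || cov.record (2, c)))
	 && cov.record (3, d);
}

// Two sample evaluations; short-circuiting means the second call only
// ever evaluates condition 0.
constexpr unsigned
coverage_demo ()
{
  CondCoverage cov;
  decision (cov, true, false, true, true);
  decision (cov, false, true, true, true);
  return cov.covered ();
}
```

In this unmasked model the two calls observe five of the eight possible outcomes. The real instrumentation additionally discards outcomes that could not have independently affected the decision.
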

Re: [PATCH 0/2] LoongArch: testsuite: Fix tests related to stack

2023-03-05 Thread Lulu Cheng



On 2023/3/6 at 12:21 AM, Xi Ruoyao wrote:

On Fri, 2023-03-03 at 08:21 -0800, Mike Stump wrote:

On Mar 3, 2023, at 12:40 AM, Xi Ruoyao via Gcc-patches
 wrote:

Some trivial test case fixes.  Ok for trunk?

Ok.

Lulu: if you don't object I'll push these two in this week.

I tried to bisect for the exact point where the test cases are broken,
but it turns out they are broken the first day committed (r13-4401).  As
the draft of r13-4401 was sent in Sept 2022 but it's committed in Nov
2022, I can only guess something had changed in the two months and broke
the tests...


Sorry for the late reply, the first patch I think is fine. But I haven't 
reproduced the problem of the second mail.


Is there any special option in the configuration?



Re: Re: [PATCH] RISC-V: Optimize the MASK opt generation

2023-03-05 Thread Feng Wang
On 2023-03-03 17:12  Feng Wang wrote:
>
>On 2023-03-03 16:54  jiawei wrote:
>>
>>The Mask flag in a single TargetVariable is no longer enough because more
>>and more extensions are being added. So I optimized the definition of the
>>Mask flag; please refer to the case below:
>>There are some new MASK flags for the 'v' extension (ZVL32B, ZVL64B, ..., ZVL65536B),
>>but these MASK flags can't be stored in x_target_flags, because the total
>>number of MASK flags exceeds 32. With this patch we can write it like this
>>in this scenario.
>>
>>TargetVariable
>>int riscv_zvl_flags
>>
>>Mask(ZVL32B) in TargetVariable(riscv_zvl_flags)
>>
>>The corresponding MASK and TARGET will be automatically generated.
>>
>>gcc/ChangeLog:
>>
>>    * config/riscv/riscv-opts.h   Delete the definitions below.
>>    (MASK_ZICSR): Delete;
>>    (MASK_ZIFENCEI): Delete;
>>    (TARGET_ZICSR): Delete;
>>    (TARGET_ZIFENCEI): Delete;
>>    (MASK_ZAWRS): Delete;
>>    (TARGET_ZAWRS): Delete;
>>    (MASK_ZBA): Delete;
>>    (MASK_ZBB): Delete;
>>    (MASK_ZBC): Delete;
>>    (MASK_ZBS): Delete;
>>    (TARGET_ZBA): Delete;
>>    (TARGET_ZBB): Delete;
>>    (TARGET_ZBC): Delete;
>>    (TARGET_ZBS): Delete;
>>    (MASK_ZFINX): Delete;
>>    (MASK_ZDINX): Delete;
>>    (MASK_ZHINX): Delete;
>>    (MASK_ZHINXMIN): Delete;
>>    (TARGET_ZFINX): Delete;
>>    (TARGET_ZDINX): Delete;
>>    (TARGET_ZHINX): Delete;
>>    (TARGET_ZHINXMIN): Delete;
>>    (MASK_ZBKB): Delete;
>>    (MASK_ZBKC): Delete;
>>    (MASK_ZBKX): Delete;
>>    (MASK_ZKNE): Delete;
>>    (MASK_ZKND): Delete;
>>    (MASK_ZKNH): Delete;
>>    (MASK_ZKR): Delete;
>>    (MASK_ZKSED): Delete;
>>    (MASK_ZKSH): Delete;
>>    (MASK_ZKT): Delete;
>>    (TARGET_ZBKB): Delete;
>>    (TARGET_ZBKC): Delete;
>>    (TARGET_ZBKX): Delete;
>>    (TARGET_ZKNE): Delete;
>>    (TARGET_ZKND): Delete;
>>    (TARGET_ZKNH): Delete;
>>    (TARGET_ZKR): Delete;
>>    (TARGET_ZKSED): Delete;
>>    (TARGET_ZKSH): Delete;
>>    (TARGET_ZKT): Delete;
>>    (MASK_VECTOR_ELEN_32): Delete;
>>    (MASK_VECTOR_ELEN_64): Delete;
>>    (MASK_VECTOR_ELEN_FP_32): Delete;
>>    (MASK_VECTOR_ELEN_FP_64): Delete;
>>    (TARGET_VECTOR_ELEN_32): Delete;
>>    (TARGET_VECTOR_ELEN_64): Delete;
>>    (TARGET_VECTOR_ELEN_FP_32): Delete;
>>    (TARGET_VECTOR_ELEN_FP_64): Delete;
>>    (MASK_ZVL32B): Delete;
>>    (MASK_ZVL64B): Delete;
>>    (MASK_ZVL128B): Delete;
>>    (MASK_ZVL256B): Delete;
>>    (MASK_ZVL512B): Delete;
>>    (MASK_ZVL1024B): Delete;
>>    (MASK_ZVL2048B): Delete;
>>    (MASK_ZVL4096B): Delete;
>>    (MASK_ZVL8192B): Delete;
>>    (MASK_ZVL16384B): Delete;
>>    (MASK_ZVL32768B): Delete;
>>    (MASK_ZVL65536B): Delete;
>>    (TARGET_ZVL32B): Delete;
>>    (TARGET_ZVL64B): Delete;
>>    (TARGET_ZVL128B): Delete;
>>    (TARGET_ZVL256B): Delete;
>>    (TARGET_ZVL512B): Delete;
>>    (TARGET_ZVL1024B): Delete;
>>    (TARGET_ZVL2048B): Delete;
>>    (TARGET_ZVL4096B): Delete;
>>    (TARGET_ZVL8192B): Delete;
>>    (TARGET_ZVL16384B): Delete;
>>    (TARGET_ZVL32768B): Delete;
>>    (TARGET_ZVL65536B): Delete;
>>    (MASK_ZICBOZ): Delete;
>>    (MASK_ZICBOM): Delete;
>>    (MASK_ZICBOP): Delete;
>>    (TARGET_ZICBOZ): Delete;
>>    (TARGET_ZICBOM): Delete;
>>    (TARGET_ZICBOP): Delete;
>>    (MASK_ZFHMIN): Delete;
>>    (MASK_ZFH): Delete;
>>    (TARGET_ZFHMIN): Delete;
>>    (TARGET_ZFH): Delete;
>>    (MASK_ZMMUL): Delete;
>>    (TARGET_ZMMUL): Delete;
>>    (MASK_SVINVAL): Delete;
>>    (MASK_SVNAPOT): Delete;
>>    (TARGET_SVINVAL): Delete;
>>    (TARGET_SVNAPOT): Delete;
>>    * config/riscv/riscv.opt: Add new Mask definition.
>>    * opt-functions.awk:  Add new function to find the index
>>  of target variable from extra_target_vars.
>>    * opt-read.awk:   Add new function to store the Mask flags.
>>    * opth-gen.awk:   Add new function to output the definition of
>>  Mask Macro and Target Macro.
>>---
>> gcc/config/riscv/riscv-opts.h | 115 --
>> gcc/config/riscv/riscv.opt    |  90 ++
>> gcc/opt-functions.awk |  11 
>> gcc/opt-read.awk  |  16 -
>> gcc/opth-gen.awk  |  22 +++
>> 5 files changed, 138 insertions(+), 116 deletions(-)
>>
>>diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
>>index 25fd85b09b1..7cf28838cb5 100644
>>--- a/gcc/config/riscv/riscv-opts.h
>>+++ b/gcc/config/riscv/riscv-opts.h
>>@@ -66,121 +66,6 @@ enum stack_protector_guard {
>>   SSP_TLS,   /* per-thread canary in TLS block */
>>   SSP_GLOBAL /* global canary */
>> };
>>-
>>-#define 

New Swedish PO file for 'gcc' (version 13.1-b20230212)

2023-03-05 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Swedish team of translators.  The file is available at:

https://translationproject.org/latest/gcc/sv.po

(This file, 'gcc-13.1-b20230212.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: [PATCH] PR rtl-optimization/106594: Preserve zero_extend in combine when cheap.

2023-03-05 Thread Segher Boessenkool
Hi!

On Sun, Mar 05, 2023 at 08:43:20PM +, Tamar Christina wrote:
> > On 3/5/23 12:28, Tamar Christina via Gcc-patches wrote:
> > > The regression was reported during stage-1. A patch was provided during
> > > stage 1 and the discussions around combine stalled.
> > >
> > > The regression for AArch64 needs to be fixed in GCC 13. The hit is too
> > > big just to "take".
> > >
> > > So we need a way forward, even if it's stage-4.
> > Then it needs to be in a way that works within the design constraints of
> > combine.
> > 
> > As Segher has indicated, using a magic constant to say "this is always cheap
> > enough" isn't acceptable.  Furthermore, what this patch changes is combine's
> > internal canonicalization of extensions into shift pairs.
> > 
> > So I think another path forward needs to be found.  I don't see hacking up
> > expand_compound_operation is viable.
> 
> I'm not arguing at all about the merits of the patch. My argument was about 
> Segher saying he doesn't think this is a P1 regression or one that should be 
> addressed in stage-4.

That is not what I said (in the PR).  I said:
  Either this should not be P1, or the proposed patch is taking
  completely the wrong direction.  P1 means there is a regression.
  There is no regression in combine, in fact the proposed patch would
  *cause* regressions on many targets!
()

> We noticed and reported the regression early on during stage-1.  So I'm
> unsure what else we should have done and it's not right to wave off fixing
> it now, otherwise what's the point in us filing bug reports.

Something that fixes the regression is of course welcome.  But something
that *causes* many regressions is not.  Something that makes
compound_operation stuff better on all targets is more than welcome as
well, but *better* on *all* targets, not regressing most.  This really
is stage 1 material most likely.  I have been chipping away at this for
years, I don't expect any trivial patch can help, and it certainly won't
solve most problems here.

Maybe a target hook for this is best.  But not a completely ad-hoc one,
something usable and maintainable please.  So, it should say we do not
want certain kinds of code (or what kind of code we want instead), and
it should not add magic to the bowels of basic passes, magic that just
happens to make the code of particular testcases look better on a
particular target.

Yes, *look* better: I have seen no proof or indication that this would
actually generate better code, not even on just aarch, let alone on the
majority of targets.  As I said I have a test running, you may be lucky
even :-)  It has to run for about six hours more and after that it needs
analysis still (a few more hours if it isn't obviously always better or
worse), so expect results tomorrow night at the earliest.


Segher


RE: [PATCH] PR rtl-optimization/106594: Preserve zero_extend in combine when cheap.

2023-03-05 Thread Tamar Christina via Gcc-patches
> 
> On 3/5/23 12:28, Tamar Christina via Gcc-patches wrote:
> >
> > The regression was reported during stage-1. A patch was provided during
> > stage 1 and the discussions around combine stalled.
> >
> > The regression for AArch64 needs to be fixed in GCC 13. The hit is too big
> > just to "take".
> >
> > So we need a way forward, even if it's stage-4.
> Then it needs to be in a way that works within the design constraints of
> combine.
> 
> As Segher has indicated, using a magic constant to say "this is always cheap
> enough" isn't acceptable.  Furthermore, what this patch changes is combine's
> internal canonicalization of extensions into shift pairs.
> 
> So I think another path forward needs to be found.  I don't see hacking up
> expand_compound_operation is viable.

I'm not arguing at all about the merits of the patch. My argument was about 
Segher saying he doesn't think this is a P1 regression or one that should be 
addressed in stage-4.

We noticed and reported the regression early on during stage-1.  So I'm unsure
what else we should have done and it's not right to wave off fixing it now,
otherwise what's the point in us filing bug reports.

Tamar.
> 
> Jeff


[PATCH, v3] Fortran: fix CLASS attribute handling [PR106856]

2023-03-05 Thread Harald Anlauf via Gcc-patches

Hi Mikael,

On 04.03.23 at 23:29, Mikael Morin wrote:

On 04/03/2023 at 22:20, Harald Anlauf wrote:

Hi Mikael,

On 04.03.23 at 18:09, Mikael Morin wrote:

There was a comment about the old_symbol thing at the end of my previous
message:
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613354.html


I think Tobias might be the better person to answer this.
But when playing with variations of that else-branch,
I always hit an issue with class_74.f90, where the class
variables are not dummy arguments but local variables.

E.g. take the following reduced testcase:

subroutine foo
   class(*)  :: y
   dimension :: y(:,:)
   pointer   :: y
end subroutine foo

So when we see the dimension but haven't seen the
pointer (or allocatable) declaration, we appear to
generate an error with bad consequences (ICE).

If this is a resolution issue, maybe it can be fixed
differently, but likely needs digging deeper.  With
the patch as-is at least I do not see a memory leak
in that context.


One of my suggestions was to fix it as attached.
It is probably more clear with an actual patch to look at.
It seems to work on your example and class_74 as well.


This fix is great.  I've included it in the revised patch.


It seems to also fix some valgrind errors on this example:
    subroutine foo
  pointer   :: y
  dimension :: y(:,:)
  class(*)  :: y
    end subroutine foo
I'm fine with that fix if it works for you.


I've added this variant to class_74.f90, so it won't break
without noticing.

I suggest waiting for next stage 1, but it's your call, you have the 
green light from Steve anyway.


I've chosen to push patch v3 (attached) after a further round of 
regtesting as r13-6497-g6aa1f40a326374 .



Thanks for your work.


Many thanks for your very helpful review!

Harald
From 6aa1f40a3263741d964ef4716e85a0df5cec83b6 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Thu, 2 Mar 2023 22:37:14 +0100
Subject: [PATCH] Fortran: fix CLASS attribute handling [PR106856]

gcc/fortran/ChangeLog:

	PR fortran/106856
	* class.cc (gfc_build_class_symbol): Handle update of attributes of
	existing class container.
	(gfc_find_derived_vtab): Fix several memory leaks.
	(find_intrinsic_vtab): Ditto.
	* decl.cc (attr_decl1): Manage update of symbol attributes from
	CLASS attributes.
	* primary.cc (gfc_variable_attr): OPTIONAL shall not be taken or
	updated from the class container.
	* symbol.cc (free_old_symbol): Adjust management of symbol versions
	to not prematurely free array specs while working on the declaration
	of CLASS variables.

gcc/testsuite/ChangeLog:

	PR fortran/106856
	* gfortran.dg/interface_41.f90: Remove dg-pattern from valid testcase.
	* gfortran.dg/class_74.f90: New test.
	* gfortran.dg/class_75.f90: New test.

Co-authored-by: Tobias Burnus  
---
 gcc/fortran/class.cc   |  25 +++-
 gcc/fortran/decl.cc|  56 
 gcc/fortran/primary.cc |   1 -
 gcc/fortran/symbol.cc  |   6 +-
 gcc/testsuite/gfortran.dg/class_74.f90 | 151 +
 gcc/testsuite/gfortran.dg/class_75.f90 |  24 
 gcc/testsuite/gfortran.dg/interface_41.f90 |   2 +-
 7 files changed, 229 insertions(+), 36 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/class_74.f90
 create mode 100644 gcc/testsuite/gfortran.dg/class_75.f90

diff --git a/gcc/fortran/class.cc b/gcc/fortran/class.cc
index ae653e74437..52235ab83e3 100644
--- a/gcc/fortran/class.cc
+++ b/gcc/fortran/class.cc
@@ -638,6 +638,7 @@ gfc_build_class_symbol (gfc_typespec *ts, symbol_attribute *attr,
 {
   char tname[GFC_MAX_SYMBOL_LEN+1];
   char *name;
+  gfc_typespec *orig_ts = ts;
   gfc_symbol *fclass;
   gfc_symbol *vtab;
   gfc_component *c;
@@ -646,9 +647,21 @@ gfc_build_class_symbol (gfc_typespec *ts, symbol_attribute *attr,
 
   gcc_assert (as);
 
-  if (attr->class_ok)
-/* Class container has already been built.  */
+  /* Class container has already been built with same name.  */
+  if (attr->class_ok
+  && ts->u.derived->components->attr.dimension >= attr->dimension
+  && ts->u.derived->components->attr.codimension >= attr->codimension
+  && ts->u.derived->components->attr.class_pointer >= attr->pointer
+  && ts->u.derived->components->attr.allocatable >= attr->allocatable)
 return true;
+  if (attr->class_ok)
+{
+  attr->dimension |= ts->u.derived->components->attr.dimension;
+  attr->codimension |= ts->u.derived->components->attr.codimension;
+  attr->pointer |= ts->u.derived->components->attr.class_pointer;
+  attr->allocatable |= ts->u.derived->components->attr.allocatable;
+  ts = &ts->u.derived->components->ts;
+}
 
   attr->class_ok = attr->dummy || attr->pointer || attr->allocatable
 		   || attr->select_type_temporary || attr->associate_var;
@@ -790,7 +803,7 @@ gfc_build_class_symbol (gfc_typespec *ts, symbol_attribute *attr,
 }
 
   fclass->attr.is_class = 1;
-  ts->u.derived = fclass;
+  

Re: [PATCH] PR rtl-optimization/106594: Preserve zero_extend in combine when cheap.

2023-03-05 Thread Jeff Law via Gcc-patches




On 3/5/23 12:28, Tamar Christina via Gcc-patches wrote:


> The regression was reported during stage-1. A patch was provided during stage 1
> and the discussions around combine stalled.
>
> The regression for AArch64 needs to be fixed in GCC 13. The hit is too big just to
> "take".
>
> So we need a way forward, even if it's stage-4.

Then it needs to be in a way that works within the design constraints of
combine.


As Segher has indicated, using a magic constant to say "this is always 
cheap enough" isn't acceptable.  Furthermore, what this patch changes is 
combine's internal canonicalization of extensions into shift pairs.
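
A rough illustration of what "extensions into shift pairs" means, in plain C rather than RTL. This is a sketch only: the function names are hypothetical and nothing here is taken from combine.cc. It just shows that a zero extension of the low byte and a left/right logical shift pair compute the same value, which is why either form can serve as the canonical one.

```c
#include <stdint.h>

/* Zero-extend the low 8 bits of x, written directly as an extension
   (roughly what a zero_extend expresses).  */
static uint32_t
zext8_direct (uint32_t x)
{
  return (uint32_t) (uint8_t) x;
}

/* The same value written as a shift pair: move the low byte to the top
   of the word, then shift it back down with a logical right shift so
   the upper 24 bits are filled with zeros.  */
static uint32_t
zext8_shift_pair (uint32_t x)
{
  return (x << 24) >> 24;
}
```

Both functions return the same result for every 32-bit input; which form is cheaper depends on the target, which is the point under discussion.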


So I think another path forward needs to be found.  I don't see hacking 
up expand_compound_operation is viable.


Jeff


[committed] testsuite: Fix up syntax error in scan-tree-dump-times target selector

2023-03-05 Thread Jakub Jelinek via Gcc-patches
Hi!

On aarch64, powerpc64le and s390x-linux I'm seeing another syntax error
which didn't show up on x86_64-linux nor i686-linux:
ERROR: gcc.dg/vect/slp-perm-8.c -flto -ffat-lto-objects: error executing 
dg-final: syntax error in target selector "target  ! vect_load_lanes  && 
vect_partial_vectors_usage_1 &&  ! s390_vx"
ERROR: gcc.dg/vect/slp-perm-8.c: error executing dg-final: syntax error in 
target selector "target  ! vect_load_lanes  && vect_partial_vectors_usage_1 &&  
! s390_vx"

The following patch fixes that.

Tested with a cross to aarch64-linux, committed to trunk as obvious.

2023-03-05  Jakub Jelinek  

* gcc.dg/vect/slp-perm-8.c: Fix up syntax error in
scan-tree-dump-times target selector.

--- gcc/testsuite/gcc.dg/vect/slp-perm-8.c.jj   2023-03-03 16:08:17.707264399 +0100
+++ gcc/testsuite/gcc.dg/vect/slp-perm-8.c  2023-03-05 19:05:28.379974608 +0100
@@ -60,9 +60,9 @@ int main (int argc, const char* argv[])
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_perm_byte } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { vect_perm3_byte && { { ! vect_load_lanes } && { {! vect_partial_vectors_usage_1 } || s390_vx } } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { vect_perm3_byte && { { ! vect_load_lanes } && { { ! vect_partial_vectors_usage_1 } || s390_vx } } } } } } */
 /* The epilogues are vectorized using partial vectors.  */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { target { vect_perm3_byte && { { ! vect_load_lanes } && vect_partial_vectors_usage_1 && { ! s390_vx } } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { target { vect_perm3_byte && { { ! vect_load_lanes } && { vect_partial_vectors_usage_1 && { ! s390_vx } } } } } } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" { target vect_load_lanes } } } */
 /* { dg-final { scan-tree-dump "Built SLP cancelled: can use load/store-lanes" "vect" { target { vect_perm3_byte && vect_load_lanes } } } } */
 /* { dg-final { scan-tree-dump "LOAD_LANES" "vect" { target vect_load_lanes } } } */

Jakub



Re: [PATCH] PR rtl-optimization/106594: Preserve zero_extend in combine when cheap.

2023-03-05 Thread Tamar Christina via Gcc-patches


The regression was reported during stage-1. A patch was provided during stage 1 
and the discussions around combine stalled.

The regression for AArch64 needs to be fixed in GCC 13. The hit is too big just 
to "take".

So we need a way forward, even if it's stage-4.

Thanks,
Tamar


From: Segher Boessenkool 
Sent: Saturday, March 4, 2023 10:17 PM
To: Roger Sayle 
Cc: 'GCC Patches' ; Tamar Christina 
; Richard Sandiford 
Subject: Re: [PATCH] PR rtl-optimization/106594: Preserve zero_extend in 
combine when cheap.

On Sat, Mar 04, 2023 at 06:32:15PM -, Roger Sayle wrote:
> This patch addresses PR rtl-optimization/106594, a P1 performance
> regression affecting aarch64.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures.

It should be tested for performance everywhere else, too.

It can very easily result in worse code on some targets.  This kind of
thing really should be done in stage 1, not stage 4.

> PR rtl-optimization/106594
> * combine.cc (expand_compound_operation): Don't expand/transform
> ZERO_EXTEND or SIGN_EXTEND on targets where rtx_cost claims they are
> cheap.

That is not how combine works.  If the old code is more expensive than
what combine comes up with, it *should* transform it.  Magic cost
cutoffs are not okay anywhere in combine, either.

If expand_compound_operation and friends misbehave (not really an "if",
unfortunately), then please fix that, instead of randomly disabling
parts of combine?


Segher


Re: [PATCH v2] RISC-V: Produce better code with complex constants [PR95632] [PR106602]

2023-03-05 Thread Jeff Law via Gcc-patches




On 3/5/23 12:03, Andrew Pinski wrote:

On Sun, Mar 5, 2023 at 10:14 AM Jeff Law via Gcc-patches
 wrote:




On 2/23/23 14:23, Andrew Pinski via Gcc-patches wrote:

On Fri, Dec 9, 2022 at 10:25 AM Raphael Moreira Zinsly
 wrote:


Changes since v1:
  - Fixed formatting issues.
  - Added a name to the define_insn_and_split pattern.
  - Set the target on the 'dg-do compile' in pr106602.c.
  - Removed the rv32 restriction in pr95632.c.

-- >8 --

Due to RISC-V limitations on operations with big constants combine
is failing to match such operations and is not being able to
produce optimal code as it keeps splitting them.  By pretending we
can do those operations we can get more opportunities for
simplification of surrounding instructions.

2022-12-06  Raphael Moreira Zinsly  
  Jeff Law  

gcc/Changelog:
  PR target/95632
  PR target/106602
  * config/riscv/riscv.md: New pattern to simulate complex
  const_int loads.

gcc/testsuite/ChangeLog:
  * gcc.target/riscv/pr95632.c: New test.
  * gcc.target/riscv/pr106602.c: New test.
---
   gcc/config/riscv/riscv.md | 15 +++
   gcc/testsuite/gcc.target/riscv/pr106602.c | 14 ++
   gcc/testsuite/gcc.target/riscv/pr95632.c  | 15 +++
   3 files changed, 44 insertions(+)
   create mode 100644 gcc/testsuite/gcc.target/riscv/pr106602.c
   create mode 100644 gcc/testsuite/gcc.target/riscv/pr95632.c

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index df57e2b0b4a..b0daa4b19eb 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1667,6 +1667,21 @@
MAX_MACHINE_MODE, [3], TRUE);
   })

+;; Pretend to have the ability to load complex const_int in order to get
+;; better code generation around them.
+(define_insn_and_split "*mvconst_internal"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+(match_operand:GPR 1 "splittable_const_int_operand" "i"))]
+  "cse_not_expected"


This is just way broken. This should be combined with the normal move
instructions and just be a define_split.
See PR 108892 for a testcase which shows this breaking how the
register allocator thinks it should work.

I'm pretty sure that won't work.  You need them exposed as a define_insn
so that they can act as a bridge pattern for combine.  You don't want to
expose before combine as that'll regress things in a variety of other
ways.  You don't want the bridge form to survive after splitting.  Hence
define_insn_and_split.

I haven't looked at that bug in detail, but Raphael and I certainly will.


> So the register allocator does not know how to handle if there are two
> different patterns which are to be used but differ by
> constraints/predicates. This is especially true for mov instructions
> which this is.

The define_insn_and_split for this case shouldn't be available for the
allocator.  If it is, then that's the source of the problem.  We may
have missed something in the predicates.

> What I am saying is the "*movdi_64bit" and "*movsi_internal" patterns
> should handle the same instruction as the above and still have a
> define_split.

Perhaps but I think that's independent of the problem you're bumping up
against.  Also note that by the time we're in the allocator we have to
be more careful as we can't allocate new pseudos.

> Take a look at how aarch64 handles this here. It has one pattern for
> the move but it is a define_insn_and_split still. This is explicitly
> to handle the case you are doing really.
> "*movsi_aarch64" and "*movdi_aarch64" .

Will do.
Jeff


Re: [PATCH v2] RISC-V: Produce better code with complex constants [PR95632] [PR106602]

2023-03-05 Thread Andrew Pinski via Gcc-patches
On Sun, Mar 5, 2023 at 10:14 AM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 2/23/23 14:23, Andrew Pinski via Gcc-patches wrote:
> > On Fri, Dec 9, 2022 at 10:25 AM Raphael Moreira Zinsly
> >  wrote:
> >>
> >> Changes since v1:
> >>  - Fixed formatting issues.
> >>  - Added a name to the define_insn_and_split pattern.
> >>  - Set the target on the 'dg-do compile' in pr106602.c.
> >>  - Removed the rv32 restriction in pr95632.c.
> >>
> >> -- >8 --
> >>
> >> Due to RISC-V limitations on operations with big constants combine
> >> is failing to match such operations and is not being able to
> >> produce optimal code as it keeps splitting them.  By pretending we
> >> can do those operations we can get more opportunities for
> >> simplification of surrounding instructions.
> >>
> >> 2022-12-06  Raphael Moreira Zinsly  
> >>  Jeff Law  
> >>
> >> gcc/Changelog:
> >>  PR target/95632
> >>  PR target/106602
> >>  * config/riscv/riscv.md: New pattern to simulate complex
> >>  const_int loads.
> >>
> >> gcc/testsuite/ChangeLog:
> >>  * gcc.target/riscv/pr95632.c: New test.
> >>  * gcc.target/riscv/pr106602.c: New test.
> >> ---
> >>   gcc/config/riscv/riscv.md | 15 +++
> >>   gcc/testsuite/gcc.target/riscv/pr106602.c | 14 ++
> >>   gcc/testsuite/gcc.target/riscv/pr95632.c  | 15 +++
> >>   3 files changed, 44 insertions(+)
> >>   create mode 100644 gcc/testsuite/gcc.target/riscv/pr106602.c
> >>   create mode 100644 gcc/testsuite/gcc.target/riscv/pr95632.c
> >>
> >> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> >> index df57e2b0b4a..b0daa4b19eb 100644
> >> --- a/gcc/config/riscv/riscv.md
> >> +++ b/gcc/config/riscv/riscv.md
> >> @@ -1667,6 +1667,21 @@
> >>MAX_MACHINE_MODE, [3], TRUE);
> >>   })
> >>
> >> +;; Pretend to have the ability to load complex const_int in order to get
> >> +;; better code generation around them.
> >> +(define_insn_and_split "*mvconst_internal"
> >> +  [(set (match_operand:GPR 0 "register_operand" "=r")
> >> +(match_operand:GPR 1 "splittable_const_int_operand" "i"))]
> >> +  "cse_not_expected"
> >
> > This is just way broken. This should be combined with the normal move
> > instructions and just be a define_split.
> > See PR 108892 for a testcase which shows this breaking how the
> > register allocator thinks it should work.
> I'm pretty sure that won't work.  You need them exposed as a define_insn
> so that they can act as a bridge pattern for combine.  You don't want to
> expose before combine as that'll regress things in a variety of other
> ways.  You don't want the bridge form to survive after splitting.  Hence
> define_insn_and_split.
>
> I haven't looked at that bug in detail, but Raphael and I certainly will.

So the register allocator does not know how to handle if there are two
different patterns which are to be used but differ by
constraints/predicates. This is especially true for mov instructions
which this is.
What I am saying is the "*movdi_64bit" and "*movsi_internal" patterns
should handle the same instruction as the above and still have a
define_split.

Take a look at how aarch64 handles this here. It has one pattern for
the move but it is a define_insn_and_split still. This is explicitly
to handle the case you are doing really.
"*movsi_aarch64" and "*movdi_aarch64" .

Thanks,
Andrew Pinski


>
> jeff


Re: [PATCH v2] RISC-V: Produce better code with complex constants [PR95632] [PR106602]

2023-03-05 Thread Jeff Law via Gcc-patches




On 2/23/23 14:23, Andrew Pinski via Gcc-patches wrote:

On Fri, Dec 9, 2022 at 10:25 AM Raphael Moreira Zinsly
 wrote:


Changes since v1:
 - Fixed formatting issues.
 - Added a name to the define_insn_and_split pattern.
 - Set the target on the 'dg-do compile' in pr106602.c.
 - Removed the rv32 restriction in pr95632.c.

-- >8 --

Due to RISC-V limitations on operations with big constants combine
is failing to match such operations and is not being able to
produce optimal code as it keeps splitting them.  By pretending we
can do those operations we can get more opportunities for
simplification of surrounding instructions.

2022-12-06  Raphael Moreira Zinsly  
 Jeff Law  

gcc/Changelog:
 PR target/95632
 PR target/106602
 * config/riscv/riscv.md: New pattern to simulate complex
 const_int loads.

gcc/testsuite/ChangeLog:
 * gcc.target/riscv/pr95632.c: New test.
 * gcc.target/riscv/pr106602.c: New test.
---
  gcc/config/riscv/riscv.md | 15 +++
  gcc/testsuite/gcc.target/riscv/pr106602.c | 14 ++
  gcc/testsuite/gcc.target/riscv/pr95632.c  | 15 +++
  3 files changed, 44 insertions(+)
  create mode 100644 gcc/testsuite/gcc.target/riscv/pr106602.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/pr95632.c

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index df57e2b0b4a..b0daa4b19eb 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1667,6 +1667,21 @@
   MAX_MACHINE_MODE, [3], TRUE);
  })

+;; Pretend to have the ability to load complex const_int in order to get
+;; better code generation around them.
+(define_insn_and_split "*mvconst_internal"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+(match_operand:GPR 1 "splittable_const_int_operand" "i"))]
+  "cse_not_expected"


This is just way broken. This should be combined with the normal move
instructions and just be a define_split.
See PR 108892 for a testcase which shows this breaking how the
register allocator thinks it should work.
I'm pretty sure that won't work.  You need them exposed as a define_insn 
so that they can act as a bridge pattern for combine.  You don't want to 
expose before combine as that'll regress things in a variety of other 
ways.  You don't want the bridge form to survive after splitting.  Hence 
define_insn_and_split.


I haven't looked at that bug in detail, but Raphael and I certainly will.

jeff


Re: [PATCH v4 5/9] riscv: thead: Add support for the XTheadBb ISA extension

2023-03-05 Thread Jeff Law via Gcc-patches




On 3/2/23 01:35, Christoph Muellner wrote:

From: Christoph Müllner 

This patch adds support for the XTheadBb ISA extension.
Thus, there is a functional overlap of the new instructions with
existing Bitmanip instructions, which allows a good amount of code
sharing. However, the vendor extensions are cleanly separated from
the standard extensions (e.g. by using INSN expand pattern that
will re-emit RTL that matches the patterns of either Bitmanip or
XThead INSNs).




diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index d6c2265e9d4..fc8ce9f5226 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -3087,6 +3087,26 @@ (define_insn "riscv_prefetchi_"
"prefetch.i\t%a0"
  )
  
+(define_expand "extv"

+  [(set (match_operand:GPR 0 "register_operand" "=r")
+   (sign_extract:GPR (match_operand:GPR 1 "register_operand" "r")
+(match_operand 2 "const_int_operand")
+(match_operand 3 "const_int_operand")))]
+  "TARGET_XTHEADBB"
+)
+
+(define_expand "extzv"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+   (zero_extract:GPR (match_operand:GPR 1 "register_operand" "r")
+(match_operand 2 "const_int_operand")
+(match_operand 3 "const_int_operand")))]
+  "TARGET_XTHEADBB"
+{
+  if (TARGET_XTHEADBB
+  && (INTVAL (operands[2]) < 8) && (INTVAL (operands[3]) == 0))
+FAIL;
+})
Note that bitmanip has single bit extractions which probably should be 
handed by extzv rather than relying strictly on the combiner to 
synthesize them.  Similarly for single bit insertions.


I've actually got a TODO on Raphael's plate to see how renaming the 
existing bitmanip bit extraction to extzv affects code generation.  I'm 
not offhand sure where it is on his priority list yet.


I guess the wider point is the ext and ins expanders should probably be 
accepting single bit extractions/insertions when ZBS is enabled.


Jeff


Re: [PATCH] RISC-V: Fix ICE for avl_single-86/avl_single-88/avl_single-90

2023-03-05 Thread Kito Cheng via Gcc-patches
Committed, thanks for the fix :)

On Sun, Mar 5, 2023 at 6:25 PM  wrote:
>
> From: Ju-Zhe Zhong 
>
> FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-86.c  -Og -g  (internal
> compiler error: Segmentation fault)
> FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-86.c  -Og -g  (test for
> excess errors)
> FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-88.c  -Og -g  (internal
> compiler error: Segmentation fault)
> FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-88.c  -Og -g  (test for
> excess errors)
> FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-90.c  -Og -g  (internal
> compiler error: Segmentation fault)
> FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-90.c  -Og -g  (test for
> excess errors)
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vsetvl.cc (reg_available_p): Fix bug.
> (pass_vsetvl::backward_demand_fusion): Ditto.
>
> ---
>  gcc/config/riscv/riscv-vsetvl.cc | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
> index 9e25102a4f2..73f36a70331 100644
> --- a/gcc/config/riscv/riscv-vsetvl.cc
> +++ b/gcc/config/riscv/riscv-vsetvl.cc
> @@ -1528,7 +1528,7 @@ static bool
>  reg_available_p (const bb_info *bb, const vector_insn_info &info)
>  {
>if (!info.get_avl_source ())
> -return true;
> +return false;
>insn_info *insn = info.get_avl_source ()->insn ();
>if (insn->bb () == bb)
>  return before_p (insn, info.get_insn ());
> @@ -3040,6 +3040,12 @@ pass_vsetvl::backward_demand_fusion (void)
> continue;
>   if (e->src->index == ENTRY_BLOCK_PTR_FOR_FN (cfun)->index)
> continue;
> + /* If prop is demand of vsetvl instruction and reaching doesn't demand
> +AVL. We don't backward propagate since vsetvl instruction has no
> +side effects.  */
> + if (vsetvl_insn_p (prop.get_insn ()->rtl ())
> + && propagate_avl_across_demands_p (prop, block_info.reaching_out))
> +   continue;
>
>   if (block_info.reaching_out.unknown_p ())
> continue;
> --
> 2.36.3
>


Re: [PATCH 0/2] LoongArch: testsuite: Fix tests related to stack

2023-03-05 Thread Xi Ruoyao via Gcc-patches
On Fri, 2023-03-03 at 08:21 -0800, Mike Stump wrote:
> On Mar 3, 2023, at 12:40 AM, Xi Ruoyao via Gcc-patches
>  wrote:
> > 
> > Some trivial test case fixes.  Ok for trunk?
> 
> Ok.

Lulu: if you don't object I'll push these two this week.

I tried to bisect for the exact point where the test cases were broken,
but it turns out they have been broken since the day they were committed
(r13-4401).  As the draft of r13-4401 was sent in Sept 2022 but only
committed in Nov 2022, I can only guess something changed in those two
months and broke the tests...
-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[PATCH] RISC-V: Fix ICE for avl_single-86/avl_single-88/avl_single-90

2023-03-05 Thread juzhe . zhong
From: Ju-Zhe Zhong 

FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-86.c  -Og -g  (internal
compiler error: Segmentation fault)
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-86.c  -Og -g  (test for
excess errors)
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-88.c  -Og -g  (internal
compiler error: Segmentation fault)
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-88.c  -Og -g  (test for
excess errors)
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-90.c  -Og -g  (internal
compiler error: Segmentation fault)
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-90.c  -Og -g  (test for
excess errors)

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (reg_available_p): Fix bug.
(pass_vsetvl::backward_demand_fusion): Ditto.

---
 gcc/config/riscv/riscv-vsetvl.cc | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 9e25102a4f2..73f36a70331 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -1528,7 +1528,7 @@ static bool
 reg_available_p (const bb_info *bb, const vector_insn_info &info)
 {
   if (!info.get_avl_source ())
-return true;
+return false;
   insn_info *insn = info.get_avl_source ()->insn ();
   if (insn->bb () == bb)
 return before_p (insn, info.get_insn ());
@@ -3040,6 +3040,12 @@ pass_vsetvl::backward_demand_fusion (void)
continue;
  if (e->src->index == ENTRY_BLOCK_PTR_FOR_FN (cfun)->index)
continue;
+ /* If prop is demand of vsetvl instruction and reaching doesn't demand
+AVL. We don't backward propagate since vsetvl instruction has no
+side effects.  */
+ if (vsetvl_insn_p (prop.get_insn ()->rtl ())
+ && propagate_avl_across_demands_p (prop, block_info.reaching_out))
+   continue;
 
  if (block_info.reaching_out.unknown_p ())
continue;
-- 
2.36.3



Re: [PATCH V3 0/5] RISC-V: Implement Scalar Cryptography Extension

2023-03-05 Thread Kito Cheng via Gcc-patches
Committed, thanks!

On Mon, Feb 20, 2023 at 3:01 PM Liao Shihua  wrote:
>
> This series adds basic support for the Scalar Cryptography extensions:
> * Zbkb
> * Zbkc
> * Zbkx
> * Zknd
> * Zkne
> * Zknh
> * Zksed
> * Zksh
>
> The implementation follows version v1.0.0 of the Scalar Cryptography
> specification,
> which can be found here:
> https://github.com/riscv/riscv-crypto/releases/tag/v1.0.0-scalar
>
> It is the work of Wu Siyu and Liao Shihua.
> Liao Shihua (5):
>   Add prototypes for RISC-V Crypto built-in functions
>   Implement ZBKB, ZBKC and ZBKX extensions
>   Implement ZKND and ZKNE extensions
>   Implement ZKNH extension
>   Implement ZKSH and ZKSED extensions
>
>  gcc/config/riscv/bitmanip.md  |  20 +-
>  gcc/config/riscv/constraints.md   |   8 +
>  gcc/config/riscv/crypto.md| 435 ++
>  gcc/config/riscv/riscv-builtins.cc|  26 ++
>  gcc/config/riscv/riscv-ftypes.def |  10 +
>  gcc/config/riscv/riscv-scalar-crypto.def  |  94 
>  gcc/config/riscv/riscv.md |   4 +-
>  gcc/testsuite/gcc.target/riscv/zbkb32.c   |  36 ++
>  gcc/testsuite/gcc.target/riscv/zbkb64.c   |  28 ++
>  gcc/testsuite/gcc.target/riscv/zbkc32.c   |  17 +
>  gcc/testsuite/gcc.target/riscv/zbkc64.c   |  17 +
>  gcc/testsuite/gcc.target/riscv/zbkx32.c   |  18 +
>  gcc/testsuite/gcc.target/riscv/zbkx64.c   |  18 +
>  gcc/testsuite/gcc.target/riscv/zknd32.c   |  18 +
>  gcc/testsuite/gcc.target/riscv/zknd64.c   |  36 ++
>  gcc/testsuite/gcc.target/riscv/zkne32.c   |  18 +
>  gcc/testsuite/gcc.target/riscv/zkne64.c   |  30 ++
>  gcc/testsuite/gcc.target/riscv/zknh-sha256.c  |  28 ++
>  .../gcc.target/riscv/zknh-sha512-32.c |  42 ++
>  .../gcc.target/riscv/zknh-sha512-64.c |  31 ++
>  gcc/testsuite/gcc.target/riscv/zksed32.c  |  19 +
>  gcc/testsuite/gcc.target/riscv/zksed64.c  |  19 +
>  gcc/testsuite/gcc.target/riscv/zksh32.c   |  19 +
>  gcc/testsuite/gcc.target/riscv/zksh64.c   |  19 +
>  24 files changed, 999 insertions(+), 11 deletions(-)
>  create mode 100644 gcc/config/riscv/crypto.md
>  create mode 100644 gcc/config/riscv/riscv-scalar-crypto.def
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zbkb32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zbkb64.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zbkc32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zbkc64.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zbkx32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zbkx64.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zknd32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zknd64.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zkne32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zkne64.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zknh-sha256.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zknh-sha512-32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zknh-sha512-64.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zksed32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zksed64.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zksh32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zksh64.c
>
> --
> 2.38.1.windows.1
>


Re: [PATCH v4 0/9] RISC-V: Add XThead* extension support

2023-03-05 Thread Kito Cheng via Gcc-patches
LGTM :)

On Thu, Mar 2, 2023 at 4:36 PM Christoph Muellner
 wrote:
>
> From: Christoph Müllner 
>
> This series introduces support for the T-Head specific RISC-V ISA extensions
> which are available e.g. on the T-Head XuanTie C906.
>
> The ISA spec can be found here:
>   https://github.com/T-head-Semi/thead-extension-spec
>
> This series adds support for the following XThead* extensions:
> * XTheadBa
> * XTheadBb
> * XTheadBs
> * XTheadCmo
> * XTheadCondMov
> * XTheadFmv
> * XTheadInt
> * XTheadMac
> * XTheadMemPair
> * XTheadSync
>
> All extensions are properly integrated and the included tests
> demonstrate the improvements of the generated code.
>
> The series also introduces support for "-mcpu=thead-c906", which also
> enables all available XThead* ISA extensions of the T-Head C906.
>
> All patches have been tested and don't introduce regressions for RV32 or RV64.
> The patches have also been tested with SPEC CPU2017 on QEMU and real HW
> (D1 board).
>
> Support patches for these extensions for Binutils, QEMU, and LLVM have
> already been merged in the corresponding upstream projects.
>
> Patches 1-8 from this series (everything except the last one) got an ACK
> by Kito. However, since there were a few comments after the ACK, I
> decided to send out a v4, so that reviewers can verify that their
> comments have been addressed properly.
>
> Note that there was a concern raised by Andrew Pinski (on CC), which
> might not be resolved with this series (I could not reproduce the issue,
> but I might have misunderstood something).
>
> Changes in v4:
> - Drop XTheadMemIdx and XTheadFMemIdx (will be a follow-up series)
> - Replace 'immediate_operand' by 'const_int_operand' in many patterns
> - Small cleanups in XTheadBb
> - Factor out C code into thead.cc (XTheadMemPair) to minimize changes in
>   riscv.cc
>
> Changes in v3:
> - Bugfix in XTheadBa
> - Rewrite of XTheadMemPair
> - Inclusion of XTheadMemIdx and XTheadFMemIdx
>
> Christoph Müllner (9):
>   riscv: Add basic XThead* vendor extension support
>   riscv: riscv-cores.def: Add T-Head XuanTie C906
>   riscv: thead: Add support for the XTheadBa ISA extension
>   riscv: thead: Add support for the XTheadBs ISA extension
>   riscv: thead: Add support for the XTheadBb ISA extension
>   riscv: thead: Add support for the XTheadCondMov ISA extensions
>   riscv: thead: Add support for the XTheadMac ISA extension
>   riscv: thead: Add support for the XTheadFmv ISA extension
>   riscv: thead: Add support for the XTheadMemPair ISA extension
>
>  gcc/common/config/riscv/riscv-common.cc   |  26 ++
>  gcc/config.gcc|   1 +
>  gcc/config/riscv/bitmanip.md  |  52 ++-
>  gcc/config/riscv/constraints.md   |   8 +
>  gcc/config/riscv/iterators.md |   4 +
>  gcc/config/riscv/peephole.md  |  56 +++
>  gcc/config/riscv/riscv-cores.def  |   4 +
>  gcc/config/riscv/riscv-opts.h |  26 ++
>  gcc/config/riscv/riscv-protos.h   |  16 +-
>  gcc/config/riscv/riscv.cc | 226 +++--
>  gcc/config/riscv/riscv.md |  67 ++-
>  gcc/config/riscv/riscv.opt|   3 +
>  gcc/config/riscv/t-riscv  |   4 +
>  gcc/config/riscv/thead.cc | 427 ++
>  gcc/config/riscv/thead.md | 346 ++
>  .../gcc.target/riscv/mcpu-thead-c906.c|  28 ++
>  .../gcc.target/riscv/xtheadba-addsl.c |  55 +++
>  gcc/testsuite/gcc.target/riscv/xtheadba.c |  14 +
>  gcc/testsuite/gcc.target/riscv/xtheadbb-ext.c |  20 +
>  .../gcc.target/riscv/xtheadbb-extu-2.c|  22 +
>  .../gcc.target/riscv/xtheadbb-extu.c  |  22 +
>  gcc/testsuite/gcc.target/riscv/xtheadbb-ff1.c |  18 +
>  gcc/testsuite/gcc.target/riscv/xtheadbb-rev.c |  45 ++
>  .../gcc.target/riscv/xtheadbb-srri.c  |  25 +
>  gcc/testsuite/gcc.target/riscv/xtheadbb.c |  14 +
>  gcc/testsuite/gcc.target/riscv/xtheadbs-tst.c |  13 +
>  gcc/testsuite/gcc.target/riscv/xtheadbs.c |  14 +
>  gcc/testsuite/gcc.target/riscv/xtheadcmo.c|  14 +
>  .../riscv/xtheadcondmov-mveqz-imm-eqz.c   |  38 ++
>  .../riscv/xtheadcondmov-mveqz-imm-not.c   |  38 ++
>  .../riscv/xtheadcondmov-mveqz-reg-eqz.c   |  38 ++
>  .../riscv/xtheadcondmov-mveqz-reg-not.c   |  38 ++
>  .../riscv/xtheadcondmov-mvnez-imm-cond.c  |  38 ++
>  .../riscv/xtheadcondmov-mvnez-imm-nez.c   |  38 ++
>  .../riscv/xtheadcondmov-mvnez-reg-cond.c  |  38 ++
>  .../riscv/xtheadcondmov-mvnez-reg-nez.c   |  38 ++
>  .../gcc.target/riscv/xtheadcondmov.c  |  14 +
>  .../gcc.target/riscv/xtheadfmemidx.c  |  14 +
>  .../gcc.target/riscv/xtheadfmv-fmv.c  |  24 +
>  gcc/testsuite/gcc.target/riscv/xtheadfmv.c|  14 +
>  gcc/testsuite/gcc.target/riscv/xtheadint.c|  14 +
>  .../gcc.target/riscv/xtheadmac-mula-muls.c|  43 ++
>  

Re: [wwwdocs] gcc-13: riscv: Document the T-Head CPU support

2023-03-05 Thread Kito Cheng via Gcc-patches
LGTM :)


On Fri, Feb 24, 2023 at 7:19 PM Christoph Muellner
 wrote:
>
> From: Christoph Müllner 
>
> This patch documents the new T-Head CPU support for RISC-V.
>
> Signed-off-by: Christoph Müllner 
> ---
>  htdocs/gcc-13/changes.html | 24 +++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
> index a803f501..ce5ba35c 100644
> --- a/htdocs/gcc-13/changes.html
> +++ b/htdocs/gcc-13/changes.html
> @@ -490,7 +490,29 @@ a work-in-progress.
>
>  RISC-V
>  
> -New ISA extension support for zawrs.
> +  New ISA extension support for Zawrs.
> +  Support for the following vendor extensions has been added:
> +
> +  XTheadBa
> +  XTheadBb
> +  XTheadBs
> +  XTheadCmo
> +  XTheadCondMov
> +  XTheadFMemIdx
> +  XTheadFmv
> +  XTheadInt
> +  XTheadMac
> +  XTheadMemIdx
> +  XTheadMemPair
> +  XTheadSync
> +
> +  
> +  The following new CPUs are supported through the -mcpu
> +  option (GCC identifiers in parentheses).
> +
> +  T-Head's XuanTie C906 (thead-c906).
> +
> +  
>  
>
>  
> --
> 2.39.2
>


Re: [PATCH] RISC-V: Fix wrong partial subreg check for bsetidisi

2023-03-05 Thread Kito Cheng via Gcc-patches
Committed, thanks!


On Tue, Feb 28, 2023 at 5:32 PM Philipp Tomsich
 wrote:
>
> On Tue, 28 Feb 2023 at 06:00, Lin Sinan  wrote:
> >
> > From: Lin Sinan 
> >
> > The partial subreg check should be for the subreg operand (operand 1) instead of
> > the immediate operand (operand 2). This change also fixes pr68648.c in zbs.
>
> Good catch.
> Reviewed-by: 


Re: [PATCH] RISC-V: Allow const0_rtx operand in max/min

2023-03-05 Thread Kito Cheng via Gcc-patches
Committed, thanks!


On Tue, Feb 28, 2023 at 12:36 PM Sinan  wrote:
>
> From 73e743348a49a7fffcf2e328b8179e8dbbc3b2b4 Mon Sep 17 00:00:00 2001
> From: Lin Sinan 
> Date: Tue, 28 Feb 2023 00:44:55 +0800
> Subject: [PATCH] RISC-V: Allow const0_rtx operand in max/min
>
> Optimize cases that use max[u]/min[u] against a zero constant.
> E.g., for the case int f(int x) { return x >= 0 ? x : 0; }
> the current asm output with rv64gc_zba_zbb
>  li rtmp,0
>  max a0,a0,rtmp
> could be optimized into
>  max a0,a0,zero
>
> gcc/ChangeLog:
>
>  * config/riscv/bitmanip.md: allow 0 constant in max/min
>pattern.
>
> gcc/testsuite/ChangeLog:
>
>  * gcc.target/riscv/zbb-min-max-03.c: New test.
>
> ---
>  gcc/config/riscv/bitmanip.md|  4 ++--
>  gcc/testsuite/gcc.target/riscv/zbb-min-max-03.c | 10 ++
>  2 files changed, 12 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-min-max-03.c
>
> diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
> index 58a86bd929f..f771835369c 100644
> --- a/gcc/config/riscv/bitmanip.md
> +++ b/gcc/config/riscv/bitmanip.md
> @@ -363,9 +363,9 @@
>  (define_insn "3"
>[(set (match_operand:X 0 "register_operand" "=r")
>  (bitmanip_minmax:X (match_operand:X 1 "register_operand" "r")
> -  (match_operand:X 2 "register_operand" "r")))]
> +  (match_operand:X 2 "reg_or_0_operand" "rJ")))]
>"TARGET_ZBB"
> -  "\t%0,%1,%2"
> +  "\t%0,%1,%z2"
>[(set_attr "type" "bitmanip")])
>
>  ;; Optimize the common case of a SImode min/max against a constant
> diff --git a/gcc/testsuite/gcc.target/riscv/zbb-min-max-03.c 
> b/gcc/testsuite/gcc.target/riscv/zbb-min-max-03.c
> new file mode 100644
> index 000..947300d599d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zbb-min-max-03.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gc_zba_zbb -mabi=lp64d" } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" } } */
> +
> +int f(int x) {
> +return x >= 0 ? x : 0;
> +}
> +
> +/* { dg-final { scan-assembler-times "max\t" 1 } } */
> +/* { dg-final { scan-assembler-not "li\t" } } */
> --
> 2.34.1
>
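
For illustration only (not part of the patch above): a minimal C sketch of
source forms that are equivalent to a max/min against the constant zero,
which is the shape the changed pattern targets.  The helper names
clamp_low_zero and clamp_high_zero are hypothetical.  Whether each one is
really emitted as a single max/min using the zero register depends on the
compiler version and options, so treat that part as an assumption.

```c
/* Hypothetical helper: same shape as f() in the new test.
   Semantically this is max (x, 0).  */
int clamp_low_zero (int x)
{
  return x >= 0 ? x : 0;
}

/* Hypothetical helper: the mirrored form.
   Semantically this is min (x, 0).  */
int clamp_high_zero (int x)
{
  return x <= 0 ? x : 0;
}
```

Compiling such a file with the options from the new test
(-march=rv64gc_zba_zbb -mabi=lp64d, at an optimization level above -O0) and
inspecting the assembly is one way to check that no separate "li" of zero
remains.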