RE: [PATCH] RISC-V: Support simplify (-1-x) for vector.

2023-08-17 Thread Wang, Yanzhang via Gcc-patches
Hi Jeff,

Thank you so much for the note and testing :D.
I'll attach the test result next time.

Thanks,
Yanzhang

> -Original Message-
> From: Jeff Law 
> Sent: Thursday, August 17, 2023 12:33 PM
> To: Wang, Yanzhang ; gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Li, Pan2
> 
> Subject: Re: [PATCH] RISC-V: Support simplify (-1-x) for vector.
> 
> 
> 
> On 8/16/23 02:40, yanzhang.wang--- via Gcc-patches wrote:
> > From: Yanzhang Wang 
> >
> > The pattern is enabled for scalar but not for vector. The patch try to
> > make it consistent and will convert below code,
> >
> > shortcut_for_riscv_vrsub_case_1_32:
> >  vl1re32.v   v1,0(a1)
> >  vsetvli zero,a2,e32,m1,ta,ma
> >  vrsub.viv1,v1,-1
> >  vs1r.v  v1,0(a0)
> >  ret
> >
> > to,
> >
> > shortcut_for_riscv_vrsub_case_1_32:
> >  vl1re32.v   v1,0(a1)
> >  vsetvli zero,a2,e32,m1,ta,ma
> >  vnot.v  v1,v1
> >  vs1r.v  v1,0(a0)
> >  ret
> >
> > gcc/ChangeLog:
> >
> > * simplify-rtx.cc (simplify_context::simplify_binary_operation_1):
> >  Get -1 with mode.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/riscv/rvv/base/simplify-vrsub.c: New test.
> Just a note.  It is customary to indicate what testing you did for each
> patch.  A patch which changes target independent code should be
> bootstrapped and regression tested on at least one major target (most folks
> use x86_64 or aarch64).
> 
> If you change target code it is customary to run the testsuite on that
> target.  Ideally that would include a bootstrap and regression test, but
> that's not always possible (cross compilers) in which case you just build
> the toolchain and run the cross tests.
> 
> I went ahead and bootstrapped & regression tested this on x86_64-linux-gnu
> where it passed without regressions.
> 
> I'll push this to the trunk.
> 
> Thanks,
> jeff


RE: [PATCH v3] RISCV: Add -m(no)-omit-leaf-frame-pointer support.

2023-08-01 Thread Wang, Yanzhang via Gcc-patches
Hi Jeff,

Do you have any further comments about this patch ?

Thanks,
Yanzhang

> -Original Message-
> From: Jeff Law 
> Sent: Friday, July 21, 2023 12:11 PM
> To: Kito Cheng ; Wang, Yanzhang
> 
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@sifive.com;
> Li, Pan2 
> Subject: Re: [PATCH v3] RISCV: Add -m(no)-omit-leaf-frame-pointer support.
> 
> 
> 
> On 7/20/23 21:49, Kito Cheng wrote:
> > LGTM, I think long jump is another issue and making ra become a fixed
> > register will escalate to an ABI issue, so that should not be a
> > blocker for this patch.
> I'll take a look tomorrow, but I'm supportive of what Yanzhang is trying to
> do in principle.  I've got a few hot items to deal with tonight though.
> 
> WRT making $ra fixed.  In practice fixing a register just takes it out of
> the pool of things available to the allocator.  Furthermore $ra is always
> considered clobbered at call sites.  So while one could view it as an ABI
> change, it's not one that's actually observable in practice.
> I suspect that's one of the reasons why $ra is used by the assembler in
> this manner -- it minimizes both the ABI and performance impacts.
> 
> jeff



RE: [PATCH v2] RISC-V: convert the mulh with 0 to mov 0 to the reg.

2023-07-31 Thread Wang, Yanzhang via Gcc-patches
Thanks your comments, Jeff and Robin

> > Is the mulh case somehow common or critical?
> Well, I would actually back up even further.  What were the
> circumstances that led to the mulh with a zero operand?   

I think you both mentioned why should we add the mulh * 0 simplify.
Unfortunately, I have no such a benchmark to explain the criticalness. We found
there're some cases that exists in simplify_binary_operation in simplify-rtx.cc
but not working for RISC-V backend. For example,

- mult * 0 exists, but RISC-V has additional mulh * 0
- add + 0 / sub - 0 exists, but RISC-V has additional (madc + adc) + 0
- ...

So we want to do some complement to make the simplify can cover more cases.
That's the basic idea why we do these shortcut optimizations.

> > However, adding new rtl expressions, especially generic ones that are
> > useful for others and the respective optimizations is a tedious
> > process as well.  Still, just recently Roger Sayle added bitreverse
> > and copysign.  You can refer to his patch as well as the follow-up
> > ones to get an idea of what would need to be done.
> > ("Add RTX codes for BITREVERSE and COPYSIGN")

Great advise. I'll have a check for the generic operations whether they can
be implemented by this patch's style. It seems that we have to write specific
pattern for the unspec relative insns, unfortunately.

Thanks,
Yanzhang

> -Original Message-
> From: Jeff Law 
> Sent: Saturday, July 29, 2023 7:07 AM
> To: Robin Dapp ; Wang, Yanzhang
> ; gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Li, Pan2
> 
> Subject: Re: [PATCH v2] RISC-V: convert the mulh with 0 to mov 0 to the reg.
> 
> 
> 
> On 7/28/23 06:31, Robin Dapp via Gcc-patches wrote:
> >> This is a draft patch. I would like to explain it's hard to make the
> >> simplify generic and ask for some help.
> >>
> >> There're 2 categories we need to optimize.
> >>
> >> - The op in optab such as div / 1.
> >> - The unspec operation such as mulh * 0, (vadc+vmadc) + 0.
> >>
> >> Especially for the unspec operation, I found we need to write one by
> >> one to match the special pattern. Seems there's no way to write a
> >> generic pattern that will match mulh, (vadc+vmadc), sll... This way
> >> is too complicated and not so elegant because need to write so much
> >> md patterns.
> >>
> >> Do you have any ideas?
> >
> > Yes, it's cumbersome having to add the patterns individually and it
> > would be nicer to have the middle end optimize for us.
> >
> > However, adding new rtl expressions, especially generic ones that are
> > useful for others and the respective optimizations is a tedious
> > process as well.  Still, just recently Roger Sayle added bitreverse
> > and copysign.  You can refer to his patch as well as the follow-up
> > ones to get an idea of what would need to be done.
> > ("Add RTX codes for BITREVERSE and COPYSIGN")
> >
> > So if we have few patterns that are really performance critical (like
> > for some benchmark) my take is to add them in a similar way you were
> > proposing but I would advise against using this excessively.
> > Is the mulh case somehow common or critical?
> Well, I would actually back up even further.  What were the
> circumstances that led to the mulh with a zero operand?   That would
> tend to be an indicator of a problem earlier.  Perhaps in the gimple
> pipeline or the gimple->rtl conversion.  I'd be a bit surprised to see a
> const0_rtx propagate in during the RTL pipeline, I guess it's possible, but
> I'd expect it to be relatively rare.
> 
> The one case I could see happening would be cases from the builtin apis...
> Of course one might call that user error ;-)
> 
> 
> jeff


RE: [PATCH v2] RISC-V: convert the mulh with 0 to mov 0 to the reg.

2023-07-28 Thread Wang, Yanzhang via Gcc-patches
This is a draft patch. I would like to explain it's hard to make the
simplify generic and ask for some help.

There're 2 categories we need to optimize.

- The op in optab such as div / 1.
- The unspec operation such as mulh * 0, (vadc+vmadc) + 0.

Especially for the unspec operation, I found we need to write one by
one to match the special pattern. Seems there's no way to write a
generic pattern that will match mulh, (vadc+vmadc), sll... This way
is too complicated and not so elegant because need to write so much
md patterns.

Do you have any ideas?

> -Original Message-
> From: Wang, Yanzhang 
> Sent: Friday, July 28, 2023 7:50 PM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; rdapp@gmail.com; Li,
> Pan2 ; Wang, Yanzhang 
> Subject: [PATCH v2] RISC-V: convert the mulh with 0 to mov 0 to the reg.
> 
> From: Yanzhang Wang 
> 
> This patch will optimize the below mulh example,
> 
> vint32m1_t shortcut_for_riscv_vmulh_case_0(vint32m1_t v1, size_t vl) {
>   return __riscv_vmulh_vx_i32m1(v1, 0, vl); }
> 
> from mulh pattern
> 
> vsetvli   zero, a2, e32, m1, ta, ma
> vmulh.vx  v24, v24, zero
> vs1r.vv24, 0(a0)
> 
> to below vmv.
> 
> vsetvli zero,a2,e32,m1,ta,ma
> vmv.v.i v1,0
> vs1r.v  v1,0(a0)
> 
> It will elimate the mul with const 0 instruction to the simple mov
> instruction.
> 
> Signed-off-by: Yanzhang Wang 
> 
> gcc/ChangeLog:
> 
>   * config/riscv/autovec-opt.md: Add a split pattern.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/riscv/rvv/base/binop_vx_constraint-121.c: The mul
> with 0 will be simplified to vmv.v.i.
>   * gcc.target/riscv/rvv/autovec/vmulh-with-zero.cc: New test.
> ---
>  gcc/config/riscv/autovec-opt.md   | 58 +++
>  gcc/config/riscv/riscv-protos.h   |  2 +
>  gcc/config/riscv/riscv-v.cc   | 57 ++
>  .../riscv/rvv/autovec/vmulh-with-zero.cc  | 19 ++
>  .../riscv/rvv/base/binop_vx_constraint-121.c  |  3 +-
>  5 files changed, 138 insertions(+), 1 deletion(-)  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/autovec/vmulh-with-zero.cc
> 
> diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-
> opt.md index 28040805b23..0d87572d1a4 100644
> --- a/gcc/config/riscv/autovec-opt.md
> +++ b/gcc/config/riscv/autovec-opt.md
> @@ -405,3 +405,61 @@
>"vmv.x.s\t%0,%1"
>[(set_attr "type" "vimovvx")
> (set_attr "mode" "")])
> +
> +;;; Simplify the mulh with 0 to move
> +(define_split
> +  [(set (match_operand:VI_QHS 0 "register_operand")
> + (if_then_else:VI_QHS
> +   (unspec:
> +  [(match_operand: 1 "vector_all_trues_mask_operand")
> +(match_operand 5 "vector_length_operand")
> +(match_operand 6 "const_int_operand")
> +(match_operand 7 "const_int_operand")
> +(match_operand 8 "const_int_operand")
> +(reg:SI VL_REGNUM)
> +(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> +   (unspec:VI_QHS
> +  [(vec_duplicate:VI_QHS
> + (match_operand: 4 "reg_or_0_operand"))
> +(match_operand:VI_QHS 3 "register_operand")] VMULH)
> +   (match_operand:VI_QHS 2 "vector_merge_operand")
> +   ))]
> +  "TARGET_VECTOR
> + && rtx_equal_p (operands[4], CONST0_RTX (GET_MODE (operands[4])))"
> +  [(const_int 0)]
> +{
> +  riscv_vector::simplify_unspec_operations (operands, UNSPEC,
> +  , mode) ;
> +  DONE;
> +})
> +
> +;;; Simplify vmadc + vadc with 0 to a simple move.
> +(define_split
> +  [(set (match_operand:VI 0 "register_operand")
> + (if_then_else:VI
> +   (unspec:
> +  [(match_operand 4 "vector_length_operand")
> +(match_operand 5 "const_int_operand")
> +(match_operand 6 "const_int_operand")
> +(reg:SI VL_REGNUM)
> +(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> +   (unspec:VI
> +  [(match_operand:VI 2 "register_operand")
> +(unspec:
> +  [(match_operand:VI 3 "register_operand")
> +(unspec:
> +  [(match_operand 7 "vector_length_operand")
> +(match_operand 8 "const_int_operand")
> +(reg:SI VL_REGNUM)
> +(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> +] UNSPEC_OVERFLOW)
> +] UNSPEC_VADC)
> +   (match_operand:VI 1 "vector_merge_operand")))]
> +  "TARGET_VECTOR"
> +  [(const_int 0)]
> +{
> +  riscv_vector::simplify_unspec_operations (operands, PLUS, UNSPEC_VADC,
> + mode);
> +  DONE;
> +})
> +
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-
> protos.h index f052757cede..6a188a3d0ef 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -228,6 +228,8 @@ bool neg_simm5_p (rtx);  bool has_vi_variant_p
> (rtx_code, rtx);  void expand_vec_cmp (rtx, rtx_code, rtx, rtx);  bool
> expand_vec_cmp_float (rtx, rtx_code, rtx, rtx, bool);
> +void 

RE: [PATCH] RISCV: Add -m(no)-omit-leaf-frame-pointer support.

2023-06-21 Thread Wang, Yanzhang via Gcc-patches
Hi Jeff, sorry for the late reply.

> The long branch handling is done at the assembler level.  So the clobbering
> of $ra isn't visible to the compiler.  Thus the compiler has to be
> extremely careful to not hold values in $ra because the assembler may
> clobber $ra.

If assembler will modify the $ra behavior, it seems the rules we defined in
the riscv.cc will be ignored. For example, the $ra saving generated by this
patch may be modified by the assmebler and all others depends on it will be
wrong. So implementing the long jump in the compiler is better.

Do I understand it correctly ?

> If you're not going to use dwarf, then my recommendation is to ensure that
> the data you need is *always* available in the stack at known
> offsets.   That will mean your code isn't optimized as well.  It means
> hand written assembly code has to follow the conventions, you can't link
> against libraries that do not follow those conventions, etc etc.  But
> that's the price you pay for not using dwarf (or presumably ORC/SFRAME
> which I haven't studied in detail).

Yes. That's right. All the libraries need to follow the same logic. But as
you said, this is the price if we choose this solution. And fortunately,
this will only be used in special scenarios.

---

And Jeff, do you have any other comments about this patch? Should we add
some descriptions somewhere in the doc?

Thanks,
Yanzhang

> -Original Message-
> From: Jeff Law 
> Sent: Thursday, June 8, 2023 11:05 PM
> To: Wang, Yanzhang ; gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Li, Pan2
> 
> Subject: Re: [PATCH] RISCV: Add -m(no)-omit-leaf-frame-pointer support.
> 
> 
> 
> On 6/6/23 21:50, Wang, Yanzhang wrote:
> > Hi Jeff,
> >
> > Thanks your comments. I have few questions that I don't quite understand.
> >
> >> One of the things that needs to be upstreamed is long jump support
> >> within a function.  Essentially once a function reaches 1M in size we
> >> have the real possibility that a direct jump may not reach its target.
> >>
> >> To support this I expect that $ra is going to become a fixed register
> >> (ie, not available to the register allocator as a temporary).  It'll
> >> be used as a scratch register for long jump sequences.
> >>
> >> One of the consequences of this is $ra will need to be saved in leaf
> >> functions that are near or over 1M in size.
> >>
> >> Note that at the time when we have to lay out the stack, we do not
> >> know the precise length of the function.  So there's a degree of
> >> "fuzz" in the decision whether or not to save $ra in a function that
> >> is close to the 1M limit.
> >
> > Do you mean that, long jump to more than 1M offset will need multiple
> > jal and each jal will save the $ra ?
> Long jumps are implemnted as an indirect jump which needs a scratch
> register to hold the high part of the jump target address.
> 
> >
> > If yes, I'm confused about what's the influence of the $ra saving for
> > function prologue. We will save the fp+ra at the prologue, the next
> > $ra saving seems will not modify the $ra already saved.
> The long branch handling is done at the assembler level.  So the clobbering
> of $ra isn't visible to the compiler.  Thus the compiler has to be
> extremely careful to not hold values in $ra because the assembler may
> clobber $ra.
> 
> This ultimately comes back to the phase ordering problem.  At register
> allocation time we don't know if we need long jumps or not.  So we don't
> know if $ra is potentially clobbered by the assembler.   A similar phase
> ordering problems exists in the prologue/epilogue generation.
> 
> The other approach to long branch handling would be to do it all in the
> compiler.  I would actually prefer this approach, but it's not likely to
> land in the near term.
> 
> 
> >
> > I think it's yes (not valid) when we want to get the return address to
> > parent function from $ra directly in the function body. But we can get
> > the right return address from fp with offset if we save them at prologue,
> is it right ?
> Right.  You'll be able to get the value of $ra out of the stack.
> 
> 
> 
> >
> >> Meaning that what you really want is to be using
> >> -fno-omit-frame-pointer and for $ra to always be saved in the stack,
> even in a leaf function.
> >
> > This is also another solution but will change the default behavior of
> > -fno-omit-frame-pointer.
> That's OK.  While -f options are target independent options, targets are
> allowed to adjust certain behaviors based on those options.
> 
> If you're not going to use dwarf, then my recommendation is to ensure that
> the data you need is *always* available in the stack at known
> offsets.   That will mean your code isn't optimized as well.  It means
> hand written assembly code has to follow the conventions, you can't link
> against libraries that do not follow those conventions, etc etc.  But
> that's the price you pay for not using dwarf (or presumably ORC/SFRAME
> which I haven't studied 

RE: Re: [PATCH] RISC-V: convert the mulh with 0 to mov 0 to the reg.

2023-06-21 Thread Wang, Yanzhang via Gcc-patches
Of cause, I'd like to make it generic. Thanks Robin’s advice! It's right,
there're many similar situations.

But I'm not sure how to distinguish different operations. Currently, the
VMULH is fixed as below.


+   (unspec:VI_QHS
+ [(vec_duplicate:VI_QHS
+(match_operand: 4 "reg_or_0_operand"))
+   (match_operand:VI_QHS 3 "register_operand")] VMULH)

Do we need to define another UNSPEC ? And do we have any APIs to get the
operation, like whether it's VMULH or POW ?

Thanks,
Yanzhang
From: juzhe.zh...@rivai.ai 
Sent: Wednesday, June 21, 2023 2:33 PM
To: Robin Dapp ; Wang, Yanzhang ; 
gcc-patches 
Cc: Robin Dapp ; Kito.cheng ; Li, 
Pan2 
Subject: Re: Re: [PATCH] RISC-V: convert the mulh with 0 to mov 0 to the reg.

Oh. Yes. Thanks for Robin pointing this.

@yanzhang, could you refine this patch more deeply to gain more optimizations ?

Thanks.

juzhe.zh...@rivai.ai

From: Robin Dapp
Date: 2023-06-21 14:27
To: yanzhang.wang; 
gcc-patches
CC: rdapp.gcc; 
juzhe.zhong; 
kito.cheng; pan2.li
Subject: Re: [PATCH] RISC-V: convert the mulh with 0 to mov 0 to the reg.
Hi Yanzhang,

while I appreciate the optimization, I'm a bit wary about just adding a special
case for "0".  Is that so common? Wouldn't we also like to have
  * pow2_p (val) == << val and others?

* 1 should also be covered.

Regards
Robin



RE: [PATCH] RISC-V: convert the mulh with 0 to mov 0 to the reg.

2023-06-21 Thread Wang, Yanzhang via Gcc-patches
Thanks, you are right. I have not considered the iterator much. I picked it
from one of pred_mulh directly. It should be able to work with VFULL_I.

Yanzhang

From: juzhe.zh...@rivai.ai 
Sent: Wednesday, June 21, 2023 2:21 PM
To: Wang, Yanzhang ; gcc-patches 

Cc: Kito.cheng ; Li, Pan2 ; Wang, 
Yanzhang ; Robin Dapp ; 
jeffreyalaw 
Subject: Re: [PATCH] RISC-V: convert the mulh with 0 to mov 0 to the reg.

Good catch!
vmulh.vx v24,v24,zero -> vmv.v.i v1,0
can eliminate use of v24 and reduce register pressure.

But I wonder why you pick only VI_QHS?


+  [(set (match_operand:VI_QHS 0 "register_operand")

SEW = 64 should always have such optimization.

Thanks.

juzhe.zh...@rivai.ai

From: yanzhang.wang
Date: 2023-06-21 14:08
To: gcc-patches
CC: juzhe.zhong; 
kito.cheng; pan2.li; 
yanzhang.wang
Subject: [PATCH] RISC-V: convert the mulh with 0 to mov 0 to the reg.
From: Yanzhang Wang mailto:yanzhang.w...@intel.com>>

This patch will optimize the below mulh example,

vint32m1_t shortcut_for_riscv_vmulh_case_0(vint32m1_t v1, size_t vl) {
  return __riscv_vmulh_vx_i32m1(v1, 0, vl);
}

from mulh pattern

vsetvli   zero, a2, e32, m1, ta, ma
vmulh.vx  v24, v24, zero
vs1r.vv24, 0(a0)

to below vmv.

vsetvli zero,a2,e32,m1,ta,ma
vmv.v.i v1,0
vs1r.v  v1,0(a0)

It will elimate the mul with const 0 instruction to the simple mov
instruction.

Signed-off-by: Yanzhang Wang 
mailto:yanzhang.w...@intel.com>>

gcc/ChangeLog:

* config/riscv/autovec-opt.md: Add a split pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/binop_vx_constraint-121.c: The mul
  with 0 will be simplified to vmv.v.i.
* gcc.target/riscv/rvv/autovec/vmulh-with-zero.cc: New test.
---
gcc/config/riscv/autovec-opt.md   | 30 +++
.../riscv/rvv/autovec/vmulh-with-zero.cc  | 19 
.../riscv/rvv/base/binop_vx_constraint-121.c  |  3 +-
3 files changed, 51 insertions(+), 1 deletion(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vmulh-with-zero.cc

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index 28040805b23..9c14be964b5 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -405,3 +405,33 @@
   "vmv.x.s\t%0,%1"
   [(set_attr "type" "vimovvx")
(set_attr "mode" "")])
+
+;; Simplify VMULH (V, 0) Instructions to vmv.v.i.
+(define_split
+  [(set (match_operand:VI_QHS 0 "register_operand")
+ (if_then_else:VI_QHS
+   (unspec:
+ [(match_operand: 1 "vector_all_trues_mask_operand")
+   (match_operand 5 "vector_length_operand")
+   (match_operand 6 "const_int_operand")
+   (match_operand 7 "const_int_operand")
+   (match_operand 8 "const_int_operand")
+   (reg:SI VL_REGNUM)
+   (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+   (unspec:VI_QHS
+ [(vec_duplicate:VI_QHS
+(match_operand: 4 "reg_or_0_operand"))
+   (match_operand:VI_QHS 3 "register_operand")] VMULH)
+   (match_operand:VI_QHS 2 "vector_merge_operand")))]
+  "TARGET_VECTOR
+ && rtx_equal_p (operands[4], CONST0_RTX (GET_MODE (operands[4])))"
+  [(const_int 0)]
+  {
+machine_mode mask_mode = riscv_vector::get_mask_mode (mode)
+  .require ();
+emit_insn (gen_pred_mov (mode, operands[0], CONST1_RTX (mask_mode),
+   RVV_VUNDEF (mode), CONST0_RTX (GET_MODE (operands[0])),
+   operands[5], operands[6], operands[7], operands[8]));
+DONE;
+  }
+)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vmulh-with-zero.cc 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vmulh-with-zero.cc
new file mode 100644
index 000..6e4a3d62bc0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vmulh-with-zero.cc
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+#define VMULH_WITH_LMUL(X) \
+  vint32m##X##_t shortcut_for_riscv_vmulh_case_##X (vint32m##X##_t v1,\
+  size_t vl) {  \
+return __riscv_vmulh_vx_i32m ##X (v1, 0, vl); \
+  }
+
+
+VMULH_WITH_LMUL (1)
+VMULH_WITH_LMUL (2)
+VMULH_WITH_LMUL (4)
+VMULH_WITH_LMUL (8)
+VMULH_WITH_LMUL (f2)
+
+/* { dg-final { scan-assembler-times {vmv\.v\.i\sv[0-9]+,0} 5} */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-121.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-121.c
index 4d2de91bc14..d1473274137 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-121.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-121.c
@@ -50,6 +50,7 @@ void f6 (void * in, void *out, int32_t x)
 __riscv_vse64_v_i64m1 (out, v3, 4);
}
-/* { dg-final { scan-assembler-times {vmulh\.vx\s+v[0-9]+,\s*v[0-9]+,zero} 2 } 
} */
+/* { dg-final { scan-assembler-times 

RE: [PATCH v5] RISC-V: Add vector psabi checking.

2023-06-12 Thread Wang, Yanzhang via Gcc-patches
I think it's ok to add it to specific cases and will not affect other cases.
There's not so many cases. And we can revert it after the finalization.

> -Original Message-
> From: Kito Cheng 
> Sent: Monday, June 12, 2023 10:53 PM
> To: Jeff Law 
> Cc: Wang, Yanzhang ; gcc-patches@gcc.gnu.org;
> juzhe.zh...@rivai.ai; Li, Pan2 
> Subject: Re: [PATCH v5] RISC-V: Add vector psabi checking.
> 
> Hmmm, yeah, I think let's add it case by case...I assume we should get it
> rid before GCC 14, it is mostly used for the transition period before we
> settle down the ABI and for GCC 13.
> 
> On Mon, Jun 12, 2023 at 10:34 PM Jeff Law  wrote:
> >
> >
> >
> > On 6/12/23 07:36, Wang, Yanzhang via Gcc-patches wrote:
> > > I found that add the -Wno-psabi to CFLAGS will be overrode by
> > > dg-options. It seems we can only add this option to the third arg of
> > > dg-runtest. Attach the dg-runtest comments,
> > I think we default to -Wno-psabi to avoid triggering diagnostics in
> > the common case where we aren't concerned about such issues.  So not a
> > surprise that we'll need to work a bit harder to get it added when we
> > do want to check for psabi issues.
> >
> > jeff


RE: [PATCH v5] RISC-V: Add vector psabi checking.

2023-06-12 Thread Wang, Yanzhang via Gcc-patches
It's the same behavior. Because the DEFAULT_CFLAGS will be copied to
CFLAGS and then passed as the DEFAULT_EXTRA_OPTIONS to dg-runtest.

> -Original Message-
> From: Kito Cheng 
> Sent: Monday, June 12, 2023 10:08 PM
> To: Wang, Yanzhang 
> Cc: Kito Cheng ; gcc-patches@gcc.gnu.org;
> juzhe.zh...@rivai.ai; Li, Pan2 
> Subject: Re: [PATCH v5] RISC-V: Add vector psabi checking.
> 
> How about appending to DEFAULT_CFLAGS?
> 
> On Mon, Jun 12, 2023 at 9:38 PM Wang, Yanzhang via Gcc-patches  patc...@gcc.gnu.org> wrote:
> >
> > I found that add the -Wno-psabi to CFLAGS will be overrode by
> > dg-options. It seems we can only add this option to the third arg of
> > dg-runtest. Attach the dg-runtest comments,
> >
> > # dg-runtest -- simple main loop useful to most testsuites # # OPTIONS
> > is a set of options to always pass.
> > # DEFAULT_EXTRA_OPTIONS is a set of options to pass if the testcase #
> > doesn't specify any (with dg-option).
> >
> > > -Original Message-
> > > From: Kito Cheng 
> > > Sent: Monday, June 12, 2023 8:43 PM
> > > To: Wang, Yanzhang 
> > > Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Li, Pan2
> > > 
> > > Subject: Re: [PATCH v5] RISC-V: Add vector psabi checking.
> > >
> > > Hi Yan-Zhang:
> > >
> > > OK with one minor, go ahead IF the regression is clean.
> > >
> > > Hi Pan:
> > >
> > > Could you help to verify this patch and commit if the regression is
> clean?
> > >
> > > thanks :)
> > >
> > > > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
> > > b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
> > > > index 5e69235a268..ad79d0e9a8d 100644
> > > > --- a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
> > > > +++ b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
> > > > @@ -43,7 +43,7 @@ dg-init
> > > >  # Main loop.
> > > >  set CFLAGS "$DEFAULT_CFLAGS -march=$gcc_march -mabi=$gcc_mabi -O3"
> > >
> > > Add -Wno-psabi here rather than below, and also add it for
> > > g++.target/riscv/rvv/rvv.exp
> > >
> > > >  dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/base/*.\[cS\]]]
> \
> > > > -   "" $CFLAGS
> > > > +   "-Wno-psabi" $CFLAGS
> > > >  gcc-dg-runtest [lsort [glob -nocomplain
> > > > $srcdir/$subdir/vsetvl/*.\[cS\]]]
> > > \
> > > > "" $CFLAGS
> > > >  dg-runtest [lsort [glob -nocomplain
> > > > $srcdir/$subdir/autovec/*.\[cS\]]] \


RE: [PATCH v5] RISC-V: Add vector psabi checking.

2023-06-12 Thread Wang, Yanzhang via Gcc-patches
I found that add the -Wno-psabi to CFLAGS will be overrode by
dg-options. It seems we can only add this option to the third
arg of dg-runtest. Attach the dg-runtest comments,

# dg-runtest -- simple main loop useful to most testsuites
#
# OPTIONS is a set of options to always pass.
# DEFAULT_EXTRA_OPTIONS is a set of options to pass if the testcase
# doesn't specify any (with dg-option).

> -Original Message-
> From: Kito Cheng 
> Sent: Monday, June 12, 2023 8:43 PM
> To: Wang, Yanzhang 
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Li, Pan2
> 
> Subject: Re: [PATCH v5] RISC-V: Add vector psabi checking.
> 
> Hi Yan-Zhang:
> 
> OK with one minor, go ahead IF the regression is clean.
> 
> Hi Pan:
> 
> Could you help to verify this patch and commit if the regression is clean?
> 
> thanks :)
> 
> > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
> b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
> > index 5e69235a268..ad79d0e9a8d 100644
> > --- a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
> > +++ b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
> > @@ -43,7 +43,7 @@ dg-init
> >  # Main loop.
> >  set CFLAGS "$DEFAULT_CFLAGS -march=$gcc_march -mabi=$gcc_mabi -O3"
> 
> Add -Wno-psabi here rather than below, and also add it for
> g++.target/riscv/rvv/rvv.exp
> 
> >  dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/base/*.\[cS\]]] \
> > -   "" $CFLAGS
> > +   "-Wno-psabi" $CFLAGS
> >  gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/vsetvl/*.\[cS\]]]
> \
> > "" $CFLAGS
> >  dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/autovec/*.\[cS\]]] \


RE: [PATCH v5] RISC-V: Add vector psabi checking.

2023-06-12 Thread Wang, Yanzhang via Gcc-patches
I found there're still some test cases that does not pass. I'll push
another version soon. Sorry for the inconvenience.

> -Original Message-
> From: Wang, Yanzhang 
> Sent: Monday, June 12, 2023 4:08 PM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Li, Pan2
> ; Wang, Yanzhang 
> Subject: [PATCH v5] RISC-V: Add vector psabi checking.
> 
> From: Yanzhang Wang 
> 
> This patch adds support to check function's argument or return is vector
> type and throw warning if yes.
> 
> There're two exceptions,
>   - The vector_size attribute.
>   - The intrinsic functions.
> 
> gcc/ChangeLog:
> 
>   * config/riscv/riscv-protos.h (riscv_init_cumulative_args): Set
> warning flag if func is not builtin
>   * config/riscv/riscv.cc
>   (riscv_scalable_vector_type_p): Determine whether the type is scalable
> vector.
>   (riscv_arg_has_vector): Determine whether the arg is vector type.
>   (riscv_pass_in_vector_p): Check the vector type param is passed by
> value.
>   (riscv_init_cumulative_args): The same as header.
>   (riscv_get_arg_info): Add the checking.
>   (riscv_function_value): Check the func return and set warning flag
>   * config/riscv/riscv.h (INIT_CUMULATIVE_ARGS): Add a flag to
> determine whether warning psabi or not.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/riscv/rvv/rvv.exp: Add -Wno-psabi
>   * gcc.target/riscv/vector-abi-1.c: New test.
>   * gcc.target/riscv/vector-abi-2.c: New test.
>   * gcc.target/riscv/vector-abi-3.c: New test.
>   * gcc.target/riscv/vector-abi-4.c: New test.
>   * gcc.target/riscv/vector-abi-5.c: New test.
>   * gcc.target/riscv/vector-abi-6.c: New test.
> 
> Signed-off-by: Yanzhang Wang 
> Co-authored-by: Kito Cheng 
> ---
>  gcc/config/riscv/riscv-protos.h   |   2 +
>  gcc/config/riscv/riscv.cc | 112 +-
>  gcc/config/riscv/riscv.h  |   5 +-
>  gcc/testsuite/gcc.target/riscv/rvv/rvv.exp|   2 +-
>  gcc/testsuite/gcc.target/riscv/vector-abi-1.c |  14 +++
> gcc/testsuite/gcc.target/riscv/vector-abi-2.c |  15 +++
> gcc/testsuite/gcc.target/riscv/vector-abi-3.c |  14 +++
> gcc/testsuite/gcc.target/riscv/vector-abi-4.c |  16 +++
> gcc/testsuite/gcc.target/riscv/vector-abi-5.c |  15 +++
> gcc/testsuite/gcc.target/riscv/vector-abi-6.c |  20 
>  10 files changed, 212 insertions(+), 3 deletions(-)  create mode 100644
> gcc/testsuite/gcc.target/riscv/vector-abi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/vector-abi-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/vector-abi-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/vector-abi-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/vector-abi-5.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/vector-abi-6.c
> 
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-
> protos.h index 66c1f535d60..90fde5f8be3 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -302,4 +302,6 @@ th_mempair_output_move (rtx[4], bool, machine_mode,
> RTX_CODE);  #endif
> 
>  extern bool riscv_use_divmod_expander (void);
> +void riscv_init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree,
> +int);
> +
>  #endif /* ! GCC_RISCV_PROTOS_H */
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index
> de30bf4e567..dd5361c2bd2 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -3795,6 +3795,99 @@ riscv_pass_fpr_pair (machine_mode mode, unsigned
> regno1,
>  GEN_INT (offset2;
>  }
> 
> +/* Use the TYPE_SIZE to distinguish the type with vector_size attribute
> and
> +   intrinsic vector type.  Because we can't get the decl for the
> +params.  */
> +
> +static bool
> +riscv_scalable_vector_type_p (const_tree type) {
> +  tree size = TYPE_SIZE (type);
> +  if (size && TREE_CODE (size) == INTEGER_CST)
> +return false;
> +
> +  /* For the data type like vint32m1_t, the size code is POLY_INT_CST.
> +*/
> +  return true;
> +}
> +
> +static bool
> +riscv_arg_has_vector (const_tree type)
> +{
> +  bool is_vector = false;
> +
> +  switch (TREE_CODE (type))
> +{
> +case RECORD_TYPE:
> +  if (!COMPLETE_TYPE_P (type))
> + break;
> +
> +  for (tree f = TYPE_FIELDS (type); f; f = DECL_CHAIN (f))
> + if (TREE_CODE (f) == FIELD_DECL)
> +   {
> + tree field_type = TREE_TYPE (f);
> + if (!TYPE_P (field_type))
> +   break;
> +
> + /* Ignore it if it's fixed length vector.  */
> + if (VECTOR_TYPE_P (field_type))
> +   is_vector = riscv_scalable_vector_type_p (field_type);
> + else
> +   is_vector = riscv_arg_has_vector (field_type);
> +   }
> +
> +  break;
> +
> +case VECTOR_TYPE:
> +  is_vector = riscv_scalable_vector_type_p (type);
> +  break;
> +
> +default:
> +  is_vector = false;
> + 

RE: [PATCH v4] RISC-V: Add vector psabi checking.

2023-06-11 Thread Wang, Yanzhang via Gcc-patches
I reproduce the failure too. Because it returns early in get_arg_info for
v-ext mode. I'll move the checking to the beginning.

> -Original Message-
> From: Kito Cheng 
> Sent: Friday, June 9, 2023 5:52 PM
> To: Wang, Yanzhang 
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@sifive.com;
> Li, Pan2 
> Subject: Re: [PATCH v4] RISC-V: Add vector psabi checking.
> 
> Hmmm, I still saw some fail on testsuite after applying this patch, most
> are because the testcase has used vector type as argument or return value,
> but .. vector-abi-1.c should not fail I think?
> 
> For other fails, I would suggest you could just add -Wno-psabi to rvv.exp
> 
> === gcc: Unexpected fails for rv64imafdcv lp64d medlow ===
> FAIL: gcc.target/riscv/vector-abi-1.c   -O0   (test for warnings, line 7)
> FAIL: gcc.target/riscv/vector-abi-1.c   -O1   (test for warnings, line 7)
> FAIL: gcc.target/riscv/vector-abi-1.c   -O2   (test for warnings, line 7)
> FAIL: gcc.target/riscv/vector-abi-1.c   -O2 -flto
> -fno-use-linker-plugin -flto-partition=none   (test for warnings, line
> 7)
> FAIL: gcc.target/riscv/vector-abi-1.c   -O2 -flto -fuse-linker-plugin
> -fno-fat-lto-objects   (test for warnings, line 7)
> FAIL: gcc.target/riscv/vector-abi-1.c   -O3 -g   (test for warnings, line 7)
> FAIL: gcc.target/riscv/vector-abi-1.c   -Os   (test for warnings, line 7)
> FAIL: gcc.target/riscv/vector-abi-1.c  -Og -g   (test for warnings, line 7)
> FAIL: gcc.target/riscv/vector-abi-1.c  -Oz   (test for warnings, line 7)
> FAIL: gcc.target/riscv/rvv/base/binop_vx_constraint-120.c (test for excess
> errors)
> FAIL: gcc.target/riscv/rvv/base/integer_compare_insn_shortcut.c (test for
> excess errors)
> FAIL: gcc.target/riscv/rvv/base/mask_insn_shortcut.c (test for excess
> errors)
> FAIL: gcc.target/riscv/rvv/base/misc_vreinterpret_vbool_vint.c (test for
> excess errors)
> FAIL: gcc.target/riscv/rvv/base/pr110109-2.c (test for excess errors)
> FAIL: gcc.target/riscv/rvv/base/scalar_move-9.c (test for excess errors)
> FAIL: gcc.target/riscv/rvv/base/vlmul_ext-1.c (test for excess errors)
> FAIL: gcc.target/riscv/rvv/base/zero_base_load_store_optimization.c
> (test for excess errors)
> FAIL: gcc.target/riscv/rvv/base/zvfh-intrinsic.c (test for excess errors)
> FAIL: gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c (test for excess errors)
> FAIL: gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c (test for excess errors)
> 
>   = Summary of gcc testsuite =
>| # of unexpected case / # of unique unexpected
> case
>|  gcc |  g++ | gfortran |
> rv32imafdc/ ilp32d/ medlow |   20 /12 |0 / 0 |0 / 0 |
> rv32imafdcv/ ilp32d/ medlow |   25 /14 |   22 /22 |0 / 0 |
> rv64imafdc/  lp64d/ medlow |   20 /12 |0 / 0 |0 / 0 |
> rv64imafdcv/  lp64d/ medlow |   20 /12 |   21 /21 |0 / 0 |
> 
> On Fri, Jun 9, 2023 at 2:02 PM yanzhang.wang--- via Gcc-patches  patc...@gcc.gnu.org> wrote:
> >
> > From: Yanzhang Wang 
> >
> > This patch adds support to check function's argument or return is
> > vector type and throw warning if yes.
> >
> > There're two exceptions,
> >   - The vector_size attribute.
> >   - The intrinsic functions.
> >
> > gcc/ChangeLog:
> >
> > * config/riscv/riscv-protos.h (riscv_init_cumulative_args): Set
> >   warning flag if func is not builtin
> > * config/riscv/riscv.cc
> > (riscv_scalable_vector_type_p): Determine whether the type is
> scalable vector.
> > (riscv_arg_has_vector): Determine whether the arg is vector type.
> > (riscv_pass_in_vector_p): Check the vector type param is passed
> by value.
> > (riscv_init_cumulative_args): The same as header.
> > (riscv_get_arg_info): Add the checking.
> > (riscv_function_value): Check the func return and set warning
> flag
> > * config/riscv/riscv.h (INIT_CUMULATIVE_ARGS): Add a flag to
> >   determine whether warning psabi or not.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/riscv/vector-abi-1.c: New test.
> > * gcc.target/riscv/vector-abi-2.c: New test.
> > * gcc.target/riscv/vector-abi-3.c: New test.
> > * gcc.target/riscv/vector-abi-4.c: New test.
> > * gcc.target/riscv/vector-abi-5.c: New test.
> > * gcc.target/riscv/vector-abi-6.c: New test.
> >
> > Signed-off-by: Yanzhang Wang 
> > Co-authored-by: Kito Cheng 
> > ---
> >  gcc/config/riscv/riscv-protos.h   |   2 +
> >  gcc/config/riscv/riscv.cc | 112 +-
> >  gcc/config/riscv/riscv.h  |   5 +-
> >  gcc/testsuite/gcc.target/riscv/vector-abi-1.c |  14 +++
> > gcc/testsuite/gcc.target/riscv/vector-abi-2.c |  15 +++
> > gcc/testsuite/gcc.target/riscv/vector-abi-3.c |  14 +++
> > gcc/testsuite/gcc.target/riscv/vector-abi-4.c |  16 +++
> > 

RE: [PATCH] RISCV: Add -m(no)-omit-leaf-frame-pointer support.

2023-06-06 Thread Wang, Yanzhang via Gcc-patches
Hi Jeff,

Thanks your comments. I have few questions that I don't quite understand.

> One of the things that needs to be upstreamed is long jump support within
> a function.  Essentially once a function reaches 1M in size we have the
> real possibility that a direct jump may not reach its target.
> 
> To support this I expect that $ra is going to become a fixed register (ie,
> not available to the register allocator as a temporary).  It'll be used
> as a scratch register for long jump sequences.
> 
> One of the consequences of this is $ra will need to be saved in leaf
> functions that are near or over 1M in size.
> 
> Note that at the time when we have to lay out the stack, we do not know
> the precise length of the function.  So there's a degree of "fuzz" in the
> decision whether or not to save $ra in a function that is close to the 1M
> limit.

Do you mean that, long jump to more than 1M offset will need multiple jal
and each jal will save the $ra ?

If yes, I'm confused about what's the influence of the $ra saving for
function prologue. We will save the fp+ra at the prologue, the next $ra 
saving seems will not modify the $ra already saved.

> I don't think you can reliably know if $ra is valid in an arbitrary leaf
> function or not.  You could implement some heuristics by looking at the
> symbol table (which I'm guessing you don't want to do) or by
> disassembling the prologue (again, I'm guessing you don't want to do that
> either).

I think it's yes (not valid) when we want to get the return address to parent
function from $ra directly in the function body. But we can get the right
return address from fp with offset if we save them at prologue, is it right ?

> Meaning that what you really want is to be using -fno-omit-frame-pointer
> and for $ra to always be saved in the stack, even in a leaf function.

This is also another solution but will change the default behavior of
-fno-omit-frame-pointer.

> Presumably you're not suggesting any of these options be used in general
> -- they're going to be used for things like embedded devices or firmware?
> Also note there are low overhead unwinding schemes out there that are
> already supported in various tools -- ORC & SFRAME come
> immediately to mind.   Those may be better than building a bespoke
> solution for the embedded space.

Yes. You're right, I forget to introduce background of the requirement. It
will be used in the firmware where the dwarf or unwinding maybe not acceptable.

Yanzhang

> -Original Message-
> From: Jeff Law 
> Sent: Wednesday, June 7, 2023 10:13 AM
> To: Wang, Yanzhang ; gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Li, Pan2
> 
> Subject: Re: [PATCH] RISCV: Add -m(no)-omit-leaf-frame-pointer support.
> 
> 
> 
> On 6/4/23 20:49, Wang, Yanzhang wrote:
> > Hi Jeff,
> >
> > Yes, there's a requirement to support backtrace based on the fp+ra.
> > And the unwind/cfa is not acceptable because it will add additional
> > sections to the binary. Currently, -fno-omit-frame-pointer can not
> > save the ra for the leaf function. So we need to add another option
> > like ARM/X86 to support consistent fp+ra stack layout for the leaf and
> > non-leaf functions.
> One of the things that needs to be upstreamed is long jump support within
> a function.  Essentially once a function reaches 1M in size we have the
> real possibility that a direct jump may not reach its target.
> 
> To support this I expect that $ra is going to become a fixed register (ie,
> not available to the register allocator as a temporary).  It'll be used
> as a scratch register for long jump sequences.
> 
> One of the consequences of this is $ra will need to be saved in leaf
> functions that are near or over 1M in size.
> 
> Note that at the time when we have to lay out the stack, we do not know
> the precise length of the function.  So there's a degree of "fuzz" in the
> decision whether or not to save $ra in a function that is close to the 1M
> limit.
> 
> I don't think you can reliably know if $ra is valid in an arbitrary leaf
> function or not.  You could implement some heuristics by looking at the
> symbol table (which I'm guessing you don't want to do) or by
> disassembling the prologue (again, I'm guessing you don't want to do that
> either).
> 
> Meaning that what you really want is to be using -fno-omit-frame-pointer
> and for $ra to always be saved in the stack, even in a leaf function.
> 
> Presumably you're not suggesting any of these options be used in general
> -- they're going to be used for things like embedded devices or firmware?
> Also note there are low overhead unwinding schemes out there that are
> already supported in various tools -- ORC & SFRAME come
> immediately to mind.   Those may be better than building a bespoke
> solution for the embedded space.
> 
> 
> 
> Jeff


RE: [PATCH] RISCV: Add -m(no)-omit-leaf-frame-pointer support.

2023-06-04 Thread Wang, Yanzhang via Gcc-patches
> +static bool
> +riscv_frame_pointer_required (void)
> +{
> +  if (riscv_save_frame_pointer && !crtl->is_leaf)
> +return true;
> +
> +  return false;
> +}
> 
> Can be simplified to return riscv_save_frame_pointer && !crtl->is_leaf;

Nice. It's much simpler. Will modify in another patch.

> +  riscv_save_frame_pointer = false;
> +  if (TARGET_OMIT_LEAF_FRAME_POINTER_P (global_options.x_target_flags))
> +{
> +  if (!global_options.x_flag_omit_frame_pointer)
> + riscv_save_frame_pointer = true;
> +
> +  global_options.x_flag_omit_frame_pointer = 1;
> +}
> 
> Does this mean if omit_leaf_frame will also set the omit_frame_pointer
> implicitly?
>

For the flag it's yes but for the behavior it's no. The behavior still is
based on the flag of omit-frame-pointer's value.

- ON, than the frame pointer of non-leaf functions will be omitted.
- OFF(no), than the frame pointer of non-leaf functions will not be omitted.

In the other words, if we want to omit the leaf frame pointers,

- if we want to omit the non-leaf fp too, we need only save the ra for the 
non-leaf.
- if we don't, we need to save the fp+ra for the non-leaf but no fp+ra for the 
leaf.

We need to override the option (x_flag_omit_frame_pointer) because it's the
first priority when determine whether the frame pointer is needed. If it's
turned off, the frame pointer will be saved for leaf functions too even
though we turn on the omit-leaf-frame-pointer.

To distinguish the two scenarios above, we need to add another variable to
save the flag user set originally otherwise it will be threw away.

Yanzhang

> -Original Message-
> From: Li, Pan2 
> Sent: Monday, June 5, 2023 9:04 AM
> To: Wang, Yanzhang ; gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com
> Subject: RE: [PATCH] RISCV: Add -m(no)-omit-leaf-frame-pointer support.
> 
> Some nit comments.
> 
> +static bool
> +riscv_frame_pointer_required (void)
> +{
> +  if (riscv_save_frame_pointer && !crtl->is_leaf)
> +return true;
> +
> +  return false;
> +}
> 
> Can be simplified to return riscv_save_frame_pointer && !crtl->is_leaf;
> 
> +  riscv_save_frame_pointer = false;
> +  if (TARGET_OMIT_LEAF_FRAME_POINTER_P (global_options.x_target_flags))
> +{
> +  if (!global_options.x_flag_omit_frame_pointer)
> + riscv_save_frame_pointer = true;
> +
> +  global_options.x_flag_omit_frame_pointer = 1;
> +}
> 
> Does this mean if omit_leaf_frame will also set the omit_frame_pointer
> implicitly?
> 
> Pan
> 
> 
> -Original Message-
> From: Wang, Yanzhang 
> Sent: Friday, June 2, 2023 3:07 PM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Li, Pan2
> ; Wang, Yanzhang 
> Subject: [PATCH] RISCV: Add -m(no)-omit-leaf-frame-pointer support.
> 
> From: Yanzhang Wang 
> 
> gcc/ChangeLog:
> 
>   * config/riscv/riscv.cc (riscv_save_reg_p): Save ra for leaf
> when enabling -mno-omit-leaf-frame-pointer
>   (riscv_option_override): Override omit-frame-pointer.
>   (riscv_frame_pointer_required): Save s0 for non-leaf function
>   (TARGET_FRAME_POINTER_REQUIRED): Override defination
>   * config/riscv/riscv.opt: Add option support.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/riscv/omit-frame-pointer-1.c: New test.
>   * gcc.target/riscv/omit-frame-pointer-2.c: New test.
>   * gcc.target/riscv/omit-frame-pointer-3.c: New test.
>   * gcc.target/riscv/omit-frame-pointer-4.c: New test.
>   * gcc.target/riscv/omit-frame-pointer-test.c: New test.
> 
> Signed-off-by: Yanzhang Wang 
> ---
>  gcc/config/riscv/riscv.cc | 31 ++-
>  gcc/config/riscv/riscv.opt|  4 +++
>  .../gcc.target/riscv/omit-frame-pointer-1.c   |  7 +
>  .../gcc.target/riscv/omit-frame-pointer-2.c   |  7 +
>  .../gcc.target/riscv/omit-frame-pointer-3.c   |  7 +
>  .../gcc.target/riscv/omit-frame-pointer-4.c   |  7 +
>  .../riscv/omit-frame-pointer-test.c   | 13 
>  7 files changed, 75 insertions(+), 1 deletion(-)  create mode 100644
> gcc/testsuite/gcc.target/riscv/omit-frame-pointer-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/omit-frame-pointer-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/omit-frame-pointer-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/omit-frame-pointer-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/omit-frame-pointer-
> test.c
> 
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index
> 5d2550871c7..e02f9cb50a4 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -408,6 +408,10 @@ static const struct riscv_tune_info
> riscv_tune_info_table[] = {  #include "riscv-cores.def"
>  };
> 
> +/* Global variable to distinguish whether we should save and restore
> s0/fp for
> +   function.  */
> +static bool riscv_save_frame_pointer;
> +
>  void riscv_frame_info::reset(void)
>  {
>total_size = 0;
> @@ -4744,7 

Re: [PATCH] RISCV: Add -m(no)-omit-leaf-frame-pointer support.

2023-06-04 Thread Wang, Yanzhang via Gcc-patches
Hi Jeff,

Yes, there's a requirement to support backtrace based on the fp+ra.
And the unwind/cfa is not acceptable because it will add additional
sections to the binary. Currently, -fno-omit-frame-pointer can not
save the ra for the leaf function. So we need to add another option
like ARM/X86 to support consistent fp+ra stack layout for the leaf
and non-leaf functions.

Thanks,
Yanzhang

From: Jeff Law 
Sent: Saturday, June 3, 2023 10:43 AM
To: Wang, Yanzhang ; gcc-patches@gcc.gnu.org 

Cc: juzhe.zh...@rivai.ai ; kito.ch...@sifive.com 
; Li, Pan2 
Subject: Re: [PATCH] RISCV: Add -m(no)-omit-leaf-frame-pointer support.



On 6/2/23 01:07, yanzhang.wang--- via Gcc-patches wrote:
> From: Yanzhang Wang 
>
> gcc/ChangeLog:
>
>* config/riscv/riscv.cc (riscv_save_reg_p): Save ra for leaf
>  when enabling -mno-omit-leaf-frame-pointer
>(riscv_option_override): Override omit-frame-pointer.
>(riscv_frame_pointer_required): Save s0 for non-leaf function
>(TARGET_FRAME_POINTER_REQUIRED): Override defination
>* config/riscv/riscv.opt: Add option support.
>
> gcc/testsuite/ChangeLog:
>
>* gcc.target/riscv/omit-frame-pointer-1.c: New test.
>* gcc.target/riscv/omit-frame-pointer-2.c: New test.
>* gcc.target/riscv/omit-frame-pointer-3.c: New test.
>* gcc.target/riscv/omit-frame-pointer-4.c: New test.
>* gcc.target/riscv/omit-frame-pointer-test.c: New test.
Not ACKing or NAKing at this time.

Why do you want this feature?

jeff


RE: [PATCH v5] RISC-V: Fix regression of -fzero-call-used-regs=all

2023-04-11 Thread Wang, Yanzhang via Gcc-patches
Hi Kito, Juzhe, Jeff,

Thanks for your kindly reviews. I have modified based on the comments and ran 
the testsuite on my local. Could you please take another look ? If any more 
comments please let me know.

Thanks
Yanzhang

> -Original Message-
> From: Wang, Yanzhang 
> Sent: Tuesday, April 11, 2023 7:38 PM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Li, Pan2
> ; Wang, Yanzhang 
> Subject: [PATCH v5] RISC-V: Fix regression of -fzero-call-used-regs=all
> 
> From: Yanzhang Wang 
> 
> This patch registers a riscv specific function to
> TARGET_ZERO_CALL_USED_REGS instead of default in targhooks.cc. It will
> clean gpr and vector relevant registers.
> 
>   PR 109104
> 
> gcc/ChangeLog:
> 
>   * config/riscv/riscv-protos.h (emit_hard_vlmax_vsetvl):
>   * config/riscv/riscv-v.cc (emit_hard_vlmax_vsetvl):
>   (emit_vlmax_vsetvl):
>   * config/riscv/riscv.cc (vector_zero_call_used_regs):
>   (riscv_zero_call_used_regs):
>   (TARGET_ZERO_CALL_USED_REGS):
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/riscv/zero-scratch-regs-1.c: New test.
>   * gcc.target/riscv/zero-scratch-regs-2.c: New test.
>   * gcc.target/riscv/zero-scratch-regs-3.c: New test.
> 
> Signed-off-by: Yanzhang Wang 
> Co-authored-by: Pan Li 
> Co-authored-by: Ju-Zhe Zhong 
> Co-authored-by: Kito Cheng 
> ---
>  gcc/config/riscv/riscv-protos.h   |  1 +
>  gcc/config/riscv/riscv-v.cc   | 15 +++-
>  gcc/config/riscv/riscv.cc | 75 +++
>  .../gcc.target/riscv/zero-scratch-regs-1.c|  9 +++
>  .../gcc.target/riscv/zero-scratch-regs-2.c| 24 ++
>  .../gcc.target/riscv/zero-scratch-regs-3.c| 57 ++
>  6 files changed, 178 insertions(+), 3 deletions(-)  create mode 100644
> gcc/testsuite/gcc.target/riscv/zero-scratch-regs-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zero-scratch-regs-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zero-scratch-regs-3.c
> 
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-
> protos.h index 4611447ddde..5244e8dcbf0 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -159,6 +159,7 @@ bool check_builtin_call (location_t, vec,
> unsigned int,  bool const_vec_all_same_in_range_p (rtx, HOST_WIDE_INT,
> HOST_WIDE_INT);  bool legitimize_move (rtx, rtx, machine_mode);  void
> emit_vlmax_vsetvl (machine_mode, rtx);
> +void emit_hard_vlmax_vsetvl (machine_mode, rtx);
>  void emit_vlmax_op (unsigned, rtx, rtx, machine_mode);  void
> emit_vlmax_op (unsigned, rtx, rtx, rtx, machine_mode);  void
> emit_nonvlmax_op (unsigned, rtx, rtx, rtx, machine_mode); diff --git
> a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc index
> 2e91d019f6c..392f5d02e17 100644
> --- a/gcc/config/riscv/riscv-v.cc
> +++ b/gcc/config/riscv/riscv-v.cc
> @@ -118,6 +118,17 @@ const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT
> minval,
> && IN_RANGE (INTVAL (elt), minval, maxval));  }
> 
> +/* Emit a vlmax vsetvl instruction.  This should only be used when
> +   optimization is disabled or after vsetvl insertion pass.  */ void
> +emit_hard_vlmax_vsetvl (machine_mode vmode, rtx vl) {
> +  unsigned int sew = get_sew (vmode);
> +  emit_insn (gen_vsetvl (Pmode, vl, RVV_VLMAX, gen_int_mode (sew, Pmode),
> +  gen_int_mode (get_vlmul (vmode), Pmode), const0_rtx,
> +  const0_rtx));
> +}
> +
>  void
>  emit_vlmax_vsetvl (machine_mode vmode, rtx vl)  { @@ -126,9 +137,7 @@
> emit_vlmax_vsetvl (machine_mode vmode, rtx vl)
>unsigned int ratio = calculate_ratio (sew, vlmul);
> 
>if (!optimize)
> -emit_insn (gen_vsetvl (Pmode, vl, RVV_VLMAX, gen_int_mode (sew,
> Pmode),
> -gen_int_mode (get_vlmul (vmode), Pmode), const0_rtx,
> -const0_rtx));
> +emit_hard_vlmax_vsetvl (vmode, vl);
>else
>  emit_insn (gen_vlmax_avl (Pmode, vl, gen_int_mode (ratio,
> Pmode)));  } diff --git a/gcc/config/riscv/riscv.cc
> b/gcc/config/riscv/riscv.cc index 5f542932d13..a9c9e1aa32b 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -7066,6 +7066,78 @@ riscv_shamt_matches_mask_p (int shamt,
> HOST_WIDE_INT mask)
>return shamt == ctz_hwi (mask);
>  }
> 
> +HARD_REG_SET
> +vector_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs) {
> +  HARD_REG_SET zeroed_hardregs;
> +  CLEAR_HARD_REG_SET (zeroed_hardregs);
> +
> +  /* Find a register to hold vl.  */
> +  unsigned vl_regno = INVALID_REGNUM;
> +  /* Skip the first GPR, otherwise the existing vl is kept due to the
> same
> + between vl and avl.  */
> +  for (unsigned regno = GP_REG_FIRST + 1; regno <= GP_REG_LAST; regno++)
> +{
> +  if (TEST_HARD_REG_BIT (need_zeroed_hardregs, regno))
> + {
> +   vl_regno = regno;
> +   break;
> + }
> +}
> +
> +  if (vl_regno > GP_REG_LAST)
> +sorry ("cannot allocate 

RE: [PATCH v3] RISC-V: Fix regression of -fzero-call-used-regs=all

2023-04-09 Thread Wang, Yanzhang via Gcc-patches
Thanks Jeff's comment.

> Presumably the difficulty here is we need to find a suitable hard
> register so that we can emit the vsetvl.
 
Yes. We use the GPR which has been flagged in the need_zeroed_regs to
hold the vl. There should be one GPR we can use, otherwise, will throw
an exception.
 
> Do you need to save/restore the vector configuration before and after
> clearing the vector registers?If so, that seems to be missing.  If
> not, it seems like a comment explaining why would be useful.

I'll add some comments in the code and want to explain here first.
We need not save/restore the vector configurations. Because, by design,
the RVV requires vsetvl when using vector instructions. When users want to
use the RVV insns next, they should have to issue vsetvl first.

Thanks,
Yanzhang