Re: [PATCH v1 2/2] RISC-V: Add testcases for form 2 of signed vector SAT_ADD

2024-09-24 Thread Robin Dapp
LGTM. -- Regards Robin

Re: [PATCH v1 2/2] RISC-V: Add testcases for form 3 of signed vector SAT_ADD

2024-09-24 Thread Robin Dapp
LGTM (in case you haven't committed it yet). -- Regards Robin

[PATCH] RISC-V: testsuite: Fix SELECT_VL SLP fallout.

2024-09-19 Thread Robin Dapp
Hi, this fixes asm-scan fallout from r15-3712-g5e3a4a01785e2d where we allow SLP with SELECT_VL. Assisted by sed and regtested on rv64gcv_zvfh_zvbb. Rather lengthy but obvious, so going to commit after a while if the CI is happy. I think those tests don't really need to check for vsetvl anyway,

Re: [PATCH] RISC-V: Align vconfig for TARGER_SFB_ALU

2024-09-19 Thread Robin Dapp
Hi Dusan, sorry for the late reply. > This patch addresses a missed opportunity to fuse vsetvl_infos. > Instead of checking whether demands for merging configurations of > vsetvl_info are all met, the demands are checked individually. > > The case in question occurs because of the conditional

Re: [PATCH][v2] tree-optimization/116573 - .SELECT_VL for SLP

2024-09-19 Thread Robin Dapp
> On Tue, 17 Sep 2024, Richard Biener wrote: > > > The following restores the use of .SELECT_VL for testcases where it > > is safe to use even when using SLP. I've for now restricted it > > to single-lane SLP plus optimistically allow store-lane nodes > > and assume single-lane roots are not widen

Re: [PATCH v5 4/4] RISC-V: Fix vector SAT_ADD dump check due to middle-end change

2024-09-19 Thread Robin Dapp
> This patch would like to fix the dump check times of vector SAT_ADD. The > middle-end change makes the match times from 2 to 4 times. > > The below test suites are passed for this patch. > * The rv64gcv fully regression test. That's OK. And I think testsuite fixup patches like this you can consid

Re: [PATCH] Try fixing RISC-V .SELECT_VL with SLP

2024-09-14 Thread Robin Dapp
> The following simply removes a seemingly bogus guard. > > * tree-vect-loop.cc (vect_analyze_loop_1): Remove SLP guard > from .SELECT_VL disabling. > --- > gcc/tree-vect-loop.cc | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-v

[PATCH] vect: Do not try to duplicate_and_interleave one-element mode.

2024-09-06 Thread Robin Dapp
Hi, PR112694 shows that we try to create sub-vectors of single-element vectors because can_duplicate_and_interleave_p returns true. The problem resurfaced in PR116611. This patch makes can_duplicate_and_interleave_p return false if count / nvectors > 0 and removes the corresponding check in the r

Re: [PATCH] RISC-V: Fixed incorrect semantic description in DF to DI pattern in the Zfa extension on rv32.

2024-09-06 Thread Robin Dapp
> In the process of DF to SI, we generally use "unsigned_fix" rather than > "truncate" for conversion. Although this has no effect in general, > unexpected ICE often occurs when precise semantic analysis is required, > such as analysis in function "simplify_const_unary_operation" in > simplify-rtx.

[PATCH] RISC-V: Add more vector-vector extract cases.

2024-09-06 Thread Robin Dapp
Hi, this adds a V16SI -> V4SI and related i.e. "quartering" vector-vector extract expander for VLS modes. It helps with unnecessary spills in x264. Regtested on rv64gcv_zvfh_zvbb. Regards Robin gcc/ChangeLog: * config/riscv/autovec.md (vec_extract): Add quarter vec-vec extrac

Re: [PATCH 6/8] gcn: Add else operand to masked loads.

2024-09-06 Thread Robin Dapp
> > So we only found two instances of this problem and both were related to > > _Bools. In case you have more cases, it would be greatly appreciated > > to verify the series with them. If you don't mind, would it be possible > > to comment out the zeroing, re-run the testsuite and check for FAILs

Re: [PATCH 6/8] gcn: Add else operand to masked loads.

2024-09-06 Thread Robin Dapp
> There were absolutely problems without this. It's a while ago now, so I'm > struggling with the details, but as GCC only applies the mask to selected > operations there were all sorts of issues that crept in. Zeroing the > undefined lanes seemed to match the middle end assumptions (or, at least i

Re: [PATCH 6/8] gcn: Add else operand to masked loads.

2024-09-05 Thread Robin Dapp
> > +(define_predicate "maskload_else_operand" > > + (and (match_code "const_int,const_vector") > > + (match_test "op == CONST0_RTX (GET_MODE (op))"))) > > This forces maskload and mask_gather_load to only accept zero here, but > in fact the hardware would allow us to accept any value (incl

[PATCH] RISC-V: Fix effective target check.

2024-08-30 Thread Robin Dapp
Hi, I messed up the return value in check_effective_target_rvv_zvl256b_ok and check_effective_target_rvv_zvl512b_ok. This fixes it and also just uses the current march for the check. Going to commit as obvious. Regards Robin gcc/testsuite/ChangeLog: * lib/target-supports.exp: Fix eff

Re: [PATCH] RISC-V: Fix subreg of VLS modes larger than a vector [PR116086].

2024-08-28 Thread Robin Dapp
> On Wed, Aug 28, 2024 at 3:21 PM Robin Dapp wrote: > > > > > Hmm - but how can you call this ambiguous? VLEN and LMUL is a runtime > > > property(?), so unknown to the compiler(?) - as you do below the only > > > way to code generate would be an agnostic way

Re: [PATCH] RISC-V: Fix subreg of VLS modes larger than a vector [PR116086].

2024-08-28 Thread Robin Dapp
> Hmm - but how can you call this ambiguous? VLEN and LMUL is a runtime > property(?), so unknown to the compiler(?) - as you do below the only > way to code generate would be an agnostic way such as with a slide-down. > But can't you always do this, for all subregs of this sort (even with offset)?

Re: [PATCH] RISC-V: Fix subreg of VLS modes larger than a vector [PR116086].

2024-08-27 Thread Robin Dapp
> +(define_mode_iterator V_HAS_HALF [ > + V2QI V4QI V8QI V16QI V32QI V64QI V128QI V256QI V512QI V1024QI V2048QI > V4096QI > + V2HI V4HI V8HI V16HI V32HI V64HI V128HI V256HI V512HI V1024HI V2048HI > + V2SI V4SI V8SI V16SI V32SI V64SI V128SI V256SI V512SI V1024SI > + V2DI V4DI V8DI V16DI V32DI V

Re: [PATCH] RISC-V: Add missing mode_idx for vrol and vror

2024-08-27 Thread Robin Dapp
You don't need an OK of course but LGTM. When I found another instance of this I was thinking about having exhaustive self tests for those attributes. Maybe a good learning exercise? -- Regards Robin

[PATCH] RISC-V: Fix subreg of VLS modes larger than a vector [PR116086].

2024-08-27 Thread Robin Dapp
Hi, this is a hopefully better way to solve the "subreg problem" by first, in the generic case, having the RA go via memory and second, providing a vector-vector extract that deals with it in an optimized way. When the source mode is potentially larger than one vector (e.g. an LMUL2 mode for VLEN=1

Re: [PATCH 3/9] RISC-V: Handle 0.0 floating point pattern costing to match const_vector expander

2024-08-22 Thread Robin Dapp
> + /* Constants in range -16 ~ 15 integer or 0.0 floating-point > +can be emitted using vmv.v.i. */ > + if (satisfies_constraint_vi (x) || satisfies_constraint_Wc0 (x)) > return 1; Just a nit but while you're at it, don't you want to split

Re: [PATCH 1/9] RISC-V: Use encoded nelts when calling repeating_sequence_p

2024-08-22 Thread Robin Dapp
Before looking at the rest (tomorrow) - this is OK. -- Regards Robin

[PATCH] RISC-V: Expand vec abs without masking.

2024-08-22 Thread Robin Dapp
Hi, standard abs synthesis during expand is max (a, -a). This expansion has the advantage of avoiding masking and is thus potentially faster than the a < 0 ? -a : a synthesis. Regtested on rv64gcv_zvfh_zvbb. Regards Robin gcc/ChangeLog: * config/riscv/autovec.md (abs<mode>2): Expand via ma
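
The two abs syntheses mentioned above can be compared in a small scalar sketch. This is only an illustration, not the code from the patch: the function names are made up here, and INT32_MIN is assumed not to occur because negating it overflows in C.

```c
#include <stddef.h>
#include <stdint.h>

/* Select-based synthesis: a < 0 ? -a : a.  In vector code the
   comparison typically turns into a mask for a conditional negate.  */
int32_t
abs_select (int32_t a)
{
  return a < 0 ? -a : a;
}

/* Max-based synthesis: max (a, -a).  No mask is involved.  */
int32_t
abs_max (int32_t a)
{
  int32_t neg = -a;
  return a > neg ? a : neg;
}

/* Element-wise loop of the kind an auto-vectorizer would see.
   Assumes no element equals INT32_MIN.  */
void
abs_array (int32_t *dst, const int32_t *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = abs_max (src[i]);
}
```

Both helpers return the same value for every input other than INT32_MIN; the difference the message points at is only in how the operation is expanded.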

Re: [PATCH] optabs-query: Guard smallest_int_mode_for_size [PR115495].

2024-08-22 Thread Robin Dapp
> Why's the include needed? .ccs ought to include coretypes.h directly > (and get machmode.h that way, since coretypes.h include machmode.h). Ugh, that was not intentional, sometimes my auto-complete inserts such includes for no reason. I really need to disable that, thanks for pointing that out

Re: [PATCH] optabs-query: Guard smallest_int_mode_for_size [PR115495].

2024-08-22 Thread Robin Dapp
> Indeed though that might be a larger change. I have tested the attached now, aarch64 is still running but x86 and power10 are bootstrapped and regtested, riscv regtested. Hope I didn't miss any target-specific code that I haven't tested. As the issue is only latent I verified by calling get_b

Re: [PATCH v1 2/2] RISC-V: Add testcases for unsigned vector .SAT_TRUNC form 3

2024-08-21 Thread Robin Dapp
LGTM. -- Regards Robin

Re: [PATCH v1 1/2] RISC-V: Add testcases for unsigned vector .SAT_TRUNC form 2

2024-08-21 Thread Robin Dapp
LGTM. -- Regards Robin

Re: [PATCH 3/8] tree-ifcvt: Enforce zero else value after maskload.

2024-08-21 Thread Robin Dapp
> And we fail to fold vect_patt_384.36_436 | { 1, ... } to { 1, ... }? > Or is the issue that vector masks contain padding and with > non-zero masking we'd have garbage in the padding and that leaks > here? That is, _47 ? 1 : iftmp.0_113 -> _47 | iftmp.0_113 assumes > there's exactly one bit in a

Re: [PATCH 3/8] tree-ifcvt: Enforce zero else value after maskload.

2024-08-21 Thread Robin Dapp
> > > > _Bool iftmp.0_113; > > > > _Bool iftmp.0_114; > > > > iftmp.0_113 = .MASK_LOAD (_170, 8B, _169, _171(D)); > > > > iftmp.0_114 = _47 | iftmp.0_113; > > _BoolD.2746 _47; > > iftmp.0_114 = _47 ? 1 : iftmp.0_113; > > which is folded into > > iftmp.0_114 = _47 | iftmp.0_113; > >

Re: [PATCH 3/8] tree-ifcvt: Enforce zero else value after maskload.

2024-08-21 Thread Robin Dapp
> > > Why? I don't think the vectorizer relies on a particular else > > > value? I'd say it would be appropriate for if-conversion to > > > use "ANY" and for the vectorizer to then pick a supported > > > version and/or enforce the else value it needs via a blend? > > > > In PR115336 we have some

Re: [PATCH 3/8] tree-ifcvt: Enforce zero else value after maskload.

2024-08-20 Thread Robin Dapp
> > > When predicating a load we implicitly assume that the else value is > > > zero. In order to formalize this this patch queries the target for > > > its supported else operand and uses that for the maskload call. > > > Subsequently, if the else operand is nonzero, a cond_expr enforcing > > > a

[PATCH] optabs-query: Guard smallest_int_mode_for_size [PR115495].

2024-08-20 Thread Robin Dapp
Hi, in get_best_extraction_insn we use smallest_int_mode_for_size with struct_bits as size argument. In PR115495 struct_bits = 256 and we don't have a mode for that. This patch just bails for such cases. This does not happen on the current trunk anymore (so the test passes unpatched) but we've

[PATCH 8/8] RISC-V: Add else operand to masked loads [PR115536].

2024-08-11 Thread Robin Dapp
This patch adds else operands to masked loads. Currently the default else operand predicate accepts "undefined" (i.e. SCRATCH) as well as all-ones values. Note that this series introduces a large number of new RVV FAILs for riscv. All of them are due to us not being able to elide redundant vec_c

[PATCH 7/8] i386: Add else operand to masked loads.

2024-08-11 Thread Robin Dapp
This patch adds a zero else operand to masked loads, in particular the masked gather load builtins that are used for gather vectorization. gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_special_args_builtin): Add else-operand handling. (ix86_expand_builtin): Ditt

[PATCH 6/8] gcn: Add else operand to masked loads.

2024-08-11 Thread Robin Dapp
This patch adds a zero else operand to the masked loads. gcc/ChangeLog: * config/gcn/predicates.md (maskload_else_operand): New predicate. * config/gcn/gcn-valu.md: Use new predicate. --- gcc/config/gcn/gcn-valu.md | 6 -- gcc/config/gcn/predicates.md | 3 +++ 2 fil

[PATCH 3/8] tree-ifcvt: Enforce zero else value after maskload.

2024-08-11 Thread Robin Dapp
When predicating a load we implicitly assume that the else value is zero. In order to formalize this, this patch queries the target for its supported else operand and uses that for the maskload call. Subsequently, if the else operand is nonzero, a cond_expr enforcing a zero else value is emitted.
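
A scalar model may help picture what "else value" means here. The sketch below is an assumption-laden illustration, not GCC code: model_mask_load and model_zero_inactive are hypothetical names, the mask is modeled as one byte per lane, and the second helper stands in for the extra cond_expr that forces inactive lanes to zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a masked load: active lanes read from
   memory, inactive lanes receive ELSE_VAL.  */
void
model_mask_load (int32_t *dst, const int32_t *src, const uint8_t *mask,
                 int32_t else_val, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = mask[i] ? src[i] : else_val;
}

/* Hypothetical model of the follow-up select: keep active lanes and
   force inactive lanes to zero.  */
void
model_zero_inactive (int32_t *v, const uint8_t *mask, size_t n)
{
  for (size_t i = 0; i < n; i++)
    v[i] = mask[i] ? v[i] : 0;
}
```

With a nonzero else value the first step leaves arbitrary data in the inactive lanes; the second step restores the zero that later code implicitly relies on.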

[PATCH 4/8] vect: Add maskload else value support.

2024-08-11 Thread Robin Dapp
This patch adds an else operand to vectorized masked load calls. The current implementation adds else-value arguments to the respective target-querying functions that is used to supply the vectorizer with the proper else value. Right now, the only spot where a zero else value is actually enforced

[PATCH 5/8] aarch64: Add masked-load else operands.

2024-08-11 Thread Robin Dapp
This adds zero else operands to masked loads and their intrinsics. I needed to adjust more than initially thought because we rely on combine for several instructions and a change in a "base" pattern needs to propagate to all those. For the lack of a better idea I used a function call property to s

[PATCH 2/8] ifn: Add else-operand handling.

2024-08-11 Thread Robin Dapp
This patch adds else-operand handling to the internal functions. gcc/ChangeLog: * internal-fn.cc (add_mask_and_len_args): Rename... (add_mask_else_and_len_args): ...to this and add else handling. (expand_partial_load_optab_fn): Use adjusted function. (expand_partia

[PATCH 1/8] docs: Document maskload else operand and behavior.

2024-08-11 Thread Robin Dapp
This patch amends the documentation for masked loads (maskload, vec_mask_load_lanes, and mask_gather_load as well as their len counterparts) with an else operand. gcc/ChangeLog: * doc/md.texi: Document masked load else operand. --- gcc/doc/md.texi | 60 +++

[PATCH 0/8] Masked load else operand.

2024-08-11 Thread Robin Dapp
st suite results are, however, unchanged. Robin Dapp (8): docs: Document maskload else operand and behavior. ifn: Add else-operand handling. tree-ifcvt: Enforce zero else value after maskload. vect: Add maskload else value support. aarch64: Add masked-load else operands. gcn: Add else ope

Re: [PATCH v1] RISC-V: Bugfix incorrect operand for vwsll auto-vect

2024-08-10 Thread Robin Dapp
A bit of bikeshedding: While it's obviously a bug, I'm not really sure it's useful to truncate before emitting the widening shift. Do we save an instruction vs. the regular non-widening shift by doing so? I think my original (failed) idea was this pattern to be an intermediate/bridge pattern tha

Re: [PATCH] RISC-V: Bugfix for RVV rounding intrinsic ICE in function checker

2024-08-09 Thread Robin Dapp
> When compiling an interface for rounding of type 'vfloat16*' without using > zvfh > or zvfhmin, it is not enough to use FLOAT_MODE_P because the type does not > support > it. Although the subsequent riscv_validate_vector_type checks will still fail > and throw exceptions, I don't think we shoul

Re: [PATCH] RISC-V: Fix missing abi arg in test

2024-08-08 Thread Robin Dapp
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr116202-run-1.c > b/gcc/testsuite/gcc.target/riscv/rvv/base/pr116202-run-1.c > index d150f20b5d9..02814183dbb 100644 > --- a/gcc/testsuite/gcc.target/riscv/rvv/base/pr116202-run-1.c > +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr116202-run

[PATCH] RISC-V: Correct mode_idx attribute for viwalu wx variants [PR116149].

2024-07-31 Thread Robin Dapp
Hi, in PR116149 we choose a wrong vector length which causes wrong values in a reduction. The problem happens in avlprop where we choose the number of units in the instruction's mode as vector length. For the non-scalar variants the respective operand has the correct non-widened mode. For the s

Re: [PATCH] RISC-V: Expand subreg move via slide if necessary [PR116086].

2024-07-31 Thread Robin Dapp
> > Like aarch64 we set REGMODE_NATURAL_SIZE for fixed-size modes to > > UNITS_PER_WORD. Isn't that part of the problem? > > > > In extract_bit_field_as_subreg we check lowpart_bit_field_p (= true because > > 128 is a multiple of UNITS_PER_WORD). This leads to the subreg expression. > > > > If I

Re: [PATCH] RISC-V: Expand subreg move via slide if necessary [PR116086].

2024-07-30 Thread Robin Dapp
> > IMO, what ought to happen here is that the RA should spill > > the inner register to memory and load the V4SI back from there. > > (Or vice versa, for an lvalue.) Obviously that's not very efficient, > > and so a patch like the above might be useful as an optimisation.[*] > > But it shouldn't

Re: [PATCH v1] RISC-V: Take Xmode instead of Pmode for ussub expanding

2024-07-29 Thread Robin Dapp
OK. -- Regards Robin

[PATCH] RISC-V: Expand subreg move via slide if necessary [PR116086].

2024-07-26 Thread Robin Dapp
Hi, when the source mode is potentially larger than one vector (e.g. an LMUL2 mode for VLEN=128) we don't know which vector the subreg actually refers to. For zvl128b and LMUL=2 the subreg in (subreg:V2DI (reg:V4DI)) could actually be a full (high) vector register of a two-register group (at

[PATCH] RISC-V: Work around bare apostrophe in error string.

2024-07-26 Thread Robin Dapp
Hi, an unquoted apostrophe slipped through when testing the recent V/M extension patch. This, again, re-words the message to "Currently the 'V' implementation requires the 'M' extension". Going to commit as obvious after testing. Regards Robin gcc/ChangeLog: * config/riscv/riscv.cc (

[PATCH] fold: Allow SSA names in inverse_conditions_p and fold VCOND_MASK.

2024-07-25 Thread Robin Dapp
Hi, In preparation for the maskload else operand I split off this patch. The patch looks through SSA names for the conditions passed to inverse_conditions_p which helps match.pd recognize more redundant vec_cond expressions. It also adds VCOND_MASK to the respective iterators in match.pd. Is th

Re: [PATCH v2] RISC-V: Error early with V and no M extension.

2024-07-24 Thread Robin Dapp
> That phrasing makes sense to me. It's consistent with the -mbig-endian > sorry message: > > https://godbolt.org/z/oWMeorEeM I seem to remember that explicitly mentioning GCC in an error message like that was discouraged but I might be confusing things. So probably "GCC's current 'V' implementa

Re: [PATCH v2] RISC-V: Error early with V and no M extension.

2024-07-24 Thread Robin Dapp
> It's really GCC's implementation of the V extension that requires M, not > the actual ISA V extension. So I think the wording could be a little > confusing for users here, but no big deal either way on my end so > > Reviewed-by: Palmer Dabbelt Hmm, fair. How about just "the 'V' implementatio

[PATCH v2] RISC-V: Error early with V and no M extension.

2024-07-24 Thread Robin Dapp
Hi, now with proper diff... For calculating the value of a poly_int at runtime we use a multiplication instruction that requires the M extension. Instead of just asserting and ICEing this patch emits an early error at option-parsing time. We have several tests that use only "i" (without "m") and

Re: [PATCH] RISC-V: Error early with V and no M extension. [PR116036]

2024-07-24 Thread Robin Dapp
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc > index 826d552a6fd..eb6c033535c 100644 > --- a/gcc/internal-fn.cc > +++ b/gcc/internal-fn.cc > @@ -5049,7 +5049,8 @@ internal_len_load_store_bias (internal_fn ifn, > machine_mode mode) > } > > /* Return true if the given ELS_VALUE is supp

[PATCH] RISC-V: Error early with V and no M extension. [PR116036]

2024-07-24 Thread Robin Dapp
Hi, for calculating the value of a poly_int at runtime we use a multiplication instruction that requires the M extension. Instead of just asserting and ICEing this patch emits an early error at option-parsing time. We have several tests that use only "i" (without "m") and I adjusted all of them

Re: [PATCH v3 2/2] Prevent divide-by-zero

2024-07-24 Thread Robin Dapp
> Thanks for the explanation! I have a few clarification questions about this. > If I understand correctly, B would represent the number of elements the > vector can have (for 128b vector operating on 32b elements, B == 4, but if > operating on 64b elements B == 2); however, I'm not too sure what A

Re: [PATCH v3] RISC-V: Implement the .SAT_TRUNC for scalar

2024-07-22 Thread Robin Dapp
LGTM. -- Regards Robin

Re: [PATCH v2] RISC-V: More support of vx and vf for autovec comparison

2024-07-19 Thread Robin Dapp
> I have a test. > The backend can't see -0.0 and it becomes 0.0 when translated to gimple. I don't think it should except when specifying -ffast-math or similar. But we don't have a shortcut to load a negative zero, just the positive one. -- Regards Robin

Re: [PATCH v2] RISC-V: More support of vx and vf for autovec comparison

2024-07-19 Thread Robin Dapp
> -(match_operand:V_VLSF 3 "register_operand")]))] > +(match_operand:V_VLSF 3 "nonmemory_operand")]))] Even though the integer compares have nonmemory operand here their respective insn patterns don't (but constrain properly). I guess what's happening with register operand and a c

Re: [PATCH] RISC-V: More support of vx and vf for autovec comparison

2024-07-17 Thread Robin Dapp
Hi Demin, > + void add_integer_operand (rtx x) > + { > +create_integer_operand (&m_ops[m_opno++], INTVAL (x)); > +gcc_assert (m_opno <= MAX_OPERANDS); > + } Can that be folded into add_input_operand somehow? >void add_input_operand (rtx x, machine_mode mode) >{ > create_i

Re: [RFC] tree-if-conv: Handle nonzero masked elements [PR115336].

2024-07-07 Thread Robin Dapp
> Yeah, I think so. I guess for RVV there's a choice between: > > (1) making the insn predicate accept all else values and making > the insn emit an explicit blend between the loaded result > and the else value > > (2) making the insn predicate only accept “undefined” (SCRATCH in > r

Re: [RFC] tree-if-conv: Handle nonzero masked elements [PR115336].

2024-07-05 Thread Robin Dapp
> To me this looks like mis-applying of match.pd:6083? > > Applying pattern match.pd:6083, gimple-match-1.cc:45749 > gimple_simplified to iftmp.0_62 = iftmp.0_61 | _219; > new phi replacement stmt > iftmp.0_62 = iftmp.0_61 | _219; > > so originally it wasn't > > iftmp.0_61 = .MASK_LOAD (_260,

Re: [RFC] tree-if-conv: Handle nonzero masked elements [PR115336].

2024-07-05 Thread Robin Dapp
> FTR, my concern & suggestion was: > > I suppose the difficulty is that we might make: > > MASK_LOAD (mask, ptr, some-arbitrary-else-value) > > seem as cheap as: > > MASK_LOAD (mask, ptr, { 0, 0,. ... 0}) > > which definitely isn't the case for SVE (and I'm guessing also > for

[RFC] tree-if-conv: Handle nonzero masked elements [PR115336].

2024-07-05 Thread Robin Dapp
Hi, in PR115336 we have the following vect_patt_391 = .MASK_LEN_GATHER_LOAD (_470, vect__59, 1, { 0, ... }, { 0, ... }, _482, 0); vect_iftmp.44 = vect_patt_391 | { 1, ... }; .MASK_LEN_STORE (vectp_f.45, 8B, { -1, ... }, _482, 0, vect_iftmp.44); which assumes that a maskload sets the maske

[PATCH] RISC-V: Use tu policy for first-element vec_set [PR115725].

2024-07-03 Thread Robin Dapp
Hi, this patch changes the tail policy for vmv.s.x from ta to tu. By default the bug does not show up with qemu because qemu's current vmv.s.x implementation always uses the tail-undisturbed policy. With a local qemu version that overwrites the tail with ones when the tail-agnostic policy is spec

Re: [PATCH v2] RISC-V: Remove float vector eqne pattern

2024-06-19 Thread Robin Dapp
OK. Thanks for adding the test. Regards Robin

Re: [PATCH V2 2/2] RISC-V: Move mode assertion out of conditional branch in emit_insn

2024-06-14 Thread Robin Dapp
OK. Regards Robin

Re: [PATCH V2 1/2] RISC-V: Fix vwsll combine on rv32 targets

2024-06-14 Thread Robin Dapp
> diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md > index 6a2eabbd854..29916adb62b 100644 > --- a/gcc/config/riscv/autovec-opt.md > +++ b/gcc/config/riscv/autovec-opt.md > @@ -1517,8 +1517,7 @@ (define_insn_and_split "*vwsll_zext1_scalar_" >"&& 1" >[(const_int

Re: [PATCH 1/2] RISC-V: Fix vwsll combine on rv32 targets

2024-06-13 Thread Robin Dapp
> I did a test run without the subreg condition and it also appears to > work when running on rv32gcv and rv64gcv newlib. Would it be better > to remove the subreg? Yep, if it works, i.e. all tests still pass then let's get rid of it. Regards Robin

Re: [PATCH] ifcvt: Clarify if_info.original_cost.

2024-06-12 Thread Robin Dapp
> Hmm, ok. The bit that confused me most was: > > if (last_needs_comparison != -1) > { > end_sequence (); > start_sequence (); > ... > } > > which implied that the second attempt was made conditionally. > It seems like it's always used and is an inherent part of the >

Re: [PATCH 1/2] RISC-V: Fix vwsll combine on rv32 targets

2024-06-12 Thread Robin Dapp
Hi Edwin, this is OK but did you check if we can get rid of the subreg condition now that we have gen_lowpart? Regards Robin

Re: [PATCH 2/2] RISC-V: Move mode assertion out of conditional branch in emit_insn

2024-06-12 Thread Robin Dapp
:00:00 2001 From: Robin Dapp Date: Fri, 31 May 2024 14:51:17 +0200 Subject: [PATCH] RISC-V: Use descriptive errors instead of asserts. In emit_insn we forestall possible ICEs in maybe_legitimize_operand by asserting. This patch replaces the assertions by more descriptive internal errors.

Re: [PATCH] ifcvt: Clarify if_info.original_cost.

2024-06-11 Thread Robin Dapp
> I was looking at the code in more detail and just wanted to check. > We have: > > int last_needs_comparison = -1; > > bool ok = noce_convert_multiple_sets_1 > (if_info, &need_no_cmov, &rewired_src, &targets, &temporaries, > &unmodified_insns, &last_needs_comparison); > if (!ok) >

Re: [PATCH] ifcvt: Clarify if_info.original_cost.

2024-06-11 Thread Robin Dapp
The attached v3 tracks the use of cond_earliest as you suggested and adds its cost in default_noce_conversion_profitable_p. Bootstrapped and regtested on x86 and p10, aarch64 still running. Regtested on riscv64. Regards Robin Before noce_find_if_block processes a block it sets up an if_info st

Re: [PATCH v1] RISC-V: Implement .SAT_SUB for unsigned vector int

2024-06-11 Thread Robin Dapp
Thanks, the patch is OK then. Regards Robin

Re: [PATCH v1] RISC-V: Implement .SAT_SUB for unsigned vector int

2024-06-11 Thread Robin Dapp
Hi Pan, in general LGTM. Would you mind adding the coremark-pro testcase which should be working now, and, was the original reason for doing this? I believe the following should do: extern int wsize; typedef unsigned short Posf; #define NIL 0 void foo (Posf *p) { register unsigned n, m; d

[PATCH v2] vect: Merge loop mask and cond_op mask in fold-left, reduction [PR115382].

2024-06-10 Thread Robin Dapp
> Actually, as Richard mentioned in the PR, it would probably be better > to use prepare_vec_mask instead. It should work in this context too > and would avoid redundant double masking. Attached is v2 that uses prepare_vec_mask. Regtested on riscv64 and armv8.8-a+sve via qemu. Bootstrap and regt

Re: [PATCH] vect: Merge loop mask and cond_op mask in fold-left, reduction.

2024-06-10 Thread Robin Dapp
Just realized I missed the PR115382 tag in the patch... Regards Robin

[PATCH] internal-fn: Force to reg if operand doesn't match.

2024-06-10 Thread Robin Dapp
Hi, despite looking good on cfarm185 and Linaro's pre-commit CI gcc-15-638-g7ca35f2e430 now appears to have caused several regressions on arm-eabi cortex-m55 as found by Linaro's CI: https://linaro.atlassian.net/browse/GNU-1252 I'm assuming this target is not tested as regularly and thus the fai

Re: [PATCH 1/5] RISC-V: Remove float vector eqne pattern

2024-06-10 Thread Robin Dapp
> But isn't canonicalization of EQ/NE safe, even for IEEE NaN and +-0.0? > > target = (a == b) ? x : y > target = (a != b) ? y : x > > Are equivalent, even for IEEE IIRC. Yes, that should be fine. My concern was not that we do a canonicalization but that we might not do it for some of the vecto
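
The equivalence claimed in the quoted text can be checked in plain C. The sketch below uses illustrative function names and covers the two cases usually considered problematic, NaN operands and signed zeros.

```c
#include <math.h>

/* target = (a == b) ? x : y  */
double
select_eq (double a, double b, double x, double y)
{
  return (a == b) ? x : y;
}

/* target = (a != b) ? y : x, the canonicalized counterpart.  */
double
select_ne (double a, double b, double x, double y)
{
  return (a != b) ? y : x;
}
```

For a NaN operand `a == b` is false and `a != b` is true, so both forms pick y; for +0.0 and -0.0 the operands compare equal, so both pick x.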

[PATCH] vect: Merge loop mask and cond_op mask in fold-left, reduction.

2024-06-10 Thread Robin Dapp
Hi, currently we discard the cond-op mask when the loop is fully masked which causes wrong code in gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c when compiled with -O3 -march=cascadelake --param vect-partial-vector-usage=2. This patch ANDs both masks instead. Bootstrapped and regtested on
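
As a rough scalar model of the fix described above: in an in-order (fold-left) reduction an element may only contribute when both the loop mask and the condition mask are set. The function below is a hypothetical illustration with masks modeled as one byte per lane; the real change operates on vector masks inside the vectorizer.

```c
#include <stddef.h>
#include <stdint.h>

/* In-order reduction where lane i contributes only if it is active in
   the loop mask AND selected by the condition mask.  Dropping either
   mask would add elements that must be skipped.  */
double
fold_left_add (const double *a, const uint8_t *loop_mask,
               const uint8_t *cond_mask, size_t n, double init)
{
  double acc = init;
  for (size_t i = 0; i < n; i++)
    if (loop_mask[i] && cond_mask[i])
      acc += a[i];
  return acc;
}
```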

Re: [PATCH v3] RISC-V: Implement .SAT_SUB for unsigned scalar int

2024-06-07 Thread Robin Dapp
LGTM. Let's keep in mind that min/max will save us two insns(?) and a conditional move would save us one. Regards Robin

Re: [PATCH v2] RISC-V: Implement .SAT_SUB for unsigned scalar int

2024-06-07 Thread Robin Dapp
>> When you say other variants are still to be implemented >> does that also include variants for zbb with min/max >> or zicond? > > No, I mean some other forms like branch need the improvement from the > middle end(aka widen_mul). Ah, I see, thanks. Those can save one instruction and we want th

Re: [PATCH v2] RISC-V: Implement .SAT_SUB for unsigned scalar int

2024-06-07 Thread Robin Dapp
Hi Pan, > + /* Step-2: lt = x < y */ > + riscv_emit_binary (LTU, pmode_lt, pmode_x, pmode_y); > + > + /* Step-3: lt = -lt */ > + riscv_emit_unary (NEG, pmode_lt, pmode_lt); > + > + /* Step-4: lt = ~lt */ > + riscv_emit_unary (NOT, pmode_lt, pmode_lt); Can we replace step 3 and 4 with sub
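
The quoted steps can be written out as a branchless C sketch. Only steps 2 to 4 are visible in the excerpt; the initial subtraction and the final AND are assumptions added here to make the example complete, and the function names are illustrative. The second function relies on the two's complement identity ~(-lt) == lt - 1; whether that is what the truncated reply goes on to propose is not visible in this listing.

```c
#include <stdint.h>

/* Unsigned saturating subtraction following the quoted sequence.  */
uint32_t
sat_sub_u32 (uint32_t x, uint32_t y)
{
  uint32_t minus = x - y;   /* Assumed step 1: wrapping difference.  */
  uint32_t lt = x < y;      /* Step 2: lt = x < y, i.e. 0 or 1.  */
  lt = 0u - lt;             /* Step 3: lt = -lt, i.e. 0 or all ones.  */
  lt = ~lt;                 /* Step 4: lt = ~lt, i.e. all ones or 0.  */
  return minus & lt;        /* Assumed final step: clear on underflow.  */
}

/* Same result with steps 3 and 4 merged into one subtraction,
   using ~(-lt) == lt - 1.  */
uint32_t
sat_sub_u32_short (uint32_t x, uint32_t y)
{
  uint32_t minus = x - y;
  uint32_t lt = x < y;
  return minus & (lt - 1u);
}
```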

Re: [PATCH] ifcvt: Clarify if_info.original_cost.

2024-06-07 Thread Robin Dapp
> Is there any way we can avoid using pattern_cost here? Using it means > that we can make use of targetm.insn_cost for the jump but circumvent > it for the condition, giving a bit of a mixed metric. > > (I realise there are existing calls to pattern_cost in ifcvt.cc, > but if possible I think we

[PATCH] RISC-V: Regenerate opt urls.

2024-06-06 Thread Robin Dapp
Hi, I wasn't aware that I needed to regenerate the opt urls when adding an option. For this patch I did it now. I suppose this doesn't require an extra OK but I'm going to wait some minutes before applying still. Regards Robin gcc/ChangeLog: * config/riscv/riscv.opt.urls: Regenerate.

[PATCH] check_GNU_style: Use raw strings.

2024-05-31 Thread Robin Dapp
Hi, this silences some warnings when using check_GNU_style. I didn't expect this to have any bootstrap or regtest impact but I still ran it on x86 - no change. Regards Robin contrib/ChangeLog: * check_GNU_style_lib.py: Use raw strings for regexps. --- contrib/check_GNU_style_lib.py |

[PATCH] RISC-V: Add min/max patterns for ifcvt.

2024-05-31 Thread Robin Dapp
Hi, ifcvt likes to emit (set (if_then_else) (ge (reg 1) (reg2)) (reg 1) (reg 2)) which can be recognized as min/max patterns in the backend. This patch adds such patterns and the respective iterators as well as a test. This depends on the generic ifcvt change. Regtested on rv64gcv

[PATCH] ifcvt: Clarify if_info.original_cost.

2024-05-31 Thread Robin Dapp
Hi, before noce_find_if_block processes a block it sets up an if_info structure that holds the original costs. At that point the costs of the then/else blocks have not been added so we only care about the "if" cost. The code originally used BRANCH_COST for that but was then changed to COST_N_INS

Re: [PATCH 1/2] RISC-V: add option -m(no-)autovec-segment

2024-05-29 Thread Robin Dapp
On 5/28/24 23:55, Patrick O'Neill wrote: > From: Greg McGary > > Add option -m(no-)autovec-segment to enable/disable autovectorizer > from emitting vector segment load/store instructions. This is useful for > performance experiments. I think the question was raised before but does a vector tune

[PATCH v4] RISC-V: Introduce -mvector-strict-align.

2024-05-28 Thread Robin Dapp
Hi, this patch disables movmisalign by default and introduces the -mno-vector-strict-align option to override it and re-enable movmisalign. For now, generic-ooo is the only uarch that supports misaligned vector access. The patch also adds a check_effective_target_riscv_v_misalign_ok to the tests

Re: [PATCH v3] RISC-V: Introduce -mvector-strict-align.

2024-05-27 Thread Robin Dapp
>> + /* By default, when -mno-vector-strict-align is not specified, do not >> allow >> + unaligned vector memory accesses except if -mtune's setting explicitly >> + allows it. */ >> + riscv_vector_unaligned_access_p = rvv_vector_strict_align == 0 || > > opts->x_rvv_vector_strict_align

[PATCH v3] RISC-V: Introduce -mvector-strict-align.

2024-05-27 Thread Robin Dapp
Attached is v3 with the discussed changes. It now has -mscalar-strict-align which is an alias to -mstrict-align as well as -mvector-strict-align. Testsuite shows no new regressions on rv64gcv_zvfh_zvbb. Regards Robin gcc/ChangeLog: * config/riscv/riscv-opts.h (TARGET_VECTOR_MISALIGN_S

Re: [PATCH v2] RISC-V: Introduce -mrvv-allow-misalign.

2024-05-24 Thread Robin Dapp
> * -mstrict-align: Both scalar and vector misaligned accesses are > unsupported (-mrvv-allow-misalign doesn't matter). I'm not sure if > there's hardware there, but given we have systems that don't support > scalar misaligned accesses it seems reasonable to assume they'll also > not support vecto

[PATCH v2] RISC-V: Introduce -mrvv-allow-misalign.

2024-05-24 Thread Robin Dapp
> We should have something in doc/invoke too, this one is going to be > tricky for users. We'll also have to define how this interacts with > the existing -mstrict-align. Addressed the rest in the attached v2 which also fixes tests. I'm really not sure about -mstrict-align. I would have hoped th

[PATCH] RISC-V: Introduce -mrvv-allow-misalign.

2024-05-24 Thread Robin Dapp
Hi, this patch changes the default from always enabling movmisalign to disabling it. It adds an option to override the default and adds generic-ooo to the uarchs that support misaligned vector access. It also adds a check_effective_target_riscv_v_misalign_ok to the testsuite which enables or dis

Re: [PATCH] RISC-V: Enable vectorization for vect-early-break_124-pr114403.c

2024-05-21 Thread Robin Dapp
The patch is OK from the riscv side. generic-ooo includes fast unaligned access. Regards Robin

Re: [PATCH v6] RISC-V: Implement IFN SAT_ADD for both the scalar and vector

2024-05-17 Thread Robin Dapp
Hi Pan, all in all LGTM. Just insignificant nits. > +void > +expand_vec_usadd (rtx op_0, rtx op_1, rtx op_2, machine_mode vec_mode) > +{ > + emit_vec_saddu (op_0, op_1, op_2, BINARY_OP, vec_mode); > +} > + Do we really need this function? Or do you want it to be a dispatcher for later? If it

[PATCH] RISC-V: Remove dead perm series code and document.

2024-05-17 Thread Robin Dapp
Hi, with the introduction of shuffle_series_patterns the explicit handler code for a perm series is dead. This patch removes it and also adds a function-level comment to shuffle_series_patterns. Regtested on rv64gcv_zvfh_zvbb. Regards Robin gcc/ChangeLog: * config/riscv/riscv-v.cc (e

[PATCH] RISC-V: Add vector popcount, clz, ctz.

2024-05-17 Thread Robin Dapp
Hi, this patch adds the zvbb vcpop, vclz and vctz to the autovec machinery as well as tests for them. It also changes several non-VLS iterators to V_VLS iterators for consistency. Regtested on rv64gcv_zvfh_zvbb. Regards Robin gcc/ChangeLog: * config/riscv/autovec.md (ctz<mode>2): New expan
