Re: [PATCH] fwprop: Allow UNARY_P and check register pressure.

2023-08-24 Thread Robin Dapp via Gcc-patches
Ping. I refined the code and some comments a bit and added a test case. My question in general would still be: Is this something we want given that we potentially move some of combine's work a bit towards the front of the RTL pipeline? Regards Robin Subject: [PATCH] fwprop: Allow UNARY_P and

Re: [PATCH] tree-optimization/111115 - SLP of masked stores

2023-08-24 Thread Robin Dapp via Gcc-patches
This causes an ICE in gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c (internal compiler error: in get_group_load_store_type, at tree-vect-stmts.cc:2121) #include #define TEST_LOOP(DATA_TYPE, INDEX_TYPE) \ void __attribute__ ((noinline,

Re: [PATCH] RISC-V: Add COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS testcases

2023-08-24 Thread Robin Dapp via Gcc-patches
OK. Regards Robin

Re: [PATCH V2] RISC-V: Support LEN_FOLD_EXTRACT_LAST auto-vectorization

2023-08-24 Thread Robin Dapp via Gcc-patches
LGTM. Regards Robin

Re: [PATCH] RISC-V: Add conditional sign/zero extension and truncation autovec patterns

2023-08-24 Thread Robin Dapp via Gcc-patches
> Yes, it's better to call it one_quad. I'd suggest to go with quarter as before or quarter_width_op or something. >> Is this necessary for recognizing a different pattern? > > Are you saying that the testcases xxx-1 and xxx-2 are duplicated? If > so, I have no problem removing it and just kee

Re: [PATCH] RISC-V: Add conditional sign/zero extension and truncation autovec patterns

2023-08-24 Thread Robin Dapp via Gcc-patches
Hi Lehua, thanks, just tiny non-functional nits. > - rtx ops[] = {operands[0], quarter}; > - icode = code_for_pred_trunc (mode); > - riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, ops); > + rtx half = gen_reg_rtx (mode); Not really a half anymore now? :) > +#include > + > +#

Re: [PATCH] RISC-V: Support LEN_FOLD_EXTRACT_LAST auto-vectorization

2023-08-24 Thread Robin Dapp via Gcc-patches
Hi Juzhe, > vcpop.m a5,v0 > beq a5,zero,.L3 > addi a5,a5,-1 > vsetvli a4,zero,e32,m1,ta,ma > vcompress.vm v2,v3,v0 > vslidedown.vx v2,v2,a5 > vmv.x.s a0,v2 > .L3: > sext.w a0,a0 Mhm, where is this sext coming from? Thought I had this c

Re: [PATCH] RISC-V: Add initial pipeline description for an out-of-order core.

2023-08-23 Thread Robin Dapp via Gcc-patches
> Does this patch fix these 2 following PR: > 108271 – Missed RVV cost model (gnu.org) > > 108412 – RISC-V: Negative optimization of GCSE && LOOP INVARIANTS (gnu.org) > > > If yes, plz app

[PATCH] RISC-V: Add initial pipeline description for an out-of-order core.

2023-08-23 Thread Robin Dapp via Gcc-patches
Hi, this adds a pipeline description for a generic out-of-order core. Latency and units are not based on any real processor but more or less educated guesses what such a processor could look like. For the lack of a better name, I called the -mtune parameter "generic-ooo". In order to account for

Re: [PATCH V2] RISC-V: Add conditional unary neg/abs/not autovec patterns

2023-08-23 Thread Robin Dapp via Gcc-patches
OK, thanks. Regards Robin

Re: [PATCH] RISC-V: Add conditional unary neg/abs/not autovec patterns

2023-08-22 Thread Robin Dapp via Gcc-patches
Hi Lehua, no concerns here, just tiny remarks but in general LGTM as is. > +(define_insn_and_split "*copysign_neg" > + [(set (match_operand:VF 0 "register_operand") > +(neg:VF > + (unspec:VF [ > +(match_operand:VF 1 "register_operand") > +(match_operand:V

Re: [PATCH] RISC-V: Add conditional unary neg/abs/not autovec patterns

2023-08-22 Thread Robin Dapp via Gcc-patches
> What about conditional zero_extension, sign_extension, > float_extension, ...etc? > > We have discussed this, we can have some many conditional situations > that can be supported by either match.pd or rtl backend combine > pass. > > IMHO, it will be too many optabs/internal fns if we support al

Re: RISCV test infrastructure for d / v / zfh extensions

2023-08-21 Thread Robin Dapp via Gcc-patches
Hi Joern. > Hmm, you are right. I personally prefer my version because it allows > consistent naming of the > different tests, also easily extendible when new extensions need testing. > Although the riscv_vector name has the advantage that it is better > legible for people who are > not used to d

Re: [PATCH] RISC-V: Refactor Phase 3 (Demand fusion) of VSETVL PASS

2023-08-21 Thread Robin Dapp via Gcc-patches
Hi Juzhe, thanks, this is a reasonable approach and improves readability noticeably. LGTM but I'd like to wait for other opinions (e.g. by Kito) as I haven't looked closely into the vsetvl pass before and cannot entirely review it quickly. As we already have good test coverage there is not much t

[PATCH] RISC-V: Allow immediates 17-31 for vector shift.

2023-08-18 Thread Robin Dapp via Gcc-patches
Hi, this patch adds a missing constraint check in order to be able to print (and not ICE) vector immediates 17-31 for vector shifts. Regards Robin gcc/ChangeLog: * config/riscv/riscv.cc (riscv_print_operand): gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/shift

[PATCH] RISC-V/testsuite: Add missing conversion tests.

2023-08-18 Thread Robin Dapp via Gcc-patches
Hi, this patch adds some missing tests for vf[nw]cvt. Regards Robin gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/conversions/vfncvt-ftoi-run.c: Add tests. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-ftoi-rv32gcv.c: Ditto. * gcc.target/ri

[PATCH] RISC-V: Enable pressure-aware scheduling by default.

2023-08-18 Thread Robin Dapp via Gcc-patches
Hi, this patch enables pressure-aware scheduling for riscv. There have been various requests for it so I figured I'd just go ahead and send the patch. There is some slight regression in code quality for a number of vector tests where we spill more due to different instructions order. The ones I

Re: [PATCH] RISC-V: Fix -march error of zhinxmin testcases

2023-08-17 Thread Robin Dapp via Gcc-patches
> This little patch fixes the -march error of a zhinxmin testcase I added earlier > and an old zhinxmin testcase, since these testcases are for zhinxmin extension > and not zfhmin extension. Arg, I should have noticed that ;) OK, of course. Regards Robin

Re: [PATCH V2] RISC-V: Add the missed half floating-point mode patterns of local_pic_load/store when only use zfhmin or zhinxmin

2023-08-17 Thread Robin Dapp via Gcc-patches
Indeed all ANYLSF patterns have TARGET_HARD_FLOAT (==f extension) which is incompatible with ZHINX or ZHINXMIN anyway. That should really be fixed separately or at least clarified, maybe I'm missing something. Still we can go forward with the patch itself as it improves things independently, so L

Re: [PATCH V2] RISC-V: Forbidden fuse vlmax vsetvl to DEMAND_NONZERO_AVL vsetvl

2023-08-17 Thread Robin Dapp via Gcc-patches
OK, thanks. Regards Robin

Re: [PATCH] RISC-V: Forbidden fuse vlmax vsetvl to DEMAND_NONZERO_AVL vsetvl

2023-08-17 Thread Robin Dapp via Gcc-patches
Hi Lehua, > XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvand > XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvand > XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvand > XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c sca

Re: [PATCH] RISC-V: Add the missed half floating-point mode patterns of local_pic_load/store when only use zfhmin

2023-08-17 Thread Robin Dapp via Gcc-patches
Hi Lehua, thanks for fixing this. Looks like the same reason we have the separation of zvfh and zvfhmin for vector loads/stores. > +;; Iterator for hardware-supported load/store floating-point modes. > +(define_mode_iterator ANYLSF [(SF "TARGET_HARD_FLOAT || TARGET_ZFINX") > +

Re: [PATCH] RISC-V: Forbidden fuse vlmax vsetvl to DEMAND_NONZERO_AVL vsetvl

2023-08-17 Thread Robin Dapp via Gcc-patches
Hi Lehua, unrelated but I'm seeing a lot of failing gather/scatter tests on master right now. > /* DIRTY -> DIRTY or VALID -> DIRTY. */ > + if (block_info.reaching_out.demand_p (DEMAND_NONZERO_AVL) > + && vlmax_avl_p (prop.get_avl ())) > + continue

Re: [PATCH] RISC-V: Fix reduc_strict_run-1 test case.

2023-08-17 Thread Robin Dapp via Gcc-patches
> I'm not opposed to merging the test change, but I couldn't figure out > where in C the implicit conversion was coming from: as far as I can > tell the macros don't introduce any (it's "return _float16 * > _float16"), I'd had the patch open since last night but couldn't > figure it out. > > We ge

Re: [PATCH] RISC-V: Fix reduc_strict_run-1 test case.

2023-08-16 Thread Robin Dapp via Gcc-patches
> But if it's a float16 precision issue then I would have expected both > the computations for the lhs and rhs values to have suffered > similarly. Yeah, right. I didn't look closely enough. The problem is not the reduction but the additional return-value conversion that is omitted when calculat

Re: [PATCH] IFN: Fix vector extraction into promoted subreg.

2023-08-16 Thread Robin Dapp via Gcc-patches
> However: > > | #define vec_extract_direct { 3, 3, false } > > This looks wrong. The numbers are argument numbers (or -1 for a return > value). vec_extract only takes 2 arguments, so 3 looks to be out-of-range. > > | #define direct_vec_extract_optab_supported_p direct_optab_supported_p > > I

[PATCH] RISC-V: Fix reduc_strict_run-1 test case.

2023-08-15 Thread Robin Dapp via Gcc-patches
Hi, this patch changes the equality check for the reduc_strict_run-1 testcase from == to fabs () < EPS. The FAIL only occurs with _Float16 but I'd argue approximate equality is preferable for all float modes. Regards Robin gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/reduc/
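For illustration, a minimal sketch of the kind of approximate comparison described above. The helper name approx_equal and the tolerance value are assumptions made for this example; the preview does not show the EPS the test actually uses.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical tolerance, chosen only for this sketch.  The value used
   by the real test case is not visible in the message preview.  */
#define EPS 1e-2

/* Return true when A and B differ by less than EPS.  This replaces an
   exact == comparison, which can fail when one side goes through an
   extra rounding step (e.g. a narrowing return-value conversion).  */
static bool
approx_equal (double a, double b)
{
  return fabs (a - b) < EPS;
}
```

A check such as `if (!approx_equal (result, expected)) __builtin_abort ();` would then take the place of the exact comparison.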

[PATCH] IFN: Fix vector extraction into promoted subreg.

2023-08-15 Thread Robin Dapp via Gcc-patches
Hi, this patch fixes the case where vec_extract gets passed a promoted subreg (e.g. from a return value). When such a subreg is the destination of a vector extraction we create a separate pseudo register and ensure that the necessary promotion is performed afterwards. Before this patch a sign-ex

Re: [PATCH] RISC-V: Implement vector "average" autovec pattern.

2023-08-15 Thread Robin Dapp via Gcc-patches
> Plz put your testcases into: > > # widening operation only test on LMUL < 8 > set AUTOVEC_TEST_OPTS [list \ >   {-ftree-vectorize -O3 --param riscv-autovec-lmul=m1} \ >   {-ftree-vectorize -O3 --param riscv-autovec-lmul=m2} \ >   {-ftree-vectorize -O3 --param riscv-autovec-lmul=m4} \ >   {-ftree

Re: [PATCH] RISC-V: Fix autovec_length_operand predicate[PR110989]

2023-08-15 Thread Robin Dapp via Gcc-patches
> Currently, autovec_length_operand predicate incorrect configuration is > discovered in PR110989 since this following situation: In case you haven't committed it yet: This is OK. Regards Robin

Re: [PATCH V4] VECT: Support loop len control on EXTRACT_LAST vectorization

2023-08-14 Thread Robin Dapp via Gcc-patches
Hi Kewen, > I did a bootstrapping and regression testing on Power10 (LE) and found a lot > of failures. I think the problem is that just like for vec_set we're expecting the vec_extract expander not to fail. It is probably passed not a const int here anymore and therefore fails to expand? can_

Re: [PATCH] RISC-V: Add MASK vec_duplicate pattern[PR110962]

2023-08-10 Thread Robin Dapp via Gcc-patches
> Is this patch ok ? Maybe we can find a way to add a target specific > fortran test but should not block this bug fix. It's not much different than adding a C testcase actually, apart from starting comments with a ! But well, LGTM. The test doesn't look that complicated and quite likely is cov

Re: [PATCH V2] VECT: Support loop len control on EXTRACT_LAST vectorization

2023-08-10 Thread Robin Dapp via Gcc-patches
> Hmm, I think VEC_EXTRACT and VEC_SET should be ECF_CONST. Maybe the > GIMPLE ISEL > comments do not match the implementation, but then that should be fixed? > > /* Expand all ARRAY_REF(VIEW_CONVERT_EXPR) gimple assignments into calls > to >internal function based on vector type of selecte

Re: [PATCH] RISC-V: Support TU for integer ternary OP[PR110964]

2023-08-10 Thread Robin Dapp via Gcc-patches
OK. Regards Robin

Re: [PATCH] RISC-V: Add MASK vec_duplicate pattern[PR110962]

2023-08-10 Thread Robin Dapp via Gcc-patches
Is the testcase already in the test suite? If not we should add it. Apart from that LGTM. Regards Robin

Re: [PATCH] RISC-V: Add missing modes to the iterators

2023-08-10 Thread Robin Dapp via Gcc-patches
Yeah, thanks, better in this separate patch. OK. Regards Robin

Re: [PATCH] RISC-V: Support NPATTERNS = 1 stepped vector[PR110950]

2023-08-09 Thread Robin Dapp via Gcc-patches
OK, thanks. Regards Robin

Re: [PATCH] vect: Add a popcount fallback.

2023-08-09 Thread Robin Dapp via Gcc-patches
> We seem to be looking at promotions of the call argument, lhs_type > is the same as the type of the call LHS. But the comment mentions .POPCOUNT > and the following code also handles others, so maybe handling should be > moved. Also when we look to vectorize popcount (x) instead of popcount((T)

Re: [PATCH] vect: Add a popcount fallback.

2023-08-08 Thread Robin Dapp via Gcc-patches
> Presumably this is an alternative to the approach Juzhe posted a week > or two ago and ultimately dropped? Yeah, I figured having a generic fallback could help more targets. We can still have a better expander if we see the need. Regards Robin

Re: [PATCH] vect: Add a popcount fallback.

2023-08-08 Thread Robin Dapp via Gcc-patches
> Hmm, the conversion should be a separate statement so I wonder > why it would go wrong? It is indeed. Yet, lhs_type is the lhs type of the conversion and not the call and consequently we compare the precision of the converted type with the popcount input. So we should probably rather do someth

Re: [PATCH] RISC-V: Allow CONST_VECTOR for VLS modes.

2023-08-08 Thread Robin Dapp via Gcc-patches
Hi Juzhe, just some nits. > - else if (rtx_equal_p (step, constm1_rtx) && poly_int_rtx_p (base, &value) > + else if (rtx_equal_p (step, constm1_rtx) > +&& poly_int_rtx_p (base, &value) Looks like just a line-break change and the line is not too long? > - rtx ops[] = {dest, vid, g

Re: [PATCH] vect: Add a popcount fallback.

2023-08-08 Thread Robin Dapp via Gcc-patches
> Well, not sure how VECT_COMPARE_COSTS can help here, we either > get the pattern or vectorize the original function. There's no special > handling > for popcount in vectorizable_call so all special cases are handled via > patterns. > I was thinking of popcounthi via popcountsi and zero-extend

Re: [PATCH v2] Mode-Switching: Fix SET_SRC ICE when USE or CLOBBER

2023-08-08 Thread Robin Dapp via Gcc-patches
> Could you please help to share how to enable checks here? Build with --enable-checking or rather --enable-checking=extra. Regards Robin

Re: [PATCH] vect: Add a popcount fallback.

2023-08-08 Thread Robin Dapp via Gcc-patches
omparable and that's probably due to the lack of a proper search term. Also, I figured the 2-byte repeating sequences might be trickier anyway and therefore kept it as is. If you find it too cumbersome I can look for an alternative. Right now it closely matches what the example C code says wh

[PATCH] vect: Add a popcount fallback.

2023-08-07 Thread Robin Dapp via Gcc-patches
Hi, This patch adds a fallback when the backend does not provide a popcount implementation. The algorithm is the same one libgcc uses, as well as match.pd for recognizing a popcount idiom. __builtin_ctz and __builtin_ffs can also rely on popcount so I used the fallback for them as well. Bootstr
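As a rough scalar illustration of what such a fallback computes, here is the widely known bit-parallel popcount for 32-bit values. It is not copied from the patch or from libgcc; the function names popcount32 and ctz32 are made up for this sketch, and the ctz identity shown is a common one that may differ from what the patch emits.

```c
#include <stdint.h>

/* Bit-parallel population count: sum adjacent bit fields of width
   1, 2 and 4, then add up the four byte counts with a multiply.  */
static uint32_t
popcount32 (uint32_t x)
{
  x = x - ((x >> 1) & 0x55555555u);                 /* 2-bit sums */
  x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* 4-bit sums */
  x = (x + (x >> 4)) & 0x0f0f0f0fu;                 /* 8-bit sums */
  return (x * 0x01010101u) >> 24;                   /* total in top byte */
}

/* Count trailing zeros via popcount (assumed identity for this sketch):
   isolate the lowest set bit, subtract one to get a mask of the zeros
   below it, and count that mask.  Yields 32 for x == 0.  */
static uint32_t
ctz32 (uint32_t x)
{
  return popcount32 ((x & (0u - x)) - 1u);
}
```

A vectorized version would apply the same shifts, masks and adds element-wise.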

Re: [PATCH] RISC-V: Support VLS basic operation auto-vectorization

2023-08-07 Thread Robin Dapp via Gcc-patches
Hi Juzhe, thanks, looks good from my side. > +/* { dg-final { scan-assembler-times {vand\.vi\s+v[0-9]+,\s*v[0-9]+,\s*-16} > 42 } } */ > +/* { dg-final { scan-assembler-not {csrr} } } */ I was actually looking for a scan-assembler-not vsetvli... but the csrr will do as well. Regards Robin

[PATCH] fwprop: Allow UNARY_P and check register pressure.

2023-08-07 Thread Robin Dapp via Gcc-patches
Hi, originally inspired by the wish to transform vmv v3, a0 ; = vec_duplicate vadd.vv v1, v2, v3 into vadd.vx v1, v2, a0 via fwprop for riscv, this patch enables the forward propagation of UNARY_P sources. As this involves potentially replacing a vector register with a scalar register the

Re: [PATCH V2] RISC-V: Support CALL conditional autovec patterns

2023-08-03 Thread Robin Dapp via Gcc-patches
Hi Juzhe, I would find it a bit clearer if the prepare_ternay part were a separate patch. As it's mostly mechanical replacements I don't mind too much, though so it's LGTM from my side without that. As to the lmul = 8 ICE, is the problem that the register allocator would actually need 5 "registe

Re: [PATCH] RISC-V: Implement vector "average" autovec pattern.

2023-08-02 Thread Robin Dapp via Gcc-patches
> 1. How do you model round to +Inf (avg_floor) and round to -Inf (avg_ceil) ? That's just specified by the +1 or the lack of it in the original pattern. Actually the IFN is just a detour because we would create perfect code if not for the fallback. But as there is currently no way to check for

[PATCH] RISC-V: Implement vector "average" autovec pattern.

2023-08-01 Thread Robin Dapp via Gcc-patches
Hi, this patch adds vector average patterns op[0] = (narrow) ((wide) op[1] + (wide) op[2]) >> 1; op[0] = (narrow) ((wide) op[1] + (wide) op[2] + 1) >> 1; If there is no direct support, the vectorizer can synthesize the patterns but, presumably due to lack of narrowing operation support, won't
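A scalar sketch of the two averaging forms may help here. The element types (uint8_t as the narrow type, uint16_t as the wide type) and the function names are assumptions for this example; the patch itself concerns vector modes.

```c
#include <stdint.h>

/* Floor average: add in a wider type so the sum cannot overflow,
   shift right by one, then narrow the result.  */
static uint8_t
avg_floor_u8 (uint8_t a, uint8_t b)
{
  return (uint8_t) (((uint16_t) a + (uint16_t) b) >> 1);
}

/* Ceil average: same as above, with +1 before the shift so that
   odd sums round up instead of down.  */
static uint8_t
avg_ceil_u8 (uint8_t a, uint8_t b)
{
  return (uint8_t) (((uint16_t) a + (uint16_t) b + 1) >> 1);
}
```

The only difference between the two forms is the +1, which matches the remark elsewhere in the thread that the rounding direction is specified by the +1 or the lack of it.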

Re: RISCV test infrastructure for d / v / zfh extensions

2023-08-01 Thread Robin Dapp via Gcc-patches
Hi Joern, thanks, I believe this will help with testing. > +proc check_effective_target_riscv_v { } { > +return [check_no_compiler_messages riscv_ext_v assembly { > + #ifndef __riscv_v > + #error "Not __riscv_v" > + #endif > +}] > +} This can be replaced by riscv_vector

Re: [PATCH V2] RISC-V: Support POPCOUNT auto-vectorization

2023-07-31 Thread Robin Dapp via Gcc-patches
>>> I'm not against continuing with the more well-known approach for now >>> but we should keep in mind that might still be potential for improvement. > > No. I don't think it's faster. I did a quick check on my x86 laptop and it's roughly 25% faster there. That's consistent with the literature.

Re: [PATCH V2] RISC-V: Support POPCOUNT auto-vectorization

2023-07-31 Thread Robin Dapp via Gcc-patches
> +/* FIXME: We don't allow vectorize "__builtin_popcountll" yet since it needs > "vec_pack_trunc" support > + and such pattern may cause inferior codegen. > + We will enable "vec_pack_trunc" when we support reasonable vector > cost model. */ Wait, why do we need vec_pack_trunc f

Re: [PATCH V2] RISC-V: Support POPCOUNT auto-vectorization

2023-07-31 Thread Robin Dapp via Gcc-patches
Hi Juzhe, > +/* Expand Vector POPCOUNT by parallel popcnt: > + > + int parallel_popcnt(uint32_t n) { > + #define POW2(c) (1U << (c)) > + #define MASK(c) (static_cast(-1) / (POW2(POW2(c)) + 1U)) > + #define COUNT(x, c) ((x) & MASK(c)) + (((x)>>(POW2(c))) & MASK(c)) > + n = CO

Re: [PATCH V2] RISC-V: Enable basic VLS auto-vectorization

2023-07-30 Thread Robin Dapp via Gcc-patches
> +;; - > +;; Duplicate Operations > +;; - > + > +(define_insn_and_split "@vec_duplicate" > + [(set (match_operand:VLS 0 "register_operand") > +(vec_duplicat

Re: [PATCH v2] RISC-V: convert the mulh with 0 to mov 0 to the reg.

2023-07-28 Thread Robin Dapp via Gcc-patches
> This is a draft patch. I would like to explain it's hard to make the > simplify generic and ask for some help. > > There're 2 categories we need to optimize. > > - The op in optab such as div / 1. > - The unspec operation such as mulh * 0, (vadc+vmadc) + 0. > > Especially for the unspec operat

[PATCH] gcse: Extract reg pressure handling into separate file.

2023-07-28 Thread Robin Dapp via Gcc-patches
From 65e69834eeb08ba093786e386ac16797cec4d8a7 Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Mon, 24 Jul 2023 16:25:38 +0200 Subject: [PATCH] gcse: Extract reg pressure handling into separate file. This patch extracts the hoist-pressure handling from gcse into a separate file so it can be used by other passes in the fut

Re: [PATCH v8] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-28 Thread Robin Dapp via Gcc-patches
Hi Pan, thanks for your patience and your work. Apart from my general doubt whether mode-changing intrinsics are a good idea, I don't have other remarks that need fixing. What I mentioned before: - Handling of asms wouldn't be a huge change. It can be done in a follow-up patch of course but

Re: [PATCH v2] RISC-V: testsuite: Add vector_hw and zvfh_hw checks.

2023-07-27 Thread Robin Dapp via Gcc-patches
> LGTM, I just found this patch still on the list, I mostly tested with > qemu, so I don't think that is a problem before, but I realize it's a > problem when we run on a real board that does not support those > extensions. I think we can skip this one as I needed to introduce vector_hw and zvfh_h

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-27 Thread Robin Dapp via Gcc-patches
> I see, you mean at the beginning of frm_after, we can just return the > incoming mode as is? > > If (CALL_P (insn)) > return mode; // Given we aware the mode is DYN_CALL already. Yes, potentially similar for all the other ifs but I didn't check all of them. > Thank and will cleanup this i

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-27 Thread Robin Dapp via Gcc-patches
>> Why do we appear to return a different mode here? We already request >> FRM_MODE_DYN_CALL in mode_needed. It looks like in the whole function >> we do not change the mode so we could just always return the incoming >> mode? > > Because we need to emit 2 insn when meet a call. One before the c

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-26 Thread Robin Dapp via Gcc-patches
> I would like to propose that being focus and moving forward for this > patch itself, the underlying other RVV floating point API support and > the RVV instrinsic API fully tests depend on this. Sorry, I didn't mean to ditch LCM/mode switching. I believe it is doing a pretty good job and we shou

Re: [PATCH] RISC-V: Enable basic VLS modes support

2023-07-26 Thread Robin Dapp via Gcc-patches
Hi Juzhe, just some small remarks, all in all no major concerns. > + vmv%m1r.v\t%0,%1" > + "&& (!register_operand (operands[0], mode) > + || !register_operand (operands[1], mode))" > + [(const_int 0)] > + { > +unsigned size = GET_MODE_BITSIZE (mode).to_constant (); > +if (size

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-26 Thread Robin Dapp via Gcc-patches
> CSR write could be expensive, it will flush whole pipeline in some > RISC-V core implementation… Hopefully not flush but just sequentialize but yes, it's usually a performance concern. However if we set the rounding mode to something else for an intrinsic and then call a function we want to re

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-26 Thread Robin Dapp via Gcc-patches
> current llvm didn't do any pre optimization. They always > backup+restore for each rounding mode intrinsic I see. There is still the option of lazily restoring the (entry) FRM before a function call but not read the FRM after every call. Do we have any data on how good or bad the mode-switchi

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-26 Thread Robin Dapp via Gcc-patches
So after thinking about it again - I'm still not really sure I like treating every function as essentially an fesetround. There is a reason why fesetround is special. Does LLVM behave the same way? But supposing we really, really want it and assuming there's consensus: + start_sequence (); + e

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-25 Thread Robin Dapp via Gcc-patches
> The call fesetround could be any function in practice, and we never > know if that function might use dynamic rounding mode floating point > operation or not, also we don't know if it will be called fesetround > or not. > > So that's why we want to restore before function call to make sure we >

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-25 Thread Robin Dapp via Gcc-patches
Hi Pan, > Given we have a call, we would like to restore before call and then > backup frm after call. Looks current mode switching cannot emit insn > like that, it can only either emit insn before (mostly) or after > (when NOTE_INSN_BASIC_BLOCK_P). Thus, we try to emit the one after > call when n

Re: [PATCH] RISC-V: Fixbug for fsflags instruction error using immediate.

2023-07-24 Thread Robin Dapp via Gcc-patches
Hi Jin, this looks reasonable. Would you mind adding (small) test cases still to make sure we don't accidentally reintroduce the problem? Regards Robin

Re: [PATCH v6] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-24 Thread Robin Dapp via Gcc-patches
Hi Pan, > + for (insn = PREV_INSN (cur_insn); insn; insn = PREV_INSN (insn)) > +{ > + if (INSN_P (insn)) > + { > + if (CALL_P (insn)) > + mode = FRM_MODE_DYN; > + break; > + } > + > + if (insn == BB_HEAD (bb)) > + break; > +} > + > + return mode;

Re: [PATCH v2] vect: Handle demoting FLOAT and promoting FIX_TRUNC.

2023-07-20 Thread Robin Dapp via Gcc-patches
mode cannot hold the range so it still has a chance to fit in the next larger one. Bootstrap and testsuite are unchanged on x86, aarch64 and power and I'm going to commit the attached barring further remarks. Regards Robin From cabfa07256eafec4485304fe7639d8fd7512cf11 Mon Sep 17 00:00:0

Re: [PATCH V2] RISC-V: Support in-order floating-point reduction

2023-07-20 Thread Robin Dapp via Gcc-patches
> LGTM, but I would like make sure Robin is OK too Yes, LGTM as well. Regards Robin

Re: [PATCH] RISC-V: Support in-order floating-point reduction

2023-07-20 Thread Robin Dapp via Gcc-patches
> The UNORDERED enum will cause ICE since we have UNORDERED in rtx_code. > > Could you give me another enum name? I would have expected it to work when it's namespaced. Regards Robin

Re: [PATCH] RISC-V: Support in-order floating-point reduction

2023-07-20 Thread Robin Dapp via Gcc-patches
> +enum reduction_type > +{ > + UNORDERED_REDUDUCTION, > + FOLD_LEFT_REDUDUCTION, > + MASK_LEN_FOLD_LEFT_REDUDUCTION, > +}; There are redundant 'DU's here ;) Wouldn't it be sufficient to have an enum enum reduction_type { UNORDERED, FOLD_LEFT, MASK_LEN_FOLD_LEFT, }; ? Regards Robin

Re: [PATCH] VECT: Support floating-point in-order reduction for length loop control

2023-07-19 Thread Robin Dapp via Gcc-patches
Hi Juzhe, I just noticed that we recently started calling things MASK_LEN (instead of LEN_MASK before) with the reductions. Wouldn't we want to be consistent here? Especially as the length takes precedence. I realize the preparational work like optabs is already upstream but still wanted to brin

Re: [PATCH] RISC-V: Fix testcase failed when default -mcmodel=medany

2023-07-18 Thread Robin Dapp via Gcc-patches
Hi Lehua, > I think you are right, I would like to remove the `-mcmodel=medany` option and > relax assert from `__riscv_save/restore_4` to `__riscv_save/restore_(3|4)` to > let > this testcase not brittle on any -mcmodel. Then I'm also going to add another > testcase (I don't know how to run -ma

Re: [PATCH] RISC-V: Fix testcase failed when default -mcmodel=medany

2023-07-18 Thread Robin Dapp via Gcc-patches
Hi Lehua, > I think the purpose of this testcase is to check whether the modifications to > the stack frame are as expected, so it is necessary to specify exactly whether > three or four registers are saved. But I think its need to add another > testcase > which use another option -mcmodel=medany

Re: [PATCH V2] RISC-V: Enable SLP un-order reduction

2023-07-18 Thread Robin Dapp via Gcc-patches
OK. Regards Robin

Re: [PATCH] RISC-V: Fix testcase failed when default -mcmodel=medany

2023-07-18 Thread Robin Dapp via Gcc-patches
Hi Lehua, > This patch fix testcase failed when I build RISC-V GCC with -mcmodel=medany > as default. If set to medany, stack_save_restore.c testcase will fail because > of > the reduced use of s3 registers in assembly (thus calling __riscv_save/store_3 > instead of __riscv_save/store_4). Explici

Re: [PATCH] RISC-V: Enable SLP un-order reduction

2023-07-18 Thread Robin Dapp via Gcc-patches
Hi Juzhe, > +;; - > +;; [INT,FP] Initialize from individual elements > +;; - > +;; Includes: > +;; - vslide1up.vx/vfslide1up.vf > +;; ---

[PATCH v2] vect: Handle demoting FLOAT and promoting FIX_TRUNC.

2023-07-14 Thread Robin Dapp via Gcc-patches
>>> Can you add testcases? Also the current restriction is because >>> the variants you add are not always correct and I don't see any >>> checks that the intermediate type doesn't lose significant bits? I didn't manage to create one for aarch64 nor for x86 because AVX512 has direct conversions e

Re: [PATCH V2] RISC-V: Enable COND_LEN_FMA auto-vectorization

2023-07-14 Thread Robin Dapp via Gcc-patches
Hi Juzhe, thanks, looks good to me now - did before already actually ;). Regards Robin

Re: [PATCH] RISC-V: Enable COND_LEN_FMA auto-vectorization

2023-07-13 Thread Robin Dapp via Gcc-patches
> Is COND _LEN FMA ok for trunk? I can commit it without changing > scatter store testcase fix. > > It makes no sense block cond Len fma support. The middle end support > has already been merged. Then just add a TODO or so that says e.g. "For some reason we exceed the default code model's +-2

Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-13 Thread Robin Dapp via Gcc-patches
From my understanding, we dont have RVV instruction for fmax/fmin? > > Unless I'm misunderstanding, we do. The ISA manual says > > === Vector Floating-Point MIN/MAX Instructions > > The vector floating-point `vfmin` and `vfmax` instructions have the > same behavior as the

Re: [PATCH] vect: Handle demoting FLOAT and promoting FIX_TRUNC.

2023-07-13 Thread Robin Dapp via Gcc-patches
> Can you add testcases? Also the current restriction is because > the variants you add are not always correct and I don't see any > checks that the intermediate type doesn't lose significant bits? The testcases I wanted to add with a follow-up RISC-V patch but I can also try an aarch64 one. So

Re: [PATCH] RISC-V: Enable COND_LEN_FMA auto-vectorization

2023-07-13 Thread Robin Dapp via Gcc-patches
Hi Juzhe, thanks, no complaints from my side apart from one: > +/* { dg-additional-options "-mcmodel=medany" } */ Please add a comment why we need this. Regards Robin

[PATCH] vect: Handle demoting FLOAT and promoting FIX_TRUNC.

2023-07-13 Thread Robin Dapp via Gcc-patches
Hi, the recent changes that allowed multi-step conversions for "non-packing/unpacking", i.e. modifier == NONE targets included promoting to-float and demoting to-int variants. This patch adds demoting to-float and promoting to-int handling. Bootstrapped and regtested on x86 and aarch64. A quest

Re: [PATCH] Add VXRM enum

2023-07-12 Thread Robin Dapp via Gcc-patches
> +enum __RISCV_VXRM { > + __RISCV_VXRM_RNU = 0, > + __RISCV_VXRM_RNE = 1, > + __RISCV_VXRM_RDN = 2, > + __RISCV_VXRM_ROD = 3, > +}; > + > __extension__ extern __inline unsigned long > __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) > vread_csr(enum RVV_CSR csr) We have

Re: [PATCH] RISC-V: Support COND_LEN_* patterns

2023-07-12 Thread Robin Dapp via Gcc-patches
Hi Juzhe, > +/* Return true if the operation is the floating-point operation need FRM. */ > +static bool > +need_frm_p (rtx_code code, machine_mode mode) > +{ > + if (!FLOAT_MODE_P (mode)) > +return false; > + return code != SMIN && code != SMAX; > +} Return true if the operation requires

Re: [PATCH] [vect]Use intermiediate integer type for float_expr/fix_trunc_expr when direct optab is not existed.

2023-07-12 Thread Robin Dapp via Gcc-patches
> int32_t x = (int32_t)0x1.0p32; > int32_t y = (int32_t)(int64_t)0x1.0p32; > > sets x to 2147483647 and y to 0. >>> >>> Hmm, good question. GENERIC has a direct truncation to unsigned char >>> for example, the C standard generally says if the integral part cannot >>> be represented
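To make the quoted difference concrete, here is a minimal C sketch of the two conversion paths. It is not taken from the patch: the helper names `direct_saturating` and `via_int64` are hypothetical. A direct out-of-range floating-point to integer conversion is undefined in C, so the sketch writes the saturation out explicitly in order to model the 2147483647 result reported above, and assumes the input fits in `int64_t` for the two-step path.

```c
#include <stdint.h>

/* Hypothetical helper: models a direct double -> int32_t conversion that
   saturates when the value is out of range.  The plain cast is undefined
   for such inputs in C, so the range checks are spelled out.  */
static int32_t direct_saturating(double d)
{
    if (d != d)                     /* NaN: pick 0 for this sketch */
        return 0;
    if (d >= 2147483648.0)          /* above INT32_MAX after truncation */
        return INT32_MAX;
    if (d <= -2147483649.0)         /* below INT32_MIN after truncation */
        return INT32_MIN;
    return (int32_t)d;              /* in range: truncates toward zero */
}

/* Hypothetical helper: converts through int64_t first, then keeps only the
   low 32 bits.  Assumes d is within the range of int64_t.  */
static int32_t via_int64(double d)
{
    int64_t wide = (int64_t)d;
    uint32_t low = (uint32_t)(uint64_t)wide;    /* well-defined modulo 2^32 */

    /* Reinterpret the low 32 bits as a signed value without relying on an
       implementation-defined narrowing conversion.  */
    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;
    return (int32_t)(low - 0x80000000u) + INT32_MIN;
}
```

With these helpers, `direct_saturating(0x1.0p32)` yields 2147483647 while `via_int64(0x1.0p32)` yields 0, which is why an intermediate integer type is only safe when it cannot drop significant bits.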

[PATCH v2] genopinit: Allow more than 256 modes.

2023-07-11 Thread Robin Dapp via Gcc-patches
Attached is v2 that does not switch to uint64_t but stays within 32 bits by shifting the optab by 20 and the mode(s) by 10 bits. Regards Robin Upcoming changes for RISC-V will have us exceed 255 modes or 8 bits. This patch increases the limit to 10 bits and adjusts the hashing function for the g
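As a rough illustration of the layout described in this message, the sketch below packs an optab index and two mode numbers into one 32-bit key by shifting the optab by 20 bits and the first mode by 10 bits. The function names and the order of the two modes are assumptions for illustration, not the actual genopinit code. With this split, each mode gets 10 bits and the optab gets the remaining 12.

```c
#include <stdint.h>

#define MODE_BITS 10u
#define MODE_MASK ((1u << MODE_BITS) - 1u)

/* Hypothetical range check: 12 bits for the optab, 10 bits per mode.  */
static int key_fits(uint32_t optab, uint32_t mode1, uint32_t mode2)
{
    return optab < (1u << 12) && mode1 <= MODE_MASK && mode2 <= MODE_MASK;
}

/* Hypothetical packing: optab in bits 20..31, first mode in bits 10..19,
   second mode in bits 0..9.  Callers should check key_fits first.  */
static uint32_t pack_key(uint32_t optab, uint32_t mode1, uint32_t mode2)
{
    return (optab << 20) | (mode1 << MODE_BITS) | mode2;
}

/* Accessors that undo the packing.  */
static uint32_t key_optab(uint32_t key) { return key >> 20; }
static uint32_t key_mode1(uint32_t key) { return (key >> MODE_BITS) & MODE_MASK; }
static uint32_t key_mode2(uint32_t key) { return key & MODE_MASK; }
```

A mode number such as 300, which no longer fits in 8 bits, round-trips through `pack_key` and `key_mode1` unchanged, while the whole key still fits in an unsigned 32-bit value.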

[PATCH] genopinit: Allow more than 256 modes.

2023-07-11 Thread Robin Dapp via Gcc-patches
Ok so the consensus seems to rather stay with 32 bits and only change the shift to 10/20? As MACHINE_MODE_BITSIZE is already 16 we would need an additional check independent of that. Wouldn't that also be a bit confusing? Attached is a "v2" with unsigned long long changed to uint64_t and checking

Re: [PATCH] genopinit: Allow more than 256 modes.

2023-07-11 Thread Robin Dapp via Gcc-patches
> if (NUM_OPTABS > 0x > || MAX_MACHINE_MODE >= ((1 << MACHINE_MODE_BITSIZE) - 1)) > fatal ("genopinit range assumptions invalid"); > > so it would be a case of changing those instead. Thanks, right at the beginning of the file and I didn't see it ;) MACHINE_MODE_BITSIZE is already 1

Re: [PATCH] RISC-V: Optimize permutation codegen with vcompress

2023-07-11 Thread Robin Dapp via Gcc-patches
> MASK4 0, 5, 6, 7 also works definitely Sure :) My remark was that the tests are all(?) evenly split and a bit more variation would have been nice. Not that it doesn't work, I'm OK with it as is. Regards Robin

Re: [PATCH] RISC-V: Optimize permutation codegen with vcompress

2023-07-11 Thread Robin Dapp via Gcc-patches
> The compress optimization pattern has included all variety. > It's not necessary to force split (half/half), we can apply this compress > pattern to any variety of compress pattern. Yes, that's clear. I meant the testcases are mostly designed like MASK4 1, 2, 6, 7 instead of variation like M

Re: [PATCH] RISC-V: Optimize permutation codegen with vcompress

2023-07-11 Thread Robin Dapp via Gcc-patches
Hi Juzhe, looks good from my side, thanks. While going through it I thought of some related cases that we could still handle differently but I didn't bother to formalize them for now. Most likely we already handle them in the shortest way anyway. I'm going to check on that when I find some time

[PATCH] genopinit: Allow more than 256 modes.

2023-07-11 Thread Robin Dapp via Gcc-patches
Hi, upcoming changes for RISC-V will have us exceed 256 modes or 8 bits. The helper functions in gen* rely on the opcode as well as two modes fitting into an unsigned int (a signed int even if we consider the qsort default comparison function). This patch changes the type of the index/hash from u

Re: [PATCH V4] RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-07 Thread Robin Dapp via Gcc-patches
Hi Juzhe, thanks, the somewhat unified modulo is IMHO more readable. Could probably still be improved but OK with me for now. > + if (is_dummy_len) > + { > + rtx dummy_len = gen_reg_rtx (Pmode); Can we call this is_vlmax_len/is_vlmax and vlmax_len or so? > + if (inner

Re: [PATCH V3] RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-07 Thread Robin Dapp via Gcc-patches
Hi Juzhe, thanks, that's quite a chunk :) and it took me a while to go through it. > @@ -564,7 +565,14 @@ const_vec_all_in_range_p (rtx vec, poly_int64 minval, > poly_int64 maxval) > static rtx > gen_const_vector_dup (machine_mode mode, poly_int64 val) > { > - rtx c = gen_int_mode (val, GET_

Re: [PATCH v5] RISC-V: Fix one bug for floating-point static frm

2023-07-06 Thread Robin Dapp via Gcc-patches
Hi Pan, thanks, I think that works for me as I'm expecting these parts to change a bit anyway in the near future. There is no functional change to the last revision that Kito already OK'ed so I think you can go ahead. Regards Robin
