Re: [PATCH] RISC-V: Implement autovec abs, vneg, vnot.

2023-05-19 Thread Robin Dapp via Gcc-patches
>>> +  TAIL_UNDEFINED = -1, >>> +  MASK_UNDEFINED = -1, > Why you add this ? > >>> +  void add_policy_operands (enum tail_policy vta = TAIL_UNDEFINED, >>> +     enum mask_policy vma = MASK_UNDEFINED) > No, you should just specify this as TAIL_ANY or MASK_ANY as default value. That's the value I

[PATCH] RISC-V: Implement autovec abs, vneg, vnot.

2023-05-19 Thread Robin Dapp via Gcc-patches
Hi, this patch implements autovec expanders of abs2, vneg2 and vnot2 for integers. I also tried to refactor the helper code in riscv-v.cc a bit. Guess it's not enough to warrant a separate patch though. Regards Robin gcc/ChangeLog: * config/riscv/autovec.md (2): Fix typo.

[PATCH] RISC-V: testsuite: Remove empty *-run-template.h.

2023-05-19 Thread Robin Dapp via Gcc-patches
Hi, this obvious patch removes empty run template files and one redundant stdio.h include. Regards Robin gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/shift-run.c: Do not include . * gcc.target/riscv/rvv/autovec/binop/shift-run-template.h: Removed.

[PATCH] RISC-V: Allow more loading of const vectors.

2023-05-19 Thread Robin Dapp via Gcc-patches
Hi, this fixes a rebase oversight regarding the loading of vector constants. Added another test to properly catch that in the future. Regards Robin gcc/ChangeLog: * config/riscv/riscv.cc (riscv_const_insns): Remove else. gcc/testsuite/ChangeLog: *

Re: [PATCH] RISC-V: Adjust stdint.h to stdint-gcc.h for rvv tests

2023-05-17 Thread Robin Dapp via Gcc-patches
> Huh, including stdint-gcc.h looks completely wrong. What's the issue you are > trying to solve? The way I understood it is that that's a temporary workaround until all multilib et al. (+testsuite) configurations are in place but I haven't checked the details myself. Eventually this should be

Re: [PATCH] RISC-V: Adjust stdint.h to stdint-gcc.h for rvv tests

2023-05-16 Thread Robin Dapp via Gcc-patches
> This patch would like to align the stdint.h to the stdint-gcc.h for all > the RVV test files. Aka: > > stdint.h => stdint-gcc.h Looks good. Jeff already pre-approved so you can go ahead and install this on the trunk. Regards Robin

Re: [PATCH] RISC-V: Support TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT to optimize codegen of RVV auto-vectorization

2023-05-15 Thread Robin Dapp via Gcc-patches
> After this patch, RVV GCC by default support alignment of RVV modes > according to riscv-modes.def. In riscv-modes.def, we define each RVV > modes are element align which is aligned to RVV ISA spec. > > If you want to support other alignment, you should add tuning info > for this in the

Re: [PATCH] RISC-V: Support TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT to optimize codegen of RVV auto-vectorization

2023-05-15 Thread Robin Dapp via Gcc-patches
Hi, we need to discern what we want to achieve here. The goal might be to prevent the vectorizer from performing peeling or versioning for alignment. I realize the peeling code looks ugly but it's actually for a good cause when the target does not support misaligned vector access or only with

Re: [PATCH] RISC-V: Using merge approach to optimize repeating sequence in vec_init

2023-05-12 Thread Robin Dapp via Gcc-patches
> emit_merge_op can not be wrapped into binop since mask position is > different in pattern. > > I prefer merge op in different wrapper. Yes, I didn't mean literally the same but that things already become a bit confusing with all the different variants and bool arguments or code duplication

Re: [PATCH] RISC-V: Using merge approach to optimize repeating sequence in vec_init

2023-05-12 Thread Robin Dapp via Gcc-patches
Hi, in general LGTM, just minor nits and comments. > - void set_len_and_policy (rtx len, bool force_vlmax = false) > -{ > - bool vlmax_p = force_vlmax; > - gcc_assert (has_dest); > + void set_len_and_policy (rtx len, bool force_vlmax = false, bool ta_p = > true, > +

Re: [PATCH] RISC-V: Fix fail of vmv-imm-rv64.c in rv32

2023-05-12 Thread Robin Dapp via Gcc-patches
>> After update local codebase to the trunk. I realize there is one more fail >> in RV32. >> After this patch, all fails of RVV are cleaned up. >> Thanks. But only because we build vmv-imm with autovec-preference=scalable. With fixed-vlmax it still does not work because I messed up the rebase

Re: [PATCH] RISC-V: Fix RVV binary auto-vectorizaiton test fails

2023-05-12 Thread Robin Dapp via Gcc-patches
> ok, thanks :) This has likely been discussed at length before, but why need to specify the additional -mabi with -march (instead of -march implying a matching abi)?

Re: [PATCH v2] RISC-V: Add vector_scalar_shift_operand

2023-05-12 Thread Robin Dapp via Gcc-patches
> The vector shift immediates happen to have the same constraints as some > of the CSR-related operands, but it's a different usage. This adds a > name for them, so I don't get confused again next time. > > gcc/ChangeLog: > > * config/riscv/autovec.md (shifts): Use >

[PATCH v2] RISC-V: Allow vector constants in riscv_const_insns.

2023-05-11 Thread Robin Dapp via Gcc-patches
> OK, you can go ahead commit patch. I am gonna send another patch to > fix this. I agree that we should handle more constants but I'd still rather go ahead now and fix things later. The patch is more about the test rather than the actual change anyway. Jeff already ack'ed v1, maybe waiting for

[Commited] MAINTAINERS: Fix alphabetic sorting.

2023-05-11 Thread Robin Dapp via Gcc-patches
ChangeLog: * MAINTAINERS: Sort. --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 1c380bef5c5..e4dee76e2df 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -521,7 +521,6 @@ James Lemke Ilya

Re: [PATCH v2] RISC-V: Add vectorized binops and insn_expander helpers.

2023-05-11 Thread Robin Dapp via Gcc-patches
> LGTM. You should commit it now. Then I can rebase vec_init patch. Would need an ACK/OK from Kito at least :)

[PATCH v2] RISC-V: Split off shift patterns for autovectorization.

2023-05-11 Thread Robin Dapp via Gcc-patches
> "csr_operand" does seem wrong, though, as that just accepts constants. > Maybe "arith_operand" is the way to go? I haven't looked at the > V immediates though. I was pondering changing the shift-count operand to QImode everywhere but that indeed does not help code generation across the board.

[PATCH v2] RISC-V: Clarify vlmax and length handling.

2023-05-11 Thread Robin Dapp via Gcc-patches
Changes from v1: - Change subject to RISC-V ;) - Minor comment updates and rebasing. This patch tries to improve the wrappers that emit either vlmax or non-vlmax operations. Now, emit_len_op can be used to emit a regular operation. Depending on whether a length != NULL is passed either no

[PATCH v2] RISC-V: Add autovectorization tests for binary integer operations.

2023-05-11 Thread Robin Dapp via Gcc-patches
Changes from v1: - Split into run tests (guarded by riscv_vector) and compile tests which will be executed unconditionally. Doing dg-do run and -save-temps on a non-supported target will not do anything at all. This patch adds scan as well as execution tests for vectorized binary

[PATCH v2] RISC-V: Add vectorized binops and insn_expander helpers.

2023-05-11 Thread Robin Dapp via Gcc-patches
Changes from v1: - Rebase against Juzhe's vec_series patch. - Get rid of redundant scalar mode setting. This patch adds basic binary integer operations support. It is based on Michael Collison's work and makes use of the existing helpers in riscv-c.cc. It introduces emit_nonvlmax_binop

[PATCH] mklog.py: Add --commit option.

2023-05-11 Thread Robin Dapp via Gcc-patches
Hi, this patch allows mklog.py to be called with a commit hash directly. So, instead of git show | git gcc-mklog git gcc-mklog --commit can be used. When no is given but --commit is specified, HEAD is used instead. The behavior without --commit is the same as before. Is that useful/OK?

Re: [PATCH] riscv: Clarify vlmax and length handling.

2023-05-10 Thread Robin Dapp via Gcc-patches
It's somewhat common for mail clients to treat "--" as a signature delimiter, it's "---" that git uses as a comment delimiter. It's in my muscle memory somehow. Always did it that way because I didn't want the same delimiter as in the git part of the message. Time to change that habit I

Re: [PATCH] riscv: Add vectorized binops and insn_expander helpers.

2023-05-10 Thread Robin Dapp via Gcc-patches
> +  machine_mode op2mode = Pmode; > +  if (inner == E_QImode || inner == E_HImode || inner == E_SImode) > + op2mode = inner; This I added in order to match the scalar variants like [(set (match_operand:VI_QHS 0 "register_operand" "=vd,vd, vr, vr") (if_then_else:VI_QHS

[PATCH] riscv: Clarify vlmax and length handling.

2023-05-10 Thread Robin Dapp via Gcc-patches
Hi, this patch tries to improve the wrappers that emit either vlmax or non-vlmax operations. Now, emit_len_op can be used to emit a regular operation. Depending on whether a length != NULL is passed either no VLMAX flags are set or we emit a vsetvli and set VLMAX flags. The patch also adds

[PATCH] riscv: Add autovectorization tests for binary integer

2023-05-10 Thread Robin Dapp via Gcc-patches
Hi, this patch adds scan as well as execution tests for vectorized binary integer operations. It is based on Michael Collison's work and also includes scalar variants. The tests are not fully comprehensive as the vector type promotions (vec_unpack, extend etc.) are not implemented yet. Also,

[PATCH] riscv: Split off shift patterns for autovectorization.

2023-05-10 Thread Robin Dapp via Gcc-patches
Hi, this patch splits off the shift patterns of the binop patterns. This is necessary as the scalar shifts require a Pmode operand as shift count. To this end, a new iterator any_int_binop_no_shift is introduced. At a later point when the binops are split up further in commutative and

[PATCH] riscv: Add vectorized binops and insn_expander helpers.

2023-05-10 Thread Robin Dapp via Gcc-patches
Hi, this patch adds basic binary integer operations support. It is based on Michael Collison's work and makes use of the existing helpers in riscv-c.cc. It introduces emit_nonvlmax_binop which, in turn, uses emit_pred_binop. Setting the destination as well as the mask and the length is

Re: [PATCH V3] RISC-V: Enable basic RVV auto-vectorization support.

2023-05-05 Thread Robin Dapp via Gcc-patches
Hi Juzhe, I wasn't yet able to check this locally so just some minor comment nits: > +/* Return the vectorization machine mode for RVV according to LMUL. */ > +machine_mode > +preferred_simd_mode (scalar_mode mode) > +{ > + /* We only enable auto-vectorization when TARGET_MIN_VLEN < 128 && > +

[PATCH] riscv: Allow vector constants in riscv_const_insns.

2023-04-28 Thread Robin Dapp via Gcc-patches
Hi, I figured I'm going to start sending some patches that build on top of the upcoming RISC-V autovectorization. This one is obviously not supposed to be installed before the basic support lands but it's small enough that it shouldn't hurt to send it now. This patch allows vector constants in

Re: [PATCH v4 05/10] RISC-V:autovec: Add autovectorization patterns for binary integer operations

2023-04-26 Thread Robin Dapp via Gcc-patches
Hi Michael, I have the diff below for the binops in my tree locally. Maybe something like this works for you? Untested but compiles and the expander helpers would need to be fortified obviously. Regards Robin -- gcc/ChangeLog: * config/riscv/autovec.md (3): New binops expander.

Re: [RFA] [PR target/108248] [RISC-V] Break down some bitmanip insn types

2023-04-21 Thread Robin Dapp via Gcc-patches
> ../../gcc/config/riscv/generic.md:28:1: unknown value `smin' for attribute > `type' > make[3]: *** [Makefile:2528: s-attrtab] Error 1 > >From 582c428258ce17ffac8ef1b96b4072f3d510480f Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Fri, 21 Apr 2023 09:38:06 +0200 Subject: [PATCH] riscv: Fix

Re: [PATCH 2/3 V2] RISC-V: Enable basic auto-vectorization for RVV

2023-04-20 Thread Robin Dapp via Gcc-patches
> Can you give more comments about Robin's opinion that he want to change into > "fixed" vs "varying" or "fixed vector size" vs "dynamic vector size" ? It's not necessary to decide on this now as --params are not supposed to be stable and can be changed quickly. I was just curious if this had

Re: [PATCH 2/3 V2] RISC-V: Enable basic auto-vectorization for RVV

2023-04-20 Thread Robin Dapp via Gcc-patches
> $ riscv64-unknown-linux-gnu-gcc > --param=riscv-autovec-preference=fixed-vlmax > gcc/testsuite/gcc.target/riscv/rvv/base/spill-10.c -O2 -march=rv64gcv > -S > ../riscv-gnu-toolchain-trunk/riscv-gcc/gcc/testsuite/gcc.target/riscv/rvv/base/spill-10.c: > In function 'stach_check_alloca_1': >

Re: [PATCH] VECT: Add WHILE_LEN pattern for decrement IV support for auto-vectorization

2023-04-12 Thread Robin Dapp via Gcc-patches
>> I think we can CC IBM folks to see whether we can make WHILE_LEN works >> for both IBM and RVV ? > > I've CCed them. Adding WHILE_LEN support to rs6000/s390x would be > mainly the "easy" way to get len-masked (epilog) loop support. I've > figured actually implementing WHILE_ULT for AVX512

Re: [committed] testsuite: Fix up syntax errors in scan-tree-dump-times target selectors

2023-03-06 Thread Robin Dapp via Gcc-patches
Hi, > This broke the tests, I'm seeing syntax errors: > ERROR: gcc.dg/vect/slp-3.c -flto -ffat-lto-objects: error executing dg-final: > syntax error in target selector "target ! vect_partial_vectors || vect32 || > s390_vx" > ERROR: gcc.dg/vect/slp-3.c: error executing dg-final: syntax error

[PATCH] testsuite: Do not expect partial vectorization for s390.

2023-03-02 Thread Robin Dapp via Gcc-patches
Hi, this patch changes SLP test expectations. As we only vectorize when no more than one rgroup is present, no vectorization is performed. I was also considering using a separate target selector (something like vect_partial_vectors_bias_m1) but as the number of testcases is limited that would

[PATCH] s390: Use arch14 instead of z16 for -march=native.

2023-03-02 Thread Robin Dapp via Gcc-patches
Hi, When compiling on a system where binutils do not yet support the 'z16' name assembling fails with -march=native which we currently interpret as -march=z16 (on a z16 machine). This patch uses -march=arch14 instead. Is it OK? Regards Robin -- gcc/ChangeLog: *

[PATCH] s390: Fix ifcvt test cases

2023-03-02 Thread Robin Dapp via Gcc-patches
Hi, we seem to flip flop between the "high" and "not low" variants of load on condition. Accept both in the affected test cases. Going to commit this as obvious. Regards Robin -- gcc/testsuite/ChangeLog: * gcc.target/s390/ifcvt-two-insns-bool.c: Allow "high" and "not low or

Re: [PATCH] s390: Add LEN_LOAD/LEN_STORE support.

2023-02-27 Thread Robin Dapp via Gcc-patches
> Do you really need a copy of the address register? Couldn't you just do a > src = adjust_address (operands[1], BLKmode, 0); > You create a paradoxical subreg of the QImode input but vll actually > uses the whole 32 bit value. Couldn't we end up with uninitialized > bytes being used as part of

[PATCH] s390: Add LEN_LOAD/LEN_STORE support.

2023-02-02 Thread Robin Dapp via Gcc-patches
Hi, this patch adds LEN_LOAD/LEN_STORE support for z14 and newer. It defines a bias value of -1 and implements the LEN_LOAD and LEN_STORE optabs. It also includes various vll/vstl testcases adapted from Kewen Lin's patch for Power. Bootstrapped and regtested on z13-z16. Is it OK? Regards

Re: [PATCH] ifcvt.cc: Prevent excessive if-conversion for conditional moves

2023-01-11 Thread Robin Dapp via Gcc-patches
Hi, > On optimizing for speed, default_noce_conversion_profitable_p() allows > plenty of headroom, so this patch has little impact. > > Also, if the target-specific cost estimate is accurate or allows for > margins, the impact should be similarly small. I believe this part of ifcvt does/did not

Re: [RFC] postreload cse'ing vector constants

2022-11-03 Thread Robin Dapp via Gcc-patches
Should we go ahead with this, i.e. push the change and wait for fallout? I guess we're still early enough in the cycle for that. There are no regressions anymore on s390, Power9, x86 and aarch64 (at least on the farm machines I checked). Regards Robin

Re: optabs: Variable index vec_set

2022-11-02 Thread Robin Dapp via Gcc-patches
> IIRC, I was trying to "fix" modeless operand by giving it a mode, but > since it made no difference for x86, I later dropped the patch. > However, operand with a known mode is preferred, so if it works for > you, just include my patch in your submission. My patch is somehow > trivial if we want

Re: optabs: Variable index vec_set

2022-11-02 Thread Robin Dapp via Gcc-patches
Hi, > With the patch my local changes to make better use of vec_set work > nicely even though I haven't done a full bootstrap yet. Were there > other issues with the patch or can it still be applied? I performed a bootstrap as well as a regtest with -march=z16 on s390. There is no new fallout.

optabs: Variable index vec_set

2022-10-31 Thread Robin Dapp via Gcc-patches
Hi, I'm looking into vec_set with variable index on s390. Uros posted a patch [1] that did not make it upstream in Nov 2020. It changed the mode of the index operand to whatever the target supports in can_vec_set_var_idx_p. I missed it back then but we indeed do not make proper use of vec_set

Re: [PATCH] expand: Convert cst - x into cst xor x.

2022-10-21 Thread Robin Dapp via Gcc-patches
> Do we have evidence that targets properly cost XOR vs SUB RTXen? > > It might actually be a reload optimization - when the constant is > available in a register use 'sub', when it needs to be reloaded > use 'xor'? > > That said, I wonder if the fallout of changing some SUB to XOR > is bigger

[PATCH] s390: Fix bootstrap error with checking and -m31

2022-10-19 Thread Robin Dapp via Gcc-patches
Hi, since r13-2746 we hit an ICE when bootstrapping with -m31 and --enable-checking=all. ../../../../libgfortran/ieee/ieee_helper.c: In function 'ieee_class_helper_16': ../../../../libgfortran/ieee/ieee_helper.c:77:3: internal compiler error: RTL check: expected code 'reg', have 'subreg' in

Re: [RFC] postreload cse'ing vector constants

2022-09-28 Thread Robin Dapp via Gcc-patches
> I opened: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107061 The online docs for encodekey256 also say XMM4 through XMM6 are reserved for future usages and software should not rely upon them being zeroed. I believe we also zero there. > This sounds like an issue. So with your patch

Re: [RFC] postreload cse'ing vector constants

2022-09-27 Thread Robin Dapp via Gcc-patches
> I did bootstrapping and ran the testsuite on x86(-64), aarch64, Power9 > and s390. Everything looks good except two additional fails on x86 > where code actually looks worse. > > gcc.target/i386/keylocker-encodekey128.c > > 17c17,18 > < movaps %xmm4, k2(%rip) > --- >> pxor

Re: VN, len_store and endianness

2022-09-27 Thread Robin Dapp via Gcc-patches
> Yes, because the native_interpret always starts at offset zero > (we can't easily feed in a "shifted" RHS). So what I assumed is > that IFN_LEN_STORE always stores elements [0, len + adj]. Hmm, but this assumption is not violated here or am I missing something? It's not like we're storing

Re: VN, len_store and endianness

2022-09-27 Thread Robin Dapp via Gcc-patches
> The error is probably in vn_reference_lookup_3 which assumes that > 'len' applies to the vector elements in element order. See the part > of the code where it checks for internal_store_fn_p. If 'len' is with > respect to the memory and thus endianess has to be taken into > account then for the

VN, len_store and endianness

2022-09-26 Thread Robin Dapp via Gcc-patches
Hi, I'm locally testing a branch that enables vll/vstl for partial vector usage i.e. len_load and len_store on s390. I see a FAIL in testsuite/gfortran.dg/power_3.f90. Since r13-1777-gbd9837bc3ca134 we also perform VN for masked/len stores and things go wrong there. The problem seems to be

Re: Basic REG_EQUIV comprehension question

2022-09-15 Thread Robin Dapp via Gcc-patches
> Yeah, rtx_costs (or preferably insn_cost, if that works) seem like the > best way of addressing this. If the target says that register moves are > cheaper than constant moves then it's a feature that CSE & co remove > duplicate constants. The REG_EQUIV note is still useful in those cases >

Re: Basic REG_EQUIV comprehension question

2022-09-15 Thread Robin Dapp via Gcc-patches
Small addition to clarify: (insn 8) from the example is of course matched to a vzero. The "problem" begins when (reg 64) is later moved into another register and the (const_vector) has been optimized to a single definition e.g. by CSE, i.e. we have several (insn yy (set (reg:V2DI xx) (reg:V2DI

Basic REG_EQUIV comprehension question

2022-09-15 Thread Robin Dapp via Gcc-patches
Hi, I have been working on making better use of s390's vzero instruction. Currently we rather zero a vector register once and load it into other registers via vlr instead of emitting multiple vzeros. At IRA/reload point we e.g. have (insn 8 5 19 2 (set (reg/v:V2DI 64 [ zero ])

Re: [RFC] postreload cse'ing vector constants

2022-09-08 Thread Robin Dapp via Gcc-patches
> Which is this from the mail archives: > > https://gcc.gnu.org/pipermail/gcc-patches/1998-June/000308.html > > I would tend to agree that for equal cost that the constant would be > preferred since that should be better from a scheduling/dependency > standpoint.   So it seems to me we can

Re: [RFC] postreload cse'ing vector constants

2022-09-07 Thread Robin Dapp via Gcc-patches
> Did you do any archeology into this code to see if there was any > history that might shed light on why it doesn't just use the costing > models? This one was buried under some dust :) commit 0254c56158b0533600ba9036258c11d377d46adf Author: John Carr Date: Wed Jun 10 06:00:50 1998 +

[RFC] postreload cse'ing vector constants

2022-09-07 Thread Robin Dapp via Gcc-patches
Hi, I recently looked into a sequence like vzero %v0 vlr %v2, %v0 vlr %v3, %v0. Ideally we would like to use vzero for all of these sets in order to not create dependencies. For some instances of this problem I found the offending snippet to be the postreload cse pass. If there is a non

Re: [PATCH] expand: Convert cst - x into cst xor x.

2022-09-07 Thread Robin Dapp via Gcc-patches
> The question is really whether xor or sub is "better" statically. I can't > think of any reasons. On s390, why does xor end up "better"? There is an xor with immediate (as opposed to no "subtract from immediate") which saves an instruction, usually. On x86, I think the usual argument for xor

Re: [PATCH] expand: Convert cst - x into cst xor x.

2022-09-06 Thread Robin Dapp via Gcc-patches
> cost might also depend on the context in case flag setting > behavior differs for xor vs sub (on x86 sub looks strictly more > powerful here). The same is probably true when looking for > a combination with another bitwise operation. > > Btw, why not perform the optimization in expand_binop?

[PATCH] expand: Convert cst - x into cst xor x.

2022-09-06 Thread Robin Dapp via Gcc-patches
Hi, posting this separately from PR91213 now. I wrote an s390 test and most likely it could also be done for x86 which will give it broader coverage. Depending on the backend it might be better to convert cst - x into cst xor x if cst + 1 is a power of two and 0 <= x <= cst. This patch
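
The identity behind this conversion can be checked in plain C. The following is only a minimal sketch of the arithmetic, not the code from the patch; the function names low_bit_mask_p, sub_from_cst and xor_matches_sub are made up for illustration. When cst + 1 is a power of two, cst consists of low-order one bits only, so subtracting any x in [0, cst] never borrows and simply clears the bits that are set in x, which is exactly what xor does.

```c
#include <stdbool.h>

/* Illustrative helper (hypothetical name): true if cst + 1 is a power
   of two, i.e. cst has the binary form 0...01...1.  */
bool low_bit_mask_p (unsigned cst)
{
  return (cst & (cst + 1)) == 0;
}

/* Compute cst - x, using xor whenever the precondition from the
   snippet above holds (cst + 1 is a power of two and 0 <= x <= cst).  */
unsigned sub_from_cst (unsigned cst, unsigned x)
{
  if (low_bit_mask_p (cst) && x <= cst)
    return cst ^ x;
  return cst - x;
}

/* Exhaustively compare both forms for every x in [0, cst].
   Intended for small cst only; the loop would not terminate for
   cst == UINT_MAX.  */
bool xor_matches_sub (unsigned cst)
{
  for (unsigned x = 0; x <= cst; x++)
    if ((cst ^ x) != cst - x)
      return false;
  return true;
}
```

For example, xor_matches_sub (7) holds because 7 is 0b111, whereas a constant such as 6 (0b110) already fails for x = 1, so the range and power-of-two conditions are both needed.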

[PATCH] testsuite/s390: Add -mzarch to ifcvt test cases.

2022-09-06 Thread Robin Dapp via Gcc-patches
Hi, this adds a missing -mzarch to some ifcvt test cases. Going to commit this as obvious in some days barring objections. Regards Robin gcc/testsuite/ChangeLog: * gcc.target/s390/ifcvt-one-insn-bool.c: Add -mzarch. * gcc.target/s390/ifcvt-one-insn-char.c: Dito. *

Re: [PATCH] s390: Recognize reverse/element swap permute patterns.

2022-08-31 Thread Robin Dapp via Gcc-patches
Hi, adding -save-temps as well as a '\t' in order for the tests to do what they are supposed to do. Going to push this as obvious in some days. Regards Robin -- gcc/testsuite/ChangeLog: * gcc.target/s390/vector/vperm-rev-z14.c: Add -save-temps. *

Re: [PATCH] s390: Recognize reverse/element swap permute patterns.

2022-08-22 Thread Robin Dapp via Gcc-patches
Hi, after discussing off-list, here is v2 of the patch. We now recognize if the permutation mask only refers to the first or the second operand and use this later when emitting vpdi. Regtested and bootstrapped, no regressions. Is it OK? Regards Robin >From

[PATCH] s390: Implement vec_set with vec_merge and vec_duplicate.

2022-08-12 Thread Robin Dapp via Gcc-patches
Hi, similar to other backends this patch implements vec_set via vec_merge and vec_duplicate instead of an unspec. This opens up more possibilities to combine instructions. Bootstrapped and regtested. No regressions. Is it OK? Regards Robin gcc/ChangeLog: * config/s390/s390.md:

[PATCH] s390: Implement vec_extract via vec_select.

2022-08-12 Thread Robin Dapp via Gcc-patches
Hi, vec_select can handle dynamic/runtime masks nowadays. Therefore we can get rid of the UNSPEC_VEC_EXTRACT that was preventing further optimizations like combining instructions with vec_extract patterns. Bootstrapped and regtested. No regressions. Is it OK? Regards Robin gcc/ChangeLog:

[PATCH] s390: Implement vec_revb(vector short)/bswapv8hi with verllh.

2022-08-12 Thread Robin Dapp via Gcc-patches
Hi, this patch implements a byte swap for a V8HImode vector via an element rotate by 8 bits. Bootstrapped and regtested, no regressions. Is it OK? Regards Robin gcc/ChangeLog: PR target/100867 * config/s390/vector.md: Add special case for V8HImode. gcc/testsuite/ChangeLog:
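
Why a rotate suffices here can be shown with scalar C. This is only a sketch of the idea, assuming a vector of eight 16-bit elements modeled as a plain array; rotl16_by_8 and bswap_v8hi are illustrative names, not functions from the patch. A 16-bit element has exactly two bytes, so rotating it by 8 bits exchanges them, which is the same as a byte swap of that element.

```c
#include <stdint.h>

/* Rotate a 16-bit value left by 8 bits.  With only two bytes per
   element this is identical to swapping the bytes.  */
uint16_t rotl16_by_8 (uint16_t v)
{
  return (uint16_t) ((v << 8) | (v >> 8));
}

/* Model of an element-wise byte swap on an eight-element vector of
   16-bit values: apply the rotate to each element independently.  */
void bswap_v8hi (uint16_t v[8])
{
  for (int i = 0; i < 8; i++)
    v[i] = rotl16_by_8 (v[i]);
}
```

The same trick does not carry over to wider elements: rotating a 32-bit element by 8 or 16 bits does not reverse its four bytes.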

[PATCH] s390: Recognize reverse/element swap permute patterns.

2022-08-12 Thread Robin Dapp via Gcc-patches
Hi, this adds functions to recognize reverse/element swap permute patterns for vler, vster as well as vpdi and rotate. Bootstrapped and regtested, no regressions. Is it OK? Regards Robin gcc/ChangeLog: * config/s390/s390.cc (expand_perm_with_vpdi): Recognize swap pattern.

[PATCH] s390: Use vpdi and verllg in vec_reve.

2022-08-12 Thread Robin Dapp via Gcc-patches
Hi, swapping the two elements of a V2DImode or V2DFmode vector can be done with vpdi instead of using the generic way of loading a permutation mask from the literal pool and vperm. Analogous to the V2DI/V2DF case reversing the elements of a four-element vector can be done by first swapping the

[PATCH] s390: Add z15 to s390_issue_rate.

2022-08-12 Thread Robin Dapp via Gcc-patches
Hi, this patch tries to be more explicit by mentioning z15 in s390_issue_rate. No changes in testsuite, bootstrap or SPEC obviously. Is it OK? Regards Robin gcc/ChangeLog: * config/s390/s390.cc (s390_issue_rate): Add z15. --- gcc/config/s390/s390.cc | 1 + 1 file changed, 1

[PATCH] s390: Add -munroll-only-small-loops.

2022-08-12 Thread Robin Dapp via Gcc-patches
Hi, inspired by Power we also introduce -munroll-only-small-loops. This implies activating -funroll-loops and -munroll-only-small-loops at -O2 and above. Bootstrapped and regtested. This introduces one regression in gcc.dg/sms-compare-debug-1.c but currently dumps for sms are broken as well.

[PATCH] s390: Add scheduler description for z16

2022-04-13 Thread Robin Dapp via Gcc-patches
Hi, this patch adds the scheduler description for z16. Bootstrapped and regtested with --with-arch=z16. Is it OK? Regards Robin gcc/ChangeLog: * config/s390/s390.cc (s390_get_sched_attrmask): Add z16. (s390_get_unit_mask): Likewise. (s390_is_fpd): Likewise.

[PATCH] testsuite/s390: Silence warning in pr80725.c

2022-04-13 Thread Robin Dapp via Gcc-patches
Hi, this test case checks that we do not ICE but FAILs because of -Wint-to-pointer-cast. Silence this warning. Is it OK? Regards Robin gcc/testsuite/ChangeLog: * gcc.target/s390/pr80725.c: Add -Wno-int-to-pointer-cast. --- gcc/testsuite/gcc.target/s390/pr80725.c | 2 +- 1 file

[PATCH] testsuite/s390: Adapt test expections.

2022-04-04 Thread Robin Dapp via Gcc-patches
Hi, some tests expect a convert instruction but nowadays the conversion is already done at compile time. This results in a literal-pool load. Change the tests accordingly. OK for trunk? Regards Robin gcc/testsuite/ChangeLog: * gcc.target/s390/zvector/vec-double-compile.c: Expect vl

[PATCH] testsuite/s390: Change nle -> h in ifcvt tests.

2022-04-04 Thread Robin Dapp via Gcc-patches
Hi, we have been emitting the "higher" variants instead of the "not less or equal" ones for a while. Change the test expectations accordingly. OK for trunk? Regards Robin gcc/testsuite/ChangeLog: * gcc.target/s390/ifcvt-two-insns-bool.c: Change nle to h. *

[PATCH] testsuite: Add -fno-tree-loop-distribute-patterns for s390.

2022-04-04 Thread Robin Dapp via Gcc-patches
Hi, in gcc.dg/Wuse-after-free-2.c we try to detect a use-after-free. On s390 the test's while loop is converted into a rawmemchr builtin making it impossible to determine that the pointers *p and *q are related. Therefore, disable the tree loop distribute patterns pass on s390 for this test.

Re: [PATCH] arc: Fix for new ifcvt behavior [PR104154]

2022-03-01 Thread Robin Dapp via Gcc-patches
Hi Claudiu, > The patch looks good. Please go ahead and merge it, please let me know if > you cannot. I merged the patch leaving your check if (cmode != SImode && cmode != SFmode && cmode != DFmode) return NULL_RTX; in place. It is not strictly necessary anymore but I figured it also

Re: [committed] arc: Fail conditional move expand patterns

2022-02-25 Thread Robin Dapp via Gcc-patches
> If the movcc comparison is not valid it triggers an assert in the > current implementation. This behavior is not needed as we can FAIL > the movcc expand pattern. In case of a MODE_CC comparison you can also just return it as described here https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104154

[PATCH] s390: Change SET rtx_cost handling.

2022-02-25 Thread Robin Dapp via Gcc-patches
Hi, the IF_THEN_ELSE detection currently prevents us from properly costing register-register moves which causes the lower-subreg pass to assume that a VR-VR move is as expensive as two GPR-GPR moves. This patch adds handling for SETs containing REGs as well as MEMs and is inspired by the aarch64

Re: [PATCH] Don't do int cmoves for IEEE comparisons, PR target/104256.

2022-02-23 Thread Robin Dapp via Gcc-patches
Hi, > Robin's patch has the effect of making rs6000_emit_int_cmove return false for > floating point comparisons, so I marked the bug as being a duplicate of PR > target/104335. Didn't I just return false for MODE_CC? This should not affect proper floating-point comparisons. It looks like the

[PATCH] arc: Fix for new ifcvt behavior [PR104154]

2022-02-20 Thread Robin Dapp via Gcc-patches
Hi, I figured I'd just go ahead and post this patch as well since it seems to have fixed the arc build problems. It would be nice if someone could bootstrap/regtest if Jeff hasn't already done so. I was able to verify that the two testcases attached to the PR build cleanly but not much more.

Re: [PATCH] rs6000: Workaround for new ifcvt behavior [PR104335]

2022-02-17 Thread Robin Dapp via Gcc-patches
> Please send patches as plain text, not as base64. It seems like Thunderbird does not support this anymore since later versions, grml. Probably need to look for another mail client. > Why that first test? XEXP (op, 0) is required to not be nil. > > The patch is okay without that (if it

[PATCH] rs6000: Workaround for new ifcvt behavior [PR104335]

2022-02-16 Thread Robin Dapp via Gcc-patches
Hi, since r12-6747-gaa8cfe785953a0 ifcvt not only passes real comparisons but also "cc comparisons" (i.e. the representation of the result of a comparison) to the backend. rs6000_emit_int_cmove () is not prepared to handle this. Therefore, this patch makes it return false in such a case in

Re: [PATCH] s390: Change costs for load on condition.

2022-02-08 Thread Robin Dapp via Gcc-patches
> Patch is ok. Thanks! As discussed off-list, committed as attached. Regards Robin commit 1e3185e714e877b2b4d14ade0865322f71a8cbf6 Author: Robin Dapp Date: Tue Feb 8 14:56:29 2022 +0100 s390: Increase costs for load on condition and change movqicc expander. This patch changes

Re: s390: Fix bootstrap -Wformat-diag errors

2022-02-03 Thread Robin Dapp via Gcc-patches
Hi Martin, > Either this: > >error ("% is unknown", orig_p); > > or this would be better: > >error ("attribute % is unknown", orig_p); > > The %< %> directives will render it in single quotes like keywords and > identifiers. Using %qs would render it in double quotes like a string, >

Re: ifcvt: Fix PR104153 and PR104198

2022-02-03 Thread Robin Dapp via Gcc-patches
> Do you need to adjust anything now that this is emitting into TEMP > rather than TARGET? The idea now is to emit to TEMP in the first pass and check if we read the initial condition. Overwriting the condition (and of course reading it in every sequence) is the reason temporaries were needed

s390: Fix bootstrap -Wformat-diag errors

2022-02-02 Thread Robin Dapp via Gcc-patches
Hi, this fixes the s390 bootstrap errors caused by -Werror=format-diag. It simply splits the problematic format strings. Bootstrapped and regtested with -march=z15. Is it OK? Regards Robin -- gcc/ChangeLog: * config/s390/s390.cc (s390_valid_target_attribute_inner_p): Split

ifcvt: Fix PR104153 and PR104198

2022-02-01 Thread Robin Dapp via Gcc-patches
Hi, this is a bugfix for aa8cfe785953a0e87d2472311e1260cd98c605c0 which broke an or1k test case (PR104153) as well as SPARC bootstrap (PR104198). cond_exec_get_condition () returns the jump condition directly and we now pass it to the backend. The or1k backend modified the condition in-place but

[PATCH] s390: Split CCSmode into CCSINT and CCSFP

2022-01-20 Thread Robin Dapp via Gcc-patches
Hi, this patch splits the CCSmode into an integer and a floating point variant. This allows ifcvt to consider floating point compares which would be rejected before because they could not be reversed. Bootstrapped and regtested on s390x. Is it OK? Regards Robin -- gcc/ChangeLog: *

[PATCH] s390: Change costs for load on condition.

2022-01-20 Thread Robin Dapp via Gcc-patches
Hi, this patch is a follow-up patch to the recent ifcvt changes. It increases costs for a load on condition to 6. This ensures that we if-convert sequences of three regular instructions (of cost 4) e.g. a compare and two SETs into two loads on condition (of cost 6). With a cost of 5, four-insn
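The cost arithmetic in this snippet can be worked through with a small sketch. The scaling of 4 per regular instruction matches GCC's COSTS_N_INSNS; everything else is an assumption made for illustration: the helper names are hypothetical, and the acceptance rule (convert when the new sequence is not more expensive than the old one) is a simplification rather than ifcvt's actual logic. The snippet is cut off, so the four-instruction case is shown only as arithmetic.

```c
/* One regular instruction costs 4, as with GCC's COSTS_N_INSNS.  */
#define COSTS_N_INSNS(n) ((n) * 4)

/* Cost of the original sequence of N_INSNS regular instructions.  */
int
original_cost (int n_insns)
{
  return COSTS_N_INSNS (n_insns);
}

/* Cost of N_LOCS loads on condition, each costing LOC_COST.  */
int
converted_cost (int n_locs, int loc_cost)
{
  return n_locs * loc_cost;
}

/* Assumption: accept the conversion when it is not more expensive.  */
int
worth_converting (int n_insns, int n_locs, int loc_cost)
{
  return converted_cost (n_locs, loc_cost) <= original_cost (n_insns);
}
```

Under these assumptions, a compare plus two SETs costs 12 and two loads on condition at cost 6 also cost 12, so `worth_converting (3, 2, 6)` holds. A compare plus three SETs costs 16; three loads on condition cost 18 at cost 6 and 15 at cost 5, so the chosen cost decides whether that longer sequence would pass the same check.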

Re: [PATCH 5/6] ira: Consider modelling caller-save allocations as loop spills

2022-01-11 Thread Robin Dapp via Gcc-patches
> Could you try the attached? The series with the patch is OK from a testsuite point of view. The other problem appears later. Regards Robin

Re: [PATCH 5/6] ira: Consider modelling caller-save allocations as loop spills

2022-01-11 Thread Robin Dapp via Gcc-patches
Hi Richard, > Could you try the attached? build and bootstrap look OK with it. Testsuite shows lots of fallout but the proper bisect isn't finished yet. The commit before your series is still fine - the problem could also be after it, though. Will report back later. Thanks Robin

Re: [PATCH 5/6] ira: Consider modelling caller-save allocations as loop spills

2022-01-11 Thread Robin Dapp via Gcc-patches
Hi Richard, this causes a bootstrap error on s390 (where IRA_HARD_REGNO_ADD_COST_MULTIPLIER is defined). rclass is used in the #define-guarded area. I guess you also wanted to move this to the new function ira_caller_save_cost? Regards Robin -- ../../gcc/ira-costs.c: In function ‘void

Re: [PATCH] vect: Add bias parameter for partial vectorization

2022-01-10 Thread Robin Dapp via Gcc-patches
Hi Richard, > I think it would be better to fold this into the existing documentation > a bit more: [..] done. Fixed the remaining nits in the attached v5. Bootstrap and regtest are good on s390x, Power9 and i386. Regards Robin -- gcc/ChangeLog: * config/rs6000/vsx.md: Use const0

Re: [PATCH v3 7/7] ifcvt: Run second pass if it is possible to omit a temporary.

2022-01-10 Thread Robin Dapp via Gcc-patches
Posting the ChangeLog before pushing. -- gcc/ChangeLog: * ifcvt.c (noce_convert_multiple_sets_1): New function. (noce_convert_multiple_sets): Call function a second time if we can improve the first try.

Re: [PATCH v3 6/7] testsuite/s390: Add tests for noce_convert_multiple.

2022-01-10 Thread Robin Dapp via Gcc-patches
Posting the ChangeLog before pushing. -- gcc/testsuite/ChangeLog: * gcc.dg/ifcvt-4.c: Remove s390-specific check. * gcc.target/s390/ifcvt-two-insns-bool.c: New test. * gcc.target/s390/ifcvt-two-insns-int.c: New test. * gcc.target/s390/ifcvt-two-insns-long.c: New

Re: [PATCH v3 5/7] ifcvt: Try re-using CC for conditional moves.

2022-01-10 Thread Robin Dapp via Gcc-patches
Posting the ChangeLog before pushing. -- gcc/ChangeLog: * ifcvt.c (cond_exec_get_condition): New parameter to allow getting the reversed comparison. (try_emit_cmove_seq): New function to facilitate creating a cmov sequence. (noce_convert_multiple_sets):

Re: [PATCH v3 4/7] ifcvt/optabs: Allow using a CC comparison for emit_conditional_move.

2022-01-10 Thread Robin Dapp via Gcc-patches
Posting the ChangeLog before pushing. -- gcc/ChangeLog: * rtl.h (struct rtx_comparison): New struct that holds an rtx comparison. * config/rs6000/rs6000.c (rs6000_emit_minmax): Use struct instead of single parameters. (rs6000_emit_swsqrt): Likewise.

Re: [PATCH v3 3/7] ifcvt: Improve costs handling for noce_convert_multiple.

2022-01-10 Thread Robin Dapp via Gcc-patches
Posting the ChangeLog before pushing. -- gcc/ChangeLog: * ifcvt.c (bb_ok_for_noce_convert_multiple_sets): Estimate insns costs. (noce_process_if_block): Use potential costs.

Re: [PATCH v3 2/7] ifcvt: Allow constants for noce_convert_multiple.

2022-01-10 Thread Robin Dapp via Gcc-patches
Posting the ChangeLog before pushing. -- gcc/ChangeLog: * ifcvt.c (noce_convert_multiple_sets): Allow constants. (bb_ok_for_noce_convert_multiple_sets): Likewise.
