[PATCH 5/6] ifcvt: Only created temporaries as needed.

2018-11-14 Thread Robin Dapp
created if the destination of a set is used in an emitted condition check. -- gcc/ChangeLog: 2018-11-14 Robin Dapp * ifcvt.c (check_need_temps): New function. (noce_convert_multiple_sets): Only created temporaries if needed. --- gcc/ifcvt.c | 54

[PATCH 4/6] S/390: Implement noce_conversion_profitable_p.

2018-11-14 Thread Robin Dapp
This patch implements noce_conversion_profitable_p by checking for the transformation ifcvt used and only return positively if noce_convert_multiple_sets created less than MAX_IFCVT_INSNS insns. -- gcc/ChangeLog: 2018-11-14 Robin Dapp * config/s390/s390.c (MAX_IFCVT_INSNS): Define

[PATCH 1/6] ifcvt: Store the number of created cmovs.

2018-11-14 Thread Robin Dapp
This patch saves the number of created conditional moves by noce_convert_multiple_sets in the IF_INFO struct. This may be used by the backend to easier decide whether to accept a generated sequence or not. -- gcc/ChangeLog: 2018-11-14 Robin Dapp * ifcvt.c

[PATCH 3/6] ifcvt: Use enum instead of transform_name string.

2018-11-14 Thread Robin Dapp
This patch introduces an enum for ifcvt's various noce transformations. As the transformation might be queried by the backend, I find it nicer to allow checking for a proper type instead of a string comparison. -- gcc/ChangeLog: 2018-11-14 Robin Dapp * ifcvt.c (noce_try_move): Use

[PATCH 6/6] S/390: Add test for noce_convert_multiple_sets.

2018-11-14 Thread Robin Dapp
New test. -- gcc/testsuite/ChangeLog: 2018-11-14 Robin Dapp * gcc.target/s390/ifcvt-two-insns-int.c: New test. --- .../gcc.target/s390/ifcvt-two-insns-int.c | 26 +++ 1 file changed, 26 insertions(+) create mode 100644 gcc/testsuite/gcc.target/s390/ifcvt-two

Re: [PATCH 2/6] ifcvt: Allow constants operands in noce_convert_multiple_sets.

2018-11-15 Thread Robin Dapp
> This may ultimately be too simplistic. There are targets where some > constants are OK, but others may not be. By checking the predicate > like this I think you can cause over-aggressive if-conversion if the > target allows a range of integers in the expander's operand predicate, > but allows

Re: [PATCH 5/6] ifcvt: Only created temporaries as needed.

2018-11-15 Thread Robin Dapp
> This looks pretty reasonable. ISTM it ought to be able to go forward if > it's tested independently. The test suite already passes, any other tests you have in mind? To be honest I suppose noce_convert_multiple_sets will currently never successfully return (due to the costing problems I

[PATCH 0/6] If conversion with multiple sets.

2018-11-14 Thread Robin Dapp
Hi, the following patch set was created in an attempt to allow multiple sets to be if converted. I was not able to make it work out of the box since I found the cost estimation for the newly created sequence to always be much higher than the sequence before. This is due to

[PATCH 2/6] ifcvt: Allow constants operands in noce_convert_multiple_sets.

2018-11-14 Thread Robin Dapp
This patch checks whether the current target supports conditional moves with immediate then/else operands and allows noce_convert_multiple_sets to deal with constants subsequently. Also, minor refactoring is performed. -- gcc/ChangeLog: 2018-11-14 Robin Dapp * ifcvt.c

sched2 priorities and replacements

2018-10-04 Thread Robin Dapp
Hi, I'm working on some insn latency changes in the s390 backend and noticed a regression in the SPEC2006 bzip2 test case that was due to some insns being scheduled differently. The sequence in short form before my change is ;; | insn | prio | ;; | 823 |1 | %r1=%r1+0x1

[PATCH] Reset insn priority after inc/ref replacement in haifa sched

2018-10-10 Thread Robin Dapp
didn't bootstrap for me on x86). The actual code changes throughout SPEC2006 are minor and the performance impact is negligible provided we do not hit a fixable bad case as described in my last message. Regards Robin -- gcc/ChangeLog: 2018-10-10 Robin Dapp * haifa-sched.c

[RFC] D support for S/390

2019-03-15 Thread Robin Dapp
Hi, during the last few days I tried to get D running on s390x (apparently the first Big Endian platform to try it?). I did not yet go through the code systematically and add a version(SystemZ) in every place where it might be needed but rather tried to fix test failures as they arose. After

[PATCH 1/7] S/390: Change z13 pipeline description.

2019-03-11 Thread Robin Dapp
This patch adapts the z13 pipeline description. --- gcc/config/s390/2964.md | 372 ++-- gcc/config/s390/s390.c | 39 ++--- 2 files changed, 226 insertions(+), 185 deletions(-) diff --git a/gcc/config/s390/2964.md b/gcc/config/s390/2964.md index

[PATCH 3/7] S/390: Change handling of long-running instructions.

2019-03-11 Thread Robin Dapp
This patch makes the detection of long-running instructions independent of their latency and checks the execution unit instead. --- gcc/config/s390/s390.c | 73 +++--- 1 file changed, 55 insertions(+), 18 deletions(-) diff --git a/gcc/config/s390/s390.c

[PATCH 0/7] S/390: Rework instruction scheduling.

2019-03-11 Thread Robin Dapp
Hi, this patch set adds new pipeline descriptions for z13 and z14. Based on that, the scoring and some properties are handled differently in the scheduler hooks. Regards Robin Robin Dapp (7): S/390: Change z13 pipeline description. S/390: Add z14 pipeline description. S/390: Change

[PATCH 4/7] S/390: Change handling of group end.

2019-03-11 Thread Robin Dapp
This patch adds a scheduling state struct and changes the handling of end-group conditions. --- gcc/config/s390/s390.c | 158 ++--- 1 file changed, 68 insertions(+), 90 deletions(-) diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index

[PATCH 2/7] S/390: Add z14 pipeline description.

2019-03-11 Thread Robin Dapp
This patch adds the z14 pipeline description. --- gcc/config/s390/3906.md | 282 gcc/config/s390/s390.c | 23 +++- gcc/config/s390/s390.h | 2 +- gcc/config/s390/s390.md | 3 + 4 files changed, 307 insertions(+), 3 deletions(-) create mode 100644

[PATCH 7/7] S/390: Tune scheduling parameters.

2019-03-11 Thread Robin Dapp
This patch adapts some scheduling-related parameters. --- gcc/config/s390/s390.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index 78a707267e8..901807e7833 100644 --- a/gcc/config/s390/s390.c +++

[PATCH 6/7] S/390: Add handling for group-of-two instructions.

2019-03-11 Thread Robin Dapp
This patch adds handling of group-of-two instructions. --- gcc/config/s390/s390.c | 36 +++- 1 file changed, 35 insertions(+), 1 deletion(-) diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index 4dcf1be4445..78a707267e8 100644 ---

[PATCH 5/7] S/390: Add side to schedule-mix calculations.

2019-03-11 Thread Robin Dapp
This patch makes the scheduling score execution-side aware. --- gcc/config/s390/s390.c | 32 ++-- 1 file changed, 18 insertions(+), 14 deletions(-) diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index 249df00268a..4dcf1be4445 100644 ---

Re: [RFC] D support for S/390

2019-03-20 Thread Robin Dapp
Hi, the unicode tables in std.internal.unicode_tables are apparently auto generated and loaded at (libphobos) compile time. They are also in little endian format. Is the tool to generate them available somewhere? I wanted to start converting them to little endian before loading but this will

Re: [RFC] D support for S/390

2019-03-19 Thread Robin Dapp
Hi, > Alignment is written to TypeInfo, I don't think it should ever be > zero. That would mean that it isn't being generated by the compiler, > or read by the library correctly, so something else is amiss. it took me a while to see that in libphobos/libdruntime/object.d override @property

Re: [RFC] D support for S/390

2019-03-19 Thread Robin Dapp
> This would mean that StructFlags and ClassFlags will also both have a > wrong value as well. Yes, can confirm that m_flags = 0 (instead of 1) for a struct containing a pointer. > If there's a compiler/library discrepancy, the compiler should be > adjusted to write out the value at the correct

[PATCH] S/390: Perform more aggressive inlining

2019-03-12 Thread Robin Dapp
Hi, this patch sets the inlining parameters for z13 and later to rather aggressive values in response to PR85103 that caused performance regressions in SPEC2006's sjeng and gobmk benchmarks. Regards Robin -- gcc/ChangeLog: 2019-03-12 Robin Dapp * config/s390/s390.c

Re: [PATCH 0/7] S/390: Rework instruction scheduling.

2019-03-12 Thread Robin Dapp
> Please adjust the year and the author in gcc/config/s390/3906.md. Ok with > that change. Changed that and also simplified the longrunning checks. gcc/ChangeLog: 2019-03-12 Robin Dapp * config/s390/s390.c (LONGRUNNING_THRESHOLD): Remove. (s390_is_fpd

Re: [PATCH 8/8] S/390: Change test case to reflect scheduling changes.

2019-03-12 Thread Robin Dapp
This fixes a newly introduced test failure. --- 2019-03-12 Robin Dapp * gcc.target/s390/memset-1.c: Do not require stcy. diff --git a/gcc/testsuite/gcc.target/s390/memset-1.c b/gcc/testsuite/gcc.target/s390/memset-1.c index 3e201df1aed..9463a77208b 100644 --- a/gcc/testsuite

[PATCH] S/390: Fix tests that expect unquoted option names

2019-03-15 Thread Robin Dapp
Hi, r269586 puts single quotes around option names. This patch fixes tests that expect the old format. Regards Robin --- gcc/testsuite/ChangeLog: 2019-03-15 Robin Dapp * gcc.target/s390/target-attribute/tattr-1.c (htm0): -mhtm -> '-mhtm'. * gcc.target/s390/tar

Re: [RFC] D support for S/390

2019-03-22 Thread Robin Dapp
Hi, > Are the values inside the tables the problem? Or just some of the > helper functions/templates that interact with them to generate the > static data? > > If the latter, then a rebuild of the files may not be necessary. I managed to get this to work without rebuilding the files. After

[PATCH] S/390: Implement vectory copysign

2019-02-07 Thread Robin Dapp
Hi, this patch implements vector copysign using vector select on S/390. Regtested and bootstrapped on s390x. Regards Robin -- gcc/ChangeLog: 2019-02-07 Robin Dapp * config/s390/vector.md: Implement vector copysign. gcc/testsuite/ChangeLog: 2019-02-07 Robin Dapp

Re: [RFC] D support for S/390

2019-04-11 Thread Robin Dapp
Hi Rainer, > This will occur on any 32-bit target. The following patch (using > ssize_t instead) allowed the code to compile: thanks, included your fix and attempted a more generic version of the 186 test. I also continued debugging some fails further: - Most of the MurmurHash fails are

Re: [RFC] SHIFT_COUNT_TRUNCATED and shift_truncation_mask

2019-05-15 Thread Robin Dapp
> It would really help if you could provide testcases which show the > suboptimal code and any analysis you've done. I tried introducing a define_subst pattern that substitutes something one of two other subst patterns already changed. The first subst pattern helps remove a superfluous and on

[PATCH] Testsuite: Add s390 exceptions for gen-vect

2019-05-15 Thread Robin Dapp
Hi, this patch changes three gen-vect testcases so they do not expect vectorization of an unaligned access. Vectorization happens regardless, we just ignore misalignment. Regards Robin -- gcc/testsuite/ChangeLog: 2019-05-15 Robin Dapp * gcc.dg/tree-ssa/gen-vect-26.c: Do

[PATCH] S/390: Add -march to test case

2019-05-15 Thread Robin Dapp
Hi, this patch adds -march=z900 to a test case that expects larl for loading a value via the GOT. On z10 and later, lgrl is used which is tested in a new test case. Regards Robin -- gcc/testsuite/ChangeLog: 2019-05-15 Robin Dapp * gcc.target/s390/global-array-element-pic.c: Add

Re: [RFC] SHIFT_COUNT_TRUNCATED and shift_truncation_mask

2019-06-04 Thread Robin Dapp
>> Now, in order to get rid of the subregs in the pattern combine creates, >> I would need to be able to do something like >> >> (define_subst "subreg_subst" >> [(set (match_operand:DI 0 "" "") >> (shift:DI (match_operand:DI 1 "" "") >>(subreg:SI (match_dup:DI 2)))] >> >>

Re: [PATCH] Testsuite: Add s390 exceptions for gen-vect

2019-06-05 Thread Robin Dapp
Ping. > gcc/testsuite/ChangeLog: > > 2019-05-15 Robin Dapp > > * gcc.dg/tree-ssa/gen-vect-26.c: Do not expect unaligned access > vectorization on s390. > * gcc.dg/tree-ssa/gen-vect-28.c: Likewise. > * gcc.dg/tree-ssa/gen-vect-32.c: Likewise. >

[RFC] SHIFT_COUNT_TRUNCATED and shift_truncation_mask

2019-05-09 Thread Robin Dapp
Hi, while trying to improve s390 code generation for rotate and shift I noticed superfluous subregs for shift count operands. In our backend we already have quite cumbersome patterns that would need to be duplicated (or complicated further by more subst patterns) in order to get rid of the

Re: [RFC] SHIFT_COUNT_TRUNCATED and shift_truncation_mask

2019-05-10 Thread Robin Dapp
>> Bit tests on x86 also truncate [1], if the bit base operand specifies >> a register, and we don't use BT with a memory location as a bit base. >> I don't know what is referred with "(real or pretended) bit field >> operations" in the documentation for SHIFT_COUNT_TRUNCATED: >> >> However,

Re: [RFC] D support for S/390

2019-04-29 Thread Robin Dapp
> Robin, have you been testing with --disable-multilib or something > similar? yes, I believe so... stupid mistake :( Thanks for fixing it so quickly.

Re: [PATCH] S/390: Fix PR89952 incorrect CFI

2019-04-18 Thread Robin Dapp
Hi, > + Establish an ANTI dependency between r11 and r15 restores from FPRs > + to prevent the instructions scheduler from reordering them since > + this would break CFI. No further handling in the sched_reorder > + hook is required since the r11 and r15 restore will never appear in > +

Re: [RFC] D support for S/390

2019-04-18 Thread Robin Dapp
Hi Rainer, > I noticed you missed one piece of Iain's typeinfo.cc patch, btw.: > > diff --git a/gcc/d/typeinfo.cc b/gcc/d/typeinfo.cc > --- a/gcc/d/typeinfo.cc > +++ b/gcc/d/typeinfo.cc > @@ -886,7 +886,7 @@ public: > if (cd->isCOMinterface ()) > flags |= ClassFlags::isCOMclass; >

[PATCH 1/3] S/390: Rework shift count handling.

2019-07-08 Thread Robin Dapp
Add s390_valid_shift_count to determine the validity of a shift-count operand. This is used to replace increasingly complex substitutions that should have allowed address-style shift-count handling, an and mask as well as no-op subregs on the operand. -- gcc/ChangeLog: 2019-07-05 Robin Dapp

[PATCH 0/3] S/390: Shift count improvements.

2019-07-08 Thread Robin Dapp
). The second patch adds some tests. The third patch defines the shift_truncation_mask and adds a test for it. Bootstrapped and regtested. Regards Robin --- Robin Dapp (3): S/390: Rework shift count handling. S/390: Shift count tests. S/390: Define shift_truncation_mask. gcc/config/s390

[PATCH 3/3] S/390: Define shift_truncation_mask.

2019-07-08 Thread Robin Dapp
Define s390_shift_truncation_mask to allow the optabs optimization sh = (64 - sh) -> sh = -sh for a rotation operation. -- gcc/ChangeLog: 2019-07-05 Robin Dapp * config/s390/s390.c (s390_shift_truncation_mask): Define. (TARGET_SHIFT_TRUNCATION_MASK): Define.
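
A minimal sketch of why that rewrite is valid, assuming a rotate that only looks at the low six bits of its count (the helper names rotl64, rotr_by_sub and rotr_by_neg are hypothetical and not taken from the patch):

```c
#include <stdint.h>

/* Rotate left by sh, using only the low 6 bits of the count. This models
   a target whose rotate truncates the count modulo 64 (an assumption made
   for this illustration). */
static inline uint64_t rotl64(uint64_t x, unsigned sh)
{
    sh &= 63;
    return (x << sh) | (x >> ((64 - sh) & 63));
}

/* Rotate right written as a left rotate by (64 - sh). */
static inline uint64_t rotr_by_sub(uint64_t x, unsigned sh)
{
    return rotl64(x, 64u - sh);
}

/* Same rotate with the count simply negated. Because the count is
   truncated modulo 64, (64 - sh) and (-sh) select the same rotation. */
static inline uint64_t rotr_by_neg(uint64_t x, unsigned sh)
{
    return rotl64(x, 0u - sh);
}
```

Both helpers agree for every count from 0 to 63, which is the property the optabs optimization relies on once the target reports a truncation mask.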

[PATCH 2/3] S/390: Shift count tests.

2019-07-08 Thread Robin Dapp
Tests to check for the changed shift-count handling. -- gcc/testsuite/ChangeLog: 2019-07-05 Robin Dapp * gcc.target/s390/combine-rotate-modulo.c: New test. * gcc.target/s390/combine-shift-rotate-add-mod.c: New test. * gcc.target/s390/vector/combine-shift-vec.c: New

[PATCH] S/390: Add arch13 pipeline description

2019-04-10 Thread Robin Dapp
Hi, this patch adds the pipeline description and the cpu model number for arch13. Bootstrapped and regtested on s390x. Regards Robin -- gcc/ChangeLog: 2019-04-10 Robin Dapp * config/s390/8561.md: New file. * config/s390/driver-native.c (s390_host_detect_local_cpu): Add

Re: [RFC] D support for S/390

2019-04-24 Thread Robin Dapp
ll: all-am +PWD_COMMAND = $${PWDCMD-pwd} .SUFFIXES: $(srcdir)/Makefile.in: @MAINTAINER_MODE_TRUE@ $(srcdir)/Makefile.am $(am__configure_deps) Regards Robin -- gcc/d/ChangeLog: 2019-04-24 Robin Dapp * typeinfo.cc (create_typeinfo): Set fields with proper length. gcc/testsuite/Change

Re: [PATCH 2/3] Add simplify rules for wrapped binary operations.

2019-08-13 Thread Robin Dapp
> +/* ((T)(A + CST1)) + CST2 -> (T)(A) + CST */ > Do you want to handle MINUS? What about POINTER_PLUS_EXPR? When I last attempted this patch I had the MINUS still in it but got confused easily by needing to think of too many cases at once leading to lots of stupid mistakes. Hence, I left it

[PATCH 2/9] ifcvt: Use enum instead of transform_name string.

2019-08-02 Thread Robin Dapp
This patch introduces an enum for ifcvt's various noce transformations. As the transformation might be queried by the backend, I find it nicer to allow checking for a proper type instead of a string comparison. --- gcc/ifcvt.c | 46 ++-- gcc/ifcvt.h | 67

[PATCH 8/9] ifcvt: Handle swap-style idioms differently.

2019-08-02 Thread Robin Dapp
A swap-style idiom like tmp = a a = b b = tmp would be transformed like tmp_tmp = cond ? a : tmp tmp_a = cond ? b : a tmp_b = cond ? tmp_tmp : b [...] including rewiring the first source operand to previous writes (e.g. tmp -> tmp_tmp). The code would recognize this, though, and
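
For illustration only, the transformation described in this snippet can be sketched in plain C. The function names are hypothetical, and the initial value of tmp is an assumption standing in for whatever tmp held before the block:

```c
/* Branchy original: swap a and b only when cond is true. */
static inline void swap_branchy(int cond, int *a, int *b)
{
    if (cond) {
        int tmp = *a;
        *a = *b;
        *b = tmp;
    }
}

/* If-converted form: one conditional select per set, each writing a
   fresh temporary. Note how the third select reads tmp_tmp, i.e. the
   first source operand is rewired to the previous write. */
static inline void swap_cmov(int cond, int *a, int *b)
{
    int tmp = 0;                      /* assumed prior value of tmp */
    int tmp_tmp = cond ? *a : tmp;
    int tmp_a   = cond ? *b : *a;
    int tmp_b   = cond ? tmp_tmp : *b;
    *a = tmp_a;
    *b = tmp_b;
}
```

Both versions leave a and b in the same state for either value of cond; the second one needs three selects and three temporaries to do so.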

[PATCH 9/9] ifcvt: Also pass reversed cc comparison.

2019-08-02 Thread Robin Dapp
When then and else are reversed, we would swap new_val and old_val. The same has to be done for our new code paths. Also, emit_conditional_move may perform swapping. In case we need to swap, the cc comparison also needs to be swapped and for this we pass the reversed cc comparison directly. An

[PATCH 5/9] ifcvt: Allow constants operands in noce_convert_multiple_sets.

2019-08-02 Thread Robin Dapp
This patch allows immediate then/else operands for cmovs. We rely on emit_conditional_move returning NULL if something unsupported was generated. Also, minor refactoring is performed. -- gcc/ChangeLog: 2018-11-14 Robin Dapp * ifcvt.c (have_const_cmov): New function

[PATCH 6/9] ifcvt: Extract cc comparison from jump.

2019-08-02 Thread Robin Dapp
This patch extracts a cc comparison from the initial compare/jump insn and allows it to be passed to noce_emit_cmove and emit_conditional_move. --- gcc/ifcvt.c | 68 gcc/optabs.c | 7 -- gcc/optabs.h | 2 +- 3 files changed, 69

[PATCH 7/9] ifcvt: Emit two cmov variants and choose the less expensive one.

2019-08-02 Thread Robin Dapp
This patch duplicates the previous noce_emit_cmove logic. First it passes the canonical comparison, emits the sequence and costs it. Then, a second, separate sequence is created by passing the cc compare we extracted before. The costs of both sequences are compared and the cheaper one is emitted.

[PATCH 1/9] ifcvt: Store the number of created cmovs.

2019-08-02 Thread Robin Dapp
This patch saves the number of created conditional moves by noce_convert_multiple_sets in the IF_INFO struct. This may be used by the backend to easier decide whether to accept a generated sequence or not. --- gcc/ifcvt.c | 10 -- gcc/ifcvt.h | 4 2 files changed, 12

[PATCH 0/9] Improve icvt "convert multiple"

2019-08-02 Thread Robin Dapp
Robin Robin Dapp (9): ifcvt: Store the number of created cmovs. ifcvt: Use enum instead of transform_name string. ifcvt: Only created temporaries as needed. ifcvt: Estimate original costs before convert_multiple. ifcvt: Allow constants operands in noce_convert_multiple_sets. ifcvt

[PATCH 4/9] ifcvt: Estimate original costs before convert_multiple.

2019-08-02 Thread Robin Dapp
This patch extends bb_ok_for_noce_convert_multiple_sets by a temporary cost estimation that can be used by noce_convert_multiple_sets. --- gcc/ifcvt.c | 17 +++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c index 253b8a96c1a..55205cac153

[PATCH 3/9] ifcvt: Only created temporaries as needed.

2019-08-02 Thread Robin Dapp
noce_convert_multiple_sets creates temporaries for the destination of every emitted cmov and expects subsequent passes to get rid of them. This does not happen every time and even if the temporaries are removed, code generation can be affected adversely. In this patch, temporaries are only

Re: [PATCH 2/3] Add simplify rules for wrapped binary operations.

2019-08-16 Thread Robin Dapp
> So - what are you really after? (sorry if I don't remeber, testcase(s) > are missing > from this patch) > > To me it seems that 1) loses information if A + CST was done in a signed type > and we know that overflow doesn't happen because of that. For the reverse > transformation we don't. Btw,

Re: [PATCH 8/9] ifcvt: Handle swap-style idioms differently.

2019-08-16 Thread Robin Dapp
> Looks like a nice optimisation, but could we just test whether the > destination of a set isn't live on exit from the then block? I think > we could do that on the fly during the main noce_convert_multiple_sets > loop. I included this locally along with the rest of the remarks. Any comments on

Re: [PATCH 8/9] ifcvt: Handle swap-style idioms differently.

2019-08-17 Thread Robin Dapp
> I'm still a bit worried about the overlap between the expanded > noce_convert_multiple_sets and cond_move_process_if_block (5/9). > It seems like we're making noce_convert_multiple_set handle most of > the conditional move cases that cond_move_process_if_block can handle. > But like you say,

[PATCH 2/3] Add simplify rules for wrapped binary operations.

2019-08-13 Thread Robin Dapp
We would like to simplify code like (larger_type)(var + const1) + const2 to (larger_type)(var + combined_const1_const2) when we know that no overflow happens. --- gcc/match.pd | 101 +++ 1 file changed, 101 insertions(+) diff --git a/gcc/match.pd
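
A small sketch of the equivalence being exploited, assuming int32_t as the narrow type, int64_t as the larger type, and the constants 3 and 4 (all chosen only for this illustration, not taken from the patch):

```c
#include <stdint.h>

/* Original shape: add in the narrow type, widen, then add again in the
   larger type. */
static inline int64_t widen_then_add(int32_t var)
{
    return (int64_t)(var + 3) + 4;
}

/* Simplified shape: a single narrow add with the combined constant,
   followed by the widening. This is only equivalent when var + 7 does
   not overflow the narrow type. */
static inline int64_t combined_add(int32_t var)
{
    return (int64_t)(var + 7);
}
```

The two functions agree for inputs such as 10 or -100. Near INT32_MAX they would not, which is why the rule needs to know that no overflow happens.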

[PATCH 3/3] Add new test cases for wrapped binop simplification.

2019-08-13 Thread Robin Dapp
--- .../gcc.dg/tree-ssa/copy-headers-5.c | 2 +- .../gcc.dg/tree-ssa/copy-headers-7.c | 2 +- .../gcc.dg/wrapped-binop-simplify-run.c | 52 .../gcc.dg/wrapped-binop-simplify-signed-1.c | 60 +++ .../wrapped-binop-simplify-unsigned-1.c

[PATCH 0/3] Simplify wrapped binops.

2019-08-13 Thread Robin Dapp
) and manifests similarly to addr1,-1 extend r1,r1 addr1,1 where the adds could be avoided entirely. This is the tree part of the fix, it will still be necessary to correct rtl code generation in doloop later. Bootstrapped and regtested on s390x, x86 running. Regards Robin -- Robin Dapp (3

[PATCH 1/3] Perform fold when propagating.

2019-08-13 Thread Robin Dapp
This patch performs more aggressive folding in order for the match.pd changes to kick in later. Some test cases rely on VRP doing something which now already happens during CCP so adjust them accordingly. Also, the loop versioning pass was missing one case when deconstructing addresses that

Re: [PATCH 2/3] Add simplify rules for wrapped binary operations.

2019-08-13 Thread Robin Dapp
> I have become rather wary of INTEGRAL_TYPE_P recently because it > includes enum types, which with -fstrict-enum can have a surprising > behavior. If I have > enum E { A, B, C }; > and e has type enum E, with -fstrict-enum, do your tests manage to > prevent (long)e+1 from becoming (long)(e+1)

Re: [PATCH 1/3] Perform fold when propagating.

2019-08-13 Thread Robin Dapp
> May I suggest to add a parameter to the substitute-and-fold engine > so we can do the folding on all stmts only when enabled and enable > it just for VRP? That also avoids the testsuite noise. Would something along these lines do? diff --git a/gcc/tree-ssa-propagate.c

Re: [PATCH 2/3] Add simplify rules for wrapped binary operations.

2019-08-20 Thread Robin Dapp
> So - which case is it? IIRC we want to handle small signed > constants but the code can end up unsigned. For the > above we could write (unsigned long)((int)a + 1 - 1) and thus > sign-extend? Or even avoid this if we know the range. > That is, it becomes the first case again (operation

[PATCH/RFC] Simplify wrapped RTL op

2019-08-27 Thread Robin Dapp
Hi, as announced in the wrapped-binop gimple patch mail, on s390 we still emit odd code in front of loops: void v1 (unsigned long *in, unsigned long *out, unsigned int n) { int i; for (i = 0; i < n; i++) { out[i] = in[i]; } } --> aghi%r1,-8 srlg

Re: [PATCH/RFC] Simplify wrapped RTL op

2019-08-29 Thread Robin Dapp
>> PR37451. Not clear what target that regressed on, btw. > > And PR55190 and PR67288 and probably more. Thanks for finding those. So the hope is to get this fixed or rather move towards a fix with the patch series that's currently reviewed which injects some doloop knowledge into ivopts? As

Re: [PATCH 2/3] Add simplify rules for wrapped binary operations.

2019-08-21 Thread Robin Dapp
I'm going to commit the attached two patches. Removed the redundant changes in test cases and added constructor initialization of fold_all_stmts. Regards Robin -- gcc/ChangeLog: 2019-08-21 Robin Dapp * gimple-loop-versioning.cc (loop_versioning::record_address_fragment

Re: [PATCH 3/9] ifcvt: Only created temporaries as needed.

2019-08-08 Thread Robin Dapp
Hi Richard, > Is the separate need_temps scan required for correctness? It looked > like we could test: > > if (reg_overlap_mentioned_p (dest, cond)) > ... > > on-the-fly during the main noce_convert_multiple_sets loop. right, I didn't re-check it but after changes during interal

Re: [PATCH 5/9] ifcvt: Allow constants operands in noce_convert_multiple_sets.

2019-08-08 Thread Robin Dapp
> It seems like this is making noce_convert_multiple_sets overlap > a lot with cond_move_process_if_block (although that uses CONSTANT_P > instead of CONST_INT_P). How do they fit together after this patch, > i.e. which cases is each one meant to handle that the other doesn't? IMHO all of icvt

[PATCH] S/390: Add undef for MUSL_DYNAMIC_LINKERxx

2019-11-26 Thread Robin Dapp
Hi, I committed this patch (obvious). It fixes the s390 bootstrap by undefining existing defines before redefining them. Regards Robin -- gcc/ChangeLog: 2019-11-26 Robin Dapp * config/s390/linux.h: Add undef for MUSL_DYNAMIC_LINKERxx. commit

Re: [PATCH] [dlang/phobos] S/390: Fix PR91628

2019-11-28 Thread Robin Dapp
> OK from me, what about earlier comments of using __asm__ in a C > source file? > > I wouldn't really object to converting all .S sources (infact I can > do this myself) if it meant slightly better portability. Adding to yesterday's message: feel free to apply the current version if it's OK.

[PATCH] [dlang/phobos] S/390: Fix PR91628

2019-11-27 Thread Robin Dapp
Hi, in order to not use a glibc-internal symbol anymore, this patch adds separate .S files for s390x and s390 that allow to obtain the tls offset. I bootstrapped on s390x -m64 and -m31 and test on s390x, s390 seeing no new regressions. Regards Robin -- libphobos/ChangeLog: 2019-11-27 Robin

Re: [PATCH] [dlang/phobos] S/390: Fix PR91628

2019-11-27 Thread Robin Dapp
Hi Iain, > OK from me, what about earlier comments of using __asm__ in a C > source file? I don't mind too much either way but I gathered from the discussion in the bugzilla that .S was preferred for now. Regards Robin

Re: [PATCH V2] RISC-V: Fix unexpected big LMUL choosing in dynamic LMUL model for non-adjacent load/store

2023-10-16 Thread Robin Dapp
Hi Juzhe, > +/* Get STORE value. */ > +static tree > +get_store_value (gimple *stmt) > +{ > + if (is_gimple_call (stmt) && gimple_call_internal_p (stmt)) > +{ > + if (gimple_call_internal_fn (stmt) == IFN_MASK_STORE) > + return gimple_call_arg (stmt, 3); > + else > +

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-11 Thread Robin Dapp
8e50859 Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Wed, 13 Sep 2023 22:19:35 +0200 Subject: [PATCH v4] ifcvt/vect: Emit COND_ADD for conditional scalar reduction. As described in PR111401 we currently emit a COND and a PLUS expression for conditional reductions. This makes it difficu

Re: [PATCH V3] RISC-V: Fix unexpected big LMUL choosing in dynamic LMUL model for non-adjacent load/store

2023-10-16 Thread Robin Dapp
> + if (live_range && flow_bb_inside_loop_p (loop, e->src)) > + { Doesn't this match several cases more than before i.e set the range start to zero fairly often? I mean if it works fine with me and the code is easier to read. Please split off the search for the

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-13 Thread Robin Dapp
> Why are the contents of this if statement wrong for COND_LEN? > If the "else" value doesn't matter, then the masked form can use > the "then" value for all elements. I would have expected the same > thing to be true of COND_LEN. Right, that one was overly pessimistic. Removed. > But isn't

Re: [PATCH] genemit: Split insn-emit.cc into ten files.

2023-10-17 Thread Robin Dapp
> Natively, things seem fine, but for cross, I get failures on a few > targets (hppa2.0-unknown-linux-gnu, hppa64-unknown-linux-gnu). > > With ./configure --host=x86_64-pc-linux-gnu > --target=hppa2.0-unknown-linux-gnu --build=x86_64-pc-linux-gnu && make > -j$(nproc), I get a bunch of stuff like:

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-17 Thread Robin Dapp
>> I don't know much about valueisation either :) But it does feel >> like we're working around the lack of a LEN form of COND_EXPR. >> In other words, it seems odd that we can do: >> >> IFN_COND_LEN_ADD (mask, a, 0, b, len, bias) >> >> but we can't do: >> >> IFN_COND_LEN (mask, a, b, len,

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-17 Thread Robin Dapp
Thank you for the explanation. So, assuming I added an IFN_VCOND_MASK and IFN_VCOND_MASK_LEN along with the respective helper and expand functions, what would be the way forward? Generate an IFN_VCOND_MASK(_LEN) here instead of a VEC_COND_EXPR? How would I make sure all of match.pd's vec_cond

Re: [RFC] RISC-V: elide sign extend when expanding cmp_and_jump

2023-10-25 Thread Robin Dapp
> Well, it doesn't seem like there's a lot of difference between doing > it in the generic expander bits vs target expander bits -- the former > just calls into the latter for the most part. Thus if the > subreg-promoted state is available in the target expander, I'd expect > it to be available

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-19 Thread Robin Dapp
Ugh, I didn't push yet because with a rebased trunk I am seeing different behavior for some riscv testcases. A reduction is not recognized because there is yet another "double use" occurrence in check_reduction_path. I guess it's reasonable to loosen the restriction for conditional operations

Re: [PATCH] RISC-V: Add rawmemchr expander.

2023-10-27 Thread Robin Dapp
> Suggested adapt codes as follows: > > unsigned int element_size = GET_MODE_SIZE (mode).to_constant (); > poly_int64 nunits = exact_div (BYTES_PER_RISCV_VECTOR *TARGET_MAX_LMUL, > element_size); > if (!get_vector_mode(mode, nunits).exists()) >   gcc_unreachable (); Actually I was initially

Re: [PATCH] RISC-V: Add rawmemchr expander.

2023-10-27 Thread Robin Dapp
> I notice we have expand_block_move > in riscv-v.cc > > Maybe we should move it into riscv-string.cc ? Yes I will also move that one. Regards Robin

Re: [PATCH] RISC-V: Fix wrong tune parameters on int_div

2023-10-27 Thread Robin Dapp
> @@ -346,7 +346,7 @@ static const struct riscv_tune_param rocket_tune_info = { >{COSTS_N_INSNS (4), COSTS_N_INSNS (5)},/* fp_mul */ >{COSTS_N_INSNS (20), COSTS_N_INSNS (20)}, /* fp_div */ >{COSTS_N_INSNS (4), COSTS_N_INSNS (4)},/* int_mul */ > - {COSTS_N_INSNS (6),

Re: [PATCH] RISC-V: Add rawmemchr expander.

2023-10-27 Thread Robin Dapp
> Could you put it into riscv-string.cc rather than riscv-v.cc? I would > like to put those builtin function expander together if possible, > riscv-string.cc might little bit confuse, but it's all included in > string.h Ok, sure. Will commit the adjusted patch if no further comments. Regards

Re: [PATCH] RISC-V: Add rawmemchr expander.

2023-10-27 Thread Robin Dapp
Attached v3 that I'd commit. Regards Robin From 246b986a8ea2332ced7a094dd68d35d84dcbbc04 Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Tue, 24 Oct 2023 10:33:15 +0200 Subject: [PATCH v3] RISC-V: Add rawmemchr expander. This patch adds a vectorized rawmemchr expander. It also mo

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-25 Thread Robin Dapp
. For now I kept the expander function but used a direct optab. Regards Robin From 4f793b71184b3301087780ed500f798d69328fc9 Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Fri, 13 Oct 2023 10:20:35 +0200 Subject: [PATCH v2] internal-fn: Add VCOND_MASK_LEN. In order to prevent simp

Re: [PATCH] RISC-V: Add rawmemchr expander.

2023-10-27 Thread Robin Dapp
> It seems that you didn't commit it yet. > > A nit comment: > > + int lmul = riscv_autovec_lmul == RVV_DYNAMIC ? RVV_M8 : riscv_autovec_lmul; > > I change you could use TARGET_MAX_LMUL No didn't commit yet, testsuite was still running. OK, added it, will commit later. Regards Robin

Re: [PATCH] genemit: Split insn-emit.cc into ten files.

2023-10-27 Thread Robin Dapp
bin From 248744c328440bff9cc339d2bf622852cbaac343 Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Thu, 12 Oct 2023 11:23:26 +0200 Subject: [PATCH v3] genemit: Split insn-emit.cc into several partitions. On riscv insn-emit.cc has grown to over 1.2 mio lines of code and compiling it takes considerable time. Therefor

Re: [PATCH] RISC-V: Support strided load/store

2023-10-31 Thread Robin Dapp
Hi Juzhe, LGTM once the middle-end parts are in. Just tiny nits. Nothing that would warrant a V2, though. > +;; = > +;; == Stried Load/Store missing a 'd' here. > +(define_predicate "vector_stride_extension_operand" > +

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-31 Thread Robin Dapp
>> +int >> +internal_fn_else_index (internal_fn fn) > > The function needs a comment, maybe: > > /* If FN is an IFN_COND_* or IFN_COND_LEN_* function, return the index of the >argument that is used when the condition is false. Return -1 otherwise. > */ > > OK for the internal-fn* and

Re: [PATCH V2] RISC-V: Fix redundant vsetvl in fixed-vlmax vectorized codes[PR112326]

2023-11-02 Thread Robin Dapp
Hi Juzhe, in principle this LGTM. It could use some function comments, though ;) > +imm_avl_p (machine_mode mode) > { >poly_uint64 nuints = GET_MODE_NUNITS (mode); > >return nuints.is_constant () > -/* The vsetivli can only hold register 0~31. */ > -? (IN_RANGE

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-02 Thread Robin Dapp
> Looks reasonable overall. The new match patterns are 1:1 the > same as the COND_ ones. That's a bit awkward, but I don't see > a good way to "macroize" stuff further there. Can you at least > interleave the COND_LEN_* ones with the other ones instead of > putting them all at the end? Yes, no

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-03 Thread Robin Dapp
> Could you explain why a special expansion is needed? (Sorry if you already > have and I missed it, bit overloaded ATM.) What does it do that is > different from what expand_fn_using_insn would do? All it does (in excess) is shuffle the arguments - vcond_mask_len has the mask as third operand

Re: [PATCH] RISC-V: Fix bug of AVL propagation PASS

2023-11-02 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-03 Thread Robin Dapp
> Ah, OK. IMO it's better to keep the optab operands the same as the IFN > operands, even if that makes things inconsistent with vcond_mask. > vcond_mask isn't really a good example to follow, since the operand > order is not only inconsistent with the IFN, it's also inconsistent > with the
