[Bug tree-optimization/114932] Improvement in CHREC can give large performance gains

2024-05-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 --- Comment #6 from Tamar Christina --- Created attachment 58096 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58096&action=edit exchange2.fppized-bad.f90.187t.ivopts

[Bug tree-optimization/114932] Improvement in CHREC can give large performance gains

2024-05-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 --- Comment #5 from Tamar Christina --- Created attachment 58095 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58095&action=edit exchange2.fppized-good.f90.187t.ivopts

[Bug tree-optimization/114932] Improvement in CHREC can give large performance gains

2024-05-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 --- Comment #4 from Tamar Christina --- reduced more: --- module brute_force integer, parameter :: r=9 integer block(r, r, 0) contains subroutine brute do do do do do

[Bug tree-optimization/114932] Improvement in CHREC can give large performance gains

2024-05-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 --- Comment #3 from Tamar Christina --- (In reply to Andrew Pinski from comment #2) > > which is harder for prefetchers to follow. > > This seems like a limitation in the HW prefetcher rather than anything else. > Maybe the cost model for

[Bug tree-optimization/114932] New: Improvement in CHREC can give large performance gains

2024-05-02 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 Bug ID: 114932 Summary: Improvement in CHREC can give large performance gains Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization Severity:

[Bug ipa/92538] Proposal for IPA init() constant propagation

2024-05-02 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92538 Tamar Christina changed: What|Removed |Added CC||jamborm at gcc dot gnu.org ---

[Bug target/114860] [14/15 regression] [aarch64] 511.povray regresses by ~5.5% with -O3 -flto -march=native -mcpu=neoverse-v2 since r14-10014-ga2f4be3dae04fa

2024-05-01 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860 --- Comment #3 from Tamar Christina --- I cannot reproduce this even recompiling libc.

[Bug target/114860] [14/15 regression] [aarch64] 511.povray regresses by ~5.5% with -O3 -flto -march=native -mcpu=neoverse-v2 since r14-10014-ga2f4be3dae04fa

2024-04-26 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860 Tamar Christina changed: What|Removed |Added CC||tnfchris at gcc dot gnu.org ---

[Bug target/114860] [14/15 regression] [aarch64] 511.povray regresses by ~5.5% with -O3 -flto -march=native -mcpu=neoverse-v2 since r14-10014-ga2f4be3dae04fa

2024-04-26 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860 --- Comment #1 from Tamar Christina --- Hmm, I am unable to reproduce this with -O3 -flto -mcpu=neoverse-v2 on a neoverse-v2 machine. Is any other option required? Also that code was new in gcc 14 and was partially reverted due to register

[Bug rtl-optimization/114766] ^ constraint modifier unexpectedly affects register class selection.

2024-04-20 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114766 --- Comment #2 from Tamar Christina --- (In reply to Vladimir Makarov from comment #1) > (In reply to Tamar Christina from comment #0) > > The documentation for ^ states: > > If it works for you, we could try to use the patch (although it needs

[Bug tree-optimization/114769] [14 Regression] Suspicious code in vect_recog_sad_pattern() since r14-1832

2024-04-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114769 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug tree-optimization/114769] [14 Regression] Suspicious code in vect_recog_sad_pattern() since r14-1832

2024-04-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114769 --- Comment #2 from Tamar Christina --- I believe this is safe, but the interface is definitely not the cleanest. vect_recog_absolute_difference has two callers: 1. vect_recog_sad_pattern where if you return true with unprom not set, then

[Bug target/113625] Interesting behavior with and without -mcpu=generic

2024-04-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113625 Tamar Christina changed: What|Removed |Added CC||tnfchris at gcc dot gnu.org ---

[Bug rtl-optimization/114766] New: ^ constraint modifier unexpectedly affects register class selection.

2024-04-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114766 Bug ID: 114766 Summary: ^ constraint modifier unexpectedly affects register class selection. Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords:

[Bug target/114741] [14 regression] aarch64 sve: unnecessary fmov for scalar int bit operations

2024-04-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/114513] [11/12/13/14 Regression] [aarch64] floating-point registers are used when GPRs are preferred

2024-04-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114513 Bug 114513 depends on bug 114741, which changed state. Bug 114741 Summary: [14 regression] aarch64 sve: unnecessary fmov for scalar int bit operations https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741 What|Removed

[Bug target/114741] [14 regression] aarch64 sve: unnecessary fmov for scalar int bit operations

2024-04-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org

[Bug target/114741] [14 regression] aarch64 sve: unnecessary fmov for scalar int bit operations

2024-04-16 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741 --- Comment #6 from Tamar Christina --- and the exact armv9-a cost model you quoted, also does the right codegen. https://godbolt.org/z/obafoT6cj There is just an inexplicable penalty being applied to the r->r alternative.

[Bug target/114741] [14 regression] aarch64 sve: unnecessary fmov for scalar int bit operations

2024-04-16 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741 Tamar Christina changed: What|Removed |Added CC||tnfchris at gcc dot gnu.org,

[Bug tree-optimization/113552] [11/12/13 Regression] vectorizer generates calls to vector math routines with 1 simd lane.

2024-04-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113552 Tamar Christina changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug tree-optimization/114403] [14 regression] LLVM miscompiled with -O3 -march=znver2 -fno-vect-cost-model since r14-6822-g01f4251b8775c8

2024-04-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403 Tamar Christina changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug tree-optimization/114403] [14 regression] LLVM miscompiled with -O3 -march=znver2 -fno-vect-cost-model since r14-6822-g01f4251b8775c8

2024-04-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403 --- Comment #26 from Tamar Christina --- (In reply to Richard Biener from comment #25) > That means, when the loop takes the early exit we _must_ take that during > the vector iterations. Peeling for gaps means if we would take the early >

[Bug tree-optimization/114403] [14 regression] LLVM miscompiled with -O3 -march=znver2 -fno-vect-cost-model since r14-6822-g01f4251b8775c8

2024-04-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403 --- Comment #24 from Tamar Christina --- (In reply to Richard Biener from comment #23) > Maybe easier to understand testcase: > > with -O3 -msse4.1 -fno-vect-cost-model we return 20 instead of 8. Adding > -fdisable-tree-cunroll avoids the

[Bug tree-optimization/114403] [14 regression] LLVM miscompiled with -O3 -march=znver2 -fno-vect-cost-model since r14-6822-g01f4251b8775c8

2024-04-11 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403 --- Comment #22 from Tamar Christina --- Note that due to the secondary exit, the actual full vector iteration count for 8 scalar elements at VF=4 is 2. And it's this boundary condition where we fail, since ceil (8/4) == 2. Any other value would
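
The boundary condition can be illustrated with plain integer arithmetic. The sketch below is not the vectorizer's bound computation; `ceil_div` and `floor_div` are hypothetical helpers that only show that rounding up and rounding down agree exactly when the scalar count is a multiple of the vectorization factor, as with 8 elements at VF=4.

```python
def ceil_div(n, vf):
    """Number of vector iterations when a partial final vector counts (round up)."""
    return -(-n // vf)


def floor_div(n, vf):
    """Number of full vector iterations only (round down)."""
    return n // vf


# Hypothetical illustration: list the scalar counts where both roundings agree.
VF = 4
exact = [n for n in range(1, 13) if ceil_div(n, VF) == floor_div(n, VF)]
print(exact)
```

For every other count the two results differ by one, so an off-by-one in the bound would only show up at exact multiples such as 8.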

[Bug tree-optimization/114403] [14 regression] LLVM miscompiled with -O3 -march=znver2 -fno-vect-cost-model since r14-6822-g01f4251b8775c8

2024-04-11 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403 --- Comment #21 from Tamar Christina --- Created attachment 57932 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57932&action=edit loop.c attached reduced testcase that reproduces the issue and also checks the buffer position and copied values.

[Bug tree-optimization/114635] OpenMP reductions fail dependency analysis

2024-04-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635 --- Comment #6 from Tamar Christina --- (In reply to Jakub Jelinek from comment #4) > Now, with SVE/RISCV vectors the actual vectorization factor is a poly_int > rather than constant. One possibility would be to use VLA arrays in those >

[Bug tree-optimization/114635] New: OpenMP reductions fail dependency analysis

2024-04-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635 Bug ID: 114635 Summary: OpenMP reductions fail dependency analysis Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal

[Bug target/114577] New: Inefficient codegen for SVE/NEON bridge

2024-04-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114577 Bug ID: 114577 Summary: Inefficient codegen for SVE/NEON bridge Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal

[Bug target/114510] [14 Regression] missed proping of multiply by 2 into address of load/stores

2024-04-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114510 Tamar Christina changed: What|Removed |Added CC||tnfchris at gcc dot gnu.org ---

[Bug rtl-optimization/114515] [14 Regression] Failure to use aarch64 lane forms after PR101523

2024-04-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114515 Tamar Christina changed: What|Removed |Added CC||tnfchris at gcc dot gnu.org

[Bug rtl-optimization/114575] New: [14 Regression] SVE addressing modes broken since g:839bc42772ba7af66af3bd16efed4a69511312ae

2024-04-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114575 Bug ID: 114575 Summary: [14 Regression] SVE addressing modes broken since g:839bc42772ba7af66af3bd16efed4a69511312ae Product: gcc Version: 14.0 Status: UNCONFIRMED

[Bug rtl-optimization/113682] Branches in branchless binary search rather than cmov/csel/csinc

2024-04-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113682 --- Comment #9 from Tamar Christina --- (In reply to Andrew Pinski from comment #8) > This might be the path splitting running on the gimple level causing issues > too; see PR 112402 . Ah that's a good shout. It looks like Richi already

[Bug rtl-optimization/113682] Branches in branchless binary search rather than cmov/csel/csinc

2024-04-02 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113682 Tamar Christina changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever confirmed|0

[Bug tree-optimization/114403] [14 regression] LLVM miscompiled with -O3 -march=znver2 -fno-vect-cost-model since r14-6822-g01f4251b8775c8

2024-04-02 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403 --- Comment #20 from Tamar Christina --- This is a bad interaction with early break and peeling for gaps. when peeling for gaps we set bias_for_lowest to 0, which then negates the ceil for the upper bound calculation when the div is exact. We

[Bug tree-optimization/114403] [14 regression] LLVM miscompiled with -O3 -march=znver2 -fno-vect-cost-model since r14-6822-g01f4251b8775c8

2024-04-02 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403 Tamar Christina changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed|

[Bug tree-optimization/114345] FRE missing knowledge of semantics of IFN loads

2024-03-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114345 --- Comment #5 from Tamar Christina --- (In reply to Richard Biener from comment #4) > Well, the shuffling in .LOAD_LANES will be a bit awkward to do, but sure. We > basically lack "constant folding" of .LOAD_LANES and similarly of course > we

[Bug target/114350] New: missing support for SVE widening floating point conversion

2024-03-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114350 Bug ID: 114350 Summary: missing support for SVE widening floating point conversion Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords:

[Bug tree-optimization/114345] FRE missing knowledge of semantics of IFN loads

2024-03-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114345 --- Comment #3 from Tamar Christina --- (In reply to Andrew Pinski from comment #2) > Oh VN does have some knowledge of MASK_STORE and LEN_STORE. Just not > LOAD_LANES . > > > See PR 106365 for MASK_STORE and LEN_STORE implementation.

[Bug tree-optimization/114346] New: vectorizer generates the same IV twice

2024-03-14 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114346 Bug ID: 114346 Summary: vectorizer generates the same IV twice Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal

[Bug tree-optimization/114345] New: FRE missing knowledge of semantics of IFN loads

2024-03-14 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114345 Bug ID: 114345 Summary: FRE missing knowledge of semantics of IFN loads Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal

[Bug tree-optimization/114339] [14 regression] Tor miscompiled with -O2 -mavx -fno-vect-cost-model since r14-6822

2024-03-14 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114339 --- Comment #6 from Tamar Christina --- vectorizer generates: mask_patt_21.19_58 = vect_perm_even_49 >= vect_cst__57; mask_patt_21.19_59 = vect_perm_even_55 >= vect_cst__57; vexit_reduc_63 = mask_patt_21.19_58 | mask_patt_21.19_59; if

[Bug tree-optimization/114151] [14 Regression] weird and inefficient codegen and addressing modes since r14-9193

2024-03-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151 --- Comment #17 from Tamar Christina --- > So doing in the vectorizer sth like the following should get us the best > possible ranges? Ah, probably only global ranges since the SCEV query > itself would still lack context sensitive info (but

[Bug tree-optimization/114234] [14 Regression] verify_ssa failure with early-break vectorisation

2024-03-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114234 Tamar Christina changed: What|Removed |Added Last reconfirmed||2024-03-05

[Bug tree-optimization/114192] scalar code left around following early break vectorization of reduction

2024-03-01 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114192 Tamar Christina changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED

[Bug target/98877] [AArch64] Inefficient code generated for tbl NEON intrinsics

2024-02-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877 --- Comment #12 from Tamar Christina --- and it's not the first time we have conditional lowering. We already do so for e.g. shifts, where shifting by an amount >= bitsize of a vector element is defined behavior on AArch64.

[Bug target/98877] [AArch64] Inefficient code generated for tbl NEON intrinsics

2024-02-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877 --- Comment #11 from Tamar Christina --- (In reply to Andrew Pinski from comment #10) > (In reply to Tamar Christina from comment #9) > > While RA should be able to deal with this, > > shouldn't we also just lower TBLs in gimple? > > > > This

[Bug tree-optimization/114151] [14 Regression] weird and inefficient codegen and addressing modes since g:a0b1798042d033fd2cc2c806afbb77875dd2909b

2024-02-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151 --- Comment #3 from Tamar Christina --- > > This was a correctness fix btw, so I'm not sure we can easily recover - we > could try using niter information for CHREC_VARIABLE but then there's > variable niter here so I don't see a chance. >

[Bug tree-optimization/114151] New: [14 Regression] weird and inefficient codegen and addressing modes since g:a0b1798042d033fd2cc2c806afbb77875dd2909b

2024-02-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151 Bug ID: 114151 Summary: [14 Regression] weird and inefficient codegen and addressing modes since g:a0b1798042d033fd2cc2c806afbb77875dd2909b Product: gcc

[Bug target/98877] [AArch64] Inefficient code generated for tbl NEON intrinsics

2024-02-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877 --- Comment #9 from Tamar Christina --- While RA should be able to deal with this, shouldn't we also just lower TBLs in gimple? There's no reason why this can't be a VEC_PERM_EXPR which would also get the copies removed at the gimple level and

[Bug target/102171] vget_low_*/vget_high_* intrinsics should become BIT_FIELD_REF during gimple

2024-02-27 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102171 --- Comment #3 from Tamar Christina --- (In reply to Andrew Pinski from comment #2) > I think I am going to implement this (or assign it internally to someone else > to implement). If you do, please also remove them from arm_neon.h and use the

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop since g:2efe3a7de0107618397264017fb045f237764cc7

2024-02-27 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441 Tamar Christina changed: What|Removed |Added CC||rsandifo at gcc dot gnu.org ---

[Bug tree-optimization/86530] Vectorization failure for a simple loop

2024-02-26 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86530 --- Comment #8 from Tamar Christina --- (In reply to Andrew Pinski from comment #6) > With my patch for V4QI, we still don't get the best code: > vect_perm_even_271 = VEC_PERM_EXPR 4, 6 }>; > vect_perm_even_273 = VEC_PERM_EXPR 4, 6 }>; >

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop since g:2efe3a7de0107618397264017fb045f237764cc7

2024-02-26 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441 --- Comment #27 from Tamar Christina --- Created attachment 57538 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57538&action=edit proposed1.patch proposed patch, this gets the gathers and scatters back. doing regression run.

[Bug tree-optimization/114099] [14 regression] ICE in find_uses_to_rename_use when building darktable-4.6.1

2024-02-25 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114099 --- Comment #8 from Tamar Christina --- Created attachment 57537 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57537&action=edit uses.patch new code seems sensitive to visitation order as get_virtual_phi returns NULL for blocks which don't

[Bug middle-end/114081] [14 regression] ICE in verify_dominators when building php-8.3.3 (error: dominator of 16 should be 111, not 3) since r14-6822

2024-02-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114081 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org

[Bug tree-optimization/114068] [14 regression] ICE when building darktable-4.6.1 (error: PHI node with wrong VUSE on edge from BB 25) since r14-8768

2024-02-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068 --- Comment #14 from Tamar Christina --- patch submitted https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646415.html

[Bug tree-optimization/114068] [14 regression] ICE when building darktable-4.6.1 (error: PHI node with wrong VUSE on edge from BB 25) since r14-8768

2024-02-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068 --- Comment #13 from Tamar Christina --- Created attachment 57510 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57510&action=edit candidate-patch1.patch candidate patch being tested. I was hoping to correct it during peeling itself when the

[Bug tree-optimization/114068] [14 regression] ICE when building darktable-4.6.1 (error: PHI node with wrong VUSE on edge from BB 25) since r14-8768

2024-02-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068 --- Comment #12 from Tamar Christina --- looks like the moving of the store didn't update a stray out of block use of the MEM. working on patch.

[Bug tree-optimization/114068] [14 regression] ICE when building darktable-4.6.1 (error: PHI node with wrong VUSE on edge from BB 25) since r14-8768

2024-02-23 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org

[Bug target/114063] New: Use IFN_CHECK_RAW_PTRS/IFN_CHECK_WAR_PTRS for Advanced SIMD

2024-02-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114063 Bug ID: 114063 Summary: Use IFN_CHECK_RAW_PTRS/IFN_CHECK_WAR_PTRS for Advanced SIMD Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords:

[Bug tree-optimization/114061] GCC fails vectorization when using __builtin_prefetch

2024-02-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061 --- Comment #4 from Tamar Christina --- (In reply to Andrew Pinski from comment #3) > Confirmed. > > Though maybe we should drop them in the vectorized version of the loop. HW > prefetchers usually do a decent job and sometimes (maybe most) SW

[Bug tree-optimization/114061] GCC fails vectorization when using __builtin_prefetch

2024-02-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061 --- Comment #2 from Tamar Christina --- (In reply to Andrew Pinski from comment #1) > I thought there was already one recorded about this. I could only find https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103938 about an ICE when prefetching a

[Bug tree-optimization/114061] New: GCC fails vectorization when using __builtin_prefetch

2024-02-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061 Bug ID: 114061 Summary: GCC fails vectorization when using __builtin_prefetch Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization Severity:

[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra

2024-02-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257 --- Comment #5 from Tamar Christina --- (In reply to Sam James from comment #3) > (In reply to Richard Earnshaw from comment #2) > I'm missing why the combination then works though? So we've made several changes here over time. -mcpu=native

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop since g:2efe3a7de0107618397264017fb045f237764cc7

2024-02-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441 Tamar Christina changed: What|Removed |Added Ever confirmed|0 |1 Summary|[14 Regression]

[Bug target/113295] [14 Regression] SPEC 2006 416.gamess miscompares on Aarch64 when built with -Ofast -mcpu=native since g:2f46e3578d45ff060a0a329cb39d4f52878f9d5a

2024-02-21 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113295 Tamar Christina changed: What|Removed |Added Keywords|needs-bisection | Summary|[14 Regression]

[Bug target/113295] [14 Regression] SPEC 2006 416.gamess miscompares on Aarch64 when built with -Ofast -march=native -flto

2024-02-21 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113295 --- Comment #3 from Tamar Christina --- I'm however able to reproduce it at -Ofast alone, no need for `-flto`

[Bug target/113295] [14 Regression] SPEC 2006 416.gamess miscompares on Aarch64 when built with -Ofast -march=native -flto

2024-02-21 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113295 --- Comment #2 from Tamar Christina --- bisected to commit g:2f46e3578d45ff060a0a329cb39d4f52878f9d5a Author: Richard Sandiford Date: Thu Dec 14 13:46:16 2023 + aarch64: Improve handling of accumulators in early-ra Being very

[Bug target/113295] [14 Regression] SPEC 2006 416.gamess miscompares on Aarch64 when built with -Ofast -march=native -flto

2024-02-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113295 Tamar Christina changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed|

[Bug middle-end/111156] [14 Regression] aarch64 aarch64/sve/mask_struct_store_4.c failures

2024-02-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111156 --- Comment #21 from Tamar Christina --- (In reply to Richard Biener from comment #18) > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc > index 7cf9504398c..8deeecfd4aa 100644 > --- a/gcc/tree-vect-slp.cc > +++ b/gcc/tree-vect-slp.cc

[Bug middle-end/111156] [14 Regression] aarch64 aarch64/sve/mask_struct_store_4.c failures

2024-02-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111156 --- Comment #15 from Tamar Christina --- and just -O3 -march=armv8-a+sve

[Bug middle-end/111156] [14 Regression] aarch64 aarch64/sve/mask_struct_store_4.c failures

2024-02-15 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111156 --- Comment #14 from Tamar Christina --- (In reply to Richard Biener from comment #13) > I didn't add STMT_VINFO_SLP_VECT_ONLY, I'm quite sure we can now do both SLP > of masked loads and stores, so yes, STMT_VINFO_SLP_VECT_ONLY (when we formed

[Bug tree-optimization/112376] [14 Regression] gcc.dg/tree-ssa/ssa-dom-thread-7.c missed threading in aarch64 case

2024-02-14 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112376 Tamar Christina changed: What|Removed |Added Last reconfirmed||2024-02-15 Summary|[14

[Bug middle-end/111156] [14 Regression] aarch64 aarch64/sve/mask_struct_store_4.c failures

2024-02-14 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111156 Tamar Christina changed: What|Removed |Added CC||rguenth at gcc dot gnu.org ---

[Bug fortran/107071] gfortran.dg/ieee/modes_1.f90 fails on aarch64-linux

2024-02-14 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107071 Tamar Christina changed: What|Removed |Added CC||tnfchris at gcc dot gnu.org ---

[Bug rtl-optimization/113903] sched1 should schedule across EBBS

2024-02-13 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113903 --- Comment #2 from Tamar Christina --- (In reply to Alexander Monakov from comment #1) > Lifting those insns from the L8 BB to the L10 BB requires duplicating them > on all incoming edges targeting L8, doesn't it? > No, because they're

[Bug tree-optimization/113734] [14 regression] libarchive miscompiled (fails libarchive_test_read_format_rar5_extra_field_version test) since r14-8768-g85094e2aa6dba7

2024-02-13 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734 Tamar Christina changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug rtl-optimization/113903] New: sched1 should schedule across EBBS

2024-02-13 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113903 Bug ID: 113903 Summary: sched1 should schedule across EBBS Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal

[Bug tree-optimization/113734] [14 regression] libarchive miscompiled (fails libarchive_test_read_format_rar5_extra_field_version test) since r14-8768-g85094e2aa6dba7

2024-02-13 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734 Tamar Christina changed: What|Removed |Added Component|middle-end |tree-optimization

[Bug middle-end/113734] [14 regression] libarchive miscompiled (fails libarchive_test_read_format_rar5_extra_field_version test) since r14-8768-g85094e2aa6dba7

2024-02-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734 --- Comment #24 from Tamar Christina --- The case I thought would go wrong with the above fix is: #include #include #include #define N 306 #define NEEDLE 135 __attribute__ ((noipa, noinline)) int use(int x[N]) { printf("res=%d\n",

[Bug middle-end/113734] [14 regression] libarchive miscompiled (fails libarchive_test_read_format_rar5_extra_field_version test) since r14-8768-g85094e2aa6dba7

2024-02-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734 --- Comment #23 from Tamar Christina --- small standalone reducer: #include #include #include #define N 306 #define NEEDLE 136 __attribute__ ((noipa, noinline)) int use(int x[N]) { printf("res=%d\n", x[NEEDLE]); return x[NEEDLE]; }

[Bug middle-end/113734] [14 regression] libarchive miscompiled (fails libarchive_test_read_format_rar5_extra_field_version test) since r14-8768-g85094e2aa6dba7

2024-02-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734 --- Comment #22 from Tamar Christina --- (In reply to Richard Biener from comment #21) > loop->nb_iterations_upper_bound exactly is an upper bound on the number of > latch executions, so maybe I'm missing the point here. When we update it it >

[Bug middle-end/113734] [14 regression] libarchive miscompiled (fails libarchive_test_read_format_rar5_extra_field_version test) since r14-8768-g85094e2aa6dba7

2024-02-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734 --- Comment #20 from Tamar Christina --- [local count: 21718864]: ... _54 = (short unsigned int) bits_106; _26 = _54 >> 9; _88 = _139 + 7; _89 = _88 & 7; _111 = _26 + 10; [local count: 181308616]: # i_66 = PHI #

[Bug middle-end/113734] [14 regression] libarchive miscompiled (fails libarchive_test_read_format_rar5_extra_field_version test) since r14-8768-g85094e2aa6dba7

2024-02-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734 --- Comment #19 from Tamar Christina --- Ok, removing all the noise shows that this is the same issue as I saw before. The code out of the vectorizer is correct, but cunroll does a dodgy unrolling. -fdisable-tree-cunroll confirms it's the

[Bug middle-end/113734] [14 regression] libarchive miscompiled (fails libarchive_test_read_format_rar5_extra_field_version test) since r14-8768-g85094e2aa6dba7

2024-02-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734 --- Comment #18 from Tamar Christina --- Loop that gets miscompiled is the initialization loop: while (parse_tables_n-- && i < 306) table[i++] = 0; and indeed, the compiler seems to also be

[Bug middle-end/113734] [14 regression] libarchive miscompiled (fails libarchive_test_read_format_rar5_extra_field_version test) since r14-8768-g85094e2aa6dba7

2024-02-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734 --- Comment #17 from Tamar Christina --- (In reply to Sam James from comment #16) > Created attachment 57393 [details] > test.c > > OK, all done now (I figured I'd let cvise finish). No more :) > > By the way, this fails on arm64 too (at

[Bug middle-end/113734] [14 regression] libarchive miscompiled (fails libarchive_test_read_format_rar5_extra_field_version test) since r14-8768-g85094e2aa6dba7

2024-02-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734 --- Comment #15 from Tamar Christina --- (In reply to Sam James from comment #14) > Created attachment 57390 [details] > test.c > > I'll try reducing it preprocessed now (couldn't do it before as checking w/ > clang as well in the reduction

[Bug tree-optimization/113808] [14 Regression] FAIL: libgomp.fortran/non-rectangular-loop-1.f90 since r14-8768

2024-02-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113808 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|NEW

[Bug tree-optimization/113808] [14 Regression] FAIL: libgomp.fortran/non-rectangular-loop-1.f90 since r14-8768

2024-02-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113808 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org

[Bug tree-optimization/113808] [14 Regression] FAIL: libgomp.fortran/non-rectangular-loop-1.f90

2024-02-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113808 --- Comment #2 from Tamar Christina --- I guess whether that code is correct depends on which exit was picked though. I'll look at dump too.

[Bug tree-optimization/113750] [14 Regression] ICE in vect building gcc/m2/gm2-libs/NumberIO.mod since r14-8769-g64b0130bb6702c

2024-02-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113750 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug tree-optimization/113731] [14 regression] ICE when building libbsd since r14-8768-g85094e2aa6dba7

2024-02-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113731 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug middle-end/113734] [14 regression] libarchive miscompiled (fails libarchive_test_read_format_rar5_extra_field_version test) since r14-8768-g85094e2aa6dba7

2024-02-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734 --- Comment #10 from Tamar Christina --- (In reply to Richard Biener from comment #9) > Another bug in the dependence checking code is > > if (dr_may_alias_p (dr_ref, dr_read, loop_nest)) > > which will end up using TBAA -

[Bug tree-optimization/113731] [14 regression] ICE when building libbsd since r14-8768-g85094e2aa6dba7

2024-02-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113731 --- Comment #9 from Tamar Christina --- (In reply to Matthias Klose from comment #8) > the proposed patch doesn't fix the amdgcn-amdhsa bootstrap. So what is the error with the patch? The output can't be the same as the function was removed.

[Bug tree-optimization/51492] vectorizer does not support saturated arithmetic patterns

2024-02-06 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51492 --- Comment #17 from Tamar Christina --- (In reply to Li Pan from comment #16) > I have a try like below and finally have the Standard Name "SAT_ADD". Could > you please help to double-check if my understanding is correct? > > Given below

[Bug middle-end/113734] [14 regression] libarchive miscompiled (fails libarchive_test_read_format_rar5_extra_field_version test) since r14-8768-g85094e2aa6dba7

2024-02-06 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734 --- Comment #6 from Tamar Christina --- The reason for the miscompile popping up is this change from the previous patch diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc index 109d4ce5192..df3eab2e8d5 100644 ---

[Bug tree-optimization/113539] [14 Regression] perlbench miscompiled on aarch64 since r14-8223-g1c1853a70f

2024-02-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113539 Tamar Christina changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug tree-optimization/113731] [14 regression] ICE when building libbsd since r14-8768-g85094e2aa6dba7

2024-02-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113731 Tamar Christina changed: What|Removed |Added CC||burnus at gcc dot gnu.org ---

[Bug middle-end/113771] [14 Regression][GCN] ICE during GIMPLE pass: vect in vect_transform_loop tree-vect-loop.cc:11969

2024-02-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113771 Tamar Christina changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug middle-end/113734] [14 regression] libarchive miscompiled (fails libarchive_test_read_format_rar5_extra_field_version test) since r14-8768-g85094e2aa6dba7

2024-02-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734 --- Comment #4 from Tamar Christina --- Narrowed down the change part that caused the failure, but it should have been correct to do. So looking into why the change caused the failure. Please hold..
