[Bug tree-optimization/115275] [14/15 Regression] Missed optimization for Dead Code Elimination
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115275 Richard Biener changed: What|Removed |Added Known to work||13.3.0 Keywords||missed-optimization, ||needs-bisection Priority|P3 |P2 Known to fail||14.1.0, 15.0 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Target Milestone|--- |14.2 Last reconfirmed||2024-05-29 --- Comment #1 from Richard Biener --- Confirmed.
[Bug sanitizer/115273] [12 Regression] passing zero to ctz() check missing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115273 Richard Biener changed: What|Removed |Added Target Milestone|--- |12.4
[Bug debug/115272] [debug] complex type is hard to related back to base type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115272 --- Comment #2 from Richard Biener --- (In reply to Richard Biener from comment #1) > How does it work for 'double' vs. 'long double' themselves? > > <1><32>: Abbrev Number: 3 (DW_TAG_base_type) > <33> DW_AT_byte_size : 16 > <34> DW_AT_encoding: 4(float) > <35> DW_AT_name: (indirect string, offset: 0x60): long double > > so if it's not distinguishable via DW_AT_byte_size you look into > DW_AT_name as well? So it looks like doing the same for _Complex long double > is perfectly in line? Take for example powerpc with its dual IEEE and IBM long double 128 format.
[Bug debug/115272] [debug] complex type is hard to related back to base type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115272 --- Comment #1 from Richard Biener --- How does it work for 'double' vs. 'long double' themselves? <1><32>: Abbrev Number: 3 (DW_TAG_base_type) <33> DW_AT_byte_size : 16 <34> DW_AT_encoding: 4(float) <35> DW_AT_name: (indirect string, offset: 0x60): long double so if it's not distinguishable via DW_AT_byte_size you look into DW_AT_name as well? So it looks like doing the same for _Complex long double is perfectly in line?
[Bug tree-optimization/115252] The SLP vectorizer failed to perform automatic vectorization on pixel_sub_wxh of x264
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115252 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED Target||x86_64-*-* --- Comment #3 from Richard Biener --- This testcase should be fixed now.
[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 115252, which changed state. Bug 115252 Summary: The SLP vectorizer failed to perform automatic vectorization on pixel_sub_wxh of x264 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115252 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/114435] PCOM messes up vectorization some times
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114435 --- Comment #10 from Richard Biener --- (In reply to Richard Biener from comment #9) > So the "pcom messes up SLP" part should be fixed now. The pass dependence > of invariant/store motion and unswitching (and likely also loop splitting) is > something different. We may want to track this in a separate bug. Note there's a conditional (on graphite) LIM pass after high-level loop opts, it might be an option to turn it into an unconditional instance.
[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 114435, which changed state. Bug 114435 Summary: PCOM messes up vectorization some times https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114435 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 Bug 26163 depends on bug 114435, which changed state. Bug 114435 Summary: PCOM messes up vectorization some times https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114435 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/114435] PCOM messes up vectorization some times
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114435 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #9 from Richard Biener --- So the "pcom messes up SLP" part should be fixed now. The pass dependence of invariant/store motion and unswitching (and likely also loop splitting) is something different. We may want to track this in a separate bug.
[Bug tree-optimization/114435] PCOM messes up vectorization some times
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114435 --- Comment #7 from Richard Biener --- Ah, the missed store motion is because of the IS_NAN (k) check which makes the memory accesses only conditionally executed and thus possibly trapping. We "fix" that only during loop unswitching which hoists the invariant check. But there's no store-motion after unswitching. Removing this check shows we can apply store-motion. /* If it can trap, it must be always executed in LOOP. Readonly memory locations may trap when storing to them, but tree_could_trap_p is a predicate for rvalues, so check that explicitly. */ base = get_base_address (ref->mem.ref); if ((tree_could_trap_p (ref->mem.ref) || (DECL_P (base) && TREE_READONLY (base))) /* ??? We can at least use false here, allowing loads? We are forcing conditional stores if the ref is not always stored to later anyway. So this would only guard the load we need to emit. Thus when the ref is not loaded we can elide this completely? */ && !ref_always_accessed_p (loop, ref, true)) return false; as the comment explains we cannot do "conditional" initialization, thus guard the load with the duplicated invariant condition.
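To make the comment above concrete, here is a minimal C sketch of the situation it describes. It is not the testcase from the PR: the function names, the four-element accumulator and the use of __builtin_isnan in place of IS_NAN are assumptions for illustration only. The first function keeps the accumulator update under a loop-invariant NaN check, so the accesses to r are only conditionally executed. The second is a hand-written version of what unswitching followed by store motion could produce, assuming r and in do not overlap.

```c
/* Hypothetical loop in the shape the comment describes; this is not the
   testcase from the PR.  The accesses to r[] sit under a loop-invariant
   check, so they are only conditionally executed inside the loop.  */
void
accumulate_guarded (double *r, const double *in, double k, unsigned long sz)
{
  for (unsigned long i = 0; i < sz; ++i)
    if (!__builtin_isnan (k))
      for (int j = 0; j < 4; ++j)
        r[j] += in[4 * i + j] * k;
}

/* Hand-written equivalent with the invariant check hoisted out of the
   loop and the accumulator kept in locals, so r[] is loaded once before
   the loop and stored once after it.  Assumes r and in do not overlap.  */
void
accumulate_unswitched (double *r, const double *in, double k, unsigned long sz)
{
  /* Do not touch r[] at all when the guarded version would not.  */
  if (sz == 0 || __builtin_isnan (k))
    return;

  double acc[4] = { r[0], r[1], r[2], r[3] };
  for (unsigned long i = 0; i < sz; ++i)
    for (int j = 0; j < 4; ++j)
      acc[j] += in[4 * i + j] * k;

  for (int j = 0; j < 4; ++j)
    r[j] = acc[j];
}
```

The second form only illustrates the intended end result; whether GCC reaches it depends on the pass ordering discussed in this PR.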
[Bug tree-optimization/114435] PCOM messes up vectorization some times
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114435 --- Comment #6 from Richard Biener --- (In reply to jchrist from comment #5) > I tried your patch and it leaves an awful amount of dead stores to the > accumulator within the loop. I also still see the stores inside the loop in > gimple. Is this really desired? Or is this an artifact of our unrolling > setting on s390x? But even in the gimple I see the store inside the loop. The main issue is that we cannot do store-motion in the loop during invariant motion. I have not checked why. So the (vector) accumulator update stays in the loop and if you unroll this say during RTL then you'll see the duplicates. Note that then apparently RTL DSE also cannot remove them (likely due to the same reason, all memory accesses use alias-set 1).
[Bug tree-optimization/114435] PCOM messes up vectorization some times
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114435 Richard Biener changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org --- Comment #4 from Richard Biener --- (In reply to Richard Biener from comment #3) [..] > Note it would be > better to avoid the SSA copy generated by predcom. that looks a bit difficult with the way it operates. pcom could set PENDING_TODO_force_next_scalar_cleanup, this does the trick for me. diff --git a/gcc/tree-predcom.cc b/gcc/tree-predcom.cc index 75a4c85164c..9844fee1e97 100644 --- a/gcc/tree-predcom.cc +++ b/gcc/tree-predcom.cc @@ -3522,6 +3522,9 @@ tree_predictive_commoning (bool allow_unroll_p) } } + if (ret != 0) +cfun->pending_TODOs |= PENDING_TODO_force_next_scalar_cleanup; + return ret; }
[Bug tree-optimization/114435] PCOM messes up vectorization some times
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114435 --- Comment #3 from Richard Biener --- Looking again the reason for the "bad" vectorization with pcom applied is t.c:23:23: missed: Build SLP failed: operation unsupported _51 = r__r0_lsm0.7_7; that is, pcom leaves around SSA name copies which we do not handle. We could probably somehow ignore those during SLP build (but we've most of the time just fixed whoever leaves those around). Maybe it's time to do this. Note we do not want a plain copy in the SLP tree, instead when looking for the def of the operands of the PHI. Note it would be better to avoid the SSA copy generated by predcom. Sneaking in a copy_prop pass after pcom just for checking vectorizes the thing just fine, including the added recurrence: [local count: 70429947]: _12 = {k_25(D), k_25(D)}; vect__8.10_28 = MEM [(double *)r_26(D)]; vect__8.11_5 = MEM [(double *)r_26(D) + 16B]; ivtmp.24_58 = (unsigned long) in_27(D); _65 = (unsigned long) sz_24(D); _66 = _65 * 32; _68 = ivtmp.24_58 + _66; [local count: 640272252]: # vect_r__r0_lsm0.17_15 = PHI # vect_r__r0_lsm0.17_30 = PHI # ivtmp.24_51 = PHI _20 = (void *) ivtmp.24_51; vect__47.14_9 = MEM [(double *)_20]; vect__47.15_11 = MEM [(double *)_20 + 16B]; vect__46.16_13 = vect__47.14_9 * _12; vect__46.16_14 = vect__47.15_11 * _12; vect__45.18_16 = vect__46.16_13 + vect_r__r0_lsm0.17_15; vect__45.18_17 = vect__46.16_14 + vect_r__r0_lsm0.17_30; MEM [(double *)r_26(D)] = vect__45.18_16; MEM [(double *)r_26(D) + 16B] = vect__45.18_17; ivtmp.24_54 = ivtmp.24_51 + 32; if (ivtmp.24_54 != _68) goto ; [89.00%]
[Bug tree-optimization/115267] [14./15 Regression] Undue warning about undefined behavior in loop with varying limits
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115267 Richard Biener changed: What|Removed |Added Keywords||diagnostic CC||rguenth at gcc dot gnu.org Known to work||13.3.0 Summary|Undue warning about |[14./15 Regression] Undue |undefined behavior in loop |warning about undefined |with varying limits |behavior in loop with ||varying limits Known to fail||14.1.0, 15.0 Priority|P3 |P2 Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Target Milestone|--- |14.2 Last reconfirmed||2024-05-29 --- Comment #1 from Richard Biener --- Confirmed. We're confusing ourselves with splitting the loop, I haven't dug into how we confuse ourselves yet. -fno-split-loops avoids the diagnostic.
[Bug testsuite/115262] [15 regression] gcc.target/powerpc/pr66144-3.c fails after r15-831-g05daf617ea22e1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115262 Richard Biener changed: What|Removed |Added Target Milestone|--- |15.0
[Bug rtl-optimization/115261] [11/12/13/14/15 regression] FAIL: gcc.target/s390/vector/vec-abi-vararg-1.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115261 Richard Biener changed: What|Removed |Added Keywords||missed-optimization Version|unknown |14.1.0 Target Milestone|--- |11.5 CC||rguenth at gcc dot gnu.org --- Comment #1 from Richard Biener --- I suppose it's di_result[0] += si_result[0]; di_result[1] += si_result[1]; on x86 we do vect__5.11_99 = (vector(2) long long int) _100; vect__6.12_98 = _29 + vect__5.11_99; as we have a extendv2siv2di pattern. It's probably easier to try reproducing with a testcase not involving varargs but {di,si}_result initialized from incoming parameters?
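A reduced testcase along the lines suggested above might look like the following sketch. It is an assumption about the shape only: the function name widen_add and its parameter list are hypothetical, and nothing here is confirmed to reproduce the s390 failure. It initializes di_result and si_result from incoming parameters instead of varargs and then performs the two widening additions quoted in the comment.

```c
/* Hypothetical reduced testcase: the array names follow the comment,
   everything else is made up for illustration.  */
long long di_result[2];
int si_result[2];

/* noinline keeps the parameters opaque to the caller's constants.  */
__attribute__ ((noinline)) void
widen_add (long long d0, long long d1, int s0, int s1)
{
  /* Initialize from incoming parameters rather than va_arg.  */
  di_result[0] = d0;
  di_result[1] = d1;
  si_result[0] = s0;
  si_result[1] = s1;

  /* The two statements quoted in the comment: an int to long long
     widening followed by an add, on two adjacent lanes.  */
  di_result[0] += si_result[0];
  di_result[1] += si_result[1];
}
```

Compiling such a function with the vectorizer dump enabled would show whether the widening conversion is vectorized as on x86; that outcome is not claimed here.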
[Bug tree-optimization/115149] [14 Regression] ICE on valid code at -O3 with "-fno-inline -fno-tree-vrp -fno-ipa-sra -fno-tree-dce -fno-tree-ch" on x86_64-linux-gnu: verify_ssa failed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115149 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #6 from Richard Biener --- Fixed.
[Bug target/115254] [15 Regression] GCN regressions from "Avoid splitting store dataref groups during SLP discovery"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115254 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #9 from Richard Biener --- Fixed then.
[Bug target/115254] [15 Regression] GCN regressions from "Avoid splitting store dataref groups during SLP discovery"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115254 --- Comment #6 from Richard Biener --- (In reply to Thomas Schwinge from comment #5) [..] > (In reply to Richard Biener from comment #2) > > Note for gcc.dg/vect/vect-gather-4.c with -mgather and gather support in the > > ISA on x86_64 I get two 'vectorizing stmts using SLP', for f1 and f2 only. > > > > Does that match GCN? > > In addition to 'f1', 'f2', GCN target ('-march=gfx908') apparently can do > 'f3', too: > > [...]/gcc.dg/vect/vect-gather-4.c:37:21: note: vectorizing stmts using > SLP. > > Attaching that 'vect-gather-4.c.179t.vect'. Yeah, so GCN can handle all gathers. > > We unfortunately cannot handle masked gathers as "emulated". > > > > And we don't have good dejagnu target selectors for this either. Which we'd need to "fix" this - note we didn't check at all that the loops are vectorized! What we did want to check is that we do not mangle both feeding masked gathers into the same SLP branch, but we have really no indicator for this now. I suppose we could change this to scan note: node 0x4300808 (max_nunits=64, refcnt=1) vector(64) int note:op template: patt_34 = .MASK_GATHER_LOAD ((sizetype) _71, _5, 4, 0, _37); note:stmt 0 patt_34 = .MASK_GATHER_LOAD ((sizetype) _71, _5, 4, 0, _37); note:children 0x4300560 0x43004d8 specifically _not_ note:stmt 1 ... = .MASK_GATHER_LOAD but then on x86-64 you'd not see .MASK_GATHER_LOAD, neither for emulated gather discovery. And you _do_ have a 'stmt 1' for the SLP store. On x86-64 with native gather support there's .MASK_LOAD, so I suppose given we know we cannot emulate a mask gather we can change it to a scan-not of 'stmt 1 .* = .MASK' The following works for me - does it work for you? 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-gather-4.c b/gcc/testsuite/gcc.dg/vect/vect-gather-4.c index d18094d6982..edd9a6783c2 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-gather-4.c +++ b/gcc/testsuite/gcc.dg/vect/vect-gather-4.c @@ -45,4 +45,7 @@ f3 (int *restrict y, int *restrict x, int *restrict indices) } } -/* { dg-final { scan-tree-dump-not "vectorizing stmts using SLP" vect } } */ +/* We do not want to see a two-lane .MASK_LOAD or .MASK_GATHER_LOAD since + the gathers are different on each lane. This is a bit fragile and + should possibly be turned into a runtime test. */ +/* { dg-final { scan-tree-dump-not "stmt 1 \[^\r\n\]* = .MASK" vect } } */
[Bug target/115259] [15 Regressions] GCN vs. "tree-optimization/115144 - improve sinking destination choice"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115259 Richard Biener changed: What|Removed |Added Keywords||wrong-code Target Milestone|--- |15.0 --- Comment #1 from Richard Biener --- Possibly. Did you pick up the followup fix? It might go unnoticed and produce wrong-code. r15-850-gf9fbb47987efc8, that is. There's no debug counter in sinking to more easily bisect what goes wrong in libgfortran. Did you already find a single responsible TU in libgfortran?
[Bug tree-optimization/114948] [15 Regression] ICE on valid code at -O3 with "-fno-tree-ccp -fno-tree-ch" on x86_64-linux-gnu: in check_loop_closed_ssa_def, at tree-ssa-loop-manip.cc:647
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114948 Richard Biener changed: What|Removed |Added Status|NEW |WAITING --- Comment #5 from Richard Biener --- can't reproduce on trunk either ... the godbolt link is also fine.
[Bug tree-optimization/115236] [15 regression] Wrong code at -O1 and above with -fno-tree-fre and volatile pointers since r15-579-ga9251ab3c91c8c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115236 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #6 from Richard Biener --- Should now finally be fixed. Thanks for the report.
[Bug rtl-optimization/115258] [14 Regression][aarch64] Additional XORs generated after r14-6290-g9f0f7d802482a8958d6cdc72f1fe0c8549db2182
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115258 Richard Biener changed: What|Removed |Added Keywords||missed-optimization Target Milestone|--- |14.2
[Bug tree-optimization/115256] [15 Regression] 502.gcc_r Run failed with '-march=native -Ofast -funroll-loops -flto' since r15-571-g1e0ae1f52741f7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115256 --- Comment #2 from Richard Biener --- Confirmed on Zen3 btw, fails with the test input already. Note that this may still be a latent issue in 502.gcc_r. -funroll-loops isn't necessary, -O3 -flto was enough to reproduce (no specific sub-architecture required). -fno-strict-aliasing avoids the issue. --param dse-max-object-size=0 doesn't help (this turns off live byte tracking). The patch itself likely adds quite some extra DSE so that's too much to track down. DSE doesn't have a debug counter at the moment, but "bisecting" --param dse-max-alias-queries-per-store shows the issue still happens with 64 but not with 48. The issue still reproduces with -flto-partition=1to1 (if one wants to try per-TU compile flags) and with -flto-partition=one (if you want to add a debug counter and bisect the bad store, but =one is slow). We ICE in cfgloopmanip.c:create_preheader here: basic_block create_preheader (struct loop *loop, int flags) { edge e, fallthru; basic_block dummy; int nentry = 0; bool irred = false; bool latch_edge_was_fallthru; edge one_succ_pred = NULL, single_entry = NULL; edge_iterator ei; FOR_EACH_EDGE (e, ei, loop->header->preds) { if (e->src == loop->latch) continue; irred |= (e->flags & EDGE_IRREDUCIBLE_LOOP) != 0; nentry++; single_entry = e; if (single_succ_p (e->src)) one_succ_pred = e; } gcc_assert (nentry); ^^^ placing noinline on the above function still reproduces the issue. We seem to run the above for the loop tree root but call from create_preheaders which does 1425 FOR_EACH_LOOP (li, loop, 0) 1426 create_preheader (loop, flags); (note absence of LI_INCLUDE_ROOT) so somehow the loop iterator setup is broken.
[Bug tree-optimization/115256] [15 Regression] 502.gcc_r Run failed with '-march=native -Ofast -funroll-loops -flto' since r15-571-g1e0ae1f52741f7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115256 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2024-05-28 Target Milestone|--- |15.0 Keywords||needs-reduction, wrong-code Ever confirmed|0 |1 Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org --- Comment #1 from Richard Biener --- Mine.
[Bug target/115254] [15 Regression] GCN regressions from "Avoid splitting store dataref groups during SLP discovery"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115254 --- Comment #4 from Richard Biener --- The gcc.dg/vect/vect-gather-4.c FAIL should be still present.
[Bug target/115254] [15 Regression] GCN regressions from "Avoid splitting store dataref groups during SLP discovery"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115254 --- Comment #2 from Richard Biener --- Note for gcc.dg/vect/vect-gather-4.c with -mgather and gather support in the ISA on x86_64 I get two 'vectorizing stmts using SLP', for f1 and f2 only. Does that match GCN? We unfortunately cannot handle masked gathers as "emulated". And we don't have good dejagnu target selectors for this either.
[Bug target/115254] [15 Regression] GCN regressions from "Avoid splitting store dataref groups during SLP discovery"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115254 Richard Biener changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2024-05-28 Target Milestone|--- |15.0 Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Status|UNCONFIRMED |ASSIGNED --- Comment #1 from Richard Biener --- For gcc.dg/vect/slp-cond-2-big-array.c 4 is indeed expected. We run into slp-cond-2-big-array.c:39:17: note: SLP discovery limit exceeded for f3 on x86-64. I have a patch to cherry-pick to avoid this. gcc.dg/vect/slp-cond-2.c is the same testcase. gcc.dg/vect/vect-gather-4.c is a badly written testcase, we now indeed expect to SLP here. I'll see to pick the change that should help a bit.
[Bug target/115253] [14/15 regression] New tests added by r14-10122-gad45086178d833 fail on Cortex M23 and M55 bare metal targets
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115253 Richard Biener changed: What|Removed |Added Target Milestone|--- |14.2 Target||arm
[Bug tree-optimization/115252] The SLP vectorizer failed to perform automatic vectorization on pixel_sub_wxh of x264
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115252 Richard Biener changed: What|Removed |Added Ever confirmed|0 |1 Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Blocks||53947 Keywords||missed-optimization Last reconfirmed||2024-05-28 Status|UNCONFIRMED |ASSIGNED --- Comment #1 from Richard Biener --- Thanks for the report. I am working on improvements in this area. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 [Bug 53947] [meta-bug] vectorizer missed-optimizations
[Bug d/115249] [14/15 regression] gdc.test/runnable/test34.d etc. FAIL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115249 --- Comment #1 from Richard Biener --- Does it fail on the 14 branch as well? If so the target milestone should be 14.2, otherwise the summary should be [15 Regression]
[Bug target/115248] [15 regresion] aarch64/sve/pre_cond_share_1.c fails since r15-276-gbed6ec161be8c5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115248 Richard Biener changed: What|Removed |Added Target Milestone|--- |15.0 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Keywords||missed-optimization CC||amacleod at redhat dot com Last reconfirmed||2024-05-28 --- Comment #2 from Richard Biener --- I don't think it's costing, it's failing to evaluate a relation - the path: 16->18->xx REJECTED is usually when we fail to simplify the end condition in BB 18. Enabling --param ranger-debug=all will probably reveal the more important difference.
[Bug tree-optimization/115243] error: stmt with wrong VUSE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115243 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Component|c |tree-optimization Resolution|--- |DUPLICATE Version|unknown |15.0 --- Comment #2 from Richard Biener --- It's fixed in my tree. *** This bug has been marked as a duplicate of bug 115226 ***
[Bug tree-optimization/115226] [15 regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_ssa failed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115226 Richard Biener changed: What|Removed |Added CC||dcb314 at hotmail dot com --- Comment #8 from Richard Biener --- *** Bug 115243 has been marked as a duplicate of this bug. ***
[Bug tree-optimization/115226] [15 regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_ssa failed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115226 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #7 from Richard Biener --- Fixed.
[Bug tree-optimization/115220] [15 Regression] RISC-V: newlib targets ICE during sink pass triggered in verify_ssa (since r15-815-g5b9b3bae33cae7?)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115220 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #9 from Richard Biener --- Fixed.
[Bug c++/115245] New: Fails to demangle some concepts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115245 Bug ID: 115245 Summary: Fails to demangle some concepts Product: gcc Version: 14.1.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: rguenth at gcc dot gnu.org Target Milestone: --- _ZZN5OuterIvE6methodIvEEvvQ3cstITL0__EEN5InnernwEm and _ZZN5OuterIvE6methodIvEEvvQ3cstITL0__EEN5InnerdlEPv fail to demangle. Produced by using size_t = decltype(sizeof(0)); template static constexpr bool cst = true; template struct Outer { Outer(); template void method() requires cst { struct Inner { static void* operator new(size_t){return new char;} static void operator delete(void*){} Outer t; }; new Inner; } }; void f() { Outer{}.method(); }
[Bug tree-optimization/115232] [14/15 regression] ICE during GIMPLE pass during waccess
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115232 --- Comment #7 from Richard Biener --- I'm opening a new bug for the demangle fail
[Bug tree-optimization/115232] [14/15 regression] ICE during GIMPLE pass during waccess
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115232 Richard Biener changed: What|Removed |Added Known to fail||14.1.0 Status|ASSIGNED|RESOLVED Known to work||14.1.1, 15.0 Resolution|--- |FIXED --- Comment #6 from Richard Biener --- Fixed.
[Bug tree-optimization/115244] New: virtual operand SSA form verifier imperfect
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115244 Bug ID: 115244 Summary: virtual operand SSA form verifier imperfect Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: rguenth at gcc dot gnu.org Target Milestone: --- On r15-845-g06bb125521dec5 the virtual SSA form verifier does not catch the sink pass wrecking virtual operand form for the following testcase at -O3. extern void c(); int a, b; int main() { while (b) { int d, e = 0, *f = *f = 1; e = 1 >> d ? : 1 << d; if (e) a = 0; c(); } return 0; }
[Bug tree-optimization/115226] [15 regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_ssa failed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115226 --- Comment #4 from Richard Biener --- It's another case where we "skip" a killing def when sinking a store. Here there's a conditional merge of both paths, again violating the virtual operand update constraint.
[Bug tree-optimization/115226] [15 regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_ssa failed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115226 --- Comment #3 from Richard Biener --- It's interesting that SSA verification does not catch a missing virtual PHI from sinking.
[Bug tree-optimization/115236] [15 regression] Wrong code at -O1 and above with -fno-tree-fre and volatile pointers since r15-579-ga9251ab3c91c8c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115236 --- Comment #4 from Richard Biener --- The main issue is that 'c' is a direct node by means of copying the address to MEM[(int * volatile *)] ={v} b.0_1; that makes us not apply STOREDANYTHING to it. It's not exactly clear why we should exempt direct nodes here, that's probably a mistake.
[Bug tree-optimization/115220] [15 Regression] RISC-V: newlib targets ICE during sink pass triggered in verify_ssa (since r15-815-g5b9b3bae33cae7?)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115220 --- Comment #7 from Richard Biener --- OK, so we have a case where we sink a store ignoring a killing use of the VDEF that post-dominates the store. The issue with that is the virtual operand update assumes such stores are on separate paths and that ultimately the discovered PHI merges the defs of all those "uses". Both constraints are violated - the PHI node is actually dominating the original stmt as it's the loop header PHI which isn't on the path from the "use" (the killing def after the loop).
[Bug c++/115239] [14/15 Regression] ICE: Segmentation fault with ambiguous function call in some cases (`const char*` vs `char` with `long` vs `unsigned`)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115239 Richard Biener changed: What|Removed |Added Priority|P3 |P2
[Bug ipa/115237] -Wsuggest-attribute=pure false positive for obviously non-pure function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115237 Richard Biener changed: What|Removed |Added CC||hubicka at gcc dot gnu.org, ||rguenth at gcc dot gnu.org Keywords||diagnostic --- Comment #1 from Richard Biener --- 'pure' means the function has no side-effect besides reading global memory _when it returns_, so it's valid to turn x = unite (5, 6); y = unite (5, 6); into x = unite (5, 6); y = x; at least that's my understanding. Honza might want to clarify here.
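The transformation described above can be illustrated with a small C sketch. The body of unite below is a hypothetical stand-in, since the reporter's function is not shown in this message; the global table and the helper call_twice are likewise made up. The point is only that a function marked pure may read global memory, and that two calls with the same arguments and no intervening store may be merged into one.

```c
/* Global state that the function reads but never writes.  */
static int table[2] = { 10, 20 };

/* Hypothetical stand-in for the reporter's function: it reads global
   memory and has no other side effects, which is what the pure
   attribute promises.  */
__attribute__ ((pure)) int
unite (int a, int b)
{
  return table[0] * a + table[1] * b;
}

int
call_twice (void)
{
  int x = unite (5, 6);
  /* No store happens between the two calls, so the compiler is allowed
     to replace this second call with "y = x".  */
  int y = unite (5, 6);
  return x - y;
}
```

If something stored to table between the two calls, the second call could no longer be replaced, because a pure function's result may depend on global memory.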
[Bug tree-optimization/115236] [15 regression] Wrong code at -O1 and above with -fno-tree-fre and volatile pointers since r15-579-ga9251ab3c91c8c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115236 Richard Biener changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org --- Comment #3 from Richard Biener --- Interesting, more of STOREDANYTHING not working. Mine.
[Bug tree-optimization/115232] [14/15 regression] ICE during GIMPLE pass during waccess
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115232 Richard Biener changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org --- Comment #3 from Richard Biener --- (gdb) p newc $1 = (const demangle_component &) (gdb) p delc $2 = (const demangle_component &) 1763 demangle_component *ndc = cplus_demangle_v3_components (new_str, 0, ); 1764 demangle_component *ddc = cplus_demangle_v3_components (del_str, 0, ); 1765 bool mismatch = new_delete_mismatch_p (*ndc, *ddc); (gdb) p new_str $3 = 0x77009ea8 "_ZZN5OuterIvE6methodIvEEvvQ3cstITL0__EEN5InnernwEm" (gdb) p del_str $4 = 0x77009e70 "_ZZN5OuterIvE6methodIvEEvvQ3cstITL0__EEN5InnerdlEPv" so we cannot demangle these? But we expect no failure. It should be trivial to build in safety here, mine for that. I'll keep the PR open for the failed demangle though.
[Bug tree-optimization/115228] Suspicious code in tree-vect-data-refs.cc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115228 Richard Biener changed: What|Removed |Added Last reconfirmed||2024-05-27 Ever confirmed|0 |1 Status|UNCONFIRMED |NEW --- Comment #1 from Richard Biener --- Yeah, it seems missing, but WIDEN_MULT_PLUS_EXPR and WIDEN_MULT_MINUS_EXPR as well (though they do not fit in 1:1).
[Bug tree-optimization/115226] [15 regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_ssa failed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115226 Richard Biener changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #2 from Richard Biener --- I will have a look.
[Bug sanitizer/115225] signed integer overflow check missing with optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115225 Richard Biener changed: What|Removed |Added Component|tree-optimization |sanitizer Summary|[11/12/13/14 Regression]|signed integer overflow |signed integer overflow |check missing with |check missing with |optimization |optimization| CC||dodji at gcc dot gnu.org, ||dvyukov at gcc dot gnu.org, ||jakub at gcc dot gnu.org, ||kcc at gcc dot gnu.org --- Comment #2 from Richard Biener --- Yes, I think this is fully expected.
[Bug c++/115223] [15 regression] ICE when building KDE kontrast with LTO (error: ‘TYPE_CANONICAL’ has different ‘TYPE_CANONICAL’) since r15-779-g3c98d06a9016a0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115223 Richard Biener changed: What|Removed |Added Version|unknown |15.0 Keywords||ice-on-valid-code Target Milestone|--- |15.0 --- Comment #3 from Richard Biener --- Note -flto is only required because that triggers type verification.
[Bug tree-optimization/115221] [15 regression] ICE when building reiser4progs (propagate_updated_value, at gimple-range-cache.cc:1368) since r15-80-g0ade358cd72ffa
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115221 Richard Biener changed: What|Removed |Added Version|unknown |15.0 Priority|P3 |P1
[Bug tree-optimization/115220] [15 Regression] RISC-V: newlib targets ICE during sink pass triggered in verify_ssa (since r15-815-g5b9b3bae33cae7?)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115220 Richard Biener changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #6 from Richard Biener --- Mine. Looks similar to the libgo issue I actually fixed...
[Bug tree-optimization/115214] tree-ssa-pre.c(ICE in find_or_generate_expression, at tree-ssa-pre.c:2780)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115214 Richard Biener changed: What|Removed |Added Ever confirmed|0 |1 CC||rguenth at gcc dot gnu.org Status|UNCONFIRMED |NEW Last reconfirmed||2024-05-27 --- Comment #1 from Richard Biener --- Confirmed.
[Bug other/115211] [11/12/13/14/15 regression] -frecord-gcc-switches refactoring lost list of enabled options
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115211 --- Comment #4 from Richard Biener --- (In reply to Richard Biener from comment #3) [..] > Ah! Use -Q --help=optimizers (how intuitive...) Or when invoking cc1 omit -quiet. Remember to put --help=optimizers after optimization options.
[Bug other/115211] [11/12/13/14/15 regression] -frecord-gcc-switches refactoring lost list of enabled options
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115211 --- Comment #3 from Richard Biener ---
(In reply to r...@cebitec.uni-bielefeld.de from comment #2)
> > --- Comment #1 from Richard Biener ---
> > This was done on purpose, you can use --help=optimizers now I think.
>
> The thread I cited rather suggested it was removed because Martin argued
> the info wasn't fully complete. However, I don't see how something that
> is only 95% complete is worse than having nothing.

ISTR it was awkward to keep it without duplicating code and that was the main reason, citing that the info is available with -[gf]record-gcc-switches ...

> --help=optimizers just documents optimization options, with no
> indication which are enabled for a given compilation. Don't see how
> this helps.

... and also by another means that I don't remember, one that shows the list of options but also tells whether they are enabled or not. Seems like --help=optimizers indeed isn't it (or maybe it was, but it was imprecise as well and got removed ...?).

Ah! Use -Q --help=optimizers (how intuitive...)
[Bug bootstrap/115213] [15 regression] Excessive memory use compiling rust with 32-bit gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115213 Richard Biener changed: What|Removed |Added CC||amacleod at redhat dot com --- Comment #1 from Richard Biener --- dup of PR115208?
[Bug other/115211] [11/12/13/14/15 regression] -frecord-gcc-switches refactoring lost list of enabled options
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115211 Richard Biener changed: What|Removed |Added Target Milestone|--- |11.5 --- Comment #1 from Richard Biener --- This was done on purpose, you can use --help=optimizers now I think.
[Bug rtl-optimization/115182] [15 Regression] gcc.target/cris/pr93372-47.c at r15-518-g99b1daae18c095
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115182 Bug 115182 depends on bug 115144, which changed state. Bug 115144 Summary: [15 Regression] 2% performance regression for some codes with r15-518-g99b1daae18c095 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115144 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/115144] [15 Regression] 2% performance regression for some codes with r15-518-g99b1daae18c095
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115144 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #10 from Richard Biener --- This should be fixed now.
[Bug tree-optimization/115210] Missed optimization opportunity in redundant copies for large structure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115210 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |NEW CC||jamborm at gcc dot gnu.org, ||rguenth at gcc dot gnu.org Last reconfirmed||2024-05-24 Ever confirmed|0 |1 Keywords||missed-optimization --- Comment #1 from Richard Biener --- SRA has a limit on the size of the structures it decomposes, and GCC does not have an aggregate copy propagation pass but relies on SRA. There's --param sra-max-scalarization-size-{Osize,Ospeed} to adjust that limit. It would be nice to be able to perform copy coalescing(!) for A = B when B dies at the assignment and A becomes live, without this limit and without decomposing, but driven by the SRA analysis.
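To make the A = B situation from the comment concrete, here is a minimal sketch of a redundant aggregate copy. The type `struct big`, its size and the function name `first_plus_last` are illustrative assumptions and are not taken from the bug report; whether a given aggregate exceeds the SRA limit depends on the target and on the `--param sra-max-scalarization-size-Ospeed` / `sra-max-scalarization-size-Osize` values in effect.

```c
/* Hypothetical aggregate; 64 longs is meant to be "large", but the actual
   SRA size limit is target and parameter dependent. */
struct big { long v[64]; };

/* Copies the aggregate twice before reading two elements.  After "a = b"
   the variable b is dead and a becomes live, which is the A = B situation
   described in the comment. */
long first_plus_last(const struct big *src)
{
    struct big b = *src;   /* first copy */
    struct big a = b;      /* second copy: b dies here, a becomes live */
    return a.v[0] + a.v[63];
}
```

Comparing the optimized dump or the assembly for such a function with different values of the `--param` is one way to check whether the size limit is what keeps the copies alive; the outcome may differ between GCC versions and targets.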
[Bug tree-optimization/115208] [15 Regression] Memory consumption get extremely high after r15-809
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115208 Richard Biener changed: What|Removed |Added Priority|P3 |P1
[Bug analyzer/115203] [15 Regression] Build fail with non LANG=C in analyzer self test: ICE in fail_formatted at selftest.cc:63 / tree-diagnostic-path.cc:2158: test_control_flow_5: FAIL: ASSERT_STREQ
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115203 Richard Biener changed: What|Removed |Added Target Milestone|--- |15.0
[Bug tree-optimization/115199] [15 regression] gettext (libtextstyle) testsuite miscompiled since r15-579
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115199 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #8 from Richard Biener --- Fixed.
[Bug tree-optimization/115138] [15 Regression] Bootstrap compare-debug fail after r15-580-gf3e5f4c58591f5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115138 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #24 from Richard Biener --- Fixed.
[Bug tree-optimization/115197] [13/14/15 Regression] ICE on valid code at -O{1,2} with "-fno-tree-scev-cprop -ftree-pre -ftree-loop-distribute-patterns" on x86_64-linux-gnu: Segmentation fault
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115197 Richard Biener changed: What|Removed |Added Priority|P3 |P2 Keywords|needs-bisection | --- Comment #2 from Richard Biener --- Caused by r13-1728-gce92603fbe3b48, testing trivial fix.
[Bug tree-optimization/115138] [15 Regression] Bootstrap compare-debug fail after r15-580-gf3e5f4c58591f5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115138 --- Comment #22 from Richard Biener ---
Yes! Testing a fix for

int foo (int) {}
int bar (int) {}
typedef int (*pred)(int);
int x, y;
pred A () { if (x) return foo; else return bar; }
pred B () { if (y) return foo; else return bar; }
int __attribute__((noipa)) baz()
{
  pred a = A();
  pred b = B();
  if (a != b)
    return 42;
  return 0;
}
int main()
{
  if (baz () != 0)
    __builtin_abort ();
  y = 1;
  if (baz () != 42)
    __builtin_abort ();
  return 0;
}
[Bug tree-optimization/115202] [14/15 Regression] Missed optimization: std::min(f ? (unsigned short)m : a, ~0)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115202 Richard Biener changed: What|Removed |Added Keywords||missed-optimization Target Milestone|--- |14.2
[Bug tree-optimization/115199] [15 regression] gettext (libtextstyle) testsuite miscompiled since r15-579
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115199 Richard Biener changed: What|Removed |Added Status|NEW |ASSIGNED Version|unknown |15.0 Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org --- Comment #6 from Richard Biener --- Mine.
[Bug tree-optimization/115197] [13/14/15 Regression] ICE on valid code at -O{1,2} with "-fno-tree-scev-cprop -ftree-pre -ftree-loop-distribute-patterns" on x86_64-linux-gnu: Segmentation fault
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115197 Richard Biener changed:
What|Removed |Added
Known to work||12.3.0
Version|unknown |13.3.1
Known to fail||13.3.0
Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
Status|UNCONFIRMED |ASSIGNED
Summary|ICE on valid code at -O{1,2} with "-fno-tree-scev-cprop -ftree-pre -ftree-loop-distribute-patterns" on x86_64-linux-gnu: Segmentation fault |[13/14/15 Regression] ICE on valid code at -O{1,2} with "-fno-tree-scev-cprop -ftree-pre -ftree-loop-distribute-patterns" on x86_64-linux-gnu: Segmentation fault
Ever confirmed|0 |1
Keywords||ice-on-valid-code, needs-bisection
Last reconfirmed||2024-05-23
Target Milestone|--- |13.4
--- Comment #1 from Richard Biener ---
I'll have a look.
[Bug c++/115192] [11/12/13/14/15 regression] -O3 miscompilation on x86-64 (loops with vectors and scalars) since r11-6380
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115192 Richard Biener changed: What|Removed |Added Status|ASSIGNED|NEW Assignee|rguenth at gcc dot gnu.org |unassigned at gcc dot gnu.org
[Bug c++/115192] [11/12/13/14/15 regression] -O3 miscompilation on x86-64 (loops with vectors and scalars) since r11-6380
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115192 --- Comment #9 from Richard Biener ---
(In reply to Richard Biener from comment #7)
> I'm looking into the first issue. Interesting fact:
>
> > /space/rguenther/install/gcc-14.1/bin/g++ t.C -O3 -fopt-info-vec
> > -fno-tree-slp-vectorize --param vect-epilogues-nomask=0
> t.C:7:21: optimized: loop vectorized using 16 byte vectors
> t.C:7:21: optimized: loop versioned for vectorization because of possible
> aliasing
> rguenther@localhost:/tmp> ./a.out
> > /space/rguenther/install/gcc-14.1/bin/g++ t.C -O3 -fopt-info-vec
> > -fno-tree-slp-vectorize --param vect-epilogues-nomask=1
> t.C:7:21: optimized: loop vectorized using 16 byte vectors
> t.C:7:21: optimized: loop versioned for vectorization because of possible
> aliasing
> t.C:7:21: optimized: loop vectorized using 8 byte vectors
> rguenther@localhost:/tmp> ./a.out
> Aborted (core dumped)
>
> so avoiding the vectorized epilog fixes this (I've also placed #pragma GCC
> novector on the loop in main and noipa on foo).

Actually with -fno-vect-cost-model even --param vect-epilogues-nomask=0 fails. Since we are vectorizing

  for (int y = 1; y < n; y++)
    {
      a[y * n][0] = d[y * n] + a[(y - 1) * n][0];
      a[y * n][1] = d[y * n] + a[(y - 1) * n][1];
    }

with a VF of two this is a failure to identify the dependence between the iterations, so possibly related to r11-6380 as well.

(compute_affine_dependence
  ref_a: BIT_FIELD_REF <*_37, 32, 0>, stmt_a: _38 = BIT_FIELD_REF <*_37, 32, 0>;
  ref_b: BIT_FIELD_REF <*_40, 32, 0>, stmt_b: BIT_FIELD_REF <*_40, 32, 0> = _41;
) -> dependence analysis failed

Creating dr for BIT_FIELD_REF <*_37, 32, 0>
analyze_innermost: success.
        base_address: a_23(D)
        offset from base address: 0
        constant offset from base address: 0
        step: (ssizetype) ((long unsigned int) n_20(D) * 16)
        base alignment: 16
        base misalignment: 0
        offset alignment: 128
        step alignment: 16
        base_object: BIT_FIELD_REF <*_37, 32, 0>
Creating dr for BIT_FIELD_REF <*_40, 32, 0>
analyze_innermost: success.
        base_address: (float4_t *) a_23(D) + (sizetype) n_20(D) * 16
        offset from base address: 0
        constant offset from base address: 0
        step: (ssizetype) ((long unsigned int) n_20(D) * 16)
        base alignment: 16
        base misalignment: 0
        offset alignment: 128
        step alignment: 16
        base_object: BIT_FIELD_REF <*_40, 32, 0>

and for reference

Creating dr for BIT_FIELD_REF <*_37, 32, 32>
analyze_innermost: success.
        base_address: a_23(D)
        offset from base address: 0
        constant offset from base address: 4
        step: (ssizetype) ((long unsigned int) n_20(D) * 16)
        base alignment: 16
        base misalignment: 0
        offset alignment: 128
        step alignment: 16
        base_object: BIT_FIELD_REF <*_37, 32, 32>

that looks sensible. And 'a' is indeed properly aligned.

t.c:6:21: note: recording new base alignment for d_22(D) + (sizetype) n_20(D) * 4
  alignment: 4
  misalignment: 0
  based on: _32 = *_31;
t.c:6:21: note: recording new base alignment for a_23(D)
  alignment: 16
  misalignment: 0
  based on: _38 = BIT_FIELD_REF <*_37, 32, 0>;
t.c:6:21: note: recording new base alignment for (float4_t *) a_23(D) + (sizetype) n_20(D) * 16
  alignment: 16
  misalignment: 0
  based on: BIT_FIELD_REF <*_40, 32, 0> = _41;
t.c:6:21: note: vect_compute_data_ref_alignment:
t.c:6:21: missed: step doesn't divide the vector alignment.
t.c:6:21: missed: Unknown alignment for access: *_31
t.c:6:21: note: vect_compute_data_ref_alignment:
t.c:6:21: missed: Unknown alignment for access: BIT_FIELD_REF <*_37, 32, 0>
t.c:6:21: note: vect_compute_data_ref_alignment:
t.c:6:21: missed: Unknown alignment for access: BIT_FIELD_REF <*_40, 32, 0>
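The dependence between iterations mentioned in this comment can be illustrated with a small sketch. It does not reproduce the code GCC actually generated; it only models, under simplifying assumptions, what goes wrong when two consecutive iterations are executed together without honoring that iteration y reads the row written by iteration y - 1. The names `rows_scalar` and `rows_vf2_wrong`, the fixed size `N` and the plain `float[4]` rows (in place of `float4_t`) are all hypothetical.

```c
enum { N = 3 };

/* Scalar reference: iteration y reads the row that iteration y - 1 wrote. */
void rows_scalar(const float *d, float a[N * N][4])
{
    for (int y = 1; y < N; y++)
        for (int c = 0; c < 2; c++)
            a[y * N][c] = d[y * N] + a[(y - 1) * N][c];
}

/* Deliberately wrong model of a vectorization factor of two: the inputs of
   iterations y = 1 and y = 2 are both loaded before anything is stored, so
   iteration y = 2 sees a stale value of row 1 * N. */
void rows_vf2_wrong(const float *d, float a[N * N][4])
{
    for (int c = 0; c < 2; c++) {
        float in0 = a[0 * N][c];   /* input of y = 1 */
        float in1 = a[1 * N][c];   /* input of y = 2, read too early */
        a[1 * N][c] = d[1 * N] + in0;
        a[2 * N][c] = d[2 * N] + in1;
    }
}
```

With `d[i] = i` and a zero-initialized `a`, the scalar version yields `a[6][1] == 9`, the value the testcase checks for, while the model that ignores the dependence yields 6.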
[Bug c++/115192] [11/12/13/14/15 regression] -O3 miscompilation on x86-64 (loops with vectors and scalars) since r11-6380
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115192 Richard Biener changed:
What|Removed |Added
Status|NEW |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
--- Comment #7 from Richard Biener ---
I'm looking into the first issue. Interesting fact:

> /space/rguenther/install/gcc-14.1/bin/g++ t.C -O3 -fopt-info-vec -fno-tree-slp-vectorize --param vect-epilogues-nomask=0
t.C:7:21: optimized: loop vectorized using 16 byte vectors
t.C:7:21: optimized: loop versioned for vectorization because of possible aliasing
rguenther@localhost:/tmp> ./a.out
> /space/rguenther/install/gcc-14.1/bin/g++ t.C -O3 -fopt-info-vec -fno-tree-slp-vectorize --param vect-epilogues-nomask=1
t.C:7:21: optimized: loop vectorized using 16 byte vectors
t.C:7:21: optimized: loop versioned for vectorization because of possible aliasing
t.C:7:21: optimized: loop vectorized using 8 byte vectors
rguenther@localhost:/tmp> ./a.out
Aborted (core dumped)

so avoiding the vectorized epilog fixes this (I've also placed #pragma GCC novector on the loop in main and noipa on foo).

C testcase:

typedef float float4_t __attribute__((vector_size(4 * sizeof(float))));

void __attribute__((noipa))
foo(int n, const float *d, float4_t * __restrict a)
{
  for (int y = 1; y < n; y++)
    for (int c = 0; c < 2; c++)
      a[y * n][c] = d[y * n] + a[(y - 1) * n][c];
}

int main()
{
  const int n = 3;
  float d[n*n];
  float4_t a[n*n];
#pragma GCC novector
  for (int i = 0; i < n * n; ++i)
    d[i] = i;
  foo(n, d, a);
  if (a[6][1] != 9)
    __builtin_abort();
}
[Bug other/115189] libiberty introduces UNC paths waking up binutils bug
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115189 Richard Biener changed: What|Removed |Added Known to fail||13.3.0 Known to work||12.3.0 --- Comment #2 from Richard Biener --- The question is whether there's a workaround available on the GCC side? Of course we can't fix older releases, for example 13.3, which was just released. 13.4 is a year away, by which time the binutils fix should have propagated. Is there a way to canonicalize UNCs "back" to use mapped drive letters?
[Bug target/115188] [14/15 regression] invalid Thumb assembly for atomic store in loop on ARMv6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115188 Richard Biener changed: What|Removed |Added Target||arm Keywords||wrong-code Target Milestone|--- |14.2
[Bug c++/115187] [14/15 Regression] ICE when deleting temporary array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115187 Richard Biener changed:
What|Removed |Added
Summary|ICE when deleting temporary array |[14/15 Regression] ICE when deleting temporary array
Target Milestone|--- |14.2
Priority|P3 |P2
Known to work||13.2.0
--- Comment #2 from Richard Biener ---
And GCC 13 complains:

t.ii: In function ‘void f()’:
t.ii:3:14: warning: deleting array ‘T()’
    3 |       delete T{};
      |              ^~~
t.ii:3:14: error: taking address of temporary array
t.ii:3:14: error: type ‘using T = int [2]’ {aka ‘int [2]’} argument given to ‘delete’, expected pointer
[Bug c++/115187] ICE when deleting temporary array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115187 Richard Biener changed:
What|Removed |Added
Ever confirmed|0 |1
Last reconfirmed||2024-05-22
Keywords||accepts-invalid, ice-on-invalid-code
Status|UNCONFIRMED |NEW
--- Comment #1 from Richard Biener ---
Confirmed. Missed gimplification:

#1 0x01c12d7e in verify_gimple_stmt ( stmt=) at /space/rguenther/src/gcc/gcc/tree-cfg.cc:5169

try
  {
    <<< Unknown GIMPLE statement: gimple_with_cleanup_expr >>>
    D.2795 = {};
    D.2796 = MEM[(int *)D.2796] = {CLOBBER(eob)};
  }
finally
  {
    operator delete (D.2796, 4);
  }

clang complains:

t.ii:3:7: error: cannot delete expression of type 'T' (aka 'int[2]')
    3 |       delete T{};
      |       ^      ~~~
[Bug tree-optimization/115138] [15 Regression] Bootstrap compare-debug fail after r15-580-gf3e5f4c58591f5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115138 --- Comment #15 from Richard Biener ---
Indeed with bootstrap-O3 I can see

Comparing stages 2 and 3
Bootstrap comparison failure!
gcc/d/opover.o differs

Since both have debug info with bootstrap-O3, the difference is only

-18: 0b0c 4 OBJECT LOCAL DEFAULT 6 CSWTCH.154
+18: 0b0c 4 OBJECT LOCAL DEFAULT 6 CSWTCH.155

There are already differences in SRA and even local-fnsummary. In fact gimplification shows

--- ../prev-gcc/d/opover.d.006t.gimple 2024-05-22 13:50:13.437438763 +0200
+++ d/opover.d.006t.gimple 2024-05-22 13:51:08.710863322 +0200
@@ -5158,57 +5158,58 @@
   overflow = 0;
   newLength.100_2 = newLength;
   newLength.101_3 = newLength.100_2;
+  newLength.102_4 = newLength.101_3;
   D.12117 = .ADD_OVERFLOW (typeInfoSize, 1);

as first difference (but the .original dumps are the same). That's in the __setArrayAllocLength function. Note the opover.d compile doesn't even use -O3, so this is all extremely odd. It would somehow point at a miscompile of the stage2 compiler by the stage1 compiler manifesting itself only in this change ...

So the logical next step would be to bisect stage1/stage2 object files of d21 and see which stage2 object is miscompiled.
[Bug tree-optimization/114072] gcc.dg/vect/vect-pr111779.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114072 --- Comment #2 from Richard Biener --- Hmm, is solaris-sparc big-endian? It seems so. That makes the bitfield access require a VnQImode logical right shift (but little-endian doesn't require it - it's actually bitfield endianness that matters). There is vect_shift_char you could use and somehow conditionalize that on big-endianness.
[Bug tree-optimization/115138] [15 Regression] Bootstrap compare-debug fail after r15-580-gf3e5f4c58591f5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115138 --- Comment #13 from Richard Biener --- (In reply to Iain Sandoe from comment #9) > (In reply to Richard Biener from comment #8) > > I've pushed a fix for PR115137, it's likely this fixes also this bug. > > unfortunately, not; at least, on my fastest x86 machine (AVX512) - the fail > is the same (different CSWTCH.xxx numbers between the stage1 compiler and > the stage2 - the numbers are unchanged with the r15-753 [.154 and .155 > respectively]). Note stage1 and stage2 are not expected to compare equal - it's stage2 and stage3 objects that are compared. > I don't expect the machine to make any difference - and I saw that this was > also reported by at least one person for Linux too (although bootstrapping > with O3, I think). I think that was an ICE with prange.
[Bug tree-optimization/115138] [15 Regression] Bootstrap compare-debug fail after r15-580-gf3e5f4c58591f5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115138 --- Comment #12 from Richard Biener --- How do I reproduce this? I tried, on x86_64-linux, an all-language bootstrap using gdc-13 for the first stage and that succeeded. I'm now trying again with gdc-12, just --enable-languages=d and an explicit --with-build-config=bootstrap-debug
[Bug web/115183] GCCGO appears twice at https://gcc.gnu.org/onlinedocs/14.1.0/
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115183 Richard Biener changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #2 from Richard Biener --- I fixed the GCCGO references and the typo.
[Bug rtl-optimization/115182] [15 Regression] gcc.target/cris/pr93372-47.c at r15-518-g99b1daae18c095
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115182 Richard Biener changed: What|Removed |Added Target Milestone|--- |15.0
[Bug tree-optimization/115144] [15 Regression] 2% performance regression for some codes with r15-518-g99b1daae18c095
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115144 Richard Biener changed:
What|Removed |Added
Ever confirmed|0 |1
Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
Last reconfirmed||2024-05-22
Status|UNCONFIRMED |ASSIGNED
--- Comment #8 from Richard Biener ---
(In reply to Hans-Peter Nilsson from comment #7)
> (In reply to Richard Biener from comment #6)
[...]
> > I do want to play with
> > sinking to the start of the else {, but without doing any lifetime analysis
> > I fear that's going to be worse in the average as the current location
> > at least ensures we're close to the first use of the DEF we sink.
>
> Thank you in advance and for the look this far! I haven't looked closer at
> what happens with later passes in main, but looking at the generated
> assembly code, the "sinking" of a division has the eventual effect of
> increasing register pressure; see the previously attached dumps.

Indeed, we have originally

  _38 = _36 / _37;
  _39 = _36 % _37;
  r2_78 = (signed char) _39;

where both _36 and _37 die (but _39 and _38 are live for a lot longer). We sink the _38 def across

  [local count: 173045540]:
  # iftmp.10_49 = PHI
  if (_41 >= iftmp.10_49)
    goto ; [0.00%]
  else
    goto ; [100.00%]

  [local count: 173045540]:
  r1.13_43 = (unsigned char) _38;

which the original profile check avoided. I'll note the above is a more sensible case where to avoid such sinking but I'll also note that sinking does not look at register pressure (or basically whether a sinking increases or decreases register pressure) at all and generally GIMPLE passes are not supposed to do this (it's also not an easy feat).
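The lifetime effect described in this comment can be sketched at source level. The two functions below are hypothetical stand-ins (the names, parameters and the final expression are assumptions made for illustration), hand-written to mirror the shape of the GIMPLE quoted above: in the first, the operands of the division die right after the quotient and remainder are computed; in the second, the division has been moved below the branch, so both operands have to stay live across it. Callers are assumed to pass a nonzero `b`.

```c
/* Original placement: quotient and remainder are computed together, so the
   operands a and b are dead right after these two statements. */
int quot_rem_together(int a, int b, int x, int limit)
{
    int q = a / b;
    int r = a % b;
    if (x >= limit)
        return r;                 /* q is unused on this path */
    return (unsigned char) q + r;
}

/* Hand-sunk variant: the division moves below the branch.  The result is
   the same, but a and b now have to stay live across the branch. */
int quot_rem_sunk(int a, int b, int x, int limit)
{
    int r = a % b;
    if (x >= limit)
        return r;
    int q = a / b;                /* first (and only) use of q follows */
    return (unsigned char) q + r;
}
```

Both functions compute the same values; the sketch only shows why replacing one long-lived value (the quotient) by two long-lived operands can increase register pressure. What a particular target's register allocator makes of it is not something this example can show.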
[Bug tree-optimization/115177] incorrect TBAA for derived types involving hardbool types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115177 Richard Biener changed: What|Removed |Added Version|unknown |15.0 CC||aoliva at gcc dot gnu.org --- Comment #1 from Richard Biener --- hardbool is an extension, so how it should behave is up to us? It probably makes sense to inter-operate with its base type though? You are testing for inter-operability between Base * and Hardbool * though. Alex, what was the idea here?
[Bug tree-optimization/115138] [15 Regression] Bootstrap compare-debug fail after r15-580-gf3e5f4c58591f5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115138 Richard Biener changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #10 from Richard Biener --- OK, I won't get to it today but will try to reproduce and analyze tomorrow.
[Bug tree-optimization/115138] [15 Regression] Bootstrap compare-debug fail after r15-580-gf3e5f4c58591f5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115138 --- Comment #8 from Richard Biener --- I've pushed a fix for PR115137, it's likely this fixes also this bug.
[Bug tree-optimization/115137] [15 regression] Miscompilation of wget (test suite hangs) since r15-580-gf3e5f4c58591f5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115137 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #12 from Richard Biener --- Fixed.
[Bug sanitizer/111736] Address sanitizer is not compatible with named address spaces
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111736 Richard Biener changed: What|Removed |Added Assignee|rguenth at gcc dot gnu.org |unassigned at gcc dot gnu.org --- Comment #48 from Richard Biener --- Would have been nice to have a new bugreport for -fsanitize=bool.
[Bug tree-optimization/115149] [14 Regression] ICE on valid code at -O3 with "-fno-inline -fno-tree-vrp -fno-ipa-sra -fno-tree-dce -fno-tree-ch" on x86_64-linux-gnu: verify_ssa failed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115149 Richard Biener changed: What|Removed |Added Priority|P3 |P2
[Bug tree-optimization/115149] [14 Regression] ICE on valid code at -O3 with "-fno-inline -fno-tree-vrp -fno-ipa-sra -fno-tree-dce -fno-tree-ch" on x86_64-linux-gnu: verify_ssa failed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115149 Richard Biener changed:
What|Removed |Added
Known to work||15.0
Target Milestone|15.0|14.2
Summary|[15 Regression] ICE on valid code at -O3 with "-fno-inline -fno-tree-vrp -fno-ipa-sra -fno-tree-dce -fno-tree-ch" on x86_64-linux-gnu: verify_ssa failed |[14 Regression] ICE on valid code at -O3 with "-fno-inline -fno-tree-vrp -fno-ipa-sra -fno-tree-dce -fno-tree-ch" on x86_64-linux-gnu: verify_ssa failed
--- Comment #4 from Richard Biener ---
OK, so the issue is that the endless loop is a CFG sink that doesn't enforce a VUSE so we "miss" virtual PHIs in BB 13 and BB 14 and that makes sinking think it doesn't need to update VOPs. This is a latent issue and IMO a design bug of the virtual SSA net. Sinking now employs VOP_LIVE but even that doesn't handle this situation because of a bug. I'm testing a fix.

Fixed on trunk, queued for backporting.
[Bug ipa/114930] [14/15 regression] ICE in fld_incomplete_type_of when building libwebp with -std=c23 -flto
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114930 Richard Biener changed: What|Removed |Added CC||hubicka at gcc dot gnu.org --- Comment #6 from Richard Biener --- Likely an issue in fld_incomplete_type_of with how C23 lays out types and/or how we set up their canonical types. What we try to do for LTO is to turn pointer types into pointers to "incomplete" type variants, thus arrays become [] and structs become structs with no fields. Somebody needs to see how it goes wrong; the assert makes sense and is required for correctness.
[Bug tree-optimization/115137] [15 regression] Miscompilation of wget (test suite hangs) since r15-580-gf3e5f4c58591f5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115137 Richard Biener changed: What|Removed |Added Last reconfirmed||2024-05-21 Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Status|UNCONFIRMED |ASSIGNED Ever confirmed|0 |1 --- Comment #10 from Richard Biener --- Mine. Testing patch.
[Bug tree-optimization/115144] [15 Regression] 2% performance regression for some codes with r15-518-g99b1daae18c095
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115144 Richard Biener changed:
What|Removed |Added
CC||law at gcc dot gnu.org
--- Comment #6 from Richard Biener ---
For gcc.c-torture/execute/arith-rand-ll.c, does it help to replace the exit (0) call with a return 0 statement?

Looking at gcc.target/cris/pr93372-47.c what we do here is sink tot_bits += n_bits into the else { of the in-loop conditional, in particular we sink it right before the exit conditional in the loop. That's exactly what we are supposed to do and the previous heuristic avoided because of the guessed profile which is

  if (n_bits_12 == 0)
    goto ; [5.50%]
  else
    goto ; [94.50%]

thus the n_bits == 0 exit is unlikely and for some reason we thought sinking across that isn't profitable. To quote the loop in question is:

  for (;;)
    {
      ran = simple_rand ();
      n_bits = (ran >> 1) % 16;
      tot_bits += n_bits;

      if (n_bits == 0)
        return x;
      else
        {
          x <<= n_bits;
          if (ran & 1)
            x |= (1 << n_bits) - 1;

          if (tot_bits > 8 * sizeof (long long) + 6)
            return x;
        }
    }

Note that the sinking doesn't increase register lifetime (one of the reasons of the previous heuristic), esp. if we'd go one step further and sink to the start of the else { block rather than right before the exit conditional. But I'd guess that wouldn't help the delay-slot filling here?

I've noticed CRIS doesn't support scheduling at all, so delay slot filling (where's that done?) relies purely on our "random" scheduling we do at RTL expansion time (via TER) and during GIMPLE optimization?

That said, I think sinking now works as expected. I do want to play with sinking to the start of the else {, but without doing any lifetime analysis I fear that's going to be worse in the average as the current location at least ensures we're close to the first use of the DEF we sink.
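The transformation discussed in this comment can be written out by hand. The sketch below shows the quoted loop once in its original form and once with `tot_bits += n_bits` moved to right before the second exit test, which is where the comment says the sink pass places it. Everything outside the loop body is an assumption made so the example is self-contained: `next_rand` is a hypothetical stand-in for `simple_rand ()`, the function names are invented, and unsigned types are used to keep the shifts well defined.

```c
#include <stdint.h>

/* Hypothetical stand-in for simple_rand(): a small LCG over caller-owned
   state. */
static unsigned next_rand(uint64_t *state)
{
    *state = *state * 6364136223846793005ULL + 1442695040888963407ULL;
    return (unsigned) (*state >> 33);
}

/* Loop as quoted: tot_bits is updated before the n_bits == 0 exit. */
unsigned long long gen_original(uint64_t seed)
{
    uint64_t state = seed;
    unsigned long long x = 0;
    unsigned tot_bits = 0;
    for (;;) {
        unsigned ran = next_rand(&state);
        unsigned n_bits = (ran >> 1) % 16;
        tot_bits += n_bits;
        if (n_bits == 0)
            return x;
        x <<= n_bits;
        if (ran & 1)
            x |= (1u << n_bits) - 1;
        if (tot_bits > 8 * sizeof(long long) + 6)
            return x;
    }
}

/* Hand-sunk variant: the update moves into the else path, right before the
   exit test that actually reads tot_bits. */
unsigned long long gen_sunk(uint64_t seed)
{
    uint64_t state = seed;
    unsigned long long x = 0;
    unsigned tot_bits = 0;
    for (;;) {
        unsigned ran = next_rand(&state);
        unsigned n_bits = (ran >> 1) % 16;
        if (n_bits == 0)
            return x;           /* tot_bits is not needed on this exit */
        x <<= n_bits;
        if (ran & 1)
            x |= (1u << n_bits) - 1;
        tot_bits += n_bits;
        if (tot_bits > 8 * sizeof(long long) + 6)
            return x;
    }
}
```

The two variants return the same value for every seed because `tot_bits` is only read by the second exit test; the n_bits == 0 exit never observes it. That is the property that makes the sinking valid, independent of whether it is profitable on a given target.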
[Bug bootstrap/115167] [15 Regression] CFG edge visualization to path-printing bootstrap failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115167 Richard Biener changed: What|Removed |Added Target Milestone|--- |15.0
[Bug target/115161] [15 Regression] highway-1.0.7 miscompilation of some SSE2 intrinsics
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115161 Richard Biener changed: What|Removed |Added Target Milestone|--- |15.0 Target||x86_64-*-* Keywords||wrong-code Version|14.0|15.0
[Bug tree-optimization/115152] [13/14/15 Regression] wrong code at -O3 with "-fno-tree-fre -fno-tree-dominator-opts -fno-tree-loop-im" on x86_64-linux-gnu since r13-455
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115152 --- Comment #6 from Richard Biener --- (In reply to Jakub Jelinek from comment #5) > Ugh, slp creates here V1QImode vectors? Why? That can't be ever faster > than just scalar QImode, no? We do not reject single-lane vectors iff the target happily supports them. It definitely would have avoided some bugs if we did - but it's also good we fix those.
[Bug tree-optimization/115149] [15 Regression] ICE on valid code at -O3 with "-fno-inline -fno-tree-vrp -fno-ipa-sra -fno-tree-dce -fno-tree-ch" on x86_64-linux-gnu: verify_ssa failed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115149 Richard Biener changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org --- Comment #2 from Richard Biener --- I will have a look.