[Bug tree-optimization/113900] [14 regression] Hang and then ICE in vect_transform_loops, at tree-vectorizer.cc:1031 when building slang-2.3.3 since r14-8925
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113900 Richard Biener changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |DUPLICATE --- Comment #9 from Richard Biener --- It's an odd duplicate. I confirm the fix for PR113902 fixes both the original and the reduced testcase. *** This bug has been marked as a duplicate of bug 113902 ***
[Bug tree-optimization/113902] [14 regression] ICE in find_uses_to_rename_use since r14-8925
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113902 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #5 from Richard Biener --- Fixed.
[Bug tree-optimization/113898] [14 regression] ICE in copy_reference_ops_from_ref, at tree-ssa-sccvn.cc:1156 since r14-8929-g938a419182f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113898 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #4 from Richard Biener --- This one is fixed now.
[Bug tree-optimization/113902] [14 regression] ICE in find_uses_to_rename_use since r14-8925
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113902 --- Comment #3 from Richard Biener --- *** Bug 113901 has been marked as a duplicate of this bug. ***
[Bug tree-optimization/113901] [14 regression] ICE when building nodejs-20.11.0 (crash in find_uses_to_rename_use)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113901 Richard Biener changed: What|Removed |Added Resolution|--- |DUPLICATE Status|UNCONFIRMED |RESOLVED --- Comment #4 from Richard Biener --- Duplicate. The fix for PR113902 works here, too. *** This bug has been marked as a duplicate of bug 113902 ***
[Bug tree-optimization/113902] [14 regression] ICE in find_uses_to_rename_use since r14-8925
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113902 Richard Biener changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #2 from Richard Biener --- Mine.
[Bug tree-optimization/113900] [14 regression] Hang and then ICE in vect_transform_loops, at tree-vectorizer.cc:1031 when building slang-2.3.3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113900 Richard Biener changed: What|Removed |Added Last reconfirmed||2024-02-13 Keywords|compile-time-hog| Ever confirmed|0 |1 Status|UNCONFIRMED |NEW --- Comment #5 from Richard Biener --- What does -march=native resolve to? I suppose znver2? I can confirm the compile-time-hog even with a release checking GCC 13 compiler, but nothing really stands out here besides maybe RTL combine and load CSE after reload (that's a usual suspect).

> gcc-13 slarith.i -S -m32 -mfpmath=sse -O3 -fPIC -march=znver2
> -fno-strict-aliasing -Waddress -Warray-bounds -Wfree-nonheap-object
> -Wint-to-pointer-cast -Wmain -Wnonnull -Wodr -Wreturn-type
> -Wsizeof-pointer-memaccess -Wstrict-aliasing -Wstring-compare -Wuninitialized
> -Wvarargs -ftime-report

Time variable                            usr            sys           wall            GGC
 phase setup                    :   0.00 (  0%)   0.00 (  0%)   0.00 (  0%)  2042k (  0%)
 phase parsing                  :   0.13 (  0%)   0.40 ( 20%)   0.53 (  1%)    25M (  1%)
 phase lang. deferred           :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)     96 (  0%)
 phase opt and generate         :  46.65 (100%)   1.61 ( 80%)  48.27 ( 99%)  2563M ( 99%)
 garbage collection             :   0.12 (  0%)   0.01 (  0%)   0.12 (  0%)      0 (  0%)
 dump files                     :   0.03 (  0%)   0.00 (  0%)   0.05 (  0%)      0 (  0%)
 callgraph construction         :   0.05 (  0%)   0.00 (  0%)   0.01 (  0%)   552k (  0%)
 callgraph optimization         :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)   2952 (  0%)
 callgraph functions expansion  :  45.66 ( 98%)   1.46 ( 73%)  47.13 ( 97%)  2459M ( 95%)
 callgraph ipa passes           :   0.90 (  2%)   0.15 (  7%)   1.06 (  2%)    60M (  2%)
 ipa function summary           :   0.09 (  0%)   0.00 (  0%)   0.09 (  0%)  9208k (  0%)
 ipa cp                         :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)   175k (  0%)
 ipa inlining heuristics        :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)    68k (  0%)
 ipa function splitting         :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)   8528 (  0%)
 ipa pure const                 :   0.02 (  0%)   0.00 (  0%)   0.00 (  0%)   3504 (  0%)
 ipa icf                        :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)    30k (  0%)
 ipa SRA                        :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)    37k (  0%)
 ipa modref                     :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)   325k (  0%)
 cfg construction               :   0.00 (  0%)   0.00 (  0%)   0.00 (  0%)  3443k (  0%)
 cfg cleanup                    :   0.52 (  1%)   0.01 (  0%)   0.44 (  1%)    37M (  1%)
 trivially dead code            :   0.11 (  0%)   0.00 (  0%)   0.15 (  0%)      0 (  0%)
 df scan insns                  :   0.07 (  0%)   0.00 (  0%)   0.10 (  0%)    12k (  0%)
 df reaching defs               :   0.37 (  1%)   0.01 (  0%)   0.29 (  1%)      0 (  0%)
 df live regs                   :   1.22 (  3%)   0.01 (  0%)   1.15 (  2%)      0 (  0%)
 df live regs                   :   0.53 (  1%)   0.00 (  0%)   0.65 (  1%)      0 (  0%)
 df must-initialized regs       :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)      0 (  0%)
 df use-def / def-use chains    :   0.07 (  0%)   0.00 (  0%)   0.09 (  0%)      0 (  0%)
 df live reg subwords           :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)      0 (  0%)
 df reg dead/unused notes       :   0.55 (  1%)   0.00 (  0%)   0.51 (  1%)    24M (  1%)
 register information           :   0.09 (  0%)   0.00 (  0%)   0.09 (  0%)      0 (  0%)
 alias analysis                 :   0.51 (  1%)   0.00 (  0%)   0.48 (  1%)   125M (  5%)
 alias stmt walking             :   0.91 (  2%)   0.22 ( 11%)   0.95 (  2%)    45M (  2%)
 register scan                  :   0.06 (  0%)   0.00 (  0%)   0.04 (  0%)  1524k (  0%)
 rebuild jump labels            :   0.09 (  0%)   0.00 (  0%)   0.04 (  0%)    264 (  0%)
 preprocessing                  :   0.03 (  0%)   0.10 (  5%)   0.12 (  0%)   500k (  0%)
 lexical analysis               :   0.06 (  0%)   0.19 (  9%)   0.20 (  0%)      0 (  0%)
 parser (global)                :   0.00 (  0%)   0.01 (  0%)   0.01 (  0%)  3313k (  0%)
 parser struct body             :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)   165k (  0%)
 parser function body           :   0.04 (  0%)   0.10 (  5%)   0.18 (  0%)    20M (  1%)
 parser inl. func. body         :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)   374k (  0%)
 inline parameters              :   0.04 (  0%)   0.02 (  1%)   0.09 (  0%)   779k (  0%)
 integration:
[Bug tree-optimization/113895] [14 Regression] ice in in copy_reference_ops_from_ref, at tree-ssa-sccvn.cc:1144
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113895 --- Comment #5 from Richard Biener --- For the first testcase the issue is bitfields and 'off' being tracked in bytes. ao_ref_init_from_vn_reference handles this by not using 'off'.
[Bug tree-optimization/113895] [14 Regression] ice in in copy_reference_ops_from_ref, at tree-ssa-sccvn.cc:1144
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113895 --- Comment #4 from Richard Biener ---

  _1 = a[b.1_14][7];

we "correctly" resolve b.1_14 to 1 based on range info which is [-INF,-1] [1, +INF]. The thing is, the get_ref_base_and_extent code cannot do anything with this range but adjust max_size to 32 by taking [7] and the overall size of a[] (8 elements) into account. The reverse-engineering of a constant array index falls apart when faced with this kind of undefined behavior - and it's the checking code trying to verify both implementations against each other that fails. That said, it's

  tree asize = TYPE_SIZE (TREE_TYPE (TREE_OPERAND (exp, 0)));
  /* We need to adjust maxsize to the whole array bitsize.
     But we can subtract any constant offset seen so far,
     because that would get us outside of the array otherwise.  */
  if (known_size_p (maxsize)
      && asize
      && poly_int_tree_p (asize))
    maxsize = wi::to_poly_offset (asize) - bit_offset;

that ends up constraining the access, but the resulting offset is to a[1][3], and VN comes up with a[1][7].
[Bug tree-optimization/113900] [14 regression] Hang and then ICE in vect_transform_loops, at tree-vectorizer.cc:1031 when building slang-2.3.3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113900 Richard Biener changed: What|Removed |Added Target Milestone|--- |14.0 Keywords||needs-bisection, ||needs-reduction
[Bug tree-optimization/113898] [14 regression] ICE in copy_reference_ops_from_ref, at tree-ssa-sccvn.cc:1156 since r14-8929-g938a419182f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113898 --- Comment #2 from Richard Biener ---

  [local count: 101363582]:
  # RANGE [irange] int [1, 2]
  h_24 = 1;
  ivtmp_25 = 1;
  e[h_24][_9] = c.5_10;

so there's a missed CCP (this is late FRE). We massaged it to e[1][1] but it should have been e[1][0] instead. Oops. Testing fix.
[Bug tree-optimization/113898] [14 regression] ICE in copy_reference_ops_from_ref, at tree-ssa-sccvn.cc:1156 since r14-8929-g938a419182f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113898 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2024-02-13 Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Ever confirmed|0 |1 Target Milestone|--- |14.0 --- Comment #1 from Richard Biener --- Looking.
[Bug tree-optimization/113896] [12 Regression] Assigning array elements in the wrong order after floating point optimization since r12-8841
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113896 Richard Biener changed: What|Removed |Added Last reconfirmed||2024-02-13 Keywords|needs-bisection | Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Status|UNCONFIRMED |ASSIGNED Ever confirmed|0 |1 Priority|P3 |P2 --- Comment #3 from Richard Biener --- Hmm, OK, it was a backport.. I'll see.
[Bug tree-optimization/113896] [12 Regression] Assigning array elements in the wrong order after floating point optimization since r12-8841
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113896 Richard Biener changed: What|Removed |Added CC||rguenth at gcc dot gnu.org Keywords||needs-bisection --- Comment #2 from Richard Biener --- what fixed it?
[Bug tree-optimization/113895] [14 Regression] ice in in copy_reference_ops_from_ref, at tree-ssa-sccvn.cc:1144
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113895 Richard Biener changed: What|Removed |Added Last reconfirmed||2024-02-13 Status|UNCONFIRMED |ASSIGNED Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #3 from Richard Biener --- I will have a look.
[Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874 --- Comment #39 from Richard Biener --- (In reply to H.J. Lu from comment #32)
> (In reply to Michael Matz from comment #31)
> > (In reply to H.J. Lu from comment #30)
> > > (In reply to Michael Matz from comment #29)
> > > > It not only can call malloc. As the backtrace of H.J. shows, it quite
> > > > clearly _does_ so :-)
> > >
> > > ld.so can only call the malloc implementation internal to ld.so.
> >
> > (And string functions for initializing that memory) If that's ensured
> > already
> > everywhere: super. Because I agree, that this is the best thing to do here.
> > From my perspective this is pure internal implementation details and hence
> > setting up thread-local areas should not be expected to be interposable by
> > users.
> > (a custom allocator that isn't malloc or doesn't interact with it also would
> > work)
>
> Since ia32 ld.so in glibc is compiled with:
>
> Makefile:rtld-CFLAGS += -mno-sse -mno-mmx -mfpmath=387
>
> ia32 _dl_tlsdesc_dynamic is OK.

Maybe also use -minline-all-stringops to avoid using IFUNC accelerated memset/memcpy?
[Bug target/113847] [14 Regression] 10% slowdown of 462.libquantum on AMD Ryzen 7700X and Ryzen 7900X
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113847 Richard Biener changed: What|Removed |Added CC||jamborm at gcc dot gnu.org --- Comment #5 from Richard Biener --- CCing also Martin who should know how/why IPA SRA doesn't reconstruct the component ref chain here or why it chooses the dynamic type as it does (possibly local SRA when fully scalarizing an aggregate copy does the same).
[Bug target/113847] [14 Regression] 10% slowdown of 462.libquantum on AMD Ryzen 7700X and Ryzen 7900X
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113847 Richard Biener changed: What|Removed |Added CC||hubicka at gcc dot gnu.org --- Comment #4 from Richard Biener --- Hmm, the important one is actually MEM[ptr + CST] vs MEM[ptr].component. But those are not semantically equivalent, even when the same TBAA type is in effect.

  _31 = MEM [(struct quantum_reg *)reg_3(D)];
  _33 = MEM [(struct quantum_reg *)reg_3(D) + 8B];
  _34 = MEM [(struct quantum_reg *)reg_3(D) + 16B];
  _35 = MEM [(struct quantum_reg *)reg_3(D) + 24B];
  out = quantum_state_collapse.isra (pos_1(D), result_22, _31, _32, _33, _34, _35); [return slot optimization]

this is from inlined quantum_state_collapse where IPA SRA is eventually applied producing the above. That we do produce those might hint at that we can't really assume the dynamic type quantum_reg is at offset 8 but that was the original intent. What we are left with is the special-case where typeof (MEM[ptr + CST]) == typeof (alias-pointed-to-type) (with CST == 0). For any other case what we know is only that the access MEM[ptr + CST] is to somewhere inside an object of dynamic type quantum_reg? I'm not sure that's not less than we make use of in the alias-oracle, esp. aliasing_component_refs_walk and friends? We might be fine in practice for "bare" MEM_REFs like the above, but if we ever fold only part of the access path into the constant offset funny things may happen? So I think IPA SRA does wrong here (and maybe GCC in other places as well), possibly only pessimizing and possibly creating latent wrong-code. Note quantum_state_collapse has

  reg$size_62 = reg.size;
  reg$node_75 = reg.node;
  ...

pre-IPA. Honza, any opinion?
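The two access shapes contrasted in this comment can be sketched in plain C. This is only an illustration under assumptions: struct toy_reg and both function names are hypothetical, and the field layout is not taken from libquantum's real quantum_reg. The first function corresponds to the MEM[ptr].component form, the second to the MEM[ptr + CST] form where only a base pointer and a constant byte offset remain.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct quantum_reg; the fields are assumptions. */
struct toy_reg {
    int width;
    int size;
    void *node;
};

/* Component access: the access path still names the enclosing struct type. */
static int size_via_component(const struct toy_reg *reg)
{
    return reg->size;
}

/* Offset access: the member is reached through base pointer + constant offset,
   so the component path is no longer visible in the access itself. */
static int size_via_offset(const struct toy_reg *reg)
{
    const char *base = (const char *)reg;
    return *(const int *)(base + offsetof(struct toy_reg, size));
}
```

Both functions read the same int object, so they return the same value; the difference the comment is concerned with is only in how much type information the alias oracle can derive from each form.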
[Bug tree-optimization/113831] [11/12/13 Regression] Wrong VN with structurally identical ref since r9-398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113831 Richard Biener changed: What|Removed |Added Known to work||14.0 Summary|[11/12/13/14 Regression] Wrong VN with structurally identical ref since r9-398 |[11/12/13 Regression] Wrong VN with structurally identical ref since r9-398 --- Comment #7 from Richard Biener --- Fixed on trunk so far.
[Bug middle-end/113734] [14 regression] libarchive miscompiled (fails libarchive_test_read_format_rar5_extra_field_version test) since r14-8768-g85094e2aa6dba7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734 --- Comment #21 from Richard Biener --- loop->nb_iterations_upper_bound exactly is an upper bound on the number of latch executions, so maybe I'm missing the point here. When we update it, it as well has to reflect an upper bound on that, whether the last exit (the one before the latch) is the IV exit or a vectorized early exit. But yes, if the last exit is an early one that last iteration might be partial (so we drop the -1), but that's what we already do?
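As a plain-C illustration of what "number of latch executions" counts, here is a minimal sketch. It is not GCC code; struct loop_counts and count_loop are hypothetical names, and the loop shape (exit test placed just before the latch) is an assumption chosen to match the "last exit before the latch" wording above.

```c
/* Hypothetical counters for a loop whose only exit sits just before the latch. */
struct loop_counts {
    unsigned body;   /* how often the loop body ran */
    unsigned latch;  /* how often the back edge was taken */
};

static struct loop_counts count_loop(unsigned n)
{
    struct loop_counts c = { 0, 0 };
    unsigned i = 0;
    do {
        c.body++;        /* one execution of the loop body */
        if (++i >= n)
            break;       /* exit test, placed right before the latch */
        c.latch++;       /* back edge taken: one latch execution */
    } while (1);
    return c;
}
```

For n >= 1 the body runs n times while the latch runs n - 1 times, which is the "-1" relationship between iterations and latch executions that the comment refers to.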
[Bug target/113847] [14 Regression] 10% slowdown of 462.libquantum on AMD Ryzen 7700X and Ryzen 7900X
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113847 --- Comment #3 from Richard Biener --- I can't confirm a regression (testing r14-8925-g1e3f78dbb328a2 with the offending rev reverted vs bare).

462.libquantum  20720  61.9  335 S  20720  62.6  331 *
462.libquantum  20720  62.2  333 *  20720  61.9  335 S
462.libquantum  20720  62.4  332 S  20720  62.7  330 S

so the "best" run with the change is faster than the best run with it reverted while the worst runs are the same. There's only code-gen changes in quantum_bmeasure.part.0 and we can see it's likely {component_ref,mem_ref<0B>,reg_3(D)}@.MEM_166 (0030) vs {component_ref,mem_ref<0B>,reg_3(D)}@.MEM_9 (0022) where once the size is 256 and once 64. The types are constant 256> unit-size constant 32> vs. unit-size the former is subsetted by a COMPONENT_REF to eventually > unsigned DI so we have basically MEM vs. MEM.member-with-off. That's indeed a case where we maybe like to avoid applying this fix, but maybe only when strict-aliasing is in effect.
[Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874 --- Comment #17 from Richard Biener --- (In reply to Richard Biener from comment #16)
> I do wonder why __tls_get_addr would have to call the overloaded malloc, can
> we just not force-bind it to the glibc local malloc (and make sure that's
> compiled with -mgeneral-regs-only)?

I realize we end up calling memset (but __mempcpy?) as well, that might end up in an ifunc and thus using non-general regs as well (and be overloaded of course). So the whole __tls_get_addr path would need to make sure it never goes out of glibc controlled sources.
[Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874 --- Comment #16 from Richard Biener --- I do wonder why __tls_get_addr would have to call the overloaded malloc, can we just not force-bind it to the glibc local malloc (and make sure that's compiled with -mgeneral-regs-only)?
[Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874 Richard Biener changed: What|Removed |Added CC||matz at gcc dot gnu.org --- Comment #14 from Richard Biener --- True. Maybe the kernel VDSO should have a _save_all_regs (fnptr) and "indirector" ...
[Bug tree-optimization/113863] [14 Regression] ICE verify_ssa failed with -O3 -msse4.1 since r14-8768
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113863 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #4 from Richard Biener --- Fixed.
[Bug target/113847] [14 Regression] 10% slowdown of 462.libquantum on AMD Ryzen 7700X and Ryzen 7900X
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113847 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Last reconfirmed||2024-02-12 Ever confirmed|0 |1 --- Comment #2 from Richard Biener --- I will try to investigate. Note this was a correctness fix, it could be relaxed a tiny bit but behavior will then depend on the order of processing of blocks not ordered by RPO.
[Bug target/113882] V4SF->V4HI could be implemented using V4SF->V4SI and then truncation to V4HI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113882 Richard Biener changed: What|Removed |Added Blocks||53947 --- Comment #1 from Richard Biener --- The vectorizer has some of these tricks but the intermediate conversion allowed is somewhat hard-coded. I think the C standard says SF -> HI invokes undefined behavior on overflow so the conversion should be valid. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 [Bug 53947] [meta-bug] vectorizer missed-optimizations
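The two-step conversion from the bug title can be written out element by element in scalar C. This is a minimal sketch, assuming a hypothetical function name cvt_v4sf_v4hi and plain arrays in place of vector types; it shows the shape of the transformation, not what the vectorizer emits.

```c
/* Convert four floats to four shorts via an int intermediate.
   For inputs whose truncated value fits in short, this matches a
   direct float -> short conversion. */
static void cvt_v4sf_v4hi(const float in[4], short out[4])
{
    for (int i = 0; i < 4; i++) {
        int wide = (int)in[i];   /* V4SF -> V4SI step: truncate toward zero */
        out[i] = (short)wide;    /* V4SI -> V4HI step: narrow to 16 bits */
    }
}
```

Inputs whose truncated value does not fit in short are exactly the overflow case the comment says is undefined for a direct SF -> HI conversion, so the sketch makes no claim about them.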
[Bug tree-optimization/113879] missed optimization - not exploiting known range of integers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113879 Richard Biener changed: What|Removed |Added Blocks||85316 CC||amacleod at redhat dot com --- Comment #1 from Richard Biener --- VRP has difficulties with cycles. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85316 [Bug 85316] [meta-bug] VRP range propagation missed cases
[Bug sanitizer/113878] missed optimization with sanitizer and signed integer overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113878 --- Comment #9 from Richard Biener --- I'd very much appreciate getting rid of TYPE_OVERFLOW_SANITIZED checks by doing instrumentation in the frontends. Note we do

#define TYPE_OVERFLOW_UNDEFINED(TYPE) \
  (POINTER_TYPE_P (TYPE) \
   ? !flag_wrapv_pointer \
   : (!ANY_INTEGRAL_TYPE_CHECK(TYPE)->base.u.bits.unsigned_flag \
      && !flag_wrapv && !flag_trapv))

it might be tempting to do

      && !flag_trapv && !(flag_sanitize & SANITIZE_SI_OVERFLOW)

instead to get more complete coverage of disabling foldings. _Maybe_ we could clear SANITIZE_SI_OVERFLOW once instrumentation is complete?
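The effect of the extra condition can be seen in a toy model of the predicate. Everything here is hypothetical scaffolding: struct toy_flags and struct toy_type stand in for GCC's global flags and tree type properties, and the two functions only mirror the boolean shape of the macro and of the variant floated above.

```c
#include <stdbool.h>

/* Toy stand-ins for GCC's flags (all names are assumptions). */
struct toy_flags {
    bool wrapv;
    bool wrapv_pointer;
    bool trapv;
    bool sanitize_si_overflow;
};

/* Toy stand-in for the type properties the macro inspects. */
struct toy_type {
    bool is_pointer;
    bool is_unsigned;
};

/* Same boolean structure as the current macro. */
static bool overflow_undefined_current(struct toy_type t, struct toy_flags f)
{
    if (t.is_pointer)
        return !f.wrapv_pointer;
    return !t.is_unsigned && !f.wrapv && !f.trapv;
}

/* Variant with the additional sanitizer check. */
static bool overflow_undefined_proposed(struct toy_type t, struct toy_flags f)
{
    if (t.is_pointer)
        return !f.wrapv_pointer;
    return !t.is_unsigned && !f.wrapv && !f.trapv && !f.sanitize_si_overflow;
}
```

With only the sanitizer flag set, a signed integer type still counts as overflow-undefined in the first function but not in the second, which is how the variant would disable the overflow-based foldings.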
[Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874 Richard Biener changed: What|Removed |Added CC||rguenth at gcc dot gnu.org --- Comment #10 from Richard Biener --- I think a glibc fix would be very much preferred. Is -mtls-dialect=gnu2 supposed to work on a per-TU base or are all parts of an executable + loaded shlibs required to have the same setting?
[Bug target/113871] psrlq is not used for PERM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113871 Richard Biener changed: What|Removed |Added Target|x86_64 |x86_64-*-* i?86-*-* Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Last reconfirmed||2024-02-12 --- Comment #3 from Richard Biener --- Confirmed.
[Bug middle-end/113867] [14 Regression][OpenMP] Wrong code with mapping pointers in structs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113867 Richard Biener changed: What|Removed |Added Target Milestone|--- |14.0
[Bug tree-optimization/113863] [14 Regression] ICE verify_ssa failed with -O3 -msse4.1 since r14-8768
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113863 Richard Biener changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #2 from Richard Biener --- It looks OK after peeling. LOOP_VINFO_EARLY_BRK_VUSES is empty, but we have a stray virtual PHI in the body we fail to update: [local count: 446046556]: # .MEM_164 = PHI <.MEM_163(166)> if (f_8(D) < l_162) goto ; [88.31%] else goto ; [11.69%] things go downhill from here.
[Bug c++/113852] -Wsign-compare doesn't warn on unsigned result types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113852 Richard Biener changed: What|Removed |Added CC||rguenth at gcc dot gnu.org --- Comment #6 from Richard Biener --- Well, given that a1 * a2 is carried out in 'int' you are invoking undefined behavior if it overflows. GCC assumes that doesn't happen so it's correct to elide the diagnostic. Unless you make overflow well-defined with -fwrapv. I think that errs on the right side for the purpose of -Wsign-compare.
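A minimal sketch of the promotion issue, assuming a1 and a2 are 16-bit unsigned values (the comment does not state their types, so that is an assumption) and that int is 32 bits wide. The function name mul_u16 is hypothetical.

```c
#include <stdint.h>

/* Multiply two 16-bit unsigned values without signed overflow.
   Writing a1 * a2 directly would promote both operands to int, and
   65535 * 65535 does not fit in a 32-bit int, which is undefined.
   Converting one operand first keeps the arithmetic unsigned. */
static uint32_t mul_u16(uint16_t a1, uint16_t a2)
{
    return (uint32_t)a1 * a2;
}
```

Here mul_u16(65535, 65535) is well defined and yields 4294836225, whereas the same product computed in int is the overflow case GCC is allowed to assume never happens.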
[Bug middle-end/108410] x264 averaging loop not optimized well for avx512
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108410 --- Comment #10 from Richard Biener --- So this is now fixed if you use --param vect-partial-vector-usage=2, there is at the moment no way to get masking/not masking costed against each other. In theory vect_analyze_loop_costing and vect_estimate_min_profitable_iters could do both and we could delay vect_determine_partial_vectors_and_peeling.
[Bug middle-end/108376] TSVC s1279 runs 40% faster with aocc than gcc at zen4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108376 Richard Biener changed: What|Removed |Added Resolution|--- |WONTFIX Status|NEW |RESOLVED --- Comment #4 from Richard Biener --- So I'd say INVALID or WONTFIX.
[Bug rust/113499] crab1 fails to link when configuring with --disable-plugin
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113499 --- Comment #3 from Richard Biener --- (In reply to Richard Biener from comment #2)
> Re-confirmed. Can be reproduced both on a glibc 2.31 and glibc 2.38 system
> with

It does work with glibc 2.38, so only glibc 2.31 fails this (and possibly other OS).
[Bug rust/113499] crab1 fails to link when configuring with --disable-plugin
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113499 Richard Biener changed: What|Removed |Added Last reconfirmed||2024-02-09 Ever confirmed|0 |1 Status|UNCONFIRMED |NEW --- Comment #2 from Richard Biener --- Re-confirmed. Can be reproduced both on a glibc 2.31 and glibc 2.38 system with

../src/configure --enable-languages=rust --disable-bootstrap --disable-plugin

See GCC_ENABLE_PLUGIN which adjusts 'pluginlibs' but also causes symbols to be exported from the executable. You need to figure out what you need. For example the 'jit' frontend also requires this (--enable-host-shared), but IIRC it doesn't require -ldl. Some hosts may not support dynamically loading things.
[Bug target/113615] internal compiler error: in extract_insn, at recog.cc:2812
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113615 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #9 from Richard Biener --- This seems fixed now.
[Bug rtl-optimization/101188] [11/12/13 Regression] [postreload] Uses content of a clobbered register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188 Richard Biener changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |law at gcc dot gnu.org Target Milestone|--- |11.5 Status|REOPENED|ASSIGNED
[Bug target/113847] [14 Regression] 10% slowdown of 462.libquantum on AMD Ryzen 7700X and Ryzen 7900X
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113847 Richard Biener changed: What|Removed |Added Target Milestone|--- |14.0
[Bug modula2/113848] modula2 doesn't build with clang
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113848 --- Comment #1 from Richard Biener --- void * arithmetic is a GCC extension, I suggest changing that to char *.
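A minimal sketch of the suggested change, assuming a hypothetical helper advance_bytes (the actual modula2 source is not shown in this thread). Arithmetic directly on void * relies on the GNU extension that treats the size of void as 1; casting to char * gives the same byte-wise stepping in standard C.

```c
#include <stddef.h>

/* Hypothetical helper: advance a generic pointer by nbytes bytes.
   "p + nbytes" on a void * only compiles as a GNU extension, so the
   pointer is cast to char * before the addition. */
static void *advance_bytes(void *p, size_t nbytes)
{
    return (char *)p + nbytes;
}
```

The result converts back to void * implicitly, so callers do not need to change.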
[Bug tree-optimization/113849] wrong code with _BitInt() arithmetics at -O1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113849 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Last reconfirmed||2024-02-09 --- Comment #1 from Richard Biener --- Confirmed.
[Bug tree-optimization/113831] [11/12/13/14 Regression] Wrong VN with structurally identical ref since r9-398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113831 --- Comment #5 from Richard Biener --- So we have equal vn_reference but with different ao_ref. Note the recorded vn_reference has value-numbers in operands (not sanitized via AVAIL to a specific location) but the ao_ref is eventually initialized from get_ref_base_and_extent on the original ref which can use context sensitive info. That doesn't actually compute a constant array index from a variable one but instead it constrains the extent of the access which eventually gets to max_size == size. To apply the same logic consistently to the VN representation (which is eventually valueized) we can only look at ranges on names either from the original ref (during copy_reference_ops_from_ref) or when valueizing with AVAIL in mind. For consistency operating from copy_reference_ops_from_ref would be preferred. It's going to be quite sophisticated to reverse-engineer all constant array indexes from the overall [offset, offset + size] computed by get_ref_base_and_extent (we definitely want to do that only once per copy_reference_ops_from_ref). For PRE we do need all the components, so we have to somehow post-process the vn_reference ops. The other possibility for a fix would be to try to fend off ranges being used by get_ref_base_and_extent (but only for the calls on the refs we're going to insert into the expression hash table). get_range_query cannot be tricked so it would be an extra arg to get_ref_base_and_extent and possibly ao_ref_init. That sounds a bit ugly. I will try to implement the post-processing.
[Bug middle-end/113205] [14 Regression] internal compiler error: in backward_pass, at tree-vect-slp.cc:5346 since r14-3220
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113205 --- Comment #10 from Richard Biener --- Btw, I was hoping Richard would chime in here ...
[Bug libstdc++/113835] [13/14 Regression] compiling std::vector with const size in C++20 is slow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113835 Richard Biener changed: What|Removed |Added Last reconfirmed||2024-02-09 Target Milestone|--- |13.3 Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Known to fail||13.2.1, 14.0 Component|c++ |libstdc++ Known to work||12.2.1 Summary|compiling std::vector with const size in C++20 is slow |[13/14 Regression] compiling std::vector with const size in C++20 is slow --- Comment #1 from Richard Biener --- Confirmed with -std=c++20 -fsyntax-only

 constant expression evaluation :   1.80 ( 85%)   0.03 ( 14%)   1.84 ( 78%)   220M ( 88%)
 TOTAL                          :   2.13          0.22          2.36          250M

Samples: 8K of event 'cycles', Event count (approx.): 9294971478
Overhead  Samples  Command  Shared Object  Symbol
  16.33%     1385  cc1plus  cc1plus        [.] cxx_eval_constant_expression
   4.35%      369  cc1plus  cc1plus        [.] cxx_eval_call_expression
   3.90%      331  cc1plus  cc1plus        [.] cxx_eval_store_expression
   3.16%      268  cc1plus  cc1plus        [.] hash_table::find_s
   1.98%      168  cc1plus  cc1plus        [.] tree_operand_check

GCC 12 was fast (possibly std::vector wasn't constexpr there?)
[Bug tree-optimization/113833] 435.gromacs fails verification on with -Ofast -march={cascadelake,icelake-server} and PGO after r14-7272-g57f611604e8bab
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113833 --- Comment #3 from Richard Biener --- I suspect the issue would pop up with -Ofast -fno-vect-cost-model for any sub-architecture. The patch referenced just adjusts costs for doing BB vectorization (and there's reductions there as well). It might be interesting to offer more high-level knobs to tune for vectorization, say -fno-vect-bb-reduction or -fforce-in-order-bb-reduction-vectorization. A compare before/after the patch of -fopt-info-vec output might show the few cases that are affected by the patch.
[Bug tree-optimization/113831] [11/12/13/14 Regression] Wrong VN with structurally identical ref since r9-398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113831 Richard Biener changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=108355 --- Comment #4 from Richard Biener --- The related bug might be also fixed then.
[Bug tree-optimization/113831] [11/12/13/14 Regression] Wrong VN with structurally identical ref since r9-398-g6b9fc1782effc67dd9f6def16207653d79647553
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113831 --- Comment #2 from Richard Biener --- I think the issue is that we're using range info for get_ref_base_and_extent but we fail to do so when valueizing refs.
[Bug tree-optimization/113831] [11/12/13/14 Regression] Wrong VN with structurally identical ref since r9-398-g6b9fc1782effc67dd9f6def16207653d79647553
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113831 Richard Biener changed: What|Removed |Added Priority|P3 |P2 Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED Keywords||wrong-code Target Milestone|--- |11.5 Last reconfirmed||2024-02-08 --- Comment #1 from Richard Biener --- Mine.
[Bug tree-optimization/113774] wrong code with _BitInt() arithmetics at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113774 --- Comment #7 from Richard Biener --- (In reply to Jakub Jelinek from comment #6)
> Thanks.
> The #c5 reduced testcase started to be miscompiled with
> r9-398-g6b9fc1782effc67dd9f6def16207653d79647553
> Perhaps we should move that to a separate bug so that it can be marked
> [11/12/13/14 Regression] and leave this just for the bitint lowering
> enhancements not to emit clearly always true or always false conditions if
> possible.

PR113831
[Bug tree-optimization/113831] New: [11/12/13/14 Regression] Wrong VN with structurally identical ref since r9-398-g6b9fc1782effc67dd9f6def16207653d79647553
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113831

            Bug ID: 113831
           Summary: [11/12/13/14 Regression] Wrong VN with structurally identical ref since r9-398-g6b9fc1782effc67dd9f6def16207653d79647553
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

The following is miscompiled by FRE with -O2

int a[3];
int __attribute__((noipa))
foo(int i, int x)
{
  int tem = 0;
  a[2] = x;
  if (i < 1)
    ++i;
  else
    {
      ++i;
      tem = a[i];
    }
  tem += a[i];
  return tem;
}

int main()
{
  if (foo (0, 7) != 0)
    __builtin_abort();
}
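To see why main expects 0, the two paths through foo can be spelled out separately. This is a hand-written reference under assumptions: foo_expected is a hypothetical name, it uses a local copy of the array instead of the global, and it only covers -1 <= i <= 1, where the index stays in bounds.

```c
/* Hypothetical reference for the testcase above; valid for -1 <= i <= 1. */
static int foo_expected(int i, int x)
{
    int a[3] = { 0, 0, x };      /* a[2] = x, the other elements are zero */
    if (i < 1)
        return a[i + 1];         /* only the final load of a[i] runs */
    return a[i + 1] + a[i + 1];  /* both loads read the same element */
}
```

For the call in the testcase, i is 0, so the only load is a[1], which is zero. A result of 7 would mean the load was wrongly matched to the store to a[2].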
[Bug tree-optimization/113774] wrong code with _BitInt() arithmetics at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113774 --- Comment #5 from Richard Biener --- This must go wrong during alias disambiguation, somehow figuring we can ignore the backedge?! The ref we hoist is _68 = VIEW_CONVERT_EXPR(b)[_146]; where _146 is _49 + 1, but _49 is an IV: _134 = _105 & 1; MEM [(unsigned _BitInt(257) *) + 32B] = _134; [local count: 1073741824]: # _49 = PHI <0(4), _50(28)> it's also odd that we seem to arrive at b + 32B. Value numbering stmt = _146 = PHI <_145(8), _140(31)> Setting value number of _146 to _140 (changed) Making available beyond BB10 _146 for value _140 ... Value numbering stmt = .MEM_150 = PHI <.MEM_149(8), .MEM_139(31)> Setting value number of .MEM_150 to .MEM_150 (changed) Value numbering stmt = _68 = VIEW_CONVERT_EXPR(b)[_146]; Setting value number of _68 to _134 (changed) huh. Hmm. But we have # RANGE [irange] sizetype [4, 4][6, +INF] MASK 0xfffe VALUE 0x1 _140 = _49 + 1; # RANGE [irange] sizetype [1, 2][4, 4][6, +INF] MASK 0xfffe VALUE 0x1 # _146 = PHI <_145(8), _140(6)> we should look at the range of _146 Hmm, I _think_ I know what happens. We have [local count: 1073741824]: # _49 = PHI <0(4), _50(28)> # _55 = PHI <0(4), _56(28)> _51 = VIEW_CONVERT_EXPR(b)[_49]; if (_49 <= 2) goto ; [80.00%] else goto ; [20.00%] [local count: 214748360]: _135 = .USUBC (0, _51, _55); _136 = IMAGPART_EXPR <_135>; _137 = REALPART_EXPR <_135>; _138 = _51 | _137; bitint.6[_49] = _138; _140 = _49 + 1; _141 = VIEW_CONVERT_EXPR(b)[_140]; and this is the "same" valueized ref (what gets recorded in the hashtable), but here we can see that _140 >= 4 which makes it known 4 based on the array extent. This matches it up with the store of _134: Value numbering stmt = _141 = VIEW_CONVERT_EXPR(b)[_140]; Setting value number of _141 to _134 (changed) _134 is available for _134 we record the expression with the VUSE of the definition. 
Later when we look up the same expression from the later block (where _140 isn't known to be 4) we find the very same expression when looking with the VUSE of the definition and thus we take the expression already in the hashtable which has been assigned the value _134 and then boom.

Sth like the following is miscompiled at -O2 by FRE.

int a[3];
int __attribute__((noipa))
foo(int i, int x)
{
  int tem = 0;
  a[2] = x;
  if (i < 1)
    ++i;
  else
    {
      ++i;
      tem = a[i];
    }
  tem += a[i];
  return tem;
}

int main()
{
  if (foo (0, 7) != 0)
    __builtin_abort();
}
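To make the expected result of the testcase concrete, here is a minimal sketch that walks both paths through the function. The names `a_model` and `foo_model` are hypothetical, and the array is declared `volatile` purely as an assumption of this illustration, so that every load is actually performed and the sketch does not itself rely on the optimization under discussion.

```c
/* Hypothetical model of the testcase.  'volatile' forces each a_model[i]
   load to be done as written, so no earlier value can be reused.  */
volatile int a_model[3];

int foo_model(int i, int x)
{
    int tem = 0;
    a_model[2] = x;
    if (i < 1)
        ++i;                 /* i == 0 on entry gives i == 1 here */
    else {
        ++i;                 /* i == 1 on entry gives i == 2 here */
        tem = a_model[i];    /* in bounds only when i == 2, so this reads x */
    }
    tem += a_model[i];       /* same textual ref, but i may be 1 on this path */
    return tem;
}
```

With `i == 0` the final load reads `a_model[1]`, which was never written and is therefore zero, so the function returns 0. With `i == 1` both loads read `a_model[2]` and the result is `2 * x`. Reusing the value recorded for the load in the else branch on the `i == 0` path is what the comment above describes as going wrong.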
[Bug middle-end/113734] [14 regression] libarchive miscompiled (fails libarchive_test_read_format_rar5_extra_field_version test) since r14-8768-g85094e2aa6dba7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734 --- Comment #11 from Richard Biener ---
(In reply to Tamar Christina from comment #10)
> (In reply to Richard Biener from comment #9)
> > Another bug in the dependence checking code is
> >
> >   if (dr_may_alias_p (dr_ref, dr_read, loop_nest))
> >
> > which will end up using TBAA - dr_may_alias_p doesn't think you are ever
> > going to move stores down across loads. To verify if that's possible
> > you need to use
> >
> >   if (dr_may_alias_p (dr_read, dr_ref, loop_nest))
> >
> > instead.
> >
> > Note there's still my very original review consideration that you move
> > stmts out-of-order but the main dependence checking the vectorizer does
> > assumes the stores and loads appear in their original order. I'm not
> > sure whether with the above we prove this doesn't matter.
>
> But in the original review I had it that way and you said:
>
> > + for (auto dr_read : bases)
> > +   if (dr_may_alias_p (dr_read, dr_ref, loop_nest))
>
> I think you need to swap dr_read and dr_ref operands, since you
> are walking stmts backwards and thus all reads from 'bases' are
> after the write.
>
> so I'm somewhat confused..

I was confused.
[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576 --- Comment #42 from Richard Biener ---
And the do_store_flag part:

diff --git a/gcc/expr.cc b/gcc/expr.cc
index fc5e998e329..44d64274071 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -13693,6 +13693,19 @@ do_store_flag (sepops ops, rtx target, machine_mode mode)
     subtarget = 0;
   expand_operands (arg0, arg1, subtarget, &op0, &op1, EXPAND_NORMAL);
+  unsigned HOST_WIDE_INT nunits;
+  if (VECTOR_BOOLEAN_TYPE_P (type)
+      && operand_mode == QImode
+      && TYPE_VECTOR_SUBPARTS (type).is_constant (&nunits)
+      && nunits < BITS_PER_UNIT)
+    {
+      op0 = expand_binop (mode, and_optab, op0,
+                          GEN_INT ((1 << nunits) - 1), NULL_RTX,
+                          true, OPTAB_WIDEN);
+      op1 = expand_binop (mode, and_optab, op1,
+                          GEN_INT ((1 << nunits) - 1), NULL_RTX,
+                          true, OPTAB_WIDEN);
+    }
   if (target == 0)
     target = gen_reg_rtx (mode);

for the testcase

typedef long v4si __attribute__((vector_size(4*sizeof(long))));
typedef v4si v4sib __attribute__((vector_mask));
typedef _Bool sbool1 __attribute__((signed_bool_precision(1)));
_Bool x;
void __GIMPLE (ssa) foo (v4sib v1, v4sib v2)
{
  v4sib tem;
  _Bool _7;

__BB(2):
  tem_5 = ~v2_2(D);
  tem_3 = v1_1(D) | tem_5;
  tem_4 = _Literal (v4sib) { _Literal (sbool1) -1, _Literal (sbool1) -1, _Literal (sbool1) -1, _Literal (sbool1) -1 };
  _7 = tem_3 == tem_4;
  x = _7;
  return;
}
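The effect of the `(1 << nunits) - 1` masking in the patch can be sketched in plain C on raw mask bytes. This is only an illustration of the arithmetic, not of how GCC expands RTL; `or_not` and `mask_equal` are hypothetical helper names, and a 4-lane mask held in one byte is assumed, as in the testcase.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models tem_3 = v1 | ~v2 on a byte-sized mask.  The bitwise NOT also
   flips the upper padding bits that do not correspond to any lane.  */
uint8_t or_not(uint8_t v1, uint8_t v2)
{
    return (uint8_t)(v1 | (uint8_t)~v2);
}

/* Compares two masks looking only at the low NUNITS bits.  NUNITS is
   assumed to be below 8, mirroring the nunits < BITS_PER_UNIT guard.  */
bool mask_equal(uint8_t m0, uint8_t m1, unsigned nunits)
{
    uint8_t live = (uint8_t)((1u << nunits) - 1u);
    return (m0 & live) == (m1 & live);
}
```

For `v1 = 0x0f` and `v2 = 0x00` the raw result is `0xff`, which differs from the all-ones constant `0x0f` when compared as a full byte, while `mask_equal(0xff, 0x0f, 4)` reports equality because the four padding bits are ignored.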
[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576 --- Comment #41 from Richard Biener --- (In reply to Hongtao Liu from comment #38) > > I think we should also mask off the upper bits of variable mask? > > > > notl%esi > > orl %esi, %edi > > notl%edi > > andl$15, %edi > > je .L3 > > with -mbmi, it's > > andn%esi, %edi, %edi > andl$15, %edi > je .L3 Well, yes, the discussion in this bug was whether to do this at consumers (that's sth new) or with all mask operations (that's how we handle bit-precision integer operations, so it might be relatively easy to do that - specifically spot the places eventually needing adjustment). There's do_store_flag to fixup for uses not in branches and do_compare_and_jump for conditional jumps. Note the AND is removed by combine if I add it: Successfully matched this instruction: (set (reg:CCZ 17 flags) (compare:CCZ (and:HI (not:HI (subreg:HI (reg:QI 102 [ tem_3 ]) 0)) (const_int 15 [0xf])) (const_int 0 [0]))) (*testhi_not) -9: {r103:QI=r102:QI&0xf;clobber flags:CC;} + REG_DEAD r99:QI +9: NOTE_INSN_DELETED + 12: flags:CCZ=cmp(~r102:QI#0&0xf,0) REG_DEAD r102:QI - REG_UNUSED flags:CC - 12: flags:CCZ=cmp(r103:QI,0xf) - REG_DEAD r103:QI and we get foo: .LFB0: .cfi_startproc notl%esi orl %esi, %edi notl%edi testb $15, %dil je .L6 ret which I'm not sure is OK? diff --git a/gcc/dojump.cc b/gcc/dojump.cc index e2d2b3cb111..784707c1e55 100644 --- a/gcc/dojump.cc +++ b/gcc/dojump.cc @@ -1266,6 +1266,7 @@ do_compare_and_jump (tree treeop0, tree treeop1, enum rtx_code signed_code, machine_mode mode; int unsignedp; enum rtx_code code; + unsigned HOST_WIDE_INT nunits; /* Don't crash if the comparison was erroneous. 
*/ op0 = expand_normal (treeop0); @@ -1308,6 +1309,18 @@ do_compare_and_jump (tree treeop0, tree treeop1, enum rtx_code signed_code, emit_insn (targetm.gen_canonicalize_funcptr_for_compare (new_op1, op1)); op1 = new_op1; } + else if (VECTOR_BOOLEAN_TYPE_P (type) + && mode == QImode + && TYPE_VECTOR_SUBPARTS (type).is_constant () + && nunits < BITS_PER_UNIT) +{ + op0 = expand_binop (mode, and_optab, op0, + GEN_INT ((1 << nunits) - 1), NULL_RTX, + true, OPTAB_WIDEN); + op1 = expand_binop (mode, and_optab, op1, + GEN_INT ((1 << nunits) - 1), NULL_RTX, + true, OPTAB_WIDEN); +} do_compare_rtx_and_jump (op0, op1, code, unsignedp, treeop0, mode, ((mode == BLKmode)
[Bug libstdc++/113811] std::rotate does 64-bit signed division
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113811 --- Comment #1 from Richard Biener --- In case __n is the minimum signed integer and __k is -1 the division would also trap ;) So yes, they should be unsigned.
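The trapping case mentioned above can be made explicit with a small sketch. The helper names are hypothetical and `long long` stands in for whatever signed difference type is in use; the division-by-zero check is an addition of this sketch, not something the comment discusses.

```c
#include <limits.h>
#include <stdbool.h>

/* True when n / k (or n % k) cannot be evaluated for signed operands:
   either k is zero or the quotient LLONG_MIN / -1 is not representable.  */
bool signed_div_overflows(long long n, long long k)
{
    return k == 0 || (n == LLONG_MIN && k == -1);
}

/* Unsigned remainder has no overflow case; only k == 0 is invalid,
   and the caller is assumed to pass a nonzero k.  */
unsigned long long unsigned_rem(unsigned long long n, unsigned long long k)
{
    return n % k;
}
```

With unsigned operands there is no value pair other than a zero divisor that needs guarding, which is the practical argument for doing the arithmetic unsigned.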
[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576 --- Comment #36 from Richard Biener ---
For example with AVX512VL and the following, using -O -fgimple -mavx512vl we get simply

        notl    %esi
        orl     %esi, %edi
        cmpb    $15, %dil
        je      .L6

typedef long v4si __attribute__((vector_size(4*sizeof(long))));
typedef v4si v4sib __attribute__((vector_mask));
typedef _Bool sbool1 __attribute__((signed_bool_precision(1)));
void __GIMPLE (ssa) foo (v4sib v1, v4sib v2)
{
  v4sib tem;

__BB(2):
  tem_5 = ~v2_2(D);
  tem_3 = v1_1(D) | tem_5;
  tem_4 = _Literal (v4sib) { _Literal (sbool1) -1, _Literal (sbool1) -1, _Literal (sbool1) -1, _Literal (sbool1) -1 };
  if (tem_3 == tem_4)
    goto __BB3;
  else
    goto __BB4;

__BB(3):
  __builtin_abort ();

__BB(4):
  return;
}

the question is whether that matches the semantics of GIMPLE (the padding is inverted, too), whether it invokes undefined behavior (don't do it - it seems for people using intrinsics that's what it is?) or whether we should avoid affecting padding.

Note after the patch I proposed on the mailing list the constant mask is now expanded with zero padding.
[Bug tree-optimization/113796] [14 Regression] ifcvt does not remove range info before folding: Runtime mismatch at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113796 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #9 from Richard Biener --- Fixed (but possibly latent on branches of course).
[Bug tree-optimization/113808] [14 Regression] FAIL: libgomp.fortran/non-rectangular-loop-1.f90 since r14-8768
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113808 --- Comment #8 from Richard Biener --- It's surely a bug in the vectorizer early exit handling. I just don't know what exactly is wrong right now ;)
[Bug tree-optimization/113808] [14 Regression] FAIL: libgomp.fortran/non-rectangular-loop-1.f90 since r14-8768
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113808 --- Comment #6 from Richard Biener ---
With the following I don't see things going wrong, but we end up with the loop having the STOP exit last instead and thus a PEELED case.

function bar (n) result (k)
  integer :: n, k
  !$omp simd lastprivate(k)
  do k = 1, n + 41
    if (k > 11 + 41 .or. k < 1) error stop
  end do
end
program main
  integer :: n, i,k
  integer :: bar
  n = 11
  k = bar (n)
  if (k /= 53) then
    print *, k, 53
    error stop
  endif
end
[Bug tree-optimization/113808] [14 Regression] FAIL: libgomp.fortran/non-rectangular-loop-1.f90 since r14-8768
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113808 --- Comment #5 from Richard Biener ---
(In reply to Jakub Jelinek from comment #3)
> Started with r14-8768-g85094e2aa6dba7908f053046f02dd443e8f65d72
> The regression status is unclear because we emitted sorry on this
> before r14-2634-g85da0b40538fb0d17d89de1e7905984668e3dfef

I think r14-8768 just exposed this. We are picking the last exit in the loop, it's not a PEELED case. It's the exit towards the if (k/=53) not towards STOP.
[Bug tree-optimization/113808] [14 Regression] FAIL: libgomp.fortran/non-rectangular-loop-1.f90 since r14-8768
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113808 --- Comment #4 from Richard Biener ---
Reduced a bit, w/o collapse:

program main
  integer :: n, i,k
  n = 11
  do i = 1, n,2
    !$omp simd lastprivate(k)
    do k = 1, i + 41
      if (k > 11 + 41 .or. k < 1) error stop
    end do
  end do
  if (k /= 53) then
    print *, k, 53
    error stop
  endif
end
[Bug tree-optimization/113808] [14 Regression] FAIL: libgomp.fortran/non-rectangular-loop-1.f90
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113808 Richard Biener changed: What|Removed |Added Target Milestone|--- |14.0 Keywords||wrong-code CC||tnfchris at gcc dot gnu.org --- Comment #1 from Richard Biener --- The error must be for the continuation of 'k' to the scalar loop where we have [local count: 829590381]: MEM [(integer(kind=4) *)] = vect_vec_iv_.27_95; vect_k.32_118 = vect_vec_iv_.27_95 + { 1, 1, 1, 1 }; k.4_23 = k.4_55 + 1; ivtmp_120 = ivtmp_119 + 1; if (ivtmp_120 < bnd.23_89) goto ; [85.44%] else goto ; [14.56%] [local count: 136777259]: # k.4_45 = PHI # ivtmp_76 = PHI # vect_vec_iv_.27_99 = PHI # vect__19.29_108 = PHI <{ 0, 1, 2, 3 }(5)> _109 = BIT_FIELD_REF ; _48 = _109; _100 = BIT_FIELD_REF ; k.4_43 = _100; niters_vector_mult_vf.24_90 = bnd.23_89 << 2; tmp.26_93 = 53 - niters_vector_mult_vf.24_90; _92 = (integer(kind=4)) niters_vector_mult_vf.24_90; tmp.25_91 = _92 + 1; if (niters.22_12 == niters_vector_mult_vf.24_90) goto ; [25.00%] else goto ; [75.00%] [local count: 136777259]: # k.4_74 = PHI # ivtmp_77 = PHI but I can't really see anything wrong here (besides redundant code). It's possible to elide the middle loop, but I failed to emulate the inner loop how it's presented without -fopenmp-simd.
[Bug tree-optimization/113808] New: [14 Regression] FAIL: libgomp.fortran/non-rectangular-loop-1.f90
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113808

Bug ID: 113808
Summary: [14 Regression] FAIL: libgomp.fortran/non-rectangular-loop-1.f90
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---

The following reduced testcase from libgomp.fortran/non-rectangular-loop-1.f90 fails execution:

program main
  integer :: n,m,p, i,j,k,ll
  n = 11
  m = 23
  p = 27
  !$omp simd collapse(3) lastprivate(k)
  do i = 1, n,2
    do j = 1, m
      do k = 1, i + 41
        if (k > 11 + 41 .or. k < 1) error stop
      end do
    end do
  end do
  if (k /= 53) then
    print *, k, 53
    error stop
  endif
end

when built with -O -msse4.1 -fopenmp-simd

> ./a.out
 50 53
ERROR STOP

Error termination. Backtrace:
#0 0x4008ec in ???
#1 0x400909 in ???
#2 0x7f873306f24c in ???
#3 0x400679 in _start
   at ../sysdeps/x86_64/start.S:120
#4 0x in ???
[Bug libgcc/113803] libgcc unwinder stops at calls to null function pointer on some targets
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113803 --- Comment #4 from Richard Biener --- The return address should be still on the stack for most archs, unless we run into zero by "overflowing" the IP, of course.
[Bug tree-optimization/113796] [14 Regression] ifcvt does not remove range info before folding: Runtime mismatch at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113796 --- Comment #7 from Richard Biener --- We're removing flow-sensitive info in combine_blocks, but after inserting and folding stmts comprising the PHI replacements. There's possibly latent issues when building up the predicates themselves since that uses maybe_fold_or_comparisons without the workaround added for if-combine. I have a patch resetting flow-sensitive info earlier (and also covering PHIs).
[Bug tree-optimization/113801] Missed optimization of loop invariant elimination
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113801 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Keywords||missed-optimization Last reconfirmed||2024-02-07 --- Comment #1 from Richard Biener --- I think there's a duplicate bug having the same loop-carried "zero" where final value replacement gets the overall update to 'a', just a bit more complicated here.
[Bug tree-optimization/111478] [12 Regression] aarch64 SVE ICE: in compute_live_loop_exits, at tree-ssa-loop-manip.cc:250
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111478 --- Comment #9 from Richard Biener --- (In reply to Saurabh Jha from comment #8) > Hi Richard, > > Are you also planning to backport it to gcc-12? Yes.
[Bug middle-end/113734] [14 regression] libarchive miscompiled (fails libarchive_test_read_format_rar5_extra_field_version test) since r14-8768-g85094e2aa6dba7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734 --- Comment #9 from Richard Biener --- Another bug in the dependence checking code is if (dr_may_alias_p (dr_ref, dr_read, loop_nest)) which will end up using TBAA - dr_may_alias_p doesn't think you are ever going to move stores down across loads. To verify if that's possible you need to use if (dr_may_alias_p (dr_read, dr_ref, loop_nest)) instead. Note there's still my very original review consideration that you move stmts out-of-order but the main dependence checking the vectorizer does assumes the stores and loads appear in their original order. I'm not sure whether with the above we prove this doesn't matter.
[Bug middle-end/113734] [14 regression] libarchive miscompiled (fails libarchive_test_read_format_rar5_extra_field_version test) since r14-8768-g85094e2aa6dba7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734 --- Comment #8 from Richard Biener ---
(In reply to Tamar Christina from comment #6)
> The reason for the miscompile popping up is this change from the previous
> patch
>
> diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
> index 109d4ce5192..df3eab2e8d5 100644
> --- a/gcc/tree-vect-data-refs.cc
> +++ b/gcc/tree-vect-data-refs.cc
> @@ -725,8 +725,7 @@ vect_analyze_early_break_dependences (loop_vec_info
> loop_vinfo)
>      bounded by VF so accesses are within range. We only need to
> check the
>      reads since writes are moved to a safe place where if we get
> there we
>      know they are safe to perform. */
> -   if (DR_IS_READ (dr_ref)
> -       && !ref_within_array_bound (stmt, DR_REF (dr_ref)))
> +   if (!ref_within_array_bound (stmt, DR_REF (dr_ref)))

I think it can even be relaxed to

  if ((DR_IS_READ (dr_ref) && check_deps))

since for non-peeled the IV exit block will be only executed with a fully enabled vector.
[Bug tree-optimization/113796] [14 Regression] ifcvt does not remove range info before folding: Runtime mismatch at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113796 Richard Biener changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #6 from Richard Biener --- Let me take this.
[Bug target/113790] [14 Regression][riscv64] ICE in curr_insn_transform, at lra-constraints.cc:4294 since r14-4944-gf55cdce3f8dd85
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113790 Richard Biener changed: What|Removed |Added Target Milestone|--- |14.0
[Bug tree-optimization/113787] [12/13/14 Regression] Wrong code at -O with ipa-modref on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113787 --- Comment #11 from Richard Biener --- Btw, there's related IPA modref wrong-code issues where IPA and late summaries are merged incorrectly (also receiving no attention)
[Bug tree-optimization/113787] [12/13/14 Regression] Wrong code at -O with ipa-modref on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113787 Richard Biener changed: What|Removed |Added Target||aarch64 --- Comment #10 from Richard Biener --- I think it's ipa-modref analyze_store bailing for if (a.parm_index == MODREF_LOCAL_MEMORY_PARM) return false; no idea how it arrives at that.
[Bug middle-end/113734] [14 regression] libarchive miscompiled (fails libarchive_test_read_format_rar5_extra_field_version test) since r14-8768-g85094e2aa6dba7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734 --- Comment #7 from Richard Biener ---
(In reply to Tamar Christina from comment #6)
> The reason for the miscompile popping up is this change from the previous
> patch
>
> diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
> index 109d4ce5192..df3eab2e8d5 100644
> --- a/gcc/tree-vect-data-refs.cc
> +++ b/gcc/tree-vect-data-refs.cc
> @@ -725,8 +725,7 @@ vect_analyze_early_break_dependences (loop_vec_info
> loop_vinfo)
>      bounded by VF so accesses are within range. We only need to
> check the
>      reads since writes are moved to a safe place where if we get
> there we
>      know they are safe to perform. */
> -   if (DR_IS_READ (dr_ref)
> -       && !ref_within_array_bound (stmt, DR_REF (dr_ref)))
> +   if (!ref_within_array_bound (stmt, DR_REF (dr_ref)))
>     {
>       if (dump_enabled_p ())
>         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>
> but this should have been safe, as the stores shouldn't be done until the
> point we know for sure they would be safe to do.
>
> the code out of the vectorizer looks ok to me. Valgrind is saying we're
> reading uninitialized values. But those values I think come from a previous
> look which sets them to 0. Or is supposed to. So working my way up this
> giant function.

Hmm, but there isn't really a "safe" place, is there? If there's a safe place then it would be safe for reads as well, no? So I guess when you manage to massage the testcase to be based on decls then you instead (with the above suggested change) get spurious stores?
[Bug tree-optimization/111478] [12 Regression] aarch64 SVE ICE: in compute_live_loop_exits, at tree-ssa-loop-manip.cc:250
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111478 Richard Biener changed: What|Removed |Added Summary|[12/13 Regression] aarch64 |[12 Regression] aarch64 SVE |SVE ICE: in |ICE: in |compute_live_loop_exits, at |compute_live_loop_exits, at |tree-ssa-loop-manip.cc:250 |tree-ssa-loop-manip.cc:250 Known to work||13.2.1 --- Comment #7 from Richard Biener --- Backported to GCC 13.
[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 112618, which changed state. Bug 112618 Summary: [13 Regression] internal compiler error: in expand_MASK_CALL, at internal-fn.cc:4529 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112618 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/112618] [13 Regression] internal compiler error: in expand_MASK_CALL, at internal-fn.cc:4529
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112618 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Known to work||13.2.1 Status|ASSIGNED|RESOLVED --- Comment #5 from Richard Biener --- Fixed.
[Bug tree-optimization/110243] [12/13 Regression] Wrong code at -O3 on x86_64-linux-gnu since r13-3875-g9e11ceef165
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110243 --- Comment #16 from Richard Biener --- Backporting to GCC 13 causes gcc.dg/tree-ssa/ldist-17.c to FAIL.
[Bug target/113779] Very inefficient m68k code generated for simple copy loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113779 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2024-02-06 Ever confirmed|0 |1 --- Comment #6 from Richard Biener ---
It's already visible with a simple

void f(const long* src, long* dst)
{
  *dst++ = *src++;
  *dst = *src;
}

where we expand to RTL from

  _1 = *src_3(D);
  *dst_4(D) = _1;
  _2 = MEM[(const long int *)src_3(D) + 4B];
  MEM[(long int *)dst_4(D) + 4B] = _2;

there's nothing on GIMPLE that would split the add and RTL's auto-inc-dec pass doesn't do anything either. We'd need a form of "strength-reduction" or maybe targets preferring auto-inc/dec should not legitimize constant offsets before reload ...

Note with one more copy you then see

  _1 = *src_4(D);
  *dst_5(D) = _1;
  _2 = MEM[(const long int *)src_4(D) + 4B];
  MEM[(long int *)dst_5(D) + 4B] = _2;
  _3 = MEM[(const long int *)src_4(D) + 8B];
  MEM[(long int *)dst_5(D) + 8B] = _3;

and naively splitting gives you

  src_6 = src_4(D) + 4;
  src_7 = src_4(D) + 8;

that said, it's really sth for RTL since it's going to be highly target dependent which form is more efficient. The auto-inc pass is well structured, so it should be possible to extend it.
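The two shapes being compared can be written out at the source level. This is a hedged sketch for a three-element copy; `copy3_offsets` and `copy3_postinc` are hypothetical names, and nothing here says which form a given compiler or target will actually emit.

```c
/* Constant-offset form: every access is addressed from the unchanged
   base pointers, which is what the GIMPLE shown above looks like.  */
void copy3_offsets(const long *src, long *dst)
{
    dst[0] = src[0];
    dst[1] = src[1];
    dst[2] = src[2];
}

/* Post-increment form: each access advances the pointer, which maps
   naturally onto auto-increment addressing where a target has it.  */
void copy3_postinc(const long *src, long *dst)
{
    *dst++ = *src++;
    *dst++ = *src++;
    *dst = *src;
}
```

Both functions store the same values. The first form has no dependence between the accesses, while the second chains them through the pointer updates, which is the trade-off the comment leaves to the RTL passes.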
[Bug tree-optimization/113703] ivopts miscompiles loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113703 --- Comment #5 from Richard Biener ---
It's going wrong in iv_elimination_compare_lt which tries to exactly handle this kind of loop:

   We aim to handle the following situation:

   sometype *base, *p;
   int a, b, i;

   i = a;
   p = p_0 = base + a;

   do
     {
       bla (*p);
       p++;
       i++;
     }
   while (i < b);

   Here, the number of iterations of the loop is (a + 1 > b) ? 0 : b - a - 1.
   We aim to optimize this to

   p = p_0 = base + a;
   do
     {
       bla (*p);
       p++;
     }
   while (p < p_0 - a + b);

   This preserves the correctness, since the pointer arithmetics does not
   overflow. More precisely:

   1) if a + 1 <= b, then p_0 - a + b is the final value of p, hence there is no
      overflow in computing it or the values of p.
   2) if a + 1 > b, then we need to verify that the expression p_0 - a does not
      overflow. To prove this, we use the fact that p_0 = base + a.

there's either a hole in that logic or the implementation is off.

  /* Finally, check that CAND->IV->BASE - CAND->IV->STEP * A does not
     overflow. */
  offset = fold_build2 (MULT_EXPR, TREE_TYPE (cand->iv->step),
                        cand->iv->step,
                        fold_convert (TREE_TYPE (cand->iv->step), a));
  if (!difference_cannot_overflow_p (data, cand->iv->base, offset))
    return false;

where 'A' is 'i', CAND->IV->BASE is 'p + i' and CAND->IV->STEP is 1 as 'sizetype'. That just checks that (p + i) - i doesn't overflow. Somehow it misses to prove p + b doesn't overflow since we end up with p' < (p + i) + (n - i) aka p' < p + n.
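The two loop shapes from the quoted comment can be compared directly in C. This is a sketch under the assumption that `base + a` and `base + b` both stay inside one array (or one past its end), which is exactly the precondition the overflow check is meant to establish. The function names are hypothetical and a running sum stands in for `bla (*p)`.

```c
#include <stddef.h>

/* Exit test on the integer counter i; returns how often the body ran.  */
size_t run_index_form(const int *base, int a, int b, long *sum)
{
    const int *p = base + a;
    int i = a;
    size_t n = 0;
    do {
        *sum += *p;          /* stands in for bla (*p) */
        p++;
        i++;
        n++;
    } while (i < b);
    return n;
}

/* Exit test rewritten onto the pointer: p < p_0 - a + b.  */
size_t run_pointer_form(const int *base, int a, int b, long *sum)
{
    const int *p0 = base + a;
    const int *p = p0;
    const int *end = p0 - a + b;   /* only valid if this stays in bounds */
    size_t n = 0;
    do {
        *sum += *p;
        p++;
        n++;
    } while (p < end);
    return n;
}
```

When the bound is in range the two forms run the body the same number of times, including the `a + 1 > b` case where the body runs exactly once. The rewrite is only justified if computing `p_0 - a + b` cannot overflow, which is the property the comment says is not being proven.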
[Bug middle-end/24639] [meta-bug] bug to track all Wuninitialized issues
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24639 Bug 24639 depends on bug 109559, which changed state. Bug 109559 Summary: [12/13/14 Regression] Unexpected -Wmaybe-uninitialized warning when inlining with system header https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109559 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |INVALID
[Bug middle-end/109559] [12/13/14 Regression] Unexpected -Wmaybe-uninitialized warning when inlining with system header
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109559 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |INVALID --- Comment #9 from Richard Biener --- So invalid.
[Bug gcov-profile/113765] [14 Regression] ICE: autofdo: val-profiler-threads-1.c compilation, error: probability of edge from entry block not initialized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113765 Richard Biener changed: What|Removed |Added Target Milestone|--- |14.0
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 --- Comment #10 from Richard Biener --- I see the 'pair' type is marked TYPE_CXX_ODR_P, I'd say you should see an ODR type violation diagnostic, and if you don't, this means we force different alias sets for both? Not sure - Honza added this stuff. It only affects TYPE_CANONICAL though, regular type merging shouldn't merge them but it's likely that you get to see another type because of COMDATs and symbol merging choosing a different prevailing function which has that other type? Btw, can you dump the mangled name of the type? It should be type_with_linkage_p () I think, of course 'pair' itself is a template so only a specific instantiation should be subject to ODR. (of course there might be ODR functions that use different instantiated pair in the signature ..)
[Bug target/113779] Very inefficient m68k code generated for simple copy loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113779 --- Comment #2 from Richard Biener --- I don't think IVOPTs would use postinc for the intermediate increments. It's constant propagation/forwarding that accumulates the increments to a constant offset which removes dependences on the instructions and thus would allow the loads/stores to be executed in parallel (well, not that m68k uarchs likely can do any of that ...). I wonder if the code we emit is measurably slower though? It's possibly a little bit larger due to the two IV increments.
[Bug tree-optimization/113775] Bogus Wstringop-overflow in __atomic_load_n combined with sanitizer flags
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113775 Richard Biener changed: What|Removed |Added Keywords||diagnostic --- Comment #2 from Richard Biener --- Yeah, the 'cc1plus: note: destination object is likely at address zero' message hints that we are likely diagnosing a threaded path where the pointer is zero. The threading was likely enabled by the dynamic checks inserted by the sanitizer.
[Bug target/113763] [14 Regression] build fails with clang++ host compiler because aarch64.cc uses C++14 constexpr.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113763 Richard Biener changed: What|Removed |Added Target Milestone|--- |14.0
[Bug gcov-profile/113765] ICE: autofdo: val-profiler-threads-1.c compilation, error: probability of edge from entry block not initialized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113765 Richard Biener changed: What|Removed |Added CC||hubicka at gcc dot gnu.org Keywords||ice-checking Version|unknown |14.0 --- Comment #2 from Richard Biener --- Honza added extra checking for this for gcc14.
[Bug middle-end/109559] [12/13/14 Regression] Unexpected -Wmaybe-uninitialized warning when inlining with system header
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109559 --- Comment #8 from Richard Biener --- Created attachment 57325 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57325&action=edit patch Patch. Breaks expected diagnostics for inlines from system headers.
[Bug middle-end/109559] [12/13/14 Regression] Unexpected -Wmaybe-uninitialized warning when inlining with system header
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109559 --- Comment #7 from Richard Biener --- So the 2nd hunk tests OK but the first for example runs into FAIL: gcc.dg/Wfree-nonheap-object-4.c (test for warnings, line 19) where we explicitly seem to expect the warning when the system header code is inlined into non-system-header context. That's btw the same that happens for the testcase in this bug - we inline the has_trivial_copy_and_destroy into integrate () which isn't in a system header. So it seems this was a deliberate choice ... which would mean the bug at hand is INVALID. (-Wno-system-headers has no effect)
[Bug middle-end/109559] [12/13/14 Regression] Unexpected -Wmaybe-uninitialized warning when inlining with system header
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109559 Richard Biener changed: What|Removed |Added Status|NEW |ASSIGNED CC||msebor at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org --- Comment #6 from Richard Biener --- Note the diagnostic is "valid" and for FilonIntegral::integrate () function_base::has_trivial_copy_and_destroy ([(struct function1 *)].D.2804); and we're using the stmts location to diagnose this which expands to (gdb) p IS_ADHOC_LOC (location) $8 = true (gdb) p get_location_from_adhoc_loc (line_table, location) $9 = 268224 (gdb) p expand_location ($9) $10 = {file = 0x4d895d0 "", line = 5, column = 49, data = 0x0, sysp = true} so the system header flag is correct. There's if ((was_warning || diagnostic->kind == DK_WARNING) && ((!m_warn_system_headers && diagnostic->m_iinfo.m_allsyslocs) || m_inhibit_warnings)) /* Bail if the warning is not to be reported because all locations in the inlining stack (if there is one) are in system headers. */ return false; I've added -Wno-system-headers, and (gdb) p m_warn_system_headers $14 = false (gdb) p diagnostic->m_iinfo.m_allsyslocs $15 = false (gdb) p was_warning $16 = true (gdb) p m_inhibit_warnings $17 = false so the issue seems to be that the active m_set_locations_cb tree-diagnostic.cc:set_inlining_locations computes that "wrongly". The operator= associated inline block location isn't in a system header (the abstract origin, the operator= FUNCTION_DECL does have a DECL_SOURCE_LOCATION that's in a system header though). _Note_ we're assigning that BLOCK the location of the _call_ (it's for the parameter setup), _not_ the location of the callee! /* Build a block containing code to initialize the arguments, the actual inline expansion of the body, and a label for the return statements within the function to jump to. The type of the statement expression is the return type of the function call. ??? 
If the call does not have an associated block then we will remap all callee blocks to NULL, effectively dropping most of its debug information. This should only happen for calls to artificial decls inserted by the compiler itself. We need to either link the inlined blocks into the caller block tree or not refer to them in any way to not break GC for locations. */ if (tree block = gimple_block (stmt)) { /* We do want to assign a not UNKNOWN_LOCATION BLOCK_SOURCE_LOCATION to make inlined_function_outer_scope_p return true on this BLOCK. */ location_t loc = LOCATION_LOCUS (gimple_location (stmt)); if (loc == UNKNOWN_LOCATION) loc = LOCATION_LOCUS (DECL_SOURCE_LOCATION (fn)); if (loc == UNKNOWN_LOCATION) loc = BUILTINS_LOCATION; id->block = make_node (BLOCK); BLOCK_ABSTRACT_ORIGIN (id->block) = DECL_ORIGIN (fn); BLOCK_SOURCE_LOCATION (id->block) = loc; prepend_lexical_block (block, id->block); since this particular hook implementation was added by Martin S. I don't have high hopes of that being a concious decision. while (block && TREE_CODE (block) == BLOCK && BLOCK_ABSTRACT_ORIGIN (block)) { tree ao = BLOCK_ABSTRACT_ORIGIN (block); if (TREE_CODE (ao) == FUNCTION_DECL) { if (!diagnostic->m_iinfo.m_ao) diagnostic->m_iinfo.m_ao = block; location_t bsloc = BLOCK_SOURCE_LOCATION (block); ilocs.safe_push (bsloc); if (in_system_header_at (bsloc)) I think this should either look at DECL_SOURCE_LOCATION (ao) or at the location of the block nested in 'block'. Note we then still warn because if (ilocs.length ()) { /* When there is an inlining context use the macro expansion location for the original location and bump up NSYSLOCS if it's in a system header since it's not counted above. */ location_t sysloc = expansion_point_location_if_in_system_header (loc); if (sysloc != loc) gets us the same location, failing to do loc = sysloc; ++nsyslocs; } and then ilocs.safe_push (loc); makes /* Set if all locations are in a system header. 
*/ diagnostic->m_iinfo.m_allsyslocs = nsyslocs == ilocs.length (); fail. The logic is odd though, if it was not macro expanded it's off. The following fixes it for me. diff --git a/gcc/tree-diagnostic.cc b/gcc/tree-diagnostic.cc index a660c7d0785..a49f8939ce7 100644 --- a/gcc/tree-diagnostic.cc +++ b/gcc/tree-diagnostic.cc @@ -328,7 +328,7 @@ set_inlining_locations (diagnostic_context *, if (!diagnostic->m_iinfo.m_ao)
[Bug tree-optimization/113707] [14 Regression] ICE on valid code at -O1 on x86_64-linux-gnu: Segmentation fault since r14-8683
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113707

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from Richard Biener ---
Fixed.
[Bug tree-optimization/113703] ivopts miscompiles loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113703

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
           Keywords|                            |needs-bisection
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2024-02-05

--- Comment #4 from Richard Biener ---
Confirmed.
[Bug middle-end/113762] TYPE_ADDR_SPACE requirements on tcc_reference trees not documented/checked
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113762

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |documentation
     Ever confirmed|0                           |1
           Assignee|unassigned at gcc dot gnu.org|rguenth at gcc dot gnu.org
   Last reconfirmed|                            |2024-02-05
             Status|UNCONFIRMED                 |ASSIGNED
[Bug middle-end/113762] New: TYPE_ADDR_SPACE requirements on tcc_reference trees not documented/checked
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113762

            Bug ID: 113762
           Summary: TYPE_ADDR_SPACE requirements on tcc_reference trees not
                    documented/checked
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

There's not much documentation on what part of a tcc_reference chain (handled_component_p + base) needs to reflect the TYPE_ADDR_SPACE in effect. RTL expansion looks at the base of the chain but for example build_fold_addr_expr_loc simply looks at the outermost object.

There's also no IL checking in place to verify consistency within such a chain.

And test coverage isn't too great for address-spaces in general.
[Bug tree-optimization/113736] ICE: verify_gimple failed: incompatible types in 'PHI' argument 0 with _BitInt() struct copy to __seg_fs/gs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113736

--- Comment #4 from Richard Biener ---
(In reply to rguent...@suse.de from comment #3)
> On Sat, 3 Feb 2024, jakub at gcc dot gnu.org wrote:
> > Bitint lowering changes here
> >   MEM < _BitInt(768)> [( struct T *)p_2(D)] = s_4(D);
> > to
> >   VIEW_CONVERT_EXPR(MEM < _BitInt(768)> [( struct T *)p_2(D)])[_5] = s_7(D);
> > accesses in a loop. Is that invalid and should have also in
> > the VCE type? Or is this just a vectorizer bug?
>
> I think that's OK, I will have a look.

I stand corrected - it isn't correct. The address-space needs to be on all types involved in a memory reference (RTL expansion is later quite forgiving though). This needs better documentation and maybe even IL checking I guess.
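
As a rough illustration of the kind of IL check mentioned above, the following sketch walks a reference chain and verifies that every node carries the same address space as the outermost one. It is a toy model, not GCC code: `RefNode` and `addr_space_consistent` are hypothetical names, and a real check would operate on `tree` nodes and `TYPE_ADDR_SPACE` rather than on plain integers.

```cpp
#include <cassert>

// Hypothetical, simplified node of a reference chain.  In GCC terms this
// would be a handled component or the base of a tcc_reference chain.
struct RefNode {
  unsigned addr_space;     // address space of this node's type (0 = generic)
  const RefNode* operand;  // next inner node, nullptr at the base
};

// Returns true if every node in the chain, from the outermost reference
// down to the base, carries the same address space.
bool addr_space_consistent(const RefNode& outermost) {
  for (const RefNode* n = outermost.operand; n != nullptr; n = n->operand)
    if (n->addr_space != outermost.addr_space)
      return false;
  return true;
}
```

In this model, a view-convert node with the generic address space wrapped around a base in address space 1 is rejected, while the same chain with address space 1 on both nodes is accepted.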
[Bug tree-optimization/113756] [14 regression] Wrong code at -O2 on x86_64-linux-gnu since r14-2780-g39f117d6c87
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113756

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
[Bug modula2/113749] [14 Regression] m2 enabled build times out on i686-gnu (GNU Hurd)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113749

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |14.0