[Bug tree-optimization/114331] New: Missed optimization: indicate knownbits from dominating condition switch(trunc(a))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114331 Bug ID: 114331 Summary: Missed optimization: indicate knownbits from dominating condition switch(trunc(a)) Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: xxs_chy at outlook dot com Target Milestone: --- Godbolt link: https://godbolt.org/z/dso53ndTo For code like:
int src(int num) {
    switch((short)num){
        case 111:
            return num & 0xfffe;
        case 267:
        case 204:
        case 263:
            return 0;
        default:
            dummy();
            return 0;
    }
}
"num & 0xfffe" can be folded to "110". But both LLVM and GCC fail to fold it.
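As an illustration of why the fold is valid (this is not part of the report), the sketch below checks every int whose low 16 bits are 111. The helper names are hypothetical, and it assumes a 32-bit int and GCC's documented wrapping behaviour for out-of-range integer conversions:

```c
/* Hypothetical helper: the body of the "case 111" arm of src(). */
int case_111_arm(int num)
{
    return num & 0xfffe;
}

/* Returns 1 if every int whose low 16 bits equal 111 reaches "case 111"
   and yields 110 there, 0 otherwise.  Assumes 32-bit int; the casts to
   int and short rely on GCC's wrap-around conversion rules. */
int all_fold_to_110(void)
{
    for (unsigned hi = 0; hi <= 0xffffu; hi++) {
        unsigned bits = (hi << 16) | 111u;  /* arbitrary high half, low half 111 */
        int num = (int)bits;
        if ((short)num != 111 || case_111_arm(num) != 110)
            return 0;
    }
    return 1;
}
```

The switch condition pins the low 16 bits of num, and the mask 0xfffe only reads those bits, so the result does not depend on the high half.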
[Bug libgcc/114327] `-CST % 1` is wrong for _BitInt()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114327 --- Comment #3 from Zdenek Sojka --- It's not only % 1; wrong results are also for: B x = foo (3, -0x9e9b9fe60); or for B foo (char c, B b) { return b / c; } B x = foo (-0x6, 0); /* 0 / -6 = 0 */ in all these cases, the result is the same: -1 << 64.
[Bug libfortran/114304] libgfortran I/O – bogus "Semicolon not allowed as separator with DECIMAL='point'"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304 --- Comment #18 from Jeffrey A. Law --- I don't have an opinion on the Fortran patch -- I think it's up to the Fortran front-end maintainers to make that decision. Given there's still a regression here, I'll put the marker back.
[Bug driver/114330] needs_preprocessing field of struct compiler is unused
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114330 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2024-03-14 Status|UNCONFIRMED |ASSIGNED Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org --- Comment #4 from Andrew Pinski --- (In reply to Sam James from comment #2) > git log -G needs_preprocessing -p indicates r0-102965-gc3224d6f70eefb Oh yes, when -combine support was removed. It was added when -combine support was added, in r0-57561-g0855eab7a30bb9. The combinable field was added at the same time, but combinable was used afterwards for Go (and D and a few others). So I will handle this for GCC 15. I thought it was added much earlier.
[Bug driver/114330] needs_preprocessing field of struct compiler is unused
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114330 --- Comment #3 from Sam James --- (I think it was dead before, but it should've been removed by then)
[Bug driver/114330] needs_preprocessing field of struct compiler is unused
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114330 Sam James changed: What|Removed |Added CC||sjames at gcc dot gnu.org --- Comment #2 from Sam James --- git log -G needs_preprocessing -p indicates r0-102965-gc3224d6f70eefb
[Bug libgcc/114327] `-CST % 1` is wrong for _BitInt()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114327 --- Comment #2 from Andrew Pinski --- For (ignore the strict aliasing issue):
```
typedef signed _BitInt(256) B;
[[gnu::noinline]]
B foo (signed char c, B b)
{
  return b % c;
}
int main (void)
{
  B x = foo (1, -3); // -3 % 1 -> 0
  // if (x)
  //   __builtin_abort();
  signed long *t = (signed long *)&x;
  for (int i = 0; i < sizeof(B)/sizeof(long); i++)
    {
      __builtin_printf("%lx\n", t[i]);
    }
  return 0;
}
```
We get:
```
0
```
Which makes it seem like we are doing the sign extend when the result was 0. Even:
> B x = foo (3, -3); // -3 % 3 -> 0
gives a similarly wrong result.
[Bug libgcc/114327] `-CST % 1` is wrong for _BitInt()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114327 Andrew Pinski changed: What|Removed |Added Summary|wrong code with _BitInt() |`-CST % 1` is wrong for |modulo at -O0 |_BitInt() Status|UNCONFIRMED |NEW CC||pinskia at gcc dot gnu.org Ever confirmed|0 |1 Last reconfirmed||2024-03-14 --- Comment #1 from Andrew Pinski --- Confirmed.
[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281 --- Comment #28 from JuzheZhong --- The original cost model I wrote worked for all cases, but after some middle-end changes the cost model fails. I don't have time to figure out what's going on here. Robin may be interested in it.
[Bug driver/114330] needs_preprocessing field of struct compiler is unused
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114330 --- Comment #1 from Andrew Pinski ---
[apinski@xeond2 gcc]$ git grep needs_preprocessing
gcc.cc: int needs_preprocessing; /* If nonzero, source files need to
lto/lang-specs.h: /*cpp_spec=*/NULL, /*combinable=*/1, /*needs_preprocessing=*/0},
Those are the only references that grep could find for needs_preprocessing.
[Bug driver/114330] New: needs_preprocessing field of struct compiler is unused
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114330 Bug ID: 114330 Summary: needs_preprocessing field of struct compiler is unused Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: internal-improvement Severity: enhancement Priority: P3 Component: driver Assignee: unassigned at gcc dot gnu.org Reporter: pinskia at gcc dot gnu.org Target Milestone: --- I suspect the needs_preprocessing field became unused when the C preprocessor was integrated into cc1. While it does not hurt anything to keep the field around, and only ".c" sets it to true, it seems like a decent idea to remove the field. Also note it might be useful to boolize combinable in struct compiler in gcc.cc too.
[Bug target/108866] Allow to pass Windows resource file (.rc) as input to gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108866 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2024-03-13 Status|UNCONFIRMED |NEW --- Comment #3 from Andrew Pinski --- (In reply to Pali Rohár from comment #2)
> Andrew, I do not know what is gcc driver nor what to do for it. But if you
> can show me some pointers, I can try it.
>
> Or if you need more details about files, usage, etc... please let me know.
See gcc.cc (default_compilers). It contains a mapping from suffix to language, and for each language how to "compile/assemble" the files. See also */lang-specs.h, which are included via specs.h (specs.h is a file generated during the build; see the makefile there. Its contents depend on which languages are enabled). Most likely you would add a new target macro which adds to that part of gcc.cc and define that macro in the mingw headers.
[Bug target/109317] -Os generates bigger code than -O2 on 32-bit ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109317 --- Comment #3 from Pali Rohár --- Do you need some more input or test data about this issue?
[Bug target/108866] Allow to pass Windows resource file (.rc) as input to gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108866 --- Comment #2 from Pali Rohár --- Andrew, I do not know what is gcc driver nor what to do for it. But if you can show me some pointers, I can try it. Or if you need more details about files, usage, etc... please let me know.
[Bug tree-optimization/114326] Missed optimization for A || B when !B implies A.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114326 --- Comment #3 from Andrew Pinski --- (In reply to ptomsich from comment #2)
> To copy the last piece of info from our internal tracker...
>
> LLVM learned this new trick only in the run-up to LLVM 18.
> Up until then, GCC and LLVM performed identically on this snippet.
Yes, it looks like it is pattern matching what I suggested (well, with and without the and). Note we do need another pattern, one without the bit_and:
(simplify
 (bit_ior (ne@n4 @0 @1) (cmp (bit_xor @0 @1) @2))
 (bit_ior @n4 (cmp { build_zero_cst (TREE_TYPE (@0)); } @2)))
And we need one more for bit_ior:
(simplify
 (bit_ior (ne@n4 @0 @1) (cmp (bit_ior (bit_xor @0 @1) @2) @3))
 (bit_ior @n4 (cmp @2 @3)))
Note it looks like clang does not handle non-constants that well (they handle d == 0 though). E.g.:
```
int foo(void);
int cmp1(unsigned d1, unsigned d2, unsigned c, unsigned d)
{
  int t = ((d1 ^ d2) & c ) == (d);
  int t1 = d1 != d2;
  int tt = t | t1;
  return tt;
}
```
Should be optimized to:
```
int foo(void);
int cmp1(unsigned d1, unsigned d2, unsigned c, unsigned d)
{
  int t = 0 == d;
  int t1 = d1 != d2;
  int tt = t | t1;
  return tt;
}
```
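The equivalence claimed in the comment above can be checked by brute force over a small range. The following is only a sketch: cmp1 is taken from the comment, while cmp1_simplified and count_mismatches are hypothetical names introduced here.

```c
/* Original form from the comment: masked xor compared against d. */
int cmp1(unsigned d1, unsigned d2, unsigned c, unsigned d)
{
    int t = ((d1 ^ d2) & c) == d;
    int t1 = d1 != d2;
    return t | t1;
}

/* Expected simplified form: when d1 == d2 the xor is 0, so the first
   comparison reduces to 0 == d; when d1 != d2 the OR is already 1. */
int cmp1_simplified(unsigned d1, unsigned d2, unsigned c, unsigned d)
{
    (void)c;  /* c no longer influences the result */
    int t = 0 == d;
    int t1 = d1 != d2;
    return t | t1;
}

/* Hypothetical checker: counts inputs below `limit` (in every argument)
   where the two forms disagree. */
int count_mismatches(unsigned limit)
{
    int mismatches = 0;
    for (unsigned d1 = 0; d1 < limit; d1++)
        for (unsigned d2 = 0; d2 < limit; d2++)
            for (unsigned c = 0; c < limit; c++)
                for (unsigned d = 0; d < limit; d++)
                    if (cmp1(d1, d2, c, d) != cmp1_simplified(d1, d2, c, d))
                        mismatches++;
    return mismatches;
}
```

A small limit such as 8 already covers both the d1 == d2 and d1 != d2 branches with zero and nonzero d.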
[Bug target/108849] __declspec(code_seg("segname")) does not work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108849 --- Comment #3 from Pali Rohár --- Arsen, so based on my understanding (please correct me if I'm wrong), gcc's "section" can be used on both code (functions) and data (global variables), while MS's "code_seg" can be used only on code (functions). So if gcc adds __declspec(code_seg("segname")) as an alias for __declspec(section("segname")) for TARGET_DECLSPEC, then it should be OK for valid source code. However, it would not throw a compile error if __declspec(code_seg("segname")) is specified on data. But I think that is acceptable. The primary motivation is support for compiling valid source code. Are you able to add this alias?
[Bug middle-end/114319] htobe64-like function is not optimized on 32-bit x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114319 --- Comment #8 from Pali Rohár --- Thanks for quick response and fixup of this issue.
[Bug tree-optimization/114326] Missed optimization for A || B when !B implies A.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114326 --- Comment #2 from ptomsich at gcc dot gnu.org --- To copy the last piece of info from our internal tracker... LLVM learned this new trick only in the run-up to LLVM 18. Up until then, GCC and LLVM performed identically on this snippet.
[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281 --- Comment #27 from Patrick O'Neill --- (In reply to Andrew Pinski from comment #26)
> (In reply to Edwin Lu from comment #25)
> > It's still persisting on trunk (at least for pr113281-1.c
> > https://godbolt.org/z/M9EK44hKe)
>
> I looked into what the vectorizer produces:
>   vect__22.13_31 = (vector(8) int) vect_vec_iv_.12_8;
>   _22 = (int) a.4_25;
>   vect__12.14_33 = { 32872, 32872, 32872, 32872, 32872, 32872, 32872, 32872 } >> vect__22.13_31;
>   _12 = 32872 >> _22;
>   vect_b_7.15_34 = (vector(8) short int) vect__12.14_33;
>
> that is valid thing to do. That is do the shift in `vector(8) int` and then
> do a truncation. The issue originally was about doing the shift in
> `vector(8) short` which is not happening here.
The regressed testcase looks like it's testing whether riscv vectorizes the code at all (the first issue Juzhe noted in comment #3 and then fixed). So this is a performance regression for RISC-V, not a correctness one.
[Bug tree-optimization/114329] New: ICE: verify_gimple failed: 'bit_field_ref' of non-mode-precision operand with bitfield _BitInt()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114329 Bug ID: 114329 Summary: ICE: verify_gimple failed: 'bit_field_ref' of non-mode-precision operand with bitfield _BitInt() Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: ice-on-valid-code Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: zsojka at seznam dot cz Target Milestone: --- Host: x86_64-pc-linux-gnu Target: x86_64-pc-linux-gnu Created attachment 57690 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57690=edit reduced testcase Compiler output: $ x86_64-pc-linux-gnu-gcc testcase.c testcase.c: In function 'foo': testcase.c:6:1: error: 'bit_field_ref' of non-mode-precision operand 6 | foo(void) | ^~~ # .MEM_20 = VDEF <.MEM_19> BIT_FIELD_REF = _9; during GIMPLE pass: bitintlower0 testcase.c:6:1: internal compiler error: verify_gimple failed 0x155f56d verify_gimple_in_cfg(function*, bool, bool) /repo/gcc-trunk/gcc/tree-cfg.cc:5663 0x13ce234 execute_function_todo /repo/gcc-trunk/gcc/passes.cc:2088 0x13ce78e execute_todo /repo/gcc-trunk/gcc/passes.cc:2142 Please submit a full bug report, with preprocessed source (by using -freport-bug). Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. $ x86_64-pc-linux-gnu-gcc -v Using built-in specs. 
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-9454-20240313184120-g11caf47b599-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.1/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --disable-bootstrap --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --enable-libsanitizer --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r14-9454-20240313184120-g11caf47b599-checking-yes-rtl-df-extra-nobootstrap-amd64 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 14.0.1 20240313 (experimental) (GCC)
[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281 --- Comment #26 from Andrew Pinski --- (In reply to Edwin Lu from comment #25)
> It's still persisting on trunk (at least for pr113281-1.c
> https://godbolt.org/z/M9EK44hKe)
I looked into what the vectorizer produces:
  vect__22.13_31 = (vector(8) int) vect_vec_iv_.12_8;
  _22 = (int) a.4_25;
  vect__12.14_33 = { 32872, 32872, 32872, 32872, 32872, 32872, 32872, 32872 } >> vect__22.13_31;
  _12 = 32872 >> _22;
  vect_b_7.15_34 = (vector(8) short int) vect__12.14_33;
That is a valid thing to do: the shift is done in `vector(8) int`, followed by a truncation. The issue originally was about doing the shift in `vector(8) short`, which is not happening here.
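A scalar sketch of the distinction drawn above, shift in int then truncate versus truncate then shift in short, may help. It is not taken from the PR's testcase; the function names are hypothetical, and the narrowing conversion and the right shift of a negative value are implementation-defined (the comments describe GCC's behaviour):

```c
/* Shift performed in int, result truncated afterwards: this mirrors the
   `vector(8) int` shift followed by a conversion to `short int`. */
short shift_in_int(int amount)
{
    return (short)(32872 >> amount);
}

/* Constant narrowed first, then shifted: this mirrors doing the shift on
   16-bit lanes.  With GCC, (short)32872 wraps to -32664 and the right
   shift of a negative value is arithmetic. */
short shift_in_short(int amount)
{
    short narrowed = (short)32872;
    return (short)(narrowed >> amount);
}
```

With GCC, a shift amount of 1 gives 16436 for the first form and -16332 for the second, which is why the promotion to int has to be kept.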
[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281 Edwin Lu changed: What|Removed |Added CC||ewlu at rivosinc dot com --- Comment #25 from Edwin Lu --- (In reply to Richard Sandiford from comment #24) > Fixed on trunk so far, but it's latent on branches. I'll see what > the trunk fallout is like before asking about backports. It looks like we have a regression for riscv I was going through the scan dump failures on trunk and ended up revisiting https://github.com/patrick-rivos/gcc-postcommit-ci/issues/463 where gcc.dg/vect/costmodel/riscv/rvv/pr113281-[125].c are failing the scan-dump checks. I didn't realize at the time that the scan dumps were checking code correctness and ended up ignoring it. It's still persisting on trunk (at least for pr113281-1.c https://godbolt.org/z/M9EK44hKe) A bisection on https://github.com/patrick-rivos/gcc-postcommit-ci/issues/463 commit range suggests https://gcc.gnu.org/g:1a8261e047f7a2c2b0afb95716f7615cba718cd1 introduced it. # first bad commit: [1a8261e047f7a2c2b0afb95716f7615cba718cd1] vect: Tighten vect_determine_precisions_from_range [PR113281] Configuration ../configure --prefix=$(pwd) --with-multilib-generator="rv64gcv-lp64d--" make stamps/build-gcc-linux-stage1 -j 32 Testing ./build-gcc-linux-stage1/gcc/cc1 ../gcc/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-1.c -march=rv64gcv -mabi=lp64d -mtune=rocket -mcmodel=medlow -fdiagnostics-plain-output -march=rv64gcv_zvl256b -mabi=lp64d -O3 -ftree-vectorize -ffat-lto-objects -fno-ident -o pr113281-1.s
[Bug libstdc++/114325] [14 Regression] std::format gives incorrect results for negative numbers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114325 --- Comment #2 from Jonathan Wakely --- Indeed. Here's the fix:
--- a/libstdc++-v3/include/std/format
+++ b/libstdc++-v3/include/std/format
@@ -4124,14 +4124,14 @@ namespace __format
            __uval = make_unsigned_t<_Tp>(~__arg) + 1u;
          else
            __uval = __arg;
-         const auto __n = __detail::__to_chars_len(__uval) + __neg;
-         if (auto __res = __sink_out._M_reserve(__n))
+         const auto __n = __detail::__to_chars_len(__uval);
+         if (auto __res = __sink_out._M_reserve(__n + __neg))
            {
              auto __ptr = __res.get();
              *__ptr = '-';
              __detail::__to_chars_10_impl(__ptr + (int)__neg, __n, __uval);
-             __res._M_bump(__n);
+             __res._M_bump(__n + __neg);
              __done = true;
            }
        }
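The length arithmetic in the patch can be sketched in plain C. This is not libstdc++ code: digits10 and format_decimal are hypothetical names, and the sketch only mirrors the idea that the digit count and the sign are tracked separately, with their sum used for the space reserved and consumed.

```c
#include <assert.h>
#include <stddef.h>

/* Number of decimal digits needed for v (at least 1). */
size_t digits10(unsigned long long v)
{
    size_t n = 1;
    while (v >= 10) {
        v /= 10;
        n++;
    }
    return n;
}

/* Writes the decimal form of value into out (no terminating NUL).
   Returns the number of characters written, or 0 if cap is too small. */
size_t format_decimal(long long value, char *out, size_t cap)
{
    assert(out != NULL);
    int neg = value < 0;
    /* Magnitude via ~x + 1 in unsigned arithmetic, as in the patch context. */
    unsigned long long uval = neg ? ~(unsigned long long)value + 1u
                                  : (unsigned long long)value;
    size_t n = digits10(uval);              /* digits only, sign excluded */
    if (n + (size_t)neg > cap)              /* reserve digits plus sign */
        return 0;
    if (neg)
        out[0] = '-';
    for (size_t i = n; i-- > 0; ) {         /* fill digits right to left */
        out[(size_t)neg + i] = (char)('0' + uval % 10);
        uval /= 10;
    }
    return n + (size_t)neg;                 /* advance by digits plus sign */
}
```

Keeping n as the pure digit count means the digit writer is never asked to produce one digit too many for negative inputs.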
[Bug target/114328] New: Using -march=armv9-a+nosve does not allow for vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114328 Bug ID: 114328 Summary: Using -march=armv9-a+nosve does not allow for vectorization Product: gcc Version: 13.1.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: enhancement Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: pinskia at gcc dot gnu.org CC: mjr19 at cam dot ac.uk Blocks: 53947 Target Milestone: --- Target: aarch64 Created attachment 57689 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57689&action=edit Testcase I noticed this while looking into PR 114324: the cost model for -march=armv9-a+nosve causes this code not to be vectorized using ld2/st2 on the SIMD (non-SVE) registers. I don't understand why, though, because -march=armv8.4-a still vectorizes it. Note this is all at -Ofast. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 [Bug 53947] [meta-bug] vectorizer missed-optimizations
[Bug libstdc++/114325] [14 Regression] std::format gives incorrect results for negative numbers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114325 Jonathan Wakely changed: What|Removed |Added Known to work||13.2.1 Target Milestone|--- |14.0 Last reconfirmed||2024-03-13 Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED Summary|std::format gives incorrect |[14 Regression] std::format |results for negative|gives incorrect results for |numbers |negative numbers Known to fail||14.0 Assignee|unassigned at gcc dot gnu.org |redi at gcc dot gnu.org
[Bug tree-optimization/114324] [13/14 Regression] AVX2 vectorisation performance regression with gfortran 13/14
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114324 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Target Milestone|--- |12.4 Last reconfirmed||2024-03-13 Status|UNCONFIRMED |NEW Summary|AVX2 vectorisation |[13/14 Regression] AVX2 |performance regression with |vectorisation performance |gfortran 13/14 |regression with gfortran ||13/14 Blocks||53947 Component|target |tree-optimization --- Comment #1 from Andrew Pinski --- Definitely there is some vectorization changes happening. Confirmed. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 [Bug 53947] [meta-bug] vectorizer missed-optimizations
[Bug c/53548] allow flexible array members in unions like zero-length arrays
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53548 qinzhao at gcc dot gnu.org changed: What|Removed |Added CC||qinzhao at gcc dot gnu.org Status|RESOLVED|REOPENED Resolution|WONTFIX |--- --- Comment #9 from qinzhao at gcc dot gnu.org --- I think that we need to add this support as a GCC extension.
[Bug libfortran/114304] libgfortran I/O – bogus "Semicolon not allowed as separator with DECIMAL='point'"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304 --- Comment #17 from Jerry DeLisle --- (In reply to Jeffrey A. Law from comment #16)
> Per c#12, c#13, c#14 & c#15, dropping the regression marker, but leaving
> open.
Interestingly, the remaining part of this bug is also a regression; it just does not break LAPACK. Reverting this change fixes it, which means the new test for pr105473 will fail. I have an idea where to put this check in read_complex(), but I have not finished and tested it. Jeffrey, if you would like me to push this, let me know. We can mark pr105473.f90 in the test suite as XFAIL or comment out the one check there that fails.
diff --git a/libgfortran/io/list_read.c b/libgfortran/io/list_read.c
index fb3f7dbc34d..c178acd61a5 100644
--- a/libgfortran/io/list_read.c
+++ b/libgfortran/io/list_read.c
@@ -471,8 +471,6 @@ eat_separator (st_parameter_dt *dtp)
     case ',':
       if (dtp->u.p.current_unit->decimal_status == DECIMAL_COMMA)
        {
-         generate_error (&dtp->common, LIBERROR_READ_VALUE,
-          "Comma not allowed as separator with DECIMAL='comma'");
          unget_char (dtp, c);
          break;
        }
[Bug libgcc/114327] New: wrong code with _BitInt() modulo at -O0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114327 Bug ID: 114327 Summary: wrong code with _BitInt() modulo at -O0 Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: wrong-code Severity: normal Priority: P3 Component: libgcc Assignee: unassigned at gcc dot gnu.org Reporter: zsojka at seznam dot cz CC: jakub at gcc dot gnu.org Target Milestone: --- Host: x86_64-pc-linux-gnu Target: x86_64-pc-linux-gnu Created attachment 57688 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57688=edit reduced testcase Output: $ x86_64-pc-linux-gnu-gcc testcase.c $ ./a.out Aborted $ x86_64-pc-linux-gnu-gcc -v Using built-in specs. COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-9441-20240312154250-gef79c64cb57-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.1/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --disable-bootstrap --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --enable-libsanitizer --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r14-9441-20240312154250-gef79c64cb57-checking-yes-rtl-df-extra-nobootstrap-amd64 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 14.0.1 20240312 (experimental) (GCC)
[Bug tree-optimization/114326] Missed optimization for A || B when !B implies A.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114326 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Last reconfirmed||2024-03-13 --- Comment #1 from Andrew Pinski ---
  _1 = d1_5(D) ^ d2_6(D);
  _2 = _1 & 43981;
  _10 = d1_5(D) != d2_6(D);
  _11 = _2 == 0;
  _12 = _10 | _11;
(d1 != d2) | (((d1 ^ d2) & CST) == 0)
Confirmed. Obviously, if the first part is false then d1 ^ d2 will be 0. This will work, though maybe there is another place where this can be handled ...
(simplify
 (bit_ior (ne@n4 @0 @1) (cmp (bit_and (bit_xor @0 @1) @2) @3))
 (bit_ior @n4 (cmp { build_zero_cst (TREE_TYPE (@0)); } @3)))
[Bug fortran/114023] complex part%ref of complex named constant array cannot be used in an initialization expression.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114023 --- Comment #3 from Steve Kargl --- On Wed, Mar 13, 2024 at 06:02:58PM +, jvdelisle at gcc dot gnu.org wrote:
> --- Comment #2 from Jerry DeLisle ---
> Steve, Anuj is interested in digging in on this one. This will be a learning
> experience.
That's fine with me. If Anuj or you have questions or want me to look at something, just ping me.
[Bug c++/102345] [modules] Cannot define a module interface unit for anything in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102345 Patrick Palka changed: What|Removed |Added CC||ppalka at gcc dot gnu.org Keywords|rejects-valid |diagnostic --- Comment #4 from Patrick Palka --- IIUC we're correct to reject since built-ins are implicitly attached to the global module and here we're trying to redeclare one in another module? Perhaps the diagnostic could be improved here though. Clang gives error: declaration of 'operator new' in module newdel follows declaration in the global module
[Bug c++/103524] [meta-bug] modules issue
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103524 Bug 103524 depends on bug 101000, which changed state. Bug 101000 Summary: ICE when trying to import the absl/container/flat_hash_map.h as a header module https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101000 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug c++/101000] ICE when trying to import the absl/container/flat_hash_map.h as a header module
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101000 Patrick Palka changed: What|Removed |Added Status|NEW |RESOLVED Target Milestone|--- |14.0 Resolution|--- |FIXED --- Comment #2 from Patrick Palka --- This seems to work with GCC trunk now.
[Bug target/114310] [11/12/13/14 Regression] [aarch64] __sync_val_compare_and_swap fails on __int128_t with newval = 0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114310 Jakub Jelinek changed: What|Removed |Added Status|NEW |ASSIGNED CC||jakub at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org --- Comment #4 from Jakub Jelinek --- Created attachment 57687 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57687&action=edit gcc14-pr114310.patch Untested fix. The lack of on aarch64_reg_or_zero/rZ for the desired operand of aarch64_compare_and_swapti_lse looks correct, because the instructions expect a pair of registers, so one can't use there xzr, xzr.
[Bug fortran/114023] complex part%ref of complex named constant array cannot be used in an initialization expression.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114023 Jerry DeLisle changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Last reconfirmed||2024-03-13 --- Comment #2 from Jerry DeLisle --- Steve, Anuj is interested in digging in on this one. This will be a learning experience.
[Bug libgcc/111731] [13/14 regression] gcc_assert is hit at libgcc/unwind-dw2-fde.c#L291
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111731 --- Comment #17 from Dimitar Yordanov --- I've executed more tests and see another one failing. This time "begin" is inside of another range, not the one that gets calculated with this "begin". So there is again an overlapping in the btree. Could we maybe use two trees, one for "begin" and one for the ranges?
[Bug fortran/114001] is_contiguous considers unlimited polymorphic dummy always as contiguous
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114001 --- Comment #3 from GCC Commits --- The master branch has been updated by Harald Anlauf : https://gcc.gnu.org/g:11caf47b599568c6c6f5a12cf8e21f50778176d3 commit r14-9454-g11caf47b599568c6c6f5a12cf8e21f50778176d3 Author: Harald Anlauf Date: Tue Mar 12 22:58:39 2024 +0100 Fortran: fix IS_CONTIGUOUS for polymorphic dummy arguments [PR114001] gcc/fortran/ChangeLog: PR fortran/114001 * expr.cc (gfc_is_simply_contiguous): Adjust logic so that CLASS symbols are also handled. gcc/testsuite/ChangeLog: PR fortran/114001 * gfortran.dg/is_contiguous_4.f90: New test.
[Bug c++/99000] [modules] declaration std::__copy_move_a2 conflicts with import
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99000 Patrick Palka changed: What|Removed |Added CC||iains at gcc dot gnu.org --- Comment #3 from Patrick Palka --- *** Bug 110447 has been marked as a duplicate of this bug. ***
[Bug c++/110447] [modules] unexpected attachment of GMF decls to a named module.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110447 Patrick Palka changed: What|Removed |Added Resolution|--- |DUPLICATE CC||ppalka at gcc dot gnu.org Status|UNCONFIRMED |RESOLVED --- Comment #2 from Patrick Palka --- dup of PR99000 AFAICT *** This bug has been marked as a duplicate of bug 99000 ***
[Bug c++/103524] [meta-bug] modules issue
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103524 Bug 103524 depends on bug 110447, which changed state. Bug 110447 Summary: [modules] unexpected attachment of GMF decls to a named module. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110447 What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE
[Bug c++/103524] [meta-bug] modules issue
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103524 Bug 103524 depends on bug 106363, which changed state. Bug 106363 Summary: [13 Regression] [modules] ICE using-declaration of imported name in the same namespace https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106363 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug c++/106363] [13 Regression] [modules] ICE using-declaration of imported name in the same namespace
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106363 Patrick Palka changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED CC||ppalka at gcc dot gnu.org Target Milestone|13.3|14.0 --- Comment #8 from Patrick Palka --- IIUC this checking-only ICE is not actually a regression so let's mark this as fixed for 14 only.
[Bug tree-optimization/114151] [14 Regression] weird and inefficient codegen and addressing modes since r14-9193
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151 --- Comment #23 from Andrew Macleod --- Created attachment 57686 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57686&action=edit another patch (In reply to Richard Biener from comment #22)
> (In reply to Andrew Macleod from comment #21)
> > And have that all work with general trees expressions.. That would solve
> > much of this for you?
>
> Yes, I wouldn't mind if range_on_{entry,exit} handle general tree
> expressions, there's enough APIs to be confused with already ;)
I promoted range_on_exit and range_on_entry to be part of the API in this patch. This brings value_query in line with ranger's basic 5 routine API. It also tweaks ranger's versions to handle tree expressions. It bootstraps and shows no regressions, with the caveat that I haven't actually tested the usage of range_on_entry and range_on_exit with arbitrary trees. As you can see, I didn't change much... so it should work OK.
> > > Interestingly enough we somehow still need the
> > > hunk of Andrews patch to do it :/
> >
> > That probably means there is another call somewhere in the chain with no
> > context. However, I will say that functionality is more important than it
> > seems. Should have been there from the start :-P.
>
> Possibly yes. It might be we fill rangers cache with VARYING and when
> we re-do the query as a dependent one but with context we don't recompute
> it? I also only patched up a single place in SCEV with the context so
> I possibly missed some others that end up with a range query, for example
> through niter analysis that might be triggered.
My guess is the latter. Without a context and with that change, ranger evaluates the definition with the context at the location of the def, then simply uses that value. If anything it depends on later changes, the temporal cache should indicate it's out of date and trigger a new fold using current values.
[Bug c++/114292] [11/12/13/14 Regression] ICE with a generic (templated) lambda capturing a constant for VLA allocation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114292 Jakub Jelinek changed: What|Removed |Added CC||jason at gcc dot gnu.org, ||mpolacek at gcc dot gnu.org, ||ppalka at gcc dot gnu.org Priority|P3 |P2 --- Comment #4 from Jakub Jelinek --- void foo (int c) { constexpr int r = 4; [&] (auto) { int n = r * c; int t[n]; } (0); [&] (auto) { int t[c]; } (0); [&] (auto) { int t[r]; } (0); [&] (auto) { int t[c * 4]; } (0); } works fine though.
[Bug c++/114292] [11/12/13/14 Regression] ICE with a generic (templated) lambda capturing a constant for VLA allocation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114292 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #3 from Jakub Jelinek --- Started with r8-7213-g1577f10a637352b4fe7fb4a4c0fd672a96c84f58
[Bug c++/112652] g++.dg/cpp26/literals2.C FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112652 --- Comment #9 from Jakub Jelinek --- (In reply to r...@cebitec.uni-bielefeld.de from comment #8) > FWIW, the iconv conversion tables in /usr/lib/iconv can be regenerated > from the OpenSolaris sources, modified not to do that '?' conversion. > Worked for a quick check for the UTF-8 -> ASCII example, but the '?' is > more prevalent and would need to be eradicated upstream. If it is always '?' used instead of unknown character, we could also have some hack on the libcpp side for it. Like (but limited to Solaris hosts) in convert_using_iconv when converting from SOURCE_CHARSET to some other character set don't try to convert the whole UTF-8 string at once, but split it into chunks at u'?' characters, so foo???bar?baz?qux would be iconv converted as foo ??? bar ? baz ? qux chunks. And when converting the non-? chunks, it would after the conversion check for the '?' character (in the destination character set - that is something that perhaps could be queried during initialization after iconv_open) and treat it as an error if it appeared there. Or always convert also back to UTF-8 and check if it has more '?' characters than the source.
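The chunking step proposed above can be sketched in isolation. This is not libcpp code and makes no iconv calls; next_chunk_len and count_chunks are hypothetical helpers that only show how a UTF-8 string would be split into alternating runs of '?' and non-'?' bytes. Splitting on the byte 0x3F is safe in UTF-8 because every byte of a multibyte sequence is 0x80 or above.

```c
#include <stddef.h>
#include <string.h>

/* Length of the chunk starting at s: either a maximal run of '?' bytes
   or a maximal run of bytes that are not '?'.  Returns 0 when len is 0. */
size_t next_chunk_len(const char *s, size_t len)
{
    if (len == 0)
        return 0;
    int is_question = (s[0] == '?');
    size_t n = 1;
    while (n < len && (s[n] == '?') == is_question)
        n++;
    return n;
}

/* Walks the string chunk by chunk and returns how many chunks it has.
   A real caller would convert each non-'?' chunk separately here and
   treat a '?' in the converted output as a conversion error. */
size_t count_chunks(const char *s)
{
    size_t len = strlen(s);
    size_t count = 0;
    while (len > 0) {
        size_t n = next_chunk_len(s, len);
        s += n;
        len -= n;
        count++;
    }
    return count;
}
```

For the string from the comment, foo???bar?baz?qux, this yields the seven chunks foo, ???, bar, ?, baz, ?, qux.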
[Bug ada/106037] internal error with Aggregate aspect on array type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106037 Eric Botcazou changed: What|Removed |Added Status|NEW |RESOLVED Target Milestone|--- |13.3 Summary|ICE with Aggregate aspect |internal error with ||Aggregate aspect on array ||type Resolution|--- |FIXED --- Comment #5 from Eric Botcazou --- commit ec48b99c24a422bf97af91e82203d23b69094e7c Author: Marc Poulhiès Date: Wed Mar 8 20:39:45 2023 +0100 ada: Fix error message for Aggregate aspect The error message was wrongly using % instead of & in the format string, causing the displayed message to refer to incorrect names in some cases. gcc/ada/ * sem_ch13.adb (Check_Aspect_At_Freeze_Point): fix format string, use existing local Ident. commit 3da0e4ae25f15949f87e74aa96a03b47e51a9ff3 Author: Marc Poulhiès Date: Mon Mar 6 12:15:13 2023 +0100 ada: Fix (again) incorrect handling of Aggregate aspect Previous fix stopped the processing of the Aggregate aspect early, skipping the call to Record_Rep_Item, making later call to Resolve_Container_Aggregate fail. Also, the previous fix would not handle correctly the case where the type is private and the check for non-array type can only be done at the freeze point with the full type. Adapt the resolving of the aspect when the input is not correct and the parameters can't be resolved. gcc/ada/ * sem_ch13.adb (Analyze_One_Aspect): Call Record_Rep_Item. (Check_Aspect_At_Freeze_Point): Check the aspect is specified on non-array type only... (Analyze_One_Aspect): ... instead of doing it too early here. * sem_aggr.adb (Resolve_Container_Aggregate): Do nothing in case the parameters failed to resolve. commit fd694822ca6eda8b08fea10fcabdb0ad508a963e Author: Marc Poulhiès Date: Tue Feb 28 17:10:29 2023 +0100 ada: Fix incorrect handling of Aggregate aspect This change fixes 2 incorrect handlings of the aspect. The arguments are now correctly resolved and the aspect is rejected on non array types. 
gcc/ada/ * sem_ch13.adb (Analyze_One_Aspect): Mark Aggregate aspect as needing delayed resolution and reject the aspect on non-array type.
[Bug target/99829] MVE: ICE in lra_assign at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99829 --- Comment #7 from Vladimir Makarov --- (In reply to Maxim Kuvyrkov from comment #5)
> Where did you see the timeouts, btw?
Sorry, I glanced at c logs and misinterpreted them. Please discard my previous comment. I should have been more careful when reading the PR. I tried the C compiler instead of the C++ one, so I did not reproduce the bug. But the bug really is present with the C++ compiler. I'll work on this PR and try to fix it this week or next.
[Bug ipa/113907] [11/12/13/14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907 --- Comment #57 from Jan Hubicka ---
> So, we can punt on differences there (that is desirable for backporting and
> maybe GCC 14 too), or we could at that point populate an int vector, which
> maps
Yep, that is what I do. I had a bug in that, so I am re-running (I forgot to check that the caller and callee argument counts match, and this causes an ICE during the LLVM LTO link). It seems these extra checks make no difference in practice. During bootstrap there are no pairs of functions where the new checks punt on a value range difference or jump function difference and that would be merged otherwise. The most common cases where we could merge but don't are those triggered by TBAA.
> the callee
> vector indexes to indexes in the callee vector in the other candidate
> function.
> If unsuccessful, we just free the vector, if successful, we first walk all the
> callees and union stuff in there using that vector.
This is the plan for metadata merging. A small complication here is that ICF works by comparing bodies to a leader of an equivalence class, but this leader is not necessarily the surviving function body. So if we compared A to L (leader) and B to L and then decided to replace A by B, we need to be able to combine the permutations so we know how to map call sites in A to the ones in B. The same is true about SSA names and basic blocks. I have a patch for that for next stage1.
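As a rough illustration of the "combine the permutations" step described above (not the actual ipa-icf code or its data structures), suppose each candidate function carries a vector mapping its own call-site indexes to the leader's. The name `compose_via_leader` and the plain index vectors are assumptions made for this sketch; both maps are assumed to be bijections over the same number of call sites.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// a_to_l[i] is the leader's call-site index matched with call site i of A;
// b_to_l[j] likewise for B. Returns a_to_b, where a_to_b[i] is the call
// site of B that corresponds to call site i of A.
std::vector<std::size_t> compose_via_leader(
    const std::vector<std::size_t>& a_to_l,
    const std::vector<std::size_t>& b_to_l) {
    assert(a_to_l.size() == b_to_l.size());
    const std::size_t n = a_to_l.size();

    // Invert B -> L to obtain L -> B.
    std::vector<std::size_t> l_to_b(n);
    for (std::size_t j = 0; j < n; ++j) {
        assert(b_to_l[j] < n);
        l_to_b[b_to_l[j]] = j;
    }

    // Chain A -> L with L -> B.
    std::vector<std::size_t> a_to_b(n);
    for (std::size_t i = 0; i < n; ++i) {
        assert(a_to_l[i] < n);
        a_to_b[i] = l_to_b[a_to_l[i]];
    }
    return a_to_b;
}
```

The same composition would presumably apply to the SSA-name and basic-block maps mentioned in the comment, since each is a mapping to the leader as well.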
[Bug ada/106037] ICE with Aggregate aspect
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106037 simon at pushface dot org changed: What|Removed |Added CC||simon at pushface dot org --- Comment #4 from simon at pushface dot org --- This is illegal code: 'aspect "Aggregate" can only be applied to non-array type'. See https://groups.google.com/g/comp.lang.ada/c/FHWcqk1SWRM/m/sYTWUHQxAgAJ, and the (slightly unemphatically worded) ARM 4.3.5(2), "For a type other than an array type, the following type-related operational aspect may be specified" GNAT 14.0.1 20240223 (experimental) Copyright 1992-2024, Free Software Foundation, Inc. Compiling: container_aggregates.adb Source file time stamp: 2024-03-13 15:04:00 Compiled at: 2024-03-13 15:04:53 1. procedure Container_Aggregates is 2. 3.type Array_Type is 4. array (1 .. 10) of Integer 5.with Aggregate => (Empty => Empty_Array); 12 3 >>> error: aspect "Aggregate" can only be applied to non-array type >>> error: incomplete specification for aggregate >>> error: object "Empty_Array" cannot be used before end of its declaration >>> error: improper aggregate operation for "Array_Type" 6. 7.Empty_Array : constant Array_Type := [1..10 => 123]; 8. 9. begin 10.null; 11. end Container_Aggregates; 12. 12 lines: 4 errors
[Bug rtl-optimization/114261] [13/14 Regression] Scheduling takes excessive time (97%) since r13-5154-g733a1b777f1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114261 --- Comment #10 from Alexander Monakov --- Indeed, but OTOH according to bug 84402 comment 58 it caused a noticeable hit on gimple-match.cc compilation: 733a1b777f16cd397b43a242d9c31761f66d3da8 13th January 2023 sched-deps: do not schedule pseudos across calls [PR108117] (Alexander Monakov) Stage 2: +14% Stage 3: +9% In any case, if the proposed band-aid is unnecessary, that's fine with me.
[Bug c++/112652] g++.dg/cpp26/literals2.C FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112652 --- Comment #8 from ro at CeBiTec dot Uni-Bielefeld.DE ---
> --- Comment #7 from Jakub Jelinek ---
> (In reply to r...@cebitec.uni-bielefeld.de from comment #6)
>> > --- Comment #5 from ro at CeBiTec dot Uni-Bielefeld.DE ---
>> >> --- Comment #4 from Jakub Jelinek ---
>> >> Given that C++ says e.g. in https://eel.is/c++draft/lex.ccon#3.1
>> >> that program is ill-formed if some character lacks encoding in the execution
>> >> character set, I'm afraid the Solaris iconv behavior results in violation of
>>
>> Although I can barely wrap my head around the standardese there, I had a
>> look at n4928 (the last? C++23 draft), which has a different wording
>> here (p.25, 5.13.3):
>
> The testcase is for a C++26 feature, which made those ill-formed.

Should have been obvious from the pathname ;-( N4971 has that wording...

>> The current Solaris iconv behaviour certainly isn't particularly
>> intuitive and I'll ask the Solaris engineers about it. However, there's
>> the question what to do about the testcase? Just xfail it on Solaris or
>> omit just the two affected subtests there?
>
> xfailing is one possibility, but then on Solaris we'll never support C++26
> properly.

I guess it's the best solution in the short term (GCC 14), though.

> Or require using GNU libiconv rather than Solaris iconv if it can't deal with
> that?

At least document the suggestion in install.texi; I wouldn't make it a hard requirement yet. I'll also wait for what the Solaris engineers can provide as background on the current behaviour. FWIW, the iconv conversion tables in /usr/lib/iconv can be regenerated from the OpenSolaris sources, modified not to do that '?' conversion. Worked for a quick check for the UTF-8 -> ASCII example, but the '?' is more prevalent and would need to be eradicated upstream.
[Bug ada/111909] Filename case sensitivity defaulted wrongly on macOS
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111909 simon at pushface dot org changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #4 from simon at pushface dot org --- Fixed on mainline.
[Bug libstdc++/114325] std::format gives incorrect results for negative numbers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114325 Michael Duggan changed: What|Removed |Added CC||mwd at md5i dot com --- Comment #1 from Michael Duggan --- I will note that, in experiments, this seems to solely happen with "{}". If anything else is in the format string, it works correctly. This is probably a bug in the fairly recent codepath that optimizes the "{}" case.
[Bug tree-optimization/94094] [meta-bug] store-merging and/or bswap load/store-merging missed optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94094 Bug 94094 depends on bug 114319, which changed state. Bug 114319 Summary: htobe64-like function is not optimized on 32-bit x86 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114319 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug middle-end/114319] htobe64-like function is not optimized on 32-bit x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114319 Jakub Jelinek changed: What|Removed |Added Resolution|--- |FIXED CC||jakub at gcc dot gnu.org Status|NEW |RESOLVED --- Comment #7 from Jakub Jelinek --- Fixed for GCC 14.
[Bug target/113618] [14 Regression] AArch64: memmove idiom regression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113618 Wilco changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #6 from Wilco --- Fixed.
[Bug middle-end/114319] htobe64-like function is not optimized on 32-bit x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114319 --- Comment #6 from GCC Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:74bca21db31e3f4ab6543b56c3f26b4dfe586fef commit r14-9453-g74bca21db31e3f4ab6543b56c3f26b4dfe586fef Author: Jakub Jelinek Date: Wed Mar 13 15:34:59 2024 +0100 store-merging: Match bswap64 on 32-bit targets with bswapsi2 [PR114319] gimple-ssa-store-merging.cc tests bswap_optab in 3 different places, in 2 of them it has special exception for double-word bswap using pair of word-mode bswap optabs, but in the last one it doesn't. The following patch changes even the last spot. We don't handle 128-bit bswaps in the passes at all, because currently we just use uint64_t to represent the byte reshuffling (we'd need to use offset_int or something like that instead) and we don't have __builtin_bswap128 nor type-generic __builtin_bswap, so there is nothing for 64-bit targets there. 2024-03-13 Jakub Jelinek PR middle-end/114319 * gimple-ssa-store-merging.cc (imm_store_chain_info::try_coalesce_bswap): For 32-bit targets allow matching __builtin_bswap64 if there is bswapsi2 optab. * gcc.target/i386/pr114319.c: New test.
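To make the double-word case in the commit message concrete, here is a small sketch of the identity being matched: a 64-bit byte swap expressed as two 32-bit byte swaps with the halves exchanged. It only illustrates the arithmetic and is not the store-merging pass itself; the function names are made up for the example, while `__builtin_bswap32` and `__builtin_bswap64` are the existing GCC builtins.

```cpp
#include <cassert>
#include <cstdint>

// Byte-swap a 64-bit value using only 32-bit byte swaps: swap each half,
// then exchange the halves.
std::uint64_t bswap64_from_two_bswap32(std::uint64_t v) {
    const std::uint32_t lo = static_cast<std::uint32_t>(v);
    const std::uint32_t hi = static_cast<std::uint32_t>(v >> 32);
    return (static_cast<std::uint64_t>(__builtin_bswap32(lo)) << 32)
           | __builtin_bswap32(hi);
}

// Convenience check against the 64-bit builtin for a given value.
void check_against_builtin(std::uint64_t v) {
    assert(bswap64_from_two_bswap32(v) == __builtin_bswap64(v));
}
```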
[Bug tree-optimization/114326] New: Missed optimization for A || B when !B implies A.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114326 Bug ID: 114326 Summary: Missed optimization for A || B when !B implies A. Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: manolis.tsamis at vrull dot eu Target Milestone: ---
The function below doesn't fold to return 0;

int cmp1(uint64_t d1, uint64_t d2) {
  if (((d1 ^ d2) & 0xabcd) == 0 || d1 != d2)
    return 0;
  return foo();
}

while the following function does:

int cmp2(uint64_t d1, uint64_t d2) {
  if (d1 != d2 || ((d1 ^ d2) & 0xabcd) == 0)
    return 0;
  return foo();
}

The functions are equivalent since the lhs and rhs of || don't have side effects. In general, the pattern here is that a side-effect-free expression a || b where !b implies a should be optimized to true. As in the testcase above, a doesn't necessarily imply !b. Something similar could be stated for && expressions. Complementary godbolt link: https://godbolt.org/z/qK5bYf36T
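The claim in the report can be spot-checked outside the compiler. The sketch below restates the two conditions as plain predicates (`cond1`, `cond2` and `check_samples` are names invented for this example) and verifies on a handful of sample values that both are always true: when d1 == d2 the XOR is zero, so the masked comparison holds, and otherwise d1 != d2 holds.

```cpp
#include <cassert>
#include <cstdint>

// Condition as written in cmp1: a || b with a = masked-XOR test, b = (d1 != d2).
bool cond1(std::uint64_t d1, std::uint64_t d2) {
    return ((d1 ^ d2) & 0xabcd) == 0 || d1 != d2;
}

// Same operands in the order used by cmp2.
bool cond2(std::uint64_t d1, std::uint64_t d2) {
    return d1 != d2 || ((d1 ^ d2) & 0xabcd) == 0;
}

// Spot check over a small sample set (not a proof): both orderings
// evaluate to true for every pair.
void check_samples() {
    const std::uint64_t samples[] = {0, 1, 0xabcd, 0xffff,
                                     0x123456789abcdef0ULL, ~0ULL};
    for (std::uint64_t a : samples)
        for (std::uint64_t b : samples) {
            assert(cond1(a, b));
            assert(cond2(a, b));
        }
}
```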
[Bug ipa/113907] [11/12/13/14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907 --- Comment #56 from Jakub Jelinek --- (In reply to Jan Hubicka from comment #55) > It is however not hard to match the jump function while walking gimple > bodies and comparing statements, which is backportable and localized. I am > still waiting for my statistics to converge and will send it soon. So, we can punt on differences there (that is desirable for backporting and maybe GCC 14 too), or we could at that point populate an int vector, which maps the callee vector indexes to indexes in the callee vector in the other candidate function. If unsuccessful, we just free the vector, if successful, we first walk all the callees and union stuff in there using that vector.
[Bug rtl-optimization/114261] [13/14 Regression] Scheduling takes excessive time (97%) since r13-5154-g733a1b777f1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114261 --- Comment #9 from Richard Biener --- As far as I understand the testcase is from fuzzing so not "real", so I think this proposed "fix" isn't necessary (and it's not a real fix, adding a setjmp call at the end of the function will restore it).
[Bug ipa/113907] [11/12/13/14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907 --- Comment #55 from Jan Hubicka ---
> Anyway, can we in the spot my patch changed just walk all
> source->node->callees
> cgraph_edges, for each of them find the corresponding
> cgraph_edge in the alias
> and for each walk all the jump_functions recorded
> and union their m_vr?
> Or is that something that can't be done in LTO for some reason?
That was my first idea too, but the problem is that ICF has (very limited) support for matching functions which differ by the order of their basic blocks: it computes a hash of every basic block and orders them by their hash prior to comparing. This seems half-finished since i.e. the order of edges in PHIs has to match exactly. Callee lists are officially randomly ordered, but in practice they follow the order of basic blocks (as they are built this way). However, since BB orders can differ, just walking both callee sequences and comparing pairwise does not work. This also makes merging the information harder, since we no longer have the BB map at the time we decide to merge. It is however not hard to match the jump function while walking gimple bodies and comparing statements, which is backportable and localized. I am still waiting for my statistics to converge and will send it soon.
[Bug libstdc++/114325] New: std::format gives incorrect results for negative numbers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114325 Bug ID: 114325 Summary: std::format gives incorrect results for negative numbers Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: luigighiron at gmail dot com Target Milestone: ---
The following code generates an incorrect result with libstdc++:

std::format("{}",-100)

From testing on godbolt this seems to generate the string "-1\", then when printed it looks like -10. This seems exclusive to GCC 14, and happens for any number less than -99.
[Bug middle-end/111523] Unexpected performance regression with -ftrivial-auto-var-init=zero for e.g. systemctl unmask
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111523 --- Comment #10 from qinzhao at gcc dot gnu.org --- (In reply to Andrew Pinski from comment #9) > Anways systemd has now changed the buffer to 256 which is much much smaller > and for most calls enough in size before needing to reallocate the buffer > that it has now become fast. > > Anyways -ftrivial-auto-var-init=zero just exposed a performance (stack size) > issue with already existing issue inside the systemd code. A good thing > really. > > So closing as moved. thanks a lot for the analysis and the solution of this performance issue. really appreciate.
[Bug rtl-optimization/114261] [13/14 Regression] Scheduling takes excessive time (97%) since r13-5154-g733a1b777f1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114261 --- Comment #8 from Alexander Monakov ---
If we want to get rid of the compilation time regression sooner rather than later, I can suggest limiting my change only to functions that call setjmp:

diff --git a/gcc/sched-deps.cc b/gcc/sched-deps.cc
index c23218890f..ae23f55274 100644
--- a/gcc/sched-deps.cc
+++ b/gcc/sched-deps.cc
@@ -3695,7 +3695,7 @@ deps_analyze_insn (class deps_desc *deps, rtx_insn *insn)
       CANT_MOVE (insn) = 1;
-      if (!reload_completed)
+      if (!reload_completed && cfun->calls_setjmp)
         {
           /* Scheduling across calls may increase register pressure by extending
              live ranges of pseudos over the call. Worse, in presence of setjmp

That way we retain the "correctness fix" part of r13-5154-g733a1b777f1 and keep the previous status quo on normal functions (quadraticness on asms like demonstrated in comment #5 would also remain).
[Bug c++/112652] g++.dg/cpp26/literals2.C FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112652 --- Comment #7 from Jakub Jelinek --- (In reply to r...@cebitec.uni-bielefeld.de from comment #6)
> > --- Comment #5 from ro at CeBiTec dot Uni-Bielefeld.DE ---
> >> --- Comment #4 from Jakub Jelinek ---
> >> Given that C++ says e.g. in https://eel.is/c++draft/lex.ccon#3.1
> >> that program is ill-formed if some character lacks encoding in the execution
> >> character set, I'm afraid the Solaris iconv behavior results in violation of
>
> Although I can barely wrap my head around the standardese there, I had a
> look at n4928 (the last? C++23 draft), which has a different wording
> here (p.25, 5.13.3):

The testcase is for a C++26 feature, which made those ill-formed.

> The current Solaris iconv behaviour certainly isn't particularly
> intuitive and I'll ask the Solaris engineers about it. However, there's
> the question what to do about the testcase? Just xfail it on Solaris or
> omit just the two affected subtests there?

xfailing is one possibility, but then on Solaris we'll never support C++26 properly. Or require using GNU libiconv rather than Solaris iconv if it can't deal with that?
[Bug libfortran/114304] libgfortran I/O – bogus "Semicolon not allowed as separator with DECIMAL='point'"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114304 Jeffrey A. Law changed: What|Removed |Added Summary|[13/14 Regression] |libgfortran I/O – bogus |libgfortran I/O – bogus |"Semicolon not allowed as |"Semicolon not allowed as |separator with |separator with |DECIMAL='point'" |DECIMAL='point'"| CC||law at gcc dot gnu.org --- Comment #16 from Jeffrey A. Law --- Per c#12, c#13, c#14 & c#15, dropping the regression marker, but leaving open.
[Bug tree-optimization/114322] [14 Regression] SCEV analysis failed for bases like A[(i+x)*stride] since r14-9193-ga0b1798042d033
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114322 Jeffrey A. Law changed: What|Removed |Added Priority|P3 |P2 CC||law at gcc dot gnu.org
[Bug target/114323] [14 Regression] MVE vector load intrinsic miscompiled since r14-5622-g4d7647edfd7d98
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114323 Jeffrey A. Law changed: What|Removed |Added Priority|P3 |P1 CC||law at gcc dot gnu.org
[Bug c++/103524] [meta-bug] modules issue
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103524 Bug 103524 depends on bug 98462, which changed state. Bug 98462 Summary: [modules] ICE when making iomanip module and all modules after it https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98462 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug c++/98462] [modules] ICE when making iomanip module and all modules after it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98462 Patrick Palka changed: What|Removed |Added Target Milestone|--- |11.0 Resolution|--- |FIXED Status|ASSIGNED|RESOLVED CC||ppalka at gcc dot gnu.org --- Comment #1 from Patrick Palka --- Seems fixed even in GCC 11.
[Bug c++/111075] [14 Regression] ICE on g++.dg/torture/tail-padding1.C on darwin
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111075 Marek Polacek changed: What|Removed |Added Priority|P1 |P2 CC||mpolacek at gcc dot gnu.org --- Comment #2 from Marek Polacek --- darwin -> probably not P1.
[Bug c++/112652] g++.dg/cpp26/literals2.C FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112652 --- Comment #6 from ro at CeBiTec dot Uni-Bielefeld.DE ---
> --- Comment #5 from ro at CeBiTec dot Uni-Bielefeld.DE ---
>> --- Comment #4 from Jakub Jelinek ---
>> Given that C++ says e.g. in https://eel.is/c++draft/lex.ccon#3.1
>> that program is ill-formed if some character lacks encoding in the execution
>> character set, I'm afraid the Solaris iconv behavior results in violation of

Although I can barely wrap my head around the standardese there, I had a look at n4928 (the last? C++23 draft), which has a different wording here (p.25, 5.13.3):

  (3.1) — A character-literal with a c-char-sequence consisting of a single basic-c-char, simple-escape-sequence, or universal-character-name is the code unit value of the specified character as encoded in the literal’s associated character encoding. [Note 2 : If the specified character lacks representation in the literal’s associated character encoding or if it cannot be encoded as a single code unit, then the literal is a non-encodable character literal. —end note]

> I've not yet tried to understand what either iconv(3) has to say on the
> matter.

Digging further, Solaris iconv(3C) has

  If iconv() encounters a character in the input buffer that is legal, but for which an identical character does not exist in the target code set, iconv() performs an implementation-defined conversion on this character.

which exactly matches XPG7, so the behaviour seems to be in line with the standards. I've also found that Solaris 11 has iconvctl(3C) (obviously patterned after GNU libiconv) with

  ICONV_SET_TRANSLITERATE
    With this request and a pointer to a const int with a non-zero value, caller can instruct the current conversion to transliterate non-identical characters from the input buffer during the code conversion as much as it can. The value of zero, on the other hand, turns it off.

However,

  int transliterate = 0;
  iconvctl (cd, ICONV_SET_TRANSLITERATE, );

doesn't make a difference. The current Solaris iconv behaviour certainly isn't particularly intuitive and I'll ask the Solaris engineers about it. However, there's the question what to do about the testcase? Just xfail it on Solaris or omit just the two affected subtests there?
[Bug fortran/114324] New: AVX2 vectorisation performance regression with gfortran 13/14
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114324 Bug ID: 114324 Summary: AVX2 vectorisation performance regression with gfortran 13/14 Product: gcc Version: 13.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: mjr19 at cam dot ac.uk Target Milestone: --- Created attachment 57685 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57685=edit Test case of loop showing performance regression The attached loop, when compiled with "-Ofast -mavx2" runs over 20% slower on gfortran 13 or (pre-release) 14 than it does on 12.x. Precise versions tested 12.3.0, 13.1.0 and GCC 14 downloaded on 11th March. Precise slowdown depends on CPU. Tested on Haswell and Kaby Lake desktops. Adding "-fopenmp" changes the code produced, but 12.3 still beats later compilers. The analysis below is without -fopenmp. It appears (to me) that 12.x is using the full width of the ymm registers, and has a loop of 17 vector instructions, and some scalar loop control, which performs two iterations of the original Fortran loop. 13.x manages more aggressive unrolling, performing four iterations per pass, but uses about 54 vector instructions, rather than the 34 one might naively expect. More instructions does not necessarily mean slower, but here it does. I attach the test case to which I refer. I would be happy to add the trivial timing program to show how I have been timing it. The full code is an FFT, but the test case has been reduced to functional nonsense. (I note that in other areas there are pleasing performance gains in gfortran 13.x. It is a pity that this partially cancels them.)
[Bug target/114323] [14 Regression] MVE vector load intrinsic miscompiled since r14-5622-g4d7647edfd7d98
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114323 Richard Biener changed: What|Removed |Added Target Milestone|--- |14.0
[Bug target/114323] [14 Regression] MVE vector load intrinsic miscompiled since r14-5622-g4d7647edfd7d98
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114323 --- Comment #1 from Alex Coplan ---
Hmm, so in 043t.mergephi1 we have:

uint32x4_t foo ()
{
  const uint32_t D.13439[4];
  uint32x4_t V0;

  :
  D.13439 = *.LC0;
  V0_3 = vld1q_u32 ();
  D.13439 ={v} {CLOBBER(eos)};
  return V0_3;
}

but then 044t.dse1 says:

  Deleted dead store: D.13439 = *.LC0;

leaving us with a load of uninitialized memory.
[Bug target/114323] New: [14 Regression] MVE vector load intrinsic miscompiled since r14-5622-g4d7647edfd7d98
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114323 Bug ID: 114323 Summary: [14 Regression] MVE vector load intrinsic miscompiled since r14-5622-g4d7647edfd7d98 Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: acoplan at gcc dot gnu.org Target Milestone: --- The following testcase: #include uint32x4_t foo (void) { uint32x4_t V0 = vld1q_u32(((const uint32_t[4]){1, 2, 3, 4})); return V0; } is miscompiled with -O2 -march=armv8.1-m.main+mve -mfloat-abi=hard on arm-none-eabi. Since r14-5622-g4d7647edfd7d985fbefe13de03c8bc2e3a74fc61 we generate: foo: sub sp, sp, #16 vldrw.32q0, [sp] add sp, sp, #16 bx lr i.e. we do a vector load from uninitialized stack memory. GCC 13 used to give: foo: sub sp, sp, #16 mov ip, sp ldr r3, .L4 ldm r3, {r0, r1, r2, r3} stm ip, {r0, r1, r2, r3} vldrw.32q0, [ip] add sp, sp, #16 bx lr .align 2 .L4: .word .LANCHOR0 .size foo, .-foo .section.rodata .align 2 .set.LANCHOR0,. + 0 .word 1 .word 2 .word 3 .word 4 which, while not optimal, is at least correct. Here is a full executable testcase for the testsuite: #include __attribute__((noipa)) uint32x4_t foo (void) { uint32x4_t V0 = vld1q_u32(((const uint32_t[4]){1, 2, 3, 4})); return V0; } int main(void) { uint32_t buf[4]; vst1q_u32 (buf, foo()); for (int i = 0; i < 4; i++) if (buf[i] != i+1) __builtin_abort (); }
[Bug testsuite/114307] [ARM] Vectorization tests not disabled for vector-less targets
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114307 --- Comment #8 from Maxim Kuvyrkov --- Patch posted: https://patchwork.sourceware.org/project/gcc/patch/20240313105839.2785627-1-maxim.kuvyr...@linaro.org/
[Bug tree-optimization/114322] [14 Regression] SCEV analysis failed for bases like A[(i+x)*stride] since r14-9193-ga0b1798042d033
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114322 Richard Biener changed: What|Removed |Added Target Milestone|--- |14.0 Last reconfirmed||2024-03-13 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 --- Comment #1 from Richard Biener ---
Confirmed. The issue is we have { x_12(D), +, 1 } * stride_11(D) which doesn't behave the same with respect to overflow as { x_12(D) * stride_11(D), +, stride_11(D) } and because of that we analyze it as (int) {(unsigned) x_12(D) * (unsigned) stride_11(D), +, (unsigned) stride_11(D) } as it might wrap. But then the sign-extension to long unsigned int is no longer affine.

  _1 = x_12(D) + i_20;
  _2 = _1 * stride_11(D);
  _3 = (long unsigned int) _2;
  _4 = _3 * 2;
  _5 = A_13(D) + _4;
  _6 = *_5;

The problematic case is x == N < 0 where the last - N might now overflow with the new SCEV. The correctness means that we'll now more often run into these issues for IVs smaller than pointer width. With -m32 we can analyze the DR to

  Creating dr for *_5
    offset from base address: 0
    constant offset from base address: 0
    step: (ssizetype) ((unsigned int) stride_11(D) * 2)
    base alignment: 2
    base misalignment: 0
    offset alignment: 256
    step alignment: 2
    base_object: *A_13(D) + (sizetype) ((unsigned int) stride_11(D) * (unsigned int) x_12(D)) * 2
    Access function 0: {0B, +, (unsigned int) stride_11(D) * 2}_1

If you had written sum += A[i*stride + x*stride]; it might have worked, but unfortunately EVRP transforms this back to (i+x)*stride because it knows stride isn't zero. In the end this means it's our failure that we fail to handle 2 * (unsigned long)({ x_12(D), +, 1 } * stride_11(D)) as a valid evolution for further analysis - of course the multiplication by two in an unsigned type might overflow as well.
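A small numeric sketch may help show why the sign-extended value stops being affine once the 32-bit product is allowed to wrap. This models only the arithmetic of _1 to _3 under wrapping semantics; it is not how SCEV represents anything, and `widened_index` and `step_is_constant_at` are names invented for the example. The unsigned-to-signed conversion is modular here, which C++20 guarantees and which GCC also does in earlier modes.

```cpp
#include <cassert>
#include <cstdint>

// (long unsigned)(int)((unsigned)(x + i) * (unsigned)stride), i.e. the
// 32-bit product computed with wrap-around and then sign-extended.
std::uint64_t widened_index(std::int32_t x, std::int32_t stride, std::int32_t i) {
    const std::uint32_t sum =
        static_cast<std::uint32_t>(x) + static_cast<std::uint32_t>(i);
    const std::uint32_t prod = sum * static_cast<std::uint32_t>(stride);
    const std::int32_t as_int = static_cast<std::int32_t>(prod);  // modular
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(as_int));
}

// True when the step from i to i+1 equals the step from i+1 to i+2,
// which an affine evolution would require for every i.
bool step_is_constant_at(std::int32_t x, std::int32_t stride, std::int32_t i) {
    assert(i <= INT32_MAX - 2);
    const std::uint64_t d0 =
        widened_index(x, stride, i + 1) - widened_index(x, stride, i);
    const std::uint64_t d1 =
        widened_index(x, stride, i + 2) - widened_index(x, stride, i + 1);
    return d0 == d1;
}
```

For example, with x = 0 and stride = 1 << 30, the values for i = 0, 1, 2 are 0, 0x40000000 and 0xFFFFFFFF80000000, so the step changes once the 32-bit product crosses the sign bit; with a small stride such as 2 the step stays constant over the same range.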
[Bug tree-optimization/114322] New: [14 Regression] SCEV analysis failed for bases like A[(i+x)*stride] since r14-9193-ga0b1798042d033
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114322 Bug ID: 114322 Summary: [14 Regression] SCEV analysis failed for bases like A[(i+x)*stride] since r14-9193-ga0b1798042d033 Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: hliu at amperecomputing dot com Target Milestone: --- Compile the following case with: gcc simp.c -Ofast -mcpu=neoverse-n1 -S \ -fdump-tree-ifcvt -fdump-tree-vect-details-scev int foo (short *A, int x, int stride) { int sum = 0; if (stride > 1) { #pragma GCC unroll 1 for (int i = 0; i < 1024; ++i) sum += A[(i + x) * stride]; } return sum; } The gimple in the loop is: : # sum_19 = PHI # i_20 = PHI # ivtmp_37 = PHI _1 = x_12(D) + i_20; _2 = _1 * stride_11(D); _3 = (long unsigned int) _2; _4 = _3 * 2; _5 = A_13(D) + _4; _6 = *_5; _7 = (int) _6; sum_15 = _7 + sum_19; Before the commit (i.e., from pr114074 bug fix), it can be vectorized: Creating dr for *_5 analyze_innermost: (analyze_scalar_evolution (loop_nb = 1) (scalar = _5) (get_scalar_evolution (scalar = _5) (scalar_evolution = {A_13(D) + (long unsigned int) (stride_11(D) * x_12(D)) * 2, +, (long unsigned int) stride_11(D) * 2}_1)) ) success. 
(analyze_scalar_evolution (loop_nb = 1) (scalar = _5) (get_scalar_evolution (scalar = _5) (scalar_evolution = {A_13(D) + (long unsigned int) (stride_11(D) * x_12(D)) * 2, +, (long unsigned int) stride_11(D) * 2}_1)) ) (instantiate_scev (instantiate_below = 5 -> 3) (evolution_loop = 1) (chrec = {A_13(D) + (long unsigned int) (stride_11(D) * x_12(D)) * 2, +, (long unsigned int) stride_11(D) * 2}_1) (res = {A_13(D) + (long unsigned int) (stride_11(D) * x_12(D)) * 2, +, (long unsigned int) stride_11(D) * 2}_1)) base_address: A_13(D) + (sizetype) (stride_11(D) * x_12(D)) * 2 offset from base address: 0 constant offset from base address: 0 step: (ssizetype) ((long unsigned int) stride_11(D) * 2) base alignment: 2 base misalignment: 0 offset alignment: 128 step alignment: 2 base_object: *A_13(D) + (sizetype) (stride_11(D) * x_12(D)) * 2 Access function 0: {0B, +, (long unsigned int) stride_11(D) * 2}_1 After the commit, loop vectorized failed due to SCEV failure with *_5: Creating dr for *_5 analyze_innermost: (analyze_scalar_evolution (loop_nb = 1) (scalar = _5) (get_scalar_evolution (scalar = _5) (scalar_evolution = _5)) ) (analyze_scalar_evolution (loop_nb = 1) (scalar = _5) (get_scalar_evolution (scalar = _5) (scalar_evolution = _5)) ) simp.c:11:10: missed: failed: evolution of base is not affine. .. (res = scev_not_known)) To my understanding, '(i + x) * stride' is signed integer calculation, in which overflow is undefined behavior and the case should be vectorized.
[Bug libstdc++/110167] excessive compile time for std::to_array with huge arrays
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110167 Jonathan Wakely changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #15 from Jonathan Wakely --- Fixed for 13.3 and 12.4
[Bug target/112548] [14 regression] 5% exec time regression in 429.mcf on AMD zen4 CPU (since r14-5076-g01c18f58d37865)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112548

--- Comment #10 from Robin Dapp ---
(In reply to Sam James from comment #9)
> (In reply to Filip Kastl from comment #8)
> > I'd like to help but I'm afraid I cannot send you the SPEC binaries with PGO
> > applied since SPEC is licensed nor can I give you access to a Zen4 computer.
> > I suppose someone else will have to analyze this bug.
>
> Could you perhaps send only the gcda files so Robin can build again with
> -fprofile-use?

Yes, that would be helpful.  Or Filip builds the executables himself and posts (some of) the difference here.  Maybe that also gets us a bit closer to the problem.
[Bug libstdc++/110167] excessive compile time for std::to_array with huge arrays
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110167 --- Comment #14 from GCC Commits --- The releases/gcc-12 branch has been updated by Jonathan Wakely : https://gcc.gnu.org/g:ec5da76ad33dcba7858525fdb6b39288631fcd8a commit r12-10206-gec5da76ad33dcba7858525fdb6b39288631fcd8a Author: Jonathan Wakely Date: Thu Jun 8 12:24:43 2023 +0100 libstdc++: Optimize std::to_array for trivial types [PR110167] As reported in PR libstdc++/110167, std::to_array compiles extremely slowly for very large arrays. It needs to instantiate a very large specialization of std::index_sequence and then create a very large aggregate initializer from the pack expansion. For trivial types we can simply default-initialize the std::array and then use memcpy to copy the values. For non-trivial types we need to use the existing implementation, despite the compilation cost. As also noted in the PR, using a generic lambda instead of the __to_array helper compiles faster since gcc-13. It also produces slightly smaller code at -O1, due to additional inlining. The code at -Os, -O2 and -O3 seems to be the same. This new implementation requires __cpp_generic_lambdas >= 201707L (i.e. P0428R2) but that is supported since Clang 10 and since Intel icc 2021.5.0 (and since GCC 10.1). libstdc++-v3/ChangeLog: PR libstdc++/110167 * include/std/array (to_array): Initialize arrays of trivial types using memcpy. For non-trivial types, use lambda expressions instead of a separate helper function. (__to_array): Remove. * testsuite/23_containers/array/creation/110167.cc: New test. (cherry picked from commit 960de5dd886572711ef86fa1e15e30d3810eccb9)
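The approach described in the commit message can be sketched roughly as follows. This is an illustration only, not the libstdc++ code: to_array_sketch is a hypothetical name, the function is not constexpr (plain std::memcpy cannot be used in constant evaluation), and volatile element types are ignored. It assumes a compiler with lambda template parameter lists (P0428R2).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for std::to_array, showing the two code paths.
template <typename T, std::size_t N>
std::array<std::remove_cv_t<T>, N> to_array_sketch(T (&a)[N]) {
    using U = std::remove_cv_t<T>;
    if constexpr (std::is_trivial_v<U>) {
        // Trivial element type: default-initialize the array, then copy all
        // bytes at once. No index_sequence of length N is instantiated.
        std::array<U, N> result;
        std::memcpy(result.data(), a, sizeof(a));
        return result;
    } else {
        // Non-trivial element type: fall back to a pack expansion, written
        // as a generic lambda with an explicit template parameter list
        // instead of a separate helper function.
        return [&a]<std::size_t... I>(std::index_sequence<I...>) {
            return std::array<U, N>{{a[I]...}};
        }(std::make_index_sequence<N>{});
    }
}
```

With int raw[] = {1, 2, 3}, to_array_sketch(raw) takes the memcpy path; an element type with a user-provided constructor takes the lambda path and still pays the per-element instantiation cost the commit message mentions.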
[Bug middle-end/114319] htobe64-like function is not optimized on 32-bit x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114319

--- Comment #5 from Richard Biener ---
Coalescing successful!
Merged into 1 stores
32 bit bswap implementation found at: _37

Looks like we are only merging one store.  Note we cannot recognize bswap to memory; this is a known issue.  So for the bswap64 we need to merge to a 64bit store, which we never do on a 32bit platform.  We could with SSE, but apparently we don't try with the bswap trick at least.  The bswap trick also doesn't seem to consider the split 64bit bswap.  Oddly enough we also fail to merge the other store (maybe missing a val >> 32 pre-shift "trick").

Possibly could be shown to be a similar issue with a 128bit bswap on x86_64, which we could emulate with two 64bit bswaps.
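For readers unfamiliar with the "split 64bit bswap" mentioned above, here is a minimal C++ sketch of the decomposition. It is not the testcase from the PR; bswap64_via_32 is a hypothetical name. It relies on the GCC/Clang builtin __builtin_bswap32 and shows that a 64-bit byte swap is two 32-bit byte swaps with the halves exchanged, where the high half is obtained with the val >> 32 pre-shift:

```cpp
#include <cassert>
#include <cstdint>

// Byte-swap a 64-bit value using only 32-bit byte swaps, the form a
// 32-bit target would have to use.
std::uint64_t bswap64_via_32(std::uint64_t val) {
    const std::uint32_t lo = static_cast<std::uint32_t>(val);        // low half
    const std::uint32_t hi = static_cast<std::uint32_t>(val >> 32);  // pre-shifted high half
    // The swapped low half becomes the new high half and vice versa.
    return (static_cast<std::uint64_t>(__builtin_bswap32(lo)) << 32)
           | __builtin_bswap32(hi);
}
```

The same pattern one level up (two 64-bit swaps with the halves exchanged) is what the comment suggests for emulating a 128-bit bswap on x86_64.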
[Bug middle-end/114313] ICE: in limb_access_type, at gimple-lower-bitint.cc:591 with _BitInt() in a bitfield
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114313

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from Jakub Jelinek ---
Fixed.
[Bug middle-end/114313] ICE: in limb_access_type, at gimple-lower-bitint.cc:591 with _BitInt() in a bitfield
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114313 --- Comment #3 from GCC Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:0613b12dd7f6274a1aac07f295ed51d86c2c85f1 commit r14-9447-g0613b12dd7f6274a1aac07f295ed51d86c2c85f1 Author: Jakub Jelinek Date: Wed Mar 13 10:19:04 2024 +0100 bitint: Fix up lowering of bitfield loads/stores [PR114313] The following testcase ICEs, because for large/huge _BitInt bitfield loads/stores we use the DECL_BIT_FIELD_REPRESENTATIVE as the underlying "var" and indexes into it can be larger than the precision of the bitfield might normally allow. The following patch fixes that by passing NULL_TREE type in that case to limb_access, so that we always return m_limb_type type and don't do the extra assertions, after all, the callers expect that too. I had to add the first hunk to avoid ICE, it was using type in one place even when it was NULL. But TYPE_SIZE (TREE_TYPE (var)) seems like the right size to use anyway because the code uses VIEW_CONVERT_EXPR on it. 2024-03-13 Jakub Jelinek PR middle-end/114313 * gimple-lower-bitint.cc (bitint_large_huge::limb_access): Use TYPE_SIZE of TREE_TYPE (var) rather than TYPE_SIZE of type. (bitint_large_huge::handle_load): Pass NULL_TREE rather than rhs_type to limb_access for the bitfield load cases. (bitint_large_huge::lower_mergeable_stmt): Pass NULL_TREE rather than lhs_type to limb_access if nlhs is non-NULL. * gcc.dg/torture/bitint-62.c: New test.
[Bug fortran/114283] [OpenMP] Dummy procedures/proc pointers and 'defaultmap', 'default', 'firstprivate' etc.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114283

--- Comment #1 from GCC Commits ---
The master branch has been updated by Tobias Burnus:

https://gcc.gnu.org/g:c5037fcee2de438774466e78e46e6ab4df72a7fe

commit r14-9446-gc5037fcee2de438774466e78e46e6ab4df72a7fe
Author: Tobias Burnus
Date:   Wed Mar 13 09:35:28 2024 +0100

    OpenMP/Fortran: Fix defaultmap(none) issue with dummy procedures [PR114283]

    Dummy procedures look similar to variables but aren't - neither in Fortran nor in OpenMP.  As the middle end sees PARM_DECLs, mark them as predetermined firstprivate for mapping (as already done in gfc_omp_predetermined_sharing).

    This does not address the issues related to procedure pointers, which are still discussed on spec level [see PR].

            PR fortran/114283

    gcc/fortran/ChangeLog:

            * trans-openmp.cc (gfc_omp_predetermined_mapping): Map dummy procedures as firstprivate.

    libgomp/ChangeLog:

            * testsuite/libgomp.fortran/declare-target-indirect-4.f90: New test.
[Bug sanitizer/112709] [13/14 Regression] address sanitize and returns_twice causes an ICE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112709 --- Comment #10 from GCC Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:6586359e8e4c611dd96129b5d4f24023949ac3fc commit r14-9445-g6586359e8e4c611dd96129b5d4f24023949ac3fc Author: Jakub Jelinek Date: Wed Mar 13 09:19:05 2024 +0100 asan: Fix ICE during instrumentation of returns_twice calls [PR112709] The following patch on top of the previously posted ubsan/gimple-iterator one handles asan the same. While the case of returning by hidden reference is handled differently because of the first recently posted asan patch, this deals with instrumentation of the aggregates returned in registers case as well as instrumentation of loads from aggregate memory in the function arguments of returns_twice calls. 2024-03-13 Jakub Jelinek PR sanitizer/112709 * asan.cc (maybe_create_ssa_name, maybe_cast_to_ptrmode, build_check_stmt, maybe_instrument_call, asan_expand_mark_ifn): Use gsi_safe_insert_before instead of gsi_insert_before. * gcc.dg/asan/pr112709-2.c: New test.
[Bug sanitizer/112709] [13/14 Regression] address sanitize and returns_twice causes an ICE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112709 --- Comment #9 from GCC Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:364c684c474841e3c9c04e025a5c1bca49705c86 commit r14-9444-g364c684c474841e3c9c04e025a5c1bca49705c86 Author: Jakub Jelinek Date: Wed Mar 13 09:16:45 2024 +0100 gimple-iterator, ubsan: Fix ICE during instrumentation of returns_twice calls [PR112709] ubsan, asan (both PR112709) and _BitInt lowering (PR113466) want to insert some instrumentation or adjustment statements before some statement. This unfortunately creates invalid IL if inserting before a returns_twice call, because we require that such calls are the first statement in a basic block and that we have an edge from the .ABNORMAL_DISPATCHER block to the block containing the returns_twice call (in addition to other edge(s)). The following patch adds helper functions for such insertions and uses it for now in ubsan (I'll post a follow up which uses it in asan and will work later on the _BitInt lowering PR). In particular, if the bb with returns_twice call at the start has just 2 edges, one EDGE_ABNORMAL from .ABNORMAL_DISPATCHER and another (non-EDGE_ABNORMAL/EDGE_EH) from some other bb, it just inserts the statement or sequence on that other edge. If the bb has more predecessor edges or the one not from .ABNORMAL_DISPATCHER is e.g. an EH edge (this latter case likely shouldn't happen, one would need labels or something like that), the patch splits the block with returns_twice call such that there is just one edge next to .ABNORMAL_DISPATCHER edge and adjusts PHIs as needed to make it happen. The functions also replace uses of PHIs from the returns_twice bb with the corresponding PHI arguments, because otherwise it would be invalid IL. E.g. 
in ubsan/pr112709-2.c (qux) we have before the ubsan pass : # .MEM_5(ab) = PHI <.MEM_4(9), .MEM_25(ab)(11)> # _7(ab) = PHI <_20(9), _8(ab)(11)> # .MEM_21(ab) = VDEF <.MEM_5(ab)> _22 = bar (*_7(ab)); where bar is returns_twice call and bb 11 has .ABNORMAL_DISPATCHER call, this patch instruments it like: : # .MEM_4 = PHI <.MEM_17(ab)(4), .MEM_10(D)(5), .MEM_14(ab)(8)> # DEBUG BEGIN_STMT # VUSE <.MEM_4> _20 = p; # .MEM_27 = VDEF <.MEM_4> .UBSAN_NULL (_20, 0B, 0); # VUSE <.MEM_27> _2 = __builtin_dynamic_object_size (_20, 0); # .MEM_28 = VDEF <.MEM_27> .UBSAN_OBJECT_SIZE (_20, 1024, _2, 0); : # .MEM_5(ab) = PHI <.MEM_28(9), .MEM_25(ab)(11)> # _7(ab) = PHI <_20(9), _8(ab)(11)> # .MEM_21(ab) = VDEF <.MEM_5(ab)> _22 = bar (*_7(ab)); The edge from .ABNORMAL_DISPATCHER is there just to represent the returning for 2nd and later times, the instrumentation can't be done at that point as there is no code executed during that point. The ubsan/pr112709-1.c testcase includes non-virtual PHIs to cover the handling of those as well. 2024-03-13 Jakub Jelinek PR sanitizer/112709 * gimple-iterator.h (gsi_safe_insert_before, gsi_safe_insert_seq_before): Declare. * gimple-iterator.cc: Include gimplify.h. (edge_before_returns_twice_call, adjust_before_returns_twice_call, gsi_safe_insert_before, gsi_safe_insert_seq_before): New functions. * ubsan.cc (instrument_mem_ref, instrument_pointer_overflow, instrument_nonnull_arg, instrument_nonnull_return): Use gsi_safe_insert_before instead of gsi_insert_before. (maybe_instrument_pointer_overflow): Use force_gimple_operand, gimple_seq_add_seq_without_update and gsi_safe_insert_seq_before instead of force_gimple_operand_gsi. (instrument_object_size): Likewise. Use gsi_safe_insert_before instead of gsi_insert_before. * gcc.dg/ubsan/pr112709-1.c: New test. * gcc.dg/ubsan/pr112709-2.c: New test.
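As background for the "returning for 2nd and later times" remark: setjmp is the classic returns_twice function. The following C++ sketch is not from the patch or its testcases, and count_setjmp_returns is a made-up name. It shows one call site that returns twice, once directly and once via longjmp, with no code running "before" the call on the second return:

```cpp
#include <cassert>
#include <csetjmp>

// Counts how often a single setjmp call site returns.
int count_setjmp_returns() {
    std::jmp_buf env;
    // volatile because the variable is modified between setjmp and longjmp
    // and read afterwards.
    volatile int returns = 0;
    if (setjmp(env) == 0) {
        returns = returns + 1;  // first return: the direct call
        std::longjmp(env, 1);   // control re-enters at the setjmp call site
    } else {
        returns = returns + 1;  // second return: reached via longjmp
    }
    return returns;
}
```

Any statement a sanitizer wants to place before such a call can only execute on the first, direct path, which is why the patch inserts it on the non-abnormal incoming edge rather than in the block that starts with the call.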
[Bug tree-optimization/114151] [14 Regression] weird and inefficient codegen and addressing modes since r14-9193
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151

--- Comment #22 from Richard Biener ---
(In reply to Andrew Macleod from comment #21)
> (In reply to Richard Biener from comment #19)
> >
> > While ranger has a range_on_exit API this doesn't work on GENERIC
> > expressions
> > as far as I can see but only SSA names but I guess that could be "fixed"
> > given range_on_exit also looks at the last stmt and eventually defers to
> > range_of_expr (or range_on_entry), but possibly get_tree_range needs
> > variants for on_entry/on_exit (it doesn't seem to use its 'stmt' context
> > very consistently, notably not for SSA_NAMEs ...).
>
> That would appear to be an oversight.  That API has not been used very much
> for arbitrary generic trees. I think the original reason support for tree
> expressions was added was a "try this" for some other PR.  It was simple to
> do so we left it in, but it never got any real traction. At least as far as
> I can recall :-)
>
> Currently, I think most, if not all, uses of get_tree_range() are either
> !gimple_ssa_range_p() (commonly constants or unsupported types) or ssa_names
> on abnormal edges.
>
> For abnormal edges, we ought to be getting the global range directly these
> days instead of calling that routine.  Then in get_tree_range (), we ought
> to be calling range_of_expr for SSA_NAMES with the provided context.  I'll
> poke at that too.  The support for general tree expressions changed the
> original intent of the function, and it should be adjusted.
>
> As for the on-exit/on-entry bits... we haven't had a need for entry/exit
> outside of ranger in the past.  I had toyed with exporting those routines
> and making them a part of the official API for value-query, but hadn't run
> across the need as yet.
>
> Let me think about that for a minute.  It can certainly be done. I guess we
> really only need an on-entry and on-exit version of range_of_expr to do
> everything.  So if we end up with something like:
>   range_of_expr (r, expr, stmt)
>   range_of_expr_on_entry (r, expr, bb)
>   range_of_expr_on_exit (r, expr, bb)
>
> And have that all work with general tree expressions..  That would solve
> much of this for you?

Yes, I wouldn't mind if range_on_{entry,exit} handle general tree expressions, there are enough APIs to be confused with already ;)

> > > >
> > > > Interestingly enough we somehow still need the
> > > > hunk of Andrew's patch to do it :/
>
> That probably means there is another call somewhere in the chain with no
> context. However, I will say that functionality is more important than it
> seems. Should have been there from the start :-P.

Possibly yes.  It might be we fill ranger's cache with VARYING and when we re-do the query as a dependent one but with context we don't recompute it?  I also only patched up a single place in SCEV with the context so I possibly missed some others that end up with a range query, for example through niter analysis that might be triggered.
[Bug testsuite/114307] [ARM] Vectorization tests not disabled for vector-less targets
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114307

Maxim Kuvyrkov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |mkuvyrkov at gcc dot gnu.org

--- Comment #7 from Maxim Kuvyrkov ---
Working on this, including reviewing gcc.dg/vect/, g++.dg/vect/ and gfortran.dg/vect/ testsuites.
[Bug bootstrap/106472] No rule to make target '../libbacktrace/libbacktrace.la', needed by 'libgo.la'.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106472 --- Comment #36 from Дилян Палаузов --- > maybe this ought to be a `depend=` entry in Makefile.def instead? My understanding is that depend= only has effect for bootstrapped targets, and there is no boot_language=yes in gcc/go/config-lang.in.