[Bug target/115146] [15 Regression] Incorrect 8-byte vectorization: psrlw/psraw confusion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115146 H.J. Lu changed: What|Removed |Added Last reconfirmed||2024-05-18 Ever confirmed|0 |1 Status|UNCONFIRMED |NEW
[Bug target/115146] [15 Regression] Incorrect 8-byte vectorization: psrlw/psraw confusion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115146 H.J. Lu changed: What|Removed |Added CC||hjl.tools at gmail dot com --- Comment #6 from H.J. Lu --- Created attachment 58235 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58235&action=edit A patch Please try this.
[Bug tree-optimization/115011] [14/15 Regression] Missed optimization: (bool) (f ? 1: t) ==> 1 when bool t = (0 >= f) + x;
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115011 H.J. Lu changed: What|Removed |Added CC||pinskia at gcc dot gnu.org Target Milestone|--- |14.2 Status|UNCONFIRMED |NEW Last reconfirmed||2024-05-09 Ever confirmed|0 |1 --- Comment #1 from H.J. Lu --- It is caused by r14-1597.
[Bug libgcc/114907] __trunchfbf2 should be renamed to __extendhfbf2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114907

--- Comment #5 from H.J. Lu ---
(In reply to H.J. Lu from comment #4)
> (In reply to H.J. Lu from comment #3)
> > convert_mode_scalar has
> >
> >   if (GET_MODE_PRECISION (from_mode) == GET_MODE_PRECISION (to_mode))
> >     /* Conversion between decimal float and binary float, same size. */
> >     tab = DECIMAL_FLOAT_MODE_P (from_mode) ? trunc_optab : sext_optab;
> >
> > Since for HF->BF, DECIMAL_FLOAT_MODE_P (from_mode) is false, tab is
> > sext_optab
> > and __trunchfbf2 should be renamed to __extendhfbf2.
>
> Since BFmode range is bigger than HFmode, __trunchfbf2 should be used.

Oops, __extendhfbf2 is correct.
[Bug libgcc/114907] __trunchfbf2 should be renamed to __extendhfbf2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114907

--- Comment #4 from H.J. Lu ---
(In reply to H.J. Lu from comment #3)
> convert_mode_scalar has
>
>   if (GET_MODE_PRECISION (from_mode) == GET_MODE_PRECISION (to_mode))
>     /* Conversion between decimal float and binary float, same size. */
>     tab = DECIMAL_FLOAT_MODE_P (from_mode) ? trunc_optab : sext_optab;
>
> Since for HF->BF, DECIMAL_FLOAT_MODE_P (from_mode) is false, tab is
> sext_optab
> and __trunchfbf2 should be renamed to __extendhfbf2.

Since BFmode range is bigger than HFmode, __trunchfbf2 should be used.
[Bug libgcc/114907] __trunchfbf2 should be renamed to __extendhfbf2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114907

H.J. Lu changed:

What    |Removed                         |Added
Summary |Missing __extendhfbf2 in libgcc |__trunchfbf2 should be renamed to __extendhfbf2

--- Comment #3 from H.J. Lu ---
convert_mode_scalar has

  if (GET_MODE_PRECISION (from_mode) == GET_MODE_PRECISION (to_mode))
    /* Conversion between decimal float and binary float, same size. */
    tab = DECIMAL_FLOAT_MODE_P (from_mode) ? trunc_optab : sext_optab;

Since for HF->BF, DECIMAL_FLOAT_MODE_P (from_mode) is false, tab is sext_optab
and __trunchfbf2 should be renamed to __extendhfbf2.
[Bug libgcc/114907] Missing __extendhfbf2 in libgcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114907

--- Comment #2 from H.J. Lu ---
[hjl@gnu-cfl-3 pr114907]$ cat foo.c
__bf16
foo (_Float16 x)
{
  return x;
}
[hjl@gnu-cfl-3 pr114907]$ make CC=gcc
gcc -O2 -S foo.c
[hjl@gnu-cfl-3 pr114907]$ cat foo.s
        .file "foo.c"
        .text
        .p2align 4
        .globl foo
        .type foo, @function
foo:
.LFB0:
        .cfi_startproc
        subq $8, %rsp
        .cfi_def_cfa_offset 16
        call __extendhfbf2
        addq $8, %rsp
        .cfi_def_cfa_offset 8
        ret
        .cfi_endproc
.LFE0:
        .size foo, .-foo
        .globl __extendhfbf2
        .ident "GCC: (GNU) 14.0.1 20240411 (Red Hat 14.0.1-0)"
        .section .note.GNU-stack,"",@progbits
[hjl@gnu-cfl-3 pr114907]$
[Bug libgcc/114907] Missing __extendhfbf2 in libgcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114907 H.J. Lu changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2024-05-01 Status|UNCONFIRMED |NEW
[Bug libgcc/114907] Missing __extendhfbf2 in libgcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114907 --- Comment #1 from H.J. Lu --- There is __trunchfbf2. Why does GCC generate __extendhfbf2?
[Bug libgcc/114907] New: Missing __extendhfbf2 in libgcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114907 Bug ID: 114907 Summary: Missing __extendhfbf2 in libgcc Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libgcc Assignee: unassigned at gcc dot gnu.org Reporter: hjl.tools at gmail dot com CC: crazylht at gmail dot com Target Milestone: --- Target: x86-64 >From https://sourceware.org/bugzilla/show_bug.cgi?id=31685 $ cat x.cc #include #include #include #include #define SIZE 8 typedef _Float16 T; //typedef volatile float T; void fp16tobf16(_Float16 * f) { __bf16 * b = reinterpret_cast<__bf16*>(f); for(int i=0; i a{}; std::fill(a.begin(), a.end(), (_Float16) 1.7653432432424324); fp16tobf16(a.data()); __bf16 * b = reinterpret_cast<__bf16*>(a.data()); std::cout << "\n"; for(int i=0; i
[Bug tree-optimization/114864] [12/13/14/15 regression] wrong code at -O1 with "-fno-tree-dce -fno-tree-fre" on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114864 H.J. Lu changed: What|Removed |Added CC||ebotcazou at gcc dot gnu.org --- Comment #2 from H.J. Lu --- It is caused by r12-434.
[Bug rtl-optimization/114828] [14 Regression] ICE on valid code at -O1 with "-ftree-pre -fselective-scheduling -fsel-sched-pipelining -fschedule-insns" on x86_64-linux-gnu: Segmentation fault
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114828 H.J. Lu changed: What|Removed |Added Ever confirmed|0 |1 Version|unknown |14.0 Status|UNCONFIRMED |NEW Summary|ICE on valid code at -O1|[14 Regression] ICE on |with "-ftree-pre|valid code at -O1 with |-fselective-scheduling |"-ftree-pre |-fsel-sched-pipelining |-fselective-scheduling |-fschedule-insns" on|-fsel-sched-pipelining |x86_64-linux-gnu: |-fschedule-insns" on |Segmentation fault |x86_64-linux-gnu: ||Segmentation fault CC||rguenther at suse dot de Last reconfirmed||2024-04-23 --- Comment #1 from H.J. Lu --- This is caused by r14-4089.
[Bug tree-optimization/114796] [11/12/13/14 Regression] wrong code at -O2 with "-fno-tree-fre -fno-inline -fselective-scheduling2" on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114796 H.J. Lu changed: What|Removed |Added Summary|wrong code at -O2 with |[11/12/13/14 Regression] |"-fno-tree-fre -fno-inline |wrong code at -O2 with |-fselective-scheduling2" on |"-fno-tree-fre -fno-inline |x86_64-linux-gnu|-fselective-scheduling2" on ||x86_64-linux-gnu Last reconfirmed||2024-04-21 CC||abel at gcc dot gnu.org Status|UNCONFIRMED |NEW Version|unknown |14.0 Ever confirmed|0 |1 --- Comment #1 from H.J. Lu --- This is caused by r9-6789.
[Bug tree-optimization/114793] [14 Regression] wrong code at -O1 with "-fschedule-insns2 -fselective-scheduling2" on x86_64-linux-gnu (the generated code hangs)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114793 --- Comment #3 from H.J. Lu --- (In reply to Zhendong Su from comment #1) > The following reproducer is different, but perhaps is the same or related. > > Compiler Explorer: https://godbolt.org/z/411rzMP1n > > [588] % gcctk -v > Using built-in specs. > COLLECT_GCC=gcctk > COLLECT_LTO_WRAPPER=/local/suz-local/software/local/gcc-trunk/libexec/gcc/ > x86_64-pc-linux-gnu/14.0.1/lto-wrapper > Target: x86_64-pc-linux-gnu > Configured with: ../gcc-trunk/configure --disable-bootstrap > --enable-checking=yes --prefix=/local/suz-local/software/local/gcc-trunk > --enable-sanitizers --enable-languages=c,c++ --disable-werror > --enable-multilib > Thread model: posix > Supported LTO compression algorithms: zlib > gcc version 14.0.1 20240421 (experimental) (GCC) > [589] % > [589] % gcctk -O1 -fno-tree-forwprop -fselective-scheduling2 > -fschedule-insns2 -fsel-sched-pipelining small.c > [590] % ./a.out > Aborted > [591] % > [591] % cat small.c > int printf(const char *, ...); > int a, d, g, h; > volatile int b = 1; > static unsigned c = 1; > char e, f = 1, i; > static int j() { > int k, l = g, m = 1 << l, n = -e, o = -1 % ((f && 1) ^ i), p = ~n - o; > if (m) { > int q, s, t, r = 1 % (((1 % f) & (~e | c)) ^ b); > q = f; > s = i; > t = e; > f = -b; > k = f; > d = -1; > u: > e = 0 & b; > if (i > f) > if (!b) > goto v; > if (d > t) > __builtin_abort(); > if (b < 1 || !d || !c) { > printf("%d\n", i); > f = ((i | b) & (k - r)) << (e << ~t ^ q) << s; > goto u; > } > if (i) > f = q; > v: > i = n & o & l; > printf("%ld\n", (long)t); > } > i = p; > return h; > } > int main() { > for (; a < 3; a++) > j(); > return 0; > } This is caused by r14-2524.
[Bug tree-optimization/114793] wrong code at -O1 with "-fschedule-insns2 -fselective-scheduling2" on x86_64-linux-gnu (the generated code hangs)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114793 H.J. Lu changed: What|Removed |Added CC||jh at suse dot cz Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Last reconfirmed||2024-04-21 Version|unknown |14.0 --- Comment #2 from H.J. Lu --- (In reply to Zhendong Su from comment #0) > It seems to be a recent regression as it does not reproduce with 13.2 and > earlier. > > Compiler Explorer: https://godbolt.org/z/b3cc1MqP9 > > [538] % gcctk -v > Using built-in specs. > COLLECT_GCC=gcctk > COLLECT_LTO_WRAPPER=/local/suz-local/software/local/gcc-trunk/libexec/gcc/ > x86_64-pc-linux-gnu/14.0.1/lto-wrapper > Target: x86_64-pc-linux-gnu > Configured with: ../gcc-trunk/configure --disable-bootstrap > --enable-checking=yes --prefix=/local/suz-local/software/local/gcc-trunk > --enable-sanitizers --enable-languages=c,c++ --disable-werror > --enable-multilib > Thread model: posix > Supported LTO compression algorithms: zlib > gcc version 14.0.1 20240421 (experimental) (GCC) > [539] % > [539] % gcctk -O0 small.c > [540] % ./a.out > [541] % > [541] % gcctk -O1 -fschedule-insns2 -fselective-scheduling2 small.c > [542] % timeout -s 9 10 ./a.out > Killed > [543] % > [543] % cat small.c > int printf(const char *, ...); > volatile int a; > int b, c, d = 1, e, f; > int main() { > int g = 1; > for (; b; b -= d) > g = e; > for (; c < 2; c++) { > if (g) { > if (!d) > printf("%d", f); > continue; > } > a; > } > return 0; > } This is caused by r14-2712.
[Bug tree-optimization/114792] ICE on valid code at -O1 with "-fno-tree-ccp -fno-tree-copy-prop" on x86_64-linux-gnu: in get_loop_body, at cfgloop.cc:903
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114792 H.J. Lu changed: What|Removed |Added Last reconfirmed||2024-04-21 CC||jh at suse dot cz Ever confirmed|0 |1 Version|unknown |14.0 Status|UNCONFIRMED |NEW --- Comment #1 from H.J. Lu --- It is caused by r14-301.
[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115 H.J. Lu changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #21 from H.J. Lu --- Fixed for GCC 14 and GCC 11/12/13 release branches.
[Bug target/114696] ICE: in extract_constrain_insn_cached, at recog.cc:2725 insn does not satisfy its constraints: {*anddi_1} with -mapxf -mx32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114696 H.J. Lu changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #5 from H.J. Lu --- Fixed.
[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115 --- Comment #17 from H.J. Lu --- (In reply to Jan Hubicka from comment #15) > > Fixed for GCC 14 so far > It is simple patch, so backporting is OK after a week in mainline. These are patches which I am backporting: https://patchwork.sourceware.org/project/gcc/list/?series=32823
[Bug target/114696] ICE: in extract_constrain_insn_cached, at recog.cc:2725 insn does not satisfy its constraints: {*anddi_1} with -mapxf -mx32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114696 H.J. Lu changed: What|Removed |Added Target Milestone|--- |14.0 --- Comment #3 from H.J. Lu --- A patch is posted at https://patchwork.sourceware.org/project/gcc/list/?series=32811
[Bug target/114696] ICE: in extract_constrain_insn_cached, at recog.cc:2725 insn does not satisfy its constraints: {*anddi_1} with -mapxf -mx32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114696 H.J. Lu changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |hjl.tools at gmail dot com --- Comment #2 from H.J. Lu --- Created attachment 57934 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57934&action=edit A patch I am testing this.
[Bug target/114696] ICE: in extract_constrain_insn_cached, at recog.cc:2725 insn does not satisfy its constraints: {*anddi_1} with -mapxf -mx32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114696 H.J. Lu changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2024-04-12 Status|UNCONFIRMED |NEW --- Comment #1 from H.J. Lu --- The problem is that the APX encoding length for AND exceeds 15 bytes with -mx32.
[Bug libfortran/114646] libgfortran still doesn't define GTHREAD_USE_WEAK to 0 for newer glibc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114646 H.J. Lu changed: What|Removed |Added Status|RESOLVED|NEW Resolution|DUPLICATE |--- --- Comment #14 from H.J. Lu --- This issue is about how libgcc is used by libgfortran, not libgcc itself.
[Bug libfortran/114646] libgfortran still doesn't define GTHREAD_USE_WEAK to 0 for newer glibc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114646 H.J. Lu changed: What|Removed |Added Resolution|DUPLICATE |--- Status|RESOLVED|NEW Ever confirmed|0 |1 Component|libgcc |libfortran Summary|libgcc's gthr.h still |libgfortran still doesn't |defines GTHREAD_USE_WEAK to |define GTHREAD_USE_WEAK to |1 for newer glibc |0 for newer glibc
[Bug libgcc/114646] libgcc's gthr.h still defines GTHREAD_USE_WEAK to 1 for newer glibc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114646 --- Comment #10 from H.J. Lu --- Created attachment 57906 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57906&action=edit A patch I am testing this.
[Bug libgcc/114646] libgcc's gthr.h still defines GTHREAD_USE_WEAK to 1 for newer glibc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114646

--- Comment #7 from H.J. Lu ---
r12-5108

commit 80fe172ba9820199c2bbce5d0611ffca27823049
Author: Jonathan Wakely
Date: Tue Nov 9 23:45:36 2021 +

    libstdc++: Disable gthreads weak symbols for glibc 2.34 [PR103133]

    Since Glibc 2.34 all pthreads symbols are defined directly in libc not
    libpthread, and since Glibc 2.32 we have used __libc_single_threaded to
    avoid unnecessary locking in single-threaded programs. This means there
    is no reason to avoid linking to libpthread now, and so no reason to use
    weak symbols defined in gthr-posix.h for all the pthread_xxx functions.

    libstdc++-v3/ChangeLog:

            PR libstdc++/100748
            PR libstdc++/103133
            * config/os/gnu-linux/os_defines.h (_GLIBCXX_GTHREAD_USE_WEAK):
            Define for glibc 2.34 and later.

fixed static C++ pthread programs. libgfortran needs a similar fix.
[Bug libgomp/39176] -static and -fopenmp and io causes segfault
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39176

H.J. Lu changed:

What    |Removed  |Added
CC      |         |skpgkp2 at gmail dot com
Status  |REOPENED |NEW

--- Comment #11 from H.J. Lu ---
r12-5108

commit 80fe172ba9820199c2bbce5d0611ffca27823049
Author: Jonathan Wakely
Date: Tue Nov 9 23:45:36 2021 +

    libstdc++: Disable gthreads weak symbols for glibc 2.34 [PR103133]

    Since Glibc 2.34 all pthreads symbols are defined directly in libc not
    libpthread, and since Glibc 2.32 we have used __libc_single_threaded to
    avoid unnecessary locking in single-threaded programs. This means there
    is no reason to avoid linking to libpthread now, and so no reason to use
    weak symbols defined in gthr-posix.h for all the pthread_xxx functions.

    libstdc++-v3/ChangeLog:

            PR libstdc++/100748
            PR libstdc++/103133
            * config/os/gnu-linux/os_defines.h (_GLIBCXX_GTHREAD_USE_WEAK):
            Define for glibc 2.34 and later.

fixed static C++ pthread programs. libgfortran needs a similar fix.
[Bug libgomp/39176] -static and -fopenmp and io causes segfault
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39176 H.J. Lu changed: What|Removed |Added Ever confirmed|0 |1 See Also||https://sourceware.org/bugz ||illa/show_bug.cgi?id=5784 Last reconfirmed||2024-04-08 Resolution|INVALID |--- Status|RESOLVED|REOPENED --- Comment #10 from H.J. Lu --- Reopened.
[Bug libfortran/114646] libgfortran doesn't work with static libpthread
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114646 H.J. Lu changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Last reconfirmed||2024-04-08 --- Comment #1 from H.J. Lu --- See https://sourceware.org/bugzilla/show_bug.cgi?id=5784#c10 for more info.
[Bug libfortran/114646] New: libgfortran doesn't work with static libpthread
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114646

Bug ID: 114646
Summary: libgfortran doesn't work with static libpthread
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libfortran
Assignee: unassigned at gcc dot gnu.org
Reporter: hjl.tools at gmail dot com
Target Milestone: ---

[hjl@gnu-cfl-3 tmp]$ cat x.f90
use omp_lib
implicit none
integer, parameter :: NT = 4
integer :: nThreads(NT)
print *, 'Call omp_set_dynamic'
!$ call omp_set_dynamic(.false.)
print *, 'Call omp_set_num_threads'
!$ call omp_set_num_threads(NT)
print *, 'Now enter the parallel region'
!$omp parallel default(none) shared(nThreads)
nThreads(omp_get_thread_num()+1) = omp_get_num_threads()
!$omp end parallel
print*, nThreads
END
[hjl@gnu-cfl-3 tmp]$ gfortran -static -fopenmp x.f90
/usr/local/bin/ld: /usr/lib/gcc/x86_64-redhat-linux/13/libgomp.a(target.o): in function `gomp_target_init.part.0':
(.text+0x4d6): warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
[hjl@gnu-cfl-3 tmp]$ ./a.out
Call omp_set_dynamic
Call omp_set_num_threads
Now enter the parallel region
4 4 4 4

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

Program received signal SIGABRT: Process abort signal.

Backtrace for this error:

Program received signal SIGABRT: Process abort signal.

Backtrace for this error:
Segmentation fault (core dumped)
[hjl@gnu-cfl-3 tmp]$
[Bug target/114590] [14 Regression] FAIL: gcc.target/i386/apx-ndd-ti-shift.c (test for excess errors)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114590 H.J. Lu changed: What|Removed |Added Status|WAITING |RESOLVED Resolution|--- |FIXED --- Comment #5 from H.J. Lu --- Fixed.
[Bug gcov-profile/114599] [14 Regression] ICE: SIGSEGV in bitmap_set_bit(bitmap_head*, int) (bitmap.cc:975) with -O2 -fcondition-coverage
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114599 H.J. Lu changed: What|Removed |Added Status|RESOLVED|REOPENED Resolution|FIXED |--- --- Comment #6 from H.J. Lu --- Not fixed.
[Bug gcov-profile/114599] [14 Regression] ICE: SIGSEGV in bitmap_set_bit(bitmap_head*, int) (bitmap.cc:975) with -O2 -fcondition-coverage
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114599

H.J. Lu changed:

What    |Removed |Added
CC      |        |hjl.tools at gmail dot com

--- Comment #5 from H.J. Lu ---
Created attachment 57888
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57888&action=edit
A testcase

The bug isn't fixed:

[hjl@gnu-tgl-3 gcc]$ /export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/ /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.misc-tests/gcov-24.c -fdiagnostics-plain-output -O2 -fcondition-coverage -S -o gcov-24.s
during IPA pass: profile
/export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.misc-tests/gcov-24.c: In function ‘do_all_fn_LHASH_DOALL_ARG_arg2’:
/export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.misc-tests/gcov-24.c:20:1: internal compiler error: Segmentation fault
0x16ccfa6 crash_signal
        /export/gnu/import/git/sources/gcc/gcc/toplev.cc:319
0x180579d hash_table, unsigned int> >::hash_entry, false, xcallocator>::find_with_hash(gcond* const&, unsigned int)
        /export/gnu/import/git/sources/gcc/gcc/hash-table.h:983
0x1804c87 hash_map, unsigned int> >::get(gcond* const&)
        /export/gnu/import/git/sources/gcc/gcc/hash-map.h:191
0x17fdbf8 condition_uid
        /export/gnu/import/git/sources/gcc/gcc/tree-profile.cc:370
0x17ff420 find_conditions(function*)
        /export/gnu/import/git/sources/gcc/gcc/tree-profile.cc:877
0x158b963 branch_prob(bool)
        /export/gnu/import/git/sources/gcc/gcc/profile.cc:1549
0x1802b86 tree_profiling
        /export/gnu/import/git/sources/gcc/gcc/tree-profile.cc:1917
0x1803210 execute
        /export/gnu/import/git/sources/gcc/gcc/tree-profile.cc:2046
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
[hjl@gnu-tgl-3 gcc]$
[Bug target/114590] [14 Regression] FAIL: gcc.target/i386/apx-ndd-ti-shift.c (test for excess errors)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114590 H.J. Lu changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |MOVED See Also||https://sourceware.org/bugz ||illa/show_bug.cgi?id=31606 --- Comment #1 from H.J. Lu --- An assembler bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31606
[Bug target/114590] [14 Regression] FAIL: gcc.target/i386/apx-ndd-ti-shift.c (test for excess errors)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114590 H.J. Lu changed: What|Removed |Added Priority|P3 |P2 Target Milestone|--- |14.0 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Last reconfirmed||2024-04-04
[Bug target/114590] New: [14 Regression] FAIL: gcc.target/i386/apx-ndd-ti-shift.c (test for excess errors)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114590 Bug ID: 114590 Summary: [14 Regression] FAIL: gcc.target/i386/apx-ndd-ti-shift.c (test for excess errors) Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: hjl.tools at gmail dot com CC: crazylht at gmail dot com Target Milestone: --- Target: x86-64 On x86-64, r14-9788-gb7bd2ec73d66f7 gave Executing on host: /export/build/gnu/tools-build/gcc-x32-gitlab/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/tools-build/gcc-x32-gitlab/build-x86_64-linux/gcc/ /export/gnu/import/git/gitlab/x86-gcc/gcc/testsuite/gcc.target/i386/apx-ndd-ti-shift.c -fdiagnostics-plain-output -O2 -lm -o ./apx-ndd-ti-shift.exe(timeout = 300) spawn -ignore SIGHUP /export/build/gnu/tools-build/gcc-x32-gitlab/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/tools-build/gcc-x32-gitlab/build-x86_64-linux/gcc/ /export/gnu/import/git/gitlab/x86-gcc/gcc/testsuite/gcc.target/i386/apx-ndd-ti-shift.c -fdiagnostics-plain-output -O2 -lm -o ./apx-ndd-ti-shift.exe /tmp/ccVIKjlx.s: Assembler messages: /tmp/ccVIKjlx.s:13: Error: operand type mismatch for `shld' /tmp/ccVIKjlx.s:50: Error: operand type mismatch for `shrd' /tmp/ccVIKjlx.s:91: Error: operand type mismatch for `shrd' compiler exited with status 1 FAIL: gcc.target/i386/apx-ndd-ti-shift.c (test for excess errors)
[Bug target/114587] -mapxf should define a macro
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114587 --- Comment #1 from H.J. Lu --- We should define a macro for each APX command-line option.
[Bug target/114587] -mapxf should define a macro
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114587 H.J. Lu changed: What|Removed |Added Target Milestone|--- |14.0 Ever confirmed|0 |1 Priority|P3 |P2 Last reconfirmed||2024-04-04 Status|UNCONFIRMED |NEW
[Bug target/114587] New: -mapxf should define a macro
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114587 Bug ID: 114587 Summary: -mapxf should define a macro Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: hjl.tools at gmail dot com CC: crazylht at gmail dot com Target Milestone: --- Target: x86-64 -mapxf should define a macro to indicate APX is enabled.
[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115 H.J. Lu changed: What|Removed |Added Known to work||14.0 --- Comment #14 from H.J. Lu --- Fixed for GCC 14 so far
[Bug lto/114337] LTO symbol table doesn't include builtin functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114337 H.J. Lu changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |MOVED --- Comment #4 from H.J. Lu --- Will fix it in linker.
[Bug lto/114337] LTO symbol table doesn't include builtin functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114337 --- Comment #1 from H.J. Lu --- Maybe linker can deal with it.
[Bug lto/114337] New: LTO symbol table doesn't include builtin functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114337

Bug ID: 114337
Summary: LTO symbol table doesn't include builtin functions
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: lto
Assignee: unassigned at gcc dot gnu.org
Reporter: hjl.tools at gmail dot com
Target Milestone: ---

[hjl@gnu-cfl-3 pr31482-a]$ cat y.c
#include
#include

void *
foo (size_t n)
{
  printf ("hello\n");
  return malloc (n);
}
[hjl@gnu-cfl-3 pr31482-a]$ gcc -flto -c y.c
[hjl@gnu-cfl-3 pr31482-a]$ nm y.o
T foo
[hjl@gnu-cfl-3 pr31482-a]$ lto-dump -list y.o
Type      Visibility  Size  Name
function  default     0     puts
function  default     0     malloc
function  default     4     foo
[hjl@gnu-cfl-3 pr31482-a]$

This doesn't work with libraries which provide alternative implementations
for standard functions, like jemalloc, since the linker doesn't know that the
builtin functions are referenced. Unless GCC can inline these builtin
functions, these symbols should be in the LTO symbol table.
[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116

--- Comment #12 from H.J. Lu ---
(In reply to Lukas Grätz from comment #11)
>
> I applied it, double checked, make distclean, configure, make again.
>
> But your result seems different. Have you applied Jakub Jelinek's patch to
> save %rbp?

No.

> I applied both patches. Perhaps there was some subtle
> merge-conflict with the two patches.

Please try just my patch.
[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116

--- Comment #10 from H.J. Lu ---
(In reply to Lukas Grätz from comment #9)
>
> Not on my computer. When I used -g I got:
>
>
> no_return_to_caller:
> .LFB0:
>         .loc 1 16 1 view -0
>         .cfi_startproc
>         .loc 1 17 3 view .LVU1
>         .loc 1 18 3 view .LVU2
> .LVL0:
>         .loc 1 18 26 discriminator 1 view .LVU3
>         .loc 1 16 1 is_stmt 0 view .LVU4
>         pushq %rbp
>         .cfi_def_cfa_offset 16
>         .cfi_offset 6, -16
>         movl $array+67108860, %eax
>         .loc 1 21 31 view .LVU5
>         xorl %r13d, %r13d
>         .loc 1 16 1 view .LVU6
>
>
> Still no .cfi_undefined 13. In principle, it should also be generated
> without -g, as the rest of .cfi_offset and friends.

Did you apply my patch? I got

        .globl no_return_to_caller
        .type no_return_to_caller, @function
no_return_to_caller:
.LFB0:
        .file 1 "pr38534-1.c"
        .loc 1 16 1 view -0
        .cfi_startproc
        .loc 1 17 3 view .LVU1
        .loc 1 18 3 view .LVU2
.LVL0:
        .loc 1 18 26 discriminator 1 view .LVU3
        .loc 1 16 1 is_stmt 0 view .LVU4
        subq $24, %rsp
        .cfi_undefined 15
        .cfi_undefined 14
        .cfi_undefined 13
        .cfi_undefined 12
        .cfi_undefined 6
        ...
[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116 --- Comment #8 from H.J. Lu --- (In reply to Lukas Grätz from comment #7) > (In reply to H.J. Lu from comment #6) > > (In reply to Jakub Jelinek from comment #5) > > > Yeah. Not to mention, one can call backtrace even if -g0; you just don't > > > get nice names for the addresses. Without the patch you get crashes in > > > the > > > unwinder when doing backtrace. > > > > Should we generate REG_CFA_UNDEFINED for unsaved callee-saved registers to > > help unwinder: > > > > https://patchwork.sourceware.org/project/gcc/list/?series=30327 > > Yes. Also for gdb this is needed. > > Perhaps I did something wrong. On my computer, I could get the first patch > working to save rbp, I also applied the patch which should omit the > .cfi_undefined. But somehow, I still not get .cfi_undefined for any of the > examples. > > > $ ./gcc/host-x86_64-pc-linux-gnu/gcc/cc1 -O3 > gcc/gcc/testsuite/gcc.target/i386/pr38534-7.c -o pr38534-7.S > > $ cat pr38534-7.S > [...] > no_return_to_caller: > .LFB0: > .cfi_startproc > pushq %rbp > .cfi_def_cfa_offset 16 > .cfi_offset 6, -16 > movl$array+67108860, %eax > xorl%r13d, %r13d > [...] > > > The ".cfi_undefined 13" is still missing... It is generated only when -g is used.
[Bug target/114098] _tile_loadconfig doesn't work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098 H.J. Lu changed: What|Removed |Added Target Milestone|--- |11.5 Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #7 from H.J. Lu --- Fixed for 11.5, 12.4, 13.3 and 14.
[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115 --- Comment #8 from H.J. Lu --- A patch is posted at https://patchwork.sourceware.org/project/gcc/list/?series=31343
[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116 H.J. Lu changed: What|Removed |Added Status|ASSIGNED|NEW Assignee|hjl.tools at gmail dot com |unassigned at gcc dot gnu.org --- Comment #6 from H.J. Lu --- (In reply to Jakub Jelinek from comment #5) > Yeah. Not to mention, one can call backtrace even if -g0; you just don't > get nice names for the addresses. Without the patch you get crashes in the > unwinder when doing backtrace. Should we generate REG_CFA_UNDEFINED for unsaved callee-saved registers to help unwinder: https://patchwork.sourceware.org/project/gcc/list/?series=30327
[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116 --- Comment #3 from H.J. Lu --- (In reply to Jakub Jelinek from comment #2) > Created attachment 57545 [details] > gcc14-pr114116.patch > > This seems to fix it, so far tested just on the small testcase, back to the > expected backtrace there. Should we check -g? Without -g, I don't think we need to save FP.
[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115 --- Comment #7 from H.J. Lu --- Created attachment 57544 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57544&action=edit A patch
[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115 H.J. Lu changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Ever confirmed|0 |1 Last reconfirmed||2024-02-26 Target Milestone|--- |14.0 Assignee|unassigned at gcc dot gnu.org |hjl.tools at gmail dot com
[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116 H.J. Lu changed: What|Removed |Added Last reconfirmed||2024-02-26 Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED Assignee|unassigned at gcc dot gnu.org |hjl.tools at gmail dot com
[Bug target/114097] Missed register optimization in _Noreturn functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114097 H.J. Lu changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #7 from H.J. Lu --- Fixed.
[Bug target/114097] Missed register optimization in _Noreturn functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114097 --- Comment #5 from H.J. Lu --- A patch is submitted: https://patchwork.sourceware.org/project/gcc/list/?series=31294
[Bug target/114098] _tile_loadconfig doesn't work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098 H.J. Lu changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Last reconfirmed||2024-02-25 --- Comment #2 from H.J. Lu --- We should tell GCC that 64 bytes will be accessed by ldtilecfg and sttilecfg.
[Bug target/114098] _tile_loadconfig doesn't work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098 --- Comment #1 from H.J. Lu --- The problem is that in extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _tile_loadconfig (const void *__config) { __asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config))); } only 8 bytes are used.
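For an "m" operand, GCC takes the extent of the memory access from the type of the operand expression. The sketch below only compares the two sizes involved and does not execute ldtilecfg. The struct tile_cfg64 and both helper names are hypothetical, introduced for illustration, and this is not the patch that went into GCC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the 64-byte tile configuration block.  */
struct tile_cfg64
{
  uint8_t bytes[64];
};

/* Size of the object named by the operand in the reported intrinsic:
   *((const void **) p) is a single pointer-sized object.  */
size_t
narrow_operand_size (const void *p)
{
  return sizeof (*((const void **) p));
}

/* Size of the object when the operand is typed as the whole block.  */
size_t
wide_operand_size (const void *p)
{
  return sizeof (*((const struct tile_cfg64 *) p));
}
```

On x86-64 the first helper returns 8 and the second 64, which is consistent with the compiler treating only the first 8 bytes of tile_data as read by the asm.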
[Bug target/114098] New: _tile_loadconfig doesn't work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098
Bug ID: 114098
Summary: _tile_loadconfig doesn't work
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: hjl.tools at gmail dot com
CC: crazylht at gmail dot com
Target Milestone: ---
Target: x86-64
[hjl@gnu-cfl-3 amx-1]$ cat foo.c
#include <stdint.h>
#include <immintrin.h>
#define MAX_ROWS 16
#define MAX_COLS 64
#define MAX 1024
#define STRIDE 64
typedef struct __tile_config
{
  uint8_t palette_id;
  uint8_t start_row;
  uint8_t reserved_0[14];
  uint16_t colsb[16];
  uint8_t rows[16];
} __tilecfg;
extern void bar (__tilecfg *tileinfo);
/* Initialize tile config */
static void
init_tile_config (__tilecfg *tileinfo)
{
  int i;
  tileinfo->palette_id = 1;
  tileinfo->start_row = 0;
  for (i = 0; i < 1; ++i)
    {
      tileinfo->colsb[i] = MAX_ROWS;
      tileinfo->rows[i] = MAX_ROWS;
    }
  for (i = 1; i < 4; ++i)
    {
      tileinfo->colsb[i] = MAX_COLS;
      tileinfo->rows[i] = MAX_ROWS;
    }
  _tile_loadconfig (tileinfo);
}
void
enable_amx (void)
{
  __tilecfg tile_data = {0};
  init_tile_config (&tile_data);
}
[hjl@gnu-cfl-3 amx-1]$ gcc -S -O2 -mamx-tile foo.c
[hjl@gnu-cfl-3 amx-1]$ cat foo.s
        .file "foo.c"
        .text
        .p2align 4
        .globl enable_amx
        .type enable_amx, @function
enable_amx:
.LFB6615:
        .cfi_startproc
        movl $1, %eax    <<<<<<<<<<<<< tile_data isn't properly initialized.
        movw %ax, -72(%rsp)
#APP
# 42 "/usr/lib/gcc/x86_64-redhat-linux/13/include/amxtileintrin.h" 1
        ldtilecfg -72(%rsp)
# 0 "" 2
#NO_APP
        ret
        .cfi_endproc
.LFE6615:
        .size enable_amx, .-enable_amx
        .ident "GCC: (GNU) 13.2.1 20231205 (Red Hat 13.2.1-6)"
        .section .note.GNU-stack,"",@progbits
[hjl@gnu-cfl-3 amx-1]$
[Bug target/114097] Missed register optimization in _Noreturn functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114097
H.J. Lu changed:
What             |Removed                        |Added
Component        |c                              |target
Target Milestone |---                            |14.0
Assignee         |unassigned at gcc dot gnu.org  |hjl.tools at gmail dot com
[Bug c/114097] Missed register optimization in _Noreturn functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114097 --- Comment #3 from H.J. Lu ---
Created attachment 57524 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57524&action=edit
A patch
[Bug c/114097] Missed register optimization in _Noreturn functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114097 --- Comment #2 from H.J. Lu --- I couldn't find a way to access the _Noreturn info in backend.
[Bug c/114097] Missed register optimization in _Noreturn functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114097
H.J. Lu changed:
What             |Removed                        |Added
Last reconfirmed |                               |2024-02-25
Version          |unknown                        |14.0
CC               |                               |hjl.tools at gmail dot com
Status           |UNCONFIRMED                    |NEW
Ever confirmed   |0                              |1
--- Comment #1 from H.J. Lu ---
__attribute__((noreturn)) works in GCC 14:
[hjl@gnu-cfl-3 tmp]$ cat y.c
#include <stdio.h>
#include <setjmp.h>
//_Noreturn
__attribute__((noreturn))
void noret(unsigned A, unsigned B, unsigned C, unsigned D, unsigned E, jmp_buf Jb){
    for(;A--;) puts("A");
    for(;B--;) puts("B");
    for(;C--;) puts("C");
    for(;D--;) puts("D");
    for(;E--;) puts("E");
    longjmp(Jb,1);
}
[hjl@gnu-cfl-3 tmp]$ /usr/gcc-14.0.1-x32/bin/gcc -S -O2 y.c
[hjl@gnu-cfl-3 tmp]$ cat y.s
        .file "y.c"
        .text
        .section .rodata.str1.1,"aMS",@progbits,1
.LC0:
        .string "A"
.LC1:
        .string "B"
.LC2:
        .string "C"
.LC3:
        .string "D"
.LC4:
        .string "E"
        .text
        .p2align 4
        .globl noret
        .type noret, @function
noret:
.LFB11:
        .cfi_startproc
        subq $8, %rsp
        .cfi_def_cfa_offset 16
        movl %esi, %r15d
        movl %edx, %r14d
        movl %ecx, %r13d
        movl %r8d, %ebp
        movq %r9, %r12
        testl %edi, %edi
        je .L2
        leal -1(%rdi), %ebx
        .p2align 4,,10
        .p2align 3
.L3:
        movl $.LC0, %edi
        call puts
        subl $1, %ebx
        jnb .L3
.L2:
        leal -1(%r15), %ebx
        testl %r15d, %r15d
        je .L4
        .p2align 4,,10
        .p2align 3
.L5:
        movl $.LC1, %edi
        call puts
        subl $1, %ebx
        jnb .L5
.L4:
        leal -1(%r14), %ebx
        testl %r14d, %r14d
        je .L6
        .p2align 4,,10
        .p2align 3
.L7:
        movl $.LC2, %edi
        call puts
        subl $1, %ebx
        jnb .L7
.L6:
        leal -1(%r13), %ebx
        testl %r13d, %r13d
        je .L8
        .p2align 4,,10
        .p2align 3
.L9:
        movl $.LC3, %edi
        call puts
        subl $1, %ebx
        jnb .L9
.L8:
        leal -1(%rbp), %ebx
        testl %ebp, %ebp
        je .L10
        .p2align 4,,10
        .p2align 3
.L11:
        movl $.LC4, %edi
        call puts
        subl $1, %ebx
        jnb .L11
.L10:
        movl $1, %esi
        movq %r12, %rdi
        call longjmp
        .cfi_endproc
.LFE11:
        .size noret, .-noret
        .ident "GCC: (GNU) 14.0.1 20240223 (experimental)"
        .section .note.GNU-stack,"",@progbits
[hjl@gnu-cfl-3 tmp]$
[Bug rtl-optimization/91161] [11/12/13/14 Regression] ICE in begin_move_insn, at sched-ebb.c:175
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91161 --- Comment #14 from H.J. Lu --- (In reply to Andrew Pinski from comment #13) > I looked into the IR between GCC 12 and GCC 13 (with the added attributes), > before sched2 there is no difference. So it would good to see what change > "fixes" this again. The bug went latent by r13-2726.
[Bug middle-end/113988] during GIMPLE pass: bitintlower: internal compiler error: in lower_stmt, at gimple-lower-bitint.cc:5470
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113988 --- Comment #13 from H.J. Lu --- (In reply to Jakub Jelinek from comment #11) > Though, bet that would mean we punt with -mavx -mno-avx2 on 32-byte copies, > because there we support just V8SFmode and not V32QImode. Punting on AVX without AVX2 shouldn't have any meaningful impact on codegen for real applications.
[Bug target/113912] push2/pop2 generated when stack isn't aligned to 16 bytes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113912
H.J. Lu changed:
What             |Removed                        |Added
Status           |NEW                            |RESOLVED
Resolution       |---                            |FIXED
--- Comment #3 from H.J. Lu ---
Fixed.
[Bug target/113855] [14 Regression] __gcc_nested_func_ptr_{created,deleted} exports from 32-bit libgcc_s.so.1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113855
H.J. Lu changed:
What             |Removed                        |Added
Status           |UNCONFIRMED                    |RESOLVED
Resolution       |---                            |FIXED
--- Comment #9 from H.J. Lu ---
Fixed.
[Bug target/113912] push2/pop2 generated when stack isn't aligned to 16 bytes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113912
H.J. Lu changed:
What             |Removed                        |Added
Target Milestone |---                            |14.0
Last reconfirmed |                               |2024-02-13
Status           |UNCONFIRMED                    |NEW
Ever confirmed   |0                              |1
--- Comment #1 from H.J. Lu ---
A patch is at https://patchwork.sourceware.org/project/gcc/list/?series=30889
[Bug target/113912] New: push2/pop2 generated when stack isn't aligned to 16 bytes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113912
Bug ID: 113912
Summary: push2/pop2 generated when stack isn't aligned to 16 bytes
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: hjl.tools at gmail dot com
CC: crazylht at gmail dot com
Target Milestone: ---
Target: x86-64
[hjl@gnu-cfl-3 apx-1]$ cat x.c
extern int bar (int);
void foo ()
{
  int a,b,c,d,e,f,i;
  a = bar (5);
  b = bar (a);
  c = bar (b);
  d = bar (c);
  e = bar (d);
  f = bar (e);
  for (i = 1; i < 10; i++)
  {
    a += bar (a + i) + bar (b + i) +
         bar (c + i) + bar (d + i) +
         bar (e + i) + bar (f + i);
  }
}
[hjl@gnu-cfl-3 apx-1]$ make
/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/ -mapxf -O2 -mpreferred-stack-boundary=3 -fomit-frame-pointer -S x.c
[hjl@gnu-cfl-3 apx-1]$ cat x.s
        .file "x.c"
        .text
        .p2align 4
        .globl foo
        .type foo, @function
foo:
.LFB0:
        .cfi_startproc
        pushp %r15
        .cfi_def_cfa_offset 16
        .cfi_offset 15, -16
        movl $5, %edi
        push2p %r13, %r14
        .cfi_def_cfa_offset 32
        .cfi_offset 14, -24
        .cfi_offset 13, -32
        push2p %rbp, %r12
        .cfi_def_cfa_offset 48
        .cfi_offset 12, -40
        .cfi_offset 6, -48
        pushp %rbx
        .cfi_def_cfa_offset 56
        .cfi_offset 3, -56
        movl $1, %ebx
        subq $8, %rsp
        .cfi_def_cfa_offset 64
        call bar
        movl %eax, %edi
        movl %eax, %r12d
        call bar
        movl %eax, %edi
        movl %eax, %r15d
        call bar
        movl %eax, %edi
        movl %eax, %r14d
        call bar
        movl %eax, %edi
        movl %eax, %r13d
        call bar
        movl %eax, %edi
        movl %eax, (%rsp)
        call bar
        movl %eax, 4(%rsp)
        .p2align 4,,10
        .p2align 3
.L2:
        leal (%r12,%rbx), %edi
        call bar
        leal (%r15,%rbx), %edi
        movl %eax, %ebp
        call bar
        leal (%r14,%rbx), %edi
        addl %eax, %ebp
        call bar
        leal 0(%r13,%rbx), %edi
        addl %eax, %ebp
        call bar
        addl %ebx, (%rsp), %edi
        addl %eax, %ebp
        call bar
        addl %ebx, 4(%rsp), %edi
        addl $1, %ebx
        addl %eax, %ebp
        call bar
        addl %eax, %ebp
        addl %ebp, %r12d
        cmpl $10, %ebx
        jne .L2
        addq $8, %rsp
        .cfi_def_cfa_offset 56
        popp %rbx
        .cfi_def_cfa_offset 48
        pop2p %r12, %rbp
        .cfi_restore 12
        .cfi_restore 6
        .cfi_def_cfa_offset 32
        pop2p %r14, %r13
        .cfi_restore 14
        .cfi_restore 13
        .cfi_def_cfa_offset 16
        popp %r15
        .cfi_def_cfa_offset 8
        ret
        .cfi_endproc
.LFE0:
        .size foo, .-foo
        .ident "GCC: (GNU) 14.0.1 20240213 (experimental)"
        .section .note.GNU-stack,"",@progbits
[hjl@gnu-cfl-3 apx-1]$
With -mpreferred-stack-boundary=3, the incoming stack is only 8-byte aligned. push2/pop2 shouldn't be generated in this case.
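The alignment arithmetic behind this report can be sketched as follows. -mpreferred-stack-boundary=N asks for a 2^N-byte stack boundary, so N=3 gives 8 bytes, which is below the 16 bytes named in the summary. The predicate may_use_push2_pop2 is a hypothetical helper written for illustration and is not the check GCC uses.

```c
#include <assert.h>
#include <stdbool.h>

/* -mpreferred-stack-boundary=N requests a 2^N-byte stack boundary.  */
unsigned
stack_boundary_bytes (unsigned log2_boundary)
{
  return 1u << log2_boundary;
}

/* Hypothetical predicate: permit the paired push/pop forms only when
   the stack is known to be at least 16-byte aligned.  */
bool
may_use_push2_pop2 (unsigned log2_boundary)
{
  return stack_boundary_bytes (log2_boundary) >= 16;
}
```

With this sketch, a boundary of 3 rejects the paired forms and the x86-64 default of 4 allows them.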
[Bug target/113876] ICE: in ix86_expand_epilogue, at config/i386/i386.cc:10101 with -O -mpreferred-stack-boundary=3 -finstrument-functions -mapxf -mcmodel=large
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113876
H.J. Lu changed:
What             |Removed                        |Added
Status           |NEW                            |RESOLVED
Resolution       |---                            |FIXED
--- Comment #4 from H.J. Lu ---
Fixed.
[Bug target/113876] ICE: in ix86_expand_epilogue, at config/i386/i386.cc:10101 with -O -mpreferred-stack-boundary=3 -finstrument-functions -mapxf -mcmodel=large
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113876
H.J. Lu changed:
What             |Removed                        |Added
Status           |UNCONFIRMED                    |NEW
Last reconfirmed |                               |2024-02-13
Target Milestone |---                            |14.0
CC               |                               |crazylht at gmail dot com
Ever confirmed   |0                              |1
--- Comment #2 from H.J. Lu ---
A patch is at https://patchwork.sourceware.org/project/gcc/list/?series=30888
[Bug target/113909] gcc.target/i386/pr113689-1.c etc. FAIL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113909 --- Comment #1 from H.J. Lu --- It fails on Solaris because of: sol2.h:#undef NO_PROFILE_COUNTERS Just skip these tests for Solaris.
[Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874 --- Comment #38 from H.J. Lu --- The new glibc patch set covers both i386 and x86-64: https://patchwork.sourceware.org/project/glibc/list/?series=30854
[Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874 --- Comment #36 from H.J. Lu --- (In reply to Andreas Schwab from comment #35) > ld.so use its internal malloc only during bootstrapping. ___tls_get_addr always uses the internal malloc.
[Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874 --- Comment #34 from H.J. Lu --- (In reply to H.J. Lu from comment #33) > (In reply to H.J. Lu from comment #32) > > (In reply to Michael Matz from comment #31) > > > (In reply to H.J. Lu from comment #30) > > > > (In reply to Michael Matz from comment #29) > > > > > It not only can call malloc. As the backtrace of H.J. shows, it quite > > > > > clearly _does_ so :-) > > > > > > > > ld.so can only call the malloc implementation internal to ld.so. > > > > > > (And string functions for initializing that memory) If that's ensured > > > already > > > everywhere: super. Because I agree, that this is the best thing to do > > > here. > > > From my perspective this is pure internal implementation details and hence > > > setting up thread-local areas should not be expected to be interposable by > > > users. > > > (a custom allocator that isn't malloc or doesn't interact with it also > > > would > > > work) > > > > Since ia32 ld.so in glibc is compiled with: > > > > Makefile:rtld-CFLAGS += -mno-sse -mno-mmx -mfpmath=387 > > > > ia32 _dl_tlsdesc_dynamic is OK. > > 387 registers may be an issue. I checked ld.so. It doesn't use 387 registers.
[Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874 --- Comment #33 from H.J. Lu --- (In reply to H.J. Lu from comment #32) > (In reply to Michael Matz from comment #31) > > (In reply to H.J. Lu from comment #30) > > > (In reply to Michael Matz from comment #29) > > > > It not only can call malloc. As the backtrace of H.J. shows, it quite > > > > clearly _does_ so :-) > > > > > > ld.so can only call the malloc implementation internal to ld.so. > > > > (And string functions for initializing that memory) If that's ensured > > already > > everywhere: super. Because I agree, that this is the best thing to do here. > > From my perspective this is pure internal implementation details and hence > > setting up thread-local areas should not be expected to be interposable by > > users. > > (a custom allocator that isn't malloc or doesn't interact with it also would > > work) > > Since ia32 ld.so in glibc is compiled with: > > Makefile:rtld-CFLAGS += -mno-sse -mno-mmx -mfpmath=387 > > ia32 _dl_tlsdesc_dynamic is OK. 387 registers may be an issue.
[Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874 --- Comment #32 from H.J. Lu --- (In reply to Michael Matz from comment #31) > (In reply to H.J. Lu from comment #30) > > (In reply to Michael Matz from comment #29) > > > It not only can call malloc. As the backtrace of H.J. shows, it quite > > > clearly _does_ so :-) > > > > ld.so can only call the malloc implementation internal to ld.so. > > (And string functions for initializing that memory) If that's ensured > already > everywhere: super. Because I agree, that this is the best thing to do here. > From my perspective this is pure internal implementation details and hence > setting up thread-local areas should not be expected to be interposable by > users. > (a custom allocator that isn't malloc or doesn't interact with it also would > work) Since ia32 ld.so in glibc is compiled with: Makefile:rtld-CFLAGS += -mno-sse -mno-mmx -mfpmath=387 ia32 _dl_tlsdesc_dynamic is OK.
[Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874 --- Comment #30 from H.J. Lu --- (In reply to Michael Matz from comment #29) > It not only can call malloc. As the backtrace of H.J. shows, it quite > clearly _does_ so :-) > ld.so can only call the malloc implementation internal to ld.so.
[Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874 --- Comment #28 from H.J. Lu --- (In reply to Jakub Jelinek from comment #27) > (In reply to H.J. Lu from comment #26) > > Even if I compile ia32 glibc with -march=skylake, the _dl_tlsdesc_dynamic > > slow > > path doesn't touch XMM registers at all. > > I thought Florian said it can call malloc and malloc can be user provided > and can use SSE2, 387/MMX or whatever other call clobbered registers ia32 > has. [hjl@gnu-cfl-3 elf]$ readelf -rW ld.so Relocation section '.rel.dyn' at offset 0x9f8 contains 3 entries: Offset InfoTypeSym. Value Symbol's Name 00032fe0 1a06 R_386_GLOB_DAT 00031ac0 __rseq_offset@@GLIBC_2.35 00032fe4 1f06 R_386_GLOB_DAT 00031ac4 __rseq_size@@GLIBC_2.35 00032b20 002a R_386_IRELATIVE Relocation section '.relr.dyn' at offset 0xa10 contains 3 entries: 12 offsets 00031a60 00032ed0 00032ed8 00032f04 00032f08 00032f0c 00032f10 00032f14 00032f18 00032f1c 00032f20 00032f24 [hjl@gnu-cfl-3 elf]$ You can't use another malloc for the ld.so internal usage of malloc/calloc.
[Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874 --- Comment #26 from H.J. Lu --- (In reply to Jakub Jelinek from comment #25) > (In reply to H.J. Lu from comment #23) > > > And i386/dl-tlsdesc.S needs to save/restore 387 and SSE regs? > > > > i386 doesn't preserve them in _dl_runtime_resolve nor _dl_tlsdesc_dynamic. > > That is different. _dl_runtime_resolve happens only at the start of calls > to functions, if in all supported ia32 ABIs all of i387 state is unsupported > upon entering functions, then there is no need to save anything. > While _dl_tlsdesc_dynamic can happen anywhere from within functions and > doesn't clobber any registers except ax which gets the value, so I think it > needs to be saved for that case. I couldn't find a test to show it is needed on i386: #0 __GI___libc_malloc (bytes=3200) at malloc.c:3294 #1 0xf7fdb771 in malloc (size=) at ../include/rtld-malloc.h:56 #2 allocate_dtv_entry (size=, alignment=4) at dl-tls.c:679 #3 allocate_and_init (map=0xf6e00670) at dl-tls.c:704 #4 tls_get_addr_tail (ti=0xf6e00a30, dtv=0x5655fcd8, the_map=0xf6e00670) at dl-tls.c:904 #5 0xf7fdf5d5 in _dl_tlsdesc_dynamic () at ../sysdeps/i386/dl-tlsdesc.S:129 #6 0xf7fb017b in apply_tls (p=0xf7a0037c) at tst-gnu2-tls2mod1.c:26 #7 0x5655769b in access_mod (i=1, sym=0x5655a026 "apply_tls") at ../sysdeps/i386/i686/tst-gnu2-tls2-i686.c:55 #8 start (arg=0x0) at ../sysdeps/i386/i686/tst-gnu2-tls2-i686.c:70 #9 0xf7c96207 in start_thread (arg=) at pthread_create.c:447 #10 0xf7d3dc08 in clone3 () at ../sysdeps/unix/sysv/linux/i386/clone3.S:111 Even if I compile ia32 glibc with -march=skylake, the _dl_tlsdesc_dynamic slow path doesn't touch XMM registers at all.
[Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874
H.J. Lu changed:
What             |Removed                        |Added
Resolution       |---                            |MOVED
Status           |NEW                            |RESOLVED
--- Comment #24 from H.J. Lu ---
Moved to glibc.
[Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874 --- Comment #23 from H.J. Lu --- (In reply to Jakub Jelinek from comment #22) > BTW, does aarch64 dl-tlsdesc.S save SVE/SME register state (I only see fixed > offsets in there), or are those call-saved? > What about floating point registers in x86_64/dl-tlsdesc.S? Floating point registers are preserved with my glibc patch. > And i386/dl-tlsdesc.S needs to save/restore 387 and SSE regs? i386 doesn't preserve them in _dl_runtime_resolve nor _dl_tlsdesc_dynamic.
[Bug tree-optimization/113752] [14 Regression] warning: ‘%s’ directive writing up to 10218 bytes into a region of size between 0 and 10240 [-Wformat-overflow=] since r14-261-g0ef3756adf078c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113752 --- Comment #6 from H.J. Lu --- I can reproduce it with r14-8930-g1e94648ab7b370
[Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874 --- Comment #21 from H.J. Lu --- (In reply to Florian Weimer from comment #20) > (In reply to H.J. Lu from comment #19) > > (In reply to Florian Weimer from comment #9) > > > (In reply to H.J. Lu from comment #7) > > > > > The __tls_get_addr call with the default approach potentially needs > > > > > to solve > > > > > the same problem, doesn't it? > > > > > > > > Isn't __tls_get_addr called via the PLT entry? > > > > > > I'm not sure if that matters? Even if the lazy binding trampoline is > > > active, > > > it won't protect the actual call. > > > > Non-GNU2 TLS has > > > > 4000 00010007 R_X86_64_JUMP_SLOT > > __tls_get_addr + 1010 > > > > which calls _dl_runtime_resolve with lazy binding. _dl_runtime_resolve > > preserves all caller-saved registers. > > The dynamic linker preserves register contents during lazy binding and > restores them before calling __tls_get_addr, so it doesn't help with > __tls_get_addr register usage itself. And lazy binding happens only once per > process and object, while we need to protect the first call on every thread. Only the call made from _dl_tlsdesc_dynamic isn't protected. My glibc patch: https://patchwork.sourceware.org/project/glibc/list/?series=30800 fixes it.
[Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874 --- Comment #19 from H.J. Lu --- (In reply to Florian Weimer from comment #9) > (In reply to H.J. Lu from comment #7) > > > The __tls_get_addr call with the default approach potentially needs to > > > solve > > > the same problem, doesn't it? > > > > Isn't __tls_get_addr called via the PLT entry? > > I'm not sure if that matters? Even if the lazy binding trampoline is active, > it won't protect the actual call. Non-GNU2 TLS has 4000 00010007 R_X86_64_JUMP_SLOT __tls_get_addr + 1010 which calls _dl_runtime_resolve with lazy binding. _dl_runtime_resolve preserves all caller-saved registers.
[Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874 --- Comment #7 from H.J. Lu --- (In reply to Florian Weimer from comment #6) > > (In reply to H.J. Lu from comment #4) > > > (In reply to H.J. Lu from comment #3) > > > > Created attachment 57385 [details] > > > > A patch > > > > > > > > Try this. > > > > > > This doesn't work properly. To work around in ld.so, _dl_tlsdesc_dynamic > > > needs to save and restore ALL registers, which can be expensive. > > Why doesn't this work properly? Is it possible to make it work with a > different approach? Clobber must be attached to TLS descriptor call insn. > The __tls_get_addr call with the default approach potentially needs to solve > the same problem, doesn't it? Isn't __tls_get_addr called via the PLT entry? > (In reply to Jakub Jelinek from comment #5) > > Or it could be compiled with options to make sure it doesn't use vector > > registers etc., and only save/restore if it needs to call into some code > > where libc can't afford that (say allocate memory). > > We currently call into malloc, which could be a replacement malloc. If GCC > cannot be fixed, full context switch or elimination of the slow path are our > best options for a glibc-side fix. We should open a glibc bug. I am working on the glibc fix.
[Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874
H.J. Lu changed:
What             |Removed                        |Added
Status           |UNCONFIRMED                    |NEW
Ever confirmed   |0                              |1
Last reconfirmed |                               |2024-02-11
--- Comment #4 from H.J. Lu ---
(In reply to H.J. Lu from comment #3)
> Created attachment 57385 [details]
> A patch
>
> Try this.
This doesn't work properly. To work around it in ld.so, _dl_tlsdesc_dynamic needs to save and restore ALL registers, which can be expensive.
[Bug target/113874] GNU2 TLS descriptor calls do not follow psABI on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874 --- Comment #3 from H.J. Lu ---
Created attachment 57385 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57385&action=edit
A patch
Try this.
[Bug target/113837] Zeroing unused bits in _BitInt can improve codegen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113837 --- Comment #9 from H.J. Lu --- (In reply to Jakub Jelinek from comment #7) > (In reply to H.J. Lu from comment #5) > > (In reply to Jakub Jelinek from comment #1) > > > Ugh no, please don't. > > > This is significant ABI change. > > > First of all, zeroing even for signed _BitInt is very weird, sign > > > extension > > > for that case is more natural, but when _BitInt doesn't have any > > > unspecified > > > bits, everything that computes them will need to compute even the extra > > > bits. That is not the case in the current code. > > > > Can we compare zeroing and undefined codegen of unused bits for storing > > signed _BitInt? > > Not easily, the bitint_info::extended support isn't there yet (as no target > needed it so far). See also the discussions about it on IRC and aarch64 > _BitInt support thread (aarch64 wants to have the extra bits unspecified, > but arm 32 extended). > > > Then implement whatever is appropriate in GCC and make it the de facto ABI. > > So what's wrong with > https://gitlab.com/x86-psABIs/i386-ABI/-/issues/5 > ? Has it been discussed, or is i386-ABI dead? The i386 psABI is not actively maintained. > I'd probably go with 32-bit limbs for _BitInt(65) and higher instead of > 64-bit, > but under the hood that is how it will be implemented no matter what the ABI > says, > whether it is 32-bit limbs or 64-bit limbs only affects a) the alignment b) > how much is wasted in case of say _BitInt(65) or _BitInt(129) etc. and what > the sizeof is. > Even if limbs are 64-bit, the question is about alignment, ia32 has 32-bit > alignment for long long and double at least when used inside of structs, so > it would be weird to have different alignment from struct { limb l1, l2; } > and similar. Just implement whatever is appropriate in GCC. We will document it.
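The limb-size trade-off quoted above comes down to simple arithmetic. The helpers below are hypothetical and model only "round the bit count up to a whole number of limbs". They ignore any extra alignment padding an ABI might add, so they are a sketch of the wasted space, not a statement of what any psABI specifies.

```c
#include <assert.h>
#include <stddef.h>

/* Number of limbs needed to hold BITS bits with LIMB_BITS-bit limbs.  */
size_t
limbs_needed (size_t bits, size_t limb_bits)
{
  return (bits + limb_bits - 1) / limb_bits;
}

/* Bytes of storage when the value occupies whole limbs only.  */
size_t
storage_bytes (size_t bits, size_t limb_bits)
{
  return limbs_needed (bits, limb_bits) * (limb_bits / 8);
}

/* Bits in the storage that are not value bits.  */
size_t
padding_bits (size_t bits, size_t limb_bits)
{
  return storage_bytes (bits, limb_bits) * 8 - bits;
}
```

Under these assumptions _BitInt(65) takes 12 bytes with 32-bit limbs and 16 bytes with 64-bit limbs, and _BitInt(129) takes 20 bytes versus 24.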
[Bug target/113837] Zeroing unused bits in _BitInt can improve codegen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113837 --- Comment #6 from H.J. Lu --- (In reply to Jakub Jelinek from comment #4) > (In reply to H.J. Lu from comment #3) > > (In reply to Jakub Jelinek from comment #2) > > > OT, what is the state of the ia32 _BitInt ABI? I'd really like to enable > > > it > > > in GCC 14 even for ia32 (and perhaps -mx32 if you care about that case). > > > > I think we should leave ia32 alone. > > You mean never support C23 on it? Then implement whatever is appropriate in GCC and make it the de facto ABI.
[Bug target/113837] Zeroing unused bits in _BitInt can improve codegen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113837 --- Comment #5 from H.J. Lu --- (In reply to Jakub Jelinek from comment #1) > Ugh no, please don't. > This is significant ABI change. > First of all, zeroing even for signed _BitInt is very weird, sign extension > for that case is more natural, but when _BitInt doesn't have any unspecified > bits, everything that computes them will need to compute even the extra > bits. That is not the case in the current code. Can we compare zeroing and undefined codegen of unused bits for storing signed _BitInt?
[Bug target/113837] Zeroing unused bits in _BitInt can improve codegen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113837 --- Comment #3 from H.J. Lu --- (In reply to Jakub Jelinek from comment #2) > OT, what is the state of the ia32 _BitInt ABI? I'd really like to enable it > in GCC 14 even for ia32 (and perhaps -mx32 if you care about that case). I think we should leave ia32 alone. x32 uses the same psABI as x86-64.
[Bug target/113837] New: Zeroing unused bits in _BitInt can improve codegen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113837 Bug ID: 113837 Summary: Zeroing unused bits in _BitInt can improve codegen Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: hjl.tools at gmail dot com CC: crazylht at gmail dot com Target Milestone: --- Target: x86-64 I opened this x86-64 psABI issue: https://gitlab.com/x86-psABIs/x86-64-ABI/-/issues/16
[Bug target/113689] [11/12/13/14 Regression] wrong code with -fprofile -mcmodel=large when needing drap register since r11-6548
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113689 --- Comment #11 from H.J. Lu --- (In reply to Jakub Jelinek from comment #10) > > Just the second hunk. I think with sorry call the compilation fails, so what > you actually emit doesn't matter (one can see it with -pipe, sure). Done.
[Bug target/113689] [11/12/13/14 Regression] wrong code with -fprofile -mcmodel=large when needing drap register since r11-6548
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113689 --- Comment #9 from H.J. Lu --- Like this? diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index f02c6c02ac6..ed0b0e19985 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -22785,10 +22785,10 @@ x86_64_select_profile_regnum (bool r11_ok ATTRIBUTE_UNUSED) && !REGNO_REG_SET_P (reg_live, i return i; - sorry ("no register available for profiling %<-mcmodel=large%s%>", + sorry ("no register available for profiling %<-mcmodel=large%s%>, use r10", ix86_cmodel == CM_LARGE_PIC ? " -fPIC" : ""); - return INVALID_REGNUM; + return R10_REG; } /* Output assembler code to FILE to increment profiler label # LABELNO
[Bug target/113689] [11/12/13/14 Regression] wrong code with -fprofile -mcmodel=large when needing drap register since r11-6548
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113689 --- Comment #6 from H.J. Lu --- Fixed for GCC 14 so far.
[Bug tree-optimization/113752] [14 Regression] warning: ‘%s’ directive writing up to 10218 bytes into a region of size between 0 and 10240 [-Wformat-overflow=]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113752 --- Comment #2 from H.J. Lu ---
[hjl@gnu-skx-1 gcc]$ cat /tmp/foo.i
char a[10256];
char b;
char *c, *g;
int d, e, f;
int sprintf(char *, char *, ...);
unsigned long strlen(char *);
int h(char *j) {
  if (strlen(j) + strlen(c) + strlen(g) + 32 > 10256)
    return 0;
  sprintf(a, "%s:%s:%d:%d:%d:%c:%s\n", j, c, d, e, f, b, g);
  return 1;
}
void i() { h("wctype"); }
[hjl@gnu-skx-1 gcc]$ ./xgcc -B./ -O3 -Wall -S /tmp/foo.i
/tmp/foo.i: In function ‘i’:
/tmp/foo.i:10:33: warning: ‘%s’ directive writing up to 10218 bytes into a region of size between 0 and 10240 [-Wformat-overflow=]
   10 |   sprintf(a, "%s:%s:%d:%d:%d:%c:%s\n", j, c, d, e, f, b, g);
      |                                 ^~
In function ‘h’,
    inlined from ‘i’ at /tmp/foo.i:13:12:
/tmp/foo.i:10:3: note: ‘sprintf’ output between 18 and 20484 bytes into a destination of size 10256
   10 |   sprintf(a, "%s:%s:%d:%d:%d:%c:%s\n", j, c, d, e, f, b, g);
      |   ^
[hjl@gnu-skx-1 gcc]$
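One possible reading of the numbers in this diagnostic, offered as a sketch rather than a description of what the warning pass actually computes: with j = "wctype", the guard leaves at most 10256 - 32 - 6 = 10218 bytes for c and g, and the 18 and 20484 figures fall out if each of the two remaining %s directives is bounded independently by that value. The constants assume a 32-bit int and count the terminating NUL. All helper names are hypothetical.

```c
#include <assert.h>
#include <string.h>

enum
{
  DEST_SIZE = 10256,    /* sizeof a in foo.i */
  GUARD_SLACK = 32,     /* the "+ 32" in the length check */
  INT_MAX_CHARS = 11,   /* "-2147483648", assuming 32-bit int */
  INT_MIN_CHARS = 1,    /* a single digit */
  FIXED_CHARS = 8       /* six ':' separators, one %c, one newline */
};

/* Largest strlen(c) + strlen(g) that still passes the guard for J.  */
long
max_remaining_len (const char *j)
{
  return DEST_SIZE - GUARD_SLACK - (long) strlen (j);
}

/* Shortest output: empty c and g, one-digit ints, plus the NUL.  */
long
min_output_bytes (const char *j)
{
  return (long) strlen (j) + 3 * INT_MIN_CHARS + FIXED_CHARS + 1;
}

/* Longest output if c and g are each bounded separately (assumption).  */
long
max_output_bytes_independent (const char *j)
{
  return (long) strlen (j) + 2 * max_remaining_len (j)
         + 3 * INT_MAX_CHARS + FIXED_CHARS + 1;
}
```

For "wctype" these give 10218, 18 and 20484, matching the figures printed above.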
[Bug tree-optimization/113752] [14 Regression] warning: ‘%s’ directive writing up to 10218 bytes into a region of size between 0 and 10240 [-Wformat-overflow=]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113752
H.J. Lu changed:
What             |Removed                        |Added
Ever confirmed   |0                              |1
Last reconfirmed |                               |2024-02-04
CC               |                               |aldyh at redhat dot com
Status           |UNCONFIRMED                    |NEW
--- Comment #1 from H.J. Lu ---
It is caused by r14-261.
[Bug c/113752] New: [14 Regression] warning: ‘%s’ directive writing up to 10218 bytes into a region of size between 0 and 10240 [-Wformat-overflow=]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113752
Bug ID: 113752
Summary: [14 Regression] warning: ‘%s’ directive writing up to 10218 bytes into a region of size between 0 and 10240 [-Wformat-overflow=]
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: hjl.tools at gmail dot com
Target Milestone: ---
Created attachment 57315 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57315&action=edit
A testcase
[hjl@gnu-tgl-2 tmp]$ /usr/gcc-14.0.1-x32-apx/bin/gcc -O3 -S x.i -Wall
In file included from tests-mbwc/tst_wctype.c:8:
tests-mbwc/tsp_common.c: In function ‘result.constprop.isra’:
tests-mbwc/tsp_common.c:55:24: warning: ‘%s’ directive writing up to 10218 bytes into a region of size between 0 and 10240 [-Wformat-overflow=]
tests-mbwc/tsp_common.c:55:3: note: ‘sprintf’ output between 18 and 20484 bytes into a destination of size 10256
[hjl@gnu-tgl-2 tmp]$
GCC 13 is OK.
[Bug target/113751] New: -mapxf -mfma4 generates wrong assembly code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113751 Bug ID: 113751 Summary: -mapxf -mfma4 generates wrong assembly code Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: hjl.tools at gmail dot com CC: crazylht at gmail dot com Target Milestone: --- Target: x86-64 [hjl@gnu-icx-1 creduce-1]$ cat x.i struct { double a[8] } a; double b, c, d; int e, f, g; void h() { f = e; d = a.a[g + 1]; c = a.a[g] + a.a[g + 3] * (a.a[g + 4] * (a.a[g + 5] * (a.a[g + 6] * (a.a[g + 7] * a.a[g + 8] + b; d += e > a.a[g + 11]; } [hjl@gnu-icx-1 creduce-1]$ /export/build/gnu/tools-build/gcc-x32-gitlab/release/usr/gcc-14.0.1-x32/bin/gcc -O3 -mfma4 -mapxf x.i -w -c /tmp/cchsm1V9.s: Assembler messages: /tmp/cchsm1V9.s:38: Error: extended GPR cannot be used as base/index for `vfmaddsd' [hjl@gnu-icx-1 creduce-1]$
[Bug target/113711] APX instruction set and instructions longer than 15 bytes (assembly warning)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113711 --- Comment #9 from H.J. Lu ---
Many NDD patterns have the same issue. Here is another testcase:
[hjl@gnu-cfl-3 pr113711]$ cat apx-ndd-length-X.c
/* { dg-do assemble { target { apxf && { ! ia32 } } } } */
/* { dg-options "-mapxf -O2" } */
typedef signed __int128 S;
int o;
S
qux (void)
{
  S z;
  o = __builtin_add_overflow (*(S __seg_fs *) 0x1000, 0x200, &z);
  return z;
}
[hjl@gnu-cfl-3 pr113711]$ make apx-ndd-length-X.o
/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/ -mapxf -O3 -dp -c -o apx-ndd-length-X.o apx-ndd-length-X.c
/tmp/cc1eMHh5.s: Assembler messages:
/tmp/cc1eMHh5.s:9: Warning: instruction length of 16 bytes exceeds the limit of 15
[hjl@gnu-cfl-3 pr113711]$ cat apx-ndd-length-Y.c
/* { dg-do assemble { target { apxf && { ! ia32 } } } } */
/* { dg-options "-mapxf -O2" } */
__thread signed __int128 var;
int o;
signed __int128
qux (void)
{
  signed __int128 z;
  o = __builtin_add_overflow (var, 0x200, &z);
  return z;
}
[hjl@gnu-cfl-3 pr113711]$ make apx-ndd-length-Y.o
/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/ -mapxf -O3 -dp -c -o apx-ndd-length-Y.o apx-ndd-length-Y.c
/tmp/ccwvDbZA.s: Assembler messages:
/tmp/ccwvDbZA.s:9: Warning: instruction length of 16 bytes exceeds the limit of 15
[hjl@gnu-cfl-3 pr113711]$
We need to examine all NDD patterns for invalid memory constraints. We should find a testcase for each issue we find.
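The 15-byte limit in the assembler warning can be checked with a simple byte count. The struct, helper names and per-field byte counts below are assumptions made for illustration: a segment prefix plus a 4-byte EVEX prefix, then opcode, ModRM, SIB, a 32-bit displacement and a 32-bit immediate. They are not taken from the assembler output, and the encoding actually emitted for these testcases may be composed differently.

```c
#include <assert.h>
#include <stdbool.h>

enum { X86_MAX_INSN_BYTES = 15 };

/* Hypothetical breakdown of one encoded instruction, in bytes.  */
struct insn_bytes
{
  unsigned prefixes;   /* legacy/segment prefixes plus EVEX prefix */
  unsigned opcode;
  unsigned modrm;
  unsigned sib;
  unsigned disp;
  unsigned imm;
};

/* Total encoded length of the instruction.  */
unsigned
insn_length (const struct insn_bytes *b)
{
  return b->prefixes + b->opcode + b->modrm + b->sib + b->disp + b->imm;
}

/* True if the encoding stays within the x86 instruction length limit.  */
bool
fits_x86_limit (const struct insn_bytes *b)
{
  return insn_length (b) <= X86_MAX_INSN_BYTES;
}
```

With 5 prefix bytes and 1 + 1 + 1 + 4 + 4 for the rest, the total is 16, one byte over the limit; dropping any single prefix byte brings it back to 15.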