[PATCH] Add myself to DCO

2024-05-09 Thread H.J. Lu
I am retiring from Intel. Friday, May 10 is my last day. I will submit my GCC, glibc and binutils patches under DCO from now on. H.J. ChangeLog: * MAINTAINERS: Add myself to DCO. Signed-off-by: H.J. Lu --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git

[PATCH] Implement _Float16 to bfloat16 conversion with float32

2024-05-02 Thread H.J. Lu
Since bfloat16 is neither a subset nor a superset of _Float16, implement _Float16 to bfloat16 conversion with _Float16 -> float32 -> bfloat16. gcc/ PR middle-end/114907 * expr.cc (convert_mode_scalar): Implement _Float16 to bfloat16 conversion with float32 conversions.
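The second step of that chain (float32 -> bfloat16) can be illustrated with a minimal C sketch. This is not the code from the patch: the helper names `float_to_bf16_bits` and `bf16_bits_to_float` are hypothetical, the bfloat16 value is held as raw bits in a `uint16_t`, and round-to-nearest-even is assumed. The first step (_Float16 -> float32) is an exact widening and is not shown.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: narrow a float32 to bfloat16 bits using
   round-to-nearest-even.  bfloat16 keeps the float32 sign and 8-bit
   exponent but only the top 7 mantissa bits.  */
uint16_t float_to_bf16_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);          /* reinterpret without aliasing issues */

    /* NaN: keep it a NaN by forcing a mantissa bit that survives the shift.  */
    if ((bits & 0x7fffffffu) > 0x7f800000u)
        return (uint16_t)((bits >> 16) | 0x0040u);

    /* Add half of the discarded range, plus the lowest kept bit so that
       exact ties round to even.  */
    bits += 0x7fffu + ((bits >> 16) & 1u);
    return (uint16_t)(bits >> 16);
}

/* Hypothetical helper: widen bfloat16 bits back to float32 (always exact).  */
float bf16_bits_to_float(uint16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

With these helpers, 65504.0f (the largest finite _Float16 value) fits in bfloat16's range but is rounded to 65536.0f, because bfloat16 has fewer mantissa bits than _Float16. Values such as 2^100 are representable in bfloat16 but far outside _Float16's range, so neither format contains the other.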

[PATCH] libgcc: Rename __trunchfbf2 to __extendhfbf2

2024-05-01 Thread H.J. Lu
Since bfloat16 has the same range as float32, _Float16 to bfloat16 conversion is an extension, not a truncation. Rename trunchfbf2.c to extendhfbf2.c to provide __extendhfbf2, instead of __trunchfbf2. Since _Float16 to bfloat16 conversion never worked from day one, the same libgcc version of

Re: [PATCH] Don't assert for IFN_COND_{MIN, MAX} in vect_transform_reduction

2024-04-29 Thread H.J. Lu
On Mon, Apr 29, 2024 at 6:47 AM liuhongt wrote: > > The Fortran standard does not specify what the result of the MAX > and MIN intrinsics are if one of the arguments is a NaN. So it > should be ok to transform reduction for IFN_COND_MIN with vectorized > COND_MIN and REDUC_MIN. The commit subject

[Backport 2/2] middle-end/114599 - fix bitmap allocation for check_ifunc_callee_symtab_nodes

2024-04-14 Thread H.J. Lu
From: Richard Biener There's no default bitmap obstack during global CTORs, so allocate the bitmap locally. PR middle-end/114599 PR gcov-profile/114115 * symtab.cc (ifunc_ref_map): Do not use auto_bitmap. (is_caller_ifunc_resolver): Optimize

[Backport 1/2] tree-profile: Disable indirect call profiling for IFUNC resolvers

2024-04-14 Thread H.J. Lu
We can't profile indirect calls to IFUNC resolvers nor their callees as it requires TLS which hasn't been set up yet when the dynamic linker is resolving IFUNC symbols. Add an IFUNC resolver caller marker to cgraph_node and set it if the function is called by an IFUNC resolver. Disable indirect
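For readers unfamiliar with IFUNC, here is a minimal C sketch of a resolver, assuming a GNU toolchain on an ELF/glibc target (other targets fall back to a plain function). The names `answer`, `answer_impl` and `resolve_answer` are hypothetical and do not come from the patch. The resolver runs while the dynamic linker is still processing relocations, which is why code reached from it cannot rely on TLS-based profiling counters.

```c
#include <stdlib.h>   /* on glibc systems this defines __GLIBC__ */

/* The implementation the resolver will select.  */
static int answer_impl(void)
{
    return 42;
}

#if defined(__GLIBC__) && defined(__ELF__) && defined(__GNUC__)
typedef int (*answer_fn)(void);

/* IFUNC resolver: called by the dynamic linker during relocation,
   possibly before TLS has been set up, so it must stay trivial.  */
static answer_fn resolve_answer(void)
{
    return answer_impl;
}

/* Calls to answer() are bound to whatever resolve_answer() returned.  */
int answer(void) __attribute__((ifunc("resolve_answer")));
#else
/* Fallback for targets without IFUNC support.  */
int answer(void)
{
    return answer_impl();
}
#endif
```

Calling `answer()` behaves like calling `answer_impl()` directly; the difference is only in when and how the target is chosen.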

[PATCH] x86: Allow TImode offsettable memory only with 8-bit constant

2024-04-12 Thread H.J. Lu
The x86 instruction size limit is 15 bytes. If a NDD instruction has a segment prefix byte, a 4-byte opcode prefix, a MODRM byte, a SIB byte, a 4-byte displacement and a 4-byte immediate, adding an address size prefix will exceed the size limit. Change TImode ADD, AND, OR and XOR to allow

[PATCH] libstdc++: Update some baseline_symbols.txt (x32)

2024-04-12 Thread H.J. Lu
* config/abi/post/x86_64-linux-gnu/x32/baseline_symbols.txt: Updated. --- .../abi/post/x86_64-linux-gnu/x32/baseline_symbols.txt | 6 ++ 1 file changed, 6 insertions(+) diff --git a/libstdc++-v3/config/abi/post/x86_64-linux-gnu/x32/baseline_symbols.txt

Re: [PATCH 0/2] mmap: Avoid the sanitizer configure check failure

2024-04-10 Thread H.J. Lu
On Tue, Apr 9, 2024 at 10:39 PM Alan Modra wrote: > > On Tue, Apr 09, 2024 at 07:24:33AM -0700, H.J. Lu wrote: > > Define GCC_AC_FUNC_MMAP with export ASAN_OPTIONS=detect_leaks=0 to avoid > > the sanitizer configure check failure. > > OK for binutils. (I just fixed my loc

Re: [PATCH 0/2] mmap: Avoid the sanitizer configure check failure

2024-04-09 Thread H.J. Lu
On Tue, Apr 9, 2024 at 4:08 PM Sam James wrote: > > "H.J. Lu" writes: > > > When -fsanitize=address,undefined is used to build, the mmap configure > > check failed with > > I think Paul fixed this in autoconf commit > 09b6e78d1592ce10fdc975025d699ee41444

Re: [PATCH] libgfortran: Disable gthreads weak symbols for glibc 2.34

2024-04-09 Thread H.J. Lu
On Tue, Apr 9, 2024 at 10:25 AM Andrew Pinski wrote: > > > > On Tue, Apr 9, 2024, 10:07 H.J. Lu wrote: >> >> Since Glibc 2.34 all pthreads symbols are defined directly in libc not >> libpthread, and since Glibc 2.32 we have used __libc_single_threaded to >>

[PATCH] libgfortran: Disable gthreads weak symbols for glibc 2.34

2024-04-09 Thread H.J. Lu
Since Glibc 2.34 all pthreads symbols are defined directly in libc not libpthread, and since Glibc 2.32 we have used __libc_single_threaded to avoid unnecessary locking in single-threaded programs. This means there is no reason to avoid linking to libpthread now, and so no reason to use weak

[PATCH 1/2] mmap: Avoid the sanitizer configure check failure

2024-04-09 Thread H.J. Lu
When -fsanitize=address,undefined is used to build, the mmap configure check failed with = ==231796==ERROR: LeakSanitizer: detected memory leaks Direct leak of 4096 byte(s) in 1 object(s) allocated from: #0 0x7cdd3d0defdf in

[PATCH 2/2] mmap: Avoid the sanitizer configure check failure

2024-04-09 Thread H.J. Lu
When -fsanitize=address,undefined is used to build, the mmap configure check failed with = ==231796==ERROR: LeakSanitizer: detected memory leaks Direct leak of 4096 byte(s) in 1 object(s) allocated from: #0 0x7cdd3d0defdf in

[PATCH 0/2] mmap: Avoid the sanitizer configure check failure

2024-04-09 Thread H.J. Lu
/asan_malloc_linux.cpp:69 #1 0x5750c7f6d2e1 in main /home/alan/build/gas-san/all/bfd/conftest.c:190 SUMMARY: AddressSanitizer: 8192 byte(s) leaked in 2 allocation(s). Define GCC_AC_FUNC_MMAP with export ASAN_OPTIONS=detect_leaks=0 to avoid the sanitizer configure check failure. H.J. Lu (2

[PATCH v2] x86: Define __APX_INLINE_ASM_USE_GPR32__

2024-04-08 Thread H.J. Lu
Define __APX_INLINE_ASM_USE_GPR32__ for -mapx-inline-asm-use-gpr32. When __APX_INLINE_ASM_USE_GPR32__ is defined, inline asm statements should contain only instructions compatible with r16-r31. gcc/ PR target/114587 * config/i386/i386-c.cc (ix86_target_macros_internal): Define

[PATCH] x86: Define macros for APX options

2024-04-08 Thread H.J. Lu
Define the following macros for APX options: 1. __APX_EGPR__: -mapx-features=egpr. 2. __APX_PUSH2POP2__: -mapx-features=push2pop2. 3. __APX_NDD__: -mapx-features=ndd. 4. __APX_PPX__: -mapx-features=ppx. 5. __APX_INLINE_ASM_USE_GPR32__: -mapx-inline-asm-use-gpr32. They can be used to make assembly
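A minimal C sketch of how such macros could gate code at compile time follows. The macro names are the ones proposed in this patch; whether a given compiler release actually defines them is an assumption, and the function `apx_feature_mask` and its bit assignments are hypothetical.

```c
/* Hypothetical helper: report, as a bitmask, which of the proposed APX
   macros were defined when this translation unit was compiled.
   Bit assignments are arbitrary and for illustration only.  */
int apx_feature_mask(void)
{
    int mask = 0;
#ifdef __APX_EGPR__
    mask |= 1 << 0;   /* -mapx-features=egpr */
#endif
#ifdef __APX_PUSH2POP2__
    mask |= 1 << 1;   /* -mapx-features=push2pop2 */
#endif
#ifdef __APX_NDD__
    mask |= 1 << 2;   /* -mapx-features=ndd */
#endif
#ifdef __APX_PPX__
    mask |= 1 << 3;   /* -mapx-features=ppx */
#endif
#ifdef __APX_INLINE_ASM_USE_GPR32__
    mask |= 1 << 4;   /* -mapx-inline-asm-use-gpr32 */
#endif
    return mask;
}
```

The same `#ifdef` pattern would apply to hand-written assembly or inline asm that should use an APX form only when the matching option is enabled.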

[PATCH] x86: Use explicit shift count in double-precision shifts

2024-04-05 Thread H.J. Lu
Don't use implicit shift count in double-precision shifts in AT&T syntax since they aren't in Intel SDM. Keep the 's' modifier for backward compatibility with inline asm statements. PR target/114590 * config/i386/i386.md (x86_64_shld): Use explicit shift count in AT&T syntax.

Re: [PATCH] middle-end/114599 - fix bitmap allocation for check_ifunc_callee_symtab_nodes

2024-04-05 Thread H.J. Lu
On Fri, Apr 5, 2024 at 6:52 AM Richard Biener wrote: > > > > > Am 05.04.2024 um 15:46 schrieb H.J. Lu : > > > > On Fri, Apr 5, 2024 at 1:21 AM Richard Biener wrote: > >> > >> There's no default bitmap obstack during global CTORs, so allocate

Re: [PATCH v10 1/2] Add condition coverage (MC/DC)

2024-04-05 Thread H.J. Lu
On Thu, Apr 4, 2024 at 5:54 AM Jørgen Kvalsvik wrote: > > On 04/04/2024 14:10, Jan Hubicka wrote: > >> gcc/ChangeLog: > >> > >> * builtins.cc (expand_builtin_fork_or_exec): Check > >>condition_coverage_flag. > >> * collect2.cc (main): Add -fno-condition-coverage to OBSTACK. > >>

Re: [PATCH] middle-end/114599 - fix bitmap allocation for check_ifunc_callee_symtab_nodes

2024-04-05 Thread H.J. Lu
On Fri, Apr 5, 2024 at 1:21 AM Richard Biener wrote: > > There's no default bitmap obstack during global CTORs, so allocate the > bitmap locally. > > Bootstrap and regtest running on x86_64-unknown-linux-gnu. > > Richard. > > PR middle-end/114599 > * symtab.cc (ifunc_ref_map): Do

Re: [PATCH v3] tree-profile: Disable indirect call profiling for IFUNC resolvers

2024-04-04 Thread H.J. Lu
On Thu, Apr 4, 2024 at 5:34 PM wrote: > > On 3 April 2024 15:49:13 CEST, "H.J. Lu" wrote: > > > >> OK with that change. > >> Honza > > > >I am checking in this patch with the updated comments: > > > > /* Disable indirect ca

[PATCH] x86: Define __APX_F__ for -mapxf

2024-04-04 Thread H.J. Lu
Define __APX_F__ when APX is enabled. gcc/ PR target/114587 * config/i386/i386-c.cc (ix86_target_macros_internal): Define __APX_F__ when APX is enabled. gcc/testsuite/ PR target/114587 * gcc.target/i386/apx-2.c: New test. --- gcc/config/i386/i386-c.cc

Re: [PATCH v3] tree-profile: Disable indirect call profiling for IFUNC resolvers

2024-04-03 Thread H.J. Lu
On Wed, Apr 3, 2024 at 8:31 AM Peter Bergner wrote: > > On 4/3/24 7:40 AM, H.J. Lu wrote: > > We can't profile indirect calls to IFUNC resolvers nor their callees as > > it requires TLS which hasn't been set up yet when the dynamic linker is > > resolving IFUNC symbo

Re: [PATCH v3] tree-profile: Disable indirect call profiling for IFUNC resolvers

2024-04-03 Thread H.J. Lu
ng to the PR and need to have TLS > initialized. > > OK with that change. > Honza I am checking in this patch with the updated comments: /* Disable indirect call profiling for an IFUNC resolver and its callees since it requires TLS which hasn't been set up yet when the dynamic

Re: PING: [PATCH v2] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-04-03 Thread H.J. Lu
On Tue, Apr 2, 2024 at 10:03 AM Jan Hubicka wrote: > > > > I am bit worried about commonly used functions getting "infected" by > > > being called once from ifunc resolver. I think we only use thread local > > > storage for indirect call profiling, so we may just disable indirect > > > call

[PATCH v3] tree-profile: Disable indirect call profiling for IFUNC resolvers

2024-04-03 Thread H.J. Lu
We can't profile indirect calls to IFUNC resolvers nor their callees as it requires TLS which hasn't been set up yet when the dynamic linker is resolving IFUNC symbols. Add an IFUNC resolver caller marker to cgraph_node and set it if the function is called by an IFUNC resolver. Disable indirect

Re: PING: [PATCH v2] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-04-02 Thread H.J. Lu
On Tue, Apr 2, 2024 at 7:50 AM Jan Hubicka wrote: > > > On Tue, Mar 5, 2024 at 1:45 PM H.J. Lu wrote: > > > > > > We can't instrument an IFUNC resolver nor its callees as it may require > > > TLS which hasn't been set up yet when the dynamic lin

PING: [PATCH v2] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-04-02 Thread H.J. Lu
On Tue, Mar 5, 2024 at 1:45 PM H.J. Lu wrote: > > We can't instrument an IFUNC resolver nor its callees as it may require > TLS which hasn't been set up yet when the dynamic linker is resolving > IFUNC symbols. > > Add an IFUNC resolver caller marker to cgraph_node and set it

Re: libbacktrace patch committed: Don't assume compressed section aligned

2024-03-08 Thread H.J. Lu
On Fri, Mar 8, 2024 at 2:48 PM Fangrui Song wrote: > > On ELF64, it looks like BFD uses 8-byte alignment for compressed > `.debug_*` sections while gold/lld/mold use 1-byte alignment. I do not > know how the Solaris linker sets the alignment. > > The specification's wording makes me confused

Re: [C++ coroutines] Initial implementation pushed to master.

2024-03-06 Thread H.J. Lu
On Wed, Mar 6, 2024 at 1:03 AM Iain Sandoe wrote: > > > > > On 5 Mar 2024, at 17:31, H.J. Lu wrote: > > > > On Sat, Jan 18, 2020 at 4:54 AM Iain Sandoe wrote: > >> > > >> 2020-01-18 Iain Sandoe > >> > >>* Ma

Re: [PATCH] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-03-05 Thread H.J. Lu
On Thu, Feb 29, 2024 at 7:11 AM H.J. Lu wrote: > > On Thu, Feb 29, 2024 at 7:06 AM Jan Hubicka wrote: > > > > > > I am worried about scenario where ifunc selector calls function foo > > > > defined locally and foo is also used from other

[PATCH v2] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-03-05 Thread H.J. Lu
We can't instrument an IFUNC resolver nor its callees as it may require TLS which hasn't been set up yet when the dynamic linker is resolving IFUNC symbols. Add an IFUNC resolver caller marker to cgraph_node and set it if the function is called by an IFUNC resolver. Update tree_profiling to skip

Re: [C++ coroutines] Initial implementation pushed to master.

2024-03-05 Thread H.J. Lu
On Sat, Jan 18, 2020 at 4:54 AM Iain Sandoe wrote: > > Hi, > > Thanks to: > >* the reviewers, the code was definitely improved by your reviews. > >* those folks who tested the branch and/or compiler explorer > instance and reported problems with reproducers. > > * WG21 colleagues,

Re: [PATCH] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-02-29 Thread H.J. Lu
On Thu, Feb 29, 2024 at 7:06 AM Jan Hubicka wrote: > > > > I am worried about scenario where ifunc selector calls function foo > > > defined locally and foo is also used from other places possibly in hot > > > loops. > > > > > > > > > So it is not really reliable fix (though I guess it will work

Re: [PATCH] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-02-29 Thread H.J. Lu
On Thu, Feb 29, 2024 at 6:34 AM Jan Hubicka wrote: > > > On Thu, Feb 29, 2024 at 5:39 AM Jan Hubicka wrote: > > > > > > > We can't instrument an IFUNC resolver nor its callees as it may require > > > > TLS which hasn't been set up yet when the dynamic linker is resolving > > > > IFUNC symbols.

Re: [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]

2024-02-29 Thread H.J. Lu
On Thu, Feb 29, 2024 at 6:15 AM Jan Hubicka wrote: > > > On Thu, Feb 29, 2024 at 02:31:05PM +0100, Jan Hubicka wrote: > > > I agree that debugability of user core dumps is important here. > > > > > > I guess an ideal solution would be to change codegen of noreturn functions > > > to callee save

Re: [PATCH] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-02-29 Thread H.J. Lu
On Thu, Feb 29, 2024 at 5:39 AM Jan Hubicka wrote: > > > We can't instrument an IFUNC resolver nor its callees as it may require > > TLS which hasn't been set up yet when the dynamic linker is resolving > > IFUNC symbols. Add an IFUNC resolver caller marker to symtab_node to > > avoid recursive

Re: [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]

2024-02-29 Thread H.J. Lu
On Wed, Feb 28, 2024 at 10:20 PM Hongtao Liu wrote: > > On Wed, Feb 28, 2024 at 4:54 PM Jakub Jelinek wrote: > > > > Hi! > > > > Adding Hongtao and Honza into the loop as the ones who acked the original > > patch. > > > > The no_callee_saved_registers by default for noreturn functions change can

[PATCH] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-02-26 Thread H.J. Lu
We can't instrument an IFUNC resolver nor its callees as it may require TLS which hasn't been set up yet when the dynamic linker is resolving IFUNC symbols. Add an IFUNC resolver caller marker to symtab_node to avoid recursive checking. gcc/ChangeLog: PR tree-optimization/114115

Re: [PATCH] x86: Properly implement AMX-TILE load/store intrinsics

2024-02-26 Thread H.J. Lu
On Sun, Feb 25, 2024 at 8:25 PM H.J. Lu wrote: > > On Sun, Feb 25, 2024 at 7:03 PM Hongtao Liu wrote: > > > > On Mon, Feb 26, 2024 at 10:37 AM H.J. Lu wrote: > > > > > > On Sun, Feb 25, 2024 at 6:03 PM Hongtao Liu wrote: > > > > > >

Re: [PATCH] x86: Properly implement AMX-TILE load/store intrinsics

2024-02-25 Thread H.J. Lu
On Sun, Feb 25, 2024 at 7:03 PM Hongtao Liu wrote: > > On Mon, Feb 26, 2024 at 10:37 AM H.J. Lu wrote: > > > > On Sun, Feb 25, 2024 at 6:03 PM Hongtao Liu wrote: > > > > > > On Mon, Feb 26, 2024 at 5:11 AM H.J. Lu wrote: > > > > > > >

Re: [PATCH] x86: Properly implement AMX-TILE load/store intrinsics

2024-02-25 Thread H.J. Lu
On Sun, Feb 25, 2024 at 6:03 PM Hongtao Liu wrote: > > On Mon, Feb 26, 2024 at 5:11 AM H.J. Lu wrote: > > > > ldtilecfg and sttilecfg take a 512-byte memory block. With > > _tile_loadconfig implemented as > > > > extern __inline void > > __attr

[PATCH v2] x86: Check interrupt instead of noreturn attribute

2024-02-25 Thread H.J. Lu
ix86_set_func_type checks noreturn attribute to avoid incompatible attribute error in LTO1 on interrupt functions. Since TREE_THIS_VOLATILE is set also for _Noreturn without noreturn attribute, check interrupt attribute for interrupt functions instead. gcc/ PR target/114097 *

[PATCH] x86: Properly implement AMX-TILE load/store intrinsics

2024-02-25 Thread H.J. Lu
ldtilecfg and sttilecfg take a 512-byte memory block. With _tile_loadconfig implemented as extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _tile_loadconfig (const void *__config) { __asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config)));

Re: [PATCH] x86: Check interrupt instead of noreturn attribute

2024-02-25 Thread H.J. Lu
On Sun, Feb 25, 2024 at 8:54 AM Uros Bizjak wrote: > > On Sun, Feb 25, 2024 at 5:01 PM H.J. Lu wrote: > > > > ix86_set_func_type checks noreturn attribute to avoid incompatible > > attribute error in LTO1 on interrupt functions. Since TREE_THIS_VOLATILE > > is

[PATCH] x86: Check interrupt instead of noreturn attribute

2024-02-25 Thread H.J. Lu
ix86_set_func_type checks noreturn attribute to avoid incompatible attribute error in LTO1 on interrupt functions. Since TREE_THIS_VOLATILE is set also for _Noreturn without noreturn attribute, check interrupt attribute for interrupt functions instead. gcc/ PR target/114097 *

Re: PING: [PATCH] x86-64: Check R_X86_64_CODE_6_GOTTPOFF support

2024-02-23 Thread H.J. Lu
On Fri, Feb 23, 2024 at 11:12:41AM +0100, Uros Bizjak wrote: > On Fri, Feb 23, 2024 at 3:45 AM H.J. Lu wrote: > > > > On Thu, Feb 22, 2024 at 6:39 PM Hongtao Liu wrote: > > > > > > On Thu, Feb 22, 2024 at 10:33 PM H.J. Lu wrote: > > > > > >

Re: PING: [PATCH] x86-64: Check R_X86_64_CODE_6_GOTTPOFF support

2024-02-22 Thread H.J. Lu
On Thu, Feb 22, 2024 at 6:39 PM Hongtao Liu wrote: > > On Thu, Feb 22, 2024 at 10:33 PM H.J. Lu wrote: > > > > On Sun, Feb 18, 2024 at 8:02 AM H.J. Lu wrote: > > > > > > If assembler and linker supports > > > > > > add %reg1, name@gottpoff(

PING: [PATCH] x86-64: Check R_X86_64_CODE_6_GOTTPOFF support

2024-02-22 Thread H.J. Lu
On Sun, Feb 18, 2024 at 8:02 AM H.J. Lu wrote: > > If assembler and linker supports > > add %reg1, name@gottpoff(%rip), %reg2 > > with R_X86_64_CODE_6_GOTTPOFF, we can generate it instead of > > mov name@gottpoff(%rip), %reg2 > add %reg1, %reg2 > >

[PATCH] x86-64: Check R_X86_64_CODE_6_GOTTPOFF support

2024-02-18 Thread H.J. Lu
If the assembler and linker support add %reg1, name@gottpoff(%rip), %reg2 with R_X86_64_CODE_6_GOTTPOFF, we can generate it instead of mov name@gottpoff(%rip), %reg2 add %reg1, %reg2 gcc/ * configure.ac (HAVE_AS_R_X86_64_CODE_6_GOTTPOFF): Defined as 1 if R_X86_64_CODE_6_GOTTPOFF

Re: [PATCH v2] x86: Support x32 and IBT in heap trampoline

2024-02-14 Thread H.J. Lu
On Wed, Feb 14, 2024 at 11:59 AM Iain Sandoe wrote: > > > > > On 14 Feb 2024, at 18:12, H.J. Lu wrote: > > > > On Tue, Feb 13, 2024 at 8:46 AM Jakub Jelinek wrote: > >> > >> On Tue, Feb 13, 2024 at 08:40:52AM -0800, H.J. Lu wrote: > >

Re: [PATCH v2] x86: Support x32 and IBT in heap trampoline

2024-02-14 Thread H.J. Lu
On Tue, Feb 13, 2024 at 8:46 AM Jakub Jelinek wrote: > > On Tue, Feb 13, 2024 at 08:40:52AM -0800, H.J. Lu wrote: > > Add x32 and IBT support to x86 heap trampoline implementation with a > > testcase. > > > > 2024-02-13 Jakub Jelinek > > H.J. L

[PATCH] x86-64: Generate push2/pop2 only if the incoming stack is 16-byte aligned

2024-02-13 Thread H.J. Lu
Since push2/pop2 requires 16-byte stack alignment, don't generate them if the incoming stack isn't 16-byte aligned. gcc/ PR target/113912 * config/i386/i386.cc (ix86_can_use_push2pop2): New. (ix86_pro_and_epilogue_can_use_push2pop2): Use it. (ix86_emit_save_regs):

[PATCH] x86-64: Use push2/pop2 only if the incoming stack is 16-byte aligned

2024-02-13 Thread H.J. Lu
Since push2/pop2 requires 16-byte stack alignment, don't use them if the incoming stack isn't 16-byte aligned. gcc/ PR target/113876 * config/i386/i386.cc (ix86_pro_and_epilogue_can_use_push2pop2): Return false if the incoming stack isn't 16-byte aligned. gcc/testsuite/

[PATCH v2] x86: Support x32 and IBT in heap trampoline

2024-02-13 Thread H.J. Lu
Add x32 and IBT support to x86 heap trampoline implementation with a testcase. 2024-02-13 Jakub Jelinek H.J. Lu libgcc/ PR target/113855 * config/i386/heap-trampoline.c (trampoline_insns): Add IBT support and pad to the multiple of 4 bytes. Use movabsq

[PATCH] x86: Support x32 and IBT in heap trampoline

2024-02-13 Thread H.J. Lu
On Tue, Feb 13, 2024 at 10:42:52AM +0100, Jakub Jelinek wrote: > On Sat, Feb 10, 2024 at 10:05:34AM -0800, H.J. Lu wrote: > > > I bet it probably doesn't work properly for -mx32 (which defines > > > __x86_64__), CCing H.J. on that, but that is a preexisting issue > >

Re: [PATCH] x86, libgcc: Implement ia32 basic heap trampoline [PR113855].

2024-02-10 Thread H.J. Lu
On Sat, Feb 10, 2024 at 9:46 AM Jakub Jelinek wrote: > > On Sat, Feb 10, 2024 at 05:14:44PM +, Iain Sandoe wrote: > > PR target/113855 > > > > gcc/ChangeLog: > > > > * config/i386/darwin.h (DARWIN_HEAP_T_LIB): Moved to be > > available to all sub-targets. > > *

[PATCH] x86-64: Return R10_REG if there is no scratch register

2024-02-06 Thread H.J. Lu
If we can't find a scratch register for large model profiling, return R10_REG. PR target/113689 * config/i386/i386.cc (x86_64_select_profile_regnum): Return R10_REG after sorry. --- gcc/config/i386/i386.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git

[PATCH] x86: Update constraints for APX NDD instructions

2024-02-05 Thread H.J. Lu
1. The only supported TLS code sequence with ADD is addq foo@gottpoff(%rip),%reg Change je constraint to a memory operand in APX NDD ADD pattern with register source operand. 2. The instruction length of APX NDD instructions with immediate operand: op imm, mem, reg may exceed the size

Re: [PATCH v6] x86-64: Find a scratch register for large model profiling

2024-02-05 Thread H.J. Lu
On Mon, Feb 5, 2024 at 10:01 AM Uros Bizjak wrote: > > On Mon, Feb 5, 2024 at 5:43 PM H.J. Lu wrote: > > > > Changes in v6: > > > > 1. Use ix86_save_reg and accessible_reg_set in > > x86_64_select_profile_regnum. > > 2. Construct a complete reg name

Re: [PATCH v5] x86-64: Find a scratch register for large model profiling

2024-02-05 Thread H.J. Lu
On Mon, Feb 5, 2024 at 2:56 AM Uros Bizjak wrote: > > On Fri, Feb 2, 2024 at 11:47 PM H.J. Lu wrote: > > > > Changes in v5: > > > > 1. Add pr113689-3.c. > > 2. Use %r10 if ix86_profile_before_prologue () return true. > > 3. Try a callee-sa

[PATCH v6] x86-64: Find a scratch register for large model profiling

2024-02-05 Thread H.J. Lu
Changes in v6: 1. Use ix86_save_reg and accessible_reg_set in x86_64_select_profile_regnum. 2. Construct a complete reg name in x86_function_profiler. Changes in v5: 1. Add pr113689-3.c. 2. Use %r10 if ix86_profile_before_prologue () return true. 3. Try a callee-saved register which has been

Re: [PATCH] x86-64: Update gcc.target/i386/apx-ndd.c

2024-02-05 Thread H.J. Lu
On Mon, Feb 5, 2024 at 3:53 AM H.J. Lu wrote: > > Fix the following issues: > > 1. Replace long with int64_t to support x32. > 2. Replace \\(%rdi\\) with \\(%(?:r|e)di\\) for memory operand since x32 > uses (%edi). > 3. Replace %(?:|r|e)al with %al in negb scan. >

[PATCH] x86-64: Update gcc.target/i386/apx-ndd.c

2024-02-05 Thread H.J. Lu
Fix the following issues: 1. Replace long with int64_t to support x32. 2. Replace \\(%rdi\\) with \\(%(?:r|e)di\\) for memory operand since x32 uses (%edi). 3. Replace %(?:|r|e)al with %al in negb scan. * gcc.target/i386/apx-ndd.c: Updated. --- gcc/testsuite/gcc.target/i386/apx-ndd.c |

[PATCH v5] x86-64: Find a scratch register for large model profiling

2024-02-02 Thread H.J. Lu
Changes in v5: 1. Add pr113689-3.c. 2. Use %r10 if ix86_profile_before_prologue () return true. 3. Try a callee-saved register which has been saved on stack in the prologue. Changes in v4: 1. Remove pr113689-3.c. 2. Use df_get_live_out. Changes in v3: 1. Remove r10_ok. Changes in v2: 1. Add

Re: [PATCH v4] x86-64: Find a scratch register for large model profiling

2024-02-02 Thread H.J. Lu
On Fri, Feb 02, 2024 at 05:10:05PM +0100, Jakub Jelinek wrote: > On Fri, Feb 02, 2024 at 07:42:00AM -0800, H.J. Lu wrote: > > --- a/gcc/config/i386/i386.cc > > +++ b/gcc/config/i386/i386.cc > > @@ -22749,6 +22749,39 @@ current_fentry_section (const char **nam

Re: [PATCH v2] x86-64: Find a scratch register for large model profiling

2024-02-02 Thread H.J. Lu
On Fri, Feb 2, 2024 at 4:22 AM Jakub Jelinek wrote: > > On Thu, Feb 01, 2024 at 03:02:47PM -0800, H.J. Lu wrote: > > @@ -2763,6 +2789,8 @@ construct_container (machine_mode mode, machine_mode > > orig_mode, > >{ > >case X86_64_INTEGER_CLASS: > >

[PATCH v4] x86-64: Find a scratch register for large model profiling

2024-02-02 Thread H.J. Lu
Changes in v4: 1. Remove pr113689-2.c. 2. Use df_get_live_out. Changes in v3: 1. Remove r10_ok. Changes in v2: 1. Add int_parameter_registers to machine_function to track integer registers used for parameter passing. 2. Update x86_64_select_profile_regnum to try %r10 first and use an

Re: [PATCH] x86-64: Find a scratch register for large model profiling

2024-02-02 Thread H.J. Lu
On Fri, Feb 2, 2024 at 4:07 AM wrote: > > On 2 February 2024 00:02:54 CET, "H.J. Lu" wrote: > >On Thu, Feb 1, 2024 at 10:32 AM Jakub Jelinek wrote: > >> > >> On Thu, Feb 01, 2024 at 10:15:30AM -0800, H.J. Lu wrote: > >> > --- a/gcc/conf

[PATCH v3] x86-64: Find a scratch register for large model profiling

2024-02-02 Thread H.J. Lu
Changes in v3: 1. Remove r10_ok. Changes in v2: 1. Add int_parameter_registers to machine_function to track integer registers used for parameter passing. 2. Update x86_64_select_profile_regnum to try %r10 first and use a caller-saved register, which isn't used for parameter passing. --- 2

Re: [PATCH] x86-64: Find a scratch register for large model profiling

2024-02-01 Thread H.J. Lu
On Thu, Feb 1, 2024 at 10:32 AM Jakub Jelinek wrote: > > On Thu, Feb 01, 2024 at 10:15:30AM -0800, H.J. Lu wrote: > > --- a/gcc/config/i386/i386.cc > > +++ b/gcc/config/i386/i386.cc > > @@ -22749,6 +22749,31 @@ current_fentry_section (const char **

[PATCH v2] x86-64: Find a scratch register for large model profiling

2024-02-01 Thread H.J. Lu
Changes in v2: 1. Add int_parameter_registers to machine_function to track integer registers used for parameter passing. 2. Update x86_64_select_profile_regnum to try %r10 first and use a caller-saved register, which isn't used for parameter passing. --- 2 scratch registers, %r10 and %r11, are

[PATCH] x86-64: Find a scratch register for large model profiling

2024-02-01 Thread H.J. Lu
2 scratch registers, %r10 and %r11, are available at function entry for large model profiling. But %r10 may be used by stack realignment and we can't use %r10 in this case. Add x86_64_select_profile_regnum to find a scratch register for large model profiling and sorry if we can't find one. gcc/

Re: [PATCH v2] Handle private COMDAT function symbol reference in readonly data section

2024-01-31 Thread H.J. Lu
On Wed, Jan 31, 2024 at 10:11 AM Jakub Jelinek wrote: > > On Wed, Jan 31, 2024 at 09:39:12AM -0800, H.J. Lu wrote: > > GNU binutils has no issues with it: > > I know, I meant gcc. > If I try the proposed: > --- gcc/varasm.cc.jj2024-01-30 08:44:43.304175273 +0100 > +

[PATCH] Assuming the working GNU assembler with --with-gnu-as

2024-01-31 Thread H.J. Lu
When configuring GCC with --target=TARGET to build a cross compiler to reproduce a compiler bug, as and collect have ORIGINAL_AS_FOR_TARGET="" As a result, many target features are disabled which makes it almost impossible to reproduce the bug. Without assembler, the GCC build won't finish

[PATCH] Assuming the working GNU assembler with --with-gnu-as

2024-01-31 Thread H.J. Lu
When configuring GCC with --target=TARGET to build a cross compiler to reproduce a compiler bug, as and collect have ORIGINAL_AS_FOR_TARGET="" As a result, many target features are disabled which makes it almost impossible to reproduce the bug. Without assembler, the GCC build won't finish

Re: [PATCH v2] Handle private COMDAT function symbol reference in readonly data section

2024-01-31 Thread H.J. Lu
On Wed, Jan 31, 2024 at 9:10 AM Jakub Jelinek wrote: > > On Wed, Jan 31, 2024 at 08:48:33AM -0800, H.J. Lu wrote: > > Which function (target hook) can I use to generate > > > > .section.data.rel.ro.local,"awG",@progbits,_ZN1AIxE3fooExx,comdat > >

Re: [PATCH v2] Handle private COMDAT function symbol reference in readonly data section

2024-01-31 Thread H.J. Lu
On Wed, Jan 31, 2024 at 8:30 AM Jakub Jelinek wrote: > > On Tue, Jan 30, 2024 at 06:21:36PM -0800, H.J. Lu wrote: > > Changes in v2: > > > > 1. Check decl non-null before dereferencing it. > > 2. Update PR rtl-optimization/113617 from > > Thanks for up

[PATCH v2] Handle private COMDAT function symbol reference in readonly data section

2024-01-30 Thread H.J. Lu
Changes in v2: 1. Check decl non-null before dereferencing it. 2. Update PR rtl-optimization/113617 from https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113617#c14 --- For a private COMDAT function symbol reference in readonly data section, instead of putting it in .data.rel.ro or .rodata.cst

Re: [PATCH] i386: Add "Ws" constraint for symbolic address/label reference [PR105576]

2024-01-30 Thread H.J. Lu
On Tue, Jan 16, 2024 at 11:47 PM Uros Bizjak wrote: > > On Thu, Jan 11, 2024 at 7:24 PM Fangrui Song wrote: > > > > Printing the raw symbol is useful in inline asm (e.g. in C++ to get the > > mangled name). Similar constraints are available in other targets (e.g. > > "S" for aarch64/riscv, "Cs"

Re: [PATCH] Handle function symbol reference in readonly data section

2024-01-30 Thread H.J. Lu
On Tue, Jan 30, 2024 at 4:58 AM H.J. Lu wrote: > > On Tue, Jan 30, 2024 at 4:51 AM Jakub Jelinek wrote: > > > > On Mon, Jan 29, 2024 at 06:05:25PM -0800, H.J. Lu wrote: > > > LRA may call force_const_mem on this insn in the function > > > >

Re: [PATCH] Handle COMDAT function symbol reference in readonly data section

2024-01-30 Thread H.J. Lu
On Mon, Jan 29, 2024 at 3:08 PM H.J. Lu wrote: > > For a COMDAT function symbol reference in readonly data section, > instead of putting it in .data.rel.ro or .rodata.cst section, call > function_rodata_section to get the read-only or relocated read-only > data sec

Re: [PATCH] Handle function symbol reference in readonly data section

2024-01-30 Thread H.J. Lu
On Tue, Jan 30, 2024 at 4:51 AM Jakub Jelinek wrote: > > On Mon, Jan 29, 2024 at 06:05:25PM -0800, H.J. Lu wrote: > > LRA may call force_const_mem on this insn in the function > > > > (gdb) call debug_tree (func_decl) > > > type > type >

[PATCH] Handle private COMDAT function symbol reference in readonly data section

2024-01-30 Thread H.J. Lu
For a private COMDAT function symbol reference in readonly data section, instead of putting it in .data.rel.ro or .rodata.cst section, call function_rodata_section to get the read-only or relocated read-only data section associated with the function DECL so that the COMDAT section will be used for

[PATCH] x86: Limit -mcmodel=large tests to lp64 target

2024-01-29 Thread H.J. Lu
-mcmodel=large is only supported for lp64 targets. Limit -mcmodel=large tests of libcall-1.c and pr107057.c to lp64 target. * gcc.target/i386/libcall-1.c: Limit to lp64 target. * gcc.target/i386/pr107057.c: Likewise. --- gcc/testsuite/gcc.target/i386/libcall-1.c | 2 +-

Re: [PATCH] Handle function symbol reference in readonly data section

2024-01-29 Thread H.J. Lu
On Mon, Jan 29, 2024 at 3:12 PM H.J. Lu wrote: > > On Mon, Jan 29, 2024 at 2:51 PM Jakub Jelinek wrote: > > > > On Mon, Jan 29, 2024 at 11:29:22PM +0100, Jakub Jelinek wrote: > > > On Mon, Jan 29, 2024 at 11:22:44PM +0100, Jakub Jelinek wrote: > > > > O

Re: [PATCH] Handle function symbol reference in readonly data section

2024-01-29 Thread H.J. Lu
On Mon, Jan 29, 2024 at 2:51 PM Jakub Jelinek wrote: > > On Mon, Jan 29, 2024 at 11:29:22PM +0100, Jakub Jelinek wrote: > > On Mon, Jan 29, 2024 at 11:22:44PM +0100, Jakub Jelinek wrote: > > > On Mon, Jan 29, 2024 at 02:01:56PM -0800, H.J. Lu wrote: > > > > > A

[PATCH] Handle COMDAT function symbol reference in readonly data section

2024-01-29 Thread H.J. Lu
For a COMDAT function symbol reference in readonly data section, instead of putting it in .data.rel.ro or .rodata.cst section, call function_rodata_section to get the read-only or relocated read-only data section associated with the function DECL so that the COMDAT section will be used for the

Re: [PATCH] Handle function symbol reference in readonly data section

2024-01-29 Thread H.J. Lu
On Mon, Jan 29, 2024 at 2:01 PM H.J. Lu wrote: > > On Mon, Jan 29, 2024 at 1:42 PM H.J. Lu wrote: > > > > On Mon, Jan 29, 2024 at 1:22 PM H.J. Lu wrote: > > > > > > On Mon, Jan 29, 2024 at 1:00 PM H.J. Lu wrote: > > > > > >

Re: [PATCH] Handle function symbol reference in readonly data section

2024-01-29 Thread H.J. Lu
On Mon, Jan 29, 2024 at 1:42 PM H.J. Lu wrote: > > On Mon, Jan 29, 2024 at 1:22 PM H.J. Lu wrote: > > > > On Mon, Jan 29, 2024 at 1:00 PM H.J. Lu wrote: > > > > > > On Mon, Jan 29, 2024 at 9:34 AM H.J. Lu wrote: > > > > > > >

[PATCH v3] x86: Generate REG_CFA_UNDEFINED for unsaved callee-saved registers

2024-01-29 Thread H.J. Lu
Changes in v3: 1. Fix a typo in REG_CFA_UNDEFINED note comment. 2. Replace assemble with compile in tests and remove -save-temps since ".cfi_undefined regno" is generated now. Changes in v2: 1. Add REG_CFA_UNDEFINED notes to a frame-related instruction in prologue. 2. Add comments for

Re: [PATCH] Handle function symbol reference in readonly data section

2024-01-29 Thread H.J. Lu
On Mon, Jan 29, 2024 at 1:22 PM H.J. Lu wrote: > > On Mon, Jan 29, 2024 at 1:00 PM H.J. Lu wrote: > > > > On Mon, Jan 29, 2024 at 9:34 AM H.J. Lu wrote: > > > > > > On Mon, Jan 29, 2024 at 9:00 AM Jakub Jelinek wrote: > > > > > > >

Re: [PATCH] Handle function symbol reference in readonly data section

2024-01-29 Thread H.J. Lu
On Mon, Jan 29, 2024 at 1:00 PM H.J. Lu wrote: > > On Mon, Jan 29, 2024 at 9:34 AM H.J. Lu wrote: > > > > On Mon, Jan 29, 2024 at 9:00 AM Jakub Jelinek wrote: > > > > > > On Mon, Jan 29, 2024 at 08:45:45AM -0800, H.J. Lu wrote: > > > > In this c

Re: [PATCH] Handle function symbol reference in readonly data section

2024-01-29 Thread H.J. Lu
On Mon, Jan 29, 2024 at 9:34 AM H.J. Lu wrote: > > On Mon, Jan 29, 2024 at 9:00 AM Jakub Jelinek wrote: > > > > On Mon, Jan 29, 2024 at 08:45:45AM -0800, H.J. Lu wrote: > > > In this case, these are internal to the same comdat group: > > > > But tha

[PATCH v2] x86: Generate REG_CFA_UNDEFINED for unsaved callee-saved registers

2024-01-29 Thread H.J. Lu
Changes in v2: 1. Add REG_CFA_UNDEFINED notes to a frame-related instruction in prologue. 2. Add comments for add_cfi_undefined. --- Attach REG_CFA_UNDEFINED notes for unsaved callee-saved registers which have been used in the function to a frame-related instruction in prologue. gcc/

Re: [PATCH] x86: Generate REG_CFA_UNDEFINED for unsaved callee-saved registers

2024-01-29 Thread H.J. Lu
On Mon, Jan 29, 2024 at 8:30 AM Jakub Jelinek wrote: > > On Mon, Jan 29, 2024 at 08:00:26AM -0800, H.J. Lu wrote: > > Attach REG_CFA_UNDEFINED notes for unsaved callee-saved registers which > > have been used in the function to an instruction in prologue. > > > >

Re: [PATCH] Handle function symbol reference in readonly data section

2024-01-29 Thread H.J. Lu
On Mon, Jan 29, 2024 at 9:00 AM Jakub Jelinek wrote: > > On Mon, Jan 29, 2024 at 08:45:45AM -0800, H.J. Lu wrote: > > In this case, these are internal to the same comdat group: > > But that is only by accident, no? This may be by luck. I don't know if gcc checks

Re: [PATCH] Handle function symbol reference in readonly data section

2024-01-29 Thread H.J. Lu
On Mon, Jan 29, 2024 at 8:34 AM Jakub Jelinek wrote: > > On Mon, Jan 29, 2024 at 08:23:21AM -0800, H.J. Lu wrote: > > > baz: > > > movq .LC0(%rip), %xmm0 > > > ret > > > > I don't think this is valid. We can't reference a no

Re: [PATCH] Handle function symbol reference in readonly data section

2024-01-29 Thread H.J. Lu
On Mon, Jan 29, 2024 at 8:03 AM Jakub Jelinek wrote: > > On Mon, Jan 29, 2024 at 06:36:47AM -0800, H.J. Lu wrote: > > TARGET_ASM_SELECT_RTX_SECTION is for constant in RTL. > > It should have a non-public label reference which can't be used > > by other TUs. The same s
