[Bug target/114098] _tile_loadconfig doesn't work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098

H.J. Lu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |11.5
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #7 from H.J. Lu ---
Fixed for 11.5, 12.4, 13.3 and 14.
[Bug target/114098] _tile_loadconfig doesn't work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098

--- Comment #6 from GCC Commits ---
The releases/gcc-11 branch has been updated by H.J. Lu:

https://gcc.gnu.org/g:26b1012c26c4b4de0b4561e74b856a7f7d259a48

commit r11-11258-g26b1012c26c4b4de0b4561e74b856a7f7d259a48
Author: H.J. Lu
Date:   Sun Feb 25 10:21:04 2024 -0800

    x86: Properly implement AMX-TILE load/store intrinsics

    ldtilecfg and sttilecfg take a 512-byte memory block.  With
    _tile_loadconfig implemented as

    extern __inline void
    __attribute__((__gnu_inline__, __always_inline__, __artificial__))
    _tile_loadconfig (const void *__config)
    {
      __asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config)));
    }

    GCC sees:

    (parallel [
      (asm_operands/v ("ldtilecfg %X0") ("") 0
        [(mem/f/c:DI (plus:DI (reg/f:DI 77 virtual-stack-vars)
                              (const_int -64 [0xffc0]))
                     [1 MEM[(const void * *)_data]+0 S8 A128])]
        [(asm_input:DI ("m"))]
      (clobber (reg:CC 17 flags))])

    and the memory operand size is 1 byte.  As the result, the rest of 511
    bytes is ignored by GCC.  Implement ldtilecfg and sttilecfg intrinsics
    with a pointer to XImode to honor the 512-byte memory block.

    gcc/ChangeLog:

            PR target/114098
            * config/i386/amxtileintrin.h (_tile_loadconfig): Use
            __builtin_ia32_ldtilecfg.
            (_tile_storeconfig): Use __builtin_ia32_sttilecfg.
            * config/i386/i386-builtin.def (BDESC): Add
            __builtin_ia32_ldtilecfg and __builtin_ia32_sttilecfg.
            * config/i386/i386-expand.c (ix86_expand_builtin): Handle
            IX86_BUILTIN_LDTILECFG and IX86_BUILTIN_STTILECFG.
            * config/i386/i386.md (ldtilecfg): New pattern.
            (sttilecfg): Likewise.

    gcc/testsuite/ChangeLog:

            PR target/114098
            * gcc.target/i386/amxtile-4.c: New test.

    (cherry picked from commit 4972f97a265c574d51e20373ddefd66576051e5c)
[Bug target/114098] _tile_loadconfig doesn't work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098

--- Comment #5 from GCC Commits ---
The releases/gcc-12 branch has been updated by H.J. Lu:

https://gcc.gnu.org/g:23f4aa6c68e24a76d3784bcfdad5a53e46cd8f95

commit r12-10180-g23f4aa6c68e24a76d3784bcfdad5a53e46cd8f95
Author: H.J. Lu
Date:   Sun Feb 25 10:21:04 2024 -0800

    x86: Properly implement AMX-TILE load/store intrinsics

    ldtilecfg and sttilecfg take a 512-byte memory block.  With
    _tile_loadconfig implemented as

    extern __inline void
    __attribute__((__gnu_inline__, __always_inline__, __artificial__))
    _tile_loadconfig (const void *__config)
    {
      __asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config)));
    }

    GCC sees:

    (parallel [
      (asm_operands/v ("ldtilecfg %X0") ("") 0
        [(mem/f/c:DI (plus:DI (reg/f:DI 77 virtual-stack-vars)
                              (const_int -64 [0xffc0]))
                     [1 MEM[(const void * *)_data]+0 S8 A128])]
        [(asm_input:DI ("m"))]
      (clobber (reg:CC 17 flags))])

    and the memory operand size is 1 byte.  As the result, the rest of 511
    bytes is ignored by GCC.  Implement ldtilecfg and sttilecfg intrinsics
    with a pointer to XImode to honor the 512-byte memory block.

    gcc/ChangeLog:

            PR target/114098
            * config/i386/amxtileintrin.h (_tile_loadconfig): Use
            __builtin_ia32_ldtilecfg.
            (_tile_storeconfig): Use __builtin_ia32_sttilecfg.
            * config/i386/i386-builtin.def (BDESC): Add
            __builtin_ia32_ldtilecfg and __builtin_ia32_sttilecfg.
            * config/i386/i386-expand.cc (ix86_expand_builtin): Handle
            IX86_BUILTIN_LDTILECFG and IX86_BUILTIN_STTILECFG.
            * config/i386/i386.md (ldtilecfg): New pattern.
            (sttilecfg): Likewise.

    gcc/testsuite/ChangeLog:

            PR target/114098
            * gcc.target/i386/amxtile-4.c: New test.

    (cherry picked from commit 4972f97a265c574d51e20373ddefd66576051e5c)
[Bug target/114098] _tile_loadconfig doesn't work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098

--- Comment #4 from GCC Commits ---
The releases/gcc-13 branch has been updated by H.J. Lu:

https://gcc.gnu.org/g:2b3ecdf4fb13471b69d80583e10c5baedfe84d7c

commit r13-8365-g2b3ecdf4fb13471b69d80583e10c5baedfe84d7c
Author: H.J. Lu
Date:   Sun Feb 25 10:21:04 2024 -0800

    x86: Properly implement AMX-TILE load/store intrinsics

    ldtilecfg and sttilecfg take a 512-byte memory block.  With
    _tile_loadconfig implemented as

    extern __inline void
    __attribute__((__gnu_inline__, __always_inline__, __artificial__))
    _tile_loadconfig (const void *__config)
    {
      __asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config)));
    }

    GCC sees:

    (parallel [
      (asm_operands/v ("ldtilecfg %X0") ("") 0
        [(mem/f/c:DI (plus:DI (reg/f:DI 77 virtual-stack-vars)
                              (const_int -64 [0xffc0]))
                     [1 MEM[(const void * *)_data]+0 S8 A128])]
        [(asm_input:DI ("m"))]
      (clobber (reg:CC 17 flags))])

    and the memory operand size is 1 byte.  As the result, the rest of 511
    bytes is ignored by GCC.  Implement ldtilecfg and sttilecfg intrinsics
    with a pointer to XImode to honor the 512-byte memory block.

    gcc/ChangeLog:

            PR target/114098
            * config/i386/amxtileintrin.h (_tile_loadconfig): Use
            __builtin_ia32_ldtilecfg.
            (_tile_storeconfig): Use __builtin_ia32_sttilecfg.
            * config/i386/i386-builtin.def (BDESC): Add
            __builtin_ia32_ldtilecfg and __builtin_ia32_sttilecfg.
            * config/i386/i386-expand.cc (ix86_expand_builtin): Handle
            IX86_BUILTIN_LDTILECFG and IX86_BUILTIN_STTILECFG.
            * config/i386/i386.md (ldtilecfg): New pattern.
            (sttilecfg): Likewise.

    gcc/testsuite/ChangeLog:

            PR target/114098
            * gcc.target/i386/amxtile-4.c: New test.

    (cherry picked from commit 4972f97a265c574d51e20373ddefd66576051e5c)
[Bug target/114098] _tile_loadconfig doesn't work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098

--- Comment #3 from GCC Commits ---
The master branch has been updated by H.J. Lu:

https://gcc.gnu.org/g:4972f97a265c574d51e20373ddefd66576051e5c

commit r14-9171-g4972f97a265c574d51e20373ddefd66576051e5c
Author: H.J. Lu
Date:   Sun Feb 25 10:21:04 2024 -0800

    x86: Properly implement AMX-TILE load/store intrinsics

    ldtilecfg and sttilecfg take a 512-byte memory block.  With
    _tile_loadconfig implemented as

    extern __inline void
    __attribute__((__gnu_inline__, __always_inline__, __artificial__))
    _tile_loadconfig (const void *__config)
    {
      __asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config)));
    }

    GCC sees:

    (parallel [
      (asm_operands/v ("ldtilecfg %X0") ("") 0
        [(mem/f/c:DI (plus:DI (reg/f:DI 77 virtual-stack-vars)
                              (const_int -64 [0xffc0]))
                     [1 MEM[(const void * *)_data]+0 S8 A128])]
        [(asm_input:DI ("m"))]
      (clobber (reg:CC 17 flags))])

    and the memory operand size is 1 byte.  As the result, the rest of 511
    bytes is ignored by GCC.  Implement ldtilecfg and sttilecfg intrinsics
    with a pointer to XImode to honor the 512-byte memory block.

    gcc/ChangeLog:

            PR target/114098
            * config/i386/amxtileintrin.h (_tile_loadconfig): Use
            __builtin_ia32_ldtilecfg.
            (_tile_storeconfig): Use __builtin_ia32_sttilecfg.
            * config/i386/i386-builtin.def (BDESC): Add
            __builtin_ia32_ldtilecfg and __builtin_ia32_sttilecfg.
            * config/i386/i386-expand.cc (ix86_expand_builtin): Handle
            IX86_BUILTIN_LDTILECFG and IX86_BUILTIN_STTILECFG.
            * config/i386/i386.md (ldtilecfg): New pattern.
            (sttilecfg): Likewise.

    gcc/testsuite/ChangeLog:

            PR target/114098
            * gcc.target/i386/amxtile-4.c: New test.
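
For readers following the thread, here is a rough sketch of how the affected intrinsics are typically called. It is not taken from the commit or from amxtile-4.c. The struct name `tile_config`, its field names, and the helper functions are assumptions made for illustration; the layout follows the architectural tile configuration format, which is 64 bytes (512 bits) long, in line with Comment #2 and with the XImode pointer used by the fix. The intrinsic call is guarded by `__AMX_TILE__` (compile with -mamx-tile), and actually executing it would need AMX-capable hardware and OS support.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#ifdef __AMX_TILE__
#include <immintrin.h>
#endif

/* Hypothetical struct mirroring the 64-byte tile configuration layout.  */
typedef struct
{
  uint8_t palette_id;    /* byte 0 */
  uint8_t start_row;     /* byte 1 */
  uint8_t reserved[14];  /* bytes 2-15, must be zero */
  uint16_t colsb[16];    /* bytes 16-47: bytes per row for each tile */
  uint8_t rows[16];      /* bytes 48-63: number of rows for each tile */
} tile_config;

static_assert (sizeof (tile_config) == 64, "tile configuration is 64 bytes");

/* Fill in a configuration that describes tile 0 as 16 rows of 64 bytes.
   Everything past the first 8 bytes (colsb, rows) is exactly the part the
   old asm operand did not cover.  */
static void
init_tile_config (tile_config *cfg)
{
  memset (cfg, 0, sizeof *cfg);
  cfg->palette_id = 1;
  cfg->colsb[0] = 64;
  cfg->rows[0] = 16;
}

#ifdef __AMX_TILE__
/* Load a freshly initialized configuration, then read it back.  */
static void
load_then_store (tile_config *out)
{
  tile_config cfg;
  init_tile_config (&cfg);
  _tile_loadconfig (&cfg);
  _tile_storeconfig (out);
}
#endif
```

With the old header, the compiler was presumably free to treat the stores to `colsb` and `rows` in a local like `cfg` as dead, since the asm operand only covered the start of the block.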
[Bug target/114098] _tile_loadconfig doesn't work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098

H.J. Lu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2024-02-25

--- Comment #2 from H.J. Lu ---
We should tell GCC that 64 bytes will be accessed by ldtilecfg and sttilecfg.
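
One way to picture the idea in Comment #2 is an inline asm whose memory operand has a 64-byte type. This is only a sketch and is not the committed fix, which adds builtins and machine-description patterns instead. The type `tilecfg_block` and both function names are hypothetical, and the asm is guarded by `__AMX_TILE__` so the block still compiles on toolchains built without -mamx-tile.

```c
#include <assert.h>

/* Hypothetical 64-byte operand type.  may_alias (a GNU extension) keeps the
   access legal whatever type the caller's buffer really has.  */
typedef struct
{
  unsigned char bytes[64];
} __attribute__ ((__may_alias__)) tilecfg_block;

static_assert (sizeof (tilecfg_block) == 64, "operand must span 64 bytes");

#if defined (__AMX_TILE__) && defined (__GNUC__)
/* Sketch only: the "m" operand now has a 64-byte type, so the compiler
   treats the whole block as read by the asm statement.  */
static inline void
load_tile_config_sketch (const void *config)
{
  __asm__ volatile ("ldtilecfg\t%0"
                    :: "m" (*(const tilecfg_block *) config));
}

/* Store counterpart: "=m" marks all 64 bytes as written.  */
static inline void
store_tile_config_sketch (void *config)
{
  __asm__ volatile ("sttilecfg\t%0"
                    : "=m" (*(tilecfg_block *) config));
}
#endif
```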
[Bug target/114098] _tile_loadconfig doesn't work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098

--- Comment #1 from H.J. Lu ---
The problem is that in

extern __inline void
__attribute__((__gnu_inline__, __always_inline__, __artificial__))
_tile_loadconfig (const void *__config)
{
  __asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config)));
}

only 8 bytes are used.
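
The 8-byte figure comes from the type of the operand expression: `*((const void **)__config)` is an lvalue of type `const void *`, so on x86-64 the asm statement is seen as reading one pointer-sized object. A small sketch that makes the sizes visible without touching any AMX instruction follows; `cfg_block` and the two helper functions are made-up names for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the 64-byte tile configuration block.  */
typedef struct
{
  unsigned char bytes[64];
} cfg_block;

static_assert (sizeof (cfg_block) == 64, "block is 64 bytes");

/* Size of the lvalue the old inline asm passed as its "m" operand.
   sizeof does not evaluate its operand, so nothing is dereferenced.  */
static size_t
old_operand_size (const void *config)
{
  return sizeof (*(const void **) config);
}

/* Size of the object the instruction actually reads.  */
static size_t
block_size (const void *config)
{
  return sizeof (*(const cfg_block *) config);
}
```

On an LP64 target `old_operand_size` returns `sizeof (void *)`, that is 8, while `block_size` returns 64.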