[Bug target/114098] _tile_loadconfig doesn't work

2024-02-27 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098

H.J. Lu  changed:

   What|Removed |Added

   Target Milestone|--- |11.5
 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #7 from H.J. Lu  ---
Fixed for 11.5, 12.4, 13.3 and 14.

[Bug target/114098] _tile_loadconfig doesn't work

2024-02-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098

--- Comment #6 from GCC Commits  ---
The releases/gcc-11 branch has been updated by H.J. Lu :

https://gcc.gnu.org/g:26b1012c26c4b4de0b4561e74b856a7f7d259a48

commit r11-11258-g26b1012c26c4b4de0b4561e74b856a7f7d259a48
Author: H.J. Lu 
Date:   Sun Feb 25 10:21:04 2024 -0800

x86: Properly implement AMX-TILE load/store intrinsics

ldtilecfg and sttilecfg take a 512-byte memory block.  With
_tile_loadconfig implemented as

extern __inline void
__attribute__((__gnu_inline__, __always_inline__, __artificial__))
_tile_loadconfig (const void *__config)
{
  __asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config)));
}

GCC sees:

(parallel [
  (asm_operands/v ("ldtilecfg   %X0") ("") 0
   [(mem/f/c:DI (plus:DI (reg/f:DI 77 virtual-stack-vars)
 (const_int -64 [0xffc0])) [1
MEM[(const void * *)_data]+0 S8 A128])]
   [(asm_input:DI ("m"))]
   (clobber (reg:CC 17 flags))])

and the memory operand size is 1 byte.  As the result, the rest of 511
bytes is ignored by GCC.  Implement ldtilecfg and sttilecfg intrinsics
with a pointer to XImode to honor the 512-byte memory block.

gcc/ChangeLog:

PR target/114098
* config/i386/amxtileintrin.h (_tile_loadconfig): Use
__builtin_ia32_ldtilecfg.
(_tile_storeconfig): Use __builtin_ia32_sttilecfg.
* config/i386/i386-builtin.def (BDESC): Add
__builtin_ia32_ldtilecfg and __builtin_ia32_sttilecfg.
* config/i386/i386-expand.c (ix86_expand_builtin): Handle
IX86_BUILTIN_LDTILECFG and IX86_BUILTIN_STTILECFG.
* config/i386/i386.md (ldtilecfg): New pattern.
(sttilecfg): Likewise.

gcc/testsuite/ChangeLog:

PR target/114098
* gcc.target/i386/amxtile-4.c: New test.

(cherry picked from commit 4972f97a265c574d51e20373ddefd66576051e5c)

[Bug target/114098] _tile_loadconfig doesn't work

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098

--- Comment #5 from GCC Commits  ---
The releases/gcc-12 branch has been updated by H.J. Lu :

https://gcc.gnu.org/g:23f4aa6c68e24a76d3784bcfdad5a53e46cd8f95

commit r12-10180-g23f4aa6c68e24a76d3784bcfdad5a53e46cd8f95
Author: H.J. Lu 
Date:   Sun Feb 25 10:21:04 2024 -0800

x86: Properly implement AMX-TILE load/store intrinsics

ldtilecfg and sttilecfg take a 512-byte memory block.  With
_tile_loadconfig implemented as

extern __inline void
__attribute__((__gnu_inline__, __always_inline__, __artificial__))
_tile_loadconfig (const void *__config)
{
  __asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config)));
}

GCC sees:

(parallel [
  (asm_operands/v ("ldtilecfg   %X0") ("") 0
   [(mem/f/c:DI (plus:DI (reg/f:DI 77 virtual-stack-vars)
 (const_int -64 [0xffc0])) [1
MEM[(const void * *)_data]+0 S8 A128])]
   [(asm_input:DI ("m"))]
   (clobber (reg:CC 17 flags))])

and the memory operand size is 1 byte.  As the result, the rest of 511
bytes is ignored by GCC.  Implement ldtilecfg and sttilecfg intrinsics
with a pointer to XImode to honor the 512-byte memory block.

gcc/ChangeLog:

PR target/114098
* config/i386/amxtileintrin.h (_tile_loadconfig): Use
__builtin_ia32_ldtilecfg.
(_tile_storeconfig): Use __builtin_ia32_sttilecfg.
* config/i386/i386-builtin.def (BDESC): Add
__builtin_ia32_ldtilecfg and __builtin_ia32_sttilecfg.
* config/i386/i386-expand.cc (ix86_expand_builtin): Handle
IX86_BUILTIN_LDTILECFG and IX86_BUILTIN_STTILECFG.
* config/i386/i386.md (ldtilecfg): New pattern.
(sttilecfg): Likewise.

gcc/testsuite/ChangeLog:

PR target/114098
* gcc.target/i386/amxtile-4.c: New test.

(cherry picked from commit 4972f97a265c574d51e20373ddefd66576051e5c)

[Bug target/114098] _tile_loadconfig doesn't work

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098

--- Comment #4 from GCC Commits  ---
The releases/gcc-13 branch has been updated by H.J. Lu :

https://gcc.gnu.org/g:2b3ecdf4fb13471b69d80583e10c5baedfe84d7c

commit r13-8365-g2b3ecdf4fb13471b69d80583e10c5baedfe84d7c
Author: H.J. Lu 
Date:   Sun Feb 25 10:21:04 2024 -0800

x86: Properly implement AMX-TILE load/store intrinsics

ldtilecfg and sttilecfg take a 512-byte memory block.  With
_tile_loadconfig implemented as

extern __inline void
__attribute__((__gnu_inline__, __always_inline__, __artificial__))
_tile_loadconfig (const void *__config)
{
  __asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config)));
}

GCC sees:

(parallel [
  (asm_operands/v ("ldtilecfg   %X0") ("") 0
   [(mem/f/c:DI (plus:DI (reg/f:DI 77 virtual-stack-vars)
 (const_int -64 [0xffc0])) [1
MEM[(const void * *)_data]+0 S8 A128])]
   [(asm_input:DI ("m"))]
   (clobber (reg:CC 17 flags))])

and the memory operand size is 1 byte.  As the result, the rest of 511
bytes is ignored by GCC.  Implement ldtilecfg and sttilecfg intrinsics
with a pointer to XImode to honor the 512-byte memory block.

gcc/ChangeLog:

PR target/114098
* config/i386/amxtileintrin.h (_tile_loadconfig): Use
__builtin_ia32_ldtilecfg.
(_tile_storeconfig): Use __builtin_ia32_sttilecfg.
* config/i386/i386-builtin.def (BDESC): Add
__builtin_ia32_ldtilecfg and __builtin_ia32_sttilecfg.
* config/i386/i386-expand.cc (ix86_expand_builtin): Handle
IX86_BUILTIN_LDTILECFG and IX86_BUILTIN_STTILECFG.
* config/i386/i386.md (ldtilecfg): New pattern.
(sttilecfg): Likewise.

gcc/testsuite/ChangeLog:

PR target/114098
* gcc.target/i386/amxtile-4.c: New test.

(cherry picked from commit 4972f97a265c574d51e20373ddefd66576051e5c)

[Bug target/114098] _tile_loadconfig doesn't work

2024-02-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098

--- Comment #3 from GCC Commits  ---
The master branch has been updated by H.J. Lu :

https://gcc.gnu.org/g:4972f97a265c574d51e20373ddefd66576051e5c

commit r14-9171-g4972f97a265c574d51e20373ddefd66576051e5c
Author: H.J. Lu 
Date:   Sun Feb 25 10:21:04 2024 -0800

x86: Properly implement AMX-TILE load/store intrinsics

ldtilecfg and sttilecfg take a 512-byte memory block.  With
_tile_loadconfig implemented as

extern __inline void
__attribute__((__gnu_inline__, __always_inline__, __artificial__))
_tile_loadconfig (const void *__config)
{
  __asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config)));
}

GCC sees:

(parallel [
  (asm_operands/v ("ldtilecfg   %X0") ("") 0
   [(mem/f/c:DI (plus:DI (reg/f:DI 77 virtual-stack-vars)
 (const_int -64 [0xffc0])) [1
MEM[(const void * *)_data]+0 S8 A128])]
   [(asm_input:DI ("m"))]
   (clobber (reg:CC 17 flags))])

and the memory operand size is 1 byte.  As the result, the rest of 511
bytes is ignored by GCC.  Implement ldtilecfg and sttilecfg intrinsics
with a pointer to XImode to honor the 512-byte memory block.

gcc/ChangeLog:

PR target/114098
* config/i386/amxtileintrin.h (_tile_loadconfig): Use
__builtin_ia32_ldtilecfg.
(_tile_storeconfig): Use __builtin_ia32_sttilecfg.
* config/i386/i386-builtin.def (BDESC): Add
__builtin_ia32_ldtilecfg and __builtin_ia32_sttilecfg.
* config/i386/i386-expand.cc (ix86_expand_builtin): Handle
IX86_BUILTIN_LDTILECFG and IX86_BUILTIN_STTILECFG.
* config/i386/i386.md (ldtilecfg): New pattern.
(sttilecfg): Likewise.

gcc/testsuite/ChangeLog:

PR target/114098
* gcc.target/i386/amxtile-4.c: New test.

[Bug target/114098] _tile_loadconfig doesn't work

2024-02-25 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098

H.J. Lu  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-02-25

--- Comment #2 from H.J. Lu  ---
We should tell GCC that 64 bytes will be accessed by ldtilecfg and sttilecfg.

[Bug target/114098] _tile_loadconfig doesn't work

2024-02-25 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098

--- Comment #1 from H.J. Lu  ---
The problem is that in

extern __inline void
__attribute__((__gnu_inline__, __always_inline__, __artificial__))
_tile_loadconfig (const void *__config)
{
  __asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config)));
}

only 8 bytes are used.