[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

Peter Bergner changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #12 from Peter Bergner ---
Mike said offline we can mark this as FIXED.
[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

--- Comment #11 from Peter Bergner ---
Mike, can we mark this as FIXED now? ...or are there other changes needed?
[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

--- Comment #10 from CVS Commits ---
The releases/gcc-10 branch has been updated by Michael Meissner:

https://gcc.gnu.org/g:c970030226341f0c7fa9f319b37786ca81703c6d

commit r10-11421-gc970030226341f0c7fa9f319b37786ca81703c6d
Author: Michael Meissner
Date:   Mon May 22 11:26:08 2023 -0400

    Do not generate vmaddfp and vnmsubfp

    This is version 3 of the patch.  This is essentially version 1 with the
    removal of changes to altivec.md, and cleanup of the comments.

    Version 2 generated the vmaddfp and vnmsubfp instructions if -Ofast was
    used, and those changes are deleted in this patch.

    The Altivec instructions vmaddfp and vnmsubfp have different rounding
    behaviors than the VSX xvmaddsp and xvnmsubsp instructions.  In particular,
    generating these instructions seems to break Eigen on big endian systems.

    I have done bootstrap builds on power9 little endian (with both IEEE long
    double and IBM long double).  I have also done the builds and test on a
    power8 big endian system (testing both 32-bit and 64-bit code generation).
    Chip has verified that it fixes the problem that Eigen encountered.

    Can I check this into the master GCC branch?  After a burn-in period, can I
    check this patch into the active GCC branches?

    Thanks in advance.

    2023-05-22  Michael Meissner

    gcc/

            PR target/70243
            * config/rs6000/vsx.md (vsx_fmav4sf4): Do not generate vmaddfp.  Back
            port from master 04/10/2023.
            (vsx_nfmsv4sf4): Do not generate vnmsubfp.

    gcc/testsuite/

            PR target/70243
            * gcc.target/powerpc/pr70243.c: New test.  Back port from master
            04/10/2023.
[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

--- Comment #9 from CVS Commits ---
The releases/gcc-11 branch has been updated by Michael Meissner:

https://gcc.gnu.org/g:d7d25bcfbd5ee5ef17fabeb67ad5e093cd975a36

commit r11-10807-gd7d25bcfbd5ee5ef17fabeb67ad5e093cd975a36
Author: Michael Meissner
Date:   Mon May 22 11:17:01 2023 -0400

    Do not generate vmaddfp and vnmsubfp

    This is version 3 of the patch.  This is essentially version 1 with the
    removal of changes to altivec.md, and cleanup of the comments.

    Version 2 generated the vmaddfp and vnmsubfp instructions if -Ofast was
    used, and those changes are deleted in this patch.

    The Altivec instructions vmaddfp and vnmsubfp have different rounding
    behaviors than the VSX xvmaddsp and xvnmsubsp instructions.  In particular,
    generating these instructions seems to break Eigen on big endian systems.

    I have done bootstrap builds on power9 little endian (with both IEEE long
    double and IBM long double).  I have also done the builds and test on a
    power8 big endian system (testing both 32-bit and 64-bit code generation).
    Chip has verified that it fixes the problem that Eigen encountered.

    Can I check this into the master GCC branch?  After a burn-in period, can I
    check this patch into the active GCC branches?

    Thanks in advance.

    2023-04-07  Michael Meissner

    gcc/

            PR target/70243
            * config/rs6000/vsx.md (vsx_fmav4sf4): Do not generate vmaddfp.  Back
            port from master 04/10/2023.
            (vsx_nfmsv4sf4): Do not generate vnmsubfp.

    gcc/testsuite/

            PR target/70243
            * gcc.target/powerpc/pr70243.c: New test.  Back port from master
            04/10/2023.
[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

--- Comment #8 from CVS Commits ---
The releases/gcc-12 branch has been updated by Michael Meissner:

https://gcc.gnu.org/g:3bb91d31a272d7fd9f02301df101e3041d5aeb5d

commit r12-9635-g3bb91d31a272d7fd9f02301df101e3041d5aeb5d
Author: Michael Meissner
Date:   Mon May 22 11:08:13 2023 -0400

    Do not generate vmaddfp and vnmsubfp

    This is version 3 of the patch.  This is essentially version 1 with the
    removal of changes to altivec.md, and cleanup of the comments.

    Version 2 generated the vmaddfp and vnmsubfp instructions if -Ofast was
    used, and those changes are deleted in this patch.

    The Altivec instructions vmaddfp and vnmsubfp have different rounding
    behaviors than the VSX xvmaddsp and xvnmsubsp instructions.  In particular,
    generating these instructions seems to break Eigen on big endian systems.

    I have done bootstrap builds on power9 little endian (with both IEEE long
    double and IBM long double).  I have also done the builds and test on a
    power8 big endian system (testing both 32-bit and 64-bit code generation).
    Chip has verified that it fixes the problem that Eigen encountered.

    Can I check this into the master GCC branch?  After a burn-in period, can I
    check this patch into the active GCC branches?

    Thanks in advance.

    2023-05-22  Michael Meissner

    gcc/

            PR target/70243
            * config/rs6000/vsx.md (vsx_fmav4sf4): Do not generate vmaddfp.
            (vsx_nfmsv4sf4): Do not generate vnmsubfp.  Back port from master
            04/10/2023 change.

    gcc/testsuite/

            PR target/70243
            * gcc.target/powerpc/pr70243.c: New test.  Back port from master
            04/10/2023 change.
[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

--- Comment #7 from CVS Commits ---
The master branch has been updated by Michael Meissner:

https://gcc.gnu.org/g:725bcdeec60771cb9ee387978716028b64ea1b7f

commit r13-7132-g725bcdeec60771cb9ee387978716028b64ea1b7f
Author: Michael Meissner
Date:   Sun Apr 9 23:32:27 2023 -0400

    Do not generate vmaddfp and vnmsubfp

    This is version 3 of the patch.  This is essentially version 1 with the
    removal of changes to altivec.md, and cleanup of the comments.

    Version 2 generated the vmaddfp and vnmsubfp instructions if -Ofast was
    used, and those changes are deleted in this patch.

    The Altivec instructions vmaddfp and vnmsubfp have different rounding
    behaviors than the VSX xvmaddsp and xvnmsubsp instructions due to VSCR[NJ]
    and other corner cases.  In particular, generating these instructions seems
    to break Eigen on big endian systems.

    2023-04-09  Michael Meissner

    gcc/

            PR target/70243
            * config/rs6000/vsx.md (vsx_fmav4sf4): Do not generate vmaddfp.
            (vsx_nfmsv4sf4): Do not generate vnmsubfp.

    gcc/testsuite/

            PR target/70243
            * gcc.target/powerpc/pr70243.c: New test.
[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

Segher Boessenkool changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW
           Priority|P3                          |P1

--- Comment #6 from Segher Boessenkool ---
We should not use any VMX insn unless explicitly asked for it, since those do
not work as expected if VSCR[NJ]=1, which unfortunately is the default on Linux
(but not on powerpc64le-linux; that is a separate (kernel) bug).

Rounding mode does not matter too much if we have some subset of fast-math
anyway (the only rounding mode in VMX is round-to-nearest-ties-to-even, which
is the default for most everything else). But NJ=1 makes arithmetic behave
completely unexpectedly, and it isn't actually faster than NJ=0 on modern
hardware anyway.

We cannot change the default for setting NJ because some code might rely on it,
unfortunately. Luckily, disabling the automatic generation of all VMX insns
(i.e. generation without it being explicitly asked for) isn't all that
expensive; it just ends up as a few more move instructions here and there.

This isn't a regression, but we should have this in GCC 13.
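
To make the NJ=1 concern above concrete, here is a minimal software sketch in portable C. It only models the flush-to-zero behaviour generally described for non-Java mode (subnormal inputs read as zero, subnormal results written as zero); it does not touch VSCR, and the helper names flush_subnormal, madd_ieee and madd_nj are hypothetical, introduced purely for illustration.

```c
#include <float.h>

/* Hypothetical helper: models a flush-to-zero unit acting on one float.
   Subnormal values become a zero of the same sign; everything else passes
   through unchanged.  */
static float
flush_subnormal (float x)
{
  if (x != 0.0f && x > -FLT_MIN && x < FLT_MIN)
    return x < 0.0f ? -0.0f : 0.0f;
  return x;
}

/* a*b + c with subnormals kept, as an IEEE-conforming unit would do.  */
float
madd_ieee (float a, float b, float c)
{
  return a * b + c;
}

/* Assumed model of the same operation under NJ=1: inputs and the result
   are flushed.  This is a simulation, not the hardware instruction.  */
float
madd_nj (float a, float b, float c)
{
  float r = flush_subnormal (a) * flush_subnormal (b) + flush_subnormal (c);
  return flush_subnormal (r);
}
```

With a = b = 0x1p-70f and c = 0.0f the exact product 0x1p-140f is subnormal, so madd_ieee keeps it while madd_nj returns zero; for ordinary operands the two agree. That is the kind of silent divergence a compiler-chosen VMX instruction could introduce next to VSX code.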
[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

--- Comment #5 from Michael Meissner ---
Created attachment 54814
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54814&action=edit
Test case

This is a test case that shows the generation of vmaddfp and vnmsubfp.
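
For readers without access to the attachment, the following is a rough sketch of the shape such a test might take. It is not the attached file or the pr70243.c test from the commits; the v4sf typedef and the function names do_fmadd and do_nfmsub are assumptions made for illustration, and it uses GCC's generic vector extension so it compiles on any target.

```c
/* Sketch only: NOT the attachment from this PR.  */

/* Four packed floats, i.e. what the PowerPC back end calls V4SFmode.  */
typedef float v4sf __attribute__ ((vector_size (16)));

/* (a * b) + c per lane: a candidate for a fused multiply-add.  */
v4sf
do_fmadd (v4sf a, v4sf b, v4sf c)
{
  return (a * b) + c;
}

/* -((a * b) - c) per lane: a candidate for a negated multiply-subtract.  */
v4sf
do_nfmsub (v4sf a, v4sf b, v4sf c)
{
  return -((a * b) - c);
}
```

Compiled for a VSX-capable PowerPC target with optimization enabled, one would expect to inspect the generated assembly for xvmaddsp/xvnmsubsp variants rather than vmaddfp/vnmsubfp; the exact options and the instructions actually chosen depend on the compiler version and flags.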
[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243 --- Comment #4 from Chip Kerchner --- It shows up as a rounding difference on BE machines.
[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

Chip Kerchner changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |chip.kerchner at ibm dot com

--- Comment #3 from Chip Kerchner ---
This is showing up in some of the binaries generated by Eigen (with GCC13).
[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

Segher Boessenkool changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |WAITING
                 CC|                            |segher at gcc dot gnu.org

--- Comment #2 from Segher Boessenkool ---
Do you have a testcase?
[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

Michael Meissner changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|PowerPC V4DFmode should not |PowerPC V4SFmode should not
                   |use Altivec instructions on |use Altivec instructions on
                   |VSX systems                 |VSX systems

--- Comment #1 from Michael Meissner ---
Note, I meant V4SFmode (i.e. vector float), not V4DFmode.