[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems

2023-06-02 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

Peter Bergner  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #12 from Peter Bergner  ---
Mike said offline we can mark this as FIXED.

[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems

2023-05-23 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

--- Comment #11 from Peter Bergner  ---
Mike, can we mark this as FIXED now?  ...or are there other changes needed?

[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems

2023-05-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

--- Comment #10 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Michael Meissner:

https://gcc.gnu.org/g:c970030226341f0c7fa9f319b37786ca81703c6d

commit r10-11421-gc970030226341f0c7fa9f319b37786ca81703c6d
Author: Michael Meissner 
Date:   Mon May 22 11:26:08 2023 -0400

Do not generate vmaddfp and vnmsubfp

This is version 3 of the patch.  This is essentially version 1 with the
removal of changes to altivec.md, and cleanup of the comments.

Version 2 generated the vmaddfp and vnmsubfp instructions if -Ofast was
used, and those changes are deleted in this patch.

The Altivec instructions vmaddfp and vnmsubfp have different rounding
behaviors than the VSX xvmaddsp and xvnmsubsp instructions.  In
particular, generating these instructions seems to break Eigen on big
endian systems.

I have done bootstrap builds on power9 little endian (with both IEEE long
double and IBM long double).  I have also done the builds and test on a
power8 big endian system (testing both 32-bit and 64-bit code
generation).  Chip has verified that it fixes the problem that Eigen
encountered.  Can I check this into the master GCC branch?  After a
burn-in period, can I check this patch into the active GCC branches?

Thanks in advance.

2023-05-22   Michael Meissner  

gcc/

PR target/70243
* config/rs6000/vsx.md (vsx_fmav4sf4): Do not generate vmaddfp.  Back
port from master 04/10/2023.
(vsx_nfmsv4sf4): Do not generate vnmsubfp.

gcc/testsuite/

PR target/70243
* gcc.target/powerpc/pr70243.c: New test.  Back port from master
04/10/2023.

[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems

2023-05-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

--- Comment #9 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Michael Meissner:

https://gcc.gnu.org/g:d7d25bcfbd5ee5ef17fabeb67ad5e093cd975a36

commit r11-10807-gd7d25bcfbd5ee5ef17fabeb67ad5e093cd975a36
Author: Michael Meissner 
Date:   Mon May 22 11:17:01 2023 -0400

Do not generate vmaddfp and vnmsubfp

This is version 3 of the patch.  This is essentially version 1 with the
removal of changes to altivec.md, and cleanup of the comments.

Version 2 generated the vmaddfp and vnmsubfp instructions if -Ofast was
used, and those changes are deleted in this patch.

The Altivec instructions vmaddfp and vnmsubfp have different rounding
behaviors than the VSX xvmaddsp and xvnmsubsp instructions.  In
particular, generating these instructions seems to break Eigen on big
endian systems.

I have done bootstrap builds on power9 little endian (with both IEEE long
double and IBM long double).  I have also done the builds and test on a
power8 big endian system (testing both 32-bit and 64-bit code
generation).  Chip has verified that it fixes the problem that Eigen
encountered.  Can I check this into the master GCC branch?  After a
burn-in period, can I check this patch into the active GCC branches?

Thanks in advance.

2023-04-07   Michael Meissner  

gcc/

PR target/70243
* config/rs6000/vsx.md (vsx_fmav4sf4): Do not generate vmaddfp.  Back
port from master 04/10/2023.
(vsx_nfmsv4sf4): Do not generate vnmsubfp.

gcc/testsuite/

PR target/70243
* gcc.target/powerpc/pr70243.c: New test.  Back port from master
04/10/2023.

[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems

2023-05-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

--- Comment #8 from CVS Commits  ---
The releases/gcc-12 branch has been updated by Michael Meissner:

https://gcc.gnu.org/g:3bb91d31a272d7fd9f02301df101e3041d5aeb5d

commit r12-9635-g3bb91d31a272d7fd9f02301df101e3041d5aeb5d
Author: Michael Meissner 
Date:   Mon May 22 11:08:13 2023 -0400

Do not generate vmaddfp and vnmsubfp

This is version 3 of the patch.  This is essentially version 1 with the
removal of changes to altivec.md, and cleanup of the comments.

Version 2 generated the vmaddfp and vnmsubfp instructions if -Ofast was
used, and those changes are deleted in this patch.

The Altivec instructions vmaddfp and vnmsubfp have different rounding
behaviors than the VSX xvmaddsp and xvnmsubsp instructions.  In
particular, generating these instructions seems to break Eigen on big
endian systems.

I have done bootstrap builds on power9 little endian (with both IEEE long
double and IBM long double).  I have also done the builds and test on a
power8 big endian system (testing both 32-bit and 64-bit code
generation).  Chip has verified that it fixes the problem that Eigen
encountered.  Can I check this into the master GCC branch?  After a
burn-in period, can I check this patch into the active GCC branches?

Thanks in advance.

2023-05-22   Michael Meissner  

gcc/

PR target/70243
* config/rs6000/vsx.md (vsx_fmav4sf4): Do not generate vmaddfp.
(vsx_nfmsv4sf4): Do not generate vnmsubfp.  Back port from master
04/10/2023 change.

gcc/testsuite/

PR target/70243
* gcc.target/powerpc/pr70243.c: New test.  Back port from master
04/10/2023 change.

[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems

2023-04-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

--- Comment #7 from CVS Commits  ---
The master branch has been updated by Michael Meissner:

https://gcc.gnu.org/g:725bcdeec60771cb9ee387978716028b64ea1b7f

commit r13-7132-g725bcdeec60771cb9ee387978716028b64ea1b7f
Author: Michael Meissner 
Date:   Sun Apr 9 23:32:27 2023 -0400

Do not generate vmaddfp and vnmsubfp

This is version 3 of the patch.  This is essentially version 1 with the
removal of changes to altivec.md, and cleanup of the comments.

Version 2 generated the vmaddfp and vnmsubfp instructions if -Ofast was
used, and those changes are deleted in this patch.

The Altivec instructions vmaddfp and vnmsubfp have different rounding
behaviors than the VSX xvmaddsp and xvnmsubsp instructions due to VSCR[NJ]
and other corner cases.  In particular, generating these instructions seems
to break Eigen on big endian systems.

2023-04-09   Michael Meissner  

gcc/

PR target/70243
* config/rs6000/vsx.md (vsx_fmav4sf4): Do not generate vmaddfp.
(vsx_nfmsv4sf4): Do not generate vnmsubfp.

gcc/testsuite/

PR target/70243
* gcc.target/powerpc/pr70243.c: New test.
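
For readers without the patch at hand, the kind of source that reaches the two patterns named above can be sketched as follows.  This is an illustration only, not the pr70243.c test added by the commit: the type name v4sf and both function names are made up here, and it relies on the GCC vector extension.  When built for a VSX-capable PowerPC target (for example with -O2 -mcpu=power8), GCC may contract each expression into a single fused instruction; after this change one would expect the VSX forms rather than vmaddfp or vnmsubfp.

```c
/* Illustrative sketch; not the actual gcc.target/powerpc/pr70243.c.  */

/* Four packed floats, i.e. what the RTL calls V4SFmode.  */
typedef float v4sf __attribute__ ((vector_size (16)));

/* a * b + c on each lane: the multiply-add shape handled by the
   vsx_fmav4sf4 pattern once the compiler contracts it.  */
v4sf
v4sf_fma (v4sf a, v4sf b, v4sf c)
{
  return a * b + c;
}

/* -(a * b - c) on each lane: the negated multiply-subtract shape
   handled by the vsx_nfmsv4sf4 pattern.  */
v4sf
v4sf_nfms (v4sf a, v4sf b, v4sf c)
{
  return -(a * b - c);
}
```

Inspecting the assembly (gcc -S) for these two functions on a PowerPC target is one way to see which instruction family a given compiler version picks.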

[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems

2023-04-06 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

Segher Boessenkool  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW
           Priority|P3                          |P1

--- Comment #6 from Segher Boessenkool  ---
We should not use any VMX insn unless explicitly asked for it, since those
do not work as expected if VSCR[NJ]=1, which unfortunately is the default on
Linux (but not on powerpc64le-linux; that is a separate (kernel) bug).

Rounding mode does not matter too much, if we have some subset of fast-math
anyway (the only rounding mode in VMX is round-to-nearest-ties-to-even, which
is the default for most everything else).

But NJ=1 makes arithmetic behave completely unexpectedly, and it isn't
actually faster than NJ=0 on modern hardware anyway.  We cannot change the
default for setting NJ because some code might rely on it, unfortunately.
Luckily, disabling the automatic generation of all VMX insns (i.e. generating
them only when explicitly asked for) isn't all that expensive; it just ends
up as a few more move instructions here and there.

This isn't a regression, but we should have this in GCC 13.
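
To make the NJ=1 effect concrete, here is a rough portable model of it, written as a sketch under stated assumptions rather than as a description of the hardware.  It assumes the simplified rule that subnormal inputs and subnormal results are replaced by a zero of the same sign; the helper names (is_subnormal, flush, madd_ieee, madd_nj) are invented for this illustration, and other corner cases are ignored.

```c
#include <float.h>

/* Hypothetical helper: nonzero but smaller in magnitude than FLT_MIN.  */
static int
is_subnormal (float x)
{
  return x != 0.0f && x > -FLT_MIN && x < FLT_MIN;
}

/* Simplified model of NJ=1: a subnormal value becomes a zero of the
   same sign; everything else passes through unchanged.  */
static float
flush (float x)
{
  if (!is_subnormal (x))
    return x;
  return x < 0.0f ? -0.0f : 0.0f;
}

/* Multiply-add with ordinary IEEE arithmetic (the compiler may or may
   not fuse it; that does not matter for the values discussed here).  */
float
madd_ieee (float a, float b, float c)
{
  return a * b + c;
}

/* The same operation with the NJ=1 model applied to inputs and result.  */
float
madd_nj (float a, float b, float c)
{
  return flush (flush (a) * flush (b) + flush (c));
}
```

With this model, madd_ieee (FLT_MIN, 0.5f, 0.0f) yields a small nonzero subnormal while madd_nj (FLT_MIN, 0.5f, 0.0f) yields zero, which is the sort of silent difference that can surface when the compiler picks a VMX instruction on its own.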

[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems

2023-04-05 Thread meissner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

--- Comment #5 from Michael Meissner  ---
Created attachment 54814
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54814&action=edit
Test case

This is a test case that shows the generation of vmaddfp and vnmsubfp.

[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems

2023-04-05 Thread chip.kerchner at ibm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

--- Comment #4 from Chip Kerchner  ---
It shows up as a rounding difference on BE machines.

[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems

2023-04-05 Thread chip.kerchner at ibm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

Chip Kerchner  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |chip.kerchner at ibm dot com

--- Comment #3 from Chip Kerchner  ---
This is showing up in some of the binaries generated by Eigen (with GCC13).

[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems

2022-01-13 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

Segher Boessenkool  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |WAITING
                 CC|                            |segher at gcc dot gnu.org

--- Comment #2 from Segher Boessenkool  ---
Do you have a testcase?

[Bug target/70243] PowerPC V4SFmode should not use Altivec instructions on VSX systems

2016-03-15 Thread meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243

Michael Meissner  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|PowerPC V4DFmode should not |PowerPC V4SFmode should not
                   |use Altivec instructions on |use Altivec instructions on
                   |VSX systems                 |VSX systems

--- Comment #1 from Michael Meissner  ---
Note, I meant V4SFmode (i.e. vector float), not V4DFmode.