https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114944
--- Comment #2 from John Platts ---
Here is more optimal codegen for SSE2ShuffleI8 on x86_64:
SSE2ShuffleI8(long long __vector(2), long long __vector(2)):
pand xmm1, XMMWORD PTR .LC0[rip]
movaps XMMWORD PTR [rsp-24], xmm0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114944
John Platts changed:
What    |Removed |Added
Target  |        |x86_64-*-*, i?86-*-*
--- Comment #1 from
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: john_platts at hotmail dot com
Target Milestone: ---
Here is a snippet of code that has suboptimal codegen on SSE2:
#include
#include
__m128i SSE2ShuffleI8
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: john_platts at hotmail dot com
Target Milestone: ---
POWER9 has instructions for _Float16 to float, _Float16 to double, float to
_Float16, and double to _Float16 conversions, but the _Float16 type is not
currently supported
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960
--- Comment #11 from John Platts ---
Created attachment 55869
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55869&action=edit
Test program to reproduce GCC 12 compilation bug
Here is the expected output of the ppc9_test_sat_add_090923.cpp test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960
--- Comment #10 from John Platts ---
Created attachment 55868
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55868&action=edit
Test program to reproduce SatWidenMulPairwiseAdd compilation bug
The ppc9_test_sat_widen_pairwise_add_090923_2b.cpp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960
--- Comment #9 from John Platts ---
Created attachment 55867
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55867&action=edit
Test program to reproduce SatWidenMulPairwiseAdd compilation bug
The attached
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960
--- Comment #6 from John Platts ---
Need to use revision ff1ad85a96c0bc8483b582d6dbceb8bc07edd226 of Google Highway
to reproduce the PPC9 codegen bug with GCC 12 as the TestSatWidenMulPairwiseAdd
will now pass on PPC9 due to a recent update to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960
--- Comment #5 from John Platts ---
The version of Google Highway with the TestSatWidenMulPairwiseAdd changes to
get TestSatWidenMulPairwiseAdd to pass successfully on POWER9 with the
"-mcpu=power9 -DHWY_DISABLED_TARGETS=6918232715082858496
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960
--- Comment #4 from John Platts ---
I had made some changes to TestSatWidenMulPairwiseAdd in hwy/tests/mul_test.cc
that would get TestSatWidenMulPairwiseAdd to pass successfully on POWER9 when
compiled with GCC 12 with the "-mcpu=power9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960
--- Comment #3 from John Platts ---
Here is the output of running the "./tests/mul_test" program in the Google
Highway test suite when compiled with the "-mcpu=power8
-DHWY_DISABLED_TARGETS=6917951240106147840" options when compiled with GCC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960
--- Comment #2 from John Platts ---
Created attachment 55711
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55711&action=edit
Test program to reproduce SatWidenMulPairwiseAdd compilation bug (requires
CMake and Google Highway)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960
--- Comment #1 from John Platts ---
Created attachment 55710
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55710&action=edit
Test program to reproduce SatWidenMulPairwiseAdd compilation bug
The attached
Version: 12.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: john_platts at hotmail dot com
Target Milestone: ---
Here are the steps to reproduce
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: john_platts at hotmail dot com
Target Milestone: ---
Created attachment 55582
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55582&action=edit
POWE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109069
--- Comment #5 from John Platts ---
Here is another test program that shows the same code generation bug when a
splat followed by a vec_sld is incorrectly optimized by gcc 12.2.0 on
powerpc64-linux-gnu and powerpc64le-linux-gnu with the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109069
--- Comment #4 from John Platts ---
Here is another test program that exposes the optimization bug with applying
the vec_sl operation to a constant vector (which generates incorrect results on
both big-endian and little-endian POWER10 when
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109069
--- Comment #3 from John Platts ---
Here is another test program that reproduces the vector truncation test issue:
#pragma push_macro("vector")
#pragma push_macro("pixel")
#pragma push_macro("bool")
#undef vector
#undef pixel
#undef bool
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109069
--- Comment #1 from John Platts ---
The C++ test program below does generate the correct results when compiled with
the -mcpu=power10 -O0 options.
Version: 12.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: john_platts at hotmail dot com
Target Milestone: ---
The following C++ test program generates a test failure
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: john_platts at hotmail dot com
Target Milestone: ---
Here is some C++ code that generates suboptimal code with the -O2
-march=skylake-avx512 -m32
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: john_platts at hotmail dot com
Target Milestone: ---
The below code generates suboptimal code if SSE2 is enabled but SSSE3
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: john_platts at hotmail dot com
Target Milestone: ---
The following code fails to compile with GCC 12 but compiles successfully on
clang (with the -std=c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103611
--- Comment #4 from John Platts ---
(In reply to Andrew Pinski from comment #3)
> Hmm, GCC 4.8.1-5.5.0 produces:
> long long SSE2ExtractInt64<0>(long long __vector):
> .LFB499:
> .cfi_startproc
> pshufd xmm1, xmm0, 1
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103611
--- Comment #2 from John Platts ---
Here is some code for extracting 64-bit integers from a SSE2 vector using GCC
vector extensions:
#include
#include
using Int64M128Vect [[__gnu__::__vector_size__(16)]] = std::int64_t;
template
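
For readers who want something compilable, here is a minimal sketch of lane
extraction with GCC vector extensions. It is not the code from the report,
which is cut off above. The names MakeVectSketch and ExtractInt64Sketch are
hypothetical, and the vector type is declared with the typedef/__attribute__
spelling rather than the alias form used in the report.

```cpp
#include <cstdint>

// Two 64-bit lanes in a 16-byte vector, declared with the GCC/Clang
// vector_size attribute (typedef spelling; the report uses an alias).
typedef std::int64_t Int64M128Vect __attribute__((vector_size(16)));

// Hypothetical helper: builds a vector from two scalar lanes.
inline Int64M128Vect MakeVectSketch(std::int64_t lane0, std::int64_t lane1) noexcept {
  Int64M128Vect v = {lane0, lane1};
  return v;
}

// Hypothetical helper: returns lane ElemIdx via vector subscripting.
// The instructions the compiler picks for this subscript are what a
// codegen report of this kind would examine.
template <int ElemIdx>
std::int64_t ExtractInt64Sketch(Int64M128Vect vect) noexcept {
  static_assert(ElemIdx == 0 || ElemIdx == 1, "ElemIdx must be 0 or 1");
  return vect[ElemIdx];
}
```

Compiling this with -O2 -msse2 and inspecting the assembly for
ExtractInt64Sketch<0> and ExtractInt64Sketch<1> is one way to compare output
across compiler versions.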
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103611
--- Comment #1 from John Platts ---
Here is some C++ code for extracting 64-bit integers from a __m128i vector
using SSE4.1:
#include
#include
template
std::int64_t SSE41ExtractInt64(__m128i vect) noexcept {
static_assert(ElemIdx ==
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: john_platts at hotmail dot com
Target Milestone: ---
Here is some code for extracting 64-bit integers from a SSE2 vector:
#include
#include
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412
John Platts changed:
What    |Removed |Added
CC      |        |john_platts at hotmail dot com