[Bug target/114944] Codegen of __builtin_shuffle for an 16-byte uint8_t vector is suboptimal on SSE2

2024-05-06 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114944 --- Comment #2 from John Platts --- Here is more optimal codegen for SSE2ShuffleI8 on x86_64: SSE2ShuffleI8(long long __vector(2), long long __vector(2)): pand xmm1, XMMWORD PTR .LC0[rip] movaps XMMWORD PTR [rsp-24], xmm0

[Bug target/114944] Codegen of __builtin_shuffle for an 16-byte uint8_t vector is suboptimal on SSE2

2024-05-04 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114944 John Platts changed:
What   | Removed | Added
Target |         | x86_64-*-*, i?86-*-*
--- Comment #1 from

[Bug target/114944] New: Codegen of __builtin_shuffle for an 16-byte uint8_t vector is suboptimal on SSE2

2024-05-04 Thread john_platts at hotmail dot com via Gcc-bugs
Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: john_platts at hotmail dot com Target Milestone: --- Here is a snippet of code that has suboptimal codegen on SSE2: #include #include __m128i SSE2ShuffleI8
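For readers unfamiliar with the builtin named in the bug title, here is a minimal sketch of what a 16-lane byte shuffle computes. It is not the reporter's SSE2ShuffleI8 code, which is truncated above; the names U8x16, ShuffleU8Reference, ShuffleU8 and ShuffleU8SelfCheck are hypothetical. It assumes the GCC vector_size extension and the documented behaviour that __builtin_shuffle takes each index modulo the number of lanes.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// 16 lanes of uint8_t, declared with the GCC/Clang vector_size extension.
typedef std::uint8_t U8x16 __attribute__((vector_size(16)));

// Scalar reference: lane i of the result is v[idx[i] mod 16].
inline U8x16 ShuffleU8Reference(U8x16 v, U8x16 idx) {
  std::uint8_t in[16], sel[16], out[16];
  std::memcpy(in, &v, sizeof(in));
  std::memcpy(sel, &idx, sizeof(sel));
  for (int i = 0; i < 16; ++i) {
    out[i] = in[sel[i] & 15];
  }
  U8x16 result;
  std::memcpy(&result, out, sizeof(result));
  return result;
}

// Uses the GCC builtin where it exists. Clang does not provide
// __builtin_shuffle, so the scalar reference stands in there.
inline U8x16 ShuffleU8(U8x16 v, U8x16 idx) {
#if defined(__GNUC__) && !defined(__clang__)
  return __builtin_shuffle(v, idx);
#else
  return ShuffleU8Reference(v, idx);
#endif
}

// Compares both paths on a pattern that includes indices above 15.
inline void ShuffleU8SelfCheck() {
  std::uint8_t data[16], sel[16];
  for (int i = 0; i < 16; ++i) {
    data[i] = static_cast<std::uint8_t>(100 + i);
    sel[i] = static_cast<std::uint8_t>(37 * i + 5);
  }
  U8x16 v, idx;
  std::memcpy(&v, data, sizeof(v));
  std::memcpy(&idx, sel, sizeof(idx));
  const U8x16 a = ShuffleU8(v, idx);
  const U8x16 b = ShuffleU8Reference(v, idx);
  assert(std::memcmp(&a, &b, sizeof(a)) == 0);
}
```

Compiling ShuffleU8 with GCC at -O2 with only SSE2 enabled and inspecting the assembly is one way to look at the code generation the report discusses.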

[Bug target/113484] New: Add support for _Float16 type on PowerPC

2024-01-18 Thread john_platts at hotmail dot com via Gcc-bugs
: target Assignee: unassigned at gcc dot gnu.org Reporter: john_platts at hotmail dot com Target Milestone: --- POWER9 has instructions for _Float16 to float, _Float16 to double, float to _Float16, and double to _Float16 conversions, but the _Float16 type is not currently supported

[Bug target/110960] TestSatWidenMulPairwiseAdd in the Google Highway test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option

2023-09-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 --- Comment #11 from John Platts --- Created attachment 55869 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55869&action=edit Test program to reproduce GCC 12 compilation bug Here is the expected output of the ppc9_test_sat_add_090923.cpp test

[Bug target/110960] TestSatWidenMulPairwiseAdd in the Google Highway test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option

2023-09-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 --- Comment #10 from John Platts --- Created attachment 55868 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55868&action=edit Test program to reproduce SatWidenMulPairwiseAdd compilation bug The ppc9_test_sat_widen_pairwise_add_090923_2b.cpp

[Bug target/110960] TestSatWidenMulPairwiseAdd in the Google Highway test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option

2023-09-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 --- Comment #9 from John Platts --- Created attachment 55867 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55867&action=edit Test program to reproduce SatWidenMulPairwiseAdd compilation bug The attached

[Bug target/110960] TestSatWidenMulPairwiseAdd in the Google Highway test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option

2023-08-10 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 --- Comment #6 from John Platts --- Need to use revision ff1ad85a96c0bc8483b582d6dbceb8bc07edd226 of Google Highway to reproduce the PPC9 codegen bug with GCC 12 as the TestSatWidenMulPairwiseAdd will now pass on PPC9 due to a recent update to

[Bug target/110960] TestSatWidenMulPairwiseAdd in the Google Highway test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option

2023-08-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 --- Comment #5 from John Platts --- The version of Google Highway with the TestSatWidenMulPairwiseAdd changes to get TestSatWidenMulPairwiseAdd to pass successfully on POWER9 with the "-mcpu=power9 -DHWY_DISABLED_TARGETS=6918232715082858496

[Bug target/110960] TestSatWidenMulPairwiseAdd in the Google Highway test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option

2023-08-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 --- Comment #4 from John Platts --- I had made some changes to TestSatWidenMulPairwiseAdd in hwy/tests/mul_test.cc that would get TestSatWidenMulPairwiseAdd to pass successfully on POWER9 when compiled with GCC 12 with the "-mcpu=power9

[Bug target/110960] TestSatWidenMulPairwiseAdd in the Google Highway test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option

2023-08-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 --- Comment #3 from John Platts --- Here is the output of running the "./tests/mul_test" program in the Google Highway test suite when compiled with the "-mcpu=power8 -DHWY_DISABLED_TARGETS=6917951240106147840" options when compiled with GCC

[Bug target/110960] TestSatWidenMulPairwiseAdd in the Google Highway test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option

2023-08-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 --- Comment #2 from John Platts --- Created attachment 55711 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55711&action=edit Test program to reproduce SatWidenMulPairwiseAdd compilation bug (requires CMake and Google Highway)

[Bug target/110960] TestSatWidenMulPairwiseAdd in the Google Highway Test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option

2023-08-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960 --- Comment #1 from John Platts --- Created attachment 55710 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55710&action=edit Test program to reproduce SatWidenMulPairwiseAdd compilation bug The attached

[Bug target/110960] New: TestSatWidenMulPairwiseAdd in the Google Highway Test suite fails when compiled with GCC 12 or later with the -mcpu=power9 option

2023-08-09 Thread john_platts at hotmail dot com via Gcc-bugs
Version: 12.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: john_platts at hotmail dot com Target Milestone: --- Here are the steps to reproduce

[Bug target/110741] New: vec_ternarylogic intrinsic generates incorrect code on POWER10 target when compiled with GCC

2023-07-19 Thread john_platts at hotmail dot com via Gcc-bugs
Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: john_platts at hotmail dot com Target Milestone: --- Created attachment 55582 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55582&action=edit POWE

[Bug target/109069] Vector truncation test program produces incorrect result on big-endian powerpc64-linux-gnu with -mcpu=power10 -O2

2023-03-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109069 --- Comment #5 from John Platts --- Here is another test program that shows the same code generation bug when a splat followed by a vec_sld is incorrectly optimized by gcc 12.2.0 on powerpc64-linux-gnu and powerpc64le-linux-gnu with the

[Bug target/109069] Vector truncation test program produces incorrect result on big-endian powerpc64-linux-gnu with -mcpu=power10 -O2

2023-03-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109069 --- Comment #4 from John Platts --- Here is another test program that exposes the optimization bug with applying the vec_sl operation to a constant vector (which generates incorrect results on both big-endian and little-endian POWER10 when

[Bug target/109069] Vector truncation test program produces incorrect result on big-endian powerpc64-linux-gnu with -mcpu=power10 -O2

2023-03-09 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109069 --- Comment #3 from John Platts --- Here is another test program that reproduces the vector truncation test issue: #pragma push_macro("vector") #pragma push_macro("pixel") #pragma push_macro("bool") #undef vector #undef pixel #undef bool

[Bug target/109069] Vector truncation test program produces incorrect result on big-endian powerpc64-linux-gnu with -mcpu=power10 -O2

2023-03-08 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109069 --- Comment #1 from John Platts --- The C++ test program below does generate the correct results when compiled with the -mcpu=power10 -O0 options.

[Bug target/109069] New: Vector truncation test program produces incorrect result on big-endian powerpc64-linux-gnu with -mcpu=power10 -O2

2023-03-08 Thread john_platts at hotmail dot com via Gcc-bugs
: 12.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: john_platts at hotmail dot com Target Milestone: --- The following C++ test program generates a test failure

[Bug target/108614] New: _subborrow_u32 generates suboptimal code when second subtraction operand is constant on x86 targets

2023-01-31 Thread john_platts at hotmail dot com via Gcc-bugs
: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: john_platts at hotmail dot com Target Milestone: --- Here is some C++ code that generates suboptimal code with the -O2 -march=skylake-avx512 -m32
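The reporter's code is cut off above, so the following is only a sketch of the pattern the bug title describes: a borrow chain built from _subborrow_u32 whose second subtraction operand is a constant. The function name SubConstViaHalves and the constant 5 are hypothetical. The intrinsic is assumed to compute a - b - borrow_in and return the borrow out, as in the GCC and Clang headers, and a portable fallback covers non-x86 targets.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#endif

// Hypothetical constant subtrahend, split into 32-bit halves.
constexpr std::uint32_t kSubLo = 5u;
constexpr std::uint32_t kSubHi = 0u;

// Subtracts the constant from a 64-bit value passed as two 32-bit halves,
// propagating the borrow from the low half into the high half.
inline std::uint64_t SubConstViaHalves(std::uint32_t lo, std::uint32_t hi) {
  unsigned int out_lo = 0, out_hi = 0;
#if defined(__x86_64__) || defined(__i386__)
  // The constant is the second subtraction operand in both steps.
  const unsigned char borrow = _subborrow_u32(0, lo, kSubLo, &out_lo);
  (void)_subborrow_u32(borrow, hi, kSubHi, &out_hi);
#else
  // Portable fallback with an explicit borrow.
  out_lo = lo - kSubLo;
  const unsigned int borrow = lo < kSubLo ? 1u : 0u;
  out_hi = hi - kSubHi - borrow;
#endif
  return (static_cast<std::uint64_t>(out_hi) << 32) | out_lo;
}

// Small check of the borrow propagation: 0x100000003 - 5 = 0xFFFFFFFE.
inline void SubConstViaHalvesSelfCheck() {
  assert(SubConstViaHalves(3u, 1u) == 0xFFFFFFFEull);
}
```

Building this with the options quoted in the report (-O2 -march=skylake-avx512 -m32) and reading the generated assembly is one way to inspect how the constant operand is handled.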

[Bug target/105354] New: __builtin_shuffle for alignr generates suboptimal code unless SSSE3 is enabled

2022-04-22 Thread john_platts at hotmail dot com via Gcc-bugs
Keywords: missed-optimization Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: john_platts at hotmail dot com Target Milestone: --- The below code generates suboptimal code if SSE2 is enabled but SSSE3

[Bug c++/105353] New: __builtin_shufflevector with template parameter fails to compile on GCC 12 but compiles on clang

2022-04-22 Thread john_platts at hotmail dot com via Gcc-bugs
Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: john_platts at hotmail dot com Target Milestone: --- The following code fails to compile with GCC 12 but compiles successfully on clang (with the -std=c

[Bug target/103611] GCC generates suboptimal code for SSE2/SSE4.1 64-bit integer element extraction on 32-bit x86 targets

2021-12-07 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103611 --- Comment #4 from John Platts --- (In reply to Andrew Pinski from comment #3) > Hmm, GCC 4.8.1-5.5.0 produces: > long long SSE2ExtractInt64<0>(long long __vector): > .LFB499: > .cfi_startproc > pshufd xmm1, xmm0, 1 >

[Bug target/103611] GCC generates suboptimal code for SSE2/SSE4.1 64-bit integer element extraction on 32-bit x86 targets

2021-12-07 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103611 --- Comment #2 from John Platts --- Here is some code for extracting 64-bit integers from a SSE2 vector using GCC vector extensions: #include #include using Int64M128Vect [[__gnu__::__vector_size__(16)]] = std::int64_t; template
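Since the snippet above is truncated, here is a minimal sketch of element extraction with GCC vector extensions. It is not the reporter's code: the names Int64x2, ExtractInt64 and ExtractInt64SelfCheck are hypothetical, and the typedef spelling of the vector_size attribute is used in place of the using-declaration quoted in the comment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Two lanes of int64_t in a 16-byte vector (GCC/Clang vector_size extension).
typedef std::int64_t Int64x2 __attribute__((vector_size(16)));

// Returns lane ElemIdx of the vector; the index is checked at compile time.
template <std::size_t ElemIdx>
inline std::int64_t ExtractInt64(Int64x2 vect) noexcept {
  static_assert(ElemIdx < 2, "ElemIdx must be 0 or 1");
  return vect[ElemIdx];
}

// Extracts both lanes of a known vector and checks the values.
inline void ExtractInt64SelfCheck() {
  const Int64x2 v = {7, -9};
  assert(ExtractInt64<0>(v) == 7);
  assert(ExtractInt64<1>(v) == -9);
}
```

On a 32-bit x86 target each 64-bit lane has to be returned in a register pair, which is the case whose code generation this bug discusses.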

[Bug target/103611] GCC generates suboptimal code for SSE2/SSE4.1 64-bit integer element extraction on 32-bit x86 targets

2021-12-07 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103611 --- Comment #1 from John Platts --- Here is some C++ code for extracting 64-bit integers from a __m128i vector using SSE4.1: #include #include template std::int64_t SSE41ExtractInt64(__m128i vect) noexcept { static_assert(ElemIdx ==

[Bug target/103611] New: GCC generates suboptimal code for SSE2/SSE4.1 64-bit integer element extraction on 32-bit x86 targets

2021-12-07 Thread john_platts at hotmail dot com via Gcc-bugs
: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: john_platts at hotmail dot com Target Milestone: --- Here is some code for extracting 64-bit integers from a SSE2 vector: #include #include

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2019-08-25 Thread john_platts at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412 John Platts changed:
What | Removed | Added
CC   |         | john_platts at hotmail dot com