Re: [PATCH v2] [rs6000] Add _mm_blend_epi16 and _mm_blendv_epi8

2019-07-22 Thread Paul Clarke
On 7/22/19 10:38 AM, Segher Boessenkool wrote: > On Mon, Jul 22, 2019 at 08:28:33AM -0500, Bill Schmidt wrote: >> On 7/22/19 12:58 AM, Segher Boessenkool wrote: >>> On Sun, Jul 21, 2019 at 05:22:19PM -0500, Paul Clarke wrote: >>>> Add compatibility impl

[PATCH v2] [rs6000] Add documentation for __builtin_mtfsf

2019-07-22 Thread Paul Clarke
2019-07-21 Paul A. Clarke [gcc] * doc/extend.texi: Add documentation for __builtin_mtfsf. v2: wordsmithing at Segher's request. I'm having a hard time not saying too much. :-) Index: gcc/doc/extend.texi === ---

[PATCH v2] [rs6000] Add _mm_blend_epi16 and _mm_blendv_epi8

2019-07-21 Thread Paul Clarke
Add compatibility implementations of _mm_blend_epi16 and _mm_blendv_epi8 intrinsics. Respective test cases are copied almost verbatim (minor changes to the dejagnu head lines) from i386. 2019-07-21 Paul A. Clarke [gcc] * config/rs6000/smmintrin.h (_mm_blend_epi16): New.

Re: [rs6000] Add documentation for __builtin_mtfsf

2019-07-21 Thread Paul Clarke
On 7/21/19 1:13 PM, Segher Boessenkool wrote: > On Sun, Jul 21, 2019 at 04:06:32AM -0500, Paul Clarke wrote: >> +@code{__builtin_mtfsf} takes a constant 8-bit integer field mask and a >> +representation of the new value of the FPSCR and generates the @code{mtfsf} >>

[rs6000] Add documentation for __builtin_mtfsf

2019-07-21 Thread Paul Clarke
2019-07-21 Paul A. Clarke [gcc] * doc/extend.texi: Add documentation for __builtin_mtfsf. Index: gcc/doc/extend.texi === --- gcc/doc/extend.texi (revision 273615) +++ gcc/doc/extend.texi (working copy) @@ -16848,6

[PATCH] [rs6000] Add _mm_blend_epi16 and _mm_blendv_epi8

2019-07-19 Thread Paul Clarke
Add compatibility implementations of _mm_blend_epi16 and _mm_blendv_epi8 intrinsics. Respective test cases are copied almost verbatim (minor changes to the dejagnu head lines) from i386. 2019-07-19 Paul A. Clarke [gcc] * config/rs6000/smmintrin.h (_mm_blend_epi16): New.

Re: [PATCH v2][rs6000] PR89338, PR89339: Fix compat vector intrinsics for BE and 32-bit

2019-02-25 Thread Paul Clarke
ping. On 02/19/2019 03:03 PM, Paul Clarke wrote: > Test FAILS: sse2-cvtpd2dq-1, sse2-cvtpd2ps, sse2-cvttpd2dq on powerpc64 > (big-endian). > > _mm_cvtpd_epi32, _mm_cvtpd_ps, _mm_cvttpd_epi32: Type conversion from > vector doubleword type to vector word type leaves the results

[PATCH v2][rs6000] PR89338, PR89339: Fix compat vector intrinsics for BE and 32-bit

2019-02-19 Thread Paul Clarke
Test FAILS: sse2-cvtpd2dq-1, sse2-cvtpd2ps, sse2-cvttpd2dq on powerpc64 (big-endian). _mm_cvtpd_epi32, _mm_cvtpd_ps, _mm_cvttpd_epi32: Type conversion from vector doubleword type to vector word type leaves the results in even lanes in big endian mode. Test FAILS: sse-cvtss2si-1, sse-cvtss2si-2,

[rs6000] PR89338, PR89339: Fix compat vector intrinsics for BE and 32-bit

2019-02-19 Thread Paul Clarke
Test FAILS: sse2-cvtpd2dq-1, sse2-cvtpd2ps, sse2-cvttpd2dq on powerpc64 (big-endian). _mm_cvtpd_epi32, _mm_cvtpd_ps, _mm_cvttpd_epi32: Type conversion from vector doubleword type to vector word type leaves the results in even lanes in big endian mode. Test FAILS: sse-cvtss2si-1, sse-cvtss2si-2,

[rs6000] Fix x86 SSSE3 compatibility implementations and testcases

2018-12-18 Thread Paul Clarke
This patch is the analog to r266868-r266870, but for SSSE3. The SSSE3 tests had been inadvertently made to PASS without actually running the test code. Actually running the code turned up some previously undetected issues. This patch fixes some issues in the implementations, fixes up the tests to

[PATCH, committed][PR88408][rs6000] mmintrin.h: fix use of "vector"

2018-12-07 Thread Paul Clarke
A recent patch inadvertently added the use of "vector" to mmintrin.h when all such uses should be "__vector". Committed as obvious/trivial. [gcc] 2018-12-07 Paul A. Clarke PR target/88408 * config/rs6000/mmintrin.h (_mm_packs_pu16): Correctly use "__vector". Index:

Re: [PATCH 1/3][rs6000] x86-compat vector intrinsics fixes for BE, 32bit

2018-12-04 Thread Paul Clarke
On 12/04/2018 02:16 PM, Segher Boessenkool wrote: > Hi! > > On Tue, Dec 04, 2018 at 08:59:03AM -0600, Paul Clarke wrote: >> Fix general endian and 32-bit mode issues found in the >> compatibility implementations of the x86 vector intrinsics when running the >>

[PATCH 3/3][rs6000] Enable x86-compat vector intrinsics testing

2018-12-04 Thread Paul Clarke
The testsuite tests for the compatibility implementations of x86 vector intrinsics for "powerpc" had been inadvertently made to PASS without actually running the test code. This patch removes the code which kept the tests from running the actual test code. 2018-12-03 Paul A. Clarke

[PATCH 2/3][rs6000] Fix x86-compat vector intrinsics testcases for BE, 32bit

2018-12-04 Thread Paul Clarke
Fix general endian issues found in the test cases for the compatibility implementations of the x86 vector intrinsics. (The tests had been inadvertently made to PASS without actually running the test code. A later patch fixes this issue.) Additionally, a new is added, as some of the APIs therein

[PATCH 1/3][rs6000] x86-compat vector intrinsics fixes for BE, 32bit

2018-12-04 Thread Paul Clarke
Fix general endian and 32-bit mode issues found in the compatibility implementations of the x86 vector intrinsics when running the associated test suite tests. (The tests had been inadvertently made to PASS without actually running the test code. A later patch fixes this issue.) In a few cases,

[PATCH 0/3][rs6000] x86-compat vector intrinsics fixes for BE, 32bit

2018-12-04 Thread Paul Clarke
Many of the tests for the x86-compatible vector intrinsics implementations were protected by "#ifdef __BUILTIN_CPU_SUPPORTS__", which is only defined with a recent enough glibc, so in most environments the tests were silently passing without actually testing anything. Patches which follow: 01: Fix
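A minimal sketch of how such a guard can hide a test, not the actual testsuite code. The helper names (do_test, guarded_harness, checks_executed) are hypothetical, and the "vsx" feature string is only an illustrative choice. When the macro is undefined, the guarded call is compiled out and the harness still reports success.

```c
/* Counts how many real checks ran; hypothetical bookkeeping for this sketch. */
static int checks_executed = 0;

/* Hypothetical stand-in for the body of an intrinsics test. */
static void do_test(void)
{
  checks_executed++;
}

/* Mimics a guarded harness and returns the status a test driver would see. */
static int guarded_harness(void)
{
#ifdef __BUILTIN_CPU_SUPPORTS__
  /* Only compiled where the macro is defined. */
  if (__builtin_cpu_supports("vsx"))
    do_test();
#endif
  /* Status 0 is reported whether or not do_test ran. */
  return 0;
}
```

Comparing checks_executed before and after a call to guarded_harness is one way to tell a real pass from a silent skip.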

[PATCH, rs6000, committed] Remove inaccurate comment in ssse3-check.h

2018-10-29 Thread Paul Clarke
In gcc.target/powerpc/ssse3-check.h, "DEBUG" doesn't actually "replace abort with printf on error", it just enables debugging output. Remove the comment. Suggested-by: Bill Schmidt (Committing as trivial.) [gcc/testsuite] 2018-10-29 Paul A. Clarke *

[PATCH, rs6000] Consistently use '__vector' instead of 'vector'

2018-10-29 Thread Paul Clarke
Revision r265535 committed changes that used 'vector' instead of the preferred '__vector'. '__vector' is preferred because it ensures no conflicts with the C++ namespace. Indeed, gcc/config/rs6000/xmmintrin.h undefines 'vector', leading to errors: gcc/include/xmmintrin.h:999:20:

Re: [PATCH v2, rs6000 4/4] Add compatible implementations of x86 SSSE3 intrinsics

2018-10-26 Thread Paul Clarke
On 10/26/2018 02:02 PM, Bill Schmidt wrote: > On 10/25/18 2:08 PM, Paul Clarke wrote: >> diff --git a/trunk/gcc/testsuite/gcc.target/powerpc/ssse3-check.h >> b/trunk/gcc/testsuite/gcc.target/powerpc/ssse3-check.h >> new file mode 10644 >> --- /dev/null(rev

Re: [PATCH v2, rs6000 3/4] Add compatible implementations of x86 SSSE3 intrinsics

2018-10-26 Thread Paul Clarke
On 10/26/2018 01:00 PM, Segher Boessenkool wrote: > On Thu, Oct 25, 2018 at 02:07:54PM -0500, Paul Clarke wrote: >> This is a follow-on to earlier commits for adding compatibility >> implementations of x86 intrinsics for PPC64LE. This is the first of >> two patches for SS

Re: [PATCH v2, rs6000 1/4] Fixes for x86 intrinsics on POWER 32bit

2018-10-26 Thread Paul Clarke
On 10/25/2018 05:17 PM, Segher Boessenkool wrote: > On Thu, Oct 25, 2018 at 02:07:33PM -0500, Paul Clarke wrote: >> Various clean-ups for 32bit support. >> >> Implement various corrections in the compatibility implementations of the >> x86 vector intrinsics foun

Re: [PATCH, rs6000] Fix _mm_extract_pi16 for big-endian

2018-10-26 Thread Paul Clarke
On 10/25/2018 05:08 PM, Jakub Jelinek wrote: > On Thu, Oct 25, 2018 at 05:07:03PM -0500, Segher Boessenkool wrote: >> On Thu, Oct 25, 2018 at 01:41:15PM -0500, Paul Clarke wrote: >>> For compatibility implementation of x86 vector intrinsic, _mm_extract_pi16, >>> adjust

[PATCH v2, rs6000 4/4] Add compatible implementations of x86 SSSE3 intrinsics

2018-10-25 Thread Paul Clarke
This is part 2/2 for contributing PPC64LE support for X86 SSE3 intrinsics. This patch includes testsuite/gcc.target tests for the intrinsics defined in pmmintrin.h, copied from gcc.target/i386. Bootstrapped and tested on Linux POWER8 LE, POWER8 BE (64 & 32), and POWER7. OK for trunk?

[PATCH v2, rs6000 3/4] Add compatible implementations of x86 SSSE3 intrinsics

2018-10-25 Thread Paul Clarke
This is a follow-on to earlier commits for adding compatibility implementations of x86 intrinsics for PPC64LE. This is the first of two patches for SSSE3. This patch adds the 32 x86 intrinsics from ("SSSE3"). (Patch 2/2 adds tests for these intrinsics, and briefly describes the tests

[PATCH v2, rs6000 1/4] Fixes for x86 intrinsics on POWER 32bit

2018-10-25 Thread Paul Clarke
Various clean-ups for 32bit support. Implement various corrections in the compatibility implementations of the x86 vector intrinsics found after enabling 32bit mode for the associated test cases. (Actual enablement coming in a subsequent patch.) Bootstrapped and tested on Linux POWER8 LE,

[PATCH, rs6000] Fix _mm_extract_pi16 for big-endian

2018-10-25 Thread Paul Clarke
For compatibility implementation of x86 vector intrinsic, _mm_extract_pi16, adjust shift value for big-endian mode. Bootstrapped and tested on Linux POWER8 LE, POWER8 BE (64 & 32), and POWER7. OK for trunk? gcc/ChangeLog: 2018-10-25 Paul A. Clarke * config/rs6000/xmmintrin.h: Fix

Re: [PATCH, rs6000 2/2] Add compatible implementations of x86 SSSE3 intrinsics

2018-10-23 Thread Paul Clarke
On 10/22/2018 06:38 PM, Segher Boessenkool wrote: > On Mon, Oct 22, 2018 at 01:26:11PM -0500, Paul Clarke wrote: >> Target tests for the intrinsics defined in pmmintrin.h, copied from >> gcc.target/i386. >> >> Tested on POWER8 ppc64le and ppc64 (-m64 and -m32, the latt

[PATCH, rs6000 2/2] Add compatible implementations of x86 SSSE3 intrinsics

2018-10-22 Thread Paul Clarke
Target tests for the intrinsics defined in pmmintrin.h, copied from gcc.target/i386. Tested on POWER8 ppc64le and ppc64 (-m64 and -m32, the latter only reporting 16 new unsupported tests), and also by forcing -mcpu=power7 on ppc64. [gcc/testsuite] 2018-10-22 Paul A. Clarke *

[PATCH, rs6000 1/2] Add compatible implementations of x86 SSSE3 intrinsics

2018-10-22 Thread Paul Clarke
This is a follow-on to earlier commits for adding compatibility implementations of x86 intrinsics for PPC64LE. This patch adds the 32 x86 intrinsics from ("SSSE3"). (Patch 2/2 adds tests for these intrinsics, and briefly describes the tests performed.) ./gcc/ChangeLog: 2018-10-22 Paul A.

[PATCH v2, rs6000] 2/2 Add x86 SSE3 intrinsics to GCC PPC64LE target

2018-10-05 Thread Paul Clarke
This is part 2/2 for contributing PPC64LE support for X86 SSE3 intrinsics. This patch includes testsuite/gcc.target tests for the intrinsics defined in pmmintrin.h. Tested on POWER8 ppc64le and ppc64 (-m64 and -m32, the latter only reporting 10 new unsupported tests). [gcc/testsuite]

Re: [PATCH, rs6000] 2/2 Add x86 SSE3 intrinsics to GCC PPC64LE target

2018-10-05 Thread Paul Clarke
On 10/05/2018 04:20 AM, Segher Boessenkool wrote: > On Tue, Oct 02, 2018 at 09:12:07AM -0500, Paul Clarke wrote: >> This is part 2/2 for contributing PPC64LE support for X86 SSE3 >> intrinsics. This patch includes testsuite/gcc.target tests for the >> intrinsics d

[PATCH, rs6000] 1/2 Add x86 SSE3 intrinsics to GCC PPC64LE target

2018-10-02 Thread Paul Clarke
This is a follow-on to earlier commits for adding compatibility implementations of x86 intrinsics for PPC64LE. This is the first of two patches. This patch adds 11 of the 13 x86 intrinsics from ("SSE3"). (Patch 2/2 adds tests for these intrinsics, and briefly describes the tests performed.)

[PATCH, rs6000] 2/2 Add x86 SSE3 intrinsics to GCC PPC64LE target

2018-10-02 Thread Paul Clarke
This is part 2/2 for contributing PPC64LE support for X86 SSE3 intrinsics. This patch includes testsuite/gcc.target tests for the intrinsics defined in pmmintrin.h. Tested on POWER8 ppc64le and ppc64 (-m64 and -m32, the latter only reporting 10 new unsupported tests). [gcc/testsuite]

Re: [PATCH v2, rs6000] (PR84302) Fix _mm_slli_epi{32,64} for shift values 16 through 31 and negative

2018-04-23 Thread Paul Clarke
On 04/23/2018 02:47 PM, Segher Boessenkool wrote: > On Mon, Apr 23, 2018 at 02:23:42PM -0500, Paul Clarke wrote: >> Can I push this to ibm/gcc-7-branch? > Don't ask me, I'm not maintainer of that branch. I would point you > to the wiki page explaining who owns it, but I cannot

Re: [PATCH v2, rs6000] (PR84302) Fix _mm_slli_epi{32,64} for shift values 16 through 31 and negative

2018-04-23 Thread Paul Clarke
On 04/13/2018 05:40 PM, Segher Boessenkool wrote: > Rest looks fine... Let's see if I manage to commit it :-) Thanks, Segher! Can I push this to ibm/gcc-7-branch? PC

Add myself to MAINTAINERS (write-after-approval)

2018-04-18 Thread Paul Clarke
Cheng <bin.ch...@arm.com> Harshit Chopra <hars...@google.com> Tamar Christina <tamar.christ...@arm.com> Eric Christopher <echri

[PATCH v2, rs6000] (PR84302) Fix _mm_slli_epi{32,64} for shift values 16 through 31 and negative

2018-04-13 Thread Paul Clarke
The powerpc versions of _mm_slli_epi32 and _mm_slli_epi64 in emmintrin.h do not properly handle shift values between 16 and 31, inclusive. These are setting up the shift with vec_splat_s32, which only accepts *5-bit signed* shift values, or a range of -16 to 15. Values above 15 produce an error:
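The range problem described above can be illustrated with a small portable sketch. This is not the patch itself: both helper names are hypothetical. The scalar reference assumes that shift counts outside 0..31, including negative ones, zero a 32-bit lane, which is how the x86 behaviour is read here.

```c
#include <stdint.h>

/* Returns 1 if n fits a 5-bit signed immediate, i.e. the -16..15 range
   that the message gives for vec_splat_s32. Hypothetical helper. */
static int fits_splat_imm5(int n)
{
  return n >= -16 && n <= 15;
}

/* Hypothetical scalar reference for one 32-bit lane of a left shift.
   Assumption: counts outside 0..31 (including negatives) yield 0. */
static uint32_t slli32_lane_ref(uint32_t x, int count)
{
  if (count < 0 || count > 31)
    return 0;
  return x << count;
}
```

Under these assumptions, counts 16 through 31 are valid shifts that do not fit the immediate, so an implementation needs a different way to build the shift count for them.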

Re: [PATCH, rs6000] (PR84302) Fix _mm_slli_epi{32,64} for shift values 16 through 31 and negative

2018-04-13 Thread Paul Clarke
On 04/13/2018 02:37 PM, Segher Boessenkool wrote: > On Thu, Apr 12, 2018 at 07:07:21PM -0500, Paul Clarke wrote: >> The powerpc versions of _mm_slli_epi32 and _mm_slli_epi64 in emmintrin.h >> do not properly handle shift values between 16 and 31, inclusive. >> These were

[PATCH, rs6000] (PR84302) Fix _mm_slli_epi{32,64} for shift values 16 through 31 and negative

2018-04-12 Thread Paul Clarke
if (check_union128i_w (u, e)) -abort (); + TEST_CODE(0, 0); + TEST_CODE(15, 15); + TEST_CODE(16, 16); + TEST_CODE(neg1, -1); + TEST_CODE(neg16, -16); + TEST_CODE(neg32, -32); + TEST_CODE(neg64, -64); + TEST_CODE(neg128, -128); } -- Paul Clarke, IBM