Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-26 Thread Kirill Yukhin
On 26 Jan 13:05, Jakub Jelinek wrote:
> On Thu, Jan 26, 2017 at 03:53:44AM -0800, Kirill Yukhin wrote:
> > Hi,
> > On 26 Jan 12:49, Thomas Schwinge wrote:
> > > Hi!
> > >
> > > On Thu, 26 Jan 2017 02:44:56 -0800, Kirill Yukhin 
> > >  wrote:
> > > > On 26 Jan 10:14, Thomas Schwinge wrote:
> > > > > I see:
> > > > >
> > > > > {+FAIL: gcc.target/i386/avx512f-ktestw-2.c (test for excess 
> > > > > errors)+}
> > > > > {+UNRESOLVED: gcc.target/i386/avx512f-ktestw-2.c compilation 
> > > > > failed to produce executable+}
> > > > >
> > > > > ... because of:
> > > > >
> > > > > /tmp/ccjv3mX2.s: Assembler messages:
> > > > > /tmp/ccjv3mX2.s:26: Error: no such instruction: `ktestw %k1,%k0'
> > > > > compiler exited with status 1
> > > > Which version of gas do you use?
> > >
> > > A rather old one on that Ubuntu 12.10 system:
> > >
> > > $ as --version
> > > GNU assembler (GNU Binutils for Ubuntu) 2.22.90.20120924
> > > [...]
> > >
> > > > It should be OK since v2.25.
> > >
> > > OK, but as done for other tests, for older versions such testing then
> > > should be UNSUPPORTED instead of FAIL/UNRESOLVED (as long as that is
> > > practicable, which has already been described how to do, as I understand
> > > the other messages).
> > This is a bug as Uroš properly mentioned. Will fix.
>
> Like this?  Tested on x86_64-linux.  Ok for trunk?
You're too fast. I did exactly the same.
OK for trunk.

--
Thanks, K

>
> 2017-01-26  Jakub Jelinek  
>
>   * config/i386/avx512fintrin.h (_ktest_mask16_u8,
>   _ktestz_mask16_u8, _ktestc_mask16_u8, _kadd_mask16): Move to ...
>   * config/i386/avx512dqintrin.h (_ktest_mask16_u8,
>   _ktestz_mask16_u8, _ktestc_mask16_u8, _kadd_mask16): ... here.
>   * config/i386/i386-builtin.def (__builtin_ia32_ktestchi,
>   __builtin_ia32_ktestzhi, __builtin_ia32_kaddhi): Use
>   OPTION_MASK_ISA_AVX512DQ instead of OPTION_MASK_ISA_AVX512F.
>   * config/i386/sse.md (SWI1248_AVX512BWDQ2): New mode iterator.
>   (kadd, ktest): Use it instead of SWI1248_AVX512BWDQ.
> testsuite/
>   * gcc.target/i386/avx512f-kaddw-1.c: Renamed to ...
>   * gcc.target/i386/avx512dq-kaddw-1.c: ... this.  New test.  Replace
>   avx512f with avx512dq.
>   * gcc.target/i386/avx512f-ktestw-1.c: Renamed to ...
>   * gcc.target/i386/avx512dq-ktestw-1.c: ... this.  New test.  Replace
>   avx512f with avx512dq.
>   * gcc.target/i386/avx512f-ktestw-2.c: Renamed to ...
>   * gcc.target/i386/avx512dq-ktestw-2.c: ... this.  New test.  Replace
>   avx512f with avx512dq.
>


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-26 Thread Jakub Jelinek
On Thu, Jan 26, 2017 at 03:53:44AM -0800, Kirill Yukhin wrote:
> Hi,
> On 26 Jan 12:49, Thomas Schwinge wrote:
> > Hi!
> >
> > On Thu, 26 Jan 2017 02:44:56 -0800, Kirill Yukhin  
> > wrote:
> > > On 26 Jan 10:14, Thomas Schwinge wrote:
> > > > I see:
> > > >
> > > > {+FAIL: gcc.target/i386/avx512f-ktestw-2.c (test for excess 
> > > > errors)+}
> > > > {+UNRESOLVED: gcc.target/i386/avx512f-ktestw-2.c compilation failed 
> > > > to produce executable+}
> > > >
> > > > ... because of:
> > > >
> > > > /tmp/ccjv3mX2.s: Assembler messages:
> > > > /tmp/ccjv3mX2.s:26: Error: no such instruction: `ktestw %k1,%k0'
> > > > compiler exited with status 1
> > > Which version of gas do you use?
> >
> > A rather old one on that Ubuntu 12.10 system:
> >
> > $ as --version
> > GNU assembler (GNU Binutils for Ubuntu) 2.22.90.20120924
> > [...]
> >
> > > It should be OK since v2.25.
> >
> > OK, but as done for other tests, for older versions such testing then
> > should be UNSUPPORTED instead of FAIL/UNRESOLVED (as long as that is
> > practicable, which has already been described how to do, as I understand
> > the other messages).
> This is a bug as Uroš properly mentioned. Will fix.

Like this?  Tested on x86_64-linux.  Ok for trunk?

2017-01-26  Jakub Jelinek  

* config/i386/avx512fintrin.h (_ktest_mask16_u8,
_ktestz_mask16_u8, _ktestc_mask16_u8, _kadd_mask16): Move to ...
* config/i386/avx512dqintrin.h (_ktest_mask16_u8,
_ktestz_mask16_u8, _ktestc_mask16_u8, _kadd_mask16): ... here.
* config/i386/i386-builtin.def (__builtin_ia32_ktestchi,
__builtin_ia32_ktestzhi, __builtin_ia32_kaddhi): Use
OPTION_MASK_ISA_AVX512DQ instead of OPTION_MASK_ISA_AVX512F.
* config/i386/sse.md (SWI1248_AVX512BWDQ2): New mode iterator.
(kadd, ktest): Use it instead of SWI1248_AVX512BWDQ.
testsuite/
* gcc.target/i386/avx512f-kaddw-1.c: Renamed to ...
* gcc.target/i386/avx512dq-kaddw-1.c: ... this.  New test.  Replace
avx512f with avx512dq.
* gcc.target/i386/avx512f-ktestw-1.c: Renamed to ...
* gcc.target/i386/avx512dq-ktestw-1.c: ... this.  New test.  Replace
avx512f with avx512dq.
* gcc.target/i386/avx512f-ktestw-2.c: Renamed to ...
* gcc.target/i386/avx512dq-ktestw-2.c: ... this.  New test.  Replace
avx512f with avx512dq.

--- gcc/config/i386/avx512fintrin.h.jj  2017-01-23 18:09:48.0 +0100
+++ gcc/config/i386/avx512fintrin.h 2017-01-26 12:40:10.187825569 +0100
@@ -10008,28 +10008,6 @@ _mm512_maskz_expandloadu_epi32 (__mmask1
 
 extern __inline unsigned char
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_ktest_mask16_u8  (__mmask16 __A,  __mmask16 __B, unsigned char *__CF)
-{
-  *__CF = (unsigned char) __builtin_ia32_ktestchi (__A, __B);
-  return (unsigned char) __builtin_ia32_ktestzhi (__A, __B);
-}
-
-extern __inline unsigned char
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_ktestz_mask16_u8 (__mmask16 __A, __mmask16 __B)
-{
-  return (unsigned char) __builtin_ia32_ktestzhi (__A, __B);
-}
-
-extern __inline unsigned char
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_ktestc_mask16_u8 (__mmask16 __A, __mmask16 __B)
-{
-  return (unsigned char) __builtin_ia32_ktestchi (__A, __B);
-}
-
-extern __inline unsigned char
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _kortest_mask16_u8  (__mmask16 __A,  __mmask16 __B, unsigned char *__CF)
 {
   *__CF = (unsigned char) __builtin_ia32_kortestchi (__A, __B);
@@ -10052,13 +10030,6 @@ _kortestc_mask16_u8 (__mmask16 __A, __mm
(__mmask16) __B);
 }
 
-extern __inline __mmask16
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_kadd_mask16 (__mmask16 __A, __mmask16 __B)
-{
-  return (__mmask16) __builtin_ia32_kaddhi ((__mmask16) __A, (__mmask16) __B);
-}
-
 extern __inline unsigned int
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _cvtmask16_u32 (__mmask16 __A)
--- gcc/config/i386/avx512dqintrin.h.jj 2017-01-23 18:09:48.0 +0100
+++ gcc/config/i386/avx512dqintrin.h2017-01-26 12:41:26.825839239 +0100
@@ -58,6 +58,28 @@ _ktestc_mask8_u8 (__mmask8 __A, __mmask8
 
 extern __inline unsigned char
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_ktest_mask16_u8  (__mmask16 __A,  __mmask16 __B, unsigned char *__CF)
+{
+  *__CF = (unsigned char) __builtin_ia32_ktestchi (__A, __B);
+  return (unsigned char) __builtin_ia32_ktestzhi (__A, __B);
+}
+
+extern __inline unsigned char
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_ktestz_mask16_u8 (__mmask16 __A, __mmask16 __B)
+{
+  return (unsigned char) __builtin_ia32_ktestzhi (__A, __B);
+}
+
+extern __inline unsigned char
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))

Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-26 Thread Kirill Yukhin
Hi,
On 26 Jan 12:49, Thomas Schwinge wrote:
> Hi!
>
> On Thu, 26 Jan 2017 02:44:56 -0800, Kirill Yukhin  
> wrote:
> > On 26 Jan 10:14, Thomas Schwinge wrote:
> > > I see:
> > >
> > > {+FAIL: gcc.target/i386/avx512f-ktestw-2.c (test for excess errors)+}
> > > {+UNRESOLVED: gcc.target/i386/avx512f-ktestw-2.c compilation failed 
> > > to produce executable+}
> > >
> > > ... because of:
> > >
> > > /tmp/ccjv3mX2.s: Assembler messages:
> > > /tmp/ccjv3mX2.s:26: Error: no such instruction: `ktestw %k1,%k0'
> > > compiler exited with status 1
> > Which version of gas do you use?
>
> A rather old one on that Ubuntu 12.10 system:
>
> $ as --version
> GNU assembler (GNU Binutils for Ubuntu) 2.22.90.20120924
> [...]
>
> > It should be OK since v2.25.
>
> OK, but as done for other tests, for older versions such testing then
> should be UNSUPPORTED instead of FAIL/UNRESOLVED (as long as that is
> practicable, which has already been described how to do, as I understand
> the other messages).
This is a bug as Uroš properly mentioned. Will fix.

--
Thanks, K

>
>
> Grüße
>  Thomas


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-26 Thread Thomas Schwinge
Hi!

On Thu, 26 Jan 2017 02:44:56 -0800, Kirill Yukhin  
wrote:
> On 26 Jan 10:14, Thomas Schwinge wrote:
> > I see:
> >
> > {+FAIL: gcc.target/i386/avx512f-ktestw-2.c (test for excess errors)+}
> > {+UNRESOLVED: gcc.target/i386/avx512f-ktestw-2.c compilation failed to 
> > produce executable+}
> >
> > ... because of:
> >
> > /tmp/ccjv3mX2.s: Assembler messages:
> > /tmp/ccjv3mX2.s:26: Error: no such instruction: `ktestw %k1,%k0'
> > compiler exited with status 1
> Which version of gas do you use?

A rather old one on that Ubuntu 12.10 system:

$ as --version
GNU assembler (GNU Binutils for Ubuntu) 2.22.90.20120924
[...]

> It should be OK since v2.25.

OK, but as done for other tests, for older versions such testing then
should be UNSUPPORTED instead of FAIL/UNRESOLVED (as long as that is
practicable, which has already been described how to do, as I understand
the other messages).


Grüße
 Thomas


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-26 Thread Uros Bizjak
On Thu, Jan 26, 2017 at 12:00 PM, Jakub Jelinek  wrote:
> On Thu, Jan 26, 2017 at 11:54:52AM +0100, Uros Bizjak wrote:
>> On Thu, Jan 26, 2017 at 11:51 AM, Jakub Jelinek  wrote:
>> > On Thu, Jan 26, 2017 at 02:44:56AM -0800, Kirill Yukhin wrote:
>> >> Hello Thomas,
>> >> On 26 Jan 10:14, Thomas Schwinge wrote:
>> >> > I see:
>> >> >
>> >> > {+FAIL: gcc.target/i386/avx512f-ktestw-2.c (test for excess 
>> >> > errors)+}
>> >> > {+UNRESOLVED: gcc.target/i386/avx512f-ktestw-2.c compilation failed 
>> >> > to produce executable+}
>> >> >
>> >> > ... because of:
>> >> >
>> >> > /tmp/ccjv3mX2.s: Assembler messages:
>> >> > /tmp/ccjv3mX2.s:26: Error: no such instruction: `ktestw %k1,%k0'
>> >> > compiler exited with status 1
>> >> Which version of gas do you use?
>> >> It should be OK since v2.25.
>> >
>> > It is weird, because the test already has:
>> > /* { dg-require-effective-target avx512f } */
>> > Perhaps if there are gas versions with partial avx512f support, we need
>> > to improve the avx512f effective target test.
>>
>> This is actually AVX512DQ instruction, please see [1], 3-509.
>>
>> [1] 
>> https://software.intel.com/sites/default/files/managed/ad/01/253666-sdm-vol-2a.pdf
>
> You're right.  But then the tests should be named avx512dq-ktestw-{1,2}.c,
> should use -mavx512dq, avx512dq effective target etc. and indeed the
> intrinsics shouldn't be in avx512fintrin.h header, but dq, and should not be
> enabled for f, but only dq.

Yes, all this is needed to fix this oversight (and one more with
kaddw), as I proposed a couple of messages earlier.

Uros.


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-26 Thread Jakub Jelinek
On Thu, Jan 26, 2017 at 11:54:52AM +0100, Uros Bizjak wrote:
> On Thu, Jan 26, 2017 at 11:51 AM, Jakub Jelinek  wrote:
> > On Thu, Jan 26, 2017 at 02:44:56AM -0800, Kirill Yukhin wrote:
> >> Hello Thomas,
> >> On 26 Jan 10:14, Thomas Schwinge wrote:
> >> > I see:
> >> >
> >> > {+FAIL: gcc.target/i386/avx512f-ktestw-2.c (test for excess errors)+}
> >> > {+UNRESOLVED: gcc.target/i386/avx512f-ktestw-2.c compilation failed 
> >> > to produce executable+}
> >> >
> >> > ... because of:
> >> >
> >> > /tmp/ccjv3mX2.s: Assembler messages:
> >> > /tmp/ccjv3mX2.s:26: Error: no such instruction: `ktestw %k1,%k0'
> >> > compiler exited with status 1
> >> Which version of gas do you use?
> >> It should be OK since v2.25.
> >
> > It is weird, because the test already has:
> > /* { dg-require-effective-target avx512f } */
> > Perhaps if there are gas versions with partial avx512f support, we need
> > to improve the avx512f effective target test.
> 
> This is actually AVX512DQ instruction, please see [1], 3-509.
> 
> [1] 
> https://software.intel.com/sites/default/files/managed/ad/01/253666-sdm-vol-2a.pdf

You're right.  But then the tests should be named avx512dq-ktestw-{1,2}.c,
should use -mavx512dq, avx512dq effective target etc. and indeed the
intrinsics shouldn't be in avx512fintrin.h header, but dq, and should not be
enabled for f, but only dq.

Jakub


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-26 Thread Uros Bizjak
On Thu, Jan 26, 2017 at 11:51 AM, Jakub Jelinek  wrote:
> On Thu, Jan 26, 2017 at 02:44:56AM -0800, Kirill Yukhin wrote:
>> Hello Thomas,
>> On 26 Jan 10:14, Thomas Schwinge wrote:
>> > I see:
>> >
>> > {+FAIL: gcc.target/i386/avx512f-ktestw-2.c (test for excess errors)+}
>> > {+UNRESOLVED: gcc.target/i386/avx512f-ktestw-2.c compilation failed to 
>> > produce executable+}
>> >
>> > ... because of:
>> >
>> > /tmp/ccjv3mX2.s: Assembler messages:
>> > /tmp/ccjv3mX2.s:26: Error: no such instruction: `ktestw %k1,%k0'
>> > compiler exited with status 1
>> Which version of gas do you use?
>> It should be OK since v2.25.
>
> It is weird, because the test already has:
> /* { dg-require-effective-target avx512f } */
> Perhaps if there are gas versions with partial avx512f support, we need
> to improve the avx512f effective target test.

This is actually AVX512DQ instruction, please see [1], 3-509.

[1] 
https://software.intel.com/sites/default/files/managed/ad/01/253666-sdm-vol-2a.pdf

Uros.


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-26 Thread Jakub Jelinek
On Thu, Jan 26, 2017 at 02:44:56AM -0800, Kirill Yukhin wrote:
> Hello Thomas,
> On 26 Jan 10:14, Thomas Schwinge wrote:
> > I see:
> >
> > {+FAIL: gcc.target/i386/avx512f-ktestw-2.c (test for excess errors)+}
> > {+UNRESOLVED: gcc.target/i386/avx512f-ktestw-2.c compilation failed to 
> > produce executable+}
> >
> > ... because of:
> >
> > /tmp/ccjv3mX2.s: Assembler messages:
> > /tmp/ccjv3mX2.s:26: Error: no such instruction: `ktestw %k1,%k0'
> > compiler exited with status 1
> Which version of gas do you use?
> It should be OK since v2.25.

It is weird, because the test already has:
/* { dg-require-effective-target avx512f } */
Perhaps if there are gas versions with partial avx512f support, we need
to improve the avx512f effective target test.

Jakub


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-26 Thread Kirill Yukhin
Hello Thomas,
On 26 Jan 10:14, Thomas Schwinge wrote:
> I see:
>
> {+FAIL: gcc.target/i386/avx512f-ktestw-2.c (test for excess errors)+}
> {+UNRESOLVED: gcc.target/i386/avx512f-ktestw-2.c compilation failed to 
> produce executable+}
>
> ... because of:
>
> /tmp/ccjv3mX2.s: Assembler messages:
> /tmp/ccjv3mX2.s:26: Error: no such instruction: `ktestw %k1,%k0'
> compiler exited with status 1
Which version of gas do you use?
It should be OK since v2.25.

--
Thanks, K
>
>
> Grüße
>  Thomas


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-26 Thread Uros Bizjak
On Thu, Jan 26, 2017 at 10:14 AM, Thomas Schwinge
 wrote:
> Hi!
>
> On Fri, 20 Jan 2017 23:03:53 +0300, Andrew Senkevich 
>  wrote:
>> diff --git a/gcc/testsuite/gcc.target/i386/avx512f-ktestw-2.c 
>> b/gcc/testsuite/gcc.target/i386/avx512f-ktestw-2.c
>> new file mode 100644
>> index 000..6602c7a
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/i386/avx512f-ktestw-2.c
>> @@ -0,0 +1,20 @@
>> +/* { dg-do run } */
>> +/* { dg-options "-O2 -mavx512f" } */
>> +/* { dg-require-effective-target avx512f } */
>> +
>> +#include "avx512f-check.h"
>> +
>> +void
>> +avx512f_test ()
>> +{
>> +  volatile __mmask16 k1, k2;
>> +  unsigned char r1, r2;
>> +
>> +  __asm__( "kmovw %1, %0" : "=k" (k1) : "r" (0) );
>> +  __asm__( "kmovw %1, %0" : "=k" (k2) : "r" (-1) );
>> +
>> +  r1 = _ktest_mask16_u8(k1, k2, );
>> +
>> +  if (r1 != 1 || r2 != 0)
>> +abort ();
>> +}
>
> I see:
>
> {+FAIL: gcc.target/i386/avx512f-ktestw-2.c (test for excess errors)+}
> {+UNRESOLVED: gcc.target/i386/avx512f-ktestw-2.c compilation failed to 
> produce executable+}
>
> ... because of:
>
> /tmp/ccjv3mX2.s: Assembler messages:
> /tmp/ccjv3mX2.s:26: Error: no such instruction: `ktestw %k1,%k0'
> compiler exited with status 1

The problem is with __builtin_ia32_ktesthi (and __builtin_ia32_kaddhi)
intrinsics. These should be enabled only with AVX512DQ, since
corresponding insns are available in AVX512DQ ISA extension.

Andrew, can you please adjust builtins, instruction patterns,
intrinsics and testcases? Also, can you please review if there are any
other inconsistencies w.r.t. ISA throughout mask intrinsics?

Uros.


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-26 Thread Thomas Schwinge
Hi!

On Fri, 20 Jan 2017 23:03:53 +0300, Andrew Senkevich 
 wrote:
> diff --git a/gcc/testsuite/gcc.target/i386/avx512f-ktestw-2.c 
> b/gcc/testsuite/gcc.target/i386/avx512f-ktestw-2.c
> new file mode 100644
> index 000..6602c7a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/avx512f-ktestw-2.c
> @@ -0,0 +1,20 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -mavx512f" } */
> +/* { dg-require-effective-target avx512f } */
> +
> +#include "avx512f-check.h"
> +
> +void
> +avx512f_test ()
> +{
> +  volatile __mmask16 k1, k2;
> +  unsigned char r1, r2;
> +
> +  __asm__( "kmovw %1, %0" : "=k" (k1) : "r" (0) );
> +  __asm__( "kmovw %1, %0" : "=k" (k2) : "r" (-1) );
> +
> +  r1 = _ktest_mask16_u8(k1, k2, );
> +
> +  if (r1 != 1 || r2 != 0)
> +abort ();
> +}

I see:

{+FAIL: gcc.target/i386/avx512f-ktestw-2.c (test for excess errors)+}
{+UNRESOLVED: gcc.target/i386/avx512f-ktestw-2.c compilation failed to 
produce executable+}

... because of:

/tmp/ccjv3mX2.s: Assembler messages:
/tmp/ccjv3mX2.s:26: Error: no such instruction: `ktestw %k1,%k0'
compiler exited with status 1


Grüße
 Thomas


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-23 Thread Kirill Yukhin
On 20 Jan 23:03, Andrew Senkevich wrote:
> 2017-01-20 20:08 GMT+03:00 Kirill Yukhin :
> > Hi,
> > On 20 Jan 14:46, Uros Bizjak wrote:
> >> On Fri, Jan 20, 2017 at 2:32 PM, Andrew Senkevich
> >>  wrote:
> >>
> >> > here is intrinsics for ktest{b,w,d,q} and kortest{b,w,d,q}. Is it Ok?
> >> >
> >> > gcc/
> >> > * config/i386/avx512bwintrin.h: Add k-mask test, kortest intrinsics.
> >> > * config/i386/avx512dqintrin.h: Ditto.
> >> > * config/i386/avx512fintrin.h: Ditto.
> >> > * gcc/config/i386/i386.c: Handle new builtins.
> >> > * config/i386/i386-builtin.def: Add new builtins.
> >> > * config/i386/sse.md (ktest, kortest): New.
> >> > (UNSPEC_KORTEST, UNSPEC_KTEST): New.
> >> >
> >> > gcc/testsuite/
> >> > * gcc.target/i386/avx512bw-ktestd-1.c: New test.
> >> > * gcc.target/i386/avx512bw-ktestq-1.c: Ditto.
> >> > * gcc.target/i386/avx512dq-ktestb-1.c: Ditto.
> >> > * gcc.target/i386/avx512f-ktestw-1.c: Ditto.
> >> > * gcc.target/i386/avx512bw-kortestd-1.c: Ditto.
> >> > * gcc.target/i386/avx512bw-kortestq-1.c: Ditto.
> >> > * gcc.target/i386/avx512dq-kortestb-1.c: Ditto.
> >> > * gcc.target/i386/avx512f-kortestw-1.c: Ditto.
> >>
> >> IMO, you should add some runtime tests.
> > +1
> >
> >> Otherwise, the patch LGTM, but I'l leave the final approval to Kirill.
> > Anyway trunk is frozen, so I suppose you'll need OK from RM.
>
> Kirill, attached with runtime tests.
>
> Richard, are you OK to approve commit of this patch?
> It is last part of k-mask intrinsics, it would be great to have all
> intrinsics of this type available in single GCC release..
OK for main trunk. I'll check it in.

--
Thanks, K


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-21 Thread Richard Biener
On January 20, 2017 9:03:53 PM GMT+01:00, Andrew Senkevich 
 wrote:
>2017-01-20 20:08 GMT+03:00 Kirill Yukhin :
>> Hi,
>> On 20 Jan 14:46, Uros Bizjak wrote:
>>> On Fri, Jan 20, 2017 at 2:32 PM, Andrew Senkevich
>>>  wrote:
>>>
>>> > here is intrinsics for ktest{b,w,d,q} and kortest{b,w,d,q}. Is it
>Ok?
>>> >
>>> > gcc/
>>> > * config/i386/avx512bwintrin.h: Add k-mask test, kortest
>intrinsics.
>>> > * config/i386/avx512dqintrin.h: Ditto.
>>> > * config/i386/avx512fintrin.h: Ditto.
>>> > * gcc/config/i386/i386.c: Handle new builtins.
>>> > * config/i386/i386-builtin.def: Add new builtins.
>>> > * config/i386/sse.md (ktest, kortest): New.
>>> > (UNSPEC_KORTEST, UNSPEC_KTEST): New.
>>> >
>>> > gcc/testsuite/
>>> > * gcc.target/i386/avx512bw-ktestd-1.c: New test.
>>> > * gcc.target/i386/avx512bw-ktestq-1.c: Ditto.
>>> > * gcc.target/i386/avx512dq-ktestb-1.c: Ditto.
>>> > * gcc.target/i386/avx512f-ktestw-1.c: Ditto.
>>> > * gcc.target/i386/avx512bw-kortestd-1.c: Ditto.
>>> > * gcc.target/i386/avx512bw-kortestq-1.c: Ditto.
>>> > * gcc.target/i386/avx512dq-kortestb-1.c: Ditto.
>>> > * gcc.target/i386/avx512f-kortestw-1.c: Ditto.
>>>
>>> IMO, you should add some runtime tests.
>> +1
>>
>>> Otherwise, the patch LGTM, but I'l leave the final approval to
>Kirill.
>> Anyway trunk is frozen, so I suppose you'll need OK from RM.
>
>Kirill, attached with runtime tests.
>
>Richard, are you OK to approve commit of this patch?

OK.  Note trunk is not frozen, it's operated in release branch mode now.

Richard.

>It is last part of k-mask intrinsics, it would be great to have all
>intrinsics of this type available in single GCC release..
>
>Updated changelog:
>
>gcc/
>   * config/i386/avx512bwintrin.h: Add k-mask test, kortest intrinsics.
>* config/i386/avx512dqintrin.h: Ditto.
>* config/i386/avx512fintrin.h: Ditto.
>* gcc/config/i386/i386.c: Handle new builtins.
>* config/i386/i386-builtin.def: Add new builtins.
>* config/i386/sse.md (ktest, kortest): New.
>(UNSPEC_KORTEST, UNSPEC_KTEST): New.
>
>gcc/testsuite/
>* gcc.target/i386/avx512bw-ktestd-1.c: New test.
>* gcc.target/i386/avx512bw-ktestq-1.c: Ditto.
>* gcc.target/i386/avx512dq-ktestb-1.c: Ditto.
>* gcc.target/i386/avx512f-ktestw-1.c: Ditto.
>* gcc.target/i386/avx512bw-kortestd-1.c: Ditto.
>* gcc.target/i386/avx512bw-kortestq-1.c: Ditto.
>* gcc.target/i386/avx512dq-kortestb-1.c: Ditto.
>* gcc.target/i386/avx512f-kortestw-1.c: Ditto.
>* gcc.target/i386/avx512bw-ktestd-2.c: Ditt
>* gcc.target/i386/avx512bw-ktestq-2.c: Ditto.
>* gcc.target/i386/avx512dq-ktestb-2.c: Ditto.
>* gcc.target/i386/avx512f-ktestw-2.c: Ditto.
>* gcc.target/i386/avx512bw-kortestd-2.c: Ditto.
>* gcc.target/i386/avx512bw-kortestq-2.c: Ditto.
>* gcc.target/i386/avx512dq-kortestb-2.c: Ditto.
>* gcc.target/i386/avx512f-kortestw-2.c: Ditto.
>
>
>--
>WBR,
>Andrew



Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-20 Thread Andrew Senkevich
2017-01-20 20:08 GMT+03:00 Kirill Yukhin :
> Hi,
> On 20 Jan 14:46, Uros Bizjak wrote:
>> On Fri, Jan 20, 2017 at 2:32 PM, Andrew Senkevich
>>  wrote:
>>
>> > here is intrinsics for ktest{b,w,d,q} and kortest{b,w,d,q}. Is it Ok?
>> >
>> > gcc/
>> > * config/i386/avx512bwintrin.h: Add k-mask test, kortest intrinsics.
>> > * config/i386/avx512dqintrin.h: Ditto.
>> > * config/i386/avx512fintrin.h: Ditto.
>> > * gcc/config/i386/i386.c: Handle new builtins.
>> > * config/i386/i386-builtin.def: Add new builtins.
>> > * config/i386/sse.md (ktest, kortest): New.
>> > (UNSPEC_KORTEST, UNSPEC_KTEST): New.
>> >
>> > gcc/testsuite/
>> > * gcc.target/i386/avx512bw-ktestd-1.c: New test.
>> > * gcc.target/i386/avx512bw-ktestq-1.c: Ditto.
>> > * gcc.target/i386/avx512dq-ktestb-1.c: Ditto.
>> > * gcc.target/i386/avx512f-ktestw-1.c: Ditto.
>> > * gcc.target/i386/avx512bw-kortestd-1.c: Ditto.
>> > * gcc.target/i386/avx512bw-kortestq-1.c: Ditto.
>> > * gcc.target/i386/avx512dq-kortestb-1.c: Ditto.
>> > * gcc.target/i386/avx512f-kortestw-1.c: Ditto.
>>
>> IMO, you should add some runtime tests.
> +1
>
>> Otherwise, the patch LGTM, but I'l leave the final approval to Kirill.
> Anyway trunk is frozen, so I suppose you'll need OK from RM.

Kirill, attached with runtime tests.

Richard, are you OK to approve commit of this patch?
It is last part of k-mask intrinsics, it would be great to have all
intrinsics of this type available in single GCC release..

Updated changelog:

gcc/
* config/i386/avx512bwintrin.h: Add k-mask test, kortest intrinsics.
* config/i386/avx512dqintrin.h: Ditto.
* config/i386/avx512fintrin.h: Ditto.
* gcc/config/i386/i386.c: Handle new builtins.
* config/i386/i386-builtin.def: Add new builtins.
* config/i386/sse.md (ktest, kortest): New.
(UNSPEC_KORTEST, UNSPEC_KTEST): New.

gcc/testsuite/
* gcc.target/i386/avx512bw-ktestd-1.c: New test.
* gcc.target/i386/avx512bw-ktestq-1.c: Ditto.
* gcc.target/i386/avx512dq-ktestb-1.c: Ditto.
* gcc.target/i386/avx512f-ktestw-1.c: Ditto.
* gcc.target/i386/avx512bw-kortestd-1.c: Ditto.
* gcc.target/i386/avx512bw-kortestq-1.c: Ditto.
* gcc.target/i386/avx512dq-kortestb-1.c: Ditto.
* gcc.target/i386/avx512f-kortestw-1.c: Ditto.
* gcc.target/i386/avx512bw-ktestd-2.c: Ditt
* gcc.target/i386/avx512bw-ktestq-2.c: Ditto.
* gcc.target/i386/avx512dq-ktestb-2.c: Ditto.
* gcc.target/i386/avx512f-ktestw-2.c: Ditto.
* gcc.target/i386/avx512bw-kortestd-2.c: Ditto.
* gcc.target/i386/avx512bw-kortestq-2.c: Ditto.
* gcc.target/i386/avx512dq-kortestb-2.c: Ditto.
* gcc.target/i386/avx512f-kortestw-2.c: Ditto.


--
WBR,
Andrew


avx512-kmask-intrin-part5.patch
Description: Binary data


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-20 Thread Kirill Yukhin
Hi,
On 20 Jan 14:46, Uros Bizjak wrote:
> On Fri, Jan 20, 2017 at 2:32 PM, Andrew Senkevich
>  wrote:
>
> > here is intrinsics for ktest{b,w,d,q} and kortest{b,w,d,q}. Is it Ok?
> >
> > gcc/
> > * config/i386/avx512bwintrin.h: Add k-mask test, kortest intrinsics.
> > * config/i386/avx512dqintrin.h: Ditto.
> > * config/i386/avx512fintrin.h: Ditto.
> > * gcc/config/i386/i386.c: Handle new builtins.
> > * config/i386/i386-builtin.def: Add new builtins.
> > * config/i386/sse.md (ktest, kortest): New.
> > (UNSPEC_KORTEST, UNSPEC_KTEST): New.
> >
> > gcc/testsuite/
> > * gcc.target/i386/avx512bw-ktestd-1.c: New test.
> > * gcc.target/i386/avx512bw-ktestq-1.c: Ditto.
> > * gcc.target/i386/avx512dq-ktestb-1.c: Ditto.
> > * gcc.target/i386/avx512f-ktestw-1.c: Ditto.
> > * gcc.target/i386/avx512bw-kortestd-1.c: Ditto.
> > * gcc.target/i386/avx512bw-kortestq-1.c: Ditto.
> > * gcc.target/i386/avx512dq-kortestb-1.c: Ditto.
> > * gcc.target/i386/avx512f-kortestw-1.c: Ditto.
>
> IMO, you should add some runtime tests.
+1

> Otherwise, the patch LGTM, but I'l leave the final approval to Kirill.
Anyway trunk is frozen, so I suppose you'll need OK from RM.

So, no much hurry. Pls add runtime tests.

--
Thanks, K
>
> Uros.


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-20 Thread Uros Bizjak
On Fri, Jan 20, 2017 at 2:32 PM, Andrew Senkevich
 wrote:

> here is intrinsics for ktest{b,w,d,q} and kortest{b,w,d,q}. Is it Ok?
>
> gcc/
> * config/i386/avx512bwintrin.h: Add k-mask test, kortest intrinsics.
> * config/i386/avx512dqintrin.h: Ditto.
> * config/i386/avx512fintrin.h: Ditto.
> * gcc/config/i386/i386.c: Handle new builtins.
> * config/i386/i386-builtin.def: Add new builtins.
> * config/i386/sse.md (ktest, kortest): New.
> (UNSPEC_KORTEST, UNSPEC_KTEST): New.
>
> gcc/testsuite/
> * gcc.target/i386/avx512bw-ktestd-1.c: New test.
> * gcc.target/i386/avx512bw-ktestq-1.c: Ditto.
> * gcc.target/i386/avx512dq-ktestb-1.c: Ditto.
> * gcc.target/i386/avx512f-ktestw-1.c: Ditto.
> * gcc.target/i386/avx512bw-kortestd-1.c: Ditto.
> * gcc.target/i386/avx512bw-kortestq-1.c: Ditto.
> * gcc.target/i386/avx512dq-kortestb-1.c: Ditto.
> * gcc.target/i386/avx512f-kortestw-1.c: Ditto.

IMO, you should add some runtime tests.

Otherwise, the patch LGTM, but I'l leave the final approval to Kirill.

Uros.


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-20 Thread Andrew Senkevich
2017-01-19 20:55 GMT+03:00 Kirill Yukhin :
> On 19 Jan 19:42, Andrew Senkevich wrote:
>> 2017-01-19 13:39 GMT+03:00 Kirill Yukhin :
>> > Hi Andrew,
>> > On 18 Jan 15:45, Andrew Senkevich wrote:
>> >> 2017-01-17 16:51 GMT+03:00 Jakub Jelinek :
>> >> > On Tue, Jan 17, 2017 at 04:03:08PM +0300, Andrew Senkevich wrote:
>> >> >> > I've played a bit w/ SDE. And looks like operands are not early 
>> >> >> > clobber:
>> >> >> > TID0: INS 0x004003ee AVX512VEX kmovd k0, eax
>> >> >> > TID0:   k0 := _
>> >> >> > ...
>> >> >> > TID0: INS 0x004003f4 AVX512VEX kshiftlw k0, k0, 
>> >> >> > 0x3
>> >> >> > TID0:   k0 := _fff8
>> >> >> >
>> >> >> > You can see that same dest and source works just fine.
>> >> >>
>> >> >> Hmm, I looked only on what ICC generates, and it was not correct way.
>> >> >
>> >> > I've just tried
>> >> > int
>> >> > main ()
>> >> > {
>> >> >   unsigned int a = 0x;
>> >> >   asm volatile ("kmovw %1, %%k6; kshiftlw $1, %%k6, %%k6; kmovw %%k6, 
>> >> > %0" : "=r" (a) : "r" (a) : "k6");
>> >> >   __builtin_printf ("%x\n", a);
>> >> >   return 0;
>> >> > }
>> >> > on KNL and got 0x.
>> >> > Are you going to report to the SDM authors so that they fix it up?
>> >> > E.g. using TEMP <- SRC1[0:...] before DEST[...] <- 0 and using TEMP
>> >> > instead of SRC1[0:...] would fix it, or filling up TEMP first and only
>> >> > at the end assigning DEST <- TEMP etc. would do.
>> >>
>> >> Yes, we will work on it.
>> >>
>> >> Attached patch refactored in part of builtints declarations and tests, is 
>> >> it Ok?
>> >
>> > Could you please add runtime tests for new intrinsics as well?
>>
>> Attached with runtime tests.
> Great! Thanks. Patch is OK for main trunk.
>
> --
> Thanks, K
>>
>> gcc/
>> * config/i386/avx512bwintrin.h: Add k-mask registers shift intrinsics.
>> * config/i386/avx512dqintrin.h: Ditto.
>> * config/i386/avx512fintrin.h: Ditto.
>> * config/i386/i386-builtin-types.def: Add new types.
>> * gcc/config/i386/i386.c: Handle new types.
>> * config/i386/i386-builtin.def (__builtin_ia32_kshiftliqi,
>> __builtin_ia32_kshiftlihi, __builtin_ia32_kshiftlisi,
>> __builtin_ia32_kshiftlidi, __builtin_ia32_kshiftriqi,
>> __builtin_ia32_kshiftrihi, __builtin_ia32_kshiftrisi,
>> __builtin_ia32_kshiftridi): New.
>> * config/i386/sse.md (k): Rename *k.
>>
>> gcc/testsuite/
>> * gcc.target/i386/avx512bw-kshiftld-1.c: New test.
>> * gcc.target/i386/avx512bw-kshiftlq-1.c: Ditto.
>> * gcc.target/i386/avx512dq-kshiftlb-1.c: Ditto.
>> * gcc.target/i386/avx512f-kshiftlw-1.c: Ditto.
>> * gcc.target/i386/avx512bw-kshiftrd-1.c: Ditto.
>> * gcc.target/i386/avx512bw-kshiftrq-1.c: Ditto.
>> * gcc.target/i386/avx512dq-kshiftrb-1.c: Ditto.
>> * gcc.target/i386/avx512f-kshiftrw-1.c: Ditto.
>> * gcc.target/i386/avx512bw-kshiftld-2.c: Ditto.
>> * gcc.target/i386/avx512bw-kshiftlq-2.c: Ditto.
>> * gcc.target/i386/avx512bw-kshiftrd-2.c: Ditto.
>> * gcc.target/i386/avx512bw-kshiftrq-2.c: Ditto.
>> * gcc.target/i386/avx512dq-kshiftlb-2.c: Ditto.
>> * gcc.target/i386/avx512dq-kshiftrb-2.c: Ditto.
>> * gcc.target/i386/avx512f-kshiftlw-2.c: Ditto.
>> * gcc.target/i386/avx512f-kshiftrw-2.c: Ditto.
>> * gcc.target/i386/avx-1.c: Test new intrinsics.
>> * gcc.target/i386/sse-13.c: Ditto.
>> * gcc.target/i386/sse-23.c: Ditto.

Hi,

here is intrinsics for ktest{b,w,d,q} and kortest{b,w,d,q}. Is it Ok?

gcc/
* config/i386/avx512bwintrin.h: Add k-mask test, kortest intrinsics.
* config/i386/avx512dqintrin.h: Ditto.
* config/i386/avx512fintrin.h: Ditto.
* gcc/config/i386/i386.c: Handle new builtins.
* config/i386/i386-builtin.def: Add new builtins.
* config/i386/sse.md (ktest, kortest): New.
(UNSPEC_KORTEST, UNSPEC_KTEST): New.

gcc/testsuite/
* gcc.target/i386/avx512bw-ktestd-1.c: New test.
* gcc.target/i386/avx512bw-ktestq-1.c: Ditto.
* gcc.target/i386/avx512dq-ktestb-1.c: Ditto.
* gcc.target/i386/avx512f-ktestw-1.c: Ditto.
* gcc.target/i386/avx512bw-kortestd-1.c: Ditto.
* gcc.target/i386/avx512bw-kortestq-1.c: Ditto.
* gcc.target/i386/avx512dq-kortestb-1.c: Ditto.
* gcc.target/i386/avx512f-kortestw-1.c: Ditto.


--
WBR,
Andrew


avx512-kmask-intrin-part5.patch
Description: Binary data


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-19 Thread Kirill Yukhin
On 19 Jan 19:42, Andrew Senkevich wrote:
> 2017-01-19 13:39 GMT+03:00 Kirill Yukhin :
> > Hi Andrew,
> > On 18 Jan 15:45, Andrew Senkevich wrote:
> >> 2017-01-17 16:51 GMT+03:00 Jakub Jelinek :
> >> > On Tue, Jan 17, 2017 at 04:03:08PM +0300, Andrew Senkevich wrote:
> >> >> > I've played a bit w/ SDE. And looks like operands are not early 
> >> >> > clobber:
> >> >> > TID0: INS 0x004003ee AVX512VEX kmovd k0, eax
> >> >> > TID0:   k0 := _
> >> >> > ...
> >> >> > TID0: INS 0x004003f4 AVX512VEX kshiftlw k0, k0, 
> >> >> > 0x3
> >> >> > TID0:   k0 := _fff8
> >> >> >
> >> >> > You can see that same dest and source works just fine.
> >> >>
> >> >> Hmm, I looked only on what ICC generates, and it was not correct way.
> >> >
> >> > I've just tried
> >> > int
> >> > main ()
> >> > {
> >> >   unsigned int a = 0x;
> >> >   asm volatile ("kmovw %1, %%k6; kshiftlw $1, %%k6, %%k6; kmovw %%k6, 
> >> > %0" : "=r" (a) : "r" (a) : "k6");
> >> >   __builtin_printf ("%x\n", a);
> >> >   return 0;
> >> > }
> >> > on KNL and got 0x.
> >> > Are you going to report to the SDM authors so that they fix it up?
> >> > E.g. using TEMP <- SRC1[0:...] before DEST[...] <- 0 and using TEMP
> >> > instead of SRC1[0:...] would fix it, or filling up TEMP first and only
> >> > at the end assigning DEST <- TEMP etc. would do.
> >>
> >> Yes, we will work on it.
> >>
> >> Attached patch refactored in part of builtints declarations and tests, is 
> >> it Ok?
> >
> > Could you please add runtime tests for new intrinsics as well?
>
> Attached with runtime tests.
Great! Thanks. Patch is OK for main trunk.

--
Thanks, K
>
> gcc/
> * config/i386/avx512bwintrin.h: Add k-mask registers shift intrinsics.
> * config/i386/avx512dqintrin.h: Ditto.
> * config/i386/avx512fintrin.h: Ditto.
> * config/i386/i386-builtin-types.def: Add new types.
> * gcc/config/i386/i386.c: Handle new types.
> * config/i386/i386-builtin.def (__builtin_ia32_kshiftliqi,
> __builtin_ia32_kshiftlihi, __builtin_ia32_kshiftlisi,
> __builtin_ia32_kshiftlidi, __builtin_ia32_kshiftriqi,
> __builtin_ia32_kshiftrihi, __builtin_ia32_kshiftrisi,
> __builtin_ia32_kshiftridi): New.
> * config/i386/sse.md (k): Rename *k.
>
> gcc/testsuite/
> * gcc.target/i386/avx512bw-kshiftld-1.c: New test.
> * gcc.target/i386/avx512bw-kshiftlq-1.c: Ditto.
> * gcc.target/i386/avx512dq-kshiftlb-1.c: Ditto.
> * gcc.target/i386/avx512f-kshiftlw-1.c: Ditto.
> * gcc.target/i386/avx512bw-kshiftrd-1.c: Ditto.
> * gcc.target/i386/avx512bw-kshiftrq-1.c: Ditto.
> * gcc.target/i386/avx512dq-kshiftrb-1.c: Ditto.
> * gcc.target/i386/avx512f-kshiftrw-1.c: Ditto.
> * gcc.target/i386/avx512bw-kshiftld-2.c: Ditto.
> * gcc.target/i386/avx512bw-kshiftlq-2.c: Ditto.
> * gcc.target/i386/avx512bw-kshiftrd-2.c: Ditto.
> * gcc.target/i386/avx512bw-kshiftrq-2.c: Ditto.
> * gcc.target/i386/avx512dq-kshiftlb-2.c: Ditto.
> * gcc.target/i386/avx512dq-kshiftrb-2.c: Ditto.
> * gcc.target/i386/avx512f-kshiftlw-2.c: Ditto.
> * gcc.target/i386/avx512f-kshiftrw-2.c: Ditto.
> * gcc.target/i386/avx-1.c: Test new intrinsics.
> * gcc.target/i386/sse-13.c: Ditto.
> * gcc.target/i386/sse-23.c: Ditto.
>
>
> --
> WBR,
> Andrew


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-19 Thread Andrew Senkevich
2017-01-19 13:39 GMT+03:00 Kirill Yukhin :
> Hi Andrew,
> On 18 Jan 15:45, Andrew Senkevich wrote:
>> 2017-01-17 16:51 GMT+03:00 Jakub Jelinek :
>> > On Tue, Jan 17, 2017 at 04:03:08PM +0300, Andrew Senkevich wrote:
>> >> > I've played a bit w/ SDE. And looks like operands are not early clobber:
>> >> > TID0: INS 0x004003ee AVX512VEX kmovd k0, eax
>> >> > TID0:   k0 := _
>> >> > ...
>> >> > TID0: INS 0x004003f4 AVX512VEX kshiftlw k0, k0, 0x3
>> >> > TID0:   k0 := _fff8
>> >> >
>> >> > You can see that same dest and source works just fine.
>> >>
>> >> Hmm, I looked only on what ICC generates, and it was not correct way.
>> >
>> > I've just tried
>> > int
>> > main ()
>> > {
>> >   unsigned int a = 0x;
>> >   asm volatile ("kmovw %1, %%k6; kshiftlw $1, %%k6, %%k6; kmovw %%k6, %0" 
>> > : "=r" (a) : "r" (a) : "k6");
>> >   __builtin_printf ("%x\n", a);
>> >   return 0;
>> > }
>> > on KNL and got 0x.
>> > Are you going to report to the SDM authors so that they fix it up?
>> > E.g. using TEMP <- SRC1[0:...] before DEST[...] <- 0 and using TEMP
>> > instead of SRC1[0:...] would fix it, or filling up TEMP first and only
>> > at the end assigning DEST <- TEMP etc. would do.
>>
>> Yes, we will work on it.
>>
>> Attached patch refactored in part of builtints declarations and tests, is it 
>> Ok?
>
> Could you please add runtime tests for new intrinsics as well?

Attached with runtime tests.

gcc/
* config/i386/avx512bwintrin.h: Add k-mask registers shift intrinsics.
* config/i386/avx512dqintrin.h: Ditto.
* config/i386/avx512fintrin.h: Ditto.
* config/i386/i386-builtin-types.def: Add new types.
* gcc/config/i386/i386.c: Handle new types.
* config/i386/i386-builtin.def (__builtin_ia32_kshiftliqi,
__builtin_ia32_kshiftlihi, __builtin_ia32_kshiftlisi,
__builtin_ia32_kshiftlidi, __builtin_ia32_kshiftriqi,
__builtin_ia32_kshiftrihi, __builtin_ia32_kshiftrisi,
__builtin_ia32_kshiftridi): New.
* config/i386/sse.md (k): Rename *k.

gcc/testsuite/
* gcc.target/i386/avx512bw-kshiftld-1.c: New test.
* gcc.target/i386/avx512bw-kshiftlq-1.c: Ditto.
* gcc.target/i386/avx512dq-kshiftlb-1.c: Ditto.
* gcc.target/i386/avx512f-kshiftlw-1.c: Ditto.
* gcc.target/i386/avx512bw-kshiftrd-1.c: Ditto.
* gcc.target/i386/avx512bw-kshiftrq-1.c: Ditto.
* gcc.target/i386/avx512dq-kshiftrb-1.c: Ditto.
* gcc.target/i386/avx512f-kshiftrw-1.c: Ditto.
* gcc.target/i386/avx512bw-kshiftld-2.c: Ditto.
* gcc.target/i386/avx512bw-kshiftlq-2.c: Ditto.
* gcc.target/i386/avx512bw-kshiftrd-2.c: Ditto.
* gcc.target/i386/avx512bw-kshiftrq-2.c: Ditto.
* gcc.target/i386/avx512dq-kshiftlb-2.c: Ditto.
* gcc.target/i386/avx512dq-kshiftrb-2.c: Ditto.
* gcc.target/i386/avx512f-kshiftlw-2.c: Ditto.
* gcc.target/i386/avx512f-kshiftrw-2.c: Ditto.
* gcc.target/i386/avx-1.c: Test new intrinsics.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.


--
WBR,
Andrew


avx512-kmask-intrin-part4.patch
Description: Binary data


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-19 Thread Kirill Yukhin
Hi Andrew,
On 18 Jan 15:45, Andrew Senkevich wrote:
> 2017-01-17 16:51 GMT+03:00 Jakub Jelinek :
> > On Tue, Jan 17, 2017 at 04:03:08PM +0300, Andrew Senkevich wrote:
> >> > I've played a bit w/ SDE. And looks like operands are not early clobber:
> >> > TID0: INS 0x004003ee AVX512VEX kmovd k0, eax
> >> > TID0:   k0 := _
> >> > ...
> >> > TID0: INS 0x004003f4 AVX512VEX kshiftlw k0, k0, 0x3
> >> > TID0:   k0 := _fff8
> >> >
> >> > You can see that same dest and source works just fine.
> >>
> >> Hmm, I looked only on what ICC generates, and it was not correct way.
> >
> > I've just tried
> > int
> > main ()
> > {
> >   unsigned int a = 0x;
> >   asm volatile ("kmovw %1, %%k6; kshiftlw $1, %%k6, %%k6; kmovw %%k6, %0" : 
> > "=r" (a) : "r" (a) : "k6");
> >   __builtin_printf ("%x\n", a);
> >   return 0;
> > }
> > on KNL and got 0x.
> > Are you going to report to the SDM authors so that they fix it up?
> > E.g. using TEMP <- SRC1[0:...] before DEST[...] <- 0 and using TEMP
> > instead of SRC1[0:...] would fix it, or filling up TEMP first and only
> > at the end assigning DEST <- TEMP etc. would do.
>
> Yes, we will work on it.
>
> Attached patch refactored in part of builtints declarations and tests, is it 
> Ok?

Could you please add runtime tests for new intrinsics as well?


--
Thanks, K

> --
> WBR,
> Andrew


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-18 Thread Uros Bizjak
On Wed, Jan 18, 2017 at 1:45 PM, Andrew Senkevich
 wrote:
> 2017-01-17 16:51 GMT+03:00 Jakub Jelinek :
>> On Tue, Jan 17, 2017 at 04:03:08PM +0300, Andrew Senkevich wrote:
>>> > I've played a bit w/ SDE. And looks like operands are not early clobber:
>>> > TID0: INS 0x004003ee AVX512VEX kmovd k0, eax
>>> > TID0:   k0 := _
>>> > ...
>>> > TID0: INS 0x004003f4 AVX512VEX kshiftlw k0, k0, 0x3
>>> > TID0:   k0 := _fff8
>>> >
>>> > You can see that same dest and source works just fine.
>>>
>>> Hmm, I looked only on what ICC generates, and it was not correct way.
>>
>> I've just tried
>> int
>> main ()
>> {
>>   unsigned int a = 0x;
>>   asm volatile ("kmovw %1, %%k6; kshiftlw $1, %%k6, %%k6; kmovw %%k6, %0" : 
>> "=r" (a) : "r" (a) : "k6");
>>   __builtin_printf ("%x\n", a);
>>   return 0;
>> }
>> on KNL and got 0x.
>> Are you going to report to the SDM authors so that they fix it up?
>> E.g. using TEMP <- SRC1[0:...] before DEST[...] <- 0 and using TEMP
>> instead of SRC1[0:...] would fix it, or filling up TEMP first and only
>> at the end assigning DEST <- TEMP etc. would do.
>
> Yes, we will work on it.
>
> Attached patch refactored in part of builtints declarations and tests, is it 
> Ok?
>
> gcc/
> * config/i386/avx512bwintrin.h: Add k-mask registers shift intrinsics.
> * config/i386/avx512dqintrin.h: Ditto.
> * config/i386/avx512fintrin.h: Ditto.
> * config/i386/i386-builtin-types.def: Add new types.
> * gcc/config/i386/i386.c: Handle new types.
> * config/i386/i386-builtin.def (__builtin_ia32_kshiftliqi,
> __builtin_ia32_kshiftlihi, __builtin_ia32_kshiftlisi,
> __builtin_ia32_kshiftlidi, __builtin_ia32_kshiftriqi,
> __builtin_ia32_kshiftrihi, __builtin_ia32_kshiftrisi,
> __builtin_ia32_kshiftridi): New.
> * config/i386/sse.md (k): Rename *k.
>
> gcc/testsuite/
> * gcc.target/i386/avx512bw-kshiftld-1.c: New test.
> * gcc.target/i386/avx512bw-kshiftlq-1.c: Ditto.
> * gcc.target/i386/avx512dq-kshiftlb-1.c: Ditto.
> * gcc.target/i386/avx512f-kshiftlw-1.c: Ditto.
> * gcc.target/i386/avx512bw-kshiftrd-1.c: Ditto.
> * gcc.target/i386/avx512bw-kshiftrq-1.c: Ditto.
> * gcc.target/i386/avx512dq-kshiftrb-1.c: Ditto.
> * gcc.target/i386/avx512f-kshiftrw-1.c: Ditto.
> * gcc.target/i386/avx-1.c: Test new intrinsics.
> * gcc.target/i386/sse-13.c: Ditto.
> * gcc.target/i386/sse-23.c: Ditto.

OK.

Thanks,
Uros.


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-18 Thread Andrew Senkevich
2017-01-17 16:51 GMT+03:00 Jakub Jelinek :
> On Tue, Jan 17, 2017 at 04:03:08PM +0300, Andrew Senkevich wrote:
>> > I've played a bit w/ SDE. And looks like operands are not early clobber:
>> > TID0: INS 0x004003ee AVX512VEX kmovd k0, eax
>> > TID0:   k0 := _
>> > ...
>> > TID0: INS 0x004003f4 AVX512VEX kshiftlw k0, k0, 0x3
>> > TID0:   k0 := _fff8
>> >
>> > You can see that same dest and source works just fine.
>>
>> Hmm, I looked only on what ICC generates, and it was not correct way.
>
> I've just tried
> int
> main ()
> {
>   unsigned int a = 0x;
>   asm volatile ("kmovw %1, %%k6; kshiftlw $1, %%k6, %%k6; kmovw %%k6, %0" : 
> "=r" (a) : "r" (a) : "k6");
>   __builtin_printf ("%x\n", a);
>   return 0;
> }
> on KNL and got 0x.
> Are you going to report to the SDM authors so that they fix it up?
> E.g. using TEMP <- SRC1[0:...] before DEST[...] <- 0 and using TEMP
> instead of SRC1[0:...] would fix it, or filling up TEMP first and only
> at the end assigning DEST <- TEMP etc. would do.

Yes, we will work on it.

Attached patch refactored in part of builtints declarations and tests, is it Ok?

gcc/
* config/i386/avx512bwintrin.h: Add k-mask registers shift intrinsics.
* config/i386/avx512dqintrin.h: Ditto.
* config/i386/avx512fintrin.h: Ditto.
* config/i386/i386-builtin-types.def: Add new types.
* gcc/config/i386/i386.c: Handle new types.
* config/i386/i386-builtin.def (__builtin_ia32_kshiftliqi,
__builtin_ia32_kshiftlihi, __builtin_ia32_kshiftlisi,
__builtin_ia32_kshiftlidi, __builtin_ia32_kshiftriqi,
__builtin_ia32_kshiftrihi, __builtin_ia32_kshiftrisi,
__builtin_ia32_kshiftridi): New.
* config/i386/sse.md (k): Rename *k.

gcc/testsuite/
* gcc.target/i386/avx512bw-kshiftld-1.c: New test.
* gcc.target/i386/avx512bw-kshiftlq-1.c: Ditto.
* gcc.target/i386/avx512dq-kshiftlb-1.c: Ditto.
* gcc.target/i386/avx512f-kshiftlw-1.c: Ditto.
* gcc.target/i386/avx512bw-kshiftrd-1.c: Ditto.
* gcc.target/i386/avx512bw-kshiftrq-1.c: Ditto.
* gcc.target/i386/avx512dq-kshiftrb-1.c: Ditto.
* gcc.target/i386/avx512f-kshiftrw-1.c: Ditto.
* gcc.target/i386/avx-1.c: Test new intrinsics.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.


--
WBR,
Andrew


avx512-kmask-intrin-part4.patch
Description: Binary data


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-17 Thread Jakub Jelinek
On Tue, Jan 17, 2017 at 04:03:08PM +0300, Andrew Senkevich wrote:
> > I've played a bit w/ SDE. And looks like operands are not early clobber:
> > TID0: INS 0x004003ee AVX512VEX kmovd k0, eax
> > TID0:   k0 := _
> > ...
> > TID0: INS 0x004003f4 AVX512VEX kshiftlw k0, k0, 0x3
> > TID0:   k0 := _fff8
> >
> > You can see that same dest and source works just fine.
> 
> Hmm, I looked only on what ICC generates, and it was not correct way.

I've just tried
int
main ()
{
  unsigned int a = 0x;
  asm volatile ("kmovw %1, %%k6; kshiftlw $1, %%k6, %%k6; kmovw %%k6, %0" : 
"=r" (a) : "r" (a) : "k6");
  __builtin_printf ("%x\n", a);
  return 0;
}
on KNL and got 0x.
Are you going to report to the SDM authors so that they fix it up?
E.g. using TEMP <- SRC1[0:...] before DEST[...] <- 0 and using TEMP
instead of SRC1[0:...] would fix it, or filling up TEMP first and only
at the end assigning DEST <- TEMP etc. would do.

Jakub


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-17 Thread Andrew Senkevich
2017-01-17 15:30 GMT+03:00 Kirill Yukhin :
> Hi Anrey,
> On 17 Jan 14:04, Andrew Senkevich wrote:
>> 2017-01-17 1:55 GMT+03:00 Jakub Jelinek :
>> > On Tue, Jan 17, 2017 at 01:30:11AM +0300, Andrew Senkevich wrote:
>> >> here is one more part of intrinsics for k-mask registers shifts:
>> >
>> > The software developer manuals describe KSHIFT{L,R}* like:
>> > KSHIFTLW
>> > COUNT <- imm8[7:0]
>> > DEST[MAX_KL-1:0] <- 0
>> > IF COUNT <=15
>> > THEN DEST[15:0] <- SRC1[15:0] << COUNT;
>> > FI;
>> >
>> > What is the behavior when src1 == dest, like:
>> >   kshiftld $3, %k3, %k3
>> > ?  Is it just a bug in the SDM and will it actually do the expected thing
>> > (set %k3 to %k3 << 3 and clear just the upper bits), or do we need
>> > an early-clobber on the destination to make sure GCC never emits these
>> > insns with the same register as both input and output?
>>
>> Indeed, it should be different registers, how to do it?
> Are you sure?
>
> I've played a bit w/ SDE. And looks like operands are not early clobber:
> TID0: INS 0x004003ee AVX512VEX kmovd k0, eax
> TID0:   k0 := _
> ...
> TID0: INS 0x004003f4 AVX512VEX kshiftlw k0, k0, 0x3
> TID0:   k0 := _fff8
>
> You can see that same dest and source works just fine.

Hmm, I looked only on what ICC generates, and it was not correct way.

Thanks Kirill!


--
WBR,
Andrew


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-17 Thread Kirill Yukhin
Hi Anrey,
On 17 Jan 14:04, Andrew Senkevich wrote:
> 2017-01-17 1:55 GMT+03:00 Jakub Jelinek :
> > On Tue, Jan 17, 2017 at 01:30:11AM +0300, Andrew Senkevich wrote:
> >> here is one more part of intrinsics for k-mask registers shifts:
> >
> > The software developer manuals describe KSHIFT{L,R}* like:
> > KSHIFTLW
> > COUNT <- imm8[7:0]
> > DEST[MAX_KL-1:0] <- 0
> > IF COUNT <=15
> > THEN DEST[15:0] <- SRC1[15:0] << COUNT;
> > FI;
> >
> > What is the behavior when src1 == dest, like:
> >   kshiftld $3, %k3, %k3
> > ?  Is it just a bug in the SDM and will it actually do the expected thing
> > (set %k3 to %k3 << 3 and clear just the upper bits), or do we need
> > an early-clobber on the destination to make sure GCC never emits these
> > insns with the same register as both input and output?
>
> Indeed, it should be different registers, how to do it?
Are you sure?

I've played a bit w/ SDE. And looks like operands are not early clobber:
TID0: INS 0x004003ee AVX512VEX kmovd k0, eax
TID0:   k0 := _
...
TID0: INS 0x004003f4 AVX512VEX kshiftlw k0, k0, 0x3
TID0:   k0 := _fff8

You can see that same dest and source works just fine.

--
Thanks, K
>
>
> --
> WBR,
> Andrew


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-17 Thread Uros Bizjak
On Tue, Jan 17, 2017 at 12:04 PM, Andrew Senkevich
 wrote:
> 2017-01-17 1:55 GMT+03:00 Jakub Jelinek :
>> On Tue, Jan 17, 2017 at 01:30:11AM +0300, Andrew Senkevich wrote:
>>> here is one more part of intrinsics for k-mask registers shifts:
>>
>> The software developer manuals describe KSHIFT{L,R}* like:
>> KSHIFTLW
>> COUNT <- imm8[7:0]
>> DEST[MAX_KL-1:0] <- 0
>> IF COUNT <=15
>> THEN DEST[15:0] <- SRC1[15:0] << COUNT;
>> FI;
>>
>> What is the behavior when src1 == dest, like:
>>   kshiftld $3, %k3, %k3
>> ?  Is it just a bug in the SDM and will it actually do the expected thing
>> (set %k3 to %k3 << 3 and clear just the upper bits), or do we need
>> an early-clobber on the destination to make sure GCC never emits these
>> insns with the same register as both input and output?
>
> Indeed, it should be different registers, how to do it?

"=" as operand 0 constraint.

Uros.


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-17 Thread Andrew Senkevich
2017-01-17 1:55 GMT+03:00 Jakub Jelinek :
> On Tue, Jan 17, 2017 at 01:30:11AM +0300, Andrew Senkevich wrote:
>> here is one more part of intrinsics for k-mask registers shifts:
>
> The software developer manuals describe KSHIFT{L,R}* like:
> KSHIFTLW
> COUNT <- imm8[7:0]
> DEST[MAX_KL-1:0] <- 0
> IF COUNT <=15
> THEN DEST[15:0] <- SRC1[15:0] << COUNT;
> FI;
>
> What is the behavior when src1 == dest, like:
>   kshiftld $3, %k3, %k3
> ?  Is it just a bug in the SDM and will it actually do the expected thing
> (set %k3 to %k3 << 3 and clear just the upper bits), or do we need
> an early-clobber on the destination to make sure GCC never emits these
> insns with the same register as both input and output?

Indeed, it should be different registers, how to do it?


--
WBR,
Andrew


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-17 Thread Uros Bizjak
On Mon, Jan 16, 2017 at 11:30 PM, Andrew Senkevich
 wrote:
> Hi,
>
> here is one more part of intrinsics for k-mask registers shifts:
>
> gcc/
> * config/i386/avx512bwintrin.h: Add k-mask registers shift intrinsics.
> * config/i386/avx512dqintrin.h: Ditto.
> * config/i386/avx512fintrin.h: Ditto.
> * config/i386/i386-builtin-types.def: Add new types.
> * gcc/config/i386/i386.c: Handle new types.
> * config/i386/i386-builtin.def (__builtin_ia32_kshiftliqi,
> __builtin_ia32_kshiftlihi, __builtin_ia32_kshiftlisi,
> __builtin_ia32_kshiftlidi, __builtin_ia32_kshiftriqi,
> __builtin_ia32_kshiftrihi, __builtin_ia32_kshiftrisi,
> __builtin_ia32_kshiftridi): New.
> * config/i386/sse.md (k2): Rename *k.
>
> gcc/testsuite/
> * gcc.target/i386/avx512bw-kshiftld-1.c: New test.
> * gcc.target/i386/avx512bw-kshiftlq-1.c: Ditto.
> * gcc.target/i386/avx512dq-kshiftlb-1.c: Ditto.
> * gcc.target/i386/avx512f-kshiftlw-1.c: Ditto.
> * gcc.target/i386/avx512bw-kshiftrd-1.c: Ditto.
> * gcc.target/i386/avx512bw-kshiftrq-1.c: Ditto.
> * gcc.target/i386/avx512dq-kshiftrb-1.c: Ditto.
> * gcc.target/i386/avx512f-kshiftrw-1.c: Ditto.
>
>
> Is it Ok for trunk?

-(define_insn "*k"
+(define_insn "k2"
   [(set (match_operand:SWI1248_AVX512BWDQ 0 "register_operand" "=k")
  (any_lshift:SWI1248_AVX512BWDQ
   (match_operand:SWI1248_AVX512BWDQ 1 "register_operand" "k")

Please do not add "2" to the insn name to follow de-facto convention
of other mask insn names.

Otherwise, OK - but please check Jakub's question first.

Uros.


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-16 Thread Jakub Jelinek
On Tue, Jan 17, 2017 at 01:30:11AM +0300, Andrew Senkevich wrote:
> here is one more part of intrinsics for k-mask registers shifts:

The software developer manuals describe KSHIFT{L,R}* like:
KSHIFTLW
COUNT <- imm8[7:0]
DEST[MAX_KL-1:0] <- 0
IF COUNT <=15
THEN DEST[15:0] <- SRC1[15:0] << COUNT;
FI;

What is the behavior when src1 == dest, like:
  kshiftld $3, %k3, %k3
?  Is it just a bug in the SDM and will it actually do the expected thing
(set %k3 to %k3 << 3 and clear just the upper bits), or do we need
an early-clobber on the destination to make sure GCC never emits these
insns with the same register as both input and output?

Jakub


Re: [PATCH] Add AVX512 k-mask intrinsics

2017-01-16 Thread Andrew Senkevich
Hi,

here is one more part of intrinsics for k-mask registers shifts:

gcc/
* config/i386/avx512bwintrin.h: Add k-mask registers shift intrinsics.
* config/i386/avx512dqintrin.h: Ditto.
* config/i386/avx512fintrin.h: Ditto.
* config/i386/i386-builtin-types.def: Add new types.
* gcc/config/i386/i386.c: Handle new types.
* config/i386/i386-builtin.def (__builtin_ia32_kshiftliqi,
__builtin_ia32_kshiftlihi, __builtin_ia32_kshiftlisi,
__builtin_ia32_kshiftlidi, __builtin_ia32_kshiftriqi,
__builtin_ia32_kshiftrihi, __builtin_ia32_kshiftrisi,
__builtin_ia32_kshiftridi): New.
* config/i386/sse.md (k2): Rename *k.

gcc/testsuite/
* gcc.target/i386/avx512bw-kshiftld-1.c: New test.
* gcc.target/i386/avx512bw-kshiftlq-1.c: Ditto.
* gcc.target/i386/avx512dq-kshiftlb-1.c: Ditto.
* gcc.target/i386/avx512f-kshiftlw-1.c: Ditto.
* gcc.target/i386/avx512bw-kshiftrd-1.c: Ditto.
* gcc.target/i386/avx512bw-kshiftrq-1.c: Ditto.
* gcc.target/i386/avx512dq-kshiftrb-1.c: Ditto.
* gcc.target/i386/avx512f-kshiftrw-1.c: Ditto.


Is it Ok for trunk?


--
WBR,
Andrew


avx512-kmask-intrin-part4.patch
Description: Binary data


Re: [PATCH] Add AVX512 k-mask intrinsics

2016-12-16 Thread Uros Bizjak
On Thu, Dec 15, 2016 at 7:55 PM, Andrew Senkevich
 wrote:
> 2016-12-15 19:51 GMT+03:00 Uros Bizjak :
>> On Thu, Dec 15, 2016 at 2:31 PM, Andrew Senkevich
>>  wrote:
>>> 2016-12-14 22:55 GMT+03:00 Uros Bizjak :
 On Wed, Dec 14, 2016 at 8:04 PM, Andrew Senkevich
  wrote:

> here is the second part of k-mask intrinsics, is it Ok?

> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -1309,12 +1309,30 @@
>  ;; Mask variant shift mnemonics
>  (define_code_attr mshift [(ashift "shiftl") (lshiftrt "shiftr")])
>
> +(define_expand "kmovb"
> +  [(set (match_operand:QI 0 "nonimmediate_operand")
> + (match_operand:QI 1 "nonimmediate_operand"))]
> +  "TARGET_AVX512DQ
> +   && !(MEM_P (operands[0]) && MEM_P (operands[1]))")
> +
>  (define_expand "kmovw"
>[(set (match_operand:HI 0 "nonimmediate_operand")
>   (match_operand:HI 1 "nonimmediate_operand"))]
>"TARGET_AVX512F
> && !(MEM_P (operands[0]) && MEM_P (operands[1]))")
>
> +(define_expand "kmovd"
> +  [(set (match_operand:SI 0 "nonimmediate_operand")
> + (match_operand:SI 1 "nonimmediate_operand"))]
> +  "TARGET_AVX512BW
> +   && !(MEM_P (operands[0]) && MEM_P (operands[1]))")
> +
> +(define_expand "kmovq"
> +  [(set (match_operand:DI 0 "nonimmediate_operand")
> + (match_operand:DI 1 "nonimmediate_operand"))]
> +  "TARGET_AVX512BW
> +   && !(MEM_P (operands[0]) && MEM_P (operands[1]))")
> +
>  (define_insn "k"
>[(set (match_operand:SWI1248_AVX512BW 0 "register_operand" "=k")
>   (any_logic:SWI1248_AVX512BW

 All the above patterns can be macroized with the following patch:

 --cut here--
 Index: sse.md
 ===
 --- sse.md  (revision 243651)
 +++ sse.md  (working copy)
 @@ -1309,9 +1309,9 @@
  ;; Mask variant shift mnemonics
  (define_code_attr mshift [(ashift "shiftl") (lshiftrt "shiftr")])

 -(define_expand "kmovw"
 -  [(set (match_operand:HI 0 "nonimmediate_operand")
 -   (match_operand:HI 1 "nonimmediate_operand"))]
 +(define_expand "kmov"
 +  [(set (match_operand:SWI1248_AVX512BWDQ 0 "nonimmediate_operand")
 +   (match_operand:SWI1248_AVX512BWDQ 1 "nonimmediate_operand"))]
"TARGET_AVX512F
 && !(MEM_P (operands[0]) && MEM_P (operands[1]))")

 --cut here--

 Please also post ChangeLog entry.
>>>
>>> Thanks,
>>>
>>> here is with ChangeLogs and renamed internal __builtin_ia32_kmov* to
>>> match instruction names.
>>> For __builtin_ia32_kmov16 change I will follow up for update in branches.
>>>
>>> Regtested on x86_64-linux-gnu, Ok for trunk?
>>
>> OK.
>
> Thanks,
>
> here is one more part for kadd{b,w,d,q}, is it ok?
>
> gcc/
> * config/i386/avx512bwintrin.h: Add new k-mask intrinsics.
> * config/i386/avx512dqintrin.h: Ditto.
> * config/i386/avx512fintrin.h: Ditto.
> * config/i386/i386-builtin.def (__builtin_ia32_kaddqi,
> __builtin_ia32_kaddhi, __builtin_ia32_kaddsi,
> __builtin_ia32_kadddi): New.
> * config/i386/sse.md (kadd): New.
>
> gcc/testsuite/
> * gcc.target/i386/avx512bw-kaddd-1.c: New test.
> * gcc.target/i386/avx512bw-kaddq-1.c: Ditto.
> * gcc.target/i386/avx512dq-kaddb-1.c: Ditto.
> * gcc.target/i386/avx512f-kaddw-1.c: Ditto.

OK.

I'll commit the patch to mainline later today.

Thanks,
Uros.


Re: [PATCH] Add AVX512 k-mask intrinsics

2016-12-15 Thread Andrew Senkevich
2016-12-15 19:51 GMT+03:00 Uros Bizjak :
> On Thu, Dec 15, 2016 at 2:31 PM, Andrew Senkevich
>  wrote:
>> 2016-12-14 22:55 GMT+03:00 Uros Bizjak :
>>> On Wed, Dec 14, 2016 at 8:04 PM, Andrew Senkevich
>>>  wrote:
>>>
 here is the second part of k-mask intrinsics, is it Ok?
>>>
 --- a/gcc/config/i386/sse.md
 +++ b/gcc/config/i386/sse.md
 @@ -1309,12 +1309,30 @@
  ;; Mask variant shift mnemonics
  (define_code_attr mshift [(ashift "shiftl") (lshiftrt "shiftr")])

 +(define_expand "kmovb"
 +  [(set (match_operand:QI 0 "nonimmediate_operand")
 + (match_operand:QI 1 "nonimmediate_operand"))]
 +  "TARGET_AVX512DQ
 +   && !(MEM_P (operands[0]) && MEM_P (operands[1]))")
 +
  (define_expand "kmovw"
[(set (match_operand:HI 0 "nonimmediate_operand")
   (match_operand:HI 1 "nonimmediate_operand"))]
"TARGET_AVX512F
 && !(MEM_P (operands[0]) && MEM_P (operands[1]))")

 +(define_expand "kmovd"
 +  [(set (match_operand:SI 0 "nonimmediate_operand")
 + (match_operand:SI 1 "nonimmediate_operand"))]
 +  "TARGET_AVX512BW
 +   && !(MEM_P (operands[0]) && MEM_P (operands[1]))")
 +
 +(define_expand "kmovq"
 +  [(set (match_operand:DI 0 "nonimmediate_operand")
 + (match_operand:DI 1 "nonimmediate_operand"))]
 +  "TARGET_AVX512BW
 +   && !(MEM_P (operands[0]) && MEM_P (operands[1]))")
 +
  (define_insn "k"
[(set (match_operand:SWI1248_AVX512BW 0 "register_operand" "=k")
   (any_logic:SWI1248_AVX512BW
>>>
>>> All the above patterns can be macroized with the following patch:
>>>
>>> --cut here--
>>> Index: sse.md
>>> ===
>>> --- sse.md  (revision 243651)
>>> +++ sse.md  (working copy)
>>> @@ -1309,9 +1309,9 @@
>>>  ;; Mask variant shift mnemonics
>>>  (define_code_attr mshift [(ashift "shiftl") (lshiftrt "shiftr")])
>>>
>>> -(define_expand "kmovw"
>>> -  [(set (match_operand:HI 0 "nonimmediate_operand")
>>> -   (match_operand:HI 1 "nonimmediate_operand"))]
>>> +(define_expand "kmov"
>>> +  [(set (match_operand:SWI1248_AVX512BWDQ 0 "nonimmediate_operand")
>>> +   (match_operand:SWI1248_AVX512BWDQ 1 "nonimmediate_operand"))]
>>>"TARGET_AVX512F
>>> && !(MEM_P (operands[0]) && MEM_P (operands[1]))")
>>>
>>> --cut here--
>>>
>>> Please also post ChangeLog entry.
>>
>> Thanks,
>>
>> here is with ChangeLogs and renamed internal __builtin_ia32_kmov* to
>> match instruction names.
>> For __builtin_ia32_kmov16 change I will follow up for update in branches.
>>
>> Regtested on x86_64-linux-gnu, Ok for trunk?
>
> OK.

Thanks,

here is one more part for kadd{b,w,d,q}, is it ok?

gcc/
* config/i386/avx512bwintrin.h: Add new k-mask intrinsics.
* config/i386/avx512dqintrin.h: Ditto.
* config/i386/avx512fintrin.h: Ditto.
* config/i386/i386-builtin.def (__builtin_ia32_kaddqi,
__builtin_ia32_kaddhi, __builtin_ia32_kaddsi,
__builtin_ia32_kadddi): New.
* config/i386/sse.md (kadd): New.

gcc/testsuite/
* gcc.target/i386/avx512bw-kaddd-1.c: New test.
* gcc.target/i386/avx512bw-kaddq-1.c: Ditto.
* gcc.target/i386/avx512dq-kaddb-1.c: Ditto.
* gcc.target/i386/avx512f-kaddw-1.c: Ditto.

diff --git a/gcc/config/i386/avx512bwintrin.h b/gcc/config/i386/avx512bwintrin.h
index b35ae2b..e38055c 100644
--- a/gcc/config/i386/avx512bwintrin.h
+++ b/gcc/config/i386/avx512bwintrin.h
@@ -40,6 +40,20 @@ typedef char __v64qi __attribute__ ((__vector_size__ (64)));

 typedef unsigned long long __mmask64;

+extern __inline __mmask32
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kadd_mask32 (__mmask32 __A, __mmask32 __B)
+{
+  return (__mmask32) __builtin_ia32_kaddsi ((__mmask32) __A, (__mmask32) __B);
+}
+
+extern __inline __mmask64
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kadd_mask64 (__mmask64 __A, __mmask64 __B)
+{
+  return (__mmask64) __builtin_ia32_kadddi ((__mmask64) __A, (__mmask64) __B);
+}
+
 extern __inline unsigned int
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _cvtmask32_u32 (__mmask32 __A)
diff --git a/gcc/config/i386/avx512dqintrin.h b/gcc/config/i386/avx512dqintrin.h
index 4db44e4..ccc6a4d 100644
--- a/gcc/config/i386/avx512dqintrin.h
+++ b/gcc/config/i386/avx512dqintrin.h
@@ -34,6 +34,13 @@
 #define __DISABLE_AVX512DQ__
 #endif /* __AVX512DQ__ */

+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kadd_mask8 (__mmask8 __A, __mmask8 __B)
+{
+  return (__mmask8) __builtin_ia32_kaddqi ((__mmask8) __A, (__mmask8) __B);
+}
+
 extern __inline unsigned int
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _cvtmask8_u32 (__mmask8 __A)
diff --git a/gcc/config/i386/avx512fintrin.h b/gcc/config/i386/avx512fintrin.h
index a889c83..820741c 100644
--- 

Re: [PATCH] Add AVX512 k-mask intrinsics

2016-12-15 Thread Uros Bizjak
On Thu, Dec 15, 2016 at 2:31 PM, Andrew Senkevich
 wrote:
> 2016-12-14 22:55 GMT+03:00 Uros Bizjak :
>> On Wed, Dec 14, 2016 at 8:04 PM, Andrew Senkevich
>>  wrote:
>>
>>> here is the second part of k-mask intrinsics, is it Ok?
>>
>>> --- a/gcc/config/i386/sse.md
>>> +++ b/gcc/config/i386/sse.md
>>> @@ -1309,12 +1309,30 @@
>>>  ;; Mask variant shift mnemonics
>>>  (define_code_attr mshift [(ashift "shiftl") (lshiftrt "shiftr")])
>>>
>>> +(define_expand "kmovb"
>>> +  [(set (match_operand:QI 0 "nonimmediate_operand")
>>> + (match_operand:QI 1 "nonimmediate_operand"))]
>>> +  "TARGET_AVX512DQ
>>> +   && !(MEM_P (operands[0]) && MEM_P (operands[1]))")
>>> +
>>>  (define_expand "kmovw"
>>>[(set (match_operand:HI 0 "nonimmediate_operand")
>>>   (match_operand:HI 1 "nonimmediate_operand"))]
>>>"TARGET_AVX512F
>>> && !(MEM_P (operands[0]) && MEM_P (operands[1]))")
>>>
>>> +(define_expand "kmovd"
>>> +  [(set (match_operand:SI 0 "nonimmediate_operand")
>>> + (match_operand:SI 1 "nonimmediate_operand"))]
>>> +  "TARGET_AVX512BW
>>> +   && !(MEM_P (operands[0]) && MEM_P (operands[1]))")
>>> +
>>> +(define_expand "kmovq"
>>> +  [(set (match_operand:DI 0 "nonimmediate_operand")
>>> + (match_operand:DI 1 "nonimmediate_operand"))]
>>> +  "TARGET_AVX512BW
>>> +   && !(MEM_P (operands[0]) && MEM_P (operands[1]))")
>>> +
>>>  (define_insn "k"
>>>[(set (match_operand:SWI1248_AVX512BW 0 "register_operand" "=k")
>>>   (any_logic:SWI1248_AVX512BW
>>
>> All the above patterns can be macroized with the following patch:
>>
>> --cut here--
>> Index: sse.md
>> ===
>> --- sse.md  (revision 243651)
>> +++ sse.md  (working copy)
>> @@ -1309,9 +1309,9 @@
>>  ;; Mask variant shift mnemonics
>>  (define_code_attr mshift [(ashift "shiftl") (lshiftrt "shiftr")])
>>
>> -(define_expand "kmovw"
>> -  [(set (match_operand:HI 0 "nonimmediate_operand")
>> -   (match_operand:HI 1 "nonimmediate_operand"))]
>> +(define_expand "kmov"
>> +  [(set (match_operand:SWI1248_AVX512BWDQ 0 "nonimmediate_operand")
>> +   (match_operand:SWI1248_AVX512BWDQ 1 "nonimmediate_operand"))]
>>"TARGET_AVX512F
>> && !(MEM_P (operands[0]) && MEM_P (operands[1]))")
>>
>> --cut here--
>>
>> Please also post ChangeLog entry.
>
> Thanks,
>
> here is with ChangeLogs and renamed internal __builtin_ia32_kmov* to
> match instruction names.
> For __builtin_ia32_kmov16 change I will follow up for update in branches.
>
> Regtested on x86_64-linux-gnu, Ok for trunk?

OK.

Thanks,
Uros.


Re: [PATCH] Add AVX512 k-mask intrinsics

2016-12-14 Thread Uros Bizjak
On Wed, Dec 14, 2016 at 8:04 PM, Andrew Senkevich
 wrote:

> here is the second part of k-mask intrinsics, is it Ok?

> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -1309,12 +1309,30 @@
>  ;; Mask variant shift mnemonics
>  (define_code_attr mshift [(ashift "shiftl") (lshiftrt "shiftr")])
>
> +(define_expand "kmovb"
> +  [(set (match_operand:QI 0 "nonimmediate_operand")
> + (match_operand:QI 1 "nonimmediate_operand"))]
> +  "TARGET_AVX512DQ
> +   && !(MEM_P (operands[0]) && MEM_P (operands[1]))")
> +
>  (define_expand "kmovw"
>[(set (match_operand:HI 0 "nonimmediate_operand")
>   (match_operand:HI 1 "nonimmediate_operand"))]
>"TARGET_AVX512F
> && !(MEM_P (operands[0]) && MEM_P (operands[1]))")
>
> +(define_expand "kmovd"
> +  [(set (match_operand:SI 0 "nonimmediate_operand")
> + (match_operand:SI 1 "nonimmediate_operand"))]
> +  "TARGET_AVX512BW
> +   && !(MEM_P (operands[0]) && MEM_P (operands[1]))")
> +
> +(define_expand "kmovq"
> +  [(set (match_operand:DI 0 "nonimmediate_operand")
> + (match_operand:DI 1 "nonimmediate_operand"))]
> +  "TARGET_AVX512BW
> +   && !(MEM_P (operands[0]) && MEM_P (operands[1]))")
> +
>  (define_insn "k"
>[(set (match_operand:SWI1248_AVX512BW 0 "register_operand" "=k")
>   (any_logic:SWI1248_AVX512BW

All the above patterns can be macroized with the following patch:

--cut here--
Index: sse.md
===
--- sse.md  (revision 243651)
+++ sse.md  (working copy)
@@ -1309,9 +1309,9 @@
 ;; Mask variant shift mnemonics
 (define_code_attr mshift [(ashift "shiftl") (lshiftrt "shiftr")])

-(define_expand "kmovw"
-  [(set (match_operand:HI 0 "nonimmediate_operand")
-   (match_operand:HI 1 "nonimmediate_operand"))]
+(define_expand "kmov"
+  [(set (match_operand:SWI1248_AVX512BWDQ 0 "nonimmediate_operand")
+   (match_operand:SWI1248_AVX512BWDQ 1 "nonimmediate_operand"))]
   "TARGET_AVX512F
&& !(MEM_P (operands[0]) && MEM_P (operands[1]))")

--cut here--

Please also post ChangeLog entry.

Uros.


Re: [PATCH] Add AVX512 k-mask intrinsics

2016-12-14 Thread Andrew Senkevich
2016-12-02 21:31 GMT+03:00 Uros Bizjak :
. . . . .
>>
>> I split this patch after last updates in md files, here is the first
>> part which doesn't change md files.
>> Regtested on x86_64-linux-gnu.  Is this part ok?
>
> There is no point to scan for kmovX insn in e.g.:
>
> +/* { dg-final { scan-assembler-times "kmovq" 2 } } */
> +
> +#include 
> +
> +void
> +avx512bw_test ()
> +{
> +  __mmask64 k1, k2, k3;
> +  volatile __m512i x = _mm512_setzero_si512 ();
> +
> +  __asm__( "kmovq %1, %0" : "=k" (k1) : "r" (1) );
> +  __asm__( "kmovq %1, %0" : "=k" (k2) : "r" (2) );
>
> since you emit it from inline asm.
>
> Please remove these pointles kmovX scan-asm-times directives from the
> testcases, and please also remove it  from avx512f-kandnw-1.c
> testcase.
>
> The patch is OK with this change.

Hi

here is the second part of k-mask intrinsics, is it Ok?

diff --git a/gcc/config/i386/avx512bwintrin.h b/gcc/config/i386/avx512bwintrin.h
index 9e6e0ce..7f40808 100644
--- a/gcc/config/i386/avx512bwintrin.h
+++ b/gcc/config/i386/avx512bwintrin.h
@@ -40,6 +40,62 @@ typedef char __v64qi __attribute__ ((__vector_size__ (64)));

 typedef unsigned long long __mmask64;

+extern __inline unsigned int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_cvtmask32_u32 (__mmask32 __A)
+{
+  return (unsigned int) __builtin_ia32_kmov32 ((__mmask32) __A);
+}
+
+extern __inline unsigned long long
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_cvtmask64_u64 (__mmask64 __A)
+{
+  return (unsigned long long) __builtin_ia32_kmov64 ((__mmask64) __A);
+}
+
+extern __inline __mmask32
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_cvtu32_mask32 (unsigned int __A)
+{
+  return (__mmask32) __builtin_ia32_kmov32 ((__mmask32) __A);
+}
+
+extern __inline __mmask64
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_cvtu64_mask64 (unsigned long long __A)
+{
+  return (__mmask64) __builtin_ia32_kmov64 ((__mmask64) __A);
+}
+
+extern __inline __mmask32
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_load_mask32 (__mmask32 *__A)
+{
+  return (__mmask32) __builtin_ia32_kmov32 (*__A);
+}
+
+extern __inline __mmask64
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_load_mask64 (__mmask64 *__A)
+{
+  return (__mmask64) __builtin_ia32_kmov64 (*(__mmask64 *) __A);
+}
+
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_store_mask32 (__mmask32 *__A, __mmask32 __B)
+{
+  *(__mmask32 *) __A = __builtin_ia32_kmov32 (__B);
+}
+
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_store_mask64 (__mmask64 *__A, __mmask64 __B)
+{
+  *(__mmask64 *) __A = __builtin_ia32_kmov64 (__B);
+}
+
 extern __inline __mmask32
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _knot_mask32 (__mmask32 __A)
diff --git a/gcc/config/i386/avx512dqintrin.h b/gcc/config/i386/avx512dqintrin.h
index d2405c3..d15d35d 100644
--- a/gcc/config/i386/avx512dqintrin.h
+++ b/gcc/config/i386/avx512dqintrin.h
@@ -34,6 +34,34 @@
 #define __DISABLE_AVX512DQ__
 #endif /* __AVX512DQ__ */

+extern __inline unsigned int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_cvtmask8_u32 (__mmask8 __A)
+{
+  return (unsigned int) __builtin_ia32_kmov8 ((__mmask8 ) __A);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_cvtu32_mask8 (unsigned int __A)
+{
+  return (__mmask8) __builtin_ia32_kmov8 ((__mmask8) __A);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_load_mask8 (__mmask8 *__A)
+{
+  return (__mmask8) __builtin_ia32_kmov8 (*(__mmask8 *) __A);
+}
+
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_store_mask8 (__mmask8 *__A, __mmask8 __B)
+{
+  *(__mmask8 *) __A = __builtin_ia32_kmov8 (__B);
+}
+
 extern __inline __mmask8
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _knot_mask8 (__mmask8 __A)
diff --git a/gcc/config/i386/avx512fintrin.h b/gcc/config/i386/avx512fintrin.h
index ab1704b..45e1949 100644
--- a/gcc/config/i386/avx512fintrin.h
+++ b/gcc/config/i386/avx512fintrin.h
@@ -9984,6 +9984,34 @@ _mm512_maskz_expandloadu_epi32 (__mmask16 __U,
void const *__P)
 #define _kxnor_mask16 _mm512_kxnor
 #define _kxor_mask16 _mm512_kxor

+extern __inline unsigned int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_cvtmask16_u32 (__mmask16 __A)
+{
+  return (unsigned int) __builtin_ia32_kmov16 ((__mmask16 ) __A);
+}
+
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_cvtu32_mask16 (unsigned int __A)
+{
+  return (__mmask16) __builtin_ia32_kmov16 ((__mmask16 ) __A);
+}
+
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_load_mask16 (__mmask16 *__A)
+{
+  

Re: [PATCH] Add AVX512 k-mask intrinsics

2016-12-05 Thread H.J. Lu
On Mon, Dec 5, 2016 at 6:59 AM, Andrew Senkevich
 wrote:
> 2016-12-02 21:31 GMT+03:00 Uros Bizjak :
>> On Fri, Dec 2, 2016 at 6:44 PM, Andrew Senkevich
>>  wrote:
>>> 2016-11-11 22:14 GMT+03:00 Uros Bizjak :
 On Fri, Nov 11, 2016 at 7:23 PM, Andrew Senkevich
  wrote:
> 2016-11-11 20:56 GMT+03:00 Uros Bizjak :
>> On Fri, Nov 11, 2016 at 6:50 PM, Uros Bizjak  wrote:
>>> On Fri, Nov 11, 2016 at 6:38 PM, Andrew Senkevich
>>>  wrote:
 2016-11-11 17:34 GMT+03:00 Uros Bizjak :
> Some quick remarks:
>
> +(define_insn "kmovb"
> +  [(set (match_operand:QI 0 "nonimmediate_operand" "=k,k")
> + (unspec:QI
> +  [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
> +  UNSPEC_KMOV))]
> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512DQ"
> +  "@
> +   kmovb\t{%k1, %0|%0, %k1}
> +   kmovb\t{%1, %0|%0, %1}";
> +  [(set_attr "mode" "QI")
> +   (set_attr "type" "mskmov")
> +   (set_attr "prefix" "vex")])
> +
> +(define_insn "kmovd"
> +  [(set (match_operand:SI 0 "nonimmediate_operand" "=k,k")
> + (unspec:SI
> +  [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
> +  UNSPEC_KMOV))]
> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
> +  "@
> +   kmovd\t{%k1, %0|%0, %k1}
> +   kmovd\t{%1, %0|%0, %1}";
> +  [(set_attr "mode" "SI")
> +   (set_attr "type" "mskmov")
> +   (set_attr "prefix" "vex")])
> +
> +(define_insn "kmovq"
> +  [(set (match_operand:DI 0 "nonimmediate_operand" "=k,k,km")
> + (unspec:DI
> +  [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
> +  UNSPEC_KMOV))]
> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
> +  "@
> +   kmovq\t{%k1, %0|%0, %k1}
> +   kmovq\t{%1, %0|%0, %1}
> +   kmovq\t{%1, %0|%0, %1}";
> +  [(set_attr "mode" "DI")
> +   (set_attr "type" "mskmov")
> +   (set_attr "prefix" "vex")])
>
> - kmovd (and existing kmovw) should be using register_operand for
> opreand 0. In this case, there is no need for MEM_P checks at all.
> - In the insn constraint, pease check TARGET_AVX before checking 
> MEM_P.
> - please put these definitions above corresponding *mov??_internal 
> patterns.

 Do you mean put below *mov??_internal patterns? Attached corrected 
 such way.
>>>
>>> No, please put kmovq near *movdi_internal, kmovd near *movsi_internal,
>>> etc. It doesn't matter if they are above or below their respective
>>> *mov??_internal patterns, as long as they are positioned in some
>>> consistent way. IOW, new patterns shouldn't be grouped together, as is
>>> the case with your patch.
>>
>> +(define_insn "kmovb"
>> +  [(set (match_operand:QI 0 "register_operand" "=k,k")
>> +(unspec:QI
>> +  [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
>> +  UNSPEC_KMOV))]
>> +  "TARGET_AVX512DQ && !MEM_P (operands[1])"
>>
>> There is no need for !MEM_P, this will prevent memory operand, which
>> is allowed by constraint "m".
>>
>> +(define_insn "kmovq"
>> +  [(set (match_operand:DI 0 "register_operand" "=k,k,km")
>> +(unspec:DI
>> +  [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
>> +  UNSPEC_KMOV))]
>> +  "TARGET_AVX512BW && !MEM_P (operands[1])"
>>
>> Operand 0 should have "nonimmediate_operand" predicate. And here you
>> need  && !(MEM_P (op0) && MEM_P (op1)) in insn constraint to prevent
>> mem->mem moves.
>
> Changed according your comments and attached.

 Still not good.

 +(define_insn "kmovd"
 +  [(set (match_operand:SI 0 "register_operand" "=k,k")
 +(unspec:SI
 +  [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
 +  UNSPEC_KMOV))]
 +  "TARGET_AVX512BW && !MEM_P (operands[1])"

 Remove !MEM_P in the above pattern.

  (define_insn "kmovw"
 -  [(set (match_operand:HI 0 "nonimmediate_operand" "=k,k")
 +  [(set (match_operand:HI 0 "register_operand" "=k,k")
  (unspec:HI
[(match_operand:HI 1 "nonimmediate_operand" "r,km")]
UNSPEC_KMOV))]
 -  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512F"
 +  "TARGET_AVX512F && !MEM_P (operands[1])"

 Also remove !MEM_P here.

 +(define_insn "kadd"
 +  [(set (match_operand:SWI1248x 0 "register_operand" "=r,,!k")
 +(plus:SWI1248x
 +  (not:SWI1248x
 +  

Re: [PATCH] Add AVX512 k-mask intrinsics

2016-12-05 Thread Andrew Senkevich
2016-12-02 21:31 GMT+03:00 Uros Bizjak :
> On Fri, Dec 2, 2016 at 6:44 PM, Andrew Senkevich
>  wrote:
>> 2016-11-11 22:14 GMT+03:00 Uros Bizjak :
>>> On Fri, Nov 11, 2016 at 7:23 PM, Andrew Senkevich
>>>  wrote:
 2016-11-11 20:56 GMT+03:00 Uros Bizjak :
> On Fri, Nov 11, 2016 at 6:50 PM, Uros Bizjak  wrote:
>> On Fri, Nov 11, 2016 at 6:38 PM, Andrew Senkevich
>>  wrote:
>>> 2016-11-11 17:34 GMT+03:00 Uros Bizjak :
 Some quick remarks:

 +(define_insn "kmovb"
 +  [(set (match_operand:QI 0 "nonimmediate_operand" "=k,k")
 + (unspec:QI
 +  [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
 +  UNSPEC_KMOV))]
 +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512DQ"
 +  "@
 +   kmovb\t{%k1, %0|%0, %k1}
 +   kmovb\t{%1, %0|%0, %1}";
 +  [(set_attr "mode" "QI")
 +   (set_attr "type" "mskmov")
 +   (set_attr "prefix" "vex")])
 +
 +(define_insn "kmovd"
 +  [(set (match_operand:SI 0 "nonimmediate_operand" "=k,k")
 + (unspec:SI
 +  [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
 +  UNSPEC_KMOV))]
 +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
 +  "@
 +   kmovd\t{%k1, %0|%0, %k1}
 +   kmovd\t{%1, %0|%0, %1}";
 +  [(set_attr "mode" "SI")
 +   (set_attr "type" "mskmov")
 +   (set_attr "prefix" "vex")])
 +
 +(define_insn "kmovq"
 +  [(set (match_operand:DI 0 "nonimmediate_operand" "=k,k,km")
 + (unspec:DI
 +  [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
 +  UNSPEC_KMOV))]
 +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
 +  "@
 +   kmovq\t{%k1, %0|%0, %k1}
 +   kmovq\t{%1, %0|%0, %1}
 +   kmovq\t{%1, %0|%0, %1}";
 +  [(set_attr "mode" "DI")
 +   (set_attr "type" "mskmov")
 +   (set_attr "prefix" "vex")])

 - kmovd (and existing kmovw) should be using register_operand for
 opreand 0. In this case, there is no need for MEM_P checks at all.
 - In the insn constraint, pease check TARGET_AVX before checking MEM_P.
 - please put these definitions above corresponding *mov??_internal 
 patterns.
>>>
>>> Do you mean put below *mov??_internal patterns? Attached corrected such 
>>> way.
>>
>> No, please put kmovq near *movdi_internal, kmovd near *movsi_internal,
>> etc. It doesn't matter if they are above or below their respective
>> *mov??_internal patterns, as long as they are positioned in some
>> consistent way. IOW, new patterns shouldn't be grouped together, as is
>> the case with your patch.
>
> +(define_insn "kmovb"
> +  [(set (match_operand:QI 0 "register_operand" "=k,k")
> +(unspec:QI
> +  [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
> +  UNSPEC_KMOV))]
> +  "TARGET_AVX512DQ && !MEM_P (operands[1])"
>
> There is no need for !MEM_P, this will prevent memory operand, which
> is allowed by constraint "m".
>
> +(define_insn "kmovq"
> +  [(set (match_operand:DI 0 "register_operand" "=k,k,km")
> +(unspec:DI
> +  [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
> +  UNSPEC_KMOV))]
> +  "TARGET_AVX512BW && !MEM_P (operands[1])"
>
> Operand 0 should have "nonimmediate_operand" predicate. And here you
> need  && !(MEM_P (op0) && MEM_P (op1)) in insn constraint to prevent
> mem->mem moves.

 Changed according your comments and attached.
>>>
>>> Still not good.
>>>
>>> +(define_insn "kmovd"
>>> +  [(set (match_operand:SI 0 "register_operand" "=k,k")
>>> +(unspec:SI
>>> +  [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
>>> +  UNSPEC_KMOV))]
>>> +  "TARGET_AVX512BW && !MEM_P (operands[1])"
>>>
>>> Remove !MEM_P in the above pattern.
>>>
>>>  (define_insn "kmovw"
>>> -  [(set (match_operand:HI 0 "nonimmediate_operand" "=k,k")
>>> +  [(set (match_operand:HI 0 "register_operand" "=k,k")
>>>  (unspec:HI
>>>[(match_operand:HI 1 "nonimmediate_operand" "r,km")]
>>>UNSPEC_KMOV))]
>>> -  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512F"
>>> +  "TARGET_AVX512F && !MEM_P (operands[1])"
>>>
>>> Also remove !MEM_P here.
>>>
>>> +(define_insn "kadd"
>>> +  [(set (match_operand:SWI1248x 0 "register_operand" "=r,,!k")
>>> +(plus:SWI1248x
>>> +  (not:SWI1248x
>>> +(match_operand:SWI1248x 1 "register_operand" "r,0,k"))
>>> +  (match_operand:SWI1248x 2 "register_operand" "r,r,k")))
>>> +   (clobber (reg:CC FLAGS_REG))]
>>> +  "TARGET_AVX512F"
>>> +{
>>> +  switch 

Re: [PATCH] Add AVX512 k-mask intrinsics

2016-12-02 Thread Uros Bizjak
On Fri, Dec 2, 2016 at 6:44 PM, Andrew Senkevich
 wrote:
> 2016-11-11 22:14 GMT+03:00 Uros Bizjak :
>> On Fri, Nov 11, 2016 at 7:23 PM, Andrew Senkevich
>>  wrote:
>>> 2016-11-11 20:56 GMT+03:00 Uros Bizjak :
 On Fri, Nov 11, 2016 at 6:50 PM, Uros Bizjak  wrote:
> On Fri, Nov 11, 2016 at 6:38 PM, Andrew Senkevich
>  wrote:
>> 2016-11-11 17:34 GMT+03:00 Uros Bizjak :
>>> Some quick remarks:
>>>
>>> +(define_insn "kmovb"
>>> +  [(set (match_operand:QI 0 "nonimmediate_operand" "=k,k")
>>> + (unspec:QI
>>> +  [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
>>> +  UNSPEC_KMOV))]
>>> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512DQ"
>>> +  "@
>>> +   kmovb\t{%k1, %0|%0, %k1}
>>> +   kmovb\t{%1, %0|%0, %1}";
>>> +  [(set_attr "mode" "QI")
>>> +   (set_attr "type" "mskmov")
>>> +   (set_attr "prefix" "vex")])
>>> +
>>> +(define_insn "kmovd"
>>> +  [(set (match_operand:SI 0 "nonimmediate_operand" "=k,k")
>>> + (unspec:SI
>>> +  [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
>>> +  UNSPEC_KMOV))]
>>> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
>>> +  "@
>>> +   kmovd\t{%k1, %0|%0, %k1}
>>> +   kmovd\t{%1, %0|%0, %1}";
>>> +  [(set_attr "mode" "SI")
>>> +   (set_attr "type" "mskmov")
>>> +   (set_attr "prefix" "vex")])
>>> +
>>> +(define_insn "kmovq"
>>> +  [(set (match_operand:DI 0 "nonimmediate_operand" "=k,k,km")
>>> + (unspec:DI
>>> +  [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
>>> +  UNSPEC_KMOV))]
>>> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
>>> +  "@
>>> +   kmovq\t{%k1, %0|%0, %k1}
>>> +   kmovq\t{%1, %0|%0, %1}
>>> +   kmovq\t{%1, %0|%0, %1}";
>>> +  [(set_attr "mode" "DI")
>>> +   (set_attr "type" "mskmov")
>>> +   (set_attr "prefix" "vex")])
>>>
>>> - kmovd (and existing kmovw) should be using register_operand for
>>> opreand 0. In this case, there is no need for MEM_P checks at all.
>>> - In the insn constraint, pease check TARGET_AVX before checking MEM_P.
>>> - please put these definitions above corresponding *mov??_internal 
>>> patterns.
>>
>> Do you mean put below *mov??_internal patterns? Attached corrected such 
>> way.
>
> No, please put kmovq near *movdi_internal, kmovd near *movsi_internal,
> etc. It doesn't matter if they are above or below their respective
> *mov??_internal patterns, as long as they are positioned in some
> consistent way. IOW, new patterns shouldn't be grouped together, as is
> the case with your patch.

 +(define_insn "kmovb"
 +  [(set (match_operand:QI 0 "register_operand" "=k,k")
 +(unspec:QI
 +  [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
 +  UNSPEC_KMOV))]
 +  "TARGET_AVX512DQ && !MEM_P (operands[1])"

 There is no need for !MEM_P, this will prevent memory operand, which
 is allowed by constraint "m".

 +(define_insn "kmovq"
 +  [(set (match_operand:DI 0 "register_operand" "=k,k,km")
 +(unspec:DI
 +  [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
 +  UNSPEC_KMOV))]
 +  "TARGET_AVX512BW && !MEM_P (operands[1])"

 Operand 0 should have "nonimmediate_operand" predicate. And here you
 need  && !(MEM_P (op0) && MEM_P (op1)) in insn constraint to prevent
 mem->mem moves.
>>>
>>> Changed according your comments and attached.
>>
>> Still not good.
>>
>> +(define_insn "kmovd"
>> +  [(set (match_operand:SI 0 "register_operand" "=k,k")
>> +(unspec:SI
>> +  [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
>> +  UNSPEC_KMOV))]
>> +  "TARGET_AVX512BW && !MEM_P (operands[1])"
>>
>> Remove !MEM_P in the above pattern.
>>
>>  (define_insn "kmovw"
>> -  [(set (match_operand:HI 0 "nonimmediate_operand" "=k,k")
>> +  [(set (match_operand:HI 0 "register_operand" "=k,k")
>>  (unspec:HI
>>[(match_operand:HI 1 "nonimmediate_operand" "r,km")]
>>UNSPEC_KMOV))]
>> -  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512F"
>> +  "TARGET_AVX512F && !MEM_P (operands[1])"
>>
>> Also remove !MEM_P here.
>>
>> +(define_insn "kadd"
>> +  [(set (match_operand:SWI1248x 0 "register_operand" "=r,,!k")
>> +(plus:SWI1248x
>> +  (not:SWI1248x
>> +(match_operand:SWI1248x 1 "register_operand" "r,0,k"))
>> +  (match_operand:SWI1248x 2 "register_operand" "r,r,k")))
>> +   (clobber (reg:CC FLAGS_REG))]
>> +  "TARGET_AVX512F"
>> +{
>> +  switch (which_alternative)
>> +{
>> +case 0:
>> +  return "add\t{%k2, %k1, %k0|%k0, %k1, %k2}";
>> +case 1:
>> +  return "#";
>> +case 2:
>> +  if (TARGET_AVX512BW && mode 

Re: [PATCH] Add AVX512 k-mask intrinsics

2016-12-02 Thread Andrew Senkevich
2016-11-11 22:14 GMT+03:00 Uros Bizjak :
> On Fri, Nov 11, 2016 at 7:23 PM, Andrew Senkevich
>  wrote:
>> 2016-11-11 20:56 GMT+03:00 Uros Bizjak :
>>> On Fri, Nov 11, 2016 at 6:50 PM, Uros Bizjak  wrote:
 On Fri, Nov 11, 2016 at 6:38 PM, Andrew Senkevich
  wrote:
> 2016-11-11 17:34 GMT+03:00 Uros Bizjak :
>> Some quick remarks:
>>
>> +(define_insn "kmovb"
>> +  [(set (match_operand:QI 0 "nonimmediate_operand" "=k,k")
>> + (unspec:QI
>> +  [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
>> +  UNSPEC_KMOV))]
>> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512DQ"
>> +  "@
>> +   kmovb\t{%k1, %0|%0, %k1}
>> +   kmovb\t{%1, %0|%0, %1}";
>> +  [(set_attr "mode" "QI")
>> +   (set_attr "type" "mskmov")
>> +   (set_attr "prefix" "vex")])
>> +
>> +(define_insn "kmovd"
>> +  [(set (match_operand:SI 0 "nonimmediate_operand" "=k,k")
>> + (unspec:SI
>> +  [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
>> +  UNSPEC_KMOV))]
>> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
>> +  "@
>> +   kmovd\t{%k1, %0|%0, %k1}
>> +   kmovd\t{%1, %0|%0, %1}";
>> +  [(set_attr "mode" "SI")
>> +   (set_attr "type" "mskmov")
>> +   (set_attr "prefix" "vex")])
>> +
>> +(define_insn "kmovq"
>> +  [(set (match_operand:DI 0 "nonimmediate_operand" "=k,k,km")
>> + (unspec:DI
>> +  [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
>> +  UNSPEC_KMOV))]
>> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
>> +  "@
>> +   kmovq\t{%k1, %0|%0, %k1}
>> +   kmovq\t{%1, %0|%0, %1}
>> +   kmovq\t{%1, %0|%0, %1}";
>> +  [(set_attr "mode" "DI")
>> +   (set_attr "type" "mskmov")
>> +   (set_attr "prefix" "vex")])
>>
>> - kmovd (and existing kmovw) should be using register_operand for
>> opreand 0. In this case, there is no need for MEM_P checks at all.
>> - In the insn constraint, pease check TARGET_AVX before checking MEM_P.
>> - please put these definitions above corresponding *mov??_internal 
>> patterns.
>
> Do you mean put below *mov??_internal patterns? Attached corrected such 
> way.

 No, please put kmovq near *movdi_internal, kmovd near *movsi_internal,
 etc. It doesn't matter if they are above or below their respective
 *mov??_internal patterns, as long as they are positioned in some
 consistent way. IOW, new patterns shouldn't be grouped together, as is
 the case with your patch.
>>>
>>> +(define_insn "kmovb"
>>> +  [(set (match_operand:QI 0 "register_operand" "=k,k")
>>> +(unspec:QI
>>> +  [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
>>> +  UNSPEC_KMOV))]
>>> +  "TARGET_AVX512DQ && !MEM_P (operands[1])"
>>>
>>> There is no need for !MEM_P, this will prevent memory operand, which
>>> is allowed by constraint "m".
>>>
>>> +(define_insn "kmovq"
>>> +  [(set (match_operand:DI 0 "register_operand" "=k,k,km")
>>> +(unspec:DI
>>> +  [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
>>> +  UNSPEC_KMOV))]
>>> +  "TARGET_AVX512BW && !MEM_P (operands[1])"
>>>
>>> Operand 0 should have "nonimmediate_operand" predicate. And here you
>>> need  && !(MEM_P (op0) && MEM_P (op1)) in insn constraint to prevent
>>> mem->mem moves.
>>
>> Changed according your comments and attached.
>
> Still not good.
>
> +(define_insn "kmovd"
> +  [(set (match_operand:SI 0 "register_operand" "=k,k")
> +(unspec:SI
> +  [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
> +  UNSPEC_KMOV))]
> +  "TARGET_AVX512BW && !MEM_P (operands[1])"
>
> Remove !MEM_P in the above pattern.
>
>  (define_insn "kmovw"
> -  [(set (match_operand:HI 0 "nonimmediate_operand" "=k,k")
> +  [(set (match_operand:HI 0 "register_operand" "=k,k")
>  (unspec:HI
>[(match_operand:HI 1 "nonimmediate_operand" "r,km")]
>UNSPEC_KMOV))]
> -  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512F"
> +  "TARGET_AVX512F && !MEM_P (operands[1])"
>
> Also remove !MEM_P here.
>
> +(define_insn "kadd"
> +  [(set (match_operand:SWI1248x 0 "register_operand" "=r,,!k")
> +(plus:SWI1248x
> +  (not:SWI1248x
> +(match_operand:SWI1248x 1 "register_operand" "r,0,k"))
> +  (match_operand:SWI1248x 2 "register_operand" "r,r,k")))
> +   (clobber (reg:CC FLAGS_REG))]
> +  "TARGET_AVX512F"
> +{
> +  switch (which_alternative)
> +{
> +case 0:
> +  return "add\t{%k2, %k1, %k0|%k0, %k1, %k2}";
> +case 1:
> +  return "#";
> +case 2:
> +  if (TARGET_AVX512BW && mode == DImode)
> +return "kaddq\t{%2, %1, %0|%0, %1, %2}";
> +  else if (TARGET_AVX512BW && mode == SImode)
> +return "kaddd\t{%2, %1, %0|%0, %1, %2}";
> +  else if (TARGET_AVX512DQ && mode == QImode)
> 

Re: [PATCH] Add AVX512 k-mask intrinsics

2016-11-11 Thread Uros Bizjak
On Fri, Nov 11, 2016 at 7:23 PM, Andrew Senkevich
 wrote:
> 2016-11-11 20:56 GMT+03:00 Uros Bizjak :
>> On Fri, Nov 11, 2016 at 6:50 PM, Uros Bizjak  wrote:
>>> On Fri, Nov 11, 2016 at 6:38 PM, Andrew Senkevich
>>>  wrote:
 2016-11-11 17:34 GMT+03:00 Uros Bizjak :
> Some quick remarks:
>
> +(define_insn "kmovb"
> +  [(set (match_operand:QI 0 "nonimmediate_operand" "=k,k")
> + (unspec:QI
> +  [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
> +  UNSPEC_KMOV))]
> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512DQ"
> +  "@
> +   kmovb\t{%k1, %0|%0, %k1}
> +   kmovb\t{%1, %0|%0, %1}";
> +  [(set_attr "mode" "QI")
> +   (set_attr "type" "mskmov")
> +   (set_attr "prefix" "vex")])
> +
> +(define_insn "kmovd"
> +  [(set (match_operand:SI 0 "nonimmediate_operand" "=k,k")
> + (unspec:SI
> +  [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
> +  UNSPEC_KMOV))]
> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
> +  "@
> +   kmovd\t{%k1, %0|%0, %k1}
> +   kmovd\t{%1, %0|%0, %1}";
> +  [(set_attr "mode" "SI")
> +   (set_attr "type" "mskmov")
> +   (set_attr "prefix" "vex")])
> +
> +(define_insn "kmovq"
> +  [(set (match_operand:DI 0 "nonimmediate_operand" "=k,k,km")
> + (unspec:DI
> +  [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
> +  UNSPEC_KMOV))]
> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
> +  "@
> +   kmovq\t{%k1, %0|%0, %k1}
> +   kmovq\t{%1, %0|%0, %1}
> +   kmovq\t{%1, %0|%0, %1}";
> +  [(set_attr "mode" "DI")
> +   (set_attr "type" "mskmov")
> +   (set_attr "prefix" "vex")])
>
> - kmovd (and existing kmovw) should be using register_operand for
> opreand 0. In this case, there is no need for MEM_P checks at all.
> - In the insn constraint, pease check TARGET_AVX before checking MEM_P.
> - please put these definitions above corresponding *mov??_internal 
> patterns.

 Do you mean put below *mov??_internal patterns? Attached corrected such 
 way.
>>>
>>> No, please put kmovq near *movdi_internal, kmovd near *movsi_internal,
>>> etc. It doesn't matter if they are above or below their respective
>>> *mov??_internal patterns, as long as they are positioned in some
>>> consistent way. IOW, new patterns shouldn't be grouped together, as is
>>> the case with your patch.
>>
>> +(define_insn "kmovb"
>> +  [(set (match_operand:QI 0 "register_operand" "=k,k")
>> +(unspec:QI
>> +  [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
>> +  UNSPEC_KMOV))]
>> +  "TARGET_AVX512DQ && !MEM_P (operands[1])"
>>
>> There is no need for !MEM_P, this will prevent memory operand, which
>> is allowed by constraint "m".
>>
>> +(define_insn "kmovq"
>> +  [(set (match_operand:DI 0 "register_operand" "=k,k,km")
>> +(unspec:DI
>> +  [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
>> +  UNSPEC_KMOV))]
>> +  "TARGET_AVX512BW && !MEM_P (operands[1])"
>>
>> Operand 0 should have "nonimmediate_operand" predicate. And here you
>> need  && !(MEM_P (op0) && MEM_P (op1)) in insn constraint to prevent
>> mem->mem moves.
>
> Changed according your comments and attached.

Still not good.

+(define_insn "kmovd"
+  [(set (match_operand:SI 0 "register_operand" "=k,k")
+(unspec:SI
+  [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
+  UNSPEC_KMOV))]
+  "TARGET_AVX512BW && !MEM_P (operands[1])"

Remove !MEM_P in the above pattern.

 (define_insn "kmovw"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=k,k")
+  [(set (match_operand:HI 0 "register_operand" "=k,k")
 (unspec:HI
   [(match_operand:HI 1 "nonimmediate_operand" "r,km")]
   UNSPEC_KMOV))]
-  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512F"
+  "TARGET_AVX512F && !MEM_P (operands[1])"

Also remove !MEM_P here.

+(define_insn "kadd"
+  [(set (match_operand:SWI1248x 0 "register_operand" "=r,,!k")
+(plus:SWI1248x
+  (not:SWI1248x
+(match_operand:SWI1248x 1 "register_operand" "r,0,k"))
+  (match_operand:SWI1248x 2 "register_operand" "r,r,k")))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_AVX512F"
+{
+  switch (which_alternative)
+{
+case 0:
+  return "add\t{%k2, %k1, %k0|%k0, %k1, %k2}";
+case 1:
+  return "#";
+case 2:
+  if (TARGET_AVX512BW && mode == DImode)
+return "kaddq\t{%2, %1, %0|%0, %1, %2}";
+  else if (TARGET_AVX512BW && mode == SImode)
+return "kaddd\t{%2, %1, %0|%0, %1, %2}";
+  else if (TARGET_AVX512DQ && mode == QImode)
+return "kaddb\t{%2, %1, %0|%0, %1, %2}";
+  else
+return "kaddw\t{%2, %1, %0|%0, %1, %2}";
+

The above pattern is wrong. Is there really a NOT RTX present,
implying effectively a kaddn?

If this is plain add, then you 

Re: [PATCH] Add AVX512 k-mask intrinsics

2016-11-11 Thread Andrew Senkevich
2016-11-11 18:26 GMT+03:00 Marc Glisse :
> On Fri, 11 Nov 2016, Andrew Senkevich wrote:
>
>> +extern __inline __mmask32
>> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>> +_kand_mask32 (__mmask32 __A, __mmask32 __B)
>> +{
>> +  return (__mmask32) __builtin_ia32_kandsi ((__mmask32) __A, (__mmask32)
>> __B);
>> +}
>
>
> (picking one random example)
> Is a builtin really needed here? What would happen if you used
>
>   return __A & __B;
>
> ?

Good question. Looks like it also works (for this particular case).


--
WBR,
Andrew


Re: [PATCH] Add AVX512 k-mask intrinsics

2016-11-11 Thread Andrew Senkevich
2016-11-11 20:56 GMT+03:00 Uros Bizjak :
> On Fri, Nov 11, 2016 at 6:50 PM, Uros Bizjak  wrote:
>> On Fri, Nov 11, 2016 at 6:38 PM, Andrew Senkevich
>>  wrote:
>>> 2016-11-11 17:34 GMT+03:00 Uros Bizjak :
 Some quick remarks:

 +(define_insn "kmovb"
 +  [(set (match_operand:QI 0 "nonimmediate_operand" "=k,k")
 + (unspec:QI
 +  [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
 +  UNSPEC_KMOV))]
 +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512DQ"
 +  "@
 +   kmovb\t{%k1, %0|%0, %k1}
 +   kmovb\t{%1, %0|%0, %1}";
 +  [(set_attr "mode" "QI")
 +   (set_attr "type" "mskmov")
 +   (set_attr "prefix" "vex")])
 +
 +(define_insn "kmovd"
 +  [(set (match_operand:SI 0 "nonimmediate_operand" "=k,k")
 + (unspec:SI
 +  [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
 +  UNSPEC_KMOV))]
 +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
 +  "@
 +   kmovd\t{%k1, %0|%0, %k1}
 +   kmovd\t{%1, %0|%0, %1}";
 +  [(set_attr "mode" "SI")
 +   (set_attr "type" "mskmov")
 +   (set_attr "prefix" "vex")])
 +
 +(define_insn "kmovq"
 +  [(set (match_operand:DI 0 "nonimmediate_operand" "=k,k,km")
 + (unspec:DI
 +  [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
 +  UNSPEC_KMOV))]
 +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
 +  "@
 +   kmovq\t{%k1, %0|%0, %k1}
 +   kmovq\t{%1, %0|%0, %1}
 +   kmovq\t{%1, %0|%0, %1}";
 +  [(set_attr "mode" "DI")
 +   (set_attr "type" "mskmov")
 +   (set_attr "prefix" "vex")])

 - kmovd (and existing kmovw) should be using register_operand for
 opreand 0. In this case, there is no need for MEM_P checks at all.
 - In the insn constraint, pease check TARGET_AVX before checking MEM_P.
 - please put these definitions above corresponding *mov??_internal 
 patterns.
>>>
>>> Do you mean put below *mov??_internal patterns? Attached corrected such way.
>>
>> No, please put kmovq near *movdi_internal, kmovd near *movsi_internal,
>> etc. It doesn't matter if they are above or below their respective
>> *mov??_internal patterns, as long as they are positioned in some
>> consistent way. IOW, new patterns shouldn't be grouped together, as is
>> the case with your patch.
>
> +(define_insn "kmovb"
> +  [(set (match_operand:QI 0 "register_operand" "=k,k")
> +(unspec:QI
> +  [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
> +  UNSPEC_KMOV))]
> +  "TARGET_AVX512DQ && !MEM_P (operands[1])"
>
> There is no need for !MEM_P, this will prevent memory operand, which
> is allowed by constraint "m".
>
> +(define_insn "kmovq"
> +  [(set (match_operand:DI 0 "register_operand" "=k,k,km")
> +(unspec:DI
> +  [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
> +  UNSPEC_KMOV))]
> +  "TARGET_AVX512BW && !MEM_P (operands[1])"
>
> Operand 0 should have "nonimmediate_operand" predicate. And here you
> need  && !(MEM_P (op0) && MEM_P (op1)) in insn constraint to prevent
> mem->mem moves.

Changed according your comments and attached.


--
WBR,
Andrew


add_k-mask_intrinsics_11.11_1.patch
Description: Binary data


Re: [PATCH] Add AVX512 k-mask intrinsics

2016-11-11 Thread Uros Bizjak
On Fri, Nov 11, 2016 at 6:50 PM, Uros Bizjak  wrote:
> On Fri, Nov 11, 2016 at 6:38 PM, Andrew Senkevich
>  wrote:
>> 2016-11-11 17:34 GMT+03:00 Uros Bizjak :
>>> Some quick remarks:
>>>
>>> +(define_insn "kmovb"
>>> +  [(set (match_operand:QI 0 "nonimmediate_operand" "=k,k")
>>> + (unspec:QI
>>> +  [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
>>> +  UNSPEC_KMOV))]
>>> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512DQ"
>>> +  "@
>>> +   kmovb\t{%k1, %0|%0, %k1}
>>> +   kmovb\t{%1, %0|%0, %1}";
>>> +  [(set_attr "mode" "QI")
>>> +   (set_attr "type" "mskmov")
>>> +   (set_attr "prefix" "vex")])
>>> +
>>> +(define_insn "kmovd"
>>> +  [(set (match_operand:SI 0 "nonimmediate_operand" "=k,k")
>>> + (unspec:SI
>>> +  [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
>>> +  UNSPEC_KMOV))]
>>> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
>>> +  "@
>>> +   kmovd\t{%k1, %0|%0, %k1}
>>> +   kmovd\t{%1, %0|%0, %1}";
>>> +  [(set_attr "mode" "SI")
>>> +   (set_attr "type" "mskmov")
>>> +   (set_attr "prefix" "vex")])
>>> +
>>> +(define_insn "kmovq"
>>> +  [(set (match_operand:DI 0 "nonimmediate_operand" "=k,k,km")
>>> + (unspec:DI
>>> +  [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
>>> +  UNSPEC_KMOV))]
>>> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
>>> +  "@
>>> +   kmovq\t{%k1, %0|%0, %k1}
>>> +   kmovq\t{%1, %0|%0, %1}
>>> +   kmovq\t{%1, %0|%0, %1}";
>>> +  [(set_attr "mode" "DI")
>>> +   (set_attr "type" "mskmov")
>>> +   (set_attr "prefix" "vex")])
>>>
>>> - kmovd (and existing kmovw) should be using register_operand for
>>> opreand 0. In this case, there is no need for MEM_P checks at all.
>>> - In the insn constraint, pease check TARGET_AVX before checking MEM_P.
>>> - please put these definitions above corresponding *mov??_internal patterns.
>>
>> Do you mean put below *mov??_internal patterns? Attached corrected such way.
>
> No, please put kmovq near *movdi_internal, kmovd near *movsi_internal,
> etc. It doesn't matter if they are above or below their respective
> *mov??_internal patterns, as long as they are positioned in some
> consistent way. IOW, new patterns shouldn't be grouped together, as is
> the case with your patch.

+(define_insn "kmovb"
+  [(set (match_operand:QI 0 "register_operand" "=k,k")
+(unspec:QI
+  [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
+  UNSPEC_KMOV))]
+  "TARGET_AVX512DQ && !MEM_P (operands[1])"

There is no need for !MEM_P, this will prevent memory operand, which
is allowed by constraint "m".

+(define_insn "kmovq"
+  [(set (match_operand:DI 0 "register_operand" "=k,k,km")
+(unspec:DI
+  [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
+  UNSPEC_KMOV))]
+  "TARGET_AVX512BW && !MEM_P (operands[1])"

Operand 0 should have "nonimmediate_operand" predicate. And here you
need  && !(MEM_P (op0) && MEM_P (op1)) in insn constraint to prevent
mem->mem moves.

Uros.


Re: [PATCH] Add AVX512 k-mask intrinsics

2016-11-11 Thread Uros Bizjak
On Fri, Nov 11, 2016 at 6:38 PM, Andrew Senkevich
 wrote:
> 2016-11-11 17:34 GMT+03:00 Uros Bizjak :
>> Some quick remarks:
>>
>> +(define_insn "kmovb"
>> +  [(set (match_operand:QI 0 "nonimmediate_operand" "=k,k")
>> + (unspec:QI
>> +  [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
>> +  UNSPEC_KMOV))]
>> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512DQ"
>> +  "@
>> +   kmovb\t{%k1, %0|%0, %k1}
>> +   kmovb\t{%1, %0|%0, %1}";
>> +  [(set_attr "mode" "QI")
>> +   (set_attr "type" "mskmov")
>> +   (set_attr "prefix" "vex")])
>> +
>> +(define_insn "kmovd"
>> +  [(set (match_operand:SI 0 "nonimmediate_operand" "=k,k")
>> + (unspec:SI
>> +  [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
>> +  UNSPEC_KMOV))]
>> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
>> +  "@
>> +   kmovd\t{%k1, %0|%0, %k1}
>> +   kmovd\t{%1, %0|%0, %1}";
>> +  [(set_attr "mode" "SI")
>> +   (set_attr "type" "mskmov")
>> +   (set_attr "prefix" "vex")])
>> +
>> +(define_insn "kmovq"
>> +  [(set (match_operand:DI 0 "nonimmediate_operand" "=k,k,km")
>> + (unspec:DI
>> +  [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
>> +  UNSPEC_KMOV))]
>> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
>> +  "@
>> +   kmovq\t{%k1, %0|%0, %k1}
>> +   kmovq\t{%1, %0|%0, %1}
>> +   kmovq\t{%1, %0|%0, %1}";
>> +  [(set_attr "mode" "DI")
>> +   (set_attr "type" "mskmov")
>> +   (set_attr "prefix" "vex")])
>>
>> - kmovd (and existing kmovw) should be using register_operand for
>> opreand 0. In this case, there is no need for MEM_P checks at all.
>> - In the insn constraint, pease check TARGET_AVX before checking MEM_P.
>> - please put these definitions above corresponding *mov??_internal patterns.
>
> Do you mean put below *mov??_internal patterns? Attached corrected such way.

No, please put kmovq near *movdi_internal, kmovd near *movsi_internal,
etc. It doesn't matter if they are above or below their respective
*mov??_internal patterns, as long as they are positioned in some
consistent way. IOW, new patterns shouldn't be grouped together, as is
the case with your patch.

Uros.


Re: [PATCH] Add AVX512 k-mask intrinsics

2016-11-11 Thread Andrew Senkevich
2016-11-11 17:34 GMT+03:00 Uros Bizjak :
> Some quick remarks:
>
> +(define_insn "kmovb"
> +  [(set (match_operand:QI 0 "nonimmediate_operand" "=k,k")
> + (unspec:QI
> +  [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
> +  UNSPEC_KMOV))]
> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512DQ"
> +  "@
> +   kmovb\t{%k1, %0|%0, %k1}
> +   kmovb\t{%1, %0|%0, %1}";
> +  [(set_attr "mode" "QI")
> +   (set_attr "type" "mskmov")
> +   (set_attr "prefix" "vex")])
> +
> +(define_insn "kmovd"
> +  [(set (match_operand:SI 0 "nonimmediate_operand" "=k,k")
> + (unspec:SI
> +  [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
> +  UNSPEC_KMOV))]
> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
> +  "@
> +   kmovd\t{%k1, %0|%0, %k1}
> +   kmovd\t{%1, %0|%0, %1}";
> +  [(set_attr "mode" "SI")
> +   (set_attr "type" "mskmov")
> +   (set_attr "prefix" "vex")])
> +
> +(define_insn "kmovq"
> +  [(set (match_operand:DI 0 "nonimmediate_operand" "=k,k,km")
> + (unspec:DI
> +  [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
> +  UNSPEC_KMOV))]
> +  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
> +  "@
> +   kmovq\t{%k1, %0|%0, %k1}
> +   kmovq\t{%1, %0|%0, %1}
> +   kmovq\t{%1, %0|%0, %1}";
> +  [(set_attr "mode" "DI")
> +   (set_attr "type" "mskmov")
> +   (set_attr "prefix" "vex")])
>
> - kmovd (and existing kmovw) should be using register_operand for
> opreand 0. In this case, there is no need for MEM_P checks at all.
> - In the insn constraint, pease check TARGET_AVX before checking MEM_P.
> - please put these definitions above corresponding *mov??_internal patterns.

Do you mean put below *mov??_internal patterns? Attached corrected such way.


--
WBR,
Andrew


add_k-mask_intrinsics_11.11.patch
Description: Binary data


Re: [PATCH] Add AVX512 k-mask intrinsics

2016-11-11 Thread Marc Glisse

On Fri, 11 Nov 2016, Andrew Senkevich wrote:


+extern __inline __mmask32
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_kand_mask32 (__mmask32 __A, __mmask32 __B)
+{
+  return (__mmask32) __builtin_ia32_kandsi ((__mmask32) __A, (__mmask32) __B);
+}


(picking one random example)
Is a builtin really needed here? What would happen if you used

  return __A & __B;

?

--
Marc Glisse


Re: [PATCH] Add AVX512 k-mask intrinsics

2016-11-11 Thread Uros Bizjak
Some quick remarks:

+(define_insn "kmovb"
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=k,k")
+ (unspec:QI
+  [(match_operand:QI 1 "nonimmediate_operand" "r,km")]
+  UNSPEC_KMOV))]
+  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512DQ"
+  "@
+   kmovb\t{%k1, %0|%0, %k1}
+   kmovb\t{%1, %0|%0, %1}";
+  [(set_attr "mode" "QI")
+   (set_attr "type" "mskmov")
+   (set_attr "prefix" "vex")])
+
+(define_insn "kmovd"
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=k,k")
+ (unspec:SI
+  [(match_operand:SI 1 "nonimmediate_operand" "r,km")]
+  UNSPEC_KMOV))]
+  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
+  "@
+   kmovd\t{%k1, %0|%0, %k1}
+   kmovd\t{%1, %0|%0, %1}";
+  [(set_attr "mode" "SI")
+   (set_attr "type" "mskmov")
+   (set_attr "prefix" "vex")])
+
+(define_insn "kmovq"
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=k,k,km")
+ (unspec:DI
+  [(match_operand:DI 1 "nonimmediate_operand" "r,km,k")]
+  UNSPEC_KMOV))]
+  "!(MEM_P (operands[0]) && MEM_P (operands[1])) && TARGET_AVX512BW"
+  "@
+   kmovq\t{%k1, %0|%0, %k1}
+   kmovq\t{%1, %0|%0, %1}
+   kmovq\t{%1, %0|%0, %1}";
+  [(set_attr "mode" "DI")
+   (set_attr "type" "mskmov")
+   (set_attr "prefix" "vex")])

- kmovd (and existing kmovw) should be using register_operand for
opreand 0. In this case, there is no need for MEM_P checks at all.
- In the insn constraint, pease check TARGET_AVX before checking MEM_P.
- please put these definitions above corresponding *mov??_internal patterns.

+//case USI_FTYPE_UQI:
+//case USI_FTYPE_UHI:

No commented-out code without a good reason, please.

Uros.