RE: [patch, x86] Improve memcpy/memset strategy for Skylake.

2018-07-19 Thread Koval, Julia
Yes, it gives small improvements in rate mode: ~2% on 557.xz at -O2, and
~2.5% on 548.exchange and ~1% on 500.perlbench at -Ofast.

> -----Original Message-----
> From: Uros Bizjak [mailto:ubiz...@gmail.com]
> Sent: Thursday, July 19, 2018 8:12 AM
> To: Koval, Julia 
> Cc: GCC Patches 
> Subject: Re: [patch, x86] Improve memcpy/memset strategy for Skylake.
> 
> On Thu, Jul 19, 2018 at 7:00 AM, Koval, Julia  wrote:
> > Hi,
> > This patch improves memset/memcpy strategy for Skylake. Ok for trunk?
> 
> Is this patch based on some benchmark data?
> 
> Uros.
> 
> > * gcc/config/i386/x86-tune-costs.h (skylake_memcpy,
> > skylake_memset): Replace rep_prefix with unrolling on 512.
> >
> > Thanks,
> > Julia
> >


[patch, x86] Improve memcpy/memset strategy for Skylake.

2018-07-18 Thread Koval, Julia
Hi,
This patch improves memset/memcpy strategy for Skylake. Ok for trunk?

* gcc/config/i386/x86-tune-costs.h (skylake_memcpy,
skylake_memset): Replace rep_prefix with unrolling on 512.

Thanks,
Julia



0001-memset.patch
Description: 0001-memset.patch


RE: [patch] Remove redundant intrinsics

2018-06-14 Thread Koval, Julia
Hi,

This patch should fix the issue. Ok for trunk?

gcc/testsuite/
* gcc.target/i386/avx512vl-vpclmulqdq-2.c: Remove 128bit version.

Thanks,
Julia

> -----Original Message-----
> From: H.J. Lu [mailto:hjl.to...@gmail.com]
> Sent: Tuesday, June 12, 2018 1:27 PM
> To: Koval, Julia 
> Cc: GCC Patches ; Kirill Yukhin
> 
> Subject: Re: [patch] Remove redundant intrinsics
> 
> On Mon, Jun 4, 2018 at 3:27 AM, Koval, Julia  wrote:
> > Hi,
> >
> > Since pre-Icelake ISA already had 128bit version vpclmul and vaes, we
> > already have intrinsics for them(_mm_aesdec_si128, _mm_aesdeclast_si128,
> > _mm_aesenc_si128, _mm_aesenclast_si128, _mm_clmulepi64_si128).
> > Therefore intrinsics for them, introduced with Icelake instructions are
> > redundant. This patch removes them. Ok for trunk?
> >
> > gcc/
> > * config/i386/vaesintrin.h (_mm_aesdec_epi128, _mm_aesdeclast_epi128,
> > _mm_aesenc_epi128, _mm_aesenclast_epi128): Remove.
> > * config/i386/vpclmulqdqintrin.h (_mm_clmulepi64_epi128): Remove.
> >
> > gcc/testsuite/
> > * gcc.target/i386/avx512fvl-vaes-1.c: Remove 128bit versions from test.
> > * gcc.target/i386/vpclmulqdq.c: Ditto.
> 
> This caused:
> 
> [hjl@gnu-skx-1 gcc]$
> /export/build/gnu/gcc-test/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/gcc-test/build-x86_64-linux/gcc/
> /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/avx512vl-
> vpclmulqdq-2.c
> -fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -mavx512bw
> -mavx512vl -mvpclmulqdq -lm -o ./avx512vl-vpclmulqdq-2.exe
> In file included from
> /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/avx512f-
> vpclmulqdq-2.c:9,
>  from
> /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/avx512vl-
> vpclmulqdq-2.c:17:
> /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/avx512f-
> vpclmulqdq-2.c:
> In function ‘test_128’:
> /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/avx512f-
> helper.h:107:25:
> warning: implicit declaration of function
> ‘_mm_clmulepi64_epi128’; did you mean
> ‘_mm_clmulepi64_si128’? [-Wimplicit-function-declaration]
> /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/avx512f-
> vpclmulqdq-2.c:56:11:
> note: in expansion of macro ‘INTRINSIC’
> In file included from
> /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/avx512vl-
> vpclmulqdq-2.c:17:
> /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/avx512f-
> vpclmulqdq-2.c:56:9:
> error: incompatible types when assigning to type ‘__m128i’
> {aka ‘__vector(2) long long int’} from type ‘int’
> [hjl@gnu-skx-1 gcc]$
> 
> FAIL: gcc.target/i386/avx512vl-vpclmulqdq-2.c (test for excess errors)
> 
> 
> --
> H.J.


test_fix.patch
Description: test_fix.patch


[patch] Remove redundant intrinsics

2018-06-04 Thread Koval, Julia
Hi,

Since the pre-Icelake ISA already had 128-bit versions of vpclmul and vaes, we
already have intrinsics for them (_mm_aesdec_si128, _mm_aesdeclast_si128,
_mm_aesenc_si128, _mm_aesenclast_si128, _mm_clmulepi64_si128). The 128-bit
intrinsics introduced with the Icelake instructions are therefore redundant.
This patch removes them. Ok for trunk?

gcc/
* config/i386/vaesintrin.h (_mm_aesdec_epi128, _mm_aesdeclast_epi128,
_mm_aesenc_epi128, _mm_aesenclast_epi128): Remove.
* config/i386/vpclmulqdqintrin.h (_mm_clmulepi64_epi128): Remove.

gcc/testsuite/
* gcc.target/i386/avx512fvl-vaes-1.c: Remove 128bit versions from test.
* gcc.target/i386/vpclmulqdq.c: Ditto.

Thanks,
Julia


remove_in.patch
Description: remove_in.patch
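
For readers unfamiliar with the intrinsics named above, here is a minimal
sketch of how the pre-existing 128-bit intrinsic _mm_clmulepi64_si128 can be
used in place of the removed _mm_clmulepi64_epi128. The helper names
(clmul_low, clmul_low_intrinsic, clmul_low_portable) and the HAVE_X86_CLMUL
macro are hypothetical and not part of the patch. The sketch assumes a
GCC-compatible compiler on x86-64 and falls back to a portable loop elsewhere
or when the CPU lacks PCLMUL.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__GNUC__) && defined(__x86_64__)
#include <immintrin.h>
#define HAVE_X86_CLMUL 1  // hypothetical macro, used only in this sketch
#endif

// Portable carry-less multiply (low 64 bits of the product), used as fallback.
std::uint64_t clmul_low_portable(std::uint64_t a, std::uint64_t b) {
  std::uint64_t r = 0;
  for (int i = 0; i < 64; ++i)
    if ((b >> i) & 1)
      r ^= a << i;  // XOR instead of add: no carries between bit positions
  return r;
}

#ifdef HAVE_X86_CLMUL
// Same operation through the long-standing 128-bit intrinsic.
__attribute__((target("pclmul")))
static std::uint64_t clmul_low_intrinsic(std::uint64_t a, std::uint64_t b) {
  __m128i va = _mm_set_epi64x(0, static_cast<long long>(a));
  __m128i vb = _mm_set_epi64x(0, static_cast<long long>(b));
  // Immediate 0x00 selects the low 64-bit halves of both operands.
  __m128i prod = _mm_clmulepi64_si128(va, vb, 0x00);
  return static_cast<std::uint64_t>(_mm_cvtsi128_si64(prod));
}
#endif

// Dispatches at run time so the sketch also works on CPUs without PCLMUL.
std::uint64_t clmul_low(std::uint64_t a, std::uint64_t b) {
#ifdef HAVE_X86_CLMUL
  __builtin_cpu_init();
  if (__builtin_cpu_supports("pclmul"))
    return clmul_low_intrinsic(a, b);
#endif
  return clmul_low_portable(a, b);
}
```

Calling clmul_low(3, 3) multiplies (x + 1) by itself over GF(2), giving
x^2 + 1, that is 5.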


[x86, patch] Add tuning options to skylake-avx512

2018-04-13 Thread Koval, Julia
Hi,

This patch adds 2 tuning options to -march=skylake-avx512. Ok for trunk?

gcc/
   PR target/84413
   * config/i386/x86-tune.def (X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL,
   X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL): Add m_SKYLAKE_AVX512.

Thanks,
Julia


0001-pts.patch
Description: 0001-pts.patch


[x86] Skylake tuning options

2018-03-29 Thread Koval, Julia
Hi,

This patch adds 2 tuning options for -mtune=skylake-avx512. Ok for trunk?

gcc/
* x86-tune.def (movx, partial_reg_dependency): Enable for m_SKYLAKE_AVX512.

Thanks,
Julia


0001-ptc.patch
Description: 0001-ptc.patch


RE: [x86,avx] Fix __builtin_cpu_supports for icelake and cannonlake isa

2018-03-14 Thread Koval, Julia
Gentle ping.

> -----Original Message-----
> From: Koval, Julia
> Sent: Monday, February 12, 2018 10:57 AM
> To: Kirill Yukhin <kirill.yuk...@gmail.com>
> Cc: 'GCC Patches' <gcc-patches@gcc.gnu.org>
> Subject: RE: [x86,avx] Fix __builtin_cpu_supports for icelake and cannonlake 
> isa
> 
> Hi,
> 
> There is no PR for this. This builtin was just missing for all new cpus.
> 
> Thanks,
> Julia
> 
> > -----Original Message-----
> > From: Kirill Yukhin [mailto:kirill.yuk...@gmail.com]
> > Sent: Monday, February 12, 2018 7:19 AM
> > To: Koval, Julia <julia.ko...@intel.com>
> > Cc: 'GCC Patches' <gcc-patches@gcc.gnu.org>
> > Subject: Re: [x86,avx] Fix __builtin_cpu_supports for icelake and cannonlake
> isa
> >
> > Hello Julia.
> >
> > On 15 Jan 08:28, Koval, Julia wrote:
> > > Hi,
> > > This patch fixes subj. Ok for trunk?
> > >
> > > gcc/
> > >   * config/i386/i386.c (F_AVX512VBMI2, F_GFNI, F_VPCLMULQDQ,
> > F_AVX512VNNI,
> > >   F_AVX512BITALG): New.
> > >
> > > gcc/testsuite/
> > >   * gcc.target/i386/builtin_target.c (check_intel_cpu_model): Add
> > cannonlake.
> > >   (check_features): Add avx512vbmi2, gfni, vpclmulqdq, avx512vnni,
> > >   avx512bitalg.
> > >
> > > libgcc/
> > >   * config/i386/cpuinfo.c (get_available_features): Add
> > FEATURE_AVX512VBMI2,
> > >   FEATURE_GFNI, FEATURE_VPCLMULQDQ, FEATURE_AVX512VNNI,
> > FEATURE_AVX512BITALG.
> > >   * config/i386/cpuinfo.h (processor_features) Add
> > FEATURE_AVX512VBMI2,
> > >   FEATURE_GFNI, FEATURE_VPCLMULQDQ, FEATURE_AVX512VNNI,
> > FEATURE_AVX512BITALG.
> >
> > Could you pls mention, which problem does your patch fix?
> >
> > --
> > Thanks, K


RE: [patch][x86] Split-up march icelake on march=icelake-server and march=icelake-client

2018-03-14 Thread Koval, Julia
Small fix.

gcc/
* config.gcc (icelake-client, icelake-server): New.
(icelake): Remove.
* config/i386/i386.c (initial_ix86_tune_features): Extend to 64 bit.
(initial_ix86_arch_features): Ditto.
(PTA_SKYLAKE): Add SGX.
(PTA_ICELAKE): Remove.
(PTA_ICELAKE_CLIENT): New.
(PTA_ICELAKE_SERVER): New.
(ix86_option_override_internal): Split up icelake on icelake client and
icelake server.
(get_builtin_code_for_version): Ditto.
(fold_builtin_cpu): Ditto.
* config/i386/driver-i386.c (host_detect_local_cpu): Ditto.
* config/i386/i386-c.c (ix86_target_macros_internal): Ditto.
* config/i386/i386.h (processor_type): Ditto.
* doc/invoke.texi: Ditto.

gcc/testsuite/
* g++.dg/ext/mv16.C: Split up icelake on icelake client and
icelake-server.
* gcc.target/i386/funcspec-56.inc: Ditto.

libgcc/
* config/i386/cpuinfo.h (processor_subtypes): Split up icelake on
icelake client and icelake-server.

Thanks,
Julia

> -----Original Message-----
> From: Koval, Julia
> Sent: Tuesday, March 13, 2018 8:42 AM
> To: Joseph Myers <jos...@codesourcery.com>
> Cc: 'GCC Patches' <gcc-patches@gcc.gnu.org>; Uros Bizjak
> <ubiz...@gmail.com>
> Subject: RE: [patch][x86] Split-up march icelake on march=icelake-server and
> march=icelake-client
> 
> Fixed invoke.texi. Here is the new version.
> 
> gcc/
>   * config.gcc (icelake-client, icelake-server): New.
>   (icelake): Remove.
>   * config/i386/i386.c (initial_ix86_tune_features): Extend to 64 bit.
>   (initial_ix86_arch_features): Ditto.
>   (ix86_option_override_internal): Split up icelake on icelake client and
>   icelake server.
>   (get_builtin_code_for_version): Ditto.
>   (fold_builtin_cpu): Ditto.
>   * config/i386/driver-i386.c (host_detect_local_cpu): Ditto.
>   * config/i386/i386-c.c (ix86_target_macros_internal): Ditto.
>   * config/i386/i386.h (processor_type): Ditto.
>   * doc/invoke.texi: Ditto.
> 
> gcc/testsuite/
>   * g++.dg/ext/mv16.C: Split up icelake on icelake client and
>   icelake-server.
>   * gcc.target/i386/funcspec-56.inc: Ditto.
> 
> libgcc/
>   * config/i386/cpuinfo.h (processor_subtypes): Split up icelake on
>   icelake client and icelake-server.
> 
> Thanks,
> Julia
> 
> > -----Original Message-----
> > From: Joseph Myers [mailto:jos...@codesourcery.com]
> > Sent: Monday, March 12, 2018 10:21 PM
> > To: Koval, Julia <julia.ko...@intel.com>
> > Cc: 'GCC Patches' <gcc-patches@gcc.gnu.org>; Uros Bizjak
> > <ubiz...@gmail.com>
> > Subject: Re: [patch][x86] Split-up march icelake on march=icelake-server and
> > march=icelake-client
> >
> > On Mon, 12 Mar 2018, Koval, Julia wrote:
> >
> > > Hi,
> > > This patch introduces separate client and server arch options instead of
> > > -march=icelake. Ok for trunk?
> >
> > I don't see any invoke.texi updates here to document what these two
> > options mean (including, presumably, different lists of features for
> > them).
> >
> > --
> > Joseph S. Myers
> > jos...@codesourcery.com


0001-icelake-client.patch
Description: 0001-icelake-client.patch


RE: [patch][x86] Split-up march icelake on march=icelake-server and march=icelake-client

2018-03-13 Thread Koval, Julia
Fixed invoke.texi. Here is the new version.

gcc/
* config.gcc (icelake-client, icelake-server): New.
(icelake): Remove.
* config/i386/i386.c (initial_ix86_tune_features): Extend to 64 bit.
(initial_ix86_arch_features): Ditto.
(ix86_option_override_internal): Split up icelake on icelake client and
icelake server.
(get_builtin_code_for_version): Ditto.
(fold_builtin_cpu): Ditto.
* config/i386/driver-i386.c (host_detect_local_cpu): Ditto.
* config/i386/i386-c.c (ix86_target_macros_internal): Ditto.
* config/i386/i386.h (processor_type): Ditto.
* doc/invoke.texi: Ditto.

gcc/testsuite/
* g++.dg/ext/mv16.C: Split up icelake on icelake client and
icelake-server.
* gcc.target/i386/funcspec-56.inc: Ditto.

libgcc/
* config/i386/cpuinfo.h (processor_subtypes): Split up icelake on
icelake client and icelake-server.

Thanks,
Julia

> -----Original Message-----
> From: Joseph Myers [mailto:jos...@codesourcery.com]
> Sent: Monday, March 12, 2018 10:21 PM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: 'GCC Patches' <gcc-patches@gcc.gnu.org>; Uros Bizjak
> <ubiz...@gmail.com>
> Subject: Re: [patch][x86] Split-up march icelake on march=icelake-server and
> march=icelake-client
> 
> On Mon, 12 Mar 2018, Koval, Julia wrote:
> 
> > Hi,
> > This patch introduces separate client and server arch options instead of
> > -march=icelake. Ok for trunk?
> 
> I don't see any invoke.texi updates here to document what these two
> options mean (including, presumably, different lists of features for
> them).
> 
> --
> Joseph S. Myers
> jos...@codesourcery.com


0001-icelake-client.patch
Description: 0001-icelake-client.patch


[patch][x86] Split-up march icelake on march=icelake-server and march=icelake-client

2018-03-12 Thread Koval, Julia
Hi,
This patch introduces separate client and server arch options instead of 
-march=icelake. Ok for trunk?

Thanks,
Julia


gcc/
* config.gcc (icelake-client, icelake-server): New.
(icelake): Remove.
* config/i386/i386.c (initial_ix86_tune_features): Extend to 64 bit.
(initial_ix86_arch_features): Ditto.
(ix86_option_override_internal): Split up icelake on icelake client and
icelake server.
(get_builtin_code_for_version): Ditto.
(fold_builtin_cpu): Ditto.
* config/i386/driver-i386.c (host_detect_local_cpu): Ditto.
* config/i386/i386-c.c (ix86_target_macros_internal): Ditto.
* config/i386/i386.h (processor_type): Ditto.

gcc/testsuite/
* g++.dg/ext/mv16.C: Split up icelake on icelake client and
icelake-server.
* gcc.target/i386/funcspec-56.inc: Ditto.

libgcc/
* config/i386/cpuinfo.h (processor_subtypes): Split up icelake on
icelake client and icelake-server.


0001-icelake-client.patch
Description: 0001-icelake-client.patch


[x86] Fix CLWB documentation.

2018-02-16 Thread Koval, Julia
Hi,
This is a small fix for the documentation: it adds CLWB to skylake-avx512 and
removes it from cannonlake.

gcc/
* doc/invoke.texi (Skylake Server): Add CLWB.
(Cannonlake): Remove CLWB.

Thanks,
Julia


doc_patch_16.2.18
Description: doc_patch_16.2.18


[x86] Remove CLWB from cannonlake.

2018-02-16 Thread Koval, Julia
In the latest ISE (Instruction Set Extensions) document,
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
CLWB was removed from Cannonlake. This patch removes it from GCC. Ok for trunk?

Thanks,
Julia

gcc/
* config/i386/i386.c (ix86_option_override_internal): Remove PTA_CLWB from
PTA_CANNONLAKE.


patch
Description: patch


RE: [x86,avx] Fix __builtin_cpu_supports for icelake and cannonlake isa

2018-02-12 Thread Koval, Julia
Hi,

There is no PR for this. This builtin was just missing for all new cpus.

Thanks,
Julia

> -----Original Message-----
> From: Kirill Yukhin [mailto:kirill.yuk...@gmail.com]
> Sent: Monday, February 12, 2018 7:19 AM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: 'GCC Patches' <gcc-patches@gcc.gnu.org>
> Subject: Re: [x86,avx] Fix __builtin_cpu_supports for icelake and cannonlake 
> isa
> 
> Hello Julia.
> 
> On 15 Jan 08:28, Koval, Julia wrote:
> > Hi,
> > This patch fixes subj. Ok for trunk?
> >
> > gcc/
> > * config/i386/i386.c (F_AVX512VBMI2, F_GFNI, F_VPCLMULQDQ,
> F_AVX512VNNI,
> > F_AVX512BITALG): New.
> >
> > gcc/testsuite/
> > * gcc.target/i386/builtin_target.c (check_intel_cpu_model): Add
> cannonlake.
> > (check_features): Add avx512vbmi2, gfni, vpclmulqdq, avx512vnni,
> > avx512bitalg.
> >
> > libgcc/
> > * config/i386/cpuinfo.c (get_available_features): Add
> FEATURE_AVX512VBMI2,
> > FEATURE_GFNI, FEATURE_VPCLMULQDQ, FEATURE_AVX512VNNI,
> FEATURE_AVX512BITALG.
> > * config/i386/cpuinfo.h (processor_features) Add
> FEATURE_AVX512VBMI2,
> > FEATURE_GFNI, FEATURE_VPCLMULQDQ, FEATURE_AVX512VNNI,
> FEATURE_AVX512BITALG.
> 
> Could you pls mention, which problem does your patch fix?
> 
> --
> Thanks, K


RE: [patch][x86] -march=icelake

2018-01-30 Thread Koval, Julia
Thank you for your comments. I fixed them and rebased the Ice Lake patch on top
of the bitmask patch. Ok for trunk?

Bitmask patch changelog:

gcc/c-family/
* c-common.h (omp_clause_mask): Move to wide_int_bitmask.h.

gcc/
* config/i386/i386.c (ix86_option_override_internal): Change flags type to
wide_int_bitmask.
* wide-int-bitmask.h: New.

Icelake patch changelog:

gcc/
* config.gcc: Add -march=icelake.
* config/i386/driver-i386.c (host_detect_local_cpu): Detect icelake.
* config/i386/i386-c.c (ix86_target_macros_internal): Handle icelake.
* config/i386/i386.c (processor_costs): Add m_ICELAKE.
(PTA_ICELAKE, PTA_AVX512VNNI, PTA_GFNI, PTA_VAES, PTA_AVX512VBMI2,
PTA_VPCLMULQDQ, PTA_RDPID, PTA_AVX512BITALG): New.
(processor_target_table): Add icelake.
(ix86_option_override_internal): Handle new PTAs.
(get_builtin_code_for_version): Handle icelake.
(M_INTEL_COREI7_ICELAKE): New.
(fold_builtin_cpu): Handle icelake.
* config/i386/i386.h (TARGET_ICELAKE, PROCESSOR_ICELAKE): New.
* doc/invoke.texi: Add -march=icelake.
gcc/testsuite/
* gcc.target/i386/funcspec-56.inc: Handle new march.
* g++.dg/ext/mv16.C: Ditto.
libgcc/
* config/i386/cpuinfo.h (processor_subtypes): Add INTEL_COREI7_ICELAKE.

Thanks,
Julia

> -----Original Message-----
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: Tuesday, January 30, 2018 9:47 AM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: Richard Biener <rguent...@suse.de>; Uros Bizjak <ubiz...@gmail.com>;
> GCC Patches <gcc-patches@gcc.gnu.org>; Kirill Yukhin
> <kirill.yuk...@gmail.com>
> Subject: Re: [patch][x86] -march=icelake
> 
> On Tue, Jan 30, 2018 at 08:35:38AM +, Koval, Julia wrote:
> > * c-common.h (omp_clause_mask): Move to wide_int_bitmask.h
> 
> Missing dot at the end.
> 
> +  wide_int_bitmask PTA_3DNOW (HOST_WIDE_INT_1U << 0);
> 
> Can't all these be const wide_int_bitmask instead of just
> wide_int_bitmask?
> 
> ...
> +
> +  wide_int_bitmask PTA_CORE2 = PTA_64BIT | PTA_MMX | PTA_SSE |
> PTA_SSE2
> +| PTA_SSE3 | PTA_SSSE3 | PTA_CX16 | PTA_FXSR;
> +  wide_int_bitmask PTA_NEHALEM = PTA_CORE2 | PTA_SSE4_1 | PTA_SSE4_2
> +| PTA_POPCNT;
> +  wide_int_bitmask PTA_WESTMERE = PTA_NEHALEM | PTA_AES |
> PTA_PCLMUL;
> +  wide_int_bitmask PTA_SANDYBRIDGE = PTA_WESTMERE | PTA_AVX |
> PTA_XSAVE
> +| PTA_XSAVEOPT;
> +  wide_int_bitmask PTA_IVYBRIDGE = PTA_SANDYBRIDGE | PTA_FSGSBASE |
> PTA_RDRND
> +| PTA_F16C;
> +  wide_int_bitmask PTA_HASWELL = PTA_IVYBRIDGE | PTA_AVX2 | PTA_BMI |
> PTA_BMI2
> +| PTA_LZCNT | PTA_FMA | PTA_MOVBE | PTA_HLE;
> +  wide_int_bitmask PTA_BROADWELL = PTA_HASWELL | PTA_ADX |
> PTA_PRFCHW
> +| PTA_RDSEED;
> +  wide_int_bitmask PTA_SKYLAKE = PTA_BROADWELL | PTA_CLFLUSHOPT |
> PTA_XSAVEC
> +| PTA_XSAVES;
> +  wide_int_bitmask PTA_SKYLAKE_AVX512 = PTA_SKYLAKE | PTA_AVX512F |
> PTA_AVX512CD
> +| PTA_AVX512VL | PTA_AVX512BW | PTA_AVX512DQ | PTA_PKU |
> PTA_CLWB;
> +  wide_int_bitmask PTA_CANNONLAKE = PTA_SKYLAKE_AVX512 |
> PTA_AVX512VBMI
> +| PTA_AVX512IFMA | PTA_SHA;
> +  wide_int_bitmask PTA_KNL = PTA_BROADWELL | PTA_AVX512PF |
> PTA_AVX512ER
> +| PTA_AVX512F | PTA_AVX512CD;
> +  wide_int_bitmask PTA_BONNELL = PTA_CORE2 | PTA_MOVBE;
> +  wide_int_bitmask PTA_SILVERMONT = PTA_WESTMERE | PTA_MOVBE |
> PTA_RDRND;
> +  wide_int_bitmask PTA_KNM = PTA_KNL | PTA_AVX5124VNNIW |
> PTA_AVX5124FMAPS
> +| PTA_AVX512VPOPCNTDQ;
> 
> Likewise for these.
> 
> --- /dev/null
> +++ b/gcc/wide-int-bitmask.h
> @@ -0,0 +1,145 @@
> +/* Operation with 128 bit bitmask.
> +   Copyright (C) 1987-2018 Free Software Foundation, Inc.
> 
> Please use 2013-2018 instead, all the omp_clause_mask stuff was
> introduced in 2013.
> 
> +
> +#ifndef GCC_BIT_MASK_H
> +#define GCC_BIT_MASK_H
> 
> The macro hasn't been renamed for the header file rename.
> 
> +
> +#endif /* ! GCC_BIT_MASK_H */
> 
> Here as well.  Otherwise LGTM.
> 
>   Jakub


0001-bitmask.patch
Description: 0001-bitmask.patch


0002-icelake_rebased.patch
Description: 0002-icelake_rebased.patch
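
As a rough illustration of the idea discussed in this thread (a flag set that
no longer fits into 64 bits, kept as two 64-bit words but used through ordinary
bitwise operators), here is a minimal sketch. It is not the code from the
attached patch: the class name two_word_mask, the FLAG_* constants, CPU_X and
has_flag are all hypothetical. The flag constants are declared const, as
suggested in the review above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 128-bit mask built from two 64-bit words.
class two_word_mask {
public:
  constexpr two_word_mask() : low_(0), high_(0) {}
  constexpr two_word_mask(std::uint64_t low) : low_(low), high_(0) {}
  constexpr two_word_mask(std::uint64_t low, std::uint64_t high)
    : low_(low), high_(high) {}

  constexpr two_word_mask operator|(two_word_mask o) const {
    return two_word_mask(low_ | o.low_, high_ | o.high_);
  }
  constexpr two_word_mask operator&(two_word_mask o) const {
    return two_word_mask(low_ & o.low_, high_ & o.high_);
  }
  // Left shift across the word boundary; n is assumed to be non-negative.
  constexpr two_word_mask operator<<(int n) const {
    return n == 0 ? *this
         : n >= 128 ? two_word_mask()
         : n >= 64 ? two_word_mask(0, low_ << (n - 64))
         : two_word_mask(low_ << n, (high_ << n) | (low_ >> (64 - n)));
  }
  constexpr bool operator==(two_word_mask o) const {
    return low_ == o.low_ && high_ == o.high_;
  }
  constexpr bool operator!=(two_word_mask o) const {
    return !(*this == o);
  }

private:
  std::uint64_t low_, high_;
};

// Illustrative flags only; two of them live in the upper word.
const two_word_mask FLAG_A (two_word_mask(1) << 0);
const two_word_mask FLAG_B (two_word_mask(1) << 63);
const two_word_mask FLAG_C (two_word_mask(1) << 64);
const two_word_mask FLAG_D (two_word_mask(1) << 70);
const two_word_mask CPU_X = FLAG_A | FLAG_C;

// "Is any of these bits set?" needs no separate any() member:
// comparing the masked value against 0 is enough.
bool has_flag(two_word_mask set, two_word_mask flag) {
  return (set & flag) != 0;
}
```

With these definitions, has_flag (CPU_X, FLAG_C) is true even though FLAG_C is
bit 64, and has_flag (CPU_X, FLAG_D) is false.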


RE: [patch][x86] -march=icelake

2018-01-30 Thread Koval, Julia
Renamed it. Ok for trunk?

gcc/c-family/
* c-common.h (omp_clause_mask): Move to wide_int_bitmask.h

gcc/
* config/i386/i386.c (ix86_option_override_internal): Change flags type to
wide_int_bitmask.
* wide-int-bitmask.h: New.

Thanks,
Julia


> -----Original Message-----
> From: Richard Biener [mailto:rguent...@suse.de]
> Sent: Wednesday, January 24, 2018 12:18 PM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: Jakub Jelinek <ja...@redhat.com>; Uros Bizjak <ubiz...@gmail.com>; GCC
> Patches <gcc-patches@gcc.gnu.org>; Kirill Yukhin <kirill.yuk...@gmail.com>
> Subject: RE: [patch][x86] -march=icelake
> 
> On Wed, 24 Jan 2018, Koval, Julia wrote:
> 
> > I think we may want to extend it to more than 2 ints someday, when we run
> > out of bits again. It won't break the existing functionality if 3rd int will
> > be zero by default. That's why I tried to avoid "two" in the name.
> >
> > Julia
> >
> > > -----Original Message-----
> > > From: Jakub Jelinek [mailto:ja...@redhat.com]
> > > Sent: Wednesday, January 24, 2018 12:06 PM
> > > To: Uros Bizjak <ubiz...@gmail.com>; Richard Biener <rguent...@suse.de>
> > > Cc: Koval, Julia <julia.ko...@intel.com>; GCC Patches <gcc-patches@gcc.gnu.org>;
> > > Kirill Yukhin <kirill.yuk...@gmail.com>
> > > Subject: Re: [patch][x86] -march=icelake
> > >
> > > On Wed, Jan 24, 2018 at 12:00:26PM +0100, Uros Bizjak wrote:
> > > > > On Mon, Jan 22, 2018 at 3:44 PM, Koval, Julia <julia.ko...@intel.com>
> > > > > wrote:
> > > > > Yes, you are right, any() is not required. Here is the patch.
> > > >
> > > > Please also attach ChangeLog.
> > > >
> > > > The patch is OK for x86 target, it needs global reviewer approval
> > > > (Maybe Jakub, as the patch touches OMP part).
> > >
> > > > I don't like the new class name nor header name, bit_mask is way too
> > > > generic name for something very specialized (double hwi bitmask).
> > >
> > > Richard, any suggestions for this?
> 
> Maybe wide_int_bitmask?  You could then even use fixed_wide_int <> as
> "implementation".
> 
> Richard.


0001-bitmask.patch
Description: 0001-bitmask.patch


RE: [PATCH] Fix various x86 avx512{bitalg, vpopcntdq, vbmi2} issues (PR target/83488)

2018-01-24 Thread Koval, Julia
Hi,
Fixed it. Ok for trunk?

gcc/
* config/i386/avx512bitalgintrin.h (_mm512_bitshuffle_epi64_mask,
_mm512_mask_bitshuffle_epi64_mask, _mm256_bitshuffle_epi64_mask,
_mm256_mask_bitshuffle_epi64_mask, _mm_bitshuffle_epi64_mask,
_mm_mask_bitshuffle_epi64_mask): Fix type.
* config/i386/i386-builtin-types.def (UHI_FTYPE_V2DI_V2DI_UHI,
USI_FTYPE_V4DI_V4DI_USI): Remove.
* config/i386/i386-builtin.def (__builtin_ia32_vpshufbitqmb512_mask,
__builtin_ia32_vpshufbitqmb256_mask,
__builtin_ia32_vpshufbitqmb128_mask): Fix types.
* config/i386/i386.c (ix86_expand_args_builtin): Remove old types.
* config/i386/sse.md (VI1_AVX512VLBW): Change types.

gcc/testsuite/
* gcc.target/i386/avx512bitalg-vpshufbitqmb-1.c: Add -mavx512f 
-mavx512bw.
* gcc.target/i386/avx512bitalgvl-vpshufbitqmb-1.c: Add -mavx512bw.
* gcc.target/i386/i386.exp: Fix types.

Thanks,
Julia

> -----Original Message-----
> From: Kirill Yukhin [mailto:kirill.yuk...@gmail.com]
> Sent: Saturday, January 20, 2018 11:49 AM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: 'Jakub Jelinek' <ja...@redhat.com>; 'Uros Bizjak' <ubiz...@gmail.com>;
> 'GCC Patches' <gcc-patches@gcc.gnu.org>
> Subject: Re: [PATCH] Fix various x86 avx512{bitalg, vpopcntdq, vbmi2} issues 
> (PR
> target/83488)
> 
> Hello Julia,
> On 12 Jan 08:55, Koval, Julia wrote:
> > Changelog
> >
> > gcc/
> > * config/i386/avx512bitalgintrin.h (_mm512_bitshuffle_epi64_mask,
> > _mm512_mask_bitshuffle_epi64_mask,
> _mm256_bitshuffle_epi64_mask,
> > _mm256_mask_bitshuffle_epi64_mask, _mm_bitshuffle_epi64_mask,
> > _mm_mask_bitshuffle_epi64_mask): Fix type.
> > * config/i386/i386-builtin-types.def (UHI_FTYPE_V2DI_V2DI_UHI,
> > USI_FTYPE_V4DI_V4DI_USI): Remove.
> > * config/i386/i386-builtin.def (__builtin_ia32_vpshufbitqmb512_mask,
> > __builtin_ia32_vpshufbitqmb256_mask,
> > __builtin_ia32_vpshufbitqmb128_mask): Fix types.
> > * config/i386/i386.c (ix86_expand_args_builtin): Remove old types.
> > * config/i386/sse.md (VI48_AVX512VLBW): Change types.
> >
> > gcc/testsuite/
> > * gcc.target/i386/avx512bitalg-vpshufbitqmb-1.c: Add -mavx512f
> > -mavx512bw.
> > * gcc.target/i386/avx512bitalgvl-vpshufbitqmb-1.c: Add -mavx512bw.
> > * gcc.target/i386/i386.exp: Fix types.
> 
>  (define_mode_iterator VI48_AVX512VLBW
> -  [(V8DI "TARGET_AVX512BW") (V4DI  "TARGET_AVX512VL")
> -   (V2DI  "TARGET_AVX512VL")])
> +  [(V64QI "TARGET_AVX512BW") (V32QI  "TARGET_AVX512VL")
> +   (V16QI  "TARGET_AVX512VL")])
> I'd call this iterator VI1_AVX512VLBW.
> 
> --
> Thanks, K



0001-bitalg-fix.patch
Description: 0001-bitalg-fix.patch


RE: [patch][x86] -march=icelake

2018-01-24 Thread Koval, Julia
I think we may want to extend it to more than two ints someday, when we run out
of bits again. That won't break the existing functionality as long as the third
int is zero by default. That's why I tried to avoid "two" in the name.

Julia

> -----Original Message-----
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: Wednesday, January 24, 2018 12:06 PM
> To: Uros Bizjak <ubiz...@gmail.com>; Richard Biener <rguent...@suse.de>
> Cc: Koval, Julia <julia.ko...@intel.com>; GCC Patches <gcc-patches@gcc.gnu.org>;
> Kirill Yukhin <kirill.yuk...@gmail.com>
> Subject: Re: [patch][x86] -march=icelake
> 
> On Wed, Jan 24, 2018 at 12:00:26PM +0100, Uros Bizjak wrote:
> > On Mon, Jan 22, 2018 at 3:44 PM, Koval, Julia <julia.ko...@intel.com> wrote:
> > > Yes, you are right, any() is not required. Here is the patch.
> >
> > Please also attach ChangeLog.
> >
> > The patch is OK for x86 target, it needs global reviewer approval
> > (Maybe Jakub, as the patch touches OMP part).
> 
> I don't like the new class name nor header name, bit_mask is way too generic
> name for something very specialized (double hwi bitmask).
> 
> Richard, any suggestions for this?
> 
>   Jakub


RE: [patch][x86] -march=icelake

2018-01-22 Thread Koval, Julia
Yes, you are right, any() is not required. Here is the patch.

Thanks,
Julia

> -----Original Message-----
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: Monday, January 22, 2018 12:36 PM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: Richard Biener <richard.guent...@gmail.com>; Uros Bizjak
> <ubiz...@gmail.com>; GCC Patches <gcc-patches@gcc.gnu.org>; Kirill Yukhin
> <kirill.yuk...@gmail.com>
> Subject: Re: [patch][x86] -march=icelake
> 
> On Mon, Jan 22, 2018 at 11:30:10AM +, Koval, Julia wrote:
> > Hi, I tried omp_clause_mask and it looks ok.  But it lacks check if there
> > is any bit or none.  With addition of it(as proposed or in some other way
> > it should work.  What do you think about this approach(patch attached)?
> 
> Well, I certainly didn't mean to use omp_clause_mask for something
> completely unrelated to OpenMP, the reason I've mentioned it is that it is a
> class that deals with a similar problem.
> 
> So, if you want to use the same class, it would need to be moved to some
> generic header, renamed and then c-common.h would typedef that_class
> omp_clause_mask.
> 
> I'm surprised you need any, doesn't ((mask & (...)) != 0 already handle
> that?
> 
>   Jakub



0001-test.patch
Description: 0001-test.patch


RE: [patch][x86] -march=icelake

2018-01-22 Thread Koval, Julia
Hi,
I tried omp_clause_mask and it looks ok, but it lacks a check for whether any
bit is set. With that added (as proposed, or in some other way) it should work.
What do you think about this approach (patch attached)?

Thanks,
Julia

> -----Original Message-----
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: Tuesday, December 19, 2017 2:50 PM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: Richard Biener <richard.guent...@gmail.com>; Uros Bizjak
> <ubiz...@gmail.com>; GCC Patches <gcc-patches@gcc.gnu.org>; Kirill Yukhin
> <kirill.yuk...@gmail.com>
> Subject: Re: [patch][x86] -march=icelake
> 
> On Tue, Dec 19, 2017 at 12:34:03PM +, Koval, Julia wrote:
> > >> Maybe [] operator could be used instead of a dynamic handling here.
> > I had another solution in mind, with enums, which then addresses elements
> using its index, please look the patch attached.
> 
> You can also have a look at the omp_clause_mask class in c-common.h, that is
> also something that has been added to handle the case where we run out of
> 64-bits for a particular bitmask, wanted to keep using pretty much the same
> interfaces and be able to handle it fast.  Using 2 enums for the two halves
> and treating it accordingly is also an option.
> 
> I agree sbitmap is too heavy for this.
> 
>   Jakub


0001-test.patch
Description: 0001-test.patch


[patch][x86] Fix PR83618

2018-01-17 Thread Koval, Julia
This fixes a bug where the rdpid intrinsic used eax instead of rax in 64-bit
mode. Ok for trunk?

gcc/
* config/i386/i386.c (ix86_expand_builtin): Handle IX86_BUILTIN_RDPID.
* config/i386/i386.md (rdpid_rex64): New.
(rdpid): Make 32bit only.

gcc/testsuite/
* gcc.target/i386/rdpid.c: Remove "eax".

Thanks,
Julia


0001-fix.patch
Description: 0001-fix.patch


[x86,avx] Fix __builtin_cpu_supports for icelake and cannonlake isa

2018-01-15 Thread Koval, Julia
Hi,
This patch fixes subj. Ok for trunk?

gcc/
* config/i386/i386.c (F_AVX512VBMI2, F_GFNI, F_VPCLMULQDQ, F_AVX512VNNI,
F_AVX512BITALG): New.

gcc/testsuite/
* gcc.target/i386/builtin_target.c (check_intel_cpu_model): Add cannonlake.
(check_features): Add avx512vbmi2, gfni, vpclmulqdq, avx512vnni,
avx512bitalg.

libgcc/
* config/i386/cpuinfo.c (get_available_features): Add FEATURE_AVX512VBMI2,
FEATURE_GFNI, FEATURE_VPCLMULQDQ, FEATURE_AVX512VNNI, FEATURE_AVX512BITALG.
* config/i386/cpuinfo.h (processor_features): Add FEATURE_AVX512VBMI2,
FEATURE_GFNI, FEATURE_VPCLMULQDQ, FEATURE_AVX512VNNI, FEATURE_AVX512BITALG.


0001-new-isa-builtin_cpu-test.patch
Description: 0001-new-isa-builtin_cpu-test.patch
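
To show what the builtin looks like from the user side, here is a minimal
sketch that queries the five feature names listed in the ChangeLog above. The
function name count_new_isa_features is hypothetical. The sketch assumes a
GCC-compatible compiler on x86 that already accepts these feature strings (a
compiler without this patch would reject them at compile time); on other
targets it simply returns 0.

```cpp
#include <cassert>

// Counts how many of the newly added ISA features the host CPU reports.
int count_new_isa_features() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  // Fill in the CPU model/feature data before querying it.
  __builtin_cpu_init();
  int n = 0;
  // The argument must be a string literal known to the compiler.
  if (__builtin_cpu_supports("avx512vbmi2"))  ++n;
  if (__builtin_cpu_supports("gfni"))         ++n;
  if (__builtin_cpu_supports("vpclmulqdq"))   ++n;
  if (__builtin_cpu_supports("avx512vnni"))   ++n;
  if (__builtin_cpu_supports("avx512bitalg")) ++n;
  return n;
#else
  return 0;  // not an x86 target: none of these features apply
#endif
}
```

The result depends on the machine the code runs on, so only the range 0 to 5
can be relied upon.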


RE: [PATCH] Fix various x86 avx512{bitalg,vpopcntdq,vbmi2} issues (PR target/83488)

2018-01-12 Thread Koval, Julia
Changelog

gcc/
* config/i386/avx512bitalgintrin.h (_mm512_bitshuffle_epi64_mask,
_mm512_mask_bitshuffle_epi64_mask, _mm256_bitshuffle_epi64_mask,
_mm256_mask_bitshuffle_epi64_mask, _mm_bitshuffle_epi64_mask,
_mm_mask_bitshuffle_epi64_mask): Fix type.
* config/i386/i386-builtin-types.def (UHI_FTYPE_V2DI_V2DI_UHI,
USI_FTYPE_V4DI_V4DI_USI): Remove.
* config/i386/i386-builtin.def (__builtin_ia32_vpshufbitqmb512_mask,
__builtin_ia32_vpshufbitqmb256_mask,
__builtin_ia32_vpshufbitqmb128_mask): Fix types.
* config/i386/i386.c (ix86_expand_args_builtin): Remove old types.
* config/i386/sse.md (VI48_AVX512VLBW): Change types.

gcc/testsuite/
* gcc.target/i386/avx512bitalg-vpshufbitqmb-1.c: Add -mavx512f 
-mavx512bw.
* gcc.target/i386/avx512bitalgvl-vpshufbitqmb-1.c: Add -mavx512bw.
* gcc.target/i386/i386.exp: Fix types.

> -----Original Message-----
> From: Koval, Julia
> Sent: Wednesday, January 10, 2018 11:51 AM
> To: 'Jakub Jelinek' <ja...@redhat.com>; 'Kirill Yukhin'
> <kirill.yuk...@gmail.com>; 'Uros Bizjak' <ubiz...@gmail.com>
> Cc: 'GCC Patches' <gcc-patches@gcc.gnu.org>
> Subject: RE: [PATCH] Fix various x86 avx512{bitalg,vpopcntdq,vbmi2} issues (PR
> target/83488)
> 
> Hi,
> 
> What do you think about changing these types to UHI_FTYPE_V16QI_V16QI_UHI
> and so on?
> In docs it is (KL, VL) = (16,128), (32,256), (64, 512) - so looks like this is
> where the error was from the start.
> Here is the patch.
> 
> Thanks,
> Julia
> 
> > -Original Message-
> > From: Koval, Julia
> > Sent: Monday, December 25, 2017 1:01 PM
> > To: Jakub Jelinek <ja...@redhat.com>; Kirill Yukhin
> <kirill.yuk...@gmail.com>;
> > Uros Bizjak <ubiz...@gmail.com>
> > Cc: GCC Patches <gcc-patches@gcc.gnu.org>
> > Subject: RE: [PATCH] Fix various x86 avx512{bitalg,vpopcntdq,vbmi2} issues
> (PR
> > target/83488)
> >
> > Thank you very much for fixing those issues.
> >
> > > Note, __builtin_ia32_vpshufbitqmb{128,256,512}_mask are implemented
> > > incorrectly, can somebody from Intel handle that?  The inlines in the
> > > intrinsic header look correct, but the builtins aren't and what's even 
> > > worse
> > > is that the define_insns are wrong too.  According to the documentation
> > > and inline fn, the intrinsics have an __mmask{16,32,64} input mask and
> > > also __mmask{16,32,64} output mask.  The builtins use
> > > UHI_FTYPE_V2DI_V2DI_UHI
> > > USI_FTYPE_V4DI_V4DI_USI
> > > UQI_FTYPE_V8DI_V8DI_UQI
> > > types (first two are correct, the last one is wrong, should have been
> > > UDI_FTYPE_V8DI_V8DI_UDI), and the define_insn has:
> > > (set (match_operand:QI 0 ("register_operand") ("=Yk"))
> > >      (and:QI (unspec:QI [
> > >                (match_operand:V2DI 1 ("register_operand") ("v"))
> > >                (match_operand:V2DI 2 ("nonimmediate_operand") ("vm"))
> > >              ] 214)
> > >              (match_operand:QI 3 ("register_operand") ("Yk"))))
> > > (incorrect, should use :HI result and :HI mask input),
> > > (set (match_operand:QI 0 ("register_operand") ("=Yk"))
> > >      (and:QI (unspec:QI [
> > >                (match_operand:V4DI 1 ("register_operand") ("v"))
> > >                (match_operand:V4DI 2 ("nonimmediate_operand") ("vm"))
> > >              ] 214)
> > >              (match_operand:QI 3 ("register_operand") ("Yk"))))
> > > (incorrect, should use :SI result and :SI mask input),
> > > (set (match_operand:QI 0 ("register_operand") ("=Yk"))
> > >      (and:QI (unspec:QI [
> > >                (match_operand:V8DI 1 ("register_operand") ("v"))
> > >                (match_operand:V8DI 2 ("nonimmediate_operand") ("vm"))
> > >              ] 214)
> > >              (match_operand:QI 3 ("register_operand") ("Yk"))))
> > > (incorrect, should use :DI result and :DI mask input).  Similarly the
> > > non-masked patterns, where just the result is incorrect, not the operand 3
> > > which doesn't exist).  I'll file a PR to track this.
> >
> > I'll fix that.
> >
> > Thanks,
> >

RE: [PATCH] Fix various x86 avx512{bitalg,vpopcntdq,vbmi2} issues (PR target/83488)

2018-01-10 Thread Koval, Julia
Hi,

What do you think about changing these types to UHI_FTYPE_V16QI_V16QI_UHI and 
so on?
In the docs it is (KL, VL) = (16, 128), (32, 256), (64, 512), so it looks like 
this is where the error came from in the first place.
Here is the patch.

Thanks,
Julia
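
The (KL, VL) pairs quoted above follow from the instruction producing one mask bit per byte of the vector, which is why the 128/256/512-bit forms need 16/32/64-bit masks (HI/SI/DI modes). A minimal sketch of that arithmetic; mask_bits_for_vector is a hypothetical helper used only for illustration:

```c
#include <assert.h>

/* One mask bit per byte element: the mask length KL is the vector
   length VL (in bits) divided by 8.  */
unsigned
mask_bits_for_vector (unsigned vl_bits)
{
  assert (vl_bits == 128 || vl_bits == 256 || vl_bits == 512);
  return vl_bits / 8;
}
```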

> -Original Message-
> From: Koval, Julia
> Sent: Monday, December 25, 2017 1:01 PM
> To: Jakub Jelinek <ja...@redhat.com>; Kirill Yukhin <kirill.yuk...@gmail.com>;
> Uros Bizjak <ubiz...@gmail.com>
> Cc: GCC Patches <gcc-patches@gcc.gnu.org>
> Subject: RE: [PATCH] Fix various x86 avx512{bitalg,vpopcntdq,vbmi2} issues (PR
> target/83488)
> 
> Thank you very much for fixing those issues.
> 
> > Note, __builtin_ia32_vpshufbitqmb{128,256,512}_mask are implemented
> > incorrectly, can somebody from Intel handle that?  The inlines in the
> > intrinsic header look correct, but the builtins aren't and what's even worse
> > is that the define_insns are wrong too.  According to the documentation
> > and inline fn, the intrinsics have an __mmask{16,32,64} input mask and
> > also __mmask{16,32,64} output mask.  The builtins use
> > UHI_FTYPE_V2DI_V2DI_UHI
> > USI_FTYPE_V4DI_V4DI_USI
> > UQI_FTYPE_V8DI_V8DI_UQI
> > types (first two are correct, the last one is wrong, should have been
> > UDI_FTYPE_V8DI_V8DI_UDI), and the define_insn has:
> > (set (match_operand:QI 0 ("register_operand") ("=Yk"))
> >      (and:QI (unspec:QI [
> >                (match_operand:V2DI 1 ("register_operand") ("v"))
> >                (match_operand:V2DI 2 ("nonimmediate_operand") ("vm"))
> >              ] 214)
> >              (match_operand:QI 3 ("register_operand") ("Yk"))))
> > (incorrect, should use :HI result and :HI mask input),
> > (set (match_operand:QI 0 ("register_operand") ("=Yk"))
> >      (and:QI (unspec:QI [
> >                (match_operand:V4DI 1 ("register_operand") ("v"))
> >                (match_operand:V4DI 2 ("nonimmediate_operand") ("vm"))
> >              ] 214)
> >              (match_operand:QI 3 ("register_operand") ("Yk"))))
> > (incorrect, should use :SI result and :SI mask input),
> > (set (match_operand:QI 0 ("register_operand") ("=Yk"))
> >      (and:QI (unspec:QI [
> >                (match_operand:V8DI 1 ("register_operand") ("v"))
> >                (match_operand:V8DI 2 ("nonimmediate_operand") ("vm"))
> >              ] 214)
> >              (match_operand:QI 3 ("register_operand") ("Yk"))))
> > (incorrect, should use :DI result and :DI mask input).  Similarly the
> > non-masked patterns, where just the result is incorrect, not the operand 3
> > which doesn't exist).  I'll file a PR to track this.
> 
> I'll fix that.
> 
> Thanks,
> Julia
> 
> > -Original Message-----
> > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> > ow...@gcc.gnu.org] On Behalf Of Jakub Jelinek
> > Sent: Friday, December 22, 2017 7:40 PM
> > To: Kirill Yukhin <kirill.yuk...@gmail.com>; Uros Bizjak <ubiz...@gmail.com>
> > Cc: Koval, Julia <julia.ko...@intel.com>; GCC Patches <gcc-patches@gcc.gnu.org>
> > Subject: [PATCH] Fix various x86 avx512{bitalg,vpopcntdq,vbmi2} issues (PR
> > target/83488)
> >
> > On Fri, Dec 22, 2017 at 03:38:03PM +0300, Kirill Yukhin wrote:
> > > Hello, Julia,
> > > On 12 Nov 12:51, Koval, Julia wrote:
> > > > Hi, this patch enables AVX512BITALG and AVX512VPOPCNTDQ instructions
> > from
> > https://software.intel.com/sites/default/files/managed/c5/15/architecture-
> > instruction-set-extensions-programming-reference.pdf. Ok for trunk?
> > > OK for trunk. I've checked it in.
> >
> > Unfortunately, there are various issues in this patch as well as earlier
> > vbmi2 support.
> >
> > 1) as for various AVX512BITALG and AVX512VPOPCNTDQ builtins we need not
> > just
> > that ISA, but also AVX512VL or AVX512BW or both, these two ISAs need to be
> > moved over from ix86_isa_flags2 to ix86_isa_flags.
> > 2) while the PDF doesn't say that explicitly, for builtins that map to
> > hw insns that don't have AVX512BW listed as CPUID, if they use (or set)
> > 32-bit or 64-bit %k? mask register, we need AVX512BW for the builtin,
> > because otherwise we get ICEs when LRA is trying to load

RE: [PATCH] Fix various x86 avx512{bitalg,vpopcntdq,vbmi2} issues (PR target/83488)

2017-12-25 Thread Koval, Julia
Thank you very much for fixing those issues.

> Note, __builtin_ia32_vpshufbitqmb{128,256,512}_mask are implemented
> incorrectly, can somebody from Intel handle that?  The inlines in the
> intrinsic header look correct, but the builtins aren't and what's even worse
> is that the define_insns are wrong too.  According to the documentation
> and inline fn, the intrinsics have an __mmask{16,32,64} input mask and
> also __mmask{16,32,64} output mask.  The builtins use
> UHI_FTYPE_V2DI_V2DI_UHI
> USI_FTYPE_V4DI_V4DI_USI
> UQI_FTYPE_V8DI_V8DI_UQI
> types (first two are correct, the last one is wrong, should have been
> UDI_FTYPE_V8DI_V8DI_UDI), and the define_insn has:
> (set (match_operand:QI 0 ("register_operand") ("=Yk"))
>      (and:QI (unspec:QI [
>                (match_operand:V2DI 1 ("register_operand") ("v"))
>                (match_operand:V2DI 2 ("nonimmediate_operand") ("vm"))
>              ] 214)
>              (match_operand:QI 3 ("register_operand") ("Yk"))))
> (incorrect, should use :HI result and :HI mask input),
> (set (match_operand:QI 0 ("register_operand") ("=Yk"))
>      (and:QI (unspec:QI [
>                (match_operand:V4DI 1 ("register_operand") ("v"))
>                (match_operand:V4DI 2 ("nonimmediate_operand") ("vm"))
>              ] 214)
>              (match_operand:QI 3 ("register_operand") ("Yk"))))
> (incorrect, should use :SI result and :SI mask input),
> (set (match_operand:QI 0 ("register_operand") ("=Yk"))
>      (and:QI (unspec:QI [
>                (match_operand:V8DI 1 ("register_operand") ("v"))
>                (match_operand:V8DI 2 ("nonimmediate_operand") ("vm"))
>              ] 214)
>              (match_operand:QI 3 ("register_operand") ("Yk"))))
> (incorrect, should use :DI result and :DI mask input).  Similarly the
> non-masked patterns, where just the result is incorrect, not the operand 3
> which doesn't exist).  I'll file a PR to track this.

I'll fix that.

Thanks,
Julia

> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Jakub Jelinek
> Sent: Friday, December 22, 2017 7:40 PM
> To: Kirill Yukhin <kirill.yuk...@gmail.com>; Uros Bizjak <ubiz...@gmail.com>
> Cc: Koval, Julia <julia.ko...@intel.com>; GCC Patches <gcc-patches@gcc.gnu.org>
> Subject: [PATCH] Fix various x86 avx512{bitalg,vpopcntdq,vbmi2} issues (PR
> target/83488)
> 
> On Fri, Dec 22, 2017 at 03:38:03PM +0300, Kirill Yukhin wrote:
> > Hello, Julia,
> > On 12 Nov 12:51, Koval, Julia wrote:
> > > Hi, this patch enables AVX512BITALG and AVX512VPOPCNTDQ instructions
> from
> https://software.intel.com/sites/default/files/managed/c5/15/architecture-
> instruction-set-extensions-programming-reference.pdf. Ok for trunk?
> > OK for trunk. I've checked it in.
> 
> Unfortunately, there are various issues in this patch as well as earlier
> vbmi2 support.
> 
> 1) as for various AVX512BITALG and AVX512VPOPCNTDQ builtins we need not
> just
> that ISA, but also AVX512VL or AVX512BW or both, these two ISAs need to be
> moved over from ix86_isa_flags2 to ix86_isa_flags.
> 2) while the PDF doesn't say that explicitly, for builtins that map to
> hw insns that don't have AVX512BW listed as CPUID, if they use (or set)
> 32-bit or 64-bit %k? mask register, we need AVX512BW for the builtin,
> because otherwise we get ICEs when LRA is trying to load (or store) the
> 32-bit or 64-bit %k? mask register.  Most of the intrin*.h headers got the
> requirements right (but see below), but not i386-builtins.def, so using
> intrin headers was fine, but using builtins directly resulted in numerous
> ICEs.
> 3) some builtins where the define_insns were requiring AVX512VL didn't have
> that requirement on the builtins, so again, numerous ICEs when using the
> builtins directly.
> 4) for some builtins the intrin headers were uselessly requiring avx512bw
> even when it wasn't needed at all (either when they don't have any mask
> argument or when they have an 8-bit or 16-bit only mask).
> 5) the def_builtin/ix86_expand_builtin stuff didn't handle
> OPTION_MASK_ISA_something | OPTION_MASK_ISA_AVX512BW or
> OPTION_MASK_ISA_something | OPTION_MASK_ISA_AVX512VL |
> OPTION_MASK_ISA_AVX512BW
> right (while the VL is handled there as "require the other ISAs and VL",
> for BW we don't do that).  There were some hacks for GFNI and VPCLM

RE: [patch][x86] -march=icelake

2017-12-19 Thread Koval, Julia
>> Maybe [] operator could be used instead of a dynamic handling here.
I had another solution in mind, using enums, so that each element is addressed 
by its index; please take a look at the attached patch.


>>> The natural GCC data structure is a sbitmap ...  I'd rather not use <bitset>
>>> given we have a GCC variant.

Sorry for the possibly stupid question, but how do we set 

  bitmask pta_core2  = pta_64bit | pta_mmx | pta_sse | pta_sse2
   | pta_sse3 | pta_ssse3 | pta_cx16 | pta_fxsr;

in an sbitmap, other than with a chain of bitmap_and_or calls with the third 
bitmap set to all ones (which doesn't look fast)?
Sorry, I think there should be some obvious solution, but I can't find a 
suitable function.

Thanks,
Julia
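
To make the enum-indexed idea concrete, here is a minimal sketch of a fixed-size, multi-word flag set whose elements are addressed by enum index, so callers never need to remember which 64-bit word a flag lives in. Everything here (pta_index, pta_set, pta_make and the index values) is hypothetical and only illustrates the approach; it is neither the attached patch nor GCC's sbitmap API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag indices; the real PTA list is much longer.  */
enum pta_index
{
  PTA_IDX_64BIT, PTA_IDX_MMX, PTA_IDX_SSE, PTA_IDX_SSE2,
  PTA_IDX_GFNI = 70,   /* deliberately beyond the first 64-bit word */
  PTA_IDX_MAX = 128    /* capacity, also used as list terminator */
};

#define PTA_WORDS (PTA_IDX_MAX / 64)

struct pta_set { uint64_t w[PTA_WORDS]; };

/* Set the bit for flag I; the word is derived from the index.  */
void
pta_set_bit (struct pta_set *s, enum pta_index i)
{
  assert (i < PTA_IDX_MAX);
  s->w[i / 64] |= UINT64_C (1) << (i % 64);
}

/* Return nonzero if flag I is present in S.  */
int
pta_test_bit (const struct pta_set *s, enum pta_index i)
{
  assert (i < PTA_IDX_MAX);
  return (int) ((s->w[i / 64] >> (i % 64)) & 1);
}

/* Build a set from a list of indices terminated by PTA_IDX_MAX.  */
struct pta_set
pta_make (const enum pta_index *list)
{
  struct pta_set s;
  memset (&s, 0, sizeof s);
  for (; *list != PTA_IDX_MAX; list++)
    pta_set_bit (&s, *list);
  return s;
}
```

A core2-like set would then be written as a terminated list of indices passed to pta_make, instead of an OR of per-word masks.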

> -Original Message-
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: Tuesday, December 19, 2017 12:56 PM
> To: Uros Bizjak <ubiz...@gmail.com>
> Cc: Koval, Julia <julia.ko...@intel.com>; GCC Patches <gcc-patches@gcc.gnu.org>; Kirill Yukhin <kirill.yuk...@gmail.com>
> Subject: Re: [patch][x86] -march=icelake
> 
> On Tue, Dec 19, 2017 at 9:29 AM, Uros Bizjak <ubiz...@gmail.com> wrote:
> > On Mon, Dec 18, 2017 at 2:42 PM, Koval, Julia <julia.ko...@intel.com> wrote:
> >> Hi, I tried to replace 2 flags variable with c++ bitset(in patch attached).
> >> What do you think?
> >
> > Hm, I'm not a c++ person, but I wonder about overhead and performance
> > impact of this change. Maybe [] operator could be used instead of a
> > dynamic handling here. Please discuss with a c++ person to find out
> > the most appropriate approach.
> 
> The natural GCC data structure is a sbitmap ...  I'd rather not use <bitset>
> given we have a GCC variant.
> 
> >>> Please add these options first.
> >> 2 options left(they are under Kirill's review currently), I'll add PTAs
> >> for them to the patch, as soon as they will be committed.
> >
> > Actually, let's wait for these 2 options to be reviewed and committed
> > first, and after that introduce -march=icelake handling.
> >
> > Uros.


0001-icelake.patch_enums
Description: 0001-icelake.patch_enums


RE: [patch][x86] -march=icelake

2017-12-18 Thread Koval, Julia
Hi, I tried to replace the two flags variables with a C++ bitset (in the 
attached patch). What do you think?

> Please add these options first.
Two options are left (they are currently under Kirill's review); I'll add PTAs 
for them to the patch as soon as they are committed.

Thanks,
Julia


> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Uros Bizjak
> Sent: Sunday, November 12, 2017 5:30 PM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: GCC Patches <gcc-patches@gcc.gnu.org>; Kirill Yukhin
> <kirill.yuk...@gmail.com>
> Subject: Re: [patch][x86] -march=icelake
> 
> On Sun, Nov 12, 2017 at 1:04 AM, Koval, Julia <julia.ko...@intel.com> wrote:
> > Hi, this patch adds new option -march=icelake. Isasets defined in:
> https://software.intel.com/sites/default/files/managed/c5/15/architecture-
> instruction-set-extensions-programming-reference.pdf
> > I didn't add arch code to driver-i386.c, because there is no code available 
> > in
> SDM yet, only for cannonlake
> (https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-
> vol-1-2abcd-3abcd.pdf Chapter 2).
> 
> This means the driver will go through generic detection for
> -march=native. Perhaps a comment should be added, so we won't forget
> to add the model number when one is available.
> 
> > gcc/
> > * config.gcc: Add -march=icelake.
> > * config/i386/driver-i386.c (host_detect_local_cpu): Detect icelake.
> > * config/i386/i386-c.c (ix86_target_macros_internal): Handle 
> > icelake.
> > * config/i386/i386.c (processor_costs): Add m_ICELAKE.
> > (PTA_ICELAKE, PTA2_ICELAKE, PTA2_GFNI, PTA2_AVX512VBMI2,
> PTA2_VAES,
> > PTA2_AVX512VNNI, PTA2_VPCLMULQDQ, PTA2_RDPID,
> PTA2_AVX512BITALG): New.
> > (processor_target_table): Add icelake.
> > (ix86_option_override_internal): Add flags2 for new PTA, handle 
> > GFNI,
> RDPID.
> > (get_builtin_code_for_version): Handle icelake.
> > (M_INTEL_COREI7_ICELAKE): New.
> > * config/i386/i386.h (TARGET_ICELAKE, PROCESSOR_ICELAKE): New.
> > * doc/invoke.texi: Add -march=icelake.
> > gcc/testsuite/
> > * gcc.target/i386/funcspec-56.inc: Handle new march.
> > * g++.dg/ext/mv16.C: Ditto.
> > libgcc/
> > * config/i386/cpuinfo.h (processor_subtypes): Add
> INTEL_COREI7_ICELAKE.
> 
> @@ -3425,6 +3427,13 @@ ix86_option_override_internal (bool main_args_p,
>  #define PTA_AVX5124FMAPS (HOST_WIDE_INT_1 << 61)
>  #define PTA_AVX512VPOPCNTDQ (HOST_WIDE_INT_1 << 62)
>  #define PTA_SGX (HOST_WIDE_INT_1 << 63)
> +#define PTA2_GFNI (HOST_WIDE_INT_1 << 0)
> +#define PTA2_AVX512VBMI2 (HOST_WIDE_INT_1 << 1)
> +#define PTA2_VAES (HOST_WIDE_INT_1 << 2)
> +#define PTA2_AVX512VNNI (HOST_WIDE_INT_1 << 3)
> +#define PTA2_VPCLMULQDQ (HOST_WIDE_INT_1 << 4)
> +#define PTA2_RDPID (HOST_WIDE_INT_1 << 5)
> +#define PTA2_AVX512BITALG (HOST_WIDE_INT_1 << 6)
> 
> Please add these options first.
> 
> On a related note, there should probably be a better way to extend
> various bitmapped flag variables beyond 64bit words. We are constantly
> going over 64bit sizes in target option masks, now the number of
> processor flags doesn't fit in a word anymore. There are several
> places one has to keep in mind in which word some specific flag lives,
> and this  approach opens several ways to make a hard to detect
> mistake. Does C++ offer a more elegant way?
> 
> Below, please find a suggestion of a couple of cosmetic changes.
> 
> Thanks,
> Uros.
> 
> @@ -3425,6 +3427,13 @@ ix86_option_override_internal (bool main_args_p,
>  #define PTA_AVX5124FMAPS (HOST_WIDE_INT_1 << 61)
>  #define PTA_AVX512VPOPCNTDQ (HOST_WIDE_INT_1 << 62)
>  #define PTA_SGX (HOST_WIDE_INT_1 << 63)
> 
> Please add a comment here, that the following belongs to flags2.
> 
> +#define PTA2_GFNI (HOST_WIDE_INT_1 << 0)
> +#define PTA2_AVX512VBMI2 (HOST_WIDE_INT_1 << 1)
> +#define PTA2_VAES (HOST_WIDE_INT_1 << 2)
> 
> 
> @@ -4105,6 +4124,12 @@ ix86_option_override_internal (bool main_args_p,
>  if (processor_alias_table[i].flags & PTA_SGX
>  && !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA_SGX))
>opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_SGX;
> 
> Please add vertical space here to visually separate flags and flags2 
> processing.
> 
> +if (processor_alias_table[i].flags2 & PTA2_RDPID
> +&& !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA_RDPID))
> +  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_RDPID;
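
The hunk above follows a recurring pattern: an ISA is switched on by the selected -march entry only when the user has not already decided that ISA explicitly on the command line. A minimal standalone sketch of that rule, with hypothetical names and bit positions (isa_opts, apply_arch_default and the SKETCH_* values are illustrative assumptions, not GCC's definitions):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the option state and the masks.  */
struct isa_opts
{
  uint64_t isa_flags2;           /* ISAs currently enabled */
  uint64_t isa_flags2_explicit;  /* ISAs the user set or cleared by hand */
};

#define SKETCH_PTA2_RDPID     (UINT64_C (1) << 5)
#define SKETCH_MASK_ISA_RDPID (UINT64_C (1) << 9)

/* Enable ISA_MASK if the architecture entry carries PTA_BIT and the
   user did not make an explicit choice for that ISA.  */
void
apply_arch_default (struct isa_opts *o, uint64_t arch_flags2,
                    uint64_t pta_bit, uint64_t isa_mask)
{
  assert (pta_bit != 0 && isa_mask != 0);
  if ((arch_flags2 & pta_bit) && !(o->isa_flags2_explicit & isa_mask))
    o->isa_flags2 |= isa_mask;
}
```

With this shape, an explicit disable (explicit bit set, flag bit clear) survives the -march default, which is the behaviour the quoted condition encodes.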


0001-icelake.patch
Description: 0001-icelake.patch


RE: [PATCH][i386,AVX] Enable VBMI2 support [5/7]

2017-12-12 Thread Koval, Julia
Here is the patch to update these files with my contributions. Ok for trunk?

Thanks,
Julia

> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Gerald Pfeifer
> Sent: Tuesday, December 12, 2017 11:34 AM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: Kirill Yukhin <kirill.yuk...@gmail.com>; gcc-patches@gcc.gnu.org
> Subject: RE: [PATCH][i386,AVX] Enable VBMI2 support [5/7]
> 
> On Tue, 12 Dec 2017, Koval, Julia wrote:
> > Looks good. How do I put it there (sorry, noob question)?
> 
> Does https://gcc.gnu.org/about.html help?  If not, let me know
> and I'll work with you (and update those docs on the way).
> 
> Of course, even if things work for you, any suggestions on how
> to improve this little page are very welcome. :)
> 
> Gerald


patch
Description: patch


RE: [x86][patch] Fix clwb for skylake

2017-12-12 Thread Koval, Julia
Sorry,

gcc/
* config/i386/i386.c (PTA_SKYLAKE_AVX512): Add PTA_CLWB.
(PTA_CANNONLAKE): Remove PTA_CLWB.

Thanks,
Julia
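
A small sketch of why moving the flag is enough, assuming (this is an assumption, not something stated in the thread) that the Cannonlake mask is composed from the Skylake-AVX512 mask: once CLWB sits in the base mask, the derived mask inherits it. All SK_* names and bit positions below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only.  */
#define SK_PTA_AVX512F        (UINT64_C (1) << 0)
#define SK_PTA_CLWB           (UINT64_C (1) << 1)
#define SK_PTA_AVX512VBMI     (UINT64_C (1) << 2)

/* After the change: CLWB lives in the base mask ...  */
#define SK_PTA_SKYLAKE_AVX512 (SK_PTA_AVX512F | SK_PTA_CLWB)
/* ... and the derived mask no longer lists it, yet still has it.  */
#define SK_PTA_CANNONLAKE     (SK_PTA_SKYLAKE_AVX512 | SK_PTA_AVX512VBMI)

/* Return nonzero if ARCH contains every bit of FLAG.  */
int
arch_has (uint64_t arch, uint64_t flag)
{
  assert (flag != 0);
  return (arch & flag) == flag;
}
```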

> -Original Message-
> From: Uros Bizjak [mailto:ubiz...@gmail.com]
> Sent: Monday, December 11, 2017 9:47 AM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: GCC Patches <gcc-patches@gcc.gnu.org>; Kirill Yukhin
> <kirill.yuk...@gmail.com>
> Subject: Re: [x86][patch] Fix clwb for skylake
> 
> On Mon, Dec 11, 2017 at 9:34 AM, Koval, Julia <julia.ko...@intel.com> wrote:
> > Hi Uros, Kirill,
> > According to isa-extensions doc CLWB appeared first in Skylake-avx512, but 
> > it
> isn't in the PTA. This patch fixes it. Ok for trunk?
> 
> Please also include ChangeLog entry in your patch submission.
> 
> Uros.


RE: [PATCH][i386,AVX] Enable VBMI2 support [5/7]

2017-12-12 Thread Koval, Julia
Looks good. How do I put it there (sorry, noob question)?

Thanks,
Julia

> -Original Message-
> From: Gerald Pfeifer [mailto:ger...@pfeifer.com]
> Sent: Saturday, December 09, 2017 2:49 PM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: Kirill Yukhin <kirill.yuk...@gmail.com>; GCC Patches <gcc-patches@gcc.gnu.org>
> Subject: RE: [PATCH][i386,AVX] Enable VBMI2 support [5/7]
> 
> Hi Julia,
> 
> On Mon, 4 Dec 2017, Koval, Julia wrote:
> > Do you think it is ok to copypaste it from GCC-6?
> 
> you mean copy, paste, and adjust?  Yes, that should work.
> 
> > GCC now supports the Intel CPU, named Cannonlake through
> > -march=cannonlake. The switch enables the following ISA extensions:
> > AVX512VBMI, AVX512IFMA, SHA.
> > GCC now supports the Intel CPU, named and Icelake through
> > -march=icelake. The switch enables the following ISA extensions:
> > AVX512VNNI, GFNI, VAES, AVX512VBMI2, VPCLMULQDQ, AVX512BITALG,
> RDPID,
> > AVX512VPOPCNTDQ.
> 
> No comma before "named".
> 
> -march=
> 
> And perhaps "enables the AVX..., AVX...,and... ISA extensions"?
> 
> Gerald


[x86][patch] Fix clwb for skylake

2017-12-11 Thread Koval, Julia
Hi Uros, Kirill,
According to the ISA extensions document, CLWB first appeared in 
Skylake-AVX512, but it is missing from that PTA. This patch fixes it. Ok for trunk?

Thanks,
Julia


0001-i386.patch
Description: 0001-i386.patch


RE: [PATCH][i386,AVX] Enable VBMI2 support [5/7]

2017-12-03 Thread Koval, Julia
Hi Gerald,
Do you think it is ok to copypaste it from GCC-6?

GCC now supports the Intel CPU, named Cannonlake through -march=cannonlake. The 
switch enables the following ISA extensions: AVX512VBMI, AVX512IFMA, SHA.
GCC now supports the Intel CPU, named and Icelake through -march=icelake. The 
switch enables the following ISA extensions: AVX512VNNI, GFNI, VAES, 
AVX512VBMI2, VPCLMULQDQ, AVX512BITALG, RDPID, AVX512VPOPCNTDQ.

Thanks,
Julia
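
One way to see what a given -march switch actually enables is to count the predefined ISA macros at compile time. A minimal sketch, assuming the usual GCC macro spellings for the extensions listed above; count_icelake_macros is a hypothetical helper:

```c
#include <assert.h>

/* Count how many of the ISA macros for the extensions listed above
   are defined for the current compilation flags.  */
int
count_icelake_macros (void)
{
  int n = 0;
#ifdef __AVX512VNNI__
  n++;
#endif
#ifdef __GFNI__
  n++;
#endif
#ifdef __VAES__
  n++;
#endif
#ifdef __AVX512VBMI2__
  n++;
#endif
#ifdef __VPCLMULQDQ__
  n++;
#endif
#ifdef __AVX512BITALG__
  n++;
#endif
#ifdef __RDPID__
  n++;
#endif
#ifdef __AVX512VPOPCNTDQ__
  n++;
#endif
  assert (n >= 0 && n <= 8);
  return n;
}
```

The count depends entirely on the flags the file is built with; a plain build without any -m options would normally define none of them.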

> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Gerald Pfeifer
> Sent: Sunday, December 03, 2017 6:51 PM
> To: Koval, Julia <julia.ko...@intel.com>; Kirill Yukhin 
> <kirill.yuk...@gmail.com>
> Cc: GCC Patches <gcc-patches@gcc.gnu.org>
> Subject: Re: [PATCH][i386,AVX] Enable VBMI2 support [5/7]
> 
> Hi Julia, hi Kirill,
> 
> On Tue, 24 Oct 2017, Koval, Julia wrote:
> > This patch enables VPSHRD instruction.
> 
> packing a "random" of your contributions.  Can you please also think
> how to best document this in http://gcc.gnu.org/gcc-8/changes.html ?
> 
> Let me know if you need any help with the web side of things (beyond
> the brief notes in https://gcc.gnu.org/about.html )!
> 
> Gerald


RE: [patch] remove cilk-plus

2017-11-30 Thread Koval, Julia
Hi, here is the followup patch. Ok for trunk?

gcc/c-family/
* c-common.h (inv_list): Remove.

Thanks,
Julia

> -Original Message-
> From: Jeff Law [mailto:l...@redhat.com]
> Sent: Monday, November 27, 2017 6:50 PM
> To: Koval, Julia <julia.ko...@intel.com>; Joseph Myers
> <jos...@codesourcery.com>
> Cc: Jakub Jelinek <ja...@redhat.com>; GCC Patches <gcc-patches@gcc.gnu.org>
> Subject: Re: [patch] remove cilk-plus
> 
> On 11/23/2017 02:45 AM, Koval, Julia wrote:
> > Sorry, I think in this version of this patch they are fixed.
> >
> >> -Original Message-
> >> From: Joseph Myers [mailto:jos...@codesourcery.com]
> >> Sent: Wednesday, November 22, 2017 6:23 PM
> >> To: Koval, Julia <julia.ko...@intel.com>
> >> Cc: Jeff Law <l...@redhat.com>; Jakub Jelinek <ja...@redhat.com>; GCC
> >> Patches <gcc-patches@gcc.gnu.org>
> >> Subject: RE: [patch] remove cilk-plus
> >>
> >> This patch version does not appear to address my comment that you're
> >> leaving behind comments in c-parser.c relating to Cilk array notations
> >> while removing the subsequent code.  (Or in one case actually reindenting
> >> the comment that is no longer relevant, rather than removing it.)
> >>
> >> --
> >> Joseph S. Myers
> >> jos...@codesourcery.com
> This version is fine to commit.
> 
> Can you also remove struct inv_list from c-family/c-common.h.  It was
> only used for array notation support.  You can include that in your main
> commit or as a separate follow-up.
> 
> We may well find other nits like inv_list.  I'm comfortable addressing
> those as we bump into them.
> 
> Thanks for taking care of this!
> 
> jeff


0001-cilk-followup.patch
Description: 0001-cilk-followup.patch


[PATCH, committed] Add myself to MAINTAINERS

2017-11-28 Thread Koval, Julia
2017-11-28  Julia Koval  

    * MAINTAINERS (write after approval): Add myself.



RE: [patch] remove cilk-plus

2017-11-23 Thread Koval, Julia
Sorry, I think in this version of this patch they are fixed.

> -Original Message-
> From: Joseph Myers [mailto:jos...@codesourcery.com]
> Sent: Wednesday, November 22, 2017 6:23 PM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: Jeff Law <l...@redhat.com>; Jakub Jelinek <ja...@redhat.com>; GCC
> Patches <gcc-patches@gcc.gnu.org>
> Subject: RE: [patch] remove cilk-plus
> 
> This patch version does not appear to address my comment that you're
> leaving behind comments in c-parser.c relating to Cilk array notations
> while removing the subsequent code.  (Or in one case actually reindenting
> the comment that is no longer relevant, rather than removing it.)
> 
> --
> Joseph S. Myers
> jos...@codesourcery.com


cilk-plus.tar.xz
Description: cilk-plus.tar.xz


RE: [patch] remove cilk-plus

2017-11-22 Thread Koval, Julia
Added fix for gcc/doc/sourcebuild.texi

> -Original Message-
> From: Koval, Julia
> Sent: Wednesday, November 22, 2017 10:15 AM
> To: Rainer Orth <r...@cebitec.uni-bielefeld.de>
> Cc: Jeff Law <l...@redhat.com>; Jakub Jelinek <ja...@redhat.com>; GCC
> Patches <gcc-patches@gcc.gnu.org>
> Subject: RE: [patch] remove cilk-plus
> 
> Changes for these files(except sourcebuild one, will fix that) are included in
> patch I sent. I only removed from the patch deletion of the folders I 
> mentioned.
> 
> Julia
> 
> > -Original Message-
> > From: Rainer Orth [mailto:r...@cebitec.uni-bielefeld.de]
> > Sent: Wednesday, November 22, 2017 10:11 AM
> > To: Koval, Julia <julia.ko...@intel.com>
> > Cc: Jeff Law <l...@redhat.com>; Jakub Jelinek <ja...@redhat.com>; GCC
> > Patches <gcc-patches@gcc.gnu.org>
> > Subject: Re: [patch] remove cilk-plus
> >
> > Hi Julia,
> >
> > >> So it's not important, but the patch doesn't have the removal of the
> > >> cilk+ testsuite or runtime.  But again, it's not a big deal, I can guess
> > >> what that part of the patch looks like.
> > >
> > > I used Jakub's suggestion in
> > > https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01348.html and didn't add
> > > libcilkrts and cilk-plus directories to patch(without this patch doesn't
> > > fit in gcc-patches), only to change log. I will include them, when I'll
> > > commit the patch, but I guess there is nothing to review here:
> > > rm -rf gcc/testsuite/c-c++-common/cilk-plus
> > > rm -rf gcc/testsuite/g++.dg/cilk-plus
> > > rm -rf gcc/testsuite/gcc.dg/cilk-plus
> > > rm -rf libcilkrts
> > > I can send it as an additional patch(or patches) if this is required.
> >
> > there's more in the testsuite:
> >
> > gcc/testsuite/lib/cilk-plus-dg.exp
> > gcc/testsuite/lib/target-supports.exp (check_effective_target_cilkplus,
> > check_effective_target_cilkplus_runtime)
> > gcc/doc/sourcebuild.texi (cilkplus_runtime effective-target keyword)
> >
> > Rainer
> >
> > --
> > -
> > Rainer Orth, Center for Biotechnology, Bielefeld University


cilk-plus.tar.xz
Description: cilk-plus.tar.xz


RE: [patch] remove cilk-plus

2017-11-22 Thread Koval, Julia
Changes for these files (except the sourcebuild one; I will fix that) are 
included in the patch I sent. The only thing I left out of the patch is the 
deletion of the folders I mentioned.

Julia

> -Original Message-
> From: Rainer Orth [mailto:r...@cebitec.uni-bielefeld.de]
> Sent: Wednesday, November 22, 2017 10:11 AM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: Jeff Law <l...@redhat.com>; Jakub Jelinek <ja...@redhat.com>; GCC
> Patches <gcc-patches@gcc.gnu.org>
> Subject: Re: [patch] remove cilk-plus
> 
> Hi Julia,
> 
> >> So it's not important, but the patch doesn't have the removal of the
> >> cilk+ testsuite or runtime.  But again, it's not a big deal, I can guess
> >> what that part of the patch looks like.
> >
> > I used Jakub's suggestion in
> > https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01348.html and didn't add
> > libcilkrts and cilk-plus directories to patch(without this patch doesn't
> > fit in gcc-patches), only to change log. I will include them, when I'll
> > commit the patch, but I guess there is nothing to review here:
> > rm -rf gcc/testsuite/c-c++-common/cilk-plus
> > rm -rf gcc/testsuite/g++.dg/cilk-plus
> > rm -rf gcc/testsuite/gcc.dg/cilk-plus
> > rm -rf libcilkrts
> > I can send it as an additional patch(or patches) if this is required.
> 
> there's more in the testsuite:
> 
> gcc/testsuite/lib/cilk-plus-dg.exp
> gcc/testsuite/lib/target-supports.exp (check_effective_target_cilkplus,
> check_effective_target_cilkplus_runtime)
> gcc/doc/sourcebuild.texi (cilkplus_runtime effective-target keyword)
> 
>   Rainer
> 
> --
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University


RE: [patch] remove cilk-plus

2017-11-22 Thread Koval, Julia
Hi,

> So it's not important, but the patch doesn't have the removal of the
> cilk+ testsuite or runtime.  But again, it's not a big deal, I can guess
> what that part of the patch looks like.

I used Jakub's suggestion in 
https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01348.html and didn't add the 
libcilkrts and cilk-plus directories to the patch (with them the patch doesn't 
fit in gcc-patches), only to the change log. I will include them when I commit 
the patch, but I guess there is nothing to review here:
rm -rf gcc/testsuite/c-c++-common/cilk-plus
rm -rf gcc/testsuite/g++.dg/cilk-plus
rm -rf gcc/testsuite/gcc.dg/cilk-plus
rm -rf libcilkrts
I can send it as an additional patch(or patches) if this is required.

Thanks for your comments; I fixed them in the attached patch.


> -Original Message-
> From: Jeff Law [mailto:l...@redhat.com]
> Sent: Tuesday, November 21, 2017 8:41 AM
> To: Koval, Julia <julia.ko...@intel.com>; Jakub Jelinek <ja...@redhat.com>
> Cc: GCC Patches <gcc-patches@gcc.gnu.org>
> Subject: Re: [patch] remove cilk-plus
> 
> On 11/16/2017 10:02 AM, Koval, Julia wrote:
> > Thanks for your comments, fixed it.
> >
> > 2017-11-16  Julia Koval  <julia.ko...@intel.com>
> > Sebastian Peryt  <sebastian.pe...@intel.com>
> >
> > * Makefile.def (target_modules): Remove libcilkrts.
> > * Makefile.in: Ditto.
> > * configure: Ditto.
> > * configure.ac: Ditto.
> >
> > contrib/
> > * contrib/gcc_update: Ditto.
> >
> > gcc/
> > * Makefile.in (cilkplus.def, cilk-builtins.def, c-family/cilk.o,
> > c-family/c-cilkplus.o, c-family/array-notation-common.o,
> > cilk-common.o, cilk.h, cilk-common.c): Remove.
> > * builtin-types.def
> > (BT_FN_INT_PTR_PTR_PTR_FTYPE_BT_INT_BT_PTR_BT_PTR_BT_PTR):
> Remove.
> > * builtins.c (is_builtin_name): Remove cilkplus condition.
> > (BUILT_IN_CILK_DETACH, BUILT_IN_CILK_POP_FRAME): Remove.
> > * builtins.def (DEF_CILK_BUILTIN_STUB, DEF_CILKPLUS_BUILTIN,
> > cilk-builtins.def, cilkplus.def): Remove.
> > * cif-code.def (CILK_SPAWN): Remove.
> > * cilk-builtins.def: Delete.
> > * cilk-common.c: Ditto.
> > * cilk.h: Ditto.
> > * cilkplus.def: Ditto.
> > * config/darwin.h (fcilkplus): Delete.
> > * cppbuiltin.c: Ditto.
> > * doc/extend.texi: Remove cilkplus doc.
> > * doc/generic.texi: Ditto.
> > * doc/invoke.texi: Ditto.
> > * doc/passes.texi: Ditto.
> > * gcc.c (fcilkplus): Remove.
> > * gengtype.c (cilk.h): Remove.
> > * gimple-pretty-print.c (dump_gimple_omp_for): Remove cilkplus
> support.
> > * gimple.h (GF_OMP_FOR_KIND_CILKFOR,
> GF_OMP_FOR_KIND_CILKSIMD): Remove.
> > * gimplify.c (gimplify_return_expr, maybe_fold_stmt,
> gimplify_call_expr,
> > is_gimple_stmt, gimplify_modify_expr, gimplify_scan_omp_clauses,
> > gimplify_adjust_omp_clauses, gimplify_omp_for, gimplify_expr):
> Remove
> > cilkplus conditions.
> > * ipa-fnsummary.c (ipa_dump_fn_summary, compute_fn_summary,
> > inline_read_section): Ditto.
> > * ipa-inline-analysis.c (cilk.h): Remove.
> > * ira.c (ira_setup_eliminable_regset): Remove cilkplus support.
> > * lto-wrapper.c (merge_and_complain, append_compiler_options,
> > append_linker_options): Remove condition for fcilkplus.
> > * lto/lto-lang.c (cilk.h): Remove.
> > (lto_init): Remove condition for fcilkplus.
> > * omp-expand.c (expand_cilk_for_call): Delete.
> > (expand_omp_taskreg, expand_omp_for_static_chunk,
> > expand_omp_for): Remove cilkplus
> > conditions.
> > (expand_cilk_for): Delete.
> > * omp-general.c (omp_extract_for_data): Remove cilkplus support.
> > * omp-low.c (scan_sharing_clauses, create_omp_child_function,
> > execute_lower_omp, diagnose_sb_0): Ditto.
> > * omp-simd-clone.c (simd_clone_clauses_extract): Ditto.
> > * tree-core.h (OMP_CLAUSE__CILK_FOR_COUNT_): Delete.
> > * tree-nested.c: Ditto.
> > * tree-pretty-print.c (dump_omp_clause): Remove cilkplus support.
> > (dump_generic_node): Ditto.
> > * tree.c (OMP_CLAUSE__CILK_FOR_COUNT_): Delete.
> > * tree.def (cilk_simd, cilk_for, cilk_spawn_stmt, cilk_sync_stmt): 
> > Delete.
> > * tree.h (CILK_SPAWN_FN, EXPR_CILK_SPAWN): Delete.
> >
> > gcc/c-family/
> > * array-notation-common.c: Delete.
> > * c-cilkplus.c: Ditto.
> > * c-common.c (_Cilk_spawn, _Cilk_sync, _Cilk_for): Remove.
> > * c-common.def (ARRAY_NOTATIO

[patch][x86] skylake costs

2017-11-17 Thread Koval, Julia
Hi, this patch introduces a separate cost model for skylake-avx512. Ok for trunk?

gcc/
* config/i386/i386.c (processor_target_table): Add skylake_cost for
skylake-avx512.
* config/i386/x86-tune-costs.h (skylake_memcpy, skylake_memset,
skylake_cost): New.
Thanks,
Julia


0001-cost-model.patch
Description: 0001-cost-model.patch


RE: [patch] remove cilk-plus

2017-11-16 Thread Koval, Julia
dg/cilk-plus/AN/braced_list.c: Delete.
* g++.dg/cilk-plus/AN/builtin_fn_custom_tplt.c: Delete.
* g++.dg/cilk-plus/AN/builtin_fn_mutating_tplt.c: Delete.
* g++.dg/cilk-plus/AN/fp_triplet_values_tplt.c: Delete.
* g++.dg/cilk-plus/AN/postincr_test.c: Delete.
* g++.dg/cilk-plus/AN/preincr_test.c: Delete.
* g++.dg/cilk-plus/CK/catch_exc.c: Delete.
* g++.dg/cilk-plus/CK/cf3.c: Delete.
* g++.dg/cilk-plus/CK/cilk-for-tplt.c: Delete.
* g++.dg/cilk-plus/CK/const_spawn.c: Delete.
* g++.dg/cilk-plus/CK/fib-opr-overload.c: Delete.
* g++.dg/cilk-plus/CK/fib-tplt.c: Delete.
* g++.dg/cilk-plus/CK/for1.c: Delete.
* g++.dg/cilk-plus/CK/lambda_spawns.c: Delete.
* g++.dg/cilk-plus/CK/lambda_spawns_tplt.c: Delete.
* g++.dg/cilk-plus/CK/pr60586.c: Delete.
* g++.dg/cilk-plus/CK/pr66326.c: Delete.
* g++.dg/cilk-plus/CK/pr68001.c: Delete.
* g++.dg/cilk-plus/CK/pr68997.c: Delete.
* g++.dg/cilk-plus/CK/pr69024.c: Delete.
* g++.dg/cilk-plus/CK/pr69048.c: Delete.
* g++.dg/cilk-plus/CK/pr69267.c: Delete.
* g++.dg/cilk-plus/CK/pr80038.c: Delete.
* g++.dg/cilk-plus/CK/stl_iter.c: Delete.
* g++.dg/cilk-plus/CK/stl_rev_iter.c: Delete.
* g++.dg/cilk-plus/CK/stl_test.c: Delete.
* g++.dg/cilk-plus/cilk-plus.exp: Delete.
* g++.dg/cilk-plus/ef_test.C: Delete.
* g++.dg/cilk-plus/for.C: Delete.
* g++.dg/cilk-plus/for2.C: Delete.
* g++.dg/cilk-plus/for3.C: Delete.
* g++.dg/cilk-plus/for4.C: Delete.
* g++.dg/cilk-plus/pr60967.C: Delete.
* g++.dg/cilk-plus/pr69028.C: Delete.
* g++.dg/cilk-plus/pr70565.C: Delete.
* g++.dg/pr57662.C: Delete.
* gcc.dg/cilk-plus/cilk-plus.exp: Delete.
* gcc.dg/cilk-plus/for1.c: Delete.
* gcc.dg/cilk-plus/for2.c: Delete.
* gcc.dg/cilk-plus/jump-openmp.c: Delete.
* gcc.dg/cilk-plus/jump.c: Delete.
* gcc.dg/cilk-plus/pr69798-1.c: Delete.
* gcc.dg/cilk-plus/pr69798-2.c: Delete.
* gcc.dg/cilk-plus/pr78306.c: Delete.
* gcc.dg/cilk-plus/pr79116.c: Delete.
* gcc.dg/graphite/id-28.c: Delete.
* lib/cilk-plus-dg.exp: Delete.
* lib/target-supports.exp (cilkplus_runtime): Delete.

libcilkrts: Delete

> -----Original Message-----
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: Thursday, November 16, 2017 4:49 PM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: GCC Patches <gcc-patches@gcc.gnu.org>; l...@redhat.com
> Subject: Re: [patch] remove cilk-plus
> 
> On Thu, Nov 16, 2017 at 03:33:40PM +, Koval, Julia wrote:
> > // I failed to send patch itself, it is too big even in gzipped form.  What
> > is the right way to send such big patches?
> 
> Don't include the libcilkrts subtree in the patch nor /cilk-plus/
> testcases that are going to be removed?
> 
> > Hi, this patch removes cilkplus. Ok for trunk?
> >
> > 2017-11-16  Julia Koval  <julia.ko...@intel.com>
> > Sebastian Peryt  <sebastian.pe...@intel.com>
> > gcc/
> > * Makefile.def (target_modules): Remove libcilkrts.
> > * Makefile.in: Ditto.
> > * configure: Ditto.
> > * configure.ac: Ditto.
> 
> The ChangeLog needs work, e.g. we have many different ChangeLog files and
> changes should be relative to that.  The above entries are for toplevel.
> 
> > * contrib/gcc_update: Ditto.
> 
> This one is for contrib/ChangeLog, so should be without contrib/
> in the entry.
> 
> > * Makefile.in (cilkplus.def, cilk-builtins.def, c-family/cilk.o,
> > c-family/c-cilkplus.o, c-family/array-notation-common.o,
> > cilk-common.o, cilk.h, cilk-common.c): Remove.
> > * builtin-types.def
> (BT_FN_INT_PTR_PTR_PTR_FTYPE_BT_INT_BT_PTR_BT_PTR
> > _BT_PTR): Remove.
> 
> There should be no linebreaks within one identifier.  So
>   * builtin-types.def
>   (BT_FN_INT_PTR_PTR_PTR_FTYPE_BT_INT_BT_PTR_BT_PTR_BT_PTR):
> Remove.
> 
> > * c-family/array-notation-common.c: Delete.
> > * c-family/c-cilkplus.c: Ditto.
> > * c-family/c-common.c (_Cilk_spawn, _Cilk_sync, _Cilk_for): Remove.
> > * c-family/c-common.def (ARRAY_NOTATION_REF): Remove.
> > * c-family/c-common.h (RID_CILK_SPAWN, build_array_notation_expr,
> > build_array_notation_ref, C_ORT_CILK, c_check_cilk_loop,
> > c_validate_cilk_plus_loop, cilkplus_an_parts,
> cilk_ignorable_spawn_rhs_op,
> > cilk_recognize_spawn): Remove.
> > * c-family/c-gimplify.c (CILK_SPAWN_STMT): Remove.
> > * c-family/c-omp.c: Remove CILK_SIMD check.
> > * c-family/c-pragma.c: Ditto.
> > * c-family/c-pragma.h: Remove CILK related pragmas.

[patch] remove cilk-plus

2017-11-16 Thread Koval, Julia
// I failed to send the patch itself; it is too big even in gzipped form.  What is
the right way to send such big patches?

Hi, this patch removes cilkplus. Ok for trunk?

2017-11-16  Julia Koval  
Sebastian Peryt  
gcc/
* Makefile.def (target_modules): Remove libcilkrts.
* Makefile.in: Ditto.
* configure: Ditto.
* configure.ac: Ditto.
* contrib/gcc_update: Ditto.
* Makefile.in (cilkplus.def, cilk-builtins.def, c-family/cilk.o, 
c-family/c-cilkplus.o, c-family/array-notation-common.o,
cilk-common.o, cilk.h, cilk-common.c): Remove.
* builtin-types.def (BT_FN_INT_PTR_PTR_PTR_FTYPE_BT_INT_BT_PTR_BT_PTR
_BT_PTR): Remove.
* builtins.c (is_builtin_name): Remove cilkplus condition.
(BUILT_IN_CILK_DETACH, BUILT_IN_CILK_POP_FRAME): Remove.
* builtins.def (DEF_CILK_BUILTIN_STUB, DEF_CILKPLUS_BUILTIN,
cilk-builtins.def, cilkplus.def): Remove.
* c-family/array-notation-common.c: Delete.
* c-family/c-cilkplus.c: Ditto.
* c-family/c-common.c (_Cilk_spawn, _Cilk_sync, _Cilk_for): Remove.
* c-family/c-common.def (ARRAY_NOTATION_REF): Remove.
* c-family/c-common.h (RID_CILK_SPAWN, build_array_notation_expr,
build_array_notation_ref, C_ORT_CILK, c_check_cilk_loop,
c_validate_cilk_plus_loop, cilkplus_an_parts, 
cilk_ignorable_spawn_rhs_op,
cilk_recognize_spawn): Remove.
* c-family/c-gimplify.c (CILK_SPAWN_STMT): Remove.
* c-family/c-omp.c: Remove CILK_SIMD check.
* c-family/c-pragma.c: Ditto.
* c-family/c-pragma.h: Remove CILK related pragmas.
* c-family/c-pretty-print.c (c_pretty_printer::postfix_expression): 
Remove
ARRAY_NOTATION_REF condition.
(c_pretty_printer::expression): Ditto.
* c-family/c.opt (fcilkplus): Remove.
* c-family/cilk.c: Delete.
* c/Make-lang.in (c/c-array-notation.o): Remove.
* c/c-array-notation.c: Delete.
* c/c-decl.c: Remove cilkplus condition.
* c/c-parser.c (c_parser_cilk_simd, c_parser_cilk_for,
c_parser_cilk_verify_simd, c_parser_array_notation,
c_parser_cilk_clause_vectorlength, c_parser_cilk_grainsize,
c_parser_cilk_simd_fn_vector_attrs, c_finish_cilk_simd_fn_tokens): 
Delete.
(c_parser_declaration_or_fndef): Remove cilkplus condition.
(c_parser_direct_declarator_inner): Ditto.
(CILK_SIMD_FN_CLAUSE_MASK): Delete.
(c_parser_attributes): Remove cilk-plus condition.
(c_parser_compound_statement): Ditto.
(c_parser_statement_after_labels): Ditto.
(c_parser_if_statement): Ditto.
(c_parser_switch_statement): Ditto.
(c_parser_while_statement): Ditto.
(c_parser_do_statement): Ditto.
(c_parser_for_statement): Ditto.
(c_parser_unary_expression): Ditto.
(c_parser_postfix_expression): Ditto.
(c_parser_postfix_expression_after_primary): Ditto.
(c_parser_pragma): Ditto.
(c_parser_omp_clause_name): Ditto.
(c_parser_omp_all_clauses): Ditto.
(c_parser_omp_for_loop): Ditto.
(c_finish_omp_declare_simd): Ditto.
* c/c-typeck.c (build_array_ref, build_function_call_vec, 
convert_arguments,
lvalue_p, build_compound_expr, c_finish_return, c_finish_if_stmt,
c_finish_loop, build_binary_op): Remove cilkplus condition.
* cif-code.def (CILK_SPAWN): Remove.
* cilk-builtins.def: Delete.
* cilk-common.c: Ditto.
* cilk.h: Ditto.
* cilkplus.def: Ditto.
* config/darwin.h (fcilkplus): Delete.
* cp/Make-lang.in (cp/cp-array-notation.o, cp/cp-cilkplus.o): Delete.
* cp/call.c (convert_for_arg_passing, build_cxx_call): Remove cilkplus.
* cp/constexpr.c (potential_constant_expression_1): Ditto.
* cp/cp-array-notation.c: Delete.
* cp/cp-cilkplus.c: Ditto.
* cp/cp-cilkplus.h: Ditto.
* cp/cp-gimplify.c (cp_gimplify_expr, cp_fold_r, cp_genericize): Remove
cilkplus condition.
* cp/cp-objcp-common.c (ARRAY_NOTATION_REF): Delete.
* cp/cp-tree.h (cilkplus_an_triplet_types_ok_p): Delete.
* cp/decl.c (grokfndecl, finish_function): Remove cilkplus condition.
* cp/error.c (dump_decl, dump_expr): Remove ARRAY_NOTATION_REF 
condition.
* cp/lambda.c (cp-cilkplus.h): Remove.
* cp/parser.c (cp_parser_cilk_simd, cp_parser_cilk_for,
cp_parser_cilk_simd_vectorlength): Delete.
(cp_debug_parser, cp_parser_ctor_initializer_opt_and_function_body,
cp_parser_postfix_expression, cp_parser_postfix_open_square_expression,
cp_parser_statement, cp_parser_jump_statement, 
cp_parser_direct_declarator,
cp_parser_late_return_type_opt, cp_parser_gnu_attribute_list,
cp_parser_omp_clause_name, cp_parser_omp_clause_aligned,

[x86,avx][patch] Fix PR82983

2017-11-14 Thread Koval, Julia
Hi, this patch fixes the GFNI check, which didn't work properly in the gfni+sse case.

gcc/
* config/i386/gfniintrin.h: Add sse check.
* config/i386/i386.c (ix86_expand_builtin): Fix gfni check.


0001-fix-gfni.patch
Description: 0001-fix-gfni.patch


RE: [x86,avx][patch] Fix PR82983

2017-11-14 Thread Koval, Julia
Didn't get in the list for some reason.

> -----Original Message-----
> From: Koval, Julia
> Sent: Tuesday, November 14, 2017 10:29 AM
> To: GCC Patches <gcc-patches@gcc.gnu.org>
> Cc: Kirill Yukhin <kirill.yuk...@gmail.com>
> Subject: [x86,avx][patch] Fix PR82983
> 
> Hi, this patch fixes the GFNI check, which didn't work properly in the gfni+sse case.
> 
> gcc/
>   * config/i386/gfniintrin.h: Add sse check.
>   * config/i386/i386.c (ix86_expand_builtin): Fix gfni check.


0001-fix-gfni.patch
Description: 0001-fix-gfni.patch


RE: [x86][patch] Add -march=cannonlake.

2017-11-13 Thread Koval, Julia
Hi, here is the follow-up patch to add skylake-avx512.
gcc/
* config/i386/driver-i386.c (host_detect_local_cpu): Detect 
skylake-avx512.

> -----Original Message-----
> From: Uros Bizjak [mailto:ubiz...@gmail.com]
> Sent: Sunday, November 12, 2017 5:05 PM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: GCC Patches <gcc-patches@gcc.gnu.org>; Kirill Yukhin
> <kirill.yuk...@gmail.com>
> Subject: Re: [x86][patch] Add -march=cannonlake.
> 
> On Sat, Nov 11, 2017 at 10:10 PM, Koval, Julia <julia.ko...@intel.com> wrote:
> > Hi Uros,
> > I fixed comments.
> > Btw, I haven't found skylake-avx512 in driver-i386.c at all. Is it intended
> > or should I add it?
> 
> It looks like an oversight to me. If there is no "skylake-avx512"
> model, then the driver goes through "This is unknown ..." for
> -march=native and hopefully chooses the next most appropriate choice.
> Please add "skylake-avx512" in a follow-up patch.
> 
> > Thanks,
> > Julia
> >
> > gcc/
> > * config.gcc: Add -march=cannonlake.
> > * config/i386/driver-i386.c (host_detect_local_cpu): Detect 
> > cannonlake.
> > * config/i386/i386-c.c (ix86_target_macros_internal): Handle 
> > cannonlake.
> > * config/i386/i386.c (processor_costs): Add m_CANNONLAKE.
> > (PTA_CANNONLAKE): New.
> > (processor_target_table): Add cannonlake.
> > (ix86_option_override_internal): Ditto.
> > (fold_builtin_cpu): Ditto.
> > (get_builtin_code_for_version): Handle cannonlake.
> > (M_INTEL_COREI7_CANNONLAKE): New.
> > * config/i386/i386.h (TARGET_CANNONLAKE,
> PROCESSOR_CANNONLAKE): New.
> > * doc/invoke.texi: Add -march=cannonlake.
> > gcc/testsuite/
> > * gcc.target/i386/funcspec-56.inc: Handle new march.
> > * g++.dg/ext/mv16.C: Ditto.
> > libgcc/
> > * config/i386/cpuinfo.c (get_intel_cpu): Handle cannonlake.
> > * config/i386/cpuinfo.h (processor_subtypes): Add
> INTEL_COREI7_CANNONLAKE.
> 
> OK.
> 
> Thanks,
> Uros.


0001-skylake-512.patch
Description: 0001-skylake-512.patch


[patch][x86,avx] Enable AVX512BITALG

2017-11-12 Thread Koval, Julia
Hi, this patch enables AVX512BITALG and AVX512VPOPCNTDQ instructions from 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf.
 Ok for trunk?

Thanks,
Julia


Julia Koval 
Sebastian Peryt 
gcc/
* common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512BITALG_SET,
OPTION_MASK_ISA_AVX512BITALG_UNSET): New.
(ix86_handle_option): Handle -mavx512bitalg, fix 4VNNIW formatting.
* config.gcc: Add avx512vpopcntdqvlintrin.h and avx512bitalgintrin.h.
* config/i386/avx512bitalgintrin.h (_mm512_popcnt_epi8, 
_mm512_popcnt_epi16,
_mm512_mask_popcnt_epi8, _mm512_maskz_popcnt_epi8, 
_mm512_mask_popcnt_epi16,
_mm512_maskz_popcnt_epi16, _mm512_bitshuffle_epi64_mask, 
_mm256_popcnt_epi8,
_mm512_mask_bitshuffle_epi64_mask, _mm256_mask_popcnt_epi8, 
_mm_popcnt_epi8,
_mm256_maskz_popcnt_epi8, _mm_bitshuffle_epi64_mask, 
_mm256_popcnt_epi16,
_mm_mask_bitshuffle_epi64_mask, _mm256_bitshuffle_epi64_mask,
_mm256_mask_bitshuffle_epi64_mask, _mm_popcnt_epi16, 
_mm_maskz_popcnt_epi8,
_mm256_mask_popcnt_epi16, _mm256_maskz_popcnt_epi16, 
_mm_mask_popcnt_epi8,
_mm_mask_popcnt_epi16, _mm_maskz_popcnt_epi16): New intrinsics.
* config/i386/avx512vpopcntdqvlintrin.h (_mm_popcnt_epi32, 
_mm_popcnt_epi64,
_mm_mask_popcnt_epi32, _mm_maskz_popcnt_epi32, _mm256_popcnt_epi32,
_mm256_mask_popcnt_epi32, _mm256_maskz_popcnt_epi32, 
_mm_mask_popcnt_epi64,
_mm_maskz_popcnt_epi64, _mm256_popcnt_epi64, _mm256_mask_popcnt_epi64,
_mm256_maskz_popcnt_epi64): New intrinsics.
* config/i386/cpuid.h (bit_AVX512BITALG): New bit.
* config/i386/driver-i386.c (host_detect_local_cpu): Detect 
-mavx512bitalg.
* config/i386/i386-builtin-types.def (V64QI_FTYPE_V64QI, 
V64QI_FTYPE_V64QI,
V4DI_FTYPE_V4DI, UHI_FTYPE_V2DI_V2DI_UHI, USI_FTYPE_V4DI_V4DI_USI,
V4SI_FTYPE_V4SI_V4SI_UHI, V8SI_FTYPE_V8SI_V8SI_UHI): New types.
* config/i386/i386-builtin.def (__builtin_ia32_vpopcountq_v4di,
__builtin_ia32_vpopcountq_v4di_mask, __builtin_ia32_vpopcountq_v2di,
__builtin_ia32_vpopcountq_v2di_mask, __builtin_ia32_vpopcountd_v4si,
__builtin_ia32_vpopcountd_v4si_mask, __builtin_ia32_vpopcountd_v8si,
__builtin_ia32_vpopcountd_v8si_mask, __builtin_ia32_vpopcountb_v64qi,
__builtin_ia32_vpopcountb_v64qi_mask, __builtin_ia32_vpopcountb_v32qi,
__builtin_ia32_vpopcountb_v32qi_mask, __builtin_ia32_vpopcountb_v16qi,
__builtin_ia32_vpopcountb_v16qi_mask, __builtin_ia32_vpopcountw_v32hi,
__builtin_ia32_vpopcountw_v32hi_mask, __builtin_ia32_vpopcountw_v16hi,
__builtin_ia32_vpopcountw_v16hi_mask, __builtin_ia32_vpopcountw_v8hi,
__builtin_ia32_vpopcountw_v8hi_mask, 
__builtin_ia32_vpshufbitqmb128_mask,
__builtin_ia32_vpshufbitqmb256_mask,
__builtin_ia32_vpshufbitqmb512_mask): New builtins.
* config/i386/i386-c.c (__AVX512BITALG__): New.
* config/i386/i386.c (isa2_opts): Add -mavx512bitalg.
(ix86_valid_target_attribute_inner_p): Ditto.
(ix86_expand_args_builtin): Handle new types.
* config/i386/i386.h (TARGET_AVX512BITALG, TARGET_AVX512BITALG_P): New.
* config/i386/i386.opt: Add -mavx512bitalg.
* config/i386/immintrin.h: Add avx512vpopcntdqvlintrin.h and
avx512bitalgintrin.h.
* config/i386/sse.md (VI48_AVX512VLBW): New iterator.
(vpopcount): Add more types.
(avx512vl_vpshufbitqmb): New.
* doc/invoke.texi: Add -mavx512bitalg and -mavx512vpopcntdq.
gcc/testsuite/
* g++.dg/other/i386-2.C: Add new options.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/sse-12.c: Ditto.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/avx512-check.h: Handle bit_AVX512BITALG.
* gcc.target/i386/avx512bitalg-vpopcntb-1.c: New.
* gcc.target/i386/avx512bitalg-vpopcntb.c: Ditto.
* gcc.target/i386/avx512bitalg-vpopcntbvl.c: Ditto.
* gcc.target/i386/avx512bitalg-vpopcntw-1.c: Ditto.
* gcc.target/i386/avx512bitalg-vpopcntw.c: Ditto.
* gcc.target/i386/avx512bitalg-vpopcntwvl.c: Ditto.
* gcc.target/i386/avx512bitalg-vpshufbitqmb-1.c: Ditto.
* gcc.target/i386/avx512bitalg-vpshufbitqmb.c: Ditto.
* gcc.target/i386/avx512bitalgvl-vpopcntb-1.c: Ditto.
* gcc.target/i386/avx512bitalgvl-vpopcntw-1.c: Ditto.
* gcc.target/i386/avx512bitalgvl-vpshufbitqmb-1.c: Ditto.
* gcc.target/i386/avx512vpopcntdqvl-vpopcntd-1.c: Ditto.
* gcc.target/i386/avx512vpopcntdqvl-vpopcntq-1.c: Ditto.
* gcc.target/i386/i386.exp (check_effective_target_avx512bitalg): New.
* gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c: 
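
For readers unfamiliar with the per-byte population count that the new popcnt_epi8 intrinsics expose, here is a portable scalar sketch of the operation. It is not the code from the patch: the function names popcnt_u8, popcnt_epi8_ref and mask_popcnt_epi8_ref are hypothetical, and the masked variant assumes the usual AVX-512 merge-masking convention, where lanes whose mask bit is clear keep their previous value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the set bits in one byte by clearing the lowest set bit
   until nothing is left.  */
uint8_t
popcnt_u8 (uint8_t x)
{
  uint8_t n = 0;
  while (x)
    {
      x &= (uint8_t) (x - 1);
      n++;
    }
  return n;
}

/* Scalar model of an unmasked per-byte population count over LEN lanes
   (hypothetical helper, not part of the patch).  */
void
popcnt_epi8_ref (const uint8_t *src, uint8_t *dst, size_t len)
{
  assert (src != NULL && dst != NULL);
  for (size_t i = 0; i < len; i++)
    dst[i] = popcnt_u8 (src[i]);
}

/* Merge-masked variant: lane I is written only when bit I of MASK is set;
   other lanes of DST are left untouched.  At most 64 lanes fit the mask.  */
void
mask_popcnt_epi8_ref (const uint8_t *src, uint8_t *dst, uint64_t mask,
                      size_t len)
{
  assert (src != NULL && dst != NULL && len <= 64);
  for (size_t i = 0; i < len; i++)
    if ((mask >> i) & 1)
      dst[i] = popcnt_u8 (src[i]);
}
```

A zero-masking variant would differ only in writing 0 to the lanes whose mask bit is clear.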

[patch][x86] -march=icelake

2017-11-11 Thread Koval, Julia
Hi, this patch adds new option -march=icelake. Isasets defined in: 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
I didn't add arch code to driver-i386.c, because there is no code available in 
SDM yet, only for cannonlake 
(https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf
 Chapter 2).

gcc/
* config.gcc: Add -march=icelake.
* config/i386/driver-i386.c (host_detect_local_cpu): Detect icelake.
* config/i386/i386-c.c (ix86_target_macros_internal): Handle icelake.
* config/i386/i386.c (processor_costs): Add m_ICELAKE.
(PTA_ICELAKE, PTA2_ICELAKE, PTA2_GFNI, PTA2_AVX512VBMI2, PTA2_VAES,
PTA2_AVX512VNNI, PTA2_VPCLMULQDQ, PTA2_RDPID, PTA2_AVX512BITALG): New.
(processor_target_table): Add icelake.
(ix86_option_override_internal): Add flags2 for new PTA, handle GFNI, 
RDPID.
(get_builtin_code_for_version): Handle icelake.
(M_INTEL_COREI7_ICELAKE): New.
* config/i386/i386.h (TARGET_ICELAKE, PROCESSOR_ICELAKE): New.
* doc/invoke.texi: Add -march=icelake.
gcc/testsuite/
* gcc.target/i386/funcspec-56.inc: Handle new march.
* g++.dg/ext/mv16.C: Ditto.
libgcc/
* config/i386/cpuinfo.h (processor_subtypes): Add INTEL_COREI7_ICELAKE.


0001-icelake.patch
Description: 0001-icelake.patch


RE: [x86][patch] Add -march=cannonlake.

2017-11-11 Thread Koval, Julia
Hi Uros,
I fixed comments.
Btw, I haven't found skylake-avx512 in driver-i386.c at all. Is it intended or 
should I add it?

Thanks,
Julia

gcc/
* config.gcc: Add -march=cannonlake.
* config/i386/driver-i386.c (host_detect_local_cpu): Detect cannonlake.
* config/i386/i386-c.c (ix86_target_macros_internal): Handle cannonlake.
* config/i386/i386.c (processor_costs): Add m_CANNONLAKE.
(PTA_CANNONLAKE): New.
(processor_target_table): Add cannonlake.
(ix86_option_override_internal): Ditto.
(fold_builtin_cpu): Ditto.
(get_builtin_code_for_version): Handle cannonlake.
(M_INTEL_COREI7_CANNONLAKE): New.
* config/i386/i386.h (TARGET_CANNONLAKE, PROCESSOR_CANNONLAKE): New.
* doc/invoke.texi: Add -march=cannonlake.
gcc/testsuite/
* gcc.target/i386/funcspec-56.inc: Handle new march.
* g++.dg/ext/mv16.C: Ditto.
libgcc/
* config/i386/cpuinfo.c (get_intel_cpu): Handle cannonlake.
* config/i386/cpuinfo.h (processor_subtypes): Add 
INTEL_COREI7_CANNONLAKE.

> -----Original Message-----
> From: Uros Bizjak [mailto:ubiz...@gmail.com]
> Sent: Wednesday, November 08, 2017 8:45 PM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: GCC Patches <gcc-patches@gcc.gnu.org>; Kirill Yukhin
> <kirill.yuk...@gmail.com>
> Subject: Re: [x86][patch] Add -march=cannonlake.
> 
> On Wed, Nov 8, 2017 at 9:02 AM, Koval, Julia <julia.ko...@intel.com> wrote:
> > Attachment got lost.
> >
> >> -----Original Message-----
> >> From: Koval, Julia
> >> Sent: Wednesday, November 08, 2017 9:01 AM
> >> To: 'GCC Patches' <gcc-patches@gcc.gnu.org>
> >> Cc: 'Uros Bizjak' <ubiz...@gmail.com>; 'Kirill Yukhin'
> <kirill.yuk...@gmail.com>
> >> Subject: RE: [x86][patch] Add -march=cannonlake.
> >>
> >> Hi, this patch adds new option -march=cannonlake. Isasets defined in:
> >> https://software.intel.com/sites/default/files/managed/c5/15/architecture-
> >> instruction-set-extensions-programming-reference.pdf
> >>
> >> Ok for trunk?
> >>
> >> gcc/
> >>   * config.gcc: Add -march=cannonlake.
> >>   * config/i386/driver-i386.c (host_detect_local_cpu): Detect 
> >> cannonlake.
> >>   * config/i386/i386-c.c (ix86_target_macros_internal): Handle
> >> cannonlake.
> >>   * config/i386/i386.c (processor_costs): Add m_CANNONLAKE.
> >>   (PTA_CANNONLAKE): New.
> >>   (processor_target_table): Add cannonlake.
> >>   (ix86_option_override_internal): Ditto.
> >>   (fold_builtin_cpu): Ditto.
> >>   (get_builtin_code_for_version): Handle cannonlake.
> >>   (M_INTEL_CANNONLAKE): New.
> >>   * config/i386/i386.h (TARGET_CANNONLAKE,
> >> PROCESSOR_CANNONLAKE): New.
> >>   * doc/invoke.texi: Add -march=cannonlake.
> >> gcc/testsuite/
> >>   * gcc.target/i386/funcspec-56.inc: Handle new march.
> 
> --- a/gcc/config/i386/driver-i386.c
> +++ b/gcc/config/i386/driver-i386.c
> @@ -803,9 +803,11 @@ const char *host_detect_local_cpu (int argc,
> const char **argv)
>  default:
>if (arch)
>  {
> +  if (has_avx512vbmi)
> +cpu = "cannonlake";
>/* This is unknown family 0x6 CPU.  */
>/* Assume Knights Landing.  */
> -  if (has_avx512f)
> +  else if (has_avx512f)
>  cpu = "knl";
>/* Assume Knights Mill */
>else if (has_avx5124vnniw)
> 
> You should add correct model numbers under the PROCESSOR_PENTIUMPRO case. The above is for the unknown case (which should
> not happen), and it should read (note that "knl" is already
> misplaced):
> 
> default:
>   /* This is unknown family 0x6 CPU.  */
>   if (arch)
> {
>   /* Assume Cannonlake.  */
>   if (has_avx512vbmi)
> cpu = "cannonlake";
>   /* Assume Knights Mill */
>   else if (has_avx5124vnniw)
> cpu = "knm";
>   /* Assume Skylake.  */
>   else if (has_clflushopt)
> cpu = "skylake";
>   /* Assume Knights Landing.  */
>   else if (has_avx512f)
> cpu = "knl";
>   /* Assume Broadwell.  */
>   else if (has_adx)
> ...
> 
> 
> @@ -31832,7 +31839,8 @@ fold_builtin_cpu (tree fndecl, tree *args)
>  M_INTEL_COREI7_HASWELL,
>  M_INTEL_COREI7_BROADWELL,
>  M_INTEL_COREI7_SKYLAKE,
> -M_INTEL_COREI7_SKYLAKE_AVX512
> +M_INTEL_COREI7_SKYLAKE_AVX512,
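
The fallback ordering suggested in the review can be sketched in isolation. This is a minimal illustration and not the code in driver-i386.c: the struct cpu_features and the function guess_family6_cpu are hypothetical names, and the branches that the review elides after the Knights Landing check are represented here by returning NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical container for the feature flags named in the review;
   the real driver keeps them in separate local variables.  */
struct cpu_features
{
  bool has_avx512vbmi;
  bool has_avx5124vnniw;
  bool has_clflushopt;
  bool has_avx512f;
};

/* Guess a -march name for an unknown family 0x6 CPU, testing the flags
   in the order given in the review.  Returns NULL when none of these
   checks match; the remaining checks are elided in the review and are
   therefore not modelled here.  */
const char *
guess_family6_cpu (const struct cpu_features *f)
{
  assert (f != NULL);
  /* Assume Cannonlake.  */
  if (f->has_avx512vbmi)
    return "cannonlake";
  /* Assume Knights Mill.  */
  if (f->has_avx5124vnniw)
    return "knm";
  /* Assume Skylake.  */
  if (f->has_clflushopt)
    return "skylake";
  /* Assume Knights Landing.  */
  if (f->has_avx512f)
    return "knl";
  return NULL;
}
```

Because the first matching test wins, a CPU that reports several of these flags at once is classified by the earliest one in the chain, which is why the order of the checks matters.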

[PATCH][i386,AVX] Enable VPCLMULQDQ support

2017-11-09 Thread Koval, Julia
Hi, this patch enables VPCLMULQDQ instruction from VPCLMULQDQ isaset, defined 
here: 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

Ok for trunk?
Thanks,
Julia

gcc/
* common/config/i386/i386-common.c (OPTION_MASK_ISA_VPCLMULQDQ_SET,
OPTION_MASK_ISA_VPCLMULQDQ_UNSET): New.
(ix86_handle_option): Handle -mvpclmulqdq, move cx6 to flags2.
* config.gcc: Include vpclmulqdqintrin.h.
* config/i386/cpuid.h: Handle bit_VPCLMULQDQ.
* config/i386/driver-i386.c (host_detect_local_cpu): Handle 
-mvpclmulqdq.
* config/i386/i386-builtin.def (__builtin_ia32_vpclmulqdq_v2di,
__builtin_ia32_vpclmulqdq_v4di, __builtin_ia32_vpclmulqdq_v8di): New.
* config/i386/i386-c.c (__VPCLMULQDQ__): New.
* config/i386/i386.c (isa2_opts): Add -mcx16.
(isa_opts): Add -mpclmulqdq, remove -mcx16.
(ix86_option_override_internal): Move mcx16 to flags2.
(ix86_valid_target_attribute_inner_p): Add vpclmulqdq.
(ix86_expand_builtin): Handle OPTION_MASK_ISA_VPCLMULQDQ.
* config/i386/i386.h (TARGET_VPCLMULQDQ, TARGET_VPCLMULQDQ_P): New.
* config/i386/i386.opt: Add mvpclmulqdq, move mcx16 to flags2.
* config/i386/immintrin.h: Include vpclmulqdqintrin.h.
* config/i386/sse.md (vpclmulqdq_): New pattern.
* config/i386/vpclmulqdqintrin.h (_mm512_clmulepi64_epi128,
_mm_clmulepi64_epi128, _mm256_clmulepi64_epi128): New intrinsics.
* doc/invoke.texi: Add -mvpclmulqdq.

gcc/testsuite/
* gcc.target/i386/avx-1.c: Handle new intrinsics.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/avx512-check.h: Handle bit_VPCLMULQDQ.
* gcc.target/i386/avx512f-vpclmulqdq-2.c: New test.
* gcc.target/i386/avx512vl-vpclmulqdq-2.c: Ditto.
* gcc.target/i386/vpclmulqdq.c: Ditto.
* gcc.target/i386/i386.exp (check_effective_target_vpclmulqdq): New.


0001-mvpclmulqdq-option.patch
Description: 0001-mvpclmulqdq-option.patch
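
As background for the clmulepi64 intrinsics listed above, here is a portable scalar sketch of carry-less multiplication, the operation performed on each pair of selected 64-bit halves. It is not taken from the patch: struct u128, clmul64_ref and clmul_lane_ref are hypothetical names, and the operand selection follows the PCLMULQDQ convention that bit 0 of the immediate picks the half of the first operand and bit 4 picks the half of the second.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 128-bit result held as two 64-bit halves (hypothetical helper type).  */
struct u128
{
  uint64_t lo;
  uint64_t hi;
};

/* Carry-less (polynomial over GF(2)) multiply of two 64-bit values:
   like schoolbook multiplication, but partial products are combined
   with XOR instead of addition, so no carries propagate.  */
struct u128
clmul64_ref (uint64_t a, uint64_t b)
{
  struct u128 r = { 0, 0 };
  for (int i = 0; i < 64; i++)
    if ((b >> i) & 1)
      {
        r.lo ^= a << i;
        /* Bits shifted out of the low half land in the high half.
           Skip i == 0 to avoid an undefined 64-bit shift.  */
        if (i > 0)
          r.hi ^= a >> (64 - i);
      }
  return r;
}

/* Model of one 128-bit lane: A and B each hold two 64-bit halves,
   and IMM8 selects which half of each operand is multiplied.  */
struct u128
clmul_lane_ref (const uint64_t a[2], const uint64_t b[2], int imm8)
{
  assert (a != NULL && b != NULL);
  return clmul64_ref (a[imm8 & 0x01 ? 1 : 0], b[imm8 & 0x10 ? 1 : 0]);
}
```

For example, clmul64_ref (3, 3) yields 5 rather than 9, since (x + 1)^2 = x^2 + 1 over GF(2). The 256-bit and 512-bit forms would apply the lane model to each 128-bit lane independently.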


[PATCH][i386,AVX] Enable VAES support [5/5]

2017-11-08 Thread Koval, Julia
Hi, this patch enables VAESENCLAST instruction from VAES isaset, defined here: 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

Ok for trunk?
Thanks,
Julia

gcc/
* config/i386/i386-builtin.def (__builtin_ia32_vaesenclast_v16qi,
__builtin_ia32_vaesenclast_v32qi, __builtin_ia32_vaesenclast_v64qi): 
New.
* config/i386/sse.md (vaesenclast_): New pattern.
* config/i386/vaesintrin.h (_mm256_aesenclast_epi128,
_mm512_aesenclast_epi128, _mm_aesenclast_epi128): New intrinsics.

gcc/testsuite/
* gcc.target/i386/avx512f-aesenclast-2.c: New test.
* gcc.target/i386/avx512vl-aesenclast-2.c: Ditto.
* gcc.target/i386/avx512fvl-vaes-1.c: Handle new intrinsics.



0005-VAESENCLAST.PATCH
Description: 0005-VAESENCLAST.PATCH


[PATCH][i386,AVX] Enable VAES support [4/5]

2017-11-08 Thread Koval, Julia
Hi, this patch enables VAESENC instruction from VAES isaset, defined here: 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

Ok for trunk?
Thanks,
Julia

gcc/
* config/i386/i386-builtin.def (__builtin_ia32_vaesenc_v16qi,
__builtin_ia32_vaesenc_v32qi, __builtin_ia32_vaesenc_v64qi): New.
* config/i386/sse.md (vaesenc_): New pattern.
* config/i386/vaesintrin.h (_mm256_aesenc_epi128, _mm512_aesenc_epi128,
_mm_aesenc_epi128): New intrinsics.

gcc/testsuite/
* gcc.target/i386/avx512f-aesenc-2.c: New test.
* gcc.target/i386/avx512vl-aesenc-2.c: Ditto.
* gcc.target/i386/avx512fvl-vaes-1.c: Handle new intrinsics.


0004-VAESENC-instruction.patch
Description: 0004-VAESENC-instruction.patch


RE: [PATCH][i386,AVX] Enable VAES support [3/5]

2017-11-08 Thread Koval, Julia
Patch attached.

> -----Original Message-----
> From: Koval, Julia
> Sent: Wednesday, November 08, 2017 1:38 PM
> To: 'GCC Patches' <gcc-patches@gcc.gnu.org>
> Cc: 'Kirill Yukhin' <kirill.yuk...@gmail.com>
> Subject: [PATCH][i386,AVX] Enable VAES support [3/5]
> 
> Hi, this patch enables VAESDECLAST instruction from VAES isaset, defined here:
> https://software.intel.com/sites/default/files/managed/c5/15/architecture-
> instruction-set-extensions-programming-reference.pdf
> 
> Ok for trunk?
> Thanks,
> Julia
> 
> gcc/
>   * config/i386/i386-builtin.def (__builtin_ia32_vaesdeclast_v16qi,
>   __builtin_ia32_vaesdeclast_v32qi, __builtin_ia32_vaesdeclast_v64qi):
> New.
>   * config/i386/sse.md (vaesdeclast_): New pattern.
>   * config/i386/vaesintrin.h (_mm256_aesdeclast_epi128,
>   _mm512_aesdeclast_epi128, _mm_aesdeclast_epi128): New intrinsics.
> 
> gcc/testsuite/
>   * gcc.target/i386/avx512f-aesdeclast-2.c: New test.
>   * gcc.target/i386/avx512vl-aesdeclast-2.c: Ditto.
>   * gcc.target/i386/avx512fvl-vaes-1.c: Handle new intrinsics.


0003-VAESDECLAST.PATCH
Description: 0003-VAESDECLAST.PATCH


[PATCH][i386,AVX] Enable VAES support [3/5]

2017-11-08 Thread Koval, Julia
Hi, this patch enables VAESDECLAST instruction from VAES isaset, defined here: 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

Ok for trunk?
Thanks,
Julia

gcc/
* config/i386/i386-builtin.def (__builtin_ia32_vaesdeclast_v16qi,
__builtin_ia32_vaesdeclast_v32qi, __builtin_ia32_vaesdeclast_v64qi): 
New.
* config/i386/sse.md (vaesdeclast_): New pattern.
* config/i386/vaesintrin.h (_mm256_aesdeclast_epi128,
_mm512_aesdeclast_epi128, _mm_aesdeclast_epi128): New intrinsics.

gcc/testsuite/
* gcc.target/i386/avx512f-aesdeclast-2.c: New test.
* gcc.target/i386/avx512vl-aesdeclast-2.c: Ditto.
* gcc.target/i386/avx512fvl-vaes-1.c: Handle new intrinsics.


[PATCH][i386,AVX] Enable VAES support [2/5]

2017-11-08 Thread Koval, Julia
Hi, this patch enables VAESDEC instruction from VAES isaset, defined here: 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

Ok for trunk?
Thanks,
Julia

gcc/
* config.gcc: Add vaesintrin.h.
* config/i386/i386-builtin-types.def (V64QI_FTYPE_V64QI_V64QI): New 
type.
* config/i386/i386-builtin.def (__builtin_ia32_vaesdec_v16qi,
__builtin_ia32_vaesdec_v32qi, __builtin_ia32_vaesdec_v64qi): New 
builtins.
* config/i386/i386.c (ix86_expand_args_builtin): Handle new type.
* config/i386/immintrin.h: Include vaesintrin.h.
* config/i386/sse.md (vaesdec_): New pattern.
* config/i386/vaesintrin.h (_mm256_aesdec_epi128, _mm512_aesdec_epi128,
_mm_aesdec_epi128): New intrinsics.

gcc/testsuite/
* gcc.target/i386/avx512-check.h: Handle bit_VAES.
* gcc.target/i386/avx512f-aesdec-2.c: New test.
* gcc.target/i386/avx512fvl-vaes-1.c: Ditto.
* gcc.target/i386/avx512vl-aesdec-2.c: Ditto.
* gcc.target/i386/i386.exp (check_effective_target_avx512vaes): New.


0002-VAESDEC.PATCH
Description: 0002-VAESDEC.PATCH


RE: [x86][patch] Add -march=cannonlake.

2017-11-08 Thread Koval, Julia
Attachment got lost.

> -----Original Message-----
> From: Koval, Julia
> Sent: Wednesday, November 08, 2017 9:01 AM
> To: 'GCC Patches' <gcc-patches@gcc.gnu.org>
> Cc: 'Uros Bizjak' <ubiz...@gmail.com>; 'Kirill Yukhin' 
> <kirill.yuk...@gmail.com>
> Subject: RE: [x86][patch] Add -march=cannonlake.
> 
> Hi, this patch adds new option -march=cannonlake. Isasets defined in:
> https://software.intel.com/sites/default/files/managed/c5/15/architecture-
> instruction-set-extensions-programming-reference.pdf
> 
> Ok for trunk?
> 
> gcc/
>   * config.gcc: Add -march=cannonlake.
>   * config/i386/driver-i386.c (host_detect_local_cpu): Detect cannonlake.
>   * config/i386/i386-c.c (ix86_target_macros_internal): Handle
> cannonlake.
>   * config/i386/i386.c (processor_costs): Add m_CANNONLAKE.
>   (PTA_CANNONLAKE): New.
>   (processor_target_table): Add cannonlake.
>   (ix86_option_override_internal): Ditto.
>   (fold_builtin_cpu): Ditto.
>   (get_builtin_code_for_version): Handle cannonlake.
>   (M_INTEL_CANNONLAKE): New.
>   * config/i386/i386.h (TARGET_CANNONLAKE,
> PROCESSOR_CANNONLAKE): New.
>   * doc/invoke.texi: Add -march=cannonlake.
> gcc/testsuite/
>   * gcc.target/i386/funcspec-56.inc: Handle new march.
> 
> Thanks,
> Julia


0001-cannonlake.patch
Description: 0001-cannonlake.patch


RE: [x86][patch] Add -march=cannonlake.

2017-11-08 Thread Koval, Julia
Hi, this patch adds new option -march=cannonlake. Isasets defined in: 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

Ok for trunk?

gcc/
* config.gcc: Add -march=cannonlake.
* config/i386/driver-i386.c (host_detect_local_cpu): Detect cannonlake.
* config/i386/i386-c.c (ix86_target_macros_internal): Handle cannonlake.
* config/i386/i386.c (processor_costs): Add m_CANNONLAKE.
(PTA_CANNONLAKE): New.
(processor_target_table): Add cannonlake.
(ix86_option_override_internal): Ditto.
(fold_builtin_cpu): Ditto.
(get_builtin_code_for_version): Handle cannonlake.
(M_INTEL_CANNONLAKE): New.
* config/i386/i386.h (TARGET_CANNONLAKE, PROCESSOR_CANNONLAKE): New.
* doc/invoke.texi: Add -march=cannonlake.
gcc/testsuite/
* gcc.target/i386/funcspec-56.inc: Handle new march.

Thanks,
Julia


RE: [patch][i386, AVX] GFNI enabling [3/4]

2017-11-06 Thread Koval, Julia
Rebased after last patch fixes.

gcc/
 * config/i386/gfniintrin.h (_mm_gf2p8affine_epi64_epi8,
 _mm256_gf2p8affine_epi64_epi8, _mm_mask_gf2p8affine_epi64_epi8,
 _mm_maskz_gf2p8affine_epi64_epi8, _mm256_mask_gf2p8affine_epi64_epi8,
 _mm256_maskz_gf2p8affine_epi64_epi8,
 _mm512_mask_gf2p8affine_epi64_epi8, _mm512_gf2p8affine_epi64_epi8,
 _mm512_maskz_gf2p8affine_epi64_epi8): New intrinsics.
 * config/i386/i386-builtin.def (__builtin_ia32_vgf2p8affineqb_v64qi,
 __builtin_ia32_vgf2p8affineqb_v32qi,
 __builtin_ia32_vgf2p8affineqb_v16qi): New builtins.
 * config/i386/sse.md (vgf2p8affineqb_): New pattern.

gcc/testsuite/
 * gcc.target/i386/avx-1.c: Handle new intrinsics.
 * gcc.target/i386/avx512f-gf2p8affineqb-2.c: New runtime tests.
 * gcc.target/i386/avx512vl-gf2p8affineqb-2.c: Ditto.
 * gcc.target/i386/gfni-1.c: Add tests for GF2P8AFFINE.
 * gcc.target/i386/gfni-2.c: Ditto.
 * gcc.target/i386/gfni-3.c: Ditto.
 * gcc.target/i386/gfni-4.c: Ditto.
 * gcc.target/i386/sse-13.c: Handle new tests. 
 * gcc.target/i386/sse-14.c: Handle new tests.
 * gcc.target/i386/sse-23.c: Handle new tests.
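
For readers who have not looked at the instruction yet, here is a minimal scalar sketch of what GF2P8AFFINEQB computes for one byte, following the pseudocode in the Intel reference linked in this thread. The names parity8 and gf2p8affine_byte_ref are made up for illustration and are not part of the patch or the testsuite.

```c
#include <stdint.h>

/* Parity of an 8-bit value: 1 if it has an odd number of set bits.  */
static unsigned
parity8 (uint8_t x)
{
  x ^= x >> 4;
  x ^= x >> 2;
  x ^= x >> 1;
  return x & 1;
}

/* Hypothetical scalar model of one byte of GF2P8AFFINEQB.
   MATRIX is the qword holding the 8x8 bit matrix, X the source byte
   and IMM8 the constant that is XORed into the result.  */
static uint8_t
gf2p8affine_byte_ref (uint64_t matrix, uint8_t x, uint8_t imm8)
{
  uint8_t result = 0;
  for (int i = 0; i < 8; i++)
    {
      /* Result bit i uses matrix byte 7 - i as its row.  */
      uint8_t row = (uint8_t) (matrix >> (8 * (7 - i)));
      unsigned bit = parity8 (row & x) ^ ((imm8 >> i) & 1u);
      result |= (uint8_t) (bit << i);
    }
  return result;
}
```

With the matrix 0x0102040810204080 each row selects exactly one bit of the source byte, so the transform is the identity and only the XOR with imm8 remains.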

> -----Original Message-----
> From: Koval, Julia
> Sent: Tuesday, October 17, 2017 3:26 PM
> To: Jakub Jelinek <ja...@redhat.com>
> Cc: GCC Patches <gcc-patches@gcc.gnu.org>; Kirill Yukhin
> <kirill.yuk...@gmail.com>
> Subject: RE: [patch][i386, AVX] GFNI enabling [3/4]
> 
> Thanks for your comments, fixed everything.
> 
> gcc/
> * config/i386/gfniintrin.h (_mm_gf2p8affine_epi64_epi8,
> _mm256_gf2p8affine_epi64_epi8, _mm_mask_gf2p8affine_epi64_epi8,
> _mm_maskz_gf2p8affine_epi64_epi8,
> _mm256_mask_gf2p8affine_epi64_epi8,
> _mm256_maskz_gf2p8affine_epi64_epi8,
> _mm512_mask_gf2p8affine_epi64_epi8, _mm512_gf2p8affine_epi64_epi8,
> _mm512_maskz_gf2p8affine_epi64_epi8): New intrinsics.
> * config/i386/i386-builtin.def (__builtin_ia32_vgf2p8affineqb_v64qi,
> __builtin_ia32_vgf2p8affineqb_v32qi,
> __builtin_ia32_vgf2p8affineqb_v16qi): New builtins.
> * config/i386/sse.md (vgf2p8affineqb_): New
> pattern.
> 
> gcc/testsuite/
> * gcc.target/i386/avx-1.c: Handle new intrinsics.
> * gcc.target/i386/avx512f-gf2p8affineqb-2.c: New runtime tests.
> * gcc.target/i386/avx512vl-gf2p8affineqb-2.c: Ditto.
> * gcc.target/i386/gfni-1.c: Add tests for GF2P8AFFINE.
> * gcc.target/i386/gfni-2.c: Ditto.
> * gcc.target/i386/gfni-3.c: Ditto.
> * gcc.target/i386/gfni-4.c: Ditto.
> * gcc.target/i386/sse-13.c: Handle new tests.
> * gcc.target/i386/sse-23.c: Handle new tests.
> 
> 
> > -----Original Message-----
> > From: Jakub Jelinek [mailto:ja...@redhat.com]
> > Sent: Tuesday, October 17, 2017 3:15 PM
> > To: Koval, Julia <julia.ko...@intel.com>
> > Cc: GCC Patches <gcc-patches@gcc.gnu.org>; Kirill Yukhin
> > <kirill.yuk...@gmail.com>
> > Subject: Re: [patch][i386, AVX] GFNI enabling [3/4]
> >
> > On Tue, Oct 17, 2017 at 01:09:50PM +, Koval, Julia wrote:
> > > Hi, this is the third patch of GFNI ISASET enabling. It enables the GF2P8AFFINE
> > instruction, described here:
> > https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
> > >
> > > gcc/
> > >   * config/i386/gfniintrin.h (_mm_gf2p8affine_epi64_epi8,
> > _mm256_gf2p8affine_epi64_epi8,
> >
> > Too long line, even ChangeLog entries should be wrapped to 80 columns.
> >
> > >   (_mm_mask_gf2p8affine_epi64_epi8,
> > _mm_maskz_gf2p8affine_epi64_epi8,
> > >   _mm256_mask_gf2p8affine_epi64_epi8,
> > _mm256_maskz_gf2p8affine_epi64_epi8,
> > >   _mm512_mask_gf2p8affine_epi64_epi8,
> > _mm512_maskz_gf2p8affine_epi64_epi8,
> >
> > The above two are also too long (off by 1 char).
> >
> > >   _mm512_gf2p8affine_epi64_epi8): New intrinsics.
> > >   * config/i386/i386-builtin.def (__builtin_ia32_vgf2p8affineqb_v64qi,
> > >   __builtin_ia32_vgf2p8affineqb_v32qi,
> > __builtin_ia32_vgf2p8affineqb_v16qi): New builtins.
> >
> > And this one too.  Please wrap them.
> >
> > >   * config/i386/sse.md (vgf2p8affineqb_*): New pattern.
> >
> > Use vgf2p8affineqb_ instead of the wild-card?
> >
> > I'll defer actual review to Kirill.
> >
> > Jakub


0001-gf2p8affine.patch
Description: 0001-gf2p8affine.patch


RE: [patch][x86] GFNI enabling [2/4]

2017-11-03 Thread Koval, Julia
Here is the solution I propose:

gcc/
* common/config/i386/i386-common.c
(OPTION_MASK_ISA_GENERAL_REGS_ONLY_UNSET): Remove MPX from flag.
(ix86_handle_option): Move MPX to isa_flags2 and GFNI to isa_flags.
* config/i386/i386-c.c (ix86_target_macros_internal): Ditto.
* config/i386/i386.opt: Ditto.
* config/i386/i386.c (ix86_target_string): Ditto.
(ix86_option_override_internal): Ditto.
(ix86_init_mpx_builtins): Move MPX to args2.
(ix86_expand_builtin): Special handling for OPTION_MASK_ISA_GFNI.
* config/i386/i386-builtin.def (__builtin_ia32_vgf2p8affineinvqb_v64qi,
__builtin_ia32_vgf2p8affineinvqb_v64qi_mask,
__builtin_ia32_vgf2p8affineinvqb_v32qi,
__builtin_ia32_vgf2p8affineinvqb_v32qi_mask,
__builtin_ia32_vgf2p8affineinvqb_v16qi,
__builtin_ia32_vgf2p8affineinvqb_v16qi_mask): Move to ARGS array.

> -----Original Message-----
> From: Koval, Julia
> Sent: Friday, November 03, 2017 9:27 AM
> To: 'Jakub Jelinek' <ja...@redhat.com>
> Cc: 'Kirill Yukhin' <kirill.yuk...@gmail.com>; 'GCC Patches' <gcc-patches@gcc.gnu.org>
> Subject: RE: [patch][x86] GFNI enabling [2/4]
> 
> > But what do you think about adding AVX/SSE flags to this special set 
> > instead?
> Ok, was wrong, it is impossible to add SSE, because it is used in normal "or" 
> way.
> Then I'll add GFNI/VAES instead.
> 
> There is also another problem there: GFNI belongs to isa_flags2, while
> AVX512VL/AVX/SSE belong to isa_flags, so we can't keep them in the same field.
> There are candidates, which can be moved from isa_flags to isa_flags2 instead
> of GFNI, because there are no dependencies on other flags, but it is only a 
> short
> term solution.
> 
> > -----Original Message-----
> > From: Koval, Julia
> > Sent: Thursday, November 02, 2017 12:57 PM
> > To: Jakub Jelinek <ja...@redhat.com>
> > Cc: Kirill Yukhin <kirill.yuk...@gmail.com>; GCC Patches <gcc-patches@gcc.gnu.org>
> > Subject: RE: [patch][x86] GFNI enabling [2/4]
> >
> > The documentation is right, I was wrong not adding SSE/AVX flags in these
> > builtin declarations.
> >
> > > The exceptions are
> > > MMX, AVX512VL and 64BIT is also special.
> > > So, shall GFNI be added to that set?
> > Turns out only GFNI and VAES(haven't sent those yet, they are from the same
> > Icelake pdf) are like this, others rely on AVX512VL/BW. But what do you 
> > think
> > about adding AVX/SSE flags to this special set instead? Looks like they more
> > probably will be used as a flags, on which new instructions may depend in 
> > the
> > future, than GFNI/VAES flags.
> >
> > -Julia
> >
> > > -----Original Message-----
> > > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Jakub Jelinek
> > > Sent: Tuesday, October 31, 2017 8:28 PM
> > > To: Koval, Julia <julia.ko...@intel.com>
> > > Cc: Kirill Yukhin <kirill.yuk...@gmail.com>; GCC Patches <gcc-patches@gcc.gnu.org>
> > > Subject: Re: [patch][x86] GFNI enabling [2/4]
> > >
> > > On Mon, Oct 30, 2017 at 07:02:23PM +, Koval, Julia wrote:
> > > > gcc/testsuite/
> > > > * gcc.target/i386/avx-1.c: Handle new intrinsics.
> > > > * gcc.target/i386/avx512-check.h: Check GFNI bit.
> > > > * gcc.target/i386/avx512f-gf2p8affineinvqb-2.c: Runtime test.
> > > > * gcc.target/i386/avx512vl-gf2p8affineinvqb-2.c: Runtime test.
> > > > * gcc.target/i386/gfni-1.c: New.
> > > > * gcc.target/i386/gfni-2.c: New.
> > > > * gcc.target/i386/gfni-3.c: New.
> > > > * gcc.target/i386/gfni-4.c: New.
> > >
> > > The gfni-4.c testcase ICEs on i686-linux (e.g. try
> > > make check-gcc RUNTESTFLAGS='--target_board=unix\{-m32/-msse,-m32/-
> > > mno-sse,-m64\} i386.exp=gfni*'
> > > to see it).
> > >
> > > I must say I'm confused by the CPUIDs, the
> > > > https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
> > > lists GFNI; 2x AVX+GFNI; 2x AVX512VL+GFNI; AVX512F+GFNI CPUIDs for the
> > > instructions, but i386-builtins.def has:
> > > BDESC (OPTION_MASK_ISA_GFNI, CODE_FOR_vgf2p8affineinvqb_v64qi,
> > > "__builtin_ia32_vgf2p8affineinvqb_v64qi",
> > > IX86_BUILTIN_VGF2P8AFFINEINVQB512, UNKNOWN
> > > BDESC (OPTION_MASK_ISA_GFNI | OPTION_MASK_ISA_AVX512BW,
> > > CODE_FOR_

RE: [patch][x86] GFNI enabling [2/4]

2017-11-03 Thread Koval, Julia
> But what do you think about adding AVX/SSE flags to this special set instead?
OK, I was wrong: it is impossible to add SSE, because it is used in the normal
"or" way. I'll add GFNI/VAES to the special set instead.

There is another problem as well: GFNI belongs to isa_flags2, while
AVX512VL/AVX/SSE belong to isa_flags, so they cannot be kept in the same field.
There are candidates that could be moved from isa_flags to isa_flags2 in place
of GFNI, because there are no dependencies on other flags, but that is only a
short-term solution.
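
To make the "or" versus "and" distinction concrete, here is a small model of the rule as it is described in this thread: ordinary ISA bits in a builtin's mask are alternatives, while bits in the special set must all be enabled. This is only a sketch and not the code in ix86_expand_builtin. The flag values, ISA_SPECIAL and builtin_enabled_p are hypothetical, GFNI is assumed to have been added to the special set, and a single 32-bit word stands in for the separate isa_flags and isa_flags2 fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits; the real OPTION_MASK_ISA_* values differ.  */
#define ISA_SSE       (1u << 0)
#define ISA_AVX512BW  (1u << 1)
#define ISA_AVX512VL  (1u << 2)
#define ISA_GFNI      (1u << 3)

/* Bits that are required in addition to, not instead of, the others.
   Assumes GFNI has joined AVX512VL in this set.  */
#define ISA_SPECIAL   (ISA_AVX512VL | ISA_GFNI)

/* Return true if a builtin whose descriptor lists REQUIRED may be
   used when the ISAs in ENABLED are turned on.  */
static bool
builtin_enabled_p (uint32_t required, uint32_t enabled)
{
  uint32_t special = required & ISA_SPECIAL;
  uint32_t ordinary = required & ~ISA_SPECIAL;

  /* Every special bit listed by the builtin has to be enabled.  */
  if ((enabled & special) != special)
    return false;

  /* Of the ordinary bits, any single one is enough.  */
  return ordinary == 0 || (enabled & ordinary) != 0;
}
```

Under this model a descriptor of GFNI | AVX512BW is rejected when only AVX512BW is enabled, which is the behaviour that the plain "or" treatment does not give.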

> -----Original Message-----
> From: Koval, Julia
> Sent: Thursday, November 02, 2017 12:57 PM
> To: Jakub Jelinek <ja...@redhat.com>
> Cc: Kirill Yukhin <kirill.yuk...@gmail.com>; GCC Patches <gcc-patches@gcc.gnu.org>
> Subject: RE: [patch][x86] GFNI enabling [2/4]
> 
> The documentation is right, I was wrong not adding SSE/AVX flags in these
> builtin declarations.
> 
> > The exceptions are
> > MMX, AVX512VL and 64BIT is also special.
> > So, shall GFNI be added to that set?
> Turns out only GFNI and VAES(haven't sent those yet, they are from the same
> Icelake pdf) are like this, others rely on AVX512VL/BW. But what do you think
> about adding AVX/SSE flags to this special set instead? Looks like they more
> probably will be used as a flags, on which new instructions may depend in the
> future, than GFNI/VAES flags.
> 
> -Julia
> 
> > -----Original Message-----
> > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Jakub Jelinek
> > Sent: Tuesday, October 31, 2017 8:28 PM
> > To: Koval, Julia <julia.ko...@intel.com>
> > Cc: Kirill Yukhin <kirill.yuk...@gmail.com>; GCC Patches <gcc-patches@gcc.gnu.org>
> > Subject: Re: [patch][x86] GFNI enabling [2/4]
> >
> > On Mon, Oct 30, 2017 at 07:02:23PM +, Koval, Julia wrote:
> > > gcc/testsuite/
> > >   * gcc.target/i386/avx-1.c: Handle new intrinsics.
> > >   * gcc.target/i386/avx512-check.h: Check GFNI bit.
> > >   * gcc.target/i386/avx512f-gf2p8affineinvqb-2.c: Runtime test.
> > >   * gcc.target/i386/avx512vl-gf2p8affineinvqb-2.c: Runtime test.
> > >   * gcc.target/i386/gfni-1.c: New.
> > >   * gcc.target/i386/gfni-2.c: New.
> > >   * gcc.target/i386/gfni-3.c: New.
> > >   * gcc.target/i386/gfni-4.c: New.
> >
> > The gfni-4.c testcase ICEs on i686-linux (e.g. try
> > make check-gcc RUNTESTFLAGS='--target_board=unix\{-m32/-msse,-m32/-
> > mno-sse,-m64\} i386.exp=gfni*'
> > to see it).
> >
> > I must say I'm confused by the CPUIDs, the
> > https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
> > lists GFNI; 2x AVX+GFNI; 2x AVX512VL+GFNI; AVX512F+GFNI CPUIDs for the
> > instructions, but i386-builtins.def has:
> > BDESC (OPTION_MASK_ISA_GFNI, CODE_FOR_vgf2p8affineinvqb_v64qi,
> > "__builtin_ia32_vgf2p8affineinvqb_v64qi",
> > IX86_BUILTIN_VGF2P8AFFINEINVQB512, UNKNOWN
> > BDESC (OPTION_MASK_ISA_GFNI | OPTION_MASK_ISA_AVX512BW,
> > CODE_FOR_vgf2p8affineinvqb_v64qi_mask,
> > "__builtin_ia32_vgf2p8affineinvqb_v64qi_mask", IX86_
> > BDESC (OPTION_MASK_ISA_GFNI, CODE_FOR_vgf2p8affineinvqb_v32qi,
> > "__builtin_ia32_vgf2p8affineinvqb_v32qi",
> > IX86_BUILTIN_VGF2P8AFFINEINVQB256, UNKNOWN
> > BDESC (OPTION_MASK_ISA_GFNI | OPTION_MASK_ISA_AVX512BW,
> > CODE_FOR_vgf2p8affineinvqb_v32qi_mask,
> > "__builtin_ia32_vgf2p8affineinvqb_v32qi_mask", IX86_
> > BDESC (OPTION_MASK_ISA_GFNI, CODE_FOR_vgf2p8affineinvqb_v16qi,
> > "__builtin_ia32_vgf2p8affineinvqb_v16qi",
> > IX86_BUILTIN_VGF2P8AFFINEINVQB128, UNKNOWN
> > BDESC (OPTION_MASK_ISA_GFNI | OPTION_MASK_ISA_AVX512BW,
> > CODE_FOR_vgf2p8affineinvqb_v16qi_mask,
> > "__builtin_ia32_vgf2p8affineinvqb_v16qi_mask", IX86_
> > and the gfniintrin.h requires just gfni for the first insn,
> > and then some combinations of gfni,avx, or gfni,avx512vl, or
> > gfni,avx512vl,avx512bw, or gfni,avx512f,avx512bw.
> >
> > So, what is right, the paper, i386-builtins.def or gfniintrin.h?
> >
> > Obviously even if the GF2P8AFFINEINVQB instruction doesn't list SSE as
> > required CPUID, we can't really emit it without at least SSE because
> > then the operands can't be emitted.  So, at least in GCC we should
> > require both GFNI and SSE for the first instruction.
> >
> > Which leads to another issue, as ix86_expand_builtin documents,
> > we treat the BDESC ISAs OPTION_MASK_ISA_ISA1 | OPTION_MASK_ISA_ISA2
> > as either ISA1 or ISA2, not ISA1 and ISA2.  The exceptions are
> > MMX, AVX512VL and 64BIT is also special.
> > So, shall GFNI be added to that set?  Do we have other ISAs that
> > should be handled the same?  I guess maybe OPTION_MASK_ISA_AES, but
> > that is handled weirdly.
> >
> > Jakub


RE: [patch][x86] GFNI enabling [2/4]

2017-11-02 Thread Koval, Julia
The documentation is right; I was wrong not to add the SSE/AVX flags to these
builtin declarations.

> The exceptions are
> MMX, AVX512VL and 64BIT is also special.
> So, shall GFNI be added to that set?  
It turns out that only GFNI and VAES (I haven't sent those yet; they are from
the same Icelake pdf) are like this; the others rely on AVX512VL/BW. But what do
you think about adding the AVX/SSE flags to this special set instead? They seem
more likely than the GFNI/VAES flags to be flags that new instructions will
depend on in the future.

-Julia

> -----Original Message-----
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Jakub Jelinek
> Sent: Tuesday, October 31, 2017 8:28 PM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: Kirill Yukhin <kirill.yuk...@gmail.com>; GCC Patches <gcc-patches@gcc.gnu.org>
> Subject: Re: [patch][x86] GFNI enabling [2/4]
> 
> On Mon, Oct 30, 2017 at 07:02:23PM +, Koval, Julia wrote:
> > gcc/testsuite/
> > * gcc.target/i386/avx-1.c: Handle new intrinsics.
> > * gcc.target/i386/avx512-check.h: Check GFNI bit.
> > * gcc.target/i386/avx512f-gf2p8affineinvqb-2.c: Runtime test.
> > * gcc.target/i386/avx512vl-gf2p8affineinvqb-2.c: Runtime test.
> > * gcc.target/i386/gfni-1.c: New.
> > * gcc.target/i386/gfni-2.c: New.
> > * gcc.target/i386/gfni-3.c: New.
> > * gcc.target/i386/gfni-4.c: New.
> 
> The gfni-4.c testcase ICEs on i686-linux (e.g. try
> make check-gcc RUNTESTFLAGS='--target_board=unix\{-m32/-msse,-m32/-
> mno-sse,-m64\} i386.exp=gfni*'
> to see it).
> 
> I must say I'm confused by the CPUIDs, the
> https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
> lists GFNI; 2x AVX+GFNI; 2x AVX512VL+GFNI; AVX512F+GFNI CPUIDs for the
> instructions, but i386-builtins.def has:
> BDESC (OPTION_MASK_ISA_GFNI, CODE_FOR_vgf2p8affineinvqb_v64qi,
> "__builtin_ia32_vgf2p8affineinvqb_v64qi",
> IX86_BUILTIN_VGF2P8AFFINEINVQB512, UNKNOWN
> BDESC (OPTION_MASK_ISA_GFNI | OPTION_MASK_ISA_AVX512BW,
> CODE_FOR_vgf2p8affineinvqb_v64qi_mask,
> "__builtin_ia32_vgf2p8affineinvqb_v64qi_mask", IX86_
> BDESC (OPTION_MASK_ISA_GFNI, CODE_FOR_vgf2p8affineinvqb_v32qi,
> "__builtin_ia32_vgf2p8affineinvqb_v32qi",
> IX86_BUILTIN_VGF2P8AFFINEINVQB256, UNKNOWN
> BDESC (OPTION_MASK_ISA_GFNI | OPTION_MASK_ISA_AVX512BW,
> CODE_FOR_vgf2p8affineinvqb_v32qi_mask,
> "__builtin_ia32_vgf2p8affineinvqb_v32qi_mask", IX86_
> BDESC (OPTION_MASK_ISA_GFNI, CODE_FOR_vgf2p8affineinvqb_v16qi,
> "__builtin_ia32_vgf2p8affineinvqb_v16qi",
> IX86_BUILTIN_VGF2P8AFFINEINVQB128, UNKNOWN
> BDESC (OPTION_MASK_ISA_GFNI | OPTION_MASK_ISA_AVX512BW,
> CODE_FOR_vgf2p8affineinvqb_v16qi_mask,
> "__builtin_ia32_vgf2p8affineinvqb_v16qi_mask", IX86_
> and the gfniintrin.h requires just gfni for the first insn,
> and then some combinations of gfni,avx, or gfni,avx512vl, or
> gfni,avx512vl,avx512bw, or gfni,avx512f,avx512bw.
> 
> So, what is right, the paper, i386-builtins.def or gfniintrin.h?
> 
> Obviously even if the GF2P8AFFINEINVQB instruction doesn't list SSE as
> required CPUID, we can't really emit it without at least SSE because
> then the operands can't be emitted.  So, at least in GCC we should
> require both GFNI and SSE for the first instruction.
> 
> Which leads to another issue, as ix86_expand_builtin documents,
> we treat the BDESC ISAs OPTION_MASK_ISA_ISA1 | OPTION_MASK_ISA_ISA2
> as either ISA1 or ISA2, not ISA1 and ISA2.  The exceptions are
> MMX, AVX512VL and 64BIT is also special.
> So, shall GFNI be added to that set?  Do we have other ISAs that
> should be handled the same?  I guess maybe OPTION_MASK_ISA_AES, but
> that is handled weirdly.
> 
>   Jakub


[PATCH][i386,AVX] Enable VAES support [1/5]

2017-10-25 Thread Koval, Julia
Hi,
This patch enables VAES isaset option. The doc for isaset and instruction: 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

Ok for trunk?

Thanks,
Julia

gcc/
* common/config/i386/i386-common.c (OPTION_MASK_ISA_VAES_SET,
OPTION_MASK_ISA_VAES_UNSET): New.
(ix86_handle_option): Handle -mvaes.
* config/i386/cpuid.h: Define bit_VAES.
* config/i386/driver-i386.c (host_detect_local_cpu): Detect -mvaes.
* config/i386/i386-c.c (__VAES__): New.
* config/i386/i386.c (ix86_target_string): Add -mvaes.
(ix86_valid_target_attribute_inner_p): Ditto.
* config/i386/i386.h (TARGET_VAES, TARGET_VAES_P): New.
* config/i386/i386.opt: Add -mvaes.
* doc/invoke.texi: Ditto.
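
Once the option is in, user code can key off the predefined macro listed above. A minimal sketch, assuming __VAES__ is defined exactly when VAES is enabled for the translation unit; compiled_with_vaes is a hypothetical helper name, not something the patch adds.

```c
/* Returns 1 if this translation unit was compiled with VAES enabled,
   i.e. the compiler predefined __VAES__, and 0 otherwise.  */
static int
compiled_with_vaes (void)
{
#ifdef __VAES__
  return 1;
#else
  return 0;
#endif
}
```

This only reports how the code was compiled. Whether the CPU it runs on supports VAES is a separate, runtime question.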



0001-VAES-option.patch
Description: 0001-VAES-option.patch


RE: [PATCH][i386,AVX] Enable VBMI2 support [1/7]

2017-10-25 Thread Koval, Julia
Thanks, fixed it.

gcc/
* common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512VBMI2_SET,
OPTION_MASK_ISA_AVX512VBMI2_UNSET): New.
(ix86_handle_option): Handle -mavx512vbmi2.
* config/i386/cpuid.h: Add bit_AVX512VBMI2.
* config/i386/driver-i386.c (host_detect_local_cpu): Handle new bit.
* config/i386/i386-c.c (__AVX512VBMI2__): New.
* config/i386/i386.c (ix86_target_string): Handle -mavx512vbmi2.
(ix86_valid_target_attribute_inner_p): Ditto.
* config/i386/i386.h (TARGET_AVX512VBMI2, TARGET_AVX512VBMI2_P): New.
* config/i386/i386.opt (mavx512vbmi2): New option.
* doc/invoke.texi: Add new option.


> -----Original Message-----
> From: Joseph Myers [mailto:jos...@codesourcery.com]
> Sent: Wednesday, October 25, 2017 12:40 AM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: GCC Patches <gcc-patches@gcc.gnu.org>; Kirill Yukhin
> <kirill.yuk...@gmail.com>
> Subject: Re: [PATCH][i386,AVX] Enable VBMI2 support [1/7]
> 
> On Tue, 24 Oct 2017, Koval, Julia wrote:
> 
> > config/i386/i386.opt (mavx512vbmi2): New option.
> 
> Any patch adding a new command-line option needs to add documentation of
> it to invoke.texi.
> 
> --
> Joseph S. Myers
> jos...@codesourcery.com


0001-VBMI2-option.patch
Description: 0001-VBMI2-option.patch


[PATCH][i386,AVX] Enable VNNI support [5/5]

2017-10-24 Thread Koval, Julia
Hi,
This patch enables VPDPWSSDS instruction. The doc for isaset and instruction: 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

Ok for trunk?

Thanks,
Julia

gcc/
* config/i386/avx512vnniintrin.h (_mm512_dpwssds_epi32,
_mm512_mask_dpwssds_epi32, _mm512_maskz_dpwssds_epi32): New intrinsics.
* config/i386/avx512vnnivlintrin.h (_mm256_dpwssds_epi32,
_mm256_mask_dpwssds_epi32, _mm256_maskz_dpwssds_epi32,
_mm_dpwssds_epi32, _mm_mask_dpwssds_epi32,
_mm_maskz_dpwssds_epi32): Ditto.

gcc/testsuite/
* gcc.target/i386/avx512f-vnni-1.c: Add checks for vpdpwssds.
* gcc.target/i386/avx512vl-vnni-1.c: Ditto.
* gcc.target/i386/avx512f-vpdpwssds-2.c: New test.
* gcc.target/i386/avx512vl-vpdpwssds-2.c: Ditto.
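
As a reading aid, here is a scalar sketch of what VPDPWSSDS does for one 32-bit lane as I read the Intel reference: two signed 16-bit products are added to the accumulator and the total is saturated once to the signed 32-bit range. The function name and the argument layout are hypothetical and do not come from the patch.

```c
#include <stdint.h>

/* Hypothetical scalar model of one dword lane of VPDPWSSDS.
   ACC is the accumulator lane; A0/A1 and B0/B1 are the two signed
   words of each source lane.  */
static int32_t
vpdpwssds_lane_ref (int32_t acc, int16_t a0, int16_t a1,
                    int16_t b0, int16_t b1)
{
  /* Work in 64 bits so that the intermediate sum cannot overflow.  */
  int64_t sum = (int64_t) acc + (int64_t) a0 * b0 + (int64_t) a1 * b1;

  /* Saturate to the signed 32-bit range.  */
  if (sum > INT32_MAX)
    return INT32_MAX;
  if (sum < INT32_MIN)
    return INT32_MIN;
  return (int32_t) sum;
}
```

Dropping the two range checks and letting the result wrap gives the non-saturating VPDPWSSD behaviour.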


0014-VPDPWSSDS-instruction.patch
Description: 0014-VPDPWSSDS-instruction.patch


[PATCH][i386,AVX] Enable VNNI support [4/5]

2017-10-24 Thread Koval, Julia
Hi,
This patch enables VPDPWSSD instruction. The doc for isaset and instruction: 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

Ok for trunk?

Thanks,
Julia

gcc/
* config/i386/avx512vnniintrin.h (_mm512_dpwssd_epi32,
_mm512_mask_dpwssd_epi32, _mm512_maskz_dpwssd_epi32): New intrinsics.
* config/i386/avx512vnnivlintrin.h (_mm256_dpwssd_epi32,
_mm256_mask_dpwssd_epi32, _mm256_maskz_dpwssd_epi32, _mm_dpwssd_epi32,
_mm_mask_dpwssd_epi32, _mm_maskz_dpwssd_epi32): Ditto.

gcc/testsuite/
* gcc.target/i386/avx512f-vnni-1.c: Add vpdpwssd checks.
* gcc.target/i386/avx512vl-vnni-1.c: Ditto.
* gcc.target/i386/avx512f-vpdpwssd-2.c: New.
* gcc.target/i386/avx512vl-vpdpwssd-2.c: Ditto.


0013-VPDPWSSD-instruction.patch
Description: 0013-VPDPWSSD-instruction.patch


[PATCH][i386,AVX] Enable VNNI support [3/5]

2017-10-24 Thread Koval, Julia
Hi,
This patch enables VPDPBUSDS instruction. The doc for isaset and instruction: 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

Ok for trunk?

Thanks,
Julia

gcc/
* config/i386/avx512vnniintrin.h (_mm512_dpbusds_epi32,
_mm512_mask_dpbusds_epi32, _mm512_maskz_dpbusds_epi32): New.
* config/i386/avx512vnnivlintrin.h (_mm256_dpbusds_epi32,
_mm256_mask_dpbusds_epi32, _mm256_maskz_dpbusds_epi32,
_mm_dpbusds_epi32, _mm_mask_dpbusds_epi32,
_mm_maskz_dpbusds_epi32): New intrinsics.

gcc/testsuite/
* gcc.target/i386/avx512f-vnni-1.c: Add vpdpbusds check.
* gcc.target/i386/avx512vl-vnni-1.c: Ditto.
* gcc.target/i386/avx512f-vpdpbusds-2.c: New.
* gcc.target/i386/avx512vl-vpdpbusds-2.c: Ditto.


0012-VPDPBUSDS-instruction.patch
Description: 0012-VPDPBUSDS-instruction.patch


[PATCH][i386,AVX] Enable VNNI support [2/5]

2017-10-24 Thread Koval, Julia
Hi,
This patch enables the VPDPBUSD instruction; it also contains the builtins and
md patterns for the other VNNI instructions. The doc for isaset and instruction:
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

Ok for trunk?

Thanks,
Julia

gcc/
* config.gcc: Add avx512vnniintrin.h and avx512vnnivlintrin.h.
* config/i386/avx512vnniintrin.h (_mm512_dpbusd_epi32,
_mm512_mask_dpbusd_epi32, _mm512_maskz_dpbusd_epi32): New intrinsics.
* config/i386/avx512vnnivlintrin.h (_mm256_dpbusd_epi32,
_mm256_mask_dpbusd_epi32, _mm256_maskz_dpbusd_epi32, _mm_dpbusd_epi32,
_mm_mask_dpbusd_epi32, _mm_maskz_dpbusd_epi32): Ditto.
* config/i386/i386-builtin.def (__builtin_ia32_vpdpbusd_v16si,
__builtin_ia32_vpdpbusd_v16si_mask,
__builtin_ia32_vpdpbusd_v16si_maskz, __builtin_ia32_vpdpbusd_v8si,
__builtin_ia32_vpdpbusd_v8si_mask, __builtin_ia32_vpdpbusd_v8si_maskz,
__builtin_ia32_vpdpbusd_v4si, __builtin_ia32_vpdpbusd_v4si_mask,
__builtin_ia32_vpdpbusd_v4si_maskz, __builtin_ia32_vpdpbusds_v16si,
__builtin_ia32_vpdpbusds_v16si_mask,
__builtin_ia32_vpdpbusds_v16si_maskz, __builtin_ia32_vpdpbusds_v8si,
__builtin_ia32_vpdpbusds_v8si_mask,
__builtin_ia32_vpdpbusds_v8si_maskz, __builtin_ia32_vpdpbusds_v4si,
__builtin_ia32_vpdpbusds_v4si_mask,
__builtin_ia32_vpdpbusds_v4si_maskz, __builtin_ia32_vpdpwssd_v16si,
__builtin_ia32_vpdpwssd_v16si_mask,
__builtin_ia32_vpdpwssd_v16si_maskz, __builtin_ia32_vpdpwssd_v8si,
__builtin_ia32_vpdpwssd_v8si_mask,
__builtin_ia32_vpdpwssd_v8si_maskz, __builtin_ia32_vpdpwssd_v4si,
__builtin_ia32_vpdpwssd_v4si_mask,
__builtin_ia32_vpdpwssd_v4si_maskz, __builtin_ia32_vpdpwssds_v16si,
__builtin_ia32_vpdpwssds_v16si_mask,
__builtin_ia32_vpdpwssds_v16si_maskz, __builtin_ia32_vpdpwssds_v8si,
__builtin_ia32_vpdpwssds_v8si_mask,
__builtin_ia32_vpdpwssds_v8si_maskz, __builtin_ia32_vpdpwssds_v4si,
__builtin_ia32_vpdpwssds_v4si_mask,
__builtin_ia32_vpdpwssds_v4si_maskz): New builtins.
* config/i386/immintrin.h: Include avx512vnniintrin.h and
avx512vnnivlintrin.h.
* config/i386/sse.md (vpdpbusd_, vpdpbusd__mask,
vpdpbusd__maskz, vpdpbusd__maskz_1, vpdpbusds_,
vpdpbusds__mask, vpdpbusds__maskz,
vpdpbusds__maskz_1, vpdpwssd_, vpdpwssd__mask,
vpdpwssd__maskz, vpdpwssd__maskz_1, vpdpwssds_,
vpdpwssds__mask, vpdpwssds__maskz,
vpdpwssds__maskz_1): New md patterns.

gcc/testsuite/
* gcc.target/i386/avx512-check.h: Handle AVX512VNNI bit.
* gcc.target/i386/avx512f-vnni-1.c: New test.
* gcc.target/i386/avx512f-vpdpbusd-2.c: Ditto.
* gcc.target/i386/avx512vl-vnni-1.c: Ditto.
* gcc.target/i386/avx512vl-vpdpbusd-2.c: Ditto.
* gcc.target/i386/i386.exp: Check avx512vnni effective target.
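
For context, here is a scalar sketch of one 32-bit lane of VPDPBUSD as I understand it from the Intel reference: four unsigned bytes from the first source are multiplied by the four signed bytes of the second source, and the products are added to the accumulator without saturation. The function name and the packed-dword argument layout are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical scalar model of one dword lane of VPDPBUSD.
   A holds four unsigned bytes, B holds four signed bytes, and ACC is
   the accumulator lane.  The sum wraps modulo 2^32.  */
static uint32_t
vpdpbusd_lane_ref (uint32_t acc, uint32_t a, uint32_t b)
{
  uint32_t sum = acc;
  for (int i = 0; i < 4; i++)
    {
      /* Byte i of A, zero-extended.  */
      int ua = (int) ((a >> (8 * i)) & 0xFF);
      /* Byte i of B, sign-extended from 8 bits.  */
      int sb = (int) ((b >> (8 * i)) & 0xFF);
      sb = (sb ^ 0x80) - 0x80;
      /* Unsigned arithmetic keeps the wrap-around well defined.  */
      sum += (uint32_t) (ua * sb);
    }
  return sum;
}
```

The accumulator is kept as uint32_t only so that the wrap-around is well defined in C; the bit pattern is the same as for a signed lane.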




0011-VPDPBUSD-instruction.patch
Description: 0011-VPDPBUSD-instruction.patch


[PATCH][i386,AVX] Enable VNNI support [1/5]

2017-10-24 Thread Koval, Julia
Hi,
This patch enables VNNI isaset option. The doc for isaset and instruction: 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

Ok for trunk?

Thanks,
Julia

gcc/
* common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512VNNI_SET,
OPTION_MASK_ISA_AVX512VNNI_UNSET): New.
(ix86_handle_option): Handle -mavx512vnni.
* config/i386/cpuid.h (bit_AVX512VNNI): New bit.
* config/i386/driver-i386.c (host_detect_local_cpu): Handle new bit.
* config/i386/i386-c.c (__AVX512VNNI__): New.
* config/i386/i386.c (ix86_target_string): Handle new option.
(ix86_valid_target_attribute_inner_p): Handle new option.
* config/i386/i386.h (TARGET_AVX512VNNI, TARGET_AVX512VNNI_P): New.
* config/i386/i386.opt (mavx512vnni): New option.


0010-VNNI-option.patch
Description: 0010-VNNI-option.patch


[PATCH][i386,AVX] Enable VBMI2 support [7/7]

2017-10-24 Thread Koval, Julia
Hi,
This patch enables VPSHLDV instruction. The doc for isaset and instruction: 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

Ok for trunk?

gcc/
config/i386/avx512vbmi2intrin.h (_mm512_shldv_epi16,
_mm512_mask_shldv_epi16, _mm512_maskz_shldv_epi16, _mm512_shldv_epi32,
_mm512_mask_shldv_epi32, _mm512_maskz_shldv_epi32, _mm512_shldv_epi64,
_mm512_mask_shldv_epi64, _mm512_maskz_shldv_epi64): New intrinsics.
config/i386/avx512vbmi2vlintrin.h (_mm256_shldv_epi16,
_mm256_mask_shldv_epi16, _mm256_maskz_shldv_epi16, _mm256_shldv_epi32,
_mm256_mask_shldv_epi32, _mm256_maskz_shldv_epi32, _mm256_shldv_epi64,
_mm256_mask_shldv_epi64, _mm256_maskz_shldv_epi64, _mm_shldv_epi16,
_mm_mask_shldv_epi16, _mm_maskz_shldv_epi16, _mm_shldv_epi32,
_mm_mask_shldv_epi32, _mm_maskz_shldv_epi32, _mm_shldv_epi64,
_mm_mask_shldv_epi64, _mm_maskz_shldv_epi64): Ditto.
config/i386/i386-builtin.def (__builtin_ia32_vpshldv_v32hi,
__builtin_ia32_vpshldv_v32hi_mask, __builtin_ia32_vpshldv_v32hi_maskz,
__builtin_ia32_vpshldv_v16hi, __builtin_ia32_vpshldv_v16hi_mask,
__builtin_ia32_vpshldv_v16hi_maskz, __builtin_ia32_vpshldv_v8hi,
__builtin_ia32_vpshldv_v8hi_mask, __builtin_ia32_vpshldv_v8hi_maskz,
__builtin_ia32_vpshldv_v16si, __builtin_ia32_vpshldv_v16si_mask,
__builtin_ia32_vpshldv_v16si_maskz, __builtin_ia32_vpshldv_v8si,
__builtin_ia32_vpshldv_v8si_mask, __builtin_ia32_vpshldv_v8si_maskz,
__builtin_ia32_vpshldv_v4si, __builtin_ia32_vpshldv_v4si_mask,
__builtin_ia32_vpshldv_v4si_maskz, __builtin_ia32_vpshldv_v8di,
__builtin_ia32_vpshldv_v8di_mask, __builtin_ia32_vpshldv_v8di_maskz,
__builtin_ia32_vpshldv_v4di, __builtin_ia32_vpshldv_v4di_mask,
__builtin_ia32_vpshldv_v4di_maskz, __builtin_ia32_vpshldv_v2di,
__builtin_ia32_vpshldv_v2di_mask,
__builtin_ia32_vpshldv_v2di_maskz): New builtins.
config/i386/sse.md (vpshldv_, vpshldv__mask,
vpshldv__maskz, vpshldv__maskz_1): New patterns.

gcc/testsuite/
gcc.target/i386/avx512f-vpshldv-1.c: New test.
gcc.target/i386/avx512f-vpshldvd-2.c: Ditto.
gcc.target/i386/avx512f-vpshldvq-2.c: Ditto.
gcc.target/i386/avx512f-vpshldvw-2.c: Ditto.
gcc.target/i386/avx512vl-vpshldv-1.c: Ditto.
gcc.target/i386/avx512vl-vpshldvd-2.c: Ditto.
gcc.target/i386/avx512vl-vpshldvq-2.c: Ditto.
gcc.target/i386/avx512vl-vpshldvw-2.c: Ditto.
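
For readers new to VBMI2, here is a scalar sketch of the dword form of the variable concatenate-and-shift-left, as I read the Intel reference: the destination lane and the second source lane are concatenated, shifted left by the count modulo 32, and the upper half is kept. The function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical scalar model of one dword lane of VPSHLDVD.
   DST supplies the upper half of the 64-bit value, SRC2 the lower
   half; only the low five bits of COUNT are used.  */
static uint32_t
vpshldvd_lane_ref (uint32_t dst, uint32_t src2, uint32_t count)
{
  uint64_t concat = ((uint64_t) dst << 32) | src2;
  concat <<= (count & 31);
  return (uint32_t) (concat >> 32);
}
```

A count of 0, or any multiple of 32, leaves the destination lane unchanged.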


0009-VPSHLDV-instruction.patch
Description: 0009-VPSHLDV-instruction.patch


[PATCH][i386,AVX] Enable VBMI2 support [6/7]

2017-10-24 Thread Koval, Julia
Hi,
This patch enables VPSHRDV instruction. The doc for isaset and instruction: 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

Ok for trunk?

gcc/
config/i386/avx512vbmi2intrin.h (_mm512_shrdv_epi16,
_mm512_mask_shrdv_epi16, _mm512_maskz_shrdv_epi16, _mm512_shrdv_epi32,
_mm512_mask_shrdv_epi32, _mm512_maskz_shrdv_epi32, _mm512_shrdv_epi64,
_mm512_mask_shrdv_epi64, _mm512_maskz_shrdv_epi64): New intrinsics.
config/i386/avx512vbmi2vlintrin.h (_mm256_shrdv_epi16,
_mm256_mask_shrdv_epi16, _mm256_maskz_shrdv_epi16, _mm256_shrdv_epi32,
_mm256_mask_shrdv_epi32, _mm256_maskz_shrdv_epi32, _mm256_shrdv_epi64,
_mm256_mask_shrdv_epi64, _mm256_maskz_shrdv_epi64, _mm_shrdv_epi16,
_mm_mask_shrdv_epi16, _mm_maskz_shrdv_epi16, _mm_shrdv_epi32,
_mm_mask_shrdv_epi32, _mm_maskz_shrdv_epi32, _mm_shrdv_epi64,
_mm_mask_shrdv_epi64, _mm_maskz_shrdv_epi64): Ditto.
config/i386/i386-builtin-types.def (V32HI_FTYPE_V32HI_V32HI_V32HI,
V32HI_FTYPE_V32HI_V32HI_V32HI_INT, V16HI_FTYPE_V16HI_V16HI_V16HI_INT,
V8HI_FTYPE_V8HI_V8HI_V8HI_INT, V8SI_FTYPE_V8SI_V8SI_V8SI_INT,
V4SI_FTYPE_V4SI_V4SI_V4SI_INT, V8DI_FTYPE_V8DI_V8DI_V8DI,
V8DI_FTYPE_V8DI_V8DI_V8DI_INT, V4DI_FTYPE_V4DI_V4DI_V4DI_INT,
V16SI_FTYPE_V16SI_V16SI_V16SI, V16SI_FTYPE_V16SI_V16SI_V16SI_INT,
V2DI_FTYPE_V2DI_V2DI_V2DI_INT): New types.
config/i386/i386.c (ix86_expand_args_builtin): Handle new types.
config/i386/sse.md (vpshrdv_, vpshrdv__mask,
vpshrdv__maskz, vpshrdv__maskz_1): New pattern.

gcc/testsuite/
gcc.target/i386/avx512f-vpshrdv-1.c: New test.
gcc.target/i386/avx512f-vpshrdvd-2.c: Ditto.
gcc.target/i386/avx512f-vpshrdvq-2.c: Ditto.
gcc.target/i386/avx512f-vpshrdvw-2.c: Ditto.
gcc.target/i386/avx512f-vpshrdw-2.c: Ditto.
gcc.target/i386/avx512vl-vpshrdv-1.c: Ditto.
gcc.target/i386/avx512vl-vpshrdvd-2.c: Ditto.
gcc.target/i386/avx512vl-vpshrdvq-2.c: Ditto.
gcc.target/i386/avx512vl-vpshrdvw-2.c: Ditto.
gcc.target/i386/avx512vl-vpshrdw-2.c: Ditto.
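
The right-shift direction can be sketched the same way. This is a scalar model of one word lane of VPSHRDVW as I read the Intel reference: the second source lane forms the upper half, the destination lane the lower half, the pair is shifted right by the count modulo 16, and the lower half is kept. The function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical scalar model of one word lane of VPSHRDVW.
   SRC2 supplies the upper half of the 32-bit value, DST the lower
   half; only the low four bits of COUNT are used.  */
static uint16_t
vpshrdvw_lane_ref (uint16_t dst, uint16_t src2, uint16_t count)
{
  uint32_t concat = ((uint32_t) src2 << 16) | dst;
  concat >>= (count & 15);
  return (uint16_t) concat;
}
```

The dword and qword forms follow the same pattern with 32 and 64 in place of 16.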


0008-VPSHRDV.PATCH
Description: 0008-VPSHRDV.PATCH


RE: [PATCH][i386,AVX] Enable VBMI2 support [5/7]

2017-10-24 Thread Koval, Julia
Attached the patch

> -----Original Message-----
> From: Koval, Julia
> Sent: Tuesday, October 24, 2017 12:01 PM
> To: GCC Patches <gcc-patches@gcc.gnu.org>
> Cc: Kirill Yukhin <kirill.yuk...@gmail.com>
> Subject: [PATCH][i386,AVX] Enable VBMI2 support [5/7]
> 
> Hi,
> This patch enables VPSHRD instruction. The doc for isaset and instruction:
> https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
> 
> Ok for trunk?
> 
> gcc/
>   config/i386/avx512vbmi2intrin.h (_mm512_shrdi_epi16,
>   _mm512_mask_shrdi_epi16, _mm512_maskz_shrdi_epi16,
> _mm512_shrdi_epi32,
>   _mm512_mask_shrdi_epi32, _mm512_maskz_shrdi_epi32,
> _mm512_shrdi_epi64,
>   _mm512_mask_shrdi_epi64, _mm512_maskz_shrdi_epi64): New
> intrinsics.
>   config/i386/avx512vbmi2vlintrin.h (_mm256_shrdi_epi16,
>   _mm256_mask_shrdi_epi16, _mm256_maskz_shrdi_epi16,
>   _mm256_mask_shrdi_epi32, _mm256_maskz_shrdi_epi32,
> _mm256_shrdi_epi32,
>   _mm256_mask_shrdi_epi64, _mm256_maskz_shrdi_epi64,
> _mm256_shrdi_epi64,
>   _mm_mask_shrdi_epi16, _mm_maskz_shrdi_epi16, _mm_shrdi_epi16,
>   _mm_mask_shrdi_epi32, _mm_maskz_shrdi_epi32, _mm_shrdi_epi32,
>   _mm_mask_shrdi_epi64, _mm_maskz_shrdi_epi64, _mm_shrdi_epi64):
> Ditto.
>   config/i386/i386-builtin.def (__builtin_ia32_vpshrd_v32hi,
>   __builtin_ia32_vpshrd_v32hi_mask, __builtin_ia32_vpshrd_v16hi,
>   __builtin_ia32_vpshrd_v16hi_mask, __builtin_ia32_vpshrd_v8hi,
>   __builtin_ia32_vpshrd_v8hi_mask, __builtin_ia32_vpshrd_v16si,
>   __builtin_ia32_vpshrd_v16si_mask, __builtin_ia32_vpshrd_v8si,
>   __builtin_ia32_vpshrd_v8si_mask, __builtin_ia32_vpshrd_v4si,
>   __builtin_ia32_vpshrd_v4si_mask, __builtin_ia32_vpshrd_v8di,
>   __builtin_ia32_vpshrd_v8di_mask, __builtin_ia32_vpshrd_v4di,
>   __builtin_ia32_vpshrd_v4di_mask, __builtin_ia32_vpshrd_v2di,
>   __builtin_ia32_vpshrd_v2di_mask): New builtins.
>   config/i386/sse.md (vpshrd_): New pattern.
> 
> gcc/testsuite/
>   gcc.target/i386/avx-1.c: Handle new intrinsics.
>   gcc.target/i386/sse-13.c: Ditto.
>   gcc.target/i386/sse-23.c: Ditto.
>   gcc.target/i386/avx512f-vpshrdd-2.c: New.
>   gcc.target/i386/avx512f-vpshrdq-2.c: Ditto.
>   gcc.target/i386/avx512vl-vpshrd-1.c: Ditto.
>   gcc.target/i386/avx512vl-vpshrdd-2.c: Ditto.
>   gcc.target/i386/avx512vl-vpshrdq-2.c: Ditto.


0007-VPSHRD-instruction.patch
Description: 0007-VPSHRD-instruction.patch


[PATCH][i386,AVX] Enable VBMI2 support [5/7]

2017-10-24 Thread Koval, Julia
Hi,
This patch enables VPSHRD instruction. The doc for isaset and instruction: 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

Ok for trunk?

gcc/
config/i386/avx512vbmi2intrin.h (_mm512_shrdi_epi16,
_mm512_mask_shrdi_epi16, _mm512_maskz_shrdi_epi16, _mm512_shrdi_epi32,
_mm512_mask_shrdi_epi32, _mm512_maskz_shrdi_epi32, _mm512_shrdi_epi64,
_mm512_mask_shrdi_epi64, _mm512_maskz_shrdi_epi64): New intrinsics.
config/i386/avx512vbmi2vlintrin.h (_mm256_shrdi_epi16,
_mm256_mask_shrdi_epi16, _mm256_maskz_shrdi_epi16,
_mm256_mask_shrdi_epi32, _mm256_maskz_shrdi_epi32, _mm256_shrdi_epi32,
_mm256_mask_shrdi_epi64, _mm256_maskz_shrdi_epi64, _mm256_shrdi_epi64,
_mm_mask_shrdi_epi16, _mm_maskz_shrdi_epi16, _mm_shrdi_epi16,
_mm_mask_shrdi_epi32, _mm_maskz_shrdi_epi32, _mm_shrdi_epi32,
_mm_mask_shrdi_epi64, _mm_maskz_shrdi_epi64, _mm_shrdi_epi64): Ditto.
config/i386/i386-builtin.def (__builtin_ia32_vpshrd_v32hi,
__builtin_ia32_vpshrd_v32hi_mask, __builtin_ia32_vpshrd_v16hi,
__builtin_ia32_vpshrd_v16hi_mask, __builtin_ia32_vpshrd_v8hi,
__builtin_ia32_vpshrd_v8hi_mask, __builtin_ia32_vpshrd_v16si,
__builtin_ia32_vpshrd_v16si_mask, __builtin_ia32_vpshrd_v8si,
__builtin_ia32_vpshrd_v8si_mask, __builtin_ia32_vpshrd_v4si,
__builtin_ia32_vpshrd_v4si_mask, __builtin_ia32_vpshrd_v8di,
__builtin_ia32_vpshrd_v8di_mask, __builtin_ia32_vpshrd_v4di,
__builtin_ia32_vpshrd_v4di_mask, __builtin_ia32_vpshrd_v2di,
__builtin_ia32_vpshrd_v2di_mask): New builtins.
config/i386/sse.md (vpshrd_): New pattern.

gcc/testsuite/
gcc.target/i386/avx-1.c: Handle new intrinsics.
gcc.target/i386/sse-13.c: Ditto.
gcc.target/i386/sse-23.c: Ditto.
gcc.target/i386/avx512f-vpshrdd-2.c: New.
gcc.target/i386/avx512f-vpshrdq-2.c: Ditto.
gcc.target/i386/avx512vl-vpshrd-1.c: Ditto.
gcc.target/i386/avx512vl-vpshrdd-2.c: Ditto.
gcc.target/i386/avx512vl-vpshrdq-2.c: Ditto.
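
For readers unfamiliar with the instruction, the per-element behaviour of the
shrdi intrinsics listed above can be modelled in scalar C as a funnel shift.
This is only a sketch: the helper name shrd32_model is hypothetical and not
part of the patch, and which operand supplies the upper half of the
concatenation is an assumption that should be checked against the reference
linked in the message.

```c
#include <stdint.h>

/* Scalar model of one 32-bit lane of a concatenate-and-shift-right.
   Assumption: `b` forms the upper half and `a` the lower half of the
   64-bit intermediate, and only the low 5 bits of the count are used.  */
uint32_t
shrd32_model (uint32_t a, uint32_t b, unsigned int imm)
{
  uint64_t concat = ((uint64_t) b << 32) | a;
  return (uint32_t) (concat >> (imm & 31));
}
```

With a count of zero the model returns `a` unchanged, which makes a handy
sanity check when comparing against the real intrinsic.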


[PATCH][i386,AVX] Enable VBMI2 support [4/7]

2017-10-24 Thread Koval, Julia
Hi,
This patch enables VPSHLD instruction. The doc for isaset and instruction: 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

Ok for trunk?

gcc/
config/i386/avx512vbmi2intrin.h (_mm512_shldi_epi16,
_mm512_mask_shldi_epi16, _mm512_maskz_shldi_epi16, _mm512_shldi_epi32,
_mm512_mask_shldi_epi32, _mm512_maskz_shldi_epi32, _mm512_shldi_epi64,
_mm512_mask_shldi_epi64, _mm512_maskz_shldi_epi64): New intrinsics.
config/i386/avx512vbmi2vlintrin.h (_mm256_shldi_epi16,
_mm256_mask_shldi_epi16, _mm256_maskz_shldi_epi16,
_mm256_mask_shldi_epi32, _mm256_maskz_shldi_epi32, _mm256_shldi_epi32,
_mm256_mask_shldi_epi64, _mm256_maskz_shldi_epi64, _mm256_shldi_epi64,
_mm_mask_shldi_epi16, _mm_maskz_shldi_epi16, _mm_shldi_epi16,
_mm_mask_shldi_epi32, _mm_maskz_shldi_epi32, _mm_shldi_epi32,
_mm_mask_shldi_epi64, _mm_maskz_shldi_epi64, _mm_shldi_epi64): Ditto.
config/i386/i386-builtin-types.def (V32HI_FTYPE_V32HI_V32HI_INT,
V32HI_FTYPE_V32HI_V32HI_INT_V32HI_INT, V16SI_FTYPE_V16SI_V16SI_INT,
V16SI_FTYPE_V16SI_V16SI_INT_V16SI_INT,
V8DI_FTYPE_V8DI_V8DI_INT_V8DI_INT, V8SI_FTYPE_V8SI_V8SI_INT_V8SI_INT,
V16HI_FTYPE_V16HI_V16HI_INT_V16HI_INT,
V4DI_FTYPE_V4DI_V4DI_INT_V4DI_INT,
V8HI_FTYPE_V8HI_V8HI_INT_V8HI_INT,
V4SI_FTYPE_V4SI_V4SI_INT_V4SI_INT,
V2DI_FTYPE_V2DI_V2DI_INT_V2DI_INT): New types.
config/i386/i386-builtin.def (__builtin_ia32_vpshld_v32hi,
__builtin_ia32_vpshld_v32hi_mask, __builtin_ia32_vpshld_v16hi,
__builtin_ia32_vpshld_v16hi_mask, __builtin_ia32_vpshld_v8hi,
__builtin_ia32_vpshld_v8hi_mask, __builtin_ia32_vpshld_v16si,
__builtin_ia32_vpshld_v16si_mask, __builtin_ia32_vpshld_v8si,
__builtin_ia32_vpshld_v8si_mask, __builtin_ia32_vpshld_v4si,
__builtin_ia32_vpshld_v4si_mask, __builtin_ia32_vpshld_v8di,
__builtin_ia32_vpshld_v8di_mask, __builtin_ia32_vpshld_v4di,
__builtin_ia32_vpshld_v4di_mask, __builtin_ia32_vpshld_v2di,
__builtin_ia32_vpshld_v2di_mask): New builtins.
config/i386/i386.c (ix86_expand_args_builtin): Handle new types.
config/i386/sse.md (vpshld_): New pattern.

gcc/testsuite/
gcc.target/i386/avx-1.c: Handle new intrinsics.
gcc.target/i386/sse-13.c: Ditto.
gcc.target/i386/sse-23.c: Ditto.
gcc.target/i386/avx512f-vpshld-1.c: New test.
gcc.target/i386/avx512f-vpshldd-2.c: Ditto.
gcc.target/i386/avx512f-vpshldq-2.c: Ditto.
gcc.target/i386/avx512vl-vpshld-1.c: Ditto.
gcc.target/i386/avx512vl-vpshldd-2.c: Ditto.
gcc.target/i386/avx512vl-vpshldq-2.c: Ditto.


0006-VPSHLD-instruction.patch
Description: 0006-VPSHLD-instruction.patch
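
As an illustration of what the shldi intrinsics compute per element, here is a
scalar sketch for the 64-bit case. It is not taken from the patch: the name
shld64_model is hypothetical, and the operand order (first operand as the
upper half) is an assumption to verify against the linked reference. The
64-bit variant is shown because it cannot use a wider intermediate type and
therefore needs the zero-count special case.

```c
#include <stdint.h>

/* Scalar model of one 64-bit lane of a concatenate-and-shift-left.
   Assumption: `a` is the upper half and `b` the lower half of the
   128-bit intermediate; the result is the upper 64 bits after the
   shift, and only the low 6 bits of the count are used.  */
uint64_t
shld64_model (uint64_t a, uint64_t b, unsigned int imm)
{
  unsigned int n = imm & 63;
  if (n == 0)
    return a;                   /* avoid the undefined shift by 64 */
  return (a << n) | (b >> (64 - n));
}
```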


[PATCH][i386,AVX] Enable VBMI2 support [3/7]

2017-10-24 Thread Koval, Julia
Hi,
This patch enables VPEXPANDB[W] instruction. The doc for isaset and 
instruction: 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

Ok for trunk?

gcc/
config/i386/avx512vbmi2intrin.h (_mm512_mask_expand_epi8,
_mm512_maskz_expand_epi8, _mm512_mask_expandloadu_epi8,
_mm512_maskz_expandloadu_epi8, _mm512_mask_expand_epi16,
_mm512_maskz_expand_epi16, _mm512_mask_expandloadu_epi16,
_mm512_maskz_expandloadu_epi16): New intrinsics.
config/i386/avx512vbmi2vlintrin.h (_mm_mask_expand_epi8,
_mm_maskz_expand_epi8, _mm_mask_expandloadu_epi8,
_mm_maskz_expandloadu_epi8, _mm_mask_expand_epi16,
_mm_maskz_expand_epi16, _mm_mask_expandloadu_epi16,
_mm_maskz_expandloadu_epi16, _mm256_mask_expand_epi16,
_mm256_maskz_expand_epi16, _mm256_mask_expandloadu_epi16,
_mm256_maskz_expandloadu_epi16, _mm256_mask_expand_epi8,
_mm256_maskz_expand_epi8, _mm256_mask_expandloadu_epi8,
_mm256_maskz_expandloadu_epi8): New intrinsics.
config/i386/i386-builtin-types.def (V64QI_FTYPE_PCV64QI_V64QI_UDI,
V32HI_FTYPE_PCV32HI_V32HI_USI, V32QI_FTYPE_PCV32QI_V32QI_USI,
V16HI_FTYPE_PCV16HI_V16HI_UHI, V16QI_FTYPE_PCV16QI_V16QI_UHI,
V8HI_FTYPE_PCV8HI_V8HI_UQI): New types.
config/i386/i386.c (ix86_expand_special_args_builtin): Use new types.
config/i386/sse.md (VI248_VLBW): New iterator.
(expand_mask, expand_maskz): New patterns.

gcc/testsuite/
gcc.target/i386/avx512f-vpexpandb-1.c: New test.
gcc.target/i386/avx512f-vpexpandb-2.c: Ditto.
gcc.target/i386/avx512f-vpexpandw-1.c: Ditto.
gcc.target/i386/avx512f-vpexpandw-2.c: Ditto.
gcc.target/i386/avx512vl-vpexpandb-1.c: Ditto.
gcc.target/i386/avx512vl-vpexpandb-2.c: Ditto.
gcc.target/i386/avx512vl-vpexpandw-1.c: Ditto.
gcc.target/i386/avx512vl-vpexpandw-2.c: Ditto.




0005-VPEXPANDB-W-instruction.patch
Description: 0005-VPEXPANDB-W-instruction.patch
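
The mask_expand intrinsics above can be pictured with a small scalar model.
This is a sketch under stated assumptions rather than code from the patch:
expand_epi8_model and its parameter names are hypothetical, the lane count is
passed explicitly (at most 64), and the merge-masking behaviour (inactive
lanes keep the value from `src`) follows my reading of the linked reference.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a byte expand with merge masking.
   Lanes whose mask bit is set receive consecutive elements of `a`,
   starting from a[0]; the other lanes are copied from `src`.
   Assumes lanes <= 64.  */
void
expand_epi8_model (uint8_t *dst, const uint8_t *src, uint64_t mask,
                   const uint8_t *a, size_t lanes)
{
  size_t next = 0;              /* index of the next unread input element */
  for (size_t i = 0; i < lanes; i++)
    dst[i] = ((mask >> i) & 1) ? a[next++] : src[i];
}
```

A zero-masking (maskz) variant would be the same loop with 0 in place of
src[i].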


[PATCH][i386,AVX] Enable VBMI2 support [2/7]

2017-10-24 Thread Koval, Julia
Hi,
This patch enables VPCOMPRESSB[W] instruction. The doc for isaset and 
instruction: 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

Ok for trunk?

gcc/
config.gcc (avx512vbmi2intrin.h, avx512vbmi2vlintrin.h): New headers.
config/i386/avx512vbmi2intrin.h (_mm512_mask_compress_epi8,
_mm512_maskz_compress_epi8, _mm512_mask_compressstoreu_epi8,
_mm512_mask_compress_epi16, _mm512_maskz_compress_epi16,
_mm512_mask_compressstoreu_epi16): New.
config/i386/avx512vbmi2vlintrin.h (_mm_mask_compress_epi8,
_mm_maskz_compress_epi8, _mm256_mask_compressstoreu_epi16,
_mm_mask_compress_epi16, _mm_maskz_compress_epi16,
_mm256_mask_compress_epi16, _mm256_maskz_compress_epi16,
_mm_mask_compressstoreu_epi8, _mm_mask_compressstoreu_epi16,
_mm256_mask_compress_epi8, _mm256_maskz_compress_epi8,
_mm256_mask_compressstoreu_epi8): New.
config/i386/i386-builtin-types.def (VOID_FTYPE_PV64QI_V64QI_UDI,
VOID_FTYPE_PV32HI_V32HI_USI, VOID_FTYPE_PV32QI_V32QI_USI,
VOID_FTYPE_PV16QI_V16QI_UHI, VOID_FTYPE_PV16HI_V16HI_UHI,
VOID_FTYPE_PV8HI_V8HI_UQI): New types.
config/i386/i386-builtin.def (__builtin_ia32_compressqi512_mask,
__builtin_ia32_compresshi512_mask, __builtin_ia32_compressqi256_mask,
__builtin_ia32_compressqi128_mask, __builtin_ia32_compresshi256_mask,
__builtin_ia32_compresshi128_mask,
__builtin_ia32_compressstoreuqi512_mask,
__builtin_ia32_compressstoreuhi512_mask,
__builtin_ia32_compressstoreuqi256_mask,
__builtin_ia32_compressstoreuqi128_mask,
__builtin_ia32_compressstoreuhi256_mask,
__builtin_ia32_compressstoreuhi128_mask): New builtins.
config/i386/i386.c (ix86_init_mmx_sse_builtins): Create special args
array for flags2.
(ix86_expand_special_args_builtin): Handle new types.
(s4fma_expand): Handle new builtin array.
config/i386/immintrin.h: Include new headers.
config/i386/sse.md (VI12_AVX512VLBW): New iterator.
(compress_mask, compressstore_mask): New patterns.

gcc/testsuite/
gcc.target/i386/avx512-check.h: Handle AVX512VBMI2 bit.
gcc.target/i386/avx512f-vpcompressb-1.c: New test.
gcc.target/i386/avx512f-vpcompressb-2.c: Ditto.
gcc.target/i386/avx512f-vpcompressw-1.c: Ditto.
gcc.target/i386/avx512f-vpcompressw-2.c: Ditto.
gcc.target/i386/avx512vl-vpcompressb-1.c: Ditto.
gcc.target/i386/avx512vl-vpcompressb-2.c: Ditto.
gcc.target/i386/avx512vl-vpcompressw-1.c: Ditto.
gcc.target/i386/avx512vl-vpcompressw-2.c: Ditto.
gcc.target/i386/i386.exp (check_effective_target_avx512vbmi2): New.


0004-VPCOMPRESSB-W-instruction.patch
Description: 0004-VPCOMPRESSB-W-instruction.patch
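
Compress is the inverse packing operation to expand. The following scalar
sketch shows the idea; it is not code from the patch. The function name
compress_epi8_model is hypothetical, the lane count is passed explicitly (at
most 64), and the handling of the tail (remaining lanes taken from `src`) is
my understanding of the merge-masking form described in the linked reference.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a byte compress with merge masking.
   Elements of `a` whose mask bit is set are packed contiguously at
   the start of `dst`; the remaining lanes are copied from `src`.
   Assumes lanes <= 64.  */
void
compress_epi8_model (uint8_t *dst, const uint8_t *src, uint64_t mask,
                     const uint8_t *a, size_t lanes)
{
  size_t out = 0;               /* next free position in dst */
  for (size_t i = 0; i < lanes; i++)
    if ((mask >> i) & 1)
      dst[out++] = a[i];
  for (; out < lanes; out++)
    dst[out] = src[out];
}
```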


[PATCH][i386,AVX] Enable VBMI2 support [1/7]

2017-10-24 Thread Koval, Julia
Hi,
This patch enables VBMI2 isaset option. The doc for isaset and instruction: 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

Ok for trunk?

Thanks,
Julia

gcc/
common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512VBMI2_SET,
OPTION_MASK_ISA_AVX512VBMI2_UNSET): New.
(ix86_handle_option): Handle -mavx512vbmi2.
config/i386/cpuid.h: Add bit_AVX512VBMI2.
config/i386/driver-i386.c (host_detect_local_cpu): Handle new bit.
config/i386/i386-c.c (__AVX512VBMI2__): New.
config/i386/i386.c (ix86_target_string): Handle -mavx512vbmi2.
(ix86_valid_target_attribute_inner_p): Ditto.
config/i386/i386.h (TARGET_AVX512VBMI2, TARGET_AVX512VBMI2_P): New.
config/i386/i386.opt (mavx512vbmi2): New option.


0003-VBMI2-option.patch
Description: 0003-VBMI2-option.patch
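
To make the two detection paths in this ChangeLog concrete (the cpuid bit and
the __AVX512VBMI2__ macro), here is a small sketch. It is illustrative only:
SKETCH_BIT_AVX512VBMI2 and both function names are hypothetical, and the bit
position (ECX bit 6 of CPUID leaf 7, subleaf 0) is stated from memory and
should be checked against the linked reference. The caller is expected to
obtain the ECX value itself, so the code stays portable.

```c
#include <stdint.h>

/* Assumed position of the AVX512_VBMI2 feature flag in
   CPUID.(EAX=7,ECX=0):ECX.  Hypothetical name, not the cpuid.h one.  */
#define SKETCH_BIT_AVX512VBMI2 (UINT32_C (1) << 6)

/* Returns 1 if the given leaf-7 ECX value reports VBMI2 support.  */
int
ecx_reports_vbmi2 (uint32_t leaf7_ecx)
{
  return (leaf7_ecx & SKETCH_BIT_AVX512VBMI2) != 0;
}

/* Returns 1 if this translation unit was compiled with the ISA
   enabled, e.g. via -mavx512vbmi2, which defines __AVX512VBMI2__.  */
int
built_with_vbmi2 (void)
{
#ifdef __AVX512VBMI2__
  return 1;
#else
  return 0;
#endif
}
```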


RE: [patch][i386, AVX] GFNI enabling [4/4]

2017-10-17 Thread Koval, Julia
Fixed changelog.

gcc/
* config/i386/gfniintrin.h (_mm_gf2p8mul_epi8, _mm256_gf2p8mul_epi8,
_mm_mask_gf2p8mul_epi8, _mm_maskz_gf2p8mul_epi8,
_mm256_mask_gf2p8mul_epi8, _mm256_maskz_gf2p8mul_epi8,
_mm512_mask_gf2p8mul_epi8, _mm512_maskz_gf2p8mul_epi8,
_mm512_gf2p8mul_epi8): New intrinsics.
* config/i386/i386-builtin-types.def
(V64QI_FTYPE_V64QI_V64QI): New type.
* config/i386/i386-builtin.def (__builtin_ia32_vgf2p8mulb_v64qi,
__builtin_ia32_vgf2p8mulb_v64qi_mask, __builtin_ia32_vgf2p8mulb_v32qi,
__builtin_ia32_vgf2p8mulb_v32qi_mask, __builtin_ia32_vgf2p8mulb_v16qi,
__builtin_ia32_vgf2p8mulb_v16qi_mask): New builtins.
* config/i386/sse.md (vgf2p8mulb_*): New pattern.
* config/i386/i386.c (ix86_expand_args_builtin): Handle new type.

gcc/testsuite/
* gcc.target/i386/avx512f-gf2p8mulb-2.c: New runtime tests.
* gcc.target/i386/avx512vl-gf2p8mulb-2.c: Ditto.
* gcc.target/i386/gfni-1.c: Add tests for GF2P8MUL.
* gcc.target/i386/gfni-2.c: Ditto.
* gcc.target/i386/gfni-3.c: Ditto.
* gcc.target/i386/gfni-4.c: Ditto.

> -Original Message-
> From: Koval, Julia
> Sent: Tuesday, October 17, 2017 3:21 PM
> To: GCC Patches <gcc-patches@gcc.gnu.org>
> Cc: Kirill Yukhin <kirill.yuk...@gmail.com>
> Subject: [patch][i386, AVX] GFNI enabling [4/4]
> 
> Hi,
> This the fourth patch of GFNI ISASET enabling. It enables GF2P8MULB
> instruction, described here:
> https://software.intel.com/sites/default/files/managed/c5/15/architecture-
> instruction-set-extensions-programming-reference.pdf
> 
> gcc/
>   * config/i386/gfniintrin.h (_mm_gf2p8mul_epi8,
> _mm256_gf2p8mul_epi8,
>   _mm_mask_gf2p8mul_epi8, _mm_maskz_gf2p8mul_epi8,
> _mm256_mask_gf2p8mul_epi8,
>   _mm256_maskz_gf2p8mul_epi8, _mm512_mask_gf2p8mul_epi8,
> _mm512_maskz_gf2p8mul_epi8,
>   _mm512_gf2p8mul_epi8): New intrinsics.
>   * config/i386/i386-builtin-types.def (V64QI_FTYPE_V64QI_V64QI): New
> type.
>   * config/i386/i386-builtin.def (__builtin_ia32_vgf2p8mulb_v64qi,
> __builtin_ia32_vgf2p8mulb_v64qi_mask
>   __builtin_ia32_vgf2p8mulb_v32qi,
> __builtin_ia32_vgf2p8mulb_v32qi_mask,
>   __builtin_ia32_vgf2p8mulb_v16qi,
> __builtin_ia32_vgf2p8mulb_v16qi_mask): New builtins.
>   * config/i386/sse.md (vgf2p8mulb_*): New pattern.
>   * config/i386/i386.c (ix86_expand_args_builtin): Handle new type.
> 
> gcc/testsuite/
>   * gcc.target/i386/avx512f-gf2p8mulb-2.c: New runtime tests.
>   * gcc.target/i386/avx512vl-gf2p8mulb-2.c: Ditto.
>   * gcc.target/i386/gfni-1.c: Add tests for GF2P8MUL.
>   * gcc.target/i386/gfni-2.c: Ditto.
>   * gcc.target/i386/gfni-3.c: Ditto.
>   * gcc.target/i386/gfni-4.c: Ditto.
> 
> Ok for trunk?
> 
> Thanks,
> Julia



RE: [patch][i386, AVX] GFNI enabling [3/4]

2017-10-17 Thread Koval, Julia
Thanks for your comments, fixed everything.

gcc/
* config/i386/gfniintrin.h (_mm_gf2p8affine_epi64_epi8,
_mm256_gf2p8affine_epi64_epi8, _mm_mask_gf2p8affine_epi64_epi8,
_mm_maskz_gf2p8affine_epi64_epi8, _mm256_mask_gf2p8affine_epi64_epi8,
_mm256_maskz_gf2p8affine_epi64_epi8,
_mm512_mask_gf2p8affine_epi64_epi8, _mm512_gf2p8affine_epi64_epi8,
_mm512_maskz_gf2p8affine_epi64_epi8): New intrinsics.
* config/i386/i386-builtin.def (__builtin_ia32_vgf2p8affineqb_v64qi,
__builtin_ia32_vgf2p8affineqb_v32qi,
__builtin_ia32_vgf2p8affineqb_v16qi): New builtins.
* config/i386/sse.md (vgf2p8affineqb_): New pattern.

gcc/testsuite/
* gcc.target/i386/avx-1.c: Handle new intrinsics.
* gcc.target/i386/avx512f-gf2p8affineqb-2.c: New runtime tests.
* gcc.target/i386/avx512vl-gf2p8affineqb-2.c: Ditto.
* gcc.target/i386/gfni-1.c: Add tests for GF2P8AFFINE.
* gcc.target/i386/gfni-2.c: Ditto.
* gcc.target/i386/gfni-3.c: Ditto.
* gcc.target/i386/gfni-4.c: Ditto.
* gcc.target/i386/sse-13.c: Handle new tests.
* gcc.target/i386/sse-23.c: Handle new tests.


> -Original Message-
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: Tuesday, October 17, 2017 3:15 PM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: GCC Patches <gcc-patches@gcc.gnu.org>; Kirill Yukhin
> <kirill.yuk...@gmail.com>
> Subject: Re: [patch][i386, AVX] GFNI enabling [3/4]
> 
> On Tue, Oct 17, 2017 at 01:09:50PM +, Koval, Julia wrote:
> > Hi, this the third patch of GFNI ISASET enabling. It enables GF2P8AFFINE
> instruction, described here:
> https://software.intel.com/sites/default/files/managed/c5/15/architecture-
> instruction-set-extensions-programming-reference.pdf
> >
> > gcc/
> > * config/i386/gfniintrin.h (_mm_gf2p8affine_epi64_epi8,
> _mm256_gf2p8affine_epi64_epi8,
> 
> Too long line, even ChangeLog entries should be wrapped to 80 columns.
> 
> > (_mm_mask_gf2p8affine_epi64_epi8,
> _mm_maskz_gf2p8affine_epi64_epi8,
> > _mm256_mask_gf2p8affine_epi64_epi8,
> _mm256_maskz_gf2p8affine_epi64_epi8,
> > _mm512_mask_gf2p8affine_epi64_epi8,
> _mm512_maskz_gf2p8affine_epi64_epi8,
> 
> The above two are also too long (off by 1 char).
> 
> > _mm512_gf2p8affine_epi64_epi8): New intrinsics.
> > * config/i386/i386-builtin.def (__builtin_ia32_vgf2p8affineqb_v64qi,
> > __builtin_ia32_vgf2p8affineqb_v32qi,
> __builtin_ia32_vgf2p8affineqb_v16qi): New builtins.
> 
> And this one too.  Please wrap them.
> 
> > * config/i386/sse.md (vgf2p8affineqb_*): New pattern.
> 
> Use vgf2p8affineqb_ instead of the wild-card?
> 
> I'll defer actual review to Kirill.
> 
>   Jakub


[patch][i386, AVX] GFNI enabling [4/4]

2017-10-17 Thread Koval, Julia
Hi,
This is the fourth patch of GFNI ISASET enabling. It enables the GF2P8MULB 
instruction, described here: 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

gcc/
* config/i386/gfniintrin.h (_mm_gf2p8mul_epi8, _mm256_gf2p8mul_epi8,
_mm_mask_gf2p8mul_epi8, _mm_maskz_gf2p8mul_epi8, 
_mm256_mask_gf2p8mul_epi8,
_mm256_maskz_gf2p8mul_epi8, _mm512_mask_gf2p8mul_epi8, 
_mm512_maskz_gf2p8mul_epi8,
_mm512_gf2p8mul_epi8): New intrinsics.
* config/i386/i386-builtin-types.def (V64QI_FTYPE_V64QI_V64QI): New 
type.
* config/i386/i386-builtin.def (__builtin_ia32_vgf2p8mulb_v64qi, 
__builtin_ia32_vgf2p8mulb_v64qi_mask,
__builtin_ia32_vgf2p8mulb_v32qi, __builtin_ia32_vgf2p8mulb_v32qi_mask,
__builtin_ia32_vgf2p8mulb_v16qi, __builtin_ia32_vgf2p8mulb_v16qi_mask): 
New builtins.
* config/i386/sse.md (vgf2p8mulb_*): New pattern.
* config/i386/i386.c (ix86_expand_args_builtin): Handle new type.

gcc/testsuite/
* gcc.target/i386/avx512f-gf2p8mulb-2.c: New runtime tests.
* gcc.target/i386/avx512vl-gf2p8mulb-2.c: Ditto.
* gcc.target/i386/gfni-1.c: Add tests for GF2P8MUL.
* gcc.target/i386/gfni-2.c: Ditto. 
* gcc.target/i386/gfni-3.c: Ditto. 
* gcc.target/i386/gfni-4.c: Ditto.

Ok for trunk?

Thanks,
Julia



0004-GF2P8MULB-instruction.patch
Description: 0004-GF2P8MULB-instruction.patch
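
For context, the byte-wise operation behind GF2P8MULB is multiplication in
GF(2^8). The sketch below is a scalar reference, not code from the patch; the
name gf2p8mul_model is hypothetical, and the reduction polynomial
x^8 + x^4 + x^3 + x + 1 (the one used by AES) is what I understand the linked
reference to specify.

```c
#include <stdint.h>

/* Scalar model of a GF(2^8) multiply of two bytes.
   Assumption: reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B).
   Classic shift-and-add: for every set bit of `b`, xor in the
   correspondingly shifted and reduced copy of `a`.  */
uint8_t
gf2p8mul_model (uint8_t a, uint8_t b)
{
  uint8_t r = 0;
  for (int i = 0; i < 8; i++)
    {
      if (b & 1)
        r ^= a;
      uint8_t carry = a & 0x80;
      a = (uint8_t) (a << 1);
      if (carry)
        a ^= 0x1B;              /* reduce modulo the field polynomial */
      b >>= 1;
    }
  return r;
}
```

A runtime test could compare each byte produced by the intrinsic with this
kind of scalar model.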


[patch][i386, AVX] GFNI enabling [3/4]

2017-10-17 Thread Koval, Julia
Hi, this is the third patch of GFNI ISASET enabling. It enables the GF2P8AFFINE 
instruction, described here: 
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

gcc/
* config/i386/gfniintrin.h (_mm_gf2p8affine_epi64_epi8, 
_mm256_gf2p8affine_epi64_epi8,
_mm_mask_gf2p8affine_epi64_epi8, _mm_maskz_gf2p8affine_epi64_epi8,
_mm256_mask_gf2p8affine_epi64_epi8, _mm256_maskz_gf2p8affine_epi64_epi8,
_mm512_mask_gf2p8affine_epi64_epi8, _mm512_maskz_gf2p8affine_epi64_epi8,
_mm512_gf2p8affine_epi64_epi8): New intrinsics.
* config/i386/i386-builtin.def (__builtin_ia32_vgf2p8affineqb_v64qi,
__builtin_ia32_vgf2p8affineqb_v32qi, 
__builtin_ia32_vgf2p8affineqb_v16qi): New builtins.
* config/i386/sse.md (vgf2p8affineqb_*): New pattern.

gcc/testsuite/
* gcc.target/i386/avx-1.c: Handle new intrinsics.
* gcc.target/i386/avx512f-gf2p8affineqb-2.c: New runtime tests.
* gcc.target/i386/avx512vl-gf2p8affineqb-2.c: Ditto.
* gcc.target/i386/gfni-1.c: Add tests for GF2P8AFFINE.
* gcc.target/i386/gfni-2.c: Ditto. 
* gcc.target/i386/gfni-3.c: Ditto. 
* gcc.target/i386/gfni-4.c: Ditto.
* gcc.target/i386/sse-13.c: Handle new tests.
* gcc.target/i386/sse-23.c: Handle new tests.

Ok for trunk?

Thanks,
Julia


0003-GF2P8AFFINE-instruction.patch
Description: 0003-GF2P8AFFINE-instruction.patch
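
The affine transformation applied to each byte by GF2P8AFFINE can be sketched
in scalar C as an 8x8 bit-matrix product over GF(2) followed by an xor with
the immediate. This is an illustration, not code from the patch: the function
names are hypothetical, and the row ordering (result bit i uses matrix byte
7 - i) is my reading of the linked reference and should be verified there.

```c
#include <stdint.h>

/* Parity (xor of all bits) of a byte.  */
static unsigned int
parity8 (uint8_t v)
{
  v ^= v >> 4;
  v ^= v >> 2;
  v ^= v >> 1;
  return v & 1;
}

/* Scalar model of the affine transform of a single byte `x`.
   `matrix` holds eight row bytes in one 64-bit value.
   Assumption: result bit i = parity (matrix byte [7 - i] & x),
   then the whole byte is xored with `imm`.  */
uint8_t
gf2p8affine_byte_model (uint64_t matrix, uint8_t x, uint8_t imm)
{
  uint8_t r = 0;
  for (int i = 0; i < 8; i++)
    {
      uint8_t row = (uint8_t) (matrix >> (8 * (7 - i)));
      r |= (uint8_t) (parity8 (row & x) << i);
    }
  return r ^ imm;
}
```

Under this ordering the matrix value 0x0102040810204080 acts as the identity,
which gives a simple first check.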


[patch][x86] GFNI enabling [2/4]

2017-10-17 Thread Koval, Julia
Hi, this is the second patch of enabling GFNI ISASET. It adds GF2P8AFFINEINV 
instruction.
The instruction is described here:
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

gcc/
* config.gcc: Add gfniintrin.h.
* config/i386/gfniintrin.h: New.
* config/i386/i386-builtin.def
(__builtin_ia32_vgf2p8affineinvqb_v64qi,
__builtin_ia32_vgf2p8affineinvqb_v64qi_mask,
__builtin_ia32_vgf2p8affineinvqb_v32qi,
__builtin_ia32_vgf2p8affineinvqb_v32qi_mask,
__builtin_ia32_vgf2p8affineinvqb_v16qi,
__builtin_ia32_vgf2p8affineinvqb_v16qi_mask): New builtins.
* config/i386/i386-builtin-types.def
(V64QI_FTYPE_V64QI_V64QI_INT_V64QI_UDI,
V32QI_FTYPE_V32QI_V32QI_INT_V32QI_USI,
V16QI_FTYPE_V16QI_V16QI_INT_V16QI_UHI,
V64QI_FTYPE_V64QI_V64QI_INT): New types.
* config/i386/i386.c (ix86_expand_args_builtin): Handle new types.
* config/i386/immintrin.h: Include gfniintrin.h.
* config/i386/sse.md (vgf2p8affineinvqb_*): New pattern.

gcc/testsuite/
* gcc.target/i386/avx-1.c: Handle new intrinsics.
* gcc.target/i386/avx512-check.h: Check GFNI bit.
* gcc.target/i386/avx512f-gf2p8affineinvqb-2.c: Runtime test.
* gcc.target/i386/avx512vl-gf2p8affineinvqb-2.c: Runtime test.
* gcc.target/i386/gfni-1.c: New.
* gcc.target/i386/gfni-2.c: New.
* gcc.target/i386/gfni-3.c: New.
* gcc.target/i386/gfni-4.c: New.
* gcc.target/i386/i386.exp (check_effective_target_gfni): New.
* gcc.target/i386/sse-13.c: Handle new intrinsics.
* gcc.target/i386/sse-23.c: Handle new intrinsics.

Ok for trunk?

Thanks,
Julia


0002-GF2P8AFFINEINVQB-instruction.patch
Description: 0002-GF2P8AFFINEINVQB-instruction.patch
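
One way to see what GF2P8AFFINEINV does is to note that it combines a field
inversion with the affine step, which is the structure of the AES S-box. The
sketch below is a scalar illustration under assumptions, not code from the
patch: all function names are hypothetical, the field polynomial is taken to
be x^8 + x^4 + x^3 + x + 1, the inverse of 0 is taken to be 0, and the matrix
row ordering (result bit i uses matrix byte 7 - i) is my reading of the linked
reference.

```c
#include <stdint.h>

/* GF(2^8) multiply, assumed polynomial x^8 + x^4 + x^3 + x + 1.  */
static uint8_t
gf_mul (uint8_t a, uint8_t b)
{
  uint8_t r = 0;
  for (int i = 0; i < 8; i++)
    {
      if (b & 1)
        r ^= a;
      uint8_t carry = a & 0x80;
      a = (uint8_t) (a << 1);
      if (carry)
        a ^= 0x1B;
      b >>= 1;
    }
  return r;
}

/* Inverse as x^254; this naturally maps 0 to 0.  */
static uint8_t
gf_inv (uint8_t x)
{
  uint8_t r = 1;
  for (int i = 0; i < 254; i++)
    r = gf_mul (r, x);
  return r;
}

static unsigned int
parity8 (uint8_t v)
{
  v ^= v >> 4;
  v ^= v >> 2;
  v ^= v >> 1;
  return v & 1;
}

/* Scalar model: invert `x` in the field, then apply the bit matrix
   and xor with `imm`.  */
uint8_t
gf2p8affineinv_byte_model (uint64_t matrix, uint8_t x, uint8_t imm)
{
  uint8_t inv = gf_inv (x);
  uint8_t r = 0;
  for (int i = 0; i < 8; i++)
    {
      uint8_t row = (uint8_t) (matrix >> (8 * (7 - i)));
      r |= (uint8_t) (parity8 (row & inv) << i);
    }
  return r ^ imm;
}
```

With the matrix 0xF1E3C78F1F3E7CF8 and immediate 0x63, this model reproduces
AES S-box entries, which is a convenient cross-check for a runtime test.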


RE: [x86] GFNI enabling[1/4]

2017-10-13 Thread Koval, Julia
Sorry, fixed that.

Thanks,
Julia

> -Original Message-
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: Friday, October 13, 2017 9:08 AM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: GCC Patches <gcc-patches@gcc.gnu.org>; Uros Bizjak
> <ubiz...@gmail.com>; Kirill Yukhin <kirill.yuk...@gmail.com>
> Subject: Re: [x86] GFNI enabling[1/4]
> 
> On Fri, Oct 13, 2017 at 07:03:14AM +, Koval, Julia wrote:
> 
> --- a/gcc/config/i386/i386.opt
> +++ b/gcc/config/i386/i386.opt
> @@ -753,6 +753,10 @@ mrdpid
>  Target Report Mask(ISA_RDPID) Var(ix86_isa_flags2) Save
>  Support RDPID built-in functions and code generation.
> 
> +mgfni
> +Target Report Mask(ISA_GFNI) Var(ix86_isa_flags2) Save
> +Support RDPID built-in functions and code generation.
> +
> 
> Pasto?  It would surprise me if the description was meant to be
> exactly the same as -mrdpid.
> 
>   Jakub


0001-gfni-option.patch
Description: 0001-gfni-option.patch


[x86] GFNI enabling[1/4]

2017-10-13 Thread Koval, Julia
Hi,

gcc/
* gcc/common/config/i386/i386-common.c (OPTION_MASK_ISA_GFNI_SET,
OPTION_MASK_ISA_GFNI_UNSET): New.
(ix86_handle_option): Handle OPT_mgfni.
* gcc/config/i386/cpuid.h (bit_GFNI): New.
* gcc/config/i386/driver-i386.c (host_detect_local_cpu): Detect gfni.
* gcc/config/i386/i386-c.c (ix86_target_macros_internal): Define 
__GFNI__.
* gcc/config/i386/i386.c (ix86_target_string): Add -mgfni.
(ix86_valid_target_attribute_inner_p): Add OPT_mgfni.
* gcc/config/i386/i386.h (TARGET_GFNI, TARGET_GFNI_P): New.
* gcc/config/i386/i386.opt: Add mgfni.

Here is the first patch of GFNI isaset enabling. It adds the new -mgfni option 
for the GFNI isaset and the corresponding cpuid bit.
Docs for new instructions and isasets are here:
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
Ok for trunk? 

Thanks,
Julia


0001-gfni-option.patch
Description: 0001-gfni-option.patch


[patch][x86] Remove old rounding code

2017-06-21 Thread Koval, Julia
Hi,
This patch removes old parallel code for avx512er. Parallel in this case can't 
be generated anymore, because all existing patterns were reworked to unspec in 
r249423 and r249009. Ok for trunk?

gcc/
* gcc/config/i386/i386.c (ix86_erase_embedded_rounding):
Remove code for old rounding pattern.

Thanks,
Julia


0001-remove-code-for-old-rounding.patch
Description: 0001-remove-code-for-old-rounding.patch


RE: [PATCH][X86] Fix rounding pattern similar to PR73350

2017-06-16 Thread Koval, Julia
Hi,

This test hangs on avx512er, maybe that's why:
> According to POSIX, the behavior of a process is undefined after it ignores a 
> SIGFPE, SIGILL, or SIGSEGV signal that was not generated by kill(2) or 
> raise(3).

And volatile makes it work even without the patch (r1 and r2 are not combined then).

Added other changes.

Thanks,
Julia


> -Original Message-
> From: Uros Bizjak [mailto:ubiz...@gmail.com]
> Sent: Wednesday, June 14, 2017 11:54 AM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: Richard Biener <richard.guent...@gmail.com>; Jakub Jelinek
> <ja...@redhat.com>; H.J. Lu <hjl.to...@gmail.com>; GCC Patches <gcc-patches@gcc.gnu.org>; Kirill Yukhin <kirill.yuk...@gmail.com>
> Subject: Re: [PATCH][X86] Fix rounding pattern similar to PR73350
> 
> On Tue, Jun 13, 2017 at 1:37 PM, Koval, Julia <julia.ko...@intel.com> wrote:
> > Thank you for your help. I fixed the test similar to existing sigaction 
> > tests.
> >
> > gcc/
> > * config/i386/i386.c: Fix rounding expand for new pattern.
> > * config/i386/subst.md: Fix pattern (parallel -> unspec).
> > gcc/testsuite/
> > * gcc.target/i386/pr73350-2.c: New test.
> 
> The test will fail at runtime on non-avx512er targets. Can you please
> test the attached testcase?
> 
> Uros.


0001-fix.patch
Description: 0001-fix.patch


RE: [PATCH][X86] Fix rounding pattern similar to PR73350

2017-06-13 Thread Koval, Julia
Thank you for your help. I fixed the test along the lines of the existing sigaction tests.

gcc/
* config/i386/i386.c: Fix rounding expand for new pattern.
* config/i386/subst.md: Fix pattern (parallel -> unspec).
gcc/testsuite/
* gcc.target/i386/pr73350-2.c: New test.

Thanks,
Julia

> -Original Message-
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: Tuesday, June 13, 2017 10:09 AM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: Jakub Jelinek <ja...@redhat.com>; H.J. Lu <hjl.to...@gmail.com>; GCC
> Patches <gcc-patches@gcc.gnu.org>; Uros Bizjak <ubiz...@gmail.com>; Kirill
> Yukhin <kirill.yuk...@gmail.com>
> Subject: Re: [PATCH][X86] Fix rounding pattern similar to PR73350
> 
> On Mon, Jun 12, 2017 at 6:50 PM, Koval, Julia <julia.ko...@intel.com> wrote:
> > I'm so sorry, but I really don't get it. The right result of the test is: 
> > Floating
> point exception (core dumped). The wrong result of the test is: nan(no
> exception). If I get an exception(which is right) - the test is failed 
> anyway. The
> exception is raised in one instruction, I can't get any intermediate value 
> there..
> 
> We do have a few testcases catching these cases by installing a signal
> handler (grep for sigaction in testsuite/)
> 
> Richard.
> 
> > I tried to replaced it with compile time test(attached), which shows, that 
> > both
> instruction are generated(not combined) - is it ok?
> >
> > Thanks,
> > Julia
> >
> >> -Original Message-
> >> From: Jakub Jelinek [mailto:ja...@redhat.com]
> >> Sent: Monday, June 12, 2017 6:18 PM
> >> To: H.J. Lu <hjl.to...@gmail.com>
> >> Cc: Koval, Julia <julia.ko...@intel.com>; GCC Patches <gcc-patches@gcc.gnu.org>; Uros Bizjak <ubiz...@gmail.com>; Kirill Yukhin
> >> <kirill.yuk...@gmail.com>
> >> Subject: Re: [PATCH][X86] Fix rounding pattern similar to PR73350
> >>
> >> On Mon, Jun 12, 2017 at 09:08:00AM -0700, H.J. Lu wrote:
> >> > On Mon, Jun 12, 2017 at 9:06 AM, Koval, Julia <julia.ko...@intel.com>
> wrote:
> >> > > I would like to, but as far as I know the only testcase possible is 
> >> > > below,
> and
> >> as far as I know there is no possibility to use dg-error for runtime
> >> exceptions(Sorry, if I'm wrong). There are only 2 versions of the flag
> exception
> >> or no exception and the error is, when they are combined in CSE.
> >> >
> >> > Can you use
> >> >
> >> > if (wrong)
> >> >   abort ();
> >> >
> >> > in testcase?
> >>
> >> Where wrong can also be if (__builtin_fabsf (somefloatval - expectedval) <
> >> epsilon)
> >> or similar if needed.  Also, the testcase contains many unnecessary
> >> includes, if you use __builtin_abort, I'd hope you only need x86intrin.h 
> >> and
> >> nothing else.  And, main can be just int main (), argc and argv aren't 
> >> used.
> >> >
> >> > >> -Original Message-
> >> > >> From: H.J. Lu [mailto:hjl.to...@gmail.com]
> >> > >> Sent: Monday, June 12, 2017 3:43 PM
> >> > >> To: Koval, Julia <julia.ko...@intel.com>
> >> > >> Cc: GCC Patches <gcc-patches@gcc.gnu.org>; Uros Bizjak
> >> > >> <ubiz...@gmail.com>; Kirill Yukhin <kirill.yuk...@gmail.com>
> >> > >> Subject: Re: [PATCH][X86] Fix rounding pattern similar to PR73350
> >> > >>
> >> > >> On Mon, Jun 12, 2017 at 6:21 AM, Koval, Julia <julia.ko...@intel.com>
> >> wrote:
> >> > >> > This is the same issue as PR73350 and PR80862 for disabling FP
> >> exceptions.
> >> > >> >
> >> > >> > gcc -O0 -mavx512f -mavx512er returns exception
> >> > >> > gcc -O2 -mavx512f -mavx512er returns nan
> >> > >> >
> >> > >> > For this code:
> >> > >> >
> >> > >> > #include 
> >> > >> > #include 
> >> > >> > #include 
> >> > >> > #include 
> >> > >> > #include 
> >> > >> >
> >> > >> > int main(int argc, char *argv[]) {
> >> > >> > __m512 a = _mm512_set1_ps((float) -1);
> >> > >> > __m512 b = _mm512_set1_ps((float) -1);
> >> > >> > _mm_setcsr( _MM_MASK_MASK &~
> >> > >> >
> >> > >>
> (_MM_MASK_OVERFLOW|_MM_MASK_INVALID|_MM_MASK_DIV_ZERO)
> >> );
> >> > >> > __m512 result1 = _mm512_rsqrt28_round_ps(a,
> >> _MM_FROUND_NO_EXC );
> >> > >> > printf("%d %d\n", _MM_FROUND_CUR_DIRECTION,
> >> > >> _MM_FROUND_NO_EXC);
> >> > >> > __m512 result2 = _mm512_rsqrt28_round_ps(a,
> >> > >> _MM_FROUND_CUR_DIRECTION);
> >> > >> >
> >> > >> > printf("%g\n", result1[0] - result2[0]);
> >> > >> >
> >> > >> > return 0;
> >> > >> > }
> >>
> >>   Jakub


0001-fix.patch
Description: 0001-fix.patch
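
The sigaction approach referred to in this message can be sketched as follows.
This is not the testcase from the attached patch (which is not shown in the
thread); it is a generic POSIX illustration with hypothetical names
(raises_sigfpe, probe_raise, probe_quiet). The probe uses raise() instead of a
real faulting instruction so that the behaviour stays well defined.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf fpe_env;

/* Handler: jump back to the point saved by sigsetjmp.  */
static void
on_sigfpe (int sig)
{
  (void) sig;
  siglongjmp (fpe_env, 1);
}

/* Runs `fn` with a SIGFPE handler installed.
   Returns 1 if SIGFPE was delivered, 0 if not, -1 on setup failure.  */
int
raises_sigfpe (void (*fn) (void))
{
  struct sigaction sa, old;
  int caught;

  memset (&sa, 0, sizeof sa);
  sa.sa_handler = on_sigfpe;
  sigemptyset (&sa.sa_mask);
  if (sigaction (SIGFPE, &sa, &old) != 0)
    return -1;

  /* Second argument 1: save and later restore the signal mask.  */
  if (sigsetjmp (fpe_env, 1) == 0)
    {
      fn ();
      caught = 0;
    }
  else
    caught = 1;

  sigaction (SIGFPE, &old, NULL);   /* restore previous disposition */
  return caught;
}

/* Example probes: one that signals, one that does not.  */
void
probe_raise (void)
{
  raise (SIGFPE);
}

void
probe_quiet (void)
{
}
```

In a test for the rounding fix, the probe would instead execute the intrinsic
that is expected to trap, and the test would abort if no signal arrived.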


RE: [PATCH][X86] Fix rounding pattern similar to PR73350

2017-06-12 Thread Koval, Julia
I'm sorry, but I really don't get it. The right result of the test is: 
Floating point exception (core dumped). The wrong result of the test is: nan (no 
exception). If I get an exception (which is right), the test fails anyway. 
The exception is raised by a single instruction, so I can't get any intermediate 
value there.

I tried to replace it with a compile-time test (attached), which shows that both 
instructions are generated (not combined). Is that ok?

Thanks,
Julia

> -Original Message-
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: Monday, June 12, 2017 6:18 PM
> To: H.J. Lu <hjl.to...@gmail.com>
> Cc: Koval, Julia <julia.ko...@intel.com>; GCC Patches <gcc-patches@gcc.gnu.org>; Uros Bizjak <ubiz...@gmail.com>; Kirill Yukhin
> <kirill.yuk...@gmail.com>
> Subject: Re: [PATCH][X86] Fix rounding pattern similar to PR73350
> 
> On Mon, Jun 12, 2017 at 09:08:00AM -0700, H.J. Lu wrote:
> > On Mon, Jun 12, 2017 at 9:06 AM, Koval, Julia <julia.ko...@intel.com> wrote:
> > > I would like to, but as far as I know the only testcase possible is 
> > > below, and
> as far as I know there is no possibility to use dg-error for runtime
> exceptions(Sorry, if I'm wrong). There are only 2 versions of the flag 
> exception
> or no exception and the error is, when they are combined in CSE.
> >
> > Can you use
> >
> > if (wrong)
> >   abort ();
> >
> > in testcase?
> 
> Where wrong can also be if (__builtin_fabsf (somefloatval - expectedval) <
> epsilon)
> or similar if needed.  Also, the testcase contains many unnecessary
> includes, if you use __builtin_abort, I'd hope you only need x86intrin.h and
> nothing else.  And, main can be just int main (), argc and argv aren't used.
> >
> > >> -Original Message-
> > >> From: H.J. Lu [mailto:hjl.to...@gmail.com]
> > >> Sent: Monday, June 12, 2017 3:43 PM
> > >> To: Koval, Julia <julia.ko...@intel.com>
> > >> Cc: GCC Patches <gcc-patches@gcc.gnu.org>; Uros Bizjak
> > >> <ubiz...@gmail.com>; Kirill Yukhin <kirill.yuk...@gmail.com>
> > >> Subject: Re: [PATCH][X86] Fix rounding pattern similar to PR73350
> > >>
> > >> On Mon, Jun 12, 2017 at 6:21 AM, Koval, Julia <julia.ko...@intel.com>
> wrote:
> > >> > This is the same issue as PR73350 and PR80862 for disabling FP
> exceptions.
> > >> >
> > >> > gcc -O0 -mavx512f -mavx512er returns exception
> > >> > gcc -O2 -mavx512f -mavx512er returns nan
> > >> >
> > >> > For this code:
> > >> >
> > >> > #include 
> > >> > #include 
> > >> > #include 
> > >> > #include 
> > >> > #include 
> > >> >
> > >> > int main(int argc, char *argv[]) {
> > >> > __m512 a = _mm512_set1_ps((float) -1);
> > >> > __m512 b = _mm512_set1_ps((float) -1);
> > >> > _mm_setcsr( _MM_MASK_MASK &~
> > >> >
> > >> (_MM_MASK_OVERFLOW|_MM_MASK_INVALID|_MM_MASK_DIV_ZERO)
> );
> > >> > __m512 result1 = _mm512_rsqrt28_round_ps(a,
> _MM_FROUND_NO_EXC );
> > >> > printf("%d %d\n", _MM_FROUND_CUR_DIRECTION,
> > >> _MM_FROUND_NO_EXC);
> > >> > __m512 result2 = _mm512_rsqrt28_round_ps(a,
> > >> _MM_FROUND_CUR_DIRECTION);
> > >> >
> > >> > printf("%g\n", result1[0] - result2[0]);
> > >> >
> > >> > return 0;
> > >> > }
> 
>   Jakub


0001-fix.patch
Description: 0001-fix.patch


RE: [PATCH][X86] Fix rounding pattern similar to PR73350

2017-06-12 Thread Koval, Julia
I would like to, but as far as I know the only possible testcase is the one below, 
and there is no way to use dg-error for runtime exceptions (sorry if I'm wrong). 
The flag has only two variants, exception or no exception, and the error occurs 
when the two are combined in CSE.

> -Original Message-
> From: H.J. Lu [mailto:hjl.to...@gmail.com]
> Sent: Monday, June 12, 2017 3:43 PM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: GCC Patches <gcc-patches@gcc.gnu.org>; Uros Bizjak
> <ubiz...@gmail.com>; Kirill Yukhin <kirill.yuk...@gmail.com>
> Subject: Re: [PATCH][X86] Fix rounding pattern similar to PR73350
> 
> On Mon, Jun 12, 2017 at 6:21 AM, Koval, Julia <julia.ko...@intel.com> wrote:
> > This is the same issue as PR73350 and PR80862 for disabling FP exceptions.
> >
> > gcc -O0 -mavx512f -mavx512er returns exception
> > gcc -O2 -mavx512f -mavx512er returns nan
> >
> > For this code:
> >
> > #include 
> > #include 
> > #include 
> > #include 
> > #include 
> >
> > int main(int argc, char *argv[]) {
> > __m512 a = _mm512_set1_ps((float) -1);
> > __m512 b = _mm512_set1_ps((float) -1);
> > _mm_setcsr( _MM_MASK_MASK &~
> >
> (_MM_MASK_OVERFLOW|_MM_MASK_INVALID|_MM_MASK_DIV_ZERO) );
> > __m512 result1 = _mm512_rsqrt28_round_ps(a, _MM_FROUND_NO_EXC );
> > printf("%d %d\n", _MM_FROUND_CUR_DIRECTION,
> _MM_FROUND_NO_EXC);
> > __m512 result2 = _mm512_rsqrt28_round_ps(a,
> _MM_FROUND_CUR_DIRECTION);
> >
> > printf("%g\n", result1[0] - result2[0]);
> >
> > return 0;
> > }
> >
> > This patch fixes the issue.
> >
> > gcc/
> > * config/i386/i386.c: Fix rounding expand for new pattern
> > * config/i386/subst.md: Fix pattern (parallel -> unspec)
> >
> > Ok for trunk?
> >
> 
> Please include the testcase.
> 
> Thanks.
> 
> --
> H.J.


[PATCH][X86] Fix rounding pattern similar to PR73350

2017-06-12 Thread Koval, Julia
This is the same issue as PR73350 and PR80862 for disabling FP exceptions.

gcc -O0 -mavx512f -mavx512er returns exception
gcc -O2 -mavx512f -mavx512er returns nan

For this code:

#include 
#include 
#include 
#include 
#include 

int main(int argc, char *argv[]) {
__m512 a = _mm512_set1_ps((float) -1);
__m512 b = _mm512_set1_ps((float) -1);
_mm_setcsr( _MM_MASK_MASK &~
  (_MM_MASK_OVERFLOW|_MM_MASK_INVALID|_MM_MASK_DIV_ZERO) );
__m512 result1 = _mm512_rsqrt28_round_ps(a, _MM_FROUND_NO_EXC );
printf("%d %d\n", _MM_FROUND_CUR_DIRECTION, _MM_FROUND_NO_EXC);
__m512 result2 = _mm512_rsqrt28_round_ps(a, _MM_FROUND_CUR_DIRECTION);

printf("%g\n", result1[0] - result2[0]);

return 0;
}
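
For reference, a small sketch of what the _mm_setcsr argument in the test case
above evaluates to. It assumes an x86 target with <xmmintrin.h>; the two helper
names are hypothetical. Clearing a mask bit unmasks that exception, so
overflow, invalid and divide-by-zero will trap while the other three stay
masked:

```c
#include <assert.h>
#include <xmmintrin.h>

/* Value passed to _mm_setcsr in the test case: all exception mask bits,
   minus the three exceptions that should trap.  */
static unsigned int
testcase_csr_value (void)
{
  return _MM_MASK_MASK
	 & ~(_MM_MASK_OVERFLOW | _MM_MASK_INVALID | _MM_MASK_DIV_ZERO);
}

/* Nonzero if the exception selected by MASK_BIT is unmasked in CSR,
   i.e. its mask bit is cleared and the exception traps.  */
static int
is_unmasked (unsigned int csr, unsigned int mask_bit)
{
  assert ((mask_bit & ~(unsigned int) _MM_MASK_MASK) == 0);
  return (csr & mask_bit) == 0;
}
```

Note that writing this value also zeroes the rounding-control and
flush-to-zero fields, which selects round-to-nearest with flush-to-zero off.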

This patch fixes the issue.

gcc/
* config/i386/i386.c: Fix rounding expand for new pattern
* config/i386/subst.md: Fix pattern (parallel -> unspec)

Ok for trunk?

Thanks,
Julia


0001-fix.patch
Description: 0001-fix.patch


[PATCH] Add mov[us]wb store intrinsics

2017-06-08 Thread Koval, Julia
Hi,
This patch adds 9 new intrinsics. Ok for trunk?

gcc/
* config/i386/avx512bwintrin.h (_mm512_mask_cvtepi16_storeu_epi8,
_mm512_mask_cvtsepi16_storeu_epi8,
_mm512_mask_cvtusepi16_storeu_epi8): New intrinsics.
* config/i386/avx512vlbwintrin.h (_mm256_mask_cvtepi16_storeu_epi8,
_mm_mask_cvtsepi16_storeu_epi8, _mm256_mask_cvtsepi16_storeu_epi8,
_mm_mask_cvtusepi16_storeu_epi8, _mm256_mask_cvtusepi16_storeu_epi8,
_mm_mask_cvtepi16_storeu_epi8): New intrinsics.
* config/i386/i386-builtin-types.def (PV8Q, V8QI): New pointer type.
(VOID_FTYPE_PV32QI_V32HI_USI, VOID_FTYPE_PV8QI_V8HI_UQI,
VOID_FTYPE_PV16QI_V16HI_UHI): New function types.
* config/i386/i386-builtin.def (__builtin_ia32_pmovwb128mem_mask,
__builtin_ia32_pmovwb256mem_mask, __builtin_ia32_pmovswb128mem_mask,
__builtin_ia32_pmovswb256mem_mask, __builtin_ia32_pmovuswb128mem_mask,
__builtin_ia32_pmovuswb256mem_mask,
__builtin_ia32_pmovuswb512mem_mask, __builtin_ia32_pmovswb512mem_mask,  
__builtin_ia32_pmovwb512mem_mask): New builtins.
* gcc/config/i386/i386.c (ix86_expand_special_args_builtin): Handle new 
types.
gcc/testsuite/
* gcc.target/i386/avx512bw-vpmovswb-1.c: Add new intrinsics to test.
* gcc.target/i386/avx512bw-vpmovswb-2.c: Ditto.
* gcc.target/i386/avx512bw-vpmovuswb-1.c: Ditto.
* gcc.target/i386/avx512bw-vpmovuswb-2.c: Ditto.
* gcc.target/i386/avx512bw-vpmovwb-1.c: Ditto.
* gcc.target/i386/avx512bw-vpmovwb-2.c: Ditto.
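
To illustrate what the three families of store intrinsics listed above do, here
is a scalar reference model in plain C. It is only a sketch of the semantics
(truncate, signed saturate, unsigned saturate, with inactive elements left
untouched in memory). All function names are hypothetical and this is not the
code from the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Truncating store (cvtepi16): keep the low 8 bits of each active element.  */
static void
model_cvtepi16_store (uint8_t *dst, uint32_t k, const int16_t *src, size_t n)
{
  assert (n <= 32);		/* one mask bit per element */
  for (size_t i = 0; i < n; i++)
    if ((k >> i) & 1)
      dst[i] = (uint8_t) src[i];
}

/* Signed-saturating store (cvtsepi16): clamp to [-128, 127].  */
static void
model_cvtsepi16_store (uint8_t *dst, uint32_t k, const int16_t *src, size_t n)
{
  assert (n <= 32);
  for (size_t i = 0; i < n; i++)
    if ((k >> i) & 1)
      {
	int16_t v = src[i];
	if (v > INT8_MAX)
	  v = INT8_MAX;
	if (v < INT8_MIN)
	  v = INT8_MIN;
	dst[i] = (uint8_t) (int8_t) v;
      }
}

/* Unsigned-saturating store (cvtusepi16): treat the source as unsigned
   and clamp to [0, 255].  */
static void
model_cvtusepi16_store (uint8_t *dst, uint32_t k, const int16_t *src, size_t n)
{
  assert (n <= 32);
  for (size_t i = 0; i < n; i++)
    if ((k >> i) & 1)
      {
	uint16_t v = (uint16_t) src[i];
	dst[i] = v > UINT8_MAX ? UINT8_MAX : (uint8_t) v;
      }
}

/* Returns 1 if the three models give the expected bytes for one input.  */
static int
model_self_check (void)
{
  const int16_t src[4] = { 0x1234, -200, 300, 5 };
  const uint32_t k = 0xB;	/* elements 0, 1 and 3 are active */
  uint8_t t[4] = { 0xAA, 0xAA, 0xAA, 0xAA };
  uint8_t s[4] = { 0xAA, 0xAA, 0xAA, 0xAA };
  uint8_t u[4] = { 0xAA, 0xAA, 0xAA, 0xAA };

  model_cvtepi16_store (t, k, src, 4);
  model_cvtsepi16_store (s, k, src, 4);
  model_cvtusepi16_store (u, k, src, 4);

  return t[0] == 0x34 && t[1] == 0x38 && t[2] == 0xAA && t[3] == 5
	 && s[0] == 0x7F && s[1] == 0x80 && s[2] == 0xAA && s[3] == 5
	 && u[0] == 0xFF && u[1] == 0xFF && u[2] == 0xAA && u[3] == 5;
}
```

A model like this can serve as the expected-value side of a runtime test, with
the intrinsic result compared byte by byte against it.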

Thanks,
Julia


0001-cvtusepi.patch
Description: 0001-cvtusepi.patch


RE: [PATCH][x86][PR73350][PR80862]

2017-06-05 Thread Koval, Julia
Hi,

Item 1 replaces 8 spaces with a tab, as suggested by ./check_GNU_style.sh; should I 
still revert it?
Items 2, 3 and 4 are done.

CSE still works, and SPEC 2k6 on skylake-avx512 has the same score.

PR target/73350,80862
gcc/
* config/i386/subst.md (round): Fix round pattern.
* config/i386/i386.c (ix86_erase_embedded_rounding):
Fix erasing rounding for the fixed pattern.
gcc/testsuite
* gcc.target/i386/pr73350.c: New test.

Ok for trunk?

Thanks,
Julia

> -Original Message-
> From: Kirill Yukhin [mailto:kirill.yuk...@gmail.com]
> Sent: Wednesday, May 31, 2017 12:34 PM
> To: Koval, Julia <julia.ko...@intel.com>
> Cc: GCC Patches <gcc-patches@gcc.gnu.org>; Uros Bizjak
> <ubiz...@gmail.com>; Peryt, Sebastian <sebastian.pe...@intel.com>;
> ja...@redhat.com; richard.guent...@gmail.com
> Subject: Re: [PATCH][x86][PR73350][PR80862]
> 
> On 31 May 11:38, Kirill Yukhin wrote:
> > Hello Julia,
> > On 26 May 09:13, Koval, Julia wrote:
> > > Hi,
> > > This patch fixes these PR's. Ok for trunk?
> > >
> > > gcc/
> > >   * config/i386/subst.md (round): Fix round pattern.
> > >   * config/i386/i386.c (ix86_erase_embedded_rounding):
> > >   Fix erasing rounding for the fixed pattern.
> > >
> > > Thanks,
> > > Julia
> >
> > Let me copy-paste parts of your patch here.
> > diff --git a/gcc/config/i386/subst.md b/gcc/config/i386/subst.md
> > index 0bc22fd..2e632b9 100644
> > --- a/gcc/config/i386/subst.md
> > +++ b/gcc/config/i386/subst.md
> > @@ -137,12 +137,12 @@
> >
> >  (define_subst "round"
> >[(set (match_operand:SUBST_A 0)
> > -(match_operand:SUBST_A 1))]
> > +   (match_operand:SUBST_A 1))]
> >"TARGET_AVX512F"
> > -  [(parallel[
> > - (set (match_dup 0)
> > -  (match_dup 1))
> > - (unspec [(match_operand:SI 2 "const_4_or_8_to_11_operand")]
> UNSPEC_EMBEDDED_ROUNDING)])])
> > +  [(set (match_dup 0)
> > +   (unspec:SUBST_A [(match_dup 1)
> > +   (match_operand:SI 2 "const_4_or_8_to_11_operand")]
> > +UNSPEC_EMBEDDED_ROUNDING))])
> >
> >  (define_subst_attr "round_saeonly_name" "round_saeonly" "" "_round")
> >  (define_subst_attr "round_saeonly_mask_operand2" "mask" "%r2" "%r4")
> > --
> > 2.5.5
> >
> > So, you propose to put RC as third argument to the set expression.
> > I am not sure that we might use (set ...) in such a way.
> Whoops, I was wrong. You are setting w/ SET_SRC as UNSPEC which
> is eliminated conditionally, which is much better to me.
> 
> Few nits:
> 1.
> diff --git a/gcc/config/i386/subst.md b/gcc/config/i386/subst.md
> index 0bc22fd..2e632b9 100644
> --- a/gcc/config/i386/subst.md
> +++ b/gcc/config/i386/subst.md
> @@ -137,12 +137,12 @@
> 
>  (define_subst "round"
>[(set (match_operand:SUBST_A 0)
> -(match_operand:SUBST_A 1))]
> + (match_operand:SUBST_A 1))]
>"TARGET_AVX512F"
> Junk.
> 
> 2. Check indentation pls
> 
> 3. Mention PR in ChangeLog entry
> 
> 4. Add reg test please
> 
> Overall I like this approach. We must somehow set explicit dependency
> between RC and actual op, why not this way?
> 
> Could you pls make sure that CSE is still working for ops w/ identical RC?
> 
> To be paranoid: is it possible to check skylake-avx512 w/ and w/o the patch
> on Spec2k6?
> 
> --
> Thanks, K


0001-test_rounding.patch
Description: 0001-test_rounding.patch


[PATCH][x86][PR73350][PR80862]

2017-05-26 Thread Koval, Julia
Hi,
This patch fixes these PR's. Ok for trunk?

gcc/
* config/i386/subst.md (round): Fix round pattern.
* config/i386/i386.c (ix86_erase_embedded_rounding):
Fix erasing rounding for the fixed pattern.

Thanks,
Julia


0001-patch_1.patch
Description: 0001-patch_1.patch


[PATCH][X86] Add missing xgetbv xsetbv intrinsics

2017-05-12 Thread Koval, Julia
Hi,

This patch adds these missing intrinsics:
_xsetbv
_xgetbv
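
As background, a minimal sketch of what an xgetbv wrapper does, written with
GNU inline asm rather than the builtin added by the patch. It assumes an x86
target with GCC-style <cpuid.h>; the function names are hypothetical:

```c
#include <assert.h>
#include <cpuid.h>

/* Nonzero if the OS has enabled XSAVE, which makes xgetbv usable.  */
static int
os_enabled_xsave (void)
{
  unsigned int eax, ebx, ecx, edx;

  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
    return 0;
  return (ecx & bit_OSXSAVE) != 0;
}

/* Hypothetical stand-in for _xgetbv: reads extended control register
   INDEX and returns EDX:EAX as one 64-bit value.  */
static unsigned long long
sketch_xgetbv (unsigned int index)
{
  unsigned int lo, hi;

  assert (os_enabled_xsave ());
  __asm__ volatile ("xgetbv" : "=a" (lo), "=d" (hi) : "c" (index));
  return ((unsigned long long) hi << 32) | lo;
}

/* Returns 1 if XCR0 bit 0 (x87 state) is set, 0 if it is clear, and -1 if
   the check cannot run because XSAVE is not enabled.  */
static int
xcr0_x87_bit_check (void)
{
  if (!os_enabled_xsave ())
    return -1;
  return (sketch_xgetbv (0) & 1) != 0;
}
```

xsetbv is privileged, so a user-space sketch can only cover the read side.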

gcc/
* config/i386/i386-builtin-types.def (VOID_FTYPE_INT_INT64): New type.
* config/i386/i386-builtin.def (__builtin_ia32_xgetbv,
__builtin_ia32_xsetbv): New builtins.
* config/i386/i386.c (ix86_expand_special_args_builtin): Process new 
type.
(ix86_expand_builtin): Special expand for new intrinsics.
* config/i386/i386.md (UNSPECV_XGETBV, UNSPECV_XSETBV): New.
(xsetbv, xsetbv_rex64, xgetbv, xgetbv_rex64): New patterns.
* config/i386/xsaveintrin.h (_xsetbv, _xgetbv): New intrinsics.

gcc/testsuite
* gcc.target/i386/xgetsetbv.c: New test.

Ok for trunk?

Thanks,
Julia


xgetbv_patch
Description: xgetbv_patch


RE: [PATCH][X86] Add missing intrinsics for VRSQRT14

2017-05-11 Thread Koval, Julia
> Please macroize existing rsqrt14_ pattern.
There is no existing macro that matches this pattern: it masks only the first 
element of the vec_merge instead of masking the whole vector (SDM Vol. 2C 5-527):
https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf

I can create a new macro (patch attached), but I know of only 2 patterns of this 
type, so I'm not sure it is needed. Also, I don't like the (const_int 1) part of 
my macro, but I don't know how to save and use it like the other operands.

-Original Message-
From: Uros Bizjak [mailto:ubiz...@gmail.com] 
Sent: Thursday, May 11, 2017 11:25 AM
To: Koval, Julia <julia.ko...@intel.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>; Kirill Yukhin 
<kirill.yuk...@gmail.com>
Subject: Re: [PATCH][X86] Add missing intrinsics for VRSQRT14

On Thu, May 11, 2017 at 7:43 AM, Koval, Julia <julia.ko...@intel.com> wrote:
> Hi,
>
> These are 4 missing intrinsics for VRSQRT14 instruction. Ok for trunk?
>
> gcc/
> * config/i386/avx512fintrin.h (_mm_mask_rsqrt14_sd, 
> _mm_maskz_rsqrt14_sd,
> _mm_mask_rsqrt14_ss, _mm_maskz_rsqrt14_ss): New intrinsics.
> * config/i386/i386-builtin.def (__builtin_ia32_rsqrt14sd_mask,
> __builtin_ia32_rsqrt14ss_mask): New builtins.
> * config/i386/sse.md (rsqrt14__mask): New pattern.

Please macroize existing rsqrt14_ pattern.

BTW: Please also macroize srcp14_mask from your previous patch in the 
same way.

Thanks,
Uros.

> gcc/testsuite/
> * gcc.target/i386/avx512f-vrsqrt14sd-1.c: Test new intrinsics.
> * gcc.target/i386/avx512f-vrsqrt14sd-2.c: Ditto.
> * gcc.target/i386/avx512f-vrsqrt14ss-1.c: Ditto.
> * gcc.target/i386/avx512f-vrsqrt14ss-2.c: Ditto.
>
> Thanks,
> Julia


0001-pattern.patch
Description: 0001-pattern.patch


[PATCH][X86] Add missing intrinsics for VRSQRT14

2017-05-10 Thread Koval, Julia
Hi,

These are 4 missing intrinsics for VRSQRT14 instruction. Ok for trunk?

gcc/
* config/i386/avx512fintrin.h (_mm_mask_rsqrt14_sd, 
_mm_maskz_rsqrt14_sd,
_mm_mask_rsqrt14_ss, _mm_maskz_rsqrt14_ss): New intrinsics.
* config/i386/i386-builtin.def (__builtin_ia32_rsqrt14sd_mask,
__builtin_ia32_rsqrt14ss_mask): New builtins.
* config/i386/sse.md (rsqrt14__mask): New pattern.

gcc/testsuite/
* gcc.target/i386/avx512f-vrsqrt14sd-1.c: Test new intrinsics.
* gcc.target/i386/avx512f-vrsqrt14sd-2.c: Ditto.
* gcc.target/i386/avx512f-vrsqrt14ss-1.c: Ditto.
* gcc.target/i386/avx512f-vrsqrt14ss-2.c: Ditto.

Thanks,
Julia


0001-rsqrt14.patch
Description: 0001-rsqrt14.patch


RE: [PR80582][X86] Add missing __mm256_set[r] intrinsics

2017-05-09 Thread Koval, Julia
Sorry, fixed that.

Thanks,
Julia

-Original Message-
From: Jakub Jelinek [mailto:ja...@redhat.com] 
Sent: Tuesday, May 09, 2017 11:36 AM
To: Koval, Julia <julia.ko...@intel.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>; Uros Bizjak <ubiz...@gmail.com>; 
Kirill Yukhin <kirill.yuk...@gmail.com>
Subject: Re: [PR80582][X86] Add missing __mm256_set[r] intrinsics

On Tue, May 09, 2017 at 09:28:40AM +, Koval, Julia wrote:
> Hi,
> 
> This patch implements missing intrinsics:
> _mm256_set_m128
> _mm256_set_m128d
> _mm256_set_m128i
> _mm256_setr_m128
> _mm256_setr_m128d
> _mm256_setr_m128i
> 
> gcc/
>   * config/i386/avxintrin.h (_mm256_set_m128, _mm256_set_m128d,
>   _mm256_set_m128i, _mm256_setr_m128, _mm256_setr_m128d,
>   _mm256_setr_m128i): New intrinsics.
> 
> gcc/testsuite/
>   * gcc.target/i386/avx-vinsertf128-256-1: Test new intrinsics.
>   * gcc.target/i386/avx-vinsertf128-256-2: Ditto.
>   * gcc.target/i386/avx-vinsertf128-256-3: Ditto.
> 
> Ok for trunk?

--- a/gcc/config/i386/avxintrin.h
+++ b/gcc/config/i386/avxintrin.h
@@ -746,6 +746,7 @@ _mm256_broadcast_ps (__m128 const *__X)
   return (__m256) __builtin_ia32_vbroadcastf128_ps256 (__X);
 }
 
+
 #ifdef __OPTIMIZE__
 extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_insertf128_pd (__m256d __X, __m128d __Y, const int __O)
@@ -770,7 +771,6 @@ _mm256_insertf128_si256 (__m256i __X, __m128i __Y, const int __O)
  (__v4si)__Y,
  __O);
 }
-
 extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm256_insert_epi32 (__m256i __X, int __D, int const __N)
 {

Why the above extra whitespace changes?  Especially the latter looks 
undesirable, there should be one empty line in between different inline 
functions.


Jakub


0001-set_.patch
Description: 0001-set_.patch


[PR80582][X86] Add missing __mm256_set[r] intrinsics

2017-05-09 Thread Koval, Julia
Hi,

This patch implements missing intrinsics:
_mm256_set_m128
_mm256_set_m128d
_mm256_set_m128i
_mm256_setr_m128
_mm256_setr_m128d
_mm256_setr_m128i
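
For readers unfamiliar with these intrinsics, here is a sketch of the float
variants expressed through the existing insertf128 intrinsic. The sketch_*
names are hypothetical and this is not necessarily how the patch defines them.
It assumes GCC or Clang on x86 and checks for AVX at run time:

```c
#include <assert.h>
#include <immintrin.h>

/* Hypothetical equivalent of _mm256_set_m128 (hi, lo): LO becomes the low
   128 bits of the result and HI the high 128 bits.  */
__attribute__ ((target ("avx")))
static __m256
sketch_set_m128 (__m128 hi, __m128 lo)
{
  return _mm256_insertf128_ps (_mm256_castps128_ps256 (lo), hi, 1);
}

/* Hypothetical equivalent of _mm256_setr_m128 (lo, hi): same result,
   arguments in reversed order.  */
__attribute__ ((target ("avx")))
static __m256
sketch_setr_m128 (__m128 lo, __m128 hi)
{
  return sketch_set_m128 (hi, lo);
}

/* Returns 1 if both helpers place the halves as described above.  */
__attribute__ ((target ("avx")))
static int
lanes_in_order (void)
{
  float a[8], b[8];
  __m128 lo = _mm_setr_ps (0.0f, 1.0f, 2.0f, 3.0f);
  __m128 hi = _mm_setr_ps (4.0f, 5.0f, 6.0f, 7.0f);

  _mm256_storeu_ps (a, sketch_set_m128 (hi, lo));
  _mm256_storeu_ps (b, sketch_setr_m128 (lo, hi));
  for (int i = 0; i < 8; i++)
    if (a[i] != (float) i || b[i] != (float) i)
      return 0;
  return 1;
}

/* Returns -1 when the CPU lacks AVX, otherwise the lanes_in_order result.  */
static int
run_lane_check (void)
{
  assert (sizeof (__m256) == 2 * sizeof (__m128));
  if (!__builtin_cpu_supports ("avx"))
    return -1;
  return lanes_in_order ();
}
```

The d and i variants would follow the same shape with the _pd and _si256
forms of the cast and insert intrinsics.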

gcc/
* config/i386/avxintrin.h (_mm256_set_m128, _mm256_set_m128d,
_mm256_set_m128i, _mm256_setr_m128, _mm256_setr_m128d,
_mm256_setr_m128i): New intrinsics.

gcc/testsuite/
* gcc.target/i386/avx-vinsertf128-256-1: Test new intrinsics.
* gcc.target/i386/avx-vinsertf128-256-2: Ditto.
* gcc.target/i386/avx-vinsertf128-256-3: Ditto.

Ok for trunk?

Thanks,
Julia


0001-set_.patch
Description: 0001-set_.patch


[PATCH][x86] Add missing intrinsics for vrcp14sd/ss instructions.

2017-05-09 Thread Koval, Julia
Hi,

This patch adds missing intrinsics for VRCP14SD and VRCP14SS instructions:
_mm_mask_rcp14_sd
_mm_maskz_rcp14_sd
_mm_mask_rcp14_ss
_mm_maskz_rcp14_ss

These instructions and intrinsics are described in SDM Vol. 2C 5-487.

gcc/
* config/i386/avx512fintrin.h (_mm_mask_rcp14_sd,
_mm_maskz_rcp14_sd, _mm_mask_rcp14_ss,
_mm_maskz_rcp14_ss): New intrinsics.
* config/i386/i386-builtin.def (__builtin_ia32_rcp14sd_mask,
__builtin_ia32_rcp14ss_mask): New builtins.
* config/i386/sse.md (srcp14_mask): New pattern.

gcc/testsuite/
* gcc.target/i386/avx512f-vrcp14sd-1.c: Test new intrinsics.
* gcc.target/i386/avx512f-vrcp14sd-2.c: Ditto.
* gcc.target/i386/avx512f-vrcp14ss-1.c: Ditto.
* gcc.target/i386/avx512f-vrcp14ss-2.c: Ditto.

Thanks,
Julia


0001-vrcp14sd.patch
Description: 0001-vrcp14sd.patch


RE: [PR79793] Fixes interrupt stack alignment

2017-03-06 Thread Koval, Julia
Uhhh, copypaste error, sorry.

PR target/79793
gcc/
* config/i386/i386.c (ix86_minimum_incoming_stack_boundary): Set 
incoming
stack boundary to 128 for 64-bit targets.

gcc/testsuite/
* gcc.target/i386/interrupt-12.c: Update scan-assembler-times 
directives.
* gcc.target/i386/interrupt-13.c: Ditto. 
* gcc.target/i386/interrupt-14.c: Ditto. 
* gcc.target/i386/interrupt-15.c: Ditto.


-Original Message-
From: Koval, Julia 
Sent: Monday, March 06, 2017 1:03 PM
To: 'Uros Bizjak' <ubiz...@gmail.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>; H.J. Lu <hjl.to...@gmail.com>
Subject: RE: [PR79793] Fixes interrupt stack alignment

Ok, fixed it. Can you please commit it for me?

PR target/79731
gcc/
* config/i386/i386.c (ix86_minimum_incoming_stack_boundary): Set 
incoming
stack boundary to 128 for 64-bit targets.

gcc/testsuite/
* gcc.target/i386/interrupt-12.c: Update scan-assembler-times 
directives.
* gcc.target/i386/interrupt-13.c: Ditto. 
* gcc.target/i386/interrupt-14.c: Ditto. 
* gcc.target/i386/interrupt-15.c: Ditto.

Thanks,
Julia

-Original Message-
From: Uros Bizjak [mailto:ubiz...@gmail.com] 
Sent: Friday, March 03, 2017 4:29 PM
To: Koval, Julia <julia.ko...@intel.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>; H.J. Lu <hjl.to...@gmail.com>
Subject: Re: [PR79793] Fixes interrupt stack alignment

On Fri, Mar 3, 2017 at 4:20 PM, Koval, Julia <julia.ko...@intel.com> wrote:
> Hi,
> This patch fixes PR79731. Ok for trunk?
>
> gcc/
> * config/i386/i386.c (ix86_minimum_incoming_stack_boundary): Fix 
> boundary for interrupt

...: Set incoming stack boundary to 128 for 64-bit targets.

> gcc/testsuite/
> * gcc.target/i386/interrupt-12.c: Fix test.
> * gcc.target/i386/interrupt-13.c: Ditto.
> * gcc.target/i386/interrupt-14.c: Ditto.
> * gcc.target/i386/interrupt-15.c: Ditto.

...: Update scan-assembler-times directives.

Please also add "PR target/79731" to the top of ChangeLog entry.

OK with the above ChangeLog changes.

Thanks,
Uros.


RE: [PR79793] Fixes interrupt stack alignment

2017-03-06 Thread Koval, Julia
Ok, fixed it. Can you please commit it for me?

PR target/79731
gcc/
* config/i386/i386.c (ix86_minimum_incoming_stack_boundary): Set 
incoming
stack boundary to 128 for 64-bit targets.

gcc/testsuite/
* gcc.target/i386/interrupt-12.c: Update scan-assembler-times 
directives.
* gcc.target/i386/interrupt-13.c: Ditto. 
* gcc.target/i386/interrupt-14.c: Ditto. 
* gcc.target/i386/interrupt-15.c: Ditto.

Thanks,
Julia

-Original Message-
From: Uros Bizjak [mailto:ubiz...@gmail.com] 
Sent: Friday, March 03, 2017 4:29 PM
To: Koval, Julia <julia.ko...@intel.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>; H.J. Lu <hjl.to...@gmail.com>
Subject: Re: [PR79793] Fixes interrupt stack alignment

On Fri, Mar 3, 2017 at 4:20 PM, Koval, Julia <julia.ko...@intel.com> wrote:
> Hi,
> This patch fixes PR79731. Ok for trunk?
>
> gcc/
> * config/i386/i386.c (ix86_minimum_incoming_stack_boundary): Fix 
> boundary for interrupt

...: Set incoming stack boundary to 128 for 64-bit targets.

> gcc/testsuite/
> * gcc.target/i386/interrupt-12.c: Fix test.
> * gcc.target/i386/interrupt-13.c: Ditto.
> * gcc.target/i386/interrupt-14.c: Ditto.
> * gcc.target/i386/interrupt-15.c: Ditto.

...: Update scan-assembler-times directives.

Please also add "PR target/79731" to the top of ChangeLog entry.

OK with the above ChangeLog changes.

Thanks,
Uros.


[PR79793] Fixes interrupt stack alignment

2017-03-03 Thread Koval, Julia
Hi,
This patch fixes PR79731. Ok for trunk?

gcc/
* config/i386/i386.c (ix86_minimum_incoming_stack_boundary): Fix 
boundary for interrupt
gcc/testsuite/
* gcc.target/i386/interrupt-12.c: Fix test.
* gcc.target/i386/interrupt-13.c: Ditto. 
* gcc.target/i386/interrupt-14.c: Ditto. 
* gcc.target/i386/interrupt-15.c: Ditto.

Julia


patch_interrupt_3.3
Description: patch_interrupt_3.3


RE: [PATCH] Enable RDPID instruction.

2017-02-17 Thread Koval, Julia
Hi,
Can you please commit it for me?

Thanks,
Julia

-Original Message-
From: Uros Bizjak [mailto:ubiz...@gmail.com] 
Sent: Friday, February 17, 2017 1:30 PM
To: Koval, Julia <julia.ko...@intel.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] Enable RDPID instruction.

On Thu, Feb 16, 2017 at 11:56 PM, Koval, Julia <julia.ko...@intel.com> wrote:
> Sorry, here is the right patch (the previous one had a typo). The ChangeLog is right.
>
>
> -Original Message-
> From: Koval, Julia
> Sent: Thursday, February 16, 2017 11:31 PM
> To: 'Uros Bizjak' <ubiz...@gmail.com>
> Cc: GCC Patches <gcc-patches@gcc.gnu.org>
> Subject: RE: [PATCH] Enable RDPID instruction.
>
> Sorry, fixed it.
>
> gcc/
> * common/config/i386/i386-common.c (OPTION_MASK_ISA_RDPID_SET): New.
> (OPTION_MASK_ISA_PKU_UNSET): New.
> (ix86_handle_option): Handle -mrdpid.
> * config/i386/cpuid.h
> (bit_RDPID): New.
> * config/i386/driver-i386.c (host_detect_local_cpu): Detect RDPID 
> feature.
> * config/i386/i386-builtin.def (__builtin_ia32_rdpid): New.
> * config/i386/i386-c.c (ix86_target_macros_internal): Handle RDPID 
> flag.
> * config/i386/i386.c (ix86_target_string): Add -mrdpid to isa2_opts.
> (ix86_valid_target_attribute_inner_p): Add "rdpid".
> (ix86_expand_builtin): Handle IX86_BUILTIN_RDPID.
> * config/i386/i386.h (TARGET_RDPID, TARGET_RDPID_P): New.
> * config/i386/i386.md (define_insn "rdpid"): New.
> * config/i386/i386.opt: Add -mrdpid.
> * config/i386/immintrin.h (_rdpid_u32): New.
>
> gcc/testsuite/
> * gcc.target/i386/rdpid.c: New test.
> * gcc.target/i386/sse-12.c: Add -mrdpid.
> * gcc.target/i386/sse-13.c: Ditto.
> * gcc.target/i386/sse-14.c: Ditto.
> * gcc.target/i386/sse-22.c: Ditto.
> * gcc.target/i386/sse-23.c: Ditto.
> * g++.dg/other/i386-2.C: Ditto.
> * g++.dg/other/i386-3.C: Ditto.

OK for mainline.

Thanks,
Uros.


[PATCH][GCC5] Backport of PR76731 fix

2017-02-16 Thread Koval, Julia
Hi,

This is the GCC 5 backport discussed in 
https://gcc.gnu.org/ml/gcc-patches/2017-02/msg01034.html. Please commit it for 
me if it is OK.

PR target/76731
* config/i386/avx512fintrin.h
(_mm512_i32gather_ps): Change __addr type to void const*.
(_mm512_mask_i32gather_ps): Ditto.
(_mm512_i32gather_pd): Ditto.
(_mm512_mask_i32gather_pd): Ditto.
(_mm512_i64gather_ps): Ditto.
(_mm512_mask_i64gather_ps): Ditto.
(_mm512_i64gather_pd): Ditto.
(_mm512_mask_i64gather_pd): Ditto.
(_mm512_i32gather_epi32): Ditto.
(_mm512_mask_i32gather_epi32): Ditto.
(_mm512_i32gather_epi64): Ditto.
(_mm512_mask_i32gather_epi64): Ditto.
(_mm512_i64gather_epi32): Ditto.
(_mm512_mask_i64gather_epi32): Ditto.
(_mm512_i64gather_epi64): Ditto.
(_mm512_mask_i64gather_epi64): Ditto.
(_mm512_i32scatter_ps): Change __addr type to void*.
(_mm512_mask_i32scatter_ps): Ditto.
(_mm512_i32scatter_pd): Ditto.
(_mm512_mask_i32scatter_pd): Ditto.
(_mm512_i64scatter_ps): Ditto.
(_mm512_mask_i64scatter_ps): Ditto.
(_mm512_i64scatter_pd): Ditto.
(_mm512_mask_i64scatter_pd): Ditto.
(_mm512_i32scatter_epi32): Ditto.
(_mm512_mask_i32scatter_epi32): Ditto.
(_mm512_i32scatter_epi64): Ditto.
(_mm512_mask_i32scatter_epi64): Ditto.
(_mm512_i64scatter_epi32): Ditto.
(_mm512_mask_i64scatter_epi32): Ditto.
(_mm512_i64scatter_epi64): Ditto.
(_mm512_mask_i64scatter_epi64): Ditto.
* config/i386/avx512pfintrin.h
(_mm512_mask_prefetch_i32gather_pd): Change addr type to void const*.
(_mm512_mask_prefetch_i32gather_ps): Ditto.
(_mm512_mask_prefetch_i64gather_pd): Ditto.
(_mm512_mask_prefetch_i64gather_ps): Ditto.
(_mm512_prefetch_i32scatter_pd): Change addr type to void*.
(_mm512_prefetch_i32scatter_ps): Ditto.
(_mm512_mask_prefetch_i32scatter_pd): Ditto.
(_mm512_mask_prefetch_i32scatter_ps): Ditto.
(_mm512_prefetch_i64scatter_pd): Ditto.
(_mm512_prefetch_i64scatter_ps): Ditto.
(_mm512_mask_prefetch_i64scatter_pd): Ditto.
(_mm512_mask_prefetch_i64scatter_ps): Ditto.
* config/i386/avx512vlintrin.h
(_mm256_mmask_i32gather_ps): Change __addr type to void const*.
(_mm_mmask_i32gather_ps): Ditto.
(_mm256_mmask_i32gather_pd): Ditto.
(_mm_mmask_i32gather_pd): Ditto.
(_mm256_mmask_i64gather_ps): Ditto.
(_mm_mmask_i64gather_ps): Ditto.
(_mm256_mmask_i64gather_pd): Ditto.
(_mm_mmask_i64gather_pd): Ditto.
(_mm256_mmask_i32gather_epi32): Ditto.
(_mm_mmask_i32gather_epi32): Ditto.
(_mm256_mmask_i32gather_epi64): Ditto.
(_mm_mmask_i32gather_epi64): Ditto.
(_mm256_mmask_i64gather_epi32): Ditto.
(_mm_mmask_i64gather_epi32): Ditto.
(_mm256_mmask_i64gather_epi64): Ditto.
(_mm_mmask_i64gather_epi64): Ditto.
(_mm256_i32scatter_ps): Change __addr type to void*.
(_mm256_mask_i32scatter_ps): Ditto.
(_mm_i32scatter_ps): Ditto.
(_mm_mask_i32scatter_ps): Ditto.
(_mm256_i32scatter_pd): Ditto.
(_mm256_mask_i32scatter_pd): Ditto.
(_mm_i32scatter_pd): Ditto.
(_mm_mask_i32scatter_pd): Ditto.
(_mm256_i64scatter_ps): Ditto.
(_mm256_mask_i64scatter_ps): Ditto.
(_mm_i64scatter_ps): Ditto.
(_mm_mask_i64scatter_ps): Ditto.
(_mm256_i64scatter_pd): Ditto.
(_mm256_mask_i64scatter_pd): Ditto.
(_mm_i64scatter_pd): Ditto.
(_mm_mask_i64scatter_pd): Ditto.
(_mm256_i32scatter_epi32): Ditto.
(_mm256_mask_i32scatter_epi32): Ditto.
(_mm_i32scatter_epi32): Ditto.
(_mm_mask_i32scatter_epi32): Ditto.
(_mm256_i32scatter_epi64): Ditto.
(_mm256_mask_i32scatter_epi64): Ditto.
(_mm_i32scatter_epi64): Ditto.
(_mm_mask_i32scatter_epi64): Ditto.
(_mm256_i64scatter_epi32): Ditto.
(_mm256_mask_i64scatter_epi32): Ditto.
(_mm_i64scatter_epi32): Ditto.
(_mm_mask_i64scatter_epi32): Ditto.
(_mm256_i64scatter_epi64): Ditto.
(_mm256_mask_i64scatter_epi64): Ditto.
(_mm_i64scatter_epi64): Ditto.
(_mm_mask_i64scatter_epi64): Ditto.
* config/i386/i386-builtin-types.def (V16SF_V16SF_PCFLOAT_V16SI_HI_INT)
(V8DF_V8DF_PCDOUBLE_V8SI_QI_INT, V8SF_V8SF_PCFLOAT_V8DI_QI_INT)
(V8DF_V8DF_PCDOUBLE_V8DI_QI_INT, V16SI_V16SI_PCINT_V16SI_HI_INT)
(V8DI_V8DI_PCINT64_V8SI_QI_INT, V8SI_V8SI_PCINT_V8DI_QI_INT)
(V8DI_V8DI_PCINT64_V8DI_QI_INT, V2DF_V2DF_PCDOUBLE_V4SI_QI_INT)
(V4DF_V4DF_PCDOUBLE_V4SI_QI_INT, V2DF_V2DF_PCDOUBLE_V2DI_QI_INT)
(V4DF_V4DF_PCDOUBLE_V4DI_QI_INT, V4SF_V4SF_PCFLOAT_V4SI_QI_INT)
(V8SF_V8SF_PCFLOAT_V8SI_QI_INT, 
