RE: [PATCH][i386] Adjust vec_construct cost for AVX256/512, penaltize elementwise load vectorization

2018-02-15 Thread Shalnov, Sergey
Richard, I've benchmarked your patch on Skylake with SPEC CPU 20[06|17][fp|int]rate and other smaller benchmark suites. I found that it doesn't regress any benchmark beyond noise but improves 525.x264 by 1.8%, 526.blender by 1.9% and 465.tonto by 3.2%. I think this is a good reason to merge the

RE: [PATCH, i386] Fix ix86_multiplication_cost for SKX

2018-02-08 Thread Shalnov, Sergey
, February 7, 2018 2:15 PM To: Shalnov, Sergey <sergey.shal...@intel.com> Cc: gcc-patches@gcc.gnu.org; Peryt, Sebastian <sebastian.pe...@intel.com>; Ivchenko, Alexander <alexander.ivche...@intel.com>; Kirill Yukhin <kirill.yuk...@gmail.com> Subject: Re: [PATCH, i386] Fix ix86_mul

[PATCH, i386] PR target/83008: Fix for SKX cost model

2018-02-08 Thread Shalnov, Sergey
Hi, This patch contains a cost model change for SKX and closes PR target/83008 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008). It provides the following performance scores in geomean: SPEC CPU2017 intrate +0.6%, SPEC CPU2017 fprate +1.5%, SPEC 2006 [int|fp] no changes beyond noise. I found a

[PATCH, i386] Fix ix86_multiplication_cost for SKX

2018-02-07 Thread Shalnov, Sergey
Hi, This patch is one of a set of patches to fix SKX costs. I think the multiplication cost calculation in the ix86_multiplication_cost() function in gcc/config/i386/i386.c needs to be adjusted. For TARGET_AVX512DQ, emulation is not used and a single vpmullq instruction is emitted. I think we have
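For illustration, the kind of loop affected by this cost is an element-wise 64-bit integer multiply. The sketch below is not taken from the patch; the function name mul_u64 is hypothetical. Whether a given GCC version vectorizes it, and whether it then uses vpmullq, depends on the target options (for example an AVX512DQ-capable -march) and on the cost model itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example, not part of the patch: element-wise 64-bit
   multiply.  When vectorized for a target with AVX512DQ, each vector
   multiply may map to a single vpmullq; without a native 64-bit
   element multiply it has to be emulated with several instructions.  */
void mul_u64(uint64_t *restrict dst, const uint64_t *restrict a,
             const uint64_t *restrict b, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = a[i] * b[i];   /* unsigned: wraps on overflow, no UB */
}
```

Comparing the generated assembly with and without AVX512DQ enabled is one way to see whether the emulation sequence or the single instruction was chosen.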

[PATCH, i386] Avoid SLP vectorization if vector and scalar costs are equal.

2017-12-27 Thread Shalnov, Sergey
Hi, Should we use vector instructions if the scalar and vector costs in SLP are the same? According to a comment already in the source code, we should not use vector instructions in this case. I would like to propose using scalars if the costs are the same. Sergey 2017-12-27 Sergey
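The proposal amounts to a tie-breaking rule in the profitability check. A minimal sketch, assuming two abstract cost numbers; the function name slp_is_profitable is hypothetical and this is not the vectorizer's actual code:

```c
#include <stdbool.h>

/* Hypothetical helper, not a GCC function: decide whether the
   vectorized form should be used, given abstract cost estimates
   for the vector and scalar versions of the same statements.  */
bool slp_is_profitable(unsigned vec_cost, unsigned scalar_cost)
{
    /* Strict comparison: when the costs are equal, keep the scalar code.  */
    return vec_cost < scalar_cost;
}
```

With a non-strict comparison (`<=`) the tie would go to the vector code instead, which is the behavior the message argues against.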

[PATCH, i386] Fix movdi_internal to return MODE_TI with AVX512

2017-11-29 Thread Shalnov, Sergey
Hi, I found that the wrong MODE_XI is used in movdi_internal, which causes zmm generation when the "-march=skylake-avx512 -mprefer-vector-width=128" options are set. This patch fixes the mode and register type but keeps using the AVX512 instruction set. 2017-11-28 Sergey Shalnov gcc/ *

[PATCH, i386] Fix wrong instruction vpcmpeqd generation

2017-11-24 Thread Shalnov, Sergey
Hi, I found that a wrong form of the vpcmpeqd instruction is generated when the "-march=skylake-avx512 -mprefer-vector-width=128" options are set. The following error is emitted at compile stage: Error: invalid register operand for `vpcmpeqd' because the following was generated: vpcmpeqd %xmm16,

[PATCH, i386] Fix registers type for MODE_TI

2017-11-24 Thread Shalnov, Sergey
Hi, I found that wrong ymm registers are generated when the "-march=skylake-avx512 -mprefer-vector-width=128" options are set. The code looks like: movq %r11, 64(%rbx) vpxord %ymm0, %ymm0, %ymm0 vmovdqa64 %xmm0, 32(%rbx) movq %r11, 15584(%rbx) where

RE: [PATCH, i386] Fix behavior for -mprefer-vector-width= option

2017-11-23 Thread Shalnov, Sergey
is set. -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Wednesday, November 22, 2017 9:18 PM To: Shalnov, Sergey <sergey.shal...@intel.com> Cc: gcc-patches@gcc.gnu.org; kirill.yuk...@gmail.com; Koval, Julia <julia.ko...@intel.com>; Senkevich, Andrew <andre

[PATCH, i386] Fix behavior for -mprefer-vector-width= option

2017-11-22 Thread Shalnov, Sergey
Hi, This patch makes the -mprefer-vector-width= option inclusive. This means that using -mprefer-vector-width=128 should also switch on both TARGET_PREFER_AVX128 and TARGET_PREFER_AVX256. It is a minor change to generate "xmm" with -mprefer-vector-width=128 on a platform with "zmm". Sergey
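The "inclusive" behavior can be sketched as a mapping from the requested width to the two preference flags. This is an illustration only: the struct and function names are hypothetical stand-ins for TARGET_PREFER_AVX128 and TARGET_PREFER_AVX256, and the handling of 256, 512 and "none" is an assumption, since the message only spells out the 128 case.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for TARGET_PREFER_AVX128 / TARGET_PREFER_AVX256.  */
struct prefer_flags {
    bool prefer_avx128;
    bool prefer_avx256;
};

/* width_bits: 0 for "none", otherwise 128, 256 or 512.  */
struct prefer_flags prefer_flags_for_width(unsigned width_bits)
{
    struct prefer_flags f = { false, false };

    if (width_bits == 128) {
        /* Inclusive: preferring 128-bit also rules out 512-bit.  */
        f.prefer_avx128 = true;
        f.prefer_avx256 = true;
    } else if (width_bits == 256) {
        /* Assumption: 256 only sets the 256-bit preference.  */
        f.prefer_avx256 = true;
    }
    /* Assumption: 0 ("none") and 512 leave both flags off.  */
    return f;
}
```

Code that only tests the 256-bit preference to avoid "zmm" registers then also does the right thing when 128 was requested.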

RE: [PATCH, i386] Refactor -mprefer-avx[128|256] options into common -mprefer-vector-width=[none|128|256|512]

2017-11-21 Thread Shalnov, Sergey
Uros, Yes, please. Thank you for your proposals and comments. Please commit as you proposed. Sergey -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Tuesday, November 21, 2017 6:13 PM To: Shalnov, Sergey <sergey.shal...@intel.com> Cc: gcc-patches@gcc.g

RE: [PATCH, i386] Refactor -mprefer-avx[128|256] options into common -mprefer-vector-width=[none|128|256|512]

2017-11-21 Thread Shalnov, Sergey
merge this patch if you think it is acceptable. Thank you Sergey -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Tuesday, November 14, 2017 7:57 AM To: Joseph Myers <jos...@codesourcery.com> Cc: Shalnov, Sergey <sergey.shal...@intel.com>; gcc-patches@gcc.gn

[PATCH, i386] Refactor -mprefer-avx[128|256] options into common -mprefer-vector-width=[none|128|256|512]

2017-11-13 Thread Shalnov, Sergey
Hi, Modern architectures provide wider and wider vector registers. This patch implements a common (in the i386 arch) option to set the preferred vector register width for the vectorizer. Currently, GCC has the "-mprefer-avx128" and "-mprefer-avx256" options to limit the maximum vector register width in the vectorizer. To

[PATCH, i386] Enable option -mprefer-avx256 as default for Intel Skylake configuration

2017-11-02 Thread Shalnov, Sergey
Hi, This patch makes the "prefer-avx256" option the default tuning for "skylake-avx512". This is due to better performance of 256-bit code in some cases. For Skylake Server, the Optimization Manual states the following: "Since port 0 and port 1 are 256-bits wide, Intel AVX-512 operations that

RE: [PATCH, i386] Avoid 512-bit mode MOV for prefer-avx256 option in Intel AVX512 configuration

2017-10-20 Thread Shalnov, Sergey
>; Peryt, Sebastian <sebastian.pe...@intel.com> Subject: Re: [PATCH, i386] Avoid 512-bit mode MOV for prefer-avx256 option in Intel AVX512 configuration Hello Sergey, On 06 Oct 14:20, Shalnov, Sergey wrote: > Jakub, > I completely agree with you. I fixed the patch. > Currently, TAR

RE: [PATCH, i386] Avoid 512-bit mode MOV for prefer-avx256 option in Intel AVX512 configuration

2017-10-16 Thread Shalnov, Sergey
Uros, Is this patch (the second one, fixed the way Jakub proposed) OK for the trunk? Could you please merge it? Sergey -Original Message- From: Shalnov, Sergey Sent: Friday, October 6, 2017 4:20 PM To: Jakub Jelinek <ja...@redhat.com> Cc: 'gcc-patches@gcc.gnu.org' <gc

RE: [PATCH, i386] Avoid 512-bit mode MOV for prefer-avx256 option in Intel AVX512 configuration

2017-10-06 Thread Shalnov, Sergey
in case of TARGET_PREFER_AVX256. I would propose to merge this patch as a temporary solution. Sergey -Original Message- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Jakub Jelinek Sent: Friday, October 6, 2017 11:58 AM To: Shalnov, Sergey <sergey.s

[PATCH, i386] Avoid 512-bit mode MOV for prefer-avx256 option in Intel AVX512 configuration

2017-10-06 Thread Shalnov, Sergey
Hi, GCC uses a full 512-bit register when moving an SF/DF value between two registers. The patch avoids 512-bit register usage if the "-mprefer-avx256" option is used. 2017-10-06 Sergey Shalnov gcc/ * config/i386/i386.md (*movsf_internal, *movdf_internal):

RE: [PATCH, i386] Avoid 512-bit vector return constant for Intel AVX512 configuration

2017-09-28 Thread Shalnov, Sergey
Sorry. The patch is changed as you proposed. -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Thursday, September 28, 2017 3:17 PM To: Shalnov, Sergey <sergey.shal...@intel.com> Cc: gcc-patches@gcc.gnu.org; kirill.yuk...@gmail.com; Senkevich, Andrew <andr

[PATCH, i386] Avoid 512-bit vector return constant for Intel AVX512 configuration

2017-09-28 Thread Shalnov, Sergey
Hi, GCC uses a full 512-bit register to return the constant from the function. The patch avoids 512-bit register usage if the "-mprefer-avx256" option is used. 2017-09-28 Sergey Shalnov gcc/ * config/i386/i386.md (*movsf_internal, *movdf_internal): Return

RE: [PATCH, i386] Avoid fixed 512-bit vector size in constant set for Intel AVX512 configuration

2017-09-21 Thread Shalnov, Sergey
or fixed vector length usage. -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Thursday, September 21, 2017 3:54 PM To: Shalnov, Sergey <sergey.shal...@intel.com> Cc: gcc-patches@gcc.gnu.org; kirill.yuk...@gmail.com; Koval, Julia <julia.ko...@intel.com>

[PATCH, i386] Avoid fixed 512-bit vector size in constant set for Intel AVX512 configuration

2017-09-21 Thread Shalnov, Sergey
Hi, GCC uses a full 512-bit register to keep the constant. This constant is used later in the code, but with 128-bit vector length. The patch avoids fixed large vector length usage. For the simple code: void my_test(short *table) { for (int i = 0; i < 128; ++i) { table[i] = -1; } } It

RE: [PATCH, i386] Enable option -mprefer-avx256 added for Intel AVX512 configuration

2017-09-21 Thread Shalnov, Sergey
ember 20, 2017 4:25 PM To: Shalnov, Sergey <sergey.shal...@intel.com> Cc: gcc-patches@gcc.gnu.org; kirill.yuk...@gmail.com; Koval, Julia <julia.ko...@intel.com>; Senkevich, Andrew <andrew.senkev...@intel.com> Subject: Re: [PATCH, i386] Enable option -mprefer-avx256 added for Int

RE: [PATCH, i386] Enable option -mprefer-avx256 added for Intel AVX512 configuration

2017-09-20 Thread Shalnov, Sergey
ros Bizjak [mailto:ubiz...@gmail.com] Sent: Wednesday, September 20, 2017 3:51 PM To: Shalnov, Sergey <sergey.shal...@intel.com> Cc: gcc-patches@gcc.gnu.org; kirill.yuk...@gmail.com; Koval, Julia <julia.ko...@intel.com>; Senkevich, Andrew <andrew.senkev...@intel.com> Subject: Re:

RE: [PATCH, i386] Enable option -mprefer-avx256 added for Intel AVX512 configuration

2017-09-20 Thread Shalnov, Sergey
Uros, Could you please merge the patch into mainline? Thank you Sergey -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Tuesday, September 19, 2017 6:17 PM To: Shalnov, Sergey <sergey.shal...@intel.com> Cc: gcc-patches@gcc.gnu.org; kirill.yuk...@gmail.com;

RE: [PATCH, i386] Enable option -mprefer-avx256 added for Intel AVX512 configuration

2017-09-19 Thread Shalnov, Sergey
anges" in previous message. If you like to change "ix86_autovectorize_vector_sizes" function algorithmically, I would propose to do this in separate patch. Sergey -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Monday, September 18, 2017 11:44 AM To:

RE: [PATCH, i386] Enable option -mprefer-avx256 added for Intel AVX512 configuration

2017-09-15 Thread Shalnov, Sergey
f Of Jakub Jelinek Sent: Thursday, September 14, 2017 2:36 PM To: Shalnov, Sergey <sergey.shal...@intel.com> Cc: gcc-patches@gcc.gnu.org; ubiz...@gmail.com; kirill.yuk...@gmail.com; Koval, Julia <julia.ko...@intel.com>; Senkevich, Andrew <andrew.senkev...@intel.com> Subject: Re:

[PATCH, i386] Enable option -mprefer-avx256 added for Intel AVX512 configuration

2017-09-14 Thread Shalnov, Sergey
Hi, GCC has the option "mprefer-avx128" to use 128-bit AVX registers instead of 256-bit AVX registers in the auto-vectorizer. This patch enables the command line option "mprefer-avx256" that reduces 512-bit register usage in "march=skylake-avx512" mode. This is the initial implementation of the