[PATCH][x86] Remove duplicated headers includes
Hi,

I have cleaned up x86intrin.h by removing redundant header includes. The removed headers were included both directly in x86intrin.h and in immintrin.h, which x86intrin.h itself includes.

Is it ok for trunk?

2018-05-30  Sebastian Peryt

gcc/
    * config/i386/cldemoteintrin.h: Change define from _X86INTRIN_H_INCLUDED
    to _IMMINTRIN_H_INCLUDED.
    * config/i386/pconfigintrin.h: Ditto.
    * config/i386/waitpkgintrin.h: Ditto.
    * config/i386/immintrin.h: Add includes for sgxintrin.h, pconfigintrin.h,
    waitpkgintrin.h and cldemoteintrin.h.
    * config/i386/x86intrin.h: Remove includes for mmintrin.h, xmmintrin.h,
    emmintrin.h, pmmintrin.h, tmmintrin.h, smmintrin.h, wmmintrin.h,
    bmiintrin.h, bmi2intrin.h, lzcntintrin.h, sgxintrin.h, pconfigintrin.h,
    waitpkgintrin.h and cldemoteintrin.h.

Thanks,
Sebastian

Attachment: 0001-Headers-changes.patch
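For readers unfamiliar with the guard macros named in the ChangeLog, here is a minimal sketch of the pattern, assuming the sub-headers reject direct inclusion by checking a macro that the umbrella header defines first. The `_CLDEMOTE_H_INCLUDED` guard and the `sub_header_was_included` helper are illustrative placeholders, not text taken from the patch.

```c
/* Stand-in for what the umbrella header (immintrin.h) does before it
   pulls in its sub-headers.  */
#define _IMMINTRIN_H_INCLUDED

/* Stand-in for the top of a sub-header such as cldemoteintrin.h: refuse
   to be included unless the umbrella header was seen first.  */
#if !defined _IMMINTRIN_H_INCLUDED
# error "Never use <cldemoteintrin.h> directly; include <immintrin.h> instead."
#endif

#ifndef _CLDEMOTE_H_INCLUDED
#define _CLDEMOTE_H_INCLUDED
/* Hypothetical placeholder; the real header declares the intrinsic here.  */
static inline int
sub_header_was_included (void)
{
  return 1;
}
#endif /* _CLDEMOTE_H_INCLUDED */
```

Changing the checked macro from `_X86INTRIN_H_INCLUDED` to `_IMMINTRIN_H_INCLUDED` is what lets the sub-header be pulled in from immintrin.h rather than from x86intrin.h.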
RE: [RFT PATCH, AVX512]: Implement scalar unsigned int->float conversions with AVX512F
> From: Uros Bizjak [mailto:ubiz...@gmail.com]
> Sent: Tuesday, May 22, 2018 8:43 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Peryt, Sebastian <sebastian.pe...@intel.com>; Jakub Jelinek <ja...@redhat.com>
> Subject: [RFT PATCH, AVX512]: Implement scalar unsigned int->float conversions with AVX512F
>
> Hello!
>
> Attached patch implements scalar unsigned int->float conversions with AVX512F.
>
> 2018-05-22  Uros Bizjak  <ubiz...@gmail.com>
>
>     * config/i386/i386.md (*floatuns2_avx512): New insn pattern.
>     (floatunssi2): Also enable for AVX512F and TARGET_SSE_MATH.
>     Rewrite expander pattern.  Emit gen_floatunssi2_i387_with_xmm
>     for non-SSE modes.
>     (floatunsdisf2): Rewrite expander pattern.  Handle TARGET_AVX512F.
>     (floatunsdidf2): Ditto.
>
> testsuite/ChangeLog:
>
> 2018-05-22  Uros Bizjak  <ubiz...@gmail.com>
>
>     * gcc.target/i386/cvt-3.c: New test.
>
> Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}, but not tested on AVX512 target.

I have checked it on x86_64-linux-gnu {,-m32} on SKX and don't see any stability regressions.

Sebastian

> Uros.
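As a rough illustration of the source-level conversions this patch affects, the sketch below shows plain C casts from unsigned integers to floating point. The function names are made up for this note and are not taken from cvt-3.c; which instruction the compiler selects (for example a VCVTUSI2SS/VCVTUSI2SD form when AVX512F is enabled) depends on the target flags and is an assumption here, not something the code itself guarantees.

```c
#include <stdint.h>

/* Scalar unsigned -> floating-point conversions.  Without AVX512F the
   compiler has to synthesize these from signed conversions; with it, a
   single unsigned conversion instruction may be used instead.  */
float
u32_to_float (uint32_t x)
{
  return (float) x;
}

double
u32_to_double (uint32_t x)
{
  return (double) x;
}

float
u64_to_float (uint64_t x)
{
  return (float) x;
}

double
u64_to_double (uint64_t x)
{
  return (double) x;
}
```

Compiling such a file with and without -mavx512f and comparing the assembly is one way to see the effect of the new patterns.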
RE: [RFT PATCH, AVX512]: Implement scalar float->unsigned int truncations with AVX512F
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Uros Bizjak
> Sent: Monday, May 21, 2018 9:55 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Jakub Jelinek; Kirill Yukhin
> Subject: Re: [RFT PATCH, AVX512]: Implement scalar float->unsigned int truncations with AVX512F
>
> On Mon, May 21, 2018 at 4:53 PM, Uros Bizjak wrote:
> > Hello!
> >
> > Attached patch implements scalar float->unsigned int truncations with AVX512F.
> >
> > 2018-05-21  Uros Bizjak
> >
> >     * config/i386/i386.md (fixuns_truncdi2): New insn pattern.
> >     (fixuns_truncsi2_avx512f): Ditto.
> >     (*fixuns_truncsi2_avx512f_zext): Ditto.
> >     (fixuns_truncsi2): Also enable for AVX512F and TARGET_SSE_MATH.
> >     Emit fixuns_truncsi2_avx512f for AVX512F targets.
> >
> > testsuite/ChangeLog:
> >
> > 2018-05-21  Uros Bizjak
> >
> >     * gcc.target/i386/cvt-2.c: New test.
> >
> > Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
> >
> > Unfortunately, I have no means to test the patch on AVX512 target, so to avoid some hidden issue, I'd like to ask someone to test it on live target.

I've bootstrapped and regression tested your patch on x86_64-linux-gnu {,-m32} on an SKX machine and I don't see any stability regressions.

Sebastian

> Oops, ssemodesuffix handling was missing in the insn mnemonic. Fixed in the attached v-2 patch.
>
> Uros.
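For context, the truncations in question correspond to ordinary C casts from floating point to unsigned integers, as in the sketch below. The function names are invented for this note and are not taken from cvt-2.c; whether the compiler emits a single truncating unsigned conversion (a VCVTTSS2USI/VCVTTSD2USI form) is an assumption that depends on AVX512F being enabled.

```c
#include <stdint.h>

/* Scalar floating-point -> unsigned truncations (round toward zero).
   The result is only defined by the C standard when the truncated value
   fits in the destination type.  */
uint32_t
float_to_u32 (float x)
{
  return (uint32_t) x;
}

uint32_t
double_to_u32 (double x)
{
  return (uint32_t) x;
}

uint64_t
float_to_u64 (float x)
{
  return (uint64_t) x;
}

uint64_t
double_to_u64 (double x)
{
  return (uint64_t) x;
}
```

Values at or above 2^31 (or 2^63 for the 64-bit case) are the interesting inputs, since those are the ones a signed-only conversion cannot handle directly.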
RE: [PATCH][i386] Adding WAITPKG instructions
> -Original Message- > From: Uros Bizjak [mailto:ubiz...@gmail.com] > Sent: Thursday, May 10, 2018 3:26 PM > To: Peryt, Sebastian <sebastian.pe...@intel.com> > Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin <kirill.yuk...@gmail.com> > Subject: Re: [PATCH][i386] Adding WAITPKG instructions > > On Thu, May 10, 2018 at 2:50 PM, Peryt, Sebastian <sebastian.pe...@intel.com> > wrote: > > Hi Uros, > > > > Updated patch attached, please find comments below. > > > >> -Original Message- > >> From: Uros Bizjak [mailto:ubiz...@gmail.com] > >> Sent: Wednesday, May 9, 2018 1:47 PM > >> To: Peryt, Sebastian <sebastian.pe...@intel.com> > >> Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin <kirill.yuk...@gmail.com> > >> Subject: Re: [PATCH][i386] Adding WAITPKG instructions > >> > >> On Tue, May 8, 2018 at 1:34 PM, Peryt, Sebastian > >> <sebastian.pe...@intel.com> > >> wrote: > >> > Hi, > >> > > >> > This patch adds support for WAITPKG instructions. > >> > > >> > Is it ok for trunk and after few day for backport to GCC-8? > >> > > > (Removed) > >> > > >> > > >> > >> +case IX86_BUILTIN_UMONITOR: > >> + arg0 = CALL_EXPR_ARG (exp, 0); > >> + op0 = expand_normal (arg0); > >> + if (!REG_P (op0)) > >> +op0 = ix86_zero_extend_to_Pmode (op0); > >> + > >> + emit_insn (ix86_gen_umonitor (op0)); > >> + return 0; > >> > >> Please see how movdir64b handles its address operand. Also, do not > >> use global ix86_gen_monitor, just expand directly in the same way as > movdir64b. > >> > > > > Fixed. 
> > > >> +case IX86_BUILTIN_UMWAIT: > >> +case IX86_BUILTIN_TPAUSE: > >> + rtx eax, edx, op1_lo, op1_hi; > >> + arg0 = CALL_EXPR_ARG (exp, 0); > >> + arg1 = CALL_EXPR_ARG (exp, 1); > >> + op0 = expand_normal (arg0); > >> + op1 = expand_normal (arg1); > >> + eax = gen_rtx_REG (SImode, AX_REG); > >> + edx = gen_rtx_REG (SImode, DX_REG); > >> + if (!REG_P (op0)) > >> +op0 = copy_to_mode_reg (SImode, op0); > >> + if (!REG_P (op1)) > >> +op1 = copy_to_mode_reg (DImode, op1); > >> + op1_lo = gen_lowpart (SImode, op1); > >> + op1_hi = expand_shift (RSHIFT_EXPR, DImode, op1, > >> + GET_MODE_BITSIZE (SImode), 0, 1); > >> + op1_hi = convert_modes (SImode, DImode, op1_hi, 1); > >> + emit_move_insn (eax, op1_lo); > >> + emit_move_insn (edx, op1_hi); > >> + emit_insn (fcode == IX86_BUILTIN_UMWAIT > >> +? gen_umwait (op0, eax, edx) > >> +: gen_tpause (op0, eax, edx)); > >> + > >> + /* Return current CF value. */ > >> + op3 = gen_rtx_REG (CCCmode, FLAGS_REG); > >> + target = gen_rtx_LTU (QImode, op3, const0_rtx); > >> + > >> + return target; > >> > >> For the above code, please see how xsetbv expansion and patterns are > >> handling their input operands. There should be two patterns, one for > >> 32bit and the other for 64bit targets. The patterns will need to set > >> FLAGS_REG, otherwise the test will be removed. > >> > > > > I copied what is done for xsetbv expansion and most likely I found some bug > > in > GCC. > > The problem is that when I use 3 arguments and compile as 64bit > > version upper part of rax is not cleared. It doesn't appear when I'm using > > 2 or 4 > function arguments. > > Most likely error is caused by the fact that rdx is used both as an > > input for function and argument in instruction. > > There is no need to clear upper parts of 64bit register. As specified in the > ISA > (and modelled with RTX pattern), the instruction (e.g. > tpause) reads only lower 32 bits from %rax and %rdx. 
> Implicitly, the instruction should ignore upper 32 bits by itself, so we can use SUBREGs. If this is not the case, we need to use DImode input arguments in RTX pattern and explicitly emit zero-extension insns to clear upper 32 bits of input arguments.

Ok, I agree with you regarding clearing. But there is still one thing bothering me, as explained in my last email. The problem appears when I use 3 arguments and compile as a 64bit version. The generated assembly is different from when I'm adding an extra unused argument or removing one function argument not related to
RE: [PATCH][i386] Adding CLDEMOTE instruction
> -Original Message- > From: Uros Bizjak [mailto:ubiz...@gmail.com] > Sent: Wednesday, May 9, 2018 1:53 PM > To: Peryt, Sebastian <sebastian.pe...@intel.com> > Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin <kirill.yuk...@gmail.com> > Subject: Re: [PATCH][i386] Adding CLDEMOTE instruction > > On Tue, May 8, 2018 at 1:58 PM, Peryt, Sebastian <sebastian.pe...@intel.com> > wrote: > > Sorry, forgot attachment. > > > > Sebastian > > > > > > -Original Message- > > From: Peryt, Sebastian > > Sent: Tuesday, May 8, 2018 1:56 PM > > To: gcc-patches@gcc.gnu.org > > Cc: Uros Bizjak <ubiz...@gmail.com>; Kirill Yukhin > > <kirill.yuk...@gmail.com>; Peryt, Sebastian > > <sebastian.pe...@intel.com> > > Subject: [PATCH][i386] Adding CLDEMOTE instruction > > > > Hi, > > > > This patch adds support for CLDEMOTE instruction. > > > > Is it ok for trunk and after few day for backport to GCC-8? > > > > 2018-05-08 Sebastian Peryt <sebastian.pe...@intel.com> > > > > gcc/ > > > > * common/config/i386/i386-common.c > (OPTION_MASK_ISA_CLDEMOTE_SET, > > OPTION_MASK_ISA_CLDEMOTE_UNSET): New defines. > > (ix86_handle_option): Handle -mcldemote. > > * config.gcc: New header. > > * config/i386/cldemoteintrin.h: New file. > > * config/i386/cpuid.h (bit_CLDEMOTE): New bit. > > * config/i386/driver-i386.c (host_detect_local_cpu): Detect > > -mcldemote. > > * config/i386/i386-c.c (ix86_target_macros_internal): Handle > > OPTION_MASK_ISA_CLDEMOTE. > > * config/i386/i386.c (ix86_target_string): Added -mcldemote. > > (ix86_valid_target_attribute_inner_p): Ditto. > > (enum ix86_builtins): Added IX86_BUILTIN_CLDEMOTE. > > (ix86_init_mmx_sse_builtins): Define __builtin_ia32_cldemote. > > (ix86_expand_builtin): Expand IX86_BUILTIN_CLDEMOTE. > > * config/i386/i386.h (TARGET_CLDEMOTE, TARGET_CLDEMOTE_P): New. > > * config/i386/i386.md (UNSPECV_CLDEMOTE): New. > > (cldemote): New. > > * config/i386/i386.opt: Added -mcldemote. > > * config/i386/x86intrin.h: New header. > > * doc/invoke.texi: Added -mcldemote. 
> >
> > 2018-05-08  Sebastian Peryt  <sebastian.pe...@intel.com>
> >
> > gcc/testsuite/
> >
> >     * gcc.target/i386/cldemote-1.c: New test.
>
> OK for mainline.
>
> Is there a compelling reason why we want this new feature in gcc-8 release branch?

After some additional internal discussion, I concluded that a backport is not required for now. I'll backport it if and when it becomes necessary.

> Thanks,
> Uros.

Thanks,
Sebastian
RE: [PATCH][i386] Adding WAITPKG instructions
Hi Uros,

Updated patch attached, please find comments below.

> -----Original Message-----
> From: Uros Bizjak [mailto:ubiz...@gmail.com]
> Sent: Wednesday, May 9, 2018 1:47 PM
> To: Peryt, Sebastian <sebastian.pe...@intel.com>
> Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin <kirill.yuk...@gmail.com>
> Subject: Re: [PATCH][i386] Adding WAITPKG instructions
>
> On Tue, May 8, 2018 at 1:34 PM, Peryt, Sebastian <sebastian.pe...@intel.com> wrote:
> > Hi,
> >
> > This patch adds support for WAITPKG instructions.
> >
> > Is it ok for trunk and after few day for backport to GCC-8?
> >
(Removed)
>
> +case IX86_BUILTIN_UMONITOR:
> +  arg0 = CALL_EXPR_ARG (exp, 0);
> +  op0 = expand_normal (arg0);
> +  if (!REG_P (op0))
> +    op0 = ix86_zero_extend_to_Pmode (op0);
> +
> +  emit_insn (ix86_gen_umonitor (op0));
> +  return 0;
>
> Please see how movdir64b handles its address operand. Also, do not use global
> ix86_gen_monitor, just expand directly in the same way as movdir64b.

Fixed.

> +case IX86_BUILTIN_UMWAIT:
> +case IX86_BUILTIN_TPAUSE:
> +  rtx eax, edx, op1_lo, op1_hi;
> +  arg0 = CALL_EXPR_ARG (exp, 0);
> +  arg1 = CALL_EXPR_ARG (exp, 1);
> +  op0 = expand_normal (arg0);
> +  op1 = expand_normal (arg1);
> +  eax = gen_rtx_REG (SImode, AX_REG);
> +  edx = gen_rtx_REG (SImode, DX_REG);
> +  if (!REG_P (op0))
> +    op0 = copy_to_mode_reg (SImode, op0);
> +  if (!REG_P (op1))
> +    op1 = copy_to_mode_reg (DImode, op1);
> +  op1_lo = gen_lowpart (SImode, op1);
> +  op1_hi = expand_shift (RSHIFT_EXPR, DImode, op1,
> +                         GET_MODE_BITSIZE (SImode), 0, 1);
> +  op1_hi = convert_modes (SImode, DImode, op1_hi, 1);
> +  emit_move_insn (eax, op1_lo);
> +  emit_move_insn (edx, op1_hi);
> +  emit_insn (fcode == IX86_BUILTIN_UMWAIT
> +             ? gen_umwait (op0, eax, edx)
> +             : gen_tpause (op0, eax, edx));
> +
> +  /* Return current CF value.  */
> +  op3 = gen_rtx_REG (CCCmode, FLAGS_REG);
> +  target = gen_rtx_LTU (QImode, op3, const0_rtx);
> +
> +  return target;
>
> For the above code, please see how xsetbv expansion and patterns are handling
> their input operands. There should be two patterns, one for 32bit and the other
> for 64bit targets. The patterns will need to set FLAGS_REG, otherwise the test
> will be removed.

I copied what is done for xsetbv expansion and most likely I found some bug in GCC. The problem is that when I use 3 arguments and compile as 64bit version, the upper part of rax is not cleared. It doesn't appear when I'm using 2 or 4 function arguments. Most likely the error is caused by the fact that rdx is used both as an input for the function and as an argument in the instruction.

When using 3 operands:

bar:
.LFB5450:
        .cfi_startproc
        movq    %rdx, %rax
        umonitor        %rdi
        movq    %rdx, %rcx
        shrq    $32, %rcx
        movq    %rcx, %rdx
        umwait  %esi
        setc    %al
        ret
        .cfi_endproc

When using 4 operands:

bar:
.LFB5450:
        .cfi_startproc
        movl    %edx, %esi
        umonitor        %rdi
        movq    %rcx, %rax
        shrq    $32, %rax
        movq    %rax, %rdx
        movl    %ecx, %eax
        umwait  %esi
        setc    %al
        ret
        .cfi_endproc

Can you please suggest how to proceed here? I cannot open new PR without adding this instruction first. Or maybe you know how to resolve it?

> +(define_insn "umwait"
> +  [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")
> +                     (use (match_operand:SI 1 "register_operand" "a"))
> +                     (use (match_operand:SI 2 "register_operand" "d"))]
> +                    UNSPECV_UMWAIT)]
> +  "TARGET_WAITPKG"
> +  "umwait\t{%0}"
> +  [(set_attr "length" "3")])
>
> No need for "use" RTX here and in other patterns. You should also remove {}
> from insn template, otherwise there will be no operand printed in some asm
> dialect.

Fixed.

> Uros.

Sebastian

Attachment: 0001-WAITPKG-v2.patch
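The UMWAIT/TPAUSE expansion quoted above splits the 64-bit second argument into two 32-bit halves, one for EAX and one for EDX. The following is a plain C model of that split, not GCC internals; the `edx_eax_pair` type and both function names are invented for this illustration.

```c
#include <stdint.h>

/* Hypothetical container for the two 32-bit register inputs.  */
typedef struct
{
  uint32_t eax;  /* low 32 bits of the 64-bit value */
  uint32_t edx;  /* high 32 bits of the 64-bit value */
} edx_eax_pair;

/* Mirrors the quoted sequence: take the low part for EAX, then shift
   right by the width of a 32-bit word and narrow for EDX.  */
edx_eax_pair
split_counter (uint64_t counter)
{
  edx_eax_pair r;
  r.eax = (uint32_t) counter;
  r.edx = (uint32_t) (counter >> 32);
  return r;
}

/* Inverse operation, showing that no information is lost.  */
uint64_t
join_counter (edx_eax_pair r)
{
  return ((uint64_t) r.edx << 32) | r.eax;
}
```

Only the low 32 bits of each half matter in this model, which matches the point made in the review that the instruction reads only the lower 32 bits of %rax and %rdx.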
RE: [PATCH 1/3] Add PTWRITE builtins for x86
I have rebased this patch to the latest trunk and addressed comments. Also, there was a test in changelog, but not in the patch itself - this has been added. Is it ok for trunk and backport to GCC-8 after few days? gcc/ * common/config/i386/i386-common.c (OPTION_MASK_ISA_PTWRITE_SET, OPTION_MASK_ISA_PTWRITE_UNSET): New. (ix86_handle_option): Handle OPT_mptwrite. * config/i386/cpuid.h (bit_PTWRITE): Add. * config/i386/driver-i386.c (host_detect_local_cpu): Detect PTWRITE CPUID. * config/i386/i386-builtin.def (PTWRITE): Add PTWRITE. * config/i386/i386-c.c (ix86_target_macros_internal): Support __PTWRITE__. * config/i386/i386.c (ix86_target_string): Add -mptwrite. (ix86_valid_target_attribute_inner_p): Support ptwrite. (ix86_init_mmx_sse_builtins): Add edges detection for ptwrites generated by vartrace. * config/i386/i386.h (TARGET_PTWRITE): Add. (TARGET_PTWRITE_P): Add. * config/i386/i386.md: Add ptwrite. * config/i386/i386.opt: Add -mptwrite. * config/i386/immintrin.h (target): (_ptwrite64): Add. (_ptwrite32): Add. * doc/extend.texi: Document ptwrite builtins. * doc/invoke.texi: Document -mptwrite. gcc/testsuite/ * gcc.target/i386/ptwrite-1.c: New test. Sebastian > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Andi Kleen > Sent: Monday, February 12, 2018 3:53 AM > To: gcc-patches@gcc.gnu.org > Cc: Metzger, Markus T; ubiz...@gmail.com; > Andi Kleen > Subject: [PATCH 1/3] Add PTWRITE builtins for x86 > > From: Andi Kleen > > Add builtins/intrinsics for PTWRITE. PTWRITE is a new instruction on Intel > Cherry > Trail that allows to write values into the Processor Trace log. > > This is fairly straight forward, except I had to add isa2 support for variable > number of operands. > > gcc/: > > 2018-02-10 Andi Kleen > > * common/config/i386/i386-common.c > (OPTION_MASK_ISA_PTWRITE_SET): > (OPTION_MASK_ISA_PTWRITE_UNSET): New. > (ix86_handle_option): Handle OPT_mptwrite. 
> * config/i386/cpuid.h (bit_PTWRITE): Add. > * config/i386/driver-i386.c (host_detect_local_cpu): Detect > PTWRITE CPUID. > * config/i386/i386-builtin.def (PTWRITE): Add PTWRITE. > * config/i386/i386-c.c (ix86_target_macros_internal): > Support __PTWRITE__. > * config/i386/i386.c (ix86_target_string): Add -mptwrite. > (ix86_valid_target_attribute_inner_p): Support ptwrite. > (BDESC_VERIFYS): Verify SPECIAL_ARGS2. > (ix86_init_mmx_sse_builtins): Handle special args2. > * config/i386/i386.h (TARGET_PTWRITE): Add. > (TARGET_PTWRITE_P): Add. > * config/i386/i386.md: Add ptwrite. > * config/i386/i386.opt: Add -mptwrite. > * config/i386/immintrin.h (target): > (_ptwrite_u64): Add. > (_ptwrite_u32): Add. > * doc/extend.texi: Document ptwrite builtins. > * doc/invoke.texi: Document -mptwrite. > > gcc/testsuite/: > > 2018-02-10 Andi Kleen > > * gcc.target/i386/ptwrite1.c: New test. > * gcc.target/i386/ptwrite2.c: New test. 0001-PTWRITE-intrinsics.patch Description: 0001-PTWRITE-intrinsics.patch
RE: [PATCH][i386] Adding CLDEMOTE instruction
Sorry, forgot attachment. Sebastian -Original Message- From: Peryt, Sebastian Sent: Tuesday, May 8, 2018 1:56 PM To: gcc-patches@gcc.gnu.org Cc: Uros Bizjak <ubiz...@gmail.com>; Kirill Yukhin <kirill.yuk...@gmail.com>; Peryt, Sebastian <sebastian.pe...@intel.com> Subject: [PATCH][i386] Adding CLDEMOTE instruction Hi, This patch adds support for CLDEMOTE instruction. Is it ok for trunk and after few day for backport to GCC-8? 2018-05-08 Sebastian Peryt <sebastian.pe...@intel.com> gcc/ * common/config/i386/i386-common.c (OPTION_MASK_ISA_CLDEMOTE_SET, OPTION_MASK_ISA_CLDEMOTE_UNSET): New defines. (ix86_handle_option): Handle -mcldemote. * config.gcc: New header. * config/i386/cldemoteintrin.h: New file. * config/i386/cpuid.h (bit_CLDEMOTE): New bit. * config/i386/driver-i386.c (host_detect_local_cpu): Detect -mcldemote. * config/i386/i386-c.c (ix86_target_macros_internal): Handle OPTION_MASK_ISA_CLDEMOTE. * config/i386/i386.c (ix86_target_string): Added -mcldemote. (ix86_valid_target_attribute_inner_p): Ditto. (enum ix86_builtins): Added IX86_BUILTIN_CLDEMOTE. (ix86_init_mmx_sse_builtins): Define __builtin_ia32_cldemote. (ix86_expand_builtin): Expand IX86_BUILTIN_CLDEMOTE. * config/i386/i386.h (TARGET_CLDEMOTE, TARGET_CLDEMOTE_P): New. * config/i386/i386.md (UNSPECV_CLDEMOTE): New. (cldemote): New. * config/i386/i386.opt: Added -mcldemote. * config/i386/x86intrin.h: New header. * doc/invoke.texi: Added -mcldemote. 2018-05-08 Sebastian Peryt <sebastian.pe...@intel.com> gcc/testsuite/ * gcc.target/i386/cldemote-1.c: New test. Thanks, Sebastian 0002-CLDEMOTE.PATCH Description: 0002-CLDEMOTE.PATCH
[PATCH][i386] Adding CLDEMOTE instruction
Hi,

This patch adds support for the CLDEMOTE instruction.

Is it ok for trunk and, after a few days, for backport to GCC-8?

2018-05-08  Sebastian Peryt

gcc/
    * common/config/i386/i386-common.c (OPTION_MASK_ISA_CLDEMOTE_SET,
    OPTION_MASK_ISA_CLDEMOTE_UNSET): New defines.
    (ix86_handle_option): Handle -mcldemote.
    * config.gcc: New header.
    * config/i386/cldemoteintrin.h: New file.
    * config/i386/cpuid.h (bit_CLDEMOTE): New bit.
    * config/i386/driver-i386.c (host_detect_local_cpu): Detect -mcldemote.
    * config/i386/i386-c.c (ix86_target_macros_internal): Handle
    OPTION_MASK_ISA_CLDEMOTE.
    * config/i386/i386.c (ix86_target_string): Added -mcldemote.
    (ix86_valid_target_attribute_inner_p): Ditto.
    (enum ix86_builtins): Added IX86_BUILTIN_CLDEMOTE.
    (ix86_init_mmx_sse_builtins): Define __builtin_ia32_cldemote.
    (ix86_expand_builtin): Expand IX86_BUILTIN_CLDEMOTE.
    * config/i386/i386.h (TARGET_CLDEMOTE, TARGET_CLDEMOTE_P): New.
    * config/i386/i386.md (UNSPECV_CLDEMOTE): New.
    (cldemote): New.
    * config/i386/i386.opt: Added -mcldemote.
    * config/i386/x86intrin.h: New header.
    * doc/invoke.texi: Added -mcldemote.

2018-05-08  Sebastian Peryt

gcc/testsuite/
    * gcc.target/i386/cldemote-1.c: New test.

Thanks,
Sebastian
[PATCH][i386] Adding WAITPKG instructions
Hi,

This patch adds support for WAITPKG instructions.

Is it ok for trunk and, after a few days, for backport to GCC-8?

2018-05-08  Sebastian Peryt

gcc/
    * common/config/i386/i386-common.c (OPTION_MASK_ISA_WAITPKG_SET,
    OPTION_MASK_ISA_WAITPKG_UNSET): New defines.
    (ix86_handle_option): Handle -mwaitpkg.
    * config.gcc: New header.
    * config/i386/cpuid.h (bit_WAITPKG): New bit.
    * config/i386/driver-i386.c (host_detect_local_cpu): Detect -mwaitpkg.
    * config/i386/i386-builtin-types.def ((UINT8, UNSIGNED, UINT64)): New
    function type.
    * config/i386/i386-c.c (ix86_target_macros_internal): Handle
    OPTION_MASK_ISA_WAITPKG.
    * config/i386/i386.c (ix86_target_string): Added -mwaitpkg.
    (ix86_option_override_internal): Added PTA_WAITPKG.
    (ix86_valid_target_attribute_inner_p): Added -mwaitpkg.
    (enum ix86_builtins): Added IX86_BUILTIN_UMONITOR, IX86_BUILTIN_UMWAIT,
    IX86_BUILTIN_TPAUSE.
    (ix86_init_mmx_sse_builtins): Define __builtin_ia32_umonitor,
    __builtin_ia32_umwait and __builtin_ia32_tpause.
    (ix86_expand_builtin): Expand IX86_BUILTIN_UMONITOR,
    IX86_BUILTIN_UMWAIT, IX86_BUILTIN_TPAUSE.
    * config/i386/i386.h (TARGET_WAITPKG, TARGET_WAITPKG_P): New.
    * config/i386/i386.opt: Added -mwaitpkg.
    * config/i386/sse.md (UNSPECV_UMWAIT, UNSPECV_UMONITOR,
    UNSPECV_TPAUSE): New.
    (umwait, umonitor_, tpause): New.
    * config/i386/waitpkgintrin.h: New file.
    * config/i386/x86intrin.h: New header.
    * doc/invoke.texi: Added -mwaitpkg.

2018-05-08  Sebastian Peryt

gcc/testsuite/
    * gcc.target/i386/tpause-1.c: New test.
    * gcc.target/i386/umonitor-1.c: New test.

Thanks,
Sebastian

Attachment: 0001-WAITPKG.patch
RE: [PATCH][i386] PR target/85473, Fix _movdir64b expansion with -mx32
Hi, Patch has been updated and tested. Now I don't see any new regressions. Changelog stays the same. Is it ok for trunk? Thanks, Sebastian > -Original Message- > From: Peryt, Sebastian > Sent: Saturday, April 21, 2018 5:36 PM > To: gcc-patches@gcc.gnu.org > Cc: Uros Bizjak <ubiz...@gmail.com>; Kirill Yukhin <kirill.yuk...@gmail.com>; > H.J. Lu <hjl.to...@gmail.com>; Peryt, Sebastian <sebastian.pe...@intel.com> > Subject: RE: [PATCH][i386] PR target/85473, Fix _movdir64b expansion with - > mx32 > > Hi, > > I just realized this patch introduces some new regressions. > > Sorry, I must have mixed up something in testing. Will update this patch > shortly. > > Sebastian > > > -Original Message- > > From: Peryt, Sebastian > > Sent: Friday, April 20, 2018 6:38 PM > > To: gcc-patches@gcc.gnu.org > > Cc: Uros Bizjak <ubiz...@gmail.com>; Kirill Yukhin > > <kirill.yuk...@gmail.com>; H.J. Lu <hjl.to...@gmail.com>; Peryt, > > Sebastian <sebastian.pe...@intel.com> > > Subject: [PATCH][i386] PR target/85473, Fix _movdir64b expansion with > > -mx32 > > > > Hi, > > > > This fixes PR85473 by fixing _movdir64b expansion for -mx32. > > > > Ok for trunk? > > > > 2018-04-20 Sebastian Peryt <sebastian.pe...@intel.com> > > > > gcc/ChangeLog: > > > > PR target/85473 > > * config/i386/i386.c (ix86_expand_builtin): Change memory > > operand to XI, op0 extend to Pmode. > > * config/i386/i386.md: Change unspec volatile and operand 1 > > mode to XI, change operand 0 mode to P > > > > 2018-04-20 Sebastian Peryt <sebastian.pe...@intel.com> > > > > gcc/testsuite/ChangeLog: > > > > PR target/85473 > > * gcc.target/i386/pr85473-1.c: New test. > > * gcc.target/i386/pr85473-2.c: New test. > > > > Sebastian > > 0001-PR85473-fix-v2.patch Description: 0001-PR85473-fix-v2.patch
RE: [PATCH][i386] PR target/85473, Fix _movdir64b expansion with -mx32
Hi, I just realized this patch introduces some new regressions. Sorry, I must have mixed up something in testing. Will update this patch shortly. Sebastian > -Original Message- > From: Peryt, Sebastian > Sent: Friday, April 20, 2018 6:38 PM > To: gcc-patches@gcc.gnu.org > Cc: Uros Bizjak <ubiz...@gmail.com>; Kirill Yukhin <kirill.yuk...@gmail.com>; > H.J. Lu <hjl.to...@gmail.com>; Peryt, Sebastian <sebastian.pe...@intel.com> > Subject: [PATCH][i386] PR target/85473, Fix _movdir64b expansion with -mx32 > > Hi, > > This fixes PR85473 by fixing _movdir64b expansion for -mx32. > > Ok for trunk? > > 2018-04-20 Sebastian Peryt <sebastian.pe...@intel.com> > > gcc/ChangeLog: > > PR target/85473 > * config/i386/i386.c (ix86_expand_builtin): Change memory > operand to XI, op0 extend to Pmode. > * config/i386/i386.md: Change unspec volatile and operand 1 > mode to XI, change operand 0 mode to P > > 2018-04-20 Sebastian Peryt <sebastian.pe...@intel.com> > > gcc/testsuite/ChangeLog: > > PR target/85473 > * gcc.target/i386/pr85473-1.c: New test. > * gcc.target/i386/pr85473-2.c: New test. > > Sebastian >
[PATCH][i386] PR target/85473, Fix _movdir64b expansion with -mx32
Hi,

This fixes PR85473 by fixing _movdir64b expansion for -mx32.

Ok for trunk?

2018-04-20  Sebastian Peryt

gcc/ChangeLog:

    PR target/85473
    * config/i386/i386.c (ix86_expand_builtin): Change memory
    operand to XI, op0 extend to Pmode.
    * config/i386/i386.md: Change unspec volatile and operand 1
    mode to XI, change operand 0 mode to P.

2018-04-20  Sebastian Peryt

gcc/testsuite/ChangeLog:

    PR target/85473
    * gcc.target/i386/pr85473-1.c: New test.
    * gcc.target/i386/pr85473-2.c: New test.

Sebastian

Attachment: fix-PR85473.patch
RE: [PATCH][i386] Adding MOVDIRI and MOVDIR64B instructions
> On Thu, Apr 19, 2018 at 3:11 PM, Peryt, Sebastian <sebastian.pe...@intel.com> > wrote: > >> On Thu, Apr 19, 2018 at 2:35 PM, Peryt, Sebastian > >> <sebastian.pe...@intel.com> > >> wrote: > >> >> On Wed, Apr 18, 2018 at 2:56 PM, Peryt, Sebastian > >> >> <sebastian.pe...@intel.com> > >> >> wrote: > >> >> > Hi, > >> >> > > >> >> > This patch enables new instructions - MOVDIRI and MOVDIR64B. > >> >> > > >> >> > Is it ok for trunk? > >> >> > >> >> Is there a reason that one flag goes to ix86_isa_flags and the > >> >> other to ix86_isa_flags2? > >> > > >> > This is because of usage of OPTION_MASK_ISA_MOVDIRI | > >> > OPTION_MASK_ISA_64BIT which would end up in different isa flags > >> > tables. And MOVDIR64B doesn't use this option, so it can be in > ix86_isa_flags2. > >> > >> Ah, indeed. > >> > >> The patch is OK for mainline then. > > > > Thanks! > > > >> > >> (Please note that until gcc-8 is branched, patches that add new > >> features won't be approved as we are nearing the release.) > > > > Can you please explain what this actually mean? I got confused. Also > > I'd like to mention that I have few more patches I'm going to send soon. > > > > That this mean I can merge this one in trunk, but there is no guarantee it > > will > be added into GCC-8? > > No, the patch will be included in gcc-8, as gcc-8 has not yet branched from > the > trunk. But since branch date is approaching, we don't want to destabilize > trunk > by accepting patches that introduce new features, so they will have to be > postponed and committed to the trunk after gcc-8 is branched. > > Uros. Ok, thank you. That explains it now. Sebastian
RE: [PATCH][i386] Adding MOVDIRI and MOVDIR64B instructions
> On Thu, Apr 19, 2018 at 2:35 PM, Peryt, Sebastian <sebastian.pe...@intel.com> wrote:
> >> On Wed, Apr 18, 2018 at 2:56 PM, Peryt, Sebastian <sebastian.pe...@intel.com> wrote:
> >> > Hi,
> >> >
> >> > This patch enables new instructions - MOVDIRI and MOVDIR64B.
> >> >
> >> > Is it ok for trunk?
> >>
> >> Is there a reason that one flag goes to ix86_isa_flags and the other to ix86_isa_flags2?
> >
> > This is because of usage of OPTION_MASK_ISA_MOVDIRI | OPTION_MASK_ISA_64BIT which would end up in different isa flags tables. And MOVDIR64B doesn't use this option, so it can be in ix86_isa_flags2.
>
> Ah, indeed.
>
> The patch is OK for mainline then.

Thanks!

> (Please note that until gcc-8 is branched, patches that add new features won't be approved as we are nearing the release.)

Can you please explain what this actually means? I got confused. I'd also like to mention that I have a few more patches I'm going to send soon.

Does this mean I can merge this one into trunk, but there is no guarantee it will be added to GCC-8?

Thanks,
Sebastian

> Thanks,
> Uros.
>
> > Sebastian
> >
> >> Uros.
> >>
> >> > 2018-04-18  Sebastian Peryt  <sebastian.pe...@intel.com>
> >> >
> >> > gcc/
> >> >
> >> >     * common/config/i386/i386-common.c (OPTION_MASK_ISA_MOVDIRI_SET,
> >> >     OPTION_MASK_ISA_MOVDIR64B_SET, OPTION_MASK_ISA_MOVDIRI_UNSET,
> >> >     OPTION_MASK_ISA_MOVDIR64B_UNSET): New defines.
> >> >     (ix86_handle_option): Handle -mmovdiri and -mmovdir64b.
> >> >     * config.gcc (movdirintrin.h): New header.
> >> >     * config/i386/cpuid.h (bit_MOVDIRI, bit_MOVDIR64B): New bits.
> >> >     * config/i386/driver-i386.c (host_detect_local_cpu): Detect -mmovdiri
> >> >     and -mmovdir64b.
> >> >     * config/i386/i386-builtin-types.def ((VOID, PUNSIGNED, UNSIGNED),
> >> >     (VOID, PVOID, PCVOID)): New function types.
> >> > * config/i386/i386-builtin.def (__builtin_ia32_directstoreu_u32, > >> > __builtin_ia32_directstoreu_u64, __builtin_ia32_movdir64b): > >> > New > >> builtins. > >> > * config/i386/i386-c.c (__MOVDIRI__, __MOVDIR64B__): New. > >> > * config/i386/i386.c (ix86_target_string): Added > >> > -mmovdir64b and - > >> mmovdiri. > >> > (ix86_valid_target_attribute_inner_p): Ditto. > >> > (ix86_expand_special_args_builtin): Added > >> VOID_FTYPE_PUNSIGNED_UNSIGNED > >> > and VOID_FTYPE_PUNSIGNED_UNSIGNED. > >> > (ix86_expand_builtin): Expand IX86_BUILTIN_MOVDIR64B. > >> > * config/i386/i386.h (TARGET_MOVDIRI, TARGET_MOVDIRI_P, > >> > TARGET_MOVDIR64B, TARGET_MOVDIR64B_P): New. > >> > * config/i386/i386.md (UNSPECV_MOVDIRI, UNSPECV_MOVDIR64B): > >> New. > >> > (movdiri, movdir64b_): New. > >> > * config/i386/i386.opt: Add -mmovdiri and -mmovdir64b. > >> > * config/i386/immintrin.h: Include movdirintrin.h. > >> > * config/i386/movdirintrin.h: New file. > >> > * doc/invoke.texi: Added -mmovdiri and -mmovdir64b. > >> > > >> > 2018-04-18 Sebastian Peryt <sebastian.pe...@intel.com> > >> > > >> > gcc/testsuite/ > >> > > >> > * gcc.target/i386/movdir-1.c: New test. > >> > > >> > > >> > Thanks, > >> > Sebastian
RE: [PATCH][i386] Adding MOVDIRI and MOVDIR64B instructions
> On Wed, Apr 18, 2018 at 2:56 PM, Peryt, Sebastian <sebastian.pe...@intel.com> > wrote: > > Hi, > > > > This patch enables new instructions - MOVDIRI and MOVDIR64B. > > > > Is it ok for trunk? > > Is there a reason that one flag goes to ix86_isa_flags and the other to > ix86_isa_flags2? This is because of usage of OPTION_MASK_ISA_MOVDIRI | OPTION_MASK_ISA_64BIT which would end up in different isa flags tables. And MOVDIR64B doesn't use this option, so it can be in ix86_isa_flags2. Sebastian > > Uros. > > > 2018-04-18 Sebastian Peryt <sebastian.pe...@intel.com> > > > > gcc/ > > > > * common/config/i386/i386-common.c > > (OPTION_MASK_ISA_MOVDIRI_SET, > OPTION_MASK_ISA_MOVDIR64B_SET, > > OPTION_MASK_ISA_MOVDIRI_UNSET, > > OPTION_MASK_ISA_MOVDIR64B_UNSET): New defines. > > (ix86_handle_option): Handle -mmovdiri and -mmovdir64b. > > * config.gcc (movdirintrin.h): New header. > > * config/i386/cpuid.h (bit_MOVDIRI, > > bit_MOVDIR64B): New bits. > > * config/i386/driver-i386.c (host_detect_local_cpu): Detect > > -mmovdiri > > and -mmvodir64b. > > * config/i386/i386-builtin-types.def ((VOID, PUNSIGNED, UNSIGNED), > > (VOID, PVOID, PCVOID)): New function types. > > * config/i386/i386-builtin.def (__builtin_ia32_directstoreu_u32, > > __builtin_ia32_directstoreu_u64, __builtin_ia32_movdir64b): New > builtins. > > * config/i386/i386-c.c (__MOVDIRI__, __MOVDIR64B__): New. > > * config/i386/i386.c (ix86_target_string): Added -mmovdir64b and - > mmovdiri. > > (ix86_valid_target_attribute_inner_p): Ditto. > > (ix86_expand_special_args_builtin): Added > VOID_FTYPE_PUNSIGNED_UNSIGNED > > and VOID_FTYPE_PUNSIGNED_UNSIGNED. > > (ix86_expand_builtin): Expand IX86_BUILTIN_MOVDIR64B. > > * config/i386/i386.h (TARGET_MOVDIRI, TARGET_MOVDIRI_P, > > TARGET_MOVDIR64B, TARGET_MOVDIR64B_P): New. > > * config/i386/i386.md (UNSPECV_MOVDIRI, UNSPECV_MOVDIR64B): > New. > > (movdiri, movdir64b_): New. > > * config/i386/i386.opt: Add -mmovdiri and -mmovdir64b. 
> > * config/i386/immintrin.h: Include movdirintrin.h. > > * config/i386/movdirintrin.h: New file. > > * doc/invoke.texi: Added -mmovdiri and -mmovdir64b. > > > > 2018-04-18 Sebastian Peryt <sebastian.pe...@intel.com> > > > > gcc/testsuite/ > > > > * gcc.target/i386/movdir-1.c: New test. > > > > > > Thanks, > > Sebastian
[PATCH][i386] Adding MOVDIRI and MOVDIR64B instructions
Hi, This patch enables two new instructions, MOVDIRI and MOVDIR64B. Is it ok for trunk? 2018-04-18 Sebastian Peryt gcc/ * common/config/i386/i386-common.c (OPTION_MASK_ISA_MOVDIRI_SET, OPTION_MASK_ISA_MOVDIR64B_SET, OPTION_MASK_ISA_MOVDIRI_UNSET, OPTION_MASK_ISA_MOVDIR64B_UNSET): New defines. (ix86_handle_option): Handle -mmovdiri and -mmovdir64b. * config.gcc (movdirintrin.h): New header. * config/i386/cpuid.h (bit_MOVDIRI, bit_MOVDIR64B): New bits. * config/i386/driver-i386.c (host_detect_local_cpu): Detect -mmovdiri and -mmovdir64b. * config/i386/i386-builtin-types.def ((VOID, PUNSIGNED, UNSIGNED), (VOID, PVOID, PCVOID)): New function types. * config/i386/i386-builtin.def (__builtin_ia32_directstoreu_u32, __builtin_ia32_directstoreu_u64, __builtin_ia32_movdir64b): New builtins. * config/i386/i386-c.c (__MOVDIRI__, __MOVDIR64B__): New. * config/i386/i386.c (ix86_target_string): Added -mmovdir64b and -mmovdiri. (ix86_valid_target_attribute_inner_p): Ditto. (ix86_expand_special_args_builtin): Added VOID_FTYPE_PUNSIGNED_UNSIGNED and VOID_FTYPE_PUNSIGNED_UNSIGNED. (ix86_expand_builtin): Expand IX86_BUILTIN_MOVDIR64B. * config/i386/i386.h (TARGET_MOVDIRI, TARGET_MOVDIRI_P, TARGET_MOVDIR64B, TARGET_MOVDIR64B_P): New. * config/i386/i386.md (UNSPECV_MOVDIRI, UNSPECV_MOVDIR64B): New. (movdiri, movdir64b_): New. * config/i386/i386.opt: Add -mmovdiri and -mmovdir64b. * config/i386/immintrin.h: Include movdirintrin.h. * config/i386/movdirintrin.h: New file. * doc/invoke.texi: Added -mmovdiri and -mmovdir64b. 2018-04-18 Sebastian Peryt gcc/testsuite/ * gcc.target/i386/movdir-1.c: New test. Thanks, Sebastian 0001-MOVDIRI.PATCH Description: 0001-MOVDIRI.PATCH
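For readers who want to see how the new header might be used, here is a minimal sketch. It assumes the user-level intrinsics in movdirintrin.h are named _directstoreu_u32 and _movdir64b (the ChangeLog above only names the builtins), and it relies on the __MOVDIRI__ and __MOVDIR64B__ macros listed for i386-c.c. The wrapper names store_u32 and copy_block64 are hypothetical. Without the -mmovdiri / -mmovdir64b options the code falls back to an ordinary store and memcpy, so it builds on any target.

```c
#include <stdint.h>
#include <string.h>
#if defined(__MOVDIRI__) || defined(__MOVDIR64B__)
#include <immintrin.h>
#endif

/* Hypothetical wrapper: store one 32-bit value.
   Uses the direct-store intrinsic only when built with -mmovdiri;
   the intrinsic name is an assumption, see the note above. */
static void store_u32(uint32_t *dst, uint32_t value)
{
#ifdef __MOVDIRI__
    _directstoreu_u32(dst, value);
#else
    *dst = value;               /* plain store as a portable fallback */
#endif
}

/* Hypothetical wrapper: copy one 64-byte block.
   With -mmovdir64b the destination is expected to be 64-byte aligned. */
static void copy_block64(void *dst, const void *src)
{
#ifdef __MOVDIR64B__
    _movdir64b(dst, src);
#else
    memcpy(dst, src, 64);       /* portable fallback */
#endif
}
```

Guarding on the feature macros keeps a single source file usable both with and without the new options, which is roughly how other ISA-specific intrinsics are usually consumed.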
RE: [PATCH][i386,AVX] Fix PR84783 - backport missing permutexvar to GCC7
Hi Jakub, Gentle ping. Thanks, Sebastian > -Original Message- > From: Kirill Yukhin [mailto:kirill.yuk...@gmail.com] > Sent: Friday, March 23, 2018 6:49 AM > To: ja...@redhat.com; Peryt, Sebastian <sebastian.pe...@intel.com> > Cc: 'gcc-patches@gcc.gnu.org' <gcc-patches@gcc.gnu.org> > Subject: Re: [PATCH][i386,AVX] Fix PR84783 - backport missing permutexvar to > GCC7 > > Hello Sebastian! > > On 22 Mar 13:01, Peryt, Sebastian wrote: > > Hi, > > > > This patch adds missing permutexvar intrinsics for backporting to GCC 7 to > resolve PR84783. > > > > 2018-03-22 Sebastian Peryt <sebastian.pe...@intel.com> > > > > gcc: > > PR84783 > > * config/i386/avx512vlintrin.h (_mm256_permutexvar_epi64) > > (_mm256_permutexvar_epi32, _mm256_permutex_epi64): New > intrinsics. > > > > gcc/testsuite: > > PR84783 > > > > * gcc.target/i386/avx512vl-vpermd-1.c (_mm256_permutexvar_epi32): > > Test new intrinsic. > > * gcc.target/i386/avx512vl-vpermq-imm-1.c > (_mm256_permutex_epi64): > > Ditto. > > * gcc.target/i386/avx512vl-vpermq-var-1.c > (_mm256_permutexvar_epi64): > > Ditto. > > * gcc.target/i386/avx512f-vpermd-2.c: Do not check for AVX512F_LEN. > > * gcc.target/i386/avx512f-vpermq-imm-2.c: Ditto. > > * gcc.target/i386/avx512f-vpermq-var-2.c: Ditto. > > > > Is it ok for merge? > Your patch is pretty much simple and is OK to me. > > However, since you're aiming to GCC 7, I'd like to hear GM's OK here as well. > > -- > Thanks, K > > > > > Thanks, > > Sebastian >
[PATCH][i386,AVX] Fix PR84783 - backport missing permutexvar to GCC7
Hi, This patch backports the missing permutexvar intrinsics to GCC 7 to resolve PR84783. 2018-03-22 Sebastian Peryt gcc: PR84783 * config/i386/avx512vlintrin.h (_mm256_permutexvar_epi64) (_mm256_permutexvar_epi32, _mm256_permutex_epi64): New intrinsics. gcc/testsuite: PR84783 * gcc.target/i386/avx512vl-vpermd-1.c (_mm256_permutexvar_epi32): Test new intrinsic. * gcc.target/i386/avx512vl-vpermq-imm-1.c (_mm256_permutex_epi64): Ditto. * gcc.target/i386/avx512vl-vpermq-var-1.c (_mm256_permutexvar_epi64): Ditto. * gcc.target/i386/avx512f-vpermd-2.c: Do not check for AVX512F_LEN. * gcc.target/i386/avx512f-vpermq-imm-2.c: Ditto. * gcc.target/i386/avx512f-vpermq-var-2.c: Ditto. Is it ok for merge? Thanks, Sebastian PR84783.patch Description: PR84783.patch
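As a reading aid, the following is a plain C scalar model of what the three intrinsics named above compute, so the semantics can be checked without AVX-512 hardware. It is only a sketch: the function names ending in _model are hypothetical, vectors are represented as ordinary arrays, and dst must not alias the input arrays.

```c
#include <stdint.h>

/* Scalar model of _mm256_permutexvar_epi32 (idx, a):
   each of the 8 result lanes selects the lane of a named by the
   low 3 bits of the matching index lane. */
static void permutexvar_epi32_model(uint32_t dst[8], const uint32_t idx[8],
                                    const uint32_t a[8])
{
    for (int i = 0; i < 8; i++)
        dst[i] = a[idx[i] & 7];
}

/* Scalar model of _mm256_permutexvar_epi64 (idx, a):
   4 lanes of 64 bits, selected by the low 2 bits of each index. */
static void permutexvar_epi64_model(uint64_t dst[4], const uint64_t idx[4],
                                    const uint64_t a[4])
{
    for (int i = 0; i < 4; i++)
        dst[i] = a[idx[i] & 3];
}

/* Scalar model of _mm256_permutex_epi64 (a, imm8):
   lane i is selected by bits [2*i+1 : 2*i] of the immediate. */
static void permutex_epi64_model(uint64_t dst[4], const uint64_t a[4],
                                 unsigned imm8)
{
    for (int i = 0; i < 4; i++)
        dst[i] = a[(imm8 >> (2 * i)) & 3];
}
```

For example, an immediate of 0x1B in the last model reverses the four 64-bit lanes, because its two-bit fields read 3, 2, 1, 0 from the lowest field upward.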
[PATCH][x86] Fix PR84460
Hi, This is a fix for PR84460. gcc/testsuite PR target/84460 * gcc.target/i386/pr57193.c (dg-options): Add -mtune=generic. Is it ok for trunk? Thanks, Sebastian PR84460.patch Description: PR84460.patch
[PATCH][i386] Fix PR83546 - missing RDRND for -march=silvermont
Hi, This patch re-enables RDRND for Silvermont. It got lost in r206178, as pointed out in the PR. Bootstrapped and tested. 2018-01-15 Sebastian Peryt gcc/ PR target/83546 * config/i386/i386.c (ix86_option_override_internal): Add PTA_RDRND to PTA_SILVERMONT. 2018-01-15 Sebastian Peryt gcc/testsuite/ PR target/83546 * gcc.target/i386/pr83546.c: New test. Is it ok for trunk? Sebastian 0001-PR83546.patch Description: 0001-PR83546.patch
[Patch][x86, backport] Backport to GCC-6 vzeroupper patches
Hi, I'd like to ask for a backport to the GCC-6 branch of the vzeroupper generation patches from trunk, which resolve 3 PRs: PR target/82941 PR target/82942 PR target/82990 Two patches were combined into one and rebased. Bootstrapped and tested. Is it ok for merge? Changelog: Fix PR82941 and PR82942 by adding proper vzeroupper generation on SKX. Add X86_TUNE_EMIT_VZEROUPPER to indicate if vzeroupper instruction should be inserted before a transfer of control flow out of the function. It is turned on by default unless we are tuning for KNL. Users can always use -mvzeroupper or -mno-vzeroupper to override X86_TUNE_EMIT_VZEROUPPER. 2017-11-29 Sebastian Peryt, H.J. Lu gcc/ Backported from trunk PR target/82941 PR target/82942 PR target/82990 * config/i386/i386.c (pass_insert_vzeroupper): Remove TARGET_AVX512F check from gate condition. (ix86_check_avx256_register): Changed to ... (ix86_check_avx_upper_register): ... this. Add extra check for VALID_AVX512F_REG_OR_XI_MODE. (ix86_avx_u128_mode_needed): Changed ix86_check_avx256_register to ix86_check_avx_upper_register. (ix86_check_avx256_stores): Changed to ... (ix86_check_avx_upper_stores): ... this. Changed ix86_check_avx256_register to ix86_check_avx_upper_register. (ix86_avx_u128_mode_after): Changed avx_reg256_found to avx_upper_reg_found. Changed ix86_check_avx256_stores to ix86_check_avx_upper_stores. (ix86_avx_u128_mode_entry): Changed ix86_check_avx256_register to ix86_check_avx_upper_register. (ix86_avx_u128_mode_exit): Ditto. (ix86_option_override_internal): Set MASK_VZEROUPPER if neither -mvzeroupper nor -mno-vzeroupper is used and TARGET_EMIT_VZEROUPPER is set. * config/i386/i386.h: (host_detect_local_cpu): New define. (TARGET_EMIT_VZEROUPPER): New. * config/i386/x86-tune.def: Add X86_TUNE_EMIT_VZEROUPPER. 2017-11-29 Sebastian Peryt, H.J. Lu gcc/testsuite/ Backported from trunk PR target/82941 PR target/82942 PR target/82990 * gcc.target/i386/pr82941-1.c: New test. * gcc.target/i386/pr82941-2.c: Likewise. 
* gcc.target/i386/pr82942-1.c: Likewise. * gcc.target/i386/pr82942-2.c: Likewise. * gcc.target/i386/pr82990-1.c: Likewise. * gcc.target/i386/pr82990-2.c: Likewise. * gcc.target/i386/pr82990-3.c: Likewise. * gcc.target/i386/pr82990-4.c: Likewise. * gcc.target/i386/pr82990-5.c: Likewise. * gcc.target/i386/pr82990-6.c: Likewise. * gcc.target/i386/pr82990-7.c: Likewise. Thanks, Sebastian 0001-backportPR82941-GCC-6.patch Description: 0001-backportPR82941-GCC-6.patch
[Patch][x86, backport] Backport to GCC-7 vzeroupper patches
Hi, I'd like to ask for a backport to the GCC-7 branch of the vzeroupper generation patches from trunk, which resolve 3 PRs: PR target/82941 PR target/82942 PR target/82990 Two patches were combined into one and rebased. Bootstrapped and tested. Is it ok for merge? Changelog: Fix PR82941 and PR82942 by adding proper vzeroupper generation on SKX. Add X86_TUNE_EMIT_VZEROUPPER to indicate if vzeroupper instruction should be inserted before a transfer of control flow out of the function. It is turned on by default unless we are tuning for KNL. Users can always use -mvzeroupper or -mno-vzeroupper to override X86_TUNE_EMIT_VZEROUPPER. 2017-11-29 Sebastian Peryt, H.J. Lu gcc/ Backported from trunk PR target/82941 PR target/82942 PR target/82990 * config/i386/i386.c (pass_insert_vzeroupper): Remove TARGET_AVX512F check from gate condition. (ix86_check_avx256_register): Changed to ... (ix86_check_avx_upper_register): ... this. Add extra check for VALID_AVX512F_REG_OR_XI_MODE. (ix86_avx_u128_mode_needed): Changed ix86_check_avx256_register to ix86_check_avx_upper_register. (ix86_check_avx256_stores): Changed to ... (ix86_check_avx_upper_stores): ... this. Changed ix86_check_avx256_register to ix86_check_avx_upper_register. (ix86_avx_u128_mode_after): Changed avx_reg256_found to avx_upper_reg_found. Changed ix86_check_avx256_stores to ix86_check_avx_upper_stores. (ix86_avx_u128_mode_entry): Changed ix86_check_avx256_register to ix86_check_avx_upper_register. (ix86_avx_u128_mode_exit): Ditto. (ix86_option_override_internal): Set MASK_VZEROUPPER if neither -mvzeroupper nor -mno-vzeroupper is used and TARGET_EMIT_VZEROUPPER is set. * config/i386/i386.h: (host_detect_local_cpu): New define. (TARGET_EMIT_VZEROUPPER): New. * config/i386/x86-tune.def: Add X86_TUNE_EMIT_VZEROUPPER. 2017-11-29 Sebastian Peryt, H.J. Lu gcc/testsuite/ Backported from trunk PR target/82941 PR target/82942 PR target/82990 * gcc.target/i386/pr82941-1.c: New test. * gcc.target/i386/pr82941-2.c: Likewise. 
* gcc.target/i386/pr82942-1.c: Likewise. * gcc.target/i386/pr82942-2.c: Likewise. * gcc.target/i386/pr82990-1.c: Likewise. * gcc.target/i386/pr82990-2.c: Likewise. * gcc.target/i386/pr82990-3.c: Likewise. * gcc.target/i386/pr82990-4.c: Likewise. * gcc.target/i386/pr82990-5.c: Likewise. * gcc.target/i386/pr82990-6.c: Likewise. * gcc.target/i386/pr82990-7.c: Likewise. Thanks, Sebastian 0001-backportPR82942-GCC-7.patch Description: 0001-backportPR82942-GCC-7.patch
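The defaulting rule described in the backport ChangeLog (tuning decides unless the user passed an explicit option) can be sketched in a few lines of C. This is a simplified model, not the code in ix86_option_override_internal: the enum opt_state and both function names are hypothetical, and the tuning flag is reduced to a single boolean.

```c
#include <stdbool.h>

/* Hypothetical tri-state for a command-line option: not given,
   given in its positive form, or given in its -mno- form.
   (GCC documents the option as -mvzeroupper / -mno-vzeroupper.) */
enum opt_state { OPT_UNSET, OPT_ON, OPT_OFF };

/* Model of the tuning default: on unless tuning for KNL. */
static bool default_emit_vzeroupper(bool tuning_for_knl)
{
    return !tuning_for_knl;
}

/* Model of the override rule: an explicit option always wins;
   otherwise the tuning default is used. */
static bool vzeroupper_enabled(enum opt_state user_opt, bool tune_default)
{
    if (user_opt != OPT_UNSET)
        return user_opt == OPT_ON;
    return tune_default;
}
```

With this shape, tuning for KNL and passing no option gives false, while the same tuning plus the explicit positive option gives true, which matches the statement that users can always override the tuning flag.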
RE: [PATCH, committed] Add myself to MAINTAINERS
Message didn't get thru for some reason. Resending. Sebastian From: Peryt, Sebastian Sent: Wednesday, November 15, 2017 1:44 PM To: gcc-patches@gcc.gnu.org Cc: Peryt, Sebastian <sebastian.pe...@intel.com> Subject: [PATCH, committed] Add myself to MAINTAINERS ChangeLog: 2017-11-15 Sebastian Peryt <sebastian.pe...@intel.com> * MAINTAINERS (write after approval): Add myself. Index: MAINTAINERS === --- MAINTAINERS (revision 254760) +++ MAINTAINERS (working copy) @@ -532,6 +532,7 @@ Devang Patel <dpa...@apple.com> Andris Pavenis <andris.pave...@iki.fi> Fernando Pereira <prone...@gmail.com> +Sebastian Peryt <sebastian.pe...@intel.com> Kaushik Phatak <kaushik.pha...@kpitcummins.com> Nicolas Pitre <n...@cam.org> Paul Pluzhnikov <ppluzhni...@google.com> Sebastian
RE: [PATCH][i386] PR82941/PR82942 - Adding vzeroupper generation for SKX
Attached is fixed patch. Sebastian > -Original Message- > From: H.J. Lu [mailto:hjl.to...@gmail.com] > Sent: Tuesday, November 14, 2017 1:18 PM > To: Peryt, Sebastian <sebastian.pe...@intel.com> > Cc: Jakub Jelinek <ja...@redhat.com>; gcc-patches@gcc.gnu.org; Uros Bizjak > <ubiz...@gmail.com>; Kirill Yukhin <kirill.yuk...@gmail.com>; Lu, Hongjiu > <hongjiu...@intel.com> > Subject: Re: [PATCH][i386] PR82941/PR82942 - Adding vzeroupper generation > for SKX > > On Tue, Nov 14, 2017 at 3:18 AM, Peryt, Sebastian <sebastian.pe...@intel.com> > wrote: > > I have updated tests and changelog according to Jakub's suggestions. > > Please find attached v2 of my patch. > > > > > > 14.11.2017 Sebastian Peryt <sebastian.pe...@intel.com> > > > > gcc/ > > > > PR target/82941 > > PR target/82942 > > * config/i386/i386.c (pass_insert_vzeroupper): Modify gate condition > > to return true on Xeon and not on Xeon Phi. > > (ix86_check_avx256_register): Changed to ... > > (ix86_check_avx_upper_register): ... this. Add extra check for > > VALID_AVX512F_REG_OR_XI_MODE. > > (ix86_avx_u128_mode_needed): Changed > > ix86_check_avx256_register to ix86_check_avx_upper_register. > > (ix86_check_avx256_stores): Changed to ... > > (ix86_check_avx_upper_stores): ... this. Changed > > ix86_check_avx256_register to ix86_check_avx_upper_register. > > (ix86_avx_u128_mode_after): Changed > > avx_reg256_found to avx_upper_reg_found. Changed > > ix86_check_avx256_stores to ix86_check_avx_upper_stores. > > (ix86_avx_u128_mode_entry): Changed > > ix86_check_avx256_register to ix86_check_avx_upper_register. > > (ix86_avx_u128_mode_exit): Ditto. > > * config/i386/i386.h: (host_detect_local_cpu): New define. > > @@ -2497,7 +2497,7 @@ public: >/* opt_pass methods: */ >virtual bool gate (function *) > { > - return TARGET_AVX && !TARGET_AVX512F > + return TARGET_AVX && !TARGET_AVX512PF && !TARGET_AVX512ER > ^ Please > remove this. 
> > From glibc commit: > > commit 4cb334c4d6249686653137ec273d081371b3672d > Author: H.J. Lu <hjl.to...@gmail.com> > Date: Tue Apr 18 14:01:45 2017 -0700 > > x86: Use AVX2 memcpy/memset on Skylake server [BZ #21396] > > On Skylake server, AVX512 load/store instructions in memcpy/memset may > lead to lower CPU turbo frequency in certain situations. Use of AVX2 > in memcpy/memset has been observed to have improved overall performance > in many workloads due to the higher frequency. > > Since AVX512ER is unique to Xeon Phi, this patch sets Prefer_No_AVX512 > if AVX512ER isn't available so that AVX2 versions of memcpy/memset are > used on Skylake server. > > Only AVX512ER is really unique to Xeon Phi. > >&& TARGET_VZEROUPPER && flag_expensive_optimizations >&& !optimize_size; > } > > > 14.11.2017 Sebastian Peryt <sebastian.pe...@intel.com> > > > > gcc/testsuite/ > > > > PR target/82941 > > PR target/82942 > > * gcc.target/i386/pr82941-1.c: New test. > > * gcc.target/i386/pr82941-2.c: New test. > > * gcc.target/i386/pr82942-1.c: New test. > > * gcc.target/i386/pr82942-2.c: New test. > > > > > > Thanks, > > Sebastian > > > >> -Original Message- > >> From: Jakub Jelinek [mailto:ja...@redhat.com] > >> Sent: Tuesday, November 14, 2017 10:51 AM > >> To: Peryt, Sebastian <sebastian.pe...@intel.com> > >> Cc: gcc-patches@gcc.gnu.org; Uros Bizjak <ubiz...@gmail.com>; Kirill > >> Yukhin <kirill.yuk...@gmail.com>; Lu, Hongjiu <hongjiu...@intel.com> > >> Subject: Re: [PATCH][i386] PR82941/PR82942 - Adding vzeroupper > >> generation for SKX > >> > >> On Tue, Nov 14, 2017 at 09:45:12AM +, Peryt, Sebastian wrote: > >> > Hi, > >> > > >> > This patch fixes PR82941 and PR82942 by adding vzeroupper > >> > generation on > >> SKX. > >> > Bootstrapped and tested. > >> > > >> > 14.11.2017 Sebastian Peryt <sebastian.pe...@intel.com> > >> > > >> > gcc/ > >> > >> In that case the ChangeLog entry should list the PRs, i.e. > >> PR targ
RE: [PATCH][i386] PR82941/PR82942 - Adding vzeroupper generation for SKX
I have updated tests and changelog according to Jakub's suggestions. Please find attached v2 of my patch. 14.11.2017 Sebastian Peryt <sebastian.pe...@intel.com> gcc/ PR target/82941 PR target/82942 * config/i386/i386.c (pass_insert_vzeroupper): Modify gate condition to return true on Xeon and not on Xeon Phi. (ix86_check_avx256_register): Changed to ... (ix86_check_avx_upper_register): ... this. Add extra check for VALID_AVX512F_REG_OR_XI_MODE. (ix86_avx_u128_mode_needed): Changed ix86_check_avx256_register to ix86_check_avx_upper_register. (ix86_check_avx256_stores): Changed to ... (ix86_check_avx_upper_stores): ... this. Changed ix86_check_avx256_register to ix86_check_avx_upper_register. (ix86_avx_u128_mode_after): Changed avx_reg256_found to avx_upper_reg_found. Changed ix86_check_avx256_stores to ix86_check_avx_upper_stores. (ix86_avx_u128_mode_entry): Changed ix86_check_avx256_register to ix86_check_avx_upper_register. (ix86_avx_u128_mode_exit): Ditto. * config/i386/i386.h: (host_detect_local_cpu): New define. 14.11.2017 Sebastian Peryt <sebastian.pe...@intel.com> gcc/testsuite/ PR target/82941 PR target/82942 * gcc.target/i386/pr82941-1.c: New test. * gcc.target/i386/pr82941-2.c: New test. * gcc.target/i386/pr82942-1.c: New test. * gcc.target/i386/pr82942-2.c: New test. Thanks, Sebastian > -Original Message- > From: Jakub Jelinek [mailto:ja...@redhat.com] > Sent: Tuesday, November 14, 2017 10:51 AM > To: Peryt, Sebastian <sebastian.pe...@intel.com> > Cc: gcc-patches@gcc.gnu.org; Uros Bizjak <ubiz...@gmail.com>; Kirill Yukhin > <kirill.yuk...@gmail.com>; Lu, Hongjiu <hongjiu...@intel.com> > Subject: Re: [PATCH][i386] PR82941/PR82942 - Adding vzeroupper generation > for SKX > > On Tue, Nov 14, 2017 at 09:45:12AM +, Peryt, Sebastian wrote: > > Hi, > > > > This patch fixes PR82941 and PR82942 by adding vzeroupper generation on > SKX. > > Bootstrapped and tested. 
> > > > 14.11.2017 Sebastian Peryt <sebastian.pe...@intel.com> > > > > gcc/ > > In that case the ChangeLog entry should list the PRs, i.e. > PR target/82941 > PR target/82942 > > * config/i386/i386.c (pass_insert_vzeroupper): Modify gate condition > > to return true on Xeon and not on Xeon Phi. > > (ix86_check_avx256_register): Changed to ... > > (ix86_check_avx_upper_register): ... this. > > (ix86_check_avx_upper_register): Add extra check for > > VALID_AVX512F_REG_OR_XI_MODE. > > The way this is usually written is instead: > (ix86_check_avx256_register): Changed to ... > (ix86_check_avx_upper_register): ... this. Add extra check for > VALID_AVX512F_REG_OR_XI_MODE. > i.e. don't duplicate the function name, just continue mentioning further > changes. > > > (ix86_avx_u128_mode_needed): Changed > > ix86_check_avx256_register to ix86_check_avx_upper_register. > > (ix86_check_avx256_stores): Changed to ... > > (ix86_check_avx_upper_stores): ... this. > > (ix86_check_avx_upper_stores): Changed > > ix86_check_avx256_register to ix86_check_avx_upper_register. > > Likewise. > > > gcc/testsuite/ > > * gcc.target/i386/pr82941.c: New test. > > * gcc.target/i386/pr82942.c: New test. > > Shouldn't there be also a test that if using -march=knl and another one if > using - > mavx512f -mavx512er that we don't emit any vzeroupper? > > Jakub 0001-VZEROUPPER_v2.patch Description: 0001-VZEROUPPER_v2.patch
[PATCH][i386] PR82941/PR82942 - Adding vzeroupper generation for SKX
Hi, This patch fixes PR82941 and PR82942 by adding vzeroupper generation on SKX. Bootstrapped and tested. 14.11.2017 Sebastian Peryt gcc/ * config/i386/i386.c (pass_insert_vzeroupper): Modify gate condition to return true on Xeon and not on Xeon Phi. (ix86_check_avx256_register): Changed to ... (ix86_check_avx_upper_register): ... this. (ix86_check_avx_upper_register): Add extra check for VALID_AVX512F_REG_OR_XI_MODE. (ix86_avx_u128_mode_needed): Changed ix86_check_avx256_register to ix86_check_avx_upper_register. (ix86_check_avx256_stores): Changed to ... (ix86_check_avx_upper_stores): ... this. (ix86_check_avx_upper_stores): Changed ix86_check_avx256_register to ix86_check_avx_upper_register. (ix86_avx_u128_mode_after): Changed avx_reg256_found to avx_upper_reg_found. (ix86_avx_u128_mode_after): Changed ix86_check_avx256_stores to ix86_check_avx_upper_stores. (ix86_avx_u128_mode_entry): Changed ix86_check_avx256_register to ix86_check_avx_upper_register. (ix86_avx_u128_mode_exit): Ditto. * config/i386/i386.h: (host_detect_local_cpu): New define. gcc/testsuite/ * gcc.target/i386/pr82941.c: New test. * gcc.target/i386/pr82942.c: New test. Is it ok for trunk? Thanks, Sebastian 0001-VZEROUPPER.patch Description: 0001-VZEROUPPER.patch
RE: [Patch, testcase] PR82767 Fix scan-assembler patterns in i386/pr71321.c
> On Sun, Nov 5, 2017 at 12:14 PM, Peryt, Sebastian <sebastian.pe...@intel.com> > wrote: > > Hi, > > > > After r253934 gcc.target/i386/pr71321.c started to fail due to the wrong > number of scan-assembler - 2 instead of 3. This patch is fixing that. > > Are you sure that there is no problem with the code generation? Did you > investigate original PR for what it is testing and why it is testing for > these 3 > LEAs? Well, the problem is due to the change in cost model. This can be reverted by simple modification: diff --git a/gcc/config/i386/x86-tune-costs.h b/gcc/config/i386/x86-tune-costs.h index c7ac70e..bb5b3e2 100644 --- a/gcc/config/i386/x86-tune-costs.h +++ b/gcc/config/i386/x86-tune-costs.h @@ -2253,7 +2253,7 @@ struct processor_costs core_cost = { COSTS_N_INSNS (4), /* DI */ COSTS_N_INSNS (4)}, /*other */ 0, /* cost of multiply per each bit set */ - {COSTS_N_INSNS (8), /* cost of a divide/mod for QI */ + {COSTS_N_INSNS (18), /* cost of a divide/mod for QI */ COSTS_N_INSNS (8), /* HI */ /* 8-11 */ COSTS_N_INSNS (11), /* SI */ The original PR was to make better code generation when dividing and modulo small integers. Ok, maybe I missed something. I'll get back to PR and see if any other solution will be proposed since for now I have nothing. > > > 2017-11-05 Sebastian Peryt <sebastian.pe...@intel.com> > > > > PR testsuite/82767 > > * gcc.target/i386/pr71321.c: Fix invalid testcase. > > There is nothing wrong with the testcase. > > > Is it ok for trunk? > > > > Thanks, > > Sebastian > >
[Patch, testcase] PR82767 Fix scan-assembler patterns in i386/pr71321.c
Hi, After r253934, gcc.target/i386/pr71321.c started to fail due to the wrong number of scan-assembler matches (2 instead of 3). This patch fixes that. 2017-11-05 Sebastian Peryt PR testsuite/82767 * gcc.target/i386/pr71321.c: Fix invalid testcase. Is it ok for trunk? Thanks, Sebastian PR82767.patch Description: PR82767.patch
[patch][i386, AVX] Adding missing CMP* intrinsics
Hi, This patch, written by Olga Makhotina, adds the missing intrinsics listed below: _mm512_[mask_]cmpeq_[pd|ps]_mask _mm512_[mask_]cmple_[pd|ps]_mask _mm512_[mask_]cmplt_[pd|ps]_mask _mm512_[mask_]cmpneq_[pd|ps]_mask _mm512_[mask_]cmpnle_[pd|ps]_mask _mm512_[mask_]cmpnlt_[pd|ps]_mask _mm512_[mask_]cmpord_[pd|ps]_mask _mm512_[mask_]cmpunord_[pd|ps]_mask 20.10.2017 Olga Makhotina gcc/ * config/i386/avx512fintrin.h (_mm512_cmpeq_pd_mask, _mm512_cmple_pd_mask, _mm512_cmplt_pd_mask, _mm512_cmpneq_pd_mask, _mm512_cmpnle_pd_mask, _mm512_cmpnlt_pd_mask, _mm512_cmpord_pd_mask, _mm512_cmpunord_pd_mask, _mm512_mask_cmpeq_pd_mask, _mm512_mask_cmple_pd_mask, _mm512_mask_cmplt_pd_mask, _mm512_mask_cmpneq_pd_mask, _mm512_mask_cmpnle_pd_mask, _mm512_mask_cmpnlt_pd_mask, _mm512_mask_cmpord_pd_mask, _mm512_mask_cmpunord_pd_mask, _mm512_cmpeq_ps_mask, _mm512_cmple_ps_mask, _mm512_cmplt_ps_mask, _mm512_cmpneq_ps_mask, _mm512_cmpnle_ps_mask, _mm512_cmpnlt_ps_mask, _mm512_cmpord_ps_mask, _mm512_cmpunord_ps_mask, _mm512_mask_cmpeq_ps_mask, _mm512_mask_cmple_ps_mask, _mm512_mask_cmplt_ps_mask, _mm512_mask_cmpneq_ps_mask, _mm512_mask_cmpnle_ps_mask, _mm512_mask_cmpnlt_ps_mask, _mm512_mask_cmpord_ps_mask, _mm512_mask_cmpunord_ps_mask): New intrinsics. 20.10.2017 Olga Makhotina gcc/testsuite/ * gcc.target/i386/avx512f-vcmpps-1.c (_mm512_cmpeq_ps_mask, _mm512_cmple_ps_mask, _mm512_cmplt_ps_mask, _mm512_cmpneq_ps_mask, _mm512_cmpnle_ps_mask, _mm512_cmpnlt_ps_mask, _mm512_cmpord_ps_mask, _mm512_cmpunord_ps_mask, _mm512_mask_cmpeq_ps_mask, _mm512_mask_cmple_ps_mask, _mm512_mask_cmplt_ps_mask, _mm512_mask_cmpneq_ps_mask, _mm512_mask_cmpnle_ps_mask, _mm512_mask_cmpnlt_ps_mask, _mm512_mask_cmpord_ps_mask, _mm512_mask_cmpunord_ps_mask): Test new intrinsics. 
* gcc.target/i386/avx512f-vcmpps-2.c (_mm512_cmpeq_ps_mask, _mm512_cmple_ps_mask, _mm512_cmplt_ps_mask, _mm512_cmpneq_ps_mask, _mm512_cmpnle_ps_mask, _mm512_cmpnlt_ps_mask, _mm512_cmpord_ps_mask, _mm512_cmpunord_ps_mask, _mm512_mask_cmpeq_ps_mask, _mm512_mask_cmple_ps_mask, _mm512_mask_cmplt_ps_mask, _mm512_mask_cmpneq_ps_mask, _mm512_mask_cmpnle_ps_mask, _mm512_mask_cmpnlt_ps_mask, _mm512_mask_cmpord_ps_mask, _mm512_mask_cmpunord_ps_mask): Test new intrinsics. * gcc.target/i386/avx512f-vcmppd-1.c (_mm512_cmpeq_pd_mask, _mm512_cmple_pd_mask, _mm512_cmplt_pd_mask, _mm512_cmpneq_pd_mask, _mm512_cmpnle_pd_mask, _mm512_cmpnlt_pd_mask, _mm512_cmpord_pd_mask, _mm512_cmpunord_pd_mask, _mm512_mask_cmpeq_pd_mask, _mm512_mask_cmple_pd_mask, _mm512_mask_cmplt_pd_mask, _mm512_mask_cmpneq_pd_mask, _mm512_mask_cmpnle_pd_mask, _mm512_mask_cmpnlt_pd_mask, _mm512_mask_cmpord_pd_mask, _mm512_mask_cmpunord_pd_mask): Test new intrinsics. * gcc.target/i386/avx512f-vcmppd-2.c (_mm512_cmpeq_pd_mask, _mm512_cmple_pd_mask, _mm512_cmplt_pd_mask, _mm512_cmpneq_pd_mask, _mm512_cmpnle_pd_mask, _mm512_cmpnlt_pd_mask, _mm512_cmpord_pd_mask, _mm512_cmpunord_pd_mask, _mm512_mask_cmpeq_pd_mask, _mm512_mask_cmple_pd_mask, _mm512_mask_cmplt_pd_mask, _mm512_mask_cmpneq_pd_mask, _mm512_mask_cmpnle_pd_mask, _mm512_mask_cmpnlt_pd_mask, _mm512_mask_cmpord_pd_mask, _mm512_mask_cmpunord_pd_mask): Test new intrinsics. Is it ok for trunk? Thanks, Sebastian 0001-vcmpp-d-s.patch Description: 0001-vcmpp-d-s.patch
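To make the eight predicate families above concrete, here is a scalar C model of the double-precision variants. It is a sketch under assumptions: the enum cmp_pred and the function names are hypothetical, a 512-bit vector is modelled as an array of 8 doubles, and exception signalling is ignored. The point it illustrates is the NaN behaviour: the negated predicates (neq, nlt, nle) are true when either operand is NaN, while eq, lt and le are false.

```c
#include <stdint.h>

/* Hypothetical names for the eight comparison predicates. */
enum cmp_pred {
    CMP_EQ, CMP_LT, CMP_LE, CMP_UNORD,
    CMP_NEQ, CMP_NLT, CMP_NLE, CMP_ORD
};

/* Compare one pair of lanes; relies on IEEE semantics where any
   ordered comparison involving NaN is false. */
static int cmp_scalar(double a, double b, enum cmp_pred p)
{
    switch (p) {
    case CMP_EQ:    return a == b;
    case CMP_LT:    return a < b;
    case CMP_LE:    return a <= b;
    case CMP_UNORD: return a != a || b != b;   /* either is NaN */
    case CMP_NEQ:   return !(a == b);
    case CMP_NLT:   return !(a < b);
    case CMP_NLE:   return !(a <= b);
    case CMP_ORD:   return a == a && b == b;   /* neither is NaN */
    }
    return 0;
}

/* Model of the unmasked form: bit i of the result is the outcome
   of the comparison for lane i. */
static uint8_t cmp_pd_mask_model(const double a[8], const double b[8],
                                 enum cmp_pred p)
{
    uint8_t k = 0;
    for (int i = 0; i < 8; i++)
        if (cmp_scalar(a[i], b[i], p))
            k |= (uint8_t)(1u << i);
    return k;
}

/* Model of the _mask_ form: lanes whose bit in k1 is clear
   produce a zero bit in the result. */
static uint8_t mask_cmp_pd_mask_model(uint8_t k1, const double a[8],
                                      const double b[8], enum cmp_pred p)
{
    return (uint8_t)(k1 & cmp_pd_mask_model(a, b, p));
}
```

The single-precision variants follow the same pattern with 16 float lanes and a 16-bit mask. Note that the model assumes the code is not compiled with options such as -ffast-math that relax NaN handling.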
Missing REDUCE[SD,SS] intrinsics
Hi, This patch, written by Olga Makhotina, adds missing intrinsics for REDUCE[SD,SS]. 16.10.2017 Olga Makhotina gcc/ * config/i386/avx512dqintrin.h (_mm_mask_reduce_sd, _mm_maskz_reduce_sd, _mm_mask_reduce_ss, _mm_maskz_reduce_ss): New intrinsics. * config/i386/i386-builtin.def (__builtin_ia32_reducesd_mask, __builtin_ia32_reducess_mask): New builtin. (__builtin_ia32_reducesd, __builtin_ia32_reducess): Remove. * config/i386/sse.md (reduces): Renamed to ... (reduces): ... this. (vreduce\t{%3, %2, %1, %0|%0, %1, %2, %3}): Changed to ... (vreduce\t{%3, %2, %1, %0| %0, %1, %2, %3}): ... this. gcc/testsuite/ * gcc.target/i386/avx512dq-vreducesd-1.c (_mm_mask_reduce_sd, _mm_maskz_reduce_sd): Test new intrinsics. * gcc.target/i386/avx512dq-vreducesd-2.c: New. * gcc.target/i386/avx512dq-vreducess-1.c (_mm_mask_reduce_ss, _mm_maskz_reduce_ss): Test new intrinsics. * gcc.target/i386/avx512dq-vreducess-2.c: New. * gcc.target/i386/avx-1.c (__builtin_ia32_reducesd, __builtin_ia32_reducess): Remove builtin. (__builtin_ia32_reducesd_mask, __builtin_ia32_reducess_mask): Test new builtin. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. Is it ok for trunk? Thanks, Sebastian 0001-reduce_ss-reduce_sd.patch Description: 0001-reduce_ss-reduce_sd.patch
RE: [PATCH][x86] Knights Mill -march/-mtune options
> -Original Message- > From: Uros Bizjak [mailto:ubiz...@gmail.com] > Sent: Tuesday, September 19, 2017 11:23 PM > To: Peryt, Sebastian <sebastian.pe...@intel.com> > Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin <kirill.yuk...@gmail.com> > Subject: Re: [PATCH][x86] Knights Mill -march/-mtune options > > On Tue, Sep 19, 2017 at 9:01 AM, Peryt, Sebastian <sebastian.pe...@intel.com> > wrote: > > >> >> >> > This patch adds options -march=/-mtune=knm for Knights Mill. > >> >> >> > > >> >> >> > 2017-09-14 Sebastian Peryt <sebastian.pe...@intel.com> gcc/ > >> >> >> > > >> >> >> > * config.gcc: Support "knm". > >> >> >> > * config/i386/driver-i386.c (host_detect_local_cpu): Detect > "knm". > >> >> >> > * config/i386/i386-c.c (ix86_target_macros_internal): > >> >> >> > Handle > >> >> >> > PROCESSOR_KNM. > >> >> >> > * config/i386/i386.c (m_KNM): Define. > >> >> >> > (processor_target_table): Add "knm". > >> >> >> > (PTA_KNM): Define. > >> >> >> > (ix86_option_override_internal): Add "knm". > >> >> >> > (ix86_issue_rate): Add PROCESSOR_KNM. > >> >> >> > (ix86_adjust_cost): Ditto. > >> >> >> > (ia32_multipass_dfa_lookahead): Ditto. > >> >> >> > (get_builtin_code_for_version): Handle PROCESSOR_KNM. > >> >> >> > (fold_builtin_cpu): Define M_INTEL_KNM. > >> >> >> > * config/i386/i386.h (TARGET_KNM): Define. > >> >> >> > (processor_type): Add PROCESSOR_KNM. > >> >> >> > * config/i386/x86-tune.def: Add m_KNM. > >> >> >> > * doc/invoke.texi: Add knm as x86 -march=/-mtune= CPU type. > >> >> >> > > >> >> >> > > >> >> >> > gcc/testsuite/ > >> >> >> > > >> >> >> > * gcc.target/i386/funcspec-5.c: Test knm. > >> >> >> > > >> >> >> > Is it ok for trunk? > >> >> >> > >> >> >> You also have to update libgcc/cpuinfo.h together with > >> >> >> fold_builtin_cpu from i386.c. Please note that all new > >> >> >> processor types and subtypes have to be added at the end of the enum. > >> >> >> > >> >> > > >> >> > Uros, > >> >> > > >> >> > I have updated libgcc/cpuinfo.h and libgcc/cpuinfo.c. 
I > >> >> > understood that CPU_TYPE_MAX in libgcc/cpuinfo.h processor_types > >> >> > is some kind of barrier, this is why I put KNM before that. Is that > >> >> > correct > thinking? > >> >> > As for fold_builtin_cpu in i386.c I already have something like this: > >> >> > > >> >> > @@ -34217,6 +34229,7 @@ fold_builtin_cpu (tree fndecl, tree *args) > >> >> > M_AMDFAM15H, > >> >> > M_INTEL_SILVERMONT, > >> >> > M_INTEL_KNL, > >> >> > +M_INTEL_KNM, > >> >> > M_AMD_BTVER1, > >> >> > M_AMD_BTVER2, > >> >> > M_CPU_SUBTYPE_START, > >> >> > @@ -34262,6 +34275,7 @@ fold_builtin_cpu (tree fndecl, tree *args) > >> >> >{"bonnell", M_INTEL_BONNELL}, > >> >> >{"silvermont", M_INTEL_SILVERMONT}, > >> >> >{"knl", M_INTEL_KNL}, > >> >> > + {"knm", M_INTEL_KNM}, > >> >> >{"amdfam10h", M_AMDFAM10H}, > >> >> >{"barcelona", M_AMDFAM10H_BARCELONA}, > >> >> >{"shanghai", M_AMDFAM10H_SHANGHAI}, > >> >> > > >> >> > I couldn't find any other place where I'm supposed to add anything > extra. > >> >> > >> >> Please look at libgcc/config/i386/cpuinfo.h. The comment here says that: > >> >> > >> >> /* Any new types or subtypes have to be inserted at the end. */ > >> >> > >> >> The above patch should then add M_INTE
RE: [PATCH][x86] Knights Mill -march/-mtune options
> -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Uros Bizjak > Sent: Monday, September 18, 2017 9:10 PM > To: Peryt, Sebastian <sebastian.pe...@intel.com> > Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin <kirill.yuk...@gmail.com> > Subject: Re: [PATCH][x86] Knights Mill -march/-mtune options > > On Mon, Sep 18, 2017 at 12:42 PM, Peryt, Sebastian > <sebastian.pe...@intel.com> wrote: > >> -Original Message- > >> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > >> ow...@gcc.gnu.org] On Behalf Of Uros Bizjak > >> Sent: Monday, September 18, 2017 12:23 PM > >> To: Peryt, Sebastian <sebastian.pe...@intel.com> > >> Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin <kirill.yuk...@gmail.com> > >> Subject: Re: [PATCH][x86] Knights Mill -march/-mtune options > >> > >> On Mon, Sep 18, 2017 at 12:17 PM, Peryt, Sebastian > >> <sebastian.pe...@intel.com> wrote: > >> >> -Original Message- > >> >> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > >> >> ow...@gcc.gnu.org] On Behalf Of Uros Bizjak > >> >> Sent: Sunday, September 17, 2017 6:14 PM > >> >> To: Peryt, Sebastian <sebastian.pe...@intel.com> > >> >> Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin > >> >> <kirill.yuk...@gmail.com> > >> >> Subject: Re: [PATCH][x86] Knights Mill -march/-mtune options > >> >> > >> >> On Thu, Sep 14, 2017 at 1:47 PM, Peryt, Sebastian > >> >> <sebastian.pe...@intel.com> > >> >> wrote: > >> >> > Hi, > >> >> > > >> >> > This patch adds options -march=/-mtune=knm for Knights Mill. > >> >> > > >> >> > 2017-09-14 Sebastian Peryt <sebastian.pe...@intel.com> gcc/ > >> >> > > >> >> > * config.gcc: Support "knm". > >> >> > * config/i386/driver-i386.c (host_detect_local_cpu): Detect > >> >> > "knm". > >> >> > * config/i386/i386-c.c (ix86_target_macros_internal): Handle > >> >> > PROCESSOR_KNM. > >> >> > * config/i386/i386.c (m_KNM): Define. > >> >> > (processor_target_table): Add "knm". > >> >> > (PTA_KNM): Define. 
> >> >> > (ix86_option_override_internal): Add "knm". > >> >> > (ix86_issue_rate): Add PROCESSOR_KNM. > >> >> > (ix86_adjust_cost): Ditto. > >> >> > (ia32_multipass_dfa_lookahead): Ditto. > >> >> > (get_builtin_code_for_version): Handle PROCESSOR_KNM. > >> >> > (fold_builtin_cpu): Define M_INTEL_KNM. > >> >> > * config/i386/i386.h (TARGET_KNM): Define. > >> >> > (processor_type): Add PROCESSOR_KNM. > >> >> > * config/i386/x86-tune.def: Add m_KNM. > >> >> > * doc/invoke.texi: Add knm as x86 -march=/-mtune= CPU type. > >> >> > > >> >> > > >> >> > gcc/testsuite/ > >> >> > > >> >> > * gcc.target/i386/funcspec-5.c: Test knm. > >> >> > > >> >> > Is it ok for trunk? > >> >> > >> >> You also have to update libgcc/cpuinfo.h together with > >> >> fold_builtin_cpu from i386.c. Please note that all new processor > >> >> types and subtypes have to be added at the end of the enum. > >> >> > >> > > >> > Uros, > >> > > >> > I have updated libgcc/cpuinfo.h and libgcc/cpuinfo.c. I understood > >> > that CPU_TYPE_MAX in libgcc/cpuinfo.h processor_types is some kind > >> > of barrier, this is why I put KNM before that. Is that correct thinking? > >> > As for fold_builtin_cpu in i386.c I already have something like this: > >> > > >> > @@ -34217,6 +34229,7 @@ fold_builtin_cpu (tree fndecl, tree *args) > >> > M_AMDFAM15H, > >> > M_INTEL_SILVERMONT, > >> > M_INTEL_KNL, > >> > +M_INTEL_KNM, > >> > M_AMD_BTVER1, > >> > M_AMD_BTVER2, > >> > M_CPU_SUBTYPE_START, > >> > @@ -34262,6 +34275,7 @@ fold_builtin_cpu (tree fndecl, tree *args) > >> >{"bonnell", M_INTEL_BONNELL}, > >> >{"silvermont", M_INTEL_SILVERMONT}, > >> >{"knl", M_INTEL_KNL}, > >> > + {"knm", M_INTEL_KNM}, > >> >{"amdfam10h", M_AMDFAM10H}, > >> >{"barcelona", M_AMDFAM10H_BARCELONA}, > >> >{"shanghai", M_AMDFAM10H_SHANGHAI}, > >> > > >> > I couldn't find any other place where I'm supposed to add anything extra. > >> > >> Please look at libgcc/config/i386/cpuinfo.h. 
The comment here says that: > >> > >> /* Any new types or subtypes have to be inserted at the end. */ > >> > >> The above patch should then add M_INTEL_KNM as the last entry > >> *before* M_CPU_SUBTYPE_START. > >> > > > > Sorry, I didn't notice this value at first. I believe now it's correct. > > OK for mainline SVN (with updated ChangeLog). > Can you please commit for me? Thanks, Sebastian > Thanks, > Uros.
RE: [PATCH][x86] Knights Mill -march/-mtune options
> -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Uros Bizjak > Sent: Monday, September 18, 2017 12:23 PM > To: Peryt, Sebastian <sebastian.pe...@intel.com> > Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin <kirill.yuk...@gmail.com> > Subject: Re: [PATCH][x86] Knights Mill -march/-mtune options > > On Mon, Sep 18, 2017 at 12:17 PM, Peryt, Sebastian > <sebastian.pe...@intel.com> wrote: > >> -Original Message- > >> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > >> ow...@gcc.gnu.org] On Behalf Of Uros Bizjak > >> Sent: Sunday, September 17, 2017 6:14 PM > >> To: Peryt, Sebastian <sebastian.pe...@intel.com> > >> Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin <kirill.yuk...@gmail.com> > >> Subject: Re: [PATCH][x86] Knights Mill -march/-mtune options > >> > >> On Thu, Sep 14, 2017 at 1:47 PM, Peryt, Sebastian > >> <sebastian.pe...@intel.com> > >> wrote: > >> > Hi, > >> > > >> > This patch adds options -march=/-mtune=knm for Knights Mill. > >> > > >> > 2017-09-14 Sebastian Peryt <sebastian.pe...@intel.com> gcc/ > >> > > >> > * config.gcc: Support "knm". > >> > * config/i386/driver-i386.c (host_detect_local_cpu): Detect > >> > "knm". > >> > * config/i386/i386-c.c (ix86_target_macros_internal): Handle > >> > PROCESSOR_KNM. > >> > * config/i386/i386.c (m_KNM): Define. > >> > (processor_target_table): Add "knm". > >> > (PTA_KNM): Define. > >> > (ix86_option_override_internal): Add "knm". > >> > (ix86_issue_rate): Add PROCESSOR_KNM. > >> > (ix86_adjust_cost): Ditto. > >> > (ia32_multipass_dfa_lookahead): Ditto. > >> > (get_builtin_code_for_version): Handle PROCESSOR_KNM. > >> > (fold_builtin_cpu): Define M_INTEL_KNM. > >> > * config/i386/i386.h (TARGET_KNM): Define. > >> > (processor_type): Add PROCESSOR_KNM. > >> > * config/i386/x86-tune.def: Add m_KNM. > >> > * doc/invoke.texi: Add knm as x86 -march=/-mtune= CPU type. > >> > > >> > > >> > gcc/testsuite/ > >> > > >> > * gcc.target/i386/funcspec-5.c: Test knm. 
> >> > > >> > Is it ok for trunk? > >> > >> You also have to update libgcc/cpuinfo.h together with > >> fold_builtin_cpu from i386.c. Please note that all new processor > >> types and subtypes have to be added at the end of the enum. > >> > > > > Uros, > > > > I have updated libgcc/cpuinfo.h and libgcc/cpuinfo.c. I understood > > that CPU_TYPE_MAX in libgcc/cpuinfo.h processor_types is some kind of > > barrier, this is why I put KNM before that. Is that correct thinking? > > As for fold_builtin_cpu in i386.c I already have something like this: > > > > @@ -34217,6 +34229,7 @@ fold_builtin_cpu (tree fndecl, tree *args) > > M_AMDFAM15H, > > M_INTEL_SILVERMONT, > > M_INTEL_KNL, > > +M_INTEL_KNM, > > M_AMD_BTVER1, > > M_AMD_BTVER2, > > M_CPU_SUBTYPE_START, > > @@ -34262,6 +34275,7 @@ fold_builtin_cpu (tree fndecl, tree *args) > >{"bonnell", M_INTEL_BONNELL}, > >{"silvermont", M_INTEL_SILVERMONT}, > >{"knl", M_INTEL_KNL}, > > + {"knm", M_INTEL_KNM}, > >{"amdfam10h", M_AMDFAM10H}, > >{"barcelona", M_AMDFAM10H_BARCELONA}, > >{"shanghai", M_AMDFAM10H_SHANGHAI}, > > > > I couldn't find any other place where I'm supposed to add anything extra. > > Please look at libgcc/config/i386/cpuinfo.h. The comment here says that: > > /* Any new types or subtypes have to be inserted at the end. */ > > The above patch should then add M_INTEL_KNM as the last entry *before* > M_CPU_SUBTYPE_START. > Sorry, I didn't notice this value at first. I believe now it's correct. Sebastian > > Additionally I updated one extra test I found - > > gcc.target/i386/funcspec-56.inc > > > >> Ops, and ANDFAM17H processor type should not be there in cpuinfo.h. > > > > Sorry, I don't understand - it shouldn't be at this position, or in this > > enum at all? > > This means I have to synchronize gcc part with libgcc. I'll do it later today. > > Uros. KNM_enabling_v3.patch Description: KNM_enabling_v3.patch
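A note on the ordering rule discussed in this message. The enumerators are numbered implicitly, so putting a new value in the middle of a group renumbers everything after it, while putting it last in its group (just before the next marker) leaves the existing values alone. Below is a minimal sketch of that effect. It assumes that identifiers are taken relative to a start marker, as the M_CPU_SUBTYPE_START remark suggests. All enum names and the shortened enumerator lists are hypothetical and are not the real GCC definitions.

```c
#include <assert.h>

/* Hypothetical layout before the patch: CPU types, a marker, then subtypes. */
enum model_v1 {
  V1_TYPE_START, V1_SILVERMONT, V1_KNL, V1_BTVER1, V1_BTVER2,
  V1_SUBTYPE_START, V1_NEHALEM
};

/* KNM inserted in the middle of the types (the first version of the patch). */
enum model_mid {
  MID_TYPE_START, MID_SILVERMONT, MID_KNL, MID_KNM, MID_BTVER1, MID_BTVER2,
  MID_SUBTYPE_START, MID_NEHALEM
};

/* KNM added as the last type, just before the subtype marker. */
enum model_end {
  END_TYPE_START, END_SILVERMONT, END_KNL, END_BTVER1, END_BTVER2, END_KNM,
  END_SUBTYPE_START, END_NEHALEM
};

/* Identifier of an entry, assumed to be its offset from the group's marker. */
static int relative_id(int model, int start_marker)
{
  assert(model > start_marker);
  return model - start_marker;
}
```

With this model, BTVER1 keeps its identifier only in the "end" layout, and the subtypes are unaffected because they are counted from their own marker.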
RE: [PATCH][x86] Knights Mill -march/-mtune options
> -----Original Message-----
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Uros Bizjak
> Sent: Sunday, September 17, 2017 6:14 PM
> To: Peryt, Sebastian <sebastian.pe...@intel.com>
> Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin <kirill.yuk...@gmail.com>
> Subject: Re: [PATCH][x86] Knights Mill -march/-mtune options
>
> On Thu, Sep 14, 2017 at 1:47 PM, Peryt, Sebastian <sebastian.pe...@intel.com> wrote:
> > Hi,
> >
> > This patch adds options -march=/-mtune=knm for Knights Mill.
> >
> > 2017-09-14  Sebastian Peryt  <sebastian.pe...@intel.com>
> >
> > gcc/
> >
> > 	* config.gcc: Support "knm".
> > 	* config/i386/driver-i386.c (host_detect_local_cpu): Detect "knm".
> > 	* config/i386/i386-c.c (ix86_target_macros_internal): Handle
> > 	PROCESSOR_KNM.
> > 	* config/i386/i386.c (m_KNM): Define.
> > 	(processor_target_table): Add "knm".
> > 	(PTA_KNM): Define.
> > 	(ix86_option_override_internal): Add "knm".
> > 	(ix86_issue_rate): Add PROCESSOR_KNM.
> > 	(ix86_adjust_cost): Ditto.
> > 	(ia32_multipass_dfa_lookahead): Ditto.
> > 	(get_builtin_code_for_version): Handle PROCESSOR_KNM.
> > 	(fold_builtin_cpu): Define M_INTEL_KNM.
> > 	* config/i386/i386.h (TARGET_KNM): Define.
> > 	(processor_type): Add PROCESSOR_KNM.
> > 	* config/i386/x86-tune.def: Add m_KNM.
> > 	* doc/invoke.texi: Add knm as x86 -march=/-mtune= CPU type.
> >
> > gcc/testsuite/
> >
> > 	* gcc.target/i386/funcspec-5.c: Test knm.
> >
> > Is it ok for trunk?
>
> You also have to update libgcc/cpuinfo.h together with fold_builtin_cpu from
> i386.c. Please note that all new processor types and subtypes have to be added
> at the end of the enum.
>

Uros,

I have updated libgcc/cpuinfo.h and libgcc/cpuinfo.c. I understood that CPU_TYPE_MAX in libgcc/cpuinfo.h processor_types is some kind of barrier, this is why I put KNM before that. Is that correct thinking?

As for fold_builtin_cpu in i386.c I already have something like this:

@@ -34217,6 +34229,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
     M_AMDFAM15H,
     M_INTEL_SILVERMONT,
     M_INTEL_KNL,
+    M_INTEL_KNM,
     M_AMD_BTVER1,
     M_AMD_BTVER2,
     M_CPU_SUBTYPE_START,
@@ -34262,6 +34275,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
       {"bonnell", M_INTEL_BONNELL},
       {"silvermont", M_INTEL_SILVERMONT},
       {"knl", M_INTEL_KNL},
+      {"knm", M_INTEL_KNM},
       {"amdfam10h", M_AMDFAM10H},
       {"barcelona", M_AMDFAM10H_BARCELONA},
       {"shanghai", M_AMDFAM10H_SHANGHAI},

I couldn't find any other place where I'm supposed to add anything extra.

Additionally I updated one extra test I found - gcc.target/i386/funcspec-56.inc

> Ops, and ANDFAM17H processor type should not be there in cpuinfo.h.

Sorry, I don't understand - it shouldn't be at this position, or in this enum at all?

>
> Uros.

Thanks,
Sebastian

2017-09-18  Sebastian Peryt  <sebastian.pe...@intel.com>

gcc/

	* config.gcc: Support "knm".
	* config/i386/driver-i386.c (host_detect_local_cpu): Detect "knm".
	* config/i386/i386-c.c (ix86_target_macros_internal): Handle
	PROCESSOR_KNM.
	* config/i386/i386.c (m_KNM): Define.
	(processor_target_table): Add "knm".
	(PTA_KNM): Define.
	(ix86_option_override_internal): Add "knm".
	(ix86_issue_rate): Add PROCESSOR_KNM.
	(ix86_adjust_cost): Ditto.
	(ia32_multipass_dfa_lookahead): Ditto.
	(get_builtin_code_for_version): Handle PROCESSOR_KNM.
	(fold_builtin_cpu): Define M_INTEL_KNM.
	* config/i386/i386.h (TARGET_KNM): Define.
	(processor_type): Add PROCESSOR_KNM.
	* config/i386/x86-tune.def: Add m_KNM.
	* doc/invoke.texi: Add knm as x86 -march=/-mtune= CPU type.

libgcc/

	* config/i386/cpuinfo.h (processor_types): Add INTEL_KNM.
	* config/i386/cpuinfo.c (get_intel_cpu): Detect Knights Mill.

gcc/testsuite/

	* gcc.target/i386/funcspec-5.c: Test knm.
	* gcc.target/i386/funcspec-56.inc: Test arch=knm.

KNM_enabling_v2.patch
Description: KNM_enabling_v2.patch
[PATCH][x86] Knights Mill -march/-mtune options
Hi,

This patch adds options -march=/-mtune=knm for Knights Mill.

2017-09-14  Sebastian Peryt  <sebastian.pe...@intel.com>

gcc/

	* config.gcc: Support "knm".
	* config/i386/driver-i386.c (host_detect_local_cpu): Detect "knm".
	* config/i386/i386-c.c (ix86_target_macros_internal): Handle
	PROCESSOR_KNM.
	* config/i386/i386.c (m_KNM): Define.
	(processor_target_table): Add "knm".
	(PTA_KNM): Define.
	(ix86_option_override_internal): Add "knm".
	(ix86_issue_rate): Add PROCESSOR_KNM.
	(ix86_adjust_cost): Ditto.
	(ia32_multipass_dfa_lookahead): Ditto.
	(get_builtin_code_for_version): Handle PROCESSOR_KNM.
	(fold_builtin_cpu): Define M_INTEL_KNM.
	* config/i386/i386.h (TARGET_KNM): Define.
	(processor_type): Add PROCESSOR_KNM.
	* config/i386/x86-tune.def: Add m_KNM.
	* doc/invoke.texi: Add knm as x86 -march=/-mtune= CPU type.

gcc/testsuite/

	* gcc.target/i386/funcspec-5.c: Test knm.

Is it ok for trunk?

Thanks,
Sebastian

KNM_enabling.patch
Description: KNM_enabling.patch
RE: [PATCH] i386: Rewrite check for AVX512 features
> -----Original Message-----
> From: Uros Bizjak [mailto:ubiz...@gmail.com]
> Sent: Sunday, July 30, 2017 11:02 AM
> To: H.J. Lu <hjl.to...@gmail.com>
> Cc: gcc-patches@gcc.gnu.org; Koval, Julia <julia.ko...@intel.com>; Peryt, Sebastian <sebastian.pe...@intel.com>
> Subject: Re: [PATCH] i386: Rewrite check for AVX512 features
>
> On Sat, Jul 29, 2017 at 3:06 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
> > Add a new file, avx512-check.h, to check all AVX512 features. The
> > test is skipped if any requested AVX512 features are unavailable.
> >
> > Tested on Skylake server and Haswell. OK for trunk?
>
> No, I'd rather leave it in in the way they are now, so test can include
> individual checks.
>
> Uros.
>

Uros,

Can you please suggest an alternative approach? The main problem with the current one used in avx512f-helper.h is that it doesn't take into account situations where two features are required, but only one is supported by the CPU. That's exactly the case with AVX512VL and AVX512VBMI on SKX. Once avx512vl-check.h verifies the existence of AVX512VL on SKX it starts to execute the test, which fails because AVX512VBMI is not supported, but that was never checked before test execution.

Honestly, I cannot think of any solution that would allow for both individual include files (besides what HJ already did in those few remaining tests) and verification of multiple features. Also, I think it's worth taking into account that not many tests actually use individual include files instead of avx512f-helper.h.

Thanks,
Sebastian

> >
> > H.J.
> > ---
> > 	PR target/81590
> > 	* gcc.target/i386/avx512-check.h: New file.
> > 	* gcc.target/i386/avx5124fmaps-check.h: Removed.
> > 	* gcc.target/i386/avx5124vnniw-check.h: Likewise.
> > 	* gcc.target/i386/avx512cd-check.h: Likewise.
> > 	* gcc.target/i386/avx512ifma-check.h: Likewise.
> > 	* gcc.target/i386/avx512vbmi-check.h: Likewise.
> > 	* gcc.target/i386/avx512vpopcntdq-check.h: Likewise.
> > 	* gcc.target/i386/avx512bw-check.h: Rewrite.
> > 	* gcc.target/i386/avx512dq-check.h: Likewise.
> > 	* gcc.target/i386/avx512er-check.h: Likewise.
> > 	* gcc.target/i386/avx512f-check.h: Likewise.
> > 	* gcc.target/i386/avx512vl-check.h: Likewise.
> > 	* gcc.target/i386/avx512f-helper.h: Include "avx512-check.h"
> > 	only.
> > 	(test_512): Removed.
> > 	(avx512*_test): Likewise.
> > 	* gcc.target/i386/avx512f-pr71559.c (TEST): Undef.
> > ---
> >  gcc/testsuite/gcc.target/i386/avx512-check.h       | 113 +
> >  gcc/testsuite/gcc.target/i386/avx5124fmaps-check.h |  47 -
> >  gcc/testsuite/gcc.target/i386/avx5124vnniw-check.h |  47 -
> >  gcc/testsuite/gcc.target/i386/avx512bw-check.h     |  50 +
> >  gcc/testsuite/gcc.target/i386/avx512cd-check.h     |  46 -
> >  gcc/testsuite/gcc.target/i386/avx512dq-check.h     |  50 +
> >  gcc/testsuite/gcc.target/i386/avx512er-check.h     |  49 +
> >  gcc/testsuite/gcc.target/i386/avx512f-check.h      |  49 +
> >  gcc/testsuite/gcc.target/i386/avx512f-helper.h     |  64 +---
> >  gcc/testsuite/gcc.target/i386/avx512f-pr71559.c    |   1 +
> >  gcc/testsuite/gcc.target/i386/avx512ifma-check.h   |  46 -
> >  gcc/testsuite/gcc.target/i386/avx512vbmi-check.h   |  46 -
> >  gcc/testsuite/gcc.target/i386/avx512vl-check.h     |  51 +-
> >  .../gcc.target/i386/avx512vpopcntdq-check.h        |  47 -
> >  14 files changed, 130 insertions(+), 576 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/avx512-check.h
> >  delete mode 100644 gcc/testsuite/gcc.target/i386/avx5124fmaps-check.h
> >  delete mode 100644 gcc/testsuite/gcc.target/i386/avx5124vnniw-check.h
> >  delete mode 100644 gcc/testsuite/gcc.target/i386/avx512cd-check.h
> >  delete mode 100644 gcc/testsuite/gcc.target/i386/avx512ifma-check.h
> >  delete mode 100644 gcc/testsuite/gcc.target/i386/avx512vbmi-check.h
> >  delete mode 100644 gcc/testsuite/gcc.target/i386/avx512vpopcntdq-check.h
> >
> > diff --git a/gcc/testsuite/gcc.target/i386/avx512-check.h
> > b/gcc/testsuite/gcc.target/i386/avx512-check.h
> > new file mode 100644
> > index 000..bfe14960100
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/avx512-check.h
> > @@ -0,0 +1,113 @@
> > +#include
> > +#include "cpuid.h"
> > +#include "m512-check.h"
> > +#include "avx512f-os-support.h"
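To make the AVX512VL plus AVX512VBMI case from this message concrete, here is a small sketch of the difference between checking one feature and checking every required feature before running a test. It is only an illustration under stated assumptions: the FEAT_* bits and both function names are hypothetical, and the real headers query CPUID rather than a plain bitmask.

```c
#include <assert.h>

/* Hypothetical feature bits; the real check headers read CPUID instead. */
enum {
  FEAT_AVX512F    = 1 << 0,
  FEAT_AVX512VL   = 1 << 1,
  FEAT_AVX512VBMI = 1 << 2
};

/* Run the test only if every required feature is available. */
static int all_features_present(unsigned required, unsigned available)
{
  assert(required != 0);
  return (available & required) == required;
}

/* The per-header style: only one of the required features is looked at. */
static int single_feature_present(unsigned checked, unsigned available)
{
  return (available & checked) != 0;
}
```

For a machine that has AVX512F and AVX512VL but not AVX512VBMI, the single check on AVX512VL says "run", while the combined check correctly says "skip".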
[PATCH][x86] Add missing intrinsics for VGETMANT[SD,SS] and VGETEXP[SD,SS]
Hi, This patch adds missing intrinsics for VGETEXPSD, VGETEXPSS, VGETMANTSD, VGETMANTSS. 2017-07-06 Sebastian Perytgcc/ * config/i386/avx512fintrin.h (_mm_mask_getexp_round_ss, _mm_maskz_getexp_round_ss, _mm_mask_getexp_round_sd, _mm_maskz_getexp_round_sd, _mm_mask_getmant_round_sd, _mm_maskz_getmant_round_sd, _mm_mask_getmant_round_ss, _mm_maskz_getmant_round_ss, _mm_mask_getexp_ss, _mm_maskz_getexp_ss, _mm_mask_getexp_sd, _mm_maskz_getexp_sd, _mm_mask_getmant_sd, _mm_maskz_getmant_sd, _mm_mask_getmant_ss, _mm_maskz_getmant_ss): New intrinsics. (__builtin_ia32_getexpss128_mask): Changed to ... __builtin_ia32_getexpss128_round ... this. (__builtin_ia32_getexpsd128_mask): Changed to ... __builtin_ia32_getexpsd128_round ... this. * config/i386/i386-builtin-types.def ((V2DF, V2DF, V2DF, INT, V2DF, UQI, INT), (V4SF, V4SF, V4SF, INT, V4SF, UQI, INT)): New function type aliases. * config/i386/i386-builtin.def (__builtin_ia32_getexpsd_mask_round, __builtin_ia32_getexpss_mask_round, __builtin_ia32_getmantsd_mask_round, __builtin_ia32_getmantss_mask_round): New builtins. * config/i386/i386.c (V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI_INT, V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI_INT): Handle new types. (CODE_FOR_avx512f_vgetmantv2df_mask_round, CODE_FOR_avx512f_vgetmantv4sf_mask_round): New cases. * config/i386/sse.md (avx512f_sgetexp): Changed to ... avx512f_sgetexp ... this. (vgetexp\t{%2, %1, %0| %0, %1, %2}): Changed to ... vgetexp \t{%2, %1, %0| %0, %1, %2} ... this. (avx512f_vgetmant): Changed to ... avx512f_vgetmant ... this. (vgetmant\t{%3, %2, %1, %0| %0, %1, %2, %3}): Changed to ... vgetmant \t{%3, %2, %1, %0| %0, %1, %2 , %3} ... this. * config/i386/subst.md (mask_scalar_operand4, round_saeonly_scalar_mask_operand4, round_saeonly_scalar_mask_op4, round_saeonly_scalar_nimm_predicate): New subst attributes. gcc/testsuite/ * gcc.target/i386/avx512f-vgetexpsd-1.c (_mm_mask_getexp_sd, _mm_maskz_getexp_sd, _mm_mask_getexp_round_sd, _mm_maskz_getexp_round_sd): Test new intrinsics. 
* gcc.target/i386/avx512f-vgetexpss-1.c (_mm_mask_getexp_ss, _mm_maskz_getexp_ss, _mm_mask_getexp_round_ss, _mm_maskz_getexp_round_ss): Ditto. * gcc.target/i386/avx512f-vgetmantsd-1.c (_mm_mask_getmant_sd, _mm_maskz_getmant_sd, _mm_mask_getmant_round_sd, _mm_maskz_getmant_round_sd): Ditto. * gcc.target/i386/avx512f-vgetmantss-1.c (_mm_mask_getmant_ss, _mm_maskz_getmant_ss, _mm_mask_getmant_round_ss, _mm_maskz_getmant_round_ss): Ditto. * gcc.target/i386/avx512f-vgetexpsd-2.c (_mm_mask_getexp_sd, _mm_maskz_getexp_sd, _mm_getexp_round_sd, _mm_mask_getexp_round_sd, _mm_maskz_getexp_round_sd): New runtime tests. * gcc.target/i386/avx512f-vgetexpss-2.c (_mm_mask_getexp_ss, _mm_maskz_getexp_ss, _mm_getexp_round_ss, _mm_mask_getexp_round_ss, _mm_maskz_getexp_round_ss): Ditto. * gcc.target/i386/avx512f-vgetmantsd-2.c (_mm_mask_getmant_sd, _mm_maskz_getmant_sd, _mm_getmant_round_sd, _mm_mask_getmant_round_sd, _mm_maskz_getmant_round_sd): Ditto. * gcc.target/i386/avx512f-vgetmantss-2.c (_mm_mask_getmant_ss, _mm_maskz_getmant_ss, _mm_getmant_round_ss, _mm_mask_getmant_round_ss, _mm_maskz_getmant_round_ss): Ditto. * gcc.target/i386/avx-1.c (__builtin_ia32_getexpsd_mask_round, __builtin_ia32_getexpss_mask_round, __builtin_ia32_getmantsd_mask_round, __builtin_ia32_getmantss_mask_round): Test new builtins. * gcc.target/i386/sse-13.c : Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/sse-14.c (_mm_maskz_getexp_round_sd, _mm_maskz_getexp_round_ss, _mm_mask_getmant_round_sd, _mm_maskz_getmant_round_sd, _mm_mask_getmant_round_ss, _mm_maskz_getmant_round_ss, _mm_mask_getexp_round_sd, _mm_mask_getexp_round_ss): Test new intrinsics. * gcc.target/i386/testround-1.c: Ditto. * gcc.target/i386/sse-22.c (_mm_maskz_getmant_round_sd, _mm_maskz_getmant_round_ss, _mm_mask_getmant_round_sd, _mm_mask_getmant_round_ss): Test new intrinsics * gcc.target/i386/testimm-10.c (_mm_mask_getmant_sd, _mm_maskz_getmant_sd, _mm_mask_getmant_ss, _mm_maskz_getmant_ss): Test new intrinsics. 
Is it ok for trunk? Thanks, Sebastian Missing_GETEXP_GETMANT.patch Description: Missing_GETEXP_GETMANT.patch
RE: [PATCH][x86] Scalar mask and round RTL templates
Tests were added. I also updated Changelog and set the max line length to be equal to 79 characters. gcc/ * config/i386/subst.md (mask_scalar, round_scalar, round_saeonly_scalar): New meta-templates. (mask_scalar_name, mask_scalar_operand3, round_scalar_name, round_scalar_mask_operand3, round_scalar_mask_op3, round_scalar_constraint, round_scalar_prefix, round_saeonly_scalar_name, round_saeonly_scalar_mask_operand3, round_saeonly_scalar_mask_op3, round_saeonly_scalar_constraint, round_saeonly_scalar_prefix): New subst attribute. * config/i386/sse.md (_vm3): Renamed to ... _vm3 ... this. (_vm3): Renamed to ... _vm3 ... this. (_vm3): Renamed to ... _vm3 ... this. (v \t{%2, %1, %0| %0, %1, %2}): Changed to ... v \t{%2, %1, %0| %0, %1, %2} ... this. (v \t{%2, %1, %0| %0, %1, %2}): Changed to ... v \t{%2, %1, %0| %0, %1, %2} ... this. (v \t{%2, %1, %0| %0, %1, %2}): Changed to ... v \t{%2, %1, %0| %0, %1, %2 } ... this. gcc/testsuite * gcc.target/i386/avx512f-vaddsd-3.c: New test for mask 0 verification. * gcc.target/i386/avx512f-vaddss-3.c: Ditto. * gcc.target/i386/avx512f-vdivsd-3.c: Ditto. * gcc.target/i386/avx512f-vdivss-3.c: Ditto. * gcc.target/i386/avx512f-vmaxsd-3.c: Ditto. * gcc.target/i386/avx512f-vmaxss-3.c: Ditto. * gcc.target/i386/avx512f-vminsd-3.c: Ditto. * gcc.target/i386/avx512f-vminss-3.c: Ditto. * gcc.target/i386/avx512f-vmulsd-3.c: Ditto. * gcc.target/i386/avx512f-vmulss-3.c: Ditto. * gcc.target/i386/avx512f-vsubsd-3.c: Ditto. * gcc.target/i386/avx512f-vsubss-3.c: Ditto. Is it ok for trunk? Thanks, Sebastian -Original Message- From: Kirill Yukhin [mailto:kirill.yuk...@gmail.com] Sent: Wednesday, July 5, 2017 12:36 PM To: Peryt, Sebastian <sebastian.pe...@intel.com> Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATHC][x86] Scalar mask and round RTL templates On 05 Jul 06:38, Peryt, Sebastian wrote: > Hi Kirill, > > Sorry for this confusion. I meant to write MDs for intrinsics. 
Those > intrinsics are all masked ones for ADD[SD,SS], SUB[SD,SS], MUL[SD,SS], > DIV[SD,SS], MIN[SD,SS] and MAX[SD,SS]. What I found is that for mask equal 0 > they were producing wrong results when old mask meta-template was used. What you're talking about looks like a bug. Could you pls add a regession test to your patch? > Modified changelog below. > > 2017-07-05 Sebastian Peryt <sebastian.pe...@intel.com> > > gcc/ > * config/i386/subst.md (mask_scalar, round_scalar, > round_saeonly_scalar): New meta-templates. > (mask_scalar_name, mask_scalar_operand3, round_scalar_name, > round_scalar_mask_operand3, round_scalar_mask_op3, > round_scalar_constraint, round_scalar_prefix, round_saeonly_scalar_name, > round_saeonly_scalar_mask_operand3, round_saeonly_scalar_mask_op3, > round_saeonly_scalar_constraint, round_saeonly_scalar_prefix): New > subst attribute. > * config/i386/sse.md > (_vm3): Renamed to ... > _vm3 > ... this. > (_vm3): Renamed to > ... > _vm3 > ... this. > (_vm3): Renamed to ... > _vm3 ... > this. > (v\t{%2, %1, > %0| > %0, %1, %2}): Changed to ... > v\t{%2, > %1, %0| > %0, %1, %2} ... this. > (v\t{%2, %1, > %0| > %0, %1, %2}): Changed to ... > v\t{%2, > %1, %0| > %0, %1, %2} ... this. > (v\t{%2, %1, > %0| > %0, %1, %2}): Changed to > ... > > v\t{%2, %1, > %0| > %0, %1, %2} > ... this. Max line length is 79 characters I suppose. -- Thanks, K > > Is it ok for trunk? > > Thanks, > Sebastian > > -Original Message- > From: Kirill Yukhin [mailto:kirill.yuk...@gmail.com] > Sent: Tuesday, July 4, 2017 7:45 PM > To: Peryt, Sebastian <sebastian.pe...@intel.com> > Cc: gcc-patches@gcc.gnu.org; Uros Bizjak <ubiz...@gmail.com> > Subject: Re: [PATHC][x86] Scalar mask and round RTL templates > > Hello Sebastian, > On 23 Jun 09:00, Peryt, Sebastian wrote: > > Hi, > > > > This patch adds three extra RTL meta-templates for scalar round and mask. > > Additionally fixes errors caused by previous mask and round usage in some > > of the intrinsics that I found. 
> Could you pls point which intrinsics did you fixed (or which errors)? > I see only MD changes in your patch. > > > > > 2017-06-23 Sebastian Peryt <sebastian.pe...@intel.
RE: [PATCH][x86] Scalar mask and round RTL templates
Hi Kirill, Sorry for this confusion. I meant to write MDs for intrinsics. Those intrinsics are all masked ones for ADD[SD,SS], SUB[SD,SS], MUL[SD,SS], DIV[SD,SS], MIN[SD,SS] and MAX[SD,SS]. What I found is that for mask equal 0 they were producing wrong results when old mask meta-template was used. Modified changelog below. 2017-07-05 Sebastian Peryt <sebastian.pe...@intel.com> gcc/ * config/i386/subst.md (mask_scalar, round_scalar, round_saeonly_scalar): New meta-templates. (mask_scalar_name, mask_scalar_operand3, round_scalar_name, round_scalar_mask_operand3, round_scalar_mask_op3, round_scalar_constraint, round_scalar_prefix, round_saeonly_scalar_name, round_saeonly_scalar_mask_operand3, round_saeonly_scalar_mask_op3, round_saeonly_scalar_constraint, round_saeonly_scalar_prefix): New subst attribute. * config/i386/sse.md (_vm3): Renamed to ... _vm3 ... this. (_vm3): Renamed to ... _vm3 ... this. (_vm3): Renamed to ... _vm3 ... this. (v\t{%2, %1, %0| %0, %1, %2}): Changed to ... v\t{%2, %1, %0| %0, %1, %2} ... this. (v\t{%2, %1, %0| %0, %1, %2}): Changed to ... v\t{%2, %1, %0| %0, %1, %2} ... this. (v\t{%2, %1, %0| %0, %1, %2}): Changed to ... v\t{%2, %1, %0| %0, %1, %2} ... this. Is it ok for trunk? Thanks, Sebastian -Original Message- From: Kirill Yukhin [mailto:kirill.yuk...@gmail.com] Sent: Tuesday, July 4, 2017 7:45 PM To: Peryt, Sebastian <sebastian.pe...@intel.com> Cc: gcc-patches@gcc.gnu.org; Uros Bizjak <ubiz...@gmail.com> Subject: Re: [PATHC][x86] Scalar mask and round RTL templates Hello Sebastian, On 23 Jun 09:00, Peryt, Sebastian wrote: > Hi, > > This patch adds three extra RTL meta-templates for scalar round and mask. > Additionally fixes errors caused by previous mask and round usage in some of > the intrinsics that I found. Could you pls point which intrinsics did you fixed (or which errors)? I see only MD changes in your patch. 
> > 2017-06-23 Sebastian Peryt <sebastian.pe...@intel.com> > > gcc/ > * config/i386/subst.md (mask_scalar, round_scalar, > round_saeonly_scalar): New templates. I'd call it meta-templates. > (mask_scalar_name, mask_scalar_operand3, round_scalar_name, > round_scalar_mask_operand3, round_scalar_mask_op3, > round_scalar_constraint, round_scalar_prefix, round_saeonly_scalar_name, > round_saeonly_scalar_mask_operand3, round_saeonly_scalar_mask_op3, > round_saeonly_scalar_constraint, round_saeonly_scalar_prefix): New > subst attribute. > * config/i386/sse.md > (_vm3): Renamed to ... > _vm3 > ... this. > (_vm3): Renamed to > ... > _vm3 > ... this. > (_vm3): Renamed to ... > _vm3 ... > this. > (v\t{%2, %1, > %0|%0, %1, %2}): Changed > to ... > v\t{%2, > %1, %0|%0, %1, > %2} ... this. > (v\t{%2, %1, > %0|%0, %1, %2}): Changed > to ... > v\t{%2, > %1, %0|%0, %1, > %2} ... this. > (v\t{%2, %1, > %0|%0, %1, %2}): > Changed to ... > > v\t{%2, %1, > %0|%0, %1, > %2} ... this. We need to obey conventions. Pls break long lines here. -- Thanks, K > > Is it ok for trunk? > > Thanks, > Sebastian
[PATCH][x86] Add permutex[var]_epi[32,64] intrinsics
Hi,

This patch adds missing intrinsics:
- _mm256_permutexvar_epi32
- _mm256_permutex_epi64
- _mm256_permutexvar_epi64

gcc/
	* config/i386/avx512vlintrin.h (_mm256_permutexvar_epi64,
	_mm256_permutexvar_epi32, _mm256_permutex_epi64): New intrinsics.

gcc/testsuite/
	* gcc.target/i386/avx512vl-vpermd-1.c (_mm256_permutexvar_epi32):
	Test new intrinsic.
	* gcc.target/i386/avx512vl-vpermq-imm-1.c (_mm256_permutex_epi64):
	Ditto.
	* gcc.target/i386/avx512vl-vpermq-var-1.c (_mm256_permutexvar_epi64):
	Ditto.
	* gcc.target/i386/avx512f-vpermd-2.c: Removed define length
	constraint.
	* gcc.target/i386/avx512f-vpermq-imm-2.c: Ditto.
	* gcc.target/i386/avx512f-vpermq-var-2.c: Ditto.

Is it ok for trunk?

Thanks,
Sebastian

permutex.patch
Description: permutex.patch
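For readers unfamiliar with these intrinsics, the following is a rough scalar model of what each one computes per lane. It is a sketch, not the code from the patch: the *_lane helper names and the demo arrays are made up for illustration, and the vectors are modelled as plain arrays rather than __m256i.

```c
#include <assert.h>
#include <stdint.h>

/* Model of _mm256_permutexvar_epi32 (idx, a): lane i takes a[idx[i] & 7]. */
static int32_t permutexvar_epi32_lane(const int32_t idx[8], const int32_t a[8], int lane)
{
  assert(lane >= 0 && lane < 8);
  return a[idx[lane] & 7];
}

/* Model of _mm256_permutexvar_epi64 (idx, a): lane i takes a[idx[i] & 3]. */
static int64_t permutexvar_epi64_lane(const int64_t idx[4], const int64_t a[4], int lane)
{
  assert(lane >= 0 && lane < 4);
  return a[idx[lane] & 3];
}

/* Model of _mm256_permutex_epi64 (a, imm8): lane i is selected by the
   two immediate bits [2i+1:2i].  */
static int64_t permutex_epi64_lane(const int64_t a[4], unsigned imm8, int lane)
{
  assert(lane >= 0 && lane < 4);
  return a[(imm8 >> (2 * lane)) & 3];
}

/* Illustrative data: index vectors that reverse the element order. */
static const int32_t demo_a32[8]   = {10, 11, 12, 13, 14, 15, 16, 17};
static const int32_t demo_rev32[8] = {7, 6, 5, 4, 3, 2, 1, 0};
static const int64_t demo_a64[4]   = {100, 101, 102, 103};
static const int64_t demo_rev64[4] = {3, 2, 1, 0};
```

In this model the immediate 0x1B (binary 00 01 10 11) reverses the four 64-bit lanes, which matches the variable-index form with demo_rev64.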
[PATCH][x86] Scalar mask and round RTL templates
Hi,

This patch adds three extra RTL meta-templates for scalar round and mask. Additionally fixes errors caused by previous mask and round usage in some of the intrinsics that I found.

2017-06-23  Sebastian Peryt  <sebastian.pe...@intel.com>

gcc/
	* config/i386/subst.md (mask_scalar, round_scalar,
	round_saeonly_scalar): New templates.
	(mask_scalar_name, mask_scalar_operand3, round_scalar_name,
	round_scalar_mask_operand3, round_scalar_mask_op3,
	round_scalar_constraint, round_scalar_prefix,
	round_saeonly_scalar_name, round_saeonly_scalar_mask_operand3,
	round_saeonly_scalar_mask_op3, round_saeonly_scalar_constraint,
	round_saeonly_scalar_prefix): New subst attribute.
	* config/i386/sse.md
	(_vm3): Renamed to ...
	_vm3 ... this.
	(_vm3): Renamed to ...
	_vm3 ... this.
	(_vm3): Renamed to ...
	_vm3 ... this.
	(v\t{%2, %1, %0|%0, %1, %2}): Changed to ...
	v\t{%2, %1, %0|%0, %1, %2} ... this.
	(v\t{%2, %1, %0|%0, %1, %2}): Changed to ...
	v\t{%2, %1, %0|%0, %1, %2} ... this.
	(v\t{%2, %1, %0|%0, %1, %2}): Changed to ...
	v\t{%2, %1, %0|%0, %1, %2} ... this.

Is it ok for trunk?

Thanks,
Sebastian

Scalar-templates.patch
Description: Scalar-templates.patch
[PATCH][x86] Add missing mask intrinsics for MAX[SD,SS] and MIN[SD,SS]
Hi,

This patch adds missing intrinsics for MAX[SD,SS] and MIN[SD,SS] listed below:
- _mm_mask_max_sd,
- _mm_maskz_max_sd,
- _mm_mask_max_ss,
- _mm_maskz_max_ss,
- _mm_mask_min_sd,
- _mm_maskz_min_sd,
- _mm_mask_min_ss,
- _mm_maskz_min_ss.

gcc/
	* config/i386/avx512fintrin.h (_mm_mask_max_sd, _mm_maskz_max_sd,
	_mm_mask_max_ss, _mm_maskz_max_ss, _mm_mask_min_sd,
	_mm_maskz_min_sd, _mm_mask_min_ss, _mm_maskz_min_ss): New
	intrinsics.

gcc/testsuite/
	* gcc.target/i386/avx512f-vmaxsd-1.c (_mm_mask_max_sd,
	_mm_maskz_max_sd): Test new intrinsics.
	* gcc.target/i386/avx512f-vmaxsd-2.c (_mm_mask_max_sd,
	_mm_maskz_max_sd): Test new intrinsics.
	* gcc.target/i386/avx512f-vmaxss-1.c (_mm_mask_max_ss,
	_mm_maskz_max_ss): Test new intrinsics.
	* gcc.target/i386/avx512f-vmaxss-2.c (_mm_mask_max_ss,
	_mm_maskz_max_ss): Test new intrinsics.
	* gcc.target/i386/avx512f-vminsd-1.c (_mm_mask_min_sd,
	_mm_maskz_min_sd): Test new intrinsics.
	* gcc.target/i386/avx512f-vminsd-2.c (_mm_mask_min_sd,
	_mm_maskz_min_sd): Test new intrinsics.
	* gcc.target/i386/avx512f-vminss-1.c (_mm_mask_min_ss,
	_mm_maskz_min_ss): Test new intrinsics.
	* gcc.target/i386/avx512f-vminss-2.c (_mm_mask_min_ss,
	_mm_maskz_min_ss): Test new intrinsics.

Is it ok for trunk?

Thanks,
Sebastian

MASK_MAX[SD,SS]_MIN[SD,SS].patch
Description: MASK_MAX[SD,SS]_MIN[SD,SS].patch
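As a reminder of what the mask and maskz variants are expected to compute, here is a scalar sketch of the low element only. It is a model under stated assumptions, not the implementation from the patch: the *_low helper names are hypothetical, only bit 0 of the mask matters for these scalar forms, and the upper element (copied from the first source operand) is left out.

```c
#include <assert.h>

/* MAXSD/MINSD-style selection: the second operand is returned when the
   comparison is false, which includes the NaN case.  */
static double max_low(double a, double b) { return a > b ? a : b; }
static double min_low(double a, double b) { return a < b ? a : b; }

/* Merge masking: if mask bit 0 is clear, the low element comes from src. */
static double mask_max_sd_low(double src, unsigned k, double a, double b)
{
  assert(k <= 0xFF);  /* models an 8-bit __mmask8 */
  return (k & 1) ? max_low(a, b) : src;
}

static double mask_min_sd_low(double src, unsigned k, double a, double b)
{
  assert(k <= 0xFF);
  return (k & 1) ? min_low(a, b) : src;
}

/* Zero masking: if mask bit 0 is clear, the low element is zeroed. */
static double maskz_max_sd_low(unsigned k, double a, double b)
{
  assert(k <= 0xFF);
  return (k & 1) ? max_low(a, b) : 0.0;
}

static double maskz_min_sd_low(unsigned k, double a, double b)
{
  assert(k <= 0xFF);
  return (k & 1) ? min_low(a, b) : 0.0;
}
```

The mask-equal-0 rows are the interesting ones for testing, since that is where merge masking and zero masking differ.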
RE: [PATCH][x86] Fix for false-positive results of runtime tests on machines not supporting AVX512F
Thank you very much for the clarification.
Yes, you are right, it would be better if such tests were marked UNSUPPORTED.

Sebastian

-----Original Message-----
From: Uros Bizjak [mailto:ubiz...@gmail.com]
Sent: Tuesday, May 30, 2017 8:23 AM
To: Peryt, Sebastian <sebastian.pe...@intel.com>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH][x86]Fix for false-positives results of runtime tests on machines not supporting AVX512F

On Tue, May 30, 2017 at 7:59 AM, Peryt, Sebastian <sebastian.pe...@intel.com> wrote:
> Hi,
>
> The attached patch fixes the issue of tests' false-positive results
> generation on machines not supporting AVX512F feature. Currently when any
> runtime test intended for AVX512F feature will be run on non-AVX512F machine
> the best it can produce to inform of such a case is print SKIPPED, if debug
> is enabled. But in any case the return value is 0, which is exactly the same
> as if the test passed what might be misleading when looking at gcc.sum
> summary values. With this patch such tests can be properly recognized during
> make check as unexpected failures.
>
> gcc/testsuite/
> 	* gcc.target/i386/avx512f-check.h: Return value modified for skipped
> 	test.
>
> Please let me know if such fix can be accepted.

No, this is by design. It is not a failure, if the target doesn't support requested runtime feature. The test should be marked UNSUPPORTED in this case, but I don't think DejaGnu infrastructure allows that.

Uros.
[PATCH][x86] Fix for false-positive results of runtime tests on machines not supporting AVX512F
Hi,

The attached patch fixes false-positive results from runtime tests on machines that do not support AVX512F. Currently, when a runtime test intended for AVX512F is run on a non-AVX512F machine, the most it does to signal this is print SKIPPED, and only if debug is enabled. In every case the return value is 0, exactly as if the test had passed, which can be misleading when looking at the gcc.sum summary values. With this patch such tests are reported during make check as unexpected failures.

gcc/testsuite/
	* gcc.target/i386/avx512f-check.h: Return value modified for skipped
	test.

Please let me know if such a fix can be accepted.

Thanks,
Sebastian

AVX512F_TESTS_VERIFICATION_PATCH.patch
Description: AVX512F_TESTS_VERIFICATION_PATCH.patch
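The behaviour described above can be sketched as a mapping from test outcome to process exit status. This is only a model with hypothetical names (enum outcome and both functions are invented for illustration); the actual avx512f-check.h is not reproduced here.

```c
#include <assert.h>

/* Hypothetical outcomes of one runtime test. */
enum outcome { TEST_PASSED, TEST_FAILED, TEST_SKIPPED };

/* Behaviour before the patch, as described: a skipped test exits with 0,
   so the harness cannot tell it apart from a pass.  */
static int exit_status_current(enum outcome o)
{
  assert(o == TEST_PASSED || o == TEST_FAILED || o == TEST_SKIPPED);
  return o == TEST_FAILED ? 1 : 0;
}

/* Behaviour proposed by the patch: a skipped test exits with a nonzero
   status and is therefore reported as a failure.  */
static int exit_status_patched(enum outcome o)
{
  assert(o == TEST_PASSED || o == TEST_FAILED || o == TEST_SKIPPED);
  return o == TEST_PASSED ? 0 : 1;
}
```

Neither mapping gives "skipped" a status of its own, which is why a separate unsupported result would be the cleaner outcome.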
RE: [PATCH] Match x86 family machine constraints section with constraints.md
Hi,

Thank you very much for the answers. Can someone please commit this patch for me?

Thanks,
Sebastian

-----Original Message-----
From: Uros Bizjak [mailto:ubiz...@gmail.com]
Sent: Wednesday, May 24, 2017 3:31 PM
To: Sandra Loosemore <san...@codesourcery.com>
Cc: Peryt, Sebastian <sebastian.pe...@intel.com>; gcc-patches@gcc.gnu.org; Koval, Julia <julia.ko...@intel.com>; kirill.yuk...@gmail.com
Subject: Re: [PATCH] Match x86 family machine constraints section with constarints.md

On Tue, May 23, 2017 at 5:33 PM, Sandra Loosemore <san...@codesourcery.com> wrote:
> On 04/28/2017 03:30 AM, Peryt, Sebastian wrote:
>>
>> Hi,
>>
>> Thank you for your comments. I edited my patch accordingly. As for
>> some of your doubts:
>> - REX is the opcode prefix to access 64-bit register extensions
>> introduced in IA-32e mode.
>> - EVEX is the encoding prefix which applies to SIMD operating
>> instructions operating on XMM, YMM and ZMM registers. It was
>> introduced with AVX-512 instructions.
>> - "number factor of four" that means that sources start in a multiple
>> of 4 boundary. This is used for some of instructions.
>>
>> Also I'd like to add that this whole patch is strictly based on
>> docstring parts of constraints that are present in
>> config/i386/constraints.md but not in documentation (md.texi file).
>> There is no new (new as in nonexistent in
>> code) content.
>>
>> I'm also adding Kirill Yukhin to CC, because I believe he is the
>> correct person that can catch any technical errors if any has slipped-in.
>
> The grammar/markup/etc are OK now, but I can't comment on technical
> correctness of the information.

LGTM.

Thanks,
Uros.
RE: [PATCH] Match x86 family machine constraints section with constraints.md
Gentle ping. Thanks, Sebastian -Original Message- From: Peryt, Sebastian Sent: Friday, April 28, 2017 11:31 AM To: Sandra Loosemore <san...@codesourcery.com>; gcc-patches@gcc.gnu.org Cc: ubiz...@gmail.com; Koval, Julia <julia.ko...@intel.com>; kirill.yuk...@gmail.com Subject: RE: [PATCH] Match x86 family machine constraints section with constarints.md Hi, Thank you for your comments. I edited my patch accordingly. As for some of your doubts: - REX is the opcode prefix to access 64-bit register extensions introduced in IA-32e mode. - EVEX is the encoding prefix which applies to SIMD operating instructions operating on XMM, YMM and ZMM registers. It was introduced with AVX-512 instructions. - "number factor of four" that means that sources start in a multiple of 4 boundary. This is used for some of instructions. Also I'd like to add that this whole patch is strictly based on docstring parts of constraints that are present in config/i386/constraints.md but not in documentation (md.texi file). There is no new (new as in nonexistent in code) content. I'm also adding Kirill Yukhin to CC, because I believe he is the correct person that can catch any technical errors if any has slipped-in. Thanks, Sebastian -Original Message- From: Sandra Loosemore [mailto:san...@codesourcery.com] Sent: Thursday, April 27, 2017 10:17 PM To: Peryt, Sebastian <sebastian.pe...@intel.com>; gcc-patches@gcc.gnu.org Cc: ubiz...@gmail.com; Koval, Julia <julia.ko...@intel.com> Subject: Re: [PATCH] Match x86 family machine constraints section with constarints.md On 04/26/2017 08:29 AM, Peryt, Sebastian wrote: > Hi, > > This patch updates x86 family machine constraints section in '16.8.5 > Constraints for Particular Machines' section to match the ones in > 'config/i386/constraints.md'. > > gcc/ > * doc/md.texi (Machine Constraints): Update x86 family machine > constraints > section to match 'config/i386/constraints.md'. > > Is it ok for trunk? 
I have a few comments on grammar and markup, but I can't comment intelligently on whether the technical content is correct. > @@ -4013,24 +4015,94 @@ Top of 80387 floating-point stack (@code{%st(0)}). > @item u > Second from top of 80387 floating-point stack (@code{%st(1)}). > > +@ifset INTERNALS > +@item Yk > +Any mask register that can be used as predicate, i.e. k1-k7. s/predicate/a predicate/ Other places in this section use @code markup on literal register names. > + > +@item k > +Any mask register. > +@end ifset > + > @item y > Any MMX register. > > @item x > Any SSE register. > > +@item v > +Any EVEX encodable SSE register (@code{%xmm0-%xmm31}). > + > +@ifset INTERNALS > +@item w > +Any bound register. > +@end ifset > + > @item Yz > First SSE register (@code{%xmm0}). > > @ifset INTERNALS > -@item Y2 > -Any SSE register, when SSE2 is enabled. > - > @item Yi > Any SSE register, when SSE2 and inter-unit moves are enabled. > > +@item Yj > +Any SSE register, when SSE2 and inter-unit moves from vector registers are > enabled. > + > @item Ym > Any MMX register, when inter-unit moves are enabled. > + > +@item Yn > +Any MMX register, when inter-unit moves from vector registers are enabled. > + > +@item Yp > +Any integer register when TARGET_PARTIAL_REG_STALL is disabled. @code markup on that. > + > +@item Ya > +Any integer register when zero extensions with AND are disabled. I'm not sure what "AND" is, but it probably needs @code markup too. > + > +@item Yb > +Any register that can be used as the GOT base when calling ___tls_get_addr: @code{___tls_get_addr} > +that is, any general register except @code{a} and @code{sp} > +registers, for -fno-plt if linker supports it. Otherwise, @code{b} register. @option{-fno-plt} > + > +@item Yf > +Any x87 register when 80387 FP arithmetic is enabled. Is "FP" a literal feature name used in the processor documentation, or do you just mean "floating-point arithmetic" here? 
> + > +@item Yr > +Lower SSE register when avoiding REX prefix and all SSE registers otherwise. I don't know what "avoiding REX prefix" means, and don't see the string "REX" in any other GCC documentation. > + > +@item Yv > +For AVX512VL, any EVEX encodable SSE register (@code{%xmm0-%xmm31}), > +otherwise any SSE register. This should probably be "EVEX-encodable", whatever that means. > + > +@item Yh > +Any EVEX encodable SSE register, which has number factor of four. Same here, but what is "number factor of four"? Also, if this is supposed to designate a subset of the EVEX-encodable SSE registers rather than describe all of them, you need "that" instead o
[PATCH][x86] Add missing intrinsics for MAX[SD,SS] and MIN[SD,SS]
Hi, This patch adds missing intrinsics for the MAXSD, MAXSS, MINSD and MINSS instructions. 2017-05-09 Sebastian Peryt gcc/ * config/i386/avx512fintrin.h (_mm_mask_max_round_sd, _mm_maskz_max_round_sd, _mm_mask_max_round_ss, _mm_maskz_max_round_ss, _mm_mask_min_round_sd, _mm_maskz_min_round_sd, _mm_mask_min_round_ss, _mm_maskz_min_round_ss): New intrinsics. * config/i386/i386-builtin-types.def (V2DF, V2DF, V2DF, V2DF, UQI, INT, V4SF, V4SF, V4SF, V4SF, UQI, INT): New function type aliases. * config/i386/i386-builtin.def (__builtin_ia32_maxsd_mask_round, __builtin_ia32_maxss_mask_round, __builtin_ia32_minsd_mask_round, __builtin_ia32_minss_mask_round): New builtins. * config/i386/i386.c (V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT, V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT): Handle new types. * config/i386/sse.md (_vm3): Renamed to ... (_vm3): ... this. (v\t{%2, %1, %0|%0, %1, %2}): Changed to ... (v\t{%2, %1, %0|%0, %1, %2}): ... this. gcc/testsuite/ * gcc.target/i386/avx512f-vmaxsd-1.c (_mm_mask_max_round_sd, _mm_maskz_max_round_sd): Test new intrinsics. * gcc.target/i386/avx512f-vmaxsd-2.c: New. * gcc.target/i386/avx512f-vmaxss-1.c (_mm_mask_max_round_ss, _mm_maskz_max_round_ss): Test new intrinsics. * gcc.target/i386/avx512f-vmaxss-2.c: New. * gcc.target/i386/avx512f-vminsd-1.c (_mm_mask_min_round_sd, _mm_maskz_min_round_sd): Test new intrinsics. * gcc.target/i386/avx512f-vminsd-2.c: New. * gcc.target/i386/avx512f-vminss-1.c (_mm_mask_min_round_ss, _mm_maskz_min_round_ss): Test new intrinsics. * gcc.target/i386/avx512f-vminss-2.c: New. * gcc.target/i386/avx-1.c (__builtin_ia32_maxsd_mask_round, __builtin_ia32_maxss_mask_round, __builtin_ia32_minsd_mask_round, __builtin_ia32_minss_mask_round): Test new builtins. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. 
* gcc.target/i386/sse-14.c (_mm_maskz_max_round_sd, _mm_maskz_max_round_ss, _mm_maskz_min_round_sd, _mm_maskz_min_round_ss, _mm_mask_max_round_sd, _mm_mask_max_round_ss, _mm_mask_min_round_sd, _mm_mask_min_round_ss): Test new intrinsics. * gcc.target/i386/testround-1.c: Ditto. Is it ok for trunk? Thanks, Sebastian MAX[SD_SS]_MIN[SD_SS]_patch.patch Description: MAX[SD_SS]_MIN[SD_SS]_patch.patch
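As a rough illustration of what the masked scalar max intrinsics compute in their low element, here is a plain C model. It does not use the intrinsics themselves and is not taken from the patch; the names maxsd_model, mask_max_low and maskz_max_low are hypothetical. The model assumes the usual MAXSD rule that the second operand is returned when the inputs compare equal or when either one is a NaN.

```c
#include <stdint.h>

/* MAXSD-style maximum: yields b when a and b compare equal or when
   either input is a NaN, because "a > b" is false in those cases. */
double maxsd_model(double a, double b)
{
  return a > b ? a : b;
}

/* Model of the low element of a merge-masked scalar max: if bit 0 of k
   is set the result is max(a, b), otherwise the value from src is kept. */
double mask_max_low(double src, uint8_t k, double a, double b)
{
  return (k & 1) ? maxsd_model(a, b) : src;
}

/* Model of the low element of a zero-masked scalar max: if bit 0 of k
   is clear the result is zero. */
double maskz_max_low(uint8_t k, double a, double b)
{
  return (k & 1) ? maxsd_model(a, b) : 0.0;
}
```

Only bit 0 of the mask matters for these scalar forms; the remaining mask bits are ignored in this model.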
[PATCH][x86] Add missing intrinsics for DIV[SD,SS] and MUL[SD,SS]
Hi, This patch adds missing intrinsics for the DIVSD, DIVSS, MULSD and MULSS instructions. 2017-05-09 Sebastian Peryt gcc/ * config/i386/avx512fintrin.h (_mm_mask_mul_round_sd, _mm_maskz_mul_round_sd, _mm_mask_mul_round_ss, _mm_maskz_mul_round_ss, _mm_mask_div_round_sd, _mm_maskz_div_round_sd, _mm_mask_div_round_ss, _mm_maskz_div_round_ss, _mm_mask_mul_sd, _mm_maskz_mul_sd, _mm_mask_mul_ss, _mm_maskz_mul_ss, _mm_mask_div_sd, _mm_maskz_div_sd, _mm_mask_div_ss, _mm_maskz_div_ss): New intrinsics. * config/i386/i386-builtin-types.def (V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT, V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT): New function type aliases. * config/i386/i386-builtin.def (__builtin_ia32_divsd_mask_round, __builtin_ia32_divss_mask_round, __builtin_ia32_mulsd_mask_round, __builtin_ia32_mulss_mask_round): New builtins. * config/i386/i386.c (V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT, V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT): Handle new types. * config/i386/sse.md (_vm3): Renamed to ... (_vm3): ... this. (v\t{%2, %1, %0|%0, %1, %2}): Changed to ... (v\t{%2, %1, %0|%0, %1, %2}): ... this. gcc/testsuite/ * gcc.target/i386/avx512f-vdivsd-1.c (_mm_mask_div_sd, _mm_maskz_div_sd, _mm_mask_div_round_sd, _mm_maskz_div_round_sd): Test new intrinsics. * gcc.target/i386/avx512f-vdivsd-2.c: New. * gcc.target/i386/avx512f-vdivss-1.c (_mm_mask_div_ss, _mm_maskz_div_ss, _mm_mask_div_round_ss, _mm_maskz_div_round_ss): Test new intrinsics. * gcc.target/i386/avx512f-vdivss-2.c: New. * gcc.target/i386/avx512f-vmulsd-1.c (_mm_mask_mul_sd, _mm_maskz_mul_sd, _mm_mask_mul_round_sd, _mm_maskz_mul_round_sd): Test new intrinsics. * gcc.target/i386/avx512f-vmulsd-2.c: New. * gcc.target/i386/avx512f-vmulss-1.c (_mm_mask_mul_ss, _mm_maskz_mul_ss, _mm_mask_mul_round_ss, _mm_maskz_mul_round_ss): Test new intrinsics. * gcc.target/i386/avx512f-vmulss-2.c: New. 
* gcc.target/i386/avx-1.c (__builtin_ia32_divsd_mask_round, __builtin_ia32_divss_mask_round, __builtin_ia32_mulsd_mask_round, __builtin_ia32_mulss_mask_round): Test new builtins. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/sse-14.c (_mm_maskz_div_round_sd, _mm_maskz_div_round_ss, _mm_maskz_mul_round_sd, _mm_maskz_mul_round_ss): Test new intrinsics. * gcc.target/i386/testround-1.c: Ditto. Is it ok for trunk? Sebastian DIV[SD_SS]_MUL[SD_SS]_patch.patch Description: DIV[SD_SS]_MUL[SD_SS]_patch.patch
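A small usage sketch for one of the intrinsics named above, _mm_mask_div_round_sd, may help. The wrapper name masked_div_low is hypothetical. The intrinsic path is only compiled when the compiler defines __AVX512F__ (for example with -mavx512f); otherwise a plain C fallback with the same merge-masking behaviour for the low element is used, so the snippet builds anywhere. The rounding argument shown is an assumption chosen for illustration, not something taken from the patch.

```c
#include <stdint.h>

#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Returns the low element of a merge-masked scalar division:
   a / b when bit 0 of k is set, otherwise src. */
double masked_div_low(double src, uint8_t k, double a, double b)
{
#if defined(__AVX512F__)
  /* Round to nearest and suppress exceptions; the last argument must be
     a compile-time constant. */
  __m128d r = _mm_mask_div_round_sd(_mm_set_sd(src), (__mmask8) k,
                                    _mm_set_sd(a), _mm_set_sd(b),
                                    _MM_FROUND_TO_NEAREST_INT
                                    | _MM_FROUND_NO_EXC);
  return _mm_cvtsd_f64(r);
#else
  /* Portable fallback modelling the same low-element behaviour. */
  return (k & 1) ? a / b : src;
#endif
}
```

Running the intrinsic path of course needs a machine that supports AVX512F.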
RE: [PATCH][x86] Fix ADD[SD,SS] and SUB[SD,SS] runtime tests
Hi, Can you please commit it for me? Thanks, Sebastian -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Tuesday, May 9, 2017 10:40 AM To: Peryt, Sebastian <sebastian.pe...@intel.com> Cc: gcc-patches@gcc.gnu.org; kirill.yuk...@gmail.com Subject: Re: [PATCH][x86] Fix ADD[SD,SS] and SUB[SD,SS] runtime tests On Mon, May 8, 2017 at 9:53 AM, Peryt, Sebastian <sebastian.pe...@intel.com> wrote: > Hi, > > This patch fixes errors in runtime tests for ADDSD, ADDSS, SUBSD and SUBSS > instructions. > > gcc/testsuite/ > * gcc.target/i386/avx512f-vaddsd-2.c: Test fixed. > * gcc.target/i386/avx512f-vaddss-2.c: Ditto. > * gcc.target/i386/avx512f-vsubsd-2.c: Ditto. > * gcc.target/i386/avx512f-vsubss-2.c: Ditto. > > Is it ok for trunk? OK. Thanks, Uros.
[PATCH][x86] Fix ADD[SD,SS] and SUB[SD,SS] runtime tests
Hi, This patch fixes errors in runtime tests for ADDSD, ADDSS, SUBSD and SUBSS instructions. gcc/testsuite/ * gcc.target/i386/avx512f-vaddsd-2.c: Test fixed. * gcc.target/i386/avx512f-vaddss-2.c: Ditto. * gcc.target/i386/avx512f-vsubsd-2.c: Ditto. * gcc.target/i386/avx512f-vsubss-2.c: Ditto. Is it ok for trunk? Thanks, Sebastian ADD[SD_SS]_SUB[SD_SS]_runtime_tests_fix.patch Description: ADD[SD_SS]_SUB[SD_SS]_runtime_tests_fix.patch
RE: [PATCH][x86] Add missing intrinsics for ADD[SD,SS] and SUB[SD,SS]
Thank you! Sebastian -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Tuesday, May 2, 2017 3:08 PM To: Peryt, Sebastian <sebastian.pe...@intel.com> Cc: gcc-patches@gcc.gnu.org; kirill.yuk...@gmail.com Subject: Re: [PATCH][x86] Add missing intrinsics for ADD[SD,SS] and SUB[SD,SS] On Tue, May 2, 2017 at 11:39 AM, Peryt, Sebastian <sebastian.pe...@intel.com> wrote: > Hi, > Can you please commit it for me? Done. Uros.
RE: [PATCH][x86] Add missing intrinsics for ADD[SD,SS] and SUB[SD,SS]
Hi, Can you please commit it for me? Thanks, Sebastian -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Monday, May 1, 2017 11:28 AM To: Peryt, Sebastian <sebastian.pe...@intel.com> Cc: gcc-patches@gcc.gnu.org; kirill.yuk...@gmail.com Subject: Re: [PATCH][x86] Add missing intrinsics for ADD[SD,SS] and SUB[SD,SS] On Thu, Apr 27, 2017 at 10:22 AM, Peryt, Sebastian <sebastian.pe...@intel.com> wrote: > Hi, > > This patch adds missing intrinsics for ADDSD, ADDSS, SUBSD and SUBSS > instructions. > > gcc/ > * config/i386/avx512fintrin.h (_mm_mask_add_round_sd, > _mm_maskz_add_round_sd, _mm_mask_add_round_ss, > _mm_maskz_add_round_ss, _mm_mask_sub_round_sd, > _mm_maskz_sub_round_sd, _mm_mask_sub_round_ss, > _mm_maskz_sub_round_ss, _mm_mask_add_sd, > _mm_maskz_add_sd, _mm_mask_add_ss, _mm_maskz_add_ss, > _mm_mask_sub_sd, _mm_maskz_sub_sd, _mm_mask_sub_ss, > _mm_maskz_sub_ss): New intrinsics. > * config/i386/i386-builtin-types.def > (V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT, > V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT): New function type aliases. > * config/i386/i386-builtin.def (__builtin_ia32_addsd_mask_round, > __builtin_ia32_addss_mask_round, __builtin_ia32_subsd_mask_round, > __builtin_ia32_subss_mask_round): New builtins. > * config/i386/i386.c (V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT, > V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT): Handle new types. > * config/i386/sse.md (_vm3): > Renamed to ... > (_vm3): ... this. > (v\t{%2, %1, > %0|%0, %1, %2}): Changed to ... > (v\t{%2, %1, > %0|%0, %1, %2}): ... this. > > gcc/testsuite/ > * gcc.target/i386/avx512f-vaddsd-1.c (_mm_mask_add_sd, > _mm_maskz_add_sd, _mm_mask_add_round_sd, > _mm_maskz_add_round_sd): Test new intrinsics. > * gcc.target/i386/avx512f-vaddsd-2.c: New. > * gcc.target/i386/avx512f-vaddss-1.c (_mm_mask_add_ss, > _mm_maskz_add_ss, _mm_mask_add_round_ss, > _mm_maskz_add_round_ss): Test new intrinsics. > * gcc.target/i386/avx512f-vaddss-2.c: New. 
> * gcc.target/i386/avx512f-vsubsd-1.c (_mm_mask_sub_sd, > _mm_maskz_sub_sd, _mm_mask_sub_round_sd, > _mm_maskz_sub_round_sd): Test new intrinsics. > * gcc.target/i386/avx512f-vsubsd-2.c: New. > * gcc.target/i386/avx512f-vsubss-1.c (_mm_mask_sub_ss, > _mm_maskz_sub_ss, _mm_mask_sub_round_ss, > _mm_maskz_sub_round_ss): Test new intrinsics. > * gcc.target/i386/avx512f-vsubss-2.c: New. > * gcc.target/i386/avx-1.c (__builtin_ia32_addsd_mask_round, > __builtin_ia32_addss_mask_round, __builtin_ia32_subsd_mask_round, > __builtin_ia32_subss_mask_round): Test new builtins. > * gcc.target/i386/sse-13.c: Ditto. > * gcc.target/i386/sse-23.c: Ditto. > * gcc.target/i386/sse-14.c (_mm_maskz_add_round_sd, > _mm_maskz_add_round_ss, _mm_maskz_sub_round_sd, > _mm_maskz_sub_round_ss, _mm_mask_add_round_sd, > _mm_mask_add_round_ss, _mm_mask_sub_round_sd, > _mm_mask_sub_round_ss): Test new intrinsics. > * gcc.target/i386/testround-1.c: Ditto. > > Is it ok for trunk? OK. Thanks, Uros.
RE: [PATCH] Match x86 family machine constraints section with constraints.md
Hi, Thank you for your comments. I edited my patch accordingly. As for some of your doubts: - REX is the opcode prefix to access 64-bit register extensions introduced in IA-32e mode. - EVEX is the encoding prefix which applies to SIMD operating instructions operating on XMM, YMM and ZMM registers. It was introduced with AVX-512 instructions. - "number factor of four" that means that sources start in a multiple of 4 boundary. This is used for some of instructions. Also I'd like to add that this whole patch is strictly based on docstring parts of constraints that are present in config/i386/constraints.md but not in documentation (md.texi file). There is no new (new as in nonexistent in code) content. I'm also adding Kirill Yukhin to CC, because I believe he is the correct person that can catch any technical errors if any has slipped-in. Thanks, Sebastian -Original Message- From: Sandra Loosemore [mailto:san...@codesourcery.com] Sent: Thursday, April 27, 2017 10:17 PM To: Peryt, Sebastian <sebastian.pe...@intel.com>; gcc-patches@gcc.gnu.org Cc: ubiz...@gmail.com; Koval, Julia <julia.ko...@intel.com> Subject: Re: [PATCH] Match x86 family machine constraints section with constarints.md On 04/26/2017 08:29 AM, Peryt, Sebastian wrote: > Hi, > > This patch updates x86 family machine constraints section in '16.8.5 > Constraints for Particular Machines' section to match the ones in > 'config/i386/constraints.md'. > > gcc/ > * doc/md.texi (Machine Constraints): Update x86 family machine > constraints > section to match 'config/i386/constraints.md'. > > Is it ok for trunk? I have a few comments on grammar and markup, but I can't comment intelligently on whether the technical content is correct. > @@ -4013,24 +4015,94 @@ Top of 80387 floating-point stack (@code{%st(0)}). > @item u > Second from top of 80387 floating-point stack (@code{%st(1)}). > > +@ifset INTERNALS > +@item Yk > +Any mask register that can be used as predicate, i.e. k1-k7. 
s/predicate/a predicate/ Other places in this section use @code markup on literal register names. > + > +@item k > +Any mask register. > +@end ifset > + > @item y > Any MMX register. > > @item x > Any SSE register. > > +@item v > +Any EVEX encodable SSE register (@code{%xmm0-%xmm31}). > + > +@ifset INTERNALS > +@item w > +Any bound register. > +@end ifset > + > @item Yz > First SSE register (@code{%xmm0}). > > @ifset INTERNALS > -@item Y2 > -Any SSE register, when SSE2 is enabled. > - > @item Yi > Any SSE register, when SSE2 and inter-unit moves are enabled. > > +@item Yj > +Any SSE register, when SSE2 and inter-unit moves from vector registers are > enabled. > + > @item Ym > Any MMX register, when inter-unit moves are enabled. > + > +@item Yn > +Any MMX register, when inter-unit moves from vector registers are enabled. > + > +@item Yp > +Any integer register when TARGET_PARTIAL_REG_STALL is disabled. @code markup on that. > + > +@item Ya > +Any integer register when zero extensions with AND are disabled. I'm not sure what "AND" is, but it probably needs @code markup too. > + > +@item Yb > +Any register that can be used as the GOT base when calling ___tls_get_addr: @code{___tls_get_addr} > +that is, any general register except @code{a} and @code{sp} > +registers, for -fno-plt if linker supports it. Otherwise, @code{b} register. @option{-fno-plt} > + > +@item Yf > +Any x87 register when 80387 FP arithmetic is enabled. Is "FP" a literal feature name used in the processor documentation, or do you just mean "floating-point arithmetic" here? > + > +@item Yr > +Lower SSE register when avoiding REX prefix and all SSE registers otherwise. I don't know what "avoiding REX prefix" means, and don't see the string "REX" in any other GCC documentation. > + > +@item Yv > +For AVX512VL, any EVEX encodable SSE register (@code{%xmm0-%xmm31}), > +otherwise any SSE register. This should probably be "EVEX-encodable", whatever that means. 
> + > +@item Yh > +Any EVEX encodable SSE register, which has number factor of four. Same here, but what is "number factor of four"? Also, if this is supposed to designate a subset of the EVEX-encodable SSE registers rather than describe all of them, you need "that" instead of "which". > + > +@item Bf > +Flags register operand. > + > +@item Bg > +GOT memory operand. > + > +@item Bm > +Vector memory operand. > + > +@item Bc > +Constant memory operand. > + > +@item Bn > +Memory operand without REX prefix. > + > +@item Bs > +Sibcall memory operand. > + > +@item Bw > +Call mem
[PATCH][x86] Add missing intrinsics for ADD[SD,SS] and SUB[SD,SS]
Hi, This patch adds missing intrinsics for ADDSD, ADDSS, SUBSD and SUBSS instructions. gcc/ * config/i386/avx512fintrin.h (_mm_mask_add_round_sd, _mm_maskz_add_round_sd, _mm_mask_add_round_ss, _mm_maskz_add_round_ss, _mm_mask_sub_round_sd, _mm_maskz_sub_round_sd, _mm_mask_sub_round_ss, _mm_maskz_sub_round_ss, _mm_mask_add_sd, _mm_maskz_add_sd, _mm_mask_add_ss, _mm_maskz_add_ss, _mm_mask_sub_sd, _mm_maskz_sub_sd, _mm_mask_sub_ss, _mm_maskz_sub_ss): New intrinsics. * config/i386/i386-builtin-types.def (V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT, V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT): New function type aliases. * config/i386/i386-builtin.def (__builtin_ia32_addsd_mask_round, __builtin_ia32_addss_mask_round, __builtin_ia32_subsd_mask_round, __builtin_ia32_subss_mask_round): New builtins. * config/i386/i386.c (V2DF_FTYPE_V2DF_V2DF_V2DF_UQI_INT, V4SF_FTYPE_V4SF_V4SF_V4SF_UQI_INT): Handle new types. * config/i386/sse.md (_vm3): Renamed to ... (_vm3): ... this. (v\t{%2, %1, %0|%0, %1, %2}): Changed to ... (v\t{%2, %1, %0|%0, %1, %2}): ... this. gcc/testsuite/ * gcc.target/i386/avx512f-vaddsd-1.c (_mm_mask_add_sd, _mm_maskz_add_sd, _mm_mask_add_round_sd, _mm_maskz_add_round_sd): Test new intrinsics. * gcc.target/i386/avx512f-vaddsd-2.c: New. * gcc.target/i386/avx512f-vaddss-1.c (_mm_mask_add_ss, _mm_maskz_add_ss, _mm_mask_add_round_ss, _mm_maskz_add_round_ss): Test new intrinsics. * gcc.target/i386/avx512f-vaddss-2.c: New. * gcc.target/i386/avx512f-vsubsd-1.c (_mm_mask_sub_sd, _mm_maskz_sub_sd, _mm_mask_sub_round_sd, _mm_maskz_sub_round_sd): Test new intrinsics. * gcc.target/i386/avx512f-vsubsd-2.c: New. * gcc.target/i386/avx512f-vsubss-1.c (_mm_mask_sub_ss, _mm_maskz_sub_ss, _mm_mask_sub_round_ss, _mm_maskz_sub_round_ss): Test new intrinsics. * gcc.target/i386/avx512f-vsubss-2.c: New. * gcc.target/i386/avx-1.c (__builtin_ia32_addsd_mask_round, __builtin_ia32_addss_mask_round, __builtin_ia32_subsd_mask_round, __builtin_ia32_subss_mask_round): Test new builtins. 
* gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/sse-14.c (_mm_maskz_add_round_sd, _mm_maskz_add_round_ss, _mm_maskz_sub_round_sd, _mm_maskz_sub_round_ss, _mm_mask_add_round_sd, _mm_mask_add_round_ss, _mm_mask_sub_round_sd, _mm_mask_sub_round_ss): Test new intrinsics. * gcc.target/i386/testround-1.c: Ditto. Is it ok for trunk? Sebastian ADD[SD_SS]_SUB[SD_SS]_patch.patch Description: ADD[SD_SS]_SUB[SD_SS]_patch.patch
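To make the difference between the mask and maskz variants concrete, here is a plain C model of the two-element double case. It is only a sketch: vec2d is a hypothetical stand-in for __m128d, and the function names are made up for illustration. It assumes the usual scalar behaviour where the upper element is copied from the first source operand.

```c
#include <stdint.h>

/* Hypothetical stand-in for __m128d: lane[0] is the low element. */
typedef struct { double lane[2]; } vec2d;

vec2d make_vec2d(double lo, double hi)
{
  vec2d v;
  v.lane[0] = lo;
  v.lane[1] = hi;
  return v;
}

/* Model of a merge-masked scalar add: the low element is a + b when
   bit 0 of k is set, otherwise it is taken from src. The upper element
   always comes from a. */
vec2d mask_add_sd_model(vec2d src, uint8_t k, vec2d a, vec2d b)
{
  vec2d r;
  r.lane[0] = (k & 1) ? a.lane[0] + b.lane[0] : src.lane[0];
  r.lane[1] = a.lane[1];
  return r;
}

/* Model of a zero-masked scalar subtract: the low element is a - b when
   bit 0 of k is set, otherwise zero. The upper element comes from a. */
vec2d maskz_sub_sd_model(uint8_t k, vec2d a, vec2d b)
{
  vec2d r;
  r.lane[0] = (k & 1) ? a.lane[0] - b.lane[0] : 0.0;
  r.lane[1] = a.lane[1];
  return r;
}
```

The _ss variants follow the same pattern with four float elements, where only the lowest one is computed and the other three are copied from the first source.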
[PATCH] Match x86 family machine constraints section with constraints.md
Hi, This patch updates the x86 family machine constraints in the '16.8.5 Constraints for Particular Machines' section to match the ones in 'config/i386/constraints.md'. gcc/ * doc/md.texi (Machine Constraints): Update x86 family machine constraints section to match 'config/i386/constraints.md'. Is it ok for trunk? Sebastian x86_constraints_doc.patch Description: x86_constraints_doc.patch
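For context on how such machine constraints are used from user code, here is a short sketch with the long-documented x constraint ("any SSE register") in GCC extended asm. It is not part of the patch, and the function name add_in_sse_reg is hypothetical. The asm path is limited to x86-64 builds with a GCC-compatible compiler and assumes the default AT&T assembler syntax; other targets take a plain C fallback.

```c
/* Adds b to a. On x86-64 the addition is done with ADDSD on operands
   that the "x" constraint places in SSE registers. */
double add_in_sse_reg(double a, double b)
{
#if defined(__GNUC__) && defined(__x86_64__)
  /* "+x": a is both read and written and lives in an SSE register.
     "x":  b is an input in an SSE register.
     AT&T operand order: addsd source, destination. */
  __asm__ ("addsd %1, %0" : "+x" (a) : "x" (b));
  return a;
#else
  return a + b;
#endif
}
```

The constraints marked with @ifset INTERNALS in the patch are meant for machine descriptions rather than for user-level asm like this.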