[PATCH] i386: Fix REDUC_SSE_SMINMAX_MODE mode conditions. [PR94494]

2020-04-11 Thread Uros Bizjak via Gcc-patches
V4SI, V8HI and V16QI modes of the redux__scal_ expander expand with SSE2 instructions (PSRLDQ and PCMPGTx), so use TARGET_SSE2 as the relevant mode iterator condition. 2020-04-11 Uroš Bizjak PR target/94494 * config/i386/sse.md (REDUC_SSE_SMINMAX_MODE): Use TARGET_SSE2 condition for V4SI,
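
As a rough illustration of why a shift (PSRLDQ) and a compare (PCMPGTx) are enough for such a reduction, the following scalar C sketch models a signed-minimum reduction over four 32-bit lanes. The helper names `shift_lanes`, `lane_smin` and `reduce_smin_v4si` are hypothetical, and this is only a model under those assumptions, not the expander's actual output.

```c
#include <stdint.h>

/* Model of PSRLDQ on a 4 x int32 vector: move lanes down by n,
   filling the vacated upper lanes with zero. */
static void shift_lanes(const int32_t in[4], int n, int32_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (i + n < 4) ? in[i + n] : 0;
}

/* Lane-wise signed minimum built from a PCMPGTD-style mask. */
static void lane_smin(const int32_t a[4], const int32_t b[4], int32_t out[4])
{
    for (int i = 0; i < 4; i++) {
        int32_t mask = (a[i] > b[i]) ? -1 : 0;    /* all-ones where a > b */
        out[i] = (b[i] & mask) | (a[i] & ~mask);  /* pick the smaller lane */
    }
}

/* Two shift+min steps leave the overall minimum in lane 0. */
int32_t reduce_smin_v4si(const int32_t v[4])
{
    int32_t t[4], m[4], r[4];
    shift_lanes(v, 2, t);   /* shift by 8 bytes */
    lane_smin(v, t, m);     /* lane 0: min(v0, v2), lane 1: min(v1, v3) */
    shift_lanes(m, 1, t);   /* shift by 4 bytes */
    lane_smin(m, t, r);     /* lane 0: min of all four inputs */
    return r[0];
}
```

Only lane 0 of the final vector is meaningful; the upper lanes mix in the zeros shifted in by the model.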

Re: [PATCH] i386: Fix V{64QI, 32HI}mode constant permutations [PR94509]

2020-04-07 Thread Uros Bizjak via Gcc-patches
On Tue, Apr 7, 2020 at 2:29 PM Jakub Jelinek wrote: > > Hi! > > The following testcases are miscompiled, because expand_vec_perm_pshufb > incorrectly thinks it can use vpshufb instruction for the permutations > when it can't. > The > if (vmode == V32QImode) > { >

[committed] i386: Remove unneeded assignments when triggering SSE exceptions

2020-04-19 Thread Uros Bizjak via Gcc-patches
According to "Intel 64 and IA32 Arch SDM, Vol. 3": "Because SIMD floating-point exceptions are precise and occur immediately, the situation does not arise where an x87 FPU instruction, a WAIT/FWAIT instruction, or another SSE/SSE2/SSE3 instruction will catch a pending unmasked SIMD floating-point

[RFA PATCH]: i386: Use generic division to generate INVALID and DIVZERO exceptions

2020-04-19 Thread Uros Bizjak via Gcc-patches
This patch implements the idea from glibc, where generic division operations instead of assembly are used where appropriate. --commit-- i386: Use generic division to generate INVALID and DIVZERO exceptions Introduce math_force_eval to evaluate generic division to generate INVALID and DIVZERO

Re: [PATCH] i386: Fix emit_reduc_half on V{64Q,32H}Imode [PR94500]

2020-04-07 Thread Uros Bizjak via Gcc-patches
On Tue, Apr 7, 2020 at 12:51 AM Jakub Jelinek wrote: > > Hi! > > The following testcase is miscompiled in 8.x, because emit_reduc_half is > prepared to handle for 512-bit modes only i equal to 512, 256, 128 and 64. > V32HImode also needs i equal to 32 and V64QImode i equal to 32 and 16, > but
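
The widths listed in the quoted message follow from halving the vector width until only one element remains. A minimal sketch, assuming the caller loops from the full vector width down to twice the element width (the function name `reduc_half_steps` and the exact loop shape are assumptions for illustration, not GCC code):

```c
#include <stddef.h>

/* Record every width i visited by a halving reduction over a vector of
   vec_bits bits with elt_bits-bit elements. Returns the number of steps. */
size_t reduc_half_steps(unsigned vec_bits, unsigned elt_bits,
                        unsigned steps[], size_t max_steps)
{
    size_t n = 0;
    for (unsigned i = vec_bits; i > elt_bits && n < max_steps; i >>= 1)
        steps[n++] = i;
    return n;
}
```

With 512-bit vectors this visits 512, 256, 128, 64 and 32 for 16-bit elements, and additionally 16 for 8-bit elements, which matches the values named above.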

Re: [PATCH] x86: Restore the frame pointer in word_mode

2020-04-13 Thread Uros Bizjak via Gcc-patches
On Sun, Apr 12, 2020 at 11:28 PM H.J. Lu wrote: > > We must restore the frame pointer in word_mode for eh_return epilogues > since the upper 32 bits of RBP register can have any values. > > Tested on Linux/x32 and Linux/x86-64. OK for master and backport to > GCC 8/9 branches? > > Thanks. > >

Re: [PATCH] i386: Fix up *testqi_ext_3 define_insn_and_split [PR94567]

2020-04-17 Thread Uros Bizjak via Gcc-patches
On Fri, Apr 17, 2020 at 10:29 AM Jakub Jelinek wrote: > > Hi! > > As the testcase shows, there are unfortunately more problematic cases > in *testqi_ext_3 if the mode is not CCZmode, because the sign flag might > not behave the same between the insn with zero_extract and what we split it > into.

Re: [PATCH v2] i386: Fix up *testqi_ext_3 define_insn_and_split [PR94567]

2020-04-17 Thread Uros Bizjak via Gcc-patches
On Fri, Apr 17, 2020 at 2:38 PM Jakub Jelinek wrote: > > On Fri, Apr 17, 2020 at 02:12:19PM +0200, Uros Bizjak wrote: > > Is it possible to perform widening only for !TARGET_PARTIAL_REG_STALL? > > I'm worried that highpart may not be cleared, so the effects of > > partial reg stall can be quite

[committed] testsuite: Assorted x32 testsuite fixes

2020-03-13 Thread Uros Bizjak via Gcc-patches
2020-03-13 Uroš Bizjak * gcc.target/i386/pr64409.c: Do not limit compilation to x32 targets. (dg-error): Quote 'ms_abi' attribute. * gcc.target/i386/pr71958.c: Do not limit compilation to x32 targets. Require maybe_x32 effective target. (dg-options): Add -mx32.

[PATCH] libstdc++: Skip 91371.cc for x32.

2020-03-13 Thread Uros Bizjak via Gcc-patches
x32 does not support MS ABI, skip testcases that require it. 2020-03-13 Uroš Bizjak * testsuite/20_util/bind/91371.cc: Skip for x32. * testsuite/20_util/is_function/91371.cc: Ditto. * testsuite/20_util/is_member_function_pointer/91371.cc: Ditto. *

Re: [PATCH] i386: Simplify {, v}ph{add, sub{, s}{w, d} insn patterns [PR94460]

2020-04-04 Thread Uros Bizjak via Gcc-patches
On Sat, Apr 4, 2020 at 12:41 AM Jakub Jelinek wrote: > > Hi! > > As mentioned in the previous PR94460 patch, the RTL patterns look too > large/complicated, we can simplify them by just performing two 2 arg > permutations to move the arguments into the right spots and then just > doing the

Re: [PATCH] i386: Fix up handling of OPTION_MASK_ISA_MMX builtins [PR94461]

2020-04-03 Thread Uros Bizjak via Gcc-patches
On Fri, Apr 3, 2020 at 7:30 PM Jakub Jelinek wrote: > > Hi! > > In https://gcc.gnu.org/ml/gcc-patches/2017-10/msg00576.html the builtin > handling was changed so that OPTION_MASK_ISA_MMX | OPTION_MASK_ISA_SSE > etc. in i386-builtin.def means we require both mmx and sse, not just one of > those,
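
The difference between "requires both" and "requires one of" comes down to how a combined mask is tested. A small sketch with made-up bit values (`ISA_MMX`, `ISA_SSE`, `requires_all` and `requires_any` are hypothetical names, not the identifiers used in the GCC sources):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define ISA_MMX (UINT64_C(1) << 0)
#define ISA_SSE (UINT64_C(1) << 1)

/* True only when every bit in `required` is enabled. */
bool requires_all(uint64_t enabled, uint64_t required)
{
    return (enabled & required) == required;
}

/* True when at least one bit in `required` is enabled. */
bool requires_any(uint64_t enabled, uint64_t required)
{
    return (enabled & required) != 0;
}
```

With only SSE enabled, `requires_any` accepts a builtin marked `ISA_MMX | ISA_SSE`, while `requires_all` rejects it.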

Re: [PATCH] x86: Mark scratch operand in ssse3_pshufbv8qi3 as earlyclobber

2020-04-03 Thread Uros Bizjak via Gcc-patches
On Fri, Apr 3, 2020 at 6:51 PM H.J. Lu wrote: > > commit 16ed2601ad0a4aa82f11e9df86ea92183f94f979 > Author: H.J. Lu > Date: Wed May 15 15:26:19 2019 + > > i386: Emulate MMX pshufb with SSE version > > has > > +(define_insn_and_split "ssse3_pshufbv8qi3" > + [(set (match_operand:V8QI 0

Re: [PATCH] i386: Fix vph{add, subs?}[wd] 256-bit AVX2 RTL patterns [PR94460]

2020-04-03 Thread Uros Bizjak via Gcc-patches
On Fri, Apr 3, 2020 at 7:06 PM Jakub Jelinek wrote: > > Hi! > > The following testcase is miscompiled, because the AVX2 patterns don't > describe correctly what the insn does. E.g. vphaddd with %ymm* operands > (the second pattern) instruction as per: >
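
For readers unfamiliar with the 256-bit form: vphaddd with %ymm operands works on each 128-bit lane separately rather than across the whole register. A scalar C model of that behaviour, written as a sketch (the function name `vphaddd_256_model` is hypothetical; unsigned arithmetic is used so that wrap-around is well defined):

```c
#include <stdint.h>

/* Scalar model of 256-bit VPHADDD: within each 128-bit lane, the two low
   results are pair sums from `a` and the two high results are pair sums
   from `b`. The output `r` must not alias `a` or `b`. */
void vphaddd_256_model(const uint32_t a[8], const uint32_t b[8], uint32_t r[8])
{
    for (int lane = 0; lane < 2; lane++) {
        const uint32_t *pa = a + 4 * lane;
        const uint32_t *pb = b + 4 * lane;
        uint32_t *pr = r + 4 * lane;
        pr[0] = pa[0] + pa[1];
        pr[1] = pa[2] + pa[3];
        pr[2] = pb[0] + pb[1];
        pr[3] = pb[2] + pb[3];
    }
}
```

An RTL pattern that concatenates all pair sums of the first operand and then all pair sums of the second would describe a different element order than this model.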

Re: [PATCH] Enable GCC support for SERIALIZE

2020-04-01 Thread Uros Bizjak via Gcc-patches
On Wed, Apr 1, 2020 at 9:23 AM Hongtao Liu wrote: > > Hi: > This patch is about to enable GCC support for SERIALIZE which would > be in GLC. There's only 1 instruction: SERIALIZE, more details please > refer to >

Re: [PATCH] Enable GCC support for TSXLDTRK

2020-04-01 Thread Uros Bizjak via Gcc-patches
On Wed, Apr 1, 2020 at 9:28 AM Hongtao Liu wrote: > > On Wed, Apr 1, 2020 at 3:32 PM Hongtao Liu wrote: > > > > Hi: > > This patch is about to enable GCC support for TSXLDTRK which would > > be in GLC. There's only 2 instructions: XRESLDTRK, XSUSLDTRK, more > > details please > > refer to > >

Re: [PATCH] i386: Fix ix86_add_reg_usage_to_vzeroupper [PR94308]

2020-03-25 Thread Uros Bizjak via Gcc-patches
On Wed, Mar 25, 2020 at 9:05 AM Jakub Jelinek wrote: > > Hi! > > The following patch ICEs due to my recent change r10-6451-gb7b3378f91c. > Since that patch, for explicit vzeroupper in the sources (when an intrinsic > is used), we start with the *avx_vzeroupper_1 pattern which contains just the >

Re: [PATCH] x86: Fix up ix86_atomic_assign_expand_fenv [PR94780]

2020-04-27 Thread Uros Bizjak via Gcc-patches
On Mon, Apr 27, 2020 at 8:26 PM Jakub Jelinek wrote: > > Hi! > > This function, because it is sometimes called even outside of function > bodies, uses create_tmp_var_raw rather than create_tmp_var. But in order > for that to work, when first referenced, the VAR_DECLs need to appear in a >

Re: [PATCH] Enable GCC support for TSXLDTRK

2020-05-03 Thread Uros Bizjak via Gcc-patches
On Wed, Apr 1, 2020 at 9:27 AM Hongtao Liu wrote: > > Hi: > This patch is about to enable GCC support for TSXLDTRK which would > be in GLC. There's only 2 instructions: XRESLDTRK, XSUSLDTRK, more > details please > refer to >

Re: [PATCH] Enable GCC support for SERIALIZE

2020-05-03 Thread Uros Bizjak via Gcc-patches
On Wed, Apr 1, 2020 at 9:23 AM Hongtao Liu wrote: > > Hi: > This patch is about to enable GCC support for SERIALIZE which would > be in GLC. There's only 1 instruction: SERIALIZE, more details please > refer to >

[committed] i386: Use plus_constant instead of gen_rtx_PLUS

2020-05-03 Thread Uros Bizjak via Gcc-patches
Replace gen_rtx_PLUS with a GEN_INT with plus_constant. 2020-05-03 Uroš Bizjak * config/i386/i386-expand.c (ix86_expand_int_movcc): Use plus_constant instead of gen_rtx_PLUS with GEN_INT. (emit_memmov): Ditto. (emit_memset): Ditto. (ix86_expand_strlensi_unroll_1): Ditto.

[committed] i386: Use SHR to compare with large power-of-two constants [PR94650]

2020-05-04 Thread Uros Bizjak via Gcc-patches
Convert unsigned compares where m >= LARGE_POWER_OF_TWO and LARGE_POWER_OF_TWO represents an immediate where bit 33+ is set to use a SHR instruction and compare the result to 0. This avoids loading a large immediate with the MOVABS insn. movabsq $1099511627775, %rax cmpq
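
The equivalence behind this transformation can be checked in plain C. The sketch below uses the constant from the snippet, 1099511627775, which is 2^40 - 1; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Direct form: needs the 40-bit constant as an immediate. */
bool above_via_compare(uint64_t x)
{
    return x > UINT64_C(1099511627775);   /* x > 2^40 - 1, i.e. x >= 2^40 */
}

/* Shifted form: any bit at position 40 or above makes the result nonzero. */
bool above_via_shift(uint64_t x)
{
    return (x >> 40) != 0;
}
```

Both functions return the same value for every 64-bit input, so a compiler is free to pick the shift form when the immediate does not fit a 32-bit field.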

[committed] i386: Use SBB more [PR94650]

2020-05-04 Thread Uros Bizjak via Gcc-patches
When returning 0 or -1, the "SBB reg,reg" instruction, which borrows the carry flag, can be used. The carry flag can be generated by converting a compare with zero to an LTU compare with one, so e.g. return -(x == 0) generates: cmpq $1, %rdi sbbq %rax, %rax instead of: xorl
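
A source-level view of the same idea, as a sketch: for unsigned x, "x == 0" and "x < 1" are the same predicate, and the second one is exactly the borrow produced by comparing x with 1. The function names are hypothetical.

```c
/* The form named in the commit message. */
long neg_is_zero(unsigned long x)
{
    return -(long)(x == 0);
}

/* The same value written as an unsigned less-than (LTU) compare with one. */
long neg_is_zero_ltu(unsigned long x)
{
    return -(long)(x < 1UL);
}
```

Whether a particular compiler version emits the CMP/SBB pair for either function depends on the target and the optimization level.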

Re: [PATCH] x86: Fix *vec_dupv4hi constraints [PR94942]

2020-05-05 Thread Uros Bizjak via Gcc-patches
On Tue, May 5, 2020 at 9:11 AM Jakub Jelinek wrote: > > Hi! > > This insn and split splits into HI->V?HImode broadcast for avx2 and later, > but either the operands need to be %xmm0-%xmm15 (i.e. VEX encoded insn), or > the insn needs both AVX512BW and AVX512VL. > Now, Yv constraint is v for

Re: [PATCH 2/4] x86: Add -mzero-caller-saved-regs=[skip|used-gpr|all-gpr|used|all]

2020-05-05 Thread Uros Bizjak via Gcc-patches
On Mon, May 4, 2020 at 9:01 PM H.J. Lu wrote: > > Add -mzero-caller-saved-regs=[skip|used-gpr|all-gpr|used|all] command-line > option and zero_caller_saved_regs("skip|used|all") function attribute: > > 1. -mzero-caller-saved-regs=skip and zero_caller_saved_regs("skip") > > Don't zero caller-saved

[committed] i386: Use int_nonimmediate_operand more

2020-05-05 Thread Uros Bizjak via Gcc-patches
Pattern explosion and manual mode checks can be avoided by using int_nonimmediate_operand special predicate. While there, rewrite *x86_movcc_0_m1_neg_leu to a combine pass splitter. 2020-05-05 Uroš Bizjak * config/i386/i386.md (*testqi_ext_3): Use int_nonimmediate_operand instead of

Re: [PATCH] x86: Fix -O0 intrinsic *gather*/*scatter* macros [PR94832]

2020-04-29 Thread Uros Bizjak via Gcc-patches
On Wed, Apr 29, 2020 at 1:04 PM Jakub Jelinek wrote: > > Hi! > > As reported in the PR, while most intrinsic -O0 macro argument uses > are properly wrapped in ()s or used in context where having a complex > expression passed as the argument doesn't pose a problem (e.g. when > macro argument use
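
The general hazard with unparenthesized macro arguments can be shown with a tiny example that has nothing to do with the actual intrinsic headers. `TO_BYTE_BAD` and `TO_BYTE_GOOD` are hypothetical macros; the only assumption is that a cast is applied directly to the argument.

```c
/* The cast binds tighter than '+', so it applies only to the first
   token of a complex argument. */
#define TO_BYTE_BAD(x)  ((unsigned char) x)

/* Parenthesizing the argument makes the cast cover the whole expression. */
#define TO_BYTE_GOOD(x) ((unsigned char) (x))

int bad_result(void)  { return TO_BYTE_BAD(255 + 1); }   /* (unsigned char)255 + 1 */
int good_result(void) { return TO_BYTE_GOOD(255 + 1); }  /* (unsigned char)256 */
```

Here `bad_result` yields 256 while `good_result` yields 0, which is why every macro argument use is normally wrapped in parentheses.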

Re: [PATCH] x86: Fix -O0 remaining intrinsic macros [PR94832]

2020-04-29 Thread Uros Bizjak via Gcc-patches
On Wed, Apr 29, 2020 at 1:11 PM Jakub Jelinek wrote: > > Hi! > > A few other macros seem to suffer from the same issue. What I've done was: > cat gcc/config/i386/*intrin.h | sed -e ':x /\\$/ { N; s/\\\n//g ; bx }' \ > | grep '^[[:blank:]]*#[[:blank:]]*define[[:blank:]].*(' | sed 's/[ ]\+/

i386: Require OPTION_MASK_ISA_SSE2 for __builtin_ia32_movq128 [PR94603]

2020-04-15 Thread Uros Bizjak via Gcc-patches
2020-04-15 Uroš Bizjak PR target/94603 * config/i386/i386-builtin.def (__builtin_ia32_movq128): Require OPTION_MASK_ISA_SSE2. testsuite/ChangeLog: 2020-04-15 Uroš Bizjak PR target/94603 * gcc.target/i386/pr94603.c: New test. Bootstrapped and regression tested on x86_64-linux-gnu

[RFC PATCH] i386: Add V2SFmode FMA insn patterns [PR95046]

2020-05-11 Thread Uros Bizjak via Gcc-patches
Attached patch implements V2SFmode FMA insn patterns. Patched compiler vectorizes FMA, FMS and FNMA instructions, but for some reason fails to vectorize FNMS. I have double checked that the insn pattern is correct, and now I'm all out of ideas what could be wrong with the pattern, still ignored

Re: [RFC PATCH] i386: Add V2SFmode FMA insn patterns [PR95046]

2020-05-12 Thread Uros Bizjak via Gcc-patches
On Tue, May 12, 2020 at 7:57 AM Richard Biener wrote: > > On Mon, 11 May 2020, Uros Bizjak wrote: > > > Attached patch implements V2SFmode FMA insn patterns. Patched compiler > > vectorizes FMA, FMS and FNMA instructions, but for some reason fails > > to vectorize FNMS. > > > > I have double

Re: [PATCH] [PR94118]] Update documentation for x86 operand modifier.

2020-05-12 Thread Uros Bizjak via Gcc-patches
On Tue, May 12, 2020 at 7:27 AM Hongtao Liu wrote: > > Documents operand modifiers which are available in asm stmt but > missing in document. > > | Modifier | Description | Available in asm stmt | Existed in documentation | > | --- | --- | --- | - | > | L,W,B,Q,S,T | print the opcode

Re: [PATCH] x86: Properly count cost of XMM register push

2020-05-13 Thread Uros Bizjak via Gcc-patches
On Wed, May 13, 2020 at 1:05 PM Uros Bizjak wrote: > > On Tue, May 12, 2020 at 10:07 PM H.J. Lu wrote: > > > > Update STV pass to properly count cost of XMM register push. In 32-bit > > mode, to convert XMM register push in DImode, we do an XMM store in > > DImode, followed by 2 memory pushes

[committed] i386: Add V2SFmode NEG, ABS and logic insn patterns [PR95046]

2020-05-12 Thread Uros Bizjak via Gcc-patches
gcc/ChangeLog: 2020-05-12 Uroš Bizjak PR target/95046 * config/i386/mmx.md (v2sf2): New insn pattern. (*mmx_v2sf2): New insn_and_split pattern. (*mmx_nabsv2sf2): Ditto. (*mmx_andnotv2sf3): New insn pattern. (*mmx_v2sf3): Ditto. * config/i386/i386.md (absneg_op):

[committed] i386: Add V2SFmode copysign, xorsign and signbit expanders [PR95046]

2020-05-12 Thread Uros Bizjak via Gcc-patches
gcc/ChangeLog: 2020-05-12 Uroš Bizjak PR target/95046 * config/i386/mmx.md (copysignv2sf3): New expander. (xorsignv2sf3): Ditto. (signbitv2sf3): Ditto. testsuite/ChangeLog: 2020-05-12 Uroš Bizjak PR target/95046 * gcc.target/i386/pr95046-4.c: New test.
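
What copysign, xorsign and signbit compute can be modelled per element with plain bit operations. The sketch below assumes IEEE-754 binary32 `float`; the `*_model` names are hypothetical and this is a scalar model, not the code the expanders emit.

```c
#include <stdint.h>
#include <string.h>

#define SIGN_MASK UINT32_C(0x80000000)

/* Reinterpret a float as its bit pattern and back, without aliasing issues. */
static uint32_t bits_of(float f)  { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float float_of(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* Magnitude of x combined with the sign of y. */
float copysign_model(float x, float y)
{
    return float_of((bits_of(x) & ~SIGN_MASK) | (bits_of(y) & SIGN_MASK));
}

/* x with its sign flipped when y is negative. */
float xorsign_model(float x, float y)
{
    return float_of(bits_of(x) ^ (bits_of(y) & SIGN_MASK));
}

/* 1 when the sign bit of x is set (including -0.0), else 0. */
int signbit_model(float x)
{
    return (int)(bits_of(x) >> 31);
}
```

All three reduce to AND, OR, XOR and shift operations on the sign bit, which is what makes them cheap to apply to both V2SF elements at once.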

[committed] i386: Add V2SFmode FMA insn patterns [PR95046]

2020-05-12 Thread Uros Bizjak via Gcc-patches
gcc/ChangeLog: 2020-05-12 Uroš Bizjak PR target/95046 * config/i386/mmx.md (fmav2sf4): New insn pattern. (fmsv2sf4): Ditto. (fnmav2sf4): Ditto. (fnmsv2sf4): Ditto. testsuite/ChangeLog: 2020-05-12 Uroš Bizjak PR target/95046 * gcc.target/i386/pr95046-3.c: New

Re: [PATCH] x86: Allow V1TI vector register pushes

2020-05-17 Thread Uros Bizjak via Gcc-patches
On Sat, May 16, 2020 at 8:13 PM H.J. Lu wrote: > > On Fri, May 15, 2020 at 11:21:30AM +0200, Uros Bizjak wrote: > > On Wed, May 13, 2020 at 5:58 PM H.J. Lu wrote: > > > > > > > > The question is, why STV pass creates its funny sequence? The > > > > > > original > > > > > > sequence should be

Re: [PATCH] i386: Define __ILP32__ and _ILP32 for all 32-bit targets

2020-05-17 Thread Uros Bizjak via Gcc-patches
On Sun, May 17, 2020 at 6:14 PM Gerald Pfeifer wrote: > > On Fri, 8 May 2020, Uros Bizjak wrote: > >> A user reported that gcc -m32 on x86-64 does not define __ILP32__ > >> and I found the same on i686 (with gcc -x c -dM -E /dev/null). > : > >> This patch does the same for all "regular" 32-bit

[committed] i386: Use generic division to generate INEXACT exception

2020-05-06 Thread Uros Bizjak via Gcc-patches
Introduce math_force_eval_div to use generic division to generate INEXACT as well as INVALID and DIVZERO exceptions. libgcc/ChangeLog: 2020-05-06 Uroš Bizjak * config/i386/sfp-exceptions.c (__math_force_eval): Remove. (__math_force_eval_div): New define.

[committed] i386: Use "clobber (scratch)" in expanders

2020-05-05 Thread Uros Bizjak via Gcc-patches
Use "clobber (scratch:M)" instead of "clobber (match_scratch:M N)" in expanders. 2020-05-05 Uroš Bizjak * config/i386/i386.md (fixuns_truncsi2): Use "clobber (scratch:M)" instead of "clobber (match_scratch:M N)". (addqi3_cconly_overflow): Ditto. (umulv4): Ditto.

Re: [PATCH] Enable GCC support for SERIALIZE

2020-05-06 Thread Uros Bizjak via Gcc-patches
On Wed, May 6, 2020 at 5:11 AM Hongtao Liu wrote: > > On Mon, May 4, 2020 at 1:17 AM Uros Bizjak wrote: > > > > On Wed, Apr 1, 2020 at 9:23 AM Hongtao Liu wrote: > > > > > > Hi: > > > This patch is about to enable GCC support for SERIALIZE which would > > > be in GLC. There's only 1

Re: [PATCH] Enable GCC support for TSXLDTRK

2020-05-06 Thread Uros Bizjak via Gcc-patches
On Wed, May 6, 2020 at 4:48 AM Hongtao Liu wrote: > > On Mon, May 4, 2020 at 12:58 AM Uros Bizjak wrote: > > > > The part above is OK, but you are missing support for > > __attribute__((__target__("..."))). Please see how for example -msgx > > is handled in isa2_opts in i386-options.c and in > >

Re: [PATCH] make minmax detection work with FMIN/FMAX IFNs

2020-05-10 Thread Uros Bizjak via Gcc-patches
On Fri, May 8, 2020 at 4:38 PM Richard Biener wrote: > > On May 8, 2020 4:28:24 PM GMT+02:00, Alexander Monakov > wrote: > >On Fri, 8 May 2020, Uros Bizjak wrote: > > > >> > Am I missing something? > >> > >> Is the above enough to declare min/max as IEEE compliant? > > > >No. SSE min/max
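
One well-known difference between the SSE minimum and an IEEE-style fmin is NaN handling: MINSS/MINPS return the second source operand whenever either input is a NaN. The following sketch models both behaviours in scalar C; the function names are hypothetical, and the code assumes it is not compiled with -ffast-math.

```c
#include <math.h>   /* only for the NAN macro */

/* Same selection rule as "a < b ? a : b": any NaN input yields b. */
float sse_style_min(float a, float b)
{
    return a < b ? a : b;
}

/* fmin-like rule: a single NaN input is ignored. */
float ieee_style_min(float a, float b)
{
    if (a != a) return b;   /* a is NaN */
    if (b != b) return a;   /* b is NaN */
    return a < b ? a : b;
}
```

The two functions agree on ordinary numbers and differ only when the second argument is the NaN.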

Re: [committed] i386: Vectorize basic V2SFmode operations [PR95046]

2020-05-11 Thread Uros Bizjak via Gcc-patches
Now with missing testcase. On Mon, May 11, 2020 at 11:20 AM Uros Bizjak wrote: > > Enable V2SFmode vectorization and vectorize V2SFmode PLUS, > MINUS, MULT, MIN and MAX operations using XMM registers. > > To avoid unwanted secondary effects (e.g. exceptions), load values > to XMM registers using

[committed] i386: Vectorize basic V2SFmode operations [PR95046]

2020-05-11 Thread Uros Bizjak via Gcc-patches
Enable V2SFmode vectorization and vectorize V2SFmode PLUS, MINUS, MULT, MIN and MAX operations using XMM registers. To avoid unwanted secondary effects (e.g. exceptions), load values to XMM registers using MOVQ that clears high bits of the XMM register outside V2SFmode. The compiler now
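
A loop of the following shape is the kind of two-element float operation this change targets. The function is a hypothetical illustration, not the testcase from the patch, and whether a given compiler vectorizes it depends on its version and options.

```c
/* Element-wise sum of two 2-element float arrays. */
void v2sf_add(const float a[2], const float b[2], float r[2])
{
    for (int i = 0; i < 2; i++)
        r[i] = a[i] + b[i];
}
```

Only 64 bits of the XMM register carry data in this case, which is why the message stresses clearing the high half before operating on it.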

Re: [PATCH] make minmax detection work with FMIN/FMAX IFNs

2020-05-08 Thread Uros Bizjak via Gcc-patches
On Fri, May 8, 2020 at 3:02 PM Richard Biener wrote: > > > Currently we fail to optimize those which are used when MIN/MAX_EXPR > cannot be used for FP values but the target has IEEE conforming > implementations. > > This patch adds support for fmin/fmax detection to phiopt and > makes the named

Re: [PATCH] make minmax detection work with FMIN/FMAX IFNs

2020-05-08 Thread Uros Bizjak via Gcc-patches
On Fri, May 8, 2020 at 3:46 PM Alexander Monakov wrote: > > > > On Fri, 8 May 2020, Richard Biener wrote: > > > > > Currently we fail to optimize those which are used when MIN/MAX_EXPR > > cannot be used for FP values but the target has IEEE conforming > > implementations. > > i386

Re: [PATCH] ix86: Add peephole2 for *add3_cc_overflow_1 followed by matching memory store [PR94857]

2020-05-08 Thread Uros Bizjak via Gcc-patches
On Fri, May 8, 2020 at 9:47 AM Jakub Jelinek wrote: > > Hi! > > The following peephole2 changes: > - addl (%rdi), %esi > + xorl %eax, %eax > + addl %esi, (%rdi) > setc %al > - movl %esi, (%rdi) > - movzbl %al, %eax > ret > on the

[committed] i386: Allow SI, DI and TImode pushes from XMM registers

2020-05-15 Thread Uros Bizjak via Gcc-patches
Also change XMM register constraint from "x" to "v" in FP push insns. gcc/ChangeLog: 2020-05-15 Uroš Bizjak * config/i386/i386.md (SWI48DWI): New mode iterator. (*push2): Allow XMM registers. (*pushdi2_rex64): Ditto. (*pushsi2_rex64): Ditto. (*pushsi2): Ditto. (push

Re: [committed] i386: Add V2SFmode conversion functions [PR95046]

2020-05-14 Thread Uros Bizjak via Gcc-patches
Now with a patch attached. On Thu, May 14, 2020 at 9:18 AM Uros Bizjak wrote: > > gcc/ChangeLog: > > 2020-05-14 Uroš Bizjak > > PR target/95046 > * config/i386/mmx.md (mmx_fix_truncv2sfv2si2): rename from mmx_pf2id. > Add SSE/AVX alternative. Change operand predicates from >

[committed] i386: Add V2SFmode conversion functions [PR95046]

2020-05-14 Thread Uros Bizjak via Gcc-patches
gcc/ChangeLog: 2020-05-14 Uroš Bizjak PR target/95046 * config/i386/mmx.md (mmx_fix_truncv2sfv2si2): rename from mmx_pf2id. Add SSE/AVX alternative. Change operand predicates from nonimmediate_operand to register_mmxmem_operand. Enable instruction pattern for

Re: [PATCH] x86: Update Intel processor detection

2020-05-18 Thread Uros Bizjak via Gcc-patches
On Mon, May 18, 2020 at 1:58 PM H.J. Lu wrote: > > Add cpu model numbers for Intel Airmont, Tremont, Comet Lake, Ice Lake > and Tiger Lake processor families. > > OK for master? OK. Please also update cpuinfo.c from libgcc and corresponding gcc.target/i386/builtin_target.c testcase. Thanks,

Re: [PATCH] x86: Update Intel processor detection

2020-05-18 Thread Uros Bizjak via Gcc-patches
On Mon, May 18, 2020 at 2:34 PM H.J. Lu wrote: > > On Mon, May 18, 2020 at 5:18 AM Uros Bizjak wrote: > > > > On Mon, May 18, 2020 at 1:58 PM H.J. Lu wrote: > > > > > > Add cpu model numbers for Intel Airmont, Tremont, Comet Lake, Ice Lake > > > and Tiger Lake processor families. > > > > > > OK

[committed] i386: Avoid reversing a non-trapping comparison to a trapping one [PR95169]

2020-05-18 Thread Uros Bizjak via Gcc-patches
gcc/ChangeLog: 2020-05-18 Uroš Bizjak PR target/95169 * config/i386/i386-expand.c (ix86_expand_int_movcc): Avoid reversing a non-trapping comparison to a trapping one. testsuite/ChangeLog: 2020-05-18 Uroš Bizjak PR target/95169 * gcc.target/i386/pr95169.c: New test.

[committed] i386: Improve vector mode and TFmode ABS and NEG patterns

2020-05-18 Thread Uros Bizjak via Gcc-patches
gcc/ChangeLog: 2020-05-18 Uroš Bizjak * config/i386/i386-expand.c (ix86_expand_fp_absneg_operator): Do not emit FLAGS_REG clobber for TFmode. * config/i386/i386.md (*tf2_1): Rewrite as define_insn_and_split. Mark operands 1 and 2 commutative. (*nabstf2_1): Ditto.

Re: [PATCH] x86: Move cpuinfo.h from libgcc to common/config/i386

2020-05-18 Thread Uros Bizjak via Gcc-patches
On Tue, May 19, 2020 at 4:17 AM H.J. Lu wrote: > > On Mon, May 18, 2020 at 5:57 AM H.J. Lu wrote: > > > > On Mon, May 18, 2020 at 5:43 AM Uros Bizjak wrote: > > > > > > On Mon, May 18, 2020 at 2:34 PM H.J. Lu wrote: > > > > > > > > On Mon, May 18, 2020 at 5:18 AM Uros Bizjak wrote: > > > > >

[RFC PATCH] i386: Add missing vector zero/sign_extend expanders [PR92658]

2020-05-19 Thread Uros Bizjak via Gcc-patches
Hello! Attached patch adds missing vector zero/sign_extend expanders to allow vectorization of operations between different vector sizes. The patch regresses (progresses?): FAIL: gcc.target/i386/pr92645-4.c scan-tree-dump-times optimized "vec_unpack_lo" 3 but eyeballing the asm code

Re: [PATCH] x86: Add cmpmemsi for -mgeneral-regs-only

2020-05-19 Thread Uros Bizjak via Gcc-patches
On Sun, May 17, 2020 at 7:06 PM H.J. Lu wrote: > > Duplicate the cmpstrn pattern for cmpmem. The only difference is that > the length argument of cmpmem is guaranteed to be less than or equal to > lengths of 2 memory areas. Since "repz cmpsb" can be much slower than > memcmp function

Re: [RFC PATCH] i386: Add missing vector zero/sign_extend expanders [PR92658]

2020-05-19 Thread Uros Bizjak via Gcc-patches
On Tue, May 19, 2020 at 10:48 AM Richard Biener wrote: > > On Tue, 19 May 2020, Uros Bizjak wrote: > > > Hello! > > > > Attached patch adds missing vector zero/sign_extend expanders to allow > > vectorization of operations between different vector sizes. > > > > The patch regresses (progresses?):

Re: [PATCH] x86: Add -mavoid-libcall for -mgeneral-regs-only

2020-05-15 Thread Uros Bizjak via Gcc-patches
On Fri, May 15, 2020 at 1:13 AM H.J. Lu wrote: > > The -mgeneral-regs-only option generates code that uses only the > general-purpose registers. It prevents the compiler from using vector > registers. But GCC may still generate calls to memcpy, memmove, memset > and memcmp library functions.

[committed] i386: Add V2SFmode hadd/hsub instructions [PR95046]

2020-05-15 Thread Uros Bizjak via Gcc-patches
PFACC/PFNACC 3dNow! instructions got their corresponding SSE alternative in SSE3, so these can't be implemented with TARGET_MMX_WITH_SSE, which implies SSE2. These instructions are only generated via builtins, and since several 3dNow! insns have no corresponding SSE alternative, we can't avoid

Re: [PATCH] x86: Allow vector register pushes

2020-05-15 Thread Uros Bizjak via Gcc-patches
On Wed, May 13, 2020 at 5:58 PM H.J. Lu wrote: > > > > The question is, why STV pass creates its funny sequence? The original > > > > sequence should be easily solved by storing DImode from XMM register > > > > and (with patched gcc) pushing DImode value from the same XMM > > > > register. > > >

Re: [PATCH] x86: Properly count cost of XMM register push

2020-05-13 Thread Uros Bizjak via Gcc-patches
On Tue, May 12, 2020 at 10:07 PM H.J. Lu wrote: > > Update STV pass to properly count cost of XMM register push. In 32-bit > mode, to convert XMM register push in DImode, we do an XMM store in > DImode, followed by 2 memory pushes in SImode, instead of 2 integer > register pushes in SImode. To

[committed] i386: Add V2DFmode conversion functions [PR95046]

2020-05-14 Thread Uros Bizjak via Gcc-patches
gcc/ChangeLog: 2020-05-14 Uroš Bizjak PR target/95046 * config/i386/sse.md (sse2_cvtpi2pd): Add memory to alternative 1. (floatv2siv2df2): New expander. (floatunsv2siv2df2): New insn pattern. (fix_truncv2dfv2si2): New expander. (fixuns_truncv2dfv2si2): New insn

Re: [PATCH] x86: Properly count cost of XMM register push

2020-05-13 Thread Uros Bizjak via Gcc-patches
On Wed, May 13, 2020 at 2:37 PM H.J. Lu wrote: > > On Wed, May 13, 2020 at 5:04 AM Uros Bizjak wrote: > > > > On Wed, May 13, 2020 at 1:05 PM Uros Bizjak wrote: > > > > > > On Tue, May 12, 2020 at 10:07 PM H.J. Lu wrote: > > > > > > > > Update STV pass to properly count cost of XMM register

Re: [PATCH] x86: Properly count cost of XMM register push

2020-05-13 Thread Uros Bizjak via Gcc-patches
On Wed, May 13, 2020 at 3:25 PM H.J. Lu wrote: > > On Wed, May 13, 2020 at 6:17 AM Uros Bizjak wrote: > > > > On Wed, May 13, 2020 at 2:37 PM H.J. Lu wrote: > > > > > > On Wed, May 13, 2020 at 5:04 AM Uros Bizjak wrote: > > > > > > > > On Wed, May 13, 2020 at 1:05 PM Uros Bizjak wrote: > > >

[committed] i386: Add V2DFmode float trunc/extend functions [PR95046]

2020-05-14 Thread Uros Bizjak via Gcc-patches
gcc/ChangeLog: 2020-05-14 Uroš Bizjak PR target/95046 * config/i386/sse.md (truncv2dfv2df2): New insn pattern. (extendv2sfv2df2): Ditto. testsuite/ChangeLog: 2020-05-14 Uroš Bizjak PR target/95046 * gcc.target/i386/pr95046-7.c: New test. Bootstrapped and regression

[WIP PATCH]: Autovectorize V2SF mode

2020-05-08 Thread Uros Bizjak via Gcc-patches
Attached WIP patch enables auto-vectorization of basic V2SF operations (plus, minus, mult, min/max). The compiler takes care that everything is loaded from memory via movq insn, so top two registers always remain zero. We could probably vectorize some more operations (horizontal add, horizontal

Re: [WIP PATCH]: Autovectorize V2SF mode

2020-05-08 Thread Uros Bizjak via Gcc-patches
On Fri, May 8, 2020 at 7:22 PM Uros Bizjak wrote: > > Attached WIP patch enables auto-vectorization of basic V2SF operations > (plus, minus, mult, min/max). The compiler takes care that everything > is loaded from memory via movq insn, so top two registers always > remain zero. This example:

Re: [PATCH] Add enqcmd, avx512bf16, avx512vp2intersect to funcspec-56.inc

2020-05-06 Thread Uros Bizjak via Gcc-patches
On Wed, May 6, 2020 at 11:08 AM Hongtao Liu wrote: > > Hi: > Test is ok for funcspec-5.c, funcspec-6.c. > > gcc/testsuite/ChangeLog > * gcc.target/i386/funcspec-56.inc: Add enqcmd, avx512bf16, > avx512vp2intersect. OK. Thanks, Uros. >

[committed] i386: Add V2SFmode sqrt insn pattern [PR95046]

2020-05-11 Thread Uros Bizjak via Gcc-patches
gcc/ChangeLog: 2020-05-11 Uroš Bizjak PR target/95046 * config/i386/mmx.md (sqrtv2sf2): New insn pattern. testsuite/ChangeLog: 2020-05-11 Uroš Bizjak PR target/95046 * gcc.target/i386/pr95046-1.c (test_sqrt): Add. Bootstrapped and regression tested on x86_64-linux-gnu

[committed] testsuite: Fix g++.dg/debug/dwarf2/const2b.C target selector

2020-03-17 Thread Uros Bizjak via Gcc-patches
2020-03-17 Uroš Bizjak * g++.dg/debug/dwarf2/const2b.C (dg-do): Fix target selector. Tested on x86_64-linux-gnu {,-m32}. Uros. diff --git a/gcc/testsuite/g++.dg/debug/dwarf2/const2b.C b/gcc/testsuite/g++.dg/debug/dwarf2/const2b.C index 3ad1c080945..681ad721dd7 100644 ---

Re: [PATCH] Darwin: Fix i686 bootstrap when the assembler supports GOTOFF in data.

2020-03-21 Thread Uros Bizjak via Gcc-patches
On Sat, Mar 21, 2020 at 3:21 PM Iain Sandoe wrote: > > Hi, > > When we use an assembler that supports " .long XX@GOTOFF", the current > combination of configuration parameters and conditional compilation > (when building an i686-darwin compiler with mdynamic-no-pic) assume that > it's OK to put

Re: [PATCH] LRA: fix for PR94185

2020-03-17 Thread Uros Bizjak via Gcc-patches
>The following committed patch solves > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94185 > >The patch was successfully bootstrapped and tested on x86-64. This patch creates a suboptimal sequence in reload: (insn 64 63 248 10 (set (reg:DI 0 ax [161]) (zero_extend:DI (mem/c:SI

[committed] i386: Use ADD to implement compares with negated operand [PR94913]

2020-05-06 Thread Uros Bizjak via Gcc-patches
Use the carry flag from addition to implement GEU/LTU compares with a negated operand, so e.g. ~x < y compiles to: addq %rsi, %rdi setc %al instead of: notq %rdi cmpq %rsi, %rdi setb %al 2020-05-06 Uroš Bizjak PR target/94913 *
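
The identity behind this is that, for unsigned values, ~x < y holds exactly when x + y wraps around. A sketch that expresses both sides in C, using the GCC/Clang builtin `__builtin_add_overflow` to observe the carry (the function names are hypothetical):

```c
#include <stdbool.h>
#include <stdint.h>

/* The comparison as written in the source. */
bool not_x_below_y(uint64_t x, uint64_t y)
{
    return ~x < y;
}

/* The same predicate as the carry-out of x + y. */
bool add_carries(uint64_t x, uint64_t y)
{
    uint64_t sum;
    return __builtin_add_overflow(x, y, &sum);
}
```

Since ~x equals UINT64_MAX - x, the condition ~x < y rearranges to x + y > UINT64_MAX, which is the carry condition of the addition.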

Re: [PATCH] i386: Define __ILP32__ and _ILP32 for all 32-bit targets

2020-05-07 Thread Uros Bizjak via Gcc-patches
On Fri, May 8, 2020 at 12:58 AM Gerald Pfeifer wrote: > > A user reported that gcc -m32 on x86-64 does not define __ILP32__ > and I found the same on i686 (with gcc -x c -dM -E /dev/null). > > The code has > > if (TARGET_X32) > { > cpp_define (parse_in, "_ILP32"); >
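
From the user's side, the macros in question are normally consumed with a preprocessor check like the one below. This is a sketch with a hypothetical function name; it only reports whether the compiler in use defines either macro.

```c
#include <stdbool.h>

/* Reports whether the compiler predefines __ILP32__ or _ILP32. */
bool ilp32_macro_defined(void)
{
#if defined(__ILP32__) || defined(_ILP32)
    return true;
#else
    return false;
#endif
}
```

When the macro is defined, `int`, `long` and pointers are all expected to be 32 bits wide; the converse is what the thread is about.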

[committed]i386: Fix zero/sign extend expanders [PR95229]

2020-05-20 Thread Uros Bizjak via Gcc-patches
2020-05-20 Uroš Bizjak gcc/ChangeLog: PR target/95229 * config/i386/sse.md (v8qiv8hi2): Use simplify_gen_subreg instead of simplify_subreg. (v8qiv8si2): Ditto. (v4qiv4si2): Ditto. (v4hiv4si2): Ditto. (v8qiv8di2): Ditto. (v4qiv4di2): Ditto. (v2qiv2di2):

[committed] i386: Fix *pushsi2_rex64 constraints [PR95238]

2020-05-20 Thread Uros Bizjak via Gcc-patches
2020-05-20 Uroš Bizjak gcc/ChangeLog: PR target/95238 * config/i386/i386.md (*pushsi2_rex64): Use "e" constraint instead of "i". Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Uros. diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index

Re: [PATCH][PR92658] Add missing vector truncmn2 expanders for avx512f

2020-05-20 Thread Uros Bizjak via Gcc-patches
On Wed, May 20, 2020 at 10:35 AM Hongtao Liu wrote: > > Hi: > Bootstrap is ok, regression test on i386/x86-64 backend is ok. > > gcc/ChangeLog: > PR target/92658 > * config/i386/sse.md > (trunc2, truncv32hiv32qi2, > trunc2): New expander. > >

Re: V2 [PATCH] x86: Move cpuinfo.h from libgcc to common/config/i386

2020-05-19 Thread Uros Bizjak via Gcc-patches
On Tue, May 19, 2020 at 9:58 PM H.J. Lu wrote: > > On Mon, May 18, 2020 at 10:56 PM Uros Bizjak wrote: > > > > On Tue, May 19, 2020 at 4:17 AM H.J. Lu wrote: > > > > > > On Mon, May 18, 2020 at 5:57 AM H.J. Lu wrote: > > > > > > > > On Mon, May 18, 2020 at 5:43 AM Uros Bizjak wrote: > > > > >

Re: [PATCH] x86: Add FEATURE_AVX512VP2INTERSECT and update GFNI check

2020-05-19 Thread Uros Bizjak via Gcc-patches
On Tue, May 19, 2020 at 11:37 PM H.J. Lu wrote: > > Add FEATURE_AVX512VP2INTERSECT to libgcc so that enum processor_features > in libgcc matches enum processor_features in i386-builtins.c. Update > GFNI check to support processors with SSE and AVX versions of GFNI. > > PR target/95212 >

[committed] i386: Do not use commutative operands with (use) RTX [PR95238]

2020-05-20 Thread Uros Bizjak via Gcc-patches
2020-05-21 Uroš Bizjak gcc/ChangeLog: PR target/95218 * config/i386/mmx.md (*mmx_v2sf): Do not mark operands 1 and 2 commutative. Manually swap operands. (*mmx_nabsv2sf2): Ditto. Partially revert: 2020-05-18 Uroš Bizjak * config/i386/i386.md (*tf2_1):

Re: [PATCH][PR92658] Add missing vector truncmn2 expanders for avx512f

2020-05-21 Thread Uros Bizjak via Gcc-patches
On Thu, May 21, 2020 at 7:35 AM Hongtao Liu wrote: > > On Wed, May 20, 2020 at 11:43 PM Uros Bizjak wrote: > > > > On Wed, May 20, 2020 at 10:35 AM Hongtao Liu wrote: > > > > > > Hi: > > > Bootstrap is ok, regression test on i386/x86-64 backend is ok. > > > > > > gcc/ChangeLog: > > >

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-23 Thread Uros Bizjak via Gcc-patches
On Sat, May 23, 2020 at 9:25 AM Hongtao Liu wrote: > > Hi: > This patch fix non-conforming expander for > floatv2div2sf2,floatunsv2div2sf2,fix_truncv2sfv2di,fixuns_truncv2sfv2di, > refer to PR95211, PR95256. > bootstrap ok, regression test on i386/x86-64 backend is ok. > > gcc/ChangeLog: >

Re: [PATCH][PR92658] Add missing vector truncmn2 expanders for avx512f

2020-05-22 Thread Uros Bizjak via Gcc-patches
On Fri, May 22, 2020 at 6:55 AM Hongtao Liu wrote: > > On Thu, May 21, 2020 at 7:18 PM Uros Bizjak wrote: > > > > On Thu, May 21, 2020 at 7:35 AM Hongtao Liu wrote: > > > > > > On Wed, May 20, 2020 at 11:43 PM Uros Bizjak wrote: > > > > > > > > On Wed, May 20, 2020 at 10:35 AM Hongtao Liu

Re: [PATCH] x86: Handle -mavx512vpopcntdq for -march=native

2020-05-22 Thread Uros Bizjak via Gcc-patches
On Thu, May 21, 2020 at 2:54 PM H.J. Lu wrote: > > Add -mavx512vpopcntdq for -march=native if AVX512VPOPCNTDQ is available. > > PR target/95258 > * config/i386/driver-i386.c (host_detect_local_cpu): Detect > AVX512VPOPCNTDQ. OK. Thanks, Uros. > --- >
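
For context, a minimal sketch of the kind of check such detection relies on: AVX512VPOPCNTDQ is reported in CPUID leaf 7, subleaf 0, ECX bit 14. The helper names below are hypothetical and are not taken from driver-i386.c. The sketch also leaves out the OS-support (XGETBV) check that a complete AVX-512 detection would need.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang header providing __get_cpuid_count */
#endif

/* CPUID.(EAX=7,ECX=0):ECX bit 14 reports AVX512VPOPCNTDQ. */
#define LEAF7_ECX_AVX512VPOPCNTDQ (UINT32_C(1) << 14)

/* Hypothetical helper: decode the feature bit from a raw leaf-7 ECX value. */
static bool ecx_has_avx512vpopcntdq(uint32_t leaf7_ecx)
{
  return (leaf7_ecx & LEAF7_ECX_AVX512VPOPCNTDQ) != 0;
}

/* Hypothetical helper: query the host CPU.  Returns false on non-x86
   hosts or when CPUID leaf 7 is not available.  */
static bool host_has_avx512vpopcntdq(void)
{
#if defined(__x86_64__) || defined(__i386__)
  unsigned int eax, ebx, ecx, edx;
  if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
    return false;
  return ecx_has_avx512vpopcntdq(ecx);
#else
  return false;
#endif
}
```

Splitting the bit decoding from the CPUID query keeps the decoding testable on any host, including non-x86 build machines.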

Re: [PATCH][PR92658] Add missing vector truncmn2 expanders for avx512f

2020-05-22 Thread Uros Bizjak via Gcc-patches
On Fri, May 22, 2020 at 11:52 AM Hongtao Liu wrote: > > On a related note, it looks that pmov stores are modelled in a wrong > > way. For example, this pattern; > > > > (define_insn "*avx512f_v8div16qi2_store" > > [(set (match_operand:V16QI 0 "memory_operand" "=m") > > (vec_concat:V16QI > >

Re: [PATCH] x86: Handle -mavx512vpopcntdq for -march=native

2020-05-24 Thread Uros Bizjak via Gcc-patches
On Sat, May 23, 2020 at 5:07 PM H.J. Lu wrote: > > On Fri, May 22, 2020 at 12:42 AM Uros Bizjak wrote: > > > > On Thu, May 21, 2020 at 2:54 PM H.J. Lu wrote: > > > > > > Add -mavx512vpopcntdq for -march=native if AVX512VPOPCNTDQ is available. > > > > > > PR target/95258 > > > *

Re: [PATCH] x86: Fix up ssse3_pshufbv8qi splitter

2020-08-30 Thread Uros Bizjak via Gcc-patches
On Sun, Aug 30, 2020 at 11:21 AM Jakub Jelinek wrote: > > Hi! > > The constant pool size optimization I was testing resulted in various ICEs > in gcc.target/i386/ testsuite, the problem is that the ssse3_pshufbv8qi > splitter emits invalid RTL, in V4SImode 0xf7f7f7f7 CONST_INTs shouldn't >

Re: [PATCH] Return mask <-> integer cost for non-AVX512 micro-architecture.

2020-09-15 Thread Uros Bizjak via Gcc-patches
On Tue, Sep 15, 2020 at 4:59 AM Hongtao Liu wrote: > > Hi: > This patch would avoid spill gprs to mask registers for non-AVX512 > micro-architecture and fix regression in PR96744. > > Bootstrap is ok, regression test for i386/x86-64 backend is ok. > No big performance impact on SPEC2017. >

Re: [Patch] x86: Enable support for Intel UINTR extension

2020-10-14 Thread Uros Bizjak via Gcc-patches
On Wed, Oct 14, 2020 at 10:34 AM Hongyu Wang wrote: > > > > > Please also add -muintr to g++.dg/other/i386-{2,3}.C and > > gcc.target/i386-sse-{12,13,14,22,23}.c. This will test new intrinsics > > header. > > > > Thanks for your review. We found that without adding -muintr, the intrinsics >

Re: [Patch] x86: Enable support for Intel UINTR extension

2020-10-14 Thread Uros Bizjak via Gcc-patches
On Wed, Oct 14, 2020 at 10:42 AM Uros Bizjak wrote: > > On Wed, Oct 14, 2020 at 10:34 AM Hongyu Wang wrote: > > > > > > > > Please also add -muintr to g++.dg/other/i386-{2,3}.C and > > > gcc.target/i386-sse-{12,13,14,22,23}.c. This will test new intrinsics > > > header. > > > > > > > Thanks for

Re: [Patch] x86: Enable support for Intel UINTR extension

2020-10-14 Thread Uros Bizjak via Gcc-patches
On Wed, Oct 14, 2020 at 11:04 AM Hongyu Wang wrote: > > > > On Wed, Oct 14, 2020 at 4:42 PM Uros Bizjak wrote: >> >> On Wed, Oct 14, 2020 at 10:34 AM Hongyu Wang wrote: >> > >> > > >> > > Please also add -muintr to g++.dg/other/i386-{2,3}.C and >> > > gcc.target/i386-sse-{12,13,14,22,23}.c. This will test

Re: [Patch] x86: Enable GCC support for Intel Hreset extension

2020-10-14 Thread Uros Bizjak via Gcc-patches
On Tue, Oct 13, 2020 at 10:49 AM Hongyu Wang wrote: > > Hi: > > This patch is about to support Intel Hreset instruction. > > Hreset provides a hint to the processor to selectively reset the prediction > history of the current logical processor. > > For more details, please refer to >

Re: [Patch] x86: Enable support for Intel UINTR extension

2020-10-14 Thread Uros Bizjak via Gcc-patches
On Tue, Oct 13, 2020 at 10:30 AM Hongyu Wang wrote: > > Hi: > > This patch is about to support User Interrupt (UINTR) instructions. > > This feature defines user interrupts as new events in the architecture. They > are delivered to software operating in 64-bit mode with CPL = 3 without any >

Re: [PATCH][middle-end][i386][version 3]Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-19 Thread Uros Bizjak via Gcc-patches
On Tue, Oct 6, 2020 at 4:02 PM Qing Zhao wrote: > > Hi, Gcc team, > > This is the 3rd version of the implementation of patch -fzero-call-used-regs. > > We will provide a new feature into GCC: > > Add > -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-arg|used-gpr|all-gpr|used|all] >

Re: [PATCH] i386: Improve chaining of _{addcarry, subborrow}_u{32, 64} [PR97387]

2020-10-14 Thread Uros Bizjak via Gcc-patches
On Wed, Oct 14, 2020 at 11:01 AM Jakub Jelinek wrote: > > Hi! > > These builtins have two known issues and this patch fixes one of them. > > One issue is that the builtins effectively return two results and > they make the destination addressable until expansion, which means > a stack slot is
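
To illustrate what "chaining" means here, a minimal sketch of a two-limb (128-bit) addition in which the carry-out of one _addcarry_u64 call feeds the carry-in of the next. The function name add128 is hypothetical and is not part of the patch; the non-x86 branch is a plain C fallback added only so the sketch builds everywhere.

```c
#include <stdint.h>

#if defined(__x86_64__)
#include <immintrin.h>   /* declares _addcarry_u64 */
#endif

/* Hypothetical helper: r = a + b over two 64-bit limbs (limb 0 is the
   least significant).  Returns the carry out of the top limb.  */
static unsigned char add128(const uint64_t a[2], const uint64_t b[2],
                            uint64_t r[2])
{
#if defined(__x86_64__)
  unsigned long long lo, hi;
  /* The carry returned by the first call is the carry-in of the second. */
  unsigned char c = _addcarry_u64(0, a[0], b[0], &lo);
  c = _addcarry_u64(c, a[1], b[1], &hi);
  r[0] = lo;
  r[1] = hi;
  return c;
#else
  /* Portable fallback: detect wraparound by comparing with an operand. */
  uint64_t lo = a[0] + b[0];
  unsigned char c_lo = lo < a[0];
  uint64_t hi = a[1] + b[1];
  unsigned char c_hi = hi < a[1];
  hi += c_lo;
  c_hi |= (unsigned char)(hi < c_lo);
  r[0] = lo;
  r[1] = hi;
  return c_hi;
#endif
}
```

Ideally the x86 branch compiles to an add followed by an adc, with the intermediate carry staying in the flags register rather than going through a stack slot.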

Re: [Patch] x86: Enable support for Intel UINTR extension

2020-10-14 Thread Uros Bizjak via Gcc-patches
> > Please also add -muintr to g++.dg/other/i386-{2,3}.C and > > >> > > gcc.target/i386-sse-{12,13,14,22,23}.c. This will test new intrinsics > > >> > > header. > > >> > > > > >> > > > >> > Thanks for your review. We found that without adding -muintr, the > > >> > intrinsics header could also be

Re: [PATCH] i386: Improve chaining of _{addcarry, subborrow}_u{32, 64} [PR97387]

2020-10-14 Thread Uros Bizjak via Gcc-patches
On Wed, Oct 14, 2020 at 3:49 PM Jakub Jelinek wrote: > > On Wed, Oct 14, 2020 at 03:17:03PM +0200, Uros Bizjak wrote: > > > +(define_insn_and_split "*setcc_qi_addqi3_cconly_overflow_1_" > > > + [(set (reg:CCC FLAGS_REG) > > > + (compare:CCC (neg:QI (geu:QI (reg:CC_CCC FLAGS_REG) (const_int

Re: [PATCH] x86: Add missing intrinsics [PR95483]

2020-10-14 Thread Uros Bizjak via Gcc-patches
> gcc/ChangeLog: > > * config/i386/avx2intrin.h (_mm_broadcastsi128_si256): New intrinsics. > (_mm_broadcastsd_pd): Ditto. > * config/i386/avx512bwintrin.h (_mm512_loadu_epi16): New intrinsics. > (_mm512_storeu_epi16): Ditto. > (_mm512_loadu_epi8): Ditto. > (_mm512_storeu_epi8): Ditto. > *
