Re: [PATCH] disable use_vector_fp_converts for m_CORE_ALL

2013-09-22 Thread Uros Bizjak
Li; Zamyatin, Igor > Subject: [PATCH] disable use_vector_fp_converts for m_CORE_ALL > > For the following testcase 1.c, on westmere and sandybridge, performance with > the option -mtune=^use_vector_fp_converts is better (improves from 3.46s to > 2.83s). It means cvtss2sd is often b

Re: [PATCH] disable use_vector_fp_converts for m_CORE_ALL

2013-10-01 Thread H.J. Lu
> Sent: Thursday, September 12, 2013 2:51 AM >> To: GCC Patches >> Cc: David Li; Zamyatin, Igor >> Subject: [PATCH] disable use_vector_fp_converts for m_CORE_ALL >> >> For the following testcase 1.c, on westmere and sandybridge, performance >> with the option -mtun

RE: [PATCH] disable use_vector_fp_converts for m_CORE_ALL

2013-09-18 Thread Zamyatin, Igor
Ccing Uros. Changes in i386.md could be related to the fix for PR57954. Thanks, Igor -Original Message- From: Wei Mi [mailto:w...@google.com] Sent: Thursday, September 12, 2013 2:51 AM To: GCC Patches Cc: David Li; Zamyatin, Igor Subject: [PATCH] disable use_vector_fp_converts for

Re: [PATCH] disable use_vector_fp_converts for m_CORE_ALL

2013-09-20 Thread Wei Mi
Ping. > -Original Message- > From: Wei Mi [mailto:w...@google.com] > Sent: Thursday, September 12, 2013 2:51 AM > To: GCC Patches > Cc: David Li; Zamyatin, Igor > Subject: [PATCH] disable use_vector_fp_converts for m_CORE_ALL > > For the following testcase 1.c, on

[PATCH] disable use_vector_fp_converts for m_CORE_ALL

2013-09-11 Thread Wei Mi
For the following testcase 1.c, on westmere and sandybridge, performance with the option -mtune=^use_vector_fp_converts is better (improves from 3.46s to 2.83s). It means cvtss2sd is often better than unpcklps+cvtps2pd on recent x86 platforms. 1.c: float total = 0.2; int k = 5; int main() { int
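
The 1.c testcase is cut off in this snippet, so the following is only a hypothetical stand-in rather than the original code. It is a minimal sketch of the kind of loop in which each iteration widens a float to a double, the operation for which the message compares cvtss2sd with unpcklps+cvtps2pd. The name widen_sum is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not the truncated 1.c testcase): accumulate floats
   into a double, so every iteration performs a float-to-double conversion. */
double widen_sum(const float *values, size_t n)
{
    assert(values != NULL || n == 0);

    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += (double)values[i];   /* scalar float -> double widening */
    return acc;
}
```

Assuming a GCC build that accepts the -mtune-ctrl option named elsewhere in this thread list, compiling such a file once with -O2 and once with -O2 -mtune-ctrl=^use_vector_fp_converts and comparing the generated assembly would show which conversion sequence is chosen.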

Re: [PATCH] disable use_vector_fp_converts for m_CORE_ALL

2013-10-01 Thread Wei Mi
> Hi Wei Mi, > > Have you checked in your patch? > > -- > H.J. No, I haven't. Honza wants me to wait for his testing on AMD hardware. http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01603.html

Re: [PATCH 3/4] [PATCH 3/4] x86: Properly handle USE_VECTOR_FP_CONVERTS/USE_VECTOR_CONVERTS

2021-09-16 Thread Uros Bizjak via Gcc-patches
Re: [PATCH 3/4] [PATCH 3/4] x86: Properly handle > > USE_VECTOR_FP_CONVERTS/USE_VECTOR_CONVERTS > > > > On Wed, Sep 15, 2021 at 10:10 AM wrote: > > > > > > From: "H.J. Lu" > > > > > > Check TARGET_USE_VECTOR_FP_CONVERTS or > >

Re: [PATCH] disable use_vector_fp_converts for m_CORE_ALL

2013-10-01 Thread Jan Hubicka
> > Hi Wei Mi, > > > > Have you checked in your patch? > > > > -- > > H.J. > > No, I haven't. Honza wants me to wait for his testing on AMD hardware. > http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01603.html I only wanted to separate it from the changes in generic so the regular testers can pick it

RE: [PATCH 3/4] [PATCH 3/4] x86: Properly handle USE_VECTOR_FP_CONVERTS/USE_VECTOR_CONVERTS

2021-09-16 Thread Cui, Lili via Gcc-patches
> -Original Message- > From: Uros Bizjak > Sent: Thursday, September 16, 2021 2:28 PM > To: Cui, Lili > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ; H. J. Lu > > Subject: Re: [PATCH 3/4] [PATCH 3/4] x86: Properly handle > USE_VECTOR_FP_CONVERTS/USE_VECTOR_CONV

[PATCH 3/4] [PATCH 3/4] x86: Properly handle USE_VECTOR_FP_CONVERTS/USE_VECTOR_CONVERTS

2021-09-15 Thread lili.cui--- via Gcc-patches
uite/gcc.target/i386/pr101900-1.c @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=skylake -mfpmath=sse -mtune-ctrl=use_vector_fp_converts" } */ + +extern float f; +extern double d; +extern int i; + +void +foo (void) +{ + d = f; + f = i; +} + +/* { dg-final { scan-

Re: [PATCH] disable use_vector_fp_converts for m_CORE_ALL

2013-10-01 Thread Wei Mi
On Tue, Oct 1, 2013 at 3:50 PM, Jan Hubicka wrote: >> > Hi Wei Mi, >> > >> > Have you checked in your patch? >> > >> > -- >> > H.J. >> >> No, I haven't. Honza wants me to wait for his testing on AMD hardware. >> http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01603.html > I only wanted to separate it

Re: Revisit Core tunning flags

2013-09-22 Thread Jan Hubicka
est to not use this: > >> > >> Assembly/Compiler Coding Rule 33. (M impact, H generality) > >> INC and DEC instructions should be replaced with ADD or SUB instructions, > >> because ADD and SUB overwrite all flags, whereas INC and DEC do not, > >> ther

Re: [PATCH 3/4] [PATCH 3/4] x86: Properly handle USE_VECTOR_FP_CONVERTS/USE_VECTOR_CONVERTS

2021-09-15 Thread Uros Bizjak via Gcc-patches
> diff --git a/gcc/testsuite/gcc.target/i386/pr101900-1.c > b/gcc/testsuite/gcc.target/i386/pr101900-1.c > new file mode 100644 > index 000..0a45f8e340a > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr101900-1.c > @@ -0,0 +1,18 @@ > +/* { dg-do compile } */ >

Re: Revisit Core tunning flags

2013-09-22 Thread Wei Mi
>> > http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00884.html > > This patch seems reasonable. (in fact I have pretty much same in my tree) > use_vector_fp_converts is actually trying to solve the same problem in AMD > hardware - you need to type the whole register when convert

[PATCH 0/4] Update mtune=tremont

2021-09-15 Thread lili.cui--- via Gcc-patches
rly handle USE_VECTOR_FP_CONVERTS/USE_VECTOR_CONVERTS x86: Add TARGET_SSE_PARTIAL_REG_[FP_]CONVERTS_DEPENDENCY gcc/common/config/i386/i386-common.c | 2 +- gcc/config/i386/i386-features.c | 23 +++- gcc/config/i386/i386-options.c| 2 +- gcc/config/i

[PATCH 2/3, x86] X86 Silvermont vector cost model tune

2014-04-15 Thread Evgeny Stupachenko
/config/i386/x86-tune.def +++ b/gcc/config/i386/x86-tune.def @@ -386,6 +386,10 @@ DEF_TUNE (X86_TUNE_USE_VECTOR_FP_CONVERTS, "use_vector_fp_converts", from integer to FP. */ DEF_TUNE (X86_TUNE_USE_VECTOR_CONVERTS, "use_vector_converts", m_AMDFAM10) +/* X86_TUNE_SLOW_SHUFB

Re: [PATCH 3/4] [PATCH 3/4] x86: Properly handle USE_VECTOR_FP_CONVERTS/USE_VECTOR_CONVERTS

2021-09-17 Thread Jakub Jelinek via Gcc-patches
On Fri, Sep 17, 2021 at 08:35:57AM +0200, Uros Bizjak via Gcc-patches wrote: > > > On Wed, Sep 15, 2021 at 10:10 AM wrote: > > > > > > > > From: "H.J. Lu" > > > > > > > > Check TARGET_USE_VECTOR_FP_CONVERTS or > > > TARGET_USE_VECTOR_CONVERTS when > > > > handling avx_partial_xmm_update attribute

Re: [PATCH 3/4] [PATCH 3/4] x86: Properly handle USE_VECTOR_FP_CONVERTS/USE_VECTOR_CONVERTS

2021-09-17 Thread Hongtao Liu via Gcc-patches
On Sat, Sep 18, 2021 at 7:50 AM Jakub Jelinek via Gcc-patches wrote: > > On Fri, Sep 17, 2021 at 08:35:57AM +0200, Uros Bizjak via Gcc-patches wrote: > > > > On Wed, Sep 15, 2021 at 10:10 AM wrote: > > > > > > > > > > From: "H.J. Lu" > > > > > > > > > > Check TARGET_USE_VECTOR_FP_CONVERTS or > >

Revisit Core tunning flags

2013-09-21 Thread Jan Hubicka
. Other change dropped is use_vector_fp_converts that seems to improve Core performance. I benchmarked the patch on SPEC2k and earlier it was benchmarked on 2k6 and the performance difference seems in noise. It causes about 0.3% code size reduction. Main motivation for the patch is to drop some

Re: Revisit Core tunning flags

2013-09-21 Thread Xinliang David Li
SUB overwrite all flags, whereas INC and DEC do not, therefore > creating false dependencies on earlier instructions that set the flags. > > Other change dropped is use_vector_fp_converts that seems to improve > Core performance. I did not see this in your patch, but Wei has this t

Re: Revisit Core tunning flags

2013-09-21 Thread Xinliang David Li
e 33. (M impact, H generality) >> INC and DEC instructions should be replaced with ADD or SUB instructions, >> because ADD and SUB overwrite all flags, whereas INC and DEC do not, >> therefore >> creating false dependencies on earlier instructions that set the flags. >> >>

New option to do fine grain control [on|off] of micro-arch specific features : -mtune-ctrl=....

2013-08-03 Thread Xinliang David Li
ule") +DEF_TUNE (X86_TUNE_USE_BT, "use_bt") +DEF_TUNE (X86_TUNE_USE_INCDEC, "use_incdec") +DEF_TUNE (X86_TUNE_PAD_RETURNS, "pad_returns") +DEF_TUNE (X86_TUNE_PAD_SHORT_FUNCTION, "pad_short_function") +DEF_TUNE (X86_TUNE_EXT_80387_CONSTANTS, "ext_80387_cons
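
The DEF_TUNE entries in this patch pair an index name with an option string. One common way to consume such a list is the X-macro pattern: expand the same list once into an enum and once into a name table, then look names up when parsing a feature string. The sketch below illustrates that pattern under stated assumptions. It is not GCC's implementation; TUNE_LIST, tune_names and tune_ctrl_apply are hypothetical names, and the three features are just a sample. The leading '^' for switching a feature off follows the usage quoted elsewhere in this list.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical feature list standing in for a .def file of DEF_TUNE lines. */
#define TUNE_LIST(X)                    \
    X(TUNE_USE_BT, "use_bt")            \
    X(TUNE_USE_INCDEC, "use_incdec")    \
    X(TUNE_PAD_RETURNS, "pad_returns")

/* First expansion: one enum value per feature, plus a count. */
#define AS_ENUM(id, name) id,
enum tune_index { TUNE_LIST(AS_ENUM) TUNE_LAST };

/* Second expansion: the option names, in the same order as the enum. */
#define AS_NAME(id, name) name,
static const char *const tune_names[TUNE_LAST] = { TUNE_LIST(AS_NAME) };

/* Apply one item of a feature list: "name" sets the feature, "^name"
   clears it. Returns false when the name is not in the table. */
bool tune_ctrl_apply(bool features[TUNE_LAST], const char *item)
{
    assert(features != NULL && item != NULL);

    bool value = true;
    if (item[0] == '^') {
        value = false;
        item++;                 /* skip the negation prefix */
    }
    for (int i = 0; i < TUNE_LAST; i++) {
        if (strcmp(item, tune_names[i]) == 0) {
            features[i] = value;
            return true;
        }
    }
    return false;
}
```

Keeping the list in one place means a new feature needs a single added line, and the enum and name table cannot drift out of step.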

Re: Fwd: [PATCH] Scheduling result adjustment to enable macro-fusion

2013-09-24 Thread Wei Mi
ix86_tune_features[X86_TUNE_VECTORIZE_DOUBLE] diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def index 4ae5f70..3d395b0 100644 --- a/gcc/config/i386/x86-tune.def +++ b/gcc/config/i386/x86-tune.def @@ -193,10 +193,24 @@ DEF_TUNE (X86_TUNE_USE_VECTOR_FP_CONVERTS, "us

Re: Fwd: [PATCH] Scheduling result adjustment to enable macro-fusion

2013-10-16 Thread Wei Mi
ix86_tune_features[X86_TUNE_FUSE_ALU_AND_BRANCH] #define TARGET_OPT_AGU ix86_tune_features[X86_TUNE_OPT_AGU] #define TARGET_VECTORIZE_DOUBLE \ ix86_tune_features[X86_TUNE_VECTORIZE_DOUBLE] diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def index 4ae5f70..3d395b0 10

Re: Fwd: [PATCH] Scheduling result adjustment to enable macro-fusion

2013-09-24 Thread Wei Mi
TARGET_VECTORIZE_DOUBLE \ ix86_tune_features[X86_TUNE_VECTORIZE_DOUBLE] diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def index 4ae5f70..3d395b0 100644 --- a/gcc/config/i386/x86-tune.def +++ b/gcc/config/i386/x86-tune.def @@ -193,10 +193,24 @@ DEF_TUNE (X86_TUNE_USE_VECTOR_FP_CONVERTS, "

Re: Fwd: [PATCH] Scheduling result adjustment to enable macro-fusion

2013-10-03 Thread Wei Mi
_AND_BRANCH_32) +#define TARGET_FUSE_CMP_AND_BRANCH_SOFLAGS \ + ix86_tune_features[X86_TUNE_FUSE_CMP_AND_BRANCH_SOFLAGS] +#define TARGET_FUSE_ALU_AND_BRANCH \ + ix86_tune_features[X86_TUNE_FUSE_ALU_AND_BRANCH] #define TARGET_OPT_AGU ix86_tune_features[X86_TUNE_OPT_AGU] #define

Re: Fwd: [PATCH] Scheduling result adjustment to enable macro-fusion

2013-11-01 Thread Wei Mi
> ix86_tune_features[X86_TUNE_USE_VECTOR_FP_CONVERTS] > #define TARGET_USE_VECTOR_CONVERTS \ > ix86_tune_features[X86_TUNE_USE_VECTOR_CONVERTS] > +#define TARGET_FUSE_CMP_AND_BRANCH_32 \ > + ix86_tune_features[X86_TUNE_FUSE_CMP_AND_BRANCH_32] > +#define TARG

-mtune-ctrl=.... support, round-2

2013-08-07 Thread Xinliang David Li
F_TUNE (X86_TUNE_EPILOGUE_USING_MOVE, "epilogue_using_move") -DEF_TUNE (X86_TUNE_SHIFT1, "shift1") -DEF_TUNE (X86_TUNE_USE_FFREEP, "use_ffreep") -DEF_TUNE (X86_TUNE_INTER_UNIT_MOVES_TO_VEC, "inter_unit_moves_to_vec") -DEF_TUNE (X86_TUNE_INTER_UNIT_MOVES_FROM_VEC, "

Re: -mtune-ctrl=.... support, round-2

2013-08-12 Thread Xinliang David Li
"sse_split_regs") -DEF_TUNE (X86_TUNE_SSE_TYPELESS_STORES, "sse_typeless_stores") -DEF_TUNE (X86_TUNE_SSE_LOAD0_BY_PXOR, "sse_load0_by_pxor") -DEF_TUNE (X86_TUNE_MEMORY_MISMATCH_STALL, "memory_mismatch_stall") -DEF_TUNE (X86_TUNE_PROLOGUE_USING_MOVE, "pr

Document x86-tune options

2013-09-29 Thread Jan Hubicka
6_TUNE_MOVE_M1_VIA_OR: On pentiums, it is faster to load -1 via OR than a MOV. */ DEF_TUNE (X86_TUNE_MOVE_M1_VIA_OR, "move_m1_via_or", m_PENT) + /* X86_TUNE_NOT_UNPAIRABLE: NOT is not pairable on Pentium, while XOR is, but one byte longer. */ DEF_TUNE (X86_TUNE_NOT_UNPAIRA