Hi,
I modified the patch as H.J. suggested (patch attached).
Is it OK to commit to trunk now?
Thanks,
Changpeng
________________________________________
From: H.J. Lu [[email protected]]
Sent: Friday, June 17, 2011 5:44 PM
To: Fang, Changpeng
Cc: Richard Guenther; [email protected]
Subject: Re: [PATCH, PR 49089] Don't split AVX256 unaligned loads by default on bdver1 and generic
On Fri, Jun 17, 2011 at 3:18 PM, Fang, Changpeng <[email protected]> wrote:
> Hi,
>
> I added AVX256_SPLIT_UNALIGNED_STORE to ix86_tune_indices
> and put m_COREI7, m_BDVER1 and m_GENERIC as the targets that
> enable it.
>
> Is this OK?
Can you do something similar to how MASK_ACCUMULATE_OUTGOING_ARGS
is handled?
Thanks.
H.J.
From 50310fc367348b406fc88d54c3ab54d1a304ad52 Mon Sep 17 00:00:00 2001
From: Changpeng Fang <chfang@huainan.(none)>
Date: Mon, 13 Jun 2011 13:13:32 -0700
Subject: [PATCH 2/2] pr49089: enable avx256 splitting unaligned load/store only when beneficial
* config/i386/i386.c (avx256_split_unaligned_load): New definition.
(avx256_split_unaligned_store): New definition.
(ix86_option_override_internal): Enable avx256 unaligned load(store)
splitting only when avx256_split_unaligned_load(store) is set.
---
gcc/config/i386/i386.c | 12 ++++++++++--
1 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 7b266b9..3bc0b53 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2121,6 +2121,12 @@ static const unsigned int x86_arch_always_fancy_math_387
   = m_PENT | m_ATOM | m_PPRO | m_AMD_MULTIPLE | m_PENT4
     | m_NOCONA | m_CORE2I7 | m_GENERIC;
 
+static const unsigned int x86_avx256_split_unaligned_load
+  = m_COREI7 | m_GENERIC;
+
+static const unsigned int x86_avx256_split_unaligned_store
+  = m_COREI7 | m_BDVER1 | m_GENERIC;
+
 /* In case the average insn count for single function invocation is
    lower than this constant, emit fast (but longer) prologue and
    epilogue code. */
@@ -4194,9 +4200,11 @@ ix86_option_override_internal (bool main_args_p)
       if (flag_expensive_optimizations
           && !(target_flags_explicit & MASK_VZEROUPPER))
         target_flags |= MASK_VZEROUPPER;
-      if (!(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
+      if ((x86_avx256_split_unaligned_load & ix86_tune_mask)
+          && !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
         target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD;
-      if (!(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_STORE))
+      if ((x86_avx256_split_unaligned_store & ix86_tune_mask)
+          && !(target_flags_explicit & MASK_AVX256_SPLIT_UNALIGNED_STORE))
         target_flags |= MASK_AVX256_SPLIT_UNALIGNED_STORE;
     }
 }
--
1.7.0.4
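
For readers who want to see the effect of the new condition outside of i386.c, here is a minimal standalone sketch of the gating logic. It is not the GCC code itself: the bit values, the names M_*, SPLIT_LOAD, SPLIT_STORE, and the functions apply_split_defaults and check_split_defaults are all hypothetical stand-ins for the real m_* masks, MASK_AVX256_SPLIT_UNALIGNED_* flags, target_flags_explicit and ix86_tune_mask. It only illustrates that a default is turned on when the selected tuning is in the per-feature mask and the user has not set the option explicitly.

```c
#include <assert.h>

/* Hypothetical bit values for illustration only; the real m_* tuning
   masks and MASK_* target flags are defined in the i386 back end.  */
enum
{
  M_COREI7 = 1 << 0,
  M_BDVER1 = 1 << 1,
  M_GENERIC = 1 << 2,
  M_ATOM = 1 << 3
};

enum
{
  SPLIT_LOAD = 1 << 0,
  SPLIT_STORE = 1 << 1
};

/* Tunings for which splitting is assumed beneficial, mirroring the
   two masks added by the patch.  */
static const unsigned int split_load_tunes = M_COREI7 | M_GENERIC;
static const unsigned int split_store_tunes
  = M_COREI7 | M_BDVER1 | M_GENERIC;

/* Return FLAGS with the split defaults applied.  A default is set only
   if TUNE_MASK selects a tuning listed for that feature and the
   corresponding bit is not set in EXPLICIT_FLAGS (i.e. the user did
   not pass the option on the command line).  */
static unsigned int
apply_split_defaults (unsigned int flags, unsigned int explicit_flags,
                      unsigned int tune_mask)
{
  if ((split_load_tunes & tune_mask) && !(explicit_flags & SPLIT_LOAD))
    flags |= SPLIT_LOAD;
  if ((split_store_tunes & tune_mask) && !(explicit_flags & SPLIT_STORE))
    flags |= SPLIT_STORE;
  return flags;
}

/* A few sanity checks on the sketch.  */
static void
check_split_defaults (void)
{
  /* bdver1: only the store split is enabled by default.  */
  assert (apply_split_defaults (0, 0, M_BDVER1) == SPLIT_STORE);
  /* generic: both splits are enabled by default.  */
  assert (apply_split_defaults (0, 0, M_GENERIC)
          == (SPLIT_LOAD | SPLIT_STORE));
  /* A tuning in neither mask gets no default.  */
  assert (apply_split_defaults (0, 0, M_ATOM) == 0);
  /* User explicitly disabled the store split: leave it off.  */
  assert (apply_split_defaults (0, SPLIT_STORE, M_BDVER1) == 0);
}
```

Calling check_split_defaults () from a test driver exercises the four cases. The explicit-flag check is kept after the tuning check, as in the patch, so that an explicit command-line choice is never overridden by the tuning default.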