On Tue, Jan 28, 2020 at 6:45 AM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Tue, Jan 28, 2020 at 3:32 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> > On Mon, Jan 27, 2020 at 11:04 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> > >
> > > On Mon, Jan 27, 2020 at 11:17 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> > > >
> > > > On Mon, Jan 27, 2020 at 12:26 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> > > > >
> > > > > On Mon, Jan 27, 2020 at 7:23 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> > > > > >
> > > > > > movaps/movups is one byte shorter than movdqa/movdqu.  But it isn't the
> > > > > > case for AVX nor AVX512.  We should disable TARGET_SSE_TYPELESS_STORES
> > > > > > for TARGET_AVX.
> > > > > >
> > > > > > gcc/
> > > > > >
> > > > > >         PR target/91461
> > > > > >         * config/i386/i386.h (TARGET_SSE_TYPELESS_STORES): Disable for
> > > > > >         TARGET_AVX.
> > > > > >         * config/i386/i386.md (*movoi_internal_avx): Remove
> > > > > >         TARGET_SSE_TYPELESS_STORES check.
> > > > > >
> > > > > > gcc/testsuite/
> > > > > >
> > > > > >         PR target/91461
> > > > > >         * gcc.target/i386/pr91461-1.c: New test.
> > > > > >         * gcc.target/i386/pr91461-2.c: Likewise.
> > > > > >         * gcc.target/i386/pr91461-3.c: Likewise.
> > > > > >         * gcc.target/i386/pr91461-4.c: Likewise.
> > > > > >         * gcc.target/i386/pr91461-5.c: Likewise.
> > > > > > ---
> > > > > >  gcc/config/i386/i386.h                    |  4 +-
> > > > > >  gcc/config/i386/i386.md                   |  4 +-
> > > > > >  gcc/testsuite/gcc.target/i386/pr91461-1.c | 66 ++++++++++++++++++++
> > > > > >  gcc/testsuite/gcc.target/i386/pr91461-2.c | 19 ++++++
> > > > > >  gcc/testsuite/gcc.target/i386/pr91461-3.c | 76 +++++++++++++++++++++++
> > > > > >  gcc/testsuite/gcc.target/i386/pr91461-4.c | 21 +++++++
> > > > > >  gcc/testsuite/gcc.target/i386/pr91461-5.c | 17 +++++
> > > > > >  7 files changed, 203 insertions(+), 4 deletions(-)
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-1.c
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-2.c
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-3.c
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-4.c
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-5.c
> > > > > >
> > > > > > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> > > > > > index 943e9a5c783..c134b04c5c4 100644
> > > > > > --- a/gcc/config/i386/i386.h
> > > > > > +++ b/gcc/config/i386/i386.h
> > > > > > @@ -516,8 +516,10 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
> > > > > >  #define TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL \
> > > > > >         ix86_tune_features[X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL]
> > > > > >  #define TARGET_SSE_SPLIT_REGS  ix86_tune_features[X86_TUNE_SSE_SPLIT_REGS]
> > > > > > +/* NB: movaps/movups is one byte shorter than movdqa/movdqu.  But it
> > > > > > +   isn't the case for AVX nor AVX512.  */
> > > > > >  #define TARGET_SSE_TYPELESS_STORES \
> > > > > > -       ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES]
> > > > > > +       (!TARGET_AVX && ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES])
> > > > >
> > > > > This is wrong place to disable the feature.
> > > >
> > > > Like this?
> > >
> > > No.
> > >
> > > There is a mode attribute in i386.md/sse.md for relevant patterns.
> > > Please adapt calculation of mode attributes instead.
> > >
> >
> > Like this?
>
> Still no.
>
> You could move
>
>   (match_test "TARGET_AVX")
>     (const_string "TI")
>
> up to bypass the cases below.
>

I don't think we can do that.  There are 2 cases where we prefer
movaps/movups:

/* Use packed single precision instructions where possible.  I.e.
   movups instead of movupd.  */
DEF_TUNE (X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL, "sse_packed_single_insn_optimal",
          m_BDVER | m_ZNVER)

/* X86_TUNE_SSE_TYPELESS_STORES: Always movaps/movups for 128bit stores.  */
DEF_TUNE (X86_TUNE_SSE_TYPELESS_STORES, "sse_typeless_stores",
          m_AMD_MULTIPLE | m_CORE_ALL | m_GENERIC)

We should always use movaps/movups for
TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL.  It is wrong to bypass
TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL with TARGET_AVX, since the
m_BDVER | m_ZNVER processors support AVX.

-- 
H.J.
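
P.S. To make the ordering concern concrete, here is a rough standalone C
model of the mode selection.  It is only a sketch and not GCC code: the
struct, the enum and both function names are made up for illustration,
the three flags stand in for TARGET_AVX,
TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL and TARGET_SSE_TYPELESS_STORES,
and MODE_V4SF/MODE_TI stand in for "emit movaps/movups" and "emit
movdqa/movdqu".  The second ordering is just one arrangement that is
consistent with the argument above, not necessarily the final patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the target/tuning flags discussed above.  */
struct tune_flags
{
  bool avx;                    /* models TARGET_AVX */
  bool packed_single_optimal;  /* models TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL */
  bool typeless_stores;        /* models TARGET_SSE_TYPELESS_STORES */
};

/* MODE_V4SF models choosing movaps/movups, MODE_TI movdqa/movdqu.  */
enum move_mode { MODE_TI, MODE_V4SF };

/* Ordering with the AVX test moved to the top: every AVX target gets
   TI, so the packed-single preference is never consulted.  */
static enum move_mode
mode_avx_first (struct tune_flags t)
{
  if (t.avx)
    return MODE_TI;
  if (t.packed_single_optimal)
    return MODE_V4SF;
  if (t.typeless_stores)
    return MODE_V4SF;
  return MODE_TI;
}

/* Ordering that keeps the packed-single preference ahead of the AVX
   test; AVX then only overrides the typeless-stores size heuristic.  */
static enum move_mode
mode_packed_single_first (struct tune_flags t)
{
  if (t.packed_single_optimal)
    return MODE_V4SF;
  if (t.avx)
    return MODE_TI;
  if (t.typeless_stores)
    return MODE_V4SF;
  return MODE_TI;
}
```

With avx and packed_single_optimal both set (the BDVER/ZNVER situation),
the first function returns MODE_TI while the second returns MODE_V4SF.
For an AVX target that only has typeless_stores set, both return MODE_TI.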