On Tue, Jan 28, 2020 at 6:45 AM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Tue, Jan 28, 2020 at 3:32 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> > On Mon, Jan 27, 2020 at 11:04 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> > >
> > > On Mon, Jan 27, 2020 at 11:17 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> > > >
> > > > On Mon, Jan 27, 2020 at 12:26 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> > > > >
> > > > > On Mon, Jan 27, 2020 at 7:23 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> > > > > >
> > > > > > movaps/movups is one byte shorter than movdqa/movdqu.  But it isn't the
> > > > > > case for AVX nor AVX512.  We should disable TARGET_SSE_TYPELESS_STORES
> > > > > > for TARGET_AVX.
> > > > > >
> > > > > > gcc/
> > > > > >
> > > > > >         PR target/91461
> > > > > >         * config/i386/i386.h (TARGET_SSE_TYPELESS_STORES): Disable for
> > > > > >         TARGET_AVX.
> > > > > >         * config/i386/i386.md (*movoi_internal_avx): Remove
> > > > > >         TARGET_SSE_TYPELESS_STORES check.
> > > > > >
> > > > > > gcc/testsuite/
> > > > > >
> > > > > >         PR target/91461
> > > > > >         * gcc.target/i386/pr91461-1.c: New test.
> > > > > >         * gcc.target/i386/pr91461-2.c: Likewise.
> > > > > >         * gcc.target/i386/pr91461-3.c: Likewise.
> > > > > >         * gcc.target/i386/pr91461-4.c: Likewise.
> > > > > >         * gcc.target/i386/pr91461-5.c: Likewise.
> > > > > > ---
> > > > > >  gcc/config/i386/i386.h                    |  4 +-
> > > > > >  gcc/config/i386/i386.md                   |  4 +-
> > > > > >  gcc/testsuite/gcc.target/i386/pr91461-1.c | 66 ++++++++++++++++++++
> > > > > >  gcc/testsuite/gcc.target/i386/pr91461-2.c | 19 ++++++
> > > > > >  gcc/testsuite/gcc.target/i386/pr91461-3.c | 76 +++++++++++++++++++++++
> > > > > >  gcc/testsuite/gcc.target/i386/pr91461-4.c | 21 +++++++
> > > > > >  gcc/testsuite/gcc.target/i386/pr91461-5.c | 17 +++++
> > > > > >  7 files changed, 203 insertions(+), 4 deletions(-)
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-1.c
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-2.c
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-3.c
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-4.c
> > > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-5.c
> > > > > >
> > > > > > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> > > > > > index 943e9a5c783..c134b04c5c4 100644
> > > > > > --- a/gcc/config/i386/i386.h
> > > > > > +++ b/gcc/config/i386/i386.h
> > > > > > @@ -516,8 +516,10 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
> > > > > >  #define TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL \
> > > > > >         ix86_tune_features[X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL]
> > > > > >  #define TARGET_SSE_SPLIT_REGS  ix86_tune_features[X86_TUNE_SSE_SPLIT_REGS]
> > > > > > +/* NB: movaps/movups is one byte shorter than movdqa/movdqu.  But it
> > > > > > +   isn't the case for AVX nor AVX512.  */
> > > > > >  #define TARGET_SSE_TYPELESS_STORES \
> > > > > > -       ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES]
> > > > > > +       (!TARGET_AVX && ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES])
> > > > >
> > > > > This is the wrong place to disable the feature.
> > > >
> > > > Like this?
> > >
> > > No.
> > >
> > > There is a mode attribute in i386.md/sse.md for relevant patterns.
> > > Please adapt calculation of mode attributes instead.
> > >
> >
> > Like this?
>
> Still no.
>
> You could move
>
> (match_test "TARGET_AVX")
>   (const_string "TI")
>
> up to bypass the cases below.
>

I don't think we can do that.   There are 2 cases where we prefer movaps/movups:

/* Use packed single precision instructions where possible.  I.e.
   movups instead of movupd.  */
DEF_TUNE (X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL,
          "sse_packed_single_insn_optimal",
          m_BDVER | m_ZNVER)

/* X86_TUNE_SSE_TYPELESS_STORES: Always movaps/movups for 128bit stores.   */
DEF_TUNE (X86_TUNE_SSE_TYPELESS_STORES, "sse_typeless_stores",
          m_AMD_MULTIPLE | m_CORE_ALL | m_GENERIC)

We should always use movaps/movups when TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL
is set.  It would be wrong to let TARGET_AVX bypass
TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL, since the m_BDVER and m_ZNVER
processors it is enabled for also support AVX.
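
The ordering argued for here can be sketched in plain C.  This is only an
illustration, not the mode-attribute code in i386.md/sse.md: the struct, the
enum and pick_store_mode are hypothetical names, and the sketch assumes that
the V4SF mode selects movaps/movups while the TI mode selects movdqa/movdqu.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the three target tests discussed above.  */
struct tune_flags
{
  bool avx;                         /* TARGET_AVX */
  bool packed_single_insn_optimal;  /* TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL */
  bool typeless_stores;             /* TARGET_SSE_TYPELESS_STORES */
};

/* Assumed mapping: MODE_V4SF -> movaps/movups, MODE_TI -> movdqa/movdqu.  */
enum insn_mode { MODE_V4SF, MODE_TI };

/* Pick the mode for a 128-bit integer store.  The packed-single
   preference is tested before AVX, so AVX-capable processors that set
   it (the m_BDVER | m_ZNVER case) still get movaps/movups.  Only the
   size-motivated typeless-store preference is skipped under AVX.  */
static enum insn_mode
pick_store_mode (const struct tune_flags *f)
{
  if (f->packed_single_insn_optimal)
    return MODE_V4SF;
  if (f->avx)
    return MODE_TI;
  if (f->typeless_stores)
    return MODE_V4SF;
  return MODE_TI;
}
```

Moving the AVX test to the top would instead return MODE_TI for the first
case, which is the bypass objected to above.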

-- 
H.J.
