[Bug target/83479] Register spilling in AVX code

rguenther at suse dot de Tue, 19 Dec 2017 06:43:11 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83479


--- Comment #9 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 19 Dec 2017, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83479
> 
> --- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> (In reply to Richard Biener from comment #7)
> > but it seems this is how _mm512_set1_pd works:
> > 
> > extern __inline __m512d
> > __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> > _mm512_set1_pd (double __A)
> > {
> >   return (__m512d) __builtin_ia32_broadcastsd512 (__extension__
> >                                                   (__v2df) { __A, },
> >                                                   (__v8df)
> >                                                   _mm512_undefined_pd (),
> >                                                   (__mmask8) -1);
> > }
> > 
> > given we now have VEC_DUPLICATE_EXPR it would be nice to open-code
> > those builtins somehow (or for GCC 9).
> 
> The builtin handles the masking and zeroing/previous value, which is something
> the generic code can't easily handle.  But we could in backend gimple folder
> fold those into VEC_DUPLICATE_EXPR or VEC_PERM_EXPR with all zeros if the mask
> is all ones.

Yeah, but this is _mm512_set1_pd, not some masking intrinsic.  We'd need
to think about how the generic vector extension can be used to do
a splat, of course.  Apart from just writing

  return (__m512d) { __A, __A, __A, ... };

I suppose we expected that combine would never be able to match this
to the broadcast instruction, which presumably only exists with all
the bells and whistles.
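
For reference, a minimal sketch of the open-coded splat written with the
generic vector extension.  The names v8df, splat8 and check_splat8 are
made up for illustration and are not what avx512fintrin.h uses.  Whether
GCC turns the initializer into a single broadcast depends on the compiler
version and flags, and building without -mavx512f may give a -Wpsabi
warning about the 64-byte vector return.

```c
#include <assert.h>

/* Hypothetical 8 x double vector type (64 bytes) built with the GCC
   generic vector extension; stands in for __m512d in this sketch.  */
typedef double v8df __attribute__ ((vector_size (64)));

/* Splat by listing every element explicitly.  This is the generic
   form of the "{ __A, __A, __A, ... }" initializer.  */
static inline v8df
splat8 (double a)
{
  return (v8df) { a, a, a, a, a, a, a, a };
}

/* Check that every lane of the splat holds the scalar.  Vector
   subscripting is part of the same GCC extension.  */
static void
check_splat8 (double a)
{
  v8df v = splat8 (a);
  for (int i = 0; i < 8; i++)
    assert (v[i] == a);
}
```

Comparing the assembly for splat8 at -O2 -mavx512f against the intrinsic
version would show whether the generic form ends up as the same
broadcast.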
