https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83479

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #7)
> but it seems this is how _mm512_set1_pd works:
> 
> extern __inline __m512d
> __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> _mm512_set1_pd (double __A)
> {
>   return (__m512d) __builtin_ia32_broadcastsd512 (__extension__
>                                                   (__v2df) { __A, },
>                                                   (__v8df)
>                                                   _mm512_undefined_pd (),
>                                                   (__mmask8) -1);
> }
> 
> given we now have VEC_DUPLICATE_EXPR it would be nice to open-code
> those builtins somehow (or for GCC 9).

The builtin handles the masking and zeroing/previous value, which is something
the generic code can't easily handle.  But the backend gimple folder could
fold those into VEC_DUPLICATE_EXPR, or VEC_PERM_EXPR with an all-zeros
selector, if the mask
is all ones.

(In reply to Daniel Fruzynski from comment #6)
> One correction: In c#4 line 17 has incorrect index, should be 8 instead of
> 9. For some reason gcc did not complain here.
> 
> vLastRow = _mm512_load_pd (&data[8][0]);

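That is because in C/C++ a const double data[9][8] parameter is actually
const double (*data)[8].  The outer dimension is not part of the parameter's
type, so the compiler has no bound to check the row index against.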
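
A minimal C11 sketch of that adjustment, for illustration only.  The names
param_is_row_pointer, first_of_row and table are hypothetical and do not
come from the testcase in this PR; _Generic is used just to inspect the
adjusted parameter type.

```c
/* Hypothetical helper: reports whether the parameter, although written
   as a 9x8 array, has been adjusted to "pointer to array of 8 const
   double".  */
static int
param_is_row_pointer (const double data[9][8])
{
  return _Generic (data, const double (*)[8]: 1, default: 0);
}

/* Hypothetical helper: the row index is plain pointer arithmetic on
   data, so the 9 written in the declaration puts no limit on it.  */
static double
first_of_row (const double data[9][8], int row)
{
  return data[row][0];
}

/* Illustrative data: row r starts with the value r + 1, the remaining
   elements are zero-initialized.  */
static const double table[9][8] = {
  { 1.0 }, { 2.0 }, { 3.0 }, { 4.0 }, { 5.0 },
  { 6.0 }, { 7.0 }, { 8.0 }, { 9.0 }
};
```

Here first_of_row (table, 8) reads the last valid row.  Inside the
function, data[9] has the same pointer type as any other row, which is
why an out-of-range constant index is not necessarily diagnosed there.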
