https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66862

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |kyukhin at gcc dot gnu.org,
                   |                            |uros at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
If you run sed -i 's/short/int/' on the testcase, then e.g. with -mavx2 it is
vectorized with vpmaskmovd.  But AVX2 does not have a masked store for packed
16-bit integers, and as Richard mentioned, the vpminuw/vmovdqu sequence that
icc emits is IMHO invalid: the unconditional vmovdqu writes every element of
the vector, including ones the scalar loop never stores to, so it introduces a
store data race.  I see no wording in the OpenMP standard that would allow
introducing store data races, even in omp simd regions.
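
To make the data race concern concrete, here is a minimal portable C sketch.
It is not the testcase from this PR: the loop, the function names and the
LANES constant are all hypothetical, and plain loops stand in for the vector
instructions.  It only models the difference between a conditional store and
a "load, select, store everything" emulation.

```c
#include <stddef.h>

#define LANES 16 /* number of 16-bit elements in one 256-bit vector */

/* Source semantics (hypothetical loop): only lanes with b[i] > 0 are stored. */
void store_conditional(short *a, const short *b)
{
    for (size_t i = 0; i < LANES; i++)
        if (b[i] > 0)
            a[i] = b[i];
}

/* Emulation, step 1: full-width load of the destination plus a per-lane
   select between the new value and the old one. */
void blend_load(short *tmp, const short *a, const short *b)
{
    for (size_t i = 0; i < LANES; i++)
        tmp[i] = b[i] > 0 ? b[i] : a[i];
}

/* Emulation, step 2: full-width store.  Every lane of a[] is written, even
   the lanes where the condition was false. */
void blend_store(short *a, const short *tmp)
{
    for (size_t i = 0; i < LANES; i++)
        a[i] = tmp[i];
}
```

If another thread stores to a[1] between blend_load and blend_store, and
b[1] <= 0, then blend_store writes the stale value back and that store is
lost.  With store_conditional the same lane is never written, so the other
thread's value survives.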

Now, it seems AVX512BW (and AVX512VL in some cases) has the needed
instructions, in particular VMOVDQU{8,16}, but this is not reflected in the
maskload<mode> and maskstore<mode> expanders.  CCing Kirill and Uros on this.
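
For illustration, a rough sketch of what a masked 16-bit store looks like at
the intrinsics level.  The loop and the function names are hypothetical (not
taken from the testcase), and this shows only the instruction's behaviour,
not how the expanders would be written.  It assumes GCC or clang on x86 and
falls back to the scalar loop elsewhere or when the CPU lacks AVX512BW.

```c
#include <stddef.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>

/* Hypothetical helper: store b[i] to a[i] only where b[i] > 0, 32 lanes at
   a time, using the AVX512BW masked store (vmovdqu16 with a write mask). */
__attribute__((target("avx512bw")))
static void cond_store_avx512bw(short *a, const short *b, size_t n)
{
    size_t i = 0;
    for (; i + 32 <= n; i += 32) {
        __m512i vb = _mm512_loadu_si512((const void *)(b + i));
        /* One mask bit per 16-bit lane: set where b[i] > 0. */
        __mmask32 k = _mm512_cmpgt_epi16_mask(vb, _mm512_setzero_si512());
        /* Lanes with a clear mask bit are not written at all. */
        _mm512_mask_storeu_epi16(a + i, k, vb);
    }
    for (; i < n; i++) /* scalar tail */
        if (b[i] > 0)
            a[i] = b[i];
}
#endif

/* Dispatcher: uses the masked-store path when the CPU supports it. */
void cond_store(short *a, const short *b, size_t n)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    if (__builtin_cpu_supports("avx512bw")) {
        cond_store_avx512bw(a, b, n);
        return;
    }
#endif
    for (size_t i = 0; i < n; i++)
        if (b[i] > 0)
            a[i] = b[i];
}
```

Unlike a blend followed by a full-width store, the masked store leaves the
unselected lanes untouched in memory, so it does not introduce writes that
the scalar loop would not perform.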
