Re: [RFC] Slightly fix up vgather* patterns

Uros Bizjak Sun, 09 Oct 2011 03:56:01 -0700

On Sat, Oct 8, 2011 at 5:43 PM, Jakub Jelinek <ja...@redhat.com> wrote:


> The AVX2 docs say that the insns will #UD if any of the mask, src and index
> registers are the same, but e.g. on
> #include <x86intrin.h>
>
> __m256 m;
> float f[1024];
>
> __m256
> foo (void)
> {
>  __m256i mi = (__m256i) m;
>  return _mm256_mask_i32gather_ps (m, f, mi, m, 4);
> }
>
> which is IMHO valid and should for m being zero vector just return a
> zero vector and clear mask (in this case it was already cleared) we compile
> it as
>        vmovdqa m(%rip), %ymm1
>        vmovaps %ymm1, %ymm0
>        vgatherdps      %ymm1, (%rax, %ymm1, 4), %ymm0
> and thus IMHO it will #UD.  Also, the insns should make it clear that
> the mask register is modified too (the patch clobbers it, perhaps
> we could instead say that it zeros the register (which is true if
> it doesn't segfault), but then what if a segfault handler chooses to
> continue with the next insn and doesn't clear the mask register?).
> Still, the insn description is imprecise, saying that it loads from mem
> at the address register is wrong and perhaps some DCE might delete
> what shouldn't be deleted.  So, either it should (use (mem (scratch)))
> or something similar, or in the unspec list all the memory locations
> that are being read
> (mem:<scalarssemode> (plus:SI (reg:SI) (vec_select:SI (match_operand:V4SI)
> (parallel [(const_int N)]))))
> for N 0 through something (but it is complicated by Pmode size vs.
> the need to do nothing/truncate/sign_extend the vec_select to the right
> mode).
>
> What do you think?

Regarding the clear of mask operand: I agree that this should be
modelled as a clobber. Zeroing can't be guaranteed due to the fact you
described above.

About memory - can't we use (mem:BLK (match_operand:P
"register_operand" "r")) here?

BTW: No need to use %c modifier:

/* Meaning of CODE:
   L,W,B,Q,S,T -- print the opcode suffix for specified size of operand.
   C -- print opcode suffix for set/cmov insn.
   c -- like C, but print reversed condition
   ...
*/

Uros.

Re: [RFC] Slightly fix up vgather* patterns

Reply via email to