100711-like splitting

Hongtao Liu via Gcc-patches Sat, 24 Jun 2023 23:36:10 -0700

On Sun, Jun 25, 2023 at 2:25 PM Jan Beulich <jbeul...@suse.com> wrote:
>
> On 25.06.2023 07:12, Hongtao Liu wrote:
> > On Wed, Jun 21, 2023 at 2:29 PM Jan Beulich via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> ---
> >> For the purpose here (and elsewhere) bcst_vector_operand() (really:
> >> bcst_mem_operand()) isn't permissive enough: We'd want it to allow
> >> 128-bit and 256-bit types as well irrespective of AVX512VL being
> >> enabled. This would likely require a new predicate
> >> (bcst_intvec_operand()?) and a new constraint (BR? Bi?). (Yet for name
> >> selection it will want considering that this is applicable to certain
> >> non-calculational FP operations as well.)
> > I think so.
>
> Any preference towards predicate and constraint naming?
something like bcst_mem_operand_$suffiix, $suffix indicates the
pattern may use zmm instruction for 128/256-bit operand.
maybe just bcst_mem_operand_zmm?


>
> Plus I think there's a more general question behind this: A new
> predicate / constraint pair is likely just one way of dealing
> with the issue. Another would appear to be to remove the
> restriction of 128- and 256-byte types when AVX512VL is not
> enabled, but AVX512F is. While that would require touching a
> lot of insn constraints, it looks as if lifting that restriction
> would "merely" require much wider use of Yv where v is used
> right now. But of course I may well be unaware of (some of) the
> reasons why that restriction was put in place in the first place
> (it can't really be the lack of suitable move insns, as those
> can be synthesized by using e.g. vextract{32,64}x4).
Also be careful of SIMD Floating-Point Exception if we use the zmm
version for those arithmetic instructions, the upper bits need to be
explicitly cleared for 128/256-bit operand.
For pternlog or other logic instructions, it's ok since there's no
SIMD Floating-Point Exception for such instructions.

>
> Jan



-- 
BR,
Hongtao

Re: [PATCH 5/5] x86: yet more PR target/100711-like splitting

Reply via email to