On Wed, Aug 23, 2023 at 01:57:59AM +0000, Jiang, Haochen wrote:
> > > Let's assume there's no delta now, AVX10.1-512 is equal to
> > > AVX512{F,VL,BW,DQ,CD,BF16,FP16,VBMI,VBMI2,VNNI,IFMA,BITALG,VPOPCNTDQ}
> > > > other stuff.
> > > > The current common/config/i386/i386-common.cc OPTION_MASK_ISA*SET*
> > > > would be like now, except that the current AVX512* sets imply also
> > > > EVEX512/whatever it will be called, that option itself enables
> > > > nothing (or TARGET_AVX512F), and unsetting it doesn't disable all
> > > > the TARGET_AVX512*.
> > > > -mavx10.1 would enable the AVX512* sets without EVEX512/whatever.
> > > So for -mavx512bw -mavx10.1-256, -mavx512bw will set EVEX512, but
> > > -mavx10.1-256 doesn't clear EVEX512 and just enables all AVX512* sets?
> > > Then the combination is basically equal to AVX10.1-512 (AVX512* sets +
> > > EVEX512).
> > > If this is your assumption, yes, there's no need for TARGET_AVX10_1.
> 
> I think we still need that, since currently, without AVX512VL, we enable
> not only 512-bit vector instructions but also scalar instructions, which
> means that for -mavx512bw -mno-evex512 we should still enable the scalar
> instructions.
> 
> Scalar instructions will also be enabled in AVX10.1-256, so we need
> something to distinguish them from the ISA set without AVX512VL.

Ah, I forgot about scalar instructions.  Even better: then we don't have to
do that special case.  So I think TARGET_AVX512F && !TARGET_EVEX512 &&
!TARGET_AVX512VL should in general disable 512-bit modes in
ix86_hard_regno_mode_ok.  That should avoid the need to replace
TARGET_AVX512F with TARGET_EVEX512 on all the patterns which refer to
512-bit modes.  I also wonder if it wouldn't be easiest to make the "v"
constraint in that case equivalent to just "x", so that all those hacks to
make xmm16+ registers work in various instructions through g modifiers
wouldn't trigger.  Sure, that would also penalize scalar instructions, but
the above case isn't something any CPU actually supports; it would only be
the common subset of, say, Xeon Phi and AVX10.1-256.
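
To make the condition concrete, here is a rough standalone sketch.  It is
only an illustrative model, not GCC code: struct isa_flags and both helper
functions are hypothetical stand-ins for the TARGET_* macros, and the real
ix86_hard_regno_mode_ok checks far more than this.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for TARGET_AVX512F, TARGET_AVX512VL and
   TARGET_EVEX512; these are NOT the real GCC macros.  */
struct isa_flags
{
  bool avx512f;
  bool avx512vl;
  bool evex512;
};

/* True for the case discussed above: AVX512F is enabled, but neither
   EVEX512 nor AVX512VL is, so only scalar EVEX instructions remain and
   512-bit modes should be rejected.  */
static bool
disables_512bit_modes (const struct isa_flags *f)
{
  return f->avx512f && !f->evex512 && !f->avx512vl;
}

/* Assumption: in that same case the "v" constraint would be treated
   like "x", i.e. restricted to xmm0-xmm15.  */
static bool
v_constraint_acts_like_x (const struct isa_flags *f)
{
  return disables_512bit_modes (f);
}
```

With this model, -mavx512bw -mno-evex512 (AVX512F on, EVEX512 and AVX512VL
off) hits the special case, while plain -mavx512f does not.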

        Jakub
