* Richard Biener:

> If it were possible I'd axe x86_64-v4.  Maybe we should add a x86_64-v3.5
> that sits inbetween v3 and v4, offering AVX512 but restricted to 256bit
> (and obviously not requiring more of the AVX512 features that v4
> requires).

As far as I understand it, GCC's Intel tuning for AVX-512 leans heavily
towards a 256-bit vector length anyway.  That's not true for the default
tuning for -march=x86-64-v4, though: it prefers 512-bit vectors.  I've
seen third-party reports that AMD Zen 4 does better in some ways with
512-bit vectors than with 256-bit vectors (despite its 256-bit-wide
execution ports), but I have not tried to verify these observations.
Still, this suggests that restricting a post-x86-64-v3 level to 256-bit
vectors may not be an easy decision.
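
For anyone who wants to compare the two vector widths locally, here is a
minimal sketch.  The saxpy kernel is a hypothetical example of mine, not
something from this thread; any loop simple enough for the
auto-vectorizer should do.

```c
#include <stddef.h>

/* Hypothetical kernel (an assumption for illustration): computes
   y[i] = a * x[i] + y[i].  The restrict qualifiers tell the compiler
   that x and y do not alias, which keeps the loop easy to vectorize.  */
void
saxpy (float *restrict y, const float *restrict x, float a, size_t n)
{
  for (size_t i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}
```

Compiling this with "gcc -O3 -march=x86-64-v4 -S" and again with
-mprefer-vector-width=256 added should, I'd expect, show zmm registers in
the first assembly output and ymm registers in the second, but the result
depends on the GCC version and its cost model, so please check the
output rather than take my word for it.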

On the other hand, a new EVEX-capable level might bring earlier adoption
of EVEX capabilities to AMD CPUs, which should still be an improvement
over AVX2.  This could benefit AMD as well, so I would really like to
see some AMD feedback here.

There's also the matter that the time scales for EVEX adoption are so
long that by then, Intel CPUs may end up supporting and preferring
512-bit vectors again.

Thanks,
Florian
