https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85052

--- Comment #12 from Matthias Kretz <kretz at kde dot org> ---
(In reply to Jakub Jelinek from comment #11)
> [...] though for 8x conversions we
> are e.g. on x86 already outside of the realm of natively supported vectors
> (we don't really want MMX and for 1024 bit and wider generic vectors we
> don't always emit best code).

Creatively thinking, consider constants stored as (u)char arrays (for bandwidth
optimization) and converted to double or (u)llong when used. I'd want to use a
half-SSE load followed by a conversion to an AVX-512 vector (e.g. vpmovsxbq +
vcvtqq2pd), or even a full SSE load plus one shift and two conversions to AVX-512.
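
To make the use case concrete, here is a minimal sketch of the half-SSE-load
variant, written with generic vectors and __builtin_convertvector. It assumes a
compiler that provides the builtin (Clang, or GCC 9 and later). The names
schar8, double8 and widen_constants are hypothetical and chosen only for
illustration.

```cpp
#include <cstring>

// Generic vector types (GCC/Clang extension). Names are illustrative only.
typedef signed char schar8 __attribute__((vector_size(8)));   // 64 bits: half an SSE register
typedef double      double8 __attribute__((vector_size(64))); // 512 bits: one AVX-512 register

// Hypothetical helper: load 8 compactly stored signed char constants and
// widen them to 8 doubles, i.e. an 8x element-width conversion.
static inline void widen_constants(const signed char *src, double *dst)
{
    schar8 narrow;
    std::memcpy(&narrow, src, sizeof narrow);   // the "half-SSE load"

    // Element-wise conversion; both vectors have 8 elements.
    double8 wide = __builtin_convertvector(narrow, double8);

    std::memcpy(dst, &wide, sizeof wide);       // store the 512-bit result
}
```

Which instructions this compiles to depends on the compiler version and the
target flags. Without AVX-512 enabled the 512-bit generic vector is lowered to
narrower operations, so the result is still correct but not the short sequence
described above.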

Similar motivation for the reverse direction. (Though a lot less likely to be
used in practice, I believe. Hmm, maybe AI applications can prove that
expectation wrong.)

But we should track optimizations in their own issues.
