https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102583
            Bug ID: 102583
           Summary: [x86] Failure to optimize 32-byte integer vector
                    conversion to 16-byte float vector properly when
                    converting upper part with -mavx2
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

typedef int v8si __attribute__((vector_size(32)));
typedef float v4sf __attribute__((vector_size(16)));

v4sf high (v8si *srcp)
{
  v8si src = *srcp;
  return (v4sf) { (float)src[4], (float)src[5], (float)src[6], (float)src[7] };
}

With -O3 -mavx2, GCC outputs this:

high(int __vector(8)*):
        vmovdqa ymm0, YMMWORD PTR [rdi]
        vperm2i128 ymm0, ymm0, ymm0, 17
        vcvtdq2ps xmm0, xmm0
        vzeroupper
        ret

LLVM instead outputs this:

high(int __vector(8)*):
        vcvtdq2ps xmm0, xmmword ptr [rdi + 16]
        ret

And GCC outputs the equivalent code if -mavx2 is removed:

high(int __vector(8)*):
        cvtdq2ps xmm0, XMMWORD PTR [rdi+16]
        ret
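
For illustration, the shorter sequences above amount to loading only the upper 16 bytes of *srcp and converting those four elements. A minimal source-level sketch of that formulation follows. The names v4si, high_upper_load and same_result are hypothetical and not part of the report, and whether GCC emits the single-instruction form for this variant with -mavx2 has not been checked here.

```c
#include <string.h>

typedef int   v8si __attribute__((vector_size(32)));
typedef int   v4si __attribute__((vector_size(16)));  /* assumption: helper type, not in the report */
typedef float v4sf __attribute__((vector_size(16)));

/* Function from the report: converts elements 4..7 of a 32-byte vector.  */
v4sf high (v8si *srcp)
{
  v8si src = *srcp;
  return (v4sf) { (float)src[4], (float)src[5], (float)src[6], (float)src[7] };
}

/* Hypothetical variant: read only the upper 16 bytes, then convert.
   memcpy is used so the partial load does not rely on type punning.  */
v4sf high_upper_load (v8si *srcp)
{
  v4si hi;
  memcpy (&hi, (const char *) srcp + 16, sizeof hi);
  return (v4sf) { (float)hi[0], (float)hi[1], (float)hi[2], (float)hi[3] };
}

/* Returns 1 if both functions agree on every lane, 0 otherwise.  */
int same_result (v8si *srcp)
{
  v4sf a = high (srcp);
  v4sf b = high_upper_load (srcp);
  for (int i = 0; i < 4; i++)
    if (a[i] != b[i])
      return 0;
  return 1;
}
```

Both functions compute the same values, so any difference in the generated code for them would come from how the 32-byte load and the upper-half extraction are combined, which is the subject of this report.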