https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107432
--- Comment #2 from g.peterh...@t-online.de ---
Another example. I want to convert an array<Bool> to array<Float64>.

There are basically 3 options:
- Copy
- Test (b2f64_default)
- Optimized version (b2f64_manually)

gcc 12.2 + gcc trunk
  convertSIZE_copy only generates scalar code (_mm_cvtsi64_sd)
  convertSIZE_default always generates conditional jumps

convertSIZE_manually
  gcc trunk
    always generates branch-free scalar code
  gcc 12.2
    convert1024_manually generates vector code, but does not use the HW
      conversion int8->int64 (_mm(256)_cvtepi8_epi64) and instead converts
      int8->int16->int32->int64 manually
    convert8_manually generates branch-free scalar code
    convert4_manually generates vector code and uses the HW conversion
      int8->int64

NONE of these conversions is transformed/optimized to the extent that, in
every case,
- all available intrinsics are used
- no "normal" (general-purpose) registers are used
- branch-free code is generated

https://godbolt.org/z/f74vK79of

thx
Gero
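
For readers without the Compiler Explorer link at hand, the three options might look roughly like the sketch below. This is a hypothetical reconstruction, not the code from the report: only the names b2f64_default and b2f64_manually come from the comment, while their bodies, the convert_* templates, variants_agree, and the use of std::array<bool, N> / std::array<double, N> in place of array<Bool> / array<Float64> are assumptions made for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed body: "Test" variant, selects the result with a conditional.
inline double b2f64_default(bool b) noexcept {
    return b ? 1.0 : 0.0;
}

// Assumed body: "optimized" variant, reads the bool's storage byte
// (0 or 1) and converts that integer to double, with no condition.
inline double b2f64_manually(bool b) noexcept {
    static_assert(sizeof(bool) == sizeof(std::uint8_t), "sketch assumes 1-byte bool");
    std::uint8_t byte;
    std::memcpy(&byte, &b, sizeof byte);
    return static_cast<double>(byte);
}

// Option 1: plain copy, relying on the implicit bool -> double conversion.
template <std::size_t N>
std::array<double, N> convert_copy(const std::array<bool, N>& in) {
    std::array<double, N> out{};
    std::copy(in.begin(), in.end(), out.begin());
    return out;
}

// Option 2: element-wise conversion through the conditional helper.
template <std::size_t N>
std::array<double, N> convert_default(const std::array<bool, N>& in) {
    std::array<double, N> out{};
    for (std::size_t i = 0; i < N; ++i) out[i] = b2f64_default(in[i]);
    return out;
}

// Option 3: element-wise conversion through the byte-based helper.
template <std::size_t N>
std::array<double, N> convert_manually(const std::array<bool, N>& in) {
    std::array<double, N> out{};
    for (std::size_t i = 0; i < N; ++i) out[i] = b2f64_manually(in[i]);
    return out;
}

// Hypothetical helper: checks that all three variants produce the same values.
template <std::size_t N>
void variants_agree(const std::array<bool, N>& in) {
    const auto c = convert_copy(in);
    const auto d = convert_default(in);
    const auto m = convert_manually(in);
    for (std::size_t i = 0; i < N; ++i) {
        assert(c[i] == d[i]);
        assert(d[i] == m[i]);
    }
}
```

All three variants compute the same values, so any difference in the generated code (scalar vs. vector, branches vs. branch-free) comes from how the compiler handles each formulation, which is the point of the report.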