https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55600
Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-linux-gnu
           Keywords|                            |missed-optimization
          Component|tree-optimization           |target

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So here is the current state of this:
GCC vectorizes it and unrolls it fully at 128 and 64.
clang keeps it as scalars but unrolls the loop 4 times at 128 and fully at 64.
ICC vectorizes it and unrolls it halfway (that is, 32 times) at 128 and fully at 64.
MSVC keeps it as scalars but unrolls it halfway (that is, 32 times) at 128.
So it looks like all the compilers do hugely different things here.