https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114449

Xi Ruoyao <xry111 at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xry111 at gcc dot gnu.org

--- Comment #1 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
Adding #pragma GCC unroll 8 before the loop gets it optimized.  IIRC GCC by
default only unrolls loops by a factor of 4, so the loop is not "fully"
unrolled without the pragma.
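
For illustration, a minimal sketch of where the pragma goes. The function
sum8 and its loop are hypothetical and are not the test case from this
report; the only thing taken from the comment above is the pragma itself,
which has to sit immediately before the loop it applies to.

```c
/* Hypothetical example: sum8 is not the code from the bug report. */
int sum8(const int *a)
{
    int s = 0;

    /* Request an unroll factor of 8 for the loop that follows.  The trip
       count here is also 8, so this permits the loop to be unrolled
       completely. */
#pragma GCC unroll 8
    for (int i = 0; i < 8; i++)
        s += a[i];

    return s;
}
```

The pragma only changes how the loop is compiled, not what it computes, so
comparing the generated assembly with and without it (for example with
gcc -O2 -S) is a simple way to see whether the unroll happened.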