https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121662
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org
           Keywords|                            |missed-optimization
          Component|target                      |tree-optimization

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
This is done to avoid all-masked stores, which can repeatedly trigger fault
suppression, causing a 1000x slowdown. For this case, consider *a mapped
read-only and all zero, and time the loop with and without the branch (or make
the testcase check a different object for the flag to get the unmapped
write-object case).

So, it works as designed; see tree-vect-loop.cc:optimize_mask_stores
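
To make the point above concrete, here is a minimal scalar sketch of what the branch buys. It is not GCC's generated code and not the testcase from this report: the function name masked_store_loop, the fixed VF of 4, and the separate flag/dst arrays are all assumptions made for illustration. It models a vectorized "if (flag[i]) dst[i] = val;" loop and counts how many vector-wide stores are issued with and without the all-masked check.

```c
#include <stdbool.h>
#include <stddef.h>

#define VF 4 /* hypothetical vectorization factor, chosen for illustration */

/* Scalar model of a vectorized "if (flag[i]) dst[i] = val;" loop.
   Returns the number of VF-wide masked stores that were issued.
   With skip_all_masked set, a block whose mask is entirely false
   branches around the store, which is the effect described above. */
size_t masked_store_loop(int *dst, const int *flag, size_t n, int val,
                         bool skip_all_masked)
{
    size_t stores_issued = 0;
    size_t i = 0;

    for (; i + VF <= n; i += VF) {
        bool mask[VF];
        bool any = false;

        /* Compute the per-lane store mask for this block. */
        for (size_t l = 0; l < VF; l++) {
            mask[l] = flag[i + l] != 0;
            any = any || mask[l];
        }

        /* The guarding branch: never issue an all-masked store. */
        if (skip_all_masked && !any)
            continue;

        /* Model of the masked store: only active lanes are written.
           On real hardware an all-false mask would still issue the
           store instruction, and that is the case that can be slow. */
        stores_issued++;
        for (size_t l = 0; l < VF; l++)
            if (mask[l])
                dst[i + l] = val;
    }

    /* Scalar epilogue for the remaining elements. */
    for (; i < n; i++)
        if (flag[i])
            dst[i] = val;

    return stores_issued;
}
```

With an all-zero flag array of 16 elements, the unguarded variant issues four masked stores that write nothing, while the guarded variant issues none. If the destination were read-only or unmapped, each of those no-op stores would be one that the hardware has to handle through fault suppression.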