https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120639
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2025-06-19
             Blocks|                            |53947
             Status|UNCONFIRMED                 |NEW
          Component|middle-end                  |tree-optimization
           Keywords|                            |missed-optimization
     Ever confirmed|0                           |1

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
We could use scatter stores, building the index vector somehow cleverly with
i_width contiguous indexes interspaced by i_dst_stride.  In fact this vector
could be built as inductions when building the i_height number of vectors to
store and concatenated the same way?

On no x86 uarch are scatters implemented efficiently enough to be worth this
though.

Stores with variable gaps in general ask for scatters, with constant known
gaps contiguous stores with mask might work.  But the code we emit now is
quite efficient on the store side (it does force a higher VF sometimes, due to
the single-vector-size limitation).


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
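
To make the index vector from comment #2 concrete, here is a minimal scalar sketch of "i_width contiguous indexes interspaced by i_dst_stride". It is an illustration only: the helper names build_scatter_index and scatter_store are hypothetical, the element types are assumed, and nothing here reflects how the vectorizer would actually represent or emit the scatter. The row base is advanced like an induction variable, which is the property the comment suggests exploiting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not GCC code): fill idx[] with the element offsets a
   scatter store would use for an i_width x i_height block written with row
   pitch i_dst_stride.  Each row contributes i_width contiguous offsets; the
   row base advances by i_dst_stride per row, like an induction variable.  */
static void
build_scatter_index (int32_t *idx, int i_width, int i_height,
                     int i_dst_stride)
{
  /* Assumption for this sketch: rows do not overlap in the destination.  */
  assert (i_width > 0 && i_height > 0 && i_dst_stride >= i_width);

  int32_t row_base = 0;
  for (int y = 0; y < i_height; y++)
    {
      for (int x = 0; x < i_width; x++)
        idx[y * i_width + x] = row_base + x;
      row_base += i_dst_stride;
    }
}

/* Scalar emulation of a scatter store: lane k is written to dst[idx[k]].
   Elements of dst that no lane addresses (the gaps) are left untouched.  */
static void
scatter_store (uint8_t *dst, const int32_t *idx, const uint8_t *vals, int n)
{
  for (int k = 0; k < n; k++)
    dst[idx[k]] = vals[k];
}
```

For example, with i_width = 3, i_height = 2 and i_dst_stride = 8 the index vector is {0, 1, 2, 8, 9, 10}. When the stride is a compile-time constant the same pattern could instead be expressed as a contiguous store with a mask covering the gaps, which is the alternative the comment mentions for constant known gaps.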