https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120639
--- Comment #4 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 20 Jun 2025, rdapp at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120639
>
> --- Comment #3 from Robin Dapp <rdapp at gcc dot gnu.org> ---
> > We could use scatter stores, building the index vector somehow cleverly with
> > i_width contiguous indexes interspaced by i_dst_stride. In fact this vector
> > could be built as inductions when building the i_height number of vectors
> > to store and concatenated the same way?
>
> Interesting, so you mean having a strided index vector
> [0, 1, ..., vector_size, vector_size + 1, ..., i_width, 0 + stride, 1 + stride,
> ...]?
>
> What about something like i_width = 12 and a 64-bit strided element (that
> doesn't cover all of i_width but would require another 32-bit strided element)?
> Wouldn't we still need a mechanism to "fill" up to i_width?

Well, consider the desired index vector being a real induction (just
store it somewhere). If we can handle that, we should be able to
handle the scatter. If not, we can't handle the scatter.
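For illustration only, here is a minimal scalar sketch in C of the index vector under discussion: i_height groups of i_width contiguous offsets, with each group starting i_dst_stride after the previous one. The function names build_scatter_index and scatter_store are hypothetical and do not come from the PR or from GCC. The loop nest in build_scatter_index is roughly the "store the induction somewhere" case, and scatter_store only models the semantics dst[idx[i]] = src[i] rather than any real vector instruction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: store the desired index vector to memory.
   idx must have room for i_width * i_height entries.  Row y holds
   i_width contiguous offsets starting at y * i_dst_stride.  */
static void
build_scatter_index (size_t *idx, size_t i_width, size_t i_height,
                     size_t i_dst_stride)
{
  /* Assumption: rows must not overlap in the destination.  */
  assert (i_width <= i_dst_stride);

  for (size_t y = 0; y < i_height; y++)
    for (size_t x = 0; x < i_width; x++)
      idx[y * i_width + x] = y * i_dst_stride + x;
}

/* Hypothetical scalar model of a scatter store: element i of src
   is written to dst at offset idx[i].  */
static void
scatter_store (unsigned char *dst, const unsigned char *src,
               const size_t *idx, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[idx[i]] = src[i];
}
```

With i_width = 4, i_height = 2 and i_dst_stride = 16, build_scatter_index produces [0, 1, 2, 3, 16, 17, 18, 19], so eight contiguous source elements land in two rows of four in the destination.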