https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122297
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
OK, so the issue seems to be that we call vect_record_loop_len for
vector(4) int once with factor 4, once with factor 1 (but we keep the
recorded factor 4), and then again with factor 4. So we neither reliably
keep track of the largest/lowest factor needed, nor does the later query
via vect_get_loop_len take into account the factor being queried (we
generate only one len per index).

This 'factor' is only used for the bytewise fallback for loads and stores.
It might be best to kill it off and apply the factor after getting the
loop len in load/store vectorization. The other option is to track the
lowest factor necessary and, upon get, scale the len according to the
requested factor.
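
The second option can be illustrated with a toy model. This is a sketch in
Python, not GCC code: the class LoopLenRecord and its methods record and
get_len are hypothetical names, and the model assumes that every requested
factor is a multiple of the lowest recorded one and that the len is simply
the element count times the factor.

```python
class LoopLenRecord:
    """Toy model of one loop-len slot (hypothetical, not the GCC data structure)."""

    def __init__(self):
        # Lowest factor recorded so far; None until the first record() call.
        self.factor = None

    def record(self, factor):
        # Keep the lowest factor seen, regardless of the order of the calls.
        if self.factor is None or factor < self.factor:
            self.factor = factor

    def get_len(self, nelems, requested_factor):
        # The single generated len counts units of the lowest recorded factor.
        if self.factor is None:
            raise ValueError("no factor recorded")
        if requested_factor % self.factor != 0:
            raise ValueError("requested factor is not a multiple of the recorded one")
        stored_len = nelems * self.factor
        # Scale the stored len up to the units the caller asked for.
        return stored_len * (requested_factor // self.factor)


# Same recording order as in the comment: factor 4, then 1, then 4.
rec = LoopLenRecord()
for f in (4, 1, 4):
    rec.record(f)
```

With this order of calls the slot ends up with factor 1, so a query with
factor 4 for three remaining elements is scaled to a byte count, while a
query with factor 1 returns the element count unchanged.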
