https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94401
Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org,
                   |                            |wschmidt at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #4 from Kewen Lin <linkw at gcc dot gnu.org> ---
My commit extends the current scalar epilogue peeling for gaps elimination,
so that this case can make use of int for the construction. But it reveals
that the existing handling misses the VMAT_CONTIGUOUS_REVERSE case: currently
it assumes the overrun happens on the high address end. That is true for
almost all cases, but in this case the overrun is on the low address end. So
for VMAT_CONTIGUOUS_REVERSE we have to load the high part and put it in the
latter part of the constructed vector.

The IR before/after the commit looks like this:

good:

  vect__9.16_80 = MEM <vector(2) int> [(int *)vectp_y.14_78];
  vect__9.17_81 = VEC_PERM_EXPR <vect__9.16_80, vect__9.16_80, { 1, 0 }>;
  vect__9.18_82 = VEC_PERM_EXPR <vect__9.17_81, vect__9.17_81, { 0, 0 }>;

bad:

  _30 = MEM[(int *)vectp_y.12_34];
  _20 = {_30, 0};
  vect__9.14_19 = VIEW_CONVERT_EXPR<vector(2) int>(_20);
  vect__9.15_61 = VEC_PERM_EXPR <vect__9.14_19, vect__9.14_19, { 1, 0 }>;
  vect__9.16_54 = VEC_PERM_EXPR <vect__9.15_61, vect__9.15_61, { 0, 0 }>;
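
To make the lane placement concrete, here is a small scalar model of the two IR sequences in C. It is only a sketch of one reading of the comment, not GCC code and not the actual fix: the v2si type and all helper names are hypothetical, and build_high_last is an assumption about what "load the high part and put it in the latter part" would produce.

```c
#include <assert.h>

/* Hypothetical 2-lane vector standing in for GIMPLE's vector(2) int. */
typedef struct { int lane[2]; } v2si;

/* Models VEC_PERM_EXPR <a, b, { s0, s1 }>:
   selector values 0..1 pick from a, 2..3 pick from b. */
static v2si vec_perm(v2si a, v2si b, int s0, int s1)
{
    int both[4] = { a.lane[0], a.lane[1], b.lane[0], b.lane[1] };
    v2si r = { { both[s0 & 3], both[s1 & 3] } };
    return r;
}

/* The two permutes shared by both dumps: reverse { 1, 0 }, then splat { 0, 0 }. */
static v2si reverse_then_splat(v2si v)
{
    v2si rev = vec_perm(v, v, 1, 0);
    return vec_perm(rev, rev, 0, 0);
}

/* "good": the full vector load reads both elements. */
static v2si load_full(const int *p)
{
    v2si r = { { p[0], p[1] } };
    return r;
}

/* "bad": only the low element is loaded and placed in lane 0, i.e. {_30, 0}. */
static v2si build_low_first(const int *p)
{
    v2si r = { { p[0], 0 } };
    return r;
}

/* Assumed shape of the fix: load only the high element and place it in the
   latter lane, so the low address is never touched. */
static v2si build_high_last(const int *p)
{
    v2si r = { { 0, p[1] } };
    return r;
}

/* Self-check on sample data; call it from a test or from main(). */
void check_reverse_gap_example(void)
{
    const int y[2] = { 10, 20 };
    v2si good  = reverse_then_splat(load_full(y));
    v2si bad   = reverse_then_splat(build_low_first(y));
    v2si fixed = reverse_then_splat(build_high_last(y));

    assert(good.lane[0] == 20 && good.lane[1] == 20);   /* wanted value */
    assert(bad.lane[0] == 0 && bad.lane[1] == 0);       /* zero padding leaks through */
    assert(fixed.lane[0] == 20 && fixed.lane[1] == 20); /* matches "good" */
}
```

In this model the reverse permute moves lane 1 into lane 0 before the splat, so only the high element is ever consumed. That is why padding the high lane with zero loses the value, while padding the low lane does not.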