https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96918
--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #6)
> Or the generic code could try to expand vector rotates by multiples of
> BITS_PER_UNIT as vector permutations, perhaps only if there is no optab for
> it.  Or trying to expand both permutation and rotate and determine at
> expansion time using costs which sequence is cheaper.

I guess for the specific case we could think of what is canonical for GIMPLE
and then deal with that during RTL expansion, as you say.  The question is
whether vector rotates or permutes are more likely to be combined with
earlier/later stmts.  I guess there's no clear answer to that, which means
special handling would be needed anyway, from which it follows that we might
stick to what the source did.
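
For illustration only, here is a minimal scalar sketch of the equivalence under discussion: rotating each vector lane by a multiple of 8 bits gives the same result as permuting the bytes within each lane.  This is not GCC code.  The names rotate_lanes, permute_lanes and LANES are hypothetical, the lane width is assumed to be 32 bits, and bytes are numbered from the least significant end so the sketch does not depend on target endianness.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 4 /* hypothetical vector of four 32-bit lanes */

/* Rotate every 32-bit lane left by n bits; requires 0 < n < 32. */
static void rotate_lanes(const uint32_t in[LANES], uint32_t out[LANES],
                         unsigned n)
{
    for (size_t i = 0; i < LANES; i++)
        out[i] = (in[i] << n) | (in[i] >> (32 - n));
}

/* Same operation for a rotate count of 8 * k bits (k = 1, 2 or 3),
   written as a byte permutation inside each lane.  Byte j of the
   result (0 = least significant) is byte (j - k) mod 4 of the input. */
static void permute_lanes(const uint32_t in[LANES], uint32_t out[LANES],
                          unsigned k)
{
    for (size_t i = 0; i < LANES; i++) {
        uint32_t r = 0;
        for (unsigned j = 0; j < 4; j++) {
            unsigned src = (j + 4 - k) % 4;          /* source byte index */
            uint32_t byte = (in[i] >> (8 * src)) & 0xffu;
            r |= byte << (8 * j);                    /* place at byte j */
        }
        out[i] = r;
    }
}
```

Calling rotate_lanes (in, a, 8) and permute_lanes (in, b, 1) on the same input fills a and b with identical values.  Which of the two forms is cheaper, or combines better with neighbouring stmts, is target dependent and is exactly the open question above.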