https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114107
Hongtao Liu <liuhongt at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |liuhongt at gcc dot gnu.org

--- Comment #7 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
perm_cost is very low in the x86 backend. That may be OK for 128-bit vectors,
where pshufb/shufps are available for most cases. But for 256/512-bit vectors,
when the permutation is cross-lane, the cost could be higher. One solution is
to increase perm_cost when the vector size is more than 128 bits, since vperm
is most likely used instead of vblend/vpblend/vpshuf/vshuf.
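
To make the cross-lane distinction concrete, here is a minimal sketch in C. It is not GCC code: perm_is_cross_lane and toy_perm_cost are hypothetical helpers, and the cost values 1 and 3 are placeholders chosen only for illustration. It also goes one step further than the suggestion above by checking the selector itself rather than the vector size alone, which is an assumption about how such a cost could be refined.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of GCC.  Returns true if any destination
   element takes its value from a different 128-bit lane than the one it
   sits in.  SEL holds NELT selector indices; indices >= NELT refer to the
   second input vector and are folded back onto 0..NELT-1.  Assumes
   ELT_BITS is nonzero and divides 128.  */
static bool
perm_is_cross_lane (const unsigned *sel, size_t nelt, unsigned elt_bits)
{
  unsigned per_lane = 128 / elt_bits;   /* elements per 128-bit lane */

  for (size_t i = 0; i < nelt; i++)
    {
      unsigned src = sel[i] % nelt;     /* position inside its source vector */
      if (src / per_lane != i / per_lane)
        return true;
    }
  return false;
}

/* Hypothetical cost function with made-up values: 1 for permutations that
   fit in 128 bits or stay within their lanes, 3 for cross-lane permutations
   of wider vectors.  */
static int
toy_perm_cost (const unsigned *sel, size_t nelt, unsigned elt_bits)
{
  size_t vec_bits = nelt * elt_bits;

  if (vec_bits <= 128)
    return 1;
  return perm_is_cross_lane (sel, nelt, elt_bits) ? 3 : 1;
}
```

With eight 32-bit elements (a 256-bit vector), a selector such as {1,0,3,2,5,4,7,6} stays within each 128-bit lane, while a full reversal {7,6,5,4,3,2,1,0} pulls every element from the other lane and gets the higher toy cost.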