https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117438
--- Comment #5 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
I reproduced a 30% regression on CLX. The aligned case is more frontend-bound. This is uarch-specific, so I will make it a uarch tune.