x86 has no native 8-bit vector shift instructions, so we currently expand these shifts into sequences of 2 to 5 insns. With VGF2P8AFFINEQB, we can do each shift in 1 insn, plus a (possibly shared) constant load.
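To illustrate why one affine insn suffices, here is a minimal scalar sketch, not the code from this series. It models a single byte lane using the bit-matrix semantics documented for GF2P8AFFINEQB (result bit i is the parity of matrix byte 7 - i ANDed with the source byte, XORed with bit i of the immediate). The names gf2p8affine_byte, shift_matrix and count_mismatches are hypothetical helpers made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Scalar model of one byte lane of GF2P8AFFINEQB:
 * result bit i = parity(matrix.byte[7 - i] & x) ^ imm8.bit[i].
 */
static uint8_t gf2p8affine_byte(uint64_t matrix, uint8_t x, uint8_t imm8)
{
    uint8_t r = 0;

    for (int i = 0; i < 8; i++) {
        uint8_t t = (uint8_t)(matrix >> (8 * (7 - i))) & x;

        /* Fold the masked byte down to its parity in bit 0. */
        t ^= t >> 4;
        t ^= t >> 2;
        t ^= t >> 1;
        r |= (uint8_t)(((t ^ (imm8 >> i)) & 1) << i);
    }
    return r;
}

enum shift_kind { SHL, SHR, SAR };

/* Build the 8x8 bit matrix that performs an 8-bit shift by n (0..7). */
static uint64_t shift_matrix(enum shift_kind kind, int n)
{
    uint64_t m = 0;

    assert(n >= 0 && n < 8);
    for (int i = 0; i < 8; i++) {
        int src; /* input bit that feeds output bit i, or -1 for zero */

        switch (kind) {
        case SHL:
            src = i >= n ? i - n : -1;
            break;
        case SHR:
            src = i + n < 8 ? i + n : -1;
            break;
        default: /* SAR: replicate the sign bit into vacated positions */
            src = i + n < 8 ? i + n : 7;
            break;
        }
        if (src >= 0) {
            m |= (uint64_t)1 << (8 * (7 - i) + src);
        }
    }
    return m;
}

/* Compare the matrix form against plain C shifts for every input. */
static int count_mismatches(void)
{
    int bad = 0;

    for (int n = 0; n < 8; n++) {
        uint64_t shl = shift_matrix(SHL, n);
        uint64_t shr = shift_matrix(SHR, n);
        uint64_t sar = shift_matrix(SAR, n);

        for (int x = 0; x < 256; x++) {
            uint8_t b = (uint8_t)x;
            uint8_t sign = (b & 0x80) ? (uint8_t)(0xff << (8 - n)) : 0;

            bad += gf2p8affine_byte(shl, b, 0) != (uint8_t)(b << n);
            bad += gf2p8affine_byte(shr, b, 0) != (uint8_t)(b >> n);
            bad += gf2p8affine_byte(sar, b, 0) != (uint8_t)((b >> n) | sign);
        }
    }
    return bad;
}
```

With a zero immediate, shift_matrix(SHL, 0) is the identity matrix 0x0102040810204080. Each (kind, n) pair gives one 64-bit constant, which is why the constant load can be shared between uses of the same shift.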

r~

Richard Henderson (3):
  cpuinfo/i386: Detect GFNI as an AVX extension
  tcg/i386: Add INDEX_op_x86_vgf2p8affineqb_vec
  tcg/i386: Use vgf2p8affineqb for MO_8 vector shifts

 host/include/i386/host/cpuinfo.h |  1 +
 include/qemu/cpuid.h             |  3 ++
 util/cpuinfo-i386.c              |  1 +
 tcg/i386/tcg-target-opc.h.inc    |  1 +
 tcg/i386/tcg-target.c.inc        | 81 ++++++++++++++++++++++++++++++--
 5 files changed, 83 insertions(+), 4 deletions(-)

-- 
2.43.0