https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750
--- Comment #15 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #14)
> Created attachment 52032 [details]
> update patch
>
> Update patch, Now gcc can generate optimal code

The current fix adds define_insn_and_split patterns for 3 things:
1. Combine vpcmpuw and zero_extend into a single vpcmpuw.
2. Canonicalize the vpcmpuw pattern so CSE can replace a duplicate vpcmpuw with just a kmov.
3. Use DImode as the dest of zero_extend so cprop_hardreg can eliminate the redundant kmov.

But the sink issue still exists, i.e. for the testcase in PR103774 there's a memory_operand in vpcmpuw, combine fails due to the cost increase, and the redundant kmov remains in the loop.