https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750

--- Comment #15 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #14)
> Created attachment 52032 [details]
> update patch
> 
> Update patch, Now gcc can generate optimal code
> 

The current fix adds define_insn_and_split patterns for 3 things:
1. Combine vpcmpuw and zero_extend into vpcmpuw.
2. Canonicalize the vpcmpuw pattern so CSE can replace a duplicate vpcmpuw
with just a kmov.
3. Use DImode as the dest of zero_extend so cprop_hardreg can eliminate the
redundant kmov.

But the sink issue still exists, i.e. for the testcase in PR103774 there is a
memory_operand in vpcmpuw, combine fails due to the cost increase, and the
redundant kmov remains in the loop.
