A patch that I'm working on to improve RTL simplifications in the middle-end results in the regression of pr78904-1b.c, due to changes in the canonical representation of high-byte (%ah, %bh, %ch, %dh) logic. This patch avoids/prevents those failures by adding support for the alternate representation, duplicating the existing *<code>qi_ext<mode>_2 as *<code>qi_ext<mode>_3 (the new version also replacing any_or with any_logic to provide *andqi_ext<mode>_3 in the same pattern). Removing the original pattern isn't trivial, as it's generated by define_split, but this can be investigated after the other pieces are approved.
The current representation of this instruction is: (set (zero_extract:DI (reg/v:DI 87 [ aD.2763 ]) (const_int 8 [0x8]) (const_int 8 [0x8])) (subreg:DI (xor:QI (subreg:QI (zero_extract:DI (reg:DI 94) (const_int 8 [0x8]) (const_int 8 [0x8])) 0) (subreg:QI (zero_extract:DI (reg/v:DI 87 [ aD.2763 ]) (const_int 8 [0x8]) (const_int 8 [0x8])) 0)) 0)) after my proposed middle-end improvement, we attempt to recognize: (set (zero_extract:DI (reg/v:DI 87 [ aD.2763 ]) (const_int 8 [0x8]) (const_int 8 [0x8])) (zero_extract:DI (xor:DI (reg:DI 94) (reg/v:DI 87 [ aD.2763 ])) (const_int 8 [0x8]) (const_int 8 [0x8]))) This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2023-06-18 Roger Sayle <ro...@nextmovesoftware.com> gcc/ChangeLog * config/i386/i386.md (*<code>qi_ext<mode>_3): New define_insn. Thanks in advance, Roger --
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 0929115..889405e 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -10848,6 +10848,8 @@ [(set_attr "type" "alu") (set_attr "mode" "QI")]) +;; *andqi_ext<mode>_3 is defined via *<code>qi_ext<mode>_3 below. + ;; Convert wide AND instructions with immediate operand to shorter QImode ;; equivalents when possible. ;; Don't do the splitting with memory operands, since it introduces risk @@ -11560,6 +11562,26 @@ [(set_attr "type" "alu") (set_attr "mode" "QI")]) +(define_insn "*<code>qi_ext<mode>_3" + [(set (zero_extract:SWI248 + (match_operand 0 "int248_register_operand" "+Q") + (const_int 8) + (const_int 8)) + (zero_extract:SWI248 + (any_logic:SWI248 + (match_operand 1 "int248_register_operand" "%0") + (match_operand 2 "int248_register_operand" "Q")) + (const_int 8) + (const_int 8))) + (clobber (reg:CC FLAGS_REG))] + "(!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)) + /* FIXME: without this LRA can't reload this pattern, see PR82524. */ + && (rtx_equal_p (operands[0], operands[1]) + || rtx_equal_p (operands[0], operands[2]))" + "<logic>{b}\t{%h2, %h0|%h0, %h2}" + [(set_attr "type" "alu") + (set_attr "mode" "QI")]) + ;; Convert wide OR instructions with immediate operand to shorter QImode ;; equivalents when possible. ;; Don't do the splitting with memory operands, since it introduces risk