Michael Collison <michael.colli...@arm.com> writes:
> +(define_insn_and_split "*aarch64_reg_<mode>3_neg_mask2"
> +  [(set (match_operand:GPI 0 "register_operand" "=r")
> +        (SHIFT:GPI
> +          (match_operand:GPI 1 "register_operand" "r")
> +          (match_operator 4 "subreg_lowpart_operator"
> +          [(neg:SI (and:SI (match_operand:SI 2 "register_operand" "r")
> +                           (match_operand 3 "const_int_operand" "n")))])))
> +   (clobber (match_scratch:SI 5 "=&r"))]
> +  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode)-1)) == 0)"
> +  "#"
> +  "&& reload_completed"
> +  [(const_int 0)]
> +  {
> +    emit_insn (gen_negsi2 (operands[5], operands[2]));
> +
> +    rtx and_op = gen_rtx_AND (SImode, operands[5], operands[3]);
> +    rtx subreg_tmp = gen_rtx_SUBREG (GET_MODE (operands[4]), and_op,
> +                                     SUBREG_BYTE (operands[4]));
> +    emit_insn (gen_<optab><mode>3 (operands[0], operands[1], subreg_tmp));
> +    DONE;
> +  }
> +)

Thanks, I agree this looks correct from the split/reload_completed POV.
I think we can go one better though, either:

(a) Still allow the split when !reload_completed, and use:

      if (GET_CODE (operands[5]) == SCRATCH)
        operands[5] = gen_reg_rtx (SImode);

    This will allow the individual instructions to be scheduled by sched1.

(b) Continue to restrict the split to reload_completed, change operand 0
    to =&r so that it can be used as a temporary, and drop operand 5
    entirely.

Or perhaps do both:

(define_insn_and_split "*aarch64_reg_<mode>3_neg_mask2"
  [(set (match_operand:GPI 0 "register_operand" "=&r")
        (SHIFT:GPI
          (match_operand:GPI 1 "register_operand" "r")
          (match_operator 4 "subreg_lowpart_operator"
          [(neg:SI (and:SI (match_operand:SI 2 "register_operand" "r")
                           (match_operand 3 "const_int_operand" "n")))])))]
  "((~INTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode)-1)) == 0)"
  "#"
  "&& 1"
  [(const_int 0)]
  {
    /* The temporary is the SImode destination of negsi2, so use an SImode
       pseudo before RA and the SImode lowpart of operand 0 afterwards.  */
    rtx tmp = (can_create_pseudo_p ()
               ? gen_reg_rtx (SImode)
               : lowpart_subreg (SImode, operands[0], <MODE>mode));
    emit_insn (gen_negsi2 (tmp, operands[2]));

    rtx and_op = gen_rtx_AND (SImode, tmp, operands[3]);
    rtx subreg_tmp = gen_rtx_SUBREG (GET_MODE (operands[4]), and_op,
                                     SUBREG_BYTE (operands[4]));
    emit_insn (gen_<optab><mode>3 (operands[0], operands[1], subreg_tmp));
    DONE;
  }
)

Sorry for the run-around.  I should have realised earlier that these
patterns didn't really need a distinct register after RA.

Thanks,
Richard