Hi!

As the PR mentions, the DImode AND/IOR/XOR patterns often result in rather ugly code, a regression from when those patterns didn't exist (before STV was added). This patch improves things a little by making the splitters for these patterns smarter: rather than always generating two SImode AND/IOR/XOR instructions, if a subword of the last operand is either 0 or -1, the corresponding instruction in the pair is optimized to nothing, to a clear (AND with 0), to a set to -1 (IOR with -1), or to a bitwise NOT (XOR with -1).

Further improvement can IMHO only be achieved by moving the STV pass before the combiner and splitting the patterns we don't convert into vector patterns into the corresponding SImode patterns, so that the combiner can handle them, but that sounds like stage1 material.
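To make the case analysis concrete, here is a rough standalone C model of the per-half decision the new splitters make. It is only a sketch and not GCC code: the names logic_op, half_action, classify_half, apply_op, apply_action and self_check are all made up for illustration, and the real splitters work on RTL operands rather than plain integers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration only; none of these exist in GCC.  */
enum logic_op { OP_AND, OP_IOR, OP_XOR };

enum half_action {
  HALF_NOTHING,   /* result equals the input half, no insn needed */
  HALF_CLEAR,     /* move 0 into the destination half */
  HALF_SET_ONES,  /* move -1 into the destination half */
  HALF_NOT,       /* bitwise NOT of the input half */
  HALF_FULL_OP    /* ordinary 32-bit AND/IOR/XOR */
};

/* Pick the cheapest action for one 32-bit half whose constant operand is C.  */
static enum half_action
classify_half (enum logic_op op, uint32_t c)
{
  if (op == OP_AND)
    {
      if (c == 0)
        return HALF_CLEAR;
      if (c == UINT32_MAX)
        return HALF_NOTHING;
      return HALF_FULL_OP;
    }
  /* IOR and XOR: 0 is the identity, -1 is the special case.  */
  if (c == 0)
    return HALF_NOTHING;
  if (c == UINT32_MAX)
    return op == OP_IOR ? HALF_SET_ONES : HALF_NOT;
  return HALF_FULL_OP;
}

/* Reference semantics of the plain 32-bit operation.  */
static uint32_t
apply_op (enum logic_op op, uint32_t x, uint32_t c)
{
  switch (op)
    {
    case OP_AND: return x & c;
    case OP_IOR: return x | c;
    default:     return x ^ c;
    }
}

/* Value produced by the chosen action.  */
static uint32_t
apply_action (enum half_action a, enum logic_op op, uint32_t x, uint32_t c)
{
  switch (a)
    {
    case HALF_NOTHING:  return x;
    case HALF_CLEAR:    return 0;
    case HALF_SET_ONES: return UINT32_MAX;
    case HALF_NOT:      return ~x;
    default:            return apply_op (op, x, c);
    }
}

/* Check that every shortcut agrees with the full operation.  */
static void
self_check (void)
{
  static const uint32_t consts[] = { 0, UINT32_MAX, 0x0000ffffu };
  static const uint32_t inputs[] = { 0, 1, 0xdeadbeefu, UINT32_MAX };
  for (int op = OP_AND; op <= OP_XOR; op++)
    for (unsigned i = 0; i < sizeof consts / sizeof consts[0]; i++)
      for (unsigned j = 0; j < sizeof inputs / sizeof inputs[0]; j++)
        {
          enum logic_op o = (enum logic_op) op;
          enum half_action a = classify_half (o, consts[i]);
          assert (apply_action (a, o, inputs[j], consts[i])
                  == apply_op (o, inputs[j], consts[i]));
        }
}
```

The low and high halves are classified independently, so a mask whose high word is 0 and whose low word is -1 would, in this model, turn a DImode AND into a single clear of the high half.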
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-03-22  Jakub Jelinek  <ja...@redhat.com>

	PR target/70321
	* config/i386/i386.md (*anddi3_doubleword, *<code>di3_doubleword):
	Optimize TARGET_STV splitters, if high or low word of last argument
	is 0 or -1.

--- gcc/config/i386/i386.md.jj	2016-03-22 09:13:54.000000000 +0100
+++ gcc/config/i386/i386.md	2016-03-22 18:45:16.392316554 +0100
@@ -8141,16 +8141,31 @@
          (match_operand:DI 1 "nonimmediate_operand" "%0,0,0")
          (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,rm")))
    (clobber (reg:CC FLAGS_REG))]
-  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2 && ix86_binary_operator_ok (AND, DImode, operands)"
+  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2
+   && ix86_binary_operator_ok (AND, DImode, operands)"
   "#"
   "&& reload_completed"
-  [(parallel [(set (match_dup 0)
-                   (and:SI (match_dup 1) (match_dup 2)))
-              (clobber (reg:CC FLAGS_REG))])
-   (parallel [(set (match_dup 3)
-                   (and:SI (match_dup 4) (match_dup 5)))
-              (clobber (reg:CC FLAGS_REG))])]
-  "split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);")
+  [(const_int 0)]
+{
+  split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);
+  if (operands[2] == const0_rtx)
+    {
+      operands[1] = const0_rtx;
+      ix86_expand_move (SImode, &operands[0]);
+    }
+  else if (operands[2] != constm1_rtx)
+    emit_insn (gen_andsi3 (operands[0], operands[1], operands[2]));
+  else if (operands[5] == constm1_rtx)
+    emit_note (NOTE_INSN_DELETED);
+  if (operands[5] == const0_rtx)
+    {
+      operands[4] = const0_rtx;
+      ix86_expand_move (SImode, &operands[3]);
+    }
+  else if (operands[5] != constm1_rtx)
+    emit_insn (gen_andsi3 (operands[3], operands[4], operands[5]));
+  DONE;
+})
 
 (define_insn "*andsi_1"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,r,Ya,!k")
@@ -8665,16 +8680,41 @@
          (match_operand:DI 1 "nonimmediate_operand" "%0,0,0")
          (match_operand:DI 2 "x86_64_szext_general_operand" "Z,re,rm")))
    (clobber (reg:CC FLAGS_REG))]
-  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2 && ix86_binary_operator_ok (<CODE>, DImode, operands)"
+  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2
+   && ix86_binary_operator_ok (<CODE>, DImode, operands)"
   "#"
   "&& reload_completed"
-  [(parallel [(set (match_dup 0)
-                   (any_or:SI (match_dup 1) (match_dup 2)))
-              (clobber (reg:CC FLAGS_REG))])
-   (parallel [(set (match_dup 3)
-                   (any_or:SI (match_dup 4) (match_dup 5)))
-              (clobber (reg:CC FLAGS_REG))])]
-  "split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);")
+  [(const_int 0)]
+{
+  split_double_mode (DImode, &operands[0], 3, &operands[0], &operands[3]);
+  if (operands[2] == constm1_rtx)
+    {
+      if (<CODE> == IOR)
+        {
+          operands[1] = constm1_rtx;
+          ix86_expand_move (SImode, &operands[0]);
+        }
+      else
+        ix86_expand_unary_operator (NOT, SImode, &operands[0]);
+    }
+  else if (operands[2] != const0_rtx)
+    ix86_expand_binary_operator (<CODE>, SImode, &operands[0]);
+  else if (operands[5] == const0_rtx)
+    emit_note (NOTE_INSN_DELETED);
+  if (operands[5] == constm1_rtx)
+    {
+      if (<CODE> == IOR)
+        {
+          operands[4] = constm1_rtx;
+          ix86_expand_move (SImode, &operands[3]);
+        }
+      else
+        ix86_expand_unary_operator (NOT, SImode, &operands[3]);
+    }
+  else if (operands[5] != const0_rtx)
+    ix86_expand_binary_operator (<CODE>, SImode, &operands[3]);
+  DONE;
+})
 
 (define_insn_and_split "*andndi3_doubleword"
   [(set (match_operand:DI 0 "register_operand" "=r,r")

	Jakub