On Sat, Aug 20, 2011 at 11:52 PM, Richard Henderson <r...@redhat.com> wrote:
> On 08/20/2011 02:16 PM, Uros Bizjak wrote:
>> +(define_insn "bmi2_umul<mode><dwi>3_1"
>> +  [(set (match_operand:<DWI> 0 "register_operand" "=r")
>> +        (mult:<DWI>
>> +          (zero_extend:<DWI>
>> +            (match_operand:DWIH 1 "nonimmediate_operand" "%d"))
>> +          (zero_extend:<DWI>
>> +            (match_operand:DWIH 2 "nonimmediate_operand" "rm"))))]
>> +  "TARGET_BMI
>> +   && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
>> +  "mulx\t{%2, %M0, %N0|%N0, %M0, %2}"
>> +  [(set_attr "type" "imul")
>> +   (set_attr "prefix" "vex")
>> +   (set_attr "mode" "<MODE>")])
>
> You can do better than this, and avoid the %M %N specifiers.
> The outputs are truly independent and do not need to be a pair.
>
> See the mn10300 umulsidi3{,_internal} patterns.

I have tried your suggestion, using patterns like the following:

(define_insn "umulsidi3_1"
  [(set (match_operand:SI 0 "register_operand" "=a,r")
        (mult:SI
          (match_operand:SI 2 "nonimmediate_operand" "%0,d")
          (match_operand:SI 3 "nonimmediate_operand" "rm,rm")))
   (set (match_operand:SI 1 "register_operand" "=d,r")
        (truncate:SI
          (lshiftrt:DI
            (mult:DI (zero_extend:DI (match_dup 2))
                     (zero_extend:DI (match_dup 3)))
            (const_int 32))))
   (clobber (reg:CC FLAGS_REG))]
  "!TARGET_64BIT
   && !(MEM_P (operands[2]) && MEM_P (operands[3]))"
  "@
   mull\t%3
   #"
  [(set_attr "isa" "base,bmi2")
   (set_attr "type" "imul,imulx")
   (set_attr "length_immediate" "0,*")
   (set (attr "athlon_decode")
     (cond [(eq_attr "alternative" "0")
              (if_then_else (eq_attr "cpu" "athlon")
                (const_string "vector")
                (const_string "double"))]
           (const_string "*")))
   (set_attr "amdfam10_decode" "double,*")
   (set_attr "bdver1_decode" "direct,*")
   (set_attr "prefix" "orig,vex")
   (set_attr "mode" "SI")])

The compiler works; for a couple of simple testcases it produces the
same code as with register pairs. However, there are a couple of
problems:

- Various length calculations look into operand{0,1,2} to determine
  the instruction length. This is fixable with a little effort.

- Patterns that include (const_int N) do not macroize, and this leads
  to pattern explosion. For this simple example, in addition to
  splitting out the any_extend pattern, we also have to split the DWIH
  patterns. In the past, I have tried to use match_operand with
  const_int INTVAL predicates, but gcc crashed elsewhere due to the
  additional operand. Please see [1].

IMO, it is currently too much pain to implement split pairs in the
existing patterns for too little gain. I will, however, add a
post-reload split from the proposed pattern to a mulx pattern, to
avoid %M and %N.

[1] http://gcc.gnu.org/ml/gcc/2010-07/msg00143.html

Uros.
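
As a reading aid, the two SETs in the umulsidi3_1 pattern above can be
sketched in plain C. This only illustrates the arithmetic the RTL
describes (a 32x32->64 unsigned multiply delivered as two independent
32-bit results); it is not code from the patch, and the helper name
umul_split is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only: mirrors the two SETs of
   the umulsidi3_1 pattern for SImode inputs.  */
static void
umul_split (uint32_t a, uint32_t b, uint32_t *lo, uint32_t *hi)
{
  /* (mult:DI (zero_extend:DI a) (zero_extend:DI b)) */
  uint64_t wide = (uint64_t) a * (uint64_t) b;

  /* Operand 0: (mult:SI a b), i.e. the low 32 bits of the product.  */
  *lo = a * b;

  /* Operand 1: (truncate:SI (lshiftrt:DI wide (const_int 32))),
     i.e. the high 32 bits of the product.  */
  *hi = (uint32_t) (wide >> 32);
}
```

Because the low and high halves are written to separate destinations,
neither has to live in a particular half of a double-word register
pair, which is the point of describing them as two independent
outputs.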