On 09/13/2012 11:49 AM, Jakub Jelinek wrote:
> On Thu, Sep 13, 2012 at 11:25:42AM -0700, Richard Henderson wrote:
>> (1) Negating the second argument is arguably non-canonical rtl.
>
> That is why I've put in the *fmai_fnm{add,sub}_<mode> patterns
> operands 2 with the neg as first operand of the FMA rtl.  That way it is
> canonical (otherwise it didn't match in combine).  The FMA rtl operand
> order doesn't need to imply the order of instruction operands.

Sorry, I didn't read the unidiff properly.

>           (fma:VF_128
>             (match_operand:VF_128 1 "nonimmediate_operand" " 0, 0")
>             (match_operand:VF_128 2 "nonimmediate_operand" "xm, x")
>             (match_operand:VF_128 3 "nonimmediate_operand" " x,xm"))
>           (match_operand:VF_128 4 "nonimmediate_operand" " 0, 0")
...
> which was apparently too much for reload (supposedly the two "0" constraint
> operands, even when the expander used (match_dup 1)).

Yes.  We'd have to have two different patterns to "properly" support fma4.

Though I suppose now that I think about it this is extremely similar to the
vfmadd231 case, in that in order to want to generate

        vfmaddss %xmm3, %xmm2, %xmm1, %xmm0

given the semantics of the builtin we'd have had to emit a copy of %xmm1 or
%xmm2 into %xmm0 anyway.

So we might as well not support this and just do

(define_insn "*fmai_fmadd_<mode>"
  [(set (match_operand:VF_128 0 "register_operand" "=x,x,x,x")
        (vec_merge:VF_128
          (fma:VF_128
            (match_operand:VF_128 1 "nonimmediate_operand" "%0, 0, 0,0")
            (match_operand:VF_128 2 "nonimmediate_operand" "xm, x, x,m")
            (match_operand:VF_128 3 "nonimmediate_operand" " x,xm,xm,x"))
          (match_dup 0)
          (const_int 1)))]
  "TARGET_FMA || TARGET_FMA4"
  "@
   vfmadd132<ssescalarmodesuffix>\t{%2, %3, %0|%0, %3, %2}
   vfmadd213<ssescalarmodesuffix>\t{%3, %2, %0|%0, %2, %3}
   vfmadd<ssescalarmodesuffix>\t{%3, %2, %1, %0|%0, %1, %2, %3}
   vfmadd<ssescalarmodesuffix>\t{%3, %2, %1, %0|%0, %1, %2, %3}"
  [(set_attr "isa" "fma,fma,fma4,fma4")
   (set_attr "type" "ssemuladd")
   (set_attr "mode" "<MODE>")])


r~