On Wed, 4 Dec 2013, Vidya Praveen wrote:

> Hi Richi,
>
> Apologies for the late response. I was on vacation.
>
> On Mon, Oct 14, 2013 at 09:04:58AM +0100, Richard Biener wrote:
> > > void
> > > foo (int *__restrict__ a,
> > >      int *__restrict__ b,
> > >      int c)
> > > {
> > >   int i;
> > >
> > >   for (i = 0; i < 8; i++)
> > >     a[i] = b[i] * c;
> > > }
> >
> > Both cases can be handled by patterns that match
> >
> >   (mul:VXSI (reg:VXSI
> >              (vec_duplicate:VXSI reg:SI)))
>
> How do I arrive at this pattern in the first place? Assuming vec_init with
> uniform values are expanded as vec_duplicate, it will still be two
> expressions.
>
> That is,
>
> (set reg:VXSI (vec_duplicate:VXSI (reg:SI)))
> (set reg:VXSI (mul:VXSI (reg:VXSI) (reg:VXSI)))

Yes, but then combine comes along and creates

  (set reg:VXSI (mul:VXSI (reg:VXSI (vec_duplicate:VXSI (reg:SI)))))

which matches one of your define_insn[_and_split]s.

> > You'd then "consume" the vec_duplicate and implement it as
> > load scalar into element zero of the vector and use index mult
> > with index zero.
>
> If I understand this correctly, you are suggesting to leave the scalar
> load from memory as it is but treat the
>
> (mul:VXSI (reg:VXSI (vec_duplicate:VXSI reg:SI)))
>
> as
>
> load reg:VXSI[0], reg:SI
> mul reg:VXSI, reg:VXSI, reg:VXSI[0] // by reusing the destination register
> perhaps
>
> either by generating instructions directly or by using define_split. Am I
> right?

Possibly.  Or allow memory as operand 2 for your pattern (so, not
reg:SI but mem:SI).  Combine should be happy with that, too.

> If I'm right, then my concern is that it may be possible to simplify this
> further by loading directly to an indexed vector register from memory but
> it's too late at this point for such simplification to be possible.
>
> Please let me know what am I not understanding.

Not sure.  Did you try it?

Richard.
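
For illustration only, here is a plain C model of why the two forms discussed above compute the same result. It is a sketch, not GCC code: the names LANES, vxsi, mul_by_duplicate and mul_by_lane_zero are hypothetical, and the width of 4 is an assumption, since VXSI in this thread is a generic vector mode.

```c
#include <stddef.h>

#define LANES 4  /* assumed width; VXSI in the thread is generic */

/* Hypothetical stand-in for a VXSI register. */
typedef struct { int lane[LANES]; } vxsi;

/* Unfused form: (set v (vec_duplicate c)) followed by (set a (mul b v)). */
void mul_by_duplicate(int *a, const int *b, int c)
{
    vxsi dup;
    for (size_t i = 0; i < LANES; i++)
        dup.lane[i] = c;               /* broadcast the scalar to every lane */
    for (size_t i = 0; i < LANES; i++)
        a[i] = b[i] * dup.lane[i];     /* ordinary lane-wise multiply */
}

/* Fused form: the scalar is placed in element zero only, and every
   lane of b is multiplied by that one element (index zero). */
void mul_by_lane_zero(int *a, const int *b, int c)
{
    vxsi tmp = { {0} };
    tmp.lane[0] = c;                   /* "load reg:VXSI[0], reg:SI" */
    for (size_t i = 0; i < LANES; i++)
        a[i] = b[i] * tmp.lane[0];     /* multiply by element, index 0 */
}
```

Both functions write the same values to a for any b and c, which is the property a pattern consuming the vec_duplicate would rely on.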