On Wed, 4 Dec 2013, Vidya Praveen wrote:

> Hi Richi,
> 
> Apologies for the late response. I was on vacation.
> 
> On Mon, Oct 14, 2013 at 09:04:58AM +0100, Richard Biener wrote:
> > > void
> > > foo (int *__restrict__ a,
> > >      int *__restrict__ b,
> > >      int c)
> > > {
> > >   int i;
> > > 
> > >   for (i = 0; i < 8; i++)
> > >     a[i] = b[i] * c;
> > > }
> > 
> > Both cases can be handled by patterns that match
> > 
> >   (mul:VXSI (reg:VXSI
> >              (vec_duplicate:VXSI reg:SI)))
> 
> How do I arrive at this pattern in the first place? Assuming a vec_init with
> uniform values is expanded as vec_duplicate, it will still be two
> expressions.
> 
> That is,
> 
> (set reg:VXSI (vec_duplicate:VXSI (reg:SI)))
> (set reg:VXSI (mul:VXSI (reg:VXSI) (reg:VXSI)))

Yes, but then combine comes along and creates

 (set reg:VXSI (mul:VXSI (reg:VXSI (vec_duplicate:VXSI (reg:SI)))))

which matches one of your define_insn[_and_split]s.
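
As a rough illustration of what the two forms compute, here is a minimal C
sketch using GCC's generic vector extensions. The type name v4si, the choice
of four lanes for VXSI, and both function names are assumptions made for this
example; they do not come from any target's machine description.

```c
/* Four 32-bit ints; stands in for the VXSI mode in the RTL above
   (the lane count of four is an assumption for this sketch).  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Two-insn form: first duplicate the scalar into every lane
   (vec_duplicate), then do a full vector multiply.  */
v4si
mul_two_insns (v4si b, int c)
{
  v4si dup = { c, c, c, c };
  return b * dup;
}

/* Combined form as a by-element multiply would implement it: the scalar
   lives only in element zero and every lane of b is multiplied by it.  */
v4si
mul_by_lane0 (v4si b, int c)
{
  v4si lane = { c, 0, 0, 0 };
  v4si r = { 0, 0, 0, 0 };
  int i;

  for (i = 0; i < 4; i++)
    r[i] = b[i] * lane[0];
  return r;
}
```

Both functions return the same vector for the same inputs, which is why a
pattern matching the combined RTL can consume the vec_duplicate.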

> > You'd then "consume" the vec_duplicate and implement it as
> > load scalar into element zero of the vector and use index mult
> > with index zero.
> 
> If I understand this correctly, you are suggesting to leave the scalar
> load from memory as it is but treat the 
> 
> (mul:VXSI (reg:VXSI (vec_duplicate:VXSI reg:SI)))
> 
> as 
> 
> load reg:VXSI[0], reg:SI
> mul reg:VXSI, reg:VXSI, reg:VXSI[0] // by reusing the destination register perhaps
> 
> either by generating instructions directly or by using define_split. Am I right?

Possibly.  Or allow memory as operand 2 for your pattern (so, not
reg:SI but mem:SI).  Combine should be happy with that, too.
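
A source-level test case that would exercise the memory-operand variant could
look like the sketch below. The function name foo_mem and the pointer
parameter c are hypothetical; the loop is otherwise the same as the one quoted
at the top of this mail. Whether the scalar load actually survives as a mem:SI
operand up to combine depends on the target and the earlier passes.

```c
/* Hypothetical variant of foo: the scalar multiplier is read through a
   pointer, so it starts out in memory rather than in a register.  */
void
foo_mem (int *__restrict__ a,
         int *__restrict__ b,
         int *__restrict__ c)
{
  int i;

  /* *c is loop-invariant because of __restrict__, so the vectorizer can
     treat it as a single scalar that is broadcast to all lanes.  */
  for (i = 0; i < 8; i++)
    a[i] = b[i] * *c;
}
```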
 
> If I'm right, then my concern is that it may be possible to simplify this
> further by loading directly into an indexed vector register from memory, but
> it's too late at this point for such a simplification to be possible.
> 
> Please let me know what I am not understanding.

Not sure.  Did you try it?

Richard.
