On Tue, Jan 9, 2024 at 9:58 AM Richard Biener <rguent...@suse.de> wrote:
>
> On Mon, 8 Jan 2024, Uros Bizjak wrote:
>
> > On Mon, Jan 8, 2024 at 5:57 PM Andrew Pinski <pins...@gmail.com> wrote:
> > >
> > > On Mon, Jan 8, 2024 at 6:44 AM Uros Bizjak <ubiz...@gmail.com> wrote:
> > > >
> > > > Instead of converting XOR or PLUS of two values, ANDed with two
> > > > constants that have no bits in common, to IOR expression, convert
> > > > IOR or XOR of said two ANDed values to PLUS expression.
> > >
> > > I think this only helps targets which have a leal-like instruction.
> > > Also I think it is the same issue as I recorded as PR 111763.  I
> > > suspect BIT_IOR is more of a canonical form for GIMPLE while we should
> > > handle this in expand to decide if we want to use PLUS or IOR.
> >
> > For the pr108477.c testcase, expand pass expands:
> >
> >  r_3 = a_2(D) & 1;
> >  p_5 = b_4(D) & 4294967292;
> >  _1 = r_3 | p_5;
> >  _6 = _1 + 2;
> >  return _6;
> >
> > The transformation ( | -> + ) is valid only when CST1 & CST2 == 0, so
> > we need to determine values of constants. Is this information
> > available in the expand pass?
>
> If there are single uses then TER makes this info available.
>
> > IMO, the transformation from (ra | rb | cst) to (ra + rb + cst) as in
> > the shown testcase would be beneficial when constructing control
> > register values (see e.g. mesa-3d). We can use LEA instead of OR+ADD
> > sequence in this case.
>
> The other possibility is to expose LEA as an optab and make GIMPLE
> instruction selection generate a direct internal function for that
> (that would be the "better" way).  There is LEA-like &TARGET_MEM_REF
> but that has constraints on the addends mode (ptr_mode) which might
> not fit what the target can do?  Otherwise that would be an existing
> way to do this computation as well.

I think there is no need for a new optab. If we can determine at
expand time that ANDed values are fed to the IOR/XOR expressions, then
we can check the constants and emit a PLUS RTX instead. The RTL combine
pass will then create an LEA instruction from the separate PLUS
instructions.

So, for (a & CST1) | (b & CST2), when (CST1 & CST2) == 0, we can emit
the RTX sequence:

op0 = and (a, CST1)
op1 = and (b, CST2)
op2 = plus (op0, op1)

and for (a & CST1) | CST2, when (CST1 & CST2) == 0:

op0 = and (a, CST1)
op1 = plus (op0, CST2)

The above transformation is valid for IOR and XOR.
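
As a quick sanity check of the identity behind this, here is a minimal
sketch in plain C. It is not GCC code; all function names are made up
for illustration. It compares the IOR, XOR and PLUS forms over all
4-bit operands and all pairs of 4-bit masks with no bits in common:

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; none of this is GCC code. */

/* (a & c1) | (b & c2) */
uint32_t masked_ior(uint32_t a, uint32_t b, uint32_t c1, uint32_t c2)
{
    return (a & c1) | (b & c2);
}

/* (a & c1) ^ (b & c2) */
uint32_t masked_xor(uint32_t a, uint32_t b, uint32_t c1, uint32_t c2)
{
    return (a & c1) ^ (b & c2);
}

/* (a & c1) + (b & c2) */
uint32_t masked_plus(uint32_t a, uint32_t b, uint32_t c1, uint32_t c2)
{
    return (a & c1) + (b & c2);
}

/* Returns 1 if IOR, XOR and PLUS agree for every 4-bit a and b and
   every pair of 4-bit masks with (c1 & c2) == 0, in both the
   two-operand form and the (a & c1) op c2 form; returns 0 otherwise. */
int identities_hold_for_4bit_values(void)
{
    for (uint32_t c1 = 0; c1 < 16; c1++)
        for (uint32_t c2 = 0; c2 < 16; c2++) {
            if ((c1 & c2) != 0)
                continue;               /* masks overlap: not covered */
            for (uint32_t a = 0; a < 16; a++)
                for (uint32_t b = 0; b < 16; b++) {
                    uint32_t sum = masked_plus(a, b, c1, c2);
                    if (masked_ior(a, b, c1, c2) != sum
                        || masked_xor(a, b, c1, c2) != sum)
                        return 0;
                    /* Constant form: (a & c1) op c2. */
                    if (((a & c1) | c2) != (a & c1) + c2
                        || ((a & c1) ^ c2) != (a & c1) + c2)
                        return 0;
                }
        }
    return 1;
}
```

With disjoint masks no bit position is set in both operands, so the
addition never carries and all three operations give the same result.
With overlapping masks they differ, e.g. a = b = CST1 = CST2 = 1.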

x86 can't combine IOR/XOR in any meaningful way, but can combine a
sequence of PLUS (together with MULT) RTXes into an LEA.
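
To make the shape concrete, here is a sketch of the two forms in C. It
is a hypothetical reconstruction based only on the GIMPLE dump quoted
above (the real pr108477.c may look different), and the function names
are invented:

```c
#include <stdint.h>

/* Hypothetical reconstruction from the GIMPLE dump; not the actual
   pr108477.c source. 0xfffffffc is 4294967292 from the dump. */

/* IOR form: the OR has to be done separately before the add. */
uint32_t pack_ior(uint32_t a, uint32_t b)
{
    return ((a & 1u) | (b & 0xfffffffcu)) + 2u;
}

/* PLUS form: after the two ANDs this is reg + reg + constant, which
   is the base + index + displacement shape that LEA can compute. */
uint32_t pack_plus(uint32_t a, uint32_t b)
{
    return (a & 1u) + (b & 0xfffffffcu) + 2u;
}
```

The two functions return the same value for all inputs because
(1 & 0xfffffffc) == 0. Whether a given compiler version actually emits
LEA for the second form depends on the target and options.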

(BTW: I am not versed in the expand stuff, so a disclaimer is at hand ;) )

Uros.
