On Tue, Jan 9, 2024 at 11:19 AM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Tue, Jan 9, 2024 at 11:06 AM Richard Biener <rguent...@suse.de> wrote:
> >
> > On Tue, 9 Jan 2024, Uros Bizjak wrote:
> >
> > > On Tue, Jan 9, 2024 at 10:44 AM Richard Biener <rguent...@suse.de> wrote:
> > > >
> > > > On Tue, 9 Jan 2024, Uros Bizjak wrote:
> > > >
> > > > > On Tue, Jan 9, 2024 at 9:58 AM Richard Biener <rguent...@suse.de> wrote:
> > > > > >
> > > > > > On Mon, 8 Jan 2024, Uros Bizjak wrote:
> > > > > >
> > > > > > > On Mon, Jan 8, 2024 at 5:57 PM Andrew Pinski <pins...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Mon, Jan 8, 2024 at 6:44 AM Uros Bizjak <ubiz...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > Instead of converting XOR or PLUS of two values, ANDed with two
> > > > > > > > > constants that have no bits in common, to IOR expression, convert
> > > > > > > > > IOR or XOR of said two ANDed values to PLUS expression.
> > > > > > > >
> > > > > > > > I think this only helps targets which have a leal-like instruction.
> > > > > > > > Also I think it is the same issue as I recorded as PR 111763. I
> > > > > > > > suspect BIT_IOR is more of a canonical form for GIMPLE, while we
> > > > > > > > should handle this in expand to decide if we want to use PLUS or IOR.
> > > > > > >
> > > > > > > For the pr108477.c testcase, expand pass expands:
> > > > > > >
> > > > > > >  r_3 = a_2(D) & 1;
> > > > > > >  p_5 = b_4(D) & 4294967292;
> > > > > > >  _1 = r_3 | p_5;
> > > > > > >  _6 = _1 + 2;
> > > > > > >  return _6;
> > > > > > >
> > > > > > > The transformation ( | -> + ) is valid only when CST1 & CST2 == 0, so
> > > > > > > we need to determine values of constants. Is this information
> > > > > > > available in the expand pass?
> > > > > >
> > > > > > If there are only single uses, then TER makes this info available.
> > > > > >
> > > > > > > IMO, the transformation from (ra | rb | cst) to (ra + rb + cst) as in
> > > > > > > the shown testcase would be beneficial when constructing control
> > > > > > > register values (see e.g. mesa-3d). We can use LEA instead of OR+ADD
> > > > > > > sequence in this case.
> > > > > >
> > > > > > The other possibility is to expose LEA as an optab and make GIMPLE
> > > > > > instruction selection generate a direct internal function for that
> > > > > > (that would be the "better" way).  There is LEA-like &TARGET_MEM_REF
> > > > > > but that has constraints on the addends' mode (ptr_mode) which might
> > > > > > not fit what the target can do?  Otherwise that would be an existing
> > > > > > way to do this computation as well.
> > > > >
> > > > > I think there is no need for a new optab. If we can determine at
> > > > > expand time that ANDed values are fed to the IOR/XOR expressions, then
> > > > > we can check the constants and emit PLUS RTX instead. RTL combine pass
> > > > > will then create LEA instruction from separate PLUS instructions.
> > > > >
> > > > > So, we can emit:
> > > > >
> > > > > op0 = and (a, CST1)
> > > > > op1 = and (b, CST2)
> > > > > op2 = plus (op0, op1)
> > > > >
> > > > > RTX sequence for (a & CST1) | (b & CST2) when CST1 & CST2 == 0
> > > > >
> > > > > and
> > > > >
> > > > > op0 = and (a, CST1)
> > > > > op1 = plus (op0, CST2)
> > > > >
> > > > > RTX sequence for (a & CST1) | CST2 when CST1 & CST2 == 0
> > > > >
> > > > > The above transformation is valid for IOR and XOR.
> > > > >
> > > > > x86 can't combine IOR/XOR in any meaningful way, but can combine the
> > > > > sequence of PLUS (together with MULT) RTXes to LEA.
> > > >
> > > > Btw, this looks like a three-insn combination even with IOR so a
> > > > pattern for this case would work as well?
> > >
> > > IIUC the question: x86 does not have three-input IOR, but we want to
> > > emulate it with LEA (three-input PLUS, but one of the arguments has to
> > > be constant).
> >
> > But couldn't you have a define_insn matching LEA but with IOR instead
> > of PLUS (and with the appropriate constraints?).  Maybe it could
> > also be combine trying PLUS instead of IOR if that's possible
> > (looking at the constants).
>
> We would have to include masking ANDs in the define_insn and add
> additional conditions regarding mask constants in the insn constraint.
>
> So, this define_insn would look something like:
>
> (ior (ior (and (op1, CST1), and (op2, CST2)), CST3))
>
> with the constants pairwise disjoint, i.e. (CST1 & CST2) == 0,
> (CST1 & CST3) == 0 and (CST2 & CST3) == 0.
>
> and combinations with PLUS / MULT RTXes including all the variants
> with two arguments.

Oh, and then we would have to split masking ANDs out of the above
instruction. We can't clobber the inputs, so this would have to be done
with temporary registers before reload...
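
As a minimal sketch of the identity this whole discussion relies on (plain C, not GCC code; the helper names are made up for illustration): IOR and PLUS of masked values agree when the masks share no bits, because then no bit position can produce a carry. The same holds for XOR. For three terms, the sketch assumes the constants have to be disjoint pairwise.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: true when the masks share no set bit. */
static int masks_disjoint(uint32_t cst1, uint32_t cst2)
{
    return (cst1 & cst2) == 0;
}

/* Reference form: (a & cst1) | (b & cst2) | cst3. */
static uint32_t ior_form(uint32_t a, uint32_t b,
                         uint32_t cst1, uint32_t cst2, uint32_t cst3)
{
    return (a & cst1) | (b & cst2) | cst3;
}

/* PLUS form of the same expression.  It matches ior_form() only when the
   three constants are pairwise disjoint, which the asserts check. */
static uint32_t plus_form(uint32_t a, uint32_t b,
                          uint32_t cst1, uint32_t cst2, uint32_t cst3)
{
    assert(masks_disjoint(cst1, cst2));
    assert(masks_disjoint(cst1, cst3));
    assert(masks_disjoint(cst2, cst3));
    return (a & cst1) + (b & cst2) + cst3;
}
```

With the masks from the pr108477.c dump (1 and 4294967292) and the added constant 2, all three terms are pairwise disjoint, so (ra | rb | cst) and (ra + rb + cst) compute the same value. With overlapping masks such as 3 and 6 the two forms differ.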

Uros.
