On Wed, Dec 10, 2025 at 11:00 AM Kishan Parmar <[email protected]> wrote:
>
> On 10/12/25 10:25 pm, Andrew Pinski wrote:
> > The only question I have is why reject INTEGER_CST and not VECTOR_CST?
> >  Really why reject integer cst in general?
> >
> > Thanks,
> > Andrew
> Hi Andrew,
>
> The reason for rejecting |INTEGER_CST| is to avoid regressing rs6000 single 
> rotate-and-insert/mask
> instructions generation.

I am still not happy with the constant check.
Can you provide a testcase where the constant check is worse? What is
the before and after check?
Attach the combine dumps if needed because the below:

>
> When @2 is a constant (a mask), the existing RTL infrastructure(simplify-rtx) 
> handles the canonical XOR-form well.
> Specifically, the combiner can match the sequence (rotate -> xor -> and -> 
> xor) and merge it into a  rotate + (rotate-and-insert/mask) instruction.

Does not help me better to understand since there is no rotate in
match pattern at all.



>
> If we force the IOR form |(A & C) | (B & ~C)| in GIMPLE for constants, the 
> RTL combiner fails to match
> the single rotate-and-insert pattern.
> It often greedily simplifies the (ROTATE + AND ) part into a simple logical 
> shift (|lshiftrt|), breaking the sequence.
>
> Example Trace (IOR Form - Regressed):
>
> When we have a sequence of ROTATE(A) + AND(A, C) + ANDN(B, C) + IOR (A,B)
> 1) combine sees: (rotate %a) & c
> 2) Simplifies to: lshiftrt %a (loss of IOR form semantics)
> 3) Result: We end up with 3 instructions: lshiftrt, andn, ior.

This is the part I am not getting. Specifically the whole `+` part and
which instructions are being combined.

Is it that we originally had:
A = RRotate(P, N)
T = (A & C)
T1 = (B & ~C)
R = T | T1

And then we get:
T = lshiftrt (P, N)
T1 = B & ~C
R = T | T1

Which then is not recognized as inserting P into B starting at N for
the bitmask ?
But can't that show up even without the andn pattern?
Which means there might be a missing rs6000 pattern for this case?

Thanks,
Andrew


>
>
> Example Trace (XOR Form - Current/Optimal):
>
> When we have a sequence of ROTATE(A) +XOR + AND + XOR
> 1) Ignore the ROTATE
> 2) simplify-rtx sees the canonical XOR pattern.
> 3) Matches it directly to a rlwimi kind of operation.
> 4) Result: 2 instruction rotate + (rotate-and-insert/mask)
>
> Attempting to match this new sequence in the combiner is difficult because 
> the middle-end attempts to
> simplify the shift/mask operation into a ZERO_EXTRACT. Since the RS6000 
> backend does not generally accept ZERO_EXTRACT
> for these integer operations, the match fails, and we lose the rlwimi kind of 
> instructions (regressing to 3 separate instructions).
>
> Since simplify-rtx already detects and optimizes the constant case correctly
> (transforming the XOR form where appropriate without breaking), i thought it 
> is safer to restrict this GIMPLE patch to variables only.



I do think you can add a few gimple level testcases here.
You will need at least a new target-supports andn_int and maybe and
andn_v4int. Right now aarch64 defines both andnsi and andnv4si too.
x86 defines andnv4si with `-msse2` but not the scalar version (even
though -mbmi defines the scalar version).
So you might need add_options_for_andn_int and
add_options_for_andnv4int procs (like add_options_for_tls in
target-supports.exp).

Thanks,
Andrew Pinski

>
> Thanks,
> Kishan

Reply via email to