https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80770

--- Comment #12 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jeff Law <[email protected]>:

https://gcc.gnu.org/g:684d385720cd5d25df8dc69c5281fc0fb9c3bebe

commit r17-434-g684d385720cd5d25df8dc69c5281fc0fb9c3bebe
Author: Shreya Munnangi <[email protected]>
Date:   Sun May 10 21:37:29 2026 -0600

    [RISC-V][PR rtl-optimization/80770] Simplify bit flipping operations down to xor

    So this is the target independent work to finish resolving pr80770.  It's a
    combination of Shreya's efforts and my own.

    To recap, the basic idea is we want to simplify RTL blobs which ultimately
    are just flipping a bit.  Consider:

    > (set (reg:DI 153)
    >      (ior:DI (and:DI (reg:DI 140 [ *s_4(D) ])
    >              (const_int 254 [0xfe]))
    >          (and:DI (not:DI (reg:DI 140 [ *s_4(D) ]))
    >              (const_int 1 [0x1]))))

    The first operand of the IOR clears the low bit of the source register,
    leaving everything else unchanged.  The second operand of the IOR clears
    everything but the low bit and flips the low bit.  When we IOR those together
    we get the original value with the lowest bit flipped.  The key is to realize
    we have the same pseudo in both arms and there are no bits in common for the
    constants.  So this works for any bit or set of bits as long as the
    constants have the right form.
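
The identity described above can be checked with a small C sketch.  This is
only an illustration of the bit arithmetic, not the code added to
simplify-rtx.cc; the function names and the parameters c0 and c1 (the two
AND constants, assumed to share no bits) are hypothetical.

```c
#include <stdint.h>

/* The shape from the RTL above: (x & c0) | (~x & c1).
   Assumes c0 and c1 have no bits in common.  */
uint64_t
ior_form (uint64_t x, uint64_t c0, uint64_t c1)
{
  return (x & c0) | (~x & c1);
}

/* Equivalent form: keep only the bits covered by either constant,
   then flip the bits selected by c1.  */
uint64_t
xor_form (uint64_t x, uint64_t c0, uint64_t c1)
{
  return (x & (c0 | c1)) ^ c1;
}

/* Return 1 if both forms agree for every 8-bit input value.  */
int
forms_agree_for_all_bytes (uint64_t c0, uint64_t c1)
{
  for (uint64_t x = 0; x < 256; x++)
    if (ior_form (x, c0, c1) != xor_form (x, c0, c1))
      return 0;
  return 1;
}
```

With c0 = 0xfe and c1 = 0x1 as in the RTL, and an input already known to fit
in a byte, the AND with (c0 | c1) changes nothing and the whole expression
reduces to a single xor with 1.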

    That gets us good code on riscv and almost certainly helps other targets.
    There is another form which shows up on the H8 and possibly other targets
    with sub-word arithmetic.  op0 and op1 are respectively:

    > (gdb) p debug_rtx (op0)
    > (and:QI (reg:QI 24 [ *s_4(D) ])
    >     (const_int 127 [0x7f]))
    > $1 = void
    > (gdb) p debug_rtx (op1)
    > (plus:QI (and:QI (reg:QI 24 [ *s_4(D) ])
    >         (const_int -128 [0xffffffffffffff80]))
    >     (const_int -128 [0xffffffffffffff80]))
    > $2 = void

    Note we're in QImode.  op1 just flips the highest QImode bit.  If there are
    carry-outs, we don't really care about them.  The net is we can capture that
    case on the H8 by verifying this form flips the highest bit for the given
    mode.  For any lower bit the carry-outs are relevant and our transformation
    would be incorrect.
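
A minimal C sketch of why the PLUS form is only safe for the top bit of the
mode.  It models QImode with uint8_t; the function names are hypothetical and
this is not the check implemented in the patch.

```c
#include <stdint.h>

/* op0 | op1 from the dump above, generalized to a single-bit mask BIT.
   Arithmetic is truncated to 8 bits, as it would be in QImode.  */
uint8_t
plus_form (uint8_t x, uint8_t bit)
{
  uint8_t op0 = x & (uint8_t) ~bit;
  /* Adding BIT to the isolated bit flips it; any carry moves one bit up.
     For bit 7 the carry falls off the end of the mode.  */
  uint8_t op1 = (uint8_t) ((x & bit) + bit);
  return op0 | op1;
}

/* Return 1 if the PLUS form equals x ^ bit for every 8-bit x.  */
int
plus_form_matches_xor (uint8_t bit)
{
  for (unsigned x = 0; x < 256; x++)
    if (plus_form ((uint8_t) x, bit) != (uint8_t) (x ^ bit))
      return 0;
  return 1;
}
```

For bit = 0x80 the two forms agree on every input.  For bit = 0x40 they do
not: with x = 0x40 the addition carries into bit 7 and the PLUS form yields
0x80, while the xor yields 0.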

    Plan is to commit Friday.  While it has been tested with the usual
    bootstraps as well as testing on various cross platforms, I'm more
    comfortable giving folks time to take a looksie to see if Shreya or I
    missed anything critical.

    For the testcase in question before/afters look like this:

    x86:
            movzbl  (%rdi), %eax
            movl    %eax, %edx
            andl    $-2, %eax
            andl    $1, %edx
            xorl    $1, %edx
            orl     %edx, %eax
            movb    %al, (%rdi)

      Turns into:

            xorb    $1, (%rdi)

    RISC-V:

            lbu     a5,0(a0)
            andi    a4,a5,1
            xori    a4,a4,1
            andi    a5,a5,-2
            or      a5,a5,a4
            sb      a5,0(a0)

      Turns into:

            lbu     a5,0(a0)
            xori    a5,a5,1
            sb      a5,0(a0)
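
For orientation, source of roughly the following kind could lead to the
load / mask / flip / merge / store sequence shown above.  This is an assumed
illustration only; the struct, field, and function names are hypothetical and
it is not claimed to be the testcase added by the commit.

```c
/* Hypothetical example: a one-bit field sharing a byte with other bits.  */
struct flags
{
  unsigned char low : 1;   /* the bit being flipped */
  unsigned char rest : 7;  /* must be preserved by the update */
};

/* Logically negate the one-bit field in place.  */
void
flip_low (struct flags *s)
{
  s->low = !s->low;
}
```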

            PR rtl-optimization/80770
    gcc/
            * rtl.h (simplify_context::simplify_ior_with_common_term): Add
            new method.
            (simplify_context::simplify_binary_operation_1): Use new method.
            * simplify-rtx.cc (simplify_context::simplify_ior_with_common_term):
            New method.

    gcc/testsuite/

            * gcc.target/riscv/pr80770.c: New test.
            * gcc.target/riscv/pr80770-2.c: New test.
            * gcc.target/h8300/pr80770.c: New test.
            * gcc.target/h8300/pr80770-2.c: New test.

    Co-authored-by: Jeff Law  <[email protected]>
