https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123883

            Bug ID: 123883
           Summary: Inefficient bit manipulation code on RISC-V port
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: law at gcc dot gnu.org
  Target Milestone: ---

void foo(unsigned char *data, unsigned int lo_bit) {
  unsigned int mask = ((1UL << 1) - 1) << lo_bit;
  *data = (*data & ~mask) | ((1 << lo_bit) & mask);
}

Currently generates this code with rv64gcbv:

        lbu     a4,0(a0)
        li      a5,1
        sllw    a5,a5,a1
        xor     a5,a4,a5
        bset    a1,x0,a1
        and     a5,a5,a1
        xor     a4,a4,a5
        sb      a4,0(a0)
        ret

Good code would look like this:

        lb      a2, 0(a0)
        bset    a1, a2, a1
        sb      a1, 0(a0)
        ret
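
For reference, the source reduces to setting a single bit in the byte, which
is why a lone bset between the load and the store should be enough.  A minimal
sketch of that equivalence follows.  foo_ref and foo_matches_ref are
hypothetical names introduced only for this illustration, and lo_bit == 31 is
skipped because 1 << 31 overflows int in ISO C.

```c
/* The function from the report, unchanged. */
void foo(unsigned char *data, unsigned int lo_bit) {
  unsigned int mask = ((1UL << 1) - 1) << lo_bit;
  *data = (*data & ~mask) | ((1 << lo_bit) & mask);
}

/* Hypothetical reference: set one bit, then truncate to a byte on store. */
void foo_ref(unsigned char *data, unsigned int lo_bit) {
  *data = (unsigned char)(*data | (1u << lo_bit));
}

/* Returns 1 if foo and foo_ref agree for every byte value and every
   lo_bit in 0..30, 0 otherwise. */
int foo_matches_ref(void) {
  for (unsigned int bit = 0; bit <= 30; bit++) {
    for (unsigned int v = 0; v < 256; v++) {
      unsigned char a = (unsigned char)v;
      unsigned char b = (unsigned char)v;
      foo(&a, bit);
      foo_ref(&b, bit);
      if (a != b)
        return 0;
    }
  }
  return 1;
}
```

For lo_bit >= 8 the selected bit is dropped by the byte store, so both
versions leave *data unchanged.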


This attempted combine pattern illustrates a key issue:
(set (reg:DI 148)
    (xor:DI (sign_extend:DI (ashift:SI (const_int 1 [0x1])
                (subreg:QI (reg/v:DI 143 [ lo_bit ]) 0)))
        (reg:DI 135 [ _2 ])))


That is *almost* a binv instruction.  The problem is that the 1 << N is done in
SI and must be sign-extended to DI according to the RTL.  binv only flips bit
N; it doesn't flip the copies of the sign bit in bits 32..63.  So if (reg 143)
held the value 31, then using binv would generate incorrect code.
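
The N == 31 case can be checked with a small C model.  This is only a sketch:
rtl_xor, binv and sext32 are hypothetical helper names, the shift count is
assumed to be in 0..31, and binv is modeled as flipping bit (n & 63) of a
64-bit register as on RV64.

```c
#include <stdint.h>

/* Sign-extend the low 32 bits of v to 64 bits, using only unsigned
   arithmetic so no implementation-defined conversion is involved. */
static uint64_t sext32(uint64_t v) {
  v &= UINT64_C(0xffffffff);
  return (v & UINT64_C(0x80000000)) ? (v | UINT64_C(0xffffffff00000000)) : v;
}

/* Model of the attempted combine pattern:
   (xor:DI (sign_extend:DI (ashift:SI 1 n)) x), assuming n in 0..31. */
uint64_t rtl_xor(uint64_t x, unsigned n) {
  return x ^ sext32(UINT64_C(1) << (n & 31));
}

/* Model of RV64 binv: flip only bit (n & 63). */
uint64_t binv(uint64_t x, unsigned n) {
  return x ^ (UINT64_C(1) << (n & 63));
}
```

With these models the two agree for n in 0..30, but rtl_xor(0, 31) is
0xffffffff80000000 while binv(0, 31) is 0x80000000.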

I have some notes which indicate a match.pd pattern like this helps codegen
marginally:

(simplify
 (bit_and (convert? (lshift integer_onep@1 SSA_NAME@0))
          (convert? (lshift integer_onep SSA_NAME@0)))
  (lshift (convert @1) (convert @0)))


Then we'd get something like this from combine:

(set (reg:DI 160)
    (ior:DI (and:DI (rotate:DI (const_int -2 [0xfffffffffffffffe])
                (subreg:QI (reg/v:DI 143 [ lo_bit ]) 0))
            (reg:DI 151 [ *data_10(D) ]))
        (sign_extend:DI (ashift:SI (const_int 1 [0x1])
                (subreg:QI (reg/v:DI 143 [ lo_bit ]) 0)))))

Which I think is just bset+sext -- the key being to realize the AND clears a
bit that is then unconditionally set by the outer IOR, because the shift counts
are common across the rotate/left shift.
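
A C model can be used to sanity-check that reading.  This is a sketch under
stated assumptions, not a proof: rtl_ior, bset_sext, sext_w and rotl64 are
hypothetical names, the shift count is taken to be in 0..31, and (reg 151) is
assumed to hold a zero-extended byte since it comes from *data.

```c
#include <stdint.h>

/* Sign-extend the low 32 bits of v to 64 bits (models sext.w). */
static uint64_t sext_w(uint64_t v) {
  v &= UINT64_C(0xffffffff);
  return (v & UINT64_C(0x80000000)) ? (v | UINT64_C(0xffffffff00000000)) : v;
}

/* 64-bit rotate left; the masking avoids an out-of-range shift at n == 0. */
static uint64_t rotl64(uint64_t v, unsigned n) {
  n &= 63;
  return (v << n) | (v >> ((64 - n) & 63));
}

/* Model of:
   (ior:DI (and:DI (rotate:DI -2 n) x) (sign_extend:DI (ashift:SI 1 n))) */
uint64_t rtl_ior(uint64_t x, unsigned n) {
  return (rotl64(~UINT64_C(1), n) & x) | sext_w(UINT64_C(1) << (n & 31));
}

/* Model of bset followed by sext.w. */
uint64_t bset_sext(uint64_t x, unsigned n) {
  return sext_w(x | (UINT64_C(1) << (n & 63)));
}

/* Returns 1 if both models agree for every byte value x and n in 0..31. */
int patterns_agree_for_bytes(void) {
  for (unsigned n = 0; n < 32; n++)
    for (uint64_t x = 0; x < 256; x++)
      if (rtl_ior(x, n) != bset_sext(x, n))
        return 0;
  return 1;
}
```

The agreement depends on x having bits 31..63 clear; for arbitrary 64-bit x
the sext.w would change the upper bits even when n < 31.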
