https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123883
Bug ID: 123883
Summary: Inefficient bit manipulation code on RISC-V port
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: law at gcc dot gnu.org
Target Milestone: ---
void foo(unsigned char *data, unsigned int lo_bit) {
  unsigned int mask = ((1UL << 1) - 1) << lo_bit;
  *data = (*data & ~mask) | ((1 << lo_bit) & mask);
}
GCC currently generates this code for rv64gcbv:
lbu a4,0(a0)
li a5,1
sllw a5,a5,a1
xor a5,a4,a5
bset a1,x0,a1
and a5,a5,a1
xor a4,a4,a5
sb a4,0(a0)
ret
Good code would look like this:
lb a2, 0(a0)
bset a1, a2, a1
sb a1, 0(a0)
ret
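The short sequence is plausible because the test case reduces to setting a
single bit: mask is just 1 << lo_bit, so (*data & ~mask) | mask is
*data | mask. The following is a minimal sketch that checks this on a host
compiler. The names foo_reported, foo_simplified and count_foo_mismatches are
made up for illustration, and lo_bit is limited to 0..30 so that 1 << lo_bit
stays well defined in C.

```c
/* The function from the report, unchanged apart from its name. */
void foo_reported(unsigned char *data, unsigned int lo_bit) {
    unsigned int mask = ((1UL << 1) - 1) << lo_bit;
    *data = (*data & ~mask) | ((1 << lo_bit) & mask);
}

/* Hypothetical simplified form: set bit lo_bit, then truncate to a byte. */
void foo_simplified(unsigned char *data, unsigned int lo_bit) {
    *data = (unsigned char)(*data | (1u << lo_bit));
}

/* Returns the number of (value, lo_bit) pairs on which the two forms differ.
   lo_bit stops at 30 to avoid shifting a signed 1 into the sign bit. */
int count_foo_mismatches(void) {
    int mismatches = 0;
    for (unsigned int lo_bit = 0; lo_bit < 31; lo_bit++) {
        for (unsigned int v = 0; v < 256; v++) {
            unsigned char a = (unsigned char)v;
            unsigned char b = (unsigned char)v;
            foo_reported(&a, lo_bit);
            foo_simplified(&b, lo_bit);
            mismatches += (a != b);
        }
    }
    return mismatches;
}
```

If count_foo_mismatches() returns 0, the two forms agree on every tested
input, which is what a load, a single bset and a store would compute.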
This attempted combine pattern illustrates a key issue:
(set (reg:DI 148)
    (xor:DI (sign_extend:DI (ashift:SI (const_int 1 [0x1])
                (subreg:QI (reg/v:DI 143 [ lo_bit ]) 0)))
        (reg:DI 135 [ _2 ])))
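That is *almost* a binv instruction. The problem is that the 1 << N is computed
in SI mode and, according to the RTL, is then sign-extended to DI, whereas binv
flips only bit N and leaves the upper bits alone. So if (reg 143) held the
value 31, the RTL would flip bits 31 through 63 while binv would flip only
bit 31, and matching this pattern as binv would generate incorrect code.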
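The mismatch at a shift count of 31 can be modelled on a host compiler. This
is only a sketch under stated assumptions: the helper names are made up, the
shift count is assumed to be in 0..31, and the conversion to int32_t relies on
the usual two's complement behaviour (which GCC documents).

```c
#include <stdint.h>

/* Models (sign_extend:DI (ashift:SI (const_int 1) n)); assumes n in 0..31. */
uint64_t sext_si_one_shift(unsigned int n) {
    uint32_t si = (uint32_t)1 << n;          /* 1 << n computed in 32 bits */
    return (uint64_t)(int64_t)(int32_t)si;   /* then sign-extended to 64 bits */
}

/* Models the XOR from the attempted combine pattern. */
uint64_t xor_per_rtl(uint64_t x, unsigned int n) {
    return x ^ sext_si_one_shift(n);
}

/* Models binv on RV64: flip exactly bit n and nothing else. */
uint64_t binv_model(uint64_t x, unsigned int n) {
    return x ^ ((uint64_t)1 << (n & 63));
}

/* Returns the smallest n in 0..31 where the two models differ for x = 0,
   or -1 if they never differ. */
int first_disagreement(void) {
    for (unsigned int n = 0; n < 32; n++) {
        if (xor_per_rtl(0, n) != binv_model(0, n))
            return (int)n;
    }
    return -1;
}
```

In this model the two agree for every count below 31 and differ only at 31,
where the RTL form yields 0xffffffff80000000 and the binv form yields
0x80000000 for a zero input.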
I have some notes which indicate a match.pd pattern like this helps codegen
marginally:
(simplify
 (bit_and (convert? (lshift integer_onep@1 SSA_NAME@0))
          (convert? (lshift integer_onep SSA_NAME@0)))
 (lshift (convert @1) (convert @0)))
Then we'd get something like this from combine:
(set (reg:DI 160)
    (ior:DI (and:DI (rotate:DI (const_int -2 [0xfffffffffffffffe])
                (subreg:QI (reg/v:DI 143 [ lo_bit ]) 0))
            (reg:DI 151 [ *data_10(D) ]))
        (sign_extend:DI (ashift:SI (const_int 1 [0x1])
                (subreg:QI (reg/v:DI 143 [ lo_bit ]) 0)))))
I think that is just bset+sext. The key is to realize that the AND clears a
bit which the outer IOR then sets unconditionally, because the rotate and the
left shift use the same shift count.
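The redundancy of the AND can be checked with a small host-side model of that
RTL. This is a sketch, not a statement about what combine or the backend
actually does: the helper names are invented, the shift count is assumed to be
in 0..31, and x is assumed to be a zero-extended byte, as the lbu in the
current code would produce.

```c
#include <stdint.h>

/* Models (rotate:DI v n): rotate left within 64 bits. */
uint64_t rotl64(uint64_t v, unsigned int n) {
    n &= 63;
    return n ? (v << n) | (v >> (64 - n)) : v;
}

/* Models (sign_extend:DI (ashift:SI (const_int 1) n)); assumes n in 0..31. */
uint64_t sext_one_shifted(unsigned int n) {
    uint32_t si = (uint32_t)1 << n;
    return (uint64_t)(int64_t)(int32_t)si;
}

/* The IOR/AND/rotate expression from the combine dump. */
uint64_t combine_result(uint64_t x, unsigned int n) {
    return (rotl64(~(uint64_t)1, n) & x) | sext_one_shifted(n);
}

/* Set bit n (like bset), then sign-extend the low 32 bits (like sext.w). */
uint64_t bset_then_sext(uint64_t x, unsigned int n) {
    uint64_t with_bit = x | ((uint64_t)1 << n);
    return (uint64_t)(int64_t)(int32_t)(uint32_t)with_bit;
}

/* Counts inputs where dropping the AND, or using bset followed by a sign
   extension, changes the result. Only byte values and counts 0..31 are
   tried, matching the assumptions stated above. */
int count_and_ior_mismatches(void) {
    int mismatches = 0;
    for (unsigned int n = 0; n < 32; n++) {
        for (uint64_t x = 0; x < 256; x++) {
            uint64_t r = combine_result(x, n);
            mismatches += (r != (x | sext_one_shifted(n)));
            mismatches += (r != bset_then_sext(x, n));
        }
    }
    return mismatches;
}
```

A return value of 0 from count_and_ior_mismatches() means that, within those
assumptions, the AND contributes nothing and the whole expression matches a
bit set followed by a 32-bit sign extension.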