On 10/27/23 08:22, Roger Sayle wrote:

This patch improves the code generated by the ARC back-end for CPUs
without a barrel shifter but with -mswap.  The -mswap option provides
a SWAP instruction that implements SImode rotations by 16, but also
logical shift instructions (left and right) by 16 bits.  Clearly these
are also useful building blocks for implementing shifts by 17, 18, etc.
which would otherwise require a loop.

As a representative example:
int shl20 (int x) { return x << 20; }

GCC with -O2 -mcpu=em -mswap would previously generate:

shl20:  mov     lp_count,10
         lp      2f
         add     r0,r0,r0
         add     r0,r0,r0
2:      # end single insn loop
         j_s     [blink]

With this patch we now generate:

shl20:  mov_s   r2,0    ;3
         lsl16   r0,r0
         add3    r0,r2,r0
         j_s.d   [blink]
         asl_s r0,r0

Although both are four instructions (excluding the j_s),
the original takes ~22 cycles and the replacement ~4 cycles.
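
For readers following along, here is a rough C model of the two sequences
above. It is only a sketch: the function names shl20_loop and shl20_synth
are made up for illustration, and the per-instruction comments reflect my
reading of the mnemonics (add3 a,b,c taken as a = b + (c << 3)), not
anything stated in the patch itself.

```c
#include <stdint.h>

/* Model of the old sequence: lp_count = 10, with two
   "add r0,r0,r0" (i.e. doublings) per loop iteration. */
uint32_t shl20_loop(uint32_t x)
{
    for (int i = 0; i < 10; i++) {
        x += x;             /* add r0,r0,r0 */
        x += x;             /* add r0,r0,r0 */
    }
    return x;
}

/* Model of the new sequence: 20 = 16 + 3 + 1. */
uint32_t shl20_synth(uint32_t x)
{
    uint32_t r2 = 0;        /* mov_s r2,0 */
    x <<= 16;               /* lsl16 r0,r0 */
    x = r2 + (x << 3);      /* add3  r0,r2,r0 (assumed: r2 + (r0 << 3)) */
    x <<= 1;                /* asl_s r0,r0, in the j_s.d delay slot */
    return x;
}
```

Both functions should agree with x << 20 on unsigned 32-bit values, e.g.
shl20_synth(1) yields 0x100000.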


Tested with a cross-compiler to arc-linux hosted on x86_64,
with no new (compile-only) regressions from make -k check.
Ok for mainline if this passes Claudiu's nightly testing?

Not a review, just a comment.

The H8 port has a ton of shift synthesis code. If you're looking for inspiration to improve this stuff further for ARC, it might be worth a looksie.


Jeff
