https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110717

--- Comment #17 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sa...@gcc.gnu.org>:

https://gcc.gnu.org/g:ff8d0ce17fb585a29a83349acbc67b2dd3556629

commit r14-6495-gff8d0ce17fb585a29a83349acbc67b2dd3556629
Author: Roger Sayle <ro...@nextmovesoftware.com>
Date:   Wed Dec 13 13:36:44 2023 +0000

    ARC: Add *extvsi_n_0 define_insn_and_split for PR 110717.

    This patch improves the code generated for bitfield sign extensions on
    ARC CPUs without a barrel shifter.

    Compiling the following test case:

    int foo(int x) { return (x<<27)>>27; }

    with -O2 -mcpu=em, generates two loops:

    foo:    mov     lp_count,27
            lp      2f
            add     r0,r0,r0
            nop
    2:      # end single insn loop
            mov     lp_count,27
            lp      2f
            asr     r0,r0
            nop
    2:      # end single insn loop
            j_s     [blink]

    and the closely related test case:

    struct S { int a : 5; };
    int bar (struct S *p) { return p->a; }

    generates the slightly better:

    bar:    ldb_s   r0,[r0]
            mov_s   r2,0    ;3
            add3    r0,r2,r0
            sexb_s  r0,r0
            asr_s   r0,r0
            asr_s   r0,r0
            j_s.d   [blink]
            asr_s   r0,r0

    which uses 6 instructions to perform this particular sign extension.
    It turns out that sign extensions can always be implemented using at
    most three instructions on ARC (without a barrel shifter) using the
    idiom ((x&mask)^msb)-msb [as described in section "2-5 Sign Extension"
    of Henry Warren's book "Hacker's Delight"].  Using this, the sign
    extensions above on ARC's EM both become:

            bmsk_s  r0,r0,4
            xor     r0,r0,16
            sub     r0,r0,16

    which takes about 3 cycles, compared to the ~112 cycles for the loops
    in foo.
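
    The idiom above can be sketched in portable C as follows.  This is
    only an illustration of the arithmetic, not the code in arc.md; the
    helper name sext_low_bits and its bits parameter are hypothetical,
    and the sketch assumes a field width of 1 to 31 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the patch): sign-extend the low
   `bits` bits of x using the ((x & mask) ^ msb) - msb idiom.
   Assumes 1 <= bits <= 31 so every intermediate fits in int32_t. */
static inline int32_t sext_low_bits(uint32_t x, unsigned bits)
{
    assert(bits >= 1 && bits <= 31);

    uint32_t msb  = UINT32_C(1) << (bits - 1);  /* sign bit of the field */
    uint32_t mask = (msb << 1) - 1;             /* low `bits` bits set   */

    /* Step 1: keep only the field.
       Step 2: flip its sign bit.
       Step 3: subtract the sign bit's weight, which yields a negative
               result exactly when the original sign bit was set. */
    return (int32_t)((x & mask) ^ msb) - (int32_t)msb;
}
```

    For the 5-bit field in foo and bar, bits is 5, so mask is 31 and msb
    is 16, matching the bmsk_s/xor/sub sequence shown above.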

    2023-12-13  Roger Sayle  <ro...@nextmovesoftware.com>
                Jeff Law  <j...@ventanamicro.com>

    gcc/ChangeLog
            * config/arc/arc.md (*extvsi_n_0): New define_insn_and_split to
            implement SImode sign extract using an AND, XOR and MINUS sequence.

    gcc/testsuite/ChangeLog
            * gcc.target/arc/extvsi-1.c: New test case.
            * gcc.target/arc/extvsi-2.c: Likewise.
