On 5/7/24 11:17 PM, Christoph Müllner wrote:
The current implementation of riscv_block_move_straight() emits a couple
of loads/stores with maximum width (e.g. 8-byte for RV64).
The remainder is handed over to move_by_pieces().
The by-pieces framework utilizes target hooks to decide about the emitted
instructions (e.g. unaligned accesses or overlapping accesses).

Since the current implementation always requests fewer than XLEN bytes
to be handled by the by-pieces infrastructure, overlapping memory
accesses can never be emitted (the by-pieces code does not know of any
previous instructions that were emitted by the backend).

This patch changes the implementation of riscv_block_move_straight()
so that it uses the by-pieces framework if the remaining data
is less than 2*XLEN bytes, which is sufficient to enable overlapping
memory accesses (if the requirements for them are met).
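
To illustrate what an overlapping access means here, the following is a
minimal sketch in plain C. It is not the GCC implementation:
copy_overlapping() is a hypothetical helper, and XLEN is assumed to be
8 bytes as on RV64. A remainder of n bytes with 8 <= n <= 16 is copied
with two 8-byte loads and two 8-byte stores, where the second pair starts
at offset n - 8 and therefore overlaps the first instead of falling back
to narrower accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration only (not part of GCC).
   Assumes XLEN == 8 bytes (RV64) and that unaligned 8-byte accesses
   are acceptable; memcpy() is used so the C code itself stays
   well-defined for any alignment.  */
static void
copy_overlapping (unsigned char *dst, const unsigned char *src, size_t n)
{
  uint64_t head, tail;

  /* Only remainders between XLEN and 2*XLEN bytes are handled.  */
  assert (n >= 8 && n <= 16);

  /* Two full-width loads: bytes [0, 8) and bytes [n - 8, n).  */
  memcpy (&head, src, 8);
  memcpy (&tail, src + n - 8, 8);

  /* Two full-width stores; for n < 16 they overlap by 16 - n bytes,
     and the overlapping bytes are written twice with the same value.  */
  memcpy (dst, &head, 8);
  memcpy (dst + n - 8, &tail, 8);
}
```

With n = 15, for example, this uses two loads and two stores, whereas a
non-overlapping split into 8 + 4 + 2 + 1 bytes needs four of each.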

The changes in the expansion can be seen in the adjustments of the
cpymem-NN-ooo test cases. The changes in the cpymem-NN tests are
caused by the different instruction ordering of the code emitted
by the by-pieces infrastructure, which emits alternating load/store
sequences.

gcc/ChangeLog:

        * config/riscv/riscv-string.cc (riscv_block_move_straight):
        Hand over up to 2xXLEN bytes to move_by_pieces().

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/cpymem-32-ooo.c: Adjustments for overlapping
        access.
        * gcc.target/riscv/cpymem-32.c: Adjustments for code emitted by
        by-pieces.
        * gcc.target/riscv/cpymem-64-ooo.c: Adjustments for overlapping
        access.
        * gcc.target/riscv/cpymem-64.c: Adjustments for code emitted by
        by-pieces.
OK once any prereqs are in.

jeff
