[Bug rtl-optimization/106518] Exchange/swap aware register allocation (generate xchg in reload)

2023-04-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106518

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Roger Sayle:

https://gcc.gnu.org/g:1f0bfbb26e532cef7347a91439008114fd88173a

commit r14-245-g1f0bfbb26e532cef7347a91439008114fd88173a
Author: Roger Sayle 
Date:   Wed Apr 26 09:10:06 2023 +0100

[xstormy16] Add support for byte and word swapping instructions.

This patch adds support for xstormy16's swpb (swap bytes) and swpw (swap
words) instructions.  The most obvious application of these is to implement
the __builtin_bswap16 and __builtin_bswap32 intrinsics.

Currently, __builtin_bswap16 is implemented as:
foo:    mov r7,r2
        shl r7,#8
        shr r2,#8
        or r2,r7
        ret

but with this patch becomes:
foo:    swpb r2
        ret

Likewise, __builtin_bswap32 now becomes:
foo:    swpb r2 | swpb r3 | swpw r2,r3
        ret
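
For reference, the C-level semantics behind these listings can be sketched as
follows.  This is only an illustration and not the contents of the new test
cases: the function names are hypothetical, and the only names taken from the
commit message are the __builtin_bswap16 and __builtin_bswap32 builtins.  The
32-bit version mirrors the swpb/swpb/swpw sequence by byte-swapping each 16-bit
half and then exchanging the halves.

```c
#include <assert.h>
#include <stdint.h>

/* Shift-and-or form of a 16-bit byte swap, mirroring the old
   mov/shl/shr/or sequence.  The cast to unsigned avoids shifting
   into the sign bit where int is only 16 bits wide.  */
uint16_t bswap16_shifts(uint16_t x)
{
    return (uint16_t)(((unsigned)x << 8) | ((unsigned)x >> 8));
}

/* 32-bit byte swap built from two 16-bit byte swaps (swpb on each
   half) followed by an exchange of the two halves (swpw).  */
uint32_t bswap32_by_halves(uint32_t x)
{
    uint16_t lo = bswap16_shifts((uint16_t)(x & 0xFFFFu));
    uint16_t hi = bswap16_shifts((uint16_t)(x >> 16));
    return ((uint32_t)lo << 16) | hi;
}

/* Hypothetical self-check: both sketches agree with the builtins.  */
void check_bswap_sketch(void)
{
    assert(bswap16_shifts(0x1234u) == __builtin_bswap16(0x1234u));
    assert(bswap32_by_halves(0x12345678u) == __builtin_bswap32(0x12345678u));
}
```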

Finally, the swpw instruction on its own can be used to exchange
two word mode registers without a temporary, so a new pattern and
peephole2 have been added to catch this.  As described in the
PR rtl-optimization/106518, register allocation can (in theory)
be more efficient on targets that provide a swap/exchange instruction.
The slightly unusual swap naming matches that used in i386.md.
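
A source-level situation where such an exchange can arise is a call that passes
its own arguments on in reverse order, so the two argument registers have to
trade places.  The sketch below is an assumption for illustration only: the
function names are hypothetical, it is not one of the new test cases, and
whether GCC actually emits swpw here depends on the register allocator and on
the peephole2 matching.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callee, kept out of line so the caller must really
   place its arguments in the argument registers.  */
__attribute__((noinline)) uint16_t sub16(uint16_t a, uint16_t b)
{
    return (uint16_t)(a - b);
}

/* The caller receives (a, b) but must pass (b, a): the two word-mode
   argument registers need to be exchanged before the call, which
   would otherwise take three moves through a temporary register.  */
uint16_t rsub16(uint16_t a, uint16_t b)
{
    return sub16(b, a);
}
```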

2023-04-26  Roger Sayle

gcc/ChangeLog
* config/stormy16/stormy16.md (bswaphi2): New define_insn.
(bswapsi2): New define_insn.
(swaphi): New define_insn to exchange two registers (swpw).
(define_peephole2): Recognize exchange of registers as swaphi.

gcc/testsuite/ChangeLog
* gcc.target/xstormy16/bswap16.c: New test case.
* gcc.target/xstormy16/bswap32.c: Likewise.
* gcc.target/xstormy16/swpb.c: Likewise.
* gcc.target/xstormy16/swpw-1.c: Likewise.
* gcc.target/xstormy16/swpw-2.c: Likewise.

[Bug rtl-optimization/106518] Exchange/swap aware register allocation (generate xchg in reload)

2022-08-04 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106518

Jan Hubicka changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org

--- Comment #2 from Jan Hubicka  ---
We have xchg patterns in i386.md and a corresponding peephole. I experimented
with this a long time ago, and it gave no performance benefit
because xchg was not well optimized in CPUs at that time.

With regstack the main problem is that RTL after reg-stack is inconsistent
since we have no way to explicitly express push/pop operations that renumber
the registers.  Some years ago I made a patch for that:
https://gcc.gnu.org/pipermail/gcc-patches/1999-November/021921.html

Even if you make the representation correct, register allocation for a
stack-based CPU is quite different from normal register allocation.

These days I would rather see x87 silently die.

[Bug rtl-optimization/106518] Exchange/swap aware register allocation (generate xchg in reload)

2022-08-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106518

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization, ra

--- Comment #1 from Richard Biener  ---
one could try to recog

 (parallel
   (set (reg A) (reg B))
   (set (reg B) (reg A)))

?  But yes, having a standard name for this would be nice.
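
The reason a parallel is needed here is that, in RTL, all inputs of a parallel
are read before any of its sets take effect.  The C sketch below models that
difference; the struct and function names are hypothetical and only illustrate
the semantics, they are not GCC code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of two word-mode registers A and B.  */
struct regs {
    uint16_t a;
    uint16_t b;
};

/* Parallel semantics: both sources are read first, then both
   destinations are written, so the values are exchanged.  */
void parallel_sets(struct regs *r)
{
    uint16_t old_a = r->a;
    uint16_t old_b = r->b;
    r->a = old_b;
    r->b = old_a;
}

/* The same two sets performed one after the other: the second set
   reads the already overwritten A, so both end up holding old B.  */
void sequential_sets(struct regs *r)
{
    r->a = r->b;
    r->b = r->a;
}
```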