Am 31.07.25 um 12:30 schrieb Denis Chertykov:
ср, 30 июл. 2025 г. в 14:59, Georg-Johann Lay <a...@gjlay.de>:
Insn combine may come up with superfluous reg-reg moves, where the
combine people say that these are no problem since reg-alloc is supposed
to optimize them. The issue is that the lower-subreg pass sitting
between combine and reg-alloc may split such moves, coming up with a zoo
of subregs which are only handled poorly by the register allocator.
This patch adds a new avr mini-pass that handles such cases.
As an example, take
int f_ffssi (long x)
{
return __builtin_ffsl (x);
}
where the two functions have the same interface, i.e. there are no extra
moves required for the argument or for the return value. However,
$ avr-gcc -S -Os -dp -mno-fuse-move ...
f_ffssi:
mov r20,r22 ; 29 [c=4 l=1] movqi_insn/0
mov r21,r23 ; 30 [c=4 l=1] movqi_insn/0
mov r22,r24 ; 31 [c=4 l=1] movqi_insn/0
mov r23,r25 ; 32 [c=4 l=1] movqi_insn/0
mov r25,r23 ; 33 [c=4 l=4] *movsi/0
mov r24,r22
mov r23,r21
mov r22,r20
rcall __ffssi2 ; 34 [c=16 l=1] *ffssihi2.libgcc
ret ; 37 [c=0 l=1] return
where all the moves add up to a no-op. The -mno-fuse-move option
stops any attempts by the avr backend to clean up that mess.
gcc/
* config/avr/avr-passes.def (avr_pass_2moves): Insert after combine.
* config/avr/avr-passes.cc (make_avr_pass_2moves): New function.
(pass_data avr_pass_data_2moves): New static variable.
(avr_pass_2moves): New rtl_opt_pass.
* config/avr/avr-protos.h (make_avr_pass_2moves): New proto.
Ok for trunk?
Ok
Please apply
Denis
Applied with the attached addendum which adds an own
option for better control.
Johann
diff --git a/gcc/common/config/avr/avr-common.cc b/gcc/common/config/avr/avr-common.cc
index 203a9652818..d8b982c4fa6 100644
--- a/gcc/common/config/avr/avr-common.cc
+++ b/gcc/common/config/avr/avr-common.cc
@@ -38,6 +38,7 @@ static const struct default_options avr_option_optimization_table[] =
{ OPT_LEVELS_1_PLUS, OPT_mmain_is_OS_task, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_mfuse_add_, NULL, 1 },
{ OPT_LEVELS_2_PLUS, OPT_mfuse_add_, NULL, 2 },
+ { OPT_LEVELS_1_PLUS, OPT_mfuse_move2, NULL, 1 },
{ OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_mfuse_move_, NULL, 3 },
{ OPT_LEVELS_2_PLUS, OPT_mfuse_move_, NULL, 23 },
{ OPT_LEVELS_2_PLUS, OPT_msplit_bit_shift, NULL, 1 },
diff --git a/gcc/config/avr/avr-passes.cc b/gcc/config/avr/avr-passes.cc
index 1bcf211358e..69df6d263f6 100644
--- a/gcc/config/avr/avr-passes.cc
+++ b/gcc/config/avr/avr-passes.cc
@@ -4869,7 +4869,7 @@ public:
unsigned int execute (function *func) final override
{
- if (optimize > 0 && avropt_fuse_move > 0)
+ if (optimize && avropt_fuse_move2)
{
bool changed = false;
basic_block bb;
diff --git a/gcc/config/avr/avr.opt b/gcc/config/avr/avr.opt
index 988311927bd..7f6f18c3f23 100644
--- a/gcc/config/avr/avr.opt
+++ b/gcc/config/avr/avr.opt
@@ -164,6 +164,10 @@ mfuse-move=
Target Joined RejectNegative UInteger Var(avropt_fuse_move) Init(0) Optimization IntegerRange(0, 23)
-mfuse-move=<0,23> Optimization. Run a post-reload pass that tweaks move instructions.
+mfuse-move2
+Target Var(avropt_fuse_move2) Init(0) Optimization
+Optimization. Fuse some move insns after insn combine.
+
mabsdata
Target Mask(ABSDATA)
Assume that all data in static storage can be accessed by LDS / STS instructions. This option is only useful for reduced Tiny devices like ATtiny40.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 09802303254..ce139213ee7 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -915,7 +915,7 @@ Objective-C and Objective-C++ Dialects}.
@emph{AVR Options} (@ref{AVR Options})
@gccoptlist{-mmcu=@var{mcu} -mabsdata -maccumulate-args -mcvt
-mbranch-cost=@var{cost} -mfuse-add=@var{level} -mfuse-move=@var{level}
--mcall-prologues -mgas-isr-prologues -mint8 -mflmap
+-mfuse-move2 -mcall-prologues -mgas-isr-prologues -mint8 -mflmap
-mdouble=@var{bits} -mlong-double=@var{bits} -mno-call-main
-mn_flash=@var{size} -mfract-convert-truncate -mno-interrupts
-mmain-is-OS_task -mrelax -mrmw -mstrict-X -mtiny-stack
@@ -25110,6 +25110,10 @@ Valid values for @var{level} are in the range @code{0} @dots{} @code{23}
which is a 3:2:2:2 mixed radix value. Each digit controls some
aspect of the optimization.
+@opindex mfuse-move2
+@item -mfuse-move2
+Run a post combine optimization pass that tries to fuse move instructions.
+
@opindex mstrict-X
@item -mstrict-X
Use address register @code{X} in a way proposed by the hardware. This means