https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123506
--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
And the reason why the addition + move is not optimized into lea is that up to
gcse2, i.e. even after late combine, there is a *concatditi3_3 insn in between
them, which breaks such optimizations (of course, it helps others).
(insn 29 28 30 4 (parallel [
            (set (reg/f:DI 5 di [orig:116 _2 ] [116])
                (plus:DI (reg/f:DI 5 di [orig:126 buffer ] [126])
                    (const_int 2 [0x2])))
            (clobber (reg:CC 17 flags))
        ]) "pr123506.c":7:29 289 {*adddi_1}
     (nil))
(insn 30 29 31 4 (set (reg:TI 4 si [orig:100 D.3003 ] [100])
        (ior:TI (ashift:TI (zero_extend:TI (reg/f:DI 5 di [orig:116 _2 ] [116]))
                (const_int 64 [0x40]))
            (zero_extend:TI (reg:DI 4 si [115])))) "pr123506.c":7:33 946 {*concatditi3_3}
     (nil))
(insn 31 30 32 4 (set (reg:SI 0 ax [orig:101 <retval> ] [101])
        (reg:SI 4 si [orig:100 D.3003 ] [100])) "pr123506.c":7:33 discrim 1 100 {*movsi_internal}
     (nil))
(insn 32 31 38 4 (set (reg:DI 1 dx [orig:102 <retval>+8 ] [102])
        (reg:DI 5 di [ D.3003+8 ])) "pr123506.c":7:33 discrim 1 99 {*movdi_internal}
     (nil))
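
For reference, the arithmetic in the dump above can be modelled in plain C. This is only a rough sketch, not the testcase from the PR: the names u128, concat_di_ti and high_half_after_add are hypothetical, and unsigned __int128 is a GCC/Clang extension assumed to be available on the target. It shows that the high half read back in insn 32 is just buffer + 2, which is why the add + move pair would be a lea candidate if the concat were not sitting in between.

```c
#include <stdint.h>

/* Hypothetical 128-bit type standing in for TImode (GCC/Clang extension). */
typedef unsigned __int128 u128;

/* Models the RTL of insn 30:
   (ior:TI (ashift:TI (zero_extend:TI hi) (const_int 64))
           (zero_extend:TI lo))  */
static u128 concat_di_ti(uint64_t hi, uint64_t lo)
{
    return ((u128)hi << 64) | (u128)lo;
}

/* Models insns 29, 30 and 32 together: add 2 to the pointer-sized value,
   place it in the high half of the TImode pair, then read that half back. */
static uint64_t high_half_after_add(uint64_t buffer, uint64_t lo)
{
    uint64_t hi = buffer + 2;            /* insn 29: *adddi_1 */
    u128 pair = concat_di_ti(hi, lo);    /* insn 30: *concatditi3_3 */
    return (uint64_t)(pair >> 64);       /* insn 32: move of D.3003+8 */
}
```

In this model high_half_after_add(buffer, lo) equals buffer + 2 for any lo, so the concat contributes nothing to the value that ends up in dx.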