https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122219
--- Comment #18 from Richard Biener <rguenth at gcc dot gnu.org> ---
I'm not sure what you are expecting here. LIM1 transforms this to the
following (which is what prevails until .optimized):
  <bb 5> [local count: 105119324]:
  v__lsm.10_20 = _16(D);

  <bb 3> [local count: 955630224]:
  # i_28 = PHI <i_12(6), 0(5)>
  # f1$t_31 = PHI <_2(6), 2(5)>
  # f0$t_32 = PHI <_1(6), 1(5)>
  _1 = f0$t_32 * 2;
  _2 = f1$t_31 * 2;
  MEM <char[4]> [(union s2 *)&t + 4B] = {};
  t.t[0].t = _1;
  _8 = t.tt;
  t ={v} {CLOBBER(eos)};
  c.tt = _8;
  c.t[1].t = _2;
  _10 = c.tt;
  c ={v} {CLOBBER(eos)};
  v__lsm.10_5 = _10;
  i_12 = i_28 + 1;
  i.0_3 = (unsigned int) i_12;
  if (i.0_3 < n_7(D))
    goto <bb 6>; [89.00%]
  else
    goto <bb 7>; [11.00%]

  <bb 6> [local count: 850510900]:
  goto <bb 3>; [100.00%]

  <bb 7> [local count: 105119324]:
  # v__lsm.10_24 = PHI <v__lsm.10_5(3)>
  *v_9(D) = v__lsm.10_24;

AFAICS there's no store-motion opportunity left (not to mention that LIM
cannot deal with the mismatched load/store here). Instead this is again
something for a pass like SRA. FRE _might_ be able to handle
  MEM <char[4]> [(union s2 *)&t + 4B] = {};
  t.t[0].t = _1;
  _8 = t.tt;
by detecting conversion/shifting when a zeroing partial def complements
a variable def. But I have not bothered implementing this (though it
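should not be very difficult). In this case, if I get the "bits" right,
on LE it should be _8 = (long)(unsigned int)_1, i.e. a zero-extension,
since bytes 0..3 are the low half there; the shifted form
_8 = (long)(unsigned int)_1 << 32 is what BE would give.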
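
To check the bits on a given host, here is a minimal C sketch of the three
statements at the byte level. It assumes t.t[0].t is a 4-byte int at offset 0
and t.tt is an 8-byte integer at offset 0; that layout is inferred from the
dump, not taken from the PR's testcase. All function names are made up for
illustration, and the sketch uses unsigned 64-bit values where the dump uses a
signed long, to sidestep implementation-defined conversions. Which of the two
candidate forms (zero-extension or shift by 32) matches depends on the host's
byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte-level model of
     MEM <char[4]> [(union s2 *)&t + 4B] = {};
     t.t[0].t = _1;
     _8 = t.tt;
   Layout is an assumption inferred from the dump: a 4-byte int at
   offset 0 overlaid on an 8-byte integer at offset 0.  */
uint64_t load_tt_after_partial_defs(int32_t v1)
{
  unsigned char t[8];
  uint64_t v8;

  memset(t + 4, 0, 4);          /* zeroing partial def of bytes 4..7 */
  memcpy(t, &v1, sizeof v1);    /* variable def of bytes 0..3 */
  memcpy(&v8, t, sizeof v8);    /* full-width load */
  return v8;
}

/* Nonzero when the host stores the least significant byte first.  */
int host_is_little_endian(void)
{
  const uint32_t probe = 1;
  unsigned char first;

  memcpy(&first, &probe, 1);
  return first == 1;
}

/* The single expression that could replace the load on this host:
   a zero-extension on LE, the same value shifted left by 32 on BE.  */
uint64_t folded_form(int32_t v1)
{
  uint64_t zext = (uint64_t)(uint32_t)v1;

  return host_is_little_endian() ? zext : zext << 32;
}

/* Hypothetical self-check over a few sample values.  */
void check_folded_form(void)
{
  static const int32_t samples[] = { 0, 1, 2, -1, INT32_MAX, INT32_MIN };

  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert(load_tt_after_partial_defs(samples[i]) == folded_form(samples[i]));
}
```

Calling check_folded_form() from a small main is enough to confirm the
equivalence on the host at hand. The memcpy-based model is used so the sketch
does not depend on union type-punning rules.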