https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110807

Alexandre Oliva <aoliva at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aoliva at gcc dot gnu.org

--- Comment #11 from Alexandre Oliva <aoliva at gcc dot gnu.org> ---
> The new test fails with -m32

I've looked a bit into why.  The memmove is optimized out by vrp (or, if that's
disabled, by dom) on lp64, because it's guarded by two conditions: _10 >
sizeof(long), and !(_14 > 1), where _10 is a signed long (ptrdiff_t) computed
as the difference between the _M_p of _M_finish and _M_start in the preexisting
vector, and _14 = (unsigned long)(_10*8 + _8), where _8 is the vector's finish
offset.  In order for the _14 condition to hold, _14 must be 0ul..1ul.

Since _10 is long, _8 promotes to long in lp64, the addition is performed as a
signed long, and then converted to unsigned long.  _8 is loaded from memory as
an unsigned int, and nothing is known about it, so its promoted operand is
0l..0xffffffffl.  In order for _14 to be <= 1ul, _10 * 8 must be in the range
-0xffffffffl..1l, and therefore _10 must be in -0x1fffffffl..0l, which enables
folding of the _10 condition as the entire range is <= sizeof(long).

In the ilp32 case, _10 is int, so _10*8 promotes to unsigned int for the
addition, whose result is then NOPped to unsigned long.  _8 is also loaded from
memory as unsigned int, but because unsigned addition wraps around and _8
covers the full range, nothing can be inferred about the range of _10*8, and
thus _10's range is only limited by overflow-avoidance in the signed
multiplication: -0x1fffffffl..0x1ffffffl.  Therefore, the _10 compare cannot be
folded, and the memmove call remains.
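
To make the difference concrete, here is a small standalone model of the two
computations.  It is only a sketch: sum_lp64 and sum_ilp32 are names made up
for illustration, the fixed-width types stand in for long and int on each
target, and it assumes unsigned int is 32 bits wide.  It does not reproduce
the actual GIMPLE.

```cpp
#include <cstdint>

// LP64 model: _10 is a 64-bit signed value, _8 (unsigned int) is promoted
// to signed 64-bit, the addition is signed, and only the result is
// converted to unsigned.
constexpr std::uint64_t sum_lp64(std::int64_t d, std::uint32_t off) {
  return static_cast<std::uint64_t>(d * 8 + static_cast<std::int64_t>(off));
}

// ILP32 model: _10 is a 32-bit signed value; _10*8 is converted to
// unsigned for the addition, which wraps modulo 2^32.
constexpr std::uint32_t sum_ilp32(std::int32_t d, std::uint32_t off) {
  return static_cast<std::uint32_t>(d * 8) + off;
}

// With d = 5 (which is > sizeof(long) on a 32-bit target), the wrapping
// sum can still land in 0..1, so "_14 <= 1" says nothing useful about d.
static_assert(sum_ilp32(5, 0xffffffd9u) == 1u, "wraps around to 1");

// The same inputs in the LP64 model give a sum far above 1, so a positive
// d is incompatible with "_14 <= 1" there.
static_assert(sum_lp64(5, 0xffffffd9u) == 0x100000001ull, "no wraparound");
```

In the LP64 model, any d >= 1 yields a sum of at least 8, which is why the
_10 > sizeof(long) test can be folded there and not in the wrapping case.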

I think the missed optimization and the overall problem stem from the fact
that optimizers don't know the actual range of _M_offset.  Ensuring it's
visibly normalized at uses in which out-of-range _M_offsets might sneak in
might be enough to avoid the warning and enable further optimizations.
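
As a rough illustration of what "visibly normalized" could look like, here is
a sketch that exposes the offset's range to the optimizers at a use.  The
bit_pos struct and the word_bits, normalized_offset and bit_distance names are
hypothetical stand-ins, not the libstdc++ declarations, and this is not
claimed to be the change that should go into the library.  It relies on
__builtin_unreachable, which is a GCC/Clang builtin.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical stand-in for a bit iterator; not the libstdc++ type.
struct bit_pos {
  unsigned long* p;     // word pointer, playing the role of _M_p
  unsigned int offset;  // bit index within *p, playing the role of _M_offset
};

constexpr unsigned int word_bits =
    static_cast<unsigned int>(CHAR_BIT * sizeof(unsigned long));

// Return the offset while promising the optimizers that it is below
// word_bits.  Calling this with an out-of-range offset is undefined.
inline unsigned int normalized_offset(const bit_pos& it) {
  if (it.offset >= word_bits)
    __builtin_unreachable();
  return it.offset;
}

// Number of bits between two positions, computed from normalized offsets
// so that range propagation sees both offsets as 0..word_bits-1.
inline std::ptrdiff_t bit_distance(const bit_pos& first, const bit_pos& last) {
  return (last.p - first.p) * static_cast<std::ptrdiff_t>(word_bits)
         + static_cast<std::ptrdiff_t>(normalized_offset(last))
         - static_cast<std::ptrdiff_t>(normalized_offset(first));
}
```

Whether a hint of this kind is enough to fold the _10 compare with -m32 would
need to be checked against the actual dumps.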
