On Wed, Mar 6, 2019 at 9:47 AM Stanislav Malyshev <smalys...@gmail.com>
wrote:

> Hi!
>
> > This is intended x86 specific optimization. The behavior of SHR is
> > documented.
> >
> > See https://c9x.me/x86/html/file_module_x86_id_285.html
> >
> > "The count is masked to 5 bits, which limits the count range to 0 to 31"
> > on 32-bit systems.
>
> I see. But I wonder whether the compiler will always compile it as SHR,
> is it guarnateed? Looking at optimization behavior for clang, for me
> ZEND_BIT_TEST( bits, 65 ) - where bits is an array of 0xFF - produces 1
> under -O0 but 0 under -O1 (I guess it figures out it's a constant and
> optimizes the shift away). So I wonder whether this construct isn't
> dangerous...
>

This is the general problem with undefined behavior: Just because something
is UB, doesn't necessarily mean that it will *always* be micompiled. If you
write ZEND_BIT_TEST(bits, 65), then the shift amount is known to be too
large and the computation is folded to poison. In ZEND_BIT_TEST(bits, bit)
where the shift amount if not known, it is quite likely that no
optimizations will be able to apply and this will indeed be compiled to SHR
with the desired making behavior. But unlikely doesn't mean it can't
happen. Even if the input is dynamic, known bits can prove that one of the
top bits is set and fold everything to undef. Here is a quick example:
https://godbolt.org/z/C3rSSW

That's why we really shouldn't be doing such optimizations.

Regards,
Nikita

Reply via email to