https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61559

--- Comment #11 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Jakub Jelinek from comment #9)
> Aren't these optimizations actually a pessimization for -mmovbe if the inner
> bswap is on a read from memory?  Assuming the load and bswap instruction is
> cheap, then e.g. loading two values with bswap on them and doing say xor on
> them afterwards might be cheaper than load the two values, xor them and then
> bswap them (because for that bswap you don't have a load+bswap instruction).

    (simplify
      (bitop (bswap @0) (bswap @1))
      (bswap (bitop @0 @1)))

This one should be:

    (simplify
      (bswap (bitop (bswap @0) (bswap @1)))
      (bitop @0 @1))

This is what builtin-bswap-8.c tests, and I believe it will address Jakub's
concerns.
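
As a rough illustration (not the contents of builtin-bswap-8.c, which are
not reproduced here), the identity behind the second pattern can be sketched
in C. The function names are hypothetical and the 32-bit width and the XOR
operator are chosen only for the example; __builtin_bswap32 is the existing
GCC builtin.

```c
#include <assert.h>
#include <stdint.h>

/* Shape matched by the second pattern: an outer bswap around a bitwise
   operation whose operands are both byte-swapped.  */
static uint32_t xor_with_bswaps(uint32_t a, uint32_t b)
{
  return __builtin_bswap32(__builtin_bswap32(a) ^ __builtin_bswap32(b));
}

/* What that expression is expected to simplify to: no bswap at all.  */
static uint32_t xor_plain(uint32_t a, uint32_t b)
{
  return a ^ b;
}

/* Hypothetical helper: asserts that both forms agree for one input pair.
   Bitwise operations act on each bit independently, so permuting the bytes
   before the operation and permuting them back afterwards cancels out.  */
static void check_identity(uint32_t a, uint32_t b)
{
  assert(xor_with_bswaps(a, b) == xor_plain(a, b));
}
```

Since this form only matches when the outer bswap is already present, it
removes all three bswaps and never leaves a standalone bswap of the result,
which appears to be the case Jakub was worried about for -mmovbe.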
