[Bug middle-end/114449] bswap64 not optimized

2024-03-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114449 --- Comment #5 from Richard Biener --- Note that we do unroll the loop with -O3, but only late, and after that we do not re-do bswap recognition (which happens before loop optimization). At -O2 we don't unroll because that increases code size too much.
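
For readers without the bug report open, the kind of code under discussion can be sketched as follows. This is a hypothetical reconstruction, not the reporter's actual test case: the function names are made up for illustration, and the loop shape is only assumed to resemble the one in the report. The loop computes the same value as `__builtin_bswap64`, and the comment above explains why it may not be turned into one: bswap recognition runs before the loop has been unrolled.

```c
#include <stdint.h>

/* Hypothetical loop-based 64-bit byte swap (not the test case from the bug).
 * Each iteration moves the lowest byte of x onto the low end of r,
 * so after 8 iterations the byte order is reversed. */
uint64_t bswap64_loop(uint64_t x)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (r << 8) | (x & 0xff);
        x >>= 8;
    }
    return r;
}

/* Reference: the GCC/clang builtin, which maps directly to a byte swap. */
uint64_t bswap64_builtin(uint64_t x)
{
    return __builtin_bswap64(x);
}
```

Comparing the assembly of the two functions (for example with `gcc -O2 -S`) is one way to see whether a given compiler version recognizes the loop.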

[Bug middle-end/114449] bswap64 not optimized

2024-03-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114449
Andrew Pinski changed:
What|Removed|Added
CC||pinskia at gcc dot gnu.org
Se

[Bug middle-end/114449] bswap64 not optimized

2024-03-24 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114449
Xi Ruoyao changed:
What|Removed|Added
Keywords||missed-optimization
Ever confirmed|0

[Bug middle-end/114449] bswap64 not optimized

2024-03-24 Thread pali at kernel dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114449 --- Comment #3 from Pali Rohár --- Note that clang optimizes it with just -O2 and does not require any special pragma.

[Bug middle-end/114449] bswap64 not optimized

2024-03-24 Thread pali at kernel dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114449 --- Comment #2 from Pali Rohár --- Interesting... I was expecting that -O3, or better yet -Ofast, tells gcc to optimize the code as much as possible. I added that pragma before the for-loop in the first example, and then gcc really optimized t
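
The snippet above does not name the pragma. As an illustration only, the sketch below assumes it is GCC's loop-unroll pragma, `#pragma GCC unroll`, placed directly before the loop. The function name and loop body are hypothetical and are not taken from the bug report.

```c
#include <stdint.h>

/* Hypothetical example: byte swap written as a loop, with an explicit
 * request to fully unroll it. */
uint64_t bswap64_unrolled(uint64_t x)
{
    uint64_t r = 0;
    /* Assumption: "that pragma" is taken to be GCC's unroll pragma. */
#pragma GCC unroll 8
    for (int i = 0; i < 8; i++) {
        /* Move byte i (counting from the low end) to the mirrored position. */
        r |= ((x >> (8 * i)) & 0xff) << (56 - 8 * i);
    }
    return r;
}
```

Whether a particular GCC version then emits a single byte-swap instruction for this function would need to be confirmed in the generated assembly; the pragma only asks for the loop to be unrolled.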

[Bug middle-end/114449] bswap64 not optimized

2024-03-24 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114449
Xi Ruoyao changed:
What|Removed|Added
CC||xry111 at gcc dot gnu.org
--- Comment #1 fr