https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110551

--- Comment #3 from Moncef Mechri <moncef.mechri at gmail dot com> ---
> Please next time attach (which you can do paste in the box) or paste inline
> the testcase rather than just link to godbolt .

Noted. Apologies.

> It is an older regression though.
> ```
> #include <stdint.h>
> 
> void mulx64(uint64_t *x, uint64_t *t)
> {
>     __uint128_t r = (__uint128_t)*x * 0x9E3779B97F4A7C15ull;
>     *t = (uint64_t)r ^ (uint64_t)( r >> 64 );
> }
> ```
> 
> It is just an extra mov.
> 
> Also the mulx should have allowed the register allocator to do better but it
> was worse ...

It is true that with this new test case, all GCC versions (including GCC 10)
seem to suffer from both issues reported in the original post.

However, the original test case only exhibits suboptimal codegen with
GCC >= 11, as shown in the godbolt link shared above.
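
For anyone who wants to compare versions locally rather than on godbolt, here
is a minimal sketch built around the test case quoted above. The quoted
function is unchanged; mulx64_ref and mulx64_self_check are hypothetical
helpers added only to confirm the computed value is the same whichever
compiler is used. The flags in the comment are an assumption, not necessarily
the ones from the original report.

```
#include <assert.h>
#include <stdint.h>

/* Test case from the quoted comment, unchanged.
 * To inspect the generated code, something like
 *     gcc -O2 -mbmi2 -S test.c
 * should work (flags are an assumption; -mbmi2 allows MULX). */
void mulx64(uint64_t *x, uint64_t *t)
{
    __uint128_t r = (__uint128_t)*x * 0x9E3779B97F4A7C15ull;
    *t = (uint64_t)r ^ (uint64_t)(r >> 64);
}

/* Hypothetical reference: the same 64x64->128 multiply built from
 * 32-bit halves, so it does not depend on __uint128_t codegen. */
uint64_t mulx64_ref(uint64_t x)
{
    const uint64_t k = 0x9E3779B97F4A7C15ull;
    uint64_t x_lo = x & 0xFFFFFFFFu, x_hi = x >> 32;
    uint64_t k_lo = k & 0xFFFFFFFFu, k_hi = k >> 32;

    uint64_t ll = x_lo * k_lo;
    uint64_t lh = x_lo * k_hi;
    uint64_t hl = x_hi * k_lo;
    uint64_t hh = x_hi * k_hi;

    /* Sum of the three terms that land in bits 32..63, with carries. */
    uint64_t mid = (ll >> 32) + (lh & 0xFFFFFFFFu) + (hl & 0xFFFFFFFFu);

    uint64_t lo = (mid << 32) | (ll & 0xFFFFFFFFu);
    uint64_t hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
    return lo ^ hi;
}

/* Hypothetical helper: returns 1 if both versions agree on a few inputs. */
int mulx64_self_check(void)
{
    const uint64_t inputs[] = { 0, 1, 2, 0xFFFFFFFFull, 0xFFFFFFFFFFFFFFFFull };
    for (unsigned i = 0; i < sizeof inputs / sizeof inputs[0]; i++) {
        uint64_t x = inputs[i], t = 0;
        mulx64(&x, &t);
        if (t != mulx64_ref(x))
            return 0;
    }
    return 1;
}
```

The check only confirms the value is right; the extra mov and the register
allocation still have to be compared in the assembly output of each version.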
