https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82658
--- Comment #2 from mike.k at digitalcarbide dot com ---
I wanted to check whether this issue also shows up in the toolchains for other architectures, so I tested a bit:

GCC 7.2.0 on x86-64 (-O3):

C:
    movzx   eax, BYTE PTR [rsp-1]
    shr     al
    mov     BYTE PTR [rsp-1], al
    ret

C++:
    movzx   eax, BYTE PTR [rsp-1]
    sar     eax
    mov     BYTE PTR [rsp-1], al
    ret

While this makes no difference in performance, it _is_ generating different code, and the difference seems to reflect what Richard already found.

I am not able to reproduce any difference on MIPS64, MIPS32, ARM, ARM64, PPC, or PPC64. This is probably because those backends do not map the two forms to different instruction sequences.

On AVR, I do see it going back as far as GCC 4.6.4.
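On the x86-64 listings above: a minimal sketch of why the `shr al` and `sar eax` sequences store the same byte. It relies only on `movzx` zero-extending the loaded byte into `eax`, so the sign bit is clear before the arithmetic shift. The function names `shr_al`, `sar_eax`, and `count_mismatches` are hypothetical helpers for illustration; this models the two instruction sequences and is not the test case from this report.

```c
#include <assert.h>
#include <stdint.h>

/* Models "shr al": logical right shift performed on the 8-bit register. */
uint8_t shr_al(uint8_t v)
{
    return (uint8_t)((unsigned)v >> 1);
}

/* Models "movzx eax, ..." followed by "sar eax": the byte is zero-extended
   to 32 bits, shifted arithmetically, and only the low byte is stored. */
uint8_t sar_eax(uint8_t v)
{
    int32_t eax = (int32_t)v;   /* zero-extended, so always in 0..255 */
    eax >>= 1;                  /* value is non-negative, so this equals a logical shift */
    return (uint8_t)eax;        /* "mov BYTE PTR [...], al" keeps the low byte */
}

/* Counts the byte values for which the two models disagree. */
int count_mismatches(void)
{
    int mismatches = 0;
    for (int v = 0; v <= 255; ++v) {
        if (shr_al((uint8_t)v) != sar_eax((uint8_t)v))
            ++mismatches;
    }
    return mismatches;
}
```

Calling `count_mismatches()` from a small `main` should report no disagreements across all 256 inputs, which is consistent with the two x86-64 sequences differing in form but not in result.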