https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94703
--- Comment #11 from pskocik at gmail dot com ---
Thanks for the shot at a fix, Richard Biener.

Since I reported this, I think I should mention a related suboptimality that should probably be fixed alongside this one (if this one is getting fixed), namely that while

    int64_t zextend_int_to_int64_nospill(int *X)
    {
        union { int64_t _; } r = {0};
        return memcpy(&r._,X,sizeof(*X)),r._;
    }

(and hopefully later even

    int64_t zextend_int_to_int64_spill(int *X)
    {
        int64_t r = {0};
        return memcpy(&r,X,sizeof(*X)),r;
    }

) generates, on x86_64, the optimal

    zextend_int_to_int64_nospill:
            mov     eax, DWORD PTR [rdi]
            ret

for zero-extending promotions of sub-int types, an extra xor instruction gets generated, e.g.:

    int64_t zextend_short_to_int64_nospill_but_suboptimal(short *X)
    {
        union { int64_t _; } r = {0};
        return memcpy(&r._,X,sizeof(*X)),r._;
    }

=>

    zextend_short_to_int64_nospill_but_suboptimal:
            xor     eax, eax
            mov     ax, WORD PTR [rdi]
            ret

which was surprising to me because it doesn't happen with zero-extending memcpy-based promotion from {,u}ints to larger types ({,u}{,l}longs).

https://gcc.godbolt.org/z/ZjXaCw
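
For comparison, here is a minimal sketch (not part of the original report) that places the memcpy-based promotion next to a plain cast through unsigned short. The names zext_short_memcpy, zext_short_cast and check_zext are hypothetical. The sketch assumes a little-endian target such as x86_64 and a 16-bit short. I would expect the cast version to compile to a single movzx load, which is presumably the code one would hope to get for the memcpy version too, but that is an expectation rather than something verified here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* memcpy-based zero-extension as in the report. Copying into the lowest
   address of the zeroed int64_t only zero-extends on a little-endian
   target (assumption: x86_64 or similar). */
static int64_t zext_short_memcpy(const short *x)
{
    union { int64_t v; } r = {0};
    memcpy(&r.v, x, sizeof *x);
    return r.v;
}

/* Hypothetical reference version: zero-extend via a cast to unsigned short. */
static int64_t zext_short_cast(const short *x)
{
    return (int64_t)(unsigned short)*x;
}

/* Hypothetical helper: asserts that both versions agree for one input. */
static void check_zext(short s)
{
    assert(zext_short_memcpy(&s) == zext_short_cast(&s));
}
```

Calling check_zext with a negative value such as -1 exercises the interesting case: the upper 48 bits of the result must stay zero rather than being sign-extended.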