https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110613
Bug ID: 110613
Summary: optimization about combined store of adjacent bitfields
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: lh_mouse at 126 dot com
Target Milestone: ---

This is a piece of code taken from a WebSocket frame parser:

```
#include <stdint.h>

struct Header
  {
    // 1st byte
    uint8_t opcode : 4;
    uint8_t rsv3 : 1;
    uint8_t rsv2 : 1;
    uint8_t rsv1 : 1;
    uint8_t fin : 1;

    // 2nd byte
    uint8_t reserved_1 : 7;
    uint8_t mask : 1;

    // 3rd and 4th bytes
    uint8_t reserved_2;
    uint8_t reserved_3;

    // 5th to 8th bytes
    union {
      char mask_key[4];
      uint32_t mask_key_u32;
    };

    // 9th to 16th bytes
    uint64_t payload_len;
  };

void
set_header(Header* ph, const uint8_t* bptr)
  {
    uint8_t f = bptr[0];
    uint8_t t = bptr[1];

    ph->fin = f >> 7;
    ph->rsv1 = f >> 6;
    ph->rsv2 = f >> 5;
    ph->rsv3 = f >> 4;
    ph->opcode = f;

    ph->mask = t >> 7;
    ph->reserved_1 = t;
  }
```

The structure is designed to match the x86_64 ABI (little endian), so

```
    ph->fin = f >> 7;
    ph->rsv1 = f >> 6;
    ph->rsv2 = f >> 5;
    ph->rsv3 = f >> 4;
    ph->opcode = f;
```

should be a simple move (https://gcc.godbolt.org/z/9vTqs7axj), and

```
    ph->mask = t >> 7;
    ph->reserved_1 = t;
```

should also be a simple move (https://gcc.godbolt.org/z/KdchWvEn1). However, when these two pieces of code are put together, GCC no longer emits simple moves and instead reassembles the 16-bit value bit field by bit field (godbolt: https://gcc.godbolt.org/z/hbEaeb3MT):

```
set_header(Header*, unsigned char const*):
        movzx   edx, BYTE PTR [rsi]
        movzx   ecx, BYTE PTR [rsi+1]
        mov     eax, edx
        mov     esi, edx
        shr     al, 4
        and     esi, 15
        and     eax, 1
        sal     eax, 4
        or      eax, esi
        mov     esi, edx
        shr     sil, 5
        and     esi, 1
        sal     esi, 5
        or      eax, esi
        mov     esi, edx
        shr     dl, 7
        shr     sil, 6
        movzx   edx, dl
        and     esi, 1
        sal     edx, 7
        sal     esi, 6
        or      eax, esi
        or      eax, edx
        mov     edx, ecx
        shr     cl, 7
        and     edx, 127
        sal     ecx, 15
        sal     edx, 8
        or      eax, edx
        or      eax, ecx
        mov     WORD PTR [rdi], ax
        ret
```
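
To make the expected result concrete, here is a minimal sketch comparing the field-by-field stores with a plain two-byte copy. It assumes the bit-field layout the report relies on (x86_64, little endian, bit-fields allocated from the least significant bit upward). The names `set_header_fields`, `set_header_bytes` and `stores_agree` are hypothetical and are not part of the original report.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct Header
  {
    uint8_t opcode : 4;
    uint8_t rsv3 : 1;
    uint8_t rsv2 : 1;
    uint8_t rsv1 : 1;
    uint8_t fin : 1;
    uint8_t reserved_1 : 7;
    uint8_t mask : 1;
    uint8_t reserved_2;
    uint8_t reserved_3;
    union {
      char mask_key[4];
      uint32_t mask_key_u32;
    };
    uint64_t payload_len;
  };

// Field-by-field stores, as in the report.
void
set_header_fields(Header* ph, const uint8_t* bptr)
  {
    uint8_t f = bptr[0];
    uint8_t t = bptr[1];

    ph->fin = f >> 7;
    ph->rsv1 = f >> 6;
    ph->rsv2 = f >> 5;
    ph->rsv3 = f >> 4;
    ph->opcode = f;

    ph->mask = t >> 7;
    ph->reserved_1 = t;
  }

// Hypothetical hand-written equivalent: copy the first two bytes verbatim.
// This is only equivalent under the layout assumption stated above.
void
set_header_bytes(Header* ph, const uint8_t* bptr)
  {
    std::memcpy(ph, bptr, 2);
  }

// Exhaustively compare the first two bytes of the header
// for all 65536 possible input byte pairs.
bool
stores_agree()
  {
    for (unsigned v = 0; v < 0x10000; ++v) {
      uint8_t in[2] = { uint8_t(v & 0xFF), uint8_t(v >> 8) };
      Header a, b;
      std::memset(&a, 0, sizeof a);
      std::memset(&b, 0, sizeof b);
      set_header_fields(&a, in);
      set_header_bytes(&b, in);
      if (std::memcmp(&a, &b, 2) != 0)
        return false;
    }
    return true;
  }
```

If `assert(stores_agree())` holds on the target, the seven bit-field assignments write the same two bytes as the copy, which is why a single 16-bit load and store would be the ideal code for `set_header`.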