https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93722
            Bug ID: 93722
           Summary: rorq is not produced for rotate on some cases
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
            Target: x86_64-linux-gnu

Take:
void f0(unsigned long *a)
{
  __uint128_t t0 = ((__uint128_t *)a)[0];
  __uint128_t t1 = t0>>sizeof(unsigned long)*8;
  __uint128_t t2 = t0<<sizeof(unsigned long)*8;
  ((__uint128_t*)a)[0] = t1 | t2;
}
--- CUT ---
We should just produce:
        rolq    $32, (%rdi)
        ret
--- CUT ---
But we produce:
        movq    8(%rdi), %rdx
        movq    (%rdi), %rax
        movq    %rdx, (%rdi)
        movq    %rax, 8(%rdi)
        ret

Note it gets worse when using rotate to half-way
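
For reference, a minimal sketch (not part of the original report) of what the testcase computes. On x86_64-linux-gnu, sizeof(unsigned long)*8 is 64, so f0 rotates the 128-bit value by 64 bits, which exchanges its two 64-bit halves; that is the effect of the four movq instructions shown above. The helper name rotate128_by_64 is hypothetical, and memcpy is used here instead of the pointer cast only to sidestep aliasing and alignment assumptions. It assumes a compiler that provides __uint128_t (GCC or Clang on a 64-bit target).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not from the report: the same rotate as f0,
   written with memcpy rather than a pointer cast. */
static void rotate128_by_64(uint64_t a[2])
{
    __uint128_t t0;
    memcpy(&t0, a, sizeof t0);      /* load the 128-bit value */
    __uint128_t t1 = t0 >> 64;      /* high half moves to the low half */
    __uint128_t t2 = t0 << 64;      /* low half moves to the high half */
    t0 = t1 | t2;                   /* rotate by 64 == swap of the halves */
    memcpy(a, &t0, sizeof t0);      /* store it back */
}
```

Calling it on a two-element array leaves the elements exchanged, independent of byte order, since both halves move by exactly one 64-bit word.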