https://issues.dlang.org/show_bug.cgi?id=14943
--- Comment #1 from [email protected] ---
Further notes:

- gdc not only inlines the call trees of .empty, .front, .popFront, it also
  applies other loop optimizations like strength reduction to refactor the
  a => a*7 into a += b; b += 7. Not sure if dmd is capable of doing this, but
  in any case the opportunity is missed because .popFront was not inlined, so
  the optimizer wouldn't have been able to apply strength reduction.

- gdc's aggressive inlining also allowed various loop counters and
  accumulators to be held entirely in registers, while the function calls
  generated by dmd necessitated dereferencing addresses to stack variables,
  which is an extra layer of indirection. Again, a missed opportunity due to
  not inlining aggressively enough.

For reference, here's the inner loop produced by gdc:

  403b80:  89 d7                  mov    %edx,%edi
  403b82:  c1 ef 1f               shr    $0x1f,%edi
  403b85:  8d 34 3a               lea    (%rdx,%rdi,1),%esi
  403b88:  83 c2 07               add    $0x7,%edx
  403b8b:  83 e6 01               and    $0x1,%esi
  403b8e:  39 fe                  cmp    %edi,%esi
  403b90:  75 1e                  jne    403bb0 <int test.fun(int)+0x80>
  403b92:  89 c6                  mov    %eax,%esi
  403b94:  8d 14 cd 00 00 00 00   lea    0x0(,%rcx,8),%edx
  403b9b:  c1 ee 1f               shr    $0x1f,%esi
  403b9e:  01 f0                  add    %esi,%eax
  403ba0:  29 ca                  sub    %ecx,%edx
  403ba2:  d1 f8                  sar    %eax
  403ba4:  01 d0                  add    %edx,%eax
  403ba6:  83 c2 07               add    $0x7,%edx
  403ba9:  0f 1f 80 00 00 00 00   nopl   0x0(%rax)
  403bb0:  83 c1 01               add    $0x1,%ecx
  403bb3:  39 cb                  cmp    %ecx,%ebx
  403bb5:  75 c9                  jne    403b80 <int test.fun(int)+0x50>
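
To make the two points above concrete, here is a rough source-level sketch in C. It is not the test program from this bug, which is not reproduced in this comment. The struct mul_range and every function name below are hypothetical stand-ins: the first loop mimics a range whose front() multiplies by 7 and whose state is reached through a pointer, and the second shows roughly what the loop can look like once the calls are inlined and the multiply is strength-reduced into a running addition. It assumes n is non-negative and small enough that the sum does not overflow.

```c
#include <assert.h>

/* Hypothetical stand-in for a mapped range yielding i*7 for i in [0, n). */
struct mul_range { int i; int n; };

static int  range_empty(const struct mul_range *r) { return r->i >= r->n; }
static int  range_front(const struct mul_range *r) { return r->i * 7; }
static void range_pop_front(struct mul_range *r)   { r->i += 1; }

/* Range-style loop. If these calls are NOT inlined, the state has to live
 * in memory behind the pointer r, and every front() does a multiply. */
int sum_via_range(int n)
{
    struct mul_range r = { 0, n };
    int sum = 0;
    while (!range_empty(&r)) {
        sum += range_front(&r);
        range_pop_front(&r);
    }
    return sum;
}

/* Roughly the shape after inlining plus strength reduction: the multiply
 * is replaced by a running value b that tracks i*7, and i, b and sum are
 * plain locals that can all stay in registers. */
int sum_strength_reduced(int n)
{
    int sum = 0;
    int b = 0;                      /* invariant: b == i * 7 */
    for (int i = 0; i < n; ++i) {
        sum += b;
        b += 7;
    }
    return sum;
}

/* Sanity check that both forms compute the same result. */
void check_equivalent(int n)
{
    assert(sum_via_range(n) == sum_strength_reduced(n));
}
```

A C compiler with optimization enabled will typically inline the small static functions itself, so the two functions may well compile to similar code. The sketch only shows the transformation at source level, and why it is unavailable to the optimizer while popFront remains an opaque call.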
