https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011
Yuri Rumyantsev <ysrumyan at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ysrumyan at gmail dot com

--- Comment #6 from Yuri Rumyantsev <ysrumyan at gmail dot com> ---
I don't see any issues with 'false dependency' on HSW. I've got sep data on it:

for the unsigned variant (with LEA instructions):

0x400b30   52   161  lea    0x1(%rdx),%ecx
0x400b33   53     0  popcnt (%rbx,%rax,8),%rax
0x400b39   54   353  lea    0x2(%rdx),%r8d
0x400b3d   55     0  popcnt (%rbx,%rcx,8),%rcx
0x400b43   56   170  add    %rax,%rcx
0x400b46   57    25  lea    0x3(%rdx),%esi
0x400b49   58   332  popcnt (%rbx,%r8,8),%rax
0x400b4f   59   196  add    %rax,%rcx
0x400b52   60   199  popcnt (%rbx,%rsi,8),%rax
0x400b58   61   235  add    %rax,%rcx
0x400b5b   62   414  lea    0x4(%rdx),%eax
0x400b5e   63     0  add    %rcx,%r14
0x400b61   64   312  mov    %rax,%rdx
0x400b64   65     0  cmp    %rax,%r12
0x400b67   66     0  ja     400b30 <main+0xb0>

and we don't see any performance anomaly with popcnt.

But for the 2nd loop we have

0x400c50  118     0  popcnt -0x8(%rdx),%rax
0x400c56  119     0  popcnt (%rdx),%rcx
0x400c5b  120  1086  add    %rax,%rcx
0x400c5e  121   492  popcnt 0x8(%rdx),%rax
0x400c64  122     3  add    %rcx,%rax
0x400c67  123   507  add    $0x20,%rdx
0x400c6b  124     0  popcnt -0x10(%rdx),%rcx
0x400c71  125   955  add    %rax,%rcx
0x400c74  126   479  add    %rcx,%r13
0x400c77  127   489  cmp    %rsi,%rdx
0x400c7a  128     0  jne    400c50 <main+0x1d0>

So far I can't imagine what the problem is.
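
For readers without the test case at hand, the two listings above are consistent with a pair of loops roughly like the following. This is only a sketch under assumptions, not the source attached to the bug: the function names, the buffer argument, and the unroll-by-four shape are hypothetical, and `__builtin_popcountll` is the GCC/Clang builtin that normally becomes `popcnt` when the target supports it. The first function indexes with a 32-bit `unsigned`, which is one plausible reason for the LEA instructions writing 32-bit registers; the second uses a 64-bit index, which a compiler can turn into plain pointer increments.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 32-bit index, body unrolled by four.
 * Each i + k is computed in 32-bit unsigned arithmetic before it is
 * used as an array index. */
uint64_t popcount_sum_u32(const uint64_t *buf, unsigned n)
{
    assert(n % 4 == 0); /* the unrolled body consumes four elements per pass */
    uint64_t total = 0;
    for (unsigned i = 0; i < n; i += 4) {
        total += (uint64_t)__builtin_popcountll(buf[i]);
        total += (uint64_t)__builtin_popcountll(buf[i + 1]);
        total += (uint64_t)__builtin_popcountll(buf[i + 2]);
        total += (uint64_t)__builtin_popcountll(buf[i + 3]);
    }
    return total;
}

/* Hypothetical helper: same computation with a 64-bit index. */
uint64_t popcount_sum_u64(const uint64_t *buf, uint64_t n)
{
    assert(n % 4 == 0); /* same unroll-by-four requirement */
    uint64_t total = 0;
    for (uint64_t i = 0; i < n; i += 4) {
        total += (uint64_t)__builtin_popcountll(buf[i]);
        total += (uint64_t)__builtin_popcountll(buf[i + 1]);
        total += (uint64_t)__builtin_popcountll(buf[i + 2]);
        total += (uint64_t)__builtin_popcountll(buf[i + 3]);
    }
    return total;
}
```

Both functions return the same sum for the same input; only the index type differs, so any timing gap between them comes from the generated code rather than from the work done. Whether a given compiler emits exactly the instruction sequences shown above depends on its version and flags.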