https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011

Yuri Rumyantsev <ysrumyan at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ysrumyan at gmail dot com

--- Comment #6 from Yuri Rumyantsev <ysrumyan at gmail dot com> ---
I don't see any 'false dependency' issue on HSW (Haswell). I collected sep data on it:

For the unsigned variant (with LEA instructions):

0x400b30 52 161 lea    0x1(%rdx),%ecx 
0x400b33 53 0 popcnt (%rbx,%rax,8),%rax 
0x400b39 54 353 lea    0x2(%rdx),%r8d 
0x400b3d 55 0 popcnt (%rbx,%rcx,8),%rcx 
0x400b43 56 170 add    %rax,%rcx 
0x400b46 57 25 lea    0x3(%rdx),%esi 
0x400b49 58 332 popcnt (%rbx,%r8,8),%rax 
0x400b4f 59 196 add    %rax,%rcx 
0x400b52 60 199 popcnt (%rbx,%rsi,8),%rax 
0x400b58 61 235 add    %rax,%rcx 
0x400b5b 62 414 lea    0x4(%rdx),%eax 
0x400b5e 63 0 add    %rcx,%r14 
0x400b61 64 312 mov    %rax,%rdx 
0x400b64 65 0 cmp    %rax,%r12 
0x400b67 66 0 ja     400b30 <main+0xb0> 

and we don't see any performance anomaly with popcnt.

But for the 2nd loop we have:

0x400c50 118 0 popcnt -0x8(%rdx),%rax 
0x400c56 119 0 popcnt (%rdx),%rcx 
0x400c5b 120 1086 add    %rax,%rcx 
0x400c5e 121 492 popcnt 0x8(%rdx),%rax 
0x400c64 122 3 add    %rcx,%rax 
0x400c67 123 507 add    $0x20,%rdx 
0x400c6b 124 0 popcnt -0x10(%rdx),%rcx 
0x400c71 125 955 add    %rax,%rcx 
0x400c74 126 479 add    %rcx,%r13 
0x400c77 127 489 cmp    %rsi,%rdx 
0x400c7a 128 0 jne    400c50 <main+0x1d0>

So far I can't imagine what the problem is.
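
For reference, here is a rough C sketch of the kind of source that could produce the two loop shapes above. It is an assumption, not the test case attached to the bug: the function names, the 4x manual unrolling, and the use of GCC's __builtin_popcountll are illustrative only. With a 32-bit index each i+1, i+2, i+3 has to be zero-extended before it can be used in an address, which would match the LEA instructions in the first listing. With a 64-bit index the compiler is free to reduce the index to a pointer that advances by 0x20 per iteration, as in the second listing.

```c
#include <stdint.h>

/* Hypothetical reconstruction; names and unrolling are assumptions.
   n is assumed to be a multiple of 4. */

/* 32-bit index: i+1, i+2, i+3 are computed in 32 bits and zero-extended. */
uint64_t count_unsigned(const uint64_t *buf, unsigned n)
{
    uint64_t total = 0;
    for (unsigned i = 0; i < n; i += 4) {
        total += __builtin_popcountll(buf[i]);
        total += __builtin_popcountll(buf[i + 1]);
        total += __builtin_popcountll(buf[i + 2]);
        total += __builtin_popcountll(buf[i + 3]);
    }
    return total;
}

/* 64-bit index: no extension is needed, so the index can become a pointer. */
uint64_t count_uint64(const uint64_t *buf, uint64_t n)
{
    uint64_t total = 0;
    for (uint64_t i = 0; i < n; i += 4) {
        total += __builtin_popcountll(buf[i]);
        total += __builtin_popcountll(buf[i + 1]);
        total += __builtin_popcountll(buf[i + 2]);
        total += __builtin_popcountll(buf[i + 3]);
    }
    return total;
}
```

Both functions return the same count; only the index type differs. Whether a given GCC version emits exactly the code shown in the listings depends on the version and options (something like -O3 -mpopcnt would be needed to get the popcnt instruction at all).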
