https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43
--- Comment #7 from Paul Eggert ---
(In reply to Alexander Monakov from comment #6)
> Are you binding the benchmark to some core in particular?
I ran the benchmark on performance cores, which was my original use case. On
efficiency cores,
--- Comment #6 from Alexander Monakov ---
Thanks.
The i5-1335U has two "performance cores" (with HT, four logical CPUs) and eight
"efficiency cores", and the two kinds have different microarchitectures. Are you
binding the benchmark to a particular core?
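For reference, one way to bind a benchmark to a single logical CPU on Linux is sched_setaffinity(2). The sketch below is only an illustration: the helper name pin_to_cpu is made up, and which logical CPU numbers map to performance or efficiency cores is machine-specific, so nothing here assumes a particular numbering.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Hypothetical helper: restrict the calling thread to one logical CPU.
   Returns 0 on success, -1 on failure (bad CPU number or syscall error). */
static int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;

    CPU_ZERO(&set);          /* start with an empty CPU mask */
    CPU_SET(cpu, &set);      /* allow only the requested CPU */

    /* pid 0 means "the calling thread" */
    return sched_setaffinity(0, sizeof set, &set);
}
```

Calling pin_to_cpu() once at the start of main, before the timed loop, keeps the measurement on one core type. The same effect is available without code changes via taskset, e.g. 'taskset -c 0 ./bench' (the binary name is a placeholder).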
--- Comment #5 from Paul Eggert ---
(In reply to Alexander Monakov from comment #4)
> To evaluate scheduling aspect, keep 'mov eax, 1' while changing 'add rbx,
> rax' to 'add rbx, 1'.
Adding the (unnecessary) 'mov eax, 1' doesn't affect the
--- Comment #3 from Andrew Pinski ---
  _22 = *iter_57;
  if (_22 >= 0)
    goto ; [90.00%]
  else
    goto ; [10.00%]

  [local count: 860067200]:
  _76 = (long long unsigned int) _22;
  _15 = sum_31 + _76;
  goto ; [100.00%]
...
[local
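For readers not used to GIMPLE dumps, the fragment above reads roughly like the C loop body below. This is a guess at the shape, not the actual source: the element type (signed char), the function name, and the loop bounds are assumptions, and the else arm is not visible in the dump, so here it is a placeholder that only counts the byte.

```c
#include <stddef.h>

/* Hypothetical reconstruction: sum the non-negative bytes in buf[0..n).
   *others (if non-NULL) receives the number of bytes that took the
   else arm, whose real contents are not shown in the dump. */
static unsigned long long
sum_nonnegative(const signed char *buf, size_t n, size_t *others)
{
    unsigned long long sum = 0;
    size_t skipped = 0;

    for (const signed char *iter = buf; iter < buf + n; iter++) {
        signed char c = *iter;              /* _22 = *iter_57;            */
        if (c >= 0)                         /* predicted taken 90%        */
            sum += (unsigned long long) c;  /* _76 = (cast) _22; _15 = sum_31 + _76; */
        else
            skipped++;                      /* placeholder slow path      */
    }

    if (others)
        *others = skipped;
    return sum;
}
```

The point of the sketch is only that the common (90%) path is a load, a sign test, and an add, which is the path the benchmark numbers in this thread are about.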
--- Comment #2 from Paul Eggert ---
Created attachment 55790
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55790&action=edit
asm code that's 38% faster on my platform
--- Comment #1 from Paul Eggert ---
Created attachment 55789
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55789&action=edit
asm code generated by gcc -O2 -S