On Thursday, 14 April 2016 at 02:55:01 UTC, deadalnix wrote:
On Wednesday, 13 April 2016 at 22:13:27 UTC, Walter Bright wrote:
On 4/12/2016 5:06 AM, Andrei Alexandrescu wrote:
Interesting: http://blog.regehr.org/archives/1384 -- Andrei
Curiously never mentioned is the following optimization:
return a+b*2+27;
becomes:
LEA EAX,27[ESI][EDI*2]
To overflow check:
ADD EDI,EDI
JO overflow
ADD EDI,27
JO overflow
MOV EAX,ESI
ADD EAX,EDI
JO overflow
I don't see efficiency there, even with the JO's being free.
It is clearly not as optimal, but still pretty good. The
article doesn't pretend it all comes for free, just that it
comes much cheaper than before.
Also, I just checked: on Sandy Bridge, the LEA has a 3-clock
latency (but starts earlier in the pipeline) and the ADD has 1,
so it is not as bad as it looks (it is still bad).
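
For anyone who wants to compare the two code paths themselves, here is a minimal sketch in C. It assumes GCC or Clang, since it relies on their `__builtin_mul_overflow` and `__builtin_add_overflow` intrinsics. The function names (`expr_unchecked`, `expr_checked`, `expr_checked_selftest`) are made up for illustration. It checks in source order, (a + b*2) + 27, which is not quite the order used in the hand-written sequence quoted above, so the two can disagree on a few edge cases.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Unchecked form: a compiler is free to fold this into a single LEA.
   Signed overflow here is undefined behaviour in C. */
int expr_unchecked(int a, int b)
{
    return a + b * 2 + 27;
}

/* Checked form: one overflow test per arithmetic step.
   Returns true and stores the result in *out on success;
   returns false if any step overflows (*out should then be ignored). */
bool expr_checked(int a, int b, int *out)
{
    int t;
    if (__builtin_mul_overflow(b, 2, &t))   /* b * 2         */
        return false;
    if (__builtin_add_overflow(a, t, &t))   /* a + b * 2     */
        return false;
    if (__builtin_add_overflow(t, 27, out)) /* a + b*2 + 27  */
        return false;
    return true;
}

/* Hypothetical self-check showing how the checked form behaves. */
void expr_checked_selftest(void)
{
    int r;
    assert(expr_checked(1, 2, &r) && r == 32); /* 1 + 4 + 27 */
    assert(!expr_checked(0, INT_MAX, &r));     /* b * 2 overflows */
    assert(!expr_checked(INT_MAX, 0, &r));     /* final + 27 overflows */
}
```

Compiling both functions with optimization on and looking at the disassembly should show whether your compiler emits the LEA for the first and an ADD/JO chain for the second; I have not included output here since it varies by compiler and flags.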