[v8-dev] Re: X64: Remove more fpu code. Unroll more local initialization loops. (issue2815028)

lrn Thu, 24 Jun 2010 02:02:27 -0700


http://codereview.chromium.org/2815028/diff/1/6
File src/x64/virtual-frame-x64.cc (right):


http://codereview.chromium.org/2815028/diff/1/6#newcode119
src/x64/virtual-frame-x64.cc:119: // For less locals the unrolled loop
is more compact.
Fixed.

http://codereview.chromium.org/2815028/diff/1/6#newcode124
src/x64/virtual-frame-x64.cc:124: ASSERT(tmp.is_valid());
Probably not. This code runs on entry to the function, so it's
incredibly likely that one of the first four registers is free, and if
that happens most of the time, it's fine (I'm guessing there is really a
100% chance of getting rax, but I'm still satisfied if it's only 90%).
Also, it only costs one extra byte per push if we fail to get one of the
low registers, and if there are many pushes, we go to the loop anyway,
so the damage is minimal.
This is really a micro-optimization for code size in the common case
more than anything else.
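
To make the one-extra-byte-per-push cost concrete, here is a minimal
sketch. It is not code from the patch. It assumes the standard x86-64
encoding, where push of a register with hardware code 0..7 (rax..rdi)
is the single opcode byte 0x50+r, and push of r8..r15 needs a REX.B
prefix first. The names PushEncodingSize and UnrolledPushBytes are
hypothetical helpers, not V8 functions, and the sketch does not model
which registers V8's allocator actually hands out.

```cpp
#include <cassert>

// Hardware register codes on x86-64: 0..7 are rax, rcx, rdx, rbx, rsp,
// rbp, rsi, rdi; 8..15 are r8..r15.
constexpr int kNumRegisters = 16;

// Size in bytes of `push reg` (hypothetical helper, not a V8 API).
// Codes 0..7 encode as the single byte 0x50+code; codes 8..15 need a
// REX.B prefix (0x41) in front of that byte.
int PushEncodingSize(int reg_code) {
  assert(reg_code >= 0 && reg_code < kNumRegisters);
  return reg_code < 8 ? 1 : 2;
}

// Total bytes emitted for `count` unrolled pushes of the same register.
int UnrolledPushBytes(int reg_code, int count) {
  assert(count >= 0);
  return count * PushEncodingSize(reg_code);
}
```

Under these assumptions, four unrolled pushes take 4 bytes with rax
(code 0) and 8 bytes with r9 (code 9), so a missed low register costs
exactly one byte per push.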

http://codereview.chromium.org/2815028/diff/1/6#newcode141
src/x64/virtual-frame-x64.cc:141: __ movb(cnt.reg(), Immediate(count));
There should be no partial register read stalls on Intel processors,
since we only read the part we just wrote (there is no need to merge
partial registers).
On AMD processors there should be no problem either, since they always
merge immediately, and there are no dependencies that we have to wait
for.

http://codereview.chromium.org/2815028/show

--
v8-dev mailing list
[email protected]
http://groups.google.com/group/v8-dev
