On 2013/10/16 08:58:17, Sven Panne wrote:
[ Reviving the dead... :-]
While the performance improvement on kraken's audio-oscillator is great, I've
just seen that we actually regress SunSpider's math-spectral-norm by roughly 25%
on linux-x64 (Core 2 and Core i5). I am not sure what the actual reason is,
bmeurer@ has the theory that it might be related to single vs. double mode.
Anyway, we might have to investigate this further, any help from the Intel side
would be highly appreciated.
Hi Sven,
Thanks a lot for sharing this.
I did a quick experiment on my PC (Core 2), simply removing the xorps before
cvtsi2sd, as below:
diff --git a/src/x64/macro-assembler-x64.cc b/src/x64/macro-assembler-x64.cc
index 9dcb9d1..05ae973 100644
--- a/src/x64/macro-assembler-x64.cc
+++ b/src/x64/macro-assembler-x64.cc
@@ -936,13 +936,11 @@ void MacroAssembler::PopCallerSaved(SaveFPRegsMode fp_mode,
 void MacroAssembler::Cvtlsi2sd(XMMRegister dst, Register src) {
-  xorps(dst, dst);
   cvtlsi2sd(dst, src);
 }
 void MacroAssembler::Cvtlsi2sd(XMMRegister dst, const Operand& src) {
-  xorps(dst, dst);
   cvtlsi2sd(dst, src);
 }
Then I got the following result for SunSpider's math-spectral-norm. Because
math-spectral-norm is quite a small test case (only 4-5 ms on my machine), I
load it ten times to measure its performance.
Case: (run.js)
var _start = new Date();
for (var ii = 0; ii < 10; ii += 1) {
  load("math-spectral-norm.js");
}
var elapsed = new Date() - _start;
print(elapsed + " ms");
Result:
[~]./d8.x64.no.xorps run.js
28 ms
[~]./d8.x64.xorps run.js
23 ms
So on my Core 2 it seems that using xorps actually gives an improvement of
roughly 18% (28 ms without xorps vs. 23 ms with it). :)
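
For reference, the percentage follows directly from the two timings above. A minimal sketch of the arithmetic; the helper name relativeImprovement is only for illustration and is not part of V8 or SunSpider, and console.log is used here where run.js uses d8's print:

```javascript
// Hypothetical helper: percentage of the baseline time saved by the candidate.
function relativeImprovement(baselineMs, candidateMs) {
  return (baselineMs - candidateMs) / baselineMs * 100;
}

// Timings from the runs above: 28 ms without xorps, 23 ms with xorps.
var improvement = relativeImprovement(28, 23);
console.log(improvement.toFixed(1) + " %");  // prints "17.9 %"
```

With only ten loads and whole-millisecond timer resolution, a single run like this is probably noisy, so the number should be read as a rough indication rather than a precise measurement.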
https://codereview.chromium.org/23654026/