LGTM with suggestions.
http://codereview.chromium.org/660072/diff/1062/1063 File src/ia32/codegen-ia32.cc (right): http://codereview.chromium.org/660072/diff/1062/1063#newcode5345 src/ia32/codegen-ia32.cc:5345: __ test(y.reg(), Immediate(1)); If you shr(y.reg(),1) here, then the bit that was shifted out is in the carry flag. I.e.: shr(y.reg(), 1); j(not_carry, &no_multiply); mulsd(xmm1, xmm0); bind(&no_multiply); http://codereview.chromium.org/660072/diff/1062/1063#newcode5351 src/ia32/codegen-ia32.cc:5351: __ j(zero, &powi_done); Just do: mulsd(xmm0, xmm0); j(not_zero, &while_true); // mulsd doesn't affect flags. It does one more mulsd than necessary, but on, e.g., a core2 chip, that's just one cycle, and it saves an instruction on each iteration of the loop. This and he previous comment should reduce the loop to four operations. That's probably only a space saving, since some of the multiplications are dependent and have a five cycle latency (and probably want to use the same ALU). http://codereview.chromium.org/660072/diff/1062/1063#newcode5360 src/ia32/codegen-ia32.cc:5360: __ mov(p.reg(), Immediate(0x7FF00000)); I don't think this is single-precission infinity. Single precission floats only have eight exponent bits. Positive infinity would be 0x7f800000. (And 0x7ff00000 is a quiet NaN with payload). Do we need a better test for this case? :) http://codereview.chromium.org/660072/diff/1062/1063#newcode5363 src/ia32/codegen-ia32.cc:5363: __ comisd(xmm0, xmm1); Use ucomisd instead of comisd. http://codereview.chromium.org/660072/diff/1062/1063#newcode5376 src/ia32/codegen-ia32.cc:5376: __ comisd(xmm1, xmm1); ucomisd http://codereview.chromium.org/660072/diff/1062/1063#newcode5395 src/ia32/codegen-ia32.cc:5395: __ cmp(Operand(p.reg()), Immediate(HeapNumber::kExponentMask)); Add comment here saying something like "p is NaN or +/-Infinity" http://codereview.chromium.org/660072/diff/1062/1063#newcode5409 src/ia32/codegen-ia32.cc:5409: __ comisd(xmm2, xmm1); ucomisd http://codereview.chromium.org/660072/diff/1062/1063#newcode5413 src/ia32/codegen-ia32.cc:5413: __ sqrt(xmm0, xmm0); Why isn't sqrt called sqrtsd? http://codereview.chromium.org/660072/diff/1062/1063#newcode5415 src/ia32/codegen-ia32.cc:5415: __ movsd(xmm1, xmm3); Move sqrt to the end (i.e., sqrt(xmm1,xmm1)) since it'shas a higher latency than movsd. (And perhaps comment that 1/sqrt(x) = sqrt(1/x)). http://codereview.chromium.org/660072/diff/1062/1063#newcode5421 src/ia32/codegen-ia32.cc:5421: __ mov(p.reg(), Immediate(0x3F000000)); You have -0.5 in xmm2 already, and 1.0 in xmm3, so just do addsd(xmm2, xmm3) to get +0.5. http://codereview.chromium.org/660072/diff/1062/1063#newcode5425 src/ia32/codegen-ia32.cc:5425: __ comisd(xmm2, xmm1); ucomisd http://codereview.chromium.org/660072/diff/1062/1063#newcode5428 src/ia32/codegen-ia32.cc:5428: __ sqrt(xmm0, xmm0); do movsd(xmm1, xmm0); sqrt(xmm1, xmm1); instead, to allow the latency of the sqrt operation to be spent during the heap allocation code, instead of stalling on the move. http://codereview.chromium.org/660072 -- v8-dev mailing list [email protected] http://groups.google.com/group/v8-dev
