LGTM with suggestions.

http://codereview.chromium.org/660072/diff/1062/1063
File src/ia32/codegen-ia32.cc (right):

http://codereview.chromium.org/660072/diff/1062/1063#newcode5345
src/ia32/codegen-ia32.cc:5345: __ test(y.reg(), Immediate(1));
If you shr(y.reg(),1) here, then the bit that was shifted out is in the
carry flag. I.e.:
shr(y.reg(), 1);
j(not_carry, &no_multiply);
mulsd(xmm1, xmm0);
bind(&no_multiply);

http://codereview.chromium.org/660072/diff/1062/1063#newcode5351
src/ia32/codegen-ia32.cc:5351: __ j(zero, &powi_done);
Just do:
 mulsd(xmm0, xmm0);
 j(not_zero, &while_true);  // mulsd doesn't affect flags.

It does one more mulsd than necessary, but on, e.g., a core2 chip,
that's just one cycle, and it saves an instruction on each iteration of
the loop.

This and he previous comment should reduce the loop to four operations.
That's probably only a space saving, since some of the multiplications
are dependent and have a five cycle latency (and probably want to use
the same ALU).

http://codereview.chromium.org/660072/diff/1062/1063#newcode5360
src/ia32/codegen-ia32.cc:5360: __ mov(p.reg(), Immediate(0x7FF00000));
I don't think this is single-precission infinity.
Single precission floats only have eight exponent bits.
Positive infinity would be 0x7f800000.
(And 0x7ff00000 is a quiet NaN with payload).

Do we need a better test for this case? :)

http://codereview.chromium.org/660072/diff/1062/1063#newcode5363
src/ia32/codegen-ia32.cc:5363: __ comisd(xmm0, xmm1);
Use ucomisd instead of comisd.

http://codereview.chromium.org/660072/diff/1062/1063#newcode5376
src/ia32/codegen-ia32.cc:5376: __ comisd(xmm1, xmm1);
ucomisd

http://codereview.chromium.org/660072/diff/1062/1063#newcode5395
src/ia32/codegen-ia32.cc:5395: __ cmp(Operand(p.reg()),
Immediate(HeapNumber::kExponentMask));
Add comment here saying something like "p is NaN or +/-Infinity"

http://codereview.chromium.org/660072/diff/1062/1063#newcode5409
src/ia32/codegen-ia32.cc:5409: __ comisd(xmm2, xmm1);
ucomisd

http://codereview.chromium.org/660072/diff/1062/1063#newcode5413
src/ia32/codegen-ia32.cc:5413: __ sqrt(xmm0, xmm0);
Why isn't sqrt called sqrtsd?

http://codereview.chromium.org/660072/diff/1062/1063#newcode5415
src/ia32/codegen-ia32.cc:5415: __ movsd(xmm1, xmm3);
Move sqrt to the end (i.e., sqrt(xmm1,xmm1)) since it'shas a higher
latency than movsd.
(And perhaps comment that 1/sqrt(x) = sqrt(1/x)).

http://codereview.chromium.org/660072/diff/1062/1063#newcode5421
src/ia32/codegen-ia32.cc:5421: __ mov(p.reg(), Immediate(0x3F000000));
You have -0.5 in xmm2 already, and 1.0 in xmm3, so just do
addsd(xmm2, xmm3) to get +0.5.

http://codereview.chromium.org/660072/diff/1062/1063#newcode5425
src/ia32/codegen-ia32.cc:5425: __ comisd(xmm2, xmm1);
ucomisd

http://codereview.chromium.org/660072/diff/1062/1063#newcode5428
src/ia32/codegen-ia32.cc:5428: __ sqrt(xmm0, xmm0);
do
 movsd(xmm1, xmm0);
 sqrt(xmm1, xmm1);
instead, to allow the latency of the sqrt operation to be spent during
the heap allocation code, instead of stalling on the move.

http://codereview.chromium.org/660072

--
v8-dev mailing list
[email protected]
http://groups.google.com/group/v8-dev

Reply via email to