Status: New
Owner: ----
New issue 1518 by [email protected]: Shorter ia32 deferred code fragments
http://code.google.com/p/v8/issues/detail?id=1518
Deferred code segments could generally be shorter and straight-line in the
common case.
If the Generate() methods are changed to take the EXIT label, the
deferred code has fewer constraints.
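A minimal sketch of the idea, not V8's actual code: MockAssembler, Label, GenerateWithoutExit, EmitDeferredOld and GenerateWithExit are hypothetical names, and the "assumed current shape" is inferred only from the listing in this issue, where the expected path branches to a trailing jmp. It shows the tail of the example (the minus-zero check) emitted both ways as text.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not V8's classes.
struct Label { std::string name; };

struct MockAssembler {
  std::vector<std::string> lines;
  void Emit(const std::string& text) { lines.push_back(text); }
};

// Assumed current shape: the body cannot see the exit label, so the expected
// path branches to a local label and the caller appends the jump back.
void GenerateWithoutExit(MockAssembler& masm) {
  masm.Emit("test r,r");
  masm.Emit("jnz done");
  masm.Emit("movmskpd");
  masm.Emit("and r,1");
  masm.Emit("jnz deoptimization_bailout_N");
  masm.Emit("done:");
}

void EmitDeferredOld(MockAssembler& masm, const Label& exit) {
  assert(!exit.name.empty());
  GenerateWithoutExit(masm);
  masm.Emit("jmp " + exit.name);  // extra hop taken on every expected exit
}

// Proposed shape: the body receives the exit label and branches to it
// directly, leaving only the unexpected path to fall through to the bailout.
void GenerateWithExit(MockAssembler& masm, const Label& exit) {
  assert(!exit.name.empty());
  masm.Emit("test r,r");
  masm.Emit("jnz " + exit.name);
  masm.Emit("movmskpd");
  masm.Emit("and r,1");
  masm.Emit("jz " + exit.name);
  masm.Emit("bail:");
  masm.Emit("jmp deoptimization_bailout_N");
}
```

With the exit label available, the expected path leaves the fragment on a single conditional branch instead of a conditional branch followed by an unconditional jmp.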
Other than the space saving, I don't see this having much impact on
benchmarks.
Example:
0xf53b0ffd 285 8179ffa14037f5 cmp [ecx+0xff],0xf53740a1 ;; object: 0xf53740a1 <Map>
0xf53b1004 292 0f85483077ff jnz 0xf4b24052 ;; deoptimization bailout 5
0xf53b100a 298 f20f104103 movsd xmm0,[ecx+0x3]
0xf53b100f 303 f20f2cc8 cvttsd2si ecx,xmm0
0xf53b1013 307 f20f2ac9 cvtsi2sd xmm1,ecx
0xf53b1017 311 660f2ec1 ucomisd xmm0,xmm1
0xf53b101b 315 0f85313077ff jnz 0xf4b24052 ;; deoptimization bailout 5
0xf53b1021 321 0f8a2b3077ff jpe 0xf4b24052 ;; deoptimization bailout 5
0xf53b1027 327 85c9 test ecx,ecx
0xf53b1029 329 0f850d000000 jnz 348 (0xf53b103c)
0xf53b102f 335 660f50c8 movmskpd ecx,xmm0
0xf53b1033 339 83e101 and ecx,0x1
0xf53b1036 342 0f85163077ff jnz 0xf4b24052 ;; deoptimization bailout 5
0xf53b103c 348 e9e0feffff jmp 65 (0xf53b0f21)
(68 bytes + 4 relocation records)
Proposed layout:
ENTRY:
cmp [r-1],<Map>
jnz short bail
movsd
cvttsd2si
cvtsi2sd
ucomisd
jnz short bail
jpe short bail
test r,r
jnz EXIT
movmskpd
and r,1
jz EXIT
bail:
jmp deoptimization_bailout_N
(56 bytes + 1 relocation record)
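The two byte counts can be cross-checked with a little arithmetic. In the sketch below the non-branch instruction sizes are read off the encodings in the listing above. The branch sizes are the standard ia32 encodings: 2 bytes for a short (rel8) conditional jump, 6 for a near (rel32) conditional jump, 5 for a near jmp. That the EXIT branches need the near form while the "bail" branches fit the short form is my assumption; the function names are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Sums a list of instruction sizes in bytes.
std::size_t TotalBytes(const std::vector<std::size_t>& sizes) {
  assert(!sizes.empty());
  return std::accumulate(sizes.begin(), sizes.end(), std::size_t{0});
}

// Current fragment: every conditional branch uses the 6-byte near form.
std::size_t CurrentFragmentBytes() {
  return TotalBytes({7,   // cmp [ecx-1],<Map>
                     6,   // jnz deopt (near)
                     5,   // movsd
                     4,   // cvttsd2si
                     4,   // cvtsi2sd
                     4,   // ucomisd
                     6,   // jnz deopt (near)
                     6,   // jpe deopt (near)
                     2,   // test ecx,ecx
                     6,   // jnz to the trailing jmp (near)
                     4,   // movmskpd
                     3,   // and ecx,1
                     6,   // jnz deopt (near)
                     5}); // jmp back to the main code
}

// Proposed fragment: short branches to the local "bail" label, near
// branches to EXIT (assumed), and a single near jmp to the bailout.
std::size_t ProposedFragmentBytes() {
  return TotalBytes({7,   // cmp [r-1],<Map>
                     2,   // jnz short bail
                     5,   // movsd
                     4,   // cvttsd2si
                     4,   // cvtsi2sd
                     4,   // ucomisd
                     2,   // jnz short bail
                     2,   // jpe short bail
                     2,   // test r,r
                     6,   // jnz EXIT (near)
                     4,   // movmskpd
                     3,   // and r,1
                     6,   // jz EXIT (near)
                     5}); // bail: jmp deoptimization_bailout_N
}
```

Under those assumptions the totals come to 68 and 56 bytes, matching the figures quoted for the two layouts.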
The above code branches forward in the unexpected case and backwards (to
the main code) in the expected case, consistent with:
[Intel® 64 and IA-32 Architectures Optimization Reference Manual, Order
Number: 248966-024, April 2011]
Assembly/Compiler Coding Rule 3. (M impact, H generality) Arrange code to
be consistent with the static branch prediction algorithm: make the
fall-through code following a conditional branch be the likely target for
a branch with a forward target, and make the fall-through code following a
conditional branch be the unlikely target for a branch with a backward
target.
(However, I have read that recent microarchitectures tend to always use
dynamic prediction rather than this algorithm.)
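For illustration, the static rule implied by the quoted guideline (backward conditional branches assumed taken, forward ones assumed not taken) can be written as a one-line predicate. StaticallyPredictedTaken is a made-up name, and this models only the static fallback, not what any particular microarchitecture actually does.

```cpp
#include <cassert>
#include <cstdint>

// Static fallback rule: a conditional branch whose target is at or below its
// own address (a backward branch) is assumed taken; a forward branch is
// assumed not taken. Hypothetical helper for illustration only.
constexpr bool StaticallyPredictedTaken(std::uint32_t branch_address,
                                        std::uint32_t target_address) {
  return target_address <= branch_address;
}

// Addresses taken from the listing in this issue.
// jnz at 0xf53b1029 targets 0xf53b103c: forward, so assumed not taken.
static_assert(!StaticallyPredictedTaken(0xf53b1029u, 0xf53b103cu),
              "forward branch is statically predicted not taken");
// jnz at 0xf53b1004 targets the bailout at 0xf4b24052: backward by address,
// so this rule would assume it taken.
static_assert(StaticallyPredictedTaken(0xf53b1004u, 0xf4b24052u),
              "backward branch is statically predicted taken");
```

In the proposed layout the branches to EXIT go backwards to the main code, so this rule would assume them taken, which is the expected case.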
--
v8-dev mailing list
[email protected]
http://groups.google.com/group/v8-dev