http://codereview.chromium.org/2582001/diff/1/2 File src/ia32/assembler-ia32.cc (right):
http://codereview.chromium.org/2582001/diff/1/2#newcode2225 src/ia32/assembler-ia32.cc:2225: ASSERT(CpuFeatures::IsEnabled(SSE2)); Well spotted. http://codereview.chromium.org/2582001/diff/1/2#newcode2248 src/ia32/assembler-ia32.cc:2248: ASSERT(CpuFeatures::IsEnabled(SSE2)); True. Do we even have a test for that? http://codereview.chromium.org/2582001/diff/1/2#newcode2337 src/ia32/assembler-ia32.cc:2337: Fixed. http://codereview.chromium.org/2582001/diff/1/4 File src/ia32/codegen-ia32.cc (right): http://codereview.chromium.org/2582001/diff/1/4#newcode13505 src/ia32/codegen-ia32.cc:13505: MacroAssembler assembler(buffer, static_cast<int>(actual_size)); True. Just have to redefine __ as well. http://codereview.chromium.org/2582001/diff/1/4#newcode13541 src/ia32/codegen-ia32.cc:13541: __ mov(ecx, Operand(esp, stack_offset + kSizeOffset)); Done. http://codereview.chromium.org/2582001/diff/1/4#newcode13546 src/ia32/codegen-ia32.cc:13546: __ and_(edx, 0x0F); On 2010/06/04 07:18:10, Erik Corry wrote:
0x0F -> 0xF
Done. http://codereview.chromium.org/2582001/diff/1/4#newcode13551 src/ia32/codegen-ia32.cc:13551: __ sub(Operand(ecx), edx); They were copied just before. I only increase src/dst by 1..16, and I just copied 16 bytes above. That means that there is an overlap, but avoiding that would just introduce a number of (usually unpredictable) conditional branches. http://codereview.chromium.org/2582001/diff/1/4#newcode13567 src/ia32/codegen-ia32.cc:13567: __ movdqa(xmm0, Operand(esi, 0x00)); Correct. This is the fast case code where source is also aligned. The "nearly as fast case" code below uses unaligned loads, but is otherwise identical. Sadly the loads are spread all over the code, so there is little chance to reuse the remining code. It's only the last six instructions that can be saved (and I chose to duplicate those for simplicity). http://codereview.chromium.org/2582001/diff/1/4#newcode13579 src/ia32/codegen-ia32.cc:13579: // At most 31 bytes to copy. Not identically, the second copy uses movdqu for the load. http://codereview.chromium.org/2582001/diff/1/5 File src/ia32/disasm-ia32.cc (right): http://codereview.chromium.org/2582001/diff/1/5#newcode820 src/ia32/disasm-ia32.cc:820: case 0x18: return "prefetch"; movntdq[a] can't, but they aren't used yet either. Still, someone might start using them, so they should be disassemblable. http://codereview.chromium.org/2582001/diff/1/6 File src/utils.h (right): http://codereview.chromium.org/2582001/diff/1/6#newcode533 src/utils.h:533: // TODO(lrn): Extend to other platforms as needed, especially X64. You think it's more needed on ARM than on X64? :P http://codereview.chromium.org/2582001/diff/1/6#newcode569 src/utils.h:569: if (chars >= kMinComplexMemCopy) { It would make sense to base the decision on the number of bytes moved, and not the length of the string being copied. http://codereview.chromium.org/2582001/show -- v8-dev mailing list [email protected] http://groups.google.com/group/v8-dev
