There seems to be an issue with what happens if there is a GC and the string
moves while we are matching. This can happen if the regexp is interrupted
by
the stack check (eg if the user breaks using the debugger).
Most of the arch-specific comments I have only written in the ARM files, but
they apply to all architectures.
Otherwise LGTM
http://codereview.chromium.org/10386090/diff/3002/src/arm/code-stubs-arm.cc
File src/arm/code-stubs-arm.cc (right):
http://codereview.chromium.org/10386090/diff/3002/src/arm/code-stubs-arm.cc#newcode4850
src/arm/code-stubs-arm.cc:4850: // Argument 6: Clear the number of
capture registers for non-global capture.
I don't understand this comment. Is argument 6 a boolean argument that
indicates whether we should clear the number of capture registers when
the capture is non-global? Does it have any meaning if the regexp is
non-global?
http://codereview.chromium.org/10386090/diff/3002/src/arm/code-stubs-arm.cc#newcode4901
src/arm/code-stubs-arm.cc:4901: // We do not expect multiple results.
This comment does not clarify much. Why do we not expect multiple
results here.
http://codereview.chromium.org/10386090/diff/3002/src/arm/regexp-macro-assembler-arm.cc
File src/arm/regexp-macro-assembler-arm.cc (right):
http://codereview.chromium.org/10386090/diff/3002/src/arm/regexp-macro-assembler-arm.cc#newcode45
src/arm/regexp-macro-assembler-arm.cc:45: * This assembler uses the
following register assignment convention
This needs updating to include r4
http://codereview.chromium.org/10386090/diff/3002/src/arm/regexp-macro-assembler-arm.cc#newcode78
src/arm/regexp-macro-assembler-arm.cc:78: * - fp[-12] start index
(character index of start).
Inconsistent capitalization, not started by you.
http://codereview.chromium.org/10386090/diff/3002/src/arm/regexp-macro-assembler-arm.cc#newcode80
src/arm/regexp-macro-assembler-arm.cc:80: * - fp[-20] success counter
(only useful for global regexp to count matches)
Does this lint?
http://codereview.chromium.org/10386090/diff/3002/src/arm/regexp-macro-assembler-arm.cc#newcode745
src/arm/regexp-macro-assembler-arm.cc:745: __ mov(current_character(),
Operand('\n'), LeaveCC, eq);
Having conditional instructions here after non-conditional instructions
that do not affect the flags - that is too tricky. Please refactor to
use conventional if-then-else branching.
http://codereview.chromium.org/10386090/diff/3002/src/arm/regexp-macro-assembler-arm.cc#newcode755
src/arm/regexp-macro-assembler-arm.cc:755:
LoadCurrentCharacterUnchecked(-1, 1);
In the global case the generated code isn't making much sense. It's
doing:
Load CurrentChar
jmp loaded
Load CurrentChar
bind loaded
http://codereview.chromium.org/10386090/diff/3002/src/arm/regexp-macro-assembler-arm.cc#newcode894
src/arm/regexp-macro-assembler-arm.cc:894: // String might have moved:
Reload end of string from frame.
If this happens will r4 be updated with the correct position?
http://codereview.chromium.org/10386090/diff/3002/src/arm/regexp-macro-assembler-arm.cc#newcode1378
src/arm/regexp-macro-assembler-arm.cc:1378: offset = ip;
This is setting a trap for a future programmer. There are lots of
instructions that implicitly overwrite ip, so we should not use ip
unless we have to, and if we do it should be over 1-2 lines so you don't
overlook that it is taken.
http://codereview.chromium.org/10386090/diff/3002/src/jsregexp.cc
File src/jsregexp.cc (right):
http://codereview.chromium.org/10386090/diff/3002/src/jsregexp.cc#newcode5808
src/jsregexp.cc:5808: RegExpCompiler compiler(data->capture_count,
ignore_case, is_ascii);
Optimization idea (probably for the next patch): At this point you can
use data->tree->min_match() to see whether the regexp can do a zero
length match. You can pass that as a flag to the compiler and it can
tell the macro assembler to avoid testing for zero length matches in the
repeat code on global regexps.
http://codereview.chromium.org/10386090/diff/3002/src/regexp-macro-assembler-tracer.cc
File src/regexp-macro-assembler-tracer.cc (right):
http://codereview.chromium.org/10386090/diff/3002/src/regexp-macro-assembler-tracer.cc#newcode107
src/regexp-macro-assembler-tracer.cc:107: assembler_->global() ? "
[restart for global match]" : "");
It's a bit ugly that this will print "restart..." for global regexps in
the interpreter, but it won't actually restart. Instead you could make
this function return a bool that tells you whether it actually emitted
code to restart.
http://codereview.chromium.org/10386090/diff/3002/src/runtime.cc
File src/runtime.cc (right):
http://codereview.chromium.org/10386090/diff/3002/src/runtime.cc#newcode3863
src/runtime.cc:3863: return num_matches;
Perhaps it is clearer to write "return RegExpImpl::RE_EXCEPTION here
since there is no number of matches to return.
http://codereview.chromium.org/10386090/diff/3002/test/mjsunit/regexp-global.js
File test/mjsunit/regexp-global.js (right):
http://codereview.chromium.org/10386090/diff/3002/test/mjsunit/regexp-global.js#newcode50
test/mjsunit/regexp-global.js:50: assertEquals("2It 3was 1a 8pleasure
2to 4burn.", str);
Nice test :-)
http://codereview.chromium.org/10386090/
--
v8-dev mailing list
[email protected]
http://groups.google.com/group/v8-dev