Thanks for all the comments.  I've fixed (almost) all of them and
submitted the code.


http://codereview.chromium.org/8188/diff/1/2
File src/ast.cc (right):

http://codereview.chromium.org/8188/diff/1/2#newcode197
Line 197: class RegExpUnparser: public RegExpVisitor {
I can't be bothered to write the full format but I've added a comment
that says that it prints the trees as sexps.  This code should be moved
to the test file later but I think it will be useful for us at this
point to have it as part of the library as well.

http://codereview.chromium.org/8188/diff/1/2#newcode253
Line 253: stream()->Add("^");
Nope, the style guide allows both forms.  "In general, curly braces are
not required for single-line statements, but they are allowed if you
like them".  I don't like them.

http://codereview.chromium.org/8188/diff/1/2#newcode257
Line 257: if (first) first = false;
Good point.

http://codereview.chromium.org/8188/diff/1/3
File src/ast.h (right):

http://codereview.chromium.org/8188/diff/1/3#newcode1184
Line 1184: // Regular expressions
I copied the banner style from parser.cc which doesn't use periods at
the end so I prefer to leave it as it is.

http://codereview.chromium.org/8188/diff/1/3#newcode1236
Line 1236: enum Type { START, END, BOUNDARY, NON_BOUNDARY };
Good point.

http://codereview.chromium.org/8188/diff/1/3#newcode1249
Line 1249: CharacterRange(uc32 from, uc32 to, bool is_special)
I've renamed it to is_character_class_.

http://codereview.chromium.org/8188/diff/1/3#newcode1271
Line 1271: unsigned from_ : 21;
Yes, and then when we unfold the character classes so we don't need the
is_character_class_ bit we can fit a CharacterRange in a single word.

http://codereview.chromium.org/8188/diff/1/3#newcode1332
Line 1332: class RegExpCapture: public RegExpTree {
The ast nodes that I don't yet convert into nodes are incomplete so once
we start actually using them we'll add the info.

http://codereview.chromium.org/8188/diff/1/6
File src/jsregexp.cc (right):

http://codereview.chromium.org/8188/diff/1/6#newcode954
Line 954: case '.':
This is the general '.'.  However, I don't expect this code to survive
long so it's probably not worth the bother to fix it.

http://codereview.chromium.org/8188/diff/1/6#newcode986
Line 986: break;
No, I've fixed it.

http://codereview.chromium.org/8188/diff/1/6#newcode1015
Line 1015: Vector<const uc32> input);
No.  It was getting late...

http://codereview.chromium.org/8188/diff/1/7
File src/jsregexp.h (right):

http://codereview.chromium.org/8188/diff/1/7#newcode131
Line 131: class RegExpEngine {
Yes.

http://codereview.chromium.org/8188/diff/1/9
File src/parser.cc (right):

http://codereview.chromium.org/8188/diff/1/9#newcode3271
Line 3271: return c == '|' || c == ')' || c == '\0';
Hmm.  I've replaced it with a special kEndMarker which is just a
different name for kBadChar.

http://codereview.chromium.org/8188/diff/1/9#newcode3595
Line 3595: // other implementations...
I think we're free to do whatever we want with \8 and \9 and I suggest
we treat them as '8' and '9'.  Otherwise we're probably stuck with the
octal/decimal-back-reference confusion.

By the way, you forgot to mention leading zeros.  The spec explicitly
disallows them but lo and behold:

   js> /\000011/.test("\x09")
   true

http://codereview.chromium.org/8188/diff/1/9#newcode3662
Line 3662: uc32 c = ParseCharacterEscape(CHECK_OK);
Right, I expect what we'll do is check for backreferences first in the
atom parser and then if we can verify that it isn't one fall through to
here.

http://codereview.chromium.org/8188

--~--~---------~--~----~------------~-------~--~----~
v8-dev mailing list
[email protected]
http://groups.google.com/group/v8-dev
-~----------~----~----~----~------~----~------~--~---

Reply via email to