Thanks for all the comments. I've fixed (almost) all of them and submitted the code.
http://codereview.chromium.org/8188/diff/1/2 File src/ast.cc (right): http://codereview.chromium.org/8188/diff/1/2#newcode197 Line 197: class RegExpUnparser: public RegExpVisitor { I can't be bothered to write the full format but I've added a comment that says that it prints the trees as sexps. This code should be moved to the test file later but I think it will be useful for us at this point to have it as part of the library as well. http://codereview.chromium.org/8188/diff/1/2#newcode253 Line 253: stream()->Add("^"); Nope, the style guide allows both forms. "In general, curly braces are not required for single-line statements, but they are allowed if you like them". I don't like them. http://codereview.chromium.org/8188/diff/1/2#newcode257 Line 257: if (first) first = false; Good point. http://codereview.chromium.org/8188/diff/1/3 File src/ast.h (right): http://codereview.chromium.org/8188/diff/1/3#newcode1184 Line 1184: // Regular expressions I copied the banner style from parser.cc which doesn't use periods at the end so I prefer to leave it as it is. http://codereview.chromium.org/8188/diff/1/3#newcode1236 Line 1236: enum Type { START, END, BOUNDARY, NON_BOUNDARY }; Good point. http://codereview.chromium.org/8188/diff/1/3#newcode1249 Line 1249: CharacterRange(uc32 from, uc32 to, bool is_special) I've renamed it to is_character_class_. http://codereview.chromium.org/8188/diff/1/3#newcode1271 Line 1271: unsigned from_ : 21; Yes, and then when we unfold the character classes so we don't need the is_character_class_ bit we can fit a CharacterRange in a single word. http://codereview.chromium.org/8188/diff/1/3#newcode1332 Line 1332: class RegExpCapture: public RegExpTree { The ast nodes that I don't yet convert into nodes are incomplete so once we start actually using them we'll add the info. http://codereview.chromium.org/8188/diff/1/6 File src/jsregexp.cc (right): http://codereview.chromium.org/8188/diff/1/6#newcode954 Line 954: case '.': This is the general '.'. However, I don't expect this code to survive long so it's probably not worth the bother to fix it. http://codereview.chromium.org/8188/diff/1/6#newcode986 Line 986: break; No, I've fixed it. http://codereview.chromium.org/8188/diff/1/6#newcode1015 Line 1015: Vector<const uc32> input); No. It was getting late... http://codereview.chromium.org/8188/diff/1/7 File src/jsregexp.h (right): http://codereview.chromium.org/8188/diff/1/7#newcode131 Line 131: class RegExpEngine { Yes. http://codereview.chromium.org/8188/diff/1/9 File src/parser.cc (right): http://codereview.chromium.org/8188/diff/1/9#newcode3271 Line 3271: return c == '|' || c == ')' || c == '\0'; Hmm. I've replaced it with a special kEndMarker which is just a different name for kBadChar. http://codereview.chromium.org/8188/diff/1/9#newcode3595 Line 3595: // other implementations... I think we're free to do whatever we want with \8 and \9 and I suggest we treat them as '8' and '9'. Otherwise we're probably stuck with the octal/decimal-back-reference confusion. By the way, you forgot to mention leading zeros. The spec explicitly disallows them but lo and behold: js> /\000011/.test("\x09") true http://codereview.chromium.org/8188/diff/1/9#newcode3662 Line 3662: uc32 c = ParseCharacterEscape(CHECK_OK); Right, I expect what we'll do is check for backreferences first in the atom parser and then if we can verify that it isn't one fall through to here. http://codereview.chromium.org/8188 --~--~---------~--~----~------------~-------~--~----~ v8-dev mailing list [email protected] http://groups.google.com/group/v8-dev -~----------~----~----~----~------~----~------~--~---
