LGTM then. On Mon, Mar 16, 2009 at 12:14 PM, John Tamplin <[email protected]> wrote:
> On Mon, Mar 16, 2009 at 11:54 AM, Scott Blum <[email protected]> wrote: > >> 85: Why the change from [a-zA-Z] to [a-z]? FWIW, the tests should check >> the validatity of "C:\foo" and "c:\foo". >> > > The pattern was already being compared in a case-insensitive manner, so it > was pointless to also handle case here. In fact, the previous formulation > did rule.toLowerCase(), so it was getting changed to [a-za-z]. > > >> 88: unless I'm missing something, this should read: >> String prefix = "^(" >> >> Otherwise, you lose the property of checking only start of the URL. Think >> "http://evildomain.org/evil?http://localhost:88/". There should probably >> also be a test case that this does not pass. >> > > From the documentation of Matcher.matches: > >> * @return <tt>true</tt> if, and only if, the entire region sequence >> * matches this matcher's pattern > > > Ie, it already includes implicit ^$ around the pattern, which is why .* was > added in various places. > > I added your suggested tests to verify it and they do pass. I also added > an extra () around each individual regex to avoid surprises if someone adds > a regex in the future that needs it. > > >> Also, have you verified that it's actually faster to use a single regex >> than to check multiples? >> > > I have not actually compared this particular implementation, but in my > experience it is always faster to build one DFA based on all the patterns > than to build multple ones and try and match each separately. In addition, > the old formulation was compiling each pattern every time it was checked, so > I am completely confident this is much faster (not that the number of checks > is going to be significant anyway). > > > -- > John A. Tamplin > Software Engineer (GWT), Google > --~--~---------~--~----~------------~-------~--~----~ http://groups.google.com/group/Google-Web-Toolkit-Contributors -~----------~----~----~----~------~----~------~--~---
