I thought about it, i think this is the best fix I think RegExp.toString is
just for debugging, not really guaranteed to be reparseable.

On Mon, Jul 26, 2010 at 1:49 PM, Michael McCandless <
[email protected]> wrote:

> OK lemme try...
>
> Mike
>
> On Mon, Jul 26, 2010 at 1:47 PM, Robert Muir <[email protected]> wrote:
> > maybe you can try this on your beast computer for a while? I think its
> > better.
> > Index: lucene/src/test/org/apache/lucene/search/TestRegexpRandom2.java
> > ===================================================================
> > --- lucene/src/test/org/apache/lucene/search/TestRegexpRandom2.java
> > (revision 979377)
> > +++ lucene/src/test/org/apache/lucene/search/TestRegexpRandom2.java
> (working
> > copy)
> > @@ -86,9 +86,9 @@
> >    private class DumbRegexpQuery extends MultiTermQuery {
> >      private final Automaton automaton;
> >
> > -    DumbRegexpQuery(Term term) {
> > +    DumbRegexpQuery(Term term, int flags) {
> >        super(term.field());
> > -      RegExp re = new RegExp(term.text());
> > +      RegExp re = new RegExp(term.text(), flags);
> >        automaton = re.toAutomaton();
> >      }
> >
> > @@ -130,8 +130,8 @@
> >     * simple regexpquery implementation.
> >     */
> >    private void assertSame(String regexp) throws IOException {
> > -    RegexpQuery smart = new RegexpQuery(new Term("field", regexp));
> > -    DumbRegexpQuery dumb = new DumbRegexpQuery(new Term("field",
> regexp));
> > +    RegexpQuery smart = new RegexpQuery(new Term("field", regexp),
> > RegExp.NONE);
> > +    DumbRegexpQuery dumb = new DumbRegexpQuery(new Term("field",
> regexp),
> > RegExp.NONE);
> >
> >      // we can't compare the two if automaton rewrites to a simpler enum.
> >      // for example: "a\uda07\udcc7?.*?" gets rewritten to a simpler
> query:
> > Index:
> > lucene/src/test/org/apache/lucene/util/automaton/AutomatonTestUtil.java
> > ===================================================================
> > ---
> lucene/src/test/org/apache/lucene/util/automaton/AutomatonTestUtil.java
> > (revision 979377)
> > +++
> lucene/src/test/org/apache/lucene/util/automaton/AutomatonTestUtil.java
> > (working copy)
> > @@ -40,7 +40,9 @@
> >        if (!UnicodeUtil.validUTF16String(regexp))
> >          continue;
> >        try {
> > -        return new RegExp(regexp, RegExp.NONE);
> > +        // NOTE: we parse-tostring-parse again, because we are
> > +        // really abusing RegExp.toString() here (its just for
> debugging)
> > +        return new RegExp(new RegExp(regexp, RegExp.NONE).toString(),
> > RegExp.NONE);
> >        } catch (Exception e) {}
> >      }
> >    }
> >
> > On Mon, Jul 26, 2010 at 1:32 PM, Michael McCandless
> > <[email protected]> wrote:
> >>
> >> Hmm that doesn't fix it.
> >>
> >> The magical seed is: 50686536365145364L (for TRR2).
> >>
> >> I still see the IAE even if I remove that RegExp.NONE.
> >>
> >> Mike
> >>
> >> On Mon, Jul 26, 2010 at 1:28 PM, Robert Muir <[email protected]> wrote:
> >> > I don't think we should do that. I think i found the problem:
> >> > AutomatonTestUtil's randomRegexp does this:
> >> >       try {
> >> >         return new RegExp(regexp, RegExp.NONE);
> >> >       } catch (Exception e) {}
> >> > I think the RegExp.NONE is the problem, and we should remove it.
> >> > because the test then toString's it and compiles it again, but without
> >> > this
> >> > option.
> >> > On Mon, Jul 26, 2010 at 1:16 PM, Michael McCandless
> >> > <[email protected]> wrote:
> >> >>
> >> >> My random stress testing hit an IllegalArgExc because the random
> >> >> regexp was malformed.
> >> >>
> >> >> Does this patch look OK to fix?
> >> >>
> >> >> Index: src/test/org/apache/lucene/search/TestRegexpRandom2.java
> >> >> ===================================================================
> >> >> --- src/test/org/apache/lucene/search/TestRegexpRandom2.java
> >> >>  (revision
> >> >> 979227)
> >> >> +++ src/test/org/apache/lucene/search/TestRegexpRandom2.java
> >> >>  (working
> >> >> copy)
> >> >> @@ -130,6 +130,14 @@
> >> >>    * simple regexpquery implementation.
> >> >>    */
> >> >>   private void assertSame(String regexp) throws IOException {
> >> >> +    try {
> >> >> +      new RegExp(regexp);
> >> >> +    } catch (IllegalArgumentException iae) {
> >> >> +      // the random regexp could be malformed, eg "foo"bar",
> >> >> +      // so we ignore this
> >> >> +      return;
> >> >> +    }
> >> >> +
> >> >>     RegexpQuery smart = new RegexpQuery(new Term("field", regexp));
> >> >>     DumbRegexpQuery dumb = new DumbRegexpQuery(new Term("field",
> >> >> regexp));
> >> >>
> >> >> ---------------------------------------------------------------------
> >> >> To unsubscribe, e-mail: [email protected]
> >> >> For additional commands, e-mail: [email protected]
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> > Robert Muir
> >> > [email protected]
> >> >
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: [email protected]
> >> For additional commands, e-mail: [email protected]
> >>
> >
> >
> >
> > --
> > Robert Muir
> > [email protected]
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>


-- 
Robert Muir
[email protected]

Reply via email to