It would help if we had a target date; then I'd know how many more X's I
need to mark on the calendar :)

On Mon, Jun 15, 2009 at 6:56 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:

> :)
>
> But those days are numbered!
>
> Mike
>
> On Mon, Jun 15, 2009 at 11:55 AM, Uwe Schindler<u...@thetaphi.de> wrote:
> > By the way:
> > I compiled core and the corresponding tests with an old JDK 1.4 version
> > that I found locally on my machine. Works fine!
> >
> > Uwe
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: u...@thetaphi.de
> >
> >> -----Original Message-----
> >> From: Uwe Schindler (JIRA) [mailto:j...@apache.org]
> >> Sent: Monday, June 15, 2009 5:48 PM
> >> To: java-dev@lucene.apache.org
> >> Subject: [jira] Commented: (LUCENE-1606) Automaton Query/Filter
> >> (scalable regex)
> >>
> >>
> >>     [ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719606#action_12719606 ]
> >>
> >> Uwe Schindler commented on LUCENE-1606:
> >> ---------------------------------------
> >>
> >> Doesn't seem to work, I will check the sources:
> >>
> >> {code}
> >> compile-core:
> >>     [javac] Compiling 12 source files to C:\Projects\lucene\trunk\build\contrib\regex\classes\java
> >>     [javac] C:\Projects\lucene\trunk\contrib\regex\src\java\org\apache\lucene\search\regex\AutomatonFuzzyQuery.java:11: cannot access dk.brics.automaton.Automaton
> >>     [javac] bad class file: C:\Projects\lucene\trunk\contrib\regex\lib\automaton.jar(dk/brics/automaton/Automaton.class)
> >>     [javac] class file has wrong version 49.0, should be 48.0
> >>     [javac] Please remove or make sure it appears in the correct subdirectory of the classpath.
> >>     [javac] import dk.brics.automaton.Automaton;
> >>     [javac]                           ^
> >>     [javac] 1 error
> >> {code}
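
For anyone hitting the same javac message: the major version in a class file
header identifies the release it was compiled for (48 is JDK 1.4, 49 is Java 5),
so the log above says automaton.jar was built for Java 5 while the compiler in
use only understands 1.4 class files. Below is a minimal sketch for checking
this, assuming the class bytes have already been read out of the jar. The names
ClassFileVersion, majorVersion and releaseName are hypothetical and are not part
of Lucene or BRICS.

```java
class ClassFileVersion {
    /** Returns the major version stored in bytes 6-7 of a class file header. */
    static int majorVersion(byte[] classBytes) {
        if (classBytes.length < 8
                || (classBytes[0] & 0xFF) != 0xCA || (classBytes[1] & 0xFF) != 0xFE
                || (classBytes[2] & 0xFF) != 0xBA || (classBytes[3] & 0xFF) != 0xBE) {
            throw new IllegalArgumentException("not a class file (missing 0xCAFEBABE magic)");
        }
        // Bytes 4-5 hold the minor version, bytes 6-7 the major version (big-endian).
        return ((classBytes[6] & 0xFF) << 8) | (classBytes[7] & 0xFF);
    }

    /** Maps a few major versions to the corresponding Java release. */
    static String releaseName(int major) {
        switch (major) {
            case 48: return "JDK 1.4";
            case 49: return "Java 5";
            case 50: return "Java 6";
            default: return "other (major " + major + ")";
        }
    }

    public static void main(String[] args) {
        // Synthetic 8-byte header standing in for Automaton.class from the log above.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 49};
        int major = majorVersion(header);
        System.out.println(major + " -> " + releaseName(major));
    }
}
```

If the library's sources permit it, recompiling them with javac -source 1.4
-target 1.4 is one way to get 48.0 class files; building with a Java 5 JDK is
the other.
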
> >>
> >> > Automaton Query/Filter (scalable regex)
> >> > ---------------------------------------
> >> >
> >> >                 Key: LUCENE-1606
> >> >                 URL: https://issues.apache.org/jira/browse/LUCENE-1606
> >> >             Project: Lucene - Java
> >> >          Issue Type: New Feature
> >> >          Components: contrib/*
> >> >            Reporter: Robert Muir
> >> >            Assignee: Uwe Schindler
> >> >            Priority: Minor
> >> >             Fix For: 2.9
> >> >
> >> >         Attachments: automaton.patch, automatonMultiQuery.patch,
> >> >                      automatonmultiqueryfuzzy.patch, automatonMultiQuerySmart.patch,
> >> >                      automatonWithWildCard.patch, automatonWithWildCard2.patch,
> >> >                      LUCENE-1606.patch
> >> >
> >> >
> >> > Attached is a patch for an AutomatonQuery/Filter (the name can change if
> >> > it's not suitable).
> >> > Whereas the out-of-box contrib RegexQuery is nice, I have some very
> >> > large indexes (100M+ unique tokens) where queries are quite slow (2
> >> > minutes, etc.). Additionally, all of the existing RegexQuery
> >> > implementations in Lucene are really slow if there is no constant prefix.
> >> > This implementation does not depend upon a constant prefix, and runs the
> >> > same query in 640ms.
> >> > Some use cases I envision:
> >> >  1. lexicography/etc on large text corpora
> >> >  2. looking for things such as urls where the prefix is not constant
> >> >     (http:// or ftp://)
> >> > The Filter uses the BRICS package (http://www.brics.dk/automaton/) to
> >> > convert regular expressions into a DFA. Then, the filter "enumerates"
> >> > terms in a special way, by using the underlying state machine. Here is my
> >> > short description from the comments:
> >> >      The algorithm here is pretty basic. Enumerate terms but instead of
> >> >      a binary accept/reject do:
> >> >
> >> >      1. Look at the portion that is OK (did not enter a reject state in
> >> >         the DFA)
> >> >      2. Generate the next possible String and seek to that.
> >> > The Query simply wraps the filter with ConstantScoreQuery.
> >> > I did not include the automaton.jar inside the patch but it can be
> >> > downloaded from http://www.brics.dk/automaton/ and is BSD-licensed.
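
The two steps quoted above can be sketched without BRICS. This is a toy
illustration, not the code from the patch: the DFA is hand-built for the
pattern [bcr]at, a TreeSet stands in for the term dictionary, TreeSet.ceiling
plays the role of the seek, and every class and method name is hypothetical.
Term order here is plain Java char order, which is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class DfaTermSeekSketch {
    // transitions.get(s) maps an input char to the next state; a missing entry rejects.
    private final List<TreeMap<Character, Integer>> transitions =
            new ArrayList<TreeMap<Character, Integer>>();
    private final Set<Integer> accepting = new HashSet<Integer>();

    /** Adds a state and returns its id; the first state added (id 0) is the start state. */
    int addState(boolean accept) {
        transitions.add(new TreeMap<Character, Integer>());
        int id = transitions.size() - 1;
        if (accept) accepting.add(id);
        return id;
    }

    void addTransition(int from, char c, int to) {
        transitions.get(from).put(c, to);
    }

    /** Fills states[0..n] with the path for the longest consumable prefix; returns n. */
    private int run(String term, int[] states) {
        int i = 0;
        states[0] = 0;
        while (i < term.length()) {
            Integer next = transitions.get(states[i]).get(term.charAt(i));
            if (next == null) break; // entered a reject state
            states[++i] = next;
        }
        return i;
    }

    boolean accepts(String term) {
        int[] states = new int[term.length() + 1];
        int consumed = run(term, states);
        return consumed == term.length() && accepting.contains(states[consumed]);
    }

    /** Lower bound for the next term the DFA could accept after a rejected term, or null. */
    String nextCandidate(String term) {
        int[] states = new int[term.length() + 1];
        int ok = run(term, states); // step 1: length of the portion that is OK
        if (ok == term.length() && !transitions.get(states[ok]).isEmpty()) {
            // Whole term consumed but not accepted: extend with the smallest outgoing label.
            return term + transitions.get(states[ok]).firstKey();
        }
        // Step 2: back up to the last position where a larger outgoing label exists.
        for (int j = Math.min(ok, term.length() - 1); j >= 0; j--) {
            Character c = transitions.get(states[j]).higherKey(term.charAt(j));
            if (c != null) return term.substring(0, j) + c;
        }
        return null; // no greater string can match
    }

    List<String> matchingTerms(TreeSet<String> terms) {
        List<String> matches = new ArrayList<String>();
        String term = terms.isEmpty() ? null : terms.first();
        while (term != null) {
            if (accepts(term)) {
                matches.add(term);
                term = terms.higher(term); // plain "next term"
            } else {
                String target = nextCandidate(term);
                term = (target == null) ? null : terms.ceiling(target); // the "seek"
            }
        }
        return matches;
    }

    static List<String> demo() {
        DfaTermSeekSketch dfa = new DfaTermSeekSketch();
        int s0 = dfa.addState(false), s1 = dfa.addState(false);
        int s2 = dfa.addState(false), s3 = dfa.addState(true);
        for (char c : new char[] {'b', 'c', 'r'}) dfa.addTransition(s0, c, s1);
        dfa.addTransition(s1, 'a', s2);
        dfa.addTransition(s2, 't', s3);
        TreeSet<String> terms = new TreeSet<String>(
                Arrays.asList("ant", "bat", "bath", "cab", "cat", "dog", "rat", "rot"));
        return dfa.matchingTerms(terms);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

On the eight sample terms, demo() returns bat, cat and rat. Rejected terms are
not scanned one by one: after "bath" fails, for example, the next candidate is
"c", so the enumeration jumps straight to "cab", and from there to "cat".
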
> >>
> >> --
> >> This message is automatically generated by JIRA.
> >> -
> >> You can reply to this email to add a comment to the issue online.
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
> >> For additional commands, e-mail: java-dev-h...@lucene.apache.org
