It would help if we had a target date; then I'd know how many more X's I need to mark on the calendar :)

On Mon, Jun 15, 2009 at 6:56 PM, Michael McCandless <luc...@mikemccandless.com> wrote:

> :)
>
> But those days are numbered!
>
> Mike
>
> On Mon, Jun 15, 2009 at 11:55 AM, Uwe Schindler <u...@thetaphi.de> wrote:
> > By the way:
> > I compiled core and corresponding tests with an old JDK 1.4 version, I found
> > locally on my machine. Works fine!
> >
> > Uwe
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: u...@thetaphi.de
> >
> >> -----Original Message-----
> >> From: Uwe Schindler (JIRA) [mailto:j...@apache.org]
> >> Sent: Monday, June 15, 2009 5:48 PM
> >> To: java-dev@lucene.apache.org
> >> Subject: [jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)
> >>
> >> [ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719606#action_12719606 ]
> >>
> >> Uwe Schindler commented on LUCENE-1606:
> >> ---------------------------------------
> >>
> >> Doesn't seem to work, I will check the sources:
> >>
> >> {code}
> >> compile-core:
> >> [javac] Compiling 12 source files to C:\Projects\lucene\trunk\build\contrib\regex\classes\java
> >> [javac] C:\Projects\lucene\trunk\contrib\regex\src\java\org\apache\lucene\search\regex\AutomatonFuzzyQuery.java:11: cannot access dk.brics.automaton.Automaton
> >> [javac] bad class file: C:\Projects\lucene\trunk\contrib\regex\lib\automaton.jar(dk/brics/automaton/Automaton.class)
> >> [javac] class file has wrong version 49.0, should be 48.0
> >> [javac] Please remove or make sure it appears in the correct subdirectory of the classpath.
> >> [javac] import dk.brics.automaton.Automaton;
> >> [javac] ^
> >> [javac] 1 error
> >> {code}
> >>
> >> > Automaton Query/Filter (scalable regex)
> >> > ---------------------------------------
> >> >
> >> > Key: LUCENE-1606
> >> > URL: https://issues.apache.org/jira/browse/LUCENE-1606
> >> > Project: Lucene - Java
> >> > Issue Type: New Feature
> >> > Components: contrib/*
> >> > Reporter: Robert Muir
> >> > Assignee: Uwe Schindler
> >> > Priority: Minor
> >> > Fix For: 2.9
> >> >
> >> > Attachments: automaton.patch, automatonMultiQuery.patch,
> >> > automatonmultiqueryfuzzy.patch, automatonMultiQuerySmart.patch,
> >> > automatonWithWildCard.patch, automatonWithWildCard2.patch,
> >> > LUCENE-1606.patch
> >> >
> >> > Attached is a patch for an AutomatonQuery/Filter (name can change if its
> >> > not suitable).
> >> >
> >> > Whereas the out-of-box contrib RegexQuery is nice, I have some very large
> >> > indexes (100M+ unique tokens) where queries are quite slow, 2 minutes, etc.
> >> > Additionally all of the existing RegexQuery implementations in Lucene are
> >> > really slow if there is no constant prefix. This implementation does not
> >> > depend upon constant prefix, and runs the same query in 640ms.
> >> >
> >> > Some use cases I envision:
> >> > 1. lexicography/etc on large text corpora
> >> > 2. looking for things such as urls where the prefix is not constant
> >> >    (http:// or ftp://)
> >> >
> >> > The Filter uses the BRICS package (http://www.brics.dk/automaton/) to
> >> > convert regular expressions into a DFA. Then, the filter "enumerates" terms
> >> > in a special way, by using the underlying state machine. Here is my short
> >> > description from the comments:
> >> >
> >> > The algorithm here is pretty basic. Enumerate terms but instead of a
> >> > binary accept/reject do:
> >> >
> >> > 1. Look at the portion that is OK (did not enter a reject state in the DFA)
> >> > 2. Generate the next possible String and seek to that.
> >> >
> >> > the Query simply wraps the filter with ConstantScoreQuery.
> >> >
> >> > I did not include the automaton.jar inside the patch but it can be
> >> > downloaded from http://www.brics.dk/automaton/ and is BSD-licensed.
> >>
> >> --
> >> This message is automatically generated by JIRA.
> >> -
> >> You can reply to this email to add a comment to the issue online.
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
> >> For additional commands, e-mail: java-dev-h...@lucene.apache.org
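
For anyone following the thread, the two-step enumeration described in the quoted issue text (keep the portion of the term that stayed inside the DFA, then generate the next possible string and seek to it) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the patch: the class SeekingEnumSketch and its methods are hypothetical names, a small hand-built DFA for the pattern [bc]at stands in for a BRICS automaton, a java.util.TreeSet stands in for the sorted term dictionary, and every DFA state is assumed to be able to reach an accept state. It uses List.of and Set.of, so it needs Java 9 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch of seek-based term enumeration; not the LUCENE-1606 patch. */
public class SeekingEnumSketch {
    // transitions.get(s) maps an input char to the next state; state 0 is the start state.
    // A missing transition means "reject". Assumption: every state can still reach an accept state.
    private final List<TreeMap<Character, Integer>> transitions;
    private final Set<Integer> accept;

    public SeekingEnumSketch(List<TreeMap<Character, Integer>> transitions, Set<Integer> accept) {
        this.transitions = transitions;
        this.accept = accept;
    }

    /** Plain accept/reject test: run the whole term through the DFA. */
    public boolean accepts(String term) {
        int state = 0;
        for (char c : term.toCharArray()) {
            Integer next = transitions.get(state).get(c);
            if (next == null) return false;
            state = next;
        }
        return accept.contains(state);
    }

    /**
     * Returns a prefix that every accepted string greater than the given term must
     * start with (so it is a safe seek target), or null if no such string exists.
     */
    public String nextSeekTarget(String term) {
        // Step 1: find the portion that is OK. states[i] is the state after i chars.
        int[] states = new int[term.length() + 1];
        int ok = 0;
        while (ok < term.length()) {
            Integer next = transitions.get(states[ok]).get(term.charAt(ok));
            if (next == null) break;
            states[++ok] = next;
        }
        // Step 2: back up from the end of the OK portion until some state has a
        // transition that yields a string sorting after the term.
        for (int pos = ok; pos >= 0; pos--) {
            TreeMap<Character, Integer> out = transitions.get(states[pos]);
            Character label;
            if (pos == term.length()) {
                // The whole term was consumed: any extension sorts after it.
                label = out.isEmpty() ? null : out.firstKey();
            } else {
                // Need a label strictly greater than the char the term has here.
                label = out.higherKey(term.charAt(pos));
            }
            if (label != null) return term.substring(0, pos) + label;
        }
        return null;
    }

    /** Enumerates matching terms, seeking past ranges that cannot match. */
    public List<String> matches(TreeSet<String> terms) {
        List<String> hits = new ArrayList<>();
        String term = terms.isEmpty() ? null : terms.first();
        while (term != null) {
            if (accepts(term)) {
                hits.add(term);
                term = terms.higher(term);
            } else {
                String target = nextSeekTarget(term);
                term = (target == null) ? null : terms.ceiling(target); // the "seek"
            }
        }
        return hits;
    }

    /** Hand-built DFA for the pattern [bc]at (illustrative only). */
    public static SeekingEnumSketch example() {
        List<TreeMap<Character, Integer>> t = new ArrayList<>();
        for (int i = 0; i < 4; i++) t.add(new TreeMap<>());
        t.get(0).put('b', 1);
        t.get(0).put('c', 1);
        t.get(1).put('a', 2);
        t.get(2).put('t', 3);
        return new SeekingEnumSketch(t, Set.of(3));
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>(
                List.of("apple", "bat", "bath", "cab", "cat", "dog", "zebra"));
        System.out.println(example().matches(terms)); // prints [bat, cat]
    }
}
```

With this term list, the rejected term "cab" produces the seek target "cat", and "dog" produces no target at all, so the loop stops without ever looking at "zebra". A real implementation would also have to cope with states that cannot reach an accept state and with the full character range, which this sketch sidesteps.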