Re: The best way to know when an index has been changed

2005-08-03 Thread Luke
laced. Luke - Original Message - From: "Steve Gaunt" <[EMAIL PROTECTED]> To: Sent: Wednesday, August 03, 2005 9:34 AM Subject: The best way to know when an index has been changed > Hi > > We have a web app, which keeps a copy of the index searcher, then reloads >

Deleting All Documents With Certain Field Name

2005-09-05 Thread Luke
Would this not delete all records from the index that have a saleDate field? reader.delete(new Term("salesDate", "")); Thanks, Luke - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]

Levenshtein FST's?

2016-05-23 Thread Luke Nezda
? Thanks in advance, - Luke

Levenshtein FST's?

2016-05-24 Thread Luke Nezda
the match character offsets of each match in each document. > > Mike McCandless > > http://blog.mikemccandless.com > > On Mon, May 23, 2016 at 8:59 PM, Luke Nezda wrote: > > > Hello, all - > > > > I'd like to use Lucene's automaton/FST code to

Re: Levenshtein FST's?

2016-05-25 Thread Luke Nezda
Oof, sounds too tricky for me to justify pursuing right now. While union'ing 10k Levenshtein automata was tractable, seems determinizing the result is not (NP-hard - oops :)), let alone working out a suitably useful conversion to an FST. Thank you very much for input! Kind regards, - Luk

Re: Levenshtein FST's?

2016-05-26 Thread Luke Nezda
yeah, converting THAT to an FST is tricky... > > Mike McCandless > > http://blog.mikemccandless.com > > On Wed, May 25, 2016 at 2:46 PM, Luke Nezda wrote: > > > Oof, sounds too tricky for me to justify pursuing right now. While > > union'ing 10k Levenshtein

Re: Levenshtein FST's?

2016-05-26 Thread Luke Nezda
I should note, I know in I can call Operations.determinize(union, 10_000_000) but union of 5000+ Levenshtein automata seems to require too many states to be tractable, and that's on the low end of what I'd like to work with. On Thu, May 26, 2016 at 9:59 AM, Luke Nezda wrote: > I

Re: Levenshtein FST's?

2016-05-27 Thread Luke Nezda
any states does the not-yet-determinized union of 5000+ > Levenshtein automata contain? > > Mike McCandless > > http://blog.mikemccandless.com > > On Thu, May 26, 2016 at 12:08 PM, Luke Nezda wrote: > > > I should note, I know in I can > > call Oper

Re: Levenshtein FST's?

2016-05-28 Thread Luke Nezda
s, > > Xmx3g OK) > > > > On Thu, May 26, 2016 at 12:13 PM, Michael McCandless < > > luc...@mikemccandless.com> wrote: > > > >> But how many states does the not-yet-determinized union of 5000+ > >> Levenshtein automata contain? > >>

Re: ApacheCon next week

2005-12-11 Thread Luke Nezda
Hello Grant- Could you post the material you present (eg slides, handouts, etc) for those of us who cannot attend? Thanks in advance, -Luke On 12/9/05, Grant Ingersoll <[EMAIL PROTECTED]> wrote: > > Any one planning on going to ApacheCon next week? I will be giving a > talk on L

Re: ApacheCon next week

2005-12-11 Thread Luke Nezda
Where are my manners :-/ Anyway, I found the answer to my own request. http://www.cnlp.org/apachecon2005/ Looks like some cool work, I only wish I could hear the accompanying speech. Cheers, -Luke On 12/11/05, gekkokid <[EMAIL PROTECTED]> wrote: > > please :) > - Original Messa

SpanRegexQuery causes error

2006-09-07 Thread Luke Tan
SpanFirstQuery sfq = new SpanFirstQuery(srq, 1); return sfq; } } Query getFieldQuery(String field, String queryText) throws ParseException Thanks Luke inside

Re: SpanRegexQuery causes error

2006-09-07 Thread Luke Tan
" or some other pattern. Erik On Sep 7, 2006, at 7:41 AM, Luke Tan wrote: > Hi, > > I am using code in > http://mail-archives.apache.org/mod_mbox/lucene-java-user/ > 200605.mbox/% > [EMAIL PROTECTED] > > for wildcard search in phrase > > but it seems that

Re: SpanRegexQuery causes error

2006-09-08 Thread Luke Tan
e: On Sep 7, 2006, at 9:26 PM, Luke Tan wrote: > spanFirst(spanRegexQuery(monthly:day * of every * months), 10) What analyzer did you use for your text? Again, that is not a valid regular expression. But also, you're using a single long string of several words within your SpanRege

Re: SpanRegexQuery causes error

2006-09-08 Thread Luke Tan
I use an analyzer with LowerCaseTokenizer only (no stop words or any other special treatment). The phrase is tokenized. On 9/9/06, Luke Tan wrote: I tried .* too but it gave the same error. I think it's a bug. I solved it using SpanTermQuery where the search phrase is broken into day of every months

Using SpanRegexQuery to search year like 200?

2006-09-08 Thread Luke Tan
Hi, Can this be used to search for the years 2000, 2001, 2002, ... 2009? SpanFirstQuery snq = new SpanFirstQuery(new SpanRegexQuery(new Term("year", "200?")), 1); I need to use it to search for something like: Who is born in 200? Thanks

Re: Using SpanRegexQuery to search year like 200?

2006-09-10 Thread Luke Tan
er to search people:"born in 200?" Kind Regards, Luke On 9/9/06, Erick Erickson <[EMAIL PROTECTED]> wrote: I've got to ask Why not just use a RangeQuery? Seems to be just what you want without the complications. Best Erick On 9/8/06, Luke Tan <[EMAIL PROTECT

Re: Using SpanRegexQuery to search year like 200?

2006-09-10 Thread Luke Tan
Hi, Oops. You just reminded me about that. I conveniently thought of regex as being as simple as * and ? Yes, I understand Java regex. Thanks Luke On 9/9/06, Erik Hatcher <[EMAIL PROTECTED]> wrote: To use SpanRegexQuery, you need to understand regular expressions. The WildcardQuery syntax is _NOT_ th
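
The difference between wildcard and regex syntax discussed in this thread can be illustrated without Lucene at all. The following is a minimal sketch using only java.util.regex, on the assumption that the regex query follows ordinary Java regex semantics; the class and method names (YearPatternDemo, wildcardToRegex, matches) are hypothetical and are not part of Lucene. As a regular expression, "200?" means "20 followed by an optional 0", so it does not match a four-digit year, while "200." or "200[0-9]" does.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not a Lucene class: contrasts wildcard and regex readings
// of the same pattern string.
class YearPatternDemo {

    // Translate a simple wildcard pattern (* = any run, ? = any single char)
    // into an equivalent Java regular expression.
    static String wildcardToRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                // Quote everything else so regex metacharacters stay literal.
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return sb.toString();
    }

    // True if the whole term matches the regular expression.
    static boolean matches(String regex, String term) {
        return Pattern.matches(regex, term);
    }
}
```

Read as a regex, "200?" matches the terms "20" and "200" but not "2005"; passing the same string through wildcardToRegex first gives the wildcard reading, which matches "2005".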

Indexing sit (stuff it) files

2005-03-02 Thread Luke Shannon
e with this? Thanks, Luke

Re: Zip Files

2005-03-07 Thread Luke Shannon
same document). Luke //zip files else if (attached.getPath().endsWith(".zip")) { Document attachedDoc = new Document(); Trace.DEBUG("Got a zip file to index: " + attached.getPath()); try { ZipFile zip =

Score Question

2005-03-09 Thread Luke Shannon
ssible?) 2. The XSL did do something with this value before converting it to an int and somehow that code has been misplaced. 3. Scoring has changed in the last version of Lucene and I need to multiply the score by some factor to make it more int friendly. Can someone shed some light on this please?

Re: Score Question

2005-03-10 Thread Luke Shannon
A couple of times. Luke - Original Message - From: "Erik Hatcher" <[EMAIL PROTECTED]> To: Sent: Wednesday, March 09, 2005 8:03 PM Subject: Re: Score Question > Did you reindex after upgrading? > > Erik > > On Mar 9, 2005, at 5:55 PM, Luke Shannon wr

Re: Score Question

2005-03-10 Thread Luke Shannon
ll matched documents equally. In a multiple-clause boolean query, some documents may match one clause but not another, enabling the boost factor to discriminate between queries. Queries also default to a 1.0 boost factor. Luke

Boost/Scoring Question

2005-03-10 Thread Luke Shannon
" note:"sub brand" pc_file:"sub brand" question:"sub brand" sort:"sub brand" stylesheet:"sub brand" thumbnail:"sub brand" uncomp_ext:"sub brand" urgent:"sub brand" weblink:"sub brand" I get 22 results but they are all smaller than 0 by an exponent of 4. Is there anything I can do to resolve this? Luke

Re: Best way to purposely corrupt an index?

2005-04-19 Thread Luke Shannon
The only time I have seen corrupted indexes is when the Java process is killed during the indexing process. If you shut down Tomcat (or whatever you are running for Java) during the indexing process you will end up with a corrupted index. - Original Message - From: "Andy Roberts" <[EMAIL

RE: IndexSearcher hanging on to old index files in Windows

2005-04-29 Thread Luke Francl
also implemented a reference counting scheme for IndexSearchers and it works well. Regards, Luke Francl
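
The post does not show its reference counting scheme, so the following is only a generic sketch of the idea, assuming the goal is to close an old searcher (and release its index files) once the last in-flight search has finished with it. RefCounted and its methods are hypothetical names; the wrapped resource is any java.io.Closeable rather than a real IndexSearcher.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper: closes the underlying resource when the last reference is released.
class RefCounted<T extends Closeable> {
    private final T resource;
    // Starts at 1: the reference held by whoever created the wrapper.
    private final AtomicInteger refs = new AtomicInteger(1);

    RefCounted(T resource) {
        this.resource = resource;
    }

    // Take a reference for the duration of one search.
    // Returns null if the resource has already been closed.
    T acquire() {
        while (true) {
            int n = refs.get();
            if (n == 0) {
                return null; // already closed; caller must fetch the newer instance
            }
            if (refs.compareAndSet(n, n + 1)) {
                return resource;
            }
        }
    }

    // Give a reference back; the last release closes the resource.
    void release() {
        if (refs.decrementAndGet() == 0) {
            try {
                resource.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}
```

In this sketch each search calls acquire() before using the searcher and release() in a finally block. When the index changes, the owner swaps in a new wrapper and calls release() once on the old one, so the old searcher is closed only after the searches still using it have finished.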

Re: getting document metadata

2005-05-03 Thread Luke Shannon
contain a field called path. This will have the location of the document on the system. Is this what you are after? Luke - Original Message - From: "Pablo Gomes Ludermir" <[EMAIL PROTECTED]> To: "Lucene user list" Sent: Tuesday, May 03, 2005 2:23 PM Subject:

Re: indexing synonyms / reducing the index size

2005-05-05 Thread Luke Shannon
applicable plus the original word/string in the Query. This reduces index size (synonyms not in there), but it did result in some queries exceeding the default max clause count for the BooleanQuery. I ended up having to increase this. Luke - Original Message - From: "Pablo Gomes Ludermir"
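
The query-time expansion described here, and the way it can run into a clause limit, might look roughly like the sketch below. It is plain Java with no Lucene dependency: SynonymExpander, its synonym map and its maxClauses parameter are all hypothetical stand-ins, and the exception merely imitates the effect of exceeding a boolean query's clause limit.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical query-time synonym expansion: the index stores only original terms,
// and each query term is widened into itself plus its synonyms.
class SynonymExpander {
    private final Map<String, List<String>> synonyms;
    private final int maxClauses; // stand-in for a boolean query's clause limit

    SynonymExpander(Map<String, List<String>> synonyms, int maxClauses) {
        this.synonyms = synonyms;
        this.maxClauses = maxClauses;
    }

    // Returns the OR-clauses for the query: each original term followed by its synonyms,
    // without duplicates and in a stable order.
    List<String> expand(List<String> queryTerms) {
        Set<String> clauses = new LinkedHashSet<>();
        for (String term : queryTerms) {
            clauses.add(term);
            clauses.addAll(synonyms.getOrDefault(term, Collections.<String>emptyList()));
        }
        if (clauses.size() > maxClauses) {
            throw new IllegalStateException(
                "expanded query has " + clauses.size() + " clauses, limit is " + maxClauses);
        }
        return new ArrayList<>(clauses);
    }
}
```

The trade-off is the one the post mentions: the index stays small because synonyms are never written to it, but a term with many synonyms multiplies the number of clauses, so the limit either has to be raised or the expansion has to be capped.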

Re: Max Field Length

2005-05-06 Thread Luke Shannon
Hi; I think by default only 10,000 terms will be indexed for a field. You can change this using the maxFieldLength method of IndexWriter. Luke - Original Message - From: "Ernesto De Santis" <[EMAIL PROTECTED]> To: "Lucene Users List" Sent: Friday, May 06,

Re: Best Practices for Distributing Lucene Indexing and Searching

2005-05-13 Thread Luke Francl
. Are there any known problems with having a read-only index shared over SMB? Using a shared file system is preferable to me because it's easier, but if it's necessary I will write the code to copy the index to each node. Thanks, Luke Francl

Re: Indexes auto creation

2005-06-13 Thread Luke Francl
You may want to try using IndexReader's indexExists family of methods. They will tell you whether or not an index is there. http://lucene.apache.org/java/docs/api/org/apache/lucene/index/IndexReader.html#indexExists(org.apache.lucene.store.Directory)

Re: Case-sensitive search

2005-08-18 Thread Luke Francl
: > > Query query = QueryParser.parse(line, "contents", analyzer); > > As for analyzer, I have tried both StandardAnalyzer and StopAnalyzer. You need to use the same analyzer for parsing queries as you do

Re: QueryParser not thread-safe

2005-08-23 Thread Luke Francl
n doing the simplest possible thing, I would recommend creating a new parser for every thread using QueryParser.parse(String, String, Analyzer) until/unless you determine this is a performance bottleneck. Regards, Luke Francl

Re: indexing documents from 1857

2005-09-28 Thread Luke Francl
Index your dates as strings (mmdd). This works better anyway because range searches work over a wider range of dates than when you index the full precision. On Wed, 2005-09-28 at 09:54, Renaud Richardet wrote: > Hello, > > From our understanding, Lucene uses the Unix Epoch (Jan 1, 1970) and
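
The point about string dates can be shown with a small sketch. It assumes a zero-padded year-month-day key (for example 18570310), which may not be the exact format the post had in mind since the pattern is cut off in the archive; DateKeys and its methods are hypothetical names. With such keys, plain string comparison orders dates chronologically, so a range check works for 1857 just as well as for dates after 1970.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: turns dates into fixed-width strings whose lexicographic
// order is the same as their chronological order (for years 0000 to 9999).
class DateKeys {

    // BASIC_ISO_DATE renders a LocalDate as yyyyMMdd, e.g. 1857-03-10 becomes "18570310".
    static String toKey(LocalDate date) {
        return date.format(DateTimeFormatter.BASIC_ISO_DATE);
    }

    // Inclusive range check done purely on the string keys, the way a term range query would.
    static boolean inRange(String key, String lower, String upper) {
        return key.compareTo(lower) >= 0 && key.compareTo(upper) <= 0;
    }
}
```

Because the keys carry only day precision, there are also far fewer distinct terms than with full timestamps, which is presumably what the remark about not indexing the full precision refers to.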

OpenNLPLemmatizer + KeywordRepeatFilter bugs

2022-09-14 Thread Luke Kot-Zaniewski (BLOOMBERG/ 919 3RD A)
://github.com/apache/lucene/pull/11734 . I would greatly appreciate any feedback on this change as it fixes both issues mentioned above. Many thanks, Luke

Integrating NLP into Lucene Analysis Chain

2022-11-19 Thread Luke Kot-Zaniewski (BLOOMBERG/ 919 3RD A)
42960569/indexing-taking-long-time-when-using-opennlp-lemmatizer-with-solr Many thanks, Luke Kot-Zaniewski

Offset-Based Analysis

2023-02-21 Thread Luke Kot-Zaniewski (BLOOMBERG/ 919 3RD A)
appreciated. Thanks, Luke

Re: Offset-Based Analysis

2023-02-22 Thread Luke Kot-Zaniewski (BLOOMBERG/ 919 3RD A)
-defined solution to the problem I presented. I realize with offsets you would have to make assumptions when offset-boundaries fall in the middle of a token and other such odd cases. Thanks again, Luke From: java-user@lucene.apache.org At: 02/22/23 02:38:30 UTC-5:00 To: java-user