RE: Optimize completely in memory with a FSDirectory?

2006-04-07 Thread Max Pfingsthorn
leave the default settings and call optimize() periodically (like each n added documents). However, if you do one _huge_ indexing batch, it might be nice for you to tweak this parameter to use more memory while indexing. Bye! max > -Original Message----- > From: Max Pfingsthorn > Sent:

RE: Optimize completely in memory with a FSDirectory?

2006-04-06 Thread Max Pfingsthorn
006 20:23 > To: java-user@lucene.apache.org > Subject: Re: Optimize completely in memory with a FSDirectory? > > > On Wednesday 05 April 2006 13:02, Max Pfingsthorn wrote: > > > The setMaxBufferedDocs and related parameters help a lot already to > > fully exploit m

Optimize completely in memory with a FSDirectory?

2006-04-05 Thread Max Pfingsthorn
memory to hold the index many times over, so it really shouldn't be a problem there, and it would be so much faster (I have to think). Any hints? Best regards, Max Pfingsthorn Hippo Oosteinde 11 1017WT Amsterdam The Netherlands Tel +3

Indexing derived data

2005-11-02 Thread Max Pfingsthorn
zations and the question if this can be done in a generic way or if this has to be built in to the business objects (e.g. to notice that the derived data has to be updated). Thanks in advance and best regards, Max Pfingsthorn

RE: Implicit Stopping in StandardTokenizer??

2005-06-20 Thread Max Pfingsthorn
ed out even with no stopwords set. Any ideas? Thanks a lot! max -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Monday, June 20, 2005 16:57 To: java-user@lucene.apache.org Subject: Re: Implicit Stopping in StandardTokenizer?? On Jun 20, 2005, at 10:41 AM, Max Pfin

Implicit Stopping in StandardTokenizer??

2005-06-20 Thread Max Pfingsthorn
keyword,hello!,nicetomeetyou". This should tokenize into "hello this is a keyword hello nicetomeetyou", but actually it does "hello keyword hello nicetomeetyou". Does anyone know why it drops those extra terms? Best regards, Max Pfingsthorn Hippo Oosteind

document score modification on the fly?

2005-06-14 Thread Max Pfingsthorn
, Max Pfingsthorn PS: I tried to look into Nutch for this, but I didn't recognize much from Lucene there...

RE: deleting on a keyword field

2005-06-07 Thread Max Pfingsthorn
hanks for bearing with me though! max -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Tuesday, June 07, 2005 03:37 To: java-user@lucene.apache.org Subject: Re: deleting on a keyword field On Jun 6, 2005, at 7:07 AM, Max Pfingsthorn wrote: > Thanks for all the re

RE: deleting on a keyword field

2005-06-06 Thread Max Pfingsthorn
3, 2005 20:10 To: java-user@lucene.apache.org Subject: Re: deleting on a keyword field On Friday 03 June 2005 18:50, Max Pfingsthorn wrote: > reader.delete(new Term(URI_FIELD, uri)); > > This does not remove anything. Do I have to make the uri a normal field? How do you know nothing w

deleting on a keyword field

2005-06-03 Thread Max Pfingsthorn
make the uri a normal field? Thanks for your help in advance! Best regards, Max Pfingsthorn Hippo Oosteinde 11 1017WT Amsterdam The Netherlands Tel +31 (0)20 5224466 - [EMAIL PROTECTED]

RE: Indexing multiple languages

2005-06-03 Thread Max Pfingsthorn
Hi You could use the ParallelReader for this if you have all documents in all languages. Then, the metadata fields can be stored in one of the field data files, while each language gets its own field data file... max -Original Message- From: Paul Libbrecht [mailto:[EMAIL PROTECTED] Se

RE: calculate wi = tfi * IDFi for each document.

2005-06-03 Thread Max Pfingsthorn
. for(termFreqVec){ TermWeight wi = Similarity.wi(termFreqVec[], termFreqVec.length); ... } } Andrew -Original Message- From: Max Pfingsthorn <[EMAIL PROTECTED]> Sent: Jun 3, 2005 4:13 AM To: java-user@lucene.apache.org Subject: RE: calculate wi = tfi *

RE: calculate wi = tfi * IDFi for each document.

2005-06-03 Thread Max Pfingsthorn
find the connection between Similarity and a Document. I know I'm missing the elephant that must be in the middle of the room. Or maybe it's not there. Is what I'm trying to do do-able? Thanks, Andrew -Original Message----- From: Max Pfingsthorn <[EMAIL PROTECTED]> Sent: Jun

RE: calculate wi = tfi * IDFi for each document.

2005-06-02 Thread Max Pfingsthorn
Hi, DefaultSimilarity uses exactly this weighting scheme. Makes sense since it's a pretty standard relevance measure... Bye! max -Original Message- From: Andrew Boyd [mailto:[EMAIL PROTECTED] Sent: Thursday, June 02, 2005 11:39 To: java-user@lucene.apache.org Subject: calculate wi = tfi

RE: ACLs and Lucene

2005-05-30 Thread Max Pfingsthorn
specially in a multi-processor environment. Have there been any thoughts about this? Best regards, Max Pfingsthorn Hippo Oosteinde 11 1017WT Amsterdam The Netherlands Tel +31 (0)20 5224466 - [E

RE: Confused about non-tokenized fields

2005-05-27 Thread Max Pfingsthorn
manually during indexing? Or is there some nicer way? Thanks! Max Pfingsthorn -Original Message- From: Gusenbauer Stefan [mailto:[EMAIL PROTECTED] Sent: Friday, May 27, 2005 18:00 To: java-user@lucene.apache.org Subject: Re: Confused about non-tokenized fields Max Pfingsthorn wrote: >

Confused about non-tokenized fields

2005-05-27 Thread Max Pfingsthorn
most frequent terms. Shouldn't I get only the complete filenames there?? Also, how do I search case-insensitive over this kind of field? Thanks! Best regards, Max Pfingsthorn Hippo Oosteinde 11 1017WT Amsterdam The Netherlan

mutiple index question

2005-05-20 Thread Max Pfingsthorn
properties. Content and properties could be indexed separately. Even different sets of properties could be combined in maybe different MultiSearcher instances to speed up querying... Any ideas on this? Best regards, Max Pfingsthorn Hippo Oosteinde 11 1017WT Amsterdam The Netherlands Tel +31 (0)20