leave the default settings and call optimize()
periodically (for example, after every n added documents).
However, if you do one _huge_ indexing batch, it might be nice for you to tweak
this parameter to use more memory while indexing.
Bye!
max
> -----Original Message-----
> From: Max Pfingsthorn
> Sent:
006 20:23
> To: java-user@lucene.apache.org
> Subject: Re: Optimize completely in memory with a FSDirectory?
>
>
> On Wednesday 05 April 2006 13:02, Max Pfingsthorn wrote:
>
> > The setMaxBufferedDocs and related parameters help a lot already to
> > fully exploit m
memory to hold the
index many times over, so it really shouldn't be a problem there, and it would
be so much faster (I have to think).
Any hints?
Best regards,
Max Pfingsthorn
Hippo
Oosteinde 11
1017WT Amsterdam
The Netherlands
Tel +3
zations and the question if this can be
done in a generic way or if this has to be built in to the business objects
(e.g. to notice that the derived data has to be updated).
Thanks in advance and best regards,
Max Pfingsthorn
ed
out even with no stopwords set. Any ideas?
Thanks a lot!
max
-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Monday, June 20, 2005 16:57
To: java-user@lucene.apache.org
Subject: Re: Implicit Stopping in StandardTokenizer??
On Jun 20, 2005, at 10:41 AM, Max Pfin
keyword,hello!,nicetomeetyou". This should tokenize into
"hello this is a keyword hello nicetomeetyou", but actually it does "hello
keyword hello nicetomeetyou". Does anyone know why it drops those extra terms?
Best regards,
Max Pfingsthorn
Hippo
Oosteind
,
Max Pfingsthorn
PS: I tried to look into Nutch for this, but I didn't recognize much from
Lucene there...
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Thanks for bearing with me though!
max
-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Tuesday, June 07, 2005 03:37
To: java-user@lucene.apache.org
Subject: Re: deleting on a keyword field
On Jun 6, 2005, at 7:07 AM, Max Pfingsthorn wrote:
> Thanks for all the re
3, 2005 20:10
To: java-user@lucene.apache.org
Subject: Re: deleting on a keyword field
On Friday 03 June 2005 18:50, Max Pfingsthorn wrote:
> reader.delete(new Term(URI_FIELD, uri));
>
> This does not remove anything. Do I have to make the uri a normal field?
How do you know nothing w
make the uri a normal field?
Thanks for your help in advance!
Best regards,
Max Pfingsthorn
Hippo
Oosteinde 11
1017WT Amsterdam
The Netherlands
Tel +31 (0)20 5224466
Hi
You could use the ParallelReader for this if you have all documents in all
languages. Then, the metadata fields can be stored in one of the field data
files, while each language gets its own field data file...
max
-----Original Message-----
From: Paul Libbrecht [mailto:[EMAIL PROTECTED]
Se
.
for(termFreqVec){
TermWeight wi = Similarity.wi(termFreqVec[], termFreqVec.length);
...
}
}
Andrew
-----Original Message-----
From: Max Pfingsthorn <[EMAIL PROTECTED]>
Sent: Jun 3, 2005 4:13 AM
To: java-user@lucene.apache.org
Subject: RE: calculate wi = tfi *
find the connection between Similarity and a Document.
I know I'm missing the elephant that must be in the middle of the room. Or
maybe it's not there.
Is what I'm trying to do do-able?
Thanks,
Andrew
-----Original Message-----
From: Max Pfingsthorn <[EMAIL PROTECTED]>
Sent: Jun
Hi,
DefaultSimilarity uses exactly this weighting scheme. Makes sense since it's a
pretty standard relevance measure...
Bye!
max
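
As a rough illustration of the weighting scheme mentioned above, here is a minimal, self-contained sketch. It assumes the truncated subject line refers to the usual tf * idf product, and it uses the formulas DefaultSimilarity is generally documented to use in Lucene releases of this era (tf as the square root of the term count, idf as ln(numDocs / (docFreq + 1)) + 1). The class name TfIdfSketch and its methods are hypothetical and are not part of Lucene; the real score also involves norms, boosts and a coordination factor that are left out here.

```java
// Sketch only: re-implements the assumed tf and idf formulas by hand.
// TfIdfSketch is a hypothetical name, not a Lucene class.
final class TfIdfSketch {

    /** Term-frequency factor: square root of the raw term count in one document. */
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    /** Inverse document frequency: ln(numDocs / (docFreq + 1)) + 1. */
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    /** Per-term weight wi = tf * idf, ignoring norms and boosts. */
    static double weight(int freq, int docFreq, int numDocs) {
        return tf(freq) * idf(docFreq, numDocs);
    }

    public static void main(String[] args) {
        // A term occurring 4 times in a document, present in 1 of 10 documents.
        System.out.println(weight(4, 1, 10));
    }
}
```

In a real index the term counts would come from a document's term frequency vector and the document frequency from the index reader, rather than from hand-supplied numbers.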
-----Original Message-----
From: Andrew Boyd [mailto:[EMAIL PROTECTED]
Sent: Thursday, June 02, 2005 11:39
To: java-user@lucene.apache.org
Subject: calculate wi = tfi
specially in a multi-processor environment. Have there been any
thoughts about this?
Best regards,
Max Pfingsthorn
Hippo
Oosteinde 11
1017WT Amsterdam
The Netherlands
Tel +31 (0)20 5224466
manually during indexing? Or
is there some nicer way?
Thanks!
Max Pfingsthorn
-----Original Message-----
From: Gusenbauer Stefan [mailto:[EMAIL PROTECTED]
Sent: Friday, May 27, 2005 18:00
To: java-user@lucene.apache.org
Subject: Re: Confused about non-tokenized fields
Max Pfingsthorn wrote:
>
most frequent terms.
Shouldn't I get only the complete filenames there??
Also, how do I search case-insensitive over this kind of field?
Thanks!
Best regards,
Max Pfingsthorn
Hippo
Oosteinde 11
1017WT Amsterdam
The Netherlan
properties. Content and
properties could be indexed separately. Even different sets of properties could
be combined in maybe different MultiSearcher instances to speed up querying...
Any ideas on this?
Best regards,
Max Pfingsthorn
Hippo
Oosteinde 11
1017WT Amsterdam
The Netherlands
Tel +31 (0)20