I have heard ~80 million docs per index cited as a practical limit (varying with average document size).
@Uwe Schindler: Is hashed distribution really necessary when using
MultiReader? I did hear that Solr uses a consistent-hashing algorithm to
distribute documents across index shards, but the MultiReader documentation
says nothing about hashing.
Hi,
This all depends on your index contents and hardware. In general, the size of
a single index / index segment vs. multiple segments / indexes is not an
issue on a single machine. To scale, you should also think about using more
than one machine with e.g. ElasticSearch or Apache Solr instead of plain
Lucene.
Hi All,
In my application, we have been maintaining a Lucene index covering 3 years'
worth of data (more than 70 GB in a single Lucene index). To improve
performance, it was recently decided to break the index into three indexes of
one year's worth of data each. Before we work on the required change, I
wanted to get some advice from the list.
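For reference, this is the rough write-side plan: one IndexWriter per year,
with documents routed by a date field (a sketch only; the paths are
placeholders and I am assuming the Lucene 2.x API):

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;

public class YearlyRouter {
    private final Map<Integer, IndexWriter> writers =
        new HashMap<Integer, IndexWriter>();

    // Lazily open one writer per year, e.g. /indexes/2005, /indexes/2006, ...
    private IndexWriter writerFor(int year) throws IOException {
        IndexWriter w = writers.get(year);
        if (w == null) {
            w = new IndexWriter("/indexes/" + year, new StandardAnalyzer(), true);
            writers.put(year, w);
        }
        return w;
    }

    public void add(Document doc, int year) throws IOException {
        writerFor(year).addDocument(doc);
    }

    public void close() throws IOException {
        for (IndexWriter w : writers.values()) {
            w.close();
        }
    }
}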
Subject: Re: Index creation
- Increase the mergeFactor (how much depends on the limit of open
  file descriptors on your machine).
- Increase maxBufferedDocs (how much depends on how much RAM you've got
  and how big your JVM heap is).
I covered this in a Lucene article on onjava.com in 2003, I think.
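For example, something along these lines (the numbers are only illustrative;
tune them against your file-descriptor limit and heap size, and note I am
assuming the Lucene 2.x IndexWriter API):

import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class TunedIndexing {
    public static void main(String[] args) throws IOException {
        IndexWriter writer =
            new IndexWriter("/path/to/index", new StandardAnalyzer(), true);
        writer.setUseCompoundFile(false); // faster, but uses more file descriptors
        writer.setMergeFactor(30);        // fewer merges; bounded by the open-file limit
        writer.setMaxBufferedDocs(10000); // fewer flushes; bounded by the JVM heap
        // ... addDocument() calls go here ...
        writer.optimize();                // optional final merge into one segment
        writer.close();
    }
}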
----- Original Message -----
From: WATHELET Thomas <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Monday, January 29, 2007 4:16:22 AM
Subject: Index creation
How could I optimize my index creation?
// setUseCompoundFile(?);
// setMaxBufferedDocs(?);
// setMergeFactor(?);
How could I reduce disk access, given that I work with more than 100
documents?
Thanks