Are you indexing all 7 million docs with the same writer? The call to
optimize() takes longer as your index size increases. So if you are
actually indexing your docs in smaller chunks and optimizing after each
one, throughput will drop as the index grows because of those calls.
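To put rough numbers on that, here is a toy model in plain Java (no Lucene calls). It assumes, as a simplification, that each optimize() rewrites every document currently in the index; the class and method names are made up for illustration, and the figures are document counts, not measured times.

```java
public class OptimizeCostModel {
    /**
     * Toy model (an assumption, not a Lucene measurement): every optimize()
     * rewrites all documents indexed so far. Returns the total number of
     * documents rewritten if optimize() is called after each chunk.
     */
    public static long docsRewritten(long totalDocs, long chunkSize) {
        long rewritten = 0;
        long indexed = 0;
        while (indexed < totalDocs) {
            indexed = Math.min(totalDocs, indexed + chunkSize);
            rewritten += indexed; // optimize() touches the whole index so far
        }
        return rewritten;
    }

    public static void main(String[] args) {
        long total = 7_000_000L;
        System.out.println("one optimize at the end: " + docsRewritten(total, total));
        System.out.println("optimize every 1M docs:  " + docsRewritten(total, 1_000_000L));
        System.out.println("optimize every 100k docs: " + docsRewritten(total, 100_000L));
    }
}
```

Under this model, rewrite work grows roughly with the square of the number of chunks, so a single optimize() at the very end is the cheapest case.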
Mekin Maheshwari wrote:
I am creating an index of about 7 million documents.
The total size of the index is about 2.7 GB once indexing is done.
For the first 3 million documents, the indexer takes about 3 hours (can I
get better than this?)
- 4 seconds per thousand documents
After this it slows down terribly and takes about 20 seconds for every
thousand documents.
It doesn't seem to be a data issue: when I tried starting the
index from the 3 millionth document, I got the initial speed (1k docs
in 3 secs).
What could be going wrong?
I have tried a few things, but can't quickly try out many, as
it takes 3 hours before the slowdown happens.
Below are the relevant pieces of the code:
Thanks for the help,
mekin
IndexWriter writer = new IndexWriter(dirname, analyzer, doTotalIndex);
writer.setMergeFactor(1000);
while (more records to get) {   // pseudocode
    Document doc = new Document();
    // ... add fields to doc
    // add boost to doc
    writer.addDocument(doc);
}
writer.optimize();
writer.close();
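Since each experiment takes hours, it may help to log the elapsed time per thousand documents so a single run shows exactly where the slowdown begins. A minimal sketch in plain Java follows; `BatchTimer` and `tick` are hypothetical names, not part of Lucene, and `tick` would be called right after `writer.addDocument(doc)`.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchTimer {
    private final int batchSize;
    private long count = 0;
    private long batchStartNanos;
    private final List<Long> batchMillis = new ArrayList<>();

    public BatchTimer(int batchSize, long startNanos) {
        this.batchSize = batchSize;
        this.batchStartNanos = startNanos;
    }

    /** Call once per added document; returns true when a batch just completed. */
    public boolean tick(long nowNanos) {
        count++;
        if (count % batchSize != 0) {
            return false;
        }
        // Record how long this batch took, then start timing the next one.
        batchMillis.add((nowNanos - batchStartNanos) / 1_000_000L);
        batchStartNanos = nowNanos;
        return true;
    }

    /** Elapsed milliseconds for each completed batch, in order. */
    public List<Long> batchMillis() {
        return batchMillis;
    }

    public static void main(String[] args) {
        BatchTimer timer = new BatchTimer(1000, System.nanoTime());
        for (int i = 0; i < 5000; i++) {
            // In the real indexer, writer.addDocument(doc) would go here.
            if (timer.tick(System.nanoTime())) {
                int done = timer.batchMillis().size();
                System.out.println("batch " + done + ": "
                        + timer.batchMillis().get(done - 1) + " ms");
            }
        }
    }
}
```

Passing the clock value in as an argument keeps the timer easy to check without real delays.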
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]