Hey Lucene-users,
I'm setting up a Lucene index on 5 GB of PDF files (full-text search). I've
been really happy with Lucene so far, but I'm curious what tips and strategies
I can use to optimize indexing performance at this size.
So far I am using pretty much all of the defaults (I'm new to Lucene).
I am using PDFBox to add the documents to the index.
I can usually add about 800 or so PDF files before the add loop

    for (int i = 0; i < fileNames.length; i++) {
        Document doc = IndexFile.index(baseDirectory + documentRoot + fileNames[i]);
        writer.addDocument(doc);
    }

really starts to slow down. It doesn't seem to be memory related.
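In case it helps narrow things down, here is a minimal sketch of how the loop could be timed per batch, so the PDF parsing step and the addDocument step are measured separately. The names AddLoopTimer, BatchTiming and timeAddLoop are hypothetical, and the parse and add callbacks are stand-ins for IndexFile.index(...) and writer.addDocument(...); nothing here is part of Lucene or PDFBox.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical helper: not part of Lucene or PDFBox.
public class AddLoopTimer {

    /** Time spent in each phase for one batch of files. */
    public static final class BatchTiming {
        public final int firstIndex;   // index of the first file in the batch
        public final int count;        // number of files in the batch
        public final long parseNanos;  // total time spent building documents
        public final long addNanos;    // total time spent adding them to the index

        BatchTiming(int firstIndex, int count, long parseNanos, long addNanos) {
            this.firstIndex = firstIndex;
            this.count = count;
            this.parseNanos = parseNanos;
            this.addNanos = addNanos;
        }
    }

    /**
     * Runs the same parse-then-add loop as above, but accumulates the time
     * spent in each step and reports it once per batchSize files.
     * parse stands in for IndexFile.index(...), add for writer.addDocument(...).
     */
    public static <D> List<BatchTiming> timeAddLoop(String[] fileNames,
                                                    int batchSize,
                                                    Function<String, D> parse,
                                                    Consumer<D> add) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<BatchTiming> timings = new ArrayList<>();
        long parseNanos = 0;
        long addNanos = 0;
        int batchStart = 0;
        for (int i = 0; i < fileNames.length; i++) {
            long t0 = System.nanoTime();
            D doc = parse.apply(fileNames[i]);
            long t1 = System.nanoTime();
            add.accept(doc);
            long t2 = System.nanoTime();
            parseNanos += t1 - t0;
            addNanos += t2 - t1;

            // Close the batch when it is full or when the files run out.
            boolean lastFile = (i == fileNames.length - 1);
            if ((i + 1) % batchSize == 0 || lastFile) {
                timings.add(new BatchTiming(batchStart, i - batchStart + 1,
                                            parseNanos, addNanos));
                batchStart = i + 1;
                parseNanos = 0;
                addNanos = 0;
            }
        }
        return timings;
    }
}
```

With a batch size of, say, 100, comparing parseNanos and addNanos for the batches before and after the 800-file mark should show whether the slowdown is in the PDF extraction or in the index writes. Since the real calls throw checked exceptions, the callbacks passed in would need to wrap them in unchecked ones.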
Thoughts anyone?
Thanks in advance,
CK Hill