I think that using a pool of cloned inputStreams would be the best solution. I've implemented such a solution locally using two pools of 3 readers each (configurable via system properties) and will post the diff after I do some testing to confirm accuracy and speed improvements.
Could you also benchmark this against a version that clones new streams for each call? That sounds extravagant, but it removes a configuration parameter, always a good thing.
After some testing tonight on my personal workstation, I've come up with the following numbers.
Dell 2 GHz with single HD, Win XP SP1, 768 MB RAM
Sun JDK 1.4.2
Resin 2.1.0 (-DdisableLuceneLocks=true -J-server)
Lucene index with 474128 documents, not completely optimized (most content is in 1 segment)
First run after startup discarded
Tested with 5 simultaneous threads
Unmodified CVS source
Run   Count   Time (ms)   Queries/Second
1     1001    542050      1.85
2     1001    508458      1.97
3     1001    524396      1.93
CVS source using suggested clone solution for FieldsReader and removing synchronized from SegmentReader.document(i)
Run   Count   Time (ms)   Queries/Second
1     1008    674123      1.495
2     1017    675363      1.51
3     1005    655551      1.53
CVS source using pool of 3 input streams for the previous fieldsStream and indexStream variables in FieldsReader and removing synchronized from SegmentReader.document(i)
Run   Count   Time (ms)   Queries/Second
1     1009    392536      2.57
2     999     364783      2.74
3     995     386501      2.57
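For anyone who wants the general idea of the pooled variant without opening the zip, here is a minimal sketch of a fixed-size pool. It is not the actual FieldsReader modification: StreamPool, withStream and idleCount are hypothetical names, the pooled type is left generic rather than tied to Lucene's input stream class, and the cloner argument stands in for whatever clones the underlying stream.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical fixed-size pool of pre-cloned streams (illustration only). */
final class StreamPool<T> {
    private final BlockingQueue<T> idle;

    /** Fills the pool up front with `size` instances produced by `cloner`. */
    StreamPool(int size, Supplier<T> cloner) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(cloner.get());
        }
    }

    /** Borrows a stream, runs `work` on it, and always returns it to the pool. */
    <R> R withStream(Function<T, R> work) {
        T stream;
        try {
            stream = idle.take(); // blocks while every stream is in use
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a stream", e);
        }
        try {
            return work.apply(stream);
        } finally {
            idle.add(stream);
        }
    }

    /** Number of streams currently not borrowed. */
    int idleCount() {
        return idle.size();
    }
}
```

With a pool of 3, up to three threads can read at once and any further callers wait for a stream to be returned, instead of every caller serializing on one synchronized method.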
The times shown above cover only the time taken to run the following code (numResults is the smaller of 1500 and hits.length()):
    for (int i = 0; i < numResults; i++) {
        ids[i] = Long.parseLong((hits.doc(i)).get("messageID"));
    }
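As a rough illustration of how the Queries/Second column follows from Count and Time (ms), and of how a multi-threaded timing run might be driven, here is a small sketch. It is not the test app from the zip: ThroughputTimer and both method names are hypothetical, and timeQueries measures wall-clock time for the whole batch rather than only the loop above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical timing helper (illustration only, not the posted test app). */
final class ThroughputTimer {

    /** Queries per second for `count` queries completed in `elapsedMillis`. */
    static double queriesPerSecond(long count, long elapsedMillis) {
        return count * 1000.0 / elapsedMillis;
    }

    /** Runs `query` `totalQueries` times on `threads` workers; returns elapsed ms. */
    static long timeQueries(int threads, int totalQueries, Runnable query) {
        ExecutorService workers = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        for (int i = 0; i < totalQueries; i++) {
            workers.execute(query);
        }
        workers.shutdown(); // accept no new work; queued queries still run
        try {
            workers.awaitTermination(1, TimeUnit.HOURS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while timing", e);
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }
}
```

For example, queriesPerSecond(1001, 542050) comes out at roughly 1.85, matching run 1 of the unmodified CVS source.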
I've uploaded my testing app with 3 prebuilt Lucene libs (unmodified, clone and pool) and my source modifications to FieldsReader to http://www.jivesoftware.com/~bruce/lucene/lucene-test.zip (624K) if anyone else wants to run the tests on their hardware. You'll have to edit the lucenetest.LuceneTestThread class, as it has the location of the search directory hardcoded, but it should be pretty easy to understand what is going on.
Regards,
Bruce Ritchie