Scalability of Lucene indexes
Hi all,

I have a question about scaling Lucene across a cluster, and good ways of breaking up the work. We have a very large index, and searches sometimes take longer than they are allowed.

What we have been doing is indexing into 256 separate indexes (chosen by the md5sum) and then distributing the indexes to the search machines. So if a machine has 128 indexes, it has to do 128 searches. I gave ParallelMultiSearcher a try and it was significantly slower than simply iterating through the indexes one at a time.

Our new plan is to somehow have only one index per search machine and a larger main index stored on the master. What I'm interested to know is whether having one extremely large index on the master and then splitting it into several smaller indexes (if this is possible) would be better than having several smaller indexes and merging them into one index on each search machine.

I would also be interested to know how others have divided up search work across a cluster.

Thanks,
Chris
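For readers following along, the partitioning scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the poster's actual code: the class `ShardRouter`, the methods `shardFor` and `machineFor`, the use of the first four MD5 bytes, and the contiguous block assignment of shards to machines are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: routes a document key to one of N sub-indexes by MD5,
// then maps each sub-index to a search machine.
class ShardRouter {

    // Returns a shard number in [0, shardCount) derived from the key's MD5 digest.
    static int shardFor(String key, int shardCount) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(key.getBytes(StandardCharsets.UTF_8));
            // Assumption: fold the first four digest bytes into an int.
            int h = ((digest[0] & 0xFF) << 24)
                  | ((digest[1] & 0xFF) << 16)
                  | ((digest[2] & 0xFF) << 8)
                  |  (digest[3] & 0xFF);
            // floorMod keeps the result non-negative even when h is negative.
            return Math.floorMod(h, shardCount);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Assumption: shards are dealt out to machines in contiguous blocks,
    // so 256 shards on 2 machines gives 128 shards per machine.
    static int machineFor(int shard, int shardCount, int machineCount) {
        return shard * machineCount / shardCount;
    }

    public static void main(String[] args) {
        int shard = shardFor("doc-42", 256);
        System.out.println("shard=" + shard + " machine=" + machineFor(shard, 256, 2));
    }
}
```

Because the shard depends only on the key, the same document always lands in the same sub-index, which is what makes it possible to distribute the sub-indexes independently.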
Bug in current CVS source with DateField
I found that the current code in CVS prevents an org.apache.lucene.search.DateFilter from functioning properly. This fragment is taken from org.apache.lucene.document.DateField:

    // Pad with leading zeros
    if (s.length() < DATE_LEN) {
      StringBuffer sb = new StringBuffer(s);
      while (sb.length() < DATE_LEN)
        sb.insert(0, ' ');
      s = sb.toString();
    }

The code pads with ' ' (space) instead of zeros. Line 5 should be:

    sb.insert(0, '0');

Making this change and recompiling gave the expected results. Looking back, the lucene-1.2 source uses the following fragment:

    while (s.length() < DATE_LEN)
      s = 0 + s; // pad with leading zeros
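To make the effect of the fill character concrete, here is a small standalone sketch. It is not the Lucene class itself: `DatePadding`, `pad`, and the fixed width of 9 are assumptions chosen for illustration, and it only handles non-negative times. It shows that zero-padded strings sort in the same order as the numbers they encode, and that a space-padded string sorts before any zero-padded one, which is one way a mismatch could arise if space-padded and zero-padded terms are ever compared with each other.

```java
// Hypothetical stand-in for the padding logic quoted above.
class DatePadding {

    // Assumption: fixed encoded width; the real DATE_LEN is defined in DateField.
    static final int DATE_LEN = 9;

    // Encodes a non-negative time in base 36 and left-pads it with `fill`.
    static String pad(long time, char fill) {
        String s = Long.toString(time, Character.MAX_RADIX);
        if (s.length() > DATE_LEN) {
            throw new IllegalArgumentException("time too large");
        }
        StringBuilder sb = new StringBuilder(s);
        while (sb.length() < DATE_LEN) {
            sb.insert(0, fill);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String earlyZero = pad(5L, '0');
        String lateZero = pad(1000L, '0');
        String lateSpace = pad(1000L, ' ');

        // Zero padding: string order agrees with numeric order (5 < 1000).
        System.out.println(earlyZero.compareTo(lateZero) < 0);

        // Mixed padding: ' ' sorts before '0', so the later time compares
        // as smaller than the earlier, zero-padded one.
        System.out.println(lateSpace.compareTo(earlyZero) < 0);
    }
}
```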
Re: has this exception been seen before
I am getting this problem as well, but have not been able to pinpoint the cause.

A tip for those who are doing a complete re-index: you can save a lot of time by creating a new index and then merging the old files into the new index. One disadvantage is that you may have to re-point your app to the new index. I find that the bug prevents the old index from being deleted on Win2K.