Well, if you're adding data, you must be doing something with it later. Are you sure the problem is in the index growing and not how you use the data afterwards?
The reason I ask is that I found almost a 10-fold increase in my app's performance when I used FieldSelector (Lucene 2.1) to load only the important fields in my documents. See the thread labeled "Lucene 2.1, using FieldSelector speeds up my app by a factor of 10+, numbers attached".

A quick test you could do is to time the actual search as opposed to the overall response time. That is, time just the call to Searcher.search, and separately collect the time you spend assembling whatever you do to respond, to see whether the search is killing you or your manipulation after the search.

The take-away is that we're all surprised by the difference you're seeing, and I'm betting that it's something other than merely adding 5% to your index size. What that is, is left as an exercise for the reader <G>.

Best
Erick

On 3/28/07, Oshima, Scott <[EMAIL PROTECTED]> wrote:
Yeah, it might be a hardware issue. With a slightly smaller index with less stored data, the performance is what we want it to be. Just adding 5% more stored data (unindexed, of course) pushes us over some sort of threshold, causing performance to tank.

-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Wednesday, March 28, 2007 12:46 PM
To: java-user@lucene.apache.org
Subject: Re: index file size threshold affecting search performance?

I've just built a 9.3G index (admittedly tons of stored data in there, 3.3M documents) and performance is amazing (through Solr).

Erik

On Mar 28, 2007, at 3:11 PM, Erick Erickson wrote:

> This surprises me, I'm currently working with a 4G index, and the
> improvement from when it was an 8G index was only 10% or so.
> And it's plenty speedy.
>
> Are you hitting hardware limitations and perhaps swapping like crazy?
> In which case, unless you split things across several machines, I
> doubt it would help to make two smaller indexes.
>
> In sum, I really suspect that you're NOT hitting a Lucene limitation,
> but it's something else about your system....
>
> Best
> Erick
>
> On 3/28/07, Scott Oshima <[EMAIL PROTECTED]> wrote:
>>
>> So I assumed a linear decay of performance as an index got bigger.
>>
>> For some reason when going from an index size of 1.89 to 1.95 gigs
>> dramatically increased cpu across all of our servers.
>>
>> I was thinking of splitting the 1.95 index into 2 separate indexes
>> and using a multisearcher on those parts?
>>
>> thanks.
>>
>> -scott
>>

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
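
For what it's worth, the timing test suggested at the top of this message (time the search call by itself, then time everything you do with the results) could be sketched roughly as below. This is plain Java with no Lucene dependency: PhaseTimer, Result, and the stand-in lambdas in main are hypothetical names made up for illustration. In a real application the first step would wrap only the Searcher.search call and the second step would wrap the document loading and response assembly.

```java
import java.util.function.Function;
import java.util.function.Supplier;

public class PhaseTimer {

    /** Holds the final response plus the time spent in each phase. */
    public static final class Result<R> {
        public final R value;
        public final long searchNanos;
        public final long postNanos;

        Result(R value, long searchNanos, long postNanos) {
            this.value = value;
            this.searchNanos = searchNanos;
            this.postNanos = postNanos;
        }
    }

    /** Runs the search step and the post-processing step, timing each one separately. */
    public static <H, R> Result<R> time(Supplier<H> search, Function<H, R> postProcess) {
        long t0 = System.nanoTime();
        H hits = search.get();                 // assumption: only the search call goes here
        long t1 = System.nanoTime();
        R response = postProcess.apply(hits);  // assumption: loading docs, building the response
        long t2 = System.nanoTime();
        return new Result<>(response, t1 - t0, t2 - t1);
    }

    public static void main(String[] args) {
        // Stand-ins for a real search and a real response-assembly step.
        Result<String> r = PhaseTimer.<int[], String>time(
                () -> new int[] {3, 1, 2},
                hits -> hits.length + " hits");
        System.out.printf("search: %d us, post-processing: %d us, response: %s%n",
                r.searchNanos / 1_000, r.postNanos / 1_000, r.value);
    }
}
```

If the second number dominates, the time is going into what happens after the search (for example, loading stored fields) rather than into the search itself. A single measurement is noisy, so it is probably worth averaging over many queries before drawing conclusions.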