Well, if you're adding data, you must be doing something with
it later. Are you sure the problem is in the index growing and not
how you use the data afterwards?

The reason I ask is that I found almost a 10-fold increase in
my app's performance when I used FieldSelector (Lucene 2.1)
to load only the important fields in my documents. See the
thread labeled

Lucene 2.1, using FieldSelector speeds up my app by a factor of 10+,
numbers attached

A quick test you could do is to time the actual search as
opposed to the overall response time. That is, time just
the call to Searcher.search, and separately collect the time
you spend, say, assembling whatever you send back in the
response. That shows whether the search is killing you or
your manipulation after the search.
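
A minimal sketch of that measurement, assuming a small helper of my
own invention (PhaseTimer is hypothetical, not a Lucene class). It
keeps one bucket for the search call and one for everything done
afterwards. The sleeps in main are stand-ins for real work, and the
lambdas need a Java version newer than what most of us ran in 2007:

```java
import java.util.concurrent.Callable;

// Hypothetical helper (not part of Lucene): accumulates wall-clock time
// for the search call and for post-search work in separate buckets.
class PhaseTimer {
    private long searchNanos;
    private long postNanos;

    // Run the search call and add its elapsed time to the search bucket.
    <T> T timeSearch(Callable<T> searchCall) throws Exception {
        long start = System.nanoTime();
        try {
            return searchCall.call();
        } finally {
            searchNanos += System.nanoTime() - start;
        }
    }

    // Run the post-search work (loading documents, building the response)
    // and add its elapsed time to the post-search bucket.
    <T> T timePostProcessing(Callable<T> work) throws Exception {
        long start = System.nanoTime();
        try {
            return work.call();
        } finally {
            postNanos += System.nanoTime() - start;
        }
    }

    long searchNanos() { return searchNanos; }

    long postNanos() { return postNanos; }

    String report() {
        return String.format("search=%d ms, post-search=%d ms",
                searchNanos / 1_000_000, postNanos / 1_000_000);
    }

    public static void main(String[] args) throws Exception {
        PhaseTimer timer = new PhaseTimer();
        // Stand-ins for real work. In the application the first call would
        // wrap something like searcher.search(query), and the second the
        // code that reads documents and assembles the response.
        Integer hitCount = timer.timeSearch(() -> { Thread.sleep(30); return 42; });
        String response = timer.timePostProcessing(() -> {
            Thread.sleep(5);
            return "hits: " + hitCount;
        });
        System.out.println(response);
        System.out.println(timer.report());
    }
}
```

Accumulate over a few hundred real queries rather than one, since a
single request says little. If the post-search bucket dominates, the
extra stored data is the likelier culprit than the search itself.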

The take-away is that we're all surprised by the difference you're
seeing, and I'm betting that it's something other than merely
adding 5% to your index size. What that something is, is left as
an exercise for the reader <G>.

Best
Erick


On 3/28/07, Oshima, Scott <[EMAIL PROTECTED]> wrote:

Yeah, it might be a hardware issue. With a slightly smaller index with
less stored data, the performance is what we want it to be.  Just adding
5% more stored data (unindexed, of course) pushes us over some sort of
threshold, causing performance to tank.



-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Wednesday, March 28, 2007 12:46 PM
To: java-user@lucene.apache.org
Subject: Re: index file size threshold affecting search performance?

I've just built a 9.3G index (admittedly tons of stored data in there,
3.3M documents) and performance is amazing (through Solr).

        Erik



On Mar 28, 2007, at 3:11 PM, Erick Erickson wrote:

> This surprises me, I'm currently working with a 4G index, and the
> improvement from when it was an 8G index was only 10% or so.
> And it's plenty speedy.
>
> Are you hitting hardware limitations and perhaps swapping like crazy?
> In which case, unless you split things across several machines, I
> doubt it would help to make two smaller indexes.
>
> In sum, I really suspect that you're NOT hitting a Lucene limitation,
> but it's something else about your system....
>
> Best
> Erick
>
> On 3/28/07, Scott Oshima <[EMAIL PROTECTED]> wrote:
>>
>> So I assumed a linear decay of performance as an index got bigger.
>>
>> For some reason, going from an index size of 1.89 to 1.95 gigs
>> dramatically increased CPU usage across all of our servers.
>>
>> I was thinking of splitting the 1.95 gig index into 2 separate
>> indexes and using a MultiSearcher on those parts. Would that help?
>>
>> thanks.
>>
>> -scott
>>


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
