Hi Tate (didn't know you were lurking on the list),

I've found that it's often not very clear what truly affects performance. 
Batch indexing a data set of 250,000 docs (10 fields each) on a machine 
with 2 GB of DDR400 RAM, I tested a few merge factors and found that 50 
seemed optimal, and even then performance wasn't much better than with an 
MF of 20. Nowadays there can be so many hidden optimisations in HDs and 
OSs that it's often worth testing each configuration you actually use.
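
A rough way to see why the gain flattens out is to count merges under a 
simplified model. The sketch below is only that: it assumes every document 
starts as its own one-document segment and that mergeFactor equal-sized 
segments are merged as soon as they exist. It ignores the final optimize, 
disk and OS caching, and everything else a real run depends on. The 
function names are made up for illustration; this is not Lucene code.

```python
def merges_per_level(num_docs, merge_factor):
    """Count merges at each level under a simplified model.

    Assumption: each document starts as a one-document segment, and
    whenever merge_factor segments of the same size exist they are
    merged into one. A level-k merge therefore produces a segment of
    merge_factor**k documents. The final optimize is not counted.
    """
    counts = []
    segment_size = merge_factor
    while segment_size <= num_docs:
        counts.append(num_docs // segment_size)
        segment_size *= merge_factor
    return counts


def docs_rewritten(num_docs, merge_factor):
    """Total number of documents copied by all merges in the same model."""
    total = 0
    segment_size = merge_factor
    for count in merges_per_level(num_docs, merge_factor):
        total += count * segment_size
        segment_size *= merge_factor
    return total


if __name__ == "__main__":
    # Compare a few merge factors for the 250,000-doc data set.
    for mf in (10, 20, 50, 300):
        print(mf, merges_per_level(250000, mf), docs_rewritten(250000, mf))
```

Under these assumptions, 250,000 docs pass through four merge levels at a 
merge factor of 20 and three at 50, so the model itself predicts only a 
modest difference. Actual timings still have to be measured on the real 
hardware.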

sv

On Tue, 21 Oct 2003, Tate Avery wrote:

> Doug,
> 
> Re: high merge factor.  I was building test indexes, and writing out 300 segments of 
> 300 docs each and merging them every 90,000 docs kept the 'merging' time down to a 
> minimum (for my slowish HD).
> 
> I was assuming that 11 of these large merges during the indexing of 1,000,000 docs 
> (plus a final optimize) would be faster than 10,000 little merges if the mergeFactor 
> was set to 10 (for the same corpus).
> 
> Maybe this is not the case.
> 
> Tate
> 
> 
> -----Original Message-----
> From: Doug Cutting [mailto:[EMAIL PROTECTED]
> Sent: October 21, 2003 12:37 PM
> To: Lucene Users List
> Subject: Re: Lucene on Windows
> 
> 
> Tate Avery wrote:
> > You might have trouble with "too many open files" if you set your mergeFactor too 
> > high.  For example, on my Win2k, I can go up to mergeFactor=300 (or so).  At 400 I 
> > get a too many open files error.  Note: the default mergeFactor of 10 should give 
> > no trouble.
> 
> Please note that it is never recommended that you set mergeFactor 
> anywhere near this high.  I don't know why folks do this.  It really 
> doesn't make indexing much faster, and it makes searching slower if you 
> don't optimize.  It's a bad idea.  The default setting of 10 works 
> pretty well.  I've also had good experience setting it as high as 50 on 
> big batch indexing runs, but do not recommend setting it much higher 
> than that.  Even then, this can cause problems if you need to use 
> several indexes at once, or you have lots of fields.
> 
> Doug
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]

