> Thanks for the answers, more questions below.
> 
> On 2/16/2011 3:37 PM, Markus Jelsma wrote:
> > 200.000 stored fields? I assume that number includes your number of
> > documents? Sounds crazy =)
> 
> Nope, I wasn't clear. I have fewer than a dozen stored fields, but the
> value of a stored field can sometimes be as large as 200kb.
> 
> > You can set mergeFactor to 2, not lower.
> 
> Am I right though that manually running an 'optimize' is the equivalent
> of a mergeFactor=1?  So there's no way to get Solr to keep the index in
> an 'always optimized' state, if I'm understanding correctly? Cool. Just
> want to understand what's going on.

That should be it. If I remember correctly, a second segment is always written;
new updates aren't merged immediately.
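
For reference, a manual optimize is just an `<optimize/>` message posted to
the update handler. Here is a minimal Python sketch of that request. The URL
is an assumption (a default single-core install on localhost), and the helper
names are made up for illustration, so adjust both to your setup.

```python
import urllib.request

# Assumed location of the update handler; change host, port and core as needed.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_optimize_request(update_url=SOLR_UPDATE_URL):
    """Build (but do not send) an HTTP POST carrying an <optimize/> message."""
    return urllib.request.Request(
        update_url,
        data=b"<optimize/>",
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def send_optimize(update_url=SOLR_UPDATE_URL):
    """Send the optimize request and return the HTTP status code."""
    with urllib.request.urlopen(build_optimize_request(update_url)) as resp:
        return resp.status
```

Calling `send_optimize()` on the master before replication would be the
scripted version of the periodic optimize you describe.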

> 
> > This depends on commit rate and if there are a lot of updates and deletes
> > instead of adds. Setting it very low will indeed cause a lot of merging
> > and slow commits. It will also be very slow in replication because
> > merged files are copied over again and again, causing high I/O on your
> > slaves.
> > 
> > There is always a `break even` but it depends (as usual) on your scenario
> > and business demands.
> 
> There are indeed sadly lots of updates and deletes, which is why I need
> to run optimize periodically. I am aware that this will cause more work
> for replication -- I think this is true whether I manually issue an
> optimize before replication _or_ whether I just keep the mergeFactor
> very low, right? Same issue either way.

Yes. But having several segments shouldn't make that much of a difference. If
search latency is just a few additional milliseconds, then I'd rather have a
few more segments being copied over more quickly.

> 
> So... if I'm going to do lots of updates and deletes, and my other
> option is running an optimize before replication anyway....   is there
> any reason it's going to be completely stupid to set the mergeFactor to
> 2 on the master?  I realize it'll mean all index files are going to have
> to be replicated, but that would be the case if I ran a manual optimize
> in the same situation before replication too, I think.

No, it's not stupid if you allow for slow indexing and slow copying of files 
but want a very quick search.
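
A toy model may make the trade-off concrete. This is not Lucene's actual
merge policy, only a rough sketch under simplifying assumptions: every flush
writes one segment of size 1, and as soon as mergeFactor segments of the same
level exist they are merged into a single segment one level up. The function
name and the "units written" measure are made up for illustration.

```python
def simulate(merge_factor, flushes):
    """Toy merge model. Returns (segment_count, units_written).

    units_written counts every unit of index data written to disk, both by
    flushes and by merges. Rewritten data is also what replication has to
    copy to the slaves again.
    """
    if merge_factor < 2:
        raise ValueError("merge_factor must be at least 2")
    levels = {}   # level -> number of segments currently at that level
    written = 0
    for _ in range(flushes):
        levels[0] = levels.get(0, 0) + 1   # a flush adds one level-0 segment
        written += 1
        level = 0
        # Cascade merges upward while a level is full.
        while levels.get(level, 0) >= merge_factor:
            levels[level] -= merge_factor
            levels[level + 1] = levels.get(level + 1, 0) + 1
            written += merge_factor ** (level + 1)  # size of the merged segment
            level += 1
    return sum(levels.values()), written


print(simulate(2, 16))    # (1, 80)
print(simulate(10, 16))   # (7, 26)
```

In this model, mergeFactor=2 ends up with a single segment after 16 flushes
but writes about three times as much data as mergeFactor=10, which is left
with 7 segments. That is the slow indexing and slow copying you pay for the
quick search.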

> 
> Jonathan
