Thanks for the answers, more questions below.

On 2/16/2011 3:37 PM, Markus Jelsma wrote:

200,000 stored fields? I assume that number includes your number of documents?
Sounds crazy =)

Nope, I wasn't clear. I have fewer than a dozen stored fields, but the value of a stored field can sometimes be as large as 200KB.


You can set mergeFactor to 2, not lower.

Am I right, though, that manually running an 'optimize' is the equivalent of mergeFactor=1? So there's no way to get Solr to keep the index in an 'always optimized' state, if I'm understanding correctly? Cool. Just want to understand what's going on.
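
For concreteness, this is the setting I'm asking about -- a minimal sketch, assuming the stock solrconfig.xml layout where the merge settings live under <mainIndex>:

    <!-- solrconfig.xml on the master: keep the segment count very low -->
    <mainIndex>
      <mergeFactor>2</mergeFactor>
    </mainIndex>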

This depends on the commit rate and on whether there are a lot of updates and deletes
rather than plain adds. Setting it very low will indeed cause a lot of merging and
slow commits. It will also make replication very slow, because merged files are
copied over again and again, causing high I/O on your slaves.

There is always a break-even point, but it depends (as usual) on your scenario and
business demands.


There are indeed, sadly, lots of updates and deletes, which is why I need to run optimize periodically. I am aware that this will cause more work for replication -- I think that's true whether I manually issue an optimize before replication _or_ just keep the mergeFactor very low, right? Same issue either way.

So... if I'm going to do lots of updates and deletes, and my other option is running an optimize before replication anyway... is there any reason it would be completely stupid to set the mergeFactor to 2 on the master? I realize it'll mean all the index files have to be replicated each time, but that would also be the case if I ran a manual optimize before replication, I think.
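
For reference, the manual alternative I have in mind is just posting an explicit optimize to the update handler before each replication cycle. A minimal sketch, assuming the stock example URL and port:

    # collapse the master's index to a single segment before replication
    curl http://localhost:8983/solr/update \
         -H 'Content-Type: text/xml' \
         --data-binary '<optimize/>'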

Jonathan
