On 8/27/2012 2:54 PM, Voth, Brad (GE Corporate) wrote:
Hello all,

I'm working on a solution for a project with a very high indexing rate and a 
lower query rate.  Our records are very small: 2 short strings, 6 longs, 7 
ints, and 2 dates, with 8 of those fields indexed.  We need to sustain an 
average ingest rate of 50k records/sec.  Through sharding and a reasonably 
sized cluster we've hit most of our performance goals, but we have found that 
our producers tend to hang on a shard that is doing a merge.  I've done some 
digging and found tips and hints on configuring merging, but have not yet 
gotten rid of the issue.

During a problematic period, the server hosting the shard shows a single CPU 
core at 100%, very little disk I/O, and merge messages in the logs.  This 
leads me to believe that a single merge thread is blocking indexing.  When 
this happens our producers, which distribute their updates among all the 
shards, pile up on this shard and wait.  This causes our overall ingest rate 
to plummet and is currently keeping us from going to production with the 
solution.

The relevant portion of our solrconfig.xml is:

<indexConfig>
....
     <ramBufferSizeMB>1024</ramBufferSizeMB>
     <mergeFactor>20</mergeFactor>
         <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
           <int name="maxMergeAtOnce">8</int>
           <int name="segmentsPerTier">20</int>
         </mergePolicy>
        <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
         <int name="maxMergeCount">10</int>
         <int name="maxThreadCount">10</int>
         </mergeScheduler>
      <autoCommit>
        <maxTime>1500000</maxTime>
        <openSearcher>false</openSearcher>
      </autoCommit>
        <autoSoftCommit>
          <maxTime>10000</maxTime>
        </autoSoftCommit>

....
</indexConfig>

The mergeFactor configuration is no longer applicable if you are using TieredMergePolicy, so that should be removed.

Your maxMergeAtOnce value looks low to me; it should probably be the same as segmentsPerTier. Here is what I currently have on my Solr 3.5 setup. It emulates what a mergeFactor of 35 would have done on Solr 3.4 or lower. maxMergeAtOnceExplicit is three times the other values so that three levels of merging can take place simultaneously:

<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <int name="maxMergeAtOnce">35</int>
  <int name="segmentsPerTier">35</int>
  <int name="maxMergeAtOnceExplicit">105</int>
</mergePolicy>

With maxMergeCount and maxThreadCount both set to 10, this is probably what you want:

<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <int name="maxMergeAtOnce">20</int>
  <int name="segmentsPerTier">20</int>
  <int name="maxMergeAtOnceExplicit">200</int>
</mergePolicy>
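
Put together with the scheduler settings from your message, the merge-related part of indexConfig might look something like the sketch below. It is untested; the ramBufferSizeMB and mergeScheduler values are copied unchanged from your config, mergeFactor is dropped, and everything that was elided in your message is left out here as well:

```xml
<indexConfig>
  <!-- Other settings from the original config are omitted in this sketch -->
  <ramBufferSizeMB>1024</ramBufferSizeMB>
  <!-- mergeFactor removed: it does not apply when TieredMergePolicy is used -->
  <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
    <int name="maxMergeAtOnce">20</int>
    <int name="segmentsPerTier">20</int>
    <int name="maxMergeAtOnceExplicit">200</int>
  </mergePolicy>
  <!-- Scheduler values copied as-is from the original message -->
  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
    <int name="maxMergeCount">10</int>
    <int name="maxThreadCount">10</int>
  </mergeScheduler>
</indexConfig>
```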

I admit that I am only taking a shot in the dark here; I have no idea whether it will help.

Thanks,
Shawn
