Scott,

In addition to what Lance said, make sure ramBufferSizeMB in your solrconfig.xml is set high. Try 512MB or 1024MB. The Solr/Lucene index segment merging visualization is one of my favourite reports in SPM for Solr. It's kind of "amazing" how much the index size fluctuates!
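
For reference, a minimal sketch of where that setting might go. This assumes your solrconfig.xml uses an <indexConfig> section; older 3.x configs usually keep the same element under <indexDefaults> (and <mainIndex>) instead, so check your own file. The value 512 is just the starting point suggested above, not a tuned number.

```xml
<!-- Sketch only: the enclosing element is an assumption; in older 3.x
     configs, look for <indexDefaults> / <mainIndex> instead. -->
<indexConfig>
  <!-- RAM (in MB) that buffered documents may use before being flushed
       to a new segment. Larger values mean fewer, larger flushes. -->
  <ramBufferSizeMB>512</ramBufferSizeMB>
</indexConfig>
```

The buffer comes out of the JVM heap, so the heap limit (-Xmx) needs enough headroom above this value.
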
Otis
----
Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm

>________________________________
> From: Scott Preddy <scott.m.pre...@gmail.com>
>To: solr-user@lucene.apache.org
>Sent: Wednesday, May 23, 2012 2:19 PM
>Subject: configuring solr3.6 for a large intensive index only run
>
>I am trying to do a very large insertion (about 68million documents) into a
>solr instance.
>
>Our schema is pretty simple. About 40 fields using these types:
>
>  <types>
>    <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
>    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
>      <analyzer type="index">
>        <tokenizer class="solr.StandardTokenizerFactory"/>
>        <filter class="solr.LowerCaseFilterFactory"/>
>      </analyzer>
>      <analyzer type="query">
>        <tokenizer class="solr.StandardTokenizerFactory"/>
>        <filter class="solr.LowerCaseFilterFactory"/>
>      </analyzer>
>    </fieldType>
>    <fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
>  </types>
>
>We are running solrj clients from a hadoop cluster, and are struggling with
>the merge process as time progresses.
>As the number of documents grows, merging will eventually hog everything.
>
>What we would really like to do is turn merging off and just do an index
>run with a sparse solrconfig and then
>start things back up with our runtime config which would kick off merging
>when it starts.
>
>Is there a way to do this?
>
>I came close to finding an answer in this post, but did not find out how to
>actually turn off merging.
>
>Post by Mike McCandless:
>http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html