Our index builds take around 6 hours, and I've noticed recently that
segments created towards the end of the build (in the last hour or so) use
the compound file format (.cfs). I assumed that this might be due to the
number of open files approaching a maximum, but both the hard and soft open
file limits for the Solr JVM process are set to 65536, so that doesn't seem
very likely. It's obviously not a problem, but I'm curious as to why this
might be happening.
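
One way to confirm which segments are in compound format, and when they were written, is to group the files in the index directory by segment name. The following is only a minimal sketch: the function name `segment_formats` is hypothetical, and it assumes the usual Lucene file naming in which every per-segment file starts with `_<segment>` (for example `_a3.cfs`, `_a3.si`, `_a3_Lucene50_0.doc`).

```python
import os
from collections import defaultdict


def segment_formats(index_dir):
    """Return [(segment, uses_cfs, newest_mtime)] sorted oldest first.

    Assumes Lucene-style names: per-segment files begin with '_<segment>',
    and a compound segment has a '_<segment>.cfs' file.
    """
    info = defaultdict(lambda: [False, 0.0])
    for name in os.listdir(index_dir):
        if not name.startswith("_"):
            continue  # skip segments_N, write.lock and other non-segment files
        stem, _, ext = name.partition(".")
        # '_a3_Lucene50_0' -> '_a3'; '_a3' -> '_a3'
        segment = "_" + stem[1:].split("_")[0]
        entry = info[segment]
        if ext == "cfs":
            entry[0] = True
        mtime = os.path.getmtime(os.path.join(index_dir, name))
        entry[1] = max(entry[1], mtime)
    rows = [(seg, cfs, mtime) for seg, (cfs, mtime) in info.items()]
    return sorted(rows, key=lambda row: row[2])
```

Calling `segment_formats()` on the core's `data/index` directory (the exact path depends on the installation) and printing the rows shows whether the compound segments really do cluster at the end of the build.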


Environment:
OS = CentOS 7 Linux

Java:
java -version =>
openjdk version "1.8.0_45"
OpenJDK Runtime Environment (build 1.8.0_45-b13)
OpenJDK 64-Bit Server VM (build 25.45-b02, mixed mode)

Solr 5.4 started with the bin/solr script: ps shows

java -server -Xms5g -Xmx5g -XX:NewRatio=3 -XX:SurvivorRatio=4
-XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ConcGCThreads=4
-XX:ParallelGCThreads=4 -XX:+CMSScavengeBeforeRemark
-XX:PretenureSizeThreshold=64m -XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=50 -XX:CMSMaxAbortablePrecleanTime=6000
-XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -Djetty.port=8983
-DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=EST
-Djetty.home=/home/srosenthal/defsolr/server
-Dsolr.solr.home=/home/srosenthal/defsolr/server/solr
-Dsolr.install.dir=/home/srosenthal/defsolr -Xss256k -jar start.jar
-XX:OnOutOfMemoryError=/home/srosenthal/defsolr/bin/oom_solr.sh 8983
/home/srosenthal/defsolr/server/logs --module=http

solrconfig.xml: basically the default with some minor tweaks in the
indexConfig section
<luceneMatchVersion>5.0</luceneMatchVersion>
....
<indexConfig>
    <!-- <ramBufferSizeMB>100</ramBufferSizeMB> -->
    <!-- <maxBufferedDocs>1000</maxBufferedDocs> -->
    <ramBufferSizeMB>200</ramBufferSizeMB>
    <maxBufferedDocs>10000</maxBufferedDocs>

    <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
      <int name="maxMergeAtOnce">20</int>
      <int name="maxMergeAtOnceExplicit">60</int>
      <int name="segmentsPerTier">20</int>
    </mergePolicy>

    <!-- deprecated <mergeFactor>20</mergeFactor> -->
    <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
    ... everything else is default
</indexConfig>

Insights as to why this is happening would be welcome.

-Simon
