I think I am hitting this problem. Since maxIndexingThreads is no longer
used, I see 330+ indexing threads (in the attached log: "334 in-use
non-flushing threads states").

The bugfix recommends using custom code to control concurrency in
IndexWriter. How can I configure that in Solr 6.6?
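
To make "custom code to control concurrency" concrete, here is a minimal
sketch of throttling on the client side. It assumes the indexing client is
a Java program that submits update batches from many threads; the class
name BoundedIndexer and the method indexOneBatch are hypothetical
placeholders, and nothing here is a Solr 6.6 configuration option. A
semaphore caps how many threads may be sending documents at the same time,
which in turn should bound how many threads are active inside the
IndexWriter for those requests.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedIndexer {

    /**
     * Runs `tasks` simulated indexing calls from a 32-thread pool while a
     * semaphore lets at most `maxConcurrent` of them index at once.
     * Returns the highest concurrency that was actually observed.
     */
    public static int runBounded(int maxConcurrent, int tasks) {
        Semaphore permits = new Semaphore(maxConcurrent);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(32);

        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                // Block here until one of the permits is free.
                permits.acquireUninterruptibly();
                try {
                    int now = active.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    indexOneBatch();
                } finally {
                    active.decrementAndGet();
                    permits.release();
                }
            });
        }

        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    // Hypothetical stand-in for sending one batch of documents to Solr.
    private static void indexOneBatch() {
        try {
            Thread.sleep(5);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("peak concurrent indexing calls: " + runBounded(10, 200));
    }
}
```

The permit count (10 in main) plays the role that maxIndexingThreads used
to play in the config; the right value for a given machine is something to
measure rather than assume.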


On Sat, Aug 5, 2017 at 12:59 PM, Nawab Zada Asad Iqbal <khi...@gmail.com>
wrote:

> Hi,
>
> I am debugging a bulk indexing performance issue while upgrading to 6.6
> from 4.5.0. I have commits disabled while indexing a total of 85G of data
> over 7 hours. At the end of it, I want some 30 or so big segments, but I
> am getting 3000 segments.
> I deleted the index and enabled infostream logging; I have attached the
> log from when the first segment is flushed. Here are a few questions:
>
> 1. When a segment is flushed, is it permanent, or can more documents
> be written to it (besides the merge scenario)?
> 2. It seems that 330+ threads are writing in parallel. Will each one of
> them become one segment when written to disk? In that case, I should
> probably decrease concurrency?
> 3. One possibility is to delay flushing. The flush is getting triggered at
> 10000MB, probably coming from <ramBufferSizeMB>10000</ramBufferSizeMB>;
> however, the segment that is flushed is only 115MB. Is this limit for the
> combined size of all in-memory segments? In that case, is it OK to
> increase it further to use more of my heap (48GB)?
> 4. How can I decrease the concurrency? Maybe the solution is to use fewer
> in-memory segments?
>
> In the previous run, there were 110k files in the index folder after I
> stopped indexing. Before committing, I noticed that the file count
> continued to decrease every few minutes, until it dropped to 27k or so. (I
> committed after it stabilized.)
>
>
> My indexConfig is this:
>
>   <indexConfig>
>     <writeLockTimeout>1000</writeLockTimeout>
>     <commitLockTimeout>10000</commitLockTimeout>
>     <maxIndexingThreads>10</maxIndexingThreads>
>     <useCompoundFile>false</useCompoundFile>
>     <ramBufferSizeMB>10000</ramBufferSizeMB>
>     <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
>       <int name="maxMergeAtOnce">5</int>
>       <int name="segmentsPerTier">3000</int>
>       <int name="maxMergeAtOnceExplicit">10</int>
>       <int name="floorSegmentMB">16</int>
>       <!-- 200 GB since we want few big segments during full indexing -->
>       <double name="maxMergedSegmentMB">200000</double>
>       <double name="forceMergeDeletesPctAllowed">1</double>
>     </mergePolicyFactory>
>     <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
>       <int name="maxThreadCount">10</int>
>       <int name="maxMergeCount">10</int>
>     </mergeScheduler>
>     <lockType>${solr.lock.type:native}</lockType>
>     <reopenReaders>true</reopenReaders>
>     <deletionPolicy class="solr.SolrDeletionPolicy">
>       <str name="maxCommitsToKeep">1</str>
>       <str name="maxOptimizedCommitsToKeep">0</str>
>     </deletionPolicy>
>     <infoStream>true</infoStream>
>     <applyAllDeletesOnFlush>false</applyAllDeletesOnFlush>
>   </indexConfig>
>
>
> Thanks
> Nawab
>
>
>
