Hi Shalin, yes, I'm going to set up 5.5.3 to see how that behaves. Michael McCandless gave me the hint about LUCENE-6161.
We will see... :-)

On 02.08.2016 at 16:31, Shalin Shekhar Mangar wrote:
> Hi Bernd,
>
> I think you are running into
> https://issues.apache.org/jira/browse/LUCENE-6161. Can you upgrade to 5.1
> or newer?
>
> On Wed, Jul 27, 2016 at 7:29 PM, Bernd Fehling <
> bernd.fehl...@uni-bielefeld.de> wrote:
>
>> After enhancing the server with SSDs I'm trying to speed up indexing.
>>
>> The server has 16 CPUs and more than 100G RAM.
>> JAVA (1.8.0_92) has 24G.
>> SOLR is 4.10.4.
>> The plain XML data to load is 218G, with about 96M records.
>> This will result in a single index of 299G.
>>
>> I tried with 4, 8, 12 and 16 concurrent DIHs.
>> 16 and 12 were too much for 16 CPUs, so my tests continued with 8
>> concurrent DIHs.
>> Then I tried different <indexConfig> and <updateHandler> settings, but
>> now I'm stuck.
>> I can't figure out what the best settings for bulk indexing are.
>> What I see is that the indexing is "falling asleep" after some time.
>> It is only producing del-files, like _11_1.del, _w_2.del, _h_3.del, ...
>>
>> <indexConfig>
>>   <maxIndexingThreads>8</maxIndexingThreads>
>>   <ramBufferSizeMB>1024</ramBufferSizeMB>
>>   <maxBufferedDocs>-1</maxBufferedDocs>
>>   <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
>>     <int name="maxMergeAtOnce">8</int>
>>     <int name="segmentsPerTier">100</int>
>>     <int name="maxMergedSegmentMB">512</int>
>>   </mergePolicy>
>>   <mergeFactor>8</mergeFactor>
>>   <mergeScheduler
>>       class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
>>   <lockType>${solr.lock.type:native}</lockType>
>>   ...
>> </indexConfig>
>>
>> <updateHandler class="solr.DirectUpdateHandler2">
>>   <!-- no autocommit at all -->
>>   <autoSoftCommit>
>>     <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
>>   </autoSoftCommit>
>> </updateHandler>
>>
>> command=full-import&optimize=false&clean=false&commit=false&waitSearcher=false
>>
>> After indexing finishes, there is a final optimize.
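[Editor's note: one knob that is often relevant when ConcurrentMergeScheduler stalls indexing is its merge backlog. The element and parameter names below are real Lucene/Solr options; the values are illustrative assumptions, not the poster's actual configuration. A sketch:]

```
<!-- Hedged sketch, not the poster's config: cap the merge scheduler's
     backlog explicitly so indexing threads are not paused ("stalled")
     once too many merges pile up. -->
<mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
  <!-- maxMergeCount: pending merges allowed before incoming indexing
       threads are throttled; keep it larger than maxThreadCount. -->
  <int name="maxMergeCount">10</int>
  <!-- maxThreadCount: merges run at once; size it to the spare CPUs
       left over after the indexing threads (illustrative value). -->
  <int name="maxThreadCount">6</int>
</mergeScheduler>
```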
>>
>> My idea is: if 8 DIHs use 8 CPUs, then I have 8 CPUs left for merging
>> (maxIndexingThreads/maxMergeAtOnce/mergeFactor).
>> It should do no commit and no optimize.
>> ramBufferSizeMB is high because I have plenty of RAM and want to make
>> use of its speed.
>> segmentsPerTier is high to reduce merging.
>>
>> But there is a misconfiguration somewhere, because indexing gets stalled.
>>
>> Any idea what's going wrong?
>>
>> Bernd
>

-- 
*************************************************************
Bernd Fehling                    Bielefeld University Library
Dipl.-Inform. (FH)                LibTec - Library Technology
Universitätsstr. 25                  and Knowledge Management
33615 Bielefeld
Tel. +49 521 106-4060       bernd.fehling(at)uni-bielefeld.de

BASE - Bielefeld Academic Search Engine - www.base-search.net
*************************************************************