I had an extremely specific use case: an update rate of about 5,000 (small) documents per second, where some documents are repeatedly sent to SOLR with a different timestamp field (and the same unique document ID). Nothing breaks, and there is a great performance gain that was impossible with the default 32 MB buffer (it caused constant index merging, with 5 times more CPU going to merges than to index updates). With mergeFactor=10 I don't see ANY merges over 24 hours; segments are large (a few of 4-8 GB, and one large "union" segment); merging happens only explicitly, at night, when I issue a "commit".
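The trade-off described above can be sanity-checked with a back-of-the-envelope calculation. This is only a sketch: the 1 KB average in-buffer document size is an assumption (not stated in the thread), and the real in-memory footprint varies with the schema and analysis chain.

```python
# Rough estimate of how often IndexWriter fills its RAM buffer and
# flushes a new segment (which is what builds merge pressure), for a
# given buffer size and update rate.
# Assumption: ~1 KB per document in the buffer; treat results as ballpark.

DOC_SIZE_BYTES = 1024      # assumed average in-buffer document size
UPDATE_RATE = 5000         # documents per second, from the thread

def flushes_per_hour(ram_buffer_mb):
    """How many times per hour the RAM buffer fills and flushes a segment."""
    docs_per_flush = ram_buffer_mb * 1024 * 1024 / DOC_SIZE_BYTES
    seconds_per_flush = docs_per_flush / UPDATE_RATE
    return 3600 / seconds_per_flush

print(flushes_per_hour(32))    # default 32 MB buffer: ~549 flushes/hour
print(flushes_per_hour(8192))  # 8 GB buffer: ~2 flushes/hour
```

Under these assumptions the default buffer flushes hundreds of small segments per hour, while the 8 GB buffer flushes only a couple of large ones, which is consistent with the "constant merging vs. almost no merging" behaviour reported above.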
Of course, it depends on the use case; an application such as a Content Management System does not need a high ramBufferSizeMB (only a few updates a day are sent to SOLR)...

> -----Original Message-----
> From: Mark Miller [mailto:markrmil...@gmail.com]
> Sent: October-23-09 5:28 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Too many open files
>
> 8 GB is much larger than is well supported. It's diminishing returns over
> 40-100 and mostly a waste of RAM. Too high and things can break. It
> should be well below 2 GB at most, but I'd still recommend 40-100.
>
> Fuad Efendi wrote:
> > The reason for having a big RAM buffer is to lower the frequency of
> > IndexWriter flushes and (subsequently) the frequency of index merge
> > events; merging a few larger files takes less time, especially if the
> > RAM buffer is intelligent enough (and big enough) to deal with 100
> > concurrent updates of an existing document without flushing 100 document
> > versions to disk.
> >
> > I posted a related thread here; I had 1:5 timing for update:merge
> > (5 minutes of merging for every 1 minute of updates) with the default
> > SOLR settings (32 MB buffer). I increased the buffer to 8 GB on the
> > master, and it triggered a significant indexing performance boost...
> >
> > -Fuad
> > http://www.linkedin.com/in/liferay
> >
> >> -----Original Message-----
> >> From: Mark Miller [mailto:markrmil...@gmail.com]
> >> Sent: October-23-09 3:03 PM
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: Too many open files
> >>
> >> I wouldn't use a RAM buffer of a gig - 32-100 is generally a good number.
> >>
> >> Fuad Efendi wrote:
> >>> I was partially wrong; this is what Mike McCandless (Lucene in Action,
> >>> 2nd edition) explained at the Manning forum:
> >>>
> >>> mergeFactor of 1000 means you will have up to 1000 segments at each
> >>> level. A level 0 segment means it was flushed directly by IndexWriter.
> >>> After you have 1000 such segments, they are merged into a single
> >>> level 1 segment. Once you have 1000 level 1 segments, they are merged
> >>> into a single level 2 segment, etc.
> >>> So, depending on how many docs you add to your index, you could have
> >>> 1000s of segments with mergeFactor=1000.
> >>>
> >>> http://www.manning-sandbox.com/thread.jspa?threadID=33784&tstart=0
> >>>
> >>> So, in the case of mergeFactor=100 you may (theoretically) have 1000
> >>> segments, with 10-20 files each (depending on the schema)...
> >>>
> >>> mergeFactor=10 is the default setting... ramBufferSizeMB=1024 means
> >>> that you need at least double that in Java heap, but you have
> >>> -Xmx1024m...
> >>>
> >>> -Fuad
> >>>
> >>>> I am getting a "too many open files" error.
> >>>>
> >>>> Usually I test on a server that has 4 GB RAM with 1 GB assigned to
> >>>> Tomcat (set JAVA_OPTS=-Xms256m -Xmx1024m); ulimit -n is 256 for this
> >>>> server, and solrconfig.xml has the following settings:
> >>>>
> >>>> <useCompoundFile>true</useCompoundFile>
> >>>> <ramBufferSizeMB>1024</ramBufferSizeMB>
> >>>> <mergeFactor>100</mergeFactor>
> >>>> <maxMergeDocs>2147483647</maxMergeDocs>
> >>>> <maxFieldLength>10000</maxFieldLength>
> >>
> >> --
> >> - Mark
> >>
> >> http://www.lucidimagination.com
>
> --
> - Mark
>
> http://www.lucidimagination.com