Hi,

And there is a wonderful report in SPM for Solr that shows how your index changes over time in terms of size, index files, segments, indexed docs, and deleted docs. It is very useful for understanding what's going on at that level.

Otis
--
Performance Monitoring - http://sematext.com/spm

On Sep 20, 2012 7:49 AM, "Erick Erickson" <erickerick...@gmail.com> wrote:

> > Is it correct that a segment file is ready for merging after a commit
> > has been done (e.g. using the autoCommit property), so I will see
> > merges of 100 and up documents (and the index writer continues writing
> > into a new segment file)?
>
> Yes, merging won't happen until after a segment is closed. How big the
> segments are depends on the MergePolicy, of which there are several.
> Here's a great blog explaining that...
>
> http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
>
> Best
> Erick
>
> On Thu, Sep 20, 2012 at 5:17 AM, "Trym R. Møller" <t...@sigmat.dk> wrote:
> > Hi
> >
> > Thanks a lot for your answer, Erick!
> >
> > I changed the value of the autoSoftCommit property and it had the
> > expected effect. It can be noted that this is per Core, so I get four
> > getReader calls when my Solr contains four cores per autoSoftCommit
> > interval.
> >
> > Is it correct that a segment file is ready for merging after a commit
> > has been done (e.g. using the autoCommit property), so I will see
> > merges of 100 and up documents (and the index writer continues writing
> > into a new segment file)?
> >
> > It looks like the segments are being merged into 6 MB files and when
> > enough into 60MB files and these again into 3,5GB files.
> >
> > Best regards
> > Trym
> >
> > On 19-09-2012 14:49, Erick Erickson wrote:
> >
> >> I _think_ the getReader calls are being triggered by the
> >> autoSoftCommit being at one second. If so, this is probably OK. But
> >> bumping that up would nail whether that's the case...
> >>
> >> About RamBufferSizeMB. This has nothing to do with the size of the
> >> segments! It's just how much memory is consumed before the RAMBuffer
> >> is flushed to the _currently open_ segment. So until a hard commit
> >> happens, the currently open segment will continue to grow as
> >> successive RAMBuffers are flushed.
> >>
> >> bq: I expected that my Lucene index segment files would be a bit
> >> bigger than 1KB
> >>
> >> Is this a typo? The 512 is specifying MB......
> >>
> >> Best
> >> Erick
> >>
> >> On Wed, Sep 19, 2012 at 6:01 AM, "Trym R. Møller" <t...@sigmat.dk> wrote:
> >>>
> >>> Hi
> >>>
> >>> Using SolrCloud I have added the following to solrconfig.xml
> >>> (actually the node in zookeeper)
> >>> <ramBufferSizeMB>512</ramBufferSizeMB>
> >>>
> >>> After that I expected that my Lucene index segment files would be a
> >>> bit bigger than 1KB as I'm indexing very small documents
> >>> Enabling the infoStream I see a lot of "flush at getReader" (one
> >>> segment of the infoStream file pasted below)
> >>>
> >>> 1. Where can I look for why documents are flushed so frequently?
> >>> 2. Does it have anything to do with "getReader" and can I do
> >>>    anything so Solr doesn't need to get a new reader so often?
> >>>
> >>> Any comments are most welcome.
> >>>
> >>> Best regards
> >>> Trym
> >>>
> >>> Furthermore I have specified
> >>> <autoCommit>
> >>>   <maxTime>180000</maxTime>
> >>> </autoCommit>
> >>> <autoSoftCommit>
> >>>   <maxTime>1000</maxTime>
> >>> </autoSoftCommit>
> >>>
> >>>
> >>> IW 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: flush at getReader
> >>> DW 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: pool-12-thread-1 startFullFlush
> >>> DW 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: anyChanges? numDocsInRam=7 deletes=false hasTickets:false pendingChangesInFullFlush: false
> >>> DWFC 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: addFlushableState DocumentsWriterPerThread [pendingDeletes=gen=0, segment=_kc, aborting=false, numDocsInRAM=7, deleteQueue=DWDQ: [ generation: 1 ]]
> >>> DWPT 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: flush postings as segment _kc numDocs=7
> >>> DWPT 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: new segment has 0 deleted docs
> >>> DWPT 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: new segment has no vectors; norms; no docValues; prox; freqs
> >>> DWPT 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: flushedFiles=[_kc_Lucene40_0.frq, _kc.fnm, _kc_Lucene40_0.tim, _kc_nrm.cfs, _kc.fdx, _kc.fdt, _kc_Lucene40_0.prx, _kc_nrm.cfe, _kc_Lucene40_0.tip]
> >>> DWPT 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: flushed codec=Lucene40
> >>> DWPT 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: flushed: segment=_kc ramUsed=0,095 MB newFlushedSize(includes docstores)=0,003 MB docs/MB=2.283,058
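
For anyone who wants to check how often flushes are triggered and how many documents each one writes, a rough way is to tally the infoStream lines. Below is a minimal Python sketch, assuming only the two line formats that appear in the log pasted in this thread ("flush at getReader" and "flush postings as segment _kc numDocs=7"). The function name summarize_flushes and the regular expressions are illustrative assumptions, not part of Solr or Lucene, and other Lucene versions may log these events differently.

```python
import re
from collections import Counter

# Assumed format, taken from the pasted log: "...]: flush at getReader".
# The captured word is whatever follows "flush at".
FLUSH_TRIGGER = re.compile(r"\]: flush at (\w+)")

# Assumed format, taken from the pasted log:
# "...]: flush postings as segment _kc numDocs=7".
FLUSH_SEGMENT = re.compile(r"flush postings as segment (\w+) numDocs=(\d+)")


def summarize_flushes(lines):
    """Count flush triggers and record the doc count of each flushed segment."""
    triggers = Counter()
    docs_per_segment = {}
    for line in lines:
        trigger = FLUSH_TRIGGER.search(line)
        if trigger:
            triggers[trigger.group(1)] += 1
        segment = FLUSH_SEGMENT.search(line)
        if segment:
            docs_per_segment[segment.group(1)] = int(segment.group(2))
    return triggers, docs_per_segment


# Two lines copied from the infoStream excerpt in this thread.
sample = [
    "IW 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: flush at getReader",
    "DWPT 0 [Wed Sep 19 11:07:45 CEST 2012; pool-12-thread-1]: "
    "flush postings as segment _kc numDocs=7",
]

triggers, docs_per_segment = summarize_flushes(sample)
print(dict(triggers))
print(docs_per_segment)
```

Run over a full infoStream file (for example by passing an open file object as lines), a high count for getReader together with small per-segment doc counts would be consistent with Erick's guess that the one-second autoSoftCommit is what drives the frequent flushes.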