Re: Does Solr flush to disk even before ramBufferSizeMB is hit?

2011-08-30 Thread roz dev
Thanks, Shawn.

If Solr writes this info to disk as soon as possible (which is what I am
seeing), then the ramBufferSizeMB setting seems misleading.

Does anyone else have any thoughts on this?

-Saroj


On Mon, Aug 29, 2011 at 6:14 AM, Shawn Heisey s...@elyograg.org wrote:

 On 8/28/2011 11:18 PM, roz dev wrote:

 I notice that even though InfoStream does not mention that data is being
 flushed to disk, new segment files were created on the server.
 The size of these files kept growing even though there was enough heap
 available and the 856 MB RAM buffer was not even used.


 With the caveat that I am not an expert and someone may correct me, I'll
 offer this:  It's been my experience that Solr will write the files that
 constitute stored fields as soon as they are available, because that
 information is always the same and nothing will change in those files based
 on the next chunk of data.

 Thanks,
 Shawn




Re: Does Solr flush to disk even before ramBufferSizeMB is hit?

2011-08-30 Thread Shawn Heisey

On 8/30/2011 12:57 AM, roz dev wrote:

Thanks, Shawn.

If Solr writes this info to disk as soon as possible (which is what I am
seeing), then the ramBufferSizeMB setting seems misleading.

Does anyone else have any thoughts on this?


The stored fields are only two of the eleven Lucene files in each
segment. The buffer is not needed for them because there is no
transformation or data aggregation; they are written continuously as
data is read. The other files have to utilize the buffer, and they can
only be written once all the data for that segment has been read,
transformed, and aggregated.
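
To make that observable, here is a minimal sketch, assuming a modern
Lucene API (Solr 1.4.1 bundles Lucene 2.9, where these setters live on
IndexWriter itself); the path, field names, and counts are illustrative,
not from the original test. It lists the index directory while documents
are added, so you can watch stored-fields files (.fdt/.fdx) appear and
grow long before the RAM buffer fills:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;
import java.nio.file.Paths;

public class StoredFieldsOnDisk {
    public static void main(String[] args) throws Exception {
        try (FSDirectory dir = FSDirectory.open(Paths.get("demo-index"))) {
            IndexWriterConfig cfg = new IndexWriterConfig(new StandardAnalyzer());
            cfg.setRAMBufferSizeMB(856.0); // same buffer size as the test
            try (IndexWriter writer = new IndexWriter(dir, cfg)) {
                for (int i = 0; i < 100_000; i++) {
                    Document doc = new Document();
                    doc.add(new TextField("body", "doc " + i, Store.NO));
                    doc.add(new StoredField("payload", new byte[1024]));
                    writer.addDocument(doc);
                    if (i % 20_000 == 0) {
                        // Stored-fields data can reach disk here, long
                        // before the 856 MB postings buffer is full.
                        for (String f : dir.listAll()) {
                            System.out.println(i + ": " + f);
                        }
                    }
                }
                writer.commit();
            }
        }
    }
}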


Thanks,
Shawn



Re: Does Solr flush to disk even before ramBufferSizeMB is hit?

2011-08-29 Thread Shawn Heisey

On 8/28/2011 11:18 PM, roz dev wrote:

I notice that even though InfoStream does not mention that data is being
flushed to disk, new segment files were created on the server.
The size of these files kept growing even though there was enough heap
available and the 856 MB RAM buffer was not even used.


With the caveat that I am not an expert and someone may correct me, I'll 
offer this:  It's been my experience that Solr will write the files that 
constitute stored fields as soon as they are available, because that 
information is always the same and nothing will change in those files 
based on the next chunk of data.


Thanks,
Shawn



Does Solr flush to disk even before ramBufferSizeMB is hit?

2011-08-28 Thread roz dev
Hi All,
I am trying to tune ramBufferSizeMB and merge factor for my setup.

So, I enabled the Lucene IndexWriter's infoStream log and started
monitoring the data folder where index files are created.
I started my test with the following settings:

Heap: 3 GB
Solr: 1.4.1
Index size: 20 GB
ramBufferSizeMB=856
mergeFactor=25
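
For reference, a hedged sketch of how these settings map onto Lucene's
writer configuration (modern Lucene API shown; in Solr 1.4.1 the values
come from solrconfig.xml, and Lucene 2.9 exposes them as setters on
IndexWriter, so this class is illustrative only):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.LogByteSizeMergePolicy;

public class WriterSettings {
    public static IndexWriterConfig build() {
        IndexWriterConfig cfg = new IndexWriterConfig(new StandardAnalyzer());
        cfg.setRAMBufferSizeMB(856.0);   // ramBufferSizeMB=856
        LogByteSizeMergePolicy mp = new LogByteSizeMergePolicy();
        mp.setMergeFactor(25);           // mergeFactor=25
        cfg.setMergePolicy(mp);
        cfg.setInfoStream(System.out);   // the infoStream log quoted below
        return cfg;
    }
}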


I ran my test with 30 concurrent threads writing to Solr.
My jobs delete 6 (approx) records by issuing a deleteByQuery command and
then proceed to write data.

A commit is issued at the end of the writing process.
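
A hedged SolrJ sketch of that write pattern (modern SolrJ shown; Solr
1.4.1 would use CommonsHttpSolrServer, and the URL, query, and field
names are illustrative):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class IndexJob {
    public static void main(String[] args) throws Exception {
        try (SolrClient solr = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/ecom").build()) {
            // Delete this job's records, then rewrite them.
            solr.deleteByQuery("job_id:42");
            for (int i = 0; i < 60_000; i++) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", "job42-" + i);
                doc.addField("body", "replacement content " + i);
                solr.add(doc);
            }
            // A single commit at the end, as described above.
            solr.commit();
        }
    }
}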

The results are a bit surprising to me, and I need some help
understanding them.

I notice that even though InfoStream does not mention that data is being
flushed to disk, new segment files were created on the server.
The size of these files kept growing even though there was enough heap
available and the 856 MB RAM buffer was not even used.

Is it the case that Lucene is flushing to disk even before ramBufferSizeMB
is hit? If that is the case, then why is InfoStream not logging this info?

As per InfoStream, it is flushing at the end, but the files are created
much earlier than that.

Here is what InfoStream is saying. Please note that it indicates that a
new segment is being flushed at 12:58 AM, but the files were created at
12:53 AM and kept growing.

Aug 29, 2011 12:46:00 AM IW 0 [main]: setInfoStream:
dir=org.apache.lucene.store.NIOFSDirectory@/opt/gid/solr/ecom/data/index
autoCommit=false
mergePolicy=org.apache.lucene.index.LogByteSizeMergePolicy@4552a64d
mergeScheduler=org.apache.lucene.index.ConcurrentMergeScheduler@35242cc9
ramBufferSizeMB=856.0 maxBufferedDocs=-1 maxBuffereDeleteTerms=-1
maxFieldLength=1 index=_3l:C2151995

Aug 29, 2011 12:57:35 AM IW 0 [web-1]: now flush at close
Aug 29, 2011 12:57:35 AM IW 0 [web-1]: flush: now pause all indexing threads
Aug 29, 2011 12:57:35 AM IW 0 [web-1]:   flush: segment=_3m
docStoreSegment=_3m docStoreOffset=0 flushDocs=true flushDeletes=true
flushDocStores=true numDocs=60788 numBufDelTerms=60788
Aug 29, 2011 12:57:35 AM IW 0 [web-1]:   index before flush _3l:C2151995
Aug 29, 2011 12:57:35 AM IW 0 [web-1]: DW: flush postings as segment _3m
numDocs=60788
Aug 29, 2011 12:57:35 AM IW 0 [web-1]: DW: closeDocStore: 2 files to flush
to segment _3m numDocs=60788
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleIntBlocks count=9 total
now 9
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleByteBlocks
blockSize=32768 count=182 total now 182
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleCharBlocks count=49
total now 49
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleIntBlocks count=7 total
now 16
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleByteBlocks
blockSize=32768 count=145 total now 327
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleCharBlocks count=37
total now 86
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleIntBlocks count=9 total
now 25
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleByteBlocks
blockSize=32768 count=208 total now 535
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleCharBlocks count=52
total now 138
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleIntBlocks count=7 total
now 32
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleByteBlocks
blockSize=32768 count=136 total now 671
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleCharBlocks count=39
total now 177
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleIntBlocks count=3 total
now 35
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleByteBlocks
blockSize=32768 count=58 total now 729
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleCharBlocks count=16
total now 193
Aug 29, 2011 12:57:41 AM IW 0 [web-1]: DW:   oldRAMSize=50469888
newFlushedSize=161169038 docs/MB=395.491 new/old=319.337%
Aug 29, 2011 12:57:41 AM IFD [web-1]: now checkpoint segments_1x [2
segments ; isCommit = false]
Aug 29, 2011 12:57:41 AM IW 0 [web-1]: DW: apply 60788 buffered deleted
terms and 0 deleted docIDs and 1 deleted queries on 2 segments.
Aug 29, 2011 12:57:42 AM IFD [web-1]: now checkpoint segments_1x [2
segments ; isCommit = false]
Aug 29, 2011 12:57:42 AM IFD [web-1]: now checkpoint segments_1x [2
segments ; isCommit = false]
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: LMP: findMerges: 2 segments
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: LMP:   level 6.6799455 to 7.4299455:
1 segments
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: LMP:   level 5.1209826 to 5.8709826:
1 segments
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: CMS: now merge
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: CMS:   index: _3l:C2151995 _3m:C60788
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: CMS:   no more merges pending; now
return
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: CMS: now merge
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: CMS:   index: _3l:C2151995 _3m:C60788
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: CMS:   no more merges pending; now
return
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: now call final commit()
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: startCommit(): start sizeInBytes=0
Aug 29, 2011 12:57:42 AM