The RAM efficiency (= size of the segment once flushed, divided by the
size of the RAM buffer) can vary drastically.

Because the in-RAM data structures must be "growable" (to append new
docs to the postings as they are encountered), the efficiency is never
100%. I think 50% is actually a "good" RAM efficiency, and lower than
that (even down to 27%) is still normal.
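
As a rough sketch of that ratio (the function name is just for illustration, and the 80 MB and 320 MB figures are the approximate numbers from the quoted message below, not measured values):

```python
def ram_efficiency(flushed_segment_mb: float, ram_buffer_mb: float) -> float:
    """Return the flushed segment size as a fraction of the RAM buffer size."""
    if ram_buffer_mb <= 0:
        raise ValueError("ram_buffer_mb must be positive")
    return flushed_segment_mb / ram_buffer_mb


# Approximate figures from the quoted message: ~80 MB segments, 320 MB buffer.
efficiency = ram_efficiency(80, 320)
print(f"RAM efficiency: {efficiency:.0%}")  # RAM efficiency: 25%
```

That lands a little under the 27% mentioned above, which is within the noise given that the segment sizes were only eyeballed.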

Do you have many unique or low-doc-freq terms?  That brings the efficiency down.

If you turn on IndexWriter's infoStream and post the output, we can see
if anything odd is going on.

80 * 20 = ~1.6 GB, so I'm not sure why you're getting 1 GB segments.
Do you do any deletions in this run? A merged segment will often be
smaller than the sum of its parts, especially if there are many terms
and those terms are shared across segments. The infoStream output will
also show which merges are taking place.
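
To make the merge arithmetic concrete, here is a small sketch. It assumes the simplest case, where a first-level merge combines exactly mergeFactor flushed segments of equal size, and it takes "around 1GB" as 1000 MB; the helper name is hypothetical.

```python
def merged_upper_bound_mb(flushed_segment_mb: float, merge_factor: int) -> float:
    """Upper bound on a first-level merged segment: the sum of its inputs."""
    return flushed_segment_mb * merge_factor


upper_bound_mb = merged_upper_bound_mb(80, 20)   # 1600 MB, i.e. ~1.6 GB
observed_mb = 1000                               # "around 1GB" in the quoted message
shrinkage = 1 - observed_mb / upper_bound_mb     # fraction lost relative to the sum

print(f"upper bound: {upper_bound_mb:.0f} MB, shrinkage: {shrinkage:.1%}")
```

Whether deletions or shared terms account for a gap of that size is exactly what the infoStream output should help answer.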

Mike

On Wed, Dec 1, 2010 at 2:13 PM, Burton-West, Tom <tburt...@umich.edu> wrote:
> We are using a recent Solr 3.x (See below for exact version).
>
> We have set the ramBufferSizeMB to 320 in both the indexDefaults and the 
> mainIndex sections of our solrconfig.xml:
>
> <ramBufferSizeMB>320</ramBufferSizeMB>
> <mergeFactor>20</mergeFactor>
>
> We expected that this would mean that the index would not write to disk until 
> it reached somewhere approximately over 300MB in size.
> However, we see many small segments that look to be around 80MB in size.
>
> We have not yet issued a single commit so nothing else should force a write 
> to disk.
>
> With a merge factor of 20 we also expected to see larger segments somewhere 
> around 320 * 20 = 6GB in size, however we see several around 1GB.
>
> We understand that the sizes are approximate, but these seem nowhere near 
> what we expected.
>
> Can anyone explain what is going on?
>
> BTW
> maxBufferedDocs is commented out, so this should not be affecting the buffer 
> flushes
> <!--<maxBufferedDocs>1000</maxBufferedDocs>-->
>
>
> Solr Specification Version: 3.0.0.2010.11.19.16.00.54
> Solr Implementation Version: 3.1-SNAPSHOT 1036094 - root - 2010-11-19 16:00:54
> Lucene Specification Version: 3.1-SNAPSHOT
> Lucene Implementation Version: 3.1-SNAPSHOT 1036094 - 2010-11-19 16:01:10
>
> Tom Burton-West
>
>
