Burton-West, Tom wrote:
> Hello all,
>
> At some point we will need to re-build an index that totals about 2 
> terabytes in size (split over 10 shards).  At our current indexing speed we 
> estimate that this will take about 3 weeks.  We would like to reduce that 
> time.  It appears that our main bottleneck is disk I/O.
>  We currently have ramBufferSizeMB set to 32 and our merge factor is 10.  If 
> we increase ramBufferSizeMB to 320, we avoid a merge and the 9 disk writes 
> and reads to merge 9+1 32MB segments into a 320MB segment.
>
>  Assuming we allocate enough memory to the JVM, would it make sense to 
> increase ramBufferSize to 3200MB?   What are people's experiences with very 
> large ramBufferSizeMB sizes?
>
> Tom Burton-West
> University of Michigan Library
> www.hathitrust.org
>
>
>   
There is a hard limit just under 2 GB. There appear to be diminishing
returns as you go past a hundred to a few hundred MB, i.e., you probably
picked a good number with 320. If you plan to go big anyway (> 1 GB),
you really have to give a lot of RAM to the JVM to avoid some nasty
paging / GC effects. I think someone who tested this had to give the JVM
over 6 GB to go over 1 GB without these effects? That's from memory,
though. If you look at the gain eked out at that point, it's not
really worth it. I'd stick to the lower hundreds of MB, max.
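
To put rough numbers on the merge arithmetic in the question, here is a
back-of-the-envelope sketch in Python. It is a simplified model, not a
measurement: it assumes each flushed segment is about ramBufferSizeMB on
disk (real flushed segments are usually smaller), that every merge combines
exactly mergeFactor equal-sized segments, and that each shard holds about
200 GB (2 TB over 10 shards). The function names are made up for
illustration and are not part of Lucene or Solr.

```python
def merge_levels(total_mb, flush_mb, merge_factor):
    """Count how many times a byte is rewritten by merges before the
    segments reach the size of the whole shard (simplified model)."""
    levels = 0
    segment_mb = flush_mb
    # Each level merges merge_factor segments into one larger segment.
    while segment_mb * merge_factor <= total_mb:
        segment_mb *= merge_factor
        levels += 1
    return levels


def estimated_mb_written(total_mb, flush_mb, merge_factor):
    """Initial flush plus one full rewrite of the data per merge level."""
    return total_mb * (1 + merge_levels(total_mb, flush_mb, merge_factor))


if __name__ == "__main__":
    shard_mb = 200_000  # assumption: 2 TB split evenly over 10 shards
    for ram_buffer_mb in (32, 320, 3200):
        levels = merge_levels(shard_mb, ram_buffer_mb, 10)
        written = estimated_mb_written(shard_mb, ram_buffer_mb, 10)
        print(f"ramBufferSizeMB={ram_buffer_mb}: "
              f"{levels} merge levels, ~{written} MB written per shard")
```

Under these assumptions, going from 32 to 320 removes one merge level (3
down to 2), which matches the reasoning in the question. Going to 3200 would
remove one more in this model, but that is above the hard limit mentioned
above, and the model says nothing about the paging / GC costs.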

-- 
- Mark

http://www.lucidimagination.com


