Shane,

Several points for your consideration:

- "cache_size" is broken in 1.3.x and fixed in 1.4.x.  Any cache_size above the 
default 8M can ruin performance because some idiot (me) put a linear walk in 
the cache code.  That walk is reverted for 1.4, and cache_size has been tested 
up to 2G.

- do you have AAE (active anti-entropy) running?  AAE creates a parallel set of 
vnodes/databases to track hash trees of the primary vnodes.  This will add to 
your memory load.

- max_open_files in 1.3 puts a limit on the count of open files, but NOT on the 
amount of memory used by each file.  Larger .sst files create larger bloom 
filters, which take up more memory for the same max_open_files setting.  Poor 
accounting.  1.4.x has a correction that changes file cache accounting from 
file count to the amount of memory used per file (the limit is 4 Mbyte * 
max_open_files).  I am aware of one other 1.3.x site that had to reduce their 
max_open_files from 300 or more down to around 70 to stabilize memory.
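To see why the count-based accounting hurts, here is a rough back-of-the-envelope sketch (not Basho's official formula) of how bloom filter memory grows with .sst file size.  It assumes LevelDB's common 10 bits per key for blooms and uses Shane's average entry size (150 B key + 1024 B value); both are assumptions for illustration.

```python
# Sketch: why 1.3's count-based file cache inflates memory when .sst
# files grow.  ASSUMPTIONS: ~10 bloom bits per key (a common LevelDB
# setting), entry size = Shane's 150 B key + 1024 B value.

BLOOM_BITS_PER_KEY = 10
ENTRY_BYTES = 150 + 1024
MB = 1024 * 1024

def bloom_bytes_per_file(sst_bytes):
    """Approximate bloom filter size for one .sst file."""
    keys = sst_bytes // ENTRY_BYTES
    return keys * BLOOM_BITS_PER_KEY // 8

def file_cache_bytes(sst_bytes, max_open_files, vnodes):
    """Bloom memory pinned across a node by 1.3's count-based cache."""
    return bloom_bytes_per_file(sst_bytes) * max_open_files * vnodes

old = file_cache_bytes(2 * MB, 300, 64)    # 1.2-era 2 MB files
new = file_cache_bytes(100 * MB, 300, 64)  # 1.3-era 100 MB files
print(f"2 MB files:   {old / MB:,.1f} MB of blooms")
print(f"100 MB files: {new / MB:,.1f} MB of blooms")
```

Same max_open_files, roughly 50x the pinned memory once the files grow to 100 MB — which is what the 1.4.x per-byte accounting corrects.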

That said, I have attached a spreadsheet that simplifies the math discussed on 
our web site.  I have plugged in your numbers and see a "balance" around 
cache_size=8388603 and max_open_files=200.  
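For readers without the spreadsheet, the shape of the arithmetic looks roughly like this.  This is NOT the attached spreadsheet and will not reproduce its totals; the per-file overhead value is a hypothetical placeholder for the bloom + index memory of one open .sst file, which you would replace with your own measurement.

```python
# Structural sketch of a per-node memory estimate for 1.3-era eleveldb.
# ASSUMPTION: per_file_overhead is a hypothetical stand-in, not a
# measured value; the real spreadsheet models more components.

MB = 1024 * 1024

def node_memory(vnodes, cache_size, max_open_files,
                wb_min, wb_max, per_file_overhead):
    """Estimated steady-state memory for one Riak node, in bytes."""
    write_buffers = (wb_min + wb_max) // 2   # average of min/max buffer
    per_vnode = write_buffers + cache_size \
        + max_open_files * per_file_overhead
    return vnodes * per_vnode

# Suggested "balance" values, with Shane's write buffer settings and a
# placeholder ~110 KB overhead per open file:
est = node_memory(vnodes=64, cache_size=8388603, max_open_files=200,
                  wb_min=31457280, wb_max=62914560,
                  per_file_overhead=110 * 1024)
print(f"rough estimate: {est / MB:,.0f} MB")
```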


I am currently writing a dynamic cache size model for Riak 2.0.  All the smoke 
and mirrors for memory sizing will then go away.  The code will automatically 
change caches as vnodes migrate to and from a given node (no need to "reserve 
50%") … you just say total memory you want to allocate to Riak and walk away.

Matthew

Attachment: LevelDB1.2MemModel.Shane.xls


On Sep 6, 2013, at 12:10 PM, Shane McEwan <[email protected]> wrote:

> G'day!
> 
> Our Riak nodes have 48GB of RAM in them. When we installed Riak 1.2.0 on them 
> we tuned the LevelDB settings as per the Parameter Planning section in 
> http://docs.basho.com/riak/1.2.0/tutorials/choosing-a-backend/LevelDB/
> 
> {eleveldb, [
>             {write_buffer_size_min, 31457280},
>             {write_buffer_size_max, 62914560},
>             {cache_size, 300000000},
>             {max_open_files, 300}
>            ]},
> 
> With 64 vnodes per node, an average SST size of 2MB, key size of 150B and 
> average value size of 1024B we were expecting our Riak memory usage to peak 
> at around 24GB . . . the recommended 50% of system memory.
> 
> Currently the Riak process on each of our nodes is using 32GB of RAM and 
> still rising.
> 
> I've noticed that since we moved to Riak 1.3.1 new SST files are no longer 
> 2MB but are now 100MB. Recalculating memory usage based on that gives me an 
> expected usage of 32GB. So maybe that explains the higher memory usage than 
> expected.
> 
> So, 2 questions:
> 
> Can we expect the memory usage to plateau soon?
> 
> The cache_size of 300MB should allow for one of our nodes to go down and 
> we'll still be under the 50% threshold when the 64 down vnodes are shared 
> amongst the remaining three nodes . . . except we're already using more than 
> the 24GB threshold (or the new 32GB threshold with 100MB SSTs) with all four 
> nodes still up. How can this be?
> 
> I'd appreciate if someone could do the math for me and come up with suggested 
> tuning parameters because I'm losing faith in the documentation:
> 
> Riak: 1.3.1
> Nodes: 4
> Total RAM per node: 48GB
> Desired RAM allocated to Riak: 24GB
> ring_creation_size: 256
> Partitions per node: 64
> 
> Thanks!
> 
> Shane.
> 
> 
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

