Hi,
As I'm about to dramatically increase our Riak investment by putting
lots more data into it, I figured I would run through the capacity
planning on the wiki.
Since my current setup is fairly small and manageable, I decided to
see how accurately the capacity planning matches what I actually see.
So first, reality:
A: Number of Machines : 8
B: Memory per Machine : 24 GB
C: Length of Bucket Name: 10 bytes
D: Length of Keys : 36 bytes
E: Length of Values : 36 bytes
F: Replication Factor : 3
G: Number of Keys : 183915891
H: Disk Space used : 341898018816 bytes (341 GB)
I: RAM : 70536691712 bytes (70 GB)
G was calculated by calling riak_kv_bitcask_backend:key_counts/0 for
each bitcask on a node, summing, then dividing by 3.
H was calculated with 'du -sk /var/lib/riak/bitcask/ | cut -f1', summing
and multiplying by 1024.
I was calculated with 'ps -U riak -o vsz h', summing and multiplying
by 1024.
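The aggregation described above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample inputs are made up, and the real per-node numbers are not shown in this message.

```python
def total_bytes(per_node_kib):
    """Sum per-node sizes reported in 1024-byte units and convert to bytes."""
    return sum(per_node_kib) * 1024


def logical_keys(per_node_key_counts, n_val):
    """Sum replica key counts across nodes, then divide by n_val
    to get the number of distinct keys."""
    return sum(per_node_key_counts) // n_val


# Hypothetical sample inputs for a 3-node cluster (NOT the real measurements).
sample_du_kib = [1000, 1200, 800]
sample_key_counts = [300, 330, 270]

print(total_bytes(sample_du_kib))
print(logical_keys(sample_key_counts, 3))
```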
Now, from entering A-G on the Bitcask-Capacity-Planning page, I get
  Total Key Space: 34.9 GB
  Node Count     : 3 (7 GB Storage per Node)
in the first section, and
  Key Overhead   : 73 Bytes (22 Byte Overhead)
  Total Documents: 1,010,580,541
  Total Disk Used: 102 GB of Disk Space
in the second.
Also, when using the Cluster Capacity Planning page, I get
  (static bitcask per key overhead
     + estimated average bucket+key length in bytes)
  * estimated total number of keys
  * n_val
  = Approximate RAM Needed for Bitcask
So plugging in values:
  ( 22 + 10 + 36 ) * 183915891 * 3 = 37518841764 bytes = 34.9 GB
and
  Disk = Estimated Total Objects * Average Object Size * n_val
  Disk = 183915891 * 36 * 3 = 19862916228 bytes = 18.50 GB
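As a cross-check on the hand arithmetic, here is a minimal Python sketch of the two Cluster Capacity Planning formulas with values A-G plugged in. The function names are my own, and it assumes the "GB" figures are bytes divided by 2^30, which is what the 34.9 figure works out to.

```python
GIB = 2**30  # assumption: the "GB" figures are bytes / 2**30


def bitcask_ram_bytes(static_overhead, bucket_len, key_len, num_keys, n_val):
    """(static per-key overhead + bucket+key length) * keys * n_val."""
    return (static_overhead + bucket_len + key_len) * num_keys * n_val


def disk_bytes(num_keys, value_len, n_val):
    """Estimated total objects * average object size * n_val."""
    return num_keys * value_len * n_val


ram = bitcask_ram_bytes(22, 10, 36, 183915891, 3)
disk = disk_bytes(183915891, 36, 3)

print(f"RAM:  {ram} bytes = {ram / GIB:.2f} GiB")
print(f"Disk: {disk} bytes = {disk / GIB:.2f} GiB")
```

Using exact integers avoids any rounding in the intermediate byte counts.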
So either the equations are drastically wrong or my calculations are. I find
it very suspect that the equation for the amount of disk includes zero
overhead. From reading the bitcask paper, it seems like each entry consists
of
  CRC, timestamp, keysz, valsz, key, value
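To get a feel for how much the per-entry header could add, here is a rough Python sketch. The field widths (4-byte CRC, 4-byte timestamp, 2-byte key size, 4-byte value size) are assumptions on my part, since only the field names are listed above, and it also assumes the stored key is just bucket name plus key (10 + 36 bytes) and the stored value is exactly 36 bytes. It counts one live entry per replica and nothing else.

```python
# Assumed header field widths in bytes; only the field names come from the
# entry layout listed above.
HEADER_FIELDS = {"crc": 4, "timestamp": 4, "keysz": 2, "valsz": 4}


def entry_bytes(key_len, value_len, header=HEADER_FIELDS):
    """Size of one on-disk entry: header fields + key + value."""
    return sum(header.values()) + key_len + value_len


def disk_with_header_bytes(num_keys, key_len, value_len, n_val):
    """Disk estimate that includes the per-entry header for every replica."""
    return entry_bytes(key_len, value_len) * num_keys * n_val


# Assumption: stored key = bucket name (10) + key (36); stored value = 36.
per_entry = entry_bytes(10 + 36, 36)
total = disk_with_header_bytes(183915891, 10 + 36, 36, 3)

print(f"{per_entry} bytes per entry, {total} bytes total")
```

Even with the header and the key counted, this estimate is still far below the measured disk usage, so header overhead alone would not explain the gap under these assumptions.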
Well anyway, there's obviously something off, as I end up with the following:

        Bitcask-Capacity-Planning   Cluster-Capacity-Planning   Reality
  RAM   34.9 GB                     34.9 GB                     70 GB
  Disk  102 GB                      18.49 GB                    341 GB
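One way to quantify the gap is to divide the measured byte counts by each estimate, all in bytes. A small Python sketch follows; it assumes the figures reported by the Bitcask-Capacity-Planning page are in units of 2^30 bytes, which the page output does not state.

```python
GIB = 2**30

# Measured values in bytes (items I and H above).
measured = {"RAM": 70536691712, "Disk": 341898018816}

# Cluster Capacity Planning formulas, evaluated in bytes.
cluster_formula = {
    "RAM": (22 + 10 + 36) * 183915891 * 3,
    "Disk": 183915891 * 36 * 3,
}

# Assumption: the page's "GB" means 2**30 bytes.
bitcask_page = {"RAM": 34.9 * GIB, "Disk": 102 * GIB}


def ratio(actual, estimate):
    """How many times larger the measurement is than the estimate."""
    return actual / estimate


ratios = {
    name: {
        "vs_bitcask_page": ratio(measured[name], bitcask_page[name]),
        "vs_cluster_formula": ratio(measured[name], cluster_formula[name]),
    }
    for name in measured
}

for name, r in ratios.items():
    print(f"{name}: {r['vs_bitcask_page']:.2f}x page estimate, "
          f"{r['vs_cluster_formula']:.2f}x formula estimate")
```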
So it looks to me like the numbers for RAM are about 1/2 of actual, and
the numbers for Disk are completely off: they differ depending on
which page you look at on the wiki, and both vastly underestimate reality.
I'm hoping someone from Basho can clarify so I can really determine
capacity.
Thanks,
-Anthony
--
------------------------------------------------------------------------
Anthony Molinaro <[email protected]>