This describes how they are written; knowing your data size and average key size, you can do the math:
http://hbase.apache.org/book.html#d0e9542

J-D

On Tue, Feb 21, 2012 at 2:30 PM, Mikael Sitruk <[email protected]> wrote:
> Ok, so this is approx 150 regions per RS
> What are the maths between the memory (index size) and number of regions?
> (Btw at the beginning when I mentionned 500 regions it was per RS.)
> I'm trying to figure out what should be my cluster configuration, regarding
> region, region size, memory size, and number of RS for the volume and
> workload I'm using
> On Feb 22, 2012 12:14 AM, "Jean-Daniel Cryans" <[email protected]> wrote:
>
>> On Tue, Feb 21, 2012 at 1:57 PM, Mikael Sitruk <[email protected]>
>> wrote:
>> >> > If so beside the collection time is there
>> >> > any impact (perhaps the documentation should be updated too)?
>> >>
>> >> Collection time? You mean GC? Sorry I don't get what you mean.
>> >>
>> >
>> > *Sorry, typo mistake (from mobile) I meant compaction not collection
>>
>> Ah! Well there's a ton of impacts starting from having less regions :)
>> But definitely compactions will take a lot longer the bigger the
>> regions are since more and more is done in a single process. The
>> documentation could definitely have more info on that.
>>
>> >
>> >> > Regarding the number of regions you have (14,398) is it for a single RS?
>> >> > What is your number of RS?
>> >>
>> >> Currently 91 in that cluster. It varies :)
>> >>
>> >> We have >200 tables coming all in different sizes.
>> >
>> > *Not clear, 91 rs, and 14398 regions in total? Or per RS?
>>
>> Oh sorry, total. 14k on a single RS is impossible/suicide if you have
>> any data in there because it would OOME trying to load the indexes
>> (better in 0.92 tho).
>>
>> J-D
>>
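
For the index-size math referred to at the top of this reply, here is a rough back-of-the-envelope sketch in Python. It is not taken from the HBase code. It assumes one block-index entry per data block, with each entry holding the block's first key plus an 8-byte offset and a 4-byte size, and it assumes the whole index sits on the heap. The 64 KB block size is the HFile default. The 1 GiB region size and 50-byte average key are made-up inputs, and the function name is hypothetical. JVM object overhead is ignored, so treat the result as a lower bound.

```python
# Rough estimate of HFile block-index memory per region server.
# Assumptions (not from the thread): one index entry per data block, each entry
# holding the block's first key plus an 8-byte offset and a 4-byte size, and the
# whole index kept on heap. JVM object overhead is ignored.

ENTRY_OVERHEAD_BYTES = 8 + 4  # assumed: block offset (long) + block size (int)


def estimate_index_bytes(data_bytes, avg_key_bytes, block_bytes=64 * 1024):
    """Approximate block-index size in bytes for data_bytes of store files."""
    blocks = -(-data_bytes // block_bytes)  # ceiling division
    return blocks * (avg_key_bytes + ENTRY_OVERHEAD_BYTES)


# Hypothetical inputs: 1 GiB of data per region, 50-byte average key.
per_region = estimate_index_bytes(1024 ** 3, 50)
# ~150 regions per RS is the figure quoted in the thread.
per_server = 150 * per_region

print(f"index per region: {per_region / 2**20:.2f} MiB")
print(f"index per RS:     {per_server / 2**20:.2f} MiB")
```

With those made-up inputs this comes to roughly 145 MiB of index per RS. Plug in your own region size and average key size (the full KeyValue key, not just the row) to see how the number moves with the region count.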
