The general rule is to keep your disk utilization under 80%. That means you need to figure out how much disk space HBase will take (and how it will grow), keeping in mind that every value is stored along with its full key (row key + family + qualifier + timestamp).
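
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the standard KeyValue layout (about 20 bytes of fixed length/timestamp/type fields per cell, on top of the row key, family, qualifier and value) and an HDFS replication factor of 3. It ignores compression, HFile indexes, the write-ahead log and the temporary space compactions need, so treat the result as a starting point, not a measurement. The function names and the sample key sizes are made up for illustration.

```python
# Rough HBase disk sizing sketch. All sizes are in bytes.
# Assumptions (not measured): uncompressed KeyValues, HDFS replication = 3,
# no allowance for WAL, HFile indexes/bloom filters, or compaction scratch space.

# Fixed per-cell fields in the KeyValue layout:
# key length (4) + value length (4) + row length (2) + family length (1)
# + timestamp (8) + key type (1)
KEYVALUE_FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1


def keyvalue_size(row_len, family_len, qualifier_len, value_len):
    """Approximate on-disk size of one cell, including its full key."""
    return (KEYVALUE_FIXED_OVERHEAD + row_len + family_len
            + qualifier_len + value_len)


def raw_disk_needed(num_cells, cell_size, replication=3, max_utilization=0.8):
    """Raw cluster disk needed to hold the cells while staying under
    the target utilization (80% by default)."""
    stored = num_cells * cell_size * replication
    return stored / max_utilization


# Hypothetical example: 300,000 cells, 16-byte row key, 1-byte family name,
# 8-byte qualifier, 100-byte value.
cell = keyvalue_size(row_len=16, family_len=1, qualifier_len=8, value_len=100)
needed = raw_disk_needed(300_000, cell)
print(f"{cell} bytes per cell, about {needed / 1e6:.0f} MB of raw disk")
```

Note how the key is a large share of each cell when values are small, which is why short family and qualifier names help. Plug in your expected growth in cells to see when you would cross the 80% line.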

J-D

On Thu, Jul 8, 2010 at 5:36 PM, <[email protected]> wrote:
>
> Our organization people are more familiar with and use just Hadoop..HBase is
> not yet..I'm venturing
> into it..Because most are familiar with Hadoop, everyone thinks we would need
> lot more storage..
>
> I understand the general use case for Hadoop is storing giant raw log files
> from webservers & other
> servers..which are huge & fills up quickly
> If all we are storing processed data(events) directly in HBase tables, that
> space should not be that
> much..I would think..
> I've done some small benchmarking testing (300,000 records )/less 100 bytes
> per record..HBase does n't
> take up much disk space..(looking raw disk usage)
>
> As table gets larger it will..we have flexibility on how long to keep data
> around..
>
> Thoughts?
>
> thanks
> venkatesh
