Thanks Xavier. I'll give that a shot.

Norbert

On Mon, Jan 24, 2011 at 1:33 PM, Xavier Stevens <[email protected]> wrote:

> Not sure if there is a way to do that. You could get a really rough
> estimate if you did the job I described and subtracted the total bytes
> calculated for the records from the "hadoop fs -dus /hbase/<table_name>"
> bytes. Then that would give an idea of the amount of overhead. I have
> a feeling it is negligible in the grand scheme of things.
>
> -Xavier
>
> On 1/24/11 10:23 AM, Norbert Burger wrote:
> > Good idea. But it seems like this approach would give me the size of just
> > the raw data itself, ignoring any kind of container (like HFiles) that is
> > used to store the data. What I'd like ideally is to get an idea of what the
> > fixed cost (in terms of bytes) is for each of my tables, and then understand
> > how I can calculate a variable bytes/record cost.
> >
> > Is this feasible?
> >
> > Norbert
> >
> > On Mon, Jan 24, 2011 at 1:16 PM, Xavier Stevens <[email protected]> wrote:
> >
> >> Norbert,
> >>
> >> It would probably be best if you wrote a quick MapReduce job that
> >> iterates over those records and outputs the sum of bytes for each one.
> >> Then you could use that output and get some general descriptive
> >> statistics based on it.
> >>
> >> Cheers,
> >>
> >> -Xavier
> >>
> >> On 1/24/11 9:37 AM, Norbert Burger wrote:
> >>> Hi folks - is there a recommended way of estimating HBase HDFS usage for a
> >>> new environment?
> >>>
> >>> We have a DEV HBase cluster in place, and from this, I'm trying to estimate
> >>> the specs of our not-yet-built PROD environment. One of the variables we're
> >>> considering is HBase usage of HDFS. What I've just tried is to calculate an
> >>> average bytes/record ratio by using "hadoop dfs -du /hbase", and dividing by
> >>> the number of records/table. But this ignores any kind of fixed overhead,
> >>> so I have concerns about it.
> >>>
> >>> Is there a better way?
> >>>
> >>> Norbert
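
The rough estimate described in the thread (subtract the summed record bytes from the "hadoop fs -dus" figure for the table) comes down to a few lines of arithmetic. Below is a minimal sketch in Python. The function name and the sample numbers are hypothetical, not taken from any real cluster: hdfs_bytes stands for the value reported by "hadoop fs -dus /hbase/<table_name>", while raw_record_bytes and record_count stand for the totals a MapReduce job over the table would produce.

```python
def estimate_overhead(hdfs_bytes, raw_record_bytes, record_count):
    """Split a table's on-disk size into raw record data and storage overhead.

    hdfs_bytes       -- size reported by "hadoop fs -dus" for the table (assumed input)
    raw_record_bytes -- sum of record sizes emitted by the MapReduce job (assumed input)
    record_count     -- number of records the job iterated over (assumed input)
    """
    if hdfs_bytes <= 0 or record_count <= 0:
        raise ValueError("hdfs_bytes and record_count must be positive")

    # Overhead is whatever is on disk beyond the raw record bytes.
    overhead_bytes = hdfs_bytes - raw_record_bytes
    return {
        "overhead_bytes": overhead_bytes,
        "overhead_fraction": overhead_bytes / hdfs_bytes,
        "raw_bytes_per_record": raw_record_bytes / record_count,
        "disk_bytes_per_record": hdfs_bytes / record_count,
    }


# Hypothetical numbers, for illustration only.
stats = estimate_overhead(
    hdfs_bytes=12_000_000_000,
    raw_record_bytes=10_000_000_000,
    record_count=50_000_000,
)
for name, value in stats.items():
    print(name, value)
```

Note that this lumps fixed and per-record overhead into one number. Separating the two, as asked in the thread, would presumably need the same measurement taken at more than one table size.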
