Hi all,

I want to calculate the size of the data in an HBase table according to the KeyValue format.
The HBase table structure, as shown by scan 'test', is as follows:

 010012010114200           column=s:STATION, timestamp=1378892292800, value=00001
 010012010114200           column=s:YEAR, timestamp=1378892292800, value=2010
 010012010114210           column=s:DAY, timestamp=1378892292800, value=14
 010012010114210           column=s:HOUR, timestamp=1378892292800, value=21
 010012010114210           column=s:MINUTE, timestamp=1378892292800, value=0
 010012010114210           column=s:MONTH, timestamp=1378892292800, value=1


I want to calculate the record size:
Fixed part needed by the KeyValue format = Key Length (4) + Value Length (4)
+ Row Length (2) + CF Length (1) + Timestamp (8) + Key Type (1)
= (4 + 4 + 2 + 1 + 8 + 1) = 20 bytes

Variable part needed by the KeyValue format = Row + Column Family + Column
Qualifier + Value


Total bytes required = Fixed part + Variable part

STATION: 20 + (15 + 1 + 7 + 5) = 48 bytes
YEAR:    20 + (15 + 1 + 4 + 4) = 44 bytes
DAY:     20 + (15 + 1 + 3 + 2) = 41 bytes
HOUR:    20 + (15 + 1 + 4 + 2) = 42 bytes
MINUTE:  20 + (15 + 1 + 6 + 1) = 43 bytes
MONTH:   20 + (15 + 1 + 5 + 1) = 42 bytes

So one record needs 48 + 44 + 41 + 42 + 43 + 42 = 260 bytes.
And I have 2 million records, so the total size is about 520 MB.
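As a sanity check, the per-cell arithmetic can be sketched in a few lines of Python (the helper name is mine, and this counts only the raw KeyValue bytes, ignoring HFile block overhead, indexes, versions, and compression):

```python
# Fixed KeyValue overhead: key len (4) + value len (4) + row len (2)
# + CF len (1) + timestamp (8) + key type (1) = 20 bytes.
FIXED = 4 + 4 + 2 + 1 + 8 + 1

def keyvalue_size(row, family, qualifier, value):
    """Fixed overhead plus the variable parts: row, CF, qualifier, value."""
    return FIXED + len(row) + len(family) + len(qualifier) + len(value)

row = "010012010114200"  # 15-byte row key from the scan output
cells = {"STATION": "00001", "YEAR": "2010", "DAY": "14",
         "HOUR": "21", "MINUTE": "0", "MONTH": "1"}

record = sum(keyvalue_size(row, "s", q, v) for q, v in cells.items())
print(record)                     # 260 bytes per record
print(record * 2_000_000 / 1e6)  # 520.0 MB for 2 million records
```

This is only a lower bound on the actual on-disk size, since the store files add their own structures on top of the raw KeyValues.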
My question is: is this calculation method right?

-- 

In the Hadoop world I am just a novice, exploring the entire Hadoop
ecosystem. I hope one day I can contribute my own code.

YanBit
[email protected]
