Here are a few more comments on top of Jean-Daniel's suggestion:
Marcus Schlüter wrote:
Hi everyone
We used hadoop 0.16.4 with a replication level of 2 and hbase 0.1.3.
Make sure you tell hbase that you only want a replication of 2: See
http://wiki.apache.org/hadoop/Hbase/FAQ#12.
On a side note, we also observe that hbase seems to have a large
storage overhead.
When we insert about 1GB of raw data into hbase, it uses about 8GB of
HDFS space (when taking into account the replication).
Is this large overhead expected?
Your values are small: 100 bytes each. Each value is stored along with
its key, whose form is rowid/columnname/timestamp, so the keys add
noticeably to the total.
Can you slice and dice using hadoop fs -dus and figure out where the
bulk of the 8G is under your hbase.rootdir? (You may have an extra
replica that you did not expect, given #12 from the FAQ above.)
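
To put rough numbers on the key overhead mentioned above, here is a
back-of-the-envelope sketch in Python. Only the 100-byte value size and
the replication of 2 come from this thread; the row id and column name
lengths are made-up assumptions, and the function names are purely
illustrative. It ignores file-format, index, and log overhead, so treat
it as a lower bound rather than a model of what hbase actually writes.

```python
# Rough per-cell storage estimate. Every size except the 100-byte value
# is an assumption chosen for illustration, not a measured hbase number.

def stored_bytes_per_cell(value_len, row_len, column_len, timestamp_len=8):
    """Bytes for one cell when the rowid/columnname/timestamp key is stored with the value."""
    return value_len + row_len + column_len + timestamp_len


def expansion_factor(value_len, row_len, column_len, replication):
    """HDFS bytes used per byte of raw value, ignoring file-format and index overhead."""
    per_cell = stored_bytes_per_cell(value_len, row_len, column_len)
    return replication * per_cell / value_len


# 100-byte value (from the thread), hypothetical 20-byte row id and
# 20-byte column name, replication of 2.
factor = expansion_factor(value_len=100, row_len=20, column_len=20, replication=2)
print(f"about {factor:.2f}x")
```

With these assumed key sizes the keys and replication alone come to
roughly 3x, well short of 8x, which is why the per-directory breakdown
is still worth looking at.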
St.Ack