You can run ``hbase org.apache.hadoop.hbase.io.hfile.HFile -f
"$region" -m'' once per HFile, where $region is the path of an HFile
(they are located under /hbase/$table/*/$family).  This is rather slow
[1] for some reason I don't quite understand, but it's many orders of
magnitude faster than MapReducing the entire table.  The output will
have information like "entryCount" (number of cells in this file),
"totalBytes" (size of the uncompressed data), "length" (actual size on
disk), "avgKeyLen" (average number of bytes in a key), "avgValueLen"
(average number of bytes stored in a cell).

This way you can get detailed information about your table.  The
results won't be up-to-date to the second, because edits still sitting
in the memstore haven't been flushed to an HFile yet, but they'll be
pretty close.
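
To get per-table totals out of this, something like the sketch below
might work.  It is only a sketch: the function names sum_field and
scan_table are made up for illustration, it assumes the tool prints
the fields as name=value tokens (the exact layout may differ between
HBase versions), and it assumes ``hadoop fs -ls'' accepts the glob and
prints the path in the last column.

```shell
#!/bin/sh
# sum_field NAME: read text on stdin and add up every NAME=<integer>
# token found in it.  Assumes fields are printed as name=value.
sum_field() {
    awk -v name="$1" '
    {
        # Split each line into tokens made of [A-Za-z0-9_=] characters.
        n = split($0, tok, /[^A-Za-z0-9_=]+/)
        for (i = 1; i <= n; i++) {
            # Only count tokens that start with exactly "NAME=".
            if (index(tok[i], name "=") == 1)
                sum += substr(tok[i], length(name) + 2)
        }
    }
    END { printf "%.0f\n", sum }'
}

# scan_table TABLE_DIR FAMILY: print the HFile metadata of every HFile
# under TABLE_DIR/*/FAMILY.  Hypothetical helper, one JVM per file.
scan_table() {
    hadoop fs -ls "$1/*/$2" | awk '$NF ~ /^\// { print $NF }' |
    while read -r hfile; do
        # </dev/null keeps hbase from eating the list of paths on stdin.
        hbase org.apache.hadoop.hbase.io.hfile.HFile -f "$hfile" -m </dev/null
    done
}

# Example (table and family names are placeholders):
#   scan_table /hbase/mytable myfamily > meta.txt
#   sum_field entryCount < meta.txt
#   sum_field length < meta.txt
```

Saving the raw output to a file first means the slow scan only has to
run once, and each field can then be summed separately.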


  [1] I recently ran this at SU on a table with about 1200 regions and
it took 1h 15m to read the metadata of every HFile.  I don't
understand how this can take so much time.

-- 
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com
