*Context* I am using Spark (1.5.1) with HBase (1.1.2) to dump the output of Spark jobs into HBase, which is then available for lookups from the HBase table. A custom BaseRelation that extends HadoopFsRelation is used to read from and write to HBase, through Spark's Data Source API.
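For illustration, here is a minimal, hypothetical sketch (simplified to extend BaseRelation directly, unlike our real relation) of how a relation reports its size to Spark's planner by overriding sizeInBytes. HBaseLookupRelation and HBaseTableSize.estimate are assumed names; the estimator is sketched further below:

```scala
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.sources.BaseRelation
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical, simplified relation (our real one extends HadoopFsRelation).
// Overriding sizeInBytes is how a relation tells Spark's planner how big it
// is; the planner compares this against spark.sql.autoBroadcastJoinThreshold.
class HBaseLookupRelation(override val sqlContext: SQLContext,
                          hbaseTableName: String) extends BaseRelation {

  // Placeholder schema; the real relation would derive this from its
  // HBase column mapping.
  override def schema: StructType =
    StructType(Seq(StructField("rowkey", StringType, nullable = false)))

  // HBaseTableSize.estimate is the assumed helper sketched below.
  override def sizeInBytes: Long = HBaseTableSize.estimate(hbaseTableName)
}
```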
*Use Case* Now, whenever I perform a join, Spark builds a logical plan and decides which type of join to execute. Per SparkStrategies [0], it checks the estimated size of the HBase table: if it is below the broadcast threshold (spark.sql.autoBroadcastJoinThreshold, 10 MB by default), Spark selects a broadcast hash join; otherwise it uses a sort merge join.

*Problem Statement* I want to know whether there is an API, or some other approach, to calculate the size of an HBase table.
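The kind of estimate I am after would look roughly like the sketch below, which sums the per-region store-file sizes via HBase's RegionSizeCalculator (assuming its (RegionLocator, Admin) constructor from HBase 1.1). This is not verified against our cluster:

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.ConnectionFactory
import org.apache.hadoop.hbase.mapreduce.RegionSizeCalculator

import scala.collection.JavaConverters._

// Sketch: estimate a table's size by summing the store-file sizes that the
// RegionServers report for each of its regions. This is on-disk (possibly
// compressed) size, so it only approximates the in-memory size that the
// join planner really cares about.
object HBaseTableSize {
  def estimate(tableName: String): Long = {
    val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
    try {
      val regionLocator = connection.getRegionLocator(TableName.valueOf(tableName))
      val admin = connection.getAdmin
      try {
        val calculator = new RegionSizeCalculator(regionLocator, admin)
        // getRegionSizeMap: region name -> size in bytes
        calculator.getRegionSizeMap.values.asScala.map(_.longValue).sum
      } finally {
        admin.close()
        regionLocator.close()
      }
    } finally {
      connection.close()
    }
  }
}
```

Note that RegionSizeCalculator derives its numbers from the region load metrics reported to the master, not from scanning the table, so the call is cheap but only megabyte-granular. Is something like this reliable, or is there a better-supported API?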
[0]: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala#L118

Thanks
-Sachin