Re: How to count table rows from Java?

James Taylor Fri, 26 Jun 2015 11:14:20 -0700

Zach,
I wouldn't say at all that doing a count(*) is not recommended. It's
important to know that 1) it requires a full table scan and 2) Phoenix
runs it synchronously, so you'll need to set the timeouts high enough
for it to complete. Phoenix will be much faster than running an MR job,
but the MR job runs asynchronously.


To ensure your stats are up to date, run a major compaction on your table
as that updates the stats. Also, if you have wide rows, consider using
multiple column families so that the count(*) doesn't have to traverse all
of your data.
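
As a rough illustration of the count(*) approach from Java, here is a minimal
sketch using plain JDBC. The class name PhoenixRowCount, the helper names, and
the command-line arguments are made up for the example. The three property
names are standard Phoenix/HBase client settings, but whether all three need
raising (and whether matching server-side settings also need adjusting)
depends on your cluster, so treat the values as assumptions. It also assumes
the Phoenix client jar is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;

// Hypothetical helper class; the names here are illustrative only.
class PhoenixRowCount {

    // Build connection properties that raise the client-side timeouts
    // a long-running full table scan is likely to hit.
    static Properties timeoutProps(long timeoutMs) {
        Properties props = new Properties();
        String value = Long.toString(timeoutMs);
        props.setProperty("phoenix.query.timeoutMs", value);
        props.setProperty("hbase.rpc.timeout", value);
        props.setProperty("hbase.client.scanner.timeout.period", value);
        return props;
    }

    // Run a synchronous count(*) and return the result.
    // tableName is concatenated into the SQL, so it must come from a trusted source.
    static long countRows(String jdbcUrl, String tableName, long timeoutMs)
            throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl, timeoutProps(timeoutMs));
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM " + tableName)) {
            rs.next(); // count(*) always yields exactly one row
            return rs.getLong(1);
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 2) {
            System.err.println("usage: PhoenixRowCount <jdbc:phoenix:zk-quorum> <table>");
            return;
        }
        // Ten minutes is an arbitrary example value, not a recommendation.
        System.out.println(countRows(args[0], args[1], 600_000L));
    }
}
```

The call blocks until the scan finishes, which is why the timeouts matter for
a nightly metrics job over large tables.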

Thanks,
James

On Friday, June 26, 2015, Nick Dimiduk <[email protected]> wrote:

> RowCounter is a MapReduce program. After the program completes execution
> of the job, it returns information about that job, including job counters.
> RowCounter includes its counts in the job counters, so they're easily
> accessed programmatically from the returned object. It's not a ResultSet,
> but it should work nonetheless.
>
> On Friday, June 26, 2015, Riesland, Zack <[email protected]> wrote:
>
>> I wrote a Java program that runs nightly and collects metrics about our
>> Hive tables.
>>
>> I would like to include HBase tables in this as well.
>>
>> Since select count(*) is slow and not recommended on Phoenix, what are my
>> alternatives from Java?
>>
>> Is there a way to call org.apache.hadoop.hbase.mapreduce.RowCounter from
>> Java and get results in some kind of result set?
>>
>> Thanks for any info!
>
