Zack, I wouldn't at all say that doing a count(*) is not recommended. It's important to know that 1) it requires a full table scan and 2) Phoenix runs it synchronously, so you'll need to set the timeouts high enough for it to complete. Phoenix will be much faster than running an MR job, but the MR job runs asynchronously.
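For the nightly Java program, a rough sketch of a count(*) with raised timeouts might look like the following. It uses only plain JDBC. The class name, the JDBC URL, and the table name are hypothetical placeholders. I'm assuming the client-side properties `phoenix.query.timeoutMs` and `hbase.rpc.timeout` can be passed as connection properties; depending on your cluster, matching server-side settings may be needed as well.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;

// Hypothetical helper class; not part of Phoenix.
class PhoenixRowCount {

    // Builds the count query. Only plain TABLE or SCHEMA.TABLE identifiers are
    // accepted, because the name is concatenated into the SQL text.
    static String countSql(String table) {
        String ident = "[A-Za-z_][A-Za-z0-9_]*";
        if (table == null || !table.matches(ident + "(\\." + ident + ")?")) {
            throw new IllegalArgumentException("Unexpected table name: " + table);
        }
        return "SELECT COUNT(*) FROM " + table;
    }

    // Client-side timeouts, in milliseconds. Assumption: both properties are
    // honored when passed as JDBC connection properties.
    static Properties timeoutProps(long timeoutMs) {
        Properties props = new Properties();
        props.setProperty("phoenix.query.timeoutMs", Long.toString(timeoutMs));
        props.setProperty("hbase.rpc.timeout", Long.toString(timeoutMs));
        return props;
    }

    // Runs the count synchronously and returns the single result value.
    static long countRows(String jdbcUrl, String table, long timeoutMs) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl, timeoutProps(timeoutMs));
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(countSql(table))) {
            if (!rs.next()) {
                throw new SQLException("COUNT(*) returned no row");
            }
            return rs.getLong(1);
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 2) {
            System.err.println("usage: PhoenixRowCount <jdbc-url> <table>");
            return;
        }
        // Example URL shape (host is a placeholder): jdbc:phoenix:zk-host:2181
        long oneHourMs = 60L * 60L * 1000L;
        System.out.println(countRows(args[0], args[1], oneHourMs));
    }
}
```

The call blocks until the scan finishes, so the one-hour value here is only a guess; size it to how long a full scan of your largest table takes. The Phoenix client jar needs to be on the classpath for the `jdbc:phoenix:` URL to resolve.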
To ensure your stats are up to date, run a major compaction on your table, as that updates the stats. Also, if you have wide rows, consider using multiple column families so that the count(*) doesn't have to traverse all of your data.

Thanks,
James

On Friday, June 26, 2015, Nick Dimiduk wrote:

> RowCounter is a MapReduce program. After the program completes execution
> of the job, it returns information about that job, including job counters.
> RowCounter includes its counts in the job counters, so they're easily
> accessed programmatically from the returned object. It's not a ResultSet,
> but it should work nonetheless.
>
> On Friday, June 26, 2015, Riesland, Zack wrote:
>
>> I wrote a Java program that runs nightly and collects metrics about our
>> Hive tables.
>>
>> I would like to include HBase tables in this as well.
>>
>> Since select count(*) is slow and not recommended on Phoenix, what are my
>> alternatives from Java?
>>
>> Is there a way to call org.apache.hadoop.hbase.mapreduce.RowCounter from
>> Java and get results in some kind of result set?
>>
>> Thanks for any info!
