Another alternative outside Phoenix is to run the HBase RowCounter MapReduce job: http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/RowCounter.html
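
For reference, a minimal sketch of how RowCounter is usually invoked from the command line. The table name MY_SCHEMA.MY_TABLE and the helper rowcounter_cmd are hypothetical placeholders, not something from this thread; substitute the physical HBase table name behind your Phoenix table. Keep in mind that RowCounter still scans the whole table, so it moves the load into a MapReduce job rather than removing it.

```shell
# Hypothetical placeholder: the physical HBase table behind the Phoenix table.
TABLE="MY_SCHEMA.MY_TABLE"

# Helper (name is illustrative only) that prints the RowCounter command
# so it can be reviewed before anything is launched on the cluster.
rowcounter_cmd() {
  printf 'hbase org.apache.hadoop.hbase.mapreduce.RowCounter %s\n' "$1"
}

# Show the command; paste it into a shell on a node with the hbase client
# installed to actually submit the job.
rowcounter_cmd "$TABLE"
```

When the job finishes, the row total should appear among the job counters in the output (I believe it is reported as ROWS, but check against your HBase version).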

On Tue, Dec 19, 2017 at 3:18 PM, James Taylor <jamestay...@apache.org> wrote:

> If it needs to be 100% accurate, then count(*) is the only way. If your
> data is write-once data, you might be able to track the row count at the
> application level through some kind of atomic counter in a different table
> (but this will likely be brittle). If you can live with an estimate, you
> could enable statistics [1], optionally configuring Phoenix not to use
> stats for parallelization [2], and query the SYSTEM.STATS table to get an
> estimate [3].
>
> Another interesting alternative if you want the approximate row count when
> you have a where clause would be to use the new table sampling feature [4].
> You'd also want stats enabled for this to be more accurate too.
>
> Thanks,
> James
>
>
> [1] https://phoenix.apache.org/update_statistics.html
> [2] phoenix.use.stats.parallelization=false
> [3] select sum(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS where
> physical_name='my_schema.my_table'
> and COLUMN_FAMILY='my_first_column_family' -- necessary only if you
> have multiple column families
> [4] https://phoenix.apache.org/tablesample.html
>
> On Tue, Dec 19, 2017 at 2:57 PM, Jins George <jins.geo...@aeris.net>
> wrote:
>
>> Hi,
>>
>> Is there a way to get the total row count of a phoenix table without
>> running select count(*) from table ?
>> my use case is to monitor the record count in a table every x minutes, so
>> didn't want to put load on the system by running a select count(*) query.
>>
>> Thanks,
>> Jins George