In Phoenix 4.13 the COUNT is VERY fast..was there some optimization in this sense (from 4.7)??
On Wed, Sep 13, 2017 at 12:53 PM, Ankit Singhal <ankitsingha...@gmail.com> wrote: > Best is to do "SELECT COUNT(*) FROM MYTABLE" with index. As index table > will have less data so it can be read faster. > if you have time series data or your data is always incremental with some > ID then you can do incremental count with row_timestamp filters or ID filter > > > bq. however the result could be non-deterministic if HBase has just been > restarted.. > > Results are expected to be deterministic in normal scenarios. can > you elaborate what is the difference you see after HBase restarted? > > bq. SELECT SUM(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE > PHYSICAL_NAME = 'MYTABLE'; > > We calculate row count till the guidePosts is found in the region > and no count will be stored for a region having a size not enough for > guidepost width or remaining region after the last guidePosts. so this > row_count should not be used against actual count. > > > On Wed, Sep 13, 2017 at 4:04 PM, Flavio Pompermaier <pomperma...@okkam.it> > wrote: > >> Hi to all, >> I'm trying to investigate the best option to have get the row count out >> of a table. >> >> I've tried the following: >> >> >> 1. SELECT COUNT(*) FROM MYTABLE >> 1. very slow without an index, very quick with an index >> 2. however the result could be non-deterministic if HBase has just >> been restarted.. >> 2. SELECT SUM(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE >> PHYSICAL_NAME = 'MYTABLE'; >> 1. the result here is completely different from the first >> one..323329772 vs 13376168. How is that possible? >> >> Best, >> Flavio >> >