[ 
https://issues.apache.org/jira/browse/TRAFODION-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16551754#comment-16551754
 ] 

liu ming commented on TRAFODION-3137:
-------------------------------------

This applies only to the very first count(*) without any predicate. In other 
words, right after data loading and update statistics, if you run 

select count ( * ) from t;

then it is possible to get the exact row count from the stats, since they are 
still accurate at that point.

 

However, I still cannot find a way to tell the last modified time of an HBase 
table. HBase does not support it, so this feature is blocked.

I have been thinking of using the WAL, but it is also not easy to tell the 
last WAL for a given table: HBase seems to save the WAL per region, not per 
table. 

So I have not found a way to do this yet.
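
To make the condition concrete, here is a minimal sketch in Python of the
check described above. It is only an illustration: TableStats,
count_star_from_stats and the last_modified value are hypothetical names, not
Trafodion or HBase APIs, and the sketch assumes a last modified time for the
table could be obtained somehow, which is exactly the missing piece.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TableStats:
    """Hypothetical record of what update statistics stored for a table."""
    row_count: int       # row count recorded when stats were collected
    collected_at: float  # collection time, seconds since the epoch


def count_star_from_stats(
    stats: Optional[TableStats],
    last_modified: Optional[float],
    has_predicate: bool,
) -> Optional[int]:
    """Return the row count from stats when it is safe, else None.

    None means the caller has to fall back to scanning the table.
    """
    # Only a plain count(*) with no predicate can be answered from stats.
    if has_predicate or stats is None:
        return None
    # Without a known last modified time we cannot prove the stats are current.
    if last_modified is None:
        return None
    # Any change after stats collection makes the stored count unreliable.
    if last_modified > stats.collected_at:
        return None
    return stats.row_count
```

With an unknown last modified time the function always returns None, which is
the current situation: the shortcut cannot be taken safely until that
timestamp is available.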

 

The value is this: a lot of users run a count(*) after data loading to check 
how many rows were loaded. In many cases, after loading 10 billion rows, the 
count(*) takes hours, which leaves a bad impression. In this case, reading the 
row count from stats is valid.

> speed up count(*) by read row count from stats in special use cases
> -------------------------------------------------------------------
>
>                 Key: TRAFODION-3137
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-3137
>             Project: Apache Trafodion
>          Issue Type: Improvement
>            Reporter: liu ming
>            Assignee: liu ming
>            Priority: Major
>
> If the table is bulk loaded once and not changed afterwards, and update 
> statistics has finished, then count(*) does not need to scan the whole 
> table; it can read the row count from stats.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
