[ 
https://issues.apache.org/jira/browse/KYLIN-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17114469#comment-17114469
 ] 

ASF subversion and git services commented on KYLIN-4315:
--------------------------------------------------------

Commit 24f0063daacb6732aa06f5abbe6d198f570ecf95 in kylin's branch 
refs/heads/master from xiacongling
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=24f0063 ]

KYLIN-4315 use metadata numRows in beeline client for quick row counting


> Use metadata numRows in beeline client for quick row counting
> -------------------------------------------------------------
>
>                 Key: KYLIN-4315
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4315
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Job Engine
>            Reporter: Congling Xia
>            Assignee: Congling Xia
>            Priority: Major
>             Fix For: v3.1.0
>
>
> Hi, I find that in `BeelineHiveClient`, the method `getHiveTableRows` uses 
> "select count(*) from <tb_name>" to count table rows. The method is 
> invoked in the flat intermediate table redistribution step of cube building.
> The same figure can be loaded from the metastore as the `numRows` statistic, 
> which costs much less time than scanning all rows in the Hive table. Since 
> intermediate tables are created and populated by Kylin, statistics are 
> automatically calculated and stored in the metastore when 
> `[hive.stats.autogather|https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.stats.autogather]`
>  is enabled (which is the default setting for Hive). 
> See the Hive wiki for more detail about `numRows` stats: 
> [https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-ExistingTables%E2%80%93ANALYZE]
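
The idea in the description can be sketched as follows. This is a minimal illustration only, not the code from the commit: the class `NumRowsSketch` and its methods are hypothetical names, the input is assumed to be the result rows of a `DESCRIBE FORMATTED <tb_name>` statement already read into string arrays, and treating a missing or negative `numRows` as "no statistic" is an assumption about how absent stats show up.

```java
import java.util.List;
import java.util.OptionalLong;
import java.util.function.LongSupplier;

// Hypothetical helper, not part of Kylin: prefer the metastore's numRows
// statistic and fall back to a full count only when it is unavailable.
final class NumRowsSketch {

    // Scans result rows for a cell equal to "numRows" and parses the cell
    // right after it. The exact column layout is not assumed.
    static OptionalLong parseNumRows(List<String[]> rows) {
        for (String[] row : rows) {
            for (int i = 0; i + 1 < row.length; i++) {
                if (row[i] == null || row[i + 1] == null) {
                    continue;
                }
                if ("numRows".equals(row[i].trim())) {
                    try {
                        long n = Long.parseLong(row[i + 1].trim());
                        // Assumption: a negative value means stats were not gathered.
                        if (n >= 0) {
                            return OptionalLong.of(n);
                        }
                    } catch (NumberFormatException ignored) {
                        // Malformed value: keep scanning, then fall back.
                    }
                }
            }
        }
        return OptionalLong.empty();
    }

    // Returns the statistic when present; otherwise runs the expensive
    // fallback (for example, a "select count(*)" query supplied by the caller).
    static long rowCount(List<String[]> describeRows, LongSupplier countStarFallback) {
        return parseNumRows(describeRows).orElseGet(countStarFallback);
    }
}
```

The fallback is passed in as a `LongSupplier` so the full-table scan only runs when the statistic is missing, for instance when `hive.stats.autogather` has been disabled.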



--
This message was sent by Atlassian Jira
(v8.3.4#803005)