[ https://issues.apache.org/jira/browse/KYLIN-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17114469#comment-17114469 ]
ASF subversion and git services commented on KYLIN-4315:
--------------------------------------------------------

Commit 24f0063daacb6732aa06f5abbe6d198f570ecf95 in kylin's branch refs/heads/master from xiacongling
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=24f0063 ]

KYLIN-4315 use metadata numRows in beeline client for quick row counting

> Use metadata numRows in beeline client for quick row counting
> -------------------------------------------------------------
>
>                 Key: KYLIN-4315
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4315
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Job Engine
>            Reporter: Congling Xia
>            Assignee: Congling Xia
>            Priority: Major
>             Fix For: v3.1.0
>
>
> Hi, I found that the `getHiveTableRows` method in `BeelineHiveClient` uses
> "select count(*) from <tb_name>" to count table rows. The method is
> invoked in the flat intermediate table redistribution step of cube building.
> The row count can instead be read from the table statistics in the
> metastore, which costs much less time than scanning all rows in the Hive
> table. Since intermediate tables are created and populated by Kylin,
> statistics are automatically calculated and stored in the metastore when
> `[hive.stats.autogather|https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.stats.autogather]`
> is enabled (which is the default setting for Hive).
> See the Hive wiki for more detail about the `numRows` stats:
> [https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-ExistingTables%E2%80%93ANALYZE]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
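
For readers who want to see the idea in the issue description in code, here is a minimal sketch. It is not the code from the commit. The class `RowCountSketch`, the method `rowCount`, and the `fullCount` fallback are hypothetical names made up for illustration. The sketch assumes the table parameters have already been fetched from the metastore as a plain string map, that the statistic is stored under the key `numRows`, and that a missing, negative, or unparsable value should fall back to the slower "select count(*)" query.

```java
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical helper for illustration only; not Kylin's actual implementation. */
final class RowCountSketch {

    // Key under which Hive records the table-level row count statistic.
    static final String NUM_ROWS = "numRows";

    private RowCountSketch() {
    }

    /**
     * Returns the row count recorded in the table parameters. If the
     * statistic is missing, negative, or not a number (an assumption about
     * how "unknown" may show up), runs the supplied full count instead.
     *
     * @param tableParams table parameters as read from the metastore; may be null
     * @param fullCount   fallback that performs the expensive count, e.g. "select count(*)"
     */
    static long rowCount(Map<String, String> tableParams, LongSupplier fullCount) {
        String raw = (tableParams == null) ? null : tableParams.get(NUM_ROWS);
        if (raw != null) {
            try {
                long rows = Long.parseLong(raw.trim());
                if (rows >= 0) {
                    return rows; // cheap path: no table scan needed
                }
            } catch (NumberFormatException ignored) {
                // Unusable statistic; fall through to the full count.
            }
        }
        return fullCount.getAsLong();
    }
}
```

A caller would pass the parameter map of the intermediate table together with a lambda that issues the count query, so the scan only runs when the statistic cannot be used. Keeping the fallback matters because `numRows` is only populated automatically when `hive.stats.autogather` was enabled at the time the table was written.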