[ 
https://issues.apache.org/jira/browse/KYLIN-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17135127#comment-17135127
 ] 

ASF GitHub Bot commented on KYLIN-4315:
---------------------------------------

RupengWang commented on pull request #1024:
URL: https://github.com/apache/kylin/pull/1024#issuecomment-643753684


   ### Verify test case
   #### Case 1
   1. Result of Kylin
   
![image](https://user-images.githubusercontent.com/9884693/84591771-a15e3800-ae73-11ea-85c5-962ff4abc817.png)
   2. Result of Hive
   ```sql
   DROP TABLE IF EXISTS kylin_intermediate_kylin_sales_cube_f91987ee_443f_cadb_f3cb_04accea358b6;
   CREATE EXTERNAL TABLE IF NOT EXISTS kylin_intermediate_kylin_sales_cube_f91987ee_443f_cadb_f3cb_04accea358b6
   (
   `KYLIN_SALES_TRANS_ID` bigint
   ,`KYLIN_SALES_PART_DT` date
   ,`KYLIN_SALES_LEAF_CATEG_ID` bigint
   ,`KYLIN_SALES_LSTG_SITE_ID` int
   ,`KYLIN_CATEGORY_GROUPINGS_META_CATEG_NAME` string
   ,`KYLIN_CATEGORY_GROUPINGS_CATEG_LVL2_NAME` string
   ,`KYLIN_CATEGORY_GROUPINGS_CATEG_LVL3_NAME` string
   ,`KYLIN_SALES_LSTG_FORMAT_NAME` string
   ,`KYLIN_SALES_SELLER_ID` bigint
   ,`KYLIN_SALES_BUYER_ID` bigint
   ,`BUYER_ACCOUNT_ACCOUNT_BUYER_LEVEL` int
   ,`SELLER_ACCOUNT_ACCOUNT_SELLER_LEVEL` int
   ,`BUYER_ACCOUNT_ACCOUNT_COUNTRY` string
   ,`SELLER_ACCOUNT_ACCOUNT_COUNTRY` string
   ,`BUYER_COUNTRY_NAME` string
   ,`SELLER_COUNTRY_NAME` string
   ,`KYLIN_SALES_OPS_USER_ID` string
   ,`KYLIN_SALES_OPS_REGION` string
   ,`KYLIN_SALES_PRICE` decimal(19,4)
   )
   STORED AS SEQUENCEFILE
   LOCATION 'hdfs://sandbox.hortonworks.com:8020/kylin/kylin_metadata_wrp_310/kylin-c9804baf-044f-4537-132a-81ce89d76e10/kylin_intermediate_kylin_sales_cube_f91987ee_443f_cadb_f3cb_04accea358b6';
   ALTER TABLE kylin_intermediate_kylin_sales_cube_f91987ee_443f_cadb_f3cb_04accea358b6 SET TBLPROPERTIES('auto.purge'='true');
   INSERT OVERWRITE TABLE `kylin_intermediate_kylin_sales_cube_f91987ee_443f_cadb_f3cb_04accea358b6` SELECT
   `KYLIN_SALES`.`TRANS_ID` as `KYLIN_SALES_TRANS_ID`
   ,`KYLIN_SALES`.`PART_DT` as `KYLIN_SALES_PART_DT`
   ,`KYLIN_SALES`.`LEAF_CATEG_ID` as `KYLIN_SALES_LEAF_CATEG_ID`
   ,`KYLIN_SALES`.`LSTG_SITE_ID` as `KYLIN_SALES_LSTG_SITE_ID`
   ,`KYLIN_CATEGORY_GROUPINGS`.`META_CATEG_NAME` as `KYLIN_CATEGORY_GROUPINGS_META_CATEG_NAME`
   ,`KYLIN_CATEGORY_GROUPINGS`.`CATEG_LVL2_NAME` as `KYLIN_CATEGORY_GROUPINGS_CATEG_LVL2_NAME`
   ,`KYLIN_CATEGORY_GROUPINGS`.`CATEG_LVL3_NAME` as `KYLIN_CATEGORY_GROUPINGS_CATEG_LVL3_NAME`
   ,`KYLIN_SALES`.`LSTG_FORMAT_NAME` as `KYLIN_SALES_LSTG_FORMAT_NAME`
   ,`KYLIN_SALES`.`SELLER_ID` as `KYLIN_SALES_SELLER_ID`
   ,`KYLIN_SALES`.`BUYER_ID` as `KYLIN_SALES_BUYER_ID`
   ,`BUYER_ACCOUNT`.`ACCOUNT_BUYER_LEVEL` as `BUYER_ACCOUNT_ACCOUNT_BUYER_LEVEL`
   ,`SELLER_ACCOUNT`.`ACCOUNT_SELLER_LEVEL` as `SELLER_ACCOUNT_ACCOUNT_SELLER_LEVEL`
   ,`BUYER_ACCOUNT`.`ACCOUNT_COUNTRY` as `BUYER_ACCOUNT_ACCOUNT_COUNTRY`
   ,`SELLER_ACCOUNT`.`ACCOUNT_COUNTRY` as `SELLER_ACCOUNT_ACCOUNT_COUNTRY`
   ,`BUYER_COUNTRY`.`NAME` as `BUYER_COUNTRY_NAME`
   ,`SELLER_COUNTRY`.`NAME` as `SELLER_COUNTRY_NAME`
   ,`KYLIN_SALES`.`OPS_USER_ID` as `KYLIN_SALES_OPS_USER_ID`
   ,`KYLIN_SALES`.`OPS_REGION` as `KYLIN_SALES_OPS_REGION`
   ,`KYLIN_SALES`.`PRICE` as `KYLIN_SALES_PRICE`
    FROM `DEFAULT`.`KYLIN_SALES` as `KYLIN_SALES`
   INNER JOIN `DEFAULT`.`KYLIN_CAL_DT` as `KYLIN_CAL_DT`
   ON `KYLIN_SALES`.`PART_DT` = `KYLIN_CAL_DT`.`CAL_DT`
   INNER JOIN `DEFAULT`.`KYLIN_CATEGORY_GROUPINGS` as `KYLIN_CATEGORY_GROUPINGS`
   ON `KYLIN_SALES`.`LEAF_CATEG_ID` = `KYLIN_CATEGORY_GROUPINGS`.`LEAF_CATEG_ID` AND `KYLIN_SALES`.`LSTG_SITE_ID` = `KYLIN_CATEGORY_GROUPINGS`.`SITE_ID`
   INNER JOIN `DEFAULT`.`KYLIN_ACCOUNT` as `BUYER_ACCOUNT`
   ON `KYLIN_SALES`.`BUYER_ID` = `BUYER_ACCOUNT`.`ACCOUNT_ID`
   INNER JOIN `DEFAULT`.`KYLIN_ACCOUNT` as `SELLER_ACCOUNT`
   ON `KYLIN_SALES`.`SELLER_ID` = `SELLER_ACCOUNT`.`ACCOUNT_ID`
   INNER JOIN `DEFAULT`.`KYLIN_COUNTRY` as `BUYER_COUNTRY`
   ON `BUYER_ACCOUNT`.`ACCOUNT_COUNTRY` = `BUYER_COUNTRY`.`COUNTRY`
   INNER JOIN `DEFAULT`.`KYLIN_COUNTRY` as `SELLER_COUNTRY`
   ON `SELLER_ACCOUNT`.`ACCOUNT_COUNTRY` = `SELLER_COUNTRY`.`COUNTRY`
   WHERE 1=1 AND (`KYLIN_SALES`.`PART_DT` >= '2012-01-01' AND `KYLIN_SALES`.`PART_DT` < '2012-02-01')
   ;
   select count(*) from kylin_intermediate_kylin_sales_cube_f91987ee_443f_cadb_f3cb_04accea358b6;
   ```
   




> Use metadata numRows in beeline client for quick row counting
> -------------------------------------------------------------
>
>                 Key: KYLIN-4315
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4315
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Job Engine
>            Reporter: Congling Xia
>            Assignee: Congling Xia
>            Priority: Major
>             Fix For: v3.1.0
>
>
> Hi, I found that in `BeelineHiveClient`, the method `getHiveTableRows` uses 
> "select count(*) from <tb_name>" to count table rows. The method is invoked 
> in the flat intermediate table redistribution step of cube building.
> The same row count is available as the `numRows` statistic in the metastore, 
> and reading it costs much less time than scanning all rows in the Hive table. 
> Since intermediate tables are created and populated by Kylin, statistics are 
> automatically calculated and stored in the metastore when 
> `[hive.stats.autogather|https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.stats.autogather]`
>  is enabled (which is the default setting for Hive). 
> See the Hive wiki for more detail about `numRows` stats: 
> [https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-ExistingTables%E2%80%93ANALYZE]
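
The idea in the description above can be sketched roughly as follows. This is only an illustration, not the code from the pull request: `RowCountSketch` and `rowCount` are hypothetical names, the table parameters are assumed to have been fetched already into a plain map, and treating a negative or unparsable `numRows` as "statistic unavailable" is an assumption of this sketch. The fallback supplier stands in for the existing "select count(*)" query.

```java
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical helper for illustration; not Kylin's actual BeelineHiveClient code.
final class RowCountSketch {

    // Table-parameter key under which Hive records the row-count statistic.
    static final String NUM_ROWS = "numRows";

    /**
     * Returns the numRows statistic when it is present and parses to a
     * non-negative number; otherwise runs the (expensive) fallback count.
     */
    static long rowCount(Map<String, String> tableParams, LongSupplier countStarFallback) {
        String raw = tableParams.get(NUM_ROWS);
        if (raw != null) {
            try {
                long n = Long.parseLong(raw.trim());
                // Assumption of this sketch: a negative value means "no usable stats".
                if (n >= 0) {
                    return n;
                }
            } catch (NumberFormatException e) {
                // Malformed statistic: fall through to the full count.
            }
        }
        return countStarFallback.getAsLong();
    }

    public static void main(String[] args) {
        // Stats present: the fallback is never invoked.
        System.out.println(rowCount(Map.of(NUM_ROWS, "1234"), () -> -1L));
        // Stats missing: the fallback stands in for "select count(*)".
        System.out.println(rowCount(Map.of(), () -> 42L));
    }
}
```

Keeping the full count as a fallback means a table without gathered statistics (for example when `hive.stats.autogather` is disabled) still yields a row count, only more slowly.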



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
