[ 
https://issues.apache.org/jira/browse/KYLIN-2903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wang, Gang updated KYLIN-2903:
------------------------------
    Attachment: 0001-KYLIN-2903-support-cardinality-calculation-for-Hive-.patch

Attached a patch.
One way is to use the HQL 'COUNT DISTINCT' statement to calculate column 
cardinality, and use 'INSERT OVERWRITE DIRECTORY' to write the result to the 
output path. So that the following step, HiveColumnCardinalityUpdateJob, can 
recognize it, the output needs to follow this format:
column1 cardinality
column2 cardinality
column3 cardinality
.....

This format can be produced by setting 'ROW FORMAT DELIMITED' and adding a 
line break in the HQL.
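
To make the idea concrete, here is a minimal sketch of how such an HQL statement could be generated. It is not taken from the attached patch: the function name build_cardinality_hql, the tab separator, and the aliases col_name and cardinality are assumptions for illustration only. It also uses UNION ALL to get one row per column, which is a different technique from adding a line break in the HQL and scans the view once per column.

```python
def build_cardinality_hql(view, columns, output_dir, sep="\\t"):
    """Build an HQL string that writes one '<column><sep><cardinality>' row
    per column into output_dir.

    Hypothetical helper, not part of Kylin. Identifiers and paths are
    interpolated as-is, so callers must pass trusted values.
    """
    if not columns:
        raise ValueError("at least one column is required")

    # One SELECT per column: the column name as a literal, then its
    # distinct count. Backticks quote the column identifier for Hive.
    selects = [
        "SELECT '{c}' AS col_name, COUNT(DISTINCT `{c}`) AS cardinality "
        "FROM {v}".format(c=col, v=view)
        for col in columns
    ]
    body = "\n  UNION ALL\n  ".join(selects)

    # The UNION ALL is wrapped in a subquery so the statement does not
    # depend on top-level UNION support. The field separator is assumed;
    # it must match whatever the downstream step expects.
    return (
        "INSERT OVERWRITE DIRECTORY '{d}'\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY '{s}'\n"
        "SELECT t.col_name, t.cardinality FROM (\n  {b}\n) t"
    ).format(d=output_dir, s=sep, b=body)


if __name__ == "__main__":
    # Example with made-up view, column names, and output path.
    print(build_cardinality_hql("db.some_view", ["column1", "column2"],
                                "/tmp/cardinality_out"))
```

The separator between the column name and the cardinality has to match what HiveColumnCardinalityUpdateJob parses, so treat the tab used here as a placeholder.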

> support cardinality calculation for Hive view
> ---------------------------------------------
>
>                 Key: KYLIN-2903
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2903
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Job Engine
>            Reporter: Wang, Gang
>            Assignee: Wang, Gang
>            Priority: Minor
>         Attachments: 
> 0001-KYLIN-2903-support-cardinality-calculation-for-Hive-.patch
>
>
> Currently, Kylin leverages HCatalog to calculate column cardinality for Hive 
> tables. However, HCatalog does not actually support Hive views. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
