[ 
https://issues.apache.org/jira/browse/KYLIN-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15831504#comment-15831504
 ] 

kangkaisen commented on KYLIN-2308:
-----------------------------------

Hi Zhong, Jason. Thanks for your review.
The more precise count distinct metrics a cube has, the more query performance 
improves after adding more column families.
For example, one cube has 13 precise count distinct metrics, and the cardinality 
of one of those metrics is in the tens of millions.

One SQL query is as follows:

SELECT A, B, count(distinct uuid)
FROM table
WHERE dt = 17150
GROUP BY A, B

The uuid is a precise count distinct metric and the cardinality for it is about 
30000000.

The original query latency in HBase was about 6 seconds. After putting the 
uuid metric alone in its own columnFamily, the query latency in HBase dropped 
to about 1 second.
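
To make the idea concrete, here is a rough sketch of how the measures could be split so that the heavy count distinct metric gets its own columnFamily. It assumes the cube descriptor keeps this under an "hbase_mapping" section shaped as column_family -> columns -> measure_refs; the field names are written from memory and should be checked against your Kylin version. The measure names and the helper build_hbase_mapping are hypothetical and are not part of the patch.

```python
import json


def build_hbase_mapping(measures, isolated):
    """Give every measure named in `isolated` its own column family.

    All remaining measures share the first family. The output layout is an
    assumption about the cube descriptor format, not taken from the patch.
    """
    shared = [m for m in measures if m not in isolated]
    groups = ([shared] if shared else []) + [[m] for m in measures if m in isolated]
    return {
        "column_family": [
            {"name": f"F{i}", "columns": [{"qualifier": "M", "measure_refs": refs}]}
            for i, refs in enumerate(groups, start=1)
        ]
    }


# Hypothetical measure names; UUID_DISTINCT stands for count(distinct uuid).
measures = ["_COUNT_", "UUID_DISTINCT", "SUM_PRICE"]
mapping = build_hbase_mapping(measures, isolated={"UUID_DISTINCT"})
print(json.dumps(mapping, indent=2))
```

With this layout, a query that does not touch uuid never reads the large uuid values, and a query that needs only uuid reads just that one family.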

> Allow user to set more columnFamily in web 
> -------------------------------------------
>
>                 Key: KYLIN-2308
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2308
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Web 
>    Affects Versions: v1.6.0
>            Reporter: kangkaisen
>            Assignee: kangkaisen
>         Attachments: KYLIN-2308.patch
>
>
> Currently, when a user sets dozens of precise count distinct metrics in one 
> cube, we put all the count distinct metric columns in one columnFamily. This 
> makes the HBase scan slow because a single {{KeyValue}} is too big. We 
> could allow more column families to speed up the HBase scan in this scenario.



