[
https://issues.apache.org/jira/browse/KYLIN-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15831504#comment-15831504
]
kangkaisen commented on KYLIN-2308:
-----------------------------------
Hi Zhong, Jason. Thanks for your review.
The more precise count distinct metrics a cube has, the more query performance
improves after spreading them across more column families.
For example, one cube has 13 precise count distinct metrics, and the cardinality
of one metric is in the tens of millions.
One SQL query is as follows:
SELECT A, B, count(distinct uuid)
FROM table
WHERE dt = 17150
GROUP BY A, B
Here uuid is a precise count distinct metric with a cardinality of about
30,000,000.
The original query latency in HBase was about 6 seconds. After putting the uuid
metric alone in its own column family, the query latency in HBase dropped to
about 1 second.
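To illustrate the layout described above, here is a minimal sketch of a column family mapping in which every precise count distinct metric gets its own family while the lightweight metrics share one. The helper name build_hbase_mapping, the family names F1, F2, ..., the qualifier M, and the measure names are all assumptions for illustration; the dictionary shape is modeled on the hbase_mapping section of a cube descriptor and should be checked against the attached patch.

```python
import json


def build_hbase_mapping(measures, per_family=1):
    """Hypothetical helper: group measures into column families.

    measures   -- list of (measure_name, is_precise_count_distinct) pairs
    per_family -- how many precise count distinct measures share one family
    """
    light = [name for name, is_heavy in measures if not is_heavy]
    heavy = [name for name, is_heavy in measures if is_heavy]

    # Lightweight measures stay together in the first family.
    groups = [light] if light else []
    # Large count distinct measures are split into separate families so a
    # scan for one of them does not have to read the others.
    groups += [heavy[i:i + per_family] for i in range(0, len(heavy), per_family)]

    families = [
        {"name": f"F{idx}", "columns": [{"qualifier": "M", "measure_refs": refs}]}
        for idx, refs in enumerate(groups, start=1)
    ]
    return {"column_family": families}


# Example measure names are assumptions, not taken from a real cube.
mapping = build_hbase_mapping(
    [("_COUNT_", False), ("UUID_DISTINCT", True), ("USER_DISTINCT", True)]
)
print(json.dumps(mapping, indent=2))
```

With per_family=1 the uuid metric ends up alone in its family, which is the setup measured above; a larger value trades fewer families for bigger rows per family.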
> Allow user to set more columnFamily in web
> -------------------------------------------
>
> Key: KYLIN-2308
> URL: https://issues.apache.org/jira/browse/KYLIN-2308
> Project: Kylin
> Issue Type: Improvement
> Components: Web
> Affects Versions: v1.6.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Attachments: KYLIN-2308.patch
>
>
> Currently, when a user sets dozens of precise count distinct metrics in one
> cube, we put all the count distinct metric columns in one columnFamily. This
> makes the HBase scan slow because a single {{KeyValue}} is too big. We
> could set more columnFamily to speed up the HBase scan in this scenario.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)