Yerui Sun created KYLIN-1379:
--------------------------------
Summary: More stable precise count distinct implementation after
KYLIN-1186
Key: KYLIN-1379
URL: https://issues.apache.org/jira/browse/KYLIN-1379
Project: Kylin
Issue Type: Improvement
Components: Job Engine
Affects Versions: v2.1, v1.3
Reporter: Yerui Sun
Assignee: Yerui Sun
After KYLIN-1186, we gained the ability to compute a precise count distinct on
int-type columns. However, the implementation from KYLIN-1186 is not stable,
especially on the 2.x-staging branch.
The reason is that the 2.x version uses the measure's maxlength to allocate
memory, and KYLIN-1186 hardcodes the BitmapMeasure maxlength to 8MB, which
causes OOM during cube building.
To resolve this problem, we have introduced a precision on the bitmap measure,
such as bitmap(100), bitmap(10000), or bitmap(1000000), meaning the measure
accepts a cardinality of at most 100, 10000, or 1M respectively. This should be
sufficient in practice: if the count exceeds 1000000, the hyperloglog measure,
which produces an approximate result, should be acceptable.
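
The precision idea above can be sketched roughly as follows. This is only an
illustration, not the actual patch: the class name, the method names, the
regular expression, and the 4-bytes-per-value worst-case estimate are all
assumptions made for this example, and Kylin's real sizing logic may differ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Kylin: derives limits from a "bitmap(N)" type.
public class BitmapPrecisionSketch {
    // Assumed textual form of the measure type, e.g. "bitmap(10000)".
    private static final Pattern BITMAP_TYPE = Pattern.compile("bitmap\\((\\d+)\\)");

    // Cardinality above which this sketch falls back to an approximate measure.
    private static final long MAX_PRECISE_CARDINALITY = 1_000_000L;

    /** Returns the maximum cardinality declared by a type such as "bitmap(10000)". */
    static int parseMaxCardinality(String measureType) {
        Matcher m = BITMAP_TYPE.matcher(measureType.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a bitmap(N) type: " + measureType);
        }
        return Integer.parseInt(m.group(1));
    }

    /** Assumed worst case of 4 bytes per distinct int value (illustrative only). */
    static long worstCaseBytes(int maxCardinality) {
        return 4L * maxCardinality;
    }

    /** Picks a precise bitmap up to 1M distinct values, otherwise hyperloglog. */
    static String chooseMeasure(long expectedCardinality) {
        return expectedCardinality <= MAX_PRECISE_CARDINALITY ? "bitmap" : "hyperloglog";
    }

    public static void main(String[] args) {
        String[] types = {"bitmap(100)", "bitmap(10000)", "bitmap(1000000)"};
        for (String type : types) {
            int max = parseMaxCardinality(type);
            System.out.println(type + ": at most " + max + " distinct values, about "
                    + worstCaseBytes(max) + " bytes under the assumed estimate");
        }
    }
}
```

The point of the sketch is that the buffer size follows from the declared
precision instead of a fixed 8MB, so a low-cardinality measure reserves far
less memory during cube building.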