Hi,
That is a good question. Kylin uses a Global Dictionary to encode each string 
as an integer, not a specific hash function; please refer to 
http://kylin.apache.org/blog/2016/08/01/count-distinct-in-kylin/ for its design. 
Newer versions can also build the global dictionary in a distributed way, see 
http://kylin.apache.org/docs30/howto/howto_use_hive_mr_dict.html. As for the 
cardinality, it is calculated in the "Extract Fact Table Distinct Columns" 
step.
If you are interested in the details and can read Chinese, please see the 
following articles for further information:
https://blog.bcmeng.com/post/kylin-distinct-count-global-dict.html
https://blog.bcmeng.com/post/kylin-distinct-count.html
https://hexiaoqiao.github.io/blog/2016/11/27/exact-count-and-global-dictionary-of-apache-kylin/
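
To illustrate why no range has to be chosen in advance, here is a minimal 
Python sketch of the idea. It is not Kylin's actual code: the class 
ToyGlobalDictionary and the function count_distinct are hypothetical names 
made up for this example, and a plain Python int stands in for a real 
compressed bitmap. The point is only that a dictionary hands out the next 
unused integer to each new value, so the IDs stay within 
0 .. cardinality - 1 and the bitmap never needs more bits than there are 
distinct values.

```python
class ToyGlobalDictionary:
    """Toy stand-in for a global dictionary: value -> dense integer ID."""

    def __init__(self):
        self._ids = {}

    def encode(self, value: str) -> int:
        # A value seen for the first time gets the next unused ID,
        # so IDs always fall in the range 0 .. cardinality - 1.
        if value not in self._ids:
            self._ids[value] = len(self._ids)
        return self._ids[value]

    def __len__(self) -> int:
        # Number of distinct values encoded so far.
        return len(self._ids)


def count_distinct(values, dictionary) -> int:
    """Exact distinct count: set one bit per encoded ID, then count set bits."""
    bitmap = 0  # a Python int used as a simple, growable bitmap (assumption)
    for v in values:
        bitmap |= 1 << dictionary.encode(v)
    return bin(bitmap).count("1")


dictionary = ToyGlobalDictionary()
users = ["alice", "bob", "alice", "carol", "bob"]
distinct_users = count_distinct(users, dictionary)
print(distinct_users, len(dictionary))
```

With a hash function the output range has to be fixed up front; with a 
dictionary the range simply grows with the data, at the cost of having to 
build and store the dictionary.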

----------------
Best wishes,
Xiaoxiang Yu


From: "[email protected]" <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Friday, September 27, 2019 10:23
To: kylin-user <[email protected]>
Subject: global dictionary scople

Hi,
    As the official documentation describes, Kylin uses bitmaps to achieve 
precise count distinct.
    The premise is that each value is mapped to a number using a specific hash 
function. But if we don't know the cardinality in advance, how do we determine 
the range of the mapped numbers?
    If the range is set too large, memory is wasted; if it is set too 
small, it is not enough.


________________________________
[email protected]
