Hi,

For COUNT_DISTINCT measures, the value is stored in HBase as a large byte array that can be decoded into a bitmap or an HllCounter, not as a simple Java primitive data type. So the assumption that "they are just a single number" is not correct.
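To make the difference concrete, here is a minimal sketch of why a distinct-count cell cannot be a single number. It is not Kylin's code: `java.util.BitSet` stands in for the real bitmap counter, and the class and method names (`MeasureEncodingSketch`, `encodeSum`, `encodeDistinct`, `mergeDistinct`, `decodeDistinctCount`) are made up for illustration. The point is that two cells can only be merged correctly if each one keeps the set of values it has seen, so the cell has to hold a serialized set rather than a count.

```java
import java.nio.ByteBuffer;
import java.util.BitSet;

// Illustrative only: BitSet is a stand-in for Kylin's bitmap counter,
// and all names here are hypothetical.
class MeasureEncodingSketch {

    // A simple measure such as SUM fits in a fixed 8-byte cell.
    static byte[] encodeSum(long sum) {
        return ByteBuffer.allocate(Long.BYTES).putLong(sum).array();
    }

    // An exact distinct count keeps the set of seen ids as a bitmap,
    // so the cell grows with the range of ids it has to cover.
    static byte[] encodeDistinct(int[] ids) {
        BitSet bits = new BitSet();
        for (int id : ids) {
            bits.set(id);
        }
        return bits.toByteArray();
    }

    // Merging two cells is a set union; adding two counts would
    // double-count ids that appear in both cells.
    static byte[] mergeDistinct(byte[] a, byte[] b) {
        BitSet merged = BitSet.valueOf(a);
        merged.or(BitSet.valueOf(b));
        return merged.toByteArray();
    }

    // The number shown to the user is derived from the decoded bitmap.
    static int decodeDistinctCount(byte[] cell) {
        return BitSet.valueOf(cell).cardinality();
    }

    public static void main(String[] args) {
        byte[] day1 = encodeDistinct(new int[] {1, 2, 3});
        byte[] day2 = encodeDistinct(new int[] {3, 4});
        System.out.println("distinct over both days: "
                + decodeDistinctCount(mergeDistinct(day1, day2)));
        System.out.println("sum cell bytes: " + encodeSum(42L).length);
        System.out.println("distinct cell bytes for one large id: "
                + encodeDistinct(new int[] {1_000_000}).length);
    }
}
```

An approximate distinct count works the same way in principle, except that the cell holds HyperLogLog registers instead of an exact bitmap.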
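Since these values are always larger than simple measures (sum/min/max), storing them in a separate column family is a good choice: HBase keeps each column family in its own files, so a query that touches only simple measures does not have to read the large byte arrays and runs more efficiently.

Please check https://github.com/apache/kylin/tree/master/core-metadata/src/main/java/org/apache/kylin/measure for a more accurate answer. If you find any mistake, please let me know.

----------------
Best wishes,
Xiaoxiang Yu

From: TUESDAY <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, February 12, 2019 19:43
To: user <[email protected]>
Subject: How are Kylin's TopN and count distinct stored?

I have had a question for a while. In Kylin's HBase storage, the rowkey is built from the combination of dimensions, and the column family holds the values for that combination. So why are measures such as TopN and count distinct stored in a separate column family? Aren't they just a single number (for exact computation; and if the computation is not exact, what are they then)?

Thanks! I was not sure whether the message was sent successfully, so I sent it again.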
