Re: some confuse about Mandatory Dimensions

2016-11-17 Thread ShaoFeng Shi
xiaoming, Kylin saves the HyperLogLog or Bitmap for the distinct count
measure (not just a number!), which means they are mergeable for complex
queries. So even if you mark A+B+C as mandatory, when you query for a certain
sub-combination like A, it will merge those HLLs or Bitmaps again to
return what you want.
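
To make the merge idea concrete, here is a minimal Python sketch. It is not
Kylin code: a plain Python set stands in for the Bitmap/HLL stored in each
cuboid row, and the names cuboid, count_distinct and naive_sum as well as the
sample values are made up for illustration.

```python
# Each cuboid row (A, B, C) keeps a mergeable set of "u" values.
# The set is an assumed stand-in for Kylin's Bitmap or HLL sketch.
cuboid = {
    ("a1", "b1", "c1"): {"u1", "u2"},
    ("a1", "b1", "c2"): {"u2", "u3"},
    ("a1", "b2", "c1"): {"u1", "u4"},
    ("a2", "b1", "c1"): {"u5"},
}

DIMS = ("A", "B", "C")


def matching_rows(cuboid, filters):
    """Yield the stored sets of all rows that satisfy the filters."""
    for key, users in cuboid.items():
        row = dict(zip(DIMS, key))
        if all(row[name] == value for name, value in filters.items()):
            yield users


def count_distinct(cuboid, **filters):
    """Merge (union) the stored sets first, then count."""
    merged = set()
    for users in matching_rows(cuboid, filters):
        merged |= users
    return len(merged)


def naive_sum(cuboid, **filters):
    """Add up per-row distinct counts; overcounts shared values."""
    return sum(len(users) for users in matching_rows(cuboid, filters))


print(count_distinct(cuboid, A="a1"))  # merged structures
print(naive_sum(cuboid, A="a1"))       # plain numbers added up
```

With this sample data, merging gives 4 distinct values for A=a1, while adding
the per-row numbers gives 6, because u1 and u2 each appear in two rows. That
is why storing only a number would not be enough.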

2016-11-17 14:18 GMT+08:00 张晓明(zhangxiaoming)-技术产品中心 <zhangxiaom...@qiyi.com>:

> Thanks Billy
>
> If Kylin saves the results separately by A, B, C, a plain count is easy to
> understand. But “count distinct” has to merge identical “u” values, so the
> per-group results can't simply be added together.
>
>
>
> *From:* Billy(Yiming) Liu [mailto:liuyiming@gmail.com]
> *Sent:* Thursday, November 17, 2016 2:05 PM
> *To:* user <user@kylin.apache.org>
> *Cc:* d...@kylin.apache.org
> *Subject:* Re: some confuse about Mandatory Dimensions
>
>
>
> If you set A, B, and C as mandatory dimensions, that means Kylin will save
> the cuboid result grouped by A, B, C internally. But that does not mean you
> can only query by grouping on A, B, C. If you query only A, B, the final
> result is produced by post-aggregation over the above cuboid. The same goes
> for a query grouping by A. The cost is performance, since more
> post-aggregation is needed. But if you query by grouping on D, there would
> be no result, since you missed the mandatory dimension.
>
>
>
> 2016-11-17 13:31 GMT+08:00 张晓明(zhangxiaoming)-技术产品中心 <
> zhangxiaom...@qiyi.com>:
>
> Hi all,
>
> I have created a cube in my system with mandatory dimensions A, B, C, and
> a measure that uses count distinct on the field “u” with HLL.
>
> When the cube segment was complete, I queried it with Kylin SQL such as
> “select count(distinct u) from table where A=xxx and b=yyy” or “select
> count(distinct u) from table where A=xxx”. The result is correct.
>
> In my understanding, all of the query conditions (A=xxx,B=,C=zzz) must be
> set for the Kylin SQL to work.
>
> My question is: how does Kylin answer the query so that the distinct value
> is right? That seems unbelievable.
>
>
>
>
>
> --
>
> With Warm regards
>
> Yiming Liu (刘一鸣)
>



-- 
Best regards,

Shaofeng Shi 史少锋


Re: some confuse about Mandatory Dimensions

2016-11-16 Thread Billy(Yiming) Liu
If you set A, B, and C as mandatory dimensions, that means Kylin will save
the cuboid result grouped by A, B, C internally. But that does not mean you
can only query by grouping on A, B, C. If you query only A, B, the final
result is produced by post-aggregation over the above cuboid. The same goes
for a query grouping by A. The cost is performance, since more
post-aggregation is needed. But if you query by grouping on D, there would
be no result, since you missed the mandatory dimension.
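
As a rough illustration of the post-aggregation step, here is a small Python
sketch. It is not Kylin's implementation: the stored cuboid is modelled as a
dict keyed by (A, B, C) holding a simple additive measure, and cuboid_abc,
post_aggregate and the sample numbers are assumptions made up for the example.

```python
from collections import defaultdict

DIMS = ("A", "B", "C")

# Hypothetical stored cuboid: one pre-aggregated SUM per (A, B, C) group.
cuboid_abc = {
    ("a1", "b1", "c1"): 10,
    ("a1", "b1", "c2"): 5,
    ("a1", "b2", "c1"): 7,
    ("a2", "b1", "c1"): 3,
}


def post_aggregate(cuboid, group_by):
    """Roll the (A, B, C) cuboid up to a coarser grouping at query time."""
    positions = [DIMS.index(dim) for dim in group_by]
    result = defaultdict(int)
    for key, value in cuboid.items():
        # Keep only the requested dimensions and fold the rest together.
        result[tuple(key[i] for i in positions)] += value
    return dict(result)


print(post_aggregate(cuboid_abc, ("A", "B")))
print(post_aggregate(cuboid_abc, ("A",)))
```

Grouping by A alone folds more stored rows into each output row than grouping
by A, B, which is the extra query-time work mentioned above.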

2016-11-17 13:31 GMT+08:00 张晓明(zhangxiaoming)-技术产品中心 :

> Hi all,
>
> I have created a cube in my system with mandatory dimensions A, B, C, and
> a measure that uses count distinct on the field “u” with HLL.
>
> When the cube segment was complete, I queried it with Kylin SQL such as
> “select count(distinct u) from table where A=xxx and b=yyy” or “select
> count(distinct u) from table where A=xxx”. The result is correct.
>
> In my understanding, all of the query conditions (A=xxx,B=,C=zzz) must be
> set for the Kylin SQL to work.
>
> My question is: how does Kylin answer the query so that the distinct value
> is right? That seems unbelievable.
>



-- 
With Warm regards

Yiming Liu (刘一鸣)