This sounds like a kind of threshold query. Searching for that term should turn up plenty of information.
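
To make that concrete, here is a rough sketch of one such threshold check (a single city's share of rows, the 15% example from the original question). It is only an illustration under assumptions: the table name `child` and the columns `city` and `amt` follow the quoted message, the sample rows are made up, the helper name `cities_over_threshold` is hypothetical, and SQLite (via Python's standard `sqlite3` module) merely stands in for Hive, Spark SQL, or Kylin.

```python
import sqlite3


def cities_over_threshold(conn, max_share):
    """Return (city, share) rows whose share of `child` rows exceeds max_share.

    Hypothetical helper for illustration; table and column names are assumed.
    """
    sql = """
        SELECT city,
               COUNT(*) * 1.0 / (SELECT COUNT(*) FROM child) AS share
        FROM child
        GROUP BY city
        HAVING COUNT(*) * 1.0 / (SELECT COUNT(*) FROM child) > ?
        ORDER BY city
    """
    return conn.execute(sql, (max_share,)).fetchall()


# Made-up sample data: 10 rows, city A has 3 rows, B has 2, C..G have 1 each.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE child (city TEXT, amt REAL)")
rows = [("A", 10.0)] * 3 + [("B", 10.0)] * 2 + [(c, 10.0) for c in "CDEFG"]
conn.executemany("INSERT INTO child VALUES (?, ?)", rows)

# Cities that break a 15% cap (the cap is a parameter, so users can adjust it).
violations = cities_over_threshold(conn, 0.15)
print([city for city, _share in violations])
```

The same GROUP BY ... HAVING shape should carry over to other SQL engines, though whether the subquery inside HAVING is accepted may vary by engine.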

On 2020/5/18 3:51, [sender name lost in encoding] wrote:
Shaofeng Shi,

[The Chinese text of this message was lost to an encoding error. Only the ASCII fragments survive: a source table "father" with fields key, city, amt, a, b and c; a candidate subset "child"; and four numbered rules. Comparison operators are unreadable and are shown as "??".]

  1. (10%) select count(1) from child / select count(1) from father ?? 10%
  2. (5%) select sum(province(city)) from child group by province(city) / select sum(province(city)) from father group by province(city) ?? 5%, where province appears to be a UDF applied to city;
     (1%) select sum(city) from child group by city / select sum(city) from father group by city
  3. (100, 10%, 90~110) 90 ?? select sum(amt) from child ?? 110
  4. (20%) select a / (a+b+c) from child ?? 20%, select b / (a+b+c) from child ?? 20%, select c / (a+b+c) from child ?? 20%

[The closing paragraph mentions Apache Kylin; the rest is unreadable.]
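
For what it is worth, rules 1 and 3 above could be checked together along these lines. This is a sketch, not a Kylin recipe, and it rests on assumptions: the unreadable comparison operators are taken to mean "at most 10%" and "between 90 and 110 inclusive", the function name `check_rules` and its parameters are hypothetical, the data is made up, and SQLite stands in for the real engine.

```python
import sqlite3


def check_rules(conn, max_row_ratio=0.10, amt_low=90.0, amt_high=110.0):
    """Return {rule name: passed?} for the candidate subset `child`.

    Assumed reading of the rules (the operators in the thread are unreadable):
      row_ratio: count(child) / count(father) <= max_row_ratio
      amt_range: amt_low <= sum(child.amt) <= amt_high
    """
    child_rows = conn.execute("SELECT COUNT(1) FROM child").fetchone()[0]
    father_rows = conn.execute("SELECT COUNT(1) FROM father").fetchone()[0]
    amt_sum = conn.execute(
        "SELECT COALESCE(SUM(amt), 0) FROM child"
    ).fetchone()[0]
    return {
        # Guard against an empty source table before dividing.
        "row_ratio": father_rows > 0
        and child_rows / father_rows <= max_row_ratio,
        "amt_range": amt_low <= amt_sum <= amt_high,
    }


# Made-up data: 20 source rows, 2 of them selected into the subset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE father (city TEXT, amt REAL)")
conn.execute("CREATE TABLE child (city TEXT, amt REAL)")
conn.executemany("INSERT INTO father VALUES (?, ?)", [("A", 50.0)] * 20)
conn.executemany("INSERT INTO child VALUES (?, ?)", [("A", 50.0)] * 2)

result = check_rules(conn)
print(result)
```

Note that this only validates a given subset against the rules; choosing which rows go into `child` so that every rule holds at once is a separate search problem that these queries do not solve.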


------------------ Original Message ------------------
*From:* "ShaoFeng Shi"<[email protected]>;
*Date:* May 16, 2020 (Saturday), 11:24
*To:* "user"<[email protected]>;
*Subject:* Re: [subject lost in encoding]

Hi Xiang,

I'm not sure whether Kylin can help. Can Hive or Spark SQL fulfill the requirement? If you can provide a couple of SQL queries, that would help us see whether Kylin is a fit.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: [email protected]

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: [email protected]
Join Kylin dev mail group: [email protected]




[sender name lost in encoding] <[email protected]> wrote on Friday, May 15, 2020 at 1:18:

    Hello everyone,
    We have a business requirement: from a large volume of data, select
    sub-datasets that satisfy multiple rules at the same time. The rules
    are complex and differ from scenario to scenario. For example, the
    share of any single city in the data source cannot exceed 15% (the
    15% threshold is user-adjustable), the proportions of various
    calculated business values must not exceed specific limits, and so
    on. Can Apache Kylin meet this requirement? If so, what approach
    should we take? Is there any documentation or a demo for reference?
    Does it need to be combined with other tools? Thanks a lot.

