Shaofeng Shi,
    [...]
    
[...] father [...] key, city, amt, a, b, c [...] 1 [...] child [...]
    1. [...] 10%: select count(1) from child / select count(1) from father [...] 10% [...]
    2. [...] 5%: select sum(province(city)) from child group by province(city) / select sum(province(city)) from father group by province(city) [...] 5% [...] province [...] province(city) [...] city [...] province [...] udf [...] 1%: select sum(city) from child group by city / select sum(city) from father group by city [...] city [...]
    3. [...] 100 [...] 10% [...] 90~110 [...]: 90 [...] select sum(amt) from child [...] 110 [...]
    4. [...] 20%: select a / (a+b+c) from child [...] 20%, select b / (a+b+c) from child [...] 20%, select c / (a+b+c) from child [...] 20% [...] a, b, c [...]
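
For reference, the quantities that the SQL fragments in the rules above compare can be computed in plain SQL. The following is only a minimal Python sketch using the standard-library sqlite3 module, not Kylin, Hive, or Spark. It assumes two tables named father and child with the columns key, city, amt, a, b, c; it assumes the per-city rule compares row counts and that the a, b, c shares are taken over column totals. The province(city) UDF is not defined here, so the province-level rule is left out. The helper names rule_metrics and demo_connection and the sample rows are hypothetical, and the thresholds are left to the caller.

```python
import sqlite3


def rule_metrics(conn):
    """Return the ratios and totals the rules refer to (no thresholds applied)."""
    cur = conn.cursor()

    # Rule 1: row count of child relative to father.
    child_n = cur.execute("select count(1) from child").fetchone()[0]
    father_n = cur.execute("select count(1) from father").fetchone()[0]

    # Rule 2, city level (assumption: compare per-city row counts).
    city_ratio = {
        city: c_n / f_n
        for city, c_n, f_n in cur.execute(
            """
            select f.city, coalesce(c.n, 0), f.n
            from (select city, count(1) as n from father group by city) f
            left join (select city, count(1) as n from child group by city) c
              on c.city = f.city
            """
        )
    }

    # Rule 3: total amt of the child dataset.
    total_amt = cur.execute("select sum(amt) from child").fetchone()[0]

    # Rule 4 (assumption: shares are computed over column totals of child).
    sa, sb, sc = cur.execute(
        "select sum(a), sum(b), sum(c) from child"
    ).fetchone()
    total = sa + sb + sc

    return {
        "size_ratio": child_n / father_n,
        "city_ratio": city_ratio,
        "total_amt": total_amt,
        "shares": {"a": sa / total, "b": sb / total, "c": sc / total},
    }


def demo_connection():
    """Build a tiny in-memory father/child pair with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'create table father ("key" integer, city text, amt real, '
        "a real, b real, c real)"
    )
    rows = [(k, "X" if k <= 5 else "Y", 10, 1, 1, 2) for k in range(1, 11)]
    conn.executemany("insert into father values (?, ?, ?, ?, ?, ?)", rows)
    # Hypothetical candidate subset: one row from each city.
    conn.execute('create table child as select * from father where "key" in (1, 6)')
    return conn
```

With the sample rows, rule_metrics(demo_connection()) reports a size ratio of 0.2 and a total amt of 20; a caller would then compare each value against whatever limits apply. This only checks a given candidate child; it does not search for one.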

    
[...] Apache Kylin [...]




------------------ Original Message ------------------
From: "ShaoFeng Shi" <[email protected]>;
Sent: Saturday, May 16, 2020, 11:24
To: "user" <[email protected]>;

Subject: Re: [...]



Hi Xiang,

I'm not sure whether Kylin can help. Can Hive or Spark SQL fulfill the 
requirement? If you can provide a couple of SQL queries, that would help us 
see whether Kylin is a fit.


Best regards,

Shaofeng Shi
Apache Kylin PMC
Email: [email protected]


Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: [email protected]
Join Kylin dev mail group: [email protected]

On Friday, May 15, 2020 at 1:18, <[email protected]> wrote:

Hello, everyone,
We have a business requirement: from a large dataset, select sub-datasets 
that satisfy multiple rules at the same time. The rules are complex and differ 
from one scenario to another. For example, the proportion of any single city 
in the data source cannot exceed 15% (users can adjust the 15% on demand), the 
proportions of various calculated business values must not exceed specific 
limits, and so on. Can this requirement be met with Apache Kylin? If so, what 
approach should we take? Is there any documentation or demo for reference? 
Would other tools be needed as well? Thanks a lot.
