Shaofeng Shi [greeting and opening sentences lost to a character-encoding error]

[...] a table, father, with the columns key, city, amt, a, b, and c [...] a subset of it, child, [...] The surviving fragments describe four rules. In the queries below, [?] marks a comparison operator that did not survive the encoding error.

1. [description lost] 10%:
   select count(1) from child / select count(1) from father [?] 10%

2. [description lost] 5%:
   select sum(province(city)) from child group by province(city) / select sum(province(city)) from father group by province(city) [?] 5%
   [...] province [...] province(city) is a UDF that takes a city and returns its province [...]
   [description lost] 1%:
   select sum(city) from child group by city / select sum(city) from father group by city
   [...] city [...]

3. [description lost] 100, with a tolerance of 10%, that is, 90 ~ 110:
   90 <= select sum(amt) from child <= 110

4. [description lost] 20%:
   select a / (a+b+c) from child [?] 20%,
   select b / (a+b+c) from child [?] 20%,
   select c / (a+b+c) from child [?] 20%
   [...] a, b, c [...]
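The four rules above can be checked against a candidate child table with ordinary SQL, which is roughly what the Hive/Spark SQL question in the reply is asking about. The following is a minimal sketch under stated assumptions, not a Kylin feature: it uses Python's built-in sqlite3 module only as a stand-in engine, the table contents are made up, and build_demo and check_rules are hypothetical names. It assumes rules 1 and 2 are upper bounds (the original question says a proportion "cannot exceed" a limit), counts rows per city where the message writes sum(city), and omits the province variant because it needs the province(city) UDF. For rule 4 it assumes a lower bound on shares computed from column sums: the three shares always add up to 100%, so they cannot all stay below 20% at once.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with a made-up `father` table and a candidate `child`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE father ("key" INTEGER, city TEXT, amt REAL, a REAL, b REAL, c REAL)'
    )
    # 20 illustrative rows: keys 1-10 in city X, keys 11-20 in city Y.
    rows = [(k, "X" if k <= 10 else "Y", 50.0, 1.0, 1.0, 2.0) for k in range(1, 21)]
    conn.executemany("INSERT INTO father VALUES (?, ?, ?, ?, ?, ?)", rows)
    # Candidate subset: one row from each city.
    conn.execute('CREATE TABLE child AS SELECT * FROM father WHERE "key" IN (1, 11)')
    return conn


def check_rules(conn, max_row_ratio=0.10, max_city_ratio=0.01,
                amt_target=100.0, amt_tolerance=0.10, min_share=0.20):
    """Return {rule name: bool} for the `child` table measured against `father`."""
    cur = conn.cursor()
    results = {}

    # Rule 1: child row count as a fraction of father row count (assumed upper bound).
    (row_ratio,) = cur.execute(
        "SELECT (SELECT COUNT(1) FROM child) * 1.0 / (SELECT COUNT(1) FROM father)"
    ).fetchone()
    results["row_ratio"] = row_ratio <= max_row_ratio

    # Rule 2 (city variant): for every city, child rows / father rows (assumed upper bound).
    (worst_city_ratio,) = cur.execute(
        """
        SELECT MAX(ch.n * 1.0 / fa.n)
        FROM (SELECT city, COUNT(1) AS n FROM child GROUP BY city) AS ch
        JOIN (SELECT city, COUNT(1) AS n FROM father GROUP BY city) AS fa
          ON ch.city = fa.city
        """
    ).fetchone()
    results["city_ratio"] = worst_city_ratio is None or worst_city_ratio <= max_city_ratio

    # Rule 3: total amount must fall within target +/- tolerance (90 to 110 in the message).
    (total_amt,) = cur.execute("SELECT SUM(amt) FROM child").fetchone()
    low = amt_target * (1 - amt_tolerance)
    high = amt_target * (1 + amt_tolerance)
    results["amt_range"] = total_amt is not None and low <= total_amt <= high

    # Rule 4: each of a, b, c must hold at least `min_share` of a+b+c
    # (assumption: lower bound, computed on column sums rather than per row).
    sums = cur.execute("SELECT SUM(a), SUM(b), SUM(c) FROM child").fetchone()
    parts = [value or 0.0 for value in sums]
    total_abc = sum(parts)
    results["component_share"] = total_abc > 0 and min(parts) / total_abc >= min_share

    return results
```

Calling check_rules(build_demo()) returns one boolean per rule for the sample subset; the thresholds are keyword arguments so they can be adjusted on demand, as the original question requires. The sketch only verifies a given subset. Searching for a subset that satisfies all rules at once is a separate selection problem that these queries do not solve.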
[Closing paragraph lost to a character-encoding error; it mentions Apache Kylin.]

------------------ Original Message ------------------
From: "ShaoFeng Shi"<[email protected]>;
Sent: Saturday, May 16, 2020, 11:24
To: "user"<[email protected]>;
Subject: Re: [subject lost]

Hi Xiang,

I'm not sure whether Kylin can help. Can Hive/Spark SQL fulfill the requirement? If you can provide a couple of SQL queries, that would help us see whether Kylin can help.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: [email protected]

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: [email protected]
Join Kylin dev mail group: [email protected]

[name lost] <[email protected]> wrote on Friday, May 15, 2020, at 1:18:

Hello everyone,

We have a business requirement: from a large volume of data, select a sub-dataset that satisfies multiple rules at the same time. The rules are complex and differ from scenario to scenario. For example, the proportion of any single city in the data source cannot exceed 15% (the 15% can be adjusted by users on demand), the proportions of various calculated business values must not exceed specific limits, and so on.

Can this requirement be met with Apache Kylin? If so, what approach should we adopt? Is there any documentation or demo for reference? Does it need to be combined with other tools?

Thanks a lot.
