Hi Xiang,

I'm not sure whether Kylin can help. Can Hive or Spark SQL fulfill the requirement? If you can provide a couple of sample SQL queries, that would help us judge whether Kylin is a fit.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: [email protected]

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: [email protected]
Join Kylin dev mail group: [email protected]

寒香 <[email protected]> wrote on Fri, May 15, 2020 at 1:18 PM:

> Hello, everyone:
>
> We have a business requirement: from a large volume of data, select the
> sub-datasets that satisfy multiple rules at the same time. Different
> scenarios come with different sets of rules, and the rules can be fairly
> complex. For example, the share of any single city among the data sources
> must not exceed 15% (the 15% threshold can be adjusted on demand), the
> share of various computed business values must not exceed a specific
> value, and so on.
>
> Can this requirement be solved with Apache Kylin? If so, what approach
> should we take, and is there any documentation or demo for reference?
> Does it need to be done with the help of other tools? Thanks a lot.
