Hi, ShaoFeng:
Thanks for the info. So what I found is a bug in Kylin.

I am curious whether there is any tutorial on how to use the Kylin GUI to configure the TOPN measure, i.e., what is the minimum information that must be configured to make it work? I can see from the sample project JSON files how Kylin expects the cube to be configured, but it is not clear how a user accomplishes the same thing through the Kylin GUI.

Kang-sen

________________________________
From: ShaoFeng Shi <[email protected]>
Sent: Friday, January 18, 2019 9:16:47 AM
To: user
Subject: Re: question about how to configure TOPN aggregation function

In theory, it doesn't need a separate SUM() measure. Your issue seems to be the same as:
https://issues.apache.org/jira/browse/KYLIN-3322

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: [email protected]
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: [email protected]
Join Kylin dev mail group: [email protected]

Kang-Sen Lu <[email protected]> wrote on Fri, Jan 18, 2019 at 10:13 PM:

I am running Kylin 2.5.1. I have a question about TOPN aggregation function usage. Because I could not find any documentation on how to configure the TOPN aggregation function, I am not sure whether the problem I am facing is expected behavior or a bug.

Here is my test case:

One data model and one cube are configured. In the cube, only TOPN(SUM(HITS),GROUP-BY SUBSCRIBER_ID) was configured. No SUM(HITS) was configured in the cube. I built one hour of cube data.

I issued the following query:

select SUBSCRIBER_ID, sum(hits) from a_ma_hourly_v where THEDATE='20180501' and THEHOUR='07' GROUP BY SUBSCRIBER_ID ORDER BY sum(hits) DESC LIMIT 100 ;

The query failed with a "null" exception.

2019-01-18 08:58:28,740 INFO [Query d666c666-af7e-8c39-ef57-e80d49590e87-514] routing.QueryRouter:51 : Applying rule: class org.apache.kylin.query.routing.rules.RemoveUncapableRealizationsRule, realizations before: [CUBE[name=ma_aggs_cube_5],CUBE[name=ma_aggs_topn_cube_test]], realizations after: [CUBE[name=ma_aggs_cube_5],CUBE[name=ma_aggs_topn_cube_test]]
2019-01-18 08:58:28,741 INFO [Query d666c666-af7e-8c39-ef57-e80d49590e87-514] rules.RealizationSortRule:40 : CUBE[name=ma_aggs_cube_5] priority 1 cost 279. CUBE[name=ma_aggs_topn_cube_test] priority 1 cost 27.
2019-01-18 08:58:28,741 INFO [Query d666c666-af7e-8c39-ef57-e80d49590e87-514] routing.QueryRouter:51 : Applying rule: class org.apache.kylin.query.routing.rules.RealizationSortRule, realizations before: [CUBE[name=ma_aggs_cube_5],CUBE[name=ma_aggs_topn_cube_test]], realizations after: [CUBE[name=ma_aggs_topn_cube_test],CUBE[name=ma_aggs_cube_5]]
2019-01-18 08:58:28,741 INFO [Query d666c666-af7e-8c39-ef57-e80d49590e87-514] routing.QueryRouter:75 : The realizations remaining: [CUBE[name=ma_aggs_topn_cube_test],CUBE[name=ma_aggs_cube_5]],and the final chosen one for current olap context 0 is CUBE[name=ma_aggs_topn_cube_test]
2019-01-18 08:58:28,767 ERROR [Query d666c666-af7e-8c39-ef57-e80d49590e87-514] service.QueryService:480 : Exception while executing query
java.sql.SQLException: Error while executing SQL "select SUBSCRIBER_ID, sum(hits) from a_ma_hourly_v where THEDATE='20180501' and THEHOUR='07' GROUP BY SUBSCRIBER_ID ORDER BY sum(hits) DESC LIMIT 100 ": null
    at org.apache.calcite.avatica.Helper.createException(Helper.java:56)
    at org.apache.calcite.avatica.Helper.createException(Helper.java:41)

My question is: when "TOPN(SUM(HITS),GROUP-BY SUBSCRIBER_ID)" is configured in a cube, is it also necessary to configure "SUM(HITS)" in the cube?

Kang-sen
