I am running Kylin 2.5.1 and have a question about using the TOPN aggregation 
function. I could not find any documentation on how to configure TOPN, so I am 
not sure whether the problem I am facing is expected behavior or a bug.


Here is my test case:


One data model and one cube were configured.
In the cube, only TOPN(SUM(HITS),GROUP-BY SUBSCRIBER_ID) was configured.
SUM(HITS) was not configured in the cube.
I built one hour of cube data.
I then issued the following query:
select SUBSCRIBER_ID, sum(hits)
from a_ma_hourly_v where THEDATE='20180501' and THEHOUR='07' GROUP BY 
SUBSCRIBER_ID ORDER BY sum(hits) DESC LIMIT 100 ;

The query failed with a "null" exception:

2019-01-18 08:58:28,740 INFO  [Query d666c666-af7e-8c39-ef57-e80d49590e87-514] 
routing.QueryRouter:51 : Applying rule: class 
org.apache.kylin.query.routing.rules.RemoveUncapableRealizationsRule, 
realizations before: 
[CUBE[name=ma_aggs_cube_5],CUBE[name=ma_aggs_topn_cube_test]], realizations 
after: [CUBE[name=ma_aggs_cube_5],CUBE[name=ma_aggs_topn_cube_test]]
2019-01-18 08:58:28,741 INFO  [Query d666c666-af7e-8c39-ef57-e80d49590e87-514] 
rules.RealizationSortRule:40 : CUBE[name=ma_aggs_cube_5] priority 1 cost 279. 
CUBE[name=ma_aggs_topn_cube_test] priority 1 cost 27.
2019-01-18 08:58:28,741 INFO  [Query d666c666-af7e-8c39-ef57-e80d49590e87-514] 
routing.QueryRouter:51 : Applying rule: class 
org.apache.kylin.query.routing.rules.RealizationSortRule, realizations before: 
[CUBE[name=ma_aggs_cube_5],CUBE[name=ma_aggs_topn_cube_test]], realizations 
after: [CUBE[name=ma_aggs_topn_cube_test],CUBE[name=ma_aggs_cube_5]]
2019-01-18 08:58:28,741 INFO  [Query d666c666-af7e-8c39-ef57-e80d49590e87-514] 
routing.QueryRouter:75 : The realizations remaining: 
[CUBE[name=ma_aggs_topn_cube_test],CUBE[name=ma_aggs_cube_5]],and the final 
chosen one for current olap context 0 is CUBE[name=ma_aggs_topn_cube_test]
2019-01-18 08:58:28,767 ERROR [Query d666c666-af7e-8c39-ef57-e80d49590e87-514] 
service.QueryService:480 : Exception while executing query
java.sql.SQLException: Error while executing SQL "select SUBSCRIBER_ID, 
sum(hits)
from a_ma_hourly_v where THEDATE='20180501' and THEHOUR='07' GROUP BY 
SUBSCRIBER_ID ORDER BY sum(hits) DESC LIMIT 100 ": null
        at org.apache.calcite.avatica.Helper.createException(Helper.java:56)
        at org.apache.calcite.avatica.Helper.createException(Helper.java:41)

My question is: when "TOPN(SUM(HITS),GROUP-BY SUBSCRIBER_ID)" is configured 
in a cube, is it necessary to also configure "SUM(HITS)" in the cube?


Kang-sen

