[ https://issues.apache.org/jira/browse/KYLIN-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16781854#comment-16781854 ]
KANG-SEN LU commented on KYLIN-2620:
------------------------------------

If a TOPN(SUM(X), GROUP-BY D1) measure is configured in a Kylin cube, a query must meet all of the following conditions before it can be answered from that measure:
# The GROUP-BY list includes the D1 dimension.
# ORDER-BY SUM(X)
# LIMIT n, where n <= TOPN's limit.

Conditions 2 and 3 are mentioned in the bug description, but I think condition 1 is important as well. We don't want Kylin to use TOPN(SUM(X), GROUP-BY D1) when the query does not have GROUP-BY D1. If Kylin rewrote SUM(X) to TOPN(SUM(X)) in that case, it would have to aggregate over all D1 values. That may lose accuracy if Kylin did not save every D1 value in its cuboid.

> Check for "ORDER BY LIMIT" clause when rewrite SUM query as TOPN
> ----------------------------------------------------------------
>
>                 Key: KYLIN-2620
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2620
>             Project: Kylin
>          Issue Type: Bug
>          Components: Measure - TopN
>            Reporter: Lin Tingmao
>            Assignee: Chao Long
>            Priority: Major
>             Fix For: v2.6.2
>
>
> When running the following query
> select sum(measure) from table group by col_id
> if there exists TOPN(measure, group by col_id) measure,
> TopNMeasureType.isTopNCompatibleSum() will pass, so the SUM is rewritten
> to TOPN. This confuses the user since they may expect an accurate result for
> every distinct value of group by column(s).
> Kylin should check if "ORDER BY col_id LIMIT topncapacity" is present in the
> query to determine whether to rewrite.
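
For illustration, the three conditions listed in the comment on this issue (GROUP-BY includes D1, ORDER-BY SUM(X), LIMIT n within the TOPN limit) could be combined into a single check along the following lines. This is only a minimal sketch and not Kylin's actual code: QueryShape, TopNRewriteCheck and canRewriteSumAsTopN are hypothetical names, and the query is reduced to plain strings instead of a parsed SQL tree.

```java
import java.util.List;

// Hypothetical, simplified view of a query; this is not a Kylin class.
final class QueryShape {
    final List<String> groupByColumns;
    final String orderByExpression; // null when the query has no ORDER BY
    final Integer limit;            // null when the query has no LIMIT

    QueryShape(List<String> groupByColumns, String orderByExpression, Integer limit) {
        this.groupByColumns = groupByColumns;
        this.orderByExpression = orderByExpression;
        this.limit = limit;
    }
}

// Hypothetical helper that applies the three conditions from the comment.
final class TopNRewriteCheck {
    static boolean canRewriteSumAsTopN(QueryShape query,
                                       String topNGroupByColumn,
                                       String topNSumExpression,
                                       int topNLimit) {
        // Condition 1: the GROUP BY list includes the TOPN dimension (D1).
        boolean groupsByTopNColumn = query.groupByColumns.contains(topNGroupByColumn);
        // Condition 2: the query orders by the summed measure, e.g. SUM(X).
        // equalsIgnoreCase(null) is false, so a missing ORDER BY fails here.
        boolean ordersBySum = topNSumExpression.equalsIgnoreCase(query.orderByExpression);
        // Condition 3: LIMIT n is present and n <= the TOPN limit.
        boolean withinLimit = query.limit != null && query.limit <= topNLimit;
        return groupsByTopNColumn && ordersBySum && withinLimit;
    }
}
```

With this sketch, the query from the issue description (GROUP BY col_id with no ORDER BY and no LIMIT) would not qualify for the rewrite, and neither would a query that orders by SUM(X) with a LIMIT but does not group by D1.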