[jira] [Commented] (KYLIN-2620) Check for "ORDER BY LIMIT" clause when rewrite SUM query as TOPN

2019-06-24 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871973#comment-16871973
 ] 

ASF subversion and git services commented on KYLIN-2620:


Commit 2bfda2d6c2a19586d0854e487bd3ceb805922940 in kylin's branch 
refs/heads/2.6.x from chao long
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=2bfda2d ]

KYLIN-2620 TopN can't match when multi sort columns


> Check for "ORDER BY LIMIT" clause when rewrite SUM query as TOPN
> 
>
> Key: KYLIN-2620
> URL: https://issues.apache.org/jira/browse/KYLIN-2620
> Project: Kylin
> Issue Type: Bug
> Components: Measure - TopN
> Reporter: Lin Tingmao
> Assignee: Chao Long
> Priority: Major
> Fix For: v3.0.0, v2.6.3
>
> Attachments: cube_desc.png, sql.png
>
>
> When running the following query
>     select sum(measure) from table group by col_id
> if a TOPN(measure, group by col_id) measure exists, 
> TopNMeasureType.isTopNCompatibleSum() will pass, so the SUM is rewritten 
> to TOPN. This confuses users, who may expect an accurate result for 
> every distinct value of the group-by column(s). 
> Kylin should check whether "ORDER BY col_id LIMIT topncapacity" is present in the 
> query to determine whether to rewrite.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KYLIN-2620) Check for "ORDER BY LIMIT" clause when rewrite SUM query as TOPN

2019-05-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851728#comment-16851728
 ] 

ASF GitHub Bot commented on KYLIN-2620:
---

nichunen commented on pull request #647: KYLIN-2620 TopN can't match when multi 
sort columns
URL: https://github.com/apache/kylin/pull/647
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (KYLIN-2620) Check for "ORDER BY LIMIT" clause when rewrite SUM query as TOPN

2019-05-30 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851727#comment-16851727
 ] 

ASF subversion and git services commented on KYLIN-2620:


Commit 670c2a98a82a699481eceddb685168104d5b185c in kylin's branch 
refs/heads/master from chao long
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=670c2a9 ]

KYLIN-2620 TopN can't match when multi sort columns




[jira] [Commented] (KYLIN-2620) Check for "ORDER BY LIMIT" clause when rewrite SUM query as TOPN

2019-05-19 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16843389#comment-16843389
 ] 

ASF GitHub Bot commented on KYLIN-2620:
---

Wayne1c commented on pull request #647: KYLIN-2620 TopN can't match when multi 
sort columns
URL: https://github.com/apache/kylin/pull/647
 
 
   
 



> Check for "ORDER BY LIMIT" clause when rewrite SUM query as TOPN
> 
>
> Key: KYLIN-2620
> URL: https://issues.apache.org/jira/browse/KYLIN-2620
> Project: Kylin
> Issue Type: Bug
> Components: Measure - TopN
> Reporter: Lin Tingmao
> Assignee: Chao Long
> Priority: Major
> Fix For: Future, v3.0.0
>
> Attachments: cube_desc.png, sql.png


[jira] [Commented] (KYLIN-2620) Check for "ORDER BY LIMIT" clause when rewrite SUM query as TOPN

2019-05-13 Thread Na Zhai (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838674#comment-16838674
 ] 

Na Zhai commented on KYLIN-2620:


I tested with this SQL:
{code:sql}
SELECT SELLER_ID, SUM(PRICE) FROM KYLIN_SALES
GROUP BY SELLER_ID
ORDER BY SUM(PRICE), SELLER_ID DESC LIMIT 10;
{code}
It did not hit the top_n measure.
!sql.png!
!cube_desc.png!



[jira] [Commented] (KYLIN-2620) Check for "ORDER BY LIMIT" clause when rewrite SUM query as TOPN

2019-03-26 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802425#comment-16802425
 ] 

ASF subversion and git services commented on KYLIN-2620:


Commit d5c37dcaeaca6d08f2098a645d78b2c800482db4 in kylin's branch 
refs/heads/2.6.x from chao long
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=d5c37dc ]

KYLIN-2620 Make the condition stricter to answer query with topN


> Check for "ORDER BY LIMIT" clause when rewrite SUM query as TOPN
> 
>
> Key: KYLIN-2620
> URL: https://issues.apache.org/jira/browse/KYLIN-2620
> Project: Kylin
> Issue Type: Bug
> Components: Measure - TopN
> Reporter: Lin Tingmao
> Assignee: Chao Long
> Priority: Major
> Fix For: v2.6.2


[jira] [Commented] (KYLIN-2620) Check for "ORDER BY LIMIT" clause when rewrite SUM query as TOPN

2019-03-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783279#comment-16783279
 ] 

ASF subversion and git services commented on KYLIN-2620:


Commit f968e31140e6b5e68ef3c8124192987752c32a03 in kylin's branch 
refs/heads/master from chao long
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=f968e31 ]

KYLIN-2620 Make the condition stricter to answer query with topN




[jira] [Commented] (KYLIN-2620) Check for "ORDER BY LIMIT" clause when rewrite SUM query as TOPN

2019-03-04 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783278#comment-16783278
 ] 

ASF GitHub Bot commented on KYLIN-2620:
---

shaofengshi commented on pull request #489: KYLIN-2620 Make the condition 
stricter to answer query with topN
URL: https://github.com/apache/kylin/pull/489
 
 
   
 





[jira] [Commented] (KYLIN-2620) Check for "ORDER BY LIMIT" clause when rewrite SUM query as TOPN

2019-03-01 Thread KANG-SEN LU (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781854#comment-16781854
 ] 

KANG-SEN LU commented on KYLIN-2620:


If a TOPN(SUM(X), GROUP-BY D1) metric is configured in a Kylin cube, the 
query at hand must meet the following conditions to be answered with it:
 # The GROUP-BY list includes the D1 dimension.
 # The query has ORDER-BY SUM(X).
 # The query has LIMIT n, where n <= TOPN's limit.

Conditions 2 and 3 are mentioned in the bug description, but I think condition 1 
is important too. We don't want Kylin to use TOPN(SUM(X), GROUP-BY D1) 
when the query does not have GROUP-BY D1. If Kylin rewrote SUM(X) to 
TOPN(SUM(X)), it would have to aggregate over all D1 values. That may lose 
accuracy if Kylin did not save every D1 value in its cuboid.
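
The three conditions above can be sketched as a small predicate. This is a minimal illustration in Python, not Kylin's Java code: the TopNMeasure and Query classes, their fields, and can_answer_with_topn are hypothetical names made up for the example. Treating SUM(X) as the primary sort key is an assumption, since the comment only says ORDER-BY SUM(X), and sort direction is not modeled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TopNMeasure:
    """Hypothetical model of a TOPN(SUM(X), GROUP-BY D1) measure."""
    sum_column: str       # X, the column aggregated by SUM
    group_by_column: str  # D1, the dimension the TopN is grouped by
    capacity: int         # TOPN's limit


@dataclass
class Query:
    """Hypothetical, simplified view of a parsed query."""
    sum_column: str
    group_by: List[str] = field(default_factory=list)
    order_by: List[str] = field(default_factory=list)  # sort keys, in order
    limit: Optional[int] = None


def can_answer_with_topn(query: Query, measure: TopNMeasure) -> bool:
    """Return True only if the query meets all three conditions."""
    # The query must aggregate the same column as the measure.
    if query.sum_column != measure.sum_column:
        return False
    # Condition 1: the GROUP-BY list includes D1.
    if measure.group_by_column not in query.group_by:
        return False
    # Condition 2: ORDER-BY SUM(X) (assumed here to be the primary sort key).
    if query.order_by[:1] != [f"SUM({measure.sum_column})"]:
        return False
    # Condition 3: LIMIT n is present and n <= TOPN's limit.
    if query.limit is None or query.limit > measure.capacity:
        return False
    return True
```

Under this sketch, the query from the bug description (a SUM with GROUP BY but no ORDER BY or LIMIT) is rejected, because conditions 2 and 3 both fail.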



[jira] [Commented] (KYLIN-2620) Check for "ORDER BY LIMIT" clause when rewrite SUM query as TOPN

2019-02-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16778974#comment-16778974
 ] 

ASF GitHub Bot commented on KYLIN-2620:
---

Wayne1c commented on pull request #489: KYLIN-2620 Make the condition stricter 
to answer query with topN
URL: https://github.com/apache/kylin/pull/489
 
 
   
 



> Check for "ORDER BY LIMIT" clause when rewrite SUM query as TOPN
> 
>
> Key: KYLIN-2620
> URL: https://issues.apache.org/jira/browse/KYLIN-2620
> Project: Kylin
> Issue Type: Bug
> Components: Measure - TopN
> Reporter: Lin Tingmao
> Assignee: Shaofeng SHI
> Priority: Major
> Fix For: v2.6.1


[jira] [Commented] (KYLIN-2620) Check for "ORDER BY LIMIT" clause when rewrite SUM query as TOPN

2019-01-15 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743572#comment-16743572
 ] 

Shaofeng SHI commented on KYLIN-2620:
-

I will check it.

> Check for "ORDER BY LIMIT" clause when rewrite SUM query as TOPN
> 
>
> Key: KYLIN-2620
> URL: https://issues.apache.org/jira/browse/KYLIN-2620
> Project: Kylin
> Issue Type: Bug
> Components: Measure - TopN
> Reporter: Lin Tingmao
> Assignee: Shaofeng SHI
> Priority: Major


[jira] [Commented] (KYLIN-2620) Check for "ORDER BY LIMIT" clause when rewrite SUM query as TOPN

2019-01-15 Thread KANG-SEN LU (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743363#comment-16743363
 ] 

KANG-SEN LU commented on KYLIN-2620:


Fixing this bug would restrict the selection of the topn metric to queries that 
are better served by the topn cube.

However, the cube cost evaluation algorithm in

core-metadata/src/main/java/org/apache/kylin/measure/topn/TopNMeasureType.java, 
function influenceCapabilityCheck(),

must also be enhanced for the case where more than one cube is associated with the 
same data model.

The current problem is that when "select sum(x) from fact_table" is issued and 
two cube specs can both answer the query, Kylin prefers the topn cube, even 
though that means retrieving a limited number of rows from "group by col_id" 
and aggregating them afterwards. That is not only inefficient, but also 
incorrect.
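
A small numeric sketch of why that aggregate is incorrect. It assumes, as a simplification, that a TopN measure keeps exact sums for only the `capacity` largest groups; Kylin's actual TopN storage may behave differently. The function names and sample rows are hypothetical.

```python
from collections import Counter


def exact_total(rows):
    """SUM(x) over every row of the fact table."""
    return sum(value for _, value in rows)


def topn_entries(rows, capacity):
    """Per-group sums, truncated to the `capacity` largest groups.

    Simplified stand-in for what a TopN measure retains.
    """
    per_group = Counter()
    for col_id, value in rows:
        per_group[col_id] += value
    return per_group.most_common(capacity)


def total_from_topn(rows, capacity):
    """SUM(x) computed by aggregating only the retained TopN entries."""
    return sum(value for _, value in topn_entries(rows, capacity))


# (col_id, x) pairs; four distinct groups.
rows = [("a", 50), ("b", 30), ("c", 15), ("d", 5)]

print(exact_total(rows))         # 100
print(total_from_topn(rows, 2))  # 80: groups "c" and "d" were dropped
```

With a capacity smaller than the number of distinct col_id values, the total computed from the retained entries falls short of the true sum, which is the incorrectness described above.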

> Check for "ORDER BY LIMIT" clause when rewrite SUM query as TOPN
> 
>
> Key: KYLIN-2620
> URL: https://issues.apache.org/jira/browse/KYLIN-2620
> Project: Kylin
> Issue Type: Bug
> Reporter: Lin Tingmao
> Priority: Major


[jira] [Commented] (KYLIN-2620) Check for "ORDER BY LIMIT" clause when rewrite SUM query as TOPN

2019-01-15 Thread KANG-SEN LU (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743356#comment-16743356
 ] 

KANG-SEN LU commented on KYLIN-2620:


I have a doubt about this sentence:

"Kylin should check if "ORDER BY col_id LIMIT topncapacity" is present in the 
query to determine whether to rewrite."

I think it should be corrected to:

Kylin should check if "ORDER BY measure LIMIT topncapacity" is present in the 
query to determine whether to rewrite.

Am I right?
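
One way to see why the sort key matters, as a sketch under the simplifying assumption that a TopN measure retains the groups with the largest sums of the measure. The helper names and sample rows are hypothetical and are not taken from Kylin.

```python
def group_sums(rows):
    """Exact SUM(measure) per col_id."""
    sums = {}
    for col_id, value in rows:
        sums[col_id] = sums.get(col_id, 0) + value
    return sums


def top_by_measure(rows, n):
    """ORDER BY SUM(measure) DESC LIMIT n."""
    sums = group_sums(rows)
    return sorted(sums.items(), key=lambda kv: kv[1], reverse=True)[:n]


def first_by_col_id(rows, n):
    """ORDER BY col_id LIMIT n."""
    return sorted(group_sums(rows).items())[:n]


rows = [("a", 1), ("b", 2), ("c", 40), ("d", 30), ("a", 3)]

print(top_by_measure(rows, 2))   # [('c', 40), ('d', 30)]
print(first_by_col_id(rows, 2))  # [('a', 4), ('b', 2)]
```

The two queries return disjoint groups here, so entries retained for the largest sums cannot answer a query ordered by col_id.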
