[jira] [Comment Edited] (KYLIN-2841) LIMIT is buggy with subquery

XiaoXiang Yu (JIRA) Thu, 13 Dec 2018 08:47:10 -0800


    [ 
https://issues.apache.org/jira/browse/KYLIN-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16720322#comment-16720322
 ]


XiaoXiang Yu edited comment on KYLIN-2841 at 12/13/18 4:46 PM:
---------------------------------------------------------------

Same cube, same SQL; the only difference is the LIMIT value. That should not 
cause result rows to be lost. With LIMIT set to 10000, the result has 8 rows; 
with LIMIT set to 10, the result has 4 rows, each with a smaller sum than in 
the previous result.

 

After that, I used Hive to execute the same SQL query. Comparing against the 
Hive result makes it clear that the result with LIMIT set to 10 is wrong.

 

I uploaded two log files that show the details of the two queries with 
"calcite.debug" enabled. These files contain the whole query analysis process.
 * {color:#FF0000}*+kylin-2841-1.txt  :  LIMIT set to 10, wrong result.+*{color}
 * {color:#FF0000}+*kylin-2841-2.txt  :  LIMIT set to 10000, correct 
result.*+{color}

 

!image-2018-12-14-00-08-27-648.png!

 

After that, I applied the contributor's patch, {color:#FF0000}+*and it does 
fix the issue.*+{color} You can see this in the screenshot below or in my log 
file (kylin-2841-apply-patch.txt). 

 

!image-2018-12-14-00-23-08-869.png!

 


was (Author: hit_lacus):
Same cube, same SQL; the only difference is the LIMIT value. That should not 
cause result rows to be lost. With LIMIT set to 10000, the result has 8 rows; 
with LIMIT set to 10, the result has 4 rows, each with a smaller sum than in 
the previous result.

 

After that, I used Hive to execute the same SQL query. Comparing against the 
Hive result makes it clear that the result with LIMIT set to 10 is wrong.

 

I uploaded two log files that show the details of the two queries with 
"calcite.debug" enabled.

 

!image-2018-12-14-00-08-27-648.png!

 

After that, I applied the contributor's patch, and it does fix the issue. You 
can see this in the screenshot below or in my log file 
(kylin-2841-apply-patch.txt). 

 

!image-2018-12-14-00-23-08-869.png!

 

> LIMIT is buggy with subquery
> ----------------------------
>
>                 Key: KYLIN-2841
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2841
>             Project: Kylin
>          Issue Type: Bug
>    Affects Versions: v2.1.0
>            Reporter: Mu Kong
>            Assignee: zhengdong
>            Priority: Major
>              Labels: scope
>             Fix For: v2.6.0
>
>         Attachments: 0001-KYLIN-2841-LIMIT-is-buggy-with-subquery.patch, 
> image-2018-12-13-23-41-53-732.png, image-2018-12-13-23-45-06-271.png, 
> image-2018-12-14-00-05-19-114.png, image-2018-12-14-00-08-27-648.png, 
> image-2018-12-14-00-23-08-869.png, kylin-2841-1.txt, kylin-2841-2.txt, 
> kylin-2841-apply-patch.txt
>
>
> Hi, all.
> I found that LIMIT in the web UI does not seem to behave as expected.
> When I run a query like the following:
> {code:sql}
> SELECT
>   SUM(col3) AS col4, 
>   SUM(col5) AS total_col5,
>   col1 
> FROM
> (
>   SELECT
>     col1,
>     col2,
>     MAX(col3) AS col3,
>     COUNT(*) AS col5
>   FROM db.table
>   WHERE col6 = 'somestring'
>   GROUP BY col1, col2
> )
> GROUP BY col1
> {code}
> When I specify the limit as 50, the result has 19 records; when I specify 
> the limit as 500000, the result has 90+ records and each record has a 
> higher col4 and total_col5.
> For a query without a subquery, however, the result stays the same no 
> matter how I change the limit.
> My guess is that, for a query with a subquery, the limit is somehow applied 
> to the result of the inner query instead of the result of the outer query.
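
The reporter's guess can be illustrated with a small, self-contained sketch. It does not reproduce Kylin's or Calcite's planner; it only shows, using SQLite as a stand-in engine, why a LIMIT applied to the inner query instead of the outer one would produce both fewer rows and smaller sums. The table name `t` and the toy data are hypothetical, and the `WHERE col6 = 'somestring'` filter from the report is left out for brevity.

```python
import sqlite3

# Hypothetical toy data standing in for db.table: (col1, col2, col3).
ROWS = [
    ("a", "x", 5), ("a", "x", 7), ("a", "y", 3),
    ("b", "x", 2), ("b", "y", 9), ("b", "z", 4),
    ("c", "x", 1),
]

# Inner query from the report (without the col6 filter), ordered so that
# the effect of a misplaced LIMIT is deterministic in this sketch.
INNER = """
    SELECT col1, col2, MAX(col3) AS col3, COUNT(*) AS col5
    FROM t
    GROUP BY col1, col2
    ORDER BY col1, col2
"""

OUTER = """
    SELECT SUM(col3) AS col4, SUM(col5) AS total_col5, col1
    FROM ({inner})
    GROUP BY col1
    ORDER BY col1
    {limit}
"""


def run(conn, inner_limit=None, outer_limit=None):
    """Run the nested query with a LIMIT on the inner and/or outer level."""
    inner = INNER + (f" LIMIT {inner_limit}" if inner_limit is not None else "")
    limit = f"LIMIT {outer_limit}" if outer_limit is not None else ""
    return conn.execute(OUTER.format(inner=inner, limit=limit)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 TEXT, col3 INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", ROWS)

# Expected behavior: LIMIT 3 applies to the outer result only.
correct = run(conn, outer_limit=3)

# Suspected buggy behavior: the same LIMIT 3 cuts off the inner result,
# so some (col1, col2) groups never reach the outer aggregation.
pushed_down = run(conn, inner_limit=3)

print("limit on outer query:", correct)
print("limit on inner query:", pushed_down)
```

In this sketch the outer-limited query returns all three `col1` groups, while the inner-limited one drops group `c` entirely and under-counts group `b`. That matches the symptom described in the report: fewer records, each with lower sums.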



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
