[
https://issues.apache.org/jira/browse/KYLIN-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
nichunen updated KYLIN-4260:
----------------------------
Fix Version/s: v3.0.1
> When using server side PreparedStatement cache, the query result are not
> match on TopN scenario
> -----------------------------------------------------------------------------------------------
>
> Key: KYLIN-4260
> URL: https://issues.apache.org/jira/browse/KYLIN-4260
> Project: Kylin
> Issue Type: Bug
> Components: Query Engine
> Affects Versions: v3.0.0-alpha2, v2.6.4
> Reporter: Marc Wu
> Assignee: Marc Wu
> Priority: Major
> Fix For: v3.1.0, v3.0.1
>
> Attachments: image-2019-11-18-15-55-00-312.png,
> image-2019-11-18-15-55-11-906.png, image-2019-11-18-19-29-34-489.png,
> image-2019-11-18-19-29-42-721.png
>
>
> Hi Kylin team,
> I found an issue while server side PreparedStatement enabled. The second time
> query and after's result will be different from the first when query TopN,
> and the result is not right.
> Part of Cube info:
>
> Dimensions
> TRANS_ID
> PART_DT
> SELLER_ID
> BUYER_ID
> Measures:
> SUM(PRICE)
> MAX(PRICE)
> TOPN(PRICE) Group By:KYLIN_SALES.SELLER_ID,KYLIN_SALES.BUYER_ID
>
> SQL:
> {code:java}
> {"sql":"select seller_id, buyer_id, sum(PRICE) from glaucus.kylin_sales where
> PART_DT >= ? and PART_DT <= ? group by seller_id, buyer_id order by
> sum(PRICE) desc limit 20","project":"DDTFORTEST_Analytics",
> "params":[{"className": "java.lang.String","value":
> "2012-01-01"},{"className": "java.lang.String","value": "2012-01-10"}]}
> {code}
> The First query result:
> !image-2019-11-18-15-55-00-312.png!
> The Second and after:
> !image-2019-11-18-15-55-11-906.png|width=2046,height=1096!
> -----------------------------------------------
> h2. -Root Cause-
> Cached preparedContext is changed when doing preparedStatement.executeQuery,
> and losing groupByColumns. So the first execution result is correct, the
> second and the after will be incorrect.
> !image-2019-11-18-19-29-34-489.png!
> !image-2019-11-18-19-29-42-721.png!
> h2. Real Root Cause
> The first time we entered PreparedStatement logic, we'll try to borrow
> preparedContext from cache pool, of course there isn't any, but the cache
> pool will execute create method to create a new preparedContext, and then
> loaned it to us.
> I didn't figure out how adjustSqlDigest works before, and try to remove code
> {code:java}
> sqlDigest.groupbyColumns.removeAll(topnLiteralCol){code}
> but it's not right. Top-N isn't like some other measures, the dimensions
> aren't as part of row key, they stored in measures in design, so it's why the
> adjustSqlDigest matters, especially those codes.
> {code:java}
> sqlDigest.groupbyColumns.removeAll(topnLiteralCol);
> sqlDigest.metricColumns.addAll(topnLiteralCol);
> {code}
> The root cause for this issue is because the create sql digest is execute
> again after we store it in cache, so the digest changed.
> {code:java}
> # This is from the first time
> fact table GLAUCUS.KYLIN_SALES,group by [],filter on
> [GLAUCUS.KYLIN_SALES.PART_DT],with aggregates[FunctionDesc [expression=TOP_N,
> parameter=GLAUCUS.KYLIN_SALES.PRICE,GLAUCUS.KYLIN_SALES.SELLER_ID,GLAUCUS.KYLIN_SALES.BUYER_ID,
> returnType=topn(5000,8)]].
> # This is the second one
> fact table GLAUCUS.KYLIN_SALES,group by [],filter on
> [GLAUCUS.KYLIN_SALES.PART_DT],with aggregates[FunctionDesc [expression=SUM,
> parameter=GLAUCUS.KYLIN_SALES.PRICE, returnType=decimal(19,4)]].
> {code}
> So the second time and after we execute the same query or same pattern, the
> expression will be changed to SUM instead of TOPN, that's why the strange
> result show up.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)