[ https://issues.apache.org/jira/browse/KYLIN-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
nichunen updated KYLIN-4260: ---------------------------- Fix Version/s: (was: v3.0.0) Future > When using server side PreparedStatement cache, the query result are not > match on TopN scenario > ----------------------------------------------------------------------------------------------- > > Key: KYLIN-4260 > URL: https://issues.apache.org/jira/browse/KYLIN-4260 > Project: Kylin > Issue Type: Bug > Components: Query Engine > Affects Versions: v3.0.0-alpha2, v2.6.4 > Reporter: Marc Wu > Assignee: Marc Wu > Priority: Major > Fix For: Future, v2.6.5 > > Attachments: image-2019-11-18-15-55-00-312.png, > image-2019-11-18-15-55-11-906.png, image-2019-11-18-19-29-34-489.png, > image-2019-11-18-19-29-42-721.png > > > Hi Kylin team, > I found an issue while server side PreparedStatement enabled. The second time > query and after's result will be different from the first when query TopN, > and the result is not right. > Part of Cube info: > > Dimensions > TRANS_ID > PART_DT > SELLER_ID > BUYER_ID > Measures: > SUM(PRICE) > MAX(PRICE) > TOPN(PRICE) Group By:KYLIN_SALES.SELLER_ID,KYLIN_SALES.BUYER_ID > > SQL: > {code:java} > {"sql":"select seller_id, buyer_id, sum(PRICE) from glaucus.kylin_sales where > PART_DT >= ? and PART_DT <= ? group by seller_id, buyer_id order by > sum(PRICE) desc limit 20","project":"DDTFORTEST_Analytics", > "params":[{"className": "java.lang.String","value": > "2012-01-01"},{"className": "java.lang.String","value": "2012-01-10"}]} > {code} > The First query result: > !image-2019-11-18-15-55-00-312.png! > The Second and after: > !image-2019-11-18-15-55-11-906.png|width=2046,height=1096! > ----------------------------------------------- > h2. -Root Cause- > Cached preparedContext is changed when doing preparedStatement.executeQuery, > and losing groupByColumns. So the first execution result is correct, the > second and the after will be incorrect. > !image-2019-11-18-19-29-34-489.png! > !image-2019-11-18-19-29-42-721.png! > h2. Real Root Cause > The first time we entered PreparedStatement logic, we'll try to borrow > preparedContext from cache pool, of course there isn't any, but the cache > pool will execute create method to create a new preparedContext, and then > loaned it to us. > I didn't figure out how adjustSqlDigest works before, and try to remove code > {code:java} > sqlDigest.groupbyColumns.removeAll(topnLiteralCol){code} > but it's not right. Top-N isn't like some other measures, the dimensions > aren't as part of row key, they stored in measures in design, so it's why the > adjustSqlDigest matters, especially those codes. > {code:java} > sqlDigest.groupbyColumns.removeAll(topnLiteralCol); > sqlDigest.metricColumns.addAll(topnLiteralCol); > {code} > The root cause for this issue is because the create sql digest is execute > again after we store it in cache, so the digest changed. > {code:java} > # This is from the first time > fact table GLAUCUS.KYLIN_SALES,group by [],filter on > [GLAUCUS.KYLIN_SALES.PART_DT],with aggregates[FunctionDesc [expression=TOP_N, > parameter=GLAUCUS.KYLIN_SALES.PRICE,GLAUCUS.KYLIN_SALES.SELLER_ID,GLAUCUS.KYLIN_SALES.BUYER_ID, > returnType=topn(5000,8)]]. > # This is the second one > fact table GLAUCUS.KYLIN_SALES,group by [],filter on > [GLAUCUS.KYLIN_SALES.PART_DT],with aggregates[FunctionDesc [expression=SUM, > parameter=GLAUCUS.KYLIN_SALES.PRICE, returnType=decimal(19,4)]]. > {code} > So the second time and after we execute the same query or same pattern, the > expression will be changed to SUM instead of TOPN, that's why the strange > result show up. -- This message was sent by Atlassian Jira (v8.3.4#803005)