[ https://issues.apache.org/jira/browse/KYLIN-5766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17832355#comment-17832355 ]
pengfei.zhan edited comment on KYLIN-5766 at 3/30/24 2:19 AM:
--------------------------------------------------------------

h1. Design

Parse the SQL with JavaCC, produce a "normalized" SQL string, and use that string as the cache key. Normalization consists of the following steps:
* Remove general comments (already implemented in the previous SQL parsing step).
* Replace any run of spaces, line feeds, tabs, carriage returns, and form feeds with a single space.
* Surround each of the operators "+", "-", "*", "/", "%", "=", ">=", "<=", "!=", "<>", "||" with exactly one space on the left and one space on the right.
* Surround each parenthesis, "(" and ")", with exactly one space on the left and one space on the right.
* For each comma, remove the whitespace to its left and put a single space to its right, so that test ,test1 becomes test, test1.
* Leave the text inside escaped (backtick-quoted) identifiers, such as `2 + 3 `, exactly as written. Two such identifiers that differ only in their internal whitespace are therefore different SQL and cannot hit each other's cache entries.

For example, these two queries are the same after transformation.
{code:sql}
-- sql1
select user , count(*) from /*comments comments */ demo group by user

-- sql2
select user, count(*) -- comments
from demo group by user
{code}
The normalized cache key is
{code:sql}
select user, count ( * ) from demo group by user
{code}

> Normalize query cache key
> -------------------------
>
>                 Key: KYLIN-5766
>                 URL: https://issues.apache.org/jira/browse/KYLIN-5766
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Query Engine
>    Affects Versions: 5.0-beta
>            Reporter: pengfei.zhan
>            Assignee: pengfei.zhan
>            Priority: Major
>             Fix For: 5.0.0
>
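
The normalization rules in the Design comment can be illustrated with a small sketch. This is not the JavaCC-based implementation the comment describes: it swaps in a regex tokenizer in Python purely to show the effect of each rule, and the names `TOKEN` and `normalize_sql` are hypothetical. It assumes backticks are the only quoting that must be preserved and that comments are either `/* ... */` or `--` to end of line; string literals are not handled specially.

```python
import re

# Hypothetical tokenizer for illustration only (the design itself uses JavaCC).
# Alternatives are ordered so comments win over the "-" and "/" operators.
TOKEN = re.compile(r"""
    (?P<quoted>`[^`]*`)                    # escaped identifier, kept verbatim
  | (?P<block>/\*.*?\*/)                   # block comment
  | (?P<line>--[^\n]*)                     # line comment
  | (?P<op>>=|<=|!=|<>|\|\||[-+*/%=])      # operators listed in the design
  | (?P<paren>[()])                        # parentheses
  | (?P<comma>,)                           # comma
  | (?P<space>\s+)                         # any run of whitespace
  | (?P<other>[^`\s(),+\-*/%=<>!|]+ | .)   # words, numbers, anything else
""", re.VERBOSE | re.DOTALL)


def normalize_sql(sql):
    """Return a normalized form of `sql` suitable for use as a cache key."""
    out = []

    def space():
        # Emit at most one separator, and never at the very start.
        if out and out[-1] != " ":
            out.append(" ")

    for match in TOKEN.finditer(sql):
        kind, text = match.lastgroup, match.group()
        if kind in ("block", "line", "space"):
            # Comments and whitespace both collapse to a single space.
            space()
        elif kind in ("op", "paren"):
            # Exactly one space on each side of operators and parentheses.
            space()
            out.append(text)
            out.append(" ")
        elif kind == "comma":
            # No space before the comma, one space after it.
            if out and out[-1] == " ":
                out.pop()
            out.append(",")
            out.append(" ")
        else:
            # Quoted identifiers and ordinary tokens are copied unchanged.
            out.append(text)
    return "".join(out).strip()


sql1 = "select user , count(*) from /*comments comments */ demo group by user"
sql2 = "select user, count(*) -- comments\nfrom demo group by user"
print(normalize_sql(sql1))
print(normalize_sql(sql1) == normalize_sql(sql2))
```

Under these assumptions both example queries reduce to the same key, while `select `a  +  b` from t` and `select `a + b` from t` stay distinct because the quoted text is never rewritten. A real implementation would rely on the parser's token stream rather than regular expressions, which is what makes string literals and dialect-specific quoting safe to handle.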