[
https://issues.apache.org/jira/browse/CALCITE-4474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17270822#comment-17270822
]
wangjie commented on CALCITE-4474:
----------------------------------
[~julianhyde]
{noformat}
Hi, I may not have described the cause of the problem correctly before. I think
the cause of this problem is not just "SqlAdvisor ignores '--' comments".
The cause is that SqlSimpleParser.Tokenizer#nextToken() cannot separate tokens
correctly when a contiguous string (one with no spaces) is composed of multiple tokens.
E.g.:
select * from a where column_b='/* this is not comment */' => SELECT * FROM a
WHERE column_b=' '
This happens because there is no space between "column_b=" and the quoted
string. If there is a space after the equals sign, the string is preserved:
select * from a where column_b= '/* this is not comment */' => SELECT * FROM a
WHERE column_b= '/* this is not comment */'
So I think we need to add some keywords for it to skip when it traverses the string.
However, this modification changes the output compared with the current behavior.
E.g.:
originSql: select * from a where column_b='2021--this is not comment'
nowResult: SELECT * FROM a WHERE column_b='2021--this is not comment'
modifyResult: SELECT * FROM a WHERE column_b= '2021--this is not comment'
{noformat}
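To make the expected behavior concrete, here is a minimal standalone sketch of comment removal that checks for quotes before it checks for comment markers. It is not Calcite's code: the class name CommentStripSketch and the method stripComments are hypothetical, and the handling of doubled quotes and of unterminated comments is an assumption made for illustration.

```java
final class CommentStripSketch {

  // Hypothetical helper, not part of Calcite. Removes "--" line comments and
  // slash-star block comments, but copies quoted text through unchanged.
  static String stripComments(String sql) {
    StringBuilder out = new StringBuilder();
    int n = sql.length();
    int i = 0;
    while (i < n) {
      char c = sql.charAt(i);
      if (c == '\'' || c == '"') {
        // Quoted region: find the matching close quote.
        // Assumption: a doubled quote is an escaped quote, not the end.
        int j = i + 1;
        while (j < n) {
          if (sql.charAt(j) == c) {
            if (j + 1 < n && sql.charAt(j + 1) == c) {
              j += 2;
              continue;
            }
            break;
          }
          j++;
        }
        // An unterminated quote runs to the end of the input.
        int end = Math.min(j + 1, n);
        out.append(sql, i, end);
        i = end;
      } else if (sql.startsWith("--", i)) {
        // Line comment: skip up to (not including) the next newline.
        int eol = sql.indexOf('\n', i);
        i = eol < 0 ? n : eol;
      } else if (sql.startsWith("/*", i)) {
        // Block comment: skip it and emit one space so neighbors do not fuse.
        int close = sql.indexOf("*/", i + 2);
        i = close < 0 ? n : close + 2;
        out.append(' ');
      } else {
        out.append(c);
        i++;
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(stripComments(
        "select * from a where column_b='/* this is not comment */'"));
    System.out.println(stripComments(
        "select * from a where column_b='2021--this is not comment'"));
    System.out.println(stripComments(
        "select * from a where price> 10.0--comment"));
  }
}
```

Because the quote check comes first, the result no longer depends on whether a space follows the equals sign.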
> SqlSimpleParser inner Tokenizer should not recognize the sql of TokenType.ID
> or some keywords in some case
> ----------------------------------------------------------------------------------------------------------
>
> Key: CALCITE-4474
> URL: https://issues.apache.org/jira/browse/CALCITE-4474
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.26.0
> Reporter: wangjie
> Priority: Minor
> Labels: pull-request-available
> Fix For: 1.26.0
>
> Time Spent: 2h
> Remaining Estimate: 0h
>
> In SqlSimpleParser.Tokenizer#nextToken(), lines 401 - 423 are used to
> recognize SQL text as TokenType.ID or as certain keywords.
> If a contiguous run of characters is made up of several tokens, this
> code may produce the wrong result.
>
> E.g.:
> (1) select * from a where price> 10.0--comment
> 【10.0--comment】 should be recognized as TokenType.ID("10.0") and
> TokenType.COMMENT, but it is recognized as TokenType.ID("10.0--comment")
> (2) select * from a where column_b='/* this is not comment */'
> 【/* this is not comment */】 should be recognized as TokenType.SQID("/* this
> is not comment */"), but it is not
>
> You can find a reproduction demo at
> [https://github.com/wangjie-fourth/calcite-reproduce]
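The two cases in the description can be illustrated with a small word-scanning sketch that ends a word at a quote or at the start of a comment, not only at whitespace. This is a simplified illustration, not the code at the lines cited above: the class WordScanSketch and its methods are hypothetical, tokens are rendered as plain strings named after the token types mentioned in the issue, and operators such as "=" are left attached to the preceding word to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;

final class WordScanSketch {

  // Hypothetical tokenizer, not Calcite's. Returns tokens rendered as TYPE(text).
  static List<String> tokenize(String sql) {
    List<String> tokens = new ArrayList<>();
    int n = sql.length();
    int i = 0;
    while (i < n) {
      char c = sql.charAt(i);
      if (Character.isWhitespace(c)) {
        i++;
      } else if (c == '\'') {
        // Single-quoted text: runs to the next quote (or to end of input).
        int close = sql.indexOf('\'', i + 1);
        int end = close < 0 ? n : close + 1;
        tokens.add("SQID(" + sql.substring(i, end) + ")");
        i = end;
      } else if (sql.startsWith("--", i)) {
        // Line comment: runs to the end of the line.
        int eol = sql.indexOf('\n', i);
        int end = eol < 0 ? n : eol;
        tokens.add("COMMENT(" + sql.substring(i, end) + ")");
        i = end;
      } else if (sql.startsWith("/*", i)) {
        // Block comment: runs through the closing marker.
        int close = sql.indexOf("*/", i + 2);
        int end = close < 0 ? n : close + 2;
        tokens.add("COMMENT(" + sql.substring(i, end) + ")");
        i = end;
      } else {
        // Word: always consume one character, then stop at a boundary.
        int j = i + 1;
        while (j < n && !endsWord(sql, j)) {
          j++;
        }
        tokens.add("ID(" + sql.substring(i, j) + ")");
        i = j;
      }
    }
    return tokens;
  }

  // A word ends at whitespace, at a single quote, or where a comment starts.
  static boolean endsWord(String sql, int j) {
    char c = sql.charAt(j);
    return Character.isWhitespace(c)
        || c == '\''
        || sql.startsWith("--", j)
        || sql.startsWith("/*", j);
  }

  public static void main(String[] args) {
    System.out.println(tokenize("price> 10.0--comment"));
    System.out.println(tokenize("column_b='/* this is not comment */'"));
  }
}
```

The point of the sketch is the boundary test in endsWord: a word scan that stops only at whitespace would swallow the comment marker in case (1) and the opening quote in case (2).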
--
This message was sent by Atlassian Jira
(v8.3.4#803005)