[ 
https://issues.apache.org/jira/browse/CALCITE-4474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17270822#comment-17270822
 ] 

wangjie commented on CALCITE-4474:
----------------------------------

[~julianhyde]
{noformat}
Hi, I may not have described the cause of the problem correctly before. I think 
the cause of this problem is not just "SqlAdvisor ignores '--' comments".
It is that SqlSimpleParser.Tokenizer#nextToken() cannot distinguish tokens 
correctly when a continuous string (one with no whitespace) is composed of 
multiple tokens. 
 
E.g.:
select * from a where column_b='/* this is not comment */' => SELECT * FROM a 
WHERE column_b=' '
This happens because there is no space between the equal sign and the quoted 
string, so they are read as one continuous run of characters. If there is a 
space after the equal sign, the string is preserved:
select * from a where column_b= '/* this is not comment */' => SELECT * FROM a 
WHERE column_b= '/* this is not comment */' 
 
So I think we need to add some keywords for it to skip when it traverses the 
string. However, this modification changes the output compared with the 
previous behavior.
E.g.:
originSql:    select * from a where column_b='2021--this is not comment'
nowResult:    SELECT * FROM a WHERE column_b='2021--this is not comment'
modifyResult: SELECT * FROM a WHERE column_b= '2021--this is not comment'
{noformat}
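
To illustrate the behavior described above, here is a minimal stand-alone sketch of a tokenizer that consumes a quoted string as one token and ends an identifier or number at the start of a comment. It is not Calcite's SqlSimpleParser.Tokenizer and not the proposed patch; the class name SketchTokenizer, the method tokenize and the helper isWordChar are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-alone sketch; NOT Calcite's SqlSimpleParser.Tokenizer. */
final class SketchTokenizer {
  private SketchTokenizer() {}

  /** Splits sql into tokens; comments are dropped, quoted strings are kept whole. */
  static List<String> tokenize(String sql) {
    List<String> tokens = new ArrayList<>();
    final int n = sql.length();
    int i = 0;
    while (i < n) {
      char c = sql.charAt(i);
      if (Character.isWhitespace(c)) {
        i++;
      } else if (c == '\'') {
        // Quoted string: consume up to the closing quote, so that "--" or
        // "/*" inside the literal is never treated as a comment.
        int j = i + 1;
        while (j < n) {
          if (sql.charAt(j) == '\'') {
            if (j + 1 < n && sql.charAt(j + 1) == '\'') {
              j += 2; // '' is an escaped quote inside the literal
              continue;
            }
            break; // closing quote found
          }
          j++;
        }
        int end = Math.min(j + 1, n); // tolerate an unterminated literal
        tokens.add(sql.substring(i, end));
        i = end;
      } else if (sql.startsWith("--", i)) {
        // Line comment: skip to the end of the line.
        int eol = sql.indexOf('\n', i);
        i = eol < 0 ? n : eol + 1;
      } else if (sql.startsWith("/*", i)) {
        // Block comment: skip to the closing "*/".
        int close = sql.indexOf("*/", i + 2);
        i = close < 0 ? n : close + 2;
      } else if (isWordChar(c)) {
        // Identifier or number: stop at the first non-word character, so
        // "10.0--comment" yields "10.0" and the comment is handled above.
        int j = i;
        while (j < n && isWordChar(sql.charAt(j))) {
          j++;
        }
        tokens.add(sql.substring(i, j));
        i = j;
      } else {
        // Any other character (=, >, *, ...) is a single-character token.
        tokens.add(String.valueOf(c));
        i++;
      }
    }
    return tokens;
  }

  /** Assumption for this sketch: letters, digits, '_' and '.' form a word. */
  private static boolean isWordChar(char c) {
    return Character.isLetterOrDigit(c) || c == '_' || c == '.';
  }

  public static void main(String[] args) {
    System.out.println(tokenize("select * from a where column_b='/* this is not comment */'"));
    System.out.println(tokenize("select * from a where price> 10.0--comment"));
  }
}
```

In this sketch the equal sign is its own token, so the quoted literal is recognized whether or not a space follows the equal sign. A real fix inside nextToken() would also have to keep the existing TokenType handling, which this example leaves out.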
 

 

> SqlSimpleParser inner Tokenizer should not recognize the sql of TokenType.ID 
> or some keywords in some case
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-4474
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4474
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.26.0
>            Reporter: wangjie
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 1.26.0
>
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> Lines 401 - 423 of SqlSimpleParser.Tokenizer#nextToken() are used to 
> recognize SQL of TokenType.ID or some keywords. 
> If a run of characters with no whitespace is made up of several tokens, 
> this code may produce the wrong result.
>  
> E.g.:
>  (1) select * from a where price> 10.0--comment
> 【10.0--comment】 should be recognized as TokenType.ID("10.0") followed by 
> TokenType.COMMENT, but it is recognized as TokenType.ID("10.0--comment").
>  (2) select * from a where column_b='/* this is not comment */'
> 【/* this is not comment */】 should be recognized as 
> TokenType.SQID("/* this is not comment */"), but it is not.
>   
> You can find a reproduction demo at 
> [https://github.com/wangjie-fourth/calcite-reproduce]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
