Josh Elser created PHOENIX-3246:
-----------------------------------

             Summary: U+2002 (En Space) not handled as whitespace in grammar
                 Key: PHOENIX-3246
                 URL: https://issues.apache.org/jira/browse/PHOENIX-3246
             Project: Phoenix
          Issue Type: Bug
            Reporter: Josh Elser
            Assignee: Josh Elser
             Fix For: 4.9.0, 4.8.1


I had the goofiest query issue the other day. A seemingly fine query was 
throwing a parse error via sqlline.

{noformat}
Error: ERROR 601 (42P00): Syntax error. Unexpected char: ' ' 
(state=42P00,code=601)
org.apache.phoenix.exception.PhoenixParserException: ERROR 601 (42P00): Syntax 
error. Unexpected char: ' '
        at org.apache.phoenix.exception.PhoenixParserException.newException(PhoenixParserException.java:33)
        at org.apache.phoenix.parse.SQLParser.parseStatement(SQLParser.java:118)
        at org.apache.phoenix.jdbc.PhoenixStatement$PhoenixStatementParser.parseStatement(PhoenixStatement.java:1280)
        at org.apache.phoenix.jdbc.PhoenixStatement.parseStatement(PhoenixStatement.java:1363)
        at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1434)
        at sqlline.Commands.execute(Commands.java:822)
        at sqlline.Commands.sql(Commands.java:732)
        at sqlline.SqlLine.dispatch(SqlLine.java:807)
        at sqlline.SqlLine.runCommands(SqlLine.java:1710)
        at sqlline.Commands.run(Commands.java:1285)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:36)
        at sqlline.SqlLine.dispatch(SqlLine.java:803)
        at sqlline.SqlLine.initArgs(SqlLine.java:613)
        at sqlline.SqlLine.begin(SqlLine.java:656)
        at sqlline.SqlLine.start(SqlLine.java:398)
        at sqlline.SqlLine.main(SqlLine.java:292)
Caused by: java.lang.RuntimeException: Unexpected char: ' '
        at org.apache.phoenix.parse.PhoenixSQLLexer.mOTHER(PhoenixSQLLexer.java:4324)
        at org.apache.phoenix.parse.PhoenixSQLLexer.mTokens(PhoenixSQLLexer.java:5437)
        at org.apache.phoenix.shaded.org.antlr.runtime.Lexer.nextToken(Lexer.java:85)
        at org.apache.phoenix.shaded.org.antlr.runtime.BufferedTokenStream.fetch(BufferedTokenStream.java:143)
        at org.apache.phoenix.shaded.org.antlr.runtime.BufferedTokenStream.sync(BufferedTokenStream.java:137)
        at org.apache.phoenix.shaded.org.antlr.runtime.CommonTokenStream.skipOffTokenChannels(CommonTokenStream.java:113)
        at org.apache.phoenix.shaded.org.antlr.runtime.CommonTokenStream.LT(CommonTokenStream.java:102)
        at org.apache.phoenix.shaded.org.antlr.runtime.BufferedTokenStream.LA(BufferedTokenStream.java:174)
        at org.apache.phoenix.shaded.org.antlr.runtime.BaseRecognizer.mismatchIsUnwantedToken(BaseRecognizer.java:127)
        at org.apache.phoenix.parse.PhoenixSQLParser.recoverFromMismatchedToken(PhoenixSQLParser.java:354)
        at org.apache.phoenix.shaded.org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
        at org.apache.phoenix.parse.PhoenixSQLParser.parseNoReserved(PhoenixSQLParser.java:9969)
        at org.apache.phoenix.parse.PhoenixSQLParser.identifier(PhoenixSQLParser.java:9936)
        at org.apache.phoenix.parse.PhoenixSQLParser.column_def(PhoenixSQLParser.java:3938)
        at org.apache.phoenix.parse.PhoenixSQLParser.column_defs(PhoenixSQLParser.java:3858)
        at org.apache.phoenix.parse.PhoenixSQLParser.create_table_node(PhoenixSQLParser.java:1104)
        at org.apache.phoenix.parse.PhoenixSQLParser.oneStatement(PhoenixSQLParser.java:816)
        at org.apache.phoenix.parse.PhoenixSQLParser.statement(PhoenixSQLParser.java:508)
        at org.apache.phoenix.parse.SQLParser.parseStatement(SQLParser.java:108)
        ... 18 more
{noformat}

Re-typing the statement by hand worked successfully.

After some hexdump and diff action, I finally found out that some of the space 
characters in the statement were not the normal ASCII 0x20 character, but 
actually the Unicode U+2002 "En Space" character. The two look identical to the 
eye, which is what caused all of the confusion.
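
A minimal Python sketch of how such characters could be spotted without 
resorting to hexdump, assuming the statement is available as a string. The 
helper name find_unusual_spaces and the sample CREATE TABLE statement are 
hypothetical, made up purely for illustration; they are not taken from the 
failing query or from Phoenix itself.

```python
import unicodedata


def find_unusual_spaces(sql):
    """Return (index, code point, Unicode name) for every character that is a
    Unicode space separator (category "Zs") but is not the ASCII space U+0020."""
    hits = []
    for i, ch in enumerate(sql):
        if ch != " " and unicodedata.category(ch) == "Zs":
            hits.append((i, "U+%04X" % ord(ch), unicodedata.name(ch, "UNKNOWN")))
    return hits


# Hypothetical statement: the gap before INTEGER is U+2002, written as an
# escape so that it is visible in the source.
statement = "CREATE TABLE t (id\u2002INTEGER PRIMARY KEY)"

for index, code_point, name in find_unusual_spaces(statement):
    print(index, code_point, name)
```

In UTF-8 the En Space is encoded as the three bytes e2 80 82, so that byte 
sequence is what appears in a hexdump where a single 20 byte would be expected.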



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
