[ 
https://issues.apache.org/jira/browse/CALCITE-7533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Darpan Lunagariya (e6data computing) updated CALCITE-7533:
----------------------------------------------------------
    Description: 
The parser fails on queries where a WITH clause is followed by a parenthesized 
query expression, even though this is permitted by {{SQL:2016 (<query 
expression body> → ( <query expression body> ))}} and accepted by PostgreSQL.

Example failing query:

{code:java}
SELECT * FROM (
  WITH q AS (SELECT 1 AS id)
  (SELECT id FROM q) UNION ALL (SELECT id FROM q)
) t
{code}

Error:

{code:java}
org.apache.calcite.sql.parser.SqlParseException: Encountered "(" at line 1, 
column 46.
Was expecting one of:
    "SELECT" ...
    "TABLE" ...
    "VALUE" ...
    "VALUES" ...
    "," ...
{code}



Root cause:

In core/src/main/codegen/templates/Parser.jj, the Query production only allows 
LeafQuery (i.e. SELECT / VALUES / TABLE) as the body after a WithList:

{code:java}
[ withList = WithList() ]
e = LeafQuery(exprContext) { list.add(e); }
( AddSetOpQuery(list, exprContext) )*
{code}

This is asymmetric in two ways:
1. The parallel QueryOrExpr rule already uses LeafQueryOrExpr after WithList, 
so a parenthesized query is allowed there.
2. AddSetOpQuery (used for the 2nd…Nth set-op operands) also uses 
LeafQueryOrExpr, so the parser accepts … UNION ALL (SELECT 1) but rejects 
(SELECT 1) UNION ALL … when it appears as the first operand after WITH.

Without WITH, the same shape parses fine because it goes through 
ExprOrJoinOrOrderedQuery's alt-2 (TableRef1) path:
(SELECT 1) UNION ALL (SELECT 2)   -- OK today

Proposed fix:

Branch Query so the post-WithList body uses `LeafQueryOrExpr` (mirroring 
QueryOrExpr), while the no-WITH path keeps LeafQuery. This preserves Query's 
first-set ({SELECT, VALUE, VALUES, TABLE}), so the LOOKAHEAD(2) decision in
ExprOrJoinOrOrderedQuery is unchanged and existing parses (e.g. (SELECT 1) 
UNION ALL (SELECT 2)) take the same path and produce the same tree.

{code:java}
SqlNode Query(ExprContext exprContext) :
{
    SqlNodeList withList = null;
    final SqlNode e;
    final List<Object> list = new ArrayList<Object>();
}
{
    (
        withList = WithList()
        e = LeafQueryOrExpr(exprContext) { list.add(e); }
    |
        e = LeafQuery(exprContext) { list.add(e); }
    )
    ( AddSetOpQuery(list, exprContext) )*
    { return addWith(withList, SqlParserUtil.toTree(list)); }
}
{code}

  was:
The parser fails on queries where a `WITH` clause is followed by a 
parenthesized query expression, even though this is permitted by `SQL:2016 
(<query expression body> → ( <query expression body> ))` and accepted by 
PostgreSQL.

Example failing query:
```
SELECT * FROM (
  WITH q AS (SELECT 1 AS id)
  (SELECT id FROM q) UNION ALL (SELECT id FROM q)
) t
```

Error:
```
org.apache.calcite.sql.parser.SqlParseException: Encountered "(" at line 1, 
column 46.
Was expecting one of:
    "SELECT" ...
    "TABLE" ...
    "VALUE" ...
    "VALUES" ...
    "," ...
```

Root cause:

In `core/src/main/codegen/templates/Parser.jj`, the Query production only 
allows `LeafQuery` (i.e. SELECT / VALUES / TABLE) as the body after a WithList:

```
[ withList = WithList() ]
e = LeafQuery(exprContext) { list.add(e); }
( AddSetOpQuery(list, exprContext) )*
```

This is asymmetric in two ways:
1. The parallel `QueryOrExpr` rule already uses `LeafQueryOrExpr` after 
WithList, so a parenthesized query is allowed there.
2. AddSetOpQuery (used for the 2nd…Nth set-op operands) also uses 
LeafQueryOrExpr, so the parser accepts … UNION ALL (SELECT 1) but rejects 
(SELECT 1) UNION ALL … when it appears as the first operand after WITH.

Without WITH, the same shape parses fine because it goes through 
ExprOrJoinOrOrderedQuery's alt-2 (TableRef1) path:
(SELECT 1) UNION ALL (SELECT 2)   -- OK today

Proposed fix:

Branch Query so the post-WithList body uses `LeafQueryOrExpr` (mirroring 
QueryOrExpr), while the no-WITH path keeps LeafQuery. This preserves Query's 
first-set `({SELECT, VALUE, VALUES, TABLE})`, so the `LOOKAHEAD(2)` decision in
`ExprOrJoinOrOrderedQuery` is unchanged and existing parses (e.g. (SELECT 1) 
UNION ALL (SELECT 2)) take the same path and produce the same tree.

```
SqlNode Query(ExprContext exprContext) :
{
    SqlNodeList withList = null;
    final SqlNode e;
    final List<Object> list = new ArrayList<Object>();
}
{
    (
        withList = WithList()
        e = LeafQueryOrExpr(exprContext) { list.add(e); }
    |
        e = LeafQuery(exprContext) { list.add(e); }
    )
    ( AddSetOpQuery(list, exprContext) )*
    { return addWith(withList, SqlParserUtil.toTree(list)); }
}
```


> Parser rejects parenthesized query as the body of a WITH clause
> ---------------------------------------------------------------
>
>                 Key: CALCITE-7533
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7533
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.41.0
>            Reporter: Darpan Lunagariya (e6data computing)
>            Priority: Minor
>
> The parser fails on queries where a WITH clause is followed by a 
> parenthesized query expression, even though this is permitted by {{SQL:2016 
> (<query expression body> → ( <query expression body> ))}} and accepted by 
> PostgreSQL.
> Example failing query:
> {code:java}
> SELECT * FROM (
>   WITH q AS (SELECT 1 AS id)
>   (SELECT id FROM q) UNION ALL (SELECT id FROM q)
> ) t
> {code}
> Error:
> {code:java}
> org.apache.calcite.sql.parser.SqlParseException: Encountered "(" at line 1, 
> column 46.
> Was expecting one of:
>     "SELECT" ...
>     "TABLE" ...
>     "VALUE" ...
>     "VALUES" ...
>     "," ...
> {code}
> Root cause:
> In core/src/main/codegen/templates/Parser.jj, the Query production only 
> allows LeafQuery (i.e. SELECT / VALUES / TABLE) as the body after a WithList:
> {code:java}
> [ withList = WithList() ]
> e = LeafQuery(exprContext) { list.add(e); }
> ( AddSetOpQuery(list, exprContext) )*
> {code}
> This is asymmetric in two ways:
> 1. The parallel QueryOrExpr rule already uses LeafQueryOrExpr after WithList, 
> so a parenthesized query is allowed there.
> 2. AddSetOpQuery (used for the 2nd…Nth set-op operands) also uses 
> LeafQueryOrExpr, so the parser accepts … UNION ALL (SELECT 1) but rejects 
> (SELECT 1) UNION ALL … when it appears as the first operand after WITH.
> Without WITH, the same shape parses fine because it goes through 
> ExprOrJoinOrOrderedQuery's alt-2 (TableRef1) path:
> (SELECT 1) UNION ALL (SELECT 2)   -- OK today
> Proposed fix:
> Branch Query so the post-WithList body uses `LeafQueryOrExpr` (mirroring 
> QueryOrExpr), while the no-WITH path keeps LeafQuery. This preserves Query's 
> first-set ({SELECT, VALUE, VALUES, TABLE}), so the LOOKAHEAD(2) decision in
> ExprOrJoinOrOrderedQuery is unchanged and existing parses (e.g. (SELECT 1) 
> UNION ALL (SELECT 2)) take the same path and produce the same tree.
> {code:java}
> SqlNode Query(ExprContext exprContext) :
> {
>     SqlNodeList withList = null;
>     final SqlNode e;
>     final List<Object> list = new ArrayList<Object>();
> }
> {
>     (
>         withList = WithList()
>         e = LeafQueryOrExpr(exprContext) { list.add(e); }
>     |
>         e = LeafQuery(exprContext) { list.add(e); }
>     )
>     ( AddSetOpQuery(list, exprContext) )*
>     { return addWith(withList, SqlParserUtil.toTree(list)); }
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to