[ 
https://issues.apache.org/jira/browse/SPARK-52261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18043519#comment-18043519
 ] 

André Souprayane commented on SPARK-52261:
------------------------------------------

The first SQL query generates the following logical plan:

'WithCTE
:- 'CTERelationDef 2, false => 
:  +- 'SubqueryAlias bar
:     +- 'Project [*]
:        +- 'SubqueryAlias foo
:           +- 'CTERelationRef 1, false, false, false, 6, false
+- 'Project [*]
   +- 'SubqueryAlias foo
      +- 'SubqueryAlias foo
         +- 'Union false, false
            :- SubqueryAlias a
            :  +- LocalRelation [str#4, num#5]
            +- SubqueryAlias b
               +- LocalRelation [str#6]


The second SQL query generates the following logical plan:

'Project [*]
+- 'SubqueryAlias foo
   +- 'SubqueryAlias foo
      +- 'Union false, false
         :- SubqueryAlias a
         :  +- LocalRelation [str#1, num#2]
         +- SubqueryAlias b
            +- LocalRelation [str#3]


When the query has two CTE subqueries, the second one (bar) is kept as a 
CTERelationDef child plan of WithCTE, whereas the first one (foo) is inlined 
as a SubqueryAlias.
In this case the definition of bar contains an unresolved star expression, 
which is why CheckAnalysis fails while checking the first child plan. 
I have not found a way to change only the CheckAnalysis class so that it does 
not fail on the unresolved star expression and fails instead on the root 
cause, which is the mismatch in the number of columns.
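
To illustrate the ordering problem, here is a minimal toy model in Python. It 
is not Spark code: Node, first_error, all_errors, root_cause_error and the 
DERIVED set are all hypothetical names, and the model assumes the check walks 
the plan bottom-up, children left to right, and stops at the first failure. 
Under that assumption the star inside the CTE definition is reached before the 
Union in the main plan.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Toy plan node; `error` is the error class this node would raise when checked."""
    name: str
    children: List["Node"] = field(default_factory=list)
    error: Optional[str] = None


def first_error(node: Node) -> Optional[str]:
    """Post-order walk (children left to right, then the node); stop at the first failure."""
    for child in node.children:
        err = first_error(child)
        if err is not None:
            return err
    return node.error


def all_errors(node: Node) -> List[str]:
    """Same walk, but collect every failure instead of stopping."""
    errs: List[str] = []
    for child in node.children:
        errs.extend(all_errors(child))
    if node.error is not None:
        errs.append(node.error)
    return errs


# Assumption: errors that are usually a consequence of another unresolved node.
DERIVED = {"INVALID_USAGE_OF_STAR_OR_REGEX"}


def root_cause_error(node: Node) -> Optional[str]:
    """Hypothetical alternative: prefer the first error that is not in DERIVED."""
    errs = all_errors(node)
    for err in errs:
        if err not in DERIVED:
            return err
    return errs[0] if errs else None


def union() -> Node:
    # Two inputs with a different number of columns.
    return Node("Union", [Node("LocalRelation a"), Node("LocalRelation b")],
                error="NUM_COLUMNS_MISMATCH")


def project_star(child: Node) -> Node:
    # A Project whose star cannot be expanded because its child is unresolved.
    return Node("Project [*]", [child], error="INVALID_USAGE_OF_STAR_OR_REGEX")


# Shape of the first plan: WithCTE(CTERelationDef(bar), main plan).
first_plan = Node("WithCTE", [
    Node("CTERelationDef", [
        Node("SubqueryAlias bar", [
            project_star(Node("SubqueryAlias foo", [Node("CTERelationRef")])),
        ]),
    ]),
    project_star(Node("SubqueryAlias foo", [union()])),
])

# Shape of the second plan: no CTE definition, only the main plan.
second_plan = project_star(Node("SubqueryAlias foo", [union()]))

print(first_error(first_plan), first_error(second_plan))
print(root_cause_error(first_plan), root_cause_error(second_plan))
```

In this model first_error reports the star error for the first plan and the 
column mismatch for the second, which matches the behaviour in the issue. 
root_cause_error reports the column mismatch for both, but it needs to collect 
all failures and rank them, which is more than a local change to one check.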

> Misleading error: Invalid usage of '*'
> --------------------------------------
>
>                 Key: SPARK-52261
>                 URL: https://issues.apache.org/jira/browse/SPARK-52261
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: Max Gekk
>            Priority: Major
>
> The code below raises the misleading error:
> {code:java}
> [INVALID_USAGE_OF_STAR_OR_REGEX] Invalid usage of '*' in Project. SQLSTATE: 
> 42000; line 7 pos 9;
> {code}
> {code:sql}
>     with foo as (
>           values ("one", 1), ("two", 2), ("three", 3) as a (str, num)
>           union all
>           values ("four"), ("five"), ("six") as b (str)
>         ),
>         bar as (
>           select * from foo
>         )
>         select * from foo
> {code}
> The error is not caused by '*' usage, and should be similar to:
> {code:sql}
>     with foo as (
>           values ("one", 1), ("two", 2), ("three", 3) as a (str, num)
>           union all
>           values ("four"), ("five"), ("six") as b (str)
>         )
>         select * from foo
> {code}
> {code:java}
> [NUM_COLUMNS_MISMATCH] UNION can only be performed on inputs with the same 
> number of columns, but the first input has 2 columns and the second input has 
> 1 columns. SQLSTATE: 42826; line 2 pos 2;
> {code}



