[ 
https://issues.apache.org/jira/browse/SPARK-43841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17726980#comment-17726980
 ] 

Bruce Robbins commented on SPARK-43841:
---------------------------------------

PR at https://github.com/apache/spark/pull/41353

> Non-existent column in projection of full outer join with USING results in 
> StringIndexOutOfBoundsException
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-43841
>                 URL: https://issues.apache.org/jira/browse/SPARK-43841
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.5.0
>            Reporter: Bruce Robbins
>            Priority: Minor
>
> The following query throws a {{StringIndexOutOfBoundsException}}:
> {noformat}
> with v1 as (
>  select * from values (1, 2) as (c1, c2)
> ),
> v2 as (
>   select * from values (2, 3) as (c1, c2)
> )
> select v1.c1, v1.c2, v2.c1, v2.c2, b
> from v1
> full outer join v2
> using (c1);
> {noformat}
> The query should fail anyway, since {{b}} refers to a non-existent column. 
> But it should fail with a helpful error message, not with a 
> {{StringIndexOutOfBoundsException}}.
> The issue seems to be in 
> {{StringUtils#orderSuggestedIdentifiersBySimilarity}}. 
> {{orderSuggestedIdentifiersBySimilarity}} assumes that a list of candidate 
> attributes with a mix of prefixes will never have an attribute name with an 
> empty prefix. But in this case it does ({{c1}} from the {{coalesce}} has no 
> prefix, since it is not associated with any relation or subquery):
> {noformat}
> +- 'Project [c1#5, c2#6, c1#7, c2#8, 'b]
>    +- Project [coalesce(c1#5, c1#7) AS c1#9, c2#6, c2#8] <== c1#9 has no prefix, unlike c2#6 (v1.c2) or c2#8 (v2.c2)
>       +- Join FullOuter, (c1#5 = c1#7)
>          :- SubqueryAlias v1
>          :  +- CTERelationRef 0, true, [c1#5, c2#6]
>          +- SubqueryAlias v2
>             +- CTERelationRef 1, true, [c1#7, c2#8]
> {noformat}
> Because of this, {{orderSuggestedIdentifiersBySimilarity}} returns a sorted 
> list of suggestions like this:
> {noformat}
> ArrayBuffer(.c1, v1.c2, v2.c2)
> {noformat}
> {{UnresolvedAttribute.parseAttributeName}} then fails on {{.c1}}, an attribute 
> name that starts with a namespace separator ('.').
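
The malformed suggestion described above can be illustrated with a small Python sketch. This is only a simplified model, not Spark's code: the (prefix, name) pair representation and the function names qualify_naive and qualify_guarded are assumptions made for the example. It shows how joining an empty prefix to a name with an unconditional separator yields ".c1", and one possible guard that avoids it.

```python
# Hypothetical model of the candidate attributes as (prefix, name) pairs.
# c1 comes from the coalesce and therefore has an empty prefix.
candidates = [("", "c1"), ("v1", "c2"), ("v2", "c2")]


def qualify_naive(prefix, name):
    # Always inserts the separator, even when the prefix is empty.
    return f"{prefix}.{name}"


def qualify_guarded(prefix, name):
    # Emits the separator only when there is a prefix to separate.
    return f"{prefix}.{name}" if prefix else name


naive = [qualify_naive(p, n) for p, n in candidates]
guarded = [qualify_guarded(p, n) for p, n in candidates]

print(naive)    # ['.c1', 'v1.c2', 'v2.c2']
print(guarded)  # ['c1', 'v1.c2', 'v2.c2']
```

With the guard in place, every suggestion is a well-formed identifier. Whether the linked PR takes this approach is not stated in this message.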



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
