Bruce Robbins created SPARK-43841:
-------------------------------------

             Summary: Non-existent column in projection of full outer join with USING results in StringIndexOutOfBoundsException
                 Key: SPARK-43841
                 URL: https://issues.apache.org/jira/browse/SPARK-43841
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.5.0
            Reporter: Bruce Robbins


The following query throws a {{StringIndexOutOfBoundsException}}:
{noformat}
with v1 as (
  select * from values (1, 2) as (c1, c2)
),
v2 as (
  select * from values (2, 3) as (c1, c2)
)
select v1.c1, v1.c2, v2.c1, v2.c2, b
from v1
full outer join v2
using (c1);
{noformat}
The query should fail regardless, since {{b}} refers to a non-existent column. 
However, it should fail with a helpful error message, not with a 
{{StringIndexOutOfBoundsException}}.

The issue seems to be in {{StringUtils#orderSuggestedIdentifiersBySimilarity}}. 
{{orderSuggestedIdentifiersBySimilarity}} assumes that a list of candidate 
attributes with a mix of prefixes will never have an attribute name with an 
empty prefix. But in this case it does ({{c1}} from the {{coalesce}} has no 
prefix, since it is not associated with any relation or subquery):
{noformat}
+- 'Project [c1#5, c2#6, c1#7, c2#8, 'b]
   +- Project [coalesce(c1#5, c1#7) AS c1#9, c2#6, c2#8] <== c1#9 has no prefix, unlike c2#6 (v1.c2) or c2#8 (v2.c2)
      +- Join FullOuter, (c1#5 = c1#7)
         :- SubqueryAlias v1
         :  +- CTERelationRef 0, true, [c1#5, c2#6]
         +- SubqueryAlias v2
            +- CTERelationRef 1, true, [c1#7, c2#8]
{noformat}
Because of this, {{orderSuggestedIdentifiersBySimilarity}} returns a sorted 
list of suggestions like this:
{noformat}
ArrayBuffer(.c1, v1.c2, v2.c2)
{noformat}
{{UnresolvedAttribute.parseAttributeName}} then fails on the first suggestion, 
{{.c1}}, because the name starts with a namespace separator ('.').
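
The malformed suggestion can be illustrated with a minimal sketch. This is not Spark's code: {{qualify_naive}} and {{qualify_safe}} are hypothetical helpers, and the sketch assumes each candidate is a list of name parts whose prefix may be empty. It only shows how joining an empty prefix with a column name yields a leading separator, and one possible guard.

```python
# Hypothetical illustration only; these helpers are not part of Spark.

def qualify_naive(parts):
    """Join name parts with '.', even when the prefix part is empty."""
    return ".".join(parts)


def qualify_safe(parts):
    """Join name parts with '.', skipping empty parts such as a missing prefix."""
    return ".".join(p for p in parts if p)


# Candidates modeled on the plan above: c1 from the coalesce has no prefix,
# while the two c2 attributes are qualified by v1 and v2.
candidates = [["", "c1"], ["v1", "c2"], ["v2", "c2"]]

naive = [qualify_naive(p) for p in candidates]
safe = [qualify_safe(p) for p in candidates]

print(naive)  # ['.c1', 'v1.c2', 'v2.c2']
print(safe)   # ['c1', 'v1.c2', 'v2.c2']
```

Skipping the empty prefix is only one option. A fix could instead treat unqualified candidates separately before the suggestions are ordered.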



