Ioana Delaney created SPARK-16161:
-------------------------------------

             Summary: Ambiguous error message for unsupported correlated predicate subqueries
                 Key: SPARK-16161
                 URL: https://issues.apache.org/jira/browse/SPARK-16161
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.0.0
            Reporter: Ioana Delaney
            Priority: Minor


Subqueries with deep correlation fail with ambiguous error message.

Problem repro:
{code}
Seq((1, 1), (2, 2)).toDF("c1", "c2").createOrReplaceTempView("t1")
Seq((1, 1), (2, 2)).toDF("c1", "c2").createOrReplaceTempView("t2")
Seq((1, 1), (2, 2)).toDF("c1", "c2").createOrReplaceTempView("t3")

sql("select c1 from t1 where c1 IN (select t2.c1 from t2 where t2.c2 IN (select t3.c2 from t3 where t3.c1 = t1.c1))").show()

org.apache.spark.sql.AnalysisException: filter expression 'listquery()' of type array<null> is not a boolean.;
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:40)
  at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:58)
{code}

Based on testing, Spark supports one level of correlation in predicate and scalar subqueries, i.e. a subquery may reference columns of its immediate outer query block. An example of supported correlation is shown below.

{code}
select c1 from t1
where c1 IN (select t2.c1 from t2 where t2.c2 IN (select t3.c2 from t3 where t3.c1 = t2.c1))
{code}

If the query has deep correlation, as in the first example, where the innermost subquery is correlated to the outermost query block, the ambiguous error message above is issued.
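
In this particular repro, the deep correlation happens to be avoidable by hand: the outer predicate {{c1 IN (select t2.c1 from t2 ...)}} already equates {{t1.c1}} with {{t2.c1}} for every qualifying row, so the innermost reference to {{t1.c1}} can be rewritten against the immediate outer block. The sketch below shows this manual rewrite (it is equivalent only because of the equality implied by the IN predicate; Spark does not perform this rewrite itself):

{code}
-- Hand-rewritten equivalent with only one level of correlation:
-- rows surviving the outer IN predicate satisfy t1.c1 = t2.c1,
-- so the inner reference to t1.c1 may be replaced with t2.c1.
select c1 from t1
where c1 IN (select t2.c1 from t2
             where t2.c2 IN (select t3.c2 from t3 where t3.c1 = t2.c1))
{code}

This rewritten form matches the supported-correlation example and runs without error, but the rewrite is query-specific; the general case still needs the clearer error message proposed here.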

The proposed fix changes the error message to the following:

{code}
Correlated column in subquery cannot be resolved: t1.c1; line 5 pos 28
org.apache.spark.sql.AnalysisException: Correlated column in subquery cannot be resolved: t1.c1; line 5 pos 28
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:40)
{code}



