[ https://issues.apache.org/jira/browse/SPARK-2781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Armbrust reopened SPARK-2781:
-------------------------------------
I'm sorry... I thought this was stale and did not read it carefully. Reopening.
> Analyzer should check resolution of LogicalPlans
> ------------------------------------------------
>
> Key: SPARK-2781
> URL: https://issues.apache.org/jira/browse/SPARK-2781
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Reporter: Aaron Staple
> Assignee: Michael Armbrust
> Fix For: 1.0.1, 1.1.0
>
>
> Currently the Analyzer’s CheckResolution rule checks that all attributes are
> resolved by searching for unresolved Expressions. But some LogicalPlans,
> including Union, contain custom implementations of the resolved attribute that
> validate other criteria in addition to checking for attribute resolution of
> their descendants. These LogicalPlans are not currently validated by the
> CheckResolution implementation.
> As a result, it is currently possible to execute a query generated from
> unresolved LogicalPlans. One example is a UNION query that produces rows
> with different data types in the same column:
> {noformat}
> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
> import sqlContext._
> case class T1(value:Seq[Int])
> val t1 = sc.parallelize(Seq(T1(Seq(0,1))))
> t1.registerAsTable("t1")
> sqlContext.sql("SELECT value FROM t1 UNION SELECT 2 FROM t1").collect()
> {noformat}
> In this example, the type coercion implementation cannot unify array and
> integer types. One row contains an array in the returned column and the
> other row contains an integer. The result is:
> {noformat}
> res3: Array[org.apache.spark.sql.Row] = Array([List(0, 1)], [2])
> {noformat}
> I believe fixing this is a first step toward improving validation for Union
> (and similar) plans. (For instance, Union does not currently validate that
> its children contain the same number of columns.)
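
The gap described above can be illustrated with a small toy model. This is only a sketch: Plan, Project, Union, DataType and PlanCheck below are hypothetical stand-ins written for this example, not the Catalyst classes, and the equality test on output types is an assumed validation rule rather than Spark's actual one. It shows a check that asks each plan node for its own resolved value, so that a node-specific override (as in Union) is honored instead of being bypassed by an expression-only search.

```scala
// Toy model only: every type here is a hypothetical stand-in, not a Catalyst class.
sealed trait DataType
case object IntType extends DataType
final case class ArrayType(element: DataType) extends DataType

sealed trait Plan {
  def children: Seq[Plan]
  def output: Seq[DataType]
  // Default rule: a plan is resolved when all of its children are resolved.
  def resolved: Boolean = children.forall(_.resolved)
}

// Leaf plan that simply carries the types of its output columns.
final case class Project(output: Seq[DataType]) extends Plan {
  val children: Seq[Plan] = Nil
}

final case class Union(left: Plan, right: Plan) extends Plan {
  val children: Seq[Plan] = Seq(left, right)
  def output: Seq[DataType] = left.output
  // Custom rule (assumed for this sketch): both sides must also agree on
  // the number of columns and on each column's type.
  override def resolved: Boolean =
    super.resolved && left.output == right.output
}

object PlanCheck {
  // Returns the first unresolved plan, looking at children before the node
  // itself so that the most specific offending node is reported.
  def firstUnresolved(plan: Plan): Option[Plan] = {
    val inChildren =
      plan.children.flatMap(c => firstUnresolved(c).toList).headOption
    inChildren.orElse(if (plan.resolved) None else Some(plan))
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    val arrays = Project(Seq(ArrayType(IntType))) // like SELECT value FROM t1
    val ints   = Project(Seq(IntType))            // like SELECT 2 FROM t1

    // Mismatched column types: the Union itself is reported.
    val mixed = Union(arrays, ints)
    assert(PlanCheck.firstUnresolved(mixed) == Some(mixed))

    // Mismatched column counts are caught by the same rule.
    val uneven = Union(ints, Project(Seq(IntType, IntType)))
    assert(PlanCheck.firstUnresolved(uneven) == Some(uneven))

    // Matching sides pass the check.
    assert(PlanCheck.firstUnresolved(Union(ints, ints)).isEmpty)
  }
}
```

In this model an analyzer rule would raise an error whenever firstUnresolved returns a plan, which would reject the UNION query from the description before execution instead of returning rows of mixed types.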
--
This message was sent by Atlassian JIRA
(v6.2#6252)