[ 
https://issues.apache.org/jira/browse/SPARK-2781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Armbrust reopened SPARK-2781:
-------------------------------------


I'm sorry... I thought this was stale and did not read it carefully. Reopening.

> Analyzer should check resolution of LogicalPlans
> ------------------------------------------------
>
>                 Key: SPARK-2781
>                 URL: https://issues.apache.org/jira/browse/SPARK-2781
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Aaron Staple
>            Assignee: Michael Armbrust
>             Fix For: 1.0.1, 1.1.0
>
>
> Currently the Analyzer’s CheckResolution rule checks that all attributes are 
> resolved by searching for unresolved Expressions.  But some LogicalPlans, 
> including Union, contain custom implementations of the `resolved` attribute that 
> validate other criteria in addition to checking for attribute resolution of 
> their descendants.  These LogicalPlans are not currently validated by the 
> CheckResolution implementation.
> As a result, it is currently possible to execute a query generated from 
> unresolved LogicalPlans.  One example is a UNION query that produces rows 
> with different data types in the same column:
> {noformat}
> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
> import sqlContext._
> case class T1(value:Seq[Int])
> val t1 = sc.parallelize(Seq(T1(Seq(0,1))))
> t1.registerAsTable("t1")
> sqlContext.sql("SELECT value FROM t1 UNION SELECT 2 FROM t1").collect()
> {noformat}
> In this example, the type coercion implementation cannot unify array and 
> integer types.  One row contains an array in the returned column and the 
> other row contains an integer.  The result is:
> {noformat}
> res3: Array[org.apache.spark.sql.Row] = Array([List(0, 1)], [2])
> {noformat}
> I believe fixing this is a first step toward improving validation for Union 
> (and similar) plans.  (For instance, Union does not currently validate that 
> its children contain the same number of columns.)
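
To make the gap described above concrete, here is a minimal sketch in plain Scala. It is a toy model under stated assumptions, not Catalyst's actual code: `Plan`, `Project`, `Union`, `expressionsOnlyCheck` and `planLevelCheck` are hypothetical names invented for illustration, and the two data types are simplified stand-ins. It only shows why a check that searches for unresolved expressions can pass a plan whose own `resolved` definition says otherwise.

```scala
// Toy model only: these classes are hypothetical stand-ins, not Catalyst's real API.
object ResolutionSketch {
  sealed trait DataType
  case object IntType extends DataType
  case object ArrayIntType extends DataType

  // An expression is either resolved to a typed attribute or still unresolved.
  sealed trait Expr { def resolved: Boolean }
  case class Attr(name: String, dataType: DataType) extends Expr { val resolved = true }
  case class UnresolvedAttr(name: String) extends Expr { val resolved = false }

  sealed trait Plan {
    def expressions: Seq[Expr]
    def children: Seq[Plan]
    def output: Seq[DataType] = expressions.collect { case Attr(_, t) => t }
    // Default rule: a plan is resolved when its expressions and children are.
    def resolved: Boolean = expressions.forall(_.resolved) && children.forall(_.resolved)
  }

  case class Project(expressions: Seq[Expr]) extends Plan {
    val children: Seq[Plan] = Nil
  }

  case class Union(left: Plan, right: Plan) extends Plan {
    val expressions: Seq[Expr] = Nil
    val children: Seq[Plan] = Seq(left, right)
    override def output: Seq[DataType] = left.output
    // Custom rule: both sides must also produce the same column types
    // (which implies the same number of columns).
    override def resolved: Boolean =
      children.forall(_.resolved) && left.output == right.output
  }

  // Check in the style the issue describes: look only for unresolved expressions.
  def expressionsOnlyCheck(plan: Plan): Boolean =
    plan.expressions.forall(_.resolved) && plan.children.forall(expressionsOnlyCheck)

  // Stricter check: also honour each plan's own `resolved` definition.
  def planLevelCheck(plan: Plan): Boolean =
    plan.resolved && plan.children.forall(planLevelCheck)

  // Mirrors the UNION from the issue: an array column on one side, an integer on the other.
  val badUnion: Plan = Union(
    Project(Seq(Attr("value", ArrayIntType))),
    Project(Seq(Attr("2", IntType))))
}
```

In this sketch `expressionsOnlyCheck(badUnion)` accepts the plan because every expression is resolved, while `planLevelCheck(badUnion)` rejects it because `Union.resolved` is false. A union whose children have different column counts would be rejected the same way.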



--
This message was sent by Atlassian JIRA
(v6.2#6252)
