[
https://issues.apache.org/jira/browse/SPARK-24781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Takuya Ueshin updated SPARK-24781:
----------------------------------
Summary: Using a reference from Dataset in Filter/Sort might not work.
(was: Using a reference from Dataset in Filter might not work.)
> Using a reference from Dataset in Filter/Sort might not work.
> -------------------------------------------------------------
>
> Key: SPARK-24781
> URL: https://issues.apache.org/jira/browse/SPARK-24781
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.3.1
> Reporter: Takuya Ueshin
> Priority: Blocker
>
> When we use a reference from a {{Dataset}} in {{filter}} that was not included in
> the prior {{select}}, an {{AnalysisException}} occurs, e.g.,
> {code:scala}
> val df = Seq(("test1", 0), ("test2", 1)).toDF("name", "enable")
> df.select(df("name")).filter(df("enable") =!= 0).show()
> {code}
> {noformat}
> org.apache.spark.sql.AnalysisException: Resolved attribute(s) enable#6
> missing from name#5 in operator !Filter NOT (enable#6 = 0).;;
> !Filter NOT (enable#6 = 0)
> +- AnalysisBarrier
>       +- Project [name#5]
>          +- Project [_1#2 AS name#5, _2#3 AS enable#6]
>             +- LocalRelation [_1#2, _2#3]
> {noformat}
> If we use {{col}} instead, it works:
> {code:scala}
> val df = Seq(("test1", 0), ("test2", 1)).toDF("name", "enable")
> df.select(col("name")).filter(col("enable") =!= 0).show()
> {code}
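> A possible workaround, sketched here on the assumption that the projection can be
> moved after the filter: apply {{filter}} while {{enable}} is still part of the
> plan's output, then {{select}}. The local {{SparkSession}} setup and the app name
> are illustrative only and are not part of the original report.
> {code:scala}
> import org.apache.spark.sql.SparkSession
>
> // Illustrative local session; in spark-shell, `spark` already exists.
> val spark = SparkSession.builder()
>   .master("local[*]")
>   .appName("SPARK-24781-workaround")
>   .getOrCreate()
> import spark.implicits._
>
> val df = Seq(("test1", 0), ("test2", 1)).toDF("name", "enable")
>
> // Filter first, while "enable" is still in the output, then project "name".
> df.filter(df("enable") =!= 0).select(df("name")).show()
> {code}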