Thanks for reporting! It's a bug; I've just filed a ticket to track it:
https://issues.apache.org/jira/browse/SPARK-18053
Cheng
On 10/20/16 1:54 AM, filthysocks wrote:
I have a column in a DataFrame that contains arrays, and I want to filter
for equality. It works fine in Spark 1.6 but not in 2.0. In Spark 1.6.2:
import org.apache.spark.sql.SQLContext
case class DataTest(lists: Seq[Int])
val sql = new SQLContext(sc)
val data = sql.createDataFrame(sc.parallelize(Seq(
  DataTest(Seq(1)),
  DataTest(Seq(4, 5, 6))
)))
data.registerTempTable("uiae")
sql.sql(s"SELECT lists FROM uiae WHERE lists=Array(1)").collect().foreach(println)
returns: [WrappedArray(1)]
In spark 2.0.0:
import spark.implicits._
case class DataTest(lists: Seq[Int])
val data = Seq(DataTest(Seq(1)),DataTest(Seq(4,5,6))).toDS()
data.createOrReplaceTempView("uiae")
spark.sql(s"SELECT lists FROM uiae WHERE lists=Array(1)").collect().foreach(println)
returns: nothing
Is that a bug, or is it just done differently in Spark 2?
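For what it's worth, here is a sketch of a possible workaround I'd try until the bug is resolved: a typed Dataset filter compares the arrays as plain Scala Seqs on the JVM side, so it never goes through the SQL array-equality path at all. This assumes a Spark 2.0 shell where `spark` and the implicits are already in scope; I haven't verified it against every Spark version.

```scala
// Workaround sketch: filter with a typed predicate instead of a SQL
// WHERE clause, so array comparison happens as ordinary Seq equality.
import spark.implicits._

case class DataTest(lists: Seq[Int])

val data = Seq(DataTest(Seq(1)), DataTest(Seq(4, 5, 6))).toDS()

// Seq == Seq is element-wise equality in Scala, so this keeps only
// the row whose array is exactly [1].
data.filter(_.lists == Seq(1)).collect().foreach(println)
```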