I have a column in a DataFrame that contains arrays, and I want to filter for equality. It works fine in Spark 1.6 but not in 2.0.

In Spark 1.6.2:

  import org.apache.spark.sql.SQLContext

  case class DataTest(lists: Seq[Int])

  val sql = new SQLContext(sc)
  val data = sql.createDataFrame(sc.parallelize(Seq(
    DataTest(Seq(1)),
    DataTest(Seq(4, 5, 6))
  )))
  data.registerTempTable("uiae")
  sql.sql(s"SELECT lists FROM uiae WHERE lists=Array(1)").collect().foreach(println)

returns:

  [WrappedArray(1)]

In Spark 2.0.0:

  import spark.implicits._

  case class DataTest(lists: Seq[Int])

  val data = Seq(DataTest(Seq(1)), DataTest(Seq(4, 5, 6))).toDS()
  data.createOrReplaceTempView("uiae")
  spark.sql(s"SELECT lists FROM uiae WHERE lists=Array(1)").collect().foreach(println)

returns: nothing
Is that a bug? Or is it just done differently in Spark 2?
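For reference, the same filter can also be written with the DataFrame API instead of a SQL string (sketch below, reusing the DataTest case class and the spark session from above, and building the literal with array(lit(...))). Whether that form behaves any differently in 2.0 is part of what I'm asking; I haven't confirmed it as a workaround.

  import spark.implicits._
  import org.apache.spark.sql.functions.{array, lit}

  // Same data as in the Spark 2.0.0 snippet above, filtered via the Dataset API.
  // array(lit(1)) builds an array<int> literal column to compare the array column against.
  val viaApi = Seq(DataTest(Seq(1)), DataTest(Seq(4, 5, 6))).toDS()
  viaApi.filter($"lists" === array(lit(1))).show()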