[ https://issues.apache.org/jira/browse/SPARK-12754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094499#comment-15094499 ]
kevin yu commented on SPARK-12754:
----------------------------------

Hello Jesse: It looks like the nullable checking changed after Spark 1.4. In your test case, nullable defaults to true for createArrayType in StructField("point", DataTypes.createArrayType(LongType), false), while the array built from val targetPoint: Array[Long] = Array(0L, 9L) is non-nullable; that mismatch is what causes the failure. If you pass nullable = false to createArrayType, it will work:

StructField("point", DataTypes.createArrayType(LongType, false), false)

> Data type mismatch on two array<bigint> values when using filter/where
> ----------------------------------------------------------------------
>
>                 Key: SPARK-12754
>                 URL: https://issues.apache.org/jira/browse/SPARK-12754
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.0, 1.6.0
>        Environment: OSX 10.11.1, Scala 2.11.7, Spark 1.5.0+
>           Reporter: Jesse English
>
> The following test produces the error
> _org.apache.spark.sql.AnalysisException: cannot resolve '(point = array(0,9))' due to data type mismatch: differing types in '(point = array(0,9))' (array<bigint> and array<bigint>)_
> This is not the case on 1.4.x, but has been introduced with 1.5+. Is there a preferred method for making this sort of arbitrarily sized array comparison?
> {code:title=test.scala}
> test("test array comparison") {
>   val vectors: Vector[Row] = Vector(
>     Row.fromTuple("id_1" -> Array(0L, 2L)),
>     Row.fromTuple("id_2" -> Array(0L, 5L)),
>     Row.fromTuple("id_3" -> Array(0L, 9L)),
>     Row.fromTuple("id_4" -> Array(1L, 0L)),
>     Row.fromTuple("id_5" -> Array(1L, 8L)),
>     Row.fromTuple("id_6" -> Array(2L, 4L)),
>     Row.fromTuple("id_7" -> Array(5L, 6L)),
>     Row.fromTuple("id_8" -> Array(6L, 2L)),
>     Row.fromTuple("id_9" -> Array(7L, 0L))
>   )
>   val data: RDD[Row] = sc.parallelize(vectors, 3)
>   val schema = StructType(
>     StructField("id", StringType, false) ::
>     StructField("point", DataTypes.createArrayType(LongType), false) ::
>     Nil
>   )
>   val sqlContext = new SQLContext(sc)
>   var dataframe = sqlContext.createDataFrame(data, schema)
>   val targetPoint: Array[Long] = Array(0L, 9L)
>   // This is the line where it fails
>   // org.apache.spark.sql.AnalysisException: cannot resolve
>   //   '(point = array(0,9))' due to data type mismatch:
>   //   differing types in '(point = array(0,9))'
>   //   (array<bigint> and array<bigint>).
>   val targetRow = dataframe.where(dataframe("point") ===
>     array(targetPoint.map(value => lit(value)): _*)).first()
>   assert(targetRow != null)
> }
> {code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
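A minimal sketch of the fix the comment suggests, assuming Spark 1.5+ and an existing SparkContext `sc` as in the original test (the reduced data set here is illustrative, not from the report). Setting containsNull = false on the array type makes the column's type match the non-nullable array literal built from Array[Long], so the equality comparison resolves:

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.functions.{array, lit}
import org.apache.spark.sql.types._

// Sketch only: assumes an existing SparkContext `sc`, as in the original test.
val schema = StructType(
  StructField("id", StringType, false) ::
  // containsNull = false here matches the non-nullable literals built
  // from Array[Long] below, so both sides are the same array<bigint> type.
  StructField("point", DataTypes.createArrayType(LongType, false), false) ::
  Nil
)

val data: RDD[Row] = sc.parallelize(Seq(Row("id_3", Array(0L, 9L))), 1)

val sqlContext = new SQLContext(sc)
val dataframe = sqlContext.createDataFrame(data, schema)

val targetPoint: Array[Long] = Array(0L, 9L)
// With matching nullability the filter now passes analysis instead of
// failing with "differing types in '(point = array(0,9))'".
val targetRow = dataframe.where(dataframe("point") ===
  array(targetPoint.map(lit(_)): _*)).first()
assert(targetRow != null)
```

The equivalent fix in the other direction (keeping the default nullable array column) would require building the literal side with a nullable element type, which the DataFrame API does not make easy; aligning the schema is the simpler change.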