[ 
https://issues.apache.org/jira/browse/SPARK-12754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094499#comment-15094499
 ] 

kevin yu commented on SPARK-12754:
----------------------------------

Hello Jesse: It looks like the nullable checking changed after Spark 1.4. In 
your test case, the default nullable is true for createArrayType in

StructField("point", DataTypes.createArrayType(LongType), false)

while the nullable is false for

val targetPoint: Array[Long] = Array(0L, 9L)

That nullability mismatch is what causes the failure.

If you set nullable to false in createArrayType, it will work:

StructField("point", DataTypes.createArrayType(LongType, false), false)
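
For reference, here is a minimal sketch of the corrected schema in context 
(a schema fragment only, using the same column names as the test case above):

```scala
import org.apache.spark.sql.types._

// Passing containsNull = false to createArrayType marks the array's
// elements as non-nullable, so the column type matches the literal
// array built from Array[Long], whose elements can never be null.
val schema = StructType(
  StructField("id", StringType, false) ::
    StructField("point", DataTypes.createArrayType(LongType, false), false) ::
    Nil
)
```

With containsNull = false, "point" resolves to array<bigint> with 
non-nullable elements, the same type as the literal array on the right-hand 
side of the comparison, so the equality check can resolve.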


> Data type mismatch on two array<bigint> values when using filter/where
> ----------------------------------------------------------------------
>
>                 Key: SPARK-12754
>                 URL: https://issues.apache.org/jira/browse/SPARK-12754
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.0, 1.6.0
>         Environment: OSX 10.11.1, Scala 2.11.7, Spark 1.5.0+
>            Reporter: Jesse English
>
> The following test produces the error 
> _org.apache.spark.sql.AnalysisException: cannot resolve '(point = 
> array(0,9))' due to data type mismatch: differing types in '(point = 
> array(0,9))' (array<bigint> and array<bigint>)_
> This is not the case on 1.4.x, but has been introduced with 1.5+.  Is there a 
> preferred method for making this sort of arbitrarily sized array comparison?
> {code:title=test.scala}
> import org.apache.spark.rdd.RDD
> import org.apache.spark.sql.{Row, SQLContext}
> import org.apache.spark.sql.functions.{array, lit}
> import org.apache.spark.sql.types._
>
> test("test array comparison") {
>     val vectors: Vector[Row] =  Vector(
>       Row.fromTuple("id_1" -> Array(0L, 2L)),
>       Row.fromTuple("id_2" -> Array(0L, 5L)),
>       Row.fromTuple("id_3" -> Array(0L, 9L)),
>       Row.fromTuple("id_4" -> Array(1L, 0L)),
>       Row.fromTuple("id_5" -> Array(1L, 8L)),
>       Row.fromTuple("id_6" -> Array(2L, 4L)),
>       Row.fromTuple("id_7" -> Array(5L, 6L)),
>       Row.fromTuple("id_8" -> Array(6L, 2L)),
>       Row.fromTuple("id_9" -> Array(7L, 0L))
>     )
>     val data: RDD[Row] = sc.parallelize(vectors, 3)
>     val schema = StructType(
>       StructField("id", StringType, false) ::
>         StructField("point", DataTypes.createArrayType(LongType), false) ::
>         Nil
>     )
>     val sqlContext = new SQLContext(sc)
>     var dataframe = sqlContext.createDataFrame(data, schema)
>     val targetPoint: Array[Long] = Array(0L, 9L)
>     //This is the line where it fails
>     //org.apache.spark.sql.AnalysisException: cannot resolve 
>     // '(point = array(0,9))' due to data type mismatch:
>     // differing types in '(point = array(0,9))' 
>     // (array<bigint> and array<bigint>).
>     val targetRow = dataframe.where(dataframe("point") ===
>       array(targetPoint.map(value => lit(value)): _*)).first()
>     assert(targetRow != null)
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
