Jesse English created SPARK-12754:
-------------------------------------

             Summary: Data type mismatch on two array<bigint> values when using filter/where
                 Key: SPARK-12754
                 URL: https://issues.apache.org/jira/browse/SPARK-12754
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.6.0, 1.5.0
         Environment: OSX 10.11.1, Scala 2.11.7, Spark 1.5.0+
            Reporter: Jesse English


The following test produces the error _org.apache.spark.sql.AnalysisException: 
cannot resolve '(point = array(0,9))' due to data type mismatch: differing 
types in '(point = array(0,9))' (array<bigint> and array<bigint>)_

The error does not occur on 1.4.x; it was introduced in 1.5.0 and is still 
present in 1.6.0. Is there a preferred way to compare a column against an 
array of arbitrary size?

{code:title=test.scala}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.functions.{array, lit}
import org.apache.spark.sql.types.{DataTypes, LongType, StringType, StructField, StructType}

// Assumes the enclosing test suite provides a SparkContext named `sc`.
test("test array comparison") {

  // Nine rows, each an id paired with a two-element array of longs.
  val vectors: Vector[Row] = Vector(
    Row.fromTuple("id_1" -> Array(0L, 2L)),
    Row.fromTuple("id_2" -> Array(0L, 5L)),
    Row.fromTuple("id_3" -> Array(0L, 9L)),
    Row.fromTuple("id_4" -> Array(1L, 0L)),
    Row.fromTuple("id_5" -> Array(1L, 8L)),
    Row.fromTuple("id_6" -> Array(2L, 4L)),
    Row.fromTuple("id_7" -> Array(5L, 6L)),
    Row.fromTuple("id_8" -> Array(6L, 2L)),
    Row.fromTuple("id_9" -> Array(7L, 0L))
  )
  val data: RDD[Row] = sc.parallelize(vectors, 3)

  val schema = StructType(
    StructField("id", StringType, false) ::
      StructField("point", DataTypes.createArrayType(LongType), false) ::
      Nil
  )

  val sqlContext = new SQLContext(sc)
  val dataframe = sqlContext.createDataFrame(data, schema)

  val targetPoint: Array[Long] = Array(0L, 9L)

  // The next statement fails with:
  // org.apache.spark.sql.AnalysisException: cannot resolve
  // '(point = array(0,9))' due to data type mismatch:
  // differing types in '(point = array(0,9))'
  // (array<bigint> and array<bigint>).
  val targetRow = dataframe.where(
    dataframe("point") === array(targetPoint.map(value => lit(value)): _*)
  ).first()

  assert(targetRow != null)
}
{code}
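
A possible explanation, offered as an assumption rather than something established in this report: the two sides may differ only in the array type's containsNull flag, which the printed form array<bigint> omits. DataTypes.createArrayType(LongType) yields an array type whose elements may be null, whereas an array built from non-null literals could plausibly be typed as non-nullable. The minimal sketch below uses only the public types API (the object name ArrayTypeNullability is hypothetical) and shows two array types that print identically yet compare as unequal:

```scala
import org.apache.spark.sql.types.{ArrayType, DataTypes, LongType}

object ArrayTypeNullability {
  // Same factory call as the schema in the test: elements may be null.
  val schemaType: ArrayType = DataTypes.createArrayType(LongType)

  // Assumption: the array(lit(...)) side carries containsNull = false.
  val literalType: ArrayType = DataTypes.createArrayType(LongType, false)

  def main(args: Array[String]): Unit = {
    println(schemaType.simpleString)   // array<bigint>
    println(literalType.simpleString)  // array<bigint>
    println(schemaType == literalType) // false: only containsNull differs
  }
}
```

If this is indeed the cause, declaring the column with DataTypes.createArrayType(LongType, false) might make the two types line up. That has not been verified against 1.5.0 or 1.6.0.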



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
