[GitHub] spark pull request: [SPARK-6126][SQL] added coverage for UDTs in J...

yhuai Wed, 15 Jul 2015 11:56:19 -0700

Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/7416#discussion_r34715159
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/UserDefinedTypeSuite.scala ---
    @@ -138,4 +143,21 @@ class UserDefinedTypeSuite extends QueryTest {
         val actual = openHashSetUDT.deserialize(openHashSetUDT.serialize(set))
         assert(actual.iterator.toSet === set.iterator.toSet)
       }
    +
    +  test("UDTs with JSON") {
    +    val data = Seq(
    +      "{\"id\":1,\"vec\":[1.1,2.2,3.3,4.4]}",
    +      "{\"id\":2,\"vec\":[2.25,4.5,8.75]}"
    +    )
    +    val schema = StructType(Seq(
    +      StructField("id", IntegerType, false),
    +      StructField("vec", new MyDenseVectorUDT, false)
    +    ))
    +
    +    val stringRDD = ctx.sparkContext.parallelize(data)
    +    val jsonRDD = ctx.read.schema(schema).json(stringRDD)
    +    assertResult("[1,[1.1,2.2,3.3,4.4]],[2,[2.25,4.5,8.75]]") {
    +      jsonRDD.collect().mkString(",")
    --- End diff --
    
    @drubbo I think comparing the string representation is not a very stable way to check the result. We have a utility function, `checkAnswer`, for checking query results.
    For example, you can use:
    ```
    checkAnswer(
      jsonRDD,
      Row(1, Array(1.1, 2.2, 3.3, 4.4)) :: Row(2, Array(2.25, 4.5, 8.75)) :: Nil)
    ```
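
    To illustrate the concern, here is a minimal, self-contained sketch in plain Scala with no Spark dependency. `Rec` and `sameRows` are hypothetical stand-ins for `Row` and `checkAnswer`, not Spark APIs, and the sketch assumes the comparison should ignore row order, which is roughly how I understand `checkAnswer` to treat unsorted results. It shows the same rows failing a string comparison once their order changes, while a structural comparison still passes.

    ```scala
    // Hypothetical stand-in for a result row; this is NOT Spark's Row class.
    final case class Rec(id: Int, vec: Seq[Double])

    object CompareDemo {
      // Order-insensitive structural comparison (an assumed simplification
      // of what a helper like checkAnswer does for unsorted results).
      def sameRows(actual: Seq[Rec], expected: Seq[Rec]): Boolean =
        actual.sortBy(_.id) == expected.sortBy(_.id)

      def main(args: Array[String]): Unit = {
        val expected = Seq(
          Rec(1, Seq(1.1, 2.2, 3.3, 4.4)),
          Rec(2, Seq(2.25, 4.5, 8.75)))

        // Same rows, returned in a different order.
        val actual = expected.reverse

        // The string forms differ although the contents are the same.
        assert(actual.mkString(",") != expected.mkString(","))

        // The structural comparison is unaffected by the ordering.
        assert(sameRows(actual, expected))
      }
    }
    ```

    The string form also depends on how each value's `toString` is implemented, so a formatting change could break the test even when the data is correct.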



