Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18875#discussion_r138102302
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala ---
    @@ -180,10 +180,30 @@ class JsonFunctionsSuite extends QueryTest with SharedSQLContext {
     
       test("to_json - array") {
         val df = Seq(Tuple1(Tuple1(1) :: Nil)).toDF("a")
    +    val df2 = Seq(Tuple1(Map("a" -> 1) :: Nil)).toDF("a")
     
         checkAnswer(
           df.select(to_json($"a")),
           Row("""[{"_1":1}]""") :: Nil)
    +    checkAnswer(
    +      df2.select(to_json($"a")),
    +      Row("""[{"a":1}]""") :: Nil)
    +  }
    +
    +  test("to_json - map") {
    +    val df1 = Seq(Map("a" -> Tuple1(1))).toDF("a")
    +    val df2 = Seq(Map(Tuple1(1) -> Tuple1(1))).toDF("a")
    +    val df3 = Seq(Map("a" -> 1)).toDF("a")
    +
    +    checkAnswer(
    +      df1.select(to_json($"a")),
    +      Row("""{"a":{"_1":1}}""") :: Nil)
    +    checkAnswer(
    +      df2.select(to_json($"a")),
    +      Row("""{"[0,1]":{"_1":1}}""") :: Nil)
    --- End diff ---
    
    Yes, I was thinking that showing debugging (internal-format) strings is rather a bug that we should fix if possible, since we already have a pretty `toString` format for the public `Row` API.
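    
    For example, a minimal sketch of the contrast I mean (the names here are illustrative, not from this PR's code):
    
    ```scala
    import org.apache.spark.sql.Row
    
    // The public Row API already renders values in a readable form:
    val structKey = Row(1)   // a one-field struct, like Tuple1(1) above
    println(structKey)       // prints: [1]
    
    // ...whereas the assertion above shows the internal (UnsafeRow-style)
    // debugging form of the same one-field struct key, rendered as "[0,1]".
    ```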
    
    I remember we eventually decided in SPARK-16216 to fix `TimestampType` and `DateType` in JSON, which used to be printed as raw longs and ints, to use human-readable formats. If I understood correctly, there was a discussion at that time about whether they should be written as-is (long and int) or in a human-readable format.
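    
    For reference, here is roughly what the post-SPARK-16216 behavior looks like through `to_json`. This is only a sketch: it assumes a `SparkSession` named `spark` is in scope, the column names are made up, and the exact timestamp string depends on the session time zone.
    
    ```scala
    import java.sql.{Date, Timestamp}
    import org.apache.spark.sql.functions.{struct, to_json}
    import spark.implicits._
    
    // DateType and TimestampType values are written to JSON as human-readable
    // strings rather than as a raw int (days) and a raw long (microseconds).
    val df = Seq(
      (Date.valueOf("2016-01-01"), Timestamp.valueOf("2016-01-01 00:00:00"))
    ).toDF("d", "t")
    
    df.select(to_json(struct($"d", $"t"))).show(false)
    // e.g. {"d":"2016-01-01","t":"2016-01-01T00:00:00.000-08:00"}
    ```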

