Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/18875#discussion_r138102302
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala ---
@@ -180,10 +180,30 @@ class JsonFunctionsSuite extends QueryTest with SharedSQLContext {
test("to_json - array") {
val df = Seq(Tuple1(Tuple1(1) :: Nil)).toDF("a")
+ val df2 = Seq(Tuple1(Map("a" -> 1) :: Nil)).toDF("a")
checkAnswer(
df.select(to_json($"a")),
Row("""[{"_1":1}]""") :: Nil)
+ checkAnswer(
+ df2.select(to_json($"a")),
+ Row("""[{"a":1}]""") :: Nil)
+ }
+
+ test("to_json - map") {
+ val df1 = Seq(Map("a" -> Tuple1(1))).toDF("a")
+ val df2 = Seq(Map(Tuple1(1) -> Tuple1(1))).toDF("a")
+ val df3 = Seq(Map("a" -> 1)).toDF("a")
+
+ checkAnswer(
+ df1.select(to_json($"a")),
+ Row("""{"a":{"_1":1}}""") :: Nil)
+ checkAnswer(
+ df2.select(to_json($"a")),
+ Row("""{"[0,1]":{"_1":1}}""") :: Nil)
--- End diff ---
Yes, I was thinking that showing debugging (internally formatted) strings is
rather a bug that we might actually have to fix if possible, as I think we have
a pretty `toString` format for the public API `Row` class.
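
For instance, a minimal sketch of the contrast I mean (assuming Spark 2.x,
where `Row.toString` is implemented via `mkString("[", ",", "]")`):

```scala
import org.apache.spark.sql.Row

// Public `Row` values already print in a readable, stable format:
Row(1).toString            // "[1]"
Row("a", Row(1)).toString  // "[a,[1]]"

// whereas the struct map key above leaks the internal representation, "[0,1]".
```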
I remember we eventually decided, in SPARK-16216, to fix `TimestampType` and
`DateType` in JSON, which used to be printed as raw longs and integers, to use
human-readable formats. If I understood correctly, there was a discussion at
that time about whether they should be written as-is (long and int) or in a
human-readable format.
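
Roughly like this, as a spark-shell sketch of the post-SPARK-16216 behaviour
(the exact rendered string depends on the session timezone and the default
`timestampFormat`, so treat the output below as illustrative):

```scala
import java.sql.Timestamp
import org.apache.spark.sql.functions.{struct, to_json}
import spark.implicits._  // `spark` is the SparkSession provided by spark-shell

val df = Seq(Tuple1(Timestamp.valueOf("2017-09-11 00:00:00"))).toDF("t")

// Before the fix, the timestamp would have been written as the internal long
// (microseconds since the epoch); after it, as a formatted string:
df.select(to_json(struct($"t"))).collect()
// e.g. Array([{"t":"2017-09-11T00:00:00.000-07:00"}])  (timezone-dependent)
```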
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]