Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/20884#discussion_r176897510
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala
---
@@ -1229,7 +1229,7 @@ class JsonSuite extends QueryTest with SharedSQLContext with TestJsonData {
val df2 = df1.toDF
val result = df2.toJSON.collect()
// scalastyle:off
- assert(result(0) === "{\"f1\":1,\"f2\":\"A1\",\"f3\":true,\"f4\":[\"1\",\" A1\",\" true\",\" null\"]}")
+ assert(result(0) === "{\"f1\":1,\"f2\":\"A1\",\"f3\":true,\"f4\":[\"1\",\" A1\",\" true\",\" null\"],\"f5\":null}")
--- End diff --
If we go with the current approach, every field would be written out in every record, even when its value is `null`:
```json
{"a":null,"b":null,"c":null}
{"a":null,"b":null,"c":1}
{"a":1,"b":null,"c":1}
{"a":1,"b":2,"c":3}
```
which I think is quite inefficient. Just to be clear, does this fix address an
actual use case?
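
To make the size concern concrete, here is a minimal sketch in plain Scala. It does not use Spark, and `NullFieldSketch`, `render`, and `writeNulls` are hypothetical names for illustration only, not Spark's JSON writer or any of its options. It renders the four rows above twice, once keeping `null` fields and once dropping them:

```scala
// Illustration only: a toy JSON renderer, not Spark's implementation.
object NullFieldSketch {
  // A row is an ordered list of (field name, optional integer value).
  type Row = Seq[(String, Option[Int])]

  // Render one row as a JSON object; null fields are kept only if writeNulls is true.
  def render(row: Row, writeNulls: Boolean): String =
    row.collect {
      case (name, Some(v))            => s""""$name":$v"""
      case (name, None) if writeNulls => s""""$name":null"""
    }.mkString("{", ",", "}")

  // The same four rows as in the example above.
  val rows: Seq[Row] = Seq(
    Seq(("a", None), ("b", None), ("c", None)),
    Seq(("a", None), ("b", None), ("c", Some(1))),
    Seq(("a", Some(1)), ("b", None), ("c", Some(1))),
    Seq(("a", Some(1)), ("b", Some(2)), ("c", Some(3)))
  )

  def main(args: Array[String]): Unit = {
    val withNulls = rows.map(render(_, writeNulls = true))
    val withoutNulls = rows.map(render(_, writeNulls = false))
    withNulls.foreach(println)
    withoutNulls.foreach(println)
    // Compare the total number of characters written in each mode.
    println(s"with nulls: ${withNulls.map(_.length).sum} chars")
    println(s"without nulls: ${withoutNulls.map(_.length).sum} chars")
  }
}
```

For sparse data with many fields, the gap between the two totals grows with the number of `null` values per record.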
---