MaxGekk opened a new pull request #29311:
URL: https://github.com/apache/spark/pull/29311


   ### What changes were proposed in this pull request?
   Convert `NULL` elements of maps, structs and arrays to the `"null"` string 
while converting map, struct and array values to strings. 
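
   The intended rendering can be sketched in plain Scala without Spark. This is only an illustration: `NullRendering` and `render` are hypothetical names rather than part of Spark's `Cast` implementation, and the bracket format is assumed from the `.show()` output in this description.

   ```scala
   // Hypothetical helper illustrating the proposed behaviour; not Spark code.
   object NullRendering {
     // Render a possibly nested value, writing null elements as "null".
     def render(value: Any): String = value match {
       case null       => "null"
       case s: Seq[_]  => s.map(render).mkString("[", ", ", "]")
       case m: Map[_, _] =>
         // Assumed "key -> value" layout for map entries.
         m.iterator
           .map { case (k, v) => s"${render(k)} -> ${render(v)}" }
           .mkString("[", ", ", "]")
       case other      => other.toString
     }
   }
   ```

   With this sketch, `NullRendering.render(Seq(""))` yields `[]` and `NullRendering.render(Seq(null))` yields `[null]`, so the two cases are no longer rendered identically.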
   
   ### Why are the changes needed?
   1. It is impossible to distinguish an empty string from `NULL`, for instance:
   ```scala
   scala> Seq(Seq(""), Seq(null)).toDF().show
   +-----+
   |value|
   +-----+
   |   []|
   |   []|
   +-----+
   ```
   2. `NULL` conversions are inconsistent between top-level values and nested columns, 
for instance:
   ```scala
   scala> sql("select named_struct('c', null), null").show
   +---------------------+----+
   |named_struct(c, NULL)|NULL|
   +---------------------+----+
   |                   []|null|
   +---------------------+----+
   ``` 
   3. `.show()` differs from the conversion to Hive strings, and as a 
consequence its output differs from that of `spark-sql` (SQL tests):
   ```sql
   spark-sql> select named_struct('c', null) as struct;
   {"c":null}
   ```
   ```scala
   scala> sql("select named_struct('c', null) as struct").show
   +------+
   |struct|
   +------+
   |    []|
   +------+
   ```
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. Before:
   ```scala
   scala> Seq(Seq(""), Seq(null)).toDF().show
   +-----+
   |value|
   +-----+
   |   []|
   |   []|
   +-----+
   ```
   
   After:
   ```scala
   scala> Seq(Seq(""), Seq(null)).toDF().show
   +------+
   | value|
   +------+
   |    []|
   |[null]|
   +------+
   ```
   
   ### How was this patch tested?
   By the existing test suite `CastSuite`.

