GitHub user patrickmcgloin commented on a diff in the pull request:
https://github.com/apache/spark/pull/21671#discussion_r199356020
--- Diff: python/pyspark/sql/functions.py ---
@@ -2163,9 +2163,9 @@ def json_tuple(col, *fields):
@since(2.1)
def from_json(col, schema, options={}):
"""
- Parses a column containing a JSON string into a :class:`MapType` with :class:`StringType`
- as keys type, :class:`StructType` or :class:`ArrayType` of :class:`StructType`\\s with
- the specified schema. Returns `null`, in the case of an unparseable string.
+ Parses a column containing a JSON string into a :class:`MapType`, :class:`StructType`
+ or :class:`ArrayType` of :class:`StructType`\\s with the specified schema. Returns
+ `null`, in the case of an unparseable string.
--- End diff --
For awareness, I also added unit tests for each of the supported key types:
```
[info] - SPARK-24682: roundtrip in to_json and from_json - Boolean as key (260 milliseconds)
[info] - SPARK-24682: roundtrip in to_json and from_json - Byte as key (277 milliseconds)
[info] - SPARK-24682: roundtrip in to_json and from_json - Short as key (210 milliseconds)
[info] - SPARK-24682: roundtrip in to_json and from_json - Integer as key (248 milliseconds)
[info] - SPARK-24682: roundtrip in to_json and from_json - Long as key (313 milliseconds)
[info] - SPARK-24682: roundtrip in to_json and from_json - Float as key (214 milliseconds)
[info] - SPARK-24682: roundtrip in to_json and from_json - Double as key (316 milliseconds)
[info] - SPARK-24682: roundtrip in to_json and from_json - String as key (247 milliseconds)
[info] - SPARK-24682: roundtrip in to_json and from_json - Timestamp as key (284 milliseconds)
[info] - SPARK-24682: roundtrip in to_json and from_json - Date as key (179 milliseconds)
```
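To show the kind of roundtrip these tests exercise, here is a minimal plain-Python sketch. It uses only the standard `json` module and is not Spark code or the implementation in this PR; the helper name `roundtrip` and its `parse_key` parameter are hypothetical, chosen for illustration. JSON object keys are always strings, so a map with non-string keys only survives a roundtrip if the keys are converted back to their original type when parsing:

```python
import json
from datetime import date


def roundtrip(mapping, parse_key):
    """Serialize a dict to JSON, parse it back, and restore the key type.

    `parse_key` is a hypothetical callback that turns the string form of a
    key back into its original type.
    """
    # JSON object keys must be strings, so stringify them explicitly.
    encoded = json.dumps({str(k): v for k, v in mapping.items()})
    decoded = json.loads(encoded)
    # After parsing, every key is a str; convert each one back.
    return {parse_key(k): v for k, v in decoded.items()}


# Each key type needs its own parser to come back unchanged.
assert roundtrip({1: "a", 2: "b"}, int) == {1: "a", 2: "b"}
assert roundtrip({1.5: "a"}, float) == {1.5: "a"}
assert roundtrip({date(2018, 7, 1): "a"}, date.fromisoformat) == {date(2018, 7, 1): "a"}
```

Without the `parse_key` step the decoded keys would all be strings, which is the mismatch a per-key-type roundtrip test is meant to catch.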