Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22237#discussion_r222651368
--- Diff: docs/sql-programming-guide.md ---
@@ -1890,6 +1890,10 @@ working with timestamps in `pandas_udf`s to get the
best performance, see
# Migration Guide
+## Upgrading From Spark SQL 2.4 to 3.0
+
 - Since Spark 3.0, the `from_json` function supports three modes -
`PERMISSIVE`, `FAILFAST` and `NULLMALFORMED`. The modes can be set via the
`mode` option, and `PERMISSIVE` is the default. In previous versions, the
behavior of `from_json` did not conform to either `PERMISSIVE` or `FAILFAST`,
especially in the processing of malformed JSON records. For example, the JSON
string `{"a" 1}` with the schema `a INT` is converted to `null` by previous
versions, but Spark 3.0 converts it to `Row(null)`. In versions 2.4 and
earlier, arrays of JSON objects were considered invalid and converted to
`null` if the specified schema was `StructType`. Since Spark 3.0, such input
is considered a valid JSON array and only its first element is parsed if it
conforms to the specified `StructType`. To restore the previous behavior, set
the JSON option `mode` to `NULLMALFORMED`.
--- End diff ---
how about `PERMISSIVE`, `FAILFAST` and `LEGACY`?
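
For context, a minimal, hedged sketch of the behavior the quoted guide text
describes. It assumes the `mode` option of `from_json` behaves as the diff
states; the mode names themselves (`NULLMALFORMED` vs. `LEGACY`) are exactly
what is still under discussion in this thread, so treat them as placeholders:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

object FromJsonModesExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("from_json mode example")
      .getOrCreate()
    import spark.implicits._

    val schema = StructType(Seq(StructField("a", IntegerType)))

    // Malformed record from the guide text: the colon after "a" is missing.
    val df = Seq("""{"a" 1}""").toDF("json")

    // Default mode: per the proposed guide text, Spark 3.0 returns Row(null)
    // for this record, while 2.4 and earlier returned null.
    df.select(from_json($"json", schema)).show()

    // Selecting a mode explicitly via the `mode` option described in the diff.
    df.select(from_json($"json", schema, Map("mode" -> "PERMISSIVE"))).show()

    spark.stop()
  }
}
```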