dongjoon-hyun commented on a change in pull request #23325: [SPARK-26376][SQL]
Skip inputs without tokens by JSON datasource
URL: https://github.com/apache/spark/pull/23325#discussion_r241964012
##########
File path: docs/sql-migration-guide-upgrade.md
##########
@@ -17,7 +17,7 @@ displayTitle: Spark SQL Upgrading Guide
- Since Spark 3.0, the `from_json` functions supports two modes -
`PERMISSIVE` and `FAILFAST`. The modes can be set via the `mode` option. The
default mode became `PERMISSIVE`. In previous versions, behavior of `from_json`
did not conform to either `PERMISSIVE` nor `FAILFAST`, especially in processing
of malformed JSON records. For example, the JSON string `{"a" 1}` with the
schema `a INT` is converted to `null` by previous versions but Spark 3.0
converts it to `Row(null)`.
- - In Spark version 2.4 and earlier, the `from_json` function produces
`null`s for JSON strings and JSON datasource skips the same independetly of its
mode if there is no valid root JSON token in its input (` ` for example). Since
Spark 3.0, such input is treated as a bad record and handled according to
specified mode. For example, in the `PERMISSIVE` mode the ` ` input is
converted to `Row(null, null)` if specified schema is `key STRING, value INT`.
+ - In Spark version 2.4 and earlier, the `from_json` function produces
`null`s for JSON strings without valid root JSON tokens (` ` for example).
Since Spark 3.0, such input is treated as a bad record and handled according to
specified mode. For example, in the `PERMISSIVE` mode the ` ` input is
converted to `Row(null, null)` if specified schema is `key STRING, value INT`.
Review comment:
`Skipping` seems to be unclear here. Could you elaborate on the difference?
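
To make the difference described in the hunk concrete, here is a minimal sketch in plain Python. It is a toy model of the behavior the migration note describes, not Spark's implementation: the function names `from_json_spark24` and `from_json_spark30` are hypothetical, rows are modeled as tuples, `null` is modeled as `None`, and field values are not coerced to the schema types.

```python
import json


def _parse_object(text):
    """Parse text as a JSON object; raise ValueError for anything else."""
    if text.strip() == "":
        # Models "no valid root JSON token in the input".
        raise ValueError("no root JSON token")
    obj = json.loads(text)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(obj, dict):
        raise ValueError("root token is not a JSON object")
    return obj


def from_json_spark24(text, fields):
    """Toy model of 2.4 and earlier: a bad input yields null (None)."""
    try:
        obj = _parse_object(text)
    except ValueError:
        return None
    return tuple(obj.get(name) for name in fields)


def from_json_spark30(text, fields, mode="PERMISSIVE"):
    """Toy model of 3.0: a bad input is a bad record, handled per mode."""
    try:
        obj = _parse_object(text)
    except ValueError as err:
        if mode == "FAILFAST":
            raise ValueError(f"malformed record: {text!r}") from err
        # PERMISSIVE: keep the row, with every field set to null.
        return tuple(None for _ in fields)
    return tuple(obj.get(name) for name in fields)


fields = ("key", "value")  # stands in for the schema `key STRING, value INT`
legacy_blank = from_json_spark24(" ", fields)
permissive_blank = from_json_spark30(" ", fields)
valid_row = from_json_spark30('{"key": "a", "value": 1}', fields)
```

Under these assumptions, `legacy_blank` is `None` while `permissive_blank` is a row of two nulls, which mirrors the `Row(null, null)` example in the note. Calling `from_json_spark30(" ", fields, mode="FAILFAST")` raises instead of returning a row.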