Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/21909#discussion_r205977956
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---
@@ -450,7 +450,8 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
         input => rawParser.parse(input, createParser, UTF8String.fromString),
         parsedOptions.parseMode,
         schema,
-        parsedOptions.columnNameOfCorruptRecord)
+        parsedOptions.columnNameOfCorruptRecord,
+        optimizeEmptySchema = true)
--- End diff ---
There can be only one JSON object of struct type per input string here, so I don't see any reason to turn the optimization off. Do you have examples where the optimization doesn't work correctly?
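
For context, a minimal sketch of the per-line parsing this code path serves (the schema, column names, and sample data below are illustrative, not taken from the diff):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("json-per-line").getOrCreate()
import spark.implicits._

// Each input string holds at most one JSON object matching the struct schema;
// records that don't fit the schema land in the corrupt-record column.
val input = Seq(
  """{"id": 1, "name": "a"}""",  // valid object
  """{"id": "oops"}"""           // corrupt record for an integer id
).toDS()

val df = spark.read
  .schema("id INT, name STRING, _corrupt_record STRING")
  .option("mode", "PERMISSIVE")
  .option("columnNameOfCorruptRecord", "_corrupt_record")
  .json(input)

df.show(truncate = false)
```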
---