Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19448#discussion_r144239543
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -138,6 +138,10 @@ class ParquetFileFormat
conf.setBoolean(ParquetOutputFormat.ENABLE_JOB_SUMMARY, false)
}
+ require(!conf.getBoolean(ParquetOutputFormat.ENABLE_JOB_SUMMARY, false)
--- End diff --
There's another option: log at WARN and continue. If someone has
changed the committer, they get the consequences. That would also let
someone with a modified committer generate schema summaries if they
chose to, or if their committer permitted it.
It'd simplify this patch, though the tests would need tweaking. I'd change
the SQLConf text for the committer key to say "if the committer isn't a
ParquetOutputCommitter then don't expect summaries".
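
A rough sketch of the "warn and continue" idea, not the actual patch. A plain Map stands in for the Hadoop Configuration, the committer is compared by class name only (so subclasses of ParquetOutputCommitter are not recognised), and `SummaryCheck`, `summaryWarning` and `com.example.CustomCommitter` are hypothetical names. The key string is assumed to be the value of ParquetOutputFormat.ENABLE_JOB_SUMMARY.

```scala
// Sketch only: stand-ins replace the Hadoop Configuration and the real
// committer class lookup, so this runs without Spark or Parquet on the classpath.
object SummaryCheck {
  // Assumed value of ParquetOutputFormat.ENABLE_JOB_SUMMARY.
  val EnableJobSummary = "parquet.enable.summary-metadata"
  val ParquetCommitter = "org.apache.parquet.hadoop.ParquetOutputCommitter"

  /** Returns the warning to log, or None if nothing needs saying. */
  def summaryWarning(conf: Map[String, String], committerClass: String): Option[String] = {
    // Summaries count as requested only if the key is explicitly "true".
    val summariesRequested = conf.get(EnableJobSummary).contains("true")
    if (summariesRequested && committerClass != ParquetCommitter) {
      Some(s"Committer $committerClass is not a ParquetOutputCommitter; " +
        "Parquet summary files may not be written.")
    } else {
      None
    }
  }

  def main(args: Array[String]): Unit = {
    val conf = Map(EnableJobSummary -> "true")
    // Warn and carry on, rather than failing the job with require(...).
    summaryWarning(conf, "com.example.CustomCommitter")
      .foreach(w => println(s"WARN $w"))
  }
}
```

In the real code the warning would presumably go through logWarning and the check would use the committer's Class object, so that subclasses still count as Parquet committers.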
---