Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/19448#discussion_r144331909
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -138,6 +138,10 @@ class ParquetFileFormat
conf.setBoolean(ParquetOutputFormat.ENABLE_JOB_SUMMARY, false)
}
+ require(!conf.getBoolean(ParquetOutputFormat.ENABLE_JOB_SUMMARY, false)
--- End diff --
I think I'd prefer the warn-and-continue option. It does little good to fail
so late in a job, when the caller has already indicated that they want to use a
different committer. Since this isn't a correctness issue, let them write the
data out; they can add a summary file later if they want. Basically, skipping
the summary file is less annoying and disruptive than failing a job near the
end and forcing the user to re-run it.
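For illustration only, a rough sketch of what warn-and-continue could look like. This is not the code in the PR. To stay dependency-free it uses a plain `Map` in place of the Hadoop `Configuration`, and the object name, the helper `warnIfSummaryUnsupported`, and the `committerSupportsSummary` flag are hypothetical. The key string is assumed to match `ParquetOutputFormat.ENABLE_JOB_SUMMARY`.

```scala
import java.util.logging.Logger

object SummaryCheckSketch {
  private val log = Logger.getLogger("SummaryCheckSketch")

  // Assumption: stand-in for ParquetOutputFormat.ENABLE_JOB_SUMMARY.
  val EnableJobSummary = "parquet.enable.summary-metadata"

  /**
   * Warns instead of failing when a job summary is requested but the
   * configured committer cannot produce one.
   * Returns true if a warning was logged. Never throws, so the write proceeds.
   */
  def warnIfSummaryUnsupported(
      conf: Map[String, String],          // hypothetical stand-in for the Hadoop conf
      committerSupportsSummary: Boolean   // hypothetical: is it a Parquet committer?
  ): Boolean = {
    // A missing key is treated as "false", like getBoolean(key, false).
    val summaryRequested = conf.get(EnableJobSummary).exists(_.equalsIgnoreCase("true"))
    if (summaryRequested && !committerSupportsSummary) {
      log.warning(
        s"$EnableJobSummary is enabled, but the configured committer cannot " +
          "write a job summary; continuing without one.")
      true
    } else {
      false
    }
  }
}
```

The point of returning instead of calling `require` is that the data still gets written, and the user only loses the optional summary file.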
---