GitHub user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19448#discussion_r143992319
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -138,6 +138,13 @@ class ParquetFileFormat
conf.setBoolean(ParquetOutputFormat.ENABLE_JOB_SUMMARY, false)
}
+ if (conf.getBoolean(ParquetOutputFormat.ENABLE_JOB_SUMMARY, false)
+ && !classOf[ParquetOutputCommitter].isAssignableFrom(committerClass)) {
+ // output summary is requested, but the class is not a Parquet Committer
+ throw new RuntimeException(s"Committer $committerClass is not a ParquetOutputCommitter" +
+ s" and cannot create job summaries.")
--- End diff --
Depends on the policy about "what to do if it's not a Parquet committer
*and* the option for job summaries is set". It could just mean "you don't
get summaries", which works for me :). May want to log at info though?
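
A minimal sketch of the "no summaries, just log at info" policy, for comparison with the throwing version in the diff. The committer classes here are stand-ins rather than the real Hadoop/Parquet types, and `JobSummaryPolicy`, `summariesEnabled`, and `OtherCommitter` are hypothetical names used only for illustration; logging goes through `java.util.logging` rather than Spark's own logging:

```scala
import java.util.logging.Logger

// Stand-ins for the real committer classes (assumption: illustration only,
// not the Hadoop/Parquet types of the same name).
class OutputCommitter
class ParquetOutputCommitter extends OutputCommitter
class OtherCommitter extends OutputCommitter

object JobSummaryPolicy {
  private val log = Logger.getLogger("JobSummaryPolicy")

  /**
   * Returns whether job summaries will actually be written.
   * If summaries are requested but the committer is not a
   * ParquetOutputCommitter, log at INFO and continue without them
   * instead of throwing.
   */
  def summariesEnabled(
      requested: Boolean,
      committerClass: Class[_ <: OutputCommitter]): Boolean = {
    val supported =
      classOf[ParquetOutputCommitter].isAssignableFrom(committerClass)
    if (requested && !supported) {
      log.info(s"Committer ${committerClass.getName} is not a " +
        "ParquetOutputCommitter; job summaries will not be created.")
    }
    requested && supported
  }
}
```

With this shape, a job using a non-Parquet committer still runs; it just gets no summary files and a single info-level line saying why.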
---