GitHub user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19448#discussion_r144331909
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala ---
    @@ -138,6 +138,10 @@ class ParquetFileFormat
           conf.setBoolean(ParquetOutputFormat.ENABLE_JOB_SUMMARY, false)
         }
     
    +    require(!conf.getBoolean(ParquetOutputFormat.ENABLE_JOB_SUMMARY, false)
    --- End diff --
    
    I think I'd prefer the warn-and-continue option. It does little good to
    fail so late in a job, when the caller has already indicated that they want
    to use a different committer. This isn't a correctness issue, so let them
    write the data out; they can add a summary file later if they want.
    Skipping the summary file is less annoying and disruptive than failing the
    job and forcing the user to re-run it near the end.
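
To make the warn-and-continue option concrete, here is a minimal sketch. It is not the code in this PR. It uses a plain mutable map as a stand-in for the Hadoop `Configuration`, and the helper name `disableSummaryWithWarning`, the `committerSupportsSummary` flag, and the `warn` callback are all hypothetical. The key string is assumed to be the value behind `ParquetOutputFormat.ENABLE_JOB_SUMMARY`.

```scala
import scala.collection.mutable

object SummaryCheck {
  // Assumed to match the key behind ParquetOutputFormat.ENABLE_JOB_SUMMARY.
  val EnableJobSummary = "parquet.enable.summary-metadata"

  // Hypothetical helper: if a job summary was requested but the chosen
  // committer cannot produce one, log a warning and turn the summary off
  // instead of failing the job.
  def disableSummaryWithWarning(
      conf: mutable.Map[String, String],   // stand-in for a Hadoop Configuration
      committerSupportsSummary: Boolean,   // hypothetical flag for the committer check
      warn: String => Unit): Unit = {
    val summaryRequested = conf.get(EnableJobSummary).contains("true")
    if (summaryRequested && !committerSupportsSummary) {
      warn(s"$EnableJobSummary is enabled, but the configured committer " +
        "cannot write summary files; continuing without a summary.")
      conf(EnableJobSummary) = "false"
    }
  }
}
```

With this shape, a job that requested a summary but configured an incompatible committer still writes its data; the only visible effect is the warning and the missing summary file. A `require` in the same place would instead throw `IllegalArgumentException` and abort the write.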

