Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23052

First of all, we sometimes do need to write "empty" files, so that we can infer the schema of a Parquet directory. An empty Parquet file is not truly empty, since it still contains a header and footer. https://github.com/apache/spark/pull/20525 guarantees that we always write out at least one empty file.

One important invariant: when we write an empty dataframe out to files and read it back, it should still be an empty dataframe.

I'd suggest we skip empty files in text-based data sources, and later send a follow-up PR that stops writing empty text files, as a perf improvement.