Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/22952#discussion_r231705717
--- Diff: docs/structured-streaming-programming-guide.md ---
@@ -530,6 +530,8 @@ Here are the details of all the sources in Spark.
"s3://a/dataset.txt"<br/>
"s3n://a/b/dataset.txt"<br/>
"s3a://a/b/c/dataset.txt"<br/>
+ <br/>
+ <code>renameCompletedFiles</code>: whether to rename completed
files in previous batch (default: false). If the option is enabled, input file
will be renamed with additional postfix "_COMPLETED_". This is useful to clean
up old input files to save space in storage.
--- End diff --
For example, see the schema merging section of the SQL programming guide:
http://spark.apache.org/docs/latest/sql-programming-guide.html#schema-merging
```
Since schema merging is a relatively expensive operation,
and is not a necessity in most cases, we turned it off
by default starting from 1.5.0.
```
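To make the behavior described in the diff concrete, here is a minimal sketch of the rename step in plain Python. It is not Spark's implementation: the helper name `mark_completed` is hypothetical, and appending the postfix to the end of the file name is an assumption based on the wording "additional postfix".

```python
import os
import tempfile

# Postfix quoted in the diff; where it is placed in the name is an assumption.
COMPLETED_POSTFIX = "_COMPLETED_"


def mark_completed(path: str) -> str:
    """Append the postfix to a processed input file (hypothetical helper, not a Spark API)."""
    if path.endswith(COMPLETED_POSTFIX):
        return path  # already marked, so leave it untouched
    new_path = path + COMPLETED_POSTFIX
    os.rename(path, new_path)
    return new_path


def demo() -> list:
    """Create one input file in a temporary directory, mark it, and list the directory."""
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "dataset.txt")
        with open(src, "w") as f:
            f.write("a,b\n")
        mark_completed(src)
        return sorted(os.listdir(d))
```

With a convention like this, a cleanup job can delete every file ending in the postfix without having to know which batch consumed it.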