GitHub user gengliangwang opened a pull request: https://github.com/apache/spark/pull/21381
refactor ExecuteWriteTask

## What changes were proposed in this pull request?

As I am working on the File data source V2 write path [in my repo](https://github.com/gengliangwang/spark/blob/47f39e1f54bc748e116ae9580413fae317898327/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileSourceWriter.scala#L78), I find it essential to refactor `ExecuteWriteTask` in `FileFormatWriter` with `DataWriter` of Data Source V2:

1. Reuse the code in both `FileFormat` and Data Source V2.
2. Better abstraction: callers only need to call `commit()` or `abort()` at the end of the task. There is also less code in `SingleDirectoryWriteTask` and `DynamicPartitionWriteTask`.

This PR is part of the Data Source V2 migration.

Definitions of the related classes are moved to a new file, and `ExecuteWriteTask` is renamed to `FileFormatDataWriter`.

## How was this patch tested?

Existing unit tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gengliangwang/spark refactorExecuteWriteTask

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21381.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21381

----
commit cbd4ce2959bdfe63dff32d0c36b2982fcde22aac
Author: Gengliang Wang <gengliang.wang@...>
Date:   2018-05-21T12:16:14Z

    refactor ExecuteWriteTask

----
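
For readers unfamiliar with the pattern, the "callers only need to call `commit()` or `abort()`" point in the description can be illustrated with a minimal, self-contained sketch. This is not the code in the PR: `SketchDataWriter`, `CommitMessage`, `BufferingWriter`, and `WriteTaskSketch.runTask` are hypothetical stand-ins that only mirror the write/commit/abort shape of a Data Source V2 `DataWriter`, and the in-memory buffer stands in for real file output.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.control.NonFatal

// Hypothetical stand-in for a commit message returned by a writer.
final case class CommitMessage(rowsWritten: Int)

// Hypothetical trait mirroring the write/commit/abort shape of a data writer.
trait SketchDataWriter[T] {
  def write(record: T): Unit
  def commit(): CommitMessage
  def abort(): Unit
}

// Illustrative writer: buffers rows in memory instead of writing files.
final class BufferingWriter extends SketchDataWriter[String] {
  val committed: ArrayBuffer[String] = ArrayBuffer.empty[String]
  private val pending = ArrayBuffer.empty[String]
  var aborted: Boolean = false

  override def write(record: String): Unit = {
    pending += record
    ()
  }

  // Make all pending rows visible and report how many were written.
  override def commit(): CommitMessage = {
    val n = pending.size
    committed ++= pending
    pending.clear()
    CommitMessage(n)
  }

  // Discard everything that was written but not yet committed.
  override def abort(): Unit = {
    pending.clear()
    aborted = true
  }
}

object WriteTaskSketch {
  // The caller writes every row, then calls commit() on success
  // or abort() on failure; it knows nothing about the writer's internals.
  def runTask[T](writer: SketchDataWriter[T], rows: Iterator[T]): Option[CommitMessage] =
    try {
      rows.foreach(r => writer.write(r))
      Some(writer.commit())
    } catch {
      case NonFatal(_) =>
        writer.abort()
        None
    }
}
```

With this shape, a single-directory writer and a dynamic-partition writer would differ only in how `write` routes a row, while the task-level caller stays the same.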