GitHub user rdblue opened a pull request:
https://github.com/apache/spark/pull/22190
SPARK-25188: Add WriteConfig to v2 write API.
## What changes were proposed in this pull request?
This updates the v2 write path to use a structure similar to that of the v2 read path.
Individual writes are configured and tracked using `WriteConfig` (analogous to
`ScanConfig`) and this config is passed to the methods of `WriteSupport` that
are specific to a single write, like `commit` and `abort`.
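
As a rough illustration of that shape, here is a minimal sketch in which a per-write config object is created once and then handed to `commit` and `abort`. The interfaces below are simplified stand-ins written for this example; the method names `createWriteConfig`, `writeId`, and `options`, and the `InMemoryWriteSupport` class, are assumptions and not the signatures in this PR.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a per-write config (not the actual Spark interface).
interface WriteConfig {
    String writeId();
    Map<String, String> options();
}

// Hypothetical write support: methods specific to one write take that write's config.
interface WriteSupport {
    WriteConfig createWriteConfig(Map<String, String> options);
    void commit(WriteConfig config, List<String> taskMessages);
    void abort(WriteConfig config);
}

final class SimpleWriteConfig implements WriteConfig {
    private final String writeId;
    private final Map<String, String> options;

    SimpleWriteConfig(String writeId, Map<String, String> options) {
        this.writeId = writeId;
        this.options = new HashMap<>(options); // defensive copy
    }

    @Override public String writeId() { return writeId; }
    @Override public Map<String, String> options() { return options; }
}

// Toy implementation that records which writes were committed or aborted.
final class InMemoryWriteSupport implements WriteSupport {
    private int nextId = 0;
    final Map<String, List<String>> committed = new HashMap<>();
    final List<String> aborted = new ArrayList<>();

    @Override
    public WriteConfig createWriteConfig(Map<String, String> options) {
        // Each write gets its own config, so concurrent writes stay distinguishable.
        return new SimpleWriteConfig("write-" + (nextId++), options);
    }

    @Override
    public void commit(WriteConfig config, List<String> taskMessages) {
        committed.put(config.writeId(), new ArrayList<>(taskMessages));
    }

    @Override
    public void abort(WriteConfig config) {
        aborted.add(config.writeId());
    }
}
```

Because the config identifies the write, one `WriteSupport` instance in this sketch can track several writes at once without keeping per-write state in the support object itself.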
This new config will be used to communicate overwrite options to data
sources that implement new support classes, `BatchOverwriteSupport` and
`BatchPartitionOverwriteSupport`. The new config could also be used by
implementations to get and hold locks to make operations atomic.
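
To make the locking idea concrete, the following is a minimal sketch of a config that carries an overwrite mode and holds a table-level lock from the start of a write until `commit` or `abort`. Everything here is hypothetical: `OverwriteMode`, `LockingWriteConfig`, and `LockingTable` are names invented for this example, and the PR does not prescribe how a source should implement locking.

```java
import java.util.concurrent.Semaphore;

// Hypothetical overwrite options that a per-write config could communicate.
enum OverwriteMode { APPEND, OVERWRITE_ALL, OVERWRITE_PARTITIONS }

// Hypothetical config that owns the lock for the duration of one write.
final class LockingWriteConfig {
    private final OverwriteMode mode;
    private final Semaphore tableLock;
    private boolean released = false;

    LockingWriteConfig(OverwriteMode mode, Semaphore tableLock) {
        this.mode = mode;
        this.tableLock = tableLock;
    }

    OverwriteMode mode() { return mode; }

    // Idempotent, so calling abort after commit cannot release the lock twice.
    synchronized void release() {
        if (!released) {
            released = true;
            tableLock.release();
        }
    }
}

// Toy table that allows one write at a time.
final class LockingTable {
    private final Semaphore lock = new Semaphore(1);

    LockingWriteConfig startWrite(OverwriteMode mode) {
        // Acquire the lock when the config is created; fail fast if another write holds it.
        if (!lock.tryAcquire()) {
            throw new IllegalStateException("table is locked by another write");
        }
        return new LockingWriteConfig(mode, lock);
    }

    void commit(LockingWriteConfig config) { config.release(); }

    void abort(LockingWriteConfig config) { config.release(); }

    boolean isLocked() { return lock.availablePermits() == 0; }
}
```

A `Semaphore` is used instead of a `ReentrantLock` in this sketch because the thread that commits a write may differ from the thread that started it, and a semaphore permit can be released from any thread.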
Streaming is also updated to use a `StreamingWriteConfig`, which holds options that are
specific to a write, like schema, output mode, and write options.
## How was this patch tested?
This is primarily an API change and should pass existing tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/rdblue/spark SPARK-25188-add-write-config
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22190.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22190
----
commit e3fcc83a4a55576821573ceb9a3a56b89218a187
Author: Ryan Blue <blue@...>
Date: 2018-08-22T21:17:11Z
SPARK-25188: Add WriteConfig to v2 write API.
----