Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/20386#discussion_r164648645
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataSourceV2Writer.java
---
@@ -40,11 +40,13 @@
* 1. Create a writer factory by {@link #createWriterFactory()}, serialize and send it to all the
* partitions of the input data(RDD).
* 2. For each partition, create the data writer, and write the data of the partition with this
- * writer. If all the data are written successfully, call {@link DataWriter#commit()}. If
- * exception happens during the writing, call {@link DataWriter#abort()}.
- * 3. If all writers are successfully committed, call {@link #commit(WriterCommitMessage[])}. If
+ * writer. If all the data are written successfully, call {@link DataWriter#commit()}.
+ * On a writer being successfully committed, call {@link #add(WriterCommitMessage)} to
+ * handle its commit message.
+ * If exception happens during the writing, call {@link DataWriter#abort()}.
+ * 3. If all writers are successfully committed, call {@link #commit()}. If
--- End diff ---
If all the data writers finish successfully, and #add is successfully
called for all the commit messages, Spark will call #commit. If any of the data
writers fails, any #add call fails, or the job fails for an unknown reason,
Spark will call #abort.
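The suggested commit/abort protocol can be sketched with simplified, hypothetical stand-ins for the V2 writer interfaces. The names mirror the Javadoc above, but this is not Spark's actual implementation; `runJob` and the single-`String`-record model are invented here for illustration only:

```java
import java.util.List;

// Hypothetical, simplified stand-ins for the interfaces named in the Javadoc.
interface WriterCommitMessage {}

interface DataWriter {
  void write(String record) throws Exception;
  WriterCommitMessage commit();
  void abort();
}

interface DataSourceV2Writer {
  DataWriter createWriter(int partitionId);
  void add(WriterCommitMessage message); // handle one writer's commit message
  void commit();                         // job-level commit (step 3)
  void abort();                          // job-level cleanup on any failure
}

public class WriteProtocolSketch {
  // Driver-side orchestration of the protocol sketched above: write each
  // partition, hand each writer's commit message to add(), then commit() the
  // job if everything succeeded, or abort() it otherwise.
  public static boolean runJob(DataSourceV2Writer jobWriter, List<List<String>> partitions) {
    try {
      for (int p = 0; p < partitions.size(); p++) {
        DataWriter writer = jobWriter.createWriter(p);
        WriterCommitMessage message;
        try {
          for (String record : partitions.get(p)) {
            writer.write(record);
          }
          message = writer.commit(); // step 2: per-writer commit
        } catch (Exception e) {
          writer.abort();            // exception while writing this partition
          throw e;
        }
        // Still inside the outer try: a failing add() call aborts the job.
        jobWriter.add(message);
      }
      jobWriter.commit();            // step 3: all writers and add() calls succeeded
      return true;
    } catch (Exception e) {
      jobWriter.abort();             // any writer, add(), or unknown failure
      return false;
    }
  }
}
```

The key point of the wording change: `add` sits between the per-writer `commit` and the job-level `commit`, so a failure in `add` must route to the job-level `abort` just like a writer failure.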
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]