Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/20386#discussion_r164735522
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataSourceWriter.java ---
@@ -40,16 +40,21 @@
 * 1. Create a writer factory by {@link #createWriterFactory()}, serialize and send it to all the
 *    partitions of the input data(RDD).
 * 2. For each partition, create the data writer, and write the data of the partition with this
- *    writer. If all the data are written successfully, call {@link DataWriter#commit()}. If
- *    exception happens during the writing, call {@link DataWriter#abort()}.
- * 3. If all writers are successfully committed, call {@link #commit(WriterCommitMessage[])}. If
- *    some writers are aborted, or the job failed with an unknown reason, call
- *    {@link #abort(WriterCommitMessage[])}.
+ *    writer. If one data writer finishes successfully, the commit message will be sent back to
+ *    the driver side and Spark will call {@link #add(WriterCommitMessage)}.
+ *    If exception happens during the writing, call {@link DataWriter#abort()}.
+ * 3. If all the data writers finish successfully, and {@link #add(WriterCommitMessage)} is
+ *    successfully called for all the commit messages, Spark will call {@link #commit()}.
+ *    If any of the data writers failed, or any of the {@link #add(WriterCommitMessage)}
+ *    calls failed, or the job failed with an unknown reason, call {@link #abort()}.
 *
 * While Spark will retry failed writing tasks, Spark won't retry failed writing jobs. Users should
 * do it manually in their Spark applications if they want to retry.
 *
- * Please refer to the documentation of commit/abort methods for detailed specifications.
+ * All these methods are guaranteed to be called in a single thread.
--- End diff --
nit: `... in a single thread at driver side`
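
For context, here is a minimal sketch of the interface shape the updated javadoc describes. The method names (`createWriterFactory()`, `add(WriterCommitMessage)`, `commit()`, `abort()`) come from the diff above; the `Row` type parameter and the comments are assumptions for illustration, not necessarily the PR's exact code:

```java
package org.apache.spark.sql.sources.v2.writer;

import org.apache.spark.sql.Row;

public interface DataSourceWriter {

  // Step 1: create a writer factory; Spark serializes it and sends it to
  // every partition of the input data (RDD).
  DataWriterFactory<Row> createWriterFactory();

  // Step 2 (driver side): invoked once per data writer that finishes
  // successfully, with the commit message that writer sent back.
  void add(WriterCommitMessage message);

  // Step 3: invoked after add() has succeeded for every commit message.
  void commit();

  // Invoked if any data writer fails, any add() call fails, or the job
  // fails for an unknown reason.
  void abort();
}
```

Under the single-thread guarantee discussed in the nit, an implementation could accumulate commit messages inside `add()` without extra synchronization.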