GitHub user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20369#discussion_r164008791
  
    --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/streaming/writer/StreamWriter.java ---
    @@ -37,8 +41,28 @@
        */
       void commit(long epochId, WriterCommitMessage[] messages);
     
    +  /**
    +   * Aborts this writing job because some data writers are failed and keep failing when retry, or
    +   * the Spark job fails with some unknown reasons, or {@link #commit(WriterCommitMessage[])} fails.
    +   *
    +   * If this method fails (by throwing an exception), the underlying data source may require manual
    +   * cleanup.
    +   *
    +   * Unless the abort is triggered by the failure of commit, the given messages should have some
    +   * null slots as there maybe only a few data writers that are committed before the abort
    +   * happens, or some data writers were committed but their commit messages haven't reached the
    +   * driver when the abort is triggered. So this is just a "best effort" for data sources to
    +   * clean up the data left by data writers.
    +   */
    +  void abort(long epochId, WriterCommitMessage[] messages);
    +
       default void commit(WriterCommitMessage[] messages) {
         throw new UnsupportedOperationException(
    -       "Commit without epoch should not be called with ContinuousWriter");
    +       "Commit without epoch should not be called with StreamWriter");
    --- End diff --
    
    Nit: one more space at the start of this line?

