Github user steveloughran commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21257#discussion_r187954805
  
    --- Diff: core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala ---
    @@ -235,4 +247,23 @@ class HadoopMapReduceCommitProtocol(
           tmp.getFileSystem(taskContext.getConfiguration).delete(tmp, false)
         }
       }
    +
    +  /**
    +   * For now, just record the file to be deleted.
    +   */
    +  override def deleteWithJob(fs: FileSystem, path: Path, recursive: Boolean,
    +    canDeleteNow: Boolean = true): Boolean = {
    +    if (canDeleteNow) {
    +      super.deleteWithJob(fs, path, recursive)
    +    } else {
    +      pathsToDelete.add(path -> recursive)
    +    }
    +  }
    +
    +  private def deletePath(fs: FileSystem, path: Path, recursive: Boolean): Unit = {
    +    if (fs.exists(path) && !fs.delete(path, recursive)) {
    +      throw new IOException(s"Unable to clear output " +
    +        s"directory $path")
    +    }
    --- End diff --
    
    I'd personally ignore a failure on delete(), as the contract for the API 
    call is "if this doesn't raise an exception, then the dest is gone". You 
    can also skip the exists() check, as it is superfluous: delete() on a 
    nonexistent path simply returns false.
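    
    For illustration, a minimal sketch of that simplification, keeping the 
    deletePath signature from the diff above (an assumption of how the 
    suggestion might look, not the merged change):
    
        private def deletePath(fs: FileSystem, path: Path, recursive: Boolean): Unit = {
          // Per the FileSystem.delete() contract, a normal return means the
          // destination no longer exists (it returns false when the path was
          // already absent), so neither an exists() probe nor a thrown
          // IOException is needed here.
          fs.delete(path, recursive)
        }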


---
