Github user fangshil commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20931#discussion_r179672722
  
    --- Diff: core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala ---
    @@ -186,7 +186,9 @@ class HadoopMapReduceCommitProtocol(
            logDebug(s"Clean up default partition directories for overwriting: $partitionPaths")
            for (part <- partitionPaths) {
              val finalPartPath = new Path(path, part)
    -          fs.delete(finalPartPath, true)
    +          if (!fs.delete(finalPartPath, true) && !fs.exists(finalPartPath.getParent)) {
    --- End diff --
    
    The FileSystem API spec on delete says "Code SHOULD just call delete(path, 
    recursive) and assume the destination is no longer present". According to 
    its detailed spec, the only case in which delete may return false is when 
    `finalPartPath` does not exist; any other failure should result in an 
    exception. When `finalPartPath` does not exist, which is an expected case, 
    we only need to check whether the parent of `finalPartPath` exists, 
    because if the parent is missing, the subsequent rename will fail 
    according to its spec. Please advise if you think we should still 
    double-check `finalPartPath` before the rename.


---
