Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/21577
  
    >  * fixed the issue Mridul brought up, but I think the race that Tom 
describes still exists. I'm just not sure it would cause problems, since as far 
as I can tell it can only happen in a map stage, not a result stage.
    
    @vanzin which race were you referring to here? I think tracking the stage 
across attempts fixes both of the ones I mentioned in reference to scenario 2 
for Mridul.
    
    > There is another case though here where T1_1.1 could have just asked to 
be committed, but not yet committed, then if it gets delayed committing, the 
new stage attempt starts and T1_1.2 asks if it could commit and is granted, so 
then both try to commit at the same time causing corruption.
    
    Fixed: T1_1.2 won't be allowed to commit because we track the first 
stage attempt as committing.
    
    > The caveat there though would be if since T1_1.1 was committed, the 
second stage attempt could finish and call commitJob while T1_1.2 is committing 
since spark thinks it doesn't need to wait for T1_1.2. Anyway this seems very 
unlikely but we should protect against it.
    
    T1_1.2 shouldn't ever be allowed to commit, since we track across stage 
attempts, so it would never commit after the stage itself has completed.
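    To make the argument concrete, here is a minimal, hypothetical sketch (in 
Python, not Spark's actual Scala `OutputCommitCoordinator` code) of the idea 
above: commit authorization is keyed on (stage, partition) rather than on the 
stage *attempt*, so once T1_1.1 holds the commit right for a partition, 
T1_1.2 from a newer stage attempt is refused. The class and method names are 
made up for illustration.

    ```python
    class CommitCoordinator:
        """Toy model of commit tracking across stage attempts."""

        def __init__(self):
            # (stage_id, partition) -> (stage_attempt, task_attempt) of the
            # attempt currently authorized to commit. Note the key does NOT
            # include the stage attempt, so authorization survives a retry.
            self._committers = {}

        def can_commit(self, stage_id, stage_attempt, partition, task_attempt):
            key = (stage_id, partition)
            holder = self._committers.get(key)
            if holder is None:
                # First ask wins, regardless of which stage attempt it came from.
                self._committers[key] = (stage_attempt, task_attempt)
                return True
            # Deny everyone else, including tasks from newer stage attempts;
            # only the original holder gets an affirmative answer.
            return holder == (stage_attempt, task_attempt)


    coord = CommitCoordinator()
    # T1_1.1 (stage attempt 1, task attempt 1) asks first and is granted.
    print(coord.can_commit(stage_id=1, stage_attempt=1, partition=1,
                           task_attempt=1))   # True
    # T1_1.2 (stage attempt 2) asks for the same partition and is denied,
    # even though it belongs to the currently running stage attempt.
    print(coord.can_commit(stage_id=1, stage_attempt=2, partition=1,
                           task_attempt=2))   # False
    ```

    Under this scheme the double-commit in the quoted scenario can't happen: 
the grant given to T1_1.1 is visible to the second stage attempt, so T1_1.2's 
request is rejected both while T1_1.1 is committing and after the stage has 
completed.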
