Github user squito commented on the issue:

    https://github.com/apache/spark/pull/21698
  
    @tgravescs it's not guaranteed to reproduce with that.  IIUC, you need to do 
a repartition in the same stage that also does a shuffle read, then hit a 
fetch failure, and on recompute that stage needs to fetch its shuffle data in a 
different order.  I think you probably need to make sure the fetches are remote 
to get a different order on a retry (local shuffle reads are deterministic, I 
think).
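
To illustrate why the fetch order matters, here is a minimal sketch in plain Python (not Spark code). It assumes a simplified round-robin repartition that assigns records purely by their position in the input; the names `round_robin`, `block_a`, and `block_b` are hypothetical, and the real partitioner differs in detail (for example in its starting offset).

```python
def round_robin(records, num_partitions):
    """Assign each record to an output partition based only on its input position."""
    parts = [[] for _ in range(num_partitions)]
    for i, rec in enumerate(records):
        parts[i % num_partitions].append(rec)
    return parts


# Two shuffle blocks read by the same task. With remote fetches, the order in
# which the blocks arrive may differ between the first attempt and a retry.
block_a = ["a1", "a2", "a3"]
block_b = ["b1", "b2", "b3"]
all_records = block_a + block_b

first_attempt = round_robin(block_a + block_b, 2)  # blocks arrive as A, B
retry_attempt = round_robin(block_b + block_a, 2)  # blocks arrive as B, A

# Assumed scenario: output partition 0 from the first attempt was already
# consumed downstream, while partition 1 is lost to a fetch failure and is
# recomputed by the retry, which sees the input in a different order.
combined = first_attempt[0] + retry_attempt[1]

missing = sorted(set(all_records) - set(combined))
duplicated = sorted({r for r in combined if combined.count(r) > 1})

print("missing:", missing)
print("duplicated:", duplicated)
```

In this toy setup the combined output both drops and duplicates records, which is the kind of incorrect result that only shows up when the retry reads its input in a different order.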
    
    @jiangxb1987 has work on this stopped?  I think there are still ideas for 
how to go forward on this, and it's a really important fix.

