Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/6648#issuecomment-147552582
Hey Imran,
Given the number of changes required for this approach, I wonder whether an
atomic-rename design might be simpler (in particular, the "first attempt
wins" scheme in the doc). The doc seems worried that a file output might be
corrupted, but in that case, why not send a message to the node asking it to
delete its old output files, and then send a new map task? That could just be
the delete-block message that the block manager already supports. This seems
much nicer because it doesn't require any changes to the data structures in
the rest of Spark.
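A minimal sketch of the recovery protocol suggested above: on suspected corruption, reuse a delete-block message to clear the old output, then re-run the map task so the fresh output replaces it. All names here (NodeBlockStore, RecoveryCoordinator, deleteBlock) are illustrative stand-ins, not Spark's actual internal API.

```scala
import scala.collection.mutable

// Toy stand-in for a node's block store.
class NodeBlockStore {
  private val blocks = mutable.Map[String, Array[Byte]]()
  def put(id: String, data: Array[Byte]): Unit = blocks(id) = data
  def get(id: String): Option[Array[Byte]] = blocks.get(id)
  // Analogue of the existing delete-block message.
  def deleteBlock(id: String): Unit = blocks.remove(id)
}

object RecoveryCoordinator {
  // On suspected corruption: delete the old output, then re-run the map task.
  // No new data structures are needed elsewhere; recovery is local to the node.
  def recover(store: NodeBlockStore, blockId: String,
              runMapTask: () => Array[Byte]): Unit = {
    store.deleteBlock(blockId)        // clear possibly-corrupt output
    store.put(blockId, runMapTask())  // re-send the map task; new output wins
  }
}

object Demo extends App {
  val store = new NodeBlockStore
  store.put("shuffle_0_1_0", Array[Byte](0, 0)) // possibly corrupt output
  RecoveryCoordinator.recover(store, "shuffle_0_1_0",
    () => Array[Byte](1, 2, 3))
  assert(store.get("shuffle_0_1_0").exists(_.sameElements(Array[Byte](1, 2, 3))))
  println("recovered")
}
```

The design point is that the delete-then-rerun sequence needs no bookkeeping outside the node itself, which is why it avoids touching shared data structures.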