Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/3861#issuecomment-74207099
@tnachen Thanks for working on this. Before I dive deeper into the
implementation, there is one main open question I'd like to address. The
external shuffle service is intended to outlive individual executors, and so it
is launched independently of any Spark application. The service is what enables
dynamic allocation of resources, because it can continue to serve an executor's
shuffle files after the executor has been killed. However, in this patch the
service appears to be started inside the executor backend itself, so its fate
is necessarily tied to the application.
If I understand correctly, the Mesos slave is equivalent to the standalone
Worker in that it is long running and lives beyond the lifetime of a particular
application. If this is the case, the appropriate place to start the shuffle
service would be there instead.
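To make the lifetime argument concrete, here is a minimal plain-Java sketch of the intended separation. The class names (`WorkerDaemonSketch`, `ShuffleServiceSketch`, `ExecutorBackendSketch`) are illustrative stand-ins, not Spark's actual API: the point is only that the service is owned by the long-running daemon, so killing an application's executor does not take the service down with it.

```java
// Illustrative sketch (not Spark code): the shuffle service belongs to the
// long-running daemon (standalone Worker / Mesos slave), not to the
// per-application executor backend.
public class WorkerDaemonSketch {
    // Started once when the daemon boots; survives every application.
    static final ShuffleServiceSketch shuffleService = new ShuffleServiceSketch();

    static class ShuffleServiceSketch {
        private boolean running = false;
        void start() { running = true; }        // serve shuffle files over the network
        boolean isRunning() { return running; }
    }

    static class ExecutorBackendSketch {
        // Per-application; dynamic allocation may kill it at any time.
        void launchExecutor() { /* run tasks, write shuffle files */ }
        void kill() { /* executor dies; shuffleService keeps serving its files */ }
    }

    public static void main(String[] args) {
        shuffleService.start();                  // daemon startup, app-independent
        ExecutorBackendSketch backend = new ExecutorBackendSketch();
        backend.launchExecutor();
        backend.kill();                          // the application's executor is gone...
        System.out.println(shuffleService.isRunning()); // ...but the service is still up
    }
}
```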
Another issue is that this patch in its current state seems to conflate two
concerns: (1) dynamic allocation and (2) the external shuffle service. (1) is
what you refer to as auto-scaling on the JIRA, and it depends on (2) to work.
However, since we already check whether the shuffle service is enabled in
`ExecutorAllocationManager`, we shouldn't check it again when launching the
Mesos executor. More specifically, I don't see why we launch the executor in
two different ways depending on whether (2) is enabled. I believe a better
solution is to keep these two concerns separate and launch the executor the
same way we already do today.
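To sketch the configuration point: the two concerns map to two independent settings, and the "(1) requires (2)" validation happens exactly once. The config keys below are Spark's real keys; the validation code itself is a hypothetical illustration, assuming a single up-front check like the one `ExecutorAllocationManager` performs, after which the executor launch path never needs to branch on (2) again.

```java
import java.util.HashMap;
import java.util.Map;

// Hedged sketch (not Spark code): dynamic allocation (1) and the external
// shuffle service (2) are independent settings; (1) implies (2) is checked once.
public class AllocationConfSketch {
    static boolean getBool(Map<String, String> conf, String key) {
        return Boolean.parseBoolean(conf.getOrDefault(key, "false"));
    }

    // Mirrors the single existing check: dynamic allocation requires the service.
    static boolean validate(Map<String, String> conf) {
        boolean dynAlloc = getBool(conf, "spark.dynamicAllocation.enabled");
        boolean shuffleSvc = getBool(conf, "spark.shuffle.service.enabled");
        return !dynAlloc || shuffleSvc;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("spark.dynamicAllocation.enabled", "true");
        conf.put("spark.shuffle.service.enabled", "true");
        // Once validated, the executor can be launched the same way in all cases.
        System.out.println(validate(conf));
    }
}
```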