Xuefu Zhang created SPARK-22765:
-----------------------------------

             Summary: Create a new executor allocation scheme based on that of MR
                 Key: SPARK-22765
                 URL: https://issues.apache.org/jira/browse/SPARK-22765
             Project: Spark
          Issue Type: Improvement
          Components: Scheduler
    Affects Versions: 1.6.0
            Reporter: Xuefu Zhang


Many users migrating their workloads from MR to Spark see a significant 
increase in resource consumption (e.g., SPARK-22683). While this might not be a 
concern for users who are more performance-centric, for cost-conscious users 
such an increase is an obstacle to migration. The situation can get worse as 
more users move to the cloud.

Dynamic allocation makes it possible to deploy Spark in a multi-tenant 
environment. However, its design is performance-centric, and its inefficiency 
has unfortunately shown up, especially when compared with MR. We therefore 
believe that an MR-style scheduler still has its merits. Based on our research, 
the inefficiency associated with dynamic allocation has several sources, such 
as executors idling out, bigger executors, and the many stages in a Spark job 
(rather than only two stages in MR).

Rather than fine-tuning dynamic allocation for efficiency, the proposal here is 
to add a new, efficiency-centric scheduling scheme based on that of MR. Such an 
MR-based scheme can be further enhanced and better adapted to the Spark 
execution model. This alternative is expected to offer a good performance 
improvement over MR while matching or even exceeding MR's efficiency.

Input is very welcome!



