[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173542#comment-14173542 ]
Praveen Seluka edited comment on SPARK-3174 at 10/16/14 8:55 AM:
-----------------------------------------------------------------

[~vanzin] Regarding your question about the scaling heuristic:

- It does take the backlog of tasks into account. Once some tasks complete, it calculates the average runtime of a task (within a stage), then estimates the remaining runtime of the stage using the following heuristic:

      var estimatedRuntimeForStage = averageRuntime * (remainingTasks + (activeTasks / 2))

  We add (activeTasks / 2) because we need to take the currently running tasks into account.

- I think the proposal I have made is not very different from the exponential approach. Say we start the Spark application with just 2 executors: it will double the number of executors, going to 4, 8, and so on. The difference I see compared to the exponential approach is that we start doubling from the current executor count, whereas the exponential approach starts from 1 (and also resets to 1 when there are no pending tasks). That said, this could be altered to behave the same as the exponential approach.

- In addition, this proposal adds an ETA-based heuristic (a threshold on stage completion time).

- This proposal also adds a "memory used" heuristic to the scaling decisions, which is missing from Andrew's PR (correct me if I am wrong here). This will certainly be very useful.

- The main point is that it does all of this without making any changes to the TaskSchedulerImpl/TaskSetManager code base.
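The two heuristics above (ETA estimation from average task runtime, and doubling the executor count while the ETA exceeds a threshold) can be sketched roughly as follows. This is a minimal illustration, not code from the actual patch; all names (estimatedRuntimeForStage, nextExecutorCount, stage-completion threshold, executor cap) are hypothetical:

```scala
// Sketch of the scaling heuristics described in the comment above.
// Hypothetical names and parameters, not taken from the real proposal.
object ScalingHeuristicSketch {

  /** Estimate the remaining runtime of a stage from the average task runtime.
    * Active (running) tasks are counted at half weight, since on average they
    * are assumed to be halfway done. The comment's original snippet uses
    * integer division (activeTasks / 2); 2.0 is used here to avoid truncation. */
  def estimatedRuntimeForStage(averageRuntime: Double,
                               remainingTasks: Int,
                               activeTasks: Int): Double =
    averageRuntime * (remainingTasks + activeTasks / 2.0)

  /** Double the current executor count when the estimated stage runtime
    * exceeds the stage-completion-time threshold, capped at maxExecutors. */
  def nextExecutorCount(current: Int,
                        estimatedRuntime: Double,
                        threshold: Double,
                        maxExecutors: Int): Int =
    if (estimatedRuntime > threshold) math.min(current * 2, maxExecutors)
    else current

  def main(args: Array[String]): Unit = {
    // 10 s average task runtime, 40 tasks remaining, 8 tasks running:
    val eta = estimatedRuntimeForStage(10.0, 40, 8)
    println(s"estimated remaining runtime: $eta s") // 10 * (40 + 4) = 440.0 s
    // The ETA exceeds a 120 s threshold, so double from 2 to 4 executors:
    println(nextExecutorCount(2, eta, 120.0, 32))
  }
}
```

Repeated scaling rounds then produce the 2, 4, 8, ... progression described above, except that (unlike the exponential approach) the doubling starts from the current count rather than resetting to 1.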
> Provide elastic scaling within a Spark application
> --------------------------------------------------
>
>                 Key: SPARK-3174
>                 URL: https://issues.apache.org/jira/browse/SPARK-3174
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, YARN
>    Affects Versions: 1.0.2
>            Reporter: Sandy Ryza
>            Assignee: Andrew Or
>         Attachments: SPARK-3174design.pdf, SparkElasticScalingDesignB.pdf, dynamic-scaling-executors-10-6-14.pdf
>
>
> A common complaint with Spark in a multi-tenant environment is that applications have a fixed allocation that doesn't grow and shrink with their resource needs. We're blocked on YARN-1197 for dynamically changing the resources within executors, but we can still allocate and discard whole executors.
> It would be useful to have some heuristics that
> * Request more executors when many pending tasks are building up
> * Discard executors when they are idle
> See the latest design doc for more information.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)