[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173542#comment-14173542 ]
Praveen Seluka edited comment on SPARK-3174 at 10/16/14 8:55 AM:
-----------------------------------------------------------------

[~vanzin] Regarding your question about the scaling heuristic:

- It does take the backlog of tasks into account. Once some tasks complete, it calculates the average runtime of a task (within a stage), then estimates the remaining runtime of the stage using the following heuristic:

      var estimatedRuntimeForStage = averageRuntime * (remainingTasks + (activeTasks / 2))

  We add (activeTasks / 2) because we need to take the currently running tasks into account.

- I think the proposal I have made is not very different from the exponential approach. Say we start the Spark application with just 2 executors: it will double the number of executors, going to 4, 8, and so on. The difference I see compared to the exponential approach is that we start doubling from the current executor count, whereas the exponential approach starts from 1 (and also resets to 1 when there are no pending tasks). That said, this could be altered to behave the same as the exponential approach.

- In addition, this proposal adds an ETA-based heuristic (a threshold on stage completion time).

- This proposal also adds a "memory used" heuristic to the scaling decisions, which is missing from Andrew's PR (correct me if I am wrong here). This will certainly be very useful.

- The main point is that it does all of this without making any changes to the TaskSchedulerImpl/TaskSetManager code base.
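The two heuristics above (ETA estimation from average task runtime, and doubling the executor count while the ETA exceeds a threshold) can be sketched roughly as follows. This is a minimal illustration, not code from the actual patch; all names (estimatedRuntimeForStage, nextExecutorCount, stage-completion threshold, executor cap) are hypothetical:

```scala
// Sketch of the scaling heuristics described in the comment above.
// Hypothetical names and parameters, not taken from the real proposal.
object ScalingHeuristicSketch {

  /** Estimate the remaining runtime of a stage from the average task runtime.
    * Active (running) tasks are counted at half weight, since on average they
    * are assumed to be halfway done. The comment's original snippet uses
    * integer division (activeTasks / 2); 2.0 is used here to avoid truncation. */
  def estimatedRuntimeForStage(averageRuntime: Double,
                               remainingTasks: Int,
                               activeTasks: Int): Double =
    averageRuntime * (remainingTasks + activeTasks / 2.0)

  /** Double the current executor count when the estimated stage runtime
    * exceeds the stage-completion-time threshold, capped at maxExecutors. */
  def nextExecutorCount(current: Int,
                        estimatedRuntime: Double,
                        threshold: Double,
                        maxExecutors: Int): Int =
    if (estimatedRuntime > threshold) math.min(current * 2, maxExecutors)
    else current

  def main(args: Array[String]): Unit = {
    // 10 s average task runtime, 40 tasks remaining, 8 tasks running:
    val eta = estimatedRuntimeForStage(10.0, 40, 8)
    println(s"estimated remaining runtime: $eta s") // 10 * (40 + 4) = 440.0 s
    // The ETA exceeds a 120 s threshold, so double from 2 to 4 executors:
    println(nextExecutorCount(2, eta, 120.0, 32))
  }
}
```

Repeated scaling rounds then produce the 2, 4, 8, ... progression described above, except that (unlike the exponential approach) the doubling starts from the current count rather than resetting to 1.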
> Provide elastic scaling within a Spark application
> --------------------------------------------------
>
>                 Key: SPARK-3174
>                 URL: https://issues.apache.org/jira/browse/SPARK-3174
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, YARN
>    Affects Versions: 1.0.2
>            Reporter: Sandy Ryza
>            Assignee: Andrew Or
>         Attachments: SPARK-3174design.pdf, SparkElasticScalingDesignB.pdf, dynamic-scaling-executors-10-6-14.pdf
>
>
> A common complaint with Spark in a multi-tenant environment is that applications have a fixed allocation that doesn't grow and shrink with their resource needs. We're blocked on YARN-1197 for dynamically changing the resources within executors, but we can still allocate and discard whole executors.
> It would be useful to have some heuristics that
> * Request more executors when many pending tasks are building up
> * Discard executors when they are idle
> See the latest design doc for more information.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)