[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256045#comment-14256045 ]

Andrew Or commented on SPARK-3174:
----------------------------------

Hey [~nemccarthy], I filed one at SPARK-4922, which is for coarse-grained mode. For fine-grained mode, there is already one that enables dynamically scaling memory instead of just CPU, at SPARK-1882. I believe there has not been progress on either issue yet.

> Provide elastic scaling within a Spark application
> --------------------------------------------------
>
>                 Key: SPARK-3174
>                 URL: https://issues.apache.org/jira/browse/SPARK-3174
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, YARN
>    Affects Versions: 1.0.2
>            Reporter: Sandy Ryza
>            Assignee: Andrew Or
>             Fix For: 1.2.0
>
>         Attachments: SPARK-3174design.pdf, SparkElasticScalingDesignB.pdf, dynamic-scaling-executors-10-6-14.pdf
>
> A common complaint with Spark in a multi-tenant environment is that applications have a fixed allocation that doesn't grow and shrink with their resource needs. We're blocked on YARN-1197 for dynamically changing the resources within executors, but we can still allocate and discard whole executors. It would be useful to have some heuristics that
> * Request more executors when many pending tasks are building up
> * Discard executors when they are idle
> See the latest design doc for more information.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14255344#comment-14255344 ]

Nathan M commented on SPARK-3174:
----------------------------------

Is there a JIRA to add this feature to Mesos?
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178554#comment-14178554 ]

Dag Liodden commented on SPARK-3174:
------------------------------------

Hey guys, glad to see some progress on this! I believe a very common use case here is to scale not only based on the internal needs of the job, but also on external factors. For instance, we're using the fair scheduler and want Spark jobs to opportunistically leverage available containers, but it's equally important that the executor count is scaled down if the job goes over its fair share during the job lifetime. I'm not sure that's being covered here? I'm sure there will be classes of jobs that heavily leverage RDD caching where heuristic yielding will be very hard to implement with deterministic performance impacts, but regular MR-style jobs should be pretty straightforward. Pluggable algorithms definitely sound like a viable approach for this, and for regular MR-style jobs it could be as simple as regularly checking the current target container count and creating or draining + killing executors accordingly?

Dag
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173542#comment-14173542 ]

Praveen Seluka commented on SPARK-3174:
---------------------------------------

[~vanzin] Regarding your question about the scaling heuristic: it does take the backlog of tasks into account. Once some task completes, it calculates the average runtime of a task (within a stage). Then it estimates the remaining runtime of the stage using the following heuristic:

var estimatedRuntimeForStage = averageRuntime * (remainingTasks + (activeTasks / 2))

We add (activeTasks / 2) because we need to take the currently running tasks into account.
- I think the proposal I have made is not very different from the exponential approach. Let's say we start the Spark application with just 2 executors. It will double the number of executors and hence go to 4, 8 and so on. The difference I see compared to the exponential approach is that we start doubling from the current count of executors, whereas the exponential approach starts from 1 (and also resets to 1 when there are no pending tasks). But yeah, this could be altered to do the same as the exponential approach.
- In a way, this proposal adds an ETA-based heuristic on top (a threshold for stage completion time).
- Also, this proposal adds a memory-usage heuristic for scaling decisions, which is missing in Andrew's PR (correct me if I'm wrong here). This will surely be very useful.
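The stage-runtime estimate above can be sketched as follows. This is only a minimal illustration of the heuristic as described in the comment, not code from the actual patch; all names are hypothetical:

```scala
// Sketch of the stage-runtime estimate heuristic from the comment above.
// Names are illustrative; this is not the code from the actual patch.
object StageRuntimeEstimate {
  /**
   * Estimate the remaining runtime of a stage from the average runtime of
   * its completed tasks. Active tasks are assumed to be roughly half
   * finished, hence the activeTasks / 2 term.
   */
  def estimatedRuntimeForStage(averageTaskRuntimeMs: Double,
                               remainingTasks: Int,
                               activeTasks: Int): Double =
    averageTaskRuntimeMs * (remainingTasks + activeTasks / 2.0)

  def main(args: Array[String]): Unit = {
    // 200 tasks left, 40 in flight, tasks averaging 1.5 s each:
    println(estimatedRuntimeForStage(1500.0, 200, 40)) // prints 330000.0 (ms)
  }
}
```

The estimate could then be compared against a stage-completion-time threshold to decide whether to request more executors.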
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173951#comment-14173951 ]

Marcelo Vanzin commented on SPARK-3174:
---------------------------------------

bq. Lets say we start the Spark application with just 2 executors. It will double the number of executors and hence goes to 4, 8 and so on.

Well, I'd say it's unusual for applications to start with a low number of executors, especially if the user knows they will be executing things right away. So if I start with 32 executors, your code will right away try to make it 64. Andrew's approach would try to make it 33, then 35, then... But I agree that it might be a good idea to make the auto-scaling backend an interface, so that we can easily play with different approaches. That shouldn't be hard at all.

bq. The main point being, It does all these without making any changes in TaskSchedulerImpl/TaskSetManager

Theoretically, I agree that's a good thing. I haven't gone through the code in detail, though, to know whether all the information Andrew is using from the scheduler is available from SparkListener events. If you can derive that info, great; I think it would be worth it to make the auto-scale code decoupled from the scheduler. If not, then we have the choice of either hooking the auto-scaling backend into the scheduler (like Andrew's change) or exposing more info in the events, which may or may not be a good thing, depending on what that info is. Anyway, as I've said, the two approaches are not irreconcilably different; they're actually more similar than not.
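To make the two ramp-up behaviors concrete, here is a toy sketch (hypothetical names, and a simplification of both designs) of how executor counts would evolve from 32 under a sustained backlog of pending tasks:

```scala
// Toy comparison of the two ramp-up policies discussed above, assuming a
// sustained backlog. Purely illustrative; not code from either proposal.
object RampUpPolicies {
  /** Doubling policy: double the current executor count each round. */
  def doubling(start: Int, rounds: Int): List[Int] =
    Iterator.iterate(start)(_ * 2).take(rounds + 1).toList

  /** Exponential-delta policy: add 1, then 2, then 4, ... executors per round. */
  def exponentialDelta(start: Int, rounds: Int): List[Int] =
    (0 until rounds).scanLeft(start)((count, round) => count + (1 << round)).toList

  def main(args: Array[String]): Unit = {
    println(doubling(32, 3))         // prints List(32, 64, 128, 256)
    println(exponentialDelta(32, 3)) // prints List(32, 33, 35, 39)
  }
}
```

The doubling policy overshoots quickly from a large starting count, while the exponential-delta policy ramps gently at first, which is the trade-off being debated in the comments.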
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172387#comment-14172387 ]

Praveen Seluka commented on SPARK-3174:
---------------------------------------

[~andrewor] Can you also comment on the API exposed to add/delete executors from SparkContext? I believe it will be:

sc.addExecutors(count: Int)
sc.deleteExecutors(List[String])

[~sandyr] [~tgraves] [~andrewor] [~vanzin] Can you please take a look at the design doc I have proposed? I am sure there are some pros in doing it this way; I have indicated them in detail in the doc. Since it does not change Spark core itself, you could easily replace it with another pluggable algorithm for dynamic scaling. I know that Andrew already has a PR based on his design doc, but I would surely love to get some feedback.
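The pluggable-policy idea behind those proposed hooks can be sketched roughly as below. This is hypothetical, not Spark's actual API; the trait, class, and method names are invented for illustration:

```scala
// Hypothetical sketch of the add/delete-executor hooks proposed above and
// how a pluggable scaling policy could sit on top of them, without touching
// scheduler internals. Not Spark's actual API.
object AllocationHookDemo {
  trait ExecutorAllocationHooks {
    def addExecutors(count: Int): Unit
    def deleteExecutors(executorIds: List[String]): Unit
  }

  /** Toy policy: request one executor per `tasksPerExecutor` pending tasks. */
  class SimpleBacklogPolicy(hooks: ExecutorAllocationHooks) {
    def onBacklog(pendingTasks: Int, tasksPerExecutor: Int): Unit =
      if (pendingTasks > 0)
        hooks.addExecutors(math.ceil(pendingTasks.toDouble / tasksPerExecutor).toInt)
  }

  /** In-memory stub standing in for the cluster-manager backend. */
  class RecordingHooks extends ExecutorAllocationHooks {
    var requested = 0
    def addExecutors(count: Int): Unit = requested += count
    def deleteExecutors(executorIds: List[String]): Unit = ()
  }

  def main(args: Array[String]): Unit = {
    val hooks = new RecordingHooks
    new SimpleBacklogPolicy(hooks).onBacklog(pendingTasks = 10, tasksPerExecutor = 4)
    println(hooks.requested) // prints 3
  }
}
```

Because the policy depends only on the two hooks, swapping in a different scaling algorithm would not require changes to the scheduler, which is the decoupling argument made in the thread.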
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172861#comment-14172861 ]

Marcelo Vanzin commented on SPARK-3174:
---------------------------------------

[~praveenseluka] I took a quick look at your design. I have a few comments:
* It ignores the shuffle, which is covered in Andrew's design. You can't scale down without figuring out what to do with the shuffle data of completed stages, since it may be reused. It's not just about cached blocks.
* Overall, I like the idea of having pluggable scaling algorithms, but I think it's too early for a public add/remove executor API. For example, standalone mode doesn't support adding or removing executors, AFAIK.
* The task runtime-based heuristic looks a bit weird. I think it needs to be more dynamic: e.g., actually consider the backlog of tasks vs. the number of tasks that can run concurrently on the current executors. Also, doubling the number of executors seems a bit like a sledgehammer; an exponential approach like Andrew has proposed might be able to reach similar results with lower overall system load.

I haven't looked at Andrew's PR yet (it's on my todo list), but I don't see a lot that is that much different in the two proposals. I agree that, if possible, it would be nice to keep the autoscaling code / interfaces as separate as possible from the current core, if only to make them easier to replace / refactor / disable, but then again, I haven't looked at Andrew's code yet.
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170998#comment-14170998 ]

Praveen Seluka commented on SPARK-3174:
---------------------------------------

- Posted a detailed autoscaling design doc: https://issues.apache.org/jira/secure/attachment/12674773/SparkElasticScalingDesignB.pdf
- Created a PR for https://issues.apache.org/jira/browse/SPARK-3822 (hooks to add/delete executors from SparkContext)
- Posted the detailed design of the autoscaling criteria, and also have a patch ready for the same.

Just to give some context, I have been looking at elastic autoscaling for quite some time. I mailed the Spark users list a few weeks back about the idea of having hooks for adding and deleting executors, and was ready to submit a patch. Last week, I saw the initial detailed design doc posted here in SPARK-3174. After looking briefly at the proposed design, a few things were evident to me:
- The design largely overlaps with what I have built so far. I also have the patch in working state now.
- I am sharing this design doc, which highlights the idea in slightly more detail. It would be great if we could collaborate on this, as I have the basic pieces implemented already.
- It contrasts some implementation-level details with the design proposed already (hence I am calling it design B), and we could take the best course of action.
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171024#comment-14171024 ]

Sandy Ryza commented on SPARK-3174:
-----------------------------------

bq. If I understand correctly, your concern with requesting executors in rounds is that we will end up making many requests if we have many executors, when we could instead just batch them?

My original claim was that a policy that could be aware of the number of tasks needed by a stage would be necessary to get enough executors quickly when going from idle to running a job. Your claim was that the exponential policy can be tuned to give similar enough behavior to the type of policy I'm describing. My concern was that, even if that is true, tuning the exponential policy in this way would be detrimental at other points of the application lifecycle: e.g., starting the next stage within a job could lead to over-allocation.
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171580#comment-14171580 ]

Marcelo Vanzin commented on SPARK-3174:
---------------------------------------

[~andrewor14] Still regarding the shuffle service: your plan is not to use the existing ShuffleService in Hadoop, but to still deploy Spark's shuffle service as a node manager aux service (in the case of Yarn), correct? Or is that API not generic enough for Spark's needs?
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169830#comment-14169830 ]

Andrew Or commented on SPARK-3174:
----------------------------------

[~vanzin]

bq. My first question I think is similar to Tom's. It was not clear to me how the app will behave when it starts up. I'd expect the first job to be the one that has to process the largest amount of data, so it would benefit from having as many executors as possible available as quickly as possible - something that seems to conflict with the idea of a slow start. Are you proposing a change to the current semantics, where Yarn will request --num-executors up front? If you keep that, I think that would cover my above concerns. But switching to a slow start with no option to pre-allocate a certain numbers seems like it might harm certain jobs.

I'm actually not proposing to change the application start-up behavior. Spark will continue to request up front however many executors it does today. The slow start comes in when you want to add executors after removing them. Also, you can control how often executors are added with a config, so if the application wants the behavior where it requests all executors at once, it can still do that.

bq. My second is about the shuffle service you're proposing. Have you investigated whether it would be possible to make Hadoop's shuffle service more generic, so that Spark can benefit from it? It does mean that this feature might be constrained to certain versions of Hadoop, but maybe that's not necessarily a bad thing if it means more infrastructure is shared.

I have indeed. The main difficulty in integrating Spark and Yarn cleanly there stems from the hard-coded shuffle file paths and the index shuffle file format. Currently, both are highly specific to MR, and although we can work around them by adapting Spark's shuffle behavior to MR's (non-trivial but certainly possible), we'll only be able to use the feature on Yarn. If we decide to extend this feature to standalone or Mesos mode, we'll have to do what we're doing right now anyway, since we can't rely on the Yarn ShuffleHandler there.
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169856#comment-14169856 ]

Andrew Or commented on SPARK-3174:
----------------------------------

[~sandyr]

bq. Consider the (common) case of a user keeping a Hive session open and setting a low number of minimum executors in order to not sit on cluster resources when idle. Goal number 1 should be making queries return as fast as possible. A policy that, upon receiving a job, simply requested executors with enough slots to handle all the tasks required by the first stage would be a vast latency and user experience improvement over the exponential increase policy. Given that resource managers like YARN will mediate fairness between users and that Spark will be able to give executors back, there's not much advantage to being conservative or ramping up slowly in this case. Accurately anticipating resource needs is difficult, but not necessary.

Yes, in this case we may want to get many executors back quickly, but this can be achieved even in the exponential-increase model, because we expose a config that regulates how often executors should be added. The slow start is actually not slow at all if we lower the interval between rounds of adding executors. I just think this model gives the application more control and flexibility than one where you get either all executors or very few of them.
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170033#comment-14170033 ]

Timothy St. Clair commented on SPARK-3174:
------------------------------------------

[~pwendell] are you talking about resizing? [~nnielsen] [~tnachen] ^ FYI.
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170244#comment-14170244 ]

Andrew Or commented on SPARK-3174:
----------------------------------

[~sandyr] By quick-start, do you mean adding all executors at once? If I understand correctly, your concern with requesting executors in rounds is that we will end up making many requests if we have many executors, when we could instead just batch them?
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162514#comment-14162514 ]

Marcelo Vanzin commented on SPARK-3174:
---------------------------------------

Hi Andrew, thanks for writing this up.

My first question, I think, is similar to Tom's. It was not clear to me how the app will behave when it starts up. I'd expect the first job to be the one that has to process the largest amount of data, so it would benefit from having as many executors as possible available as quickly as possible, something that seems to conflict with the idea of a slow start. Are you proposing a change to the current semantics, where Yarn requests --num-executors up front? If you keep that, I think that would cover my concerns above. But switching to a slow start with no option to pre-allocate a certain number seems like it might harm certain jobs.

My second is about the shuffle service you're proposing. Have you investigated whether it would be possible to make Hadoop's shuffle service more generic, so that Spark can benefit from it? It does mean that this feature might be constrained to certain versions of Hadoop, but maybe that's not necessarily a bad thing if it means more infrastructure is shared.
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162685#comment-14162685 ]

Sandy Ryza commented on SPARK-3174:
-----------------------------------

bq. Maybe it makes sense to just call it `spark.dynamicAllocation.*`

That sounds good to me.

bq. I think in general we should limit the number of things that will affect adding/removing executors. Otherwise an application might get/lose many executors all of a sudden without a good understanding of why. Also, anticipating what's needed in a future stage is usually fairly difficult, because you don't know a priori how long each stage will run. I don't see a good metric to decide how far into the future to anticipate.

Consider the (common) case of a user keeping a Hive session open and setting a low minimum number of executors in order not to sit on cluster resources when idle. Goal number one should be making queries return as fast as possible. A policy that, upon receiving a job, simply requested executors with enough slots to handle all the tasks required by the first stage would be a vast latency and user-experience improvement over the exponential-increase policy. Given that resource managers like YARN will mediate fairness between users, and that Spark will be able to give executors back, there's not much advantage to being conservative or ramping up slowly in this case. Accurately anticipating resource needs is difficult, but not necessary.
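The policy Sandy proposes is simple enough to sketch directly. This is an illustration, not Spark's implementation: on job submission, request enough executors so that every task of the first stage gets a slot at once. The function name and the `cores_per_executor` parameter are assumptions for the example.

```python
import math


def executors_for_stage(pending_tasks: int, cores_per_executor: int) -> int:
    """Executors needed so every pending task of the stage can run
    immediately, one task per core."""
    if cores_per_executor <= 0:
        raise ValueError("cores_per_executor must be positive")
    return math.ceil(pending_tasks / cores_per_executor)
```

For example, a 1000-task first stage with 4-core executors would trigger a single request for 250 executors, instead of ramping up exponentially from a small count - the latency win Sandy describes for interactive sessions.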
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160914#comment-14160914 ]

Andrew Or commented on SPARK-3174:
----------------------------------

Hi all, I have attached an updated design doc that details the various components of the proposed solution. It summarizes the considerations for each component and explains why we have chosen the approaches outlined in the doc. The re-organization of this issue corresponds to the main components of the design as follows:

(1) Heuristics for scaling executors (SPARK-3795)
(2) Mechanism for scaling executors (SPARK-3822)
(3) Graceful decommission of executors (SPARK-3796 / SPARK-3797)

Let me know if you have any questions or feedback.
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160937#comment-14160937 ]

Sandy Ryza commented on SPARK-3174:
-----------------------------------

Thanks for posting the detailed design, Andrew. A few comments.

I would expect properties under spark.executor.* to pertain to what goes on inside executors. This is really more of a driver/scheduler feature, so a different prefix might make more sense.

Because it's a large task and there's still significant value without it, I assume we'll hold off on implementing the graceful decommission until we're done with the first parts?

This is probably out of scope for the first cut, but in the future it might be useful to include addition/removal policies that use what Spark knows about upcoming stages to anticipate the number of executors needed. Can we structure the config property names in a way that will still make sense if we choose to add more advanced functionality like this?

When cluster resources are constrained, we may find ourselves in situations where YARN is unable to allocate the additional resources we requested before the next time interval. I haven't thought about it extremely deeply, but it seems like there may be some pathological situations in which we request an enormous number of additional executors while waiting. It might make sense to avoid increasing the number requested until we've actually received some.

Last, any thoughts on what reasonable intervals would be? For the add interval, I imagine we want it to be at least the amount of time required between invoking requestExecutors and being able to schedule tasks on the executors requested.
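The safeguard Sandy suggests against pathological request growth can be sketched as a small amount of bookkeeping. This is an illustrative sketch only; the class and field names are assumptions, and the doubling rule stands in for whatever increase policy is adopted.

```python
class ExecutorRequester:
    """Toy model of a driver-side allocator that refuses to grow its
    request while earlier requests are still outstanding."""

    def __init__(self) -> None:
        self.granted = 0    # executors actually delivered by the cluster manager
        self.requested = 0  # total executors asked for so far

    def maybe_request_more(self, pending_tasks: int) -> int:
        """Return how many additional executors to request this interval."""
        if pending_tasks == 0:
            return 0
        if self.requested > self.granted:
            # Previous requests not yet fulfilled: don't pile more on top.
            # This bounds the request count on a constrained cluster.
            return 0
        delta = max(1, self.requested)  # exponential-style increase (assumed)
        self.requested += delta
        return delta
```

With this guard, a backlogged application on a full cluster keeps a fixed outstanding request rather than asking for an ever-larger number of executors at every interval.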
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160939#comment-14160939 ]

Thomas Graves commented on SPARK-3174:
--------------------------------------

Good write-up, Andrew. Just to make sure: the things in blue with the star in the doc are the approaches you are proposing?

How well is the Spark scheduler going to handle the addition of/waiting for executors? Is SPARK-3795 going to address making that better? Meaning, if you start a job with a small number of executors it will be scheduled non-optimally and in many cases will cause failures - hence why we added the configs to wait for executors before starting.

Will the config(s) be allowed to be changed partway through an application? For instance, say I do some ETL work where I want dynamic allocation, but then I need to run an ML algorithm or do some heavy caching where I want to shut it off.

Is it also safe to assume this doesn't handle changing the resource requirements of the containers? I.e., the executors it starts and stops will all be the same size.
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160960#comment-14160960 ]

Sandy Ryza commented on SPARK-3174:
-----------------------------------

bq. for instance, lets say I do some ETL stuff where I want it to do dynamic, but then I need to run an ML algorithm or do some heavy caching where I want to shut it off.

IIUC, the proposal only gets rid of executors once they are idle and empty of cached data. If that's the case, do you still see issues with leaving dynamic allocation on during the ML algorithm phase?
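The removal condition Sandy describes - idle past a timeout and holding no cached data - can be sketched as a single predicate. This is an illustration of the heuristic under discussion, not Spark's actual code; the parameter names and the 60-second default are assumptions.

```python
def removable(idle_since: float, cached_blocks: int,
              now: float, idle_timeout_s: float = 60.0) -> bool:
    """An executor is a removal candidate only if it holds no cached
    blocks AND has been idle for at least the idle timeout."""
    return cached_blocks == 0 and (now - idle_since) >= idle_timeout_s
```

Under this rule, an executor caching RDD blocks for the ML phase is never reclaimed, which is why leaving dynamic allocation on during that phase may be harmless.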
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160988#comment-14160988 ]

Thomas Graves commented on SPARK-3174:
--------------------------------------

Perhaps I misread it then, because the proposal that is starred is in Graceful Decommission. I didn't see anything else mentioned about removing, but perhaps I missed it? If the graceful decommission is going to be handled later that's fine, but perhaps we should clarify that in the doc.

==
Do nothing, and recompute these blocks later if necessary. This is the simplest solution. Note that this may not be optimal if the application relies heavily on caching, however. A relevant question then is what to do if the user attempts to cache blocks. Spark should at least log a warning so the performance does not degrade silently. It may also be appropriate to throw an exception when caching is detected. If we do throw an exception, we could add an extra Spark property that only disables this exception if the user knows what s/he is doing.
==
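The warn-or-throw behavior quoted from the design doc can be sketched as follows. This is hypothetical illustration only: the escape-hatch property name `spark.dynamicAllocation.allowCaching` is invented for the example, mirroring the doc's suggestion of a flag that disables the exception for users who accept recomputation.

```python
import logging


def on_cache_attempt(conf: dict, log: logging.Logger) -> None:
    """Called when the user caches blocks while dynamic allocation is on.
    Always warns; raises unless the (hypothetical) opt-out flag is set."""
    log.warning("Caching with dynamic allocation enabled: blocks on "
                "removed executors may need to be recomputed.")
    if not conf.get("spark.dynamicAllocation.allowCaching", False):
        raise RuntimeError("Caching detected while dynamic allocation is on")
```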
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161048#comment-14161048 ]

Sandy Ryza commented on SPARK-3174:
-----------------------------------

Ah, misread. My opinion is that, for a first cut without the graceful decommission, we should go with option 2 and only remove executors that have no cached blocks.
[jira] [Commented] (SPARK-3174) Provide elastic scaling within a Spark application
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161378#comment-14161378 ]

Andrew Or commented on SPARK-3174:
----------------------------------

@[~tgraves] Replying inline:

bq. Just to make sure, the things in blue with the star in the doc are the approaches you are proposing?

Yes.

bq. So how well is the spark scheduler going to handle the addition of/waiting for executors? is SPARK-3795 going to address making that better? Meaning if you start a job with a small number of executors it will be scheduled non-optimally and in many cases will cause failures. Hence why we added the configs to wait for executors before starting.

Not sure I understand what you mean by this. I would assume that the Spark application starts with a reasonable number of executors. This scaling feature is mainly concerned with scaling down from the executors you started with, but because we may need them later we also need a way and a heuristic to add them back. I did not intend for this feature to be used, for instance, for bootstrapping from one executor in the beginning.

bq. Will the config(s) be allowed to be changed part of the way through an application? for instance, lets say I do some ETL stuff where I want it to do dynamic, but then I need to run an ML algorithm or do some heavy caching where I want to shut it off.

Yeah, especially for a REPL it would be good to change this across jobs. This proposal is a first-cut design and does not incorporate that. Though I think this is a more general issue: Spark configurations are intended to be set for the entire duration of the application, but many are somewhat specific to each job within that application. I think it is of interest to eventually be able to configure this dynamically, but I don't have a great idea of how to expose that at the moment.

bq. Is it also safe to assume this doesn't handle changing the resource requirements of the containers? Ie the executors it starts and stops will also be the same size.

Yes, we do not resize the executor's JVM or container (I don't think YARN supports this yet). Our unit of execution here is the fixed-size executor.
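For reference, the feature discussed in this thread shipped in Spark 1.2 behind a family of `spark.dynamicAllocation.*` properties, along with the external shuffle service. A minimal configuration sketch follows; property names are from the Spark configuration documentation, but value formats and defaults vary across Spark versions, so check the docs for your release.

```properties
# Enable dynamic executor allocation (requires the external shuffle service)
spark.dynamicAllocation.enabled            true
spark.shuffle.service.enabled              true

# Bounds on the executor count
spark.dynamicAllocation.minExecutors       2
spark.dynamicAllocation.maxExecutors       50

# Add executors after tasks have been backlogged this long (seconds in 1.2)
spark.dynamicAllocation.schedulerBacklogTimeout   5

# Remove executors after they have been idle this long (seconds in 1.2)
spark.dynamicAllocation.executorIdleTimeout       60
```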