[
https://issues.apache.org/jira/browse/SPARK-24815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763940#comment-17763940
]
Krystal Mitchell edited comment on SPARK-24815 at 2/5/24 3:32 PM:
--
Thank you [~pavan0831]. This draft PR will have a significant impact on some of
the projects we are currently working on. Can't wait to see it over the line.
was (Author: JIRAUSER302183):
Thank you [~pavan0831]. This draft PR will impact some of the projects we are
currently working on.
> Structured Streaming should support dynamic allocation
> --
>
> Key: SPARK-24815
> URL: https://issues.apache.org/jira/browse/SPARK-24815
> Project: Spark
> Issue Type: Improvement
> Components: Scheduler, Spark Core, Structured Streaming
> Affects Versions: 2.3.1
> Reporter: Karthik Palaniappan
> Priority: Minor
> Labels: pull-request-available
>
> For batch jobs, dynamic allocation is very useful for adding and removing
> containers to match the actual workload. On multi-tenant clusters, it ensures
> that a Spark job is taking no more resources than necessary. In cloud
> environments, it enables autoscaling.
> However, if you set spark.dynamicAllocation.enabled=true and run a structured
> streaming job, the batch dynamic allocation algorithm kicks in. It requests
> more executors if the task backlog is a certain size, and removes executors
> if they idle for a certain period of time.
> Quick thoughts:
> 1) Dynamic allocation should be pluggable, rather than hardcoded to a
> particular implementation in SparkContext.scala (this should be a separate
> JIRA).
> 2) We should make a structured streaming algorithm that's separate from the
> batch algorithm. Eventually, continuous processing might need its own
> algorithm.
> 3) Spark should print a warning if you run a structured streaming job when
> Core's dynamic allocation is enabled.
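
For illustration only, here is a rough Python sketch of quick thoughts 1) to 3) from the description quoted above. Nothing in it is Spark's API: `Snapshot`, `Decision`, `AllocationPolicy`, `BacklogIdlePolicy`, `streaming_warning`, and the threshold values are all hypothetical. The policy is a toy version of the batch heuristic as the description words it (request executors when tasks are backlogged, remove executors that have idled past a timeout), placed behind an interface so that a streaming-specific policy could be swapped in.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol, Tuple


@dataclass(frozen=True)
class Snapshot:
    """What a policy gets to look at (hypothetical fields)."""
    pending_tasks: int              # tasks waiting for an executor slot
    idle_seconds: Dict[str, float]  # executor id -> seconds spent idle


@dataclass(frozen=True)
class Decision:
    executors_to_add: int
    executors_to_remove: Tuple[str, ...]


class AllocationPolicy(Protocol):
    """Thought 1: allocation as a pluggable interface, not a hardcoded class."""
    def decide(self, snapshot: Snapshot) -> Decision: ...


@dataclass
class BacklogIdlePolicy:
    """Toy batch heuristic: scale up on backlog, scale down on idleness."""
    tasks_per_executor: int = 4   # assumed slots per executor
    idle_timeout_s: float = 60.0  # assumed idle threshold

    def decide(self, snapshot: Snapshot) -> Decision:
        # Request enough executors to cover the backlog (ceiling division).
        to_add = -(-snapshot.pending_tasks // self.tasks_per_executor)
        # Release every executor that has idled for at least the timeout.
        to_remove = tuple(sorted(
            executor_id
            for executor_id, idle in snapshot.idle_seconds.items()
            if idle >= self.idle_timeout_s
        ))
        return Decision(to_add, to_remove)


def streaming_warning(dynamic_allocation_enabled: bool,
                      is_streaming: bool) -> Optional[str]:
    """Thought 3: warn when a streaming job runs under the batch policy."""
    if dynamic_allocation_enabled and is_streaming:
        return ("Dynamic allocation is enabled, but its batch heuristic "
                "is not designed for structured streaming jobs.")
    return None
```

With these assumed defaults, a snapshot holding 9 pending tasks and executors `a` (idle 10 s) and `b` (idle 75 s) yields a decision to add 3 executors and remove `b`. A streaming-specific policy (thought 2) would implement the same `decide` method with different inputs, which is the point of keeping the interface separate from any one heuristic.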