[ https://issues.apache.org/jira/browse/SPARK-24815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16850235#comment-16850235 ]
Jungtaek Lim edited comment on SPARK-24815 at 5/28/19 11:22 PM:
----------------------------------------------------------------
I'm also interested in the design doc, as I'd like to see whether the following
is considered as well: there are some cases in which dynamic allocation could
hurt the performance of a streaming query. Executors running a structured
streaming query are stateful (and restoring that state may incur huge latency),
at least for stateful queries and for queries leveraging the Kafka source.
was (Author: kabhwan):
I'm also interested in the design doc, as I'd like to see whether the following
is considered as well: there are some cases in which dynamic allocation could
hurt the performance of a streaming query. Executors running a structured
streaming query are stateful, at least for stateful queries and for queries
leveraging the Kafka source.
> Structured Streaming should support dynamic allocation
> ------------------------------------------------------
>
> Key: SPARK-24815
> URL: https://issues.apache.org/jira/browse/SPARK-24815
> Project: Spark
> Issue Type: Improvement
> Components: Scheduler, Structured Streaming
> Affects Versions: 2.3.1
> Reporter: Karthik Palaniappan
> Priority: Minor
>
> Dynamic allocation is very useful for adding and removing containers to match
> the actual workload. On multi-tenant clusters, it ensures that a Spark job is
> taking no more resources than necessary. In cloud environments, it enables
> autoscaling.
> However, if you set spark.dynamicAllocation.enabled=true and run a structured
> streaming job, Core's dynamic allocation algorithm kicks in. It requests
> executors if the task backlog reaches a certain size, and removes executors if
> they stay idle for a certain period of time.
> This does not make sense for streaming jobs, as outlined in
> https://issues.apache.org/jira/browse/SPARK-12133, which introduced dynamic
> allocation for the old streaming API.
> First, Spark should print a warning if you run a structured streaming job
> when Core's dynamic allocation is enabled.
> Second, structured streaming should have support for dynamic allocation. It
> would be convenient if it were the same set of properties as Core's dynamic
> allocation, but I don't have a strong opinion on that.
> If somebody can give me pointers on how to add dynamic allocation support,
> I'd be happy to take a stab.
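
For readers unfamiliar with the heuristic described in the issue, the
backlog/idle rule can be sketched roughly as below. This is a toy model written
for illustration only, not Spark's actual implementation; the names Executor,
plan_allocation, backlog_threshold and idle_timeout_s are hypothetical and do
not correspond to any Spark API or configuration key.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Executor:
    """Hypothetical stand-in for a running executor."""
    executor_id: str
    idle_seconds: float  # time since this executor last ran a task


def plan_allocation(
    pending_tasks: int,
    executors: List[Executor],
    backlog_threshold: int = 1,    # assumed threshold, for illustration only
    idle_timeout_s: float = 60.0,  # assumed timeout, for illustration only
) -> Tuple[bool, List[str]]:
    """Toy version of the rule in the issue description.

    Returns (request_more, ids_to_remove):
      - request_more is True when the task backlog reaches the threshold;
      - ids_to_remove lists executors idle for at least the timeout.
    """
    request_more = pending_tasks >= backlog_threshold
    ids_to_remove = [
        e.executor_id for e in executors if e.idle_seconds >= idle_timeout_s
    ]
    return request_more, ids_to_remove


if __name__ == "__main__":
    # Between two micro-batches there is no backlog, and one executor has
    # been idle longer than the timeout, so the rule marks it for removal.
    execs = [Executor("exec-1", 90.0), Executor("exec-2", 5.0)]
    print(plan_allocation(pending_tasks=0, executors=execs))
```

Under this rule, an executor that sits idle between triggers looks removable
even though it may hold state for a stateful query, which is the concern raised
in the comment above.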