[ https://issues.apache.org/jira/browse/SPARK-29905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun updated SPARK-29905:
----------------------------------
    Affects Version/s: 3.1.0 (was: 3.0.0)
> ExecutorPodsLifecycleManager has sub-optimal behavior with dynamic allocation
> -----------------------------------------------------------------------------
>
> Key: SPARK-29905
> URL: https://issues.apache.org/jira/browse/SPARK-29905
> Project: Spark
> Issue Type: Improvement
> Components: Kubernetes
> Affects Versions: 3.1.0
> Reporter: Marcelo Masiero Vanzin
> Priority: Minor
>
> I've been playing with dynamic allocation on k8s and noticed some weird
> behavior from ExecutorPodsLifecycleManager when it's enabled.
> This behavior is mostly caused by the higher rate of pod updates you see
> with dynamic allocation. Pods being created and going away all the time
> generate lots of events, which are then translated into "snapshots"
> internally in Spark and fed to subscribers such as
> ExecutorPodsLifecycleManager.
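> To make that pipeline concrete, here's a minimal, self-contained sketch of
> the watch-event -> snapshot -> subscriber flow. The names (SnapshotsStore,
> Snapshot, updatePod, notifySubscribers) are illustrative, not the actual
> Spark internals:
> {code:scala}
> import scala.collection.mutable
>
> // A snapshot is the full view of executor pod states at one point in time.
> case class Snapshot(podStates: Map[Long, String])
>
> class SnapshotsStore {
>   private val subscribers = mutable.Buffer[Seq[Snapshot] => Unit]()
>   private val pending = mutable.Buffer[Snapshot]()
>   private var current = Snapshot(Map.empty)
>
>   def addSubscriber(onNewSnapshots: Seq[Snapshot] => Unit): Unit =
>     subscribers += onNewSnapshots
>
>   // Every pod update produces a new incremental snapshot, so with dynamic
>   // allocation a busy cluster buffers many snapshots that mostly repeat
>   // the same per-pod state.
>   def updatePod(execId: Long, state: String): Unit = {
>     current = Snapshot(current.podStates + (execId -> state))
>     pending += current
>   }
>
>   // Periodically, the buffered snapshots are replayed to each subscriber,
>   // which therefore sees the same pod state over and over.
>   def notifySubscribers(): Unit = {
>     val batch = pending.toSeq
>     pending.clear()
>     subscribers.foreach(_(batch))
>   }
> }
> {code}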
> The first effect is a lot of spurious logging. Since snapshots are
> incremental, you can get many snapshots carrying the same "PodDeleted"
> information, for example, and ExecutorPodsLifecycleManager will log a
> message for each of them. Yes, the messages are at debug level, but if
> you're debugging that stuff, they're really noisy and distracting.
> The second effect is that, just as you get multiple log messages, you also
> end up calling into the Spark scheduler and, worse, into the K8S API
> server, multiple times for the same pod update. We can optimize that and
> reduce the chattiness with the API server.
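> A minimal sketch of one possible fix, assuming we simply remember which
> executor IDs have already been handled as deleted. The names below
> (handledDeletions, onNewSnapshot, removeExecutor) are illustrative, not
> the actual Spark code:
> {code:scala}
> import scala.collection.mutable
>
> sealed trait ExecutorPodState
> case object PodRunning extends ExecutorPodState
> case object PodDeleted extends ExecutorPodState
>
> object DedupSketch {
>   // Executor IDs whose deletion has already been processed.
>   private val handledDeletions = mutable.Set[Long]()
>
>   // Stand-in for the scheduler callback plus the K8S API server call.
>   private def removeExecutor(execId: Long): Unit =
>     println(s"removing executor $execId and cleaning up its pod")
>
>   def onNewSnapshot(snapshot: Map[Long, ExecutorPodState]): Unit =
>     snapshot.foreach {
>       // Set.add returns true only on first insertion, so only the first
>       // PodDeleted seen for an executor does any work (or logging);
>       // repeats in later snapshots are silently skipped.
>       case (execId, PodDeleted) if handledDeletions.add(execId) =>
>         removeExecutor(execId)
>       case _ => // Already handled, or not a terminal state.
>     }
>
>   def main(args: Array[String]): Unit = {
>     // Two incremental snapshots repeat the same PodDeleted info for
>     // executor 1, but removeExecutor runs exactly once.
>     onNewSnapshot(Map(1L -> PodDeleted, 2L -> PodRunning))
>     onNewSnapshot(Map(1L -> PodDeleted, 2L -> PodRunning))
>   }
> }
> {code}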