[
https://issues.apache.org/jira/browse/SPARK-35399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris updated SPARK-35399:
--------------------------
Description:
[Graceful Decommission of
Executors|https://spark.apache.org/docs/3.1.1/job-scheduling.html#graceful-decommission-of-executors]
section states that:
{quote}a Spark executor exits either on failure or when the associated
application has also exited. In both scenarios, all state associated with the
executor is no longer needed and can be safely discarded.
{quote}
However, when a flaky application has tasks that occasionally cause an executor
to fail with an OOM, the executor's shuffle state _is_ still needed: if it is
lost, Spark must trigger the stage failure mechanism and rerun tasks to
regenerate the missing shuffle blocks. The external Shuffle Service is therefore
valuable in this failure scenario as well, not only for dynamic resource
allocation.
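For reference, a minimal sketch of the standard configuration keys that enable the external shuffle service, so that shuffle state is served from outside the executor and survives an executor OOM (exact deployment steps vary by cluster manager):

```properties
# Serve shuffle blocks from the external shuffle service rather than
# the executor itself, so they survive executor failure or exit.
spark.shuffle.service.enabled=true
# Port the external shuffle service listens on (7337 is the default).
spark.shuffle.service.port=7337
```

On YARN the shuffle service must additionally be registered as a NodeManager auxiliary service ({{spark_shuffle}}); see the Spark job-scheduling documentation linked above for the per-cluster-manager setup.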
> State is still needed in the event of executor failure
> ------------------------------------------------------
>
> Key: SPARK-35399
> URL: https://issues.apache.org/jira/browse/SPARK-35399
> Project: Spark
> Issue Type: Documentation
> Components: Documentation
> Affects Versions: 3.1.1
> Reporter: Chris
> Priority: Minor
> Labels: newbie, pull-request-available
> Original Estimate: 2h
> Remaining Estimate: 2h
--
This message was sent by Atlassian Jira
(v8.3.4#803005)