Peter Andrew created SPARK-57467:
------------------------------------
Summary: Executors are not reused for identical resource profiles
Key: SPARK-57467
URL: https://issues.apache.org/jira/browse/SPARK-57467
Project: Spark
Issue Type: Bug
Components: Scheduler
Affects Versions: 4.1.2
Reporter: Peter Andrew
Even when building identical resource profiles, executors are not reused –
e.g., running the following snippet twice will lead to new executors starting
up both times, which is unnecessary:
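{code:python}
from pyspark.resource import ExecutorResourceRequests, ResourceProfileBuilder, TaskResourceRequests
from pyspark.sql import functions as F

# Assumes an active SparkSession named `spark`.
profile_builder = ResourceProfileBuilder()
executor_requests = ExecutorResourceRequests().cores(12)
task_requests = TaskResourceRequests().cpus(2)
# In PySpark, `build` is a property, so it is accessed without parentheses.
profile = profile_builder.require(executor_requests).require(task_requests).build

def _fn(dfs):
    # Identity function: pass each pandas batch through unchanged.
    for df in dfs:
        yield df

df = spark.range(10).select(F.col("id").cast("string"))
# Positional arguments: func, schema, barrier=False, profile.
df.mapInPandas(_fn, df.schema, False, profile).show(n=10)
{code}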
ResourceProfileManager has a method 'getEquivalentProfile', but it is only
called in DAGScheduler.mergeResourceProfilesForStage when
stageResourceProfiles.size > 1.
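
To illustrate the behaviour being asked for, here is a minimal, Spark-free sketch of an equivalence lookup at registration time. It is a toy model, not Spark's implementation: the `Profile` and `ProfileRegistry` classes, the `reuse_equivalent` flag, and the id counter are all hypothetical names introduced for this example. It only shows why two structurally identical profiles end up with different ids (and therefore different executors) unless an equivalent registered profile is looked up first.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Optional


@dataclass(frozen=True)
class Profile:
    """Hypothetical stand-in for a resource profile (resources only, no id)."""
    executor_cores: int
    task_cpus: int


class ProfileRegistry:
    """Toy registry; an assumption for illustration, not Spark's ResourceProfileManager."""

    def __init__(self) -> None:
        self._ids = count(1)  # every newly registered profile gets a fresh id
        self._by_id: Dict[int, Profile] = {}

    def get_equivalent(self, profile: Profile) -> Optional[int]:
        # Return the id of the first registered profile with equal resources, if any.
        for pid, registered in self._by_id.items():
            if registered == profile:
                return pid
        return None

    def register(self, profile: Profile, reuse_equivalent: bool) -> int:
        # With reuse enabled, an equivalent profile's id is returned instead of a new one.
        if reuse_equivalent:
            existing = self.get_equivalent(profile)
            if existing is not None:
                return existing
        pid = next(self._ids)
        self._by_id[pid] = profile
        return pid


registry = ProfileRegistry()
first = registry.register(Profile(12, 2), reuse_equivalent=False)
second = registry.register(Profile(12, 2), reuse_equivalent=False)  # new id despite equal resources
third = registry.register(Profile(12, 2), reuse_equivalent=True)    # maps back to an existing id
```

In this model, `first` and `second` differ even though the requests are identical, which mirrors the reported symptom; `third` resolves to an already registered id, which is the reuse this issue proposes for the single-profile case as well.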