Peter Andrew created SPARK-57467:
------------------------------------

             Summary: Executors are not reused for identical resource profiles
                 Key: SPARK-57467
                 URL: https://issues.apache.org/jira/browse/SPARK-57467
             Project: Spark
          Issue Type: Bug
          Components: Scheduler
    Affects Versions: 4.1.2
            Reporter: Peter Andrew


 

Executors are not reused even when the resource profiles are identical. For 
example, running the following snippet twice starts new executors both times, 
which is unnecessary:
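{code:python}
from pyspark.resource import ExecutorResourceRequests, ResourceProfileBuilder, TaskResourceRequests
from pyspark.sql import functions as F

# Build a profile: 12 cores per executor, 2 CPUs per task.
profile_builder = ResourceProfileBuilder()
executor_requests = ExecutorResourceRequests().cores(12)
task_requests = TaskResourceRequests().cpus(2)
profile = profile_builder.require(executor_requests).require(task_requests).build

# Identity function for mapInPandas: yields each batch unchanged.
def _fn(dfs):
    for df in dfs:
        yield df

# Assumes an active SparkSession named `spark`.
df = spark.range(10).select(F.col("id").cast("string"))
df.mapInPandas(_fn, df.schema, barrier=False, profile=profile).show(n=10)
{code}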

ResourceProfileManager has a method 'getEquivalentProfile', but it is only 
called in DAGScheduler.mergeResourceProfilesForStage when 
stageResourceProfiles.size > 1.
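
To make the expected behaviour concrete, here is a minimal toy model in plain Python. It is not Spark's implementation: Profile, ProfileRegistry, get_equivalent and register are hypothetical names invented for this sketch, and it assumes (as the report implies) that executors are tied to a profile's id. It only shows how an equivalence lookup before registration would let a second, identical profile map to the existing id instead of a new one.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Optional


@dataclass(frozen=True)
class Profile:
    """Hypothetical stand-in for a resource profile (requested resources only)."""
    executor_cores: int
    task_cpus: int


class ProfileRegistry:
    """Toy registry; NOT Spark's ResourceProfileManager."""

    def __init__(self) -> None:
        self._ids = count(1)
        self._by_id: Dict[int, Profile] = {}

    def get_equivalent(self, profile: Profile) -> Optional[int]:
        # Return the id of an already registered profile with equal resources.
        for pid, known in self._by_id.items():
            if known == profile:
                return pid
        return None

    def register(self, profile: Profile, reuse_equivalent: bool) -> int:
        # With reuse enabled, an identical profile resolves to the existing id.
        if reuse_equivalent:
            existing = self.get_equivalent(profile)
            if existing is not None:
                return existing
        # Otherwise every registration gets a fresh id, even for equal profiles.
        pid = next(self._ids)
        self._by_id[pid] = profile
        return pid
```

With reuse_equivalent=False, registering Profile(12, 2) twice yields two different ids, which mirrors the behaviour reported here; with reuse_equivalent=True, both registrations return the same id.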


