[jira] [Assigned] (SPARK-26513) Trigger GC on executor node idle
[ https://issues.apache.org/jira/browse/SPARK-26513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-26513:

    Assignee: Apache Spark

> Trigger GC on executor node idle
>
>                 Key: SPARK-26513
>                 URL: https://issues.apache.org/jira/browse/SPARK-26513
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.3.0
>            Reporter: Sandish Kumar HN
>            Assignee: Apache Spark
>            Priority: Major
>             Fix For: 3.0.0
>
> Correct me if I'm wrong.
> *Stage:*
> On a large cluster, each stage runs on a set of executors. A few
> executors finish their tasks first and then wait for the remaining tasks
> of the stage, which are running on other executor nodes in the cluster.
> A stage completes only when all of its tasks have finished, and the next
> stage cannot start until then.
>
> Why don't we trigger GC while an executor node is idle, waiting for the
> remaining tasks to finish? The executor has to wait anyway, typically for
> at least a couple of seconds, whereas a GC should take less than 300 ms
> at most.
>
> I have proposed a small code snippet that triggers GC when there are no
> running tasks and heap usage on the current executor node exceeds a given
> threshold.
> This could improve performance for long-running Spark jobs.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
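
The snippet mentioned in the issue is not included in this message. As a rough illustration only, the rule it describes (no running tasks and heap usage above a threshold) might be sketched as below. The class name `IdleGcTrigger`, its methods, and the `heapThreshold` parameter are all hypothetical and are not taken from Spark or from the proposed patch; only the `java.lang.management` calls and `System.gc()` are standard JDK APIs.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical helper, not part of Spark: requests a GC when an executor is idle. */
final class IdleGcTrigger {
    // Assumed parameter: fraction of the heap in use (0.0 to 1.0) above which a GC is requested.
    private final double heapThreshold;

    IdleGcTrigger(double heapThreshold) {
        if (heapThreshold < 0.0 || heapThreshold > 1.0) {
            throw new IllegalArgumentException("heapThreshold must be in [0, 1]");
        }
        this.heapThreshold = heapThreshold;
    }

    /** Pure decision rule: no running tasks and heap usage strictly above the threshold. */
    static boolean shouldCollect(int runningTasks, long usedBytes, long capacityBytes, double threshold) {
        if (runningTasks > 0 || capacityBytes <= 0) {
            return false;
        }
        return (double) usedBytes / capacityBytes > threshold;
    }

    /** Reads current heap usage and requests a GC if the rule holds; returns true if one was requested. */
    boolean maybeCollect(int runningTasks) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // getMax() returns -1 when the maximum is undefined, so fall back to committed memory.
        long capacity = heap.getMax() > 0 ? heap.getMax() : heap.getCommitted();
        if (!shouldCollect(runningTasks, heap.getUsed(), capacity, heapThreshold)) {
            return false;
        }
        // Only a request: the JVM may ignore it, for example under -XX:+DisableExplicitGC.
        System.gc();
        return true;
    }
}
```

An executor-side caller would presumably construct one instance, for example `new IdleGcTrigger(0.7)`, and call `maybeCollect(n)` with the current number of running tasks whenever a task finishes. Whether the resulting pause really stays under the 300 ms suggested in the issue depends on the collector and heap size, so that figure should be measured rather than assumed.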
[jira] [Assigned] (SPARK-26513) Trigger GC on executor node idle
[ https://issues.apache.org/jira/browse/SPARK-26513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-26513:

    Assignee:     (was: Apache Spark)

> Trigger GC on executor node idle
>
>                 Key: SPARK-26513
>                 URL: https://issues.apache.org/jira/browse/SPARK-26513
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.3.0
>            Reporter: Sandish Kumar HN
>            Priority: Major
>             Fix For: 3.0.0