[jira] [Assigned] (SPARK-26513) Trigger GC on executor node idle

2018-12-31 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-26513:


Assignee: Apache Spark

> Trigger GC on executor node idle
> 
>
> Key: SPARK-26513
> URL: https://issues.apache.org/jira/browse/SPARK-26513
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.3.0
>Reporter: Sandish Kumar HN
>Assignee: Apache Spark
>Priority: Major
> Fix For: 3.0.0
>
>
> Correct me if I'm wrong.
> *Stage:*
>       On a large cluster, each stage would have some executors. were a few 
> executors would finish a couple of tasks first and wait for whole stage or 
> remaining tasks to finish which are executed by different executors nodes in 
> a cluster. a stage will only be completed when all tasks in a current stage 
> finish its execution. and the next stage execution has to wait till all tasks 
> of the current stage are completed. 
>  
> why don't we trigger GC, when the executor node is waiting for remaining 
> tasks to finish, or executor Idle? anyways executor has to wait for the 
> remaining tasks to finish which can at least take a couple of seconds. why 
> don't we trigger GC? which will max take <300ms
>  
> I have proposed a small code snippet which triggers GC when running tasks are 
> empty and heap usage in current executor node is more than the given 
> threshold.
> This could improve performance for long-running spark job's. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-26513) Trigger GC on executor node idle

2018-12-31 Thread Apache Spark (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-26513:


Assignee: (was: Apache Spark)

> Trigger GC on executor node idle
> 
>
> Key: SPARK-26513
> URL: https://issues.apache.org/jira/browse/SPARK-26513
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.3.0
>Reporter: Sandish Kumar HN
>Priority: Major
> Fix For: 3.0.0
>
>
> Correct me if I'm wrong.
> *Stage:*
>       On a large cluster, each stage would have some executors. were a few 
> executors would finish a couple of tasks first and wait for whole stage or 
> remaining tasks to finish which are executed by different executors nodes in 
> a cluster. a stage will only be completed when all tasks in a current stage 
> finish its execution. and the next stage execution has to wait till all tasks 
> of the current stage are completed. 
>  
> why don't we trigger GC, when the executor node is waiting for remaining 
> tasks to finish, or executor Idle? anyways executor has to wait for the 
> remaining tasks to finish which can at least take a couple of seconds. why 
> don't we trigger GC? which will max take <300ms
>  
> I have proposed a small code snippet which triggers GC when running tasks are 
> empty and heap usage in current executor node is more than the given 
> threshold.
> This could improve performance for long-running spark job's. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org