[jira] [Commented] (SPARK-21656) spark dynamic allocation should not idle timeout executors when tasks still to run

Thomas Graves (JIRA) Mon, 07 Aug 2017 13:27:21 -0700

    [ 
https://issues.apache.org/jira/browse/SPARK-21656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117204#comment-16117204
 ]


Thomas Graves commented on SPARK-21656:
---------------------------------------

Another option would be to add logic so that Spark checks at some point 
whether it should try to reacquire some executors. All of that, though, seems 
like more logic than just not letting them go. To me, Spark needs to be more 
resilient about this and should handle various possible conditions. Users 
shouldn't have to tune every single job to account for weird things happening. 
Note that if dynamic allocation is off, this doesn't happen. So why is the 
user getting a worse experience in this case?
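
For reference, the kind of per-job tuning referred to above might look like 
the sketch below. The configuration keys are real Spark settings, but the 
values and the helper name to_submit_args are illustrative assumptions, not 
recommendations from this thread.

```python
# Hypothetical per-job overrides. The keys are real Spark settings;
# the values are illustrative assumptions only.
PER_JOB_OVERRIDES = {
    "spark.dynamicAllocation.enabled": "true",
    # Keep idle executors around longer than the 60s default.
    "spark.dynamicAllocation.executorIdleTimeout": "300s",
    # Give up on node locality sooner so idle executors receive tasks.
    "spark.locality.wait": "1s",
}


def to_submit_args(conf):
    """Render a settings dict as a list of spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(PER_JOB_OVERRIDES)))
```

Having to carry overrides like these on every affected job is the burden the 
comment argues users should not have to bear.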

> spark dynamic allocation should not idle timeout executors when tasks still 
> to run
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-21656
>                 URL: https://issues.apache.org/jira/browse/SPARK-21656
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.1.1
>            Reporter: Jong Yoon Lee
>             Fix For: 2.1.1
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Right now Spark lets go of executors when they have been idle for 60s (or a 
> configurable time). I have seen Spark let them go when they were idle but 
> were really needed. I have seen this issue when the scheduler was waiting to 
> get node locality, but that takes longer than the default idle timeout. In 
> these jobs the number of executors drops very low (fewer than 10) while 
> there are still about 80,000 tasks to run.
> We should consider not allowing executors to idle timeout if they are still 
> needed according to the number of tasks to be run.
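
The check proposed in the description can be sketched roughly as follows. 
This is a minimal illustration in plain Python, not Spark's actual 
implementation; the function names, the parameters, and the figure of 4 task 
slots per executor are all assumptions made for the example.

```python
import math


def executors_needed(pending_tasks, running_tasks, tasks_per_executor):
    """Executors required to run all outstanding tasks at full parallelism."""
    return math.ceil((pending_tasks + running_tasks) / tasks_per_executor)


def may_idle_timeout(idle_seconds, idle_timeout_seconds, current_executors,
                     pending_tasks, running_tasks, tasks_per_executor):
    """Return True only if the executor has passed the idle timeout AND
    removing it would still leave enough executors for outstanding work."""
    if idle_seconds < idle_timeout_seconds:
        return False
    needed = executors_needed(pending_tasks, running_tasks, tasks_per_executor)
    return current_executors - 1 >= needed


if __name__ == "__main__":
    # Scenario from the description: 10 executors left, 80,000 tasks pending.
    # The 4 task slots per executor is an assumed value.
    print(may_idle_timeout(61, 60, 10, 80000, 0, 4))  # False
```

With this rule, an executor that is idle only because the scheduler is 
waiting on locality is kept as long as the backlog still justifies it.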



