[ https://issues.apache.org/jira/browse/SPARK-21695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-21695:
---------------------------------
Labels: bulk-closed (was: )
> Spark scheduler locality algorithm can take longer than expected
> ----------------------------------------------------------------
>
> Key: SPARK-21695
> URL: https://issues.apache.org/jira/browse/SPARK-21695
> Project: Spark
> Issue Type: Bug
> Components: Scheduler
> Affects Versions: 2.1.0
> Reporter: Thomas Graves
> Priority: Major
> Labels: bulk-closed
>
> Reference JIRA: https://issues.apache.org/jira/browse/SPARK-21656
> I'm seeing an issue with some jobs where the scheduler takes a long time to
> schedule tasks on executors. The default locality wait is 3 seconds, so I
> was expecting an executor to get some task within at most 9 seconds
> (node local, rack local, any), but it's taking far longer than that. In
> the case of SPARK-21656 it takes 60+ seconds and the executors hit the idle
> timeout. We should investigate why and see if we can fix this.
> On an initial look, it seems the scheduler resets the locality
> lastLaunchTime whenever it places any task on a node at that locality level.
> This appears to mean that it can take much longer than 3 seconds for any
> particular task to fall back, but this needs to be verified.
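
To make the suspected behaviour concrete, here is a minimal Python sketch of a
locality timer. It is not Spark's scheduler code. The function name
time_until_any, the three 3-second waits (chosen to match the 9-second
expectation in the report), and the launch pattern in the usage lines are all
assumptions made for illustration, and the model assumes every launch lands at
the most local level. The sketch compares a timer that is reset on every launch
with one that is not.

```python
def time_until_any(launch_times, reset_on_launch, waits=(3.0, 3.0, 3.0)):
    """Return the time (in seconds) at which the scheduler falls back to 'any'.

    launch_times: times at which some task is launched at the most local level
                  (hypothetical input, not data from the issue).
    reset_on_launch: if True, every such launch restarts the locality timer,
                  which is the behaviour the report suspects.
    waits: one wait per locality step before 'any' (assumed 3 x 3 seconds).
    """
    full_fallback = sum(waits)  # time needed with no launches in between
    timer_start = 0.0
    for t in sorted(launch_times):
        if t >= timer_start + full_fallback:
            break  # fallback to 'any' already happened before this launch
        if reset_on_launch:
            timer_start = t  # the launch restarts the whole wait
    return timer_start + full_fallback


# Hypothetical workload: one local launch every 2 seconds for a minute.
launches = list(range(2, 61, 2))
print(time_until_any(launches, reset_on_launch=False))  # 9.0
print(time_until_any(launches, reset_on_launch=True))   # 69.0
```

Under these assumptions, a steady trickle of local launches keeps pushing the
fallback out, so an executor with no local work can sit idle well past the
9 seconds expected from the configured waits. Whether the real scheduler
behaves this way still needs to be verified, as noted above.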