[
https://issues.apache.org/jira/browse/SPARK-25400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen reassigned SPARK-25400:
---------------------------------
Assignee: Imran Rashid
> Increase timeouts in schedulerIntegrationSuite
> ----------------------------------------------
>
> Key: SPARK-25400
> URL: https://issues.apache.org/jira/browse/SPARK-25400
> Project: Spark
> Issue Type: Improvement
> Components: Scheduler
> Affects Versions: 2.4.0
> Reporter: Imran Rashid
> Assignee: Imran Rashid
> Priority: Major
> Fix For: 2.3.2, 2.4.0
>
>
> I just took a look at a flaky failure in {{SchedulerIntegrationSuite}}:
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95887
> It seems the timeout really is too short:
> {noformat}
> 18/09/10 11:14:07.821 mock backend thread INFO TaskSetManager: Starting task
> 5.0 in stage 1.0 (TID 8, localhost, executor driver, partition 5,
> PROCESS_LOCAL, 7677 bytes)
> 18/09/10 11:14:07.821 task-result-getter-2 INFO TaskSetManager: Finished task
> 3.0 in stage 1.0 (TID 6) in 1 ms on localhost (executor driver) (4/10)
> 18/09/10 11:14:07.821 task-result-getter-0 INFO TaskSetManager: Finished task
> 4.0 in stage 1.0 (TID 7) in 1 ms on localhost (executor driver) (5/10)
> 18/09/10 11:14:07.821 mock backend thread INFO TaskSetManager: Starting task
> 6.0 in stage 1.0 (TID 9, localhost, executor driver, partition 6,
> PROCESS_LOCAL, 7677 bytes)
> 18/09/10 11:14:07.821 task-result-getter-1 INFO TaskSetManager: Finished task
> 5.0 in stage 1.0 (TID 8) in 0 ms on localhost (executor driver) (6/10)
> 18/09/10 11:14:09.481 mock backend thread INFO TaskSetManager: Starting task
> 7.0 in stage 1.0 (TID 10, localhost, executor driver, partition 7,
> PROCESS_LOCAL, 7677 bytes)
> 18/09/10 11:14:09.482 dispatcher-event-loop-14 INFO BlockManagerInfo: Removed
> broadcast_0_piece0 on amp-jenkins-worker-05.amp:36913 in memory (size: 1260.0
> B, free: 1638.6 MB)
> {noformat}
> You'll see that the "mock backend thread" does keep making progress, but for
> whatever reason there is a delay of over one second in the middle (between
> 11:14:07.821 and 11:14:09.481). That's already going over the existing
> timeouts.
> It's possible there is something else going on here, but for now just
> increasing the timeouts seems like the best next step.
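For reference, the size of the stall can be checked directly from the two consecutive "mock backend thread" timestamps in the log excerpt. This is a minimal sketch in Python, assuming the log timestamps follow a yy/MM/dd HH:mm:ss.SSS layout. The 1-second threshold is a placeholder for illustration only, not the suite's actual timeout value.

```python
from datetime import datetime

# Timestamps copied from the two consecutive "mock backend thread" lines
# in the excerpt (task 6.0 and task 7.0 in stage 1.0).
# Assumption: the log layout is yy/MM/dd HH:mm:ss.SSS.
LOG_TS_FORMAT = "%y/%m/%d %H:%M:%S.%f"
before = datetime.strptime("18/09/10 11:14:07.821", LOG_TS_FORMAT)
after = datetime.strptime("18/09/10 11:14:09.481", LOG_TS_FORMAT)

gap_seconds = (after - before).total_seconds()

# Hypothetical threshold, for illustration only; the real timeout used by
# the suite is not stated in this message.
ASSUMED_TIMEOUT_SECONDS = 1.0

print(f"gap between task starts: {gap_seconds:.3f} s")
print(f"exceeds assumed timeout: {gap_seconds > ASSUMED_TIMEOUT_SECONDS}")
```

A stall of this size during a single task launch would be enough to trip any wait on the order of one second, which is consistent with the flaky failure described above.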
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)