[
https://issues.apache.org/jira/browse/TEZ-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17165048#comment-17165048
]
Mustafa Iman commented on TEZ-4206:
-----------------------------------
[~abstractdog] yes, silly me, I did not check whether there was an existing issue.
According to your comment here
https://issues.apache.org/jira/browse/TEZ-4119?focusedCommentId=17024431&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17024431
it is likely the same issue. I tried to explain this in a comment in the
patch. The issue comes from these two factors:
# We use a mock clock that advances 1 second at each tick.
# LegacySpeculator artificially lengthens the cool-off period if the evaluation
itself takes a long time. See (clock.getTime() - backgroundRunStartTime) at
[https://github.com/apache/tez/blob/2d7c60849adf3ed62f36f00e161c5d55962206f5/tez-dag/src/main/java/org/apache/tez/dag/app/dag/speculation/legacy/LegacySpeculator.java#L256]
If a mock clock tick (1 second) happens while computeSpeculations is in progress,
the speculator thinks computeSpeculations took 1 second to run. It therefore
waits 1 second before the second attempt. The problem is that the original task
in the test completes before the speculator gets a second chance to speculate
the task.
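To make the timing concrete, here is a minimal single-threaded sketch of the effect. It is not the Tez code: MockClock, nextWaitMillis and the 10 ms minimum are hypothetical, and the rule "wait for the larger of the configured minimum and the measured evaluation time" is my assumption about how the elapsed time feeds into the cool-off period. The tick is simulated inline rather than from a separate clock thread.

```java
final class SpeculatorTimingSketch {

  /** Hypothetical stand-in for the test's mock clock; not the Tez class. */
  static final class MockClock {
    private long nowMillis = 0L;

    long getTime() {
      return nowMillis;
    }

    void tick(long millis) {
      nowMillis += millis;
    }
  }

  /**
   * Assumed rule: wait at least minWaitMillis, or as long as the evaluation
   * appeared to take according to the clock, whichever is larger.
   */
  static long nextWaitMillis(MockClock clock, long minWaitMillis, Runnable evaluation) {
    long backgroundRunStartTime = clock.getTime();
    evaluation.run(); // stands in for computeSpeculations()
    long elapsed = clock.getTime() - backgroundRunStartTime;
    return Math.max(minWaitMillis, elapsed);
  }

  public static void main(String[] args) {
    MockClock clock = new MockClock();

    // No tick during the evaluation: only the minimum wait applies.
    long quiet = nextWaitMillis(clock, 10L, () -> { });

    // A 1 second tick lands during the evaluation: the wait jumps to 1 second.
    long raced = nextWaitMillis(clock, 10L, () -> clock.tick(1000L));

    System.out.println("wait without tick: " + quiet + " ms");
    System.out.println("wait with tick during evaluation: " + raced + " ms");
  }
}
```

Under these assumptions the second call returns 1000 ms even though the evaluation did no real work, which is the window in which the original task can finish first.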
There is no recent work on TEZ-4119. I think we can merge this and close
TEZ-4119 as a duplicate.
> TestSpeculation.testBasicSpeculationPerVertexConf is flaky
> ----------------------------------------------------------
>
> Key: TEZ-4206
> URL: https://issues.apache.org/jira/browse/TEZ-4206
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Mustafa Iman
> Assignee: Mustafa Iman
> Priority: Major
> Attachments: TEZ-4206.1.patch
>
>
> The test is flaky due to a timing issue in MockDAGAppMaster's clock and
> LegacySpeculator
> [https://builds.apache.org/job/PreCommit-TEZ-Build/491/]
> [https://builds.apache.org/job/PreCommit-TEZ-Build/492/]
> [https://builds.apache.org/job/PreCommit-TEZ-Build/493/]
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)