[
https://issues.apache.org/jira/browse/FLINK-29789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640352#comment-17640352
]
Sopan Phaltankar commented on FLINK-29789:
------------------------------------------
[~martijnvisser] any thoughts?
> Fix flaky tests in CheckpointCoordinatorTest
> --------------------------------------------
>
> Key: FLINK-29789
> URL: https://issues.apache.org/jira/browse/FLINK-29789
> Project: Flink
> Issue Type: Bug
> Reporter: Sopan Phaltankar
> Priority: Minor
> Labels: pull-request-available
>
> The test
> org.apache.flink.runtime.checkpoint.CheckpointCoordinatorTest.testTriggerAndDeclineCheckpointComplex
> is flaky and has the following failure:
> Failures:
> [ERROR] Failures:
> [ERROR] CheckpointCoordinatorTest.testTriggerAndDeclineCheckpointComplex:1054 expected:<2> but was:<1>
> I used the tool [NonDex|https://github.com/TestingResearchIllinois/NonDex] to
> find this flaky test.
> Command: mvn -pl flink-runtime edu.illinois:nondex-maven-plugin:1.1.2:nondex
> -Dtest=org.apache.flink.runtime.checkpoint.CheckpointCoordinatorTest#testTriggerAndDeclineCheckpointComplex
> I analyzed the assertion failure and found that checkpoint1Id and
> checkpoint2Id are assigned by iterating over a HashMap.
> HashMap makes no guarantee about its iteration order
> ([JavaDoc|https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html#entrySet--]),
> so the test can fail whenever the two entries come back in the other order.
> Therefore, to remove this non-determinism, we change the HashMap to a
> LinkedHashMap, which iterates in insertion order.
> On further analysis, the Map turned out to be initialized on line
> 1894 of the org.apache.flink.runtime.checkpoint.CheckpointCoordinator class.
> After changing it from HashMap to LinkedHashMap, the above test passes
> without any non-determinism.
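
The ordering argument above can be illustrated with a minimal, self-contained sketch. It does not use any Flink code: the class IterationOrderSketch, the PendingStub type, and the helper methods are hypothetical names introduced only for this example, and the ids are arbitrary. It assumes only the documented java.util contracts: LinkedHashMap iterates in insertion order, while HashMap leaves the order unspecified.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IterationOrderSketch {

    // Hypothetical stand-in for a pending checkpoint; not a Flink class.
    static final class PendingStub {
        final long id;

        PendingStub(long id) {
            this.id = id;
        }
    }

    // Collects the ids in whatever order the map's iterator yields them,
    // similar to a test that assigns "first" and "second" ids by iterating.
    static List<Long> idsInIterationOrder(Map<Long, PendingStub> pending) {
        List<Long> ids = new ArrayList<>();
        for (PendingStub p : pending.values()) {
            ids.add(p.id);
        }
        return ids;
    }

    // Inserts one stub per id, in the given order, and returns the same map.
    static Map<Long, PendingStub> fill(Map<Long, PendingStub> map, long... ids) {
        for (long id : ids) {
            map.put(id, new PendingStub(id));
        }
        return map;
    }

    public static void main(String[] args) {
        // LinkedHashMap: iteration follows insertion order by contract.
        System.out.println(
                idsInIterationOrder(fill(new LinkedHashMap<>(), 42L, 7L, 19L))); // prints [42, 7, 19]

        // HashMap: the order is unspecified, so code must not rely on it.
        // NonDex deliberately varies such unspecified orders to expose this.
        System.out.println(
                idsInIterationOrder(fill(new HashMap<>(), 42L, 7L, 19L)));
    }
}
```

A test that reads the first and second element from the HashMap variant encodes an assumption the JDK does not promise, which is the kind of dependency NonDex is designed to surface.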
--
This message was sent by Atlassian Jira
(v8.20.10#820010)