[ 
https://issues.apache.org/jira/browse/FLINK-29789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640352#comment-17640352
 ] 

Sopan Phaltankar commented on FLINK-29789:
------------------------------------------

[~martijnvisser] any thoughts?

> Fix flaky tests in CheckpointCoordinatorTest
> --------------------------------------------
>
>                 Key: FLINK-29789
>                 URL: https://issues.apache.org/jira/browse/FLINK-29789
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Sopan Phaltankar
>            Priority: Minor
>              Labels: pull-request-available
>
> The test 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinatorTest.testTriggerAndDeclineCheckpointComplex
>  is flaky and has the following failure:
> Failures:
> [ERROR] Failures:
> [ERROR]   
> CheckpointCoordinatorTest.testTriggerAndDeclineCheckpointComplex:1054 
> expected:<2> but was:<1>
> I used the tool [NonDex|https://github.com/TestingResearchIllinois/NonDex] to 
> find this flaky test.
> Command: mvn -pl flink-runtime edu.illinois:nondex-maven-plugun:1.1.2:nondex 
> -Dtest=org.apache.flink.runtime.checkpoint.CheckpointCoordinatorTest#testTriggerAndDeclineCheckpointComplex
> I analyzed the assertion failure and found that checkpoint1Id and 
> checkpoint2Id are getting assigned by iterating over a HashMap.
> As we know, iterator() returns elements in a random order 
> [(JavaDoc|https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html#entrySet--])
>  and this might cause test failures for some orders.
> Therefore, to remove this non-determinism, we would change HashMap to 
> LinkedHashMap.
> On further analysis, it was found that the Map is getting initialized on line 
> 1894 of org.apache.flink.runtime.checkpoint.CheckpointCoordinator class.
> After changing from HashMap to LinkedHashMap, the above test is passing 
> without any non-determinism.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to