[
https://issues.apache.org/jira/browse/FLINK-29789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sopan Phaltankar updated FLINK-29789:
-------------------------------------
Description:
The test
org.apache.flink.runtime.checkpoint.CheckpointCoordinatorTest.testTriggerAndDeclineCheckpointComplex
is flaky and has the following failure:
Failures:
[ERROR] Failures:
[ERROR] CheckpointCoordinatorTest.testTriggerAndDeclineCheckpointComplex:1054
expected:<2> but was:<1>
I used the tool [NonDex|https://github.com/TestingResearchIllinois/NonDex] to
find this flaky test.
Command: mvn -pl flink-runtime edu.illinois:nondex-maven-plugun:1.1.2:nondex
-Dtest=org.apache.flink.runtime.checkpoint.CheckpointCoordinatorTest#testTriggerAndDeclineCheckpointComplex
I analyzed the assertion failure and found that checkpoint1Id and checkpoint2Id
are getting assigned by iterating over a HashMap.
As we know, iterator() returns elements in a random order
[(JavaDoc|https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html#entrySet--])
and this might cause test failures for some orders.
Therefore, to remove this non-determinism, we would change HashMap to
LinkedHashMap.
On further analysis, it was found that the Map is getting initialized on line
1894 of org.apache.flink.runtime.checkpoint.CheckpointCoordinator class.
After changing from HashMap to LinkedHashMap, the above test is passing without
any non-determinism.
was:
The test
org.apache.flink.runtime.checkpoint.CheckpointCoordinatorTest.testTriggerAndDeclineCheckpointComplex
is flaky and has the following failure:
Failures:
[ERROR] Failures:
[ERROR] CheckpointCoordinatorTest.testTriggerAndDeclineCheckpointComplex:1054
expected:<2> but was:<1>
I used the tool (NonDex|https://github.com/TestingResearchIllinois/NonDex) to
find this flaky test.
Command: mvn -pl flink-runtime edu.illinois:nondex-maven-plugun:1.1.2:nondex
-Dtest=org.apache.flink.runtime.checkpoint.CheckpointCoordinatorTest#testTriggerAndDeclineCheckpointComplex
I analyzed the assertion failure and found that checkpoint1Id and checkpoint2Id
are getting assigned by iterating over a HashMap.
As we know, iterator() returns elements in a random order
(JavaDoc|https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html#entrySet--)
and this might cause test failures for some orders.
Therefore, to remove this non-determinism, we would change HashMap to
LinkedHashMap.
On further analysis, it was found that the Map is getting initialized on line
1894 of org.apache.flink.runtime.checkpoint.CheckpointCoordinator class.
After changing from HashMap to LinkedHashMap, the above test is passing without
any non-determinism.
> Fix flaky tests in CheckpointCoordinatorTest
> --------------------------------------------
>
> Key: FLINK-29789
> URL: https://issues.apache.org/jira/browse/FLINK-29789
> Project: Flink
> Issue Type: Bug
> Reporter: Sopan Phaltankar
> Priority: Minor
> Labels: pull-request-available
>
> The test
> org.apache.flink.runtime.checkpoint.CheckpointCoordinatorTest.testTriggerAndDeclineCheckpointComplex
> is flaky and has the following failure:
> Failures:
> [ERROR] Failures:
> [ERROR]
> CheckpointCoordinatorTest.testTriggerAndDeclineCheckpointComplex:1054
> expected:<2> but was:<1>
> I used the tool [NonDex|https://github.com/TestingResearchIllinois/NonDex] to
> find this flaky test.
> Command: mvn -pl flink-runtime edu.illinois:nondex-maven-plugun:1.1.2:nondex
> -Dtest=org.apache.flink.runtime.checkpoint.CheckpointCoordinatorTest#testTriggerAndDeclineCheckpointComplex
> I analyzed the assertion failure and found that checkpoint1Id and
> checkpoint2Id are getting assigned by iterating over a HashMap.
> As we know, iterator() returns elements in a random order
> [(JavaDoc|https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html#entrySet--])
> and this might cause test failures for some orders.
> Therefore, to remove this non-determinism, we would change HashMap to
> LinkedHashMap.
> On further analysis, it was found that the Map is getting initialized on line
> 1894 of org.apache.flink.runtime.checkpoint.CheckpointCoordinator class.
> After changing from HashMap to LinkedHashMap, the above test is passing
> without any non-determinism.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)