This is an automated email from the ASF dual-hosted git repository.
zuston pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-uniffle.git
The following commit(s) were added to refs/heads/master by this push:
new b724a31fd [#1219] fix(test): Fix the flaky test `WriteAndReadMetricsTest` (#1235)
b724a31fd is described below
commit b724a31fd1acbcdf3b0b9077dab00c4bd8716201
Author: summaryzb <[email protected]>
AuthorDate: Fri Oct 13 23:15:21 2023 -0500
[#1219] fix(test): Fix the flaky test `WriteAndReadMetricsTest` (#1235)
### What changes were proposed in this pull request?
Add a short wait before verifying the result.
### Why are the changes needed?
The failure usually occurs in the Spark 2.3 integration test:
```
Error: Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 31.317 s <<< FAILURE! - in org.apache.uniffle.test.WriteAndReadMetricsTest
Error: test Time elapsed: 28.47 s <<< FAILURE!
org.opentest4j.AssertionFailedError: expected: <55> but was: <54>
    at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55)
    at org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62)
    at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:182)
    at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:177)
    at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1141)
    at org.apache.uniffle.test.SparkIntegrationTestBase.verifyTestResult(SparkIntegrationTestBase.java:130)
    at org.apache.uniffle.test.SparkIntegrationTestBase.run(SparkIntegrationTestBase.java:67)
    at org.apache.uniffle.test.WriteAndReadMetricsTest.test(WriteAndReadMetricsTest.java:40)
```
Inspired by [SPARK-24415](https://issues.apache.org/jira/browse/SPARK-24415).
This might be an event-ordering problem: taskEndEvent triggers the
metric updates, while stageCompletion triggers the stageData updates,
so the stageData may be read before the last task's metrics have been applied.
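A fixed sleep narrows this kind of race but cannot rule it out on a slow machine. As an illustration only (this is not what the commit does), one alternative is to poll the value until it matches or a timeout expires. The sketch below is self-contained and does not touch Spark: `MetricsWaitSketch` and `awaitValue` are hypothetical names, and the `AtomicLong` merely stands in for a metric that a listener thread updates late.

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

public class MetricsWaitSketch {

  /**
   * Hypothetical helper: polls {@code actual} every {@code pollMs} milliseconds until it
   * equals {@code expected}, or throws once {@code timeoutMs} milliseconds have elapsed.
   */
  public static long awaitValue(LongSupplier actual, long expected, long timeoutMs, long pollMs)
      throws InterruptedException, TimeoutException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      long value = actual.getAsLong();
      if (value == expected) {
        return value;
      }
      // Compare via subtraction so the check stays correct if nanoTime wraps around.
      if (System.nanoTime() - deadline >= 0) {
        throw new TimeoutException("expected " + expected + " but last saw " + value);
      }
      Thread.sleep(pollMs);
    }
  }

  public static void main(String[] args) throws Exception {
    // Stand-in for a metric that lags behind: starts at 54, becomes 55 a little later.
    AtomicLong records = new AtomicLong(54);
    Thread listener = new Thread(() -> {
      try {
        Thread.sleep(50);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      records.set(55);
    });
    listener.start();

    long seen = awaitValue(records::get, 55, 5_000, 10);
    listener.join();
    System.out.println("records = " + seen);
  }
}
```

The trade-off is a slightly longer test body in exchange for returning as soon as the value is correct and failing with a clear message when it never is.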
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Ran the test locally in a loop 100 times without a failure.
---
.../src/test/java/org/apache/uniffle/test/WriteAndReadMetricsTest.java | 2 ++
1 file changed, 2 insertions(+)
diff --git a/integration-test/spark-common/src/test/java/org/apache/uniffle/test/WriteAndReadMetricsTest.java b/integration-test/spark-common/src/test/java/org/apache/uniffle/test/WriteAndReadMetricsTest.java
index 9b7d450bc..c7b014d43 100644
--- a/integration-test/spark-common/src/test/java/org/apache/uniffle/test/WriteAndReadMetricsTest.java
+++ b/integration-test/spark-common/src/test/java/org/apache/uniffle/test/WriteAndReadMetricsTest.java
@@ -60,6 +60,8 @@ public class WriteAndReadMetricsTest extends SimpleTestBase {
     Map<String, Long> result = new HashMap<>();
     result.put("size", (long) list.size());
+    // take a rest to make sure all task metrics are updated before read stageData
+    Thread.sleep(100);
     for (int stageId : spark.sparkContext().statusTracker().getJobInfo(0).get().stageIds()) {
       long writeRecords = getFirstStageData(spark, stageId).shuffleWriteRecords();
       long readRecords = getFirstStageData(spark, stageId).shuffleReadRecords();