This is an automated email from the ASF dual-hosted git repository.

zuston pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-uniffle.git


The following commit(s) were added to refs/heads/master by this push:
     new b724a31fd [#1219] fix(test): Fix the flaky test `WriteAndReadMetricsTest` (#1235)
b724a31fd is described below

commit b724a31fd1acbcdf3b0b9077dab00c4bd8716201
Author: summaryzb <[email protected]>
AuthorDate: Fri Oct 13 23:15:21 2023 -0500

    [#1219] fix(test): Fix the flaky test `WriteAndReadMetricsTest` (#1235)
    
    ### What changes were proposed in this pull request?
    Add a short wait before verifying the result.
    
    ### Why are the changes needed?
    The failure usually happens in the Spark 2.3 integration test:
    ```
    Error:  Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 31.317 s <<< FAILURE! - in org.apache.uniffle.test.WriteAndReadMetricsTest
    Error:  test  Time elapsed: 28.47 s  <<< FAILURE!
    org.opentest4j.AssertionFailedError: expected: <55> but was: <54>
            at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55)
            at org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62)
            at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:182)
            at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:177)
            at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1141)
            at org.apache.uniffle.test.SparkIntegrationTestBase.verifyTestResult(SparkIntegrationTestBase.java:130)
            at org.apache.uniffle.test.SparkIntegrationTestBase.run(SparkIntegrationTestBase.java:67)
            at org.apache.uniffle.test.WriteAndReadMetricsTest.test(WriteAndReadMetricsTest.java:40)
    ```
    Inspired by [SPARK-24415](https://issues.apache.org/jira/browse/SPARK-24415).
    This is probably an event-ordering problem: taskEndEvent triggers the metric updates, while stageCompletion triggers the stageData updates, so the test may read stageData before all task metrics have been applied.
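    
    As an illustration only, the ordering problem described above could also be handled by polling for the expected state rather than sleeping for a fixed time. The sketch below is not part of this patch; `AwaitCondition` and its `await` method are hypothetical names, and the condition in `main` is a stand-in for "stageData reports the expected record count".
    
    ```java
    import java.util.function.BooleanSupplier;
    
    // Hypothetical helper for illustration; it is not part of Uniffle or Spark.
    final class AwaitCondition {
      private AwaitCondition() {}
    
      /**
       * Polls the condition until it holds or the timeout elapses.
       * Returns true if the condition became true, false on timeout.
       */
      static boolean await(BooleanSupplier condition, long timeoutMillis, long pollMillis)
          throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
          if (condition.getAsBoolean()) {
            return true;
          }
          // Compare via subtraction so the check is safe against nanoTime overflow.
          if (System.nanoTime() - deadline >= 0) {
            return false;
          }
          Thread.sleep(pollMillis);
        }
      }
    
      public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Stand-in condition: becomes true roughly 50 ms after start.
        boolean ready = await(() -> System.currentTimeMillis() - start >= 50, 1000, 10);
        System.out.println("condition met: " + ready);
      }
    }
    ```
    
    Compared with a fixed `Thread.sleep(100)`, a polling wait returns as soon as the condition holds and tolerates slower machines up to the timeout, at the cost of a little more code.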
    
    ### Does this PR introduce _any_ user-facing change?
    No.
    
    ### How was this patch tested?
    Ran the test locally in a loop 100 times without a failure.
---
 .../src/test/java/org/apache/uniffle/test/WriteAndReadMetricsTest.java  | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/integration-test/spark-common/src/test/java/org/apache/uniffle/test/WriteAndReadMetricsTest.java b/integration-test/spark-common/src/test/java/org/apache/uniffle/test/WriteAndReadMetricsTest.java
index 9b7d450bc..c7b014d43 100644
--- a/integration-test/spark-common/src/test/java/org/apache/uniffle/test/WriteAndReadMetricsTest.java
+++ b/integration-test/spark-common/src/test/java/org/apache/uniffle/test/WriteAndReadMetricsTest.java
@@ -60,6 +60,8 @@ public class WriteAndReadMetricsTest extends SimpleTestBase {
     Map<String, Long> result = new HashMap<>();
     result.put("size", (long) list.size());
 
+    // take a rest to make sure all task metrics are updated before read stageData
+    Thread.sleep(100);
     for (int stageId : spark.sparkContext().statusTracker().getJobInfo(0).get().stageIds()) {
       long writeRecords = getFirstStageData(spark, stageId).shuffleWriteRecords();
       long readRecords = getFirstStageData(spark, stageId).shuffleReadRecords();
