HeartSaVioR opened a new pull request #27004: [SPARK-30348][CORE] Fix flaky 
test failure on "MasterSuite.SPARK-27510: Master should avoid ..."
URL: https://github.com/apache/spark/pull/27004
 
 
   ### What changes were proposed in this pull request?
   
   This patch fixes the flaky test failure on MasterSuite, "SPARK-27510: Master 
should avoid dead loop while launching executor failed in Worker".
   
   Ironically, the test failed because it ran too fast: the default polling 
interval of `eventually` is "15 ms", but it took only "8 ms" from submitting 
the driver to removing the app from the master, so the app can already be gone 
by the time the check runs.
   
   ```
   19/12/23 15:45:06.533 dispatcher-event-loop-6 INFO Master: Registering 
worker localhost:9999 with 10 cores, 3.6 GiB RAM
   19/12/23 15:45:06.534 dispatcher-event-loop-6 INFO Master: Driver submitted 
org.apache.spark.FakeClass
   19/12/23 15:45:06.535 dispatcher-event-loop-6 INFO Master: Launching driver 
driver-20191223154506-0000 on worker 10001
   19/12/23 15:45:06.536 dispatcher-event-loop-9 INFO Master: Registering app 
name
   19/12/23 15:45:06.537 dispatcher-event-loop-9 INFO Master: Registered app 
name with ID app-20191223154506-0000
   19/12/23 15:45:06.537 dispatcher-event-loop-9 INFO Master: Launching 
executor app-20191223154506-0000/0 on worker 10001
   19/12/23 15:45:06.537 dispatcher-event-loop-10 INFO Master: Removing 
executor app-20191223154506-0000/0 because it is FAILED
   ...
   19/12/23 15:45:06.542 dispatcher-event-loop-19 ERROR Master: Application 
name with ID app-20191223154506-0000 failed 10 times; removing it
   ```
   
   Given that the interval is already tiny, the patch does not lower it; 
instead, the status verification now also accounts for the case above.
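
   The idea can be illustrated with a minimal, self-contained sketch. This is not the code from the patch: `pollUntil`, `AppState`, `strictCheck`, and `tolerantCheck` are hypothetical names made up for this illustration, and `pollUntil` is only a simplified stand-in for a polling assertion such as `eventually`. The sketch assumes the app has already been removed before the first poll, as in the log above.

```scala
import java.util.concurrent.atomic.AtomicReference

object EventuallySketch {
  // Hypothetical app lifecycle, loosely modeled on the log above.
  sealed trait AppState
  case object NotRegistered extends AppState
  case object Registered extends AppState
  case object Removed extends AppState

  // Hypothetical stand-in for a polling assertion: evaluates `check` right away,
  // then again every `intervalMs` until it holds or `timeoutMs` has elapsed.
  def pollUntil(timeoutMs: Long, intervalMs: Long)(check: => Boolean): Boolean = {
    val deadline = System.nanoTime() + timeoutMs * 1000000L
    var ok = check
    while (!ok && System.nanoTime() < deadline) {
      Thread.sleep(intervalMs)
      ok = check
    }
    ok
  }

  // Strict check: holds only while the app is still registered.
  def strictCheck(state: AppState): Boolean = state == Registered

  // Tolerant check: also accepts an app that was registered and already removed.
  def tolerantCheck(state: AppState): Boolean =
    state == Registered || state == Removed

  def main(args: Array[String]): Unit = {
    // Simulate the fast path: the app is already removed before the first poll.
    val state = new AtomicReference[AppState](Removed)

    val strict = pollUntil(timeoutMs = 60, intervalMs = 15)(strictCheck(state.get))
    val tolerant = pollUntil(timeoutMs = 60, intervalMs = 15)(tolerantCheck(state.get))

    println(s"strict check passed: $strict")
    println(s"tolerant check passed: $tolerant")
  }
}
```

   In this sketch the strict check times out because the "registered" window has already closed, while the tolerant check passes on the first evaluation. Accepting the terminal state keeps the verification independent of how the polling interval lines up with the state transitions.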
   
   ### Why are the changes needed?
   
   We observed an intermittent failure of this test in a Jenkins build, which should be fixed.
   
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115664/testReport/
   
   ### Does this PR introduce any user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Modified the existing unit test (UT).

