[ 
https://issues.apache.org/jira/browse/SPARK-28467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16977044#comment-16977044
 ] 

zhao bo commented on SPARK-28467:
---------------------------------

https://github.com/apache/spark/pull/25864

> Flaky test: org.apache.spark.SparkContextSuite.test driver discovery under 
> local-cluster mode
> ---------------------------------------------------------------------------------------------
>
>                 Key: SPARK-28467
>                 URL: https://issues.apache.org/jira/browse/SPARK-28467
>             Project: Spark
>          Issue Type: Bug
>          Components: Tests
>    Affects Versions: 3.0.0
>            Reporter: huangtianhua
>            Priority: Minor
>
> We ran unit tests on arm64 instance, and there are tests failed due to the 
> executor can't up under the timeout 10000 ms:
>  - test driver discovery under local-cluster mode *** FAILED ***
>  java.util.concurrent.TimeoutException: Can't find 1 executors before 10000 
> milliseconds elapsed
>  at org.apache.spark.TestUtils$.waitUntilExecutorsUp(TestUtils.scala:293)
>  at 
> org.apache.spark.SparkContextSuite.$anonfun$new$78(SparkContextSuite.scala:753)
>  at 
> org.apache.spark.SparkContextSuite.$anonfun$new$78$adapted(SparkContextSuite.scala:741)
>  at org.apache.spark.SparkFunSuite.withTempDir(SparkFunSuite.scala:161)
>  at 
> org.apache.spark.SparkContextSuite.$anonfun$new$77(SparkContextSuite.scala:741)
>  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>  at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
>  at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
>  at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>  at org.scalatest.Transformer.apply(Transformer.scala:22)
>  - test gpu driver resource files and discovery under local-cluster mode *** 
> FAILED ***
>  java.util.concurrent.TimeoutException: Can't find 1 executors before 10000 
> milliseconds elapsed
>  at org.apache.spark.TestUtils$.waitUntilExecutorsUp(TestUtils.scala:293)
>  at 
> org.apache.spark.SparkContextSuite.$anonfun$new$80(SparkContextSuite.scala:781)
>  at 
> org.apache.spark.SparkContextSuite.$anonfun$new$80$adapted(SparkContextSuite.scala:761)
>  at org.apache.spark.SparkFunSuite.withTempDir(SparkFunSuite.scala:161)
>  at 
> org.apache.spark.SparkContextSuite.$anonfun$new$79(SparkContextSuite.scala:761)
>  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>  at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
>  at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
>  at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>  at org.scalatest.Transformer.apply(Transformer.scala:22)
> And then we increase the timeout to 20000(or 30000) the tests passed, I found 
> there are other issues about the timeout increasing before, see: 
> https://issues.apache.org/jira/browse/SPARK-7989 and 
> https://issues.apache.org/jira/browse/SPARK-10651 
>  I think the timeout doesn't work well, and seems there is no principle of 
> the timeout setting, how can I fix this? Could I increase the timeout for 
> these two tests?
>  
> ----
> This test fails intermittently in Amplab CI as well.
> [https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110681/testReport/]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to