[ https://issues.apache.org/jira/browse/SPARK-28467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16931024#comment-16931024 ]
Jungtaek Lim commented on SPARK-28467: -------------------------------------- Looks like it is flaky in Amplab PR builder. [https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110681/testReport/] Would it be better to file a new issue to track this? > Tests failed if there are not enough executors up before running > ---------------------------------------------------------------- > > Key: SPARK-28467 > URL: https://issues.apache.org/jira/browse/SPARK-28467 > Project: Spark > Issue Type: Bug > Components: Tests > Affects Versions: 3.0.0 > Reporter: huangtianhua > Priority: Minor > > We ran unit tests on arm64 instance, and there are tests failed due to the > executor can't up under the timeout 10000 ms: > - test driver discovery under local-cluster mode *** FAILED *** > java.util.concurrent.TimeoutException: Can't find 1 executors before 10000 > milliseconds elapsed > at org.apache.spark.TestUtils$.waitUntilExecutorsUp(TestUtils.scala:293) > at > org.apache.spark.SparkContextSuite.$anonfun$new$78(SparkContextSuite.scala:753) > at > org.apache.spark.SparkContextSuite.$anonfun$new$78$adapted(SparkContextSuite.scala:741) > at org.apache.spark.SparkFunSuite.withTempDir(SparkFunSuite.scala:161) > at > org.apache.spark.SparkContextSuite.$anonfun$new$77(SparkContextSuite.scala:741) > at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) > at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85) > at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83) > at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) > at org.scalatest.Transformer.apply(Transformer.scala:22) > > - test gpu driver resource files and discovery under local-cluster mode *** > FAILED *** > java.util.concurrent.TimeoutException: Can't find 1 executors before 10000 > milliseconds elapsed > at org.apache.spark.TestUtils$.waitUntilExecutorsUp(TestUtils.scala:293) > at > org.apache.spark.SparkContextSuite.$anonfun$new$80(SparkContextSuite.scala:781) > at > org.apache.spark.SparkContextSuite.$anonfun$new$80$adapted(SparkContextSuite.scala:761) > at org.apache.spark.SparkFunSuite.withTempDir(SparkFunSuite.scala:161) > at > org.apache.spark.SparkContextSuite.$anonfun$new$79(SparkContextSuite.scala:761) > at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) > at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85) > at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83) > at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) > at org.scalatest.Transformer.apply(Transformer.scala:22) > And then we increase the timeout to 20000(or 30000) the tests passed, I found > there are other issues about the timeout increasing before, see: > https://issues.apache.org/jira/browse/SPARK-7989 and > https://issues.apache.org/jira/browse/SPARK-10651 > I think the timeout doesn't work well, and seems there is no principle of the > timeout setting, how can I fix this? Could I increase the timeout for these > two tests? -- This message was sent by Atlassian Jira (v8.3.2#803003) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org