[ 
https://issues.apache.org/jira/browse/SPARK-25291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16612911#comment-16612911
 ] 

Ilan Filonenko commented on SPARK-25291:
----------------------------------------

[~skonto] the problem stems from the executors not being created in time for 
the log-reading step, so the tests fail. We therefore need to block on 
executor creation via a Watcher and only read the logs once the executors are 
up. In essence:

  val executorPods = kubernetesTestComponents.kubernetesClient
    .pods()
    .withLabel("spark-app-locator", appLocator)
    .withLabel("spark-role", "executor")
    .list()
    .getItems
  executorPods.asScala.foreach { pod =>
    executorPodChecker(pod)
  }

runs after the spark-submit command, and it takes an arbitrary period of time 
for the executors to get spun up. If the k8s client that lists the executor 
pods returns 0 pods, the foreach body never runs and no executor pod is 
checked. The flakiness therefore comes from the check not always seeing the 
executor pods that are created. As for the flakiness in the PySpark tests, it 
seems that they set .set("spark.executor.memory", "500m") in the SparkConf, so 
the executor pods report 884Mi instead of the 1408Mi (884 = 500 + 384 and 
1408 = 1024 + 384, if the 384 MiB minimum memory overhead applies). So that 
error seems to be related to the PySpark test framework. 
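
To illustrate the race described above, here is a minimal sketch of blocking 
until the pod listing is non-empty. It polls rather than using a Watcher, so 
it is a simpler stand-in for the approach proposed in this comment, not the 
actual fix. The helper awaitNonEmpty, the object name, and the fake listing 
function are all hypothetical and are not part of Spark or the k8s client.

```scala
import scala.annotation.tailrec
import scala.concurrent.duration._

object WaitForExecutors {
  // Hypothetical helper: calls `fetch` repeatedly until it returns a
  // non-empty result, or throws once `timeout` has elapsed.
  def awaitNonEmpty[A](timeout: FiniteDuration, interval: FiniteDuration)(
      fetch: () => Seq[A]): Seq[A] = {
    val deadline = timeout.fromNow

    @tailrec
    def loop(): Seq[A] = {
      val items = fetch()
      if (items.nonEmpty) {
        items
      } else if (deadline.isOverdue()) {
        throw new IllegalStateException(s"No items appeared within $timeout")
      } else {
        Thread.sleep(interval.toMillis)
        loop()
      }
    }

    loop()
  }

  def main(args: Array[String]): Unit = {
    // Stand-in for the executor pod listing: empty for the first two calls,
    // mimicking executors that have not been spun up yet.
    var calls = 0
    val fakeList = () => {
      calls += 1
      if (calls < 3) Seq.empty[String] else Seq("exec-1", "exec-2")
    }

    val pods = awaitNonEmpty(5.seconds, 100.millis)(fakeList)
    // Only now is it safe to run the per-pod check.
    pods.foreach(pod => println(s"checking $pod"))
  }
}
```

In the integration test, the fetch function would presumably wrap the 
.pods()...list().getItems call quoted above, so that executorPodChecker only 
runs once at least one executor pod has been returned.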

> Flakiness of tests in terms of executor memory (SecretsTestSuite)
> -----------------------------------------------------------------
>
>                 Key: SPARK-25291
>                 URL: https://issues.apache.org/jira/browse/SPARK-25291
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 2.4.0
>            Reporter: Ilan Filonenko
>            Priority: Major
>
> SecretsTestSuite shows flakiness in terms of correct setting of executor 
> memory when run with default settings: 
> Run SparkPi with env and mount secrets. *** FAILED ***
>  "[884]Mi" did not equal "[1408]Mi" (KubernetesSuite.scala:272)



