Prashant Sharma created SPARK-33668:
---------------------------------------
Summary: Fix flaky test "Verify logging configuration is picked
from the provided SPARK_CONF_DIR/log4j.properties."
Key: SPARK-33668
URL: https://issues.apache.org/jira/browse/SPARK-33668
Project: Spark
Issue Type: Bug
Components: Kubernetes, Tests
Affects Versions: 3.1.0
Reporter: Prashant Sharma
The test is flaky: it has failed on more than one occasion, and the reason for
the failure is
{code:java}
The code passed to eventually never returned normally. Attempted 109 times
over 3.0079882413999997 minutes. Last failure message: Failure executing: GET
at:
https://192.168.39.167:8443/api/v1/namespaces/b37fc72a991b49baa68a2eaaa1516463/pods/spark-pi-97a9bc76308e7fe3-exec-1/log?pretty=false.
Message: pods "spark-pi-97a9bc76308e7fe3-exec-1" not found. Received status:
Status(apiVersion=v1, code=404, details=StatusDetails(causes=[], group=null,
kind=pods, name=spark-pi-97a9bc76308e7fe3-exec-1, retryAfterSeconds=null,
uid=null, additionalProperties={}), kind=Status, message=pods
"spark-pi-97a9bc76308e7fe3-exec-1" not found, metadata=ListMeta(_continue=null,
remainingItemCount=null, resourceVersion=null, selfLink=null,
additionalProperties={}), reason=NotFound, status=Failure,
additionalProperties={}).. (KubernetesSuite.scala:402)
{code}
From the above failure, it seems that the executor finishes too quickly and is
removed by Spark before the test can read its log.
So, in order to mitigate this situation, one way is to set the following flag
to false (it defaults to true), so that terminated executor pods are kept
{code}
"spark.kubernetes.executor.deleteOnTermination"
{code}
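As a minimal sketch of that mitigation, the flag can be set on a SparkConf as shown below. Executor pods are deleted on termination by default, so keeping them around for the log check means setting the flag to false. The object name KeepExecutorPods is hypothetical, and where this setting would go in the integration test suite is an assumption, not something stated in this issue.
{code:scala}
import org.apache.spark.SparkConf

// Hypothetical standalone example; not part of the Spark test suite.
object KeepExecutorPods {
  def main(args: Array[String]): Unit = {
    // loadDefaults = false: start from an empty conf, ignoring spark.* system properties.
    val conf = new SparkConf(false)
      // Keep executor pods after they terminate so their logs stay readable.
      .set("spark.kubernetes.executor.deleteOnTermination", "false")

    // Read the value back to confirm what would be submitted.
    println(conf.get("spark.kubernetes.executor.deleteOnTermination"))
  }
}
{code}
The same setting could also be passed at submit time as a --conf key=value pair. Either way, pods that are kept this way are not removed by Spark, so the test would need to clean them up itself, for example by deleting the namespace.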