cristian-fatu opened a new issue #18468:
URL: https://github.com/apache/airflow/issues/18468


   ### Apache Airflow version
   
   2.1.1
   
   ### Operating System
   
   Ubuntu
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### What happened
   
   I tried to run a simple Spark application using the SparkKubernetesOperator and SparkKubernetesSensor.
   In the YAML file for the Spark Operator I added a sidecar container to the driver pod.
   When the job runs in Airflow, the SparkKubernetesSensor step fails with the following error:
   
   > [2021-09-23 13:24:21,547] {spark_kubernetes.py:92} WARNING - Could not read logs for pod pyspark-pi-driver. It may have been disposed.
   Make sure timeToLiveSeconds is set on your SparkApplication spec.
   underlying exception: (400)
   Reason: Bad Request
   HTTP response headers: HTTPHeaderDict({'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'Date': 'Thu, 23 Sep 2021 13:24:21 GMT', 'Content-Length': '233'})
   HTTP response body: b'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"a container name must be specified for pod pyspark-pi-driver, choose one of: [spark-kubernetes-driver logging-sidecar]","reason":"BadRequest","code":400}\n'
   
   My YAML file does not set timeToLiveSeconds, so the driver pod is still around at the end of the job execution.
   I believe the cause is that the call to **get_pod_logs** from within SparkKubernetesSensor._log_driver passes only the driver pod name and no container name. This works when the driver container is the only container in the pod, but it raises an error when the pod has multiple containers.
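
To illustrate the suspected cause, here is a minimal sketch (not the Airflow implementation) of a log fetch that names the container explicitly. It assumes `core_v1` behaves like `kubernetes.client.CoreV1Api`, whose `read_namespaced_pod_log` accepts an optional `container` argument. The helper name `read_driver_logs` is hypothetical, and the container name is the one listed in the error message above.

```python
from typing import Any, Optional

# Driver container name as listed in the 400 response above.
DRIVER_CONTAINER = "spark-kubernetes-driver"


def read_driver_logs(
    core_v1: Any,
    pod_name: str,
    namespace: str,
    container: Optional[str] = DRIVER_CONTAINER,
) -> str:
    """Fetch logs for a single container of a pod.

    `core_v1` is assumed to expose read_namespaced_pod_log like
    kubernetes.client.CoreV1Api does. This helper is hypothetical.
    """
    kwargs = {"name": pod_name, "namespace": namespace}
    if container:
        # Only pass `container` when one is given; omitting it is what
        # triggers the 400 on pods that run more than one container.
        kwargs["container"] = container
    return core_v1.read_namespaced_pod_log(**kwargs)
```

With a real client this would be called as `read_driver_logs(client.CoreV1Api(), "pyspark-pi-driver", "default")`, where the `default` namespace is an assumption. Passing `container="logging-sidecar"` would read the sidecar logs instead.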
   
   I'm attaching my DAG and yaml files.
   
[spark-py-pi-dag-and-yaml.tar.gz](https://github.com/apache/airflow/files/7218062/spark-py-pi-dag-and-yaml.tar.gz)
   
   
   
   ### What you expected to happen
   
   The SparkKubernetesSensor should be able to get the driver container logs even if there are sidecar containers running alongside the driver.
   
   ### How to reproduce
   
   The attached YAML and DAG definition can be used to reproduce the issue.
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   



