GitHub user hezeclark added a comment to the discussion: Airflow task failed 
but spark kube app is running

This is a common issue with Airflow + SparkKubernetesOperator when the Airflow 
task timeout is shorter than the actual Spark job duration.

**Root cause**: Airflow marks the task as failed when it doesn't receive a 
heartbeat or when the task's `execution_timeout` is exceeded, but the Spark 
application keeps running in Kubernetes independently — it has no awareness of 
Airflow's state.
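
This decoupling is easy to reproduce outside Airflow: a parent process that launches a detached child and then enforces its own timeout has no effect on the child. A minimal stand-in sketch (plain `subprocess`, not Airflow or Spark):

```python
import subprocess

# The "Airflow task" (parent) launches a long-running "Spark job"
# (child) in its own session, then gives up after a short timeout.
child = subprocess.Popen(
    ["sleep", "5"],          # stand-in for the Spark application
    start_new_session=True,  # detached: the parent's fate doesn't affect it
)
try:
    child.wait(timeout=0.5)  # the "execution_timeout"
except subprocess.TimeoutExpired:
    print("parent timed out")

# The child is still alive even though the parent gave up on it.
print("child still running:", child.poll() is None)  # → True
child.kill()  # explicit cleanup, analogous to solution 3 below
```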

**Solutions:**

**1. Increase `execution_timeout` on the Spark task**

```python
from datetime import timedelta

from airflow.providers.cncf.kubernetes.operators.spark_kubernetes import SparkKubernetesOperator

submit_job = SparkKubernetesOperator(
    task_id='submit_spark_job',
    application_file='/path/to/spark-app.yaml',  # parameter is application_file, not application
    execution_timeout=timedelta(hours=3),  # longer than max expected job duration
    ...
)
```

**2. Use `SparkKubernetesSensor` to poll instead of waiting synchronously**

```python
from airflow.providers.cncf.kubernetes.sensors.spark_kubernetes import SparkKubernetesSensor

monitor_job = SparkKubernetesSensor(
    task_id='monitor_spark_job',
    application_name='{{ task_instance.xcom_pull(task_ids="submit_spark_job") }}',
    poke_interval=30,
    timeout=7200,  # 2 hours
    ...
)
```
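
For the sensor to find the application, the submit task has to expose the generated name. A minimal wiring sketch (the task ids, namespace, and XCom usage here are assumptions matching the snippets above; the exact shape of what the operator pushes to XCom varies by provider version, so check yours):

```python
from datetime import timedelta

from airflow.providers.cncf.kubernetes.operators.spark_kubernetes import SparkKubernetesOperator
from airflow.providers.cncf.kubernetes.sensors.spark_kubernetes import SparkKubernetesSensor

submit_job = SparkKubernetesOperator(
    task_id='submit_spark_job',
    application_file='/path/to/spark-app.yaml',
    namespace='spark',
    do_xcom_push=True,  # make the created SparkApplication available to the sensor
    execution_timeout=timedelta(minutes=10),  # only covers submission, not the job
)

monitor_job = SparkKubernetesSensor(
    task_id='monitor_spark_job',
    application_name='{{ task_instance.xcom_pull(task_ids="submit_spark_job")["metadata"]["name"] }}',
    namespace='spark',
    poke_interval=30,
    timeout=7200,
)

submit_job >> monitor_job  # submit returns quickly; the sensor does the waiting
```

The point of the split is that the submit task can keep a short `execution_timeout` (it only covers the submission itself), while the long wait happens in the sensor, which is designed for polling.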

**3. Add a cleanup DAG / task that deletes orphaned Spark apps**

When Airflow fails but Spark keeps running, you need to handle the orphaned 
app. Add an `on_failure_callback` that calls `kubectl delete sparkapplication 
<name>` to prevent resource leaks:

```python
import subprocess

def cleanup_spark_app(context):
    app_name = context['task_instance'].xcom_pull(task_ids='submit_spark_job')
    if app_name:  # the submit task may have failed before pushing a name
        subprocess.run(
            ['kubectl', 'delete', 'sparkapplication', app_name, '-n', 'spark'],
            check=False,  # tolerate "not found" if the app already terminated
        )

submit_job = SparkKubernetesOperator(
    on_failure_callback=cleanup_spark_app,
    ...
)
```
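
If shelling out to `kubectl` isn't an option (no binary or kubeconfig on the worker image), the same cleanup can be done with the official `kubernetes` Python client by deleting the custom resource directly. A sketch, assuming the spark-operator's `sparkoperator.k8s.io/v1beta2` API group, a `spark` namespace, and in-cluster credentials:

```python
from kubernetes import client, config

def cleanup_spark_app(context):
    app_name = context['task_instance'].xcom_pull(task_ids='submit_spark_job')
    if not app_name:
        return  # submit task never pushed a name; nothing to clean up
    config.load_incluster_config()  # or config.load_kube_config() outside the cluster
    client.CustomObjectsApi().delete_namespaced_custom_object(
        group='sparkoperator.k8s.io',
        version='v1beta2',
        namespace='spark',
        plural='sparkapplications',
        name=app_name,
    )
```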

GitHub link: 
https://github.com/apache/airflow/discussions/63298#discussioncomment-16117745
