meatulbisht opened a new pull request, #50803:
URL: https://github.com/apache/airflow/pull/50803

This PR fixes issue #41211, where the SparkKubernetesOperator's reattach_on_restart functionality doesn't work correctly.

## Problem

When reattach_on_restart is enabled, the SparkKubernetesOperator tries to find the driver pod by looking for pods with specific task context labels (dag_id, task_id, run_id). However, these labels are not actually added to the driver and executor pods when they are created, so the lookup finds nothing and the reattach fails.

## Solution

This PR adds code to the execute method of the SparkKubernetesOperator class that adds the task context labels to both the driver and executor pods when reattach_on_restart is enabled. This allows the operator to find the existing driver pod if the scheduler restarts.

## Testing

I've tested the fix in the following ways:

1. Unit tests that verify the task context labels are correctly added when reattach_on_restart is enabled
2. Tests that verify the default behavior remains unchanged when reattach_on_restart is disabled
3. Integration tests in a real Kubernetes environment using Kind
4. Tests with different Spark application configurations to ensure compatibility

All tests passed, confirming that the fix works as expected.
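
For readers who want a concrete picture of the labeling step described under Solution, here is a minimal sketch in plain Python. It is not the code from this PR: the helper names `add_task_context_labels` and `label_selector`, the dict layout (`spec.driver.labels` / `spec.executor.labels`), and the sample label values are all assumptions made for illustration.

```python
from copy import deepcopy


def add_task_context_labels(spark_app, context_labels, reattach_on_restart=True):
    """Hypothetical helper: merge task context labels into the driver and
    executor sections of a SparkApplication body (assumed dict layout)."""
    if not reattach_on_restart:
        # Default behavior: leave the application body untouched.
        return spark_app

    patched = deepcopy(spark_app)  # do not mutate the caller's template
    spec = patched.setdefault("spec", {})
    for role in ("driver", "executor"):
        role_spec = spec.get(role) or {}
        # Keep any labels the user already set, then add the context labels.
        role_spec["labels"] = {**(role_spec.get("labels") or {}), **context_labels}
        spec[role] = role_spec
    return patched


def label_selector(context_labels):
    """Hypothetical helper: build a Kubernetes equality-based label selector
    string that would match pods carrying these labels."""
    return ",".join(f"{key}={value}" for key, value in sorted(context_labels.items()))


# Sample values are made up. Real label values must satisfy Kubernetes
# label syntax (at most 63 characters, limited character set), so a raw
# run_id would typically need sanitizing first.
app = {"spec": {"driver": {"labels": {"version": "3.5.0"}}, "executor": {}}}
labels = {"dag_id": "example_dag", "task_id": "spark_task", "run_id": "manual_run_1"}

patched = add_task_context_labels(app, labels)
selector = label_selector(labels)
```

Because the same label set is written to the pods and used to build the selector, a lookup after a restart can match the driver pod that was created earlier. With `reattach_on_restart=False` the sketch returns the input unchanged, mirroring the unchanged default behavior covered in the tests above.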
