meatulbisht opened a new pull request, #50803:
URL: https://github.com/apache/airflow/pull/50803

   This PR fixes issue #41211, in which the SparkKubernetesOperator cannot 
reattach to an existing driver pod when reattach_on_restart is enabled.
   
   ## Problem
   
   When reattach_on_restart is enabled, the SparkKubernetesOperator tries to 
find the driver pod by looking for pods with specific task context labels 
(dag_id, task_id, run_id). However, these labels are not actually added to the 
driver and executor pods when creating them, causing the reattach functionality 
to fail.
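
The mismatch can be illustrated with a small, self-contained sketch. It does not use the operator's real code: `build_label_selector`, `pod_matches`, and all label values are hypothetical names chosen for illustration, and the pod is represented only by its label dictionary.

```python
from __future__ import annotations


def build_label_selector(labels: dict[str, str]) -> str:
    """Render labels as a Kubernetes equality-based selector string."""
    return ",".join(f"{key}={value}" for key, value in sorted(labels.items()))


def pod_matches(pod_labels: dict[str, str], wanted: dict[str, str]) -> bool:
    """Return True only if the pod carries every wanted label with the same value."""
    return all(pod_labels.get(key) == value for key, value in wanted.items())


# Hypothetical task context; real values come from the running task instance.
context_labels = {"dag_id": "spark_dag", "task_id": "submit_job", "run_id": "manual_run_1"}

# Hypothetical labels on a driver pod created without the task context labels.
driver_pod_labels = {"spark-role": "driver"}

selector = build_label_selector(context_labels)
found = pod_matches(driver_pod_labels, context_labels)  # False: nothing to reattach to
```

Because the pod lacks the three context labels, the lookup finds nothing. Real run_id values usually contain characters that are not valid in Kubernetes label values, so they would need sanitizing before being used this way.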
   
   ## Solution
   
   This PR updates the execute method of the SparkKubernetesOperator class 
to add the task context labels (dag_id, task_id, run_id) to both the driver 
and executor pods when reattach_on_restart is enabled. This allows the operator 
to find the existing driver pod if the scheduler restarts.
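
A minimal sketch of the labeling step, assuming the Spark application is available as a plain dictionary with `spec.driver` and `spec.executor` sections. The helper name `add_task_context_labels` and the sample values are hypothetical; the actual change in this PR may be structured differently.

```python
from __future__ import annotations

import copy
from typing import Any


def add_task_context_labels(
    application: dict[str, Any], context_labels: dict[str, str]
) -> dict[str, Any]:
    """Return a copy of the application with context labels on driver and executor."""
    result = copy.deepcopy(application)  # leave the caller's dictionary untouched
    spec = result.setdefault("spec", {})
    for role in ("driver", "executor"):
        role_spec = spec.setdefault(role, {})
        labels = role_spec.setdefault("labels", {})
        # Context labels win on key collisions so the later lookup can match.
        labels.update(context_labels)
    return result


# Hypothetical application body and task context.
application = {"spec": {"driver": {"labels": {"team": "data"}}, "executor": {}}}
context_labels = {"dag_id": "spark_dag", "task_id": "submit_job", "run_id": "manual_run_1"}

labeled = add_task_context_labels(application, context_labels)
```

User-defined labels such as `team` are preserved, while a user label that shares a key with a context label is overwritten. That is a design choice of this sketch, not something the PR description states.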
   
   ## Testing
   
   I tested the fix in the following ways:
   
   1. Unit tests that verify the task context labels are correctly added when 
reattach_on_restart is enabled
   2. Tests that verify the default behavior remains unchanged when 
reattach_on_restart is disabled
   3. Integration tests in a real Kubernetes environment using Kind
   4. Tests with different Spark application configurations to ensure 
compatibility
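
The first two items could be expressed as tests along these lines. This is only a sketch against a stand-in function, not the tests included in the PR: `pod_labels_for` and the test names are hypothetical.

```python
from __future__ import annotations


def pod_labels_for(
    base_labels: dict[str, str],
    context_labels: dict[str, str],
    reattach_on_restart: bool,
) -> dict[str, str]:
    """Stand-in for the operator logic: add context labels only when the flag is set."""
    labels = dict(base_labels)
    if reattach_on_restart:
        labels.update(context_labels)
    return labels


CONTEXT = {"dag_id": "spark_dag", "task_id": "submit_job", "run_id": "manual_run_1"}


def test_labels_added_when_enabled() -> None:
    labels = pod_labels_for({"team": "data"}, CONTEXT, reattach_on_restart=True)
    assert labels == {"team": "data", **CONTEXT}


def test_labels_unchanged_when_disabled() -> None:
    labels = pod_labels_for({"team": "data"}, CONTEXT, reattach_on_restart=False)
    assert labels == {"team": "data"}
```

Both functions use plain assertions, so they can be collected by pytest or called directly.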
   
   All tests passed, confirming that the fix works as expected.


