kaxil commented on code in PR #67118:
URL: https://github.com/apache/airflow/pull/67118#discussion_r3270207194


##########
task-sdk/src/airflow/sdk/__init__.py:
##########
@@ -65,6 +65,7 @@
     "ProductMapper",
     "RetryAction",
     "RetryDecision",
+    "ResumableJobMixin",

Review Comment:
   `"ResumableJobMixin"` is out of alphabetical order in `__all__` -- it sits 
between `RetryDecision` and `RetryPolicy`, but should be just after 
`ProductMapper` and before `RetryAction`. Same issue at line 227 in the lazy 
import dict.
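
A minimal sketch of a check that would catch this kind of ordering slip. The `first_unsorted` helper and both lists are hypothetical: the lists only mirror the names visible in the hunk above, and plain case-sensitive string comparison is an assumption about the ordering convention used in `__all__`.

```python
def first_unsorted(names):
    """Return the first (previous, current) pair that breaks ascending order, or None."""
    for prev, cur in zip(names, names[1:]):
        if cur < prev:
            return prev, cur
    return None


# Hypothetical excerpts; the real __all__ contains many more names.
as_submitted = ["ProductMapper", "RetryAction", "RetryDecision", "ResumableJobMixin", "RetryPolicy"]
as_suggested = ["ProductMapper", "ResumableJobMixin", "RetryAction", "RetryDecision", "RetryPolicy"]

print(first_unsorted(as_submitted))  # the pair where the order breaks
print(first_unsorted(as_suggested))  # None when the list is sorted
```

The same helper could be pointed at the keys of the lazy import dict, assuming they follow the same ordering.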



##########
providers/apache/spark/src/airflow/providers/apache/spark/operators/spark_submit.py:
##########
@@ -198,8 +221,63 @@ def execute(self, context: Context) -> None:
             self.conf = inject_transport_information_into_spark_properties(self.conf, context)
         if self._hook is None:
             self._hook = self._get_hook()
+        if self._hook._should_track_driver_status:
+            return self.execute_resumable(context)
         self._hook.submit(self.application)
 
+    def submit_job(self, context: Context) -> str:
+        driver_id = self._hook.submit(self.application)
+        if not driver_id:
+            raise RuntimeError("spark-submit did not return a driver ID")
+        self.log.info("Spark driver submitted: %s", driver_id)
+        return driver_id
+
+    def get_job_status(self, external_id: str) -> str:
+        if self._hook._is_yarn:

Review Comment:
   The YARN and Kubernetes branches in `get_job_status` / `is_job_active` / 
`is_job_succeeded` / `poll_until_complete` are unreachable from this PR: 
`execute_resumable` is only called when `_should_track_driver_status` is True, 
which is only set for `spark://` + cluster mode. YARN and K8s never enter this 
code path.
   
   As written, they read as live branches that a reviewer must verify, even 
though they can never execute. Suggest either dropping them (cleaner -- add 
them alongside the routing change in the follow-up PR) or marking them clearly 
as scaffolding with a reference to the follow-up PR number.
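
A toy model of the reachability argument, as a sketch only. `ToyHook` and `route` are hypothetical stand-ins, not the provider's classes; the tracking condition is taken from the description above (`spark://` + cluster mode), and the `startswith` checks are an assumption about how the master URL is matched.

```python
from dataclasses import dataclass


@dataclass
class ToyHook:
    """Hypothetical stand-in for the hook; models only the flags discussed here."""

    master: str
    deploy_mode: str

    @property
    def should_track_driver_status(self) -> bool:
        # Assumed from the review comment: only spark:// masters in cluster mode.
        return self.master.startswith("spark://") and self.deploy_mode == "cluster"

    @property
    def is_yarn(self) -> bool:
        return self.master.startswith("yarn")


def route(hook: ToyHook) -> str:
    """Mirror the guard in execute(): take the resumable path only when tracking is on."""
    return "execute_resumable" if hook.should_track_driver_status else "plain_submit"


hooks = [
    ToyHook("spark://host:7077", "cluster"),
    ToyHook("spark://host:7077", "client"),
    ToyHook("yarn", "cluster"),
    ToyHook("k8s://https://host:443", "cluster"),
]
for hook in hooks:
    print(hook.master, hook.deploy_mode, "->", route(hook))
```

Under these assumptions no hook with `is_yarn` set ever reaches `execute_resumable`, so a YARN branch inside `get_job_status` cannot run until the routing condition itself changes.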



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
