amoghrajesh commented on PR #67473:
URL: https://github.com/apache/airflow/pull/67473#issuecomment-4666827014

   Addressed all review comments. Here is a summary of what's in the final state:
   
   **Hook (`spark_submit.py`)**
   - `_is_yarn_cluster_mode` flag added (YARN + deploy_mode=cluster)
   - `submit()` no longer blocks for RM API polling when 
`yarn_track_via_rm_api=True`. It now exits after capturing 
`_yarn_application_id`. Polling responsibility moved to the operator.
   - `_track_yarn_application` renamed to 
`_start_yarn_application_status_tracking` with state-change logging and a 
periodic heartbeat every 10 polls
   - `query_yarn_application_status()` added: normalizes the YARN `(state, 
finalStatus)` tuple to a single string for the `ResumableJobMixin` interface
   - `kill_yarn_application()` public wrapper removed. `on_kill()` in the 
operator calls `_kill_yarn_application()` directly so Kerberos auth is applied
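
For readers following along, the normalization and tracking behaviour described in the hook bullets could be sketched roughly as below. This is not the code from the PR: `normalize_yarn_status`, `track_until_done`, the `TERMINAL` set, and the log message wording are hypothetical names chosen for illustration, and the mapping rule (report `finalStatus` once the state is `FINISHED`) is an assumption about how the `(state, finalStatus)` pair might be collapsed.

```python
import time
from typing import Callable, Optional, Tuple

# Assumed terminal values of the normalized status string (illustrative only).
TERMINAL = {"SUCCEEDED", "FAILED", "KILLED", "FINISHED"}


def normalize_yarn_status(state: str, final_status: str) -> str:
    """Collapse YARN's (state, finalStatus) pair into a single string."""
    state, final_status = state.upper(), final_status.upper()
    # A FINISHED application carries its real outcome in finalStatus.
    if state == "FINISHED" and final_status != "UNDEFINED":
        return final_status
    return state


def track_until_done(
    poll: Callable[[], Tuple[str, str]],
    log: Callable[[str], None],
    heartbeat_every: int = 10,
    interval: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until a terminal status, logging changes and a periodic heartbeat."""
    last: Optional[str] = None
    polls = 0
    while True:
        status = normalize_yarn_status(*poll())
        polls += 1
        if status != last:
            # Log only when the normalized status actually changes.
            log(f"status changed: {last} -> {status}")
            last = status
        elif polls % heartbeat_every == 0:
            # Heartbeat so long-running jobs do not look stalled in the logs.
            log(f"still {status} after {polls} polls")
        if status in TERMINAL:
            return status
        sleep(interval)
```

Passing `poll`, `log`, and `sleep` in as callables keeps the sketch independent of any real ResourceManager client and makes the loop easy to exercise in a test.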
   
   **Operator (`spark_submit.py`)**
   
   Four paths for YARN cluster mode in `execute()`:
   1. `reconnect_on_retry=True` + `yarn_track_via_rm_api=True` -- full crash 
recovery via `execute_resumable()`
   2. `reconnect_on_retry=True` + `yarn_track_via_rm_api=False` -- raises 
`ValueError` at startup (no way to resume without RM API)
   3. `reconnect_on_retry=False` + `yarn_track_via_rm_api=True` -- submit + 
poll without task_state persistence
   4. `reconnect_on_retry=False` + `yarn_track_via_rm_api=False` -- falls 
through to `hook.submit()`, spark-submit blocks with `waitAppCompletion=true` 
(unchanged legacy behavior)
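
The four combinations above amount to a small decision table. A minimal sketch of that dispatch, assuming a hypothetical helper `select_yarn_cluster_path` and illustrative return labels (neither is taken from the PR), might look like this:

```python
def select_yarn_cluster_path(reconnect_on_retry: bool, yarn_track_via_rm_api: bool) -> str:
    """Return a label for the execute() path chosen in YARN cluster mode."""
    if reconnect_on_retry and not yarn_track_via_rm_api:
        # Path 2: resuming is impossible without the RM API, so fail early.
        raise ValueError("reconnect_on_retry=True requires yarn_track_via_rm_api=True")
    if reconnect_on_retry:
        # Path 1: full crash recovery via execute_resumable().
        return "execute_resumable"
    if yarn_track_via_rm_api:
        # Path 3: submit, then poll, without task_state persistence.
        return "submit_then_poll"
    # Path 4: legacy behaviour, spark-submit blocks until the app completes.
    return "blocking_submit"
```

Checking the invalid combination first keeps the remaining branches free of nested conditions.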
   
   `on_kill()` kills via REST API for YARN cluster mode since spark-submit has 
already exited at that point.
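
As an illustration of what a REST-based kill involves, the YARN ResourceManager exposes a Cluster Application State endpoint that accepts a `PUT` with the target state. The sketch below only builds the request and does not send it. `build_kill_request` and the `rm_url` value are hypothetical, and the Kerberos authentication that `_kill_yarn_application()` applies is deliberately left out.

```python
import json
import urllib.request


def build_kill_request(rm_url: str, application_id: str) -> urllib.request.Request:
    """Build (but do not send) a PUT that asks the RM to kill an application."""
    url = f"{rm_url.rstrip('/')}/ws/v1/cluster/apps/{application_id}/state"
    body = json.dumps({"state": "KILLED"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical ResourceManager address and application id, for illustration only.
request = build_kill_request("http://resourcemanager.example:8088", "application_1700000000000_0001")
print(request.get_method(), request.full_url)
```

On a secured cluster the request would also need authentication, which is why routing the kill through the hook's existing private method matters.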

