nailo2c opened a new pull request, #65991: URL: https://github.com/apache/airflow/pull/65991
closes: #24171 # Why In YARN cluster mode, Airflow keeps a local `spark-submit` JVM alive for each running Spark task to monitor completion. At scale, these long-lived JVMs can consume significant Airflow worker memory. # How Added an opt-in `yarn_track_via_rm_api` flag that releases the local spark-submit JVM after YARN submission and tracks completion via the YARN ResourceManager REST API. # What + Confirmed: the test DAG submitted the Spark job via the RM REST API. <img width="1907" height="993" alt="rest_airflow_ui" src="https://github.com/user-attachments/assets/08e3466e-d266-443a-846e-7706982c6b01" /> + It works as expected. <img width="1911" height="879" alt="rest_hadoop_ui" src="https://github.com/user-attachments/assets/07ebcb36-94f5-4c79-bce2-5511834e148d" /> + Manually verify that the RM REST API can fetch the job status. <img width="1471" height="1001" alt="rest_breeze_to_rm_rest" src="https://github.com/user-attachments/assets/fd5a6562-352a-4b8c-8841-ca6ca12baae2" /> + Test Dag ```python from datetime import datetime from airflow.models import DAG from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator with DAG( dag_id="spark_yarn_repro_24171_rest", schedule=None, start_date=datetime(2026, 1, 1), catchup=False, tags=["repro", "issue-24171", "rest"], ): SparkSubmitOperator( task_id="spark_pi_yarn_cluster", application="/opt/airflow/dev/.issue-24171/spark/examples/jars/spark-examples_2.12-3.5.3.jar", java_class="org.apache.spark.examples.SparkPi", application_args=["200"], conn_id="spark_yarn_rm", deploy_mode="cluster", name="airflow-pi-cluster-rest", conf={ "spark.executor.instances": "1", "spark.executor.memory": "512m", "spark.driver.memory": "512m", }, yarn_track_via_rm_api=True, status_poll_interval=5, verbose=True, ) ``` + Test connection ```bash airflow connections add spark_yarn_rm \ --conn-type spark \ --conn-host yarn \ --conn-extra '{ "deploy-mode": "cluster", "spark-binary": "spark-submit", "yarn_resourcemanager_webapp_address": "http://resourcemanager:8088" }' ``` <br><br> --- ##### Was generative AI tooling used to co-author this PR? <!-- If generative AI tooling has been used in the process of authoring this PR, please change below checkbox to `[X]` followed by the name of the tool, uncomment the "Generated-by". --> - [x] Yes (please specify the tool below) Generated-by: Claude Opus 4.7 following [the guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
