nailo2c commented on PR #65991:
URL: https://github.com/apache/airflow/pull/65991#issuecomment-4490759932

   Hi all, after addressing the feedback, I retested the code and it works as 
expected.
   
   + Confirmed: the test Dag uses `yarn application -status`.
         <img width="1904" height="982" alt="evidence_2" src="https://github.com/user-attachments/assets/43b5cb55-7f3b-4a20-821d-f6bc3325b4da" />
   
   + It works as expected.
         <img width="1909" height="868" alt="evidence_3" src="https://github.com/user-attachments/assets/a21acdff-475d-4749-8ee8-c1df10451992" />
   
   + The shim log confirms the hook called yarn with the expected args (PPID 
1209 = task runner).
   ```console
   [Breeze:3.10.20] root@61cd5c221936:/opt/airflow$ cat /tmp/yarn-invocations.log
   [2026-05-19T17:56:01+00:00] PID=1282 PPID=1209 ARGS: application -status application_1779159539321_0005
   [2026-05-19T17:56:09+00:00] PID=1285 PPID=1209 ARGS: application -status application_1779159539321_0005
   [2026-05-19T17:56:14+00:00] PID=1288 PPID=1209 ARGS: application -status application_1779159539321_0005
   [2026-05-19T17:56:19+00:00] PID=1299 PPID=1209 ARGS: application -status application_1779159539321_0005
   [2026-05-19T17:56:25+00:00] PID=1303 PPID=1209 ARGS: application -status application_1779159539321_0005
   ```
   
   + Per the feedback, the hook now uses `spark.yarn.submit.waitAppCompletion=false`.
         <img width="1919" height="574" alt="evidence_1" src="https://github.com/user-attachments/assets/a825e99d-18be-4db8-a2f5-5b293b072c94" />
   
   + The test Dag I used:
   ```python
   from datetime import datetime

   from airflow.models import DAG
   from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator
   
   with DAG(
       dag_id="spark_yarn_repro_24171",
       schedule=None,
       start_date=datetime(2026, 1, 1),
       catchup=False,
       tags=["repro", "issue-24171"],
   ):
       SparkSubmitOperator(
           task_id="spark_pi_yarn_cluster",
           application="/opt/airflow/dev/.issue-24171/spark/examples/jars/spark-examples_2.12-3.5.3.jar",
           java_class="org.apache.spark.examples.SparkPi",
           application_args=["200"],
           conn_id="spark_yarn",
           deploy_mode="cluster",
           name="airflow-pi-cluster",
           conf={
               "spark.executor.instances": "1",
               "spark.executor.memory": "512m",
               "spark.driver.memory": "512m",
           },
           verbose=True,
           yarn_track_via_application_status=True,
           status_poll_interval=5,
       )
   
   ```
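
For anyone who wants to check the shim log mechanically instead of by eye, here is a minimal sketch. It assumes the log format shown in the console output earlier in this comment (`[timestamp] PID=… PPID=… ARGS: …`); the names `parse_shim_log` and `poll_gaps` are hypothetical helpers written for this illustration and are not part of the PR or the hook.

```python
import re
from datetime import datetime

# Assumed line format, copied from the shim log in this comment:
# [<ISO-8601 timestamp>] PID=<n> PPID=<n> ARGS: <yarn arguments>
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+PID=(?P<pid>\d+)\s+PPID=(?P<ppid>\d+)\s+ARGS:\s+(?P<args>.*)$"
)


def parse_shim_log(text):
    """Return one dict per log line that matches the assumed format."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip blank lines and anything in another format
        entries.append(
            {
                "ts": datetime.fromisoformat(match["ts"]),
                "pid": int(match["pid"]),
                "ppid": int(match["ppid"]),
                "args": match["args"].split(),
            }
        )
    return entries


def poll_gaps(entries):
    """Seconds elapsed between consecutive invocations."""
    return [
        (later["ts"] - earlier["ts"]).total_seconds()
        for earlier, later in zip(entries, entries[1:])
    ]


# The five log lines from this comment, unwrapped.
SAMPLE_LOG = """\
[2026-05-19T17:56:01+00:00] PID=1282 PPID=1209 ARGS: application -status application_1779159539321_0005
[2026-05-19T17:56:09+00:00] PID=1285 PPID=1209 ARGS: application -status application_1779159539321_0005
[2026-05-19T17:56:14+00:00] PID=1288 PPID=1209 ARGS: application -status application_1779159539321_0005
[2026-05-19T17:56:19+00:00] PID=1299 PPID=1209 ARGS: application -status application_1779159539321_0005
[2026-05-19T17:56:25+00:00] PID=1303 PPID=1209 ARGS: application -status application_1779159539321_0005
"""

entries = parse_shim_log(SAMPLE_LOG)
print({e["ppid"] for e in entries})  # {1209}
print(poll_gaps(entries))            # [8.0, 5.0, 5.0, 6.0]
```

On the sample above, every invocation shares PPID 1209 and no gap is shorter than the `status_poll_interval=5` set in the test Dag; the gaps longer than 5 seconds presumably include the time each `yarn` call took to return.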

