nailo2c commented on PR #65991:
URL: https://github.com/apache/airflow/pull/65991#issuecomment-4490759932
Hi all, after addressing the feedback, I retested the code and it works as
expected.
+ Confirmed: the test Dag uses `yarn application -status`.
<img width="1904" height="982" alt="evidence_2"
src="https://github.com/user-attachments/assets/43b5cb55-7f3b-4a20-821d-f6bc3325b4da"
/>
+ It works as expected.
<img width="1909" height="868" alt="evidence_3"
src="https://github.com/user-attachments/assets/a21acdff-475d-4749-8ee8-c1df10451992"
/>
+ The shim log confirms the hook called yarn with the expected args (PPID
1209 = task runner).
```console
[Breeze:3.10.20] root@61cd5c221936:/opt/airflow$ cat /tmp/yarn-invocations.log
[2026-05-19T17:56:01+00:00] PID=1282 PPID=1209 ARGS: application -status application_1779159539321_0005
[2026-05-19T17:56:09+00:00] PID=1285 PPID=1209 ARGS: application -status application_1779159539321_0005
[2026-05-19T17:56:14+00:00] PID=1288 PPID=1209 ARGS: application -status application_1779159539321_0005
[2026-05-19T17:56:19+00:00] PID=1299 PPID=1209 ARGS: application -status application_1779159539321_0005
[2026-05-19T17:56:25+00:00] PID=1303 PPID=1209 ARGS: application -status application_1779159539321_0005
```
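As a cross-check of the shim log above, here is a minimal sketch that parses the entries and computes the gap between consecutive polls. The log lines are copied from the output above; the regex and the helper name `parse_invocations` are illustrative assumptions, not part of the PR.

```python
import re
from datetime import datetime

# Shim log entries copied from the output above (one entry per line).
LOG = """\
[2026-05-19T17:56:01+00:00] PID=1282 PPID=1209 ARGS: application -status application_1779159539321_0005
[2026-05-19T17:56:09+00:00] PID=1285 PPID=1209 ARGS: application -status application_1779159539321_0005
[2026-05-19T17:56:14+00:00] PID=1288 PPID=1209 ARGS: application -status application_1779159539321_0005
[2026-05-19T17:56:19+00:00] PID=1299 PPID=1209 ARGS: application -status application_1779159539321_0005
[2026-05-19T17:56:25+00:00] PID=1303 PPID=1209 ARGS: application -status application_1779159539321_0005
"""

# Hypothetical pattern matching the shim's line format shown above.
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] PID=(?P<pid>\d+) PPID=(?P<ppid>\d+) ARGS: (?P<args>.*)$"
)


def parse_invocations(log):
    """Return a list of (timestamp, pid, ppid, args) tuples, one per log line."""
    entries = []
    for line in log.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            entries.append(
                (
                    datetime.fromisoformat(match["ts"]),
                    int(match["pid"]),
                    int(match["ppid"]),
                    match["args"].split(),
                )
            )
    return entries


entries = parse_invocations(LOG)
# Seconds elapsed between each poll and the next one.
gaps = [(b[0] - a[0]).total_seconds() for a, b in zip(entries, entries[1:])]
print(gaps)
```

The gaps come out as 8, 5, 5, and 6 seconds, which is roughly consistent with `status_poll_interval=5`; the extra seconds are presumably the runtime of the `yarn` call itself.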
+ Per the feedback, the hook now uses
`spark.yarn.submit.waitAppCompletion=false`.
<img width="1919" height="574" alt="evidence_1"
src="https://github.com/user-attachments/assets/a825e99d-18be-4db8-a2f5-5b293b072c94"
/>
+ The test Dag I used:
```python
from datetime import datetime

from airflow.models import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

with DAG(
    dag_id="spark_yarn_repro_24171",
    schedule=None,
    start_date=datetime(2026, 1, 1),
    catchup=False,
    tags=["repro", "issue-24171"],
):
    SparkSubmitOperator(
        task_id="spark_pi_yarn_cluster",
        application="/opt/airflow/dev/.issue-24171/spark/examples/jars/spark-examples_2.12-3.5.3.jar",
        java_class="org.apache.spark.examples.SparkPi",
        application_args=["200"],
        conn_id="spark_yarn",
        deploy_mode="cluster",
        name="airflow-pi-cluster",
        conf={
            "spark.executor.instances": "1",
            "spark.executor.memory": "512m",
            "spark.driver.memory": "512m",
        },
        verbose=True,
        # Track the application by polling `yarn application -status`.
        yarn_track_via_application_status=True,
        # Seconds between status polls.
        status_poll_interval=5,
    )
```
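For readers unfamiliar with this tracking mode: with `spark.yarn.submit.waitAppCompletion=false`, `spark-submit` returns once the application is submitted, so completion has to be detected by polling `yarn application -status`. The sketch below illustrates that general idea only; it is not the hook's implementation from this PR. The function names, the injectable `fetch` parameter, and the assumption that the report contains `State : ...` and `Final-State : ...` lines are all illustrative.

```python
import re
import subprocess
import time

# YARN application states after which no further transition happens.
TERMINAL_STATES = {"FINISHED", "FAILED", "KILLED"}


def parse_status_report(report):
    """Extract (State, Final-State) from a status report, or None where absent.

    Assumes the report has lines shaped like "State : RUNNING" and
    "Final-State : UNDEFINED"; adjust the patterns if your output differs.
    """
    state = re.search(r"^\s*State\s*:\s*(\S+)", report, re.MULTILINE)
    final = re.search(r"^\s*Final-State\s*:\s*(\S+)", report, re.MULTILINE)
    return (
        state.group(1) if state else None,
        final.group(1) if final else None,
    )


def run_yarn_status(app_id):
    """Run `yarn application -status <app_id>` and return its stdout."""
    result = subprocess.run(
        ["yarn", "application", "-status", app_id],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def wait_for_completion(app_id, poll_interval=5, fetch=run_yarn_status):
    """Poll until the application reaches a terminal state.

    `fetch` is injectable so the loop can be exercised without a cluster.
    """
    while True:
        state, final_state = parse_status_report(fetch(app_id))
        if state in TERMINAL_STATES:
            return state, final_state
        time.sleep(poll_interval)
```

Calling `wait_for_completion("application_1779159539321_0005")` would issue the same `application -status` command seen in the shim log, once per `poll_interval` seconds, until the application finishes.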