wilsonhooi86 opened a new issue, #62353:
URL: https://github.com/apache/airflow/issues/62353
### Apache Airflow version
2.11.X
### If "Other Airflow 3 version" selected, which one?
MWAA 2.11.0
### What happened?
Hi,
I tried to run the GlueJobOperator with `resume_glue_job_on_retry=True` but
somehow during retry when the task failed, it creates a new glue job ID while
the existing glue job ID is still running.
Sample of the GlueJobOperator:
```py
drug_program = GlueJobOperator(
task_id="drug_program",
job_name="gtm-core-dl-dev-euc1-glue-job-glo_informa_prd",
verbose=False,
script_args={
"--name": var_glue_job_name,
"--db_name": var_database_name,
"--tbl_name": "drug_program",
},
stop_job_run_on_kill=False,
deferrable=False,
pool=var_glue_pool,
botocore_config=botocore_overide_config,
resume_glue_job_on_retry=True,
)
```
1st Run:
Create a new glue job ID as per screenshot
<img width="1956" height="825" alt="Image"
src="https://github.com/user-attachments/assets/bfe92130-dd26-4141-a9b4-ca16d17f2718"
/>
2nd Run (1st Retry):
Job failed due to internal errors, during retry, it didn't find back the
existing glue job ID but created 1 new glue job ID. End results, 2 same glue
jobs are running together.
<img width="1756" height="756" alt="Image"
src="https://github.com/user-attachments/assets/04466625-4377-48a4-b4d4-0f58367bdc85"
/>
### What you think should happen instead?
During task retry, it should check the existing glue job ID, if found, it
should not create a new glue job ID and resume run status in Airflow Task
### How to reproduce
1. Install the apache-airflow-providers-amazon==9.22.0rc1 in MWAA Airflow
2.11 environment
2. Create GlueJobOperator Dag
Sample of the GlueJobOperator:
```py
drug_program = GlueJobOperator(
task_id="drug_program",
job_name="gtm-core-dl-dev-euc1-glue-job-glo_informa_prd",
verbose=False,
script_args={
"--name": var_glue_job_name,
"--db_name": var_database_name,
"--tbl_name": "drug_program",
},
stop_job_run_on_kill=False,
deferrable=False,
pool=var_glue_pool,
botocore_config=botocore_overide_config,
resume_glue_job_on_retry=True,
)
```
3. Run the glue job task.
4. During 1st run, it failed (to test, produce by marking the task as failed)
5. Task went to for retry but somehow it creates a new glue job ID
### Operating System
Amazon Linux 2023
### Versions of Apache Airflow Providers
apache-airflow-providers-amazon==9.22.0rc1
### Deployment
Amazon (AWS) MWAA
### Deployment details
MWAA 2.11.0 deployment with requirements to install
`apache-airflow-providers-amazon==9.22.0rc1`
### Anything else?
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]