wilsonhooi86 commented on PR #59392:
URL: https://github.com/apache/airflow/pull/59392#issuecomment-3723832283
Good day @henry3260,
Happy New Year and thank you so much for taking the initiative to add this
feature. It will be helpful.
I would like to clarify a specific scenario regarding a Glue job named
`glue_job_database_name_1`. This job handles a single schema but takes a
`--tbl_name` argument to process different tables dynamically. The script
logic adapts based on the table name passed at execution.
Assume one DAG with three `GlueJobOperator` tasks that call the same Glue job,
`glue_job_database_name_1`, in parallel.
Suppose the Glue job runs for `task_id="table_1"` and `task_id="table_2"` are
still in progress. If `task_id="table_3"` suddenly fails due to some internal
error and is retried, will it be able to find the same previous_glue_job_id
again, rather than the run that belongs to `table_1` or `table_2`?
```
table_1 = GlueJobOperator(
    task_id="table_1",
    job_name="glue_job_database_name_1",
    verbose=False,
    script_args={
        "--tbl_name": "table_1",
    },
    resume_glue_job_on_retry=True,
    retry_limit=3,
)
table_2 = GlueJobOperator(
    task_id="table_2",
    job_name="glue_job_database_name_1",
    verbose=False,
    script_args={
        "--tbl_name": "table_2",
    },
    resume_glue_job_on_retry=True,
    retry_limit=3,
)
table_3 = GlueJobOperator(
    task_id="table_3",
    job_name="glue_job_database_name_1",
    verbose=False,
    script_args={
        "--tbl_name": "table_3",
    },
    resume_glue_job_on_retry=True,
    retry_limit=3,
)
```
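To make the concern concrete, here is a minimal sketch of the ambiguity. I have not checked how this PR actually stores or looks up the previous run ID, so this is only an illustration under assumptions: the helper `find_resumable_run`, the run IDs, and the set of states treated as "active" are all hypothetical, and the sample list merely mimics the shape of the `JobRuns` entries returned by boto3's Glue `get_job_runs` call.

```python
from typing import Optional

# Assumption: states treated here as "still in progress" for resume purposes.
ACTIVE_STATES = {"STARTING", "RUNNING", "WAITING"}


def find_resumable_run(job_runs: list[dict], script_args: dict[str, str]) -> Optional[str]:
    """Return the Id of the first active run whose Arguments include script_args.

    Hypothetical helper, not part of the Amazon provider.
    """
    for run in job_runs:
        if run.get("JobRunState") not in ACTIVE_STATES:
            continue  # skip runs that have already finished
        run_args = run.get("Arguments", {})
        # Every requested argument must match the run's recorded argument.
        if all(run_args.get(key) == value for key, value in script_args.items()):
            return run["Id"]
    return None


# Hypothetical snapshot: three parallel runs of the same Glue job.
runs = [
    {"Id": "jr_aaa", "JobRunState": "RUNNING", "Arguments": {"--tbl_name": "table_1"}},
    {"Id": "jr_bbb", "JobRunState": "RUNNING", "Arguments": {"--tbl_name": "table_2"}},
    {"Id": "jr_ccc", "JobRunState": "RUNNING", "Arguments": {"--tbl_name": "table_3"}},
]

# Matching on job name alone (no argument filter) picks another task's run.
by_name_only = find_resumable_run(runs, {})
# Matching on the task's own script_args picks the intended run.
by_arguments = find_resumable_run(runs, {"--tbl_name": "table_3"})
```

In this sketch, a lookup keyed only on the job name would hand `table_3` the run started by `table_1`, while a lookup that also considers the task's own arguments (or a run ID stored per task instance) would not.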
Thanks, and let me know if you need further clarification.