henry3260 commented on code in PR #59392:
URL: https://github.com/apache/airflow/pull/59392#discussion_r2621678059
##########
providers/amazon/src/airflow/providers/amazon/aws/operators/glue.py:
##########
@@ -217,13 +219,33 @@ def execute(self, context: Context):
:return: the current Glue job ID.
"""
- self.log.info(
- "Initializing AWS Glue Job: %s. Wait for completion: %s",
- self.job_name,
- self.wait_for_completion,
- )
- glue_job_run = self.hook.initialize_job(self.script_args, self.run_job_kwargs)
- self._job_run_id = glue_job_run["JobRunId"]
+ previous_job_run_id = None
+ if self.resume_glue_job_on_retry:
+ ti = context.get("ti")
+ if ti:
+ previous_job_run_id = ti.xcom_pull(key="glue_job_run_id",
task_ids=ti.task_id)
Review Comment:
You are absolutely right. I overlooked that Airflow clears XCom data upon
task retry.
Instead of using XCom, I plan to use the
[get_job_runs](https://docs.aws.amazon.com/glue/latest/webapi/API_GetJobRuns.html)
API to query AWS directly. This lets us retrieve the JobRunId of any active run
(filtering for RUNNING or STARTING states) and resume monitoring it instead of
starting a duplicate job. What do you think? Thanks for reviewing!
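
For reference, here is a rough sketch of the lookup I have in mind, not the final
implementation. It uses plain boto3 so it is self-contained; in the operator it
would go through the existing `GlueJobHook`, and the helper name is only
illustrative:

```python
from __future__ import annotations

import boto3


def find_active_job_run_id(job_name: str, region_name: str | None = None) -> str | None:
    """Return the Id of an in-flight run (STARTING/RUNNING) for job_name, if any."""
    glue = boto3.client("glue", region_name=region_name)
    # GetJobRuns is paginated, so walk the pages until an active run shows up.
    paginator = glue.get_paginator("get_job_runs")
    for page in paginator.paginate(JobName=job_name):
        for run in page.get("JobRuns", []):
            if run.get("JobRunState") in ("STARTING", "RUNNING"):
                return run["Id"]
    # No active run found: the operator would start a new job run as before.
    return None
```

If this returns an Id, `execute()` would set `self._job_run_id` to it and skip
`initialize_job`; otherwise it would start a new run exactly as it does today.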