e-galan commented on code in PR #38716:
URL: https://github.com/apache/airflow/pull/38716#discussion_r1555830278
##########
airflow/providers/apache/beam/operators/beam.py:
##########
@@ -364,11 +364,13 @@ def execute(self, context: Context):
def execute_sync(self, context: Context):
with ExitStack() as exit_stack:
- gcs_hook = GCSHook(gcp_conn_id=self.gcp_conn_id)
if self.py_file.lower().startswith("gs://"):
+ gcs_hook = GCSHook(gcp_conn_id=self.gcp_conn_id)
tmp_gcs_file =
exit_stack.enter_context(gcs_hook.provide_file(object_url=self.py_file))
self.py_file = tmp_gcs_file.name
if self.snake_case_pipeline_options.get("requirements_file",
"").startswith("gs://"):
+ if 'gcs_hook' not in locals():
Review Comment:
I don't think that this check is necessary, there won't be much harm in
re-instantiating the hook.
If you'd like to avoid re-instantiating it, then you could do the following:
add `gcs_hook` as another class instance attribute (with `None` as the default
value) and set it with a dedicated method. You could even include checks such
as `if self.snake_case_pipeline_options.get("requirements_file",
"").startswith("gs://")` and `if self.py_file.lower().startswith("gs://")` into
the method. This will require additional unit tests though.
##########
airflow/providers/apache/beam/operators/beam.py:
##########
@@ -364,11 +364,13 @@ def execute(self, context: Context):
def execute_sync(self, context: Context):
with ExitStack() as exit_stack:
- gcs_hook = GCSHook(gcp_conn_id=self.gcp_conn_id)
if self.py_file.lower().startswith("gs://"):
+ gcs_hook = GCSHook(gcp_conn_id=self.gcp_conn_id)
tmp_gcs_file =
exit_stack.enter_context(gcs_hook.provide_file(object_url=self.py_file))
self.py_file = tmp_gcs_file.name
if self.snake_case_pipeline_options.get("requirements_file",
"").startswith("gs://"):
+ if 'gcs_hook' not in locals():
Review Comment:
@zstrathe I don't think that this check is necessary, there won't be much
harm in re-instantiating the hook.
If you'd like to avoid re-instantiating it, then you could do the following:
add `gcs_hook` as another class instance attribute (with `None` as the default
value) and set it with a dedicated method. You could even include checks such
as `if self.snake_case_pipeline_options.get("requirements_file",
"").startswith("gs://")` and `if self.py_file.lower().startswith("gs://")` into
the method. This will require additional unit tests though.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]