e-galan commented on code in PR #38716: URL: https://github.com/apache/airflow/pull/38716#discussion_r1555830278
########## airflow/providers/apache/beam/operators/beam.py: ########## @@ -364,11 +364,13 @@ def execute(self, context: Context): def execute_sync(self, context: Context): with ExitStack() as exit_stack: - gcs_hook = GCSHook(gcp_conn_id=self.gcp_conn_id) if self.py_file.lower().startswith("gs://"): + gcs_hook = GCSHook(gcp_conn_id=self.gcp_conn_id) tmp_gcs_file = exit_stack.enter_context(gcs_hook.provide_file(object_url=self.py_file)) self.py_file = tmp_gcs_file.name if self.snake_case_pipeline_options.get("requirements_file", "").startswith("gs://"): + if 'gcs_hook' not in locals(): Review Comment: I don't think that this check is necessary, there won't be much harm in re-instantiating the hook. If you'd like to avoid re-instantiating it, then you could do the following: add `gcs_hook` as another class instance attribute (with `None` as the default value) and set it with a dedicated method. You could even include checks such as `if self.snake_case_pipeline_options.get("requirements_file", "").startswith("gs://")` and `if self.py_file.lower().startswith("gs://")` into the method. This will require additional unit tests though. ########## airflow/providers/apache/beam/operators/beam.py: ########## @@ -364,11 +364,13 @@ def execute(self, context: Context): def execute_sync(self, context: Context): with ExitStack() as exit_stack: - gcs_hook = GCSHook(gcp_conn_id=self.gcp_conn_id) if self.py_file.lower().startswith("gs://"): + gcs_hook = GCSHook(gcp_conn_id=self.gcp_conn_id) tmp_gcs_file = exit_stack.enter_context(gcs_hook.provide_file(object_url=self.py_file)) self.py_file = tmp_gcs_file.name if self.snake_case_pipeline_options.get("requirements_file", "").startswith("gs://"): + if 'gcs_hook' not in locals(): Review Comment: @zstrathe I don't think that this check is necessary, there won't be much harm in re-instantiating the hook. If you'd like to avoid re-instantiating it, then you could do the following: add `gcs_hook` as another class instance attribute (with `None` as the default value) and set it with a dedicated method. You could even include checks such as `if self.snake_case_pipeline_options.get("requirements_file", "").startswith("gs://")` and `if self.py_file.lower().startswith("gs://")` into the method. This will require additional unit tests though. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org