zstrathe commented on code in PR #38716:
URL: https://github.com/apache/airflow/pull/38716#discussion_r1550297987
##########
airflow/providers/apache/beam/operators/beam.py:
##########
@@ -364,11 +364,13 @@ def execute(self, context: Context):
def execute_sync(self, context: Context):
with ExitStack() as exit_stack:
- gcs_hook = GCSHook(gcp_conn_id=self.gcp_conn_id)
if self.py_file.lower().startswith("gs://"):
+ gcs_hook = GCSHook(gcp_conn_id=self.gcp_conn_id)
Review Comment:
Hello, thank you for the quick response to this PR. My use case is only local
development (which is why I'm not using Google Cloud at all), so please don't
consider this a high priority to review.
The reason for the change: when I run the Beam pipeline entirely from storage
local to Airflow (in my case, the pipeline .py file and any other pipeline
assets live in the same Docker container as the Airflow worker), the connection
to Google Cloud Storage (GCSHook) is not needed at all. My assumption is that
attempting to establish that unused connection is what raises the
AirflowNotFoundException.
That said, I'm new to both Airflow and Beam, so I can't say for certain that
this is actually the root cause of the error. If I'm wrong, I sincerely
apologize for wasting your time!
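To illustrate the intent of the diff, here is a minimal standalone sketch of the
deferred-construction pattern: the hook is only built when the pipeline file is
a `gs://` URI, so a purely local run never touches a GCP connection.
`FakeGCSHook` and `resolve_py_file` are hypothetical stand-ins, not Airflow
APIs; the real `GCSHook` would need a configured connection to instantiate.

```python
from contextlib import ExitStack


class FakeGCSHook:
    """Stand-in for airflow's GCSHook, which requires a configured
    GCP connection to instantiate (hence the AirflowNotFoundException
    when no such connection exists)."""

    def __init__(self, gcp_conn_id: str):
        self.gcp_conn_id = gcp_conn_id


def resolve_py_file(py_file: str, gcp_conn_id: str = "google_cloud_default"):
    """Build the GCS hook only when the pipeline file is remote,
    mirroring the change proposed in the diff above."""
    with ExitStack() as exit_stack:
        if py_file.lower().startswith("gs://"):
            # Remote file: a GCS connection is genuinely needed here.
            gcs_hook = FakeGCSHook(gcp_conn_id=gcp_conn_id)
            return "remote", gcs_hook
        # Local file: no GCS connection is needed, so no hook is created
        # and no connection lookup can fail.
        return "local", None
```

With this shape, `resolve_py_file("/opt/airflow/dags/pipeline.py")` never
constructs a hook, while a `gs://` path still goes through the GCS code path.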
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]