MichailParaskevopoulos opened a new issue #21600:
URL: https://github.com/apache/airflow/issues/21600
### Apache Airflow Provider(s)
google
### Versions of Apache Airflow Providers
apache-airflow-providers-google==6.3.0
### Apache Airflow version
2.1.1
### Operating System
Debian 10
### Deployment
Docker-Compose
### Deployment details
_No response_
### What happened
The operator
`airflow.providers.google.cloud.operators.bigquery.BigQueryCreateEmptyDatasetOperator`
fails when the GCP project ID can't be determined from the environment:
```
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1157, in _run_raw_task
    self._prepare_and_execute_task_with_callbacks(context, task)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1331, in _prepare_and_execute_task_with_callbacks
    result = self._execute_task(context, task_copy)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1361, in _execute_task
    result = task_copy.execute(context=context)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/google/cloud/operators/bigquery.py", line 1429, in execute
    bq_hook.create_empty_dataset(
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/google/common/hooks/base_google.py", line 425, in inner_wrapper
    return func(self, *args, **kwargs)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/google/cloud/hooks/bigquery.py", line 468, in create_empty_dataset
    self.get_client(location=location).create_dataset(dataset=dataset, exists_ok=exists_ok)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/google/cloud/hooks/bigquery.py", line 156, in get_client
    return Client(
  File "/home/airflow/.local/lib/python3.8/site-packages/google/cloud/bigquery/client.py", line 209, in __init__
    super(Client, self).__init__(
  File "/home/airflow/.local/lib/python3.8/site-packages/google/cloud/client.py", line 318, in __init__
    _ClientProjectMixin.__init__(self, project=project, credentials=credentials)
  File "/home/airflow/.local/lib/python3.8/site-packages/google/cloud/client.py", line 269, in __init__
    raise EnvironmentError(
OSError: Project was not passed and could not be determined from the environment.
```
I've tried the operator both by providing the `project_id` and
`dataset_id` and by providing a `dataset_reference`. I can see the expected
dataset name and project ID being printed in the logs during the hook's
execution, right before the `get_client` method is invoked.
When the `get_client` method is called from
`airflow.providers.google.cloud.hooks.bigquery.BigQueryHook.create_empty_dataset`,
the `project_id` is not passed to it, which I think is the root cause of the
error.
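
To illustrate the suspected failure mode, here is a minimal sketch that uses hypothetical stand-ins rather than the real Airflow or google-cloud code. `FakeClient`, `get_client`, and both `create_dataset_*` functions are invented for illustration, and the fallback to the `GOOGLE_CLOUD_PROJECT` environment variable is an assumption about how the project would otherwise be resolved. It only shows that a caller which knows the project ID but does not forward it ends up with the same `OSError`:

```python
import os
from typing import Optional


class FakeClient:
    """Hypothetical stand-in for a client that needs a project at construction."""

    def __init__(self, project: Optional[str] = None) -> None:
        # Assumption: fall back to an environment variable when no project is given.
        project = project or os.environ.get("GOOGLE_CLOUD_PROJECT")
        if project is None:
            raise OSError(
                "Project was not passed and could not be determined from the environment."
            )
        self.project = project


def get_client(project_id: Optional[str] = None) -> FakeClient:
    """Hypothetical hook helper that builds the client."""
    return FakeClient(project=project_id)


def create_dataset_without_project(project_id: str, dataset_id: str) -> str:
    # Suspected pattern: project_id is known here but never forwarded to get_client.
    client = get_client()
    return f"{client.project}.{dataset_id}"


def create_dataset_with_project(project_id: str, dataset_id: str) -> str:
    # Forwarding project_id means the client no longer depends on the environment.
    client = get_client(project_id=project_id)
    return f"{client.project}.{dataset_id}"
```

With no project set in the environment, `create_dataset_without_project("my-project", "my_dataset")` raises `OSError`, while `create_dataset_with_project` succeeds with the same arguments.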
### What you expected to happen
I expected the `project_id` to be passed to BQ's client from the arguments
that I provide in the
`airflow.providers.google.cloud.operators.bigquery.BigQueryCreateEmptyDatasetOperator`
operator.
### How to reproduce
Run `BigQueryCreateEmptyDatasetOperator` with an explicit `project_id` in an environment where the GCP project ID cannot be determined automatically.
### Anything else
_No response_
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)