nathadfield opened a new issue, #32093:
URL: https://github.com/apache/airflow/issues/32093
### Apache Airflow version
2.6.2
### What happened
When using the `GCSToBigQueryOperator` in deferrable mode with an
`impersonation_chain` service account whose default project_id is
different from the project_id specified in the operator arguments, the task
fails with a 404 when it looks up the load job.
```
[2023-06-23, 11:38:37 UTC] {taskinstance.py:1824} ERROR - Task failed with
exception
Traceback (most recent call last):
File
"/usr/local/lib/python3.10/site-packages/airflow/providers/google/cloud/transfers/gcs_to_bigquery.py",
line 447, in execute_complete
raise AirflowException(event["message"])
airflow.exceptions.AirflowException: 404, message='Not Found: {\n "error":
{\n "code": 404,\n "message": "Not found: Job
king-cdmr-etl-sandbox:airflow_apptweak_king_itunes_connect_channels_load_active_devices_to_bq_2023_06_22T07_00_00_00_00_4842808969d21632ecbb76ffca48aabd",\n
"errors": [\n {\n "message": "Not found: Job
king-cdmr-etl-sandbox:airflow_apptweak_king_itunes_connect_channels_load_active_devices_to_bq_2023_06_22T07_00_00_00_00_4842808969d21632ecbb76ffca48aabd",\n
"domain": "global",\n "reason": "notFound"\n }\n ],\n
"status": "NOT_FOUND"\n }\n}\n',
url=URL('https://www.googleapis.com/bigquery/v2/projects/king-cdmr-etl-sandbox/jobs/airflow_apptweak_king_itunes_connect_channels_load_active_devices_to_bq_2023_06_22T07_00_00_00_00_4842808969d21632ecbb76ffca48aabd')
```
I believe this happens because, although the BigQuery job to insert data is
submitted against `self.project_id` in
[_submit_job](https://github.com/apache/airflow/blob/main/airflow/providers/google/cloud/transfers/gcs_to_bigquery.py#L303),
in deferrable mode the operator then looks for the job in the project given by
[self.hook.project_id](https://github.com/apache/airflow/blob/main/airflow/providers/google/cloud/transfers/gcs_to_bigquery.py#L432).
The default project_id assigned to the impersonation chain service account
can be different from the project_id passed to the operator.
The error above says that it cannot find the job_id
`airflow_apptweak_king_itunes_connect_channels_load_active_devices_to_bq_2023_06_22T07_00_00_00_00_4842808969d21632ecbb76ffca48aabd`
in the project `king-cdmr-etl-sandbox`.
In fact, this job was created successfully in the project
`king-coredatasets-sandbox`:
<img width="604" alt="Screenshot 2023-06-23 at 12 40 39"
src="https://github.com/apache/airflow/assets/967119/488ea8e2-e447-46b1-814e-419402639a76">
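To make the mismatch concrete, here is a small sketch of how the job lookup URL changes with the project. The URL shape is taken from the traceback above; `job_url` and the shortened job id are illustrative names, not part of the provider.

```python
# Illustrative only: job_url is a hypothetical helper, not an Airflow function.
BASE = "https://www.googleapis.com/bigquery/v2"


def job_url(project_id: str, job_id: str) -> str:
    """Build the jobs.get URL in the shape seen in the traceback."""
    return f"{BASE}/projects/{project_id}/jobs/{job_id}"


job_id = "airflow_example_load_job"  # placeholder for the real (long) job id

# Project the job was actually created in (the operator's project_id).
created_in = "king-coredatasets-sandbox"
# Project the deferred lookup used (the credential's default project).
looked_up_in = "king-cdmr-etl-sandbox"

print(job_url(created_in, job_id))    # where the job lives
print(job_url(looked_up_in, job_id))  # where it was searched for, hence the 404
```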
### What you think should happen instead
I think the call to `self.defer` should be changed to pass
`self.project_id` rather than `self.hook.project_id`.
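A rough sketch of the idea, using stand-in classes rather than the real operator and hook. `StubHook`, `StubOperator`, and both method names are hypothetical, and the fallback to the hook's project when no project_id is given is my assumption, not something confirmed from the provider code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubHook:
    """Stand-in for the BigQuery hook; project_id is the credential's default."""
    project_id: str


@dataclass
class StubOperator:
    """Stand-in for the operator; only the project selection is modelled."""
    hook: StubHook
    project_id: Optional[str] = None

    def trigger_project_current(self) -> str:
        # Behaviour described in this issue: the deferred lookup uses the
        # hook's default project, ignoring the operator argument.
        return self.hook.project_id

    def trigger_project_proposed(self) -> str:
        # Proposed behaviour: use the operator's project_id.
        # Falling back to the hook's project is an assumption of this sketch.
        return self.project_id or self.hook.project_id


op = StubOperator(
    hook=StubHook(project_id="king-cdmr-etl-sandbox"),
    project_id="king-coredatasets-sandbox",
)
print(op.trigger_project_current())   # project the lookup uses today
print(op.trigger_project_proposed())  # project the job was submitted to
```

With the proposed selection, the deferred lookup would target the same project the job was submitted to.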
### How to reproduce
I haven't quite got the exact steps to reproduce but I will submit a PR for
review soon.
### Operating System
Debian GNU/Linux 11 (bullseye)
### Versions of Apache Airflow Providers
apache-airflow-providers-google==10.0.0
### Deployment
Astronomer
### Deployment details
_No response_
### Anything else
_No response_
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)