mfjackson opened a new issue, #29301:
URL: https://github.com/apache/airflow/issues/29301
### Apache Airflow Provider(s)
google
### Versions of Apache Airflow Providers
I'm using `apache-airflow-providers-google==8.2.0`, but the relevant code that
causes this behavior appears to be unchanged as of `8.8.0`.
### Apache Airflow version
2.3.2
### Operating System
Debian (from Docker image `apache/airflow:2.3.2-python3.10`)
### Deployment
Official Apache Airflow Helm Chart
### Deployment details
Deployed on an EKS cluster via Helm.
### What happened
The first task in one of my DAGs is to create an empty BigQuery table using
the `BigQueryCreateEmptyTableOperator` as follows:
```python
from airflow.providers.google.cloud.operators.bigquery import (
    BigQueryCreateEmptyTableOperator,
)

create_staging_table = BigQueryCreateEmptyTableOperator(
    task_id="create_staging_table",
    dataset_id="my_dataset",
    table_id="tmp_table",
    schema_fields=[
        {"name": "field_1", "type": "TIMESTAMP", "mode": "NULLABLE"},
        {"name": "field_2", "type": "INTEGER", "mode": "NULLABLE"},
        {"name": "field_3", "type": "INTEGER", "mode": "NULLABLE"},
    ],
    exists_ok=False,
)
```
Note that `exists_ok=False` is set explicitly here, but it is also the default
value.
This task exits with a `SUCCESS` status even when `my_dataset.tmp_table`
already exists in the given BigQuery project. The task returns the following logs:
```
[2023-02-02, 05:52:29 UTC] {bigquery.py:875} INFO - Creating table
[2023-02-02, 05:52:29 UTC] {bigquery.py:901} INFO - Table
my_dataset.tmp_table already exists.
[2023-02-02, 05:52:30 UTC] {taskinstance.py:1395} INFO - Marking task as
SUCCESS. dag_id=my_fake_dag, task_id=create_staging_table,
execution_date=20230202T044000, start_date=20230202T055229,
end_date=20230202T055230
[2023-02-02, 05:52:30 UTC] {local_task_job.py:156} INFO - Task exited with
return code 0
```
### What you think should happen instead
Setting `exists_ok=False` should raise an exception and exit the task with a
`FAILED` status if the table being created already exists in BigQuery.
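For context, the underlying `google-cloud-bigquery` client already raises this
exception itself, so the information is available to the operator. A minimal
standalone sketch (the project/dataset/table IDs are placeholders, and it
assumes `my_dataset.tmp_table` already exists):
```python
from google.api_core.exceptions import Conflict
from google.cloud import bigquery

client = bigquery.Client()
# Placeholder full table ID; the table is assumed to already exist.
table = bigquery.Table("my-project.my_dataset.tmp_table")

try:
    client.create_table(table, exists_ok=False)  # exists_ok=False is the default
except Conflict:
    # This is the exception I'd expect the operator to surface rather than
    # swallow and log.
    print("Conflict raised: table already exists")
```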
### How to reproduce
1. Deploy Airflow 2.3.2 running Python 3.10 in some capacity
2. Ensure `apache-airflow-providers-google==8.2.0` (or `8.8.0`; I don't
believe the issue has been fixed there) is installed on the deployment.
3. Set up a GCP project and create a BigQuery dataset.
4. Create an empty BigQuery table with a schema.
5. Create a DAG that uses the `BigQueryCreateEmptyTableOperator` with
`exists_ok=False` to create a table with the same dataset and table IDs as the
table from Step 4 (a minimal DAG sketch follows this list).
6. Run the DAG from Step 5 on the Airflow instance deployed in Step 1.
7. Observe the task's status.
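For reference, here is a minimal DAG along the lines of what I'm running (the
`dag_id`, dataset, and table names are placeholders, and the schema is
abbreviated to a single field):
```python
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import (
    BigQueryCreateEmptyTableOperator,
)

with DAG(
    dag_id="bq_exists_ok_repro",  # placeholder name
    start_date=datetime(2023, 2, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    # my_dataset.tmp_table must already exist (Step 4) to trigger the bug.
    BigQueryCreateEmptyTableOperator(
        task_id="create_staging_table",
        dataset_id="my_dataset",
        table_id="tmp_table",
        schema_fields=[
            {"name": "field_1", "type": "TIMESTAMP", "mode": "NULLABLE"},
        ],
        exists_ok=False,  # I'd expect this to fail the task, but it succeeds
    )
```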
### Anything else
I believe the silent failure may be occurring
[here](https://github.com/apache/airflow/blob/main/airflow/providers/google/cloud/operators/bigquery.py#L1377):
the `except` clause only logs a message; it neither re-raises the exception
nor sets any state that would cause the task to fail.
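If so, a rough sketch of the change I have in mind, keying the `except` branch
off `exists_ok` (simplified from the operator's `execute()`; the surrounding
code and exact attribute names may differ across provider versions):
```python
from google.api_core.exceptions import Conflict

# Inside BigQueryCreateEmptyTableOperator.execute(), simplified sketch:
try:
    self.log.info("Creating table")
    table = bq_hook.create_empty_table(...)  # elided; unchanged from today
except Conflict:
    if self.exists_ok:
        # Preserve the current behavior when duplicates are acceptable.
        self.log.info(
            "Table %s.%s already exists.", self.dataset_id, self.table_id
        )
    else:
        # Re-raise so the task is marked FAILED instead of silently succeeding.
        raise
```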
If this is in fact the case, I'd be happy to submit a PR, but I'd appreciate
any input on the error-handling standards/conventions that this provider
package maintains.
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)