EugeneYushin commented on code in PR #45610:
URL: https://github.com/apache/airflow/pull/45610#discussion_r1918462144
##########
providers/src/airflow/providers/google/cloud/operators/bigquery.py:
##########
@@ -1717,12 +1717,14 @@ def execute(self, context: Context) -> None:
"tableId": table_id,
},
"labels": self.labels,
- "schema": {"fields": schema_fields},
"externalDataConfiguration": external_data_configuration,
"location": self.location,
"encryptionConfiguration": self.encryption_configuration,
}
+ if self.schema_fields:
Review Comment:
@VladaZakharova I'm using the following DAG for e2e tests locally:
```python
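from airflow.providers.google.cloud.operators.bigquery import (
    BigQueryCreateExternalTableOperator,
)

# GCS_BUCKET and PROJECT_ID are assumed to be defined elsewhere in the DAG file.
bq_op = BigQueryCreateExternalTableOperator(
    gcp_conn_id="gcp_pp",
    task_id="bq_op",
    bucket=GCS_BUCKET,
    source_objects=["test.csv"],
    destination_project_dataset_table=f"{PROJECT_ID}.user_eyushin.airflow_test_orig",
    # destination_project_dataset_table=f"{PROJECT_ID}.user_eyushin.airflow_test_new",
    source_format="CSV",
    autodetect=True,  # let BigQuery infer the schema from the CSV
    skip_leading_rows=1,  # skip the CSV header row
    location="us-east4",
)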
```
Original behavior:
<img width="747" alt="image"
src="https://github.com/user-attachments/assets/0b3136bd-df99-4ce4-9000-783af326de24"
/>
<img width="637" alt="Screenshot 2025-01-16 at 15 48 03"
src="https://github.com/user-attachments/assets/5199bdd8-1026-4cc0-9218-7e5cbb46856b"
/>
Adjusted behavior:
<img width="847" alt="image"
src="https://github.com/user-attachments/assets/ac3db8e0-6a6f-4d1a-8080-85e1fa7a0aab"
/>
<img width="727" alt="Screenshot 2025-01-16 at 15 49 11"
src="https://github.com/user-attachments/assets/d7385ccf-ed56-4c37-b34f-4d66f99209b5"
/>
<img width="1035" alt="image"
src="https://github.com/user-attachments/assets/864147bc-42b3-4b05-a606-fd4d5318494c"
/>
In other words, sending an empty schema to GCP forces BQ to bypass its
schema autodetection.
You can also check https://github.com/apache/airflow/issues/45512 for some
context about the issue.
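To illustrate the idea behind the change, here is a minimal sketch rather than the operator's actual code. The helper name `build_table_resource` and its trimmed-down arguments are hypothetical; the real `execute` method builds a larger dict. The sketch only shows the `"schema"` key being added when fields were actually supplied:

```python
from typing import Any, Optional


def build_table_resource(
    table_id: str,
    schema_fields: Optional[list],
    external_data_configuration: dict,
) -> dict:
    """Hypothetical helper: build a table resource, adding "schema" only if set."""
    table_resource: dict[str, Any] = {
        "tableReference": {"tableId": table_id},
        "externalDataConfiguration": external_data_configuration,
    }
    # None or an empty list leaves the "schema" key out of the payload entirely,
    # instead of sending {"fields": None} or {"fields": []}.
    if schema_fields:
        table_resource["schema"] = {"fields": schema_fields}
    return table_resource


# No schema supplied: the payload carries no "schema" key at all.
autodetected = build_table_resource(
    "airflow_test_new", None, {"autodetect": True, "sourceFormat": "CSV"}
)

# Explicit schema supplied: it is passed through unchanged.
explicit = build_table_resource(
    "airflow_test_new",
    [{"name": "id", "type": "INTEGER"}],
    {"sourceFormat": "CSV"},
)
```

With this shape, an explicit `schema_fields` still wins, while omitting it leaves the request free of a schema so that `autodetect=True` can take effect.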
I'll be happy to iterate on that if you have other questions.