Berislav Lopac created AIRFLOW-2772:
---------------------------------------
Summary: BigQuery hook does not allow specifying both the
partition field name and table name at the same time
Key: AIRFLOW-2772
URL: https://issues.apache.org/jira/browse/AIRFLOW-2772
Project: Apache Airflow
Issue Type: Bug
Components: hooks
Reporter: Berislav Lopac
When creating a load job for a single partition of a partitioned BigQuery
table, it is possible to specify either the table name with the partition (e.g.
{{dataset_name.table_name$partition_id}}) or the field used for partitioning
(e.g. {{time_partitioning=\{"field": "field_name"\}}}), but not both.
This is the code that raises the exception, at the very end of
{{contrib/hooks/bigquery_hook.py}}:
{code}
assert not time_partitioning_in.get('field'), (
    "Cannot specify field partition and partition name "
    "(dataset.table$partition) at the same time"
)
{code}
My first problem is the use of {{assert}} for flow control, but more
importantly, the rationale for this check is not clear: why is it an error if
both are defined? The code works well if we provide only the partition field
specification, but passing only the table name with the partition results in
the following BQ error:
{code}Incompatible table partitioning specification. Expects partitioning
specification interval(type:day,field:local_event_start_date), but input
partitioning specification is interval(type:day){code}
which implies that sending both should be perfectly fine.
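To make the request concrete, here is a minimal sketch of a load configuration that carries both the partition decorator and the partition field. The helper name {{build_load_config}} and the partition id {{20180718}} are hypothetical and not part of the hook; the {{destinationTable}} and {{timePartitioning}} keys follow the BigQuery load job configuration, with {{projectId}} omitted for brevity. Whether BigQuery accepts this combination is only what the error message above suggests, not something verified here.

```python
def build_load_config(destination, time_partitioning=None):
    """Build an illustrative BigQuery load-job configuration fragment.

    Hypothetical helper for this discussion; it is not the hook's code.
    """
    # Split "dataset.table$partition" into dataset and table parts,
    # keeping the "$partition" decorator attached to the table id.
    dataset_id, table_id = destination.split('.', 1)
    load = {
        'destinationTable': {'datasetId': dataset_id, 'tableId': table_id},
    }
    if time_partitioning:
        # Assume daily partitioning unless the caller overrides 'type';
        # keep any 'field' the caller supplied.
        load['timePartitioning'] = dict({'type': 'DAY'}, **time_partitioning)
    return {'load': load}


# Both the partition decorator and the partition field are present.
config = build_load_config(
    'dataset_name.table_name$20180718',
    time_partitioning={'field': 'local_event_start_date'},
)
```

With the current assertion in place, the hook refuses to build a configuration of this shape.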
Can anyone provide any insight?
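As a side note on the {{assert}} concern: assertions are removed when Python runs with {{-O}}, so the check silently disappears in optimized mode. If the check is kept at all, a sketch of the same validation as an explicit exception could look like the following. The function name {{check_partitioning}} and the choice of {{ValueError}} are assumptions made for illustration, not the hook's actual code.

```python
def check_partitioning(destination_table, time_partitioning):
    """Reject a partition decorator combined with a partition field.

    Hypothetical stand-in for the assert in bigquery_hook.py; unlike
    assert, this check still runs under "python -O".
    """
    has_partition_decorator = '$' in destination_table
    if has_partition_decorator and time_partitioning.get('field'):
        raise ValueError(
            "Cannot specify field partition and partition name "
            "(dataset.table$partition) at the same time"
        )


# Passes silently: only the partition field is given.
check_partitioning('dataset_name.table_name', {'field': 'field_name'})
```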