Berislav Lopac created AIRFLOW-2772:
---------------------------------------

             Summary: BigQuery hook does not allow specifying both the 
partition field name and table name at the same time
                 Key: AIRFLOW-2772
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2772
             Project: Apache Airflow
          Issue Type: Bug
          Components: hooks
            Reporter: Berislav Lopac


When creating a load job for a single partition of a BigQuery partitioned 
table, it is possible to specify either the table name with the partition (e.g. 
{{dataset_name.table_name$partition_id}}) or the field used for the partition 
(e.g. {{time_partitioning=\{"field": "field_name"\}}}), but not both.

This is the code that raises the exception, at the very end of 
{{contrib/hooks/bigquery_hook.py}}:

{code}
        assert not time_partitioning_in.get('field'), (
            "Cannot specify field partition and partition name "
            "(dataset.table$partition) at the same time"
        )
{code}
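
For reference, a check of this kind can also be written as an explicit exception, which still fires when Python runs with {{-O}} (assert statements are stripped in that mode). This is only a minimal sketch: the function name {{check_partition_args}} and the {{'$' in destination_table}} condition are assumptions for illustration, not the hook's actual code.

```python
def check_partition_args(destination_table, time_partitioning=None):
    """Hypothetical stand-in for the check quoted above.

    Raises ValueError instead of relying on assert, so the check is not
    silently skipped under `python -O`.
    """
    # Assumption: a '$' in the table name marks a partition decorator.
    has_decorator = '$' in destination_table
    has_field = bool((time_partitioning or {}).get('field'))
    if has_decorator and has_field:
        raise ValueError(
            "Cannot specify field partition and partition name "
            "(dataset.table$partition) at the same time"
        )
```

Calling {{check_partition_args("dataset_name.table_name$partition_id", {"field": "field_name"})}} raises {{ValueError}}; passing only one of the two returns without error.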

My first concern is the use of {{assert}} for flow control, but more 
importantly, it is not clear what the rationale is for this check, or why an 
error is raised when both are defined. The code works well if we provide just 
the partition field specification, but passing only the partitioned table name 
results in the following BQ error:

{code}Incompatible table partitioning specification. Expects partitioning 
specification interval(type:day,field:local_event_start_date), but input 
partitioning specification is interval(type:day){code}

which implies that sending both should be perfectly fine.
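
To make the idea concrete, here is a rough sketch of what "sending both" could look like when building the partitioning options. The helper name {{merge_time_partitioning}}, the partition id {{20180718}}, and the defaulting of {{type}} to {{DAY}} for decorated table names are all assumptions for illustration; whether BigQuery accepts the combined specification is exactly the open question here.

```python
def merge_time_partitioning(destination_table, time_partitioning=None):
    """Hypothetical helper: keep the partition decorator in the table name
    and still pass the partitioning field along, instead of rejecting it.
    """
    merged = {}
    # Assumption: a '$' in the table name means a day-partitioned target.
    if '$' in destination_table:
        merged['type'] = 'DAY'
    # User-supplied options (including 'field') are kept, not rejected.
    merged.update(time_partitioning or {})
    return merged


options = merge_time_partitioning(
    "dataset_name.table_name$20180718",   # made-up partition id
    {"field": "local_event_start_date"},  # field named in the error above
)
print(options)  # {'type': 'DAY', 'field': 'local_event_start_date'}
```

The resulting options carry both the type and the field, which is the shape the error message says it expects.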

Can anyone provide any insight?




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)