Hi,

I am tracing a bug in one of our data pipelines and have narrowed it down to a
small number of events missing from a table (using Airflow 1.8.2).
When I ran the query that Airflow executed myself, interactively, the result
contained the missing entry. When Airflow executed the same query and wrote the
results to a partitioned table in BQ, the entry was missing from that
destination table.
I’ve tried different scenarios several times now, and the only explanation I
can come up with is that Airflow _might_ not fully support partitioned tables,
or that there is some weird bug in the bigquery-python implementation.

When I delete the table, recreate it, and reload the complete data with
Airflow, the data is still missing. When I reload a single day, it is also
missing. I’ve created a Python script that executes the exact same query, and
it works as expected.
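
For reference, the comparison between the two result sets can be sketched
roughly as below. This is only an illustration, assuming both result sets have
been exported as lists of dicts; the `event_id` key, the `missing_rows` helper,
and the sample rows are hypothetical and not taken from the actual pipeline.

```python
def missing_rows(expected_rows, actual_rows, key="event_id"):
    """Return rows found in expected_rows but absent from actual_rows.

    Rows are matched on a single key column (hypothetical name: event_id).
    """
    actual_keys = {row[key] for row in actual_rows}
    return [row for row in expected_rows if row[key] not in actual_keys]


# Hypothetical sample data standing in for the two query results.
interactive_result = [
    {"event_id": 1, "day": "2017-10-01"},
    {"event_id": 2, "day": "2017-10-01"},
    {"event_id": 3, "day": "2017-10-02"},
]
destination_table = [
    {"event_id": 1, "day": "2017-10-01"},
    {"event_id": 3, "day": "2017-10-02"},
]

for row in missing_rows(interactive_result, destination_table):
    print("missing from destination:", row)
```

Running this against real exports of both result sets would at least show
whether the missing rows share a day or some other common property.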

Any advice on how to track this down further? Is this a known issue?

Best,
Tobias

