I’ve created a table containing only the missing value, in the exact same 
partition, and that write goes through. Could the volume of the data play a 
role, or maybe the client libraries? 
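
To pin down exactly which rows get lost, the interactive result could be 
diffed against the destination partition. This is only a minimal sketch: it 
assumes both result sets have already been fetched as lists of dicts, and 
`event_id`, `events`, `missing_rows` and `partition_decorator` are 
hypothetical names rather than anything from our pipeline. The 
`table$YYYYMMDD` form is BigQuery's partition decorator.

```python
import datetime


def missing_rows(expected, actual, key="event_id"):
    """Return rows of `expected` whose `key` does not occur in `actual`."""
    actual_keys = {row[key] for row in actual}
    return [row for row in expected if row[key] not in actual_keys]


def partition_decorator(table, day):
    """Build a partition decorator such as `events$20170927`."""
    return "{}${}".format(table, day.strftime("%Y%m%d"))


# Hypothetical sample data standing in for the two query results.
interactive = [{"event_id": 1}, {"event_id": 2}, {"event_id": 3}]
destination = [{"event_id": 1}, {"event_id": 3}]

print(missing_rows(interactive, destination))
print(partition_decorator("events", datetime.date(2017, 9, 27)))
```

Running the same diff for a single-day load and for a full reload would show 
whether the same rows are dropped each time.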

On 27.09.2017, 17:46, "Tobias Feldhaus" <[email protected]> wrote:

    Hi,
    
    
    I am tracing a bug in one of our data pipelines (using Airflow 1.8.2) and
    have narrowed it down to a small number of events missing from a table.
    When I ran the query that Airflow executed myself, interactively, I saw
    the missing entry. When Airflow executed the same query and wrote the
    results to a partitioned table in BQ, the entry was missing from that
    destination table.
    I have now tried different scenarios several times, and the only
    explanation I can come up with is that Airflow _might_ not fully support
    partitioned tables, or that there is some weird bug in the
    bigquery-python implementation.
    
    When I delete the table, recreate it, and reload the complete data with
    Airflow, the entry is still missing. When I reload a single day, it is
    also missing. I’ve written a Python script that executes the exact same
    query, and it works as expected.
    
    Any advice on how to track this down further? Is this a known issue?
    
    Best,
    Tobias
    
    
    
