AFAIK, google-api-python-client is not in maintenance mode. In fact, I believe the idiomatic Python library (google-cloud-python) is built on top of google-api-python-client. I have spoken with several Google Cloud PMs who pointed me at google-api-python-client as the canonical library to use, and the one that receives updates for new products first (before google-cloud-python).
On Wed, Sep 27, 2017 at 10:34 AM, Tobias Feldhaus <[email protected]> wrote:

> Sounds like a possible solution; however, to avoid hitting this problem
> I’ve deleted all the tables before rerunning stuff. I think it might have
> to do with the library. Airflow uses google-api-python-client, which is in
> maintenance mode, and Google suggests switching to google-cloud-python. I
> will write a PythonOperator DAG tomorrow and will then check DAG against
> DAG to see if the library could be the problem.
>
> On 27.09.2017, 19:15, "Chris Riccomini" <[email protected]> wrote:
>
> Is it possible that you were getting a cache hit with the BQ operator?
>
> https://cloud.google.com/bigquery/docs/cached-results#bigquery-query-cache-api
>
> The operator does not currently expose this flag, and I couldn't find
> whether the cache defaults to on or off for the insert-job API.
>
> On Wed, Sep 27, 2017 at 9:41 AM, Tobias Feldhaus <[email protected]> wrote:
>
> > I’ve created a table with only the missing value in the exact same
> > partition, and then it’s going through. Could it be that the volume of
> > the data plays a role, or the client libraries maybe?
> >
> > On 27.09.2017, 17:46, "Tobias Feldhaus" <[email protected]> wrote:
> >
> > Hi,
> >
> > I am tracing a bug in one of our data pipelines, and I have narrowed it
> > down to a small number of events not being in a table (using Airflow
> > 1.8.2). After interactively running the same query that Airflow
> > executed, I saw the missing entry. But when Airflow executed that query
> > and wrote the results to a partitioned table in BQ, the entry was
> > missing from the destination table.
> >
> > I’ve tried different scenarios several times now, and the only
> > explanation or difference I can come up with is that using partitioned
> > tables might not be fully supported, or that there is some weird bug in
> > the bigquery-python implementation.
> >
> > When deleting the table, recreating it, and reloading the complete date
> > with Airflow, the data is still missing. When reloading a single day, it
> > is also missing. I’ve created a Python script to execute the exact same
> > query, and it works as expected.
> >
> > Any advice on how to track this down further? Is this a known issue?
> >
> > Best,
> > Tobias
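[For anyone digging into the cache question raised above: below is a minimal sketch of what an insert-job request body with the cache explicitly disabled might look like, assuming the google-api-python-client style of passing a raw `body` dict to `jobs().insert(...)`. The project, dataset, and table names are made up for illustration; per the BigQuery REST docs, `useQueryCache` defaults to true when omitted.]

```python
def build_query_job_body(query, project, dataset, table, partition):
    """Build a hypothetical `body` dict for bigquery.jobs().insert(...).

    Illustrative only -- names and query are placeholders, not the
    actual pipeline discussed in this thread.
    """
    return {
        "configuration": {
            "query": {
                "query": query,
                "useLegacySql": False,
                # Explicitly bypass cached results; the REST API documents
                # this flag as defaulting to true when omitted.
                "useQueryCache": False,
                "destinationTable": {
                    "projectId": project,
                    "datasetId": dataset,
                    # The "$YYYYMMDD" decorator targets one partition of a
                    # day-partitioned table.
                    "tableId": "%s$%s" % (table, partition),
                },
                "writeDisposition": "WRITE_TRUNCATE",
            }
        }
    }


body = build_query_job_body(
    "SELECT event_id FROM my_dataset.events",
    "my-project",
    "my_dataset",
    "events_agg",
    "20170927",
)
# With google-api-python-client this would then be submitted roughly as:
#   service.jobs().insert(projectId="my-project", body=body).execute()
```

This would at least rule cached results in or out, since the BQ operator in Airflow 1.8.2 does not expose the flag.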
