bjoernpollex-sc commented on a change in pull request #4607: WIP: 
[AIRFLOW-1894] Google cloud bigquery
URL: https://github.com/apache/airflow/pull/4607#discussion_r264128537
 
 

 ##########
 File path: airflow/contrib/hooks/bigquery_hook.py
 ##########
 @@ -119,113 +109,37 @@ def get_pandas_df(self, sql, parameters=None, 
dialect=None):
                         verbose=False,
                         private_key=private_key)
 
-    def table_exists(self, project_id, dataset_id, table_id):
+    def table_exists(self, dataset_id, table_id, project_id=None):
 
 Review comment:
   Hi, thanks for putting this together, this is great stuff!
   
   Regarding these convenience methods, I'd argue that it would be better if 
the functionality would be implemented on the connection object returned by 
`get_conn`, and the here we just delegate, like this:
   ```python
   def table_exists(self, dataset_id, table_id, project_id=None):
       return self.get_conn().table_exists(dataset_id, table_id, project_id)
   ```
   The reason is that sometimes it is very useful to have explicit control over 
when and where connections get created, for example when using multi-threading 
for I/O optimization. We've also had code that spuriously crashed because it 
was creating too many new connections (implicitly in such convenience methods), 
and at some point the authentication requests hit a rate-limit.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

Reply via email to