mik-laj commented on a change in pull request #6180: [AIRFLOW-5549] Extended BQ
GetDataOperator to handle query params
URL: https://github.com/apache/airflow/pull/6180#discussion_r329704187
##########
File path: airflow/gcp/operators/bigquery.py
##########
@@ -242,9 +252,14 @@ class BigQueryGetDataOperator(BaseOperator):
:type dataset_id: str
:param table_id: The table ID of the requested table. (templated)
:type table_id: str
+ :param start_index: (Optional) Start row index of the table.
+ :type start_index: str
:param max_results: The maximum number of records (rows) to be fetched
from the table. (templated)
- :type max_results: str
+ :type max_results: int
+ :param page_token: (Optional) Page token, returned from a previous call,
Review comment:
This is an implementation detail of the library. It should live only in the
hook, and the hook method should always return the full range of data.
Otherwise, it will not be possible to migrate to a different library, e.g.
https://pypi.org/project/google-cloud-bigquery/
My team is working on migrating the integrations (where possible) to
google-cloud-python. This is part of the recommendations in the GCP
integration guide. We have also started retrofitting this operator:
https://github.com/PolideaInternal/airflow/issues/231
@TobKed Can you provide more detailed information about retrofitting the
integration with BigQuery?
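To illustrate the point about keeping pagination inside the hook, here is a
rough sketch. The names `get_all_rows` and `fetch_page` are hypothetical and
are not part of the existing hook or of any client library; the response shape
(a dict with `rows` and an optional `pageToken`) is likewise an assumption made
for the example. The idea is only that the hook consumes page tokens itself and
returns the complete result, so the operator never sees them:

```python
from typing import Callable, Dict, List, Optional

# Hypothetical callable: takes a page token (None for the first page) and
# returns a dict with "rows" and, if more data remains, "pageToken".
PageFetcher = Callable[[Optional[str]], Dict]


def get_all_rows(fetch_page: PageFetcher) -> List[list]:
    """Collect rows from every page so callers never handle page tokens."""
    rows: List[list] = []
    page_token: Optional[str] = None
    while True:
        response = fetch_page(page_token)
        rows.extend(response.get("rows", []))
        page_token = response.get("pageToken")
        if not page_token:
            # No further token means the last page has been read.
            break
    return rows


def fake_fetch_page(page_token: Optional[str]) -> Dict:
    """Stand-in for a real API call, used only to exercise the loop."""
    pages = {
        None: {"rows": [[1], [2]], "pageToken": "p2"},
        "p2": {"rows": [[3]]},
    }
    return pages[page_token]


all_rows = get_all_rows(fake_fetch_page)
```

With this shape, swapping the underlying library only means replacing the
function passed as `fetch_page`; the operator's parameters stay the same.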