potiuk commented on code in PR #24330:
URL: https://github.com/apache/airflow/pull/24330#discussion_r892600837


##########
airflow/providers/google/cloud/hooks/bigquery.py:
##########
@@ -1915,6 +1916,7 @@ def run_extract(
         :param print_header: Whether to print a header for a CSV file extract.
         :param labels: a dictionary containing labels for the job/query,
             passed to BigQuery
+        :param return_full_job: return full job instead of job id only

Review Comment:
   Technically speaking this functionality (returning job rather than job id) 
is already supported by the function that replaces this deprecated one.
   
   The function that  this one replaces already returns BigQueryJob rather than 
job id (I actually got the idea on how to fix it by looking at the new 
"insert_job" function. So if you are moving out from using this deprecated 
function it will only be easier to migrate, 
   
   But honestly i care more about being able to release new provider than 
worrying about this :). I actually looked at replacing this deprecated function 
and even started doing it, and I would just not attempt to fix it properly 
(i.e. replace the run_extract with insert_job - because you need to thoroughly 
test it (not only return, but also input parameters changes). The change I 
implemented can be simply released based on unit testing and review. 
   
   But this bug - (technically a regression) block from releasing Google 
Provider. So I think better question what alternative we have:
   
   * stop releasing the provider until someone fixes it "properly"? 
   * implement and it properly (i.e. replace it with `insert_job`). 
   
   If you think you - or anyone else - could fix and test it in few hours (I am 
about to prepare an RC2) - feel free :D. I am happy to merge it :)
   
   But if not - I will just proceed with releasing it.



##########
airflow/providers/google/cloud/hooks/bigquery.py:
##########
@@ -1915,6 +1916,7 @@ def run_extract(
         :param print_header: Whether to print a header for a CSV file extract.
         :param labels: a dictionary containing labels for the job/query,
             passed to BigQuery
+        :param return_full_job: return full job instead of job id only

Review Comment:
   Technically speaking this functionality (returning job rather than job id) 
is already supported by the function that replaces this deprecated one.
   
   The function that  this one replaces already returns BigQueryJob rather than 
job id (I actually got the idea on how to fix it by looking at the new 
"insert_job" function. So if you are moving out from using this deprecated 
function it will only be easier to migrate, 
   
   But honestly i care more about being able to release new provider than 
worrying about this :). I actually looked at replacing this deprecated function 
and even started doing it, and I would just not attempt to fix it properly 
(i.e. replace the run_extract with insert_job - because you need to thoroughly 
test it (not only return, but also input parameters changes). The change I 
implemented can be simply released based on unit testing and review. 
   
   But this bug - (technically a regression) block from releasing Google 
Provider. So I think better question what alternative we have:
   
   * stop releasing the provider until someone fixes it "properly"? 
   * implement  it properly and test now (i.e. replace it with `insert_job`). 
   
   If you think you - or anyone else - could fix and test it in few hours (I am 
about to prepare an RC2) - feel free :D. I am happy to merge it :)
   
   But if not - I will just proceed with releasing it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to