VladaZakharova commented on issue #29307: URL: https://github.com/apache/airflow/issues/29307#issuecomment-1420954260
Hi Team!

After some investigation, it looks like this is not a problem with the hook implementation itself, but with the way it uses the Job object from the gcloud.aio.bigquery package. To retrieve a Job by its job_id correctly, the location must be passed. If the job runs in a location other than US and no location is given, you get a 404 error:

```
curl -H "Authorization: Bearer $ACCESS_TOKEN" "https://www.googleapis.com/bigquery/v2/projects/airflow-system-tests-303516/jobs/AAA_f9a0fb8126085cf06d9155cc320cbbfa"
{
  "error": {
    "code": 404,
    "message": "Not found: Job airflow-system-tests-303516.AAA_f9a0fb8126085cf06d9155cc320cbbfa",
    "errors": [
      {
        "message": "Not found: Job airflow-system-tests-303516.AAA_f9a0fb8126085cf06d9155cc320cbbfa",
        "domain": "global",
        "reason": "notFound"
      }
    ],
    "status": "NOT_FOUND"
  }
}
```

Specifying the correct location of the job in the URL solves the problem:

```
curl -H "Authorization: Bearer $ACCESS_TOKEN" "https://www.googleapis.com/bigquery/v2/projects/airflow-system-tests-303516/jobs/AAA_f9a0fb8126085cf06d9155cc320cbbfa?location=asia-south1"
{
  "kind": "bigquery#job",
  "etag": "7Ut/s/Jhqvgpbb8miRNbYg==",
  "id": "airflow-system-tests-303516:asia-south1.AAA_f9a0fb8126085cf06d9155cc320cbbfa",
  "selfLink": "https://www.googleapis.com/bigquery/v2/projects/airflow-system-tests-303516/jobs/AAA_f9a0fb8126085cf06d9155cc320cbbfa?location=asia-south1",
  "user_email": "vlada-system-t...@airflow-system-tests-303516.iam.gserviceaccount.com",
  "configuration": {
    "query": {...
```

The current implementation of the Job object's methods in the gcloud.aio.bigquery package doesn't support passing location as a parameter:
https://github.com/talkiq/gcloud-aio/blob/c64b370bdae5aa0bd72373d1e4ef1c4a4b55a3c7/bigquery/gcloud/aio/bigquery/job.py#L60

The sync implementation of this method for the QueryJob object, on the other hand, takes location as a parameter when constructing the URL for the call:
https://github.com/googleapis/python-bigquery/blob/beab7c2b27c27d8e824cbc66b290be8158da7abf/google/cloud/bigquery/client.py#L188

I have created an issue in the GitHub repo for the gcloud.aio.bigquery package with an example and reproduction steps, but it may take some time for them to address it on their side.

As a workaround, I can implement async methods for the BigQuery calls and stop using the Job object from the gcloud.aio.bigquery package. This will require changing all other classes that rely on the current BigQueryAsyncHook implementation calling Job object methods: BigQueryIntervalCheckTrigger, BigQueryValueCheckTrigger, BigQueryCheckTrigger, BigQueryGetDataTrigger, and all operators that use those triggers.

Please let me know which option you prefer, thanks :)
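For illustration, the difference between the two curl calls above can be sketched as a small URL builder. This is only a minimal sketch: `build_job_url` and `API_ROOT` are hypothetical names introduced here, not part of gcloud-aio, python-bigquery, or the Airflow provider. It only shows that the `location` query parameter is what distinguishes the failing request from the working one.

```python
from typing import Optional
from urllib.parse import quote, urlencode

# Base URL taken from the curl examples in this comment.
API_ROOT = "https://www.googleapis.com/bigquery/v2"


def build_job_url(project_id: str, job_id: str, location: Optional[str] = None) -> str:
    """Hypothetical helper: build the URL used to fetch a BigQuery job.

    When `location` is given, it is appended as a query parameter,
    matching the second curl call in this comment.
    """
    url = f"{API_ROOT}/projects/{quote(project_id, safe='')}/jobs/{quote(job_id, safe='')}"
    if location:
        # Without this parameter, jobs outside the default location return 404.
        url += "?" + urlencode({"location": location})
    return url


if __name__ == "__main__":
    # Placeholder project and job identifiers, for demonstration only.
    print(build_job_url("my-project", "my_job_id"))
    print(build_job_url("my-project", "my_job_id", location="asia-south1"))
```

Any async replacement for the Job methods would need to thread a location value through to the request in a similar way; how exactly that is wired into the hook and triggers is left open here.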
