[
https://issues.apache.org/jira/browse/AIRFLOW-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kaxil Naik resolved AIRFLOW-2769.
---------------------------------
Resolution: Fixed
Fix Version/s: 2.0.0
Issue resolved by pull request #3617
[https://github.com/apache/incubator-airflow/pull/3617]
> Increase num_retries polling value on Dataflow hook
> ---------------------------------------------------
>
> Key: AIRFLOW-2769
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2769
> Project: Apache Airflow
> Issue Type: Bug
> Components: contrib, Dataflow
> Affects Versions: 1.10
> Reporter: Paul Woods
> Priority: Minor
> Fix For: 2.0.0
>
>
> *Problem Description*
> When Airflow launches a job in Dataflow, it polls the GCP API for job status
> until the job completes or fails. The GCP API occasionally returns 500 and
> 429 errors on these requests, which causes the Airflow task to fail
> intermittently, particularly for long-running jobs, even though the Dataflow
> job itself does not terminate.
> The recommended action is to retry the request with exponential backoff
> ([https://developers.google.com/drive/api/v3/handle-errors]). The GCP API
> client supports this via the `num_retries` parameter on `execute()`, but that
> parameter is not used in
> {code:java}
> airflow.contrib.hooks.gcp_dataflow_hook{code}
> *Proposed Solution*
> Add num_retries to the execute() calls in
> {code:java}
> _DataflowJob._get_job_id_from_name{code}
> and
> {code:java}
> _DataflowJob._get_job{code}
>
> *NOTE:* the same problem was addressed for Dataproc in
> ([https://issues.apache.org/jira/browse/AIRFLOW-1718])
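The effect of the proposed fix can be sketched as follows. This is an illustrative simplification, not the actual hook code: `FakeJobsResource`, `FakeRequest`, and `get_job` are hypothetical stand-ins for the googleapiclient resource and for `_DataflowJob._get_job`, showing how threading `num_retries` into `execute()` turns transient 500/429 responses into retries instead of task failures.

```python
class TransientError(Exception):
    """Stands in for an HTTP 500/429 response from the Dataflow API."""


class FakeRequest:
    """Mimics a googleapiclient request whose execute() accepts num_retries."""

    def __init__(self, responses):
        self._responses = responses  # iterator of exceptions or job dicts

    def execute(self, num_retries=0):
        # Like the real client: retry transient failures up to num_retries
        # times (the real client also sleeps with exponential backoff).
        for attempt in range(num_retries + 1):
            result = next(self._responses)
            if isinstance(result, Exception):
                if attempt == num_retries:
                    raise result
                continue  # transient error, try again
            return result


class FakeJobsResource:
    """Stand-in for dataflow.projects().jobs()."""

    def __init__(self, responses):
        self._responses = iter(responses)

    def get(self, projectId, jobId):
        return FakeRequest(self._responses)


def get_job(jobs_resource, project_id, job_id, num_retries=5):
    # Analogue of _DataflowJob._get_job with the proposed num_retries
    # forwarded to execute().
    return jobs_resource.get(projectId=project_id, jobId=job_id).execute(
        num_retries=num_retries
    )
```

With `num_retries=0` (the current behaviour), a single 500/429 fails the task; with a positive value, polling survives intermittent API errors while the Dataflow job keeps running.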
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)