[ 
https://issues.apache.org/jira/browse/AIRFLOW-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik resolved AIRFLOW-2769.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 2.0.0

Issue resolved by pull request #3617
[https://github.com/apache/incubator-airflow/pull/3617]

> Increase num_retries polling value on Dataflow hook
> ---------------------------------------------------
>
>                 Key: AIRFLOW-2769
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2769
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: contrib, Dataflow
>    Affects Versions: 1.10
>            Reporter: Paul Woods
>            Priority: Minor
>             Fix For: 2.0.0
>
>
> *Problem Description*
> When Airflow launches a job in Dataflow, it polls the GCP API for job status 
> until the job completes or fails. The GCP API occasionally returns 500 and 
> 429 errors on these polling requests, which causes the Airflow task to fail 
> intermittently, particularly for long-running jobs, even though the Dataflow 
> job itself does not terminate.
> The recommended action is to retry the request with exponential backoff 
> ([https://developers.google.com/drive/api/v3/handle-errors]). The GCP API 
> client provides this behaviour via the `num_retries` parameter on execute(), 
> but that parameter is not used in
> {code:python}
> airflow.contrib.hooks.gcp_dataflow_hook{code}
> *Proposed Solution*
> Add num_retries to the execute() calls in 
> {code:python}
> _DataflowJob._get_job_id_from_name{code}
> and
> {code:python}
> _DataflowJob._get_job{code}
>  
> *NOTE:*  the same problem was addressed for Dataproc in 
> [https://issues.apache.org/jira/browse/AIRFLOW-1718]
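The retry-with-backoff behaviour the issue asks for can be sketched as below. This is a hypothetical illustration, not the actual hook code: in practice the fix is simply passing `num_retries` to the request's `execute()` call, and the google-api-python-client performs the equivalent backoff internally. All names here (`TransientApiError`, `execute_with_retries`, `fake_poll`) are invented for the example.

```python
import random
import time

# Transient HTTP statuses that Google APIs recommend retrying with
# exponential backoff (see the handle-errors guide linked above).
RETRYABLE_STATUSES = {429, 500, 503}


class TransientApiError(Exception):
    """Stand-in for a transient GCP API error (hypothetical name)."""

    def __init__(self, status):
        super().__init__("API returned %d" % status)
        self.status = status


def execute_with_retries(request, num_retries=5, base_delay=1.0):
    """Call request() up to num_retries + 1 times with exponential backoff.

    Roughly mirrors what execute(num_retries=...) does inside the
    google-api-python-client.
    """
    for attempt in range(num_retries + 1):
        try:
            return request()
        except TransientApiError as err:
            # Give up on non-retryable statuses or when retries are exhausted.
            if err.status not in RETRYABLE_STATUSES or attempt == num_retries:
                raise
            # Exponential backoff with jitter: ~1s, ~2s, ~4s, ...
            time.sleep(base_delay * (2 ** attempt + random.random()))


# Demo: a fake status poll that returns 500 twice before succeeding.
calls = {"n": 0}

def fake_poll():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientApiError(500)
    return {"currentState": "JOB_STATE_DONE"}

print(execute_with_retries(fake_poll, base_delay=0.0))
# prints {'currentState': 'JOB_STATE_DONE'} after two retried 500s
```

With `num_retries` wired through to `execute()`, an occasional 500/429 during polling no longer fails the task while the Dataflow job is still running.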



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
