[
https://issues.apache.org/jira/browse/AIRFLOW-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16551635#comment-16551635
]
ASF subversion and git services commented on AIRFLOW-2769:
----------------------------------------------------------
Commit 97d5176c06823e344a12494222156b48108588f2 in incubator-airflow's branch
refs/heads/master from [~pwoods25443]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=97d5176 ]
[AIRFLOW-2769] Increase num_retries polling value on Dataflow hook
Closes #3617 from pwoods25443/2769-dataflow-num-retries
> Increase num_retries polling value on Dataflow hook
> ---------------------------------------------------
>
> Key: AIRFLOW-2769
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2769
> Project: Apache Airflow
> Issue Type: Bug
> Components: contrib, Dataflow
> Affects Versions: 1.10
> Reporter: Paul Woods
> Priority: Minor
> Fix For: 2.0.0
>
>
> *Problem Description*
> When Airflow launches a job in Dataflow, it polls the GCP API for job status
> until the job completes or fails. The GCP API occasionally returns 500 and
> 429 errors on these requests, which causes the Airflow task to fail
> intermittently, particularly for long-running tasks, even though the Dataflow
> job itself does not terminate.
> The recommended action is to retry the request with exponential backoff
> ([https://developers.google.com/drive/api/v3/handle-errors]). The GCP API
> client provides this via the `num_retries` parameter on execute(), but that
> parameter is not used in
> {code:java}
> airflow.contrib.hooks.gcp_dataflow_hook{code}
> *Proposed Solution*
> Add num_retries to the execute() calls in
> {code:java}
> _DataflowJob._get_job_id_from_name{code}
> and
> {code:java}
> _DataflowJob._get_job{code}
>
> *NOTE:* the same problem was addressed for Dataproc in
> ([https://issues.apache.org/jira/browse/AIRFLOW-1718])
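A minimal sketch of the retry behavior the issue asks for: retrying a request on transient 429/5xx responses with exponential backoff, which is what googleapiclient's execute(num_retries=...) does internally. This is a self-contained, pure-Python simulation, not the hook's actual code; the names TransientError and execute_with_retries are illustrative.

```python
import random
import time


class TransientError(Exception):
    """Illustrative stand-in for an HTTP error raised by an API client."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def execute_with_retries(request, num_retries=5, base_delay=1.0):
    """Call `request()` and retry on transient HTTP errors (429 or 5xx)
    with exponential backoff, mimicking execute(num_retries=...)."""
    for attempt in range(num_retries + 1):
        try:
            return request()
        except TransientError as err:
            retryable = err.status == 429 or 500 <= err.status < 600
            if not retryable or attempt == num_retries:
                # Non-transient error, or retry budget exhausted.
                raise
            # Exponential backoff with jitter: up to 1s, 2s, 4s, ...
            time.sleep(base_delay * (2 ** attempt) * random.random())
```

With base_delay=0 the retries are immediate, which makes the behavior easy to exercise against a fake endpoint that fails twice with a 500 before succeeding.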
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)