[
https://issues.apache.org/jira/browse/AIRFLOW-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wajid Khattak updated AIRFLOW-4004:
-----------------------------------
Environment: OS = Windows 7, Python = 2.7.12 (was: OS = Windows 7
Enterprise, Python = 2.7.12)
> Dataflow jobs only launched in us-central1 region
> -------------------------------------------------
>
> Key: AIRFLOW-4004
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4004
> Project: Apache Airflow
> Issue Type: Bug
> Components: Dataflow, hooks
> Affects Versions: 1.10.2
> Environment: OS = Windows 7, Python = 2.7.12
> Reporter: Wajid Khattak
> Priority: Major
>
> Dataflow jobs can only be launched in the us-central1 region. The reason seems
> to be that, for launching jobs, the REST endpoint
> "https://cloud.google.com/dataflow/docs/reference/rest/v1b3/projects.templates/launch"
> is called as below in gcp_dataflow_hook.py:
>
> {code:java}
> ...
> request = service.projects().locations().templates().launch(
>     projectId=variables['project'],
>     location=variables['region'],
>     gcsPath=dataflow_template,
>     body=body
> )
> ...
> {code}
>
> However, for checking the progress of the launched job, the REST endpoint
> "https://cloud.google.com/dataflow/docs/reference/rest/v1b3/projects.locations.jobs/get"
> is called as below in gcp_dataflow_hook.py:
> {code:java}
> ...
> def _get_job(self):
>     if self._job_id:
>         job = self._dataflow.projects().locations().jobs().get(
>             projectId=self._project_number,
>             location=self._job_location,
>             jobId=self._job_id).execute(num_retries=5)
>     elif self._job_name:
>         job = self._get_job_id_from_name()
>     else:
>         raise Exception('Missing both dataflow job ID and name.')
> ...{code}
> The simple fix is to use the correct REST endpoint for launching jobs, i.e.
> "https://cloud.google.com/dataflow/docs/reference/rest/v1b3/projects.locations.templates/launch"
> so that the job is launched in the correct region as specified in the launch
> parameters.
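
To make the mismatch described above concrete, here is a minimal sketch of the URL paths behind the three REST methods mentioned in the report. The helper names are hypothetical and are not part of gcp_dataflow_hook.py; they only build strings and make no API calls. The path templates are assumed from the public Dataflow v1b3 REST reference.

```python
# Illustrative helpers only (hypothetical names, not Airflow code).
# They build the request URLs so the difference in paths is visible.
DATAFLOW_API = "https://dataflow.googleapis.com/v1b3"


def global_launch_url(project_id):
    """projects.templates.launch: the path carries no location segment."""
    return "{}/projects/{}/templates:launch".format(DATAFLOW_API, project_id)


def regional_launch_url(project_id, location):
    """projects.locations.templates.launch: the region is part of the path."""
    return "{}/projects/{}/locations/{}/templates:launch".format(
        DATAFLOW_API, project_id, location)


def regional_job_get_url(project_id, location, job_id):
    """projects.locations.jobs.get: used when polling the job status."""
    return "{}/projects/{}/locations/{}/jobs/{}".format(
        DATAFLOW_API, project_id, location, job_id)


if __name__ == "__main__":
    # Example values are placeholders.
    print(global_launch_url("my-project"))
    print(regional_launch_url("my-project", "europe-west1"))
    print(regional_job_get_url("my-project", "europe-west1", "job-123"))
```

Only the two regional forms share the same `locations/<region>` prefix, which is why launching and polling should agree on the region when both go through the `locations` resource.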
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)