[jira] [Updated] (AIRFLOW-2315) S3Hook Extra Extras

2018-04-11 Thread Josh Bacon (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Bacon updated AIRFLOW-2315:

Labels: beginner features newbie starter  (was: )

> S3Hook Extra Extras 
> 
>
> Key: AIRFLOW-2315
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2315
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: boto3
>Affects Versions: 1.9.0
>Reporter: Josh Bacon
>Assignee: Josh Bacon
>Priority: Minor
>  Labels: beginner, features, newbie, starter
> Fix For: 1.10.0
>
>
> Feature improvement request to S3Hook to support additional JSON extra 
> arguments to apply to both upload and download ExtraArgs.
> Allowed Upload Arguments: 
> [http://boto3.readthedocs.io/en/latest/reference/customizations/s3.html#boto3.s3.transfer.S3Transfer.ALLOWED_UPLOAD_ARGS]
> Allowed Download Arguments: 
> [http://boto3.readthedocs.io/en/latest/reference/customizations/s3.html#boto3.s3.transfer.S3Transfer.ALLOWED_DOWNLOAD_ARGS]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-2315) S3Hook Extra Extras

2018-04-11 Thread Josh Bacon (JIRA)
Josh Bacon created AIRFLOW-2315:
---

 Summary: S3Hook Extra Extras 
 Key: AIRFLOW-2315
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2315
 Project: Apache Airflow
  Issue Type: Improvement
  Components: boto3
Affects Versions: 1.9.0
Reporter: Josh Bacon
Assignee: Josh Bacon
 Fix For: 1.10.0


Feature improvement request to S3Hook to support additional JSON extra 
arguments to apply to both upload and download ExtraArgs.

Allowed Upload Arguments: 
[http://boto3.readthedocs.io/en/latest/reference/customizations/s3.html#boto3.s3.transfer.S3Transfer.ALLOWED_UPLOAD_ARGS]

Allowed Download Arguments: 
[http://boto3.readthedocs.io/en/latest/reference/customizations/s3.html#boto3.s3.transfer.S3Transfer.ALLOWED_DOWNLOAD_ARGS]

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (AIRFLOW-2314) Fix strip of trailing slash in GCS paths

2018-04-11 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guillermo Rodríguez Cano reassigned AIRFLOW-2314:
-

Assignee: Guillermo Rodríguez Cano

> Fix strip of trailing slash in GCS paths
> 
>
> Key: AIRFLOW-2314
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2314
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib, dependencies, gcp
>Reporter: Guillermo Rodríguez Cano
>Assignee: Guillermo Rodríguez Cano
>Priority: Minor
>
> Behaviour of {{_parse_gcs_url}} function in 
> {{airflow.contrib.hooks.gcs_hook}} is stripping leading and trailing slash 
> characters: / in the blob part of the URI
> I expect the leading slash to be removed, but not the ending one when given. 
> Reason being is that since the blob is a representing a full 'path' object, 
> it is known that the separator between the bucket and the blob will be one 
> slash character but there may be cases when the user wants to explicitly know 
> whether the object is an actual file or a 'directory'.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2314) Fix strip of trailing slash in GCS paths

2018-04-11 Thread JIRA

[ 
https://issues.apache.org/jira/browse/AIRFLOW-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433859#comment-16433859
 ] 

Guillermo Rodríguez Cano commented on AIRFLOW-2314:
---

PR addressing the issue: https://github.com/apache/incubator-airflow/pull/3212

> Fix strip of trailing slash in GCS paths
> 
>
> Key: AIRFLOW-2314
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2314
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib, dependencies, gcp
>Reporter: Guillermo Rodríguez Cano
>Priority: Minor
>
> Behaviour of {{_parse_gcs_url}} function in 
> {{airflow.contrib.hooks.gcs_hook}} is stripping leading and trailing slash 
> characters: / in the blob part of the URI
> I expect the leading slash to be removed, but not the ending one when given. 
> Reason being is that since the blob is a representing a full 'path' object, 
> it is known that the separator between the bucket and the blob will be one 
> slash character but there may be cases when the user wants to explicitly know 
> whether the object is an actual file or a 'directory'.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-2314) Fix strip of trailing slash in GCS paths

2018-04-11 Thread JIRA
Guillermo Rodríguez Cano created AIRFLOW-2314:
-

 Summary: Fix strip of trailing slash in GCS paths
 Key: AIRFLOW-2314
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2314
 Project: Apache Airflow
  Issue Type: Bug
  Components: contrib, dependencies, gcp
Reporter: Guillermo Rodríguez Cano


Behaviour of {{_parse_gcs_url}} function in {{airflow.contrib.hooks.gcs_hook}} 
is stripping leading and trailing slash characters: / in the blob part of the 
URI

I expect the leading slash to be removed, but not the ending one when given. 
Reason being is that since the blob is a representing a full 'path' object, it 
is known that the separator between the bucket and the blob will be one slash 
character but there may be cases when the user wants to explicitly know whether 
the object is an actual file or a 'directory'.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-2291) Ability to specify runtimeVersion, pythonVersion and jobDir in ML Engine operator

2018-04-11 Thread Fokko Driesprong (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong resolved AIRFLOW-2291.
---
   Resolution: Fixed
Fix Version/s: 2.0.0

Issue resolved by pull request #3202
[https://github.com/apache/incubator-airflow/pull/3202]

> Ability to specify runtimeVersion, pythonVersion and jobDir in ML Engine 
> operator
> -
>
> Key: AIRFLOW-2291
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2291
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, gcp
>Affects Versions: Airflow 2.0, Airflow 1.8, Airflow 1.9.0
>Reporter: David Volquartz Lebech
>Assignee: David Volquartz Lebech
>Priority: Minor
> Fix For: 2.0.0
>
>
> For the Google Cloud ML Engine Training Operator (in {{contrib}}), the 
> [current list of 
> arguments|https://github.com/apache/incubator-airflow/blob/b918a4976a641a3c944ef4ad4323fe25bd219ea0/airflow/contrib/operators/mlengine_operator.py#L538-L544]
>  does not seem to include the option for specifying runtime and Python 
> version which limits execution to Python 2.7. It also does not include the 
> useful job directory argument.
> It seems that simply adding {{runtimeVersion}}, {{pythonVersion}} and 
> {{jobDir}} to the list would be enough to make this work, according to [the 
> documentations on ML 
> engine|https://cloud.google.com/ml-engine/docs/tensorflow/versioning#set-python-version-training].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2291) Ability to specify runtimeVersion, pythonVersion and jobDir in ML Engine operator

2018-04-11 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433809#comment-16433809
 ] 

ASF subversion and git services commented on AIRFLOW-2291:
--

Commit d9bf1edd44af847aa1f9ecd9d797dfacb806cc00 in incubator-airflow's branch 
refs/heads/master from [~dlebech]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=d9bf1ed ]

[AIRFLOW-2291] Add optional params to ML Engine

This commit adds three extra optional parameters
to the
`MLEngineTrainingOperator` as well as a new unit
test for the
`MLEngineVersionOperator`

Closes #3202 from dlebech/ml-engine-python-version


> Ability to specify runtimeVersion, pythonVersion and jobDir in ML Engine 
> operator
> -
>
> Key: AIRFLOW-2291
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2291
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, gcp
>Affects Versions: Airflow 2.0, Airflow 1.8, Airflow 1.9.0
>Reporter: David Volquartz Lebech
>Assignee: David Volquartz Lebech
>Priority: Minor
>
> For the Google Cloud ML Engine Training Operator (in {{contrib}}), the 
> [current list of 
> arguments|https://github.com/apache/incubator-airflow/blob/b918a4976a641a3c944ef4ad4323fe25bd219ea0/airflow/contrib/operators/mlengine_operator.py#L538-L544]
>  does not seem to include the option for specifying runtime and Python 
> version which limits execution to Python 2.7. It also does not include the 
> useful job directory argument.
> It seems that simply adding {{runtimeVersion}}, {{pythonVersion}} and 
> {{jobDir}} to the list would be enough to make this work, according to [the 
> documentations on ML 
> engine|https://cloud.google.com/ml-engine/docs/tensorflow/versioning#set-python-version-training].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


incubator-airflow git commit: [AIRFLOW-2291] Add optional params to ML Engine

2018-04-11 Thread fokko
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 65e7025f3 -> d9bf1edd4


[AIRFLOW-2291] Add optional params to ML Engine

This commit adds three extra optional parameters
to the
`MLEngineTrainingOperator` as well as a new unit
test for the
`MLEngineVersionOperator`

Closes #3202 from dlebech/ml-engine-python-version


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/d9bf1edd
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/d9bf1edd
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/d9bf1edd

Branch: refs/heads/master
Commit: d9bf1edd44af847aa1f9ecd9d797dfacb806cc00
Parents: 65e7025
Author: David Volquartz Lebech 
Authored: Wed Apr 11 14:15:06 2018 +0200
Committer: Fokko Driesprong 
Committed: Wed Apr 11 14:15:06 2018 +0200

--
 airflow/contrib/operators/mlengine_operator.py  | 28 +
 .../contrib/operators/test_mlengine_operator.py | 63 +++-
 2 files changed, 90 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/d9bf1edd/airflow/contrib/operators/mlengine_operator.py
--
diff --git a/airflow/contrib/operators/mlengine_operator.py 
b/airflow/contrib/operators/mlengine_operator.py
index 3dd63f2..a6f186b 100644
--- a/airflow/contrib/operators/mlengine_operator.py
+++ b/airflow/contrib/operators/mlengine_operator.py
@@ -474,6 +474,16 @@ class MLEngineTrainingOperator(BaseOperator):
 :param scale_tier: Resource tier for MLEngine training job.
 :type scale_tier: string
 
+:param runtime_version: The Google Cloud ML runtime version to use for 
training.
+:type runtime_version: string
+
+:param python_version: The version of Python used in training.
+:type python_version: string
+
+:param job_dir: A Google Cloud Storage path in which to store training
+outputs and other data needed for training.
+:type job_dir: string
+
 :param gcp_conn_id: The connection ID to use when fetching connection info.
 :type gcp_conn_id: string
 
@@ -497,6 +507,9 @@ class MLEngineTrainingOperator(BaseOperator):
 '_training_args',
 '_region',
 '_scale_tier',
+'_runtime_version',
+'_python_version',
+'_job_dir'
 ]
 
 @apply_defaults
@@ -508,6 +521,9 @@ class MLEngineTrainingOperator(BaseOperator):
  training_args,
  region,
  scale_tier=None,
+ runtime_version=None,
+ python_version=None,
+ job_dir=None,
  gcp_conn_id='google_cloud_default',
  delegate_to=None,
  mode='PRODUCTION',
@@ -521,6 +537,9 @@ class MLEngineTrainingOperator(BaseOperator):
 self._training_args = training_args
 self._region = region
 self._scale_tier = scale_tier
+self._runtime_version = runtime_version
+self._python_version = python_version
+self._job_dir = job_dir
 self._gcp_conn_id = gcp_conn_id
 self._delegate_to = delegate_to
 self._mode = mode
@@ -555,6 +574,15 @@ class MLEngineTrainingOperator(BaseOperator):
 }
 }
 
+if self._runtime_version:
+training_request['trainingInput']['runtimeVersion'] = 
self._runtime_version
+
+if self._python_version:
+training_request['trainingInput']['pythonVersion'] = 
self._python_version
+
+if self._job_dir:
+training_request['trainingInput']['jobDir'] = self._job_dir
+
 if self._mode == 'DRY_RUN':
 self.log.info('In dry_run mode.')
 self.log.info('MLEngine Training job request is: {}'.format(

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/d9bf1edd/tests/contrib/operators/test_mlengine_operator.py
--
diff --git a/tests/contrib/operators/test_mlengine_operator.py 
b/tests/contrib/operators/test_mlengine_operator.py
index 2766e5d..9a8d56c 100644
--- a/tests/contrib/operators/test_mlengine_operator.py
+++ b/tests/contrib/operators/test_mlengine_operator.py
@@ -17,6 +17,7 @@
 
 from __future__ import absolute_import, division, print_function
 
+import copy
 import datetime
 import unittest
 
@@ -26,7 +27,8 @@ from mock import ANY, patch
 
 from airflow import DAG, configuration
 from airflow.contrib.operators.mlengine_operator import 
(MLEngineBatchPredictionOperator,
- 
MLEngineTrainingOperator)
+ 
MLEngineTrainingOperator,
+

[jira] [Resolved] (AIRFLOW-1774) Better handling of templated parameters in Google ML batch prediction/training operators

2018-04-11 Thread Fokko Driesprong (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong resolved AIRFLOW-1774.
---
   Resolution: Fixed
Fix Version/s: 2.0.0

Issue resolved by pull request #2746
[https://github.com/apache/incubator-airflow/pull/2746]

> Better handling of templated parameters in Google ML batch 
> prediction/training operators
> 
>
> Key: AIRFLOW-1774
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1774
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, operators
>Affects Versions: Airflow 2.0, 1.9.0
>Reporter: Guillermo Rodríguez Cano
>Assignee: Guillermo Rodríguez Cano
>Priority: Major
> Fix For: 2.0.0
>
>
> The {{MLEngineBatchPredictionOperator}} does not support well the templated 
> parameter {{job_id}} due to a helper function used to detect and cleanup bad 
> job names inhibiting the templating engine to work. I suspect the same may 
> happen with {{MLEngineTrainingOperator}} as well.
> Google ML requieres a unique job id, therefore it is critical to have the 
> possibility to customise the job's name easily, and preferably with some data 
> related to the DAG, for example, appending the day to the job's name via a 
> macro like {{ds}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


incubator-airflow git commit: [AIRFLOW-1774] Allow consistent templating of arguments in MLEngineBatchPredictionOperator

2018-04-11 Thread fokko
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 3475faf6b -> 65e7025f3


[AIRFLOW-1774] Allow consistent templating of arguments in 
MLEngineBatchPredictionOperator

Fix a minor typo and a wrong non-default
assignment

Fix one more typo

Adapt tests to new error messages and fix another
typo

Fix exception type in utils operator test class

Improve cleansing of non-valid training and
prediciton job names

Closes #2746 from wileeam/ml-engine-prediction-
job-normalization


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/65e7025f
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/65e7025f
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/65e7025f

Branch: refs/heads/master
Commit: 65e7025f3af378dfa825eb04d005ec1f7a422cde
Parents: 3475faf
Author: Guillermo Rodríguez Cano 
Authored: Wed Apr 11 11:57:21 2018 +0200
Committer: Fokko Driesprong 
Committed: Wed Apr 11 11:57:21 2018 +0200

--
 airflow/contrib/operators/mlengine_operator.py  | 205 ++-
 .../contrib/operators/test_mlengine_operator.py |  88 
 .../operators/test_mlengine_operator_utils.py   |   6 +-
 3 files changed, 156 insertions(+), 143 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/65e7025f/airflow/contrib/operators/mlengine_operator.py
--
diff --git a/airflow/contrib/operators/mlengine_operator.py 
b/airflow/contrib/operators/mlengine_operator.py
index 0d033d3..3dd63f2 100644
--- a/airflow/contrib/operators/mlengine_operator.py
+++ b/airflow/contrib/operators/mlengine_operator.py
@@ -15,77 +15,17 @@
 # limitations under the License.
 import re
 
-from airflow import settings
+from apiclient import errors
+
 from airflow.contrib.hooks.gcp_mlengine_hook import MLEngineHook
 from airflow.exceptions import AirflowException
 from airflow.operators import BaseOperator
 from airflow.utils.decorators import apply_defaults
-from apiclient import errors
-
 from airflow.utils.log.logging_mixin import LoggingMixin
 
 log = LoggingMixin().log
 
 
-def _create_prediction_input(project_id,
- region,
- data_format,
- input_paths,
- output_path,
- model_name=None,
- version_name=None,
- uri=None,
- max_worker_count=None,
- runtime_version=None):
-"""
-Create the batch prediction input from the given parameters.
-
-Args:
-A subset of arguments documented in __init__ method of class
-MLEngineBatchPredictionOperator
-
-Returns:
-A dictionary representing the predictionInput object as documented
-in https://cloud.google.com/ml-engine/reference/rest/v1/projects.jobs.
-
-Raises:
-ValueError: if a unique model/version origin cannot be determined.
-"""
-prediction_input = {
-'dataFormat': data_format,
-'inputPaths': input_paths,
-'outputPath': output_path,
-'region': region
-}
-
-if uri:
-if model_name or version_name:
-log.error(
-'Ambiguous model origin: Both uri and model/version name are 
provided.'
-)
-raise ValueError('Ambiguous model origin.')
-prediction_input['uri'] = uri
-elif model_name:
-origin_name = 'projects/{}/models/{}'.format(project_id, model_name)
-if not version_name:
-prediction_input['modelName'] = origin_name
-else:
-prediction_input['versionName'] = \
-origin_name + '/versions/{}'.format(version_name)
-else:
-log.error(
-'Missing model origin: Batch prediction expects a model, '
-'a model & version combination, or a URI to savedModel.')
-raise ValueError('Missing model origin.')
-
-if max_worker_count:
-prediction_input['maxWorkerCount'] = max_worker_count
-if runtime_version:
-prediction_input['runtimeVersion'] = runtime_version
-
-return prediction_input
-
-
 def _normalize_mlengine_job_id(job_id):
 """
 Replaces invalid MLEngine job_id characters with '_'.
@@ -99,10 +39,27 @@ def _normalize_mlengine_job_id(job_id):
 Returns:
 A valid job_id representation.
 """
-match = re.search(r'\d', job_id)
+
+# Add a prefix when a job_id starts with a digit or a template
+match = re.search(r'\d|\{{2}', job_id)
 if match and match.start() is 0:
-job_id = 'z_{}'.format(job_id)
-return 

[jira] [Commented] (AIRFLOW-1774) Better handling of templated parameters in Google ML batch prediction/training operators

2018-04-11 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433662#comment-16433662
 ] 

ASF subversion and git services commented on AIRFLOW-1774:
--

Commit 65e7025f3af378dfa825eb04d005ec1f7a422cde in incubator-airflow's branch 
refs/heads/master from [~wileeam]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=65e7025 ]

[AIRFLOW-1774] Allow consistent templating of arguments in 
MLEngineBatchPredictionOperator

Fix a minor typo and a wrong non-default
assignment

Fix one more typo

Adapt tests to new error messages and fix another
typo

Fix exception type in utils operator test class

Improve cleansing of non-valid training and
prediciton job names

Closes #2746 from wileeam/ml-engine-prediction-
job-normalization


> Better handling of templated parameters in Google ML batch 
> prediction/training operators
> 
>
> Key: AIRFLOW-1774
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1774
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, operators
>Affects Versions: Airflow 2.0, 1.9.0
>Reporter: Guillermo Rodríguez Cano
>Assignee: Guillermo Rodríguez Cano
>Priority: Major
>
> The {{MLEngineBatchPredictionOperator}} does not support well the templated 
> parameter {{job_id}} due to a helper function used to detect and cleanup bad 
> job names inhibiting the templating engine to work. I suspect the same may 
> happen with {{MLEngineTrainingOperator}} as well.
> Google ML requieres a unique job id, therefore it is critical to have the 
> possibility to customise the job's name easily, and preferably with some data 
> related to the DAG, for example, appending the day to the job's name via a 
> macro like {{ds}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2302) Add missing operators/hooks to the docs

2018-04-11 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433657#comment-16433657
 ] 

ASF subversion and git services commented on AIRFLOW-2302:
--

Commit 3475faf6ba9efc4ff9a08648a13f3f290ff671d7 in incubator-airflow's branch 
refs/heads/master from [~Fokko]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=3475faf ]

[AIRFLOW-2302] Add missing operators and hooks

Many operators and hooks are not listed in the
config, and therefore
not picked up by the docks generator.

Closes #3201 from Fokko/airflow-2302-add-missing-
docs


> Add missing operators/hooks to the docs
> ---
>
> Key: AIRFLOW-2302
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2302
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Fokko Driesprong
>Priority: Major
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


incubator-airflow git commit: [AIRFLOW-2302] Add missing operators and hooks

2018-04-11 Thread fokko
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 69ccf8432 -> 3475faf6b


[AIRFLOW-2302] Add missing operators and hooks

Many operators and hooks are not listed in the
config, and therefore
not picked up by the docks generator.

Closes #3201 from Fokko/airflow-2302-add-missing-
docs


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/3475faf6
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/3475faf6
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/3475faf6

Branch: refs/heads/master
Commit: 3475faf6ba9efc4ff9a08648a13f3f290ff671d7
Parents: 69ccf84
Author: Fokko Driesprong 
Authored: Wed Apr 11 11:55:29 2018 +0200
Committer: Fokko Driesprong 
Committed: Wed Apr 11 11:55:29 2018 +0200

--
 .github/PULL_REQUEST_TEMPLATE.md |  7 +
 docs/code.rst| 53 +--
 docs/integration.rst | 41 +--
 docs/kubernetes.rst  |  3 +-
 4 files changed, 73 insertions(+), 31 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/3475faf6/.github/PULL_REQUEST_TEMPLATE.md
--
diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
index 72cbb76..6000d0e 100644
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -23,4 +23,11 @@ Make sure you have checked _all_ steps below.
 5. Body wraps at 72 characters
 6. Body explains "what" and "why", not "how"
 
+
+### Documentation
+- [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
+- When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
+
+
+### Code Quality
 - [ ] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/3475faf6/docs/code.rst
--
diff --git a/docs/code.rst b/docs/code.rst
index 34ecec9..a30d117 100644
--- a/docs/code.rst
+++ b/docs/code.rst
@@ -80,6 +80,7 @@ Operators
 .. autoclass:: airflow.operators.python_operator.ShortCircuitOperator
 .. autoclass:: airflow.operators.http_operator.SimpleHttpOperator
 .. autoclass:: airflow.operators.slack_operator.SlackAPIOperator
+.. autoclass:: airflow.operators.slack_operator.SlackAPIPostOperator
 .. autoclass:: airflow.operators.sqlite_operator.SqliteOperator
 .. autoclass:: airflow.operators.subdag_operator.SubDagOperator
 .. autoclass:: airflow.operators.dagrun_operator.TriggerDagRunOperator
@@ -130,6 +131,7 @@ Operators
 .. autoclass:: 
airflow.contrib.operators.dataproc_operator.DataProcSparkOperator
 .. autoclass:: 
airflow.contrib.operators.dataproc_operator.DataProcHadoopOperator
 .. autoclass:: 
airflow.contrib.operators.dataproc_operator.DataProcPySparkOperator
+.. autoclass:: 
airflow.contrib.operators.dataproc_operator.DataprocWorkflowTemplateBaseOperator
 .. autoclass:: 
airflow.contrib.operators.dataproc_operator.DataprocWorkflowTemplateInstantiateOperator
 .. autoclass:: 
airflow.contrib.operators.dataproc_operator.DataprocWorkflowTemplateInstantiateInlineOperator
 .. autoclass:: 
airflow.contrib.operators.datastore_export_operator.DatastoreExportOperator
@@ -148,9 +150,11 @@ Operators
 .. autoclass:: 
airflow.contrib.operators.gcs_operator.GoogleCloudStorageCreateBucketOperator
 .. autoclass:: 
airflow.contrib.operators.gcs_to_bq.GoogleCloudStorageToBigQueryOperator
 .. autoclass:: 
airflow.contrib.operators.gcs_to_gcs.GoogleCloudStorageToGoogleCloudStorageOperator
+.. autoclass:: 
airflow.contrib.operators.gcs_to_s3.GoogleCloudStorageToS3Operator
 .. autoclass:: airflow.contrib.operators.hipchat_operator.HipChatAPIOperator
 .. autoclass:: 
airflow.contrib.operators.hipchat_operator.HipChatAPISendRoomNotificationOperator
 .. autoclass:: 
airflow.contrib.operators.hive_to_dynamodb.HiveToDynamoDBTransferOperator
+.. autoclass:: 
airflow.contrib.operators.jenkins_job_trigger_operator.JenkinsJobTriggerOperator
 .. autoclass:: airflow.contrib.operators.jira_operator.JiraOperator
 .. autoclass:: 
airflow.contrib.operators.kubernetes_pod_operator.KubernetesPodOperator
 .. autoclass:: 
airflow.contrib.operators.mlengine_operator.MLEngineBatchPredictionOperator
@@ -167,13 +171,14 @@ Operators
 .. autoclass:: airflow.contrib.operators.qubole_operator.QuboleOperator
 .. autoclass:: airflow.contrib.operators.s3_list_operator.S3ListOperator
 .. autoclass:: airflow.contrib.operators.sftp_operator.SFTPOperator
+.. autoclass:: 
airflow.contrib.operators.slack_webhook_operator.SlackWebhookOperator
+.. autoclass:: 

[jira] [Resolved] (AIRFLOW-2302) Add missing operators/hooks to the docs

2018-04-11 Thread Fokko Driesprong (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong resolved AIRFLOW-2302.
---
   Resolution: Fixed
Fix Version/s: 2.0.0

Issue resolved by pull request #3201
[https://github.com/apache/incubator-airflow/pull/3201]

> Add missing operators/hooks to the docs
> ---
>
> Key: AIRFLOW-2302
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2302
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Fokko Driesprong
>Priority: Major
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-2312) Docs Typo Correction: Corresponding

2018-04-11 Thread Fokko Driesprong (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong resolved AIRFLOW-2312.
---
   Resolution: Fixed
Fix Version/s: 2.0.0

Issue resolved by pull request #3211
[https://github.com/apache/incubator-airflow/pull/3211]

> Docs Typo Correction: Corresponding
> ---
>
> Key: AIRFLOW-2312
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2312
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Tam-Sanh
>Priority: Trivial
> Fix For: 2.0.0
>
>
> [https://github.com/apache/incubator-airflow/pull/3211]
> {{ correpsonding to corresponding}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


incubator-airflow git commit: [AIRFLOW-2312] Docs Typo Correction: Corresponding

2018-04-11 Thread fokko
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 39b7d7d87 -> 69ccf8432


[AIRFLOW-2312] Docs Typo Correction: Corresponding

# JIRA
[x] My PR addresses the following Airflow JIRA
issues and references them in the PR title.
*
https://issues.apache.org/jira/browse/AIRFLOW-2312

# Description

[x] Here are some details about my PR

Minor typo fix in the docs. `correpsonding` to
`corresponding`

# Tests
[x] My PR adds the following unit tests OR does
not need testing for this extremely good reason:

I assume I don't need testing for docs

# Commits

[x] My commits all reference JIRA issues in their
subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"How to write a good git commit message":

1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not
"adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"

Closes #3211 from tamsanh/patch-1


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/69ccf843
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/69ccf843
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/69ccf843

Branch: refs/heads/master
Commit: 69ccf843213979baadf6aaf4967e6e52d490513e
Parents: 39b7d7d
Author: Tamu 
Authored: Wed Apr 11 11:42:46 2018 +0200
Committer: Fokko Driesprong 
Committed: Wed Apr 11 11:42:46 2018 +0200

--
 docs/concepts.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/69ccf843/docs/concepts.rst
--
diff --git a/docs/concepts.rst b/docs/concepts.rst
index be6809b..3f70555 100644
--- a/docs/concepts.rst
+++ b/docs/concepts.rst
@@ -380,7 +380,7 @@ opposed to XComs that are pushed manually).
 
 If ``xcom_pull`` is passed a single string for ``task_ids``, then the most
 recent XCom value from that task is returned; if a list of ``task_ids`` is
-passed, then a correpsonding list of XCom values is returned.
+passed, then a corresponding list of XCom values is returned.
 
 .. code:: python
 



[jira] [Commented] (AIRFLOW-2312) Docs Typo Correction: Corresponding

2018-04-11 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433633#comment-16433633
 ] 

ASF subversion and git services commented on AIRFLOW-2312:
--

Commit 69ccf843213979baadf6aaf4967e6e52d490513e in incubator-airflow's branch 
refs/heads/master from [~tamsanh]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=69ccf84 ]

[AIRFLOW-2312] Docs Typo Correction: Corresponding

# JIRA
[x] My PR addresses the following Airflow JIRA
issues and references them in the PR title.
*
https://issues.apache.org/jira/browse/AIRFLOW-2312

# Description

[x] Here are some details about my PR

Minor typo fix in the docs. `correpsonding` to
`corresponding`

# Tests
[x] My PR adds the following unit tests OR does
not need testing for this extremely good reason:

I assume I don't need testing for docs

# Commits

[x] My commits all reference JIRA issues in their
subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"How to write a good git commit message":

1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not
"adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"

Closes #3211 from tamsanh/patch-1


> Docs Typo Correction: Corresponding
> ---
>
> Key: AIRFLOW-2312
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2312
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Tam-Sanh
>Priority: Trivial
>
> [https://github.com/apache/incubator-airflow/pull/3211]
> {{ correpsonding to corresponding}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2312) Docs Typo Correction: Corresponding

2018-04-11 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433632#comment-16433632
 ] 

ASF subversion and git services commented on AIRFLOW-2312:
--

Commit 69ccf843213979baadf6aaf4967e6e52d490513e in incubator-airflow's branch 
refs/heads/master from [~tamsanh]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=69ccf84 ]

[AIRFLOW-2312] Docs Typo Correction: Corresponding

# JIRA
[x] My PR addresses the following Airflow JIRA
issues and references them in the PR title.
*
https://issues.apache.org/jira/browse/AIRFLOW-2312

# Description

[x] Here are some details about my PR

Minor typo fix in the docs. `correpsonding` to
`corresponding`

# Tests
[x] My PR adds the following unit tests OR does
not need testing for this extremely good reason:

I assume I don't need testing for docs

# Commits

[x] My commits all reference JIRA issues in their
subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"How to write a good git commit message":

1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not
"adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"

Closes #3211 from tamsanh/patch-1


> Docs Typo Correction: Corresponding
> ---
>
> Key: AIRFLOW-2312
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2312
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Tam-Sanh
>Priority: Trivial
>
> [https://github.com/apache/incubator-airflow/pull/3211]
> {{ correpsonding to corresponding}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (AIRFLOW-2313) Add ClusterLifecycleConfig to DataprocClusterCreateOperator

2018-04-11 Thread Egidijus Bartkus (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Egidijus Bartkus reassigned AIRFLOW-2313:
-

Assignee: Egidijus Bartkus

> Add ClusterLifecycleConfig to DataprocClusterCreateOperator
> ---
>
> Key: AIRFLOW-2313
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2313
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, gcp, operators
>Affects Versions: Airflow 2.0, 1.9.0
>Reporter: Egidijus Bartkus
>Assignee: Egidijus Bartkus
>Priority: Minor
>
> Google introduced [Cloud Dataproc Cluster Scheduled 
> Deletion|https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion]
>  feature which isn't yet supported by the DataprocClusterCreateOperator.
>  
> Having this option enabled could be useful to make sure Dataproc cluster 
> doesn't stay active for extended time periods (In case delete 
> DataprocClusterDeleteOperator isn't suitable or fails to run for any reason).
>  
> Implementation requires adding config object 
> [LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2313) Add ClusterLifecycleConfig to DataprocClusterCreateOperator

2018-04-11 Thread e b (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

e b updated AIRFLOW-2313:
-
Description: 
Google introduced [Cloud Dataproc Cluster Scheduled 
Deletion|https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion]
 feature which isn't yet supported by the DataprocClusterCreateOperator.

 

Having this option enabled could be useful to make sure Dataproc cluster 
doesn't stay active for extended time periods (In case delete 
DataprocClusterDeleteOperator isn't suitable or fails to run for any reason).

 

Implementation requires adding config object 
[LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig]

 

 

  was:
Google introduced Cloud Dataproc Cluster 
[Scheduled-Deletion|[https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion]]
 feature which isn't yet supported by the DataprocClusterCreateOperator.

 

Having this option enabled could be useful to make sure Dataproc cluster 
doesn't stay active for extended time periods (In case delete 
DataprocClusterDeleteOperator isn't suitable or fails to run for any reason).

 

Implementation requires adding config object 
[LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig]

 

 


> Add ClusterLifecycleConfig to DataprocClusterCreateOperator
> ---
>
> Key: AIRFLOW-2313
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2313
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, gcp, operators
>Affects Versions: Airflow 2.0, 1.9.0
>Reporter: e b
>Priority: Minor
>
> Google introduced [Cloud Dataproc Cluster Scheduled 
> Deletion|https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion]
>  feature which isn't yet supported by the DataprocClusterCreateOperator.
>  
> Having this option enabled could be useful to make sure Dataproc cluster 
> doesn't stay active for extended time periods (In case delete 
> DataprocClusterDeleteOperator isn't suitable or fails to run for any reason).
>  
> Implementation requires adding config object 
> [LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2313) Add ClusterLifecycleConfig to DataprocClusterCreateOperator

2018-04-11 Thread e b (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

e b updated AIRFLOW-2313:
-
Description: 
Google introduced Cloud Dataproc Cluster 
[Scheduled-Deletion|[https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion]]
 feature which isn't yet supported by the DataprocClusterCreateOperator.

 

Having this option enabled could be useful to make sure Dataproc cluster 
doesn't stay active for extended time periods (In case delete 
DataprocClusterDeleteOperator isn't suitable or fails to run for any reason).

 

Implementation requires adding config object 
[LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig]

 

 

  was:
Google introduced [Cloud Dataproc Cluster Scheduled 
Deletion|[https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion]]
 feature which isn't yet supported by the DataprocClusterCreateOperator.

 

Having this option enabled could be useful to make sure Dataproc cluster 
doesn't stay active for extended time periods (In case delete 
DataprocClusterDeleteOperator isn't suitable or fails to run for any reason).

 

Implementation requires adding config object 
[LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig]

 

 


> Add ClusterLifecycleConfig to DataprocClusterCreateOperator
> ---
>
> Key: AIRFLOW-2313
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2313
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, gcp, operators
>Affects Versions: Airflow 2.0, 1.9.0
>Reporter: e b
>Priority: Minor
>
> Google introduced Cloud Dataproc Cluster 
> [Scheduled-Deletion|[https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion]]
>  feature which isn't yet supported by the DataprocClusterCreateOperator.
>  
> Having this option enabled could be useful to make sure Dataproc cluster 
> doesn't stay active for extended time periods (In case delete 
> DataprocClusterDeleteOperator isn't suitable or fails to run for any reason).
>  
> Implementation requires adding config object 
> [LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2313) Add ClusterLifecycleConfig to DataprocClusterCreateOperator

2018-04-11 Thread e b (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

e b updated AIRFLOW-2313:
-
Description: 
Google introduced [Cloud Dataproc Cluster Scheduled 
Deletion|[https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion]]
 feature which isn't yet supported by the DataprocClusterCreateOperator.

 

Having this option enabled could be useful to make sure Dataproc cluster 
doesn't stay active for extended time periods (In case delete 
DataprocClusterDeleteOperator isn't suitable or fails to run for any reason).

 

Implementation requires adding config object 
[LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig]

 

 

  was:
Google introduced [Cloud Dataproc Cluster Scheduled Deletion 
(TTL)|[https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion]]
 feature which isn't yet supported by the DataprocClusterCreateOperator.

 

Having this option enabled could be useful to make sure Dataproc cluster 
doesn't stay active for extended time periods (In case delete 
DataprocClusterDeleteOperator isn't suitable or fails to run for any reason).

 

Implementation requires adding config object 
[LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig]

 

 


> Add ClusterLifecycleConfig to DataprocClusterCreateOperator
> ---
>
> Key: AIRFLOW-2313
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2313
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, gcp, operators
>Affects Versions: Airflow 2.0, 1.9.0
>Reporter: e b
>Priority: Minor
>
> Google introduced [Cloud Dataproc Cluster Scheduled 
> Deletion|[https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion]]
>  feature which isn't yet supported by the DataprocClusterCreateOperator.
>  
> Having this option enabled could be useful to make sure Dataproc cluster 
> doesn't stay active for extended time periods (In case delete 
> DataprocClusterDeleteOperator isn't suitable or fails to run for any reason).
>  
> Implementation requires adding config object 
> [LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-2313) Add ClusterLifecycleConfig to DataprocClusterCreateOperator

2018-04-11 Thread e b (JIRA)
e b created AIRFLOW-2313:


 Summary: Add ClusterLifecycleConfig to 
DataprocClusterCreateOperator
 Key: AIRFLOW-2313
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2313
 Project: Apache Airflow
  Issue Type: Improvement
  Components: contrib, gcp, operators
Affects Versions: Airflow 2.0, 1.9.0
Reporter: e b


Google introduced [Cloud Dataproc Cluster Scheduled Deletion 
(TTL)|[https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion]]
 feature which isn't yet supported by the DataprocClusterCreateOperator.

 

Having this option enabled could be useful to make sure Dataproc cluster 
doesn't stay active for extended time periods (In case delete 
DataprocClusterDeleteOperator isn't suitable or fails to run for any reason).

 

Implementation requires adding config object 
[LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig]

 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-1623) Clearing task in UI does not trigger on_kill method in operator

2018-04-11 Thread Bolke de Bruin (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bolke de Bruin resolved AIRFLOW-1623.
-
Resolution: Fixed

Issue resolved by pull request #3204
[https://github.com/apache/incubator-airflow/pull/3204]

> Clearing task in UI does not trigger on_kill method in operator
> ---
>
> Key: AIRFLOW-1623
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1623
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: Airflow 2.0
>Reporter: Richard Penman
>Priority: Major
> Fix For: 1.10.0
>
>
> When a task is cleared in the UI it doesn't call the [operators on_kill() 
> method|https://github.com/apache/incubator-airflow/blob/b2e1753f5b74ad1b6e0889f7b784ce69623c95ce/airflow/models.py#L2380]
>  to clean up the task. Apparently this is meant to be handled in the 
> [LocalTaskJob.on_kill()|https://github.com/apache/incubator-airflow/blob/b2e1753f5b74ad1b6e0889f7b784ce69623c95ce/airflow/jobs.py#L2512]
>  method, however it is not currently.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-1623) Clearing task in UI does not trigger on_kill method in operator

2018-04-11 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433447#comment-16433447
 ] 

ASF subversion and git services commented on AIRFLOW-1623:
--

Commit 39b7d7d87cabae9de02ba5d64b998317b494bdd9 in incubator-airflow's branch 
refs/heads/master from [~bolke]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=39b7d7d ]

[AIRFLOW-1623] Trigger on_kill method in operators

on_kill methods were not triggered, due to
processes
not being properly terminated. This was due to the
fact
the runners use a shell which is then replaced by
the
child pid, which is unknown to Airflow.

Closes #3204 from bolkedebruin/AIRFLOW-1623


> Clearing task in UI does not trigger on_kill method in operator
> ---
>
> Key: AIRFLOW-1623
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1623
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: Airflow 2.0
>Reporter: Richard Penman
>Priority: Major
> Fix For: 1.10.0
>
>
> When a task is cleared in the UI it doesn't call the [operators on_kill() 
> method|https://github.com/apache/incubator-airflow/blob/b2e1753f5b74ad1b6e0889f7b784ce69623c95ce/airflow/models.py#L2380]
>  to clean up the task. Apparently this is meant to be handled in the 
> [LocalTaskJob.on_kill()|https://github.com/apache/incubator-airflow/blob/b2e1753f5b74ad1b6e0889f7b784ce69623c95ce/airflow/jobs.py#L2512]
>  method, however it is not currently.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-1623) Clearing task in UI does not trigger on_kill method in operator

2018-04-11 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433448#comment-16433448
 ] 

ASF subversion and git services commented on AIRFLOW-1623:
--

Commit 39b7d7d87cabae9de02ba5d64b998317b494bdd9 in incubator-airflow's branch 
refs/heads/master from [~bolke]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=39b7d7d ]

[AIRFLOW-1623] Trigger on_kill method in operators

on_kill methods were not triggered, due to
processes
not being properly terminated. This was due to the
fact
the runners use a shell which is then replaced by
the
child pid, which is unknown to Airflow.

Closes #3204 from bolkedebruin/AIRFLOW-1623


> Clearing task in UI does not trigger on_kill method in operator
> ---
>
> Key: AIRFLOW-1623
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1623
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: Airflow 2.0
>Reporter: Richard Penman
>Priority: Major
> Fix For: 1.10.0
>
>
> When a task is cleared in the UI it doesn't call the [operators on_kill() 
> method|https://github.com/apache/incubator-airflow/blob/b2e1753f5b74ad1b6e0889f7b784ce69623c95ce/airflow/models.py#L2380]
>  to clean up the task. Apparently this is meant to be handled in the 
> [LocalTaskJob.on_kill()|https://github.com/apache/incubator-airflow/blob/b2e1753f5b74ad1b6e0889f7b784ce69623c95ce/airflow/jobs.py#L2512]
>  method, however it is not currently.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


incubator-airflow git commit: [AIRFLOW-1623] Trigger on_kill method in operators

2018-04-11 Thread bolke
Repository: incubator-airflow
Updated Branches:
  refs/heads/master e58d0c9e2 -> 39b7d7d87


[AIRFLOW-1623] Trigger on_kill method in operators

on_kill methods were not triggered, due to
processes
not being properly terminated. This was due to the
fact
the runners use a shell which is then replaced by
the
child pid, which is unknown to Airflow.

Closes #3204 from bolkedebruin/AIRFLOW-1623


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/39b7d7d8
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/39b7d7d8
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/39b7d7d8

Branch: refs/heads/master
Commit: 39b7d7d87cabae9de02ba5d64b998317b494bdd9
Parents: e58d0c9
Author: Bolke de Bruin 
Authored: Wed Apr 11 08:05:42 2018 +0200
Committer: Bolke de Bruin 
Committed: Wed Apr 11 08:05:42 2018 +0200

--
 .../contrib/task_runner/cgroup_task_runner.py   |   4 +-
 airflow/jobs.py |   2 +-
 airflow/models.py   |   3 +-
 airflow/task/task_runner/base_task_runner.py|   3 +-
 airflow/task/task_runner/bash_task_runner.py|   4 +-
 airflow/utils/helpers.py| 113 +---
 tests/__init__.py   |   3 +
 tests/dags/test_on_kill.py  |  40 ++
 tests/task/__init__.py  |  18 +++
 tests/task/task_runner/__init__.py  |  13 ++
 tests/task/task_runner/test_bash_task_runner.py | 131 +++
 tests/utils/test_helpers.py |  75 +--
 12 files changed, 284 insertions(+), 125 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/39b7d7d8/airflow/contrib/task_runner/cgroup_task_runner.py
--
diff --git a/airflow/contrib/task_runner/cgroup_task_runner.py 
b/airflow/contrib/task_runner/cgroup_task_runner.py
index 87767a6..04ae81c 100644
--- a/airflow/contrib/task_runner/cgroup_task_runner.py
+++ b/airflow/contrib/task_runner/cgroup_task_runner.py
@@ -21,7 +21,7 @@ from cgroupspy import trees
 import psutil
 
 from airflow.task_runner.base_task_runner import BaseTaskRunner
-from airflow.utils.helpers import kill_process_tree
+from airflow.utils.helpers import reap_process_group
 
 
 class CgroupTaskRunner(BaseTaskRunner):
@@ -176,7 +176,7 @@ class CgroupTaskRunner(BaseTaskRunner):
 
 def terminate(self):
 if self.process and psutil.pid_exists(self.process.pid):
-kill_process_tree(self.log, self.process.pid)
+reap_process_group(self.process.pid, self.log)
 
 def on_finish(self):
 # Let the OOM watcher thread know we're done to avoid false OOM alarms

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/39b7d7d8/airflow/jobs.py
--
diff --git a/airflow/jobs.py b/airflow/jobs.py
index bcff868..1911896 100644
--- a/airflow/jobs.py
+++ b/airflow/jobs.py
@@ -2514,7 +2514,7 @@ class LocalTaskJob(BaseJob):
 
 def signal_handler(signum, frame):
 """Setting kill signal handler"""
-self.log.error("Killing subprocess")
+self.log.error("Received SIGTERM. Terminating subprocesses")
 self.on_kill()
 raise AirflowException("LocalTaskJob received SIGTERM signal")
 signal.signal(signal.SIGTERM, signal_handler)

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/39b7d7d8/airflow/models.py
--
diff --git a/airflow/models.py b/airflow/models.py
index e89e776..afcacd1 100755
--- a/airflow/models.py
+++ b/airflow/models.py
@@ -1539,8 +1539,7 @@ class TaskInstance(Base, LoggingMixin):
 self.task = task_copy
 
 def signal_handler(signum, frame):
-"""Setting kill signal handler"""
-self.log.error("Killing subprocess")
+self.log.error("Received SIGTERM. Terminating 
subprocesses.")
 task_copy.on_kill()
 raise AirflowException("Task received SIGTERM signal")
 signal.signal(signal.SIGTERM, signal_handler)

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/39b7d7d8/airflow/task/task_runner/base_task_runner.py
--
diff --git a/airflow/task/task_runner/base_task_runner.py 
b/airflow/task/task_runner/base_task_runner.py
index 794b450..49bc8bc 100644
--- a/airflow/task/task_runner/base_task_runner.py
+++ b/airflow/task/task_runner/base_task_runner.py
@@ -122,7 +122,8 @@ class