[jira] [Updated] (AIRFLOW-2315) S3Hook Extra Extras
[ https://issues.apache.org/jira/browse/AIRFLOW-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Bacon updated AIRFLOW-2315: Labels: beginner features newbie starter (was: ) > S3Hook Extra Extras > > > Key: AIRFLOW-2315 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2315 > Project: Apache Airflow > Issue Type: Improvement > Components: boto3 >Affects Versions: 1.9.0 >Reporter: Josh Bacon >Assignee: Josh Bacon >Priority: Minor > Labels: beginner, features, newbie, starter > Fix For: 1.10.0 > > > Feature improvement request to S3Hook to support additional JSON extra > arguments to apply to both upload and download ExtraArgs. > Allowed Upload Arguments: > [http://boto3.readthedocs.io/en/latest/reference/customizations/s3.html#boto3.s3.transfer.S3Transfer.ALLOWED_UPLOAD_ARGS] > Allowed Download Arguments: > [http://boto3.readthedocs.io/en/latest/reference/customizations/s3.html#boto3.s3.transfer.S3Transfer.ALLOWED_DOWNLOAD_ARGS] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2315) S3Hook Extra Extras
Josh Bacon created AIRFLOW-2315: --- Summary: S3Hook Extra Extras Key: AIRFLOW-2315 URL: https://issues.apache.org/jira/browse/AIRFLOW-2315 Project: Apache Airflow Issue Type: Improvement Components: boto3 Affects Versions: 1.9.0 Reporter: Josh Bacon Assignee: Josh Bacon Fix For: 1.10.0 Feature improvement request to S3Hook to support additional JSON extra arguments to apply to both upload and download ExtraArgs. Allowed Upload Arguments: [http://boto3.readthedocs.io/en/latest/reference/customizations/s3.html#boto3.s3.transfer.S3Transfer.ALLOWED_UPLOAD_ARGS] Allowed Download Arguments: [http://boto3.readthedocs.io/en/latest/reference/customizations/s3.html#boto3.s3.transfer.S3Transfer.ALLOWED_DOWNLOAD_ARGS] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (AIRFLOW-2314) Fix strip of trailing slash in GCS paths
[ https://issues.apache.org/jira/browse/AIRFLOW-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guillermo Rodríguez Cano reassigned AIRFLOW-2314: - Assignee: Guillermo Rodríguez Cano > Fix strip of trailing slash in GCS paths > > > Key: AIRFLOW-2314 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2314 > Project: Apache Airflow > Issue Type: Bug > Components: contrib, dependencies, gcp >Reporter: Guillermo Rodríguez Cano >Assignee: Guillermo Rodríguez Cano >Priority: Minor > > Behaviour of {{_parse_gcs_url}} function in > {{airflow.contrib.hooks.gcs_hook}} is stripping leading and trailing slash > characters: / in the blob part of the URI > I expect the leading slash to be removed, but not the ending one when given. > Reason being is that since the blob is a representing a full 'path' object, > it is known that the separator between the bucket and the blob will be one > slash character but there may be cases when the user wants to explicitly know > whether the object is an actual file or a 'directory'. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2314) Fix strip of trailing slash in GCS paths
[ https://issues.apache.org/jira/browse/AIRFLOW-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433859#comment-16433859 ] Guillermo Rodríguez Cano commented on AIRFLOW-2314: --- PR addressing the issue: https://github.com/apache/incubator-airflow/pull/3212 > Fix strip of trailing slash in GCS paths > > > Key: AIRFLOW-2314 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2314 > Project: Apache Airflow > Issue Type: Bug > Components: contrib, dependencies, gcp >Reporter: Guillermo Rodríguez Cano >Priority: Minor > > Behaviour of {{_parse_gcs_url}} function in > {{airflow.contrib.hooks.gcs_hook}} is stripping leading and trailing slash > characters: / in the blob part of the URI > I expect the leading slash to be removed, but not the ending one when given. > Reason being is that since the blob is a representing a full 'path' object, > it is known that the separator between the bucket and the blob will be one > slash character but there may be cases when the user wants to explicitly know > whether the object is an actual file or a 'directory'. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2314) Fix strip of trailing slash in GCS paths
Guillermo Rodríguez Cano created AIRFLOW-2314: - Summary: Fix strip of trailing slash in GCS paths Key: AIRFLOW-2314 URL: https://issues.apache.org/jira/browse/AIRFLOW-2314 Project: Apache Airflow Issue Type: Bug Components: contrib, dependencies, gcp Reporter: Guillermo Rodríguez Cano Behaviour of {{_parse_gcs_url}} function in {{airflow.contrib.hooks.gcs_hook}} is stripping leading and trailing slash characters: / in the blob part of the URI I expect the leading slash to be removed, but not the ending one when given. Reason being is that since the blob is a representing a full 'path' object, it is known that the separator between the bucket and the blob will be one slash character but there may be cases when the user wants to explicitly know whether the object is an actual file or a 'directory'. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (AIRFLOW-2291) Ability to specify runtimeVersion, pythonVersion and jobDir in ML Engine operator
[ https://issues.apache.org/jira/browse/AIRFLOW-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong resolved AIRFLOW-2291. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request #3202 [https://github.com/apache/incubator-airflow/pull/3202] > Ability to specify runtimeVersion, pythonVersion and jobDir in ML Engine > operator > - > > Key: AIRFLOW-2291 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2291 > Project: Apache Airflow > Issue Type: Improvement > Components: contrib, gcp >Affects Versions: Airflow 2.0, Airflow 1.8, Airflow 1.9.0 >Reporter: David Volquartz Lebech >Assignee: David Volquartz Lebech >Priority: Minor > Fix For: 2.0.0 > > > For the Google Cloud ML Engine Training Operator (in {{contrib}}), the > [current list of > arguments|https://github.com/apache/incubator-airflow/blob/b918a4976a641a3c944ef4ad4323fe25bd219ea0/airflow/contrib/operators/mlengine_operator.py#L538-L544] > does not seem to include the option for specifying runtime and Python > version which limits execution to Python 2.7. It also does not include the > useful job directory argument. > It seems that simply adding {{runtimeVersion}}, {{pythonVersion}} and > {{jobDir}} to the list would be enough to make this work, according to [the > documentations on ML > engine|https://cloud.google.com/ml-engine/docs/tensorflow/versioning#set-python-version-training]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2291) Ability to specify runtimeVersion, pythonVersion and jobDir in ML Engine operator
[ https://issues.apache.org/jira/browse/AIRFLOW-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433809#comment-16433809 ] ASF subversion and git services commented on AIRFLOW-2291: -- Commit d9bf1edd44af847aa1f9ecd9d797dfacb806cc00 in incubator-airflow's branch refs/heads/master from [~dlebech] [ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=d9bf1ed ] [AIRFLOW-2291] Add optional params to ML Engine This commit adds three extra optional parameters to the `MLEngineTrainingOperator` as well as a new unit test for the `MLEngineVersionOperator` Closes #3202 from dlebech/ml-engine-python-version > Ability to specify runtimeVersion, pythonVersion and jobDir in ML Engine > operator > - > > Key: AIRFLOW-2291 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2291 > Project: Apache Airflow > Issue Type: Improvement > Components: contrib, gcp >Affects Versions: Airflow 2.0, Airflow 1.8, Airflow 1.9.0 >Reporter: David Volquartz Lebech >Assignee: David Volquartz Lebech >Priority: Minor > > For the Google Cloud ML Engine Training Operator (in {{contrib}}), the > [current list of > arguments|https://github.com/apache/incubator-airflow/blob/b918a4976a641a3c944ef4ad4323fe25bd219ea0/airflow/contrib/operators/mlengine_operator.py#L538-L544] > does not seem to include the option for specifying runtime and Python > version which limits execution to Python 2.7. It also does not include the > useful job directory argument. > It seems that simply adding {{runtimeVersion}}, {{pythonVersion}} and > {{jobDir}} to the list would be enough to make this work, according to [the > documentations on ML > engine|https://cloud.google.com/ml-engine/docs/tensorflow/versioning#set-python-version-training]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
incubator-airflow git commit: [AIRFLOW-2291] Add optional params to ML Engine
Repository: incubator-airflow Updated Branches: refs/heads/master 65e7025f3 -> d9bf1edd4 [AIRFLOW-2291] Add optional params to ML Engine This commit adds three extra optional parameters to the `MLEngineTrainingOperator` as well as a new unit test for the `MLEngineVersionOperator` Closes #3202 from dlebech/ml-engine-python-version Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/d9bf1edd Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/d9bf1edd Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/d9bf1edd Branch: refs/heads/master Commit: d9bf1edd44af847aa1f9ecd9d797dfacb806cc00 Parents: 65e7025 Author: David Volquartz LebechAuthored: Wed Apr 11 14:15:06 2018 +0200 Committer: Fokko Driesprong Committed: Wed Apr 11 14:15:06 2018 +0200 -- airflow/contrib/operators/mlengine_operator.py | 28 + .../contrib/operators/test_mlengine_operator.py | 63 +++- 2 files changed, 90 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/d9bf1edd/airflow/contrib/operators/mlengine_operator.py -- diff --git a/airflow/contrib/operators/mlengine_operator.py b/airflow/contrib/operators/mlengine_operator.py index 3dd63f2..a6f186b 100644 --- a/airflow/contrib/operators/mlengine_operator.py +++ b/airflow/contrib/operators/mlengine_operator.py @@ -474,6 +474,16 @@ class MLEngineTrainingOperator(BaseOperator): :param scale_tier: Resource tier for MLEngine training job. :type scale_tier: string +:param runtime_version: The Google Cloud ML runtime version to use for training. +:type runtime_version: string + +:param python_version: The version of Python used in training. +:type python_version: string + +:param job_dir: A Google Cloud Storage path in which to store training +outputs and other data needed for training. +:type job_dir: string + :param gcp_conn_id: The connection ID to use when fetching connection info. :type gcp_conn_id: string @@ -497,6 +507,9 @@ class MLEngineTrainingOperator(BaseOperator): '_training_args', '_region', '_scale_tier', +'_runtime_version', +'_python_version', +'_job_dir' ] @apply_defaults @@ -508,6 +521,9 @@ class MLEngineTrainingOperator(BaseOperator): training_args, region, scale_tier=None, + runtime_version=None, + python_version=None, + job_dir=None, gcp_conn_id='google_cloud_default', delegate_to=None, mode='PRODUCTION', @@ -521,6 +537,9 @@ class MLEngineTrainingOperator(BaseOperator): self._training_args = training_args self._region = region self._scale_tier = scale_tier +self._runtime_version = runtime_version +self._python_version = python_version +self._job_dir = job_dir self._gcp_conn_id = gcp_conn_id self._delegate_to = delegate_to self._mode = mode @@ -555,6 +574,15 @@ class MLEngineTrainingOperator(BaseOperator): } } +if self._runtime_version: +training_request['trainingInput']['runtimeVersion'] = self._runtime_version + +if self._python_version: +training_request['trainingInput']['pythonVersion'] = self._python_version + +if self._job_dir: +training_request['trainingInput']['jobDir'] = self._job_dir + if self._mode == 'DRY_RUN': self.log.info('In dry_run mode.') self.log.info('MLEngine Training job request is: {}'.format( http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/d9bf1edd/tests/contrib/operators/test_mlengine_operator.py -- diff --git a/tests/contrib/operators/test_mlengine_operator.py b/tests/contrib/operators/test_mlengine_operator.py index 2766e5d..9a8d56c 100644 --- a/tests/contrib/operators/test_mlengine_operator.py +++ b/tests/contrib/operators/test_mlengine_operator.py @@ -17,6 +17,7 @@ from __future__ import absolute_import, division, print_function +import copy import datetime import unittest @@ -26,7 +27,8 @@ from mock import ANY, patch from airflow import DAG, configuration from airflow.contrib.operators.mlengine_operator import (MLEngineBatchPredictionOperator, - MLEngineTrainingOperator) + MLEngineTrainingOperator, +
[jira] [Resolved] (AIRFLOW-1774) Better handling of templated parameters in Google ML batch prediction/training operators
[ https://issues.apache.org/jira/browse/AIRFLOW-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong resolved AIRFLOW-1774. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request #2746 [https://github.com/apache/incubator-airflow/pull/2746] > Better handling of templated parameters in Google ML batch > prediction/training operators > > > Key: AIRFLOW-1774 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1774 > Project: Apache Airflow > Issue Type: Improvement > Components: contrib, operators >Affects Versions: Airflow 2.0, 1.9.0 >Reporter: Guillermo Rodríguez Cano >Assignee: Guillermo Rodríguez Cano >Priority: Major > Fix For: 2.0.0 > > > The {{MLEngineBatchPredictionOperator}} does not support well the templated > parameter {{job_id}} due to a helper function used to detect and cleanup bad > job names inhibiting the templating engine to work. I suspect the same may > happen with {{MLEngineTrainingOperator}} as well. > Google ML requieres a unique job id, therefore it is critical to have the > possibility to customise the job's name easily, and preferably with some data > related to the DAG, for example, appending the day to the job's name via a > macro like {{ds}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
incubator-airflow git commit: [AIRFLOW-1774] Allow consistent templating of arguments in MLEngineBatchPredictionOperator
Repository: incubator-airflow Updated Branches: refs/heads/master 3475faf6b -> 65e7025f3 [AIRFLOW-1774] Allow consistent templating of arguments in MLEngineBatchPredictionOperator Fix a minor typo and a wrong non-default assignment Fix one more typo Adapt tests to new error messages and fix another typo Fix exception type in utils operator test class Improve cleansing of non-valid training and prediciton job names Closes #2746 from wileeam/ml-engine-prediction- job-normalization Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/65e7025f Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/65e7025f Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/65e7025f Branch: refs/heads/master Commit: 65e7025f3af378dfa825eb04d005ec1f7a422cde Parents: 3475faf Author: Guillermo RodrÃguez CanoAuthored: Wed Apr 11 11:57:21 2018 +0200 Committer: Fokko Driesprong Committed: Wed Apr 11 11:57:21 2018 +0200 -- airflow/contrib/operators/mlengine_operator.py | 205 ++- .../contrib/operators/test_mlengine_operator.py | 88 .../operators/test_mlengine_operator_utils.py | 6 +- 3 files changed, 156 insertions(+), 143 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/65e7025f/airflow/contrib/operators/mlengine_operator.py -- diff --git a/airflow/contrib/operators/mlengine_operator.py b/airflow/contrib/operators/mlengine_operator.py index 0d033d3..3dd63f2 100644 --- a/airflow/contrib/operators/mlengine_operator.py +++ b/airflow/contrib/operators/mlengine_operator.py @@ -15,77 +15,17 @@ # limitations under the License. import re -from airflow import settings +from apiclient import errors + from airflow.contrib.hooks.gcp_mlengine_hook import MLEngineHook from airflow.exceptions import AirflowException from airflow.operators import BaseOperator from airflow.utils.decorators import apply_defaults -from apiclient import errors - from airflow.utils.log.logging_mixin import LoggingMixin log = LoggingMixin().log -def _create_prediction_input(project_id, - region, - data_format, - input_paths, - output_path, - model_name=None, - version_name=None, - uri=None, - max_worker_count=None, - runtime_version=None): -""" -Create the batch prediction input from the given parameters. - -Args: -A subset of arguments documented in __init__ method of class -MLEngineBatchPredictionOperator - -Returns: -A dictionary representing the predictionInput object as documented -in https://cloud.google.com/ml-engine/reference/rest/v1/projects.jobs. - -Raises: -ValueError: if a unique model/version origin cannot be determined. -""" -prediction_input = { -'dataFormat': data_format, -'inputPaths': input_paths, -'outputPath': output_path, -'region': region -} - -if uri: -if model_name or version_name: -log.error( -'Ambiguous model origin: Both uri and model/version name are provided.' -) -raise ValueError('Ambiguous model origin.') -prediction_input['uri'] = uri -elif model_name: -origin_name = 'projects/{}/models/{}'.format(project_id, model_name) -if not version_name: -prediction_input['modelName'] = origin_name -else: -prediction_input['versionName'] = \ -origin_name + '/versions/{}'.format(version_name) -else: -log.error( -'Missing model origin: Batch prediction expects a model, ' -'a model & version combination, or a URI to savedModel.') -raise ValueError('Missing model origin.') - -if max_worker_count: -prediction_input['maxWorkerCount'] = max_worker_count -if runtime_version: -prediction_input['runtimeVersion'] = runtime_version - -return prediction_input - - def _normalize_mlengine_job_id(job_id): """ Replaces invalid MLEngine job_id characters with '_'. @@ -99,10 +39,27 @@ def _normalize_mlengine_job_id(job_id): Returns: A valid job_id representation. """ -match = re.search(r'\d', job_id) + +# Add a prefix when a job_id starts with a digit or a template +match = re.search(r'\d|\{{2}', job_id) if match and match.start() is 0: -job_id = 'z_{}'.format(job_id) -return
[jira] [Commented] (AIRFLOW-1774) Better handling of templated parameters in Google ML batch prediction/training operators
[ https://issues.apache.org/jira/browse/AIRFLOW-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433662#comment-16433662 ] ASF subversion and git services commented on AIRFLOW-1774: -- Commit 65e7025f3af378dfa825eb04d005ec1f7a422cde in incubator-airflow's branch refs/heads/master from [~wileeam] [ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=65e7025 ] [AIRFLOW-1774] Allow consistent templating of arguments in MLEngineBatchPredictionOperator Fix a minor typo and a wrong non-default assignment Fix one more typo Adapt tests to new error messages and fix another typo Fix exception type in utils operator test class Improve cleansing of non-valid training and prediciton job names Closes #2746 from wileeam/ml-engine-prediction- job-normalization > Better handling of templated parameters in Google ML batch > prediction/training operators > > > Key: AIRFLOW-1774 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1774 > Project: Apache Airflow > Issue Type: Improvement > Components: contrib, operators >Affects Versions: Airflow 2.0, 1.9.0 >Reporter: Guillermo Rodríguez Cano >Assignee: Guillermo Rodríguez Cano >Priority: Major > > The {{MLEngineBatchPredictionOperator}} does not support well the templated > parameter {{job_id}} due to a helper function used to detect and cleanup bad > job names inhibiting the templating engine to work. I suspect the same may > happen with {{MLEngineTrainingOperator}} as well. > Google ML requieres a unique job id, therefore it is critical to have the > possibility to customise the job's name easily, and preferably with some data > related to the DAG, for example, appending the day to the job's name via a > macro like {{ds}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2302) Add missing operators/hooks to the docs
[ https://issues.apache.org/jira/browse/AIRFLOW-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433657#comment-16433657 ] ASF subversion and git services commented on AIRFLOW-2302: -- Commit 3475faf6ba9efc4ff9a08648a13f3f290ff671d7 in incubator-airflow's branch refs/heads/master from [~Fokko] [ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=3475faf ] [AIRFLOW-2302] Add missing operators and hooks Many operators and hooks are not listed in the config, and therefore not picked up by the docks generator. Closes #3201 from Fokko/airflow-2302-add-missing- docs > Add missing operators/hooks to the docs > --- > > Key: AIRFLOW-2302 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2302 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Fokko Driesprong >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
incubator-airflow git commit: [AIRFLOW-2302] Add missing operators and hooks
Repository: incubator-airflow Updated Branches: refs/heads/master 69ccf8432 -> 3475faf6b [AIRFLOW-2302] Add missing operators and hooks Many operators and hooks are not listed in the config, and therefore not picked up by the docks generator. Closes #3201 from Fokko/airflow-2302-add-missing- docs Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/3475faf6 Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/3475faf6 Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/3475faf6 Branch: refs/heads/master Commit: 3475faf6ba9efc4ff9a08648a13f3f290ff671d7 Parents: 69ccf84 Author: Fokko DriesprongAuthored: Wed Apr 11 11:55:29 2018 +0200 Committer: Fokko Driesprong Committed: Wed Apr 11 11:55:29 2018 +0200 -- .github/PULL_REQUEST_TEMPLATE.md | 7 + docs/code.rst| 53 +-- docs/integration.rst | 41 +-- docs/kubernetes.rst | 3 +- 4 files changed, 73 insertions(+), 31 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/3475faf6/.github/PULL_REQUEST_TEMPLATE.md -- diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md index 72cbb76..6000d0e 100644 --- a/.github/PULL_REQUEST_TEMPLATE.md +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -23,4 +23,11 @@ Make sure you have checked _all_ steps below. 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" + +### Documentation +- [ ] In case of new functionality, my PR adds documentation that describes how to use it. +- When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added. + + +### Code Quality - [ ] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff` http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/3475faf6/docs/code.rst -- diff --git a/docs/code.rst b/docs/code.rst index 34ecec9..a30d117 100644 --- a/docs/code.rst +++ b/docs/code.rst @@ -80,6 +80,7 @@ Operators .. autoclass:: airflow.operators.python_operator.ShortCircuitOperator .. autoclass:: airflow.operators.http_operator.SimpleHttpOperator .. autoclass:: airflow.operators.slack_operator.SlackAPIOperator +.. autoclass:: airflow.operators.slack_operator.SlackAPIPostOperator .. autoclass:: airflow.operators.sqlite_operator.SqliteOperator .. autoclass:: airflow.operators.subdag_operator.SubDagOperator .. autoclass:: airflow.operators.dagrun_operator.TriggerDagRunOperator @@ -130,6 +131,7 @@ Operators .. autoclass:: airflow.contrib.operators.dataproc_operator.DataProcSparkOperator .. autoclass:: airflow.contrib.operators.dataproc_operator.DataProcHadoopOperator .. autoclass:: airflow.contrib.operators.dataproc_operator.DataProcPySparkOperator +.. autoclass:: airflow.contrib.operators.dataproc_operator.DataprocWorkflowTemplateBaseOperator .. autoclass:: airflow.contrib.operators.dataproc_operator.DataprocWorkflowTemplateInstantiateOperator .. autoclass:: airflow.contrib.operators.dataproc_operator.DataprocWorkflowTemplateInstantiateInlineOperator .. autoclass:: airflow.contrib.operators.datastore_export_operator.DatastoreExportOperator @@ -148,9 +150,11 @@ Operators .. autoclass:: airflow.contrib.operators.gcs_operator.GoogleCloudStorageCreateBucketOperator .. autoclass:: airflow.contrib.operators.gcs_to_bq.GoogleCloudStorageToBigQueryOperator .. autoclass:: airflow.contrib.operators.gcs_to_gcs.GoogleCloudStorageToGoogleCloudStorageOperator +.. autoclass:: airflow.contrib.operators.gcs_to_s3.GoogleCloudStorageToS3Operator .. autoclass:: airflow.contrib.operators.hipchat_operator.HipChatAPIOperator .. autoclass:: airflow.contrib.operators.hipchat_operator.HipChatAPISendRoomNotificationOperator .. autoclass:: airflow.contrib.operators.hive_to_dynamodb.HiveToDynamoDBTransferOperator +.. autoclass:: airflow.contrib.operators.jenkins_job_trigger_operator.JenkinsJobTriggerOperator .. autoclass:: airflow.contrib.operators.jira_operator.JiraOperator .. autoclass:: airflow.contrib.operators.kubernetes_pod_operator.KubernetesPodOperator .. autoclass:: airflow.contrib.operators.mlengine_operator.MLEngineBatchPredictionOperator @@ -167,13 +171,14 @@ Operators .. autoclass:: airflow.contrib.operators.qubole_operator.QuboleOperator .. autoclass:: airflow.contrib.operators.s3_list_operator.S3ListOperator .. autoclass:: airflow.contrib.operators.sftp_operator.SFTPOperator +.. autoclass:: airflow.contrib.operators.slack_webhook_operator.SlackWebhookOperator +.. autoclass::
[jira] [Resolved] (AIRFLOW-2302) Add missing operators/hooks to the docs
[ https://issues.apache.org/jira/browse/AIRFLOW-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong resolved AIRFLOW-2302. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request #3201 [https://github.com/apache/incubator-airflow/pull/3201] > Add missing operators/hooks to the docs > --- > > Key: AIRFLOW-2302 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2302 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Fokko Driesprong >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (AIRFLOW-2312) Docs Typo Correction: Corresponding
[ https://issues.apache.org/jira/browse/AIRFLOW-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong resolved AIRFLOW-2312. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request #3211 [https://github.com/apache/incubator-airflow/pull/3211] > Docs Typo Correction: Corresponding > --- > > Key: AIRFLOW-2312 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2312 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Tam-Sanh >Priority: Trivial > Fix For: 2.0.0 > > > [https://github.com/apache/incubator-airflow/pull/3211] > {{ correpsonding to corresponding}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
incubator-airflow git commit: [AIRFLOW-2312] Docs Typo Correction: Corresponding
Repository: incubator-airflow Updated Branches: refs/heads/master 39b7d7d87 -> 69ccf8432 [AIRFLOW-2312] Docs Typo Correction: Corresponding # JIRA [x] My PR addresses the following Airflow JIRA issues and references them in the PR title. * https://issues.apache.org/jira/browse/AIRFLOW-2312 # Description [x] Here are some details about my PR Minor typo fix in the docs. `correpsonding` to `corresponding` # Tests [x] My PR adds the following unit tests OR does not need testing for this extremely good reason: I assume I don't need testing for docs # Commits [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" Closes #3211 from tamsanh/patch-1 Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/69ccf843 Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/69ccf843 Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/69ccf843 Branch: refs/heads/master Commit: 69ccf843213979baadf6aaf4967e6e52d490513e Parents: 39b7d7d Author: TamuAuthored: Wed Apr 11 11:42:46 2018 +0200 Committer: Fokko Driesprong Committed: Wed Apr 11 11:42:46 2018 +0200 -- docs/concepts.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/69ccf843/docs/concepts.rst -- diff --git a/docs/concepts.rst b/docs/concepts.rst index be6809b..3f70555 100644 --- a/docs/concepts.rst +++ b/docs/concepts.rst @@ -380,7 +380,7 @@ opposed to XComs that are pushed manually). If ``xcom_pull`` is passed a single string for ``task_ids``, then the most recent XCom value from that task is returned; if a list of ``task_ids`` is -passed, then a correpsonding list of XCom values is returned. +passed, then a corresponding list of XCom values is returned. .. code:: python
[jira] [Commented] (AIRFLOW-2312) Docs Typo Correction: Corresponding
[ https://issues.apache.org/jira/browse/AIRFLOW-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433633#comment-16433633 ] ASF subversion and git services commented on AIRFLOW-2312: -- Commit 69ccf843213979baadf6aaf4967e6e52d490513e in incubator-airflow's branch refs/heads/master from [~tamsanh] [ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=69ccf84 ] [AIRFLOW-2312] Docs Typo Correction: Corresponding # JIRA [x] My PR addresses the following Airflow JIRA issues and references them in the PR title. * https://issues.apache.org/jira/browse/AIRFLOW-2312 # Description [x] Here are some details about my PR Minor typo fix in the docs. `correpsonding` to `corresponding` # Tests [x] My PR adds the following unit tests OR does not need testing for this extremely good reason: I assume I don't need testing for docs # Commits [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" Closes #3211 from tamsanh/patch-1 > Docs Typo Correction: Corresponding > --- > > Key: AIRFLOW-2312 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2312 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Tam-Sanh >Priority: Trivial > > [https://github.com/apache/incubator-airflow/pull/3211] > {{ correpsonding to corresponding}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2312) Docs Typo Correction: Corresponding
[ https://issues.apache.org/jira/browse/AIRFLOW-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433632#comment-16433632 ] ASF subversion and git services commented on AIRFLOW-2312: -- Commit 69ccf843213979baadf6aaf4967e6e52d490513e in incubator-airflow's branch refs/heads/master from [~tamsanh] [ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=69ccf84 ] [AIRFLOW-2312] Docs Typo Correction: Corresponding # JIRA [x] My PR addresses the following Airflow JIRA issues and references them in the PR title. * https://issues.apache.org/jira/browse/AIRFLOW-2312 # Description [x] Here are some details about my PR Minor typo fix in the docs. `correpsonding` to `corresponding` # Tests [x] My PR adds the following unit tests OR does not need testing for this extremely good reason: I assume I don't need testing for docs # Commits [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" Closes #3211 from tamsanh/patch-1 > Docs Typo Correction: Corresponding > --- > > Key: AIRFLOW-2312 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2312 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Tam-Sanh >Priority: Trivial > > [https://github.com/apache/incubator-airflow/pull/3211] > {{ correpsonding to corresponding}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (AIRFLOW-2313) Add ClusterLifecycleConfig to DataprocClusterCreateOperator
[ https://issues.apache.org/jira/browse/AIRFLOW-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Egidijus Bartkus reassigned AIRFLOW-2313: - Assignee: Egidijus Bartkus > Add ClusterLifecycleConfig to DataprocClusterCreateOperator > --- > > Key: AIRFLOW-2313 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2313 > Project: Apache Airflow > Issue Type: Improvement > Components: contrib, gcp, operators >Affects Versions: Airflow 2.0, 1.9.0 >Reporter: Egidijus Bartkus >Assignee: Egidijus Bartkus >Priority: Minor > > Google introduced [Cloud Dataproc Cluster Scheduled > Deletion|https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion] > feature which isn't yet supported by the DataprocClusterCreateOperator. > > Having this option enabled could be useful to make sure Dataproc cluster > doesn't stay active for extended time periods (In case delete > DataprocClusterDeleteOperator isn't suitable or fails to run for any reason). > > Implementation requires adding config object > [LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig] > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (AIRFLOW-2313) Add ClusterLifecycleConfig to DataprocClusterCreateOperator
[ https://issues.apache.org/jira/browse/AIRFLOW-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] e b updated AIRFLOW-2313: - Description: Google introduced [Cloud Dataproc Cluster Scheduled Deletion|https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion] feature which isn't yet supported by the DataprocClusterCreateOperator. Having this option enabled could be useful to make sure Dataproc cluster doesn't stay active for extended time periods (In case delete DataprocClusterDeleteOperator isn't suitable or fails to run for any reason). Implementation requires adding config object [LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig] was: Google introduced Cloud Dataproc Cluster [Scheduled-Deletion|[https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion]] feature which isn't yet supported by the DataprocClusterCreateOperator. Having this option enabled could be useful to make sure Dataproc cluster doesn't stay active for extended time periods (In case delete DataprocClusterDeleteOperator isn't suitable or fails to run for any reason). Implementation requires adding config object [LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig] > Add ClusterLifecycleConfig to DataprocClusterCreateOperator > --- > > Key: AIRFLOW-2313 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2313 > Project: Apache Airflow > Issue Type: Improvement > Components: contrib, gcp, operators >Affects Versions: Airflow 2.0, 1.9.0 >Reporter: e b >Priority: Minor > > Google introduced [Cloud Dataproc Cluster Scheduled > Deletion|https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion] > feature which isn't yet supported by the DataprocClusterCreateOperator. > > Having this option enabled could be useful to make sure Dataproc cluster > doesn't stay active for extended time periods (In case delete > DataprocClusterDeleteOperator isn't suitable or fails to run for any reason). > > Implementation requires adding config object > [LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig] > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (AIRFLOW-2313) Add ClusterLifecycleConfig to DataprocClusterCreateOperator
[ https://issues.apache.org/jira/browse/AIRFLOW-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] e b updated AIRFLOW-2313: - Description: Google introduced Cloud Dataproc Cluster [Scheduled-Deletion|[https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion]] feature which isn't yet supported by the DataprocClusterCreateOperator. Having this option enabled could be useful to make sure Dataproc cluster doesn't stay active for extended time periods (In case delete DataprocClusterDeleteOperator isn't suitable or fails to run for any reason). Implementation requires adding config object [LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig] was: Google introduced [Cloud Dataproc Cluster Scheduled Deletion|[https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion]] feature which isn't yet supported by the DataprocClusterCreateOperator. Having this option enabled could be useful to make sure Dataproc cluster doesn't stay active for extended time periods (In case delete DataprocClusterDeleteOperator isn't suitable or fails to run for any reason). Implementation requires adding config object [LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig] > Add ClusterLifecycleConfig to DataprocClusterCreateOperator > --- > > Key: AIRFLOW-2313 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2313 > Project: Apache Airflow > Issue Type: Improvement > Components: contrib, gcp, operators >Affects Versions: Airflow 2.0, 1.9.0 >Reporter: e b >Priority: Minor > > Google introduced Cloud Dataproc Cluster > [Scheduled-Deletion|[https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion]] > feature which isn't yet supported by the DataprocClusterCreateOperator. > > Having this option enabled could be useful to make sure Dataproc cluster > doesn't stay active for extended time periods (In case delete > DataprocClusterDeleteOperator isn't suitable or fails to run for any reason). > > Implementation requires adding config object > [LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig] > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (AIRFLOW-2313) Add ClusterLifecycleConfig to DataprocClusterCreateOperator
[ https://issues.apache.org/jira/browse/AIRFLOW-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] e b updated AIRFLOW-2313: - Description: Google introduced [Cloud Dataproc Cluster Scheduled Deletion|[https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion]] feature which isn't yet supported by the DataprocClusterCreateOperator. Having this option enabled could be useful to make sure Dataproc cluster doesn't stay active for extended time periods (In case delete DataprocClusterDeleteOperator isn't suitable or fails to run for any reason). Implementation requires adding config object [LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig] was: Google introduced [Cloud Dataproc Cluster Scheduled Deletion (TTL)|[https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion]] feature which isn't yet supported by the DataprocClusterCreateOperator. Having this option enabled could be useful to make sure Dataproc cluster doesn't stay active for extended time periods (In case delete DataprocClusterDeleteOperator isn't suitable or fails to run for any reason). Implementation requires adding config object [LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig] > Add ClusterLifecycleConfig to DataprocClusterCreateOperator > --- > > Key: AIRFLOW-2313 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2313 > Project: Apache Airflow > Issue Type: Improvement > Components: contrib, gcp, operators >Affects Versions: Airflow 2.0, 1.9.0 >Reporter: e b >Priority: Minor > > Google introduced [Cloud Dataproc Cluster Scheduled > Deletion|[https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion]] > feature which isn't yet supported by the DataprocClusterCreateOperator. > > Having this option enabled could be useful to make sure Dataproc cluster > doesn't stay active for extended time periods (In case delete > DataprocClusterDeleteOperator isn't suitable or fails to run for any reason). > > Implementation requires adding config object > [LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig] > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2313) Add ClusterLifecycleConfig to DataprocClusterCreateOperator
e b created AIRFLOW-2313: Summary: Add ClusterLifecycleConfig to DataprocClusterCreateOperator Key: AIRFLOW-2313 URL: https://issues.apache.org/jira/browse/AIRFLOW-2313 Project: Apache Airflow Issue Type: Improvement Components: contrib, gcp, operators Affects Versions: Airflow 2.0, 1.9.0 Reporter: e b Google introduced [Cloud Dataproc Cluster Scheduled Deletion (TTL)|[https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion]] feature which isn't yet supported by the DataprocClusterCreateOperator. Having this option enabled could be useful to make sure Dataproc cluster doesn't stay active for extended time periods (In case delete DataprocClusterDeleteOperator isn't suitable or fails to run for any reason). Implementation requires adding config object [LifecycleConfig|https://cloud.google.com/dataproc/docs/reference/rest/v1beta2/ClusterConfig#LifecycleConfig] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (AIRFLOW-1623) Clearing task in UI does not trigger on_kill method in operator
[ https://issues.apache.org/jira/browse/AIRFLOW-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1623. - Resolution: Fixed Issue resolved by pull request #3204 [https://github.com/apache/incubator-airflow/pull/3204] > Clearing task in UI does not trigger on_kill method in operator > --- > > Key: AIRFLOW-1623 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1623 > Project: Apache Airflow > Issue Type: Bug > Components: operators >Affects Versions: Airflow 2.0 >Reporter: Richard Penman >Priority: Major > Fix For: 1.10.0 > > > When a task is cleared in the UI it doesn't call the [operators on_kill() > method|https://github.com/apache/incubator-airflow/blob/b2e1753f5b74ad1b6e0889f7b784ce69623c95ce/airflow/models.py#L2380] > to clean up the task. Apparently this is meant to be handled in the > [LocalTaskJob.on_kill()|https://github.com/apache/incubator-airflow/blob/b2e1753f5b74ad1b6e0889f7b784ce69623c95ce/airflow/jobs.py#L2512] > method, however it is not currently. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-1623) Clearing task in UI does not trigger on_kill method in operator
[ https://issues.apache.org/jira/browse/AIRFLOW-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433447#comment-16433447 ] ASF subversion and git services commented on AIRFLOW-1623: -- Commit 39b7d7d87cabae9de02ba5d64b998317b494bdd9 in incubator-airflow's branch refs/heads/master from [~bolke] [ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=39b7d7d ] [AIRFLOW-1623] Trigger on_kill method in operators on_kill methods were not triggered, due to processes not being properly terminated. This was due to the fact the runners use a shell which is then replaced by the child pid, which is unknown to Airflow. Closes #3204 from bolkedebruin/AIRFLOW-1623 > Clearing task in UI does not trigger on_kill method in operator > --- > > Key: AIRFLOW-1623 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1623 > Project: Apache Airflow > Issue Type: Bug > Components: operators >Affects Versions: Airflow 2.0 >Reporter: Richard Penman >Priority: Major > Fix For: 1.10.0 > > > When a task is cleared in the UI it doesn't call the [operators on_kill() > method|https://github.com/apache/incubator-airflow/blob/b2e1753f5b74ad1b6e0889f7b784ce69623c95ce/airflow/models.py#L2380] > to clean up the task. Apparently this is meant to be handled in the > [LocalTaskJob.on_kill()|https://github.com/apache/incubator-airflow/blob/b2e1753f5b74ad1b6e0889f7b784ce69623c95ce/airflow/jobs.py#L2512] > method, however it is not currently. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-1623) Clearing task in UI does not trigger on_kill method in operator
[ https://issues.apache.org/jira/browse/AIRFLOW-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433448#comment-16433448 ] ASF subversion and git services commented on AIRFLOW-1623: -- Commit 39b7d7d87cabae9de02ba5d64b998317b494bdd9 in incubator-airflow's branch refs/heads/master from [~bolke] [ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=39b7d7d ] [AIRFLOW-1623] Trigger on_kill method in operators on_kill methods were not triggered, due to processes not being properly terminated. This was due to the fact the runners use a shell which is then replaced by the child pid, which is unknown to Airflow. Closes #3204 from bolkedebruin/AIRFLOW-1623 > Clearing task in UI does not trigger on_kill method in operator > --- > > Key: AIRFLOW-1623 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1623 > Project: Apache Airflow > Issue Type: Bug > Components: operators >Affects Versions: Airflow 2.0 >Reporter: Richard Penman >Priority: Major > Fix For: 1.10.0 > > > When a task is cleared in the UI it doesn't call the [operators on_kill() > method|https://github.com/apache/incubator-airflow/blob/b2e1753f5b74ad1b6e0889f7b784ce69623c95ce/airflow/models.py#L2380] > to clean up the task. Apparently this is meant to be handled in the > [LocalTaskJob.on_kill()|https://github.com/apache/incubator-airflow/blob/b2e1753f5b74ad1b6e0889f7b784ce69623c95ce/airflow/jobs.py#L2512] > method, however it is not currently. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
incubator-airflow git commit: [AIRFLOW-1623] Trigger on_kill method in operators
Repository: incubator-airflow Updated Branches: refs/heads/master e58d0c9e2 -> 39b7d7d87 [AIRFLOW-1623] Trigger on_kill method in operators on_kill methods were not triggered, due to processes not being properly terminated. This was due to the fact the runners use a shell which is then replaced by the child pid, which is unknown to Airflow. Closes #3204 from bolkedebruin/AIRFLOW-1623 Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/39b7d7d8 Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/39b7d7d8 Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/39b7d7d8 Branch: refs/heads/master Commit: 39b7d7d87cabae9de02ba5d64b998317b494bdd9 Parents: e58d0c9 Author: Bolke de BruinAuthored: Wed Apr 11 08:05:42 2018 +0200 Committer: Bolke de Bruin Committed: Wed Apr 11 08:05:42 2018 +0200 -- .../contrib/task_runner/cgroup_task_runner.py | 4 +- airflow/jobs.py | 2 +- airflow/models.py | 3 +- airflow/task/task_runner/base_task_runner.py| 3 +- airflow/task/task_runner/bash_task_runner.py| 4 +- airflow/utils/helpers.py| 113 +--- tests/__init__.py | 3 + tests/dags/test_on_kill.py | 40 ++ tests/task/__init__.py | 18 +++ tests/task/task_runner/__init__.py | 13 ++ tests/task/task_runner/test_bash_task_runner.py | 131 +++ tests/utils/test_helpers.py | 75 +-- 12 files changed, 284 insertions(+), 125 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/39b7d7d8/airflow/contrib/task_runner/cgroup_task_runner.py -- diff --git a/airflow/contrib/task_runner/cgroup_task_runner.py b/airflow/contrib/task_runner/cgroup_task_runner.py index 87767a6..04ae81c 100644 --- a/airflow/contrib/task_runner/cgroup_task_runner.py +++ b/airflow/contrib/task_runner/cgroup_task_runner.py @@ -21,7 +21,7 @@ from cgroupspy import trees import psutil from airflow.task_runner.base_task_runner import BaseTaskRunner -from airflow.utils.helpers import kill_process_tree +from airflow.utils.helpers import reap_process_group class CgroupTaskRunner(BaseTaskRunner): @@ -176,7 +176,7 @@ class CgroupTaskRunner(BaseTaskRunner): def terminate(self): if self.process and psutil.pid_exists(self.process.pid): -kill_process_tree(self.log, self.process.pid) +reap_process_group(self.process.pid, self.log) def on_finish(self): # Let the OOM watcher thread know we're done to avoid false OOM alarms http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/39b7d7d8/airflow/jobs.py -- diff --git a/airflow/jobs.py b/airflow/jobs.py index bcff868..1911896 100644 --- a/airflow/jobs.py +++ b/airflow/jobs.py @@ -2514,7 +2514,7 @@ class LocalTaskJob(BaseJob): def signal_handler(signum, frame): """Setting kill signal handler""" -self.log.error("Killing subprocess") +self.log.error("Received SIGTERM. Terminating subprocesses") self.on_kill() raise AirflowException("LocalTaskJob received SIGTERM signal") signal.signal(signal.SIGTERM, signal_handler) http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/39b7d7d8/airflow/models.py -- diff --git a/airflow/models.py b/airflow/models.py index e89e776..afcacd1 100755 --- a/airflow/models.py +++ b/airflow/models.py @@ -1539,8 +1539,7 @@ class TaskInstance(Base, LoggingMixin): self.task = task_copy def signal_handler(signum, frame): -"""Setting kill signal handler""" -self.log.error("Killing subprocess") +self.log.error("Received SIGTERM. Terminating subprocesses.") task_copy.on_kill() raise AirflowException("Task received SIGTERM signal") signal.signal(signal.SIGTERM, signal_handler) http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/39b7d7d8/airflow/task/task_runner/base_task_runner.py -- diff --git a/airflow/task/task_runner/base_task_runner.py b/airflow/task/task_runner/base_task_runner.py index 794b450..49bc8bc 100644 --- a/airflow/task/task_runner/base_task_runner.py +++ b/airflow/task/task_runner/base_task_runner.py @@ -122,7 +122,8 @@ class