[jira] [Commented] (AIRFLOW-2558) Clear TASK/DAG is clearing all executions
[ https://issues.apache.org/jira/browse/AIRFLOW-2558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499602#comment-16499602 ] Tao Feng commented on AIRFLOW-2558: --- Will look into it this week. Thanks for reporting. > Clear TASK/DAG is clearing all executions > - > > Key: AIRFLOW-2558 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2558 > Project: Apache Airflow > Issue Type: Bug >Affects Versions: Airflow 2.0 >Reporter: Marcos Bernardelli >Assignee: Tao Feng >Priority: Major > > When I try to clear a specific DAG/task execution, Airflow tries to re-run > all past executions: > [Animated > GIF|https://gist.githubusercontent.com/bern4rdelli/34c1e57acd53c8c67417604202f3e0e6/raw/4bcb3d3c23f2a3bb7f7bfb3e977d935e5bb9f0ee/clear.gif] > (I failed miserably trying to attach the animated GIF :() > > This behavior was changed here: > [https://github.com/apache/incubator-airflow/pull/3444] > The old version looked like this: > {code:python} > drs = session.query(DagModel).filter_by(dag_id=self.dag_id).all() > {code} > It was then changed to: > {code:python} > drs = session.query(DagRun).filter_by(dag_id=self.dag_id).all() > {code} > This new query (using DagRun) gets all past executions, even when the > "Past" button is not checked. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
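The regression described in the report can be sketched without Airflow: filtering runs by dag_id alone selects every past run, whereas a "clear" restricted to one execution should also bound the query by the chosen execution date. A minimal stand-alone model (plain dicts stand in for DagRun rows; `clear_runs` is a hypothetical helper, not Airflow's API):

```python
from datetime import datetime

# Hypothetical stand-ins for DagRun rows of one DAG, one per day.
runs = [
    {"dag_id": "my_dag", "execution_date": datetime(2018, 6, d)}
    for d in range(1, 6)
]

def clear_runs(dag_id, start_date=None, end_date=None):
    """Select the runs a 'clear' should touch.

    Filtering only by dag_id (as the DagRun query in PR #3444 does)
    selects every past run; bounding by the selected execution-date
    window restricts the clear to the chosen execution.
    """
    selected = [r for r in runs if r["dag_id"] == dag_id]
    if start_date is not None:
        selected = [r for r in selected if r["execution_date"] >= start_date]
    if end_date is not None:
        selected = [r for r in selected if r["execution_date"] <= end_date]
    return selected

# dag_id alone: all 5 past executions are selected for clearing.
print(len(clear_runs("my_dag")))  # 5

# Bounded to a single execution date: only that run is selected.
d = datetime(2018, 6, 3)
print(len(clear_runs("my_dag", start_date=d, end_date=d)))  # 1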
[jira] [Assigned] (AIRFLOW-2557) Reduce time spent in S3 tests
[ https://issues.apache.org/jira/browse/AIRFLOW-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong reassigned AIRFLOW-2557: - Assignee: Bolke de Bruin > Reduce time spent in S3 tests > - > > Key: AIRFLOW-2557 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2557 > Project: Apache Airflow > Issue Type: Sub-task >Reporter: Bolke de Bruin >Assignee: Bolke de Bruin >Priority: Major > Fix For: 1.10.0, 2.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (AIRFLOW-2557) Reduce time spent in S3 tests
[ https://issues.apache.org/jira/browse/AIRFLOW-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong resolved AIRFLOW-2557. --- Resolution: Fixed Fix Version/s: 2.0.0 1.10.0 Issue resolved by pull request #3455 [https://github.com/apache/incubator-airflow/pull/3455] > Reduce time spent in S3 tests > - > > Key: AIRFLOW-2557 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2557 > Project: Apache Airflow > Issue Type: Sub-task >Reporter: Bolke de Bruin >Assignee: Bolke de Bruin >Priority: Major > Fix For: 1.10.0, 2.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2557) Reduce time spent in S3 tests
[ https://issues.apache.org/jira/browse/AIRFLOW-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499548#comment-16499548 ] ASF subversion and git services commented on AIRFLOW-2557: -- Commit a47b2776f159f8c439de89e68545e90b81a4397a in incubator-airflow's branch refs/heads/master from [~bolke] [ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=a47b277 ] [AIRFLOW-2557] Fix pagination for s3 Paged tests for s3 are taking over 120 seconds. There is functionality to set the page size. This reduces the time spent on tests. Closes #3455 from bolkedebruin/AIRFLOW-2557 > Reduce time spent in S3 tests > - > > Key: AIRFLOW-2557 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2557 > Project: Apache Airflow > Issue Type: Sub-task >Reporter: Bolke de Bruin >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2557) Reduce time spent in S3 tests
[ https://issues.apache.org/jira/browse/AIRFLOW-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499549#comment-16499549 ] ASF subversion and git services commented on AIRFLOW-2557: -- Commit a47b2776f159f8c439de89e68545e90b81a4397a in incubator-airflow's branch refs/heads/master from [~bolke] [ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=a47b277 ] [AIRFLOW-2557] Fix pagination for s3 Paged tests for s3 are taking over 120 seconds. There is functionality to set the page size. This reduces the time spent on tests. Closes #3455 from bolkedebruin/AIRFLOW-2557 > Reduce time spent in S3 tests > - > > Key: AIRFLOW-2557 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2557 > Project: Apache Airflow > Issue Type: Sub-task >Reporter: Bolke de Bruin >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
incubator-airflow git commit: [AIRFLOW-2557] Fix pagination for s3
Repository: incubator-airflow Updated Branches: refs/heads/master 91dd36866 -> a47b2776f [AIRFLOW-2557] Fix pagination for s3 Paged tests for s3 are taking over 120 seconds. There is functionality to set the page size. This reduces the time spent on tests. Closes #3455 from bolkedebruin/AIRFLOW-2557 Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/a47b2776 Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/a47b2776 Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/a47b2776 Branch: refs/heads/master Commit: a47b2776f159f8c439de89e68545e90b81a4397a Parents: 91dd368 Author: Bolke de Bruin Authored: Sun Jun 3 22:29:26 2018 +0200 Committer: Fokko Driesprong Committed: Sun Jun 3 22:29:26 2018 +0200 -- airflow/hooks/S3_hook.py| 30 ++ tests/hooks/test_s3_hook.py | 18 +++--- 2 files changed, 37 insertions(+), 11 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/a47b2776/airflow/hooks/S3_hook.py -- diff --git a/airflow/hooks/S3_hook.py b/airflow/hooks/S3_hook.py index 3d72275..b4f3ac3 100644 --- a/airflow/hooks/S3_hook.py +++ b/airflow/hooks/S3_hook.py @@ -77,7 +77,8 @@ class S3Hook(AwsHook): plist = self.list_prefixes(bucket_name, previous_level, delimiter) return False if plist is None else prefix in plist -def list_prefixes(self, bucket_name, prefix='', delimiter=''): +def list_prefixes(self, bucket_name, prefix='', delimiter='', + page_size=None, max_items=None): """ Lists prefixes in a bucket under prefix @@ -87,11 +88,21 @@ class S3Hook(AwsHook): :type prefix: str :param delimiter: the delimiter marks key hierarchy. 
:type delimiter: str +:param page_size: pagination size +:type page_size: int +:param max_items: maximum items to return +:type max_items: int """ +config = { +'PageSize': page_size, +'MaxItems': max_items, +} + paginator = self.get_conn().get_paginator('list_objects_v2') response = paginator.paginate(Bucket=bucket_name, Prefix=prefix, - Delimiter=delimiter) + Delimiter=delimiter, + PaginationConfig=config) has_results = False prefixes = [] @@ -104,7 +115,8 @@ class S3Hook(AwsHook): if has_results: return prefixes -def list_keys(self, bucket_name, prefix='', delimiter=''): +def list_keys(self, bucket_name, prefix='', delimiter='', + page_size=None, max_items=None): """ Lists keys in a bucket under prefix and not containing delimiter @@ -114,11 +126,21 @@ class S3Hook(AwsHook): :type prefix: str :param delimiter: the delimiter marks key hierarchy. :type delimiter: str +:param page_size: pagination size +:type page_size: int +:param max_items: maximum items to return +:type max_items: int """ +config = { +'PageSize': page_size, +'MaxItems': max_items, +} + paginator = self.get_conn().get_paginator('list_objects_v2') response = paginator.paginate(Bucket=bucket_name, Prefix=prefix, - Delimiter=delimiter) + Delimiter=delimiter, + PaginationConfig=config) has_results = False keys = [] http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/a47b2776/tests/hooks/test_s3_hook.py -- diff --git a/tests/hooks/test_s3_hook.py b/tests/hooks/test_s3_hook.py index 94d0e36..baac203 100644 --- a/tests/hooks/test_s3_hook.py +++ b/tests/hooks/test_s3_hook.py @@ -7,9 +7,9 @@ # to you under the Apache License, Version 2.0 (the # "License"); you may not use this file except in compliance # with the License. 
You may obtain a copy of the License at -# +# # http://www.apache.org/licenses/LICENSE-2.0 -# +# # Unless required by applicable law or agreed to in writing, # software distributed under the License is distributed on an # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY @@ -96,13 +96,16 @@ class TestS3Hook(unittest.TestCase): b = hook.get_bucket('bucket') b.create() -keys = ["%s/b" % i for i in range(5000)] -dirs = ["%s/" % i for i in range(5000)] +# we
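The patch above threads `page_size` and `max_items` into botocore's `PaginationConfig` (`PageSize` / `MaxItems`). For readers unfamiliar with those knobs, their semantics can be modeled in plain Python; this is a toy sketch, not botocore's actual implementation:

```python
def paginate(items, page_size=None, max_items=None):
    """Toy model of botocore's PaginationConfig semantics:
    yield pages of at most page_size items, and stop once
    max_items results have been produced in total."""
    page_size = page_size or len(items) or 1
    produced = 0
    for start in range(0, len(items), page_size):
        page = items[start:start + page_size]
        if max_items is not None:
            # Trim the page so the overall total never exceeds max_items.
            page = page[:max_items - produced]
        if not page:
            return
        produced += len(page)
        yield page

# 10 keys, pages of 4, capped at 7 results total -> pages of 4 and 3.
keys = ["key-%d" % i for i in range(10)]
print([len(p) for p in paginate(keys, page_size=4, max_items=7)])  # [4, 3]
```

Capping `MaxItems` in the tests is what cuts the runtime: the paginator stops after a bounded number of results instead of walking every key in the bucket.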
[jira] [Assigned] (AIRFLOW-1022) Subdag can't receive templated fields
[ https://issues.apache.org/jira/browse/AIRFLOW-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao-Han Tsai reassigned AIRFLOW-1022: -- Assignee: Chao-Han Tsai > Subdag can't receive templated fields > - > > Key: AIRFLOW-1022 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1022 > Project: Apache Airflow > Issue Type: Improvement > Components: subdag >Reporter: Marcos Takahashi >Assignee: Chao-Han Tsai >Priority: Major > Labels: easyfix > > SubDAGs can't receive any templated fields because the operator's > template_fields is set to tuple() > (https://github.com/apache/incubator-airflow/blob/master/airflow/operators/subdag_operator.py#L24) > instead of a templated field tuple like on PythonOperator > (https://github.com/apache/incubator-airflow/blob/master/airflow/operators/python_operator.py#L52). > That makes it impossible to get important values like execution_date. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
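The mechanics behind this report: Airflow renders Jinja templates only for the attributes named in an operator's template_fields, so an empty tuple means nothing is ever rendered. A minimal stand-alone sketch of that rendering step (no Airflow required; a plain str.format stands in for Jinja, and the class names are illustrative only):

```python
class BareOperator:
    # Like SubDagOperator in the report: an empty tuple means the
    # rendering step below touches nothing.
    template_fields = ()

    def __init__(self, some_arg):
        self.some_arg = some_arg


class TemplatedOperator(BareOperator):
    # Like PythonOperator: the named attributes get rendered.
    template_fields = ("some_arg",)


def render(task, context):
    # Mimics the effect of Airflow's template rendering: only the
    # attributes listed in template_fields are substituted.
    for field in task.template_fields:
        setattr(task, field, getattr(task, field).format(**context))


ctx = {"execution_date": "2018-06-03"}

plain = BareOperator("date is {execution_date}")
render(plain, ctx)
print(plain.some_arg)       # date is {execution_date}  (untouched)

templated = TemplatedOperator("date is {execution_date}")
render(templated, ctx)
print(templated.some_arg)   # date is 2018-06-03
```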
[jira] [Updated] (AIRFLOW-2552) Improve description for airflow backfill -c
[ https://issues.apache.org/jira/browse/AIRFLOW-2552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao-Han Tsai updated AIRFLOW-2552: --- Description: Improve the description for {code}airflow backfill -c{code} Example: JSON string that gets pickled into the DagRun / backfill's conf attribute was:Improve the description for {code}airflow backfill -c{code} > Improve description for airflow backfill -c > --- > > Key: AIRFLOW-2552 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2552 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Chao-Han Tsai >Assignee: Chao-Han Tsai >Priority: Major > > Improve the description for {code}airflow backfill -c{code} > Example: JSON string that gets pickled into the DagRun / backfill's conf > attribute -- This message was sent by Atlassian JIRA (v7.6.3#76005)
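For context, the -c flag takes a JSON string on the command line; the parsing step into the run's conf dict can be sketched as below (note that json.loads requires proper double-quoted JSON):

```python
import json

# What `airflow backfill -c '<json>'` hands over: a JSON string that
# is parsed into a dict and stored on the backfill DagRun's conf
# attribute. The "retries" key here is only an illustrative example.
raw = '{"name": "value", "retries": 2}'
conf = json.loads(raw)
print(conf["name"])  # value
```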
[jira] [Assigned] (AIRFLOW-2533) Kubernetes worker configuration improperly sets path to DAG on worker node
[ https://issues.apache.org/jira/browse/AIRFLOW-2533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao-Han Tsai reassigned AIRFLOW-2533: -- Assignee: Chao-Han Tsai > Kubernetes worker configuration improperly sets path to DAG on worker node > -- > > Key: AIRFLOW-2533 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2533 > Project: Apache Airflow > Issue Type: Bug > Components: executor, worker >Affects Versions: 1.10.0 >Reporter: Brian Nutt >Assignee: Chao-Han Tsai >Priority: Blocker > Fix For: 1.10 > > Attachments: Screen Shot 2018-05-24 at 5.06.25 PM.png > > > When triggering a DAG using the kubernetes executor, the path to the DAG on > the worker node is not properly set due to the modification of the command > that is sent to the pod for the worker node to perform. See !Screen Shot > 2018-05-24 at 5.06.25 PM.png! > This means that the only DAGs that work are the example DAGs. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (AIRFLOW-2355) Airflow trigger tag parameters in subdag
[ https://issues.apache.org/jira/browse/AIRFLOW-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao-Han Tsai reassigned AIRFLOW-2355: -- Assignee: Chao-Han Tsai > Airflow trigger tag parameters in subdag > > > Key: AIRFLOW-2355 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2355 > Project: Apache Airflow > Issue Type: Bug > Components: api >Affects Versions: 1.9.0 >Reporter: Mohammed Tameem >Assignee: Chao-Han Tsai >Priority: Blocker > Fix For: 1.9.0 > > > The command airflow {color:#8eb021}+_trigger_dag -c > "\{'name':'value'}"_+{color} sends conf parameters only to the parent DAG. > I'm using SubDAGs that depend on these parameters, but no parameters > are received by the SubDAG. > From the source code of the SubDag operator, I see that there is no way of > passing these trigger parameters to a SubDAG. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (AIRFLOW-2533) Kubernetes worker configuration improperly sets path to DAG on worker node
[ https://issues.apache.org/jira/browse/AIRFLOW-2533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao-Han Tsai reassigned AIRFLOW-2533: -- Assignee: (was: Chao-Han Tsai) > Kubernetes worker configuration improperly sets path to DAG on worker node > -- > > Key: AIRFLOW-2533 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2533 > Project: Apache Airflow > Issue Type: Bug > Components: executor, worker >Affects Versions: 1.10.0 >Reporter: Brian Nutt >Priority: Blocker > Fix For: 1.10 > > Attachments: Screen Shot 2018-05-24 at 5.06.25 PM.png > > > When triggering a DAG using the kubernetes executor, the path to the DAG on > the worker node is not properly set due to the modification of the command > that is sent to the pod for the worker node to perform. See !Screen Shot > 2018-05-24 at 5.06.25 PM.png! > This means that the only DAGs that work are the example DAGs. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2558) Clear TASK/DAG is clearing all executions
Marcos Bernardelli created AIRFLOW-2558: --- Summary: Clear TASK/DAG is clearing all executions Key: AIRFLOW-2558 URL: https://issues.apache.org/jira/browse/AIRFLOW-2558 Project: Apache Airflow Issue Type: Bug Affects Versions: Airflow 2.0 Reporter: Marcos Bernardelli Assignee: Tao Feng When I try to clear a specific DAG/task execution, Airflow tries to re-run all past executions: [Animated GIF|https://gist.githubusercontent.com/bern4rdelli/34c1e57acd53c8c67417604202f3e0e6/raw/4bcb3d3c23f2a3bb7f7bfb3e977d935e5bb9f0ee/clear.gif] (I failed miserably trying to attach the animated GIF :() This behavior was changed here: [https://github.com/apache/incubator-airflow/pull/3444] The old version looked like this: {code:python} drs = session.query(DagModel).filter_by(dag_id=self.dag_id).all() {code} It was then changed to: {code:python} drs = session.query(DagRun).filter_by(dag_id=self.dag_id).all() {code} This new query (using DagRun) gets all past executions, even when the "Past" button is not checked. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (AIRFLOW-2545) example_kubernetes_operator.py is causing a DeprecationWarning to be displayed during default install
[ https://issues.apache.org/jira/browse/AIRFLOW-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong resolved AIRFLOW-2545. --- Resolution: Fixed Fix Version/s: 2.0.0 1.10.0 > example_kubernetes_operator.py is causing a DeprecationWarning to be > displayed during default install > - > > Key: AIRFLOW-2545 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2545 > Project: Apache Airflow > Issue Type: Bug >Affects Versions: 1.10 >Reporter: Craig Rodrigues >Assignee: Craig Rodrigues >Priority: Minor > Fix For: 1.10.0, 2.0.0 > > > {{I tested the master branch by putting the following in my requirements.txt:}} > {{git+https://github.com/rodrigc/incubator-airflow@master#egg=apache-airflow[celery,crypto,emr,hive,hdfs,ldap,mysql,postgres,redis,slack,s3]}} > and did a pip install -r requirements.txt > > and then saw this DeprecationWarning: > > [2018-05-29 14:06:27,567] \{example_kubernetes_operator.py:30} WARNING - > Could not import KubernetesPodOperator > /Users/c-craigr/airflow2/lib/python2.7/site-packages/airflow/models.py:2315: > PendingDeprecationWarning: Invalid arguments were passed to BaseOperator. > Support for passing such arguments will be dropped in Airflow 2.0. Invalid > arguments were: > *args: () > **kwargs: {'name': 'airflow-test-pod', 'image': 'ubuntu:16.04', 'labels': > {'foo': 'bar'} > , 'namespace': 'default', 'cmds': ['bash', '-cx'], 'arguments': ['echo', > '10'], 'in_cluster': False, 'get_logs': True > category=PendingDeprecationWarning -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (AIRFLOW-2545) example_kubernetes_operator.py is causing a DeprecationWarning to be displayed during default install
[ https://issues.apache.org/jira/browse/AIRFLOW-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong reassigned AIRFLOW-2545: - Assignee: Craig Rodrigues > example_kubernetes_operator.py is causing a DeprecationWarning to be > displayed during default install > - > > Key: AIRFLOW-2545 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2545 > Project: Apache Airflow > Issue Type: Bug >Affects Versions: 1.10 >Reporter: Craig Rodrigues >Assignee: Craig Rodrigues >Priority: Minor > > > {{I tested the master branch by putting the following in my requirements.txt:}} > {{git+https://github.com/rodrigc/incubator-airflow@master#egg=apache-airflow[celery,crypto,emr,hive,hdfs,ldap,mysql,postgres,redis,slack,s3]}} > and did a pip install -r requirements.txt > > and then saw this DeprecationWarning: > > [2018-05-29 14:06:27,567] \{example_kubernetes_operator.py:30} WARNING - > Could not import KubernetesPodOperator > /Users/c-craigr/airflow2/lib/python2.7/site-packages/airflow/models.py:2315: > PendingDeprecationWarning: Invalid arguments were passed to BaseOperator. > Support for passing such arguments will be dropped in Airflow 2.0. Invalid > arguments were: > *args: () > **kwargs: {'name': 'airflow-test-pod', 'image': 'ubuntu:16.04', 'labels': > {'foo': 'bar'} > , 'namespace': 'default', 'cmds': ['bash', '-cx'], 'arguments': ['echo', > '10'], 'in_cluster': False, 'get_logs': True > category=PendingDeprecationWarning -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2545) example_kubernetes_operator.py is causing a DeprecationWarning to be displayed during default install
[ https://issues.apache.org/jira/browse/AIRFLOW-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499417#comment-16499417 ] ASF subversion and git services commented on AIRFLOW-2545: -- Commit 91dd3686688dc42372ec5ae7b6efb08e35125b1f in incubator-airflow's branch refs/heads/master from [~rodrigc] [ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=91dd368 ] [AIRFLOW-2545] Eliminate DeprecationWarning Do not import BaseOperator as KubernetesOperator. This eliminates confusing DeprecationWarnings when setting up a default airflow install where additional kubernetes modules are not installed. Also, use LoggingMixin instead of logger module. Closes #3442 from rodrigc/AIRFLOW-2545 > example_kubernetes_operator.py is causing a DeprecationWarning to be > displayed during default install > - > > Key: AIRFLOW-2545 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2545 > Project: Apache Airflow > Issue Type: Bug >Affects Versions: 1.10 >Reporter: Craig Rodrigues >Priority: Minor > > > {{I tested the master branch by putting the following in my requirements.txt:}} > {{git+https://github.com/rodrigc/incubator-airflow@master#egg=apache-airflow[celery,crypto,emr,hive,hdfs,ldap,mysql,postgres,redis,slack,s3]}} > and did a pip install -r requirements.txt > > and then saw this DeprecationWarning: > > [2018-05-29 14:06:27,567] \{example_kubernetes_operator.py:30} WARNING - > Could not import KubernetesPodOperator > /Users/c-craigr/airflow2/lib/python2.7/site-packages/airflow/models.py:2315: > PendingDeprecationWarning: Invalid arguments were passed to BaseOperator. > Support for passing such arguments will be dropped in Airflow 2.0. 
Invalid > arguments were: > *args: () > **kwargs: {'name': 'airflow-test-pod', 'image': 'ubuntu:16.04', 'labels': > {'foo': 'bar'} > , 'namespace': 'default', 'cmds': ['bash', '-cx'], 'arguments': ['echo', > '10'], 'in_cluster': False, 'get_logs': True > category=PendingDeprecationWarning -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2545) example_kubernetes_operator.py is causing a DeprecationWarning to be displayed during default install
[ https://issues.apache.org/jira/browse/AIRFLOW-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499418#comment-16499418 ] ASF subversion and git services commented on AIRFLOW-2545: -- Commit 91dd3686688dc42372ec5ae7b6efb08e35125b1f in incubator-airflow's branch refs/heads/master from [~rodrigc] [ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=91dd368 ] [AIRFLOW-2545] Eliminate DeprecationWarning Do not import BaseOperator as KubernetesOperator. This eliminates confusing DeprecationWarnings when setting up a default airflow install where additional kubernetes modules are not installed. Also, use LoggingMixin instead of logger module. Closes #3442 from rodrigc/AIRFLOW-2545 > example_kubernetes_operator.py is causing a DeprecationWarning to be > displayed during default install > - > > Key: AIRFLOW-2545 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2545 > Project: Apache Airflow > Issue Type: Bug >Affects Versions: 1.10 >Reporter: Craig Rodrigues >Priority: Minor > > > {{I tested the master branch by putting the following in my requirements.txt:}} > {{git+https://github.com/rodrigc/incubator-airflow@master#egg=apache-airflow[celery,crypto,emr,hive,hdfs,ldap,mysql,postgres,redis,slack,s3]}} > and did a pip install -r requirements.txt > > and then saw this DeprecationWarning: > > [2018-05-29 14:06:27,567] \{example_kubernetes_operator.py:30} WARNING - > Could not import KubernetesPodOperator > /Users/c-craigr/airflow2/lib/python2.7/site-packages/airflow/models.py:2315: > PendingDeprecationWarning: Invalid arguments were passed to BaseOperator. > Support for passing such arguments will be dropped in Airflow 2.0. 
Invalid > arguments were: > *args: () > **kwargs: {'name': 'airflow-test-pod', 'image': 'ubuntu:16.04', 'labels': > {'foo': 'bar'} > , 'namespace': 'default', 'cmds': ['bash', '-cx'], 'arguments': ['echo', > '10'], 'in_cluster': False, 'get_logs': True > category=PendingDeprecationWarning -- This message was sent by Atlassian JIRA (v7.6.3#76005)
incubator-airflow git commit: [AIRFLOW-2545] Eliminate DeprecationWarning
Repository: incubator-airflow Updated Branches: refs/heads/master b7dc31510 -> 91dd36866 [AIRFLOW-2545] Eliminate DeprecationWarning Do not import BaseOperator as KubernetesOperator. This eliminates confusing DeprecationWarnings when setting up a default airflow install where additional kubernetes modules are not installed. Also, use LoggingMixin instead of logger module. Closes #3442 from rodrigc/AIRFLOW-2545 Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/91dd3686 Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/91dd3686 Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/91dd3686 Branch: refs/heads/master Commit: 91dd3686688dc42372ec5ae7b6efb08e35125b1f Parents: b7dc315 Author: Craig Rodrigues Authored: Sun Jun 3 15:29:31 2018 +0200 Committer: Fokko Driesprong Committed: Sun Jun 3 15:29:31 2018 +0200 -- .../example_dags/example_kubernetes_operator.py | 55 +++- 1 file changed, 29 insertions(+), 26 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/91dd3686/airflow/example_dags/example_kubernetes_operator.py -- diff --git a/airflow/example_dags/example_kubernetes_operator.py b/airflow/example_dags/example_kubernetes_operator.py index 8f5ab39..92d73c5 100644 --- a/airflow/example_dags/example_kubernetes_operator.py +++ b/airflow/example_dags/example_kubernetes_operator.py @@ -17,37 +17,40 @@ # specific language governing permissions and limitations # under the License. 
-import airflow -import logging +from airflow.utils.dates import days_ago +from airflow.utils.log.logging_mixin import LoggingMixin from airflow.models import DAG +log = LoggingMixin().log + try: # Kubernetes is optional, so not available in vanilla Airflow -# pip install airflow[gcp] +# pip install airflow[kubernetes] from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator -except ImportError: -# Just import the BaseOperator as the KubernetesPodOperator -logging.warn("Could not import KubernetesPodOperator") -from airflow.models import BaseOperator as KubernetesPodOperator -args = { -'owner': 'airflow', -'start_date': airflow.utils.dates.days_ago(2) -} +args = { +'owner': 'airflow', +'start_date': days_ago(2) +} + +dag = DAG( +dag_id='example_kubernetes_operator', +default_args=args, +schedule_interval=None) -dag = DAG( -dag_id='example_kubernetes_operator', -default_args=args, -schedule_interval=None) +k = KubernetesPodOperator( +namespace='default', +image="ubuntu:16.04", +cmds=["bash", "-cx"], +arguments=["echo", "10"], +labels={"foo": "bar"}, +name="airflow-test-pod", +in_cluster=False, +task_id="task", +get_logs=True, +dag=dag) -k = KubernetesPodOperator( -namespace='default', -image="ubuntu:16.04", -cmds=["bash", "-cx"], -arguments=["echo", "10"], -labels={"foo": "bar"}, -name="airflow-test-pod", -in_cluster=False, -task_id="task", -get_logs=True, -dag=dag) +except ImportError as e: +log.warn("Could not import KubernetesPodOperator: " + str(e)) +log.warn("Install kubernetes dependencies with: " + "pip install airflow['kubernetes']")
[jira] [Resolved] (AIRFLOW-2500) Fix MySqlToHiveTransfer to transfer unsigned type properly
[ https://issues.apache.org/jira/browse/AIRFLOW-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong resolved AIRFLOW-2500. --- Resolution: Fixed > Fix MySqlToHiveTransfer to transfer unsigned type properly > -- > > Key: AIRFLOW-2500 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2500 > Project: Apache Airflow > Issue Type: Bug > Components: operators >Reporter: Kengo Seki >Assignee: Kengo Seki >Priority: Major > > Given the following table, > {code} > mysql> USE airflow_ci > Database changed > mysql> DESC users; > +---+--+--+-+-+---+ > | Field | Type | Null | Key | Default | Extra | > +---+--+--+-+-+---+ > | id| int(10) unsigned | YES | | NULL| | > +---+--+--+-+-+---+ > 1 row in set (0.00 sec) > mysql> SELECT * FROM users; > ++ > | id | > ++ > | 2147483647 | > | 2147483648 | > ++ > 2 rows in set (0.00 sec) > {code} > executing MySqlToHiveTransfer: > {code} > In [1]: from airflow.operators.mysql_to_hive import MySqlToHiveTransfer >...: t = MySqlToHiveTransfer(sql="SELECT * FROM airflow_ci.users", > hive_table="users", recreate=True, task_id="t") >...: t.execute(None) >...: > [2018-05-21 12:14:09,137] {base_hook.py:83} INFO - Using connection to: > localhost > [2018-05-21 12:14:09,140] {base_hook.py:83} INFO - Using connection to: > localhost > [2018-05-21 12:14:09,146] {hive_hooks.py:427} INFO - DROP TABLE IF EXISTS > users; > CREATE TABLE IF NOT EXISTS users ( > id INT) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '' > STORED AS textfile > ; > (snip) > [2018-05-21 12:14:31,667] {hive_hooks.py:233} INFO - Loading data to table > default.users > [2018-05-21 12:14:32,364] {hive_hooks.py:233} INFO - Table default.users > stats: [numFiles=1, numRows=0, totalSize=24, rawDataSize=0] > [2018-05-21 12:14:32,365] {hive_hooks.py:233} INFO - OK > [2018-05-21 12:14:32,366] {hive_hooks.py:233} INFO - Time taken: 1.299 seconds > {code} > ... 
then the value greater than the upper bound for signed integer is not > properly fetched from Hive. > {code} > hive> SELECT * FROM users; > OK > 2147483647 > NULL > Time taken: 2.461 seconds, Fetched: 2 row(s) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2500) Fix MySqlToHiveTransfer to transfer unsigned type properly
[ https://issues.apache.org/jira/browse/AIRFLOW-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499413#comment-16499413 ] ASF subversion and git services commented on AIRFLOW-2500: -- Commit b7dc315101a0fc924f76e5c5f500c96e85bdd672 in incubator-airflow's branch refs/heads/master from [~sekikn] [ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=b7dc315 ] [AIRFLOW-2500] Fix MySqlToHiveTransfer to transfer unsigned type properly MySQL supports unsigned data types, but Hive doesn't. So if MySqlToHiveTransfer maps MySQL's data types to Hive's corresponding ones directly (e.g. INT -> INT), unsigned values over signed type's upper bound transferred from MySQL are interpreted as invalid by Hive, and users get NULL. To avoid it, this PR fixes MySqlToHiveTransfer to map MySQL data types to Hive's wider ones (e.g. SMALLINT -> INT, INT -> BIGINT, etc.). Closes #3446 from sekikn/AIRFLOW-2500 > Fix MySqlToHiveTransfer to transfer unsigned type properly > -- > > Key: AIRFLOW-2500 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2500 > Project: Apache Airflow > Issue Type: Bug > Components: operators >Reporter: Kengo Seki >Assignee: Kengo Seki >Priority: Major > > Given the following table, > {code} > mysql> USE airflow_ci > Database changed > mysql> DESC users; > +---+--+--+-+-+---+ > | Field | Type | Null | Key | Default | Extra | > +---+--+--+-+-+---+ > | id| int(10) unsigned | YES | | NULL| | > +---+--+--+-+-+---+ > 1 row in set (0.00 sec) > mysql> SELECT * FROM users; > ++ > | id | > ++ > | 2147483647 | > | 2147483648 | > ++ > 2 rows in set (0.00 sec) > {code} > executing MySqlToHiveTransfer: > {code} > In [1]: from airflow.operators.mysql_to_hive import MySqlToHiveTransfer >...: t = MySqlToHiveTransfer(sql="SELECT * FROM airflow_ci.users", > hive_table="users", recreate=True, task_id="t") >...: t.execute(None) >...: > [2018-05-21 12:14:09,137] {base_hook.py:83} INFO - Using connection to: > localhost > 
[2018-05-21 12:14:09,140] {base_hook.py:83} INFO - Using connection to: > localhost > [2018-05-21 12:14:09,146] {hive_hooks.py:427} INFO - DROP TABLE IF EXISTS > users; > CREATE TABLE IF NOT EXISTS users ( > id INT) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '' > STORED AS textfile > ; > (snip) > [2018-05-21 12:14:31,667] {hive_hooks.py:233} INFO - Loading data to table > default.users > [2018-05-21 12:14:32,364] {hive_hooks.py:233} INFO - Table default.users > stats: [numFiles=1, numRows=0, totalSize=24, rawDataSize=0] > [2018-05-21 12:14:32,365] {hive_hooks.py:233} INFO - OK > [2018-05-21 12:14:32,366] {hive_hooks.py:233} INFO - Time taken: 1.299 seconds > {code} > ... then the value greater than the upper bound for signed integer is not > properly fetched from Hive. > {code} > hive> SELECT * FROM users; > OK > 2147483647 > NULL > Time taken: 2.461 seconds, Fetched: 2 row(s) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2500) Fix MySqlToHiveTransfer to transfer unsigned type properly
[ https://issues.apache.org/jira/browse/AIRFLOW-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499414#comment-16499414 ]

ASF subversion and git services commented on AIRFLOW-2500:
----------------------------------------------------------

Commit b7dc315101a0fc924f76e5c5f500c96e85bdd672 in incubator-airflow's branch refs/heads/master from [~sekikn]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=b7dc315 ]

[AIRFLOW-2500] Fix MySqlToHiveTransfer to transfer unsigned type properly

MySQL supports unsigned data types, but Hive doesn't. So if
MySqlToHiveTransfer maps MySQL's data types to Hive's corresponding ones
directly (e.g. INT -> INT), unsigned values over signed type's upper
bound transferred from MySQL are interpreted as invalid by Hive, and
users get NULL. To avoid it, this PR fixes MySqlToHiveTransfer to map
MySQL data types to Hive's wider ones (e.g. SMALLINT -> INT,
INT -> BIGINT, etc.).

Closes #3446 from sekikn/AIRFLOW-2500

> Fix MySqlToHiveTransfer to transfer unsigned type properly
> ----------------------------------------------------------
>
>                 Key: AIRFLOW-2500
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2500
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: operators
>            Reporter: Kengo Seki
>            Assignee: Kengo Seki
>            Priority: Major
>
> Given the following table,
> {code}
> mysql> USE airflow_ci
> Database changed
> mysql> DESC users;
> +-------+------------------+------+-----+---------+-------+
> | Field | Type             | Null | Key | Default | Extra |
> +-------+------------------+------+-----+---------+-------+
> | id    | int(10) unsigned | YES  |     | NULL    |       |
> +-------+------------------+------+-----+---------+-------+
> 1 row in set (0.00 sec)
>
> mysql> SELECT * FROM users;
> +------------+
> | id         |
> +------------+
> | 2147483647 |
> | 2147483648 |
> +------------+
> 2 rows in set (0.00 sec)
> {code}
> executing MySqlToHiveTransfer:
> {code}
> In [1]: from airflow.operators.mysql_to_hive import MySqlToHiveTransfer
>    ...: t = MySqlToHiveTransfer(sql="SELECT * FROM airflow_ci.users",
>    ...:                         hive_table="users", recreate=True, task_id="t")
>    ...: t.execute(None)
>    ...:
> [2018-05-21 12:14:09,137] {base_hook.py:83} INFO - Using connection to: localhost
> [2018-05-21 12:14:09,140] {base_hook.py:83} INFO - Using connection to: localhost
> [2018-05-21 12:14:09,146] {hive_hooks.py:427} INFO - DROP TABLE IF EXISTS users;
> CREATE TABLE IF NOT EXISTS users (
>   id INT)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY ''
> STORED AS textfile
> ;
> (snip)
> [2018-05-21 12:14:31,667] {hive_hooks.py:233} INFO - Loading data to table default.users
> [2018-05-21 12:14:32,364] {hive_hooks.py:233} INFO - Table default.users stats: [numFiles=1, numRows=0, totalSize=24, rawDataSize=0]
> [2018-05-21 12:14:32,365] {hive_hooks.py:233} INFO - OK
> [2018-05-21 12:14:32,366] {hive_hooks.py:233} INFO - Time taken: 1.299 seconds
> {code}
> ... then the value greater than the upper bound for signed integer is not
> properly fetched from Hive.
> {code}
> hive> SELECT * FROM users;
> OK
> 2147483647
> NULL
> Time taken: 2.461 seconds, Fetched: 2 row(s)
> {code}
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
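The widening described in the commit can be sketched in a few lines. This is a hedged, hypothetical illustration only: the real operator keys on MySQLdb's field-type constants (`t.TINY`, `t.LONG`, ...); plain strings stand in for them here so the sketch is self-contained.

```python
# Hypothetical sketch of the widened MySQL -> Hive type mapping; string
# keys stand in for MySQLdb field-type constants (t.TINY, t.LONG, ...).
MYSQL_TO_HIVE = {
    'TINY': 'SMALLINT',           # unsigned TINYINT max 255 > signed TINYINT max 127
    'SHORT': 'INT',               # unsigned SMALLINT max 65535 > signed SMALLINT max 32767
    'INT24': 'INT',               # MEDIUMINT always fits in a signed 32-bit INT
    'LONG': 'BIGINT',             # unsigned INT max 4294967295 > signed INT max 2147483647
    'LONGLONG': 'DECIMAL(38,0)',  # unsigned BIGINT can exceed the signed 64-bit range
}

def hive_type(mysql_type):
    # Anything unmapped falls back to STRING, mirroring the operator's default.
    return MYSQL_TO_HIVE.get(mysql_type, 'STRING')

# The value 2147483648 from the report overflows a signed 32-bit INT
# (Hive INT) but fits comfortably in a signed 64-bit BIGINT.
assert 2147483648 > 2**31 - 1
assert 2147483648 <= 2**63 - 1
print(hive_type('LONG'))  # BIGINT
```

The one case a wider integer cannot absorb is unsigned BIGINT, which is why the fix falls back to `DECIMAL(38,0)` there rather than any Hive integer type.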
incubator-airflow git commit: [AIRFLOW-2500] Fix MySqlToHiveTransfer to transfer unsigned type properly
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 29dbedfd0 -> b7dc31510

[AIRFLOW-2500] Fix MySqlToHiveTransfer to transfer unsigned type properly

MySQL supports unsigned data types, but Hive doesn't. So if
MySqlToHiveTransfer maps MySQL's data types to Hive's corresponding ones
directly (e.g. INT -> INT), unsigned values over signed type's upper
bound transferred from MySQL are interpreted as invalid by Hive, and
users get NULL. To avoid it, this PR fixes MySqlToHiveTransfer to map
MySQL data types to Hive's wider ones (e.g. SMALLINT -> INT,
INT -> BIGINT, etc.).

Closes #3446 from sekikn/AIRFLOW-2500

Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/b7dc3151
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/b7dc3151
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/b7dc3151

Branch: refs/heads/master
Commit: b7dc315101a0fc924f76e5c5f500c96e85bdd672
Parents: 29dbedf
Author: Kengo Seki
Authored: Sun Jun 3 15:24:19 2018 +0200
Committer: Fokko Driesprong
Committed: Sun Jun 3 15:24:19 2018 +0200
----------------------------------------------------------------------
 airflow/operators/mysql_to_hive.py |   5 +-
 tests/operators/operators.py       | 112 ++++++++++++++++++++++++++++++
 2 files changed, 115 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/b7dc3151/airflow/operators/mysql_to_hive.py
----------------------------------------------------------------------
diff --git a/airflow/operators/mysql_to_hive.py b/airflow/operators/mysql_to_hive.py
index 22b7ac2..ab7b7fa 100644
--- a/airflow/operators/mysql_to_hive.py
+++ b/airflow/operators/mysql_to_hive.py
@@ -104,9 +104,10 @@ class MySqlToHiveTransfer(BaseOperator):
             t.DOUBLE: 'DOUBLE',
             t.FLOAT: 'DOUBLE',
             t.INT24: 'INT',
-            t.LONG: 'INT',
-            t.LONGLONG: 'BIGINT',
+            t.LONG: 'BIGINT',
+            t.LONGLONG: 'DECIMAL(38,0)',
             t.SHORT: 'INT',
+            t.TINY: 'SMALLINT',
             t.YEAR: 'INT',
         }
         return d[mysql_type] if mysql_type in d else 'STRING'
http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/b7dc3151/tests/operators/operators.py
----------------------------------------------------------------------
diff --git a/tests/operators/operators.py b/tests/operators/operators.py
index 4eba0a8..07b1396 100644
--- a/tests/operators/operators.py
+++ b/tests/operators/operators.py
@@ -23,8 +23,11 @@ from airflow import DAG, configuration, operators
 from airflow.utils.tests import skipUnlessImported
 from airflow.utils import timezone
 
+from collections import OrderedDict
+
 import os
 import mock
+import six
 import unittest
 
 configuration.load_test_config()
@@ -313,3 +316,112 @@ class TransferTests(unittest.TestCase):
             tblproperties={'test_property': 'test_value'},
             dag=self.dag)
         t.run(start_date=DEFAULT_DATE, end_date=DEFAULT_DATE, ignore_ti_state=True)
+
+    @mock.patch('airflow.hooks.hive_hooks.HiveCliHook.load_file')
+    def test_mysql_to_hive_type_conversion(self, mock_load_file):
+        mysql_conn_id = 'airflow_ci'
+        mysql_table = 'test_mysql_to_hive'
+
+        from airflow.hooks.mysql_hook import MySqlHook
+        m = MySqlHook(mysql_conn_id)
+
+        try:
+            with m.get_conn() as c:
+                c.execute("DROP TABLE IF EXISTS {}".format(mysql_table))
+                c.execute("""
+                    CREATE TABLE {} (
+                        c0 TINYINT,
+                        c1 SMALLINT,
+                        c2 MEDIUMINT,
+                        c3 INT,
+                        c4 BIGINT
+                    )
+                """.format(mysql_table))
+
+            from airflow.operators.mysql_to_hive import MySqlToHiveTransfer
+            t = MySqlToHiveTransfer(
+                task_id='test_m2h',
+                mysql_conn_id=mysql_conn_id,
+                hive_cli_conn_id='beeline_default',
+                sql="SELECT * FROM {}".format(mysql_table),
+                hive_table='test_mysql_to_hive',
+                dag=self.dag)
+            t.run(start_date=DEFAULT_DATE, end_date=DEFAULT_DATE, ignore_ti_state=True)
+
+            mock_load_file.assert_called_once()
+            d = OrderedDict()
+            d["c0"] = "SMALLINT"
+            d["c1"] = "INT"
+            d["c2"] = "INT"
+            d["c3"] = "BIGINT"
+            d["c4"] = "DECIMAL(38,0)"
+            self.assertEqual(mock_load_file.call_args[1]["field_dict"], d)
+        finally:
+            with m.get_conn() as c:
+                c.execute("DROP TABLE IF EXISTS {}".format(mysql_table))
+
+    @unittest.skipIf(six.PY2, "Skip since HiveServer2Hook doesn't work "
+                              "on Python2 for now. Se
[jira] [Commented] (AIRFLOW-2557) Reduce time spent in S3 tests
[ https://issues.apache.org/jira/browse/AIRFLOW-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499406#comment-16499406 ]

Ash Berlin-Taylor commented on AIRFLOW-2557:
--------------------------------------------

The moto library (which is already in partial use) should probably be used everywhere in the S3 tests. This might speed things up.

> Reduce time spent in S3 tests
> -----------------------------
>
>                 Key: AIRFLOW-2557
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2557
>             Project: Apache Airflow
>          Issue Type: Sub-task
>            Reporter: Bolke de Bruin
>            Priority: Major
>
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
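The idea behind the suggestion is to keep S3 tests off the network. moto does this by emulating the S3 API in-process; an even lighter-weight variant is to stub the client with `unittest.mock`. A minimal stdlib-only sketch of the latter follows; `upload_report` and all names in it are illustrative, not Airflow's actual test code.

```python
from unittest import mock

def upload_report(s3_client, bucket, key, body):
    # Hypothetical code under test: write one object to S3, return its URI.
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return "s3://{}/{}".format(bucket, key)

# A Mock stands in for the boto3 client, so the test never touches AWS
# and runs in microseconds instead of seconds per call.
fake_s3 = mock.Mock()
uri = upload_report(fake_s3, "test-bucket", "report.csv", b"a,b\n1,2\n")
fake_s3.put_object.assert_called_once_with(
    Bucket="test-bucket", Key="report.csv", Body=b"a,b\n1,2\n")
print(uri)  # s3://test-bucket/report.csv
```

The trade-off: a plain `Mock` only verifies the calls you expected, while moto actually stores objects and answers subsequent `get_object`/`list_objects` calls, so it catches a wider class of bugs at slightly higher test cost.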
[jira] [Assigned] (AIRFLOW-2462) airflow.contrib.auth.backends.password_auth.PasswordUser exists bug
[ https://issues.apache.org/jira/browse/AIRFLOW-2462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fokko Driesprong reassigned AIRFLOW-2462:
-----------------------------------------

    Assignee: froginwell

> airflow.contrib.auth.backends.password_auth.PasswordUser exists bug
> -------------------------------------------------------------------
>
>                 Key: AIRFLOW-2462
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2462
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: authentication, contrib
>    Affects Versions: 1.9.0
>            Reporter: froginwell
>            Assignee: froginwell
>            Priority: Blocker
>
> In the PasswordUser class:
> {code:python}
> @password.setter
> def _set_password(self, plaintext):
>     self._password = generate_password_hash(plaintext, 12)
>     if PY3:
>         self._password = str(self._password, 'utf-8')
> {code}
> _set_password should be renamed to password.
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (AIRFLOW-2462) airflow.contrib.auth.backends.password_auth.PasswordUser exists bug
[ https://issues.apache.org/jira/browse/AIRFLOW-2462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fokko Driesprong resolved AIRFLOW-2462.
---------------------------------------
    Resolution: Fixed

> airflow.contrib.auth.backends.password_auth.PasswordUser exists bug
> -------------------------------------------------------------------
>
>                 Key: AIRFLOW-2462
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2462
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: authentication, contrib
>    Affects Versions: 1.9.0
>            Reporter: froginwell
>            Assignee: froginwell
>            Priority: Blocker
>
> In the PasswordUser class:
> {code:python}
> @password.setter
> def _set_password(self, plaintext):
>     self._password = generate_password_hash(plaintext, 12)
>     if PY3:
>         self._password = str(self._password, 'utf-8')
> {code}
> _set_password should be renamed to password.
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
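Why the misnamed setter is a bug can be shown in plain Python: `@password.setter` returns a new property, but binding the result to `_set_password` leaves the `password` property itself read-only. A minimal sketch (using hashlib in place of werkzeug's `generate_password_hash`, purely for self-containment):

```python
import hashlib

class BrokenUser:
    def __init__(self):
        self._password = None

    @property
    def password(self):
        return self._password

    # Bug: @password.setter builds a setter-augmented property, but it is
    # bound to the name _set_password, so the `password` attribute keeps
    # only the getter and assignment to it raises AttributeError.
    @password.setter
    def _set_password(self, plaintext):
        self._password = hashlib.sha256(plaintext.encode()).hexdigest()

class FixedUser:
    def __init__(self):
        self._password = None

    @property
    def password(self):
        return self._password

    # Fix: the setter must be bound to the same name as the property.
    @password.setter
    def password(self, plaintext):
        self._password = hashlib.sha256(plaintext.encode()).hexdigest()

u = FixedUser()
u.password = "secret"
print(u.password == hashlib.sha256(b"secret").hexdigest())  # True

b = BrokenUser()
try:
    b.password = "secret"
except AttributeError:
    print("password is read-only on BrokenUser")
```

In the original Airflow code the symptom was masked because SQLAlchemy models route attribute access through instrumentation, which is presumably why the misnamed setter survived as long as it did.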
[jira] [Commented] (AIRFLOW-2556) Reduce time spent on unit tests
[ https://issues.apache.org/jira/browse/AIRFLOW-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499403#comment-16499403 ]

Ash Berlin-Taylor commented on AIRFLOW-2556:
--------------------------------------------

Does this actually cost Apache money, or is it free because it's an open-source project? Reducing test time is definitely still worthwhile either way; just curious.

> Reduce time spent on unit tests
> -------------------------------
>
>                 Key: AIRFLOW-2556
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2556
>             Project: Apache Airflow
>          Issue Type: Improvement
>            Reporter: Bolke de Bruin
>            Priority: Major
>
> Unit tests are taking up way too much time. This costs time and also money
> for the Apache Foundation. We need to reduce this.
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2462) airflow.contrib.auth.backends.password_auth.PasswordUser exists bug
[ https://issues.apache.org/jira/browse/AIRFLOW-2462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499402#comment-16499402 ]

ASF subversion and git services commented on AIRFLOW-2462:
----------------------------------------------------------

Commit 29dbedfd0a49c54eaf388ff940ec7cfe4a6e1f7f in incubator-airflow's branch refs/heads/master from [~noremac201]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=29dbedf ]

[AIRFLOW-2462] Change PasswordUser setter to correct syntax

Closes #3415 from Noremac201/setterFix

> airflow.contrib.auth.backends.password_auth.PasswordUser exists bug
> -------------------------------------------------------------------
>
>                 Key: AIRFLOW-2462
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2462
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: authentication, contrib
>    Affects Versions: 1.9.0
>            Reporter: froginwell
>            Priority: Blocker
>
> In the PasswordUser class:
> {code:python}
> @password.setter
> def _set_password(self, plaintext):
>     self._password = generate_password_hash(plaintext, 12)
>     if PY3:
>         self._password = str(self._password, 'utf-8')
> {code}
> _set_password should be renamed to password.
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
incubator-airflow git commit: [AIRFLOW-2462] Change PasswordUser setter to correct syntax
Repository: incubator-airflow
Updated Branches:
  refs/heads/master e5fb9c799 -> 29dbedfd0

[AIRFLOW-2462] Change PasswordUser setter to correct syntax

Closes #3415 from Noremac201/setterFix

Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/29dbedfd
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/29dbedfd
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/29dbedfd

Branch: refs/heads/master
Commit: 29dbedfd0a49c54eaf388ff940ec7cfe4a6e1f7f
Parents: e5fb9c7
Author: Cameron Moberg
Authored: Sun Jun 3 15:07:11 2018 +0200
Committer: Fokko Driesprong
Committed: Sun Jun 3 15:07:11 2018 +0200
----------------------------------------------------------------------
 airflow/contrib/auth/backends/password_auth.py |  2 +-
 tests/core.py                                  | 38 ++++++++++++++++++
 2 files changed, 39 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/29dbedfd/airflow/contrib/auth/backends/password_auth.py
----------------------------------------------------------------------
diff --git a/airflow/contrib/auth/backends/password_auth.py b/airflow/contrib/auth/backends/password_auth.py
index 9e16bb6..879aaa1 100644
--- a/airflow/contrib/auth/backends/password_auth.py
+++ b/airflow/contrib/auth/backends/password_auth.py
@@ -63,7 +63,7 @@ class PasswordUser(models.User):
         return self._password
 
     @password.setter
-    def _set_password(self, plaintext):
+    def password(self, plaintext):
         self._password = generate_password_hash(plaintext, 12)
         if PY3:
             self._password = str(self._password, 'utf-8')

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/29dbedfd/tests/core.py
----------------------------------------------------------------------
diff --git a/tests/core.py b/tests/core.py
index 83737ed..0312fed 100644
--- a/tests/core.py
+++ b/tests/core.py
@@ -1865,6 +1865,44 @@ class SecureModeWebUiTests(unittest.TestCase):
         configuration.conf.remove_option("core", "SECURE_MODE")
 
 
+class PasswordUserTest(unittest.TestCase):
+    def setUp(self):
+        user = models.User()
+        from airflow.contrib.auth.backends.password_auth import PasswordUser
+        self.password_user = PasswordUser(user)
+        self.password_user.username = "password_test"
+
+    @mock.patch('airflow.contrib.auth.backends.password_auth.generate_password_hash')
+    def test_password_setter(self, mock_gen_pass_hash):
+        mock_gen_pass_hash.return_value = b"hashed_pass" if six.PY3 else "hashed_pass"
+
+        self.password_user.password = "secure_password"
+        mock_gen_pass_hash.assert_called_with("secure_password", 12)
+
+    def test_password_unicode(self):
+        # In python2.7 no conversion is required back to str
+        # In python >= 3 the method must convert from bytes to str
+        self.password_user.password = "secure_password"
+        self.assertIsInstance(self.password_user.password, str)
+
+    def test_password_user_authenticate(self):
+        self.password_user.password = "secure_password"
+        self.assertTrue(self.password_user.authenticate("secure_password"))
+
+    def test_password_authenticate_session(self):
+        from airflow.contrib.auth.backends.password_auth import PasswordUser
+        self.password_user.password = 'test_password'
+        session = Session()
+        session.add(self.password_user)
+        session.commit()
+        query_user = session.query(PasswordUser).filter_by(
+            username=self.password_user.username).first()
+        self.assertTrue(query_user.authenticate('test_password'))
+        session.query(models.User).delete()
+        session.commit()
+        session.close()
+
+
 class WebPasswordAuthTest(unittest.TestCase):
     def setUp(self):
         configuration.conf.set("webserver", "authenticate", "True")
[jira] [Closed] (AIRFLOW-2525) Fix PostgresHook.copy_expert to work with "COPY FROM"
[ https://issues.apache.org/jira/browse/AIRFLOW-2525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fokko Driesprong closed AIRFLOW-2525.
-------------------------------------

> Fix PostgresHook.copy_expert to work with "COPY FROM"
> -----------------------------------------------------
>
>                 Key: AIRFLOW-2525
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2525
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: hooks
>            Reporter: Kengo Seki
>            Assignee: Kengo Seki
>            Priority: Major
>             Fix For: 2.0.0
>
> psycopg2's {{copy_expert}} [works with both "COPY FROM" and "COPY TO"|http://initd.org/psycopg/docs/cursor.html#cursor.copy_expert].
> But {{PostgresHook.copy_expert}} fails with "COPY FROM" as follows:
> {code}
> In [1]: from airflow.hooks.postgres_hook import PostgresHook
>
> In [2]: hook = PostgresHook()
>
> In [3]: hook.copy_expert("COPY t FROM STDIN", "/tmp/t")
> [2018-05-24 23:37:54,767] {base_hook.py:83} INFO - Using connection to: localhost
> ---------------------------------------------------------------------------
> QueryCanceledError                        Traceback (most recent call last)
> in ()
> ----> 1 hook.copy_expert("COPY t FROM STDIN", "/tmp/t")
>
> ~/dev/incubator-airflow/airflow/hooks/postgres_hook.py in copy_expert(self, sql, filename, open)
>      68         with closing(self.get_conn()) as conn:
>      69             with closing(conn.cursor()) as cur:
> ---> 70                 cur.copy_expert(sql, f)
>      71
>      72     @staticmethod
>
> QueryCanceledError: COPY from stdin failed: error in .read() call: UnsupportedOperation not readable
> CONTEXT: COPY t, line 1
> {code}
> This is because the file used in this method is opened with 'w' mode.
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2525) Fix PostgresHook.copy_expert to work with "COPY FROM"
[ https://issues.apache.org/jira/browse/AIRFLOW-2525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499400#comment-16499400 ]

ASF subversion and git services commented on AIRFLOW-2525:
----------------------------------------------------------

Commit e5fb9c7998e67be40cb285971fb941ed4e81273a in incubator-airflow's branch refs/heads/master from [~sekikn]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=e5fb9c7 ]

[AIRFLOW-2525] Fix a bug introduced by commit dabf1b9

The previous commit on this issue (#3421) introduced a new bug on
COPY FROM operation. This PR fixes it by opening a file with 'r+'
mode instead of 'w+' to avoid truncating it.

Closes #3423 from sekikn/AIRFLOW-2525-2

> Fix PostgresHook.copy_expert to work with "COPY FROM"
> -----------------------------------------------------
>
>                 Key: AIRFLOW-2525
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2525
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: hooks
>            Reporter: Kengo Seki
>            Assignee: Kengo Seki
>            Priority: Major
>             Fix For: 2.0.0
>
> psycopg2's {{copy_expert}} [works with both "COPY FROM" and "COPY TO"|http://initd.org/psycopg/docs/cursor.html#cursor.copy_expert].
> But {{PostgresHook.copy_expert}} fails with "COPY FROM" as follows:
> {code}
> In [1]: from airflow.hooks.postgres_hook import PostgresHook
>
> In [2]: hook = PostgresHook()
>
> In [3]: hook.copy_expert("COPY t FROM STDIN", "/tmp/t")
> [2018-05-24 23:37:54,767] {base_hook.py:83} INFO - Using connection to: localhost
> ---------------------------------------------------------------------------
> QueryCanceledError                        Traceback (most recent call last)
> in ()
> ----> 1 hook.copy_expert("COPY t FROM STDIN", "/tmp/t")
>
> ~/dev/incubator-airflow/airflow/hooks/postgres_hook.py in copy_expert(self, sql, filename, open)
>      68         with closing(self.get_conn()) as conn:
>      69             with closing(conn.cursor()) as cur:
> ---> 70                 cur.copy_expert(sql, f)
>      71
>      72     @staticmethod
>
> QueryCanceledError: COPY from stdin failed: error in .read() call: UnsupportedOperation not readable
> CONTEXT: COPY t, line 1
> {code}
> This is because the file used in this method is opened with 'w' mode.
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
incubator-airflow git commit: [AIRFLOW-2525] Fix a bug introduced by commit dabf1b9
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 2ed618883 -> e5fb9c799

[AIRFLOW-2525] Fix a bug introduced by commit dabf1b9

The previous commit on this issue (#3421) introduced a new bug on
COPY FROM operation. This PR fixes it by opening a file with 'r+'
mode instead of 'w+' to avoid truncating it.

Closes #3423 from sekikn/AIRFLOW-2525-2

Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/e5fb9c79
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/e5fb9c79
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/e5fb9c79

Branch: refs/heads/master
Commit: e5fb9c7998e67be40cb285971fb941ed4e81273a
Parents: 2ed6188
Author: Kengo Seki
Authored: Sun Jun 3 15:03:05 2018 +0200
Committer: Fokko Driesprong
Committed: Sun Jun 3 15:03:05 2018 +0200
----------------------------------------------------------------------
 airflow/hooks/postgres_hook.py    | 18 +++++++++++++++---
 tests/hooks/test_postgres_hook.py |  2 +-
 2 files changed, 16 insertions(+), 4 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/e5fb9c79/airflow/hooks/postgres_hook.py
----------------------------------------------------------------------
diff --git a/airflow/hooks/postgres_hook.py b/airflow/hooks/postgres_hook.py
index bbf125b..0395e70 100644
--- a/airflow/hooks/postgres_hook.py
+++ b/airflow/hooks/postgres_hook.py
@@ -17,6 +17,7 @@
 # specific language governing permissions and limitations
 # under the License.
 
+import os
 import psycopg2
 import psycopg2.extensions
 from contextlib import closing
@@ -61,13 +62,24 @@ class PostgresHook(DbApiHook):
 
     def copy_expert(self, sql, filename, open=open):
         """
-        Executes SQL using psycopg2 copy_expert method
-        Necessary to execute COPY command without access to a superuser
+        Executes SQL using psycopg2 copy_expert method.
+        Necessary to execute COPY command without access to a superuser.
+
+        Note: if this method is called with a "COPY FROM" statement and
+        the specified input file does not exist, it creates an empty
+        file and no data is loaded, but the operation succeeds.
+        So if users want to be aware when the input file does not exist,
+        they have to check its existence by themselves.
         """
-        with open(filename, 'w+') as f:
+        if not os.path.isfile(filename):
+            with open(filename, 'w'):
+                pass
+
+        with open(filename, 'r+') as f:
             with closing(self.get_conn()) as conn:
                 with closing(conn.cursor()) as cur:
                     cur.copy_expert(sql, f)
+                    f.truncate(f.tell())
             conn.commit()
 
     @staticmethod

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/e5fb9c79/tests/hooks/test_postgres_hook.py
----------------------------------------------------------------------
diff --git a/tests/hooks/test_postgres_hook.py b/tests/hooks/test_postgres_hook.py
index f636b5a..2828f8b 100644
--- a/tests/hooks/test_postgres_hook.py
+++ b/tests/hooks/test_postgres_hook.py
@@ -55,4 +55,4 @@ class TestPostgresHook(unittest.TestCase):
         self.cur.close.assert_called_once()
         self.conn.commit.assert_called_once()
         self.cur.copy_expert.assert_called_once_with(statement, m.return_value)
-        m.assert_called_once_with(filename, "w+")
+        self.assertEqual(m.call_args[0], (filename, "r+"))
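The file-mode logic in the patch above can be illustrated without a database. In this self-contained sketch, `run_copy` is a hypothetical stand-in for psycopg2's `cursor.copy_expert(sql, f)`; everything connection-related is omitted. It shows both halves of the fix: `'r+'` leaves an existing input file readable for COPY FROM, and the trailing `truncate` prevents stale bytes after a shorter COPY TO.

```python
import os
import tempfile

def copy_expert_sketch(sql, filename, run_copy):
    """Sketch of the fixed copy_expert file handling (not Airflow code).

    run_copy stands in for psycopg2's cursor.copy_expert(sql, f).
    """
    # Create the file if missing so opening with 'r+' cannot fail; as the
    # patched docstring notes, COPY FROM with a missing input file then
    # silently loads nothing.
    if not os.path.isfile(filename):
        with open(filename, 'w'):
            pass
    # 'r+' (not 'w+') so an existing input file is NOT truncated before
    # COPY FROM gets a chance to read it.
    with open(filename, 'r+') as f:
        run_copy(sql, f)
        # After COPY TO, drop any stale bytes left past the write position.
        f.truncate(f.tell())

# Demo: COPY FROM can now read pre-existing data...
path = os.path.join(tempfile.mkdtemp(), "t.csv")
with open(path, "w") as f:
    f.write("1\n2\n")
seen = []
copy_expert_sketch("COPY t FROM STDIN", path, lambda sql, f: seen.append(f.read()))
print(seen)  # ['1\n2\n']

# ...and COPY TO overwriting with shorter output leaves no trailing junk.
copy_expert_sketch("COPY t TO STDOUT", path, lambda sql, f: f.write("9\n"))
with open(path) as f:
    print(repr(f.read()))  # '9\n'
```

With the old `'w+'` mode, the first call would have truncated `t.csv` before the read, which is exactly the `UnsupportedOperation`/empty-input failure reported in the issue.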
[jira] [Created] (AIRFLOW-2556) Reduce time spent on unit tests
Bolke de Bruin created AIRFLOW-2556:
---------------------------------------

             Summary: Reduce time spent on unit tests
                 Key: AIRFLOW-2556
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2556
             Project: Apache Airflow
          Issue Type: Improvement
            Reporter: Bolke de Bruin

Unit tests are taking up way too much time. This costs time and also money
for the Apache Foundation. We need to reduce this.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2557) Reduce time spent in S3 tests
Bolke de Bruin created AIRFLOW-2557:
---------------------------------------

             Summary: Reduce time spent in S3 tests
                 Key: AIRFLOW-2557
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2557
             Project: Apache Airflow
          Issue Type: Sub-task
            Reporter: Bolke de Bruin

--
This message was sent by Atlassian JIRA (v7.6.3#76005)