[jira] [Commented] (AIRFLOW-2558) Clear TASK/DAG is clearing all executions

2018-06-03 Thread Tao Feng (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499602#comment-16499602
 ] 

Tao Feng commented on AIRFLOW-2558:
---

Will look into it this week. Thanks for reporting.

> Clear TASK/DAG is clearing all executions
> -
>
> Key: AIRFLOW-2558
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2558
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: Airflow 2.0
>Reporter: Marcos Bernardelli
>Assignee: Tao Feng
>Priority: Major
>
> When I try to clear a specific DAG/TASK execution, Airflow tries to execute 
> all the past executions:
> [Animated 
> GIF|https://gist.githubusercontent.com/bern4rdelli/34c1e57acd53c8c67417604202f3e0e6/raw/4bcb3d3c23f2a3bb7f7bfb3e977d935e5bb9f0ee/clear.gif]
>  (I failed miserably trying to attach the animated GIF :()
>  
> This behavior was changed here: 
> [https://github.com/apache/incubator-airflow/pull/3444]
> The old version looks like this:
> {code:python}
> drs = session.query(DagModel).filter_by(dag_id=self.dag_id).all()
> {code}
> Then it was changed to:
> {code:python}
> drs = session.query(DagRun).filter_by(dag_id=self.dag_id).all()
> {code}
> This new query (using DagRun) gets all the past executions, even when the 
> "Past" button is not checked.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (AIRFLOW-2557) Reduce time spent in S3 tests

2018-06-03 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong reassigned AIRFLOW-2557:
-

Assignee: Bolke de Bruin

> Reduce time spent in S3 tests
> -
>
> Key: AIRFLOW-2557
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2557
> Project: Apache Airflow
>  Issue Type: Sub-task
>Reporter: Bolke de Bruin
>Assignee: Bolke de Bruin
>Priority: Major
> Fix For: 1.10.0, 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-2557) Reduce time spent in S3 tests

2018-06-03 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong resolved AIRFLOW-2557.
---
   Resolution: Fixed
Fix Version/s: 2.0.0
   1.10.0

Issue resolved by pull request #3455
[https://github.com/apache/incubator-airflow/pull/3455]

> Reduce time spent in S3 tests
> -
>
> Key: AIRFLOW-2557
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2557
> Project: Apache Airflow
>  Issue Type: Sub-task
>Reporter: Bolke de Bruin
>Assignee: Bolke de Bruin
>Priority: Major
> Fix For: 1.10.0, 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2557) Reduce time spent in S3 tests

2018-06-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499548#comment-16499548
 ] 

ASF subversion and git services commented on AIRFLOW-2557:
--

Commit a47b2776f159f8c439de89e68545e90b81a4397a in incubator-airflow's branch 
refs/heads/master from [~bolke]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=a47b277 ]

[AIRFLOW-2557] Fix pagination for s3

Paged tests for s3 are taking over 120 seconds. There is functionality to set
the page size. This reduces the time spent on tests.

Closes #3455 from bolkedebruin/AIRFLOW-2557


> Reduce time spent in S3 tests
> -
>
> Key: AIRFLOW-2557
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2557
> Project: Apache Airflow
>  Issue Type: Sub-task
>Reporter: Bolke de Bruin
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2557) Reduce time spent in S3 tests

2018-06-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499549#comment-16499549
 ] 

ASF subversion and git services commented on AIRFLOW-2557:
--

Commit a47b2776f159f8c439de89e68545e90b81a4397a in incubator-airflow's branch 
refs/heads/master from [~bolke]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=a47b277 ]

[AIRFLOW-2557] Fix pagination for s3

Paged tests for s3 are taking over 120 seconds. There is functionality to set
the page size. This reduces the time spent on tests.

Closes #3455 from bolkedebruin/AIRFLOW-2557


> Reduce time spent in S3 tests
> -
>
> Key: AIRFLOW-2557
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2557
> Project: Apache Airflow
>  Issue Type: Sub-task
>Reporter: Bolke de Bruin
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


incubator-airflow git commit: [AIRFLOW-2557] Fix pagination for s3

2018-06-03 Thread fokko
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 91dd36866 -> a47b2776f


[AIRFLOW-2557] Fix pagination for s3

Paged tests for s3 are taking over 120 seconds. There is functionality to set
the page size. This reduces the time spent on tests.

Closes #3455 from bolkedebruin/AIRFLOW-2557
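As the diff below shows, the commit threads the new page_size/max_items
arguments into boto3's paginator through a PaginationConfig dict. A minimal
sketch of that plumbing (the standalone helper is hypothetical; in the patch
the dict is built inline in list_keys/list_prefixes):

```python
def build_pagination_config(page_size=None, max_items=None):
    """Build the PaginationConfig dict accepted by boto3 paginators.

    Small PageSize/MaxItems values let tests exercise pagination without
    listing thousands of keys, which is where the test-time saving comes from.
    """
    return {
        'PageSize': page_size,
        'MaxItems': max_items,
    }

# In the hook the dict is passed straight through, e.g.:
#   paginator = self.get_conn().get_paginator('list_objects_v2')
#   paginator.paginate(Bucket=bucket_name, Prefix=prefix, Delimiter=delimiter,
#                      PaginationConfig=build_pagination_config(2, 10))
```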


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/a47b2776
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/a47b2776
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/a47b2776

Branch: refs/heads/master
Commit: a47b2776f159f8c439de89e68545e90b81a4397a
Parents: 91dd368
Author: Bolke de Bruin 
Authored: Sun Jun 3 22:29:26 2018 +0200
Committer: Fokko Driesprong 
Committed: Sun Jun 3 22:29:26 2018 +0200

--
 airflow/hooks/S3_hook.py| 30 ++
 tests/hooks/test_s3_hook.py | 18 +++---
 2 files changed, 37 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/a47b2776/airflow/hooks/S3_hook.py
--
diff --git a/airflow/hooks/S3_hook.py b/airflow/hooks/S3_hook.py
index 3d72275..b4f3ac3 100644
--- a/airflow/hooks/S3_hook.py
+++ b/airflow/hooks/S3_hook.py
@@ -77,7 +77,8 @@ class S3Hook(AwsHook):
 plist = self.list_prefixes(bucket_name, previous_level, delimiter)
 return False if plist is None else prefix in plist
 
-def list_prefixes(self, bucket_name, prefix='', delimiter=''):
+def list_prefixes(self, bucket_name, prefix='', delimiter='',
+  page_size=None, max_items=None):
 """
 Lists prefixes in a bucket under prefix
 
@@ -87,11 +88,21 @@ class S3Hook(AwsHook):
 :type prefix: str
 :param delimiter: the delimiter marks key hierarchy.
 :type delimiter: str
+:param page_size: pagination size
+:type page_size: int
+:param max_items: maximum items to return
+:type max_items: int
 """
+config = {
+'PageSize': page_size,
+'MaxItems': max_items,
+}
+
 paginator = self.get_conn().get_paginator('list_objects_v2')
 response = paginator.paginate(Bucket=bucket_name,
   Prefix=prefix,
-  Delimiter=delimiter)
+  Delimiter=delimiter,
+  PaginationConfig=config)
 
 has_results = False
 prefixes = []
@@ -104,7 +115,8 @@ class S3Hook(AwsHook):
 if has_results:
 return prefixes
 
-def list_keys(self, bucket_name, prefix='', delimiter=''):
+def list_keys(self, bucket_name, prefix='', delimiter='',
+  page_size=None, max_items=None):
 """
 Lists keys in a bucket under prefix and not containing delimiter
 
@@ -114,11 +126,21 @@ class S3Hook(AwsHook):
 :type prefix: str
 :param delimiter: the delimiter marks key hierarchy.
 :type delimiter: str
+:param page_size: pagination size
+:type page_size: int
+:param max_items: maximum items to return
+:type max_items: int
 """
+config = {
+'PageSize': page_size,
+'MaxItems': max_items,
+}
+
 paginator = self.get_conn().get_paginator('list_objects_v2')
 response = paginator.paginate(Bucket=bucket_name,
   Prefix=prefix,
-  Delimiter=delimiter)
+  Delimiter=delimiter,
+  PaginationConfig=config)
 
 has_results = False
 keys = []

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/a47b2776/tests/hooks/test_s3_hook.py
--
diff --git a/tests/hooks/test_s3_hook.py b/tests/hooks/test_s3_hook.py
index 94d0e36..baac203 100644
--- a/tests/hooks/test_s3_hook.py
+++ b/tests/hooks/test_s3_hook.py
@@ -7,9 +7,9 @@
 # to you under the Apache License, Version 2.0 (the
 # "License"); you may not use this file except in compliance
 # with the License.  You may obtain a copy of the License at
-# 
+#
 #   http://www.apache.org/licenses/LICENSE-2.0
-# 
+#
 # Unless required by applicable law or agreed to in writing,
 # software distributed under the License is distributed on an
 # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
@@ -96,13 +96,16 @@ class TestS3Hook(unittest.TestCase):
 b = hook.get_bucket('bucket')
 b.create()
 
-keys = ["%s/b" % i for i in range(5000)]
-dirs = ["%s/" % i for i in range(5000)]
+# we 

[jira] [Assigned] (AIRFLOW-1022) Subdag can't receive templated fields

2018-06-03 Thread Chao-Han Tsai (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao-Han Tsai reassigned AIRFLOW-1022:
--

Assignee: Chao-Han Tsai

> Subdag can't receive templated fields
> -
>
> Key: AIRFLOW-1022
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1022
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: subdag
>Reporter: Marcos Takahashi
>Assignee: Chao-Han Tsai
>Priority: Major
>  Labels: easyfix
>
> SubDAGs can't receive any templated fields, because the operator's 
> template_fields is set to tuple() 
> (https://github.com/apache/incubator-airflow/blob/master/airflow/operators/subdag_operator.py#L24)
>  instead of a tuple of templated field names, as on PythonOperator 
> (https://github.com/apache/incubator-airflow/blob/master/airflow/operators/python_operator.py#L52).
> That makes it impossible to get some important values, like execution_date.
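To make the mechanism concrete, a toy sketch (no Airflow import; the class and
function names are illustrative stand-ins) of why an empty template_fields
tuple disables templating: the render step only touches attributes listed in
template_fields, so tuple() means nothing is ever rendered.

```python
class SubDagLikeOperator:
    template_fields = tuple()  # as in subdag_operator.py: nothing is templated

class PythonLikeOperator:
    template_fields = ('templates_dict',)  # as in python_operator.py

def render_templates(op, context):
    """Toy stand-in for the operator render step."""
    for field in op.template_fields:
        value = getattr(op, field)
        if isinstance(value, str):
            setattr(op, field, value.format(**context))

op = PythonLikeOperator()
op.templates_dict = "run for {ds}"
render_templates(op, {"ds": "2018-06-03"})
# op.templates_dict is now "run for 2018-06-03"; a SubDagLikeOperator
# instance would pass through render_templates unchanged.
```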



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2552) Improve description for airflow backfill -c

2018-06-03 Thread Chao-Han Tsai (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao-Han Tsai updated AIRFLOW-2552:
---
Description: 
Improve the description for {code}airflow backfill -c{code}
Example: JSON string that gets pickled into the DagRun/backfill's conf 
attribute

  was:Improve the description for {code}airflow backfill -c{code}


> Improve description for airflow backfill -c
> ---
>
> Key: AIRFLOW-2552
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2552
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Chao-Han Tsai
>Assignee: Chao-Han Tsai
>Priority: Major
>
> Improve the description for {code}airflow backfill -c{code}
> Example: JSON string that gets pickled into the DagRun/backfill's conf 
> attribute
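A short sketch of what that description implies (parse_backfill_conf is a
hypothetical helper for illustration, not Airflow's actual CLI code): the -c
value is a JSON string that is deserialized and stored as the run's conf.

```python
import json

def parse_backfill_conf(conf_arg):
    """Parse the value passed to `airflow backfill -c` into a dict."""
    return json.loads(conf_arg) if conf_arg else None

conf = parse_backfill_conf('{"name": "value"}')
# conf == {'name': 'value'}; this dict is what ends up on the
# DagRun/backfill conf attribute.
```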



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (AIRFLOW-2533) Kubernetes worker configuration improperly sets path to DAG on worker node

2018-06-03 Thread Chao-Han Tsai (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao-Han Tsai reassigned AIRFLOW-2533:
--

Assignee: Chao-Han Tsai

> Kubernetes worker configuration improperly sets path to DAG on worker node
> --
>
> Key: AIRFLOW-2533
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2533
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: executor, worker
>Affects Versions: 1.10.0
>Reporter: Brian Nutt
>Assignee: Chao-Han Tsai
>Priority: Blocker
> Fix For: 1.10
>
> Attachments: Screen Shot 2018-05-24 at 5.06.25 PM.png
>
>
> When triggering a DAG using the kubernetes executor, the path to the DAG on 
> the worker node is not properly set, due to the modification of the command 
> that is sent to the pod for the worker node to perform. See  !Screen Shot 
> 2018-05-24 at 5.06.25 PM.png!
> This means that the only DAGs that work are the example DAGs.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (AIRFLOW-2355) Airflow trigger tag parameters in subdag

2018-06-03 Thread Chao-Han Tsai (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao-Han Tsai reassigned AIRFLOW-2355:
--

Assignee: Chao-Han Tsai

> Airflow trigger tag parameters in subdag
> 
>
> Key: AIRFLOW-2355
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2355
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: api
>Affects Versions: 1.9.0
>Reporter: Mohammed Tameem
>Assignee: Chao-Han Tsai
>Priority: Blocker
> Fix For: 1.9.0
>
>
> The command airflow {color:#8eb021}+_trigger_dag -c 
> "\{'name':'value'}"_+{color} sends conf parameters only to the parent DAG. 
> I'm using SubDAGs that depend on these parameters, and no parameters 
> are received by the SubDAG.
> From the source code of the SubDag operator, I see that there is no way to 
> pass these trigger parameters to a SubDAG.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (AIRFLOW-2533) Kubernetes worker configuration improperly sets path to DAG on worker node

2018-06-03 Thread Chao-Han Tsai (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao-Han Tsai reassigned AIRFLOW-2533:
--

Assignee: (was: Chao-Han Tsai)

> Kubernetes worker configuration improperly sets path to DAG on worker node
> --
>
> Key: AIRFLOW-2533
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2533
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: executor, worker
>Affects Versions: 1.10.0
>Reporter: Brian Nutt
>Priority: Blocker
> Fix For: 1.10
>
> Attachments: Screen Shot 2018-05-24 at 5.06.25 PM.png
>
>
> When triggering a DAG using the kubernetes executor, the path to the DAG on 
> the worker node is not properly set, due to the modification of the command 
> that is sent to the pod for the worker node to perform. See  !Screen Shot 
> 2018-05-24 at 5.06.25 PM.png!
> This means that the only DAGs that work are the example DAGs.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-2558) Clear TASK/DAG is clearing all executions

2018-06-03 Thread Marcos Bernardelli (JIRA)
Marcos Bernardelli created AIRFLOW-2558:
---

 Summary: Clear TASK/DAG is clearing all executions
 Key: AIRFLOW-2558
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2558
 Project: Apache Airflow
  Issue Type: Bug
Affects Versions: Airflow 2.0
Reporter: Marcos Bernardelli
Assignee: Tao Feng


When I try to clear a specific DAG/TASK execution, Airflow tries to execute 
all the past executions:

[Animated 
GIF|https://gist.githubusercontent.com/bern4rdelli/34c1e57acd53c8c67417604202f3e0e6/raw/4bcb3d3c23f2a3bb7f7bfb3e977d935e5bb9f0ee/clear.gif]
 (I failed miserably trying to attach the animated GIF :()
 
This behavior was changed here: 
[https://github.com/apache/incubator-airflow/pull/3444]

The old version looks like this:
{code:python}
drs = session.query(DagModel).filter_by(dag_id=self.dag_id).all()
{code}

Then it was changed to:
{code:python}
drs = session.query(DagRun).filter_by(dag_id=self.dag_id).all()
{code}

This new query (using DagRun) gets all the past executions, even when the "Past" 
button is not checked.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-2545) example_kubernetes_operator.py is causing a DeprecationWarning to be displayed during default install

2018-06-03 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong resolved AIRFLOW-2545.
---
   Resolution: Fixed
Fix Version/s: 2.0.0
   1.10.0

> example_kubernetes_operator.py is causing a DeprecationWarning to be 
> displayed during default install
> -
>
> Key: AIRFLOW-2545
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2545
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10
>Reporter: Craig Rodrigues
>Assignee: Craig Rodrigues
>Priority: Minor
> Fix For: 1.10.0, 2.0.0
>
>
>  
>  {{I tested master branch by putting the following in my requirements.txt:}}
> {{git+https://github.com/rodrigc/incubator-airflow@master#egg=apache-airflow[celery,crypto,emr,hive,hdfs,ldap,mysql,postgres,redis,slack,s3]}}
> and did a pip install -r requirements.txt
>   
>  and then saw this DeprecationWarning:
>   
>  [2018-05-29 14:06:27,567] \{example_kubernetes_operator.py:30} WARNING - 
> Could not import KubernetesPodOperator
>  /Users/c-craigr/airflow2/lib/python2.7/site-packages/airflow/models.py:2315: 
> PendingDeprecationWarning: Invalid arguments were passed to BaseOperator. 
> Support for passing such arguments will be dropped in Airflow 2.0. Invalid 
> arguments were:
>   *args: ()
>  **kwargs: {'name': 'airflow-test-pod', 'image': 'ubuntu:16.04', 'labels':
> {'foo': 'bar'}
> , 'namespace': 'default', 'cmds': ['bash', '-cx'], 'arguments': ['echo', 
> '10'], 'in_cluster': False, 'get_logs': True
>   category=PendingDeprecationWarning



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (AIRFLOW-2545) example_kubernetes_operator.py is causing a DeprecationWarning to be displayed during default install

2018-06-03 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong reassigned AIRFLOW-2545:
-

Assignee: Craig Rodrigues

> example_kubernetes_operator.py is causing a DeprecationWarning to be 
> displayed during default install
> -
>
> Key: AIRFLOW-2545
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2545
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10
>Reporter: Craig Rodrigues
>Assignee: Craig Rodrigues
>Priority: Minor
>
>  
>  {{I tested master branch by putting the following in my requirements.txt:}}
> {{git+https://github.com/rodrigc/incubator-airflow@master#egg=apache-airflow[celery,crypto,emr,hive,hdfs,ldap,mysql,postgres,redis,slack,s3]}}
> and did a pip install -r requirements.txt
>   
>  and then saw this DeprecationWarning:
>   
>  [2018-05-29 14:06:27,567] \{example_kubernetes_operator.py:30} WARNING - 
> Could not import KubernetesPodOperator
>  /Users/c-craigr/airflow2/lib/python2.7/site-packages/airflow/models.py:2315: 
> PendingDeprecationWarning: Invalid arguments were passed to BaseOperator. 
> Support for passing such arguments will be dropped in Airflow 2.0. Invalid 
> arguments were:
>   *args: ()
>  **kwargs: {'name': 'airflow-test-pod', 'image': 'ubuntu:16.04', 'labels':
> {'foo': 'bar'}
> , 'namespace': 'default', 'cmds': ['bash', '-cx'], 'arguments': ['echo', 
> '10'], 'in_cluster': False, 'get_logs': True
>   category=PendingDeprecationWarning



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2545) example_kubernetes_operator.py is causing a DeprecationWarning to be displayed during default install

2018-06-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499417#comment-16499417
 ] 

ASF subversion and git services commented on AIRFLOW-2545:
--

Commit 91dd3686688dc42372ec5ae7b6efb08e35125b1f in incubator-airflow's branch 
refs/heads/master from [~rodrigc]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=91dd368 ]

[AIRFLOW-2545] Eliminate DeprecationWarning

Do not import BaseOperator as KubernetesOperator. This eliminates confusing
DeprecationWarnings when setting up a default airflow install where additional
kubernetes modules are not installed.

Also, use LoggingMixin instead of logger module.

Closes #3442 from rodrigc/AIRFLOW-2545


> example_kubernetes_operator.py is causing a DeprecationWarning to be 
> displayed during default install
> -
>
> Key: AIRFLOW-2545
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2545
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10
>Reporter: Craig Rodrigues
>Priority: Minor
>
>  
>  {{I tested master branch by putting the following in my requirements.txt:}}
> {{git+https://github.com/rodrigc/incubator-airflow@master#egg=apache-airflow[celery,crypto,emr,hive,hdfs,ldap,mysql,postgres,redis,slack,s3]}}
> and did a pip install -r requirements.txt
>   
>  and then saw this DeprecationWarning:
>   
>  [2018-05-29 14:06:27,567] \{example_kubernetes_operator.py:30} WARNING - 
> Could not import KubernetesPodOperator
>  /Users/c-craigr/airflow2/lib/python2.7/site-packages/airflow/models.py:2315: 
> PendingDeprecationWarning: Invalid arguments were passed to BaseOperator. 
> Support for passing such arguments will be dropped in Airflow 2.0. Invalid 
> arguments were:
>   *args: ()
>  **kwargs: {'name': 'airflow-test-pod', 'image': 'ubuntu:16.04', 'labels':
> {'foo': 'bar'}
> , 'namespace': 'default', 'cmds': ['bash', '-cx'], 'arguments': ['echo', 
> '10'], 'in_cluster': False, 'get_logs': True
>   category=PendingDeprecationWarning



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2545) example_kubernetes_operator.py is causing a DeprecationWarning to be displayed during default install

2018-06-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499418#comment-16499418
 ] 

ASF subversion and git services commented on AIRFLOW-2545:
--

Commit 91dd3686688dc42372ec5ae7b6efb08e35125b1f in incubator-airflow's branch 
refs/heads/master from [~rodrigc]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=91dd368 ]

[AIRFLOW-2545] Eliminate DeprecationWarning

Do not import BaseOperator as KubernetesOperator. This eliminates confusing
DeprecationWarnings when setting up a default airflow install where additional
kubernetes modules are not installed.

Also, use LoggingMixin instead of logger module.

Closes #3442 from rodrigc/AIRFLOW-2545


> example_kubernetes_operator.py is causing a DeprecationWarning to be 
> displayed during default install
> -
>
> Key: AIRFLOW-2545
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2545
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10
>Reporter: Craig Rodrigues
>Priority: Minor
>
>  
>  {{I tested master branch by putting the following in my requirements.txt:}}
> {{git+https://github.com/rodrigc/incubator-airflow@master#egg=apache-airflow[celery,crypto,emr,hive,hdfs,ldap,mysql,postgres,redis,slack,s3]}}
> and did a pip install -r requirements.txt
>   
>  and then saw this DeprecationWarning:
>   
>  [2018-05-29 14:06:27,567] \{example_kubernetes_operator.py:30} WARNING - 
> Could not import KubernetesPodOperator
>  /Users/c-craigr/airflow2/lib/python2.7/site-packages/airflow/models.py:2315: 
> PendingDeprecationWarning: Invalid arguments were passed to BaseOperator. 
> Support for passing such arguments will be dropped in Airflow 2.0. Invalid 
> arguments were:
>   *args: ()
>  **kwargs: {'name': 'airflow-test-pod', 'image': 'ubuntu:16.04', 'labels':
> {'foo': 'bar'}
> , 'namespace': 'default', 'cmds': ['bash', '-cx'], 'arguments': ['echo', 
> '10'], 'in_cluster': False, 'get_logs': True
>   category=PendingDeprecationWarning



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


incubator-airflow git commit: [AIRFLOW-2545] Eliminate DeprecationWarning

2018-06-03 Thread fokko
Repository: incubator-airflow
Updated Branches:
  refs/heads/master b7dc31510 -> 91dd36866


[AIRFLOW-2545] Eliminate DeprecationWarning

Do not import BaseOperator as KubernetesOperator. This eliminates confusing
DeprecationWarnings when setting up a default airflow install where additional
kubernetes modules are not installed.

Also, use LoggingMixin instead of logger module.

Closes #3442 from rodrigc/AIRFLOW-2545
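Per the diff below, the commit moves the whole example DAG inside the try
block, so a vanilla install logs a hint instead of aliasing BaseOperator. The
general optional-dependency pattern, sketched with the stdlib logging module
standing in for Airflow's LoggingMixin:

```python
import logging

log = logging.getLogger(__name__)

try:
    # Optional dependency: present only when the kubernetes extra is installed.
    from airflow.contrib.operators.kubernetes_pod_operator import (
        KubernetesPodOperator,
    )
except ImportError as e:
    KubernetesPodOperator = None
    log.warning("Could not import KubernetesPodOperator: %s", e)
    log.warning("Install kubernetes dependencies with: "
                "pip install airflow['kubernetes']")

if KubernetesPodOperator is not None:
    pass  # define the DAG and its KubernetesPodOperator task here
```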


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/91dd3686
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/91dd3686
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/91dd3686

Branch: refs/heads/master
Commit: 91dd3686688dc42372ec5ae7b6efb08e35125b1f
Parents: b7dc315
Author: Craig Rodrigues 
Authored: Sun Jun 3 15:29:31 2018 +0200
Committer: Fokko Driesprong 
Committed: Sun Jun 3 15:29:31 2018 +0200

--
 .../example_dags/example_kubernetes_operator.py | 55 +++-
 1 file changed, 29 insertions(+), 26 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/91dd3686/airflow/example_dags/example_kubernetes_operator.py
--
diff --git a/airflow/example_dags/example_kubernetes_operator.py 
b/airflow/example_dags/example_kubernetes_operator.py
index 8f5ab39..92d73c5 100644
--- a/airflow/example_dags/example_kubernetes_operator.py
+++ b/airflow/example_dags/example_kubernetes_operator.py
@@ -17,37 +17,40 @@
 # specific language governing permissions and limitations
 # under the License.
 
-import airflow
-import logging
+from airflow.utils.dates import days_ago
+from airflow.utils.log.logging_mixin import LoggingMixin
 from airflow.models import DAG
 
+log = LoggingMixin().log
+
 try:
 # Kubernetes is optional, so not available in vanilla Airflow
-# pip install airflow[gcp]
+# pip install airflow[kubernetes]
 from airflow.contrib.operators.kubernetes_pod_operator import 
KubernetesPodOperator
-except ImportError:
-# Just import the BaseOperator as the KubernetesPodOperator
-logging.warn("Could not import KubernetesPodOperator")
-from airflow.models import BaseOperator as KubernetesPodOperator
 
-args = {
-'owner': 'airflow',
-'start_date': airflow.utils.dates.days_ago(2)
-}
+args = {
+'owner': 'airflow',
+'start_date': days_ago(2)
+}
+
+dag = DAG(
+dag_id='example_kubernetes_operator',
+default_args=args,
+schedule_interval=None)
 
-dag = DAG(
-dag_id='example_kubernetes_operator',
-default_args=args,
-schedule_interval=None)
+k = KubernetesPodOperator(
+namespace='default',
+image="ubuntu:16.04",
+cmds=["bash", "-cx"],
+arguments=["echo", "10"],
+labels={"foo": "bar"},
+name="airflow-test-pod",
+in_cluster=False,
+task_id="task",
+get_logs=True,
+dag=dag)
 
-k = KubernetesPodOperator(
-namespace='default',
-image="ubuntu:16.04",
-cmds=["bash", "-cx"],
-arguments=["echo", "10"],
-labels={"foo": "bar"},
-name="airflow-test-pod",
-in_cluster=False,
-task_id="task",
-get_logs=True,
-dag=dag)
+except ImportError as e:
+log.warn("Could not import KubernetesPodOperator: " + str(e))
+log.warn("Install kubernetes dependencies with: "
+ "pip install airflow['kubernetes']")



[jira] [Resolved] (AIRFLOW-2500) Fix MySqlToHiveTransfer to transfer unsigned type properly

2018-06-03 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong resolved AIRFLOW-2500.
---
Resolution: Fixed

> Fix MySqlToHiveTransfer to transfer unsigned type properly
> --
>
> Key: AIRFLOW-2500
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2500
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Reporter: Kengo Seki
>Assignee: Kengo Seki
>Priority: Major
>
> Given the following table,
> {code}
> mysql> USE airflow_ci
> Database changed
> mysql> DESC users;
> +---+--+--+-+-+---+
> | Field | Type | Null | Key | Default | Extra |
> +---+--+--+-+-+---+
> | id| int(10) unsigned | YES  | | NULL|   |
> +---+--+--+-+-+---+
> 1 row in set (0.00 sec)
> mysql> SELECT * FROM users;
> ++
> | id |
> ++
> | 2147483647 |
> | 2147483648 |
> ++
> 2 rows in set (0.00 sec)
> {code}
> executing MySqlToHiveTransfer:
> {code}
> In [1]: from airflow.operators.mysql_to_hive import MySqlToHiveTransfer
>...: t = MySqlToHiveTransfer(sql="SELECT * FROM airflow_ci.users", 
> hive_table="users", recreate=True, task_id="t")
>...: t.execute(None)
>...: 
> [2018-05-21 12:14:09,137] {base_hook.py:83} INFO - Using connection to: 
> localhost
> [2018-05-21 12:14:09,140] {base_hook.py:83} INFO - Using connection to: 
> localhost
> [2018-05-21 12:14:09,146] {hive_hooks.py:427} INFO - DROP TABLE IF EXISTS 
> users;
> CREATE TABLE IF NOT EXISTS users (
> id INT)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY ''
> STORED AS textfile
> ;
> (snip)
> [2018-05-21 12:14:31,667] {hive_hooks.py:233} INFO - Loading data to table 
> default.users
> [2018-05-21 12:14:32,364] {hive_hooks.py:233} INFO - Table default.users 
> stats: [numFiles=1, numRows=0, totalSize=24, rawDataSize=0]
> [2018-05-21 12:14:32,365] {hive_hooks.py:233} INFO - OK
> [2018-05-21 12:14:32,366] {hive_hooks.py:233} INFO - Time taken: 1.299 seconds
> {code}
> ... then values greater than the signed integer upper bound are not 
> properly fetched from Hive.
> {code}
> hive> SELECT * FROM users;
> OK
> 2147483647
> NULL
> Time taken: 2.461 seconds, Fetched: 2 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2500) Fix MySqlToHiveTransfer to transfer unsigned type properly

2018-06-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499413#comment-16499413
 ] 

ASF subversion and git services commented on AIRFLOW-2500:
--

Commit b7dc315101a0fc924f76e5c5f500c96e85bdd672 in incubator-airflow's branch 
refs/heads/master from [~sekikn]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=b7dc315 ]

[AIRFLOW-2500] Fix MySqlToHiveTransfer to transfer unsigned type properly

MySQL supports unsigned data types, but Hive doesn't. So if
MySqlToHiveTransfer maps MySQL's data types to Hive's corresponding ones
directly (e.g. INT -> INT), unsigned values over signed type's upper bound
transferred from MySQL are interpreted as invalid by Hive, and users get
NULL. To avoid it, this PR fixes MySqlToHiveTransfer to map MySQL data types
to Hive's wider ones (e.g. SMALLINT -> INT, INT -> BIGINT, etc.).

Closes #3446 from sekikn/AIRFLOW-2500
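The widening idea can be sketched as a small lookup (the mapping and function
below are illustrative, based on the examples in the commit message, not the
exact type_map inside MySqlToHiveTransfer):

```python
# Hive has no unsigned types, so each unsigned MySQL integer type must map
# to the next wider signed Hive type to cover its full value range.
WIDEN_FOR_UNSIGNED = {
    'TINYINT': 'SMALLINT',
    'SMALLINT': 'INT',
    'MEDIUMINT': 'INT',
    'INT': 'BIGINT',
}

def hive_type_for(mysql_type, unsigned):
    """Pick a Hive type wide enough for the MySQL column's value range."""
    mysql_type = mysql_type.upper()
    if unsigned:
        return WIDEN_FOR_UNSIGNED.get(mysql_type, mysql_type)
    return mysql_type

# An unsigned INT can hold 4294967295, which overflows Hive's signed INT
# (max 2147483647) and produced the NULL in the report, so it lands in BIGINT.
```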


> Fix MySqlToHiveTransfer to transfer unsigned type properly
> --
>
> Key: AIRFLOW-2500
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2500
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Reporter: Kengo Seki
>Assignee: Kengo Seki
>Priority: Major
>
> Given the following table,
> {code}
> mysql> USE airflow_ci
> Database changed
> mysql> DESC users;
> +---+--+--+-+-+---+
> | Field | Type | Null | Key | Default | Extra |
> +---+--+--+-+-+---+
> | id| int(10) unsigned | YES  | | NULL|   |
> +---+--+--+-+-+---+
> 1 row in set (0.00 sec)
> mysql> SELECT * FROM users;
> ++
> | id |
> ++
> | 2147483647 |
> | 2147483648 |
> ++
> 2 rows in set (0.00 sec)
> {code}
> executing MySqlToHiveTransfer:
> {code}
> In [1]: from airflow.operators.mysql_to_hive import MySqlToHiveTransfer
>...: t = MySqlToHiveTransfer(sql="SELECT * FROM airflow_ci.users", 
> hive_table="users", recreate=True, task_id="t")
>...: t.execute(None)
>...: 
> [2018-05-21 12:14:09,137] {base_hook.py:83} INFO - Using connection to: 
> localhost
> [2018-05-21 12:14:09,140] {base_hook.py:83} INFO - Using connection to: 
> localhost
> [2018-05-21 12:14:09,146] {hive_hooks.py:427} INFO - DROP TABLE IF EXISTS 
> users;
> CREATE TABLE IF NOT EXISTS users (
> id INT)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY ''
> STORED AS textfile
> ;
> (snip)
> [2018-05-21 12:14:31,667] {hive_hooks.py:233} INFO - Loading data to table 
> default.users
> [2018-05-21 12:14:32,364] {hive_hooks.py:233} INFO - Table default.users 
> stats: [numFiles=1, numRows=0, totalSize=24, rawDataSize=0]
> [2018-05-21 12:14:32,365] {hive_hooks.py:233} INFO - OK
> [2018-05-21 12:14:32,366] {hive_hooks.py:233} INFO - Time taken: 1.299 seconds
> {code}
> ... then the value greater than the upper bound for signed integer is not 
> properly fetched from Hive.
> {code}
> hive> SELECT * FROM users;
> OK
> 2147483647
> NULL
> Time taken: 2.461 seconds, Fetched: 2 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2500) Fix MySqlToHiveTransfer to transfer unsigned type properly

2018-06-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499414#comment-16499414
 ] 

ASF subversion and git services commented on AIRFLOW-2500:
--

Commit b7dc315101a0fc924f76e5c5f500c96e85bdd672 in incubator-airflow's branch 
refs/heads/master from [~sekikn]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=b7dc315 ]

[AIRFLOW-2500] Fix MySqlToHiveTransfer to transfer unsigned type properly

MySQL supports unsigned data types, but Hive doesn't.
So if MySqlToHiveTransfer maps MySQL's data types to
Hive's corresponding ones directly (e.g. INT -> INT),
unsigned values over signed type's upper bound
transferred from MySQL are interpreted as invalid
by Hive, and users get NULL.
To avoid it, this PR fixes MySqlToHiveTransfer
to map MySQL data types to Hive's wider ones
(e.g. SMALLINT -> INT, INT -> BIGINT, etc.).

Closes #3446 from sekikn/AIRFLOW-2500


> Fix MySqlToHiveTransfer to transfer unsigned type properly
> --
>
> Key: AIRFLOW-2500
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2500
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Reporter: Kengo Seki
>Assignee: Kengo Seki
>Priority: Major
>
> Given the following table,
> {code}
> mysql> USE airflow_ci
> Database changed
> mysql> DESC users;
> +-------+------------------+------+-----+---------+-------+
> | Field | Type             | Null | Key | Default | Extra |
> +-------+------------------+------+-----+---------+-------+
> | id    | int(10) unsigned | YES  |     | NULL    |       |
> +-------+------------------+------+-----+---------+-------+
> 1 row in set (0.00 sec)
> mysql> SELECT * FROM users;
> +------------+
> | id         |
> +------------+
> | 2147483647 |
> | 2147483648 |
> +------------+
> 2 rows in set (0.00 sec)
> {code}
> executing MySqlToHiveTransfer:
> {code}
> In [1]: from airflow.operators.mysql_to_hive import MySqlToHiveTransfer
>...: t = MySqlToHiveTransfer(sql="SELECT * FROM airflow_ci.users", 
> hive_table="users", recreate=True, task_id="t")
>...: t.execute(None)
>...: 
> [2018-05-21 12:14:09,137] {base_hook.py:83} INFO - Using connection to: 
> localhost
> [2018-05-21 12:14:09,140] {base_hook.py:83} INFO - Using connection to: 
> localhost
> [2018-05-21 12:14:09,146] {hive_hooks.py:427} INFO - DROP TABLE IF EXISTS 
> users;
> CREATE TABLE IF NOT EXISTS users (
> id INT)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY ''
> STORED AS textfile
> ;
> (snip)
> [2018-05-21 12:14:31,667] {hive_hooks.py:233} INFO - Loading data to table 
> default.users
> [2018-05-21 12:14:32,364] {hive_hooks.py:233} INFO - Table default.users 
> stats: [numFiles=1, numRows=0, totalSize=24, rawDataSize=0]
> [2018-05-21 12:14:32,365] {hive_hooks.py:233} INFO - OK
> [2018-05-21 12:14:32,366] {hive_hooks.py:233} INFO - Time taken: 1.299 seconds
> {code}
> ... then the value greater than the upper bound for signed integer is not 
> properly fetched from Hive.
> {code}
> hive> SELECT * FROM users;
> OK
> 2147483647
> NULL
> Time taken: 2.461 seconds, Fetched: 2 row(s)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


incubator-airflow git commit: [AIRFLOW-2500] Fix MySqlToHiveTransfer to transfer unsigned type properly

2018-06-03 Thread fokko
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 29dbedfd0 -> b7dc31510


[AIRFLOW-2500] Fix MySqlToHiveTransfer to transfer unsigned type properly

MySQL supports unsigned data types, but Hive doesn't.
So if MySqlToHiveTransfer maps MySQL's data types to
Hive's corresponding ones directly (e.g. INT -> INT),
unsigned values over signed type's upper bound
transferred from MySQL are interpreted as invalid
by Hive, and users get NULL.
To avoid it, this PR fixes MySqlToHiveTransfer
to map MySQL data types to Hive's wider ones
(e.g. SMALLINT -> INT, INT -> BIGINT, etc.).

Closes #3446 from sekikn/AIRFLOW-2500


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/b7dc3151
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/b7dc3151
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/b7dc3151

Branch: refs/heads/master
Commit: b7dc315101a0fc924f76e5c5f500c96e85bdd672
Parents: 29dbedf
Author: Kengo Seki 
Authored: Sun Jun 3 15:24:19 2018 +0200
Committer: Fokko Driesprong 
Committed: Sun Jun 3 15:24:19 2018 +0200

--
 airflow/operators/mysql_to_hive.py |   5 +-
 tests/operators/operators.py   | 112 
 2 files changed, 115 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/b7dc3151/airflow/operators/mysql_to_hive.py
--
diff --git a/airflow/operators/mysql_to_hive.py 
b/airflow/operators/mysql_to_hive.py
index 22b7ac2..ab7b7fa 100644
--- a/airflow/operators/mysql_to_hive.py
+++ b/airflow/operators/mysql_to_hive.py
@@ -104,9 +104,10 @@ class MySqlToHiveTransfer(BaseOperator):
 t.DOUBLE: 'DOUBLE',
 t.FLOAT: 'DOUBLE',
 t.INT24: 'INT',
-t.LONG: 'INT',
-t.LONGLONG: 'BIGINT',
+t.LONG: 'BIGINT',
+t.LONGLONG: 'DECIMAL(38,0)',
 t.SHORT: 'INT',
+t.TINY: 'SMALLINT',
 t.YEAR: 'INT',
 }
 return d[mysql_type] if mysql_type in d else 'STRING'
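
In isolation, the widened mapping behaves like a dict with a STRING fallback, picking a Hive type wide enough for each MySQL type's full unsigned range (a minimal sketch; the names are illustrative, not the operator's actual code):

```python
# Sketch of the widened MySQL -> Hive type map from this commit.
# Each Hive target must hold the MySQL type's full UNSIGNED range:
# e.g. unsigned INT tops out at 4294967295, past Hive INT's 2147483647.
MYSQL_TO_HIVE = {
    'TINY': 'SMALLINT',            # unsigned TINYINT max 255
    'SHORT': 'INT',                # unsigned SMALLINT max 65535
    'INT24': 'INT',                # unsigned MEDIUMINT max 16777215
    'LONG': 'BIGINT',              # unsigned INT max 4294967295
    'LONGLONG': 'DECIMAL(38,0)',   # unsigned BIGINT exceeds Hive BIGINT
}

def hive_type(mysql_type):
    # Unknown types fall back to STRING, as the operator does.
    return MYSQL_TO_HIVE.get(mysql_type, 'STRING')

assert hive_type('LONG') == 'BIGINT'
assert hive_type('LONGLONG') == 'DECIMAL(38,0)'
```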

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/b7dc3151/tests/operators/operators.py
--
diff --git a/tests/operators/operators.py b/tests/operators/operators.py
index 4eba0a8..07b1396 100644
--- a/tests/operators/operators.py
+++ b/tests/operators/operators.py
@@ -23,8 +23,11 @@ from airflow import DAG, configuration, operators
 from airflow.utils.tests import skipUnlessImported
 from airflow.utils import timezone
 
+from collections import OrderedDict
+
 import os
 import mock
+import six
 import unittest
 
 configuration.load_test_config()
@@ -313,3 +316,112 @@ class TransferTests(unittest.TestCase):
 tblproperties={'test_property':'test_value'},
 dag=self.dag)
 t.run(start_date=DEFAULT_DATE, end_date=DEFAULT_DATE, 
ignore_ti_state=True)
+
+@mock.patch('airflow.hooks.hive_hooks.HiveCliHook.load_file')
+def test_mysql_to_hive_type_conversion(self, mock_load_file):
+mysql_conn_id = 'airflow_ci'
+mysql_table = 'test_mysql_to_hive'
+
+from airflow.hooks.mysql_hook import MySqlHook
+m = MySqlHook(mysql_conn_id)
+
+try:
+with m.get_conn() as c:
+c.execute("DROP TABLE IF EXISTS {}".format(mysql_table))
+c.execute("""
+CREATE TABLE {} (
+c0 TINYINT,
+c1 SMALLINT,
+c2 MEDIUMINT,
+c3 INT,
+c4 BIGINT
+)
+""".format(mysql_table))
+
+from airflow.operators.mysql_to_hive import MySqlToHiveTransfer
+t = MySqlToHiveTransfer(
+task_id='test_m2h',
+mysql_conn_id=mysql_conn_id,
+hive_cli_conn_id='beeline_default',
+sql="SELECT * FROM {}".format(mysql_table),
+hive_table='test_mysql_to_hive',
+dag=self.dag)
+t.run(start_date=DEFAULT_DATE, end_date=DEFAULT_DATE, 
ignore_ti_state=True)
+
+mock_load_file.assert_called_once()
+d = OrderedDict()
+d["c0"] = "SMALLINT"
+d["c1"] = "INT"
+d["c2"] = "INT"
+d["c3"] = "BIGINT"
+d["c4"] = "DECIMAL(38,0)"
+self.assertEqual(mock_load_file.call_args[1]["field_dict"], d)
+finally:
+with m.get_conn() as c:
+c.execute("DROP TABLE IF EXISTS {}".format(mysql_table))
+
+@unittest.skipIf(six.PY2, "Skip since HiveServer2Hook doesn't work "
+  "on Python2 for now. Se

[jira] [Commented] (AIRFLOW-2557) Reduce time spent in S3 tests

2018-06-03 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499406#comment-16499406
 ] 

Ash Berlin-Taylor commented on AIRFLOW-2557:


The moto library (which is in use partially) should probably be used everywhere 
in the S3 tests. This might speed things up.

> Reduce time spent in S3 tests
> -
>
> Key: AIRFLOW-2557
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2557
> Project: Apache Airflow
>  Issue Type: Sub-task
>Reporter: Bolke de Bruin
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (AIRFLOW-2462) airflow.contrib.auth.backends.password_auth.PasswordUser exists bug

2018-06-03 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong reassigned AIRFLOW-2462:
-

Assignee: froginwell

> airflow.contrib.auth.backends.password_auth.PasswordUser exists bug
> ---
>
> Key: AIRFLOW-2462
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2462
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication, contrib
>Affects Versions: 1.9.0
>Reporter: froginwell
>Assignee: froginwell
>Priority: Blocker
>
>  PasswordUser
> {quote}
> @password.setter
> def _set_password(self, plaintext):
>     self._password = generate_password_hash(plaintext, 12)
>     if PY3:
>         self._password = str(self._password, 'utf-8')
> {quote}
> _set_password should be renamed to password.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-2462) airflow.contrib.auth.backends.password_auth.PasswordUser exists bug

2018-06-03 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong resolved AIRFLOW-2462.
---
Resolution: Fixed

> airflow.contrib.auth.backends.password_auth.PasswordUser exists bug
> ---
>
> Key: AIRFLOW-2462
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2462
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication, contrib
>Affects Versions: 1.9.0
>Reporter: froginwell
>Assignee: froginwell
>Priority: Blocker
>
>  PasswordUser
> {quote}
> @password.setter
> def _set_password(self, plaintext):
>     self._password = generate_password_hash(plaintext, 12)
>     if PY3:
>         self._password = str(self._password, 'utf-8')
> {quote}
> _set_password should be renamed to password.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2556) Reduce time spent on unit tests

2018-06-03 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499403#comment-16499403
 ] 

Ash Berlin-Taylor commented on AIRFLOW-2556:


Does this actually cost Apache money, or is it free because it's an open-source 
project? Reducing test time is definitely still worthwhile though, just curious.


> Reduce time spent on unit tests
> ---
>
> Key: AIRFLOW-2556
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2556
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Bolke de Bruin
>Priority: Major
>
> Unit tests are taking up way too much time. This costs time and also money 
> for the Apache Foundation. We need to reduce this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2462) airflow.contrib.auth.backends.password_auth.PasswordUser exists bug

2018-06-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499402#comment-16499402
 ] 

ASF subversion and git services commented on AIRFLOW-2462:
--

Commit 29dbedfd0a49c54eaf388ff940ec7cfe4a6e1f7f in incubator-airflow's branch 
refs/heads/master from [~noremac201]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=29dbedf ]

[AIRFLOW-2462] Change PasswordUser setter to correct syntax

Closes #3415 from Noremac201/setterFix


> airflow.contrib.auth.backends.password_auth.PasswordUser exists bug
> ---
>
> Key: AIRFLOW-2462
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2462
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: authentication, contrib
>Affects Versions: 1.9.0
>Reporter: froginwell
>Priority: Blocker
>
>  PasswordUser
> {quote}
> @password.setter
> def _set_password(self, plaintext):
>     self._password = generate_password_hash(plaintext, 12)
>     if PY3:
>         self._password = str(self._password, 'utf-8')
> {quote}
> _set_password should be renamed to password.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


incubator-airflow git commit: [AIRFLOW-2462] Change PasswordUser setter to correct syntax

2018-06-03 Thread fokko
Repository: incubator-airflow
Updated Branches:
  refs/heads/master e5fb9c799 -> 29dbedfd0


[AIRFLOW-2462] Change PasswordUser setter to correct syntax

Closes #3415 from Noremac201/setterFix


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/29dbedfd
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/29dbedfd
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/29dbedfd

Branch: refs/heads/master
Commit: 29dbedfd0a49c54eaf388ff940ec7cfe4a6e1f7f
Parents: e5fb9c7
Author: Cameron Moberg 
Authored: Sun Jun 3 15:07:11 2018 +0200
Committer: Fokko Driesprong 
Committed: Sun Jun 3 15:07:11 2018 +0200

--
 airflow/contrib/auth/backends/password_auth.py |  2 +-
 tests/core.py  | 38 +
 2 files changed, 39 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/29dbedfd/airflow/contrib/auth/backends/password_auth.py
--
diff --git a/airflow/contrib/auth/backends/password_auth.py 
b/airflow/contrib/auth/backends/password_auth.py
index 9e16bb6..879aaa1 100644
--- a/airflow/contrib/auth/backends/password_auth.py
+++ b/airflow/contrib/auth/backends/password_auth.py
@@ -63,7 +63,7 @@ class PasswordUser(models.User):
 return self._password
 
 @password.setter
-def _set_password(self, plaintext):
+def password(self, plaintext):
 self._password = generate_password_hash(plaintext, 12)
 if PY3:
 self._password = str(self._password, 'utf-8')
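
The one-word fix above hinges on a Python property rule: the function decorated with `@password.setter` must reuse the property's name, otherwise assignment never reaches the setter. A minimal sketch (class names and the hash prefix are illustrative, not Airflow's code):

```python
# Sketch of the setter-naming rule behind this fix.
class User:
    def __init__(self):
        self._password = None

    @property
    def password(self):
        return self._password

    @password.setter
    def password(self, plaintext):          # must be named `password`
        self._password = 'hashed:' + plaintext

class Broken:
    @property
    def password(self):
        return self._password

    @password.setter
    def _set_password(self, plaintext):     # wrong name: creates a separate
        self._password = plaintext          # `_set_password` property instead

u = User()
u.password = 'secret'
assert u.password == 'hashed:secret'        # setter ran

b = Broken()
try:
    b.password = 'secret'                   # never reaches the misnamed setter
except AttributeError:
    pass                                    # `password` stayed read-only
```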

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/29dbedfd/tests/core.py
--
diff --git a/tests/core.py b/tests/core.py
index 83737ed..0312fed 100644
--- a/tests/core.py
+++ b/tests/core.py
@@ -1865,6 +1865,44 @@ class SecureModeWebUiTests(unittest.TestCase):
 configuration.conf.remove_option("core", "SECURE_MODE")
 
 
+class PasswordUserTest(unittest.TestCase):
+def setUp(self):
+user = models.User()
+from airflow.contrib.auth.backends.password_auth import PasswordUser
+self.password_user = PasswordUser(user)
+self.password_user.username = "password_test"
+
+
@mock.patch('airflow.contrib.auth.backends.password_auth.generate_password_hash')
+def test_password_setter(self, mock_gen_pass_hash):
+mock_gen_pass_hash.return_value = b"hashed_pass" if six.PY3 else 
"hashed_pass"
+
+self.password_user.password = "secure_password"
+mock_gen_pass_hash.assert_called_with("secure_password", 12)
+
+def test_password_unicode(self):
+# In python2.7 no conversion is required back to str
+# In python >= 3 the method must convert from bytes to str
+self.password_user.password = "secure_password"
+self.assertIsInstance(self.password_user.password, str)
+
+def test_password_user_authenticate(self):
+self.password_user.password = "secure_password"
+self.assertTrue(self.password_user.authenticate("secure_password"))
+
+def test_password_authenticate_session(self):
+from airflow.contrib.auth.backends.password_auth import PasswordUser
+self.password_user.password = 'test_password'
+session = Session()
+session.add(self.password_user)
+session.commit()
+query_user = session.query(PasswordUser).filter_by(
+username=self.password_user.username).first()
+self.assertTrue(query_user.authenticate('test_password'))
+session.query(models.User).delete()
+session.commit()
+session.close()
+
+
 class WebPasswordAuthTest(unittest.TestCase):
 def setUp(self):
 configuration.conf.set("webserver", "authenticate", "True")



[jira] [Closed] (AIRFLOW-2525) Fix PostgresHook.copy_expert to work with "COPY FROM"

2018-06-03 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong closed AIRFLOW-2525.
-

> Fix PostgresHook.copy_expert to work with "COPY FROM"
> -
>
> Key: AIRFLOW-2525
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2525
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks
>Reporter: Kengo Seki
>Assignee: Kengo Seki
>Priority: Major
> Fix For: 2.0.0
>
>
> psycopg2's {{copy_expert}} [works with both "COPY FROM" and "COPY 
> TO"|http://initd.org/psycopg/docs/cursor.html#cursor.copy_expert].
> But {{PostgresHook.copy_expert}} fails with "COPY FROM" as follows:
> {code}
> In [1]: from airflow.hooks.postgres_hook import PostgresHook
> In [2]: hook = PostgresHook()
> In [3]: hook.copy_expert("COPY t FROM STDIN", "/tmp/t")
> [2018-05-24 23:37:54,767] {base_hook.py:83} INFO - Using connection to: 
> localhost
> ---
> QueryCanceledErrorTraceback (most recent call last)
>  in ()
> > 1 hook.copy_expert("COPY t FROM STDIN", "/tmp/t")
> ~/dev/incubator-airflow/airflow/hooks/postgres_hook.py in copy_expert(self, 
> sql, filename, open)
>  68 with closing(self.get_conn()) as conn:
>  69 with closing(conn.cursor()) as cur:
> ---> 70 cur.copy_expert(sql, f)
>  71
>  72 @staticmethod
> QueryCanceledError: COPY from stdin failed: error in .read() call: 
> UnsupportedOperation not readable
> CONTEXT:  COPY t, line 1
> {code}
> This is because the file used in this method is opened with 'w' mode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2525) Fix PostgresHook.copy_expert to work with "COPY FROM"

2018-06-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499400#comment-16499400
 ] 

ASF subversion and git services commented on AIRFLOW-2525:
--

Commit e5fb9c7998e67be40cb285971fb941ed4e81273a in incubator-airflow's branch 
refs/heads/master from [~sekikn]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=e5fb9c7 ]

[AIRFLOW-2525] Fix a bug introduced by commit dabf1b9

The previous commit on this issue (#3421)
introduced a new bug in the COPY FROM
operation. This PR fixes it by opening the
file with 'r+' mode instead of 'w+' to avoid
truncating it.

Closes #3423 from sekikn/AIRFLOW-2525-2


> Fix PostgresHook.copy_expert to work with "COPY FROM"
> -
>
> Key: AIRFLOW-2525
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2525
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks
>Reporter: Kengo Seki
>Assignee: Kengo Seki
>Priority: Major
> Fix For: 2.0.0
>
>
> psycopg2's {{copy_expert}} [works with both "COPY FROM" and "COPY 
> TO"|http://initd.org/psycopg/docs/cursor.html#cursor.copy_expert].
> But {{PostgresHook.copy_expert}} fails with "COPY FROM" as follows:
> {code}
> In [1]: from airflow.hooks.postgres_hook import PostgresHook
> In [2]: hook = PostgresHook()
> In [3]: hook.copy_expert("COPY t FROM STDIN", "/tmp/t")
> [2018-05-24 23:37:54,767] {base_hook.py:83} INFO - Using connection to: 
> localhost
> ---
> QueryCanceledErrorTraceback (most recent call last)
>  in ()
> > 1 hook.copy_expert("COPY t FROM STDIN", "/tmp/t")
> ~/dev/incubator-airflow/airflow/hooks/postgres_hook.py in copy_expert(self, 
> sql, filename, open)
>  68 with closing(self.get_conn()) as conn:
>  69 with closing(conn.cursor()) as cur:
> ---> 70 cur.copy_expert(sql, f)
>  71
>  72 @staticmethod
> QueryCanceledError: COPY from stdin failed: error in .read() call: 
> UnsupportedOperation not readable
> CONTEXT:  COPY t, line 1
> {code}
> This is because the file used in this method is opened with 'w' mode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2525) Fix PostgresHook.copy_expert to work with "COPY FROM"

2018-06-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499399#comment-16499399
 ] 

ASF subversion and git services commented on AIRFLOW-2525:
--

Commit e5fb9c7998e67be40cb285971fb941ed4e81273a in incubator-airflow's branch 
refs/heads/master from [~sekikn]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=e5fb9c7 ]

[AIRFLOW-2525] Fix a bug introduced by commit dabf1b9

The previous commit on this issue (#3421)
introduced a new bug in the COPY FROM
operation. This PR fixes it by opening the
file with 'r+' mode instead of 'w+' to avoid
truncating it.

Closes #3423 from sekikn/AIRFLOW-2525-2


> Fix PostgresHook.copy_expert to work with "COPY FROM"
> -
>
> Key: AIRFLOW-2525
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2525
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks
>Reporter: Kengo Seki
>Assignee: Kengo Seki
>Priority: Major
> Fix For: 2.0.0
>
>
> psycopg2's {{copy_expert}} [works with both "COPY FROM" and "COPY 
> TO"|http://initd.org/psycopg/docs/cursor.html#cursor.copy_expert].
> But {{PostgresHook.copy_expert}} fails with "COPY FROM" as follows:
> {code}
> In [1]: from airflow.hooks.postgres_hook import PostgresHook
> In [2]: hook = PostgresHook()
> In [3]: hook.copy_expert("COPY t FROM STDIN", "/tmp/t")
> [2018-05-24 23:37:54,767] {base_hook.py:83} INFO - Using connection to: 
> localhost
> ---
> QueryCanceledErrorTraceback (most recent call last)
>  in ()
> > 1 hook.copy_expert("COPY t FROM STDIN", "/tmp/t")
> ~/dev/incubator-airflow/airflow/hooks/postgres_hook.py in copy_expert(self, 
> sql, filename, open)
>  68 with closing(self.get_conn()) as conn:
>  69 with closing(conn.cursor()) as cur:
> ---> 70 cur.copy_expert(sql, f)
>  71
>  72 @staticmethod
> QueryCanceledError: COPY from stdin failed: error in .read() call: 
> UnsupportedOperation not readable
> CONTEXT:  COPY t, line 1
> {code}
> This is because the file used in this method is opened with 'w' mode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


incubator-airflow git commit: [AIRFLOW-2525] Fix a bug introduced by commit dabf1b9

2018-06-03 Thread fokko
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 2ed618883 -> e5fb9c799


[AIRFLOW-2525] Fix a bug introduced by commit dabf1b9

The previous commit on this issue (#3421)
introduced a new bug in the COPY FROM
operation. This PR fixes it by opening the
file with 'r+' mode instead of 'w+' to avoid
truncating it.

Closes #3423 from sekikn/AIRFLOW-2525-2


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/e5fb9c79
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/e5fb9c79
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/e5fb9c79

Branch: refs/heads/master
Commit: e5fb9c7998e67be40cb285971fb941ed4e81273a
Parents: 2ed6188
Author: Kengo Seki 
Authored: Sun Jun 3 15:03:05 2018 +0200
Committer: Fokko Driesprong 
Committed: Sun Jun 3 15:03:05 2018 +0200

--
 airflow/hooks/postgres_hook.py| 18 +++---
 tests/hooks/test_postgres_hook.py |  2 +-
 2 files changed, 16 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/e5fb9c79/airflow/hooks/postgres_hook.py
--
diff --git a/airflow/hooks/postgres_hook.py b/airflow/hooks/postgres_hook.py
index bbf125b..0395e70 100644
--- a/airflow/hooks/postgres_hook.py
+++ b/airflow/hooks/postgres_hook.py
@@ -17,6 +17,7 @@
 # specific language governing permissions and limitations
 # under the License.
 
+import os
 import psycopg2
 import psycopg2.extensions
 from contextlib import closing
@@ -61,13 +62,24 @@ class PostgresHook(DbApiHook):
 
 def copy_expert(self, sql, filename, open=open):
 """
-Executes SQL using psycopg2 copy_expert method
-Necessary to execute COPY command without access to a superuser
+Executes SQL using psycopg2 copy_expert method.
+Necessary to execute COPY command without access to a superuser.
+
+Note: if this method is called with a "COPY FROM" statement and
+the specified input file does not exist, it creates an empty
+file and no data is loaded, but the operation succeeds.
+So if users want to be aware when the input file does not exist,
+they have to check its existence by themselves.
 """
-with open(filename, 'w+') as f:
+if not os.path.isfile(filename):
+with open(filename, 'w'):
+pass
+
+with open(filename, 'r+') as f:
 with closing(self.get_conn()) as conn:
 with closing(conn.cursor()) as cur:
 cur.copy_expert(sql, f)
+f.truncate(f.tell())
 conn.commit()
 
 @staticmethod
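
The mode change above is the heart of the fix: 'w+' truncates the file on open, so COPY FROM read an empty file, while 'r+' keeps the contents readable. A stdlib-only sketch of the difference (the file path is illustrative):

```python
# Sketch of why 'w+' broke COPY FROM and 'r+' fixes it.
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 't.csv')
with open(path, 'w') as f:
    f.write('1\n2\n')

with open(path, 'w+') as f:     # old mode: truncates on open
    assert f.read() == ''       # the data COPY FROM needed is already gone

with open(path, 'w') as f:      # restore the input file
    f.write('1\n2\n')

with open(path, 'r+') as f:     # fixed mode: existing data readable,
    assert f.read() == '1\n2\n' # file still writable for COPY TO
```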

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/e5fb9c79/tests/hooks/test_postgres_hook.py
--
diff --git a/tests/hooks/test_postgres_hook.py 
b/tests/hooks/test_postgres_hook.py
index f636b5a..2828f8b 100644
--- a/tests/hooks/test_postgres_hook.py
+++ b/tests/hooks/test_postgres_hook.py
@@ -55,4 +55,4 @@ class TestPostgresHook(unittest.TestCase):
 self.cur.close.assert_called_once()
 self.conn.commit.assert_called_once()
 self.cur.copy_expert.assert_called_once_with(statement, 
m.return_value)
-m.assert_called_once_with(filename, "w+")
+self.assertEqual(m.call_args[0], (filename, "r+"))



[jira] [Created] (AIRFLOW-2556) Reduce time spent on unit tests

2018-06-03 Thread Bolke de Bruin (JIRA)
Bolke de Bruin created AIRFLOW-2556:
---

 Summary: Reduce time spent on unit tests
 Key: AIRFLOW-2556
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2556
 Project: Apache Airflow
  Issue Type: Improvement
Reporter: Bolke de Bruin


Unit tests are taking up way too much time. This costs time and also money 
for the Apache Foundation. We need to reduce this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-2557) Reduce time spent in S3 tests

2018-06-03 Thread Bolke de Bruin (JIRA)
Bolke de Bruin created AIRFLOW-2557:
---

 Summary: Reduce time spent in S3 tests
 Key: AIRFLOW-2557
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2557
 Project: Apache Airflow
  Issue Type: Sub-task
Reporter: Bolke de Bruin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)