[jira] [Comment Edited] (AIRFLOW-2813) `pip install apache-airflow` fails

2018-08-29 Thread Shubham Gupta (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595969#comment-16595969
 ] 

Shubham Gupta edited comment on AIRFLOW-2813 at 8/29/18 6:07 AM:
-

{{Airflow 1.10}} was left out of the [{{PyPi}} 
release|https://pypi.org/project/apache-airflow/#history]. How, then, can one 
install {{Airflow v1.10}}?


was (Author: y2k-shubham):
Since it was left out of the [{{PyPi}} 
release|https://pypi.org/project/apache-airflow/#history], I'm unable to figure 
out how to install {{Airflow v1.10}}.

> `pip install apache-airflow` fails
> --
>
> Key: AIRFLOW-2813
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2813
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.9.0
> Environment: Mac OS, Linux, Windows
>Reporter: Jeff Schwab
>Priority: Major
>
> `pip install apache-airflow` fails with a SyntaxError on Mac OS, and with a 
> different (extremely verbose) error on Linux.  This happens both on my 
> MacBook and on a fresh Alpine Linux Docker image, and with both pip2 and 
> pip3; a friend just tried `pip install apache-airflow` for me on his Windows 
> box, and it died with yet another error.  Googling quickly found someone else 
> seeing the same issue over a week ago: 
> https://gitter.im/apache/incubator-airflow?at=5b5130bac86c4f0b47201af0
> Please let me know what further information you would like, and/or what I am 
> doing wrong.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Fokko commented on issue #3815: [AIRFLOW-2973] Add Python 3.6 to Supported Prog Langs

2018-08-29 Thread GitBox
Fokko commented on issue #3815: [AIRFLOW-2973] Add Python 3.6 to Supported Prog 
Langs
URL: 
https://github.com/apache/incubator-airflow/pull/3815#issuecomment-416843410
 
 
   @gerardo This does not make sense to me. Why are we installing both 2 and 3 
in the ci image? 
https://github.com/apache/incubator-airflow-ci/blob/master/Dockerfile
   
   Ideally we would have a Python 2 and Python 3 image, but this can become 
complicated quickly.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] gerardo commented on issue #3815: [AIRFLOW-2973] Add Python 3.6 to Supported Prog Langs

2018-08-29 Thread GitBox
gerardo commented on issue #3815: [AIRFLOW-2973] Add Python 3.6 to Supported 
Prog Langs
URL: 
https://github.com/apache/incubator-airflow/pull/3815#issuecomment-416848261
 
 
   @Fokko The only reason 2 versions are being installed is that I wanted to 
follow the CI process as closely as possible to how it worked [before merging 
the docker-ci 
changes](https://travis-ci.org/apache/incubator-airflow/builds/416219284). At 
the time, the tox target was matched to the python version used to run it (a 
python3 tox target was run with tox under python3), although I never understood 
the reason. We only use the system python to install tox.
   
   I only remember coming across a single issue: a `PythonVirtualenvOperator` 
test failing [because of a version 
mismatch](https://github.com/apache/incubator-airflow/pull/3393#discussion_r189708965) 
(tox running under python3 while `PythonVirtualenvOperator` tried to use the 
system python, which could be python2). If that happens, it should be simple to 
fix.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2813) `pip install apache-airflow` fails

2018-08-29 Thread Shubham Gupta (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595969#comment-16595969
 ] 

Shubham Gupta commented on AIRFLOW-2813:


Since it was left out of the [{{PyPi}} 
release|https://pypi.org/project/apache-airflow/#history], I'm unable to figure 
out how to install {{Airflow v1.10}}.

> `pip install apache-airflow` fails
> --
>
> Key: AIRFLOW-2813
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2813
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.9.0
> Environment: Mac OS, Linux, Windows
>Reporter: Jeff Schwab
>Priority: Major
>
> `pip install apache-airflow` fails with a SyntaxError on Mac OS, and with a 
> different (extremely verbose) error on Linux.  This happens both on my 
> MacBook and on a fresh Alpine Linux Docker image, and with both pip2 and 
> pip3; a friend just tried `pip install apache-airflow` for me on his Windows 
> box, and it died with yet another error.  Googling quickly found someone else 
> seeing the same issue over a week ago: 
> https://gitter.im/apache/incubator-airflow?at=5b5130bac86c4f0b47201af0
> Please let me know what further information you would like, and/or what I am 
> doing wrong.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Fokko commented on issue #3816: [AIRFLOW-2973] Use Python 3.6.x everywhere possible

2018-08-29 Thread GitBox
Fokko commented on issue #3816: [AIRFLOW-2973] Use Python 3.6.x everywhere 
possible
URL: 
https://github.com/apache/incubator-airflow/pull/3816#issuecomment-416848015
 
 
   This will add a lot of load to Travis, which we don't really want since 
Apache infra has to pay for this. My suggestion would be to set the tests to 
Python 3.6.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] gerardo commented on a change in pull request #3393: [AIRFLOW-2499] Dockerised CI pipeline

2018-08-29 Thread GitBox
gerardo commented on a change in pull request #3393: [AIRFLOW-2499] Dockerised 
CI pipeline
URL: https://github.com/apache/incubator-airflow/pull/3393#discussion_r189708965
 
 

 ##
 File path: airflow/operators/python_operator.py
 ##
@@ -300,9 +300,9 @@ def _write_args(self, input_filename):
         with open(input_filename, 'wb') as f:
             arg_dict = ({'args': self.op_args, 'kwargs': self.op_kwargs})
             if self.use_dill:
-                dill.dump(arg_dict, f)
+                dill.dump(arg_dict, f, protocol=2)
 
 Review comment:
   These tests were failing because the default pickling protocols for python 2 
and 3 are different, so if the python version used to run airflow is different 
from the one passed to `PythonVirtualenvOperator` to create the virtualenv, 
then `pickle.load` fails.
   
   These tests are not failing on master, but they kept failing under this 
docker setup. I'm not sure whether it's a bug in my setup or an actual issue in 
this operator; that's why I didn't create another PR, but I can submit these 
changes as a separate PR if you think it's worth it.
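
   As a rough illustration of the mismatch (a minimal sketch using stdlib 
pickle; the operator does the same through `dill`): Python 3's default pickle 
protocol is newer than anything Python 2 can read, while protocol 2 is readable 
by both.
   
   ```python
   import pickle
   
   arg_dict = {'args': (1, 2), 'kwargs': {'key': 'value'}}
   
   # Dumped under Python 3 with the default protocol, this file could not be
   # loaded by a Python 2 virtualenv; protocol 2 is the newest protocol that
   # Python 2 still understands.
   with open('script.in', 'wb') as f:
       pickle.dump(arg_dict, f, protocol=2)
   
   with open('script.in', 'rb') as f:
       print(pickle.load(f))  # loads under both Python 2 and Python 3
   ```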


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil edited a comment on issue #3815: [AIRFLOW-2973] Add Python 3.6 to Supported Prog Langs

2018-08-29 Thread GitBox
kaxil edited a comment on issue #3815: [AIRFLOW-2973] Add Python 3.6 to 
Supported Prog Langs
URL: 
https://github.com/apache/incubator-airflow/pull/3815#issuecomment-416848426
 
 
   @gerardo I already tried changing that; it says the python3.6 interpreter is 
not found.
   
   Here is the link to the Travis job:
   
   https://travis-ci.org/apache/incubator-airflow/builds/421714589


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] gerardo commented on issue #3815: [AIRFLOW-2973] Add Python 3.6 to Supported Prog Langs

2018-08-29 Thread GitBox
gerardo commented on issue #3815: [AIRFLOW-2973] Add Python 3.6 to Supported 
Prog Langs
URL: 
https://github.com/apache/incubator-airflow/pull/3815#issuecomment-416850685
 
 
   @kaxil mmm, it seems it needs to be installed on the system to be used: 
https://tox.readthedocs.io/en/latest/example/general.html#basepython-defaults-overriding
   
   And it seems the only python3 version available in Ubuntu Xenial is 3.5.1: 
https://packages.ubuntu.com/search?keywords=python3-dev=names=xenial=all
   
   It might be a matter of having a separate image (like @Fokko suggested) 
using a more recent Ubuntu (`Artful` or `Bionic`). The official [python docker 
images](https://hub.docker.com/_/python/) are also based on Debian; that's 
another option. 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-2975) Add support for Amazon cloudwatch (computing power at will)

2018-08-29 Thread jack (JIRA)
jack created AIRFLOW-2975:
-

 Summary: Add support for Amazon cloudwatch (computing power at 
will)
 Key: AIRFLOW-2975
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2975
 Project: Apache Airflow
  Issue Type: New Feature
Affects Versions: 1.10.0
Reporter: jack
 Fix For: 2.0.0, 1.10.1


Some of us have one machine that runs Airflow…

While we can scale up the executor to have many resources over different 
servers, this is considered expensive.

There is another solution: using Amazon CloudWatch:

[https://aws.amazon.com/premiumsupport/knowledge-center/start-stop-lambda-cloudwatch/]

This enables the user to create and terminate EC2 machines on specific 
intervals for specific tasks.

Basically, if I have 50 DAGs to run at 1PM-3PM and a few DAGs to run at other 
hours, there is no point in paying for a 2nd server 24/7.

This could be an enhancement to one of the executors that knows how to work 
with more than one server.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] gerardo commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + docker-compose

2018-08-29 Thread GitBox
gerardo commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + 
docker-compose
URL: 
https://github.com/apache/incubator-airflow/pull/3797#issuecomment-416836140
 
 
   For future reference. This looks good: 
https://github.com/kubernetes-sigs/kubeadm-dind-cluster
   
   > If you're an application developer, you may be better off with Minikube 
because it's more mature and less dependent on the local environment, but if 
you're feeling adventurous you may give kubeadm-dind-cluster a try, too. **In 
particular you can run kubeadm-dind-cluster in CI environment such as Travis 
without having issues with nested virtualization.**


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + docker-compose

2018-08-29 Thread GitBox
Fokko commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + 
docker-compose
URL: 
https://github.com/apache/incubator-airflow/pull/3797#issuecomment-416842865
 
 
   Nice one @gerardo 
   
   I'm also working on getting rid of tox, since we now have both 
docker-compose and tox, which both act as a virtualisation layer.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on issue #3815: [AIRFLOW-2973] Add Python 3.6 to Supported Prog Langs

2018-08-29 Thread GitBox
kaxil commented on issue #3815: [AIRFLOW-2973] Add Python 3.6 to Supported Prog 
Langs
URL: 
https://github.com/apache/incubator-airflow/pull/3815#issuecomment-416851457
 
 
   Would you be able to do that if you have got time?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2975) Add support for Amazon cloudwatch (computing power at will)

2018-08-29 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596093#comment-16596093
 ] 

Ash Berlin-Taylor commented on AIRFLOW-2975:


Airflow already has support for "queues" on the workers - 
https://airflow.readthedocs.io/en/stable/concepts.html#queues - so you can set 
a queue on your tasks and have only the "temporal" instance listen to that 
queue.

Or you could just use the default queue, have the normal workers share in the 
load, and simply provide more processing capacity in the 1-3pm period.
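
A minimal sketch of the first approach (the queue name, the {{dag}} object and 
the worker invocation are illustrative, not a prescribed setup):

{code:python}
from airflow.operators.bash_operator import BashOperator

# Picked up only by workers started with: airflow worker -q temporal
spiky_task = BashOperator(
    task_id='spiky_task',
    bash_command='echo heavy work',
    queue='temporal',
    dag=dag)
{code}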

> Add support for Amazon cloudwatch (computing power at will)
> ---
>
> Key: AIRFLOW-2975
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2975
> Project: Apache Airflow
>  Issue Type: New Feature
>Affects Versions: 1.10.0
>Reporter: jack
>Priority: Major
> Fix For: 2.0.0, 1.10.1
>
>
> Some of us have one machine that runs Airflow…
> While we can scale up the executor to have many resources over different 
> servers, this is considered expensive.
> There is another solution: using Amazon CloudWatch:
> [https://aws.amazon.com/premiumsupport/knowledge-center/start-stop-lambda-cloudwatch/]
>  
> This enables the user to create and terminate EC2 machines on specific 
> intervals for specific tasks.
> Basically, if I have 50 DAGs to run at 1PM-3PM and a few DAGs to run at other 
> hours, there is no point in paying for a 2nd server 24/7.
> This could be an enhancement to one of the executors that knows how to work 
> with more than one server.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] msumit closed pull request #3790: [AIRFLOW-2994] Fix command status check in Qubole Check operator

2018-08-29 Thread GitBox
msumit closed pull request #3790: [AIRFLOW-2994] Fix command status check in 
Qubole Check operator
URL: https://github.com/apache/incubator-airflow/pull/3790
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/operators/qubole_check_operator.py b/airflow/contrib/operators/qubole_check_operator.py
index 235af08ca7..8b6b5d351c 100644
--- a/airflow/contrib/operators/qubole_check_operator.py
+++ b/airflow/contrib/operators/qubole_check_operator.py
@@ -215,11 +215,11 @@ def get_sql_from_qbol_cmd(params):
 def handle_airflow_exception(airflow_exception, hook):
     cmd = hook.cmd
     if cmd is not None:
-        if cmd.is_success:
+        if cmd.is_success(cmd.status):
             qubole_command_results = hook.get_query_results()
             qubole_command_id = cmd.id
             exception_message = '\nQubole Command Id: {qubole_command_id}' \
                                 '\nQubole Command Results:' \
                                 '\n{qubole_command_results}'.format(**locals())
             raise AirflowException(str(airflow_exception) + exception_message)
-    raise AirflowException(airflow_exception.message)
+    raise AirflowException(str(airflow_exception))
diff --git a/tests/contrib/operators/test_qubole_check_operator.py b/tests/contrib/operators/test_qubole_check_operator.py
index 29044827ee..972038005b 100644
--- a/tests/contrib/operators/test_qubole_check_operator.py
+++ b/tests/contrib/operators/test_qubole_check_operator.py
@@ -24,6 +24,7 @@
 from airflow.contrib.operators.qubole_check_operator import QuboleValueCheckOperator
 from airflow.contrib.hooks.qubole_check_hook import QuboleCheckHook
 from airflow.contrib.hooks.qubole_hook import QuboleHook
+from qds_sdk.commands import HiveCommand
 
 try:
     from unittest import mock
@@ -80,11 +81,13 @@ def test_execute_pass(self, mock_get_hook):
         mock_hook.get_first.assert_called_with(query)
 
     @mock.patch.object(QuboleValueCheckOperator, 'get_hook')
-    def test_execute_fail(self, mock_get_hook):
+    def test_execute_assertion_fail(self, mock_get_hook):
 
         mock_cmd = mock.Mock()
         mock_cmd.status = 'done'
         mock_cmd.id = 123
+        mock_cmd.is_success = mock.Mock(
+            return_value=HiveCommand.is_success(mock_cmd.status))
 
         mock_hook = mock.Mock()
         mock_hook.get_first.return_value = [11]
@@ -97,6 +100,30 @@ def test_execute_fail(self, mock_get_hook):
                                      'Qubole Command Id: ' + str(mock_cmd.id)):
             operator.execute()
 
+        mock_cmd.is_success.assert_called_with(mock_cmd.status)
+
+    @mock.patch.object(QuboleValueCheckOperator, 'get_hook')
+    def test_execute_assert_query_fail(self, mock_get_hook):
+
+        mock_cmd = mock.Mock()
+        mock_cmd.status = 'error'
+        mock_cmd.id = 123
+        mock_cmd.is_success = mock.Mock(
+            return_value=HiveCommand.is_success(mock_cmd.status))
+
+        mock_hook = mock.Mock()
+        mock_hook.get_first.return_value = [11]
+        mock_hook.cmd = mock_cmd
+        mock_get_hook.return_value = mock_hook
+
+        operator = self.__construct_operator('select value from tab1 limit 1;', 5, 1)
+
+        with self.assertRaises(AirflowException) as cm:
+            operator.execute()
+
+        self.assertNotIn('Qubole Command Id: ', str(cm.exception))
+        mock_cmd.is_success.assert_called_with(mock_cmd.status)
+
     @mock.patch.object(QuboleCheckHook, 'get_query_results')
     @mock.patch.object(QuboleHook, 'execute')
     def test_results_parser_callable(self, mock_execute, mock_get_query_results):


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] msumit commented on issue #3790: [AIRFLOW-2994] Fix command status check in Qubole Check operator

2018-08-29 Thread GitBox
msumit commented on issue #3790: [AIRFLOW-2994] Fix command status check in 
Qubole Check operator
URL: 
https://github.com/apache/incubator-airflow/pull/3790#issuecomment-416881295
 
 
   lgtm


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] msumit commented on issue #3796: [AIRFLOW-2824] - Add config to disable default conn creation

2018-08-29 Thread GitBox
msumit commented on issue #3796: [AIRFLOW-2824] - Add config to disable default 
conn creation
URL: 
https://github.com/apache/incubator-airflow/pull/3796#issuecomment-416882661
 
 
   @andscoop can you please add a couple of specs for the change, or give a 
strong reason for not adding them. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on issue #3796: [AIRFLOW-2824] - Add config to disable default conn creation

2018-08-29 Thread GitBox
ashb commented on issue #3796: [AIRFLOW-2824] - Add config to disable default 
conn creation
URL: 
https://github.com/apache/incubator-airflow/pull/3796#issuecomment-416884461
 
 
   Isn't this what the `airflow upgradedb` command already does?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on issue #3796: [AIRFLOW-2824] - Add config to disable default conn creation

2018-08-29 Thread GitBox
ashb commented on issue #3796: [AIRFLOW-2824] - Add config to disable default 
conn creation
URL: 
https://github.com/apache/incubator-airflow/pull/3796#issuecomment-416884983
 
 
   Additionally why _just_ the connections? And do we need a separate setting 
for this? Could we not just reuse the existing `load_examples` config option?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] msumit commented on issue #3796: [AIRFLOW-2824] - Add config to disable default conn creation

2018-08-29 Thread GitBox
msumit commented on issue #3796: [AIRFLOW-2824] - Add config to disable default 
conn creation
URL: 
https://github.com/apache/incubator-airflow/pull/3796#issuecomment-416891061
 
 
   @ashb IMO examples and connections are 2 different entities altogether, and 
we should keep them different. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko closed pull request #3570: [AIRFLOW-2709] Improve error handling in Databricks hook

2018-08-29 Thread GitBox
Fokko closed pull request #3570: [AIRFLOW-2709] Improve error handling in 
Databricks hook
URL: https://github.com/apache/incubator-airflow/pull/3570
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/hooks/databricks_hook.py b/airflow/contrib/hooks/databricks_hook.py
index 54f00e0090..5b97a0eba0 100644
--- a/airflow/contrib/hooks/databricks_hook.py
+++ b/airflow/contrib/hooks/databricks_hook.py
@@ -24,6 +24,7 @@
 from airflow.hooks.base_hook import BaseHook
 from requests import exceptions as requests_exceptions
 from requests.auth import AuthBase
+from time import sleep
 
 from airflow.utils.log.logging_mixin import LoggingMixin
 
@@ -47,7 +48,8 @@ def __init__(
             self,
             databricks_conn_id='databricks_default',
             timeout_seconds=180,
-            retry_limit=3):
+            retry_limit=3,
+            retry_delay=1.0):
         """
         :param databricks_conn_id: The name of the databricks connection to use.
         :type databricks_conn_id: string
@@ -57,6 +59,9 @@ def __init__(
         :param retry_limit: The number of times to retry the connection in case of
             service outages.
         :type retry_limit: int
+        :param retry_delay: The number of seconds to wait between retries (it
+            might be a floating point number).
+        :type retry_delay: float
         """
         self.databricks_conn_id = databricks_conn_id
         self.databricks_conn = self.get_connection(databricks_conn_id)
@@ -64,6 +69,7 @@ def __init__(
         if retry_limit < 1:
             raise ValueError('Retry limit must be greater than equal to 1')
         self.retry_limit = retry_limit
+        self.retry_delay = retry_delay
 
     @staticmethod
     def _parse_host(host):
@@ -119,7 +125,8 @@ def _do_api_call(self, endpoint_info, json):
         else:
             raise AirflowException('Unexpected HTTP Method: ' + method)
 
-        for attempt_num in range(1, self.retry_limit + 1):
+        attempt_num = 1
+        while True:
             try:
                 response = request_func(
                     url,
@@ -127,21 +134,29 @@ def _do_api_call(self, endpoint_info, json):
                     auth=auth,
                     headers=USER_AGENT_HEADER,
                     timeout=self.timeout_seconds)
-                if response.status_code == requests.codes.ok:
-                    return response.json()
-                else:
+                response.raise_for_status()
+                return response.json()
+            except requests_exceptions.RequestException as e:
+                if not _retryable_error(e):
                     # In this case, the user probably made a mistake.
                     # Don't retry.
                     raise AirflowException('Response: {0}, Status Code: {1}'.format(
-                        response.content, response.status_code))
-            except (requests_exceptions.ConnectionError,
-                    requests_exceptions.Timeout) as e:
-                self.log.error(
-                    'Attempt %s API Request to Databricks failed with reason: %s',
-                    attempt_num, e
-                )
-        raise AirflowException(('API requests to Databricks failed {} times. ' +
-                                'Giving up.').format(self.retry_limit))
+                        e.response.content, e.response.status_code))
+
+                self._log_request_error(attempt_num, e)
+
+            if attempt_num == self.retry_limit:
+                raise AirflowException(('API requests to Databricks failed {} times. ' +
+                                        'Giving up.').format(self.retry_limit))
+
+            attempt_num += 1
+            sleep(self.retry_delay)
+
+    def _log_request_error(self, attempt_num, error):
+        self.log.error(
+            'Attempt %s API Request to Databricks failed with reason: %s',
+            attempt_num, error
+        )
 
     def submit_run(self, json):
         """
@@ -175,6 +190,12 @@ def cancel_run(self, run_id):
         self._do_api_call(CANCEL_RUN_ENDPOINT, json)
 
 
+def _retryable_error(exception):
+    return isinstance(exception, requests_exceptions.ConnectionError) \
+        or isinstance(exception, requests_exceptions.Timeout) \
+        or exception.response is not None and exception.response.status_code >= 500
+
+
 RUN_LIFE_CYCLE_STATES = [
     'PENDING',
     'RUNNING',
diff --git a/airflow/contrib/operators/databricks_operator.py b/airflow/contrib/operators/databricks_operator.py
index 7b8d522dba..3245a99256 100644
--- a/airflow/contrib/operators/databricks_operator.py
+++ b/airflow/contrib/operators/databricks_operator.py
@@ -146,6 +146,9 @@ class 

[jira] [Commented] (AIRFLOW-2709) Improve error handling in Databricks hook

2018-08-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596029#comment-16596029
 ] 

ASF GitHub Bot commented on AIRFLOW-2709:
-

Fokko closed pull request #3570: [AIRFLOW-2709] Improve error handling in 
Databricks hook
URL: https://github.com/apache/incubator-airflow/pull/3570
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/hooks/databricks_hook.py b/airflow/contrib/hooks/databricks_hook.py
index 54f00e0090..5b97a0eba0 100644
--- a/airflow/contrib/hooks/databricks_hook.py
+++ b/airflow/contrib/hooks/databricks_hook.py
@@ -24,6 +24,7 @@
 from airflow.hooks.base_hook import BaseHook
 from requests import exceptions as requests_exceptions
 from requests.auth import AuthBase
+from time import sleep
 
 from airflow.utils.log.logging_mixin import LoggingMixin
 
@@ -47,7 +48,8 @@ def __init__(
             self,
             databricks_conn_id='databricks_default',
             timeout_seconds=180,
-            retry_limit=3):
+            retry_limit=3,
+            retry_delay=1.0):
         """
         :param databricks_conn_id: The name of the databricks connection to use.
         :type databricks_conn_id: string
@@ -57,6 +59,9 @@ def __init__(
         :param retry_limit: The number of times to retry the connection in case of
             service outages.
         :type retry_limit: int
+        :param retry_delay: The number of seconds to wait between retries (it
+            might be a floating point number).
+        :type retry_delay: float
         """
         self.databricks_conn_id = databricks_conn_id
         self.databricks_conn = self.get_connection(databricks_conn_id)
@@ -64,6 +69,7 @@ def __init__(
         if retry_limit < 1:
             raise ValueError('Retry limit must be greater than equal to 1')
         self.retry_limit = retry_limit
+        self.retry_delay = retry_delay
 
     @staticmethod
     def _parse_host(host):
@@ -119,7 +125,8 @@ def _do_api_call(self, endpoint_info, json):
         else:
             raise AirflowException('Unexpected HTTP Method: ' + method)
 
-        for attempt_num in range(1, self.retry_limit + 1):
+        attempt_num = 1
+        while True:
             try:
                 response = request_func(
                     url,
@@ -127,21 +134,29 @@ def _do_api_call(self, endpoint_info, json):
                     auth=auth,
                     headers=USER_AGENT_HEADER,
                     timeout=self.timeout_seconds)
-                if response.status_code == requests.codes.ok:
-                    return response.json()
-                else:
+                response.raise_for_status()
+                return response.json()
+            except requests_exceptions.RequestException as e:
+                if not _retryable_error(e):
                     # In this case, the user probably made a mistake.
                     # Don't retry.
                     raise AirflowException('Response: {0}, Status Code: {1}'.format(
-                        response.content, response.status_code))
-            except (requests_exceptions.ConnectionError,
-                    requests_exceptions.Timeout) as e:
-                self.log.error(
-                    'Attempt %s API Request to Databricks failed with reason: %s',
-                    attempt_num, e
-                )
-        raise AirflowException(('API requests to Databricks failed {} times. ' +
-                                'Giving up.').format(self.retry_limit))
+                        e.response.content, e.response.status_code))
+
+                self._log_request_error(attempt_num, e)
+
+            if attempt_num == self.retry_limit:
+                raise AirflowException(('API requests to Databricks failed {} times. ' +
+                                        'Giving up.').format(self.retry_limit))
+
+            attempt_num += 1
+            sleep(self.retry_delay)
+
+    def _log_request_error(self, attempt_num, error):
+        self.log.error(
+            'Attempt %s API Request to Databricks failed with reason: %s',
+            attempt_num, error
+        )
 
     def submit_run(self, json):
         """
@@ -175,6 +190,12 @@ def cancel_run(self, run_id):
         self._do_api_call(CANCEL_RUN_ENDPOINT, json)
 
 
+def _retryable_error(exception):
+    return isinstance(exception, requests_exceptions.ConnectionError) \
+        or isinstance(exception, requests_exceptions.Timeout) \
+        or exception.response is not None and exception.response.status_code >= 500
+
+
 RUN_LIFE_CYCLE_STATES = [
     'PENDING',
     'RUNNING',
diff --git a/airflow/contrib/operators/databricks_operator.py 

[jira] [Resolved] (AIRFLOW-2709) Improve error handling in Databricks hook

2018-08-29 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong resolved AIRFLOW-2709.
---
Resolution: Fixed

> Improve error handling in Databricks hook
> -
>
> Key: AIRFLOW-2709
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2709
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, hooks
>Reporter: Victor Jimenez
>Priority: Major
>
> The Databricks hook handles both connection and timeout errors. However, 
> Databricks sometimes returns a temporarily unavailable error. That error is 
> neither a connection nor timeout one. It is just an HTTPError containing the 
> following text in the response: TEMPORARILY_UNAVAILABLE. The current error 
> handling in the hook should be enhanced to support this error.
> Also, the Databricks hook contains retry logic. Yet, it does not support 
> sleeping for some time between requests. This creates a problem in handling 
> errors such as the TEMPORARILY_UNAVAILABLE one. This error typically resolves 
> after a few seconds. Adding support for sleeping between retry attempts would 
> really help to enhance the reliability of Databricks operations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Fokko commented on issue #3747: [AIRFLOW-2895] Prevent scheduler from spamming heartbeats/logs

2018-08-29 Thread GitBox
Fokko commented on issue #3747: [AIRFLOW-2895] Prevent scheduler from spamming 
heartbeats/logs
URL: 
https://github.com/apache/incubator-airflow/pull/3747#issuecomment-416858185
 
 
   Thanks @ashb 
   
   My suggestion would be to branch off the 1.10 branch and cherry-pick some 
commits into this branch. It would also be awesome to make it compatible with 
Python 3.7. Cheers!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Comment Edited] (AIRFLOW-2813) `pip install apache-airflow` fails

2018-08-29 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596084#comment-16596084
 ] 

Ash Berlin-Taylor edited comment on AIRFLOW-2813 at 8/29/18 8:44 AM:
-

The page you linked to [~y2k-shubham] lists 1.10.0, so you should be able to 
{{pip install 'apache-airflow==1.10.0'}}.

Oh, unless you mean "how can one install Airflow 1.10 on Python 3.7?" The 
answer is that you can't right now, sorry. 1.10.1, which should fix this, will 
come out in a few weeks.


was (Author: ashb):
The page you linked to [~y2k-shubham] lists 1.10.0, so you should be able to  
{{pip install 'apache-airflow==1.10.0'}}

> `pip install apache-airflow` fails
> --
>
> Key: AIRFLOW-2813
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2813
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.9.0
> Environment: Mac OS, Linux, Windows
>Reporter: Jeff Schwab
>Priority: Major
>
> `pip install apache-airflow` fails with a SyntaxError on Mac OS, and with a 
> different (extremely verbose) error on Linux.  This happens both on my 
> MacBook and on a fresh Alpine Linux Docker image, and with both pip2 and 
> pip3; a friend just tried `pip install apache-airflow` for me on his Windows 
> box, and it died with yet another error.  Googling quickly found someone else 
> seeing the same issue over a week ago: 
> https://gitter.im/apache/incubator-airflow?at=5b5130bac86c4f0b47201af0
> Please let me know what further information you would like, and/or what I am 
> doing wrong.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] gerardo commented on issue #3815: [AIRFLOW-2973] Add Python 3.6 to Supported Prog Langs

2018-08-29 Thread GitBox
gerardo commented on issue #3815: [AIRFLOW-2973] Add Python 3.6 to Supported 
Prog Langs
URL: 
https://github.com/apache/incubator-airflow/pull/3815#issuecomment-416851866
 
 
   @kaxil sure! I have time this week to work on this.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko closed pull request #3697: [AIRFLOW-2854] kubernetes_pod_operator add more configuration items

2018-08-29 Thread GitBox
Fokko closed pull request #3697: [AIRFLOW-2854] kubernetes_pod_operator add 
more configuration items
URL: https://github.com/apache/incubator-airflow/pull/3697
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/kubernetes/kubernetes_request_factory/kubernetes_request_factory.py b/airflow/contrib/kubernetes/kubernetes_request_factory/kubernetes_request_factory.py
index 27e0ebd29c..97bcdf2abc 100644
--- a/airflow/contrib/kubernetes/kubernetes_request_factory/kubernetes_request_factory.py
+++ b/airflow/contrib/kubernetes/kubernetes_request_factory/kubernetes_request_factory.py
@@ -175,6 +175,11 @@ def extract_service_account_name(pod, req):
         if pod.service_account_name:
             req['spec']['serviceAccountName'] = pod.service_account_name
 
+    @staticmethod
+    def extract_hostnetwork(pod, req):
+        if pod.hostnetwork:
+            req['spec']['hostNetwork'] = pod.hostnetwork
+
     @staticmethod
     def extract_image_pull_secrets(pod, req):
         if pod.image_pull_secrets:
diff --git a/airflow/contrib/kubernetes/kubernetes_request_factory/pod_request_factory.py b/airflow/contrib/kubernetes/kubernetes_request_factory/pod_request_factory.py
index 95d6c829de..877d7aafe2 100644
--- a/airflow/contrib/kubernetes/kubernetes_request_factory/pod_request_factory.py
+++ b/airflow/contrib/kubernetes/kubernetes_request_factory/pod_request_factory.py
@@ -59,6 +59,7 @@ def create(self, pod):
         self.extract_image_pull_secrets(pod, req)
         self.extract_annotations(pod, req)
         self.extract_affinity(pod, req)
+        self.extract_hostnetwork(pod, req)
         return req
 
 
@@ -116,4 +117,5 @@ def create(self, pod):
         self.extract_image_pull_secrets(pod, req)
         self.extract_annotations(pod, req)
         self.extract_affinity(pod, req)
+        self.extract_hostnetwork(pod, req)
         return req
diff --git a/airflow/contrib/kubernetes/pod.py b/airflow/contrib/kubernetes/pod.py
index 6fcf354459..221c8f4180 100644
--- a/airflow/contrib/kubernetes/pod.py
+++ b/airflow/contrib/kubernetes/pod.py
@@ -77,7 +77,8 @@ def __init__(
             service_account_name=None,
             resources=None,
             annotations=None,
-            affinity=None
+            affinity=None,
+            hostnetwork=False
     ):
         self.image = image
         self.envs = envs or {}
@@ -98,3 +99,4 @@ def __init__(
         self.resources = resources or Resources()
         self.annotations = annotations or {}
         self.affinity = affinity or {}
+        self.hostnetwork = hostnetwork or False
diff --git a/airflow/contrib/kubernetes/pod_launcher.py b/airflow/contrib/kubernetes/pod_launcher.py
index 42f2bfea8a..8c8d949107 100644
--- a/airflow/contrib/kubernetes/pod_launcher.py
+++ b/airflow/contrib/kubernetes/pod_launcher.py
@@ -22,7 +22,7 @@
 from datetime import datetime as dt
 from airflow.contrib.kubernetes.kubernetes_request_factory import \
     pod_request_factory as pod_factory
-from kubernetes import watch
+from kubernetes import watch, client
 from kubernetes.client.rest import ApiException
 from kubernetes.stream import stream as kubernetes_stream
 from airflow import AirflowException
@@ -59,6 +59,15 @@ def run_pod_async(self, pod):
             raise
         return resp
 
+    def delete_pod(self, pod):
+        try:
+            self._client.delete_namespaced_pod(
+                pod.name, pod.namespace, body=client.V1DeleteOptions())
+        except ApiException as e:
+            # If the pod is already deleted
+            if e.status != 404:
+                raise
+
     def run_pod(self, pod, startup_timeout=120, get_logs=True):
         # type: (Pod) -> (State, result)
         """
diff --git a/airflow/contrib/operators/kubernetes_pod_operator.py b/airflow/contrib/operators/kubernetes_pod_operator.py
index fb905622d8..bb4bf7fca1 100644
--- a/airflow/contrib/operators/kubernetes_pod_operator.py
+++ b/airflow/contrib/operators/kubernetes_pod_operator.py
@@ -102,6 +102,7 @@ def execute(self, context):
             labels=self.labels,
         )
 
+        pod.service_account_name = self.service_account_name
         pod.secrets = self.secrets
         pod.envs = self.env_vars
         pod.image_pull_policy = self.image_pull_policy
@@ -109,6 +110,7 @@ def execute(self, context):
         pod.resources = self.resources
         pod.affinity = self.affinity
         pod.node_selectors = self.node_selectors
+        pod.hostnetwork = self.hostnetwork
 
         launcher = pod_launcher.PodLauncher(kube_client=client,
                                             extract_xcom=self.xcom_push)
@@ -116,6 +118,10 @@ def execute(self, context):
  

[jira] [Commented] (AIRFLOW-2854) kubernetes_pod_operator add more configuration items

2018-08-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596042#comment-16596042
 ] 

ASF GitHub Bot commented on AIRFLOW-2854:
-

Fokko closed pull request #3697: [AIRFLOW-2854] kubernetes_pod_operator add 
more configuration items
URL: https://github.com/apache/incubator-airflow/pull/3697
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/kubernetes/kubernetes_request_factory/kubernetes_request_factory.py b/airflow/contrib/kubernetes/kubernetes_request_factory/kubernetes_request_factory.py
index 27e0ebd29c..97bcdf2abc 100644
--- a/airflow/contrib/kubernetes/kubernetes_request_factory/kubernetes_request_factory.py
+++ b/airflow/contrib/kubernetes/kubernetes_request_factory/kubernetes_request_factory.py
@@ -175,6 +175,11 @@ def extract_service_account_name(pod, req):
         if pod.service_account_name:
             req['spec']['serviceAccountName'] = pod.service_account_name
 
+    @staticmethod
+    def extract_hostnetwork(pod, req):
+        if pod.hostnetwork:
+            req['spec']['hostNetwork'] = pod.hostnetwork
+
     @staticmethod
     def extract_image_pull_secrets(pod, req):
         if pod.image_pull_secrets:
diff --git a/airflow/contrib/kubernetes/kubernetes_request_factory/pod_request_factory.py b/airflow/contrib/kubernetes/kubernetes_request_factory/pod_request_factory.py
index 95d6c829de..877d7aafe2 100644
--- a/airflow/contrib/kubernetes/kubernetes_request_factory/pod_request_factory.py
+++ b/airflow/contrib/kubernetes/kubernetes_request_factory/pod_request_factory.py
@@ -59,6 +59,7 @@ def create(self, pod):
         self.extract_image_pull_secrets(pod, req)
         self.extract_annotations(pod, req)
         self.extract_affinity(pod, req)
+        self.extract_hostnetwork(pod, req)
         return req
 
 
@@ -116,4 +117,5 @@ def create(self, pod):
         self.extract_image_pull_secrets(pod, req)
         self.extract_annotations(pod, req)
         self.extract_affinity(pod, req)
+        self.extract_hostnetwork(pod, req)
         return req
diff --git a/airflow/contrib/kubernetes/pod.py b/airflow/contrib/kubernetes/pod.py
index 6fcf354459..221c8f4180 100644
--- a/airflow/contrib/kubernetes/pod.py
+++ b/airflow/contrib/kubernetes/pod.py
@@ -77,7 +77,8 @@ def __init__(
             service_account_name=None,
             resources=None,
             annotations=None,
-            affinity=None
+            affinity=None,
+            hostnetwork=False
     ):
         self.image = image
         self.envs = envs or {}
@@ -98,3 +99,4 @@ def __init__(
         self.resources = resources or Resources()
         self.annotations = annotations or {}
         self.affinity = affinity or {}
+        self.hostnetwork = hostnetwork or False
diff --git a/airflow/contrib/kubernetes/pod_launcher.py b/airflow/contrib/kubernetes/pod_launcher.py
index 42f2bfea8a..8c8d949107 100644
--- a/airflow/contrib/kubernetes/pod_launcher.py
+++ b/airflow/contrib/kubernetes/pod_launcher.py
@@ -22,7 +22,7 @@
 from datetime import datetime as dt
 from airflow.contrib.kubernetes.kubernetes_request_factory import \
     pod_request_factory as pod_factory
-from kubernetes import watch
+from kubernetes import watch, client
 from kubernetes.client.rest import ApiException
 from kubernetes.stream import stream as kubernetes_stream
 from airflow import AirflowException
@@ -59,6 +59,15 @@ def run_pod_async(self, pod):
             raise
         return resp
 
+    def delete_pod(self, pod):
+        try:
+            self._client.delete_namespaced_pod(
+                pod.name, pod.namespace, body=client.V1DeleteOptions())
+        except ApiException as e:
+            # If the pod is already deleted
+            if e.status != 404:
+                raise
+
     def run_pod(self, pod, startup_timeout=120, get_logs=True):
         # type: (Pod) -> (State, result)
         """
diff --git a/airflow/contrib/operators/kubernetes_pod_operator.py b/airflow/contrib/operators/kubernetes_pod_operator.py
index fb905622d8..bb4bf7fca1 100644
--- a/airflow/contrib/operators/kubernetes_pod_operator.py
+++ b/airflow/contrib/operators/kubernetes_pod_operator.py
@@ -102,6 +102,7 @@ def execute(self, context):
             labels=self.labels,
         )
 
+        pod.service_account_name = self.service_account_name
         pod.secrets = self.secrets
         pod.envs = self.env_vars
         pod.image_pull_policy = self.image_pull_policy
@@ -109,6 +110,7 @@ def execute(self, context):
         pod.resources = self.resources
         pod.affinity = self.affinity
         pod.node_selectors = self.node_selectors
+

[jira] [Resolved] (AIRFLOW-2854) kubernetes_pod_operator add more configuration items

2018-08-29 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong resolved AIRFLOW-2854.
---
Resolution: Fixed

> kubernetes_pod_operator add more configuration items
> 
>
> Key: AIRFLOW-2854
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2854
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Affects Versions: 2.0.0
>Reporter: pengchen
>Assignee: pengchen
>Priority: Minor
> Fix For: 2.0.0
>
>
> kubernetes_pod_operator is missing several kubernetes pod-related 
> configuration items, as follows:
> 1. image_pull_secrets
> _Pull secrets_ are used to _pull_ private container _images_ from registries. 
> In this case, we need to configure image_pull_secrets in the pod spec file.
> 2. service_account_name
> When kubernetes is running with rbac authorization and a job needs to operate 
> on kubernetes resources, we need to configure a service account.
> 3. is_delete_operator_pod
> This option lets the user decide whether to delete the job pod created by 
> pod_operator, which is currently not handled.
> 4. hostnetwork



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Fokko commented on issue #3806: [AIRFLOW-2956] added kubernetes tolerations to kubernetes pod operator

2018-08-29 Thread GitBox
Fokko commented on issue #3806: [AIRFLOW-2956] added kubernetes tolerations to 
kubernetes pod operator
URL: 
https://github.com/apache/incubator-airflow/pull/3806#issuecomment-416857089
 
 
   @justinholmes can you resolve the merge conflicts?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on issue #3767: [AIRFLOW-2524]Add SageMaker Batch Inference

2018-08-29 Thread GitBox
Fokko commented on issue #3767: [AIRFLOW-2524]Add SageMaker Batch Inference
URL: 
https://github.com/apache/incubator-airflow/pull/3767#issuecomment-416857563
 
 
   It looks like the tests are failing:
   ```
   ======================================================================
   24) ERROR: test_create_transform_job (tests.contrib.hooks.test_sagemaker_hook.TestSageMakerHook)
   ----------------------------------------------------------------------
   Traceback (most recent call last):
    .tox/py27-backend_mysql/lib/python2.7/site-packages/mock/mock.py line 1305 in patched
      return func(*args, **keywargs)
    tests/contrib/hooks/test_sagemaker_hook.py line 439 in test_create_transform_job
      wait_for_completion=False)
    airflow/contrib/hooks/sagemaker_hook.py line 258 in create_transform_job
      ['TransformInput']['DataSource']['S3Uri'])
   KeyError: 'S3Uri'
   ======================================================================
   25) ERROR: test_create_transform_job_db_config (tests.contrib.hooks.test_sagemaker_hook.TestSageMakerHook)
   ----------------------------------------------------------------------
   Traceback (most recent call last):
    .tox/py27-backend_mysql/lib/python2.7/site-packages/mock/mock.py line 1305 in patched
      return func(*args, **keywargs)
    tests/contrib/hooks/test_sagemaker_hook.py line 454 in test_create_transform_job_db_config
      create_transform_params, wait_for_completion=False)
    airflow/contrib/hooks/sagemaker_hook.py line 258 in create_transform_job
      ['TransformInput']['DataSource']['S3Uri'])
   KeyError: 'S3Uri'
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on issue #3815: [AIRFLOW-2973] Add Python 3.6 to Supported Prog Langs

2018-08-29 Thread GitBox
Fokko commented on issue #3815: [AIRFLOW-2973] Add Python 3.6 to Supported Prog 
Langs
URL: 
https://github.com/apache/incubator-airflow/pull/3815#issuecomment-416859152
 
 
   @gerardo On the `incubator-airflow-ci` repo we can create two branches, one 
for Python 2 and one for Python 3. I can configure Docker Hub to publish two 
images, `incubator-airflow-ci:2` and `incubator-airflow-ci:3`, for example.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2813) `pip install apache-airflow` fails

2018-08-29 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596084#comment-16596084
 ] 

Ash Berlin-Taylor commented on AIRFLOW-2813:


The page you linked to [~y2k-shubham] lists 1.10.0, so you should be able to  
{{pip install 'apache-airflow==1.10.0'}}

> `pip install apache-airflow` fails
> --
>
> Key: AIRFLOW-2813
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2813
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.9.0
> Environment: Mac OS, Linux, Windows
>Reporter: Jeff Schwab
>Priority: Major
>
> `pip install apache-airflow` fails with a SyntaxError on Mac OS, and with a 
> different (extremely verbose) error on Linux.  This happens both on my 
> MacBook and on a fresh Alpine Linux Docker image, and with both pip2 and 
> pip3; a friend just tried `pip install apache-airflow` for me on his Windows 
> box, and it died with yet another error.  Googling quickly found someone else 
> seeing the same issue over a week ago: 
> https://gitter.im/apache/incubator-airflow?at=5b5130bac86c4f0b47201af0
> Please let me know what further information you would like, and/or what I am 
> doing wrong.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] ashb commented on issue #3796: [AIRFLOW-2824] - Add config to disable default conn creation

2018-08-29 Thread GitBox
ashb commented on issue #3796: [AIRFLOW-2824] - Add config to disable default 
conn creation
URL: 
https://github.com/apache/incubator-airflow/pull/3796#issuecomment-416898406
 
 
   Can you think of a case where you'd want to load the Example DAGs but not 
the example connections? Or the inverse, when you want the example connections 
but not the example DAGs? (I never want either outside of a demo env)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-2976) pypi docker dependency broken

2018-08-29 Thread Lin, Yi-Li (JIRA)
Lin, Yi-Li created AIRFLOW-2976:
---

 Summary: pypi docker dependency broken
 Key: AIRFLOW-2976
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2976
 Project: Apache Airflow
  Issue Type: Bug
  Components: dependencies, docker
Affects Versions: 1.10
Reporter: Lin, Yi-Li
 Attachments: docker_dag.log, docker_dag.py

I'm trying to install airflow with the docker extras, but airflow's 
dependencies will install a recent docker-py (3.5.0) from PyPI, which is 
incompatible with the current DockerOperator.

DockerOperator will complain that "create_container() got an unexpected keyword 
argument 'cpu_shares'".

It looks like that interface changed in docker-py 3.0.0; it works with 
docker-py 2.7.0.

The log and DAG file are in the attachments.
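
A likely interim workaround (assuming the {{docker}} package from PyPI is the 
one being pulled in, and given the report that the operator works with 
docker-py 2.7.0) is to pin it back after installing airflow, e.g. 
{{pip install 'docker==2.7.0'}}.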



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2976) pypi docker dependency broken

2018-08-29 Thread Lin, Yi-Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin, Yi-Li updated AIRFLOW-2976:

Description: 
I'm trying to install airflow with the docker extras, but airflow's 
dependencies will install a recent docker-py (3.5.0) from PyPI, which is 
incompatible with the current DockerOperator.

DockerOperator will complain that "create_container() got an unexpected keyword 
argument 'cpu_shares'".

It looks like that interface changed in docker-py 3.0.0; it works with 
docker-py 2.7.0.

The log and DAG file are in the attachments.

Note, installation command: "AIRFLOW_GPL_UNIDECODE=yes pip install 
'apache-airflow[docker,mysql]==1.10.0'"

  was:
I'm trying to install airflow with the docker extras, but airflow's 
dependencies will install a recent docker-py (3.5.0) from PyPI, which is 
incompatible with the current DockerOperator.

DockerOperator will complain that "create_container() got an unexpected keyword 
argument 'cpu_shares'".

It looks like that interface changed in docker-py 3.0.0; it works with 
docker-py 2.7.0.

The log and DAG file are in the attachments.


> pypi docker dependency broken
> -
>
> Key: AIRFLOW-2976
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2976
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: dependencies, docker
>Affects Versions: 1.10
>Reporter: Lin, Yi-Li
>Priority: Major
> Attachments: docker_dag.log, docker_dag.py
>
>
> I'm trying to install airflow with the docker extras, but airflow's 
> dependencies will install a recent docker-py (3.5.0) from PyPI, which is 
> incompatible with the current DockerOperator.
> DockerOperator will complain that "create_container() got an unexpected 
> keyword argument 'cpu_shares'".
> It looks like that interface changed in docker-py 3.0.0; it works with 
> docker-py 2.7.0.
> The log and DAG file are in the attachments.
> Note, installation command: "AIRFLOW_GPL_UNIDECODE=yes pip install 
> 'apache-airflow[docker,mysql]==1.10.0'"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] XD-DENG commented on issue #3793: [AIRFLOW-2948] Arg check & better doc - SSHOperator & SFTPOperator

2018-08-29 Thread GitBox
XD-DENG commented on issue #3793: [AIRFLOW-2948] Arg check & better doc - 
SSHOperator & SFTPOperator
URL: 
https://github.com/apache/incubator-airflow/pull/3793#issuecomment-416921926
 
 
   Hi @feng-tao, could you take another look? Thanks!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-372) DAGs can run before start_date time

2018-08-29 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596261#comment-16596261
 ] 

jack commented on AIRFLOW-372:
--

The recommendation I got when I started using Airflow was never to mess with 
the start date: it's preferred to create a new DAG with a new start date rather 
than to change the old one. Maybe this issue is why I was given that 
recommendation.
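
For what it's worth, the last request in the issue (a parameter to not run any 
"missed" dag_runs) maps to the DAG-level {{catchup}} argument available in 
later Airflow versions; a minimal sketch (ids and dates are illustrative):

{code:python}
from datetime import datetime
from airflow import DAG

dag = DAG(
    dag_id='no_backfill_example',
    start_date=datetime(2018, 8, 1),
    schedule_interval='@daily',
    catchup=False)  # schedule only the latest interval; skip missed runs
{code}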

> DAGs can run before start_date time
> ---
>
> Key: AIRFLOW-372
> URL: https://issues.apache.org/jira/browse/AIRFLOW-372
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: Airflow 1.7.1.2
>Reporter: Isaac Steele
>Priority: Major
>
> If you turn off a DAG in the UI, there seemingly is no way to prevent 
> "missed" runs to schedule after the DAG is turned back on. I thought the 
> workaround for this, since it is not a parameterized option to prevent, would 
> be to update the start_date in the DAG code before turning the DAG back on. 
> This does not work, and therefore the scheduler is running dag_runs *before* 
> the listed start_date.
> To reproduce:
> # Create a DAG with a schedule_interval
> # Let the DAG run at least once
> # Turn off the DAG in the UI
> # Allow the schedule_interval to pass at least twice
> # Update the start_date in the DAG to be after the two intervals have passed
> # (I then removed the compiled python file and restarted airflow/scheduler 
> just to make sure)
> # Turn DAG back on in UI
> Result: All dag_runs that were "missed" while the DAG was turned off run, 
> despite the start_date being later.
> Ideally the start_date would always be honored. And also there would be a 
> parameter to just not run any "missed" dag_runs.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2977) Airflow Webserver Behind Reverse Proxy with SSL Termination

2018-08-29 Thread Micheal Ascah (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Micheal Ascah updated AIRFLOW-2977:
---
Description: 
Currently, there is no way in Airflow to configure gunicorn to allow it to 
trust X-Forwarded-* headers from a reverse proxy.

In the scenario where the webserver is being run behind an Application Load 
Balancer in AWS that is also performing SSL termination, gunicorn will ignore 
the X-Forwarded-Proto header and issue redirects using HTTP instead of HTTPS. 
If the load balancer is only accepting traffic over 443, then these redirects 
obviously fail.

 

To resolve this, gunicorn needs to be configured to trust the X-Forwarded 
headers. Rather than manually modifying the gunicorn_config.py under www, 
(which is still also being used by the new RBAC webserver), a value should be 
able to be provided through the airflow.cfg (or also through an env var).

This configuration is documented by gunicorn under the section regarding 
deployment behind a proxy.

 

[http://docs.gunicorn.org/en/stable/deploy.html]

 

Propose to allow a forwarded_allow_ips variable under the `webserver` section 
of the airflow.cfg and set it in the gunicorn_config.py.

  was:
Currently, there is no way in Airflow to configure gunicorn to allow it to 
trust X-Forwarded-* headers from a reverse proxy.

In the scenario where the webserver is being run behind an Application Load 
Balancer in AWS that is also performing SSL termination, gunicorn will ignore 
the X-Forwarded-Proto header and issue redirects using HTTP instead of HTTPS. 
If the load balancer is only accepting traffic over 443, then these redirects 
obviously fail.

 

To resolve this, gunicorn needs to be configured to trust the X-Forwarded 
headers. Rather than manually modifying the gunicorn_config.py under www, 
(which is still also being used by the new RBAC webserver), a value should be 
able to be provided through the airflow.cfg (or also through an env var).

This configuration is documented by gunicorn under the section regarding 
deployment behind a proxy.

 

[http://docs.gunicorn.org/en/stable/deploy.html]

 

Proposed to allow a forwarded_allow_ips variable under the `webserver` section 
of the airflow.cfg and set it in the gunicorn_config.py.


> Airflow Webserver Behind Reverse Proxy with SSL Termination
> ---
>
> Key: AIRFLOW-2977
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2977
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: webserver
>Affects Versions: 1.10.0
>Reporter: Micheal Ascah
>Assignee: Micheal Ascah
>Priority: Minor
>
> Currently, there is no way in Airflow to configure gunicorn to allow it to 
> trust X-Forwarded-* headers from a reverse proxy.
> In the scenario where the webserver is being run behind an Application Load 
> Balancer in AWS that is also performing SSL termination, gunicorn will ignore 
> the X-Forwarded-Proto header and issue redirects using HTTP instead of HTTPS. 
> If the load balancer is only accepting traffic over 443, then these redirects 
> obviously fail.
>  
> To resolve this, gunicorn needs to be configured to trust the X-Forwarded 
> headers. Rather than manually modifying the gunicorn_config.py under www, 
> (which is still also being used by the new RBAC webserver), a value should be 
> able to be provided through the airflow.cfg (or also through an env var).
> This configuration is documented by gunicorn under the section regarding 
> deployment behind a proxy.
>  
> [http://docs.gunicorn.org/en/stable/deploy.html]
>  
> Propose to allow a forwarded_allow_ips variable under the `webserver` section 
> of the airflow.cfg and set it in the gunicorn_config.py.
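
A hedged sketch of the proposed wiring (the option name comes from the 
proposal above; this is not the merged implementation):

{code:python}
# gunicorn_config.py sketch: read the trusted proxy list from Airflow's
# config so gunicorn honors X-Forwarded-Proto from the load balancer.
from airflow import configuration as conf

# gunicorn's default is '127.0.0.1'; '*' trusts every upstream (use with care)
forwarded_allow_ips = conf.get('webserver', 'forwarded_allow_ips')
{code}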



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2771) S3Hook Broad Exception Silent Failure

2018-08-29 Thread Micheal Ascah (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Micheal Ascah updated AIRFLOW-2771:
---
Fix Version/s: 1.10.0

> S3Hook Broad Exception Silent Failure
> -
>
> Key: AIRFLOW-2771
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2771
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks
>Affects Versions: 1.9.0
>Reporter: Micheal Ascah
>Assignee: Micheal Ascah
>Priority: Minor
>  Labels: S3Hook, S3Sensor
> Fix For: 1.10.0, 2.0.0
>
>
> h2. Scenario
> S3KeySensor is passed an invalid S3/AWS connection id name (doesn't exist or 
> bad permissions). There are also no credentials found under 
> ~/.aws/credentials for boto to fallback on.
>  
> When poking for the key, it creates an S3Hook and calls `check_for_key` on 
> the hook. If the call to HeadObject fails, the call is caught by a generic 
> except clause that catches all exceptions, rather than the expected 
> botocore.exceptions.ClientError when an object is not found.
> h2. Problem
> This causes the sensor to return False and report no issue with the task 
> instance until it times out, rather than intuitively failing immediately if 
> the connection is incorrectly configured. The current logging output gives no 
> insight as to why the key is not being found.
> h4. Current code
> {code:python}
> try:
> self.get_conn().head_object(Bucket=bucket_name, Key=key)
> return True
> except:  # <- This catches credential and connection exceptions that should 
> be raised
> return False
> {code}
> {code:python}
> from airflow.hooks.S3_hook import S3Hook
> hook = S3Hook(aws_conn_id="conn_that_doesnt_exist")
> hook.check_for_key(key="test", bucket="test")
> False
> {code}
> {code:python}
> [2018-07-18 18:57:26,652] {base_task_runner.py:98} INFO - Subtask: 
> [2018-07-18 18:57:26,651] {sensors.py:537} INFO - Poking for key : 
> s3://bucket/key.txt
> [2018-07-18 18:57:26,681] {base_task_runner.py:98} INFO - Subtask: 
> [2018-07-18 18:57:26,680] {connectionpool.py:735} INFO - Starting new HTTPS 
> connection (1): bucket.s3.amazonaws.com
> [2018-07-18 18:58:26,767] {base_task_runner.py:98} INFO - Subtask: 
> [2018-07-18 18:58:26,767] {sensors.py:537} INFO - Poking for key : 
> s3://bucket/key.txt
> [2018-07-18 18:58:26,809] {base_task_runner.py:98} INFO - Subtask: 
> [2018-07-18 18:58:26,808] {connectionpool.py:735} INFO - Starting new HTTPS 
> connection (1): bucket.s3.amazonaws.com
> {code}
> h4. Expected
> h5. No credentials
> {code:python}
> from airflow.hooks.S3_hook import S3Hook
> hook = S3Hook(aws_conn_id="conn_that_doesnt_exist")
> hook.check_for_key(key="test", bucket="test")
> Traceback (most recent call last):
> ...
> botocore.exceptions.NoCredentialsError: Unable to locate credentials
> {code}
> h5. Good credentials
> {code:python}
> from airflow.hooks.S3_hook import S3Hook
> hook = S3Hook(aws_conn_id="conn_that_does_exist")
> hook.check_for_key(key="test", bucket="test")
> False
> {code}
> h4. Proposed Change
> Add a type to the except clause for botocore.exceptions.ClientError and log 
> the message for both check_for_key and check_for_bucket on S3Hook.
> {code:python}
> try:
> self.get_conn().head_object(Bucket=bucket_name, Key=key)
> return True
> except ClientError as e:
> self.log.info(e.response["Error"]["Message"]) 
> return False
> {code}
>   
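
For completeness, a self-contained sketch of the proposed handler with the 
import it relies on (method signature abbreviated; not the verbatim hook code):

{code:python}
from botocore.exceptions import ClientError

def check_for_key(self, key, bucket_name):
    try:
        self.get_conn().head_object(Bucket=bucket_name, Key=key)
        return True
    except ClientError as e:
        # Covers service responses such as 404/403 and logs the reason;
        # NoCredentialsError is a BotoCoreError, so it now propagates
        # instead of being silently swallowed.
        self.log.info(e.response["Error"]["Message"])
        return False
{code}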



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] ashb commented on issue #3796: [AIRFLOW-2824] - Add config to disable default conn creation

2018-08-29 Thread GitBox
ashb commented on issue #3796: [AIRFLOW-2824] - Add config to disable default 
conn creation
URL: 
https://github.com/apache/incubator-airflow/pull/3796#issuecomment-416933769
 
 
   > create an Airflow instance with dockerized solutions, init'ing without the 
example connections is preferable
   
   `airflow upgradedb` already does that. It works from scratch too.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] msumit commented on issue #3779: [AIRFLOW-2928] replace uuid1 with uuid4 for better randomness

2018-08-29 Thread GitBox
msumit commented on issue #3779: [AIRFLOW-2928] replace uuid1 with uuid4 for 
better randomness
URL: 
https://github.com/apache/incubator-airflow/pull/3779#issuecomment-416957588
 
 
   lgtm
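   
   For context on the "better randomness" in the title: the first eight hex 
characters of uuid1() are just the low 32 bits of a timestamp (the MAC address 
sits at the end), so jobs created in the same tick can truncate to the same 
suffix, while uuid4() is uniformly random. A quick illustration:
   
   import uuid
   
   # uuid1: leading chars are timestamp-derived, hence predictable and
   # collision-prone when truncated; uuid4: fully random.
   print(str(uuid.uuid1())[:8])
   print(str(uuid.uuid4())[:8])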


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2978) Airflow deps breaking on ancient Python packages

2018-08-29 Thread Jon Davies (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596401#comment-16596401
 ] 

Jon Davies commented on AIRFLOW-2978:
-

That made it work, thanks.

> Airflow deps breaking on ancient Python packages
> 
>
> Key: AIRFLOW-2978
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2978
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Jon Davies
>Priority: Major
>
> I have a Python package that depends on Airflow that forms part of our DAG 
> build out process.
> This pipeline installs Airflow 1.10.0 and then the build fails with:
> {code:java}
> Installed /python-packages/pkg/pkg/.eggs/pytzdata-2018.5-py2.7.egg
> Traceback (most recent call last):
>   File "setup.py", line 39, in 
> test_suite="tests"
>   File "/usr/local/lib/python2.7/site-packages/setuptools/__init__.py", line 
> 131, in setup
> return distutils.core.setup(**attrs)
>   File "/usr/local/lib/python2.7/distutils/core.py", line 151, in setup
> dist.run_commands()
>   File "/usr/local/lib/python2.7/distutils/dist.py", line 953, in run_commands
> self.run_command(cmd)
>   File "/usr/local/lib/python2.7/distutils/dist.py", line 972, in run_command
> cmd_obj.run()
>   File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", 
> line 216, in run
> installed_dists = self.install_dists(self.distribution)
>   File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", 
> line 207, in install_dists
> ir_d = dist.fetch_build_eggs(dist.install_requires)
>   File "/usr/local/lib/python2.7/site-packages/setuptools/dist.py", line 514, 
> in fetch_build_eggs
> replace_conflicting=True,
>   File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", 
> line 782, in resolve
> raise VersionConflict(dist, req).with_context(dependent_req)
> pkg_resources.ContextualVersionConflict: (tzlocal 2.0.0b1 
> (/python-packages/pkg/pkg/.eggs/tzlocal-2.0.0b1-py2.7.egg), 
> Requirement.parse('tzlocal<2.0.0.0,>=1.5.0.0'), set(['pendulum']))
> {code}
> It would seem that Airflow pins ancient versions here:
> - 
> https://github.com/apache/incubator-airflow/blob/314232cc49ebe75cf12e58af9588123441976647/setup.py#L316
> For example:
> psutil is at 5.4.7
> pygments is at 2.2.0
> pendulum is at 2.0.3
> tzlocal is at 2.0.0b1
> All of these install fine with their latest versions together in a fresh 
> 2.7.15 Docker container.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2978) Airflow deps breaking on ancient Python packages

2018-08-29 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596347#comment-16596347
 ] 

Ash Berlin-Taylor commented on AIRFLOW-2978:


Check for one of these options 
https://pip.pypa.io/en/stable/user_guide/#configuration (if you are using pip?)

> Airflow deps breaking on ancient Python packages
> 
>
> Key: AIRFLOW-2978
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2978
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Jon Davies
>Priority: Major
>
> I have a Python package that depends on Airflow that forms part of our DAG 
> build out process.
> This pipeline installs Airflow 1.10.0 and then the build fails with:
> {code:java}
> Installed /python-packages/pkg/pkg/.eggs/pytzdata-2018.5-py2.7.egg
> Traceback (most recent call last):
>   File "setup.py", line 39, in 
> test_suite="tests"
>   File "/usr/local/lib/python2.7/site-packages/setuptools/__init__.py", line 
> 131, in setup
> return distutils.core.setup(**attrs)
>   File "/usr/local/lib/python2.7/distutils/core.py", line 151, in setup
> dist.run_commands()
>   File "/usr/local/lib/python2.7/distutils/dist.py", line 953, in run_commands
> self.run_command(cmd)
>   File "/usr/local/lib/python2.7/distutils/dist.py", line 972, in run_command
> cmd_obj.run()
>   File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", 
> line 216, in run
> installed_dists = self.install_dists(self.distribution)
>   File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", 
> line 207, in install_dists
> ir_d = dist.fetch_build_eggs(dist.install_requires)
>   File "/usr/local/lib/python2.7/site-packages/setuptools/dist.py", line 514, 
> in fetch_build_eggs
> replace_conflicting=True,
>   File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", 
> line 782, in resolve
> raise VersionConflict(dist, req).with_context(dependent_req)
> pkg_resources.ContextualVersionConflict: (tzlocal 2.0.0b1 
> (/python-packages/pkg/pkg/.eggs/tzlocal-2.0.0b1-py2.7.egg), 
> Requirement.parse('tzlocal<2.0.0.0,>=1.5.0.0'), set(['pendulum']))
> {code}
> It would seem that Airflow pins ancient versions here:
> - 
> https://github.com/apache/incubator-airflow/blob/314232cc49ebe75cf12e58af9588123441976647/setup.py#L316
> For example:
> psutil is at 5.4.7
> pygments is at 2.2.0
> pendulum is at 2.0.3
> tzlocal is at 2.0.0b1
> All of these install fine with their latest versions together in a fresh 
> 2.7.15 Docker container.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] andscoop commented on issue #3796: [AIRFLOW-2824] - Add config to disable default conn creation

2018-08-29 Thread GitBox
andscoop commented on issue #3796: [AIRFLOW-2824] - Add config to disable 
default conn creation
URL: 
https://github.com/apache/incubator-airflow/pull/3796#issuecomment-416929571
 
 
   @ashb As it becomes easier to tear down and create an Airflow instance with 
dockerized solutions, init'ing without the example connections is preferable. 
Beyond a user's first time using Airflow, they will likely want to init without 
the example connections. `upgradedb` is a nice alternative if you have a 
long-lived Airflow instance from which you have already manually removed the 
example conns. This change avoids having to remove the example conns by hand.
   
   I don't have a strong opinion on whether or not this should be combined with 
`load_examples`. The simplicity of combining them is always a plus, but I would 
need to see how users leverage the configs before speaking further.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on issue #3796: [AIRFLOW-2824] - Add config to disable default conn creation

2018-08-29 Thread GitBox
ashb commented on issue #3796: [AIRFLOW-2824] - Add config to disable default 
conn creation
URL: 
https://github.com/apache/incubator-airflow/pull/3796#issuecomment-416951818
 
 
   Since I didn't say so: I think the goal of this ticket is a good one.
   
   Ah, the RBAC code there is an interesting one. Apart from that, everything in 
`initdb` is optional - airflow will operate fine with only ever running 
`upgradedb`. We have been running in prod since 1.8.2 having only done 
`upgradedb`, and nothing has broken or gone missing (other than no sample 
charts or connections).
   
   I have no idea where KnownEvent(Type) is actually used - there are very few 
references to it in the rest of the code base that I could see, other than the 
creation there and a view. Nothing ever seems to use it for anything else. 
Maybe a candidate for dropping, though worth checking with @mistercrunch or 
someone else (still?) at AirBnB.
   
   Pre-loading of the DAG table didn't use to be required, as starting the 
scheduler would do this job too, but on quick testing I'm not sure that's the 
case anymore.
   
   Hmm, the RBAC change should possibly be converted into an actual Alembic 
migration, or at the very least done via upgradedb, not just initdb!
   
   I have somewhat hijacked this issue though. Oops.
   
   Having written all that, I'm not sure of the way forward either.
   
   RBAC and KET should probably be in upgradedb. It might be worth merging them 
into a single command behind a flag, `airflow initdb --with-examples`, which is 
more obvious/deliberate than having to know that `upgradedb` exists when all 
the tutorials just talk about initdb. Thoughts?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] msumit closed pull request #3779: [AIRFLOW-2928] replace uuid1 with uuid4 for better randomness

2018-08-29 Thread GitBox
msumit closed pull request #3779: [AIRFLOW-2928] replace uuid1 with uuid4 for 
better randomness
URL: https://github.com/apache/incubator-airflow/pull/3779
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/hooks/gcp_dataflow_hook.py 
b/airflow/contrib/hooks/gcp_dataflow_hook.py
index 8c9b7423e0..a9b7e71a5e 100644
--- a/airflow/contrib/hooks/gcp_dataflow_hook.py
+++ b/airflow/contrib/hooks/gcp_dataflow_hook.py
@@ -250,7 +250,7 @@ def _build_dataflow_job_name(task_id, append_job_name=True):
 'letter and ending with a letter or number '.format(task_id))
 
 if append_job_name:
-job_name = task_id + "-" + str(uuid.uuid1())[:8]
+job_name = task_id + "-" + str(uuid.uuid4())[:8]
 else:
 job_name = task_id
 
diff --git a/airflow/contrib/hooks/gcp_dataproc_hook.py 
b/airflow/contrib/hooks/gcp_dataproc_hook.py
index 57c48bde59..f9e7a90509 100644
--- a/airflow/contrib/hooks/gcp_dataproc_hook.py
+++ b/airflow/contrib/hooks/gcp_dataproc_hook.py
@@ -81,7 +81,7 @@ def get(self):
 
 class _DataProcJobBuilder:
 def __init__(self, project_id, task_id, cluster_name, job_type, 
properties):
-name = task_id + "_" + str(uuid.uuid1())[:8]
+name = task_id + "_" + str(uuid.uuid4())[:8]
 self.job_type = job_type
 self.job = {
 "job": {
@@ -141,7 +141,7 @@ def set_python_main(self, main):
 self.job["job"][self.job_type]["mainPythonFileUri"] = main
 
 def set_job_name(self, name):
-self.job["job"]["reference"]["jobId"] = name + "_" + 
str(uuid.uuid1())[:8]
+self.job["job"]["reference"]["jobId"] = name + "_" + 
str(uuid.uuid4())[:8]
 
 def build(self):
 return self.job
diff --git a/airflow/contrib/kubernetes/pod_generator.py 
b/airflow/contrib/kubernetes/pod_generator.py
index 6d8d83ef05..bee7f5b957 100644
--- a/airflow/contrib/kubernetes/pod_generator.py
+++ b/airflow/contrib/kubernetes/pod_generator.py
@@ -149,7 +149,7 @@ def make_pod(self, namespace, image, pod_id, cmds, 
arguments, labels):
 
 return Pod(
 namespace=namespace,
-name=pod_id + "-" + str(uuid.uuid1())[:8],
+name=pod_id + "-" + str(uuid.uuid4())[:8],
 image=image,
 cmds=cmds,
 args=arguments,
diff --git a/airflow/contrib/operators/dataflow_operator.py 
b/airflow/contrib/operators/dataflow_operator.py
index 3a6980cefa..3f6093b3ba 100644
--- a/airflow/contrib/operators/dataflow_operator.py
+++ b/airflow/contrib/operators/dataflow_operator.py
@@ -365,7 +365,7 @@ def google_cloud_to_local(self, file_name):
 
 bucket_id = path_components[0]
 object_id = '/'.join(path_components[1:])
-local_file = '/tmp/dataflow{}-{}'.format(str(uuid.uuid1())[:8],
+local_file = '/tmp/dataflow{}-{}'.format(str(uuid.uuid4())[:8],
  path_components[-1])
 file_size = self._gcs_hook.download(bucket_id, object_id, local_file)
 
diff --git a/airflow/contrib/operators/dataproc_operator.py 
b/airflow/contrib/operators/dataproc_operator.py
index 6dfa2da095..69073f67de 100644
--- a/airflow/contrib/operators/dataproc_operator.py
+++ b/airflow/contrib/operators/dataproc_operator.py
@@ -1158,7 +1158,7 @@ class DataProcPySparkOperator(BaseOperator):
 @staticmethod
 def _generate_temp_filename(filename):
 dt = time.strftime('%Y%m%d%H%M%S')
-return "{}_{}_{}".format(dt, str(uuid.uuid1())[:8], 
ntpath.basename(filename))
+return "{}_{}_{}".format(dt, str(uuid.uuid4())[:8], 
ntpath.basename(filename))
 
 """
 Upload a local file to a Google Cloud Storage bucket
@@ -1312,7 +1312,7 @@ def start(self):
 .instantiate(
 name=('projects/%s/regions/%s/workflowTemplates/%s' %
   (self.project_id, self.region, self.template_id)),
-body={'instanceId': str(uuid.uuid1())})
+body={'instanceId': str(uuid.uuid4())})
 .execute())
 
 
@@ -1355,6 +1355,6 @@ def start(self):
 self.hook.get_conn().projects().regions().workflowTemplates()
 .instantiateInline(
 parent='projects/%s/regions/%s' % (self.project_id, 
self.region),
-instanceId=str(uuid.uuid1()),
+instanceId=str(uuid.uuid4()),
 body=self.template)
 .execute())
diff --git a/airflow/contrib/task_runner/cgroup_task_runner.py 
b/airflow/contrib/task_runner/cgroup_task_runner.py
index 78a240f2db..4662b0fe82 100644
--- a/airflow/contrib/task_runner/cgroup_task_runner.py
+++ b/airflow/contrib/task_runner/cgroup_task_runner.py
@@ -123,7 +123,7 @@ def start(self):
 # Create 

[jira] [Commented] (AIRFLOW-2928) Use uuid.uuid4 to create unique job name

2018-08-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596342#comment-16596342
 ] 

ASF GitHub Bot commented on AIRFLOW-2928:
-

msumit closed pull request #3779: [AIRFLOW-2928] replace uuid1 with uuid4 for 
better randomness
URL: https://github.com/apache/incubator-airflow/pull/3779
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/hooks/gcp_dataflow_hook.py 
b/airflow/contrib/hooks/gcp_dataflow_hook.py
index 8c9b7423e0..a9b7e71a5e 100644
--- a/airflow/contrib/hooks/gcp_dataflow_hook.py
+++ b/airflow/contrib/hooks/gcp_dataflow_hook.py
@@ -250,7 +250,7 @@ def _build_dataflow_job_name(task_id, append_job_name=True):
 'letter and ending with a letter or number '.format(task_id))
 
 if append_job_name:
-job_name = task_id + "-" + str(uuid.uuid1())[:8]
+job_name = task_id + "-" + str(uuid.uuid4())[:8]
 else:
 job_name = task_id
 
diff --git a/airflow/contrib/hooks/gcp_dataproc_hook.py 
b/airflow/contrib/hooks/gcp_dataproc_hook.py
index 57c48bde59..f9e7a90509 100644
--- a/airflow/contrib/hooks/gcp_dataproc_hook.py
+++ b/airflow/contrib/hooks/gcp_dataproc_hook.py
@@ -81,7 +81,7 @@ def get(self):
 
 class _DataProcJobBuilder:
 def __init__(self, project_id, task_id, cluster_name, job_type, 
properties):
-name = task_id + "_" + str(uuid.uuid1())[:8]
+name = task_id + "_" + str(uuid.uuid4())[:8]
 self.job_type = job_type
 self.job = {
 "job": {
@@ -141,7 +141,7 @@ def set_python_main(self, main):
 self.job["job"][self.job_type]["mainPythonFileUri"] = main
 
 def set_job_name(self, name):
-self.job["job"]["reference"]["jobId"] = name + "_" + 
str(uuid.uuid1())[:8]
+self.job["job"]["reference"]["jobId"] = name + "_" + 
str(uuid.uuid4())[:8]
 
 def build(self):
 return self.job
diff --git a/airflow/contrib/kubernetes/pod_generator.py 
b/airflow/contrib/kubernetes/pod_generator.py
index 6d8d83ef05..bee7f5b957 100644
--- a/airflow/contrib/kubernetes/pod_generator.py
+++ b/airflow/contrib/kubernetes/pod_generator.py
@@ -149,7 +149,7 @@ def make_pod(self, namespace, image, pod_id, cmds, 
arguments, labels):
 
 return Pod(
 namespace=namespace,
-name=pod_id + "-" + str(uuid.uuid1())[:8],
+name=pod_id + "-" + str(uuid.uuid4())[:8],
 image=image,
 cmds=cmds,
 args=arguments,
diff --git a/airflow/contrib/operators/dataflow_operator.py 
b/airflow/contrib/operators/dataflow_operator.py
index 3a6980cefa..3f6093b3ba 100644
--- a/airflow/contrib/operators/dataflow_operator.py
+++ b/airflow/contrib/operators/dataflow_operator.py
@@ -365,7 +365,7 @@ def google_cloud_to_local(self, file_name):
 
 bucket_id = path_components[0]
 object_id = '/'.join(path_components[1:])
-local_file = '/tmp/dataflow{}-{}'.format(str(uuid.uuid1())[:8],
+local_file = '/tmp/dataflow{}-{}'.format(str(uuid.uuid4())[:8],
  path_components[-1])
 file_size = self._gcs_hook.download(bucket_id, object_id, local_file)
 
diff --git a/airflow/contrib/operators/dataproc_operator.py 
b/airflow/contrib/operators/dataproc_operator.py
index 6dfa2da095..69073f67de 100644
--- a/airflow/contrib/operators/dataproc_operator.py
+++ b/airflow/contrib/operators/dataproc_operator.py
@@ -1158,7 +1158,7 @@ class DataProcPySparkOperator(BaseOperator):
 @staticmethod
 def _generate_temp_filename(filename):
 dt = time.strftime('%Y%m%d%H%M%S')
-return "{}_{}_{}".format(dt, str(uuid.uuid1())[:8], 
ntpath.basename(filename))
+return "{}_{}_{}".format(dt, str(uuid.uuid4())[:8], 
ntpath.basename(filename))
 
 """
 Upload a local file to a Google Cloud Storage bucket
@@ -1312,7 +1312,7 @@ def start(self):
 .instantiate(
 name=('projects/%s/regions/%s/workflowTemplates/%s' %
   (self.project_id, self.region, self.template_id)),
-body={'instanceId': str(uuid.uuid1())})
+body={'instanceId': str(uuid.uuid4())})
 .execute())
 
 
@@ -1355,6 +1355,6 @@ def start(self):
 self.hook.get_conn().projects().regions().workflowTemplates()
 .instantiateInline(
 parent='projects/%s/regions/%s' % (self.project_id, 
self.region),
-instanceId=str(uuid.uuid1()),
+instanceId=str(uuid.uuid4()),
 body=self.template)
 .execute())
diff --git a/airflow/contrib/task_runner/cgroup_task_runner.py 

[GitHub] andscoop commented on issue #3796: [AIRFLOW-2824] - Add config to disable default conn creation

2018-08-29 Thread GitBox
andscoop commented on issue #3796: [AIRFLOW-2824] - Add config to disable 
default conn creation
URL: 
https://github.com/apache/incubator-airflow/pull/3796#issuecomment-416944894
 
 
   @ashb I made an assumption that the [logic happening after the example 
connection 
creation](https://github.com/apache/incubator-airflow/blob/master/airflow/utils/db.py#L289)
 was necessary for database initialization. For the most part, this does not 
appear to be the case.
   
   I think this brings up the larger question, should the project continue to 
support both `airflow initdb` and `airflow upgradedb` in their current forms? 
It appears that `upgradedb` does the work a user expects out of `initdb` and 
that the extra logic in `initdb` is not necessary to initialize the 
database. 
   
   ie.
   
   [Creation of example 
chart](https://github.com/apache/incubator-airflow/blob/master/airflow/utils/db.py#L310)
   
   [Pre-loading of DAG 
table](https://github.com/apache/incubator-airflow/blob/master/airflow/utils/db.py#L303)
   
   
   There are two other core pieces of logic inside of `initdb` that I am unsure 
of. Is the [loading of the known event 
types](https://github.com/apache/incubator-airflow/blob/master/airflow/utils/db.py#L289)
 on database initialization necessary? It does not appear to be used beyond a 
single call at app creation time. If it is necessary, the logic should be 
combined with `upgradedb`.
   
   I am also unsure of the [RBAC 
piece](https://github.com/apache/incubator-airflow/blob/master/airflow/utils/db.py#L328)
 inside of `initdb`. If this is necessary, we should combine it with `upgradedb`.
   
   
   I am happy to pivot this ticket into consolidation of these two methods, but 
would like to hear thoughts from maintainers first.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] tzulberti-jampp commented on a change in pull request #3795: [AIRFLOW-2949] Add syntax highlight for single quote strings

2018-08-29 Thread GitBox
tzulberti-jampp commented on a change in pull request #3795: [AIRFLOW-2949] Add 
syntax highlight for single quote strings
URL: https://github.com/apache/incubator-airflow/pull/3795#discussion_r213700702
 
 

 ##
 File path: airflow/www/static/main.css
 ##
 @@ -262,3 +262,4 @@ div.square {
 .sc { color: #BA2121 } /* Literal.String.Char */
 .sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
 .s2 { color: #BA2121 } /* Literal.String.Double */
+.s1 { color: #BA2121 } /* Literal.String.Single */
 
 Review comment:
   On the last commit, I added it to the main.css inside www_rbac. Am I still 
missing something?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2975) Add support for Amazon cloudwatch (computing power at will)

2018-08-29 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596216#comment-16596216
 ] 

jack commented on AIRFLOW-2975:
---

[~ashb] How can I optimize the number of workers? 1 per core?

> Add support for Amazon cloudwatch (computing power at will)
> ---
>
> Key: AIRFLOW-2975
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2975
> Project: Apache Airflow
>  Issue Type: New Feature
>Affects Versions: 1.10.0
>Reporter: jack
>Priority: Major
> Fix For: 2.0.0, 1.10.1
>
>
> Some of us have one machine that runs Airflow…
> While we can scale up the executor to have many resources over different 
> servers, this is considered to be expensive.
> There is another solution: using Amazon CloudWatch:
> [https://aws.amazon.com/premiumsupport/knowledge-center/start-stop-lambda-cloudwatch/]
>  
> This enables the user to create and terminate EC2 machines on specific 
> intervals for specific tasks.
> Basically, if I have 50 DAGs to run at 1PM-3PM and few DAGs to run at other 
> hours, there is no point in paying for a 2nd server 24/7. 
> This could be an enhancement to one of the executors that knows how to work 
> with more than one server.
>  
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2978) Airflow deps breaking on ancient Python packages

2018-08-29 Thread Jon Davies (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596344#comment-16596344
 ] 

Jon Davies commented on AIRFLOW-2978:
-

We have this in setup.py:

{code:java}
install_requires = [
'apache-airflow>=1.9.0',
'slackclient>=1.0.9',
'requests>=2.18.4',
]
tests_require = [
"mock>=2.0.0",
]
{code}

Not sure where the pre-release tzlocal is coming from.
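
One hedged workaround, assuming the resolution happens via setuptools' 
fetch_build_eggs (i.e. `python setup.py test`), which unlike pip considers 
pre-releases by default: add an explicit upper bound so the 2.0.0b1 egg can 
never satisfy pendulum's tzlocal pin.

{code:python}
# setup.py sketch: bound tzlocal so easy_install-style resolution cannot
# select the 2.0.0b1 pre-release that conflicts with pendulum's requirement.
install_requires = [
    'apache-airflow>=1.9.0',
    'slackclient>=1.0.9',
    'requests>=2.18.4',
    'tzlocal>=1.5.0,<2.0.0',  # hypothetical guard, not an Airflow pin
]
{code}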

> Airflow deps breaking on ancient Python packages
> 
>
> Key: AIRFLOW-2978
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2978
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Jon Davies
>Priority: Major
>
> I have a Python package that depends on Airflow that forms part of our DAG 
> build out process.
> This pipeline installs Airflow 1.10.0 and then the build fails with:
> {code:java}
> Installed /python-packages/pkg/pkg/.eggs/pytzdata-2018.5-py2.7.egg
> Traceback (most recent call last):
>   File "setup.py", line 39, in 
> test_suite="tests"
>   File "/usr/local/lib/python2.7/site-packages/setuptools/__init__.py", line 
> 131, in setup
> return distutils.core.setup(**attrs)
>   File "/usr/local/lib/python2.7/distutils/core.py", line 151, in setup
> dist.run_commands()
>   File "/usr/local/lib/python2.7/distutils/dist.py", line 953, in run_commands
> self.run_command(cmd)
>   File "/usr/local/lib/python2.7/distutils/dist.py", line 972, in run_command
> cmd_obj.run()
>   File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", 
> line 216, in run
> installed_dists = self.install_dists(self.distribution)
>   File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", 
> line 207, in install_dists
> ir_d = dist.fetch_build_eggs(dist.install_requires)
>   File "/usr/local/lib/python2.7/site-packages/setuptools/dist.py", line 514, 
> in fetch_build_eggs
> replace_conflicting=True,
>   File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", 
> line 782, in resolve
> raise VersionConflict(dist, req).with_context(dependent_req)
> pkg_resources.ContextualVersionConflict: (tzlocal 2.0.0b1 
> (/python-packages/pkg/pkg/.eggs/tzlocal-2.0.0b1-py2.7.egg), 
> Requirement.parse('tzlocal<2.0.0.0,>=1.5.0.0'), set(['pendulum']))
> {code}
> It would seem that Airflow pins ancient versions here:
> - 
> https://github.com/apache/incubator-airflow/blob/314232cc49ebe75cf12e58af9588123441976647/setup.py#L316
> For example:
> psutil is at 5.4.7
> pygments is at 2.2.0
> pendulum is at 2.0.3
> tzlocal is at 2.0.0b1
> All of these install fine with their latest versions together in a fresh 
> 2.7.15 Docker container.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2975) Add support for Amazon cloudwatch (computing power at will)

2018-08-29 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596223#comment-16596223
 ] 

Ash Berlin-Taylor commented on AIRFLOW-2975:


It depends on your workload - if you are mostly network bound (i.e. lots of 
HTTP requests, or EMR calls etc) you could go much higher, but 1 or 2 slots per 
core is a good starting point, yes.
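
A back-of-the-envelope sketch of that guidance (numbers purely illustrative):

{code:python}
# Rough sizing: 1-2 worker slots per core as a baseline; network-bound
# workloads (HTTP requests, EMR polling) can go considerably higher.
import multiprocessing

cores = multiprocessing.cpu_count()
baseline_slots = cores * 2       # CPU-bound starting point
io_heavy_slots = cores * 4       # illustrative only; tune against real load
print(cores, baseline_slots, io_heavy_slots)
{code}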

> Add support for Amazon cloudwatch (computing power at will)
> ---
>
> Key: AIRFLOW-2975
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2975
> Project: Apache Airflow
>  Issue Type: New Feature
>Affects Versions: 1.10.0
>Reporter: jack
>Priority: Major
> Fix For: 2.0.0, 1.10.1
>
>
> Some of us have one machine that runs Airflow…
> While we can scale up the executor to have many resources over different 
> servers, this is considered to be expensive.
> There is another solution: using Amazon CloudWatch:
> [https://aws.amazon.com/premiumsupport/knowledge-center/start-stop-lambda-cloudwatch/]
>  
> This enables the user to create and terminate EC2 machines on specific 
> intervals for specific tasks.
> Basically, if I have 50 DAGs to run at 1PM-3PM and few DAGs to run at other 
> hours, there is no point in paying for a 2nd server 24/7. 
> This could be an enhancement to one of the executors that knows how to work 
> with more than one server.
>  
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2030) dbapi_hook KeyError: 'i' at line 225

2018-08-29 Thread Micheal Ascah (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596296#comment-16596296
 ] 

Micheal Ascah commented on AIRFLOW-2030:


For anyone checking on this, this was actually included in the 1.10.0 release 
tag, but it is not included in the change log as far as I can see.

> dbapi_hook KeyError: 'i' at line 225
> 
>
> Key: AIRFLOW-2030
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2030
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: db
>Affects Versions: 1.9.0
>Reporter: Manish Kumar Untwal
>Assignee: Manish Kumar Untwal
>Priority: Major
> Fix For: 2.0.0
>
>
> There is no local variable defined for zero rows, so the logger throws a 
> KeyError for local variable 'i' at line 225 in dbapi_hook.py.
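
A minimal sketch of the bug pattern (not the verbatim hook code): `i` is only 
bound by the loop, so formatting the final log line after zero rows raises:

{code:python}
# Repro of the pattern: 'i' is never assigned when rows is empty, so the
# log format lookup raises KeyError: 'i'.
rows = []
for i, row in enumerate(rows, 1):
    pass  # insert each row...

try:
    print("Done loading. Loaded a total of {i} rows".format(**locals()))
except KeyError as exc:
    print("raises KeyError:", exc)  # 'i' was never bound for zero rows
{code}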



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2978) Airflow deps breaking on ancient Python packages

2018-08-29 Thread Jon Davies (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Davies updated AIRFLOW-2978:

Summary: Airflow deps breaking on ancient Python packages  (was: Airflow 
deps breaking on ancient Package packages)

> Airflow deps breaking on ancient Python packages
> 
>
> Key: AIRFLOW-2978
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2978
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Jon Davies
>Priority: Major
>
> I have a Python package that depends on Airflow that forms part of our DAG 
> build out process.
> This pipeline installs Airflow 1.10.0 and then the build fails with:
> {code:java}
> Installed /python-packages/pkg/pkg/.eggs/pytzdata-2018.5-py2.7.egg
> Traceback (most recent call last):
>   File "setup.py", line 39, in 
> test_suite="tests"
>   File "/usr/local/lib/python2.7/site-packages/setuptools/__init__.py", line 
> 131, in setup
> return distutils.core.setup(**attrs)
>   File "/usr/local/lib/python2.7/distutils/core.py", line 151, in setup
> dist.run_commands()
>   File "/usr/local/lib/python2.7/distutils/dist.py", line 953, in run_commands
> self.run_command(cmd)
>   File "/usr/local/lib/python2.7/distutils/dist.py", line 972, in run_command
> cmd_obj.run()
>   File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", 
> line 216, in run
> installed_dists = self.install_dists(self.distribution)
>   File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", 
> line 207, in install_dists
> ir_d = dist.fetch_build_eggs(dist.install_requires)
>   File "/usr/local/lib/python2.7/site-packages/setuptools/dist.py", line 514, 
> in fetch_build_eggs
> replace_conflicting=True,
>   File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", 
> line 782, in resolve
> raise VersionConflict(dist, req).with_context(dependent_req)
> pkg_resources.ContextualVersionConflict: (tzlocal 2.0.0b1 
> (/python-packages/pkg/pkg/.eggs/tzlocal-2.0.0b1-py2.7.egg), 
> Requirement.parse('tzlocal<2.0.0.0,>=1.5.0.0'), set(['pendulum']))
> {code}
> It would seem that Airflow pins ancient versions here:
> - 
> https://github.com/apache/incubator-airflow/blob/314232cc49ebe75cf12e58af9588123441976647/setup.py#L316
> psutil is at 5.4.7
> pygments is at 2.2.0
> pendulum is at 2.0.3
> All of these install fine with their latest versions together in a fresh 
> 2.7.15 Docker container.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2978) Airflow deps breaking on ancient Python packages

2018-08-29 Thread Jon Davies (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Davies updated AIRFLOW-2978:

Description: 
I have a Python package that depends on Airflow that forms part of our DAG 
build out process.

This pipeline installs Airflow 1.10.0 and then the build fails with:

{code:java}
Installed /python-packages/pkg/pkg/.eggs/pytzdata-2018.5-py2.7.egg
Traceback (most recent call last):
  File "setup.py", line 39, in 
test_suite="tests"
  File "/usr/local/lib/python2.7/site-packages/setuptools/__init__.py", line 
131, in setup
return distutils.core.setup(**attrs)
  File "/usr/local/lib/python2.7/distutils/core.py", line 151, in setup
dist.run_commands()
  File "/usr/local/lib/python2.7/distutils/dist.py", line 953, in run_commands
self.run_command(cmd)
  File "/usr/local/lib/python2.7/distutils/dist.py", line 972, in run_command
cmd_obj.run()
  File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", 
line 216, in run
installed_dists = self.install_dists(self.distribution)
  File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", 
line 207, in install_dists
ir_d = dist.fetch_build_eggs(dist.install_requires)
  File "/usr/local/lib/python2.7/site-packages/setuptools/dist.py", line 514, 
in fetch_build_eggs
replace_conflicting=True,
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 
782, in resolve
raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (tzlocal 2.0.0b1 
(/python-packages/pkg/pkg/.eggs/tzlocal-2.0.0b1-py2.7.egg), 
Requirement.parse('tzlocal<2.0.0.0,>=1.5.0.0'), set(['pendulum']))
{code}

It would seem that Airflow pins ancient versions here:

- 
https://github.com/apache/incubator-airflow/blob/314232cc49ebe75cf12e58af9588123441976647/setup.py#L316

For example:

psutil is at 5.4.7
pygments is at 2.2.0
pendulum is at 2.0.3

All of these install fine with their latest versions together in a fresh 2.7.15 
Docker container.

  was:
I have a Python package that depends on Airflow that forms part of our DAG 
build out process.

This pipeline installs Airflow 1.10.0 and then the build fails with:

{code:java}
Installed /python-packages/pkg/pkg/.eggs/pytzdata-2018.5-py2.7.egg
Traceback (most recent call last):
  File "setup.py", line 39, in 
test_suite="tests"
  File "/usr/local/lib/python2.7/site-packages/setuptools/__init__.py", line 
131, in setup
return distutils.core.setup(**attrs)
  File "/usr/local/lib/python2.7/distutils/core.py", line 151, in setup
dist.run_commands()
  File "/usr/local/lib/python2.7/distutils/dist.py", line 953, in run_commands
self.run_command(cmd)
  File "/usr/local/lib/python2.7/distutils/dist.py", line 972, in run_command
cmd_obj.run()
  File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", 
line 216, in run
installed_dists = self.install_dists(self.distribution)
  File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", 
line 207, in install_dists
ir_d = dist.fetch_build_eggs(dist.install_requires)
  File "/usr/local/lib/python2.7/site-packages/setuptools/dist.py", line 514, 
in fetch_build_eggs
replace_conflicting=True,
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 
782, in resolve
raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (tzlocal 2.0.0b1 
(/python-packages/pkg/pkg/.eggs/tzlocal-2.0.0b1-py2.7.egg), 
Requirement.parse('tzlocal<2.0.0.0,>=1.5.0.0'), set(['pendulum']))
{code}

It would seem that Airflow pins ancient versions here:

- 
https://github.com/apache/incubator-airflow/blob/314232cc49ebe75cf12e58af9588123441976647/setup.py#L316

psutil is at 5.4.7
pygments is at 2.2.0
pendulum is at 2.0.3

All of these install fine with their latest versions together in a fresh 2.7.15 
Docker container.


> Airflow deps breaking on ancient Python packages
> 
>
> Key: AIRFLOW-2978
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2978
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Jon Davies
>Priority: Major
>
> I have a Python package that depends on Airflow that forms part of our DAG 
> build out process.
> This pipeline installs Airflow 1.10.0 and then the build fails with:
> {code:java}
> Installed /python-packages/pkg/pkg/.eggs/pytzdata-2018.5-py2.7.egg
> Traceback (most recent call last):
>   File "setup.py", line 39, in 
> test_suite="tests"
>   File "/usr/local/lib/python2.7/site-packages/setuptools/__init__.py", line 
> 131, in setup
> return distutils.core.setup(**attrs)
>   File "/usr/local/lib/python2.7/distutils/core.py", line 151, in setup
> dist.run_commands()
>   File 

[jira] [Created] (AIRFLOW-2978) Airflow deps breaking on ancient Package packages

2018-08-29 Thread Jon Davies (JIRA)
Jon Davies created AIRFLOW-2978:
---

 Summary: Airflow deps breaking on ancient Package packages
 Key: AIRFLOW-2978
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2978
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Jon Davies


I have a Python package that depends on Airflow that forms part of our DAG 
build out process.

This pipeline installs Airflow 1.10.0 and then the build fails with:

{code:java}
Installed /python-packages/pkg/pkg/.eggs/pytzdata-2018.5-py2.7.egg
Traceback (most recent call last):
  File "setup.py", line 39, in 
test_suite="tests"
  File "/usr/local/lib/python2.7/site-packages/setuptools/__init__.py", line 
131, in setup
return distutils.core.setup(**attrs)
  File "/usr/local/lib/python2.7/distutils/core.py", line 151, in setup
dist.run_commands()
  File "/usr/local/lib/python2.7/distutils/dist.py", line 953, in run_commands
self.run_command(cmd)
  File "/usr/local/lib/python2.7/distutils/dist.py", line 972, in run_command
cmd_obj.run()
  File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", 
line 216, in run
installed_dists = self.install_dists(self.distribution)
  File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", 
line 207, in install_dists
ir_d = dist.fetch_build_eggs(dist.install_requires)
  File "/usr/local/lib/python2.7/site-packages/setuptools/dist.py", line 514, 
in fetch_build_eggs
replace_conflicting=True,
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 
782, in resolve
raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (tzlocal 2.0.0b1 
(/python-packages/pkg/pkg/.eggs/tzlocal-2.0.0b1-py2.7.egg), 
Requirement.parse('tzlocal<2.0.0.0,>=1.5.0.0'), set(['pendulum']))
{code}

It would seem that Airflow pins ancient versions here:

- 
https://github.com/apache/incubator-airflow/blob/314232cc49ebe75cf12e58af9588123441976647/setup.py#L316

psutil is at 5.4.7
pygments is at 2.2.0
pendulum is at 2.0.3

All of these install fine with their latest versions together in a fresh 2.7.15 
Docker container.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2978) Airflow deps breaking on ancient Python packages

2018-08-29 Thread Jon Davies (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596349#comment-16596349
 ] 

Jon Davies commented on AIRFLOW-2978:
-

Sorry, this is with "python setup.py test".

> Airflow deps breaking on ancient Python packages
> 
>
> Key: AIRFLOW-2978
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2978
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Jon Davies
>Priority: Major
>
> I have a Python package that depends on Airflow that forms part of our DAG 
> build out process.
> This pipeline installs Airflow 1.10.0 and then the build fails with:
> {code:java}
> Installed /python-packages/pkg/pkg/.eggs/pytzdata-2018.5-py2.7.egg
> Traceback (most recent call last):
>   File "setup.py", line 39, in 
> test_suite="tests"
>   File "/usr/local/lib/python2.7/site-packages/setuptools/__init__.py", line 
> 131, in setup
> return distutils.core.setup(**attrs)
>   File "/usr/local/lib/python2.7/distutils/core.py", line 151, in setup
> dist.run_commands()
>   File "/usr/local/lib/python2.7/distutils/dist.py", line 953, in run_commands
> self.run_command(cmd)
>   File "/usr/local/lib/python2.7/distutils/dist.py", line 972, in run_command
> cmd_obj.run()
>   File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", 
> line 216, in run
> installed_dists = self.install_dists(self.distribution)
>   File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", 
> line 207, in install_dists
> ir_d = dist.fetch_build_eggs(dist.install_requires)
>   File "/usr/local/lib/python2.7/site-packages/setuptools/dist.py", line 514, 
> in fetch_build_eggs
> replace_conflicting=True,
>   File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", 
> line 782, in resolve
> raise VersionConflict(dist, req).with_context(dependent_req)
> pkg_resources.ContextualVersionConflict: (tzlocal 2.0.0b1 
> (/python-packages/pkg/pkg/.eggs/tzlocal-2.0.0b1-py2.7.egg), 
> Requirement.parse('tzlocal<2.0.0.0,>=1.5.0.0'), set(['pendulum']))
> {code}
> It would seem that Airflow pins ancient versions here:
> - 
> https://github.com/apache/incubator-airflow/blob/314232cc49ebe75cf12e58af9588123441976647/setup.py#L316
> For example:
> psutil is at 5.4.7
> pygments is at 2.2.0
> pendulum is at 2.0.3
> tzlocal is at 2.0.0b1
> All of these install fine with their latest versions together in a fresh 
> 2.7.15 Docker container.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2977) Airflow Webserver Behind Reverse Proxy with SSL Termination

2018-08-29 Thread Micheal Ascah (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Micheal Ascah updated AIRFLOW-2977:
---
Description: 
Currently, there is no way in Airflow to configure gunicorn to allow it to 
trust X-Forwarded-* headers from a reverse proxy.

In the scenario where the webserver is being run behind an Application Load 
Balancer in AWS that is also performing SSL termination, gunicorn will ignore 
the X-Forwarded-Proto header and issue redirects using HTTP instead of HTTPS. 
If the load balancer is only accepting traffic over 443, then these redirects 
obviously fail.

 

To resolve this, gunicorn needs to be configured to trust the X-Forwarded 
headers. Rather than manually modifying the gunicorn_config.py under www, 
(which is still also being used by the new RBAC webserver), a value should be 
able to be provided through the airflow.cfg (or also through an env var).

This configuration is documented by gunicorn under the section regarding 
deployment behind a proxy.

 

[http://docs.gunicorn.org/en/stable/deploy.html]

 

Proposed to allow a forwarded_allow_ips variable under the `webserver` section 
of the airflow.cfg. and set in the gunicorn_config.py.

  was:
Currently, there is no way in Airflow to configure gunicorn to allow it to 
trust X-Forwarded-* headers from a reverse proxy.

In the scenario where the webserver is being run behind an Application Load 
Balancer in AWS that is also performing SSL termination, gunicorn will ignore 
the X-Forwarded-Proto header and issue redirects using HTTP instead of HTTPS. 
If the load balancer is only accepting traffic over 443, then these redirects 
obviously fail.

 

To resolve this, gunicorn needs to be configured to trust the X-Forwarded 
headers. Rather than manually modifying the gunicorn_config.py under www, 
(which is still also being used by the new RBAC webserver), the a value should 
be able to be provided through the airflow.cfg (or also through an env var).

This configuration is documented by gunicorn under the section regarding 
deployment behind a proxy.

 

[http://docs.gunicorn.org/en/stable/deploy.html]

 

Proposed to allow a forwarded_allow_ips variable under the `webserver` section 
of the airflow.cfg. and set in the gunicorn_config.py.


> Airflow Webserver Behind Reverse Proxy with SSL Termination
> ---
>
> Key: AIRFLOW-2977
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2977
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: webserver
>Affects Versions: 1.10.0
>Reporter: Micheal Ascah
>Assignee: Micheal Ascah
>Priority: Minor
>
> Currently, there is no way in Airflow to configure gunicorn to allow it to 
> trust X-Forwarded-* headers from a reverse proxy.
> In the scenario where the webserver is being run behind an Application Load 
> Balancer in AWS that is also performing SSL termination, gunicorn will ignore 
> the X-Forwarded-Proto header and issue redirects using HTTP instead of HTTPS. 
> If the load balancer is only accepting traffic over 443, then these redirects 
> obviously fail.
>  
> To resolve this, gunicorn needs to be configured to trust the X-Forwarded 
> headers. Rather than manually modifying the gunicorn_config.py under www, 
> (which is still also being used by the new RBAC webserver), a value should be 
> able to be provided through the airflow.cfg (or also through an env var).
> This configuration is documented by gunicorn under the section regarding 
> deployment behind a proxy.
>  
> [http://docs.gunicorn.org/en/stable/deploy.html]
>  
> Proposed to allow a forwarded_allow_ips variable under the `webserver` 
> section of the airflow.cfg. and set in the gunicorn_config.py.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-2977) Airflow Webserver Behind Reverse Proxy with SSL Termination

2018-08-29 Thread Micheal Ascah (JIRA)
Micheal Ascah created AIRFLOW-2977:
--

 Summary: Airflow Webserver Behind Reverse Proxy with SSL 
Termination
 Key: AIRFLOW-2977
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2977
 Project: Apache Airflow
  Issue Type: Improvement
  Components: webserver
Affects Versions: 1.10.0
Reporter: Micheal Ascah
Assignee: Micheal Ascah


Currently, there is no way in Airflow to configure gunicorn to allow it to 
trust X-Forwarded-* headers from a reverse proxy.

In the scenario where the webserver is being run behind an Application Load 
Balancer in AWS that is also performing SSL termination, gunicorn will ignore 
the X-Forwarded-Proto header and issue redirects using HTTP instead of HTTPS. 
If the load balancer is only accepting traffic over 443, then these redirects 
obviously fail.

 

To resolve this, gunicorn needs to be configured to trust the X-Forwarded 
headers. Rather than manually modifying the gunicorn_config.py under www, 
(which is still also being used by the new RBAC webserver), the a value should 
be able to be provided through the airflow.cfg (or also through an env var).

This configuration is documented by gunicorn under the section regarding 
deployment behind a proxy.

 

[http://docs.gunicorn.org/en/stable/deploy.html]

 

Proposed to allow a forwarded_allow_ips variable under the `webserver` section 
of the airflow.cfg. and set in the gunicorn_config.py.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2978) Airflow deps breaking on ancient Python packages

2018-08-29 Thread Jon Davies (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Davies updated AIRFLOW-2978:

Description: 
I have a Python package that depends on Airflow that forms part of our DAG 
build out process.

This pipeline installs Airflow 1.10.0 and then the build fails with:

{code:java}
Installed /python-packages/pkg/pkg/.eggs/pytzdata-2018.5-py2.7.egg
Traceback (most recent call last):
  File "setup.py", line 39, in 
test_suite="tests"
  File "/usr/local/lib/python2.7/site-packages/setuptools/__init__.py", line 
131, in setup
return distutils.core.setup(**attrs)
  File "/usr/local/lib/python2.7/distutils/core.py", line 151, in setup
dist.run_commands()
  File "/usr/local/lib/python2.7/distutils/dist.py", line 953, in run_commands
self.run_command(cmd)
  File "/usr/local/lib/python2.7/distutils/dist.py", line 972, in run_command
cmd_obj.run()
  File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", 
line 216, in run
installed_dists = self.install_dists(self.distribution)
  File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", 
line 207, in install_dists
ir_d = dist.fetch_build_eggs(dist.install_requires)
  File "/usr/local/lib/python2.7/site-packages/setuptools/dist.py", line 514, 
in fetch_build_eggs
replace_conflicting=True,
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 
782, in resolve
raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (tzlocal 2.0.0b1 
(/python-packages/pkg/pkg/.eggs/tzlocal-2.0.0b1-py2.7.egg), 
Requirement.parse('tzlocal<2.0.0.0,>=1.5.0.0'), set(['pendulum']))
{code}

It would seem that Airflow pins ancient versions here:

- 
https://github.com/apache/incubator-airflow/blob/314232cc49ebe75cf12e58af9588123441976647/setup.py#L316

For example:

psutil is at 5.4.7
pygments is at 2.2.0
pendulum is at 2.0.3
tzlocal is at 2.0.0b1

All of these install fine with their latest versions together in a fresh 2.7.15 
Docker container.

  was:
I have a Python package that depends on Airflow that forms part of our DAG 
build out process.

This pipeline installs Airflow 1.10.0 and then the build fails with:

{code:java}
Installed /python-packages/pkg/pkg/.eggs/pytzdata-2018.5-py2.7.egg
Traceback (most recent call last):
  File "setup.py", line 39, in 
test_suite="tests"
  File "/usr/local/lib/python2.7/site-packages/setuptools/__init__.py", line 
131, in setup
return distutils.core.setup(**attrs)
  File "/usr/local/lib/python2.7/distutils/core.py", line 151, in setup
dist.run_commands()
  File "/usr/local/lib/python2.7/distutils/dist.py", line 953, in run_commands
self.run_command(cmd)
  File "/usr/local/lib/python2.7/distutils/dist.py", line 972, in run_command
cmd_obj.run()
  File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", 
line 216, in run
installed_dists = self.install_dists(self.distribution)
  File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", 
line 207, in install_dists
ir_d = dist.fetch_build_eggs(dist.install_requires)
  File "/usr/local/lib/python2.7/site-packages/setuptools/dist.py", line 514, 
in fetch_build_eggs
replace_conflicting=True,
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 
782, in resolve
raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (tzlocal 2.0.0b1 
(/python-packages/pkg/pkg/.eggs/tzlocal-2.0.0b1-py2.7.egg), 
Requirement.parse('tzlocal<2.0.0.0,>=1.5.0.0'), set(['pendulum']))
{code}

It would seem that Airflow pins ancient versions here:

- 
https://github.com/apache/incubator-airflow/blob/314232cc49ebe75cf12e58af9588123441976647/setup.py#L316

For example:

psutil is at 5.4.7
pygments is at 2.2.0
pendulum is at 2.0.3

All of these install fine with their latest versions together in a fresh 2.7.15 
Docker container.


> Airflow deps breaking on ancient Python packages
> 
>
> Key: AIRFLOW-2978
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2978
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Jon Davies
>Priority: Major
>
> I have a Python package that depends on Airflow that forms part of our DAG 
> build out process.
> This pipeline installs Airflow 1.10.0 and then the build fails with:
> {code:java}
> Installed /python-packages/pkg/pkg/.eggs/pytzdata-2018.5-py2.7.egg
> Traceback (most recent call last):
>   File "setup.py", line 39, in <module>
>     test_suite="tests"
>   File "/usr/local/lib/python2.7/site-packages/setuptools/__init__.py", line 131, in setup
>     return distutils.core.setup(**attrs)
>   File "/usr/local/lib/python2.7/distutils/core.py", line 151, in setup
>     dist.run_commands()
>   File 

[jira] [Commented] (AIRFLOW-2978) Airflow deps breaking on ancient Python packages

2018-08-29 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596337#comment-16596337
 ] 

Ash Berlin-Taylor commented on AIRFLOW-2978:


What command/requirements line etc. did you use to install? Do you have some 
special pip config? I notice you have {{tzlocal-2.0.0b1}}, which is a 
pre-release and doesn't get installed by default - on a fresh virtualenv I 
don't get this problem.

Of the three packages you listed, pendulum is the only one pinned to a 
specific version - the other two are already ranges.

> Airflow deps breaking on ancient Python packages
> 
>
> Key: AIRFLOW-2978
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2978
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Jon Davies
>Priority: Major
>
> I have a Python package that depends on Airflow that forms part of our DAG 
> build out process.
> This pipeline installs Airflow 1.10.0 and then the build fails with:
> {code:java}
> Installed /python-packages/pkg/pkg/.eggs/pytzdata-2018.5-py2.7.egg
> Traceback (most recent call last):
>   File "setup.py", line 39, in <module>
>     test_suite="tests"
>   File "/usr/local/lib/python2.7/site-packages/setuptools/__init__.py", line 131, in setup
>     return distutils.core.setup(**attrs)
>   File "/usr/local/lib/python2.7/distutils/core.py", line 151, in setup
>     dist.run_commands()
>   File "/usr/local/lib/python2.7/distutils/dist.py", line 953, in run_commands
>     self.run_command(cmd)
>   File "/usr/local/lib/python2.7/distutils/dist.py", line 972, in run_command
>     cmd_obj.run()
>   File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", line 216, in run
>     installed_dists = self.install_dists(self.distribution)
>   File "/usr/local/lib/python2.7/site-packages/setuptools/command/test.py", line 207, in install_dists
>     ir_d = dist.fetch_build_eggs(dist.install_requires)
>   File "/usr/local/lib/python2.7/site-packages/setuptools/dist.py", line 514, in fetch_build_eggs
>     replace_conflicting=True,
>   File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 782, in resolve
>     raise VersionConflict(dist, req).with_context(dependent_req)
> pkg_resources.ContextualVersionConflict: (tzlocal 2.0.0b1 (/python-packages/pkg/pkg/.eggs/tzlocal-2.0.0b1-py2.7.egg), Requirement.parse('tzlocal<2.0.0.0,>=1.5.0.0'), set(['pendulum']))
> {code}
> It would seem that Airflow pins ancient versions here:
> - 
> https://github.com/apache/incubator-airflow/blob/314232cc49ebe75cf12e58af9588123441976647/setup.py#L316
> For example:
> psutil is at 5.4.7
> pygments is at 2.2.0
> pendulum is at 2.0.3
> tzlocal is at 2.0.0b1
> All of these install fine with their latest versions together in a fresh 
> 2.7.15 Docker container.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-81) Scheduler blackout time period

2018-08-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-81?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596510#comment-16596510
 ] 

ASF GitHub Bot commented on AIRFLOW-81:
---

andscoop opened a new pull request #3702: [AIRFLOW-81] Add 
ScheduleBlackoutSensor
URL: https://github.com/apache/incubator-airflow/pull/3702
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-81
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\]; code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   After reviewing some of the older Jira issues, I found one that could be 
solved simply via a sensor operator, as Chris suggested in the comments. This 
operator and test should be good enough to close out an ancient issue that 
hasn't gotten a lot of traction.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   Adds three tests which validate the sensor correctly returning true, 
returning false, and handling errors.
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Scheduler blackout time period
> --
>
> Key: AIRFLOW-81
> URL: https://issues.apache.org/jira/browse/AIRFLOW-81
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: scheduler
>Reporter: Sean McIntyre
>Assignee: Andy Cooper
>Priority: Minor
>  Labels: features
>
> I have the need for a scheduler blackout time period in Airflow.
> My team, which uses Airflow, has been asked to not query one of my company's 
> data sources between midnight and 7 AM. When we launch big backfills on this 
> data source, it would be nice to have the Scheduler not schedule some 
> TaskInstances during the blackout hours.
> We (@r39132 and @ledsusop) brainstormed a few ideas on gitter on how to do 
> this...
> (1) Put more state/logic in the TaskInstance and Scheduler like this:
> my_task = PythonOperator(
>     task_id='my_task',
>     python_callable=my_command_that_access_the_datasource,
>     provide_context=True,
>     dag=dag,
>     blackout=my_blackout_logic_for_the_datasource  # <---
> )
> where my_blackout_logic is some function I provide that the scheduler calls 
> to determine whether or not it is the blackout period.
> (2) Pause DAGs on nightly basis. This can be done with the `pause_dag` CLI 
> command scheduled by cron / Jenkins. However could this be considered a core 
> feature to bring into the Airflow UI and scheduling system?
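
For concreteness, a sketch of what such a user-provided blackout callable might 
look like (entirely hypothetical - no such hook exists in Airflow today):

{code:python}
from datetime import datetime


def my_blackout_logic_for_the_datasource(dt=None):
    """Return True while inside the midnight-7AM blackout window."""
    dt = dt or datetime.now()
    return 0 <= dt.hour < 7
{code}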



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-81) Scheduler blackout time period

2018-08-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-81?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596509#comment-16596509
 ] 

ASF GitHub Bot commented on AIRFLOW-81:
---

andscoop closed pull request #3702: [AIRFLOW-81] Add ScheduleBlackoutSensor
URL: https://github.com/apache/incubator-airflow/pull/3702
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/sensors/schedule_blackout_sensor.py b/airflow/contrib/sensors/schedule_blackout_sensor.py
new file mode 100644
index 00..ac66cbb3b2
--- /dev/null
+++ b/airflow/contrib/sensors/schedule_blackout_sensor.py
@@ -0,0 +1,99 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.utils.decorators import apply_defaults
+from datetime import datetime
+
+
+class ScheduleBlackoutSensor(BaseSensorOperator):
+    """
+    Checks to see if a task is running for a specified date and time criteria.
+    Returns false if the sensor is running within "blackout" criteria, true otherwise.
+
+    :param month_of_year: Integer representing month of year
+        Not checked if left to default to None
+    :type month_of_year: int
+    :param day_of_month: Integer representing day of month
+        Not checked if left to default to None
+    :type day_of_month: int
+    :param hour_of_day: Integer representing hour of day
+        Not checked if left to default to None
+    :type hour_of_day: int
+    :param min_of_hour: Integer representing minute of hour
+        Not checked if left to default to None
+    :type min_of_hour: int
+    :param day_of_week: Integer representing day of week
+        Not checked if left to default to None
+    :type day_of_week: int
+    :param dt: Datetime object to check criteria against
+        Defaults to datetime.now() if set to None
+    :type dt: datetime
+    """
+
+    @apply_defaults
+    def __init__(self,
+                 month_of_year=None, day_of_month=None,
+                 hour_of_day=None, min_of_hour=None,
+                 day_of_week=None,
+                 dt=None, *args, **kwargs):
+
+        super(ScheduleBlackoutSensor, self).__init__(*args, **kwargs)
+
+        self.dt = dt
+        self.month_of_year = month_of_year
+        self.day_of_month = day_of_month
+        self.hour_of_day = hour_of_day
+        self.min_of_hour = min_of_hour
+        self.day_of_week = day_of_week
+
+    def _check_criteria(self, crit, datepart):
+        # None means this date part is not part of the blackout criteria
+        if crit is None:
+            return None
+        elif isinstance(crit, list):
+            for i in crit:
+                if i == datepart:
+                    return True
+            return False
+        elif isinstance(crit, int):
+            return datepart == crit
+        else:
+            raise TypeError(
+                "Expected an integer or a list, received a {0}".format(type(crit)))
+
+    def poke(self, context):
+        self.dt = datetime.now() if self.dt is None else self.dt
+
+        criteria = [
+            # month of year
+            self._check_criteria(self.month_of_year, self.dt.month),
+            # day of month
+            self._check_criteria(self.day_of_month, self.dt.day),
+            # hour of day
+            self._check_criteria(self.hour_of_day, self.dt.hour),
+            # minute of hour
+            self._check_criteria(self.min_of_hour, self.dt.minute),
+            # day of week
+            self._check_criteria(self.day_of_week, self.dt.weekday())
+        ]
+
+        # Removes criteria that are set to None and then checks that all
+        # specified criteria are True. If all criteria are True - returns False
+        # in order to trigger a sensor failure if blackout criteria are met
+        return not all([crit for crit in criteria if crit is not None])
diff --git a/docs/code.rst b/docs/code.rst
index 80ec76193f..db16b028b5 100644

[GitHub] andscoop removed a comment on issue #3702: [AIRFLOW-81] Add ScheduleBlackoutSensor

2018-08-29 Thread GitBox
andscoop removed a comment on issue #3702: [AIRFLOW-81] Add 
ScheduleBlackoutSensor
URL: 
https://github.com/apache/incubator-airflow/pull/3702#issuecomment-417002423
 
 
   @Fokko I think that is the correct way to solve this issue - with the trade 
off that it will involve


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on issue #3816: [HOLD][AIRFLOW-2973] Use Python 3.6.x everywhere possible

2018-08-29 Thread GitBox
kaxil commented on issue #3816: [HOLD][AIRFLOW-2973] Use Python 3.6.x 
everywhere possible
URL: 
https://github.com/apache/incubator-airflow/pull/3816#issuecomment-417009027
 
 
   Yes, sorry for that. I was trying to fix the issue, traced it to our Docker 
CI setup, and then reverted some changes. I think we should change this PR to 
target 3.7, as I mentioned in #3815. You have already started working on it, 
which is really great @tedmiston .


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2955) Kubernetes pod operator: Unable to set requests/limits on task pods

2018-08-29 Thread Jon Davies (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596597#comment-16596597
 ] 

Jon Davies commented on AIRFLOW-2955:
-

Hello Daniel Imberman,

Here is an example task I'm trying to run with this:

{code:python}
...
task_downloader = KubernetesPodOperator(
namespace=os.getenv('POD_NAMESPACE'),
image="airflow-docker-dags:0.0.9",
cmds=["python"],
arguments=["scripts/task/download_files.py"],
resources={"limit_cpu": "1", "request_cpu": "1"},
name="task-downloader",
in_cluster=True,
task_id="task-downloader",
get_logs=True,
dag=dag)
...
{code}
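
A possible workaround sketch (an assumption on my part, not a confirmed fix): 
build an explicit Resources object, which does have the 
is_empty_resource_request() method the request factory calls, instead of 
passing a plain dict. This assumes the contrib Resources class is importable 
at this path in Airflow 1.10:

{code:python}
# Hypothetical workaround, untested.
from airflow.contrib.kubernetes.pod import Resources

pod_resources = Resources(request_cpu='1', limit_cpu='1')

task_downloader = KubernetesPodOperator(
    namespace=os.getenv('POD_NAMESPACE'),
    image="airflow-docker-dags:0.0.9",
    cmds=["python"],
    arguments=["scripts/task/download_files.py"],
    resources=pod_resources,  # Resources object instead of a dict
    name="task-downloader",
    in_cluster=True,
    task_id="task-downloader",
    get_logs=True,
    dag=dag)
{code}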


> Kubernetes pod operator: Unable to set requests/limits on task pods
> ---
>
> Key: AIRFLOW-2955
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2955
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Jon Davies
>Priority: Major
>
> When I try and set a resource limit/request on a DAG task with the 
> KubernetesPodOperator as follows:
> {code:java}
> resources={"limit_cpu": 1, "request_cpu": 1},
> {code}
> ...I get:
> {code:java}
> [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task Traceback (most recent call last):
> [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task   File "/usr/local/bin/airflow", line 32, in 
> [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task args.func(args)
> [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task   File "/usr/local/lib/python3.7/site-packages/airflow/utils/cli.py", 
> line 74, in wrapper
> [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task return f(*args, **kwargs)
> [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task   File "/usr/local/lib/python3.7/site-packages/airflow/bin/cli.py", line 
> 498, in run
> [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task _run(args, dag, ti)
> [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task   File "/usr/local/lib/python3.7/site-packages/airflow/bin/cli.py", line 
> 402, in _run
> [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task pool=args.pool,
> [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task   File "/usr/local/lib/python3.7/site-packages/airflow/utils/db.py", 
> line 74, in wrapper
> [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task return func(*args, **kwargs)
> [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task   File "/usr/local/lib/python3.7/site-packages/airflow/models.py", line 
> 1633, in _run_raw_task
> [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task result = task_copy.execute(context=context)
> [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task   File 
> "/usr/local/lib/python3.7/site-packages/airflow/contrib/operators/kubernetes_pod_operator.py",
>  line 115, in execute
> [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task get_logs=self.get_logs)
> [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task   File 
> "/usr/local/lib/python3.7/site-packages/airflow/contrib/kubernetes/pod_launcher.py",
>  line 71, in run_pod
> [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task resp = self.run_pod_async(pod)
> [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task   File 
> "/usr/local/lib/python3.7/site-packages/airflow/contrib/kubernetes/pod_launcher.py",
>  line 52, in run_pod_async
> [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task req = self.kube_req_factory.create(pod)
> [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task   File 
> "/usr/local/lib/python3.7/site-packages/airflow/contrib/kubernetes/kubernetes_request_factory/pod_request_factory.py",
>  line 56, in create
> [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task self.extract_resources(pod, req)
> [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task   File 
> "/usr/local/lib/python3.7/site-packages/airflow/contrib/kubernetes/kubernetes_request_factory/kubernetes_request_factory.py",
>  line 160, in extract_resources
> [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task if not pod.resources or pod.resources.is_empty_resource_request():
> [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask 
> task AttributeError: 'dict' 

[GitHub] andscoop commented on issue #3702: [AIRFLOW-81] Add ScheduleBlackoutSensor

2018-08-29 Thread GitBox
andscoop commented on issue #3702: [AIRFLOW-81] Add ScheduleBlackoutSensor
URL: 
https://github.com/apache/incubator-airflow/pull/3702#issuecomment-417003684
 
 
   @Fokko I think that is the more complete solution in this case; it will just 
involve more thought around `start_date` and `end_date` and how they interact 
with the new "periods" feature described above. While potentially not 
significant, this new feature will likely add more logic to the scheduler loop, 
and that trade-off needs to be considered.
   
   With the maintainers' blessing, I can take this one and move forward with 
the feature as discussed. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2949) Syntax Highlight for Single Quote

2018-08-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596569#comment-16596569
 ] 

ASF GitHub Bot commented on AIRFLOW-2949:
-

feng-tao closed pull request #3795: [AIRFLOW-2949] Add syntax highlight for 
single quote strings
URL: https://github.com/apache/incubator-airflow/pull/3795
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/www/static/main.css b/airflow/www/static/main.css
index 57164b94e5..147695c4a9 100644
--- a/airflow/www/static/main.css
+++ b/airflow/www/static/main.css
@@ -262,3 +262,4 @@ div.square {
 .sc { color: #BA2121 } /* Literal.String.Char */
 .sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
 .s2 { color: #BA2121 } /* Literal.String.Double */
+.s1 { color: #BA2121 } /* Literal.String.Single */
diff --git a/airflow/www_rbac/static/css/main.css 
b/airflow/www_rbac/static/css/main.css
index ac6189938c..d3d198356e 100644
--- a/airflow/www_rbac/static/css/main.css
+++ b/airflow/www_rbac/static/css/main.css
@@ -265,3 +265,4 @@ div.square {
 .sc { color: #BA2121 } /* Literal.String.Char */
 .sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
 .s2 { color: #BA2121 } /* Literal.String.Double */
+.s1 { color: #BA2121 } /* Literal.String.Single */


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Syntax Highlight for Single Quote
> -
>
> Key: AIRFLOW-2949
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2949
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Tomas Zulberti
>Priority: Major
> Attachments: image-2018-08-23-16-16-59-375.png
>
>
> When checking the code of any DAG, there is a highlight for double quote 
> strings but there isn't any for single quote strings. pygments generate a 
> special css class but there is no color asigned for them
>  
> !image-2018-08-23-16-16-59-375.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] feng-tao closed pull request #3795: [AIRFLOW-2949] Add syntax highlight for single quote strings

2018-08-29 Thread GitBox
feng-tao closed pull request #3795: [AIRFLOW-2949] Add syntax highlight for 
single quote strings
URL: https://github.com/apache/incubator-airflow/pull/3795
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/www/static/main.css b/airflow/www/static/main.css
index 57164b94e5..147695c4a9 100644
--- a/airflow/www/static/main.css
+++ b/airflow/www/static/main.css
@@ -262,3 +262,4 @@ div.square {
 .sc { color: #BA2121 } /* Literal.String.Char */
 .sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
 .s2 { color: #BA2121 } /* Literal.String.Double */
+.s1 { color: #BA2121 } /* Literal.String.Single */
diff --git a/airflow/www_rbac/static/css/main.css 
b/airflow/www_rbac/static/css/main.css
index ac6189938c..d3d198356e 100644
--- a/airflow/www_rbac/static/css/main.css
+++ b/airflow/www_rbac/static/css/main.css
@@ -265,3 +265,4 @@ div.square {
 .sc { color: #BA2121 } /* Literal.String.Char */
 .sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
 .s2 { color: #BA2121 } /* Literal.String.Double */
+.s1 { color: #BA2121 } /* Literal.String.Single */


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao commented on a change in pull request #3795: [AIRFLOW-2949] Add syntax highlight for single quote strings

2018-08-29 Thread GitBox
feng-tao commented on a change in pull request #3795: [AIRFLOW-2949] Add syntax 
highlight for single quote strings
URL: https://github.com/apache/incubator-airflow/pull/3795#discussion_r213747480
 
 

 ##
 File path: airflow/www/static/main.css
 ##
 @@ -262,3 +262,4 @@ div.square {
 .sc { color: #BA2121 } /* Literal.String.Char */
 .sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */
 .s2 { color: #BA2121 } /* Literal.String.Double */
+.s1 { color: #BA2121 } /* Literal.String.Single */
 
 Review comment:
   sorry for the delay. LGTM.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #3702: [AIRFLOW-81] Add ScheduleBlackoutSensor

2018-08-29 Thread GitBox
codecov-io edited a comment on issue #3702: [AIRFLOW-81] Add 
ScheduleBlackoutSensor
URL: 
https://github.com/apache/incubator-airflow/pull/3702#issuecomment-410552388
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3702?src=pr=h1)
 Report
   > Merging 
[#3702](https://codecov.io/gh/apache/incubator-airflow/pull/3702?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/3157287e8c1aba7649cb7de80dd402889725?src=pr=desc)
 will **increase** coverage by `2.67%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3702/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3702?src=pr=tree)
   
   ```diff
    @@            Coverage Diff             @@
    ##           master    #3702      +/-   ##
    ==========================================
    + Coverage   77.56%   80.24%   +2.67%
    ==========================================
      Files         204      205       +1
      Lines       15766    19763    +3997
    ==========================================
    + Hits        12229    15858    +3629
    - Misses       3537     3905     +368
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3702?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/sensors/s3\_prefix\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3702/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3MzX3ByZWZpeF9zZW5zb3IucHk=)
 | `25% <0%> (-16.18%)` | :arrow_down: |
   | 
[airflow/utils/db.py](https://codecov.io/gh/apache/incubator-airflow/pull/3702/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYi5weQ==)
 | `26.19% <0%> (-7.15%)` | :arrow_down: |
   | 
[...irflow/example\_dags/example\_kubernetes\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/3702/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9rdWJlcm5ldGVzX29wZXJhdG9yLnB5)
 | `69.23% <0%> (-5.77%)` | :arrow_down: |
   | 
[airflow/utils/decorators.py](https://codecov.io/gh/apache/incubator-airflow/pull/3702/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kZWNvcmF0b3JzLnB5)
 | `89.79% <0%> (-1.88%)` | :arrow_down: |
   | 
[airflow/sensors/http\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3702/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL2h0dHBfc2Vuc29yLnB5)
 | `95% <0%> (-1.43%)` | :arrow_down: |
   | 
[airflow/sensors/s3\_key\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3702/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3MzX2tleV9zZW5zb3IucHk=)
 | `30.43% <0%> (-0.6%)` | :arrow_down: |
   | 
[airflow/hooks/presto\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/3702/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9wcmVzdG9faG9vay5weQ==)
 | `38.7% <0%> (-0.43%)` | :arrow_down: |
   | 
[airflow/operators/\_\_init\_\_.py](https://codecov.io/gh/apache/incubator-airflow/pull/3702/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvX19pbml0X18ucHk=)
 | `60.71% <0%> (-0.4%)` | :arrow_down: |
   | 
[airflow/hooks/http\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/3702/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9odHRwX2hvb2sucHk=)
 | `96.42% <0%> (-0.3%)` | :arrow_down: |
   | 
[airflow/operators/hive\_stats\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/3702/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvaGl2ZV9zdGF0c19vcGVyYXRvci5weQ==)
 | `0% <0%> (ø)` | :arrow_up: |
   | ... and [34 
more](https://codecov.io/gh/apache/incubator-airflow/pull/3702/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3702?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3702?src=pr=footer).
 Last update 
[3157287...8d2e1e1](https://codecov.io/gh/apache/incubator-airflow/pull/3702?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] tedmiston commented on issue #3816: [HOLD][AIRFLOW-2973] Use Python 3.6.x everywhere possible

2018-08-29 Thread GitBox
tedmiston commented on issue #3816: [HOLD][AIRFLOW-2973] Use Python 3.6.x 
everywhere possible
URL: 
https://github.com/apache/incubator-airflow/pull/3816#issuecomment-416992698
 
 
   @Fokko That makes sense, and I agree with you there.
   
   I reassigned the ["upgrade to 3.6" Jira 
issue](https://issues.apache.org/jira/browse/AIRFLOW-2973) to @kaxil yesterday 
where he merged these changes into #3815, but then they were reverted back.  
Maybe we can keep his PR focused on the PyPI and tox setup of multiple 
subversions of Python 3 and then this one can become just upgrading the other 
bits that it changes above.
   
   My understanding is that both this and #3815 are on hold until the Docker 
image is reworked to support having multiple subversions of Python 3.
   
   Once that exists, I believe the changes on this branch should "just work".
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on issue #3815: [AIRFLOW-2973] Add Python 3.6 to Supported Prog Langs

2018-08-29 Thread GitBox
kaxil commented on issue #3815: [AIRFLOW-2973] Add Python 3.6 to Supported Prog 
Langs
URL: 
https://github.com/apache/incubator-airflow/pull/3815#issuecomment-417006519
 
 
   That's awesome @gerardo. Looking forward to it.
   
   @tedmiston Great. Would love to see the Py3.7 PR; I think #3816 can be 
converted into a 3.7 upgrade PR. So once 3.6 is fixed, you can probably 
overwrite it with 3.7.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] tedmiston commented on issue #3815: [AIRFLOW-2973] Add Python 3.6 to Supported Prog Langs

2018-08-29 Thread GitBox
tedmiston commented on issue #3815: [AIRFLOW-2973] Add Python 3.6 to Supported 
Prog Langs
URL: 
https://github.com/apache/incubator-airflow/pull/3815#issuecomment-416993877
 
 
   @kaxil Sure.  Btw, I've started a 3.7 test branch currently living here - 
https://github.com/astronomerio/incubator-airflow/tree/python-3.7-experimental.
   
   It looks like the merge of #3816 into here yesterday was reverted.  It 
sounds like this PR may be on hold until the Docker image is reworked to 
support multiple subversions of Python 3.  Is that correct?
   
   @gerardo Can you tag me in that PR ^ when you're done?  I'd like to make a 
similar change in my follow up PR to support Python 3.7 (perhaps with an 
optional flag for now so we don't burn Apache Infra CI time).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil edited a comment on issue #3816: [HOLD][AIRFLOW-2973] Use Python 3.6.x everywhere possible

2018-08-29 Thread GitBox
kaxil edited a comment on issue #3816: [HOLD][AIRFLOW-2973] Use Python 3.6.x 
everywhere possible
URL: 
https://github.com/apache/incubator-airflow/pull/3816#issuecomment-417009027
 
 
   Yes, sorry for that. I was trying to fix the issue, traced it to our Docker 
CI setup, and then reverted some changes. I think we should change this PR to 
target 3.7 (I will shortly add the changes I made yesterday to #3815 for 
Py3.6), as I mentioned in #3815. You have already started working on it, which 
is really great @tedmiston .


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] andscoop closed pull request #3702: [AIRFLOW-81] Add ScheduleBlackoutSensor

2018-08-29 Thread GitBox
andscoop closed pull request #3702: [AIRFLOW-81] Add ScheduleBlackoutSensor
URL: https://github.com/apache/incubator-airflow/pull/3702
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/sensors/schedule_blackout_sensor.py b/airflow/contrib/sensors/schedule_blackout_sensor.py
new file mode 100644
index 00..ac66cbb3b2
--- /dev/null
+++ b/airflow/contrib/sensors/schedule_blackout_sensor.py
@@ -0,0 +1,99 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.utils.decorators import apply_defaults
+from datetime import datetime
+
+
+class ScheduleBlackoutSensor(BaseSensorOperator):
+    """
+    Checks to see if a task is running for a specified date and time criteria.
+    Returns false if the sensor is running within "blackout" criteria, true otherwise.
+
+    :param month_of_year: Integer representing month of year
+        Not checked if left to default to None
+    :type month_of_year: int
+    :param day_of_month: Integer representing day of month
+        Not checked if left to default to None
+    :type day_of_month: int
+    :param hour_of_day: Integer representing hour of day
+        Not checked if left to default to None
+    :type hour_of_day: int
+    :param min_of_hour: Integer representing minute of hour
+        Not checked if left to default to None
+    :type min_of_hour: int
+    :param day_of_week: Integer representing day of week
+        Not checked if left to default to None
+    :type day_of_week: int
+    :param dt: Datetime object to check criteria against
+        Defaults to datetime.now() if set to None
+    :type dt: datetime
+    """
+
+    @apply_defaults
+    def __init__(self,
+                 month_of_year=None, day_of_month=None,
+                 hour_of_day=None, min_of_hour=None,
+                 day_of_week=None,
+                 dt=None, *args, **kwargs):
+
+        super(ScheduleBlackoutSensor, self).__init__(*args, **kwargs)
+
+        self.dt = dt
+        self.month_of_year = month_of_year
+        self.day_of_month = day_of_month
+        self.hour_of_day = hour_of_day
+        self.min_of_hour = min_of_hour
+        self.day_of_week = day_of_week
+
+    def _check_criteria(self, crit, datepart):
+        # None means this date part is not part of the blackout criteria
+        if crit is None:
+            return None
+        elif isinstance(crit, list):
+            for i in crit:
+                if i == datepart:
+                    return True
+            return False
+        elif isinstance(crit, int):
+            return datepart == crit
+        else:
+            raise TypeError(
+                "Expected an integer or a list, received a {0}".format(type(crit)))
+
+    def poke(self, context):
+        self.dt = datetime.now() if self.dt is None else self.dt
+
+        criteria = [
+            # month of year
+            self._check_criteria(self.month_of_year, self.dt.month),
+            # day of month
+            self._check_criteria(self.day_of_month, self.dt.day),
+            # hour of day
+            self._check_criteria(self.hour_of_day, self.dt.hour),
+            # minute of hour
+            self._check_criteria(self.min_of_hour, self.dt.minute),
+            # day of week
+            self._check_criteria(self.day_of_week, self.dt.weekday())
+        ]
+
+        # Removes criteria that are set to None and then checks that all
+        # specified criteria are True. If all criteria are True - returns False
+        # in order to trigger a sensor failure if blackout criteria are met
+        return not all([crit for crit in criteria if crit is not None])
diff --git a/docs/code.rst b/docs/code.rst
index 80ec76193f..db16b028b5 100644
--- a/docs/code.rst
+++ b/docs/code.rst
@@ -222,6 +222,7 @@ Sensors
 .. autoclass:: airflow.contrib.sensors.pubsub_sensor.PubSubPullSensor
 .. autoclass:: airflow.contrib.sensors.qubole_sensor.QuboleSensor
 .. autoclass:: 

[GitHub] andscoop opened a new pull request #3702: [AIRFLOW-81] Add ScheduleBlackoutSensor

2018-08-29 Thread GitBox
andscoop opened a new pull request #3702: [AIRFLOW-81] Add 
ScheduleBlackoutSensor
URL: https://github.com/apache/incubator-airflow/pull/3702
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-81
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\]; code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   After reviewing some of the older Jira issues, I found one that could be 
solved simply via a sensor operator, as Chris suggested in the comments. This 
operator and test should be good enough to close out an ancient issue that 
hasn't gotten a lot of traction.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   Adds three tests which validate the sensor correctly returning true, 
returning false, and handling errors.
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] andscoop commented on issue #3702: [AIRFLOW-81] Add ScheduleBlackoutSensor

2018-08-29 Thread GitBox
andscoop commented on issue #3702: [AIRFLOW-81] Add ScheduleBlackoutSensor
URL: 
https://github.com/apache/incubator-airflow/pull/3702#issuecomment-417002423
 
 
   @Fokko I think that is the correct way to solve this issue - with the trade 
off that it will involve


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] dimberman commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + docker-compose

2018-08-29 Thread GitBox
dimberman commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + 
docker-compose
URL: 
https://github.com/apache/incubator-airflow/pull/3797#issuecomment-417026819
 
 
   @gerardo Minikube definitely will not run inside Docker (there is such a 
thing as "docker in docker", but it's a really bad rabbit hole that we should 
avoid by all means necessary). Let me see if I can remove those earlier tasks.
   
   Interesting! That looks really cool. I think switching off minikube would be 
a great idea for a future PR.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] YingboWang commented on issue #3798: [AIRFLOW-2951] Update dag_run table end_date when state change

2018-08-29 Thread GitBox
YingboWang commented on issue #3798: [AIRFLOW-2951] Update dag_run table 
end_date when state change
URL: 
https://github.com/apache/incubator-airflow/pull/3798#issuecomment-417042685
 
 
   @yrqls21 Thanks for your comments. I just made the modifications and added 
another test case for update_state().
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao closed pull request #3793: [AIRFLOW-2948] Arg check & better doc - SSHOperator & SFTPOperator

2018-08-29 Thread GitBox
feng-tao closed pull request #3793: [AIRFLOW-2948] Arg check & better doc - 
SSHOperator & SFTPOperator
URL: https://github.com/apache/incubator-airflow/pull/3793
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/operators/sftp_operator.py b/airflow/contrib/operators/sftp_operator.py
index 3c736c8b95..a3b5c1f244 100644
--- a/airflow/contrib/operators/sftp_operator.py
+++ b/airflow/contrib/operators/sftp_operator.py
@@ -33,11 +33,15 @@ class SFTPOperator(BaseOperator):
     This operator uses ssh_hook to open sftp trasport channel that serve as basis
     for file transfer.
 
-    :param ssh_hook: predefined ssh_hook to use for remote execution
+    :param ssh_hook: predefined ssh_hook to use for remote execution.
+        Either `ssh_hook` or `ssh_conn_id` needs to be provided.
     :type ssh_hook: :class:`SSHHook`
-    :param ssh_conn_id: connection id from airflow Connections
+    :param ssh_conn_id: connection id from airflow Connections.
+        `ssh_conn_id` will be ignored if `ssh_hook` is provided.
     :type ssh_conn_id: str
     :param remote_host: remote host to connect (templated)
+        Nullable. If provided, it will replace the `remote_host` which was
+        defined in `ssh_hook` or predefined in the connection of `ssh_conn_id`.
     :type remote_host: str
     :param local_filepath: local file path to get or put. (templated)
     :type local_filepath: str
@@ -77,13 +81,21 @@ def __init__(self,
     def execute(self, context):
         file_msg = None
         try:
-            if self.ssh_conn_id and not self.ssh_hook:
-                self.ssh_hook = SSHHook(ssh_conn_id=self.ssh_conn_id)
+            if self.ssh_conn_id:
+                if self.ssh_hook and isinstance(self.ssh_hook, SSHHook):
+                    self.log.info("ssh_conn_id is ignored when ssh_hook is provided.")
+                else:
+                    self.log.info("ssh_hook is not provided or invalid. " +
+                                  "Trying ssh_conn_id to create SSHHook.")
+                    self.ssh_hook = SSHHook(ssh_conn_id=self.ssh_conn_id)
 
             if not self.ssh_hook:
-                raise AirflowException("can not operate without ssh_hook or ssh_conn_id")
+                raise AirflowException("Cannot operate without ssh_hook or ssh_conn_id.")
 
             if self.remote_host is not None:
+                self.log.info("remote_host is provided explicitly. " +
+                              "It will replace the remote_host which was defined " +
+                              "in ssh_hook or predefined in connection of ssh_conn_id.")
                 self.ssh_hook.remote_host = self.remote_host
 
             with self.ssh_hook.get_conn() as ssh_client:
diff --git a/airflow/contrib/operators/ssh_operator.py b/airflow/contrib/operators/ssh_operator.py
index c0e8953d2c..2bf342935d 100644
--- a/airflow/contrib/operators/ssh_operator.py
+++ b/airflow/contrib/operators/ssh_operator.py
@@ -31,11 +31,15 @@ class SSHOperator(BaseOperator):
     """
     SSHOperator to execute commands on given remote host using the ssh_hook.
 
-    :param ssh_hook: predefined ssh_hook to use for remote execution
+    :param ssh_hook: predefined ssh_hook to use for remote execution.
+        Either `ssh_hook` or `ssh_conn_id` needs to be provided.
     :type ssh_hook: :class:`SSHHook`
-    :param ssh_conn_id: connection id from airflow Connections
+    :param ssh_conn_id: connection id from airflow Connections.
+        `ssh_conn_id` will be ignored if `ssh_hook` is provided.
     :type ssh_conn_id: str
     :param remote_host: remote host to connect (templated)
+        Nullable. If provided, it will replace the `remote_host` which was
+        defined in `ssh_hook` or predefined in the connection of `ssh_conn_id`.
     :type remote_host: str
     :param command: command to execute on remote host. (templated)
     :type command: str
@@ -68,14 +72,22 @@ def __init__(self,
 
     def execute(self, context):
         try:
-            if self.ssh_conn_id and not self.ssh_hook:
-                self.ssh_hook = SSHHook(ssh_conn_id=self.ssh_conn_id,
-                                        timeout=self.timeout)
+            if self.ssh_conn_id:
+                if self.ssh_hook and isinstance(self.ssh_hook, SSHHook):
+                    self.log.info("ssh_conn_id is ignored when ssh_hook is provided.")
+                else:
+                    self.log.info("ssh_hook is not provided or invalid. " +
+                                  "Trying ssh_conn_id to create SSHHook.")
+                    self.ssh_hook = SSHHook(ssh_conn_id=self.ssh_conn_id,
+                                            timeout=self.timeout)
 
             if not self.ssh_hook:
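
A minimal usage sketch of the precedence documented in the diff above (the 
connection id and hostname are assumptions for illustration, not values from 
this PR):

```python
from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.ssh_operator import SSHOperator

dag = DAG('ssh_example', start_date=datetime(2018, 8, 1), schedule_interval=None)

# Per the docstring changes above: ssh_conn_id is ignored when an ssh_hook is
# supplied, and remote_host (if given) overrides the host defined in either.
run_remote = SSHOperator(
    task_id='run_remote',
    ssh_conn_id='ssh_default',            # assumed connection id
    remote_host='worker-01.example.com',  # hypothetical host override
    command='uptime',
    dag=dag)
```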

[GitHub] codecov-io commented on issue #3818: [AIRFLOW-2980] ReadTheDocs - Fix Missing API Reference

2018-08-29 Thread GitBox
codecov-io commented on issue #3818: [AIRFLOW-2980] ReadTheDocs - Fix Missing 
API Reference
URL: 
https://github.com/apache/incubator-airflow/pull/3818#issuecomment-417109229
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3818?src=pr=h1)
 Report
   > Merging 
[#3818](https://codecov.io/gh/apache/incubator-airflow/pull/3818?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/0b0f4ac3caca72e67273f9e80221677d78ad5c0e?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3818/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3818?src=pr=tree)
   
   ```diff
    @@           Coverage Diff            @@
    ##           master    #3818   +/-   ##
    ========================================
      Coverage   77.41%   77.41%
    ========================================
      Files         203      203
      Lines       15817    15817
    ========================================
      Hits        12244    12244
      Misses       3573     3573
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3818?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3818?src=pr=footer).
 Last update 
[0b0f4ac...d98ce34](https://codecov.io/gh/apache/incubator-airflow/pull/3818?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #3819: [AIRFLOW-XXX] Fix Broken Link in CONTRIBUTING.md

2018-08-29 Thread GitBox
codecov-io edited a comment on issue #3819: [AIRFLOW-XXX] Fix Broken Link in 
CONTRIBUTING.md
URL: 
https://github.com/apache/incubator-airflow/pull/3819#issuecomment-417110927
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3819?src=pr=h1)
 Report
   > Merging 
[#3819](https://codecov.io/gh/apache/incubator-airflow/pull/3819?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/0b0f4ac3caca72e67273f9e80221677d78ad5c0e?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3819/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3819?src=pr=tree)
   
   ```diff
    @@           Coverage Diff            @@
    ##           master    #3819   +/-   ##
    ========================================
      Coverage   77.41%   77.41%
    ========================================
      Files         203      203
      Lines       15817    15817
    ========================================
      Hits        12244    12244
      Misses       3573     3573
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3819?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3819?src=pr=footer).
 Last update 
[0b0f4ac...e82cb52](https://codecov.io/gh/apache/incubator-airflow/pull/3819?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-2981) TypeError in dataflow operators when using GCS jar or py_file

2018-08-29 Thread Jeffrey Payne (JIRA)
Jeffrey Payne created AIRFLOW-2981:
--

 Summary:  TypeError in dataflow operators when using GCS jar or 
py_file
 Key: AIRFLOW-2981
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2981
 Project: Apache Airflow
  Issue Type: Bug
  Components: contrib, Dataflow
Affects Versions: 1.9.0, 1.10
Reporter: Jeffrey Payne
Assignee: Jeffrey Payne


The {{GoogleCloudBucketHelper.google_cloud_to_local}} function attempts to 
compare a list to an int, resulting in the TypeError, with:
{noformat}
...
path_components = file_name[self.GCS_PREFIX_LENGTH:].split('/')
if path_components < 2:
...
{noformat}
This should be {{if len(path_components) < 2:}}.
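
A standalone sketch of the bug and the one-line fix (not the actual hook code; 
names reproduced from the snippet above, with a made-up example path):

{code:python}
GCS_PREFIX_LENGTH = 5  # len('gs://')
file_name = 'gs://my-bucket/path/to/job.jar'

path_components = file_name[GCS_PREFIX_LENGTH:].split('/')

# Buggy: comparing a list to an int raises TypeError on Python 3
# (and silently evaluates to False on Python 2):
#     if path_components < 2:
# Fixed: compare the number of path components instead.
if len(path_components) < 2:
    raise ValueError('Invalid GCS path: %s' % file_name)
{code}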



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] kaxil commented on issue #3818: [AIRFLOW-2980] ReadTheDocs - Fix Missing API Reference

2018-08-29 Thread GitBox
kaxil commented on issue #3818: [AIRFLOW-2980] ReadTheDocs - Fix Missing API 
Reference
URL: 
https://github.com/apache/incubator-airflow/pull/3818#issuecomment-417109434
 
 
   Docs for `HiveOperator` can now be seen at 
https://airflow-fork-k1.readthedocs.io/en/airflow-2980-rtd-fix-missing-docs/code.html#airflow.operators.hive_operator.HiveOperator
 (RTD on my fork).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io commented on issue #3819: [AIRFLOW-XXX] Fix Broken Link in CONTRIBUTING.md

2018-08-29 Thread GitBox
codecov-io commented on issue #3819: [AIRFLOW-XXX] Fix Broken Link in 
CONTRIBUTING.md
URL: 
https://github.com/apache/incubator-airflow/pull/3819#issuecomment-417110927
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3819?src=pr=h1)
 Report
   > Merging 
[#3819](https://codecov.io/gh/apache/incubator-airflow/pull/3819?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/0b0f4ac3caca72e67273f9e80221677d78ad5c0e?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3819/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3819?src=pr=tree)
   
   ```diff
    @@           Coverage Diff            @@
    ##           master    #3819   +/-   ##
    ========================================
      Coverage   77.41%   77.41%
    ========================================
      Files         203      203
      Lines       15817    15817
    ========================================
      Hits        12244    12244
      Misses       3573     3573
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3819?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3819?src=pr=footer).
 Last update 
[0b0f4ac...e82cb52](https://codecov.io/gh/apache/incubator-airflow/pull/3819?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #3818: [AIRFLOW-2980] ReadTheDocs - Fix Missing API Reference

2018-08-29 Thread GitBox
codecov-io edited a comment on issue #3818: [AIRFLOW-2980] ReadTheDocs - Fix 
Missing API Reference
URL: 
https://github.com/apache/incubator-airflow/pull/3818#issuecomment-417109229
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3818?src=pr=h1)
 Report
   > Merging 
[#3818](https://codecov.io/gh/apache/incubator-airflow/pull/3818?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/0b0f4ac3caca72e67273f9e80221677d78ad5c0e?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3818/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3818?src=pr=tree)
   
   ```diff
    @@           Coverage Diff            @@
    ##           master    #3818   +/-   ##
    ========================================
      Coverage   77.41%   77.41%
    ========================================
      Files         203      203
      Lines       15817    15817
    ========================================
      Hits        12244    12244
      Misses       3573     3573
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3818?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3818?src=pr=footer).
 Last update 
[0b0f4ac...d98ce34](https://codecov.io/gh/apache/incubator-airflow/pull/3818?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #3798: [AIRFLOW-2951] Update dag_run table end_date when state change

2018-08-29 Thread GitBox
codecov-io edited a comment on issue #3798: [AIRFLOW-2951] Update dag_run table 
end_date when state change
URL: 
https://github.com/apache/incubator-airflow/pull/3798#issuecomment-417061518
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3798?src=pr=h1)
 Report
   > Merging 
[#3798](https://codecov.io/gh/apache/incubator-airflow/pull/3798?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/96502982319b50766123ee85fb3732470ee3d048?src=pr=desc)
 will **decrease** coverage by `<.01%`.
   > The diff coverage is `83.33%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3798/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3798?src=pr=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##           master    #3798      +/-   ##
   =========================================
   - Coverage   77.41%    77.4%   -0.01%
   =========================================
     Files         203      203
     Lines       15817    15819      +2
   =========================================
   + Hits        12244    12245      +1
   - Misses       3573     3574      +1
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3798?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/api/common/experimental/mark\_tasks.py](https://codecov.io/gh/apache/incubator-airflow/pull/3798/diff?src=pr=tree#diff-YWlyZmxvdy9hcGkvY29tbW9uL2V4cGVyaW1lbnRhbC9tYXJrX3Rhc2tzLnB5)
 | `66.4% <0%> (-0.53%)` | :arrow_down: |
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3798/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `88.79% <100%> (ø)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3798?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3798?src=pr=footer).
 Last update 
[9650298...5cb4ee7](https://codecov.io/gh/apache/incubator-airflow/pull/3798?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-2979) Deprecated Celery Option not in Options list

2018-08-29 Thread Micheal Ascah (JIRA)
Micheal Ascah created AIRFLOW-2979:
--

 Summary: Deprecated Celery Option not in Options list 
 Key: AIRFLOW-2979
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2979
 Project: Apache Airflow
  Issue Type: Bug
  Components: celery
Affects Versions: 1.10.0
Reporter: Micheal Ascah


References [AIRFLOW-1840]

In airflow/configuration.py
{code:java}
# A two-level mapping of (section -> new_name -> old_name). When reading
# new_name, the old_name will be checked to see if it exists. If it does a
# DeprecationWarning will be issued and the old name will be used instead
deprecated_options = {
    'celery': {
        # Remove these keys in Airflow 1.11
        'worker_concurrency': 'celeryd_concurrency',
        'broker_url': 'celery_broker_url',
        'ssl_active': 'celery_ssl_active',
        'ssl_cert': 'celery_ssl_cert',
        'ssl_key': 'celery_ssl_key',
    }
}
{code}
This block is missing the renaming of celery_result_backend to just 
result_backend.

 

When setting this through an environment variable, the deprecated config name 
is not being used and instead the default value in the file is being used. 
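
A minimal sketch of the proposed fix (this mirrors the description above; it 
is the reporter's suggestion, not a merged change):
{code:java}
# Sketch only: the proposed missing entry, not the merged fix. With it,
# the old environment variable AIRFLOW__CELERY__CELERY_RESULT_BACKEND
# would keep working (with a DeprecationWarning) instead of silently
# falling back to the default value.
deprecated_options = {
    'celery': {
        # Remove these keys in Airflow 1.11
        'worker_concurrency': 'celeryd_concurrency',
        'broker_url': 'celery_broker_url',
        'ssl_active': 'celery_ssl_active',
        'ssl_cert': 'celery_ssl_cert',
        'ssl_key': 'celery_ssl_key',
        'result_backend': 'celery_result_backend',  # the missing rename
    }
}
{code}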



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2948) Arg checking & better doc for SSHOperator and SFTPOperator

2018-08-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596758#comment-16596758
 ] 

ASF GitHub Bot commented on AIRFLOW-2948:
-

feng-tao closed pull request #3793: [AIRFLOW-2948] Arg check & better doc - 
SSHOperator & SFTPOperator
URL: https://github.com/apache/incubator-airflow/pull/3793
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/operators/sftp_operator.py 
b/airflow/contrib/operators/sftp_operator.py
index 3c736c8b95..a3b5c1f244 100644
--- a/airflow/contrib/operators/sftp_operator.py
+++ b/airflow/contrib/operators/sftp_operator.py
@@ -33,11 +33,15 @@ class SFTPOperator(BaseOperator):
     This operator uses ssh_hook to open an SFTP transport channel that
     serves as the basis for file transfer.
 
-    :param ssh_hook: predefined ssh_hook to use for remote execution
+    :param ssh_hook: predefined ssh_hook to use for remote execution.
+        Either `ssh_hook` or `ssh_conn_id` needs to be provided.
     :type ssh_hook: :class:`SSHHook`
-    :param ssh_conn_id: connection id from airflow Connections
+    :param ssh_conn_id: connection id from airflow Connections.
+        `ssh_conn_id` will be ignored if `ssh_hook` is provided.
     :type ssh_conn_id: str
     :param remote_host: remote host to connect (templated)
+        Nullable. If provided, it will replace the `remote_host` which was
+        defined in `ssh_hook` or predefined in the connection of `ssh_conn_id`.
     :type remote_host: str
     :param local_filepath: local file path to get or put. (templated)
     :type local_filepath: str
@@ -77,13 +81,21 @@ def __init__(self,
     def execute(self, context):
         file_msg = None
         try:
-            if self.ssh_conn_id and not self.ssh_hook:
-                self.ssh_hook = SSHHook(ssh_conn_id=self.ssh_conn_id)
+            if self.ssh_conn_id:
+                if self.ssh_hook and isinstance(self.ssh_hook, SSHHook):
+                    self.log.info("ssh_conn_id is ignored when ssh_hook is provided.")
+                else:
+                    self.log.info("ssh_hook is not provided or invalid. " +
+                                  "Trying ssh_conn_id to create SSHHook.")
+                    self.ssh_hook = SSHHook(ssh_conn_id=self.ssh_conn_id)
 
             if not self.ssh_hook:
-                raise AirflowException("can not operate without ssh_hook or ssh_conn_id")
+                raise AirflowException("Cannot operate without ssh_hook or ssh_conn_id.")
 
             if self.remote_host is not None:
+                self.log.info("remote_host is provided explicitly. " +
+                              "It will replace the remote_host which was defined " +
+                              "in ssh_hook or predefined in connection of ssh_conn_id.")
                 self.ssh_hook.remote_host = self.remote_host
 
             with self.ssh_hook.get_conn() as ssh_client:
diff --git a/airflow/contrib/operators/ssh_operator.py 
b/airflow/contrib/operators/ssh_operator.py
index c0e8953d2c..2bf342935d 100644
--- a/airflow/contrib/operators/ssh_operator.py
+++ b/airflow/contrib/operators/ssh_operator.py
@@ -31,11 +31,15 @@ class SSHOperator(BaseOperator):
     """
     SSHOperator to execute commands on given remote host using the ssh_hook.
 
-    :param ssh_hook: predefined ssh_hook to use for remote execution
+    :param ssh_hook: predefined ssh_hook to use for remote execution.
+        Either `ssh_hook` or `ssh_conn_id` needs to be provided.
     :type ssh_hook: :class:`SSHHook`
-    :param ssh_conn_id: connection id from airflow Connections
+    :param ssh_conn_id: connection id from airflow Connections.
+        `ssh_conn_id` will be ignored if `ssh_hook` is provided.
     :type ssh_conn_id: str
     :param remote_host: remote host to connect (templated)
+        Nullable. If provided, it will replace the `remote_host` which was
+        defined in `ssh_hook` or predefined in the connection of `ssh_conn_id`.
     :type remote_host: str
     :param command: command to execute on remote host. (templated)
     :type command: str
@@ -68,14 +72,22 @@ def __init__(self,
 
     def execute(self, context):
         try:
-            if self.ssh_conn_id and not self.ssh_hook:
-                self.ssh_hook = SSHHook(ssh_conn_id=self.ssh_conn_id,
-                                        timeout=self.timeout)
+            if self.ssh_conn_id:
+                if self.ssh_hook and isinstance(self.ssh_hook, SSHHook):
+                    self.log.info("ssh_conn_id is ignored when ssh_hook is provided.")
+                else:
+                    self.log.info("ssh_hook is not provided or invalid. " +
+
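
To illustrate the precedence rules documented above, a hedged usage sketch 
(the DAG, connection id, and host below are hypothetical, not from the PR):

```python
from datetime import datetime

from airflow import DAG
from airflow.contrib.hooks.ssh_hook import SSHHook
from airflow.contrib.operators.ssh_operator import SSHOperator

# Illustrative values only; 'my_ssh_conn' and the host are made up.
with DAG('ssh_example', start_date=datetime(2018, 8, 1),
         schedule_interval=None) as dag:
    hook = SSHHook(ssh_conn_id='my_ssh_conn')
    run_remote = SSHOperator(
        task_id='run_remote',
        ssh_hook=hook,           # a valid explicit hook wins over ssh_conn_id
        remote_host='10.0.0.5',  # replaces the host from the hook/connection
        command='uptime',
    )
```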

[jira] [Updated] (AIRFLOW-2979) Deprecated Celery Option not in Options list

2018-08-29 Thread Micheal Ascah (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Micheal Ascah updated AIRFLOW-2979:
---
Description: 
References AIRFLOW-1840

In airflow/configuration.py
{code:java}
# A two-level mapping of (section -> new_name -> old_name). When reading
# new_name, the old_name will be checked to see if it exists. If it does a
# DeprecationWarning will be issued and the old name will be used instead
deprecated_options = {
    'celery': {
        # Remove these keys in Airflow 1.11
        'worker_concurrency': 'celeryd_concurrency',
        'broker_url': 'celery_broker_url',
        'ssl_active': 'celery_ssl_active',
        'ssl_cert': 'celery_ssl_cert',
        'ssl_key': 'celery_ssl_key',
    }
}
{code}
This block is missing the renaming of celery_result_backend to just 
result_backend.

 

When setting this through an environment variable, the deprecated config name 
is not being used and instead the default value in the file is being used. 

This is obviously remedied by reading UPDATING.md and setting the new name, 
but this change has broken backwards compatibility as far as I can tell.

  was:
References [AIRFLOW-1840]

In airflow/configuration.py
{code:java}
# A two-level mapping of (section -> new_name -> old_name). When reading
# new_name, the old_name will be checked to see if it exists. If it does a
# DeprecationWarning will be issued and the old name will be used instead
deprecated_options = {
    'celery': {
        # Remove these keys in Airflow 1.11
        'worker_concurrency': 'celeryd_concurrency',
        'broker_url': 'celery_broker_url',
        'ssl_active': 'celery_ssl_active',
        'ssl_cert': 'celery_ssl_cert',
        'ssl_key': 'celery_ssl_key',
    }
}
{code}
This block is missing the renaming of celery_result_backend to just 
result_backend.

 

When setting this through an environment variable, the deprecated config name 
is not being used and instead the default value in the file is being used. 


> Deprecated Celery Option not in Options list 
> -
>
> Key: AIRFLOW-2979
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2979
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: celery
>Affects Versions: 1.10.0
>Reporter: Micheal Ascah
>Priority: Critical
>
> References AIRFLOW-1840
> In airflow/configuration.py
> {code:java}
> # A two-level mapping of (section -> new_name -> old_name). When reading
> # new_name, the old_name will be checked to see if it exists. If it does a
> # DeprecationWarning will be issued and the old name will be used instead
> deprecated_options = {
>     'celery': {
>         # Remove these keys in Airflow 1.11
>         'worker_concurrency': 'celeryd_concurrency',
>         'broker_url': 'celery_broker_url',
>         'ssl_active': 'celery_ssl_active',
>         'ssl_cert': 'celery_ssl_cert',
>         'ssl_key': 'celery_ssl_key',
>     }
> }
> {code}
> This block is missing the renaming of celery_result_backend to just 
> result_backend.
>  
> When setting this through an environment variable, the deprecated config name 
> is not being used and instead the default value in the file is being used. 
> This is obviously remedied by reading UPDATING.md and setting the new name, 
> but this change has broken backwards compatibility as far as I can tell.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] kaxil opened a new pull request #3818: [AIRFLOW-2980] ReadTheDocs - Fix Missing API Reference

2018-08-29 Thread GitBox
kaxil opened a new pull request #3818: [AIRFLOW-2980] ReadTheDocs - Fix Missing 
API Reference
URL: https://github.com/apache/incubator-airflow/pull/3818
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   Some of the operators are missing from the API reference part of the docs 
(HiveOperator, for instance). This PR will force RTD to install all the 
Airflow dependencies, which will then be used to generate the API Reference; 
a hedged illustration of why missing imports hide operators from autodoc 
follows this checklist.
   
   ### Code Quality
   
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
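   
   For background, a hedged illustration of why missing dependencies hide 
operators from autodoc (module names below are purely illustrative; mocking 
in `docs/conf.py` is a common ReadTheDocs workaround, whereas this PR 
installs the real dependencies instead):
   
   ```python
   # Sketch only, not part of this PR. Sphinx autodoc has to import
   # airflow.operators.hive_operator to document HiveOperator; if an import
   # inside that module fails on RTD, the class silently disappears from
   # the API reference. Stubbing the heavy modules is one workaround:
   import sys
   from unittest import mock  # on Python 2, use the external `mock` package
   
   MOCK_MODULES = ['pyhive', 'hmsclient']  # hypothetical missing dependencies
   for mod_name in MOCK_MODULES:
       sys.modules[mod_name] = mock.Mock()
   ```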
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2980) Missing operators in the docs

2018-08-29 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596820#comment-16596820
 ] 

ASF GitHub Bot commented on AIRFLOW-2980:
-

kaxil opened a new pull request #3818: [AIRFLOW-2980] ReadTheDocs - Fix Missing 
API Reference
URL: https://github.com/apache/incubator-airflow/pull/3818
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   Some of the operators are missing from the API reference part of the docs 
(HiveOperator, for instance). This PR will force RTD to install all the 
Airflow dependencies, which will then be used to generate the API Reference.
   
   ### Code Quality
   
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Missing operators in the docs
> -
>
> Key: AIRFLOW-2980
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2980
> Project: Apache Airflow
>  Issue Type: Task
>  Components: docs, Documentation
>Reporter: Kaxil Naik
>Assignee: Kaxil Naik
>Priority: Major
>
> Some of the operators are missing from the API reference
> part of the docs (HiveOperator for instance). 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] kaxil opened a new pull request #3819: [AIRFLOW-XXX] Fix Broken Link in CONTRIBUTING.md

2018-08-29 Thread GitBox
kaxil opened a new pull request #3819: [AIRFLOW-XXX] Fix Broken Link in 
CONTRIBUTING.md
URL: https://github.com/apache/incubator-airflow/pull/3819
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   Fix Broken Dockerfile Link in CONTRIBUTING.md
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] mascah edited a comment on issue #3549: [AIRFLOW-1840] Support back-compat on old celery config

2018-08-29 Thread GitBox
mascah edited a comment on issue #3549: [AIRFLOW-1840] Support back-compat on 
old celery config
URL: 
https://github.com/apache/incubator-airflow/pull/3549#issuecomment-417080253
 
 
   This looks like it's missing the option for `celery_result_backend` -> 
`result_backend`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io commented on issue #3798: [AIRFLOW-2951] Update dag_run table end_date when state change

2018-08-29 Thread GitBox
codecov-io commented on issue #3798: [AIRFLOW-2951] Update dag_run table 
end_date when state change
URL: 
https://github.com/apache/incubator-airflow/pull/3798#issuecomment-417061518
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3798?src=pr=h1)
 Report
   > Merging 
[#3798](https://codecov.io/gh/apache/incubator-airflow/pull/3798?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/96502982319b50766123ee85fb3732470ee3d048?src=pr=desc)
 will **decrease** coverage by `<.01%`.
   > The diff coverage is `83.33%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3798/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3798?src=pr=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##           master    #3798      +/-   ##
   =========================================
   - Coverage   77.41%    77.4%   -0.01%
   =========================================
     Files         203      203
     Lines       15817    15819      +2
   =========================================
   + Hits        12244    12245      +1
   - Misses       3573     3574      +1
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3798?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/api/common/experimental/mark\_tasks.py](https://codecov.io/gh/apache/incubator-airflow/pull/3798/diff?src=pr=tree#diff-YWlyZmxvdy9hcGkvY29tbW9uL2V4cGVyaW1lbnRhbC9tYXJrX3Rhc2tzLnB5)
 | `66.4% <0%> (-0.53%)` | :arrow_down: |
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3798/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `88.79% <100%> (ø)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3798?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3798?src=pr=footer).
 Last update 
[9650298...5cb4ee7](https://codecov.io/gh/apache/incubator-airflow/pull/3798?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #3798: [AIRFLOW-2951] Update dag_run table end_date when state change

2018-08-29 Thread GitBox
codecov-io edited a comment on issue #3798: [AIRFLOW-2951] Update dag_run table 
end_date when state change
URL: 
https://github.com/apache/incubator-airflow/pull/3798#issuecomment-417061518
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3798?src=pr=h1)
 Report
   > Merging 
[#3798](https://codecov.io/gh/apache/incubator-airflow/pull/3798?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/96502982319b50766123ee85fb3732470ee3d048?src=pr=desc)
 will **decrease** coverage by `<.01%`.
   > The diff coverage is `83.33%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3798/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3798?src=pr=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##           master    #3798      +/-   ##
   =========================================
   - Coverage   77.41%    77.4%   -0.01%
   =========================================
     Files         203      203
     Lines       15817    15819      +2
   =========================================
   + Hits        12244    12245      +1
   - Misses       3573     3574      +1
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3798?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/api/common/experimental/mark\_tasks.py](https://codecov.io/gh/apache/incubator-airflow/pull/3798/diff?src=pr=tree#diff-YWlyZmxvdy9hcGkvY29tbW9uL2V4cGVyaW1lbnRhbC9tYXJrX3Rhc2tzLnB5)
 | `66.4% <0%> (-0.53%)` | :arrow_down: |
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3798/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `88.79% <100%> (ø)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3798?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3798?src=pr=footer).
 Last update 
[9650298...5cb4ee7](https://codecov.io/gh/apache/incubator-airflow/pull/3798?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-2595) Oracle to Azure Blob Storage Transfer Operator

2018-08-29 Thread Marcus Rehm (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Rehm updated AIRFLOW-2595:
-
Description: 
Like the MySqlToHiveTransfer operator, it would be nice to have an 
OracleToAzureBlobStorageTransfer operator so we can simplify data extraction 
from an Oracle database and put it right into blob storage.

The operator should have the option to extract data and save it in CSV or 
Parquet format.

  was:Like the MySqlToHiveTransfer operator, it would be nice to have an 
OracleToAzureBlobStorageTransfer operator so we can simplify data extraction 
from an Oracle database and put it right into blob storage.


> Oracle to Azure Blob Storage Transfer Operator
> --
>
> Key: AIRFLOW-2595
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2595
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: operators
>Reporter: Marcus Rehm
>Assignee: Marcus Rehm
>Priority: Trivial
>
> Like the MySqlToHiveTransfer operator, it would be nice to have an 
> OracleToAzureBlobStorageTransfer operator so we can simplify data extraction 
> from an Oracle database and put it right into blob storage.
> The operator should have the option to extract data and save it in CSV or 
> Parquet format.
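
A hedged sketch of what such an operator could look like (the class and its 
defaults are hypothetical; only OracleHook, WasbHook, and their standard 
get_records()/load_string() methods are assumed):
{code:java}
# Hypothetical sketch, not an actual Airflow operator.
import csv
import io

from airflow.contrib.hooks.wasb_hook import WasbHook
from airflow.hooks.oracle_hook import OracleHook
from airflow.models import BaseOperator
from airflow.utils.decorators import apply_defaults


class OracleToAzureBlobStorageTransfer(BaseOperator):
    @apply_defaults
    def __init__(self, sql, container_name, blob_name,
                 oracle_conn_id='oracle_default',
                 wasb_conn_id='wasb_default', *args, **kwargs):
        super(OracleToAzureBlobStorageTransfer, self).__init__(*args, **kwargs)
        self.sql = sql
        self.container_name = container_name
        self.blob_name = blob_name
        self.oracle_conn_id = oracle_conn_id
        self.wasb_conn_id = wasb_conn_id

    def execute(self, context):
        oracle = OracleHook(oracle_conn_id=self.oracle_conn_id)
        rows = oracle.get_records(self.sql)  # fetch the full result set
        buf = io.StringIO()
        csv.writer(buf).writerows(rows)      # CSV; Parquet would need pyarrow
        WasbHook(wasb_conn_id=self.wasb_conn_id).load_string(
            buf.getvalue(), self.container_name, self.blob_name)
{code}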



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] codecov-io edited a comment on issue #3818: [AIRFLOW-2980] ReadTheDocs - Fix Missing API Reference

2018-08-29 Thread GitBox
codecov-io edited a comment on issue #3818: [AIRFLOW-2980] ReadTheDocs - Fix 
Missing API Reference
URL: 
https://github.com/apache/incubator-airflow/pull/3818#issuecomment-417109229
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3818?src=pr=h1)
 Report
   > Merging 
[#3818](https://codecov.io/gh/apache/incubator-airflow/pull/3818?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/0b0f4ac3caca72e67273f9e80221677d78ad5c0e?src=pr=desc)
 will **decrease** coverage by `<.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3818/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3818?src=pr=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##           master    #3818      +/-   ##
   =========================================
   - Coverage   77.41%    77.4%   -0.01%
   =========================================
     Files         203      203
     Lines       15817    15817
   =========================================
   - Hits        12244    12243      -1
   - Misses       3573     3574      +1
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3818?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3818/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `88.74% <0%> (-0.05%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3818?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3818?src=pr=footer).
 Last update 
[0b0f4ac...d98ce34](https://codecov.io/gh/apache/incubator-airflow/pull/3818?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] gerardo commented on issue #3805: [AIRFLOW-2062] Add per-connection KMS encryption.

2018-08-29 Thread GitBox
gerardo commented on issue #3805: [AIRFLOW-2062] Add per-connection KMS 
encryption.
URL: 
https://github.com/apache/incubator-airflow/pull/3805#issuecomment-417157895
 
 
   @jakahn this PR makes things pretty confusing, as AWS has a product called 
[KMS](https://aws.amazon.com/kms/) as well, which has been around for longer. 
The docs you submitted also assume Google Cloud KMS is the _only_ thing called 
KMS.
   
   I still like the idea, but the implementation is very specific to Google 
Cloud and hard to reuse for other implementations. Is there a way you could 
make it more generic?
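   
   One possible shape for "more generic", purely as a sketch (the interface 
and its names are hypothetical, neither from this PR nor from Airflow):
   
   ```python
   from abc import ABCMeta, abstractmethod


   class KmsBackend(object):
       """Hypothetical provider-agnostic interface: Google Cloud KMS, AWS
       KMS, etc. would each implement it, so the Connection model only ever
       calls encrypt()/decrypt() without knowing the provider."""
       __metaclass__ = ABCMeta

       @abstractmethod
       def encrypt(self, plaintext, key_id):
           """Return ciphertext for plaintext under the named key."""

       @abstractmethod
       def decrypt(self, ciphertext, key_id):
           """Return plaintext for ciphertext under the named key."""
   ```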


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

