[jira] [Commented] (AIRFLOW-2244) Exception importing Azure (wasb) log handler

2018-03-22 Thread John Arnold (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16410636#comment-16410636
 ] 

John Arnold commented on AIRFLOW-2244:
--

in airflow/models.py:

if 'mysql' in settings.SQL_ALCHEMY_CONN:
    LongText = LONGTEXT
else:
    LongText = Text

This is not guarded at all: you can't run a membership test against 
settings.SQL_ALCHEMY_CONN when it is None, so you get a TypeError.
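A guarded version of that check could look like the sketch below. This is a hedged illustration only: `pick_long_text` is a hypothetical helper, and the string return values stand in for the SQLAlchemy LONGTEXT and Text types used in airflow/models.py.

```python
def pick_long_text(sql_alchemy_conn, mysql_type="LONGTEXT", default_type="Text"):
    """Choose a long-text column type based on the connection string.

    Hypothetical helper; the strings stand in for the SQLAlchemy
    LONGTEXT and Text types referenced in airflow/models.py.
    """
    # Guard the membership test: `'mysql' in None` raises the
    # "argument of type 'NoneType' is not iterable" TypeError seen here.
    if sql_alchemy_conn and "mysql" in sql_alchemy_conn:
        return mysql_type
    return default_type
```

With the guard in place, an unset SQL_ALCHEMY_CONN falls back to the generic type instead of raising during import.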

> Exception importing Azure (wasb) log handler
> 
>
> Key: AIRFLOW-2244
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2244
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: core
>Reporter: John Arnold
>Priority: Major
>
> Attempting to configure remote logging to Azure storage, I get import 
> exceptions while trying to load the wasb handler:
> Unable to load the config, contains a configuration error.
> Traceback (most recent call last):
>  File "/usr/lib/python3.6/logging/config.py", line 382, in resolve
>  found = getattr(found, frag)
> AttributeError: module 'airflow.utils.log' has no attribute 
> 'wasb_task_handler'
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>  File "/usr/lib/python3.6/logging/config.py", line 558, in configure
>  handler = self.configure_handler(handlers[name])
>  File "/usr/lib/python3.6/logging/config.py", line 708, in configure_handler
>  klass = self.resolve(cname)
>  File "/usr/lib/python3.6/logging/config.py", line 384, in resolve
>  self.importer(used)
>  File 
> "/var/lib/airflow/venv/lib/python3.6/site-packages/apache_airflow-1.10.0.dev0+incubating-py3.6.egg/airflow/utils/log/wasb_task_handler.py",
>  line 18, in 
>  from airflow.contrib.hooks.wasb_hook import WasbHook
>  File 
> "/var/lib/airflow/venv/lib/python3.6/site-packages/apache_airflow-1.10.0.dev0+incubating-py3.6.egg/airflow/contrib/hooks/wasb_hook.py",
>  line 16, in 
>  from airflow.hooks.base_hook import BaseHook
>  File 
> "/var/lib/airflow/venv/lib/python3.6/site-packages/apache_airflow-1.10.0.dev0+incubating-py3.6.egg/airflow/hooks/base_hook.py",
>  line 24, in 
>  from airflow.models import Connection
>  File 
> "/var/lib/airflow/venv/lib/python3.6/site-packages/apache_airflow-1.10.0.dev0+incubating-py3.6.egg/airflow/models.py",
>  line 119, in 
>  if 'mysql' in settings.SQL_ALCHEMY_CONN:
> TypeError: argument of type 'NoneType' is not iterable
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>  File "/var/lib/airflow/venv/bin/airflow", line 4, in 
>  
> __import__('pkg_resources').run_script('apache-airflow==1.10.0.dev0+incubating',
>  'airflow')
>  File 
> "/var/lib/airflow/venv/lib/python3.6/site-packages/pkg_resources/__init__.py",
>  line 658, in run_script
>  self.require(requires)[0].run_script(script_name, ns)
>  File 
> "/var/lib/airflow/venv/lib/python3.6/site-packages/pkg_resources/__init__.py",
>  line 1438, in run_script
>  exec(code, namespace, namespace)
>  File 
> "/var/lib/airflow/venv/lib/python3.6/site-packages/apache_airflow-1.10.0.dev0+incubating-py3.6.egg/EGG-INFO/scripts/airflow",
>  line 16, in 
>  from airflow import configuration
>  File 
> "/var/lib/airflow/venv/lib/python3.6/site-packages/apache_airflow-1.10.0.dev0+incubating-py3.6.egg/airflow/__init__.py",
>  line 31, in 
>  from airflow import settings
>  File 
> "/var/lib/airflow/venv/lib/python3.6/site-packages/apache_airflow-1.10.0.dev0+incubating-py3.6.egg/airflow/settings.py",
>  line 198, in 
>  configure_logging()
>  File 
> "/var/lib/airflow/venv/lib/python3.6/site-packages/apache_airflow-1.10.0.dev0+incubating-py3.6.egg/airflow/logging_config.py",
>  line 76, in configure_logging
>  raise e
>  File 
> "/var/lib/airflow/venv/lib/python3.6/site-packages/apache_airflow-1.10.0.dev0+incubating-py3.6.egg/airflow/logging_config.py",
>  line 71, in configure_logging
>  dictConfig(logging_config)
>  File "/usr/lib/python3.6/logging/config.py", line 795, in dictConfig
>  dictConfigClass(config).configure()
>  File "/usr/lib/python3.6/logging/config.py", line 566, in configure
>  '%r: %s' % (name, e))
> ValueError: Unable to configure handler 'processor': argument of type 
> 'NoneType' is not iterable
> (venv) johnar@netdev1-westus2:~/github/incubator-airflow$ airflow scheduler
> Unable to load the config, contains a configuration error.
> Traceback (most recent call last):
>  File "/usr/lib/python3.6/logging/config.py", line 382, in resolve
>  found = getattr(found, frag)
> AttributeError: module 'airflow.utils.log' has no attribute 
> 'wasb_task_handler'
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>  File "/usr/lib/python3.6/logging/config.py", line 558, in configure
>  handler = self.configure_handler(handlers[name])
>  File 

[jira] [Issue Comment Deleted] (AIRFLOW-1775) Remote file handler for logging

2018-03-22 Thread John Arnold (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Arnold updated AIRFLOW-1775:
-
Comment: was deleted

(was: in airflow/models.py:

if 'mysql' in settings.SQL_ALCHEMY_CONN:
 LongText = LONGTEXT
else:
 LongText = Text

This is not guarded at all, you can't iterate settings.SQL_ALCHEMY_CONN if it's 
empty. You get a TypeError.)

> Remote file handler for logging
> ---
>
> Key: AIRFLOW-1775
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1775
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: logging
>Reporter: Niels Zeilemaker
>Priority: Major
>
> We're using a mounted Azure file share to store our log files. Currently, 
> Airflow writes its logs directly to that file share. However, this makes the 
> solution quite expensive, as Azure file shares charge per action.
> If we had a remote_file_task_logging handler, we could mimic 
> s3_task_logging and copy the file to the file share in the close method, 
> reducing the cost while still providing persistent storage.
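A handler along those lines could be sketched as follows. This is a hypothetical illustration (the class name and constructor arguments are invented here, and it is not the actual s3_task_logging implementation): log locally while the task runs, then copy the file once to the mounted share on close.

```python
import logging
import shutil
from pathlib import Path


class RemoteCopyFileHandler(logging.FileHandler):
    """Hypothetical sketch of the proposed remote-file logging handler:
    write to a local file during the task, copy once to the (mounted)
    share on close, so the share sees one write instead of one per record."""

    def __init__(self, filename, remote_dir):
        super().__init__(filename)
        self.remote_dir = Path(remote_dir)

    def close(self):
        super().close()  # flush and release the local file first
        self.remote_dir.mkdir(parents=True, exist_ok=True)
        # One copy action per task, instead of one action per log call.
        shutil.copy(self.baseFilename, self.remote_dir / Path(self.baseFilename).name)
```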



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-1775) Remote file handler for logging

2018-03-22 Thread John Arnold (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16410486#comment-16410486
 ] 

John Arnold commented on AIRFLOW-1775:
--

in airflow/models.py:

if 'mysql' in settings.SQL_ALCHEMY_CONN:
    LongText = LONGTEXT
else:
    LongText = Text

This is not guarded at all: you can't run a membership test against 
settings.SQL_ALCHEMY_CONN when it is None, so you get a TypeError.

> Remote file handler for logging
> ---
>
> Key: AIRFLOW-1775
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1775
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: logging
>Reporter: Niels Zeilemaker
>Priority: Major
>
> We're using a mounted Azure file share to store our log files. Currently, 
> Airflow writes its logs directly to that file share. However, this makes the 
> solution quite expensive, as Azure file shares charge per action.
> If we had a remote_file_task_logging handler, we could mimic 
> s3_task_logging and copy the file to the file share in the close method, 
> reducing the cost while still providing persistent storage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-2244) Exception importing Azure (wasb) log handler

2018-03-22 Thread John Arnold (JIRA)
John Arnold created AIRFLOW-2244:


 Summary: Exception importing Azure (wasb) log handler
 Key: AIRFLOW-2244
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2244
 Project: Apache Airflow
  Issue Type: Bug
  Components: core
Reporter: John Arnold


Attempting to configure remote logging to Azure storage, I get import 
exceptions while trying to load the wasb handler:



Unable to load the config, contains a configuration error.
Traceback (most recent call last):
 File "/usr/lib/python3.6/logging/config.py", line 382, in resolve
 found = getattr(found, frag)
AttributeError: module 'airflow.utils.log' has no attribute 'wasb_task_handler'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
 File "/usr/lib/python3.6/logging/config.py", line 558, in configure
 handler = self.configure_handler(handlers[name])
 File "/usr/lib/python3.6/logging/config.py", line 708, in configure_handler
 klass = self.resolve(cname)
 File "/usr/lib/python3.6/logging/config.py", line 384, in resolve
 self.importer(used)
 File 
"/var/lib/airflow/venv/lib/python3.6/site-packages/apache_airflow-1.10.0.dev0+incubating-py3.6.egg/airflow/utils/log/wasb_task_handler.py",
 line 18, in 
 from airflow.contrib.hooks.wasb_hook import WasbHook
 File 
"/var/lib/airflow/venv/lib/python3.6/site-packages/apache_airflow-1.10.0.dev0+incubating-py3.6.egg/airflow/contrib/hooks/wasb_hook.py",
 line 16, in 
 from airflow.hooks.base_hook import BaseHook
 File 
"/var/lib/airflow/venv/lib/python3.6/site-packages/apache_airflow-1.10.0.dev0+incubating-py3.6.egg/airflow/hooks/base_hook.py",
 line 24, in 
 from airflow.models import Connection
 File 
"/var/lib/airflow/venv/lib/python3.6/site-packages/apache_airflow-1.10.0.dev0+incubating-py3.6.egg/airflow/models.py",
 line 119, in 
 if 'mysql' in settings.SQL_ALCHEMY_CONN:
TypeError: argument of type 'NoneType' is not iterable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
 File "/var/lib/airflow/venv/bin/airflow", line 4, in 
 
__import__('pkg_resources').run_script('apache-airflow==1.10.0.dev0+incubating',
 'airflow')
 File 
"/var/lib/airflow/venv/lib/python3.6/site-packages/pkg_resources/__init__.py", 
line 658, in run_script
 self.require(requires)[0].run_script(script_name, ns)
 File 
"/var/lib/airflow/venv/lib/python3.6/site-packages/pkg_resources/__init__.py", 
line 1438, in run_script
 exec(code, namespace, namespace)
 File 
"/var/lib/airflow/venv/lib/python3.6/site-packages/apache_airflow-1.10.0.dev0+incubating-py3.6.egg/EGG-INFO/scripts/airflow",
 line 16, in 
 from airflow import configuration
 File 
"/var/lib/airflow/venv/lib/python3.6/site-packages/apache_airflow-1.10.0.dev0+incubating-py3.6.egg/airflow/__init__.py",
 line 31, in 
 from airflow import settings
 File 
"/var/lib/airflow/venv/lib/python3.6/site-packages/apache_airflow-1.10.0.dev0+incubating-py3.6.egg/airflow/settings.py",
 line 198, in 
 configure_logging()
 File 
"/var/lib/airflow/venv/lib/python3.6/site-packages/apache_airflow-1.10.0.dev0+incubating-py3.6.egg/airflow/logging_config.py",
 line 76, in configure_logging
 raise e
 File 
"/var/lib/airflow/venv/lib/python3.6/site-packages/apache_airflow-1.10.0.dev0+incubating-py3.6.egg/airflow/logging_config.py",
 line 71, in configure_logging
 dictConfig(logging_config)
 File "/usr/lib/python3.6/logging/config.py", line 795, in dictConfig
 dictConfigClass(config).configure()
 File "/usr/lib/python3.6/logging/config.py", line 566, in configure
 '%r: %s' % (name, e))
ValueError: Unable to configure handler 'processor': argument of type 
'NoneType' is not iterable
(venv) johnar@netdev1-westus2:~/github/incubator-airflow$ airflow scheduler
Unable to load the config, contains a configuration error.
Traceback (most recent call last):
 File "/usr/lib/python3.6/logging/config.py", line 382, in resolve
 found = getattr(found, frag)
AttributeError: module 'airflow.utils.log' has no attribute 'wasb_task_handler'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
 File "/usr/lib/python3.6/logging/config.py", line 558, in configure
 handler = self.configure_handler(handlers[name])
 File "/usr/lib/python3.6/logging/config.py", line 708, in configure_handler
 klass = self.resolve(cname)
 File "/usr/lib/python3.6/logging/config.py", line 384, in resolve
 self.importer(used)
 File 
"/var/lib/airflow/venv/lib/python3.6/site-packages/apache_airflow-1.10.0.dev0+incubating-py3.6.egg/airflow/utils/log/wasb_task_handler.py",
 line 18, in 
 from airflow.contrib.hooks.wasb_hook import WasbHook
 File 
"/var/lib/airflow/venv/lib/python3.6/site-packages/apache_airflow-1.10.0.dev0+incubating-py3.6.egg/airflow/contrib/hooks/wasb_hook.py",
 line 16, in 
 from airflow.hooks.base_hook import BaseHook
 File 

[jira] [Resolved] (AIRFLOW-2235) Fix wrong docstrings in Transfer operators for MySQL

2018-03-22 Thread Joy Gao (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joy Gao resolved AIRFLOW-2235.
--
   Resolution: Fixed
Fix Version/s: 1.10.0

Issue resolved by pull request #3147
[https://github.com/apache/incubator-airflow/pull/3147]

> Fix wrong docstrings in Transfer operators for MySQL
> 
>
> Key: AIRFLOW-2235
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2235
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: docs, Documentation
>Reporter: Kengo Seki
>Assignee: Tao Feng
>Priority: Minor
> Fix For: 1.10.0
>
>
> Docstrings in HiveToMySqlTransfer and PrestoToMySqlTransfer say:
> {code}
> :param sql: SQL query to execute against the MySQL database
> {code}
> but actually these queries are executed against Hive and Presto respectively.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[2/2] incubator-airflow git commit: Merge pull request #3147 from feng-tao/airflow-2235

2018-03-22 Thread joygao
Merge pull request #3147 from feng-tao/airflow-2235


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/bd010048
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/bd010048
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/bd010048

Branch: refs/heads/master
Commit: bd010048b3351a88152d0e5661c8ee99d31412fb
Parents: 7e762d4 21152ef
Author: Joy Gao 
Authored: Thu Mar 22 14:41:59 2018 -0700
Committer: Joy Gao 
Committed: Thu Mar 22 14:41:59 2018 -0700

--
 airflow/operators/hive_to_mysql.py   | 2 +-
 airflow/operators/presto_to_mysql.py | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
--




[1/2] incubator-airflow git commit: [airflow-2235] Fix wrong docstrings in two operators

2018-03-22 Thread joygao
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 7e762d42d -> bd010048b


[airflow-2235] Fix wrong docstrings in two operators


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/21152efb
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/21152efb
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/21152efb

Branch: refs/heads/master
Commit: 21152efb7d84a249e749f4bc821ebb1051e65324
Parents: 8754cb1
Author: Tao feng 
Authored: Wed Mar 21 21:54:32 2018 -0700
Committer: Tao feng 
Committed: Wed Mar 21 21:54:32 2018 -0700

--
 airflow/operators/hive_to_mysql.py   | 2 +-
 airflow/operators/presto_to_mysql.py | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/21152efb/airflow/operators/hive_to_mysql.py
--
diff --git a/airflow/operators/hive_to_mysql.py 
b/airflow/operators/hive_to_mysql.py
index d2d9d0c..6962d31 100644
--- a/airflow/operators/hive_to_mysql.py
+++ b/airflow/operators/hive_to_mysql.py
@@ -25,7 +25,7 @@ class HiveToMySqlTransfer(BaseOperator):
 into memory before being pushed to MySQL, so this operator should
 be used for smallish amount of data.
 
-:param sql: SQL query to execute against the MySQL database
+:param sql: SQL query to execute against Hive server
 :type sql: str
 :param mysql_table: target MySQL table, use dot notation to target a
 specific database

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/21152efb/airflow/operators/presto_to_mysql.py
--
diff --git a/airflow/operators/presto_to_mysql.py 
b/airflow/operators/presto_to_mysql.py
index d0c323a..116391d 100644
--- a/airflow/operators/presto_to_mysql.py
+++ b/airflow/operators/presto_to_mysql.py
@@ -23,7 +23,7 @@ class PrestoToMySqlTransfer(BaseOperator):
 into memory before being pushed to MySQL, so this operator should
 be used for smallish amount of data.
 
-:param sql: SQL query to execute against the MySQL database
+:param sql: SQL query to execute against Presto
 :type sql: str
 :param mysql_table: target MySQL table, use dot notation to target a
 specific database



[jira] [Created] (AIRFLOW-2243) airflow seems to load modules multiple times

2018-03-22 Thread sulphide (JIRA)
sulphide created AIRFLOW-2243:
-

 Summary: airflow seems to load modules multiple times
 Key: AIRFLOW-2243
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2243
 Project: Apache Airflow
  Issue Type: Bug
Reporter: sulphide


Airflow uses the builtin imp.load_source to load modules, but the documentation 
(for Python 2) says:

[https://docs.python.org/2/library/imp.html#imp.load_source]

{{imp.load_source}}(_name_, _pathname_[, _file_])

Load and initialize a module implemented as a Python source file and return its 
module object. If the module was already initialized, it will be initialized 
_again_. The _name_ argument is used to create or access a module object. 
The _pathname_ argument points to the source file. The _file_ argument is the 
source file, open for reading as text, from the beginning. It must currently be 
a real file object, not a user-defined class emulating a file. Note that if a 
properly matching byte-compiled file (with suffix {{.pyc}} or {{.pyo}}) exists, 
it will be used instead of parsing the given source file.

This means that Airflow behaves differently from a typical Python program: a 
module may be imported and re-executed multiple times, which can have 
unexpected effects for code that relies on the usual import-once semantics.

https://github.com/apache/incubator-airflow/blob/master/airflow/models.py#L300
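The usual import-once behaviour can be restored with a small wrapper that consults sys.modules before loading. A hedged sketch, using importlib rather than the deprecated imp module (`load_source_once` is a hypothetical helper, not Airflow code):

```python
import sys
import importlib.util


def load_source_once(name, pathname):
    """Load a module from a source file, reusing any already-imported copy.

    Unlike imp.load_source, this returns the cached module from sys.modules
    instead of re-initializing it, matching normal import semantics.
    """
    if name in sys.modules:          # normal import semantics: load once
        return sys.modules[name]
    spec = importlib.util.spec_from_file_location(name, pathname)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module       # register before exec, as import does
    spec.loader.exec_module(module)  # run the module body exactly once
    return module
```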



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-1235) Odd behaviour when all gunicorn workers die

2018-03-22 Thread Arthur Wiedmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arthur Wiedmer resolved AIRFLOW-1235.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

Issue resolved by pull request #2330
[https://github.com/apache/incubator-airflow/pull/2330]

> Odd behaviour when all gunicorn workers die
> ---
>
> Key: AIRFLOW-1235
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1235
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: webserver
>Affects Versions: 1.8.0
>Reporter: Erik Forsberg
>Assignee: Kengo Seki
>Priority: Major
> Fix For: 2.0.0
>
>
> The webserver has sometimes stopped responding on port 443, and today I found 
> the issue - a misconfigured resolv.conf prevented it from talking to my 
> PostgreSQL server. That was the root cause, but the way the airflow webserver 
> behaved was a bit odd.
> It seems that when all gunicorn workers failed to start, the gunicorn master 
> shut down. However, the main process (the one that starts the gunicorn 
> master) did not, so there was no way to detect the webserver's failed state 
> from e.g. systemd or an init script.
> Full traceback leading to stale webserver process:
> {noformat}
> May 21 09:51:57 airmaster01 airflow[26451]: [2017-05-21 09:51:57 +] 
> [23794] [ERROR] Exception in worker process:
> May 21 09:51:57 airmaster01 airflow[26451]: Traceback (most recent call last):
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 1122, in _do_get
> May 21 09:51:57 airmaster01 airflow[26451]: return self._pool.get(wait, 
> self._timeout)
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/util/queue.py",
>  line 145, in get
> May 21 09:51:57 airmaster01 airflow[26451]: raise Empty
> May 21 09:51:57 airmaster01 airflow[26451]: sqlalchemy.util.queue.Empty
> May 21 09:51:57 airmaster01 airflow[26451]: During handling of the above 
> exception, another exception occurred:
> May 21 09:51:57 airmaster01 airflow[26451]: Traceback (most recent call last):
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/engine/base.py",
>  line 2147, in _wrap_pool_connect
> May 21 09:51:57 airmaster01 airflow[26451]: return fn()
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 387, in connect
> May 21 09:51:57 airmaster01 airflow[26451]: return 
> _ConnectionFairy._checkout(self)
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 766, in _checkout
> May 21 09:51:57 airmaster01 airflow[26451]: fairy = 
> _ConnectionRecord.checkout(pool)
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 516, in checkout
> May 21 09:51:57 airmaster01 airflow[26451]: rec = pool._do_get()
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 1138, in _do_get
> May 21 09:51:57 airmaster01 airflow[26451]: self._dec_overflow()
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/util/langhelpers.py",
>  line 66, in __exit__
> May 21 09:51:57 airmaster01 airflow[26451]: compat.reraise(exc_type, 
> exc_value, exc_tb)
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/util/compat.py",
>  line 187, in reraise
> May 21 09:51:57 airmaster01 airflow[26451]: raise value
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 1135, in _do_get
> May 21 09:51:57 airmaster01 airflow[26451]: return self._create_connection()
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 333, in _create_connection
> May 21 09:51:57 airmaster01 airflow[26451]: return _ConnectionRecord(self)
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 461, in __init__
> May 21 09:51:57 airmaster01 airflow[26451]: 
> self.__connect(first_connect_check=True)
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 651, in __connect
> May 21 09:51:57 airmaster01 airflow[26451]: connection = 
> pool._invoke_creator(self)
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> 

[jira] [Commented] (AIRFLOW-1235) Odd behaviour when all gunicorn workers die

2018-03-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-1235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16410113#comment-16410113
 ] 

ASF subversion and git services commented on AIRFLOW-1235:
--

Commit 7e762d42df50d84e4740e15c24594c50aaab53a2 in incubator-airflow's branch 
refs/heads/master from [~sekikn]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=7e762d4 ]

[AIRFLOW-1235] Fix webserver's odd behaviour

In some cases, the gunicorn master shuts down
but the webserver monitor process doesn't.
This PR adds timeout functionality to shut down
all related processes in such cases.

Dear Airflow maintainers,

Please accept this PR. I understand that it will
not be reviewed until I have checked off all the
steps below!

### JIRA
- [x] My PR addresses the following [Airflow JIRA]
(https://issues.apache.org/jira/browse/AIRFLOW/)
issues and references them in the PR title. For
example, "[AIRFLOW-XXX] My Airflow PR"
-
https://issues.apache.org/jira/browse/AIRFLOW-1235

### Description
- [x] Here are some details about my PR, including
screenshots of any UI changes:

In some cases, the gunicorn master shuts down
but the webserver monitor process doesn't.
This PR adds timeout functionality to shut down
all related processes in such cases.

### Tests
- [x] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason:

tests.core:CliTests.test_cli_webserver_shutdown_wh
en_gunicorn_master_is_killed

### Commits
- [x] My commits all reference JIRA issues in
their subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"[How to write a good git commit
message](http://chris.beams.io/posts/git-
commit/)":
1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not
"adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"

Closes #2330 from sekikn/AIRFLOW-1235
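The behaviour described above could be sketched roughly as below. This is a hypothetical illustration of a monitor-with-timeout loop (function name, polling interval, and timeout are invented), not the code from PR #2330:

```python
import os
import time


def monitor_gunicorn(master_pid, poll_interval=1.0, restart_timeout=120.0):
    """Poll the gunicorn master; give up once it has been gone too long.

    Returning a nonzero status lets the wrapping process exit, so systemd
    or an init script can observe the failure instead of a stale monitor.
    """
    deadline = None
    while True:
        try:
            os.kill(master_pid, 0)   # signal 0: existence check only
            deadline = None          # master is alive, reset the timer
        except OSError:
            if deadline is None:
                deadline = time.monotonic() + restart_timeout
            elif time.monotonic() > deadline:
                return 1             # master never came back: shut down too
        time.sleep(poll_interval)
```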


> Odd behaviour when all gunicorn workers die
> ---
>
> Key: AIRFLOW-1235
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1235
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: webserver
>Affects Versions: 1.8.0
>Reporter: Erik Forsberg
>Assignee: Kengo Seki
>Priority: Major
> Fix For: 2.0.0
>
>
> The webserver has sometimes stopped responding to port 443, and today I found 
> the issue - I had a misconfigured resolv.conf that made it unable to talk to 
> my postgresql. This was the root cause, but the way airflow webserver behaved 
> was a bit odd.
> It seems that when all gunicorn workers failed to start, the gunicorn master 
> shut down. However, the main process (the one that starts gunicorn master) 
> did not shut down, so there was no way of detecting the failed status of 
> webserver from e.g. systemd or init script.
> Full traceback leading to stale webserver process:
> {noformat}
> May 21 09:51:57 airmaster01 airflow[26451]: [2017-05-21 09:51:57 +] 
> [23794] [ERROR] Exception in worker process:
> May 21 09:51:57 airmaster01 airflow[26451]: Traceback (most recent call last):
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 1122, in _do_get
> May 21 09:51:57 airmaster01 airflow[26451]: return self._pool.get(wait, 
> self._timeout)
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/util/queue.py",
>  line 145, in get
> May 21 09:51:57 airmaster01 airflow[26451]: raise Empty
> May 21 09:51:57 airmaster01 airflow[26451]: sqlalchemy.util.queue.Empty
> May 21 09:51:57 airmaster01 airflow[26451]: During handling of the above 
> exception, another exception occurred:
> May 21 09:51:57 airmaster01 airflow[26451]: Traceback (most recent call last):
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/engine/base.py",
>  line 2147, in _wrap_pool_connect
> May 21 09:51:57 airmaster01 airflow[26451]: return fn()
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 387, in connect
> May 21 09:51:57 airmaster01 airflow[26451]: return 
> _ConnectionFairy._checkout(self)
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 766, in _checkout
> May 21 09:51:57 airmaster01 airflow[26451]: fairy = 
> _ConnectionRecord.checkout(pool)
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> 


[jira] [Commented] (AIRFLOW-1235) Odd behaviour when all gunicorn workers die

2018-03-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-1235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16410112#comment-16410112
 ] 

ASF subversion and git services commented on AIRFLOW-1235:
--

Commit 7e762d42df50d84e4740e15c24594c50aaab53a2 in incubator-airflow's branch 
refs/heads/master from [~sekikn]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=7e762d4 ]

[AIRFLOW-1235] Fix webserver's odd behaviour

In some cases, the gunicorn master shuts down
but the webserver monitor process doesn't.
This PR adds timeout functionality to shut down
all related processes in such cases.

Dear Airflow maintainers,

Please accept this PR. I understand that it will
not be reviewed until I have checked off all the
steps below!

### JIRA
- [x] My PR addresses the following [Airflow JIRA]
(https://issues.apache.org/jira/browse/AIRFLOW/)
issues and references them in the PR title. For
example, "[AIRFLOW-XXX] My Airflow PR"
-
https://issues.apache.org/jira/browse/AIRFLOW-1235

### Description
- [x] Here are some details about my PR, including
screenshots of any UI changes:

In some cases, the gunicorn master shuts down
but the webserver monitor process doesn't.
This PR adds timeout functionality to shut down
all related processes in such cases.

### Tests
- [x] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason:

tests.core:CliTests.test_cli_webserver_shutdown_when_gunicorn_master_is_killed

### Commits
- [x] My commits all reference JIRA issues in
their subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"[How to write a good git commit
message](http://chris.beams.io/posts/git-commit/)":
1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not
"adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"

Closes #2330 from sekikn/AIRFLOW-1235


> Odd behaviour when all gunicorn workers die
> ---
>
> Key: AIRFLOW-1235
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1235
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: webserver
>Affects Versions: 1.8.0
>Reporter: Erik Forsberg
>Assignee: Kengo Seki
>Priority: Major
> Fix For: 2.0.0
>
>
> The webserver has sometimes stopped responding to port 443, and today I found 
> the issue - I had a misconfigured resolv.conf that made it unable to talk to 
> my postgresql. This was the root cause, but the way airflow webserver behaved 
> was a bit odd.
> It seems that when all gunicorn workers failed to start, the gunicorn master 
> shut down. However, the main process (the one that starts gunicorn master) 
> did not shut down, so there was no way of detecting the failed status of 
> webserver from e.g. systemd or init script.
> Full traceback leading to stale webserver process:
> {noformat}
> May 21 09:51:57 airmaster01 airflow[26451]: [2017-05-21 09:51:57 +] 
> [23794] [ERROR] Exception in worker process:
> May 21 09:51:57 airmaster01 airflow[26451]: Traceback (most recent call last):
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 1122, in _do_get
> May 21 09:51:57 airmaster01 airflow[26451]: return self._pool.get(wait, 
> self._timeout)
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/util/queue.py",
>  line 145, in get
> May 21 09:51:57 airmaster01 airflow[26451]: raise Empty
> May 21 09:51:57 airmaster01 airflow[26451]: sqlalchemy.util.queue.Empty
> May 21 09:51:57 airmaster01 airflow[26451]: During handling of the above 
> exception, another exception occurred:
> May 21 09:51:57 airmaster01 airflow[26451]: Traceback (most recent call last):
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/engine/base.py",
>  line 2147, in _wrap_pool_connect
> May 21 09:51:57 airmaster01 airflow[26451]: return fn()
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 387, in connect
> May 21 09:51:57 airmaster01 airflow[26451]: return 
> _ConnectionFairy._checkout(self)
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> "/opt/airflow/production/lib/python3.4/site-packages/sqlalchemy/pool.py", 
> line 766, in _checkout
> May 21 09:51:57 airmaster01 airflow[26451]: fairy = 
> _ConnectionRecord.checkout(pool)
> May 21 09:51:57 airmaster01 airflow[26451]: File 
> 

incubator-airflow git commit: [AIRFLOW-1235] Fix webserver's odd behaviour

2018-03-22 Thread arthur
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 8c42d03c4 -> 7e762d42d


[AIRFLOW-1235] Fix webserver's odd behaviour

In some cases, the gunicorn master shuts down
but the webserver monitor process doesn't.
This PR adds timeout functionality to shut down
all related processes in such cases.

Dear Airflow maintainers,

Please accept this PR. I understand that it will
not be reviewed until I have checked off all the
steps below!

### JIRA
- [x] My PR addresses the following [Airflow JIRA]
(https://issues.apache.org/jira/browse/AIRFLOW/)
issues and references them in the PR title. For
example, "[AIRFLOW-XXX] My Airflow PR"
-
https://issues.apache.org/jira/browse/AIRFLOW-1235

### Description
- [x] Here are some details about my PR, including
screenshots of any UI changes:

In some cases, the gunicorn master shuts down
but the webserver monitor process doesn't.
This PR adds timeout functionality to shut down
all related processes in such cases.

### Tests
- [x] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason:

tests.core:CliTests.test_cli_webserver_shutdown_when_gunicorn_master_is_killed

### Commits
- [x] My commits all reference JIRA issues in
their subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"[How to write a good git commit
message](http://chris.beams.io/posts/git-commit/)":
1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not
"adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"

Closes #2330 from sekikn/AIRFLOW-1235


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/7e762d42
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/7e762d42
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/7e762d42

Branch: refs/heads/master
Commit: 7e762d42df50d84e4740e15c24594c50aaab53a2
Parents: 8c42d03
Author: Kengo Seki 
Authored: Thu Mar 22 11:50:27 2018 -0700
Committer: Arthur Wiedmer 
Committed: Thu Mar 22 11:50:27 2018 -0700

--
 airflow/bin/cli.py   | 129 +-
 airflow/config_templates/default_airflow.cfg |   3 +
 airflow/exceptions.py|   4 +
 tests/core.py|  10 ++
 4 files changed, 91 insertions(+), 55 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/7e762d42/airflow/bin/cli.py
--
diff --git a/airflow/bin/cli.py b/airflow/bin/cli.py
index 449d8ca..1801cc7 100755
--- a/airflow/bin/cli.py
+++ b/airflow/bin/cli.py
@@ -46,7 +46,7 @@ import airflow
 from airflow import api
 from airflow import jobs, settings
 from airflow import configuration as conf
-from airflow.exceptions import AirflowException
+from airflow.exceptions import AirflowException, AirflowWebServerTimeout
 from airflow.executors import GetDefaultExecutor
 from airflow.models import (DagModel, DagBag, TaskInstance,
 DagPickle, DagRun, Variable, DagStat,
@@ -592,7 +592,12 @@ def get_num_ready_workers_running(gunicorn_master_proc):
 return len(ready_workers)
 
 
-def restart_workers(gunicorn_master_proc, num_workers_expected):
+def get_num_workers_running(gunicorn_master_proc):
+workers = psutil.Process(gunicorn_master_proc.pid).children()
+return len(workers)
+
+
+def restart_workers(gunicorn_master_proc, num_workers_expected, 
master_timeout):
 """
 Runs forever, monitoring the child processes of @gunicorn_master_proc and
 restarting workers occasionally.
@@ -618,17 +623,18 @@ def restart_workers(gunicorn_master_proc, 
num_workers_expected):
 gracefully and that the oldest worker is terminated.
 """
 
-def wait_until_true(fn):
+def wait_until_true(fn, timeout=0):
 """
 Sleeps until fn is true
 """
+t = time.time()
 while not fn():
+if 0 < timeout and timeout <= time.time() - t:
+raise AirflowWebServerTimeout(
+"No response from gunicorn master within {0} seconds"
+.format(timeout))
 time.sleep(0.1)
 
-def get_num_workers_running(gunicorn_master_proc):
-workers = psutil.Process(gunicorn_master_proc.pid).children()
-return len(workers)
-
 def start_refresh(gunicorn_master_proc):
 batch_size = conf.getint('webserver', 'worker_refresh_batch_size')
 log.debug('%s doing a refresh of %s workers', state, 
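The timeout behaviour introduced in the diff above can be sketched stand-alone. This is a minimal reconstruction, not the exact Airflow code: `WebServerTimeout` is a local stand-in for `AirflowWebServerTimeout`, and the predicate passed in is a dummy.

```python
import time

class WebServerTimeout(Exception):
    """Stand-in for airflow.exceptions.AirflowWebServerTimeout."""
    pass

def wait_until_true(fn, timeout=0):
    """Sleep until fn() is true; raise once `timeout` seconds (if > 0) elapse."""
    t = time.time()
    while not fn():
        if 0 < timeout <= time.time() - t:
            raise WebServerTimeout(
                "No response from gunicorn master within {0} seconds"
                .format(timeout))
        time.sleep(0.1)

# A predicate that never becomes true must now time out instead of
# blocking forever -- which is the fix for the stale-monitor bug.
try:
    wait_until_true(lambda: False, timeout=0.3)
    timed_out = False
except WebServerTimeout:
    timed_out = True
print(timed_out)  # True
```

With `timeout=0` (the default) the helper keeps the old block-forever behaviour, so existing call sites are unaffected unless they opt in.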

[jira] [Closed] (AIRFLOW-2241) Set pool slot to 0 if the user specified pool doesn't exist

2018-03-22 Thread Tao Feng (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Feng closed AIRFLOW-2241.
-
Resolution: Duplicate

> Set pool slot to 0 if the user specified pool doesn't exist
> ---
>
> Key: AIRFLOW-2241
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2241
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Tao Feng
>Assignee: Tao Feng
>Priority: Major
>
> Currently, if the pool for the user-specified task_instance doesn't exist in 
> the meta-database, an exception is thrown during 
> _find_executable_task_instances. We should skip that task instead of throwing 
> an exception.
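The skip-instead-of-raise behaviour described in the issue can be sketched as follows. The `pools` dict and `find_executable` function are illustrative stand-ins, not Airflow's actual scheduler internals:

```python
# Known pools and their slot counts (hypothetical data).
pools = {"default_pool": 128}

def find_executable(task_instances):
    """Return only task instances whose pool exists; skip the rest."""
    executable = []
    for ti in task_instances:
        if ti["pool"] not in pools:
            continue  # unknown pool: skip this task rather than raise
        executable.append(ti)
    return executable

tis = [{"task_id": "a", "pool": "default_pool"},
       {"task_id": "b", "pool": "missing_pool"}]
print([ti["task_id"] for ti in find_executable(tis)])  # ['a']
```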



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2241) Set pool slot to 0 if the user specified pool doesn't exist

2018-03-22 Thread Tao Feng (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16409924#comment-16409924
 ] 

Tao Feng commented on AIRFLOW-2241:
---

The issue is fixed in AIRFLOW-1157. Close the issue for now.

> Set pool slot to 0 if the user specified pool doesn't exist
> ---
>
> Key: AIRFLOW-2241
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2241
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Tao Feng
>Assignee: Tao Feng
>Priority: Major
>
> Currently, if the pool for the user-specified task_instance doesn't exist in 
> the meta-database, an exception is thrown during 
> _find_executable_task_instances. We should skip that task instead of throwing 
> an exception.





[jira] [Resolved] (AIRFLOW-1067) Should not use airf...@airflow.com in examples

2018-03-22 Thread Xiangrui Meng (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiangrui Meng resolved AIRFLOW-1067.

Resolution: Fixed

> Should not use airf...@airflow.com in examples
> --
>
> Key: AIRFLOW-1067
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1067
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Xiangrui Meng
>Assignee: Xiangrui Meng
>Priority: Minor
>
> airflow.com is owned by a company named Airflow (selling fans, etc). We 
> should use airf...@example.com in all examples.





[jira] [Commented] (AIRFLOW-2231) DAG with a relativedelta schedule_interval fails

2018-03-22 Thread Kyle Brooks (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16409452#comment-16409452
 ] 

Kyle Brooks commented on AIRFLOW-2231:
--

Thanks Joy, I was able to get relativedelta to work by making some slight 
changes to the models.py code. I do think there's a use case for using 
relativedelta instead of cron syntax, because it is more powerful.

I will submit a PR if and when I get approval to share the modifications.

Regarding not being able to reproduce the bug: I would guess that the version 
of dateutil.relativedelta.relativedelta that you have implements __hash__, so 
the DAG class's __init__ does not fail like mine does. I have seen the same 
issue, where the DAG will never be scheduled, with hashable classes, due to 
issues in the section of code you identified in your link above.

> DAG with a relativedelta schedule_interval fails
> 
>
> Key: AIRFLOW-2231
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2231
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG
>Reporter: Kyle Brooks
>Priority: Major
> Attachments: test_reldel.py
>
>
> The documentation for the DAG class says using 
> dateutil.relativedelta.relativedelta as a schedule_interval is supported but 
> it fails:
>  
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 285, 
> in process_file
>     m = imp.load_source(mod_name, filepath)
>   File 
> "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/imp.py",
>  line 172, in load_source
>     module = _load(spec)
>   File "", line 675, in _load
>   File "", line 655, in _load_unlocked
>   File "", line 678, in exec_module
>   File "", line 205, in _call_with_frames_removed
>   File "/Users/k398995/airflow/dags/test_reldel.py", line 33, in 
>     dagrun_timeout=timedelta(minutes=60))
>   File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 2914, 
> in __init__
>     if schedule_interval in cron_presets:
> TypeError: unhashable type: 'relativedelta'
>  
> It looks like the __init__ function for class DAG assumes the 
> schedule_interval is hashable.
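The failure mode and a type-guarded fix can be sketched in isolation. The `Unhashable` class below is a stand-in for `dateutil.relativedelta.relativedelta` (so the sketch has no third-party dependency), and `is_cron_preset` is a hypothetical helper, not actual Airflow code:

```python
# A small preset table, mimicking airflow's cron_presets dict.
cron_presets = {"@daily": "0 0 * * *", "@hourly": "0 * * * *"}

class Unhashable:
    """Stand-in for relativedelta: instances cannot be hashed."""
    __hash__ = None

def is_cron_preset(schedule_interval):
    # Guard with an isinstance check first: only strings can be preset
    # keys, and the dict membership test raises TypeError for
    # unhashable values like relativedelta.
    return (isinstance(schedule_interval, str)
            and schedule_interval in cron_presets)

# The unguarded check (as in DAG.__init__) raises TypeError:
try:
    Unhashable() in cron_presets
    raised = False
except TypeError:
    raised = True

print(raised)                        # True
print(is_cron_preset("@daily"))      # True
print(is_cron_preset(Unhashable()))  # False, no exception
```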





[jira] [Created] (AIRFLOW-2242) Add Gousto to companies list

2018-03-22 Thread Dejan (JIRA)
Dejan created AIRFLOW-2242:
--

 Summary: Add Gousto to companies list
 Key: AIRFLOW-2242
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2242
 Project: Apache Airflow
  Issue Type: Task
Reporter: Dejan
Assignee: Dejan








[jira] [Updated] (AIRFLOW-2241) Set pool slot to 0 if the user specified pool doesn't exist

2018-03-22 Thread Tao Feng (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Feng updated AIRFLOW-2241:
--
Summary: Set pool slot to 0 if the user specified pool doesn't exist  (was: 
Set pool slot to 0 if user specifies a pool doesn't exist)

> Set pool slot to 0 if the user specified pool doesn't exist
> ---
>
> Key: AIRFLOW-2241
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2241
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Tao Feng
>Assignee: Tao Feng
>Priority: Major
>
> Currently, if the pool for the user-specified task_instance doesn't exist in 
> the meta-database, an exception is thrown during 
> _find_executable_task_instances. We should skip that task instead of throwing 
> an exception.





[jira] [Resolved] (AIRFLOW-1460) Tasks get stuck in the "removed" state

2018-03-22 Thread Maxime Beauchemin (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxime Beauchemin resolved AIRFLOW-1460.

Resolution: Fixed
  Assignee: Maxime Beauchemin  (was: George Leslie-Waksman)

> Tasks get stuck in the "removed" state
> --
>
> Key: AIRFLOW-1460
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1460
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: George Leslie-Waksman
>Assignee: Maxime Beauchemin
>Priority: Major
> Fix For: 1.9.1
>
>
> The current handling of task instances that get assigned the state "removed" 
> is that they get ignored.
> If the underlying task later gets re-added, the existing task instances will 
> prevent the task from running in the future.





[jira] [Commented] (AIRFLOW-1460) Tasks get stuck in the "removed" state

2018-03-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-1460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16409138#comment-16409138
 ] 

ASF subversion and git services commented on AIRFLOW-1460:
--

Commit 8c42d03c4e35a0046e46f0e2e6db588702ee7e8b in incubator-airflow's branch 
refs/heads/master from [~tronbabylove]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=8c42d03 ]

[AIRFLOW-1460] Allow restoration of REMOVED TI's

When a task instance exists in the database but
its corresponding task
no longer exists in the DAG, the scheduler marks
the task instance as
REMOVED. Once removed, task instances stayed
removed forever, even if
the task were to be added back to the DAG.

This change allows for the restoration of REMOVED
task instances. If a
task instance is in state REMOVED but the
corresponding task is present
in the DAG, restore the task instance by setting
its state to NONE.

A new unit test simulates the removal and
restoration of a task from a
DAG and verifies that the task instance is
restored:
`./run_unit_tests.sh tests.models:DagRunTest`

JIRA:
https://issues.apache.org/jira/browse/AIRFLOW-1460

Closes #3137 from astahlman/airflow-1460-restore-tis


> Tasks get stuck in the "removed" state
> --
>
> Key: AIRFLOW-1460
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1460
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: George Leslie-Waksman
>Assignee: George Leslie-Waksman
>Priority: Major
> Fix For: 1.9.1
>
>
> The current handling of task instances that get assigned the state "removed" 
> is that they get ignored.
> If the underlying task later gets re-added, the existing task instances will 
> prevent the task from running in the future.





incubator-airflow git commit: [AIRFLOW-1460] Allow restoration of REMOVED TI's

2018-03-22 Thread maximebeauchemin
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 8754cb1c6 -> 8c42d03c4


[AIRFLOW-1460] Allow restoration of REMOVED TI's

When a task instance exists in the database but
its corresponding task
no longer exists in the DAG, the scheduler marks
the task instance as
REMOVED. Once removed, task instances stayed
removed forever, even if
the task were to be added back to the DAG.

This change allows for the restoration of REMOVED
task instances. If a
task instance is in state REMOVED but the
corresponding task is present
in the DAG, restore the task instance by setting
its state to NONE.

A new unit test simulates the removal and
restoration of a task from a
DAG and verifies that the task instance is
restored:
`./run_unit_tests.sh tests.models:DagRunTest`

JIRA:
https://issues.apache.org/jira/browse/AIRFLOW-1460

Closes #3137 from astahlman/airflow-1460-restore-tis


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/8c42d03c
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/8c42d03c
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/8c42d03c

Branch: refs/heads/master
Commit: 8c42d03c4e35a0046e46f0e2e6db588702ee7e8b
Parents: 8754cb1
Author: Andrew Stahlman 
Authored: Wed Mar 21 23:54:05 2018 -0700
Committer: Maxime Beauchemin 
Committed: Wed Mar 21 23:54:05 2018 -0700

--
 airflow/models.py | 21 ++---
 tests/models.py   | 24 
 2 files changed, 42 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/8c42d03c/airflow/models.py
--
diff --git a/airflow/models.py b/airflow/models.py
index c1b608a..aa10ad5 100755
--- a/airflow/models.py
+++ b/airflow/models.py
@@ -4905,16 +4905,31 @@ class DagRun(Base, LoggingMixin):
 dag = self.get_dag()
 tis = self.get_task_instances(session=session)
 
-# check for removed tasks
+# check for removed or restored tasks
 task_ids = []
 for ti in tis:
 task_ids.append(ti.task_id)
+task = None
 try:
-dag.get_task(ti.task_id)
+task = dag.get_task(ti.task_id)
 except AirflowException:
-if self.state is not State.RUNNING and not dag.partial:
+if ti.state == State.REMOVED:
+pass  # ti has already been removed, just ignore it
+elif self.state is not State.RUNNING and not dag.partial:
+self.log.warning("Failed to get task '{}' for dag '{}'. "
+ "Marking it as removed.".format(ti, dag))
+Stats.incr(
+"task_removed_from_dag.{}".format(dag.dag_id), 1, 1)
 ti.state = State.REMOVED
 
+is_task_in_dag = task is not None
+should_restore_task = is_task_in_dag and ti.state == State.REMOVED
+if should_restore_task:
+self.log.info("Restoring task '{}' which was previously "
+  "removed from DAG '{}'".format(ti, dag))
+Stats.incr("task_restored_to_dag.{}".format(dag.dag_id), 1, 1)
+ti.state = State.NONE
+
 # check for missing tasks
 for task in six.itervalues(dag.task_dict):
 if task.adhoc:

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/8c42d03c/tests/models.py
--
diff --git a/tests/models.py b/tests/models.py
index 5d8184c..98913af 100644
--- a/tests/models.py
+++ b/tests/models.py
@@ -892,6 +892,30 @@ class DagRunTest(unittest.TestCase):
 self.assertTrue(dagrun.is_backfill)
 self.assertFalse(dagrun2.is_backfill)
 
+def test_removed_task_instances_can_be_restored(self):
+def with_all_tasks_removed(dag):
+return DAG(dag_id=dag.dag_id, start_date=dag.start_date)
+
+dag = DAG('test_task_restoration', start_date=DEFAULT_DATE)
+dag.add_task(DummyOperator(task_id='flaky_task', owner='test'))
+
+dagrun = self.create_dag_run(dag)
+flaky_ti = dagrun.get_task_instances()[0]
+self.assertEquals('flaky_task', flaky_ti.task_id)
+self.assertEquals(State.NONE, flaky_ti.state)
+
+dagrun.dag = with_all_tasks_removed(dag)
+
+dagrun.verify_integrity()
+flaky_ti.refresh_from_db()
+self.assertEquals(State.REMOVED, flaky_ti.state)
+
+dagrun.dag.add_task(DummyOperator(task_id='flaky_task', owner='test'))
+
+dagrun.verify_integrity()
+flaky_ti.refresh_from_db()
+
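The remove-then-restore logic from the models.py diff above can be condensed into a stand-alone sketch. `TaskInstance` and `verify_integrity` here are simplified placeholders, not Airflow's real models:

```python
REMOVED, NONE = "removed", None

class TaskInstance:
    """Minimal stand-in for airflow.models.TaskInstance."""
    def __init__(self, task_id, state=NONE):
        self.task_id = task_id
        self.state = state

def verify_integrity(task_ids_in_dag, tis):
    """Mark TIs whose task left the DAG as REMOVED; restore them to
    NONE if the task reappears."""
    for ti in tis:
        in_dag = ti.task_id in task_ids_in_dag
        if not in_dag and ti.state != REMOVED:
            ti.state = REMOVED  # task vanished from the DAG
        elif in_dag and ti.state == REMOVED:
            ti.state = NONE     # task came back: restore the TI

ti = TaskInstance("flaky_task")
verify_integrity(set(), [ti])            # task removed from the DAG
print(ti.state)                          # removed
verify_integrity({"flaky_task"}, [ti])   # task added back
print(ti.state)                          # None
```

The key change is the restore branch: before this fix, a TI in state REMOVED stayed there forever, even with the task back in the DAG.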

[jira] [Commented] (AIRFLOW-1460) Tasks get stuck in the "removed" state

2018-03-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-1460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16409139#comment-16409139
 ] 

ASF subversion and git services commented on AIRFLOW-1460:
--

Commit 8c42d03c4e35a0046e46f0e2e6db588702ee7e8b in incubator-airflow's branch 
refs/heads/master from [~tronbabylove]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=8c42d03 ]

[AIRFLOW-1460] Allow restoration of REMOVED TI's

When a task instance exists in the database but
its corresponding task
no longer exists in the DAG, the scheduler marks
the task instance as
REMOVED. Once removed, task instances stayed
removed forever, even if
the task were to be added back to the DAG.

This change allows for the restoration of REMOVED
task instances. If a
task instance is in state REMOVED but the
corresponding task is present
in the DAG, restore the task instance by setting
its state to NONE.

A new unit test simulates the removal and
restoration of a task from a
DAG and verifies that the task instance is
restored:
`./run_unit_tests.sh tests.models:DagRunTest`

JIRA:
https://issues.apache.org/jira/browse/AIRFLOW-1460

Closes #3137 from astahlman/airflow-1460-restore-tis


> Tasks get stuck in the "removed" state
> --
>
> Key: AIRFLOW-1460
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1460
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: George Leslie-Waksman
>Assignee: George Leslie-Waksman
>Priority: Major
> Fix For: 1.9.1
>
>
> The current handling of task instances that get assigned the state "removed" 
> is that they get ignored.
> If the underlying task later gets re-added, the existing task instances will 
> prevent the task from running in the future.





[jira] [Updated] (AIRFLOW-1460) Tasks get stuck in the "removed" state

2018-03-22 Thread Maxime Beauchemin (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxime Beauchemin updated AIRFLOW-1460:
---
Fix Version/s: 1.9.1

> Tasks get stuck in the "removed" state
> --
>
> Key: AIRFLOW-1460
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1460
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: George Leslie-Waksman
>Assignee: George Leslie-Waksman
>Priority: Major
> Fix For: 1.9.1
>
>
> The current handling of task instances that get assigned the state "removed" 
> is that they get ignored.
> If the underlying task later gets re-added, the existing task instances will 
> prevent the task from running in the future.





[jira] [Assigned] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-03-22 Thread Tao Feng (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Feng reassigned AIRFLOW-1104:
-

Assignee: Tao Feng

> Concurrency check in scheduler should count queued tasks as well as running
> ---
>
> Key: AIRFLOW-1104
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1104
> Project: Apache Airflow
>  Issue Type: Bug
> Environment: see https://github.com/apache/incubator-airflow/pull/2221
> "Tasks with the QUEUED state should also be counted below, but for now we 
> cannot count them. This is because there is no guarantee that queued tasks in 
> failed dagruns will or will not eventually run and queued tasks that will 
> never run will consume slots and can stall a DAG. Once we can guarantee that 
> all queued tasks in failed dagruns will never run (e.g. make sure that all 
> running/newly queued TIs have running dagruns), then we can include QUEUED 
> tasks here, with the constraint that they are in running dagruns."
>Reporter: Alex Guziel
>Assignee: Tao Feng
>Priority: Minor
>
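The counting rule quoted in the issue (include QUEUED task instances only when their dagrun is running) can be sketched like this; the dict-based `occupied_slots` helper and its data shapes are illustrative, not Airflow's scheduler code:

```python
RUNNING, QUEUED, FAILED = "running", "queued", "failed"

def occupied_slots(task_instances, dagrun_states):
    """Count TIs toward the concurrency limit: all RUNNING TIs, plus
    QUEUED TIs whose dagrun is itself RUNNING."""
    count = 0
    for ti in task_instances:
        if ti["state"] == RUNNING:
            count += 1
        elif (ti["state"] == QUEUED
              and dagrun_states.get(ti["run_id"]) == RUNNING):
            # Queued TIs in failed dagruns may never run, so counting
            # them could stall the DAG; only count queued TIs whose
            # dagrun is still running.
            count += 1
    return count

tis = [{"state": RUNNING, "run_id": "r1"},
       {"state": QUEUED, "run_id": "r1"},
       {"state": QUEUED, "run_id": "r2"}]
print(occupied_slots(tis, {"r1": RUNNING, "r2": FAILED}))  # 2
```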



