[jira] [Reopened] (AIRFLOW-171) Email does not work in 1.7.1.2

2016-06-13 Thread Chris Riccomini (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Riccomini reopened AIRFLOW-171:
--------------------------------------
  Assignee: Hao Ye

> Email does not work in 1.7.1.2
> ------------------------------
>
> Key: AIRFLOW-171
> URL: https://issues.apache.org/jira/browse/AIRFLOW-171
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: Airflow 1.7.1
> Environment: AWS Amazon Linux Image
>Reporter: Hao Ye
>Assignee: Hao Ye
>
> Job failure emails were working in 1.7.0. They seem to have stopped working 
> in 1.7.1.
> The error is:
> {quote}
> [2016-05-25 00:48:02,334] {models.py:1311} ERROR - Failed to send email to: 
> ['em...@email.com']
> [2016-05-25 00:48:02,334] {models.py:1312} ERROR - 'module' object has no 
> attribute 'send_email_smtp'
> Traceback (most recent call last):
>   File "/usr/local/lib/python2.7/site-packages/airflow/models.py", line 1308, 
> in handle_failure
> self.email_alert(error, is_retry=False)
>   File "/usr/local/lib/python2.7/site-packages/airflow/models.py", line 1425, 
> in email_alert
> send_email(task.email, title, body)
>   File "/usr/local/lib/python2.7/site-packages/airflow/utils/email.py", line 
> 42, in send_email
> backend = getattr(module, attr)
> AttributeError: 'module' object has no attribute 'send_email_smtp'
> {quote}
> The file exists and the method exists; the call seems to work fine when 
> invoked directly in Python. Maybe it's loading the wrong email module.
> I tried setting PYTHONPATH so that 
> /usr/local/lib/python2.7/site-packages/airflow comes earlier on the path, 
> but that didn't seem to help either.
> Could this be related to the utils refactoring that happened between 1.7.0 
> and 1.7.1?
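
For context, the failing line in the traceback (backend = getattr(module, attr)) resolves the configured email backend by importing a module path and fetching an attribute from it. Below is a minimal sketch of that lookup with simplified, illustrative names; the real code lives in airflow/utils/email.py.

```
# Hedged sketch of the backend lookup from the traceback above; the
# function name and default path are illustrative simplifications.
import importlib


def resolve_email_backend(path='airflow.utils.email.send_email_smtp'):
    module_path, attr = path.rsplit('.', 1)
    module = importlib.import_module(module_path)
    # If a stale or shadowing `airflow` package wins the import (for
    # example, a leftover 1.7.0 install earlier on sys.path), this
    # getattr raises the AttributeError reported above even though the
    # freshly installed file defines the function.
    return getattr(module, attr)
```

This is consistent with the reporter's hypothesis: the import can succeed against the wrong module, in which case getattr fails even though the on-disk file is correct.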





[jira] [Commented] (AIRFLOW-237) AttributeError: type object 'TaskInstance' has no attribute 'log'

2016-06-13 Thread Chris Riccomini (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15328889#comment-15328889
 ] 

Chris Riccomini commented on AIRFLOW-237:
------------------------------------------

This error seems to be popping up in PRs:

https://github.com/apache/incubator-airflow/pull/1587

That PR's test failed with the same error.

> AttributeError: type object 'TaskInstance' has no attribute 'log'
> -----------------------------------------------------------------
>
> Key: AIRFLOW-237
> URL: https://issues.apache.org/jira/browse/AIRFLOW-237
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: webserver
>Affects Versions: Airflow 1.7.1, Airflow 1.7.0, Airflow 1.7.1.2
>Reporter: Amasa Amos
>Priority: Critical
>
> After following the "Quick Start" instructions found at 
> `http://pythonhosted.org/airflow/start.html`, executing the command `airflow 
> webserver -p 8080` results in the following error:
> ```
> [2016-06-13 19:13:33,536] {__init__.py:36} INFO - Using executor 
> SequentialExecutor
> [2016-06-13 19:13:33,649] {driver.py:120} INFO - Generating grammar tables 
> from /usr/lib/python2.7/lib2to3/Grammar.txt
> [2016-06-13 19:13:33,669] {driver.py:120} INFO - Generating grammar tables 
> from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
> [Airflow ASCII-art startup banner]
> [2016-06-13 19:13:33,875] {models.py:154} INFO - Filling up the DagBag from 
> /home/vagrant/airflow/dags
> Traceback (most recent call last):
>   File "/usr/local/bin/airflow", line 15, in <module>
> args.func(args)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/bin/cli.py", line 393, 
> in webserver
> app = cached_app(conf)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/www/app.py", line 133, 
> in cached_app
> app = create_app(config)
>   File "/usr/local/lib/python2.7/dist-packages/airflow/www/app.py", line 75, 
> in create_app
> Session, name="Task Instances", category="Browse"))
>   File 
> "/usr/local/lib/python2.7/dist-packages/flask_admin/contrib/sqla/view.py", 
> line 318, in __init__
> menu_icon_value=menu_icon_value)
>   File "/usr/local/lib/python2.7/dist-packages/flask_admin/model/base.py", 
> line 771, in __init__
> self._refresh_cache()
>   File "/usr/local/lib/python2.7/dist-packages/flask_admin/model/base.py", 
> line 847, in _refresh_cache
> self._list_columns = self.get_list_columns()
>   File "/usr/local/lib/python2.7/dist-packages/flask_admin/model/base.py", 
> line 980, in get_list_columns
> excluded_columns=self.column_exclude_list,
>   File 
> "/usr/local/lib/python2.7/dist-packages/flask_admin/contrib/sqla/view.py", 
> line 517, in get_column_names
> column, path = tools.get_field_with_path(self.model, c)
>   File 
> "/usr/local/lib/python2.7/dist-packages/flask_admin/contrib/sqla/tools.py", 
> line 144, in get_field_with_path
> value = getattr(current_model, attribute)
> AttributeError: type object 'TaskInstance' has no attribute 'log'
> ```
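
The bottom frame shows what goes wrong: flask-admin resolves every configured column name on the model class with getattr, so a column list naming an attribute the TaskInstance model lacks fails exactly like this. A self-contained sketch with a stand-in class (not Airflow's model or flask-admin's actual code):

```
# Hedged sketch: reproduce the failing attribute lookup with a stand-in.
class TaskInstance(object):
    dag_id = 'example_dag'
    task_id = 'example_task'
    # no `log` attribute, mirroring the model/view mismatch above


# flask_admin.contrib.sqla.tools.get_field_with_path does essentially
# this for every name configured on the view:
for attribute in ('dag_id', 'task_id', 'log'):
    try:
        value = getattr(TaskInstance, attribute)
    except AttributeError as exc:
        print(exc)  # type object 'TaskInstance' has no attribute 'log'
```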





[jira] [Updated] (AIRFLOW-171) Email does not work in 1.7.1.2

2016-06-13 Thread Chris Riccomini (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Riccomini updated AIRFLOW-171:

External issue URL: https://github.com/apache/incubator-airflow/pull/1587

> Email does not work in 1.7.1.2
> ------------------------------
>
> Key: AIRFLOW-171
> URL: https://issues.apache.org/jira/browse/AIRFLOW-171
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: Airflow 1.7.1
> Environment: AWS Amazon Linux Image
>Reporter: Hao Ye
>Assignee: Hao Ye





[jira] [Updated] (AIRFLOW-222) Running task instances should show duration in UI

2016-06-13 Thread Kengo Seki (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kengo Seki updated AIRFLOW-222:
-------------------------------
External issue URL: https://github.com/apache/incubator-airflow/pull/1589

> Running task instances should show duration in UI
> --------------------------------------------------
>
> Key: AIRFLOW-222
> URL: https://issues.apache.org/jira/browse/AIRFLOW-222
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ui, webserver
>Reporter: Dan Davydov
>Assignee: Kengo Seki
>Priority: Minor
>  Labels: newbie, starter
>
> At the moment, mousing over a task instance (square) in the Airflow tree 
> view only shows the duration for completed tasks, and shows "Duration: 
> null" for running tasks. Instead, the UI should show the running task 
> instance's current duration (given by datetime.datetime.now() - 
> ti.start_date).
> Bonus points for incrementing the duration on the client in real time after 
> it fetches the initial duration from the server.
> Even more bonus points for adding a duration field to the admin/airflow/task 
> endpoint.
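
A minimal sketch of the proposed fallback, assuming a task-instance object with the start_date and end_date attributes referenced in the description:

```
# Hedged sketch, not Airflow's actual view code: fall back to elapsed
# time for running task instances instead of returning null.
import datetime


def duration_for_display(ti):
    if ti.start_date and ti.end_date:
        return ti.end_date - ti.start_date              # completed
    if ti.start_date:
        return datetime.datetime.now() - ti.start_date  # still running
    return None                                         # not started yet
```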





[jira] [Commented] (AIRFLOW-93) Allow specifying multiple task execution deltas for ExternalTaskSensors

2016-06-13 Thread Bence Nagy (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15328652#comment-15328652
 ] 

Bence Nagy commented on AIRFLOW-93:
-----------------------------------

Yep, this is indeed a possibility (as mentioned in the issue's description), 
but it's not very nice. I'd be especially uncomfortable making one daily task 
depend on 144 ten-minute tasks to cover an entire day; those sensors would 
all take up valuable scheduling time and execution slots.

> Allow specifying multiple task execution deltas for ExternalTaskSensors
> ------------------------------------------------------------------------
>
> Key: AIRFLOW-93
> URL: https://issues.apache.org/jira/browse/AIRFLOW-93
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: operators
>Affects Versions: Airflow 1.7.0
>Reporter: Bence Nagy
>Assignee: James Ferguson
>Priority: Minor
>
> I have some {{ExternalTaskSensor}}s with a schedule interval of 1 hour, 
> where the task depended upon has a schedule interval of 10 minutes. Right 
> now I'm depending only on the HH:50 execution, but it would be nice if I 
> could specify a range requiring all executions from HH:00 to HH:50 to be 
> successful; otherwise, if the depended-upon tasks are executed out of 
> order, the sensor will pass even though I don't yet have data for the 
> earlier parts of the hour.
> A workaround would be to have one sensor for each 10 minutes of the hour, 
> but that's too nasty for me, especially if my sensor's schedule interval 
> were 1 day.
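
For reference, the per-run workaround dismissed above would look roughly like the sketch below; the import path assumes the 1.7.x layout, and the DAG and task ids are hypothetical.

```
# Hedged sketch of the one-sensor-per-run workaround (6 sensors per
# hour, 144 per day for a daily schedule); ids below are illustrative.
from datetime import timedelta

from airflow.operators.sensors import ExternalTaskSensor

sensors = [
    ExternalTaskSensor(
        task_id='wait_for_minute_%02d' % minute,
        external_dag_id='upstream_10min_dag',
        external_task_id='upstream_task',
        execution_delta=timedelta(minutes=minute),
        dag=dag,  # assumes a `dag` object defined in the same file
    )
    for minute in range(0, 60, 10)
]
```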





[jira] [Assigned] (AIRFLOW-93) Allow specifying multiple task execution deltas for ExternalTaskSensors

2016-06-13 Thread Anonymous (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-93?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anonymous reassigned AIRFLOW-93:


Assignee: Jonas Esser  (was: Bence Nagy)

> Allow specifying multiple task execution deltas for ExternalTaskSensors
> ------------------------------------------------------------------------
>
> Key: AIRFLOW-93
> URL: https://issues.apache.org/jira/browse/AIRFLOW-93
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: operators
>Affects Versions: Airflow 1.7.0
>Reporter: Bence Nagy
>Assignee: Jonas Esser
>Priority: Minor





[jira] [Assigned] (AIRFLOW-93) Allow specifying multiple task execution deltas for ExternalTaskSensors

2016-06-13 Thread Anonymous (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-93?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anonymous reassigned AIRFLOW-93:


Assignee: (was: Bence Nagy)

> Allow specifying multiple task execution deltas for ExternalTaskSensors
> ------------------------------------------------------------------------
>
> Key: AIRFLOW-93
> URL: https://issues.apache.org/jira/browse/AIRFLOW-93
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: operators
>Affects Versions: Airflow 1.7.0
>Reporter: Bence Nagy
>Priority: Minor





[jira] [Assigned] (AIRFLOW-93) Allow specifying multiple task execution deltas for ExternalTaskSensors

2016-06-13 Thread Anonymous (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-93?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anonymous reassigned AIRFLOW-93:


Assignee: Bence Nagy

> Allow specifying multiple task execution deltas for ExternalTaskSensors
> ------------------------------------------------------------------------
>
> Key: AIRFLOW-93
> URL: https://issues.apache.org/jira/browse/AIRFLOW-93
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: operators
>Affects Versions: Airflow 1.7.0
>Reporter: Bence Nagy
>Assignee: Bence Nagy
>Priority: Minor





[incubator-airflow] Git Push Summary

2016-06-13 Thread maximebeauchemin
Repository: incubator-airflow
Updated Tags:  refs/tags/1.7.1.3 891a08368 -> ddf4f7474


[jira] [Commented] (AIRFLOW-230) [HiveServer2Hook] adding multi statements support

2016-06-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15327923#comment-15327923
 ] 

ASF subversion and git services commented on AIRFLOW-230:
----------------------------------------------------------

Commit a599167c433246d96bea711d8bfd5710b2c9d3ff in incubator-airflow's branch 
refs/heads/master from [~maxime.beauche...@apache.org]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=a599167 ]

[AIRFLOW-230] [HiveServer2Hook] adding multi statements support

Changing the library from pyhive to impyla broke the previously supported 
behavior of running multiple statements, including statements that don't 
return results. impyla raises an exception if any of the statements returns 
no result set.

We have tasks that run multiple statements including DDL and want to run them 
atomically.

Closes #1583 from mistercrunch/hooks_hive_presto
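
A short usage sketch of the behavior this commit restores; the connection id shown is the 1.7.x default and the statements are illustrative.

```
# Hedged sketch: with this patch, statements that return no result set
# (SET, DDL) inside a multi-statement list no longer raise in
# get_results/get_records.
from airflow.hooks.hive_hooks import HiveServer2Hook

hook = HiveServer2Hook(hiveserver2_conn_id='hiveserver2_default')
rows = hook.get_records([
    "SET hive.exec.dynamic.partition.mode=nonstrict",  # no result set
    "SELECT 1",                                        # returns rows
])
```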


> [HiveServer2Hook] adding multi statements support
> -------------------------------------------------
>
> Key: AIRFLOW-230
> URL: https://issues.apache.org/jira/browse/AIRFLOW-230
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Maxime Beauchemin
>






incubator-airflow git commit: [AIRFLOW-230] [HiveServer2Hook] adding multi statements support

2016-06-13 Thread maximebeauchemin
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 901e8f2a9 -> a599167c4


[AIRFLOW-230] [HiveServer2Hook] adding multi statements support

Changing the library from pyhive to impyla broke the previously supported 
behavior of running multiple statements, including statements that don't 
return results. impyla raises an exception if any of the statements returns 
no result set.

We have tasks that run multiple statements including DDL and want to run them 
atomically.

Closes #1583 from mistercrunch/hooks_hive_presto


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/a599167c
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/a599167c
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/a599167c

Branch: refs/heads/master
Commit: a599167c433246d96bea711d8bfd5710b2c9d3ff
Parents: 901e8f2
Author: Maxime Beauchemin 
Authored: Mon Jun 13 11:54:35 2016 -0700
Committer: Maxime Beauchemin 
Committed: Mon Jun 13 11:54:35 2016 -0700

----------------------------------------------------------------------
 airflow/hooks/hive_hooks.py | 10 +++++++++-
 tests/core.py               |  9 +++++++++
 2 files changed, 18 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/a599167c/airflow/hooks/hive_hooks.py
----------------------------------------------------------------------
diff --git a/airflow/hooks/hive_hooks.py b/airflow/hooks/hive_hooks.py
index 0b06f49..87cce6a 100644
--- a/airflow/hooks/hive_hooks.py
+++ b/airflow/hooks/hive_hooks.py
@@ -465,6 +465,7 @@ class HiveServer2Hook(BaseHook):
             database=db.schema or 'default')
 
     def get_results(self, hql, schema='default', arraysize=1000):
+        from impala.error import ProgrammingError
         with self.get_conn() as conn:
             if isinstance(hql, basestring):
                 hql = [hql]
@@ -475,7 +476,14 @@
             for statement in hql:
                 with conn.cursor() as cur:
                     cur.execute(statement)
-                    records = cur.fetchall()
+                    records = []
+                    try:
+                        # impala lib raises when no results are returned;
+                        # we're silencing here as some statements in the list
+                        # may be `SET` or DDL
+                        records = cur.fetchall()
+                    except ProgrammingError:
+                        logging.debug("get_results returned no records")
                     if records:
                         results = {
                             'data': records,

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/a599167c/tests/core.py
----------------------------------------------------------------------
diff --git a/tests/core.py b/tests/core.py
index d5f33a1..bbf9e60 100644
--- a/tests/core.py
+++ b/tests/core.py
@@ -1474,6 +1474,15 @@ if 'HiveOperator' in dir(operators):
             hook = HiveServer2Hook()
             hook.get_records(sql)
 
+        def test_multi_statements(self):
+            from airflow.hooks.hive_hooks import HiveServer2Hook
+            sqls = [
+                "CREATE TABLE IF NOT EXISTS test_multi_statements (i INT)",
+                "DROP TABLE test_multi_statements",
+            ]
+            hook = HiveServer2Hook()
+            hook.get_records(sqls)
+
         def test_get_metastore_databases(self):
             if six.PY2:
                 from airflow.hooks.hive_hooks import HiveMetastoreHook