[jira] [Created] (AIRFLOW-1519) Main DAG list page does not scale using client side paging

2017-08-17 Thread Edgar Rodriguez (JIRA)
Edgar Rodriguez created AIRFLOW-1519:


 Summary: Main DAG list page does not scale using client side paging
 Key: AIRFLOW-1519
 URL: https://issues.apache.org/jira/browse/AIRFLOW-1519
 Project: Apache Airflow
  Issue Type: Bug
  Components: ui
Reporter: Edgar Rodriguez
Assignee: Edgar Rodriguez


Airflow's main page with the DAG listing takes too long to load (> 10 secs) 
when the number of DAGs grows to 1K+.

Airflow's main page performs client-side paging: it loads all DAGs in the 
system into the jquery.DataTable plugin, which is slow at processing that many 
elements in the DOM of the client's browser. Additionally, when there are 1K+ 
DAGs, rendering them server side via Flask templates also introduces some 
overhead.

The solution would be to perform server-side paging, aligning the page size 
with the one configured for the web server (see 
https://github.com/apache/incubator-airflow/commit/04bfba3aa97deab850c14763279d33a6dfceb205),
 providing consistent paging across the system.
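A minimal sketch of the offset/limit arithmetic such server-side paging would 
need (illustrative only; the helper name and the SQLAlchemy usage in the 
comment are assumptions, not Airflow code):

```python
def paginate(total_rows, page, page_size):
    """Return (offset, limit, num_pages) for a 1-based server-side page."""
    num_pages = max(1, -(-total_rows // page_size))  # ceiling division
    page = min(max(page, 1), num_pages)              # clamp out-of-range pages
    offset = (page - 1) * page_size
    return offset, page_size, num_pages

# The web view would then fetch only one page from the database, e.g. with
# SQLAlchemy: session.query(DagModel).order_by(DagModel.dag_id)
#                    .offset(offset).limit(limit)
offset, limit, num_pages = paginate(total_rows=1050, page=3, page_size=25)
# offset=50, limit=25, num_pages=42
```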



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (AIRFLOW-1420) Dag fails, but none of tasks has failed

2017-08-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-1420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16131389#comment-16131389
 ] 

ASF subversion and git services commented on AIRFLOW-1420:
--

Commit ea86895d5b81d6fed4f26c201f8874bacdd291e5 in incubator-airflow's branch 
refs/heads/master from [~gwax]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=ea86895 ]

[AIRFLOW-1420][AIRFLOW-1473] Fix deadlock check

Update the deadlock check to prevent false
positives on upstream
failure or skip conditions.

Closes #2506 from gwax/fix_dead_dagruns


> Dag fails, but none of tasks has failed
> ---
>
> Key: AIRFLOW-1420
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1420
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Maciej Z
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> Here's the code:
> {noformat}
> # -*- coding: utf-8 -*-
> from airflow.operators.dummy_operator import DummyOperator
> from airflow.models import DAG
> from datetime import date, datetime, timedelta
> day_ago = datetime.combine(datetime.today() - timedelta(1),
>datetime.min.time())
> args = {'owner': 'airflow', 'start_date': day_ago}
> dag = DAG(dag_id='not_running_dummy2_test', default_args=args,
>   schedule_interval='00 * * * *', dagrun_timeout=timedelta(hours=12))
> DummyOperator(task_id='a', dag=dag) >> DummyOperator(task_id='b', dag=dag,
>   trigger_rule='one_failed')
> if __name__ == "__main__":
> dag.cli()
> {noformat}





[jira] [Commented] (AIRFLOW-1473) DAG state set failed, no failed task

2017-08-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16131390#comment-16131390
 ] 

ASF subversion and git services commented on AIRFLOW-1473:
--

Commit ea86895d5b81d6fed4f26c201f8874bacdd291e5 in incubator-airflow's branch 
refs/heads/master from [~gwax]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=ea86895 ]

[AIRFLOW-1420][AIRFLOW-1473] Fix deadlock check

Update the deadlock check to prevent false
positives on upstream
failure or skip conditions.

Closes #2506 from gwax/fix_dead_dagruns


> DAG state set failed, no failed task
> 
>
> Key: AIRFLOW-1473
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1473
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.9.0
> Environment: Deployed on Kubernetes. Using Local executor.
>Reporter: Daniel Surename
>Priority: Critical
> Attachments: 1.png, 2.png, 3.png, 4.png
>
>
> *Next Scheduled task which never gets queued or run:*
> Task Instance Details:
> Task's trigger rule 'all_success' requires all upstream tasks to have 
> succeeded, but *found 1 non-success*(es). upstream_tasks_state={'successes': 
> 54L, 'failed': 0L, 'upstream_failed': 0L, 'skipped': 0L, 'done': 54L}
> Dagrun Running:
>   Task instance's dagrun was not in the 'running' state but in the state 
> 'failed'.
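The fix referenced in the comment above (commit ea86895) makes the deadlock 
check tolerate state changes during evaluation. A simplified, self-contained 
sketch of that logic, using stub objects rather than real Airflow classes:

```python
class StubTask:
    """Stand-in for a task instance; not the Airflow API."""
    def __init__(self, state, deps_met, next_state=None):
        self.state = state
        self._deps_met = deps_met
        self._next_state = next_state or state

    def are_dependencies_met(self):
        # Evaluating deps may flag upstream failures, which changes state.
        self.state = self._next_state
        return self._deps_met

def is_deadlocked(unfinished_tasks):
    """A run is deadlocked only if no unfinished task has its deps met AND
    no task changed state while checking (a state change means progress,
    which previously produced deadlock false positives)."""
    for t in unfinished_tasks:
        old_state = t.state
        if t.are_dependencies_met() or t.state != old_state:
            return False
    return True
```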





incubator-airflow git commit: [AIRFLOW-1420][AIRFLOW-1473] Fix deadlock check

2017-08-17 Thread saguziel
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 67b47c958 -> ea86895d5


[AIRFLOW-1420][AIRFLOW-1473] Fix deadlock check

Update the deadlock check to prevent false
positives on upstream
failure or skip conditions.

Closes #2506 from gwax/fix_dead_dagruns


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/ea86895d
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/ea86895d
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/ea86895d

Branch: refs/heads/master
Commit: ea86895d5b81d6fed4f26c201f8874bacdd291e5
Parents: 67b47c9
Author: George Leslie-Waksman 
Authored: Thu Aug 17 15:19:46 2017 -0700
Committer: Alex Guziel 
Committed: Thu Aug 17 15:19:52 2017 -0700

--
 airflow/models.py| 21 +--
 airflow/ti_deps/deps/trigger_rule_dep.py |  4 +--
 tests/models.py  | 37 ---
 3 files changed, 48 insertions(+), 14 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/ea86895d/airflow/models.py
--
diff --git a/airflow/models.py b/airflow/models.py
index 0b82c56..bf308e5 100755
--- a/airflow/models.py
+++ b/airflow/models.py
@@ -4231,7 +4231,6 @@ class DagRun(Base):
 
 ID_PREFIX = 'scheduled__'
 ID_FORMAT_PREFIX = ID_PREFIX + '{0}'
-DEADLOCK_CHECK_DEP_CONTEXT = DepContext(ignore_in_retry_period=True)
 
 id = Column(Integer, primary_key=True)
 dag_id = Column(String(ID_LEN))
@@ -4457,13 +4456,19 @@ class DagRun(Base):
 # small speed up
 if unfinished_tasks and none_depends_on_past:
 # todo: this can actually get pretty slow: one task costs between 
0.01-015s
-no_dependencies_met = all(
-# Use a special dependency context that ignores task's up for 
retry
-# dependency, since a task that is up for retry is not 
necessarily
-# deadlocked.
-not 
t.are_dependencies_met(dep_context=self.DEADLOCK_CHECK_DEP_CONTEXT,
-   session=session)
-for t in unfinished_tasks)
+no_dependencies_met = True
+for ut in unfinished_tasks:
+# We need to flag upstream and check for changes because 
upstream
+# failures can result in deadlock false positives
+old_state = ut.state
+deps_met = ut.are_dependencies_met(
+dep_context=DepContext(
+flag_upstream_failed=True,
+ignore_in_retry_period=True),
+session=session)
+if deps_met or old_state != ut.current_state(session=session):
+no_dependencies_met = False
+break
 
 duration = (datetime.now() - start_dttm).total_seconds() * 1000
 Stats.timing("dagrun.dependency-check.{}.{}".

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/ea86895d/airflow/ti_deps/deps/trigger_rule_dep.py
--
diff --git a/airflow/ti_deps/deps/trigger_rule_dep.py 
b/airflow/ti_deps/deps/trigger_rule_dep.py
index cf06c0b..5a80314 100644
--- a/airflow/ti_deps/deps/trigger_rule_dep.py
+++ b/airflow/ti_deps/deps/trigger_rule_dep.py
@@ -124,8 +124,8 @@ class TriggerRuleDep(BaseTIDep):
 tr = task.trigger_rule
 upstream_done = done >= upstream
 upstream_tasks_state = {
-"successes": successes, "skipped": skipped, "failed": failed,
-"upstream_failed": upstream_failed, "done": done
+"total": upstream, "successes": successes, "skipped": skipped,
+"failed": failed, "upstream_failed": upstream_failed, "done": done
 }
 # TODO(aoen): Ideally each individual trigger rules would be it's own 
class, but
 # this isn't very feasible at the moment since the database queries 
need to be

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/ea86895d/tests/models.py
--
diff --git a/tests/models.py b/tests/models.py
index 266e036..96275d3 100644
--- a/tests/models.py
+++ b/tests/models.py
@@ -37,6 +37,7 @@ from airflow.operators.python_operator import PythonOperator
 from airflow.operators.python_operator import ShortCircuitOperator
 from airflow.ti_deps.deps.trigger_rule_dep import TriggerRuleDep
 from airflow.utils.state import State
+from airflow.utils.trigger_rule import TriggerRule
 from mock import patch
 from parameterized import parameterized
 
@@ -483,10 +484,38 @@ class 

[jira] [Assigned] (AIRFLOW-1463) Scheduler does not reschedule tasks in QUEUED state

2017-08-17 Thread Anonymous (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anonymous reassigned AIRFLOW-1463:
--

Assignee: (was: Stanislav Pak)

> Scheduler does not reschedule tasks in QUEUED state
> ---
>
> Key: AIRFLOW-1463
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1463
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: cli
> Environment: Ubuntu 14.04
> Airflow 1.8.0
> SQS backed task queue, AWS RDS backed meta storage
> DAG folder is synced by script on code push: archive is downloaded from s3, 
> unpacked, moved, install script is run. airflow executable is replaced with 
> symlink pointing to the latest version of code, no airflow processes are 
> restarted.
>Reporter: Stanislav Pak
>Priority: Minor
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Our pipeline-related code is deployed almost simultaneously on all airflow 
> boxes: the scheduler+webserver box and the worker boxes. A common python 
> package is deployed on those boxes on every other code push (3-5 deployments 
> per hour). Due to installation specifics, a DAG that imports a module from 
> that package might fail to import. If the DAG import fails when a worker 
> runs a task, the task is still removed from the queue but its state is not 
> changed, so in this case the task stays in the QUEUED state forever.
> Besides the described case, there is a scenario where this happens because 
> of DAG update lag in the scheduler: a task can be scheduled with an old DAG 
> while the worker runs the task with a new DAG that fails to import.
> There might be other scenarios where this happens.
> Proposal:
> Catch errors when importing the DAG on task run and clear the task instance 
> state if the import fails. This should fix transient issues of this kind.
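The proposal could look roughly like the sketch below (the stub class and 
function names are illustrative assumptions, not Airflow's API):

```python
class StubTaskInstance:
    """Illustrative stand-in for an Airflow task instance."""
    def __init__(self, dag_id):
        self.dag_id = dag_id
        self.state = 'queued'

def run_queued_task(ti, import_dag):
    """Run a queued task; if the DAG fails to import on the worker,
    clear the task's state so the scheduler can reschedule it."""
    try:
        dag = import_dag(ti.dag_id)
    except ImportError:
        ti.state = None   # cleared instead of stuck in QUEUED forever
        return False
    ti.state = 'running'  # proceed as usual
    return True
```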





[jira] [Created] (AIRFLOW-1518) Make it possible to specify list of dag ids for scheduler to run

2017-08-17 Thread Stanislav Kudriashev (JIRA)
Stanislav Kudriashev created AIRFLOW-1518:
-

 Summary: Make it possible to specify list of dag ids for scheduler 
to run
 Key: AIRFLOW-1518
 URL: https://issues.apache.org/jira/browse/AIRFLOW-1518
 Project: Apache Airflow
  Issue Type: Improvement
Reporter: Stanislav Kudriashev
Assignee: Stanislav Kudriashev


Make it possible to specify a list of DAG ids for the scheduler to run. 
Currently it is possible to specify only one dag id. Desired usage:

{code}
airflow scheduler -d dag1 -d dag2 -d dag3
{code}
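One plausible way to accept repeated {{-d}} flags is argparse's 
{{action='append'}} (a sketch of the idea, not the actual Airflow CLI code):

```python
import argparse

# Each occurrence of -d appends its value to the dag_ids list.
parser = argparse.ArgumentParser(prog='scheduler-sketch')
parser.add_argument('-d', '--dag_id', action='append', dest='dag_ids',
                    help='DAG id to schedule; may be given multiple times')

args = parser.parse_args(['-d', 'dag1', '-d', 'dag2', '-d', 'dag3'])
# args.dag_ids == ['dag1', 'dag2', 'dag3']
```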





[jira] [Work started] (AIRFLOW-1518) Make it possible to specify list of dag ids for scheduler to run

2017-08-17 Thread Stanislav Kudriashev (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AIRFLOW-1518 started by Stanislav Kudriashev.
-
> Make it possible to specify list of dag ids for scheduler to run
> 
>
> Key: AIRFLOW-1518
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1518
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Stanislav Kudriashev
>Assignee: Stanislav Kudriashev
>
> Make it possible to specify a list of DAG ids for the scheduler to run. 
> Currently it is possible to specify only one dag id. Desired usage:
> {code}
> airflow scheduler -d dag1 -d dag2 -d dag3
> {code}





[jira] [Commented] (AIRFLOW-1330) Connection.parse_from_uri doesn't work for google_cloud_platform and so on

2017-08-17 Thread Yu Ishikawa (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16130013#comment-16130013
 ] 

Yu Ishikawa commented on AIRFLOW-1330:
--

Good point. I think we should consider backwards compatibility. We could 
offer both of the following forms to expose options for a connection's type, 
host and port:

{noformat}
airflow connections --add --conn_id --conn_url mysql://fake-host:3306

airflow connections --add --conn_id --conn_type mysql --conn_host fake-host 
--conn_port 3306
{noformat}

> Connection.parse_from_uri doesn't work for google_cloud_platform and so on
> --
>
> Key: AIRFLOW-1330
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1330
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: cli
>Reporter: Yu Ishikawa
>
> h2. Overview
> {{Connection.parse_from_uri}} doesn't work for some connection types, like 
> {{google_cloud_platform}}, whose type names include underscores, because 
> {{urllib.parse.urlparse()}}, which {{Connection.parse_from_uri}} uses, 
> doesn't support a scheme name that includes underscores.
> As a result, airflow's CLI doesn't work when a given connection URI 
> includes underscores, like {{google_cloud_platform://X}}.
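The limitation can be seen directly with the standard library, and one 
workaround (an assumption for illustration, not necessarily the fix Airflow 
adopted) is to peel the scheme off before parsing:

```python
from urllib.parse import urlparse

# urlparse() only allows letters, digits, '+', '-' and '.' in a scheme,
# so the underscores make it fall back to an empty scheme.
assert urlparse('google_cloud_platform://fake-host:3306').scheme == ''

# Workaround sketch: split the scheme off manually, then parse the rest
# with a placeholder scheme that urlparse accepts.
uri = 'google_cloud_platform://fake-host:3306'
conn_type, _, rest = uri.partition('://')
parsed = urlparse('scheme://' + rest)
# conn_type == 'google_cloud_platform'
# parsed.hostname == 'fake-host', parsed.port == 3306
```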


