ASF subversion and git services commented on AIRFLOW-52:

Commit 0ed36a14d8047bbfac749c73a81581f293b43af6 in incubator-airflow's branch 
refs/heads/airbnb_rb1.7.1_3 from [~jlowin]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=0ed36a1 ]

[AIRFLOW-52] Fix bottlenecks when working with many tasks

The DAG hash function tried (and failed) to hash the list of tasks, then fell 
back on repr-ing the list, which was extremely slow for DAGs with many tasks. 
Instead, hash tuple(task_dict.keys()). In addition, this replaces two slow list 
comprehensions with much faster hash lookups (using the new task_dict).
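
For readers skimming the archive, here is a minimal sketch of the idea in the 
commit message. It is not the actual Airflow code: ToyDag and its methods are 
hypothetical stand-ins, and only the name task_dict and the expression 
tuple(task_dict.keys()) come from the message above.

```python
class ToyDag:
    """Hypothetical stand-in for a DAG; not Airflow's real class."""

    def __init__(self, dag_id):
        self.dag_id = dag_id
        # Maps task_id -> task object, so lookups are O(1) dict accesses
        # instead of scans over a list of tasks.
        self.task_dict = {}

    def add_task(self, task_id, task):
        self.task_dict[task_id] = task

    def has_task(self, task_id):
        # Replaces a pattern like: task_id in [t.task_id for t in tasks]
        return task_id in self.task_dict

    def get_task(self, task_id):
        # Replaces a pattern like: [t for t in tasks if t.task_id == task_id][0]
        return self.task_dict[task_id]

    def __hash__(self):
        # A list is unhashable, so hash(list_of_tasks) raises TypeError.
        # A tuple of the task ids (strings) hashes cheaply.
        return hash((self.dag_id, tuple(self.task_dict.keys())))


def list_is_hashable(items):
    """Return True if hashing the list works (it never does for a list)."""
    try:
        hash(items)
    except TypeError:
        return False
    return True
```

The point of the sketch is that hashing a tuple of short task ids costs far 
less than building a repr of every task object, and a dict lookup avoids 
rebuilding a list on every call.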

> Release airflow 1.7.1
> ---------------------
>                 Key: AIRFLOW-52
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-52
>             Project: Apache Airflow
>          Issue Type: Task
>          Components: ease
>            Reporter: Dan Davydov
>            Assignee: Dan Davydov
>              Labels: release
> Release the airflow 1.7.1 tag.
> Current status:
> There are three issues blocking this release caused by this commit:
> https://github.com/apache/incubator-airflow/commit/fb0c5775cda4f84c07d8d5c0e6277fc387c172e6
> -1. DAGs with a lot of tasks take much longer to parse (~25x slowdown)-
> 2. Patterns like the following fail:
> {code}
> email.set_upstream(dag.roots)
> dag.add_task(email)
> {code}
> This is because set_upstream now calls add_task and a task can't be added 
> more than once.
> 3. Airflow losing queued tasks (see linked issue)
> I'm working with the owner of the commit to resolve these issues.
> The way to catch (1) in the future is an integration test that asserts that a 
> given non-trivial DAG parses in under X seconds.
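
The timing test suggested at the end of the issue could look roughly like the 
sketch below. Everything in it is an assumption made for illustration: 
build_chain_dag is a hypothetical stand-in for parsing a real DAG file, and 
MAX_PARSE_SECONDS is a placeholder, since the issue leaves the threshold as X.

```python
import time

# Placeholder threshold; the issue does not specify a value for X.
MAX_PARSE_SECONDS = 5.0


def build_chain_dag(n_tasks):
    """Build a hypothetical linear DAG as {task_id: [upstream task ids]}."""
    task_dict = {}
    previous = None
    for i in range(n_tasks):
        task_id = "task_%d" % i
        task_dict[task_id] = [] if previous is None else [previous]
        previous = task_id
    return task_dict


def timed(builder, *args):
    """Run builder(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = builder(*args)
    return result, time.perf_counter() - start


def test_large_dag_parses_quickly():
    # A non-trivial DAG: many tasks, each depending on the previous one.
    dag, elapsed = timed(build_chain_dag, 10000)
    assert len(dag) == 10000
    assert elapsed < MAX_PARSE_SECONDS
```

In a real test suite the builder would be replaced by whatever loads and 
parses an actual DAG definition, and the threshold would be chosen with enough 
headroom to avoid flaky failures on slow CI machines.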

This message was sent by Atlassian JIRA
