[jira] [Reopened] (AIRFLOW-539) Add support for BigQuery Standard SQL

2016-10-11 Thread Chris Riccomini (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Riccomini reopened AIRFLOW-539:
-------------------------------------

> Add support for BigQuery Standard SQL
> -------------------------------------
>
> Key: AIRFLOW-539
> URL: https://issues.apache.org/jira/browse/AIRFLOW-539
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Reporter: Sam McVeety
>Assignee: Ilya Rakoshes
>Priority: Minor
>  Labels: gcp
> Fix For: Airflow 1.8
>
>
> Many of the operators in 
> https://github.com/apache/incubator-airflow/tree/master/airflow/contrib/operators
>  are implicitly using the "useLegacySql" option to BigQuery.  Providing the 
> option to negate this will allow users to migrate to Standard SQL 
> (https://cloud.google.com/bigquery/sql-reference/enabling-standard-sql).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AIRFLOW-564) DagRunOperator and ExternalTaskSensor Incompatibility

2016-10-11 Thread Martin Grayson (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565656#comment-15565656
 ] 

Martin Grayson commented on AIRFLOW-564:


Thanks for the inspiration [~lauralorenz]! It didn't quite occur to me that I 
could pass the execution date of the parent dag into the child (via the 
payload) and subsequently use that in the ExternalTaskSensor.

This could be a workaround for me, but it does complicate the situation. I agree 
with you that inheriting the execution_date would be a cleaner solution.

> DagRunOperator and ExternalTaskSensor Incompatibility
> -----------------------------------------------------
>
> Key: AIRFLOW-564
> URL: https://issues.apache.org/jira/browse/AIRFLOW-564
> Project: Apache Airflow
>  Issue Type: Task
>  Components: DagRun
>Affects Versions: Airflow 1.7.1.3
>Reporter: Martin Grayson
> Attachments: 1.controller.png, 2. par.png, 3. dep.png, dep_dag.py, 
> parent_dag.py, test_dag_scheduler.py
>
>
> I have an hourly DAG that orchestrates other dags, triggering dag runs 
> if a precondition is met.
> The controller dag (Dag CD) uses a `TriggerDagRunOperator` to do this, 
> launching 2 other dags (Dag A, Dag B) in my example (10+ in real life). 
> These dags have their own dependencies; for example, Dag B depends on 
> Dag A's completion. This is handled using an `ExternalTaskSensor`. 
> Since Dag A and B are executed dynamically with the `TriggerDagRunOperator`, 
> they're assigned very granular execution dates, rather than the dd-mm-yyyy 
> hh:00:00 the scheduler would assign.  Due to this, the 
> `ExternalTaskSensor` will not succeed unless Dag A and B happen to have 
> exactly the same execution_date (down to the second).
> As far as I can tell, I'm not able to force the `ExternalTaskSensor` to poke 
> for the correct execution date. The only possible solution would be to 
> implement an `execution_date_fn` that looks through the metadata db to find 
> the exact execution time (see the sketch after this quote). This seems 
> massively complicated for a relatively simple requirement.
> Can this scenario be modelled with these components?
> I've attached a set of example dags and some screenshots showing the issue.
> Thanks,
> Martin



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AIRFLOW-564) DagRunOperator and ExternalTaskSensor Incompatibility

2016-10-11 Thread Laura Lorenz (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565570#comment-15565570
 ] 

Laura Lorenz commented on AIRFLOW-564:
--------------------------------------

For us this hasn't grown into the problem described here, but we do have an 
execution_date-aware DAG that is triggered via another DAG's 
TriggerDagRunOperator, and we have to send the parent DAG's execution_date as 
part of the payload to its child. I can see how it could escalate into the 
problem [~martingrayson] describes if we threw an ExternalTaskSensor into the 
mix. I think it makes a lot more intuitive sense for the child DAG's DagRun to 
inherit the execution_date of its parent DAG, rather than one that simply 
mirrors the DagRun's start_date.

> DagRunOperator and ExternalTaskSensor Incompatibility
> -----------------------------------------------------
>
> Key: AIRFLOW-564
> URL: https://issues.apache.org/jira/browse/AIRFLOW-564
> Project: Apache Airflow
>  Issue Type: Task
>  Components: DagRun
>Affects Versions: Airflow 1.7.1.3
>Reporter: Martin Grayson
> Attachments: 1.controller.png, 2. par.png, 3. dep.png, dep_dag.py, 
> parent_dag.py, test_dag_scheduler.py
>
>
> I have an hourly DAG that orchestrates other dags, triggering dag runs 
> if a precondition is met.
> The controller dag (Dag CD) uses a `TriggerDagRunOperator` to do this, 
> launching 2 other dags (Dag A, Dag B) in my example (10+ in real life). 
> These dags have their own dependencies; for example, Dag B depends on 
> Dag A's completion. This is handled using an `ExternalTaskSensor`. 
> Since Dag A and B are executed dynamically with the `TriggerDagRunOperator`, 
> they're assigned very granular execution dates, rather than the dd-mm-yyyy 
> hh:00:00 the scheduler would assign.  Due to this, the 
> `ExternalTaskSensor` will not succeed unless Dag A and B happen to have 
> exactly the same execution_date (down to the second).
> As far as I can tell, I'm not able to force the `ExternalTaskSensor` to poke 
> for the correct execution date. The only possible solution would be to 
> implement an `execution_date_fn` that looks through the metadata db to find 
> the exact execution time. This seems massively complicated for a relatively 
> simple requirement.
> Can this scenario be modelled with these components?
> I've attached a set of example dags and some screenshots showing the issue.
> Thanks,
> Martin



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (AIRFLOW-563) Dag stuck in "Filling up the DagBag from .." state

2016-10-11 Thread Vincent Martinez (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Martinez reassigned AIRFLOW-563:


Assignee: Vincent Martinez

> Dag stuck in "Filling up the DagBag from .." state
> --------------------------------------------------
>
> Key: AIRFLOW-563
> URL: https://issues.apache.org/jira/browse/AIRFLOW-563
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Vincent Martinez
>Assignee: Vincent Martinez
>Priority: Critical
> Attachments: 2016-10-10T11_00_00.txt, dagrun.png, load_dims.py
>
>
> Hello,
> I have scheduled a dag hourly, but the task doesn't run at all: it stays at 
> the "Filling up the DagBag from .." step, and each hour I have one more Dag id 
> in running state (stuck as well). Do you have any idea where the problem 
> comes from?
> You'll find attached screenshots, the log of the dag for one run, and the dag 
> file.
> If you need anything else feel free to ask.
> Thanks
> Best regards,
> Vincent



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AIRFLOW-565) DockerOperator doesn't work on Python3.4

2016-10-11 Thread Vitor Baptista (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566109#comment-15566109
 ] 

Vitor Baptista commented on AIRFLOW-565:


Went ahead and sent a pull request for it on 
https://github.com/apache/incubator-airflow/pull/1832

> DockerOperator doesn't work on Python3.4
> ----------------------------------------
>
> Key: AIRFLOW-565
> URL: https://issues.apache.org/jira/browse/AIRFLOW-565
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: docker, operators
>Affects Versions: Airflow 1.7.1.3
> Environment: Python 3.4.3
>Reporter: Vitor Baptista
>
> On {{DockerOperator.execute()}} we have:
> {code}
> if self.force_pull or len(self.cli.images(name=image)) == 0:
>     logging.info('Pulling docker image ' + image)
>     for l in self.cli.pull(image, stream=True):
>         output = json.loads(l)
>         logging.info("{}".format(output['status']))
> {code}
> https://github.com/apache/incubator-airflow/blob/master/airflow/operators/docker_operator.py#L152-L156
> The {{self.cli.pull()}} method returns {{bytes}} in Python3.4, and 
> {{json.loads()}} expects a string, so we end up with:
> {code}
> Traceback (most recent call last):
>   File "/home/vitor/Projetos/okfn/opentrials/airflow/env/lib/python3.4/site-packages/airflow/models.py", line 1245, in run
>     result = task_copy.execute(context=context)
>   File "/home/vitor/Projetos/okfn/opentrials/airflow/env/lib/python3.4/site-packages/airflow/operators/docker_operator.py", line 142, in execute
>     logging.info("{}".format(output['status']))
>   File "/usr/lib/python3.4/json/__init__.py", line 312, in loads
>     s.__class__.__name__))
> TypeError: the JSON object must be str, not 'bytes'
> {code}
> To avoid this, we could simply change it to {{output = 
> json.loads(l.decode('utf-8'))}}. This hardcodes the encoding as UTF-8, which 
> should be fine, considering the JSON spec requires the use of UTF-8, UTF-16 
> or UTF-32. As we're dealing with a Docker server, we can assume it will be 
> well behaved.
> I'm happy to submit a pull request for this if you agree with the solution.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (AIRFLOW-565) DockerOperator doesn't work on Python3.4

2016-10-11 Thread Vitor Baptista (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitor Baptista reassigned AIRFLOW-565:
--------------------------------------

Assignee: Vitor Baptista

> DockerOperator doesn't work on Python3.4
> ----------------------------------------
>
> Key: AIRFLOW-565
> URL: https://issues.apache.org/jira/browse/AIRFLOW-565
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: docker, operators
>Affects Versions: Airflow 1.7.1.3
> Environment: Python 3.4.3
>Reporter: Vitor Baptista
>Assignee: Vitor Baptista
>
> On {{DockerOperator.execute()}} we have:
> {code}
> if self.force_pull or len(self.cli.images(name=image)) == 0:
>     logging.info('Pulling docker image ' + image)
>     for l in self.cli.pull(image, stream=True):
>         output = json.loads(l)
>         logging.info("{}".format(output['status']))
> {code}
> https://github.com/apache/incubator-airflow/blob/master/airflow/operators/docker_operator.py#L152-L156
> The {{self.cli.pull()}} method returns {{bytes}} in Python3.4, and 
> {{json.loads()}} expects a string, so we end up with:
> {code}
> Traceback (most recent call last):
>   File "/home/vitor/Projetos/okfn/opentrials/airflow/env/lib/python3.4/site-packages/airflow/models.py", line 1245, in run
>     result = task_copy.execute(context=context)
>   File "/home/vitor/Projetos/okfn/opentrials/airflow/env/lib/python3.4/site-packages/airflow/operators/docker_operator.py", line 142, in execute
>     logging.info("{}".format(output['status']))
>   File "/usr/lib/python3.4/json/__init__.py", line 312, in loads
>     s.__class__.__name__))
> TypeError: the JSON object must be str, not 'bytes'
> {code}
> To avoid this, we could simply change it to {{output = 
> json.loads(l.decode('utf-8'))}}. This hardcodes the encoding as UTF-8, which 
> should be fine, considering the JSON spec requires the use of UTF-8, UTF-16 
> or UTF-32. As we're dealing with a Docker server, we can assume it will be 
> well behaved.
> I'm happy to submit a pull request for this if you agree with the solution.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AIRFLOW-513) ExternalTaskSensor tasks should not count towards parallelism limit

2016-10-11 Thread Laura Lorenz (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565966#comment-15565966
 ] 

Laura Lorenz commented on AIRFLOW-513:
--------------------------------------

I think that sensors, including ExternalTaskSensors, should be counted in the 
core parallelism limit; they are, after all, using resources. I think the 
resource bottlenecks long-waiting ExternalTaskSensors can create should be 
managed with pools/queues or actual DAG dependencies. On the latter point, in 
your case you may want to use the TriggerDagRunOperator instead, so that you 
push to create a DagRun instance for DAG 2 rather than polling from DAG 2 for 
DAG 1. We do use some ExternalTaskSensors in the way you describe, but we 
increase worker limits each time we add one, and in any case we have only 4 
running at a time, so we haven't reached your level of contention yet.

> ExternalTaskSensor tasks should not count towards parallelism limit
> -------------------------------------------------------------------
>
> Key: AIRFLOW-513
> URL: https://issues.apache.org/jira/browse/AIRFLOW-513
> Project: Apache Airflow
>  Issue Type: Improvement
> Environment: Ubuntu 14.04
> Version 1.7.0
>Reporter: Kevin Yuen
>
> Hi, 
> We are using airflow version 1.7.0 and we use `ExternalTaskSensor` 
> pretty heavily to manage dependencies between our DAGs. 
> We have recently experienced a case where the external task sensors 
> caused the DAGs to go into a limbo state because they took up all the 
> execution slots defined via `AIRFLOW__CORE__PARALLELISM`. 
> For example: 
> Given we have 2 DAGs, 
> the first with 16 python operator tasks and the other with 16 sensors, 
> and we set `PARALLELISM` to 16, 
> if the scheduler chooses to schedule all 16 sensors first, the dag runs 
> will never complete. 
> There are a couple of workarounds for this:
> # staggering the DAGs so that the first dag, with the python operators, runs first
> # lowering the TaskSensor timeout thresholds and relying on retries (see the 
> sketch after this quote)
> Both of these options seem less than ideal to us. We wonder whether 
> `ExternalTaskSensor` should really count towards the `PARALLELISM` 
> limit.
> Cheers, 
> Kevin



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AIRFLOW-560) DbApiHook should provide access to url / sqlalchemy

2016-10-11 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566026#comment-15566026
 ] 

ASF subversion and git services commented on AIRFLOW-560:
---------------------------------------------------------

Commit c6de16b563cb1f681c62696b755a9e2eb3c80341 in incubator-airflow's branch 
refs/heads/master from [~gwax]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=c6de16b ]

[AIRFLOW-560] Get URI & SQLA engine from Connection

Closes #1256 from gwax/sqlalchemy_conn


> DbApiHook should provide access to url / sqlalchemy
> ---------------------------------------------------
>
> Key: AIRFLOW-560
> URL: https://issues.apache.org/jira/browse/AIRFLOW-560
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: George Leslie-Waksman
>Assignee: George Leslie-Waksman
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (AIRFLOW-560) DbApiHook should provide access to url / sqlalchemy

2016-10-11 Thread Siddharth Anand (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Anand resolved AIRFLOW-560.
-------------------------------------
Resolution: Fixed

Issue resolved by pull request #1256
[https://github.com/apache/incubator-airflow/pull/1256]

> DbApiHook should provide access to url / sqlalchemy
> ---------------------------------------------------
>
> Key: AIRFLOW-560
> URL: https://issues.apache.org/jira/browse/AIRFLOW-560
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: George Leslie-Waksman
>Assignee: George Leslie-Waksman
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AIRFLOW-470) Frequent multiple dispatching of the same task to celery

2016-10-11 Thread Laura Lorenz (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566196#comment-15566196
 ] 

Laura Lorenz commented on AIRFLOW-470:
--------------------------------------

Dup of AIRFLOW-471?

> Frequent multiple dispatching of the same task to celery
> --------------------------------------------------------
>
> Key: AIRFLOW-470
> URL: https://issues.apache.org/jira/browse/AIRFLOW-470
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: celery, scheduler
>Affects Versions: Airflow 1.7.1.3
>Reporter: Jasmine Tsai
>Priority: Critical
>
> We are seeing a lot of frequent dispatching of the same task to celery within 
> a very short time frame (the same task instance by Airflow's definition, but a 
> different celery task uuid), which is causing a lot of unexpected behavior 
> for us. Most of these are annoying but harmless: sometimes they clear xcom 
> data and overwrite logs, but for the most part they are able to rely on the 
> db metadata and avoid actually running multiple times. We are seeing this 
> behavior frequently; some tasks are getting scheduled 5 times within the span 
> of two minutes. The issue seems to be exacerbated by the use of pools.
> We have even seen the same task dispatched twice within a second, causing 
> real race conditions because the second try didn't yet see the task 
> instance starting to run in the metadata db. 
> It seems from other issues submitted here that people definitely see problems 
> with the same tasks running multiple times, but this problem seems to be 
> getting worse for us. Is it a known issue for the multiple dispatching to be 
> so frequent/severe? (Or maybe even an intentional design/side effect?) Are 
> there things that we could be doing that might make this worse? (One of our 
> primary suspects is the scheduler, whose num_runs we have set to 1.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AIRFLOW-441) DagRuns are marked as failed as soon as one task fails

2016-10-11 Thread Laura Lorenz (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566210#comment-15566210
 ] 

Laura Lorenz commented on AIRFLOW-441:
--------------------------------------

I haven't looked into the source at all, but this has definitely bitten me too. 
I would prefer the DagRun failure state to occur only after all tasks have been 
attempted and are in either 'success', 'failed', or 'skipped' status.

> DagRuns are marked as failed as soon as one task fails
> ------------------------------------------------------
>
> Key: AIRFLOW-441
> URL: https://issues.apache.org/jira/browse/AIRFLOW-441
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Jeff Balogh
>
> https://github.com/apache/incubator-airflow/pull/1514 added a 
> [{{verify_integrity}} 
> function|https://github.com/apache/incubator-airflow/blob/fcf645b/airflow/models.py#L3850-L3877]
>  that greedily creates {{TaskInstance}} objects for all tasks in a dag.
> This does not interact well with the assumptions in the new [{{update_state}} 
> function|https://github.com/apache/incubator-airflow/blob/fcf645b/airflow/models.py#L3816-L3825].
>  The guard for {{if len(tis) == len(dag.active_tasks)}} is no longer 
> effective; in the old world of lazily-created tasks this code would only run 
> once all the tasks in the dag had run. Now it runs all the time, and as soon 
> as one task in a dag run fails the whole DagRun fails. This is bad since the 
> scheduler stops processing the DagRun after that.
> In retrospect, the old code was also buggy: if your dag ends with a bunch of 
> Queued tasks the DagRun could be marked as failed prematurely.
> I suspect the fix is to update the guard to look at tasks where the state is 
> success or failed. Otherwise we're evaluating and failing the dag based on 
> up_for_retry/queued/scheduled tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


incubator-airflow git commit: closes apache/incubator-airflow#1590 *no movement from submitter*

2016-10-11 Thread sanand
Repository: incubator-airflow
Updated Branches:
  refs/heads/master c6de16b56 -> 9b689d05a


closes apache/incubator-airflow#1590 *no movement from submitter*


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/9b689d05
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/9b689d05
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/9b689d05

Branch: refs/heads/master
Commit: 9b689d05a5f3ef3ce73250578c5c9b83ec6a9b3a
Parents: c6de16b
Author: Siddharth Anand 
Authored: Tue Oct 11 17:11:25 2016 -0700
Committer: Siddharth Anand 
Committed: Tue Oct 11 17:11:25 2016 -0700

--

--




[jira] [Updated] (AIRFLOW-559) Add support for BigQuery kwarg parameters

2016-10-11 Thread Alex Van Boxel (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel updated AIRFLOW-559:
-----------------------------------
Component/s: gcp

> Add support for BigQuery kwarg parameters
> -----------------------------------------
>
> Key: AIRFLOW-559
> URL: https://issues.apache.org/jira/browse/AIRFLOW-559
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, gcp
>Reporter: Sam McVeety
>Assignee: Sam McVeety
>Priority: Minor
> Fix For: Airflow 1.8
>
>
> Many of the operators in 
> https://github.com/apache/incubator-airflow/tree/master/airflow/contrib/operators
>  add parameters over time, and plumbing these through multiple layers of 
> calls isn't always a high priority.
> The operators (and hooks) should support an end-to-end kwargs parameter that 
> allows for new fields (e.g. useLegacySql, defaultDataset) to be added by 
> users without needing to change the underlying code.   



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AIRFLOW-539) Add support for BigQuery Standard SQL

2016-10-11 Thread Alex Van Boxel (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564730#comment-15564730
 ] 

Alex Van Boxel commented on AIRFLOW-539:


[~criccomini]: although this ticket is closed, can you add a "gcp" label to it? 
You're probably an admin. Thanks.

> Add support for BigQuery Standard SQL
> -------------------------------------
>
> Key: AIRFLOW-539
> URL: https://issues.apache.org/jira/browse/AIRFLOW-539
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Reporter: Sam McVeety
>Assignee: Ilya Rakoshes
>Priority: Minor
> Fix For: Airflow 1.8
>
>
> Many of the operators in 
> https://github.com/apache/incubator-airflow/tree/master/airflow/contrib/operators
>  are implicitly using the "useLegacySql" option to BigQuery.  Providing the 
> option to negate this will allow users to migrate to Standard SQL 
> (https://cloud.google.com/bigquery/sql-reference/enabling-standard-sql).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AIRFLOW-563) Dag stuck in "Filling up the DagBag from .." state

2016-10-11 Thread Vincent Martinez (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Martinez updated AIRFLOW-563:
-------------------------------------
Attachment: dagrun.png
2016-10-10T11_00_00.txt

> Dag stuck in "Filling up the DagBag from .." state
> --------------------------------------------------
>
> Key: AIRFLOW-563
> URL: https://issues.apache.org/jira/browse/AIRFLOW-563
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Vincent Martinez
>Priority: Critical
> Attachments: 2016-10-10T11_00_00.txt, dagrun.png, load_dims.py
>
>
> Hello,
> I have scheduled a dag hourly, but the task doesn't run at all: it stays at 
> the "Filling up the DagBag from .." step, and each hour I have one more Dag id 
> in running state (stuck as well). Do you have any idea where the problem 
> comes from?
> You'll find attached screenshots, the log of the dag for one run, and the dag 
> file.
> If you need anything else feel free to ask.
> Thanks
> Best regards,
> Vincent



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)