[jira] [Commented] (AIRFLOW-512) Docs: Typo, 'below' written as 'bellow'

2016-09-15 Thread David Gingrich (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494778#comment-15494778
 ] 

David Gingrich commented on AIRFLOW-512:


PR: https://github.com/apache/incubator-airflow/pull/1800

> Docs: Typo, 'below' written as 'bellow'
> ---
>
> Key: AIRFLOW-512
> URL: https://issues.apache.org/jira/browse/AIRFLOW-512
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: docs
> Environment: n/a
>Reporter: David Gingrich
>Assignee: David Gingrich
>Priority: Trivial
>
> FAQ has typo, 'below' written as 'bellow': 
> https://airflow.incubator.apache.org/faq.html#what-are-all-the-airflow-run-commands-in-my-process-list
> Also in upstart script and a few code comments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (AIRFLOW-494) Add per-operator success/failure metrics.

2016-09-15 Thread Li Xuanji (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Xuanji closed AIRFLOW-494.
-
Resolution: Fixed

> Add per-operator success/failure metrics.
> -
>
> Key: AIRFLOW-494
> URL: https://issues.apache.org/jira/browse/AIRFLOW-494
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: rics
>Reporter: Dan Davydov
>Assignee: Dan Davydov
>Priority: Minor
>
> It would be good to have metrics for success/failure rates of each operator, 
> that way when we e.g. do a new release we will have some signal if there is a 
> regression in an operator. It will also be useful if e.g. a user wants to 
> upgrade their infrastructure and make sure that all of the operators still 
> work as expected.
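
Editor's sketch of the idea above: one way to tally success and failure per operator so that a regression after a release shows up as a changed failure rate. The class and method names are hypothetical and are not Airflow APIs; a real deployment would more likely emit counters to a metrics backend than keep them in memory.

```python
from collections import Counter


class OperatorMetrics:
    """Hypothetical in-memory tally of task outcomes per operator class."""

    def __init__(self):
        self._counts = Counter()

    def record(self, operator_name, succeeded):
        # Key each count by (operator, outcome) so rates can be derived later.
        outcome = "success" if succeeded else "failure"
        self._counts[(operator_name, outcome)] += 1

    def failure_rate(self, operator_name):
        ok = self._counts[(operator_name, "success")]
        bad = self._counts[(operator_name, "failure")]
        total = ok + bad
        # Return None rather than divide by zero when nothing has run yet.
        return bad / total if total else None


metrics = OperatorMetrics()
metrics.record("BashOperator", True)
metrics.record("BashOperator", False)
metrics.record("PythonOperator", True)
```

Comparing `failure_rate` for the same operator before and after an upgrade gives the kind of signal the issue asks for.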





[jira] [Assigned] (AIRFLOW-486) Webserver does not detach from console when using -D

2016-09-15 Thread Li Xuanji (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Xuanji reassigned AIRFLOW-486:
-

Assignee: Li Xuanji

> Webserver does not detach from console when using -D
> 
>
> Key: AIRFLOW-486
> URL: https://issues.apache.org/jira/browse/AIRFLOW-486
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: webserver
>Reporter: Bolke de Bruin
>Assignee: Li Xuanji
>
> Since the rework of a rolling webserver restart the "airflow webserver -D" 
> does not detach from the console anymore





[jira] [Updated] (AIRFLOW-511) DagRun Failure: Retry, Email, Callbacks

2016-09-15 Thread Rob Froetscher (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rob Froetscher updated AIRFLOW-511:
---
Description: 
Are there any plans to have retry, email, or callbacks on failure of a DAG run? 
Would you guys be open to someone implementing that? Right now particularly 
with dagrun_timeout, there is not much insight that the dag actually stopped.

Pseudocode: 
https://github.com/apache/incubator-airflow/compare/master...rfroetscher:dagrun_failure

  was:
Are there any plans to have retry, email, or callbacks on failure of a DAG run? 
Would you guys be open to someone implementing that? Right now particularly 
with dagrun_timeout, there is not much insight that the dag actually stopped.

Psuedocode: 
https://github.com/apache/incubator-airflow/compare/master...rfroetscher:dagrun_failure


> DagRun Failure: Retry, Email, Callbacks
> ---
>
> Key: AIRFLOW-511
> URL: https://issues.apache.org/jira/browse/AIRFLOW-511
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Rob Froetscher
>Priority: Minor
>
> Are there any plans to have retry, email, or callbacks on failure of a DAG 
> run? Would you guys be open to someone implementing that? Right now 
> particularly with dagrun_timeout, there is not much insight that the dag 
> actually stopped.
> Pseudocode: 
> https://github.com/apache/incubator-airflow/compare/master...rfroetscher:dagrun_failure
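
Editor's sketch of what is being requested, written independently of the linked pseudocode: retry a DAG run a fixed number of times and invoke a failure callback (which could send an email) once every attempt has failed or overrun its timeout. All names here are hypothetical, and the timeout is only checked after a run returns; it does not interrupt a run in progress.

```python
import time


def run_dag_with_callbacks(run_once, timeout_sec, retries, on_failure):
    """Hypothetical driver: retry a DAG run, then notify on final failure.

    run_once    -- callable returning True on success (stand-in for a DAG run)
    timeout_sec -- a run that takes longer than this counts as failed
    retries     -- number of extra attempts after the first
    on_failure  -- callable receiving a short reason string
    """
    reason = "unknown"
    for _attempt in range(retries + 1):
        started = time.monotonic()
        try:
            ok = run_once()
        except Exception as exc:  # a crashed run is also a failure
            ok, reason = False, "error: %s" % exc
        else:
            if not ok:
                reason = "run reported failure"
        if ok and time.monotonic() - started > timeout_sec:
            ok, reason = False, "timed out"
        if ok:
            return True
    # Every attempt failed: this is the point where email or callbacks fire.
    on_failure(reason)
    return False


notifications = []
attempts = []


def always_failing_run():
    attempts.append(1)
    return False


result = run_dag_with_callbacks(always_failing_run, timeout_sec=60, retries=2,
                                on_failure=notifications.append)
```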





[jira] [Updated] (AIRFLOW-511) DagRun Failure Retry, Email, Callbacks

2016-09-15 Thread Rob Froetscher (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rob Froetscher updated AIRFLOW-511:
---
Description: Are there any plans to have retry, email, or callbacks on 
failure of a DAG run? Would you guys be open to someone implementing that? 
Right now particularly with dagrun_timeout, there is not much insight that the 
dag actually stopped.  (was: Are there any plans to have retry, email, or 
callbacks on failure of a DAG run? Would you guys be open to someone 
implementing that. Right now particularly with dagrun_timeout, there is not 
much insight that the dag actually stopped.)

> DagRun Failure Retry, Email, Callbacks
> --
>
> Key: AIRFLOW-511
> URL: https://issues.apache.org/jira/browse/AIRFLOW-511
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Rob Froetscher
>Priority: Minor
>
> Are there any plans to have retry, email, or callbacks on failure of a DAG 
> run? Would you guys be open to someone implementing that? Right now 
> particularly with dagrun_timeout, there is not much insight that the dag 
> actually stopped.





[jira] [Updated] (AIRFLOW-511) DagRun Failure: Retry, Email, Callbacks

2016-09-15 Thread Rob Froetscher (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rob Froetscher updated AIRFLOW-511:
---
Summary: DagRun Failure: Retry, Email, Callbacks  (was: DagRun Failure 
Retry, Email, Callbacks)

> DagRun Failure: Retry, Email, Callbacks
> ---
>
> Key: AIRFLOW-511
> URL: https://issues.apache.org/jira/browse/AIRFLOW-511
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Rob Froetscher
>Priority: Minor
>
> Are there any plans to have retry, email, or callbacks on failure of a DAG 
> run? Would you guys be open to someone implementing that? Right now 
> particularly with dagrun_timeout, there is not much insight that the dag 
> actually stopped.





[jira] [Commented] (AIRFLOW-511) DagRun Failure Retry, Email, Callbacks

2016-09-15 Thread Rob Froetscher (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494568#comment-15494568
 ] 

Rob Froetscher commented on AIRFLOW-511:


[~criccomini] any thoughts on this?

> DagRun Failure Retry, Email, Callbacks
> --
>
> Key: AIRFLOW-511
> URL: https://issues.apache.org/jira/browse/AIRFLOW-511
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Rob Froetscher
>Priority: Minor
>
> Are there any plans to have retry, email, or callbacks on failure of a DAG 
> run? Would you guys be open to someone implementing that. Right now 
> particularly with dagrun_timeout, there is not much insight that the dag 
> actually stopped.





[jira] [Comment Edited] (AIRFLOW-401) scheduler gets stuck without a trace

2016-09-15 Thread JIRA

[ 
https://issues.apache.org/jira/browse/AIRFLOW-401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15473369#comment-15473369
 ] 

Maciej Bryński edited comment on AIRFLOW-401 at 9/15/16 6:31 PM:
-

After upgrading to master everything works so slowly.

I have 3 schedulers and 3 workers, and most of the log lines I see are:
{code}
[2016-09-08 09:26:36,444] {jobs.py:1346} INFO - Heartbeating the process manager
[2016-09-08 09:26:36,444] {jobs.py:1383} INFO - Heartbeating the executor
{code}
And there is a new run every few minutes.


was (Author: maver1ck):
After upgrading to master everything works so fucking slowly.

I have 3 schedulers and 3 workers, and most of the log lines I see are:
{code}
[2016-09-08 09:26:36,444] {jobs.py:1346} INFO - Heartbeating the process manager
[2016-09-08 09:26:36,444] {jobs.py:1383} INFO - Heartbeating the executor
{code}
And there is a new run every few minutes.

> scheduler gets stuck without a trace
> 
>
> Key: AIRFLOW-401
> URL: https://issues.apache.org/jira/browse/AIRFLOW-401
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: executor, scheduler
>Affects Versions: Airflow 1.7.1.3
>Reporter: Nadeem Ahmed Nazeer
>Assignee: Bolke de Bruin
>Priority: Minor
> Attachments: Dag_code.txt, schduler_cpu100%.png, scheduler_stuck.png, 
> scheduler_stuck_7hours.png
>
>
> The scheduler gets stuck without a trace or error. When this happens, the CPU 
> usage of scheduler service is at 100%. No jobs get submitted and everything 
> comes to a halt. It looks like it goes into some kind of infinite loop. 
> The only way I could make it run again is by manually restarting the 
> scheduler service. But again, after running some tasks it gets stuck. I've 
> tried with both Celery and Local executors but the same issue occurs. I am 
> using the -n 3 parameter while starting the scheduler. 
> Scheduler configs,
> job_heartbeat_sec = 5
> scheduler_heartbeat_sec = 5
> executor = LocalExecutor
> parallelism = 32
> Please help. I would be happy to provide any other information needed
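
Editor's sketch: the settings quoted above, laid out as they might appear in a configuration file and read back with Python's standard configparser. The section names ([core] and [scheduler]) are an assumption, since the report lists the keys without their sections.

```python
import configparser

# Section names are an assumption; the report lists the keys without sections.
SAMPLE_CFG = """
[core]
executor = LocalExecutor
parallelism = 32

[scheduler]
job_heartbeat_sec = 5
scheduler_heartbeat_sec = 5
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_CFG)

# Typed accessors make it obvious which values are numeric.
executor = parser.get("core", "executor")
parallelism = parser.getint("core", "parallelism")
job_heartbeat = parser.getint("scheduler", "job_heartbeat_sec")
scheduler_heartbeat = parser.getint("scheduler", "scheduler_heartbeat_sec")
```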





[jira] [Commented] (AIRFLOW-462) Concurrent Scheduler Jobs pushing the same task to queue

2016-09-15 Thread JIRA

[ 
https://issues.apache.org/jira/browse/AIRFLOW-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494161#comment-15494161
 ] 

Maciej Bryński commented on AIRFLOW-462:


I have the same problem in master (as of 2016-09-15)

> Concurrent Scheduler Jobs pushing the same task to queue
> 
>
> Key: AIRFLOW-462
> URL: https://issues.apache.org/jira/browse/AIRFLOW-462
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: Airflow 1.7.0
>Reporter: Yogesh
>Priority: Blocker
>
> Hi,
> We are using airflow version 1.7.0 and we tried to implement high 
> availability for airflow daemons in our production environment.
> Detailed high availability approach:
> - Airflow running on two different machines with all the 
> daemons (webserver, scheduler, executor)
> - Single MySQL db repository pointed to by two schedulers
> - Replicated dag files on both machines
> - Running a single RabbitMQ instance as message broker
> While doing so we came across the problem below:
> - A particular task was sent to the executor twice (two entries in the 
> message queue) by two different schedulers. But we see only a single entry 
> for the task instance in the database, which is correct.
> We checked the code and found the following:
> - before sending the task to the executor it checks the task state in the 
> database, and if it's not already QUEUED it pushes that task to the queue
> Issue:
> As there is no locking implemented on the task instance in the database, and 
> both Scheduler jobs are running so close together, the second one might 
> check the status in the db before the first one updates it to QUEUED.
> We are not sure if this issue has been taken care of in a recent release.
> Would you please help with an appropriate approach so that high 
> availability can be achieved?
> Thanks
> Yogesh
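
Editor's sketch of one possible remedy for the check-then-queue race described above: fold the state check and the state change into a single conditional UPDATE, so that only one scheduler can win. This is not taken from Airflow's code. The table layout and the SCHEDULED state name are hypothetical, and an in-memory SQLite database stands in for the shared MySQL repository.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_instance (task_id TEXT PRIMARY KEY, state TEXT)")
conn.execute("INSERT INTO task_instance VALUES ('t1', 'SCHEDULED')")
conn.commit()


def try_queue(conn, task_id):
    """Atomically flip a task to QUEUED; return True only for the winner."""
    cur = conn.execute(
        "UPDATE task_instance SET state = 'QUEUED' "
        "WHERE task_id = ? AND state != 'QUEUED'",
        (task_id,),
    )
    conn.commit()
    # rowcount is 1 only if this call actually changed the row.
    return cur.rowcount == 1


sent_to_broker = []
for scheduler in ("scheduler-a", "scheduler-b"):
    # Only publish to the message queue if this scheduler won the update.
    if try_queue(conn, "t1"):
        sent_to_broker.append((scheduler, "t1"))
```

Because the database applies the conditional UPDATE atomically, the second scheduler sees zero affected rows and skips the publish, which avoids a separate read followed by a write.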





[jira] [Comment Edited] (AIRFLOW-401) scheduler gets stuck without a trace

2016-09-15 Thread JIRA

[ 
https://issues.apache.org/jira/browse/AIRFLOW-401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15477516#comment-15477516
 ] 

Maciej Bryński edited comment on AIRFLOW-401 at 9/15/16 6:29 PM:
-

I will try this.
In the meantime I found the undocumented min_file_process_interval option.
UPDATE: the same was in the patch

That solved many of my problems but triggered new ones.

How can I set up an HA Scheduler? Having more than one instance triggers 
duplicates of DagRuns.


was (Author: maver1ck):
I will try this.
In the meantime I found the undocumented min_file_process_interval option.

That solved many of my problems but triggered new ones.

How can I set up an HA Scheduler? Having more than one instance triggers 
duplicates of DagRuns.

> scheduler gets stuck without a trace
> 
>
> Key: AIRFLOW-401
> URL: https://issues.apache.org/jira/browse/AIRFLOW-401
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: executor, scheduler
>Affects Versions: Airflow 1.7.1.3
>Reporter: Nadeem Ahmed Nazeer
>Assignee: Bolke de Bruin
>Priority: Minor
> Attachments: Dag_code.txt, schduler_cpu100%.png, scheduler_stuck.png, 
> scheduler_stuck_7hours.png
>
>
> The scheduler gets stuck without a trace or error. When this happens, the CPU 
> usage of scheduler service is at 100%. No jobs get submitted and everything 
> comes to a halt. It looks like it goes into some kind of infinite loop. 
> The only way I could make it run again is by manually restarting the 
> scheduler service. But again, after running some tasks it gets stuck. I've 
> tried with both Celery and Local executors but the same issue occurs. I am 
> using the -n 3 parameter while starting the scheduler. 
> Scheduler configs,
> job_heartbeat_sec = 5
> scheduler_heartbeat_sec = 5
> executor = LocalExecutor
> parallelism = 32
> Please help. I would be happy to provide any other information needed





[jira] [Closed] (AIRFLOW-509) Create operator to delete tables in BigQuery

2016-09-15 Thread Chris Riccomini (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Riccomini closed AIRFLOW-509.
---
   Resolution: Done
Fix Version/s: Airflow 1.8

> Create operator to delete tables in BigQuery
> 
>
> Key: AIRFLOW-509
> URL: https://issues.apache.org/jira/browse/AIRFLOW-509
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Ilya Rakoshes
>Assignee: Ilya Rakoshes
> Fix For: Airflow 1.8
>
>
> Airflow currently has several operators for BigQuery, but none of them are 
> able to delete tables. The request is to create a new operator that will be 
> able to do this.
> An important application of this is to help refresh views for which the 
> schema of the underlying tables is subject to change. BigQuery has no 
> functionality to refresh the schema of a view, so if the underlying table 
> changes the schema will not have the new data and things like auto-completion 
> will not work in the UI. By deleting a view and re-creating it, we can force 
> a schema refresh.





[jira] [Commented] (AIRFLOW-509) Create operator to delete tables in BigQuery

2016-09-15 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493995#comment-15493995
 ] 

ASF subversion and git services commented on AIRFLOW-509:
-

Commit 0e3ed447b5ec854d4af91819b9efee886918 in incubator-airflow's branch 
refs/heads/master from [~illop]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=0e3ed44 ]

[AIRFLOW-509][AIRFLOW-1] Create operator to delete tables in BigQuery

We have a use case to delete BigQuery tables and views. This patch
adds a delete operator that allows us to do so.

Closes #1798 from illop/BigQueryDeleteOperator


> Create operator to delete tables in BigQuery
> 
>
> Key: AIRFLOW-509
> URL: https://issues.apache.org/jira/browse/AIRFLOW-509
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Ilya Rakoshes
>Assignee: Ilya Rakoshes
> Fix For: Airflow 1.8
>
>
> Airflow currently has several operators for BigQuery, but none of them are 
> able to delete tables. The request is to create a new operator that will be 
> able to do this.
> An important application of this is to help refresh views for which the 
> schema of the underlying tables is subject to change. BigQuery has no 
> functionality to refresh the schema of a view, so if the underlying table 
> changes the schema will not have the new data and things like auto-completion 
> will not work in the UI. By deleting a view and re-creating it, we can force 
> a schema refresh.


