[jira] [Created] (AIRFLOW-3087) Task stuck in UP_FOR_RETRY and continuously showing Not In Retry Period

2018-09-18 Thread Chandu Kavar (JIRA)
Chandu Kavar created AIRFLOW-3087:
-

 Summary: Task stuck in UP_FOR_RETRY and continuously showing Not 
In Retry Period
 Key: AIRFLOW-3087
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3087
 Project: Apache Airflow
  Issue Type: Bug
  Components: scheduler
Affects Versions: 1.9.0
Reporter: Chandu Kavar
 Attachments: Screen Shot 2018-09-19 at 10.27.25 AM.png

Hi,

We are facing issues with tasks stuck in "up_for_retry" in a few DAGs. When a task
fails and the scheduler marks it "up_for_retry", it gets stuck. In the task
instance details we see this log when the retry time arrives:

{{All dependencies are met but the task instance is not running. In most cases 
this just means that the task will probably be scheduled soon unless: - The 
scheduler is down or under heavy load If this task instance does not start soon 
please contact your Airflow administrator for assistance.}}

After the retry delay it again shows (and keeps showing) this log:

{{Not In Retry Period Task is not ready for retry yet but will be retried 
automatically. Current date is 2018-08-29T15:xx: and task will be retrieve. 
}}

After an hour the task is finally able to retry.

Code:

 
{code:python}
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

default_args = {
    'owner': 'Chandu',
    'depends_on_past': False,
    'retries': 3,
    'retry_delay': timedelta(minutes=1),
    'queue': 'worker_test'
}

dag = DAG('airflow-examples.test_failed_dag_v3', description='Failed DAG',
          schedule_interval='*/10 * * * *',
          start_date=datetime(2018, 9, 7), default_args=default_args)

# "mdr" is not a valid command, so this task always fails and retries.
b = BashOperator(
    task_id="ls_command",
    bash_command="mdr",
    dag=dag
)
{code}
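For context, the scheduler only retries a failed try once its retry window has elapsed; a minimal sketch of that eligibility check (simplified and with illustrative names, not Airflow's actual implementation):

```python
from datetime import datetime, timedelta

# Simplified sketch of the "is the retry period over?" check behind the
# "Not In Retry Period" message. The function name and exact rule are
# illustrative assumptions, not Airflow 1.9 internals.
RETRY_DELAY = timedelta(minutes=1)

def ready_for_retry(failed_at, now, retry_delay=RETRY_DELAY):
    """A failed try is eligible once failed_at + retry_delay has passed."""
    return now >= failed_at + retry_delay

failed_at = datetime(2018, 8, 29, 15, 0, 0)
print(ready_for_retry(failed_at, failed_at + timedelta(seconds=30)))  # False
print(ready_for_retry(failed_at, failed_at + timedelta(minutes=2)))   # True
```

With a one-minute `retry_delay`, the task in this report should become eligible a minute after each failure, so an hour-long wait suggests the scheduler is not re-evaluating the dependency in time.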
 


Tree view of the DAG:

 

!Screen Shot 2018-09-19 at 10.27.25 AM.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] criccomini commented on issue #3916: [AIRFLOW-3085] Bug fix to allow log display in RBAC UI

2018-09-18 Thread GitBox
criccomini commented on issue #3916: [AIRFLOW-3085] Bug fix to allow log 
display in RBAC UI
URL: 
https://github.com/apache/incubator-airflow/pull/3916#issuecomment-422627689
 
 
   LGTM!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io commented on issue #3917: [AIRFLOW-3086] Add extras group for google auth to setup.py.

2018-09-18 Thread GitBox
codecov-io commented on issue #3917: [AIRFLOW-3086] Add extras group for google 
auth to setup.py.
URL: 
https://github.com/apache/incubator-airflow/pull/3917#issuecomment-422625841
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3917?src=pr=h1)
 Report
   > Merging 
[#3917](https://codecov.io/gh/apache/incubator-airflow/pull/3917?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/2f50083c8dfcd79ad569216a78b67f7568347628?src=pr=desc)
 will **decrease** coverage by `0.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3917/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3917?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #3917      +/-   ##
   ==========================================
   - Coverage   77.55%   77.53%   -0.02%
   ==========================================
     Files         198      198
     Lines       15847    15847
   ==========================================
   - Hits        12290    12287       -3
   - Misses       3557     3560       +3
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3917?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/jobs.py](https://codecov.io/gh/apache/incubator-airflow/pull/3917/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzLnB5)
 | `82.48% <0%> (-0.27%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3917?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3917?src=pr=footer).
 Last update 
[2f50083...c249828](https://codecov.io/gh/apache/incubator-airflow/pull/3917?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services



[jira] [Commented] (AIRFLOW-3086) Add extras group in setup.py for google oauth

2018-09-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619966#comment-16619966
 ] 

ASF GitHub Bot commented on AIRFLOW-3086:
-

jmcarp opened a new pull request #3917: [AIRFLOW-3086] Add extras group for 
google auth to setup.py.
URL: https://github.com/apache/incubator-airflow/pull/3917
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   To clarify installation instructions for the google auth backend, add an
   install group to `setup.py` that installs the google auth dependencies via
   `pip install apache-airflow[google_auth]`.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   This patch just adds an item to `setup.py`.
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add extras group in setup.py for google oauth
> -
>
> Key: AIRFLOW-3086
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3086
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Josh Carp
>Priority: Major
>
> Since the google auth backend requires Flask-OAuthlib, it would be helpful to 
> add an extras group to setup.py for google auth that installs this dependency.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)



[jira] [Created] (AIRFLOW-3086) Add extras group in setup.py for google oauth

2018-09-18 Thread Josh Carp (JIRA)
Josh Carp created AIRFLOW-3086:
--

 Summary: Add extras group in setup.py for google oauth
 Key: AIRFLOW-3086
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3086
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Josh Carp


Since the google auth backend requires Flask-OAuthlib, it would be helpful to 
add an extras group to setup.py for google auth that installs this dependency.
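As a sketch, an extras group is just a key in setuptools' `extras_require` mapping; the key name and version pin below are assumptions based on this issue, not the actual setup.py contents:

```python
# Hypothetical sketch of the proposed extras group; the 'google_auth'
# key and the Flask-OAuthlib pin are assumed from the issue text.
extras_require = {
    'google_auth': ['Flask-OAuthlib>=0.9'],
}

# Passed to setup(), this would make `pip install apache-airflow[google_auth]`
# additionally install everything listed under the 'google_auth' key.
print(extras_require['google_auth'])
```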



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] codecov-io commented on issue #3916: [AIRFLOW-3085] Bug fix to allow log display in RBAC UI

2018-09-18 Thread GitBox
codecov-io commented on issue #3916: [AIRFLOW-3085] Bug fix to allow log 
display in RBAC UI
URL: 
https://github.com/apache/incubator-airflow/pull/3916#issuecomment-422598348
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3916?src=pr=h1)
 Report
   > Merging 
[#3916](https://codecov.io/gh/apache/incubator-airflow/pull/3916?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/2f50083c8dfcd79ad569216a78b67f7568347628?src=pr=desc)
 will **decrease** coverage by `0.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3916/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3916?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #3916      +/-   ##
   ==========================================
   - Coverage   77.55%   77.53%   -0.02%
   ==========================================
     Files         198      198
     Lines       15847    15847
   ==========================================
   - Hits        12290    12287       -3
   - Misses       3557     3560       +3
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3916?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/www\_rbac/security.py](https://codecov.io/gh/apache/incubator-airflow/pull/3916/diff?src=pr=tree#diff-YWlyZmxvdy93d3dfcmJhYy9zZWN1cml0eS5weQ==)
 | `91.27% <ø> (ø)` | :arrow_up: |
   | 
[airflow/jobs.py](https://codecov.io/gh/apache/incubator-airflow/pull/3916/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzLnB5)
 | `82.48% <0%> (-0.27%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3916?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3916?src=pr=footer).
 Last update 
[2f50083...16d3c23](https://codecov.io/gh/apache/incubator-airflow/pull/3916?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3085) Log viewing not possible in default RBAC setting

2018-09-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619874#comment-16619874
 ] 

ASF GitHub Bot commented on AIRFLOW-3085:
-

jgao54 opened a new pull request #3916: [AIRFLOW-3085] Bug fix to allow log 
display in RBAC UI
URL: https://github.com/apache/incubator-airflow/pull/3916
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-3085
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   Tested via UI
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [ ] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Log viewing not possible in default RBAC setting
> 
>
> Key: AIRFLOW-3085
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3085
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Joy Gao
>Priority: Major
>
> Aside from Admin role, all other roles are not able to view logs right now 
> due to a missing permission in the default setting. The permission should be 
> added to Viewer/User/Op as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Created] (AIRFLOW-3085) Log viewing not possible in default RBAC setting

2018-09-18 Thread Joy Gao (JIRA)
Joy Gao created AIRFLOW-3085:


 Summary: Log viewing not possible in default RBAC setting
 Key: AIRFLOW-3085
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3085
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Joy Gao


Aside from Admin role, all other roles are not able to view logs right now due 
to a missing permission in the default setting. The permission should be added 
to Viewer/User/Op as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] codecov-io commented on issue #3915: [AIRFLOW-XXX] Fix SlackWebhookOperator docs

2018-09-18 Thread GitBox
codecov-io commented on issue #3915: [AIRFLOW-XXX] Fix SlackWebhookOperator docs
URL: 
https://github.com/apache/incubator-airflow/pull/3915#issuecomment-422586261
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3915?src=pr=h1)
 Report
   > Merging 
[#3915](https://codecov.io/gh/apache/incubator-airflow/pull/3915?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/2f50083c8dfcd79ad569216a78b67f7568347628?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3915/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3915?src=pr=tree)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #3915   +/-   ##
   =======================================
     Coverage   77.55%   77.55%
   =======================================
     Files         198      198
     Lines       15847    15847
   =======================================
     Hits        12290    12290
     Misses       3557     3557
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3915?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3915?src=pr=footer).
 Last update 
[2f50083...d200895](https://codecov.io/gh/apache/incubator-airflow/pull/3915?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services



[GitHub] sbilinski opened a new pull request #3915: [AIRFLOW-XXX] Fix SlackWebhookOperator docs

2018-09-18 Thread GitBox
sbilinski opened a new pull request #3915: [AIRFLOW-XXX] Fix 
SlackWebhookOperator docs
URL: https://github.com/apache/incubator-airflow/pull/3915
 
 
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. 
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue. -> ✅ 
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   The docs refer to `conn_id` while the actual argument is `http_conn_id`.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   No tests - documentation fix only. 
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io commented on issue #3914: [AIRFLOW-3069] Log all output of the S3 file transform script

2018-09-18 Thread GitBox
codecov-io commented on issue #3914: [AIRFLOW-3069] Log all output of the S3 
file transform script
URL: 
https://github.com/apache/incubator-airflow/pull/3914#issuecomment-422548293
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3914?src=pr=h1)
 Report
   > Merging 
[#3914](https://codecov.io/gh/apache/incubator-airflow/pull/3914?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/2f50083c8dfcd79ad569216a78b67f7568347628?src=pr=desc)
 will **decrease** coverage by `0.01%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3914/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3914?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #3914      +/-   ##
   ==========================================
   - Coverage   77.55%   77.54%   -0.02%
   ==========================================
     Files         198      198
     Lines       15847    15852       +5
   ==========================================
   + Hits        12290    12292       +2
   - Misses       3557     3560       +3
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3914?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/operators/s3\_file\_transform\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/3914/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvczNfZmlsZV90cmFuc2Zvcm1fb3BlcmF0b3IucHk=)
 | `94.44% <100%> (+0.56%)` | :arrow_up: |
   | 
[airflow/jobs.py](https://codecov.io/gh/apache/incubator-airflow/pull/3914/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzLnB5)
 | `82.48% <0%> (-0.27%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3914?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3914?src=pr=footer).
 Last update 
[2f50083...48d1d3b](https://codecov.io/gh/apache/incubator-airflow/pull/3914?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io commented on issue #3914: [AIRFLOW-3069] Log all output of the S3 file transform script

2018-09-18 Thread GitBox
codecov-io commented on issue #3914: [AIRFLOW-3069] Log all output of the S3 
file transform script
URL: 
https://github.com/apache/incubator-airflow/pull/3914#issuecomment-422548218
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3914?src=pr=h1)
 Report
   > Merging 
[#3914](https://codecov.io/gh/apache/incubator-airflow/pull/3914?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/2f50083c8dfcd79ad569216a78b67f7568347628?src=pr=desc)
 will **decrease** coverage by `0.01%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3914/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3914?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master#3914  +/-   ##
   ==
   - Coverage   77.55%   77.54%   -0.02% 
   ==
 Files 198  198  
 Lines   1584715852   +5 
   ==
   + Hits1229012292   +2 
   - Misses   3557 3560   +3
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3914?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/operators/s3\_file\_transform\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/3914/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvczNfZmlsZV90cmFuc2Zvcm1fb3BlcmF0b3IucHk=)
 | `94.44% <100%> (+0.56%)` | :arrow_up: |
   | 
[airflow/jobs.py](https://codecov.io/gh/apache/incubator-airflow/pull/3914/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzLnB5)
 | `82.48% <0%> (-0.27%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3914?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3914?src=pr=footer).
 Last update 
[2f50083...48d1d3b](https://codecov.io/gh/apache/incubator-airflow/pull/3914?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-3084) Webserver Returns 404 When base_url is set via Environment

2018-09-18 Thread Toby Jennings (JIRA)
Toby Jennings created AIRFLOW-3084:
--

 Summary: Webserver Returns 404 When base_url is set via Environment
 Key: AIRFLOW-3084
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3084
 Project: Apache Airflow
  Issue Type: Bug
  Components: webserver
Affects Versions: 1.10.0
 Environment: Docker
Reporter: Toby Jennings


Attempting to mount Airflow at a subpath beneath root (see AIRFLOW-1755).

When Airflow is configured via the environment variable 
"AIRFLOW__WEBSERVER__BASE_URL", the web server returns a 404 for all paths. When 
"base_url" is set directly in airflow.cfg, the web server works as expected. 
The documentation suggests that configuring Airflow via environment variables 
should be sufficient.

 

Steps to reproduce:
 # Install Airflow.
 # Set AIRFLOW__WEBSERVER__BASE_URL to "http://localhost:8080/airflow"
 # Access Airflow at "/", "/airflow" or any other path.
 # Webserver returns 404 ("Apache Airflow is not at this location.")

 

Workaround:
 # Install Airflow
 # Set "base_url" to "http://localhost:8080/airflow" in airflow.cfg
 # Access Airflow at "/airflow"
 # Webserver works as intended.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-3083) Trigger Dag Returns Redirect to Incorrect Path when Airflow is mounted under root

2018-09-18 Thread Toby Jennings (JIRA)
Toby Jennings created AIRFLOW-3083:
--

 Summary: Trigger Dag Returns Redirect to Incorrect Path when 
Airflow is mounted under root
 Key: AIRFLOW-3083
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3083
 Project: Apache Airflow
  Issue Type: Bug
  Components: webserver
Affects Versions: 1.10.0
Reporter: Toby Jennings


Steps to reproduce:
 # Configure Airflow 1.10.0 for operation mounted at a subpath under root by 
setting "base_url" to, e.g., "http://localhost:8080/airflow"
 # Use web server UI to trigger a dag run.
 # A 404 error is returned.

 

This may be caused by several routes in www/views.py, including /trigger/, 
returning "redirect(origin)" where "origin" is hardcoded to "/admin/" instead of 
deriving the appropriate subpath using url_for().
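A toy illustration (not Airflow's actual code) of why a hardcoded "/admin/" target breaks when Airflow is mounted under a subpath, while deriving the target from the configured base_url does not:

```python
from urllib.parse import urlsplit

def redirect_target(base_url, route):
    """Prefix a route with the path component of the configured base_url."""
    base_path = urlsplit(base_url).path.rstrip("/")
    return base_path + route

# A hardcoded "/admin/" only works when Airflow is mounted at root:
print(redirect_target("http://localhost:8080", "/admin/"))          # /admin/
print(redirect_target("http://localhost:8080/airflow", "/admin/"))  # /airflow/admin/
```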





[GitHub] codecov-io commented on issue #3596: [AIRFLOW-2747] Explicit re-schedule of sensors

2018-09-18 Thread GitBox
codecov-io commented on issue #3596: [AIRFLOW-2747] Explicit re-schedule of 
sensors
URL: 
https://github.com/apache/incubator-airflow/pull/3596#issuecomment-422536512
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3596?src=pr=h1)
 Report
   > Merging 
[#3596](https://codecov.io/gh/apache/incubator-airflow/pull/3596?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/0e5eee83b14b2a57345370b14e91404d518f0bf4?src=pr=desc)
 will **increase** coverage by `0.02%`.
   > The diff coverage is `82.4%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3596/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3596?src=pr=tree)
   
    ```diff
    @@            Coverage Diff             @@
    ##           master    #3596      +/-   ##
    ==========================================
    + Coverage   77.52%   77.55%   +0.02%
    ==========================================
      Files         198      199       +1
      Lines       15842    15963     +121
    ==========================================
    + Hits        12282    12380      +98
    - Misses       3560     3583      +23
    ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3596?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3596/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `89.01% <100%> (+0.2%)` | :arrow_up: |
   | 
[airflow/sensors/base\_sensor\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/3596/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL2Jhc2Vfc2Vuc29yX29wZXJhdG9yLnB5)
 | `97.87% <100%> (+1.2%)` | :arrow_up: |
   | 
[airflow/exceptions.py](https://codecov.io/gh/apache/incubator-airflow/pull/3596/diff?src=pr=tree#diff-YWlyZmxvdy9leGNlcHRpb25zLnB5)
 | `100% <100%> (ø)` | :arrow_up: |
   | 
[airflow/ti\_deps/dep\_context.py](https://codecov.io/gh/apache/incubator-airflow/pull/3596/diff?src=pr=tree#diff-YWlyZmxvdy90aV9kZXBzL2RlcF9jb250ZXh0LnB5)
 | `100% <100%> (ø)` | :arrow_up: |
   | 
[airflow/ti\_deps/deps/ready\_to\_reschedule.py](https://codecov.io/gh/apache/incubator-airflow/pull/3596/diff?src=pr=tree#diff-YWlyZmxvdy90aV9kZXBzL2RlcHMvcmVhZHlfdG9fcmVzY2hlZHVsZS5weQ==)
 | `100% <100%> (ø)` | |
   | 
[airflow/www\_rbac/views.py](https://codecov.io/gh/apache/incubator-airflow/pull/3596/diff?src=pr=tree#diff-YWlyZmxvdy93d3dfcmJhYy92aWV3cy5weQ==)
 | `72.04% <35.29%> (-0.59%)` | :arrow_down: |
   | 
[airflow/www/views.py](https://codecov.io/gh/apache/incubator-airflow/pull/3596/diff?src=pr=tree#diff-YWlyZmxvdy93d3cvdmlld3MucHk=)
 | `68.85% <35.29%> (-0.46%)` | :arrow_down: |
   | 
[airflow/bin/cli.py](https://codecov.io/gh/apache/incubator-airflow/pull/3596/diff?src=pr=tree#diff-YWlyZmxvdy9iaW4vY2xpLnB5)
 | `64.74% <0%> (ø)` | :arrow_up: |
   | ... and [3 
more](https://codecov.io/gh/apache/incubator-airflow/pull/3596/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3596?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3596?src=pr=footer).
 Last update 
[0e5eee8...cdd8f89](https://codecov.io/gh/apache/incubator-airflow/pull/3596?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[jira] [Commented] (AIRFLOW-3069) Decode output of S3 file transform operator

2018-09-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619625#comment-16619625
 ] 

ASF GitHub Bot commented on AIRFLOW-3069:
-

sbilinski opened a new pull request #3914: [AIRFLOW-3069] Log all output of the 
S3 file transform script
URL: https://github.com/apache/incubator-airflow/pull/3914
 
 
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title.
 - 
[https://jira.apache.org/jira/browse/AIRFLOW-3069](https://jira.apache.org/jira/browse/AIRFLOW-3069)
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   The output of the process spawned by `S3FileTransformOperator` is not 
properly decoded, which makes reading logs rather difficult. Additionally, the 
`stderr` stream is only shown when process exit code is not equal to `0`.
   
   I would like to propose the following changes to `S3FileTransformOperator`:
   
   - Send both output streams (stdout & stderr) to the logger, regardless of 
the outcome (if any data is present)
   - Decode the output, so that new lines can be displayed correctly.
   - Include process exit code in the exception message, if the process fails.
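   A rough sketch of the proposed decoding and logging (the helper name `run_transform` is hypothetical, not the PR's actual code):

```python
import logging
import subprocess
import sys

log = logging.getLogger(__name__)

def run_transform(cmd):
    """Run a transform script; log decoded stdout/stderr, raise with exit code."""
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = process.communicate()
    # Log both streams if they contain any data, regardless of the outcome:
    if stdout:
        log.info("Transform script stdout:\n%s", stdout.decode("utf-8"))
    if stderr:
        log.info("Transform script stderr:\n%s", stderr.decode("utf-8"))
    # Include the return code in the exception message on failure:
    if process.returncode:
        raise RuntimeError(
            "Transform script failed with return code %d" % process.returncode)
    return stdout.decode("utf-8")

out = run_transform([sys.executable, "-c", "print('Done')"])
```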
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   I've added a separate case for testing `transform_script` with output 
present. Since logging is essential in this case, the test checks if a valid 
message was passed to the logging module. 
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   




> Decode output of S3 file transform operator
> ---
>
> Key: AIRFLOW-3069
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3069
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: aws
>Affects Versions: 1.10.0
>Reporter: Szymon Bilinski
>Assignee: Szymon Bilinski
>Priority: Trivial
>
> h3. Current behaviour
> {{S3FileTransformOperator}} logs {{stdout}} of the underlying process as such:
> {code}
> [2018-09-15 23:17:13,850] {{s3_file_transform_operator.py:122}} INFO - 
> Transform script stdout b'Copying /tmp/tmpd5rjo8g0 to 
> /tmp/tmpd3vkhzte\nDone\n'
> {code}
> While {{stderr}} is omitted entirely, unless exit code is not {{0}} (in this 
> case it's included in the exception message only).
> h3. Proposed behaviour
> 1. Both streams are logged, regardless of the underlying process outcome 
> (i.e. success or failure).
> 2. Stream output is decoded before logging (e.g. {{\n}} is replaced with an 
> actual new line). 
> 3. If {{transform_script}} fails, the exception message contains return code 
> of the process.





[GitHub] sbilinski opened a new pull request #3914: [AIRFLOW-3069] Log all output of the S3 file transform script

2018-09-18 Thread GitBox
sbilinski opened a new pull request #3914: [AIRFLOW-3069] Log all output of the 
S3 file transform script
URL: https://github.com/apache/incubator-airflow/pull/3914
 
 
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title.
 - 
[https://jira.apache.org/jira/browse/AIRFLOW-3069](https://jira.apache.org/jira/browse/AIRFLOW-3069)
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   The output of the process spawned by `S3FileTransformOperator` is not 
properly decoded, which makes reading logs rather difficult. Additionally, the 
`stderr` stream is only shown when process exit code is not equal to `0`.
   
   I would like to propose the following changes to `S3FileTransformOperator`:
   
   - Send both output streams (stdout & stderr) to the logger, regardless of 
the outcome (if any data is present)
   - Decode the output, so that new lines can be displayed correctly.
   - Include process exit code in the exception message, if the process fails.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   I've added a separate case for testing `transform_script` with output 
present. Since logging is essential in this case, the test checks if a valid 
message was passed to the logging module. 
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   




[GitHub] seelmann commented on a change in pull request #3596: [AIRFLOW-2747] Explicit re-schedule of sensors

2018-09-18 Thread GitBox
seelmann commented on a change in pull request #3596: [AIRFLOW-2747] Explicit 
re-schedule of sensors
URL: https://github.com/apache/incubator-airflow/pull/3596#discussion_r218567791
 
 

 ##
 File path: airflow/sensors/base_sensor_operator.py
 ##
 @@ -65,6 +89,11 @@ def poke(self, context):
 
     def execute(self, context):
         started_at = timezone.utcnow()
+        if self.reschedule:
+            # If reschedule, use first start date of current try
+            task_reschedules = TaskReschedule.find_for_task_instance(context['ti'])
+            if task_reschedules:
+                started_at = task_reschedules[0].start_date
         while not self.poke(context):
             if (timezone.utcnow() - started_at).total_seconds() > self.timeout:
 
 Review comment:
   Normally that should not be the case, because `started_at` is always in the 
past. Of course clocks are never perfectly in sync (except at Google), so it may 
happen. But even in that case `datetime.timedelta.total_seconds()` doesn't throw 
an exception; it returns a negative number, which is still smaller than the 
configured timeout (assuming the user configured a positive timeout).
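This behaviour of `total_seconds()` can be checked directly; a `started_at` in the future just yields a negative elapsed time:

```python
from datetime import datetime, timedelta

now = datetime(2018, 9, 18, 12, 0, 0)
started_at = now + timedelta(minutes=5)  # clock skew: "start" in the future

elapsed = (now - started_at).total_seconds()
print(elapsed)  # -300.0, no exception raised

timeout = 60  # any positive timeout
print(elapsed > timeout)  # False, so the sensor keeps poking
```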




[GitHub] seelmann commented on a change in pull request #3596: [AIRFLOW-2747] Explicit re-schedule of sensors

2018-09-18 Thread GitBox
seelmann commented on a change in pull request #3596: [AIRFLOW-2747] Explicit 
re-schedule of sensors
URL: https://github.com/apache/incubator-airflow/pull/3596#discussion_r218564723
 
 

 ##
 File path: airflow/models.py
 ##
 @@ -56,8 +56,8 @@
 
 from sqlalchemy import (
     Column, Integer, String, DateTime, Text, Boolean, ForeignKey, PickleType,
-    Index, Float, LargeBinary, UniqueConstraint)
-from sqlalchemy import func, or_, and_, true as sqltrue
+    Index, Float, LargeBinary, UniqueConstraint, ForeignKeyConstraint)
+from sqlalchemy import func, or_, and_, true as sqltrue, asc
 
 Review comment:
   I changed it like this, let me know what you think.




[GitHub] seelmann commented on a change in pull request #3596: [AIRFLOW-2747] Explicit re-schedule of sensors

2018-09-18 Thread GitBox
seelmann commented on a change in pull request #3596: [AIRFLOW-2747] Explicit 
re-schedule of sensors
URL: https://github.com/apache/incubator-airflow/pull/3596#discussion_r218564370
 
 

 ##
 File path: airflow/sensors/base_sensor_operator.py
 ##
 @@ -75,11 +104,24 @@ def execute(self, context):
                     raise AirflowSkipException('Snap. Time is OUT.')
                 else:
                     raise AirflowSensorTimeout('Snap. Time is OUT.')
-            sleep(self.poke_interval)
+            if self.reschedule:
+                reschedule_date = timezone.utcnow() + timedelta(
+                    seconds=self.poke_interval)
+                raise AirflowRescheduleException(reschedule_date)
+            else:
+                sleep(self.poke_interval)
         self.log.info("Success criteria met. Exiting.")
 
     def _do_skip_downstream_tasks(self, context):
         downstream_tasks = context['task'].get_flat_relatives(upstream=False)
         self.log.debug("Downstream task_ids %s", downstream_tasks)
         if downstream_tasks:
             self.skip(context['dag_run'], context['ti'].execution_date,
                       downstream_tasks)
+
+    @property
+    def reschedule(self):
+        return self.mode == 'reschedule'
+
+    @property
+    def deps(self):
 
 Review comment:
   Done




[GitHub] seelmann commented on a change in pull request #3596: [AIRFLOW-2747] Explicit re-schedule of sensors

2018-09-18 Thread GitBox
seelmann commented on a change in pull request #3596: [AIRFLOW-2747] Explicit 
re-schedule of sensors
URL: https://github.com/apache/incubator-airflow/pull/3596#discussion_r218564334
 
 

 ##
 File path: airflow/ti_deps/deps/ready_to_reschedule.py
 ##
 @@ -0,0 +1,62 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.ti_deps.deps.base_ti_dep import BaseTIDep
+from airflow.utils import timezone
+from airflow.utils.db import provide_session
+from airflow.utils.state import State
+
+
+class ReadyToRescheduleDep(BaseTIDep):
+    NAME = "Ready To Reschedule"
+    IGNOREABLE = True
+    IS_TASK_DEP = True
+
+    @provide_session
+    def _get_dep_statuses(self, ti, session, dep_context):
 
 Review comment:
   Done




[GitHub] seelmann commented on a change in pull request #3596: [AIRFLOW-2747] Explicit re-schedule of sensors

2018-09-18 Thread GitBox
seelmann commented on a change in pull request #3596: [AIRFLOW-2747] Explicit 
re-schedule of sensors
URL: https://github.com/apache/incubator-airflow/pull/3596#discussion_r218563478
 
 

 ##
 File path: airflow/sensors/base_sensor_operator.py
 ##
 @@ -75,11 +104,24 @@ def execute(self, context):
                     raise AirflowSkipException('Snap. Time is OUT.')
                 else:
                     raise AirflowSensorTimeout('Snap. Time is OUT.')
-            sleep(self.poke_interval)
+            if self.reschedule:
+                reschedule_date = timezone.utcnow() + timedelta(
 
 Review comment:
   No. In "reschedule" mode `started_at` is always set to the initial schedule 
time (when the task instance was scheduled for the first time). `started_at` is 
only used to determine whether the timeout has been reached.
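In other words, the timeout in reschedule mode is measured from the first scheduling of the task instance, not from the latest poke; a minimal sketch under that assumption (the function name is hypothetical):

```python
from datetime import datetime, timedelta

def is_timed_out(first_scheduled_at, now, timeout_seconds):
    # Timeout is measured from the first schedule of the task instance,
    # not from the most recent reschedule.
    return (now - first_scheduled_at).total_seconds() > timeout_seconds

first = datetime(2018, 9, 18, 12, 0, 0)
# After three 10-minute reschedules, 30 minutes have elapsed in total:
print(is_timed_out(first, first + timedelta(minutes=30), 25 * 60))  # True
print(is_timed_out(first, first + timedelta(minutes=30), 60 * 60))  # False
```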




[jira] [Created] (AIRFLOW-3082) Task Status lags behind actual status in DAG: Tree View

2018-09-18 Thread Damon Cool (JIRA)
Damon Cool created AIRFLOW-3082:
---

 Summary: Task Status lags behind actual status in DAG: Tree View
 Key: AIRFLOW-3082
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3082
 Project: Apache Airflow
  Issue Type: Bug
Affects Versions: 1.10.0
Reporter: Damon Cool


Since upgrading to 1.10.0 (from 1.9) I have noticed that tasks don't show their 
current status. I noticed this by checking the logs: tasks that have completed 
according to the logs still show a clear status (not even the running state).

 





[jira] [Comment Edited] (AIRFLOW-26) GCP hook naming alignment

2018-09-18 Thread Kaxil Naik (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-26?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618958#comment-16618958
 ] 

Kaxil Naik edited comment on AIRFLOW-26 at 9/18/18 11:33 AM:
-

Currently, we are also proposing to move the contrib hooks and operators into a 
separate repository, so I am not sure what the best naming convention is. 
[~fenglu] Any thoughts? `gcp_service` seems logical to me.


was (Author: kaxilnaik):
Currently, we are also proposing to move the contrib hooks and operators into a 
separate repository, so I am not sure what the best naming convention is. 
[~fenglu] Any thoughts? But yes, the `gcp_service` naming seems logical to me.

> GCP hook naming alignment
> -
>
> Key: AIRFLOW-26
> URL: https://issues.apache.org/jira/browse/AIRFLOW-26
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
>  Labels: gcp
>
> Because we have quite a few GCP services, it's better to align the naming to 
> not confuse new users using Google Cloud Platform:
> gcp_storage > renamed from gcs
> gcp_bigquery > renamed from bigquery
> gcp_datastore > rename from datastore
> gcp_dataflow > TBD
> gcp_dataproc > TBD
> gcp_bigtable > TBD
> Note: this could break 'custom' operators if they use the hooks.
> Can be assigned to me.





[jira] [Commented] (AIRFLOW-26) GCP hook naming alignment

2018-09-18 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-26?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618965#comment-16618965
 ] 

Ash Berlin-Taylor commented on AIRFLOW-26:
--

bq.  Note: this could break 'custom' operators if they use the hooks.

You can create an import shim (a mostly empty python module) that issues a 
deprecation warning and makes the old names available.

> GCP hook naming alignment
> -
>
> Key: AIRFLOW-26
> URL: https://issues.apache.org/jira/browse/AIRFLOW-26
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
>  Labels: gcp
>
> Because we have quite a few GCP services, it's better to align the naming to 
> not confuse new users using Google Cloud Platform:
> gcp_storage > renamed from gcs
> gcp_bigquery > renamed from bigquery
> gcp_datastore > rename from datastore
> gcp_dataflow > TBD
> gcp_dataproc > TBD
> gcp_bigtable > TBD
> Note: this could break 'custom' operators if they use the hooks.
> Can be assigned to me.





[jira] [Comment Edited] (AIRFLOW-26) GCP hook naming alignment

2018-09-18 Thread Kaxil Naik (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-26?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618958#comment-16618958
 ] 

Kaxil Naik edited comment on AIRFLOW-26 at 9/18/18 11:23 AM:
-

Currently, we are also proposing to move the contrib hooks and operators into a 
separate repository, so I am not sure what the best naming convention is. 
[~fenglu] Any thoughts? But yes, the `gcp_service` naming seems logical to me.


was (Author: kaxilnaik):
Currently, we are also proposing to move the contrib hooks and operators into a 
separate repository, so I am not sure what the best naming convention is. 
[~fenglu] Any thoughts?

> GCP hook naming alignment
> -
>
> Key: AIRFLOW-26
> URL: https://issues.apache.org/jira/browse/AIRFLOW-26
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
>  Labels: gcp
>
> Because we have quite a few GCP services, it's better to align the naming to 
> not confuse new users using Google Cloud Platform:
> gcp_storage > renamed from gcs
> gcp_bigquery > renamed from bigquery
> gcp_datastore > rename from datastore
> gcp_dataflow > TBD
> gcp_dataproc > TBD
> gcp_bigtable > TBD
> Note: this could break 'custom' operators if they use the hooks.
> Can be assigned to me.





[jira] [Commented] (AIRFLOW-26) GCP hook naming alignment

2018-09-18 Thread Kaxil Naik (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-26?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618958#comment-16618958
 ] 

Kaxil Naik commented on AIRFLOW-26:
---

Currently, we are also proposing to move the contrib hooks and operators into a 
separate repository, so I am not sure what the best naming convention is. 
[~fenglu] Any thoughts?

> GCP hook naming alignment
> -
>
> Key: AIRFLOW-26
> URL: https://issues.apache.org/jira/browse/AIRFLOW-26
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
>  Labels: gcp
>
> Because we have quite a few GCP services, it's better to align the naming to 
> not confuse new users using Google Cloud Platform:
> gcp_storage > renamed from gcs
> gcp_bigquery > renamed from bigquery
> gcp_datastore > rename from datastore
> gcp_dataflow > TBD
> gcp_dataproc > TBD
> gcp_bigtable > TBD
> Note: this could break 'custom' operators if they use the hooks.
> Can be assigned to me.





[jira] [Commented] (AIRFLOW-3078) Basic operators for Google Compute Engine

2018-09-18 Thread Kaxil Naik (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618954#comment-16618954
 ] 

Kaxil Naik commented on AIRFLOW-3078:
-

Hey [~higrys], I am not sure what Feng Lu would say about this, but I am happy 
for you to work on this task as you seem to already have a design. I am 
definitely happy to review it and have requested access to the doc.

Thanks.

> Basic operators for Google Compute Engine
> -
>
> Key: AIRFLOW-3078
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3078
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: contrib, gcp
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Trivial
>
> In order to be able to interact with raw Google Compute Engine, we need an 
> operator that should be able to:
> For managing individual machines:
>  * Start Instance: 
> ([https://cloud.google.com/compute/docs/reference/rest/v1/instances/start])
>  * Set Machine Type 
> ([https://cloud.google.com/compute/docs/reference/rest/v1/instances/setMachineType])
>  
>  * Stop Instance: 
> ([https://cloud.google.com/compute/docs/reference/rest/v1/instances/stop])
> Also we should be able to manipulate instance groups:
>  * Get instance group: 
> ([https://cloud.google.com/compute/docs/reference/rest/v1/instanceGroupManagers/get])
>  * Insert Group: 
> ([https://cloud.google.com/compute/docs/reference/rest/v1/instanceGroupManagers/insert])
>  * Update Group: 
> ([https://cloud.google.com/compute/docs/reference/rest/beta/instanceGroupManagers/update])
>  





[jira] [Commented] (AIRFLOW-3081) Support automated integration tests in Travis CI

2018-09-18 Thread Jarek Potiuk (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618930#comment-16618930
 ] 

Jarek Potiuk commented on AIRFLOW-3081:
---

[~ashb] Sure, I will convert the doc to a Proposal :)

Re: the thought about who pays - I think we could approach it in two steps 
(that's part of our proposal as well):

Typically, when you have a fork of Airflow you set up your own Travis CI project 
(we have one), and for our fork we could use our own GCP project (that is our 
plan). The integration tests for our changes would then run on our project. 
We could run those tests conditionally (only when credentials are passed via 
Travis environment variables) - so the main project would not run them until 
the second stage, where the community agrees on some kind of sponsorship for 
the project and implements some security measures.

For now I would like to experiment with it in our environment only (it's very 
useful for us to make sure that the operators actually work), and once we see 
that it works well we can think about next steps.

In our case we base the integration tests on the example DAGs we provide for 
the operators, which is nice because then we have everything linked (and 
proven to work!): examples, user documentation, integration tests.
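Gating integration tests on credentials passed via environment variables could be sketched like this (the variable and test names are assumptions, not the actual proposal):

```python
import os
import unittest

# Hypothetical variable that a fork's Travis project would set to opt in:
GCP_KEY_PATH = os.environ.get("GCP_SERVICE_ACCOUNT_KEY_PATH")

@unittest.skipUnless(GCP_KEY_PATH, "GCP credentials not configured; skipping")
class TestGcfDeployIntegration(unittest.TestCase):
    def test_deploy_example_dag(self):
        # Here the example DAG would run against a real GCP project.
        self.assertTrue(True)
```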

> Support automated integration tests in Travis CI
> 
>
> Key: AIRFLOW-3081
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3081
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: ci
>Reporter: Jarek Potiuk
>Priority: Minor
>
> I think it would be great to have a way to run integration tests 
> automatically for some of the operators. We've started to work on some GCP 
> operators (Cloud Functions is the first one). We have a proposal on how Cloud 
> Functions (and later other GCP operators) could have integration tests that 
> could run on GCP infrastructure. Here is the link to the proposal Doc 
> [https://docs.google.com/document/d/1-763cYrOs37Sj77RzSQP5hy1GSvZ7I7MPOOG2Q86Osc/edit|https://docs.google.com/document/d/1-763cYrOs37Sj77RzSQP5hy1GSvZ7I7MPOOG2Q86Osc/edit?usp=sharing]
> Maybe it's a good time to start discussion on that :).





[jira] [Commented] (AIRFLOW-26) GCP hook naming alignment

2018-09-18 Thread Jarek Potiuk (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-26?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618921#comment-16618921
 ] 

Jarek Potiuk commented on AIRFLOW-26:
-

We are working on an implementation of the Cloud Functions operators 
(AIRFLOW-2912) and will start on basic Compute Engine operators (AIRFLOW-3078) 
very soon, so it would be great to align with the "future" consistent way. 
Currently we have a separate Python file for each operator (Delete/Deploy, plus 
Invoke in the future). From what I understand from this issue and AIRFLOW-2056, 
the preferred way is to put related "GC*" operators together into a single file 
(in our case they would be named gcp_functions and gcp_compute respectively).

Is my understanding correct?

> GCP hook naming alignment
> -
>
> Key: AIRFLOW-26
> URL: https://issues.apache.org/jira/browse/AIRFLOW-26
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
>  Labels: gcp
>
> Because we have quite a few GCP services, it's better to align the naming so 
> as not to confuse new users of Google Cloud Platform:
> gcp_storage > renamed from gcs
> gcp_bigquery > renamed from bigquery
> gcp_datastore > renamed from datastore
> gcp_dataflow > TBD
> gcp_dataproc > TBD
> gcp_bigtable > TBD
> Note: this could break 'custom' operators if they use the hooks.
> Can be assigned to me.
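One way to soften the "could break custom operators" concern above is a deprecation shim that keeps the old module name importable. This is only a sketch of the general Python technique, not Airflow's actual mechanism; the module names in the comment are illustrative:

```python
import importlib
import sys
import warnings

def alias_module(old_name, new_name):
    """Make `import old_name` resolve to the renamed module, with a warning."""
    module = importlib.import_module(new_name)
    warnings.warn(
        "%s is deprecated; import %s instead" % (old_name, new_name),
        DeprecationWarning,
    )
    # Registering the module under its old name keeps existing imports working.
    sys.modules[old_name] = module
    return module

# e.g. alias_module("airflow.hooks.gcs", "airflow.hooks.gcp_storage")
```

Custom operators importing the old name would then keep working for a deprecation period while emitting a warning.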





[jira] [Commented] (AIRFLOW-3078) Basic operators for Google Compute Engine

2018-09-18 Thread Jarek Potiuk (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618913#comment-16618913
 ] 

Jarek Potiuk commented on AIRFLOW-3078:
---

([~kaxilnaik] - I hope it's ok that I assigned the issue to myself :). Please 
let me know if you have any issue with it)

> Basic operators for Google Compute Engine
> -
>
> Key: AIRFLOW-3078
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3078
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: contrib, gcp
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Trivial
>
> In order to interact with raw Google Compute Engine, we need operators 
> that are able to:
> For managing individual machines:
>  * Start Instance: 
> ([https://cloud.google.com/compute/docs/reference/rest/v1/instances/start])
>  * Set Machine Type 
> ([https://cloud.google.com/compute/docs/reference/rest/v1/instances/setMachineType])
>  
>  * Stop Instance: 
> ([https://cloud.google.com/compute/docs/reference/rest/v1/instances/stop])
> Also we should be able to manipulate instance groups:
>  * Get instance group: 
> ([https://cloud.google.com/compute/docs/reference/rest/v1/instanceGroupManagers/get])
>  * Insert Group: 
> ([https://cloud.google.com/compute/docs/reference/rest/v1/instanceGroupManagers/insert])
>  * Update Group: 
> ([https://cloud.google.com/compute/docs/reference/rest/beta/instanceGroupManagers/update])
>  





[jira] [Assigned] (AIRFLOW-3078) Basic operators for Google Compute Engine

2018-09-18 Thread Jarek Potiuk (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reassigned AIRFLOW-3078:
-

Assignee: Jarek Potiuk  (was: Kaxil Naik)

> Basic operators for Google Compute Engine
> -
>
> Key: AIRFLOW-3078
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3078
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: contrib, gcp
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Trivial
>
> In order to interact with raw Google Compute Engine, we need operators 
> that are able to:
> For managing individual machines:
>  * Start Instance: 
> ([https://cloud.google.com/compute/docs/reference/rest/v1/instances/start])
>  * Set Machine Type 
> ([https://cloud.google.com/compute/docs/reference/rest/v1/instances/setMachineType])
>  
>  * Stop Instance: 
> ([https://cloud.google.com/compute/docs/reference/rest/v1/instances/stop])
> Also we should be able to manipulate instance groups:
>  * Get instance group: 
> ([https://cloud.google.com/compute/docs/reference/rest/v1/instanceGroupManagers/get])
>  * Insert Group: 
> ([https://cloud.google.com/compute/docs/reference/rest/v1/instanceGroupManagers/insert])
>  * Update Group: 
> ([https://cloud.google.com/compute/docs/reference/rest/beta/instanceGroupManagers/update])
>  





[jira] [Commented] (AIRFLOW-3078) Basic operators for Google Compute Engine

2018-09-18 Thread Jarek Potiuk (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618909#comment-16618909
 ] 

Jarek Potiuk commented on AIRFLOW-3078:
---

Hello [~kaxilnaik] - I know [~fenglu] will talk to you about it, but we are 
currently working on the implementation of those basic operators. We even have a 
draft design doc that explains what we are planning to do: [Airflow "Compute 
Engine" 
operators|https://docs.google.com/document/d/17cjZeu4ov_ZrVH3qCa-g8olW4DjRi-YG83Z-WjnbXhk/edit#heading=h.6w8wok2now8f].
 So maybe instead of implementing it, we can involve you in reviewing (both the 
doc and the implementation). I am happy to collaborate on it (I am starting work 
on it this week).

> Basic operators for Google Compute Engine
> -
>
> Key: AIRFLOW-3078
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3078
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: contrib, gcp
>Reporter: Jarek Potiuk
>Assignee: Kaxil Naik
>Priority: Trivial
>
> In order to interact with raw Google Compute Engine, we need operators 
> that are able to:
> For managing individual machines:
>  * Start Instance: 
> ([https://cloud.google.com/compute/docs/reference/rest/v1/instances/start])
>  * Set Machine Type 
> ([https://cloud.google.com/compute/docs/reference/rest/v1/instances/setMachineType])
>  
>  * Stop Instance: 
> ([https://cloud.google.com/compute/docs/reference/rest/v1/instances/stop])
> Also we should be able to manipulate instance groups:
>  * Get instance group: 
> ([https://cloud.google.com/compute/docs/reference/rest/v1/instanceGroupManagers/get])
>  * Insert Group: 
> ([https://cloud.google.com/compute/docs/reference/rest/v1/instanceGroupManagers/insert])
>  * Update Group: 
> ([https://cloud.google.com/compute/docs/reference/rest/beta/instanceGroupManagers/update])
>  





[jira] [Commented] (AIRFLOW-2912) Add operators for Google Cloud Functions

2018-09-18 Thread Jarek Potiuk (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618894#comment-16618894
 ] 

Jarek Potiuk commented on AIRFLOW-2912:
---

We have a design doc describing the proposed architecture and properties of the 
implementation. We think the most important operators are:
 * *GCFFunctionDelete* - deletes an existing function, specified by name. 
Succeeds when there is no function with the specified name to delete (it is 
idempotent).
 * *GCFFunctionDeploy* - creates a new function or updates an existing one.
 * *GCFFunctionInvoke* - invokes an existing function. (Note: this one is not 
being implemented now; it will become available once the API controlling access 
to Invoke calls is published.)

The document is here:

https://docs.google.com/document/d/1Wj46--jco47Ju-5-OuSxG3RZj6qvfRUscn62VEBXJAc/edit#heading=h.b48kcm9s7ymv
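As a rough illustration of the idempotent-delete semantics described above; the class and hook names are placeholders for this sketch, not the proposed Airflow API, and `LookupError` stands in for the real API's "not found" error:

```python
class GcfFunctionDeleteOperator:
    """Deletes a Cloud Function by name; succeeds if it is already gone."""

    def __init__(self, name, hook):
        self.name = name
        # The hook is assumed to raise LookupError when the function is absent.
        self.hook = hook

    def execute(self):
        try:
            self.hook.delete_function(self.name)
        except LookupError:
            # Nothing to delete: treat as success, so retries and re-runs
            # of the task are safe (this is what makes it idempotent).
            pass
```

Re-running the operator after a successful delete is a no-op rather than a failure, which is the behaviour the first bullet describes.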

> Add operators for Google Cloud Functions
> 
>
> Key: AIRFLOW-2912
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2912
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Dariusz Aniszewski
>Assignee: Jarek Potiuk
>Priority: Major
>
> It would be nice to be able to create, delete and call Cloud Functions





[jira] [Commented] (AIRFLOW-3081) Support automated integration tests in Travis CI

2018-09-18 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618890#comment-16618890
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3081:


Could you create an AIP 
(https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)
 for this rather than a Google doc, please, and then email the dev@ mailing 
list to start a discussion about it.

First thought: who will pay for the GCP costs? For many of the AWS operators we 
mock the AWS calls (using the "moto" Python library - "mock boto"), which, while 
not perfect, doesn't incur any extra cost and is usually quicker to boot.
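The mock-based approach can be sketched without any cloud dependency at all; the snippet below is a minimal stdlib-only illustration of the idea (moto goes further and actually emulates the AWS APIs in-process). The function and client names here are illustrative, not Airflow's API:

```python
from unittest import mock

def upload_and_read(client, bucket, key, data):
    """Toy stand-in for hook code that would normally hit S3."""
    client.put_object(Bucket=bucket, Key=key, Body=data)
    return client.get_object(Bucket=bucket, Key=key)["Body"]

# In a test, the boto3-style client is replaced by a Mock, so no AWS
# account (and no cost, no network) is involved.
fake_s3 = mock.Mock()
fake_s3.get_object.return_value = {"Body": b"payload"}

assert upload_and_read(fake_s3, "bucket", "key", b"payload") == b"payload"
fake_s3.put_object.assert_called_once_with(Bucket="bucket", Key="key", Body=b"payload")
```

The trade-off Ash notes still applies: a mock only verifies the calls your code makes, not that the real service accepts them.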

> Support automated integration tests in Travis CI
> 
>
> Key: AIRFLOW-3081
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3081
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: ci
>Reporter: Jarek Potiuk
>Priority: Minor
>
> I think it would be great to have a way to run integration tests 
> automatically for some of the operators. We've started to work on some GCP 
> operators (Cloud Functions is the first one). We have a proposal on how Cloud 
> Functions (and later other GCP operators) could have integration tests that 
> could run on GCP infrastructure. Here is the link to the proposal Doc 
> [https://docs.google.com/document/d/1-763cYrOs37Sj77RzSQP5hy1GSvZ7I7MPOOG2Q86Osc/edit|https://docs.google.com/document/d/1-763cYrOs37Sj77RzSQP5hy1GSvZ7I7MPOOG2Q86Osc/edit?usp=sharing]
> Maybe it's a good time to start discussion on that :).





[jira] [Assigned] (AIRFLOW-2912) Add operators for Google Cloud Functions

2018-09-18 Thread Jarek Potiuk (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reassigned AIRFLOW-2912:
-

Assignee: Jarek Potiuk

> Add operators for Google Cloud Functions
> 
>
> Key: AIRFLOW-2912
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2912
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Dariusz Aniszewski
>Assignee: Jarek Potiuk
>Priority: Major
>
> It would be nice to be able to create, delete and call Cloud Functions





[jira] [Created] (AIRFLOW-3081) Support automated integration tests in Travis CI

2018-09-18 Thread Jarek Potiuk (JIRA)
Jarek Potiuk created AIRFLOW-3081:
-

 Summary: Support automated integration tests in Travis CI
 Key: AIRFLOW-3081
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3081
 Project: Apache Airflow
  Issue Type: New Feature
  Components: ci
Reporter: Jarek Potiuk


I think it would be great to have a way to run integration tests automatically 
for some of the operators. We've started to work on some GCP operators (Cloud 
Functions is the first one). We have a proposal on how Cloud Functions (and 
later other GCP operators) could have integration tests that could run on GCP 
infrastructure. Here is the link to the proposal Doc 
[https://docs.google.com/document/d/1-763cYrOs37Sj77RzSQP5hy1GSvZ7I7MPOOG2Q86Osc/edit|https://docs.google.com/document/d/1-763cYrOs37Sj77RzSQP5hy1GSvZ7I7MPOOG2Q86Osc/edit?usp=sharing]

Maybe it's a good time to start discussion on that :).





[GitHub] XD-DENG commented on issue #3693: [AIRFLOW-2848] Ensure dag_id in metadata "job" for LocalTaskJob

2018-09-18 Thread GitBox
XD-DENG commented on issue #3693: [AIRFLOW-2848] Ensure dag_id in metadata 
"job" for LocalTaskJob
URL: 
https://github.com/apache/incubator-airflow/pull/3693#issuecomment-422314417
 
 
   Thanks @ashb .
   
   It was me who changed the fix version to 1.10.1 in JIRA, as you suggested in 
a mail on the mailing list.
   
   I have changed a few other tickets to 1.10.1 as well. All of them are bug 
fixes or enhancements. Please check 
https://issues.apache.org/jira/browse/AIRFLOW-2855?filter=-1=resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%201.10.1%20AND%20assignee%20in%20(XD-DENG)%20order%20by%20updated%20DESC
   
   Thanks!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on issue #3913: [AIRFLOW-3072] Assign permission get_logs_with_metadata to viewer role

2018-09-18 Thread GitBox
ashb commented on issue #3913: [AIRFLOW-3072] Assign permission 
get_logs_with_metadata to viewer role
URL: 
https://github.com/apache/incubator-airflow/pull/3913#issuecomment-422314238
 
 
   I'm not sure I like this as a default permission. On the one hand, needing 
admin to view logs is wrong; but conversely, there could be passwords or other 
sensitive info in the logs, so maybe plain "viewer" shouldn't have access to 
them?
   
   Not sure, basically.




[GitHub] ashb commented on issue #3693: [AIRFLOW-2848] Ensure dag_id in metadata "job" for LocalTaskJob

2018-09-18 Thread GitBox
ashb commented on issue #3693: [AIRFLOW-2848] Ensure dag_id in metadata "job" 
for LocalTaskJob
URL: 
https://github.com/apache/incubator-airflow/pull/3693#issuecomment-422311835
 
 
   @XD-DENG it's marked for 1.10.1 so will be included, yup :)




[jira] [Resolved] (AIRFLOW-307) Code cleanup. There is no __neq__ python magic method.

2018-09-18 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-307.
---
   Resolution: Fixed
Fix Version/s: 1.8.0

> Code cleanup. There is no __neq__ python magic method.
> --
>
> Key: AIRFLOW-307
> URL: https://issues.apache.org/jira/browse/AIRFLOW-307
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 1.7.1, 1.7.1.3, 1.8.0
>Reporter: Oleksandr Vilchynskyy
>Assignee: Oleksandr Vilchynskyy
>Priority: Minor
> Fix For: 1.8.0
>
>
> There is a small typo in class BaseOperator(object), which is decorated 
> with functools.total_ordering:
> def __neq__ was used instead of __ne__, which breaks the logic of later 
> object comparisons.
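For context (the class name below is made up for illustration): `functools.total_ordering` fills in `__le__`, `__gt__` and `__ge__` from `__lt__` and `__eq__`, and Python 3 derives `__ne__` from `__eq__` automatically, so a misspelled `__neq__` is just a never-called extra attribute:

```python
import functools

@functools.total_ordering
class Version:
    def __init__(self, n):
        self.n = n

    def __eq__(self, other):
        return self.n == other.n

    def __lt__(self, other):
        return self.n < other.n

    # A misspelled __neq__ defined here would never be looked up: Python's
    # comparison protocol uses __ne__, which is derived from __eq__.

assert Version(1) != Version(2)   # __ne__ derived from __eq__
assert Version(1) <= Version(1)   # filled in by total_ordering
```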





[jira] [Resolved] (AIRFLOW-407) Sensors all have the same ui color, making them hard to distinguish on the web UI

2018-09-18 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-407.
---
   Resolution: Fixed
Fix Version/s: 1.8.0

> Sensors all have the same ui color, making them hard to distinguish on the 
> web UI
> -
>
> Key: AIRFLOW-407
> URL: https://issues.apache.org/jira/browse/AIRFLOW-407
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Li Xuanji
>Assignee: Li Xuanji
>Priority: Minor
> Fix For: 1.8.0
>
>






[jira] [Resolved] (AIRFLOW-298) incubator disclaimer isn't proper on documentation website

2018-09-18 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-298.
---
Resolution: Fixed

> incubator disclaimer isn't proper on documentation website
> --
>
> Key: AIRFLOW-298
> URL: https://issues.apache.org/jira/browse/AIRFLOW-298
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Maxime Beauchemin
>Priority: Major
>






[jira] [Resolved] (AIRFLOW-443) Code from DAGs with same __name__ show up on each other's code view in the web UI

2018-09-18 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-443.
---
Resolution: Fixed

> Code from DAGs with same __name__ show up on each other's code view in the 
> web UI
> -
>
> Key: AIRFLOW-443
> URL: https://issues.apache.org/jira/browse/AIRFLOW-443
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Li Xuanji
>Assignee: Bolke de Bruin
>Priority: Major
>
> With a dags folder containing 2 files, `bash_bash_bash/dag.py` and 
> `bash_bash_bash_2/dag.py`, with the following contents:
> bash_bash_bash/dag.py
> ```
> from airflow import DAG
> from airflow.operators.bash_operator import BashOperator
> from datetime import datetime, timedelta
>
> default_args = {
>     'owner': 'airflow',
>     'depends_on_past': False,
>     'start_date': datetime(2016, 1, 1, 1, 0),
>     'email': ['xua...@gmail.com'],
>     'email_on_failure': True,
>     'email_on_retry': False,
>     'retries': 3,
>     'retry_delay': timedelta(minutes=1),
>     'concurrency': 1,
> }
>
> dag = DAG('bash_bash_bash', default_args=default_args,
>           schedule_interval=timedelta(seconds=10))
>
> # t1, t2 and t3 are examples of tasks created by instantiating operators
> t1 = BashOperator(
>     task_id='print_date',
>     bash_command='date',
>     dag=dag
> )
> t2 = BashOperator(
>     task_id='sleep',
>     bash_command='sleep 1',
>     retries=3,
>     dag=dag
> )
>
> templated_command = """
> {% for i in range(5) %}
>     echo "{{ ds }}"
>     echo "{{ macros.ds_add(ds, 7)}}"
>     echo "{{ params.my_param }}"
> {% endfor %}
> """
>
> t3 = BashOperator(
>     task_id='templated',
>     bash_command=templated_command,
>     params={'my_param': 'Parameter I passed in'},
>     dag=dag
> )
>
> t2.set_upstream(t1)
> t3.set_upstream(t1)
> ```
> bash_bash_bash_2/dag.py
> ```
> from airflow import DAG
> from airflow.operators.bash_operator import BashOperator
> from datetime import datetime, timedelta
>
> default_args = {
>     'owner': 'airflow',
>     'depends_on_past': False,
>     'start_date': datetime(2016, 1, 1, 1, 0),
>     'email': ['xua...@gmail.com'],
>     'email_on_failure': True,
>     'email_on_retry': False,
>     'retries': 3,
>     'retry_delay': timedelta(minutes=1),
>     'concurrency': 1,
> }
>
> dag = DAG('bash_bash_bash_2', default_args=default_args,
>           schedule_interval=timedelta(seconds=10))
>
> t1 = BashOperator(
>     task_id='print_date',
>     bash_command='date',
>     dag=dag
> )
> ```
> The code view in the web UI shows the contents of bash_bash_bash_2/dag.py 
> for both DAGs.





[jira] [Resolved] (AIRFLOW-306) Spark-sql hook and operator required

2018-09-18 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-306.
---
Resolution: Fixed

> Spark-sql hook and operator required
> 
>
> Key: AIRFLOW-306
> URL: https://issues.apache.org/jira/browse/AIRFLOW-306
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, hooks, operators
>Reporter: Daniel van der Ende
>Assignee: Daniel van der Ende
>Priority: Minor
>
> It would be nice to have a Spark-sql hook and operator for Spark-sql which 
> can execute Spark-sql queries (instead of having to run them via the 
> Bash-operator).





[jira] [Resolved] (AIRFLOW-233) Detached DagRun error in scheduler loop

2018-09-18 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-233.
---
Resolution: Fixed

> Detached DagRun error in scheduler loop
> ---
>
> Key: AIRFLOW-233
> URL: https://issues.apache.org/jira/browse/AIRFLOW-233
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DagRun, scheduler
> Environment: Airflow master (git log below), Postgres backend, 
> LocalExecutor
> {code}
> b7def7f1f9a97d584e9076cdad48287e652a2d41 [AIRFLOW-142] setup_env.sh doesn't 
> download hive tarball if hdp is specified as distro
> 0bd5515a42f7912b0d4ac8bf33dec2f01539b555 [AIRFLOW-218] Added option to enable 
> webserver gunicorn access/err logs
> 80210b2bd768668e55e498995a3820900d9119ba Merge pull request #1569 from 
> mistercrunch/docs
> {code}
>Reporter: Jeremiah Lowin
>Assignee: Bolke de Bruin
>Priority: Major
>
> Running Airflow master, every scheduler loop has at least one detached DagRun 
> error. This is the output:
> {code}
> [2016-06-10 09:41:54,772] {jobs.py:669} ERROR - Instance <DagRun at 
> 0x10ab80dd8> is not bound to a Session; attribute refresh operation cannot 
> proceed
> Traceback (most recent call last):
>   File "/Users/jlowin/git/airflow/airflow/jobs.py", line 666, in _do_dags
> self.process_dag(dag, tis_out)
>   File "/Users/jlowin/git/airflow/airflow/jobs.py", line 524, in process_dag
> State.UP_FOR_RETRY))
>   File "/Users/jlowin/git/airflow/airflow/utils/db.py", line 53, in wrapper
> result = func(*args, **kwargs)
>   File "/Users/jlowin/git/airflow/airflow/models.py", line 3387, in 
> get_task_instances
> TI.dag_id == self.dag_id,
>   File 
> "/Users/jlowin/anaconda3/lib/python3.5/site-packages/sqlalchemy/orm/attributes.py",
>  line 237, in __get__
> return self.impl.get(instance_state(instance), dict_)
>   File 
> "/Users/jlowin/anaconda3/lib/python3.5/site-packages/sqlalchemy/orm/attributes.py",
>  line 578, in get
> value = state._load_expired(state, passive)
>   File 
> "/Users/jlowin/anaconda3/lib/python3.5/site-packages/sqlalchemy/orm/state.py",
>  line 474, in _load_expired
> self.manager.deferred_scalar_loader(self, toload)
>   File 
> "/Users/jlowin/anaconda3/lib/python3.5/site-packages/sqlalchemy/orm/loading.py",
>  line 610, in load_scalar_attributes
> (state_str(state)))
> sqlalchemy.orm.exc.DetachedInstanceError: Instance <DagRun at 0x10ab80dd8> is 
> not bound to a Session; attribute refresh operation cannot proceed
> {code}
> This is the test DAG in question:
> {code}
> from airflow import DAG
> from airflow.operators import PythonOperator
> from datetime import datetime
> import logging
> import time
>
> default_args = {
>     'owner': 'airflow',
>     'depends_on_past': False,
>     'start_date': datetime(2016, 4, 24),
> }
>
> dag_name = 'dp_test'
> dag = DAG(
>     dag_name,
>     default_args=default_args,
>     schedule_interval='*/2 * * * *')
>
> def cb(**kw):
>     time.sleep(2)
>     logging.info('Done %s' % kw['ds'])
>
> d = PythonOperator(task_id="delay", provide_context=True, python_callable=cb,
>                    dag=dag)
> {code}





[jira] [Assigned] (AIRFLOW-249) Refactor the SLA mechanism

2018-09-18 Thread Siddharth Anand (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Anand reassigned AIRFLOW-249:
---

Assignee: (was: dud)

> Refactor the SLA mechanism
> --
>
> Key: AIRFLOW-249
> URL: https://issues.apache.org/jira/browse/AIRFLOW-249
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: dud
>Priority: Major
>
> Hello
> I've noticed the SLA feature currently behaves as follows:
> - it doesn't work on DAGs scheduled @once or None because they have no 
> dag.following_schedule property
> - it keeps endlessly checking for SLA misses without ever considering any 
> end_date. Worse, I noticed that emails are still being sent for runs that 
> will never happen because of end_date
> - it keeps checking recent TIs even if an SLA notification has already 
> been sent for them
> - the SLA logic is only fired after following_schedule + sla has 
> elapsed; in other words, one has to wait for the next TI before having a 
> chance of getting any email. Also, the email reports the dag.following_schedule 
> time (I guess because it is close to TI.start_date), but unfortunately that 
> matches neither what the task instance shows nor the log filename
> - the SLA logic uses max(TI.execution_date) as the starting point of 
> its checks, which means that for a DAG whose SLA is longer than its schedule 
> period, if half of the TIs run longer than expected this will go 
> unnoticed. This can be demonstrated with a DAG like this one:
> {code}
> from airflow import DAG
> from airflow.operators import *
> from datetime import datetime, timedelta
> from time import sleep
>
> default_args = {
>     'owner': 'airflow',
>     'depends_on_past': False,
>     'start_date': datetime(2016, 6, 16, 12, 20),
>     'email': my_email,
>     'sla': timedelta(minutes=2),
> }
>
> dag = DAG('unnoticed_sla', default_args=default_args,
>           schedule_interval=timedelta(minutes=1))
>
> def alternating_sleep(**kwargs):
>     minute = kwargs['execution_date'].strftime("%M")
>     is_odd = int(minute) % 2
>     if is_odd:
>         sleep(300)
>     else:
>         sleep(10)
>     return True
>
> PythonOperator(
>     task_id='sla_miss',
>     python_callable=alternating_sleep,
>     provide_context=True,
>     dag=dag)
> {code}
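A quick plain-Python sanity check of the alternating DAG above (no Airflow involved, run times simulated): with a 1-minute schedule and a 2-minute SLA, only the odd-minute 300-second runs exceed the SLA, so roughly every other run is a miss that an SLA check anchored at the latest execution_date could overlook.

```python
from datetime import datetime, timedelta

sla = timedelta(minutes=2)
start = datetime(2016, 6, 16, 12, 20)

# One simulated run per minute; odd execution minutes sleep 300s, even 10s,
# mirroring alternating_sleep() above.
runs = []
for i in range(4):
    execution_date = start + timedelta(minutes=i)
    duration = timedelta(seconds=300 if execution_date.minute % 2 else 10)
    runs.append((execution_date, duration))

misses = [ed.strftime("%H:%M") for ed, d in runs if d > sla]
print(misses)  # the 12:21 and 12:23 runs are the only SLA misses
```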
> I've tried to rework the SLA triggering mechanism by addressing the above 
> points; please [have a look at 
> it|https://github.com/dud225/incubator-airflow/commit/972260354075683a8d55a1c960d839c37e629e7d]
> I made some tests with this patch:
> - the fluctuating DAG shown above no longer makes Airflow skip any SLA event:
> {code}
>  task_id  |    dag_id     |   execution_date    | email_sent |         timestamp          | description | notification_sent
> ----------+---------------+---------------------+------------+----------------------------+-------------+-------------------
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:05:00 | t          | 2016-06-16 15:08:26.058631 |             | t
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:07:00 | t          | 2016-06-16 15:10:06.093253 |             | t
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:09:00 | t          | 2016-06-16 15:12:06.241773 |             | t
> {code}
> - on a normal DAG, the SLA is triggered more quickly:
> {code}
> // start_date = 2016-06-16 15:55:00
> // end_date = 2016-06-16 16:00:00
> // schedule_interval = timedelta(minutes=1)
> // sla = timedelta(minutes=2)
>  task_id  |    dag_id     |   execution_date    | email_sent |         timestamp          | description | notification_sent
> ----------+---------------+---------------------+------------+----------------------------+-------------+-------------------
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:55:00 | t          | 2016-06-16 15:58:11.832299 |             | t
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:56:00 | t          | 2016-06-16 15:59:09.663778 |             | t
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:57:00 | t          | 2016-06-16 16:00:13.651422 |             | t
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:58:00 | t          | 2016-06-16 16:01:08.576399 |             | t
>  sla_miss | dag_sla_miss1 | 2016-06-16 15:59:00 | t          | 2016-06-16 16:02:08.523486 |             | t
>  sla_miss | dag_sla_miss1 | 2016-06-16 16:00:00 | t          | 2016-06-16 16:03:08.538593 |             | t
> (6 rows)
> {code}
> than before (current master branch):
> {code}
> // start_date = 2016-06-16 15:40:00
> // end_date = 2016-06-16 15:45:00
> // schedule_interval = timedelta(minutes=1)
> // sla = timedelta(minutes=2)
>  task_id  |    dag_id     |   execution_date    | email_sent |         timestamp          | description | notification_sent
> ----------+---------------+---------------------+------------+----------------------------+-------------+-------------------

[GitHub] r39132 closed pull request #1869: [AIRFLOW-571] added --forwarded_allow_ips as a command line argument to webserver

2018-09-18 Thread GitBox
r39132 closed pull request #1869: [AIRFLOW-571] added --forwarded_allow_ips as 
a command line argument to webserver
URL: https://github.com/apache/incubator-airflow/pull/1869
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/bin/cli.py b/airflow/bin/cli.py
index 21e1d23878..8fda8f5dc1 100755
--- a/airflow/bin/cli.py
+++ b/airflow/bin/cli.py
@@ -45,7 +45,7 @@
 from airflow import api
 from airflow import jobs, settings
 from airflow import configuration as conf
-from airflow.exceptions import AirflowException
+from airflow.exceptions import AirflowException, AirflowConfigException
 from airflow.executors import DEFAULT_EXECUTOR
 from airflow.models import (DagModel, DagBag, TaskInstance,
                             DagPickle, DagRun, Variable, DagStat,
@@ -699,6 +699,11 @@ def webserver(args):
     if ssl_cert and not ssl_key:
         raise AirflowException(
             'An SSL key must also be provided for use with ' + ssl_cert)
+    try:
+        forwarded_allow_ips = (args.forwarded_allow_ips or
+                               conf.get('webserver', 'forwarded_allow_ips'))
+    except AirflowConfigException:
+        forwarded_allow_ips = None
 
     if args.debug:
         print(
@@ -740,6 +745,9 @@ def webserver(args):
     if ssl_cert:
         run_args += ['--certfile', ssl_cert, '--keyfile', ssl_key]
 
+    if forwarded_allow_ips:
+        run_args += ['--forwarded-allow-ips', forwarded_allow_ips]
+
     run_args += ["airflow.www.app:cached_app()"]
 
     gunicorn_master_proc = subprocess.Popen(run_args)
@@ -1294,6 +1302,10 @@ class CLIFactory(object):
             default=conf.get('webserver', 'ERROR_LOGFILE'),
             help="The logfile to store the webserver error log. Use '-' to print to "
                  "stderr."),
+        'forwarded_allow_ips': Arg(
+            ("--forwarded_allow_ips", ),
+            default=None,
+            help="Pass gunicorn front-end IPs allowed to handle set secure headers."),
         # resetdb
         'yes': Arg(
             ("-y", "--yes"),
@@ -1469,7 +1481,8 @@ class CLIFactory(object):
         'help': "Start a Airflow webserver instance",
         'args': ('port', 'workers', 'workerclass', 'worker_timeout', 'hostname',
                  'pid', 'daemon', 'stdout', 'stderr', 'access_logfile',
-                 'error_logfile', 'log_file', 'ssl_cert', 'ssl_key', 'debug'),
+                 'error_logfile', 'log_file', 'ssl_cert', 'ssl_key',
+                 'forwarded_allow_ips', 'debug'),
     }, {
         'func': resetdb,
         'help': "Burn down and rebuild the metadata database",
diff --git a/airflow/configuration.py b/airflow/configuration.py
index 265f7289ea..a86f629493 100644
--- a/airflow/configuration.py
+++ b/airflow/configuration.py
@@ -211,6 +211,12 @@ def run_command(command):
 web_server_ssl_cert =
 web_server_ssl_key =
 
+# Pass gunicorn front-end IPs allowed to handle set secure headers.
+# Multiple IPs should be comma separated.  Set to * to disable checking.
+# Useful if you are running gunicorn behind a load balancer.
+# See http://docs.gunicorn.org/en/stable/settings.html#forwarded-allow-ips
+# forwarded_allow_ips = *
+
 # Number of seconds the gunicorn webserver waits before timing out on a worker
 web_server_worker_timeout = 120
 
@@ -454,6 +460,7 @@ def run_command(command):
 dag_orientation = LR
 log_fetch_timeout_sec = 5
 hide_paused_dags_by_default = False
+forwarded_allow_ips = *
 
 [email]
 email_backend = airflow.utils.email.send_email_smtp
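With this patch, the option could be supplied either on the command line (`airflow webserver --forwarded_allow_ips ...`) or via `airflow.cfg`. A hedged configuration example based on the diff above; the IP addresses are placeholders:

```
[webserver]
# Comma-separated gunicorn front-end IPs allowed to set secure headers;
# set to * to disable checking (useful behind a load balancer).
forwarded_allow_ips = 10.0.0.1,10.0.0.2
```

The CLI flag takes precedence, falling back to the config value, and to gunicorn's default when neither is set.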


 




[jira] [Commented] (AIRFLOW-571) allow gunicorn config to be passed to airflow webserver

2018-09-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618620#comment-16618620
 ] 

ASF GitHub Bot commented on AIRFLOW-571:


r39132 closed pull request #1869: [AIRFLOW-571] added --forwarded_allow_ips as 
a command line argument to webserver
URL: https://github.com/apache/incubator-airflow/pull/1869
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/bin/cli.py b/airflow/bin/cli.py
index 21e1d23878..8fda8f5dc1 100755
--- a/airflow/bin/cli.py
+++ b/airflow/bin/cli.py
@@ -45,7 +45,7 @@
 from airflow import api
 from airflow import jobs, settings
 from airflow import configuration as conf
-from airflow.exceptions import AirflowException
+from airflow.exceptions import AirflowException, AirflowConfigException
 from airflow.executors import DEFAULT_EXECUTOR
 from airflow.models import (DagModel, DagBag, TaskInstance,
                             DagPickle, DagRun, Variable, DagStat,
@@ -699,6 +699,11 @@ def webserver(args):
     if ssl_cert and not ssl_key:
         raise AirflowException(
             'An SSL key must also be provided for use with ' + ssl_cert)
+    try:
+        forwarded_allow_ips = (args.forwarded_allow_ips or
+                               conf.get('webserver', 'forwarded_allow_ips'))
+    except AirflowConfigException:
+        forwarded_allow_ips = None
 
     if args.debug:
         print(
@@ -740,6 +745,9 @@ def webserver(args):
     if ssl_cert:
         run_args += ['--certfile', ssl_cert, '--keyfile', ssl_key]
 
+    if forwarded_allow_ips:
+        run_args += ['--forwarded-allow-ips', forwarded_allow_ips]
+
     run_args += ["airflow.www.app:cached_app()"]
 
     gunicorn_master_proc = subprocess.Popen(run_args)
@@ -1294,6 +1302,10 @@ class CLIFactory(object):
            default=conf.get('webserver', 'ERROR_LOGFILE'),
            help="The logfile to store the webserver error log. Use '-' to print to "
                 "stderr."),
+        'forwarded_allow_ips': Arg(
+            ("--forwarded_allow_ips", ),
+            default=None,
+            help="Pass gunicorn front-end IPs allowed to handle set secure headers."),
        # resetdb
        'yes': Arg(
            ("-y", "--yes"),
@@ -1469,7 +1481,8 @@ class CLIFactory(object):
        'help': "Start a Airflow webserver instance",
        'args': ('port', 'workers', 'workerclass', 'worker_timeout', 'hostname',
                 'pid', 'daemon', 'stdout', 'stderr', 'access_logfile',
-                'error_logfile', 'log_file', 'ssl_cert', 'ssl_key', 'debug'),
+                'error_logfile', 'log_file', 'ssl_cert', 'ssl_key',
+                'forwarded_allow_ips', 'debug'),
    }, {
        'func': resetdb,
        'help': "Burn down and rebuild the metadata database",
diff --git a/airflow/configuration.py b/airflow/configuration.py
index 265f7289ea..a86f629493 100644
--- a/airflow/configuration.py
+++ b/airflow/configuration.py
@@ -211,6 +211,12 @@ def run_command(command):
 web_server_ssl_cert =
 web_server_ssl_key =
 
+# Pass gunicorn front-end IPs allowed to handle set secure headers.
+# Multiple IPs should be comma separated.  Set to * to disable checking.
+# Useful if you are running gunicorn behind a load balancer.
+# See http://docs.gunicorn.org/en/stable/settings.html#forwarded-allow-ips
+# forwarded_allow_ips = *
+
 # Number of seconds the gunicorn webserver waits before timing out on a worker
 web_server_worker_timeout = 120
 
@@ -454,6 +460,7 @@ def run_command(command):
 dag_orientation = LR
 log_fetch_timeout_sec = 5
 hide_paused_dags_by_default = False
+forwarded_allow_ips = *
 
 [email]
 email_backend = airflow.utils.email.send_email_smtp


 




> allow gunicorn config to be passed to airflow webserver
> ---
>
> Key: AIRFLOW-571
> URL: https://issues.apache.org/jira/browse/AIRFLOW-571
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: webserver
>Reporter: Dennis O'Brien
>Priority: Major
>
> I have run into an issue when running airflow webserver behind a load 
> balancer where redirects result in https requests forwarded to http.  I ran 
> into a similar issue with Caravel which also uses gunicorn.  
> 
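The merged change above can be sketched in isolation. The resolution order is: command-line flag first, then the `[webserver]` config entry, and gunicorn's `--forwarded-allow-ips` option is only appended when a value was found. The function name and defaults below are illustrative stand-ins, not Airflow's actual CLI code.

```python
# Minimal sketch (hypothetical names, not Airflow's real cli.py) of how the
# PR wires the setting through to the gunicorn command line.
def build_gunicorn_args(cli_value, config_value, hostname="0.0.0.0", port=8080):
    forwarded_allow_ips = cli_value or config_value  # CLI flag overrides config
    run_args = ["gunicorn", "-b", "{}:{}".format(hostname, port)]
    if forwarded_allow_ips:
        run_args += ["--forwarded-allow-ips", forwarded_allow_ips]
    run_args += ["airflow.www.app:cached_app()"]
    return run_args

print(build_gunicorn_args("*", None))
```

Per the config comment in the diff, setting the value to `*` disables gunicorn's front-end IP check entirely, which is the suggested workaround for deployments behind a load balancer.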

[jira] [Commented] (AIRFLOW-249) Refactor the SLA mechanism

2018-09-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618619#comment-16618619
 ] 

ASF GitHub Bot commented on AIRFLOW-249:


r39132 closed pull request #1601: [AIRFLOW-249] Refactor the SLA mechanism
URL: https://github.com/apache/incubator-airflow/pull/1601
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/jobs.py b/airflow/jobs.py
index 1e583ac41b..d6f32cd52a 100644
--- a/airflow/jobs.py
+++ b/airflow/jobs.py
@@ -258,39 +258,33 @@ def manage_slas(self, dag, session=None):
         tasks that should have succeeded in the past hour.
         """
         TI = models.TaskInstance
-        sq = (
-            session
-            .query(
-                TI.task_id,
-                func.max(TI.execution_date).label('max_ti'))
-            .filter(TI.dag_id == dag.dag_id)
-            .filter(TI.state == State.SUCCESS)
-            .filter(TI.task_id.in_(dag.task_ids))
-            .group_by(TI.task_id).subquery('sq')
+        SlaMiss = models.SlaMiss
+
+        sla_missed = (
+            session.query(SlaMiss)
+            .filter(SlaMiss.email_sent == 't')
+            .subquery('sla_missed')
         )
 
-        max_tis = session.query(TI).filter(
-            TI.dag_id == dag.dag_id,
-            TI.task_id == sq.c.task_id,
-            TI.execution_date == sq.c.max_ti,
-        ).all()
+        sq = session.query(TI).outerjoin(
+            sla_missed,
+            sla_missed.c.execution_date == TI.execution_date).filter(
+            sla_missed.c.execution_date == None,
+            TI.dag_id == dag.dag_id,
+            TI.state == State.RUNNING,
+            TI.task_id.in_(dag.task_ids)
+        ).all()
 
         ts = datetime.now()
-        SlaMiss = models.SlaMiss
-        for ti in max_tis:
+        for ti in sq:
             task = dag.get_task(ti.task_id)
-            dttm = ti.execution_date
             if task.sla:
-                dttm = dag.following_schedule(dttm)
-                while dttm < datetime.now():
-                    following_schedule = dag.following_schedule(dttm)
-                    if following_schedule + task.sla < datetime.now():
-                        session.merge(models.SlaMiss(
-                            task_id=ti.task_id,
-                            dag_id=ti.dag_id,
-                            execution_date=dttm,
-                            timestamp=ts))
-                    dttm = dag.following_schedule(dttm)
+                if ti.start_date + task.sla < ts:
+                    session.merge(models.SlaMiss(
+                        task_id=ti.task_id,
+                        dag_id=ti.dag_id,
+                        execution_date=ti.execution_date,
+                        timestamp=ts))
         session.commit()
 
         slas = (


 




> Refactor the SLA mechanism
> --
>
> Key: AIRFLOW-249
> URL: https://issues.apache.org/jira/browse/AIRFLOW-249
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: dud
>Assignee: dud
>Priority: Major
>
> Hello
> I've noticed the SLA feature currently behaves as follows:
> - it doesn't work on DAGs scheduled @once or None because they have no 
> dag.following_schedule property
> - it keeps endlessly checking for SLA misses without ever considering any 
> end_date. Worse, I noticed that emails are still being sent for runs that 
> will never happen because of end_date
> - it keeps checking recent TIs even if an SLA notification has already 
> been sent for them
> - the SLA logic only fires after following_schedule + sla has elapsed; in 
> other words, one has to wait for the next TI before having a chance of 
> getting any email. Also, the email reports the dag.following_schedule 
> time (I guess because it is close to TI.start_date), but unfortunately that 
> matches neither what the task instance shows nor the log filename
> - the SLA logic uses max(TI.execution_date) as the starting point of 
> its checks, which means that for a DAG whose SLA is longer than its schedule 
> period, if half of the TIs run longer than expected it will go 
> unnoticed. This can be demonstrated with a DAG like this one:
> {code}
> from airflow import DAG
> from airflow.operators 
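The PR's replacement logic can be sketched in isolation: a RUNNING task instance misses its SLA once `start_date + sla` has elapsed, and instances with an already-sent SlaMiss are excluded by the outer join. The names below are plain-Python stand-ins for the ORM queries in jobs.py, not real Airflow API.

```python
from datetime import datetime, timedelta

def find_sla_misses(running_tis, slas, now):
    # `running_tis`: list of (task_id, start_date) pairs standing in for the
    # RUNNING TaskInstance rows left after the outer join against SlaMiss;
    # `slas`: task_id -> timedelta, standing in for task.sla.
    misses = []
    for task_id, start_date in running_tis:
        sla = slas.get(task_id)
        if sla is not None and start_date + sla < now:
            misses.append(task_id)
    return misses

now = datetime(2018, 9, 18, 12, 0)
tis = [("slow_task", now - timedelta(hours=2)),
       ("fast_task", now - timedelta(minutes=5))]
slas = {"slow_task": timedelta(hours=1), "fast_task": timedelta(hours=1)}
print(find_sla_misses(tis, slas, now))  # only slow_task has overrun its SLA
```

Because the check is anchored on each instance's own start_date rather than on max(TI.execution_date), the "half the TIs overrun unnoticed" failure mode described above no longer applies.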

[GitHub] r39132 commented on issue #1601: [AIRFLOW-249] Refactor the SLA mechanism

2018-09-18 Thread GitBox
r39132 commented on issue #1601: [AIRFLOW-249] Refactor the SLA mechanism
URL: 
https://github.com/apache/incubator-airflow/pull/1601#issuecomment-422290032
 
 
   Closing for now. Please reopen once you have updated the PR.




[GitHub] r39132 commented on issue #1869: [AIRFLOW-571] added --forwarded_allow_ips as a command line argument to webserver

2018-09-18 Thread GitBox
r39132 commented on issue #1869: [AIRFLOW-571] added --forwarded_allow_ips as a 
command line argument to webserver
URL: 
https://github.com/apache/incubator-airflow/pull/1869#issuecomment-422290179
 
 
   @dennisobrien Please reopen when you are ready to proceed!




[GitHub] r39132 closed pull request #1601: [AIRFLOW-249] Refactor the SLA mechanism

2018-09-18 Thread GitBox
r39132 closed pull request #1601: [AIRFLOW-249] Refactor the SLA mechanism
URL: https://github.com/apache/incubator-airflow/pull/1601
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/jobs.py b/airflow/jobs.py
index 1e583ac41b..d6f32cd52a 100644
--- a/airflow/jobs.py
+++ b/airflow/jobs.py
@@ -258,39 +258,33 @@ def manage_slas(self, dag, session=None):
         tasks that should have succeeded in the past hour.
         """
         TI = models.TaskInstance
-        sq = (
-            session
-            .query(
-                TI.task_id,
-                func.max(TI.execution_date).label('max_ti'))
-            .filter(TI.dag_id == dag.dag_id)
-            .filter(TI.state == State.SUCCESS)
-            .filter(TI.task_id.in_(dag.task_ids))
-            .group_by(TI.task_id).subquery('sq')
+        SlaMiss = models.SlaMiss
+
+        sla_missed = (
+            session.query(SlaMiss)
+            .filter(SlaMiss.email_sent == 't')
+            .subquery('sla_missed')
         )
 
-        max_tis = session.query(TI).filter(
-            TI.dag_id == dag.dag_id,
-            TI.task_id == sq.c.task_id,
-            TI.execution_date == sq.c.max_ti,
-        ).all()
+        sq = session.query(TI).outerjoin(
+            sla_missed,
+            sla_missed.c.execution_date == TI.execution_date).filter(
+            sla_missed.c.execution_date == None,
+            TI.dag_id == dag.dag_id,
+            TI.state == State.RUNNING,
+            TI.task_id.in_(dag.task_ids)
+        ).all()
 
         ts = datetime.now()
-        SlaMiss = models.SlaMiss
-        for ti in max_tis:
+        for ti in sq:
             task = dag.get_task(ti.task_id)
-            dttm = ti.execution_date
             if task.sla:
-                dttm = dag.following_schedule(dttm)
-                while dttm < datetime.now():
-                    following_schedule = dag.following_schedule(dttm)
-                    if following_schedule + task.sla < datetime.now():
-                        session.merge(models.SlaMiss(
-                            task_id=ti.task_id,
-                            dag_id=ti.dag_id,
-                            execution_date=dttm,
-                            timestamp=ts))
-                    dttm = dag.following_schedule(dttm)
+                if ti.start_date + task.sla < ts:
+                    session.merge(models.SlaMiss(
+                        task_id=ti.task_id,
+                        dag_id=ti.dag_id,
+                        execution_date=ti.execution_date,
+                        timestamp=ts))
         session.commit()
 
         slas = (


 




[GitHub] seelmann commented on a change in pull request #3596: [AIRFLOW-2747] Explicit re-schedule of sensors

2018-09-18 Thread GitBox
seelmann commented on a change in pull request #3596: [AIRFLOW-2747] Explicit 
re-schedule of sensors
URL: https://github.com/apache/incubator-airflow/pull/3596#discussion_r218308398
 
 

 ##
 File path: airflow/models.py
 ##
 @@ -1744,6 +1749,29 @@ def dry_run(self):
         self.render_templates()
         task_copy.dry_run()
 
+    @provide_session
+    def handle_reschedule(self, reschedule_exception, test_mode=False, context=None,
 
 Review comment:
   Yes, should be private




[GitHub] seelmann commented on a change in pull request #3596: [AIRFLOW-2747] Explicit re-schedule of sensors

2018-09-18 Thread GitBox
seelmann commented on a change in pull request #3596: [AIRFLOW-2747] Explicit 
re-schedule of sensors
URL: https://github.com/apache/incubator-airflow/pull/3596#discussion_r218308245
 
 

 ##
 File path: airflow/models.py
 ##
 @@ -1744,6 +1749,29 @@ def dry_run(self):
         self.render_templates()
         task_copy.dry_run()
 
+    @provide_session
+    def handle_reschedule(self, reschedule_exception, test_mode=False, context=None,
+                          session=None):
+        self.end_date = timezone.utcnow()
+        self.set_duration()
+
+        # Log reschedule request
+        session.add(TaskReschedule(self.task, self.execution_date, self._try_number,
+                                   self.start_date, self.end_date,
+                                   reschedule_exception.reschedule_date))
+
+        # set state
+        self.state = State.NONE
+
+        # Decrement try_number so subsequent runs will use the same try number
+        # and write to same log file.
+        self._try_number -= 1
+
+        if not test_mode:
+            session.merge(self)
+            session.commit()
 
 Review comment:
   It's the same pattern as in `handle_failure`, I didn't think much about it, 
I'll think a bit more about it...
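The try_number bookkeeping in the hunk above can be illustrated with a toy example (this is not Airflow's actual log handling, just an assumed filename scheme): the attempt's log file name is derived from try_number, and since the next scheduling pass increments it again, decrementing on reschedule makes the rescheduled attempt land in the same file.

```python
# Toy illustration (hypothetical filename scheme, not Airflow's real one)
# of why handle_reschedule decrements _try_number.
def log_filename(dag_id, task_id, try_number):
    return "{}/{}/attempt-{}.log".format(dag_id, task_id, try_number)

try_number = 2                      # current attempt
name_before = log_filename("my_dag", "wait_sensor", try_number)
try_number -= 1                     # handle_reschedule's decrement
try_number += 1                     # re-increment when the task runs again
name_after = log_filename("my_dag", "wait_sensor", try_number)
print(name_before == name_after)  # True
```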




[GitHub] seelmann commented on a change in pull request #3596: [AIRFLOW-2747] Explicit re-schedule of sensors

2018-09-18 Thread GitBox
seelmann commented on a change in pull request #3596: [AIRFLOW-2747] Explicit 
re-schedule of sensors
URL: https://github.com/apache/incubator-airflow/pull/3596#discussion_r218307927
 
 

 ##
 File path: airflow/models.py
 ##
 @@ -56,8 +56,8 @@
 
 from sqlalchemy import (
     Column, Integer, String, DateTime, Text, Boolean, ForeignKey, PickleType,
-    Index, Float, LargeBinary, UniqueConstraint)
-from sqlalchemy import func, or_, and_, true as sqltrue
+    Index, Float, LargeBinary, UniqueConstraint, ForeignKeyConstraint)
+from sqlalchemy import func, or_, and_, true as sqltrue, asc
 
 Review comment:
   There are already two `from sqlalchemy import ...` statements: the first 
imports types, the second imports (SQL) expressions.
   I can introduce a third one.
   Or combine all into one like this (lexicographically sorted):
   ```
   from sqlalchemy import (
       Boolean, Column, DateTime, Float, ForeignKey, ForeignKeyConstraint, Index,
       Integer, LargeBinary, PickleType, String, Text, UniqueConstraint,
       and_, asc, func, or_, true as sqltrue
   )
   ```

