[jira] [Commented] (AIRFLOW-5701) Don't clear xcom explicitly before execution

2019-10-22 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957572#comment-16957572
 ] 

ASF subversion and git services commented on AIRFLOW-5701:
--

Commit 74d2a0d9e77cf90b85654f65d1adba0875e0fb1f in airflow's branch 
refs/heads/master from Fokko Driesprong
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=74d2a0d ]

AIRFLOW-5701: Don't clear xcom explicitly before execution (#6370)



> Don't clear xcom explicitly before execution
> 
>
> Key: AIRFLOW-5701
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5701
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: xcom
>Affects Versions: 1.10.5
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] Fokko commented on a change in pull request #6379: [AIRFLOW-5708] Optionally check task pools when parsing dags.

2019-10-22 Thread GitBox
Fokko commented on a change in pull request #6379: [AIRFLOW-5708] Optionally 
check task pools when parsing dags.
URL: https://github.com/apache/airflow/pull/6379#discussion_r337857359
 
 

 ##
 File path: airflow/config_templates/default_airflow.cfg
 ##
 @@ -208,6 +208,8 @@ dag_discovery_safe_mode = True
 # The number of retries each task is going to have by default. Can be 
overridden at dag or task level.
 default_task_retries = 0
 
+check_task_pools = False
 
 Review comment:
   Why make this an option? I think we should check whether the pool exists 
anyway. Currently, the error message is horrible :-)
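
For context, a parse-time pool check could look roughly like the sketch below. This is a minimal illustration, assuming the `Pool` model and `provide_session` helper from Airflow's codebase; `validate_task_pools` is a hypothetical name, not code from this PR.

```python
from airflow.models import Pool
from airflow.utils.db import provide_session


@provide_session
def validate_task_pools(dag, session=None):
    """Fail fast if any task references a pool that does not exist."""
    known = {name for (name,) in session.query(Pool.pool).all()}
    missing = sorted({t.pool for t in dag.tasks if t.pool and t.pool not in known})
    if missing:
        raise ValueError("Tasks reference undefined pools: %s" % ", ".join(missing))
```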


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] Fokko commented on issue #6341: Updated airflow

2019-10-22 Thread GitBox
Fokko commented on issue #6341: Updated airflow
URL: https://github.com/apache/airflow/pull/6341#issuecomment-545275692
 
 
   Thank you @Fortschritt69 for your PR. Please create a ticket in Jira to 
describe the intention of the PR. For now, I'll close the PR.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] Fokko closed pull request #6341: Updated airflow

2019-10-22 Thread GitBox
Fokko closed pull request #6341: Updated airflow
URL: https://github.com/apache/airflow/pull/6341
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] Fokko merged pull request #6370: AIRFLOW-5701: Don't clear xcom explicitly before execution

2019-10-22 Thread GitBox
Fokko merged pull request #6370: AIRFLOW-5701: Don't clear xcom explicitly 
before execution
URL: https://github.com/apache/airflow/pull/6370
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-5701) Don't clear xcom explicitly before execution

2019-10-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957571#comment-16957571
 ] 

ASF GitHub Bot commented on AIRFLOW-5701:
-

Fokko commented on pull request #6370: AIRFLOW-5701: Don't clear xcom 
explicitly before execution
URL: https://github.com/apache/airflow/pull/6370
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Don't clear xcom explicitly before execution
> 
>
> Key: AIRFLOW-5701
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5701
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: xcom
>Affects Versions: 1.10.5
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] fangpenlin commented on issue #3229: [AIRFLOW-2325] Add cloudwatch task handler (IN PROGRESS)

2019-10-22 Thread GitBox
fangpenlin commented on issue #3229: [AIRFLOW-2325] Add cloudwatch task handler 
(IN PROGRESS)
URL: https://github.com/apache/airflow/pull/3229#issuecomment-545265286
 
 
   @ericabertugli @potiuk @mik-laj yeah, sorry I have no time to work on this. 
Feel free to take over my PR and continue working on it  


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-5710) Optionally error on unused operator arguments

2019-10-22 Thread Chao-Han Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao-Han Tsai updated AIRFLOW-5710:
---
Fix Version/s: 1.10.6

> Optionally error on unused operator arguments
> -
>
> Key: AIRFLOW-5710
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5710
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.10.5
>Reporter: Joshua Carp
>Assignee: Joshua Carp
>Priority: Trivial
> Fix For: 1.10.6
>
>
> Airflow 2.0 will error when operators are instantiated with unused keyword 
> arguments, but for now unused arguments just raise a warning. My team has 
> passed unused arguments to operators and not noticed the warning a few times, 
> and it would be useful to be able to opt in to airflow 2.0 validation early. 
> I propose adding a configuration flag that causes tasks to error on unused 
> arguments.
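
A rough illustration of the proposed behavior (the helper and flag names here are illustrative; PR #6332 defines what was actually merged):

```python
import warnings

from airflow.configuration import conf
from airflow.exceptions import AirflowException


def check_unused_args(task_id, unused_kwargs):
    """Warn about unused operator kwargs, or raise when strict validation is on."""
    if not unused_kwargs:
        return
    msg = "Invalid arguments were passed to %s: %s" % (task_id, sorted(unused_kwargs))
    # Hypothetical flag name; disabling it restores the Airflow 2.0-style hard error.
    if conf.getboolean('operators', 'allow_illegal_arguments'):
        warnings.warn(msg, category=PendingDeprecationWarning)
    else:
        raise AirflowException(msg)
```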



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (AIRFLOW-5710) Optionally error on unused operator arguments

2019-10-22 Thread Chao-Han Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao-Han Tsai resolved AIRFLOW-5710.

Resolution: Fixed

> Optionally error on unused operator arguments
> -
>
> Key: AIRFLOW-5710
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5710
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.10.5
>Reporter: Joshua Carp
>Assignee: Joshua Carp
>Priority: Trivial
>
> Airflow 2.0 will error when operators are instantiated with unused keyword 
> arguments, but for now unused arguments just raise a warning. My team has 
> passed unused arguments to operators and not noticed the warning a few times, 
> and it would be useful to be able to opt in to airflow 2.0 validation early. 
> I propose adding a configuration flag that causes tasks to error on unused 
> arguments.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-5710) Optionally error on unused operator arguments

2019-10-22 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957504#comment-16957504
 ] 

ASF subversion and git services commented on AIRFLOW-5710:
--

Commit e9d65e3d21f2aa6eaeb557b2960730628540127f in airflow's branch 
refs/heads/master from Joshua Carp
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=e9d65e3 ]

[AIRFLOW-5710] Optionally raise exception on unused operator arguments. (#6332)



> Optionally error on unused operator arguments
> -
>
> Key: AIRFLOW-5710
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5710
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.10.5
>Reporter: Joshua Carp
>Assignee: Joshua Carp
>Priority: Trivial
>
> Airflow 2.0 will error when operators are instantiated with unused keyword 
> arguments, but for now unused arguments just raise a warning. My team has 
> passed unused arguments to operators and not noticed the warning a few times, 
> and it would be useful to be able to opt in to airflow 2.0 validation early. 
> I propose adding a configuration flag that causes tasks to error on unused 
> arguments.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-5710) Optionally error on unused operator arguments

2019-10-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957503#comment-16957503
 ] 

ASF GitHub Bot commented on AIRFLOW-5710:
-

milton0825 commented on pull request #6332: [AIRFLOW-5710] Optionally raise 
exception on unused operator arguments.
URL: https://github.com/apache/airflow/pull/6332
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Optionally error on unused operator arguments
> -
>
> Key: AIRFLOW-5710
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5710
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.10.5
>Reporter: Joshua Carp
>Assignee: Joshua Carp
>Priority: Trivial
>
> Airflow 2.0 will error when operators are instantiated with unused keyword 
> arguments, but for now unused arguments just raise a warning. My team has 
> passed unused arguments to operators and not noticed the warning a few times, 
> and it would be useful to be able to opt in to airflow 2.0 validation early. 
> I propose adding a configuration flag that causes tasks to error on unused 
> arguments.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] milton0825 merged pull request #6332: [AIRFLOW-5710] Optionally raise exception on unused operator arguments.

2019-10-22 Thread GitBox
milton0825 merged pull request #6332: [AIRFLOW-5710] Optionally raise exception 
on unused operator arguments.
URL: https://github.com/apache/airflow/pull/6332
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] KevinYang21 merged pull request #6391: [AIRFLOW-XXX] Improve wording

2019-10-22 Thread GitBox
KevinYang21 merged pull request #6391: [AIRFLOW-XXX] Improve wording
URL: https://github.com/apache/airflow/pull/6391
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io edited a comment on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-10-22 Thread GitBox
codecov-io edited a comment on issue #5743: [AIRFLOW-5088][AIP-24] Persisting 
serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#issuecomment-529755241
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5743?src=pr=h1) 
Report
   > :exclamation: No coverage uploaded for pull request base 
(`master@4e661f5`). [Click here to learn what that 
means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `83.81%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5743/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5743?src=pr=tree)
   
   ```diff
    @@             Coverage Diff             @@
    ##             master    #5743   +/-   ##
    =========================================
      Coverage          ?   84.16%
    =========================================
      Files             ?      627
      Lines             ?    36531
      Branches          ?        0
    =========================================
      Hits              ?    30745
      Misses            ?     5786
      Partials          ?        0
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/5743?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/serialization/\_\_init\_\_.py](https://codecov.io/gh/apache/airflow/pull/5743/diff?src=pr=tree#diff-YWlyZmxvdy9zZXJpYWxpemF0aW9uL19faW5pdF9fLnB5)
 | `100% <100%> (ø)` | |
   | 
[airflow/models/baseoperator.py](https://codecov.io/gh/apache/airflow/pull/5743/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvYmFzZW9wZXJhdG9yLnB5)
 | `95.91% <100%> (ø)` | |
   | 
[airflow/settings.py](https://codecov.io/gh/apache/airflow/pull/5743/diff?src=pr=tree#diff-YWlyZmxvdy9zZXR0aW5ncy5weQ==)
 | `88.48% <100%> (ø)` | |
   | 
[airflow/utils/log/logging\_mixin.py](https://codecov.io/gh/apache/airflow/pull/5743/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9sb2cvbG9nZ2luZ19taXhpbi5weQ==)
 | `96.15% <100%> (ø)` | |
   | 
[airflow/www/utils.py](https://codecov.io/gh/apache/airflow/pull/5743/diff?src=pr=tree#diff-YWlyZmxvdy93d3cvdXRpbHMucHk=)
 | `80.19% <100%> (ø)` | |
   | 
[airflow/models/\_\_init\_\_.py](https://codecov.io/gh/apache/airflow/pull/5743/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvX19pbml0X18ucHk=)
 | `100% <100%> (ø)` | |
   | 
[airflow/serialization/enums.py](https://codecov.io/gh/apache/airflow/pull/5743/diff?src=pr=tree#diff-YWlyZmxvdy9zZXJpYWxpemF0aW9uL2VudW1zLnB5)
 | `100% <100%> (ø)` | |
   | 
[airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/airflow/pull/5743/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==)
 | `58.15% <16.66%> (ø)` | |
   | 
[airflow/models/dagbag.py](https://codecov.io/gh/apache/airflow/pull/5743/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvZGFnYmFnLnB5)
 | `85.24% <41.66%> (ø)` | |
   | 
[airflow/models/dag.py](https://codecov.io/gh/apache/airflow/pull/5743/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvZGFnLnB5)
 | `90.87% <80%> (ø)` | |
   | ... and [8 
more](https://codecov.io/gh/apache/airflow/pull/5743/diff?src=pr=tree-more) 
| |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5743?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5743?src=pr=footer). 
Last update 
[4e661f5...4e1416a](https://codecov.io/gh/apache/airflow/pull/5743?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-10-22 Thread GitBox
kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] 
Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#discussion_r337801737
 
 

 ##
 File path: docs/howto/enable-dag-serialization.rst
 ##
 @@ -0,0 +1,109 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+
+
+
+Enable DAG Serialization
+========================
+
+Add the following settings in ``airflow.cfg``:
+
+.. code-block:: ini
+
+[core]
+store_serialized_dags = True
+min_serialized_dag_update_interval = 30
+
+*   ``store_serialized_dags``: This flag decides whether to serialize DAGs 
and persist them in the DB.
+    If set to True, the Webserver reads from the DB instead of parsing DAG 
files.
+*   ``min_serialized_dag_update_interval``: This flag sets the minimum 
interval (in seconds) after which
+    the serialized DAG in the DB should be updated. This helps reduce the 
database write rate.
+
+If you are updating Airflow from <1.10.6, please do not forget to run 
``airflow db upgrade``.
+
+
+How it works
+------------
+
+In order to make Airflow Webserver stateless (almost!), Airflow >=1.10.6 
supports
+DAG Serialization and DB Persistence.
+
+.. image:: ../img/dag_serialization.png
+
+As shown in the image above, in vanilla Airflow the Webserver and the 
Scheduler both
+need access to the DAG files, and both of them parse the DAG files.
+
+With **DAG Serialization** we aim to decouple the webserver from DAG parsing
+which would make the Webserver very light-weight.
+
+As shown in the image above, when using the dag_serialization feature,
+the Scheduler parses the DAG files, serializes them in JSON format and saves 
them in the Metadata DB.
+
+The Webserver, instead of having to parse the DAG files again, reads the
+serialized DAGs in JSON, de-serializes them, creates the DagBag and uses it
+to show in the UI.
+
+One of the key features implemented as part of DAG Serialization is that,
+instead of loading an entire DagBag when the Webserver starts, we only load 
each DAG on demand from the
+Serialized DAG table. This helps reduce Webserver startup time and memory 
usage. The reduction is notable
+when you have a large number of DAGs.
+
+Below is the screenshot of the ``serialized_dag`` table in Metadata DB:
+
+.. image:: ../img/serialized_dag_table.png
+
+Limitations
+-----------
+The Webserver will still need access to DAG files in the following cases,
+which is why we said "almost" stateless.
+
+*   The **Rendered Template** tab will still have to parse the Python file, 
as it needs all the details, like
+    the execution date and even the data passed by the upstream task using 
XCom.
+*   The **Code View** will read the DAG file & show it using Pygments.
+    However, it does not need to parse the Python file, so it is still a 
small operation.
+*   :doc:`Extra Operator Links ` would not work out of
+the box. They need to be defined in Airflow Plugin.
+
+**Existing Airflow Operators**:
+To make extra operator links work with existing operators like BigQuery, 
copy all
+the classes that are defined in ``operator_extra_links`` property.
 
 Review comment:
   Looks like the currently defined OperatorLinks for inbuilt operators depend 
on the instance of the operator object.
   
   For example, the following is the Qubole operator's link:
   
   ```
   class QDSLink(BaseOperatorLink):
       """Link to QDS"""
       name = 'Go to QDS'
   
       def get_link(self, operator, dttm):
           return operator.get_hook().get_extra_links(operator, dttm)
   ```
   which depends on the `conn = 
BaseHook.get_connection(operator.kwargs['qubole_conn_id'])` line. We don't 
store `qubole_conn_id` as a task property in our serialized DAG.
   
   And the BigQueryOperator's is:
   
   ```
   @property
   def operator_extra_links(self):
       """
       Return operator extra links
       """
       if isinstance(self.sql, str):
           return (
               BigQueryConsoleLink(),
           )
       return (
           BigQueryConsoleIndexableLink(i) for i, _ in enumerate(self.sql)
       )
   ```
   this one depends on the `self.sql` property.
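
Since the quoted doc says extra operator links must come from a plugin once serialization is on, a minimal plugin sketch may help. The class, link name, and URL below are illustrative, and the `operator_extra_links` plugin attribute is assumed from the plugin interface of this era:

```python
from airflow.models.baseoperator import BaseOperatorLink
from airflow.plugins_manager import AirflowPlugin


class ExampleDocsLink(BaseOperatorLink):
    """A link built only from fields that survive serialization."""
    name = 'Example Docs'

    def get_link(self, operator, dttm):
        # Use serializable attributes (task_id, dttm) rather than live
        # operator state such as hooks or connection IDs.
        return 'https://example.com/docs?task=%s' % operator.task_id


class ExampleLinkPlugin(AirflowPlugin):
    name = 'example_link_plugin'
    operator_extra_links = [ExampleDocsLink()]
```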


[GitHub] [airflow] codecov-io edited a comment on issue #6317: [AIRFLOW-5644] Simplify TriggerDagRunOperator usage

2019-10-22 Thread GitBox
codecov-io edited a comment on issue #6317: [AIRFLOW-5644] Simplify 
TriggerDagRunOperator usage
URL: https://github.com/apache/airflow/pull/6317#issuecomment-541349899
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6317?src=pr=h1) 
Report
   > Merging 
[#6317](https://codecov.io/gh/apache/airflow/pull/6317?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/3cfe4a1c9dc49c91839e9c278b97f9c18033fdf4?src=pr=desc)
 will **decrease** coverage by `0.26%`.
   > The diff coverage is `95%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6317/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/6317?src=pr=tree)
   
   ```diff
    @@             Coverage Diff             @@
    ##           master    #6317      +/-   ##
    ==========================================
    - Coverage   80.57%    80.3%    -0.27%
    ==========================================
      Files         626      626
      Lines       36237    36223      -14
    ==========================================
    - Hits        29198    29090     -108
    - Misses       7039     7133      +94
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6317?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[...low/example\_dags/example\_trigger\_controller\_dag.py](https://codecov.io/gh/apache/airflow/pull/6317/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV90cmlnZ2VyX2NvbnRyb2xsZXJfZGFnLnB5)
 | `100% <100%> (+43.75%)` | :arrow_up: |
   | 
[airflow/operators/dagrun\_operator.py](https://codecov.io/gh/apache/airflow/pull/6317/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvZGFncnVuX29wZXJhdG9yLnB5)
 | `96% <100%> (+1.26%)` | :arrow_up: |
   | 
[airflow/utils/dates.py](https://codecov.io/gh/apache/airflow/pull/6317/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYXRlcy5weQ==)
 | `82.6% <100%> (ø)` | :arrow_up: |
   | 
[airflow/example\_dags/example\_trigger\_target\_dag.py](https://codecov.io/gh/apache/airflow/pull/6317/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV90cmlnZ2VyX3RhcmdldF9kYWcucHk=)
 | `90% <75%> (-2.31%)` | :arrow_down: |
   | 
[airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/6317/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==)
 | `44.44% <0%> (-55.56%)` | :arrow_down: |
   | 
[airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/6317/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==)
 | `52.94% <0%> (-47.06%)` | :arrow_down: |
   | 
[airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/6317/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==)
 | `45.25% <0%> (-46.72%)` | :arrow_down: |
   | 
[airflow/kubernetes/kube\_client.py](https://codecov.io/gh/apache/airflow/pull/6317/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL2t1YmVfY2xpZW50LnB5)
 | `33.33% <0%> (-41.67%)` | :arrow_down: |
   | 
[...rflow/contrib/operators/kubernetes\_pod\_operator.py](https://codecov.io/gh/apache/airflow/pull/6317/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZF9vcGVyYXRvci5weQ==)
 | `70.14% <0%> (-28.36%)` | :arrow_down: |
   | 
[airflow/models/taskinstance.py](https://codecov.io/gh/apache/airflow/pull/6317/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvdGFza2luc3RhbmNlLnB5)
 | `93.28% <0%> (-0.51%)` | :arrow_down: |
   | ... and [4 
more](https://codecov.io/gh/apache/airflow/pull/6317/diff?src=pr=tree-more) 
| |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6317?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6317?src=pr=footer). 
Last update 
[3cfe4a1...467dfcc](https://codecov.io/gh/apache/airflow/pull/6317?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-10-22 Thread GitBox
kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] 
Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#discussion_r337773140
 
 

 ##
 File path: airflow/models/serialized_dag.py
 ##
 @@ -0,0 +1,214 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Serialzed DAG table in database."""
+
+import hashlib
+from datetime import timedelta
+from typing import TYPE_CHECKING, Any, Dict, List, Optional
+
+# from sqlalchemy import Column, Index, Integer, String, Text, JSON, and_, exc
+from sqlalchemy import JSON, Column, Index, Integer, String, and_
+from sqlalchemy.sql import exists
+
+from airflow.models.base import ID_LEN, Base
+from airflow.utils import db, timezone
+from airflow.utils.log.logging_mixin import LoggingMixin
+from airflow.utils.sqlalchemy import UtcDateTime
+
+if TYPE_CHECKING:
+from airflow.models import DAG  # noqa: F401; # pylint: 
disable=cyclic-import
+from airflow.serialization import SerializedDAG  # noqa: F401
+
+
+log = LoggingMixin().log
+
+
+class SerializedDagModel(Base):
+"""A table for serialized DAGs.
+
+serialized_dag table is a snapshot of DAG files synchronized by scheduler.
+This feature is controlled by:
+
+* ``[core] store_serialized_dags = True``: enable this feature
+* ``[core] min_serialized_dag_update_interval = 30`` (s):
+  serialized DAGs are updated in DB when a file gets processed by 
scheduler,
+  to reduce DB write rate, there is a minimal interval of updating 
serialized DAGs.
+* ``[scheduler] dag_dir_list_interval = 300`` (s):
+  interval of deleting serialized DAGs in DB when the files are deleted, 
suggest
+  to use a smaller interval such as 60
+
+It is used by webserver to load dagbags when 
``store_serialized_dags=True``.
+Because reading from database is lightweight compared to importing from 
files,
+it solves the webserver scalability issue.
+"""
+__tablename__ = 'serialized_dag'
+
+dag_id = Column(String(ID_LEN), primary_key=True)
+fileloc = Column(String(2000), nullable=False)
+# The max length of fileloc exceeds the limit of indexing.
+fileloc_hash = Column(Integer, nullable=False)
+data = Column(JSON, nullable=False)
+last_updated = Column(UtcDateTime, nullable=False)
+
+__table_args__ = (
+Index('idx_fileloc_hash', fileloc_hash, unique=False),
+)
+
+def __init__(self, dag: 'DAG'):
+from airflow.serialization import SerializedDAG  # noqa # pylint: 
disable=redefined-outer-name
+
+self.dag_id = dag.dag_id
+self.fileloc = dag.full_filepath
+self.fileloc_hash = self.dag_fileloc_hash(self.fileloc)
+self.data = SerializedDAG.to_dict(dag)
+self.last_updated = timezone.utcnow()
+
+@staticmethod
+def dag_fileloc_hash(full_filepath: str) -> int:
+Hashing file location for indexing.
+
+:param full_filepath: full filepath of DAG file
+:return: hashed full_filepath
+"""
+# hashing is needed because the length of fileloc is 2000 as an 
Airflow convention,
+# which is over the limit of indexing. If we can reduce the length of 
fileloc, then
+# hashing is not needed.
+return int.from_bytes(
+hashlib.sha1(full_filepath.encode('utf-8')).digest()[-2:], 
byteorder='big', signed=False)
+
+@classmethod
+@db.provide_session
+def write_dag(cls, dag: 'DAG', min_update_interval: Optional[int] = None, 
session=None):
+"""Serializes a DAG and writes it into database.
+
+:param dag: a DAG to be written into database
+:param min_update_interval: minimal interval in seconds to update 
serialized DAG
+:param session: ORM Session
+"""
+log.debug("Writing DAG: %s to the DB", dag)
+# Checks if (Current Time - Time when the DAG was written to DB) < 
min_update_interval
+# If Yes, does nothing
+# If No or the DAG does not exists, updates / writes Serialized DAG to 
DB
+if min_update_interval is not None:
+if session.query(exists().where(
+and_(cls.dag_id == 
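
The time gate described in the comments above amounts to "skip the write if the row was updated recently". A hedged reconstruction of that pattern (not the exact merged code, which is truncated above):

```python
from datetime import timedelta

from sqlalchemy import and_
from sqlalchemy.sql import exists

from airflow.utils import timezone


def recently_written(session, model, dag_id, min_update_interval):
    """Return True if the serialized row for dag_id changed within the interval."""
    threshold = timezone.utcnow() - timedelta(seconds=min_update_interval)
    return session.query(
        exists().where(and_(model.dag_id == dag_id,
                            model.last_updated > threshold))).scalar()
```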

[GitHub] [airflow] kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-10-22 Thread GitBox
kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] 
Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#discussion_r337772827
 
 

 ##
 File path: airflow/dag/serialization/schema.json
 ##
 @@ -0,0 +1,196 @@
+{
+  "$schema": "http://json-schema.org/draft-07/schema#",
+  "$id": "https://airflow.apache.com/schemas/serialized-dags.json",
+  "definitions": {
+    "datetime": {
+      "description": "A date time, stored as fractional seconds since the epoch",
+      "type": "number"
+    },
+    "typed_datetime": {
+      "type": "object",
+      "properties": {
+        "__type": {
+          "type": "string",
+          "const": "datetime"
+        },
+        "__var": { "$ref": "#/definitions/datetime" }
+      },
+      "required": [
+        "__type",
+        "__var"
+      ],
+      "additionalProperties": false
+    },
+    "timedelta": {
+      "type": "number",
+      "minimum": 0
+    },
+    "typed_timedelta": {
+      "type": "object",
+      "properties": {
+        "__type": {
+          "type": "string",
+          "const": "timedelta"
+        },
+        "__var": { "$ref": "#/definitions/timedelta" }
+      },
+      "required": [
+        "__type",
+        "__var"
+      ],
+      "additionalProperties": false
+    },
+    "typed_relativedelta": {
+      "type": "object",
+      "description": "A dateutil.relativedelta.relativedelta object",
+      "properties": {
+        "__type": {
+          "type": "string",
+          "const": "relativedelta"
+        },
+        "__var": {
+          "type": "object",
+          "properties": {
+            "weekday": {
+              "type": "array",
+              "items": { "type": "integer" },
+              "minItems": 1,
+              "maxItems": 2
+            }
+          },
+          "additionalProperties": { "type": "integer" }
+        }
+      }
+    },
+    "timezone": {
+      "type": "string"
+    },
+    "dict": {
+      "description": "A python dictionary containing values of any type",
+      "type": "object"
+    },
+    "color": {
+      "type": "string",
+      "pattern": "^#[a-fA-F0-9]{3,6}$"
+    },
+    "stringset": {
+      "description": "A set of strings",
+      "type": "object",
+      "properties": {
+        "__type": {
+          "type": "string",
+          "const": "set"
+        },
+        "__var": {
+          "type": "array",
+          "items": { "type": "string" }
+        }
+      },
+      "required": [
+        "__type",
+        "__var"
+      ]
+    },
+    "dag": {
+      "type": "object",
+      "properties": {
+        "params": { "$ref": "#/definitions/dict" },
+        "_dag_id": { "type": "string" },
+        "task_dict": { "$ref": "#/definitions/task_dict" },
+        "timezone": { "$ref": "#/definitions/timezone" },
+        "schedule_interval": {
+          "anyOf": [
+            { "type": "null" },
+            { "type": "string" },
+            { "$ref": "#/definitions/typed_timedelta" },
+            { "$ref": "#/definitions/typed_relativedelta" }
+          ]
+        },
+        "catchup": { "type": "boolean" },
+        "is_subdag": { "type": "boolean" },
+        "fileloc": { "type": "string" },
+        "orientation": { "type": "string" },
+        "_description": { "type": "string" },
+        "_concurrency": { "type": "number" },
+        "max_active_runs": { "type": "number" },
+        "default_args": { "$ref": "#/definitions/dict" },
+        "start_date": { "$ref": "#/definitions/datetime" },
+        "dagrun_timeout": { "$ref": "#/definitions/timedelta" },
+        "doc_md": { "type": "string" }
+      },
+      "required": [
+        "params",
+        "_dag_id",
+        "fileloc",
+        "task_dict"
+      ],
+      "additionalProperties": false
+    },
+    "task_dict": {
+      "type": "object",
+      "additionalProperties": { "$ref": "#/definitions/operator" }
+    },
+    "operator": {
+      "$comment": "A task/operator in a DAG",
+      "type": "object",
+      "required": [
+        "_task_type",
+        "_task_module",
+        "task_id",
+        "ui_color",
+        "ui_fgcolor",
+        "template_fields"
+      ],
+      "properties": {
+        "_task_type": { "type": "string" },
+        "_task_module": { "type": "string" },
+        "task_id": { "type": "string" },
+        "owner": { "type": "string" },
+        "start_date": { "$ref": "#/definitions/datetime" },
+        "end_date": { "$ref": "#/definitions/datetime" },
+        "trigger_rule": { "type": "string" },
+        "depends_on_past": { "type": "boolean" },
+        "wait_for_downstream": { "type": "boolean" },
+        "retries": { "type": "number" },
+        "queue": { "type": "string" },
+        "pool": { "type": "string" },
+        "execution_timeout": { "$ref": "#/definitions/timedelta" },
+        "retry_delay": { "$ref": "#/definitions/timedelta" },
+        "retry_exponential_backoff": { 

[GitHub] [airflow] mik-laj commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-10-22 Thread GitBox
mik-laj commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] 
Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#discussion_r337772203
 
 

 ##
 File path: airflow/models/serialized_dag.py
 ##
 @@ -0,0 +1,214 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Serialzed DAG table in database."""
+
+import hashlib
+from datetime import timedelta
+from typing import TYPE_CHECKING, Any, Dict, List, Optional
+
+# from sqlalchemy import Column, Index, Integer, String, Text, JSON, and_, exc
+from sqlalchemy import JSON, Column, Index, Integer, String, and_
+from sqlalchemy.sql import exists
+
+from airflow.models.base import ID_LEN, Base
+from airflow.utils import db, timezone
+from airflow.utils.log.logging_mixin import LoggingMixin
+from airflow.utils.sqlalchemy import UtcDateTime
+
+if TYPE_CHECKING:
+from airflow.models import DAG  # noqa: F401; # pylint: 
disable=cyclic-import
+from airflow.serialization import SerializedDAG  # noqa: F401
+
+
+log = LoggingMixin().log
+
+
+class SerializedDagModel(Base):
+"""A table for serialized DAGs.
+
+The serialized_dag table is a snapshot of the DAG files synchronized by the scheduler.
+This feature is controlled by:
+
+* ``[core] store_serialized_dags = True``: enable this feature
+* ``[core] min_serialized_dag_update_interval = 30`` (s):
+  serialized DAGs are updated in the DB when a file gets processed by the scheduler;
+  to reduce the DB write rate, there is a minimal interval between updates of a
+  serialized DAG.
+* ``[scheduler] dag_dir_list_interval = 300`` (s):
+  interval for deleting serialized DAGs from the DB after their files are deleted;
+  we suggest using a smaller interval such as 60.
+
+It is used by the webserver to load DagBags when ``store_serialized_dags=True``.
+Because reading from the database is lightweight compared to importing from files,
+it solves the webserver scalability issue.
+"""
+__tablename__ = 'serialized_dag'
+
+dag_id = Column(String(ID_LEN), primary_key=True)
+fileloc = Column(String(2000), nullable=False)
+# The max length of fileloc exceeds the limit of indexing.
+fileloc_hash = Column(Integer, nullable=False)
+data = Column(JSON, nullable=False)
+last_updated = Column(UtcDateTime, nullable=False)
+
+__table_args__ = (
+Index('idx_fileloc_hash', fileloc_hash, unique=False),
+)
+
+def __init__(self, dag: 'DAG'):
+from airflow.serialization import SerializedDAG  # noqa # pylint: 
disable=redefined-outer-name
+
+self.dag_id = dag.dag_id
+self.fileloc = dag.full_filepath
+self.fileloc_hash = self.dag_fileloc_hash(self.fileloc)
+self.data = SerializedDAG.to_dict(dag)
+self.last_updated = timezone.utcnow()
+
+@staticmethod
+def dag_fileloc_hash(full_filepath: str) -> int:
+"""Hashing file location for indexing.
+
+:param full_filepath: full filepath of DAG file
+:return: hashed full_filepath
+"""
+# hashing is needed because the length of fileloc is 2000 as an 
Airflow convention,
+# which is over the limit of indexing. If we can reduce the length of 
fileloc, then
+# hashing is not needed.
+return int.from_bytes(
+hashlib.sha1(full_filepath.encode('utf-8')).digest()[-2:], 
byteorder='big', signed=False)
+
+@classmethod
+@db.provide_session
+def write_dag(cls, dag: 'DAG', min_update_interval: Optional[int] = None, 
session=None):
+"""Serializes a DAG and writes it into database.
+
+:param dag: a DAG to be written into database
+:param min_update_interval: minimal interval in seconds to update 
serialized DAG
+:param session: ORM Session
+"""
+log.debug("Writing DAG: %s to the DB", dag)
+# Checks if (Current Time - Time when the DAG was written to DB) < 
min_update_interval
+# If Yes, does nothing
+# If No or the DAG does not exist, updates / writes Serialized DAG to 
DB
+if min_update_interval is not None:
+if session.query(exists().where(
+and_(cls.dag_id == 

[GitHub] [airflow] kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-10-22 Thread GitBox
kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] 
Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#discussion_r337772008
 
 

 ##
 File path: airflow/models/serialized_dag.py
 ##
 @@ -0,0 +1,214 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Serialzed DAG table in database."""
+
+import hashlib
+from datetime import timedelta
+from typing import TYPE_CHECKING, Any, Dict, List, Optional
+
+# from sqlalchemy import Column, Index, Integer, String, Text, JSON, and_, exc
+from sqlalchemy import JSON, Column, Index, Integer, String, and_
+from sqlalchemy.sql import exists
+
+from airflow.models.base import ID_LEN, Base
+from airflow.utils import db, timezone
+from airflow.utils.log.logging_mixin import LoggingMixin
+from airflow.utils.sqlalchemy import UtcDateTime
+
+if TYPE_CHECKING:
+from airflow.models import DAG  # noqa: F401; # pylint: 
disable=cyclic-import
+from airflow.serialization import SerializedDAG  # noqa: F401
+
+
+log = LoggingMixin().log
+
+
+class SerializedDagModel(Base):
+"""A table for serialized DAGs.
+
+The serialized_dag table is a snapshot of the DAG files synchronized by the scheduler.
+This feature is controlled by:
+
+* ``[core] store_serialized_dags = True``: enable this feature
+* ``[core] min_serialized_dag_update_interval = 30`` (s):
+  serialized DAGs are updated in the DB when a file gets processed by the scheduler;
+  to reduce the DB write rate, there is a minimal interval between updates of a
+  serialized DAG.
+* ``[scheduler] dag_dir_list_interval = 300`` (s):
+  interval for deleting serialized DAGs from the DB after their files are deleted;
+  we suggest using a smaller interval such as 60.
+
+It is used by the webserver to load DagBags when ``store_serialized_dags=True``.
+Because reading from the database is lightweight compared to importing from files,
+it solves the webserver scalability issue.
+"""
+__tablename__ = 'serialized_dag'
+
+dag_id = Column(String(ID_LEN), primary_key=True)
+fileloc = Column(String(2000), nullable=False)
+# The max length of fileloc exceeds the limit of indexing.
+fileloc_hash = Column(Integer, nullable=False)
+data = Column(JSON, nullable=False)
+last_updated = Column(UtcDateTime, nullable=False)
+
+__table_args__ = (
+Index('idx_fileloc_hash', fileloc_hash, unique=False),
+)
+
+def __init__(self, dag: 'DAG'):
+from airflow.serialization import SerializedDAG  # noqa # pylint: 
disable=redefined-outer-name
+
+self.dag_id = dag.dag_id
+self.fileloc = dag.full_filepath
+self.fileloc_hash = self.dag_fileloc_hash(self.fileloc)
+self.data = SerializedDAG.to_dict(dag)
+self.last_updated = timezone.utcnow()
+
+@staticmethod
+def dag_fileloc_hash(full_filepath: str) -> int:
+"""Hashing file location for indexing.
+
+:param full_filepath: full filepath of DAG file
+:return: hashed full_filepath
+"""
+# hashing is needed because the length of fileloc is 2000 as an 
Airflow convention,
+# which is over the limit of indexing. If we can reduce the length of 
fileloc, then
+# hashing is not needed.
+return int.from_bytes(
+hashlib.sha1(full_filepath.encode('utf-8')).digest()[-2:], 
byteorder='big', signed=False)
+
+@classmethod
+@db.provide_session
+def write_dag(cls, dag: 'DAG', min_update_interval: Optional[int] = None, 
session=None):
+"""Serializes a DAG and writes it into database.
+
+:param dag: a DAG to be written into database
+:param min_update_interval: minimal interval in seconds to update 
serialized DAG
+:param session: ORM Session
+"""
+log.debug("Writing DAG: %s to the DB", dag)
+# Checks if (Current Time - Time when the DAG was written to DB) < 
min_update_interval
+# If Yes, does nothing
+# If No or the DAG does not exist, updates / writes Serialized DAG to 
DB
+if min_update_interval is not None:
+if session.query(exists().where(
+and_(cls.dag_id == 

[GitHub] [airflow-site] kgabryje commented on a change in pull request #89: [depends on #83] Feature/blog tags

2019-10-22 Thread GitBox
kgabryje commented on a change in pull request #89: [depends on #83] 
Feature/blog tags
URL: https://github.com/apache/airflow-site/pull/89#discussion_r337771639
 
 

 ##
 File path: landing-pages/site/assets/scss/main-custom.scss
 ##
 @@ -18,7 +18,7 @@
  */
 
 @import url('https://fonts.googleapis.com/css?family=Rubik:500&display=swap');
-@import url('https://fonts.googleapis.com/css?family=Roboto:400,500,700&display=swap');
+@import url('https://fonts.googleapis.com/css?family=Roboto:400,400i,500,700&display=swap');
 
 Review comment:
    




[GitHub] [airflow] kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-10-22 Thread GitBox
kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] 
Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#discussion_r337771516
 
 

 ##
 File path: docs/howto/enable-dag-serialization.rst
 ##
 @@ -0,0 +1,109 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+
+
+
+Enable DAG Serialization
+
+
+Add the following settings in ``airflow.cfg``:
+
+.. code-block:: ini
+
+[core]
+store_serialized_dags = True
+min_serialized_dag_update_interval = 30
+
+*   ``store_serialized_dags``: This flag decides whether to serialize DAGs
+    and persist them in the DB. If set to True, the Webserver reads from the
+    DB instead of parsing the DAG files.
+*   ``min_serialized_dag_update_interval``: This flag sets the minimum
+    interval (in seconds) after which the serialized DAG in the DB should be
+    updated. This helps in reducing the database write rate.
+
+If you are updating Airflow from a version earlier than 1.10.6, please do not
+forget to run ``airflow db upgrade``.
+
+
+How it works
+
+
+In order to make Airflow Webserver stateless (almost!), Airflow >=1.10.6 
supports
+DAG Serialization and DB Persistence.
+
+.. image:: ../img/dag_serialization.png
+
+As shown in the image above, in vanilla Airflow both the Webserver and the
+Scheduler need access to the DAG files, and both parse them.
+
+With **DAG Serialization** we aim to decouple the Webserver from DAG parsing,
+which makes the Webserver very lightweight.
+
+As shown in the image above, when the dag_serialization feature is enabled,
+the Scheduler parses the DAG files, serializes them in JSON format and saves
+them in the Metadata DB.
+
+Instead of parsing the DAG files again, the Webserver reads the serialized
+DAGs in JSON, de-serializes them, creates the DagBag and uses it to render
+the UI.
+
+A key feature implemented as part of DAG Serialization is that instead of
+loading an entire DagBag when the Webserver starts, each DAG is loaded on
+demand from the Serialized Dag table. This reduces Webserver startup time and
+memory usage, which is notable when you have a large number of DAGs.
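
A hedged sketch of this flow, using only the ``SerializedDagModel`` and
``SerializedDAG`` APIs quoted elsewhere in this thread (``dag`` is any parsed
DAG object, and the ORM ``session`` setup is assumed and omitted):

    from airflow.models.serialized_dag import SerializedDagModel
    from airflow.serialization import SerializedDAG

    # Scheduler side: serialize the parsed DAG and persist it, throttled by
    # min_update_interval as in the quoted write_dag implementation.
    SerializedDagModel.write_dag(dag, min_update_interval=30)

    # Webserver side: fetch the JSON snapshot and rebuild a DAG from it.
    row = session.query(SerializedDagModel).filter(
        SerializedDagModel.dag_id == dag.dag_id).one()
    webserver_dag = SerializedDAG.from_dict(row.data)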
+
+Below is the screenshot of the ``serialized_dag`` table in Metadata DB:
+
+.. image:: ../img/serialized_dag_table.png
+
+Limitations
+-----------
+The Webserver will still need access to DAG files in the following cases,
+which is why we said "almost" stateless.
+
+*   **Rendered Template** tab will still have to parse the Python file, as it
+    needs all the details like the execution date and even the data passed by
+    the upstream task using XCom.
+*   **Code View** will read the DAG file & show it using Pygments.
+    However, it does not need to parse the Python file, so it is still a small
+    operation.
+*   :doc:`Extra Operator Links ` would not work out of
+    the box. They need to be defined in an Airflow Plugin.
+
+**Existing Airflow Operators**:
+To make extra operator links work with existing operators like BigQuery, copy all
+the classes that are defined in the ``operator_extra_links`` property.
 
 Review comment:
   Yes, unfortunately in the current implementation you need to add operator 
links via a plugin for in-built operators too.
   
   Let me try and see if I can remove the need to register links for in-built 
operators.




[GitHub] [airflow] kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-10-22 Thread GitBox
kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] 
Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#discussion_r337770457
 
 

 ##
 File path: airflow/models/serialized_dag.py
 ##
 @@ -0,0 +1,214 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Serialzed DAG table in database."""
+
+import hashlib
+from datetime import timedelta
+from typing import TYPE_CHECKING, Any, Dict, List, Optional
+
+# from sqlalchemy import Column, Index, Integer, String, Text, JSON, and_, exc
+from sqlalchemy import JSON, Column, Index, Integer, String, and_
+from sqlalchemy.sql import exists
+
+from airflow.models.base import ID_LEN, Base
+from airflow.utils import db, timezone
+from airflow.utils.log.logging_mixin import LoggingMixin
+from airflow.utils.sqlalchemy import UtcDateTime
+
+if TYPE_CHECKING:
+from airflow.models import DAG  # noqa: F401; # pylint: 
disable=cyclic-import
+from airflow.serialization import SerializedDAG  # noqa: F401
+
+
+log = LoggingMixin().log
+
+
+class SerializedDagModel(Base):
+"""A table for serialized DAGs.
+
+The serialized_dag table is a snapshot of the DAG files synchronized by the scheduler.
+This feature is controlled by:
+
+* ``[core] store_serialized_dags = True``: enable this feature
+* ``[core] min_serialized_dag_update_interval = 30`` (s):
+  serialized DAGs are updated in the DB when a file gets processed by the scheduler;
+  to reduce the DB write rate, there is a minimal interval between updates of a
+  serialized DAG.
+* ``[scheduler] dag_dir_list_interval = 300`` (s):
+  interval for deleting serialized DAGs from the DB after their files are deleted;
+  we suggest using a smaller interval such as 60.
+
+It is used by the webserver to load DagBags when ``store_serialized_dags=True``.
+Because reading from the database is lightweight compared to importing from files,
+it solves the webserver scalability issue.
+"""
+__tablename__ = 'serialized_dag'
+
+dag_id = Column(String(ID_LEN), primary_key=True)
+fileloc = Column(String(2000), nullable=False)
+# The max length of fileloc exceeds the limit of indexing.
+fileloc_hash = Column(Integer, nullable=False)
+data = Column(JSON, nullable=False)
+last_updated = Column(UtcDateTime, nullable=False)
+
+__table_args__ = (
+Index('idx_fileloc_hash', fileloc_hash, unique=False),
+)
+
+def __init__(self, dag: 'DAG'):
+from airflow.serialization import SerializedDAG  # noqa # pylint: 
disable=redefined-outer-name
+
+self.dag_id = dag.dag_id
+self.fileloc = dag.full_filepath
+self.fileloc_hash = self.dag_fileloc_hash(self.fileloc)
+self.data = SerializedDAG.to_dict(dag)
+self.last_updated = timezone.utcnow()
+
+@staticmethod
+def dag_fileloc_hash(full_filepath: str) -> int:
+"""Hashing file location for indexing.
+
+:param full_filepath: full filepath of DAG file
+:return: hashed full_filepath
+"""
+# hashing is needed because the length of fileloc is 2000 as an 
Airflow convention,
+# which is over the limit of indexing. If we can reduce the length of 
fileloc, then
+# hashing is not needed.
+return int.from_bytes(
+hashlib.sha1(full_filepath.encode('utf-8')).digest()[-2:], 
byteorder='big', signed=False)
+
+@classmethod
+@db.provide_session
+def write_dag(cls, dag: 'DAG', min_update_interval: Optional[int] = None, 
session=None):
+"""Serializes a DAG and writes it into database.
+
+:param dag: a DAG to be written into database
+:param min_update_interval: minimal interval in seconds to update 
serialized DAG
+:param session: ORM Session
+"""
+log.debug("Writing DAG: %s to the DB", dag)
+# Checks if (Current Time - Time when the DAG was written to DB) < 
min_update_interval
+# If Yes, does nothing
+# If No or the DAG does not exist, updates / writes Serialized DAG to 
DB
+if min_update_interval is not None:
+if session.query(exists().where(
+and_(cls.dag_id == 

[GitHub] [airflow] kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-10-22 Thread GitBox
kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] 
Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#discussion_r337769472
 
 

 ##
 File path: airflow/dag/serialization/schema.json
 ##
 @@ -0,0 +1,196 @@
+{
+  "$schema": "http://json-schema.org/draft-07/schema#;,
+  "$id": "https://airflow.apache.com/schemas/serialized-dags.json;,
+  "definitions": {
+"datetime": {
+  "description": "A date time, stored as fractional seconds since the 
epoch",
+  "type": "number"
+},
+"typed_datetime": {
+  "type": "object",
+  "properties": {
+"__type": {
+  "type": "string",
+  "const": "datetime"
+},
+"__var": { "$ref": "#/definitions/datetime" }
+  },
+  "required": [
+"__type",
+"__var"
+  ],
+  "additionalProperties": false
+},
+"timedelta": {
+  "type": "number",
+  "minimum": 0
+},
+"typed_timedelta": {
+  "type": "object",
+  "properties": {
+"__type": {
+  "type": "string",
+  "const": "timedelta"
+},
+"__var": { "$ref": "#/definitions/timedelta" }
+  },
+  "required": [
+"__type",
+"__var"
+  ],
+  "additionalProperties": false
+},
+"typed_relativedelta": {
+  "type": "object",
+  "description": "A dateutil.relativedelta.relativedelta object",
+  "properties": {
+"__type": {
+  "type": "string",
+  "const": "relativedelta"
+},
+"__var": {
+  "type": "object",
+  "properties": {
+"weekday": {
+  "type": "array",
+  "items": { "type": "integer" },
+  "minItems": 1,
+  "maxItems": 2
+}
+  },
+  "additionalProperties": { "type": "integer" }
+}
+  }
+},
+"timezone": {
+  "type": "string"
+},
+"dict": {
+  "description": "A python dictionary containing values of any type",
+  "type": "object"
+},
+"color": {
+  "type": "string",
+  "pattern": "^#[a-fA-F0-9]{3,6}$"
+},
+"stringset": {
+  "description": "A set of strings",
+  "type": "object",
+  "properties": {
+"__type": {
+  "type": "string",
+  "const": "set"
+},
+"__var": {
+  "type": "array",
+  "items": { "type": "string" }
+}
+  },
+  "required": [
+"__type",
+"__var"
+  ]
+},
+"dag": {
+  "type": "object",
+  "properties": {
+"params": { "$ref": "#/definitions/dict" },
+"_dag_id": { "type": "string" },
+"task_dict": {  "$ref": "#/definitions/task_dict" },
+"timezone": { "$ref": "#/definitions/timezone" },
+"schedule_interval": {
+  "anyOf": [
+{ "type": "null" },
+{ "type": "string" },
+{ "$ref": "#/definitions/typed_timedelta" },
+{ "$ref": "#/definitions/typed_relativedelta" }
+  ]
+},
+"catchup": { "type": "boolean" },
+"is_subdag": { "type": "boolean" },
+"fileloc": { "type" : "string"},
+"orientation": { "type" : "string"},
+"_description": { "type" : "string"},
+"_concurrency": { "type" : "number"},
+"max_active_runs": { "type" : "number"},
+"default_args": { "$ref": "#/definitions/dict" },
+"start_date": { "$ref": "#/definitions/datetime" },
+"dagrun_timeout": { "$ref": "#/definitions/timedelta" },
+"doc_md": { "type" : "string"}
+  },
+  "required": [
+"params",
+"_dag_id",
+"fileloc",
+"task_dict"
+  ],
+  "additionalProperties": false
+},
+"task_dict": {
+  "type": "object",
+  "additionalProperties": { "$ref": "#/definitions/operator" }
+},
+"operator": {
+  "$comment": "A task/operator in a DAG",
+  "type": "object",
+  "required": [
+"_task_type",
+"_task_module",
+"task_id",
+"ui_color",
+"ui_fgcolor",
+"template_fields"
+  ],
+  "properties": {
+"_task_type": { "type": "string" },
+"_task_module": { "type": "string" },
+"task_id": { "type": "string" },
+"owner": { "type": "string" },
+"start_date": { "$ref": "#/definitions/datetime" },
+"end_date": { "$ref": "#/definitions/datetime" },
+"trigger_rule": { "type": "string" },
+"depends_on_past": { "type": "boolean" },
+"wait_for_downstream": { "type": "boolean" },
+"retries": { "type": "number" },
+"queue": { "type": "string" },
+"pool": { "type": "string" },
+"execution_timeout": { "$ref": "#/definitions/timedelta" },
+"retry_delay": { "$ref": "#/definitions/timedelta" },
+"retry_exponential_backoff": { 

[GitHub] [airflow] kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-10-22 Thread GitBox
kaxil commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] 
Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#discussion_r337769592
 
 

 ##
 File path: airflow/dag/serialization/serialization.py
 ##
 @@ -53,81 +50,81 @@ class Serialization:
 _primitive_types = (int, bool, float, str)
 
 # Time types.
-_datetime_types = (datetime.datetime, datetime.date, datetime.time)
+# datetime.date and datetime.time are converted to strings.
+_datetime_types = (datetime.datetime,)
 
 # Object types that are always excluded in serialization.
 # FIXME: not needed if _included_fields of DAG and operator are customized.
 _excluded_types = (logging.Logger, Connection, type)
 
-_json_schema = None # type: Optional[Dict]
+_json_schema = None  # type: Optional[Validator]
+
+_CONSTRUCTOR_PARAMS = {}  # type: Dict[str, Parameter]
+
+SERIALIZER_VERSION = 1
 
 @classmethod
 def to_json(cls, var: Union[DAG, BaseOperator, dict, list, set, tuple]) -> 
str:
 """Stringifies DAGs and operators contained by var and returns a JSON 
string of var.
 """
-return json.dumps(cls._serialize(var, {}), ensure_ascii=True)
+return json.dumps(cls.to_dict(var), ensure_ascii=True)
 
 @classmethod
-def from_json(cls, json_str: str) -> Union[
+def to_dict(cls, var: Union[DAG, BaseOperator, dict, list, set, tuple]) -> 
dict:
+"""Stringifies DAGs and operators contained by var and returns a dict 
of var.
+"""
+# Don't call on this class directly - only SerializedDAG or
+# SerializedBaseOperator should be used as the "entrypoint"
+raise NotImplementedError()
+
+@classmethod
+def from_json(cls, serialized_obj: str) -> Union[
 'SerializedDAG', 'SerializedBaseOperator', dict, list, set, tuple]:
 """Deserializes json_str and reconstructs all DAGs and operators it 
contains."""
-return cls._deserialize(json.loads(json_str), {})
+return cls.from_dict(json.loads(serialized_obj))
 
 @classmethod
-def validate_json(cls, json_str: str):
-"""Validate json_str satisfies JSON schema."""
+def from_dict(cls, serialized_obj: dict) -> Union[
+'SerializedDAG', 'SerializedBaseOperator', dict, list, set, tuple]:
+"""Deserializes a python dict stored with type decorators and
+reconstructs all DAGs and operators it contains."""
+return cls._deserialize(serialized_obj)
+
+@classmethod
+def validate_schema(cls, serialized_obj: Union[str, dict]):
+"""Validate serialized_obj satisfies JSON schema."""
 if cls._json_schema is None:
 raise AirflowException('JSON schema of {:s} is not 
set.'.format(cls.__name__))
-jsonschema.validate(json.loads(json_str), cls._json_schema)
+
+if isinstance(serialized_obj, dict):
+cls._json_schema.validate(serialized_obj)
+elif isinstance(serialized_obj, str):
+cls._json_schema.validate(json.loads(serialized_obj))
 
 Review comment:
   Updated
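
For context, a hedged sketch of how the method above can be exercised,
assuming ``SerializedDAG`` inherits these classmethods from ``Serialization``
as the diff suggests (``dag`` is any parsed DAG object):

    import json
    from airflow.serialization import SerializedDAG

    serialized = SerializedDAG.to_dict(dag)                 # dict with __type/__var decorators
    SerializedDAG.validate_schema(serialized)               # dict input
    SerializedDAG.validate_schema(json.dumps(serialized))   # str input is json.loads-ed first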




[GitHub] [airflow] stevetaggart opened a new pull request #6391: [AIRFLOW-XXX] Improve wording

2019-10-22 Thread GitBox
stevetaggart opened a new pull request #6391: [AIRFLOW-XXX] Improve wording
URL: https://github.com/apache/airflow/pull/6391
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes: a small change to documentation wording
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason: tiny documentation change 
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   




[GitHub] [airflow-site] mik-laj commented on a change in pull request #89: Feature/blog tags

2019-10-22 Thread GitBox
mik-laj commented on a change in pull request #89: Feature/blog tags
URL: https://github.com/apache/airflow-site/pull/89#discussion_r337763096
 
 

 ##
 File path: landing-pages/site/assets/scss/main-custom.scss
 ##
 @@ -18,7 +18,7 @@
  */
 
 @import url('https://fonts.googleapis.com/css?family=Rubik:500&display=swap');
-@import url('https://fonts.googleapis.com/css?family=Roboto:400,500,700&display=swap');
+@import url('https://fonts.googleapis.com/css?family=Roboto:400,400i,500,700&display=swap');
 
 Review comment:
   It doesn't have to be in a separate PR. I just didn't know what happened 
here and I thought it was a typo




[GitHub] [airflow-site] kgabryje commented on a change in pull request #89: Feature/blog tags

2019-10-22 Thread GitBox
kgabryje commented on a change in pull request #89: Feature/blog tags
URL: https://github.com/apache/airflow-site/pull/89#discussion_r337754367
 
 

 ##
 File path: landing-pages/site/assets/scss/main-custom.scss
 ##
 @@ -18,7 +18,7 @@
  */
 
 @import url('https://fonts.googleapis.com/css?family=Rubik:500&display=swap');
-@import url('https://fonts.googleapis.com/css?family=Roboto:400,500,700&display=swap');
+@import url('https://fonts.googleapis.com/css?family=Roboto:400,400i,500,700&display=swap');
 
 Review comment:
   I added Roboto italic because we have italic style in typography. Though i 
realise now that it shouldn't be a part of this PR.




[jira] [Updated] (AIRFLOW-5720) don't call _get_connections_from_db in TestCloudSqlDatabaseHook

2019-10-22 Thread Daniel Standish (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Standish updated AIRFLOW-5720:
-
Description: 
Issues with this test class:

*tests are mocking lower-level than they need to*
Tests were mocking {{airflow.hook.BaseHook.get_connections}}.
Instead they can mock 
{{airflow.gcp.hooks.cloud_sql.CloudSqlDatabaseHook.get_connection}} which is 
more direct.

*should not reference private method*

This is an impediment to refactoring of connections / creds.

*Tests had complexity that did not add a benefit*

They all had this bit:
{code:python}
self._setup_connections(get_connections, uri)
gcp_conn_id = 'google_cloud_default'
hook = CloudSqlDatabaseHook(

default_gcp_project_id=BaseHook.get_connection(gcp_conn_id).extra_dejson.get(
'extra__google_cloud_platform__project')
)
{code}

{{_setup_connections}} was like this:
{code:python}
@staticmethod
def _setup_connections(get_connections, uri):
gcp_connection = mock.MagicMock()
gcp_connection.extra_dejson = mock.MagicMock()
gcp_connection.extra_dejson.get.return_value = 'empty_project'
cloudsql_connection = Connection()
cloudsql_connection.parse_from_uri(uri)
cloudsql_connection2 = Connection()
cloudsql_connection2.parse_from_uri(uri)
get_connections.side_effect = [[gcp_connection], [cloudsql_connection],
   [cloudsql_connection2]]
{code}

The issues here are as follows.

1. No test ever used the third side effect.

2. The first side effect does not help us; {{default_gcp_project_id}} is 
irrelevant.

Only one of the three connections in {{_setup_connections}} has any impact on 
the test.

The call of {{BaseHook.get_connection}} only serves to discard the first 
connection in the mock side-effect list, {{gcp_connection}}.

The second connection is the one that matters, and it is returned when 
{{CloudSqlDatabaseHook}} calls {{self.get_connection}} during init. Since it is 
a mock side effect, it doesn't matter what value is given for conn_id. So the 
{{CloudSqlDatabaseHook}} init param {{default_gcp_project_id}} has no 
consequence. And because it has no consequence, we should not supply a value 
for it, because that is misleading.

We should not have extra code that does not serve a purpose because it makes it 
harder to understand what's actually happening.
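
A minimal sketch of the proposed, more direct mocking (the connection URI and
the test body are illustrative only, not the actual test code):

{code:python}
from unittest import mock
from airflow.models import Connection
from airflow.gcp.hooks.cloud_sql import CloudSqlDatabaseHook

@mock.patch('airflow.gcp.hooks.cloud_sql.CloudSqlDatabaseHook.get_connection')
def test_something(self, get_connection):
    connection = Connection()
    # illustrative URI; any URI accepted by parse_from_uri works here
    connection.parse_from_uri('gcpcloudsql://user:password@1.2.3.4:3306/testdb')
    get_connection.return_value = connection
    hook = CloudSqlDatabaseHook()  # picks up the mocked connection during init
{code}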



  was:
Issues with this test class:

*tests are mocking lower-level than they need to*
Tests were mocking {{airflow.hook.BaseHook.get_connections}}.
Instead they can mock 
{{airflow.gcp.hooks.cloud_sql.CloudSqlDatabaseHook.get_connection}} which is 
more direct.

*should not reference private method*

This is an impediment to refactoring of connections / creds.

*Tests had complexity that did not add a benefit*

They all had this bit:
{code:python}
self._setup_connections(get_connections, uri)
gcp_conn_id = 'google_cloud_default'
hook = CloudSqlDatabaseHook(

default_gcp_project_id=BaseHook.get_connection(gcp_conn_id).extra_dejson.get(
'extra__google_cloud_platform__project')
)
{code}

{{_setup_connections}} was like this:
{code:python}
@staticmethod
def _setup_connections(get_connections, uri):
gcp_connection = mock.MagicMock()
gcp_connection.extra_dejson = mock.MagicMock()
gcp_connection.extra_dejson.get.return_value = 'empty_project'
cloudsql_connection = Connection()
cloudsql_connection.parse_from_uri(uri)
cloudsql_connection2 = Connection()
cloudsql_connection2.parse_from_uri(uri)
get_connections.side_effect = [[gcp_connection], [cloudsql_connection],
   [cloudsql_connection2]]
{code}

Issues here are as follows.

1. no test ever used the third side effect

2. the first side effect does not help us; {{default_gcp_project_id}} is 
irrelevant

Only one of the three connections in {{_setup_connections}} has any impact on 
the test.

The call of {{BaseHook.get_connection}} only serves to discard the first 
connection in mock side effect list, {{gcp_connection}}.  

The second connection is the one that matters, and it is returned when 
{{CloudSqlDatabaseHook}} calls `self.get_connection` during init.  Since it is 
a mock side effect, it doesn't matter what value is given for conn_id.  So the 
{{CloudSqlDatabaseHook}} init param {{default_gcp_project_id}} has no 
consequence.  And because it has no consequence, we should not supply a value  
for it because this is misleading.






> don't call _get_connections_from_db in TestCloudSqlDatabaseHook
> ---
>
> Key: AIRFLOW-5720
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5720
> Project: Apache Airflow
> 

[GitHub] [airflow] codecov-io edited a comment on issue #6281: Run batches of (self-terminating) EMR JobFlows [AIRFLOW-XXX]

2019-10-22 Thread GitBox
codecov-io edited a comment on issue #6281: Run batches of (self-terminating) 
EMR JobFlows [AIRFLOW-XXX]
URL: https://github.com/apache/airflow/pull/6281#issuecomment-539810934
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6281?src=pr&el=h1) Report
   > :exclamation: No coverage uploaded for pull request base 
(`master@bc53412`). [Click here to learn what that 
means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `100%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/airflow/pull/6281/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/airflow/pull/6281?src=pr&el=tree)
   
   ```diff
    @@            Coverage Diff            @@
    ##             master    #6281   +/-   ##
    =========================================
      Coverage          ?   80.33%
    =========================================
      Files             ?      627
      Lines             ?    36316
      Branches          ?        0
    =========================================
      Hits              ?    29173
      Misses            ?     7143
      Partials          ?        0
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/6281?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [airflow/contrib/sensors/emr\_run\_job\_flows.py](https://codecov.io/gh/apache/airflow/pull/6281/diff?src=pr&el=tree#diff-YWlyZmxvdy9jb250cmliL3NlbnNvcnMvZW1yX3J1bl9qb2JfZmxvd3MucHk=) | `100% <100%> (ø)` | |
   
   --
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/6281?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/6281?src=pr&el=footer). Last update [bc53412...82f63ec](https://codecov.io/gh/apache/airflow/pull/6281?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[GitHub] [airflow] BasPH commented on a change in pull request #6317: [AIRFLOW-5644] Simplify TriggerDagRunOperator usage

2019-10-22 Thread GitBox
BasPH commented on a change in pull request #6317: [AIRFLOW-5644] Simplify 
TriggerDagRunOperator usage
URL: https://github.com/apache/airflow/pull/6317#discussion_r337730794
 
 

 ##
 File path: airflow/operators/dagrun_operator.py
 ##
 @@ -18,81 +18,64 @@
 # under the License.
 
 import datetime
-import json
-from typing import Callable, Dict, Optional, Union
+from typing import Dict, Optional, Union
 
 from airflow.api.common.experimental.trigger_dag import trigger_dag
 from airflow.models import BaseOperator
 from airflow.utils import timezone
 from airflow.utils.decorators import apply_defaults
 
 
-class DagRunOrder:
-def __init__(self, run_id=None, payload=None):
-self.run_id = run_id
-self.payload = payload
-
-
 class TriggerDagRunOperator(BaseOperator):
 """
 Triggers a DAG run for a specified ``dag_id``
 
 :param trigger_dag_id: the dag_id to trigger (templated)
 :type trigger_dag_id: str
-:param python_callable: a reference to a python function that will be
-called while passing it the ``context`` object and a placeholder
-object ``obj`` for your callable to fill and return if you want
-a DagRun created. This ``obj`` object contains a ``run_id`` and
-``payload`` attribute that you can modify in your function.
-The ``run_id`` should be a unique identifier for that DAG run, and
-the payload has to be a picklable object that will be made available
-to your tasks while executing that DAG run. Your function header
-should look like ``def foo(context, dag_run_obj):``
-:type python_callable: python callable
+:param conf: Configuration for the DAG run
+:type conf: dict
 :param execution_date: Execution date for the dag (templated)
 :type execution_date: str or datetime.datetime
 """
-template_fields = ('trigger_dag_id', 'execution_date')
-ui_color = '#ffefeb'
+
+template_fields = ("trigger_dag_id", "execution_date", "conf")
+ui_color = "#ffefeb"
 
 @apply_defaults
 def __init__(
-self,
-trigger_dag_id: str,
-python_callable: Optional[Callable[[Dict, DagRunOrder], 
DagRunOrder]] = None,
-execution_date: Optional[Union[str, datetime.datetime]] = None,
-*args, **kwargs) -> None:
+self,
+trigger_dag_id: str,
+conf: Optional[Dict] = None,
+execution_date: Optional[Union[str, datetime.datetime]] = None,
+*args,
+**kwargs
+) -> None:
 super().__init__(*args, **kwargs)
-self.python_callable = python_callable
 self.trigger_dag_id = trigger_dag_id
+self.conf = conf
 
-self.execution_date = None  # type: Optional[Union[str, 
datetime.datetime]]
-if isinstance(execution_date, datetime.datetime):
-self.execution_date = execution_date.isoformat()
-elif isinstance(execution_date, str):
+if execution_date is None or isinstance(execution_date, (str, 
datetime.datetime)):
 self.execution_date = execution_date
-elif execution_date is None:
-self.execution_date = None
 else:
 raise TypeError(
-'Expected str or datetime.datetime type '
-'for execution_date. Got {}'.format(
-type(execution_date)))
+"Expected str or datetime.datetime type for execution_date. "
+"Got {}".format(type(execution_date))
+)
 
-def execute(self, context):
-if self.execution_date is not None:
-run_id = 'trig__{}'.format(self.execution_date)
-self.execution_date = timezone.parse(self.execution_date)
+def execute(self, context: Dict):
+if isinstance(self.execution_date, datetime.datetime):
+run_id = "trig__{}".format(self.execution_date.isoformat())
+elif isinstance(self.execution_date, str):
+run_id = "trig__{}".format(self.execution_date)
+self.execution_date = timezone.parse(self.execution_date)  # 
trigger_dag() expects datetime
 
 Review comment:
   Would like to use it as it is much shorter and simpler.
   
   However, the `execution_date` can be templated. For example `{{ 
execution_date }}` will fail in `timezone.parse()`. So, we have to save it 
first, wait for `execute()` to be called and all variables to be templated, and 
only then can we call `timezone.parse()` on the `execution_date` :( 
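
   A small sketch of the constraint (dag/task ids here are hypothetical): with 
a templated value, `execution_date` is still a Jinja string at construction 
time and only becomes parseable inside `execute()`:
   
       trigger = TriggerDagRunOperator(
           task_id="trigger_downstream",
           trigger_dag_id="downstream_dag",
           execution_date="{{ execution_date }}",  # a Jinja string until templates render
           conf={"triggered_by": "upstream_dag"},
       )
       # After templating, execute() sees e.g. "2019-10-22T00:00:00+00:00"
       # and only then calls timezone.parse() on it.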




[GitHub] [airflow] tooptoop4 commented on issue #2892: [AIRFLOW-XXX] Upgrade support to python 3.5 and disable dask

2019-10-22 Thread GitBox
tooptoop4 commented on issue #2892: [AIRFLOW-XXX] Upgrade support to python 3.5 
and disable dask
URL: https://github.com/apache/airflow/pull/2892#issuecomment-545132250
 
 
   Can tests be reenabled?




[GitHub] [airflow] tooptoop4 commented on issue #2067: [AIRFLOW-862] Add DaskExecutor

2019-10-22 Thread GitBox
tooptoop4 commented on issue #2067: [AIRFLOW-862] Add DaskExecutor
URL: https://github.com/apache/airflow/pull/2067#issuecomment-545130598
 
 
   @jlowin can a `distributed` version above v2 work? setup.py limits it to <2




[GitHub] [airflow] BasPH commented on a change in pull request #6317: [AIRFLOW-5644] Simplify TriggerDagRunOperator usage

2019-10-22 Thread GitBox
BasPH commented on a change in pull request #6317: [AIRFLOW-5644] Simplify 
TriggerDagRunOperator usage
URL: https://github.com/apache/airflow/pull/6317#discussion_r337683917
 
 

 ##
 File path: airflow/operators/dagrun_operator.py
 ##
 @@ -18,81 +18,64 @@
 # under the License.
 
 import datetime
-import json
-from typing import Callable, Dict, Optional, Union
+from typing import Dict, Optional, Union
 
 from airflow.api.common.experimental.trigger_dag import trigger_dag
 from airflow.models import BaseOperator
 from airflow.utils import timezone
 from airflow.utils.decorators import apply_defaults
 
 
-class DagRunOrder:
-def __init__(self, run_id=None, payload=None):
-self.run_id = run_id
-self.payload = payload
-
-
 class TriggerDagRunOperator(BaseOperator):
 """
 Triggers a DAG run for a specified ``dag_id``
 
 :param trigger_dag_id: the dag_id to trigger (templated)
 :type trigger_dag_id: str
-:param python_callable: a reference to a python function that will be
-called while passing it the ``context`` object and a placeholder
-object ``obj`` for your callable to fill and return if you want
-a DagRun created. This ``obj`` object contains a ``run_id`` and
-``payload`` attribute that you can modify in your function.
-The ``run_id`` should be a unique identifier for that DAG run, and
-the payload has to be a picklable object that will be made available
-to your tasks while executing that DAG run. Your function header
-should look like ``def foo(context, dag_run_obj):``
-:type python_callable: python callable
+:param conf: Configuration for the DAG run
+:type conf: dict
 :param execution_date: Execution date for the dag (templated)
 :type execution_date: str or datetime.datetime
 """
-template_fields = ('trigger_dag_id', 'execution_date')
-ui_color = '#ffefeb'
+
+template_fields = ("trigger_dag_id", "execution_date", "conf")
+ui_color = "#ffefeb"
 
 @apply_defaults
 def __init__(
-self,
-trigger_dag_id: str,
-python_callable: Optional[Callable[[Dict, DagRunOrder], 
DagRunOrder]] = None,
-execution_date: Optional[Union[str, datetime.datetime]] = None,
-*args, **kwargs) -> None:
+self,
+trigger_dag_id: str,
+conf: Optional[Dict] = None,
+execution_date: Optional[Union[str, datetime.datetime]] = None,
+*args,
+**kwargs
+) -> None:
 super().__init__(*args, **kwargs)
-self.python_callable = python_callable
 self.trigger_dag_id = trigger_dag_id
+self.conf = conf
 
-self.execution_date = None  # type: Optional[Union[str, 
datetime.datetime]]
-if isinstance(execution_date, datetime.datetime):
-self.execution_date = execution_date.isoformat()
-elif isinstance(execution_date, str):
+if execution_date is None or isinstance(execution_date, (str, 
datetime.datetime)):
 self.execution_date = execution_date
-elif execution_date is None:
-self.execution_date = None
 else:
 raise TypeError(
-'Expected str or datetime.datetime type '
-'for execution_date. Got {}'.format(
-type(execution_date)))
+"Expected str or datetime.datetime type for execution_date. "
+"Got {}".format(type(execution_date))
+)
 
-def execute(self, context):
-if self.execution_date is not None:
-run_id = 'trig__{}'.format(self.execution_date)
-self.execution_date = timezone.parse(self.execution_date)
+def execute(self, context: Dict):
+if isinstance(self.execution_date, datetime.datetime):
+run_id = "trig__{}".format(self.execution_date.isoformat())
+elif isinstance(self.execution_date, str):
+run_id = "trig__{}".format(self.execution_date)
+self.execution_date = timezone.parse(self.execution_date)  # 
trigger_dag() expects datetime
 
 Review comment:
   Much simpler, much better




[jira] [Commented] (AIRFLOW-5720) don't call _get_connections_from_db in TestCloudSqlDatabaseHook

2019-10-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957306#comment-16957306
 ] 

ASF GitHub Bot commented on AIRFLOW-5720:
-

dstandish commented on pull request #6390: [AIRFLOW-5720] don't call private 
method _get_connections_from_db
URL: https://github.com/apache/airflow/pull/6390
 
 
   * tests are mocking lower-level than they need to
   - Tests were mocking airflow.hook.BaseHook.get_connections.
   - Instead they can mock 
airflow.gcp.hooks.cloud_sql.CloudSqlDatabaseHook.get_connection which is more 
direct.
   * should not reference private method
   - This is an impediment to refactoring of connections / creds.
   * Tests had complexity that did not add a benefit
   - only one of the three connections in _setup_connections actually had 
any impact on the test
   - supplying a param for default_gcp_project_id is misleading because it 
has no impact on the test
   
   See Jira ticket 
[AIRFLOW-5720](https://issues.apache.org/jira/browse/AIRFLOW-5720) for more 
info.
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following 
[AIRFLOW-5720](https://issues.apache.org/jira/browse/AIRFLOW-5720) issues and 
references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-5720
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   Issues with this test class:
   
   **tests are mocking lower-level than they need to**
   Tests were mocking airflow.hook.BaseHook.get_connections.
   Instead they can mock 
airflow.gcp.hooks.cloud_sql.CloudSqlDatabaseHook.get_connection which is more 
direct.
   
   **should not reference private method**
   
   This is an impediment to refactoring of connections / creds.
   
   **Tests had complexity that did not add a benefit**
   
   They all had this bit:
   
   self._setup_connections(get_connections, uri)
   gcp_conn_id = 'google_cloud_default'
   hook = CloudSqlDatabaseHook(
   
default_gcp_project_id=BaseHook.get_connection(gcp_conn_id).extra_dejson.get(
   'extra__google_cloud_platform__project')
   )
   _setup_connections was like this:
   
   @staticmethod
   def _setup_connections(get_connections, uri):
   gcp_connection = mock.MagicMock()
   gcp_connection.extra_dejson = mock.MagicMock()
   gcp_connection.extra_dejson.get.return_value = 'empty_project'
   cloudsql_connection = Connection()
   cloudsql_connection.parse_from_uri(uri)
   cloudsql_connection2 = Connection()
   cloudsql_connection2.parse_from_uri(uri)
   get_connections.side_effect = [[gcp_connection], 
[cloudsql_connection],
  [cloudsql_connection2]]
   Issues here are as follows.
   
   1. no test ever used the third side effect
   
   2. the first side effect does not help us; `default_gcp_project_id` is 
irrelevant
   
   Only one of the three connections in `_setup_connections` has any impact on 
the test.
   
   The call of `BaseHook.get_connection` only serves to discard the first 
connection in mock side effect list, `gcp_connection`.
   
   The second connection is the one that matters, and it is returned when 
`CloudSqlDatabaseHook` calls `self.get_connection` during init. Since it is a 
mock side effect, it doesn't matter what value is given for conn_id. So the 
`CloudSqlDatabaseHook` init param `default_gcp_project_id` has no consequence. 
And because it has no consequence, we should not supply a value for it because 
this is misleading.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
 

[GitHub] [airflow] dstandish opened a new pull request #6390: [AIRFLOW-5720] don't call private method _get_connections_from_db

2019-10-22 Thread GitBox
dstandish opened a new pull request #6390: [AIRFLOW-5720] don't call private 
method _get_connections_from_db
URL: https://github.com/apache/airflow/pull/6390
 
 
   * tests are mocking lower-level than they need to
   - Tests were mocking airflow.hook.BaseHook.get_connections.
   - Instead they can mock 
airflow.gcp.hooks.cloud_sql.CloudSqlDatabaseHook.get_connection which is more 
direct.
   * should not reference private method
   - This is an impediment to refactoring of connections / creds.
   * Tests had complexity that did not add a benefit
   - only one of the three connections in _setup_connections actually had 
any impact on the test
   - supplying a param for default_gcp_project_id is misleading because it 
has no impact on the test
   
   See Jira ticket 
[AIRFLOW-5720](https://issues.apache.org/jira/browse/AIRFLOW-5720) for more 
info.
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following 
[AIRFLOW-5720](https://issues.apache.org/jira/browse/AIRFLOW-5720) issues and 
references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-5720
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   Issues with this test class:
   
   **tests are mocking lower-level than they need to**
   Tests were mocking airflow.hook.BaseHook.get_connections.
   Instead they can mock 
airflow.gcp.hooks.cloud_sql.CloudSqlDatabaseHook.get_connection which is more 
direct.
   
   **should not reference private method**
   
   This is an impediment to refactoring of connections / creds.
   
   **Tests had complexity that did not add a benefit**
   
   They all had this bit:
   
   self._setup_connections(get_connections, uri)
   gcp_conn_id = 'google_cloud_default'
   hook = CloudSqlDatabaseHook(
   
default_gcp_project_id=BaseHook.get_connection(gcp_conn_id).extra_dejson.get(
   'extra__google_cloud_platform__project')
   )
   _setup_connections was like this:
   
   @staticmethod
   def _setup_connections(get_connections, uri):
   gcp_connection = mock.MagicMock()
   gcp_connection.extra_dejson = mock.MagicMock()
   gcp_connection.extra_dejson.get.return_value = 'empty_project'
   cloudsql_connection = Connection()
   cloudsql_connection.parse_from_uri(uri)
   cloudsql_connection2 = Connection()
   cloudsql_connection2.parse_from_uri(uri)
   get_connections.side_effect = [[gcp_connection], 
[cloudsql_connection],
  [cloudsql_connection2]]
   Issues here are as follows.
   
   1. no test ever used the third side effect
   
   2. the first side effect does not help us; `default_gcp_project_id` is 
irrelevant
   
   Only one of the three connections in `_setup_connections` has any impact on 
the test.
   
   The call of `BaseHook.get_connection` only serves to discard the first 
connection in mock side effect list, `gcp_connection`.
   
   The second connection is the one that matters, and it is returned when 
`CloudSqlDatabaseHook` calls `self.get_connection` during init. Since it is a 
mock side effect, it doesn't matter what value is given for conn_id. So the 
`CloudSqlDatabaseHook` init param `default_gcp_project_id` has no consequence. 
And because it has no consequence, we should not supply a value for it because 
this is misleading.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about 

[jira] [Updated] (AIRFLOW-5720) don't call _get_connections_from_db in TestCloudSqlDatabaseHook

2019-10-22 Thread Daniel Standish (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Standish updated AIRFLOW-5720:
-
Description: 
Issues with this test class:

*tests are mocking lower-level than they need to*
Tests were mocking {{airflow.hook.BaseHook.get_connections}}.
Instead they can mock 
{{airflow.gcp.hooks.cloud_sql.CloudSqlDatabaseHook.get_connection}} which is 
more direct.

*should not reference private method*

This is an impediment to refactoring of connections / creds.

*Tests had complexity that did not add a benefit*

They all had this bit:
{code:python}
self._setup_connections(get_connections, uri)
gcp_conn_id = 'google_cloud_default'
hook = CloudSqlDatabaseHook(
    default_gcp_project_id=BaseHook.get_connection(gcp_conn_id).extra_dejson.get(
        'extra__google_cloud_platform__project')
)
{code}

{{_setup_connections}} was like this:
{code:python}
@staticmethod
def _setup_connections(get_connections, uri):
    gcp_connection = mock.MagicMock()
    gcp_connection.extra_dejson = mock.MagicMock()
    gcp_connection.extra_dejson.get.return_value = 'empty_project'
    cloudsql_connection = Connection()
    cloudsql_connection.parse_from_uri(uri)
    cloudsql_connection2 = Connection()
    cloudsql_connection2.parse_from_uri(uri)
    get_connections.side_effect = [[gcp_connection], [cloudsql_connection],
                                   [cloudsql_connection2]]
{code}

Issues here are as follows.

1. no test ever used the third side effect

2. the first side effect does not help us; {{default_gcp_project_id}} is 
irrelevant

Only one of the three connections in {{_setup_connections}} has any impact on 
the test.

The call to {{BaseHook.get_connection}} only serves to discard the first 
connection in the mock side-effect list, {{gcp_connection}}.

The second connection is the one that matters: it is returned when 
{{CloudSqlDatabaseHook}} calls {{self.get_connection}} during init. Since it is 
a mock side effect, it doesn't matter what value is given for {{conn_id}}, so 
the {{CloudSqlDatabaseHook}} init param {{default_gcp_project_id}} has no 
effect. Because it has no effect, we should not supply a value for it; doing 
so is misleading.





  was:
Issues with this test class:

*tests are mocking lower-level than they need to*
Tests were mocking {{airflow.hook.BaseHook.get_connections}}.
Instead they can mock 
{{airflow.gcp.hooks.cloud_sql.CloudSqlDatabaseHook.get_connection}} which is 
more direct.

*should not reference private method*

This is an impediment to refactoring of connections / creds.

*Tests had complexity that did not add a benefit*

They all had this bit:
{code:python}
self._setup_connections(get_connections, uri)
gcp_conn_id = 'google_cloud_default'
hook = CloudSqlDatabaseHook(
    default_gcp_project_id=BaseHook.get_connection(gcp_conn_id).extra_dejson.get(
        'extra__google_cloud_platform__project')
)
{code}

{{_setup_connections}} was like this:
{code:python}
@staticmethod
def _setup_connections(get_connections, uri):
    gcp_connection = mock.MagicMock()
    gcp_connection.extra_dejson = mock.MagicMock()
    gcp_connection.extra_dejson.get.return_value = 'empty_project'
    cloudsql_connection = Connection()
    cloudsql_connection.parse_from_uri(uri)
    cloudsql_connection2 = Connection()
    cloudsql_connection2.parse_from_uri(uri)
    get_connections.side_effect = [[gcp_connection], [cloudsql_connection],
                                   [cloudsql_connection2]]
{code}

Issues here are as follows.

1. no test ever used the third side effect

2. the first side effect does not help us; {{default_gcp_project_id}} is 
irrelevant.

All this line accomplishes is discarding the first connection, 
{{gcp_connection}}. This is the first invocation of 
{{BaseHook.get_connection}}.

The second invocation is the one that matters, namely when 
{{CloudSqlDatabaseHook}} calls {{self.get_connection}}.

This is when {{cloudsql_connection}} is returned. But since it is a mock side 
effect, it doesn't matter what value you give for {{conn_id}}. So the param 
{{default_gcp_project_id}} has no effect, and because it has no effect, we 
should not supply a value for it; doing so is misleading.






> don't call _get_connections_from_db in TestCloudSqlDatabaseHook
> ---
>
> Key: AIRFLOW-5720
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5720
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: gcp
>Affects Versions: 1.10.5
>Reporter: Daniel Standish
>Assignee: Daniel Standish
>Priority: 

[GitHub] [airflow-site] mik-laj commented on a change in pull request #89: Feature/blog tags

2019-10-22 Thread GitBox
mik-laj commented on a change in pull request #89: Feature/blog tags
URL: https://github.com/apache/airflow-site/pull/89#discussion_r337716708
 
 

 ##
 File path: landing-pages/site/assets/scss/main-custom.scss
 ##
 @@ -18,7 +18,7 @@
  */
 
 @import url('https://fonts.googleapis.com/css?family=Rubik:500=swap');
-@import 
url('https://fonts.googleapis.com/css?family=Roboto:400,500,700=swap');
+@import 
url('https://fonts.googleapis.com/css?family=Roboto:400,400i,500,700=swap');
 
 Review comment:
   ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] Fokko commented on issue #6370: AIRFLOW-5701: Don't clear xcom explicitly before execution

2019-10-22 Thread GitBox
Fokko commented on issue #6370: AIRFLOW-5701: Don't clear xcom explicitly 
before execution
URL: https://github.com/apache/airflow/pull/6370#issuecomment-545123697
 
 
   Actually, the current database layout is backward compatible. We don't use 
the `id` field, and I don't see much value in writing a complicated migration 
script (since SQLite doesn't support removing columns, and removing the PKs 
can also be tricky in Alembic). Ready for review :-)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[airflow-site] branch aip-11 updated: Add blog page layout (#87)

2019-10-22 Thread kamilbregula
This is an automated email from the ASF dual-hosted git repository.

kamilbregula pushed a commit to branch aip-11
in repository https://gitbox.apache.org/repos/asf/airflow-site.git


The following commit(s) were added to refs/heads/aip-11 by this push:
 new 11ee2c4  Add blog page layout (#87)
11ee2c4 is described below

commit 11ee2c435d79de5c0726abd1b5ccbd75943c3c95
Author: Kamil Breguła 
AuthorDate: Tue Oct 22 21:44:07 2019 +0200

Add blog page layout (#87)

Co-Authored-By: Kamil Gabryjelski 
---
 .../assets/scss/{main-custom.scss => _blog.scss}   |  29 +--
 landing-pages/site/assets/scss/_list-boxes.scss|  33 ++-
 landing-pages/site/assets/scss/main-custom.scss|   1 +
 landing-pages/site/config.toml |   6 +-
 .../Its-a-breeze-to-develop-apache-airflow.html|   8 +
 .../Its-a-breeze-to-develop-apache-airflow2.html   |   8 +
 .../Its-a-breeze-to-develop-apache-airflow3.html   |   8 +
 .../Its-a-breeze-to-develop-apache-airflow4.html   |   8 +
 .../Its-a-breeze-to-develop-apache-airflow5.html   |   8 +
 landing-pages/site/content/en/blog/_index.md   |   7 +-
 landing-pages/site/content/en/blog/news/_index.md  |   6 -
 .../blog/news/first-post/featured-sunset-get.png   | Bin 387442 -> 0 bytes
 .../site/content/en/blog/news/first-post/index.md  |  44 
 .../site/content/en/blog/news/second-post.md   | 245 -
 .../site/content/en/blog/releases/_index.md|   6 -
 .../releases/in-depth-monoliths-detailed-spec.md   | 245 -
 landing-pages/site/layouts/blog/baseof.html|  46 
 landing-pages/site/layouts/blog/list.html  |  31 +++
 .../site/layouts/partials/boxes/blogpost.html  |  33 +++
 landing-pages/site/layouts/taxonomy/baseof.html|  46 
 landing-pages/site/layouts/taxonomy/tag.html   |  31 +++
 landing-pages/src/js/showAllCommiters.js   |   2 +-
 22 files changed, 274 insertions(+), 577 deletions(-)

diff --git a/landing-pages/site/assets/scss/main-custom.scss 
b/landing-pages/site/assets/scss/_blog.scss
similarity index 60%
copy from landing-pages/site/assets/scss/main-custom.scss
copy to landing-pages/site/assets/scss/_blog.scss
index 0813810..696d145 100644
--- a/landing-pages/site/assets/scss/main-custom.scss
+++ b/landing-pages/site/assets/scss/_blog.scss
@@ -17,21 +17,14 @@
  * under the License.
  */
 
-@import url('https://fonts.googleapis.com/css?family=Rubik:500=swap');
-@import 
url('https://fonts.googleapis.com/css?family=Roboto:400,500,700=swap');
-@import 
url('https://fonts.googleapis.com/css?family=Roboto+Mono=swap');
-
-@import "typography";
-@import "accordion";
-@import "buttons";
-@import "ol-ul";
-@import "list-boxes";
-@import "avatar";
-@import "quote";
-@import "pager";
-@import "case-study";
-@import "paragraph";
-@import "base-layout";
-@import "feature";
-@import "text-with-icon";
-@import "video";
+.tag {
+  // TODO(kgabryje): when using @extend .bodytext__medium--cerulean-blue, scss 
builds stylesheet but throws errors
+  font-family: "Roboto", sans-serif;
+  font-weight: 400;
+  font-size: 16px;
+  line-height: 1.63;
+  color: #017cee;
+  background-color: #d9ebfc; // cerulean-blue + opacity 0.15
+  padding: 1px 30px;
+  border-radius: 5px;
+}
diff --git a/landing-pages/site/assets/scss/_list-boxes.scss 
b/landing-pages/site/assets/scss/_list-boxes.scss
index 51b49ed..69cd658 100644
--- a/landing-pages/site/assets/scss/_list-boxes.scss
+++ b/landing-pages/site/assets/scss/_list-boxes.scss
@@ -16,7 +16,6 @@
  * specific language governing permissions and limitations
  * under the License.
  */
-
 @import "colors";
 
 $box-margin: 20px;
@@ -33,6 +32,11 @@ $box-margin: 20px;
   margin: $box-margin;
   padding: 30px 10px;
   width: 270px;
+
+  &--wide {
+max-width: 580px;
+width: 100%;
+  }
 }
 
 .box-event {
@@ -55,6 +59,32 @@ $box-margin: 20px;
 }
   }
 
+  &__blogpost {
+padding: 0 20px;
+
+&--metadata {
+  display: flex;
+  flex-wrap: wrap;
+  justify-content: space-between;
+  margin-bottom: 20px;
+}
+
+&--header {
+  @extend .subtitle__large--greyish-brown;
+  margin-bottom: 4px;
+}
+
+&--author {
+  @extend .bodytext__medium--cerulean-blue;
+  font-weight: 500;
+}
+
+&--description {
+  @extend .bodytext__medium--brownish-grey;
+  margin-bottom: 20px;
+}
+  }
+
   &__case-study {
 padding: 18px 18px 0;
 justify-content: space-between;
@@ -109,6 +139,7 @@ $box-margin: 20px;
 &--members {
   @extend .bodytext__medium--brownish-grey;
   margin-bottom: 30px;
+
   span {
 vertical-align: middle;
   }
diff --git a/landing-pages/site/assets/scss/main-custom.scss 
b/landing-pages/site/assets/scss/main-custom.scss
index 0813810..f39eef6 100644
--- a/landing-pages/site/assets/scss/main-custom.scss
+++ b/landing-pages/site/assets/scss/main-custom.scss
@@ -35,3 +35,4 @@
 @import "feature";
 @import "text-with-icon";
 

[GitHub] [airflow-site] mik-laj merged pull request #87: [depends on #85, #86, #87] Add blogpost list layout

2019-10-22 Thread GitBox
mik-laj merged pull request #87: [depends on #85, #86, #87] Add blogpost list 
layout
URL: https://github.com/apache/airflow-site/pull/87
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-5720) don't call _get_connections_from_db in TestCloudSqlDatabaseHook

2019-10-22 Thread Daniel Standish (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Standish updated AIRFLOW-5720:
-
Description: 
Issues with this test class:

*tests are mocking lower-level than they need to*
Tests were mocking {{airflow.hook.BaseHook.get_connections}}.
Instead they can mock 
{{airflow.gcp.hooks.cloud_sql.CloudSqlDatabaseHook.get_connection}} which is 
more direct.

*should not reference private method*

This is an impediment to refactoring of connections / creds.

*Tests had complexity that did not add a benefit*

They all had this bit:
{code:python}
self._setup_connections(get_connections, uri)
gcp_conn_id = 'google_cloud_default'
hook = CloudSqlDatabaseHook(
    default_gcp_project_id=BaseHook.get_connection(gcp_conn_id).extra_dejson.get(
        'extra__google_cloud_platform__project')
)
{code}

{{_setup_connections}} was like this:
{code:python}
@staticmethod
def _setup_connections(get_connections, uri):
    gcp_connection = mock.MagicMock()
    gcp_connection.extra_dejson = mock.MagicMock()
    gcp_connection.extra_dejson.get.return_value = 'empty_project'
    cloudsql_connection = Connection()
    cloudsql_connection.parse_from_uri(uri)
    cloudsql_connection2 = Connection()
    cloudsql_connection2.parse_from_uri(uri)
    get_connections.side_effect = [[gcp_connection], [cloudsql_connection],
                                   [cloudsql_connection2]]
{code}

Issues here are as follows.

1. no test ever used the third side effect

2. the first side effect does not help us; {{default_gcp_project_id}} is 
irrelevant.

All this line accomplishes is discarding the first connection, 
{{gcp_connection}}. This is the first invocation of 
{{BaseHook.get_connection}}.

The second invocation is the one that matters, namely when 
{{CloudSqlDatabaseHook}} calls {{self.get_connection}}.

This is when {{cloudsql_connection}} is returned. But since it is a mock side 
effect, it doesn't matter what value you give for {{conn_id}}. So the param 
{{default_gcp_project_id}} has no effect, and because it has no effect, we 
should not supply a value for it; doing so is misleading.





  was:
Issues with this test class:

*tests are mocking lower-level than they need to*
Tests were mocking {{airflow.hook.BaseHook.get_connections}}.
Instead they can mock 
{{airflow.gcp.hooks.cloud_sql.CloudSqlDatabaseHook.get_connection}} which is 
more direct.

*should not reference private method*

This is an impediment to refactoring of connections / creds.

*Tests had complexity that did not add a benefit*

They all had this bit:
{code:python}
self._setup_connections(get_connections, uri)
gcp_conn_id = 'google_cloud_default'
hook = CloudSqlDatabaseHook(
    default_gcp_project_id=BaseHook.get_connection(gcp_conn_id).extra_dejson.get(
        'extra__google_cloud_platform__project')
)
{code}

{{_setup_connections}} was like this:
{code:python}
@staticmethod
def _setup_connections(get_connections, uri):
    gcp_connection = mock.MagicMock()
    gcp_connection.extra_dejson = mock.MagicMock()
    gcp_connection.extra_dejson.get.return_value = 'empty_project'
    cloudsql_connection = Connection()
    cloudsql_connection.parse_from_uri(uri)
    cloudsql_connection2 = Connection()
    cloudsql_connection2.parse_from_uri(uri)
    get_connections.side_effect = [[gcp_connection], [cloudsql_connection],
                                   [cloudsql_connection2]]
{code}

Issues here are as follows.

1. no test ever used the third side effect

2. this line: 
{{default_gcp_project_id=BaseHook.get_connection(gcp_conn_id).extra_dejson.get(}}
 

All this line accomplishes is discarding the first connection, 
{{gcp_connection}}. This is the first invocation of 
{{BaseHook.get_connection}}.

The second invocation is the one that matters, namely when 
{{CloudSqlDatabaseHook}} calls {{self.get_connection}}.

This is when {{cloudsql_connection}} is returned. But since it is a mock side 
effect, it doesn't matter what value you give for {{conn_id}}. So the param 
{{default_gcp_project_id}} has no effect, and because it has no effect, we 
should not supply a value for it; doing so is misleading.






> don't call _get_connections_from_db in TestCloudSqlDatabaseHook
> ---
>
> Key: AIRFLOW-5720
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5720
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: gcp
>Affects Versions: 1.10.5
>Reporter: Daniel Standish
>Assignee: Daniel Standish
>Priority: Major
>
> Issues with this test class:
> *tests are 

[jira] [Updated] (AIRFLOW-5720) don't call _get_connections_from_db in TestCloudSqlDatabaseHook

2019-10-22 Thread Daniel Standish (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Standish updated AIRFLOW-5720:
-
Description: 
Issues with this test class:

*tests are mocking lower-level than they need to*
Tests were mocking {{airflow.hook.BaseHook.get_connections}}.
Instead they can mock 
{{airflow.gcp.hooks.cloud_sql.CloudSqlDatabaseHook.get_connection}} which is 
more direct.

*should not reference private method*

This is an impediment to refactoring of connections / creds.

*Tests had complexity that did not add a benefit*

They all had this bit:
{code:python}
self._setup_connections(get_connections, uri)
gcp_conn_id = 'google_cloud_default'
hook = CloudSqlDatabaseHook(
    default_gcp_project_id=BaseHook.get_connection(gcp_conn_id).extra_dejson.get(
        'extra__google_cloud_platform__project')
)
{code}

{{_setup_connections}} was like this:
{code:python}
@staticmethod
def _setup_connections(get_connections, uri):
    gcp_connection = mock.MagicMock()
    gcp_connection.extra_dejson = mock.MagicMock()
    gcp_connection.extra_dejson.get.return_value = 'empty_project'
    cloudsql_connection = Connection()
    cloudsql_connection.parse_from_uri(uri)
    cloudsql_connection2 = Connection()
    cloudsql_connection2.parse_from_uri(uri)
    get_connections.side_effect = [[gcp_connection], [cloudsql_connection],
                                   [cloudsql_connection2]]
{code}

Issues here are as follows.

1. no test ever used the third side effect

2. this line: 
{{default_gcp_project_id=BaseHook.get_connection(gcp_conn_id).extra_dejson.get(}}
 

All this line accomplishes is discarding the first connection, 
{{gcp_connection}}. This is the first invocation of 
{{BaseHook.get_connection}}.

The second invocation is the one that matters, namely when 
{{CloudSqlDatabaseHook}} calls {{self.get_connection}}.

This is when {{cloudsql_connection}} is returned. But since it is a mock side 
effect, it doesn't matter what value you give for {{conn_id}}. So the param 
{{default_gcp_project_id}} has no effect, and because it has no effect, we 
should not supply a value for it; doing so is misleading.





  was:
Test should not reference "private" methods.  This impedes refactoring.

Also some other issues with these tests.

More detail TBD






> don't call _get_connections_from_db in TestCloudSqlDatabaseHook
> ---
>
> Key: AIRFLOW-5720
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5720
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: gcp
>Affects Versions: 1.10.5
>Reporter: Daniel Standish
>Assignee: Daniel Standish
>Priority: Major
>
> Issues with this test class:
> *tests are mocking lower-level than they need to*
> Tests were mocking {{airflow.hook.BaseHook.get_connections}}.
> Instead they can mock 
> {{airflow.gcp.hooks.cloud_sql.CloudSqlDatabaseHook.get_connection}} which is 
> more direct.
> *should not reference private method*
> This is an impediment to refactoring of connections / creds.
> *Tests had complexity that did not add a benefit*
> They all had this bit:
> {code:python}
> self._setup_connections(get_connections, uri)
> gcp_conn_id = 'google_cloud_default'
> hook = CloudSqlDatabaseHook(
>     default_gcp_project_id=BaseHook.get_connection(gcp_conn_id).extra_dejson.get(
>         'extra__google_cloud_platform__project')
> )
> {code}
> {{_setup_connections}} was like this:
> {code:python}
> @staticmethod
> def _setup_connections(get_connections, uri):
>     gcp_connection = mock.MagicMock()
>     gcp_connection.extra_dejson = mock.MagicMock()
>     gcp_connection.extra_dejson.get.return_value = 'empty_project'
>     cloudsql_connection = Connection()
>     cloudsql_connection.parse_from_uri(uri)
>     cloudsql_connection2 = Connection()
>     cloudsql_connection2.parse_from_uri(uri)
>     get_connections.side_effect = [[gcp_connection], [cloudsql_connection],
>                                    [cloudsql_connection2]]
> {code}
> Issues here are as follows.
> 1. no test ever used the third side effect
> 2. this line: 
> {{default_gcp_project_id=BaseHook.get_connection(gcp_conn_id).extra_dejson.get(}}
>  
> All this line accomplishes is discarding the first connection, 
> {{gcp_connection}}. This is the first invocation of 
> {{BaseHook.get_connection}}.
> The second invocation is the one that matters, namely when 
> {{CloudSqlDatabaseHook}} calls {{self.get_connection}}.
> This is when {{cloudsql_connection}} is returned.  But since it is a mock 
> side effect, it doesn't matter what value you 

[jira] [Commented] (AIRFLOW-5715) Make email, owner context available

2019-10-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957293#comment-16957293
 ] 

ASF GitHub Bot commented on AIRFLOW-5715:
-

feng-tao commented on pull request #6385: [AIRFLOW-5715] Make email, owner 
context available
URL: https://github.com/apache/airflow/pull/6385
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Make email, owner context available
> ---
>
> Key: AIRFLOW-5715
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5715
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.10.5
>Reporter: Tao Feng
>Assignee: Tao Feng
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (AIRFLOW-5715) Make email, owner context available

2019-10-22 Thread Tao Feng (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Feng resolved AIRFLOW-5715.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Make email, owner context available
> ---
>
> Key: AIRFLOW-5715
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5715
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.10.5
>Reporter: Tao Feng
>Assignee: Tao Feng
>Priority: Minor
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-5715) Make email, owner context available

2019-10-22 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957294#comment-16957294
 ] 

ASF subversion and git services commented on AIRFLOW-5715:
--

Commit 641b8aaf04bcf68311cd490481360ea93a3d360d in airflow's branch 
refs/heads/master from Tao Feng
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=641b8aa ]

[AIRFLOW-5715] Make email, owner context available (#6385)



> Make email, owner context available
> ---
>
> Key: AIRFLOW-5715
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5715
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.10.5
>Reporter: Tao Feng
>Assignee: Tao Feng
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] feng-tao merged pull request #6385: [AIRFLOW-5715] Make email, owner context available

2019-10-22 Thread GitBox
feng-tao merged pull request #6385: [AIRFLOW-5715] Make email, owner context 
available
URL: https://github.com/apache/airflow/pull/6385
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-5720) don't call _get_connections_from_db in TestCloudSqlDatabaseHook

2019-10-22 Thread Daniel Standish (Jira)
Daniel Standish created AIRFLOW-5720:


 Summary: don't call _get_connections_from_db in 
TestCloudSqlDatabaseHook
 Key: AIRFLOW-5720
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5720
 Project: Apache Airflow
  Issue Type: New Feature
  Components: gcp
Affects Versions: 1.10.5
Reporter: Daniel Standish
Assignee: Daniel Standish


Test should not reference "private" methods.  This impedes refactoring.

Also some other issues with these tests.

More detail TBD







--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow-site] mik-laj merged pull request #86: Add features components

2019-10-22 Thread GitBox
mik-laj merged pull request #86: Add features components
URL: https://github.com/apache/airflow-site/pull/86
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Resolved] (AIRFLOW-5714) Collect SLA miss emails only from tasks missed SLA

2019-10-22 Thread Chao-Han Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao-Han Tsai resolved AIRFLOW-5714.

Fix Version/s: 1.10.7
   Resolution: Fixed

> Collect SLA miss emails only from tasks missed SLA
> --
>
> Key: AIRFLOW-5714
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5714
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 1.10.7
>Reporter: Chao-Han Tsai
>Assignee: Chao-Han Tsai
>Priority: Major
> Fix For: 1.10.7
>
>
> Currently, when a task in the DAG missed the SLA, Airflow would traverse 
> through all the tasks in the DAG and collect all the task-level emails. Then 
> Airflow would send an SLA miss email to all those collected addresses, which 
> can add unnecessary noise to task owners whose tasks did not contribute to 
> the SLA miss.
> Thus, the code is changed to collect emails only from the tasks that missed 
> the SLA.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-5714) Collect SLA miss emails only from tasks missed SLA

2019-10-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957268#comment-16957268
 ] 

ASF GitHub Bot commented on AIRFLOW-5714:
-

milton0825 commented on pull request #6384: [AIRFLOW-5714] Collect SLA miss 
emails only from tasks missed SLA
URL: https://github.com/apache/airflow/pull/6384
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Collect SLA miss emails only from tasks missed SLA
> --
>
> Key: AIRFLOW-5714
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5714
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 1.10.7
>Reporter: Chao-Han Tsai
>Assignee: Chao-Han Tsai
>Priority: Major
>
> Currently, when a task in the DAG missed the SLA, Airflow would traverse 
> through all the tasks in the DAG and collect all the task-level emails. Then 
> Airflow would send an SLA miss email to all those collected addresses, which 
> can add unnecessary noise to task owners whose tasks did not contribute to 
> the SLA miss.
> Thus, the code is changed to collect emails only from the tasks that missed 
> the SLA.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-5714) Collect SLA miss emails only from tasks missed SLA

2019-10-22 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957269#comment-16957269
 ] 

ASF subversion and git services commented on AIRFLOW-5714:
--

Commit bc5341223429efe777af7b492ad09b96c7c77b17 in airflow's branch 
refs/heads/master from Chao-Han Tsai
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=bc53412 ]

[AIRFLOW-5714] Collect SLA miss emails only from tasks missed SLA (#6384)

Currently, when a task in the DAG missed the SLA,
Airflow would traverse through all the tasks in the DAG
and collect all the task-level emails. Then Airflow would
send an SLA miss email to all those collected addresses,
which can add unnecessary noise to task owners whose
tasks did not contribute to the SLA miss.

Thus, changing the code to collect emails only
from the tasks that missed the SLA.
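
For illustration, a hedged sketch of the filtering this change describes (the 
names are illustrative, not the exact scheduler code):

{code:python}
# Collect notification emails only from the tasks that actually missed an SLA.
tasks_missed_sla = [dag.get_task(sla.task_id) for sla in slas]
emails = set()
for task in tasks_missed_sla:
    if task.email:
        # task.email may be a single address or a list of addresses.
        if isinstance(task.email, str):
            emails.add(task.email)
        else:
            emails.update(task.email)
{code}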


> Collect SLA miss emails only from tasks missed SLA
> --
>
> Key: AIRFLOW-5714
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5714
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 1.10.7
>Reporter: Chao-Han Tsai
>Assignee: Chao-Han Tsai
>Priority: Major
>
> Currently, when a task in the DAG missed the SLA, Airflow would traverse 
> through all the tasks in the DAG and collect all the task-level emails. Then 
> Airflow would send an SLA miss email to all those collected addresses, which 
> can add unnecessary noise to task owners whose tasks did not contribute to 
> the SLA miss.
> Thus, the code is changed to collect emails only from the tasks that missed 
> the SLA.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] milton0825 merged pull request #6384: [AIRFLOW-5714] Collect SLA miss emails only from tasks missed SLA

2019-10-22 Thread GitBox
milton0825 merged pull request #6384: [AIRFLOW-5714] Collect SLA miss emails 
only from tasks missed SLA
URL: https://github.com/apache/airflow/pull/6384
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io commented on issue #6385: [AIRFLOW-5715] Make email, owner context available

2019-10-22 Thread GitBox
codecov-io commented on issue #6385: [AIRFLOW-5715] Make email, owner context 
available
URL: https://github.com/apache/airflow/pull/6385#issuecomment-545096426
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6385?src=pr=h1) 
Report
   > Merging 
[#6385](https://codecov.io/gh/apache/airflow/pull/6385?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/3cfe4a1c9dc49c91839e9c278b97f9c18033fdf4?src=pr=desc)
 will **decrease** coverage by `0.28%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6385/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/6385?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #6385      +/-   ##
   ==========================================
   - Coverage   80.57%   80.29%    -0.29%    
   ==========================================
     Files         626      626              
     Lines       36237    36248      +11     
   ==========================================
   - Hits        29198    29105      -93     
   - Misses       7039     7143     +104     
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6385?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/utils/operator\_helpers.py](https://codecov.io/gh/apache/airflow/pull/6385/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9vcGVyYXRvcl9oZWxwZXJzLnB5)
 | `100% <100%> (ø)` | :arrow_up: |
   | 
[airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/6385/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==)
 | `44.44% <0%> (-55.56%)` | :arrow_down: |
   | 
[airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/6385/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==)
 | `52.94% <0%> (-47.06%)` | :arrow_down: |
   | 
[airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/6385/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==)
 | `45.25% <0%> (-46.72%)` | :arrow_down: |
   | 
[airflow/kubernetes/kube\_client.py](https://codecov.io/gh/apache/airflow/pull/6385/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL2t1YmVfY2xpZW50LnB5)
 | `33.33% <0%> (-41.67%)` | :arrow_down: |
   | 
[...rflow/contrib/operators/kubernetes\_pod\_operator.py](https://codecov.io/gh/apache/airflow/pull/6385/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZF9vcGVyYXRvci5weQ==)
 | `70.14% <0%> (-28.36%)` | :arrow_down: |
   | 
[airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/airflow/pull/6385/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==)
 | `58.56% <0%> (-0.34%)` | :arrow_down: |
   | 
[airflow/jobs/backfill\_job.py](https://codecov.io/gh/apache/airflow/pull/6385/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzL2JhY2tmaWxsX2pvYi5weQ==)
 | `91.43% <0%> (+1.52%)` | :arrow_up: |
   | 
[airflow/jobs/local\_task\_job.py](https://codecov.io/gh/apache/airflow/pull/6385/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzL2xvY2FsX3Rhc2tfam9iLnB5)
 | `90% <0%> (+5%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6385?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6385?src=pr=footer). 
Last update 
[3cfe4a1...ae58749](https://codecov.io/gh/apache/airflow/pull/6385?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] BasPH commented on a change in pull request #6317: [AIRFLOW-5644] Simplify TriggerDagRunOperator usage

2019-10-22 Thread GitBox
BasPH commented on a change in pull request #6317: [AIRFLOW-5644] Simplify 
TriggerDagRunOperator usage
URL: https://github.com/apache/airflow/pull/6317#discussion_r337683917
 
 

 ##
 File path: airflow/operators/dagrun_operator.py
 ##
 @@ -18,81 +18,64 @@
 # under the License.
 
 import datetime
-import json
-from typing import Callable, Dict, Optional, Union
+from typing import Dict, Optional, Union
 
 from airflow.api.common.experimental.trigger_dag import trigger_dag
 from airflow.models import BaseOperator
 from airflow.utils import timezone
 from airflow.utils.decorators import apply_defaults
 
 
-class DagRunOrder:
-def __init__(self, run_id=None, payload=None):
-self.run_id = run_id
-self.payload = payload
-
-
 class TriggerDagRunOperator(BaseOperator):
 """
 Triggers a DAG run for a specified ``dag_id``
 
 :param trigger_dag_id: the dag_id to trigger (templated)
 :type trigger_dag_id: str
-:param python_callable: a reference to a python function that will be
-called while passing it the ``context`` object and a placeholder
-object ``obj`` for your callable to fill and return if you want
-a DagRun created. This ``obj`` object contains a ``run_id`` and
-``payload`` attribute that you can modify in your function.
-The ``run_id`` should be a unique identifier for that DAG run, and
-the payload has to be a picklable object that will be made available
-to your tasks while executing that DAG run. Your function header
-should look like ``def foo(context, dag_run_obj):``
-:type python_callable: python callable
+:param conf: Configuration for the DAG run
+:type conf: dict
 :param execution_date: Execution date for the dag (templated)
 :type execution_date: str or datetime.datetime
 """
-template_fields = ('trigger_dag_id', 'execution_date')
-ui_color = '#ffefeb'
+
+template_fields = ("trigger_dag_id", "execution_date", "conf")
+ui_color = "#ffefeb"
 
 @apply_defaults
 def __init__(
-self,
-trigger_dag_id: str,
-python_callable: Optional[Callable[[Dict, DagRunOrder], 
DagRunOrder]] = None,
-execution_date: Optional[Union[str, datetime.datetime]] = None,
-*args, **kwargs) -> None:
+self,
+trigger_dag_id: str,
+conf: Optional[Dict] = None,
+execution_date: Optional[Union[str, datetime.datetime]] = None,
+*args,
+**kwargs
+) -> None:
 super().__init__(*args, **kwargs)
-self.python_callable = python_callable
 self.trigger_dag_id = trigger_dag_id
+self.conf = conf
 
-self.execution_date = None  # type: Optional[Union[str, 
datetime.datetime]]
-if isinstance(execution_date, datetime.datetime):
-self.execution_date = execution_date.isoformat()
-elif isinstance(execution_date, str):
+if execution_date is None or isinstance(execution_date, (str, 
datetime.datetime)):
 self.execution_date = execution_date
-elif execution_date is None:
-self.execution_date = None
 else:
 raise TypeError(
-'Expected str or datetime.datetime type '
-'for execution_date. Got {}'.format(
-type(execution_date)))
+"Expected str or datetime.datetime type for execution_date. "
+"Got {}".format(type(execution_date))
+)
 
-def execute(self, context):
-if self.execution_date is not None:
-run_id = 'trig__{}'.format(self.execution_date)
-self.execution_date = timezone.parse(self.execution_date)
+def execute(self, context: Dict):
+if isinstance(self.execution_date, datetime.datetime):
+run_id = "trig__{}".format(self.execution_date.isoformat())
+elif isinstance(self.execution_date, str):
+run_id = "trig__{}".format(self.execution_date)
+self.execution_date = timezone.parse(self.execution_date)  # 
trigger_dag() expects datetime
 
 Review comment:
   Much simpler, much better
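
   For readers skimming the diff, a short usage sketch of the simplified 
   operator (DAG boilerplate assumed; the dag ids and conf payload are 
   illustrative):
   
   ```python
   from datetime import datetime
   
   from airflow import DAG
   from airflow.operators.dagrun_operator import TriggerDagRunOperator
   
   with DAG(dag_id="trigger_example", start_date=datetime(2019, 10, 1),
            schedule_interval=None) as dag:
       trigger = TriggerDagRunOperator(
           task_id="trigger_target",
           trigger_dag_id="target_dag",  # the DAG to trigger (templated)
           conf={"message": "hello"},  # made available to the triggered run
           execution_date="{{ execution_date }}",  # str or datetime, templated
       )
   ```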


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-5663) PythonVirtualenvOperator doesn't print logs until the operator is finished

2019-10-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957243#comment-16957243
 ] 

ASF GitHub Bot commented on AIRFLOW-5663:
-

wjiangqc commented on pull request #6389: [AIRFLOW-5663]: Switch to real-time 
logging in PythonVirtualenvOperator.
URL: https://github.com/apache/airflow/pull/6389
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-5663
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   Allow us to see the logs while the operator is running.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   Existing 19 unit tests passed.
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> PythonVirtualenvOperator doesn't print logs until the operator is finished
> --
>
> Key: AIRFLOW-5663
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5663
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 1.10.5
>Reporter: Wei Jiang
>Priority: Major
>
> It seems the {{PythonVirtualenvOperator}} doesn't print any logs until the 
> job is done. When we use logging in {{PythonOperator}}, it prints logs in 
> real time. It would be nice for the {{PythonVirtualenvOperator}} to have the 
> same behaviour as the {{PythonOperator}}, so we can see the logging output as 
> the {{PythonVirtualenvOperator}} makes progress.
> [https://github.com/apache/airflow/blob/master/airflow/operators/python_operator.py#L332]
> {code:python}
> def _execute_in_subprocess(self, cmd):
>     try:
>         self.log.info("Executing cmd\n%s", cmd)
>         output = subprocess.check_output(cmd,
>                                          stderr=subprocess.STDOUT,
>                                          close_fds=True)
>         if output:
>             self.log.info("Got output\n%s", output)
>     except subprocess.CalledProcessError as e:
>         self.log.info("Got error output\n%s", e.output)
>         raise
> {code}
>  
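
For reference, a minimal standalone sketch of the line-by-line streaming the 
linked PR switches to (logger setup assumed; this is not the exact PR code):

{code:python}
import logging
import subprocess

log = logging.getLogger(__name__)


def execute_in_subprocess(cmd):
    """Run cmd, logging each line of output as soon as it is produced."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, close_fds=True)
    with proc.stdout:
        for raw_line in iter(proc.stdout.readline, b''):
            log.info("%s", raw_line.decode(errors="replace").rstrip())
    if proc.wait() != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)
{code}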



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] ashb commented on a change in pull request #6377: [AIRFLOW-5589] monitor pods by labels instead of names

2019-10-22 Thread GitBox
ashb commented on a change in pull request #6377: [AIRFLOW-5589] monitor pods 
by labels instead of names
URL: https://github.com/apache/airflow/pull/6377#discussion_r337668858
 
 

 ##
 File path: tests/integration/kubernetes/test_kubernetes_pod_operator.py
 ##
 @@ -61,7 +72,10 @@ def setUp(self):
 'namespace': 'default',
 'name': ANY,
 'annotations': {},
-'labels': {'foo': 'bar'}
+'labels': {'foo': 'bar',
+   'exec_date': '2019-01-01T09-4bf747daa',
 
 Review comment:
   I wonder if it's worth using a non-standard date format here, rather than 
always hashing this label? (Ultimately it doesn't matter much to us in 
Airflow, I guess.)
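
   (Background: Kubernetes label values may only contain alphanumerics plus 
   '-', '_', and '.', and are capped at 63 characters, so a raw ISO-8601 
   execution date such as '2019-01-01T09:00:00+00:00' is not a valid label 
   value. A hedged sketch of the kind of transformation in play; the helper 
   name is hypothetical:)
   
   ```python
   def datetime_to_label_safe(iso_datestring: str) -> str:
       """Hypothetical helper: make an ISO-8601 timestamp k8s-label-safe.
   
       ':' and '+' are not allowed in Kubernetes label values.
       """
       return iso_datestring.replace(":", "-").replace("+", "_")
   
   
   print(datetime_to_label_safe("2019-01-01T09:00:00+00:00"))
   # -> 2019-01-01T09-00-00_00-00
   ```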


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj commented on a change in pull request #6389: [AIRFLOW-5663]: Switch to real-time logging in PythonVirtualenvOperator.

2019-10-22 Thread GitBox
mik-laj commented on a change in pull request #6389: [AIRFLOW-5663]: Switch to 
real-time logging in PythonVirtualenvOperator.
URL: https://github.com/apache/airflow/pull/6389#discussion_r337667617
 
 

 ##
 File path: airflow/operators/python_operator.py
 ##
 @@ -330,16 +330,20 @@ def _pass_op_args(self):
         return len(self.op_args) + len(self.op_kwargs) > 0
 
     def _execute_in_subprocess(self, cmd):
-        try:
-            self.log.info("Executing cmd\n%s", cmd)
-            output = subprocess.check_output(cmd,
-                                             stderr=subprocess.STDOUT,
-                                             close_fds=True)
-            if output:
-                self.log.info("Got output\n%s", output)
-        except subprocess.CalledProcessError as e:
-            self.log.info("Got error output\n%s", e.output)
-            raise
+        self.log.info("Executing cmd\n{}".format(cmd))
+        self._sp = subprocess.Popen(cmd,
+                                    stdout=subprocess.PIPE,
+                                    stderr=subprocess.STDOUT,
+                                    bufsize=0,
+                                    close_fds=True)
+        self.log.info("Got output\n")
+        with self._sp.stdout:
+            for line in iter(self._sp.stdout.readline, b''):
+                self.log.info(line)
 
 Review comment:
   ```suggestion
   self.log.info("%s", line)
   ```
   Data should be passed as logger arguments so that the values are rendered 
in a different color.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow-site] mik-laj merged pull request #88: [depends on #84] Add video component

2019-10-22 Thread GitBox
mik-laj merged pull request #88: [depends on #84] Add video component
URL: https://github.com/apache/airflow-site/pull/88
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[airflow-site] branch aip-11 updated: [depends on #84] Add video component (#88)

2019-10-22 Thread kamilbregula
This is an automated email from the ASF dual-hosted git repository.

kamilbregula pushed a commit to branch aip-11
in repository https://gitbox.apache.org/repos/asf/airflow-site.git


The following commit(s) were added to refs/heads/aip-11 by this push:
 new 6108236  [depends on #84] Add video component (#88)
6108236 is described below

commit 61082363127a96bfdcf829dc68b9b822c1f73988
Author: Kamil Breguła 
AuthorDate: Tue Oct 22 19:53:48 2019 +0200

[depends on #84] Add video component (#88)

Co-Authored-By: Kamil Gabryjelski 
---
 landing-pages/site/assets/icons/play-icon.svg  |  5 ++
 landing-pages/site/assets/scss/_video.scss | 90 ++
 landing-pages/site/assets/scss/main-custom.scss|  1 +
 landing-pages/site/data/videos.json| 56 ++
 landing-pages/site/layouts/examples/list.html  |  2 +
 .../site/layouts/partials/video-section.html   | 42 ++
 landing-pages/site/layouts/partials/youtube.html   | 23 ++
 landing-pages/src/index.js |  2 +
 .../js/handleActiveVideo.js}   | 32 
 9 files changed, 239 insertions(+), 14 deletions(-)

diff --git a/landing-pages/site/assets/icons/play-icon.svg 
b/landing-pages/site/assets/icons/play-icon.svg
new file mode 100644
index 000..6f66cb0
--- /dev/null
+++ b/landing-pages/site/assets/icons/play-icon.svg
@@ -0,0 +1,5 @@
+http://www.w3.org/2000/svg; width="20" height="20" viewBox="-0.5 
-0.5 21 21">
+
+
diff --git a/landing-pages/site/assets/scss/_video.scss 
b/landing-pages/site/assets/scss/_video.scss
new file mode 100644
index 000..f8dd998
--- /dev/null
+++ b/landing-pages/site/assets/scss/_video.scss
@@ -0,0 +1,90 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+@import "colors";
+
+.video-section {
+  display: flex;
+  border: solid 1px #cbcbcb;
+  padding: 40px;
+}
+
+.video-wrapper {
+  flex: 1;
+
+  .video-container {
+display: none;
+
+&:last-child {
+  display: block;
+}
+  }
+
+  .anchor {
+position: fixed;
+
+&:target + .video-container {
+  display: block;
+}
+
+&:target + .video-container ~ .video-container {
+  display: none;
+}
+  }
+}
+
+.video-list-wrapper {
+  overflow-y: scroll;
+  max-height: 403px;
+  max-width: 365px;
+  width: 100%;
+  margin-left: 40px;
+}
+
+.video-list {
+  display: flex;
+  flex-direction: column-reverse;
+  justify-content: flex-end;
+
+  &__item {
+display: flex;
+align-items: center;
+border-bottom: solid 1px map-get($colors, very-light-pink);
+padding: 16px 0;
+
+$item: &;
+#{$item}--title {
+  @extend .bodytext__medium--brownish-grey;
+  margin-left: 9px;
+  vertical-align: middle;
+}
+
+&:hover, &.active {
+  #{$item}--title {
+font-weight: 500;
+  }
+
+  svg {
+path {
+  fill: map-get($colors, brownish-grey);
+  stroke: none;
+}
+  }
+}
+  }
+}
diff --git a/landing-pages/site/assets/scss/main-custom.scss 
b/landing-pages/site/assets/scss/main-custom.scss
index 4b7330e..1d9bd8b 100644
--- a/landing-pages/site/assets/scss/main-custom.scss
+++ b/landing-pages/site/assets/scss/main-custom.scss
@@ -32,3 +32,4 @@
 @import "case-study";
 @import "paragraph";
 @import "base-layout";
+@import "video";
diff --git a/landing-pages/site/data/videos.json 
b/landing-pages/site/data/videos.json
new file mode 100644
index 000..64dd841
--- /dev/null
+++ b/landing-pages/site/data/videos.json
@@ -0,0 +1,56 @@
+[
+  {
+"name": "video",
+"date": "2019-01-23",
+"title": "Airflow Meetup, London 23 Jan 2019",
+"videoID": "E0asAgpHvaI"
+  },
+  {
+"name": "video",
+"date": "2019-04-05",
+"title": "Airflow Meetup, London 05 Apr 2019",
+"videoID": "uN-TvWzeEvA"
+  },
+  {
+"name": "video",
+"date": "2019-09-30",
+"title": "Airflow Meetup, London 30 Sep 2019",
+"videoID": "nUfb4UxnvJk"
+  },
+  {
+"name": "video",
+"date": "2019-09-30",
+"title": "Airflow Meetup, London 30 Sep 2019",
+"videoID": "OhWZavn2OvM"
+  },
+  {
+"name": "video",
+"date": "2019-09-30",
+"title": "Airflow Meetup, London 30 Sep 

[GitHub] [airflow] wjiangqc opened a new pull request #6389: [AIRFLOW-5663]: Switch to real-time logging in PythonVirtualenvOperator.

2019-10-22 Thread GitBox
wjiangqc opened a new pull request #6389: [AIRFLOW-5663]: Switch to real-time 
logging in PythonVirtualenvOperator.
URL: https://github.com/apache/airflow/pull/6389
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-5663
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   Allow us to see the logs while the operator is running.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   Existing 19 unit tests passed.
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io edited a comment on issue #6370: AIRFLOW-5701: Don't clear xcom explicitly before execution

2019-10-22 Thread GitBox
codecov-io edited a comment on issue #6370: AIRFLOW-5701: Don't clear xcom 
explicitly before execution
URL: https://github.com/apache/airflow/pull/6370#issuecomment-544232286
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6370?src=pr=h1) 
Report
   > Merging 
[#6370](https://codecov.io/gh/apache/airflow/pull/6370?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/3cfe4a1c9dc49c91839e9c278b97f9c18033fdf4?src=pr=desc)
 will **decrease** coverage by `0.3%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6370/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/6370?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #6370      +/-   ##
   ==========================================
   - Coverage   80.57%   80.26%    -0.31%    
   ==========================================
     Files         626      626              
     Lines       36237    36229       -8     
   ==========================================
   - Hits        29198    29081     -117     
   - Misses       7039     7148     +109     
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6370?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/models/taskinstance.py](https://codecov.io/gh/apache/airflow/pull/6370/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvdGFza2luc3RhbmNlLnB5)
 | `93.73% <ø> (-0.06%)` | :arrow_down: |
   | 
[airflow/models/xcom.py](https://codecov.io/gh/apache/airflow/pull/6370/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMveGNvbS5weQ==)
 | `79.79% <100%> (-0.6%)` | :arrow_down: |
   | 
[airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/6370/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==)
 | `44.44% <0%> (-55.56%)` | :arrow_down: |
   | 
[airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/6370/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==)
 | `52.94% <0%> (-47.06%)` | :arrow_down: |
   | 
[airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/6370/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==)
 | `45.25% <0%> (-46.72%)` | :arrow_down: |
   | 
[airflow/kubernetes/kube\_client.py](https://codecov.io/gh/apache/airflow/pull/6370/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL2t1YmVfY2xpZW50LnB5)
 | `33.33% <0%> (-41.67%)` | :arrow_down: |
   | 
[...rflow/contrib/operators/kubernetes\_pod\_operator.py](https://codecov.io/gh/apache/airflow/pull/6370/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZF9vcGVyYXRvci5weQ==)
 | `70.14% <0%> (-28.36%)` | :arrow_down: |
   | 
[airflow/ti\_deps/deps/base\_ti\_dep.py](https://codecov.io/gh/apache/airflow/pull/6370/diff?src=pr=tree#diff-YWlyZmxvdy90aV9kZXBzL2RlcHMvYmFzZV90aV9kZXAucHk=)
 | `85.71% <0%> (-4.77%)` | :arrow_down: |
   | 
[airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/airflow/pull/6370/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==)
 | `58.73% <0%> (-0.17%)` | :arrow_down: |
   | 
[airflow/jobs/backfill\_job.py](https://codecov.io/gh/apache/airflow/pull/6370/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzL2JhY2tmaWxsX2pvYi5weQ==)
 | `91.43% <0%> (+1.52%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6370?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6370?src=pr=footer). 
Last update 
[3cfe4a1...c9b9312](https://codecov.io/gh/apache/airflow/pull/6370?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow-site] mik-laj merged pull request #84: Add accordion

2019-10-22 Thread GitBox
mik-laj merged pull request #84: Add accordion
URL: https://github.com/apache/airflow-site/pull/84
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[airflow-site] branch aip-11 updated: Add accordion component (#84)

2019-10-22 Thread kamilbregula
This is an automated email from the ASF dual-hosted git repository.

kamilbregula pushed a commit to branch aip-11
in repository https://gitbox.apache.org/repos/asf/airflow-site.git


The following commit(s) were added to refs/heads/aip-11 by this push:
 new 4976f21  Add accordion component (#84)
4976f21 is described below

commit 4976f218652975dff3e3c0cab5f699c68b0cba56
Author: Kamil Breguła 
AuthorDate: Tue Oct 22 19:40:01 2019 +0200

Add accordion component (#84)

Co-Authored-By: Kamil Gabryjelski 
---
 .../site/assets/icons/ask-question-icon.svg| 16 +++
 landing-pages/site/assets/icons/bug-icon.svg   | 31 ++
 .../site/assets/icons/join-devlist-icon.svg| 16 +++
 landing-pages/site/assets/scss/_accordion.scss | 20 --
 landing-pages/site/content/en/examples/_index.html | 21 +++
 .../site/layouts/shortcodes/accordion.html | 19 ++---
 6 files changed, 107 insertions(+), 16 deletions(-)

diff --git a/landing-pages/site/assets/icons/ask-question-icon.svg 
b/landing-pages/site/assets/icons/ask-question-icon.svg
new file mode 100644
index 000..33916d3
--- /dev/null
+++ b/landing-pages/site/assets/icons/ask-question-icon.svg
@@ -0,0 +1,16 @@
+<svg xmlns="http://www.w3.org/2000/svg" width="60" height="55.876" viewBox="0 0 60 55.876">
+  [<path> markup stripped by the mail archiver]
+</svg>
diff --git a/landing-pages/site/assets/icons/bug-icon.svg 
b/landing-pages/site/assets/icons/bug-icon.svg
new file mode 100644
index 000..7bfb35b
--- /dev/null
+++ b/landing-pages/site/assets/icons/bug-icon.svg
@@ -0,0 +1,31 @@
+<svg xmlns="http://www.w3.org/2000/svg" width="54.425" height="60" viewBox="0 0 54.425 60">
+  [<path> markup stripped by the mail archiver]
+</svg>
diff --git a/landing-pages/site/assets/icons/join-devlist-icon.svg 
b/landing-pages/site/assets/icons/join-devlist-icon.svg
new file mode 100644
index 000..48a9ec5
--- /dev/null
+++ b/landing-pages/site/assets/icons/join-devlist-icon.svg
@@ -0,0 +1,16 @@
+<svg xmlns="http://www.w3.org/2000/svg" width="60" height="43.846" viewBox="0 0 60 43.846">
+  [<path> markup stripped by the mail archiver]
+</svg>
diff --git a/landing-pages/site/assets/scss/_accordion.scss 
b/landing-pages/site/assets/scss/_accordion.scss
index 7d9e96d..392e3e5 100644
--- a/landing-pages/site/assets/scss/_accordion.scss
+++ b/landing-pages/site/assets/scss/_accordion.scss
@@ -16,7 +16,6 @@
  * specific language governing permissions and limitations
  * under the License.
  */
-
 @import "colors";
 
 details.accordion {
@@ -36,16 +35,21 @@ details.accordion {
 display: none;
   }
 
-  .accordion__header {
-color: map_get($colors, cerulean-blue);
-margin-bottom: 20px;
-  }
+  .accordion__summary-content {
+display: flex;
 
-  .accordion__description {
-color: map_get($colors, brownish-grey);
+svg {
+  width: 60px;
+  margin-top: 28px;
+  margin-right: 42px;
+}
+
+&--header {
+  margin-bottom: 20px;
+}
   }
 
-  .accordion__icon {
+  .accordion__arrow {
 display: flex;
 position: absolute;
 width: 36px;
diff --git a/landing-pages/site/content/en/examples/_index.html 
b/landing-pages/site/content/en/examples/_index.html
index a9b2fc3..4d6085b 100644
--- a/landing-pages/site/content/en/examples/_index.html
+++ b/landing-pages/site/content/en/examples/_index.html
@@ -43,16 +43,27 @@ menu:
 
 {{< blocks/section color="white" >}}
 
-{{< accordion title="Accordion 1" description="Tincidunt ornare massa eget 
egestas purus viverra." >}}
-Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod 
tempor incididunt ut labore et dolore magna aliqua. Tellus at urna condimentum 
mattis pellentesque id. Neque laoreet suspendisse interdum consectetur. Non 
blandit massa enim nec.
-{{< /accordion >}}
-{{< accordion title="Accordion 2" description="Vestibulum morbi blandit 
cursus risus at." >}}
+{{< accordion title="Install Apache Airflow locally" description="Working 
on an Open Source project such as Apache Airflow is very demanding but also 
equally rewarding when you realize how many businesses use it every day." >}}
 Magna ac placerat vestibulum lectus mauris ultrices. Nullam non nisi est 
sit amet facilisis magna etiam tempor. Aliquet nec ullamcorper sit amet risus 
nullam eget felis. Rhoncus aenean vel elit scelerisque mauris pellentesque.
 {{< /accordion >}}
-{{< accordion title="Accordion 3" description="Id faucibus nisl tincidunt 
eget." >}}
+{{< accordion 

[GitHub] [airflow] Fokko commented on issue #6370: AIRFLOW-5701: Don't clear xcom explicitly before execution

2019-10-22 Thread GitBox
Fokko commented on issue #6370: AIRFLOW-5701: Don't clear xcom explicitly 
before execution
URL: https://github.com/apache/airflow/pull/6370#issuecomment-545043480
 
 
   Test failure looks unrelated, pulled in master:
   ```
  Summary of failed tests
   
tests.gcp.hooks.test_gcs.TestGoogleCloudStorageHookUpload.test_upload_data_str_gzip
 Failure:builtins.AssertionError
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] Fokko commented on a change in pull request #6317: [AIRFLOW-5644] Simplify TriggerDagRunOperator usage

2019-10-22 Thread GitBox
Fokko commented on a change in pull request #6317: [AIRFLOW-5644] Simplify 
TriggerDagRunOperator usage
URL: https://github.com/apache/airflow/pull/6317#discussion_r337613334
 
 

 ##
 File path: airflow/operators/dagrun_operator.py
 ##
 @@ -75,24 +63,21 @@ def __init__(
 self.execution_date = None
 else:
 raise TypeError(
-'Expected str or datetime.datetime type '
-'for execution_date. Got {}'.format(
-type(execution_date)))
+"Expected str or datetime.datetime type for execution_date. "
+"Got {}".format(type(execution_date))
+)
 
 def execute(self, context):
 
 Review comment:
   No, it is okay like this.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow-site] kgabryje opened a new pull request #89: Feature/blog tags

2019-10-22 Thread GitBox
kgabryje opened a new pull request #89: Feature/blog tags
URL: https://github.com/apache/airflow-site/pull/89
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Resolved] (AIRFLOW-5695) Cannot run a task from UI if its state is None

2019-10-22 Thread Ash Berlin-Taylor (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-5695.

Fix Version/s: 1.10.7
   Resolution: Fixed

> Cannot run a task from UI if its state is None
> --
>
> Key: AIRFLOW-5695
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5695
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 1.10.4
>Reporter: Ping Zhang
>Priority: Major
> Fix For: 1.10.7
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow-site] mik-laj opened a new pull request #88: [depends on #84] Add video component

2019-10-22 Thread GitBox
mik-laj opened a new pull request #88: [depends on #84] Add video component
URL: https://github.com/apache/airflow-site/pull/88
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[airflow-site] branch aip-11 updated (d17ee44 -> dbb402c)

2019-10-22 Thread kamilbregula
This is an automated email from the ASF dual-hosted git repository.

kamilbregula pushed a change to branch aip-11
in repository https://gitbox.apache.org/repos/asf/airflow-site.git.


 discard d17ee44  [depends on #84] Add video component (#85)

This update removed existing revisions from the reference, leaving the
reference pointing at a previous point in the repository history.

 * -- * -- N   refs/heads/aip-11 (dbb402c)
            \
             O -- O -- O   (d17ee44)

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

No new revisions were added by this update.

Summary of changes:
 .../site/assets/icons/ask-question-icon.svg| 16 
 landing-pages/site/assets/icons/bug-icon.svg   | 31 
 .../site/assets/icons/join-devlist-icon.svg| 16 
 landing-pages/site/assets/icons/play-icon.svg  |  5 --
 landing-pages/site/assets/scss/_accordion.scss | 20 ++---
 landing-pages/site/assets/scss/_video.scss | 90 --
 landing-pages/site/assets/scss/main-custom.scss|  1 -
 landing-pages/site/content/en/examples/_index.html | 21 ++---
 landing-pages/site/data/videos.json| 56 --
 landing-pages/site/layouts/examples/list.html  |  2 -
 .../site/layouts/partials/video-section.html   | 42 --
 landing-pages/site/layouts/partials/youtube.html   | 23 --
 .../site/layouts/shortcodes/accordion.html | 19 +
 landing-pages/src/index.js |  2 -
 landing-pages/src/js/handleActiveVideo.js  | 38 -
 15 files changed, 16 insertions(+), 366 deletions(-)
 delete mode 100644 landing-pages/site/assets/icons/ask-question-icon.svg
 delete mode 100644 landing-pages/site/assets/icons/bug-icon.svg
 delete mode 100644 landing-pages/site/assets/icons/join-devlist-icon.svg
 delete mode 100644 landing-pages/site/assets/icons/play-icon.svg
 delete mode 100644 landing-pages/site/assets/scss/_video.scss
 delete mode 100644 landing-pages/site/data/videos.json
 delete mode 100644 landing-pages/site/layouts/partials/video-section.html
 delete mode 100644 landing-pages/site/layouts/partials/youtube.html
 delete mode 100644 landing-pages/src/js/handleActiveVideo.js



[GitHub] [airflow-site] mik-laj merged pull request #85: [depends on #84] Add video component

2019-10-22 Thread GitBox
mik-laj merged pull request #85: [depends on #84] Add video component
URL: https://github.com/apache/airflow-site/pull/85
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[airflow-site] branch aip-11 updated: [depends on #84] Add video component (#85)

2019-10-22 Thread kamilbregula
This is an automated email from the ASF dual-hosted git repository.

kamilbregula pushed a commit to branch aip-11
in repository https://gitbox.apache.org/repos/asf/airflow-site.git


The following commit(s) were added to refs/heads/aip-11 by this push:
 new d17ee44  [depends on #84] Add video component (#85)
d17ee44 is described below

commit d17ee44b86b21a4fc39893988f04a95d99047d17
Author: Kamil Breguła 
AuthorDate: Tue Oct 22 15:39:36 2019 +0200

[depends on #84] Add video component (#85)

Co-Authored-By:  Kamil Gabryjelski 
---
 .../site/assets/icons/ask-question-icon.svg| 16 
 landing-pages/site/assets/icons/bug-icon.svg   | 31 
 .../site/assets/icons/join-devlist-icon.svg| 16 
 landing-pages/site/assets/icons/play-icon.svg  |  5 ++
 landing-pages/site/assets/scss/_accordion.scss | 20 +++--
 landing-pages/site/assets/scss/_video.scss | 90 ++
 landing-pages/site/assets/scss/main-custom.scss|  1 +
 landing-pages/site/content/en/examples/_index.html | 21 +++--
 landing-pages/site/data/videos.json| 56 ++
 landing-pages/site/layouts/examples/list.html  |  2 +
 .../site/layouts/partials/video-section.html   | 42 ++
 .../accordion.html => partials/youtube.html}   | 14 +---
 .../site/layouts/shortcodes/accordion.html | 19 -
 landing-pages/src/index.js |  2 +
 .../js/handleActiveVideo.js}   | 32 
 15 files changed, 327 insertions(+), 40 deletions(-)

diff --git a/landing-pages/site/assets/icons/ask-question-icon.svg 
b/landing-pages/site/assets/icons/ask-question-icon.svg
new file mode 100644
index 000..33916d3
--- /dev/null
+++ b/landing-pages/site/assets/icons/ask-question-icon.svg
@@ -0,0 +1,16 @@
+<svg xmlns="http://www.w3.org/2000/svg" width="60" height="55.876" viewBox="0 0 60 55.876">
+  [<path> markup stripped by the mail archiver]
+</svg>
diff --git a/landing-pages/site/assets/icons/bug-icon.svg 
b/landing-pages/site/assets/icons/bug-icon.svg
new file mode 100644
index 000..7bfb35b
--- /dev/null
+++ b/landing-pages/site/assets/icons/bug-icon.svg
@@ -0,0 +1,31 @@
+<svg xmlns="http://www.w3.org/2000/svg" width="54.425" height="60" viewBox="0 0 54.425 60">
+  [<path> markup stripped by the mail archiver]
+</svg>
diff --git a/landing-pages/site/assets/icons/join-devlist-icon.svg 
b/landing-pages/site/assets/icons/join-devlist-icon.svg
new file mode 100644
index 000..48a9ec5
--- /dev/null
+++ b/landing-pages/site/assets/icons/join-devlist-icon.svg
@@ -0,0 +1,16 @@
+<svg xmlns="http://www.w3.org/2000/svg" width="60" height="43.846" viewBox="0 0 60 43.846">
+  [<path> markup stripped by the mail archiver]
+</svg>
diff --git a/landing-pages/site/assets/icons/play-icon.svg 
b/landing-pages/site/assets/icons/play-icon.svg
new file mode 100644
index 000..6f66cb0
--- /dev/null
+++ b/landing-pages/site/assets/icons/play-icon.svg
@@ -0,0 +1,5 @@
+<svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="-0.5 -0.5 21 21">
+  [<path> markup stripped by the mail archiver]
+</svg>
diff --git a/landing-pages/site/assets/scss/_accordion.scss 
b/landing-pages/site/assets/scss/_accordion.scss
index 7d9e96d..392e3e5 100644
--- a/landing-pages/site/assets/scss/_accordion.scss
+++ b/landing-pages/site/assets/scss/_accordion.scss
@@ -16,7 +16,6 @@
  * specific language governing permissions and limitations
  * under the License.
  */
-
 @import "colors";
 
 details.accordion {
@@ -36,16 +35,21 @@ details.accordion {
 display: none;
   }
 
-  .accordion__header {
-color: map_get($colors, cerulean-blue);
-margin-bottom: 20px;
-  }
+  .accordion__summary-content {
+display: flex;
 
-  .accordion__description {
-color: map_get($colors, brownish-grey);
+svg {
+  width: 60px;
+  margin-top: 28px;
+  margin-right: 42px;
+}
+
+&--header {
+  margin-bottom: 20px;
+}
   }
 
-  .accordion__icon {
+  .accordion__arrow {
 display: flex;
 position: absolute;
 width: 36px;
diff --git a/landing-pages/site/assets/scss/_video.scss 
b/landing-pages/site/assets/scss/_video.scss
new file mode 100644
index 000..f8dd998
--- /dev/null
+++ b/landing-pages/site/assets/scss/_video.scss
@@ -0,0 +1,90 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you 

[jira] [Resolved] (AIRFLOW-5632) Tests for renaming GCP operators

2019-10-22 Thread Kamil Bregula (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Bregula resolved AIRFLOW-5632.

Fix Version/s: 2.0.0
   Resolution: Fixed

> Tests for renaming GCP operators 
> -
>
> Key: AIRFLOW-5632
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5632
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 1.10.5
>Reporter: Michał Słowikowski
>Assignee: Michał Słowikowski
>Priority: Minor
> Fix For: 2.0.0
>
>
> Tests include:
>  * whether importing the old module raises a DeprecationWarning
>  * whether initializing the deprecated class raises a DeprecationWarning
>  * whether the old class is a subclass of the new class
>  * whether the new module path is shown in the warning message



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
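
For readers following this rename work: the checks listed in the ticket boil
down to a small, self-contained pattern. The sketch below uses placeholder
names (`OldOperator`, `NewOperator`, `new_module`) rather than the real Airflow
module paths; the import-time check works the same way, with `warnings.warn`
placed at module level in the deprecated location.

```python
import warnings

class NewOperator:
    """Stand-in for the class at its new location."""

class OldOperator(NewOperator):
    """Deprecated alias: subclasses the new class and warns on instantiation."""
    def __init__(self, *args, **kwargs):
        warnings.warn(
            "OldOperator is deprecated; use new_module.NewOperator instead",
            DeprecationWarning, stacklevel=2,
        )
        super().__init__(*args, **kwargs)

# The listed checks, expressed as plain assertions:
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    OldOperator()                                   # deprecated class warns
assert any(issubclass(w.category, DeprecationWarning) for w in caught)
assert any("new_module" in str(w.message) for w in caught)  # new path shown
assert issubclass(OldOperator, NewOperator)         # old class subclasses new
```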


[jira] [Commented] (AIRFLOW-5632) Tests for renaming GCP operators

2019-10-22 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957051#comment-16957051
 ] 

ASF subversion and git services commented on AIRFLOW-5632:
--

Commit 3cfe4a1c9dc49c91839e9c278b97f9c18033fdf4 in airflow's branch 
refs/heads/master from mislo
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=3cfe4a1 ]

[AIRFLOW-5632] Rename ComputeEngine operators (#6306)



> Tests for renaming GCP operators 
> -
>
> Key: AIRFLOW-5632
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5632
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 1.10.5
>Reporter: Michał Słowikowski
>Assignee: Michał Słowikowski
>Priority: Minor
>
> Tests include:
>  * whether importing the old module raises a DeprecationWarning
>  * whether initializing the deprecated class raises a DeprecationWarning
>  * whether the old class is a subclass of the new class
>  * whether the new module path is shown in the warning message



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-5632) Tests for renaming GCP operators

2019-10-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957050#comment-16957050
 ] 

ASF GitHub Bot commented on AIRFLOW-5632:
-

mik-laj commented on pull request #6306: [AIRFLOW-5632] Rename ComputeEngine 
operators
URL: https://github.com/apache/airflow/pull/6306
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Tests for renaming GCP operators 
> -
>
> Key: AIRFLOW-5632
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5632
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 1.10.5
>Reporter: Michał Słowikowski
>Assignee: Michał Słowikowski
>Priority: Minor
>
> Tests include:
>  * whether importing the old module raises a DeprecationWarning
>  * whether initializing the deprecated class raises a DeprecationWarning
>  * whether the old class is a subclass of the new class
>  * whether the new module path is shown in the warning message



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] mik-laj merged pull request #6306: [AIRFLOW-5632] Rename ComputeEngine operators

2019-10-22 Thread GitBox
mik-laj merged pull request #6306: [AIRFLOW-5632] Rename ComputeEngine operators
URL: https://github.com/apache/airflow/pull/6306
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow-site] mik-laj opened a new pull request #87: [depends on #85, #86, #87] Add blog layout

2019-10-22 Thread GitBox
mik-laj opened a new pull request #87: [depends on #85, #86, #87] Add blog 
layout
URL: https://github.com/apache/airflow-site/pull/87
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow-site] mik-laj opened a new pull request #86: Add features components

2019-10-22 Thread GitBox
mik-laj opened a new pull request #86: Add features components
URL: https://github.com/apache/airflow-site/pull/86
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow-site] mik-laj opened a new pull request #85: [depends on #84] Add video component

2019-10-22 Thread GitBox
mik-laj opened a new pull request #85: [depends on #84] Add video component
URL: https://github.com/apache/airflow-site/pull/85
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow-site] mik-laj opened a new pull request #84: Add accordion

2019-10-22 Thread GitBox
mik-laj opened a new pull request #84: Add accordion
URL: https://github.com/apache/airflow-site/pull/84
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-5694) Check for blinker when detecting Sentry packages

2019-10-22 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957024#comment-16957024
 ] 

ASF subversion and git services commented on AIRFLOW-5694:
--

Commit 770927ae8d12135adac7430a9ab6b39a6f4a1e68 in airflow's branch 
refs/heads/v1-10-test from Marcus Levine
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=770927a ]

[AIRFLOW-5694] Check for blinker in Sentry setup (#6365)


(cherry picked from commit c72c42730236fee1526fcc03dca7f88e1778ee94)


> Check for blinker when detecting Sentry packages
> 
>
> Key: AIRFLOW-5694
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5694
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: dependencies
>Affects Versions: 1.10.6
>Reporter: Marcus Levine
>Assignee: Marcus Levine
>Priority: Minor
> Fix For: 1.10.7
>
>
> After upgrading to 1.10.6rc1 with `sentry-sdk` installed but not specifying 
> the `[sentry]` extra, the missing `blinker` dependency will cause failures of the 
> following form:
> {code:python}
> ../lib/python3.7/site-packages/airflow/__init__.py:40: in <module>
>     from airflow.models import DAG
> ../lib/python3.7/site-packages/airflow/models/__init__.py:21: in <module>
>     from airflow.models.baseoperator import BaseOperator  # noqa: F401
> ../lib/python3.7/site-packages/airflow/models/baseoperator.py:42: in <module>
>     from airflow.models.dag import DAG
> ../lib/python3.7/site-packages/airflow/models/dag.py:51: in <module>
>     from airflow.models.taskinstance import TaskInstance, clear_task_instances
> ../lib/python3.7/site-packages/airflow/models/taskinstance.py:53: in <module>
>     from airflow.sentry import Sentry
> ../lib/python3.7/site-packages/airflow/sentry.py:167: in <module>
>     Sentry = ConfiguredSentry()
> ../lib/python3.7/site-packages/airflow/sentry.py:94: in __init__
>     init(integrations=integrations)
> ../lib/python3.7/site-packages/sentry_sdk/hub.py:81: in _init
>     client = Client(*args, **kwargs)  # type: ignore
> ../lib/python3.7/site-packages/sentry_sdk/client.py:80: in __init__
>     self._init_impl()
> ../lib/python3.7/site-packages/sentry_sdk/client.py:108: in _init_impl
>     with_defaults=self.options["default_integrations"],
> ../lib/python3.7/site-packages/sentry_sdk/integrations/__init__.py:82: in 
> setup_integrations
>     type(integration).setup_once()
> ../lib/python3.7/site-packages/sentry_sdk/integrations/flask.py:57: in 
> setup_once
>     appcontext_pushed.connect(_push_appctx)
> ../lib/python3.7/site-packages/flask/signals.py:39: in _fail
>     "Signalling support is unavailable because the blinker"
> E   RuntimeError: Signalling support is unavailable because the blinker 
> library is not installed.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
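
The fix referenced above ("Check for blinker in Sentry setup") amounts to
gating the Flask integration on blinker's availability. A minimal sketch of
that idea, assuming sentry-sdk is installed (this shows the shape of the
guard, not the actual Airflow patch):

```python
import importlib.util

def sentry_integrations():
    """Build the sentry-sdk integration list, enabling Flask support only
    when the blinker signalling library is importable."""
    integrations = []
    if importlib.util.find_spec("blinker") is not None:
        from sentry_sdk.integrations.flask import FlaskIntegration
        integrations.append(FlaskIntegration())
    return integrations
```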


[GitHub] [airflow] Fokko commented on a change in pull request #6210: [AIRFLOW-5567] BaseAsyncOperator

2019-10-22 Thread GitBox
Fokko commented on a change in pull request #6210: [AIRFLOW-5567] 
BaseAsyncOperator
URL: https://github.com/apache/airflow/pull/6210#discussion_r337499032
 
 

 ##
 File path: airflow/models/base_reschedule_poke_operator.py
 ##
 @@ -0,0 +1,218 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+Base Operator for kicking off long running
+operations and polling for completion with reschedule mode.
+"""
+
+from abc import ABC, abstractmethod
+from typing import Dict, List, Iterable, Optional, Union
+from time import sleep
+from datetime import timedelta
+
+from airflow.exceptions import AirflowException, AirflowSensorTimeout, \
+AirflowSkipException, AirflowRescheduleException
+from airflow.models import BaseOperator, SkipMixin, TaskReschedule
+from airflow.models.xcom import XCOM_EXTERNAL_RESOURCE_ID_KEY
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.utils import timezone
+from airflow.utils.decorators import apply_defaults
+from airflow.ti_deps.deps.ready_to_reschedule import ReadyToRescheduleDep
+
+
+class BaseReschedulePokeOperator(BaseOperator, SkipMixin, ABC):
+"""
+ReschedulePokeOperators are derived from this class and inherit these 
attributes.
+ReschedulePokeOperators should be used for long running operations where 
the task
+can tolerate a longer poke interval. They use the task rescheduling
+mechanism similar to sensors to avoid occupying a worker slot between
+pokes.
+
+Developing concrete operators that provide parameterized flexibility
+for synchronous or asynchronous poking depending on the invocation is
+possible by programming against this `BaseReschedulePokeOperator` interface,
+and overriding the execute method as demonstrated below.
+
+.. code-block:: python
+
+class DummyFlexiblePokingOperator(BaseReschedulePokeOperator):
+  def __init__(self, async_mode=False, *args, **kwargs):
+self.async_mode = async_mode
+super().__init__(*args, **kwargs)
+
+  def execute(self, context: Dict) -> None:
+if self.async_mode:
+  # use the BaseReschedulePokeOperator's execute
+  super().execute(context)
+else:
+  self.submit_request(context)
+  while not self.poke():
+time.sleep(self.poke_interval)
+  self.process_results(context)
+
+  def submit_request(self, context: Dict) -> Optional[str]:
+return None
+
+  def poke(self, context: Dict) -> bool:
+return bool(random.getrandbits(1))
+
+ReschedulePokeOperators must override the following methods:
+:py:meth:`submit_request`: fire a request for a long running operation
+:py:meth:`poke`: a method to check if the long running operation is
+complete; it should return True when the success criterion is met.
+
+Optionally, ReschedulePokeOperators can override:
+:py:meth:`process_result` to perform any operations after the success
+criterion is met in :py:meth:`poke`
+
+:py:meth:`poke` is executed at a time interval, succeeds when a
+criterion is met, and fails if and when it times out.
+
+:param soft_fail: Set to true to mark the task as SKIPPED on failure
+:type soft_fail: bool
+:param poke_interval: Time in seconds that the job should wait in
+between tries
+:type poke_interval: int
+:param timeout: Time, in seconds, before the task times out and fails.
+:type timeout: int
+
+"""
+ui_color = '#9933ff'  # type: str
+
+@apply_defaults
+def __init__(self,
+ *args,
+ **kwargs) -> None:
+super().__init__(mode='reschedule', *args, **kwargs)
+
+@abstractmethod
+def submit_request(self, context: Dict) -> Optional[Union[str, List, 
Dict]]:
+"""
+This method should kick off a long running operation.
+This method should return the ID for the long running operation if
+applicable.
+Context is the same dictionary used when rendering jinja templates.
+
+Refer to get_template_context for more context.
+
+:returns: a resource_id for the long running operation.
+   

[GitHub] [airflow] Fokko commented on a change in pull request #6210: [AIRFLOW-5567] BaseAsyncOperator

2019-10-22 Thread GitBox
Fokko commented on a change in pull request #6210: [AIRFLOW-5567] 
BaseAsyncOperator
URL: https://github.com/apache/airflow/pull/6210#discussion_r337498162
 
 

 ##
 File path: airflow/gcp/hooks/dataproc.py
 ##
 @@ -485,7 +490,8 @@ def submit(
 project_id: str,
 job: Dict,
 region: str = 'global',
-job_error_states: Optional[Iterable[str]] = None
+job_error_states: Optional[Iterable[str]] = None,
+async: bool = False
 
 Review comment:
   I believe this is a reserved keyword
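
   For context: `async` became a reserved keyword in Python 3.7, so it cannot
appear as a parameter name at all. A quick plain-Python illustration
(`asynchronous` is just one possible replacement name, not the name used in
the PR):

```python
# def submit(job, async=False):   # SyntaxError on Python 3.7+

def submit(job: dict, asynchronous: bool = False) -> dict:
    """Toy stand-in for a job-submission call; returns the job unchanged."""
    if asynchronous:
        print("returning without waiting for completion")
    return job

submit({"job_id": "example"}, asynchronous=True)
```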


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] acroos commented on a change in pull request #6380: [AIRFLOW-3632] Allow replace_microseconds in trigger_dag REST request

2019-10-22 Thread GitBox
acroos commented on a change in pull request #6380: [AIRFLOW-3632] Allow 
replace_microseconds in trigger_dag REST request
URL: https://github.com/apache/airflow/pull/6380#discussion_r337497804
 
 

 ##
 File path: airflow/www/api/experimental/endpoints.py
 ##
 @@ -74,8 +75,12 @@ def trigger_dag(dag_id):
 
 return response
 
+replace_microseconds = True
+if 'replace_microseconds' in data:
+replace_microseconds = to_boolean(data['replace_microseconds'])
+
 try:
-dr = trigger.trigger_dag(dag_id, run_id, conf, execution_date)
+dr = trigger.trigger_dag(dag_id, run_id, conf, execution_date, 
replace_microseconds)
 
 Review comment:
   sounds great, will do!
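
   For illustration, once the field is accepted a request against the
experimental endpoint might look like this (hypothetical host and DAG id;
`replace_microseconds` is the field this PR adds to the body):

```python
import requests

resp = requests.post(
    "http://localhost:8080/api/experimental/dags/example_dag/dag_runs",
    json={
        "conf": {},
        "execution_date": "2019-10-22T12:30:45.123456",
        "replace_microseconds": "false",  # keep the microseconds
    },
)
print(resp.status_code, resp.json())
```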


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-5719) Adding Knative Executor support to Airflow

2019-10-22 Thread Ash Berlin-Taylor (Jira)
Ash Berlin-Taylor created AIRFLOW-5719:
--

 Summary: Adding Knative Executor support to Airflow
 Key: AIRFLOW-5719
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5719
 Project: Apache Airflow
  Issue Type: Epic
  Components: executor-kubernetes
Affects Versions: 2.0.0
Reporter: Ash Berlin-Taylor
Assignee: Daniel Imberman






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] ashb commented on a change in pull request #6377: [AIRFLOW-5589] monitor pods by labels instead of names

2019-10-22 Thread GitBox
ashb commented on a change in pull request #6377: [AIRFLOW-5589] monitor pods 
by labels instead of names
URL: https://github.com/apache/airflow/pull/6377#discussion_r337479786
 
 

 ##
 File path: airflow/contrib/operators/kubernetes_pod_operator.py
 ##
 @@ -112,55 +113,55 @@ class KubernetesPodOperator(BaseOperator):  # pylint: 
disable=too-many-instance-
 """
 template_fields = ('cmds', 'arguments', 'env_vars', 'config_file')
 
+@staticmethod
+def create_labels_for_pod(context):
+"""
+Generate labels for the pod s.t. we can track it in case of Operator 
crash
+
+:param context:
+:return:
+"""
+labels = {
+'dag_id': context['dag'].dag_id,
+'task_id': context['task'].task_id,
+'exec_date': context['ts']
+}
+# In the case of sub dags this is just useful
+if context['dag'].parent_dag:
+labels['parent_dag_id'] = context['dag'].parent_dag.dag_id
+# Replace unsupported characters with dashes & trim at 63 chars (max 
allowed)
 
 Review comment:
   This comment probably doesn't reflect what make_safe_label_value does 
anymore, so:
   ```suggestion
   # Ensure that the label is valid for Kube; if not, truncate/remove 
invalid chars and replace them with a short hash.
   ```
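
   For readers outside the diff: Kubernetes label values are limited to 63
characters drawn from alphanumerics plus `-`, `_` and `.`, which is why
execution dates (containing `:` and `+`) need sanitizing. A rough sketch of
the truncate-and-hash idea (assumed behaviour, not the actual
make_safe_label_value implementation):

```python
import hashlib
import re

MAX_LABEL_LEN = 63  # Kubernetes limit for label values

def safe_label(value: str) -> str:
    """Replace characters Kubernetes rejects; if anything changed or the
    value is too long, truncate and append a short hash for uniqueness."""
    safe = re.sub(r"[^a-zA-Z0-9._-]", "-", value)
    if safe == value and len(safe) <= MAX_LABEL_LEN:
        return safe
    digest = hashlib.md5(value.encode()).hexdigest()[:9]
    return safe[: MAX_LABEL_LEN - len(digest) - 1] + "-" + digest

print(safe_label("2019-10-22T00:00:00+00:00"))  # ':' and '+' replaced, hash appended
```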


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-5669) [AIRFLOW-5669] Rename GoogleCloudBaseHook to CloudBaseHook

2019-10-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956983#comment-16956983
 ] 

ASF GitHub Bot commented on AIRFLOW-5669:
-

michalslowikowski00 commented on pull request #6388: [AIRFLOW-5669] Rename 
GoogleCloudBaseHook to CloudBaseHook
URL: https://github.com/apache/airflow/pull/6388
 
 
   Part of AIP-21
   - renamed `GoogleCloudBaseHook` to `CloudBaseHook`
   - added a deprecation warning to `gcp_api_base_hook.py`
   
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-5569
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [AIRFLOW-5669] Rename GoogleCloudBaseHook to CloudBaseHook
> --
>
> Key: AIRFLOW-5669
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5669
> Project: Apache Airflow
>  Issue Type: Sub-task
>  Components: gcp
>Affects Versions: 1.10.5
>Reporter: Michał Słowikowski
>Assignee: Michał Słowikowski
>Priority: Minor
>
> Rename GoogleCloudBaseHook to CloudBaseHook



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] michalslowikowski00 opened a new pull request #6388: [AIRFLOW-5669] Rename GoogleCloudBaseHook to CloudBaseHook

2019-10-22 Thread GitBox
michalslowikowski00 opened a new pull request #6388: [AIRFLOW-5669] Rename 
GoogleCloudBaseHook to CloudBaseHook
URL: https://github.com/apache/airflow/pull/6388
 
 
   Part of AIP-21
   - renamed `GoogleCloudBaseHook` to `CloudBaseHook`
   - added a deprecation warning to `gcp_api_base_hook.py`
   
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-5569
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] ashb commented on a change in pull request #6266: [AIRFLOW-2439] Production Docker image support including refactoring of build scripts - depends on [AIRFLOW-5704]

2019-10-22 Thread GitBox
ashb commented on a change in pull request #6266: [AIRFLOW-2439] Production 
Docker image support including refactoring of build scripts - depends on 
[AIRFLOW-5704]
URL: https://github.com/apache/airflow/pull/6266#discussion_r337465801
 
 

 ##
 File path: setup.py
 ##
 @@ -287,46 +286,75 @@ def write_version(filename: str = 
os.path.join(*["airflow", "git_version"])):
 'jira',
 'mongomock',
 'moto==1.3.5',
+'mypy==0.720',
 'nose',
 'nose-ignore-docstring==0.2',
 'nose-timer',
 'parameterized',
-'paramiko',
 'pre-commit',
 'pylint~=2.3.1',  # to be upgraded after fixing 
https://github.com/PyCQA/pylint/issues/3123
   # We should also disable checking docstring at the 
module level
-'pysftp',
-'pywinrm',
-'qds-sdk>=1.9.6',
 'rednose',
 'requests_mock',
 'yamllint'
 ]
+
+devel = sorted(devel + doc)
+
 

 # IMPORTANT NOTE!!!
 # IF you are removing dependencies from the above list, please make sure that 
you also increase
 # DEPENDENCIES_EPOCH_NUMBER in the Dockerfile
 

 
-if PY3:
-devel += ['mypy==0.720']
-else:
-devel += ['unittest2']
-
 devel_minreq = devel + kubernetes + mysql + doc + password + cgroups
 devel_hadoop = devel_minreq + hive + hdfs + webhdfs + kerberos
-devel_all = (sendgrid + devel + all_dbs + doc + samba + slack + oracle +
- docker + ssh + kubernetes + celery + redis + gcp + grpc +
- datadog + zendesk + jdbc + ldap + kerberos + password + webhdfs + 
jenkins +
- druid + pinot + segment + snowflake + elasticsearch + sentry +
- atlas + azure + aws + salesforce + cgroups + papermill + 
virtualenv)
 
-# Snakebite & Google Cloud Dataflow are not Python 3 compatible :'(
-if PY3:
-devel_ci = [package for package in devel_all if package not in
-['snakebite>=2.7.8', 'snakebite[kerberos]>=2.7.8']]
-else:
-devel_ci = devel_all
+all_packages = (
+async_packages +
+atlas +
+all_dbs +
+aws +
+azure +
+celery +
+cgroups +
+datadog +
+dask +
+databricks +
+docker +
+druid +
+elasticsearch +
+gcp +
+grpc +
+flask_oauth +
+jdbc +
+jenkins +
+kerberos +
+kubernetes +
+ldap +
+oracle +
+papermill +
+password +
+pinot +
+redis +
+salesforce +
+samba +
+sendgrid +
+sentry +
+segment +
+slack +
+snowflake +
+ssh +
+statsd +
+virtualenv +
+webhdfs +
+winrm +
+zendesk
+)
+
+# Snakebite is not Python 3 compatible :'(
+all_packages = [package for package in all_packages if not 
package.startswith('snakebite')]
 
 Review comment:
   There's an open issue for upgrading/replacing snakebite.
   
   Removing Kerberos: eek no, definitely not. Some enterprises need kerberos 
support. If it's broken on Py3 then we need to fix that before 2.0.0 can be 
released.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io edited a comment on issue #6306: [AIRFLOW-5632] Rename ComputeEngine operators

2019-10-22 Thread GitBox
codecov-io edited a comment on issue #6306: [AIRFLOW-5632] Rename ComputeEngine 
operators
URL: https://github.com/apache/airflow/pull/6306#issuecomment-541052532
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6306?src=pr=h1) 
Report
   > :exclamation: No coverage uploaded for pull request base 
(`master@4e661f5`). [Click here to learn what that 
means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6306/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/6306?src=pr=tree)
   
   ```diff
@@            Coverage Diff            @@
##             master    #6306   +/-   ##
=========================================
  Coverage          ?   80.28%          
=========================================
  Files             ?      626          
  Lines             ?    36237          
  Branches          ?        0          
=========================================
  Hits              ?    29094          
  Misses            ?     7143          
  Partials          ?        0          
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6306?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[...irflow/contrib/operators/gcp\_container\_operator.py](https://codecov.io/gh/apache/airflow/pull/6306/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL29wZXJhdG9ycy9nY3BfY29udGFpbmVyX29wZXJhdG9yLnB5)
 | `100% <ø> (ø)` | |
   | 
[airflow/contrib/hooks/gcp\_compute\_hook.py](https://codecov.io/gh/apache/airflow/pull/6306/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL2hvb2tzL2djcF9jb21wdXRlX2hvb2sucHk=)
 | `100% <100%> (ø)` | |
   | 
[airflow/gcp/operators/compute.py](https://codecov.io/gh/apache/airflow/pull/6306/diff?src=pr=tree#diff-YWlyZmxvdy9nY3Avb3BlcmF0b3JzL2NvbXB1dGUucHk=)
 | `98.49% <100%> (ø)` | |
   | 
[airflow/gcp/hooks/compute.py](https://codecov.io/gh/apache/airflow/pull/6306/diff?src=pr=tree#diff-YWlyZmxvdy9nY3AvaG9va3MvY29tcHV0ZS5weQ==)
 | `86.86% <100%> (ø)` | |
   | 
[airflow/contrib/hooks/gcp\_container\_hook.py](https://codecov.io/gh/apache/airflow/pull/6306/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL2hvb2tzL2djcF9jb250YWluZXJfaG9vay5weQ==)
 | `100% <100%> (ø)` | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6306?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6306?src=pr=footer). 
Last update 
[4e661f5...4b2ff7d](https://codecov.io/gh/apache/airflow/pull/6306?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] ashb commented on a change in pull request #6380: [AIRFLOW-3632] Allow replace_microseconds in trigger_dag REST request

2019-10-22 Thread GitBox
ashb commented on a change in pull request #6380: [AIRFLOW-3632] Allow 
replace_microseconds in trigger_dag REST request
URL: https://github.com/apache/airflow/pull/6380#discussion_r337444292
 
 

 ##
 File path: airflow/www/api/experimental/endpoints.py
 ##
 @@ -74,8 +75,12 @@ def trigger_dag(dag_id):
 
 return response
 
+replace_microseconds = True
+if 'replace_microseconds' in data:
+replace_microseconds = to_boolean(data['replace_microseconds'])
+
 try:
-dr = trigger.trigger_dag(dag_id, run_id, conf, execution_date)
+dr = trigger.trigger_dag(dag_id, run_id, conf, execution_date, 
replace_microseconds)
 
 Review comment:
   It doesn't seem needed, other than it's sometimes "nice" to see whole 
seconds. For the API it is probably more sensible to default to un-truncated 
time, but this would be a potentially "breaking" change, so we need to think 
about how we upgrade existing users.
   
   For the 1.10 release series I would probably suggest that the default 
shouldn't change, just in case someone is relying on it.
   
   If you could do this as two PRs (both against master): the first adds the 
behaviour we talked about without changing the default (which we can get into 
the next 1.10.x release; I will cherry-pick the merged PR back to the release 
branch), and then, once that is merged, a second one changes the default to be 
as you describe.
   
   How does that sound?
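
   For reference, the truncation under discussion is plain `datetime`
behaviour, nothing Airflow-specific:

```python
from datetime import datetime

dt = datetime(2019, 10, 22, 12, 30, 45, 123456)
print(dt.isoformat())                         # 2019-10-22T12:30:45.123456
print(dt.replace(microsecond=0).isoformat())  # 2019-10-22T12:30:45
```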


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-5718) Add SFTPToGoogleCloudStorageOperator

2019-10-22 Thread Tobiasz Kedzierski (Jira)
Tobiasz Kedzierski created AIRFLOW-5718:
---

 Summary: Add SFTPToGoogleCloudStorageOperator
 Key: AIRFLOW-5718
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5718
 Project: Apache Airflow
  Issue Type: Sub-task
  Components: gcp, operators
Affects Versions: 1.10.5
Reporter: Tobiasz Kedzierski
Assignee: Tobiasz Kedzierski






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-4971) Add Google Display & Video 360 integration

2019-10-22 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956879#comment-16956879
 ] 

ASF subversion and git services commented on AIRFLOW-4971:
--

Commit 16d7accb22c866d4fbf368e4d979dc1c4a41d93c in airflow's branch 
refs/heads/master from Tomek
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=16d7acc ]

[AIRFLOW-4971] Add Google Display & Video 360 integration (#6170)



> Add Google Display & Video 360 integration
> --
>
> Key: AIRFLOW-4971
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4971
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 1.10.3
>Reporter: Kamil Bregula
>Priority: Major
>
> Hi
> This project lacks integration with the Google Display & Video 360 service. I 
> would be happy if Airflow had proper operators and hooks that integrate with 
> this service.
> Product Documentation: 
> https://developers.google.com/bid-manager/guides/getting-started-api
> API Documentation: 
> https://developers.google.com/bid-manager/guides/getting-started-api
> Lots of love



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-4971) Add Google Display & Video 360 integration

2019-10-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956878#comment-16956878
 ] 

ASF GitHub Bot commented on AIRFLOW-4971:
-

mik-laj commented on pull request #6170: [AIRFLOW-4971] Add Google Display & 
Video 360 integration
URL: https://github.com/apache/airflow/pull/6170
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Google Display & Video 360 integration
> --
>
> Key: AIRFLOW-4971
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4971
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 1.10.3
>Reporter: Kamil Bregula
>Priority: Major
>
> Hi
> This project lacks integration with the Google Display & Video 360 service. I 
> would be happy if Airflow had proper operators and hooks that integrate with 
> this service.
> Product Documentation: 
> https://developers.google.com/bid-manager/guides/getting-started-api
> API Documentation: 
> https://developers.google.com/bid-manager/guides/getting-started-api
> Lots of love



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (AIRFLOW-5717) Add get_tree_map method to SFTPHook

2019-10-22 Thread Tobiasz Kedzierski (Jira)
Tobiasz Kedzierski created AIRFLOW-5717:
---

 Summary: Add get_tree_map method to SFTPHook
 Key: AIRFLOW-5717
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5717
 Project: Apache Airflow
  Issue Type: Improvement
  Components: hooks
Affects Versions: 1.10.5
Reporter: Tobiasz Kedzierski
Assignee: Tobiasz Kedzierski






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

