[GitHub] [airflow] feng-tao commented on issue #5160: [AIRFLOW-4394] Don't double test behaviour of BackfillJob from CLI tests

2019-04-23 Thread GitBox
feng-tao commented on issue #5160: [AIRFLOW-4394] Don't double test behaviour 
of BackfillJob from CLI tests
URL: https://github.com/apache/airflow/pull/5160#issuecomment-486077899
 
 
   Nice, do we know how much we expect to save ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-4402) Update super() calls for nvd3

2019-04-23 Thread Tao Feng (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824830#comment-16824830
 ] 

Tao Feng commented on AIRFLOW-4402:
---

[~milton0825] didn't quite follow all the prs recently. But I found what you 
are talking about. Just raise the questions in that pr as I am not 100% sure.

> Update super() calls for nvd3
> -
>
> Key: AIRFLOW-4402
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4402
> Project: Apache Airflow
>  Issue Type: Sub-task
>Reporter: Chao-Han Tsai
>Assignee: Chao-Han Tsai
>Priority: Major
> Fix For: 2.0.0
>
>
> In all classes under nvd3, replace {{super(__class__, self).__init__(...)}} 
> by {{super().__init__(...)}}
> Similarly for any other {{super}} calls for other methods.
> (In Python 3 {{super(__class__, self) == super()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] codecov-io edited a comment on issue #5167: Reliable SynchronizedQueue used instead of multiprocessing.Queue

2019-04-23 Thread GitBox
codecov-io edited a comment on issue #5167: Reliable SynchronizedQueue used 
instead of multiprocessing.Queue
URL: https://github.com/apache/airflow/pull/5167#issuecomment-486068982
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=h1) 
Report
   > Merging 
[#5167](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/de1795c8f1e65a5f4a848e88ef3848ba51aab449?src=pr=desc)
 will **increase** coverage by `0.01%`.
   > The diff coverage is `91.66%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5167/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master#5167  +/-   ##
   ==
   + Coverage   78.44%   78.46%   +0.01% 
   ==
 Files 466  467   +1 
 Lines   2973529762  +27 
   ==
   + Hits2332723354  +27 
 Misses   6408 6408
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/executors/base\_executor.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvYmFzZV9leGVjdXRvci5weQ==)
 | `95.58% <ø> (-0.07%)` | :arrow_down: |
   | 
[airflow/executors/local\_executor.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvbG9jYWxfZXhlY3V0b3IucHk=)
 | `81.25% <100%> (+0.19%)` | :arrow_up: |
   | 
[airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==)
 | `59.22% <100%> (+0.1%)` | :arrow_up: |
   | 
[airflow/contrib/executors/kubernetes\_executor.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL2V4ZWN1dG9ycy9rdWJlcm5ldGVzX2V4ZWN1dG9yLnB5)
 | `63.38% <100%> (+0.1%)` | :arrow_up: |
   | 
[airflow/jobs.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzLnB5)
 | `78.53% <50%> (+0.01%)` | :arrow_up: |
   | 
[airflow/utils/synchronized\_queue.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zeW5jaHJvbml6ZWRfcXVldWUucHk=)
 | `92.3% <92.3%> (ø)` | |
   | 
[airflow/models/taskinstance.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvdGFza2luc3RhbmNlLnB5)
 | `92.59% <0%> (+0.17%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=footer). 
Last update 
[de1795c...9494571](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io edited a comment on issue #5167: Reliable SynchronizedQueue used instead of multiprocessing.Queue

2019-04-23 Thread GitBox
codecov-io edited a comment on issue #5167: Reliable SynchronizedQueue used 
instead of multiprocessing.Queue
URL: https://github.com/apache/airflow/pull/5167#issuecomment-486068982
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=h1) 
Report
   > Merging 
[#5167](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/de1795c8f1e65a5f4a848e88ef3848ba51aab449?src=pr=desc)
 will **increase** coverage by `0.01%`.
   > The diff coverage is `91.66%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5167/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master#5167  +/-   ##
   ==
   + Coverage   78.44%   78.46%   +0.01% 
   ==
 Files 466  467   +1 
 Lines   2973529762  +27 
   ==
   + Hits2332723354  +27 
 Misses   6408 6408
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/executors/base\_executor.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvYmFzZV9leGVjdXRvci5weQ==)
 | `95.58% <ø> (-0.07%)` | :arrow_down: |
   | 
[airflow/executors/local\_executor.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvbG9jYWxfZXhlY3V0b3IucHk=)
 | `81.25% <100%> (+0.19%)` | :arrow_up: |
   | 
[airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==)
 | `59.22% <100%> (+0.1%)` | :arrow_up: |
   | 
[airflow/contrib/executors/kubernetes\_executor.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL2V4ZWN1dG9ycy9rdWJlcm5ldGVzX2V4ZWN1dG9yLnB5)
 | `63.38% <100%> (+0.1%)` | :arrow_up: |
   | 
[airflow/jobs.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzLnB5)
 | `78.53% <50%> (+0.01%)` | :arrow_up: |
   | 
[airflow/utils/synchronized\_queue.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zeW5jaHJvbml6ZWRfcXVldWUucHk=)
 | `92.3% <92.3%> (ø)` | |
   | 
[airflow/models/taskinstance.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvdGFza2luc3RhbmNlLnB5)
 | `92.59% <0%> (+0.17%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=footer). 
Last update 
[de1795c...9494571](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feng-tao edited a comment on issue #5019: [AIRFLOW-4199][AIP-3-step3] Remove all sys version info

2019-04-23 Thread GitBox
feng-tao edited a comment on issue #5019: [AIRFLOW-4199][AIP-3-step3] Remove 
all sys version info
URL: https://github.com/apache/airflow/pull/5019#issuecomment-486072340
 
 
   @kaxil, @potiuk , @ashb, @bolkedebruin  are we supposed to change the vendor 
files? I thought files under vendor dir are meant for copy paste from upstream 
without any change? Or I understand incorrectly?
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feng-tao commented on issue #5019: [AIRFLOW-4199][AIP-3-step3] Remove all sys version info

2019-04-23 Thread GitBox
feng-tao commented on issue #5019: [AIRFLOW-4199][AIP-3-step3] Remove all sys 
version info
URL: https://github.com/apache/airflow/pull/5019#issuecomment-486072369
 
 
   cc @milton0825 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feng-tao commented on issue #5019: [AIRFLOW-4199][AIP-3-step3] Remove all sys version info

2019-04-23 Thread GitBox
feng-tao commented on issue #5019: [AIRFLOW-4199][AIP-3-step3] Remove all sys 
version info
URL: https://github.com/apache/airflow/pull/5019#issuecomment-486072340
 
 
   @kaxil, @potiuk , @ashb, @bolkedebruin  are we supposed to change the vendor 
files(https://github.com/apache/airflow/commit/e36bdef0b34c16def20ecbb8248950070eb5fa33)?
 I thought files under vendor dir are meant for copy paste from upstream? Or I 
understand incorrectly.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-4402) Update super() calls for nvd3

2019-04-23 Thread Chao-Han Tsai (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824823#comment-16824823
 ] 

Chao-Han Tsai commented on AIRFLOW-4402:


Hmm i have seen changes on those files.

> Update super() calls for nvd3
> -
>
> Key: AIRFLOW-4402
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4402
> Project: Apache Airflow
>  Issue Type: Sub-task
>Reporter: Chao-Han Tsai
>Assignee: Chao-Han Tsai
>Priority: Major
> Fix For: 2.0.0
>
>
> In all classes under nvd3, replace {{super(__class__, self).__init__(...)}} 
> by {{super().__init__(...)}}
> Similarly for any other {{super}} calls for other methods.
> (In Python 3 {{super(__class__, self) == super()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] codecov-io edited a comment on issue #5167: Reliable SynchronizedQueue used instead of multiprocessing.Queue

2019-04-23 Thread GitBox
codecov-io edited a comment on issue #5167: Reliable SynchronizedQueue used 
instead of multiprocessing.Queue
URL: https://github.com/apache/airflow/pull/5167#issuecomment-486068982
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=h1) 
Report
   > Merging 
[#5167](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/de1795c8f1e65a5f4a848e88ef3848ba51aab449?src=pr=desc)
 will **increase** coverage by `0.01%`.
   > The diff coverage is `91.66%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5167/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master#5167  +/-   ##
   ==
   + Coverage   78.44%   78.46%   +0.01% 
   ==
 Files 466  467   +1 
 Lines   2973529762  +27 
   ==
   + Hits2332723353  +26 
   - Misses   6408 6409   +1
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/executors/base\_executor.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvYmFzZV9leGVjdXRvci5weQ==)
 | `95.58% <ø> (-0.07%)` | :arrow_down: |
   | 
[airflow/executors/local\_executor.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvbG9jYWxfZXhlY3V0b3IucHk=)
 | `81.25% <100%> (+0.19%)` | :arrow_up: |
   | 
[airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==)
 | `59.22% <100%> (+0.1%)` | :arrow_up: |
   | 
[airflow/contrib/executors/kubernetes\_executor.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL2V4ZWN1dG9ycy9rdWJlcm5ldGVzX2V4ZWN1dG9yLnB5)
 | `63.38% <100%> (+0.1%)` | :arrow_up: |
   | 
[airflow/jobs.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzLnB5)
 | `78.53% <50%> (+0.01%)` | :arrow_up: |
   | 
[airflow/utils/synchronized\_queue.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zeW5jaHJvbml6ZWRfcXVldWUucHk=)
 | `92.3% <92.3%> (ø)` | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=footer). 
Last update 
[de1795c...9494571](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] milton0825 commented on issue #5143: [AIRFLOW-4204] Update super() calls to PY3

2019-04-23 Thread GitBox
milton0825 commented on issue #5143: [AIRFLOW-4204] Update super() calls to PY3
URL: https://github.com/apache/airflow/pull/5143#issuecomment-486068967
 
 
   Removed the second commit and created a separate PR: 
https://github.com/apache/airflow/pull/5168


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io commented on issue #5167: Reliable SynchronizedQueue used instead of multiprocessing.Queue

2019-04-23 Thread GitBox
codecov-io commented on issue #5167: Reliable SynchronizedQueue used instead of 
multiprocessing.Queue
URL: https://github.com/apache/airflow/pull/5167#issuecomment-486068982
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=h1) 
Report
   > Merging 
[#5167](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/de1795c8f1e65a5f4a848e88ef3848ba51aab449?src=pr=desc)
 will **increase** coverage by `0.01%`.
   > The diff coverage is `91.66%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5167/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master#5167  +/-   ##
   ==
   + Coverage   78.44%   78.46%   +0.01% 
   ==
 Files 466  467   +1 
 Lines   2973529762  +27 
   ==
   + Hits2332723353  +26 
   - Misses   6408 6409   +1
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/executors/base\_executor.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvYmFzZV9leGVjdXRvci5weQ==)
 | `95.58% <ø> (-0.07%)` | :arrow_down: |
   | 
[airflow/executors/local\_executor.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvbG9jYWxfZXhlY3V0b3IucHk=)
 | `81.25% <100%> (+0.19%)` | :arrow_up: |
   | 
[airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==)
 | `59.22% <100%> (+0.1%)` | :arrow_up: |
   | 
[airflow/contrib/executors/kubernetes\_executor.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL2V4ZWN1dG9ycy9rdWJlcm5ldGVzX2V4ZWN1dG9yLnB5)
 | `63.38% <100%> (+0.1%)` | :arrow_up: |
   | 
[airflow/jobs.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzLnB5)
 | `78.53% <50%> (+0.01%)` | :arrow_up: |
   | 
[airflow/utils/synchronized\_queue.py](https://codecov.io/gh/apache/airflow/pull/5167/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zeW5jaHJvbml6ZWRfcXVldWUucHk=)
 | `92.3% <92.3%> (ø)` | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=footer). 
Last update 
[de1795c...9494571](https://codecov.io/gh/apache/airflow/pull/5167?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-4402) Update super() calls for nvd3

2019-04-23 Thread Tao Feng (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824820#comment-16824820
 ] 

Tao Feng commented on AIRFLOW-4402:
---

hold on, I don't think we should change any of the files under vendor folder as 
all the file under that folder meant to be copy paste from upstream dependency. 

> Update super() calls for nvd3
> -
>
> Key: AIRFLOW-4402
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4402
> Project: Apache Airflow
>  Issue Type: Sub-task
>Reporter: Chao-Han Tsai
>Assignee: Chao-Han Tsai
>Priority: Major
> Fix For: 2.0.0
>
>
> In all classes under nvd3, replace {{super(__class__, self).__init__(...)}} 
> by {{super().__init__(...)}}
> Similarly for any other {{super}} calls for other methods.
> (In Python 3 {{super(__class__, self) == super()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] milton0825 opened a new pull request #5168: [AIRFLOW-4204] Update super() calls to PY3 for nvd3

2019-04-23 Thread GitBox
milton0825 opened a new pull request #5168: [AIRFLOW-4204] Update super() calls 
to PY3 for nvd3
URL: https://github.com/apache/airflow/pull/5168
 
 
   (cherry picked from commit 52edaebf1a043120f8a9882b768fea0cf2270db1)
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to a appropriate release
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-4401) multiprocessing.Queue.empty() is unreliable

2019-04-23 Thread Jarek Potiuk (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-4401:
--
Description: 
After discussing with [~ash] and [~BasPH] potential reasons for flakiness of 
LocalExecutor tests, I took a deeper dive into the problem and what I found 
raised the remaining hair on top of my head. 

We had a number of flaky tests in the local executor that resulted in 
result_queue not being empty where it should have been emptied a moment before. 
More details and discussion can be found in 
[https://github.com/apache/airflow/pull/5159] 

The problem turned out to be ... unreliability of multiprocessing.Queue empty() 
implementation. It turned out that multiprocessing.Queue.empty() implementation 
is not fully synchronized and it might return True even if put() operation has 
been already completed in another process. What's more - empty() might return 
True even if qsize() of the queue returns > 0 (!) 

It's a bit mind-boggling but it is "as intended' as documented in 
[https://bugs.python.org/issue23582]  (resolved as "not a bug" ) 

A few people have stumbled upon this problem. For example 
[https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
 and [https://github.com/keras-team/autokeras/issues/368] 

Also we seemed to experienced that in Airflow before. In jobs.py years ago 
(31.07.2016) - we can see the comment below (but we used multiprocessing.queue 
empty() nevertheless):
{code:java}
# Not using multiprocessing.Queue() since it's no longer a separate
# process and due to some unusual behavior. (empty() incorrectly
# returns true?){code}
The solution available in [https://bugs.python.org/issue23582]  using qsize() 
was working on Linux but is not really acceptable because qsize() does not work 
on MacOS (throws NotImplementedError).

The working solution is to implement a reliable queue (SynchronizedQueue) based 
on 
[https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
 (butwith a twist that __init__ of class deriving from Queue has to be changed 
for python 3.4+ as described in 
[https://stackoverflow.com/questions/24941359/ctx-parameter-in-multiprocessing-queue].

Luckily we are now Python3.5+

We should replace all usages of multiprocessing.Queue where empty() is used 
with the SynchronizedQueue. And make sure we do not use multiprocessing.Queue 
in similar way in the future.

 

 

 

  was:
After discussing with [~ash] and [~BasPH] potential reasons for flakiness of 
LocalExecutor tests, I took a deeper dive into the problem and what I found 
raised the remaining hair on top of my head. 

We had a number of flaky tests in the local executor that resulted in 
result_queue not being empty where it should have been emptied a moment before. 
More details and discussion can be found in 
[https://github.com/apache/airflow/pull/5159] 

The problem turned out to be ... unreliability of multiprocessing.Queue empty() 
implementation. It turned out that multiprocessing.Queue.empty() implementation 
is not fully synchronized and it might return True even if put() operation has 
been already completed in another process. What's more - empty() might return 
True even if qsize() of the queue returns > 0 (!) 

It's a bit mind-boggling but it is "as intended' as documented in 
[https://bugs.python.org/issue23582]  (resolved as "not a bug" ) 

A few people have stumbled upon this problem. For example 
[https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
 and [https://github.com/keras-team/autokeras/issues/368] 

Also we seemed to experienced that in Airflow before. In jobs.py years ago 
(31.07.2016) - we can see the comment below (but we used multiprocessing.queue 
empty() nevertheless):
{code:java}
# Not using multiprocessing.Queue() since it's no longer a separate
# process and due to some unusual behavior. (empty() incorrectly
# returns true?){code}
The solution available in [https://bugs.python.org/issue23582]  using qsize() 
was not usable because qsize() does not work on MacOS (throws 
NotImplementedError).

The working solution is to implement a reliable queue (SynchronizedQueue) based 
on 
[https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
 (butwith a twist that __init__ of class deriving from Queue has to be changed 
for python 3.4+ as described in 
[https://stackoverflow.com/questions/24941359/ctx-parameter-in-multiprocessing-queue].

Luckily we are now Python3.5+

We should replace all usages of multiprocessing.Queue where empty() is used 
with the SynchronizedQueue. And make sure we do not use multiprocessing.Queue 
in similar way in the future.

 

 

 


> multiprocessing.Queue.empty() is unreliable
> ---
>
> Key: AIRFLOW-4401
> URL: 

[jira] [Updated] (AIRFLOW-4402) Update super() calls for nvd3

2019-04-23 Thread Chao-Han Tsai (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao-Han Tsai updated AIRFLOW-4402:
---
Description: 
In all classes under nvd3, replace {{super(__class__, self).__init__(...)}} by 
{{super().__init__(...)}}

Similarly for any other {{super}} calls for other methods.

(In Python 3 {{super(__class__, self) == super()}})

  was:
In all classes, replace {{super(__class__, self).__init__(...)}} by 
{{super().__init__(...)}}

Similarly for any other {{super}} calls for other methods.

(In Python 3 {{super(__class__, self) == super()}})


> Update super() calls for nvd3
> -
>
> Key: AIRFLOW-4402
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4402
> Project: Apache Airflow
>  Issue Type: Sub-task
>Reporter: Chao-Han Tsai
>Assignee: Chao-Han Tsai
>Priority: Major
> Fix For: 2.0.0
>
>
> In all classes under nvd3, replace {{super(__class__, self).__init__(...)}} 
> by {{super().__init__(...)}}
> Similarly for any other {{super}} calls for other methods.
> (In Python 3 {{super(__class__, self) == super()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-4402) Update super() calls for nvd3

2019-04-23 Thread Chao-Han Tsai (JIRA)
Chao-Han Tsai created AIRFLOW-4402:
--

 Summary: Update super() calls for nvd3
 Key: AIRFLOW-4402
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4402
 Project: Apache Airflow
  Issue Type: Sub-task
Reporter: Chao-Han Tsai
Assignee: Chao-Han Tsai
 Fix For: 2.0.0


In all classes, replace {{super(__class__, self).__init__(...)}} by 
{{super().__init__(...)}}

Similarly for any other {{super}} calls for other methods.

(In Python 3 {{super(__class__, self) == super()}})



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-4401) multiprocessing.Queue.empty() is unreliable

2019-04-23 Thread Jarek Potiuk (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-4401:
--
Description: 
After discussing with [~ash] and [~BasPH] potential reasons for flakiness of 
LocalExecutor tests, I took a deeper dive into the problem and what I found 
raised the remaining hair on top of my head. 

We had a number of flaky tests in the local executor that resulted in 
result_queue not being empty where it should have been emptied a moment before. 
More details and discussion can be found in 
[https://github.com/apache/airflow/pull/5159] 

The problem turned out to be ... unreliability of multiprocessing.Queue empty() 
implementation. It turned out that multiprocessing.Queue.empty() implementation 
is not fully synchronized and it might return True even if put() operation has 
been already completed in another process. What's more - empty() might return 
True even if qsize() of the queue returns > 0 (!) 

It's a bit mind-boggling but it is "as intended' as documented in 
[https://bugs.python.org/issue23582]  (resolved as "not a bug" ) 

A few people have stumbled upon this problem. For example 
[https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
 and [https://github.com/keras-team/autokeras/issues/368] 

Also we seemed to experienced that in Airflow before. In jobs.py years ago 
(31.07.2016) - we can see the comment below (but we used multiprocessing.queue 
empty() nevertheless):
{code:java}
# Not using multiprocessing.Queue() since it's no longer a separate
# process and due to some unusual behavior. (empty() incorrectly
# returns true?){code}
The solution available in [https://bugs.python.org/issue23582]  using qsize() 
was not usable because qsize() does not work on MacOS (throws 
NotImplementedError).

The working solution is to implement a reliable queue (SynchronizedQueue) based 
on 
[https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
 (butwith a twist that __init__ of class deriving from Queue has to be changed 
for python 3.4+ as described in 
[https://stackoverflow.com/questions/24941359/ctx-parameter-in-multiprocessing-queue].

Luckily we are now Python3.5+

We should replace all usages of multiprocessing.Queue where empty() is used 
with the SynchronizedQueue. And make sure we do not use multiprocessing.Queue 
in similar way in the future.

 

 

 

  was:
After discussing with [~ash] and [~BasPH] potential reasons for flakiness of 
LocalExecutor tests, I took a deeper dive into the problem and what I found 
raised the remaining hair on top of my head. 

We had a number of flaky tests in the local executor that resulted in 
result_queue not being empty where it should have been emptied a moment before. 
More details and discussion can be found in 
[https://github.com/apache/airflow/pull/5159] 

The problem turned out to be ... unreliability of multiprocessing.Queue empty() 
implementation. It turned out that multiprocessing.Queue.empty() implementation 
is not fully synchronized and it might return True even if put() operation has 
been already completed in another process. What's more - empty() might return 
True even if qsize() of the queue returns > 0 (!) 

It's a bit mind-boggling but it is "as intended' as documented in 
[https://bugs.python.org/issue23582]  (resolved as "not a bug" ) 

A few people have stumbled upon this problem. For example 
[https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
 and [https://github.com/keras-team/autokeras/issues/368] 

Also we seemed to experienced that in Airflow before. In jobs.py year ago 
(31.07.2016) - we can see the comment below (but we used multiprocessing.queue 
empty() nevertheless):
{code:java}
# Not using multiprocessing.Queue() since it's no longer a separate
# process and due to some unusual behavior. (empty() incorrectly
# returns true?){code}
The solution available in [https://bugs.python.org/issue23582]  using qsize() 
was not usable because qsize() does not work on MacOS (throws 
NotImplementedError).

The working solution is to implement a reliable queue (SynchronizedQueue) based 
on 
[https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
 (butwith a twist that __init__ of class deriving from Queue has to be changed 
for python 3.4+ as described in 
[https://stackoverflow.com/questions/24941359/ctx-parameter-in-multiprocessing-queue].

Luckily we are now Python3.5+

We should replace all usages of multiprocessing.Queue where empty() is used 
with the SynchronizedQueue. And make sure we do not use multiprocessing.Queue 
in similar way in the future.

 

 

 


> multiprocessing.Queue.empty() is unreliable
> ---
>
> Key: AIRFLOW-4401
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4401
> Project: Apache 

[GitHub] [airflow] milton0825 commented on issue #5041: [AIRFLOW-4208] Replace @abstractproperty by @abstractmethod

2019-04-23 Thread GitBox
milton0825 commented on issue #5041: [AIRFLOW-4208] Replace @abstractproperty 
by @abstractmethod
URL: https://github.com/apache/airflow/pull/5041#issuecomment-486066778
 
 
   Shall we merge this?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] jmcarp commented on issue #4821: [AIRFLOW-3998] Use nested commands in cli.

2019-04-23 Thread GitBox
jmcarp commented on issue #4821: [AIRFLOW-3998] Use nested commands in cli.
URL: https://github.com/apache/airflow/pull/4821#issuecomment-486065962
 
 
   Updated documentation to match the new CLI structure.
   
   What's the next step here? A vote on the AIP?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-4401) multiprocessing.Queue.empty() is unreliable

2019-04-23 Thread Jarek Potiuk (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-4401:
--
Description: 
After discussing with [~ash] and [~BasPH] potential reasons for flakiness of 
LocalExecutor tests, I took a deeper dive into the problem and what I found 
raised the remaining hair on top of my head. 

We had a number of flaky tests in the local executor that resulted in 
result_queue not being empty where it should have been emptied a moment before. 
More details and discussion can be found in 
[https://github.com/apache/airflow/pull/5159] 

The problem turned out to be ... unreliability of multiprocessing.Queue empty() 
implementation. It turned out that multiprocessing.Queue.empty() implementation 
is not fully synchronized and it might return True even if put() operation has 
been already completed in another process. What's more - empty() might return 
True even if qsize() of the queue returns > 0 (!) 

It's a bit mind-boggling but it is "as intended' as documented in 
[https://bugs.python.org/issue23582]  (resolved as "not a bug" ) 

A few people have stumbled upon this problem. For example 
[https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
 and [https://github.com/keras-team/autokeras/issues/368] 

Also we seemed to experienced that in Airflow before. In jobs.py year ago 
(31.07.2016) - we can see the comment below (but we used multiprocessing.queue 
empty() nevertheless):
{code:java}
# Not using multiprocessing.Queue() since it's no longer a separate
# process and due to some unusual behavior. (empty() incorrectly
# returns true?){code}
The solution available in [https://bugs.python.org/issue23582]  using qsize() 
was not usable because qsize() does not work on MacOS (throws 
NotImplementedError).

The working solution is to implement a reliable queue (SynchronizedQueue) based 
on 
[https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
 (butwith a twist that __init__ of class deriving from Queue has to be changed 
for python 3.4+ as described in 
[https://stackoverflow.com/questions/24941359/ctx-parameter-in-multiprocessing-queue].

Luckily we are now Python3.5+

We should replace all usages of multiprocessing.Queue where empty() is used 
with the SynchronizedQueue. And make sure we do not use multiprocessing.Queue 
in similar way in the future.

 

 

 

  was:
After discussing with [~ash] and [~BasPH] potential reasons for flakiness of 
LocalExecutor tests, I took a deeper dive into the problem and what I found 
raised the remaining hair on top of my head. 

We had a number of flaky tests in the local executor that resulted in 
result_queue not being empty where it should have been emptied a moment before. 
More details and discussion can be found in 
[https://github.com/apache/airflow/pull/5159] 

The problem turned out to be ... unreliability of multiprocessing.Queue empty() 
implementation. Itt turned out that multiprocessing.Queue.empty() 
implementation is not fully synchronized and it might return True even if put() 
operation has been already completed in another thread. What's more - empty() 
might return True even if qsize() of the queue returns > 0 (!) 

It's a bit mind-boggling but it is "as intended' as documented in 
[https://bugs.python.org/issue23582]  (resolved as "not a bug" ) 

A few people have stumbled upon this problem. For example 
[https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
 and [https://github.com/keras-team/autokeras/issues/368] 

Also we seemed to experienced that in Airflow before. In jobs.py year ago 
(31.07.2016) - we can see the comment below (but we used multiprocessing.queue 
empty() nevertheless):
{code:java}
# Not using multiprocessing.Queue() since it's no longer a separate
# process and due to some unusual behavior. (empty() incorrectly
# returns true?){code}
The solution available in [https://bugs.python.org/issue23582]  using qsize() 
was not usable because qsize() does not work on MacOS (throws 
NotImplementedError).

The working solution is to implement a reliable queue (SynchronizedQueue) based 
on 
[https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
 (butwith a twist that __init__ of class deriving from Queue has to be changed 
for python 3.4+ as described in 
[https://stackoverflow.com/questions/24941359/ctx-parameter-in-multiprocessing-queue].

Luckily we are now Python3.5+

We should replace all usages of multiprocessing.Queue where empty() is used 
with the SynchronizedQueue. And make sure we do not use multiprocessing.Queue 
in similar way in the future.

 

 

 


> multiprocessing.Queue.empty() is unreliable
> ---
>
> Key: AIRFLOW-4401
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4401
> Project: Apache 

[GitHub] [airflow] potiuk commented on a change in pull request #5166: [AIRFLOW-4397] Add GCSUploadSessionCompleteSensor

2019-04-23 Thread GitBox
potiuk commented on a change in pull request #5166: [AIRFLOW-4397] Add 
GCSUploadSessionCompleteSensor
URL: https://github.com/apache/airflow/pull/5166#discussion_r277952601
 
 

 ##
 File path: airflow/contrib/sensors/gcs_sensor.py
 ##
 @@ -160,3 +162,105 @@ def poke(self, context):
 google_cloud_storage_conn_id=self.google_cloud_conn_id,
 delegate_to=self.delegate_to)
 return bool(hook.list(self.bucket, prefix=self.prefix))
+
+
+class GoogleCloudStorageUploadSessionCompleteSensor(BaseSensorOperator):
+"""
+Checks for changes in the number of files at prefix in Google Cloud Storage
+bucket and returns True if the inactivity period has passed with no
+increase in the number of files. Note, it is recommended to use reschedule
+mode if you expect this sensor to run for hours.
+
+:param bucket: The Google cloud storage bucket where the objects are.
+expected.
+:type bucket: str
+:param prefix: The name of the prefix to check in the Google cloud
+storage bucket.
+:param inactivity_period: The total seconds of inactivity to designate
+an upload session is over. Note, this mechanism is not real time and
+this operator may not return until a poke_interval after this period
+has passed with no additional files sensed.
+:type inactivity_period: int
+:param min_files: The minimum number of files needed for upload session
+to be considered valid.
+:type min_files: int
+:param previous_num_files: The previous number of files before the next
+iteration.
+:type previous_num_files: int
+:param inactivity_seconds: The current seconds of the inactivity period.
+:type inactivity_seconds: int
+:param google_cloud_conn_id: The connection ID to use when connecting
+to Google cloud storage.
+:type google_cloud_conn_id: str
+:param delegate_to: The account to impersonate, if any. For this to work,
+the service account making the request must have domain-wide
+delegation enabled.
+:type delegate_to: str
+"""
+
+template_fields = ('bucket', 'prefix')
+ui_color = '#f0eee4'
+
+@apply_defaults
+def __init__(self,
+ bucket,
+ prefix,
+ inactivity_period=60 * 60,
+ min_files=1,
+ previous_num_files=0,
+ google_cloud_conn_id='google_cloud_default',
+ delegate_to=None,
+ *args, **kwargs):
+
+super(GoogleCloudStorageUploadSessionCompleteSensor, 
self).__init__(*args, **kwargs)
+
+self.bucket = bucket
+self.prefix = prefix
+self.google_cloud_conn_id = google_cloud_conn_id
+self.inactivity_period = inactivity_period
+self.min_files = min_files
+self.previous_num_files = previous_num_files
+self.inactivity_seconds = 0
+self.google_cloud_conn_id = google_cloud_conn_id
+self.delegate_to = delegate_to
+
+def is_bucket_updated(self, current_num_files):
+"""
+Checks whether new files have been uploaded and the inactivity_period
+has passed and updates the state of the sensor accordingly.
+
+:param current_num_files: number of files in bucket during last poke.
+:type current_num_files: int
+"""
+
+if current_num_files > self.previous_num_files:
+# When new files arrived, reset the inactivity_seconds
+# previous_num_files for the next poke.
+self.inactivity_seconds = 0
+self.previous_num_files = current_num_files
+else:
+self.inactivity_seconds = self.inactivity_seconds \
++ self.poke_interval
 
 Review comment:
   I think it would have been better to use the actual system time for 
inactivity seconds calculation (current_time - last_activity_time). Using 
poke_interval - especially multiple times - does not take into account 
potential scheduling delays. There is no guarantee that the "is_bucket_updated" 
will be always called exactly after poke_interval seconds. It will be a bit 
more difficult for unit testing but it can be done with mocking the 
current_time.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] jmcarp commented on issue #5030: [AIRFLOW-4227] Use python3-style type annotations.

2019-04-23 Thread GitBox
jmcarp commented on issue #5030: [AIRFLOW-4227] Use python3-style type 
annotations.
URL: https://github.com/apache/airflow/pull/5030#issuecomment-486064887
 
 
   This should be ready to go now that #5090 is finished. cc @Fokko @kaxil 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-4401) multiprocessing.Queue is unreliable

2019-04-23 Thread Jarek Potiuk (JIRA)
Jarek Potiuk created AIRFLOW-4401:
-

 Summary: multiprocessing.Queue is unreliable
 Key: AIRFLOW-4401
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4401
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Jarek Potiuk


After discussing with [~ash] and [~BasPH] potential reasons for flakiness of 
LocalExecutor tests, I took a deeper dive into the problem and what I found 
raised the remaining hair on top of my head. 

We had a number of flaky tests in the local executor that resulted in 
result_queue not being empty where it should have been emptied a moment before. 
More details and discussion can be found in 
[https://github.com/apache/airflow/pull/5159] 

The problem turned out to be ... unreliability of multiprocessing.Queue empty() 
implementation. Itt turned out that multiprocessing.Queue.empty() 
implementation is not fully synchronized and it might return True even if put() 
operation has been already completed in another thread. What's more - empty() 
might return True even if qsize() of the queue returns > 0 (!) 

It's a bit mind-boggling but it is "as intended' as documented in 
[https://bugs.python.org/issue23582]  (resolved as "not a bug" ) 

A few people have stumbled upon this problem. For example 
[https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
 and [https://github.com/keras-team/autokeras/issues/368] 

The solution available in [https://bugs.python.org/issue23582]  using qsize() 
was not usable because qsize() does not work on MacOS (throws 
NotImplementedError).

The working solution is to implement a reliable queue (SynchronizedQueue) based 
on 
[https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
 (butwith a twist that __init__ of class deriving from Queue has to be changed 
for python 3.4+ as described in 
[https://stackoverflow.com/questions/24941359/ctx-parameter-in-multiprocessing-queue].

Luckily we are now Python3.5+

We should replace all usages of multiprocessing.Queue where empty() is used 
with the SynchronizedQueue

 

 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] potiuk edited a comment on issue #5159: [AIRFLOW-XXX] Attempt to remove flakeyness of LocalExecutor

2019-04-23 Thread GitBox
potiuk edited a comment on issue #5159: [AIRFLOW-XXX] Attempt to remove 
flakeyness of LocalExecutor
URL: https://github.com/apache/airflow/pull/5159#issuecomment-486063506
 
 
   Hey @BasPH @ashb . I woke up this morning and could not stop thinking about 
that one and I took a dep dive. It was great idea @BasPH to make it 
reproducible locally, that has helped me a lot. I added a lot of logging and 
eventually it boiled down to cases where queue.empty() returned True where 
queue.qsize() returned 1.
   
   I thought I lost my mind for a while, but well ... It turned out that this 
this is INTENDED BEHAVIOUR of multiprocessing.Queue. See the issue 
https://bugs.python.org/issue23582 (resolved as "not a bug").
   
   I have no words for it, but well, we needed to solve it. Unfortunately 
simple solution does not work on MacOS (qsize() does not work on MacOS and 
throws NotImplemented) ... So at the end I had to do what others did in the 
same case - have a custom SynchronizedQueue deriving from multiprocessing.Queue 
and add extra synchronisation/counter to put/get methods :(
   
   I described all details in 
https://issues.apache.org/jira/browse/AIRFLOW-4401 and created a PR to fix it 
here: https://github.com/apache/airflow/pull/5167
   
   Speaking about productive morning :).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #5159: [AIRFLOW-XXX] Attempt to remove flakeyness of LocalExecutor

2019-04-23 Thread GitBox
potiuk commented on issue #5159: [AIRFLOW-XXX] Attempt to remove flakeyness of 
LocalExecutor
URL: https://github.com/apache/airflow/pull/5159#issuecomment-486063506
 
 
   Hey @BasPH @ashb . I woke up this morning and could not stop thinking about 
that one and I took a dep dive. It was great idea @BasPH to make it 
reproducible locally, that has helped me a lot. I added a lot of logging. 
   
   Eventually it boiled down to cases where queue.empty() returned True where 
queue.qsize() returned 1.
   
   I thought I lost my mind for a while, but well ... It turned out that this 
this is INTENDED BEHAVIOUR of multiprocessing.Queue. See the issue 
https://bugs.python.org/issue23582 (resolved as "not a bug").
   
   I have no words for it, but well, we needed to solve it. Unfortunately 
simple solution does not work on MacOS (qsize() does not work on MacOS and 
throws NotImplemented) ... So at the end I had to do what others did in the 
same case - have a custom SynchronizedQueue deriving from multiprocessing.Queue 
and add extra synchronisation/counter to put/get methods :(
   
   I described all details in 
https://issues.apache.org/jira/browse/AIRFLOW-4401 and created a PR to fix it 
here: https://github.com/apache/airflow/pull/5167
   
   Speaking about productive morning :).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk opened a new pull request #5167: Reliable SynchronizedQueue used instead of multiprocessing.Queue

2019-04-23 Thread GitBox
potiuk opened a new pull request #5167: Reliable SynchronizedQueue used instead 
of multiprocessing.Queue
URL: https://github.com/apache/airflow/pull/5167
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-4401
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   It solves unreliable behaviour of multiprocessing.Queue.empty() - sometimes 
returning True when the queue is not really empty.
   
   ### Tests
   
   - [x] My PR does not need testing for this extremely good reason:
   We already have LocalExecutor and other tests that will just become less 
flaky.
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to a appropriate release
   
   ### Code Quality
   
   - [x] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-4401) multiprocessing.Queue is unreliable

2019-04-23 Thread Jarek Potiuk (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-4401:
--
Description: 
After discussing with [~ash] and [~BasPH] potential reasons for flakiness of 
LocalExecutor tests, I took a deeper dive into the problem and what I found 
raised the remaining hair on top of my head. 

We had a number of flaky tests in the local executor that resulted in 
result_queue not being empty where it should have been emptied a moment before. 
More details and discussion can be found in 
[https://github.com/apache/airflow/pull/5159] 

The problem turned out to be ... unreliability of multiprocessing.Queue empty() 
implementation. Itt turned out that multiprocessing.Queue.empty() 
implementation is not fully synchronized and it might return True even if put() 
operation has been already completed in another thread. What's more - empty() 
might return True even if qsize() of the queue returns > 0 (!) 

It's a bit mind-boggling but it is "as intended' as documented in 
[https://bugs.python.org/issue23582]  (resolved as "not a bug" ) 

A few people have stumbled upon this problem. For example 
[https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
 and [https://github.com/keras-team/autokeras/issues/368] 

Also we seemed to experienced that in Airflow before. In jobs.py year ago 
(31.07.2016) - we can see the comment below (but we used multiprocessing.queue 
empty() nevertheless):
{code:java}
# Not using multiprocessing.Queue() since it's no longer a separate
# process and due to some unusual behavior. (empty() incorrectly
# returns true?){code}
The solution available in [https://bugs.python.org/issue23582]  using qsize() 
was not usable because qsize() does not work on MacOS (throws 
NotImplementedError).

The working solution is to implement a reliable queue (SynchronizedQueue) based 
on 
[https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
 (butwith a twist that __init__ of class deriving from Queue has to be changed 
for python 3.4+ as described in 
[https://stackoverflow.com/questions/24941359/ctx-parameter-in-multiprocessing-queue].

Luckily we are now Python3.5+

We should replace all usages of multiprocessing.Queue where empty() is used 
with the SynchronizedQueue. And make sure we do not use multiprocessing.Queue 
in similar way in the future.

 

 

 

  was:
After discussing with [~ash] and [~BasPH] potential reasons for flakiness of 
LocalExecutor tests, I took a deeper dive into the problem and what I found 
raised the remaining hair on top of my head. 

We had a number of flaky tests in the local executor that resulted in 
result_queue not being empty where it should have been emptied a moment before. 
More details and discussion can be found in 
[https://github.com/apache/airflow/pull/5159] 

The problem turned out to be ... unreliability of multiprocessing.Queue empty() 
implementation. Itt turned out that multiprocessing.Queue.empty() 
implementation is not fully synchronized and it might return True even if put() 
operation has been already completed in another thread. What's more - empty() 
might return True even if qsize() of the queue returns > 0 (!) 

It's a bit mind-boggling but it is "as intended' as documented in 
[https://bugs.python.org/issue23582]  (resolved as "not a bug" ) 

A few people have stumbled upon this problem. For example 
[https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
 and [https://github.com/keras-team/autokeras/issues/368] 

The solution available in [https://bugs.python.org/issue23582]  using qsize() 
was not usable because qsize() does not work on MacOS (throws 
NotImplementedError).

The working solution is to implement a reliable queue (SynchronizedQueue) based 
on 
[https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
 (butwith a twist that __init__ of class deriving from Queue has to be changed 
for python 3.4+ as described in 
[https://stackoverflow.com/questions/24941359/ctx-parameter-in-multiprocessing-queue].

Luckily we are now Python3.5+

We should replace all usages of multiprocessing.Queue where empty() is used 
with the SynchronizedQueue

 

 

 


> multiprocessing.Queue is unreliable
> ---
>
> Key: AIRFLOW-4401
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4401
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Jarek Potiuk
>Priority: Major
>
> After discussing with [~ash] and [~BasPH] potential reasons for flakiness of 
> LocalExecutor tests, I took a deeper dive into the problem and what I found 
> raised the remaining hair on top of my head. 
> We had a number of flaky tests in the local executor that resulted in 
> result_queue not being empty where it should have been emptied 

[jira] [Updated] (AIRFLOW-4401) multiprocessing.Queue.empty() is unreliable

2019-04-23 Thread Jarek Potiuk (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-4401:
--
Summary: multiprocessing.Queue.empty() is unreliable  (was: 
multiprocessing.Queue is unreliable)

> multiprocessing.Queue.empty() is unreliable
> ---
>
> Key: AIRFLOW-4401
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4401
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Jarek Potiuk
>Priority: Major
>
> After discussing with [~ash] and [~BasPH] potential reasons for flakiness of 
> LocalExecutor tests, I took a deeper dive into the problem and what I found 
> raised the remaining hair on top of my head. 
> We had a number of flaky tests in the local executor that resulted in 
> result_queue not being empty where it should have been emptied a moment 
> before. More details and discussion can be found in 
> [https://github.com/apache/airflow/pull/5159] 
> The problem turned out to be ... unreliability of multiprocessing.Queue 
> empty() implementation. Itt turned out that multiprocessing.Queue.empty() 
> implementation is not fully synchronized and it might return True even if 
> put() operation has been already completed in another thread. What's more - 
> empty() might return True even if qsize() of the queue returns > 0 (!) 
> It's a bit mind-boggling but it is "as intended' as documented in 
> [https://bugs.python.org/issue23582]  (resolved as "not a bug" ) 
> A few people have stumbled upon this problem. For example 
> [https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
>  and [https://github.com/keras-team/autokeras/issues/368] 
> Also we seemed to experienced that in Airflow before. In jobs.py year ago 
> (31.07.2016) - we can see the comment below (but we used 
> multiprocessing.queue empty() nevertheless):
> {code:java}
> # Not using multiprocessing.Queue() since it's no longer a separate
> # process and due to some unusual behavior. (empty() incorrectly
> # returns true?){code}
> The solution available in [https://bugs.python.org/issue23582]  using qsize() 
> was not usable because qsize() does not work on MacOS (throws 
> NotImplementedError).
> The working solution is to implement a reliable queue (SynchronizedQueue) 
> based on 
> [https://github.com/vterron/lemon/commit/9ca6b4b1212228dbd4f69b88aaf88b12952d7d6f]
>  (butwith a twist that __init__ of class deriving from Queue has to be 
> changed for python 3.4+ as described in 
> [https://stackoverflow.com/questions/24941359/ctx-parameter-in-multiprocessing-queue].
> Luckily we are now Python3.5+
> We should replace all usages of multiprocessing.Queue where empty() is used 
> with the SynchronizedQueue. And make sure we do not use multiprocessing.Queue 
> in similar way in the future.
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] codecov-io edited a comment on issue #4821: [AIRFLOW-3998] Use nested commands in cli.

2019-04-23 Thread GitBox
codecov-io edited a comment on issue #4821: [AIRFLOW-3998] Use nested commands 
in cli.
URL: https://github.com/apache/airflow/pull/4821#issuecomment-468981700
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/4821?src=pr=h1) 
Report
   > Merging 
[#4821](https://codecov.io/gh/apache/airflow/pull/4821?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/de1795c8f1e65a5f4a848e88ef3848ba51aab449?src=pr=desc)
 will **increase** coverage by `<.01%`.
   > The diff coverage is `82.92%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/4821/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/4821?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master#4821  +/-   ##
   ==
   + Coverage   78.44%   78.45%   +<.01% 
   ==
 Files 466  466  
 Lines   2973529706  -29 
   ==
   - Hits2332723305  -22 
   + Misses   6408 6401   -7
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/4821?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[...ow/contrib/example\_dags/example\_qubole\_operator.py](https://codecov.io/gh/apache/airflow/pull/4821/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL2V4YW1wbGVfZGFncy9leGFtcGxlX3F1Ym9sZV9vcGVyYXRvci5weQ==)
 | `0% <ø> (ø)` | :arrow_up: |
   | 
[...flow/contrib/example\_dags/example\_qubole\_sensor.py](https://codecov.io/gh/apache/airflow/pull/4821/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL2V4YW1wbGVfZGFncy9leGFtcGxlX3F1Ym9sZV9zZW5zb3IucHk=)
 | `0% <ø> (ø)` | :arrow_up: |
   | 
[...le\_dags/example\_passing\_params\_via\_test\_command.py](https://codecov.io/gh/apache/airflow/pull/4821/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9wYXNzaW5nX3BhcmFtc192aWFfdGVzdF9jb21tYW5kLnB5)
 | `100% <ø> (ø)` | :arrow_up: |
   | 
[airflow/models/crypto.py](https://codecov.io/gh/apache/airflow/pull/4821/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvY3J5cHRvLnB5)
 | `79.41% <ø> (ø)` | :arrow_up: |
   | 
[...trib/example\_dags/example\_azure\_cosmosdb\_sensor.py](https://codecov.io/gh/apache/airflow/pull/4821/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL2V4YW1wbGVfZGFncy9leGFtcGxlX2F6dXJlX2Nvc21vc2RiX3NlbnNvci5weQ==)
 | `0% <ø> (ø)` | :arrow_up: |
   | 
[airflow/models/taskinstance.py](https://codecov.io/gh/apache/airflow/pull/4821/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvdGFza2luc3RhbmNlLnB5)
 | `92.59% <100%> (+0.17%)` | :arrow_up: |
   | 
[airflow/bin/cli.py](https://codecov.io/gh/apache/airflow/pull/4821/diff?src=pr=tree#diff-YWlyZmxvdy9iaW4vY2xpLnB5)
 | `66.23% <82.84%> (-0.4%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/4821?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/4821?src=pr=footer). 
Last update 
[de1795c...5b67478](https://codecov.io/gh/apache/airflow/pull/4821?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io edited a comment on issue #4821: [AIRFLOW-3998] Use nested commands in cli.

2019-04-23 Thread GitBox
codecov-io edited a comment on issue #4821: [AIRFLOW-3998] Use nested commands 
in cli.
URL: https://github.com/apache/airflow/pull/4821#issuecomment-468981700
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/4821?src=pr=h1) 
Report
   > Merging 
[#4821](https://codecov.io/gh/apache/airflow/pull/4821?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/de1795c8f1e65a5f4a848e88ef3848ba51aab449?src=pr=desc)
 will **increase** coverage by `<.01%`.
   > The diff coverage is `82.92%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/4821/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/4821?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master#4821  +/-   ##
   ==
   + Coverage   78.44%   78.45%   +<.01% 
   ==
 Files 466  466  
 Lines   2973529706  -29 
   ==
   - Hits2332723305  -22 
   + Misses   6408 6401   -7
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/4821?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[...ow/contrib/example\_dags/example\_qubole\_operator.py](https://codecov.io/gh/apache/airflow/pull/4821/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL2V4YW1wbGVfZGFncy9leGFtcGxlX3F1Ym9sZV9vcGVyYXRvci5weQ==)
 | `0% <ø> (ø)` | :arrow_up: |
   | 
[...flow/contrib/example\_dags/example\_qubole\_sensor.py](https://codecov.io/gh/apache/airflow/pull/4821/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL2V4YW1wbGVfZGFncy9leGFtcGxlX3F1Ym9sZV9zZW5zb3IucHk=)
 | `0% <ø> (ø)` | :arrow_up: |
   | 
[...le\_dags/example\_passing\_params\_via\_test\_command.py](https://codecov.io/gh/apache/airflow/pull/4821/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9wYXNzaW5nX3BhcmFtc192aWFfdGVzdF9jb21tYW5kLnB5)
 | `100% <ø> (ø)` | :arrow_up: |
   | 
[airflow/models/crypto.py](https://codecov.io/gh/apache/airflow/pull/4821/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvY3J5cHRvLnB5)
 | `79.41% <ø> (ø)` | :arrow_up: |
   | 
[...trib/example\_dags/example\_azure\_cosmosdb\_sensor.py](https://codecov.io/gh/apache/airflow/pull/4821/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL2V4YW1wbGVfZGFncy9leGFtcGxlX2F6dXJlX2Nvc21vc2RiX3NlbnNvci5weQ==)
 | `0% <ø> (ø)` | :arrow_up: |
   | 
[airflow/models/taskinstance.py](https://codecov.io/gh/apache/airflow/pull/4821/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvdGFza2luc3RhbmNlLnB5)
 | `92.59% <100%> (+0.17%)` | :arrow_up: |
   | 
[airflow/bin/cli.py](https://codecov.io/gh/apache/airflow/pull/4821/diff?src=pr=tree#diff-YWlyZmxvdy9iaW4vY2xpLnB5)
 | `66.23% <82.84%> (-0.4%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/4821?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/4821?src=pr=footer). 
Last update 
[de1795c...5b67478](https://codecov.io/gh/apache/airflow/pull/4821?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] jaketf opened a new pull request #5166: [AIRFLOW-4397] Add GCSUploadSessionCompleteSensor

2019-04-23 Thread GitBox
jaketf opened a new pull request #5166: [AIRFLOW-4397] Add 
GCSUploadSessionCompleteSensor
URL: https://github.com/apache/airflow/pull/5166
 
 
   This commit add a GoogleCloudStorageUploadSessionCompleteSensor
   to address the use case of accepting files from a third party vendor
   who refuses to send a success indicator when providing data files
   into a bucket and waiting until an inactivity period has passed to
   indicate the end of an upload session.
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following 
[Airflow-4937](https://issues.apache.org/jira/browse/AIRFLOW-4397) issues and 
references them in the PR title. 
   
   
   ### Description
   
   - [x]  This PR add a GCS sensor to poke a bucket until some inactivity 
period has passed with no new files added to the bucket.
   
   ### Tests
   
   - [x] My PR adds the following unit tests: 
`tests/contrib/sensors/test_gcs_upload_session_sensor.py`
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to a appropriate release
   
   ### Code Quality
   
   - [x] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io commented on issue #5165: [AIRFLOW-4399] Avoid duplicated os.path.isfile() check in models.dagbag

2019-04-23 Thread GitBox
codecov-io commented on issue #5165: [AIRFLOW-4399] Avoid duplicated 
os.path.isfile() check in models.dagbag
URL: https://github.com/apache/airflow/pull/5165#issuecomment-486059506
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5165?src=pr=h1) 
Report
   > Merging 
[#5165](https://codecov.io/gh/apache/airflow/pull/5165?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/de1795c8f1e65a5f4a848e88ef3848ba51aab449?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5165/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5165?src=pr=tree)
   
   ```diff
   @@   Coverage Diff   @@
   ##   master#5165   +/-   ##
   ===
 Coverage   78.44%   78.44%   
   ===
 Files 466  466   
 Lines   2973529735   
   ===
 Hits2332723327   
 Misses   6408 6408
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/5165?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/models/dagbag.py](https://codecov.io/gh/apache/airflow/pull/5165/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvZGFnYmFnLnB5)
 | `91.82% <100%> (ø)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5165?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5165?src=pr=footer). 
Last update 
[de1795c...7966d38](https://codecov.io/gh/apache/airflow/pull/5165?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-4400) Wrong filemode on three files

2019-04-23 Thread Beau Barker (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Beau Barker updated AIRFLOW-4400:
-
Description: 
Three files inside the models directory have the wrong file mode.

__init__.py, taskfail.py and taskreschedule.py.
{noformat}
$ ls -l airflow/models
total 232K
-rwxr-xr-x 1 beau staff 176K Apr 8 10:38 __init__.py*
-rw-r--r-- 1 beau staff 1.2K Apr 5 13:47 base.py
-rw-r--r-- 1 beau staff 12K Apr 8 09:28 connection.py
-rw-r--r-- 1 beau staff 1.9K Apr 5 13:47 dagpickle.py
-rw-r--r-- 1 beau staff 1.2K Apr 5 13:47 errors.py
-rw-r--r-- 1 beau staff 2.8K Apr 5 13:47 kubernetes.py
-rw-r--r-- 1 beau staff 2.2K Apr 5 13:47 log.py
-rw-r--r-- 1 beau staff 2.6K Apr 5 13:47 skipmixin.py
-rw-r--r-- 1 beau staff 1.8K Apr 5 13:47 slamiss.py
-rwxr-xr-x 1 beau staff 2.0K Apr 5 13:47 taskfail.py*
-rwxr-xr-x 1 beau staff 3.3K Apr 5 13:47 taskreschedule.py*
-rw-r--r-- 1 beau staff 7.5K Apr 8 09:28 xcom.py
{noformat}
 

  was:
Three files inside the models directory have the wrong file mode.

`__init__.py`, `taskfail.py` and `taskreschedule.py`.

```

$ ls -l airflow/models
total 232K
-rwxr-xr-x 1 beau staff 176K Apr 8 10:38 __init__.py*
-rw-r--r-- 1 beau staff 1.2K Apr 5 13:47 base.py
-rw-r--r-- 1 beau staff 12K Apr 8 09:28 connection.py
-rw-r--r-- 1 beau staff 1.9K Apr 5 13:47 dagpickle.py
-rw-r--r-- 1 beau staff 1.2K Apr 5 13:47 errors.py
-rw-r--r-- 1 beau staff 2.8K Apr 5 13:47 kubernetes.py
-rw-r--r-- 1 beau staff 2.2K Apr 5 13:47 log.py
-rw-r--r-- 1 beau staff 2.6K Apr 5 13:47 skipmixin.py
-rw-r--r-- 1 beau staff 1.8K Apr 5 13:47 slamiss.py
-rwxr-xr-x 1 beau staff 2.0K Apr 5 13:47 taskfail.py*
-rwxr-xr-x 1 beau staff 3.3K Apr 5 13:47 taskreschedule.py*
-rw-r--r-- 1 beau staff 7.5K Apr 8 09:28 xcom.py

```


> Wrong filemode on three files
> -
>
> Key: AIRFLOW-4400
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4400
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: models
>Affects Versions: 1.10.3
>Reporter: Beau Barker
>Assignee: Beau Barker
>Priority: Trivial
>
> Three files inside the models directory have the wrong file mode.
> __init__.py, taskfail.py and taskreschedule.py.
> {noformat}
> $ ls -l airflow/models
> total 232K
> -rwxr-xr-x 1 beau staff 176K Apr 8 10:38 __init__.py*
> -rw-r--r-- 1 beau staff 1.2K Apr 5 13:47 base.py
> -rw-r--r-- 1 beau staff 12K Apr 8 09:28 connection.py
> -rw-r--r-- 1 beau staff 1.9K Apr 5 13:47 dagpickle.py
> -rw-r--r-- 1 beau staff 1.2K Apr 5 13:47 errors.py
> -rw-r--r-- 1 beau staff 2.8K Apr 5 13:47 kubernetes.py
> -rw-r--r-- 1 beau staff 2.2K Apr 5 13:47 log.py
> -rw-r--r-- 1 beau staff 2.6K Apr 5 13:47 skipmixin.py
> -rw-r--r-- 1 beau staff 1.8K Apr 5 13:47 slamiss.py
> -rwxr-xr-x 1 beau staff 2.0K Apr 5 13:47 taskfail.py*
> -rwxr-xr-x 1 beau staff 3.3K Apr 5 13:47 taskreschedule.py*
> -rw-r--r-- 1 beau staff 7.5K Apr 8 09:28 xcom.py
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-4400) Wrong filemode on three files

2019-04-23 Thread Beau Barker (JIRA)
Beau Barker created AIRFLOW-4400:


 Summary: Wrong filemode on three files
 Key: AIRFLOW-4400
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4400
 Project: Apache Airflow
  Issue Type: Bug
  Components: models
Affects Versions: 1.10.3
Reporter: Beau Barker
Assignee: Beau Barker


Three files inside the models directory have the wrong file mode.

`__init__.py`, `taskfail.py` and `taskreschedule.py`.

```

$ ls -l airflow/models
total 232K
-rwxr-xr-x 1 beau staff 176K Apr 8 10:38 __init__.py*
-rw-r--r-- 1 beau staff 1.2K Apr 5 13:47 base.py
-rw-r--r-- 1 beau staff 12K Apr 8 09:28 connection.py
-rw-r--r-- 1 beau staff 1.9K Apr 5 13:47 dagpickle.py
-rw-r--r-- 1 beau staff 1.2K Apr 5 13:47 errors.py
-rw-r--r-- 1 beau staff 2.8K Apr 5 13:47 kubernetes.py
-rw-r--r-- 1 beau staff 2.2K Apr 5 13:47 log.py
-rw-r--r-- 1 beau staff 2.6K Apr 5 13:47 skipmixin.py
-rw-r--r-- 1 beau staff 1.8K Apr 5 13:47 slamiss.py
-rwxr-xr-x 1 beau staff 2.0K Apr 5 13:47 taskfail.py*
-rwxr-xr-x 1 beau staff 3.3K Apr 5 13:47 taskreschedule.py*
-rw-r--r-- 1 beau staff 7.5K Apr 8 09:28 xcom.py

```



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-4324) Partial search in RBAC UI no longer works

2019-04-23 Thread Xiaodong DENG (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaodong DENG resolved AIRFLOW-4324.

Resolution: Resolved

> Partial search in RBAC UI no longer works
> -
>
> Key: AIRFLOW-4324
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4324
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 1.10.3
>Reporter: Robin Edwards
>Assignee: sumous
>Priority: Minor
> Fix For: 1.10.4
>
>
> In airflow 1.10.2 RBAC UI searching for part of a dag name worked. For 
> example if I had multiple dags named:
> foo_bar_1
> foo_bar_2
> foo_bar_3
> Searching for "foo" would return all 3 dags. As of upgrading to 1.10.3 the 
> same search yields no results. On the result page after no results are 
> returned the search autocomplete feature no longer suggest dags as I type



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] XD-DENG edited a comment on issue #5131: [AIRFLOW-4324] fix search in RBAC UI

2019-04-23 Thread GitBox
XD-DENG edited a comment on issue #5131: [AIRFLOW-4324] fix search in RBAC UI
URL: https://github.com/apache/airflow/pull/5131#issuecomment-486053307
 
 
   On the other hand, we have the same issue in `master` branch 
https://github.com/apache/airflow/blob/master/airflow/www/views.py#L239
   
   @sumous , May you raise another PR to fix it in master as well? Thanks
   
   (you may use the same JIRA ticket number)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-4324) Partial search in RBAC UI no longer works

2019-04-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824784#comment-16824784
 ] 

ASF GitHub Bot commented on AIRFLOW-4324:
-

XD-DENG commented on pull request #5131: [AIRFLOW-4324] fix search in RBAC UI
URL: https://github.com/apache/airflow/pull/5131
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Partial search in RBAC UI no longer works
> -
>
> Key: AIRFLOW-4324
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4324
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 1.10.3
>Reporter: Robin Edwards
>Assignee: sumous
>Priority: Minor
> Fix For: 1.10.4
>
>
> In airflow 1.10.2 RBAC UI searching for part of a dag name worked. For 
> example if I had multiple dags named:
> foo_bar_1
> foo_bar_2
> foo_bar_3
> Searching for "foo" would return all 3 dags. As of upgrading to 1.10.3 the 
> same search yields no results. On the result page after no results are 
> returned the search autocomplete feature no longer suggest dags as I type



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-4324) Partial search in RBAC UI no longer works

2019-04-23 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824785#comment-16824785
 ] 

ASF subversion and git services commented on AIRFLOW-4324:
--

Commit 57013ecf01acc3f5e95e2c9668181b415a1ab47d in airflow's branch 
refs/heads/v1-10-stable from sumous
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=57013ec ]

[AIRFLOW-4324] fix DAG fuzzy search in RBAC UI (#5131)




> Partial search in RBAC UI no longer works
> -
>
> Key: AIRFLOW-4324
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4324
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 1.10.3
>Reporter: Robin Edwards
>Assignee: sumous
>Priority: Minor
> Fix For: 1.10.4
>
>
> In airflow 1.10.2 RBAC UI searching for part of a dag name worked. For 
> example if I had multiple dags named:
> foo_bar_1
> foo_bar_2
> foo_bar_3
> Searching for "foo" would return all 3 dags. As of upgrading to 1.10.3 the 
> same search yields no results. On the result page after no results are 
> returned the search autocomplete feature no longer suggest dags as I type



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] XD-DENG merged pull request #5131: [AIRFLOW-4324] fix search in RBAC UI

2019-04-23 Thread GitBox
XD-DENG merged pull request #5131: [AIRFLOW-4324] fix search in RBAC UI
URL: https://github.com/apache/airflow/pull/5131
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Assigned] (AIRFLOW-3143) Support auto-zone in DataprocClusterCreateOperator

2019-04-23 Thread Joel Croteau (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Croteau reassigned AIRFLOW-3143:
-

Assignee: Joel Croteau  (was: fengya li)

> Support auto-zone in DataprocClusterCreateOperator
> --
>
> Key: AIRFLOW-3143
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3143
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, operators
>Reporter: Wilson Lian
>Assignee: Joel Croteau
>Priority: Minor
>
> [Dataproc 
> Auto-zone|https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/auto-zone]
>  allows users to omit the zone when creating a cluster, and the service will 
> pick a zone in the chosen region.
> Providing an empty string or None for `zone` would match up with how users 
> would request auto-zone via direct API access, but as-is the 
> DataprocClusterCreateOperator makes a bad API request when such values are 
> passed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-4399) To avoid duplicated os.path.isfile(filepath) checks in models.dagbag

2019-04-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824766#comment-16824766
 ] 

ASF GitHub Bot commented on AIRFLOW-4399:
-

XD-DENG commented on pull request #5165: [AIRFLOW-4399] Avoid duplicated 
os.path.isfile() check in models.dagbag
URL: https://github.com/apache/airflow/pull/5165
 
 
   ### Jira
   
 - https://issues.apache.org/jira/browse/AIRFLOW-4399
   
   ### Description
   
   The check which is removed in this commit always returns `True` for sure, 
because the same check has already been done earlier in this function body 
https://github.com/apache/airflow/blob/de1795c8f1e65a5f4a848e88ef3848ba51aab449/airflow/models/dagbag.py#L157-L158
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> To avoid duplicated os.path.isfile(filepath) checks in models.dagbag
> 
>
> Key: AIRFLOW-4399
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4399
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: models
>Affects Versions: 1.10.3
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Minor
> Fix For: 1.10.4
>
>
> There are duplicated os.path.isfile(filepath) checks, among which the 2nd one 
> is not necessary.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] XD-DENG opened a new pull request #5165: [AIRFLOW-4399] Avoid duplicated os.path.isfile() check in models.dagbag

2019-04-23 Thread GitBox
XD-DENG opened a new pull request #5165: [AIRFLOW-4399] Avoid duplicated 
os.path.isfile() check in models.dagbag
URL: https://github.com/apache/airflow/pull/5165
 
 
   ### Jira
   
 - https://issues.apache.org/jira/browse/AIRFLOW-4399
   
   ### Description
   
   The check which is removed in this commit always returns `True` for sure, 
because the same check has already been done earlier in this function body 
https://github.com/apache/airflow/blob/de1795c8f1e65a5f4a848e88ef3848ba51aab449/airflow/models/dagbag.py#L157-L158
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-4399) To avoid duplicated os.path.isfile(filepath) checks in models.dagbag

2019-04-23 Thread Xiaodong DENG (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaodong DENG updated AIRFLOW-4399:
---
Summary: To avoid duplicated os.path.isfile(filepath) checks in 
models.dagbag  (was: To remove duplicated os.path.isfile(filepath) checks in 
models.dagbag)

> To avoid duplicated os.path.isfile(filepath) checks in models.dagbag
> 
>
> Key: AIRFLOW-4399
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4399
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: models
>Affects Versions: 1.10.3
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Minor
> Fix For: 1.10.4
>
>
> There are duplicated os.path.isfile(filepath) checks, among which the 2nd one 
> is not necessary.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-4399) To remove duplicated os.path.isfile(filepath) checks in models.dagbag

2019-04-23 Thread Xiaodong DENG (JIRA)
Xiaodong DENG created AIRFLOW-4399:
--

 Summary: To remove duplicated os.path.isfile(filepath) checks in 
models.dagbag
 Key: AIRFLOW-4399
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4399
 Project: Apache Airflow
  Issue Type: Improvement
  Components: models
Affects Versions: 1.10.3
Reporter: Xiaodong DENG
Assignee: Xiaodong DENG
 Fix For: 1.10.4


There are duplicated os.path.isfile(filepath) checks, among which the 2nd one 
is not necessary.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] milton0825 commented on a change in pull request #5094: [AIRFLOW-4306] Global operator extra links

2019-04-23 Thread GitBox
milton0825 commented on a change in pull request #5094: [AIRFLOW-4306] Global 
operator extra links
URL: https://github.com/apache/airflow/pull/5094#discussion_r277924790
 
 

 ##
 File path: docs/plugins.rst
 ##
 @@ -202,6 +212,13 @@ definitions in Airflow.
 def stat_name_dummy_handler(stat_name):
 return stat_name
 
+# A global operator extra link that redirect you to google
+class GoogleLink(BaseOperatorLink):
 
 Review comment:
   This is for the testing purpose so I want to make it as simple as possible
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feng-tao commented on issue #5094: [AIRFLOW-4306] Global operator extra links

2019-04-23 Thread GitBox
feng-tao commented on issue #5094: [AIRFLOW-4306] Global operator extra links
URL: https://github.com/apache/airflow/pull/5094#issuecomment-486027034
 
 
   @KevinYang21 , not sure how to DM you, but you may want to check your email 
:)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] KevinYang21 edited a comment on issue #5094: [AIRFLOW-4306] Global operator extra links

2019-04-23 Thread GitBox
KevinYang21 edited a comment on issue #5094: [AIRFLOW-4306] Global operator 
extra links
URL: https://github.com/apache/airflow/pull/5094#issuecomment-486025570
 
 
   @feng-tao Took a quick look at the code and I think that's because we start 
to construct the links from operator instead of from task instance. But since 
we already have the `execution_date` we can always construct the TI just like 
other endpoints and use whatever info is inside.
   
   While getting the TI fields is easy, my thinking is still more on the form 
of the link. The fact that we can just construct a link button right now is 
probly not going to look very nice when we need some other format, like when we 
want the download log look for our kibana log links.
   
   EDIT: (hit comment button too soon) I'm not trying to scope out this PR, I 
think there might be use cases that a link button is enough. If it is not 
trivial to create such a framework for more complex pluggable components in the 
dialog it makes sense to start another PR around that.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feng-tao edited a comment on issue #5094: [AIRFLOW-4306] Global operator extra links

2019-04-23 Thread GitBox
feng-tao edited a comment on issue #5094: [AIRFLOW-4306] Global operator extra 
links
URL: https://github.com/apache/airflow/pull/5094#issuecomment-486026307
 
 
   yeah, FORM/Design/UI is always a tough question
   
   BTW @KevinYang21 , is yrql...@gmail.com your email ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feng-tao commented on issue #5094: [AIRFLOW-4306] Global operator extra links

2019-04-23 Thread GitBox
feng-tao commented on issue #5094: [AIRFLOW-4306] Global operator extra links
URL: https://github.com/apache/airflow/pull/5094#issuecomment-486026307
 
 
   yeah, FORM/Design/UI is always a tough question
   
   BTW, @KevinYang21 , is yrql...@gmail.com your email ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] zhongjiajie commented on a change in pull request #5153: [AIRFLOW-XXX] Name default config_file param in KubernetesPodOperator docstring

2019-04-23 Thread GitBox
zhongjiajie commented on a change in pull request #5153: [AIRFLOW-XXX] Name 
default config_file param in KubernetesPodOperator docstring
URL: https://github.com/apache/airflow/pull/5153#discussion_r277922027
 
 

 ##
 File path: airflow/contrib/operators/kubernetes_pod_operator.py
 ##
 @@ -71,7 +71,8 @@ class KubernetesPodOperator(BaseOperator):
 :type affinity: dict
 :param node_selectors: A dict containing a group of scheduling rules
 :type node_selectors: dict
-:param config_file: The path to the Kubernetes config file
+:param config_file: The path to the Kubernetes config file.
+If not specified, defaults to ``~/.kube/config``
 
 Review comment:
   Looking great.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] KevinYang21 commented on issue #5094: [AIRFLOW-4306] Global operator extra links

2019-04-23 Thread GitBox
KevinYang21 commented on issue #5094: [AIRFLOW-4306] Global operator extra links
URL: https://github.com/apache/airflow/pull/5094#issuecomment-486025570
 
 
   @feng-tao Took a quick look at the code and I think that's because we start 
to construct the links from operator instead of from task instance. But since 
we already have the `execution_date` we can always construct the TI just like 
other endpoints and use whatever info is inside.
   
   While getting the TI fields is easy, my thinking is still more on the form 
of the link. The fact that we can just construct a link button right now is 
probly not going to look very nice when we need some other format, like when we 
want the download log look for our kibana log links.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] milton0825 commented on issue #5094: [AIRFLOW-4306] Global operator extra links

2019-04-23 Thread GitBox
milton0825 commented on issue #5094: [AIRFLOW-4306] Global operator extra links
URL: https://github.com/apache/airflow/pull/5094#issuecomment-486025327
 
 
   @feng-tao we exposed the operator so all the following fields should be 
available to user:
   
https://github.com/apache/airflow/blob/8be65327e8cd1f4cf7dedcb90b8eacc4fa5f635d/airflow/models/baseoperator.py#L241-L274


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] milton0825 commented on a change in pull request #5094: [AIRFLOW-4306] Global operator extra links

2019-04-23 Thread GitBox
milton0825 commented on a change in pull request #5094: [AIRFLOW-4306] Global 
operator extra links
URL: https://github.com/apache/airflow/pull/5094#discussion_r277920779
 
 

 ##
 File path: tests/plugins/test_plugin.py
 ##
 @@ -100,6 +114,10 @@ class AirflowTestPlugin(AirflowPlugin):
 appbuilder_views = [v_appbuilder_package]
 appbuilder_menu_items = [appbuilder_mitem]
 stat_name_handler = staticmethod(stat_name_dummy_handler)
+global_operator_extra_links = [
+GoogleLink(),
 
 Review comment:
   Yeah I think that make more sense to let the user know our intent


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] milton0825 commented on a change in pull request #5094: [AIRFLOW-4306] Global operator extra links

2019-04-23 Thread GitBox
milton0825 commented on a change in pull request #5094: [AIRFLOW-4306] Global 
operator extra links
URL: https://github.com/apache/airflow/pull/5094#discussion_r277920575
 
 

 ##
 File path: airflow/models/baseoperator.py
 ##
 @@ -973,6 +986,25 @@ class BaseOperatorLink:
 
 __metaclass__ = ABCMeta
 
+@property
+@abstractmethod
+def name(self):
+# type: () -> str
+"""
+Name of the link. This will be the button name on the task UI.
+
+:return: link name
+"""
+pass
+
 @abstractmethod
 def get_link(self, operator, dttm):
+# type: (BaseOperator, datetime) -> str
+"""
+Link to external system.
+
+:param operator: airflow operator
 
 Review comment:
   I have a type hint in: 
https://github.com/apache/airflow/pull/5094/files/8be65327e8cd1f4cf7dedcb90b8eacc4fa5f635d#diff-b5ec97b8301ab688b7e045426f24d485R1002


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-4398) Extra links include try_number

2019-04-23 Thread Chao-Han Tsai (JIRA)
Chao-Han Tsai created AIRFLOW-4398:
--

 Summary: Extra links include try_number
 Key: AIRFLOW-4398
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4398
 Project: Apache Airflow
  Issue Type: New Feature
Reporter: Chao-Han Tsai
Assignee: Chao-Han Tsai


Currently airflow supports constructing extra links using task and 
execution_date.
https://github.com/apache/airflow/blob/master/airflow/models/baseoperator.py#L977

We should also include the try_number into the scope.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] milton0825 edited a comment on issue #5094: [AIRFLOW-4306] Global operator extra links

2019-04-23 Thread GitBox
milton0825 edited a comment on issue #5094: [AIRFLOW-4306] Global operator 
extra links
URL: https://github.com/apache/airflow/pull/5094#issuecomment-486023240
 
 
   @KevinYang21 @kunalyogenshah I think it totally make sense to include the 
try-number into the scope.   
   
   I am thinking about addressing that in a separate PR to limit the scope of 
changes in this PR and here is the jira to track that: 
https://issues.apache.org/jira/browse/AIRFLOW-4398
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] milton0825 commented on issue #5094: [AIRFLOW-4306] Global operator extra links

2019-04-23 Thread GitBox
milton0825 commented on issue #5094: [AIRFLOW-4306] Global operator extra links
URL: https://github.com/apache/airflow/pull/5094#issuecomment-486023240
 
 
   @KevinYang21 @kunalyogenshah I think it totally make sense to include the 
try-number into the scope.   
   
   I am thinking about addressing that in a separate PR and here is the jira to 
track that: https://issues.apache.org/jira/browse/AIRFLOW-4398
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-4397) Add GCSUploadSessionCompleteSensor

2019-04-23 Thread Jacob Ferriero (JIRA)
Jacob Ferriero created AIRFLOW-4397:
---

 Summary: Add GCSUploadSessionCompleteSensor
 Key: AIRFLOW-4397
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4397
 Project: Apache Airflow
  Issue Type: New Feature
  Components: contrib
Reporter: Jacob Ferriero
Assignee: Jacob Ferriero


I'd like to contribute a Sensor for Google Cloud Storage that can poke a bucket 
until there has been sufficient time without a new file drop. Often times there 
are cases where a third party vendor drops data to a bucket but don't send a 
success flag when they are done. This sensor would allow you to poke every n 
minutes to check if more files have been added since the last poke, and if 
there had been `inactivity_period` minutes without a new file drop, return 
`True`. This could allow SLA misses if data did not arrive by an expected time, 
and have a configurable deadline past which the sensor would fail. Optionally 
the user could specify a minimum number of files for the sensor to succeed. 
This would be my first time contributing to an OSS project, so please let me 
know if this is not the appropriate place to start.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] KevinYang21 commented on issue #5094: [AIRFLOW-4306] Global operator extra links

2019-04-23 Thread GitBox
KevinYang21 commented on issue #5094: [AIRFLOW-4306] Global operator extra links
URL: https://github.com/apache/airflow/pull/5094#issuecomment-486021540
 
 
   Nice idea to make it pluggable. We do also have the need to put a Kibana 
link on the dialog but as @kunalyogenshah mentioned, for task logs I think it 
makes more sense to provide different links for different `try_number`. 
Possible that we can make it so flexible that we can have plugin section just 
like the download log section?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] kunalyogenshah commented on issue #5094: [AIRFLOW-4306] Global operator extra links

2019-04-23 Thread GitBox
kunalyogenshah commented on issue #5094: [AIRFLOW-4306] Global operator extra 
links
URL: https://github.com/apache/airflow/pull/5094#issuecomment-486019322
 
 
   @milton0825 Got it. Constructing it this way is certainly doable. But 
somehow it does seem like introducing the Kibana log URLs in a more central way 
like the Download log links / View log links could be more semantically 
appropriate than adding them through global extra links? Specially if you 
consider adding try number to the mix, in which case a Kibana global should 
return multi-links instead of one link. Just a thought. + @KevinYang21 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] milton0825 commented on issue #5094: [AIRFLOW-4306] Global operator extra links

2019-04-23 Thread GitBox
milton0825 commented on issue #5094: [AIRFLOW-4306] Global operator extra links
URL: https://github.com/apache/airflow/pull/5094#issuecomment-486015757
 
 
   @kunalyogenshah so currently the operator instance and the `execution_time` 
are exposed to you like in: 
https://github.com/apache/airflow/pull/5094/files#diff-2fdb519812514e29589938d977d68ea0R94
 and you should have the information (dag_id, task_id,...) to formulate the URL 
with granularity to each task_instance. If you need information such as log id 
that is out of the scope of airflow, you can do whatever you need in the 
`get_link` to get that information.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-4346) Kubernetes Executor Fails for Large Wide DAGs

2019-04-23 Thread Vincent Castaneda (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824629#comment-16824629
 ] 

Vincent Castaneda commented on AIRFLOW-4346:


Has anyone been looking into this?

> Kubernetes Executor Fails for Large Wide DAGs
> -
>
> Key: AIRFLOW-4346
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4346
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, executor, kubernetes
>Affects Versions: 1.10.2, 1.10.3
>Reporter: Vincent Castaneda
>Priority: Blocker
> Attachments: configmap-airflow-share.yaml, sched_logs.txt, 
> wide_dag_bash_test.py, wide_dag_test_100_300.py, wide_dag_test_300_300.py
>
>
> When running large DAGs–those with parallelism of over 100 task instances to 
> be running concurrently--several tasks fail on the executor and are reported 
> to the database, but the scheduler is never aware of them failing.
> Attached are:
>  - A test DAG that we can use to replicate the issue.
>  - The configmap-airflow.yaml file
> I will be available to answer any other questions that are raised about our 
> configuration. We are running this on GKE and giving the scheduler and web 
> pod a base 100m for execution.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] kunalyogenshah commented on issue #5094: [AIRFLOW-4306] Global operator extra links

2019-04-23 Thread GitBox
kunalyogenshah commented on issue #5094: [AIRFLOW-4306] Global operator extra 
links
URL: https://github.com/apache/airflow/pull/5094#issuecomment-486008366
 
 
   Quick question about this implementation. Could you help provide an example 
usage for let's say Kibana logs that require arguments to be passed to it. For 
example, if I want to add a global Kibana link for all tasks, I need to be able 
to specify something like a log ID to the URL to redirect to the right place. 
Does this implementation allow that?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] kunalyogenshah commented on issue #5164: [AIRFLOW-4396] Provide a link to external Elasticsearch logs in UI.

2019-04-23 Thread GitBox
kunalyogenshah commented on issue #5164: [AIRFLOW-4396] Provide a link to 
external Elasticsearch logs in UI.
URL: https://github.com/apache/airflow/pull/5164#issuecomment-486007872
 
 
   Perfect. Thanks @milton0825 . I'll move my questions over to that PR for 
better context :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] milton0825 commented on issue #5164: [AIRFLOW-4396] Provide a link to external Elasticsearch logs in UI.

2019-04-23 Thread GitBox
milton0825 commented on issue #5164: [AIRFLOW-4396] Provide a link to external 
Elasticsearch logs in UI.
URL: https://github.com/apache/airflow/pull/5164#issuecomment-486006581
 
 
   @kunalyogenshah I think this is what you are looking for: 
https://github.com/apache/airflow/pull/5094
   We can register an extra link that is global to all the operators.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] kunalyogenshah commented on issue #5164: [AIRFLOW-4396] Provide a link to external Elasticsearch logs in UI.

2019-04-23 Thread GitBox
kunalyogenshah commented on issue #5164: [AIRFLOW-4396] Provide a link to 
external Elasticsearch logs in UI.
URL: https://github.com/apache/airflow/pull/5164#issuecomment-486006137
 
 
   @feng-tao : A quick naive clarification question. Does this mean that the 
DAG operator itself must specify the link to Kibana as an extra link in its 
operator definition? Or is there a way to make that universal for all DAGs and 
tasks?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] kunalyogenshah commented on issue #5164: [AIRFLOW-4396] Provide a link to external Elasticsearch logs in UI.

2019-04-23 Thread GitBox
kunalyogenshah commented on issue #5164: [AIRFLOW-4396] Provide a link to 
external Elasticsearch logs in UI.
URL: https://github.com/apache/airflow/pull/5164#issuecomment-486004575
 
 
   Ah. Thanks @feng-tao . This seems to be exactly similar to what we've been 
trying to do. Let me look into using extra links instead.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feng-tao edited a comment on issue #5164: [AIRFLOW-4396] Provide a link to external Elasticsearch logs in UI.

2019-04-23 Thread GitBox
feng-tao edited a comment on issue #5164: [AIRFLOW-4396] Provide a link to 
external Elasticsearch logs in UI.
URL: https://github.com/apache/airflow/pull/5164#issuecomment-486001330
 
 
   And we have already did that at Lyft in prod for quite some 
time(https://eng.lyft.com/running-apache-airflow-at-lyft-6e53bb8fccff). cc 
@milton0825 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feng-tao commented on issue #5164: [AIRFLOW-4396] Provide a link to external Elasticsearch logs in UI.

2019-04-23 Thread GitBox
feng-tao commented on issue #5164: [AIRFLOW-4396] Provide a link to external 
Elasticsearch logs in UI.
URL: https://github.com/apache/airflow/pull/5164#issuecomment-486001330
 
 
   And we have already did that at Lyft in prod for quite some time. cc 
@milton0825 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feng-tao commented on issue #5164: [AIRFLOW-4396] Provide a link to external Elasticsearch logs in UI.

2019-04-23 Thread GitBox
feng-tao commented on issue #5164: [AIRFLOW-4396] Provide a link to external 
Elasticsearch logs in UI.
URL: https://github.com/apache/airflow/pull/5164#issuecomment-486001239
 
 
   @kunalyogenshah , not sure if this is the right way to show the UI. I think 
another approach(Lyft) is to leverage 
https://github.com/apache/airflow/pull/5059 and 
https://github.com/apache/airflow/pull/5094 which allows you to customize your 
task instance panel in generic way.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io commented on issue #5161: [AIRFLOW-4395] Remove pickle_info view

2019-04-23 Thread GitBox
codecov-io commented on issue #5161: [AIRFLOW-4395] Remove pickle_info view
URL: https://github.com/apache/airflow/pull/5161#issuecomment-486000901
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5161?src=pr=h1) 
Report
   > Merging 
[#5161](https://codecov.io/gh/apache/airflow/pull/5161?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/de1795c8f1e65a5f4a848e88ef3848ba51aab449?src=pr=desc)
 will **decrease** coverage by `0.02%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5161/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5161?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master#5161  +/-   ##
   ==
   - Coverage   78.44%   78.42%   -0.03% 
   ==
 Files 466  466  
 Lines   2973529722  -13 
   ==
   - Hits2332723308  -19 
   - Misses   6408 6414   +6
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/5161?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/www/views.py](https://codecov.io/gh/apache/airflow/pull/5161/diff?src=pr=tree#diff-YWlyZmxvdy93d3cvdmlld3MucHk=)
 | `76.18% <ø> (-0.16%)` | :arrow_down: |
   | 
[airflow/www/security.py](https://codecov.io/gh/apache/airflow/pull/5161/diff?src=pr=tree#diff-YWlyZmxvdy93d3cvc2VjdXJpdHkucHk=)
 | `92.3% <ø> (ø)` | :arrow_up: |
   | 
[airflow/models/dag.py](https://codecov.io/gh/apache/airflow/pull/5161/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvZGFnLnB5)
 | `92.1% <0%> (-1.27%)` | :arrow_down: |
   | 
[airflow/models/taskinstance.py](https://codecov.io/gh/apache/airflow/pull/5161/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvdGFza2luc3RhbmNlLnB5)
 | `92.59% <0%> (+0.17%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5161?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5161?src=pr=footer). 
Last update 
[de1795c...78ab648](https://codecov.io/gh/apache/airflow/pull/5161?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] KevinYang21 commented on issue #5164: [AIRFLOW-4396] Provide a link to external Elasticsearch logs in UI.

2019-04-23 Thread GitBox
KevinYang21 commented on issue #5164: [AIRFLOW-4396] Provide a link to external 
Elasticsearch logs in UI.
URL: https://github.com/apache/airflow/pull/5164#issuecomment-485994653
 
 
   This is nice, thank you @kunalyogenshah ! Users using Airflow with 
Elasticsearch backend do likely want to utilize more those aggregation features 
Elasticsearch frontends, e.g. Kibana, provide. 
   
   Before looking into the code, my opinion on the UI change is that we make 
the layout similar to how we do the log downloading--we allow users to choose 
the try number of the task instances. Do you think that make sense?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3951) No easy way to get query_execution_id from AWSAthenaOperator

2019-04-23 Thread Rob Dinoff (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824570#comment-16824570
 ] 

Rob Dinoff commented on AIRFLOW-3951:
-

yes you may close this ... thanks,

Rob

On Tue, Apr 23, 2019 at 5:04 PM Ash Berlin-Taylor (JIRA) 



> No easy way to get query_execution_id  from AWSAthenaOperator
> -
>
> Key: AIRFLOW-3951
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3951
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: aws, contrib, operators
>Affects Versions: 1.10.2
>Reporter: Rob Dinoff
>Priority: Major
>
> There does not seem to be an easy way to get the query_execution_id from the 
> AWSAthensOperator.  This should be pushed as an xcom variable.  Need this so 
> query_execution_id can be used in the AthenaSensor.
> The following should be added to the end of the AWSAthenaOperator.execute
> {code:java}
> context["task_instance"].xcom_push(key="query_execution_id", 
> value=self.query_execution_id){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-4396) Add a link to Elasticsearch frontend for logs if available

2019-04-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824571#comment-16824571
 ] 

ASF GitHub Bot commented on AIRFLOW-4396:
-

kunalyogenshah commented on pull request #5164: [AIRFLOW-4396] Provide a link 
to external Elasticsearch logs in UI.
URL: https://github.com/apache/airflow/pull/5164
 
 
   ### Jira
   
   - [ ] My PR addresses the following:
 - https://issues.apache.org/jira/browse/AIRFLOW-4396
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   This PR adds a new button to the Airflow UI next to "View log" ("View log in 
ES"). It provides an external URL for visualizing the log, and renders the 
latest try for the Task, DAG and execution being looked up. This button is 
controlled by a new `elasticsearch_frontend` conf param, which provides the 
template URL for viewing the corresponding log. Screenshot below :
   
   ![Screen Shot 2019-04-23 at 2 54 15 
PM](https://user-images.githubusercontent.com/5075200/56618520-b4f77100-65d7-11e9-99a8-37c4c9577407.png)
   
   Documentation update is pending code review. Will update documentation after.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   Unit tested the UI and checked the console for errors in the JS. Button 
doesn't show up if the param is not defined.
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to a appropriate release
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add a link to Elasticsearch frontend for logs if available
> --
>
> Key: AIRFLOW-4396
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4396
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: webserver
>Reporter: Kunal Shah
>Assignee: Kunal Shah
>Priority: Minor
>
> If elasticsearch backend is being used for logs collection, then there may 
> exist a frontend to redirect for the client to view their logs. In those 
> cases, we can provide a link to the external UI which may allow visualizing 
> the logs in a more queryable fashion.
> The idea is to add a new conf param for elasticsearch for a template URL to 
> the frontend, which can formatted with the elasticsearch log ID template.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-4396) Add a link to Elasticsearch frontend for logs if available

2019-04-23 Thread Kunal Shah (JIRA)
Kunal Shah created AIRFLOW-4396:
---

 Summary: Add a link to Elasticsearch frontend for logs if available
 Key: AIRFLOW-4396
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4396
 Project: Apache Airflow
  Issue Type: Improvement
  Components: webserver
Reporter: Kunal Shah
Assignee: Kunal Shah


If elasticsearch backend is being used for logs collection, then there may 
exist a frontend to redirect for the client to view their logs. In those cases, 
we can provide a link to the external UI which may allow visualizing the logs 
in a more queryable fashion.

The idea is to add a new conf param for elasticsearch for a template URL to the 
frontend, which can formatted with the elasticsearch log ID template.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] kaxil edited a comment on issue #5143: [AIRFLOW-4204] Update super() calls to PY3

2019-04-23 Thread GitBox
kaxil edited a comment on issue #5143: [AIRFLOW-4204] Update super() calls to 
PY3
URL: https://github.com/apache/airflow/pull/5143#issuecomment-485983744
 
 
   We should have a separate PR for modifying nvd3 stuff - ideally with a new 
JIRA issue.
   
   It becomes easy to revert if need be.
   
   Or create sub-task under a parent JIRA 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io commented on issue #5163: WIP: [AIRFLOW-XXX] Remove hard-coded logging level

2019-04-23 Thread GitBox
codecov-io commented on issue #5163: WIP: [AIRFLOW-XXX] Remove hard-coded 
logging level
URL: https://github.com/apache/airflow/pull/5163#issuecomment-485984210
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5163?src=pr=h1) 
Report
   > Merging 
[#5163](https://codecov.io/gh/apache/airflow/pull/5163?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/de1795c8f1e65a5f4a848e88ef3848ba51aab449?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5163/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5163?src=pr=tree)
   
   ```diff
   @@   Coverage Diff   @@
   ##   master#5163   +/-   ##
   ===
 Coverage   78.44%   78.44%   
   ===
 Files 466  466   
 Lines   2973529735   
   ===
 Hits2332723327   
 Misses   6408 6408
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/5163?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/settings.py](https://codecov.io/gh/apache/airflow/pull/5163/diff?src=pr=tree#diff-YWlyZmxvdy9zZXR0aW5ncy5weQ==)
 | `83.87% <100%> (ø)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5163?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5163?src=pr=footer). 
Last update 
[de1795c...d7ed667](https://codecov.io/gh/apache/airflow/pull/5163?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] kaxil commented on issue #5143: [AIRFLOW-4204] Update super() calls to PY3

2019-04-23 Thread GitBox
kaxil commented on issue #5143: [AIRFLOW-4204] Update super() calls to PY3
URL: https://github.com/apache/airflow/pull/5143#issuecomment-485983744
 
 
   @potiuk - We don't use tool any longer (at least I don't since we moved to 
GitBox). Use **Squash and merge** from the UI as shown in the image below. 
Remove extra details from the commit message and resolve the Jira issue once 
merged.
   
   
![image](https://user-images.githubusercontent.com/8811558/56617494-ec860d80-6617-11e9-9b3d-d63681a2f57a.png)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] kaxil closed pull request #5163: WIP: [AIRFLOW-XXX] Remove hard-coded logging level

2019-04-23 Thread GitBox
kaxil closed pull request #5163: WIP: [AIRFLOW-XXX] Remove hard-coded logging 
level
URL: https://github.com/apache/airflow/pull/5163
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] BasPH edited a comment on issue #5159: [AIRFLOW-XXX] Attempt to remove flakeyness of LocalExecutor

2019-04-23 Thread GitBox
BasPH edited a comment on issue #5159: [AIRFLOW-XXX] Attempt to remove 
flakeyness of LocalExecutor
URL: https://github.com/apache/airflow/pull/5159#issuecomment-485977458
 
 
   I have very limited time this week but tried digging into it. At least I 
managed to reproduce locally:
   
   I set an optional breakpoint before `self.assertEqual(len(executor.running), 
0)`:
   ```python
   if len(executor.running) != 0:
   import ipdb
   ipdb.set_trace()
   self.assertEqual(len(executor.running), 0)
   ```
   
   Let this run a few times and after just a few attempts I hit the breakpoint:
   ```bash
   for n in {1..100}; do nosetests 
tests/executors/test_local_executor.py:LocalExecutorTest.test_execution_limited_parallelism
 -s; done
   ```
   
   For the record, at this point:
   
   - `len(executor.running)` is 1 at this point in time and `executor.running` 
contains `{'fail': True}`
   - `executor.queue.empty()` returns `True`
   - `executor.result_queue.get()` returns `('fail', 'failed')` (I would expect 
this thing to be empty because `sync()` is called which loops over the 
result_queue, clearing out all items...)
   
   Thinking out loud: my understanding is there's a race condition where the 
asserts are already being called before `executor.end()` finishes. I imagine 
there's something iffy about multiple workers and passing around the 
JoinableQueue to all workers, but unsure exactly what's causing the issue. 
Maybe `change_state()` might not finish fast enough, and the result_queue is 
already empty by then, and as a result the `end()` function is finished?
   
   Hope this helps so far :-)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3951) No easy way to get query_execution_id from AWSAthenaOperator

2019-04-23 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824525#comment-16824525
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3951:


"but no way to tell if it succeeded or failed" was a bug that has been fixed in 
1.10.4 - if the Query fails the Operator will now also fail.

Given that I think we can close this?

> No easy way to get query_execution_id  from AWSAthenaOperator
> -
>
> Key: AIRFLOW-3951
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3951
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: aws, contrib, operators
>Affects Versions: 1.10.2
>Reporter: Rob Dinoff
>Priority: Major
>
> There does not seem to be an easy way to get the query_execution_id from the 
> AWSAthensOperator.  This should be pushed as an xcom variable.  Need this so 
> query_execution_id can be used in the AthenaSensor.
> The following should be added to the end of the AWSAthenaOperator.execute
> {code:java}
> context["task_instance"].xcom_push(key="query_execution_id", 
> value=self.query_execution_id){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] kaxil opened a new pull request #5163: WIP: [AIRFLOW-XXX] Remove hard-coded logging level

2019-04-23 Thread GitBox
kaxil opened a new pull request #5163: WIP: [AIRFLOW-XXX] Remove hard-coded 
logging level
URL: https://github.com/apache/airflow/pull/5163
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to a appropriate release
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3951) No easy way to get query_execution_id from AWSAthenaOperator

2019-04-23 Thread Rob Dinoff (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824490#comment-16824490
 ] 

Rob Dinoff commented on AIRFLOW-3951:
-

There is no way to get the query status returned from poll_query_status or the 
execution id returned from the hook outside of the operator so was not sure how 
AthenaSensor would work.

But yes just using AWSAthenaOperator will work but no way to tell if it 
succeeded or failed

> No easy way to get query_execution_id  from AWSAthenaOperator
> -
>
> Key: AIRFLOW-3951
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3951
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: aws, contrib, operators
>Affects Versions: 1.10.2
>Reporter: Rob Dinoff
>Priority: Major
>
> There does not seem to be an easy way to get the query_execution_id from the 
> AWSAthensOperator.  This should be pushed as an xcom variable.  Need this so 
> query_execution_id can be used in the AthenaSensor.
> The following should be added to the end of the AWSAthenaOperator.execute
> {code:java}
> context["task_instance"].xcom_push(key="query_execution_id", 
> value=self.query_execution_id){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] potiuk commented on a change in pull request #5159: [AIRFLOW-XXX] Attempt to remove flakeyness of LocalExecutor

2019-04-23 Thread GitBox
potiuk commented on a change in pull request #5159: [AIRFLOW-XXX] Attempt to 
remove flakeyness of LocalExecutor
URL: https://github.com/apache/airflow/pull/5159#discussion_r277838641
 
 

 ##
 File path: tests/executors/test_local_executor.py
 ##
 @@ -49,11 +48,7 @@ def execution_parallelism(self, parallelism=0):
 
 executor.running['fail'] = True
 
 Review comment:
   I think this also should be moved before execute_async above. I also do not 
understand why we need the try/except 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io edited a comment on issue #5153: [AIRFLOW-XXX] Name default config_file param in KubernetesPodOperator docstring

2019-04-23 Thread GitBox
codecov-io edited a comment on issue #5153: [AIRFLOW-XXX] Name default 
config_file param in KubernetesPodOperator docstring
URL: https://github.com/apache/airflow/pull/5153#issuecomment-485495821
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5153?src=pr=h1) 
Report
   > Merging 
[#5153](https://codecov.io/gh/apache/airflow/pull/5153?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/de1795c8f1e65a5f4a848e88ef3848ba51aab449?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5153/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5153?src=pr=tree)
   
   ```diff
   @@   Coverage Diff   @@
   ##   master#5153   +/-   ##
   ===
 Coverage   78.44%   78.44%   
   ===
 Files 466  466   
 Lines   2973529735   
   ===
 Hits2332723327   
 Misses   6408 6408
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/5153?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[...rflow/contrib/operators/kubernetes\_pod\_operator.py](https://codecov.io/gh/apache/airflow/pull/5153/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZF9vcGVyYXRvci5weQ==)
 | `98.63% <ø> (ø)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5153?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5153?src=pr=footer). 
Last update 
[de1795c...469bbeb](https://codecov.io/gh/apache/airflow/pull/5153?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io edited a comment on issue #5153: [AIRFLOW-XXX] Name default config_file param in KubernetesPodOperator docstring

2019-04-23 Thread GitBox
codecov-io edited a comment on issue #5153: [AIRFLOW-XXX] Name default 
config_file param in KubernetesPodOperator docstring
URL: https://github.com/apache/airflow/pull/5153#issuecomment-485495821
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5153?src=pr=h1) 
Report
   > Merging 
[#5153](https://codecov.io/gh/apache/airflow/pull/5153?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/de1795c8f1e65a5f4a848e88ef3848ba51aab449?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5153/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5153?src=pr=tree)
   
   ```diff
   @@   Coverage Diff   @@
   ##   master#5153   +/-   ##
   ===
 Coverage   78.44%   78.44%   
   ===
 Files 466  466   
 Lines   2973529735   
   ===
 Hits2332723327   
 Misses   6408 6408
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/5153?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[...rflow/contrib/operators/kubernetes\_pod\_operator.py](https://codecov.io/gh/apache/airflow/pull/5153/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZF9vcGVyYXRvci5weQ==)
 | `98.63% <ø> (ø)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5153?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5153?src=pr=footer). 
Last update 
[de1795c...469bbeb](https://codecov.io/gh/apache/airflow/pull/5153?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] Fokko commented on issue #5158: [AIRFLOW-XXX] Speed up tests for PythonSensor by 60s

2019-04-23 Thread GitBox
Fokko commented on issue #5158: [AIRFLOW-XXX] Speed up tests for PythonSensor 
by 60s
URL: https://github.com/apache/airflow/pull/5158#issuecomment-485910763
 
 
   Love it


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] ashb commented on issue #5162: [AIRFLOW-4358] Speed up test_jobs by not running tasks

2019-04-23 Thread GitBox
ashb commented on issue #5162: [AIRFLOW-4358] Speed up test_jobs by not running 
tasks
URL: https://github.com/apache/airflow/pull/5162#issuecomment-485898431
 
 
   Looks like I might have some ordering in the tests to sort out.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feng-tao commented on issue #5162: [AIRFLOW-4358] Speed up test_jobs by not running tasks

2019-04-23 Thread GitBox
feng-tao commented on issue #5162: [AIRFLOW-4358] Speed up test_jobs by not 
running tasks
URL: https://github.com/apache/airflow/pull/5162#issuecomment-485891838
 
 
   Nice! It is always good to speed up the CI!


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-4358) Decouple running actual tasks from test_jobs.py

2019-04-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824325#comment-16824325
 ] 

ASF GitHub Bot commented on AIRFLOW-4358:
-

ashb commented on pull request #5162: [AIRFLOW-4358] Speed up test_jobs by not 
running tasks
URL: https://github.com/apache/airflow/pull/5162
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] https://issues.apache.org/jira/browse/AIRFLOW-4358
   
   ### Description
   
   - [ ] We don't care about the behaviour of the tasks (we test those 
elsewhere) - we just care that the task is "run" and yeilds a state, which we 
can all do more directly.
   
 On my laptop this takes the time for SchedulerJobTest from 133s to 35s, 
and reduces the BackfillJobTests from >5mins to mere seconds!
   
 I think this approach still tests the behaviour well enough
   
   ### Tests
   
   - [x] Changed tests to check events that are run via executor, but it won't 
run it
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to a appropriate release
   
   ### Code Quality
   
   - [x] Passes `flake8`
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Decouple running actual tasks from test_jobs.py
> ---
>
> Key: AIRFLOW-4358
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4358
> Project: Apache Airflow
>  Issue Type: Sub-task
>  Components: tests
>Reporter: Ash Berlin-Taylor
>Assignee: Ash Berlin-Taylor
>Priority: Major
>
> In tests/test_jobs.py we have a lot of slow, fragile tests that test the 
> behaviour of the "Jobs" - BackfillJob and SchedulerJob being the main two.
> Since we have tests that check that the executors will actually _run_ tasks 
> we should "stub" out the executor and replace the executor in those tests 
> with one that just records what tasks it was asked to run.
> This will hopefully speed up the tests and make them less fragile.
> A possible starting point:
> {code:title=test_jobs.diff}
> diff --git a/tests/test_jobs.py b/tests/test_jobs.py
> index e3ae8be1..aa919f90 100644
> --- a/tests/test_jobs.py
> +++ b/tests/test_jobs.py
> @@ -1384,6 +1384,14 @@ class LocalTaskJobTest(unittest.TestCase):
>  class SchedulerJobTest(unittest.TestCase):
> +class NullExecutor(airflow.executors.base_executor.BaseExecutor):
> +def heartbeat(self, *args, **kwargs):
> +# This would normally pop tasks of the queue and run them.
> +pass
> +
> +def end():
> +pass
> +
>  def setUp(self):
>  clear_db_runs()
>  clear_db_pools()
> @@ -1391,6 +1399,10 @@ class SchedulerJobTest(unittest.TestCase):
>  clear_db_sla_miss()
>  clear_db_errors()
> +# Speed up some tests by not running the tasks, just look at what we
> +# enqueue!
> +self.null_exec = self.NullExecutor()
> +
>  @classmethod
>  def setUpClass(cls):
>  cls.dagbag = DagBag()
> @@ -1409,8 +1421,7 @@ class SchedulerJobTest(unittest.TestCase):
>  def tearDownClass(cls):
>  cls.patcher.stop()
> -@staticmethod
> -def run_single_scheduler_loop_with_no_dags(dags_folder):
> +def run_single_scheduler_loop_with_no_dags(self, dags_folder):
>  """
>  Utility function that runs a single scheduler loop without actually
>  changing/scheduling any dags. This is useful to simulate the other 
> side effects of
> @@ -1421,6 +1432,7 @@ class SchedulerJobTest(unittest.TestCase):
>  :type directory: str
>  """
>  scheduler = SchedulerJob(
> +  

[GitHub] [airflow] ashb opened a new pull request #5162: [AIRFLOW-4358] Speed up test_jobs by not running tasks

2019-04-23 Thread GitBox
ashb opened a new pull request #5162: [AIRFLOW-4358] Speed up test_jobs by not 
running tasks
URL: https://github.com/apache/airflow/pull/5162
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] https://issues.apache.org/jira/browse/AIRFLOW-4358
   
   ### Description
   
   - [ ] We don't care about the behaviour of the tasks (we test those 
elsewhere) - we just care that the task is "run" and yeilds a state, which we 
can all do more directly.
   
 On my laptop this takes the time for SchedulerJobTest from 133s to 35s, 
and reduces the BackfillJobTests from >5mins to mere seconds!
   
 I think this approach still tests the behaviour well enough
   
   ### Tests
   
   - [x] Changed tests to check events that are run via executor, but it won't 
run it
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to a appropriate release
   
   ### Code Quality
   
   - [x] Passes `flake8`


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] ashb commented on a change in pull request #5159: [AIRFLOW-XXX] Attempt to remove flakeyness of LocalExecutor

2019-04-23 Thread GitBox
ashb commented on a change in pull request #5159: [AIRFLOW-XXX] Attempt to 
remove flakeyness of LocalExecutor
URL: https://github.com/apache/airflow/pull/5159#discussion_r23566
 
 

 ##
 File path: tests/executors/test_local_executor.py
 ##
 @@ -49,11 +48,7 @@ def execution_parallelism(self, parallelism=0):
 
 executor.running['fail'] = True
 
-if parallelism == 0:
 
 Review comment:
   The sentinel trick didn't immediately work, likely because we are using 
multiprocessing. I'll leave that as a different issue then.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Work started] (AIRFLOW-4358) Decouple running actual tasks from test_jobs.py

2019-04-23 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AIRFLOW-4358 started by Ash Berlin-Taylor.
--
> Decouple running actual tasks from test_jobs.py
> ---
>
> Key: AIRFLOW-4358
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4358
> Project: Apache Airflow
>  Issue Type: Sub-task
>  Components: tests
>Reporter: Ash Berlin-Taylor
>Assignee: Ash Berlin-Taylor
>Priority: Major
>
> In tests/test_jobs.py we have a lot of slow, fragile tests that test the 
> behaviour of the "Jobs" - BackfillJob and SchedulerJob being the main two.
> Since we have tests that check that the executors will actually _run_ tasks 
> we should "stub" out the executor and replace the executor in those tests 
> with one that just records what tasks it was asked to run.
> This will hopefully speed up the tests and make them less fragile.
> A possible starting point:
> {code:title=test_jobs.diff}
> diff --git a/tests/test_jobs.py b/tests/test_jobs.py
> index e3ae8be1..aa919f90 100644
> --- a/tests/test_jobs.py
> +++ b/tests/test_jobs.py
> @@ -1384,6 +1384,14 @@ class LocalTaskJobTest(unittest.TestCase):
>  class SchedulerJobTest(unittest.TestCase):
> +class NullExecutor(airflow.executors.base_executor.BaseExecutor):
> +def heartbeat(self, *args, **kwargs):
> +# This would normally pop tasks of the queue and run them.
> +pass
> +
> +def end():
> +pass
> +
>  def setUp(self):
>  clear_db_runs()
>  clear_db_pools()
> @@ -1391,6 +1399,10 @@ class SchedulerJobTest(unittest.TestCase):
>  clear_db_sla_miss()
>  clear_db_errors()
> +# Speed up some tests by not running the tasks, just look at what we
> +# enqueue!
> +self.null_exec = self.NullExecutor()
> +
>  @classmethod
>  def setUpClass(cls):
>  cls.dagbag = DagBag()
> @@ -1409,8 +1421,7 @@ class SchedulerJobTest(unittest.TestCase):
>  def tearDownClass(cls):
>  cls.patcher.stop()
> -@staticmethod
> -def run_single_scheduler_loop_with_no_dags(dags_folder):
> +def run_single_scheduler_loop_with_no_dags(self, dags_folder):
>  """
>  Utility function that runs a single scheduler loop without actually
>  changing/scheduling any dags. This is useful to simulate the other 
> side effects of
> @@ -1421,6 +1432,7 @@ class SchedulerJobTest(unittest.TestCase):
>  :type directory: str
>  """
>  scheduler = SchedulerJob(
> +executor=self.null_exec,
>  dag_id='this_dag_doesnt_exist',  # We don't want to actually run 
> anything
>  num_runs=1,
>  subdir=os.path.join(dags_folder))
> @@ -2483,6 +2495,7 @@ class SchedulerJobTest(unittest.TestCase):
>  self.assertTrue(dag.start_date > datetime.datetime.utcnow())
>  scheduler = SchedulerJob(dag_id,
> + executor=self.null_exec,
>   subdir=os.path.join(TEST_DAG_FOLDER, 
> 'test_scheduler_dags.py'),
>   num_runs=1)
>  scheduler.run()
> @@ -2491,6 +2504,7 @@ class SchedulerJobTest(unittest.TestCase):
>  self.assertEqual(
>  len(session.query(TI).filter(TI.dag_id == dag_id).all()), 0)
>  session.commit()
> +self.assertEqual({}, self.null_exec.queued_tasks)
>  # previously, running this backfill would kick off the Scheduler
>  # because it would take the most recent run and start from there
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-4358) Decouple running actual tasks from test_jobs.py

2019-04-23 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-4358:
---
Issue Type: Sub-task  (was: Test)
Parent: AIRFLOW-2488

> Decouple running actual tasks from test_jobs.py
> ---
>
> Key: AIRFLOW-4358
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4358
> Project: Apache Airflow
>  Issue Type: Sub-task
>  Components: tests
>Reporter: Ash Berlin-Taylor
>Priority: Major
>
> In tests/test_jobs.py we have a lot of slow, fragile tests that test the 
> behaviour of the "Jobs" - BackfillJob and SchedulerJob being the main two.
> Since we have tests that check that the executors will actually _run_ tasks 
> we should "stub" out the executor and replace the executor in those tests 
> with one that just records what tasks it was asked to run.
> This will hopefully speed up the tests and make them less fragile.
> A possible starting point:
> {code:title=test_jobs.diff}
> diff --git a/tests/test_jobs.py b/tests/test_jobs.py
> index e3ae8be1..aa919f90 100644
> --- a/tests/test_jobs.py
> +++ b/tests/test_jobs.py
> @@ -1384,6 +1384,14 @@ class LocalTaskJobTest(unittest.TestCase):
>  class SchedulerJobTest(unittest.TestCase):
> +class NullExecutor(airflow.executors.base_executor.BaseExecutor):
> +def heartbeat(self, *args, **kwargs):
> +# This would normally pop tasks of the queue and run them.
> +pass
> +
> +def end():
> +pass
> +
>  def setUp(self):
>  clear_db_runs()
>  clear_db_pools()
> @@ -1391,6 +1399,10 @@ class SchedulerJobTest(unittest.TestCase):
>  clear_db_sla_miss()
>  clear_db_errors()
> +# Speed up some tests by not running the tasks, just look at what we
> +# enqueue!
> +self.null_exec = self.NullExecutor()
> +
>  @classmethod
>  def setUpClass(cls):
>  cls.dagbag = DagBag()
> @@ -1409,8 +1421,7 @@ class SchedulerJobTest(unittest.TestCase):
>  def tearDownClass(cls):
>  cls.patcher.stop()
> -@staticmethod
> -def run_single_scheduler_loop_with_no_dags(dags_folder):
> +def run_single_scheduler_loop_with_no_dags(self, dags_folder):
>  """
>  Utility function that runs a single scheduler loop without actually
>  changing/scheduling any dags. This is useful to simulate the other 
> side effects of
> @@ -1421,6 +1432,7 @@ class SchedulerJobTest(unittest.TestCase):
>  :type directory: str
>  """
>  scheduler = SchedulerJob(
> +executor=self.null_exec,
>  dag_id='this_dag_doesnt_exist',  # We don't want to actually run 
> anything
>  num_runs=1,
>  subdir=os.path.join(dags_folder))
> @@ -2483,6 +2495,7 @@ class SchedulerJobTest(unittest.TestCase):
>  self.assertTrue(dag.start_date > datetime.datetime.utcnow())
>  scheduler = SchedulerJob(dag_id,
> + executor=self.null_exec,
>   subdir=os.path.join(TEST_DAG_FOLDER, 
> 'test_scheduler_dags.py'),
>   num_runs=1)
>  scheduler.run()
> @@ -2491,6 +2504,7 @@ class SchedulerJobTest(unittest.TestCase):
>  self.assertEqual(
>  len(session.query(TI).filter(TI.dag_id == dag_id).all()), 0)
>  session.commit()
> +self.assertEqual({}, self.null_exec.queued_tasks)
>  # previously, running this backfill would kick off the Scheduler
>  # because it would take the most recent run and start from there
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2488) Speed up the CI

2019-04-23 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-2488:
---
Description: 
Some of the tests take far to long:
{noformat}
[success] 10.48% tests.test_jobs.BackfillJobTest.test_backfill_examples: 
177.9813s
[success] 6.24% tests.test_jobs.BackfillJobTest.test_backfill_multi_dates: 
106.0157s
[success] 3.59% tests.transplant_class..C.test_backfill: 61.0503s
[success] 3.55% 
tests.contrib.sensors.test_python_sensor.PythonSensorTests.test_python_sensor_false:
 60.2099s
[success] 2.66% 
tests.test_impersonation.ImpersonationTest.test_impersonation_subdag: 45.1117s
[success] 1.86% 
tests.test_jobs.SchedulerJobTest.test_dagrun_root_after_dagrun_unfinished: 
31.5533s
[success] 1.77% 
tests.executors.test_celery_executor.CeleryExecutorTest.test_celery_integration_1_redis_redis_6379_0:
 30.0873s
[success] 1.48% 
tests.operators.test_subdag_operator.SubDagOperatorTests.test_subdag_deadlock: 
25.0747s
[success] 1.40% 
tests.test_jobs.SchedulerJobTest.test_new_import_error_replaces_old: 23.7093s
[success] 1.36% tests.test_jobs.SchedulerJobTest.test_dagrun_success: 23.1645s
[success] 1.33% 
tests.test_jobs.BackfillJobTest.test_cli_backfill_depends_on_past_backwards: 
22.5161s
[success] 1.30% 
tests.test_jobs.SchedulerJobTest.test_dagrun_deadlock_ignore_depends_on_past: 
22.1402s
[success] 1.30% 
tests.test_jobs.SchedulerJobTest.test_dagrun_deadlock_ignore_depends_on_past_advance_ex_date:
 22.1153s
[success] 1.28% 
tests.test_jobs.BackfillJobTest.test_cli_backfill_depends_on_past: 21.7944s
[success] 1.27% 
tests.test_jobs.SchedulerJobTest.test_dagrun_root_fail_unfinished: 21.5010s
[success] 1.25% 
tests.test_jobs.BackfillJobTest.test_run_ignores_all_dependencies: 21.3037s
[success] 1.25% tests.test_jobs.SchedulerJobTest.test_dagrun_root_fail: 21.1455s
[success] 1.20% 
tests.test_impersonation.ImpersonationTest.test_impersonation_custom: 20.3233s
[success] 1.19% 
tests.executors.test_celery_executor.CeleryExecutorTest.test_celery_integration_0_amqp_guest_guest_rabbitmq_5672:
 20.1648s
[success] 1.14% tests.test_jobs.BackfillJobTest.test_backfill_run_backwards: 
19.4043s
[success] 1.06% tests.cli.test_cli.TestCLI.test_next_execution: 17.9290s
{noformat}

(List updated April 23, 2019)

These tests should be refactored to make them faster.

  was:
Some of the tests take far to long:
{{[success] 14.51% tests.BackfillJobTest.test_backfill_examples: 259.7899s
[success] 4.29% tests.hooks.test_s3_hook.TestS3Hook.test_list_keys_paged: 
76.7397s
[success] 4.18% tests.hooks.test_s3_hook.TestS3Hook.test_list_prefixes_paged: 
74.8025s
[success] 2.58% tests.BackfillJobTest.test_backfill_multi_dates: 46.1206s
[success] 2.38% tests.SchedulerJobTest.test_scheduler_start_date: 42.5719s
[success] 1.95% tests.CoreTest.test_schedule_dag_no_end_date_up_to_today_only: 
34.9399s
[success] 1.71% tests.CliTests.test_backfill: 30.5825s
[success] 1.70% tests.SchedulerJobTest.test_scheduler_multiprocessing: 30.3580s
[success] 1.68% 
tests.contrib.hooks.test_gcp_mlengine_hook.TestMLEngineHook.test_create_mlengine_job:
 30.0609s
[success] 1.68% tests.SchedulerJobTest.test_new_import_error_replaces_old: 
29.9968s}}

These tests should be refactored to make them faster.


> Speed up the CI
> ---
>
> Key: AIRFLOW-2488
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2488
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Fokko Driesprong
>Priority: Major
>
> Some of the tests take far to long:
> {noformat}
> [success] 10.48% tests.test_jobs.BackfillJobTest.test_backfill_examples: 
> 177.9813s
> [success] 6.24% tests.test_jobs.BackfillJobTest.test_backfill_multi_dates: 
> 106.0157s
> [success] 3.59% tests.transplant_class..C.test_backfill: 61.0503s
> [success] 3.55% 
> tests.contrib.sensors.test_python_sensor.PythonSensorTests.test_python_sensor_false:
>  60.2099s
> [success] 2.66% 
> tests.test_impersonation.ImpersonationTest.test_impersonation_subdag: 45.1117s
> [success] 1.86% 
> tests.test_jobs.SchedulerJobTest.test_dagrun_root_after_dagrun_unfinished: 
> 31.5533s
> [success] 1.77% 
> tests.executors.test_celery_executor.CeleryExecutorTest.test_celery_integration_1_redis_redis_6379_0:
>  30.0873s
> [success] 1.48% 
> tests.operators.test_subdag_operator.SubDagOperatorTests.test_subdag_deadlock:
>  25.0747s
> [success] 1.40% 
> tests.test_jobs.SchedulerJobTest.test_new_import_error_replaces_old: 23.7093s
> [success] 1.36% tests.test_jobs.SchedulerJobTest.test_dagrun_success: 23.1645s
> [success] 1.33% 
> tests.test_jobs.BackfillJobTest.test_cli_backfill_depends_on_past_backwards: 
> 22.5161s
> [success] 1.30% 
> tests.test_jobs.SchedulerJobTest.test_dagrun_deadlock_ignore_depends_on_past: 
> 22.1402s
> [success] 1.30% 
> 

[jira] [Commented] (AIRFLOW-4395) Remove pickle_info view

2019-04-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824298#comment-16824298
 ] 

ASF GitHub Bot commented on AIRFLOW-4395:
-

mik-laj commented on pull request #5161: [AIRFLOW-4395] Remove pickle_info view
URL: https://github.com/apache/airflow/pull/5161
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-4395\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-4395
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   As for me, this view is not currently used in the project. 
   The view was introduced in
   
https://github.com/apache/airflow/commit/300951c23ac079e13101c6dd11d6e42b60912ebb
   but it does not contain references in UI.
   This is a development tool that can be added using a plugin when someone 
misses this feature.
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to a appropriate release
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Remove pickle_info view
> ---
>
> Key: AIRFLOW-4395
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4395
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Kamil Bregula
>Priority: Trivial
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] mik-laj opened a new pull request #5161: [AIRFLOW-4395] Remove pickle_info view

2019-04-23 Thread GitBox
mik-laj opened a new pull request #5161: [AIRFLOW-4395] Remove pickle_info view
URL: https://github.com/apache/airflow/pull/5161
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-4395\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-4395
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   As for me, this view is not currently used in the project. 
   The view was introduced in
   
https://github.com/apache/airflow/commit/300951c23ac079e13101c6dd11d6e42b60912ebb
   but it does not contain references in UI.
   This is a development tool that can be added using a plugin when someone 
misses this feature.
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to a appropriate release
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io commented on issue #5160: [AIRFLOW-4394] Don't double test behaviour of BackfillJob from CLI tests

2019-04-23 Thread GitBox
codecov-io commented on issue #5160: [AIRFLOW-4394] Don't double test behaviour 
of BackfillJob from CLI tests
URL: https://github.com/apache/airflow/pull/5160#issuecomment-485876624
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/5160?src=pr=h1) 
Report
   > Merging 
[#5160](https://codecov.io/gh/apache/airflow/pull/5160?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/37fdee3f5be3cdd7c416d4e13dff592bb757bda4?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/5160/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/5160?src=pr=tree)
   
   ```diff
   @@   Coverage Diff   @@
   ##   master#5160   +/-   ##
   ===
 Coverage   78.44%   78.44%   
   ===
 Files 466  466   
 Lines   2973529735   
   ===
 Hits2332723327   
 Misses   6408 6408
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/5160?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/5160?src=pr=footer). 
Last update 
[37fdee3...3fe35e3](https://codecov.io/gh/apache/airflow/pull/5160?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-4395) Remove pickle_info view

2019-04-23 Thread Kamil Bregula (JIRA)
Kamil Bregula created AIRFLOW-4395:
--

 Summary: Remove pickle_info view
 Key: AIRFLOW-4395
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4395
 Project: Apache Airflow
  Issue Type: New Feature
Reporter: Kamil Bregula






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [airflow] feluelle commented on a change in pull request #5127: [AIRFLOW-4343] Show warning in UI if scheduler is not running

2019-04-23 Thread GitBox
feluelle commented on a change in pull request #5127: [AIRFLOW-4343] Show 
warning in UI if scheduler is not running
URL: https://github.com/apache/airflow/pull/5127#discussion_r277758349
 
 

 ##
 File path: airflow/config_templates/default_airflow.cfg
 ##
 @@ -464,11 +464,6 @@ dag_dir_list_interval = 300
 # How often should stats be printed to the logs
 print_stats_interval = 30
 
-# If the last scheduler heartbeat happened more than 
scheduler_health_check_threshold ago (in seconds),
-# scheduler is considered unhealthy.
-# This is used by the health check in the "/health" endpoint
-scheduler_health_check_threshold = 30
 
 Review comment:
   > Does anyone thing we need an UPDATING note for this?
   
   I think it is not necessary, but please don't forget to update the docs 
accordingly to your changes: 
https://github.com/apache/airflow/blob/master/docs/howto/check-health.rst
   
   > Should I replace it with something that controls the grace_multiplier for 
SchedulerJob? (or keep this setting and have SchedulerJob use a fixed timeout 
instead) @feluelle @XD-DENG thoughts?
   
   I prefer a fixed value over a multiplier, because it is clearer to me. But I 
would also be fine with having a grace_multiplier as long as it is documented 
properly :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


  1   2   3   >