[GitHub] [airflow] bolkedebruin commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

2019-11-21 Thread GitBox
bolkedebruin commented on issue #6627: [AIRFLOW-5931] Use os.fork when 
appropriate to speed up task execution.
URL: https://github.com/apache/airflow/pull/6627#issuecomment-557421110
 
 
   @potiuk Gotcha. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] bolkedebruin commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

2019-11-21 Thread GitBox
bolkedebruin commented on issue #6627: [AIRFLOW-5931] Use os.fork when 
appropriate to speed up task execution.
URL: https://github.com/apache/airflow/pull/6627#issuecomment-557420983
 
 
   @dimberman It's an inherently insecure design to have tasks directly access 
the DB, and it has been a pain point in Airflow for a long time. The executor is 
fine, but not tasks.
   
   There are many ways to do this, but what makes the k8s executor so special 
that the tasks it executes require access to the DB, apart from the current 
paradigm?




[GitHub] [airflow] badfeanor opened a new pull request #6631: Add coding to fix Cyrillic output

2019-11-21 Thread GitBox
badfeanor opened a new pull request #6631: Add coding to fix Cyrillic output
URL: https://github.com/apache/airflow/pull/6631
 
 
   ### Jira
   
   - "\[AIRFLOW-6039\] Not correctly displayed Cyrillic in the DAGs logs"
   - https://issues.apache.org/jira/browse/AIRFLOW-6039

   ### Description
   - Add encoding to fix Cyrillic output




[jira] [Resolved] (AIRFLOW-6039) Not correctly displayed Cyrillic in the DAGs logs

2019-11-21 Thread Alexey Oskin (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Oskin resolved AIRFLOW-6039.
---
Resolution: Fixed

In airflow/utils/log/file_task_handler.py, add the following at line 128:

response.encoding = "utf-8"
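The effect of that one-liner can be sketched in isolation (a stdlib-only illustration; in the real handler it is applied to the `requests` response used to fetch remote task logs):

```python
# UTF-8 bytes of a Cyrillic log line, as a log server would return them.
log_bytes = "Задача успешно выполнена".encode("utf-8")

# Without a charset in the Content-Type header, requests falls back to
# ISO-8859-1 (per RFC 2616), turning each multi-byte character into mojibake.
mangled = log_bytes.decode("iso-8859-1")

# Forcing response.encoding = "utf-8" makes the same bytes decode correctly.
fixed = log_bytes.decode("utf-8")
```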

> Not correctly displayed Cyrillic in the DAGs logs
> -
>
> Key: AIRFLOW-6039
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6039
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: utils
>Affects Versions: 1.10.4
>Reporter: Alexey Oskin
>Assignee: Alexey Oskin
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (AIRFLOW-6039) Not correctly displayed Cyrillic in the DAGs logs

2019-11-21 Thread Alexey Oskin (Jira)
Alexey Oskin created AIRFLOW-6039:
-

 Summary: Not correctly displayed Cyrillic in the DAGs logs
 Key: AIRFLOW-6039
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6039
 Project: Apache Airflow
  Issue Type: Bug
  Components: utils
Affects Versions: 1.10.4
Reporter: Alexey Oskin
Assignee: Alexey Oskin








[jira] [Work started] (AIRFLOW-6039) Not correctly displayed Cyrillic in the DAGs logs

2019-11-21 Thread Alexey Oskin (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AIRFLOW-6039 started by Alexey Oskin.
-
> Not correctly displayed Cyrillic in the DAGs logs
> -
>
> Key: AIRFLOW-6039
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6039
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: utils
>Affects Versions: 1.10.4
>Reporter: Alexey Oskin
>Assignee: Alexey Oskin
>Priority: Major
>






[jira] [Created] (AIRFLOW-6038) AWS DataSync example dags

2019-11-21 Thread Bjorn Olsen (Jira)
Bjorn Olsen created AIRFLOW-6038:


 Summary: AWS DataSync example dags
 Key: AIRFLOW-6038
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6038
 Project: Apache Airflow
  Issue Type: Improvement
  Components: aws, examples
Affects Versions: 1.10.6
Reporter: Bjorn Olsen
Assignee: Bjorn Olsen


Add example_dags for AWS DataSync operators





[GitHub] [airflow] codecov-io edited a comment on issue #6630: [AIRFLOW-5947] Make the json backend pluggable for DAG Serialization

2019-11-21 Thread GitBox
codecov-io edited a comment on issue #6630: [AIRFLOW-5947] Make the json 
backend pluggable for DAG Serialization
URL: https://github.com/apache/airflow/pull/6630#issuecomment-557337449
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6630?src=pr=h1) 
Report
   > Merging 
[#6630](https://codecov.io/gh/apache/airflow/pull/6630?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/da086661f7207ece0b598233b988387233c24d4a?src=pr=desc)
 will **decrease** coverage by `0.3%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6630/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/6630?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #6630      +/-   ##
   ==========================================
   - Coverage   83.79%   83.48%   -0.31%
   ==========================================
     Files         669      669
     Lines       37548    37552       +4
   ==========================================
   - Hits        31464    31351     -113
   - Misses       6084     6201     +117
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6630?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/serialization/serialized\_dag.py](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree#diff-YWlyZmxvdy9zZXJpYWxpemF0aW9uL3NlcmlhbGl6ZWRfZGFnLnB5)
 | `96% <ø> (ø)` | :arrow_up: |
   | 
[airflow/settings.py](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree#diff-YWlyZmxvdy9zZXR0aW5ncy5weQ==)
 | `88.96% <100%> (+0.15%)` | :arrow_up: |
   | 
[airflow/serialization/serialization.py](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree#diff-YWlyZmxvdy9zZXJpYWxpemF0aW9uL3NlcmlhbGl6YXRpb24ucHk=)
 | `82.57% <100%> (ø)` | :arrow_up: |
   | 
[airflow/models/serialized\_dag.py](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvc2VyaWFsaXplZF9kYWcucHk=)
 | `85.71% <100%> (+0.34%)` | :arrow_up: |
   | 
[airflow/serialization/json\_schema.py](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree#diff-YWlyZmxvdy9zZXJpYWxpemF0aW9uL2pzb25fc2NoZW1hLnB5)
 | `81.81% <100%> (ø)` | :arrow_up: |
   | 
[airflow/operators/mysql\_operator.py](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvbXlzcWxfb3BlcmF0b3IucHk=)
 | `100% <0%> (ø)` | :arrow_up: |
   | 
[airflow/operators/mysql\_to\_hive.py](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvbXlzcWxfdG9faGl2ZS5weQ==)
 | `100% <0%> (ø)` | :arrow_up: |
   | 
[airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==)
 | `44.44% <0%> (-55.56%)` | :arrow_down: |
   | 
[airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==)
 | `52.94% <0%> (-47.06%)` | :arrow_down: |
   | 
[airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==)
 | `45.25% <0%> (-46.72%)` | :arrow_down: |
   | ... and [9 
more](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree-more) 
| |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6630?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6630?src=pr=footer). 
Last update 
[da08666...5dda74e](https://codecov.io/gh/apache/airflow/pull/6630?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[GitHub] [airflow] codecov-io commented on issue #6630: [AIRFLOW-5947] Make the json backend pluggable for DAG Serialization

2019-11-21 Thread GitBox
codecov-io commented on issue #6630: [AIRFLOW-5947] Make the json backend 
pluggable for DAG Serialization
URL: https://github.com/apache/airflow/pull/6630#issuecomment-557337449
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6630?src=pr=h1) 
Report
   > Merging 
[#6630](https://codecov.io/gh/apache/airflow/pull/6630?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/da086661f7207ece0b598233b988387233c24d4a?src=pr=desc)
 will **decrease** coverage by `0.53%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6630/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/6630?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #6630      +/-   ##
   ==========================================
   - Coverage   83.79%   83.26%   -0.54%
   ==========================================
     Files         669      669
     Lines       37548    37552       +4
   ==========================================
   - Hits        31464    31267     -197
   - Misses       6084     6285     +201
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6630?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/serialization/serialized\_dag.py](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree#diff-YWlyZmxvdy9zZXJpYWxpemF0aW9uL3NlcmlhbGl6ZWRfZGFnLnB5)
 | `96% <ø> (ø)` | :arrow_up: |
   | 
[airflow/settings.py](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree#diff-YWlyZmxvdy9zZXR0aW5ncy5weQ==)
 | `88.96% <100%> (+0.15%)` | :arrow_up: |
   | 
[airflow/serialization/serialization.py](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree#diff-YWlyZmxvdy9zZXJpYWxpemF0aW9uL3NlcmlhbGl6YXRpb24ucHk=)
 | `82.57% <100%> (ø)` | :arrow_up: |
   | 
[airflow/models/serialized\_dag.py](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvc2VyaWFsaXplZF9kYWcucHk=)
 | `85.71% <100%> (+0.34%)` | :arrow_up: |
   | 
[airflow/serialization/json\_schema.py](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree#diff-YWlyZmxvdy9zZXJpYWxpemF0aW9uL2pzb25fc2NoZW1hLnB5)
 | `81.81% <100%> (ø)` | :arrow_up: |
   | 
[airflow/operators/mysql\_operator.py](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvbXlzcWxfb3BlcmF0b3IucHk=)
 | `0% <0%> (-100%)` | :arrow_down: |
   | 
[airflow/operators/mysql\_to\_hive.py](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvbXlzcWxfdG9faGl2ZS5weQ==)
 | `0% <0%> (-100%)` | :arrow_down: |
   | 
[airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==)
 | `44.44% <0%> (-55.56%)` | :arrow_down: |
   | 
[airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==)
 | `52.94% <0%> (-47.06%)` | :arrow_down: |
   | 
[airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==)
 | `45.25% <0%> (-46.72%)` | :arrow_down: |
   | ... and [9 
more](https://codecov.io/gh/apache/airflow/pull/6630/diff?src=pr=tree-more) 
| |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6630?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6630?src=pr=footer). 
Last update 
[da08666...5dda74e](https://codecov.io/gh/apache/airflow/pull/6630?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[jira] [Commented] (AIRFLOW-4560) Tez queue parameter passed by mapred_queue is incorrect

2019-11-21 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979701#comment-16979701
 ] 

ASF subversion and git services commented on AIRFLOW-4560:
--

Commit d055d4c547fefe601a7de2572f87fab81fe20847 in airflow's branch 
refs/heads/v1-10-test from aliceabe
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=d055d4c ]

[AIRFLOW-4560] Fix Tez queue parameter name in mapred_queue (#5315)


(cherry picked from commit 03ee1c32feac3bad86b0b398125f61999c6b7948)


> Tez queue parameter passed by mapred_queue is incorrect
> ---
>
> Key: AIRFLOW-4560
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4560
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks
>Reporter: Alice Berard
>Priority: Major
> Fix For: 1.10.7
>
>
> The parameter is currently {{tez.job.queue.name}}, see code: 
> [https://github.com/apache/airflow/blob/355bd56282e6a684c5c060953e9948ba2260aa37/airflow/hooks/hive_hooks.py#L214]
> But it should be {{tez.queue.name}}, see here: 
> [https://tez.apache.org/releases/0.9.2/tez-api-javadocs/configs/TezConfiguration.html]
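The fix amounts to changing one configuration key when the Hive CLI command is assembled. A sketch of the idea (the function name and the MapReduce key here are illustrative, not the exact hook code):

```python
def queue_hiveconfs(mapred_queue):
    """Build the -hiveconf flags that route a Hive job to a YARN queue."""
    return [
        "-hiveconf", "mapreduce.job.queuename={}".format(mapred_queue),
        # Before the fix this key was "tez.job.queue.name", which Tez
        # silently ignores; "tez.queue.name" is the key Tez actually reads.
        "-hiveconf", "tez.queue.name={}".format(mapred_queue),
    ]

flags = queue_hiveconfs("etl")
```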





[jira] [Resolved] (AIRFLOW-4560) Tez queue parameter passed by mapred_queue is incorrect

2019-11-21 Thread Kaxil Naik (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik resolved AIRFLOW-4560.
-
Fix Version/s: 1.10.7
   Resolution: Fixed

> Tez queue parameter passed by mapred_queue is incorrect
> ---
>
> Key: AIRFLOW-4560
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4560
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks
>Reporter: Alice Berard
>Priority: Major
> Fix For: 1.10.7
>
>
> The parameter is currently {{tez.job.queue.name}}, see code: 
> [https://github.com/apache/airflow/blob/355bd56282e6a684c5c060953e9948ba2260aa37/airflow/hooks/hive_hooks.py#L214]
> But it should be {{tez.queue.name}}, see here: 
> [https://tez.apache.org/releases/0.9.2/tez-api-javadocs/configs/TezConfiguration.html]





[jira] [Commented] (AIRFLOW-5947) Make the json backend pluggable for DAG Serialization

2019-11-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979692#comment-16979692
 ] 

ASF GitHub Bot commented on AIRFLOW-5947:
-

kaxil commented on pull request #6630: [AIRFLOW-5947] Make the json backend 
pluggable for DAG Serialization
URL: https://github.com/apache/airflow/pull/6630
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. 
 - https://issues.apache.org/jira/browse/AIRFLOW-5947
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   Allow users the option to choose the JSON library of their choice for DAG 
Serialization.
   
   
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what they do
 - If you implement backwards-incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
 



> Make the json backend pluggable for DAG Serialization
> -
>
> Key: AIRFLOW-5947
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5947
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core, scheduler
>Affects Versions: 2.0.0, 1.10.7
>Reporter: Kaxil Naik
>Assignee: Kaxil Naik
>Priority: Major
>
> Allow users the option to choose the JSON library of their choice for DAG 
> Serialization.





[GitHub] [airflow] kaxil opened a new pull request #6630: [AIRFLOW-5947] Make the json backend pluggable for DAG Serialization

2019-11-21 Thread GitBox
kaxil opened a new pull request #6630: [AIRFLOW-5947] Make the json backend 
pluggable for DAG Serialization
URL: https://github.com/apache/airflow/pull/6630
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. 
 - https://issues.apache.org/jira/browse/AIRFLOW-5947
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   Allow users the option to choose the JSON library of their choice for DAG 
Serialization.
   
   
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what they do
 - If you implement backwards-incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   




[jira] [Created] (AIRFLOW-6037) Remove default connections from Airflow

2019-11-21 Thread Sergio Kef (Jira)
Sergio Kef created AIRFLOW-6037:
---

 Summary: Remove default connections from Airflow
 Key: AIRFLOW-6037
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6037
 Project: Apache Airflow
  Issue Type: Improvement
  Components: configuration
Affects Versions: 1.10.6
Reporter: Sergio Kef
Assignee: Sergio Kef


Currently, initdb creates a bunch of example connections. Those are as helpful 
as the example dags are: you might want to see them on your first build, but 
probably not in production.

This ticket is to bundle both example dags and example connections under the 
existing config setting {{load_examples}}. When False, no example connections 
are created during initdb.
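Under this proposal, the existing flag would cover both kinds of examples; a production airflow.cfg would opt out of example DAGs and example connections at once (a sketch of the intended behavior, not yet implemented):

```ini
[core]
# When False, initdb would create neither the example DAGs
# nor the example connections.
load_examples = False
```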





[GitHub] [airflow] stale[bot] commented on issue #6277: [AIRFLOW-2971] Add health check CLI for scheduler

2019-11-21 Thread GitBox
stale[bot] commented on issue #6277: [AIRFLOW-2971] Add health check CLI for 
scheduler
URL: https://github.com/apache/airflow/pull/6277#issuecomment-557306504
 
 
   This issue has been automatically marked as stale because it has not had 
recent activity. It will be closed if no further activity occurs. Thank you for 
your contributions.
   




[GitHub] [airflow] potiuk edited a comment on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

2019-11-21 Thread GitBox
potiuk edited a comment on issue #6627: [AIRFLOW-5931] Use os.fork when 
appropriate to speed up task execution.
URL: https://github.com/apache/airflow/pull/6627#issuecomment-557303290
 
 
   Currently the change manually checks if we "can fork". When using 
multiprocessing.Process, the check is done by the library: it uses "fork" by 
default on Linux (and on MacOS for Python < 3.8), but "spawn" on Windows and on 
MacOS for Python >= 3.8. So maybe we can simplify the code a bit here.




[GitHub] [airflow] potiuk commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

2019-11-21 Thread GitBox
potiuk commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to 
speed up task execution.
URL: https://github.com/apache/airflow/pull/6627#issuecomment-557303290
 
 
   Currently the change manually checks if we "can fork" - by using 
multiprocessing.Process, the check is done by the library and it uses "fork" by 
default on Linux, and Python < 3.8 on MacOS, but it uses "exec" on Windows / 
MacOS for python >=  3.8.




[GitHub] [airflow] potiuk edited a comment on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

2019-11-21 Thread GitBox
potiuk edited a comment on issue #6627: [AIRFLOW-5931] Use os.fork when 
appropriate to speed up task execution.
URL: https://github.com/apache/airflow/pull/6627#issuecomment-557303290
 
 
   Currently the change manually checks if we "can fork". When using 
multiprocessing.Process, the check is done by the library and it uses "fork" by 
default on Linux, and Python < 3.8 on MacOS, but it uses "exec" on Windows / 
MacOS for python >=  3.8.




[GitHub] [airflow] potiuk edited a comment on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

2019-11-21 Thread GitBox
potiuk edited a comment on issue #6627: [AIRFLOW-5931] Use os.fork when 
appropriate to speed up task execution.
URL: https://github.com/apache/airflow/pull/6627#issuecomment-557303290
 
 
   Currently the change manually checks if we "can fork". When using 
multiprocessing.Process, the check is done by the library and it uses "fork" by 
default on Linux, and Python < 3.8 on MacOS, but it uses "spawn" on Windows / 
MacOS for python >=  3.8.




[GitHub] [airflow] potiuk commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

2019-11-21 Thread GitBox
potiuk commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to 
speed up task execution.
URL: https://github.com/apache/airflow/pull/6627#issuecomment-557302459
 
 
   @dimberman @bolkedebruin : just to clarify -> `psutil.Process` indeed 
reloaded everything (it just started a new Python interpreter). So changing to 
fork makes sense - I fully agree. What @mik-laj asked about was using 
`multiprocessing.Process` instead of os.fork.




[GitHub] [airflow] dimberman commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

2019-11-21 Thread GitBox
dimberman commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate 
to speed up task execution.
URL: https://github.com/apache/airflow/pull/6627#issuecomment-557301508
 
 
   @bolkedebruin the understanding I have is that when you spawn a totally new 
process, you are reinitializing the interpreter, re-loading all dependencies, 
and restarting Airflow. Using os.fork directly allows you to keep the same 
memory state (at least that's what I understood from @ashb).
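The memory-state point can be demonstrated with a minimal POSIX sketch (illustrative only; `EXPENSIVE_STATE` stands in for Airflow's imported dependencies, parsed settings, and plugins):

```python
import os
import time

# Stand-in for state that is expensive to rebuild: in Airflow, imported
# dependencies, parsed configuration, loaded plugins, etc.
EXPENSIVE_STATE = {"loaded_at": time.time(), "plugins": ["a", "b"]}

def run_in_fork():
    pid = os.fork()
    if pid:
        # Parent: reap the child and return its exit code.
        _, status = os.waitpid(pid, 0)
        return os.WEXITSTATUS(status)
    # Child: the copy-on-write address space already contains
    # EXPENSIVE_STATE -- nothing is re-imported or re-initialized.
    os._exit(0 if EXPENSIVE_STATE["plugins"] == ["a", "b"] else 1)

rc = run_in_fork()
```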
   




[GitHub] [airflow] dimberman commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

2019-11-21 Thread GitBox
dimberman commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate 
to speed up task execution.
URL: https://github.com/apache/airflow/pull/6627#issuecomment-557301118
 
 
   @bolkedebruin having the tasks access the DB is a central part of the 
k8s executor, unless we want to set up some sort of messaging system/message 
queue.




[GitHub] [airflow] bolkedebruin commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

2019-11-21 Thread GitBox
bolkedebruin commented on issue #6627: [AIRFLOW-5931] Use os.fork when 
appropriate to speed up task execution.
URL: https://github.com/apache/airflow/pull/6627#issuecomment-557300880
 
 
   BTW I agree with @potiuk that it's a bit strange that we get this speed-up, 
as both are using fork(). What is the trade-off?




[GitHub] [airflow] bolkedebruin commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

2019-11-21 Thread GitBox
bolkedebruin commented on issue #6627: [AIRFLOW-5931] Use os.fork when 
appropriate to speed up task execution.
URL: https://github.com/apache/airflow/pull/6627#issuecomment-557300526
 
 
   Aside from this performance enhancement, I think a much bigger gain can 
be had by removing the raw task execution. I.e. the process is now as follows: 
Executor -> Task -> Raw task, which doesn't make sense. It should just be 
Executor -> Task. Ideally the task would then signal its state to the executor 
rather than setting its own state, which is both nondeterministic (in case the 
task crashes) and insecure (it requires Airflow DB access by the task, which is 
available to all operators).




[GitHub] [airflow] potiuk commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

2019-11-21 Thread GitBox
potiuk commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to 
speed up task execution.
URL: https://github.com/apache/airflow/pull/6627#issuecomment-557299451
 
 
   But @dimberman -> multiprocessing also uses os.fork() underneath in fork 
mode (the default on Linux). I have my reservations about using multiprocessing, 
mostly because people do not realise that it actually uses fork (and we plan to 
use it anyway, so no difference).
   
   Using multiprocessing might be a more portable way if we consider running it 
in different environments. Note that in Python 3.8 the default start method for 
a new process on MacOS is spawn, because forking there can cause crashes: 
threads are not safe for forking, and some MacOS system libraries run threads. 
So using multiprocessing.Process will be slower on MacOS in 3.8, but it won't 
crash.
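The platform-dependent start-method behavior described above can be checked directly (a sketch; run on POSIX, where the "fork" start method is available):

```python
import multiprocessing as mp

def noop():
    # Trivial task body: with "fork" the child inherits the parent's memory
    # image; with "spawn" a fresh interpreter would re-import this module.
    pass

# "fork" is the default start method on Linux; "spawn" on Windows, and on
# macOS since Python 3.8 (forking is unsafe with threaded system libraries).
methods = mp.get_all_start_methods()

# Request "fork" explicitly (POSIX only). The "spawn" method would
# additionally require an `if __name__ == "__main__":` guard.
ctx = mp.get_context("fork")
p = ctx.Process(target=noop)
p.start()
p.join()
exit_code = p.exitcode
```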




[GitHub] [airflow] bolkedebruin commented on a change in pull request #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

2019-11-21 Thread GitBox
bolkedebruin commented on a change in pull request #6627: [AIRFLOW-5931] Use 
os.fork when appropriate to speed up task execution.
URL: https://github.com/apache/airflow/pull/6627#discussion_r349343728
 
 

 ##
 File path: airflow/task/task_runner/standard_task_runner.py
 ##
 @@ -17,28 +17,89 @@
  # specific language governing permissions and limitations
  # under the License.
  
 +import os
 +
  import psutil
 +from setproctitle import setproctitle
  
  from airflow.task.task_runner.base_task_runner import BaseTaskRunner
  from airflow.utils.helpers import reap_process_group
  
 +CAN_FORK = hasattr(os, 'fork')
 +
  
  class StandardTaskRunner(BaseTaskRunner):
      """
      Runs the raw Airflow task by invoking through the Bash shell.
      """
      def __init__(self, local_task_job):
          super().__init__(local_task_job)
 +        self._rc = None
  
      def start(self):
 -        self.process = self.run_command()
 +        if CAN_FORK and not self.run_as_user:
 +            self.process = self._start_by_fork()
 +        else:
 +            self.process = self._start_by_exec()
 +
 +    def _start_by_exec(self):
 +        subprocess = self.run_command()
 +        return psutil.Process(subprocess.pid)
 +
 +    def _start_by_fork(self):
 +        pid = os.fork()
 +        if pid:
 +            self.log.info("Started process %d to run task", pid)
 +            return psutil.Process(pid)
 +        else:
 +            from airflow.bin.cli import get_parser
 +            import signal
 +            import airflow.settings as settings
 +
 +            signal.signal(signal.SIGINT, signal.SIG_DFL)
 +            signal.signal(signal.SIGTERM, signal.SIG_DFL)
 +            # Start a new process group
 +            os.setpgid(0, 0)
 +
 +            # Force a new SQLAlchemy session. We can't share open DB handles between processes.
 +            settings.engine.pool.dispose()
 +            settings.engine.pool.recreate()
 +
 +            parser = get_parser()
 +            args = parser.parse_args(self._command[1:])
 +            setproctitle(
 +                "airflow task runner: {0.dag_id} {0.task_id} {0.execution_date} {0.job_id}".format(args)
 +            )
 +            try:
 +                args.func(args)
 +                os._exit(0)
 +            except Exception:
 +                os._exit(1)
 +
 +    def return_code(self, timeout=0):
 +        # We call this multiple times, but we can only wait on the process once
 +        if self._rc is not None or not self.process:
 +            return self._rc
 +
 +        try:
 +            self._rc = self.process.wait(timeout=timeout)
 +            self.process = None
 +        except psutil.TimeoutExpired:
 +            pass
  
 -    def return_code(self):
 -        return self.process.poll()
 +        return self._rc
  
      def terminate(self):
 -        if self.process and psutil.pid_exists(self.process.pid):
 -            reap_process_group(self.process.pid, self.log)
 +        if self.process:
 +            if self.process.is_running():
 +                reap_process_group(self.process.pid, self.log)
 
 Review comment:
   NIT: process groups are non-deterministic. We would be better off using cgroups
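   The caching-and-polling pattern in `return_code` above can be exercised in 
isolation with stdlib calls only. This sketch is illustrative (not Airflow 
code) and assumes a POSIX platform, since it uses `os.fork`:

```python
import os
import time

def return_code(pid, cached):
    # Mirror the runner's caching: once a return code has been observed,
    # never wait on the same process again.
    if cached is not None or pid is None:
        return cached, None
    wpid, status = os.waitpid(pid, os.WNOHANG)  # non-blocking poll
    if wpid == 0:
        return None, pid                        # still running
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status), None
    return -1, None                             # killed by a signal

pid = os.fork()
if pid == 0:
    os._exit(7)  # child exits with a known, arbitrary code

rc = None
while rc is None:
    rc, pid = return_code(pid, rc)
    time.sleep(0.01)
print(rc)  # 7
```

   The key design point, as in the PR, is that a process can only be waited on 
once, so the result must be cached for repeated callers.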




[GitHub] [airflow] bolkedebruin commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

2019-11-21 Thread GitBox
bolkedebruin commented on issue #6627: [AIRFLOW-5931] Use os.fork when 
appropriate to speed up task execution.
URL: https://github.com/apache/airflow/pull/6627#issuecomment-557293221
 
 
   Can you explain the security of running the tasks and the different 
processes involved?
   
   Afaik it does Executor -> Task -> Rawtask. So with your change it would now 
do "Executor -> Task -> Rawtask -> New Process"? I.e. it hasn't become part of 
the executor, I assume (that would be a no-go). Just verifying.
   
   




[GitHub] [airflow] dimberman commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

2019-11-21 Thread GitBox
dimberman commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate 
to speed up task execution.
URL: https://github.com/apache/airflow/pull/6627#issuecomment-557283842
 
 
   @mik-laj it's because multiprocessing.Process has to re-parse all 
dependencies/DAGs. It causes a lot of slowdown.
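   The re-parsing cost is exactly what fork avoids: a forked child gets a 
copy-on-write view of the parent's already-imported module table, so nothing is 
parsed again. A minimal sketch (POSIX only, illustrative rather than Airflow 
code):

```python
import os
import sys

# Snapshot the parsed-module table before forking; the child's copy-on-write
# memory means these modules are available without any re-parsing.
parent_modules = set(sys.modules)

pid = os.fork()
if pid == 0:
    # Child: the inherited module table matches the parent's snapshot.
    ok = set(sys.modules) == parent_modules
    os._exit(0 if ok else 1)

_, status = os.waitpid(pid, 0)
exit_code = os.WEXITSTATUS(status)
print(exit_code)  # 0
```

   With spawn, by contrast, the child starts a fresh interpreter and must 
re-import everything, which is where the slowdown for DAG-heavy processes 
comes from.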




[GitHub] [airflow] mik-laj commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

2019-11-21 Thread GitBox
mik-laj commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate 
to speed up task execution.
URL: https://github.com/apache/airflow/pull/6627#issuecomment-557275475
 
 
   This week I was thinking about it. While I was working on the CLI, I saw 
this problem and it was on my list.
   
   I think this would be better done using multiprocessing.Process. Is there a 
reason why you did it this way?




[GitHub] [airflow] codecov-io edited a comment on issue #6622: [AIRLFOW-6024] Do not use the logger in CLI

2019-11-21 Thread GitBox
codecov-io edited a comment on issue #6622: [AIRLFOW-6024] Do not use the 
logger in CLI
URL: https://github.com/apache/airflow/pull/6622#issuecomment-556849902
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6622?src=pr=h1) 
Report
   > Merging 
[#6622](https://codecov.io/gh/apache/airflow/pull/6622?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/a1e2f863526973b17892ec31caf09eded95c1cd2?src=pr=desc)
 will **decrease** coverage by `0.32%`.
   > The diff coverage is `90.9%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6622/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/6622?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #6622      +/-   ##
   ==========================================
   - Coverage    83.8%   83.47%   -0.33%
   ==========================================
     Files         669      669
     Lines       37564    37535      -29
   ==========================================
   - Hits        31479    31334     -145
   - Misses       6085     6201     +116
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6622?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/cli/commands/pool\_command.py](https://codecov.io/gh/apache/airflow/pull/6622/diff?src=pr=tree#diff-YWlyZmxvdy9jbGkvY29tbWFuZHMvcG9vbF9jb21tYW5kLnB5)
 | `89.06% <100%> (-1.08%)` | :arrow_down: |
   | 
[airflow/cli/commands/worker\_command.py](https://codecov.io/gh/apache/airflow/pull/6622/diff?src=pr=tree#diff-YWlyZmxvdy9jbGkvY29tbWFuZHMvd29ya2VyX2NvbW1hbmQucHk=)
 | `38.09% <100%> (-1.44%)` | :arrow_down: |
   | 
[airflow/cli/commands/dag\_command.py](https://codecov.io/gh/apache/airflow/pull/6622/diff?src=pr=tree#diff-YWlyZmxvdy9jbGkvY29tbWFuZHMvZGFnX2NvbW1hbmQucHk=)
 | `81.11% <100%> (+0.84%)` | :arrow_up: |
   | 
[airflow/cli/commands/task\_command.py](https://codecov.io/gh/apache/airflow/pull/6622/diff?src=pr=tree#diff-YWlyZmxvdy9jbGkvY29tbWFuZHMvdGFza19jb21tYW5kLnB5)
 | `62.01% <50%> (ø)` | :arrow_up: |
   | 
[airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/6622/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==)
 | `44.44% <0%> (-55.56%)` | :arrow_down: |
   | 
[airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/6622/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==)
 | `52.94% <0%> (-47.06%)` | :arrow_down: |
   | 
[airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/6622/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==)
 | `45.25% <0%> (-46.72%)` | :arrow_down: |
   | 
[airflow/kubernetes/refresh\_config.py](https://codecov.io/gh/apache/airflow/pull/6622/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3JlZnJlc2hfY29uZmlnLnB5)
 | `50.98% <0%> (-23.53%)` | :arrow_down: |
   | 
[...rflow/contrib/operators/kubernetes\_pod\_operator.py](https://codecov.io/gh/apache/airflow/pull/6622/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZF9vcGVyYXRvci5weQ==)
 | `77.14% <0%> (-21.43%)` | :arrow_down: |
   | 
[airflow/configuration.py](https://codecov.io/gh/apache/airflow/pull/6622/diff?src=pr=tree#diff-YWlyZmxvdy9jb25maWd1cmF0aW9uLnB5)
 | `89.13% <0%> (-3.63%)` | :arrow_down: |
   | ... and [7 
more](https://codecov.io/gh/apache/airflow/pull/6622/diff?src=pr=tree-more) 
| |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6622?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6622?src=pr=footer). 
Last update 
[a1e2f86...d51fca2](https://codecov.io/gh/apache/airflow/pull/6622?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[jira] [Updated] (AIRFLOW-6036) Improve code formatting when using two context managers

2019-11-21 Thread Kamil Bregula (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Bregula updated AIRFLOW-6036:
---
Description: [~bolke] commented on the formatting of the code in one 
comment. 
https://github.com/apache/airflow/commit/da086661f7207ece0b598233b988387233c24d4a#r36077183

> Improve code formatting when using two context managers
> ---
>
> Key: AIRFLOW-6036
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6036
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.10.6
>Reporter: Kamil Bregula
>Assignee: Kamil Bregula
>Priority: Trivial
>
> [~bolke] commented on the formatting of the code in one comment. 
> https://github.com/apache/airflow/commit/da086661f7207ece0b598233b988387233c24d4a#r36077183



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
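The formatting question in AIRFLOW-6036 typically comes down to nesting two 
`with` statements versus combining them in one statement. Both forms below are 
equivalent; the `StringIO` objects are used purely for illustration:

```python
from io import StringIO

# Nested form: one context manager per line, deeper indentation.
with StringIO() as a:
    with StringIO() as b:
        b.write("x")

# Combined form: both managers in one statement, flatter code.
with StringIO() as a, StringIO() as b:
    n = b.write("x")

print(n)  # 1
```

The combined form has been valid since Python 3.1; parenthesized grouping of 
the managers additionally requires Python 3.10+.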


[GitHub] [airflow] kaxil edited a comment on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability

2019-11-21 Thread GitBox
kaxil edited a comment on issue #5743: [AIRFLOW-5088][AIP-24] Persisting 
serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#issuecomment-530789821
 
 
   Few things to do:
   
   - [ ] Add 
https://github.com/astronomer/airflow/commit/baf12f626e6d56dfde735faaed71b2c30cb4befb
 and add tests for it
   - [x] Reduce the info we store in Serialized DAGs by removing all the 
default arguments that are not overridden by users. Eg `owner` in DAG & Task 
etc. This will help reduce blob size as well as reduce the time spent in 
`_deserialise` method. 
   - [ ] Agree / disagree on using 
https://pypi.org/project/SQLAlchemy-JSONField/ instead of our own code. It also 
has the nice option of specifying the json library, as compared to providing 
that info in the `create_engine.json_serializer` and 
`create_engine.json_deserializer` parameters in 
https://docs.sqlalchemy.org/en/13/core/type_basics.html#sqlalchemy.types.JSON
   - [x] Test serialisation code with zipped DAG files
   
   cc @coufon @ashb 
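   The engine-level hooks mentioned in the last checklist item can be wired as 
below. This is a minimal sketch assuming SQLAlchemy 1.4+ and an in-memory 
SQLite database; the table and class names are hypothetical, not Airflow's:

```python
import json

from sqlalchemy import JSON, Column, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base

# create_engine accepts engine-wide JSON (de)serializers; this is the
# alternative to SQLAlchemy-JSONField's per-column library option.
engine = create_engine(
    "sqlite://",
    json_serializer=json.dumps,
    json_deserializer=json.loads,
)

Base = declarative_base()

class SerializedDagSketch(Base):  # hypothetical table for this sketch
    __tablename__ = "serialized_dag_sketch"
    id = Column(Integer, primary_key=True)
    data = Column(JSON)

Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(SerializedDagSketch(id=1, data={"dag_id": "example"}))
    session.commit()
    dag_id = session.get(SerializedDagSketch, 1).data["dag_id"]
print(dag_id)  # example
```

   Swapping `json.dumps`/`json.loads` for a faster library (e.g. ujson) is 
then a one-line change at the engine rather than per column.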
   




[jira] [Assigned] (AIRFLOW-6036) Improve code formatting when using two context managers

2019-11-21 Thread Kamil Bregula (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Bregula reassigned AIRFLOW-6036:
--

Assignee: Kamil Bregula

> Improve code formatting when using two context managers
> ---
>
> Key: AIRFLOW-6036
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6036
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.10.6
>Reporter: Kamil Bregula
>Assignee: Kamil Bregula
>Priority: Trivial
>






[jira] [Created] (AIRFLOW-6036) Improve code formatting when using two context managers

2019-11-21 Thread Kamil Bregula (Jira)
Kamil Bregula created AIRFLOW-6036:
--

 Summary: Improve code formatting when using two context managers
 Key: AIRFLOW-6036
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6036
 Project: Apache Airflow
  Issue Type: Bug
  Components: core
Affects Versions: 1.10.6
Reporter: Kamil Bregula








[GitHub] [airflow] codecov-io edited a comment on issue #6629: [AIRFLOW-6035] Remove command method in TaskInstance

2019-11-21 Thread GitBox
codecov-io edited a comment on issue #6629: [AIRFLOW-6035] Remove command 
method in TaskInstance
URL: https://github.com/apache/airflow/pull/6629#issuecomment-557264642
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6629?src=pr=h1) 
Report
   > Merging 
[#6629](https://codecov.io/gh/apache/airflow/pull/6629?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/da086661f7207ece0b598233b988387233c24d4a?src=pr=desc)
 will **decrease** coverage by `0.31%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6629/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/6629?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #6629      +/-   ##
   ==========================================
   - Coverage   83.79%   83.47%   -0.32%
   ==========================================
     Files         669      669
     Lines       37548    37546       -2
   ==========================================
   - Hits        31464    31343     -121
   - Misses       6084     6203     +119
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6629?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/models/taskinstance.py](https://codecov.io/gh/apache/airflow/pull/6629/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvdGFza2luc3RhbmNlLnB5)
 | `93.46% <ø> (+0.14%)` | :arrow_up: |
   | 
[airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/6629/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==)
 | `44.44% <0%> (-55.56%)` | :arrow_down: |
   | 
[airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/6629/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==)
 | `52.94% <0%> (-47.06%)` | :arrow_down: |
   | 
[airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/6629/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==)
 | `45.25% <0%> (-46.72%)` | :arrow_down: |
   | 
[airflow/kubernetes/refresh\_config.py](https://codecov.io/gh/apache/airflow/pull/6629/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3JlZnJlc2hfY29uZmlnLnB5)
 | `50.98% <0%> (-23.53%)` | :arrow_down: |
   | 
[...rflow/contrib/operators/kubernetes\_pod\_operator.py](https://codecov.io/gh/apache/airflow/pull/6629/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZF9vcGVyYXRvci5weQ==)
 | `77.14% <0%> (-21.43%)` | :arrow_down: |
   | 
[airflow/configuration.py](https://codecov.io/gh/apache/airflow/pull/6629/diff?src=pr=tree#diff-YWlyZmxvdy9jb25maWd1cmF0aW9uLnB5)
 | `89.13% <0%> (-3.63%)` | :arrow_down: |
   | 
[airflow/jobs/local\_task\_job.py](https://codecov.io/gh/apache/airflow/pull/6629/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzL2xvY2FsX3Rhc2tfam9iLnB5)
 | `85% <0%> (-1.25%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6629?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6629?src=pr=footer). 
Last update 
[da08666...d280024](https://codecov.io/gh/apache/airflow/pull/6629?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   








[GitHub] [airflow] codecov-io commented on issue #6628: [AIRFLOW-6034] Fix Deprecation Elasticsearch configs on Master

2019-11-21 Thread GitBox
codecov-io commented on issue #6628: [AIRFLOW-6034] Fix Deprecation 
Elasticsearch configs on Master
URL: https://github.com/apache/airflow/pull/6628#issuecomment-557243873
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6628?src=pr=h1) 
Report
   > Merging 
[#6628](https://codecov.io/gh/apache/airflow/pull/6628?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/1d8b8cfcbc0d1d81758e42fcf7a789efd797c931?src=pr=desc)
 will **decrease** coverage by `0.54%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6628/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/6628?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #6628      +/-   ##
   ==========================================
   - Coverage    83.8%   83.25%   -0.55%
   ==========================================
     Files         669      669
     Lines       37564    37548      -16
   ==========================================
   - Hits        31479    31261     -218
   - Misses       6085     6287     +202
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6628?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/configuration.py](https://codecov.io/gh/apache/airflow/pull/6628/diff?src=pr=tree#diff-YWlyZmxvdy9jb25maWd1cmF0aW9uLnB5)
 | `89.13% <ø> (-3.63%)` | :arrow_down: |
   | 
[airflow/operators/mysql\_operator.py](https://codecov.io/gh/apache/airflow/pull/6628/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvbXlzcWxfb3BlcmF0b3IucHk=)
 | `0% <0%> (-100%)` | :arrow_down: |
   | 
[airflow/operators/mysql\_to\_hive.py](https://codecov.io/gh/apache/airflow/pull/6628/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvbXlzcWxfdG9faGl2ZS5weQ==)
 | `0% <0%> (-100%)` | :arrow_down: |
   | 
[airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/6628/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==)
 | `44.44% <0%> (-55.56%)` | :arrow_down: |
   | 
[airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/6628/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==)
 | `52.94% <0%> (-47.06%)` | :arrow_down: |
   | 
[airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/6628/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==)
 | `45.25% <0%> (-46.72%)` | :arrow_down: |
   | 
[airflow/kubernetes/refresh\_config.py](https://codecov.io/gh/apache/airflow/pull/6628/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3JlZnJlc2hfY29uZmlnLnB5)
 | `50.98% <0%> (-23.53%)` | :arrow_down: |
   | 
[...rflow/contrib/operators/kubernetes\_pod\_operator.py](https://codecov.io/gh/apache/airflow/pull/6628/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZF9vcGVyYXRvci5weQ==)
 | `77.14% <0%> (-21.43%)` | :arrow_down: |
   | 
[airflow/utils/sqlalchemy.py](https://codecov.io/gh/apache/airflow/pull/6628/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zcWxhbGNoZW15LnB5)
 | `86.44% <0%> (-6.78%)` | :arrow_down: |
   | ... and [8 
more](https://codecov.io/gh/apache/airflow/pull/6628/diff?src=pr=tree-more) 
| |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6628?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6628?src=pr=footer). 
Last update 
[1d8b8cf...db94e5a](https://codecov.io/gh/apache/airflow/pull/6628?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[jira] [Assigned] (AIRFLOW-5959) Change import paths for "jira" modules

2019-11-21 Thread Rich Dean (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rich Dean reassigned AIRFLOW-5959:
--

Assignee: Rich Dean

> Change import paths for "jira" modules
> --
>
> Key: AIRFLOW-5959
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5959
> Project: Apache Airflow
>  Issue Type: Sub-task
>  Components: operators
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Assignee: Rich Dean
>Priority: Major
>






[jira] [Commented] (AIRFLOW-5958) Change import paths for "mysql" modules

2019-11-21 Thread Rich Dean (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979565#comment-16979565
 ] 

Rich Dean commented on AIRFLOW-5958:


[~higrys] - the list on the AIP page has hive_to_mysql and presto_to_mysql - 
both are in {{airflow/operators}} - and there is also bigquery_to_mysql in 
there. In contrib we have vertica_to_mysql (and the shell/forwarder for the old 
BigQuery operator).
So - should we migrate just the two on the list, or bring the core bigquery 
and the contrib vertica ones too?


> Change import paths for "mysql" modules
> ---
>
> Key: AIRFLOW-5958
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5958
> Project: Apache Airflow
>  Issue Type: Sub-task
>  Components: operators
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Assignee: Rich Dean
>Priority: Major
>






[GitHub] [airflow] dimberman commented on a change in pull request #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

2019-11-21 Thread GitBox
dimberman commented on a change in pull request #6627: [AIRFLOW-5931] Use 
os.fork when appropriate to speed up task execution.
URL: https://github.com/apache/airflow/pull/6627#discussion_r349279807
 
 

 ##
 File path: airflow/task/task_runner/standard_task_runner.py
 ##
 @@ -17,28 +17,89 @@
  # specific language governing permissions and limitations
  # under the License.
  
 +import os
 +
  import psutil
 +from setproctitle import setproctitle
  
  from airflow.task.task_runner.base_task_runner import BaseTaskRunner
  from airflow.utils.helpers import reap_process_group
  
 +CAN_FORK = hasattr(os, 'fork')
 +
  
  class StandardTaskRunner(BaseTaskRunner):
      """
      Runs the raw Airflow task by invoking through the Bash shell.
      """
      def __init__(self, local_task_job):
          super().__init__(local_task_job)
 +        self._rc = None
  
      def start(self):
 -        self.process = self.run_command()
 +        if CAN_FORK and not self.run_as_user:
 +            self.process = self._start_by_fork()
 +        else:
 +            self.process = self._start_by_exec()
 +
 +    def _start_by_exec(self):
 +        subprocess = self.run_command()
 +        return psutil.Process(subprocess.pid)
 +
 +    def _start_by_fork(self):
 +        pid = os.fork()
 +        if pid:
 +            self.log.info("Started process %d to run task", pid)
 +            return psutil.Process(pid)
 +        else:
 +            from airflow.bin.cli import CLIFactory
 +            import signal
 +            import airflow.settings as settings
 +
 +            signal.signal(signal.SIGINT, signal.SIG_DFL)
 +            signal.signal(signal.SIGTERM, signal.SIG_DFL)
 +            # Start a new process group
 +            os.setpgid(0, 0)
 +
 +            # Force a new SQLAlchemy session. We can't share open DB handles between processes.
 +            settings.engine.pool.dispose()
 +            settings.engine.pool.recreate()
 +
 +            parser = CLIFactory.get_parser()
 +            args = parser.parse_args(self._command[1:])
 +            setproctitle(
 +                "airflow task runner: {0.dag_id} {0.task_id} {0.execution_date} {0.job_id}".format(args)
 +            )
 +            try:
 +                args.func(args)
 +                os._exit(0)
 +            except Exception:
 +                os._exit(1)
 +
 +    def return_code(self, timeout=0):
 +        # We call this multiple times, but we can only wait on the process once
 +        if self._rc is not None or not self.process:
 +            return self._rc
 +
 +        try:
 +            self._rc = self.process.wait(timeout=timeout)
 +            self.process = None
 +        except psutil.TimeoutExpired:
 +            pass
  
 -    def return_code(self):
 -        return self.process.poll()
 +        return self._rc
  
      def terminate(self):
 -        if self.process and psutil.pid_exists(self.process.pid):
 -            reap_process_group(self.process.pid, self.log)
 +        if self.process:
 +            if self.process.is_running():
 
 Review comment:
   can you break this into a function?
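   The forking path in the diff above can be reduced to a small standalone sketch (my own illustration, not Airflow's actual runner): the parent keeps the child's pid, while the child resets inherited signal handlers, detaches into its own process group, runs the work, and exits via `os._exit` so that no parent-level cleanup runs twice.

```python
import os
import signal


def run_in_fork(fn):
    """Minimal sketch of the fork-based task-runner pattern.

    The parent returns the child's pid to track it; the child resets
    signal handlers, starts its own process group, runs the work, and
    exits with os._exit so atexit/cleanup handlers inherited from the
    parent never run in the child.
    """
    pid = os.fork()
    if pid:
        return pid  # parent: caller waits on / monitors this pid
    # child process from here on
    signal.signal(signal.SIGINT, signal.SIG_DFL)
    signal.signal(signal.SIGTERM, signal.SIG_DFL)
    os.setpgid(0, 0)  # become leader of a new process group
    try:
        fn()
        os._exit(0)
    except Exception:
        os._exit(1)


if __name__ == "__main__":
    pid = run_in_fork(lambda: None)
    _, status = os.waitpid(pid, 0)
    print(os.WEXITSTATUS(status))  # 0 on success
```

   Disposing of shared resources (like the SQLAlchemy pool in the diff) belongs in the child before doing any work, since open DB handles must not be shared across a fork.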


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Closed] (AIRFLOW-5935) Logs are not sent to S3 by S3TaskHandler

2019-11-21 Thread Aidar Mamytov (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aidar Mamytov closed AIRFLOW-5935.
--
Resolution: Resolved

> Logs are not sent to S3 by S3TaskHandler
> -
>
> Key: AIRFLOW-5935
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5935
> Project: Apache Airflow
>  Issue Type: Task
>  Components: logging
>Affects Versions: 1.10.6
>Reporter: Aidar Mamytov
>Assignee: Aidar Mamytov
>Priority: Major
>
> When exactly is S3TaskHandler supposed to have its *s3_write* or *close* 
> method called? The logs are written locally but are not appearing in S3. I've 
> pdb-debugged my custom log_config.py file and Airflow reads configs 
> successfully and loads *S3TaskHandler* configs successfully. I also 
> pdb-debugged and checked another thing with print statements - whenever I try 
> to open "_View Log_" for any task in the admin dashboard, it definitely calls 
> *S3TaskHandler.s3_read* and *S3TaskHandler.s3_log_exists* and successfully 
> connects to S3. I also checked if Airflow is able to connect to S3 in a Python 
> console: imported *S3Hook* and *S3TaskHandler* and tried to connect to S3, 
> read objects and write new ones to my bucket - all good.
> The problem is that although Airflow is able to connect to the S3 bucket and 
> interact with it with read/write operations, it just does not upload logs to 
> it. What might I be doing wrong, or what do I not understand about Airflow 
> remote logging?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-5935) Logs are not sent to S3 by S3TaskHandler

2019-11-21 Thread Aidar Mamytov (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979555#comment-16979555
 ] 

Aidar Mamytov commented on AIRFLOW-5935:


Figured out the problem. We're using Django in the project and its logging is 
somehow interfering with Airflow's ability to write logs remotely. Closing 
this issue.
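A toy handler (my own illustration, not the real S3TaskHandler, which buffers to a local file and uploads in `close()`) shows why logs can appear locally yet never remotely: the upload only happens if `close()` is actually invoked, which a competing logging setup can prevent.

```python
import logging


class UploadOnCloseHandler(logging.Handler):
    """Illustrative stand-in for a remote log handler.

    Records are buffered "locally" in emit(); they are only "uploaded"
    when close() runs, e.g. via logging.shutdown() at interpreter exit.
    If another framework's logging config removes or never closes the
    handler, the upload step simply never happens.
    """

    def __init__(self):
        super().__init__()
        self.buffer = []    # stands in for the local log file
        self.uploaded = []  # stands in for the remote bucket

    def emit(self, record):
        self.buffer.append(self.format(record))  # "write locally"

    def close(self):
        self.uploaded.extend(self.buffer)  # "upload to remote storage"
        self.buffer = []
        super().close()


log = logging.getLogger("demo")
handler = UploadOnCloseHandler()
log.addHandler(handler)
log.warning("task finished")
assert handler.uploaded == []  # written locally, nothing remote yet
handler.close()
assert handler.uploaded == ["task finished"]
```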

> Logs are not sent to S3 by S3TaskHandler
> -
>
> Key: AIRFLOW-5935
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5935
> Project: Apache Airflow
>  Issue Type: Task
>  Components: logging
>Affects Versions: 1.10.6
>Reporter: Aidar Mamytov
>Assignee: Aidar Mamytov
>Priority: Major
>
> When exactly is S3TaskHandler supposed to have its *s3_write* or *close* 
> method called? The logs are written locally but are not appearing in S3. I've 
> pdb-debugged my custom log_config.py file and Airflow reads configs 
> successfully and loads *S3TaskHandler* configs successfully. I also 
> pdb-debugged and checked another thing with print statements - whenever I try 
> to open "_View Log_" for any task in the admin dashboard, it definitely calls 
> *S3TaskHandler.s3_read* and *S3TaskHandler.s3_log_exists* and successfully 
> connects to S3. I also checked if Airflow is able to connect to S3 in a Python 
> console: imported *S3Hook* and *S3TaskHandler* and tried to connect to S3, 
> read objects and write new ones to my bucket - all good.
> The problem is that although Airflow is able to connect to the S3 bucket and 
> interact with it with read/write operations, it just does not upload logs to 
> it. What might I be doing wrong, or what do I not understand about Airflow 
> remote logging?





[jira] [Assigned] (AIRFLOW-5958) Change import paths for "mysql" modules

2019-11-21 Thread Rich Dean (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rich Dean reassigned AIRFLOW-5958:
--

Assignee: Rich Dean

> Change import paths for "mysql" modules
> ---
>
> Key: AIRFLOW-5958
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5958
> Project: Apache Airflow
>  Issue Type: Sub-task
>  Components: operators
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Assignee: Rich Dean
>Priority: Major
>






[GitHub] [airflow] bolkedebruin commented on a change in pull request #6564: [AIRFLOW-5911] Simplify lineage API and improve robustness

2019-11-21 Thread GitBox
bolkedebruin commented on a change in pull request #6564: [AIRFLOW-5911] 
Simplify lineage API and improve robustness
URL: https://github.com/apache/airflow/pull/6564#discussion_r349272236
 
 

 ##
 File path: airflow/lineage/entities.py
 ##
 @@ -0,0 +1,97 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from typing import Any, Dict, List, Optional
+
+import attr
+
+
+@attr.s(auto_attribs=True)
+class Dataset:
+    pass
+
+
+@attr.s(auto_attribs=True)
+class File(Dataset):
+    url: str = attr.ib()
+    type_hint: Optional[str] = None
+
+    meta_schema: str = __name__ + '.File'
+
+
+@attr.s(auto_attribs=True, kw_only=True)
+class User(Dataset):
+    email: str = attr.ib()
+    first_name: Optional[str] = None
+    last_name: Optional[str] = None
+    meta_schema: str = __name__ + '.User'
+
+
+@attr.s(auto_attribs=True, kw_only=True)
+class Tag(Dataset):
+    tag_name: str = attr.ib()
+    meta_schema: str = __name__ + '.Tag'
+
+
+@attr.s(auto_attribs=True, kw_only=True)
+class Column(Dataset):
+    name: str = attr.ib()
+    description: Optional[str] = None
+    data_type: str = attr.ib()
+    tags: List[Tag] = []
+    meta_schema: str = __name__ + '.Column'
+
+
+# this is a temporary hack to satisfy mypy. Once
+# https://github.com/python/mypy/issues/6136 is resolved, use
+# `attr.converters.default_if_none(default=False)`
+def default_if_none(arg: Optional[bool]) -> bool:
+    return arg or False
+
+
+@attr.s(auto_attribs=True, kw_only=True)
+class Table(Dataset):
 Review comment:
   Airflow supports lightweight lineage. Lineage is the information that 
links Dataset A to Dataset B. Sometimes it is important to give the lineage 
system some hints about what it is dealing with, e.g. File (Dataset A) -> 
Table (Dataset B).
   
   I deliberately made the basic lineage info resemble Amundsen's model very 
closely. This allows databuilder to enrich it, or you could implement an 
Amundsen proxy in Airflow that pushes the information as soon as it becomes 
available. A proxy would also allow you to enrich the information for usage 
*in* Airflow, so you could pick up extra information and take actions based on 
it within the operator, or branch on it based on type, for example.
   
   Yes, it's very easy to extend. See the papermill_operator for a Notebook 
example.
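   The entity style in the diff is plain attrs classes, so they can be exercised standalone. A minimal sketch (class shapes copied from the diff above; the bucket path is invented for illustration):

```python
from typing import Optional

import attr


# Entity shapes copied from the PR diff; this is a standalone sketch,
# not imported from Airflow.
@attr.s(auto_attribs=True)
class Dataset:
    pass


@attr.s(auto_attribs=True)
class File(Dataset):
    url: str = attr.ib()
    type_hint: Optional[str] = None
    meta_schema: str = __name__ + '.File'


f = File(url="s3://bucket/raw/events.csv", type_hint="csv")

# attrs converts entities to/from plain dicts, which is what makes them
# easy to ship over XCom or to an external system such as Amundsen:
assert attr.asdict(f)["url"] == "s3://bucket/raw/events.csv"
```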




[GitHub] [airflow] mik-laj commented on a change in pull request #6622: [AIRFLOW-6024] Do not use the logger in CLI

2019-11-21 Thread GitBox
mik-laj commented on a change in pull request #6622: [AIRFLOW-6024] Do not use 
the logger in CLI
URL: https://github.com/apache/airflow/pull/6622#discussion_r349270843
 
 

 ##
 File path: airflow/cli/commands/dag_command.py
 ##
 @@ -104,16 +104,15 @@ def dag_trigger(args):
     :return:
     """
     api_client = get_current_api_client()
-    log = LoggingMixin().log
     try:
         message = api_client.trigger_dag(dag_id=args.dag_id,
                                          run_id=args.run_id,
                                          conf=args.conf,
                                          execution_date=args.exec_date)
+        print(message)
     except OSError as err:
-        log.error(err)
+        print(str(err))
         raise AirflowException(err)
 
 Review comment:
   Fixed. Raising exceptions is also not the best solution, but that is a 
separate problem.
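   The pattern the diff moves toward can be sketched standalone (`dag_trigger_sketch` and its callbacks are hypothetical, not Airflow code): user-facing CLI output goes straight to stdout/stderr, and failure is signalled through the exit code rather than a logger.

```python
import sys


def dag_trigger_sketch(trigger, dag_id):
    """Hypothetical stand-in for a CLI handler: print the result for
    the user, send errors to stderr, and exit non-zero on failure."""
    try:
        message = trigger(dag_id)
    except OSError as err:
        print(str(err), file=sys.stderr)
        raise SystemExit(1)
    print(message)


# Success path prints the client's message to stdout:
dag_trigger_sketch(lambda d: "Created dag run for " + d, "example_dag")
```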




[GitHub] [airflow] mik-laj commented on a change in pull request #6622: [AIRFLOW-6024] Do not use the logger in CLI

2019-11-21 Thread GitBox
mik-laj commented on a change in pull request #6622: [AIRFLOW-6024] Do not use 
the logger in CLI
URL: https://github.com/apache/airflow/pull/6622#discussion_r349270863
 
 

 ##
 File path: airflow/cli/commands/dag_command.py
 ##
 @@ -125,16 +124,15 @@ def dag_delete(args):
     :return:
     """
     api_client = get_current_api_client()
-    log = LoggingMixin().log
     if args.yes or input(
             "This will drop all existing records related to the specified DAG. "
             "Proceed? (y/n)").upper() == "Y":
         try:
             message = api_client.delete_dag(dag_id=args.dag_id)
+            print(message)
         except OSError as err:
-            log.error(err)
+            print(str(err))
             raise AirflowException(err)
 
 Review comment:
   Fixed. Raising exceptions is also not the best solution, but that is a 
separate problem.




[GitHub] [airflow] bolkedebruin commented on a change in pull request #6564: [AIRFLOW-5911] Simplify lineage API and improve robustness

2019-11-21 Thread GitBox
bolkedebruin commented on a change in pull request #6564: [AIRFLOW-5911] 
Simplify lineage API and improve robustness
URL: https://github.com/apache/airflow/pull/6564#discussion_r349268865
 
 

 ##
 File path: airflow/lineage/__init__.py
 ##
 @@ -92,51 +116,46 @@ def prepare_lineage(func):
 * "auto" -> picks up any outlets from direct upstream tasks that have 
outlets defined, as such that
   if A -> B -> C and B does not have outlets but A does, these are 
provided as inlets.
 * "list of task_ids" -> picks up outlets from the upstream task_ids
-* "list of datasets" -> manually defined list of DataSet
+* "list of datasets" -> manually defined list of data
 
 """
     @wraps(func)
     def wrapper(self, context, *args, **kwargs):
         self.log.debug("Preparing lineage inlets and outlets")
 
-        task_ids = set(self._inlets['task_ids']).intersection(  # pylint:disable=protected-access
-            self.get_flat_relative_ids(upstream=True)
-        )
-        if task_ids:
-            inlets = self.xcom_pull(context,
-                                    task_ids=task_ids,
-                                    dag_id=self.dag_id,
-                                    key=PIPELINE_OUTLETS)
-            inlets = [item for sublist in inlets if sublist for item in sublist]
-            inlets = [DataSet.map_type(i['typeName'])(data=i['attributes'])
-                      for i in inlets]
-            self.inlets.extend(inlets)
+        if isinstance(self._inlets, str):
+            self._inlets = [self._inlets, ]
+
+        if isinstance(self._inlets, list):
+            task_ids = set(
+                filter(lambda x: isinstance(x, str) and x.lower() != AUTO, self._inlets)
+            ).union(
+                map(lambda op: op.task_id,
+                    filter(lambda op: isinstance(op, Operator), self._inlets))
+            ).intersection(self.get_flat_relative_ids(upstream=True))
 
-        if self._inlets['auto']:  # pylint:disable=protected-access
-            # dont append twice
-            task_ids = set(self._inlets['task_ids']).symmetric_difference(  # pylint:disable=protected-access
-                self.upstream_task_ids
-            )
-            inlets = self.xcom_pull(context,
-                                    task_ids=task_ids,
-                                    dag_id=self.dag_id,
-                                    key=PIPELINE_OUTLETS)
-            inlets = [item for sublist in inlets if sublist for item in sublist]
-            inlets = [DataSet.map_type(i['typeName'])(data=i['attributes'])
-                      for i in inlets]
+            if AUTO.upper() in self._inlets or AUTO.lower() in self._inlets:
+                task_ids = task_ids.union(task_ids.symmetric_difference(self.upstream_task_ids))
+
+            inlets = self.xcom_pull(context, task_ids=task_ids,
+                                    dag_id=self.dag_id, key=PIPELINE_OUTLETS)
+            inlets = [_get_instance(item) for sublist in inlets if sublist for item in sublist]
             self.inlets.extend(inlets)
 
-        if self._inlets['datasets']:  # pylint:disable=protected-access
-            self.inlets.extend(self._inlets['datasets'])  # pylint:disable=protected-access
+        self.inlets.extend([_get_instance(_as_rendered_dict(i, context))
 
 Review comment:
   I'm not entirely sure I understand your question, but the context is the 
same as for render_template, so the outputs are the same. The difference is 
that this function will not fail if it finds a non-serializable field.
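   The inlet task-id resolution in the hunk above can be exercised in isolation (`Operator` here is a minimal stub standing in for Airflow's BaseOperator; the task names are invented): string entries are treated as upstream task ids, Operator entries contribute their `task_id`, and everything is intersected with the real upstream ids.

```python
AUTO = "auto"


class Operator:
    """Stub standing in for Airflow's BaseOperator."""

    def __init__(self, task_id):
        self.task_id = task_id


_inlets = ["extract", Operator("transform"), AUTO]
upstream_ids = {"extract", "transform", "unrelated"}

# Same expression shape as the diff: strings (minus the "auto" marker),
# unioned with operator task_ids, intersected with actual upstream ids.
task_ids = set(
    filter(lambda x: isinstance(x, str) and x.lower() != AUTO, _inlets)
).union(
    map(lambda op: op.task_id,
        filter(lambda op: isinstance(op, Operator), _inlets))
).intersection(upstream_ids)

assert task_ids == {"extract", "transform"}
```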




[GitHub] [airflow] bolkedebruin commented on a change in pull request #6564: [AIRFLOW-5911] Simplify lineage API and improve robustness

2019-11-21 Thread GitBox
bolkedebruin commented on a change in pull request #6564: [AIRFLOW-5911] 
Simplify lineage API and improve robustness
URL: https://github.com/apache/airflow/pull/6564#discussion_r349267760
 
 

 ##
 File path: airflow/lineage/__init__.py
 ##
 @@ -18,18 +18,25 @@
 # under the License.
 
 # pylint:disable=missing-docstring
+import attr
+import jinja2
+import json
 
 from functools import wraps
-from itertools import chain
 
 from airflow.configuration import conf
 from airflow.exceptions import AirflowConfigException
-from airflow.lineage.datasets import DataSet
+from airflow.lineage.entity.dataset import Dataset
 from airflow.utils.log.logging_mixin import LoggingMixin
 from airflow.utils.module_loading import import_string
 
+from cattr import structure
+
+ENV = jinja2.Environment()
+
 PIPELINE_OUTLETS = "pipeline_outlets"
 PIPELINE_INLETS = "pipeline_inlets"
+AUTO = "auto"
 
 Review comment:
   How'd you mean?




[jira] [Commented] (AIRFLOW-6035) Remove command method in TaskInstance

2019-11-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979527#comment-16979527
 ] 

ASF GitHub Bot commented on AIRFLOW-6035:
-

mik-laj commented on pull request #6629: [AIRFLOW-6035] Remove command method 
in TaskInstance
URL: https://github.com/apache/airflow/pull/6629
 
 
   This method is not used. In addition, this method does not work properly 
because the arguments should be processed using the shlex.quote function.
   
   
   ---
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-6035
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
 



> Remove command method in TaskInstance
> 
>
> Key: AIRFLOW-6035
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6035
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.10.6
>Reporter: Kamil Bregula
>Priority: Trivial
>
> This method is not used. In addition, this method does not work properly 
> because the arguments should be processed using the shlex.quote function.
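The shlex.quote concern from the issue description can be shown directly (the argument strings here are invented for illustration):

```python
import shlex

# Why naively joining raw arguments into one shell string is unsafe,
# and what shlex.quote does about it: metacharacters such as ";" get
# wrapped in single quotes so the shell treats them as literal text.
args = ["echo", "hello; rm -rf /tmp/x"]

unsafe = " ".join(args)  # the shell would interpret the ";" here
safe = " ".join(shlex.quote(a) for a in args)

assert safe == "echo 'hello; rm -rf /tmp/x'"
```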





[GitHub] [airflow] mik-laj opened a new pull request #6629: [AIRFLOW-6035] Remove command method in TaskInstance

2019-11-21 Thread GitBox
mik-laj opened a new pull request #6629: [AIRFLOW-6035] Remove command method 
in TaskInstance
URL: https://github.com/apache/airflow/pull/6629
 
 
   This method is not used. In addition, this method does not work properly 
because the arguments should be processed using the shlex.quote function.
   
   
   ---
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-6035
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   




[jira] [Updated] (AIRFLOW-6035) Remove command method in TaskInstance

2019-11-21 Thread Kamil Bregula (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Bregula updated AIRFLOW-6035:
---
Summary: Remove command method in TaskInstance  (was: Remove command method 
in Task)

> Remove command method in TaskInstance
> 
>
> Key: AIRFLOW-6035
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6035
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.10.6
>Reporter: Kamil Bregula
>Priority: Trivial
>
> This method is not used. In addition, this method does not work properly 
> because the arguments should be processed using the shlex.quote function.





[jira] [Created] (AIRFLOW-6035) Remove command method in Task

2019-11-21 Thread Kamil Bregula (Jira)
Kamil Bregula created AIRFLOW-6035:
--

 Summary: Remove command method in Task
 Key: AIRFLOW-6035
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6035
 Project: Apache Airflow
  Issue Type: Bug
  Components: core
Affects Versions: 1.10.6
Reporter: Kamil Bregula


This method is not used. In addition, this method does not work properly 
because the arguments should be processed using the shlex.quote function.





[jira] [Closed] (AIRFLOW-6026) Use contextlib to redirect stderr and stdout

2019-11-21 Thread Kamil Bregula (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Bregula closed AIRFLOW-6026.
--
Fix Version/s: 1.10.7
   Resolution: Fixed

> Use contextlib to redirect stderr and stdout
> 
>
> Key: AIRFLOW-6026
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6026
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.10.6
>Reporter: Kamil Bregula
>Priority: Major
> Fix For: 1.10.7
>
>






[jira] [Commented] (AIRFLOW-6026) Use contextlib to redirect stderr and stdout

2019-11-21 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979519#comment-16979519
 ] 

ASF subversion and git services commented on AIRFLOW-6026:
--

Commit da086661f7207ece0b598233b988387233c24d4a in airflow's branch 
refs/heads/master from Kamil Breguła
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=da08666 ]

[AIRFLOW-6026] Use contextlib to redirect stderr and stdout (#6624)



> Use contextlib to redirect stderr and stdout
> 
>
> Key: AIRFLOW-6026
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6026
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.10.6
>Reporter: Kamil Bregula
>Priority: Major
>






[jira] [Commented] (AIRFLOW-6026) Use contextlib to redirect stderr and stdout

2019-11-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979518#comment-16979518
 ] 

ASF GitHub Bot commented on AIRFLOW-6026:
-

mik-laj commented on pull request #6624: [AIRFLOW-6026] Use contextlib to 
redirect stderr and stdout
URL: https://github.com/apache/airflow/pull/6624
 
 
   
 



> Use contextlib to redirect stderr and stdout
> 
>
> Key: AIRFLOW-6026
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6026
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.10.6
>Reporter: Kamil Bregula
>Priority: Major
>






[GitHub] [airflow] mik-laj merged pull request #6624: [AIRFLOW-6026] Use contextlib to redirect stderr and stdout

2019-11-21 Thread GitBox
mik-laj merged pull request #6624: [AIRFLOW-6026] Use contextlib to redirect 
stderr and stdout
URL: https://github.com/apache/airflow/pull/6624
 
 
   




[jira] [Updated] (AIRFLOW-4897) Location not used to create empty dataset by bigquery_hook cursor

2019-11-21 Thread Kamil Bregula (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Bregula updated AIRFLOW-4897:
---
Component/s: gcp

> Location not used to create empty dataset by bigquery_hook cursor
> -
>
> Key: AIRFLOW-4897
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4897
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: gcp, hooks
>Affects Versions: 1.10.2, 1.10.3
> Environment: composer-1.7.1-airflow-1.10.2
> Python 3
>Reporter: Benjamin
>Priority: Major
>
> {code:java}
> bq_cursor = BigQueryHook(use_legacy_sql=False,
>                          bigquery_conn_id='google_cloud_default',
>                          location="EU").get_conn().cursor()
> print(f'Location Cursor : {bq_cursor.location}')  # EU is printed
> bq_cursor.create_empty_dataset(dataset_id, project_id){code}
> 'EU' is printed, but my empty dataset has been created in location 'US'.





[GitHub] [airflow] deshraj commented on issue #6380: [AIRFLOW-3632] Allow replace_microseconds in trigger_dag REST request

2019-11-21 Thread GitBox
deshraj commented on issue #6380: [AIRFLOW-3632] Allow replace_microseconds in 
trigger_dag REST request
URL: https://github.com/apache/airflow/pull/6380#issuecomment-557221923
 
 
   @ashb may I ask you about the release timeline for this feature? 




[GitHub] [airflow] codecov-io edited a comment on issue #6553: [AIRFLOW-5902] avoid unnecessary sleep to maintain local task job heart rate

2019-11-21 Thread GitBox
codecov-io edited a comment on issue #6553: [AIRFLOW-5902] avoid unnecessary 
sleep to maintain local task job heart rate
URL: https://github.com/apache/airflow/pull/6553#issuecomment-553170084
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6553?src=pr=h1) 
Report
   > Merging 
[#6553](https://codecov.io/gh/apache/airflow/pull/6553?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/fab957e763f40bf2a2398770312b4834fbd613e1?src=pr=desc)
 will **decrease** coverage by `0.3%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6553/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/6553?src=pr=tree)
   
    ```diff
    @@            Coverage Diff             @@
    ##           master    #6553      +/-   ##
    ==========================================
    - Coverage    83.8%    83.5%    -0.31%
    ==========================================
      Files         669      669
      Lines       37564    37557        -7
    ==========================================
    - Hits        31480    31361      -119
    - Misses       6084     6196      +112
    ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6553?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/jobs/local\_task\_job.py](https://codecov.io/gh/apache/airflow/pull/6553/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzL2xvY2FsX3Rhc2tfam9iLnB5)
 | `89.33% <ø> (-0.67%)` | :arrow_down: |
   | 
[airflow/models/taskinstance.py](https://codecov.io/gh/apache/airflow/pull/6553/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvdGFza2luc3RhbmNlLnB5)
 | `93.32% <100%> (ø)` | :arrow_up: |
   | 
[airflow/jobs/base\_job.py](https://codecov.io/gh/apache/airflow/pull/6553/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzL2Jhc2Vfam9iLnB5)
 | `92.14% <100%> (+3.41%)` | :arrow_up: |
   | 
[airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/6553/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==)
 | `44.44% <0%> (-55.56%)` | :arrow_down: |
   | 
[airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/6553/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==)
 | `52.94% <0%> (-47.06%)` | :arrow_down: |
   | 
[airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/6553/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==)
 | `45.25% <0%> (-46.72%)` | :arrow_down: |
   | 
[airflow/kubernetes/refresh\_config.py](https://codecov.io/gh/apache/airflow/pull/6553/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3JlZnJlc2hfY29uZmlnLnB5)
 | `50.98% <0%> (-23.53%)` | :arrow_down: |
   | 
[...rflow/contrib/operators/kubernetes\_pod\_operator.py](https://codecov.io/gh/apache/airflow/pull/6553/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZF9vcGVyYXRvci5weQ==)
 | `77.14% <0%> (-21.43%)` | :arrow_down: |
   | 
[airflow/configuration.py](https://codecov.io/gh/apache/airflow/pull/6553/diff?src=pr=tree#diff-YWlyZmxvdy9jb25maWd1cmF0aW9uLnB5)
 | `89.13% <0%> (-3.63%)` | :arrow_down: |
   | 
[airflow/task/task\_runner/base\_task\_runner.py](https://codecov.io/gh/apache/airflow/pull/6553/diff?src=pr=tree#diff-YWlyZmxvdy90YXNrL3Rhc2tfcnVubmVyL2Jhc2VfdGFza19ydW5uZXIucHk=)
 | `76.27% <0%> (+3.38%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6553?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6553?src=pr=footer). 
Last update 
[fab957e...6c5d1d6](https://codecov.io/gh/apache/airflow/pull/6553?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] codecov-io edited a comment on issue #6553: [AIRFLOW-5902] avoid unnecessary sleep to maintain local task job heart rate

2019-11-21 Thread GitBox
codecov-io edited a comment on issue #6553: [AIRFLOW-5902] avoid unnecessary 
sleep to maintain local task job heart rate
URL: https://github.com/apache/airflow/pull/6553#issuecomment-553170084
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6553?src=pr=h1) 
Report
   > Merging 
[#6553](https://codecov.io/gh/apache/airflow/pull/6553?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/fab957e763f40bf2a2398770312b4834fbd613e1?src=pr=desc)
 will **decrease** coverage by `0.3%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6553/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/6553?src=pr=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##           master   #6553      +/-   ##
   ==========================================
   - Coverage    83.8%   83.5%     -0.31%
   ==========================================
     Files         669     669
     Lines       37564   37557        -7
   ==========================================
   - Hits        31480   31361      -119
   - Misses       6084    6196      +112
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6553?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/jobs/local\_task\_job.py](https://codecov.io/gh/apache/airflow/pull/6553/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzL2xvY2FsX3Rhc2tfam9iLnB5)
 | `89.33% <ø> (-0.67%)` | :arrow_down: |
   | 
[airflow/models/taskinstance.py](https://codecov.io/gh/apache/airflow/pull/6553/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvdGFza2luc3RhbmNlLnB5)
 | `93.32% <100%> (ø)` | :arrow_up: |
   | 
[airflow/jobs/base\_job.py](https://codecov.io/gh/apache/airflow/pull/6553/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzL2Jhc2Vfam9iLnB5)
 | `92.14% <100%> (+3.41%)` | :arrow_up: |
   | 
[airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/6553/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==)
 | `44.44% <0%> (-55.56%)` | :arrow_down: |
   | 
[airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/6553/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==)
 | `52.94% <0%> (-47.06%)` | :arrow_down: |
   | 
[airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/6553/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==)
 | `45.25% <0%> (-46.72%)` | :arrow_down: |
   | 
[airflow/kubernetes/refresh\_config.py](https://codecov.io/gh/apache/airflow/pull/6553/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3JlZnJlc2hfY29uZmlnLnB5)
 | `50.98% <0%> (-23.53%)` | :arrow_down: |
   | 
[...rflow/contrib/operators/kubernetes\_pod\_operator.py](https://codecov.io/gh/apache/airflow/pull/6553/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZF9vcGVyYXRvci5weQ==)
 | `77.14% <0%> (-21.43%)` | :arrow_down: |
   | 
[airflow/configuration.py](https://codecov.io/gh/apache/airflow/pull/6553/diff?src=pr=tree#diff-YWlyZmxvdy9jb25maWd1cmF0aW9uLnB5)
 | `89.13% <0%> (-3.63%)` | :arrow_down: |
   | 
[airflow/task/task\_runner/base\_task\_runner.py](https://codecov.io/gh/apache/airflow/pull/6553/diff?src=pr=tree#diff-YWlyZmxvdy90YXNrL3Rhc2tfcnVubmVyL2Jhc2VfdGFza19ydW5uZXIucHk=)
 | `76.27% <0%> (+3.38%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6553?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6553?src=pr=footer). 
Last update 
[fab957e...6c5d1d6](https://codecov.io/gh/apache/airflow/pull/6553?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[jira] [Commented] (AIRFLOW-1717) AttributeError while clicking on dag on webUI

2019-11-21 Thread jack (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979495#comment-16979495
 ] 

jack commented on AIRFLOW-1717:
---

We've had many releases since 1.8; clicking on a DAG in the UI now works for sure :)

> AttributeError while clicking on dag on webUI
> -
>
> Key: AIRFLOW-1717
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1717
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 1.8.0
>Reporter: Ambrish Bhargava
>Priority: Major
>
> Simple DAG
> {code}from airflow import DAG
> from airflow.contrib.operators.qubole_operator import QuboleOperator
> from datetime import datetime, timedelta
>
> # Default args
> default_args = {
>     'owner': 'airflow',
>     'depends_on_past': False,
>     'start_date': datetime(2017, 8, 1),
>     'email': ['airf...@airflow.com'],
>     'email_on_failure': True,
>     'email_on_retry': False,
>     'retries': 1,
>     'retry_delay': timedelta(minutes=5),
> }
>
> # Dag information
> dag = DAG(
>     'qubole_test',
>     default_args=default_args,
>     schedule_interval='@daily')
>
> # Actual steps
> hive_cmd = QuboleOperator(
>     command_type='hivecmd',
>     task_id='qubole_show_tables',
>     query='use schema;show tables;',
>     cluster_label='default',
>     qubole_conn_id='airflow_qubole',
>     dag=dag){code}
> When I ran this DAG on the CLI, it worked fine. But when I tried to click the
> DAG on the web UI, I got the following error:
> {code}Traceback (most recent call last):
>   File "/usr/local/lib64/python2.7/site-packages/flask/app.py", line 1988, in wsgi_app
>     response = self.full_dispatch_request()
>   File "/usr/local/lib64/python2.7/site-packages/flask/app.py", line 1641, in full_dispatch_request
>     rv = self.handle_user_exception(e)
>   File "/usr/local/lib64/python2.7/site-packages/flask/app.py", line 1544, in handle_user_exception
>     reraise(exc_type, exc_value, tb)
>   File "/usr/local/lib64/python2.7/site-packages/flask/app.py", line 1639, in full_dispatch_request
>     rv = self.dispatch_request()
>   File "/usr/local/lib64/python2.7/site-packages/flask/app.py", line 1625, in dispatch_request
>     return self.view_functions[rule.endpoint](**req.view_args)
>   File "/usr/local/lib/python2.7/site-packages/flask_admin/base.py", line 69, in inner
>     return self._run_view(f, *args, **kwargs)
>   File "/usr/local/lib/python2.7/site-packages/flask_admin/base.py", line 368, in _run_view
>     return fn(self, *args, **kwargs)
>   File "/usr/local/lib/python2.7/site-packages/flask_login.py", line 755, in decorated_view
>     return func(*args, **kwargs)
>   File "/usr/local/lib/python2.7/site-packages/airflow/www/utils.py", line 219, in view_func
>     return f(*args, **kwargs)
>   File "/usr/local/lib/python2.7/site-packages/airflow/www/utils.py", line 125, in wrapper
>     return f(*args, **kwargs)
>   File "/usr/local/lib/python2.7/site-packages/airflow/www/views.py", line 1229, in tree
>     'children': [recurse_nodes(t, set()) for t in dag.roots],
>   File "/usr/local/lib/python2.7/site-packages/airflow/www/views.py", line 1191, in recurse_nodes
>     if node_count[0] < node_limit or t not in visited]
>   File "/usr/local/lib/python2.7/site-packages/airflow/www/views.py", line 1216, in recurse_nodes
>     for d in dates],
> AttributeError: 'NoneType' object has no attribute 'isoformat'{code}
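The last frames show `.isoformat()` being called on a `None` execution date. A minimal sketch of the failure mode and a guard (illustrative only, not the actual Airflow fix):

```python
from datetime import datetime

def safe_isoformat(dt):
    # Calling .isoformat() on None raises the AttributeError seen above;
    # guard against a missing date instead.
    return dt.isoformat() if dt is not None else None

print(safe_isoformat(datetime(2017, 8, 1)))  # 2017-08-01T00:00:00
print(safe_isoformat(None))                  # None
```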



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-3185) Add chunking to DBAPI_hook by implementing fetchmany and pandas chunksize

2019-11-21 Thread jack (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979489#comment-16979489
 ] 

jack commented on AIRFLOW-3185:
---

[~tomanizer] do you have a final version to PR?

> Add chunking to DBAPI_hook by implementing fetchmany and pandas chunksize
> -
>
> Key: AIRFLOW-3185
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3185
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: hooks
>Affects Versions: 1.10.0
>Reporter: Thomas Haederle
>Assignee: Thomas Haederle
>Priority: Minor
>  Labels: easyfix
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> DbApiHook currently implements get_records and get_pandas_df, where both 
> methods fetch all records into memory.
> We should implement two new methods which return a generator with a 
> configurable chunksize:
> - def get_many_records(self, sql, parameters=None, chunksize=20, iterate_singles=False)
> - def get_pandas_df_chunks(self, sql, parameters=None, chunksize=20)
> This should work for all DB hooks which inherit from this class.
> We could also adapt existing methods, but that could be problematic because 
> these methods will return a generator whereas the others return either 
> records or dataframes.
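The proposed chunked fetch can be sketched with a plain DB-API cursor; `get_records_in_chunks` below is a hypothetical stand-in for the suggested `get_many_records`, demonstrated here against an in-memory SQLite database rather than an Airflow hook:

```python
import sqlite3

def get_records_in_chunks(conn, sql, parameters=None, chunksize=20):
    """Yield lists of at most `chunksize` rows instead of fetching everything."""
    cursor = conn.cursor()
    cursor.execute(sql, parameters or [])
    while True:
        rows = cursor.fetchmany(chunksize)
        if not rows:
            break
        yield rows

# Demo against an in-memory SQLite database with 45 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(45)])

chunks = list(get_records_in_chunks(conn, "SELECT x FROM t ORDER BY x", chunksize=20))
print([len(c) for c in chunks])  # [20, 20, 5]
```

Because the function is a generator, only one chunk of rows is held in memory at a time, which is the point of the proposal.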





[GitHub] [airflow] kaxil commented on issue #6628: [AIRFLOW-6034] Fix Deprecation Elasticsearch configs on Master

2019-11-21 Thread GitBox
kaxil commented on issue #6628: [AIRFLOW-6034] Fix Deprecation Elasticsearch 
configs on Master
URL: https://github.com/apache/airflow/pull/6628#issuecomment-557211496
 
 
   I had fixed the incorrect labels in https://github.com/apache/airflow/pull/6620, but I didn't know that they were wrong, i.e. the keys were actually values and vice versa.




[GitHub] [airflow] kaxil opened a new pull request #6628: [AIRFLOW-6034] Fix Deprecation Elasticsearch configs on Master

2019-11-21 Thread GitBox
kaxil opened a new pull request #6628: [AIRFLOW-6034] Fix Deprecation 
Elasticsearch configs on Master
URL: https://github.com/apache/airflow/pull/6628
 
 
   
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-6034
   
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   This was already fixed in v1-10-* branches
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   




[jira] [Commented] (AIRFLOW-6034) Fix Deprecation Elasticsearch configs on Master

2019-11-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979480#comment-16979480
 ] 

ASF GitHub Bot commented on AIRFLOW-6034:
-

kaxil commented on pull request #6628: [AIRFLOW-6034] Fix Deprecation 
Elasticsearch configs on Master
URL: https://github.com/apache/airflow/pull/6628
 
 
   
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-6034
   
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   This was already fixed in v1-10-* branches
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
 



> Fix Deprecation Elasticsearch configs on Master
> ---
>
> Key: AIRFLOW-6034
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6034
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: configuration
>Affects Versions: 2.0.0
>Reporter: Kaxil Naik
>Assignee: Kaxil Naik
>Priority: Major
>
> This has already been fixed in the master





[GitHub] [airflow] codecov-io edited a comment on issue #6489: [AIRFLOW-3959] [AIRFLOW-4026] Add filter by DAG tags

2019-11-21 Thread GitBox
codecov-io edited a comment on issue #6489: [AIRFLOW-3959] [AIRFLOW-4026] Add 
filter by DAG tags
URL: https://github.com/apache/airflow/pull/6489#issuecomment-552128422
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6489?src=pr=h1) 
Report
   > Merging 
[#6489](https://codecov.io/gh/apache/airflow/pull/6489?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/fab957e763f40bf2a2398770312b4834fbd613e1?src=pr=desc)
 will **increase** coverage by `0.01%`.
   > The diff coverage is `96.55%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6489/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/6489?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #6489      +/-   ##
   ===========================================
   + Coverage    83.8%   83.81%     +0.01%
   ===========================================
     Files         669      669
     Lines       37564    37609        +45
   ===========================================
   + Hits        31480    31523        +43
   - Misses       6084     6086         +2
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6489?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/example\_dags/example\_pig\_operator.py](https://codecov.io/gh/apache/airflow/pull/6489/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9waWdfb3BlcmF0b3IucHk=)
 | `100% <ø> (ø)` | :arrow_up: |
   | 
[airflow/example\_dags/example\_python\_operator.py](https://codecov.io/gh/apache/airflow/pull/6489/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9weXRob25fb3BlcmF0b3IucHk=)
 | `63.33% <ø> (ø)` | :arrow_up: |
   | 
[...ample\_dags/example\_branch\_python\_dop\_operator\_3.py](https://codecov.io/gh/apache/airflow/pull/6489/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9icmFuY2hfcHl0aG9uX2RvcF9vcGVyYXRvcl8zLnB5)
 | `75% <ø> (ø)` | :arrow_up: |
   | 
[...le\_dags/example\_passing\_params\_via\_test\_command.py](https://codecov.io/gh/apache/airflow/pull/6489/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9wYXNzaW5nX3BhcmFtc192aWFfdGVzdF9jb21tYW5kLnB5)
 | `100% <ø> (ø)` | :arrow_up: |
   | 
[airflow/example\_dags/example\_branch\_operator.py](https://codecov.io/gh/apache/airflow/pull/6489/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9icmFuY2hfb3BlcmF0b3IucHk=)
 | `100% <ø> (ø)` | :arrow_up: |
   | 
[airflow/example\_dags/example\_gcs\_to\_gcs.py](https://codecov.io/gh/apache/airflow/pull/6489/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9nY3NfdG9fZ2NzLnB5)
 | `100% <ø> (ø)` | :arrow_up: |
   | 
[...low/example\_dags/example\_trigger\_controller\_dag.py](https://codecov.io/gh/apache/airflow/pull/6489/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV90cmlnZ2VyX2NvbnRyb2xsZXJfZGFnLnB5)
 | `100% <ø> (ø)` | :arrow_up: |
   | 
[airflow/example\_dags/example\_bash\_operator.py](https://codecov.io/gh/apache/airflow/pull/6489/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9iYXNoX29wZXJhdG9yLnB5)
 | `94.44% <ø> (ø)` | :arrow_up: |
   | 
[airflow/example\_dags/example\_subdag\_operator.py](https://codecov.io/gh/apache/airflow/pull/6489/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9zdWJkYWdfb3BlcmF0b3IucHk=)
 | `100% <ø> (ø)` | :arrow_up: |
   | 
[airflow/example\_dags/example\_trigger\_target\_dag.py](https://codecov.io/gh/apache/airflow/pull/6489/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV90cmlnZ2VyX3RhcmdldF9kYWcucHk=)
 | `90% <ø> (ø)` | :arrow_up: |
   | ... and [14 
more](https://codecov.io/gh/apache/airflow/pull/6489/diff?src=pr=tree-more) 
| |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6489?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6489?src=pr=footer). 
Last update 
[fab957e...bd060b1](https://codecov.io/gh/apache/airflow/pull/6489?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[jira] [Created] (AIRFLOW-6034) Fix Deprecation Elasticsearch configs on Master

2019-11-21 Thread Kaxil Naik (Jira)
Kaxil Naik created AIRFLOW-6034:
---

 Summary: Fix Deprecation Elasticsearch configs on Master
 Key: AIRFLOW-6034
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6034
 Project: Apache Airflow
  Issue Type: New Feature
  Components: configuration
Affects Versions: 2.0.0
Reporter: Kaxil Naik
Assignee: Kaxil Naik


This has already been fixed in the master





[GitHub] [airflow] kaxil commented on a change in pull request #6620: [AIRFLOW-6023] Remove deprecated Celery configs

2019-11-21 Thread GitBox
kaxil commented on a change in pull request #6620: [AIRFLOW-6023] Remove 
deprecated Celery configs
URL: https://github.com/apache/airflow/pull/6620#discussion_r349239272
 
 

 ##
 File path: airflow/configuration.py
 ##
 @@ -114,14 +112,7 @@ class AirflowConfigParser(ConfigParser):
 # new_name, the old_name will be checked to see if it exists. If it does a
 # DeprecationWarning will be issued and the old name will be used instead
 deprecated_options = {
-'celery': {
-# Remove these keys in Airflow 1.11
-'worker_concurrency': 'celeryd_concurrency',
-'result_backend': 'celery_result_backend',
-'broker_url': 'celery_broker_url',
-'ssl_active': 'celery_ssl_active',
-'ssl_cert': 'celery_ssl_cert',
-'ssl_key': 'celery_ssl_key',
+'elasticsearch': {
 
 Review comment:
   Looks like we already fixed it in v1.10
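The `deprecated_options` mapping in the diff above drives a fallback lookup: the new name is tried first, then the deprecated one with a warning. A minimal sketch of that mechanism, with a hypothetical `get_option` helper and an assumed `host`/`elasticsearch_host` pair rather than Airflow's actual code:

```python
import warnings

warnings.simplefilter("always")  # surface DeprecationWarning instead of hiding it

# Section -> {new_name: old_name}; the 'host' pair is an assumed example,
# not necessarily the real Airflow mapping.
deprecated_options = {
    'elasticsearch': {
        'host': 'elasticsearch_host',
    },
}

def get_option(config, section, new_name):
    """Hypothetical lookup: prefer the new key, fall back to the old one."""
    section_cfg = config.get(section, {})
    if new_name in section_cfg:
        return section_cfg[new_name]
    old_name = deprecated_options.get(section, {}).get(new_name)
    if old_name in section_cfg:
        warnings.warn(
            "The %s option in [%s] has been renamed to %s"
            % (old_name, section, new_name),
            DeprecationWarning,
        )
        return section_cfg[old_name]
    return None

# A config that still uses the deprecated key name:
cfg = {'elasticsearch': {'elasticsearch_host': 'http://localhost:9200'}}
print(get_option(cfg, 'elasticsearch', 'host'))  # resolved via the old key
```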




[jira] [Resolved] (AIRFLOW-6023) Remove deprecated Celery configs

2019-11-21 Thread Kaxil Naik (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik resolved AIRFLOW-6023.
-
Resolution: Fixed

> Remove deprecated Celery configs
> 
>
> Key: AIRFLOW-6023
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6023
> Project: Apache Airflow
>  Issue Type: Task
>  Components: configuration
>Affects Versions: 1.10.6
>Reporter: Kaxil Naik
>Assignee: Kaxil Naik
>Priority: Trivial
> Fix For: 2.0.0
>
>
> Some of the celery configs have been deprecated since 1.10 
> https://github.com/apache/airflow/blob/master/UPDATING.md#celery-config





[jira] [Commented] (AIRFLOW-6023) Remove deprecated Celery configs

2019-11-21 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979465#comment-16979465
 ] 

ASF subversion and git services commented on AIRFLOW-6023:
--

Commit 1d8b8cfcbc0d1d81758e42fcf7a789efd797c931 in airflow's branch 
refs/heads/master from Kaxil Naik
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=1d8b8cf ]

[AIRFLOW-6023] Remove deprecated Celery configs (#6620)



> Remove deprecated Celery configs
> 
>
> Key: AIRFLOW-6023
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6023
> Project: Apache Airflow
>  Issue Type: Task
>  Components: configuration
>Affects Versions: 1.10.6
>Reporter: Kaxil Naik
>Assignee: Kaxil Naik
>Priority: Trivial
> Fix For: 2.0.0
>
>
> Some of the celery configs have been deprecated since 1.10 
> https://github.com/apache/airflow/blob/master/UPDATING.md#celery-config





[jira] [Commented] (AIRFLOW-6023) Remove deprecated Celery configs

2019-11-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979464#comment-16979464
 ] 

ASF GitHub Bot commented on AIRFLOW-6023:
-

kaxil commented on pull request #6620: [AIRFLOW-6023] Remove deprecated Celery 
configs
URL: https://github.com/apache/airflow/pull/6620
 
 
   
 



> Remove deprecated Celery configs
> 
>
> Key: AIRFLOW-6023
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6023
> Project: Apache Airflow
>  Issue Type: Task
>  Components: configuration
>Affects Versions: 1.10.6
>Reporter: Kaxil Naik
>Assignee: Kaxil Naik
>Priority: Trivial
> Fix For: 2.0.0
>
>
> Some of the celery configs have been deprecated since 1.10 
> https://github.com/apache/airflow/blob/master/UPDATING.md#celery-config





[GitHub] [airflow] kaxil merged pull request #6620: [AIRFLOW-6023] Remove deprecated Celery configs

2019-11-21 Thread GitBox
kaxil merged pull request #6620: [AIRFLOW-6023] Remove deprecated Celery configs
URL: https://github.com/apache/airflow/pull/6620
 
 
   




[GitHub] [airflow] kaxil commented on a change in pull request #6621: [AIRFLOW-6025] Add label to uniquely identify creator of Pod

2019-11-21 Thread GitBox
kaxil commented on a change in pull request #6621: [AIRFLOW-6025] Add label to 
uniquely identify creator of Pod
URL: https://github.com/apache/airflow/pull/6621#discussion_r349236097
 
 

 ##
 File path: airflow/contrib/operators/kubernetes_pod_operator.py
 ##
 @@ -127,6 +128,15 @@ def execute(self, context):
  
cluster_context=self.cluster_context,
  config_file=self.config_file)
 
+# Add Airflow Version to the label
+# And a label to identify that pod is launched by 
KubernetesPodOperator
+self.labels.update(
+{
+'airflow_version': airflow_version.replace('+', '-'),
 
 Review comment:
   Re. labels vs annotations:
   
   >You can use either labels or annotations to attach metadata to Kubernetes 
objects. Labels can be used to select objects and to find collections of 
objects that satisfy certain conditions. In contrast, annotations are not used 
to identify and select objects. The metadata in an annotation can be small or 
large, structured or unstructured, and can include characters not permitted by 
labels.
   So labels make more sense here, e.g. to select all the pods that were created by the KubernetesPodOperator.
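As a sketch of why selectable labels matter here: labels built the way the diff above builds them can be turned into a selector string of the kind `kubectl get pods -l ...` or the Kubernetes API's `label_selector` parameter accepts. The `kubernetes_pod_operator` key and the sample version string are assumptions for illustration:

```python
# Build the labels as in the diff above, then derive a selector string.
airflow_version = "1.10.6+composer"  # '+' is not a legal label-value character

labels = {
    'airflow_version': airflow_version.replace('+', '-'),
    'kubernetes_pod_operator': 'True',  # assumed marker label
}

selector = ','.join('{}={}'.format(k, v) for k, v in sorted(labels.items()))
print(selector)  # airflow_version=1.10.6-composer,kubernetes_pod_operator=True
```

Annotations, by contrast, cannot appear in a selector, which is why they suit free-form metadata rather than identification.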




[GitHub] [airflow] kaxil commented on a change in pull request #6621: [AIRFLOW-6025] Add label to uniquely identify creator of Pod

2019-11-21 Thread GitBox
kaxil commented on a change in pull request #6621: [AIRFLOW-6025] Add label to 
uniquely identify creator of Pod
URL: https://github.com/apache/airflow/pull/6621#discussion_r349232197
 
 

 ##
 File path: airflow/contrib/operators/kubernetes_pod_operator.py
 ##
 @@ -127,6 +128,15 @@ def execute(self, context):
  
cluster_context=self.cluster_context,
  config_file=self.config_file)
 
+# Add Airflow Version to the label
+# And a label to identify that pod is launched by 
KubernetesPodOperator
+self.labels.update(
+{
+'airflow_version': airflow_version.replace('+', '-'),
 
 Review comment:
   
https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/ recommends some, but given that we already use labels and don't follow such conventions, I made this change:
   
   
https://github.com/apache/airflow/blob/fab957e763f40bf2a2398770312b4834fbd613e1/airflow/kubernetes/worker_configuration.py#L372-L378




[GitHub] [airflow] ashb commented on a change in pull request #6621: [AIRFLOW-6025] Add label to uniquely identify creator of Pod

2019-11-21 Thread GitBox
ashb commented on a change in pull request #6621: [AIRFLOW-6025] Add label to 
uniquely identify creator of Pod
URL: https://github.com/apache/airflow/pull/6621#discussion_r349229008
 
 

 ##
 File path: airflow/contrib/operators/kubernetes_pod_operator.py
 ##
 @@ -127,6 +128,15 @@ def execute(self, context):
  
cluster_context=self.cluster_context,
  config_file=self.config_file)
 
+# Add Airflow Version to the label
+# And a label to identify that pod is launched by 
KubernetesPodOperator
+self.labels.update(
+{
+'airflow_version': airflow_version.replace('+', '-'),
 
 Review comment:
   Does Kube recommend that we qualify our labels, like 
`org.apache.airflow.version`?




[GitHub] [airflow] ashb commented on a change in pull request #6621: [AIRFLOW-6025] Add label to uniquely identify creator of Pod

2019-11-21 Thread GitBox
ashb commented on a change in pull request #6621: [AIRFLOW-6025] Add label to 
uniquely identify creator of Pod
URL: https://github.com/apache/airflow/pull/6621#discussion_r349229400
 
 

 ##
 File path: airflow/contrib/operators/kubernetes_pod_operator.py
 ##
 @@ -127,6 +128,15 @@ def execute(self, context):
  
cluster_context=self.cluster_context,
  config_file=self.config_file)
 
+# Add Airflow Version to the label
+# And a label to identify that pod is launched by 
KubernetesPodOperator
+self.labels.update(
+{
+'airflow_version': airflow_version.replace('+', '-'),
 
 Review comment:
   Or is that for annotations? When should we use one over another?




[GitHub] [airflow] ashb commented on a change in pull request #6620: [AIRFLOW-6023] Remove deprecated Celery configs

2019-11-21 Thread GitBox
ashb commented on a change in pull request #6620: [AIRFLOW-6023] Remove 
deprecated Celery configs
URL: https://github.com/apache/airflow/pull/6620#discussion_r349228457
 
 

 ##
 File path: airflow/configuration.py
 ##
 @@ -114,14 +112,7 @@ class AirflowConfigParser(ConfigParser):
 # new_name, the old_name will be checked to see if it exists. If it does a
 # DeprecationWarning will be issued and the old name will be used instead
 deprecated_options = {
-'celery': {
-# Remove these keys in Airflow 1.11
-'worker_concurrency': 'celeryd_concurrency',
-'result_backend': 'celery_result_backend',
-'broker_url': 'celery_broker_url',
-'ssl_active': 'celery_ssl_active',
-'ssl_cert': 'celery_ssl_cert',
-'ssl_key': 'celery_ssl_key',
+'elasticsearch': {
 
 Review comment:
   (we should do this on its commit on v1.10 too)




[GitHub] [airflow] ashb commented on a change in pull request #6620: [AIRFLOW-6023] Remove deprecated Celery configs

2019-11-21 Thread GitBox
ashb commented on a change in pull request #6620: [AIRFLOW-6023] Remove 
deprecated Celery configs
URL: https://github.com/apache/airflow/pull/6620#discussion_r349228056
 
 

 ##
 File path: airflow/configuration.py
 ##
 @@ -114,14 +112,7 @@ class AirflowConfigParser(ConfigParser):
 # new_name, the old_name will be checked to see if it exists. If it does a
 # DeprecationWarning will be issued and the old name will be used instead
 deprecated_options = {
-'celery': {
-# Remove these keys in Airflow 1.11
-'worker_concurrency': 'celeryd_concurrency',
-'result_backend': 'celery_result_backend',
-'broker_url': 'celery_broker_url',
-'ssl_active': 'celery_ssl_active',
-'ssl_cert': 'celery_ssl_cert',
-'ssl_key': 'celery_ssl_key',
+'elasticsearch': {
 
 Review comment:
   Did we never fix this?!
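   For readers following along, the `deprecated_options` mapping above drives a lookup roughly like this (a heavily simplified sketch; the real `AirflowConfigParser` also consults environment variables and command options):

```python
import warnings

# new_name -> old_name, per section, mirroring the mapping in the diff above.
deprecated_options = {
    "celery": {
        "worker_concurrency": "celeryd_concurrency",
        "result_backend": "celery_result_backend",
    },
}

def get_option(config, section, key):
    """Return config[section][key], falling back to the deprecated old name."""
    section_cfg = config.get(section, {})
    if key in section_cfg:
        return section_cfg[key]
    old_key = deprecated_options.get(section, {}).get(key)
    if old_key is not None and old_key in section_cfg:
        # Old name still present: warn, then honour it.
        warnings.warn(
            "The {} option in [{}] has been renamed to {}".format(
                old_key, section, key),
            DeprecationWarning)
        return section_cfg[old_key]
    return None
```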




[jira] [Work started] (AIRFLOW-5947) Make the json backend pluggable for DAG Serialization

2019-11-21 Thread Kaxil Naik (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AIRFLOW-5947 started by Kaxil Naik.
---
> Make the json backend pluggable for DAG Serialization
> -
>
> Key: AIRFLOW-5947
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5947
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core, scheduler
>Affects Versions: 2.0.0, 1.10.7
>Reporter: Kaxil Naik
>Assignee: Kaxil Naik
>Priority: Major
>
> Allow users the option to choose the JSON library of their choice for DAG 
> Serialization.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (AIRFLOW-5931) Spawning new python interpreter for every task slow

2019-11-21 Thread Ash Berlin-Taylor (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AIRFLOW-5931 started by Ash Berlin-Taylor.
--
> Spawning new python interpreter for every task slow
> ---
>
> Key: AIRFLOW-5931
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5931
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: executors, worker
>Affects Versions: 2.0.0
>Reporter: Ash Berlin-Taylor
>Assignee: Ash Berlin-Taylor
>Priority: Major
>
> There are a number of places in the Executors and Task Runners where we spawn 
> a whole new python interpreter.
> My profiling has shown that this is slow. Rather than running a fresh python 
> interpreter which then has to re-load all of Airflow and its dependencies we 
> should use {{os.fork}} when it is available/suitable which should speed up 
> task running, especially for short-lived tasks.





[GitHub] [airflow] ashb commented on a change in pull request #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

2019-11-21 Thread GitBox
ashb commented on a change in pull request #6627: [AIRFLOW-5931] Use os.fork 
when appropriate to speed up task execution.
URL: https://github.com/apache/airflow/pull/6627#discussion_r349217376
 
 

 ##
 File path: airflow/task/task_runner/standard_task_runner.py
 ##
 @@ -17,28 +17,69 @@
 # specific language governing permissions and limitations
 # under the License.
 
+import os
+
 import psutil
+from setproctitle import setproctitle
 
 from airflow.task.task_runner.base_task_runner import BaseTaskRunner
 from airflow.utils.helpers import reap_process_group
 
+CAN_FORK = hasattr(os, 'fork')
+
 
 class StandardTaskRunner(BaseTaskRunner):
 """
 Runs the raw Airflow task by invoking through the Bash shell.
 """
 def __init__(self, local_task_job):
 super().__init__(local_task_job)
+self._rc = None
 
 def start(self):
-self.process = self.run_command()
+if CAN_FORK and not self.run_as_user:
+self.process = self._start_by_fork()
+else:
+self.process = self._start_by_exec()
 
-def return_code(self):
-return self.process.poll()
+def _start_by_exec(self):
+subprocess = self.run_command()
+return psutil.Process(subprocess.pid)
 
-def terminate(self):
-if self.process and psutil.pid_exists(self.process.pid):
-reap_process_group(self.process.pid, self.log)
+def _start_by_fork(self):
+pid = os.fork()
+if pid:
 
 Review comment:
   Could do, but for python it doesn't matter so much - either it returns a pid, 
0 in the child, or throws an error (C can return 0, pid or -1 on error, but 
python converts that to an exception for us)
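   Concretely, the return-value semantics described in the comment above can be sketched like this (a minimal illustrative helper, not Airflow's actual task runner; POSIX-only since it relies on `os.fork`):

```python
import os

def run_in_child(fn):
    """Fork, run fn in the child, and return the child's exit status."""
    pid = os.fork()
    if pid:
        # Parent: os.fork() returned the child's pid; wait for it to finish.
        _, status = os.waitpid(pid, 0)
        return os.WEXITSTATUS(status)
    # Child: os.fork() returned 0. On failure Python raises OSError rather
    # than returning -1 as the underlying C call does, so no -1 branch exists.
    try:
        fn()
        os._exit(0)
    except Exception:
        os._exit(1)
```

   `os._exit` is used in the child so it never falls through into the parent's code path.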




[GitHub] [airflow] ashb commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

2019-11-21 Thread GitBox
ashb commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to 
speed up task execution.
URL: https://github.com/apache/airflow/pull/6627#issuecomment-557186547
 
 
   
   > @ashb when are situations where CAN_FORK is false besides when doing 
run_as_user?
   
   Windows mostly :) Just being defensive.
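   A minimal sketch of that guard, simplified from the diff in this PR (`choose_start` is an illustrative name, not Airflow's API):

```python
import os

# os.fork is unavailable on Windows, so fall back to spawning a fresh
# interpreter there; impersonation (run_as_user) also forces the exec path.
CAN_FORK = hasattr(os, "fork")

def choose_start(run_as_user=None):
    # Mirrors StandardTaskRunner.start(): fork only when the platform
    # supports it and no user impersonation is required.
    if CAN_FORK and not run_as_user:
        return "fork"
    return "exec"
```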
   




[jira] [Updated] (AIRFLOW-6028) Add a Looker Hook and Operators.

2019-11-21 Thread Nathan Hadfield (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nathan Hadfield updated AIRFLOW-6028:
-
Summary: Add a Looker Hook and Operators.  (was: Add a Looker Hook.)

> Add a Looker Hook and Operators.
> --
>
> Key: AIRFLOW-6028
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6028
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: hooks
>Affects Versions: 2.0.0
>Reporter: Nathan Hadfield
>Assignee: Nathan Hadfield
>Priority: Minor
> Fix For: 2.0.0
>
>
> This addition of a hook for Looker ([https://looker.com/]) will enable the 
> integration of Airflow with the Looker SDK.  This can then form the basis for 
> a suite of operators to automate common Looker actions, e.g. sending a Looker 
> dashboard via email.





[jira] [Updated] (AIRFLOW-6028) Add a Looker Hook and Operators.

2019-11-21 Thread Nathan Hadfield (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nathan Hadfield updated AIRFLOW-6028:
-
Component/s: operators

> Add a Looker Hook and Operators.
> --
>
> Key: AIRFLOW-6028
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6028
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: hooks, operators
>Affects Versions: 2.0.0
>Reporter: Nathan Hadfield
>Assignee: Nathan Hadfield
>Priority: Minor
> Fix For: 2.0.0
>
>
> This addition of a hook for Looker ([https://looker.com/]) will enable the 
> integration of Airflow with the Looker SDK.  This can then form the basis for 
> a suite of operators to automate common Looker actions, e.g. sending a Looker 
> dashboard via email.





[GitHub] [airflow] kaxil commented on issue #6621: [AIRFLOW-6025] Add label to uniquely identify creator of Pod

2019-11-21 Thread GitBox
kaxil commented on issue #6621: [AIRFLOW-6025] Add label to uniquely identify 
creator of Pod
URL: https://github.com/apache/airflow/pull/6621#issuecomment-557178870
 
 
   @nuclearpinguin Yeah, restarting the test worked :)




[GitHub] [airflow] kaxil commented on a change in pull request #6396: [AIRFLOW-5726] Delete table as file name in RedshiftToS3Transfer

2019-11-21 Thread GitBox
kaxil commented on a change in pull request #6396: [AIRFLOW-5726] Delete table 
as file name in RedshiftToS3Transfer
URL: https://github.com/apache/airflow/pull/6396#discussion_r349161658
 
 

 ##
 File path: tests/operators/test_redshift_to_s3_operator.py
 ##
 @@ -31,7 +32,8 @@ class TestRedshiftToS3Transfer(unittest.TestCase):
 
 @mock.patch("boto3.session.Session")
 @mock.patch("airflow.hooks.postgres_hook.PostgresHook.run")
-def test_execute(self, mock_run, mock_session):
+@parameterized.expand([(True, ), (False, )])
+def test_execute(self, mock_run, mock_session, boolean_value):
 
 Review comment:
   
   I am going to try and run this test on my machine and let you know




[GitHub] [airflow] zacharya19 commented on a change in pull request #6489: [AIRFLOW-3959] [AIRFLOW-4026] Add filter by DAG tags

2019-11-21 Thread GitBox
zacharya19 commented on a change in pull request #6489: [AIRFLOW-3959] 
[AIRFLOW-4026] Add filter by DAG tags
URL: https://github.com/apache/airflow/pull/6489#discussion_r349201040
 
 

 ##
 File path: airflow/www/app.py
 ##
 @@ -65,6 +66,7 @@ def create_app(config=None, session=None, testing=False, 
app_name="Airflow"):
 app.config['SESSION_COOKIE_HTTPONLY'] = True
 app.config['SESSION_COOKIE_SECURE'] = conf.getboolean('webserver', 
'COOKIE_SECURE')
 app.config['SESSION_COOKIE_SAMESITE'] = conf.get('webserver', 
'COOKIE_SAMESITE')
+app.config['PERMANENT_SESSION_LIFETIME'] = timedelta(days=3560)  # 10 years
 
 Review comment:
   Exposed as a configuration.
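   One small aside on the diff above: it sets `PERMANENT_SESSION_LIFETIME` to `timedelta(days=3560)` with a "10 years" comment, but ten 365-day years is 3650 days, so the constant and the comment disagree slightly:

```python
from datetime import timedelta

ten_years = timedelta(days=365 * 10)  # 3650 days, ignoring leap days
print(ten_years.days)           # 3650
print(ten_years.days - 3560)    # the diff's constant falls 90 days short
```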




[GitHub] [airflow] zacharya19 commented on a change in pull request #6489: [AIRFLOW-3959] [AIRFLOW-4026] Add filter by DAG tags

2019-11-21 Thread GitBox
zacharya19 commented on a change in pull request #6489: [AIRFLOW-3959] 
[AIRFLOW-4026] Add filter by DAG tags
URL: https://github.com/apache/airflow/pull/6489#discussion_r349201067
 
 

 ##
 File path: airflow/www/templates/airflow/dags.html
 ##
 @@ -81,9 +92,17 @@ DAGs
 
 
 
-
-{{ dag.dag_id }}
-
+
 
 Review comment:
   No, because the dag_id is displayed as a block element, so it would break the 
line after it (but I changed it to a span so the float is removed).




[GitHub] [airflow] kaxil edited a comment on issue #6396: [AIRFLOW-5726] Delete table as file name in RedshiftToS3Transfer

2019-11-21 Thread GitBox
kaxil edited a comment on issue #6396: [AIRFLOW-5726] Delete table as file name 
in RedshiftToS3Transfer
URL: https://github.com/apache/airflow/pull/6396#issuecomment-557171010
 
 
   @JavierLopezT Apply the following code and the test will pass. Passes on my 
local machine:
   
   
   ```diff
   diff --git a/tests/operators/test_redshift_to_s3_operator.py 
b/tests/operators/test_redshift_to_s3_operator.py
   index 5fd8d46e3..baa4aad32 100644
   --- a/tests/operators/test_redshift_to_s3_operator.py
   +++ b/tests/operators/test_redshift_to_s3_operator.py
   @@ -30,10 +30,13 @@ from airflow.utils.tests import 
assertEqualIgnoreMultipleSpaces
   
class TestRedshiftToS3Transfer(unittest.TestCase):
   
   +@parameterized.expand([
   +[True, "key/table_"],
   +[False, "key"],
   +])
@mock.patch("boto3.session.Session")
@mock.patch("airflow.hooks.postgres_hook.PostgresHook.run")
   -@parameterized.expand([(True, ), (False, )])
   -def test_execute(self, mock_run, mock_session, boolean_value):
   +def test_execute(self, table_as_file_name, expected_s3_key, mock_run, 
mock_session,):
access_key = "aws_access_key_id"
secret_key = "aws_secret_access_key"
mock_session.return_value = Session(access_key, secret_key)
   @@ -42,7 +45,6 @@ class TestRedshiftToS3Transfer(unittest.TestCase):
s3_bucket = "bucket"
s3_key = "key"
unload_options = ['HEADER', ]
   -table_as_file_name = boolean_value
   
RedshiftToS3Transfer(
schema=schema,
   @@ -62,14 +64,14 @@ class TestRedshiftToS3Transfer(unittest.TestCase):
select_query = "SELECT * FROM 
{schema}.{table}".format(schema=schema, table=table)
unload_query = """
UNLOAD ('{select_query}')
   -TO 's3://{s3_bucket}/{s3_key}/{table}_'
   +TO 's3://{s3_bucket}/{s3_key}'
with credentials

'aws_access_key_id={access_key};aws_secret_access_key={secret_key}'
{unload_options};
""".format(select_query=select_query,
   -   table=table,
   s3_bucket=s3_bucket,
   -   s3_key=s3_key,
   +   s3_key=expected_s3_key,
   access_key=access_key,
   secret_key=secret_key,
   unload_options=unload_options)
   ```
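   `parameterized.expand` above is a third-party helper; the same table-driven idea can be sketched with only the standard library's `subTest`. Note that `build_s3_key` here is a hypothetical stand-in for the operator's key logic, not the real `RedshiftToS3Transfer` API:

```python
import unittest

class TestS3KeyTemplate(unittest.TestCase):
    # (table_as_file_name, expected s3 key) pairs, mirroring the diff above.
    CASES = [
        (True, "key/table_"),
        (False, "key"),
    ]

    @staticmethod
    def build_s3_key(s3_key, table, table_as_file_name):
        # Append "<table>_" only when table_as_file_name is set.
        return "{}/{}_".format(s3_key, table) if table_as_file_name else s3_key

    def test_build_s3_key(self):
        for table_as_file_name, expected in self.CASES:
            with self.subTest(table_as_file_name=table_as_file_name):
                self.assertEqual(
                    self.build_s3_key("key", "table", table_as_file_name),
                    expected)
```

   `subTest` reports each case separately on failure, which is the main benefit `parameterized.expand` provides here.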




[jira] [Assigned] (AIRFLOW-6033) UI crashes at "Landing Time" after switching task_id caps/small letters

2019-11-21 Thread ivan de los santos (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ivan de los santos reassigned AIRFLOW-6033:
---

Assignee: ivan de los santos

> UI crashes at "Landing Time" after switching task_id caps/small letters
> ---
>
> Key: AIRFLOW-6033
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6033
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, ui
>Affects Versions: 1.10.6
>Reporter: ivan de los santos
>Assignee: ivan de los santos
>Priority: Minor
>
> The Airflow UI will crash in the browser, returning an "Oops" message and the 
> traceback of the error.
> This is caused by changing the capitalization of a task_id. Some examples 
> that will cause Airflow to crash:
>  - task_id = "DUMMY_TASK" to task_id = "dUMMY_TASK"
>  - task_id = "Dummy_Task" to task_id = "dummy_Task" or "Dummy_task",...
>  - task_id = "Dummy_task" to task_id = "Dummy_tASk"
> _
> If you change the name of the task_id to something different such as, in our 
> example:
>  - task_id = "Dummy_Task" to task_id = "DummyTask" or "Dummytask"
> It won't fail, since it will be recognized as a new task, which is the 
> expected behaviour.
> If we switch the modified name back to the original, it won't crash, since it 
> will access the correct task instances. I will explain in the next 
> paragraphs where this error is located.
> _
>  *How to replicate*: 
>  # Launch airflow webserver -p 8080
>  # Go to the Airflow-UI
>  # Create an example DAG with a task_id name up to your choice in small 
> letters (ex. "run")
>  # Launch the DAG and wait its execution to finish
>  # Modify the task_id inside the DAG with the first letter to capital letter 
> (ex. "Run")
>  # Refresh the DAG
>  # Go to "Landing Times" inside the DAG menu in the UI
>  # You will get an "oops" message with the Traceback.
>  
> *File causing the problem*:  
> [https://github.com/apache/airflow/blob/master/airflow/www/views.py] (lines 
> 1643 - 1654)
>  
> *Reasons of the problem*:
>  #  KeyError: 'run', meaning a dictionary does not contain the task_id "run"; 
> I will get into the details of where this comes from below.
> {code:python}
> Traceback (most recent call last):
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 
> 2446, in wsgi_app
> response = self.full_dispatch_request()
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 
> 1951, in full_dispatch_request
> rv = self.handle_user_exception(e)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 
> 1820, in handle_user_exception
> reraise(exc_type, exc_value, tb)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/_compat.py", line 
> 39, in reraise
> raise value
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 
> 1949, in full_dispatch_request
> rv = self.dispatch_request()
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 
> 1935, in dispatch_request
> return self.view_functions[rule.endpoint](**req.view_args)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", 
> line 69, in inner
> return self._run_view(f, *args, **kwargs)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", 
> line 368, in _run_view
> return fn(self, *args, **kwargs)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask_login/utils.py", 
> line 258, in decorated_view
> return func(*args, **kwargs)
>   File "/home/rde/.local/lib/python3.6/site-packages/airflow/www/utils.py", 
> line 295, in wrapper
> return f(*args, **kwargs)
>   File "/home/rde/.local/lib/python3.6/site-packages/airflow/utils/db.py", 
> line 74, in wrapper
> return func(*args, **kwargs)
>   File "/home/rde/.local/lib/python3.6/site-packages/airflow/www/views.py", 
> line 1921, in landing_times
> x[ti.task_id].append(dttm)
> KeyError: 'run'
> {code}
> _
> h2. Code
> {code:python}
> for task in dag.tasks:
> y[task.task_id] = []
> x[task.task_id] = []
> for ti in task.get_task_instances(start_date=min_date, 
> end_date=base_date):
> ts = ti.execution_date
> if dag.schedule_interval and dag.following_schedule(ts):
> ts = dag.following_schedule(ts)
> if ti.end_date:
> dttm = wwwutils.epoch(ti.execution_date)
> secs = (ti.end_date - ts).total_seconds()
> x[ti.task_id].append(dttm)
> y[ti.task_id].append(secs)
> {code}
>  
> We can see in the first two lines of the first for loop how the dictionaries x 
> and y are filled with 

[jira] [Assigned] (AIRFLOW-6033) UI crashes at "Landing Time" after switching task_id caps/small letters

2019-11-21 Thread ivan de los santos (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ivan de los santos reassigned AIRFLOW-6033:
---

Assignee: (was: ivan de los santos)

> UI crashes at "Landing Time" after switching task_id caps/small letters
> ---
>
> Key: AIRFLOW-6033
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6033
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, ui
>Affects Versions: 1.10.6
>Reporter: ivan de los santos
>Priority: Minor
>
> The Airflow UI will crash in the browser, returning an "Oops" message and the 
> traceback of the error.
> This is caused by changing the capitalization of a task_id. Some examples 
> that will cause Airflow to crash:
>  - task_id = "DUMMY_TASK" to task_id = "dUMMY_TASK"
>  - task_id = "Dummy_Task" to task_id = "dummy_Task" or "Dummy_task",...
>  - task_id = "Dummy_task" to task_id = "Dummy_tASk"
> _
> If you change the name of the task_id to something different such as, in our 
> example:
>  - task_id = "Dummy_Task" to task_id = "DummyTask" or "Dummytask"
> It won't fail, since it will be recognized as a new task, which is the 
> expected behaviour.
> If we switch the modified name back to the original, it won't crash, since it 
> will access the correct task instances. I will explain in the next 
> paragraphs where this error is located.
> _
>  *How to replicate*: 
>  # Launch airflow webserver -p 8080
>  # Go to the Airflow-UI
>  # Create an example DAG with a task_id name up to your choice in small 
> letters (ex. "run")
>  # Launch the DAG and wait its execution to finish
>  # Modify the task_id inside the DAG with the first letter to capital letter 
> (ex. "Run")
>  # Refresh the DAG
>  # Go to "Landing Times" inside the DAG menu in the UI
>  # You will get an "oops" message with the Traceback.
>  
> *File causing the problem*:  
> [https://github.com/apache/airflow/blob/master/airflow/www/views.py] (lines 
> 1643 - 1654)
>  
> *Reasons of the problem*:
>  #  KeyError: 'run', meaning a dictionary does not contain the task_id "run"; 
> I will get into the details of where this comes from below.
> {code:python}
> Traceback (most recent call last):
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 
> 2446, in wsgi_app
> response = self.full_dispatch_request()
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 
> 1951, in full_dispatch_request
> rv = self.handle_user_exception(e)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 
> 1820, in handle_user_exception
> reraise(exc_type, exc_value, tb)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/_compat.py", line 
> 39, in reraise
> raise value
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 
> 1949, in full_dispatch_request
> rv = self.dispatch_request()
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 
> 1935, in dispatch_request
> return self.view_functions[rule.endpoint](**req.view_args)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", 
> line 69, in inner
> return self._run_view(f, *args, **kwargs)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", 
> line 368, in _run_view
> return fn(self, *args, **kwargs)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask_login/utils.py", 
> line 258, in decorated_view
> return func(*args, **kwargs)
>   File "/home/rde/.local/lib/python3.6/site-packages/airflow/www/utils.py", 
> line 295, in wrapper
> return f(*args, **kwargs)
>   File "/home/rde/.local/lib/python3.6/site-packages/airflow/utils/db.py", 
> line 74, in wrapper
> return func(*args, **kwargs)
>   File "/home/rde/.local/lib/python3.6/site-packages/airflow/www/views.py", 
> line 1921, in landing_times
> x[ti.task_id].append(dttm)
> KeyError: 'run'
> {code}
> _
> h2. Code
> {code:python}
> for task in dag.tasks:
> y[task.task_id] = []
> x[task.task_id] = []
> for ti in task.get_task_instances(start_date=min_date, 
> end_date=base_date):
> ts = ti.execution_date
> if dag.schedule_interval and dag.following_schedule(ts):
> ts = dag.following_schedule(ts)
> if ti.end_date:
> dttm = wwwutils.epoch(ti.execution_date)
> secs = (ti.end_date - ts).total_seconds()
> x[ti.task_id].append(dttm)
> y[ti.task_id].append(secs)
> {code}
>  
> We can see in the first two lines of the first for loop how the dictionaries x 
> and y are filled with task_id keys which come from 

[jira] [Created] (AIRFLOW-6033) UI crashes at "Landing Time" after switching task_id caps/small letters

2019-11-21 Thread ivan de los santos (Jira)
ivan de los santos created AIRFLOW-6033:
---

 Summary: UI crashes at "Landing Time" after switching task_id 
caps/small letters
 Key: AIRFLOW-6033
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6033
 Project: Apache Airflow
  Issue Type: Bug
  Components: DAG, ui
Affects Versions: 1.10.6
Reporter: ivan de los santos


The Airflow UI will crash in the browser, returning an "Oops" message and the 
traceback of the error.

This is caused by changing the capitalization of a task_id. Some examples that 
will cause Airflow to crash:
 - task_id = "DUMMY_TASK" to task_id = "dUMMY_TASK"
 - task_id = "Dummy_Task" to task_id = "dummy_Task" or "Dummy_task",...
 - task_id = "Dummy_task" to task_id = "Dummy_tASk"

_

If you change the name of the task_id to something different such as, in our 
example:
 - task_id = "Dummy_Task" to task_id = "DummyTask" or "Dummytask"

It won't fail, since it will be recognized as a new task, which is the expected 
behaviour.

If we switch the modified name back to the original, it won't crash, since it 
will access the correct task instances. I will explain in the next paragraphs 
where this error is located.

_

 *How to replicate*: 
 # Launch airflow webserver -p 8080
 # Go to the Airflow-UI
 # Create an example DAG with a task_id name up to your choice in small letters 
(ex. "run")
 # Launch the DAG and wait its execution to finish
 # Modify the task_id inside the DAG with the first letter to capital letter 
(ex. "Run")
 # Refresh the DAG
 # Go to "Landing Times" inside the DAG menu in the UI
 # You will get an "oops" message with the Traceback.

 

*File causing the problem*:  
[https://github.com/apache/airflow/blob/master/airflow/www/views.py] (lines 
1643 - 1654)

 

*Reasons of the problem*:
 #  KeyError: 'run', meaning a dictionary does not contain the task_id "run"; 
I will get into the details of where this comes from below.

{code:python}
Traceback (most recent call last):
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 2446, 
in wsgi_app
response = self.full_dispatch_request()
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1951, 
in full_dispatch_request
rv = self.handle_user_exception(e)
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1820, 
in handle_user_exception
reraise(exc_type, exc_value, tb)
  File "/home/rde/.local/lib/python3.6/site-packages/flask/_compat.py", line 
39, in reraise
raise value
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1949, 
in full_dispatch_request
rv = self.dispatch_request()
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1935, 
in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 
69, in inner
return self._run_view(f, *args, **kwargs)
  File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 
368, in _run_view
return fn(self, *args, **kwargs)
  File "/home/rde/.local/lib/python3.6/site-packages/flask_login/utils.py", 
line 258, in decorated_view
return func(*args, **kwargs)
  File "/home/rde/.local/lib/python3.6/site-packages/airflow/www/utils.py", 
line 295, in wrapper
return f(*args, **kwargs)
  File "/home/rde/.local/lib/python3.6/site-packages/airflow/utils/db.py", line 
74, in wrapper
return func(*args, **kwargs)
  File "/home/rde/.local/lib/python3.6/site-packages/airflow/www/views.py", 
line 1921, in landing_times
x[ti.task_id].append(dttm)
KeyError: 'run'

{code}
_
h2. Code
{code:python}
for task in dag.tasks:
y[task.task_id] = []
x[task.task_id] = []

for ti in task.get_task_instances(start_date=min_date, end_date=base_date):

ts = ti.execution_date
if dag.schedule_interval and dag.following_schedule(ts):
ts = dag.following_schedule(ts)
if ti.end_date:
dttm = wwwutils.epoch(ti.execution_date)
secs = (ti.end_date - ts).total_seconds()
x[ti.task_id].append(dttm)
y[ti.task_id].append(secs)

{code}
 
We can see in the first two lines of the first for loop how the dictionaries x 
and y are filled with task_id keys taken from the current DAG.

*The problem actually comes in the second for loop*, when you get the task 
instances from the DAG. I am not sure about this next part and would like 
someone to clarify it.

I think that the task instances (ti) returned by the get_task_instances() 
function come from the information stored in the database; that is the reason 
for the crash when you access the "Landing Times" page: x and y 
were filled 
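One way the KeyError described above could be avoided is to build the series with a `defaultdict`, so stale task_ids coming back from the database are absorbed instead of crashing the view. This is only a sketch under stated assumptions: `tasks` and `instances` stand in for `dag.tasks` and the DB-backed task instances, and it is not necessarily the fix applied in `airflow/www/views.py`:

```python
from collections import defaultdict

def collect_landing_times(tasks, instances):
    x = defaultdict(list)  # task_id -> landing timestamps
    y = defaultdict(list)  # task_id -> durations in seconds
    for task_id in tasks:
        # Pre-create empty series for the DAG's current tasks.
        x[task_id]
        y[task_id]
    for ti in instances:
        # A stored instance may carry a task_id that no longer matches any
        # current task (e.g. "run" renamed to "Run"); defaultdict absorbs it
        # instead of raising KeyError and crashing the view.
        x[ti["task_id"]].append(ti["dttm"])
        y[ti["task_id"]].append(ti["secs"])
    return x, y
```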

[GitHub] [airflow] zacharya19 commented on a change in pull request #6489: [AIRFLOW-3959] [AIRFLOW-4026] Add filter by DAG tags

2019-11-21 Thread GitBox
zacharya19 commented on a change in pull request #6489: [AIRFLOW-3959] 
[AIRFLOW-4026] Add filter by DAG tags
URL: https://github.com/apache/airflow/pull/6489#discussion_r349197932
 
 

 ##
 File path: airflow/www/views.py
 ##
 @@ -213,6 +218,18 @@ def get_int_arg(value, default=0):
 
 arg_current_page = request.args.get('page', '0')
 arg_search_query = request.args.get('search', None)
+arg_tags_filter = request.args.getlist('tags', None)
+flask_session.permanent = True
 
 Review comment:
   Yes, changed the entire flask session cookie to be permanent (with the 
expiration set in `PERMANENT_SESSION_LIFETIME`).




[GitHub] [airflow] kaxil commented on issue #6396: [AIRFLOW-5726] Delete table as file name in RedshiftToS3Transfer

2019-11-21 Thread GitBox
kaxil commented on issue #6396: [AIRFLOW-5726] Delete table as file name in 
RedshiftToS3Transfer
URL: https://github.com/apache/airflow/pull/6396#issuecomment-557171010
 
 
   @JavierLopezT Apply the following code and the test will pass. Passes on my 
local machine:
   
   
   ```diff
   diff --git a/tests/operators/test_redshift_to_s3_operator.py 
b/tests/operators/test_redshift_to_s3_operator.py
   index 5fd8d46e3..baa4aad32 100644
   --- a/tests/operators/test_redshift_to_s3_operator.py
   +++ b/tests/operators/test_redshift_to_s3_operator.py
   @@ -30,10 +30,13 @@ from airflow.utils.tests import 
assertEqualIgnoreMultipleSpaces
   
class TestRedshiftToS3Transfer(unittest.TestCase):
   
   +@parameterized.expand([
   +[True, "key/table_"],
   +[False, "key"],
   +])
@mock.patch("boto3.session.Session")
@mock.patch("airflow.hooks.postgres_hook.PostgresHook.run")
   -@parameterized.expand([(True, ), (False, )])
   -def test_execute(self, mock_run, mock_session, boolean_value):
   +def test_execute(self, table_as_file_name, expected_s3_key, mock_run, 
mock_session,):
access_key = "aws_access_key_id"
secret_key = "aws_secret_access_key"
mock_session.return_value = Session(access_key, secret_key)
   @@ -42,7 +45,6 @@ class TestRedshiftToS3Transfer(unittest.TestCase):
s3_bucket = "bucket"
s3_key = "key"
unload_options = ['HEADER', ]
   -table_as_file_name = boolean_value
   
RedshiftToS3Transfer(
schema=schema,
   @@ -62,14 +64,14 @@ class TestRedshiftToS3Transfer(unittest.TestCase):
select_query = "SELECT * FROM 
{schema}.{table}".format(schema=schema, table=table)
unload_query = """
UNLOAD ('{select_query}')
   -TO 's3://{s3_bucket}/{s3_key}/{table}_'
   +TO 's3://{s3_bucket}/{s3_key}'
with credentials

'aws_access_key_id={access_key};aws_secret_access_key={secret_key}'
{unload_options};
""".format(select_query=select_query,
   -   table=table,
   +   # table=table,
   s3_bucket=s3_bucket,
   -   s3_key=s3_key,
   +   s3_key=expected_s3_key,
   access_key=access_key,
   secret_key=secret_key,
   unload_options=unload_options)
   ```




[GitHub] [airflow] kaxil commented on a change in pull request #6396: [AIRFLOW-5726] Delete table as file name in RedshiftToS3Transfer

2019-11-21 Thread GitBox
kaxil commented on a change in pull request #6396: [AIRFLOW-5726] Delete table 
as file name in RedshiftToS3Transfer
URL: https://github.com/apache/airflow/pull/6396#discussion_r349161658
 
 

 ##
 File path: tests/operators/test_redshift_to_s3_operator.py
 ##
 @@ -31,7 +32,8 @@ class TestRedshiftToS3Transfer(unittest.TestCase):
  
      @mock.patch("boto3.session.Session")
      @mock.patch("airflow.hooks.postgres_hook.PostgresHook.run")
 -    def test_execute(self, mock_run, mock_session):
 +    @parameterized.expand([(True, ), (False, )])
 +    def test_execute(self, mock_run, mock_session, boolean_value):
 
 Review comment:
   I'm going to try running this test on my machine and will let you know.




[GitHub] [airflow] zacharya19 commented on a change in pull request #6489: [AIRFLOW-3959] [AIRFLOW-4026] Add filter by DAG tags

2019-11-21 Thread GitBox
zacharya19 commented on a change in pull request #6489: [AIRFLOW-3959] 
[AIRFLOW-4026] Add filter by DAG tags
URL: https://github.com/apache/airflow/pull/6489#discussion_r349191635
 
 

 ##
 File path: airflow/www/utils.py
 ##
 @@ -471,9 +471,12 @@ def clean_column_names():
  
      def is_utcdatetime(self, col_name):
          from airflow.utils.sqlalchemy import UtcDateTime
 -        obj = self.list_columns[col_name].type
 -        return isinstance(obj, UtcDateTime) or \
 -            isinstance(obj, sqla.types.TypeDecorator) and \
 -            isinstance(obj.impl, UtcDateTime)
 +
 +        if col_name in self.list_columns:
 
 Review comment:
   This function tries to get the "type" of the new tags field, but it's not 
really a column; it's a relationship.
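
A minimal standalone sketch of the membership guard discussed above (illustrative stand-in types, not SQLAlchemy itself): only inspect `.type` when the name is actually a mapped column, since relationship-backed fields such as `tags` have no entry in `list_columns`.

```python
class UtcDateTime:
    """Stand-in for airflow.utils.sqlalchemy.UtcDateTime."""


class Column:
    def __init__(self, type_):
        self.type = type_


def is_utcdatetime(list_columns, col_name):
    # Relationship-backed fields (e.g. a `tags` relationship) are not
    # present in list_columns, so guard before touching `.type`.
    if col_name in list_columns:
        return isinstance(list_columns[col_name].type, UtcDateTime)
    return False


list_columns = {"execution_date": Column(UtcDateTime())}
print(is_utcdatetime(list_columns, "execution_date"))  # True
print(is_utcdatetime(list_columns, "tags"))            # False
```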




[jira] [Created] (AIRFLOW-6032) Sagemaker sensors with log printing are causing worker to stuck

2019-11-21 Thread Shlomi Cohen (Jira)
Shlomi Cohen created AIRFLOW-6032:
-

 Summary: Sagemaker sensors with log printing are causing worker to 
stuck
 Key: AIRFLOW-6032
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6032
 Project: Apache Airflow
  Issue Type: Bug
  Components: scheduler
Affects Versions: 1.10.5
Reporter: Shlomi Cohen


Hi,

we are trying to use SageMaker sensors to wait on long-running tasks in 
SageMaker.

The problem is that the scheduler fills up with sensors and stops functioning.

We have tried changing the sensor mode to "reschedule" and also giving it a 
lower priority than other tasks; that didn't work.

The indication of the problem: with a sensor that pokes once every 60 seconds 
on a task that takes 20 minutes, you see only one line like this in the log:

 {{Rescheduling task, marking task as UP_FOR_RESCHEDULE}}

Writing our own sensor works as expected, and the line above appears as many 
times as needed until the job finishes.

It looks like something is wrong with the AwsHook, which uses a connection to 
get the logs.
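
For reference, the expected reschedule behaviour described above can be 
sketched without Airflow (an illustrative simulation, not the scheduler 
itself): each unsuccessful poke should release the slot and log one 
UP_FOR_RESCHEDULE line, so the line repeats once per poke interval.

```python
# Illustrative simulation (no Airflow): a reschedule-mode sensor is
# re-invoked for each poke and logs UP_FOR_RESCHEDULE every time the
# condition is not yet met -- the behaviour the reporter expected.
def run_reschedule_sensor(poke_results):
    log = []
    for done in poke_results:
        if done:
            log.append("Success criteria met. Exiting.")
            break
        log.append("Rescheduling task, marking task as UP_FOR_RESCHEDULE")
    return log


# A 20-minute job poked every 60 seconds -> ~20 reschedule lines, not 1.
log = run_reschedule_sensor([False] * 20 + [True])
print(sum("UP_FOR_RESCHEDULE" in line for line in log))  # 20
```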



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] ashb commented on issue #6352: [AIRFLOW-5683] Add propagate_skipped_state to SubDagOperator

2019-11-21 Thread GitBox
ashb commented on issue #6352: [AIRFLOW-5683] Add propagate_skipped_state to 
SubDagOperator
URL: https://github.com/apache/airflow/pull/6352#issuecomment-557145887
 
 
   Is this worth cherry-picking in to 1.10.7? (I'm not sure how the SubDagOp 
compares between master and the release branch.)




[GitHub] [airflow] kaxil commented on a change in pull request #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

2019-11-21 Thread GitBox
kaxil commented on a change in pull request #6627: [AIRFLOW-5931] Use os.fork 
when appropriate to speed up task execution.
URL: https://github.com/apache/airflow/pull/6627#discussion_r349164729
 
 

 ##
 File path: airflow/task/task_runner/standard_task_runner.py
 ##
 @@ -17,28 +17,69 @@
  # specific language governing permissions and limitations
  # under the License.
  
 +import os
 +
  import psutil
 +from setproctitle import setproctitle
  
  from airflow.task.task_runner.base_task_runner import BaseTaskRunner
  from airflow.utils.helpers import reap_process_group
  
 +CAN_FORK = hasattr(os, 'fork')
 +
  
  class StandardTaskRunner(BaseTaskRunner):
      """
      Runs the raw Airflow task by invoking through the Bash shell.
      """
      def __init__(self, local_task_job):
          super().__init__(local_task_job)
 +        self._rc = None
  
      def start(self):
 -        self.process = self.run_command()
 +        if CAN_FORK and not self.run_as_user:
 +            self.process = self._start_by_fork()
 +        else:
 +            self.process = self._start_by_exec()
  
 -    def return_code(self):
 -        return self.process.poll()
 +    def _start_by_exec(self):
 +        subprocess = self.run_command()
 +        return psutil.Process(subprocess.pid)
  
 -    def terminate(self):
 -        if self.process and psutil.pid_exists(self.process.pid):
 -            reap_process_group(self.process.pid, self.log)
 +    def _start_by_fork(self):
 +        pid = os.fork()
 +        if pid:
 Review comment:
   Just wondering if we want to make the child branch explicit. Maybe not 
required, just a suggestion:
   
   ```
   if pid == 0:
   ```
   
   example:
   
   
   ```python
   def _start_by_fork(self):
       pid = os.fork()
       if pid == 0:
           from airflow.bin.cli import CLIFactory
           parser = CLIFactory.get_parser()
           args = parser.parse_args(self._command[1:])
           setproctitle(
               "airflow task runner: {0.dag_id} {0.task_id} {0.execution_date} {0.job_id}".format(args)
           )
           args.func(args)
           os._exit(0)
       else:
           self.log.info("Started process %d to run task", pid)
           return psutil.Process(pid)
   ```
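
   The `pid == 0` convention in the suggestion can be sketched standalone (a 
minimal fork/wait example on a POSIX system, not Airflow's runner): 
`os.fork()` returns 0 in the child and the child's pid in the parent.

   ```python
   import os

   pid = os.fork()
   if pid == 0:
       # Child branch: fork() returned 0 here.
       os._exit(7)  # exit immediately, skipping normal interpreter teardown
   else:
       # Parent branch: fork() returned the child's pid; reap it.
       _, status = os.waitpid(pid, 0)
       print("child exit code:", os.WEXITSTATUS(status))
   ```

   Note that `os.fork` is unavailable on Windows, which is exactly why the 
diff under review guards with `CAN_FORK = hasattr(os, 'fork')` and falls back 
to `_start_by_exec`.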



