[GitHub] codecov-io commented on issue #4274: [AIRFLOW-3438] Fix default value of udf_config in BQOperator

2018-12-03 Thread GitBox
codecov-io commented on issue #4274: [AIRFLOW-3438] Fix default value of 
udf_config in BQOperator
URL: 
https://github.com/apache/incubator-airflow/pull/4274#issuecomment-443865605
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4274?src=pr=h1)
 Report
   > Merging 
[#4274](https://codecov.io/gh/apache/incubator-airflow/pull/4274?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/9c04e8f339a6d84b2fff983e6584af2b81249652?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4274/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4274?src=pr=tree)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #4274   +/-   ##
   =======================================
     Coverage   78.08%   78.08%
   =======================================
     Files         201      201
     Lines       16458    16458
   =======================================
     Hits        12851    12851
     Misses       3607     3607
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4274?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4274?src=pr=footer).
 Last update 
[9c04e8f...3117760](https://codecov.io/gh/apache/incubator-airflow/pull/4274?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Work started] (AIRFLOW-3438) BigQueryOperator should default udf_config to None instead of false.

2018-12-03 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AIRFLOW-3438 started by Kaxil Naik.
---
> BigQueryOperator should default udf_config to None instead of false.
> 
>
> Key: AIRFLOW-3438
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3438
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Reporter: Luka Draksler
>Assignee: Kaxil Naik
>Priority: Major
>  Labels: easyfix
> Fix For: 1.10.2
>
>
> BigQueryOperator currently sets the default value of udf_config to False. This
> no longer works due to [https://github.com/apache/incubator-airflow/pull/3733]
> validating the type of that parameter as either None or list. The default
> value needs to be changed to None.
> The line in question, added in the commit referenced above:
> {code:java}
> (udf_config, 'userDefinedFunctionResources', None, list),
> {code}
>  
>  
> Note, other users of the hook may potentially encounter the same issue.
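
For context, a minimal sketch of the validation pattern that PR #3733
introduced (illustrative only; the helper name and the error message are
assumptions, not a quote from bigquery_hook.py):
{code:python}
# Illustrative sketch, not the exact hook code: each parameter is checked
# against an expected type, treating None as "not supplied".
def _validate_value(key, value, expected_type):
    if value is not None and not isinstance(value, expected_type):
        raise TypeError("{} argument must have a type {} not {}".format(
            key, expected_type, type(value)))

_validate_value('udf_config', None, list)  # OK: None means "not supplied"
_validate_value('udf_config', [], list)    # OK: a list is accepted
try:
    _validate_value('udf_config', False, list)  # the operator's old default
except TypeError as err:
    print(err)  # this is the failure the issue describes
{code}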



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] kaxil commented on a change in pull request #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-12-03 Thread GitBox
kaxil commented on a change in pull request #3733: [AIRFLOW-491] Add cache 
parameter in BigQuery query method - with 'api_resource_configs'
URL: https://github.com/apache/incubator-airflow/pull/3733#discussion_r238321366
 
 

 ##
 File path: airflow/contrib/hooks/bigquery_hook.py
 ##
 @@ -473,11 +482,11 @@ def create_external_table(self,
     def run_query(self,
                   bql=None,
                   sql=None,
-                  destination_dataset_table=False,
+                  destination_dataset_table=None,
                   write_disposition='WRITE_EMPTY',
                   allow_large_results=False,
-                  flatten_results=None,
-                  udf_config=False,
+                  flatten_results=False,
 
 Review comment:
   @xnuinside If you want to work on this, can you please make sure to add
tests to cover these cases.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil opened a new pull request #4274: [AIRFLOW-3438] Fix default value of udf_config in BQOperator

2018-12-03 Thread GitBox
kaxil opened a new pull request #4274: [AIRFLOW-3438] Fix default value of 
udf_config in BQOperator
URL: https://github.com/apache/incubator-airflow/pull/4274
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. 
 - https://issues.apache.org/jira/browse/AIRFLOW-3438
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3438) BigQueryOperator should default udf_config to None instead of false.

2018-12-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707721#comment-16707721
 ] 

ASF GitHub Bot commented on AIRFLOW-3438:
-

kaxil opened a new pull request #4274: [AIRFLOW-3438] Fix default value of 
udf_config in BQOperator
URL: https://github.com/apache/incubator-airflow/pull/4274
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. 
 - https://issues.apache.org/jira/browse/AIRFLOW-3438
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> BigQueryOperator should default udf_config to None instead of false.
> 
>
> Key: AIRFLOW-3438
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3438
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Reporter: Luka Draksler
>Priority: Major
>  Labels: easyfix
> Fix For: 1.10.2
>
>
> BigQueryOperator currently sets the default value of udf_config to False. This
> no longer works due to [https://github.com/apache/incubator-airflow/pull/3733]
> validating the type of that parameter as either None or list. The default
> value needs to be changed to None.
> The line in question, added in the commit referenced above:
> {code:java}
> (udf_config, 'userDefinedFunctionResources', None, list),
> {code}
>  
>  
> Note, other users of the hook may potentially encounter the same issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3438) BigQueryOperator should default udf_config to None instead of false.

2018-12-03 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik updated AIRFLOW-3438:

Fix Version/s: 1.10.2

> BigQueryOperator should default udf_config to None instead of false.
> 
>
> Key: AIRFLOW-3438
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3438
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Reporter: Luka Draksler
>Assignee: Kaxil Naik
>Priority: Major
>  Labels: easyfix
> Fix For: 1.10.2
>
>
> BigQueryOperator currently sets the default value of udf_config to False. This
> no longer works due to [https://github.com/apache/incubator-airflow/pull/3733]
> validating the type of that parameter as either None or list. The default
> value needs to be changed to None.
> The line in question, added in the commit referenced above:
> {code:java}
> (udf_config, 'userDefinedFunctionResources', None, list),
> {code}
>  
>  
> Note, other users of the hook may potentially encounter the same issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3348) Refresh run stats on dag refresh

2018-12-03 Thread Marcin Szymanski (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcin Szymanski resolved AIRFLOW-3348.
---
Resolution: Fixed

> Refresh run stats on dag refresh
> 
>
> Key: AIRFLOW-3348
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3348
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Marcin Szymanski
>Assignee: Marcin Szymanski
>Priority: Minor
>
> In some cases dag run statistics may become outdated, for example, when a dag
> run is deleted.
> As the view from which dag runs are deleted permits deleting runs from
> multiple dags in a single transaction, it seems to be the most reasonable
> place to update run statistics.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (AIRFLOW-3348) Refresh run stats on dag refresh

2018-12-03 Thread Marcin Szymanski (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcin Szymanski closed AIRFLOW-3348.
-

> Refresh run stats on dag refresh
> 
>
> Key: AIRFLOW-3348
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3348
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Marcin Szymanski
>Assignee: Marcin Szymanski
>Priority: Minor
>
> In some cases dag run statistics may become outdated, for example, when a dag
> run is deleted.
> As the view from which dag runs are deleted permits deleting runs from
> multiple dags in a single transaction, it seems to be the most reasonable
> place to update run statistics.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] caddac commented on issue #3684: [AIRFLOW-2840] - add update connections cli option

2018-12-03 Thread GitBox
caddac commented on issue #3684: [AIRFLOW-2840] - add update connections cli 
option
URL: 
https://github.com/apache/incubator-airflow/pull/3684#issuecomment-443816087
 
 
   @ashb Think I got the json_api figured out, will try to get my changes 
committed tonight.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] caddac edited a comment on issue #3684: [AIRFLOW-2840] - add update connections cli option

2018-12-03 Thread GitBox
caddac edited a comment on issue #3684: [AIRFLOW-2840] - add update connections 
cli option
URL: 
https://github.com/apache/incubator-airflow/pull/3684#issuecomment-443816087
 
 
   @ashb Think I got the json_client figured out, will try to get my changes 
committed tonight.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Closed] (AIRFLOW-3412) Worker pods are not being deleted after termination

2018-12-03 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor closed AIRFLOW-3412.
--
Resolution: Duplicate

> Worker pods are not being deleted after termination
> ---
>
> Key: AIRFLOW-3412
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3412
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: executor, kubernetes
>Affects Versions: 1.10.0
>Reporter: Viktor
>Assignee: Viktor
>Priority: Major
> Fix For: 1.10.2
>
>
> When using the KubernetesExecutor, multiple pods are spawned for tasks.
> When their job is done, they are not deleted automatically, even if you
> specify *delete_worker_pods=true* in the Airflow configuration and RBAC is
> properly configured to allow the scheduler to delete pods.
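
The setting referenced above, for reference (a sketch assuming the standard
1.10 [kubernetes] section of airflow.cfg):
{code}
[kubernetes]
# ask the scheduler to clean up a worker pod once its task finishes
delete_worker_pods = true
{code}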



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] codecov-io commented on issue #4273: [AIRFLOW-XXX] GCP operators documentation clarifications

2018-12-03 Thread GitBox
codecov-io commented on issue #4273: [AIRFLOW-XXX] GCP operators documentation 
clarifications
URL: 
https://github.com/apache/incubator-airflow/pull/4273#issuecomment-443807416
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4273?src=pr=h1)
 Report
   > Merging 
[#4273](https://codecov.io/gh/apache/incubator-airflow/pull/4273?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/9c04e8f339a6d84b2fff983e6584af2b81249652?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4273/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4273?src=pr=tree)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #4273   +/-   ##
   =======================================
     Coverage   78.08%   78.08%
   =======================================
     Files         201      201
     Lines       16458    16458
   =======================================
     Hits        12851    12851
     Misses       3607     3607
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4273?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4273?src=pr=footer).
 Last update 
[9c04e8f...49cbba8](https://codecov.io/gh/apache/incubator-airflow/pull/4273?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kppullin commented on issue #4247: [AIRFLOW-3402] Support global k8s affinity and toleration configs

2018-12-03 Thread GitBox
kppullin commented on issue #4247: [AIRFLOW-3402] Support global k8s affinity 
and toleration configs
URL: 
https://github.com/apache/incubator-airflow/pull/4247#issuecomment-443800172
 
 
   @Fokko - Thanks for the heads up on the CI failures!  I believe the 
referenced `sla_miss` failures were a red herring, with the actual errors being 
flake8 issues (my fault) and an existing python2 compatibility issue exposed by 
the new tests (lack of the `dict.copy` function).  I then broke things more 
with my first attempt at fixing the python2 issue, since I thought the affected 
data structure was a list.  CI went green once I switched the logic to handle a 
dict.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] draksler commented on a change in pull request #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-12-03 Thread GitBox
draksler commented on a change in pull request #3733: [AIRFLOW-491] Add cache 
parameter in BigQuery query method - with 'api_resource_configs'
URL: https://github.com/apache/incubator-airflow/pull/3733#discussion_r238365698
 
 

 ##
 File path: airflow/contrib/hooks/bigquery_hook.py
 ##
 @@ -473,11 +482,11 @@ def create_external_table(self,
 def run_query(self,
   bql=None,
   sql=None,
-  destination_dataset_table=False,
+  destination_dataset_table=None,
   write_disposition='WRITE_EMPTY',
   allow_large_results=False,
-  flatten_results=None,
-  udf_config=False,
+  flatten_results=False,
 
 Review comment:
   I've opened a jira 
[here](https://issues.apache.org/jira/browse/AIRFLOW-3438), if it is a 
duplicate please close it. 
   Tests would normally not cover this case, as you would mock the whole hook, 
not part of it. It is unfortunate that the docstring was incorrect though.
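
To illustrate the point about mocking, a hedged sketch of the usual
operator-test pattern (the test body and patch target are assumptions, not
code from the Airflow test suite):
```python
from unittest import mock

from airflow.contrib.operators.bigquery_operator import BigQueryOperator


@mock.patch('airflow.contrib.operators.bigquery_operator.BigQueryHook')
def test_execute(mock_hook):
    operator = BigQueryOperator(task_id='bq_task', sql='SELECT 1')
    operator.execute(None)
    # The whole hook is replaced by a MagicMock, so run_query's type
    # validation never runs and udf_config=False slips through unnoticed.
    assert mock_hook.called
```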


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] potiuk opened a new pull request #4273: [AIRFLOW-XXX] GCP operators documentation clarifications

2018-12-03 Thread GitBox
potiuk opened a new pull request #4273: [AIRFLOW-XXX] GCP operators 
documentation clarifications
URL: https://github.com/apache/incubator-airflow/pull/4273
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] It's a pure documentation change
   
   ### Description
   
   - [X] Here are some details about my PR, including screenshots of any UI 
changes:
   These are documentation updated after technical writer review for 
GCP-related operators.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services



[GitHub] kaxil closed pull request #4272: [AIRFLOW-XXX] Add Get Simpl to Companies

2018-12-03 Thread GitBox
kaxil closed pull request #4272: [AIRFLOW-XXX] Add Get Simpl to Companies
URL: https://github.com/apache/incubator-airflow/pull/4272
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/README.md b/README.md
index 33ae765f7a..97431d03dd 100644
--- a/README.md
+++ b/README.md
@@ -189,6 +189,7 @@ Currently **officially** using Airflow:
 1. [GameWisp](https://gamewisp.com) [[@tjbiii](https://github.com/TJBIII) & 
[@theryanwalls](https://github.com/theryanwalls)]
 1. [GeneCards](https://www.genecards.org) 
[[@oferze](https://github.com/oferze)]
 1. [Gentner Lab](http://github.com/gentnerlab) 
[[@neuromusic](https://github.com/neuromusic)]
+1. [Get Simpl](https://getsimpl.com/) [[@rootcss](https://github.com/rootcss)]
 1. [Glassdoor](https://github.com/Glassdoor) 
[[@syvineckruyk](https://github.com/syvineckruyk) & 
[@sid88in](https://github.com/sid88in)]
 1. [Global Fashion Group](http://global-fashion-group.com) 
[[@GFG](https://github.com/GFG)]
 1. [GovTech GDS](https://gds-gov.tech) 
[[@chrissng](https://github.com/chrissng) & 
[@datagovsg](https://github.com/datagovsg)]


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on a change in pull request #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-12-03 Thread GitBox
kaxil commented on a change in pull request #3733: [AIRFLOW-491] Add cache 
parameter in BigQuery query method - with 'api_resource_configs'
URL: https://github.com/apache/incubator-airflow/pull/3733#discussion_r238320654
 
 

 ##
 File path: airflow/contrib/hooks/bigquery_hook.py
 ##
 @@ -473,11 +482,11 @@ def create_external_table(self,
     def run_query(self,
                   bql=None,
                   sql=None,
-                  destination_dataset_table=False,
+                  destination_dataset_table=None,
                   write_disposition='WRITE_EMPTY',
                   allow_large_results=False,
-                  flatten_results=None,
-                  udf_config=False,
+                  flatten_results=False,
 
 Review comment:
   @draksler Thanks for reporting. Will fix this bug and release it in 1.10.2


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Closed] (AIRFLOW-3440) Redundant LoggingMixin instance

2018-12-03 Thread Alberto Garcia-Raboso (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alberto Garcia-Raboso closed AIRFLOW-3440.
--
Resolution: Invalid

The instance is created in a class method.

> Redundant LoggingMixin instance
> ---
>
> Key: AIRFLOW-3440
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3440
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks
>Affects Versions: 1.9.0, 1.10.0, 1.10.1
>Reporter: Alberto Garcia-Raboso
>Assignee: Alberto Garcia-Raboso
>Priority: Trivial
>
> The class {{airflow.hooks.base_hook.BaseHook}} inherits from 
> {{airflow.utils.log.logging_mixin.LoggingMixin}}, so logging can be done with 
> {{self.log}} inside of the former.
> However, a fresh instance of {{LoggingMixin}} is created on [line 82 of 
> {{airflow/hooks/base_hook.py}}|https://github.com/apache/incubator-airflow/blob/1.10.1/airflow/hooks/base_hook.py#L82],
>  inside {{BaseHook}}, for use in the following line, which is unnecessary.
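
A simplified sketch of the pattern behind this resolution (the surrounding
classmethod shown here is an assumption, not a quote of base_hook.py):
{code:python}
from airflow.utils.log.logging_mixin import LoggingMixin


class BaseHook(LoggingMixin):
    @classmethod
    def get_connection(cls, conn_id):
        # A classmethod has no instance, so self.log is unavailable here;
        # hence the seemingly redundant throwaway LoggingMixin() instance.
        log = LoggingMixin().log
        log.info("Using connection to: %s", conn_id)
{code}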



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] hugoprudente commented on a change in pull request #4231: [AIRFLOW-3066] Adding support for AWS Batch parameters

2018-12-03 Thread GitBox
hugoprudente commented on a change in pull request #4231: [AIRFLOW-3066] Adding 
support for AWS Batch parameters
URL: https://github.com/apache/incubator-airflow/pull/4231#discussion_r238299200
 
 

 ##
 File path: airflow/contrib/operators/awsbatch_operator.py
 ##
 @@ -94,11 +100,21 @@ def execute(self, context):
         )
 
         try:
-            response = self.client.submit_job(
-                jobName=self.job_name,
-                jobQueue=self.job_queue,
-                jobDefinition=self.job_definition,
-                containerOverrides=self.overrides)
+            if self.parameters is None:
 
 Review comment:
   The API does not accept None parameters or {}; both trigger a KeyError
exception.
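
A sketch of the resulting guard (shape assumed from the diff above; only pass
`parameters` when it is set):
```python
def submit_batch_job(client, job_name, job_queue, job_definition,
                     overrides, parameters=None):
    # Hypothetical helper illustrating the guard: include `parameters` only
    # when it is non-empty, since both None and {} are reported to fail.
    kwargs = dict(jobName=job_name, jobQueue=job_queue,
                  jobDefinition=job_definition, containerOverrides=overrides)
    if parameters:
        kwargs['parameters'] = parameters
    return client.submit_job(**kwargs)
```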


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] hugoprudente commented on a change in pull request #4231: [AIRFLOW-3066] Adding support for AWS Batch parameters

2018-12-03 Thread GitBox
hugoprudente commented on a change in pull request #4231: [AIRFLOW-3066] Adding 
support for AWS Batch parameters
URL: https://github.com/apache/incubator-airflow/pull/4231#discussion_r238298730
 
 

 ##
 File path: airflow/contrib/example_dags/example_awsbatch_operator.py
 ##
 @@ -0,0 +1,95 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import airflow
+from airflow.utils.log.logging_mixin import LoggingMixin
+from airflow.models import DAG
+from datetime import timedelta
+
+log = LoggingMixin().log
+
+try:
+    # AWS Batch is optional, so not available in vanilla Airflow
+    # pip install apache-airflow[boto3]
+    from airflow.contrib.operators.awsbatch_operator import AWSBatchOperator
+
+    default_args = {
+        'owner': 'airflow',
+        'depends_on_past': False,
+        'start_date': airflow.utils.dates.days_ago(2),
+        'email': ['airf...@airflow.com'],
+        'email_on_failure': False,
+        'email_on_retry': False,
+        'retries': 1,
+        'retry_delay': timedelta(minutes=5),
+    }
+
+    dag = DAG(
+        'example_awsbatch_dag', default_args=default_args,
+        schedule_interval=timedelta(1))
+
+    # vanilla example
+    t0 = AWSBatchOperator(
+        task_id='airflow-vanilla',
+        job_name='airflow-vanilla',
+        job_queue='airflow',
+        job_definition='airflow',
+        overrides={},
+        queue='airflow',
+        dag=dag)
+
+    # overrides example
+    t1 = AWSBatchOperator(
+        job_name='airflow-overrides',
+        task_id='airflow-overrides',
+        job_queue='airflow',
+        job_definition='airflow',
+        overrides={
+            "command": [
+                "echo",
+                "overrides"
+            ]
+        },
+        queue='airflow',
+        dag=dag)
+
+    # parameters example
+    t2 = AWSBatchOperator(
+        job_name='airflow-parameters',
+        task_id='airflow-parameters',
+        job_queue='airflow',
+        job_definition='airflow',
+        overrides={
+            "command": [
+                "echo",
+                "Ref::input"
+            ]
+        },
+        parameters={
+            "input": "Airflow2000"
+        },
+        queue='airflow',
+        dag=dag)
+
+    t0.set_upstream(t1)
 
 Review comment:
   Yes, I'll make the update!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] rcorre commented on issue #3546: AIRFLOW-2664: Support filtering dag runs by id prefix in API.

2018-12-03 Thread GitBox
rcorre commented on issue #3546: AIRFLOW-2664: Support filtering dag runs by id 
prefix in API.
URL: 
https://github.com/apache/incubator-airflow/pull/3546#issuecomment-443736807
 
 
   Still having some trouble with the test:
   
   ```
   ==
   47) FAIL: test_get_dag_runs_success_with_run_id__like_parameter 
(tests.www_rbac.api.experimental.test_dag_runs_endpoint.TestDagRunsEndpoint)
   --
  Traceback (most recent call last):
   tests/www_rbac/api/experimental/test_dag_runs_endpoint.py line 90 in 
test_get_dag_runs_success_with_run_id__like_parameter
 execution_date=datetime.datetime.fromtimestamp(1539097214),
   airflow/api/common/experimental/trigger_dag.py line 104 in trigger_dag
 replace_microseconds=replace_microseconds,
   airflow/api/common/experimental/trigger_dag.py line 45 in _trigger_dag
 assert timezone.is_localized(execution_date)
  AssertionError: 
   ```
   
   My next guess is to just remove the `execution_date` parameter, but it seems 
like a bad idea to have unit tests with a field that can change every time you 
execute them.
   
   I haven't been able to test at all locally. After following the steps in 
`CONTRIBUTING.md`:
   
   ```
   docker run -t -i -v `pwd`:/airflow/ -w /airflow/ -e 
SLUGIFY_USES_TEXT_UNIDECODE=yes python:3.5 bash
   cd /airflow/
   pip install -e ".[hdfs,hive,druid,devel]"
   airflow initdb
   nosetests -v tests/www_rbac/api/experimental/test_dag_runs_endpoint.py
   ```
   
   Every test fails with:
   
   ```
   ==
   ERROR: test_get_dag_runs_success 
(tests.www_rbac.api.experimental.test_dag_runs_endpoint.TestDagRunsEndpoint)
   --
   Traceback (most recent call last):
 File "/airflow/tests/www_rbac/api/experimental/test_dag_runs_endpoint.py", 
line 43, in setUp
   app, _ = application.create_app(testing=True)
 File "/airflow/airflow/www_rbac/app.py", line 146, in create_app
   security_manager.sync_roles()
 File "/airflow/airflow/www_rbac/security.py", line 439, in sync_roles
   self.create_custom_dag_permission_view()
 File "/airflow/airflow/www_rbac/security.py", line 387, in 
create_custom_dag_permission_view
   all_perm_views = set([role.permission_view_id for role in 
all_perm_view_by_user])
 File "/usr/local/lib/python3.5/site-packages/sqlalchemy/orm/query.py", 
line 2855, in __iter__
   return self._execute_and_instances(context)
 File "/usr/local/lib/python3.5/site-packages/sqlalchemy/orm/query.py", 
line 2878, in _execute_and_instances
   result = conn.execute(querycontext.statement, self._params)
 File "/usr/local/lib/python3.5/site-packages/sqlalchemy/engine/base.py", 
line 945, in execute
   return meth(self, multiparams, params)
 File "/usr/local/lib/python3.5/site-packages/sqlalchemy/sql/elements.py", 
line 263, in _execute_on_connection
   return connection._execute_clauseelement(self, multiparams, params)
 File "/usr/local/lib/python3.5/site-packages/sqlalchemy/engine/base.py", 
line 1053, in _execute_clauseelement
   compiled_sql, distilled_params
 File "/usr/local/lib/python3.5/site-packages/sqlalchemy/engine/base.py", 
line 1189, in _execute_context
   context)
 File "/usr/local/lib/python3.5/site-packages/sqlalchemy/engine/base.py", 
line 1402, in _handle_dbapi_exception
   exc_info
 File "/usr/local/lib/python3.5/site-packages/sqlalchemy/util/compat.py", 
line 203, in raise_from_cause
   reraise(type(exception), exception, tb=exc_tb, cause=cause)
 File "/usr/local/lib/python3.5/site-packages/sqlalchemy/util/compat.py", 
line 186, in reraise
   raise value.with_traceback(tb)
 File "/usr/local/lib/python3.5/site-packages/sqlalchemy/engine/base.py", 
line 1182, in _execute_context
   context)
 File 
"/usr/local/lib/python3.5/site-packages/sqlalchemy/engine/default.py", line 
470, in do_execute
   cursor.execute(statement, parameters)
   sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: 
ab_permission_view_role [SQL: 'SELECT ab_permission_view_role.id AS 
ab_permission_view_role_id, ab_permission_view_role.permission_view_id AS 
ab_permission_view_role_permission_view_id, ab_permission_view_role.role_id AS 
ab_permission_view_role_role_id \nFROM ab_permission_view_role JOIN 
ab_permission_view ON ab_permission_view.id = 
ab_permission_view_role.permission_view_id JOIN ab_view_menu ON ab_view_menu.id 
= ab_permission_view.view_menu_id \nWHERE ab_permission_view_role.role_id = ? 
AND ab_permission_view.view_menu_id != ?'] [parameters: (4, 51)]
   ```
   
   Any ideas?
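
One hedged guess at the first failure (not a confirmed fix): the assertion
wants a timezone-aware execution_date, which airflow.utils.timezone can
construct:
```python
import datetime

from airflow.utils import timezone

naive = datetime.datetime.fromtimestamp(1539097214)  # tzinfo is None
assert not timezone.is_localized(naive)              # what the assert trips on

aware = timezone.datetime(2018, 10, 9, 14, 20, 14)   # tz-aware (UTC) datetime
assert timezone.is_localized(aware)
```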
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] xnuinside commented on a change in pull request #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-12-03 Thread GitBox
xnuinside commented on a change in pull request #3733: [AIRFLOW-491] Add cache 
parameter in BigQuery query method - with 'api_resource_configs'
URL: https://github.com/apache/incubator-airflow/pull/3733#discussion_r238290290
 
 

 ##
 File path: airflow/contrib/hooks/bigquery_hook.py
 ##
 @@ -473,11 +482,11 @@ def create_external_table(self,
     def run_query(self,
                   bql=None,
                   sql=None,
-                  destination_dataset_table=False,
+                  destination_dataset_table=None,
                   write_disposition='WRITE_EMPTY',
                   allow_large_results=False,
-                  flatten_results=None,
-                  udf_config=False,
+                  flatten_results=False,
 
 Review comment:
   @draksler, it is very sad that this was not covered by tests in
BigQueryOperator and that it did not match the docstring, because udf_config
has a list type, not bool. We need to open a bug issue and make a fix.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-3440) Redundant LoggingMixin instance

2018-12-03 Thread Alberto Garcia-Raboso (JIRA)
Alberto Garcia-Raboso created AIRFLOW-3440:
--

 Summary: Redundant LoggingMixin instance
 Key: AIRFLOW-3440
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3440
 Project: Apache Airflow
  Issue Type: Bug
  Components: hooks
Affects Versions: 1.10.1, 1.10.0, 1.9.0
Reporter: Alberto Garcia-Raboso
Assignee: Alberto Garcia-Raboso


The class {{airflow.hooks.base_hook.BaseHook}} inherits from 
{{airflow.utils.log.logging_mixin.LoggingMixin}}, so logging can be done with 
{{self.log}} inside of the former.

However, a fresh instance of {{LoggingMixin}} is created on [line 82 of 
{{airflow/hooks/base_hook.py}}|https://github.com/apache/incubator-airflow/blob/1.10.1/airflow/hooks/base_hook.py#L82],
 inside {{BaseHook}}, for use in the following line, which is unnecessary.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3439) Logs with non-ascii characters can't be read from GCS

2018-12-03 Thread Pavel Raschetnov (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Raschetnov updated AIRFLOW-3439:
--
Summary: Logs with non-ascii characters can't be read from GCS  (was: Logs 
with non-ascii characters can't be read )

> Logs with non-ascii characters can't be read from GCS
> -
>
> Key: AIRFLOW-3439
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3439
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: logging, webserver
>Affects Versions: 1.10.1
>Reporter: Pavel Raschetnov
>Priority: Major
>  Labels: UTF-8, encoding, gcs_task_handler
>
> Can't see task logs in web interface due to
> {{*** Unable to read remote log from 
> gs://bucket/dag/abcd61826/2018-11-23T00:00:00+00:00/1.log *** 'ascii' codec 
> can't decode byte 0xc3 in position 4421445: ordinal not in range(128)}}
> GCSTaskHandler should use `.decode('utf-8')` instead of `.decode()` in the
> gcs_read() method
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-3439) Logs with non-ascii characters can't be read

2018-12-03 Thread Pavel Raschetnov (JIRA)
Pavel Raschetnov created AIRFLOW-3439:
-

 Summary: Logs with non-ascii characters can't be read 
 Key: AIRFLOW-3439
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3439
 Project: Apache Airflow
  Issue Type: Bug
  Components: logging, webserver
Affects Versions: 1.10.1
Reporter: Pavel Raschetnov


Can't see task logs in web interface due to

{{*** Unable to read remote log from 
gs://bucket/dag/abcd61826/2018-11-23T00:00:00+00:00/1.log *** 'ascii' codec 
can't decode byte 0xc3 in position 4421445: ordinal not in range(128)}}

GCSTaskHandler should use `.decode('utf-8')` instead of `.decode()` in the
gcs_read() method
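
A minimal sketch of the failure mode and the suggested fix (assuming Python 2,
where bytes.decode() defaults to the ascii codec):
{code:python}
log_bytes = b'caf\xc3\xa9\n'      # UTF-8 log content with non-ascii bytes

# Under Python 2, log_bytes.decode() uses the default 'ascii' codec and
# raises UnicodeDecodeError, matching the error message above.
text = log_bytes.decode('utf-8')  # decodes correctly on Python 2 and 3
{code}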

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-3438) BigQueryOperator should default udf_config to None instead of false.

2018-12-03 Thread Luka Draksler (JIRA)
Luka Draksler created AIRFLOW-3438:
--

 Summary: BigQueryOperator should default udf_config to None 
instead of false.
 Key: AIRFLOW-3438
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3438
 Project: Apache Airflow
  Issue Type: Bug
  Components: operators
Reporter: Luka Draksler


BigQueryOperator currently sets the default value of udf_config to False. This
no longer works due to [https://github.com/apache/incubator-airflow/pull/3733]
validating the type of that parameter as either None or list. The default
value needs to be changed to None.

 

Note, other users of the hook may potentially encounter the same issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3438) BigQueryOperator should default udf_config to None instead of false.

2018-12-03 Thread Luka Draksler (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luka Draksler updated AIRFLOW-3438:
---
Description: 
BigQueryOperator currently sets default value of udf_config to False. This no 
longer works due to [https://github.com/apache/incubator-airflow/pull/3733] 
validating the type of that parameter as either None or list. Default value 
needs to be changed to None.

The line in question added in the commit referenced above
{code:java}
(udf_config, 'userDefinedFunctionResources', None, list),
{code}
 

 

Note, other users of the hook may potentially encounter the same issue.

  was:
BigQueryOperator currently sets default value of udf_config to False. This no 
longer works due to [https://github.com/apache/incubator-airflow/pull/3733] 
validating the type of that parameter as either None or list. Default value 
needs to be changed to None.

 

Note, other users of the hook may potentially encounter the same issue.


> BigQueryOperator should default udf_config to None instead of false.
> 
>
> Key: AIRFLOW-3438
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3438
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Reporter: Luka Draksler
>Priority: Major
>  Labels: easyfix
>
> BigQueryOperator currently sets the default value of udf_config to False. This
> no longer works due to [https://github.com/apache/incubator-airflow/pull/3733]
> validating the type of that parameter as either None or list. The default
> value needs to be changed to None.
> The line in question, added in the commit referenced above:
> {code:java}
> (udf_config, 'userDefinedFunctionResources', None, list),
> {code}
>  
>  
> Note, other users of the hook may potentially encounter the same issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] draksler commented on a change in pull request #3733: [AIRFLOW-491] Add cache parameter in BigQuery query method - with 'api_resource_configs'

2018-12-03 Thread GitBox
draksler commented on a change in pull request #3733: [AIRFLOW-491] Add cache 
parameter in BigQuery query method - with 'api_resource_configs'
URL: https://github.com/apache/incubator-airflow/pull/3733#discussion_r238271833
 
 

 ##
 File path: airflow/contrib/hooks/bigquery_hook.py
 ##
 @@ -473,11 +482,11 @@ def create_external_table(self,
     def run_query(self,
                   bql=None,
                   sql=None,
-                  destination_dataset_table=False,
+                  destination_dataset_table=None,
                   write_disposition='WRITE_EMPTY',
                   allow_large_results=False,
-                  flatten_results=None,
-                  udf_config=False,
+                  flatten_results=False,
 
 Review comment:
   This breaks the BigQueryOperator, which sets udf_config to False when it
calls the hook.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io commented on issue #4272: [AIRFLOW-XXX] Add Get Simpl to Companies

2018-12-03 Thread GitBox
codecov-io commented on issue #4272: [AIRFLOW-XXX] Add Get Simpl to Companies
URL: 
https://github.com/apache/incubator-airflow/pull/4272#issuecomment-443711582
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4272?src=pr=h1)
 Report
   > Merging 
[#4272](https://codecov.io/gh/apache/incubator-airflow/pull/4272?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/1b1231bc2eccc7c3307810536499158f1037018c?src=pr=desc)
 will **decrease** coverage by `0.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4272/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4272?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #4272      +/-   ##
   ==========================================
   - Coverage    78.1%   78.08%   -0.02%
   ==========================================
     Files         201      201
     Lines       16458    16458
   ==========================================
   - Hits        12854    12851       -3
   - Misses       3604     3607       +3
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/4272?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/jobs.py](https://codecov.io/gh/apache/incubator-airflow/pull/4272/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzLnB5)
 | `77.39% <0%> (-0.28%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4272?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4272?src=pr=footer).
 Last update 
[1b1231b...be7b057](https://codecov.io/gh/apache/incubator-airflow/pull/4272?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] yujiantao commented on issue #3197: [AIRFLOW-2267] Airflow DAG level access

2018-12-03 Thread GitBox
yujiantao commented on issue #3197: [AIRFLOW-2267] Airflow DAG level access
URL: 
https://github.com/apache/incubator-airflow/pull/3197#issuecomment-443709952
 
 
   Any progress on this feature?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Resolved] (AIRFLOW-3434) SFTPOperator does not create intermediate directories

2018-12-03 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik resolved AIRFLOW-3434.
-
Resolution: Fixed

Resolved by https://github.com/apache/incubator-airflow/pull/4270

> SFTPOperator does not create intermediate directories
> -
>
> Key: AIRFLOW-3434
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3434
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Kaxil Naik
>Assignee: Kaxil Naik
>Priority: Minor
> Fix For: 1.10.2
>
>
> When using SFTPOperator with either 'get' or 'put', it doesn't create the 
> intermediate directories when copying the file and fails with directory does 
> not exist.
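
For the local ('get') side, the usual remedy looks like the sketch below; the
remote ('put') side needs an equivalent walk with the SFTP client's mkdir.
(Illustrative only; PR #4270 contains the actual implementation.)
{code:python}
import os

local_full_path = '/tmp/reports/2018/12/output.csv'  # hypothetical target
local_dir = os.path.dirname(local_full_path)
if not os.path.exists(local_dir):
    os.makedirs(local_dir)  # create any missing intermediate directories
{code}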



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] sprzedwojski commented on a change in pull request #4251: [AIRFLOW-2440] Add Google Cloud SQL import/export operator

2018-12-03 Thread GitBox
sprzedwojski commented on a change in pull request #4251: [AIRFLOW-2440] Add 
Google Cloud SQL import/export operator
URL: https://github.com/apache/incubator-airflow/pull/4251#discussion_r238257151
 
 

 ##
 File path: airflow/contrib/hooks/gcp_sql_hook.py
 ##
 @@ -254,6 +254,54 @@ def delete_database(self, project, instance, database):
         operation_name = response["name"]
         return self._wait_for_operation_to_complete(project, operation_name)
 
+    def export_instance(self, project_id, instance_id, body):
+        """
+        Exports data from a Cloud SQL instance to a Cloud Storage bucket
+        as a SQL dump or CSV file.
+
+        :param project_id: Project ID of the project where the instance exists.
+        :type project_id: str
+        :param instance_id: Name of the Cloud SQL instance. This does not
+            include the project ID.
+        :type instance_id: str
+        :param body: The request body, as described in
+            https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/instances/export#request-body
+        :type body: dict
+        :return: True if the operation succeeded, raises an error otherwise
+        :rtype: bool
+        """
+        response = self.get_conn().instances().export(
+            project=project_id,
+            instance=instance_id,
+            body=body
+        ).execute(num_retries=NUM_RETRIES)
+        operation_name = response["name"]
+        return self._wait_for_operation_to_complete(project_id, operation_name)
+
+    def import_instance(self, project_id, instance_id, body):
+        """
+        Imports data into a Cloud SQL instance from a SQL dump or CSV file in
+        Cloud Storage.
+
+        :param project_id: Project ID of the project where the instance exists.
+        :type project_id: str
+        :param instance_id: Name of the Cloud SQL instance. This does not
+            include the project ID.
+        :type instance_id: str
+        :param body: The request body, as described in
+            https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/instances/export#request-body
+        :type body: dict
+        :return: True if the operation succeeded, raises an error otherwise
+        :rtype: bool
+        """
+        response = self.get_conn().instances().import_(
 
 Review comment:
   Done
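
For orientation, a hypothetical call to the export_instance method quoted
above (the hook class name, constructor arguments, and connection id are
assumptions, not code from this PR):
```python
from airflow.contrib.hooks.gcp_sql_hook import CloudSqlHook

hook = CloudSqlHook(api_version='v1beta4', gcp_conn_id='google_cloud_default')
export_body = {
    "exportContext": {
        "fileType": "CSV",
        "uri": "gs://my-bucket/my-instance-export.csv",
        "csvExportOptions": {"selectQuery": "SELECT * FROM my_table"},
    }
}
hook.export_instance(project_id='my-project',
                     instance_id='my-instance',
                     body=export_body)
```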


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sprzedwojski commented on a change in pull request #4251: [AIRFLOW-2440] Add Google Cloud SQL import/export operator

2018-12-03 Thread GitBox
sprzedwojski commented on a change in pull request #4251: [AIRFLOW-2440] Add 
Google Cloud SQL import/export operator
URL: https://github.com/apache/incubator-airflow/pull/4251#discussion_r238257116
 
 

 ##
 File path: airflow/contrib/hooks/gcp_sql_hook.py
 ##
 @@ -254,6 +254,54 @@ def delete_database(self, project, instance, database):
         operation_name = response["name"]
         return self._wait_for_operation_to_complete(project, operation_name)
 
+    def export_instance(self, project_id, instance_id, body):
+        """
+        Exports data from a Cloud SQL instance to a Cloud Storage bucket
+        as a SQL dump or CSV file.
+
+        :param project_id: Project ID of the project where the instance exists.
+        :type project_id: str
+        :param instance_id: Name of the Cloud SQL instance. This does not
+            include the project ID.
+        :type instance_id: str
+        :param body: The request body, as described in
+            https://cloud.google.com/sql/docs/mysql/admin-api/v1beta4/instances/export#request-body
+        :type body: dict
+        :return: True if the operation succeeded, raises an error otherwise
+        :rtype: bool
+        """
+        response = self.get_conn().instances().export(
 
 Review comment:
   Done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] rootcss opened a new pull request #4272: [AIRFLOW-XXX] Add Get Simpl to Companies

2018-12-03 Thread GitBox
rootcss opened a new pull request #4272: [AIRFLOW-XXX] Add Get Simpl to 
Companies
URL: https://github.com/apache/incubator-airflow/pull/4272
 
 
   Jira
   No Jira issue. Add Get Simpl to companies list.
   
   Description
   This PR adds Get Simpl to the companies list in the README.md.
   
   Tests
   No tests required. No code changes.
   
   Documentation
   No code changes.
   
   Code Quality
   No code changes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil closed pull request #2273: [AIRFLOW-1171] Fix up encoding for Postgres

2018-12-03 Thread GitBox
kaxil closed pull request #2273: [AIRFLOW-1171] Fix up encoding for Postgres
URL: https://github.com/apache/incubator-airflow/pull/2273
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/hooks/postgres_hook.py b/airflow/hooks/postgres_hook.py
index 4b460c1158..6ab9ea7110 100644
--- a/airflow/hooks/postgres_hook.py
+++ b/airflow/hooks/postgres_hook.py
@@ -67,4 +67,9 @@ def _serialize_cell(cell, conn):
         :rtype: str
         """
 
-        return psycopg2.extensions.adapt(cell).getquoted().decode('utf-8')
+        adapted = psycopg2.extensions.adapt(cell)
+        try:
+            adapted.prepare(conn)
+        except AttributeError:
+            pass
+        return adapted.getquoted()
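
Why prepare() matters, in a short sketch (behavior summary, not PR text:
psycopg2's string adapters only learn the connection's client encoding from
prepare(conn); without it they fall back to latin-1, which is what raised the
UnicodeEncodeError in the issue):
```python
import psycopg2
import psycopg2.extensions

conn = psycopg2.connect("dbname=test")         # assumes a reachable database
adapted = psycopg2.extensions.adapt(u'naïve')  # returns a QuotedString adapter
adapted.prepare(conn)                          # binds the connection encoding
print(adapted.getquoted())                     # quoted bytes in that encoding

# Adapters for types like int have no prepare() method, which is why the
# patch wraps the call in try/except AttributeError.
```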


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-1171) Encoding error for non latin-1 Postgres database

2018-12-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707136#comment-16707136
 ] 

ASF GitHub Bot commented on AIRFLOW-1171:
-

kaxil closed pull request #2273: [AIRFLOW-1171] Fix up encoding for Postgres
URL: https://github.com/apache/incubator-airflow/pull/2273
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/hooks/postgres_hook.py b/airflow/hooks/postgres_hook.py
index 4b460c1158..6ab9ea7110 100644
--- a/airflow/hooks/postgres_hook.py
+++ b/airflow/hooks/postgres_hook.py
@@ -67,4 +67,9 @@ def _serialize_cell(cell, conn):
 :rtype: str
 """
 
-return psycopg2.extensions.adapt(cell).getquoted().decode('utf-8')
+adapted = psycopg2.extensions.adapt(cell)
+try:
+adapted.prepare(conn)
+except AttributeError:
+pass
+return adapted.getquoted()


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Encoding error for non latin-1 Postgres database
> 
>
> Key: AIRFLOW-1171
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1171
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: db, hooks
>Affects Versions: 1.8.0
> Environment: macOS 10.12.5
> Python 2.7.12
> Postgres 9.6.1
> However, these are irrelevant to this issue.
>Reporter: Richard Lee
>Assignee: Richard Lee
>Priority: Major
>
> There's [a known issue|https://github.com/psycopg/psycopg2/issues/331] from 
> psycopg2 that Airflow ignores the encoding settings from db by default and 
> which results in encoding error if there's any non latin-1 content in 
> database cell.
> Reference stack trace:
> {code}
>   File "dags/recipe_hourly_pageviews.py", line 73, in 
> dag.cli()
>   File 
> "/Users/dlackty/.pyenv/versions/2.7.12/lib/python2.7/site-packages/airflow/models.py",
>  line 3339, in cli
> args.func(args, self)
>   File 
> "/Users/dlackty/.pyenv/versions/2.7.12/lib/python2.7/site-packages/airflow/bin/cli.py",
>  line 585, in test
> ti.run(ignore_task_deps=True, ignore_ti_state=True, test_mode=True)
>   File 
> "/Users/dlackty/.pyenv/versions/2.7.12/lib/python2.7/site-packages/airflow/utils/db.py",
>  line 53, in wrapper
> result = func(*args, **kwargs)
>   File 
> "/Users/dlackty/.pyenv/versions/2.7.12/lib/python2.7/site-packages/airflow/models.py",
>  line 1374, in run
> result = task_copy.execute(context=context)
>   File 
> "/Users/dlackty/.pyenv/versions/2.7.12/lib/python2.7/site-packages/airflow/operators/generic_transfer.py",
>  l
> ine 78, in execute
> destination_hook.insert_rows(table=self.destination_table, rows=results)
>   File 
> "/Users/dlackty/.pyenv/versions/2.7.12/lib/python2.7/site-packages/airflow/hooks/dbapi_hook.py",
>  line 215, i
> n insert_rows
> l.append(self._serialize_cell(cell, conn))
>   File 
> "/Users/dlackty/.pyenv/versions/2.7.12/lib/python2.7/site-packages/airflow/hooks/postgres_hook.py",
>  line 70,
>  in _serialize_cell
> return psycopg2.extensions.adapt(cell).getquoted().decode('utf-8')
> UnicodeEncodeError: 'latin-1' codec can't encode characters in position 6-10: 
> ordinal not in range(256)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
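
For context on the fix merged above - a minimal sketch of the adapt/prepare
dance, assuming a psycopg2 connection whose client encoding matches the
database (the connection string and sample value are illustrative):

    import psycopg2
    import psycopg2.extensions

    conn = psycopg2.connect("dbname=test")  # assumed: client_encoding is UTF8

    value = u'\u4f60\u597d'  # non latin-1 content that previously raised UnicodeEncodeError
    adapted = psycopg2.extensions.adapt(value)
    try:
        # String adapters learn the connection's encoding here; adapters that
        # don't need the connection simply lack prepare() and are skipped.
        adapted.prepare(conn)
    except AttributeError:
        pass
    print(adapted.getquoted())  # bytes quoted in the connection's own encoding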


[GitHub] kaxil closed pull request #2074: [AIRFLOW-869] Web UI Mark Success Upstream Option Bug

2018-12-03 Thread GitBox
kaxil closed pull request #2074: [AIRFLOW-869] Web UI Mark Success Upstream 
Option Bug
URL: https://github.com/apache/incubator-airflow/pull/2074
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/www/views.py b/airflow/www/views.py
index d7c46a746b..fc49d7adaf 100644
--- a/airflow/www/views.py
+++ b/airflow/www/views.py
@@ -1123,7 +1123,7 @@ def success(self):
 if recursive:
 recurse_tasks(relatives, task_ids, dag_ids, task_id_to_dag)
 if upstream:
-relatives = task.get_flat_relatives(upstream=False)
+relatives = task.get_flat_relatives(upstream=True)
 task_ids += [t.task_id for t in relatives]
 if recursive:
 recurse_tasks(relatives, task_ids, dag_ids, task_id_to_dag)


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-869) Web UI Mark Success Upstream Option Bug

2018-12-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707135#comment-16707135
 ] 

ASF GitHub Bot commented on AIRFLOW-869:


kaxil closed pull request #2074: [AIRFLOW-869] Web UI Mark Success Upstream 
Option Bug
URL: https://github.com/apache/incubator-airflow/pull/2074
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/www/views.py b/airflow/www/views.py
index d7c46a746b..fc49d7adaf 100644
--- a/airflow/www/views.py
+++ b/airflow/www/views.py
@@ -1123,7 +1123,7 @@ def success(self):
 if recursive:
 recurse_tasks(relatives, task_ids, dag_ids, task_id_to_dag)
 if upstream:
-relatives = task.get_flat_relatives(upstream=False)
+relatives = task.get_flat_relatives(upstream=True)
 task_ids += [t.task_id for t in relatives]
 if recursive:
 recurse_tasks(relatives, task_ids, dag_ids, task_id_to_dag)


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Web UI Mark Success Upstream Option Bug
> ---
>
> Key: AIRFLOW-869
> URL: https://issues.apache.org/jira/browse/AIRFLOW-869
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.8.0
>Reporter: Yi Chen
>Priority: Major
> Fix For: 1.8.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> A simple bug report: I tracked down to the source code of Airflow Web UI, 
> look at this line, 
> https://github.com/apache/incubator-airflow/blob/v1-8-stable/airflow/www/views.py#L1127
>  .  It should be `relatives = task.get_flat_relatives(upstream=True)`. But 
> even with this fix, there are still issues about the "Mark Success" 
> functionality. I hope we ship this bug fix along with v1.8. And I will open 
> another ticket discussing the functionality of "Mark Success".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
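
The one-character fix above is easy to sanity-check with a toy DAG - a minimal
sketch, assuming the 1.8-era API (DummyOperator, get_flat_relatives):

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.dummy_operator import DummyOperator

    dag = DAG('toy', start_date=datetime(2018, 1, 1), schedule_interval=None)
    a = DummyOperator(task_id='a', dag=dag)
    b = DummyOperator(task_id='b', dag=dag)
    c = DummyOperator(task_id='c', dag=dag)
    a.set_downstream(b)
    b.set_downstream(c)

    # "Mark Success" with the Upstream option on `b` should collect `a`,
    # which only happens with upstream=True:
    print([t.task_id for t in b.get_flat_relatives(upstream=True)])   # ['a']
    print([t.task_id for t in b.get_flat_relatives(upstream=False)])  # ['c']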


[jira] [Commented] (AIRFLOW-3434) SFTPOperator does not create intermediate directories

2018-12-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707133#comment-16707133
 ] 

ASF GitHub Bot commented on AIRFLOW-3434:
-

kaxil closed pull request #4270: [AIRFLOW-3434] Allows creating intermediate 
folders in SFTPOperator
URL: https://github.com/apache/incubator-airflow/pull/4270
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/operators/sftp_operator.py 
b/airflow/contrib/operators/sftp_operator.py
index 620d875f89..117bc55a8c 100644
--- a/airflow/contrib/operators/sftp_operator.py
+++ b/airflow/contrib/operators/sftp_operator.py
@@ -16,6 +16,8 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+import os
+
 from airflow.contrib.hooks.ssh_hook import SSHHook
 from airflow.exceptions import AirflowException
 from airflow.models import BaseOperator
@@ -48,9 +50,28 @@ class SFTPOperator(BaseOperator):
 :param remote_filepath: remote file path to get or put. (templated)
 :type remote_filepath: str
 :param operation: specify operation 'get' or 'put', defaults to put
-:type get: bool
+:type operation: str
:param confirm: specify if the SFTP operation should be confirmed, defaults to True
:type confirm: bool
+:param create_intermediate_dirs: create missing intermediate directories when
+copying from remote to local and vice-versa. Default is False.
+
+Example: The following task would copy ``file.txt`` to the remote host
+at ``/tmp/tmp1/tmp2/`` while creating ``tmp``,``tmp1`` and ``tmp2`` if they
+don't exist. If the parameter is not passed it would error as the directory
+does not exist. ::
+
+put_file = SFTPOperator(
+task_id="test_sftp",
+ssh_conn="ssh_default",
+local_filepath="/tmp/file.txt",
+remote_filepath="/tmp/tmp1/tmp2/file.txt",
+operation="put",
+create_intermediate_dirs=True,
+dag=dag
+)
+
+:type create_intermediate_dirs: bool
 """
 template_fields = ('local_filepath', 'remote_filepath', 'remote_host')
 
@@ -63,6 +84,7 @@ def __init__(self,
  remote_filepath=None,
  operation=SFTPOperation.PUT,
  confirm=True,
+ create_intermediate_dirs=False,
  *args,
  **kwargs):
 super(SFTPOperator, self).__init__(*args, **kwargs)
@@ -73,6 +95,7 @@ def __init__(self,
 self.remote_filepath = remote_filepath
 self.operation = operation
 self.confirm = confirm
+self.create_intermediate_dirs = create_intermediate_dirs
 if not (self.operation.lower() == SFTPOperation.GET or
 self.operation.lower() == SFTPOperation.PUT):
 raise TypeError("unsupported operation value {0}, expected {1} or 
{2}"
@@ -101,11 +124,25 @@ def execute(self, context):
 with self.ssh_hook.get_conn() as ssh_client:
 sftp_client = ssh_client.open_sftp()
 if self.operation.lower() == SFTPOperation.GET:
+local_folder = os.path.dirname(self.local_filepath)
+if self.create_intermediate_dirs:
+# Create Intermediate Directories if it doesn't exist
+try:
+os.makedirs(local_folder)
+except OSError:
+if not os.path.isdir(local_folder):
+raise
 file_msg = "from {0} to {1}".format(self.remote_filepath,
 self.local_filepath)
 self.log.debug("Starting to transfer %s", file_msg)
 sftp_client.get(self.remote_filepath, self.local_filepath)
 else:
+remote_folder = os.path.dirname(self.remote_filepath)
+if self.create_intermediate_dirs:
+_make_intermediate_dirs(
+sftp_client=sftp_client,
+remote_directory=remote_folder,
+)
 file_msg = "from {0} to {1}".format(self.local_filepath,
 self.remote_filepath)
 self.log.debug("Starting to transfer file %s", file_msg)
@@ -118,3 +155,26 @@ def execute(self, context):
.format(file_msg, str(e)))
 
 return None
+
+
+def 

[GitHub] kaxil closed pull request #4270: [AIRFLOW-3434] Allows creating intermediate folders in SFTPOperator

2018-12-03 Thread GitBox
kaxil closed pull request #4270: [AIRFLOW-3434] Allows creating intermediate 
folders in SFTPOperator
URL: https://github.com/apache/incubator-airflow/pull/4270
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/operators/sftp_operator.py 
b/airflow/contrib/operators/sftp_operator.py
index 620d875f89..117bc55a8c 100644
--- a/airflow/contrib/operators/sftp_operator.py
+++ b/airflow/contrib/operators/sftp_operator.py
@@ -16,6 +16,8 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+import os
+
 from airflow.contrib.hooks.ssh_hook import SSHHook
 from airflow.exceptions import AirflowException
 from airflow.models import BaseOperator
@@ -48,9 +50,28 @@ class SFTPOperator(BaseOperator):
 :param remote_filepath: remote file path to get or put. (templated)
 :type remote_filepath: str
 :param operation: specify operation 'get' or 'put', defaults to put
-:type get: bool
+:type operation: str
+:param confirm: specify if the SFTP operation should be confirmed, defaults to True
:type confirm: bool
+:param create_intermediate_dirs: create missing intermediate directories when
+copying from remote to local and vice-versa. Default is False.
+
+Example: The following task would copy ``file.txt`` to the remote host
+at ``/tmp/tmp1/tmp2/`` while creating ``tmp``,``tmp1`` and ``tmp2`` if they
+don't exist. If the parameter is not passed it would error as the directory
+does not exist. ::
+
+put_file = SFTPOperator(
+task_id="test_sftp",
+ssh_conn="ssh_default",
+local_filepath="/tmp/file.txt",
+remote_filepath="/tmp/tmp1/tmp2/file.txt",
+operation="put",
+create_intermediate_dirs=True,
+dag=dag
+)
+
+:type create_intermediate_dirs: bool
 """
 template_fields = ('local_filepath', 'remote_filepath', 'remote_host')
 
@@ -63,6 +84,7 @@ def __init__(self,
  remote_filepath=None,
  operation=SFTPOperation.PUT,
  confirm=True,
+ create_intermediate_dirs=False,
  *args,
  **kwargs):
 super(SFTPOperator, self).__init__(*args, **kwargs)
@@ -73,6 +95,7 @@ def __init__(self,
 self.remote_filepath = remote_filepath
 self.operation = operation
 self.confirm = confirm
+self.create_intermediate_dirs = create_intermediate_dirs
 if not (self.operation.lower() == SFTPOperation.GET or
 self.operation.lower() == SFTPOperation.PUT):
 raise TypeError("unsupported operation value {0}, expected {1} or 
{2}"
@@ -101,11 +124,25 @@ def execute(self, context):
 with self.ssh_hook.get_conn() as ssh_client:
 sftp_client = ssh_client.open_sftp()
 if self.operation.lower() == SFTPOperation.GET:
+local_folder = os.path.dirname(self.local_filepath)
+if self.create_intermediate_dirs:
+# Create Intermediate Directories if it doesn't exist
+try:
+os.makedirs(local_folder)
+except OSError:
+if not os.path.isdir(local_folder):
+raise
 file_msg = "from {0} to {1}".format(self.remote_filepath,
 self.local_filepath)
 self.log.debug("Starting to transfer %s", file_msg)
 sftp_client.get(self.remote_filepath, self.local_filepath)
 else:
+remote_folder = os.path.dirname(self.remote_filepath)
+if self.create_intermediate_dirs:
+_make_intermediate_dirs(
+sftp_client=sftp_client,
+remote_directory=remote_folder,
+)
 file_msg = "from {0} to {1}".format(self.local_filepath,
 self.remote_filepath)
 self.log.debug("Starting to transfer file %s", file_msg)
@@ -118,3 +155,26 @@ def execute(self, context):
.format(file_msg, str(e)))
 
 return None
+
+
+def _make_intermediate_dirs(sftp_client, remote_directory):
+"""
+Create all the intermediate directories in a remote host
+
+:param sftp_client: A Paramiko SFTP client.
+:param remote_directory: Absolute Path of the directory 
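
The archived diff truncates here. For illustration only, a minimal sketch of a
helper with this signature, using plain Paramiko SFTPClient calls (stat/mkdir);
this is an assumption, not necessarily the merged body:

    import os

    def _make_intermediate_dirs(sftp_client, remote_directory):
        """Recursively ensure remote_directory and its parents exist."""
        if remote_directory in ('/', ''):
            return
        try:
            sftp_client.stat(remote_directory)  # already exists: nothing to do
        except IOError:
            _make_intermediate_dirs(sftp_client, os.path.dirname(remote_directory))
            sftp_client.mkdir(remote_directory)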

[GitHub] vardancse commented on a change in pull request #3994: [AIRFLOW-3136] Add retry_number to TaskInstance Key property to avoid race condition

2018-12-03 Thread GitBox
vardancse commented on a change in pull request #3994: [AIRFLOW-3136] Add 
retry_number to TaskInstance Key property to avoid race condition
URL: https://github.com/apache/incubator-airflow/pull/3994#discussion_r238249284
 
 

 ##
 File path: airflow/contrib/executors/kubernetes_executor.py
 ##
 @@ -453,7 +453,8 @@ def _labels_to_key(self, labels):
 try:
 return (
 labels['dag_id'], labels['task_id'],
-
self._label_safe_datestring_to_datetime(labels['execution_date']))
+
self._label_safe_datestring_to_datetime(labels['execution_date']),
+labels['try_number'])
 
 Review comment:
@aliaksandr-d https://github.com/apache/incubator-airflow/pull/4163 was already 
created for the same issue by @wyndhblb 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] vardancse commented on issue #4163: [AIRFLOW-3319] - KubernetsExecutor: Need in try_number in labels if getting them later

2018-12-03 Thread GitBox
vardancse commented on issue #4163: [AIRFLOW-3319] - KubernetsExecutor: Need in 
try_number in  labels if getting them later
URL: 
https://github.com/apache/incubator-airflow/pull/4163#issuecomment-443692742
 
 
@ashb @wyndhblb The change looks good to me. I was also thinking (nothing 
concerning, though): do we really need the error handling done at 
https://github.com/apache/incubator-airflow/pull/4163/files#diff-b1d8d65aeaa7d031dfe5b197d6c5aa69R464
since we've already placed the try_number label in worker_configuration.py? 
   
I also see another PR raised for a similar issue: 
https://github.com/apache/incubator-airflow/pull/4268
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on issue #4271: Adding better json return

2018-12-03 Thread GitBox
ashb commented on issue #4271: Adding better json return
URL: 
https://github.com/apache/incubator-airflow/pull/4271#issuecomment-443679593
 
 
PRs should be against the master branch, not the release branch, please.
   
Additionally, you need tests - I think you end up with `"start_date": True` 
in your output.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-3437) Formatted json should be returned when dag_run is triggered with experimental api

2018-12-03 Thread Saumya Saxena Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saumya Saxena Gupta updated AIRFLOW-3437:
-
Description: 
*Scenario* -> A developer wants to trigger a DAG_RUN through the API.

*Issue* -> The current REST API "/api/experimental/dags/dag_runs" returns a 
message like the one below, which makes it difficult for the developer to figure 
out the execution date/run_id; extraction logic for them has to be written.
{quote}
 { *"message": "Created "}*


{quote}
*Improvement Suggestion* -> The REST API "/api/experimental/dags/dag_runs" 
should return JSON representing the dag_run object, something like below:

 
{quote}{
 "dag_id": "example_bash_operator",
 "dag_run_url": 
"/admin/airflow/graph?execution_date=2018-12-03+11%3A11%3A18%2B00%3A00_id=example_bash_operator",
 "execution_date": "2018-12-03T11:11:18+00:00",
 "id": 142,
 "run_id": "manual__2018-12-03T11:11:18+00:00",
 "start_date": "2018-12-03T11:11:18.267197+00:00",
 "state": "running"
} 
{quote}
With the JSON returned as shown above, picking dag_run details becomes easy.

  was:
*Scenario* -> Developer wants to trigger DAG_RUN through API

*Issue* -> Current rest API "/api/experimental/dags/dag_runs"returns 
message like below, which makes difficult for developer to figure out execution 
date/run_id and execution date/run_id extract logic has to be written  

{{*"{*}}
{{ *"message": "Created "*}}
{{*}"*}}

*Improvement Suggestion* -> rest API "/api/experimental/dags/dag_runs" 
should return json representing dag_run object , something like below

 

{{*{*}}
{{ *"dag_id": "example_bash_operator",*}}
{{ *"dag_run_url": 
"/admin/airflow/graph?execution_date=2018-12-03+11%3A11%3A18%2B00%3A00_id=example_bash_operator",*}}
{{ *"execution_date": "2018-12-03T11:11:18+00:00",*}}
{{ *"id": 142,*}}
{{ *"run_id": "manual__2018-12-03T11:11:18+00:00",*}}
{{ *"start_date": "2018-12-03T11:11:18.267197+00:00",*}}
{{ *"state": "running"*}}
{{*}*}}

 

With the Json returned as shown above , picking dag_run details becomes easy.


> Formatted json should be returned when dag_run is triggered with experimental 
> api
> -
>
> Key: AIRFLOW-3437
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3437
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: api
>Affects Versions: 1.10.0, 2.0.0
>Reporter: Saumya Saxena Gupta
>Assignee: Saumya Saxena Gupta
>Priority: Major
>
> *Scenario* -> A developer wants to trigger a DAG_RUN through the API.
> *Issue* -> The current REST API "/api/experimental/dags/dag_runs" returns a 
> message like the one below, which makes it difficult for the developer to 
> figure out the execution date/run_id; extraction logic for them has to be 
> written.
> {quote}
>  { *"message": "Created  11:08:17+00:00: manual__2018-12-03T11:08:17+00:00, externally triggered: 
> True>"}*
> {quote}
> *Improvement Suggestion* -> The REST API 
> "/api/experimental/dags/dag_runs" should return JSON representing the 
> dag_run object, something like below:
>  
> {quote}{
>  "dag_id": "example_bash_operator",
>  "dag_run_url": 
> "/admin/airflow/graph?execution_date=2018-12-03+11%3A11%3A18%2B00%3A00_id=example_bash_operator",
>  "execution_date": "2018-12-03T11:11:18+00:00",
>  "id": 142,
>  "run_id": "manual__2018-12-03T11:11:18+00:00",
>  "start_date": "2018-12-03T11:11:18.267197+00:00",
>  "state": "running"
> } 
> {quote}
> With the JSON returned as shown above, picking dag_run details becomes easy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
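
A minimal sketch of the suggested response shape - assuming a Flask view in the
experimental API and a freshly created `dag_run` ORM object; the helper name is
illustrative, not the merged implementation:

    from flask import jsonify

    def _dag_run_payload(dag_run):
        # Serialize the created DagRun instead of a bare "Created ..." string
        return jsonify(
            dag_id=dag_run.dag_id,
            execution_date=dag_run.execution_date.isoformat(),
            id=dag_run.id,
            run_id=dag_run.run_id,
            start_date=dag_run.start_date.isoformat(),
            state=dag_run.state,
        )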


[GitHub] SaumyaRackspace opened a new pull request #4271: Adding better json return

2018-12-03 Thread GitBox
SaumyaRackspace opened a new pull request #4271: Adding better json return
URL: https://github.com/apache/incubator-airflow/pull/4271
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW-3437) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
Scenario -> A developer wants to trigger a DAG_RUN through the API.
   
Issue -> The current REST API "/api/experimental/dags/dag_runs" returns a 
message like the one below, which makes it difficult for the developer to figure 
out the execution date/run_id; extraction logic for them has to be written.
   
{ "message": "Created " 
}
Improvement Suggestion -> The REST API "/api/experimental/dags/dag_runs" 
should return JSON representing the dag_run object, something like below:
   

   
{ "dag_id": "example_bash_operator", "dag_run_url": 
"/admin/airflow/graph?execution_date=2018-12-03+11%3A11%3A18%2B00%3A00_id=example_bash_operator",
 "execution_date": "2018-12-03T11:11:18+00:00", "id": 142, "run_id": 
"manual__2018-12-03T11:11:18+00:00", "start_date": 
"2018-12-03T11:11:18.267197+00:00", "state": "running" }
   
With the JSON returned as shown above, picking dag_run details becomes easy.
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-3437) Formatted json should be returned when dag_run is triggered with experimental api

2018-12-03 Thread Saumya Saxena Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saumya Saxena Gupta updated AIRFLOW-3437:
-
Description: 
*Scenario* -> A developer wants to trigger a DAG_RUN through the API.

*Issue* -> The current REST API "/api/experimental/dags/dag_runs" returns a 
message like the one below, which makes it difficult for the developer to figure 
out the execution date/run_id; extraction logic for them has to be written.

{
 "message": "Created "
}

*Improvement Suggestion* -> The REST API "/api/experimental/dags/dag_runs" 
should return JSON representing the dag_run object, something like below:

 {
 "dag_id": "example_bash_operator",
 "dag_run_url": 
"/admin/airflow/graph?execution_date=2018-12-03+11%3A11%3A18%2B00%3A00_id=example_bash_operator",
 "execution_date": "2018-12-03T11:11:18+00:00",
 "id": 142,
 "run_id": "manual__2018-12-03T11:11:18+00:00",
 "start_date": "2018-12-03T11:11:18.267197+00:00",
 "state": "running"
}

With the JSON returned as shown above, picking dag_run details becomes easy.

  was:
*Scenario* -> Developer wants to trigger DAG_RUN through API

*Issue* -> Current rest API "/api/experimental/dags/dag_runs"returns 
message like below, which makes difficult for developer to figure out execution 
date/run_id and execution date/run_id extract logic has to be written  
{quote}
 { *"message": "Created "}*


{quote}
*Improvement Suggestion* -> rest API "/api/experimental/dags/dag_runs" 
should return json representing dag_run object , something like below

 
{quote}{
 "dag_id": "example_bash_operator",
 "dag_run_url": 
"/admin/airflow/graph?execution_date=2018-12-03+11%3A11%3A18%2B00%3A00_id=example_bash_operator",
 "execution_date": "2018-12-03T11:11:18+00:00",
 "id": 142,
 "run_id": "manual__2018-12-03T11:11:18+00:00",
 "start_date": "2018-12-03T11:11:18.267197+00:00",
 "state": "running"
} 
{quote}
With the Json returned as shown above , picking dag_run details becomes easy.


> Formatted json should be returned when dag_run is triggered with experimental 
> api
> -
>
> Key: AIRFLOW-3437
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3437
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: api
>Affects Versions: 1.10.0, 2.0.0
>Reporter: Saumya Saxena Gupta
>Assignee: Saumya Saxena Gupta
>Priority: Major
>
> *Scenario* -> A developer wants to trigger a DAG_RUN through the API.
> *Issue* -> The current REST API "/api/experimental/dags/dag_runs" returns a 
> message like the one below, which makes it difficult for the developer to 
> figure out the execution date/run_id; extraction logic for them has to be 
> written.
> {
>  "message": "Created  11:16:36+00:00: manual__2018-12-03T11:16:36+00:00, externally triggered: 
> True>"
> }
> *Improvement Suggestion* -> The REST API 
> "/api/experimental/dags/dag_runs" should return JSON representing the 
> dag_run object, something like below:
>  {
>  "dag_id": "example_bash_operator",
>  "dag_run_url": 
> "/admin/airflow/graph?execution_date=2018-12-03+11%3A11%3A18%2B00%3A00_id=example_bash_operator",
>  "execution_date": "2018-12-03T11:11:18+00:00",
>  "id": 142,
>  "run_id": "manual__2018-12-03T11:11:18+00:00",
>  "start_date": "2018-12-03T11:11:18.267197+00:00",
>  "state": "running"
> }
> With the JSON returned as shown above, picking dag_run details becomes easy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-3437) Formatted json should be returned when dag_run is triggered with experimental api

2018-12-03 Thread Saumya Saxena Gupta (JIRA)
Saumya Saxena Gupta created AIRFLOW-3437:


 Summary: Formatted json should be returned when dag_run is 
triggered with experimental api
 Key: AIRFLOW-3437
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3437
 Project: Apache Airflow
  Issue Type: Improvement
  Components: api
Affects Versions: 1.10.0, 2.0.0
Reporter: Saumya Saxena Gupta
Assignee: Saumya Saxena Gupta


*Scenario* -> A developer wants to trigger a DAG_RUN through the API.

*Issue* -> The current REST API "/api/experimental/dags/dag_runs" returns a 
message like the one below, which makes it difficult for the developer to figure 
out the execution date/run_id; extraction logic for them has to be written.

{
 "message": "Created "
}

*Improvement Suggestion* -> The REST API "/api/experimental/dags/dag_runs" 
should return JSON representing the dag_run object, something like below:

{
 "dag_id": "example_bash_operator",
 "dag_run_url": 
"/admin/airflow/graph?execution_date=2018-12-03+11%3A11%3A18%2B00%3A00_id=example_bash_operator",
 "execution_date": "2018-12-03T11:11:18+00:00",
 "id": 142,
 "run_id": "manual__2018-12-03T11:11:18+00:00",
 "start_date": "2018-12-03T11:11:18.267197+00:00",
 "state": "running"
}

With the JSON returned as shown above, picking dag_run details becomes easy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3436) Not able to start airflow webserver

2018-12-03 Thread Balajee SV (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Balajee SV updated AIRFLOW-3436:

Component/s: webserver

> Not able to start airflow webserver
> ---
>
> Key: AIRFLOW-3436
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3436
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: webserver
>Reporter: Balajee SV
>Priority: Major
>
> [2018-12-03 15:59:12,370] \{__init__.py:51} INFO - Using executor 
> SequentialExecutor
> [Airflow ASCII-art startup banner]
> [2018-12-03 15:59:13,509] \{models.py:271} INFO - Filling up the DagBag from 
> /Users/balajee/airflow/dags
> Running the Gunicorn Server with:
> Workers: 4 sync
> Host: 0.0.0.0:8080
> Timeout: 120
> Logfiles: - -
> =
> Error: No module named 'airflow.www'
> [2018-12-03 16:01:13,948] \{cli.py:754} ERROR - No response from gunicorn 
> master within 120 seconds
> [2018-12-03 16:01:13,948] \{cli.py:755} ERROR - Shutting down webserver



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-3436) Not able to start airflow webserver

2018-12-03 Thread Balajee SV (JIRA)
Balajee SV created AIRFLOW-3436:
---

 Summary: Not able to start airflow webserver
 Key: AIRFLOW-3436
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3436
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Balajee SV


[2018-12-03 15:59:12,370] \{__init__.py:51} INFO - Using executor 
SequentialExecutor
[Airflow ASCII-art startup banner]

[2018-12-03 15:59:13,509] \{models.py:271} INFO - Filling up the DagBag from 
/Users/balajee/airflow/dags
Running the Gunicorn Server with:
Workers: 4 sync
Host: 0.0.0.0:8080
Timeout: 120
Logfiles: - -
=

Error: No module named 'airflow.www'
[2018-12-03 16:01:13,948] \{cli.py:754} ERROR - No response from gunicorn 
master within 120 seconds
[2018-12-03 16:01:13,948] \{cli.py:755} ERROR - Shutting down webserver



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] verdan commented on issue #4163: [AIRFLOW-3319] - KubernetsExecutor: Need in try_number in labels if getting them later

2018-12-03 Thread GitBox
verdan commented on issue #4163: [AIRFLOW-3319] - KubernetsExecutor: Need in 
try_number in  labels if getting them later
URL: 
https://github.com/apache/incubator-airflow/pull/4163#issuecomment-443659438
 
 
   @vardancse ^


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Assigned] (AIRFLOW-3417) Use the platformVersion only for the FARGATE launch type

2018-12-03 Thread Alexander Kovalenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Kovalenko reassigned AIRFLOW-3417:


Assignee: (was: Alexander Kovalenko)

> Use the platformVersion only for the FARGATE launch type
> 
>
> Key: AIRFLOW-3417
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3417
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: aws
>Affects Versions: 2.0.0
>Reporter: Alexander Kovalenko
>Priority: Major
>
> By default an ECS container should be run with the EC2 launch type.
> The current implementation passes the {{platformVersion}} parameter all the 
> time, and we get an exception:
> {code:java}
> botocore.errorfactory.InvalidParameterException: An error occurred 
> (InvalidParameterException) when calling the RunTask operation: The platform 
> version must be null when specifying an EC2 launch type.{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
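
A minimal sketch of the suggested guard, assuming a boto3 ECS client; the
cluster and task-definition names are illustrative:

    import boto3

    ecs = boto3.client('ecs')
    launch_type = 'EC2'  # or 'FARGATE'

    run_kwargs = {
        'cluster': 'default',
        'taskDefinition': 'my-task',
        'launchType': launch_type,
    }
    if launch_type == 'FARGATE':
        # platformVersion must stay unset for the EC2 launch type, otherwise
        # RunTask raises the InvalidParameterException quoted above
        run_kwargs['platformVersion'] = 'LATEST'
    response = ecs.run_task(**run_kwargs)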


[jira] [Resolved] (AIRFLOW-3431) Document how to report security vulnerabilities and issues safely

2018-12-03 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3431.

   Resolution: Fixed
Fix Version/s: 2.0.0

> Document how to report security vulnerabilities and issues safely
> -
>
> Key: AIRFLOW-3431
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3431
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Ash Berlin-Taylor
>Assignee: Ash Berlin-Taylor
>Priority: Major
> Fix For: 2.0.0
>
>
> Add to our docs how people can report security vulnerabilities in Airflow 
> safely and responsibly. Point QU30 in the maturity docs:
> {quote}The project provides a well-documented channel to report security 
> issues, along with a documented way of responding to them.{quote}
> We will follow the Apache way and use 
> [secur...@apache.org|mailto:secur...@apache.org] for now, but we need to say 
> this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3431) Document how to report security vulnerabilities and issues safely

2018-12-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706919#comment-16706919
 ] 

ASF GitHub Bot commented on AIRFLOW-3431:
-

ashb closed pull request #4262: [AIRFLOW-3431] Document how to report security 
vulnerabilities.
URL: https://github.com/apache/incubator-airflow/pull/4262
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/docs/security.rst b/docs/security.rst
index 1adefcd834..de62deb1e6 100644
--- a/docs/security.rst
+++ b/docs/security.rst
@@ -34,6 +34,27 @@ Be sure to checkout :doc:`api` for securing the API.
environment variables) as ``%%``, otherwise Airflow might leak these
passwords on a config parser exception to a log.
 
+Reporting Vulnerabilities
+-
+
+The Apache Software Foundation takes security issues very seriously. Apache
+Airflow specifically offers security features and is responsive to issues
+around its features. If you have any concern around Airflow Security or believe
+you have uncovered a vulnerability, we suggest that you get in touch via the
+e-mail address secur...@apache.org. In the message, try to provide a
+description of the issue and ideally a way of reproducing it. The security team
+will get back to you after assessing the description.
+
+Note that this security address should be used only for undisclosed
+vulnerabilities. Dealing with fixed issues or general questions on how to use
+the security features should be handled regularly via the user and the dev
+lists. Please report any security problems to the project security address
+before disclosing it publicly.
+
+The `ASF Security team's page <https://www.apache.org/security/>`_ describes
+how vulnerability reports are handled, and includes PGP keys if you wish to use
+that.
+
 Web Authentication
 --
 


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Document how to report security vulnerabilities and issues safely
> -
>
> Key: AIRFLOW-3431
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3431
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Ash Berlin-Taylor
>Assignee: Ash Berlin-Taylor
>Priority: Major
>
> Add to our docs how people can report security vulnerabilities in Airflow 
> safely and responsibly. Point QU30 in the maturity docs:
> {quote}The project provides a well-documented channel to report security 
> issues, along with a documented way of responding to them.{quote}
> We will follow the Apache way and use 
> [secur...@apache.org|mailto:secur...@apache.org] for now, but we need to say 
> this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] ashb closed pull request #4262: [AIRFLOW-3431] Document how to report security vulnerabilities.

2018-12-03 Thread GitBox
ashb closed pull request #4262: [AIRFLOW-3431] Document how to report security 
vulnerabilities.
URL: https://github.com/apache/incubator-airflow/pull/4262
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/docs/security.rst b/docs/security.rst
index 1adefcd834..de62deb1e6 100644
--- a/docs/security.rst
+++ b/docs/security.rst
@@ -34,6 +34,27 @@ Be sure to checkout :doc:`api` for securing the API.
environment variables) as ``%%``, otherwise Airflow might leak these
passwords on a config parser exception to a log.
 
+Reporting Vulnerabilities
+-
+
+The Apache Software Foundation takes security issues very seriously. Apache
+Airflow specifically offers security features and is responsive to issues
+around its features. If you have any concern around Airflow Security or believe
+you have uncovered a vulnerability, we suggest that you get in touch via the
+e-mail address secur...@apache.org. In the message, try to provide a
+description of the issue and ideally a way of reproducing it. The security team
+will get back to you after assessing the description.
+
+Note that this security address should be used only for undisclosed
+vulnerabilities. Dealing with fixed issues or general questions on how to use
+the security features should be handled regularly via the user and the dev
+lists. Please report any security problems to the project security address
+before disclosing it publicly.
+
+The `ASF Security team's page <https://www.apache.org/security/>`_ describes
+how vulnerability reports are handled, and includes PGP keys if you wish to use
+that.
+
 Web Authentication
 --
 


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3367) Test celery with redis broker

2018-12-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706869#comment-16706869
 ] 

ASF GitHub Bot commented on AIRFLOW-3367:
-

ashb closed pull request #4207: [AIRFLOW-3367] Run celery integration test with 
redis broker.
URL: https://github.com/apache/incubator-airflow/pull/4207
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/tests/executors/test_celery_executor.py 
b/tests/executors/test_celery_executor.py
index 954e17ca03..e85979dace 100644
--- a/tests/executors/test_celery_executor.py
+++ b/tests/executors/test_celery_executor.py
@@ -16,20 +16,22 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+import os
 import sys
 import unittest
+import contextlib
 from multiprocessing import Pool
 
 import mock
-from celery.contrib.testing.worker import start_worker
 
-from airflow.executors import celery_executor
-from airflow.executors.celery_executor import CELERY_FETCH_ERR_MSG_HEADER
-from airflow.executors.celery_executor import (CeleryExecutor, 
celery_configuration,
-   send_task_to_executor, 
execute_command)
-from airflow.executors.celery_executor import app
+from celery import Celery
 from celery import states as celery_states
+from celery.contrib.testing.worker import start_worker
+from kombu.asynchronous import set_event_loop
+from parameterized import parameterized
+
 from airflow.utils.state import State
+from airflow.executors import celery_executor
 
 from airflow import configuration
 configuration.load_test_config()
@@ -38,48 +40,80 @@
 import celery.contrib.testing.tasks  # noqa: F401
 
 
+def _prepare_test_bodies():
+if 'CELERY_BROKER_URLS' in os.environ:
+return [
+(url, )
+for url in os.environ['CELERY_BROKER_URLS'].split(',')
+]
+return [(configuration.conf.get('celery', 'BROKER_URL'))]
+
+
 class CeleryExecutorTest(unittest.TestCase):
+
+@contextlib.contextmanager
+def _prepare_app(self, broker_url=None, execute=None):
+broker_url = broker_url or configuration.conf.get('celery', 
'BROKER_URL')
+execute = execute or celery_executor.execute_command.__wrapped__
+
+test_config = dict(celery_executor.celery_configuration)
+test_config.update({'broker_url': broker_url})
+test_app = Celery(broker_url, config_source=test_config)
+test_execute = test_app.task(execute)
+patch_app = mock.patch('airflow.executors.celery_executor.app', 
test_app)
+patch_execute = 
mock.patch('airflow.executors.celery_executor.execute_command', test_execute)
+
+with patch_app, patch_execute:
+try:
+yield test_app
+finally:
+# Clear event loop to tear down each celery instance
+set_event_loop(None)
+
+@parameterized.expand(_prepare_test_bodies())
 @unittest.skipIf('sqlite' in configuration.conf.get('core', 
'sql_alchemy_conn'),
  "sqlite is configured with SequentialExecutor")
-def test_celery_integration(self):
-executor = CeleryExecutor()
-executor.start()
-with start_worker(app=app, logfile=sys.stdout, loglevel='debug'):
-success_command = ['true', 'some_parameter']
-fail_command = ['false', 'some_parameter']
-
-cached_celery_backend = execute_command.backend
-task_tuples_to_send = [('success', 'fake_simple_ti', 
success_command,
-celery_configuration['task_default_queue'],
-execute_command),
-   ('fail', 'fake_simple_ti', fail_command,
-celery_configuration['task_default_queue'],
-execute_command)]
-
-chunksize = 
executor._num_tasks_per_send_process(len(task_tuples_to_send))
-num_processes = min(len(task_tuples_to_send), 
executor._sync_parallelism)
-
-send_pool = Pool(processes=num_processes)
-key_and_async_results = send_pool.map(
-send_task_to_executor,
-task_tuples_to_send,
-chunksize=chunksize)
-
-send_pool.close()
-send_pool.join()
-
-for key, command, result in key_and_async_results:
-# Only pops when enqueued successfully, otherwise keep it
-# and expect scheduler loop to deal with it.
-result.backend = cached_celery_backend
-executor.running[key] = 

[jira] [Resolved] (AIRFLOW-3367) Test celery with redis broker

2018-12-03 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3367.

   Resolution: Fixed
Fix Version/s: 2.0.0

> Test celery with redis broker
> -
>
> Key: AIRFLOW-3367
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3367
> Project: Apache Airflow
>  Issue Type: Test
>Reporter: Josh Carp
>Priority: Trivial
> Fix For: 2.0.0
>
>
> Current integration tests use celery with the rabbitmq broker, but not the 
> redis broker. We should test with both brokers to avoid regressions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] ashb closed pull request #4207: [AIRFLOW-3367] Run celery integration test with redis broker.

2018-12-03 Thread GitBox
ashb closed pull request #4207: [AIRFLOW-3367] Run celery integration test with 
redis broker.
URL: https://github.com/apache/incubator-airflow/pull/4207
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/tests/executors/test_celery_executor.py 
b/tests/executors/test_celery_executor.py
index 954e17ca03..e85979dace 100644
--- a/tests/executors/test_celery_executor.py
+++ b/tests/executors/test_celery_executor.py
@@ -16,20 +16,22 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+import os
 import sys
 import unittest
+import contextlib
 from multiprocessing import Pool
 
 import mock
-from celery.contrib.testing.worker import start_worker
 
-from airflow.executors import celery_executor
-from airflow.executors.celery_executor import CELERY_FETCH_ERR_MSG_HEADER
-from airflow.executors.celery_executor import (CeleryExecutor, 
celery_configuration,
-   send_task_to_executor, 
execute_command)
-from airflow.executors.celery_executor import app
+from celery import Celery
 from celery import states as celery_states
+from celery.contrib.testing.worker import start_worker
+from kombu.asynchronous import set_event_loop
+from parameterized import parameterized
+
 from airflow.utils.state import State
+from airflow.executors import celery_executor
 
 from airflow import configuration
 configuration.load_test_config()
@@ -38,48 +40,80 @@
 import celery.contrib.testing.tasks  # noqa: F401
 
 
+def _prepare_test_bodies():
+if 'CELERY_BROKER_URLS' in os.environ:
+return [
+(url, )
+for url in os.environ['CELERY_BROKER_URLS'].split(',')
+]
+return [(configuration.conf.get('celery', 'BROKER_URL'))]
+
+
 class CeleryExecutorTest(unittest.TestCase):
+
+@contextlib.contextmanager
+def _prepare_app(self, broker_url=None, execute=None):
+broker_url = broker_url or configuration.conf.get('celery', 
'BROKER_URL')
+execute = execute or celery_executor.execute_command.__wrapped__
+
+test_config = dict(celery_executor.celery_configuration)
+test_config.update({'broker_url': broker_url})
+test_app = Celery(broker_url, config_source=test_config)
+test_execute = test_app.task(execute)
+patch_app = mock.patch('airflow.executors.celery_executor.app', 
test_app)
+patch_execute = 
mock.patch('airflow.executors.celery_executor.execute_command', test_execute)
+
+with patch_app, patch_execute:
+try:
+yield test_app
+finally:
+# Clear event loop to tear down each celery instance
+set_event_loop(None)
+
+@parameterized.expand(_prepare_test_bodies())
 @unittest.skipIf('sqlite' in configuration.conf.get('core', 
'sql_alchemy_conn'),
  "sqlite is configured with SequentialExecutor")
-def test_celery_integration(self):
-executor = CeleryExecutor()
-executor.start()
-with start_worker(app=app, logfile=sys.stdout, loglevel='debug'):
-success_command = ['true', 'some_parameter']
-fail_command = ['false', 'some_parameter']
-
-cached_celery_backend = execute_command.backend
-task_tuples_to_send = [('success', 'fake_simple_ti', 
success_command,
-celery_configuration['task_default_queue'],
-execute_command),
-   ('fail', 'fake_simple_ti', fail_command,
-celery_configuration['task_default_queue'],
-execute_command)]
-
-chunksize = 
executor._num_tasks_per_send_process(len(task_tuples_to_send))
-num_processes = min(len(task_tuples_to_send), 
executor._sync_parallelism)
-
-send_pool = Pool(processes=num_processes)
-key_and_async_results = send_pool.map(
-send_task_to_executor,
-task_tuples_to_send,
-chunksize=chunksize)
-
-send_pool.close()
-send_pool.join()
-
-for key, command, result in key_and_async_results:
-# Only pops when enqueued successfully, otherwise keep it
-# and expect scheduler loop to deal with it.
-result.backend = cached_celery_backend
-executor.running[key] = command
-executor.tasks[key] = result
-executor.last_state[key] = celery_states.PENDING
-
-executor.running['success'] = True
-executor.running['fail'] = True
-
-
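
The archived diff truncates here. The broker parameterization it introduces can
be exercised as below - a minimal sketch, with the URLs as illustrative values:

    import os

    # Each comma-separated URL becomes one parameter tuple, so the celery
    # integration test runs once per broker (e.g. redis and rabbitmq):
    os.environ['CELERY_BROKER_URLS'] = (
        'redis://localhost:6379/0,'
        'amqp://guest:guest@localhost:5672//'
    )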

[GitHub] ashb edited a comment on issue #4262: [AIRFLOW-3431] Document how to report security vulnerabilities.

2018-12-03 Thread GitBox
ashb edited a comment on issue #4262: [AIRFLOW-3431] Document how to report 
security vulnerabilities.
URL: 
https://github.com/apache/incubator-airflow/pull/4262#issuecomment-443538262
 
 
Oh, if I didn't know Airflow and found a vulnerability, I'd look in the docs, 
not in the contribution docs - and this is where other Apache projects have this 
(Kafka, Zookeeper)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-3435) Volume mount issue with kubernetes pod operator

2018-12-03 Thread Sai Varun Reddy Daram (JIRA)
Sai Varun Reddy Daram created AIRFLOW-3435:
--

 Summary: Volume mount issue with kubernetes pod operator
 Key: AIRFLOW-3435
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3435
 Project: Apache Airflow
  Issue Type: Bug
  Components: kubernetes, operators
Affects Versions: 1.10.1
 Environment: Airflow 1.10.1 on Docker, Kubernetes running on minikube 
v0.28.2, kubernetes client version: 1.12.3, kubernetes server version: 1.10.0, 
python version of airflow 3.6
Reporter: Sai Varun Reddy Daram


I have followed the standard example described here 
[https://airflow.apache.org/kubernetes.html#kubernetes-operator,] and I'm 
getting this error.

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/airflow/contrib/kubernetes/pod_launcher.py", line 55, in run_pod_async
    resp = self._client.create_namespaced_pod(body=req, namespace=pod.namespace)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/client/apis/core_v1_api.py", line 6115, in create_namespaced_pod
    (data) = self.create_namespaced_pod_with_http_info(namespace, body, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/client/apis/core_v1_api.py", line 6206, in create_namespaced_pod_with_http_info
    collection_formats=collection_formats)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py", line 321, in call_api
    _return_http_data_only, collection_formats, _preload_content, _request_timeout)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py", line 155, in __call_api
    _request_timeout=_request_timeout)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py", line 364, in request
    body=body)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/client/rest.py", line 266, in POST
    body=body)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/client/rest.py", line 222, in request
    raise ApiException(http_resp=r)
kubernetes.client.rest.ApiException: (422)
Reason: Unprocessable Entity
HTTP response headers: HTTPHeaderDict({'Content-Type': 'application/json', 'Date': 'Mon, 03 Dec 2018 07:16:10 GMT', 'Content-Length': '393'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Pod \"test-4f6c8ada\" is invalid: spec.containers[0].volumeMounts[1].name: Not found: \"test-volume\"","reason":"Invalid","details":{"name":"test-4f6c8ada","kind":"Pod","causes":[{"reason":"FieldValueNotFound","message":"Not found: \"test-volume\"","field":"spec.containers[0].volumeMounts[1].name"}]},"code":422}

 

 

The code is:
{code:java}
from airflow.contrib.kubernetes.volume import Volume
from airflow.contrib.kubernetes.volume_mount import VolumeMount
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator
from airflow.contrib.kubernetes.secret import Secret
from airflow import DAG
from datetime import datetime, timedelta

current_date = datetime.utcnow()
default_args = {
    'owner': 'root',
    'depends_on_past': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=1),
}

secret_file = Secret('volume', '/etc/sql_conn', 'airflow-secrets', 'sql_alchemy_conn')
secret_env = Secret('env', 'SQL_CONN', 'airflow-secrets', 'sql_alchemy_conn')
volume_mount = VolumeMount('test-volume',
                           mount_path='/root/mount_file',
                           sub_path=None,
                           read_only=True)

volume_config = {
    'persistentVolumeClaim': {
        'claimName': 'test-volume'
    }
}
volume = Volume(name='test-volume', configs=volume_config)
with DAG(
        dag_id='MMM_DAG', default_args=default_args,
        start_date=current_date,
        concurrency=1,
        schedule_interval=None) as d:
    k = KubernetesPodOperator(namespace='default',
                              image="ubuntu:16.04",
                              cmds=["bash", "-cx"],
                              arguments=["echo", "10"],
                              in_cluster=True,
                              labels={"foo": "bar"},
                              secrets=[secret_file, secret_env],
                              volume=[volume],
                              volume_mounts=[volume_mount],
                              name="test",
                              task_id="task",
                              is_delete_operator_pod=True,
                              hostnetwork=False
                              )
{code}
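
For readers hitting the same error: the response body says the pod spec
references a volumeMount whose volume was never attached. One plausible cause
worth checking in the snippet above is the keyword `volume=[volume]` - the
operator's parameter appears to be `volumes` (plural) in the 1.10.x contrib
API, so the singular spelling would be ignored by the operator. A corrected
call under that assumption:

    k = KubernetesPodOperator(namespace='default',
                              image="ubuntu:16.04",
                              cmds=["bash", "-cx"],
                              arguments=["echo", "10"],
                              in_cluster=True,
                              labels={"foo": "bar"},
                              secrets=[secret_file, secret_env],
                              volumes=[volume],  # plural: actually attaches it
                              volume_mounts=[volume_mount],
                              name="test",
                              task_id="task",
                              is_delete_operator_pod=True,
                              hostnetwork=False)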



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)