[GitHub] XD-DENG commented on issue #4130: [AIRFLOW-3193] Pin docker requirement version

2018-11-06 Thread GitBox
XD-DENG commented on issue #4130: [AIRFLOW-3193] Pin docker requirement version
URL: 
https://github.com/apache/incubator-airflow/pull/4130#issuecomment-436474322
 
 
   Hi @ashb, would you mind adding this commit to 1.10.1?
   
   I understand that my PR #4049 (**[AIRFLOW-3203] Fix DockerOperator & some 
operator test**) has already been cherry-picked into branch v1-10-test. The 
change made in that PR is partially meant to address a breaking change in the 
Python package `docker==3.0.0`, but I missed pinning the `docker` version in 
that commit. This may cause issues for users who have `docker < 3.0.0` when 
they upgrade to 1.10.1.
   
   Cheers.
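   
   For readers following along: a pin of this kind lives in Airflow's 
`setup.py`. A minimal sketch, assuming the fix targets the docker 3.x API 
(the exact bounds are whatever PR #4130 settles on):
   
   ```python
   # Sketch of a setup.py extras list pinning the `docker` client.
   # Assumption: the DockerOperator fix targets the docker>=3 API, so older
   # 2.x clients (and a future incompatible 4.x line) are excluded.
   docker = [
       'docker>=3.0.0,<4.0.0',
   ]
   
   # Wired into setuptools roughly as:
   # setup(..., extras_require={'docker': docker})
   ```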




[GitHub] jmcarp commented on issue #4147: [AIRFLOW-3307] Upgrade rbac node deps via `npm audit fix`.

2018-11-06 Thread GitBox
jmcarp commented on issue #4147: [AIRFLOW-3307] Upgrade rbac node deps via `npm 
audit fix`.
URL: 
https://github.com/apache/incubator-airflow/pull/4147#issuecomment-436445983
 
 
   @ashb Agreed that there's no security issue here -- I just want npm to stop 
emitting warnings when I run commands.




[jira] [Commented] (AIRFLOW-3307) Update insecure node dependencies

2018-11-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677400#comment-16677400
 ] 

ASF GitHub Bot commented on AIRFLOW-3307:
-

jmcarp opened a new pull request #4147: [AIRFLOW-3307] Upgrade rbac node deps 
via `npm audit fix`.
URL: https://github.com/apache/incubator-airflow/pull/4147
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-3307
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   Update insecure dependencies with `npm audit fix`.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   Just updating build dependencies.
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `flake8`
   




> Update insecure node dependencies
> -
>
> Key: AIRFLOW-3307
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3307
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Josh Carp
>Assignee: Josh Carp
>Priority: Trivial
>
> `npm audit` shows some node dependencies that are out of date and potentially 
> insecure. We should update them with `npm audit fix`.







[GitHub] yangaws commented on a change in pull request #4126: [AIRFLOW-2524] More AWS SageMaker operators, sensors for model, endpoint-config and endpoint

2018-11-06 Thread GitBox
yangaws commented on a change in pull request #4126: [AIRFLOW-2524] More AWS 
SageMaker operators, sensors for model, endpoint-config and endpoint
URL: https://github.com/apache/incubator-airflow/pull/4126#discussion_r231326429
 
 

 ##
 File path: airflow/contrib/operators/sagemaker_endpoint_config_operator.py
 ##
 @@ -0,0 +1,67 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.contrib.operators.sagemaker_base_operator import SageMakerBaseOperator
+from airflow.utils.decorators import apply_defaults
+from airflow.exceptions import AirflowException
+
+
+class SageMakerEndpointConfigOperator(SageMakerBaseOperator):
+
+    """
+    Create a SageMaker endpoint config.
+
+    This operator returns The ARN of the endpoint config created in Amazon SageMaker
+
+    :param config: The configuration necessary to create an endpoint config.
+
+        For details of the configuration parameter, See:
+        https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_endpoint_config
+    :type config: dict
+    :param aws_conn_id: The AWS connection ID to use.
+    :type aws_conn_id: str
+    """  # noqa
 
 Review comment:
   Updated




[GitHub] yangaws commented on a change in pull request #4126: [AIRFLOW-2524] More AWS SageMaker operators, sensors for model, endpoint-config and endpoint

2018-11-06 Thread GitBox
yangaws commented on a change in pull request #4126: [AIRFLOW-2524] More AWS 
SageMaker operators, sensors for model, endpoint-config and endpoint
URL: https://github.com/apache/incubator-airflow/pull/4126#discussion_r231326391
 
 

 ##
 File path: airflow/contrib/operators/sagemaker_endpoint_config_operator.py
 ##
 @@ -0,0 +1,67 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.contrib.operators.sagemaker_base_operator import SageMakerBaseOperator
+from airflow.utils.decorators import apply_defaults
+from airflow.exceptions import AirflowException
+
+
+class SageMakerEndpointConfigOperator(SageMakerBaseOperator):
+
+    """
+    Create a SageMaker endpoint config.
+
+    This operator returns The ARN of the endpoint config created in Amazon SageMaker
+
+    :param config: The configuration necessary to create an endpoint config.
+
+        For details of the configuration parameter, See:
+        https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_endpoint_config
+    :type config: dict
+    :param aws_conn_id: The AWS connection ID to use.
+    :type aws_conn_id: str
+    """  # noqa
 
 Review comment:
   Got it! Thanks a lot!




[GitHub] ultrabug commented on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-11-06 Thread GitBox
ultrabug commented on issue #2460: [AIRFLOW-1424] make the next execution date 
of DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-436433446
 
 
   @ashb I'm not sure what you're referring to. Are you talking about the name 
of the `next_execution_date` property in the model, or about the fact that I'm 
still checking next_run_date in the tests?
   
   If it's about the property name, I think it's in line with the CLI name, so 
I figured I'd stick with it.
   
   If it's about the tests, that's deliberate: I'd like an ACK on the 
implementation itself before adjusting all the tests at once.




[GitHub] ashb commented on issue #4143: [AIRFLOW-689] Okta Authentication

2018-11-06 Thread GitBox
ashb commented on issue #4143: [AIRFLOW-689] Okta Authentication
URL: 
https://github.com/apache/incubator-airflow/pull/4143#issuecomment-436431058
 
 
   https://github.com/apache/incubator-airflow/pull/4142 might obsolete this PR




[GitHub] codecov-io edited a comment on issue #4145: Revert "[AIRFLOW-3160] Load latest_dagruns asynchronously (#4005)"

2018-11-06 Thread GitBox
codecov-io edited a comment on issue #4145: Revert "[AIRFLOW-3160] Load 
latest_dagruns asynchronously (#4005)"
URL: 
https://github.com/apache/incubator-airflow/pull/4145#issuecomment-436413754
 
 
   # [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4145?src=pr&el=h1) Report
   > Merging [#4145](https://codecov.io/gh/apache/incubator-airflow/pull/4145?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/dc0eb58e97178b050b79584f18d8b9bd2c3dea5f?src=pr&el=desc) will **increase** coverage by `0.03%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/4145/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4145?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #4145      +/-   ##
   ==========================================
   + Coverage   77.46%   77.49%   +0.03%     
   ==========================================
     Files         199      199              
     Lines       16272    16246      -26     
   ==========================================
   - Hits        12605    12590      -15     
   + Misses       3667     3656      -11
   ```
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/4145?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [airflow/www/views.py](https://codecov.io/gh/apache/incubator-airflow/pull/4145/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvdmlld3MucHk=) | `68.8% <ø> (-0.21%)` | :arrow_down: |
   | [airflow/www\_rbac/views.py](https://codecov.io/gh/apache/incubator-airflow/pull/4145/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3dfcmJhYy92aWV3cy5weQ==) | `72.38% <ø> (+0.5%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4145?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4145?src=pr&el=footer). Last update [dc0eb58...a4fc042](https://codecov.io/gh/apache/incubator-airflow/pull/4145?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[jira] [Updated] (AIRFLOW-2866) Missing CSRF Token Error on Web RBAC UI Create/Update Operations

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-2866:
---
Affects Version/s: 2.0.0
Fix Version/s: (was: 1.10.1)
   2.0.0

> Missing CSRF Token Error on Web RBAC UI Create/Update Operations
> 
>
> Key: AIRFLOW-2866
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2866
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.0.0
>Reporter: Jasper Kahn
>Priority: Major
> Fix For: 2.0.0
>
>
> Attempting to modify or delete many resources (such as Connections or Users) 
> results in a 400 from the webserver:
> {quote}{{Bad Request}}
> {{The CSRF session token is missing.}}{quote}
> Logs report:
> {quote}{{[2018-08-07 18:45:15,771] \{csrf.py:251} INFO - The CSRF session 
> token is missing.}}
> {{192.168.9.1 - - [07/Aug/2018:18:45:15 +] "POST 
> /admin/connection/delete/ HTTP/1.1" 400 150 
> "http://localhost:8081/admin/connection/" "Mozilla/5.0 (X11; Linux x86_64) 
> AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.84 
> Safari/537.36"}}{quote}
> Chrome dev tools show the CSRF token is present in the request payload.







[jira] [Resolved] (AIRFLOW-2216) Cannot specify a profile for AWS Hook to load with s3 config file

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-2216.

Resolution: Fixed

> Cannot specify a profile for AWS Hook to load with s3 config file
> -
>
> Key: AIRFLOW-2216
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2216
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 1.9.0
> Environment: IDE: PyCharm
> Airflow 1.9
> Python 3.4.3
>Reporter: Lorena Mesa
>Assignee: Lorena Mesa
>Priority: Minor
> Fix For: 1.10.1
>
>
> Currently the source code for AWS Hook doesn't permit the user to provide a 
> profile when their aws connection object specifies in the extra param's 
> information on s3_config_file:
> {code:java}
> def _get_credentials(self, region_name):
>     aws_access_key_id = None
>     aws_secret_access_key = None
>     aws_session_token = None
>     endpoint_url = None
>     if self.aws_conn_id:
>         try:
>             # Cut for brevity
>             elif 's3_config_file' in connection_object.extra_dejson:
>                 aws_access_key_id, aws_secret_access_key = \
>                     _parse_s3_config(connection_object.extra_dejson['s3_config_file'],
>                                      connection_object.extra_dejson.get('s3_config_format'),
>                                      connection_object.extra_dejson.get('profile')){code}
> The _parse_s3_config method has a profile parameter that defaults to None, so 
> if the hook does not pass it through, you cannot specify a profile credential 
> to be loaded.
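
For illustration, a connection `extra` that exercises this code path might look 
like the sketch below (all values are examples, not taken from the issue):

```python
# Illustrative only: an aws_default connection whose `extra` asks the hook to
# load credentials from an S3-style config file using a named profile. The
# 'profile' key is what this issue wants passed through to _parse_s3_config.
from airflow.models import Connection

conn = Connection(
    conn_id='aws_default',
    conn_type='aws',
    extra='{"s3_config_file": "/home/airflow/aws-credentials", '
          '"s3_config_format": "aws", '
          '"profile": "analytics"}',
)
```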







[GitHub] ashb commented on issue #3172: [AIRFLOW-2216] Use profile for AWS hook if S3 config file provided in aws_default connection extra parameters

2018-11-06 Thread GitBox
ashb commented on issue #3172: [AIRFLOW-2216] Use profile for AWS hook if S3 
config file provided in aws_default connection extra parameters
URL: 
https://github.com/apache/incubator-airflow/pull/3172#issuecomment-436413661
 
 
   Merged in another PR now.




[GitHub] ashb closed pull request #3172: [AIRFLOW-2216] Use profile for AWS hook if S3 config file provided in aws_default connection extra parameters

2018-11-06 Thread GitBox
ashb closed pull request #3172: [AIRFLOW-2216] Use profile for AWS hook if S3 
config file provided in aws_default connection extra parameters
URL: https://github.com/apache/incubator-airflow/pull/3172
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/hooks/aws_hook.py b/airflow/contrib/hooks/aws_hook.py
index 2a8fa5f823..e4020fd35b 100644
--- a/airflow/contrib/hooks/aws_hook.py
+++ b/airflow/contrib/hooks/aws_hook.py
@@ -100,8 +100,11 @@ def _get_credentials(self, region_name):
 
         elif 's3_config_file' in connection_object.extra_dejson:
             aws_access_key_id, aws_secret_access_key = \
-                _parse_s3_config(connection_object.extra_dejson['s3_config_file'],
-                                 connection_object.extra_dejson.get('s3_config_format'))
+                _parse_s3_config(
+                    connection_object.extra_dejson['s3_config_file'],
+                    connection_object.extra_dejson['s3_config_format'],
+                    connection_object.extra_dejson['profile']
+                )
 
         if region_name is None:
             region_name = connection_object.extra_dejson.get('region_name')
diff --git a/tests/contrib/hooks/test_aws_hook.py b/tests/contrib/hooks/test_aws_hook.py
index 086e486144..dd1b69e173 100644
--- a/tests/contrib/hooks/test_aws_hook.py
+++ b/tests/contrib/hooks/test_aws_hook.py
@@ -14,6 +14,7 @@
 #
 
 import unittest
+
 import boto3
 
 from airflow import configuration
@@ -141,6 +142,26 @@ def test_get_credentials_from_extra(self, mock_get_connection):
         self.assertEqual(credentials_from_hook.secret_key, 'aws_secret_access_key')
         self.assertIsNone(credentials_from_hook.token)
 
+    @mock.patch('airflow.contrib.hooks.aws_hook._parse_s3_config',
+                return_value=('aws_access_key_id', 'aws_secret_access_key'))
+    @mock.patch.object(AwsHook, 'get_connection')
+    def test_get_credentials_from_extra_with_s3_config_and_profile(
+        self, mock_get_connection, mock_parse_s3_config
+    ):
+        mock_connection = Connection(
+            extra='{"s3_config_format": "aws", '
+                  '"profile": "test", '
+                  '"s3_config_file": "aws-credentials", '
+                  '"region_name": "us-east-1"}')
+        mock_get_connection.return_value = mock_connection
+        hook = AwsHook()
+        hook._get_credentials(region_name=None)
+        mock_parse_s3_config.assert_called_with(
+            'aws-credentials',
+            'aws',
+            'test'
+        )
+
     @unittest.skipIf(mock_sts is None, 'mock_sts package not present')
     @mock.patch.object(AwsHook, 'get_connection')
     @mock_sts

 




[GitHub] ashb commented on issue #4146: [AIRFLOW-3306] Disable flask-sqlalchemy modification tracking.

2018-11-06 Thread GitBox
ashb commented on issue #4146: [AIRFLOW-3306] Disable flask-sqlalchemy 
modification tracking.
URL: 
https://github.com/apache/incubator-airflow/pull/4146#issuecomment-436400468
 
 
   Thanks, I saw the warning but hadn't dug in to what the event system was or 
if we were using it.
   
   (Trying to fix the tests on master, so might ask you to rebase once we do 
that)




[jira] [Commented] (AIRFLOW-3307) Update insecure node dependencies

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677279#comment-16677279
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3307:


Sure, we should update them, but the security aspect doesn't concern us: these 
are dev-time-only dependencies, so they don't affect our users.

> Update insecure node dependencies
> -
>
> Key: AIRFLOW-3307
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3307
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Josh Carp
>Assignee: Josh Carp
>Priority: Trivial
>
> `npm audit` shows some node dependencies that are out of date and potentially 
> insecure. We should update them with `npm audit fix`.





[jira] [Created] (AIRFLOW-3307) Update insecure node dependencies

2018-11-06 Thread Josh Carp (JIRA)
Josh Carp created AIRFLOW-3307:
--

 Summary: Update insecure node dependencies
 Key: AIRFLOW-3307
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3307
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Josh Carp
Assignee: Josh Carp


`npm audit` shows some node dependencies that are out of date and potentially 
insecure. We should update them with `npm audit fix`.





[jira] [Resolved] (AIRFLOW-3161) Log Url link does not link to task instance logs in RBAC UI

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3161.

   Resolution: Fixed
Fix Version/s: 1.10.1
   2.0.0

> Log Url link does not link to task instance logs in RBAC UI
> ---
>
> Key: AIRFLOW-3161
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3161
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Eric Chang
>Assignee: Eric Chang
>Priority: Minor
> Fix For: 2.0.0, 1.10.1
>
> Attachments: image-2018-10-04-17-33-33-616.png, 
> image-2018-10-04-17-34-12-135.png, image-2018-10-04-17-35-14-224.png
>
>
> In the new RBAC UI, the "Log Url" link (0) for task instances doesn't link to 
> the log for the task instance (1). Instead, it links to the DAG log list 
> (2).
> (0)
> !image-2018-10-04-17-35-14-224.png|width=172,height=172!
> (1)
> !image-2018-10-04-17-34-12-135.png|width=660,height=376!
> (2)
> !image-2018-10-04-17-33-33-616.png|width=478,height=238!





[jira] [Reopened] (AIRFLOW-3161) Log Url link does not link to task instance logs in RBAC UI

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor reopened AIRFLOW-3161:


Reopening to change fix versions

> Log Url link does not link to task instance logs in RBAC UI
> ---
>
> Key: AIRFLOW-3161
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3161
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Eric Chang
>Assignee: Eric Chang
>Priority: Minor
> Fix For: 2.0.0, 1.10.1
>
> Attachments: image-2018-10-04-17-33-33-616.png, 
> image-2018-10-04-17-34-12-135.png, image-2018-10-04-17-35-14-224.png
>
>
> In the new RBAC UI, the "Log Url" link (0) for task instances doesn't link to 
> the log for the task instance (1). Instead, it links to the DAG log list 
> (2).
> (0)
> !image-2018-10-04-17-35-14-224.png|width=172,height=172!
> (1)
> !image-2018-10-04-17-34-12-135.png|width=660,height=376!
> (2)
> !image-2018-10-04-17-33-33-616.png|width=478,height=238!





[jira] [Commented] (AIRFLOW-3306) Disable unused flask-sqlalchemy modification tracking

2018-11-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677212#comment-16677212
 ] 

ASF GitHub Bot commented on AIRFLOW-3306:
-

jmcarp opened a new pull request #4146: [AIRFLOW-3306] Disable flask-sqlalchemy 
modification tracking.
URL: https://github.com/apache/incubator-airflow/pull/4146
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-3306
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   By default, flask-sqlalchemy tracks model changes for its event system, 
which adds some overhead. Since I don't think we're using the flask-sqlalchemy 
event system, we should be able to turn off modification tracking and improve 
performance.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   Just a config change; existing tests should cover it.
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `flake8`
   




> Disable unused flask-sqlalchemy modification tracking
> -
>
> Key: AIRFLOW-3306
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3306
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Josh Carp
>Assignee: Josh Carp
>Priority: Trivial
>
> By default, flask-sqlalchemy tracks model changes for its event system, which 
> adds some overhead. Since I don't think we're using the flask-sqlalchemy 
> event system, we should be able to turn off modification tracking and improve 
> performance.
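
A minimal sketch of the configuration change being described (the app setup 
here is illustrative, not Airflow's actual code; `SQLALCHEMY_TRACK_MODIFICATIONS` 
is the real flask-sqlalchemy setting):

```python
from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///airflow.db'  # illustrative
# Skip tracking model changes for the flask-sqlalchemy event system: it adds
# overhead on every flush and also triggers a deprecation warning at startup.
app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] = False
db = SQLAlchemy(app)
```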







[jira] [Created] (AIRFLOW-3306) Disable unused flask-sqlalchemy modification tracking

2018-11-06 Thread Josh Carp (JIRA)
Josh Carp created AIRFLOW-3306:
--

 Summary: Disable unused flask-sqlalchemy modification tracking
 Key: AIRFLOW-3306
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3306
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Josh Carp
Assignee: Josh Carp


By default, flask-sqlalchemy tracks model changes for its event system, which 
adds some overhead. Since I don't think we're using the flask-sqlalchemy event 
system, we should be able to turn off modification tracking and improve 
performance.





[GitHub] ashb commented on issue #4140: [AIRFLOW-3302] Small CSS fixes

2018-11-06 Thread GitBox
ashb commented on issue #4140: [AIRFLOW-3302] Small CSS fixes
URL: 
https://github.com/apache/incubator-airflow/pull/4140#issuecomment-436343506
 
 
   Probably okay, but could you include a before and after screenshot?




[GitHub] ashb commented on issue #4133: [AIRFLOW-3270] Allow passwordless-binding for LDAP auth backend

2018-11-06 Thread GitBox
ashb commented on issue #4133: [AIRFLOW-3270] Allow passwordless-binding for 
LDAP auth backend
URL: 
https://github.com/apache/incubator-airflow/pull/4133#issuecomment-436341469
 
 
   If we're deprecating the old non-RBAC API, this might not matter on master 
anymore anyway :D




[jira] [Commented] (AIRFLOW-3160) Load latest_dagruns asynchronously

2018-11-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677020#comment-16677020
 ] 

ASF GitHub Bot commented on AIRFLOW-3160:
-

ashb opened a new pull request #4145: Revert "[AIRFLOW-3160] Load 
latest_dagruns asynchronously (#4005)"
URL: https://github.com/apache/incubator-airflow/pull/4145
 
 
   This reverts commit 0287cceed8137823743497b7e11f19ef35bacd9d.
   
   Testing to see if the tests pass with this change reverted.




> Load latest_dagruns asynchronously 
> ---
>
> Key: AIRFLOW-3160
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3160
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: webserver
>Affects Versions: 1.10.0
>Reporter: Dan Davydov
>Assignee: Dan Davydov
>Priority: Major
> Fix For: 2.0.0
>
>
> The front page loads very slowly when the DB has latency because one blocking 
> query is made per DAG against the DB.
>  
> The latest dagruns should be loaded asynchronously and in batch like the 
> other UI elements that query the database.





[GitHub] ashb commented on issue #4145: Revert "[AIRFLOW-3160] Load latest_dagruns asynchronously (#4005)"

2018-11-06 Thread GitBox
ashb commented on issue #4145: Revert "[AIRFLOW-3160] Load latest_dagruns 
asynchronously (#4005)"
URL: 
https://github.com/apache/incubator-airflow/pull/4145#issuecomment-436331547
 
 
   Don't merge this before Travis has run the tests




[GitHub] ashb edited a comment on issue #4005: [AIRFLOW-3160] Load latest_dagruns asynchronously, speed up front page load time

2018-11-06 Thread GitBox
ashb edited a comment on issue #4005: [AIRFLOW-3160] Load latest_dagruns 
asynchronously, speed up front page load time
URL: 
https://github.com/apache/incubator-airflow/pull/4005#issuecomment-436330620
 
 
   Examples that (we think) started happening after this PR was merged:
   
   https://travis-ci.org/apache/incubator-airflow/jobs/451428747#L4660
   
   in `ERROR: test_success (tests.www_rbac.test_views.TestAirflowBaseViews)` 
(postgres this time)
   
   And same build on Mysql 
https://travis-ci.org/apache/incubator-airflow/jobs/451428748#L4660
   
   Going to try reverting this PR and see if it fixes things, even though the 
error doesn't make any sense.
   






[GitHub] ashb commented on issue #4005: [AIRFLOW-3160] Load latest_dagruns asynchronously, speed up front page load time

2018-11-06 Thread GitBox
ashb commented on issue #4005: [AIRFLOW-3160] Load latest_dagruns 
asynchronously, speed up front page load time
URL: 
https://github.com/apache/incubator-airflow/pull/4005#issuecomment-436330620
 
 
   Examples that (we think) started happening after this PR was merged:
   
   https://travis-ci.org/apache/incubator-airflow/jobs/451428747#L4660
   
   in `ERROR: test_success (tests.www_rbac.test_views.TestAirflowBaseViews)` 
(postgres this time)
   
   And same build on Mysql 
https://travis-ci.org/apache/incubator-airflow/jobs/451428748#L4660
   
   Going to try reverting this PR and see if it fixes things.
   




[jira] [Commented] (AIRFLOW-3285) lazy marking of upstream_failed task state

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677006#comment-16677006
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3285:


The lazy feature as you have described it isn't something we'd accept, as it's 
quite a behaviour change and a little bit of a workaround. But a combo trigger 
rule, so that we could say {{trigger_rule=\{'all_done', 'one_failed'\}}} to 
mean "trigger on any of these conditions", would be acceptable.

> lazy marking of upstream_failed task state
> --
>
> Key: AIRFLOW-3285
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3285
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Kevin McHale
>Priority: Minor
>
> Airflow aggressively applies the {{upstream_failed}} task state: as soon as a 
> task fails, all of its downstream dependencies get marked.  This sometimes 
> creates problems for us at Etsy.
> In particular, we use a pattern for our hadoop Airflow DAGs along these lines:
>  # the DAG creates a hadoop cluster in GCP/Dataproc
>  # the DAG executes its tasks on the cluster
>  # the DAG deletes the cluster once all tasks are done
> There are some cases in which the tasks immediately upstream of the 
> cluster-delete step get marked as {{upstream_failed}}, triggering the 
> cluster-delete step, even while other tasks continue to execute without 
> problems on the cluster.  The cluster-delete step of course kills all of the 
> running tasks, requiring all of them to be re-run once the problem with the 
> failed task is mitigated.
> As an example, a DAG that looks like this can exhibit the problem:
> {code:java}
> Cluster = ClusterCreateOperator(...)
> A = Job1Operator(...)
> Cluster >> A
> B = Job2Operator(...)
> Cluster >> B
> C = Job3Operator(...)
> A >> C
> B >> C
> ClusterDelete = DeleteClusterOperator(trigger_rule="all_done", ...)
> C >> ClusterDelete{code}
> In a DAG like this, suppose task A fails while task B is running.  Task C 
> will immediately be marked as {{upstream_failed}}, which will cause 
> ClusterDelete to run while task B is still running, which will cause task B 
> to also fail.
> Our solution to this problem has been to implement something like [this 
> diff|https://github.com/mchalek/incubator-airflow/commit/585349018656cd9b2e3e3e113db6412345485dde],
>  which lazily applies the {{upstream_failed}} state only to tasks for which 
> all upstream tasks have already completed.
> The consequence in terms of the example above is that task C will not be 
> marked {{upstream_failed}} in response to task A failing until task B 
> completes, ensuring that the cluster is not deleted while any upstream tasks 
> are running.
> We find this not to have any adverse behavior on our airflow instances, so we 
> run all of them with this lazy-marking feature enabled.  However, we 
> recognize that a change in behavior like this may be something that existing 
> users will want to opt-in for, so we included a config flag in the diff that 
> defaults to the original behavior.
> We would appreciate your consideration of incorporating this diff, or 
> something like it, to allow us to configure this behavior in unmodified, 
> upstream airflow.
> Thanks!
>  
>  
>  





[jira] [Commented] (AIRFLOW-3300) Frequent crash of scheduler while interacting with Airflow Metadata (Mysql)

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676990#comment-16676990
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3300:


Might be fixed by AIRFLOW-2703, but a deadlock is possibly a sign of a bigger 
issue.

> Frequent crash of scheduler while interacting with Airflow Metadata (Mysql)
> ---
>
> Key: AIRFLOW-3300
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3300
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Tanuj Gupta
>Priority: Major
>
> It has become very frequent that the scheduler tries to update the 
> task_instance table and crashes with a deadlock. The following is the stack 
> trace:
>  
> {noformat}
> sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (1213, 
> 'Deadlock found when trying to get lock; try restarting transaction') [SQL: 
> u'UPDATE task_instance, dag_run SET task_instance.state=%s WHERE 
> task_instance.dag_id IN (%s, %s, %s, %s, %s) AND task_instance.state IN (%s, 
> %s) AND dag_run.dag_id = task_instance.dag_id AND dag_run.execution_date = 
> task_instance.execution_date AND dag_run.state != %s'] [parameters: (None, 
> 'org_test0802_h9zrva', 
> '27bd514b5ab9854b0a494110_45aa7868_1799_4046_ad20_f35e3de1a4ec_p8bkfg', 
> 'org_e2e_trainman1528457521430', 'org_blockerretryissue_6kiafp', 
> 'org_e2e_compute_v21540610294106_svnfmr', u'queued', u'scheduled', 
> u'running')] (Background on this error at: 
> http://sqlalche.me/e/e3q8){noformat}
>  





[jira] [Created] (AIRFLOW-3305) KubernetesPodOperator has a race condition for log output

2018-11-06 Thread James Meickle (JIRA)
James Meickle created AIRFLOW-3305:
--

 Summary: KubernetesPodOperator has a race condition for log output
 Key: AIRFLOW-3305
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3305
 Project: Apache Airflow
  Issue Type: Bug
  Components: kubernetes
Affects Versions: 1.10.0
Reporter: James Meickle


The KubernetesPodOperator follows logs from the container in the pod that it 
launches: 
[https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/kubernetes/pod_launcher.py#L96]

This is set to "follow" mode, which streams logs. However, it is possible (but 
not guaranteed) for the pod's container to have started before the log stream 
call reaches the cluster. In this case, re-running the same task may result in 
very different-looking logs, with no notification that there was any 
truncation. This is a confusing experience for operators who are not familiar 
with Kubernetes.

My recommendation is to remove "tail_lines" which should have the effect of 
fetching all previous logs when streaming starts: 
https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/CoreV1Api.md#read_namespaced_pod_log
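
A minimal sketch of the suggested behavior using the kubernetes Python client 
(pod and namespace names are illustrative; the real call lives in 
`pod_launcher.py`):

```python
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# Follow the log stream WITHOUT tail_lines: if the container produced output
# before this call reached the cluster, that output is still included rather
# than silently truncated.
resp = v1.read_namespaced_pod_log(
    name='example-pod',        # illustrative
    namespace='default',
    container='base',
    follow=True,
    _preload_content=False,    # return a stream instead of a buffered string
)
for chunk in resp.stream():
    print(chunk.decode('utf-8'), end='')
```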





[jira] [Created] (AIRFLOW-3304) Kubernetes pod operator does not capture init container logs

2018-11-06 Thread James Meickle (JIRA)
James Meickle created AIRFLOW-3304:
--

 Summary: Kubernetes pod operator does not capture init container 
logs
 Key: AIRFLOW-3304
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3304
 Project: Apache Airflow
  Issue Type: Improvement
  Components: kubernetes
Affects Versions: 1.10.0
Reporter: James Meickle


The KubernetesPodOperator attempts to stream logs from the created pod. 
However, it only gets logs from the 'base' container. If you subclass this 
operator and modify the pod to also have init containers, their logs are not 
streamed.
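
A sketch of what capturing init-container logs could look like (names are 
illustrative; today the launcher reads only the 'base' container):

```python
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

pod = v1.read_namespaced_pod(name='example-pod', namespace='default')

# Read logs from every init container as well as the main 'base' container,
# so init-step output is not lost.
names = [c.name for c in (pod.spec.init_containers or [])] + ['base']
for container in names:
    log = v1.read_namespaced_pod_log(
        name='example-pod', namespace='default', container=container)
    print('--- logs for container %s ---' % container)
    print(log)
```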





[GitHub] ultrabug commented on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-11-06 Thread GitBox
ultrabug commented on issue #2460: [AIRFLOW-1424] make the next execution date 
of DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-436303989
 
 
   OK @ashb @XD-DENG, I've updated the PR to reuse the scheduler logic directly 
so that we are fully in line with how the scheduler does it. No magic, no 
debate about how it should be done.
   
   I've updated the code; as you can see, it's very easy and clean that way.
   Also, I tested it with the (fixed) documentation example, and it handles 
catchup correctly!
   
   Hope you like it :)




[GitHub] oelesinsc24 commented on a change in pull request #4068: [AIRFLOW-2310]: Add AWS Glue Job Compatibility to Airflow

2018-11-06 Thread GitBox
oelesinsc24 commented on a change in pull request #4068: [AIRFLOW-2310]: Add 
AWS Glue Job Compatibility to Airflow
URL: https://github.com/apache/incubator-airflow/pull/4068#discussion_r231178088
 
 

 ##
 File path: airflow/contrib/hooks/aws_glue_job_hook.py
 ##
 @@ -0,0 +1,130 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+
+from airflow.exceptions import AirflowException
+from airflow.contrib.hooks.aws_hook import AwsHook
+import time
+
+
+class AwsGlueJobHook(AwsHook):
+    """
+    Interact with AWS Glue - create job, trigger, crawler
+
+    :param job_name: unique job name per AWS account
+    :type str
+    :param desc: job description
+    :type str
+    :param region_name: aws region name (example: us-east-1)
+    :type region_name: str
 
 Review comment:
   Sure




[GitHub] mikemole commented on issue #4112: [AIRFLOW-3212] Add AwsGlueCatalogPartitionSensor

2018-11-06 Thread GitBox
mikemole commented on issue #4112: [AIRFLOW-3212] Add 
AwsGlueCatalogPartitionSensor
URL: 
https://github.com/apache/incubator-airflow/pull/4112#issuecomment-436302237
 
 
   @ashb I incorporated your feedback, rebased, and squashed.  Please let me 
know if there is anything else you need.




[GitHub] janhicken commented on issue #4139: [AIRFLOW-2715] Pick up the region setting while launching Dataflow templates

2018-11-06 Thread GitBox
janhicken commented on issue #4139: [AIRFLOW-2715] Pick up the region setting 
while launching Dataflow templates
URL: 
https://github.com/apache/incubator-airflow/pull/4139#issuecomment-436299237
 
 
   Do you mean some documentation?




[GitHub] ultrabug commented on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-11-06 Thread GitBox
ultrabug commented on issue #2460: [AIRFLOW-1424] make the next execution date 
of DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-436298904
 
 
   To be complete: now of course if I add a DummyOperator task to the example DAG, it 
does not fail any more, since it can find an actual start_date thanks to the 
logic from jobs.py... which leads me to think that this "find start_date from 
tasks" logic is important to keep.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ultrabug edited a comment on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-11-06 Thread GitBox
ultrabug edited a comment on issue #2460: [AIRFLOW-1424] make the next 
execution date of DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-436295821
 
 
   @XD-DENG bad news, the example DAG in the documentation is breaking the 
scheduler on master so even the documentation is wrong.
   
   Fresh installation, if I run the scheduler using the DAG:
   
   ```python
   """
   Code that goes along with the Airflow tutorial located at:
   
https://github.com/airbnb/airflow/blob/master/airflow/example_dags/tutorial.py
   """
   from airflow import DAG
   from airflow.operators.bash_operator import BashOperator
   from datetime import datetime, timedelta
   
   
   default_args = {
   'owner': 'airflow',
   'depends_on_past': False,
   'start_date': datetime(2015, 12, 1),
   'email': ['airf...@example.com'],
   'email_on_failure': False,
   'email_on_retry': False,
   'retries': 1,
   'retry_delay': timedelta(minutes=5),
   'schedule_interval': '@hourly',
   }
   
   dag = DAG('tutorial', catchup=False, default_args=default_args)
   ```
   
   nothing happens, the scheduler does not pick up anything
   
   now if I change the catchup parameter to `True` 
   
   ```python
   dag = DAG('tutorial', catchup=True, default_args=default_args)
   ```
   
   I get the scheduler failing with
   
   ```
   Process DagFileProcessor1-Process:
   Traceback (most recent call last):
 File "/usr/lib64/python2.7/multiprocessing/process.py", line 267, in 
_bootstrap
   self.run()
 File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
   self._target(*self._args, **self._kwargs)
 File "/home/alexys/github/incubator-airflow_numberly/airflow/jobs.py", 
line 395, in helper
   pickle_dags)
 File "/home/alexys/github/incubator-airflow_numberly/airflow/utils/db.py", 
line 74, in wrapper
   return func(*args, **kwargs)
 File "/home/alexys/github/incubator-airflow_numberly/airflow/jobs.py", 
line 1726, in process_file
   self._process_dags(dagbag, dags, ti_keys_to_schedule)
 File "/home/alexys/github/incubator-airflow_numberly/airflow/jobs.py", 
line 1426, in _process_dags
   dag_run = self.create_dag_run(dag)
 File "/home/alexys/github/incubator-airflow_numberly/airflow/utils/db.py", 
line 74, in wrapper
   return func(*args, **kwargs)
 File "/home/alexys/github/incubator-airflow_numberly/airflow/jobs.py", 
line 872, in create_dag_run
   if next_run_date > timezone.utcnow():
   TypeError: can't compare datetime.datetime to NoneType
   ```
   
   That None result is annoying even the scheduler :)
   
   EDIT: quoting the documentation for expected behavior
   
   ```
   In the example above, if the DAG is picked up by the scheduler daemon on 
2016-01-02 at 6 AM, (or from the command line), a single DAG Run will be 
created, with an execution_date of 2016-01-01, and the next one will be created 
just after midnight on the morning of 2016-01-03 with an execution date of 
2016-01-02.
   
   If the dag.catchup value had been True instead, the scheduler would have 
created a DAG Run for each completed interval between 2015-12-01 and 2016-01-02 
(but not yet one for 2016-01-02, as that interval hasn’t completed) and the 
scheduler will execute them sequentially. This behavior is great for atomic 
datasets that can easily be split into periods. Turning catchup off is great if 
your DAG Runs perform backfill internally.
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on a change in pull request #4129: [AIRFLOW-3294] Update connections form and integration docs

2018-11-06 Thread GitBox
kaxil commented on a change in pull request #4129: [AIRFLOW-3294] Update 
connections form and integration docs
URL: https://github.com/apache/incubator-airflow/pull/4129#discussion_r231166490
 
 

 ##
 File path: docs/integration.rst
 ##
 @@ -1011,3 +1011,13 @@ QuboleFileSensor
 
 
 .. autoclass:: airflow.contrib.sensors.qubole_sensor.QuboleFileSensor
+
+QuboleCheckOperator
+'''''''''''''''''''
+
+.. autoclass:: 
airflow.contrib.operators.qubole_check_operator.QuboleCheckOperator
+
+QuboleValueCheckOperator
+''''''''''''''''''''''''
+
+.. autoclass:: 
airflow.contrib.operators.qubole_check_operator.QuboleValueCheckOperator
 
 Review comment:
   Ya, I am happy with that given the link points to the same class in code.rst 
:)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on issue #4144: [AIRFLOW-XXX] Use mocking in SimpleHttpOperator tests

2018-11-06 Thread GitBox
ashb commented on issue #4144: [AIRFLOW-XXX] Use mocking in SimpleHttpOperator 
tests
URL: 
https://github.com/apache/incubator-airflow/pull/4144#issuecomment-436290384
 
 
   Whoops I flake8'd up.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on a change in pull request #4129: [AIRFLOW-3294] Update connections form and integration docs

2018-11-06 Thread GitBox
ashb commented on a change in pull request #4129: [AIRFLOW-3294] Update 
connections form and integration docs
URL: https://github.com/apache/incubator-airflow/pull/4129#discussion_r231161317
 
 

 ##
 File path: docs/integration.rst
 ##
 @@ -1011,3 +1011,13 @@ QuboleFileSensor
 
 
 .. autoclass:: airflow.contrib.sensors.qubole_sensor.QuboleFileSensor
+
+QuboleCheckOperator
+'''''''''''''''''''
+
+.. autoclass:: 
airflow.contrib.operators.qubole_check_operator.QuboleCheckOperator
+
+QuboleValueCheckOperator
+''''''''''''''''''''''''
+
+.. autoclass:: 
airflow.contrib.operators.qubole_check_operator.QuboleValueCheckOperator
 
 Review comment:
   Having looked at how this is rendered, how about a middle ground - we list 
the classes in this doc, but just as links, rather than including the code 
docstrings?
   
   For example, in this screenshot the list and short description would stay, 
but the EmrAddStepsOperator wouldn't be in this doc:
   
   https://user-images.githubusercontent.com/34150/48073272-50d22380-e1d6-11e8-801a-5a2e5401dbc3.png
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3289) BashOperator mangles {{\}} escapes in commands

2018-11-06 Thread Nikolay Semyachkin (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676871#comment-16676871
 ] 

Nikolay Semyachkin commented on AIRFLOW-3289:
-

The workaround suggested above didn't work (I still have `N` in the output).

What worked is to put 
{code:java}
cat example.csv | sed 's;,,;,\\N,;g' > example_processed.csv{code}
in a .sh file and call it from the Airflow BashOperator like 
{code:java}
bash process.sh{code}
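
A minimal sketch of wiring that up as a task (the path and the `dag` object are assumed here; note that Airflow treats a `bash_command` ending in `.sh` as a Jinja template file, so the trailing space below avoids a TemplateNotFound error):
{code:python}
from airflow.operators.bash_operator import BashOperator

process_csv = BashOperator(
    task_id='process_csv',
    # the trailing space keeps Airflow from loading process.sh as a template
    bash_command='bash /path/to/process.sh ',
    dag=dag)
{code}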

> BashOperator mangles {{\}} escapes in commands
> --
>
> Key: AIRFLOW-3289
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3289
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Nikolay Semyachkin
>Priority: Major
> Attachments: example.csv, issue_proof.py
>
>
> I want to call a sed command on csv file to replace empty values (,,) with \N.
> I can do it with the following bash command 
> {code:java}
> cat example.csv | sed 's;,,;,\\N,;g' > example_processed.csv{code}
> But when I try to do the same with airflow BashOperator, it substitutes ,, 
> with N (instead of \N).
>  
> I attached the code and csv file to reproduce.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-1632) MySQL to GCS fails for date/datetime before ~1850

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-1632.

Resolution: Duplicate

> MySQL to GCS fails for date/datetime before ~1850
> -
>
> Key: AIRFLOW-1632
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1632
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: gcp
> Environment: Google Cloud Platform
>Reporter: Michael Ghen
>Assignee: Michael Ghen
>Priority: Minor
>
> For tables in MySQL that use a "date" or "datetime" type, a dag that exports 
> from MySQL to Google Cloud Storage and then loads from GCS to BigQuery will 
> fail when the dates are before 1970.
> When the table is exported as JSON to a GCS bucket, dates and datetimes are 
> converted to timestamps using:
> {code}
> time.mktime(value.timetuple())
> {code} 
> This creates a problem when you try parse a date that can't be converted to a 
> UNIX timestamp. For example:
> {code}
> >>> value = datetime.date(1850,1,1)
> >>> time.mktime(value.timetuple())
> Traceback (most recent call last):
>   File "", line 1, in 
> ValueError: year out of range
> {code}
> *Steps to reproduce*
> 0. Set up a MySQL connection and GCP connection in Airflow.
> 1. Create a MySQL table with a "date" field and put some data into the table. 
> {code}
> CREATE TABLE table_with_date (
> date_field date,
> datetime_field datetime
> );
> INSERT INTO table_with_date (date_field, datetime_field) VALUES 
> ('1850-01-01',NOW());
> {code}
> 2. Create a DAG that will export the data from the MySQL to GCS and then load 
> from GCS to BigQuery (use the schema file). For example:
> {code}
> extract = MySqlToGoogleCloudStorageOperator(
> task_id="extract_table",
> mysql_conn_id='mysql_connection',
> google_cloud_storage_conn_id='gcp_connection',
> sql="SELECT * FROM table_with_date",
> bucket='gcs-bucket',
> filename='table_with_date.json',
> schema_filename='schemas/table_with_date.json',
> dag=dag)
> load = GoogleCloudStorageToBigQueryOperator(
> task_id="load_table",
> bigquery_conn_id='gcp_connection',
> google_cloud_storage_conn_id='gcp_connection',
> bucket='gcs-bucket',
> destination_project_dataset_table="dataset.table_with_date",
> source_objects=['table_with_date.json'],
> schema_object='schemas/table_with_date.json',
> source_format='NEWLINE_DELIMITED_JSON',
> create_disposition='CREATE_IF_NEEDED',
> write_disposition='WRITE_TRUNCATE',
> dag=dag)
> load.set_upstream(extract)
> {code}
> 3. Run the DAG 
> Expected: The DAG runs successfully.
> Actual: The `extract_table` task fails with error:
> {code}
> ...
>  ERROR - year out of range
>  Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/airflow/models.py", line 1374, in run
>  result = task_copy.execute(context=context)
>   File 
> "/usr/lib/python2.7/site-packages/airflow/contrib/operators/mysql_to_gcs.py", 
> line 91, in execute
> files_to_upload = self._write_local_data_files(cursor)
>   File 
> "/usr/lib/python2.7/site-packages/airflow/contrib/operators/mysql_to_gcs.py", 
> line 132, in _write_local_data_files
> row = map(self.convert_types, row)
>   File 
> "/usr/lib/python2.7/site-packages/airflow/contrib/operators/mysql_to_gcs.py", 
> line 196, in convert_types
> return time.mktime(value.timetuple())
> ValueError: year out of range
> ...
> {code}
> *Comments:*
> This is really a problem with Python not being able to handle years before 
> like 1850. Bigquery timestamp seems to be able to take years all the way to 
> year 0001. From, 
> https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#timestamp-type,
>  Timestamp range is:
> {quote}
> 0001-01-01 00:00:00 to -12-31 23:59:59.99 UTC.
> {quote}
> I think the fix is probably to keep date/datetime converting to timestamp but 
> use `calendar.timegm`



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] kaxil closed pull request #4144: [AIRFLOW-XXX] Use mocking in SimpleHttpOperator tests

2018-11-06 Thread GitBox
kaxil closed pull request #4144: [AIRFLOW-XXX] Use mocking in 
SimpleHttpOperator tests
URL: https://github.com/apache/incubator-airflow/pull/4144
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ultrabug commented on a change in pull request #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-11-06 Thread GitBox
ultrabug commented on a change in pull request #2460: [AIRFLOW-1424] make the 
next execution date of DAGs visible
URL: https://github.com/apache/incubator-airflow/pull/2460#discussion_r231150858
 
 

 ##
 File path: airflow/models.py
 ##
 @@ -3055,6 +3055,37 @@ def latest_execution_date(self):
         session.close()
         return execution_date
 
+    @property
+    def next_run_date(self):
+        """
+        Returns the next run date for which the dag will be scheduled
+        """
+        next_run_date = None
+        if not self.latest_execution_date:
+            # First run
+            task_start_dates = [t.start_date for t in self.tasks]
+            if task_start_dates:
+                next_run_date = self.normalize_schedule(min(task_start_dates))
+        else:
+            next_run_date = self.following_schedule(self.latest_execution_date)
+        return next_run_date
+
+    @property
+    def next_execution_date(self):
+        """
+        Returns the next execution date at which the dag will be scheduled by
 
 Review comment:
   This is exactly the logic I've followed: you can see that my proposed 
implementation is indeed based on the scheduler's jobs.py code.
   
   I can shrink the next_run_date logic into a single function to keep it 
simple though; going to update.
   
   @XD-DENG as you can see, the scheduler does otherwise, and returning None 
looks strange to me since there is always a **period end** at which the 
scheduler itself will execute the DAG (if it has a start_date, that is).
   
   So the fact that no previous execution has happened doesn't mean there won't 
be any, right? The scheduler code above shows how it handles this.
   
   Still, @XD-DENG I think your link has a nice example and I'll make sure to 
validate that this PR behaves exactly as the documentation says. Sound 
good to you?   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sbilinski commented on issue #4123: [AIRFLOW-3288] Add SNS integration

2018-11-06 Thread GitBox
sbilinski commented on issue #4123: [AIRFLOW-3288] Add SNS integration
URL: 
https://github.com/apache/incubator-airflow/pull/4123#issuecomment-436275126
 
 
   @ashb 
   
   1. Sorry about that - I'm going to open a PR fixing this shortly. 
   2. I'd suggest including this info in the PR template, as the following is 
not clear enough in my opinion:  
   
   > When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on issue #4005: [AIRFLOW-3160] Load latest_dagruns asynchronously, speed up front page load time

2018-11-06 Thread GitBox
ashb commented on issue #4005: [AIRFLOW-3160] Load latest_dagruns 
asynchronously, speed up front page load time
URL: 
https://github.com/apache/incubator-airflow/pull/4005#issuecomment-436270852
 
 
   @aoen Speak of the devil :D This seems to be causing test failures after 
this PR was merged, but only on MySQL, somewhat oddly. Can you take a look 
please? (We might revert this PR temporarily.)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb opened a new pull request #4144: [AIRFLOW-XXX] Use mocking in SimpleHttpOperator tests

2018-11-06 Thread GitBox
ashb opened a new pull request #4144: [AIRFLOW-XXX] Use mocking in 
SimpleHttpOperator tests
URL: https://github.com/apache/incubator-airflow/pull/4144
 
 
   
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] This changes the tests in [AIRFLOW-3262] (#4135) to use requests_mock
   rather than making actual HTTP requests, as we have had this test fail on
   Travis with connection refused.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ron819 commented on a change in pull request #4075: [AIRFLOW-502] BashOperator success/failure conditions not documented

2018-11-06 Thread GitBox
ron819 commented on a change in pull request #4075: [AIRFLOW-502] BashOperator 
success/failure conditions not documented
URL: https://github.com/apache/incubator-airflow/pull/4075#discussion_r231133812
 
 

 ##
 File path: airflow/operators/bash_operator.py
 ##
 @@ -49,6 +49,15 @@ class BashOperator(BaseOperator):
 :type env: dict
 :param output_encoding: Output encoding of bash command
 :type output_encoding: str
+
+On execution of the operator the task will up for retry when exception is 
raised.
+However if a command exists with non-zero value Airflow will not recognize
 
 Review comment:
   @ashb added the requested changes


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on issue #4069: [AIRFLOW-3233] Fix deletion of DAGs in the UI

2018-11-06 Thread GitBox
ashb commented on issue #4069: [AIRFLOW-3233] Fix deletion of DAGs in the UI
URL: 
https://github.com/apache/incubator-airflow/pull/4069#issuecomment-436264005
 
 
   Should be possible to test this by creating a Dag object in the DB and 
ensuring the correct delete link appears in the output
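   
   A rough sketch of such a test (harness and helper names are assumed from 
the www test suite, not from this PR):
   
   ```python
   from airflow import settings
   from airflow.models import DagModel
   
   def test_delete_link_shown_for_dag_missing_from_dagbag(self):
       # DAG row exists in the DB even though the webserver's DagBag
       # has never parsed a matching file
       session = settings.Session()
       session.add(DagModel(dag_id='dag_in_db_only'))
       session.commit()
   
       resp = self.client.get('/', follow_redirects=True)
       # the delete control should be built from dag_id, not dag.dag_id
       self.assertIn('dag_id=dag_in_db_only', resp.data.decode('utf-8'))
   ```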


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on a change in pull request #4069: [AIRFLOW-3233] Fix deletion of DAGs in the UI

2018-11-06 Thread GitBox
ashb commented on a change in pull request #4069: [AIRFLOW-3233] Fix deletion 
of DAGs in the UI
URL: https://github.com/apache/incubator-airflow/pull/4069#discussion_r231132887
 
 

 ##
 File path: airflow/www/templates/airflow/dags.html
 ##
 @@ -191,11 +191,11 @@ DAGs
 
 
 
-
+
 
 Review comment:
   Minor nit: this includes the comment in the HTML, which we don't need. If 
you think the comment is useful then:
   
   ```suggestion
   {# Use dag_id instead of dag.dag_id, because the DAG might 
not exist in the webserver's DagBag #}
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-2865) Race condition between on_success_callback and LocalTaskJob's cleanup

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-2865:
---
Fix Version/s: 1.10.1

> Race condition between on_success_callback and LocalTaskJob's cleanup
> -
>
> Key: AIRFLOW-2865
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2865
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Marcin Mejran
>Priority: Minor
> Fix For: 2.0.0, 1.10.1
>
>
> The TaskInstance's run_raw_task method first records SUCCESS for the task 
> instance and then runs the on_success_callback function.
> The LocalTaskJob's heartbeat_callback checks for any TI's with a SUCCESS 
> state and terminates their processes.
> As such it's possible for the TI process to be terminated before the 
> on_success_callback function finishes running.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3299) Logs for currently running sensors not visible in the UI

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3299:
---
Summary: Logs for currently running sensors not visible in the UI  (was: 
Logs for currently running tasks fail to load)

> Logs for currently running sensors not visible in the UI
> 
>
> Key: AIRFLOW-3299
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3299
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ui
>Reporter: Brad Holmes
>Priority: Major
>
> When a task is actively running, the logs are not appearing.  I have tracked 
> this down to the {{next_try_number}} logic of task-instances.
> In [the view at line 
> 836|https://github.com/apache/incubator-airflow/blame/master/airflow/www/views.py#L836],
>  we have
> {code:java}
> logs = [''] * (ti.next_try_number - 1 if ti is not None else 0)
> {code}
> The length of the {{logs}} array informs the frontend on the number of 
> {{attempts}} that exist, and thus how many AJAX calls to make to load the 
> logs.
> Here is the current logic I have observed
> ||Task State||Current length of 'logs'||Needed length of 'logs'||
> |Successfully completed in 1 attempt|1|1|
> |Successfully completed in 2 attempt|2|2|
> |Not yet attempted|0|0|
> |Actively running task, first time|0|1|
> That last case is the bug.  Perhaps task-instance needs a method like 
> {{most_recent_try_number}} ?  I don't see how to make use of {{try_number()}} 
> or {{next_try_number()}} to meet the need here.
> ||Task State||try_number()||next_try_number()||Number of Attempts _Should_ 
> Display||
> |Successfully completed in 1 attempt|2|2|1|
> |Successfully completed in 2 attempt|3|3|2|
> |Not yet attempted|1|1|0|
> |Actively running task, first time|0|1|1|
> [~ashb] : You implemented this portion of task-instance 11 months ago.  Any 
> suggestions?  Or perhaps the problem is elsewhere?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-3279) Documentation for Google Logging unclear

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-3279.

Resolution: Information Provided

> Documentation for Google Logging unclear
> 
>
> Key: AIRFLOW-3279
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3279
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: configuration, Documentation, gcp, logging
>Reporter: Paul Velthuis
>Priority: Blocker
>
> The documentation of how to install logging to a Google Cloud bucket is 
> unclear.
> I am now following the tutorial on the airflow page:
> [https://airflow.apache.org/howto/write-logs.html]
> Here I find it unclear what part of the 'logger' I have to adjust in the 
> `{{airflow/config_templates/airflow_local_settings.py}}`.
>  
> The adjustment states:
>  
>  # Update the airflow.task and airflow.tas_runner blocks to be 'gcs.task' 
> instead of 'file.task'. 'loggers':
>  Unknown macro: \{ 'airflow.task'}
>  
> However what I find in the template is:
> |'loggers': \{\| \|'airflow.processor': { | |'handlers': 
> ['processor'], | |'level': LOG_LEVEL, | 
> |'propagate': False, | |},|
> |'airflow.task': { 
> \| 
> \|'handlers': ['task'], 
> \| 
> \|'level': LOG_LEVEL, 
> \| 
> \|'propagate': False, 
> \| 
> \|},|
> |'flask_appbuilder': { 
> \| 
> \|'handler': ['console'], 
> \| 
> \|'level': FAB_LOG_LEVEL, 
> \| 
> \|'propagate': True, 
> \| 
> \|}|
> },
>  
> Since for me it is very important to do it right at the first time I hope 
> some clarity can be provided in what has to be adjusted in the logger. Is it 
> only the 'airflow.task' or more?
> Furthermore, at step 6 it is a little unclear what remote_log_conn_id means. 
> I would propose to add a little more information to make this more clear.
>  
> The current error I am facing is:
> Traceback (most recent call last):
>  File "/usr/local/bin/airflow", line 16, in 
>  from airflow import configuration
>  File "/usr/local/lib/python2.7/site-packages/airflow/__init__.py", line 31, 
> in 
>  from airflow import settings
>  File "/usr/local/lib/python2.7/site-packages/airflow/settings.py", line 198, 
> in 
>  configure_logging()
>  File "/usr/local/lib/python2.7/site-packages/airflow/logging_config.py", 
> line 71, in configure_logging
>  dictConfig(logging_config)
>  File "/usr/local/lib/python2.7/logging/config.py", line 794, in dictConfig
>  dictConfigClass(config).configure()
>  File "/usr/local/lib/python2.7/logging/config.py", line 568, in configure
>  handler = self.configure_handler(handlers[name])
>  File "/usr/local/lib/python2.7/logging/config.py", line 733, in 
> configure_handler
>  result = factory(**kwargs)
>  File 
> "/usr/local/lib/python2.7/site-packages/airflow/utils/log/gcs_task_handler.py",
>  line 30, in __init__
>  super(GCSTaskHandler, self).__init__(base_log_folder, filename_template)
>  File 
> "/usr/local/lib/python2.7/site-packages/airflow/utils/log/file_task_handler.py",
>  line 46, in __init__
>  self.filename_jinja_template = Template(self.filename_template)
>  File "/usr/local/lib/python2.7/site-packages/jinja2/environment.py", line 
> 926, in __new__
>  return env.from_string(source, template_class=cls)
>  File "/usr/local/lib/python2.7/site-packages/jinja2/environment.py", line 
> 862, in from_string
>  return cls.from_code(self, self.compile(source), globals, None)
>  File "/usr/local/lib/python2.7/site-packages/jinja2/environment.py", line 
> 565, in compile
>  self.handle_exception(exc_info, source_hint=source_hint)
>  File "/usr/local/lib/python2.7/site-packages/jinja2/environment.py", line 
> 754, in handle_exception
>  reraise(exc_type, exc_value, tb)
>  File "", line 1, in template
> jinja2.exceptions.TemplateSyntaxError: expected token ':', got '}'
> Error in atexit._run_exitfuncs:
> Traceback (most recent call last):
>  File "/usr/local/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
>  func(*targs, **kargs)
>  File "/usr/local/lib/python2.7/logging/__init__.py", line 1676, in shutdown
>  h.close()
>  File 
> "/usr/local/lib/python2.7/site-packages/airflow/utils/log/gcs_task_handler.py",
>  line 73, in close
>  if self.closed:
> AttributeError: 'GCSTaskHandler' object has no attribute 'closed'
> Error in sys.exitfunc:
> Traceback (most recent call last):
>  File "/usr/local/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
>  func(*targs, **kargs)
>  File "/usr/local/lib/python2.7/logging/__init__.py", line 1676, in shutdown
>  h.close()
>  File 
> "/usr/local/lib/python2.7/site-packages/airflow/utils/log/gcs_task_handler.py",
>  line 73, in close
>  if self.closed:
> AttributeError: 'GCSTaskHandler' object has no attribute 'closed'
>  If I look at the Airflow code I see the following code for the 
> 

[GitHub] ashb commented on issue #4123: [AIRFLOW-3288] Add SNS integration

2018-11-06 Thread GitBox
ashb commented on issue #4123: [AIRFLOW-3288] Add SNS integration
URL: 
https://github.com/apache/incubator-airflow/pull/4123#issuecomment-436260007
 
 
   These new classes are not linked from the docs - please add them to at least 
docs/code.rst


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3293) Rename TimeDeltaSensor to ScheduleTimeDeltaSensor

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676783#comment-16676783
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3293:


AIRFLOW-2747 and AIRFLOW-850 would help with your second point. And as of the 
current release, even if the sensor behaved how you wanted, it would still take 
up an executor slot, as that is how sensors work.

I am uncertain whether this is a common enough use case to support directly, 
given the two tickets mentioned above.
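
For reference, a rough sketch of what the reschedule behaviour from 
AIRFLOW-2747 would look like once released (parameter names per that PR; 
availability depends on the Airflow version):
{code:python}
from datetime import timedelta

from airflow.sensors.time_delta_sensor import TimeDeltaSensor

wait_task = TimeDeltaSensor(
    task_id='wait_before_hitting_endpoint',
    delta=timedelta(minutes=5),
    mode='reschedule',  # frees the executor slot between pokes
    poke_interval=60,
    dag=a_dag)

upstream_http_sensor >> wait_task
{code}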

> Rename TimeDeltaSensor to ScheduleTimeDeltaSensor
> -
>
> Key: AIRFLOW-3293
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3293
> Project: Apache Airflow
>  Issue Type: Wish
>Reporter: Darren Weber
>Priority: Major
>
> The TimeDeltaSensor has baked-in lookups for the schedule and 
> schedule_interval lurking in the class init, it's not a pure time delta.  It 
> would be ideal to have a TimeDelta that is purely relative to the time that 
> an upstream task triggers it.  If there is a way to do this, please note it 
> here or suggest some implementation alternative that could achieve this 
> easily.
> The implementation below using a PythonOperator works, but it consumes a 
> worker for 5min needlessly.  It would be much better to have a TimeDelta that 
> accepts the time when an upstream sensor triggers it and then waits for a 
> timedelta, with options from the base sensor for poke interval (and timeout). 
>  This could be used without consuming a worker as much with the reschedule 
> option.  Something like this can help with adding jitter to downstream tasks 
> that could otherwise hit an HTTP endpoint too hard all at once.
> {code:python}
> def wait5(*args, **kwargs):
> import random
> import time as t
> minutes = random.randint(3,6)
> t.sleep(minutes * 60)
> return True
> wait5_task = PythonOperator(
> task_id="python_op_wait_5min",
> python_callable=wait5,
> dag=a_dag)
> upstream_http_sensor >> wait5_task
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] ashb commented on a change in pull request #4075: [AIRFLOW-502] BashOperator success/failure conditions not documented

2018-11-06 Thread GitBox
ashb commented on a change in pull request #4075: [AIRFLOW-502] BashOperator 
success/failure conditions not documented
URL: https://github.com/apache/incubator-airflow/pull/4075#discussion_r229991618
 
 

 ##
 File path: airflow/operators/bash_operator.py
 ##
 @@ -49,6 +49,15 @@ class BashOperator(BaseOperator):
 :type env: dict
 :param output_encoding: Output encoding of bash command
 :type output_encoding: str
+
+On execution of the operator the task will up for retry when exception is 
raised.
+However if a command exists with non-zero value Airflow will not recognize
+it as failure unless explicitly specified in the beggining of the script.
 
 Review comment:
   ```suggestion
   it as failure unless the whole shell exits with a failure. The easiest 
way of 
   achieving this is to prefix the command with ``set -e;`` 
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on a change in pull request #4075: [AIRFLOW-502] BashOperator success/failure conditions not documented

2018-11-06 Thread GitBox
ashb commented on a change in pull request #4075: [AIRFLOW-502] BashOperator 
success/failure conditions not documented
URL: https://github.com/apache/incubator-airflow/pull/4075#discussion_r231123170
 
 

 ##
 File path: airflow/operators/bash_operator.py
 ##
 @@ -49,6 +49,15 @@ class BashOperator(BaseOperator):
 :type env: dict
 :param output_encoding: Output encoding of bash command
 :type output_encoding: str
+
+On execution of the operator the task will up for retry when exception is 
raised.
+However if a command exists with non-zero value Airflow will not recognize
 
 Review comment:
   ```suggestion
   However if a sub-command exists with non-zero value Airflow will not 
recognize
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on a change in pull request #4075: [AIRFLOW-502] BashOperator success/failure conditions not documented

2018-11-06 Thread GitBox
ashb commented on a change in pull request #4075: [AIRFLOW-502] BashOperator 
success/failure conditions not documented
URL: https://github.com/apache/incubator-airflow/pull/4075#discussion_r231123082
 
 

 ##
 File path: airflow/operators/bash_operator.py
 ##
 @@ -49,6 +49,15 @@ class BashOperator(BaseOperator):
 :type env: dict
 :param output_encoding: Output encoding of bash command
 :type output_encoding: str
+
+On execution of the operator the task will up for retry when exception is 
raised.
+However if a command exists with non-zero value Airflow will not recognize
+it as failure unless explicitly specified in the beggining of the script.
+Example:
+bash_command = "python3 script.py '{{ next_execution_date }}'"
+when executing command exit(1) the task will be marked as success.
+bash_command = "set -e; python3 script.py '{{ next_execution_date }}'"
+when executing command  exit(1) the task will be marked as up for 
retry.
 
 Review comment:
   Also this so that exit(1) is rendered as code/mono-spaced
   ```suggestion
   when executing command ``exit(1)`` the task will be marked as up for 
retry.
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on a change in pull request #4075: [AIRFLOW-502] BashOperator success/failure conditions not documented

2018-11-06 Thread GitBox
ashb commented on a change in pull request #4075: [AIRFLOW-502] BashOperator 
success/failure conditions not documented
URL: https://github.com/apache/incubator-airflow/pull/4075#discussion_r231095783
 
 

 ##
 File path: airflow/operators/bash_operator.py
 ##
 @@ -49,6 +49,15 @@ class BashOperator(BaseOperator):
 :type env: dict
 :param output_encoding: Output encoding of bash command
 :type output_encoding: str
+
+On execution of the operator the task will up for retry when exception is 
raised.
+However if a command exists with non-zero value Airflow will not recognize
+it as failure unless explicitly specified in the beggining of the script.
+Example:
+bash_command = "python3 script.py '{{ next_execution_date }}'"
+when executing command exit(1) the task will be marked as success.
+bash_command = "set -e; python3 script.py '{{ next_execution_date }}'"
+when executing command  exit(1) the task will be marked as up for 
retry.
 
 Review comment:
   I suspect these aren't going to render quite right - can you run `make -C 
docs html` and then check the rendering of this? (I think it writes to 
docs/build/html/index.html or similar.)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] zackmeso commented on issue #4114: [AIRFLOW-3259] Fix internal server error when displaying charts

2018-11-06 Thread GitBox
zackmeso commented on issue #4114: [AIRFLOW-3259] Fix internal server error 
when displaying charts
URL: 
https://github.com/apache/incubator-airflow/pull/4114#issuecomment-436248564
 
 
   @Fokko You can try to create a chart on Airflow. It won't work. 
   
   As I said earlier, it's hard to test the whole code of the `chart_data` 
function (check the comment above). However, we can apply both `sort` and 
`sort_values` to some fake dataframe and see that both behave in the same 
way on an older pandas version; the difference is that one is no longer part 
of pandas and the other still is. 
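   
   For illustration, a minimal sketch of that check (fake dataframe; the 
column name is made up, and the old method needs pandas < 0.20):
   
   ```python
   import pandas as pd
   
   df = pd.DataFrame({'value': [3, 1, 2]})
   
   # On pandas < 0.20 both calls return the same ordering:
   # df.sort(columns='value')         # deprecated in 0.17.0, removed in 0.20.0
   print(df.sort_values(by='value'))  # the supported replacement
   ```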


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] msumit commented on issue #1933: [AIRFLOW-689] Okta Authentication

2018-11-06 Thread GitBox
msumit commented on issue #1933: [AIRFLOW-689] Okta Authentication
URL: 
https://github.com/apache/incubator-airflow/pull/1933#issuecomment-436243611
 
 
   For folks who were interested in Okta's API-based authentication, I've raised 
another PR (https://github.com/apache/incubator-airflow/pull/4143). Give it a 
try and see how it goes for you. 
   
   @ashb I agree with your thought, but for folks who want simplicity there is 
no harm in having one more auth provider in the contrib section.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-689) Okta Authentication

2018-11-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676725#comment-16676725
 ] 

ASF GitHub Bot commented on AIRFLOW-689:


msumit opened a new pull request #4143: [AIRFLOW-689] Okta Authentication
URL: https://github.com/apache/incubator-airflow/pull/4143
 
 
   Dear Airflow Maintainers,
   
   Please accept this PR that addresses the following issues:
   
   https://issues.apache.org/jira/browse/AIRFLOW-689
   
   ### Description
   
   - [ ] Ability to use Okta's api based authentication mechanism
   
   ### Tests
   
   - Install Okta SDK with pip onto your airflow instance
   - Add in your Okta API key and organization URL (usually your_org.okta.com) 
in the config
   - Replace backend with okta_auth in the config
   - Log in
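   
   A sketch of the configuration those steps describe (section and key names 
here are illustrative guesses, not taken from the PR):
   
   ```ini
   [webserver]
   authenticate = True
   auth_backend = airflow.contrib.auth.backends.okta_auth
   
   [okta]
   api_token = <your Okta API key>
   org_url = https://your_org.okta.com
   ```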
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Okta Authentication
> ---
>
> Key: AIRFLOW-689
> URL: https://issues.apache.org/jira/browse/AIRFLOW-689
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: contrib
>Reporter: Brian Yang
>Assignee: Brian Yang
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] msumit opened a new pull request #4143: [AIRFLOW-689] Okta Authentication

2018-11-06 Thread GitBox
msumit opened a new pull request #4143: [AIRFLOW-689] Okta Authentication
URL: https://github.com/apache/incubator-airflow/pull/4143
 
 
   Dear Airflow Maintainers,
   
   Please accept this PR that addresses the following issues:
   
   https://issues.apache.org/jira/browse/AIRFLOW-689
   
   ### Description
   
   - [ ] Ability to use Okta's api based authentication mechanism
   
   ### Tests
   
   - Install Okta SDK with pip onto your airflow instance
   - Add in your Okta API key and organization URL (usually your_org.okta.com) 
in the config
   - Replace backend with okta_auth in the config
   - Log in
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on issue #4137: [AIRFLOW-XXX] Fix Docstrings in Hooks, Sensors & Operators

2018-11-06 Thread GitBox
kaxil commented on issue #4137: [AIRFLOW-XXX] Fix Docstrings in Hooks, Sensors 
& Operators
URL: 
https://github.com/apache/incubator-airflow/pull/4137#issuecomment-436231517
 
 
   @ashb Yes, I am going to give that plugin a good review and play around 
with it, and then discuss with you guys the best approach we can take.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil closed pull request #4137: [AIRFLOW-XXX] Fix Docstrings in Hooks, Sensors & Operators

2018-11-06 Thread GitBox
kaxil closed pull request #4137: [AIRFLOW-XXX] Fix Docstrings in Hooks, Sensors 
& Operators
URL: https://github.com/apache/incubator-airflow/pull/4137
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/hooks/spark_submit_hook.py 
b/airflow/contrib/hooks/spark_submit_hook.py
index 65bb6134e6..197b84a7b6 100644
--- a/airflow/contrib/hooks/spark_submit_hook.py
+++ b/airflow/contrib/hooks/spark_submit_hook.py
@@ -33,14 +33,15 @@ class SparkSubmitHook(BaseHook, LoggingMixin):
 This hook is a wrapper around the spark-submit binary to kick off a 
spark-submit job.
 It requires that the "spark-submit" binary is in the PATH or the 
spark_home to be
 supplied.
+
 :param conf: Arbitrary Spark configuration properties
 :type conf: dict
 :param conn_id: The connection id as configured in Airflow administration. 
When an
-invalid connection_id is supplied, it will default to yarn.
+invalid connection_id is supplied, it will default to yarn.
 :type conn_id: str
 :param files: Upload additional files to the executor running the job, 
separated by a
-  comma. Files will be placed in the working directory of each 
executor.
-  For example, serialized objects.
+comma. Files will be placed in the working directory of each executor.
+For example, serialized objects.
 :type files: str
 :param py_files: Additional python files used by the job, can be .zip, 
.egg or .py.
 :type py_files: str
@@ -51,19 +52,19 @@ class SparkSubmitHook(BaseHook, LoggingMixin):
 :param java_class: the main class of the Java application
 :type java_class: str
 :param packages: Comma-separated list of maven coordinates of jars to 
include on the
-driver and executor classpaths
+driver and executor classpaths
 :type packages: str
 :param exclude_packages: Comma-separated list of maven coordinates of jars 
to exclude
-while resolving the dependencies provided in 'packages'
+while resolving the dependencies provided in 'packages'
 :type exclude_packages: str
 :param repositories: Comma-separated list of additional remote 
repositories to search
-for the maven coordinates given with 'packages'
+for the maven coordinates given with 'packages'
 :type repositories: str
 :param total_executor_cores: (Standalone & Mesos only) Total cores for all 
executors
-(Default: all the available cores on the worker)
+(Default: all the available cores on the worker)
 :type total_executor_cores: int
 :param executor_cores: (Standalone, YARN and Kubernetes only) Number of 
cores per
-executor (Default: 2)
+executor (Default: 2)
 :type executor_cores: int
 :param executor_memory: Memory per executor (e.g. 1000M, 2G) (Default: 1G)
 :type executor_memory: str
@@ -80,7 +81,7 @@ class SparkSubmitHook(BaseHook, LoggingMixin):
 :param application_args: Arguments for the application being submitted
 :type application_args: list
 :param env_vars: Environment variables for spark-submit. It
- supports yarn and k8s mode too.
+supports yarn and k8s mode too.
 :type env_vars: dict
 :param verbose: Whether to pass the verbose flag to spark-submit process 
for debugging
 :type verbose: bool
diff --git a/airflow/contrib/hooks/sqoop_hook.py 
b/airflow/contrib/hooks/sqoop_hook.py
index 74cddc2b21..f4bad83144 100644
--- a/airflow/contrib/hooks/sqoop_hook.py
+++ b/airflow/contrib/hooks/sqoop_hook.py
@@ -36,13 +36,14 @@ class SqoopHook(BaseHook, LoggingMixin):
 
 Additional arguments that can be passed via the 'extra' JSON field of the
 sqoop connection:
-* job_tracker: Job tracker local|jobtracker:port.
-* namenode: Namenode.
-* lib_jars: Comma separated jar files to include in the classpath.
-* files: Comma separated files to be copied to the map reduce cluster.
-* archives: Comma separated archives to be unarchived on the compute
-machines.
-* password_file: Path to file containing the password.
+
+* ``job_tracker``: Job tracker local|jobtracker:port.
+* ``namenode``: Namenode.
+* ``lib_jars``: Comma separated jar files to include in the classpath.
+* ``files``: Comma separated files to be copied to the map reduce 
cluster.
+* ``archives``: Comma separated archives to be unarchived on the 
compute
+machines.
+* ``password_file``: Path to file containing the password.
 
 :param conn_id: Reference to the sqoop connection.
 :type conn_id: str
@@ -205,6 +206,7 @@ def import_table(self, table, target_dir=None, 
append=False, file_type="text",
 """
   

[GitHub] kaxil commented on issue #4139: [AIRFLOW-2715] Pick up the region setting while launching Dataflow templates

2018-11-06 Thread GitBox
kaxil commented on issue #4139: [AIRFLOW-2715] Pick up the region setting while 
launching Dataflow templates
URL: 
https://github.com/apache/incubator-airflow/pull/4139#issuecomment-436224075
 
 
   Can you add this to `DataflowTemplateOperator`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-3184) AwsHook with a conn_id that doesn't exist doesn't cause an error

2018-11-06 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3184:
---
Labels: easy-fix  (was: )

Looking at the code, the fix is probably in _get_credentials inside aws_hook - 
the try block should only re-raise the error if {{self.aws_conn_id != 
'aws_default'}}.
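
A sketch of that change (structure assumed from the description above; the 
real method differs in detail):
{code:python}
from airflow.exceptions import AirflowException

# Sketch of AwsHook._get_credentials with the suggested guard:
def _get_credentials(self, region_name):
    aws_access_key_id = aws_secret_access_key = None
    try:
        connection_object = self.get_connection(self.aws_conn_id)
        aws_access_key_id = connection_object.login
        aws_secret_access_key = connection_object.password
    except AirflowException:
        if self.aws_conn_id != 'aws_default':
            # an explicitly named connection that cannot be found should fail
            raise
        # only the implicit 'aws_default' may fall back to boto3's default chain
    # ... continue building the boto3 session as before ...
{code}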

> AwsHook with a conn_id that doesn't exist doesn't cause an error
> 
>
> Key: AIRFLOW-3184
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3184
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: aws
>Affects Versions: 1.9.0
>Reporter: Ash Berlin-Taylor
>Priority: Minor
>  Labels: easy-fix
>
> It is possible to create an S3Hook (which is a subclass of the AwsHook) with 
> an invalid connection ID, and rather than it causing an error of "connection 
> not found" or similar, it falls back to something, and continues 
> execution anyway.
> Simple repro code:
> {code}
> h = S3Hook('i-dontexist')
> h.list_keys(bucket_name="bucket", prefix="folder/")
> {code}
> Ideally the first line here should throw an exception of some form or other 
> (possibly _except_ in the case where the {{conn_id}} is the default value of 
> "aws_default") rather than it's current behaviour, as this made it more 
> difficult to track down the source of our problem.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2679) GoogleCloudStorageToBigQueryOperator to support MERGE

2018-11-06 Thread Daniel Lamblin (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676624#comment-16676624
 ] 

Daniel Lamblin commented on AIRFLOW-2679:
-

The operator uses the Google Cloud Storage Hook to download the schema, and the 
Big Query Hook to create the table, either as external or by loading. It does 
this by setting a table insert job with a configuration that includes the write 
disposition.
As you can see from the Google Cloud Big Query API 
[https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs] 
configuration.copy.writeDisposition only supports the three modes you listed 
that Airflow in turn supports.

Merge is a query statement. It requires extra clauses to identify how to merge 
for a match and no match.
Using it correctly involves two steps: loading the table, and merging the 
loaded table with your target table.
As, in this scenario, the loaded table is likely just a staging table about to 
be discarded after the merge statement, it would make sense to load it as an 
external table, possibly saving time overall.
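
As a rough sketch of that two-step pattern (all bucket, dataset and column 
names below are made up, and the operators assume an existing `dag` object):

```python
from airflow.contrib.operators.gcs_to_bq import GoogleCloudStorageToBigQueryOperator
from airflow.contrib.operators.bigquery_operator import BigQueryOperator

# Step 1: load the GCS file into a disposable staging table.
load_staging = GoogleCloudStorageToBigQueryOperator(
    task_id='load_staging',
    bucket='my-bucket',
    source_objects=['exports/users-*.json'],
    source_format='NEWLINE_DELIMITED_JSON',
    destination_project_dataset_table='project.dataset.users_staging',
    write_disposition='WRITE_TRUNCATE',
    dag=dag)

# Step 2: MERGE the staging table into the target table.
merge_into_target = BigQueryOperator(
    task_id='merge_into_target',
    use_legacy_sql=False,
    sql="""
        MERGE `project.dataset.users` T
        USING `project.dataset.users_staging` S
        ON T.id = S.id
        WHEN MATCHED THEN UPDATE SET name = S.name
        WHEN NOT MATCHED THEN INSERT (id, name) VALUES (S.id, S.name)
    """,
    dag=dag)

load_staging >> merge_into_target
```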

> GoogleCloudStorageToBigQueryOperator to support MERGE
> -
>
> Key: AIRFLOW-2679
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2679
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: jack
>Priority: Major
>
> Currently the GoogleCloudStorageToBigQueryOperator supports the 
> write_disposition parameter, which can be: WRITE_TRUNCATE, WRITE_APPEND, or 
> WRITE_EMPTY.
>  
> However, Google has another very useful write method, MERGE:
> [https://cloud.google.com/bigquery/docs/reference/standard-sql/dml-syntax#merge_examples]
> Supporting the MERGE statement would be extremely useful.
> The idea behind this request is to do it directly from the Google Storage 
> file rather than load the file into a table and then run another MERGE 
> statement.
>  
> The MERGE statement is really helpful when one wants records to be updated 
> rather than appended or replaced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] kaxil commented on issue #4137: [AIRFLOW-XXX] Fix Docstrings in Hooks, Sensors & Operators

2018-11-06 Thread GitBox
kaxil commented on issue #4137: [AIRFLOW-XXX] Fix Docstrings in Hooks, Sensors 
& Operators
URL: 
https://github.com/apache/incubator-airflow/pull/4137#issuecomment-436215443
 
 
   @r39132 There is a huge list of issues, but some of them are checks we don't 
need, like `D401 First line should be in imperative mood`. 
   
   
![image](https://user-images.githubusercontent.com/8811558/48060340-a0ebbe80-e1b3-11e8-9593-b493814dc114.png)
   
   We decreased the total from 7641 to 7634. 
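
   For readers unfamiliar with that check, D401 only concerns the mood of the 
first docstring line, e.g. (illustrative):

```python
def bad():
    """Returns the connection."""  # flagged by pydocstyle D401

def good():
    """Return the connection."""   # imperative mood, passes D401
```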


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb edited a comment on issue #4006: [AIRFLOW-3164] Verify server certificate when connecting to LDAP

2018-11-06 Thread GitBox
ashb edited a comment on issue #4006: [AIRFLOW-3164] Verify server certificate 
when connecting to LDAP
URL: 
https://github.com/apache/incubator-airflow/pull/4006#issuecomment-436214512
 
 
   Fair point, they can create a custom auth backend if they want to. I'll put 
that back.
   
   @bolkedebruin Have confirmed with `tshark` that not specifying a version 
uses TLSv1.2 by default. (Couldn't think of any way of unit testing this.)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on issue #4006: [AIRFLOW-3164] Verify server certificate when connecting to LDAP

2018-11-06 Thread GitBox
ashb commented on issue #4006: [AIRFLOW-3164] Verify server certificate when 
connecting to LDAP
URL: 
https://github.com/apache/incubator-airflow/pull/4006#issuecomment-436214512
 
 
   Fair point, they can create a custom auth backend if they want to. I'll put 
that back.
   
   @bolkedebruin Have confirmed with `tshark` that not specifying a version 
uses TLSv1.2 by default.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] XD-DENG commented on issue #4138: [AIRFLOW-3301] Update DockerOperator unit test for PR #3977 to fix CI failure

2018-11-06 Thread GitBox
XD-DENG commented on issue #4138: [AIRFLOW-3301] Update DockerOperator unit 
test for PR #3977 to fix CI failure
URL: 
https://github.com/apache/incubator-airflow/pull/4138#issuecomment-436214350
 
 
   Thank you @kaxil  :-)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #3586: [AIRFLOW-2733] Reconcile psutil and subprocess in webserver cli

2018-11-06 Thread GitBox
codecov-io edited a comment on issue #3586: [AIRFLOW-2733] Reconcile psutil and 
subprocess in webserver cli
URL: 
https://github.com/apache/incubator-airflow/pull/3586#issuecomment-403631506
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3586?src=pr=h1)
 Report
   > Merging 
[#3586](https://codecov.io/gh/apache/incubator-airflow/pull/3586?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/e703d6beeb379ee88ef5e7df495e8a785666f8af?src=pr=desc)
 will **increase** coverage by `0.88%`.
   > The diff coverage is `50%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3586/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3586?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #3586      +/-   ##
   ==========================================
   + Coverage   76.67%   77.56%   +0.88%     
   ==========================================
     Files         199      204       +5     
     Lines       16186    15767     -419     
   ==========================================
   - Hits        12410    12229     -181     
   + Misses       3776     3538     -238
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3586?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/bin/cli.py](https://codecov.io/gh/apache/incubator-airflow/pull/3586/diff?src=pr=tree#diff-YWlyZmxvdy9iaW4vY2xpLnB5)
 | `64.43% <50%> (-0.4%)` | :arrow_down: |
   | 
[airflow/operators/slack\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/3586/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvc2xhY2tfb3BlcmF0b3IucHk=)
 | `0% <0%> (-97.37%)` | :arrow_down: |
   | 
[airflow/sensors/s3\_key\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3586/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3MzX2tleV9zZW5zb3IucHk=)
 | `31.03% <0%> (-68.97%)` | :arrow_down: |
   | 
[airflow/sensors/s3\_prefix\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3586/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3MzX3ByZWZpeF9zZW5zb3IucHk=)
 | `41.17% <0%> (-58.83%)` | :arrow_down: |
   | 
[airflow/example\_dags/example\_python\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/3586/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9weXRob25fb3BlcmF0b3IucHk=)
 | `78.94% <0%> (-15.79%)` | :arrow_down: |
   | 
[airflow/utils/helpers.py](https://codecov.io/gh/apache/incubator-airflow/pull/3586/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9oZWxwZXJzLnB5)
 | `71.34% <0%> (-13.04%)` | :arrow_down: |
   | 
[airflow/hooks/mysql\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/3586/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9teXNxbF9ob29rLnB5)
 | `78% <0%> (-12%)` | :arrow_down: |
   | 
[airflow/sensors/sql\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3586/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3NxbF9zZW5zb3IucHk=)
 | `90.47% <0%> (-9.53%)` | :arrow_down: |
   | 
[airflow/utils/sqlalchemy.py](https://codecov.io/gh/apache/incubator-airflow/pull/3586/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zcWxhbGNoZW15LnB5)
 | `73.91% <0%> (-7.52%)` | :arrow_down: |
   | 
[airflow/configuration.py](https://codecov.io/gh/apache/incubator-airflow/pull/3586/diff?src=pr=tree#diff-YWlyZmxvdy9jb25maWd1cmF0aW9uLnB5)
 | `83.95% <0%> (-5.47%)` | :arrow_down: |
   | ... and [94 
more](https://codecov.io/gh/apache/incubator-airflow/pull/3586/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3586?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3586?src=pr=footer).
 Last update 
[e703d6b...e7e5a68](https://codecov.io/gh/apache/incubator-airflow/pull/3586?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on issue #4138: [AIRFLOW-3301] Update DockerOperator unit test for PR #3977 to fix CI failure

2018-11-06 Thread GitBox
kaxil commented on issue #4138: [AIRFLOW-3301] Update DockerOperator unit test 
for PR #3977 to fix CI failure
URL: 
https://github.com/apache/incubator-airflow/pull/4138#issuecomment-436212845
 
 
   Thanks @XD-DENG 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3301) Update CI test for [AIRFLOW-3132] (PR #3977)

2018-11-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676608#comment-16676608
 ] 

ASF GitHub Bot commented on AIRFLOW-3301:
-

kaxil closed pull request #4138: [AIRFLOW-3301] Update DockerOperator unit test 
for PR #3977 to fix CI failure
URL: https://github.com/apache/incubator-airflow/pull/4138
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/tests/operators/test_docker_operator.py 
b/tests/operators/test_docker_operator.py
index a7d63e4ebc..7ab27c1aeb 100644
--- a/tests/operators/test_docker_operator.py
+++ b/tests/operators/test_docker_operator.py
@@ -80,6 +80,7 @@ def test_execute(self, client_class_mock, mkdtemp_mock):
   shm_size=1000,
   cpu_shares=1024,
   mem_limit=None,
+  auto_remove=False,
   dns=None,
   dns_search=None)
 client_mock.images.assert_called_with(name='ubuntu:latest')


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Update CI test for [AIRFLOW-3132] (PR #3977)
> 
>
> Key: AIRFLOW-3301
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3301
> Project: Apache Airflow
>  Issue Type: Test
>  Components: tests
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Critical
>
> In PR [https://github.com/apache/incubator-airflow/pull/3977], the test was 
> not updated accordingly, and it results in CI failure.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] kaxil closed pull request #4138: [AIRFLOW-3301] Update DockerOperator unit test for PR #3977 to fix CI failure

2018-11-06 Thread GitBox
kaxil closed pull request #4138: [AIRFLOW-3301] Update DockerOperator unit test 
for PR #3977 to fix CI failure
URL: https://github.com/apache/incubator-airflow/pull/4138
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/tests/operators/test_docker_operator.py 
b/tests/operators/test_docker_operator.py
index a7d63e4ebc..7ab27c1aeb 100644
--- a/tests/operators/test_docker_operator.py
+++ b/tests/operators/test_docker_operator.py
@@ -80,6 +80,7 @@ def test_execute(self, client_class_mock, mkdtemp_mock):
   shm_size=1000,
   cpu_shares=1024,
   mem_limit=None,
+  auto_remove=False,
   dns=None,
   dns_search=None)
 client_mock.images.assert_called_with(name='ubuntu:latest')


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2842) GCS rsync operator

2018-11-06 Thread Daniel Lamblin (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676582#comment-16676582
 ] 

Daniel Lamblin commented on AIRFLOW-2842:
-

Do you think it would not be possible with a simple BashOperator call to the 
utility?
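
For reference, such a call might look like the sketch below (bucket names are 
made up, an existing `dag` object is assumed, and `-d` deletes destination 
objects missing from the source, so it should be used with care):

```python
from airflow.operators.bash_operator import BashOperator

# Mirror gs://source-bucket into gs://dest-bucket, deleting strays (-d).
sync_buckets = BashOperator(
    task_id='sync_buckets',
    bash_command='gsutil -m rsync -r -d gs://source-bucket gs://dest-bucket',
    dag=dag)
```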

> GCS rsync operator
> --
>
> Key: AIRFLOW-2842
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2842
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Vikram Oberoi
>Priority: Major
>
> The GoogleCloudStorageToGoogleCloudStorageOperator supports copying objects 
> from one bucket to another using a wildcard.
> As long you don't delete anything in the source bucket, the destination 
> bucket will end up synchronized on every run.
> However, each object gets copied over even if it exists at the destination, 
> which makes this operation inefficient, time-consuming, and potentially 
> costly.
> I'd love an operator that behaves like `gsutil rsync` for when I need to 
> synchronize two buckets, supporting `gsutil rsync -d` behavior as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3303) Deprecate old UI in favor of new FAB RBAC

2018-11-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676513#comment-16676513
 ] 

ASF GitHub Bot commented on AIRFLOW-3303:
-

verdan opened a new pull request #4142: [AIRFLOW-3303] Deprecate old UI in 
favor of FAB
URL: https://github.com/apache/incubator-airflow/pull/4142
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW-3303) issues and references 
them in the PR title. 
   
   ### Description
   
   - [ ] We are using two different versions of the UI in Apache Airflow. The 
idea is to deprecate and remove the older version of the UI and use the new 
Flask App Builder (RBAC) version as the default UI from now on (most probably 
in release 2.0.x).
   This PR removes the old UI and renames the references of `www_rbac` to 
`www`. 
   
   ### Tests
   
   - [ ] Skipped some of the test case classes, as these were purely using the 
older version of the application and its configuration. 
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [ ] Passes `flake8`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Deprecate old UI in favor of new FAB RBAC
> -
>
> Key: AIRFLOW-3303
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3303
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ui
>Reporter: Verdan Mahmood
>Assignee: Verdan Mahmood
>Priority: Major
>
> It's hard to maintain multiple UIs in parallel. 
> The idea is to remove the old UI in favor of the new FAB RBAC version. 
> Make sure to verify all the REST APIs are in place and working. 
> All test cases should pass. Skip the tests related to the old UI.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] verdan opened a new pull request #4142: [AIRFLOW-3303] Deprecate old UI in favor of FAB

2018-11-06 Thread GitBox
verdan opened a new pull request #4142: [AIRFLOW-3303] Deprecate old UI in 
favor of FAB
URL: https://github.com/apache/incubator-airflow/pull/4142
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW-3303) issues and references 
them in the PR title. 
   
   ### Description
   
   - [ ] We are using two different versions of the UI in Apache Airflow. The 
idea is to deprecate and remove the older version of the UI and use the new 
Flask App Builder (RBAC) version as the default UI from now on (most probably 
in release 2.0.x).
   This PR removes the old UI and renames the references of `www_rbac` to 
`www`. 
   
   ### Tests
   
   - [ ] Skipped some of the test case classes, as these were purely using the 
older version of the application and its configuration. 
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [ ] Passes `flake8`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-3303) Deprecate old UI in favor of new FAB RBAC

2018-11-06 Thread Verdan Mahmood (JIRA)
Verdan Mahmood created AIRFLOW-3303:
---

 Summary: Deprecate old UI in favor of new FAB RBAC
 Key: AIRFLOW-3303
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3303
 Project: Apache Airflow
  Issue Type: Improvement
  Components: ui
Reporter: Verdan Mahmood
Assignee: Verdan Mahmood


It's hard to maintain multiple UIs in parallel. 

The idea is to remove the old UI in favor of the new FAB RBAC version. 

Make sure to verify all the REST APIs are in place and working. 

All test cases should pass. Skip the tests related to the old UI.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] phani8996 edited a comment on issue #4111: [AIRFLOW-3266] Add AWS Athena Operator and hook

2018-11-06 Thread GitBox
phani8996 edited a comment on issue #4111: [AIRFLOW-3266] Add AWS Athena 
Operator and hook
URL: 
https://github.com/apache/incubator-airflow/pull/4111#issuecomment-436176981
 
 
   @ashb requested changes have been made. Please review and let me know if 
anything else needs to be done. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] phani8996 commented on issue #4111: [AIRFLOW-3266] Add AWS Athena Operator and hook

2018-11-06 Thread GitBox
phani8996 commented on issue #4111: [AIRFLOW-3266] Add AWS Athena Operator and 
hook
URL: 
https://github.com/apache/incubator-airflow/pull/4111#issuecomment-436176981
 
 
   @ashb all requested changes have been made. Please review and share if 
anything else can be done. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3302) Small CSS fixes

2018-11-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676369#comment-16676369
 ] 

ASF GitHub Bot commented on AIRFLOW-3302:
-

msumit opened a new pull request #4140: [AIRFLOW-3302] Small CSS fixes
URL: https://github.com/apache/incubator-airflow/pull/4140
 
 
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. 
 - https://issues.apache.org/jira/browse/AIRFLOW-3302
   
   ### Description
   
   - [ ] 2 small CSS fixes
  - Don't highlight logout button when viewing *Log* tab of a task run
  - Align Airflow logo to the center of the login page
   
   ### Tests
   
   - [ ] Tested manually
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Small CSS fixes
> ---
>
> Key: AIRFLOW-3302
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3302
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Sumit Maheshwari
>Assignee: Sumit Maheshwari
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] jie8357IOII opened a new pull request #4141: [bugfix] minikube environment lack of init airflow db step

2018-11-06 Thread GitBox
jie8357IOII opened a new pull request #4141:  [bugfix] minikube environment 
lack of init airflow db step
URL: https://github.com/apache/incubator-airflow/pull/4141
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   No jira.
   
   ### Description
   When using kubernetes/kube/deploy to deploy Airflow on Minikube, the init 
container always fails, because the 'airflow' database is missing from 
PostgreSQL. I added an init 'airflow' db step in Python.
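
   A rough sketch of what such an init step could look like (host, credentials 
and database name are assumptions for illustration, not the actual deploy 
scripts):

```python
import psycopg2

# Create the 'airflow' database if it does not exist yet.
conn = psycopg2.connect(host='postgres-airflow', user='root',
                        password='root', dbname='postgres')
conn.autocommit = True  # CREATE DATABASE cannot run inside a transaction
with conn.cursor() as cur:
    cur.execute("SELECT 1 FROM pg_database WHERE datname = 'airflow'")
    if cur.fetchone() is None:
        cur.execute('CREATE DATABASE airflow')
conn.close()
```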
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] msumit opened a new pull request #4140: [AIRFLOW-3302] Small CSS fixes

2018-11-06 Thread GitBox
msumit opened a new pull request #4140: [AIRFLOW-3302] Small CSS fixes
URL: https://github.com/apache/incubator-airflow/pull/4140
 
 
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. 
 - https://issues.apache.org/jira/browse/AIRFLOW-3302
   
   ### Description
   
   - [ ] 2 small CSS fixes
  - Don't highlight logout button when viewing *Log* tab of a task run
  - Align Airflow logo to the center of the login page
   
   ### Tests
   
   - [ ] Tested manually
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2715) Dataflow template operator does not support region parameter

2018-11-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676330#comment-16676330
 ] 

ASF GitHub Bot commented on AIRFLOW-2715:
-

janhicken opened a new pull request #4139: [AIRFLOW-2715] Pick up the region 
setting while launching Dataflow templates
URL: https://github.com/apache/incubator-airflow/pull/4139
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW-2715) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   To launch an instance of a Dataflow template in the configured region,
   the API service.projects().locations().templates() instead of
   service.projects().templates() has to be used. Otherwise, all jobs will
   always be started in us-central1.
   
   In case there is no region configured, the default region `us-central1` will 
get picked up.
   
   To make it even worse, the polling for the job status already honors the
   region parameter and will search for the job in the wrong region in the
   current implementation. Because the job's status is not found, the
   corresponding Airflow task will hang.
   
   This PR is a second approach and follow-up of #4125
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   `tests.contrib.hooks.test_gcp_dataflow_hook.DataFlowTemplateHookTest` has 
been modified
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Dataflow template operator does not support region parameter
> ---
>
> Key: AIRFLOW-2715
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2715
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: operators
>Affects Versions: 1.9.0
>Reporter: Mohammed Tameem
>Priority: Critical
> Fix For: 2.0.0
>
>
> The DataflowTemplateOperator uses dataflow.projects.templates.launch, which 
> has a region parameter but only supports execution of the dataflow job in the 
> us-central1 region. Alternatively, there is another API, 
> dataflow.projects.locations.templates.launch, which supports execution of the 
> template in all regional endpoints provided by Google Cloud.
> It would be great if:
>  # The base REST API of this operator could be changed from 
> "dataflow.projects.templates.launch" to 
> "dataflow.projects.locations.templates.launch"
>  # A templated region parameter was included in the operator to run the 
> dataflow job in the requested regional endpoint.
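
For illustration, the change amounts to roughly the following in the hook 
(variable names are made up; `service` is assumed to be a googleapiclient 
Dataflow v1b3 client):

```python
# Regional endpoint: honors the configured region.
request = service.projects().locations().templates().launch(
    projectId=project_id,
    location=region,        # e.g. 'europe-west1'
    gcsPath=template_path,
    body=launch_body)
response = request.execute(num_retries=5)

# The old call ignores the region and always runs in us-central1:
# service.projects().templates().launch(
#     projectId=project_id, gcsPath=template_path, body=launch_body)
```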



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] janhicken opened a new pull request #4139: [AIRFLOW-2715] Pick up the region setting while launching Dataflow templates

2018-11-06 Thread GitBox
janhicken opened a new pull request #4139: [AIRFLOW-2715] Pick up the region 
setting while launching Dataflow templates
URL: https://github.com/apache/incubator-airflow/pull/4139
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW-2715) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   To launch an instance of a Dataflow template in the configured region,
   the API service.projects().locations().templates() instead of
   service.projects().templates() has to be used. Otherwise, all jobs will
   always be started in us-central1.
   
   In case there is no region configured, the default region `us-central1` will 
get picked up.
   
   To make it even worse, the polling for the job status already honors the
   region parameter and will search for the job in the wrong region in the
   current implementation. Because the job's status is not found, the
   corresponding Airflow task will hang.
   
   This PR is a second approach and follow-up of #4125
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   `tests.contrib.hooks.test_gcp_dataflow_hook.DataFlowTemplateHookTest` has 
been modified
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] yangaws commented on a change in pull request #4126: [AIRFLOW-2524] More AWS SageMaker operators, sensors for model, endpoint-config and endpoint

2018-11-06 Thread GitBox
yangaws commented on a change in pull request #4126: [AIRFLOW-2524] More AWS 
SageMaker operators, sensors for model, endpoint-config and endpoint
URL: https://github.com/apache/incubator-airflow/pull/4126#discussion_r231030413
 
 

 ##
 File path: airflow/contrib/operators/sagemaker_endpoint_operator.py
 ##
 @@ -0,0 +1,151 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.contrib.hooks.aws_hook import AwsHook
+from airflow.contrib.operators.sagemaker_base_operator import 
SageMakerBaseOperator
+from airflow.utils.decorators import apply_defaults
+from airflow.exceptions import AirflowException
+
+
+class SageMakerEndpointOperator(SageMakerBaseOperator):
+
+"""
+Create a SageMaker endpoint.
+
+This operator returns The ARN of the endpoint created in Amazon SageMaker
+
+:param config:
+The configuration necessary to create an endpoint.
+
+If you need to create a SageMaker endpoint based on an existing 
SageMaker model and an existing SageMaker
+endpoint config, 
+
+config = endpoint_configuration;
+
+If you need to create all of SageMaker model, SageMaker 
endpoint-config and SageMaker endpoint, 
+
+config = {
+'Model': model_configuration,
+
+'EndpointConfig': endpoint_config_configuration,
+
+'Endpoint': endpoint_configuration
+}
+
+For details of the configuration parameter of model_configuration, See:
+
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_model
+
+For details of the configuration parameter of 
endpoint_config_configuration, See:
+
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_endpoint_config
+
+For details of the configuration parameter of endpoint_configuration, 
See:
+
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_endpoint
+:type config: dict
+:param aws_conn_id: The AWS connection ID to use.
+:type aws_conn_id: str
+:param wait_for_completion: Whether the operator should wait until the 
endpoint creation finishes.
+:type wait_for_completion: bool
+:param check_interval: If wait is set to True, this is the time interval, 
in seconds, that this operation waits
+before polling the status of the endpoint creation.
+:type check_interval: int
+:param max_ingestion_time: If wait is set to True, this operation fails if 
the endpoint creation doesn't finish
+within max_ingestion_time seconds. If you set this parameter to None 
it never times out.
+:type max_ingestion_time: int
+:param operation: Whether to create an endpoint or update an endpoint. 
Must be either 'create' or 'update'.
+:type operation: str
+"""  # noqa
+
+@apply_defaults
+def __init__(self,
+ config,
+ wait_for_completion=True,
+ check_interval=30,
+ max_ingestion_time=None,
+ operation='create',
+ *args, **kwargs):
+super(SageMakerEndpointOperator, self).__init__(config=config,
+*args, **kwargs)
+
+self.config = config
+self.wait_for_completion = wait_for_completion
+self.check_interval = check_interval
+self.max_ingestion_time = max_ingestion_time
+self.operation = operation.lower()
+if self.operation not in ['create', 'update']:
+raise AirflowException('Invalid value! Argument operation has to 
be one of "create" and "update"')
 
 Review comment:
   Updated. Thanks for the explanation of AirflowException. 
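
   For context, a hedged usage sketch of the operator under review (all config 
payloads and the `dag` object are assumed to exist; each payload follows the 
corresponding boto3 create_* call linked in the docstring):

```python
from airflow.contrib.operators.sagemaker_endpoint_operator import (
    SageMakerEndpointOperator)

create_endpoint = SageMakerEndpointOperator(
    task_id='create_endpoint',
    config={
        'Model': model_config,                     # create_model payload
        'EndpointConfig': endpoint_config_config,  # create_endpoint_config payload
        'Endpoint': endpoint_config,               # create_endpoint payload
    },
    operation='create',       # must be 'create' or 'update'
    wait_for_completion=True,
    check_interval=30,        # poll every 30 seconds
    max_ingestion_time=None,  # never time out
    dag=dag)
```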


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact 

[GitHub] yangaws commented on a change in pull request #4126: [AIRFLOW-2524] More AWS SageMaker operators, sensors for model, endpoint-config and endpoint

2018-11-06 Thread GitBox
yangaws commented on a change in pull request #4126: [AIRFLOW-2524] More AWS 
SageMaker operators, sensors for model, endpoint-config and endpoint
URL: https://github.com/apache/incubator-airflow/pull/4126#discussion_r231030278
 
 

 ##
 File path: tests/contrib/sensors/test_sagemaker_endpoint_sensor.py
 ##
 @@ -0,0 +1,110 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import unittest
+
+try:
+from unittest import mock
+except ImportError:
+try:
+import mock
+except ImportError:
+mock = None
+
+from airflow import configuration
+from airflow.contrib.sensors.sagemaker_endpoint_sensor \
+import SageMakerEndpointSensor
+from airflow.contrib.hooks.sagemaker_hook import SageMakerHook
+from airflow.exceptions import AirflowException
+
+DESCRIBE_ENDPOINT_CREATING_RESPONSE = {
+'EndpointStatus': 'Creating',
+'ResponseMetadata': {
+'HTTPStatusCode': 200,
+}
+}
+DESCRIBE_ENDPOINT_INSERVICE_RESPONSE = {
+'EndpointStatus': 'InService',
+'ResponseMetadata': {
+'HTTPStatusCode': 200,
+}
+}
+
+DESCRIBE_ENDPOINT_FAILED_RESPONSE = {
+'EndpointStatus': 'Failed',
+'ResponseMetadata': {
+'HTTPStatusCode': 200,
+},
+'FailureReason': 'Unknown'
+}
+
+DESCRIBE_ENDPOINT_UPDATING_RESPONSE = {
+'EndpointStatus': 'Updating',
+'ResponseMetadata': {
+'HTTPStatusCode': 200,
+}
+}
+
+
+class TestSageMakerEndpointSensor(unittest.TestCase):
+def setUp(self):
+configuration.load_test_config()
+
+@mock.patch.object(SageMakerHook, 'get_conn')
+@mock.patch.object(SageMakerHook, 'describe_endpoint')
+def test_sensor_with_failure(self, mock_describe, mock_client):
+mock_describe.side_effect = [DESCRIBE_ENDPOINT_FAILED_RESPONSE]
+sensor = SageMakerEndpointSensor(
+task_id='test_task',
+poke_interval=1,
+aws_conn_id='aws_test',
+endpoint_name='test_job_name'
+)
+self.assertRaises(AirflowException, sensor.execute, None)
+mock_describe.assert_called_once_with('test_job_name')
+
+@mock.patch.object(SageMakerHook, 'get_conn')
+@mock.patch.object(SageMakerHook, '__init__')
+@mock.patch.object(SageMakerHook, 'describe_endpoint')
+def test_sensor(self, mock_describe, hook_init, mock_client):
+hook_init.return_value = None
+
+mock_describe.side_effect = [
+DESCRIBE_ENDPOINT_CREATING_RESPONSE,
+DESCRIBE_ENDPOINT_UPDATING_RESPONSE,
+DESCRIBE_ENDPOINT_INSERVICE_RESPONSE
+]
+sensor = SageMakerEndpointSensor(
+task_id='test_task',
+poke_interval=1,
+aws_conn_id='aws_test',
+endpoint_name='test_job_name'
+)
+
+sensor.execute(None)
+
+# make sure we called 4 times(terminated when its compeleted)
 
 Review comment:
   Nice catch! Updated all sensor tests with this inaccurate comment.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-957) the execution_date of dagrun that is created by TriggerDagRunOperator is not euqal the execution_date of TriggerDagRunOperator's task instance

2018-11-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676321#comment-16676321
 ] 

ASF GitHub Bot commented on AIRFLOW-957:


anxodio closed pull request #2238: [AIRFLOW-957] Add execution_date parameter 
to TriggerDagRunOperator
URL: https://github.com/apache/incubator-airflow/pull/2238
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/operators/dagrun_operator.py 
b/airflow/operators/dagrun_operator.py
index c3ffa1ada7..7094c50071 100644
--- a/airflow/operators/dagrun_operator.py
+++ b/airflow/operators/dagrun_operator.py
@@ -12,7 +12,6 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from datetime import datetime
 import logging
 
 from airflow.models import BaseOperator, DagBag
@@ -23,9 +22,27 @@
 
 
 class DagRunOrder(object):
-def __init__(self, run_id=None, payload=None):
-self.run_id = run_id
+def __init__(self, execution_date, run_id=None, payload=None):
+self._run_id = run_id
 self.payload = payload
+self.execution_date = execution_date
+
+@property
+def run_id(self):
+return self._run_id or self._auto_run_id
+
+@run_id.setter
+def run_id(self, value):
+self._run_id = value
+
+@property
+def execution_date(self):
+return self._execution_date
+
+@execution_date.setter
+def execution_date(self, dt):
+self._execution_date = dt
+self._auto_run_id = 'trig__%s' % dt.isoformat()
 
 
 class TriggerDagRunOperator(BaseOperator):
@@ -37,8 +54,11 @@ class TriggerDagRunOperator(BaseOperator):
 :param python_callable: a reference to a python function that will be
 called while passing it the ``context`` object and a placeholder
 object ``obj`` for your callable to fill and return if you want
-a DagRun created. This ``obj`` object contains a ``run_id`` and
-``payload`` attribute that you can modify in your function.
+a DagRun created. This ``obj`` object contains an
+``execution_date``, a ``run_id`` and ``payload`` attribute that
+you can modify in your function.
+The ``execution_date`` is by default the current
+task's instance ``execution_date``.
 The ``run_id`` should be a unique identifier for that DAG run, and
 the payload has to be a picklable object that will be made available
 to your tasks while executing that DAG run. Your function header
@@ -60,7 +80,7 @@ def __init__(
 self.trigger_dag_id = trigger_dag_id
 
 def execute(self, context):
-dro = DagRunOrder(run_id='trig__' + datetime.now().isoformat())
+dro = DagRunOrder(context['execution_date'])
 dro = self.python_callable(context, dro)
 if dro:
 session = settings.Session()
@@ -70,6 +90,7 @@ def execute(self, context):
 run_id=dro.run_id,
 state=State.RUNNING,
 conf=dro.payload,
+execution_date=dro.execution_date,
 external_trigger=True)
 logging.info("Creating DagRun {}".format(dr))
 session.add(dr)
diff --git a/tests/core.py b/tests/core.py
index 353b847c6b..3c6e841bb9 100644
--- a/tests/core.py
+++ b/tests/core.py
@@ -446,6 +446,7 @@ def test_bash_operator_kill(self):
 
 def test_trigger_dagrun(self):
 def trigga(context, obj):
+trigga.run_id = obj.run_id
 if True:
 return obj
 
@@ -456,6 +457,40 @@ def trigga(context, obj):
 dag=self.dag)
 t.run(start_date=DEFAULT_DATE, end_date=DEFAULT_DATE, 
ignore_ti_state=True)
 
+session = settings.Session()
+new_dag_run = session.query(models.DagRun).filter(
+models.DagRun.run_id == trigga.run_id).first()
+self.assertEqual(new_dag_run.execution_date, DEFAULT_DATE)
+
+def test_trigger_dagrun_order_modified(self):
+"""
+Test TriggerDagRunOperator with changes in DagRunOrder
+"""
+new_execution_date = datetime(2016, 1, 1)
+new_dag_run_id = 'manual_run_id'
+payload_key = 'message'
+payload = {payload_key: 'Hello World'}
+
+def trigga(context, obj):
+obj.run_id = new_dag_run_id
+obj.execution_date = new_execution_date
+obj.payload = payload
+if True:
+return obj
+
+t = TriggerDagRunOperator(
+task_id='test_trigger_dagrun',
+trigger_dag_id='example_bash_operator',
+python_callable=trigga,
+dag=self.dag)
+

[GitHub] anxodio commented on a change in pull request #2238: [AIRFLOW-957] Add execution_date parameter to TriggerDagRunOperator

2018-11-06 Thread GitBox
anxodio commented on a change in pull request #2238: [AIRFLOW-957] Add 
execution_date parameter to TriggerDagRunOperator
URL: https://github.com/apache/incubator-airflow/pull/2238#discussion_r231029186
 
 

 ##
 File path: tests/core.py
 ##
 @@ -456,6 +457,40 @@ def trigga(context, obj):
 dag=self.dag)
 t.run(start_date=DEFAULT_DATE, end_date=DEFAULT_DATE, 
ignore_ti_state=True)
 
+session = settings.Session()
+new_dag_run = session.query(models.DagRun).filter(
+models.DagRun.run_id == trigga.run_id).first()
+self.assertEqual(new_dag_run.execution_date, DEFAULT_DATE)
+
+def test_trigger_dagrun_order_modified(self):
+"""
+Test TriggerDagRunOperator with changes in DagRunOrder
+"""
+new_execution_date = datetime(2016, 1, 1)
+new_dag_run_id = 'manual_run_id'
+payload_key = 'message'
+payload = {payload_key: 'Hello World'}
+
+def trigga(context, obj):
+obj.run_id = new_dag_run_id
+obj.execution_date = new_execution_date
+obj.payload = payload
+if True:
 
 Review comment:
   @ron819 yes, I will close it. I think it is a good idea to implement 
something like this, but it's a breaking change, and I think it must be agreed 
with the community first.
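
   For context, a sketch of how the proposed API would have been used (IDs and 
payload are illustrative; an existing `dag` object is assumed):

```python
from airflow.operators.dagrun_operator import TriggerDagRunOperator


def set_run(context, obj):
    # obj is a DagRunOrder; with this PR it also carries an execution_date,
    # defaulting to the triggering task instance's execution_date.
    obj.run_id = 'manual_run_id'
    obj.execution_date = context['execution_date']
    obj.payload = {'message': 'Hello World'}
    return obj  # returning None would skip creating the DagRun


trigger = TriggerDagRunOperator(
    task_id='trigger_downstream',
    trigger_dag_id='example_bash_operator',
    python_callable=set_run,
    dag=dag)
```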


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] anxodio closed pull request #2238: [AIRFLOW-957] Add execution_date parameter to TriggerDagRunOperator

2018-11-06 Thread GitBox
anxodio closed pull request #2238: [AIRFLOW-957] Add execution_date parameter 
to TriggerDagRunOperator
URL: https://github.com/apache/incubator-airflow/pull/2238
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/operators/dagrun_operator.py 
b/airflow/operators/dagrun_operator.py
index c3ffa1ada7..7094c50071 100644
--- a/airflow/operators/dagrun_operator.py
+++ b/airflow/operators/dagrun_operator.py
@@ -12,7 +12,6 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from datetime import datetime
 import logging
 
 from airflow.models import BaseOperator, DagBag
@@ -23,9 +22,27 @@
 
 
 class DagRunOrder(object):
-def __init__(self, run_id=None, payload=None):
-self.run_id = run_id
+def __init__(self, execution_date, run_id=None, payload=None):
+self._run_id = run_id
 self.payload = payload
+self.execution_date = execution_date
+
+@property
+def run_id(self):
+return self._run_id or self._auto_run_id
+
+@run_id.setter
+def run_id(self, value):
+self._run_id = value
+
+@property
+def execution_date(self):
+return self._execution_date
+
+@execution_date.setter
+def execution_date(self, dt):
+self._execution_date = dt
+self._auto_run_id = 'trig__%s' % dt.isoformat()
 
 
 class TriggerDagRunOperator(BaseOperator):
@@ -37,8 +54,11 @@ class TriggerDagRunOperator(BaseOperator):
 :param python_callable: a reference to a python function that will be
 called while passing it the ``context`` object and a placeholder
 object ``obj`` for your callable to fill and return if you want
-a DagRun created. This ``obj`` object contains a ``run_id`` and
-``payload`` attribute that you can modify in your function.
+a DagRun created. This ``obj`` object contains an
+``execution_date``, a ``run_id`` and ``payload`` attribute that
+you can modify in your function.
+The ``execution_date`` is by default the current
+task's instance ``execution_date``.
 The ``run_id`` should be a unique identifier for that DAG run, and
 the payload has to be a picklable object that will be made available
 to your tasks while executing that DAG run. Your function header
@@ -60,7 +80,7 @@ def __init__(
 self.trigger_dag_id = trigger_dag_id
 
 def execute(self, context):
-dro = DagRunOrder(run_id='trig__' + datetime.now().isoformat())
+dro = DagRunOrder(context['execution_date'])
 dro = self.python_callable(context, dro)
 if dro:
 session = settings.Session()
@@ -70,6 +90,7 @@ def execute(self, context):
 run_id=dro.run_id,
 state=State.RUNNING,
 conf=dro.payload,
+execution_date=dro.execution_date,
 external_trigger=True)
 logging.info("Creating DagRun {}".format(dr))
 session.add(dr)
diff --git a/tests/core.py b/tests/core.py
index 353b847c6b..3c6e841bb9 100644
--- a/tests/core.py
+++ b/tests/core.py
@@ -446,6 +446,7 @@ def test_bash_operator_kill(self):
 
 def test_trigger_dagrun(self):
 def trigga(context, obj):
+trigga.run_id = obj.run_id
 if True:
 return obj
 
@@ -456,6 +457,40 @@ def trigga(context, obj):
 dag=self.dag)
 t.run(start_date=DEFAULT_DATE, end_date=DEFAULT_DATE, 
ignore_ti_state=True)
 
+session = settings.Session()
+new_dag_run = session.query(models.DagRun).filter(
+models.DagRun.run_id == trigga.run_id).first()
+self.assertEqual(new_dag_run.execution_date, DEFAULT_DATE)
+
+def test_trigger_dagrun_order_modified(self):
+"""
+Test TriggerDagRunOperator with changes in DagRunOrder
+"""
+new_execution_date = datetime(2016, 1, 1)
+new_dag_run_id = 'manual_run_id'
+payload_key = 'message'
+payload = {payload_key: 'Hello World'}
+
+def trigga(context, obj):
+obj.run_id = new_dag_run_id
+obj.execution_date = new_execution_date
+obj.payload = payload
+if True:
+return obj
+
+t = TriggerDagRunOperator(
+task_id='test_trigger_dagrun',
+trigger_dag_id='example_bash_operator',
+python_callable=trigga,
+dag=self.dag)
+t.run(start_date=DEFAULT_DATE, end_date=DEFAULT_DATE, 
ignore_ti_state=True)
+
+session = settings.Session()
+new_dag_run = session.query(models.DagRun).filter(
+models.DagRun.run_id == new_dag_run_id).first()
+

[GitHub] yangaws commented on a change in pull request #4126: [AIRFLOW-2524] More AWS SageMaker operators, sensors for model, endpoint-config and endpoint

2018-11-06 Thread GitBox
yangaws commented on a change in pull request #4126: [AIRFLOW-2524] More AWS 
SageMaker operators, sensors for model, endpoint-config and endpoint
URL: https://github.com/apache/incubator-airflow/pull/4126#discussion_r231028458
 
 

 ##
 File path: airflow/contrib/operators/sagemaker_training_operator.py
 ##
 @@ -29,23 +29,26 @@ class SageMakerTrainingOperator(SageMakerBaseOperator):
 
 This operator returns The ARN of the training job created in Amazon 
SageMaker.
 
-:param config: The configuration necessary to start a training job 
(templated)
+:param config: The configuration necessary to start a training job 
(templated).
+
+For details of the configuration parameter, See:
+
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_training_job
 :type config: dict
 :param aws_conn_id: The AWS connection ID to use.
 :type aws_conn_id: str
-:param wait_for_completion: if the operator should block until training 
job finishes
+:param wait_for_completion: If wait is set to True, the time interval, in 
seconds,
+that the operation waits to check the status of the training job.
 :type wait_for_completion: bool
 :param print_log: if the operator should print the cloudwatch log during 
training
 :type print_log: bool
 :param check_interval: if wait is set to be true, this is the time interval
 in seconds which the operator will check the status of the training job
 :type check_interval: int
-:param max_ingestion_time: if wait is set to be true, the operator will 
fail
-if the training job hasn't finish within the max_ingestion_time in 
seconds
-(Caution: be careful to set this parameters because training can take 
very long)
-Setting it to None implies no timeout.
+:param max_ingestion_time: If wait is set to True, the operation fails if 
the training job
+doesn't finish within max_ingestion_time seconds. If you set this 
parameter to None,
+the operation does not timeout.
 :type max_ingestion_time: int
-"""
+"""  # noqa
 
 Review comment:
   Just the external link is too long. I am not sure if there's a way to split 
a long link across multiple lines.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io commented on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-11-06 Thread GitBox
codecov-io commented on issue #2460: [AIRFLOW-1424] make the next execution 
date of DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-436165897
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/2460?src=pr=h1)
 Report
   > Merging 
[#2460](https://codecov.io/gh/apache/incubator-airflow/pull/2460?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/80a3d6ac78c5c13abb8826b9dcbe0529f60fed81?src=pr=desc)
 will **increase** coverage by `0.02%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/2460/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/2460?src=pr=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##           master    #2460      +/-   ##
   =========================================
   + Coverage   76.67%   76.69%   +0.02%    
   =========================================
     Files         199      199             
     Lines       16212    16233      +21    
   =========================================
   + Hits        12430    12450      +20    
   - Misses       3782     3783       +1
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/2460?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/2460/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `92.11% <100%> (+0.02%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/2460?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/2460?src=pr=footer).
 Last update 
[80a3d6a...da9b738](https://codecov.io/gh/apache/incubator-airflow/pull/2460?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] yangaws commented on a change in pull request #4126: [AIRFLOW-2524] More AWS SageMaker operators, sensors for model, endpoint-config and endpoint

2018-11-06 Thread GitBox
yangaws commented on a change in pull request #4126: [AIRFLOW-2524] More AWS 
SageMaker operators, sensors for model, endpoint-config and endpoint
URL: https://github.com/apache/incubator-airflow/pull/4126#discussion_r231028384
 
 

 ##
 File path: airflow/contrib/operators/sagemaker_endpoint_config_operator.py
 ##
 @@ -0,0 +1,67 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.contrib.operators.sagemaker_base_operator import 
SageMakerBaseOperator
+from airflow.utils.decorators import apply_defaults
+from airflow.exceptions import AirflowException
+
+
+class SageMakerEndpointConfigOperator(SageMakerBaseOperator):
+
+"""
+Create a SageMaker endpoint config.
+
+This operator returns The ARN of the endpoint config created in Amazon 
SageMaker
+
+:param config: The configuration necessary to create an endpoint config.
+
+For details of the configuration parameter, See:
+
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_endpoint_config
+:type config: dict
+:param aws_conn_id: The AWS connection ID to use.
+:type aws_conn_id: str
+"""  # noqa
 
 Review comment:
   Just the external link is too long. I am not sure if there's a way to split 
a long link across multiple lines.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services