[jira] [Commented] (AIRFLOW-5532) Missed imagePullSecrets from pod created from k8s executor

2019-09-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935195#comment-16935195
 ] 

ASF GitHub Bot commented on AIRFLOW-5532:
-

shawnzhu commented on pull request #6166: [AIRFLOW-5532] Fix imagePullSecrets 
in pod created from k8s executor
URL: https://github.com/apache/airflow/pull/6166
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-5532
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
  Bug fix to support pulling a private container image.
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
  
`tests.kubernetes.TestKubernetesWorkerConfiguration.test_make_pod_with_image_pull_secrets()`
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Missed imagePullSecrets from pod created from k8s executor
> --
>
> Key: AIRFLOW-5532
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5532
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: executor-kubernetes
>Affects Versions: 2.0.0
> Environment: container image apache/airflow:master-python3.7-ci-slim 
> which appears to be v2.0.0-dev 
>Reporter: Ke Zhu
>Assignee: Daniel Imberman
>Priority: Major
>
> I used a container image built on top of 
> *apache/airflow:master-python3.7-ci-slim* with the k8s executor configured in 
> airflow.cfg. All config options under the [kubernetes] and 
> [kubernetes_secrets] sections are recognized successfully except 
> _image_pull_secrets_. 
>  
> Then I inspected the code at 
> [https://github.com/apache/airflow/blob/47801057989046dfcf7b424ce54afee103803815/airflow/kubernetes/worker_configuration.py#L367-L387]
>  where {{image_pull_secrets}} is NOT specified at all when constructing a 
> {{PodGenerator}} object. Reviewing further, I noticed that the method 
> {{WorkerConfiguration._get_image_pull_secrets()}} is not used at all, even 
> though it should be.
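
For context, a minimal, self-contained sketch of the kind of wiring the linked PR adds (the helper and option names follow the issue; the real {{PodGenerator}} constructor arguments may differ):

{code:python}
from typing import List


class Secret:
    """Stand-in for a k8s imagePullSecrets entry (hypothetical, for illustration only)."""
    def __init__(self, name: str):
        self.name = name


def get_image_pull_secrets(image_pull_secrets: str) -> List[Secret]:
    # Mirrors the idea behind WorkerConfiguration._get_image_pull_secrets():
    # the airflow.cfg option is a comma-separated list of secret names.
    if not image_pull_secrets:
        return []
    return [Secret(name.strip()) for name in image_pull_secrets.split(',') if name.strip()]


# The fix is essentially to pass the parsed secrets when the worker pod is built,
# e.g. PodGenerator(..., image_pull_secrets=get_image_pull_secrets(conf_value)).
print([s.name for s in get_image_pull_secrets('regcred, team-registry')])
{code}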



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] shawnzhu opened a new pull request #6166: [AIRFLOW-5532] Fix imagePullSecrets in pod created from k8s executor

2019-09-21 Thread GitBox
shawnzhu opened a new pull request #6166: [AIRFLOW-5532] Fix imagePullSecrets 
in pod created from k8s executor
URL: https://github.com/apache/airflow/pull/6166
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-5532
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
  Bug fix to support pulling a private container image.
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
  
`tests.kubernetes.TestKubernetesWorkerConfiguration.test_make_pod_with_image_pull_secrets()`
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-5532) Missed imagePullSecrets from pod created from k8s executor

2019-09-21 Thread Ke Zhu (Jira)
Ke Zhu created AIRFLOW-5532:
---

 Summary: Missed imagePullSecrets from pod created from k8s executor
 Key: AIRFLOW-5532
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5532
 Project: Apache Airflow
  Issue Type: Bug
  Components: executor-kubernetes
Affects Versions: 2.0.0
 Environment: container image apache/airflow:master-python3.7-ci-slim 
which appears to be v2.0.0-dev 
Reporter: Ke Zhu
Assignee: Daniel Imberman


I used a container image built on top of *apache/airflow:master-python3.7-ci-slim* 
with the k8s executor configured in airflow.cfg. All config options under the 
[kubernetes] and [kubernetes_secrets] sections are recognized successfully except 
_image_pull_secrets_. 

 

Then I inspected the code at 
[https://github.com/apache/airflow/blob/47801057989046dfcf7b424ce54afee103803815/airflow/kubernetes/worker_configuration.py#L367-L387]
 where {{image_pull_secrets}} is NOT specified at all when constructing a 
{{PodGenerator}} object. Reviewing further, I noticed that the method 
{{WorkerConfiguration._get_image_pull_secrets()}} is not used at all, even 
though it should be.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] codecov-io edited a comment on issue #6102: [AIRFLOW-4309] Remove Broken Dag error after Dag is deleted

2019-09-21 Thread GitBox
codecov-io edited a comment on issue #6102: [AIRFLOW-4309] Remove Broken Dag 
error after Dag is deleted
URL: https://github.com/apache/airflow/pull/6102#issuecomment-531431531
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6102?src=pr=h1) 
Report
   > Merging 
[#6102](https://codecov.io/gh/apache/airflow/pull/6102?src=pr=desc) into 
[master](https://codecov.io/gh/apache/airflow/commit/31e7d319ea0d77fb639ba4ed1c46197f853552a0?src=pr=desc)
 will **increase** coverage by `<.01%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6102/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/6102?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #6102      +/-   ##
   ==========================================
   + Coverage   80.05%   80.06%   +<.01%     
   ==========================================
     Files         608      608              
     Lines       35029    35031       +2     
   ==========================================
   + Hits        28041    28046       +5     
   + Misses       6988     6985       -3
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6102?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/api/common/experimental/delete\_dag.py](https://codecov.io/gh/apache/airflow/pull/6102/diff?src=pr=tree#diff-YWlyZmxvdy9hcGkvY29tbW9uL2V4cGVyaW1lbnRhbC9kZWxldGVfZGFnLnB5)
 | `87.5% <100%> (+0.54%)` | :arrow_up: |
   | 
[airflow/models/\_\_init\_\_.py](https://codecov.io/gh/apache/airflow/pull/6102/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvX19pbml0X18ucHk=)
 | `100% <100%> (ø)` | :arrow_up: |
   | 
[airflow/models/taskinstance.py](https://codecov.io/gh/apache/airflow/pull/6102/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMvdGFza2luc3RhbmNlLnB5)
 | `93.72% <0%> (+0.5%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6102?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6102?src=pr=footer). 
Last update 
[31e7d31...4861ea1](https://codecov.io/gh/apache/airflow/pull/6102?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Resolved] (AIRFLOW-5489) bash_sensor: Remove unneeded assignment of variable

2019-09-21 Thread Kaxil Naik (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik resolved AIRFLOW-5489.
-
Fix Version/s: 1.10.6
   Resolution: Fixed

> bash_sensor: Remove unneeded assignment of variable
> --
>
> Key: AIRFLOW-5489
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5489
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 1.10.6
>Reporter: Jakob Homan
>Priority: Minor
>  Labels: ccoss2019, newbie
> Fix For: 1.10.6
>
>
> Note: This ticket's being created to facilitate a new contributor's workshop 
> for Airflow. After the workshop has completed, I'll mark these all available 
> for anyone that might like to take them on.
> The `line` variable is assigned to `''` but then immediately reassigned in 
> the loop.  This first assignment should be deleted.
> airflow/contrib/sensors/bash_sensor.py:83
> {code:java}
> line = ''
> for line in iter(sp.stdout.readline, b''):
> line = line.decode(self.output_encoding).strip()
> self.log.info(line)
> sp.wait()
> self.log.info("Command exited with return code %s", sp.returncode) {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (AIRFLOW-5492) Missing docstring for hive .py

2019-09-21 Thread Kaxil Naik (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik resolved AIRFLOW-5492.
-
Fix Version/s: 2.0.0
   Resolution: Fixed

> Missing docstring for hive .py
> --
>
> Key: AIRFLOW-5492
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5492
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: utils
>Affects Versions: 1.10.6
>Reporter: Jakob Homan
>Priority: Minor
>  Labels: ccoss2019, newbie
> Fix For: 2.0.0
>
>
> Note: This ticket's being created to facilitate a new contributor's workshop 
> for Airflow. After the workshop has completed, I'll mark these all available 
> for anyone that might like to take them on.
> We need to add docstrings for both {{schema}} and {{metastore_conn_id}}.
> airflow/macros/hive.py:83
> {code:java}
> def closest_ds_partition(
> table, ds, before=True, schema="default",
> metastore_conn_id='metastore_default'):
> """
> This function finds the date in a list closest to the target date.
> An optional parameter can be given to get the closest before or after.
> :param table: A hive table name
> :type table: str
> :param ds: A datestamp ``%Y-%m-%d`` e.g. ``yyyy-mm-dd``
> :type ds: list[datetime.date]
> :param before: closest before (True), after (False) or either side of ds
> :type before: bool or None
> :returns: The closest date
> :rtype: str or None {code}
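
A runnable sketch of what the two added entries could look like; the descriptions are assumptions for illustration, not the final wording:

{code:python}
def closest_ds_partition(table, ds, before=True, schema="default",
                         metastore_conn_id='metastore_default'):
    """
    This function finds the date in a list closest to the target date.
    An optional parameter can be given to get the closest before or after.

    :param schema: hive schema (database) in which to look up the table
    :type schema: str
    :param metastore_conn_id: connection id of the Hive metastore to query
    :type metastore_conn_id: str
    """
    # Body omitted; only the added docstring entries are relevant here.
{code}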



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (AIRFLOW-5491) mark_tasks pydoc is incorrect

2019-09-21 Thread Kaxil Naik (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik resolved AIRFLOW-5491.
-
Fix Version/s: 1.10.6
   Resolution: Fixed

> mark_tasks pydoc is incorrect
> -
>
> Key: AIRFLOW-5491
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5491
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: security
>Affects Versions: 1.10.6
>Reporter: Jakob Homan
>Priority: Minor
>  Labels: ccoss2019, newbie
> Fix For: 1.10.6
>
>
> Note: This ticket's being created to facilitate a new contributor's workshop 
> for Airflow. After the workshop has completed, I'll mark these all available 
> for anyone that might like to take them on.
> The pydoc for set_state is incorrect: it names the first param {{task}} 
> instead of {{tasks}} (which is what the code uses), and the doc itself treats 
> it as a single task instead of an iterable.
> airflow/api/common/experimental/mark_tasks.py:62
> {code:java}
> def set_state(
> tasks: Iterable[BaseOperator],
> execution_date: datetime.datetime,
> upstream: bool = False,
> downstream: bool = False,
> future: bool = False,
> past: bool = False,
> state: str = State.SUCCESS,
> commit: bool = False,
> session=None):  # pylint: disable=too-many-arguments,too-many-locals
> """
> Set the state of a task instance and if needed its relatives. Can set 
> state
> for future tasks (calculated from execution_date) and retroactively
> for past tasks. Will verify integrity of past dag runs in order to create
> tasks that did not exist. It will not create dag runs that are missing
> on the schedule (but it will as for subdag dag runs if needed).
> :param task: the task from which to work. task.task.dag needs to be set 
> {code}
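
A sketch of the corrected parameter documentation; the signature is trimmed and the wording is an assumption:

{code:python}
def set_state(tasks, execution_date, upstream=False, downstream=False,
              future=False, past=False, state="success", commit=False,
              session=None):
    """
    Set the state of a task instance and, if needed, its relatives.

    :param tasks: an iterable of tasks from which to work;
        task.dag needs to be set on each of them
    :type tasks: Iterable[BaseOperator]
    """
    # Body omitted; only the corrected ``:param tasks:`` entry matters here.
{code}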



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (AIRFLOW-5492) Missing docstring for hive .py

2019-09-21 Thread Kaxil Naik (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik updated AIRFLOW-5492:

Fix Version/s: (was: 2.0.0)
   1.10.6

> Missing docstring for hive .py
> --
>
> Key: AIRFLOW-5492
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5492
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: utils
>Affects Versions: 1.10.6
>Reporter: Jakob Homan
>Priority: Minor
>  Labels: ccoss2019, newbie
> Fix For: 1.10.6
>
>
> Note: This ticket's being created to facilitate a new contributor's workshop 
> for Airflow. After the workshop has completed, I'll mark these all available 
> for anyone that might like to take them on.
> We need to add docstrings for both {{schema}} and {{metastore_conn_id}}.
> airflow/macros/hive.py:83
> {code:java}
> def closest_ds_partition(
> table, ds, before=True, schema="default",
> metastore_conn_id='metastore_default'):
> """
> This function finds the date in a list closest to the target date.
> An optional parameter can be given to get the closest before or after.
> :param table: A hive table name
> :type table: str
> :param ds: A datestamp ``%Y-%m-%d`` e.g. ``yyyy-mm-dd``
> :type ds: list[datetime.date]
> :param before: closest before (True), after (False) or either side of ds
> :type before: bool or None
> :returns: The closest date
> :rtype: str or None {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (AIRFLOW-5495) Remove unneeded parens in dataproc.py

2019-09-21 Thread Kaxil Naik (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik resolved AIRFLOW-5495.
-
Fix Version/s: 2.0.0
   Resolution: Fixed

> Remove unneeded parens in dataproc.py
> -
>
> Key: AIRFLOW-5495
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5495
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 1.10.6
>Reporter: Jakob Homan
>Assignee: Adan Christian Rosales Ornelas
>Priority: Minor
>  Labels: ccoss2019, newbie
> Fix For: 2.0.0
>
>
> Note: This ticket's being created to facilitate a new contributor's workshop 
> for Airflow. After the workshop has completed, I'll mark these all available 
> for anyone that might like to take them on.
> The parens around {{self.custom_image_project_id}} don't need to be there; we 
> should remove them.
> airflow/gcp/operators/dataproc.py:409
> {code:java}
> elif self.custom_image:
> project_id = self.custom_image_project_id if 
> (self.custom_image_project_id) else self.project_id
> custom_image_url = 'https://www.googleapis.com/compute/beta/projects/' \ 
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] codecov-io commented on issue #6163: [AIRFLOW-4574] SSHHook private_key may only be supplied in extras

2019-09-21 Thread GitBox
codecov-io commented on issue #6163: [AIRFLOW-4574] SSHHook private_key may 
only be supplied in extras
URL: https://github.com/apache/airflow/pull/6163#issuecomment-533831042
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/6163?src=pr=h1) 
Report
   > :exclamation: No coverage uploaded for pull request base 
(`master@123479c`). [Click here to learn what that 
means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/airflow/pull/6163/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/6163?src=pr=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##             master    #6163   +/-   ##
   =========================================
     Coverage          ?   80.06%           
   =========================================
     Files             ?      608           
     Lines             ?    35030           
     Branches          ?        0           
   =========================================
     Hits              ?    28046           
     Misses            ?     6984           
     Partials          ?        0
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/airflow/pull/6163?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[airflow/contrib/hooks/ssh\_hook.py](https://codecov.io/gh/apache/airflow/pull/6163/diff?src=pr=tree#diff-YWlyZmxvdy9jb250cmliL2hvb2tzL3NzaF9ob29rLnB5)
 | `88.49% <100%> (ø)` | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/airflow/pull/6163?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/airflow/pull/6163?src=pr=footer). 
Last update 
[123479c...76f48e8](https://codecov.io/gh/apache/airflow/pull/6163?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] kaxil commented on issue #5908: Revert "[AIRFLOW-4797] Improve performance and behaviour of zombie de…

2019-09-21 Thread GitBox
kaxil commented on issue #5908: Revert "[AIRFLOW-4797] Improve performance and 
behaviour of zombie de…
URL: https://github.com/apache/airflow/pull/5908#issuecomment-533830055
 
 
   Hmm, rethinking it. Never mind, I think the PR is fine as is; it doesn't need a new Jira. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-5531) Replace deprecated log.warn() with log.warning()

2019-09-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935153#comment-16935153
 ] 

ASF GitHub Bot commented on AIRFLOW-5531:
-

kaxil commented on pull request #6165: [AIRFLOW-5531] Replace deprecated 
log.warn() with log.warning()
URL: https://github.com/apache/airflow/pull/6165
 
 
   - Add pre-commit hook to detect this
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. 
 - https://issues.apache.org/jira/browse/AIRFLOW-5531
   
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   We should replace log.warn, which is deprecated, with log.warning.
   
   Additionally, this PR adds a pre-commit hook to detect these cases.
   
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   Add pre-commit hooks
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Replace deprecated log.warn() with log.warning()
> 
>
> Key: AIRFLOW-5531
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5531
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: logging
>Affects Versions: 2.0.0, 1.10.5
>Reporter: Kaxil Naik
>Assignee: Kaxil Naik
>Priority: Minor
> Fix For: 2.0.0, 1.10.6
>
>
> We should replace *log.warn*, which is deprecated, with *log.warning*.
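
A minimal runnable illustration of the deprecation being addressed; the detection regex is only a rough stand-in for the pre-commit hook added by the PR:

{code:python}
import logging
import re

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger(__name__)

log.warning("preferred spelling")  # log.warn(...) is a deprecated alias of log.warning(...)

# Rough idea of what a detection hook could grep for (illustrative, not the PR's exact rule):
pattern = re.compile(r'\.warn\(')
print(bool(pattern.search('self.log.warn("old style")')))  # True -> flag it
{code}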



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] kaxil opened a new pull request #6165: [AIRFLOW-5531] Replace deprecated log.warn() with log.warning()

2019-09-21 Thread GitBox
kaxil opened a new pull request #6165: [AIRFLOW-5531] Replace deprecated 
log.warn() with log.warning()
URL: https://github.com/apache/airflow/pull/6165
 
 
   - Add pre-commit hook to detect this
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. 
 - https://issues.apache.org/jira/browse/AIRFLOW-5531
   
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   We should replace log.warn, which is deprecated, with log.warning.
   
   Additionally, this PR adds a pre-commit hook to detect these cases.
   
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   Add pre-commit hooks
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-5531) Replace deprecated log.warn() with log.warning()

2019-09-21 Thread Kaxil Naik (Jira)
Kaxil Naik created AIRFLOW-5531:
---

 Summary: Replace deprecated log.warn() with log.warning()
 Key: AIRFLOW-5531
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5531
 Project: Apache Airflow
  Issue Type: Improvement
  Components: logging
Affects Versions: 1.10.5, 2.0.0
Reporter: Kaxil Naik
Assignee: Kaxil Naik
 Fix For: 2.0.0, 1.10.6


We should replace *log.warn*, which is deprecated, with *log.warning*.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] nuclearpinguin commented on issue #4785: [AIRFLOW-3965] Fixing GoogleCloudStorageToBigQueryOperator failing for jobs outside US and EU

2019-09-21 Thread GitBox
nuclearpinguin commented on issue #4785: [AIRFLOW-3965] Fixing 
GoogleCloudStorageToBigQueryOperator failing for jobs outside US and EU
URL: https://github.com/apache/airflow/pull/4785#issuecomment-533828465
 
 
   I think yes


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] kaxil commented on a change in pull request #6157: [AIRFLOW-774] Fix long-broken DAG parsing Statsd metrics

2019-09-21 Thread GitBox
kaxil commented on a change in pull request #6157: [AIRFLOW-774] Fix 
long-broken DAG parsing Statsd metrics
URL: https://github.com/apache/airflow/pull/6157#discussion_r326873339
 
 

 ##
 File path: UPDATING.md
 ##
 @@ -40,6 +40,16 @@ assists users migrating to a new version.
 
 ## Airflow Master
 
+### Some DAG Processing metrics have been renamed
+
+The following metrics are deprected and won't be emitted in Airflow 2.0:
 
 Review comment:
   ```suggestion
   The following metrics are deprecated and won't be emitted in Airflow 2.0:
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk merged pull request #6164: [AIRFLOW-XXX] Fix backticks in the new file

2019-09-21 Thread GitBox
potiuk merged pull request #6164: [AIRFLOW-XXX] Fix backticks in the new file
URL: https://github.com/apache/airflow/pull/6164
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk opened a new pull request #6164: [AIRFLOW-XXX] Fix backtick issues in .rst files & Add Precommit hook …

2019-09-21 Thread GitBox
potiuk opened a new pull request #6164: [AIRFLOW-XXX] Fix backtick issues in 
.rst files & Add Precommit hook …
URL: https://github.com/apache/airflow/pull/6164
 
 
   …(#6162)
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-5434) Use hook to provide credentials in GKEPodOperator

2019-09-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935147#comment-16935147
 ] 

ASF GitHub Bot commented on AIRFLOW-5434:
-

mik-laj commented on pull request #6050: [AIRFLOW-5434] Use hook to provide 
credentials in GKEPodOperator
URL: https://github.com/apache/airflow/pull/6050
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Use hook to provide credentials in GKEPodOperator
> -
>
> Key: AIRFLOW-5434
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5434
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 1.10.5
>Reporter: Kamil Bregula
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-5434) Use hook to provide credentials in GKEPodOperator

2019-09-21 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935148#comment-16935148
 ] 

ASF subversion and git services commented on AIRFLOW-5434:
--

Commit 86b4caac9a1421500ba7eeb10a147b0f731bab08 in airflow's branch 
refs/heads/master from Kamil Breguła
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=86b4caa ]

[AIRFLOW-5434] Use hook to provide credentials in GKEPodOperator (#6050)



> Use hook to provide credentials in GKEPodOperator
> -
>
> Key: AIRFLOW-5434
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5434
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 1.10.5
>Reporter: Kamil Bregula
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] mik-laj merged pull request #6050: [AIRFLOW-5434] Use hook to provide credentials in GKEPodOperator

2019-09-21 Thread GitBox
mik-laj merged pull request #6050: [AIRFLOW-5434] Use hook to provide 
credentials in GKEPodOperator
URL: https://github.com/apache/airflow/pull/6050
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj commented on issue #5598: [AIRFLOW-733][AIRFLOW-883] Apply default_args when setting `op.dag = dag` or `dag >> op`

2019-09-21 Thread GitBox
mik-laj commented on issue #5598: [AIRFLOW-733][AIRFLOW-883] Apply default_args 
when setting `op.dag = dag` or `dag >> op`
URL: https://github.com/apache/airflow/pull/5598#issuecomment-533825856
 
 
   Any progress on this PR? Can I help with it? Travis is sad and this PR 
requires a rebase.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj commented on issue #5676: [AIRFLOW-5062] Add support for getting/setting ACL in S3Hook

2019-09-21 Thread GitBox
mik-laj commented on issue #5676: [AIRFLOW-5062] Add support for 
getting/setting ACL in S3Hook
URL: https://github.com/apache/airflow/pull/5676#issuecomment-533825771
 
 
   Travis is sad. Can you fix it?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache Livy REST operator

2019-09-21 Thread GitBox
feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache 
Livy REST operator
URL: https://github.com/apache/airflow/pull/6090#discussion_r326870689
 
 

 ##
 File path: airflow/contrib/hooks/livy_hook.py
 ##
 @@ -0,0 +1,297 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+This module contains the Apache Livy hook.
+"""
+
+import re
+from enum import Enum
+import json
+import requests
+
+from airflow.exceptions import AirflowException
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+
+class BatchState(Enum):
+"""
+Batch session states
+"""
+NOT_STARTED = 'not_started'
+STARTING = 'starting'
+RUNNING = 'running'
+IDLE = 'idle'
+BUSY = 'busy'
+SHUTTING_DOWN = 'shutting_down'
+ERROR = 'error'
+DEAD = 'dead'
+KILLED = 'killed'
+SUCCESS = 'success'
+
+
+TERMINAL_STATES = {
+BatchState.SUCCESS,
+BatchState.DEAD,
+BatchState.KILLED,
+BatchState.ERROR,
+}
+
+
+class LivyHook(BaseHook, LoggingMixin):
+"""
+Hook for Apache Livy through the REST API.
+
+For more information about the API refer to
+https://livy.apache.org/docs/latest/rest-api.html
+
+:param livy_conn_id: reference to a pre-defined Livy Connection.
+:type livy_conn_id: str
+"""
+def __init__(self, livy_conn_id='livy_default'):
+super(LivyHook, self).__init__(livy_conn_id)
+self._livy_conn_id = livy_conn_id
+self._build_base_url()
+
+def _build_base_url(self):
+"""
+Build connection URL
+"""
+params = self.get_connection(self._livy_conn_id)
+
+base_url = params.host
+
+if not base_url:
+raise AirflowException("Missing Livy endpoint hostname")
+
+if '://' not in base_url:
+base_url = '{}://{}'.format('http', base_url)
+if not re.search(r':\d+$', base_url):
+base_url = '{}:{}'.format(base_url, str(params.port or 8998))
+
+self._base_url = base_url
+
+def get_conn(self):
+pass
+
+def post_batch(self, *args, **kwargs):
+"""
+Perform request to submit batch
+"""
+
+batch_submit_body = json.dumps(LivyHook.build_post_batch_body(*args, 
**kwargs))
+headers = {'Content-Type': 'application/json'}
+
+self.log.info("Submitting job {} to {}".format(batch_submit_body, 
self._base_url))
+response = requests.post(self._base_url + '/batches', 
data=batch_submit_body, headers=headers)
+self.log.debug("Got response: {}".format(response.text))
+
+if response.status_code != 201:
+raise AirflowException("Could not submit batch. Status code: 
{}".format(response.status_code))
+
+batch_id = LivyHook._parse_post_response(response.json())
+if batch_id is None:
+raise AirflowException("Unable to parse a batch session id")
+self.log.info("Batch submitted with session id: {}".format(batch_id))
+
+return batch_id
+
+def get_batch(self, session_id):
+"""
+Fetch info about the specified batch
+:param session_id: identifier of the batch sessions
+:type session_id: int
+"""
+LivyHook._validate_session_id(session_id)
+
+self.log.debug("Fetching info for batch session {}".format(session_id))
+response = requests.get('{}/batches/{}'.format(self._base_url, 
session_id))
+
+if response.status_code != 200:
+self.log.warning("Got status code {} for session 
{}".format(response.status_code, session_id))
+raise AirflowException("Unable to fetch batch with id: 
{}".format(session_id))
+
+return response.json()
+
+def get_batch_state(self, session_id):
+"""
+Fetch the state of the specified batch
+:param session_id: identifier of the batch sessions
+:type session_id: int
+"""
+LivyHook._validate_session_id(session_id)
+
+self.log.debug("Fetching info for batch session {}".format(session_id))
+response = requests.get('{}/batches/{}/state'.format(self._base_url, 
session_id))
+
+ 

[GitHub] [airflow] feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache Livy REST operator

2019-09-21 Thread GitBox
feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache 
Livy REST operator
URL: https://github.com/apache/airflow/pull/6090#discussion_r326871022
 
 

 ##
 File path: airflow/contrib/operators/livy_operator.py
 ##
 @@ -0,0 +1,175 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+This module contains the Apache Livy operator.
+"""
+
+from time import sleep, gmtime, mktime
+
+from airflow.models import BaseOperator
+from airflow.utils.decorators import apply_defaults
+from airflow.exceptions import AirflowException
+from airflow.contrib.hooks.livy_hook import LivyHook, BatchState, 
TERMINAL_STATES
+
+
+class LivyOperator(BaseOperator):
+"""
 
 Review comment:
   Could you add a short description here of what the operator does?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache Livy REST operator

2019-09-21 Thread GitBox
feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache 
Livy REST operator
URL: https://github.com/apache/airflow/pull/6090#discussion_r326871046
 
 

 ##
 File path: airflow/contrib/operators/livy_operator.py
 ##
 @@ -0,0 +1,175 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+This module contains the Apache Livy operator.
+"""
+
+from time import sleep, gmtime, mktime
+
+from airflow.models import BaseOperator
+from airflow.utils.decorators import apply_defaults
+from airflow.exceptions import AirflowException
+from airflow.contrib.hooks.livy_hook import LivyHook, BatchState, 
TERMINAL_STATES
+
+
+class LivyOperator(BaseOperator):
+"""
+:param file: Path of the  file containing the application to execute 
(required).
+:type file: str
+:param class_name: Application Java/Spark main class string.
+:type class_name: str
+:param args: Command line arguments for the application s.
 
 Review comment:
   ```suggestion
   :param args: Command line arguments for the application.
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache Livy REST operator

2019-09-21 Thread GitBox
feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache 
Livy REST operator
URL: https://github.com/apache/airflow/pull/6090#discussion_r326871710
 
 

 ##
 File path: tests/contrib/hooks/test_livy_hook.py
 ##
 @@ -0,0 +1,428 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import unittest
+from unittest.mock import patch, MagicMock
+import json
+from requests.exceptions import RequestException
+
+from airflow import AirflowException
+from airflow.models import Connection
+from airflow.utils import db
+
+from airflow.contrib.hooks.livy_hook import LivyHook, BatchState
+
+TEST_ID = 100
+SAMPLE_GET_RESPONSE = {'id': TEST_ID, 'state': BatchState.SUCCESS.value}
+
+
+class TestLivyHook(unittest.TestCase):
+
+def setUp(self):
 
 Review comment:
   I think 
[setUpClass](https://docs.python.org/3/library/unittest.html#unittest.TestCase.setUpClass)
 would also work and it wouldn't call those "db merges" before each test. And 
you can specify 
[tearDownClass](https://docs.python.org/3/library/unittest.html#unittest.TestCase.tearDownClass)
 to drop the connections.
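
A small sketch of the pattern being suggested, with a plain list standing in for the db connection merges of the real test:

```python
import unittest


class TestLivyHookSetup(unittest.TestCase):

    @classmethod
    def setUpClass(cls):
        # Runs once for the whole class instead of before every test,
        # so the connections are only "merged" a single time.
        cls.connections = ["livy_default", "livy_test"]

    @classmethod
    def tearDownClass(cls):
        # Clean up once after all tests, e.g. drop the connections again.
        cls.connections.clear()

    def test_default_connection_present(self):
        self.assertIn("livy_default", self.connections)


if __name__ == "__main__":
    unittest.main()
```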
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache Livy REST operator

2019-09-21 Thread GitBox
feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache 
Livy REST operator
URL: https://github.com/apache/airflow/pull/6090#discussion_r326871060
 
 

 ##
 File path: airflow/contrib/operators/livy_operator.py
 ##
 @@ -0,0 +1,175 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+This module contains the Apache Livy operator.
+"""
+
+from time import sleep, gmtime, mktime
+
+from airflow.models import BaseOperator
+from airflow.utils.decorators import apply_defaults
+from airflow.exceptions import AirflowException
+from airflow.contrib.hooks.livy_hook import LivyHook, BatchState, 
TERMINAL_STATES
+
+
+class LivyOperator(BaseOperator):
+"""
+:param file: Path of the  file containing the application to execute 
(required).
+:type file: str
+:param class_name: Application Java/Spark main class string.
+:type class_name: str
+:param args: Command line arguments for the application s.
+:type args: list
+:param jars: jars to be used in this sessions.
+:type jars: list
+:param py_files: Python files to be used in this session.
+:type py_files: list
+:param files: files to be used in this session.
+:type files: list
+:param driver_memory: Amount of memory to use for the driver process  
string.
 
 Review comment:
   ```suggestion
   :param driver_memory: Amount of memory to use for the driver process.
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache Livy REST operator

2019-09-21 Thread GitBox
feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache 
Livy REST operator
URL: https://github.com/apache/airflow/pull/6090#discussion_r326870791
 
 

 ##
 File path: airflow/contrib/hooks/livy_hook.py
 ##
 @@ -0,0 +1,297 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+This module contains the Apache Livy hook.
+"""
+
+import re
+from enum import Enum
+import json
+import requests
+
+from airflow.exceptions import AirflowException
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+
+class BatchState(Enum):
+"""
+Batch session states
+"""
+NOT_STARTED = 'not_started'
+STARTING = 'starting'
+RUNNING = 'running'
+IDLE = 'idle'
+BUSY = 'busy'
+SHUTTING_DOWN = 'shutting_down'
+ERROR = 'error'
+DEAD = 'dead'
+KILLED = 'killed'
+SUCCESS = 'success'
+
+
+TERMINAL_STATES = {
+BatchState.SUCCESS,
+BatchState.DEAD,
+BatchState.KILLED,
+BatchState.ERROR,
+}
+
+
+class LivyHook(BaseHook, LoggingMixin):
+"""
+Hook for Apache Livy through the REST API.
+
+For more information about the API refer to
+https://livy.apache.org/docs/latest/rest-api.html
+
+:param livy_conn_id: reference to a pre-defined Livy Connection.
+:type livy_conn_id: str
+"""
+def __init__(self, livy_conn_id='livy_default'):
+super(LivyHook, self).__init__(livy_conn_id)
+self._livy_conn_id = livy_conn_id
+self._build_base_url()
+
+def _build_base_url(self):
+"""
+Build connection URL
+"""
+params = self.get_connection(self._livy_conn_id)
+
+base_url = params.host
+
+if not base_url:
+raise AirflowException("Missing Livy endpoint hostname")
+
+if '://' not in base_url:
+base_url = '{}://{}'.format('http', base_url)
+if not re.search(r':\d+$', base_url):
+base_url = '{}:{}'.format(base_url, str(params.port or 8998))
+
+self._base_url = base_url
+
+def get_conn(self):
+pass
+
+def post_batch(self, *args, **kwargs):
+"""
+Perform request to submit batch
+"""
+
+batch_submit_body = json.dumps(LivyHook.build_post_batch_body(*args, 
**kwargs))
+headers = {'Content-Type': 'application/json'}
+
+self.log.info("Submitting job {} to {}".format(batch_submit_body, 
self._base_url))
+response = requests.post(self._base_url + '/batches', 
data=batch_submit_body, headers=headers)
+self.log.debug("Got response: {}".format(response.text))
+
+if response.status_code != 201:
+raise AirflowException("Could not submit batch. Status code: 
{}".format(response.status_code))
+
+batch_id = LivyHook._parse_post_response(response.json())
+if batch_id is None:
+raise AirflowException("Unable to parse a batch session id")
+self.log.info("Batch submitted with session id: {}".format(batch_id))
+
+return batch_id
+
+def get_batch(self, session_id):
+"""
+Fetch info about the specified batch
+:param session_id: identifier of the batch sessions
+:type session_id: int
+"""
+LivyHook._validate_session_id(session_id)
+
+self.log.debug("Fetching info for batch session {}".format(session_id))
+response = requests.get('{}/batches/{}'.format(self._base_url, 
session_id))
+
+if response.status_code != 200:
+self.log.warning("Got status code {} for session 
{}".format(response.status_code, session_id))
+raise AirflowException("Unable to fetch batch with id: 
{}".format(session_id))
+
+return response.json()
+
+def get_batch_state(self, session_id):
+"""
+Fetch the state of the specified batch
+:param session_id: identifier of the batch sessions
+:type session_id: int
+"""
+LivyHook._validate_session_id(session_id)
+
+self.log.debug("Fetching info for batch session {}".format(session_id))
+response = requests.get('{}/batches/{}/state'.format(self._base_url, 
session_id))
+
+ 

[GitHub] [airflow] feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache Livy REST operator

2019-09-21 Thread GitBox
feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache 
Livy REST operator
URL: https://github.com/apache/airflow/pull/6090#discussion_r326871162
 
 

 ##
 File path: airflow/contrib/operators/livy_operator.py
 ##
 @@ -0,0 +1,175 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+This module contains the Apache Livy operator.
+"""
+
+from time import sleep, gmtime, mktime
+
+from airflow.models import BaseOperator
+from airflow.utils.decorators import apply_defaults
+from airflow.exceptions import AirflowException
+from airflow.contrib.hooks.livy_hook import LivyHook, BatchState, 
TERMINAL_STATES
+
+
+class LivyOperator(BaseOperator):
+"""
+:param file: Path of the  file containing the application to execute 
(required).
+:type file: str
+:param class_name: Application Java/Spark main class string.
+:type class_name: str
+:param args: Command line arguments for the application s.
+:type args: list
+:param jars: jars to be used in this sessions.
+:type jars: list
+:param py_files: Python files to be used in this session.
+:type py_files: list
+:param files: files to be used in this session.
+:type files: list
+:param driver_memory: Amount of memory to use for the driver process  
string.
 
 Review comment:
   There are some more of those - see below.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache Livy REST operator

2019-09-21 Thread GitBox
feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache 
Livy REST operator
URL: https://github.com/apache/airflow/pull/6090#discussion_r326870662
 
 

 ##
 File path: airflow/contrib/hooks/livy_hook.py
 ##
 @@ -0,0 +1,297 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+This module contains the Apache Livy hook.
+"""
+
+import re
+from enum import Enum
+import json
+import requests
+
+from airflow.exceptions import AirflowException
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+
+class BatchState(Enum):
+"""
+Batch session states
+"""
+NOT_STARTED = 'not_started'
+STARTING = 'starting'
+RUNNING = 'running'
+IDLE = 'idle'
+BUSY = 'busy'
+SHUTTING_DOWN = 'shutting_down'
+ERROR = 'error'
+DEAD = 'dead'
+KILLED = 'killed'
+SUCCESS = 'success'
+
+
+TERMINAL_STATES = {
+BatchState.SUCCESS,
+BatchState.DEAD,
+BatchState.KILLED,
+BatchState.ERROR,
+}
+
+
+class LivyHook(BaseHook, LoggingMixin):
+"""
+Hook for Apache Livy through the REST API.
+
+For more information about the API refer to
+https://livy.apache.org/docs/latest/rest-api.html
+
+:param livy_conn_id: reference to a pre-defined Livy Connection.
+:type livy_conn_id: str
+"""
+def __init__(self, livy_conn_id='livy_default'):
+super(LivyHook, self).__init__(livy_conn_id)
+self._livy_conn_id = livy_conn_id
+self._build_base_url()
+
+def _build_base_url(self):
+"""
+Build connection URL
+"""
+params = self.get_connection(self._livy_conn_id)
+
+base_url = params.host
+
+if not base_url:
+raise AirflowException("Missing Livy endpoint hostname")
+
+if '://' not in base_url:
+base_url = '{}://{}'.format('http', base_url)
+if not re.search(r':\d+$', base_url):
+base_url = '{}:{}'.format(base_url, str(params.port or 8998))
+
+self._base_url = base_url
+
+def get_conn(self):
+pass
+
+def post_batch(self, *args, **kwargs):
+"""
+Perform request to submit batch
 
 Review comment:
   I think it is missing in every function you documented.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache Livy REST operator

2019-09-21 Thread GitBox
feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache 
Livy REST operator
URL: https://github.com/apache/airflow/pull/6090#discussion_r326870579
 
 

 ##
 File path: airflow/contrib/hooks/livy_hook.py
 ##
 @@ -0,0 +1,297 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+This module contains the Apache Livy hook.
+"""
+
+import re
+from enum import Enum
+import json
+import requests
+
+from airflow.exceptions import AirflowException
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+
+class BatchState(Enum):
+"""
+Batch session states
+"""
+NOT_STARTED = 'not_started'
+STARTING = 'starting'
+RUNNING = 'running'
+IDLE = 'idle'
+BUSY = 'busy'
+SHUTTING_DOWN = 'shutting_down'
+ERROR = 'error'
+DEAD = 'dead'
+KILLED = 'killed'
+SUCCESS = 'success'
+
+
+TERMINAL_STATES = {
+BatchState.SUCCESS,
+BatchState.DEAD,
+BatchState.KILLED,
+BatchState.ERROR,
+}
+
+
+class LivyHook(BaseHook, LoggingMixin):
+"""
+Hook for Apache Livy through the REST API.
+
+For more information about the API refer to
+https://livy.apache.org/docs/latest/rest-api.html
+
+:param livy_conn_id: reference to a pre-defined Livy Connection.
+:type livy_conn_id: str
+"""
+def __init__(self, livy_conn_id='livy_default'):
+super(LivyHook, self).__init__(livy_conn_id)
+self._livy_conn_id = livy_conn_id
+self._build_base_url()
+
+def _build_base_url(self):
+"""
+Build connection URL
+"""
+params = self.get_connection(self._livy_conn_id)
+
+base_url = params.host
+
+if not base_url:
+raise AirflowException("Missing Livy endpoint hostname")
+
+if '://' not in base_url:
+base_url = '{}://{}'.format('http', base_url)
+if not re.search(r':\d+$', base_url):
+base_url = '{}:{}'.format(base_url, str(params.port or 8998))
+
+self._base_url = base_url
+
+def get_conn(self):
+pass
+
+def post_batch(self, *args, **kwargs):
+"""
+Perform request to submit batch
+"""
+
+batch_submit_body = json.dumps(LivyHook.build_post_batch_body(*args, **kwargs))
+headers = {'Content-Type': 'application/json'}
+
+self.log.info("Submitting job {} to {}".format(batch_submit_body, 
self._base_url))
+response = requests.post(self._base_url + '/batches', 
data=batch_submit_body, headers=headers)
+self.log.debug("Got response: {}".format(response.text))
+
+if response.status_code != 201:
+raise AirflowException("Could not submit batch. Status code: 
{}".format(response.status_code))
+
+batch_id = LivyHook._parse_post_response(response.json())
+if batch_id is None:
+raise AirflowException("Unable to parse a batch session id")
+self.log.info("Batch submitted with session id: {}".format(batch_id))
+
+return batch_id
+
+def get_batch(self, session_id):
+"""
+Fetch info about the specified batch
+:param session_id: identifier of the batch sessions
+:type session_id: int
+"""
+LivyHook._validate_session_id(session_id)
+
+self.log.debug("Fetching info for batch session {}".format(session_id))
+response = requests.get('{}/batches/{}'.format(self._base_url, session_id))
+
+if response.status_code != 200:
+self.log.warning("Got status code {} for session 
{}".format(response.status_code, session_id))
+raise AirflowException("Unable to fetch batch with id: 
{}".format(session_id))
+
+return response.json()
+
+def get_batch_state(self, session_id):
+"""
+Fetch the state of the specified batch
+:param session_id: identifier of the batch sessions
+:type session_id: int
+"""
+LivyHook._validate_session_id(session_id)
+
+self.log.debug("Fetching info for batch session {}".format(session_id))
+response = requests.get('{}/batches/{}/state'.format(self._base_url, session_id))
+
+ 

[GitHub] [airflow] feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache Livy REST operator

2019-09-21 Thread GitBox
feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache 
Livy REST operator
URL: https://github.com/apache/airflow/pull/6090#discussion_r326870633
 
 

 ##
 File path: airflow/contrib/hooks/livy_hook.py
 ##
 @@ -0,0 +1,297 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+This module contains the Apache Livy hook.
+"""
+
+import re
+from enum import Enum
+import json
+import requests
+
+from airflow.exceptions import AirflowException
+from airflow.hooks.base_hook import BaseHook
+from airflow.utils.log.logging_mixin import LoggingMixin
+
+
+class BatchState(Enum):
+"""
+Batch session states
+"""
+NOT_STARTED = 'not_started'
+STARTING = 'starting'
+RUNNING = 'running'
+IDLE = 'idle'
+BUSY = 'busy'
+SHUTTING_DOWN = 'shutting_down'
+ERROR = 'error'
+DEAD = 'dead'
+KILLED = 'killed'
+SUCCESS = 'success'
+
+
+TERMINAL_STATES = {
+BatchState.SUCCESS,
+BatchState.DEAD,
+BatchState.KILLED,
+BatchState.ERROR,
+}
+
+
+class LivyHook(BaseHook, LoggingMixin):
+"""
+Hook for Apache Livy through the REST API.
+
+For more information about the API refer to
+https://livy.apache.org/docs/latest/rest-api.html
+
+:param livy_conn_id: reference to a pre-defined Livy Connection.
+:type livy_conn_id: str
+"""
+def __init__(self, livy_conn_id='livy_default'):
+super(LivyHook, self).__init__(livy_conn_id)
+self._livy_conn_id = livy_conn_id
+self._build_base_url()
+
+def _build_base_url(self):
+"""
+Build connection URL
+"""
+params = self.get_connection(self._livy_conn_id)
+
+base_url = params.host
+
+if not base_url:
+raise AirflowException("Missing Livy endpoint hostname")
+
+if '://' not in base_url:
+base_url = '{}://{}'.format('http', base_url)
+if not re.search(r':\d+$', base_url):
+base_url = '{}:{}'.format(base_url, str(params.port or 8998))
+
+self._base_url = base_url
+
+def get_conn(self):
+pass
+
+def post_batch(self, *args, **kwargs):
+"""
+Perform request to submit batch
 
 Review comment:
   Could you document the `:return:` and the `:rtype:`, please? :)
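   For illustration only, the requested additions might look like the sketch below; the return
   description follows the quoted code, which parses a batch session id out of the response, but
   the exact wording is up to the PR author:

       def post_batch(self, *args, **kwargs):
           """
           Perform request to submit batch.

           :return: identifier of the created batch session
           :rtype: int
           """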


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache Livy REST operator

2019-09-21 Thread GitBox
feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache 
Livy REST operator
URL: https://github.com/apache/airflow/pull/6090#discussion_r326871236
 
 

 ##
 File path: airflow/contrib/operators/livy_operator.py
 ##
 @@ -0,0 +1,175 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+This module contains the Apache Livy operator.
+"""
+
+from time import sleep, gmtime, mktime
+
+from airflow.models import BaseOperator
+from airflow.utils.decorators import apply_defaults
+from airflow.exceptions import AirflowException
+from airflow.contrib.hooks.livy_hook import LivyHook, BatchState, TERMINAL_STATES
+
+
+class LivyOperator(BaseOperator):
+"""
+:param file: Path of the  file containing the application to execute (required).
+:type file: str
+:param class_name: Application Java/Spark main class string.
+:type class_name: str
+:param args: Command line arguments for the application s.
+:type args: list
+:param jars: jars to be used in this sessions.
+:type jars: list
+:param py_files: Python files to be used in this session.
+:type py_files: list
+:param files: files to be used in this session.
+:type files: list
+:param driver_memory: Amount of memory to use for the driver process  string.
+:type driver_memory: str
+:param driver_cores: Number of cores to use for the driver process int.
+:type driver_cores: str
+:param executor_memory: Amount of memory to use per executor process  string.
+:type executor_memory: str
+:param executor_cores: Number of cores to use for each executor  int.
+:type executor_cores: str
+:param num_executors: Number of executors to launch for this session  int.
+:type num_executors: str
+:param archives: Archives to be used in this session.
+:type archives: list
+:param queue: The name of the YARN queue to which submitted string.
+:type queue: str
+:param name: The name of this session  string.
+:type name: str
+:param conf: Spark configuration properties.
+:type conf: dict
+:param proxy_user: User to impersonate when running the job.
+:type proxy_user: str
+:param livy_conn_id: reference to a pre-defined Livy Connection.
+:type livy_conn_id: str
+:param polling_interval: time in seconds between polling for job completion. Don't poll for values >=0
+:type polling_interval: int
+:param timeout: for a value greater than zero, number of seconds to poll before killing the batch.
+:type timeout: int
+"""
+
+@apply_defaults
+def __init__(
+self,
+file=None,
+args=None,
+conf=None,
+livy_conn_id='livy_default',
+polling_interval=0,
+timeout=24 * 3600,
+*vargs,
+**kwargs
 
 Review comment:
   Would you mind splitting up the kwargs here, too?
   Then you could also remove those lines below.
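   A minimal sketch of the suggested shape (parameter names copied from the quoted constructor,
   class name invented for illustration; not the PR's actual code): each Spark parameter becomes
   an explicit keyword argument, so the kwargs.get('...') assignments further down are no longer
   needed.

       class LivyOperatorSketch(object):
           def __init__(self, file=None, class_name=None, args=None, conf=None,
                        jars=None, py_files=None, files=None,
                        driver_memory=None, driver_cores=None,
                        executor_memory=None, executor_cores=None,
                        num_executors=None, archives=None, queue=None,
                        name=None, proxy_user=None,
                        livy_conn_id='livy_default', polling_interval=0,
                        timeout=24 * 3600):
               # With explicit parameters the dict is built directly and the
               # later self._spark_params[...] = kwargs.get(...) lines go away.
               self._spark_params = {
                   'file': file, 'class_name': class_name, 'args': args,
                   'conf': conf, 'jars': jars, 'py_files': py_files,
                   'files': files, 'driver_memory': driver_memory,
                   'driver_cores': driver_cores,
                   'executor_memory': executor_memory,
                   'executor_cores': executor_cores,
                   'num_executors': num_executors, 'archives': archives,
                   'queue': queue, 'name': name, 'proxy_user': proxy_user,
               }
               self._livy_conn_id = livy_conn_id
               self._polling_interval = polling_interval
               self._timeout = timeout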


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache Livy REST operator

2019-09-21 Thread GitBox
feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache 
Livy REST operator
URL: https://github.com/apache/airflow/pull/6090#discussion_r326871346
 
 

 ##
 File path: airflow/contrib/operators/livy_operator.py
 ##
 @@ -0,0 +1,175 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+This module contains the Apache Livy operator.
+"""
+
+from time import sleep, gmtime, mktime
+
+from airflow.models import BaseOperator
+from airflow.utils.decorators import apply_defaults
+from airflow.exceptions import AirflowException
+from airflow.contrib.hooks.livy_hook import LivyHook, BatchState, TERMINAL_STATES
+
+
+class LivyOperator(BaseOperator):
+"""
+:param file: Path of the  file containing the application to execute (required).
+:type file: str
+:param class_name: Application Java/Spark main class string.
+:type class_name: str
+:param args: Command line arguments for the application s.
+:type args: list
+:param jars: jars to be used in this sessions.
+:type jars: list
+:param py_files: Python files to be used in this session.
+:type py_files: list
+:param files: files to be used in this session.
+:type files: list
+:param driver_memory: Amount of memory to use for the driver process  string.
+:type driver_memory: str
+:param driver_cores: Number of cores to use for the driver process int.
+:type driver_cores: str
+:param executor_memory: Amount of memory to use per executor process  string.
+:type executor_memory: str
+:param executor_cores: Number of cores to use for each executor  int.
+:type executor_cores: str
+:param num_executors: Number of executors to launch for this session  int.
+:type num_executors: str
+:param archives: Archives to be used in this session.
+:type archives: list
+:param queue: The name of the YARN queue to which submitted string.
+:type queue: str
+:param name: The name of this session  string.
+:type name: str
+:param conf: Spark configuration properties.
+:type conf: dict
+:param proxy_user: User to impersonate when running the job.
+:type proxy_user: str
+:param livy_conn_id: reference to a pre-defined Livy Connection.
+:type livy_conn_id: str
+:param polling_interval: time in seconds between polling for job completion. Don't poll for values >=0
+:type polling_interval: int
+:param timeout: for a value greater than zero, number of seconds to poll before killing the batch.
+:type timeout: int
+"""
+
+@apply_defaults
+def __init__(
+self,
+file=None,
+args=None,
+conf=None,
+livy_conn_id='livy_default',
+polling_interval=0,
+timeout=24 * 3600,
+*vargs,
+**kwargs
+):
+super(LivyOperator, self).__init__(*vargs, **kwargs)
+
+self._spark_params = {
+'file': file,
+'args': args,
+'conf': conf,
+}
+
+self._spark_params['proxy_user'] = kwargs.get('proxy_user')
+self._spark_params['class_name'] = kwargs.get('class_name')
+self._spark_params['jars'] = kwargs.get('jars')
+self._spark_params['py_files'] = kwargs.get('py_files')
+self._spark_params['files'] = kwargs.get('files')
+self._spark_params['driver_memory'] = kwargs.get('driver_memory')
+self._spark_params['driver_cores'] = kwargs.get('driver_cores')
+self._spark_params['executor_memory'] = kwargs.get('executor_memory')
+self._spark_params['executor_cores'] = kwargs.get('executor_cores')
+self._spark_params['num_executors'] = kwargs.get('num_executors')
+self._spark_params['archives'] = kwargs.get('archives')
+self._spark_params['queue'] = kwargs.get('queue')
+self._spark_params['name'] = kwargs.get('name')
+
+self._livy_conn_id = livy_conn_id
+self._polling_interval = polling_interval
+self._timeout = timeout
+
+self._livy_hook = None
+self._batch_id = None
+self._start_ts = None
+
+def _init_hook(self):
+if self._livy_conn_id:
+if self._livy_hook 

[GitHub] [airflow] feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache Livy REST operator

2019-09-21 Thread GitBox
feluelle commented on a change in pull request #6090: [AIRFLOW-5470] Add Apache 
Livy REST operator
URL: https://github.com/apache/airflow/pull/6090#discussion_r326871517
 
 

 ##
 File path: airflow/contrib/operators/livy_operator.py
 ##
 @@ -0,0 +1,175 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+This module contains the Apache Livy operator.
+"""
+
+from time import sleep, gmtime, mktime
+
+from airflow.models import BaseOperator
+from airflow.utils.decorators import apply_defaults
+from airflow.exceptions import AirflowException
+from airflow.contrib.hooks.livy_hook import LivyHook, BatchState, TERMINAL_STATES
+
+
+class LivyOperator(BaseOperator):
+"""
+:param file: Path of the  file containing the application to execute (required).
+:type file: str
+:param class_name: Application Java/Spark main class string.
+:type class_name: str
+:param args: Command line arguments for the application s.
+:type args: list
+:param jars: jars to be used in this sessions.
+:type jars: list
+:param py_files: Python files to be used in this session.
+:type py_files: list
+:param files: files to be used in this session.
+:type files: list
+:param driver_memory: Amount of memory to use for the driver process  string.
+:type driver_memory: str
+:param driver_cores: Number of cores to use for the driver process int.
+:type driver_cores: str
+:param executor_memory: Amount of memory to use per executor process  string.
+:type executor_memory: str
+:param executor_cores: Number of cores to use for each executor  int.
+:type executor_cores: str
+:param num_executors: Number of executors to launch for this session  int.
+:type num_executors: str
+:param archives: Archives to be used in this session.
+:type archives: list
+:param queue: The name of the YARN queue to which submitted string.
+:type queue: str
+:param name: The name of this session  string.
+:type name: str
+:param conf: Spark configuration properties.
+:type conf: dict
+:param proxy_user: User to impersonate when running the job.
+:type proxy_user: str
+:param livy_conn_id: reference to a pre-defined Livy Connection.
+:type livy_conn_id: str
+:param polling_interval: time in seconds between polling for job completion. Don't poll for values >=0
+:type polling_interval: int
+:param timeout: for a value greater than zero, number of seconds to poll before killing the batch.
+:type timeout: int
+"""
+
+@apply_defaults
+def __init__(
+self,
+file=None,
+args=None,
+conf=None,
+livy_conn_id='livy_default',
+polling_interval=0,
+timeout=24 * 3600,
+*vargs,
 
 Review comment:
   Why did you call them `*vargs` instead of `*args` ? Your documentation has 
only `*args` and I think `*args` is also more commonly used.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj commented on issue #5680: [AIRFLOW-5066] allow k8s fieldref substitution

2019-09-21 Thread GitBox
mik-laj commented on issue #5680: [AIRFLOW-5066] allow k8s fieldref substitution
URL: https://github.com/apache/airflow/pull/5680#issuecomment-533825709
 
 
   @jsurloppe Travis is sad. Can you fix it?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj commented on issue #5693: [AIRFLOW-5080] Added npm to apt-get install in compile.sh for docker

2019-09-21 Thread GitBox
mik-laj commented on issue #5693: [AIRFLOW-5080] Added npm to apt-get install 
in compile.sh for docker
URL: https://github.com/apache/airflow/pull/5693#issuecomment-533825610
 
 
   Hello @potiuk  @Esfahan .
   Any progress on this PR? Can i help with this?
   Best,


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] kaxil merged pull request #6162: [AIRFLOW-XXX] Fix backtick issues in .rst files & Add Precommit hook

2019-09-21 Thread GitBox
kaxil merged pull request #6162: [AIRFLOW-XXX] Fix backtick issues in .rst 
files & Add Precommit hook
URL: https://github.com/apache/airflow/pull/6162
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] kaxil commented on issue #6162: [AIRFLOW-XXX] Fix backtick issues in .rst files & Add Precommit hook

2019-09-21 Thread GitBox
kaxil commented on issue #6162: [AIRFLOW-XXX] Fix backtick issues in .rst files 
& Add Precommit hook
URL: https://github.com/apache/airflow/pull/6162#issuecomment-533825341
 
 
   CI passed: https://travis-ci.org/kaxil/airflow/builds/587910679


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-4574) Add SSHHook private key parameter pkey

2019-09-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935143#comment-16935143
 ] 

ASF GitHub Bot commented on AIRFLOW-4574:
-

dstandish commented on pull request #6163: [AIRFLOW-4574] SSHHook private_key 
may only be supplied in extras
URL: https://github.com/apache/airflow/pull/6163
 
 
   * discussion on original PR suggested removing private_key option as init 
param
   * with this PR, can still provide through extras, but not as init param
   * also add support for private_key in tunnel -- missing in original PR for 
this issue
   * remove test related to private_key init param
   * use context manager to auto-close socket listener so tests can be re-run
   
   @mik-laj @pgagnon @kaxil in spirit of collaboration I set out to address 
issue in original PR for this issue (#6104 ).  Namely, I set out to remove 
private_key as an init param to SSHHook.  Lo and behold I noticed that original 
PR did not extend support for `private_key` to the get_tunnel hook method, 
because it doesn't use `get_conn` but connects independently.  This PR 
rectifies this oversight by adding this capability.
   I did not create new jira because this feels like continuation of same issue 
-- just more fully realizing it, and in a way that everyone can be happy with.
   There were some minor tweaks that I made to testing.  
   * In test, The `HELLO_SERVER_CMD` was not executed in context manager, so it 
left the socket listener running, which meant you could not rerun the tests 
without manually killing the listener process.  I use context manager.  I think 
this makes sense in same PR because it actively interfered with my ability to 
test my change.
   * In test, I also moved connection creation / destruction to setUpClass / 
tearDownClass so they are created and destroyed only once for the ssh hook test 
suite.  Made sense to do this because in adding tunnel test I had to use the 
connection in more than one place.
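
   The context-manager point in the first bullet can be pictured with a small, self-contained
   sketch; the helper name and port below are invented for illustration and are not the PR's
   actual test code:

       import socket
       from contextlib import closing, contextmanager

       @contextmanager
       def throwaway_listener(host='127.0.0.1', port=2135):
           # Open a TCP listener and guarantee it is closed when the block exits,
           # so the test suite can be re-run without manually killing a leftover
           # listener process.
           with closing(socket.socket(socket.AF_INET, socket.SOCK_STREAM)) as server:
               server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
               server.bind((host, port))
               server.listen(1)
               yield server

       # usage in a test:
       # with throwaway_listener() as server:
       #     ...  # run the tunnel assertions against the listener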
   
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add SSHHook private key parameter pkey
> --
>
> Key: AIRFLOW-4574
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4574
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: hooks
>Reporter: Freddy Fostvedt
>Assignee: Freddy Fostvedt
>Priority: Minor
> Fix For: 1.10.6
>
>
> The SSHHook only supports 

[GitHub] [airflow] dstandish opened a new pull request #6163: [AIRFLOW-4574] SSHHook private_key may only be supplied in extras

2019-09-21 Thread GitBox
dstandish opened a new pull request #6163: [AIRFLOW-4574] SSHHook private_key 
may only be supplied in extras
URL: https://github.com/apache/airflow/pull/6163
 
 
   * discussion on original PR suggested removing private_key option as init 
param
   * with this PR, can still provide through extras, but not as init param
   * also add support for private_key in tunnel -- missing in original PR for 
this issue
   * remove test related to private_key init param
   * use context manager to auto-close socket listener so tests can be re-run
   
   @mik-laj @pgagnon @kaxil in spirit of collaboration I set out to address 
issue in original PR for this issue (#6104 ).  Namely, I set out to remove 
private_key as an init param to SSHHook.  Lo and behold I noticed that original 
PR did not extend support for `private_key` to the get_tunnel hook method, 
because it doesn't use `get_conn` but connects independently.  This PR 
rectifies this oversight by adding this capability.
   I did not create new jira because this feels like continuation of same issue 
-- just more fully realizing it, and in a way that everyone can be happy with.
   There were some minor tweaks that I made to testing.  
   * In test, The `HELLO_SERVER_CMD` was not executed in context manager, so it 
left the socket listener running, which meant you could not rerun the tests 
without manually killing the listener process.  I use context manager.  I think 
this makes sense in same PR because it actively interfered with my ability to 
test my change.
   * In test, I also moved connection creation / destruction to setUpClass / 
tearDownClass so they are created and destroyed only once for the ssh hook test 
suite.  Made sense to do this because in adding tunnel test I had to use the 
connection in more than one place.
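
   From the caller's side, "private_key only through extras" might look like the sketch below;
   the 'private_key' extra field name is taken from this description, the connection values are
   placeholders, and the hook-side parsing is not shown:

       import json

       from airflow import settings
       from airflow.contrib.hooks.ssh_hook import SSHHook
       from airflow.models import Connection

       # Store the key material in the connection's Extra field instead of
       # passing it to SSHHook(...) as an init param.
       conn = Connection(
           conn_id='ssh_with_key',
           conn_type='ssh',
           host='example.com',
           login='airflow',
           extra=json.dumps({'private_key': '-----BEGIN RSA PRIVATE KEY-----\n...'}),
       )
       session = settings.Session()
       session.add(conn)
       session.commit()

       hook = SSHHook(ssh_conn_id='ssh_with_key')  # key resolved from extras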
   
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
 - In case you are proposing a fundamental code change, you need to create 
an Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Closed] (AIRFLOW-5530) Fix typo in AWS SQS Sensor

2019-09-21 Thread Kamil Bregula (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Bregula closed AIRFLOW-5530.
--
Fix Version/s: 1.10.6
   Resolution: Fixed

> Fix typo in AWS SQS Sensor
> --
>
> Key: AIRFLOW-5530
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5530
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: aws, operators
>Affects Versions: 1.10.5
>Reporter: Kamil Bregula
>Priority: Major
> Fix For: 1.10.6
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-5530) Fix typo in AWS SQS Sensor

2019-09-21 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935138#comment-16935138
 ] 

ASF subversion and git services commented on AIRFLOW-5530:
--

Commit 123479cd6ad877e295a25367613534be7f7aec3b in airflow's branch 
refs/heads/master from Fabrizio Milo
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=123479c ]

[AIRFLOW-5530] Fiix typo in AWS SQS sensors (#6012)



> Fix typo in AWS SQS Sensor
> --
>
> Key: AIRFLOW-5530
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5530
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: aws, operators
>Affects Versions: 1.10.5
>Reporter: Kamil Bregula
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] mik-laj merged pull request #6012: [AIRFLOW-5530] Fiix typo in AWS SQS sensors

2019-09-21 Thread GitBox
mik-laj merged pull request #6012: [AIRFLOW-5530] Fiix typo in AWS SQS sensors
URL: https://github.com/apache/airflow/pull/6012
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-5530) Fix typo in AWS SQS Sensor

2019-09-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935137#comment-16935137
 ] 

ASF GitHub Bot commented on AIRFLOW-5530:
-

mik-laj commented on pull request #6012: [AIRFLOW-5530] Fiix typo in AWS SQS 
sensors
URL: https://github.com/apache/airflow/pull/6012
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Fix typo in AWS SQS Sensor
> --
>
> Key: AIRFLOW-5530
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5530
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: aws, operators
>Affects Versions: 1.10.5
>Reporter: Kamil Bregula
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] mik-laj commented on issue #6074: [AIRFLOW-5390] Remove provide context

2019-09-21 Thread GitBox
mik-laj commented on issue #6074: [AIRFLOW-5390] Remove provide context
URL: https://github.com/apache/airflow/pull/6074#issuecomment-533823472
 
 
   Travis is sad. Can you do rebase?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj commented on issue #6012: [AIRFLOW-5530] Fiix typo in AWS SQS sensors

2019-09-21 Thread GitBox
mik-laj commented on issue #6012: [AIRFLOW-5530] Fiix typo in AWS SQS sensors
URL: https://github.com/apache/airflow/pull/6012#issuecomment-533823839
 
 
   I created a missing jira ticket.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj commented on issue #6079: [AIRFLOW-5448] Handle istio-proxy for Kubernetes Pods

2019-09-21 Thread GitBox
mik-laj commented on issue #6079: [AIRFLOW-5448] Handle istio-proxy for 
Kubernetes Pods
URL: https://github.com/apache/airflow/pull/6079#issuecomment-533823322
 
 
   Travis is sad. Can you fix it?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-5530) Fix typo in AWS SQS Sensor

2019-09-21 Thread Kamil Bregula (Jira)
Kamil Bregula created AIRFLOW-5530:
--

 Summary: Fix typo in AWS SQS Sensor
 Key: AIRFLOW-5530
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5530
 Project: Apache Airflow
  Issue Type: Improvement
  Components: aws, operators
Affects Versions: 1.10.5
Reporter: Kamil Bregula






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] mik-laj edited a comment on issue #6147: [AIRFLOW-5512] Add gcp dependencies into devel

2019-09-21 Thread GitBox
mik-laj edited a comment on issue #6147: [AIRFLOW-5512] Add gcp dependencies 
into devel
URL: https://github.com/apache/airflow/pull/6147#issuecomment-533622830
 
 
   I agree. I think, it's a good idea to move transfer operators to separate 
python package. Currently, this division exists in integration.rst file. We 
have service operators and transfer operators in separate tables.
   https://airflow.readthedocs.io/en/latest/integration.html
   I thought also about long distance perspective. Separate package make easier 
to move operators to separate pip package - library. In this situation, all 
dependencies in new library should be [peer 
dependencies](https://nodejs.org/es/blog/npm/peer-dependencies/). 
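
   A rough pip-side analogue of that idea, purely hypothetical and not Airflow's real packaging:
   a split-out distribution could keep the heavy service SDKs in extras so the installing project
   chooses them, loosely mirroring npm peer dependencies.

       # hypothetical setup.py fragment, for illustration only
       from setuptools import find_packages, setup

       setup(
           name='airflow-transfer-operators-sketch',
           version='0.0.1',
           packages=find_packages(),
           install_requires=['apache-airflow>=1.10.5'],
           extras_require={
               'gcp': ['google-cloud-storage', 'google-cloud-bigquery'],
               'aws': ['boto3'],
           },
       )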


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj commented on issue #5914: [AIRFLOW-5311] Add an AWS Lambda Operator

2019-09-21 Thread GitBox
mik-laj commented on issue #5914: [AIRFLOW-5311] Add an AWS Lambda Operator
URL: https://github.com/apache/airflow/pull/5914#issuecomment-533823603
 
 
   Travis is sad. Can you fix it?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj commented on a change in pull request #6075: [AIRFLOW-5266] Allow aws_athena_hook to get all query results

2019-09-21 Thread GitBox
mik-laj commented on a change in pull request #6075: [AIRFLOW-5266] Allow 
aws_athena_hook to get all query results
URL: https://github.com/apache/airflow/pull/6075#discussion_r326871063
 
 

 ##
 File path: tests/contrib/hooks/test_aws_athena_hook.py
 ##
 @@ -0,0 +1,88 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# 'License'); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# 'AS IS' BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+import unittest
+from unittest import mock
+
+try:
+from airflow.contrib.hooks.aws_athena_hook import AWSAthenaHook
+except ImportError:
+AWSAthenaHook = None  # type: ignore
+
+
+class MockAthenaClient:
 
 Review comment:
   I agree. We should use `unittest.mock` in this place.
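
   A minimal sketch of that suggestion, assuming (as the quoted test already implies) that the
   hook hands out the boto3 client via get_conn(); the response payload and ids below are
   invented for illustration:

       from unittest import mock

       from airflow.contrib.hooks.aws_athena_hook import AWSAthenaHook

       def test_results_with_mocked_client():
           # Build the fake Athena client with unittest.mock instead of a
           # hand-rolled MockAthenaClient class.
           mock_client = mock.MagicMock()
           mock_client.get_query_results.return_value = {'ResultSet': {'Rows': []}}

           with mock.patch.object(AWSAthenaHook, 'get_conn', return_value=mock_client):
               hook = AWSAthenaHook()
               results = hook.get_conn().get_query_results(QueryExecutionId='abc123')

           assert results == {'ResultSet': {'Rows': []}}
           mock_client.get_query_results.assert_called_once_with(QueryExecutionId='abc123')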


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj commented on issue #6090: [AIRFLOW-5470] Add Apache Livy REST operator

2019-09-21 Thread GitBox
mik-laj commented on issue #6090: [AIRFLOW-5470] Add Apache Livy REST operator
URL: https://github.com/apache/airflow/pull/6090#issuecomment-533823089
 
 
   Travis is sad. Can you fix it?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj commented on issue #6097: [AIRFLOW-5478] Decode PythonVirtualenvOperator Output to Logs

2019-09-21 Thread GitBox
mik-laj commented on issue #6097: [AIRFLOW-5478] Decode 
PythonVirtualenvOperator Output to Logs
URL: https://github.com/apache/airflow/pull/6097#issuecomment-533823014
 
 
   Travis is sad. Can you fix it?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj commented on issue #6124: [AIRFLOW-5501] in_cluster default value in KubernetesPodOperator overwrites configuration

2019-09-21 Thread GitBox
mik-laj commented on issue #6124: [AIRFLOW-5501] in_cluster default value in 
KubernetesPodOperator overwrites configuration
URL: https://github.com/apache/airflow/pull/6124#issuecomment-533822951
 
 
   Travis is sad. Can you do rebase?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] kaxil opened a new pull request #6162: [AIRFLOW-XXX] Fix backtick issues in .rst files & Add Precommit hook

2019-09-21 Thread GitBox
kaxil opened a new pull request #6162: [AIRFLOW-XXX] Fix backtick issues in 
.rst files & Add Precommit hook
URL: https://github.com/apache/airflow/pull/6162
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   [AIRFLOW-XXX] Fix backtick issues in .rst files & Add Precommit hook
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   doc only change 
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
 - If you implement backwards incompatible changes, please leave a note in 
the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so 
we can assign it to an appropriate release
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mik-laj commented on issue #5919: [AIRFLOW-5262] Update timeout exception to include dag

2019-09-21 Thread GitBox
mik-laj commented on issue #5919: [AIRFLOW-5262] Update timeout exception to 
include dag
URL: https://github.com/apache/airflow/pull/5919#issuecomment-533822837
 
 
   Travis is sad. Can you fix it?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #4607: WIP: [AIRFLOW-1894] Google cloud bigquery

2019-09-21 Thread GitBox
potiuk commented on issue #4607: WIP: [AIRFLOW-1894] Google cloud bigquery
URL: https://github.com/apache/airflow/pull/4607#issuecomment-533822728
 
 
   Hey @jmcarp - are you still doing it? Or should we close that one?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #4785: [AIRFLOW-3965] Fixing GoogleCloudStorageToBigQueryOperator failing for jobs outside US and EU

2019-09-21 Thread GitBox
potiuk commented on issue #4785: [AIRFLOW-3965] Fixing 
GoogleCloudStorageToBigQueryOperator failing for jobs outside US and EU
URL: https://github.com/apache/airflow/pull/4785#issuecomment-533822653
 
 
   Should we close that one @nuclearpinguin @lihan @mik-laj ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #5565: [AIRFLOW-4899] Fix get_dataset_list from bigquery hook to return next…

2019-09-21 Thread GitBox
potiuk commented on issue #5565: [AIRFLOW-4899] Fix get_dataset_list from 
bigquery hook to return next…
URL: https://github.com/apache/airflow/pull/5565#issuecomment-533822563
 
 
   Same here @benjamingrenier :(. Rebase is needed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #5566: [AIRFLOW-4935] Add method in the bigquery hook to list tables in a dataset

2019-09-21 Thread GitBox
potiuk commented on issue #5566: [AIRFLOW-4935] Add method in the bigquery hook 
to list tables in a dataset
URL: https://github.com/apache/airflow/pull/5566#issuecomment-533822511
 
 
   Hey @benjamingrenier -> I must ask for another rebase: (. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Closed] (AIRFLOW-5499) Move GCP utils to core

2019-09-21 Thread Kamil Bregula (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Bregula closed AIRFLOW-5499.
--
Resolution: Fixed

> Move GCP utils to core
> --
>
> Key: AIRFLOW-5499
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5499
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 2.0.0
>Reporter: Tomasz Urbaszek
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-5499) Move GCP utils to core

2019-09-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935130#comment-16935130
 ] 

ASF GitHub Bot commented on AIRFLOW-5499:
-

mik-laj commented on pull request #6122: [AIRFLOW-5499] Move GCP utils to core
URL: https://github.com/apache/airflow/pull/6122
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Move GCP utils to core
> --
>
> Key: AIRFLOW-5499
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5499
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 2.0.0
>Reporter: Tomasz Urbaszek
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-5499) Move GCP utils to core

2019-09-21 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935131#comment-16935131
 ] 

ASF subversion and git services commented on AIRFLOW-5499:
--

Commit 8f04ebe66964f008b39fc6767a09914820801c59 in airflow's branch 
refs/heads/master from Tomek
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=8f04ebe ]

[AIRFLOW-5499] Move GCP utils to core (#6122)



> Move GCP utils to core
> --
>
> Key: AIRFLOW-5499
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5499
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 2.0.0
>Reporter: Tomasz Urbaszek
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] mik-laj merged pull request #6122: [AIRFLOW-5499] Move GCP utils to core

2019-09-21 Thread GitBox
mik-laj merged pull request #6122: [AIRFLOW-5499] Move GCP utils to core
URL: https://github.com/apache/airflow/pull/6122
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk merged pull request #6134: [AIRFLOW-XXX] Add a third way to configure authorization

2019-09-21 Thread GitBox
potiuk merged pull request #6134: [AIRFLOW-XXX] Add a third way to configure 
authorization
URL: https://github.com/apache/airflow/pull/6134
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-5526) Update docs configuration due to migration of GCP docs

2019-09-21 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935129#comment-16935129
 ] 

ASF subversion and git services commented on AIRFLOW-5526:
--

Commit 57b5854430984fcf768f4d5b93b905ff80c1445e in airflow's branch 
refs/heads/v1-10-test from TobKed
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=57b5854 ]

[AIRFLOW-5526] Update docs configuration due to migration of GCP docs (#6154)

* [AIRFLOW-5526] Update docs configuration due to migration of GCP docs


> Update docs configuration due to migration of GCP docs
> --
>
> Key: AIRFLOW-5526
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5526
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 1.10.5
>Reporter: Tobiasz Kedzierski
>Assignee: Tobiasz Kedzierski
>Priority: Minor
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-5526) Update docs configuration due to migration of GCP docs

2019-09-21 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935128#comment-16935128
 ] 

ASF subversion and git services commented on AIRFLOW-5526:
--

Commit 57b5854430984fcf768f4d5b93b905ff80c1445e in airflow's branch 
refs/heads/v1-10-test from TobKed
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=57b5854 ]

[AIRFLOW-5526] Update docs configuration due to migration of GCP docs (#6154)

* [AIRFLOW-5526] Update docs configuration due to migration of GCP docs


> Update docs configuration due to migration of GCP docs
> --
>
> Key: AIRFLOW-5526
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5526
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 1.10.5
>Reporter: Tobiasz Kedzierski
>Assignee: Tobiasz Kedzierski
>Priority: Minor
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (AIRFLOW-5526) Update docs configuration due to migration of GCP docs

2019-09-21 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-5526:
--
Fix Version/s: (was: 2.0.0)
   1.10.6

> Update docs configuration due to migration of GCP docs
> --
>
> Key: AIRFLOW-5526
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5526
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 1.10.5
>Reporter: Tobiasz Kedzierski
>Assignee: Tobiasz Kedzierski
>Priority: Minor
> Fix For: 1.10.6
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-5526) Update docs configuration due to migration of GCP docs

2019-09-21 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935127#comment-16935127
 ] 

ASF subversion and git services commented on AIRFLOW-5526:
--

Commit b31773cff98241572819fea83a25e2edd58078be in airflow's branch 
refs/heads/master from TobKed
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=b31773c ]

[AIRFLOW-5526] Update docs configuration due to migration of GCP docs (#6154)

* [AIRFLOW-5526] Update docs configuration due to migration of GCP docs


> Update docs configuration due to migration of GCP docs
> --
>
> Key: AIRFLOW-5526
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5526
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 1.10.5
>Reporter: Tobiasz Kedzierski
>Assignee: Tobiasz Kedzierski
>Priority: Minor
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-5526) Update docs configuration due to migration of GCP docs

2019-09-21 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935126#comment-16935126
 ] 

ASF subversion and git services commented on AIRFLOW-5526:
--

Commit b31773cff98241572819fea83a25e2edd58078be in airflow's branch 
refs/heads/master from TobKed
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=b31773c ]

[AIRFLOW-5526] Update docs configuration due to migration of GCP docs (#6154)

* [AIRFLOW-5526] Update docs configuration due to migration of GCP docs


> Update docs configuration due to migration of GCP docs
> --
>
> Key: AIRFLOW-5526
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5526
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 1.10.5
>Reporter: Tobiasz Kedzierski
>Assignee: Tobiasz Kedzierski
>Priority: Minor
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (AIRFLOW-5526) Update docs configuration due to migration of GCP docs

2019-09-21 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-5526.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Update docs configuration due to migration of GCP docs
> --
>
> Key: AIRFLOW-5526
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5526
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 1.10.5
>Reporter: Tobiasz Kedzierski
>Assignee: Tobiasz Kedzierski
>Priority: Minor
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-5526) Update docs configuration due to migration of GCP docs

2019-09-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935125#comment-16935125
 ] 

ASF GitHub Bot commented on AIRFLOW-5526:
-

potiuk commented on pull request #6154: [AIRFLOW-5526] Update docs 
configuration due to migration of GCP docs
URL: https://github.com/apache/airflow/pull/6154
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Update docs configuration due to migration of GCP docs
> --
>
> Key: AIRFLOW-5526
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5526
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 1.10.5
>Reporter: Tobiasz Kedzierski
>Assignee: Tobiasz Kedzierski
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] potiuk merged pull request #6154: [AIRFLOW-5526] Update docs configuration due to migration of GCP docs

2019-09-21 Thread GitBox
potiuk merged pull request #6154: [AIRFLOW-5526] Update docs configuration due 
to migration of GCP docs
URL: https://github.com/apache/airflow/pull/6154
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feluelle commented on a change in pull request #6151: [AIRFLOW-5522] BQ list dataset tables operator

2019-09-21 Thread GitBox
feluelle commented on a change in pull request #6151: [AIRFLOW-5522] BQ list 
dataset tables operator
URL: https://github.com/apache/airflow/pull/6151#discussion_r326869642
 
 

 ##
 File path: airflow/gcp/hooks/bigquery.py
 ##
 @@ -1430,6 +1430,39 @@ def cancel_query(self) -> None:
   self.running_job_id)
 time.sleep(5)
 
+def get_dataset_tables(self, dataset_id: str, project_id: Optional[str] = 
None,
+   max_results: Optional[int] = None,
+   page_token: Optional[str] = None) -> Dict[str, 
Union[str, int, List]]:
+"""
+Get the list of tables for a given dataset.
+see 
https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/list
 
 Review comment:
   ```suggestion
   
   .. seealso:: 
https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/list
   ```
   Looks better imo. Don't you think so? :)
   
   https://www.sphinx-doc.org/en/1.5/markup/para.html#directive-seealso
   
   It creates colored boxes: 
https://thomas-cokelaer.info/tutorials/sphinx/rest_syntax.html#colored-boxes-note-seealso-todo-and-warnings
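   As a small illustration, here is how the suggested directive might sit inside a
   docstring; the function name below is hypothetical, and only the `.. seealso::`
   block follows the suggestion above.

   ```python
   # Hypothetical function; only the ``.. seealso::`` block mirrors the suggestion.
   # Sphinx renders the directive as a highlighted "see also" box.
   def get_dataset_tables_sketch():
       """
       Get the list of tables for a given dataset.

       .. seealso::
           https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/list
       """
   ```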


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feluelle commented on a change in pull request #6151: [AIRFLOW-5522] BQ list dataset tables operator

2019-09-21 Thread GitBox
feluelle commented on a change in pull request #6151: [AIRFLOW-5522] BQ list 
dataset tables operator
URL: https://github.com/apache/airflow/pull/6151#discussion_r326870004
 
 

 ##
 File path: airflow/gcp/hooks/bigquery.py
 ##
 @@ -1430,6 +1430,39 @@ def cancel_query(self) -> None:
   self.running_job_id)
 time.sleep(5)
 
+def get_dataset_tables(self, dataset_id: str, project_id: Optional[str] = 
None,
+   max_results: Optional[int] = None,
+   page_token: Optional[str] = None) -> Dict[str, 
Union[str, int, List]]:
+"""
+Get the list of tables for a given dataset.
+see 
https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/list
+
+:param dataset_id: the dataset ID of the requested dataset.
+:type dataset_id: str
+:param project_id: (Optional) the project of the requested dataset. If 
None,
+self.project_id will be used.
+:type project_id: str
+:param max_results: (Optional) the maximum number of tables to return.
+:type max_results: int
+:param page_token: (Optional) page token, returned from a previous 
call,
+identifying the result set.
+:type page_token: str
+
+:return: map containing the list of tables + metadata.
+"""
+optional_params = {}
+if max_results:
+optional_params['maxResults'] = max_results
+if page_token:
+optional_params['pageToken'] = page_token
+
+dataset_project_id = project_id if project_id else self.project_id
 
 Review comment:
   ```suggestion
   dataset_project_id = project_id or self.project_id
   ```
   See: https://docs.python.org/3/library/stdtypes.html#truth-value-testing
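   As a self-contained illustration of the truth-value point (the class and method
   names here are made up, not Airflow's): `x or y` yields `y` whenever `x` is
   falsy, so the fallback also triggers on an empty string, not only on `None`.

   ```python
   from typing import Optional


   class ExampleHook:
       """Illustrative stand-in for a hook that carries a default project_id."""

       def __init__(self, project_id: str) -> None:
           self.project_id = project_id

       def resolve_project_id(self, project_id: Optional[str] = None) -> str:
           # Shorter spelling of: project_id if project_id else self.project_id
           return project_id or self.project_id


   hook = ExampleHook(project_id="default-project")
   assert hook.resolve_project_id() == "default-project"
   assert hook.resolve_project_id("explicit-project") == "explicit-project"
   ```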


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feluelle commented on a change in pull request #6151: [AIRFLOW-5522] BQ list dataset tables operator

2019-09-21 Thread GitBox
feluelle commented on a change in pull request #6151: [AIRFLOW-5522] BQ list 
dataset tables operator
URL: https://github.com/apache/airflow/pull/6151#discussion_r326869776
 
 

 ##
 File path: airflow/gcp/operators/bigquery.py
 ##
 @@ -1132,6 +1132,65 @@ def execute(self, context):
 project_id=self.project_id)
 
 
+class BigQueryGetDatasetTablesOperator(BaseOperator):
+"""
+This operator retrieves the list of tables in the specified dataset.
+
+:param dataset_id: the dataset ID of the requested dataset.
+:type dataset_id: str
+:param project_id: (Optional) the project of the requested dataset. If 
None,
+self.project_id will be used.
+:type project_id: str
+:param max_results: (Optional) the maximum number of tables to return.
+:type max_results: int
+:param page_token: (Optional) page token, returned from a previous call,
+identifying the result set.
+:type page_token: str
+:param gcp_conn_id: (Optional) The connection ID used to connect to Google 
Cloud Platform.
+:type gcp_conn_id: str
+:param delegate_to: (Optional) The account to impersonate, if any.
+For this to work, the service account making the request must have 
domain-wide
+delegation enabled.
+:type delegate_to: str
+
+:rtype: dict
+
https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/list#response-body
 
 Review comment:
   ```suggestion
   
   .. seealso:: 
https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/list#response-body
   ```
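   For context, a hedged usage sketch of the operator this docstring describes,
   assuming the PR is merged as proposed; the import path follows the diff's file
   path, and the IDs and connection ID are placeholders.

   ```python
   from airflow.gcp.operators.bigquery import BigQueryGetDatasetTablesOperator

   list_tables = BigQueryGetDatasetTablesOperator(
       task_id='list_dataset_tables',
       dataset_id='my_dataset',             # placeholder dataset ID
       project_id='my-gcp-project',         # optional; hook project used if omitted
       max_results=50,                      # optional cap on returned tables
       gcp_conn_id='google_cloud_default',  # placeholder connection ID
   )
   ```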


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feluelle commented on a change in pull request #6151: [AIRFLOW-5522] BQ list dataset tables operator

2019-09-21 Thread GitBox
feluelle commented on a change in pull request #6151: [AIRFLOW-5522] BQ list 
dataset tables operator
URL: https://github.com/apache/airflow/pull/6151#discussion_r326869867
 
 

 ##
 File path: tests/gcp/operators/test_bigquery.py
 ##
 @@ -566,3 +566,26 @@ def test_execute(self, mock_hook):
 deletion_dataset_table=deletion_dataset_table,
 ignore_if_missing=ignore_if_missing
 )
+
+
+class TestBigQueryGetDatasetTablesOperator(unittest.TestCase):
+@mock.patch('airflow.gcp.operators.bigquery.BigQueryHook')
+def test_execute(self, mock_hook):
+operator = BigQueryGetDatasetTablesOperator(
+task_id=TASK_ID,
+dataset_id=TEST_DATASET,
+project_id=TEST_GCP_PROJECT_ID,
+max_results=2
+)
+
+operator.execute(None)
 
 Review comment:
   ```suggestion
   operator.execute(context=None)
   ```
   just my preference - change it only if you agree :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services




[jira] [Commented] (AIRFLOW-5519) Fix missing apply default in sql_to_gcs operator

2019-09-21 Thread Jarek Potiuk (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935120#comment-16935120
 ] 

Jarek Potiuk commented on AIRFLOW-5519:
---

[~yimingl] I cherry-picked it for 1.10.6 :).

> Fix missing apply default in sql_to_gcs operator
> 
>
> Key: AIRFLOW-5519
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5519
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 1.10.4
>Reporter: Yiming Liu
>Assignee: Yiming Liu
>Priority: Major
> Fix For: 1.10.6
>
>
>  
> We don't have an apply_defaults decorator in the sql_to_gcs operator, which 
> causes multi-level default arguments to go missing.
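
For readers following along, a minimal sketch of what the fix amounts to. It
assumes the decorator lives at airflow.utils.decorators.apply_defaults (as in the
Airflow versions listed above); the operator name and fields are illustrative,
not the actual class from the PR.

```python
from airflow.models import BaseOperator
from airflow.utils.decorators import apply_defaults


class ExampleSQLToGCSOperator(BaseOperator):
    """Illustrative operator; only the @apply_defaults usage is the point here."""

    # Without @apply_defaults, per-operator values coming from a DAG's
    # default_args are not merged into the keyword arguments of __init__.
    @apply_defaults
    def __init__(self, sql, bucket, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.sql = sql
        self.bucket = bucket

    def execute(self, context):
        self.log.info("Exporting results of %s to gs://%s", self.sql, self.bucket)
```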



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (AIRFLOW-5519) Fix missing apply default in sql_to_gcs operator

2019-09-21 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-5519:
--
Fix Version/s: (was: 2.0.0)
   1.10.6

> Fix missing apply default in sql_to_gcs operator
> 
>
> Key: AIRFLOW-5519
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5519
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 1.10.4
>Reporter: Yiming Liu
>Assignee: Yiming Liu
>Priority: Major
> Fix For: 1.10.6
>
>
>  
> We don't have an apply_defaults decorator in the sql_to_gcs operator, which 
> causes multi-level default arguments to go missing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-5519) Fix missing apply default in sql_to_gcs operator

2019-09-21 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935117#comment-16935117
 ] 

ASF subversion and git services commented on AIRFLOW-5519:
--

Commit ec4c6dbb1f7866f2aefd28f9a7c19efba0782503 in airflow's branch 
refs/heads/v1-10-test from Yiming Liu
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=ec4c6db ]

[AIRFLOW-5519] Fix sql_to_gcs operator missing multi-level default args by 
adding apply_defaults decorator  (#6146)

(cherry picked from commit d313d8d24b1969be9154b555dd91466a2489e1c7)


> Fix missing apply default in sql_to_gcs operator
> 
>
> Key: AIRFLOW-5519
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5519
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 1.10.4
>Reporter: Yiming Liu
>Assignee: Yiming Liu
>Priority: Major
> Fix For: 2.0.0
>
>
>  
> We don't have an apply_defaults decorator in the sql_to_gcs operator, which 
> causes multi-level default arguments to go missing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] feluelle commented on a change in pull request #5998: [AIRFLOW-5398] Update contrib example DAGs to context manager

2019-09-21 Thread GitBox
feluelle commented on a change in pull request #5998: [AIRFLOW-5398] Update 
contrib example DAGs to context manager
URL: https://github.com/apache/airflow/pull/5998#discussion_r326869541
 
 

 ##
 File path: airflow/contrib/example_dags/example_winrm_operator.py
 ##
 @@ -37,39 +37,40 @@
 from airflow.models import DAG
 from airflow.operators.dummy_operator import DummyOperator
 
-args = {
+
+default_args = {
 'owner': 'Airflow',
 'start_date': airflow.utils.dates.days_ago(2)
 }
 
-dag = DAG(
-dag_id='POC_winrm_parallel', default_args=args,
+with DAG(
+dag_id='POC_winrm_parallel',
+default_args=default_args,
 schedule_interval='0 0 * * *',
-dagrun_timeout=timedelta(minutes=60))
+dagrun_timeout=timedelta(minutes=60)
+) as dag:
 
-cmd = 'ls -l'
-run_this_last = DummyOperator(task_id='run_this_last', dag=dag)
+cmd = 'ls -l'
+run_this_last = DummyOperator(task_id='run_this_last', dag=dag)
 
 Review comment:
   ```suggestion
   run_this_last = DummyOperator(task_id='run_this_last')
   ```
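   A minimal sketch of the pattern behind this suggestion, with a made-up dag_id:
   operators created inside the `with DAG(...)` block are attached to that DAG
   automatically, so the explicit `dag=dag` argument is redundant.

   ```python
   from datetime import timedelta

   import airflow
   from airflow.models import DAG
   from airflow.operators.dummy_operator import DummyOperator

   default_args = {
       'owner': 'Airflow',
       'start_date': airflow.utils.dates.days_ago(2),
   }

   with DAG(
       dag_id='example_context_manager_sketch',
       default_args=default_args,
       schedule_interval='0 0 * * *',
       dagrun_timeout=timedelta(minutes=60),
   ) as dag:
       # No dag=dag argument: the context manager assigns the DAG for us.
       run_this_last = DummyOperator(task_id='run_this_last')
   ```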


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] feluelle commented on a change in pull request #5998: [AIRFLOW-5398] Update contrib example DAGs to context manager

2019-09-21 Thread GitBox
feluelle commented on a change in pull request #5998: [AIRFLOW-5398] Update 
contrib example DAGs to context manager
URL: https://github.com/apache/airflow/pull/5998#discussion_r326869516
 
 

 ##
 File path: airflow/contrib/example_dags/example_papermill_operator.py
 ##
 @@ -0,0 +1,52 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+This DAG will use Papermill to run the notebook "hello_world", based on the 
execution date
+it will create an output notebook "out-". All fields, including the keys 
in the parameters, are
+templated.
+"""
+
+from datetime import timedelta
+
+import airflow
+
+from airflow.models import DAG
+from airflow.operators.papermill_operator import PapermillOperator
+
+
+default_args = {
+'owner': 'Airflow',
+'start_date': airflow.utils.dates.days_ago(2)
+}
+
+with DAG(
+dag_id='example_papermill_operator',
+default_args=default_args,
+schedule_interval='0 0 * * *',
+dagrun_timeout=timedelta(minutes=60)
+) as dag:
+# [START howto_operator_papermill]
+run_this = PapermillOperator(
+task_id="run_example_notebook",
+dag=dag,
 
 Review comment:
   ```suggestion
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on a change in pull request #5944: [AIRFLOW-5362] Reorder imports

2019-09-21 Thread GitBox
potiuk commented on a change in pull request #5944: [AIRFLOW-5362] Reorder 
imports
URL: https://github.com/apache/airflow/pull/5944#discussion_r326869430
 
 

 ##
 File path: airflow/__init__.py
 ##
 @@ -23,39 +23,39 @@
 implement their own login mechanisms by providing an `airflow_login` module
 in their PYTHONPATH. airflow_login should be based off the
 `airflow.www.login`
-"""
 
-from typing import Optional, Callable
-from airflow import version
-from airflow.utils.log.logging_mixin import LoggingMixin
+isort:skip_file
 
 Review comment:
   BTW, I just merged the pylint changes a few days ago, so rebasing this one 
and re-running isort should be good to go @KevinYang21 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #6161: [AIRFLOW-XXX] Fix incorrect backticks in BREEZE.rst

2019-09-21 Thread GitBox
potiuk commented on issue #6161: [AIRFLOW-XXX] Fix incorrect backticks in 
BREEZE.rst
URL: https://github.com/apache/airflow/pull/6161#issuecomment-533819382
 
 
    


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #6160: [AIRFLOW-XXX] Fix Prerequisites link in BREEZE.rst

2019-09-21 Thread GitBox
potiuk commented on issue #6160: [AIRFLOW-XXX] Fix Prerequisites link in 
BREEZE.rst
URL: https://github.com/apache/airflow/pull/6160#issuecomment-533819367
 
 
   Thanks @XD-DENG and @kaxil :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-5529) Apache Drill hook

2019-09-21 Thread Sayed Mohammad Hossein Torabi (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sayed Mohammad Hossein Torabi updated AIRFLOW-5529:
---
Summary:  Apache Drill hook  (was: Apache Drill hook)

>  Apache Drill hook
> --
>
> Key: AIRFLOW-5529
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5529
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: hooks
>Affects Versions: 1.10.5
>Reporter: Sayed Mohammad Hossein Torabi
>Assignee: Sayed Mohammad Hossein Torabi
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (AIRFLOW-5529) Apache Drill hook

2019-09-21 Thread Sayed Mohammad Hossein Torabi (Jira)
Sayed Mohammad Hossein Torabi created AIRFLOW-5529:
--

 Summary: Apache Drill hook
 Key: AIRFLOW-5529
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5529
 Project: Apache Airflow
  Issue Type: New Feature
  Components: hooks
Affects Versions: 1.10.5
Reporter: Sayed Mohammad Hossein Torabi
Assignee: Sayed Mohammad Hossein Torabi






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow] nuclearpinguin commented on a change in pull request #6096: [AIRFLOW-5477] Rewrite Google PubSub Hook to Google Cloud Python

2019-09-21 Thread GitBox
nuclearpinguin commented on a change in pull request #6096: [AIRFLOW-5477] 
Rewrite Google PubSub Hook to Google Cloud Python
URL: https://github.com/apache/airflow/pull/6096#discussion_r326855817
 
 

 ##
 File path: airflow/gcp/sensors/pubsub.py
 ##
 @@ -98,11 +110,17 @@ def execute(self, context):
 def poke(self, context):
 hook = PubSubHook(gcp_conn_id=self.gcp_conn_id,
   delegate_to=self.delegate_to)
-self._messages = hook.pull(
-self.project, self.subscription, self.max_messages,
-self.return_immediately)
+pulled_messages = hook.pull(
+project_id=self.project_id,
+subscription=self.subscription,
+max_messages=self.max_messages,
+return_immediately=self.return_immediately
+)
+
+self._messages = [MessageToDict(m) for m in pulled_messages]
+
 if self._messages and self.ack_messages:
 if self.ack_messages:
 
 Review comment:
   ```suggestion
   ```
   This is a duplicate check; the outer `if` already tests `self.ack_messages`. 
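
   A small self-contained illustration (the sensor class is a stand-in, not the
   Airflow one): with the nested check removed, the outer condition alone decides
   whether to acknowledge.

   ```python
   class ExampleSensor:
       """Stand-in for the PubSub sensor; only the condition structure matters."""

       def __init__(self, ack_messages: bool) -> None:
           self.ack_messages = ack_messages
           self._messages = []

       def poke(self) -> bool:
           self._messages = ["msg-1"]  # pretend one message was pulled
           # The outer `and` already covers ack_messages, so no inner `if` is needed.
           if self._messages and self.ack_messages:
               print("acknowledging pulled messages")
           return bool(self._messages)


   assert ExampleSensor(ack_messages=True).poke()
   ```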


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] stale[bot] commented on issue #5443: [AIRFLOW-4823] modified the source to have create_default_connections flag in config

2019-09-21 Thread GitBox
stale[bot] commented on issue #5443: [AIRFLOW-4823] modified the source to have 
create_default_connections flag in config
URL: https://github.com/apache/airflow/pull/5443#issuecomment-533782605
 
 
   This issue has been automatically marked as stale because it has not had 
recent activity. It will be closed if no further activity occurs. Thank you for 
your contributions.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] mortenbpost commented on issue #6014: [AIRFLOW-5342] Fix MSSQL breaking task_instance db migration

2019-09-21 Thread GitBox
mortenbpost commented on issue #6014: [AIRFLOW-5342] Fix MSSQL breaking 
task_instance db migration
URL: https://github.com/apache/airflow/pull/6014#issuecomment-533781995
 
 
   > @potiuk If we do edit these `airflow/migrations/versions` files, we should 
re-enable Pylint for them.
   
   Sure, as long as the alembic.op module is ignored, I think it's okay!


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on issue #6149: [AIRFLOW-XXX] Fix typo and format error

2019-09-21 Thread GitBox
potiuk commented on issue #6149: [AIRFLOW-XXX] Fix typo and format error
URL: https://github.com/apache/airflow/pull/6149#issuecomment-533777150
 
 
   Thanks @haoliang7 @XD-DENG -> happy to see the documentation is being read :)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services