[jira] [Closed] (AIRFLOW-2942) pypi not updated to 1.10

2018-08-27 Thread Matthias (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias closed AIRFLOW-2942.
-
Resolution: Done

I'm closing this myself as pypi ([https://pypi.org/project/apache-airflow/]) 
has been updated with version 1.10.

 

Thanks

> pypi not updated to 1.10
> 
>
> Key: AIRFLOW-2942
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2942
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10
>Reporter: Matthias
>Priority: Major
>
> According to 
> [https://cwiki.apache.org/confluence/display/AIRFLOW/Announcements,] airflow 
> 1.10 is out since August 20th. 
> Pypi is still serving 1.9.0 ([https://pypi.org/project/apache-airflow/)] as 
> of today.
>  
> Installing via `pip install apache-airflow==1.10.0` does not work as that 
> release is not available on pypi.
> output:
> ```
>  Collecting apache-airflow==1.10.0
>  Could not find a version that satisfies the requirement 
> apache-airflow==1.10.0 (from versions: 1.8.1, 1.8.2rc1, 1.8.2, 1.9.0)
>  No matching distribution found for apache-airflow==1.10.0
>  ```
>  
> Fix should be trivial - pushing the new release to pypi.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2965) Add CLI command to find the next dag run.

2018-08-27 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594518#comment-16594518
 ] 

jack commented on AIRFLOW-2965:
---

[~sanand] I have no knowledge of how to implement this. Anyone is welcome to 
take it and submit a PR :)

> Add CLI command to find the next dag run.
> -
>
> Key: AIRFLOW-2965
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2965
> Project: Apache Airflow
>  Issue Type: Task
>Reporter: jack
>Priority: Minor
> Fix For: 1.10.1
>
>
> I have a dag with the following properties:
> {code:java}
> dag = DAG(
>     dag_id='mydag',
>     default_args=args,
>     schedule_interval='0 1 * * *',
>     max_active_runs=1,
>     catchup=False){code}
>  
>  
> This runs great.
> Last run is: 2018-08-26 01:00  (start date is 2018-08-27 01:00)
>  
> Now it's 2018-08-27 17:55 and I decided to change my dag to:
>  
> {code:java}
> dag = DAG(
>     dag_id='mydag',
>     default_args=args,
>     schedule_interval='0 23 * * *',
>     max_active_runs=1,
>     catchup=False){code}
>  
> Now, I have no idea when the next dag run will be.
> Will it be today at 23:00? I can't be sure when the cycle is complete. I'm 
> not even sure that this change will do what I wish.
> I'm sure you guys are experts and can answer this question, but most of us 
> wouldn't know.
>  
> The scheduler already knows when the dag is due to run. All I'm asking is to 
> expose that knowledge as a CLI command: given a dag_id, it would tell me the 
> next date/hour at which my dag will be runnable.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2965) Add CLI command to find the next dag run.

2018-08-27 Thread Siddharth Anand (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594394#comment-16594394
 ] 

Siddharth Anand commented on AIRFLOW-2965:
--

This is a good ask! I recall it was requested a long time ago and somehow never 
made it to reality! Would you like to take a crack at it? You can send a note 
to the Dev list if someone would like to take it on as well!

-s

> Add CLI command to find the next dag run.
> -
>
> Key: AIRFLOW-2965
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2965
> Project: Apache Airflow
>  Issue Type: Task
>Reporter: jack
>Priority: Minor
> Fix For: 1.10.1
>
>
> I have a dag with the following properties:
> {code:java}
> dag = DAG(
>     dag_id='mydag',
>     default_args=args,
>     schedule_interval='0 1 * * *',
>     max_active_runs=1,
>     catchup=False){code}
>  
>  
> This runs great.
> Last run is: 2018-08-26 01:00  (start date is 2018-08-27 01:00)
>  
> Now it's 2018-08-27 17:55 and I decided to change my dag to:
>  
> {code:java}
> dag = DAG(
>     dag_id='mydag',
>     default_args=args,
>     schedule_interval='0 23 * * *',
>     max_active_runs=1,
>     catchup=False){code}
>  
> Now, I have no idea when the next dag run will be.
> Will it be today at 23:00? I can't be sure when the cycle is complete. I'm 
> not even sure that this change will do what I wish.
> I'm sure you guys are experts and can answer this question, but most of us 
> wouldn't know.
>  
> The scheduler already knows when the dag is due to run. All I'm asking is to 
> expose that knowledge as a CLI command: given a dag_id, it would tell me the 
> next date/hour at which my dag will be runnable.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (AIRFLOW-2962) PiPy Package not published for Airflow 1.10.0

2018-08-27 Thread Siddharth Anand (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Anand closed AIRFLOW-2962.

Resolution: Fixed

> PiPy Package not published for Airflow 1.10.0
> -
>
> Key: AIRFLOW-2962
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2962
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ci
>Affects Versions: 1.10.0, 1.10
> Environment: Python 3
> Docker
>Reporter: Arihant
>Assignee: Siddharth Anand
>Priority: Major
>  Labels: CI, build
>
> In the Airflow 
> [announcements|https://cwiki.apache.org/confluence/display/AIRFLOW/Announcements#Announcements-Aug20,2018]
>  it was mentioned that the package is available via PyPi:
> Pypi : [https://pypi.python.org/pypi/apache-airflow] (Run `pip install 
> apache-airflow`)
>  
> But the release version is not available there.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2962) PiPy Package not published for Airflow 1.10.0

2018-08-27 Thread Siddharth Anand (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594388#comment-16594388
 ] 

Siddharth Anand commented on AIRFLOW-2962:
--

This works now!

(venv) sianand@LM-SJN-21002367:~/airflow_release $ pip install apache-airflow

...

(venv) sianand@LM-SJN-21002367:~/airflow_release $ airflow version
[2018-08-27 18:36:16,686] \{__init__.py:51} INFO - Using executor 
SequentialExecutor
/Users/sianand/airflow_release/venv/lib/python3.6/site-packages/airflow/bin/cli.py:1595:
 DeprecationWarning: The celeryd_concurrency option in [celery] has been 
renamed to worker_concurrency - the old setting has been used, but please 
update your config.
  default=conf.get('celery', 'worker_concurrency')),

  [Airflow ASCII-art banner]
   v1.10.0

> PiPy Package not published for Airflow 1.10.0
> -
>
> Key: AIRFLOW-2962
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2962
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ci
>Affects Versions: 1.10.0, 1.10
> Environment: Python 3
> Docker
>Reporter: Arihant
>Assignee: Siddharth Anand
>Priority: Major
>  Labels: CI, build
>
> In the Airflow 
> [announcements|https://cwiki.apache.org/confluence/display/AIRFLOW/Announcements#Announcements-Aug20,2018]
>  it was mentioned that the package is available via PyPi:
> Pypi : [https://pypi.python.org/pypi/apache-airflow] (Run `pip install 
> apache-airflow`)
>  
> But the release version is not available there.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (AIRFLOW-2962) PiPy Package not published for Airflow 1.10.0

2018-08-27 Thread Siddharth Anand (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Anand reassigned AIRFLOW-2962:


Assignee: Siddharth Anand

> PiPy Package not published for Airflow 1.10.0
> -
>
> Key: AIRFLOW-2962
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2962
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ci
>Affects Versions: 1.10.0, 1.10
> Environment: Python 3
> Docker
>Reporter: Arihant
>Assignee: Siddharth Anand
>Priority: Major
>  Labels: CI, build
>
> In the Airflow 
> [announcements|https://cwiki.apache.org/confluence/display/AIRFLOW/Announcements#Announcements-Aug20,2018]
>  it was mentioned that the package is available via PyPi:
> Pypi : [https://pypi.python.org/pypi/apache-airflow] (Run `pip install 
> apache-airflow`)
>  
> But the release version is not available there.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] r39132 edited a comment on issue #3656: [AIRFLOW-2803] Fix all ESLint issues

2018-08-27 Thread GitBox
r39132 edited a comment on issue #3656: [AIRFLOW-2803] Fix all ESLint issues
URL: 
https://github.com/apache/incubator-airflow/pull/3656#issuecomment-416415615
 
 
   @tedmiston Please resolve conflicts and squash your commits. @verdan thx for 
your feedback. It would be great if you could do one more pass after the rebase 
& squash! Also, can you provide steps that I can run to verify the fixes? Are 
the steps you outline above (i.e. `npm run lint`) still valid? Thx for your 
work on this btw!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] r39132 commented on issue #3656: [AIRFLOW-2803] Fix all ESLint issues

2018-08-27 Thread GitBox
r39132 commented on issue #3656: [AIRFLOW-2803] Fix all ESLint issues
URL: 
https://github.com/apache/incubator-airflow/pull/3656#issuecomment-416415615
 
 
   @tedmiston Please resolve conflicts and squash your commits. @verdan thx for 
your feedback. It would be great if you could do one more pass after the rebase 
& squash!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] cHYzZQo commented on issue #3660: [AIRFLOW-2817] Force explicit choice on GPL dependency

2018-08-27 Thread GitBox
cHYzZQo commented on issue #3660: [AIRFLOW-2817] Force explicit choice on GPL 
dependency
URL: 
https://github.com/apache/incubator-airflow/pull/3660#issuecomment-416403083
 
 
   another option might be to have a bracket invocation that can be used, i.e.:
   
   ```
   airflow[gpl]
   ```
   
   or 
   
   ```
   airflow[non-gpl]
   ```
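   A hedged sketch of how such bracket invocations could be expressed as setuptools extras; the package name, extra names, and the unidecode/text-unidecode split are illustrative assumptions, not the project's actual setup.py:
   
   ```python
   # Illustrative setup.py fragment only -- names and dependency choices are
   # assumptions for the sake of the example, not apache-airflow's packaging.
   from setuptools import setup
   
   setup(
       name='example-workflow-package',
       version='0.0.1',
       install_requires=[
           # licence-neutral core dependencies go here
       ],
       extras_require={
           'gpl': ['unidecode'],           # GPL-licensed transliteration backend
           'non-gpl': ['text-unidecode'],  # Artistic-licensed alternative
       },
   )
   
   # Users would then opt in explicitly:
   #   pip install 'example-workflow-package[gpl]'
   #   pip install 'example-workflow-package[non-gpl]'
   ```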


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] cHYzZQo commented on issue #3660: [AIRFLOW-2817] Force explicit choice on GPL dependency

2018-08-27 Thread GitBox
cHYzZQo commented on issue #3660: [AIRFLOW-2817] Force explicit choice on GPL 
dependency
URL: 
https://github.com/apache/incubator-airflow/pull/3660#issuecomment-416400378
 
 
   This is such an anti-pattern IMO: making the package install broken by 
default. It's also a backward-incompatible change that breaks everyone's 
current install scripts. I'd request you reconsider. If we really can't use a 
GPL package by default, print a warning and use the non-GPL option. Packages 
shouldn't just refuse to install by default.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2964) Lazy generation of the job description

2018-08-27 Thread Victor Jimenez (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594200#comment-16594200
 ] 

Victor Jimenez commented on AIRFLOW-2964:
-

That makes sense. I will post a question there.

> Lazy generation of the job description
> --
>
> Key: AIRFLOW-2964
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2964
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, operators
>Affects Versions: Airflow 1.9.0, 1.10
>Reporter: Victor Jimenez
>Priority: Major
>
> When instantiating a {{DatabricksSubmitRunOperator}} users need to pass the 
> description of the job that will later be executed on Databricks.
> The job description is only needed at execution time (when the hook is 
> called). However, the {{json}} parameter must already have the full job 
> description when constructing the operator. This may present a problem if 
> computing the job description needs to execute expensive operations (e.g., 
> querying a database). The expensive operation will be invoked every single 
> time the DAG is reprocessed (which may happen quite frequently).
> It would be good to have an equivalent mechanism to the {{python_callable}} 
> parameter in the {{PythonOperator}}. In this way, users could pass a function 
> that would generate the job description only when the operator is actually 
> executed.
> [~andrewmchen] Is there any other way to do this? If not, does this sound 
> reasonable? I can create a PR implementing this proposal.
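A minimal sketch of the proposed mechanism, assuming a thin wrapper subclass; the json_callable parameter and the wrapper class are illustrative, not an existing Airflow API:

{code:python}
# Sketch only: defer building the Databricks job description to execute(),
# mirroring PythonOperator's python_callable. The json_callable parameter and
# this wrapper class are hypothetical, not part of Airflow.
from airflow.contrib.operators.databricks_operator import DatabricksSubmitRunOperator


class LazyDatabricksSubmitRunOperator(DatabricksSubmitRunOperator):
    def __init__(self, json_callable, **kwargs):
        # Register an empty spec now; the real one is produced at run time.
        super(LazyDatabricksSubmitRunOperator, self).__init__(json={}, **kwargs)
        self._json_callable = json_callable

    def execute(self, context):
        # The expensive work (e.g. querying a database) happens only here,
        # not every time the scheduler re-parses the DAG file.
        self.json = self._json_callable(context)
        return super(LazyDatabricksSubmitRunOperator, self).execute(context)
{code}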



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] codecov-io edited a comment on issue #3806: [AIRFLOW-2956] added kubernetes tolerations to kubernetes pod operator

2018-08-27 Thread GitBox
codecov-io edited a comment on issue #3806: [AIRFLOW-2956] added kubernetes 
tolerations to kubernetes pod operator
URL: 
https://github.com/apache/incubator-airflow/pull/3806#issuecomment-416340975
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3806?src=pr=h1)
 Report
   > Merging 
[#3806](https://codecov.io/gh/apache/incubator-airflow/pull/3806?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/1801baefe44f361010c23e6ec4ee8b8569eab82d?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3806/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3806?src=pr=tree)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #3806   +/-   ##
   =======================================
     Coverage   77.41%   77.41%           
   =======================================
     Files         203      203           
     Lines       15810    15810           
   =======================================
     Hits        12239    12239           
     Misses       3571     3571
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3806?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/operators/\_\_init\_\_.py](https://codecov.io/gh/apache/incubator-airflow/pull/3806/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvX19pbml0X18ucHk=)
 | `61.11% <ø> (ø)` | :arrow_up: |
   | 
[airflow/sensors/http\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3806/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL2h0dHBfc2Vuc29yLnB5)
 | `96.42% <ø> (ø)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3806?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3806?src=pr=footer).
 Last update 
[1801bae...e6f5f22](https://codecov.io/gh/apache/incubator-airflow/pull/3806?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] shenni119 closed pull request #3812: editting zendesk_hook.py https://citybase.atlassian.net/browse/DS-78

2018-08-27 Thread GitBox
shenni119 closed pull request #3812: editting zendesk_hook.py 
https://citybase.atlassian.net/browse/DS-78 
URL: https://github.com/apache/incubator-airflow/pull/3812
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/hooks/zendesk_hook.py b/airflow/hooks/zendesk_hook.py
index 3cf8353344..2579d7f30b 100644
--- a/airflow/hooks/zendesk_hook.py
+++ b/airflow/hooks/zendesk_hook.py
@@ -78,7 +78,7 @@ def call(self, path, query=None, get_all_pages=True, side_loading=False):
         next_page = results['next_page']
         if side_loading:
             keys += query['include'].split(',')
-        results = {key: results[key] for key in keys}
+            results = {key: results[key] for key in keys}
 
         if get_all_pages:
             while next_page is not None:


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] shenni119 opened a new pull request #3812: editting zendesk_hook.py https://citybase.atlassian.net/browse/DS-78

2018-08-27 Thread GitBox
shenni119 opened a new pull request #3812: editting zendesk_hook.py 
https://citybase.atlassian.net/browse/DS-78 
URL: https://github.com/apache/incubator-airflow/pull/3812
 
 
   **Description:**
   When I am using the zendesk_hook.py file in incubator-airflow/airflow/hooks, I 
am getting an error: "KeyError: 'search'". The error is due to line 81 in the 
zendesk_hook file, "results = {key: results[key] for key in keys}".
   
   I believe line 81 needs an extra level of indentation, because this line 
currently runs regardless of whether side_loading is set to true. I believe it 
should only run when side_loading is set to true.
   
   **Tests:**
   My code runs once the indentation has been added.
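   For reference, a short sketch of the indented form the PR describes (context reconstructed from the description above; the surrounding call() body is abbreviated):
   
   ```python
   # Abbreviated sketch of the proposed change in ZendeskHook.call(): the key
   # filtering only runs when side_loading was requested.
   if side_loading:
       keys += query['include'].split(',')
       results = {key: results[key] for key in keys}
   ```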
   
   
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [ ] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #3570: [AIRFLOW-2709] Improve error handling in Databricks hook

2018-08-27 Thread GitBox
codecov-io edited a comment on issue #3570: [AIRFLOW-2709] Improve error 
handling in Databricks hook
URL: 
https://github.com/apache/incubator-airflow/pull/3570#issuecomment-401965037
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3570?src=pr=h1)
 Report
   > :exclamation: No coverage uploaded for pull request base 
(`master@fc10f7e`). [Click here to learn what that 
means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3570/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3570?src=pr=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##             master    #3570   +/-   ##
   ==========================================
     Coverage          ?   77.41%           
   ==========================================
     Files             ?      203           
     Lines             ?    15810           
     Branches          ?        0           
   ==========================================
     Hits              ?    12239           
     Misses            ?     3571           
     Partials          ?        0
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3570?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3570?src=pr=footer).
 Last update 
[fc10f7e...fdec2d0](https://codecov.io/gh/apache/incubator-airflow/pull/3570?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services



[jira] [Commented] (AIRFLOW-2964) Lazy generation of the job description

2018-08-27 Thread Andrew Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594159#comment-16594159
 ] 

Andrew Chen commented on AIRFLOW-2964:
--

My guess is no. Maybe we should post this as a question to dev mailing list? 
d...@airflow.incubator.apache.org

> Lazy generation of the job description
> --
>
> Key: AIRFLOW-2964
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2964
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, operators
>Affects Versions: Airflow 1.9.0, 1.10
>Reporter: Victor Jimenez
>Priority: Major
>
> When instantiating a {{DatabricksSubmitRunOperator}} users need to pass the 
> description of the job that will later be executed on Databricks.
> The job description is only needed at execution time (when the hook is 
> called). However, the {{json}} parameter must already have the full job 
> description when constructing the operator. This may present a problem if 
> computing the job description needs to execute expensive operations (e.g., 
> querying a database). The expensive operation will be invoked every single 
> time the DAG is reprocessed (which may happen quite frequently).
> It would be good to have an equivalent mechanism to the {{python_callable}} 
> parameter in the {{PythonOperator}}. In this way, users could pass a function 
> that would generate the job description only when the operator is actually 
> executed.
> [~andrewmchen] Is there any other way to do this? If not, does this sound 
> reasonable? I can create a PR implementing this proposal.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] justinholmes commented on issue #3806: [AIRFLOW-2956] added kubernetes tolerations to kubernetes pod operator

2018-08-27 Thread GitBox
justinholmes commented on issue #3806: [AIRFLOW-2956] added kubernetes 
tolerations to kubernetes pod operator
URL: 
https://github.com/apache/incubator-airflow/pull/3806#issuecomment-416347018
 
 
   Will try to add a few example tests as well.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] andrewmchen commented on issue #3570: [AIRFLOW-2709] Improve error handling in Databricks hook

2018-08-27 Thread GitBox
andrewmchen commented on issue #3570: [AIRFLOW-2709] Improve error handling in 
Databricks hook
URL: 
https://github.com/apache/incubator-airflow/pull/3570#issuecomment-416346693
 
 
   If this is ready @betabandido, could we get some help getting this merged? 
Would you be the right person @Fokko?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Comment Edited] (AIRFLOW-2964) Lazy generation of the job description

2018-08-27 Thread Victor Jimenez (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594146#comment-16594146
 ] 

Victor Jimenez edited comment on AIRFLOW-2964 at 8/27/18 7:40 PM:
--

I am not familiar at all with 
[XCom|https://airflow.apache.org/concepts.html#xcoms], but do you think using 
this feature might be a way to solve this issue?


was (Author: betabandido):
I am not familiar at all with {{XCom}}, but do you think using this feature 
might be a way to solve this issue?

> Lazy generation of the job description
> --
>
> Key: AIRFLOW-2964
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2964
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, operators
>Affects Versions: Airflow 1.9.0, 1.10
>Reporter: Victor Jimenez
>Priority: Major
>
> When instantiating a {{DatabricksSubmitRunOperator}} users need to pass the 
> description of the job that will later be executed on Databricks.
> The job description is only needed at execution time (when the hook is 
> called). However, the {{json}} parameter must already have the full job 
> description when constructing the operator. This may present a problem if 
> computing the job description needs to execute expensive operations (e.g., 
> querying a database). The expensive operation will be invoked every single 
> time the DAG is reprocessed (which may happen quite frequently).
> It would be good to have an equivalent mechanism to the {{python_callable}} 
> parameter in the {{PythonOperator}}. In this way, users could pass a function 
> that would generate the job description only when the operator is actually 
> executed.
> [~andrewmchen] Is there any other way to do this? If not, does this sound 
> reasonable? I can create a PR implementing this proposal.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (AIRFLOW-2964) Lazy generation of the job description

2018-08-27 Thread Victor Jimenez (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594146#comment-16594146
 ] 

Victor Jimenez edited comment on AIRFLOW-2964 at 8/27/18 7:37 PM:
--

I am not familiar at all with {{XCom}}, but do you think using this feature 
might be a way to solve this issue?


was (Author: betabandido):
I am not familiar at all with `XCom`, but do you think using this feature might 
be a way to solve this issue?

> Lazy generation of the job description
> --
>
> Key: AIRFLOW-2964
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2964
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, operators
>Affects Versions: Airflow 1.9.0, 1.10
>Reporter: Victor Jimenez
>Priority: Major
>
> When instantiating a {{DatabricksSubmitRunOperator}} users need to pass the 
> description of the job that will later be executed on Databricks.
> The job description is only needed at execution time (when the hook is 
> called). However, the {{json}} parameter must already have the full job 
> description when constructing the operator. This may present a problem if 
> computing the job description needs to execute expensive operations (e.g., 
> querying a database). The expensive operation will be invoked every single 
> time the DAG is reprocessed (which may happen quite frequently).
> It would be good to have an equivalent mechanism to the {{python_callable}} 
> parameter in the {{PythonOperator}}. In this way, users could pass a function 
> that would generate the job description only when the operator is actually 
> executed.
> [~andrewmchen] Is there any other way to do this? If not, does this sound 
> reasonable? I can create a PR implementing this proposal.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2964) Lazy generation of the job description

2018-08-27 Thread Victor Jimenez (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594146#comment-16594146
 ] 

Victor Jimenez commented on AIRFLOW-2964:
-

I am not familiar at all with `XCom`, but do you think using this feature might 
be a way to solve this issue?

> Lazy generation of the job description
> --
>
> Key: AIRFLOW-2964
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2964
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, operators
>Affects Versions: Airflow 1.9.0, 1.10
>Reporter: Victor Jimenez
>Priority: Major
>
> When instantiating a {{DatabricksSubmitRunOperator}} users need to pass the 
> description of the job that will later be executed on Databricks.
> The job description is only needed at execution time (when the hook is 
> called). However, the {{json}} parameter must already have the full job 
> description when constructing the operator. This may present a problem if 
> computing the job description needs to execute expensive operations (e.g., 
> querying a database). The expensive operation will be invoked every single 
> time the DAG is reprocessed (which may happen quite frequently).
> It would be good to have an equivalent mechanism to the {{python_callable}} 
> parameter in the {{PythonOperator}}. In this way, users could pass a function 
> that would generate the job description only when the operator is actually 
> executed.
> [~andrewmchen] Is there any other way to do this? If not, does this sound 
> reasonable? I can create a PR implementing this proposal.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] codecov-io edited a comment on issue #3806: [AIRFLOW-2956] added kubernetes tolerations to kubernetes pod operator

2018-08-27 Thread GitBox
codecov-io edited a comment on issue #3806: [AIRFLOW-2956] added kubernetes 
tolerations to kubernetes pod operator
URL: 
https://github.com/apache/incubator-airflow/pull/3806#issuecomment-416340975
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3806?src=pr=h1)
 Report
   > Merging 
[#3806](https://codecov.io/gh/apache/incubator-airflow/pull/3806?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/1801baefe44f361010c23e6ec4ee8b8569eab82d?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3806/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3806?src=pr=tree)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #3806   +/-   ##
   =======================================
     Coverage   77.41%   77.41%           
   =======================================
     Files         203      203           
     Lines       15810    15810           
   =======================================
     Hits        12239    12239           
     Misses       3571     3571
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3806?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/operators/\_\_init\_\_.py](https://codecov.io/gh/apache/incubator-airflow/pull/3806/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvX19pbml0X18ucHk=)
 | `61.11% <ø> (ø)` | :arrow_up: |
   | 
[airflow/sensors/http\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3806/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL2h0dHBfc2Vuc29yLnB5)
 | `96.42% <ø> (ø)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3806?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3806?src=pr=footer).
 Last update 
[1801bae...e6f5f22](https://codecov.io/gh/apache/incubator-airflow/pull/3806?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services




[GitHub] codecov-io commented on issue #3806: [AIRFLOW-2956] added kubernetes tolerations to kubernetes pod operator

2018-08-27 Thread GitBox
codecov-io commented on issue #3806: [AIRFLOW-2956] added kubernetes 
tolerations to kubernetes pod operator
URL: 
https://github.com/apache/incubator-airflow/pull/3806#issuecomment-416340975
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3806?src=pr=h1)
 Report
   > Merging 
[#3806](https://codecov.io/gh/apache/incubator-airflow/pull/3806?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/1801baefe44f361010c23e6ec4ee8b8569eab82d?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3806/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3806?src=pr=tree)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #3806   +/-   ##
   =======================================
     Coverage   77.41%   77.41%           
   =======================================
     Files         203      203           
     Lines       15810    15810           
   =======================================
     Hits        12239    12239           
     Misses       3571     3571
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3806?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/operators/\_\_init\_\_.py](https://codecov.io/gh/apache/incubator-airflow/pull/3806/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvX19pbml0X18ucHk=)
 | `61.11% <ø> (ø)` | :arrow_up: |
   | 
[airflow/sensors/http\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3806/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL2h0dHBfc2Vuc29yLnB5)
 | `96.42% <ø> (ø)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3806?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3806?src=pr=footer).
 Last update 
[1801bae...e6f5f22](https://codecov.io/gh/apache/incubator-airflow/pull/3806?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] betabandido commented on a change in pull request #3570: [AIRFLOW-2709] Improve error handling in Databricks hook

2018-08-27 Thread GitBox
betabandido commented on a change in pull request #3570: [AIRFLOW-2709] Improve 
error handling in Databricks hook
URL: https://github.com/apache/incubator-airflow/pull/3570#discussion_r213084704
 
 

 ##
 File path: airflow/contrib/hooks/databricks_hook.py
 ##
 @@ -175,6 +190,12 @@ def cancel_run(self, run_id):
         self._do_api_call(CANCEL_RUN_ENDPOINT, json)
 
 
+def _retryable_error(exception):
+    return type(exception) == requests_exceptions.ConnectionError \
 
 Review comment:
   @andrewmchen Oh, that is completely right! Otherwise derived exceptions 
won't be treated as retryable. Thanks for spotting that. I did a small change 
to the tests to cover this case too.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Closed] (AIRFLOW-2967) Beeline commands do not have quoted urls

2018-08-27 Thread Chris McLennon (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris McLennon closed AIRFLOW-2967.
---
Resolution: Fixed

> Beeline commands do not have quoted urls
> 
>
> Key: AIRFLOW-2967
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2967
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Chris McLennon
>Assignee: Chris McLennon
>Priority: Major
>
> When HiveHook uses beeline with kerberos auth, it doesn't quote the 
> connection URL, leading to an error. It should quote the connection URL.
>  
> Example:
> Airflow runs command which fails:
> {code:java}
> beeline -u 
> jdbc:hive2://hiveserver.mycompany.com:1/default;principal=hive/hiveserver.mycompany@mycompany.com;
>  -f /tmp/airflow_hiveop_OwZO45/tmpOW_gZ6{code}
> Command should be run as:
> {code:java}
> beeline -u 
> "jdbc:hive2://hiveserver.mycompany.com:1/default;principal=hive/hiveserver.mycompany@mycompany.com;"
>  -f /tmp/airflow_hiveop_OwZO45/tmpOW_gZ6{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2967) Beeline commands do not have quoted urls

2018-08-27 Thread Chris McLennon (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594105#comment-16594105
 ] 

Chris McLennon commented on AIRFLOW-2967:
-

Actually, nevermind. I just saw this has already been implemented for future 
release: 
https://github.com/apache/incubator-airflow/blob/master/airflow/hooks/hive_hooks.py#L139

> Beeline commands do not have quoted urls
> 
>
> Key: AIRFLOW-2967
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2967
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Chris McLennon
>Assignee: Chris McLennon
>Priority: Major
>
> When HiveHook uses beeline with kerberos auth, it doesn't quote the 
> connection URL, leading to an error. It should quote the connection URL.
>  
> Example:
> Airflow runs command which fails:
> {code:java}
> beeline -u 
> jdbc:hive2://hiveserver.mycompany.com:1/default;principal=hive/hiveserver.mycompany@mycompany.com;
>  -f /tmp/airflow_hiveop_OwZO45/tmpOW_gZ6{code}
> Command should be run as:
> {code:java}
> beeline -u 
> "jdbc:hive2://hiveserver.mycompany.com:1/default;principal=hive/hiveserver.mycompany@mycompany.com;"
>  -f /tmp/airflow_hiveop_OwZO45/tmpOW_gZ6{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-2967) Beeline commands do not have quoted urls

2018-08-27 Thread Chris McLennon (JIRA)
Chris McLennon created AIRFLOW-2967:
---

 Summary: Beeline commands do not have quoted urls
 Key: AIRFLOW-2967
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2967
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Chris McLennon
Assignee: Chris McLennon


When HiveHook uses beeline with kerberos auth, it doesn't quote the connection 
URL, leading to an error. It should quote the connection URL.

 

Example:

Airflow runs command which fails:
{code:java}
beeline -u 
jdbc:hive2://hiveserver.mycompany.com:1/default;principal=hive/hiveserver.mycompany@mycompany.com;
 -f /tmp/airflow_hiveop_OwZO45/tmpOW_gZ6{code}
Command should be run as:
{code:java}
beeline -u 
"jdbc:hive2://hiveserver.mycompany.com:1/default;principal=hive/hiveserver.mycompany@mycompany.com;"
 -f /tmp/airflow_hiveop_OwZO45/tmpOW_gZ6{code}
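A minimal sketch of the quoting idea, with a made-up JDBC URL and script path (shlex.quote stands in for however the hook actually escapes the argument):

{code:python}
# Sketch only: quote the JDBC URL so the ';principal=...' part is not treated
# as a shell command separator. The URL and script path are placeholders.
import shlex

jdbc_url = ("jdbc:hive2://hiveserver.example.com:10000/default;"
            "principal=hive/_HOST@EXAMPLE.COM;")

cmd = "beeline -u {} -f /tmp/example_script.hql".format(shlex.quote(jdbc_url))
print(cmd)
# beeline -u 'jdbc:hive2://hiveserver.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM;' -f /tmp/example_script.hql
{code}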



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] justinholmes commented on issue #3806: [AIRFLOW-2956] added kubernetes tolerations to kubernetes pod operator

2018-08-27 Thread GitBox
justinholmes commented on issue #3806: [AIRFLOW-2956] added kubernetes 
tolerations to kubernetes pod operator
URL: 
https://github.com/apache/incubator-airflow/pull/3806#issuecomment-416328208
 
 
   Sure.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] tedmiston commented on issue #3808: [AIRFLOW-2957] Remove obselete sensor references

2018-08-27 Thread GitBox
tedmiston commented on issue #3808: [AIRFLOW-2957] Remove obselete sensor 
references
URL: 
https://github.com/apache/incubator-airflow/pull/3808#issuecomment-416314704
 
 
   @Fokko @r39132 Thank you for catching this one I missed in #3760!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2964) Lazy generation of the job description

2018-08-27 Thread Andrew Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16593997#comment-16593997
 ] 

Andrew Chen commented on AIRFLOW-2964:
--

This certainly makes a lot of sense to me, especially if we can maintain the 
current behavior (use the callable only if it is defined). That being said, 
maybe it's a good idea to get some input from the community whether or not this 
is the right way to do things. A quick spot check of other operators seems to 
suggest they don't support this sort of lazy generation of parameters.

> Lazy generation of the job description
> --
>
> Key: AIRFLOW-2964
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2964
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, operators
>Affects Versions: Airflow 1.9.0, 1.10
>Reporter: Victor Jimenez
>Priority: Major
>
> When instantiating a {{DatabricksSubmitRunOperator}} users need to pass the 
> description of the job that will later be executed on Databricks.
> The job description is only needed at execution time (when the hook is 
> called). However, the {{json}} parameter must already have the full job 
> description when constructing the operator. This may present a problem if 
> computing the job description needs to execute expensive operations (e.g., 
> querying a database). The expensive operation will be invoked every single 
> time the DAG is reprocessed (which may happen quite frequently).
> It would be good to have an equivalent mechanism to the {{python_callable}} 
> parameter in the {{PythonOperator}}. In this way, users could pass a function 
> that would generate the job description only when the operator is actually 
> executed.
> [~andrewmchen] Is there any other way to do this? If not, does this sound 
> reasonable? I can create a PR implementing this proposal.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] andrewmchen commented on a change in pull request #3570: [AIRFLOW-2709] Improve error handling in Databricks hook

2018-08-27 Thread GitBox
andrewmchen commented on a change in pull request #3570: [AIRFLOW-2709] Improve 
error handling in Databricks hook
URL: https://github.com/apache/incubator-airflow/pull/3570#discussion_r213048323
 
 

 ##
 File path: airflow/contrib/hooks/databricks_hook.py
 ##
 @@ -175,6 +190,12 @@ def cancel_run(self, run_id):
         self._do_api_call(CANCEL_RUN_ENDPOINT, json)
 
 
+def _retryable_error(exception):
+    return type(exception) == requests_exceptions.ConnectionError \
 
 Review comment:
   nit: I think we should prefer `isinstance`. See 
https://stackoverflow.com/questions/1549801/what-are-the-differences-between-type-and-isinstance
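   For illustration, the isinstance-based form the review is pointing at might look like this (the Timeout case is an assumption about how the truncated line continues, not necessarily what the PR merges):
   
   ```python
   # Sketch of an isinstance-based retry check; the exact set of retryable
   # exceptions here is an assumption.
   from requests import exceptions as requests_exceptions
   
   
   def _retryable_error(exception):
       return isinstance(exception,
                         (requests_exceptions.ConnectionError,
                          requests_exceptions.Timeout))
   ```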


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] andscoop commented on a change in pull request #3796: [AIRFLOW-2824] - Add config to disable default conn creation

2018-08-27 Thread GitBox
andscoop commented on a change in pull request #3796: [AIRFLOW-2824] - Add 
config to disable default conn creation
URL: https://github.com/apache/incubator-airflow/pull/3796#discussion_r213043144
 
 

 ##
 File path: airflow/utils/db.py
 ##
 @@ -286,6 +284,16 @@ def initdb(rbac=False):
             conn_id='cassandra_default', conn_type='cassandra',
             host='cassandra', port=9042))
 
+
+def initdb(rbac=False):
+    session = settings.Session()
+
+    from airflow import models
 
 Review comment:
   @feng-tao It is to prevent a circular import


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-2966) KubernetesExecutor + namespace quotas kills scheduler if the pod can't be launched

2018-08-27 Thread John Hofman (JIRA)
John Hofman created AIRFLOW-2966:


 Summary: KubernetesExecutor + namespace quotas kills scheduler if 
the pod can't be launched
 Key: AIRFLOW-2966
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2966
 Project: Apache Airflow
  Issue Type: Bug
  Components: scheduler
Affects Versions: 1.10
 Environment: Kubernetes 1.9.8
Reporter: John Hofman


When running Airflow in Kubernetes with the KubernetesExecutor and resource 
quotas set on the namespace Airflow is deployed in, the scheduler gets an 
ApiException and crashes if it tries to launch a pod that would exceed the 
namespace limits.

This stack trace is an example of the ApiException from the kubernetes client:
{code:java}
[2018-08-27 09:51:08,516] {pod_launcher.py:58} ERROR - Exception when 
attempting to create Namespaced Pod.
Traceback (most recent call last):
File "/src/apache-airflow/airflow/contrib/kubernetes/pod_launcher.py", line 55, 
in run_pod_async
resp = self._client.create_namespaced_pod(body=req, namespace=pod.namespace)
File 
"/usr/local/lib/python3.6/site-packages/kubernetes/client/apis/core_v1_api.py", 
line 6057, in create_namespaced_pod
(data) = self.create_namespaced_pod_with_http_info(namespace, body, **kwargs)
File 
"/usr/local/lib/python3.6/site-packages/kubernetes/client/apis/core_v1_api.py", 
line 6142, in create_namespaced_pod_with_http_info
collection_formats=collection_formats)
File "/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py", 
line 321, in call_api
_return_http_data_only, collection_formats, _preload_content, _request_timeout)
File "/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py", 
line 155, in __call_api
_request_timeout=_request_timeout)
File "/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py", 
line 364, in request
body=body)
File "/usr/local/lib/python3.6/site-packages/kubernetes/client/rest.py", line 
266, in POST
body=body)
File "/usr/local/lib/python3.6/site-packages/kubernetes/client/rest.py", line 
222, in request
raise ApiException(http_resp=r)
kubernetes.client.rest.ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Audit-Id': 
'b00e2cbb-bdb2-41f3-8090-824aee79448c', 'Content-Type': 'application/json', 
'Date': 'Mon, 27 Aug 2018 09:51:08 GMT', 'Content-Length': '410'})
HTTP response body: 
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods
 \"podname-ec366e89ef934d91b2d3ffe96234a725\" is forbidden: exceeded quota: 
compute-resources, requested: limits.memory=4Gi, used: limits.memory=6508Mi, 
limited: 
limits.memory=10Gi","reason":"Forbidden","details":{"name":"podname-ec366e89ef934d91b2d3ffe96234a725","kind":"pods"},"code":403}{code}
 

I would expect the scheduler to catch the Exception and at least mark the task 
as failed, or better yet retry the task later.
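A hedged sketch of the kind of handling being asked for, using the kubernetes client's ApiException; the function and logger names are illustrative, not the actual pod_launcher code:

{code:python}
# Sketch only: keep a quota rejection from propagating out of pod creation.
# Function signature and logging are illustrative, not Airflow's pod_launcher.
from kubernetes.client.rest import ApiException


def create_pod_safely(client, req, namespace, log):
    try:
        return client.create_namespaced_pod(body=req, namespace=namespace)
    except ApiException as e:
        if e.status == 403:
            # e.g. "exceeded quota: compute-resources": report it and let the
            # task fail / be retried instead of crashing the scheduler.
            log.warning("Pod creation forbidden: %s", e.reason)
            return None
        raise
{code}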

 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[36/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/operators/awsbatch_operator.html
--
diff --git a/_modules/airflow/contrib/operators/awsbatch_operator.html 
b/_modules/airflow/contrib/operators/awsbatch_operator.html
new file mode 100644
index 000..33e1e8b
--- /dev/null
+++ b/_modules/airflow/contrib/operators/awsbatch_operator.html
@@ -0,0 +1,403 @@
+[Sphinx HTML page boilerplate for "airflow.contrib.operators.awsbatch_operator - Airflow Documentation" (document head, theme assets, navigation sidebar) stripped of markup in this archive; omitted]
+  Source code for airflow.contrib.operators.awsbatch_operator
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+#
+import sys
+
+from math import pow
+from time import sleep
+
+from airflow.exceptions import AirflowException
+from airflow.models import BaseOperator
+from airflow.utils import apply_defaults
+
+from airflow.contrib.hooks.aws_hook import 
AwsHook
+
+
+[docs]class AWSBatchOperator(BaseOperator):
+
+Execute a job on AWS Batch Service
+
+:param job_name: the name for the job that will run on 
AWS Batch
+:type job_name: str
+:param job_definition: the job definition name on AWS 
Batch
+:type job_definition: str
+:param queue: the queue name on AWS Batch
+:type queue: str
+:param: overrides: the same parameter that boto3 will 
receive on containerOverrides:
+
http://boto3.readthedocs.io/en/latest/reference/services/batch.html#submit_job
+:type: overrides: dict
+:param max_retries: exponential backoff retries while 
waiter is not merged
+:type max_retries: int
+:param aws_conn_id: connection id of AWS credentials / 
region name. If None,
+credential boto3 strategy will be used 
(http://boto3.readthedocs.io/en/latest/guide/configuration.html).
+:type aws_conn_id: str
+:param region_name: region name to use in AWS Hook. 
Override the region_name in connection (if provided)
+
+
+    ui_color = '#c3dae0'
+    client = None
+    arn = None
+    template_fields = ('overrides',)
+
+@apply_defaults
+def __init__(self, job_name, job_definition, queue, overrides, max_retries=288,
+ aws_conn_id=None, region_name=None, **kwargs):
+super(AWSBatchOperator, self).__init__(**kwargs)
+
+self.job_name = job_name
+self.aws_conn_id = aws_conn_id
+self.region_name = region_name
+self.job_definition = job_definition
+self.queue = queue
+self.overrides = overrides
+self.max_retries = max_retries
+
+self.jobId = None
+self.jobName = None
+
+self.hook = self.get_hook()
+
+def execute(self, context):
+self.log.info(
+Running AWS Batch Job - Job definition: 
%s - on queue %s,
+self.job_definition, self.queue
+)
+self.log.info(AWSBatchOperator overrides: 
%s, self.overrides)
+
+self.client = self.hook.get_client_type(
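For context, a hypothetical usage sketch of the AWSBatchOperator documented above; the task id, job definition, queue and the `dag` object are placeholders, not part of the commit:

```python
from airflow.contrib.operators.awsbatch_operator import AWSBatchOperator

submit_batch_job = AWSBatchOperator(
    task_id='submit_batch_job',
    job_name='example-batch-job',               # name the run will get on AWS Batch
    job_definition='example-job-definition:1',  # existing Batch job definition
    queue='example-job-queue',                  # existing Batch job queue
    overrides={'command': ['echo', 'hello']},   # forwarded to containerOverrides
    aws_conn_id='aws_default',
    region_name='us-east-1',
    dag=dag,  # assumes an existing DAG object named `dag`
)
```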

[09/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/operators/postgres_operator.html
--
diff --git a/_modules/airflow/operators/postgres_operator.html 
b/_modules/airflow/operators/postgres_operator.html
new file mode 100644
index 000..dc0669e
--- /dev/null
+++ b/_modules/airflow/operators/postgres_operator.html
@@ -0,0 +1,297 @@
+  [Sphinx HTML page scaffolding omitted: document head, theme assets, navigation sidebar (Project ... API Reference) and breadcrumbs for this generated documentation page]
+  Source code for airflow.operators.postgres_operator
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from airflow.hooks.postgres_hook import PostgresHook
+from airflow.models import BaseOperator
+from airflow.utils.decorators import apply_defaults
+
+
+class PostgresOperator(BaseOperator):
+    """
+    Executes sql code in a specific Postgres database
+
+    :param postgres_conn_id: reference to a specific postgres database
+    :type postgres_conn_id: string
+    :param sql: the sql code to be executed
+    :type sql: Can receive a str representing a sql statement,
+        a list of str (sql statements), or reference to a template file.
+        Template reference are recognized by str ending in '.sql'
+    :param database: name of database which overwrite defined one in connection
+    :type database: string
+    """
+
+    template_fields = ('sql',)
+    template_ext = ('.sql',)
+    ui_color = '#ededed'
+
+    @apply_defaults
+    def __init__(
+            self, sql,
+            postgres_conn_id='postgres_default', autocommit=False,
+            parameters=None,
+            database=None,
+            *args, **kwargs):
+        super(PostgresOperator, self).__init__(*args, **kwargs)
+        self.sql = sql
+        self.postgres_conn_id = postgres_conn_id
+        self.autocommit = autocommit
+        self.parameters = parameters
+        self.database = database
+
+    def execute(self, context):
+        self.log.info('Executing: %s', self.sql)
+        self.hook = PostgresHook(postgres_conn_id=self.postgres_conn_id,
+                                 schema=self.database)
+        self.hook.run(self.sql, self.autocommit, parameters=self.parameters)
+        for output in self.hook.conn.notices:
+            self.log.info(output)
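For context, a hypothetical usage sketch of the PostgresOperator shown above; the connection id, SQL and the `dag` object are placeholders:

```python
from airflow.operators.postgres_operator import PostgresOperator

create_table = PostgresOperator(
    task_id='create_table',
    postgres_conn_id='postgres_default',
    sql='CREATE TABLE IF NOT EXISTS events (id SERIAL PRIMARY KEY, ts TIMESTAMP)',
    database='analytics',  # overrides the schema defined on the connection
    autocommit=True,
    dag=dag,               # assumes an existing DAG object named `dag`
)
```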

[33/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/operators/dataproc_operator.html
--
diff --git a/_modules/airflow/contrib/operators/dataproc_operator.html 
b/_modules/airflow/contrib/operators/dataproc_operator.html
index d49719e..f26e459 100644
--- a/_modules/airflow/contrib/operators/dataproc_operator.html
+++ b/_modules/airflow/contrib/operators/dataproc_operator.html
@@ -91,7 +91,7 @@
 Quick Start
 Installation
 Tutorial
-Configuration
+How-to Guides
 UI / Screenshots
 Concepts
 Data Profiling
@@ -99,8 +99,10 @@
 Scheduling  Triggers
 Plugins
 Security
+Time zones
 Experimental Rest API
 Integration
+Lineage
 FAQ
 API Reference
 
@@ -169,29 +171,42 @@
   Source code for airflow.contrib.operators.dataproc_operator
 # -*- coding: utf-8 -*-
 #
-# Licensed under the Apache License, Version 2.0 (the 
License);
-# you may not use this file except in compliance with the 
License.
-# You may obtain a copy of the License at
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
 #
-# http://www.apache.org/licenses/LICENSE-2.0
+#   http://www.apache.org/licenses/LICENSE-2.0
 #
-# Unless required by applicable law or agreed to in writing, 
software
-# distributed under the License is distributed on an AS 
IS BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either 
express or implied.
-# See the License for the specific language governing 
permissions and
-# limitations under the License.
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
 #
 
+import ntpath
+import os
+import re
 import time
+import uuid
+from datetime import timedelta
 
 from airflow.contrib.hooks.gcp_dataproc_hook import DataProcHook
+from airflow.contrib.hooks.gcs_hook import 
GoogleCloudStorageHook
+from airflow.exceptions import AirflowException
 from airflow.models import BaseOperator
 from airflow.utils.decorators 
import apply_defaults
 from airflow.version import version
 from googleapiclient.errors 
import HttpError
+from airflow.utils import timezone
 
 
-class DataprocClusterCreateOperator(BaseOperator):
+[docs]class DataprocClusterCreateOperator(BaseOperator):
 
 Create a new cluster on Google Cloud Dataproc. The 
operator will wait until the
 creation is successful or an error occurs in the creation 
process.
@@ -202,9 +217,79 @@
 
 for a detailed explanation on the different parameters. 
Most of the configuration
 parameters detailed in the link are available as a 
parameter to this operator.
+
+:param cluster_name: The name of the DataProc cluster to 
create.
+:type cluster_name: string
+:param project_id: The ID of the google cloud project in 
which
+to create the cluster
+:type project_id: string
+:param num_workers: The # of workers to spin up
+:type num_workers: int
+:param storage_bucket: The storage bucket to use, setting 
to None lets dataproc
+generate a custom one for you
+:type storage_bucket: string
+:param init_actions_uris: List of GCS uris 
containing
+dataproc initialization scripts
+:type init_actions_uris: list[string]
+:param init_action_timeout: Amount of time executable 
scripts in
+init_actions_uris has to complete
+:type init_action_timeout: string
+:param metadata: dict of key-value google compute engine 
metadata entries
+to add to all instances
+:type metadata: dict
+:param image_version: the version of software inside the 
Dataproc cluster
+:type image_version: string
+:param properties: dict of properties to set on
+config files (e.g. spark-defaults.conf), see
+
https://cloud.google.com/dataproc/docs/reference/rest/v1/ \
+projects.regions.clusters#SoftwareConfig
+:type properties: dict
+:param master_machine_type: Compute engine machine type 
to use for the master node
+:type master_machine_type: string
+:param master_disk_size: Disk size for the master 
node
+:type master_disk_size: int
+:param worker_machine_type: Compute engine machine type 
to use for the worker nodes
+:type worker_machine_type: string
+:param worker_disk_size: Disk size for the worker 
nodes
+:type worker_disk_size: int
+:param num_preemptible_workers: The # of preemptible 
worker 
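For context, a hypothetical usage sketch of DataprocClusterCreateOperator with a few of the parameters documented above; the project, cluster and zone values are placeholders:

```python
from airflow.contrib.operators.dataproc_operator import DataprocClusterCreateOperator

create_cluster = DataprocClusterCreateOperator(
    task_id='create_dataproc_cluster',
    cluster_name='example-cluster',
    project_id='example-gcp-project',
    num_workers=2,
    zone='europe-west1-b',
    master_machine_type='n1-standard-2',
    worker_machine_type='n1-standard-2',
    dag=dag,  # assumes an existing DAG object named `dag`
)
```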

[42/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/hooks/pinot_hook.html
--
diff --git a/_modules/airflow/contrib/hooks/pinot_hook.html 
b/_modules/airflow/contrib/hooks/pinot_hook.html
new file mode 100644
index 000..a34b554
--- /dev/null
+++ b/_modules/airflow/contrib/hooks/pinot_hook.html
@@ -0,0 +1,340 @@
+  [Sphinx HTML page scaffolding omitted: document head, theme assets, navigation sidebar (Project ... API Reference) and breadcrumbs for this generated documentation page]
+  Source code for airflow.contrib.hooks.pinot_hook
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+
+import six
+
+from pinotdb import connect
+
+from airflow.hooks.dbapi_hook 
import DbApiHook
+
+
+[docs]class PinotDbApiHook(DbApiHook):
+
+Connect to pinot db(https://github.com/linkedin/pinot) to 
issue pql
+
+    conn_name_attr = 'pinot_broker_conn_id'
+    default_conn_name = 'pinot_broker_default'
+    supports_autocommit = False
+
+def __init__(self, *args, **kwargs):
+super(PinotDbApiHook, self).__init__(*args, **kwargs)
+
+[docs]
def get_conn(self):
+
+Establish a connection to pinot broker through pinot 
dbqpi.
+
+conn = self.get_connection(self.pinot_broker_conn_id)
+pinot_broker_conn = 
connect(
+host=conn.host,
+port=conn.port,
+path=conn.extra_dejson.get(endpoint, /pql),
+scheme=conn.extra_dejson.get(schema, http)
+)
+self.log.info(Get the connection to pinot 
+  broker on {host}.format(host=conn.host))
+return pinot_broker_conn
+
+[docs]
def get_uri(self):
+
+Get the connection uri for pinot broker.
+
+e.g: http://localhost:9000/pql
+
+conn = self.get_connection(getattr(self, self.conn_name_attr))
+host = conn.host
+if conn.port is not None:
+host += :{port}.format(port=conn.port)
+conn_type = http if not conn.conn_type else conn.conn_type
+endpoint = conn.extra_dejson.get(endpoint, pql)
+return {conn_type}://{host}/{endpoint}.format(
+conn_type=conn_type, host=host, endpoint=endpoint)
+
+[docs]
def get_records(self, sql):
+
+Executes the sql and returns a set of records.
+
+:param sql: the sql statement to be executed (str) or 
a list of
+sql statements to execute
+:type sql: str
+
+if six.PY2:
+sql = sql.encode(utf-8)
+
+with self.get_conn() 
as cur:
+cur.execute(sql)
+return cur.fetchall()
+
+[docs]
def get_first(self, sql):
+
+Executes the sql and returns the first resulting 
row.
+
+:param sql: the sql statement to be executed (str) or 
a list of
+sql statements to execute
+:type sql: str or list
+
+if six.PY2:
+sql = sql.encode(utf-8)
+
+with 
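For context, a hypothetical sketch of querying Pinot through the hook documented above; the connection id and table name are placeholders:

```python
from airflow.contrib.hooks.pinot_hook import PinotDbApiHook

hook = PinotDbApiHook(pinot_broker_conn_id='pinot_broker_default')
print(hook.get_uri())  # e.g. http://<broker-host>:<port>/pql, built from the connection
rows = hook.get_records('SELECT COUNT(*) FROM example_table')  # PQL sent to the broker
print(rows)
```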

[18/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/executors/local_executor.html
--
diff --git a/_modules/airflow/executors/local_executor.html 
b/_modules/airflow/executors/local_executor.html
index cfbf241..24436b4 100644
--- a/_modules/airflow/executors/local_executor.html
+++ b/_modules/airflow/executors/local_executor.html
@@ -91,7 +91,7 @@
 Quick Start
 Installation
 Tutorial
-Configuration
+How-to Guides
 UI / Screenshots
 Concepts
 Data Profiling
@@ -99,8 +99,10 @@
 Scheduling  Triggers
 Plugins
 Security
+Time zones
 Experimental Rest API
 Integration
+Lineage
 FAQ
 API Reference
 
@@ -169,17 +171,49 @@
   Source code for airflow.executors.local_executor
 # -*- coding: utf-8 -*-
 #
-# Licensed under the Apache License, Version 2.0 (the 
License);
-# you may not use this file except in compliance with the 
License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, 
software
-# distributed under the License is distributed on an AS 
IS BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either 
express or implied.
-# See the License for the specific language governing 
permissions and
-# limitations under the License.
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+
+LocalExecutor runs tasks by spawning processes in a 
controlled fashion in different
+modes. Given that BaseExecutor has the option to receive a 
`parallelism` parameter to
+limit the number of process spawned, when this parameter is 
`0` the number of processes
+that LocalExecutor can spawn is unlimited.
+
+The following strategies are implemented:
+1. Unlimited Parallelism (self.parallelism == 0): In this 
strategy, LocalExecutor will
+spawn a process every time `execute_async` is called, that 
is, every task submitted to the
+LocalExecutor will be executed in its own process. Once the 
task is executed and the
+result stored in the `result_queue`, the process terminates. 
There is no need for a
+`task_queue` in this approach, since as soon as a task is 
received a new process will be
+allocated to the task. Processes used in this strategy are of 
class LocalWorker.
+
+2. Limited Parallelism (self.parallelism > 0): In this 
strategy, the LocalExecutor spawns
+the number of processes equal to the value of 
`self.parallelism` at `start` time,
+using a `task_queue` to coordinate the ingestion of tasks and 
the work distribution among
+the workers, which will take a task as soon as they are 
ready. During the lifecycle of
+the LocalExecutor, the worker processes are running waiting 
for tasks, once the
+LocalExecutor receives the call to shutdown the executor a 
poison token is sent to the
+workers to terminate them. Processes used in this strategy 
are of class QueuedLocalWorker.
+
+Arguably, `SequentialExecutor` could be thought as a 
LocalExecutor with limited
+parallelism of just 1 worker, i.e. `self.parallelism = 
1`.
+This option could lead to the unification of the executor 
implementations, running
+locally, into just one `LocalExecutor` with multiple 
modes.
+
 
 import multiprocessing
 import subprocess
@@ -187,20 +221,63 @@
 
 from builtins import range
 
-from airflow import configuration
 from airflow.executors.base_executor import 
BaseExecutor
 from airflow.utils.log.logging_mixin import 
LoggingMixin
 from airflow.utils.state import State
 
-PARALLELISM = configuration.get(core, PARALLELISM)
-
 
 class LocalWorker(multiprocessing.Process, LoggingMixin):
+
+LocalWorker Process implementation to 
run airflow commands. Executes the given
+command and puts the result into a result queue when 
done, terminating execution.
+
+def __init__(self, result_queue):
+
+:param result_queue: the queue to store result states 
tuples (key, State)
+:type result_queue: multiprocessing.Queue
+
+super(LocalWorker, self).__init__()
+self.daemon = True
+self.result_queue = result_queue
+self.key = None
+self.command = None
+
+def execute_work(self, key, command):
+
+Executes 
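The limited-parallelism strategy described in the docstring above is the usual queue-plus-poison-pill pattern; a standalone illustration of the idea (not the executor's actual code):

```python
# Standalone sketch: N worker processes drain a shared queue and exit on a poison pill (None).
import multiprocessing


def worker(task_queue):
    while True:
        task = task_queue.get()
        if task is None:        # poison pill: shut this worker down
            break
        print('running', task)  # the real QueuedLocalWorker runs an airflow command here


if __name__ == '__main__':
    parallelism = 2
    task_queue = multiprocessing.Queue()
    workers = [multiprocessing.Process(target=worker, args=(task_queue,))
               for _ in range(parallelism)]
    for w in workers:
        w.start()
    for task in ['task_a', 'task_b', 'task_c']:
        task_queue.put(task)
    for _ in workers:           # one poison pill per worker at shutdown
        task_queue.put(None)
    for w in workers:
        w.join()
```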

[35/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/operators/bigquery_operator.html
--
diff --git a/_modules/airflow/contrib/operators/bigquery_operator.html 
b/_modules/airflow/contrib/operators/bigquery_operator.html
index 5a74ba1..4055654 100644
--- a/_modules/airflow/contrib/operators/bigquery_operator.html
+++ b/_modules/airflow/contrib/operators/bigquery_operator.html
@@ -91,7 +91,7 @@
 Quick Start
 Installation
 Tutorial
-Configuration
+How-to Guides
 UI / Screenshots
 Concepts
 Data Profiling
@@ -99,8 +99,10 @@
 Scheduling  Triggers
 Plugins
 Security
+Time zones
 Experimental Rest API
 Integration
+Lineage
 FAQ
 API Reference
 
@@ -169,19 +171,27 @@
   Source code for airflow.contrib.operators.bigquery_operator
 # -*- coding: utf-8 -*-
 #
-# Licensed under the Apache License, Version 2.0 (the 
License);
-# you may not use this file except in compliance with the 
License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, 
software
-# distributed under the License is distributed on an AS 
IS BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either 
express or implied.
-# See the License for the specific language governing 
permissions and
-# limitations under the License.
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+
+import json
 
 from airflow.contrib.hooks.bigquery_hook import BigQueryHook
+from airflow.contrib.hooks.gcs_hook import 
GoogleCloudStorageHook, _parse_gcs_url
 from airflow.models import BaseOperator
 from airflow.utils.decorators 
import apply_defaults
 
@@ -204,6 +214,13 @@
 :param create_disposition: Specifies whether the job is 
allowed to create new tables.
 (default: CREATE_IF_NEEDED)
 :type create_disposition: string
+:param allow_large_results: Whether to allow large 
results.
+:type allow_large_results: boolean
+:param flatten_results: If true and query uses legacy SQL 
dialect, flattens
+all nested and repeated fields in the query results. 
``allow_large_results``
+must be ``true`` if this is set to ``false``. For 
standard SQL queries, this
+flag is ignored and results are never 
flattened.
+:type flatten_results: boolean
 :param bigquery_conn_id: reference to a specific BigQuery 
hook.
 :type bigquery_conn_id: string
 :param delegate_to: The account to impersonate, if 
any.
@@ -215,16 +232,25 @@
 :type udf_config: list
 :param use_legacy_sql: Whether to use legacy SQL (true) 
or standard SQL (false).
 :type use_legacy_sql: boolean
-:param maximum_billing_tier: Positive integer that serves 
as a multiplier of the basic price.
+:param maximum_billing_tier: Positive integer that serves 
as a multiplier
+of the basic price.
 Defaults to None, in which case it uses the value set 
in the project.
 :type maximum_billing_tier: integer
-:param query_params: a dictionary containing query 
parameter types and values, passed to
-BigQuery.
+:param maximum_bytes_billed: Limits the bytes billed for 
this job.
+Queries that will have bytes billed beyond this limit 
will fail
+(without incurring a charge). If unspecified, this 
will be
+set to your project default.
+:type maximum_bytes_billed: float
+:param schema_update_options: Allows the schema of the 
destination
+table to be updated as a side effect of the load 
job.
+:type schema_update_options: tuple
+:param query_params: a dictionary containing query 
parameter types and
+values, passed to BigQuery.
 :type query_params: dict
 
 
     template_fields = ('bql', 'destination_dataset_table')
-    template_ext = ('.sql',)
+    template_ext = ('.sql', )
     ui_color = '#e4f0e8'
 
 @apply_defaults
@@ -233,13 +259,17 @@
  destination_dataset_table=False,
  write_disposition=WRITE_EMPTY,
  allow_large_results=False,
+ flatten_results=False,
  bigquery_conn_id=bigquery_default,
  
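For context, a hypothetical usage sketch of BigQueryOperator using some of the parameters documented above; the query, destination table and connection id are placeholders:

```python
from airflow.contrib.operators.bigquery_operator import BigQueryOperator

aggregate_events = BigQueryOperator(
    task_id='aggregate_events',
    bql='SELECT event_type, COUNT(*) AS cnt FROM `example_project.raw.events` GROUP BY event_type',
    destination_dataset_table='example_project.reporting.event_counts',
    write_disposition='WRITE_TRUNCATE',
    use_legacy_sql=False,
    bigquery_conn_id='bigquery_default',
    dag=dag,  # assumes an existing DAG object named `dag`
)
```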

[40/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/hooks/spark_jdbc_hook.html
--
diff --git a/_modules/airflow/contrib/hooks/spark_jdbc_hook.html 
b/_modules/airflow/contrib/hooks/spark_jdbc_hook.html
new file mode 100644
index 000..c22b1e5
--- /dev/null
+++ b/_modules/airflow/contrib/hooks/spark_jdbc_hook.html
@@ -0,0 +1,481 @@
+  [Sphinx HTML page scaffolding omitted: document head, theme assets, navigation sidebar (Project ... API Reference) and breadcrumbs for this generated documentation page]
+  Source code for airflow.contrib.hooks.spark_jdbc_hook
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+#
+import os
+from airflow.contrib.hooks.spark_submit_hook import SparkSubmitHook
+from airflow.exceptions import AirflowException
+
+
+[docs]class SparkJDBCHook(SparkSubmitHook):
+
+This hook extends the SparkSubmitHook specifically for 
performing data
+transfers to/from JDBC-based databases with Apache 
Spark.
+
+:param spark_app_name: Name of the job (default 
airflow-spark-jdbc)
+:type spark_app_name: str
+:param spark_conn_id: Connection id as configured in 
Airflow administration
+:type spark_conn_id: str
+:param spark_conf: Any additional Spark configuration 
properties
+:type spark_conf: dict
+:param spark_py_files: Additional python files used 
(.zip, .egg, or .py)
+:type spark_py_files: str
+:param spark_files: Additional files to upload to the 
container running the job
+:type spark_files: str
+:param spark_jars: Additional jars to upload and add to 
the driver and
+   executor classpath
+:type spark_jars: str
+:param num_executors: number of executor to run. This 
should be set so as to manage
+  the number of connections made with 
the JDBC database
+:type num_executors: int
+:param executor_cores: Number of cores per executor
+:type executor_cores: int
+:param executor_memory: Memory per executor (e.g. 1000M, 
2G)
+:type executor_memory: str
+:param driver_memory: Memory allocated to the driver 
(e.g. 1000M, 2G)
+:type driver_memory: str
+:param verbose: Whether to pass the verbose flag to 
spark-submit for debugging
+:type verbose: bool
+:param keytab: Full path to the file that contains the 
keytab
+:type keytab: str
+:param principal: The name of the kerberos principal used 
for keytab
+:type principal: str
+:param cmd_type: Which way the data should flow. 2 
possible values:
+ spark_to_jdbc: data written by spark 
from metastore to jdbc
+ jdbc_to_spark: data written by spark 
from jdbc to metastore
+:type cmd_type: str
+:param jdbc_table: The name of the JDBC table
+:type jdbc_table: str
+:param jdbc_conn_id: Connection id used for connection to 
JDBC database
+:type: jdbc_conn_id: str
+:param jdbc_driver: Name of the JDBC driver to use 
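For context, a hypothetical sketch of a metastore-to-JDBC transfer with this hook; the connection ids and table names are placeholders, and `submit_jdbc_job()` plus the `metastore_table` argument are assumed from the 1.10 contrib sources (the SparkJDBCOperator normally wraps this hook):

```python
from airflow.contrib.hooks.spark_jdbc_hook import SparkJDBCHook

hook = SparkJDBCHook(
    spark_app_name='airflow-spark-jdbc',
    spark_conn_id='spark_default',
    jdbc_conn_id='postgres_default',
    jdbc_driver='org.postgresql.Driver',
    cmd_type='spark_to_jdbc',          # write from the metastore table into the JDBC table
    jdbc_table='public.events_copy',
    metastore_table='default.events',  # assumed parameter, mirroring the operator
    num_executors=2,
)
hook.submit_jdbc_job()
```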

[51/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
1.10.0


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/repo
Commit: 
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/commit/11437c14
Tree: 
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/tree/11437c14
Diff: 
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/diff/11437c14

Branch: refs/heads/asf-site
Commit: 11437c14a3607f6b85b6610dc776f55d78e7d767
Parents: 28a3eb6
Author: Kaxil Naik 
Authored: Mon Aug 27 17:22:22 2018 +0100
Committer: Kaxil Naik 
Committed: Mon Aug 27 17:22:22 2018 +0100

--
 _images/connection_create.png   |   Bin 0 -> 41547 bytes
 _images/connection_edit.png |   Bin 0 -> 53636 bytes
 _images/connections.png |   Bin 93057 -> 48442 bytes
 _modules/S3_hook.html   |   489 -
 .../contrib/executors/mesos_executor.html   |89 +-
 .../contrib/hooks/aws_dynamodb_hook.html|   300 +
 _modules/airflow/contrib/hooks/aws_hook.html|   410 +
 .../airflow/contrib/hooks/aws_lambda_hook.html  |   302 +
 .../airflow/contrib/hooks/bigquery_hook.html|   889 +-
 .../airflow/contrib/hooks/databricks_hook.html  |   462 +
 .../airflow/contrib/hooks/datadog_hook.html |   375 +
 .../airflow/contrib/hooks/datastore_hook.html   |52 +-
 .../contrib/hooks/discord_webhook_hook.html |   375 +
 _modules/airflow/contrib/hooks/emr_hook.html|35 +-
 _modules/airflow/contrib/hooks/fs_hook.html |   281 +
 _modules/airflow/contrib/hooks/ftp_hook.html|   494 +
 .../contrib/hooks/gcp_api_base_hook.html|   379 +
 .../contrib/hooks/gcp_dataflow_hook.html|   196 +-
 .../contrib/hooks/gcp_dataproc_hook.html|   463 +
 .../contrib/hooks/gcp_mlengine_hook.html|20 +-
 .../airflow/contrib/hooks/gcp_pubsub_hook.html  |   519 +
 _modules/airflow/contrib/hooks/gcs_hook.html|   315 +-
 .../airflow/contrib/hooks/jenkins_hook.html |   283 +
 _modules/airflow/contrib/hooks/jira_hook.html   |   319 +
 _modules/airflow/contrib/hooks/pinot_hook.html  |   340 +
 _modules/airflow/contrib/hooks/qubole_hook.html |   449 +
 _modules/airflow/contrib/hooks/redis_hook.html  |   328 +
 .../airflow/contrib/hooks/redshift_hook.html|   348 +
 _modules/airflow/contrib/hooks/sftp_hook.html   |   404 +
 .../contrib/hooks/slack_webhook_hook.html   |   364 +
 .../airflow/contrib/hooks/spark_jdbc_hook.html  |   481 +
 .../airflow/contrib/hooks/spark_sql_hook.html   |   396 +
 .../contrib/hooks/spark_submit_hook.html|   799 ++
 _modules/airflow/contrib/hooks/sqoop_hook.html  |   580 +
 _modules/airflow/contrib/hooks/ssh_hook.html|   470 +
 .../airflow/contrib/hooks/vertica_hook.html |   288 +
 _modules/airflow/contrib/hooks/wasb_hook.html   |94 +-
 _modules/airflow/contrib/kubernetes/secret.html |   276 +
 .../contrib/operators/awsbatch_operator.html|   403 +
 .../operators/bigquery_check_operator.html  |37 +-
 .../contrib/operators/bigquery_get_data.html|   351 +
 .../contrib/operators/bigquery_operator.html|   399 +-
 .../bigquery_table_delete_operator.html |   301 +
 .../contrib/operators/bigquery_to_bigquery.html |39 +-
 .../contrib/operators/bigquery_to_gcs.html  |39 +-
 .../contrib/operators/databricks_operator.html  |45 +-
 .../contrib/operators/dataflow_operator.html|   241 +-
 .../contrib/operators/dataproc_operator.html|   866 +-
 .../operators/datastore_export_operator.html|   344 +
 .../operators/datastore_import_operator.html|   332 +
 .../operators/discord_webhook_operator.html |   333 +
 .../airflow/contrib/operators/ecs_operator.html |31 +-
 .../operators/emr_add_steps_operator.html   |33 +-
 .../operators/emr_create_job_flow_operator.html |33 +-
 .../emr_terminate_job_flow_operator.html|31 +-
 .../airflow/contrib/operators/file_to_gcs.html  |   310 +
 .../airflow/contrib/operators/file_to_wasb.html |35 +-
 .../operators/gcs_download_operator.html|56 +-
 .../contrib/operators/gcs_list_operator.html|   326 +
 .../airflow/contrib/operators/gcs_operator.html |   357 +
 .../airflow/contrib/operators/gcs_to_bq.html|   288 +-
 .../airflow/contrib/operators/gcs_to_gcs.html   |   365 +
 .../airflow/contrib/operators/gcs_to_s3.html|   347 +
 .../contrib/operators/hipchat_operator.html |35 +-
 .../operators/jenkins_job_trigger_operator.html |   484 +
 .../contrib/operators/jira_operator.html|   329 +
 .../operators/kubernetes_pod_operator.html  |   362 +
 .../contrib/operators/mlengine_operator.html|   260 +-
 .../airflow/contrib/operators/mysql_to_gcs.html |   524 +
 .../operators/postgres_to_gcs_operator.html |   481 +
 .../contrib/operators/pubsub_operator.html  |   669 ++
 .../contrib/operators/qubole_operator.html  |   411 +
 .../contrib/operators/s3_list_operator.html 

[10/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/operators/http_operator.html
--
diff --git a/_modules/airflow/operators/http_operator.html 
b/_modules/airflow/operators/http_operator.html
new file mode 100644
index 000..5890dde
--- /dev/null
+++ b/_modules/airflow/operators/http_operator.html
@@ -0,0 +1,327 @@
+  [Sphinx HTML page scaffolding omitted: document head, theme assets, navigation sidebar (Project ... API Reference) and breadcrumbs for this generated documentation page]
+  Source code for airflow.operators.http_operator
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.exceptions import AirflowException
+from airflow.hooks.http_hook import HttpHook
+from airflow.models import BaseOperator
+from airflow.utils.decorators import apply_defaults
+
+
+class SimpleHttpOperator(BaseOperator):
+    """
+    Calls an endpoint on an HTTP system to execute an action
+
+    :param http_conn_id: The connection to run the sensor against
+    :type http_conn_id: string
+    :param endpoint: The relative part of the full url
+    :type endpoint: string
+    :param method: The HTTP method to use, default = "POST"
+    :type method: string
+    :param data: The data to pass. POST-data in POST/PUT and params
+        in the URL for a GET request.
+    :type data: For POST/PUT, depends on the content-type parameter,
+        for GET a dictionary of key/value string pairs
+    :param headers: The HTTP headers to be added to the GET request
+    :type headers: a dictionary of string key/value pairs
+    :param response_check: A check against the 'requests' response object.
+        Returns True for 'pass' and False otherwise.
+    :type response_check: A lambda or defined function.
+    :param extra_options: Extra options for the 'requests' library, see the
+        'requests' documentation (options to modify timeout, ssl, etc.)
+    :type extra_options: A dictionary of options, where key is string and value
+        depends on the option that's being modified.
+    """
+
+    template_fields = ('endpoint', 'data',)
+    template_ext = ()
+    ui_color = '#f4a460'
+
+    @apply_defaults
+    def __init__(self,
+                 endpoint,
+                 method='POST',
+                 data=None,
+                 headers=None,
+                 response_check=None,
+                 extra_options=None,
+                 xcom_push=False,
+                 http_conn_id='http_default', *args, **kwargs):
+        """
+        If xcom_push is True, response of an HTTP request will also
+        be pushed to an XCom.
+        """
+        super(SimpleHttpOperator, self).__init__(*args, **kwargs)
+        self.http_conn_id = http_conn_id
+        self.method = method
+        self.endpoint = endpoint
+        self.headers = headers or {}
+        self.data = data or {}
+        self.response_check = response_check
+        self.extra_options = extra_options or {}
+        self.xcom_push_flag = xcom_push
+

[48/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/hooks/databricks_hook.html
--
diff --git a/_modules/airflow/contrib/hooks/databricks_hook.html 
b/_modules/airflow/contrib/hooks/databricks_hook.html
new file mode 100644
index 000..00ae02f
--- /dev/null
+++ b/_modules/airflow/contrib/hooks/databricks_hook.html
@@ -0,0 +1,462 @@
+  [Sphinx HTML page scaffolding omitted: document head, theme assets, navigation sidebar (Project ... API Reference) and breadcrumbs for this generated documentation page]
+  Source code for airflow.contrib.hooks.databricks_hook
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+#
+import requests
+
+from airflow import __version__
+from airflow.exceptions import AirflowException
+from airflow.hooks.base_hook 
import BaseHook
+from requests import exceptions as requests_exceptions
+from requests.auth import AuthBase
+
+from airflow.utils.log.logging_mixin import 
LoggingMixin
+
+try:
+from urllib import parse as 
urlparse
+except ImportError:
+import urlparse
+
+
+SUBMIT_RUN_ENDPOINT = ('POST', 'api/2.0/jobs/runs/submit')
+GET_RUN_ENDPOINT = ('GET', 'api/2.0/jobs/runs/get')
+CANCEL_RUN_ENDPOINT = ('POST', 'api/2.0/jobs/runs/cancel')
+USER_AGENT_HEADER = {'user-agent': 'airflow-{v}'.format(v=__version__)}
+
+
+[docs]class DatabricksHook(BaseHook, LoggingMixin):
+
+Interact with Databricks.
+
+def __init__(
+self,
+databricks_conn_id=databricks_default,
+timeout_seconds=180,
+retry_limit=3):
+
+:param databricks_conn_id: The name of the databricks 
connection to use.
+:type databricks_conn_id: string
+:param timeout_seconds: The amount of time in seconds 
the requests library
+will wait before timing-out.
+:type timeout_seconds: int
+:param retry_limit: The number of times to retry the 
connection in case of
+service outages.
+:type retry_limit: int
+
+self.databricks_conn_id = databricks_conn_id
+self.databricks_conn = self.get_connection(databricks_conn_id)
+self.timeout_seconds = timeout_seconds
+        assert retry_limit >= 1, 'Retry limit must be greater than equal to 1'
+self.retry_limit = retry_limit
+
+def _parse_host(self, host):
+
+The purpose of this function is to be robust to 
improper connections
+settings provided by users, specifically in the host 
field.
+
+
+For example -- when users supply 
``https://xx.cloud.databricks.com`` as the
+host, we must strip out the protocol to get the 
host.
+ h = DatabricksHook()
+ assert 
h._parse_host(https://xx.cloud.databricks.com;) == \
+xx.cloud.databricks.com
+
+In the case where users supply the correct 
``xx.cloud.databricks.com`` as the
+host, this function is a no-op.
+ assert 
h._parse_host(xx.cloud.databricks.com) == 
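For context, a hypothetical sketch of driving the hook above directly; the connection id and cluster/notebook settings are placeholders (DatabricksSubmitRunOperator is the usual entry point):

```python
from airflow.contrib.hooks.databricks_hook import DatabricksHook

hook = DatabricksHook(databricks_conn_id='databricks_default', retry_limit=3)
run_id = hook.submit_run({
    'new_cluster': {
        'spark_version': '4.0.x-scala2.11',
        'node_type_id': 'r3.xlarge',
        'num_workers': 1,
    },
    'notebook_task': {'notebook_path': '/Users/someone@example.com/example-notebook'},
})
print(hook.get_run_state(run_id))  # polls api/2.0/jobs/runs/get for the run's state
```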

[38/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/hooks/sqoop_hook.html
--
diff --git a/_modules/airflow/contrib/hooks/sqoop_hook.html 
b/_modules/airflow/contrib/hooks/sqoop_hook.html
new file mode 100644
index 000..4563030
--- /dev/null
+++ b/_modules/airflow/contrib/hooks/sqoop_hook.html
@@ -0,0 +1,580 @@
+  [Sphinx HTML page scaffolding omitted: document head, theme assets, navigation sidebar (Project ... API Reference) and breadcrumbs for this generated documentation page]
+  Source code for airflow.contrib.hooks.sqoop_hook
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+#
+
+
+This module contains a sqoop 1.x hook
+
+import subprocess
+
+from airflow.exceptions import AirflowException
+from airflow.hooks.base_hook 
import BaseHook
+from airflow.utils.log.logging_mixin import 
LoggingMixin
+from copy import deepcopy
+
+
+[docs]class SqoopHook(BaseHook, LoggingMixin):
+
+This hook is a wrapper around the sqoop 1 binary. To be 
able to use the hook
+it is required that sqoop is in the 
PATH.
+
+Additional arguments that can be passed via the 
extra JSON field of the
+sqoop connection:
+* job_tracker: Job tracker local|jobtracker:port.
+* namenode: Namenode.
+* lib_jars: Comma separated jar files to include in the 
classpath.
+* files: Comma separated files to be copied to the map 
reduce cluster.
+* archives: Comma separated archives to be unarchived on 
the compute
+machines.
+* password_file: Path to file containing the 
password.
+
+:param conn_id: Reference to the sqoop connection.
+:type conn_id: str
+:param verbose: Set sqoop to verbose.
+:type verbose: bool
+:param num_mappers: Number of map tasks to import in 
parallel.
+:type num_mappers: int
+:param properties: Properties to set via the -D 
argument
+:type properties: dict
+
+
+def __init__(self, conn_id=sqoop_default, verbose=False,
+ num_mappers=None, hcatalog_database=None,
+ hcatalog_table=None, properties=None):
+# No mutable types in the default parameters
+self.conn = self.get_connection(conn_id)
+connection_parameters = 
self.conn.extra_dejson
+self.job_tracker = connection_parameters.get(job_tracker, None)
+self.namenode = connection_parameters.get(namenode, None)
+self.libjars = connection_parameters.get(libjars, None)
+self.files = connection_parameters.get(files, None)
+self.archives = connection_parameters.get(archives, None)
+self.password_file = connection_parameters.get(password_file, None)
+self.hcatalog_database = hcatalog_database
+self.hcatalog_table = hcatalog_table
+self.verbose = verbose
+self.num_mappers = num_mappers
+self.properties = properties or {}
+        self.log.info('Using connection to: {}:{}/{}'.format(self.conn.host, self.conn.port, self.conn.schema))
+
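For context, a hypothetical sketch of pulling a table into HDFS with the hook above; the connection id, table and target path are placeholders, and `import_table()` is assumed from the 1.10 contrib sources:

```python
from airflow.contrib.hooks.sqoop_hook import SqoopHook

hook = SqoopHook(conn_id='sqoop_default', num_mappers=4, verbose=True)
hook.import_table(
    table='users',
    target_dir='/user/airflow/imports/users',  # HDFS directory to write to
    split_by='id',                             # column used to split work across mappers
)
```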
+

[47/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/hooks/discord_webhook_hook.html
--
diff --git a/_modules/airflow/contrib/hooks/discord_webhook_hook.html 
b/_modules/airflow/contrib/hooks/discord_webhook_hook.html
new file mode 100644
index 000..1115b8c
--- /dev/null
+++ b/_modules/airflow/contrib/hooks/discord_webhook_hook.html
@@ -0,0 +1,375 @@
+  [Sphinx HTML page scaffolding omitted: document head, theme assets, navigation sidebar (Project ... API Reference) and breadcrumbs for this generated documentation page]
+  Source code for airflow.contrib.hooks.discord_webhook_hook
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+#
+import json
+import re
+
+from airflow.hooks.http_hook 
import HttpHook
+from airflow.exceptions import AirflowException
+
+
+[docs]class DiscordWebhookHook(HttpHook):
+
+This hook allows you to post messages to Discord using 
incoming webhooks.
+Takes a Discord connection ID with a default relative 
webhook endpoint. The
+default endpoint can be overridden using the 
webhook_endpoint parameter
+
(https://discordapp.com/developers/docs/resources/webhook).
+
+Each Discord webhook can be pre-configured to use a 
specific username and
+avatar_url. You can override these defaults in this 
hook.
+
+:param http_conn_id: Http connection ID with host as 
https://discord.com/api/; and
+ default webhook endpoint in the 
extra field in the form of
+ {webhook_endpoint: 
webhooks/{webhook.id}/{webhook.token}}
+:type http_conn_id: str
+:param webhook_endpoint: Discord webhook endpoint in the 
form of
+ 
webhooks/{webhook.id}/{webhook.token}
+:type webhook_endpoint: str
+:param message: The message you want to send to your 
Discord channel
+(max 2000 characters)
+:type message: str
+:param username: Override the default username of the 
webhook
+:type username: str
+:param avatar_url: Override the default avatar of the 
webhook
+:type avatar_url: str
+:param tts: Is a text-to-speech message
+:type tts: bool
+:param proxy: Proxy to use to make the Discord webhook 
call
+:type proxy: str
+
+
+def __init__(self,
+ http_conn_id=None,
+ webhook_endpoint=None,
+ message=,
+ username=None,
+ avatar_url=None,
+ tts=False,
+ proxy=None,
+ *args,
+ **kwargs):
+super(DiscordWebhookHook, self).__init__(*args, **kwargs)
+self.http_conn_id = http_conn_id
+self.webhook_endpoint = self._get_webhook_endpoint(http_conn_id, webhook_endpoint)
+self.message = message
+self.username = username
+self.avatar_url = avatar_url
+self.tts = tts
+self.proxy = proxy
+
+def 
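For context, a hypothetical sketch of posting a message with the hook above; the connection id, webhook endpoint and message are placeholders:

```python
from airflow.contrib.hooks.discord_webhook_hook import DiscordWebhookHook

hook = DiscordWebhookHook(
    http_conn_id='discord_webhook_default',    # connection whose host points at the Discord API
    webhook_endpoint='webhooks/11111/abcdef',  # placeholder webhooks/{webhook.id}/{webhook.token}
    message='Airflow 1.10 docs published',
    username='airflow-bot',
)
hook.execute()  # sends the payload to the webhook endpoint
```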

[30/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/operators/gcs_to_bq.html
--
diff --git a/_modules/airflow/contrib/operators/gcs_to_bq.html 
b/_modules/airflow/contrib/operators/gcs_to_bq.html
index 1d61268..db2b756 100644
--- a/_modules/airflow/contrib/operators/gcs_to_bq.html
+++ b/_modules/airflow/contrib/operators/gcs_to_bq.html
@@ -91,7 +91,7 @@
 Quick Start
 Installation
 Tutorial
-Configuration
+How-to Guides
 UI / Screenshots
 Concepts
 Data Profiling
@@ -99,8 +99,10 @@
 Scheduling  Triggers
 Plugins
 Security
+Time zones
 Experimental Rest API
 Integration
+Lineage
 FAQ
 API Reference
 
@@ -169,17 +171,22 @@
   Source code for airflow.contrib.operators.gcs_to_bq
 # -*- coding: utf-8 -*-
 #
-# Licensed under the Apache License, Version 2.0 (the 
License);
-# you may not use this file except in compliance with the 
License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, 
software
-# distributed under the License is distributed on an AS 
IS BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either 
express or implied.
-# See the License for the specific language governing 
permissions and
-# limitations under the License.
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
 
 import json
 
@@ -192,6 +199,92 @@
 [docs]class GoogleCloudStorageToBigQueryOperator(BaseOperator):
 
 Loads files from Google cloud storage into 
BigQuery.
+
+The schema to be used for the BigQuery table may be 
specified in one of
+two ways. You may either directly pass the schema fields 
in, or you may
+point the operator to a Google cloud storage object name. 
The object in
+Google cloud storage must be a JSON file with the schema 
fields in it.
+
+:param bucket: The bucket to load from.
+:type bucket: string
+:param source_objects: List of Google cloud storage URIs 
to load from.
+If source_format is DATASTORE_BACKUP, the 
list must only contain a single URI.
+:type object: list
+:param destination_project_dataset_table: The dotted 
(project.)dataset.table
+BigQuery table to load data into. If project 
is not included, project will
+be the project defined in the connection json.
+:type destination_project_dataset_table: string
+:param schema_fields: If set, the schema field list as 
defined here:
+
https://cloud.google.com/bigquery/docs/reference/v2/jobs#configuration.load
+Should not be set when source_format is 
DATASTORE_BACKUP.
+:type schema_fields: list
+:param schema_object: If set, a GCS object path pointing 
to a .json file that
+contains the schema for the table.
+:param schema_object: string
+:param source_format: File format to export.
+:type source_format: string
+:param compression: [Optional] The compression type of 
the data source.
+Possible values include GZIP and NONE.
+The default value is NONE.
+This setting is ignored for Google Cloud 
Bigtable,
+Google Cloud Datastore backups and Avro 
formats.
+:type compression: string
+:param create_disposition: The create disposition if the 
table doesnt exist.
+:type create_disposition: string
+:param skip_leading_rows: Number of rows to skip when 
loading from a CSV.
+:type skip_leading_rows: int
+:param write_disposition: The write disposition if the 
table already exists.
+:type write_disposition: string
+:param field_delimiter: The delimiter to use when loading 
from a CSV.
+:type field_delimiter: string
+:param max_bad_records: The maximum number of bad records 
that BigQuery can
+ignore when running the job.
+:type max_bad_records: int
+:param quote_character: The value that is used to quote 
data sections in a CSV file.
+:type quote_character: string
+:param ignore_unknown_values: [Optional] Indicates if 
BigQuery should allow
+extra values that are not represented in the table 
schema.
+If true, the extra values are ignored. If false, 
records with 
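For context, a hypothetical usage sketch of GoogleCloudStorageToBigQueryOperator with a schema object, as described above; the bucket, object paths and table names are placeholders:

```python
from airflow.contrib.operators.gcs_to_bq import GoogleCloudStorageToBigQueryOperator

load_events = GoogleCloudStorageToBigQueryOperator(
    task_id='load_events',
    bucket='example-bucket',
    source_objects=['exports/events/*.csv'],
    destination_project_dataset_table='example_project.raw.events',
    schema_object='schemas/events.json',  # JSON schema file stored in the same bucket
    source_format='CSV',
    skip_leading_rows=1,
    write_disposition='WRITE_TRUNCATE',
    dag=dag,  # assumes an existing DAG object named `dag`
)
```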

[32/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/operators/datastore_export_operator.html
--
diff --git a/_modules/airflow/contrib/operators/datastore_export_operator.html 
b/_modules/airflow/contrib/operators/datastore_export_operator.html
new file mode 100644
index 000..94e2d02
--- /dev/null
+++ b/_modules/airflow/contrib/operators/datastore_export_operator.html
@@ -0,0 +1,344 @@
+  [Sphinx HTML page scaffolding omitted: document head, theme assets, navigation sidebar (Project ... API Reference) and breadcrumbs for this generated documentation page]
+  Source code for 
airflow.contrib.operators.datastore_export_operator
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+#
+from airflow.contrib.hooks.datastore_hook import DatastoreHook
+from airflow.contrib.hooks.gcs_hook import 
GoogleCloudStorageHook
+from airflow.exceptions import AirflowException
+from airflow.models import BaseOperator
+from airflow.utils.decorators 
import apply_defaults
+
+
+[docs]class DatastoreExportOperator(BaseOperator):
+
+Export entities from Google Cloud Datastore to Cloud 
Storage
+
+:param bucket: name of the cloud storage bucket to backup 
data
+:type bucket: string
+:param namespace: optional namespace path in the 
specified Cloud Storage bucket
+to backup data. If this namespace does not exist in 
GCS, it will be created.
+:type namespace: str
+:param datastore_conn_id: the name of the Datastore 
connection id to use
+:type datastore_conn_id: string
+:param cloud_storage_conn_id: the name of the cloud 
storage connection id to force-write
+backup
+:type cloud_storage_conn_id: string
+:param delegate_to: The account to impersonate, if 
any.
+For this to work, the service account making the 
request must have domain-wide
+delegation enabled.
+:type delegate_to: string
+:param entity_filter: description of what data from the 
project is included in the export,
+refer to 
https://cloud.google.com/datastore/docs/reference/rest/Shared.Types/EntityFilter
+:type entity_filter: dict
+:param labels: client-assigned labels for cloud 
storage
+:type labels: dict
+:param polling_interval_in_seconds: number of seconds to 
wait before polling for
+execution status again
+:type polling_interval_in_seconds: int
+:param overwrite_existing: if the storage bucket + 
namespace is not empty, it will be
+emptied prior to exports. This enables overwriting 
existing backups.
+:type overwrite_existing: bool
+:param xcom_push: push operation name to xcom for 
reference
+:type xcom_push: bool
+
+
+@apply_defaults
+def __init__(self,
+ bucket,
+ namespace=None,
+ datastore_conn_id=google_cloud_default,
+ cloud_storage_conn_id=google_cloud_default,
+ 

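A minimal usage sketch for the operator shown on this page, assuming an existing `dag` object; the bucket and namespace names are placeholders.

```
from airflow.contrib.operators.datastore_export_operator import DatastoreExportOperator

export_entities = DatastoreExportOperator(
    task_id='datastore_export',
    bucket='my-datastore-backups',   # hypothetical GCS bucket for the backup
    namespace='nightly',             # created under the bucket if it does not exist
    overwrite_existing=True,         # empty the bucket/namespace before exporting
    dag=dag)
```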
[08/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/operators/python_operator.html
--
diff --git a/_modules/airflow/operators/python_operator.html 
b/_modules/airflow/operators/python_operator.html
new file mode 100644
index 000..fb1e29d
--- /dev/null
+++ b/_modules/airflow/operators/python_operator.html
@@ -0,0 +1,613 @@
+  [Sphinx HTML scaffolding omitted: page head, scripts, and navigation sidebar for "airflow.operators.python_operator" (Airflow Documentation)]
+  Source code for airflow.operators.python_operator
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+
+from builtins import str
+import dill
+import inspect
+import os
+import pickle
+import subprocess
+import sys
+import types
+
+from airflow.exceptions import AirflowException
+from airflow.models import BaseOperator, SkipMixin
+from airflow.utils.decorators 
import apply_defaults
+from airflow.utils.file import TemporaryDirectory
+
+from textwrap import dedent
+
+
+[docs]class PythonOperator(BaseOperator):
+
+Executes a Python callable
+
+:param python_callable: A reference to an object that is 
callable
+:type python_callable: python callable
+:param op_kwargs: a dictionary of keyword arguments that 
will get unpacked
+in your function
+:type op_kwargs: dict
+:param op_args: a list of positional arguments that will 
get unpacked when
+calling your callable
+:type op_args: list
+:param provide_context: if set to true, Airflow will pass 
a set of
+keyword arguments that can be used in your function. 
This set of
+kwargs correspond exactly to what you can use in your 
jinja
+templates. For this to work, you need to define 
`**kwargs` in your
+function header.
+:type provide_context: bool
+:param templates_dict: a dictionary where the values are 
templates that
+will get templated by the Airflow engine sometime 
between
+``__init__`` and ``execute`` takes place and are made 
available
+in your callables context after the template has 
been applied
+:type templates_dict: dict of str
+:param templates_exts: a list of file extensions to 
resolve while
+processing templated fields, for examples 
``[.sql, .hql]``
+:type templates_exts: list(str)
+
+template_fields = (templates_dict,)
+template_ext = tuple()
+ui_color = #ffefeb
+
+@apply_defaults
+def __init__(
+self,
+python_callable,
+op_args=None,
+op_kwargs=None,
+provide_context=False,
+templates_dict=None,
+templates_exts=None,
+*args, **kwargs):
+super(PythonOperator, self).__init__(*args, **kwargs)
+if not callable(python_callable):
+raise AirflowException(`python_callable` param must be callable)
+self.python_callable = python_callable
+self.op_args = op_args or 

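A small, self-contained usage sketch for the operator above; the callable, its arguments, and the `dag` object are illustrative.

```
from airflow.operators.python_operator import PythonOperator

def greet(name, **kwargs):
    # provide_context=True passes the template kwargs (ds, ti, ...) into **kwargs
    print('Hello, %s (run date %s)' % (name, kwargs['ds']))

greet_task = PythonOperator(
    task_id='greet',
    python_callable=greet,
    op_kwargs={'name': 'Airflow'},
    provide_context=True,
    dag=dag)
```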
[41/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/hooks/redshift_hook.html
--
diff --git a/_modules/airflow/contrib/hooks/redshift_hook.html 
b/_modules/airflow/contrib/hooks/redshift_hook.html
new file mode 100644
index 000..314fd9e
--- /dev/null
+++ b/_modules/airflow/contrib/hooks/redshift_hook.html
@@ -0,0 +1,348 @@
+  [Sphinx HTML scaffolding omitted: page head, scripts, and navigation sidebar for "airflow.contrib.hooks.redshift_hook" (Airflow Documentation)]
+  Source code for airflow.contrib.hooks.redshift_hook
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+
+from airflow.contrib.hooks.aws_hook import 
AwsHook
+
+
+[docs]class RedshiftHook(AwsHook):
+
+Interact with AWS Redshift, using the boto3 library
+
+def get_conn(self):
+return self.get_client_type(redshift)
+
+# TODO: Wrap create_cluster_snapshot
+[docs]
def cluster_status(self, cluster_identifier):
+
+Return status of a cluster
+
+:param cluster_identifier: unique identifier of a 
cluster
+:type cluster_identifier: str
+
+conn = self.get_conn()
+try:
+response = conn.describe_clusters(
+ClusterIdentifier=cluster_identifier)[Clusters]
+return response[0][ClusterStatus] if response else 
None
+except conn.exceptions.ClusterNotFoundFault:
+return cluster_not_found
+
+[docs]
def delete_cluster(
+self,
+cluster_identifier,
+skip_final_cluster_snapshot=True,
+final_cluster_snapshot_identifier=):
+
+Delete a cluster and optionally create a 
snapshot
+
+:param cluster_identifier: unique identifier of a 
cluster
+:type cluster_identifier: str
+:param skip_final_cluster_snapshot: determines 
cluster snapshot creation
+:type skip_final_cluster_snapshot: bool
+:param final_cluster_snapshot_identifier: name of 
final cluster snapshot
+:type final_cluster_snapshot_identifier: str
+
+response = self.get_conn().delete_cluster(
+ClusterIdentifier=cluster_identifier,
+SkipFinalClusterSnapshot=skip_final_cluster_snapshot,
+FinalClusterSnapshotIdentifier=final_cluster_snapshot_identifier
+)
+return response[Cluster] if response[Cluster] else None
+
+[docs]
def describe_cluster_snapshots(self, cluster_identifier):
+
+Gets a list of snapshots for a cluster
+
+:param cluster_identifier: unique identifier of a 
cluster
+:type cluster_identifier: str
+
+response = self.get_conn().describe_cluster_snapshots(
+ClusterIdentifier=cluster_identifier
+)
+if Snapshots 
not in response:
+return None
+snapshots = response[Snapshots]
+snapshots = filter(lambda 
x: x[Status], snapshots)
+   

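A rough sketch of how the hook methods shown above fit together; the connection ID and cluster identifier are placeholders.

```
from airflow.contrib.hooks.redshift_hook import RedshiftHook

hook = RedshiftHook(aws_conn_id='aws_default')
status = hook.cluster_status('analytics-cluster')   # e.g. 'available' or 'cluster_not_found'
if status == 'available':
    hook.delete_cluster(
        'analytics-cluster',
        skip_final_cluster_snapshot=False,
        final_cluster_snapshot_identifier='analytics-final-snapshot')
```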
[07/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/operators/s3_file_transform_operator.html
--
diff --git a/_modules/airflow/operators/s3_file_transform_operator.html 
b/_modules/airflow/operators/s3_file_transform_operator.html
index 8db7bc2..366400e 100644
--- a/_modules/airflow/operators/s3_file_transform_operator.html
+++ b/_modules/airflow/operators/s3_file_transform_operator.html
@@ -91,7 +91,7 @@
 Quick Start
 Installation
 Tutorial
-Configuration
+How-to Guides
 UI / Screenshots
 Concepts
 Data Profiling
@@ -99,8 +99,10 @@
 Scheduling & Triggers
 Plugins
 Security
+Time zones
 Experimental Rest API
 Integration
+Lineage
 FAQ
 API Reference
 
@@ -169,17 +171,22 @@
   Source code for airflow.operators.s3_file_transform_operator
 # -*- coding: utf-8 -*-
 #
-# Licensed under the Apache License, Version 2.0 (the 
License);
-# you may not use this file except in compliance with the 
License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, 
software
-# distributed under the License is distributed on an AS 
IS BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either 
express or implied.
-# See the License for the specific language governing 
permissions and
-# limitations under the License.
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
 
 from tempfile import NamedTemporaryFile
 import subprocess
@@ -200,10 +207,13 @@
 The locations of the source and the destination files in 
the local
 filesystem is provided as an first and second arguments 
to the
 transformation script. The transformation script is 
expected to read the
-data from source , transform it and write the output to 
the local
+data from source, transform it and write the output to 
the local
 destination file. The operator then takes over control 
and uploads the
 local destination file to S3.
 
+S3 Select is also available to filter the source 
contents. Users can
+omit the transformation script if S3 Select expression is 
specified.
+
 :param source_s3_key: The key to be retrieved from 
S3
 :type source_s3_key: str
 :param source_aws_conn_id: source s3 connection
@@ -216,6 +226,8 @@
 :type replace: bool
 :param transform_script: location of the executable 
transformation script
 :type transform_script: str
+:param select_expression: S3 Select expression
+:type select_expression: str
 
 
 template_fields = (source_s3_key, dest_s3_key)
@@ -227,7 +239,8 @@
 self,
 source_s3_key,
 dest_s3_key,
-transform_script,
+transform_script=None,
+select_expression=None,
 source_aws_conn_id=aws_default,
 dest_aws_conn_id=aws_default,
 replace=False,
@@ -239,34 +252,54 @@
 self.dest_aws_conn_id = dest_aws_conn_id
 self.replace = replace
 self.transform_script = transform_script
+self.select_expression = select_expression
 
 def execute(self, context):
+if self.transform_script is None and 
self.select_expression is None:
+raise AirflowException(
+Either transform_script or 
select_expression must be specified)
+
 source_s3 = S3Hook(aws_conn_id=self.source_aws_conn_id)
 dest_s3 = S3Hook(aws_conn_id=self.dest_aws_conn_id)
+
 self.log.info(Downloading source S3 file 
%s, self.source_s3_key)
 if not source_s3.check_for_key(self.source_s3_key):
-raise AirflowException(The source key {0} does not exist.format(self.source_s3_key))
+raise AirflowException(
+The source key {0} does not exist.format(self.source_s3_key))
 source_s3_key_object = 
source_s3.get_key(self.source_s3_key)
-with NamedTemporaryFile(w) as f_source, 
NamedTemporaryFile(w) as f_dest:
+
+with NamedTemporaryFile(wb) as f_source, 
NamedTemporaryFile(wb) as f_dest:
 self.log.info(
 Dumping S3 file %s contents to local file %s,
 

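A hedged sketch of the new S3 Select path described in this diff: the transformation script is omitted and a select expression filters the source object instead. Keys, the expression, and the `dag` object are placeholders.

```
from airflow.operators.s3_file_transform_operator import S3FileTransformOperator

filter_events = S3FileTransformOperator(
    task_id='s3_select_filter',
    source_s3_key='s3://source-bucket/raw/events.csv',
    dest_s3_key='s3://dest-bucket/filtered/events.csv',
    select_expression="SELECT s.* FROM s3object s WHERE s._3 = 'click'",
    replace=True,
    dag=dag)
```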
[01/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
Repository: incubator-airflow-site
Updated Branches:
  refs/heads/asf-site 28a3eb600 -> 11437c14a


http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/bigquery_hook.html
--
diff --git a/_modules/bigquery_hook.html b/_modules/bigquery_hook.html
deleted file mode 100644
index e58a93b..000
--- a/_modules/bigquery_hook.html
+++ /dev/null
@@ -1,1279 +0,0 @@
-  [Sphinx HTML scaffolding omitted: page head, scripts, and navigation sidebar of the removed "bigquery_hook" page]
-  Source code for bigquery_hook
-# -*- coding: utf-8 -*-
-#
-# Licensed under the Apache License, Version 2.0 (the 
License);
-# you may not use this file except in compliance with the 
License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, 
software
-# distributed under the License is distributed on an AS 
IS BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either 
express or implied.
-# See the License for the specific language governing 
permissions and
-# limitations under the License.
-#
-
-
-This module contains a BigQuery Hook, as well as a very basic 
PEP 249
-implementation for BigQuery.
-
-
-import time
-
-from apiclient.discovery import build, 
HttpError
-from googleapiclient import errors
-from builtins import range
-from pandas_gbq.gbq import GbqConnector, \
-_parse_data as gbq_parse_data, \
-_check_google_client_version as gbq_check_google_client_version, \
-_test_google_api_imports as 
gbq_test_google_api_imports
-from pandas.tools.merge import concat
-from past.builtins import basestring
-
-from airflow.contrib.hooks.gcp_api_base_hook import GoogleCloudBaseHook
-from airflow.hooks.dbapi_hook 
import DbApiHook
-from airflow.utils.log.logging_mixin import 
LoggingMixin
-
-
-[docs]class BigQueryHook(GoogleCloudBaseHook, DbApiHook, 
LoggingMixin):
-
-Interact with BigQuery. This hook uses the Google Cloud 
Platform
-connection.
-
-conn_name_attr = bigquery_conn_id
-
-def __init__(self,
- bigquery_conn_id=bigquery_default,
- delegate_to=None):
-super(BigQueryHook, self).__init__(
-conn_id=bigquery_conn_id,
-delegate_to=delegate_to)
-
-[docs]
def get_conn(self):
-
-Returns a BigQuery PEP 249 connection object.
-
-service = self.get_service()
-project = self._get_field(project)
-return BigQueryConnection(service=service, project_id=project)
-
-[docs]   
 def get_service(self):
-
-Returns a BigQuery service object.
-
-http_authorized = self._authorize()
-return build(bigquery, v2, 
http=http_authorized)
-
-[docs]   
 def insert_rows(self, table, rows, target_fields=None, commit_every=1000):
-
-Insertion is currently unsupported. Theoretically, 
you could use
-BigQuerys streaming API to insert rows into a 
table, but this hasnt
-been implemented.
-
-raise NotImplementedError()
-
-[docs] 
   def get_pandas_df(self, bql, parameters=None, dialect=legacy):
-
-Returns a Pandas DataFrame for the results produced 
by a BigQuery
-query. The DbApiHook method must be overridden 
because Pandas
-doesnt support PEP 249 connections, except for 
SQLite. See:
-
-
https://github.com/pydata/pandas/blob/master/pandas/io/sql.py#L447
-https://github.com/pydata/pandas/issues/6900
-
-:param bql: The BigQuery SQL to execute.
-:type bql: string
-:param parameters: The parameters to render the SQL 
query with (not used, leave to override superclass method)
-:type parameters: mapping or iterable
-

[45/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/hooks/gcp_dataproc_hook.html
--
diff --git a/_modules/airflow/contrib/hooks/gcp_dataproc_hook.html 
b/_modules/airflow/contrib/hooks/gcp_dataproc_hook.html
new file mode 100644
index 000..4ca7edd
--- /dev/null
+++ b/_modules/airflow/contrib/hooks/gcp_dataproc_hook.html
@@ -0,0 +1,463 @@
+  [Sphinx HTML scaffolding omitted: page head, scripts, and navigation sidebar for "airflow.contrib.hooks.gcp_dataproc_hook" (Airflow Documentation)]
+  Source code for airflow.contrib.hooks.gcp_dataproc_hook
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+#
+import time
+import uuid
+
+from apiclient.discovery import build
+
+from airflow.contrib.hooks.gcp_api_base_hook import GoogleCloudBaseHook
+from airflow.utils.log.logging_mixin import 
LoggingMixin
+
+
+class _DataProcJob(LoggingMixin):
+def __init__(self, dataproc_api, project_id, job, region=global):
+self.dataproc_api = dataproc_api
+self.project_id = project_id
+self.region = region
+self.job = dataproc_api.projects().regions().jobs().submit(
+projectId=self.project_id,
+region=self.region,
+body=job).execute()
+self.job_id = self.job[reference][jobId]
+self.log.info(
+DataProc job %s is %s,
+self.job_id, str(self.job[status][state])
+)
+
+def wait_for_done(self):
+while True:
+self.job = self.dataproc_api.projects().regions().jobs().get(
+projectId=self.project_id,
+region=self.region,
+jobId=self.job_id).execute(num_retries=5)
+if ERROR 
== self.job[status][state]:
+print(str(self.job))
+self.log.error(DataProc job %s has errors, self.job_id)
+self.log.error(self.job[status][details])
+self.log.debug(str(self.job))
+return False
+if CANCELLED == self.job[status][state]:
+print(str(self.job))
+self.log.warning(DataProc job %s is cancelled, self.job_id)
+if details in self.job[status]:
+self.log.warning(self.job[status][details])
+self.log.debug(str(self.job))
+return False
+if DONE 
== self.job[status][state]:
+return True
+self.log.debug(
+DataProc job %s is %s,
+self.job_id, str(self.job[status][state])
+)
+time.sleep(5)
+
+def raise_error(self, message=None):
+if ERROR 
== self.job[status][state]:
+if message is None:
+message = Google DataProc job has error
+raise Exception(message + : 
 + str(self.job[status][details]))
+
+def get(self):
+return self.job
+
+
+class _DataProcJobBuilder:
+def 

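The `wait_for_done` method above is a plain poll-and-sleep pattern. A stripped-down sketch of the same idea, independent of the Dataproc API (all names below are illustrative, not part of the module shown):

```
import time

def wait_for_done(get_state, poll_seconds=5):
    # Poll a state-returning callable until a terminal state is reached,
    # mirroring the loop in _DataProcJob.wait_for_done above.
    while True:
        state = get_state()
        if state in ('ERROR', 'CANCELLED'):
            return False
        if state == 'DONE':
            return True
        time.sleep(poll_seconds)
```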
[21/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/sensors/emr_job_flow_sensor.html
--
diff --git a/_modules/airflow/contrib/sensors/emr_job_flow_sensor.html 
b/_modules/airflow/contrib/sensors/emr_job_flow_sensor.html
new file mode 100644
index 000..4fcbbc9
--- /dev/null
+++ b/_modules/airflow/contrib/sensors/emr_job_flow_sensor.html
@@ -0,0 +1,288 @@
+  [Sphinx HTML scaffolding omitted: page head, scripts, and navigation sidebar for "airflow.contrib.sensors.emr_job_flow_sensor" (Airflow Documentation)]
+  Source code for airflow.contrib.sensors.emr_job_flow_sensor
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+from airflow.contrib.hooks.emr_hook import 
EmrHook
+from airflow.contrib.sensors.emr_base_sensor import EmrBaseSensor
+from airflow.utils import apply_defaults
+
+
+[docs]class EmrJobFlowSensor(EmrBaseSensor):
+
+Asks for the state of the JobFlow until it reaches a 
terminal state.
+If the JobFlow fails, the sensor errors and the task fails.
+
+:param job_flow_id: job_flow_id to check the state 
of
+:type job_flow_id: string
+
+
+NON_TERMINAL_STATES = [STARTING, BOOTSTRAPPING, RUNNING, WAITING, TERMINATING]
+FAILED_STATE = [TERMINATED_WITH_ERRORS]
+template_fields = [job_flow_id]
+template_ext = ()
+
+@apply_defaults
+def __init__(self,
+ job_flow_id,
+ *args,
+ **kwargs):
+super(EmrJobFlowSensor, self).__init__(*args, **kwargs)
+self.job_flow_id = job_flow_id
+
+def get_emr_response(self):
+emr = EmrHook(aws_conn_id=self.aws_conn_id).get_conn()
+
+self.log.info(Poking cluster %s, self.job_flow_id)
+return emr.describe_cluster(ClusterId=self.job_flow_id)
+
+def state_from_response(self, response):
+return response[Cluster][Status][State]
+
+
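A hedged example of wiring the sensor above into a DAG; in practice the job flow id usually comes from an upstream create-job-flow task via XCom (the field is templated), and the upstream task id and `dag` object here are placeholders.

```
from airflow.contrib.sensors.emr_job_flow_sensor import EmrJobFlowSensor

watch_cluster = EmrJobFlowSensor(
    task_id='watch_emr_cluster',
    job_flow_id="{{ task_instance.xcom_pull(task_ids='create_job_flow', key='return_value') }}",
    aws_conn_id='aws_default',
    dag=dag)
```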
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/sensors/emr_step_sensor.html
--
diff --git a/_modules/airflow/contrib/sensors/emr_step_sensor.html 

[44/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/hooks/gcp_pubsub_hook.html
--
diff --git a/_modules/airflow/contrib/hooks/gcp_pubsub_hook.html 
b/_modules/airflow/contrib/hooks/gcp_pubsub_hook.html
new file mode 100644
index 000..19713d1
--- /dev/null
+++ b/_modules/airflow/contrib/hooks/gcp_pubsub_hook.html
@@ -0,0 +1,519 @@
+  [Sphinx HTML scaffolding omitted: page head, scripts, and navigation sidebar for "airflow.contrib.hooks.gcp_pubsub_hook" (Airflow Documentation)]
+  Source code for airflow.contrib.hooks.gcp_pubsub_hook
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+
+from uuid import uuid4
+
+from apiclient.discovery import build
+from apiclient import errors
+
+from airflow.contrib.hooks.gcp_api_base_hook import GoogleCloudBaseHook
+
+
+def _format_subscription(project, subscription):
+return projects/{}/subscriptions/{}.format(project, subscription)
+
+
+def _format_topic(project, topic):
+return projects/{}/topics/{}.format(project, topic)
+
+
+class PubSubException(Exception):
+pass
+
+
+[docs]class PubSubHook(GoogleCloudBaseHook):
+Hook for accessing Google 
Pub/Sub.
+
+The GCP project against which actions are applied is 
determined by
+the project embedded in the Connection referenced by 
gcp_conn_id.
+
+
+def __init__(self, gcp_conn_id=google_cloud_default, delegate_to=None):
+super(PubSubHook, self).__init__(gcp_conn_id, delegate_to=delegate_to)
+
+[docs]
def get_conn(self):
+Returns a Pub/Sub service 
object.
+
+:rtype: apiclient.discovery.Resource
+
+http_authorized = self._authorize()
+return build(pubsub, v1, 
http=http_authorized)
+
+[docs]
def publish(self, project, topic, messages):
+Publishes messages to a Pub/Sub 
topic.
+
+:param project: the GCP project ID in which to 
publish
+:type project: string
+:param topic: the Pub/Sub topic to which to publish; 
do not
+include the ``projects/{project}/topics/`` 
prefix.
+:type topic: string
+:param messages: messages to publish; if the data 
field in a
+message is set, it should already be base64 
encoded.
+:type messages: list of PubSub messages; see
+
http://cloud.google.com/pubsub/docs/reference/rest/v1/PubsubMessage
+
+body = {messages: messages}
+full_topic = _format_topic(project, topic)
+request = self.get_conn().projects().topics().publish(
+topic=full_topic, body=body)
+try:
+request.execute()
+except errors.HttpError as 
e:
+raise PubSubException(
+Error publishing to topic {}.format(full_topic), e)
+
+[docs]
def create_topic(self, project, topic, fail_if_exists=False):
+Creates a Pub/Sub topic, if it does 
not already exist.
+
+:param project: the GCP project ID in which to 

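A rough sketch of using the hook methods shown above; project, topic, and message payload are placeholders, and the message data must already be base64 encoded, as the docstring notes.

```
from base64 import b64encode
from airflow.contrib.hooks.gcp_pubsub_hook import PubSubHook

hook = PubSubHook(gcp_conn_id='google_cloud_default')
hook.create_topic('my-project', 'my-topic', fail_if_exists=False)
hook.publish('my-project', 'my-topic',
             [{'data': b64encode(b'hello world').decode('ascii'),
               'attributes': {'source': 'example'}}])
```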
[20/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/sensors/hdfs_sensor.html
--
diff --git a/_modules/airflow/contrib/sensors/hdfs_sensor.html 
b/_modules/airflow/contrib/sensors/hdfs_sensor.html
new file mode 100644
index 000..df9ecaa
--- /dev/null
+++ b/_modules/airflow/contrib/sensors/hdfs_sensor.html
@@ -0,0 +1,313 @@
+  [Sphinx HTML scaffolding omitted: page head, scripts, and navigation sidebar for "airflow.contrib.sensors.hdfs_sensor" (Airflow Documentation)]
+  Source code for airflow.contrib.sensors.hdfs_sensor
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+from airflow.sensors.hdfs_sensor import 
HdfsSensor
+
+
+[docs]class HdfsSensorRegex(HdfsSensor):
+def __init__(self,
+ regex,
+ *args,
+ **kwargs):
+super(HdfsSensorRegex, self).__init__(*args, **kwargs)
+self.regex = regex
+
+[docs]
def poke(self, context):
+
+poke matching files in a directory with 
self.regex
+
+:return: Bool depending on the search criteria
+
+sb = self.hook(self.hdfs_conn_id).get_conn()
+self.log.info(
+Poking for {self.filepath} to be a directory 

+with files matching {self.regex.pattern}.
+format(**locals())
+)
+result = [f for f in sb.ls([self.filepath], include_toplevel=False) if
+  f[file_type] == f and
+  self.regex.match(f[path].replace(%s/ 
% self.filepath, ))]
+result = self.filter_for_ignored_ext(result, self.ignored_ext,
+ self.ignore_copying)
+result = self.filter_for_filesize(result, self.file_size)
+return bool(result)
+
+
+[docs]class HdfsSensorFolder(HdfsSensor):
+def __init__(self,
+ be_empty=False,
+ *args,
+ **kwargs):
+super(HdfsSensorFolder, self).__init__(*args, **kwargs)
+self.be_empty = be_empty
+
+[docs]
def poke(self, context):
+
+poke for a non empty directory
+
+:return: Bool depending on the search criteria
+
+sb = self.hook(self.hdfs_conn_id).get_conn()
+result = [f for f in sb.ls([self.filepath], include_toplevel=True)]
+result = self.filter_for_ignored_ext(result, self.ignored_ext,
+ self.ignore_copying)
+result = self.filter_for_filesize(result, self.file_size)
+if self.be_empty:
+self.log.info(Poking for filepath {self.filepath} to an empty 
directory
+  .format(**locals()))
+return len(result) == 1 and result[0][path] == self.filepath
+else:
+self.log.info(Poking for filepath {self.filepath} to a non empty 
directory
+  .format(**locals()))
+result.pop(0)
+return 

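An illustrative wiring of the two contrib sensors above; the HDFS path, regex, connection ID, and `dag` object are placeholders.

```
import re
from airflow.contrib.sensors.hdfs_sensor import HdfsSensorRegex, HdfsSensorFolder

wait_for_parts = HdfsSensorRegex(
    task_id='wait_for_part_files',
    filepath='/data/exports/2018-08-27',
    regex=re.compile(r'part-\d+\.csv'),   # matched against each file name
    hdfs_conn_id='hdfs_default',
    dag=dag)

wait_for_non_empty = HdfsSensorFolder(
    task_id='wait_for_non_empty_dir',
    filepath='/data/exports/2018-08-27',
    be_empty=False,                       # poke until the directory has content
    hdfs_conn_id='hdfs_default',
    dag=dag)
```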
[25/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/operators/s3_to_gcs_operator.html
--
diff --git a/_modules/airflow/contrib/operators/s3_to_gcs_operator.html 
b/_modules/airflow/contrib/operators/s3_to_gcs_operator.html
new file mode 100644
index 000..754fcb2
--- /dev/null
+++ b/_modules/airflow/contrib/operators/s3_to_gcs_operator.html
@@ -0,0 +1,425 @@
+  [Sphinx HTML scaffolding omitted: page head, scripts, and navigation sidebar for "airflow.contrib.operators.s3_to_gcs_operator" (Airflow Documentation)]
+  Source code for airflow.contrib.operators.s3_to_gcs_operator
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+
+from tempfile import NamedTemporaryFile
+
+from airflow.contrib.hooks.gcs_hook import 
(GoogleCloudStorageHook,
+_parse_gcs_url)
+from airflow.contrib.operators.s3_list_operator import S3ListOperator
+from airflow.exceptions import AirflowException
+from airflow.hooks.S3_hook 
import S3Hook
+from airflow.utils.decorators 
import apply_defaults
+
+
+[docs]class S3ToGoogleCloudStorageOperator(S3ListOperator):
+
+Synchronizes an S3 key, possibly a prefix, with a Google 
Cloud Storage
+destination path.
+
+:param bucket: The S3 bucket where to find the 
objects.
+:type bucket: string
+:param prefix: Prefix string which filters objects whose 
name begin with
+such prefix.
+:type prefix: string
+:param delimiter: the delimiter marks key 
hierarchy.
+:type delimiter: string
+:param aws_conn_id: The source S3 connection
+:type aws_conn_id: string
+:param dest_gcs_conn_id: The destination connection ID to 
use
+when connecting to Google Cloud Storage.
+:type dest_gcs_conn_id: string
+:param dest_gcs: The destination Google Cloud Storage 
bucket and prefix
+where you want to store the files.
+:type dest_gcs: string
+:param delegate_to: The account to impersonate, if 
any.
+For this to work, the service account making the 
request must have
+domain-wide delegation enabled.
+:type delegate_to: string
+:param replace: Whether you want to replace existing 
destination files
+or not.
+:type replace: bool
+
+
+**Example**:
+.. code-block:: python
+   s3_to_gcs_op = S3ToGoogleCloudStorageOperator(
+task_id=s3_to_gcs_example,
+bucket=my-s3-bucket,
+prefix=data/customers-201804,
+
dest_gcs_conn_id=google_cloud_default,
+
dest_gcs=gs://my.gcs.bucket/some/customers/,
+replace=False,
+dag=my-dag)
+
+Note that ``bucket``, ``prefix``, ``delimiter`` and 
``dest_gcs`` are
+templated, so you can use variables in them if you 
wish.
+
+
+template_fields = (bucket, prefix, delimiter, dest_gcs)
+ui_color = #e09411
+
+@apply_defaults
+def __init__(self,
+ bucket,

[14/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/hooks/slack_hook.html
--
diff --git a/_modules/airflow/hooks/slack_hook.html 
b/_modules/airflow/hooks/slack_hook.html
new file mode 100644
index 000..f331546
--- /dev/null
+++ b/_modules/airflow/hooks/slack_hook.html
@@ -0,0 +1,296 @@
+  [Sphinx HTML scaffolding omitted: page head, scripts, and navigation sidebar for "airflow.hooks.slack_hook" (Airflow Documentation)]
+  Source code for airflow.hooks.slack_hook
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+
+from slackclient import SlackClient
+from airflow.hooks.base_hook 
import BaseHook
+from airflow.exceptions import AirflowException
+
+
+[docs]class SlackHook(BaseHook):
+
+   Interact with Slack, using slackclient library.
+
+
+def __init__(self, token=None, slack_conn_id=None):
+
+Takes both Slack API token directly and connection 
that has Slack API token.
+
+If both supplied, Slack API token will be used.
+
+:param token: Slack API token
+:type token: string
+:param slack_conn_id: connection that has Slack API 
token in the password field
+:type slack_conn_id: string
+
+self.token = self.__get_token(token, slack_conn_id)
+
+def __get_token(self, token, slack_conn_id):
+if token is not None:
+return token
+elif slack_conn_id is not None:
+conn = self.get_connection(slack_conn_id)
+
+if not getattr(conn, password, None):
+raise AirflowException(Missing token(password) in Slack connection)
+return conn.password
+else:
+raise AirflowException(Cannot get token: No valid Slack token nor slack_conn_id 
supplied.)
+
+def call(self, method, api_params):
+sc = SlackClient(self.token)
+rc = sc.api_call(method, **api_params)
+
+if not rc[ok]:
+msg = Slack API call failed ({}).format(rc[error])
+raise AirflowException(msg)
+
+
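A minimal usage sketch for the hook above; the connection ID, channel, and message text are placeholders.

```
from airflow.hooks.slack_hook import SlackHook

hook = SlackHook(slack_conn_id='slack_default')   # token is read from the connection's password
hook.call('chat.postMessage', {
    'channel': '#data-alerts',
    'text': 'Airflow 1.10.0 documentation published',
})
```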
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/hooks/sqlite_hook.html

[04/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/sensors/hdfs_sensor.html
--
diff --git a/_modules/airflow/sensors/hdfs_sensor.html 
b/_modules/airflow/sensors/hdfs_sensor.html
new file mode 100644
index 000..7747a54
--- /dev/null
+++ b/_modules/airflow/sensors/hdfs_sensor.html
@@ -0,0 +1,352 @@
+  [Sphinx HTML scaffolding omitted: page head, scripts, and navigation sidebar for "airflow.sensors.hdfs_sensor" (Airflow Documentation)]
+  Source code for airflow.sensors.hdfs_sensor
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+
+import re
+import sys
+from builtins import str
+
+from airflow import settings
+from airflow.hooks.hdfs_hook 
import HDFSHook
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.utils.decorators 
import apply_defaults
+from airflow.utils.log.logging_mixin import 
LoggingMixin
+
+
+[docs]class HdfsSensor(BaseSensorOperator):
+
+Waits for a file or folder to land in HDFS
+
+template_fields = (filepath,)
+ui_color = settings.WEB_COLORS[LIGHTBLUE]
+
+@apply_defaults
+def __init__(self,
+ filepath,
+ hdfs_conn_id=hdfs_default,
+ ignored_ext=[_COPYING_],
+ ignore_copying=True,
+ file_size=None,
+ hook=HDFSHook,
+ *args,
+ **kwargs):
+super(HdfsSensor, self).__init__(*args, **kwargs)
+self.filepath = filepath
+self.hdfs_conn_id = hdfs_conn_id
+self.file_size = file_size
+self.ignored_ext = ignored_ext
+self.ignore_copying = ignore_copying
+self.hook = hook
+
+[docs]
@staticmethod
+def filter_for_filesize(result, size=None):
+
+Will test the filepath result and test if its size is 
at least self.filesize
+
+:param result: a list of dicts returned by Snakebite 
ls
+:param size: the file size in MB a file should be at 
least to trigger True
+:return: (bool) depending on the matching 
criteria
+
+if size:
+log = LoggingMixin().log
+log.debug(
+Filtering for file size = 
%s in files: %s,
+size, map(lambda 
x: x[path], result)
+)
+size *= settings.MEGABYTE
+result = [x for x in result 
if x[length] = size]
+log.debug(HdfsSensor.poke: after size filter result is %s, result)
+return result
+
+[docs]
@staticmethod
+def filter_for_ignored_ext(result, ignored_ext, ignore_copying):
+
+Will filter if instructed to do so the result to 
remove matching criteria
+
+:param result: (list) of dicts returned by Snakebite 
ls
+:param ignored_ext: (list) of ignored 
extensions
+:param ignore_copying: (bool) shall we ignore ?
+:return: (list) of dicts which were not removed
+
+if ignore_copying:

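A basic usage sketch for the base sensor above; the path, size threshold, and `dag` object are placeholders.

```
from airflow.sensors.hdfs_sensor import HdfsSensor

wait_for_file = HdfsSensor(
    task_id='wait_for_hdfs_file',
    filepath='/data/incoming/events.json',
    file_size=1,                  # require at least 1 MB, see filter_for_filesize above
    ignore_copying=True,          # skip files still being written (_COPYING_)
    hdfs_conn_id='hdfs_default',
    dag=dag)
```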
[19/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/sensors/redis_key_sensor.html
--
diff --git a/_modules/airflow/contrib/sensors/redis_key_sensor.html 
b/_modules/airflow/contrib/sensors/redis_key_sensor.html
new file mode 100644
index 000..55e26cd
--- /dev/null
+++ b/_modules/airflow/contrib/sensors/redis_key_sensor.html
@@ -0,0 +1,282 @@
+  [Sphinx HTML scaffolding omitted: page head, scripts, and navigation sidebar for "airflow.contrib.sensors.redis_key_sensor" (Airflow Documentation)]
+  Source code for airflow.contrib.sensors.redis_key_sensor
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+from airflow.contrib.hooks.redis_hook import RedisHook
+from airflow.sensors.base_sensor_operator import BaseSensorOperator
+from airflow.utils.decorators 
import apply_defaults
+
+
+[docs]class RedisKeySensor(BaseSensorOperator):
+
+Checks for the existence of a key in a Redis 
database
+
+template_fields = (key,)
+ui_color = #f0eee4
+
+@apply_defaults
+def __init__(self, key, redis_conn_id, *args, **kwargs):
+
+Create a new RedisKeySensor
+
+:param key: The key to be monitored
+:type key: string
+:param redis_conn_id: The connection ID to use when 
connecting to Redis DB.
+:type redis_conn_id: string
+
+super(RedisKeySensor, self).__init__(*args, **kwargs)
+self.redis_conn_id = redis_conn_id
+self.key = key
+
+def poke(self, context):
+self.log.info(Sensor check existence of key: 
%s, self.key)
+return RedisHook(self.redis_conn_id).key_exists(self.key)
+
+
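A short sketch of wiring the sensor above into a DAG; the key, connection ID, and `dag` object are placeholders.

```
from airflow.contrib.sensors.redis_key_sensor import RedisKeySensor

wait_for_key = RedisKeySensor(
    task_id='wait_for_cache_key',
    key='daily_report_ready',
    redis_conn_id='redis_default',
    dag=dag)
```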
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/sensors/sftp_sensor.html
--
diff --git a/_modules/airflow/contrib/sensors/sftp_sensor.html 
b/_modules/airflow/contrib/sensors/sftp_sensor.html
new file mode 100644
index 000..1df8d4f
--- /dev/null
+++ b/_modules/airflow/contrib/sensors/sftp_sensor.html
@@ -0,0 +1,287 @@
+  [Sphinx HTML scaffolding omitted (truncated): page head for "airflow.contrib.sensors.sftp_sensor" (Airflow Documentation)]

[49/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/hooks/bigquery_hook.html
--
diff --git a/_modules/airflow/contrib/hooks/bigquery_hook.html 
b/_modules/airflow/contrib/hooks/bigquery_hook.html
index 1926c79..aff2ebd 100644
--- a/_modules/airflow/contrib/hooks/bigquery_hook.html
+++ b/_modules/airflow/contrib/hooks/bigquery_hook.html
@@ -91,7 +91,7 @@
 Quick Start
 Installation
 Tutorial
-Configuration
+How-to Guides
 UI / Screenshots
 Concepts
 Data Profiling
@@ -99,8 +99,10 @@
 Scheduling & Triggers
 Plugins
 Security
+Time zones
 Experimental Rest API
 Integration
+Lineage
 FAQ
 API Reference
 
@@ -169,39 +171,45 @@
   Source code for airflow.contrib.hooks.bigquery_hook
 # -*- coding: utf-8 -*-
 #
-# Licensed under the Apache License, Version 2.0 (the 
License);
-# you may not use this file except in compliance with the 
License.
-# You may obtain a copy of the License at
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
 #
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, 
software
-# distributed under the License is distributed on an AS 
IS BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either 
express or implied.
-# See the License for the specific language governing 
permissions and
-# limitations under the License.
-#
-
 
 This module contains a BigQuery Hook, as well as a very basic 
PEP 249
 implementation for BigQuery.
 
 
 import time
-
-from apiclient.discovery import build, 
HttpError
-from googleapiclient import errors
 from builtins import range
-from pandas_gbq.gbq import GbqConnector, \
-_parse_data as gbq_parse_data, \
-_check_google_client_version as gbq_check_google_client_version, \
-_test_google_api_imports as 
gbq_test_google_api_imports
-from pandas.tools.merge import concat
+
 from past.builtins import basestring
 
+from airflow import AirflowException
 from airflow.contrib.hooks.gcp_api_base_hook import GoogleCloudBaseHook
 from airflow.hooks.dbapi_hook 
import DbApiHook
 from airflow.utils.log.logging_mixin import 
LoggingMixin
+from apiclient.discovery import HttpError, build
+from googleapiclient import errors
+from pandas_gbq.gbq import \
+_check_google_client_version as gbq_check_google_client_version
+from pandas_gbq import read_gbq
+from pandas_gbq.gbq import \
+_test_google_api_imports as 
gbq_test_google_api_imports
+from pandas_gbq.gbq import GbqConnector
 
 
 [docs]class BigQueryHook(GoogleCloudBaseHook, DbApiHook, 
LoggingMixin):
@@ -213,10 +221,11 @@
 
 def __init__(self,
  bigquery_conn_id=bigquery_default,
- delegate_to=None):
+ delegate_to=None,
+ use_legacy_sql=True):
 super(BigQueryHook, self).__init__(
-conn_id=bigquery_conn_id,
-delegate_to=delegate_to)
+gcp_conn_id=bigquery_conn_id, delegate_to=delegate_to)
+self.use_legacy_sql = use_legacy_sql
 
 [docs]
def get_conn(self):
 
@@ -224,7 +233,10 @@
 
 service = self.get_service()
 project = self._get_field(project)
-return BigQueryConnection(service=service, project_id=project)
+return BigQueryConnection(
+service=service,
+project_id=project,
+use_legacy_sql=self.use_legacy_sql)
 
 [docs]
def get_service(self):
 
@@ -241,7 +253,7 @@
 
 raise NotImplementedError()
 
-[docs]
def get_pandas_df(self, bql, parameters=None, dialect=legacy):
+[docs]
def get_pandas_df(self, bql, parameters=None, dialect=None):
 
 Returns a Pandas DataFrame for the results produced 
by a BigQuery
 query. The DbApiHook method must be overridden 
because Pandas
@@ -252,35 +264,31 @@
 
 :param bql: The BigQuery SQL to execute.
 :type bql: string
-:param parameters: The parameters to render the SQL 
query with (not used, leave to override superclass method)
+:param parameters: The parameters to render the SQL 
query with (not
+used, leave to override superclass method)
 :type 

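A hedged sketch of the new constructor argument introduced in this diff; the connection ID and query are placeholders.

```
from airflow.contrib.hooks.bigquery_hook import BigQueryHook

hook = BigQueryHook(bigquery_conn_id='bigquery_default',
                    use_legacy_sql=False)        # run standard SQL instead of legacy SQL
df = hook.get_pandas_df('SELECT 1 AS x')         # returns a pandas DataFrame
```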
[43/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/hooks/gcs_hook.html
--
diff --git a/_modules/airflow/contrib/hooks/gcs_hook.html 
b/_modules/airflow/contrib/hooks/gcs_hook.html
index 6f3bc49..5db94f8 100644
--- a/_modules/airflow/contrib/hooks/gcs_hook.html
+++ b/_modules/airflow/contrib/hooks/gcs_hook.html
@@ -91,7 +91,7 @@
 Quick Start
 Installation
 Tutorial
-Configuration
+How-to Guides
 UI / Screenshots
 Concepts
 Data Profiling
@@ -99,8 +99,10 @@
 Scheduling  Triggers
 Plugins
 Security
+Time zones
 Experimental Rest API
 Integration
+Lineage
 FAQ
 API Reference
 
@@ -169,23 +171,31 @@
   Source code for airflow.contrib.hooks.gcs_hook
 # -*- coding: utf-8 -*-
 #
-# Licensed under the Apache License, Version 2.0 (the 
License);
-# you may not use this file except in compliance with the 
License.
-# You may obtain a copy of the License at
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
 #
-# http://www.apache.org/licenses/LICENSE-2.0
+#   http://www.apache.org/licenses/LICENSE-2.0
 #
-# Unless required by applicable law or agreed to in writing, 
software
-# distributed under the License is distributed on an AS 
IS BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either 
express or implied.
-# See the License for the specific language governing 
permissions and
-# limitations under the License.
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
 #
 from apiclient.discovery import build
 from apiclient.http import MediaFileUpload
 from googleapiclient import errors
 
 from airflow.contrib.hooks.gcp_api_base_hook import GoogleCloudBaseHook
+from airflow.exceptions import AirflowException
+
+import re
 
 
 [docs]class GoogleCloudStorageHook(GoogleCloudBaseHook):
@@ -195,7 +205,7 @@
 
 
 def __init__(self,
- google_cloud_storage_conn_id=google_cloud_storage_default,
+ google_cloud_storage_conn_id=google_cloud_default,
  delegate_to=None):
 super(GoogleCloudStorageHook, self).__init__(google_cloud_storage_conn_id,
  delegate_to)
@@ -207,7 +217,6 @@
 http_authorized = self._authorize()
 return build(storage, v1, 
http=http_authorized)
 
-
 # pylint:disable=redefined-builtin
 [docs]
def copy(self, source_bucket, source_object, destination_bucket=None,
  destination_object=None):
@@ -217,10 +226,10 @@
 destination_bucket or destination_object can be 
omitted, in which case
 source bucket/object is used, but not both.
 
-:param bucket: The bucket of the object to copy 
from.
-:type bucket: string
-:param object: The object to copy.
-:type object: string
+:param source_bucket: The bucket of the object to 
copy from.
+:type source_bucket: string
+:param source_object: The object to copy.
+:type source_object: string
 :param destination_bucket: The destination of the 
object to copied to.
 Can be omitted; then the same bucket is 
used.
 :type destination_bucket: string
@@ -252,9 +261,60 @@
 return False
 raise
 
+[docs]
def rewrite(self, source_bucket, source_object, destination_bucket,
+destination_object=None):
+
+Has the same functionality as copy, except that will 
work on files
+over 5 TB, as well as when copying between locations 
and/or storage
+classes.
+
+destination_object can be omitted, in which case 
source_object is used.
+
+:param source_bucket: The bucket of the object to 
copy from.
+:type source_bucket: string
+:param source_object: The object to copy.
+:type source_object: string
+:param destination_bucket: The destination of the 
object to copied to.
+:type destination_bucket: string
+:param destination_object: The (renamed) path of the 
object if given.
+Can be omitted; then the same name is used.
+
+destination_object = 
destination_object or source_object
+if (source_bucket == destination_bucket and
+source_object == 
destination_object):
+raise ValueError(
+
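
For reference, a short sketch exercising the copy()/rewrite() methods documented above; connection id, bucket and object names are placeholders:

```
# Sketch only: bucket and object names are placeholders.
from airflow.contrib.hooks.gcs_hook import GoogleCloudStorageHook

gcs = GoogleCloudStorageHook(google_cloud_storage_conn_id='google_cloud_default')
# Plain copy within the same location and storage class.
gcs.copy(source_bucket='src-bucket', source_object='data/file.csv',
         destination_bucket='dst-bucket')
# rewrite() is the new method for objects over 5 TB or cross-location/storage-class copies.
gcs.rewrite(source_bucket='src-bucket', source_object='big/file.bin',
            destination_bucket='archive-bucket')
```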

[37/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/hooks/ssh_hook.html
--
diff --git a/_modules/airflow/contrib/hooks/ssh_hook.html 
b/_modules/airflow/contrib/hooks/ssh_hook.html
new file mode 100644
index 000..08f1831
--- /dev/null
+++ b/_modules/airflow/contrib/hooks/ssh_hook.html
@@ -0,0 +1,470 @@
+  Source code for airflow.contrib.hooks.ssh_hook
+# -*- coding: utf-8 -*-
+#
+# Copyright 2012-2015 Spotify AB
+# Ported to Airflow by Bolke de Bruin
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+
+import getpass
+import os
+
+import paramiko
+from paramiko.config import SSH_PORT
+
+from contextlib import contextmanager
+from airflow.exceptions import AirflowException
+from airflow.hooks.base_hook 
import BaseHook
+from airflow.utils.log.logging_mixin import 
LoggingMixin
+
+
+[docs]class SSHHook(BaseHook, LoggingMixin):
+
+Hook for ssh remote execution using Paramiko.
+ref: https://github.com/paramiko/paramiko
+This hook also lets you create ssh tunnel and serve as 
basis for SFTP file transfer
+
+:param ssh_conn_id: connection id from airflow 
Connections from where all the required
+parameters can be fetched like username, password or 
key_file.
+Though the priority is given to the param passed during init
+:type ssh_conn_id: str
+:param remote_host: remote host to connect
+:type remote_host: str
+:param username: username to connect to the 
remote_host
+:type username: str
+:param password: password of the username to connect to 
the remote_host
+:type password: str
+:param key_file: key file to use to connect to the 
remote_host.
+:type key_file: str
+:param port: port of remote host to connect (Default is 
paramiko SSH_PORT)
+:type port: int
+:param timeout: timeout for the attempt to connect to the 
remote_host.
+:type timeout: int
+:param keepalive_interval: send a keepalive packet to 
remote host every keepalive_interval seconds
+:type keepalive_interval: int
+
+
+def __init__(self,
+ ssh_conn_id=None,
+ remote_host=None,
+ username=None,
+ password=None,
+ key_file=None,
+ port=SSH_PORT,
+ timeout=10,
+ keepalive_interval=30
+ ):
+super(SSHHook, self).__init__(ssh_conn_id)
+self.ssh_conn_id = ssh_conn_id
+self.remote_host = remote_host
+self.username = username
+self.password = password
+self.key_file = key_file
+self.timeout = timeout
+self.keepalive_interval = keepalive_interval
+# Default values, overridable from Connection
+self.compress = True
+self.no_host_key_check = True
+self.client = None
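
A hedged usage sketch for the hook above; `ssh_default` is an assumed connection id and the command is arbitrary:

```
# Sketch only: assumes an `ssh_default` connection exists.
from airflow.contrib.hooks.ssh_hook import SSHHook

hook = SSHHook(ssh_conn_id='ssh_default', timeout=10, keepalive_interval=30)
client = hook.get_conn()                # a paramiko SSHClient, per the docstring above
stdin, stdout, stderr = client.exec_command('uname -a')
print(stdout.read().decode('utf-8'))
client.close()
```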

[23/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/operators/sqoop_operator.html
--
diff --git a/_modules/airflow/contrib/operators/sqoop_operator.html 
b/_modules/airflow/contrib/operators/sqoop_operator.html
new file mode 100644
index 000..31fafec
--- /dev/null
+++ b/_modules/airflow/contrib/operators/sqoop_operator.html
@@ -0,0 +1,467 @@
+  Source code for airflow.contrib.operators.sqoop_operator
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+#
+
+
+This module contains a sqoop 1 operator
+
+import os
+import signal
+
+from airflow.contrib.hooks.sqoop_hook import SqoopHook
+from airflow.exceptions import AirflowException
+from airflow.models import BaseOperator
+from airflow.utils.decorators 
import apply_defaults
+
+
+[docs]class SqoopOperator(BaseOperator):
+
+Execute a Sqoop job.
+Documentation for Apache Sqoop can be found here: 
https://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html.
+
+template_fields = (conn_id, cmd_type, table, query, target_dir, file_type, columns, split_by,
+   where, export_dir, input_null_string, input_null_non_string, staging_table,
+   enclosed_by, escaped_by, input_fields_terminated_by, 
input_lines_terminated_by,
+   input_optionally_enclosed_by, properties, extra_import_options, driver,
+   extra_export_options, hcatalog_database, hcatalog_table,)
+ui_color = #7D8CA4
+
+@apply_defaults
+def __init__(self,
+ conn_id=sqoop_default,
+ cmd_type=import,
+ table=None,
+ query=None,
+ target_dir=None,
+ append=None,
+ file_type=text,
+ columns=None,
+ num_mappers=None,
+ split_by=None,
+ where=None,
+ export_dir=None,
+ input_null_string=None,
+ input_null_non_string=None,
+ staging_table=None,
+ clear_staging_table=False,
+ enclosed_by=None,
+ escaped_by=None,
+ input_fields_terminated_by=None,
+ input_lines_terminated_by=None,
+ input_optionally_enclosed_by=None,
+ batch=False,
+ direct=False,
+ driver=None,
+ verbose=False,
+ relaxed_isolation=False,
+ properties=None,
+ hcatalog_database=None,
+ hcatalog_table=None,
+ create_hcatalog_table=False,
+ extra_import_options=None,
+ extra_export_options=None,
+ *args,
+ **kwargs):
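
A minimal task-level sketch of the operator whose constructor is shown above; the connection id, table and `dag` object are invented for illustration:

```
# Sketch only: `sqoop_default` connection, table name and `dag` are placeholders.
from airflow.contrib.operators.sqoop_operator import SqoopOperator

import_orders = SqoopOperator(
    task_id='sqoop_import_orders',
    conn_id='sqoop_default',
    cmd_type='import',
    table='orders',
    target_dir='/user/airflow/orders',
    num_mappers=4,
    dag=dag)   # `dag` assumed to be defined elsewhere
```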

[50/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/hooks/aws_dynamodb_hook.html
--
diff --git a/_modules/airflow/contrib/hooks/aws_dynamodb_hook.html 
b/_modules/airflow/contrib/hooks/aws_dynamodb_hook.html
new file mode 100644
index 000..68c9921
--- /dev/null
+++ b/_modules/airflow/contrib/hooks/aws_dynamodb_hook.html
@@ -0,0 +1,300 @@
+  Source code for airflow.contrib.hooks.aws_dynamodb_hook
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+
+from airflow.exceptions import AirflowException
+from airflow.contrib.hooks.aws_hook import 
AwsHook
+
+
+[docs]class AwsDynamoDBHook(AwsHook):
+
+Interact with AWS DynamoDB.
+
+:param table_keys: partition key and sort key
+:type table_keys: list
+:param table_name: target DynamoDB table
+:type table_name: str
+:param region_name: aws region name (example: 
us-east-1)
+:type region_name: str
+
+
+def __init__(self, table_keys=None, table_name=None, region_name=None, *args, **kwargs):
+self.table_keys = table_keys
+self.table_name = table_name
+self.region_name = region_name
+super(AwsDynamoDBHook, self).__init__(*args, **kwargs)
+
+def get_conn(self):
+self.conn = self.get_resource_type(dynamodb, self.region_name)
+return self.conn
+
+[docs]
def write_batch_data(self, items):
+
+Write batch items to dynamodb table with provisioned 
throughout capacity.
+
+
+dynamodb_conn = self.get_conn()
+
+try:
+table = dynamodb_conn.Table(self.table_name)
+
+with table.batch_writer(overwrite_by_pkeys=self.table_keys) as 
batch:
+for item in items:
+batch.put_item(Item=item)
+return True
+except Exception as general_error:
+raise AirflowException(
+Failed to insert items in dynamodb, 
error: {error}.format(
+error=str(general_error)
+)
+)
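
A small sketch of write_batch_data() as documented above; the table, key schema and AWS connection are assumed to exist already:

```
# Sketch only: assumes the DynamoDB table and an AWS connection already exist.
from airflow.contrib.hooks.aws_dynamodb_hook import AwsDynamoDBHook

hook = AwsDynamoDBHook(table_name='my_table',
                       table_keys=['pk'],
                       region_name='us-east-1')
hook.write_batch_data([{'pk': '1', 'value': 'a'},
                       {'pk': '2', 'value': 'b'}])
```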

[31/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/operators/file_to_gcs.html
--
diff --git a/_modules/airflow/contrib/operators/file_to_gcs.html 
b/_modules/airflow/contrib/operators/file_to_gcs.html
new file mode 100644
index 000..1be0683
--- /dev/null
+++ b/_modules/airflow/contrib/operators/file_to_gcs.html
@@ -0,0 +1,310 @@
+  Source code for airflow.contrib.operators.file_to_gcs
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+#
+
+from airflow.contrib.hooks.gcs_hook import 
GoogleCloudStorageHook
+from airflow.models import BaseOperator
+from airflow.utils.decorators 
import apply_defaults
+
+
+[docs]class FileToGoogleCloudStorageOperator(BaseOperator):
+
+Uploads a file to Google Cloud Storage
+
+:param src: Path to the local file
+:type src: string
+:param dst: Destination path within the specified 
bucket
+:type dst: string
+:param bucket: The bucket to upload to
+:type bucket: string
+:param google_cloud_storage_conn_id: The Airflow 
connection ID to upload with
+:type google_cloud_storage_conn_id: string
+:param mime_type: The mime-type string
+:type mime_type: string
+:param delegate_to: The account to impersonate, if 
any
+:type delegate_to: string
+
+template_fields = (src, 
dst, bucket)
+
+@apply_defaults
+def __init__(self,
+ src,
+ dst,
+ bucket,
+ google_cloud_storage_conn_id=google_cloud_default,
+ mime_type=application/octet-stream,
+ delegate_to=None,
+ *args,
+ **kwargs):
+super(FileToGoogleCloudStorageOperator, self).__init__(*args, **kwargs)
+self.src = src
+self.dst = dst
+self.bucket = bucket
+self.google_cloud_storage_conn_id = google_cloud_storage_conn_id
+self.mime_type = mime_type
+self.delegate_to = delegate_to
+
+[docs]
def execute(self, context):
+
+Uploads the file to Google cloud storage
+
+hook = GoogleCloudStorageHook(
+google_cloud_storage_conn_id=self.google_cloud_storage_conn_id,
+delegate_to=self.delegate_to)
+
+hook.upload(
+bucket=self.bucket,
+object=self.dst,
+mime_type=self.mime_type,
+filename=self.src)
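
For reference, a hedged task definition using the operator above; file path, bucket and `dag` are placeholders:

```
# Sketch only: file path, bucket and `dag` are placeholders.
from airflow.contrib.operators.file_to_gcs import FileToGoogleCloudStorageOperator

upload_report = FileToGoogleCloudStorageOperator(
    task_id='upload_report',
    src='/tmp/report.csv',
    dst='reports/report.csv',
    bucket='my-bucket',
    google_cloud_storage_conn_id='google_cloud_default',
    mime_type='text/csv',
    dag=dag)   # `dag` assumed to be defined elsewhere
```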

[12/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/operators/bash_operator.html
--
diff --git a/_modules/airflow/operators/bash_operator.html 
b/_modules/airflow/operators/bash_operator.html
new file mode 100644
index 000..d6d7dc6
--- /dev/null
+++ b/_modules/airflow/operators/bash_operator.html
@@ -0,0 +1,376 @@
+  Source code for airflow.operators.bash_operator
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+
+
+from builtins import bytes
+import os
+import signal
+from subprocess import Popen, 
STDOUT, PIPE
+from tempfile import gettempdir, NamedTemporaryFile
+
+from airflow import configuration as conf
+from airflow.exceptions import AirflowException
+from airflow.models import BaseOperator
+from airflow.utils.decorators 
import apply_defaults
+from airflow.utils.file import TemporaryDirectory
+
+
+# These variables are required in cases when BashOperator 
tasks use airflow specific code,
+# e.g. they import packages in the airflow context and the 
possibility of impersonation
+# gives not guarantee that these variables are available in 
the impersonated environment.
+# Hence, we need to propagate them in the Bash script used as 
a wrapper of commands in
+# this BashOperator.
+PYTHONPATH_VAR = PYTHONPATH
+AIRFLOW_HOME_VAR = AIRFLOW_HOME
+
+
+[docs]class BashOperator(BaseOperator):
+
+Execute a Bash script, command or set of commands.
+
+:param bash_command: The command, set of commands or 
reference to a
+bash script (must be .sh) to be 
executed.
+:type bash_command: string
+:param xcom_push: If xcom_push is True, the last line 
written to stdout
+will also be pushed to an XCom when the bash command 
completes.
+:type xcom_push: bool
+:param env: If env is not None, it must be a mapping that 
defines the
+environment variables for the new process; these are 
used instead
+of inheriting the current process environment, which 
is the default
+behavior. (templated)
+:type env: dict
+:type output_encoding: output encoding of bash 
command
+
+template_fields = (bash_command, env)
+template_ext = (.sh, 
.bash,)
+ui_color = #f0ede4
+
+@apply_defaults
+def __init__(
+self,
+bash_command,
+xcom_push=False,
+env=None,
+output_encoding=utf-8,
+*args, **kwargs):
+
+super(BashOperator, self).__init__(*args, **kwargs)
+self.bash_command = bash_command
+self.env = env
+self.xcom_push_flag = xcom_push
+self.output_encoding = output_encoding
+
+[docs]
def execute(self, context):
+
+Execute the bash command in a temporary 
directory
+which will be cleaned afterwards
+
+self.log.info(Tmp dir 
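
A short sketch of the operator above, showing the templated env mapping and xcom_push behaviour described in the docstring; the `dag` object is assumed:

```
# Sketch only: `dag` is assumed to be defined elsewhere.
from airflow.operators.bash_operator import BashOperator

say_date = BashOperator(
    task_id='say_date',
    bash_command='echo "run date is $RUN_DATE"',
    env={'RUN_DATE': '{{ ds }}'},   # env is templated, per the docstring above
    xcom_push=True,                 # last stdout line is pushed to XCom
    dag=dag)
```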

[29/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/operators/jenkins_job_trigger_operator.html
--
diff --git 
a/_modules/airflow/contrib/operators/jenkins_job_trigger_operator.html 
b/_modules/airflow/contrib/operators/jenkins_job_trigger_operator.html
new file mode 100644
index 000..1bb3657
--- /dev/null
+++ b/_modules/airflow/contrib/operators/jenkins_job_trigger_operator.html
@@ -0,0 +1,484 @@
+  Source code for 
airflow.contrib.operators.jenkins_job_trigger_operator
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+
+import time
+import socket
+import json
+from airflow.exceptions import AirflowException
+from airflow.models import BaseOperator
+from airflow.utils.decorators 
import apply_defaults
+from airflow.contrib.hooks.jenkins_hook import JenkinsHook
+import jenkins
+from jenkins import JenkinsException
+from six.moves.urllib.request 
import Request, urlopen
+from six.moves.urllib.error 
import HTTPError, URLError
+
+try:
+basestring
+except NameError:
+basestring = str  # For python3 compatibility
+
+
+# TODO Use jenkins_urlopen instead when it will be 
available
+# in the stable python-jenkins version ( 0.4.15)
+def jenkins_request_with_headers(jenkins_server, req, add_crumb=True):
+
+We need to get the headers in addition to the body 
answer
+to get the location from them
+This function is just a copy of the one present in 
python-jenkins library
+with just the return call changed
+:param jenkins_server: The server to query
+:param req: The request to execute
+:param add_crumb: Boolean to indicate if it should add 
crumb to the request
+:return:
+
+try:
+if jenkins_server.auth:
+req.add_header(Authorization, jenkins_server.auth)
+if add_crumb:
+jenkins_server.maybe_add_crumb(req)
+response = urlopen(req, timeout=jenkins_server.timeout)
+response_body = response.read()
+response_headers = response.info()
+if response_body is None:
+raise jenkins.EmptyResponseException(
+Error communicating with 
server[%s]: 
+empty response % jenkins_server.server)
+return {body: response_body.decode(utf-8), headers: response_headers}
+except HTTPError as e:
+# Jenkins's funky authentication means it's nigh impossible to
+# distinguish errors.
+if e.code in [401, 403, 500]:
+# six.moves.urllib.error.HTTPError provides a 
reason
+# attribute for all python version except for ver 
2.6
+# Falling back to HTTPError.msg since it contains 
the
+# same info as reason
+raise JenkinsException(
+Error in request.  +
+  
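
A hedged sketch of calling the helper above directly; the Jenkins URL, credentials and job path are invented for illustration:

```
# Sketch only: server URL, credentials and job path are placeholders.
import jenkins
from six.moves.urllib.request import Request

server = jenkins.Jenkins('http://jenkins.example.com', 'user', 'api_token')
req = Request('http://jenkins.example.com/job/my-job/api/json')
resp = jenkins_request_with_headers(server, req)   # returns {'body': ..., 'headers': ...}
print(resp['headers'].get('Location'))
```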

[24/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/operators/spark_jdbc_operator.html
--
diff --git a/_modules/airflow/contrib/operators/spark_jdbc_operator.html 
b/_modules/airflow/contrib/operators/spark_jdbc_operator.html
new file mode 100644
index 000..5a9ddab
--- /dev/null
+++ b/_modules/airflow/contrib/operators/spark_jdbc_operator.html
@@ -0,0 +1,449 @@
+  Source code for airflow.contrib.operators.spark_jdbc_operator
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+#
+from airflow.contrib.operators.spark_submit_operator import SparkSubmitOperator
+from airflow.contrib.hooks.spark_jdbc_hook import SparkJDBCHook
+from airflow.utils.decorators 
import apply_defaults
+
+
+[docs]class SparkJDBCOperator(SparkSubmitOperator):
+
+This operator extends the SparkSubmitOperator 
specifically for performing data
+transfers to/from JDBC-based databases with Apache Spark. 
As with the
+SparkSubmitOperator, it assumes that the 
spark-submit binary is available on the
+PATH.
+
+:param spark_app_name: Name of the job (default 
airflow-spark-jdbc)
+:type spark_app_name: str
+:param spark_conn_id: Connection id as configured in 
Airflow administration
+:type spark_conn_id: str
+:param spark_conf: Any additional Spark configuration 
properties
+:type spark_conf: dict
+:param spark_py_files: Additional python files used 
(.zip, .egg, or .py)
+:type spark_py_files: str
+:param spark_files: Additional files to upload to the 
container running the job
+:type spark_files: str
+:param spark_jars: Additional jars to upload and add to 
the driver and
+   executor classpath
+:type spark_jars: str
+:param num_executors: number of executor to run. This 
should be set so as to manage
+  the number of connections made with 
the JDBC database
+:type num_executors: int
+:param executor_cores: Number of cores per executor
+:type executor_cores: int
+:param executor_memory: Memory per executor (e.g. 1000M, 
2G)
+:type executor_memory: str
+:param driver_memory: Memory allocated to the driver 
(e.g. 1000M, 2G)
+:type driver_memory: str
+:param verbose: Whether to pass the verbose flag to 
spark-submit for debugging
+:type verbose: bool
+:param keytab: Full path to the file that contains the 
keytab
+:type keytab: str
+:param principal: The name of the kerberos principal used 
for keytab
+:type principal: str
+:param cmd_type: Which way the data should flow. 2 
possible values:
+ spark_to_jdbc: data written by spark 
from metastore to jdbc
+ jdbc_to_spark: data written by spark 
from jdbc to metastore
+:type 
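
A hedged task sketch for the operator described above; the JDBC-related parameter names follow the upstream operator but are not all visible in this fragment, and the connection ids, table names and `dag` are placeholders:

```
# Sketch only: connection ids, table names and `dag` are placeholders.
from airflow.contrib.operators.spark_jdbc_operator import SparkJDBCOperator

jdbc_to_hive = SparkJDBCOperator(
    task_id='jdbc_to_spark_orders',
    cmd_type='jdbc_to_spark',
    spark_conn_id='spark_default',
    jdbc_conn_id='jdbc_default',
    jdbc_table='orders',
    metastore_table='staging.orders',
    num_executors=2,
    executor_memory='2G',
    dag=dag)
```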

[26/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/operators/pubsub_operator.html
--
diff --git a/_modules/airflow/contrib/operators/pubsub_operator.html 
b/_modules/airflow/contrib/operators/pubsub_operator.html
new file mode 100644
index 000..99a540f
--- /dev/null
+++ b/_modules/airflow/contrib/operators/pubsub_operator.html
@@ -0,0 +1,669 @@
+  Source code for airflow.contrib.operators.pubsub_operator
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+
+from airflow.contrib.hooks.gcp_pubsub_hook import PubSubHook
+from airflow.models import BaseOperator
+from airflow.utils.decorators 
import apply_defaults
+
+
+[docs]class PubSubTopicCreateOperator(BaseOperator):
+Create a PubSub topic.
+
+By default, if the topic already exists, this operator 
will
+not cause the DAG to fail. ::
+
+with DAG(successful DAG) as dag:
+(
+dag
+ 
PubSubTopicCreateOperator(project=my-project,
+ 
topic=my_new_topic)
+ 
PubSubTopicCreateOperator(project=my-project,
+ 
topic=my_new_topic)
+)
+
+The operator can be configured to fail if the topic 
already exists. ::
+
+with DAG(failing DAG) as dag:
+(
+dag
+ 
PubSubTopicCreateOperator(project=my-project,
+ 
topic=my_new_topic)
+ 
PubSubTopicCreateOperator(project=my-project,
+ 
topic=my_new_topic,
+ 
fail_if_exists=True)
+)
+
+Both ``project`` and ``topic`` are templated so you can 
use
+variables in them.
+
+template_fields = [project, topic]
+ui_color = #0273d4
+
+@apply_defaults
+def __init__(
+self,
+project,
+topic,
+fail_if_exists=False,
+gcp_conn_id=google_cloud_default,
+delegate_to=None,
+*args,
+**kwargs):
+
+:param project: the GCP project ID where the topic 
will be created
+:type project: string
+:param topic: the topic to create. Do not include 
the
+full topic path. In other words, instead of
+``projects/{project}/topics/{topic}``, provide 
only
+``{topic}``. (templated)
+:type topic: string
+:param gcp_conn_id: The connection ID to use 
connecting to
+Google Cloud Platform.
+:type gcp_conn_id: string
+:param delegate_to: The account to impersonate, if 
any.
+For this to work, the service account making the 
request
+
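
For reference, a hedged sketch of the topic-create operator shown above inside a DAG; project id, topic name and `dag` are placeholders:

```
# Sketch only: project id, topic name and `dag` are placeholders.
from airflow.contrib.operators.pubsub_operator import PubSubTopicCreateOperator

create_topic = PubSubTopicCreateOperator(
    task_id='create_topic',
    project='my-project',
    topic='my_new_topic',
    fail_if_exists=False,          # an existing topic does not fail the DAG
    gcp_conn_id='google_cloud_default',
    dag=dag)
```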

[39/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/hooks/spark_submit_hook.html
--
diff --git a/_modules/airflow/contrib/hooks/spark_submit_hook.html 
b/_modules/airflow/contrib/hooks/spark_submit_hook.html
new file mode 100644
index 000..6903f5f
--- /dev/null
+++ b/_modules/airflow/contrib/hooks/spark_submit_hook.html
@@ -0,0 +1,799 @@
+  Source code for airflow.contrib.hooks.spark_submit_hook
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+#
+import os
+import subprocess
+import re
+import time
+
+from airflow.hooks.base_hook 
import BaseHook
+from airflow.exceptions import AirflowException
+from airflow.utils.log.logging_mixin import 
LoggingMixin
+from airflow.contrib.kubernetes import 
kube_client
+from kubernetes.client.rest 
import ApiException
+
+
+[docs]class SparkSubmitHook(BaseHook, LoggingMixin):
+
+This hook is a wrapper around the spark-submit binary to 
kick off a spark-submit job.
+It requires that the spark-submit binary is 
in the PATH or the spark_home to be
+supplied.
+:param conf: Arbitrary Spark configuration 
properties
+:type conf: dict
+:param conn_id: The connection id as configured in 
Airflow administration. When an
+invalid connection_id is supplied, it 
will default to yarn.
+:type conn_id: str
+:param files: Upload additional files to the executor 
running the job, separated by a
+  comma. Files will be placed in the working 
directory of each executor.
+  For example, serialized objects.
+:type files: str
+:param py_files: Additional python files used by the job, 
can be .zip, .egg or .py.
+:type py_files: str
+:param driver_classpath: Additional, driver-specific, 
classpath settings.
+:type driver_classpath: str
+:param jars: Submit additional jars to upload and place 
them in executor classpath.
+:type jars: str
+:param java_class: the main class of the Java 
application
+:type java_class: str
+:param packages: Comma-separated list of maven 
coordinates of jars to include on the
+driver and executor classpaths
+:type packages: str
+:param exclude_packages: Comma-separated list of maven 
coordinates of jars to exclude
+while resolving the dependencies provided in 
packages
+:type exclude_packages: str
+:param repositories: Comma-separated list of additional 
remote repositories to search
+for the maven coordinates given with 
packages
+:type repositories: str
+:param total_executor_cores: (Standalone  Mesos 
only) Total cores for all executors
+(Default: all the available cores on the worker)
+:type total_executor_cores: int
+:param executor_cores: (Standalone, YARN and Kubernetes 
only) Number of 
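
A hedged sketch of driving the hook above; submit() is assumed from the upstream hook and is not visible in this fragment, and the connection id and application path are placeholders:

```
# Sketch only: `spark_default` connection and application path are placeholders;
# submit() is assumed from the upstream hook, not shown in the fragment above.
from airflow.contrib.hooks.spark_submit_hook import SparkSubmitHook

hook = SparkSubmitHook(conn_id='spark_default',
                       conf={'spark.executor.memory': '2g'},
                       num_executors=2,
                       verbose=True)
hook.submit(application='/opt/jobs/wordcount.py')
```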

[16/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/hooks/hdfs_hook.html
--
diff --git a/_modules/airflow/hooks/hdfs_hook.html 
b/_modules/airflow/hooks/hdfs_hook.html
new file mode 100644
index 000..664c043
--- /dev/null
+++ b/_modules/airflow/hooks/hdfs_hook.html
@@ -0,0 +1,335 @@
+  Source code for airflow.hooks.hdfs_hook
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+
+from airflow.hooks.base_hook 
import BaseHook
+from airflow import configuration
+
+try:
+snakebite_imported = True
+from snakebite.client 
import Client, HAClient, 
Namenode, AutoConfigClient
+except ImportError:
+snakebite_imported = False
+
+from airflow.exceptions import AirflowException
+
+
+class HDFSHookException(AirflowException):
+pass
+
+
+[docs]class HDFSHook(BaseHook):
+
+Interact with HDFS. This class is a wrapper around the 
snakebite library.
+
+:param hdfs_conn_id: Connection id to fetch connection 
info
+:type conn_id: string
+:param proxy_user: effective user for HDFS 
operations
+:type proxy_user: string
+:param autoconfig: use snakebites automatically 
configured client
+:type autoconfig: bool
+
+def __init__(self, hdfs_conn_id=hdfs_default, proxy_user=None,
+ autoconfig=False):
+if not snakebite_imported:
+raise ImportError(
+This HDFSHook implementation requires 
snakebite, but 
+snakebite is not compatible with Python 
3 
+(as of August 2015). Please use Python 2 
if you require 
+this hook  -- or help by submitting a 
PR!)
+self.hdfs_conn_id = hdfs_conn_id
+self.proxy_user = proxy_user
+self.autoconfig = autoconfig
+
+[docs]  
  def get_conn(self):
+
+Returns a snakebite HDFSClient object.
+
+# When using HAClient, proxy_user must be the same, 
so is ok to always
+# take the first.
+effective_user = self.proxy_user
+autoconfig = self.autoconfig
+use_sasl = configuration.conf.get(core, security) == kerberos
+
+try:
+connections = self.get_connections(self.hdfs_conn_id)
+
+if not effective_user:
+effective_user = 
connections[0].login
+if not autoconfig:
+autoconfig = 
connections[0].extra_dejson.get(autoconfig,
+ False)
+hdfs_namenode_principal = connections[0].extra_dejson.get(
+hdfs_namenode_principal)
+except AirflowException:
+if not autoconfig:
+raise
+
+if autoconfig:
+# will read config info from $HADOOP_HOME conf 
files
+client = AutoConfigClient(effective_user=effective_user,
+  
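
A minimal sketch of the hook above; per the ImportError guard it needs Python 2 with snakebite installed, and the HDFS path here is a placeholder:

```
# Sketch only: requires Python 2 with snakebite installed, per the guard above.
from airflow.hooks.hdfs_hook import HDFSHook

hook = HDFSHook(hdfs_conn_id='hdfs_default')
client = hook.get_conn()                      # snakebite Client/HAClient/AutoConfigClient
for entry in client.ls(['/user/airflow']):    # path is a placeholder
    print(entry['path'])
```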

[34/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/operators/databricks_operator.html
--
diff --git a/_modules/airflow/contrib/operators/databricks_operator.html 
b/_modules/airflow/contrib/operators/databricks_operator.html
index cb38349..c39f769 100644
--- a/_modules/airflow/contrib/operators/databricks_operator.html
+++ b/_modules/airflow/contrib/operators/databricks_operator.html
@@ -91,7 +91,7 @@
 Quick Start
 Installation
 Tutorial
-Configuration
+How-to Guides
 UI / Screenshots
 Concepts
 Data Profiling
@@ -99,8 +99,10 @@
 Scheduling  Triggers
 Plugins
 Security
+Time zones
 Experimental Rest API
 Integration
+Lineage
 FAQ
 API Reference
 
@@ -169,17 +171,22 @@
   Source code for airflow.contrib.operators.databricks_operator
 # -*- coding: utf-8 -*-
 #
-# Licensed under the Apache License, Version 2.0 (the 
License);
-# you may not use this file except in compliance with the 
License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, 
software
-# distributed under the License is distributed on an AS 
IS BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either 
express or implied.
-# See the License for the specific language governing 
permissions and
-# limitations under the License.
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
 #
 
 import six
@@ -190,6 +197,10 @@
 from airflow.models import BaseOperator
 
 
+XCOM_RUN_ID_KEY = run_id
+XCOM_RUN_PAGE_URL_KEY = run_page_url
+
+
 [docs]class DatabricksSubmitRunOperator(BaseOperator):
 
 Submits an Spark job run to Databricks using the
@@ -307,6 +318,8 @@
 :param databricks_retry_limit: Amount of times retry if 
the Databricks backend is
 unreachable. Its value must be greater than or equal 
to 1.
 :type databricks_retry_limit: int
+:param do_xcom_push: Whether we should push run_id and 
run_page_url to xcom.
+:type do_xcom_push: boolean
 
 # Used in airflow.models.BaseOperator
 template_fields = (json,)
@@ -327,6 +340,7 @@
 databricks_conn_id=databricks_default,
 polling_period_seconds=30,
 databricks_retry_limit=3,
+do_xcom_push=False,
 **kwargs):
 
 Creates a new ``DatabricksSubmitRunOperator``.
@@ -356,6 +370,7 @@
 self.json = self._deep_string_coerce(self.json)
 # This variable will be used in case our task gets 
killed.
 self.run_id = None
+self.do_xcom_push = do_xcom_push
 
 def _deep_string_coerce(self, content, json_path=json):
 
@@ -393,8 +408,12 @@
 def execute(self, context):
 hook = self.get_hook()
 self.run_id = hook.submit_run(self.json)
-run_page_url = hook.get_run_page_url(self.run_id)
+if self.do_xcom_push:
+context[ti].xcom_push(key=XCOM_RUN_ID_KEY, value=self.run_id)
 self.log.info(Run submitted with run_id: %s, self.run_id)
+run_page_url = hook.get_run_page_url(self.run_id)
+if self.do_xcom_push:
+context[ti].xcom_push(key=XCOM_RUN_PAGE_URL_KEY, value=run_page_url)
 self._log_run_page_url(run_page_url)
 while True:
 run_state = hook.get_run_state(self.run_id)
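
For context, a hedged sketch wiring the new do_xcom_push flag into a task and pulling the pushed run_page_url downstream; the job JSON and `dag` are placeholders:

```
# Sketch only: the job JSON and `dag` are placeholders.
from airflow.contrib.operators.databricks_operator import DatabricksSubmitRunOperator

run_job = DatabricksSubmitRunOperator(
    task_id='run_notebook',
    json={'new_cluster': {'spark_version': '4.0.x-scala2.11', 'num_workers': 2},
          'notebook_task': {'notebook_path': '/Users/me/my_notebook'}},
    do_xcom_push=True,     # pushes run_id and run_page_url, per the diff above
    dag=dag)

# Downstream, e.g. in a templated field:
# "{{ ti.xcom_pull(task_ids='run_notebook', key='run_page_url') }}"
```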

http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/operators/dataflow_operator.html
--
diff --git a/_modules/airflow/contrib/operators/dataflow_operator.html 
b/_modules/airflow/contrib/operators/dataflow_operator.html
index 6807e24..d5d4041 100644
--- a/_modules/airflow/contrib/operators/dataflow_operator.html
+++ b/_modules/airflow/contrib/operators/dataflow_operator.html
@@ -91,7 +91,7 @@
 Quick Start
 Installation
 Tutorial
-Configuration
+How-to Guides
 UI / Screenshots
 Concepts
 Data Profiling
@@ -99,8 +99,10 @@
 Scheduling  Triggers
 Plugins
 Security
+Time zones
 Experimental Rest API
 Integration
+Lineage
 FAQ
 API Reference
 
@@ -169,25 +171,31 @@
   Source code for 

[05/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/operators/slack_operator.html
--
diff --git a/_modules/airflow/operators/slack_operator.html 
b/_modules/airflow/operators/slack_operator.html
new file mode 100644
index 000..40152da
--- /dev/null
+++ b/_modules/airflow/operators/slack_operator.html
@@ -0,0 +1,366 @@
+  Source code for airflow.operators.slack_operator
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under 
one
+# or more contributor license agreements.  See the NOTICE 
file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this 
file
+# to you under the Apache License, Version 2.0 (the
+# License); you may not use this file except in 
compliance
+# with the License.  You may obtain a copy of the License 
at
+# 
+#   http://www.apache.org/licenses/LICENSE-2.0
+# 
+# Unless required by applicable law or agreed to in 
writing,
+# software distributed under the License is distributed on 
an
+# AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS 
OF ANY
+# KIND, either express or implied.  See the License for 
the
+# specific language governing permissions and 
limitations
+# under the License.
+
+import json
+
+from airflow.models import BaseOperator
+from airflow.utils.decorators 
import apply_defaults
+from airflow.hooks.slack_hook 
import SlackHook
+from airflow.exceptions import AirflowException
+
+
+[docs]class SlackAPIOperator(BaseOperator):
+
+Base Slack Operator
+The SlackAPIPostOperator is derived from this 
operator.
+In the future additional Slack API Operators will be 
derived from this class as well
+
+:param slack_conn_id: Slack connection ID which its 
password is Slack API token
+:type slack_conn_id: string
+:param token: Slack API token 
(https://api.slack.com/web)
+:type token: string
+:param method: The Slack API Method to Call 
(https://api.slack.com/methods)
+:type method: string
+:param api_params: API Method call parameters 
(https://api.slack.com/methods)
+:type api_params: dict
+
+
+@apply_defaults
+def __init__(self,
+ slack_conn_id=None,
+ token=None,
+ method=None,
+ api_params=None,
+ *args, **kwargs):
+super(SlackAPIOperator, self).__init__(*args, **kwargs)
+
+if token is None and 
slack_conn_id is None:
+raise AirflowException(No valid Slack token nor slack_conn_id 
supplied.)
+if token is not None 
and slack_conn_id is not None:
+raise AirflowException(Cannot determine Slack credential when both token and 
slack_conn_id are supplied.)
+
+self.token = token
+self.slack_conn_id = slack_conn_id
+
+self.method = method
+self.api_params = api_params
+
+[docs]
def construct_api_call_params(self):
+
+Used by the execute function. Allows templating on 
the source fields of the api_call_params dict before construction
+
+Override in child classes.
+Each SlackAPIOperator child class is responsible for 
having a construct_api_call_params function
+which sets self.api_call_params with a dict of API 
call parameters (https://api.slack.com/methods)
+
+
+pass
+
+[docs]
def execute(self, **kwargs):
+
+SlackAPIOperator calls will not fail even if the call is unsuccessful.
+It should not prevent a DAG from completing in 
success
+
+if not self.api_params:
+
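
A hedged sketch of instantiating the base operator above with an explicit method and api_params; the connection id, channel and `dag` are placeholders:

```
# Sketch only: connection id, channel and `dag` are placeholders.
from airflow.operators.slack_operator import SlackAPIOperator

notify = SlackAPIOperator(
    task_id='notify_slack',
    slack_conn_id='slack_default',   # the connection password holds the Slack API token
    method='chat.postMessage',
    api_params={'channel': '#data-eng', 'text': 'Airflow 1.10.0 docs published'},
    dag=dag)
```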

[06/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/operators/sensors.html
--
diff --git a/_modules/airflow/operators/sensors.html 
b/_modules/airflow/operators/sensors.html
deleted file mode 100644
index 7414daf..000
--- a/_modules/airflow/operators/sensors.html
+++ /dev/null
@@ -1,929 +0,0 @@
-  Source code for airflow.operators.sensors
-# -*- coding: utf-8 -*-
-#
-# Licensed under the Apache License, Version 2.0 (the 
License);
-# you may not use this file except in compliance with the 
License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, 
software
-# distributed under the License is distributed on an AS 
IS BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either 
express or implied.
-# See the License for the specific language governing 
permissions and
-# limitations under the License.
-
-from __future__ import print_function
-from future import standard_library
-
-from airflow.utils.log.logging_mixin import 
LoggingMixin
-
-standard_library.install_aliases()
-from builtins import str
-from past.builtins import basestring
-
-from datetime import datetime
-from urllib.parse import urlparse
-from time import sleep
-import re
-import sys
-
-from airflow import settings
-from airflow.exceptions import AirflowException, AirflowSensorTimeout, AirflowSkipException
-from airflow.models import BaseOperator, TaskInstance
-from airflow.hooks.base_hook 
import BaseHook
-from airflow.hooks.hdfs_hook 
import HDFSHook
-from airflow.hooks.http_hook 
import HttpHook
-from airflow.utils.state import State
-from airflow.utils.decorators 
import apply_defaults
-
-
-[docs]class BaseSensorOperator(BaseOperator):
-
-Sensor operators are derived from this class an inherit 
these attributes.
-
-Sensor operators keep executing at a time interval and 
succeed when
-a criteria is met and fail if and when they time 
out.
-
-:param soft_fail: Set to true to mark the task as SKIPPED 
on failure
-:type soft_fail: bool
-:param poke_interval: Time in seconds that the job should 
wait in
-between each tries
-:type poke_interval: int
-:param timeout: Time, in seconds before the task times 
out and fails.
-:type timeout: int
-
-ui_color = #e6f1f2
-
-@apply_defaults
-def __init__(
-self,
-poke_interval=60,
-timeout=60*60*24*7,
-soft_fail=False,
-*args, **kwargs):
-super(BaseSensorOperator, self).__init__(*args, **kwargs)
-self.poke_interval = poke_interval
-self.soft_fail = soft_fail
-self.timeout = timeout
-
-def poke(self, context):
-
-Function that the sensors defined while deriving this 
class should
-override.
-
-raise AirflowException(Override me.)
-
-def execute(self, context):
-started_at = datetime.utcnow()
-while not self.poke(context):
-if (datetime.utcnow() - started_at).total_seconds() 
 self.timeout:
-if self.soft_fail:
-raise AirflowSkipException(Snap. Time is OUT.)
-else:
-raise AirflowSensorTimeout(Snap. Time is OUT.)
-sleep(self.poke_interval)
-self.log.info(Success criteria met. 
Exiting.)
-
-
-class SqlSensor(BaseSensorOperator):
-
-Runs a sql statement until a criteria is met. It will 
keep trying while
-sql returns no row, or if the first cell in (0, 
0, ).
-
-:param conn_id: The connection to 
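Editorial note, not part of the commit above: the removed page documented the sensor base class whose execute() keeps calling poke() every poke_interval seconds until it returns True, raising AirflowSensorTimeout (or skipping, with soft_fail=True) once the timeout is hit. A minimal sketch of a custom sensor against that pre-1.10 module path, with a hypothetical check_upstream_system() helper:

```python
from airflow.operators.sensors import BaseSensorOperator
from airflow.utils.decorators import apply_defaults


class RecordPresentSensor(BaseSensorOperator):
    """Succeeds once a (hypothetical) upstream system reports a record."""

    @apply_defaults
    def __init__(self, record_id, *args, **kwargs):
        super(RecordPresentSensor, self).__init__(*args, **kwargs)
        self.record_id = record_id

    def poke(self, context):
        # Returning True ends the poke loop; otherwise BaseSensorOperator
        # sleeps poke_interval seconds and tries again until timeout.
        return check_upstream_system(self.record_id)  # hypothetical helper
```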

[02/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/sensors/web_hdfs_sensor.html
--
diff --git a/_modules/airflow/sensors/web_hdfs_sensor.html 
b/_modules/airflow/sensors/web_hdfs_sensor.html
new file mode 100644
index 000..32711ea
--- /dev/null
+++ b/_modules/airflow/sensors/web_hdfs_sensor.html
@@ -0,0 +1,279 @@
[279 lines of Sphinx-generated HTML added: theme chrome plus the highlighted source of airflow.sensors.web_hdfs_sensor, i.e. WebHdfsSensor, which waits for a file or folder to land in HDFS (template field `filepath`, default connection `webhdfs_default`, poke() delegating to WebHDFSHook.check_for_path).]

http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/bash_operator.html
--
diff --git a/_modules/bash_operator.html b/_modules/bash_operator.html
deleted file mode 100644
index 3b6f9ef..000
--- a/_modules/bash_operator.html
+++ /dev/null
@@ -1,350 +0,0 @@
[350 lines of Sphinx-generated HTML removed for the old top-level `bash_operator` module page: theme chrome and the highlighted module source. Message truncated in the archive.]

[28/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/operators/mlengine_operator.html
--
diff --git a/_modules/airflow/contrib/operators/mlengine_operator.html 
b/_modules/airflow/contrib/operators/mlengine_operator.html
index b322476..c7743c2 100644
--- a/_modules/airflow/contrib/operators/mlengine_operator.html
+++ b/_modules/airflow/contrib/operators/mlengine_operator.html
[Rendered page for airflow.contrib.operators.mlengine_operator regenerated for 1.10.0. The navigation menu replaces Configuration with How-to Guides and gains Time zones and Lineage entries. In the rendered module source, the _create_prediction_input() helper is removed, _normalize_mlengine_job_id() now prefixes job ids that start with a digit or a Jinja {{ ... }} template and replaces invalid characters only outside template blocks, and the MLEngineBatchPredictionOperator docstring gains a literal code block plus a link to the projects.jobs REST reference. Message truncated in the archive.]
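As an editorial illustration of the job-id handling summarised above (a sketch reconstructed from the rendered diff, not the literal library code): ids that start with a digit or a Jinja template get a `z_` prefix, and invalid characters are replaced with underscores everywhere except inside `{{ ... }}` blocks.

```python
import re


def normalize_job_id(job_id):
    # Prefix ids that start with a digit or a "{{" template marker.
    if re.match(r"\d|\{{2}", job_id):
        job_id = "z_{}".format(job_id)

    # Replace invalid characters outside {{ ... }} blocks, keeping the
    # template blocks themselves intact so they can be rendered later.
    cleaned, cursor = "", 0
    for match in re.finditer(r"\{{2}.+?\}{2}", job_id):
        cleaned += re.sub(r"[^0-9a-zA-Z]+", "_", job_id[cursor:match.start()])
        cleaned += job_id[match.start():match.end()]
        cursor = match.end()
    cleaned += re.sub(r"[^0-9a-zA-Z]+", "_", job_id[cursor:])
    return cleaned


print(normalize_job_id("batch predict {{ ds_nodash }}"))
# batch_predict_{{ ds_nodash }}
```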

[03/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/sensors/named_hive_partition_sensor.html
--
diff --git a/_modules/airflow/sensors/named_hive_partition_sensor.html 
b/_modules/airflow/sensors/named_hive_partition_sensor.html
new file mode 100644
index 000..b199984
--- /dev/null
+++ b/_modules/airflow/sensors/named_hive_partition_sensor.html
@@ -0,0 +1,339 @@
[339 lines of Sphinx-generated HTML added: theme chrome plus the highlighted source of airflow.sensors.named_hive_partition_sensor, i.e. NamedHivePartitionSensor, which waits for a list of fully qualified Hive partitions of the form `schema.table/pk1=pv1/pk2=pv2` via the metastore Thrift client's get_partitions_by_name. Message truncated in the archive.]
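A hedged usage sketch for the sensor documented above; the dag object, task id and partition value are illustrative assumptions, while the parameter names come from the rendered source:

```python
from airflow.sensors.named_hive_partition_sensor import NamedHivePartitionSensor

wait_for_users = NamedHivePartitionSensor(
    task_id="wait_for_users_partition",
    # Fully qualified names: schema.table/pk1=pv1/pk2=pv2
    partition_names=["default.users/ds={{ ds }}"],
    metastore_conn_id="metastore_default",
    poke_interval=60 * 3,
    dag=dag,  # assumes a DAG object defined elsewhere
)
```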

[15/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/hooks/oracle_hook.html
--
diff --git a/_modules/airflow/hooks/oracle_hook.html 
b/_modules/airflow/hooks/oracle_hook.html
new file mode 100644
index 000..3683348
--- /dev/null
+++ b/_modules/airflow/hooks/oracle_hook.html
@@ -0,0 +1,382 @@
[382 lines of Sphinx-generated HTML added: theme chrome plus the highlighted source of airflow.hooks.oracle_hook, i.e. OracleHook, whose get_conn() builds a cx_Oracle connection from the connection extras (`dsn`, `sid`, `service_name`, `module`) and whose insert_rows() treats a batch of inserts as one transaction, replacing NaN values with NULL. Message truncated in the archive.]
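For context, a sketch of how the extras mentioned above might be supplied on an Airflow Connection so that get_conn() takes the dsn/service_name branch; every value here is a placeholder:

```python
from airflow.models import Connection

oracle_conn = Connection(
    conn_id="oracle_default",
    conn_type="oracle",
    login="scott",          # placeholder credentials
    password="tiger",
    port=1521,
    extra='{"dsn": "db.example.com", "service_name": "orclpdb1"}',
)
```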

[22/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/sensors/aws_redshift_cluster_sensor.html
--
diff --git a/_modules/airflow/contrib/sensors/aws_redshift_cluster_sensor.html 
b/_modules/airflow/contrib/sensors/aws_redshift_cluster_sensor.html
new file mode 100644
index 000..16a47e3
--- /dev/null
+++ b/_modules/airflow/contrib/sensors/aws_redshift_cluster_sensor.html
@@ -0,0 +1,287 @@
[287 lines of Sphinx-generated HTML added: theme chrome plus the highlighted source of airflow.contrib.sensors.aws_redshift_cluster_sensor, i.e. AwsRedshiftClusterSensor, which polls RedshiftHook.cluster_status() until the cluster reaches the requested target_status (default `available`).]
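A brief usage sketch for the sensor added above; the cluster identifier and dag object are assumptions:

```python
from airflow.contrib.sensors.aws_redshift_cluster_sensor import AwsRedshiftClusterSensor

wait_for_cluster = AwsRedshiftClusterSensor(
    task_id="wait_for_redshift_cluster",
    cluster_identifier="my-redshift-cluster",  # placeholder identifier
    target_status="available",
    aws_conn_id="aws_default",
    dag=dag,  # assumes a DAG object defined elsewhere
)
```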


[27/51] [partial] incubator-airflow-site git commit: 1.10.0

2018-08-27 Thread kaxilnaik
http://git-wip-us.apache.org/repos/asf/incubator-airflow-site/blob/11437c14/_modules/airflow/contrib/operators/mysql_to_gcs.html
--
diff --git a/_modules/airflow/contrib/operators/mysql_to_gcs.html 
b/_modules/airflow/contrib/operators/mysql_to_gcs.html
new file mode 100644
index 000..47346f5
--- /dev/null
+++ b/_modules/airflow/contrib/operators/mysql_to_gcs.html
@@ -0,0 +1,524 @@
[524 lines of Sphinx-generated HTML added: theme chrome plus the highlighted source of airflow.contrib.operators.mysql_to_gcs, i.e. MySqlToGoogleCloudStorageOperator, which copies the result of a MySQL query to Google Cloud Storage as JSON, optionally writing a BigQuery schema file and splitting the output at approx_max_file_size_bytes. Message truncated in the archive.]
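A hedged usage sketch for the operator documented above; the query, bucket, file names and dag object are assumptions:

```python
from airflow.contrib.operators.mysql_to_gcs import MySqlToGoogleCloudStorageOperator

users_to_gcs = MySqlToGoogleCloudStorageOperator(
    task_id="users_to_gcs",
    sql="SELECT * FROM users",
    bucket="my-staging-bucket",
    filename="exports/users_{}.json",      # {} is replaced with a file number when splitting
    schema_filename="schemas/users.json",  # optional BigQuery schema dump
    mysql_conn_id="mysql_default",
    google_cloud_storage_conn_id="google_cloud_default",
    dag=dag,  # assumes a DAG object defined elsewhere
)
```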

[GitHub] kaxil removed a comment on issue #3811: [AIRFLOW-2961] Refactor tests.BackfillJobTest.test_backfill_examples test

2018-08-27 Thread GitBox
kaxil removed a comment on issue #3811: [AIRFLOW-2961] Refactor 
tests.BackfillJobTest.test_backfill_examples test
URL: 
https://github.com/apache/incubator-airflow/pull/3811#issuecomment-416267000
 
 
   @Fokko Hey, hope you had a long weekend, would you be able to upload the 
1.10.0 artifacts to PyPi?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on issue #3811: [AIRFLOW-2961] Refactor tests.BackfillJobTest.test_backfill_examples test

2018-08-27 Thread GitBox
kaxil commented on issue #3811: [AIRFLOW-2961] Refactor 
tests.BackfillJobTest.test_backfill_examples test
URL: 
https://github.com/apache/incubator-airflow/pull/3811#issuecomment-416267713
 
 
   @Fokko Let me know if you are busy, I will upload then :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil closed pull request #3811: [AIRFLOW-2961] Refactor tests.BackfillJobTest.test_backfill_examples test

2018-08-27 Thread GitBox
kaxil closed pull request #3811: [AIRFLOW-2961] Refactor 
tests.BackfillJobTest.test_backfill_examples test
URL: https://github.com/apache/incubator-airflow/pull/3811
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/tests/jobs.py b/tests/jobs.py
index fd3a96a4d8..f9c07b96c9 100644
--- a/tests/jobs.py
+++ b/tests/jobs.py
@@ -61,6 +61,7 @@
 from airflow import configuration
 configuration.load_test_config()
 
+logger = logging.getLogger(__name__)
 
 try:
 from unittest import mock
@@ -194,23 +195,21 @@ def test_backfill_multi_dates(self):
 def test_backfill_examples(self):
 """
 Test backfilling example dags
-"""
 
-# some DAGs really are just examples... but try to make them work!
-skip_dags = [
-'example_http_operator',
-'example_twitter_dag',
-'example_trigger_target_dag',
-'example_trigger_controller_dag',  # tested above
-'test_utils',  # sleeps forever
-'example_kubernetes_executor',  # requires kubernetes cluster
-'example_kubernetes_operator'  # requires kubernetes cluster
-]
+Try to backfill some of the example dags. Be carefull, not all dags 
are suitable
+for doing this. For example, a dag that sleeps forever, or does not 
have a
+schedule won't work here since you simply can't backfill them.
+"""
+include_dags = {
+'example_branch_operator',
+'example_bash_operator',
+'example_skip_dag',
+'latest_only'
+}
 
-logger = logging.getLogger('BackfillJobTest.test_backfill_examples')
 dags = [
 dag for dag in self.dagbag.dags.values()
-if 'example_dags' in dag.full_filepath and dag.dag_id not in 
skip_dags
+if 'example_dags' in dag.full_filepath and dag.dag_id in 
include_dags
 ]
 
 for dag in dags:
@@ -218,6 +217,11 @@ def test_backfill_examples(self):
 start_date=DEFAULT_DATE,
 end_date=DEFAULT_DATE)
 
+# Make sure that we have the dags that we want to test available
+# in the example_dags folder, if this assertion fails, one of the
+# dags in the include_dags array isn't available anymore
+self.assertEqual(len(include_dags), len(dags))
+
 for i, dag in enumerate(sorted(dags, key=lambda d: d.dag_id)):
 logger.info('*** Running example DAG #{}: {}'.format(i, 
dag.dag_id))
 job = BackfillJob(


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on issue #3811: [AIRFLOW-2961] Refactor tests.BackfillJobTest.test_backfill_examples test

2018-08-27 Thread GitBox
kaxil commented on issue #3811: [AIRFLOW-2961] Refactor 
tests.BackfillJobTest.test_backfill_examples test
URL: 
https://github.com/apache/incubator-airflow/pull/3811#issuecomment-416267000
 
 
   @Fokko Hey, hope you had a long weekend, would you be able to upload the 
1.10.0 artifacts to PyPi?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-2965) Add CLI command to find the next dag run.

2018-08-27 Thread jack (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jack updated AIRFLOW-2965:
--
Description: 
I have a dag with the following properties:
{code:java}
dag = DAG(
    dag_id='mydag',
    default_args=args,
    schedule_interval='0 1 * * *',
    max_active_runs=1,
    catchup=False){code}
 

 

This runs great.

Last run is: 2018-08-26 01:00  (start date is 2018-08-27 01:00)

 

Now it's 2018-08-27 17:55 I decided to change my dag to:

 
{code:java}
dag = DAG(
    dag_id='mydag',
    default_args=args,
    schedule_interval='0 23 * * *',
    max_active_runs=1,
    catchup=False){code}
 

Now, I have no idea when will be the next dag run.

Will it be today at 23:00? I can't be sure when the cycle is complete. I'm not 
even sure that this change will do what I wish.

I'm sure you guys are expert and you can answer this question but most of us 
wouldn't know.

 

The scheduler has the knowledge when the dag is available for running. All I'm 
asking is to take that knowledge and create a CLI command that I will give the 
dag_id and it will tell me the next date/hour which my dag will be runnable.

  was:
I have a dag with the following properties:
{code:java}
dag = DAG(
    dag_id='mydag',
    default_args=args,
    schedule_interval='0 1 * * *',
    max_active_runs=1,
    catchup=False){code}
 

 

This runs great.

Last run is: 2018-08-26 01:00  (start date is 2018-08-27 01:00)

 

Now it's 2018-08-27 17:55 I decided to change my dag to:

 
{code:java}
dag = DAG(
    dag_id='mydag',
    default_args=args,
    schedule_interval='0 23 * * *',
    max_active_runs=1,
    catchup=False){code}
 

Now, I have no idea if the dag when will be the next run.

Will it be today at 23:00? I can't be sure when the cycle is complete.

I'm sure you guys are expert and you can answer this question but most of us 
wouldn't know.

 

The scheduler has the knowledge when the dag is available for running. All I'm 
asking is to take that knowledge and create a CLI command that I will give the 
dag_id and it will tell me the next date/hour which my dag will be runnable.


> Add CLI command to find the next dag run.
> -
>
> Key: AIRFLOW-2965
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2965
> Project: Apache Airflow
>  Issue Type: Task
>Reporter: jack
>Priority: Minor
> Fix For: 1.10.1
>
>
> I have a dag with the following properties:
> {code:java}
> dag = DAG(
>     dag_id='mydag',
>     default_args=args,
>     schedule_interval='0 1 * * *',
>     max_active_runs=1,
>     catchup=False){code}
>  
>  
> This runs great.
> Last run is: 2018-08-26 01:00  (start date is 2018-08-27 01:00)
>  
> Now it's 2018-08-27 17:55 I decided to change my dag to:
>  
> {code:java}
> dag = DAG(
>     dag_id='mydag',
>     default_args=args,
>     schedule_interval='0 23 * * *',
>     max_active_runs=1,
>     catchup=False){code}
>  
> Now, I have no idea when will be the next dag run.
> Will it be today at 23:00? I can't be sure when the cycle is complete. I'm 
> not even sure that this change will do what I wish.
> I'm sure you guys are expert and you can answer this question but most of us 
> wouldn't know.
>  
> The scheduler has the knowledge when the dag is available for running. All 
> I'm asking is to take that knowledge and create a CLI command that I will 
> give the dag_id and it will tell me the next date/hour which my dag will be 
> runnable.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
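Editorial note on the request above: the pieces for such a command already exist on the DAG object in 1.x, so a rough sketch could look like the following (the helper name and wiring are suggestions only, not an existing CLI command):

```python
from airflow.models import DagBag


def next_runnable_date(dag_id):
    dag = DagBag().get_dag(dag_id)
    last = dag.latest_execution_date          # None if the dag has never run
    if last is None:
        return dag.start_date
    # following_schedule() applies schedule_interval to an execution date;
    # the run for that date only becomes runnable one interval later.
    return dag.following_schedule(last)


print(next_runnable_date("mydag"))
```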


[jira] [Created] (AIRFLOW-2965) Add CLI command to find the next dag run.

2018-08-27 Thread jack (JIRA)
jack created AIRFLOW-2965:
-

 Summary: Add CLI command to find the next dag run.
 Key: AIRFLOW-2965
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2965
 Project: Apache Airflow
  Issue Type: Task
Reporter: jack
 Fix For: 1.10.1


I have a dag with the following properties:
{code:java}
dag = DAG(
    dag_id='mydag',
    default_args=args,
    schedule_interval='0 1 * * *',
    max_active_runs=1,
    catchup=False){code}
 

 

This runs great.

Last run is: 2018-08-26 01:00  (start date is 2018-08-27 01:00)

 

Now it's 2018-08-27 17:55 I decided to change my dag to:

 
{code:java}
dag = DAG(
    dag_id='mydag',
    default_args=args,
    schedule_interval='0 23 * * *',
    max_active_runs=1,
    catchup=False){code}
 

Now, I have no idea if the dag when will be the next run.

Will it be today at 23:00? I can't be sure when the cycle is complete.

I'm sure you guys are expert and you can answer this question but most of us 
wouldn't know.

 

The scheduler has the knowledge when the dag is available for running. All I'm 
asking is to take that knowledge and create a CLI command that I will give the 
dag_id and it will tell me the next date/hour which my dag will be runnable.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Fokko commented on a change in pull request #3811: [AIRFLOW-2961] Refactor tests.BackfillJobTest.test_backfill_examples test

2018-08-27 Thread GitBox
Fokko commented on a change in pull request #3811: [AIRFLOW-2961] Refactor 
tests.BackfillJobTest.test_backfill_examples test
URL: https://github.com/apache/incubator-airflow/pull/3811#discussion_r213001968
 
 

 ##
 File path: tests/jobs.py
 ##
 @@ -194,30 +195,32 @@ def test_backfill_multi_dates(self):
 def test_backfill_examples(self):
 """
 Test backfilling example dags
-"""
 
-# some DAGs really are just examples... but try to make them work!
-skip_dags = [
-'example_http_operator',
-'example_twitter_dag',
-'example_trigger_target_dag',
-'example_trigger_controller_dag',  # tested above
-'test_utils',  # sleeps forever
-'example_kubernetes_executor',  # requires kubernetes cluster
-'example_kubernetes_operator'  # requires kubernetes cluster
-]
+Try to backfill some of the example dags. Be carefull, not all dags 
are suitable
+for doing this. For example, a dag that sleeps forever, or does not 
have a
+schedule won't work here since you simply can't backfill them.
+"""
+include_dags = {
+'example_branch_operator',
+'example_bash_operator',
+'example_skip_dag',
+'latest_only'
+}
 
-logger = logging.getLogger('BackfillJobTest.test_backfill_examples')
 dags = [
 dag for dag in self.dagbag.dags.values()
-if 'example_dags' in dag.full_filepath and dag.dag_id not in 
skip_dags
+if 'example_dags' in dag.full_filepath and dag.dag_id in 
include_dags
 ]
 
 for dag in dags:
 dag.clear(
 start_date=DEFAULT_DATE,
 end_date=DEFAULT_DATE)
 
+# Make sure that we have all dags the dags that we want to test 
available
 
 Review comment:
   Thanks, fixed  


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on a change in pull request #3811: [AIRFLOW-2961] Refactor tests.BackfillJobTest.test_backfill_examples test

2018-08-27 Thread GitBox
kaxil commented on a change in pull request #3811: [AIRFLOW-2961] Refactor 
tests.BackfillJobTest.test_backfill_examples test
URL: https://github.com/apache/incubator-airflow/pull/3811#discussion_r212984027
 
 

 ##
 File path: tests/jobs.py
 ##
 @@ -194,30 +195,32 @@ def test_backfill_multi_dates(self):
 def test_backfill_examples(self):
 """
 Test backfilling example dags
-"""
 
-# some DAGs really are just examples... but try to make them work!
-skip_dags = [
-'example_http_operator',
-'example_twitter_dag',
-'example_trigger_target_dag',
-'example_trigger_controller_dag',  # tested above
-'test_utils',  # sleeps forever
-'example_kubernetes_executor',  # requires kubernetes cluster
-'example_kubernetes_operator'  # requires kubernetes cluster
-]
+Try to backfill some of the example dags. Be carefull, not all dags 
are suitable
+for doing this. For example, a dag that sleeps forever, or does not 
have a
+schedule won't work here since you simply can't backfill them.
+"""
+include_dags = {
+'example_branch_operator',
+'example_bash_operator',
+'example_skip_dag',
+'latest_only'
+}
 
-logger = logging.getLogger('BackfillJobTest.test_backfill_examples')
 dags = [
 dag for dag in self.dagbag.dags.values()
-if 'example_dags' in dag.full_filepath and dag.dag_id not in 
skip_dags
+if 'example_dags' in dag.full_filepath and dag.dag_id in 
include_dags
 ]
 
 for dag in dags:
 dag.clear(
 start_date=DEFAULT_DATE,
 end_date=DEFAULT_DATE)
 
+# Make sure that we have all dags the dags that we want to test 
available
 
 Review comment:
   Nit: `all dags the dags that we want to` to `all the dags that we want to`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] verdan commented on issue #3656: [AIRFLOW-2803] Fix all ESLint issues

2018-08-27 Thread GitBox
verdan commented on issue #3656: [AIRFLOW-2803] Fix all ESLint issues
URL: 
https://github.com/apache/incubator-airflow/pull/3656#issuecomment-416202515
 
 
    looks great!! I believe this one is ready to be finalized. nice work 
@tedmiston 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #3790: [AIRFLOW-2994] Fix command status check in Qubole Check operator

2018-08-27 Thread GitBox
codecov-io edited a comment on issue #3790: [AIRFLOW-2994] Fix command status 
check in Qubole Check operator
URL: 
https://github.com/apache/incubator-airflow/pull/3790#issuecomment-416196863
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3790?src=pr=h1)
 Report
   > Merging 
[#3790](https://codecov.io/gh/apache/incubator-airflow/pull/3790?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/6afd38d79b2439ca5e3bb349c5d2007ad09aa27f?src=pr=desc)
 will **decrease** coverage by `<.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3790/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3790?src=pr=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##           master   #3790      +/-   ##
   =========================================
   - Coverage   77.41%   77.4%   -0.01%
   =========================================
     Files         203     203
     Lines       15810   15810
   =========================================
   - Hits        12239   12238       -1
   - Misses       3571    3572       +1
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3790?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3790/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `88.74% <0%> (-0.05%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3790?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3790?src=pr=footer).
 Last update 
[6afd38d...2750e5d](https://codecov.io/gh/apache/incubator-airflow/pull/3790?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io commented on issue #3790: [AIRFLOW-2994] Fix command status check in Qubole Check operator

2018-08-27 Thread GitBox
codecov-io commented on issue #3790: [AIRFLOW-2994] Fix command status check in 
Qubole Check operator
URL: 
https://github.com/apache/incubator-airflow/pull/3790#issuecomment-416196863
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3790?src=pr=h1)
 Report
   > Merging 
[#3790](https://codecov.io/gh/apache/incubator-airflow/pull/3790?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/6afd38d79b2439ca5e3bb349c5d2007ad09aa27f?src=pr=desc)
 will **decrease** coverage by `<.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3790/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3790?src=pr=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##           master   #3790      +/-   ##
   =========================================
   - Coverage   77.41%   77.4%   -0.01%
   =========================================
     Files         203     203
     Lines       15810   15810
   =========================================
   - Hits        12239   12238       -1
   - Misses       3571    3572       +1
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3790?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3790/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `88.74% <0%> (-0.05%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3790?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3790?src=pr=footer).
 Last update 
[6afd38d...2750e5d](https://codecov.io/gh/apache/incubator-airflow/pull/3790?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-2964) Lazy generation of the job description

2018-08-27 Thread Victor Jimenez (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victor Jimenez updated AIRFLOW-2964:

Description: 
When instantiating a {{DatabricksSubmitRunOperator}} users need to pass the 
description of the job that will later be executed on Databricks.

The job description is only needed at execution time (when the hook is called). 
However, the {{json}} parameter must already have the full job description when 
constructing the operator. This may present a problem if computing the job 
description needs to execute expensive operations (e.g., querying a database). 
The expensive operation will be invoked every single time the DAG is 
reprocessed (which may happen quite frequently).

It would be good to have an equivalent mechanism to the {{python_callable}} 
parameter in the {{PythonOperator}}. In this way, users could pass a function 
that would generate the job description only when the operator is actually 
executed.

[~andrewmchen] Is there any other way to do this? If not, does this sound 
reasonable? I can create a PR implementing this proposal.

  was:
When instantiating a {{DatabricksSubmitRunOperator}} users need to pass the 
description of the job that will later be executed on Databricks.

The job description is only needed at execution time (when the hook is called). 
However, the {{json}} parameter must already have the full job description when 
constructing the operator. This may present a problem if computing the job 
description needs to execute expensive operations (e.g., querying a database). 
The expensive operation will be invoked every single time the DAG is 
reprocessed (which may happen quite frequently).

It would be good to have an equivalent mechanism to the {{python_callable}} 
parameter in the {{PythonOperator}}. In this way, users could pass a function 
that would generate the job description only when the operator is actually 
executed.


> Lazy generation of the job description
> --
>
> Key: AIRFLOW-2964
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2964
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, operators
>Affects Versions: Airflow 1.9.0, 1.10
>Reporter: Victor Jimenez
>Priority: Major
>
> When instantiating a {{DatabricksSubmitRunOperator}} users need to pass the 
> description of the job that will later be executed on Databricks.
> The job description is only needed at execution time (when the hook is 
> called). However, the {{json}} parameter must already have the full job 
> description when constructing the operator. This may present a problem if 
> computing the job description needs to execute expensive operations (e.g., 
> querying a database). The expensive operation will be invoked every single 
> time the DAG is reprocessed (which may happen quite frequently).
> It would be good to have an equivalent mechanism to the {{python_callable}} 
> parameter in the {{PythonOperator}}. In this way, users could pass a function 
> that would generate the job description only when the operator is actually 
> executed.
> [~andrewmchen] Is there any other way to do this? If not, does this sound 
> reasonable? I can create a PR implementing this proposal.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2964) Lazy generation of the job description

2018-08-27 Thread Victor Jimenez (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victor Jimenez updated AIRFLOW-2964:

Description: 
When instantiating a {{DatabricksSubmitRunOperator }}users need to pass the 
description of the job that will later be executed on Databricks.

The job description is only needed at execution time (when the hook is called). 
However, the {{json}} parameter must already have the full job description when 
constructing the operator. This may present a problem if computing the job 
description needs to execute expensive operations (e.g., querying a database). 
The expensive operation will be invoked every single time the DAG is 
reprocessed (which may happen quite frequently).

It would be good to have an equivalent mechanism to the {{python_callable}} 
parameter in the {{PythonOperator}}. In this way, users could pass a function 
that would generate the job description only when the operator is actually 
executed.

  was:
When instantiating a DatabricksSubmitRunOperator users need to pass the 
description of the job that will later be executed on Databricks.

The job description is only needed at execution time (when the hook is called). 
However, the `json` parameter must already have the full job description when 
constructing the operator. This may present a problem if computing the job 
description needs to execute expensive operations (e.g., querying a database). 
The expensive operation will be invoked every single time the DAG is 
reprocessed (which may happen quite frequently).

It would be good to have an equivalent mechanism to the `python_callable` 
parameter in the `PythonOperator`. In this way, users could pass a function 
that would generate the job description only when the operator is actually 
executed.


> Lazy generation of the job description
> --
>
> Key: AIRFLOW-2964
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2964
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, operators
>Affects Versions: Airflow 1.9.0, 1.10
>Reporter: Victor Jimenez
>Priority: Major
>
> When instantiating a {{DatabricksSubmitRunOperator }}users need to pass the 
> description of the job that will later be executed on Databricks.
> The job description is only needed at execution time (when the hook is 
> called). However, the {{json}} parameter must already have the full job 
> description when constructing the operator. This may present a problem if 
> computing the job description needs to execute expensive operations (e.g., 
> querying a database). The expensive operation will be invoked every single 
> time the DAG is reprocessed (which may happen quite frequently).
> It would be good to have an equivalent mechanism to the {{python_callable}} 
> parameter in the {{PythonOperator}}. In this way, users could pass a function 
> that would generate the job description only when the operator is actually 
> executed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2964) Lazy generation of the job description

2018-08-27 Thread Victor Jimenez (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victor Jimenez updated AIRFLOW-2964:

Description: 
When instantiating a DatabricksSubmitRunOperator users need to pass the 
description of the job that will later be executed on Databricks.

The job description is only needed at execution time (when the hook is called). 
However, the `json` parameter must already have the full job description when 
constructing the operator. This may present a problem if computing the job 
description needs to execute expensive operations (e.g., querying a database). 
The expensive operation will be invoked every single time the DAG is 
reprocessed (which may happen quite frequently).

It would be good to have an equivalent mechanism to the `python_callable` 
parameter in the `PythonOperator`. In this way, users could pass a function 
that would generate the job description only when the operator is actually 
executed.

  was:
When instantiating a `DatabricksSubmitRunOperator` users need to pass the 
description of the job that will later be executed on Databricks.

The job description is only needed at execution time (when the hook is called). 
However, the `json` parameter must already have the full job description when 
constructing the operator. This may present a problem if computing the job 
description needs to execute expensive operations (e.g., querying a database). 
The expensive operation will be invoked every single time the DAG is 
reprocessed (which may happen quite frequently).

It would be good to have an equivalent mechanism to the `python_callable` 
parameter in the `PythonOperator`. In this way, users could pass a function 
that would generate the job description only when the operator is actually 
executed.


> Lazy generation of the job description
> --
>
> Key: AIRFLOW-2964
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2964
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, operators
>Affects Versions: Airflow 1.9.0, 1.10
>Reporter: Victor Jimenez
>Priority: Major
>
> When instantiating a DatabricksSubmitRunOperator users need to pass the 
> description of the job that will later be executed on Databricks.
> The job description is only needed at execution time (when the hook is 
> called). However, the `json` parameter must already have the full job 
> description when constructing the operator. This may present a problem if 
> computing the job description needs to execute expensive operations (e.g., 
> querying a database). The expensive operation will be invoked every single 
> time the DAG is reprocessed (which may happen quite frequently).
> It would be good to have an equivalent mechanism to the `python_callable` 
> parameter in the `PythonOperator`. In this way, users could pass a function 
> that would generate the job description only when the operator is actually 
> executed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-2964) Lazy generation of the job description

2018-08-27 Thread Victor Jimenez (JIRA)
Victor Jimenez created AIRFLOW-2964:
---

 Summary: Lazy generation of the job description
 Key: AIRFLOW-2964
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2964
 Project: Apache Airflow
  Issue Type: Improvement
  Components: contrib, operators
Affects Versions: Airflow 1.9.0, 1.10
Reporter: Victor Jimenez


When instantiating a `DatabricksSubmitRunOperator` users need to pass the 
description of the job that will later be executed on Databricks.

The job description is only needed at execution time (when the hook is called). 
However, the `json` parameter must already have the full job description when 
constructing the operator. This may present a problem if computing the job 
description needs to execute expensive operations (e.g., querying a database). 
The expensive operation will be invoked every single time the DAG is 
reprocessed (which may happen quite frequently).

It would be good to have an equivalent mechanism to the `python_callable` 
parameter in the `PythonOperator`. In this way, users could pass a function 
that would generate the job description only when the operator is actually 
executed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
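Editorial sketch of the kind of lazy evaluation proposed above; the subclass and its json_callable parameter are invented for illustration and are not part of Airflow:

```python
from airflow.contrib.operators.databricks_operator import DatabricksSubmitRunOperator


class LazyDatabricksSubmitRunOperator(DatabricksSubmitRunOperator):
    """Builds the Databricks job description only at execute() time."""

    def __init__(self, json_callable, **kwargs):
        # Hand the parent a cheap placeholder; the real spec comes later.
        super(LazyDatabricksSubmitRunOperator, self).__init__(json={}, **kwargs)
        self.json_callable = json_callable

    def execute(self, context):
        # The expensive work (e.g. a database query) now happens once per task
        # run instead of on every DAG parse. Note this bypasses whatever
        # normalisation the base constructor applies to `json`.
        self.json = self.json_callable()
        return super(LazyDatabricksSubmitRunOperator, self).execute(context)
```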


[jira] [Updated] (AIRFLOW-2963) Error parsing AIRFLOW_CONN_ URI

2018-08-27 Thread Leonardo de Campos Almeida (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonardo de Campos Almeida updated AIRFLOW-2963:

Fix Version/s: (was: 1.9.0)

> Error parsing AIRFLOW_CONN_ URI
> ---
>
> Key: AIRFLOW-2963
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2963
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: boto3, configuration
>Affects Versions: 1.9.0
>Reporter: Leonardo de Campos Almeida
>Priority: Minor
>  Labels: easyfix
>
> I'm using the environment variable AIRFLOW_CONN_ to define my connection to 
> AWS, but my AWS secret access key has a slash on it.
>  e.g.:
> {code:java}
> s3://login:pass/word@bucket
> {code}
>  The problem is that the method *BaseHook._get_connection_from_env* doesn't 
> accept this URI as a valid URI. When it finds the / it is assuming that the 
> path starts there, so it is returning:
>  * host: login
>  * port: pass
>  * path: word
> And ignoring the rest, so I get an error, because pass is not a valid port 
> number.
> So, I tried to pass the URI quoted
> {code:java}
> s3://login:pass%2Fword@bucket
> {code}
> But then, the values are not being unquoted correctly, and the AwsHook is 
> trying to use pass%2Fword as the secret access key.
>  I took a look at the method that parses the URI, and it is only unquoting 
> the host, manually.
> {code:java}
> def parse_from_uri(self, uri):
> temp_uri = urlparse(uri)
> hostname = temp_uri.hostname or ''
> if '%2f' in hostname:
> hostname = hostname.replace('%2f', '/').replace('%2F', '/')
> conn_type = temp_uri.scheme
> if conn_type == 'postgresql':
> conn_type = 'postgres'
> self.conn_type = conn_type
> self.host = hostname
> self.schema = temp_uri.path[1:]
> self.login = temp_uri.username
> self.password = temp_uri.password
> self.port = temp_uri.port
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-2963) Error parsing AIRFLOW_CONN_ URI

2018-08-27 Thread Leonardo de Campos Almeida (JIRA)
Leonardo de Campos Almeida created AIRFLOW-2963:
---

 Summary: Error parsing AIRFLOW_CONN_ URI
 Key: AIRFLOW-2963
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2963
 Project: Apache Airflow
  Issue Type: Bug
  Components: boto3, configuration
Affects Versions: 1.9.0
Reporter: Leonardo de Campos Almeida
 Fix For: 1.9.0


I'm using the environment variable AIRFLOW_CONN_ to define my connection to 
AWS, but my AWS secret access key has a slash on it.

 e.g.:
{code:java}
s3://login:pass/word@bucket
{code}
 The problem is that the method *BaseHook._get_connection_from_env* doesn't 
accept this URI as a valid URI. When it finds the / it is assuming that the 
path starts there, so it is returning:
 * host: login
 * port: pass
 * path: word

And ignoring the rest, so I get an error, because pass is not a valid port 
number.

So, I tried to pass the URI quoted
{code:java}
s3://login:pass%2Fword@bucket
{code}
But then, the values are not being unquoted correctly, and the AwsHook is 
trying to use pass%2Fword as the secret access key.

 I took a look at the method that parses the URI, and it is only unquoting the 
host, manually.
{code:java}
def parse_from_uri(self, uri):
temp_uri = urlparse(uri)
hostname = temp_uri.hostname or ''
if '%2f' in hostname:
hostname = hostname.replace('%2f', '/').replace('%2F', '/')
conn_type = temp_uri.scheme
if conn_type == 'postgresql':
conn_type = 'postgres'
self.conn_type = conn_type
self.host = hostname
self.schema = temp_uri.path[1:]
self.login = temp_uri.username
self.password = temp_uri.password
self.port = temp_uri.port
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
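Editorial sketch of the two halves of the problem above, in Python 3 syntax: percent-encoding the secret when building the AIRFLOW_CONN_ URI (what the reporter tried), and the unquoting the parsing side would also need to apply (a proposed fix, not current behaviour):

```python
from urllib.parse import quote, unquote, urlparse

secret = "pass/word"
uri = "s3://login:{}@bucket".format(quote(secret, safe=""))
print(uri)                       # s3://login:pass%2Fword@bucket

parsed = urlparse(uri)
print(parsed.hostname)           # bucket
print(parsed.password)           # pass%2Fword  (urlparse does not unquote)
print(unquote(parsed.password))  # pass/word
```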


[GitHub] XD-DENG commented on issue #3809: [AIRFLOW-2959] Make HTTPSensor doc clearer

2018-08-27 Thread GitBox
XD-DENG commented on issue #3809: [AIRFLOW-2959] Make HTTPSensor doc clearer
URL: 
https://github.com/apache/incubator-airflow/pull/3809#issuecomment-416164634
 
 
   Thanks @msumit 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

