[GitHub] [airflow] mik-laj edited a comment on pull request #10256: fixes http hook using schema field from airflow.models.connection.Con…

2020-08-09 Thread GitBox


mik-laj edited a comment on pull request #10256:
URL: https://github.com/apache/airflow/pull/10256#issuecomment-671086076


   @kubatyszko  Can you share the value you have in the secrets backend? Do you realize that in the secrets backend you should store the connection representation as a URI, not a URL?
   
   Have you tried to create a connection using the Web UI and then view the URI 
representations of the Connection using the CLI? 
   
https://airflow.readthedocs.io/en/latest/howto/connection/index.html#connection-uri-format
   
   https://issues.apache.org/jira/browse/AIRFLOW-2910 Is it related?
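   
   For reference, the URI form referred to above can be produced from a Connection object — a minimal sketch, assuming a recent Airflow where `Connection.get_uri()` is available; all values are illustrative only:
   
   ```python
   # Sketch only: the secrets backend stores the *URI* representation of a
   # connection, keyed by the connection id; the values below are made up.
   from airflow.models.connection import Connection

   conn = Connection(
       conn_type="http",
       host="example.com",
       schema="https",   # the schema field the HTTP hook reads
       login="user",
       password="pass",
       port=443,
   )

   # Prints something like 'http://user:pass@example.com:443/https' -- this
   # string (not a plain URL) is what belongs in the secrets backend entry.
   print(conn.get_uri())
   ```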



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on pull request #10256: fixes http hook using schema field from airflow.models.connection.Con…

2020-08-09 Thread GitBox


mik-laj commented on pull request #10256:
URL: https://github.com/apache/airflow/pull/10256#issuecomment-671086076


   @kubatyszko  Can you share the value you have in the secrets backend? 
   
   Have you tried to create a connection using the Web UI and then view the URI 
representations of the Connection using the CLI?
   
https://airflow.readthedocs.io/en/latest/howto/connection/index.html#connection-uri-format
   
   https://issues.apache.org/jira/browse/AIRFLOW-2910 Is it related?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj edited a comment on pull request #9907: Added feature to Import connections from a file.

2020-08-09 Thread GitBox


mik-laj edited a comment on pull request #9907:
URL: https://github.com/apache/airflow/pull/9907#issuecomment-671062255


   We already have the `airflow connections import` command. Your command will 
be a great complement to it.
   
https://airflow.readthedocs.io/en/latest/howto/connection/index.html#exporting-connections-from-the-cli



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on issue #8715: Documentation about loading packages in Python / Airflow

2020-08-09 Thread GitBox


mik-laj commented on issue #8715:
URL: https://github.com/apache/airflow/issues/8715#issuecomment-671087782


   It's related to:
   https://github.com/apache/airflow/issues/9498
   https://github.com/apache/airflow/issues/9507
   So I added this to the Airflow 2.0 milestone



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj edited a comment on issue #10007: Sagemaker contrib module - should be independant

2020-08-09 Thread GitBox


mik-laj edited a comment on issue #10007:
URL: https://github.com/apache/airflow/issues/10007#issuecomment-671088094


   @shlomiken Is there any progress here? This ticket got a lot of thumbs up, so I'd like to know whether it can be closed or still needs to be worked on.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on issue #10007: Sagemaker contrib module - should be independant

2020-08-09 Thread GitBox


mik-laj commented on issue #10007:
URL: https://github.com/apache/airflow/issues/10007#issuecomment-671088094


   @shlomiken Is there any progress here?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on issue #8495: Allow configuration of args vs command to support container entrypoint in kubernetes executor

2020-08-09 Thread GitBox


mik-laj commented on issue #8495:
URL: https://github.com/apache/airflow/issues/8495#issuecomment-671088585


   @dmayle @matthieu-foucault @humbledude Would you like to contribute this 
change to the project? This looks like a much needed change so I'd love to help 
with the review.
   
   CC: @dimberman 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch v1-10-test updated: add documentation for k8s fixes

2020-08-09 Thread dimberman
This is an automated email from the ASF dual-hosted git repository.

dimberman pushed a commit to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/v1-10-test by this push:
 new 90fe915  add documentation for k8s fixes
90fe915 is described below

commit 90fe91551bbd8d2db527addeee699aa88331691f
Author: Daniel Imberman 
AuthorDate: Sun Aug 9 11:58:37 2020 -0700

add documentation for k8s fixes
---
 airflow/contrib/kubernetes/pod.py  | 3 +++
 airflow/kubernetes/pod_launcher.py | 4 ++++
 2 files changed, 7 insertions(+)

diff --git a/airflow/contrib/kubernetes/pod.py b/airflow/contrib/kubernetes/pod.py
index 5ed563e..944cd8c 100644
--- a/airflow/contrib/kubernetes/pod.py
+++ b/airflow/contrib/kubernetes/pod.py
@@ -198,6 +198,9 @@ class Pod(object):
 
 
 def _extract_env_vars_and_secrets(env_vars):
+"""
+Extracts environment variables and Secret objects from V1Pod Environment
+"""
 result = {}
 env_vars = env_vars or []  # type: List[Union[k8s.V1EnvVar, dict]]
 secrets = []
diff --git a/airflow/kubernetes/pod_launcher.py b/airflow/kubernetes/pod_launcher.py
index 620df31..875a24c 100644
--- a/airflow/kubernetes/pod_launcher.py
+++ b/airflow/kubernetes/pod_launcher.py
@@ -294,6 +294,10 @@ class PodLauncher(LoggingMixin):
 
 
 def _convert_to_airflow_pod(pod):
+"""
+Converts a k8s V1Pod object into an `airflow.kubernetes.pod.Pod` object.
+This function is purely for backwards compatibility
+"""
 base_container = pod.spec.containers[0]  # type: k8s.V1Container
 env_vars, secrets = _extract_env_vars_and_secrets(base_container.env)
 volumes, vol_secrets = _extract_volumes_and_secrets(pod.spec.volumes, base_container.volume_mounts)
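

As a rough illustration of what the newly documented `_extract_env_vars_and_secrets` does — a sketch using the kubernetes client, not the Airflow code itself; names and values are made up:

```python
# Sketch: split a container's env list into plain name -> value pairs and
# entries backed by Kubernetes Secrets.
from kubernetes.client import models as k8s

env = [
    k8s.V1EnvVar(name="LOG_LEVEL", value="INFO"),
    k8s.V1EnvVar(
        name="DB_PASSWORD",
        value_from=k8s.V1EnvVarSource(
            secret_key_ref=k8s.V1SecretKeySelector(name="db-creds", key="password")
        ),
    ),
]

plain_vars, secret_backed = {}, []
for var in env:
    if var.value_from and var.value_from.secret_key_ref:
        secret_backed.append(var)         # handled as a Secret reference
    else:
        plain_vars[var.name] = var.value  # kept as an ordinary environment variable

print(plain_vars)     # {'LOG_LEVEL': 'INFO'}
print(secret_backed)  # [V1EnvVar(name='DB_PASSWORD', ...)]
```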



[GitHub] [airflow] mik-laj commented on issue #8715: Documentation about loading packages in Python / Airflow

2020-08-09 Thread GitBox


mik-laj commented on issue #8715:
URL: https://github.com/apache/airflow/issues/8715#issuecomment-671088716


   @rootcss Could you please leave a comment so I can assign you to this ticket?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on issue #9181: Make KubernetesExecutor PersistentVolumeClaim mounts more flexible.

2020-08-09 Thread GitBox


mik-laj commented on issue #9181:
URL: https://github.com/apache/airflow/issues/9181#issuecomment-671089535


   Can the expected result be achieved with this configuration option?
   
https://airflow.readthedocs.io/en/latest/configurations-ref.html#pod-template-file
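   
   For context, a `pod_template_file` is just a serialized pod spec. A hypothetical template mounting an existing PersistentVolumeClaim could be built like this (a sketch using the kubernetes Python client; all names, images, and paths are placeholders):
   
   ```python
   # Sketch: the kind of worker pod a pod_template_file would describe,
   # mounting an existing PersistentVolumeClaim.
   import yaml
   from kubernetes.client import ApiClient
   from kubernetes.client import models as k8s

   pod = k8s.V1Pod(
       metadata=k8s.V1ObjectMeta(name="airflow-worker-template"),
       spec=k8s.V1PodSpec(
           containers=[
               k8s.V1Container(
                   name="base",
                   image="apache/airflow:1.10.11",
                   volume_mounts=[
                       k8s.V1VolumeMount(name="dags", mount_path="/opt/airflow/dags", read_only=True),
                   ],
               )
           ],
           volumes=[
               k8s.V1Volume(
                   name="dags",
                   persistent_volume_claim=k8s.V1PersistentVolumeClaimVolumeSource(claim_name="airflow-dags"),
               )
           ],
       ),
   )

   # Serialize to YAML and point the [kubernetes] pod_template_file option at the file.
   print(yaml.safe_dump(ApiClient().sanitize_for_serialization(pod)))
   ```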



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow-site] potiuk commented on pull request #279: Move announcements page from confluence to website

2020-08-09 Thread GitBox


potiuk commented on pull request #279:
URL: https://github.com/apache/airflow-site/pull/279#issuecomment-671090633


   Fantastic ! Thanks!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow-site] branch master updated: Move announcements page from confluence to website (#279)

2020-08-09 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow-site.git


The following commit(s) were added to refs/heads/master by this push:
 new 6d47dfe  Move announcements page from confluence to website (#279)
6d47dfe is described below

commit 6d47dfea8fdd11cca60154b90d10ed2f4073ebce
Author: Santhosh Kumar 
AuthorDate: Mon Aug 10 00:48:30 2020 +0530

Move announcements page from confluence to website (#279)
---
 .../site/content/en/announcements/_index.md| 724 +
 .../site/static/images/airflow_dark_bg.png | Bin 0 -> 3719 bytes
 public/categories/index.xml|  13 +
 public/index.xml   |  13 +
 public/sitemap.xml |  17 +
 public/tags/index.xml  |  13 +
 6 files changed, 780 insertions(+)

diff --git a/landing-pages/site/content/en/announcements/_index.md b/landing-pages/site/content/en/announcements/_index.md
new file mode 100644
index 000..e233b2b
--- /dev/null
+++ b/landing-pages/site/content/en/announcements/_index.md
@@ -0,0 +1,724 @@
+---
+title: "Announcements"
+date: 2020-08-08T05:30:20+05:30
+draft: false
+menu:
+  main:
+weight: 25
+
+---
+
+
+
+
+**Note:** Follow [@ApacheAirflow](https://twitter.com/ApacheAirflow) on 
Twitter for the latest news and announcements!
+
+
+# July 20, 2020
+
+Airflow PMC welcomes **Leah Cole** 
([@leahecole](https://github.com/leahecole)) and **Ry Walker** 
([@ryw](https://github.com/ryw)) as new Airflow Committers.
+
+
+# July 10, 2020
+
+We've just released Airflow v1.10.11
+
+PyPI - https://pypi.org/project/apache-airflow/1.10.11/
+
+Docs - https://airflow.apache.org/docs/1.10.11/
+
+ChangeLog - https://airflow.apache.org/docs/1.10.11/changelog.html
+
+306 commits since 1.10.10 (12 New Features, 90 Improvements, 53 Bug Fixes, and 
several doc changes)
+
+
+# July 8, 2020
+
+Airflow PMC welcomes **Daniel Imberman** 
([@dimberman](https://github.com/dimberman)), **Tomek Turbaszek** 
([@turbaszek](https://github.com/turbaszek)), and **Kamil Breguła** 
([@mik-laj](https://github.com/mik-laj)) as new PMC members, and **QP Hou** 
([@houqp](https://github.com/houqp)) as a committer. Congrats!
+
+
+# July 6, 2020
+
+The (virtual) Airflow Summit has begun – you can watch along at 
[airflowsummit.org](https://airflowsummit.org/)
+
+
+# Jun 24, 2020
+
+We've just released Airflow Backport Provider Packages 2020.6.24
+
+The Backport provider packages make it possible to easily use Airflow 2.0 
Operators, Hooks, Sensors, Secrets, Transfers in Airflow 1.10. More stats 
below, but the Backport Provider packages increase the number of 
easily-available integrations for Airflow 1.10 users by a whopping **55%**.
+
+- We have **58** backport packages in total. **599** classes (Operators, 
Hooks, Transfers, Sensors, Secrets)
+- We have **213** new (!) classes that have not been easily available to 1.10 
users so far:
+- Operators: 150
+- Transfers: 12
+- Sensors: 14
+- Hooks: 37
+- Secrets: 0
+- We have 386 classes that were moved. Quite a number of those (hard to say 
exactly how many) got new features, options, parameters.
+- Operators: 204
+- Transfers: 36
+- Sensors: 46
+- Hooks: 96
+- Secrets: 4
+
+**List of the backport provider packages:**
+
+>
+> 1. [Amazon](https://pypi.org/project/apache-airflow-backport-providers-amazon/2020.6.24/)
+> 2. [Apache HDFS](https://pypi.org/project/apache-airflow-backport-providers-apache-hdfs/2020.6.24/)
+> 3. [Apache Hive](https://pypi.org/project/apache-airflow-backport-providers-apache-hive/2020.6.24/)
+> 4. [Apache Livy](https://pypi.org/project/apache-airflow-backport-providers-apache-livy/2020.6.24/)
+> 5. [Apache Pig](https://pypi.org/project/apache-airflow-backport-providers-apache-pig/2020.6.24/)
+> 6. [Apache Pinot](https://pypi.org/project/apache-airflow-backport-providers-apache-pinot/2020.6.24/)
+> 7. [Apache Spark](https://pypi.org/project/apache-airflow-backport-providers-apache-spark/2020.6.24/)
+> 8. [Apache Sqoop](https://pypi.org/project/apache-airflow-backport-providers-apache-sqoop/2020.6.24/)
+> 9. [Azure](https://pypi.org/project/apache-airflow-backport-providers-microsoft-azure/2020.6.24/)
+> 10. [Cassandra](https://pypi.org/project/apache-airflow-backport-providers-apache-cassandra/2020.6.24/)
+> 11. [Celery](https://pypi.org/project/apache-airflow-backport-providers-celery/2020.6.24/)
+> 12. [Cloudant](https://pypi.org/project/apache-airflow-backport-providers-cloudant/2020.6.24/)
+> 13. [Databricks](https://pypi.org/project/apache-airflow-backport-providers-databricks/2020.6.24/)
+> 14. [Datadog](https://pypi.org/project/apache-airflow-backport-providers-datadog/2020.6.24/)
+> 15. [Dingding](https://pypi.org/project/apache-airflow-backport-providers-dingding/2020.6.24/)
+> 16.

[GitHub] [airflow-site] potiuk merged pull request #279: Move announcements page from confluence to website

2020-08-09 Thread GitBox


potiuk merged pull request #279:
URL: https://github.com/apache/airflow-site/pull/279


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk closed issue #10196: Move Announcement page to airflow.apache.org

2020-08-09 Thread GitBox


potiuk closed issue #10196:
URL: https://github.com/apache/airflow/issues/10196


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] dimberman opened a new pull request #10266: Add reconcile_metadata to reconcile_pods

2020-08-09 Thread GitBox


dimberman opened a new pull request #10266:
URL: https://github.com/apache/airflow/pull/10266


   Metadata objects require a more complex merge strategy than a simple "merge pods" for merging labels and other features.
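   
   As a rough sketch of the kind of merge this refers to (illustrative only, not the PR's code), metadata has to be combined field by field rather than letting one V1ObjectMeta replace the other:
   
   ```python
   # Sketch: merge two V1ObjectMeta objects so labels/annotations from both
   # survive, with the client-supplied values winning on conflicts.
   from kubernetes.client import models as k8s

   def merge_metadata(base: k8s.V1ObjectMeta, client: k8s.V1ObjectMeta) -> k8s.V1ObjectMeta:
       return k8s.V1ObjectMeta(
           name=client.name or base.name,
           namespace=client.namespace or base.namespace,
           labels={**(base.labels or {}), **(client.labels or {})},
           annotations={**(base.annotations or {}), **(client.annotations or {})},
       )
   ```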
   
   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on pull request #10192: Deprecate BaseHook.get_connections method (#10135)

2020-08-09 Thread GitBox


mik-laj commented on pull request #10192:
URL: https://github.com/apache/airflow/pull/10192#issuecomment-671097020


   I added the `airflow connections get` command: 
https://github.com/apache/airflow/pull/10214



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk closed issue #10155: Airflow 1.10.10 + DAG SERIALIZATION = fails to start manually the DAG's operators

2020-08-09 Thread GitBox


potiuk closed issue #10155:
URL: https://github.com/apache/airflow/issues/10155


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk merged pull request #10246: Added DataprepGetJobsForJobGroupOperator

2020-08-09 Thread GitBox


potiuk merged pull request #10246:
URL: https://github.com/apache/airflow/pull/10246


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on pull request #10246: Added DataprepGetJobsForJobGroupOperator

2020-08-09 Thread GitBox


potiuk commented on pull request #10246:
URL: https://github.com/apache/airflow/pull/10246#issuecomment-671099018


   Thanks @michalslowikowski00 !



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on issue #10264: Add Sphinx Spell-Checker

2020-08-09 Thread GitBox


potiuk commented on issue #10264:
URL: https://github.com/apache/airflow/issues/10264#issuecomment-671098812


   BTW. I got a lot better at it with recent changes in IntelliJ, where spell checking and grammar checking are built in :)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on issue #10264: Add Sphinx Spell-Checker

2020-08-09 Thread GitBox


potiuk commented on issue #10264:
URL: https://github.com/apache/airflow/issues/10264#issuecomment-671098732


   Love it. I make a lot of spelling mistakes (usually because I want to go 
fast). So having an automated check would be awesome!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] shlomiken commented on issue #10007: Sagemaker contrib module - should be independant

2020-08-09 Thread GitBox


shlomiken commented on issue #10007:
URL: https://github.com/apache/airflow/issues/10007#issuecomment-671099821


   Hi @mik-laj, unfortunately we don't have the capacity to refactor this right now; we decided to install sagemaker as it has worked until now.
   This means that the use case of working with packaged zip files which contain sagemaker is a problem - so we actually install it separately on target machines. (botocore also has a problem working in packaged zips.)
   I still think that the SageMaker operators could provide an easy-to-use configuration object instead of the full-blown package, but that is a sagemaker issue, I guess. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on issue #8657: Add a configuration option to enable Airflow to look for DAGs in a specified S3 bucket.

2020-08-09 Thread GitBox


mik-laj commented on issue #8657:
URL: https://github.com/apache/airflow/issues/8657#issuecomment-671089031


   Such a feature is not planned. It is recommended to set up a separate 
process (e.g. sidecar) that will be responsible for file synchronization. 
However, I would be happy if there was a guide in the documentation for this.
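   
   As one hedged illustration of such a separate sync process — a minimal sketch assuming boto3, with the bucket, prefix, and paths as placeholders:
   
   ```python
   # Sketch of a sidecar loop that periodically mirrors DAG files from an S3
   # prefix into the shared dags folder.
   import time
   from pathlib import Path

   import boto3

   BUCKET, PREFIX, DAGS_DIR = "my-dag-bucket", "dags/", Path("/opt/airflow/dags")
   s3 = boto3.client("s3")

   while True:
       paginator = s3.get_paginator("list_objects_v2")
       for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
           for obj in page.get("Contents", []):
               key = obj["Key"]
               if key.endswith("/"):
                   continue
               target = DAGS_DIR / key[len(PREFIX):]
               target.parent.mkdir(parents=True, exist_ok=True)
               s3.download_file(BUCKET, key, str(target))  # overwrite with the latest version
       time.sleep(60)  # re-sync every minute
   ```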



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj edited a comment on issue #9181: Make KubernetesExecutor PersistentVolumeClaim mounts more flexible.

2020-08-09 Thread GitBox


mik-laj edited a comment on issue #9181:
URL: https://github.com/apache/airflow/issues/9181#issuecomment-671089535


   Can the expected result be achieved with pod_template_file option?
   
https://airflow.readthedocs.io/en/latest/configurations-ref.html#pod-template-file



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on issue #10038: Using >= or =~ in dependency management is a very bad practice

2020-08-09 Thread GitBox


mik-laj commented on issue #10038:
URL: https://github.com/apache/airflow/issues/10038#issuecomment-671089649


   Is there anything left to do in this ticket? We have docs about it: 
https://airflow.readthedocs.io/en/latest/installation.html



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow-site] branch asf-site updated: Update asf-site to output generated at 6d47dfe

2020-08-09 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/airflow-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 3a1233d  Update asf-site to output generated at 6d47dfe
3a1233d is described below

commit 3a1233d9183f61be039bfcfbe625cba6da3b714d
Author: potiuk 
AuthorDate: Sun Aug 9 19:22:13 2020 +

Update asf-site to output generated at 6d47dfe
---
 404.html   |   28 +
 {blog => announcements}/index.html | 1012 ++--
 announcements/index.xml|   18 +
 blog/airflow-1.10.10/index.html|   32 +-
 blog/airflow-1.10.8-1.10.9/index.html  |   32 +-
 blog/airflow-survey/index.html |   32 +-
 blog/announcing-new-website/index.html |   32 +-
 .../index.html |   32 +-
 .../index.html |   32 +-
 .../index.html |   32 +-
 .../index.html |   32 +-
 blog/index.html|   28 +
 .../index.html |   32 +-
 blog/tags/community/index.html |   28 +
 blog/tags/development/index.html   |   28 +
 blog/tags/documentation/index.html |   28 +
 blog/tags/release/index.html   |   28 +
 blog/tags/rest-api/index.html  |   28 +
 blog/tags/survey/index.html|   28 +
 blog/tags/users/index.html |   28 +
 categories/index.html  |   28 +
 community/index.html   |   28 +
 images/airflow_dark_bg.png |  Bin 0 -> 3719 bytes
 index.html |   58 +-
 install/index.html |   28 +
 meetups/index.html |   28 +
 privacy-notice/index.html  |   28 +
 roadmap/index.html |   28 +
 search/index.html  |   32 +-
 sitemap.xml|  105 +-
 tags/index.html|   28 +
 use-cases/adobe/index.html |   32 +-
 use-cases/big-fish-games/index.html|   32 +-
 use-cases/dish/index.html  |   32 +-
 use-cases/experity/index.html  |   32 +-
 use-cases/index.html   |   28 +
 use-cases/onefootball/index.html   |   32 +-
 37 files changed, 1738 insertions(+), 411 deletions(-)

diff --git a/404.html b/404.html
index 125ca42..e0425eb 100644
--- a/404.html
+++ b/404.html
[hunk bodies garbled in the archive (HTML markup stripped); the change adds an "Announcements" entry to the page navigation]
diff --git a/blog/index.html b/announcements/index.html
similarity index 50%
copy from blog/index.html
copy to announcements/index.html
index e91c208..f9b5c9f 100644
--- a/blog/index.html
+++ b/announcements/index.html
[hunk bodies garbled in the archive (HTML markup stripped); the copied page's title changes from "Blog | Apache Airflow" to "Announcements | Apache Airflow" and an "Announcements" entry is added to the navigation]

[GitHub] [airflow] potiuk closed issue #10038: Using >= or =~ in dependency management is a very bad practice

2020-08-09 Thread GitBox


potiuk closed issue #10038:
URL: https://github.com/apache/airflow/issues/10038


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on issue #8657: Add a configuration option to enable Airflow to look for DAGs in a specified S3 bucket.

2020-08-09 Thread GitBox


potiuk commented on issue #8657:
URL: https://github.com/apache/airflow/issues/8657#issuecomment-671098446


   Yep. There was an extensive discussion about it at the devlist 
https://lists.apache.org/thread.html/224d1e7d1b11e0b8314075f21b1b81708749f2899f4cce5af295e8a8%40%3Cdev.airflow.apache.org%3E
 and there is a long discussion in the wiki page: 
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-5+Remote+DAG+Fetcher. I 
believe the current result of the discussion is that Airflow should not 
implement DAG fetcher for now and (as @mik-laj mentioned) it can be done with 
side-cars rather easily (and in the way that will be good for particular 
cases). Another option will be to add some options in the Helm Chart where we 
have side-cars already and git-sync is implemented as one. Adding an S3 side-car there might be a good idea.
   
   @DmitryRusakovKodiak  @ismailsimsek  - since you are interested in it - 
please feel free to open an issue for Helm Chart extension (or even donate one) 
 or re-open a discussion in the devlist, but for now I am closing this one.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk closed issue #8657: Add a configuration option to enable Airflow to look for DAGs in a specified S3 bucket.

2020-08-09 Thread GitBox


potiuk closed issue #8657:
URL: https://github.com/apache/airflow/issues/8657


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow-site] potiuk commented on pull request #279: Move announcements page from confluence to website

2020-08-09 Thread GitBox


potiuk commented on pull request #279:
URL: https://github.com/apache/airflow-site/pull/279#issuecomment-671098545


   The page is live. Thanks Again :)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch master updated (06a1836 -> ef08831)

2020-08-09 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 06a1836  Add Missing Apache Providers to docs/installation.rst (#10265)
 add ef08831  Added DataprepGetJobsForJobGroupOperator (#10246)

No new revisions were added by this update.

Summary of changes:
 airflow/models/connection.py   |  1 +
 ...mple_postgres_to_gcs.py => example_dataprep.py} | 28 +++
 airflow/providers/google/cloud/hooks/dataprep.py   | 75 +
 .../providers/google/cloud/operators/dataprep.py   | 56 +
 docs/howto/operator/google/cloud/dataprep.rst  | 60 +
 docs/operators-and-hooks-ref.rst   |  6 ++
 .../providers/google/cloud/hooks/test_dataprep.py  | 97 ++
 .../google/cloud/operators/test_dataprep.py| 25 +++---
 8 files changed, 319 insertions(+), 29 deletions(-)
 copy airflow/providers/google/cloud/example_dags/{example_postgres_to_gcs.py 
=> example_dataprep.py} (62%)
 create mode 100644 airflow/providers/google/cloud/hooks/dataprep.py
 create mode 100644 airflow/providers/google/cloud/operators/dataprep.py
 create mode 100644 docs/howto/operator/google/cloud/dataprep.rst
 create mode 100644 tests/providers/google/cloud/hooks/test_dataprep.py
 copy airflow/contrib/utils/sendgrid.py => 
tests/providers/google/cloud/operators/test_dataprep.py (55%)



[GitHub] [airflow] mik-laj commented on issue #10007: Sagemaker contrib module - should be independant

2020-08-09 Thread GitBox


mik-laj commented on issue #10007:
URL: https://github.com/apache/airflow/issues/10007#issuecomment-671100253


   I'm closing the ticket because the problem is no longer with Airflow, but with the Sagemaker SDK.
   
   @shlomiken Thank you for sharing your experience about this operator.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow-site] LeonY1 edited a comment on issue #275: The buttons for the Use Cases feel reversed

2020-08-09 Thread GitBox


LeonY1 edited a comment on issue #275:
URL: https://github.com/apache/airflow-site/issues/275#issuecomment-671061272


   When we look at the use cases page we can see that there are a lot of 
different use cases ordered from left to right. So the example I will be 
talking about will be between Adobe and Big Fish games. Since they're ordered 
from left to right, it would make sense that the next use case would be to the 
right since that is how the array was ordered.
   
   
![UseCases](https://user-images.githubusercontent.com/14265005/89735007-1047af80-da25-11ea-9f72-540cd80294ec.png)
   
   ### Adobe
   
   So when we enter the Adobe use case, it would make more sense to go to "next", since there is no previous use case yet. However, this is not the case. You have to use "previous" to move to Big Fish Games.
   
   
![Adobe](https://user-images.githubusercontent.com/14265005/89735061-6f0d2900-da25-11ea-8de6-e3dd8d5ad0df.png)
   
   ### Big Fish
   
   Conversely, you need to press "next" to move to the previous option. 
   
   
![BigFish](https://user-images.githubusercontent.com/14265005/89735080-89470700-da25-11ea-87a8-122a063b1a97.png)
   
   This also breaks the usual mental model that the right-hand button goes to the next item in the array while the left-hand one goes to the previous one.
   
   I believe "previous" and "next" may have been intended to follow the order in which the use cases were added, but since there is no date or anything of that sort, this is very confusing. This is just my opinion on the site; however, I think others would be confused by this as well.
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow-site] LeonY1 commented on issue #275: The buttons for the Use Cases feel reversed

2020-08-09 Thread GitBox


LeonY1 commented on issue #275:
URL: https://github.com/apache/airflow-site/issues/275#issuecomment-671061272


   When we look at the use cases page we can see that there are a lot of 
different use cases ordered from left to right. So the example I will be 
talking about will be between Adobe and Big Fish games. Since they're ordered 
from left to right, it would make sense that the next use case would be to the 
right since that is how the array was ordered.
   
   
![UseCases](https://user-images.githubusercontent.com/14265005/89735007-1047af80-da25-11ea-9f72-540cd80294ec.png)
   
   So when we enter the Adobe use case, it would make more sense to go to "next", since there is no previous use case yet. However, this is not the case. You have to use "previous" to move to Big Fish Games.
   
   
![Adobe](https://user-images.githubusercontent.com/14265005/89735061-6f0d2900-da25-11ea-8de6-e3dd8d5ad0df.png)
   
   Conversely, you need to press "next" to move to the previous option. 
   
   
![BigFish](https://user-images.githubusercontent.com/14265005/89735080-89470700-da25-11ea-87a8-122a063b1a97.png)
   
   This also breaks the usual mental model that the right-hand button goes to the next item in the array while the left-hand one goes to the previous one.
   
   I believe "previous" and "next" may have been intended to follow the order in which the use cases were added, but since there is no date or anything of that sort, this is very confusing. This is just my opinion on the site; however, I think others would be confused by this as well.
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on pull request #9907: Added feature to Import connections from a file.

2020-08-09 Thread GitBox


mik-laj commented on pull request #9907:
URL: https://github.com/apache/airflow/pull/9907#issuecomment-671062255


   We already have the `airflow connections import` command. Your command will be a great improvement for the CLI and a great complement to it.
   
https://airflow.readthedocs.io/en/latest/howto/connection/index.html#exporting-connections-from-the-cli



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil merged pull request #10263: Fix various typos in the repo

2020-08-09 Thread GitBox


kaxil merged pull request #10263:
URL: https://github.com/apache/airflow/pull/10263


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil opened a new pull request #10265: Add Missing Apache Providers to docs/installation.rst

2020-08-09 Thread GitBox


kaxil opened a new pull request #10265:
URL: https://github.com/apache/airflow/pull/10265


   The following were missing:
   
   - kylin
   - livy
   - sqoop
   - pig
   
   
   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on issue #8091: Introduce Pydocstyle

2020-08-09 Thread GitBox


mik-laj commented on issue #8091:
URL: https://github.com/apache/airflow/issues/8091#issuecomment-671069863


   @kaxil @potiuk What else are we planning to do with this ticket?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil closed issue #8091: Introduce Pydocstyle

2020-08-09 Thread GitBox


kaxil closed issue #8091:
URL: https://github.com/apache/airflow/issues/8091


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on issue #8091: Introduce Pydocstyle

2020-08-09 Thread GitBox


kaxil commented on issue #8091:
URL: https://github.com/apache/airflow/issues/8091#issuecomment-671069981


   This can be closed; I will create a new ticket for the next steps.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil merged pull request #10265: Add Missing Apache Providers to docs/installation.rst

2020-08-09 Thread GitBox


kaxil merged pull request #10265:
URL: https://github.com/apache/airflow/pull/10265


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] dimberman commented on a change in pull request #10230: Fix KubernetesPodOperator reattachment

2020-08-09 Thread GitBox


dimberman commented on a change in pull request #10230:
URL: https://github.com/apache/airflow/pull/10230#discussion_r467603846



##
File path: airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py
##
@@ -304,14 +297,40 @@ def execute(self, context) -> Optional[str]:
 except AirflowException as ex:
 raise AirflowException('Pod Launching failed: 
{error}'.format(error=ex))
 
+def handle_pod_overlap(self, labels, try_numbers_match, launcher, 
pod_list):
+"""
+

Review comment:
   @kaxil done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] dimberman commented on pull request #10230: Fix KubernetesPodOperator reattachment

2020-08-09 Thread GitBox


dimberman commented on pull request #10230:
URL: https://github.com/apache/airflow/pull/10230#issuecomment-671073461


   > LGTM, logic is much clearer, thank you.
   > 
   > One thing to consider is the comment by @dakov here: [#6377 
(comment)](https://github.com/apache/airflow/pull/6377#discussion_r459648834)
   > 
   > Perhaps on line 280 where we check for 0 or 1 existing pod & raise 
otherwise, we should only raise if `reattach_on_restart` is True? As if it is 
False then we probably don't care & we will create another pod anyway. What do 
you think?
   
   Yeah that makes sense. Tbh I'll be surprised if many people turn off 
`reattach_on_restart` as it seems like the logical step to take.
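   
   In other words, a sketch of the suggested guard (not the operator's actual code):
   
   ```python
   # Sketch: only treat multiple matching pods as an error when we intend to
   # reattach; with reattach_on_restart=False we fall through and launch a new pod.
   def check_existing_pods(pod_list, reattach_on_restart):
       if reattach_on_restart and len(pod_list.items) > 1:
           raise RuntimeError("More than one pod running with matching labels")
   ```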



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch master updated: Fix Warning when using a different Sphinx Builder (#10262)

2020-08-09 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/master by this push:
 new 5503a6a  Fix Warning when using a different Sphinx Builder (#10262)
5503a6a is described below

commit 5503a6a152f2cb59264a2f5a5e267f38355c871a
Author: Kaxil Naik 
AuthorDate: Sun Aug 9 15:39:31 2020 +0100

Fix Warning when using a different Sphinx Builder (#10262)
---
 docs/exts/redirects.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/exts/redirects.py b/docs/exts/redirects.py
index 9e887cc..5ea4885 100644
--- a/docs/exts/redirects.py
+++ b/docs/exts/redirects.py
@@ -29,7 +29,7 @@ log = logging.getLogger(__name__)
 
 
 def generate_redirects(app):
-"""Generaate redirects files."""
+"""Generate redirects files."""
 redirect_file_path = os.path.join(app.srcdir, app.config.redirects_file)
 if not os.path.exists(redirect_file_path):
 raise ExtensionError(f"Could not find redirects file at 
'{redirect_file_path}'")
@@ -38,7 +38,7 @@ def generate_redirects(app):
 
 if not isinstance(app.builder, builders.StandaloneHTMLBuilder):
 log.warning(
-"The plugin is support only 'html' builder, but you are using 
'{type(app.builder)}'. Skipping..."
+f"The plugin supports only 'html' builder, but you are using 
'{type(app.builder)}'. Skipping..."
 )
 return
 



[GitHub] [airflow] kaxil opened a new pull request #10263: Fix various typos in the repo

2020-08-09 Thread GitBox


kaxil opened a new pull request #10263:
URL: https://github.com/apache/airflow/pull/10263


   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil opened a new issue #10264: Add Spell-Checker

2020-08-09 Thread GitBox


kaxil opened a new issue #10264:
URL: https://github.com/apache/airflow/issues/10264


   We have been fixing various typos in the project, but it would be good if we could enable a spell checker for our docs site, so our docs stay typo-free.
   
   We can use https://pypi.org/project/sphinxcontrib-spelling/ to do this.
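   
   For reference, wiring it up would look roughly like this in `docs/conf.py` (a sketch, not the project's actual configuration):
   
   ```python
   # Hypothetical docs/conf.py snippet; assumes sphinxcontrib-spelling is installed.
   extensions = [
       # ...the existing Airflow documentation extensions...
       "sphinxcontrib.spelling",
   ]

   # Project-specific words the checker should not flag.
   spelling_word_list_filename = "spelling_wordlist.txt"
   spelling_show_suggestions = True

   # The check then runs as a separate builder: sphinx-build -b spelling docs docs/_build
   ```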
   
   I gave it a shot but unfortunately, there are too many words that we need to add to `docs/spelling_wordlist.txt`.
   
   Here is the list I had but there are still ~6k more words that need to be 
added:
   
   ```
   Acyclic
   Airbnb
   Async
   Avro
   Bas
   BaseView
   Cassanda
   DagRun
   Dask
   Dataproc
   Datastore
   Gantt
   Gunicorn
   Harenslak
   Hashicorp
   Jarek
   Jinja
   Jira
   Kamil
   Kerberos
   Kibana
   Kubernetes
   Oozie
   Opsgenie
   Parameterizing
   Potiuk
   Py
   Qubole
   Sqoop
   Standarization
   Systemd
   Templating
   XCom
   XComs
   Zsh
   adls
   airflow
   airflowignore
   ansible
   apikey
   argcomplete
   args
   async
   auth
   autocommit
   autodetect
   automl
   autoscale
   aws
   backend
   backfill
   backfilled
   bashcompinit
   batcher
   bigquery
   bigtable
   bitshift
   boto
   botocore
   catchup
   cfg
   chown
   classmethod
   cloudant
   cloudsql
   cncf
   config
   configMapRef
   configmap
   configuing
   cronjob
   crypto
   cyexamplekey
   dag
   dagbag
   dagruns
   databricks
   datadog
   dataset
   datasets
   datetime
   dbs
   dejson
   deserializing
   dest
   dev
   devel
   dingding
   distros
   dockerenv
   docstring
   docstrings
   elasticsearch
   envFrom
   eventlet
   exampleinclude
   exasol
   facebook
   failover
   fernet
   fluentd
   fs
   gRPC
   gcp
   gcpcloudsql
   gevent
   github
   greenlets
   grpc
   gssapi
   hadoop
   hashicorp
   hdfs
   hiveserver
   howto
   httpbin
   imap
   initdb
   integration
   integrations
   jalr
   jdbc
   jinja
   keytab
   krb
   kubernetes
   kwargs
   kylin
   licence
   literalinclude
   logins
   loglevel
   logstash
   lshift
   macOS
   mdeng
   memorystore
   mesos
   metadatabase
   metarouter
   metastore
   mongo
   msg
   mssql
   noqa
   odbc
   papermill
   param
   paramiko
   petabyte
   pgdatabase
   pghost
   pgpassfile
   pgpassword
   pgport
   pguser
   pidfile
   pinot
   postgre
   postgres
   postgresql
   precheck
   proc
   programmatically
   psql
   py
   pylint
   pythonpath
   rankdir
   rbac
   readthedocs
   resetdb
   rshift
   rst
   salesforce
   saml
   sanitization
   searchpath
   secretRef
   secretsmanager
   seealso
   serverless
   sftp
   smtps
   spegno
   sqla
   stackdriver
   statsd
   stdout
   subcommand
   subdag
   subgraph
   subpackage
   subpackages
   subprocesses
   sudo
   tablename
   templated
   templating
   teradata
   timedelta
   umask
   unpause
   upgradedb
   upsert
   uptime
   utcnow
   versionable
   vertica
   wasb
   webhdfs
   webserver
   xcom
   
   yandex
   yandexcloud
   
   ```



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] VijayantSoni commented on pull request #8701: Adding ElastiCache Hook for creating, describing and deleting replication groups

2020-08-09 Thread GitBox


VijayantSoni commented on pull request #8701:
URL: https://github.com/apache/airflow/pull/8701#issuecomment-671067874


   Hi @mik-laj, I rebased with master and the tests ran fine this time. Thanks.
   Will it be possible for you to review in the absence of @ashb?
   Thanks!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on pull request #8701: Adding ElastiCache Hook for creating, describing and deleting replication groups

2020-08-09 Thread GitBox


mik-laj commented on pull request #8701:
URL: https://github.com/apache/airflow/pull/8701#issuecomment-671068349


   I am not an AWS expert. @feluelle  can you take a look?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch master updated (5503a6a -> b43f90a)

2020-08-09 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git.


from 5503a6a  Fix Warning when using a different Sphinx Builder (#10262)
 add b43f90a  Fix various typos in the repo (#10263)

No new revisions were added by this update.

Summary of changes:
 airflow/api_connexion/schemas/health_schema.py  | 2 +-
 airflow/models/baseoperator.py  | 2 +-
 airflow/models/dag.py   | 4 ++--
 airflow/models/dagcode.py   | 6 +++---
 airflow/providers/apache/hive/hooks/hive.py | 2 +-
 airflow/providers/apache/kylin/hooks/kylin.py   | 2 +-
 airflow/providers/google/cloud/hooks/dlp.py | 4 ++--
 backport_packages/refactor_backport_packages.py | 2 +-
 docs/concepts.rst   | 4 ++--
 docs/howto/write-logs.rst   | 5 +++--
 10 files changed, 17 insertions(+), 16 deletions(-)



[airflow] branch master updated: Add Missing Apache Providers to docs/installation.rst (#10265)

2020-08-09 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/master by this push:
 new 06a1836  Add Missing Apache Providers to docs/installation.rst (#10265)
06a1836 is described below

commit 06a1836755f1e6e30f5c6e7d06cf2b0e063e260c
Author: Kaxil Naik 
AuthorDate: Sun Aug 9 17:17:06 2020 +0100

Add Missing Apache Providers to docs/installation.rst (#10265)
---
 docs/installation.rst | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/docs/installation.rst b/docs/installation.rst
index 54a227f..d406ee3 100644
--- a/docs/installation.rst
+++ b/docs/installation.rst
@@ -147,10 +147,18 @@ Here's the list of the subpackages and what they enable:
 +-+-+--+
 | hive| ``pip install 'apache-airflow[apache.hive]'``   | All Hive related operators   |
 +-+-+--+
+| kylin   | ``pip install 'apache-airflow[apache.kylin]'``  | All Kylin related operators & hooks  |
++-+-+--+
+| livy| ``pip install 'apache-airflow[apache.livy]'``   | All Livy related operators & hooks   |
++-+-+--+
+| pig | ``pip install 'apache-airflow[apache.pig]'``| All Pig related operators & hooks|
++-+-+--+
 | presto  | ``pip install 'apache-airflow[apache.presto]'`` | All Presto related operators & hooks |
 +-+-+--+
 | spark   | ``pip install 'apache-airflow[apache.spark]'``  | All Spark related operators & hooks  |
 +-+-+--+
+| sqoop   | ``pip install 'apache-airflow[apache.sqoop]'``  | All Sqoop related operators & hooks  |
++-+-+--+
 | webhdfs | ``pip install 'apache-airflow[webhdfs]'``   | HDFS hooks and operators |
 +-+-+--+
 



[GitHub] [airflow-site] yesemsanthoshkumar edited a comment on pull request #279: Move announcements page from confluence to website

2020-08-09 Thread GitBox


yesemsanthoshkumar edited a comment on pull request #279:
URL: https://github.com/apache/airflow-site/pull/279#issuecomment-671075081


   > > 1. Oct 18, 2019
   > >Same update as Nov 22, 2019. Should I remove one of them?
   > 
   > Yep. mistake. Remove the Nov 22nd one.
   
   Done
   > > 1. April 22, 2016
   > >Migrating to Apache Phrase points to announcements confluence page. 
Should we repoint this to the website?
   > 
   > Yes please!
   
   Done
   > 
   > Few comments:
   > 
   > 1. It would be great to add some whitespace at the top of the Announcement 
page (as it is for other pages). Currently the header covers half of the first 
line:
   
   Done
   > 1. The list of the backport providers also has far too much space in. 
Maybe better to have them as bullets rather than numbered list to squeeze them 
together.
   
   I didn't notice those spaces between those list items. Removed them now.
   
   > BTW. I am not sure if you realize that but you can easily preview the 
generated pages now. It is enough to download the artifact from the PR, extract 
it and run `python -m http.server` and you will be able to preview it at 
http://localhost:8000
   
   Thanks for the info. Was checking from the site.sh script preview. Checked 
the above comments with the artifact as well.
   
   Let me know if there are any other issues.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow-site] yesemsanthoshkumar commented on pull request #279: Move announcements page from confluence to website

2020-08-09 Thread GitBox


yesemsanthoshkumar commented on pull request #279:
URL: https://github.com/apache/airflow-site/pull/279#issuecomment-671075081


   > > 1. Oct 18, 2019
   > >Same update as Nov 22, 2019. Should I remove one of them?
   > 
   > Yep. mistake. Remove the Nov 22nd one.
   
   Done
   > > 1. April 22, 2016
   > >Migrating to Apache Phrase points to announcements confluence page. 
Should we repoint this to the website?
   > 
   > Yes please!
   
   Done
   > 
   > Few comments:
   > 
   > 1. It would be great to add some whitespace at the top of the Announcement 
page (as it is for other pages). Currently the header covers half of the first 
line:
   
   Done
   > 1. The list of the backport providers also has far too much space in. 
Maybe better to have them as bullets rather than numbered list to squeeze them 
together.
   
   I didn't notice those spaces between those list items. Removed them now.
   
   > BTW. I am not sure if you realize that but you can easily preview the 
generated pages now. It is enough to download the artifact from the PR, extract 
it and run `python -m http.server` and you will be able to preview it at 
http://localhost:8000
   Thanks for the info. Was checking from the site.sh script preview. Checked 
the above comments with the artifact as well.
   
   Let me know if there are any other issues.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] rootcss commented on issue #8715: Documentation about loading packages in Python / Airflow

2020-08-09 Thread GitBox


rootcss commented on issue #8715:
URL: https://github.com/apache/airflow/issues/8715#issuecomment-671142928


   @mik-laj yes, can you assign this issue to me? 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] roitvt commented on pull request #10023: spark-on-k8s sensor - add driver logs

2020-08-09 Thread GitBox


roitvt commented on pull request #10023:
URL: https://github.com/apache/airflow/pull/10023#issuecomment-671017812


   LGTM works well with spark-pi.
   Great addition thanks :) 
   @bbenzikry @mik-laj 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] roitvt removed a comment on pull request #10023: spark-on-k8s sensor - add driver logs

2020-08-09 Thread GitBox


roitvt removed a comment on pull request #10023:
URL: https://github.com/apache/airflow/pull/10023#issuecomment-671018435


   > > > @roitvt, sorry to bother you - do you think you'll have time to take a 
look this week?
   > > 
   > > 
   > > Hi Beni, I'm really sorry I'll do it this week. thank you very much for 
adding this functionality and I'll be glad to have a talk with you about using 
this integration and Spark on K8s :)
   > 
   > Great, thanks 
   > I'll be happy to discuss, I'm also interested in some design 
considerations you had when writing the operator.
   
   
   
   > > > @roitvt, sorry to bother you - do you think you'll have time to take a 
look this week?
   > > 
   > > 
   > > Hi Beni, I'm really sorry I'll do it this week. thank you very much for 
adding this functionality and I'll be glad to have a talk with you about using 
this integration and Spark on K8s :)
   > 
   > Great, thanks 
   > I'll be happy to discuss, I'm also interested in some design 
considerations you had when writing the operator.
   
   
   
   > > @roitvt, sorry to bother you - do you think you'll have time to take a 
look this week?
   > 
   > Hi Beni, I'm really sorry I'll do it this week. thank you very much for 
adding this functionality and I'll be glad to have a talk with you about using 
this integration and Spark on K8s :)
   What is your mail?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] roitvt commented on pull request #10023: spark-on-k8s sensor - add driver logs

2020-08-09 Thread GitBox


roitvt commented on pull request #10023:
URL: https://github.com/apache/airflow/pull/10023#issuecomment-671018435


   > > > @roitvt, sorry to bother you - do you think you'll have time to take a 
look this week?
   > > 
   > > 
   > > Hi Beni, I'm really sorry I'll do it this week. thank you very much for 
adding this functionality and I'll be glad to have a talk with you about using 
this integration and Spark on K8s :)
   > 
   > Great, thanks 
   > I'll be happy to discuss, I'm also interested in some design 
considerations you had when writing the operator.
   
   
   
   > > > @roitvt, sorry to bother you - do you think you'll have time to take a 
look this week?
   > > 
   > > 
   > > Hi Beni, I'm really sorry I'll do it this week. thank you very much for 
adding this functionality and I'll be glad to have a talk with you about using 
this integration and Spark on K8s :)
   > 
   > Great, thanks 
   > I'll be happy to discuss, I'm also interested in some design 
considerations you had when writing the operator.
   
   
   
   > > @roitvt, sorry to bother you - do you think you'll have time to take a 
look this week?
   > 
   > Hi Beni, I'm really sorry I'll do it this week. thank you very much for 
adding this functionality and I'll be glad to have a talk with you about using 
this integration and Spark on K8s :)
   What is your mail?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] roitvt commented on pull request #10023: spark-on-k8s sensor - add driver logs

2020-08-09 Thread GitBox


roitvt commented on pull request #10023:
URL: https://github.com/apache/airflow/pull/10023#issuecomment-671018479


   > > > @roitvt, sorry to bother you - do you think you'll have time to take a 
look this week?
   > > 
   > > 
   > > Hi Beni, I'm really sorry I'll do it this week. thank you very much for 
adding this functionality and I'll be glad to have a talk with you about using 
this integration and Spark on K8s :)
   > 
   > Great, thanks 
   > I'll be happy to discuss, I'm also interested in some design 
considerations you had when writing the operator.
   
   What is your mail?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] JeffryMAC commented on a change in pull request #10241: Create "Managing variable" in howto directory

2020-08-09 Thread GitBox


JeffryMAC commented on a change in pull request #10241:
URL: https://github.com/apache/airflow/pull/10241#discussion_r467544808



##
File path: docs/howto/variable.rst
##
@@ -0,0 +1,72 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+Managing Connections
+
+
+Variables are a generic way to store and retrieve arbitrary content or
+settings as a simple key value store within Airflow. Variables can be
+listed, created, updated and deleted from the UI (``Admin -> Variables``),
+code or CLI.
+
+.. image:: ../img/variable_hidden.png
+
+See the :ref:`Variables Concepts ` documentation for
+more information.
+
+Storing Variables in Environment Variables

Review comment:
   I was pointing that out because I don't know the answers, and there is 
nowhere in the docs where this info is available.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on a change in pull request #10241: Create "Managing variable" in howto directory

2020-08-09 Thread GitBox


mik-laj commented on a change in pull request #10241:
URL: https://github.com/apache/airflow/pull/10241#discussion_r467550086



##
File path: docs/howto/variable.rst
##
@@ -0,0 +1,72 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+Managing Variables
+

Review comment:
   ```suggestion
   Managing Variables
   ==================
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on a change in pull request #10241: Create "Managing variable" in howto directory

2020-08-09 Thread GitBox


mik-laj commented on a change in pull request #10241:
URL: https://github.com/apache/airflow/pull/10241#discussion_r467550010



##
File path: docs/howto/variable.rst
##
@@ -0,0 +1,72 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+Managing Connections
+
+
+Variables are a generic way to store and retrieve arbitrary content or
+settings as a simple key value store within Airflow. Variables can be
+listed, created, updated and deleted from the UI (``Admin -> Variables``),
+code or CLI.
+
+.. image:: ../img/variable_hidden.png
+
+See the :ref:`Variables Concepts ` documentation for
+more information.
+
+Storing Variables in Environment Variables

Review comment:
   Environment variables are helpful because they can be set very easily in 
a containerized environment and their contents can be determined during 
deployment.
   https://kubernetes.io/docs/tasks/inject-data-application/define-environment-variable-container/
   They can also carry secrets.
   https://kubernetes.io/docs/concepts/configuration/secret/
   In the DAG file you can also read environment variables directly, but using 
Airflow variables makes configuration easier in the development environment 
(e.g. access via the UI), makes development easier (access to the variable in 
Jinja), and - thanks to setting Airflow variables through environment 
variables - also makes deployment easy.
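   For illustration only - a minimal sketch, assuming Airflow 1.10.10 or newer, 
where ``AIRFLOW_VAR_*`` environment variables are picked up by ``Variable.get``; 
the variable names are made up:
   ```python
   import os

   from airflow.models import Variable

   # Normally these are set by the deployment (container env, Kubernetes Secret, ...);
   # they are set in-process here only to keep the sketch self-contained.
   os.environ["AIRFLOW_VAR_FOO"] = "BAR"
   os.environ["AIRFLOW_VAR_FOO_BAZ"] = '{"hello": "world"}'

   foo = Variable.get("foo")                                   # "BAR"
   foo_json = Variable.get("foo_baz", deserialize_json=True)   # {"hello": "world"}
   print(foo, foo_json["hello"])
   ```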





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] bbenzikry commented on pull request #10023: spark-on-k8s sensor - add driver logs

2020-08-09 Thread GitBox


bbenzikry commented on pull request #10023:
URL: https://github.com/apache/airflow/pull/10023#issuecomment-671020059


   > > > > @roitvt, sorry to bother you - do you think you'll have time to take 
a look this week?
   > > > 
   > > > 
   > > > Hi Beni, I'm really sorry I'll do it this week. thank you very much 
for adding this functionality and I'll be glad to have a talk with you about 
using this integration and Spark on K8s :)
   > > 
   > > 
   > > Great, thanks 
   > > I'll be happy to discuss, I'm also interested in some design 
considerations you had when writing the operator.
   > 
   > What is your mail?
   
   sent on FB



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on pull request #10220: Increase number of runs for quarantined tests

2020-08-09 Thread GitBox


potiuk commented on pull request #10220:
URL: https://github.com/apache/airflow/pull/10220#issuecomment-671020470


   :D Shall we :)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch v1-10-test updated: Fix init containers

2020-08-09 Thread dimberman
This is an automated email from the ASF dual-hosted git repository.

dimberman pushed a commit to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/v1-10-test by this push:
 new 9e2557a  Fix init containers
9e2557a is described below

commit 9e2557a579ab9287293969662aa35e39b4b8b292
Author: Daniel Imberman 
AuthorDate: Sun Aug 9 10:18:18 2020 -0700

Fix init containers
---
 airflow/kubernetes/pod_launcher.py | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/airflow/kubernetes/pod_launcher.py 
b/airflow/kubernetes/pod_launcher.py
index 30cb26a..620df31 100644
--- a/airflow/kubernetes/pod_launcher.py
+++ b/airflow/kubernetes/pod_launcher.py
@@ -299,9 +299,8 @@ def _convert_to_airflow_pod(pod):
 volumes, vol_secrets = _extract_volumes_and_secrets(pod.spec.volumes, 
base_container.volume_mounts)
 secrets.extend(vol_secrets)
 api_client = ApiClient()
-if pod.spec.init_containers is None:
-init_containers = [],
-else:
+init_containers = pod.spec.init_containers
+if pod.spec.init_containers is not None:
 init_containers = [api_client.sanitize_for_serialization(i) for i in 
pod.spec.init_containers]
 dummy_pod = Pod(
 image=base_container.image,



[GitHub] [airflow] potiuk merged pull request #10236: Update example on docs/howto/connection/index.rst

2020-08-09 Thread GitBox


potiuk merged pull request #10236:
URL: https://github.com/apache/airflow/pull/10236


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk merged pull request #10257: Improve guide about Google Cloud Secret Manager Backend

2020-08-09 Thread GitBox


potiuk merged pull request #10257:
URL: https://github.com/apache/airflow/pull/10257


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch master updated: Improve guide about Google Cloud Secret Manager Backend (#10257)

2020-08-09 Thread potiuk
This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/master by this push:
 new 55021b7  Improve guide about Google Cloud Secret Manager Backend 
(#10257)
55021b7 is described below

commit 55021b771d384b0dd90792962ea869e1bd8a40a0
Author: Kamil Breguła 
AuthorDate: Sun Aug 9 12:24:11 2020 +0200

Improve guide about Google Cloud Secret Manager Backend (#10257)
---
 .../secrets-backend/google-cloud-secret-manager-backend.rst  | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/docs/howto/secrets-backend/google-cloud-secret-manager-backend.rst 
b/docs/howto/secrets-backend/google-cloud-secret-manager-backend.rst
index 34454b2..f80cc7f 100644
--- a/docs/howto/secrets-backend/google-cloud-secret-manager-backend.rst
+++ b/docs/howto/secrets-backend/google-cloud-secret-manager-backend.rst
@@ -26,7 +26,15 @@ a secret backend and how to manage secrets.
 Before you begin
 
 
-`Configure Secret Manager and your local environment 
`__, 
once per project.
+Before you start, make sure you have performed the following tasks:
+
+1.  Include sendgrid subpackage as part of your Airflow installation
+
+.. code-block:: bash
+
+pip install apache-airflow[google]
+
+2. `Configure Secret Manager and your local environment 
`__, 
once per project.
 
 Enabling the secret backend
 """
@@ -50,7 +58,7 @@ You can also set this with environment variables.
 
 You can verify the correct setting of the configuration options with the 
``airflow config get-value`` command.
 
-.. code-block:: bash
+.. code-block:: console
 
 $ airflow config get-value secrets backend
 
airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend



[GitHub] [airflow] michalslowikowski00 commented on a change in pull request #10246: Added DataprepGetJobsForJobGroupOperator

2020-08-09 Thread GitBox


michalslowikowski00 commented on a change in pull request #10246:
URL: https://github.com/apache/airflow/pull/10246#discussion_r467566854



##
File path: tests/providers/google/cloud/hooks/test_dataprep.py
##
@@ -0,0 +1,97 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from unittest import mock
+
+import pytest
+from mock import patch
+from requests import HTTPError
+from tenacity import RetryError
+
+from airflow.providers.google.cloud.hooks import dataprep
+
+JOB_ID = 1234567
+URL = "https://api.clouddataprep.com/v4/jobGroups"
+TOKEN = ""
+EXTRA = {"token": TOKEN}
+
+
+@pytest.fixture(scope="class")
+def mock_hook():
+with mock.patch("airflow.hooks.base_hook.BaseHook.get_connection") as conn:
+hook = dataprep.GoogleDataprepHook(dataprep_conn_id="dataprep_conn_id")
+conn.return_value.extra_dejson = EXTRA
+yield hook
+
+
+class TestGoogleDataprepHook:
+def test_get_token(self, mock_hook):
+assert mock_hook._token == TOKEN
+
+@patch("airflow.providers.google.cloud.hooks.dataprep.requests.get")
+def test_mock_should_be_called_once_with_params(self, mock_get_request, 
mock_hook):
+mock_hook.get_jobs_for_job_group(job_id=JOB_ID)
+mock_get_request.assert_called_once_with(
+f"{URL}/{JOB_ID}/jobs",
+headers={
+"Content-Type": "application/json",
+"Authorization": f"Bearer {TOKEN}",
+},
+)
+
+@patch(
+"airflow.providers.google.cloud.hooks.dataprep.requests.get",
+side_effect=[HTTPError(), mock.MagicMock()],
+)
+def test_should_pass_after_retry(self, mock_get_request, mock_hook):
+mock_hook.get_jobs_for_job_group(JOB_ID)
+assert mock_get_request.call_count == 2
+
+@patch(
+"airflow.providers.google.cloud.hooks.dataprep.requests.get",
+side_effect=[mock.MagicMock(), HTTPError()],
+)
+def test_should_not_retry_after_success(self, mock_get_request, mock_hook):
+mock_hook.get_jobs_for_job_group.retry.sleep = mock.Mock()  # pylint: 
disable=no-member
+mock_hook.get_jobs_for_job_group(JOB_ID)
+assert mock_get_request.call_count == 1
+
+@patch(
+"airflow.providers.google.cloud.hooks.dataprep.requests.get",
+side_effect=[
+HTTPError(),
+HTTPError(),
+HTTPError(),
+HTTPError(),
+mock.MagicMock(),
+],
+)
+def test_should_retry_after_four_errors(self, mock_get_request, mock_hook):
+mock_hook.get_jobs_for_job_group.retry.sleep = mock.Mock()  # pylint: 
disable=no-member
+mock_hook.get_jobs_for_job_group(JOB_ID)
+assert mock_get_request.call_count == 5
+
+@patch(
+"airflow.providers.google.cloud.hooks.dataprep.requests.get",
+side_effect=[HTTPError(), HTTPError(), HTTPError(), HTTPError(), 
HTTPError()],
+)
+def test_raise_error_after_five_calls(self, mock_get_request, mock_hook):

Review comment:
   I did not mention anywhere else that the tenacity retry makes 5 calls. This 
case is only my interpretation. I was suggesting how it is already done 
in Airflow -- I mean `num_retries`. I am open to suggestions.
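   For context, a standalone sketch of the retry behaviour these tests exercise 
(the decorator arguments are assumptions inferred from the tests, not copied 
from the hook):
   ```python
   from tenacity import retry, stop_after_attempt, wait_exponential

   # Assumption: "5 calls" means stop_after_attempt(5); the wait is kept tiny for the demo.
   @retry(stop=stop_after_attempt(5), wait=wait_exponential(multiplier=0.01))
   def flaky_call(responses):
       result = responses.pop(0)
       if isinstance(result, Exception):
           raise result
       return result

   # Four failures followed by a success -> the fifth attempt returns "ok".
   print(flaky_call([ValueError("boom")] * 4 + ["ok"]))
   ```
   With five failures in a row the decorator gives up and raises 
`tenacity.RetryError`, which is presumably what `test_raise_error_after_five_calls` 
checks.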





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #10227: Use Hash of Serialized DAG to determine DAG is changed or not

2020-08-09 Thread GitBox


kaxil commented on a change in pull request #10227:
URL: https://github.com/apache/airflow/pull/10227#discussion_r467569173



##
File path: airflow/models/serialized_dag.py
##
@@ -76,6 +82,7 @@ def __init__(self, dag: DAG):
 self.fileloc_hash = DagCode.dag_fileloc_hash(self.fileloc)
 self.data = SerializedDAG.to_dict(dag)
 self.last_updated = timezone.utcnow()
+self.dag_hash = hashlib.md5(json.dumps(self.data, 
sort_keys=True).encode("utf-8")).hexdigest()

Review comment:
   The main reason for using Hash of a Serialized DAG is related to the 
question I asked Airbnb folks:
   
   
![image](https://user-images.githubusercontent.com/8811558/89730618-39257000-da38-11ea-828f-076cff27102e.png)
   
   >Question for Cong Zhu: Do you store the hash of the DAG file? What happens when 
the modules imported by the DAG file change but not the DAG file itself? In that 
case the DAG file hash remains the same, but the DAG might not be the same.
   >Cong Zhu: Yes, we store the hash of the serialized DAG
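   A tiny self-contained sketch of the hashing used above (the dict literals are 
made up and only stand in for `SerializedDAG.to_dict(dag)` output):
   ```python
   import hashlib
   import json

   # Same content, different key order -- sort_keys makes the hash identical.
   serialized_a = {"dag_id": "example", "tasks": ["t1", "t2"]}
   serialized_b = {"tasks": ["t1", "t2"], "dag_id": "example"}

   def dag_hash(data):
       return hashlib.md5(json.dumps(data, sort_keys=True).encode("utf-8")).hexdigest()

   assert dag_hash(serialized_a) == dag_hash(serialized_b)
   print(dag_hash(serialized_a))
   ```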
   
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on pull request #10256: fixes http hook using schema field from airflow.models.connection.Con…

2020-08-09 Thread GitBox


potiuk commented on pull request #10256:
URL: https://github.com/apache/airflow/pull/10256#issuecomment-671040183


   Happy to help with the System Tests as well if needed -> just ping us in the 
"#system-tests" channel in Airflow's Slack. The plan is to fully automate 
system test execution before we release 2.0.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] Minyus edited a comment on issue #10037: KubernetesPodOperator namespace argument conflict when using pod_template_file

2020-08-09 Thread GitBox


Minyus edited a comment on issue #10037:
URL: https://github.com/apache/airflow/issues/10037#issuecomment-671041151


   I'm trying to use `pod_template_file` with `in_cluster=True` using Airflow 
1.10.11 (Docker image at 
[apache/airflow:1.10.11-python3.6](https://hub.docker.com/layers/apache/airflow/1.10.11-python3.6/images/sha256-5b43a4b820eb229ea6910aafcddc2d18de745a69817c0256cadc74f6faa2d3f5?context=explore))
 and getting the same symptom.
   
   I checked the 
[test_kubernetes_pod_operator.py](https://github.com/apache/airflow/blob/1.10.11/kubernetes_tests/test_kubernetes_pod_operator.py),
 but apparently the cases of `in_cluster=True` was not tested well.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] Minyus commented on issue #10037: KubernetesPodOperator namespace argument conflict when using pod_template_file

2020-08-09 Thread GitBox


Minyus commented on issue #10037:
URL: https://github.com/apache/airflow/issues/10037#issuecomment-671041151


   I'm trying to use `pod_template_file` with `in_cluster=True` using Airflow 
1.10.11 (Docker image from 
[apache/airflow:1.10.11-python3.6](https://hub.docker.com/layers/apache/airflow/1.10.11-python3.6/images/sha256-5b43a4b820eb229ea6910aafcddc2d18de745a69817c0256cadc74f6faa2d3f5?context=explore))
 and getting the same symptom.
   
   I checked the 
[test_kubernetes_pod_operator.py](https://github.com/apache/airflow/blob/1.10.11/kubernetes_tests/test_kubernetes_pod_operator.py),
 but apparently the cases of `in_cluster=True` was not tested well.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] Minyus edited a comment on issue #10037: KubernetesPodOperator namespace argument conflict when using pod_template_file

2020-08-09 Thread GitBox


Minyus edited a comment on issue #10037:
URL: https://github.com/apache/airflow/issues/10037#issuecomment-671041151


   I'm trying to use `pod_template_file` with `in_cluster=True` using Airflow 
1.10.11 (Docker image at 
[apache/airflow:1.10.11-python3.6](https://hub.docker.com/layers/apache/airflow/1.10.11-python3.6/images/sha256-5b43a4b820eb229ea6910aafcddc2d18de745a69817c0256cadc74f6faa2d3f5?context=explore))
 and getting the same symptom.
   
   I checked the 
[test_kubernetes_pod_operator.py](https://github.com/apache/airflow/blob/1.10.11/kubernetes_tests/test_kubernetes_pod_operator.py),
 but apparently the cases of `in_cluster=True` were not tested well.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] ljb7977 edited a comment on pull request #9022: Fix AwsGlueJobSensor

2020-08-09 Thread GitBox


ljb7977 edited a comment on pull request #9022:
URL: https://github.com/apache/airflow/pull/9022#issuecomment-669986197


   > @ljb7977 how is this going? Could you please also do a rebase together 
with the changes? :)
   
   @feluelle Sorry for the late reply. I made the changes you suggested and 
fixed some tests.
   Question: is it OK for AwsGlueJobHook to keep context about the glue job? I 
mean, can the hook keep the run_id of the glue job it invoked, or should it be 
static?
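   To make the question concrete, here is a purely hypothetical sketch of the two 
shapes being discussed (names and return values are made up; this is not the 
AwsGlueJobHook API):
   ```python
   class StatefulGlueHook:
       """Variant A: the hook remembers the run it started."""

       def __init__(self):
           self.job_run_id = None

       def initialize_job(self, job_name):
           # stand-in for Glue's start_job_run call
           self.job_run_id = f"jr_{job_name}_001"
           return self.job_run_id

       def get_job_state(self):
           # uses the stored run id, so callers don't pass it back
           return "RUNNING", self.job_run_id


   class StatelessGlueHook:
       """Variant B: callers keep the run id and pass it explicitly."""

       @staticmethod
       def get_job_state(job_name, run_id):
           return "RUNNING", job_name, run_id
   ```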



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kubatyszko commented on pull request #10256: fixes http hook using schema field from airflow.models.connection.Con…

2020-08-09 Thread GitBox


kubatyszko commented on pull request #10256:
URL: https://github.com/apache/airflow/pull/10256#issuecomment-671083935


   I'm reconsidering my solution to the issue.
   I think the conn_type always needs to be "http" - meaning the protocol - while 
the scheme is supposed to reflect http or https.
   With the connection stored as a secret in AWS, and a URL like 
"https://foo.bar/SCHEME" - the scheme would have been populated from the SCHEME 
portion, which is unintuitive and doesn't reflect the URI format at all.
   Going to investigate further whether it's something that needs fixing in the 
http hook or in the secrets backend.
   
   Also, with my change, this test is failing now:
   
   ```
   __ 
TestHttpHook.test_https_connection 
__
   
   self = 
   mock_get_connection = 
   
   @mock.patch('airflow.providers.http.hooks.http.HttpHook.get_connection')
   def test_https_connection(self, mock_get_connection):
   conn = Connection(conn_id='http_default', conn_type='http',
 host='localhost', schema='https')
   mock_get_connection.return_value = conn
   hook = HttpHook()
   hook.get_conn({})
   >   self.assertEqual(hook.base_url, 'https://localhost')
   E   AssertionError: 'http://localhost' != 'https://localhost'
   E   - http://localhost
   E   + https://localhost
   E   ? +
   
   test_http.py:305: AssertionError
   ```
   (it succeeds without my change).
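   To make the distinction concrete, here is a minimal, hypothetical sketch (not 
the hook's actual code) of the two ways the protocol can be derived from a 
Connection; the failing test above expects the `schema` field to win:
   ```python
   from airflow.models import Connection

   conn = Connection(conn_id="http_default", conn_type="http",
                     host="localhost", schema="https")

   # What the failing test expects: the schema field carries the protocol.
   base_url_from_schema = f"{conn.schema or 'http'}://{conn.host}"        # https://localhost

   # What the changed hook currently produces: conn_type carries the protocol.
   base_url_from_conn_type = f"{conn.conn_type or 'http'}://{conn.host}"  # http://localhost

   print(base_url_from_schema, base_url_from_conn_type)
   ```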
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil merged pull request #10247: Create separate section for Cron Presets

2020-08-09 Thread GitBox


kaxil merged pull request #10247:
URL: https://github.com/apache/airflow/pull/10247


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch master updated: Create separate section for Cron Presets (#10247)

2020-08-09 Thread kaxilnaik
This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/master by this push:
 new 637a2c1  Create separate section for Cron Presets (#10247)
637a2c1 is described below

commit 637a2c1d8b13efd47be19d3f0087bc7ab732b9a9
Author: Kaxil Naik 
AuthorDate: Sun Aug 9 10:23:12 2020 +0100

Create separate section for Cron Presets (#10247)
---
 docs/dag-run.rst | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/docs/dag-run.rst b/docs/dag-run.rst
index 06aa4bd..d7b4de1 100644
--- a/docs/dag-run.rst
+++ b/docs/dag-run.rst
@@ -28,7 +28,10 @@ a ``str``, or a ``datetime.timedelta`` object.
 .. tip::
 You can use an online editor for CRON expressions such as `Crontab guru 
`_
 
-Alternatively, you can also use one of these cron "presets":
+Alternatively, you can also use one of these cron "presets".
+
+Cron Presets
+
 
 
+++-+
 | preset | meaning 
   | cron|
@@ -52,7 +55,7 @@ Alternatively, you can also use one of these cron "presets":
 
+++-+
 
 Your DAG will be instantiated for each schedule along with a corresponding
-DAG Run entry in the database backend.
+DAG Run entry in the database backend.
 
 .. note::
 



[airflow] branch constraints-master updated: Updating constraints. GH run id:201193219

2020-08-09 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch constraints-master
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/constraints-master by this 
push:
 new 64e06a3  Updating constraints. GH run id:201193219
64e06a3 is described below

commit 64e06a3ef609cdd9f95e369eaff49e17a09dad04
Author: Automated Github Actions commit 
AuthorDate: Sun Aug 9 10:04:57 2020 +

Updating constraints. GH run id:201193219

This update in constraints is automatically committed by the CI 
'constraints-push' step based on
HEAD of 'refs/heads/master' in 'apache/airflow'
with commit sha 637a2c1d8b13efd47be19d3f0087bc7ab732b9a9.

All tests passed in this build so we determined we can push the updated 
constraints.

See 
https://github.com/apache/airflow/blob/master/README.md#installing-from-pypi 
for details.
---
 constraints-3.7.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/constraints-3.7.txt b/constraints-3.7.txt
index 917e542..bd2bf18 100644
--- a/constraints-3.7.txt
+++ b/constraints-3.7.txt
@@ -111,7 +111,7 @@ docker-pycreds==0.4.0
 docker==3.7.3
 docopt==0.6.2
 docutils==0.16
-ecdsa==0.15
+ecdsa==0.14.1
 elasticsearch-dbapi==0.1.0
 elasticsearch-dsl==7.2.1
 elasticsearch==7.5.1



[GitHub] [airflow] mik-laj commented on a change in pull request #10257: Improve guide about Google Cloud Secret Manager Backend

2020-08-09 Thread GitBox


mik-laj commented on a change in pull request #10257:
URL: https://github.com/apache/airflow/pull/10257#discussion_r467563553



##
File path: docs/howto/secrets-backend/google-cloud-secret-manager-backend.rst
##
@@ -26,7 +26,15 @@ a secret backend and how to manage secrets.
 Before you begin
 
 
-`Configure Secret Manager and your local environment 
`__, 
once per project.
+Before you start, make sure you have performed the following tasks:
+
+1.  Include sendgrid subpackage as part of your Airflow installation
+
+.. code-block:: bash
+
+pip install airflow[google]

Review comment:
   I can't add `because pre-commit has a problem with this, but I updated 
the package name.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on a change in pull request #10257: Improve guide about Google Cloud Secret Manager Backend

2020-08-09 Thread GitBox


mik-laj commented on a change in pull request #10257:
URL: https://github.com/apache/airflow/pull/10257#discussion_r467563372



##
File path: docs/howto/secrets-backend/google-cloud-secret-manager-backend.rst
##
@@ -26,7 +26,15 @@ a secret backend and how to manage secrets.
 Before you begin
 
 
-`Configure Secret Manager and your local environment 
`__, 
once per project.
+Before you start, make sure you have performed the following tasks:
+
+1.  Include sendgrid subpackage as part of your Airflow installation
+
+.. code-block:: bash
+
+pip install airflow[google]

Review comment:
   ```suggestion
   pip install apaache-airflow[google]
   ```

##
File path: docs/howto/secrets-backend/google-cloud-secret-manager-backend.rst
##
@@ -26,7 +26,15 @@ a secret backend and how to manage secrets.
 Before you begin
 
 
-`Configure Secret Manager and your local environment 
`__, 
once per project.
+Before you start, make sure you have performed the following tasks:
+
+1.  Include sendgrid subpackage as part of your Airflow installation
+
+.. code-block:: bash
+
+pip install airflow[google]

Review comment:
   ```suggestion
   pip install apache-airflow[google]
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] michalslowikowski00 commented on a change in pull request #10246: Added DataprepGetJobsForJobGroupOperator

2020-08-09 Thread GitBox


michalslowikowski00 commented on a change in pull request #10246:
URL: https://github.com/apache/airflow/pull/10246#discussion_r467565793



##
File path: airflow/providers/google/cloud/hooks/dataprep.py
##
@@ -0,0 +1,74 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+This module contains Google Dataprep hook.
+"""
+
+import requests
+from tenacity import retry, stop_after_attempt, wait_exponential
+
+from airflow import AirflowException
+from airflow.hooks.base_hook import BaseHook
+
+
+class GoogleDataprepHook(BaseHook):
+"""
+Hook for connection with Dataprep API.
+To get connection Dataprep with Airflow you need Dataprep token.
+https://clouddataprep.com/documentation/api#section/Authentication
+
+It should be added to the Connection in Airflow in JSON format.
+
+"""
+
+def __init__(self, dataprep_conn_id: str = "dataprep_conn_id") -> None:
+super().__init__()
+self.dataprep_conn_id = dataprep_conn_id
+self._url = "https://api.clouddataprep.com/v4/jobGroups"
+
+@property
+def _headers(self):

Review comment:
   Fixed.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] michalslowikowski00 commented on a change in pull request #10246: Added DataprepGetJobsForJobGroupOperator

2020-08-09 Thread GitBox


michalslowikowski00 commented on a change in pull request #10246:
URL: https://github.com/apache/airflow/pull/10246#discussion_r467565824



##
File path: tests/providers/google/cloud/hooks/test_dataprep.py
##
@@ -0,0 +1,97 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from unittest import mock
+
+import pytest
+from mock import patch
+from requests import HTTPError
+from tenacity import RetryError
+
+from airflow.providers.google.cloud.hooks import dataprep
+
+JOB_ID = 1234567
+URL = "https://api.clouddataprep.com/v4/jobGroups"
+TOKEN = ""
+EXTRA = {"token": TOKEN}
+
+
+@pytest.fixture(scope="class")
+def mock_hook():
+with mock.patch("airflow.hooks.base_hook.BaseHook.get_connection") as conn:
+hook = dataprep.GoogleDataprepHook(dataprep_conn_id="dataprep_conn_id")
+conn.return_value.extra_dejson = EXTRA
+yield hook
+
+
+class TestGoogleDataprepHook:
+def test_get_token(self, mock_hook):
+assert mock_hook._token == TOKEN
+
+@patch("airflow.providers.google.cloud.hooks.dataprep.requests.get")
+def test_mock_should_be_called_once_with_params(self, mock_get_request, 
mock_hook):
+mock_hook.get_jobs_for_job_group(job_id=JOB_ID)
+mock_get_request.assert_called_once_with(
+f"{URL}/{JOB_ID}/jobs",
+headers={
+"Content-Type": "application/json",
+"Authorization": f"Bearer {TOKEN}",
+},
+)
+
+@patch(
+"airflow.providers.google.cloud.hooks.dataprep.requests.get",
+side_effect=[HTTPError(), mock.MagicMock()],
+)
+def test_should_pass_after_retry(self, mock_get_request, mock_hook):
+mock_hook.get_jobs_for_job_group(JOB_ID)
+assert mock_get_request.call_count == 2
+
+@patch(
+"airflow.providers.google.cloud.hooks.dataprep.requests.get",
+side_effect=[mock.MagicMock(), HTTPError()],
+)
+def test_should_not_retry_after_success(self, mock_get_request, mock_hook):
+mock_hook.get_jobs_for_job_group.retry.sleep = mock.Mock()  # pylint: 
disable=no-member
+mock_hook.get_jobs_for_job_group(JOB_ID)
+assert mock_get_request.call_count == 1
+
+@patch(
+"airflow.providers.google.cloud.hooks.dataprep.requests.get",
+side_effect=[

Review comment:
   :)





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil opened a new pull request #10258: Add Syntax Highlights to code-blocks in docs/best-practices.rst

2020-08-09 Thread GitBox


kaxil opened a new pull request #10258:
URL: https://github.com/apache/airflow/pull/10258


   **Before**:
   
![image](https://user-images.githubusercontent.com/8811558/89730390-037f8780-da36-11ea-9006-f1c505dbc657.png)
   
   
   **After**:
   
![image](https://user-images.githubusercontent.com/8811558/89730397-07aba500-da36-11ea-8017-68c42a7d037d.png)
   
   
   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #10227: Use Hash of Serialized DAG to determine DAG is changed or not

2020-08-09 Thread GitBox


kaxil commented on a change in pull request #10227:
URL: https://github.com/apache/airflow/pull/10227#discussion_r467569001



##
File path: airflow/models/serialized_dag.py
##
@@ -65,6 +66,11 @@ class SerializedDagModel(Base):
 fileloc_hash = Column(BigInteger, nullable=False)
 data = Column(sqlalchemy_jsonfield.JSONField(json=json), nullable=False)
 last_updated = Column(UtcDateTime, nullable=False)
+# TODO: Make dag_hash not nullable in Airflow 1.10.13 or Airflow 2.0??

Review comment:
   Good point. I was tempted to do that; the only reason I didn't was to avoid 
giving any impression that it is a hash, but I think that might not be a 
problem. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #10227: Use Hash of Serialized DAG to determine DAG is changed or not

2020-08-09 Thread GitBox


potiuk commented on a change in pull request #10227:
URL: https://github.com/apache/airflow/pull/10227#discussion_r467570749



##
File path: airflow/models/serialized_dag.py
##
@@ -76,6 +82,7 @@ def __init__(self, dag: DAG):
 self.fileloc_hash = DagCode.dag_fileloc_hash(self.fileloc)
 self.data = SerializedDAG.to_dict(dag)
 self.last_updated = timezone.utcnow()
+self.dag_hash = hashlib.md5(json.dumps(self.data, 
sort_keys=True).encode("utf-8")).hexdigest()

Review comment:
   Ah Right! Makes sense indeed. Great to get some input from AirBnB folks 
on that one :)





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] tooptoop4 opened a new pull request #7637: [AIRFLOW-6994] SparkSubmitOperator re-launches spark driver even when original driver still running

2020-08-09 Thread GitBox


tooptoop4 opened a new pull request #7637:
URL: https://github.com/apache/airflow/pull/7637


   ---
   Issue link: 
[AIRFLOW-6994](https://issues.apache.org/jira/browse/AIRFLOW-6994)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = 
JIRA ID*
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   * For document-only changes commit message can start with 
`[AIRFLOW-]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-6994) SparkSubmitOperator re launches spark driver even when original driver still running

2020-08-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17173812#comment-17173812
 ] 

ASF GitHub Bot commented on AIRFLOW-6994:
-

tooptoop4 opened a new pull request #7637:
URL: https://github.com/apache/airflow/pull/7637


   ---
   Issue link: 
[AIRFLOW-6994](https://issues.apache.org/jira/browse/AIRFLOW-6994)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = 
JIRA ID*
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Commits follow "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow 
Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   * For document-only changes commit message can start with 
`[AIRFLOW-]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)
 for more information.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> SparkSubmitOperator re launches spark driver even when original driver still 
> running
> 
>
> Key: AIRFLOW-6994
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6994
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10.8, 1.10.9
>Reporter: t oo
>Assignee: t oo
>Priority: Major
> Fix For: 2.0.0
>
>
> https://issues.apache.org/jira/browse/AIRFLOW-6229 introduced a bug
> Due to temporary network blip in connection to spark the state goes to 
> unknown (as no tags found in curl response) and forces retry
> fix in spark_submit_hook.py:
>   
> {code:java}
>   def _process_spark_status_log(self, itr):
> """
> parses the logs of the spark driver status query process
> :param itr: An iterator which iterates over the input of the 
> subprocess
> """
> response_found = False
> driver_found = False
> # Consume the iterator
> for line in itr:
> line = line.strip()
> if "submissionId" in line:
> response_found = True
> 
> # Check if the log line is about the driver status and extract 
> the status.
> if "driverState" in line:
> self._driver_status = line.split(' : ')[1] \
> .replace(',', '').replace('\"', '').strip()
> driver_found = True
> self.log.debug("spark driver status log: {}".format(line))
> if response_found and not driver_found:
> self._driver_status = "UNKNOWN"
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [airflow-site] yesemsanthoshkumar opened a new pull request #279: Move announcements page from confluence to website

2020-08-09 Thread GitBox


yesemsanthoshkumar opened a new pull request #279:
URL: https://github.com/apache/airflow-site/pull/279


   Resolves https://github.com/apache/airflow/issues/10196



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on a change in pull request #5499: [AIRFLOW-3964][AIP-17] Build smart sensor

2020-08-09 Thread GitBox


kaxil commented on a change in pull request #5499:
URL: https://github.com/apache/airflow/pull/5499#discussion_r467573812



##
File path: airflow/utils/log/file_task_handler.py
##
@@ -71,15 +71,25 @@ def close(self):
 
 def _render_filename(self, ti, try_number):
 if self.filename_jinja_template:
-jinja_context = ti.get_template_context()
-jinja_context['try_number'] = try_number
+if hasattr(ti, 'task'):
+jinja_context = ti.get_template_context()
+jinja_context['try_number'] = try_number
+else:
+jinja_context = {
+'ti': ti,
+'ts': ti.execution_date.isoformat(),
+'try_number': try_number,
+}
 return self.filename_jinja_template.render(**jinja_context)
 
 return self.filename_template.format(dag_id=ti.dag_id,
  task_id=ti.task_id,
  
execution_date=ti.execution_date.isoformat(),
  try_number=try_number)
 
+def _read_grouped_logs(self):
+return False

Review comment:
   got it





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj opened a new pull request #10259: Fix redirects URLs

2020-08-09 Thread GitBox


mik-laj opened a new pull request #10259:
URL: https://github.com/apache/airflow/pull/10259


   
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request 
Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)**
 for more information.
   In case of fundamental code change, Airflow Improvement Proposal 
([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals))
 is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party 
License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in 
[UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on issue #10155: Airflow 1.10.10 + DAG SERIALIZATION = fails to start manually the DAG's operators

2020-08-09 Thread GitBox


kaxil commented on issue #10155:
URL: https://github.com/apache/airflow/issues/10155#issuecomment-671041668


   This has been fixed in 1.10.11 - https://github.com/apache/airflow/pull/8775



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj opened a new issue #10260: Automatic reference to configuration reference docs

2020-08-09 Thread GitBox


mik-laj opened a new issue #10260:
URL: https://github.com/apache/airflow/issues/10260


   Hello,
   
   I would like it to be possible to easily refer to the reference list with 
configuration options.  Currently, most of the guides that require a 
description of configuration options are similar to the description below.
   https://user-images.githubusercontent.com/12058428/89731407-177fb500-da47-11ea-9df5-8a17b910da35.png
   There are code literals in this text: ``backend``, ``[secrets]``. 
Unfortunately, these literals are just text and I wish they could be clicked to 
go to a reference list containing a description of the option/section.
   
   To do this, we should add a new reference type, make changes to the listing 
template, and then make changes to all descriptions. We can be inspired by the 
Django project.
   
https://github.com/django/django/blob/58a336a674a658c1cda6707fe1cacb56aaed3008/docs/_ext/djangodocs.py#L47-L52
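   For reference, a minimal sketch of what such an extension could look like, using 
the generic Sphinx `add_crossref_type` API; the directive/role name here is only 
an assumption, not an agreed design:
   ```python
   # docs/exts/airflow_config_refs.py -- hypothetical extension module
   def setup(app):
       # Registers ".. config-option:: name" targets and a :config-option:`name` role,
       # so mentions of options like ``backend`` can link to the configuration reference.
       app.add_crossref_type(
           directivename="config-option",
           rolename="config-option",
           indextemplate="pair: %s; configuration option",
       )
       return {"parallel_read_safe": True, "parallel_write_safe": True}
   ```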
   
   Best regards,
   Kamil Breguła



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on issue #9414: Environment variables reference

2020-08-09 Thread GitBox


mik-laj commented on issue #9414:
URL: https://github.com/apache/airflow/issues/9414#issuecomment-671044613


   @ghost Are you having any problems? Do you have any questions? I will be very 
happy to help if I can.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on a change in pull request #10259: Fix redirects URLs

2020-08-09 Thread GitBox


mik-laj commented on a change in pull request #10259:
URL: https://github.com/apache/airflow/pull/10259#discussion_r467577727



##
File path: docs/redirects.txt
##
@@ -68,5 +68,5 @@ howto/operator/google/firebase/index.rst 
howto/operator/google/index.rst
 
 # Other redirects
 howto/operator/http/http.rst howto/operator/http.rst
-docs/howto/operator/http/index.rst howto/operator/http.rst
-docs/howto/use-alternative-secrets-backend.rst 
howto/altenative-secrets-backends/index.rst
+howto/operator/http/index.rst howto/operator/http.rst
+howto/use-alternative-secrets-backend.rst 
howto/altenative-secrets-backends/index.rst

Review comment:
   Done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] aranjanthakur commented on issue #10166: Add capability to specify gunicorn access log format for airflow webserver

2020-08-09 Thread GitBox


aranjanthakur commented on issue #10166:
URL: https://github.com/apache/airflow/issues/10166#issuecomment-671048749


   Hi @mik-laj, I have made the required changes. Could you please guide me on 
how I can test it?
   I have run the integration tests in the Breeze env. I want to build it 
locally and test the webserver.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow-site] potiuk commented on pull request #279: Move announcements page from confluence to website

2020-08-09 Thread GitBox


potiuk commented on pull request #279:
URL: https://github.com/apache/airflow-site/pull/279#issuecomment-671048559


   Looks great!
   
   > Resolves 
[apache/airflow#10196](https://github.com/apache/airflow/issues/10196)
   > 
   > @potiuk I have the following questions.
   > 
   > 1. Nov 21, 2019
   >I've tried to link all contributors' github profile wherever mentioned. 
But I couldn't find the profile Id for Kevin Yang.
   See below :)
   
   > 2. Oct 18, 2019
   >Same update as Nov 22, 2019. Should I remove one of them?
   
   Yep. mistake. Remove the Nov 22nd one.
   
   > 3. May 2, 2019
   >Need profile Id for Bas Harenslak, Joshua Carp, Kevin Yang. Similar to 
question no. 1.
   
   1) 3)  Missing profiles: @KevinYang21 @BasPH @jmcarp 
   
   > 4. April 22, 2016
   >Migrating to Apache Phrase points to announcements confluence page. 
Should we repoint this to the website?
   
   Yes please!
   
   Few  comments:
   1) It would be great to add some whitespace at the top of the Announcement 
page (as it is for other pages). Currently the header covers half of the first 
line:
   
   ![Screenshot from 2020-08-09 
14-51-55](https://user-images.githubusercontent.com/595491/89732555-e6f04900-da4f-11ea-8be5-e83759c6144e.png)
   
   2) The list of the backport providers also has far too much space in it. It 
may be better to have them as bullets rather than a numbered list to squeeze 
them together.
   
   ![Screenshot from 2020-08-09 
14-52-42](https://user-images.githubusercontent.com/595491/89732581-fec7cd00-da4f-11ea-9954-a750b08d6cf0.png)
   
   BTW, I am not sure if you realize it, but you can easily preview the 
generated pages now. It is enough to download the artifact from the PR, extract 
it, and run `python -m http.server`; you will then be able to preview it at 
http://localhost:8000
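   
   For example, a minimal sketch of serving the extracted artifact directly 
from Python (the `airflow-site-preview` directory name is only an assumption 
about where you unpacked it):
   
   ```python
   from functools import partial
   from http.server import HTTPServer, SimpleHTTPRequestHandler
   
   # Serve the extracted site on http://localhost:8000, equivalent to running
   # `python -m http.server 8000 --directory airflow-site-preview`.
   handler = partial(SimpleHTTPRequestHandler, directory="airflow-site-preview")
   HTTPServer(("localhost", 8000), handler).serve_forever()
   ```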
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow-site] mik-laj commented on issue #275: The buttons for the Use Cases feel reversed

2020-08-09 Thread GitBox


mik-laj commented on issue #275:
URL: https://github.com/apache/airflow-site/issues/275#issuecomment-671050715


   @LeonY1  Can you take a screenshot and highlight which page item you are 
talking about? I can't find the problem either.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[airflow] branch master updated: Create "Managing variable" in howto directory (#10241)

2020-08-09 Thread kamilbregula
This is an automated email from the ASF dual-hosted git repository.

kamilbregula pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/master by this push:
 new 183cb8d  Create "Managing variable" in howto directory (#10241)
183cb8d is described below

commit 183cb8d56b876d72b21b84db9e218085a7a490d9
Author: Kamil Breguła 
AuthorDate: Sun Aug 9 16:33:11 2020 +0200

Create "Managing variable" in howto directory (#10241)
---
 docs/concepts.rst   | 33 +--
 docs/howto/connection/index.rst |  5 ++-
 docs/howto/index.rst|  1 +
 docs/howto/variable.rst | 72 +
 4 files changed, 76 insertions(+), 35 deletions(-)

diff --git a/docs/concepts.rst b/docs/concepts.rst
index a73d7c3..13f887a 100644
--- a/docs/concepts.rst
+++ b/docs/concepts.rst
@@ -788,38 +788,7 @@ or if you need to deserialize a json object from the 
variable :
 
 echo {{ var.json. }}
 
-Storing Variables in Environment Variables
---
-
-.. versionadded:: 1.10.10
-
-Airflow Variables can also be created and managed using Environment Variables. 
The environment variable
-naming convention is :envvar:`AIRFLOW_VAR_{VARIABLE_NAME}`, all uppercase.
-So if your variable key is ``FOO`` then the variable name should be 
``AIRFLOW_VAR_FOO``.
-
-For example,
-
-.. code-block:: bash
-
-export AIRFLOW_VAR_FOO=BAR
-
-# To use JSON, store them as JSON strings
-export AIRFLOW_VAR_FOO_BAZ='{"hello":"world"}'
-
-You can use them in your DAGs as:
-
-.. code-block:: python
-
-from airflow.models import Variable
-foo = Variable.get("foo")
-foo_json = Variable.get("foo_baz", deserialize_json=True)
-
-.. note::
-
-Single underscores surround ``VAR``.  This is in contrast with the way 
``airflow.cfg``
-parameters are stored, where double underscores surround the config 
section name.
-Variables set using Environment Variables would not appear in the Airflow 
UI but you will
-be able to use it in your DAG file.
+See :doc:`howto/variable` for details on managing variables.
 
 Branching
 =
diff --git a/docs/howto/connection/index.rst b/docs/howto/connection/index.rst
index 7f4a869..8b56271 100644
--- a/docs/howto/connection/index.rst
+++ b/docs/howto/connection/index.rst
@@ -309,9 +309,8 @@ Securing Connections
 
 
 Airflow uses `Fernet `__ to encrypt passwords 
in the connection
-configurations stored the metastore database. It guarantees that without the 
encryption password, Connection Passwords cannot be manipulated or read without 
the key.
-
-For information on configuring Fernet, look at :ref:`security/fernet`.
+configurations stored the metastore database. It guarantees that without the 
encryption password, Connection
+Passwords cannot be manipulated or read without the key. For information on 
configuring Fernet, look at :ref:`security/fernet`.
 
 In addition to retrieving connections from environment variables or the 
metastore database, you can enable
 an secrets backend to retrieve connections. For more details see 
:doc:`../secrets-backend/index`
diff --git a/docs/howto/index.rst b/docs/howto/index.rst
index a47dd20..1a76b71 100644
--- a/docs/howto/index.rst
+++ b/docs/howto/index.rst
@@ -37,6 +37,7 @@ configuring an Airflow environment.
 customize-state-colors-ui
 custom-operator
 connection/index
+variable
 write-logs
 run-behind-proxy
 run-with-systemd
diff --git a/docs/howto/variable.rst b/docs/howto/variable.rst
new file mode 100644
index 000..ce7bc02
--- /dev/null
+++ b/docs/howto/variable.rst
@@ -0,0 +1,72 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+
+Managing Variables
+==
+
+Variables are a generic way to store and retrieve arbitrary content or
+settings as a simple key value store within Airflow. Variables can be
+listed, created, updated and deleted from the UI (``Admin -> Variables``),
+code or CLI.
+
+.. image:: ../img/variable_hidden.png
+
+See the :ref:`Variables Concepts ` documentation for
+more 

[GitHub] [airflow] mik-laj merged pull request #10241: Create "Managing variable" in howto directory

2020-08-09 Thread GitBox


mik-laj merged pull request #10241:
URL: https://github.com/apache/airflow/pull/10241


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] kaxil commented on pull request #10227: Use Hash of Serialized DAG to determine DAG is changed or not

2020-08-09 Thread GitBox


kaxil commented on pull request #10227:
URL: https://github.com/apache/airflow/pull/10227#issuecomment-671029234


   Ping @potiuk 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] potiuk commented on a change in pull request #10227: Use Hash of Serialized DAG to determine DAG is changed or not

2020-08-09 Thread GitBox


potiuk commented on a change in pull request #10227:
URL: https://github.com/apache/airflow/pull/10227#discussion_r467566683



##
File path: airflow/models/serialized_dag.py
##
@@ -76,6 +82,7 @@ def __init__(self, dag: DAG):
 self.fileloc_hash = DagCode.dag_fileloc_hash(self.fileloc)
 self.data = SerializedDAG.to_dict(dag)
 self.last_updated = timezone.utcnow()
+self.dag_hash = hashlib.md5(json.dumps(self.data, 
sort_keys=True).encode("utf-8")).hexdigest()

Review comment:
   I believe it will be better to use a hash of the file itself rather than of 
the JSON representation. Not only will it be faster (you would not have to 
parse the file and create a SerializedDAG), it will also be more accurate: 
there are some changes, such as comments or Jinja templates, that leave the 
serialized DAG unchanged. Of course that would still be OK (we care about the 
serialized representation in this case), but I think it would be more accurate 
to refresh the serialized DAG and its hash every time the file changes. It 
will also be much easier to reason about that hash and to verify whether it is 
correct or whether the file has changed since then: we just need to check the 
hash of the file on disk. 
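   
   A minimal sketch of the two approaches being discussed (the helper names 
are illustrative only, not the actual Airflow code):
   
   ```python
   import hashlib
   import json
   
   
   def hash_of_serialized_dag(data: dict) -> str:
       # Approach in this PR: hash the sorted JSON form of the serialized DAG.
       return hashlib.md5(
           json.dumps(data, sort_keys=True).encode("utf-8")
       ).hexdigest()
   
   
   def hash_of_dag_file(fileloc: str) -> str:
       # Suggested alternative: hash the raw DAG file bytes. No parsing or
       # serialization is needed, and any change to the file (even a comment)
       # triggers a refresh of the serialized DAG.
       with open(fileloc, "rb") as f:
           return hashlib.md5(f.read()).hexdigest()
   ```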

##
File path: airflow/models/serialized_dag.py
##
@@ -65,6 +66,11 @@ class SerializedDagModel(Base):
 fileloc_hash = Column(BigInteger, nullable=False)
 data = Column(sqlalchemy_jsonfield.JSONField(json=json), nullable=False)
 last_updated = Column(UtcDateTime, nullable=False)
+# TODO: Make dag_hash not nullable in Airflow 1.10.13 or Airflow 2.0??

Review comment:
   We can make it not nullable and add a default value in the migration 
("Hash not calculated yet"). This way it will be refreshed automagically with 
the next run and we will not have to worry about nullability.
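   
   For example, such a migration could look roughly like this (the revision 
identifiers, column size and placeholder default are assumptions, not the 
actual migration):
   
   ```python
   """Add dag_hash column to the serialized_dag table (sketch only)."""
   import sqlalchemy as sa
   from alembic import op
   
   # Placeholder revision identifiers; a real migration would chain these to
   # the current head.
   revision = "aabbccddeeff"
   down_revision = None
   
   
   def upgrade():
       # Non-nullable with a placeholder default so that existing rows remain
       # valid; the real hash is written on the next serialized DAG refresh.
       op.add_column(
           "serialized_dag",
           sa.Column(
               "dag_hash",
               sa.String(32),
               nullable=False,
               server_default="Hash not calculated yet",
           ),
       )
   
   
   def downgrade():
       op.drop_column("serialized_dag", "dag_hash")
   ```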

##
File path: airflow/models/serialized_dag.py
##
@@ -102,9 +109,11 @@ def write_dag(cls, dag: DAG, min_update_interval: 
Optional[int] = None, session:
 return
 
 log.debug("Checking if DAG (%s) changed", dag.dag_id)
-serialized_dag_from_db: SerializedDagModel = 
session.query(cls).get(dag.dag_id)
 new_serialized_dag = cls(dag)
-if serialized_dag_from_db and (serialized_dag_from_db.data == 
new_serialized_dag.data):
+serialized_dag_hash_from_db = session.query(
+cls.dag_hash).filter(cls.dag_id == dag.dag_id).scalar()
+
+if serialized_dag_hash_from_db and (serialized_dag_hash_from_db == 
new_serialized_dag.dag_hash):

Review comment:
   If we make the column not nullable with a default, this check might be 
just ==
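   
   Something along these lines, i.e. a tiny self-contained illustration of why 
the extra truthiness check can be dropped once the column always has a value 
(names are illustrative only):
   
   ```python
   def dag_is_unchanged(hash_from_db, new_hash):
       # A plain equality check already covers the "no row stored yet" case,
       # because None never equals a real hash string.
       return hash_from_db == new_hash
   
   
   assert dag_is_unchanged(None, "abc123") is False       # no row yet -> write
   assert dag_is_unchanged("abc123", "abc123") is True    # same hash -> skip
   assert dag_is_unchanged("abc123", "def456") is False   # changed -> write
   ```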





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



