[jira] [Created] (AIRFLOW-3650) Fix flaky test in TestTriggerDag

2019-01-07 Thread Tao Feng (JIRA)
Tao Feng created AIRFLOW-3650:
-

 Summary: Fix flaky test in TestTriggerDag
 Key: AIRFLOW-3650
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3650
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Tao Feng
Assignee: Tao Feng






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] feng-tao commented on issue #4457: [AIRFLOW-3650] Skip running on mysql for the flaky test

2019-01-07 Thread GitBox
feng-tao commented on issue #4457: [AIRFLOW-3650] Skip running on mysql for the 
flaky test
URL: https://github.com/apache/airflow/pull/4457#issuecomment-452203627
 
 
   PTAL @kaxil  @Fokko 
   
   I think the issue is not with the actual test itself, but with the mysqlclient 
library. The session can't get the latest data from the ORM after running this 
line 
(https://github.com/apache/airflow/blob/master/tests/www_rbac/test_views.py#L1473)
 (I opened a local pdb session to confirm). I tried different mysqlclient 
versions, but none of them work. To unblock, I would suggest we skip the MySQL 
ORM for this test.
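
A minimal sketch of the skip being suggested (not the actual diff in #4457); it 
assumes the test can inspect Airflow's configured `sql_alchemy_conn` to detect a 
MySQL metadata DB backend, and the test body below is a stand-in:

```python
import unittest

from airflow.configuration import conf


def backend_is_mysql():
    # MySQL connection strings start with "mysql", e.g. mysql://user:pw@host/db.
    return conf.get('core', 'sql_alchemy_conn').lower().startswith('mysql')


class TestTriggerDag(unittest.TestCase):

    @unittest.skipIf(backend_is_mysql(),
                     "Flaky on the MySQL ORM backend, see AIRFLOW-3650")
    def test_trigger_dag_button(self):
        pass  # the real assertions live in tests/www_rbac/test_views.py
```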


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] jmcarp opened a new pull request #4458: [AIRFLOW-3648] Default to connection project id in gcp cloud sql.

2019-01-07 Thread GitBox
jmcarp opened a new pull request #4458: [AIRFLOW-3648] Default to connection 
project id in gcp cloud sql.
URL: https://github.com/apache/airflow/pull/4458
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-3648
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
   
   ### Code Quality
   
   - [x] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] jmcarp commented on a change in pull request #4390: [AIRFLOW-3584] Use ORM DAGs for index view.

2019-01-07 Thread GitBox
jmcarp commented on a change in pull request #4390: [AIRFLOW-3584] Use ORM DAGs 
for index view.
URL: https://github.com/apache/airflow/pull/4390#discussion_r245878602
 
 

 ##
 File path: airflow/www_rbac/compile_assets.sh
 ##
 @@ -23,6 +23,6 @@ if [ -d airflow/www_rbac/static/dist ]; then
 fi
 
 cd airflow/www_rbac/
-npm install
+# npm install
 
 Review comment:
   That's a typo, I'll revert it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao edited a comment on issue #4421: [AIRFLOW-3468] Remove KnownEvent(Event)?

2019-01-07 Thread GitBox
feng-tao edited a comment on issue #4421: [AIRFLOW-3468] Remove 
KnownEvent(Event)?
URL: https://github.com/apache/airflow/pull/4421#issuecomment-452174547
 
 
   @Fokko , regarding not having a migration script, I wonder what will happen 
in the following case:
   assume we have three tables ``A``, ``B``, and ``C``. We modify table ``A`` and 
provide an Alembic migration script (version 1), then delete table ``B`` without 
a script, then modify table ``C`` with another Alembic script (version 2). In 
this case, will the migration (upgrade / downgrade) run successfully from version 
1 to 2, or vice versa? If yes, I am +1 :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao commented on a change in pull request #4457: [AIRFLOW-XXX] Fix TestTriggerDag flaky test

2019-01-07 Thread GitBox
feng-tao commented on a change in pull request #4457: [AIRFLOW-XXX] Fix 
TestTriggerDag flaky test
URL: https://github.com/apache/airflow/pull/4457#discussion_r245870276
 
 

 ##
 File path: tests/www_rbac/test_views.py
 ##
 @@ -1464,12 +1464,9 @@ def test_trigger_dag_button_normal_exist(self):
 
 def test_trigger_dag_button(self):
 
-test_dag_id = "example_bash_operator"
+test_dag_id = "example_python_operator"
 
 DR = models.DagRun
-self.session.query(DR).delete()
-self.session.commit()
-
 self.client.get('trigger?dag_id={}'.format(test_dag_id))
 
 Review comment:
   Thanks. This PR is for testing only; I would like to see why the test fails 
particularly for the MySQL ORM.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] jmcarp commented on issue #4436: [AIRFLOW-3631] Update flake8 and fix lint.

2019-01-07 Thread GitBox
jmcarp commented on issue #4436: [AIRFLOW-3631] Update flake8 and fix lint.
URL: https://github.com/apache/airflow/pull/4436#issuecomment-452143443
 
 
   Updated.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator with git-sync

2019-01-07 Thread GitBox
kaxil commented on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator with 
git-sync
URL: https://github.com/apache/airflow/pull/4444#issuecomment-452132314
 
 
   @feng-tao I have included that commit as well.
   
   @dimberman  I have spent some time today and cherry-picked and resolved some 
conflicts (with some help from this PR - thank you guys), it would be great if 
you can verify if everything that was needed is there. And then we will try to 
resolve issues with tests if any.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao commented on issue #4421: [AIRFLOW-3468] Remove KnownEvent(Event)?

2019-01-07 Thread GitBox
feng-tao commented on issue #4421: [AIRFLOW-3468] Remove KnownEvent(Event)?
URL: https://github.com/apache/airflow/pull/4421#issuecomment-452119451
 
 
   but CI is failing.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao commented on issue #4421: [AIRFLOW-3468] Remove KnownEvent(Event)?

2019-01-07 Thread GitBox
feng-tao commented on issue #4421: [AIRFLOW-3468] Remove KnownEvent(Event)?
URL: https://github.com/apache/airflow/pull/4421#issuecomment-452119383
 
 
   @Fokko , I see your point. +1


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3645) Use a base_executor_config and merge operator level executor_config

2019-01-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736462#comment-16736462
 ] 

ASF GitHub Bot commented on AIRFLOW-3645:
-

Mokubyow commented on pull request #4456: [AIRFLOW-3645] Add 
base_executor_config
URL: https://github.com/apache/airflow/pull/4456
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW-3645)
   
   ### Description
   
   - [x] Add a base_executor_config that merges any operator-level 
executor_config into itself. This helps to DRY up KubernetesExecutor 
deployments that might need to pass an executor config to all operators.
   
   ### Tests
   
   - [x] My PR adds the following unit tests: `nosetests -v 
tests/utils/test_helpers.py:TestHelpers.test_dict_merge`
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
   
   ### Code Quality
   
   - [x] Passes `flake8`
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Use a base_executor_config and merge operator level executor_config
> ---
>
> Key: AIRFLOW-3645
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3645
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Kyle Hamlin
>Assignee: Kyle Hamlin
>Priority: Major
> Fix For: 1.10.2
>
>
> It would be very useful to have a `base_executor_config` and merge the base 
> config with any operator level `executor_config`.
> I imagine referencing a python dict similar to how we reference a custom 
> logging_config
> *Example config*
> {code:java}
> [core]
> base_executor_config = config.base_executor_config.BASE_EXECUTOR_CONFIG
> {code}
> *Example base_executor_config*
> {code:java}
> BASE_EXECUTOR_CONFIG = {
> "KubernetesExecutor": {
> "image_pull_policy": "Always",
> "annotations": {
> "iam.amazonaws.com/role": "arn:aws:iam::"
> },
> "volumes": [
> {
> "name": "airflow-lib",
> "persistentVolumeClaim": {
> "claimName": "airflow-lib"
> }
> }
> ],
> "volume_mounts": [
> {
> "name": "airflow-lib",
> "mountPath": "/usr/local/airflow/lib",
> }
> ]
> }
> }
> {code}
> *Example operator*
> {code:java}
> run_this = PythonOperator(
> task_id='print_the_context',
> provide_context=True,
> python_callable=print_context,
> executor_config={
> "KubernetesExecutor": {
> "request_memory": "256Mi",
> "request_cpu": "100m",
> "limit_memory": "256Mi",
> "limit_cpu": "100m"
> }
> },
> dag=dag)
> {code}
> Then we'll want to have a dict deep merge function that returns the 
> executor_config
> *Merge functionality*
> {code:java}
> import collections
> from airflow import conf
> from airflow.utils.module_loading import import_string
> def dict_merge(dct, merge_dct):
>     """ Recursive dict merge. Inspired by :meth:``dict.update()``, instead of
>     updating only top-level keys, dict_merge recurses down into dicts nested
>     to an arbitrary depth, updating keys. The ``merge_dct`` is merged into
>     ``dct``.
>     :param dct: dict onto which the merge is executed
>     :param merge_dct: dct merged into dct
>     :return: dct
>     """
>     for k, v in merge_dct.items():
>         if (k in dct and isinstance(dct[k], dict)
>                 and isinstance(merge_dct[k], collections.Mapping)):
>             dict_merge(dct[k], 

[jira] [Updated] (AIRFLOW-3647) Contributed SparkSubmitOperator doesn't honor --archives configuration

2019-01-07 Thread Ken Melms (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ken Melms updated AIRFLOW-3647:
---
Description: 
The contributed SparkSubmitOperator has no ability to honor the spark-submit 
configuration field "--archives" which is treated subtly different than "--files" 
or "--py-files" in that it will unzip the archive into the application's working 
directory, and can optionally add an alias to the unzipped folder so that you 
can refer to it elsewhere in your submission.

EG:

spark-submit  --archives=hdfs:user/someone/python35_venv.zip#PYTHON --conf 
"spark.yarn.appMasterEnv.PYSPARK_PYTHON=./PYTHON/python35/bin/python3" 
run_me.py  

In our case - this behavior allows for multiple python virtual environments to 
be sourced from HDFS without incurring the penalty of pushing the whole python 
virtual env to the cluster each submission.  This solves (for us) using 
python-based spark jobs on a cluster that the end user has no ability to define 
the python modules in use.

 

  was:
The contributed SparkSubmitOperator has no ability to honor the spark-submit 
configuration field "--archives" which is treated subtly different than 
"--files" or "--py-files" in that it will unzip the archive into the 
application's working directory, and can optionally add an alias to the 
unzipped folder so that you can refer to it elsewhere in your submission.

EG:

spark-submit  --archives=hdfs:user/someone/python35_venv.zip#PYTHON --conf 
"spark.yarn.appMasterEnv.PYSPARK_PYTHON=./PYTHON/python35/bin/python3" 
run_me.py  



In our case - this behavior allows for multiple python virtual environments to 
be sourced from HDFS without incurring the penalty of pushing the whole python 
virtual env to the cluster each submission.  This solves (for us) using 
python-based spark jobs on a cluster that the end user has no ability to define 
the python modules in use.

 


> Contributed SparkSubmitOperator doesn't honor --archives configuration
> --
>
> Key: AIRFLOW-3647
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3647
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Affects Versions: 1.10.1
> Environment: Linux (RHEL 7)
> Python 3.5 (using a virtual environment)
> spark-2.1.3-bin-hadoop26
> Airflow 1.10.1
> CDH 5.14 Hadoop [Yarn] cluster (no end user / dev modifications allowed)
>Reporter: Ken Melms
>Priority: Minor
>  Labels: easyfix, newbie
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> The contributed SparkSubmitOperator has no ability to honor the spark-submit 
> configuration field "--archives" which is treated subtly different than 
> "files" or "-py-files" in that it will unzip the archive into the 
> application's working directory, and can optionally add an alias to the 
> unzipped folder so that you can refer to it elsewhere in your submission.
> EG:
> spark-submit  --archives=hdfs:user/someone/python35_venv.zip#PYTHON 
> --conf "spark.yarn.appMasterEnv.PYSPARK_PYTHON=./PYTHON/python35/bin/python3" 
> run_me.py  
> In our case - this behavior allows for multiple python virtual environments 
> to be sourced from HDFS without incurring the penalty of pushing the whole 
> python virtual env to the cluster each submission.  This solves (for us) 
> using python-based spark jobs on a cluster that the end user has no ability 
> to define the python modules in use.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-3647) Contributed SparkSubmitOperator doesn't honor --archives configuration

2019-01-07 Thread Ken Melms (JIRA)
Ken Melms created AIRFLOW-3647:
--

 Summary: Contributed SparkSubmitOperator doesn't honor --archives 
configuration
 Key: AIRFLOW-3647
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3647
 Project: Apache Airflow
  Issue Type: Improvement
  Components: contrib
Affects Versions: 1.10.1
 Environment: Linux (RHEL 7)
Python 3.5 (using a virtual environment)
spark-2.1.3-bin-hadoop26
Airflow 1.10.1
CDH 5.14 Hadoop [Yarn] cluster (no end user / dev modifications allowed)

Reporter: Ken Melms


The contributed SparkSubmitOperator has no ability to honor the spark-submit 
configuration field "--archives" which is treated subtly different than 
"--files" or "--py-files" in that it will unzip the archive into the 
application's working directory, and can optionally add an alias to the 
unzipped folder so that you can refer to it elsewhere in your submission.

EG:

spark-submit  --archives=hdfs:user/someone/python35_venv.zip#PYTHON --conf 
"spark.yarn.appMasterEnv.PYSPARK_PYTHON=./PYTHON/python35/bin/python3" 
run_me.py  



In our case - this behavior allows for multiple python virtual environments to 
be sourced from HDFS without incurring the penalty of pushing the whole python 
virtual env to the cluster each submission.  This solves (for us) using 
python-based spark jobs on a cluster that the end user has no ability to define 
the python modules in use.
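
Purely as an illustration of the requested capability, a hypothetical DAG-level 
usage is sketched below; the `archives` argument does not exist on 
SparkSubmitOperator at the time of this issue, so its name, and the surrounding 
`dag` object, are assumptions:

```python
from airflow.contrib.operators.spark_submit_operator import SparkSubmitOperator

run_me = SparkSubmitOperator(
    task_id='run_me',
    application='run_me.py',
    # Hypothetical argument mirroring spark-submit's --archives flag.
    archives='hdfs:user/someone/python35_venv.zip#PYTHON',
    conf={'spark.yarn.appMasterEnv.PYSPARK_PYTHON':
          './PYTHON/python35/bin/python3'},
    dag=dag,  # assumes a DAG defined elsewhere
)
```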

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] felipegasparini commented on issue #4036: [AIRFLOW-2744] Allow RBAC to accept plugins for views and links.

2019-01-07 Thread GitBox
felipegasparini commented on issue #4036: [AIRFLOW-2744] Allow RBAC to accept 
plugins for views and links.
URL: https://github.com/apache/airflow/pull/4036#issuecomment-452113006
 
 
   We also need to fix the example in the documentation 
(https://airflow.apache.org/plugins.html#example). It is currently broken: the 
import fails, the view method has the wrong name, and the template path is missing.
   
   I made a sample project to make this plugin integration work: 
https://github.com/felipegasparini/airflow_plugin_rbac_test/blob/dbaa049a9996df275b1d90f74b93ffbf206bb1d5/airflow/plugins/test_plugin/test_plugin.py
   
   I will submit a PR to fix the doc later, but just posting it here since it 
may be useful for others.
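
For reference, a rough sketch of the plugin shape this PR enables is below; the 
class, view, and registration names are illustrative, and the linked sample 
project is the authoritative working version:

```python
from airflow.plugins_manager import AirflowPlugin
from flask_appbuilder import BaseView as AppBuilderBaseView, expose


class TestAppBuilderBaseView(AppBuilderBaseView):
    default_view = "test"

    @expose("/")
    def test(self):
        # Returning a plain string keeps this sketch independent of the
        # template-path problem mentioned above.
        return "Hello from the RBAC plugin view"


v_appbuilder_package = {
    "name": "Test View",
    "category": "Test Plugin",
    "view": TestAppBuilderBaseView(),
}


class AirflowTestPlugin(AirflowPlugin):
    name = "test_plugin"
    appbuilder_views = [v_appbuilder_package]
```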


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao commented on issue #4407: [AIRFLOW-3600] Remove dagbag from trigger

2019-01-07 Thread GitBox
feng-tao commented on issue #4407: [AIRFLOW-3600] Remove dagbag from trigger
URL: https://github.com/apache/airflow/pull/4407#issuecomment-452107835
 
 
   @Fokko , looking at the recent commits, this is the one that modifies this part 
of the code. And CI does not always fail on this test (sometimes it passes, 
sometimes not), hence I suspect this PR is the cause.
   
   Also, we are not sure whether CI was already broken when this PR was checked 
in, right?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on issue #4405: [AIRFLOW-3598] Add tests for MsSqlToHiveTransfer

2019-01-07 Thread GitBox
Fokko commented on issue #4405: [AIRFLOW-3598] Add tests for MsSqlToHiveTransfer
URL: https://github.com/apache/airflow/pull/4405#issuecomment-452107449
 
 
   Thanks!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Work started] (AIRFLOW-3645) Use a base_executor_config and merge operator level executor_config

2019-01-07 Thread Kyle Hamlin (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AIRFLOW-3645 started by Kyle Hamlin.

> Use a base_executor_config and merge operator level executor_config
> ---
>
> Key: AIRFLOW-3645
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3645
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Kyle Hamlin
>Assignee: Kyle Hamlin
>Priority: Major
> Fix For: 1.10.2
>
>
> It would be very useful to have a `base_executor_config` and merge the base 
> config with any operator level `executor_config`.
> I imagine referencing a python dict similar to how we reference a custom 
> logging_config
> *Example config*
> {code:java}
> [core]
> base_executor_config = config.base_executor_config.BASE_EXECUTOR_CONFIG
> {code}
> *Example base_executor_config*
> {code:java}
> BASE_EXECUTOR_CONFIG = {
> "KubernetesExecutor": {
> "image_pull_policy": "Always",
> "annotations": {
> "iam.amazonaws.com/role": "arn:aws:iam::"
> },
> "volumes": [
> {
> "name": "airflow-lib",
> "persistentVolumeClaim": {
> "claimName": "airflow-lib"
> }
> }
> ],
> "volume_mounts": [
> {
> "name": "airflow-lib",
> "mountPath": "/usr/local/airflow/lib",
> }
> ]
> }
> }
> {code}
> *Example operator*
> {code:java}
> run_this = PythonOperator(
> task_id='print_the_context',
> provide_context=True,
> python_callable=print_context,
> executor_config={
> "KubernetesExecutor": {
> "request_memory": "256Mi",
> "request_cpu": "100m",
> "limit_memory": "256Mi",
> "limit_cpu": "100m"
> }
> },
> dag=dag)
> {code}
> Then we'll want to have a dict deep merge function that returns the 
> executor_config
> *Merge functionality*
> {code:java}
> import collections
> from airflow import conf
> from airflow.utils.module_loading import import_string
> def dict_merge(dct, merge_dct):
>     """ Recursive dict merge. Inspired by :meth:``dict.update()``, instead of
>     updating only top-level keys, dict_merge recurses down into dicts nested
>     to an arbitrary depth, updating keys. The ``merge_dct`` is merged into
>     ``dct``.
>     :param dct: dict onto which the merge is executed
>     :param merge_dct: dct merged into dct
>     :return: dct
>     """
>     for k, v in merge_dct.items():
>         if (k in dct and isinstance(dct[k], dict)
>                 and isinstance(merge_dct[k], collections.Mapping)):
>             dict_merge(dct[k], merge_dct[k])
>         else:
>             dct[k] = merge_dct[k]
> 
>     return dct
> 
> def get_executor_config(executor_config):
>     """Try to import base_executor_config and merge it with provided
>     executor_config.
>     :param executor_config: operator level executor config
>     :return: dict"""
> 
>     try:
>         base_executor_config = import_string(
>             conf.get('core', 'base_executor_config'))
>         merged_executor_config = dict_merge(
>             base_executor_config, executor_config)
>         return merged_executor_config
>     except Exception:
>         return executor_config
> {code}
> Finally, we'll want to call the get_executor_config function in the 
> `BaseOperator` possibly here: 
> https://github.com/apache/airflow/blob/master/airflow/models/__init__.py#L2348
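
For illustration, this is how the proposed `dict_merge` quoted above would layer 
an operator-level config onto a base config (the values are illustrative and the 
snippet assumes `dict_merge` from the proposal is in scope):

```python
base = {"KubernetesExecutor": {"image_pull_policy": "Always",
                               "annotations": {"team": "data"}}}
override = {"KubernetesExecutor": {"request_memory": "256Mi"}}

merged = dict_merge(dict(base), override)
# merged["KubernetesExecutor"] now holds image_pull_policy, annotations and
# request_memory, i.e. the operator-level keys layered onto the base config.
```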



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (AIRFLOW-3645) Use a base_executor_config and merge operator level executor_config

2019-01-07 Thread Kyle Hamlin (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Hamlin reassigned AIRFLOW-3645:


Assignee: Kyle Hamlin

> Use a base_executor_config and merge operator level executor_config
> ---
>
> Key: AIRFLOW-3645
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3645
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Kyle Hamlin
>Assignee: Kyle Hamlin
>Priority: Major
> Fix For: 1.10.2
>
>
> It would be very useful to have a `base_executor_config` and merge the base 
> config with any operator level `executor_config`.
> I imagine referencing a python dict similar to how we reference a custom 
> logging_config
> *Example config*
> {code:java}
> [core]
> base_executor_config = config.base_executor_config.BASE_EXECUTOR_CONFIG
> {code}
> *Example base_executor_config*
> {code:java}
> BASE_EXECUTOR_CONFIG = {
> "KubernetesExecutor": {
> "image_pull_policy": "Always",
> "annotations": {
> "iam.amazonaws.com/role": "arn:aws:iam::"
> },
> "volumes": [
> {
> "name": "airflow-lib",
> "persistentVolumeClaim": {
> "claimName": "airflow-lib"
> }
> }
> ],
> "volume_mounts": [
> {
> "name": "airflow-lib",
> "mountPath": "/usr/local/airflow/lib",
> }
> ]
> }
> }
> {code}
> *Example operator*
> {code:java}
> run_this = PythonOperator(
> task_id='print_the_context',
> provide_context=True,
> python_callable=print_context,
> executor_config={
> "KubernetesExecutor": {
> "request_memory": "256Mi",
> "request_cpu": "100m",
> "limit_memory": "256Mi",
> "limit_cpu": "100m"
> }
> },
> dag=dag)
> {code}
> Then we'll want to have a dict deep merge function that returns the 
> executor_config
> *Merge functionality*
> {code:java}
> import collections
> from airflow import conf
> from airflow.utils.module_loading import import_string
> def dict_merge(dct, merge_dct):
>     """ Recursive dict merge. Inspired by :meth:``dict.update()``, instead of
>     updating only top-level keys, dict_merge recurses down into dicts nested
>     to an arbitrary depth, updating keys. The ``merge_dct`` is merged into
>     ``dct``.
>     :param dct: dict onto which the merge is executed
>     :param merge_dct: dct merged into dct
>     :return: dct
>     """
>     for k, v in merge_dct.items():
>         if (k in dct and isinstance(dct[k], dict)
>                 and isinstance(merge_dct[k], collections.Mapping)):
>             dict_merge(dct[k], merge_dct[k])
>         else:
>             dct[k] = merge_dct[k]
> 
>     return dct
> 
> def get_executor_config(executor_config):
>     """Try to import base_executor_config and merge it with provided
>     executor_config.
>     :param executor_config: operator level executor config
>     :return: dict"""
> 
>     try:
>         base_executor_config = import_string(
>             conf.get('core', 'base_executor_config'))
>         merged_executor_config = dict_merge(
>             base_executor_config, executor_config)
>         return merged_executor_config
>     except Exception:
>         return executor_config
> {code}
> Finally, we'll want to call the get_executor_config function in the 
> `BaseOperator` possibly here: 
> https://github.com/apache/airflow/blob/master/airflow/models/__init__.py#L2348



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Fokko commented on issue #4399: [AIRFLOW-3594] Unify different License Header

2019-01-07 Thread GitBox
Fokko commented on issue #4399: [AIRFLOW-3594] Unify different License Header
URL: https://github.com/apache/airflow/pull/4399#issuecomment-452106547
 
 
   @feluelle I've restarted the test, but due to the recent renaming, I'm not 
sure if the status of the CI will propagate properly. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] galak75 removed a comment on issue #4292: [AIRFLOW-2508] Handle non string types in Operators templatized fields

2019-01-07 Thread GitBox
galak75 removed a comment on issue #4292: [AIRFLOW-2508] Handle non string 
types in Operators templatized fields
URL: https://github.com/apache/airflow/pull/4292#issuecomment-445222367
 
 
   Everything went well on our fork (see 
https://travis-ci.org/VilledeMontreal/incubator-airflow/builds/464753497)
   But one build failed on Travis with the error below:
   ```
   No output has been received in the last 10m0s, this potentially indicates a 
stalled build or something wrong with the build itself.
   Check the details on how to adjust your build configuration on: 
https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received
   The build has been terminated
   ```
   Could anyone restart the build on my PR please? I'm not able to do it, 
probably a question of permissions...


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ArgentFalcon commented on issue #3533: [AIRFLOW-161] New redirect route and extra links

2019-01-07 Thread GitBox
ArgentFalcon commented on issue #3533: [AIRFLOW-161] New redirect route and 
extra links
URL: https://github.com/apache/airflow/pull/3533#issuecomment-452102290
 
 
   Oh yeah, I should finish this up. I have to pull some more changes that we 
did internally at Lyft that make it more versatile. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feluelle commented on issue #4405: [AIRFLOW-3598] Add tests for MsSqlToHiveTransfer

2019-01-07 Thread GitBox
feluelle commented on issue #4405: [AIRFLOW-3598] Add tests for 
MsSqlToHiveTransfer
URL: https://github.com/apache/airflow/pull/4405#issuecomment-452102237
 
 
   Sure.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on a change in pull request #4390: [AIRFLOW-3584] Use ORM DAGs for index view.

2019-01-07 Thread GitBox
Fokko commented on a change in pull request #4390: [AIRFLOW-3584] Use ORM DAGs 
for index view.
URL: https://github.com/apache/airflow/pull/4390#discussion_r245817009
 
 

 ##
 File path: airflow/www_rbac/compile_assets.sh
 ##
 @@ -23,6 +23,6 @@ if [ -d airflow/www_rbac/static/dist ]; then
 fi
 
 cd airflow/www_rbac/
-npm install
+# npm install
 
 Review comment:
   Why is this commented out?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on a change in pull request #4390: [AIRFLOW-3584] Use ORM DAGs for index view.

2019-01-07 Thread GitBox
Fokko commented on a change in pull request #4390: [AIRFLOW-3584] Use ORM DAGs 
for index view.
URL: https://github.com/apache/airflow/pull/4390#discussion_r245817248
 
 

 ##
 File path: airflow/models/__init__.py
 ##
 @@ -240,6 +240,20 @@ def clear_task_instances(tis,
 dr.start_date = timezone.utcnow()
 
 
+def get_last_dagrun(dag_id, session, include_externally_triggered=False):
+"""
+Returns the last dag run for a dag, None if there was none.
+Last dag run can be any type of run eg. scheduled or backfilled.
+Overridden DagRuns are ignored.
+"""
+DR = DagRun
+query = session.query(DR).filter(DR.dag_id == dag_id)
+if not include_externally_triggered:
+query = query.filter(DR.external_trigger == False)  # noqa
+query = query.order_by(DR.execution_date.desc())
 
 Review comment:
   Shouldn't this query benefit from an index as well?
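
A sketch of the kind of composite index the comment hints at; the index name and 
the stand-in model below are assumptions, not the actual DagRun definition:

```python
from sqlalchemy import Column, DateTime, Index, Integer, String
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class DagRunSketch(Base):
    __tablename__ = 'dag_run_sketch'
    id = Column(Integer, primary_key=True)
    dag_id = Column(String(250))
    execution_date = Column(DateTime)
    # Composite index so "latest run for a dag_id" can be answered from
    # (dag_id, execution_date) without a full scan plus sort.
    __table_args__ = (
        Index('idx_dag_run_dag_id_execution_date', 'dag_id', 'execution_date'),
    )
```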


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on issue #4378: AIRFLOW-3573 - Remove DagStat table

2019-01-07 Thread GitBox
Fokko commented on issue #4378: AIRFLOW-3573 - Remove DagStat table
URL: https://github.com/apache/airflow/pull/4378#issuecomment-452100456
 
 
   @ffinfo PTAL


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on issue #4351: [AIRFLOW-3554] Remove contrib folder from code cov omit list

2019-01-07 Thread GitBox
Fokko commented on issue #4351: [AIRFLOW-3554] Remove contrib folder from code 
cov omit list
URL: https://github.com/apache/airflow/pull/4351#issuecomment-452098776
 
 
   I'm okay with this as well. The contrib code should have tests as well.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on a change in pull request #4298: [AIRFLOW-3478] Make sure that the session is closed

2019-01-07 Thread GitBox
Fokko commented on a change in pull request #4298: [AIRFLOW-3478] Make sure 
that the session is closed
URL: https://github.com/apache/airflow/pull/4298#discussion_r245813920
 
 

 ##
 File path: airflow/bin/cli.py
 ##
 @@ -423,14 +418,11 @@ def unpause(args, dag=None):
 def set_is_paused(is_paused, args, dag=None):
 dag = dag or get_dag(args)
 
-session = settings.Session()
-dm = session.query(DagModel).filter(
-DagModel.dag_id == dag.dag_id).first()
-dm.is_paused = is_paused
-session.commit()
 
 Review comment:
   I've restored the `.commit()` for now. I would like to work this Friday on 
setting the `expire_on_commit=True`: 
https://github.com/apache/airflow/blob/master/airflow/settings.py#L198
   
   It feels like we have a lot of connections to the database because they 
aren't properly closed.
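
A rough sketch of the change being referred to, assuming it mirrors the 
`sessionmaker` call around `airflow/settings.py#L198`; the engine and the other 
arguments below are illustrative:

```python
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

engine = create_engine("sqlite://")  # stand-in for the configured engine

Session = scoped_session(
    sessionmaker(autocommit=False,
                 autoflush=False,
                 bind=engine,
                 expire_on_commit=True))  # ORM objects refresh after commit()
```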


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on issue #4383: [AIRFLOW-3475] Move ImportError out of models.py

2019-01-07 Thread GitBox
Fokko commented on issue #4383: [AIRFLOW-3475] Move ImportError out of models.py
URL: https://github.com/apache/airflow/pull/4383#issuecomment-452097714
 
 
   @BasPH I've restarted the failed tests. Maybe do a rebase? It seems to fail 
on k8s.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on issue #4298: [AIRFLOW-3478] Make sure that the session is closed

2019-01-07 Thread GitBox
Fokko commented on issue #4298: [AIRFLOW-3478] Make sure that the session is 
closed
URL: https://github.com/apache/airflow/pull/4298#issuecomment-452096893
 
 
   Rebased :-)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on issue #4320: [AIRFLOW-3515] Remove the run_duration option

2019-01-07 Thread GitBox
Fokko commented on issue #4320: [AIRFLOW-3515] Remove the run_duration option
URL: https://github.com/apache/airflow/pull/4320#issuecomment-452096465
 
 
   I've rebased onto master and resolved the conflicts


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on issue #4421: [AIRFLOW-3468] Remove KnownEvent(Event)?

2019-01-07 Thread GitBox
Fokko commented on issue #4421: [AIRFLOW-3468] Remove KnownEvent(Event)?
URL: https://github.com/apache/airflow/pull/4421#issuecomment-452095834
 
 
   Rebased. @feng-tao I've looked into the Alembic script, but it becomes quite 
nasty in my opinion. The upgrade will be a `DROP TABLE IF EXISTS`, and the 
downgrade will recreate the tables which aren't used in Airflow 2.0 anymore.
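
A rough sketch of the migration shape described above; the revision ids are 
placeholders and the recreated column set is illustrative, not the exact 
historical schema:

```python
import sqlalchemy as sa
from alembic import op

revision = 'xxxxxxxxxxxx'       # placeholder
down_revision = 'yyyyyyyyyyyy'  # placeholder


def upgrade():
    # Dropping the unused tables is tolerant of databases where they are gone.
    op.execute("DROP TABLE IF EXISTS known_event")
    op.execute("DROP TABLE IF EXISTS known_event_type")


def downgrade():
    # Recreating tables that Airflow 2.0 no longer uses is the "nasty" part.
    op.create_table(
        'known_event_type',
        sa.Column('id', sa.Integer(), primary_key=True),
        sa.Column('know_event_type', sa.String(length=200), nullable=True),
    )
    # known_event would be recreated here as well; its columns are omitted
    # from this sketch.
```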


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao commented on issue #4036: [AIRFLOW-2744] Allow RBAC to accept plugins for views and links.

2019-01-07 Thread GitBox
feng-tao commented on issue #4036: [AIRFLOW-2744] Allow RBAC to accept plugins 
for views and links.
URL: https://github.com/apache/airflow/pull/4036#issuecomment-452093401
 
 
   @oliviersm199 , it seems that PluginRBACTest (currently disabled) fails if 
re-enabled. Do you think you have time to fix the test?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Comment Edited] (AIRFLOW-3292) `delete_dag` endpoint and cli commands don't delete on exact dag_id matching

2019-01-07 Thread Teresa Martyny (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736368#comment-16736368
 ] 

Teresa Martyny edited comment on AIRFLOW-3292 at 1/7/19 9:25 PM:
-

Sorry Ash, I just saw this response. We don't use subdags, so we were unaware 
of this naming convention. Adding a validation to prevent people from naming 
dags this way would be great. In the meantime, we have created a ticket on our 
end to rename our dags. Thanks for clarifying!


was (Author: teresamartyny):
Sorry Ash, I just saw this response. We don't use subdags, so we were unaware 
of this naming convention. Adding a validation to prevent people from namings 
dags this way would be great. In the meantime, we have created a ticket on our 
end to rename our dags. Thanks for clarifying!

> `delete_dag` endpoint and cli commands don't delete on exact dag_id matching
> 
>
> Key: AIRFLOW-3292
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3292
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: api, cli
>Affects Versions: 1.10.0
>Reporter: Teresa Martyny
>Priority: Major
>
> If you have the following dag ids: `schema`, `schema.table1`, 
> `schema.table2`, `schema_replace`
> When you hit the delete_dag endpoint with the dag id: `schema`, it will 
> delete `schema`, `schema.table1`, and `schema.table2`, not just `schema`. 
> Underscores are fine so it doesn't delete `schema_replace`, but periods are 
> not.
> If this is expected behavior, clarifying that functionality in the docs would 
> be great, and then I can submit a feature request for the ability to use 
> regex for exact matching with this command and endpoint.
> Thanks!! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3292) `delete_dag` endpoint and cli commands don't delete on exact dag_id matching

2019-01-07 Thread Teresa Martyny (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736368#comment-16736368
 ] 

Teresa Martyny commented on AIRFLOW-3292:
-

Sorry Ash, I just saw this response. We don't use subdags, so we were unaware 
of this naming convention. Adding a validation to prevent people from namings 
dags this way would be great. In the meantime, we have created a ticket on our 
end to rename our dags. Thanks for clarifying!

> `delete_dag` endpoint and cli commands don't delete on exact dag_id matching
> 
>
> Key: AIRFLOW-3292
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3292
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: api, cli
>Affects Versions: 1.10.0
>Reporter: Teresa Martyny
>Priority: Major
>
> If you have the following dag ids: `schema`, `schema.table1`, 
> `schema.table2`, `schema_replace`
> When you hit the delete_dag endpoint with the dag id: `schema`, it will 
> delete `schema`, `schema.table1`, and `schema.table2`, not just `schema`. 
> Underscores are fine so it doesn't delete `schema_replace`, but periods are 
> not.
> If this is expected behavior, clarifying that functionality in the docs would 
> be great, and then I can submit a feature request for the ability to use 
> regex for exact matching with this command and endpoint.
> Thanks!! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] feng-tao edited a comment on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator with git-sync

2019-01-07 Thread GitBox
feng-tao edited a comment on issue #4444: [AIRFLOW-3281] Fix Kubernetes 
operator with git-sync
URL: https://github.com/apache/airflow/pull/4444#issuecomment-452078774
 
 
   hey @kaxil , that's great news. I think we may need to revert this 
commit (https://github.com/apache/airflow/commit/b5b9287a75596a617557798f1286cf7b89c55350#diff-a7b22c07c43739c8eb0850a6fd6f7eb8)
 in the v1-10-test branch, as it is already included in the master branch, and 
cherry-pick the same commit from master. I think the original author created a 
separate commit for the 1.10 release. Once that commit is reverted, things 
should be much easier. And we should include this critical fix as well 
(https://github.com/apache/airflow/pull/4305/commits/bf45855e11e0cb80040615af19fe9138406cb52b)
 once the conflicts are resolved.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao commented on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator with git-sync

2019-01-07 Thread GitBox
feng-tao commented on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator with 
git-sync
URL: https://github.com/apache/airflow/pull/4444#issuecomment-452079131
 
 
   @kaxil, thanks for running the release :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao commented on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator with git-sync

2019-01-07 Thread GitBox
feng-tao commented on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator with 
git-sync
URL: https://github.com/apache/airflow/pull/4444#issuecomment-452078774
 
 
   hey @kaxil , that's great news. I think we may need to revert this 
commit (https://github.com/apache/airflow/commit/b5b9287a75596a617557798f1286cf7b89c55350#diff-a7b22c07c43739c8eb0850a6fd6f7eb8)
 in the v1-10-test branch, as it is already included in the master branch. I think 
the original author created a separate commit for the 1.10 release. Once that 
commit is reverted, things should be much easier.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator with git-sync

2019-01-07 Thread GitBox
kaxil commented on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator with 
git-sync
URL: https://github.com/apache/airflow/pull/4444#issuecomment-452076833
 
 
   Trying to resolve conflicts :D so that we can include 
https://github.com/apache/incubator-airflow/pull/3770 and your DAG-level access 
commit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3402) Set default kubernetes affinity and toleration settings in airflow.cfg

2019-01-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736325#comment-16736325
 ] 

ASF GitHub Bot commented on AIRFLOW-3402:
-

kaxil commented on pull request #4454: [AIRFLOW-3402] Port PR #4247 to 1.10-test
URL: https://github.com/apache/airflow/pull/4454
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Set default kubernetes affinity and toleration settings in airflow.cfg
> --
>
> Key: AIRFLOW-3402
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3402
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: kubernetes
>Reporter: Kevin Pullin
>Assignee: Kevin Pullin
>Priority: Major
> Fix For: 1.10.2
>
>
> Currently airflow supports setting kubernetes `affinity` and `toleration` 
> configuration inside dags using either a `KubernetesExecutorConfig` 
> definition or using the `KubernetesPodOperator`.
> In order to reduce having to set and maintain this configuration in every 
> dag, it'd be useful to have the ability to set these globally in the 
> airflow.cfg file.  One use case is to force all kubernetes pods to run on a 
> particular set of dedicated airflow nodes, which requires both affinity rules 
> and tolerations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] kaxil closed pull request #4454: [AIRFLOW-3402] Port PR #4247 to 1.10-test

2019-01-07 Thread GitBox
kaxil closed pull request #4454: [AIRFLOW-3402] Port PR #4247 to 1.10-test
URL: https://github.com/apache/airflow/pull/4454
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/config_templates/default_airflow.cfg 
b/airflow/config_templates/default_airflow.cfg
index c8aa4061e7..a72604a536 100644
--- a/airflow/config_templates/default_airflow.cfg
+++ b/airflow/config_templates/default_airflow.cfg
@@ -630,6 +630,16 @@ gcp_service_account_keys =
 # It will raise an exception if called from a process not running in a 
kubernetes environment.
 in_cluster = True
 
+# Affinity configuration as a single line formatted JSON object.
+# See the affinity model for top-level key names (e.g. `nodeAffinity`, etc.):
+#   
https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.12/#affinity-v1-core
+affinity =
+
+# A list of toleration objects as a single line formatted JSON array
+# See:
+#   
https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.12/#toleration-v1-core
+tolerations =
+
 [kubernetes_node_selectors]
 # The Key-value pairs to be given to worker pods.
 # The worker pods will be scheduled to the nodes of the specified key-value 
pairs.
diff --git a/airflow/contrib/example_dags/example_kubernetes_executor.py 
b/airflow/contrib/example_dags/example_kubernetes_executor.py
index 1d9bb73043..d03e255ab3 100644
--- a/airflow/contrib/example_dags/example_kubernetes_executor.py
+++ b/airflow/contrib/example_dags/example_kubernetes_executor.py
@@ -32,6 +32,31 @@
 schedule_interval=None
 )
 
+affinity = {
+'podAntiAffinity': {
+'requiredDuringSchedulingIgnoredDuringExecution': [
+{
+'topologyKey': 'kubernetes.io/hostname',
+'labelSelector': {
+'matchExpressions': [
+{
+'key': 'app',
+'operator': 'In',
+'values': ['airflow']
+}
+]
+}
+}
+]
+}
+}
+
+tolerations = [{
+'key': 'dedicated',
+'operator': 'Equal',
+'value': 'airflow'
+}]
+
 
 def print_stuff():
 print("stuff!")
@@ -59,11 +84,14 @@ def use_zip_binary():
 executor_config={"KubernetesExecutor": {"image": "airflow/ci_zip:latest"}}
 )
 
-# Limit resources on this operator/task
+# Limit resources on this operator/task with node affinity & tolerations
 three_task = PythonOperator(
 task_id="three_task", python_callable=print_stuff, dag=dag,
 executor_config={
-"KubernetesExecutor": {"request_memory": "128Mi", "limit_memory": 
"128Mi"}}
+"KubernetesExecutor": {"request_memory": "128Mi",
+   "limit_memory": "128Mi",
+   "tolerations": tolerations,
+   "affinity": affinity}}
 )
 
 start_task.set_downstream([one_task, two_task, three_task])
diff --git a/airflow/contrib/executors/kubernetes_executor.py 
b/airflow/contrib/executors/kubernetes_executor.py
index dd9cd3ec53..e06a5f47e1 100644
--- a/airflow/contrib/executors/kubernetes_executor.py
+++ b/airflow/contrib/executors/kubernetes_executor.py
@@ -16,6 +16,7 @@
 # under the License.
 
 import base64
+import json
 import multiprocessing
 from queue import Queue
 from dateutil import parser
@@ -40,7 +41,7 @@ class KubernetesExecutorConfig:
 def __init__(self, image=None, image_pull_policy=None, request_memory=None,
  request_cpu=None, limit_memory=None, limit_cpu=None,
  gcp_service_account_key=None, node_selectors=None, 
affinity=None,
- annotations=None, volumes=None, volume_mounts=None):
+ annotations=None, volumes=None, volume_mounts=None, 
tolerations=None):
 self.image = image
 self.image_pull_policy = image_pull_policy
 self.request_memory = request_memory
@@ -53,16 +54,18 @@ def __init__(self, image=None, image_pull_policy=None, 
request_memory=None,
 self.annotations = annotations
 self.volumes = volumes
 self.volume_mounts = volume_mounts
+self.tolerations = tolerations
 
 def __repr__(self):
 return "{}(image={}, image_pull_policy={}, request_memory={}, 
request_cpu={}, " \
"limit_memory={}, limit_cpu={}, gcp_service_account_key={}, " \
"node_selectors={}, affinity={}, annotations={}, volumes={}, " \
-   "volume_mounts={})" \
+   "volume_mounts={}, tolerations={})" \
 .format(KubernetesExecutorConfig.__name__, self.image, 
self.image_pull_policy,
 self.request_memory, self.request_cpu, self.limit_memory,
 

[GitHub] feluelle commented on a change in pull request #4455: [AIRFLOW-3519] Fix example http operator

2019-01-07 Thread GitBox
feluelle commented on a change in pull request #4455: [AIRFLOW-3519] Fix 
example http operator
URL: https://github.com/apache/airflow/pull/4455#discussion_r245791673
 
 

 ##
 File path: airflow/example_dags/example_http_operator.py
 ##
 @@ -92,7 +92,7 @@
 http_conn_id='http_default',
 endpoint='',
 request_params={},
-response_check=lambda response: True if "Google" in response.content else 
False,
 
 Review comment:
   ..and I thought `"Google" in response.text` makes more sense (to me) and you 
do not need to do the byte conversion by yourself. :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao commented on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator with git-sync

2019-01-07 Thread GitBox
feng-tao commented on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator with 
git-sync
URL: https://github.com/apache/airflow/pull/4444#issuecomment-452075293
 
 
   what is the issue @kaxil ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao edited a comment on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator with git-sync

2019-01-07 Thread GitBox
feng-tao edited a comment on issue #4444: [AIRFLOW-3281] Fix Kubernetes 
operator with git-sync
URL: https://github.com/apache/airflow/pull/4444#issuecomment-452075293
 
 
   @kaxil, what is the issue?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil edited a comment on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator with git-sync

2019-01-07 Thread GitBox
kaxil edited a comment on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator 
with git-sync
URL: https://github.com/apache/airflow/pull/4444#issuecomment-452074777
 
 
   Sorry, I merged it and had to revert, which caused this piece to fail. We may 
well have to create a new PR.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator with git-sync

2019-01-07 Thread GitBox
kaxil commented on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator with 
git-sync
URL: https://github.com/apache/airflow/pull/4444#issuecomment-452074777
 
 
   Sorry, I merged it and had to revert. Let me know when it is ready


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on a change in pull request #4455: [AIRFLOW-3519] Fix example http operator

2019-01-07 Thread GitBox
kaxil commented on a change in pull request #4455: [AIRFLOW-3519] Fix example 
http operator
URL: https://github.com/apache/airflow/pull/4455#discussion_r245790529
 
 

 ##
 File path: airflow/example_dags/example_http_operator.py
 ##
 @@ -92,7 +92,7 @@
 http_conn_id='http_default',
 endpoint='',
 request_params={},
-response_check=lambda response: True if "Google" in response.content else 
False,
 
 Review comment:
   Makes sense. Thanks @feluelle 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feluelle commented on a change in pull request #4455: [AIRFLOW-3519] Fix example http operator

2019-01-07 Thread GitBox
feluelle commented on a change in pull request #4455: [AIRFLOW-3519] Fix 
example http operator
URL: https://github.com/apache/airflow/pull/4455#discussion_r245789254
 
 

 ##
 File path: airflow/example_dags/example_http_operator.py
 ##
 @@ -92,7 +92,7 @@
 http_conn_id='http_default',
 endpoint='',
 request_params={},
-response_check=lambda response: True if "Google" in response.content else 
False,
 
 Review comment:
   Yes. `response.content` is a byte string and not a decoded string like 
`response.text`.
   So either `b"Google" in response.content` would work or `"Google" in 
response.text` . See these Jira tickets: 
https://issues.apache.org/jira/browse/AIRFLOW-3519 and 
https://issues.apache.org/jira/browse/AIRFLOW-450
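   A minimal sketch of the difference (assuming only the `requests` library and any 
reachable URL):
   
   ```
   import requests
   
   response = requests.get("https://www.google.com")
   
   # response.content is bytes, response.text is a decoded str, so the
   # needle must match the haystack's type.
   assert b"Google" in response.content   # bytes in bytes: fine
   assert "Google" in response.text       # str in str: fine
   # "Google" in response.content would raise TypeError on Python 3:
   # "a bytes-like object is required, not 'str'"
   ```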


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil closed pull request #4444: [AIRFLOW-3281] Fix Kubernetes operator with git-sync

2019-01-07 Thread GitBox
kaxil closed pull request #: [AIRFLOW-3281] Fix Kubernetes operator with 
git-sync
URL: https://github.com/apache/airflow/pull/
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):



 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3281) Kubernetes git sync implementation is broken

2019-01-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736311#comment-16736311
 ] 

ASF GitHub Bot commented on AIRFLOW-3281:
-

kaxil commented on pull request #: [AIRFLOW-3281] Fix Kubernetes operator 
with git-sync
URL: https://github.com/apache/airflow/pull/
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Kubernetes git sync implementation is broken
> 
>
> Key: AIRFLOW-3281
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3281
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Riccardo Bini
>Assignee: Riccardo Bini
>Priority: Major
>
> The current implementation of git-sync when Airflow is being used with 
> Kubernetes is broken.
> The init container doesn't share the volume with the airflow container, and 
> the path of the DAG folder doesn't take into account the fact that git-sync 
> creates a symlink to the revision.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] benjamingregory commented on issue #3055: [AIRFLOW-2125] Using binary package psycopg2-binary

2019-01-07 Thread GitBox
benjamingregory commented on issue #3055: [AIRFLOW-2125] Using binary package 
psycopg2-binary
URL: https://github.com/apache/airflow/pull/3055#issuecomment-452071229
 
 
   @bern4rdelli @jgao54 @Fokko 
   
   Question as to why this was changed to `psycopg2-binary`, given the 
following warning from 
http://initd.org/psycopg/docs/install.html#binary-install-from-pypi
   
   ```
   Note: The -binary package is meant for beginners to start playing with 
Python and PostgreSQL without the need to meet the build requirements. If you 
are the maintainer of a published package depending on psycopg2 you shouldn’t use 
psycopg2-binary as a module dependency. For production use you are advised to 
use the source distribution.
   ```
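   To make the distinction concrete, a hedged sketch of how a hypothetical downstream 
package might declare the dependency (the package name and versions are illustrative, 
not anything in Airflow's own setup.py):
   
   ```
   # setup.py of a hypothetical package that depends on psycopg2
   from setuptools import setup
   
   setup(
       name="my-airflow-plugins",    # hypothetical package name
       install_requires=[
           "psycopg2>=2.7.4",        # source distribution for published/production use
       ],
       extras_require={
           # opt-in convenience wheel for local experiments only
           "binary": ["psycopg2-binary>=2.7.4"],
       },
   )
   ```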


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on a change in pull request #4455: [AIRFLOW-3519] Fix example http operator

2019-01-07 Thread GitBox
kaxil commented on a change in pull request #4455: [AIRFLOW-3519] Fix example 
http operator
URL: https://github.com/apache/airflow/pull/4455#discussion_r245784539
 
 

 ##
 File path: airflow/example_dags/example_http_operator.py
 ##
 @@ -92,7 +92,7 @@
 http_conn_id='http_default',
 endpoint='',
 request_params={},
-response_check=lambda response: True if "Google" in response.content else 
False,
 
 Review comment:
   response.content works as well. Does it error for you @feluelle ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-3646) Fix plugin manager test

2019-01-07 Thread Tao Feng (JIRA)
Tao Feng created AIRFLOW-3646:
-

 Summary: Fix plugin manager test
 Key: AIRFLOW-3646
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3646
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Tao Feng






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] feng-tao commented on issue #4399: [AIRFLOW-3594] Unify different License Header

2019-01-07 Thread GitBox
feng-tao commented on issue #4399: [AIRFLOW-3594] Unify different License Header
URL: https://github.com/apache/airflow/pull/4399#issuecomment-452035908
 
 
   PTAL @bolkedebruin 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3272) Create gRPC hook for creating generic grpc connection

2019-01-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736141#comment-16736141
 ] 

ASF GitHub Bot commented on AIRFLOW-3272:
-

morgendave commented on pull request #4101: [AIRFLOW-3272] Add base grpc hook
URL: https://github.com/apache/airflow/pull/4101
 
 
   Make sure you have checked all steps below.
   
   Jira
 My PR addresses the following Airflow Jira issues and references them in 
the PR title. For example, "[AIRFLOW-3272] My Airflow PR"
   https://issues.apache.org/jira/browse/AIRFLOW-3272
   Description
 Add support for gRPC connection in airflow. 
   
   In Airflow there are use cases of calling gRPC services, so instead of creating 
the channel each time in a PythonOperator, there should be a basic GrpcHook 
to take care of it. The hook needs to take care of the authentication.
   
   Tests
 My PR adds the following unit tests OR does not need testing for this 
extremely good reason:
   Commits
 My commits all reference Jira issues in their subject lines, and I have 
squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "How to write a good git commit message":
   Subject is separated from body by a blank line
   Subject is limited to 50 characters (not including Jira issue reference)
   Subject does not end with a period
   Subject uses the imperative mood ("add", not "adding")
   Body wraps at 72 characters
   Body explains "what" and "why", not "how"
   Documentation
 In case of new functionality, my PR adds documentation that describes how 
to use it.
   When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   Code Quality
 Passes flake8
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Create gRPC hook for creating generic grpc connection
> -
>
> Key: AIRFLOW-3272
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3272
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Zhiwei Zhao
>Assignee: Zhiwei Zhao
>Priority: Minor
>
> Add support for gRPC connection in airflow. 
> In Airflow there are use cases of calling gRPC services, so instead of creating 
> the channel each time in a PythonOperator, there should be a basic GrpcHook 
> to take care of it. The hook needs to take care of the authentication.
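A rough sketch of the status quo the ticket describes, with each task creating and 
tearing down its own channel inside a PythonOperator (the service address is 
hypothetical; only the plain grpc package API is assumed):

{code:java}
from datetime import datetime

import grpc

from airflow import DAG
from airflow.operators.python_operator import PythonOperator


def call_grpc_service(**context):
    # Boilerplate the proposed GrpcHook would absorb: channel setup,
    # readiness check, authentication, and teardown.
    channel = grpc.insecure_channel('my-grpc-service:50051')  # hypothetical address
    try:
        grpc.channel_ready_future(channel).result(timeout=10)
        # ... call the generated stub here ...
    finally:
        channel.close()


dag = DAG('grpc_status_quo_example', start_date=datetime(2019, 1, 1),
          schedule_interval=None)

call_task = PythonOperator(
    task_id='call_grpc_service',
    python_callable=call_grpc_service,
    provide_context=True,
    dag=dag,
)
{code}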



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-3645) Use a base_executor_config and merge operator level executor_config

2019-01-07 Thread Kyle Hamlin (JIRA)
Kyle Hamlin created AIRFLOW-3645:


 Summary: Use a base_executor_config and merge operator level 
executor_config
 Key: AIRFLOW-3645
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3645
 Project: Apache Airflow
  Issue Type: Improvement
Reporter: Kyle Hamlin
 Fix For: 1.10.2


It would be very useful to have a `base_executor_config` and merge the base 
config with any operator-level `executor_config`.

I imagine referencing a Python dict, similar to how we reference a custom 
logging_config.

*Example config*
{code:java}
[core]
base_executor_config = config.base_executor_config.BASE_EXECUTOR_CONFIG
{code}
*Example base_executor_config*
{code:java}
BASE_EXECUTOR_CONFIG = {
    "KubernetesExecutor": {
        "image_pull_policy": "Always",
        "annotations": {
            "iam.amazonaws.com/role": "arn:aws:iam::"
        },
        "volumes": [
            {
                "name": "airflow-lib",
                "persistentVolumeClaim": {
                    "claimName": "airflow-lib"
                }
            }
        ],
        "volume_mounts": [
            {
                "name": "airflow-lib",
                "mountPath": "/usr/local/airflow/lib",
            }
        ]
    }
}
{code}
*Example operator*
{code:java}
run_this = PythonOperator(
    task_id='print_the_context',
    provide_context=True,
    python_callable=print_context,
    executor_config={
        "KubernetesExecutor": {
            "request_memory": "256Mi",
            "request_cpu": "100m",
            "limit_memory": "256Mi",
            "limit_cpu": "100m"
        }
    },
    dag=dag)
{code}
Then we'll want a dict deep-merge function that returns the merged 
executor_config.

*Merge functionality*
{code:java}
import collections
from airflow import conf
from airflow.utils.module_loading import import_string

def dict_merge(dct, merge_dct):
    """ Recursive dict merge. Inspired by :meth:``dict.update()``, instead of
    updating only top-level keys, dict_merge recurses down into dicts nested
    to an arbitrary depth, updating keys. The ``merge_dct`` is merged into
    ``dct``.
    :param dct: dict onto which the merge is executed
    :param merge_dct: dct merged into dct
    :return: dct
    """

    for k, v in merge_dct.items():
        if (k in dct and isinstance(dct[k], dict)
                and isinstance(merge_dct[k], collections.Mapping)):
            dict_merge(dct[k], merge_dct[k])
        else:
            dct[k] = merge_dct[k]

    return dct


def get_executor_config(executor_config):
    """Try to import base_executor_config and merge it with provided
    executor_config.
    :param executor_config: operator level executor config
    :return: dict"""

    try:
        base_executor_config = import_string(
            conf.get('core', 'base_executor_config'))
        merged_executor_config = dict_merge(
            base_executor_config, executor_config)
        return merged_executor_config
    except Exception:
        return executor_config
{code}

Finally, we'll want to call the get_executor_config function in the 
`BaseOperator` possibly here: 
https://github.com/apache/airflow/blob/master/airflow/models/__init__.py#L2348
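As a sketch of that last step (the call site and attribute name are assumptions for 
illustration, not the actual BaseOperator code):

{code:java}
# Hypothetical excerpt from BaseOperator.__init__ (illustrative only)
class BaseOperator(object):

    def __init__(self, task_id, executor_config=None, **kwargs):
        self.task_id = task_id
        # Merge the operator-level config over the configured base config;
        # get_executor_config falls back to the raw value when no base is set.
        self.executor_config = get_executor_config(executor_config or {})
{code}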



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3272) Create gRPC hook for creating generic grpc connection

2019-01-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736140#comment-16736140
 ] 

ASF GitHub Bot commented on AIRFLOW-3272:
-

morgendave commented on pull request #4101: [AIRFLOW-3272] Add base grpc hook
URL: https://github.com/apache/airflow/pull/4101
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Create gRPC hook for creating generic grpc connection
> -
>
> Key: AIRFLOW-3272
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3272
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Zhiwei Zhao
>Assignee: Zhiwei Zhao
>Priority: Minor
>
> Add support for gRPC connection in airflow. 
> In Airflow there are use cases of calling gRPC services, so instead of creating 
> the channel each time in a PythonOperator, there should be a basic GrpcHook 
> to take care of it. The hook needs to take care of the authentication.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] morgendave opened a new pull request #4101: [AIRFLOW-3272] Add base grpc hook

2019-01-07 Thread GitBox
morgendave opened a new pull request #4101: [AIRFLOW-3272] Add base grpc hook
URL: https://github.com/apache/airflow/pull/4101
 
 
   Make sure you have checked all steps below.
   
   Jira
 My PR addresses the following Airflow Jira issues and references them in 
the PR title. For example, "[AIRFLOW-3272] My Airflow PR"
   https://issues.apache.org/jira/browse/AIRFLOW-3272
   Description
 Add support for gRPC connection in airflow. 
   
   In Airflow there are use cases of calling gRPC services, so instead of creating 
the channel each time in a PythonOperator, there should be a basic GrpcHook 
to take care of it. The hook needs to take care of the authentication.
   
   Tests
 My PR adds the following unit tests OR does not need testing for this 
extremely good reason:
   Commits
 My commits all reference Jira issues in their subject lines, and I have 
squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "How to write a good git commit message":
   Subject is separated from body by a blank line
   Subject is limited to 50 characters (not including Jira issue reference)
   Subject does not end with a period
   Subject uses the imperative mood ("add", not "adding")
   Body wraps at 72 characters
   Body explains "what" and "why", not "how"
   Documentation
 In case of new functionality, my PR adds documentation that describes how 
to use it.
   When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   Code Quality
 Passes flake8


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao commented on issue #4453: [AIRFLOW-XXX] Fix existing flake8 errors

2019-01-07 Thread GitBox
feng-tao commented on issue #4453: [AIRFLOW-XXX] Fix existing flake8 errors
URL: https://github.com/apache/airflow/pull/4453#issuecomment-452019385
 
 
   @yohei1126 , I think this test is pretty flaky.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao commented on issue #4453: [AIRFLOW-XXX] Fix existing flake8 errors

2019-01-07 Thread GitBox
feng-tao commented on issue #4453: [AIRFLOW-XXX] Fix existing flake8 errors
URL: https://github.com/apache/airflow/pull/4453#issuecomment-452019252
 
 
   @yohei1126 , not sure which one you are looking at, but the master CI 
passes (https://travis-ci.org/apache/airflow/builds/476197675). 
   
   The test failure comes from https://github.com/apache/airflow/pull/4407, 
which I have discussed in the PR as well as on the mailing list.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feng-tao commented on issue #4407: [AIRFLOW-3600] Remove dagbag from trigger

2019-01-07 Thread GitBox
feng-tao commented on issue #4407: [AIRFLOW-3600] Remove dagbag from trigger
URL: https://github.com/apache/airflow/pull/4407#issuecomment-452019084
 
 
   @ffinfo , please let me know if you will investigate the issue. If not, I 
prefer to revert this change until the flaky test is fixed. What do you think 
@Fokko ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3519) example_http_operator is failing due to

2019-01-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736031#comment-16736031
 ] 

ASF GitHub Bot commented on AIRFLOW-3519:
-

feluelle commented on pull request #4455: [AIRFLOW-3519] Fix example http 
operator
URL: https://github.com/apache/airflow/pull/4455
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-3519
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   This PR fixes the sensor in the example http operator that searches for a 
string in a bytes-like object called response.content.
   Now it searches in the decoded response object called response.text.
   
   **NOTE:** This PR also fixes issue 
https://issues.apache.org/jira/browse/AIRFLOW-450. This ticket was already 
marked as resolved but it actually wasn't.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
   
   ### Code Quality
   
   - [x] Passes `flake8`
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> example_http_operator is failing due to 
> 
>
> Key: AIRFLOW-3519
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3519
> Project: Apache Airflow
>  Issue Type: Bug
> Environment: Windows 10 professional edition.Apache airflow 
>Reporter: Arunachalam Ambikapathi
>Assignee: Felix Uellendall
>Priority: Minor
>
> When example_http_operator DAG is called from command line, 
>  ./airflow trigger_dag example_http_operator
> it was throwing error 
>  [2018-12-13 10:37:41,892]
> {logging_mixin.py:95}
> INFO - [2018-12-13 10:37:41,892]
> {http_hook.py:126}
> INFO - Sending 'GET' to url: [https://www.google.com/]
> [2018-12-13 10:37:41,992]
> {logging_mixin.py:95}
> WARNING - 
> /home/arun1/.local/lib/python3.5/site-packages/urllib3/connectionpool.py:847: 
> InsecureRequestWarning: Unverified HTTPS request is being made. Adding 
> certificate verification is strongly advised. See: 
> [https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings]
>  InsecureRequestWarning)
>  [2018-12-13 10:37:42,064]
> {models.py:1760}
> *ERROR - a bytes-like object is required, not 'str'*
> This may be due to this was not tested in python3.5 version.
>  *Fix:*
>  I changed the dag to this and tested it is working.
> from 
> response_check=lambda response: True if "Google" in response.content else 
> False,
> to 
> response_check=lambda response: True if *b'Google'* in response.content else 
> False,
> Please apply this in the example it would help new users a lot.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-3644) AIP-8 Split Hooks/Operators out of core package and repository

2019-01-07 Thread Tim Swast (JIRA)
Tim Swast created AIRFLOW-3644:
--

 Summary: AIP-8 Split Hooks/Operators out of core package and 
repository
 Key: AIRFLOW-3644
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3644
 Project: Apache Airflow
  Issue Type: Improvement
  Components: contrib, core
Reporter: Tim Swast


Based on discussion at 
http://mail-archives.apache.org/mod_mbox/airflow-dev/201809.mbox/%3c308670db-bd2a-4738-81b1-3f6fb312c...@apache.org%3E
 I believe separating hooks/operators into separate packages can benefit 
long-term maintainability of Apache Airflow by distributing maintenance and 
reducing the surface area of the core Airflow package.

AIP-8 draft: 
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=100827303



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] feluelle opened a new pull request #4455: [AIRFLOW-3519] Fix example http operator

2019-01-07 Thread GitBox
feluelle opened a new pull request #4455: [AIRFLOW-3519] Fix example http 
operator
URL: https://github.com/apache/airflow/pull/4455
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-3519
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   This PR fixes the sensor in the example http operator that searches for a 
string in a bytes-like object called response.content.
   Now it searches in the decoded response object called response.text.
   
   **NOTE:** This PR also fixes issue 
https://issues.apache.org/jira/browse/AIRFLOW-450. This ticket was already 
marked as resolved but it actually wasn't.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
 - All the public functions and the classes in the PR contain docstrings 
that explain what it does
   
   ### Code Quality
   
   - [x] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Mokubyow edited a comment on issue #4309: [AIRFLOW-3504] Extend/refine the functionality of "/health" endpoint

2019-01-07 Thread GitBox
Mokubyow edited a comment on issue #4309: [AIRFLOW-3504] Extend/refine the 
functionality of "/health" endpoint
URL: https://github.com/apache/airflow/pull/4309#issuecomment-451993107
 
 
   Emmanual Bard brought up a good point about the scheduler health check. What 
you have only checks the last scheduled run, not the scheduler heartbeat, which 
is what we really want to know. The query should look something like this:
   
   ```
   select max(latest_heartbeat) from job
   where job_type = 'SchedulerJob'
   and state = 'running'
   ```
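   For illustration, roughly the same check expressed through the ORM (a sketch 
assuming the 1.10-era `BaseJob` model and `settings.Session`; the freshness 
threshold is an arbitrary example value):
   
   ```
   from datetime import datetime, timedelta
   
   from airflow import settings
   from airflow.jobs import BaseJob
   
   
   def scheduler_heartbeat_is_recent(threshold_seconds=60):
       """Return True if a running SchedulerJob has heartbeated recently."""
       session = settings.Session()
       try:
           latest = (
               session.query(BaseJob.latest_heartbeat)
               .filter(BaseJob.job_type == 'SchedulerJob',
                       BaseJob.state == 'running')
               .order_by(BaseJob.latest_heartbeat.desc())
               .first()
           )
           if not latest or latest[0] is None:
               return False
           return datetime.utcnow() - latest[0] < timedelta(seconds=threshold_seconds)
       finally:
           session.close()
   ```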


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Comment Edited] (AIRFLOW-3519) example_http_operator is failing due to

2019-01-07 Thread Felix Uellendall (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736002#comment-16736002
 ] 

Felix Uellendall edited comment on AIRFLOW-3519 at 1/7/19 4:04 PM:
---

I would rather change the response_check to return True if it is in 
response.text instead of response.content.
Because response.text is the decoded response of response.content. See 
http://docs.python-requests.org/en/master/user/quickstart/#response-content

What do you think [~Arun Ambikapathi] ?


was (Author: feluelle):
I would rather change the response_check to return True if it is in 
response.text instead of response.content.
Because response.text is the decoded response of response.content.

What do you think [~Arun Ambikapathi] ?

> example_http_operator is failing due to 
> 
>
> Key: AIRFLOW-3519
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3519
> Project: Apache Airflow
>  Issue Type: Bug
> Environment: Windows 10 professional edition.Apache airflow 
>Reporter: Arunachalam Ambikapathi
>Assignee: Felix Uellendall
>Priority: Minor
>
> When example_http_operator DAG is called from command line, 
>  ./airflow trigger_dag example_http_operator
> it was throwing error 
>  [2018-12-13 10:37:41,892]
> {logging_mixin.py:95}
> INFO - [2018-12-13 10:37:41,892]
> {http_hook.py:126}
> INFO - Sending 'GET' to url: [https://www.google.com/]
> [2018-12-13 10:37:41,992]
> {logging_mixin.py:95}
> WARNING - 
> /home/arun1/.local/lib/python3.5/site-packages/urllib3/connectionpool.py:847: 
> InsecureRequestWarning: Unverified HTTPS request is being made. Adding 
> certificate verification is strongly advised. See: 
> [https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings]
>  InsecureRequestWarning)
>  [2018-12-13 10:37:42,064]
> {models.py:1760}
> *ERROR - a bytes-like object is required, not 'str'*
> This may be due to this was not tested in python3.5 version.
>  *Fix:*
>  I changed the dag to this and tested it is working.
> from 
> response_check=lambda response: True if "Google" in response.content else 
> False,
> to 
> response_check=lambda response: True if *b'Google'* in response.content else 
> False,
> Please apply this in the example it would help new users a lot.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3519) example_http_operator is failing due to

2019-01-07 Thread Felix Uellendall (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736002#comment-16736002
 ] 

Felix Uellendall commented on AIRFLOW-3519:
---

I would rather change the response_check to return True if it is in 
response.text instead of response.content.
Because response.text is the decoded response of response.content.

What do you think [~Arun Ambikapathi] ?

> example_http_operator is failing due to 
> 
>
> Key: AIRFLOW-3519
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3519
> Project: Apache Airflow
>  Issue Type: Bug
> Environment: Windows 10 professional edition.Apache airflow 
>Reporter: Arunachalam Ambikapathi
>Assignee: Felix Uellendall
>Priority: Minor
>
> When example_http_operator DAG is called from command line, 
>  ./airflow trigger_dag example_http_operator
> it was throwing error 
>  [2018-12-13 10:37:41,892]
> {logging_mixin.py:95}
> INFO - [2018-12-13 10:37:41,892]
> {http_hook.py:126}
> INFO - Sending 'GET' to url: [https://www.google.com/]
> [2018-12-13 10:37:41,992]
> {logging_mixin.py:95}
> WARNING - 
> /home/arun1/.local/lib/python3.5/site-packages/urllib3/connectionpool.py:847: 
> InsecureRequestWarning: Unverified HTTPS request is being made. Adding 
> certificate verification is strongly advised. See: 
> [https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings]
>  InsecureRequestWarning)
>  [2018-12-13 10:37:42,064]
> {models.py:1760}
> *ERROR - a bytes-like object is required, not 'str'*
> This may be due to this was not tested in python3.5 version.
>  *Fix:*
>  I changed the dag to this and tested it is working.
> from 
> response_check=lambda response: True if "Google" in response.content else 
> False,
> to 
> response_check=lambda response: True if *b'Google'* in response.content else 
> False,
> Please apply this in the example it would help new users a lot.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work started] (AIRFLOW-3601) update operators to BigQuery to support location

2019-01-07 Thread Yohei Onishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AIRFLOW-3601 started by Yohei Onishi.
-
> update operators to BigQuery to support location
> 
>
> Key: AIRFLOW-3601
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3601
> Project: Apache Airflow
>  Issue Type: Task
>Affects Versions: 1.10.1
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> location support for BigQueryHook was merged by the PR 4324 
> [https://github.com/apache/incubator-airflow/pull/4324]
> The following operators need to be updated.
>  * bigquery_check_operator.py
>  ** BigQueryCheckOperator
>  * bigquery_get_data.py
>  ** BigQueryGetDataOperator
>  * bigquery_operator.py
>  ** BigQueryOperator
>  ** BigQueryCreateEmptyTableOperator
>  ** BigQueryCreateExternalTableOperator
>  ** BigQueryDeleteDatasetOperator
>  ** BigQueryCreateEmptyDatasetOperator
>  * bigquery_table_delete_operator.py
>  ** BigQueryTableDeleteOperator
>  * bigquery_to_bigquery.py
>  ** BigQueryToBigQueryOperator
>  * bigquery_to_gcs.py
>  ** BigQueryToCloudStorageOperator
>  * gcs_to_bq.py
>  ** GoogleCloudStorageToBigQueryOperator
>  * bigquery_sensor.py
>  ** BigQueryTableSensor
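If the operators end up simply forwarding a new `location` argument to the hook (as 
the hook change in PR 4324 suggests), usage might look roughly like this sketch; the 
`location` parameter on the operator is the assumption here, not existing API:

{code:java}
from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.bigquery_operator import BigQueryOperator

dag = DAG('bq_location_example', start_date=datetime(2019, 1, 1),
          schedule_interval=None)

query_task = BigQueryOperator(
    task_id='run_query_in_tokyo',
    sql='SELECT 1',
    use_legacy_sql=False,
    location='asia-northeast1',  # assumed new parameter, forwarded to BigQueryHook
    dag=dag,
)
{code}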



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] feluelle commented on issue #3723: [AIRFLOW-2876] Update Tenacity to 4.12

2019-01-07 Thread GitBox
feluelle commented on issue #3723: [AIRFLOW-2876] Update Tenacity to 4.12
URL: https://github.com/apache/airflow/pull/3723#issuecomment-451887370
 
 
   I just tried to reproduce @r39132's error but I can't.
   I ran `pip install .` from the current `master` branch and installed the 
following versions successfully in a Python 3.6.5 venv.
   
   Airflow Version: v2.0.0.dev0+
   Tenacity Version: 4.12.0
   
   After installation I ran `airflow initdb` then `airflow scheduler` then 
`airflow webserver` and all seems to work - no errors.
   
   Can someone else reproduce the error? I would like to fix it if it has not 
already been fixed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] yohei1126 commented on issue #4453: [AIRFLOW-XXX] Fix existing flake8 errors

2019-01-07 Thread GitBox
yohei1126 commented on issue #4453: [AIRFLOW-XXX] Fix existing flake8 errors
URL: https://github.com/apache/airflow/pull/4453#issuecomment-451875457
 
 
   @feng-tao your build failed. Is this your test? Please check 
https://travis-ci.org/apache/airflow/jobs/476146347#L4902
   ```
   ==
   42) FAIL: test_trigger_dag_button (tests.www_rbac.test_views.TestTriggerDag)
   --
  Traceback (most recent call last):
   tests/www_rbac/test_views.py line 1476 in test_trigger_dag_button
 self.assertIsNotNone(run)
  AssertionError: unexpectedly None
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] feluelle commented on a change in pull request #4449: [AIRFLOW-3638] Add tests for PrestoToMySqlTransfer

2019-01-07 Thread GitBox
feluelle commented on a change in pull request #4449: [AIRFLOW-3638] Add tests 
for PrestoToMySqlTransfer
URL: https://github.com/apache/airflow/pull/4449#discussion_r245590944
 
 

 ##
 File path: tests/operators/test_presto_to_mysql.py
 ##
 @@ -0,0 +1,37 @@
+import unittest
 
 Review comment:
   @kaxil fixed :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services