[jira] [Created] (AIRFLOW-3650) Fix flaky test in TestTriggerDag
Tao Feng created AIRFLOW-3650: - Summary: Fix flaky test in TestTriggerDag Key: AIRFLOW-3650 URL: https://issues.apache.org/jira/browse/AIRFLOW-3650 Project: Apache Airflow Issue Type: Bug Reporter: Tao Feng Assignee: Tao Feng -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] feng-tao commented on issue #4457: [AIRFLOW-3650] Skip running on mysql for the flaky test
feng-tao commented on issue #4457: [AIRFLOW-3650] Skip running on mysql for the flaky test URL: https://github.com/apache/airflow/pull/4457#issuecomment-452203627 PTAL @kaxil @Fokko I think the issue is not in the test itself, but in the mysqlclient library. The session can't get the latest data from the ORM after running this line (https://github.com/apache/airflow/blob/master/tests/www_rbac/test_views.py#L1473) (I opened a local pdb to confirm). I tried different mysqlclient versions, but none of them work. To unblock, I would suggest we skip the MySQL ORM for this test. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] jmcarp opened a new pull request #4458: [AIRFLOW-3648] Default to connection project id in gcp cloud sql.
jmcarp opened a new pull request #4458: [AIRFLOW-3648] Default to connection project id in gcp cloud sql. URL: https://github.com/apache/airflow/pull/4458 Make sure you have checked _all_ steps below. ### Jira - [x] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR" - https://issues.apache.org/jira/browse/AIRFLOW-3648 - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a Jira issue. ### Description - [x] Here are some details about my PR, including screenshots of any UI changes: ### Tests - [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [x] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 1. Subject is limited to 50 characters (not including Jira issue reference) 1. Subject does not end with a period 1. Subject uses the imperative mood ("add", not "adding") 1. Body wraps at 72 characters 1. Body explains "what" and "why", not "how" ### Documentation - [x] In case of new functionality, my PR adds documentation that describes how to use it. - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added. - All the public functions and the classes in the PR contain docstrings that explain what it does ### Code Quality - [x] Passes `flake8`
[GitHub] jmcarp commented on a change in pull request #4390: [AIRFLOW-3584] Use ORM DAGs for index view.
jmcarp commented on a change in pull request #4390: [AIRFLOW-3584] Use ORM DAGs for index view. URL: https://github.com/apache/airflow/pull/4390#discussion_r245878602
## File path: airflow/www_rbac/compile_assets.sh
@@ -23,6 +23,6 @@ if [ -d airflow/www_rbac/static/dist ]; then
 fi
 cd airflow/www_rbac/
-npm install
+# npm install
Review comment: That's a typo, I'll revert it.
[GitHub] feng-tao edited a comment on issue #4421: [AIRFLOW-3468] Remove KnownEvent(Event)?
feng-tao edited a comment on issue #4421: [AIRFLOW-3468] Remove KnownEvent(Event)? URL: https://github.com/apache/airflow/pull/4421#issuecomment-452174547 @Fokko , regarding not having a migration script, I wonder what will happen in the following case: assume we have three tables ``A``, ``B``, ``C``. We modify table ``A`` and provide an alembic migration script (version 1), then delete table ``B`` without a script, then modify table ``C`` with another alembic script (version 2). In this case, will the migration (upgrade / downgrade) run successfully from version 1 to 2, or vice versa? If yes, I am +1 :)
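The scenario in the comment above can be sketched as a toy simulation (not Alembic itself; all names are made up for illustration): Alembic only executes the scripts in its revision chain, so a table dropped without a migration script does not break the version-1-to-version-2 upgrade, it merely leaves the database drifted from the models.

```python
# Toy model of an alembic-style revision chain. Each "script" mutates a
# fake schema dict; the untracked drop of table B happens outside the
# chain, yet upgrading v1 -> v2 still runs to completion.
schema = {"A": ["id"], "B": ["id"], "C": ["id"]}

def upgrade_v1(s):
    # migration script, version 1: modify table A
    s["A"] = ["id", "new_col"]

def drop_b_without_script(s):
    # table B deleted directly, with no migration script recorded
    del s["B"]

def upgrade_v2(s):
    # migration script, version 2: modify table C
    s["C"] = ["id", "other_col"]

for step in (upgrade_v1, drop_b_without_script, upgrade_v2):
    step(schema)
```

The downgrade direction is symmetric: as long as neither script references table ``B``, the chain is unaffected by its absence.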
[GitHub] feng-tao commented on a change in pull request #4457: [AIRFLOW-XXX] Fix TestTriggerDag flaky test
feng-tao commented on a change in pull request #4457: [AIRFLOW-XXX] Fix TestTriggerDag flaky test URL: https://github.com/apache/airflow/pull/4457#discussion_r245870276
## File path: tests/www_rbac/test_views.py
@@ -1464,12 +1464,9 @@ def test_trigger_dag_button_normal_exist(self):
     def test_trigger_dag_button(self):
-        test_dag_id = "example_bash_operator"
+        test_dag_id = "example_python_operator"
         DR = models.DagRun
-        self.session.query(DR).delete()
-        self.session.commit()
-
         self.client.get('trigger?dag_id={}'.format(test_dag_id))
Review comment: Thanks. This PR is for testing only. I would like to see why the test fails particularly for the MySQL ORM.
[GitHub] jmcarp commented on issue #4436: [AIRFLOW-3631] Update flake8 and fix lint.
jmcarp commented on issue #4436: [AIRFLOW-3631] Update flake8 and fix lint. URL: https://github.com/apache/airflow/pull/4436#issuecomment-452143443 Updated.
[GitHub] kaxil commented on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator with git-sync
kaxil commented on issue #: [AIRFLOW-3281] Fix Kubernetes operator with git-sync URL: https://github.com/apache/airflow/pull/#issuecomment-452132314 @feng-tao I have included that commit as well. @dimberman I have spent some time today cherry-picking and resolving some conflicts (with some help from this PR - thank you guys); it would be great if you could verify that everything that was needed is there. Then we will try to resolve any issues with the tests.
[GitHub] feng-tao commented on issue #4421: [AIRFLOW-3468] Remove KnownEvent(Event)?
feng-tao commented on issue #4421: [AIRFLOW-3468] Remove KnownEvent(Event)? URL: https://github.com/apache/airflow/pull/4421#issuecomment-452119451 but CI is failing.
[GitHub] feng-tao commented on issue #4421: [AIRFLOW-3468] Remove KnownEvent(Event)?
feng-tao commented on issue #4421: [AIRFLOW-3468] Remove KnownEvent(Event)? URL: https://github.com/apache/airflow/pull/4421#issuecomment-452119383 @Fokko , I see your point. +1
[jira] [Commented] (AIRFLOW-3645) Use a base_executor_config and merge operator level executor_config
[ https://issues.apache.org/jira/browse/AIRFLOW-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736462#comment-16736462 ] ASF GitHub Bot commented on AIRFLOW-3645: - Mokubyow commented on pull request #4456: [AIRFLOW-3645] Add base_executor_config URL: https://github.com/apache/airflow/pull/4456 Make sure you have checked _all_ steps below. ### Jira - [x] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW-3645) ### Description - [x] Add a base_executor_config that merges any operator_level executor_config into itself. This helps to dry up KubernetesExecutor deployments that might need to pass an executor config to all operators. ### Tests - [x] My PR adds the following unit tests: `nosetests -v tests/utils/test_helpers.py:TestHelpers.test_dict_merge` ### Commits - [x] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 1. Subject is limited to 50 characters (not including Jira issue reference) 1. Subject does not end with a period 1. Subject uses the imperative mood ("add", not "adding") 1. Body wraps at 72 characters 1. Body explains "what" and "why", not "how" ### Documentation - [x] In case of new functionality, my PR adds documentation that describes how to use it. - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added. - All the public functions and the classes in the PR contain docstrings that explain what it does ### Code Quality - [x] Passes `flake8`
> Use a base_executor_config and merge operator level executor_config > --- > > Key: AIRFLOW-3645 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3645 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Kyle Hamlin >Assignee: Kyle Hamlin >Priority: Major > Fix For: 1.10.2 > > > It would be very useful to have a `base_executor_config` and merge the base > config with any operator level `executor_config`. > I imagine referencing a python dict similar to how we reference a custom > logging_config > *Example config* > {code:java} > [core] > base_executor_config = config.base_executor_config.BASE_EXECUTOR_CONFIG > {code} > *Example base_executor_config* > {code:java} > BASE_EXECUTOR_CONFIG = { > "KubernetesExecutor": { > "image_pull_policy": "Always", > "annotations": { > "iam.amazonaws.com/role": "arn:aws:iam::" > }, > "volumes": [ > { > "name": "airflow-lib", > "persistentVolumeClaim": { > "claimName": "airflow-lib" > } > } > ], > "volume_mounts": [ > { > "name": "airflow-lib", > "mountPath": "/usr/local/airflow/lib", > } > ] > } > } > {code} > *Example operator* > {code:java} > run_this = PythonOperator( > task_id='print_the_context', > provide_context=True, > python_callable=print_context, > executor_config={ > "KubernetesExecutor": { > "request_memory": "256Mi", > "request_cpu": "100m", > "limit_memory": "256Mi", > "limit_cpu": "100m" > } > }, > dag=dag) > {code} > Then we'll want to have a dict deep merge function that returns the > executor_config > *Merge functionality* > {code:java} > import collections > from airflow import conf > from airflow.utils.module_loading import import_string > def dict_merge(dct, merge_dct): > """ Recursive dict merge. Inspired by :meth:``dict.update()``, instead of > updating only top-level keys, dict_merge recurses down into dicts nested > to an arbitrary depth, updating keys. The ``merge_dct`` is merged into > ``dct``. > :param dct: dict onto which the merge is executed > :param merge_dct: dct merged into dct > :return: dct > """ > for k, v in merge_dct.items(): > if (k in dct and isinstance(dct[k], dict) > and isinstance(merge_dct[k], collections.Mapping)): > dict_merge(dct[k], merge_dct[k]) > else: > dct[k] = merge_dct[k] > > return dct > def get_executor_config(executor_config): > """Try to import base_executor_config and merge it with provided > executor_config. > :param executor_config: operator level executor config > :return: dict""" > > try: > base_executor_config = import_string( > conf.get('core', 'base_executor_config')) > merged_executor_config = dict_merge( > base_executor_config, executor_config) > return merged_executor_config > except Exception: > return executor_config > {code} > Finally, we'll want to call the get_executor_config function in the > `BaseOperator` possibly here: > https://github.com/apache/airflow/blob/master/airflow/models/__init__.py#L2348
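The `dict_merge` sketch quoted in the issue can be made runnable as below (a sketch following the issue's own description; note that on Python 3.3+ the abstract `Mapping` class lives in `collections.abc`, not `collections`):

```python
from collections.abc import Mapping

def dict_merge(dct, merge_dct):
    """Recursively merge merge_dct into dct, as described in the issue:
    nested dicts are merged key by key instead of being replaced wholesale."""
    for k, v in merge_dct.items():
        if k in dct and isinstance(dct[k], dict) and isinstance(v, Mapping):
            dict_merge(dct[k], v)
        else:
            dct[k] = v
    return dct

# Base config (trimmed from the issue's example) merged with an
# operator-level executor_config:
base = {"KubernetesExecutor": {"image_pull_policy": "Always",
                               "annotations": {"iam.amazonaws.com/role": "arn:aws:iam::"}}}
task_level = {"KubernetesExecutor": {"request_memory": "256Mi",
                                     "request_cpu": "100m"}}
merged = dict_merge(base, task_level)
```

After the merge, `merged["KubernetesExecutor"]` carries both the base-level keys (`image_pull_policy`, `annotations`) and the operator-level resource requests.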
[jira] [Updated] (AIRFLOW-3647) Contributed SparkSubmitOperator doesn't honor --archives configuration
[ https://issues.apache.org/jira/browse/AIRFLOW-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ken Melms updated AIRFLOW-3647: --- Description: The contributed SparkSubmitOperator has no ability to honor the spark-submit configuration field "--archives" which is treated subtly different than "--files" or "--py-files" in that it will unzip the archive into the application's working directory, and can optionally add an alias to the unzipped folder so that you can refer to it elsewhere in your submission. EG: spark-submit --archives=hdfs:user/someone/python35_venv.zip#PYTHON --conf "spark.yarn.appMasterEnv.PYSPARK_PYTHON=./PYTHON/python35/bin/python3" run_me.py In our case - this behavior allows for multiple python virtual environments to be sourced from HDFS without incurring the penalty of pushing the whole python virtual env to the cluster each submission. This solves (for us) using python-based spark jobs on a cluster that the end user has no ability to define the python modules in use. was: The contributed SparkSubmitOperator has no ability to honor the spark-submit configuration field "--archives" which is treated subtly different than "--files" or "--py-files" in that it will unzip the archive into the application's working directory, and can optionally add an alias to the unzipped folder so that you can refer to it elsewhere in your submission. EG: spark-submit --archives=hdfs:user/someone/python35_venv.zip#PYTHON --conf "spark.yarn.appMasterEnv.PYSPARK_PYTHON=./PYTHON/python35/bin/python3" run_me.py In our case - this behavior allows for multiple python virtual environments to be sourced from HDFS without incurring the penalty of pushing the whole python virtual env to the cluster each submission. This solves (for us) using python-based spark jobs on a cluster that the end user has no ability to define the python modules in use. 
> Contributed SparkSubmitOperator doesn't honor --archives configuration > -- > > Key: AIRFLOW-3647 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3647 > Project: Apache Airflow > Issue Type: Improvement > Components: contrib >Affects Versions: 1.10.1 > Environment: Linux (RHEL 7) > Python 3.5 (using a virtual environment) > spark-2.1.3-bin-hadoop26 > Airflow 1.10.1 > CDH 5.14 Hadoop [Yarn] cluster (no end user / dev modifications allowed) >Reporter: Ken Melms >Priority: Minor > Labels: easyfix, newbie > Original Estimate: 1h > Remaining Estimate: 1h > > The contributed SparkSubmitOperator has no ability to honor the spark-submit > configuration field "--archives" which is treated subtly different than > "--files" or "--py-files" in that it will unzip the archive into the > application's working directory, and can optionally add an alias to the > unzipped folder so that you can refer to it elsewhere in your submission. > EG: > spark-submit --archives=hdfs:user/someone/python35_venv.zip#PYTHON > --conf "spark.yarn.appMasterEnv.PYSPARK_PYTHON=./PYTHON/python35/bin/python3" > run_me.py > In our case - this behavior allows for multiple python virtual environments > to be sourced from HDFS without incurring the penalty of pushing the whole > python virtual env to the cluster each submission. This solves (for us) > using python-based spark jobs on a cluster that the end user has no ability > to define the python modules in use. >
[jira] [Created] (AIRFLOW-3647) Contributed SparkSubmitOperator doesn't honor --archives configuration
Ken Melms created AIRFLOW-3647: -- Summary: Contributed SparkSubmitOperator doesn't honor --archives configuration Key: AIRFLOW-3647 URL: https://issues.apache.org/jira/browse/AIRFLOW-3647 Project: Apache Airflow Issue Type: Improvement Components: contrib Affects Versions: 1.10.1 Environment: Linux (RHEL 7) Python 3.5 (using a virtual environment) spark-2.1.3-bin-hadoop26 Airflow 1.10.1 CDH 5.14 Hadoop [Yarn] cluster (no end user / dev modifications allowed) Reporter: Ken Melms The contributed SparkSubmitOperator has no ability to honor the spark-submit configuration field "--archives" which is treated subtly different than "--files" or "--py-files" in that it will unzip the archive into the application's working directory, and can optionally add an alias to the unzipped folder so that you can refer to it elsewhere in your submission. EG: spark-submit --archives=hdfs:user/someone/python35_venv.zip#PYTHON --conf "spark.yarn.appMasterEnv.PYSPARK_PYTHON=./PYTHON/python35/bin/python3" run_me.py In our case - this behavior allows for multiple python virtual environments to be sourced from HDFS without incurring the penalty of pushing the whole python virtual env to the cluster each submission. This solves (for us) using python-based spark jobs on a cluster that the end user has no ability to define the python modules in use.
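For illustration, the spark-submit invocation above could be assembled as follows (a hypothetical sketch: `build_spark_submit_cmd` is not part of Airflow's SparkSubmitHook, which as of 1.10.1 exposes no `archives` parameter; the function and its signature are assumptions):

```python
def build_spark_submit_cmd(application, archives=None, conf=None):
    """Assemble a spark-submit command line, including the --archives flag."""
    cmd = ["spark-submit"]
    if archives:
        # e.g. "path.zip#ALIAS": spark unzips the archive into the working
        # directory and exposes the folder under the alias after '#'
        cmd += ["--archives", archives]
    for key, value in (conf or {}).items():
        cmd += ["--conf", "{}={}".format(key, value)]
    cmd.append(application)
    return cmd

cmd = build_spark_submit_cmd(
    "run_me.py",
    archives="hdfs:user/someone/python35_venv.zip#PYTHON",
    conf={"spark.yarn.appMasterEnv.PYSPARK_PYTHON": "./PYTHON/python35/bin/python3"},
)
```

Supporting `--archives` in the hook would then amount to appending these two tokens to the command list it already builds for `--files` and `--py-files`.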
[GitHub] felipegasparini commented on issue #4036: [AIRFLOW-2744] Allow RBAC to accept plugins for views and links.
felipegasparini commented on issue #4036: [AIRFLOW-2744] Allow RBAC to accept plugins for views and links. URL: https://github.com/apache/airflow/pull/4036#issuecomment-452113006 We also need to fix the example in the documentation (https://airflow.apache.org/plugins.html#example). It is currently broken: import fails, wrong method name in the view and missing path for templates. I made a sample project to make this plugin integration work: https://github.com/felipegasparini/airflow_plugin_rbac_test/blob/dbaa049a9996df275b1d90f74b93ffbf206bb1d5/airflow/plugins/test_plugin/test_plugin.py I will submit a PR to fix the doc later, but just posting it here since it may be useful for others.
[GitHub] feng-tao commented on issue #4407: [AIRFLOW-3600] Remove dagbag from trigger
feng-tao commented on issue #4407: [AIRFLOW-3600] Remove dagbag from trigger URL: https://github.com/apache/airflow/pull/4407#issuecomment-452107835 @Fokko , looking at recent commits, this is the one that modifies this part of the code. And the CI does not always fail with this test (sometimes it works, sometimes not). Hence I suspect this PR is the cause. And we are not sure whether CI was broken when this PR was checked in, right?
[GitHub] Fokko commented on issue #4405: [AIRFLOW-3598] Add tests for MsSqlToHiveTransfer
Fokko commented on issue #4405: [AIRFLOW-3598] Add tests for MsSqlToHiveTransfer URL: https://github.com/apache/airflow/pull/4405#issuecomment-452107449 Thanks!
[jira] [Work started] (AIRFLOW-3645) Use a base_executor_config and merge operator level executor_config
[ https://issues.apache.org/jira/browse/AIRFLOW-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on AIRFLOW-3645 started by Kyle Hamlin.
[jira] [Assigned] (AIRFLOW-3645) Use a base_executor_config and merge operator level executor_config
[ https://issues.apache.org/jira/browse/AIRFLOW-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kyle Hamlin reassigned AIRFLOW-3645: Assignee: Kyle Hamlin
[GitHub] Fokko commented on issue #4399: [AIRFLOW-3594] Unify different License Header
Fokko commented on issue #4399: [AIRFLOW-3594] Unify different License Header URL: https://github.com/apache/airflow/pull/4399#issuecomment-452106547 @feluelle I've restarted the test, but due to the recent renaming, I'm not sure if the status of the CI will propagate properly.
[GitHub] galak75 removed a comment on issue #4292: [AIRFLOW-2508] Handle non string types in Operators templatized fields
galak75 removed a comment on issue #4292: [AIRFLOW-2508] Handle non string types in Operators templatized fields URL: https://github.com/apache/airflow/pull/4292#issuecomment-445222367 Everything went well on our fork (see https://travis-ci.org/VilledeMontreal/incubator-airflow/builds/464753497) But one build failed on Travis with the error below:
```
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself. Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received The build has been terminated
```
Could anyone restart the build on my PR please? I'm not able to do it, probably a question of permissions...
[GitHub] ArgentFalcon commented on issue #3533: [AIRFLOW-161] New redirect route and extra links
ArgentFalcon commented on issue #3533: [AIRFLOW-161] New redirect route and extra links URL: https://github.com/apache/airflow/pull/3533#issuecomment-452102290 Oh yeah, I should finish this up. I have to pull some more changes that we did internally at Lyft that make it more versatile.
[GitHub] feluelle commented on issue #4405: [AIRFLOW-3598] Add tests for MsSqlToHiveTransfer
feluelle commented on issue #4405: [AIRFLOW-3598] Add tests for MsSqlToHiveTransfer URL: https://github.com/apache/airflow/pull/4405#issuecomment-452102237 Sure.
[GitHub] Fokko commented on a change in pull request #4390: [AIRFLOW-3584] Use ORM DAGs for index view.
Fokko commented on a change in pull request #4390: [AIRFLOW-3584] Use ORM DAGs for index view. URL: https://github.com/apache/airflow/pull/4390#discussion_r245817009
## File path: airflow/www_rbac/compile_assets.sh
@@ -23,6 +23,6 @@ if [ -d airflow/www_rbac/static/dist ]; then
 fi
 cd airflow/www_rbac/
-npm install
+# npm install
Review comment: Why is this commented out?
[GitHub] Fokko commented on a change in pull request #4390: [AIRFLOW-3584] Use ORM DAGs for index view.
Fokko commented on a change in pull request #4390: [AIRFLOW-3584] Use ORM DAGs for index view. URL: https://github.com/apache/airflow/pull/4390#discussion_r245817248
## File path: airflow/models/__init__.py
@@ -240,6 +240,20 @@ def clear_task_instances(tis,
         dr.start_date = timezone.utcnow()
 
+def get_last_dagrun(dag_id, session, include_externally_triggered=False):
+    """
+    Returns the last dag run for a dag, None if there was none.
+    Last dag run can be any type of run eg. scheduled or backfilled.
+    Overridden DagRuns are ignored.
+    """
+    DR = DagRun
+    query = session.query(DR).filter(DR.dag_id == dag_id)
+    if not include_externally_triggered:
+        query = query.filter(DR.external_trigger == False)  # noqa
+    query = query.order_by(DR.execution_date.desc())
Review comment: Shouldn't this query benefit from an index as well?
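The query under review can be restated in plain Python to show what it returns (a sketch with made-up dict records, not Airflow code); in SQL terms, both the `dag_id` filter and the `execution_date` ordering would typically be served by a composite index on `(dag_id, execution_date)`, which is what the review comment is asking about:

```python
def get_last_dagrun(runs, dag_id, include_externally_triggered=False):
    """Last run for a dag, or None; optionally skip externally triggered runs."""
    candidates = [r for r in runs
                  if r["dag_id"] == dag_id
                  and (include_externally_triggered or not r["external_trigger"])]
    if not candidates:
        return None
    # mirrors ORDER BY execution_date DESC LIMIT 1
    return max(candidates, key=lambda r: r["execution_date"])

runs = [
    {"dag_id": "d1", "execution_date": 1, "external_trigger": False},
    {"dag_id": "d1", "execution_date": 2, "external_trigger": True},
]
```

With the default flag the externally triggered run is excluded, so run 1 is "last"; including external triggers returns run 2, and an unknown dag_id yields None.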
[GitHub] Fokko commented on issue #4378: AIRFLOW-3573 - Remove DagStat table
Fokko commented on issue #4378: AIRFLOW-3573 - Remove DagStat table URL: https://github.com/apache/airflow/pull/4378#issuecomment-452100456 @ffinfo PTAL
[GitHub] Fokko commented on issue #4351: [AIRFLOW-3554] Remove contrib folder from code cov omit list
Fokko commented on issue #4351: [AIRFLOW-3554] Remove contrib folder from code cov omit list URL: https://github.com/apache/airflow/pull/4351#issuecomment-452098776 I'm okay with this as well. The contrib code should have tests too.
[GitHub] Fokko commented on a change in pull request #4298: [AIRFLOW-3478] Make sure that the session is closed
Fokko commented on a change in pull request #4298: [AIRFLOW-3478] Make sure that the session is closed URL: https://github.com/apache/airflow/pull/4298#discussion_r245813920
## File path: airflow/bin/cli.py
@@ -423,14 +418,11 @@ def unpause(args, dag=None):

 def set_is_paused(is_paused, args, dag=None):
     dag = dag or get_dag(args)
-    session = settings.Session()
-    dm = session.query(DagModel).filter(
-        DagModel.dag_id == dag.dag_id).first()
-    dm.is_paused = is_paused
-    session.commit()

Review comment: I've restored the `.commit()` for now. I would like to work this Friday on setting `expire_on_commit=True`: https://github.com/apache/airflow/blob/master/airflow/settings.py#L198 It feels like we have a lot of connections to the database because they aren't properly closed.
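The pattern under discussion, always committing/rolling back and closing a session, can be sketched with a context manager. This is a hedged illustration, not Airflow's actual API: `session_factory` stands in for a sessionmaker, and `FakeSession` exists only so the sketch runs without a database.

```python
# Sketch, assuming a sessionmaker-like callable; not Airflow's settings.Session.
from contextlib import contextmanager


@contextmanager
def create_session(session_factory):
    session = session_factory()
    try:
        yield session
        session.commit()       # commit only if the block succeeded
    except Exception:
        session.rollback()     # undo on any error, then re-raise
        raise
    finally:
        session.close()        # always release the connection


class FakeSession:
    """Records calls so the sketch can be exercised without a real DB."""
    def __init__(self):
        self.calls = []

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")

    def close(self):
        self.calls.append("close")


fake = FakeSession()
with create_session(lambda: fake) as session:
    pass  # ORM work would go here
print(fake.calls)  # ['commit', 'close']
```

With this shape, a leaked session is impossible: `close()` runs whether the block succeeds or raises.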
[GitHub] Fokko commented on issue #4383: [AIRFLOW-3475] Move ImportError out of models.py
Fokko commented on issue #4383: [AIRFLOW-3475] Move ImportError out of models.py URL: https://github.com/apache/airflow/pull/4383#issuecomment-452097714 @BasPH I've restarted the failed tests. Maybe do a rebase? It seems to fail on k8s.
[GitHub] Fokko commented on issue #4298: [AIRFLOW-3478] Make sure that the session is closed
Fokko commented on issue #4298: [AIRFLOW-3478] Make sure that the session is closed URL: https://github.com/apache/airflow/pull/4298#issuecomment-452096893 Rebased :-)
[GitHub] Fokko commented on issue #4320: [AIRFLOW-3515] Remove the run_duration option
Fokko commented on issue #4320: [AIRFLOW-3515] Remove the run_duration option URL: https://github.com/apache/airflow/pull/4320#issuecomment-452096465 I've rebased onto master and resolved the conflicts.
[GitHub] Fokko commented on issue #4421: [AIRFLOW-3468] Remove KnownEvent(Event)?
Fokko commented on issue #4421: [AIRFLOW-3468] Remove KnownEvent(Event)? URL: https://github.com/apache/airflow/pull/4421#issuecomment-452095834 Rebased. @feng-tao I've looked into the Alembic script, but it becomes quite nasty in my opinion. The upgrade will be a `DROP TABLE IF EXISTS`, and the downgrade will recreate the tables which aren't used in Airflow 2.0 anymore.
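The upgrade/downgrade shape Fokko describes can be sketched in plain SQL on SQLite for illustration. The real change would be an Alembic revision, and the recreated columns below are illustrative placeholders, not the exact historical KnownEvent schema:

```python
# Illustration only: table names follow the KnownEvent models being removed,
# but the recreated column lists are assumptions, not the real schema.
import sqlite3


def upgrade(conn):
    # IF EXISTS makes the upgrade safe even if the tables were already removed.
    conn.execute("DROP TABLE IF EXISTS known_event")
    conn.execute("DROP TABLE IF EXISTS known_event_type")


def downgrade(conn):
    # The downgrade can only recreate empty tables; the dropped data is gone.
    conn.execute(
        "CREATE TABLE known_event_type ("
        "id INTEGER PRIMARY KEY, label TEXT)")
    conn.execute(
        "CREATE TABLE known_event ("
        "id INTEGER PRIMARY KEY, label TEXT, "
        "known_event_type_id INTEGER REFERENCES known_event_type(id))")


conn = sqlite3.connect(":memory:")
upgrade(conn)    # no-op on a fresh database, thanks to IF EXISTS
downgrade(conn)  # recreates the skeletons
upgrade(conn)    # drops them again
```

This is exactly why the migration feels nasty: the downgrade cannot restore data, only structure.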
[GitHub] feng-tao commented on issue #4036: [AIRFLOW-2744] Allow RBAC to accept plugins for views and links.
feng-tao commented on issue #4036: [AIRFLOW-2744] Allow RBAC to accept plugins for views and links. URL: https://github.com/apache/airflow/pull/4036#issuecomment-452093401 @oliviersm199, it seems that PluginRBACTest fails if re-enabled (it is currently disabled). Do you think you have time to fix the test?
[jira] [Comment Edited] (AIRFLOW-3292) `delete_dag` endpoint and cli commands don't delete on exact dag_id matching
[ https://issues.apache.org/jira/browse/AIRFLOW-3292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736368#comment-16736368 ] Teresa Martyny edited comment on AIRFLOW-3292 at 1/7/19 9:25 PM: - Sorry Ash, I just saw this response. We don't use subdags, so we were unaware of this naming convention. Adding a validation to prevent people from naming dags this way would be great. In the meantime, we have created a ticket on our end to rename our dags. Thanks for clarifying! was (Author: teresamartyny): Sorry Ash, I just saw this response. We don't use subdags, so we were unaware of this naming convention. Adding a validation to prevent people from namings dags this way would be great. In the meantime, we have created a ticket on our end to rename our dags. Thanks for clarifying! > `delete_dag` endpoint and cli commands don't delete on exact dag_id matching > > > Key: AIRFLOW-3292 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3292 > Project: Apache Airflow > Issue Type: Bug > Components: api, cli >Affects Versions: 1.10.0 >Reporter: Teresa Martyny >Priority: Major > > If you have the following dag ids: `schema`, `schema.table1`, > `schema.table2`, `schema_replace` > When you hit the delete_dag endpoint with the dag id: `schema`, it will > delete `schema`, `schema.table1`, and `schema.table2`, not just `schema`. > Underscores are fine so it doesn't delete `schema_replace`, but periods are > not. > If this is expected behavior, clarifying that functionality in the docs would > be great, and then I can submit a feature request for the ability to use > regex for exact matching with this command and endpoint. > Thanks!! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] feng-tao edited a comment on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator with git-sync
feng-tao edited a comment on issue #: [AIRFLOW-3281] Fix Kubernetes operator with git-sync URL: https://github.com/apache/airflow/pull/#issuecomment-452078774 hey @kaxil, that's great news. I think we may need to revert this commit (https://github.com/apache/airflow/commit/b5b9287a75596a617557798f1286cf7b89c55350#diff-a7b22c07c43739c8eb0850a6fd6f7eb8) in the v1-10-test branch, as it is already included in the master branch, and cherry-pick the same commit from master. I think the original author created a separate commit for the v1.10.1 release. Once that commit is reverted, things should be much easier. And we should include this critical fix as well (https://github.com/apache/airflow/pull/4305/commits/bf45855e11e0cb80040615af19fe9138406cb52b) once the conflict is resolved.
[GitHub] feng-tao commented on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator with git-sync
feng-tao commented on issue #: [AIRFLOW-3281] Fix Kubernetes operator with git-sync URL: https://github.com/apache/airflow/pull/#issuecomment-452079131 @kaxil, thanks for running the release :)
[GitHub] kaxil commented on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator with git-sync
kaxil commented on issue #: [AIRFLOW-3281] Fix Kubernetes operator with git-sync URL: https://github.com/apache/airflow/pull/#issuecomment-452076833 Trying to resolve conflicts :D so that we can include https://github.com/apache/incubator-airflow/pull/3770 and your DAG-level access commit
[jira] [Commented] (AIRFLOW-3402) Set default kubernetes affinity and toleration settings in airflow.cfg
[ https://issues.apache.org/jira/browse/AIRFLOW-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736325#comment-16736325 ] ASF GitHub Bot commented on AIRFLOW-3402: - kaxil commented on pull request #4454: [AIRFLOW-3402] Port PR #4247 to 1.10-test URL: https://github.com/apache/airflow/pull/4454 > Set default kubernetes affinity and toleration settings in airflow.cfg > -- > > Key: AIRFLOW-3402 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3402 > Project: Apache Airflow > Issue Type: Improvement > Components: kubernetes >Reporter: Kevin Pullin >Assignee: Kevin Pullin >Priority: Major > Fix For: 1.10.2 > > > Currently airflow supports setting kubernetes `affinity` and `toleration` > configuration inside dags using either a `KubernetesExecutorConfig` > definition or using the `KubernetesPodOperator`. > In order to reduce having to set and maintain this configuration in every > dag, it'd be useful to have the ability to set these globally in the > airflow.cfg file. One use case is to force all kubernetes pods to run on a > particular set of dedicated airflow nodes, which requires both affinity rules > and tolerations.
[GitHub] kaxil closed pull request #4454: [AIRFLOW-3402] Port PR #4247 to 1.10-test
kaxil closed pull request #4454: [AIRFLOW-3402] Port PR #4247 to 1.10-test URL: https://github.com/apache/airflow/pull/4454 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

diff --git a/airflow/config_templates/default_airflow.cfg b/airflow/config_templates/default_airflow.cfg
index c8aa4061e7..a72604a536 100644
--- a/airflow/config_templates/default_airflow.cfg
+++ b/airflow/config_templates/default_airflow.cfg
@@ -630,6 +630,16 @@ gcp_service_account_keys =
 # It will raise an exception if called from a process not running in a kubernetes environment.
 in_cluster = True

+# Affinity configuration as a single line formatted JSON object.
+# See the affinity model for top-level key names (e.g. `nodeAffinity`, etc.):
+# https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.12/#affinity-v1-core
+affinity =
+
+# A list of toleration objects as a single line formatted JSON array
+# See:
+# https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.12/#toleration-v1-core
+tolerations =
+
 [kubernetes_node_selectors]
 # The Key-value pairs to be given to worker pods.
 # The worker pods will be scheduled to the nodes of the specified key-value pairs.
diff --git a/airflow/contrib/example_dags/example_kubernetes_executor.py b/airflow/contrib/example_dags/example_kubernetes_executor.py
index 1d9bb73043..d03e255ab3 100644
--- a/airflow/contrib/example_dags/example_kubernetes_executor.py
+++ b/airflow/contrib/example_dags/example_kubernetes_executor.py
@@ -32,6 +32,31 @@
     schedule_interval=None
 )

+affinity = {
+    'podAntiAffinity': {
+        'requiredDuringSchedulingIgnoredDuringExecution': [
+            {
+                'topologyKey': 'kubernetes.io/hostname',
+                'labelSelector': {
+                    'matchExpressions': [
+                        {
+                            'key': 'app',
+                            'operator': 'In',
+                            'values': ['airflow']
+                        }
+                    ]
+                }
+            }
+        ]
+    }
+}
+
+tolerations = [{
+    'key': 'dedicated',
+    'operator': 'Equal',
+    'value': 'airflow'
+}]
+

 def print_stuff():
     print("stuff!")
@@ -59,11 +84,14 @@ def use_zip_binary():
     executor_config={"KubernetesExecutor": {"image": "airflow/ci_zip:latest"}}
 )

-# Limit resources on this operator/task
+# Limit resources on this operator/task with node affinity & tolerations
 three_task = PythonOperator(
     task_id="three_task", python_callable=print_stuff, dag=dag,
     executor_config={
-        "KubernetesExecutor": {"request_memory": "128Mi", "limit_memory": "128Mi"}}
+        "KubernetesExecutor": {"request_memory": "128Mi",
+                               "limit_memory": "128Mi",
+                               "tolerations": tolerations,
+                               "affinity": affinity}}
 )

 start_task.set_downstream([one_task, two_task, three_task])

diff --git a/airflow/contrib/executors/kubernetes_executor.py b/airflow/contrib/executors/kubernetes_executor.py
index dd9cd3ec53..e06a5f47e1 100644
--- a/airflow/contrib/executors/kubernetes_executor.py
+++ b/airflow/contrib/executors/kubernetes_executor.py
@@ -16,6 +16,7 @@
 # under the License.

 import base64
+import json
 import multiprocessing
 from queue import Queue
 from dateutil import parser
@@ -40,7 +41,7 @@ class KubernetesExecutorConfig:
     def __init__(self, image=None, image_pull_policy=None, request_memory=None,
                  request_cpu=None, limit_memory=None, limit_cpu=None,
                  gcp_service_account_key=None, node_selectors=None, affinity=None,
-                 annotations=None, volumes=None, volume_mounts=None):
+                 annotations=None, volumes=None, volume_mounts=None, tolerations=None):
         self.image = image
         self.image_pull_policy = image_pull_policy
         self.request_memory = request_memory
@@ -53,16 +54,18 @@ def __init__(self, image=None, image_pull_policy=None, request_memory=None,
         self.annotations = annotations
         self.volumes = volumes
         self.volume_mounts = volume_mounts
+        self.tolerations = tolerations

     def __repr__(self):
         return "{}(image={}, image_pull_policy={}, request_memory={}, request_cpu={}, " \
                "limit_memory={}, limit_cpu={}, gcp_service_account_key={}, " \
                "node_selectors={}, affinity={}, annotations={}, volumes={}, " \
-               "volume_mounts={})" \
+               "volume_mounts={}, tolerations={})" \
             .format(KubernetesExecutorConfig.__name__, self.image, self.image_pull_policy,
                     self.request_memory, self.request_cpu, self.limit_memory,
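The new cfg values in the diff above are single-line JSON strings. A short sketch of how they could be consumed (this is an illustration, not the actual Airflow parsing code): empty means "not configured", otherwise `json.loads` yields the same dict/list shape the example DAG builds by hand.

```python
# Sketch: cfg_affinity / cfg_tolerations stand in for values read from airflow.cfg.
import json

cfg_affinity = (
    '{"podAntiAffinity": {"requiredDuringSchedulingIgnoredDuringExecution": '
    '[{"topologyKey": "kubernetes.io/hostname"}]}}'
)
cfg_tolerations = '[{"key": "dedicated", "operator": "Equal", "value": "airflow"}]'

# Empty string means the option was left unset in airflow.cfg.
affinity = json.loads(cfg_affinity) if cfg_affinity else {}
tolerations = json.loads(cfg_tolerations) if cfg_tolerations else []

print(tolerations[0]["key"])  # dedicated
```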
[GitHub] feluelle commented on a change in pull request #4455: [AIRFLOW-3519] Fix example http operator
feluelle commented on a change in pull request #4455: [AIRFLOW-3519] Fix example http operator URL: https://github.com/apache/airflow/pull/4455#discussion_r245791673 ## File path: airflow/example_dags/example_http_operator.py ## @@ -92,7 +92,7 @@ http_conn_id='http_default', endpoint='', request_params={}, -response_check=lambda response: True if "Google" in response.content else False, Review comment: ..and I thought `"Google" in response.text` makes more sense (to me) and you do not need to do the byte conversion yourself. :)
[GitHub] feng-tao edited a comment on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator with git-sync
feng-tao edited a comment on issue #: [AIRFLOW-3281] Fix Kubernetes operator with git-sync URL: https://github.com/apache/airflow/pull/#issuecomment-452075293 @kaxil, what is the issue?
[GitHub] kaxil edited a comment on issue #4444: [AIRFLOW-3281] Fix Kubernetes operator with git-sync
kaxil edited a comment on issue #: [AIRFLOW-3281] Fix Kubernetes operator with git-sync URL: https://github.com/apache/airflow/pull/#issuecomment-452074777 Sorry, I merged it and had to revert, which caused this piece to fail. May well have to create a new PR.
[GitHub] kaxil commented on a change in pull request #4455: [AIRFLOW-3519] Fix example http operator
kaxil commented on a change in pull request #4455: [AIRFLOW-3519] Fix example http operator URL: https://github.com/apache/airflow/pull/4455#discussion_r245790529 ## File path: airflow/example_dags/example_http_operator.py ## @@ -92,7 +92,7 @@ http_conn_id='http_default', endpoint='', request_params={}, -response_check=lambda response: True if "Google" in response.content else False, Review comment: Makes sense. Thanks @feluelle
[GitHub] feluelle commented on a change in pull request #4455: [AIRFLOW-3519] Fix example http operator
feluelle commented on a change in pull request #4455: [AIRFLOW-3519] Fix example http operator URL: https://github.com/apache/airflow/pull/4455#discussion_r245789254 ## File path: airflow/example_dags/example_http_operator.py ## @@ -92,7 +92,7 @@ http_conn_id='http_default', endpoint='', request_params={}, -response_check=lambda response: True if "Google" in response.content else False, Review comment: Yes. `response.content` is a byte string and not a decoded string like `response.text`. So either `b"Google" in response.content` would work or `"Google" in response.text` . See these Jira tickets: https://issues.apache.org/jira/browse/AIRFLOW-3519 and https://issues.apache.org/jira/browse/AIRFLOW-450
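feluelle's point can be reproduced without a live request; in the sketch below, `content` stands in for `response.content` (bytes) and `text` for `response.text` (decoded str):

```python
# Standalone illustration of the bytes-vs-str mismatch, no network needed.
content = b"<title>Google</title>"   # like response.content
text = content.decode("utf-8")       # like response.text

assert b"Google" in content          # bytes pattern against bytes: fine
assert "Google" in text              # str pattern against decoded text: fine

try:
    "Google" in content              # str pattern against bytes...
except TypeError as exc:
    print("TypeError:", exc)         # ...raises on Python 3
```

On Python 2 the original `"Google" in response.content` happened to work because `str` was bytes; on Python 3 it raises, which is why the example DAG needed fixing.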
[GitHub] kaxil closed pull request #4444: [AIRFLOW-3281] Fix Kubernetes operator with git-sync
kaxil closed pull request #: [AIRFLOW-3281] Fix Kubernetes operator with git-sync URL: https://github.com/apache/airflow/pull/ This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:
[jira] [Commented] (AIRFLOW-3281) Kubernetes git sync implementation is broken
[ https://issues.apache.org/jira/browse/AIRFLOW-3281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736311#comment-16736311 ] ASF GitHub Bot commented on AIRFLOW-3281: - kaxil commented on pull request #: [AIRFLOW-3281] Fix Kubernetes operator with git-sync URL: https://github.com/apache/airflow/pull/ > Kubernetes git sync implementation is broken > > > Key: AIRFLOW-3281 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3281 > Project: Apache Airflow > Issue Type: Bug >Reporter: Riccardo Bini >Assignee: Riccardo Bini >Priority: Major > > The current implementation of git-sync when airflow is being used with > kubernetes is broken. > The init container doesn't share the volume with the airflow container and > the path of the dag folder doesn't take into account the fact that git sync > creates a sym link to the revision
[GitHub] benjamingregory commented on issue #3055: [AIRFLOW-2125] Using binary package psycopg2-binary
benjamingregory commented on issue #3055: [AIRFLOW-2125] Using binary package psycopg2-binary URL: https://github.com/apache/airflow/pull/3055#issuecomment-452071229 @bern4rdelli @jgao54 @Fokko Question as to why this was changed to `psycopg2-binary` given the following warning from http://initd.org/psycopg/docs/install.html#binary-install-from-pypi ``` Note: The -binary package is meant for beginners to start playing with Python and PostgreSQL without the need to meet the build requirements. If you are the maintainer of a published package depending on psycopg2 you shouldn't use psycopg2-binary as a module dependency. For production use you are advised to use the source distribution. ```
[GitHub] kaxil commented on a change in pull request #4455: [AIRFLOW-3519] Fix example http operator
kaxil commented on a change in pull request #4455: [AIRFLOW-3519] Fix example http operator URL: https://github.com/apache/airflow/pull/4455#discussion_r245784539 ## File path: airflow/example_dags/example_http_operator.py ## @@ -92,7 +92,7 @@ http_conn_id='http_default', endpoint='', request_params={}, -response_check=lambda response: True if "Google" in response.content else False, Review comment: response.content works as well. Does it error for you @feluelle ?
[jira] [Created] (AIRFLOW-3646) Fix plugin manager test
Tao Feng created AIRFLOW-3646: - Summary: Fix plugin manager test Key: AIRFLOW-3646 URL: https://issues.apache.org/jira/browse/AIRFLOW-3646 Project: Apache Airflow Issue Type: Bug Reporter: Tao Feng
[GitHub] feng-tao commented on issue #4399: [AIRFLOW-3594] Unify different License Header
feng-tao commented on issue #4399: [AIRFLOW-3594] Unify different License Header URL: https://github.com/apache/airflow/pull/4399#issuecomment-452035908 PTAL @bolkedebruin
[jira] [Commented] (AIRFLOW-3272) Create gRPC hook for creating generic grpc connection
[ https://issues.apache.org/jira/browse/AIRFLOW-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736141#comment-16736141 ] ASF GitHub Bot commented on AIRFLOW-3272: - morgendave commented on pull request #4101: [AIRFLOW-3272] Add base grpc hook URL: https://github.com/apache/airflow/pull/4101 Description: Add support for gRPC connection in airflow. In Airflow there are use cases of calling gRPC services, so instead of creating the channel each time in a PythonOperator, there should be a basic GrpcHook to take care of it. The hook needs to take care of the authentication. > Create gRPC hook for creating generic grpc connection > - > > Key: AIRFLOW-3272 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3272 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Zhiwei Zhao >Assignee: Zhiwei Zhao >Priority: Minor > > Add support for gRPC connection in airflow. > In Airflow there are use cases of calling gRPC services, so instead of each > time creating the channel in a PythonOperator, there should be a basic GrpcHook > to take care of it. The hook needs to take care of the authentication.
[jira] [Created] (AIRFLOW-3645) Use a base_executor_config and merge operator level executor_config
Kyle Hamlin created AIRFLOW-3645: Summary: Use a base_executor_config and merge operator level executor_config Key: AIRFLOW-3645 URL: https://issues.apache.org/jira/browse/AIRFLOW-3645 Project: Apache Airflow Issue Type: Improvement Reporter: Kyle Hamlin Fix For: 1.10.2

It would be very useful to have a `base_executor_config` and merge the base config with any operator-level `executor_config`. I imagine referencing a Python dict, similar to how we reference a custom logging_config.

*Example config*
{code}
[core]
base_executor_config = config.base_executor_config.BASE_EXECUTOR_CONFIG
{code}

*Example base_executor_config*
{code:python}
BASE_EXECUTOR_CONFIG = {
    "KubernetesExecutor": {
        "image_pull_policy": "Always",
        "annotations": {
            "iam.amazonaws.com/role": "arn:aws:iam::"
        },
        "volumes": [
            {
                "name": "airflow-lib",
                "persistentVolumeClaim": {"claimName": "airflow-lib"}
            }
        ],
        "volume_mounts": [
            {
                "name": "airflow-lib",
                "mountPath": "/usr/local/airflow/lib",
            }
        ]
    }
}
{code}

*Example operator*
{code:python}
run_this = PythonOperator(
    task_id='print_the_context',
    provide_context=True,
    python_callable=print_context,
    executor_config={
        "KubernetesExecutor": {
            "request_memory": "256Mi",
            "request_cpu": "100m",
            "limit_memory": "256Mi",
            "limit_cpu": "100m"
        }
    },
    dag=dag)
{code}

Then we'll want a dict deep-merge function that returns the merged executor_config.

*Merge functionality*
{code:python}
import collections.abc

from airflow import conf
from airflow.utils.module_loading import import_string


def dict_merge(dct, merge_dct):
    """Recursive dict merge. Inspired by :meth:`dict.update()`; instead of
    updating only top-level keys, dict_merge recurses down into dicts nested
    to an arbitrary depth, updating keys. The ``merge_dct`` is merged into
    ``dct``.

    :param dct: dict onto which the merge is executed
    :param merge_dct: dict merged into dct
    :return: dct
    """
    for k in merge_dct:
        if (k in dct and isinstance(dct[k], dict)
                and isinstance(merge_dct[k], collections.abc.Mapping)):
            dict_merge(dct[k], merge_dct[k])
        else:
            dct[k] = merge_dct[k]
    return dct


def get_executor_config(executor_config):
    """Try to import base_executor_config and merge the provided
    executor_config into it.

    :param executor_config: operator-level executor config
    :return: dict
    """
    try:
        base_executor_config = import_string(
            conf.get('core', 'base_executor_config'))
        return dict_merge(base_executor_config, executor_config)
    except Exception:
        return executor_config
{code}

Finally, we'll want to call the get_executor_config function in the `BaseOperator`, possibly here: https://github.com/apache/airflow/blob/master/airflow/models/__init__.py#L2348
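As a quick sanity check of the merge semantics proposed above, here is the deep merge in isolation (stdlib only; the sample base/override configs are illustrative, not from a real deployment). Operator-level keys are added, untouched base keys survive, and nested dicts merge rather than being replaced wholesale:

```python
import collections.abc


def dict_merge(dct, merge_dct):
    """Recursively merge merge_dct into dct, descending into nested dicts."""
    for k in merge_dct:
        if (k in dct and isinstance(dct[k], dict)
                and isinstance(merge_dct[k], collections.abc.Mapping)):
            dict_merge(dct[k], merge_dct[k])
        else:
            dct[k] = merge_dct[k]
    return dct


base = {"KubernetesExecutor": {
    "image_pull_policy": "Always",
    "annotations": {"iam.amazonaws.com/role": "arn:aws:iam::"},
}}
override = {"KubernetesExecutor": {
    "request_memory": "256Mi",
    "annotations": {"team": "data"},  # merged into, not replacing, base annotations
}}

merged = dict_merge(base, override)
print(merged["KubernetesExecutor"]["image_pull_policy"])  # Always (kept from base)
print(merged["KubernetesExecutor"]["request_memory"])     # 256Mi (added by override)
```

Note that a plain `dct.update(merge_dct)` would have replaced the whole `annotations` dict, dropping the IAM role; the recursion is what makes the base config usable as a default.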
[jira] [Commented] (AIRFLOW-3272) Create gRPC hook for creating generic grpc connection
[ https://issues.apache.org/jira/browse/AIRFLOW-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736140#comment-16736140 ] ASF GitHub Bot commented on AIRFLOW-3272: morgendave commented on pull request #4101: [AIRFLOW-3272] Add base grpc hook URL: https://github.com/apache/airflow/pull/4101
[GitHub] morgendave opened a new pull request #4101: [AIRFLOW-3272] Add base grpc hook
morgendave opened a new pull request #4101: [AIRFLOW-3272] Add base grpc hook URL: https://github.com/apache/airflow/pull/4101

Make sure you have checked all steps below.

### Jira
- [x] My PR addresses https://issues.apache.org/jira/browse/AIRFLOW-3272 and references it in the PR title.

### Description
- [x] Add support for gRPC connections in Airflow. There are use cases of calling gRPC services from Airflow, so instead of creating the channel in a PythonOperator each time, there should be a basic GrpcHook to take care of it. The hook needs to take care of the authentication.

### Tests
- [x] My PR adds the following unit tests OR does not need testing for this extremely good reason:

### Commits
- [x] My commits all reference Jira issues in their subject lines, I have squashed multiple commits if they address the same issue, and they follow the guidelines from "How to write a good git commit message": subject separated from body by a blank line; subject limited to 50 characters (not including the Jira issue reference); subject does not end with a period; subject uses the imperative mood ("add", not "adding"); body wraps at 72 characters; body explains "what" and "why", not "how".

### Documentation
- [x] In case of new functionality, my PR adds documentation that describes how to use it. When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.

### Code Quality
- [x] Passes flake8
[GitHub] feng-tao commented on issue #4453: [AIRFLOW-XXX] Fix existing flake8 errors
feng-tao commented on issue #4453: [AIRFLOW-XXX] Fix existing flake8 errors URL: https://github.com/apache/airflow/pull/4453#issuecomment-452019385 @yohei1126, I think this test is pretty flaky.
[GitHub] feng-tao commented on issue #4453: [AIRFLOW-XXX] Fix existing flake8 errors
feng-tao commented on issue #4453: [AIRFLOW-XXX] Fix existing flake8 errors URL: https://github.com/apache/airflow/pull/4453#issuecomment-452019252 @yohei1126, not sure which one you are looking at, but the master CI passes (https://travis-ci.org/apache/airflow/builds/476197675). The test failure comes from https://github.com/apache/airflow/pull/4407, which I have discussed in the PR as well as on the mailing list.
[GitHub] feng-tao commented on issue #4407: [AIRFLOW-3600] Remove dagbag from trigger
feng-tao commented on issue #4407: [AIRFLOW-3600] Remove dagbag from trigger URL: https://github.com/apache/airflow/pull/4407#issuecomment-452019084 @ffinfo, please let me know if you will investigate the issue. If not, I would prefer to revert this change until the flaky test is fixed. What do you think, @Fokko?
[jira] [Commented] (AIRFLOW-3519) example_http_operator is failing due to
[ https://issues.apache.org/jira/browse/AIRFLOW-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736031#comment-16736031 ] ASF GitHub Bot commented on AIRFLOW-3519: feluelle commented on pull request #4455: [AIRFLOW-3519] Fix example http operator URL: https://github.com/apache/airflow/pull/4455

Make sure you have checked _all_ steps below.

### Jira
- [x] My PR addresses https://issues.apache.org/jira/browse/AIRFLOW-3519 and references it in the PR title.

### Description
- [x] This PR fixes the sensor in the example http operator that searches for a string in a bytes-like object, response.content. It now searches the decoded response, response.text. **NOTE:** This PR also fixes https://issues.apache.org/jira/browse/AIRFLOW-450; that ticket was already marked as resolved but it actually wasn't.

### Tests
- [x] My PR adds the following unit tests OR does not need testing for this extremely good reason:

### Commits
- [x] My commits all reference Jira issues in their subject lines, I have squashed multiple commits addressing the same issue, and they follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)".

### Documentation
- [x] In case of new functionality, my PR adds documentation that describes how to use it. All the public functions and classes in the PR contain docstrings that explain what they do.

### Code Quality
- [x] Passes `flake8`

> example_http_operator is failing due to
>
> Key: AIRFLOW-3519
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3519
> Project: Apache Airflow
> Issue Type: Bug
> Environment: Windows 10 professional edition. Apache airflow
> Reporter: Arunachalam Ambikapathi
> Assignee: Felix Uellendall
> Priority: Minor
>
> When the example_http_operator DAG is triggered from the command line,
> ./airflow trigger_dag example_http_operator
> it throws this error:
> [2018-12-13 10:37:41,892] {logging_mixin.py:95} INFO - [2018-12-13 10:37:41,892] {http_hook.py:126} INFO - Sending 'GET' to url: https://www.google.com/
> [2018-12-13 10:37:41,992] {logging_mixin.py:95} WARNING - /home/arun1/.local/lib/python3.5/site-packages/urllib3/connectionpool.py:847: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
> [2018-12-13 10:37:42,064] {models.py:1760} *ERROR - a bytes-like object is required, not 'str'*
> This may be because it was not tested on Python 3.5.
>
> *Fix:* I changed the DAG as follows and tested that it works:
> from
> response_check=lambda response: True if "Google" in response.content else False,
> to
> response_check=lambda response: True if b'Google' in response.content else False,
> Please apply this in the example; it would help new users a lot.
[jira] [Created] (AIRFLOW-3644) AIP-8 Split Hooks/Operators out of core package and repository
Tim Swast created AIRFLOW-3644: -- Summary: AIP-8 Split Hooks/Operators out of core package and repository Key: AIRFLOW-3644 URL: https://issues.apache.org/jira/browse/AIRFLOW-3644 Project: Apache Airflow Issue Type: Improvement Components: contrib, core Reporter: Tim Swast Based on discussion at http://mail-archives.apache.org/mod_mbox/airflow-dev/201809.mbox/%3c308670db-bd2a-4738-81b1-3f6fb312c...@apache.org%3E I believe separating hooks/operators into separate packages can benefit long-term maintainability of Apache Airflow by distributing maintenance and reducing the surface area of the core Airflow package. AIP-8 draft: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=100827303
[GitHub] feluelle opened a new pull request #4455: [AIRFLOW-3519] Fix example http operator
feluelle opened a new pull request #4455: [AIRFLOW-3519] Fix example http operator URL: https://github.com/apache/airflow/pull/4455

Make sure you have checked _all_ steps below.

### Jira
- [x] My PR addresses https://issues.apache.org/jira/browse/AIRFLOW-3519 and references it in the PR title.

### Description
- [x] This PR fixes the sensor in the example http operator that searches for a string in a bytes-like object, response.content. It now searches the decoded response, response.text. **NOTE:** This PR also fixes https://issues.apache.org/jira/browse/AIRFLOW-450; that ticket was already marked as resolved but it actually wasn't.

### Tests
- [x] My PR adds the following unit tests OR does not need testing for this extremely good reason:

### Commits
- [x] My commits all reference Jira issues in their subject lines, I have squashed multiple commits addressing the same issue, and they follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)".

### Documentation
- [x] In case of new functionality, my PR adds documentation that describes how to use it. All the public functions and classes in the PR contain docstrings that explain what they do.

### Code Quality
- [x] Passes `flake8`
[GitHub] Mokubyow edited a comment on issue #4309: [AIRFLOW-3504] Extend/refine the functionality of "/health" endpoint
Mokubyow edited a comment on issue #4309: [AIRFLOW-3504] Extend/refine the functionality of "/health" endpoint URL: https://github.com/apache/airflow/pull/4309#issuecomment-451993107 Emmanual Bard brought up a good point about the scheduler health check. What you have only checks the last scheduled run, not the scheduler heartbeat, which is what we really want to know. The query should look something like this:

```
select max(latest_heartbeat) from job
where job_type = 'SchedulerJob' and state = 'running'
```
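The suggested check can be sketched end to end against an in-memory SQLite database. This is a standalone illustration, not Airflow code: the `job` table and its columns mirror the query in the comment above, and the 30-second freshness threshold is an assumption chosen for the example.

```python
import sqlite3
from datetime import datetime, timedelta

# Minimal stand-in for Airflow's job table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job (job_type TEXT, state TEXT, latest_heartbeat TEXT)")

now = datetime(2019, 1, 7, 12, 0, 0)
conn.executemany(
    "INSERT INTO job VALUES (?, ?, ?)",
    [
        # A live scheduler that heartbeat 10 seconds ago.
        ("SchedulerJob", "running", (now - timedelta(seconds=10)).isoformat(" ")),
        # An old, finished scheduler run: excluded by state = 'running'.
        ("SchedulerJob", "success", (now - timedelta(days=1)).isoformat(" ")),
        # A running task job: excluded by job_type = 'SchedulerJob'.
        ("LocalTaskJob", "running", now.isoformat(" ")),
    ],
)

(latest,) = conn.execute(
    "SELECT MAX(latest_heartbeat) FROM job "
    "WHERE job_type = 'SchedulerJob' AND state = 'running'"
).fetchone()

# Healthy if the newest running-scheduler heartbeat is recent enough.
healthy = (now - datetime.fromisoformat(latest)) < timedelta(seconds=30)
print(healthy)  # True
```

The point of the two WHERE clauses is visible in the sample rows: without them, the yesterday's finished SchedulerJob or the unrelated LocalTaskJob could satisfy the check even with the scheduler dead.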
[jira] [Comment Edited] (AIRFLOW-3519) example_http_operator is failing due to
[ https://issues.apache.org/jira/browse/AIRFLOW-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736002#comment-16736002 ] Felix Uellendall edited comment on AIRFLOW-3519 at 1/7/19 4:04 PM:

I would rather change the response_check to return True if the string is in response.text instead of response.content, because response.text is the decoded form of response.content. See http://docs.python-requests.org/en/master/user/quickstart/#response-content What do you think [~Arun Ambikapathi]?

was (Author: feluelle): I would rather change the response_check to return True if it is in response.text instead of response.content, because response.text is the decoded response of response.content. What do you think [~Arun Ambikapathi]?
[jira] [Commented] (AIRFLOW-3519) example_http_operator is failing due to
[ https://issues.apache.org/jira/browse/AIRFLOW-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736002#comment-16736002 ] Felix Uellendall commented on AIRFLOW-3519:

I would rather change the response_check to return True if the string is in response.text instead of response.content, because response.text is the decoded form of response.content. What do you think [~Arun Ambikapathi]?
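The Python 3 behavior behind this ticket can be reproduced without a live HTTP call. Below, `content` and `text` are hypothetical stand-ins for a requests response's `.content` (bytes) and `.text` (decoded str); both the reporter's bytes-needle fix and the PR's decode-first fix are shown:

```python
content = b"<html><title>Google</title></html>"  # stand-in for response.content (bytes)
text = content.decode("utf-8")                   # stand-in for response.text (str)

# On Python 3, searching bytes with a str needle raises the reported error:
try:
    "Google" in content
except TypeError as exc:
    print(exc)  # e.g. "a bytes-like object is required, not 'str'"

print(b"Google" in content)  # True: the reporter's fix, a bytes needle
print("Google" in text)      # True: the PR's fix, search the decoded text
```

Searching `response.text` is the more general fix, since it also handles non-ASCII payloads according to the response's declared encoding rather than raw bytes.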
[jira] [Work started] (AIRFLOW-3601) update operators to BigQuery to support location
[ https://issues.apache.org/jira/browse/AIRFLOW-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on AIRFLOW-3601 started by Yohei Onishi.

> update operators to BigQuery to support location
>
> Key: AIRFLOW-3601
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3601
> Project: Apache Airflow
> Issue Type: Task
> Affects Versions: 1.10.1
> Reporter: Yohei Onishi
> Assignee: Yohei Onishi
> Priority: Major
>
> Location support for BigQueryHook was merged in PR 4324: https://github.com/apache/incubator-airflow/pull/4324
> The following operators need to be updated:
> * bigquery_check_operator.py: BigQueryCheckOperator
> * bigquery_get_data.py: BigQueryGetDataOperator
> * bigquery_operator.py: BigQueryOperator, BigQueryCreateEmptyTableOperator, BigQueryCreateExternalTableOperator, BigQueryDeleteDatasetOperator, BigQueryCreateEmptyDatasetOperator
> * bigquery_table_delete_operator.py: BigQueryTableDeleteOperator
> * bigquery_to_bigquery.py: BigQueryToBigQueryOperator
> * bigquery_to_gcs.py: BigQueryToCloudStorageOperator
> * gcs_to_bq.py: GoogleCloudStorageToBigQueryOperator
> * bigquery_sensor.py: BigQueryTableSensor
[GitHub] feluelle commented on issue #3723: [AIRFLOW-2876] Update Tenacity to 4.12
feluelle commented on issue #3723: [AIRFLOW-2876] Update Tenacity to 4.12 URL: https://github.com/apache/airflow/pull/3723#issuecomment-451887370

I just tried to reproduce @r39132's error but I can't. I ran `pip install .` from the current `master` branch and installed the following versions successfully in a Python 3.6.5 venv:

Airflow Version: v2.0.0.dev0+
Tenacity Version: 4.12.0

After installation I ran `airflow initdb`, then `airflow scheduler`, then `airflow webserver`, and all seems to work with no errors. Can someone else reproduce the error? I would like to fix it if it has not already been fixed.
[GitHub] yohei1126 commented on issue #4453: [AIRFLOW-XXX] Fix existing flake8 errors
yohei1126 commented on issue #4453: [AIRFLOW-XXX] Fix existing flake8 errors URL: https://github.com/apache/airflow/pull/4453#issuecomment-451875457

@feng-tao your build failed. Is this your test? Please check https://travis-ci.org/apache/airflow/jobs/476146347#L4902

```
== 42) FAIL: test_trigger_dag_button (tests.www_rbac.test_views.TestTriggerDag)
--
Traceback (most recent call last):
  tests/www_rbac/test_views.py line 1476 in test_trigger_dag_button
    self.assertIsNotNone(run)
AssertionError: unexpectedly None
```
[GitHub] feluelle commented on a change in pull request #4449: [AIRFLOW-3638] Add tests for PrestoToMySqlTransfer
feluelle commented on a change in pull request #4449: [AIRFLOW-3638] Add tests for PrestoToMySqlTransfer URL: https://github.com/apache/airflow/pull/4449#discussion_r245590944 ## File path: tests/operators/test_presto_to_mysql.py ## @@ -0,0 +1,37 @@ +import unittest Review comment: @kaxil fixed :)