[GitHub] [airflow] zhongjiajie commented on issue #7579: [AIRFLOW-6952] Use property for dag default_view
zhongjiajie commented on issue #7579: [AIRFLOW-6952] Use property for dag default_view URL: https://github.com/apache/airflow/pull/7579#issuecomment-592391945 @kaxil @potiuk @mik-laj PTAL, a small change here This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client URL: https://github.com/apache/airflow/pull/7541#discussion_r385543236

## File path: airflow/providers/amazon/aws/sensors/sqs.py
@@ -56,6 +56,7 @@ def __init__(self,
         self.aws_conn_id = aws_conn_id
         self.max_messages = max_messages
         self.wait_time_seconds = wait_time_seconds
+        self.hook = SQSHook(aws_conn_id=self.aws_conn_id)

Review comment: Resolved, thanks
[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client URL: https://github.com/apache/airflow/pull/7541#discussion_r385543222

## File path: airflow/providers/amazon/aws/sensors/s3_prefix.py
@@ -69,12 +70,11 @@ def __init__(self,
         self.full_url = "s3://" + bucket_name + '/' + prefix
         self.aws_conn_id = aws_conn_id
         self.verify = verify
+        self.hook = S3Hook(aws_conn_id=self.aws_conn_id, verify=self.verify)

Review comment: Resolved, thanks
[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client URL: https://github.com/apache/airflow/pull/7541#discussion_r385543208

## File path: airflow/providers/amazon/aws/sensors/redshift.py
@@ -43,9 +43,9 @@ def __init__(self,
         self.cluster_identifier = cluster_identifier
         self.target_status = target_status
         self.aws_conn_id = aws_conn_id
+        self.hook = RedshiftHook(aws_conn_id=self.aws_conn_id)

Review comment: Resolved, thanks
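The three diffs above all apply the same pattern: build the hook once in `__init__` and reuse it on every poke, instead of constructing a new hook (and a new boto3 client) per call. A minimal sketch of that pattern, using stand-in classes rather than the real Airflow `SQSHook`/sensor API:

```python
class FakeSQSHook:
    """Stand-in for SQSHook: counts constructions to show the caching effect."""
    instances = 0

    def __init__(self, aws_conn_id):
        FakeSQSHook.instances += 1
        self.aws_conn_id = aws_conn_id


class SQSSensorSketch:
    """Stand-in for the sensor: the hook is built once, in __init__."""

    def __init__(self, aws_conn_id="aws_default"):
        self.aws_conn_id = aws_conn_id
        # As in the reviewed diffs: construct the hook up front and reuse it.
        self.hook = FakeSQSHook(aws_conn_id=self.aws_conn_id)

    def poke(self):
        # Every poke reuses the same hook (and hence the same boto3 client).
        return self.hook


sensor = SQSSensorSketch()
assert sensor.poke() is sensor.poke()  # same hook object on every poke
assert FakeSQSHook.instances == 1      # constructed exactly once
```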
[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client URL: https://github.com/apache/airflow/pull/7541#discussion_r385542845

## File path: tests/providers/amazon/aws/operators/test_ecs.py
@@ -48,12 +48,10 @@
 ]
 }
-
+# pylint: disable=unused-argument
+@mock.patch('airflow.providers.amazon.aws.operators.ecs.AwsBaseHook')

Review comment: Found a way to fix this. I had to call `get_hook()` during the `setUp()` function, while the hook was still mocked, to ensure that the operator saves the mocked hook in its `self.hook`. If I call `get_hook()` for the first time _after_ `setUp` (e.g. from the operator's `execute`), then it is no longer mocked. I guess this makes sense, but it was really unexpected :)
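The behaviour described above can be reproduced in miniature. This is a hypothetical, self-contained sketch (not the real Airflow classes): an object that caches its hook on first use only keeps the mock if `get_hook()` is first called while the patch is active.

```python
import types
from unittest import mock


class RealHook:
    """Stand-in for AwsBaseHook; imagine construction talks to AWS."""


# A tiny stand-in "module" holding the hook class, so there is a patch target.
hooks = types.SimpleNamespace(RealHook=RealHook)


class OperatorSketch:
    """Caches its hook on first access, like the operator under review."""

    def __init__(self):
        self.hook = None

    def get_hook(self):
        if self.hook is None:
            self.hook = hooks.RealHook()
        return self.hook


# Patched early (as in setUp): the mock gets cached and survives the patch.
op = OperatorSketch()
with mock.patch.object(hooks, "RealHook") as hook_mock:
    op.get_hook()
assert op.get_hook() is hook_mock.return_value

# Patched too late: the first get_hook() call already cached the real hook.
late = OperatorSketch()
late.get_hook()
assert isinstance(late.hook, RealHook)
```

This is why calling `get_hook()` inside `setUp` (while the class decorator's patch is live) makes the rest of the test see the mock.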
[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client URL: https://github.com/apache/airflow/pull/7541#discussion_r385542306

## File path: tests/providers/amazon/aws/operators/test_ecs.py
@@ -171,7 +166,7 @@ def test_execute_with_failures(self):
 }
 )
-def test_wait_end_tasks(self):
+def test_wait_end_tasks(self, aws_hook_mock):

Review comment: Found a way to fix the issue with `setUp`, so this arg will be removed :)
[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client URL: https://github.com/apache/airflow/pull/7541#discussion_r385542156

## File path: airflow/providers/amazon/aws/hooks/base_aws.py
@@ -47,11 +48,29 @@ class AwsBaseHook(BaseHook):
 :param verify: Whether or not to verify SSL certificates.
     https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html
 :type verify: str or bool
+:param str region_name: AWS Region name to use. If this is None then the default boto3
+    behaviour is used.
+:param str client_type: boto3 client_type used when creating boto3.client(). For
+    example, 's3', 'emr', etc. Provided by specific hooks for these clients which
+    subclass AwsBaseHook.
+:param str resource_type: boto3 resource_type used when creating boto3.resource(). For
+    example, 's3'. Provided by specific hooks for these resources which
+    subclass AwsBaseHook.

Review comment: Resolved, thanks. I preferred the conciseness of the `:param str name: xyz` form, but have changed back to `:param` + `:type`.
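For reference, the two Sphinx field-list styles being discussed look like this side by side (function names here are illustrative, not from the PR). Both render the same documentation; the split form is what the review settles on:

```python
def concise_style(region_name=None):
    """Combined type-and-name form, as in the diff above.

    :param str region_name: AWS Region name to use. If this is None then
        the default boto3 behaviour is used.
    """


def split_style(region_name=None):
    """Separate ``:param`` and ``:type`` fields, the style settled on.

    :param region_name: AWS Region name to use. If this is None then
        the default boto3 behaviour is used.
    :type region_name: str
    """
```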
[GitHub] [airflow] roitvt commented on a change in pull request #7163: [AIRFLOW-6542] add spark-on-k8s operator/hook/sensor
roitvt commented on a change in pull request #7163: [AIRFLOW-6542] add spark-on-k8s operator/hook/sensor URL: https://github.com/apache/airflow/pull/7163#discussion_r385542795

## File path: tests/test_project_structure.py
@@ -36,6 +36,9 @@
 'tests/providers/apache/pig/operators/test_pig.py',
 'tests/providers/apache/spark/hooks/test_spark_jdbc_script.py',
 'tests/providers/cncf/kubernetes/operators/test_kubernetes_pod.py',
+'tests/providers/cncf/kubernetes/operators/test_spark_kubernetes_operator.py',
+'tests/providers/cncf/kubernetes/hooks/test_kubernetes_hook.py',
+'tests/providers/cncf/kubernetes/sensors/test_spark_kubernetes_sensor.py',

Review comment: you are right! I added tests
[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client URL: https://github.com/apache/airflow/pull/7541#discussion_r385542185

## File path: tests/providers/amazon/aws/sensors/test_sqs.py
@@ -18,8 +18,9 @@
 import unittest
-from unittest.mock import MagicMock, patch
+from unittest.mock import MagicMock
+import mock

Review comment: Resolved, thanks
[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client URL: https://github.com/apache/airflow/pull/7541#discussion_r385541955

## File path: airflow/providers/amazon/aws/hooks/base_aws.py
@@ -232,6 +251,38 @@ def get_resource_type(self, resource_type, region_name=None, config=None):
     resource_type, endpoint_url=endpoint_url, config=config, verify=self.verify
 )
+@cached_property
+def conn(self):
+    """Get the underlying boto3 client (cached).
+
+    The return value from this method is cached for efficiency.
+
+    :return: boto3.client or boto3.resource for the current
+        client/resource type and region
+    :rtype: boto3.client() or boto3.resource()
+    :raises AirflowException: self.client_type or self.resource_type are not
+        populated. These are usually specified to this class, by a subclass
+        __init__ method.
+    """
+    if self.client_type:
+        return self.get_client_type(self.client_type, region_name=self.region_name)
+    elif self.resource_type:
+        return self.get_resource_type(self.resource_type, region_name=self.region_name)
+    else:
+        raise AirflowException(
+            'Either self.client_type or self.resource_type'
+            ' must be specified in the subclass')

Review comment: Resolved, thanks
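A minimal sketch of the caching pattern in the diff above, using `functools.cached_property` (Python 3.8+; the PR may rely on a backport). The class names and the `ValueError` here are illustrative stand-ins for `AwsBaseHook` and `AirflowException`:

```python
from functools import cached_property


class BaseHookSketch:
    calls = 0  # counts how often the expensive connection is actually built

    def __init__(self, client_type=None, resource_type=None):
        self.client_type = client_type
        self.resource_type = resource_type

    def _build_client(self, kind):
        # Stand-in for boto3.client(...) / boto3.resource(...).
        BaseHookSketch.calls += 1
        return "boto3-%s-client" % kind

    @cached_property
    def conn(self):
        """Built on first access, then cached on the instance."""
        if self.client_type:
            return self._build_client(self.client_type)
        if self.resource_type:
            return self._build_client(self.resource_type)
        raise ValueError(
            "Either client_type or resource_type must be set by the subclass")


hook = BaseHookSketch(client_type="s3")
assert hook.conn is hook.conn       # second access hits the cache
assert BaseHookSketch.calls == 1    # the client was built exactly once
```

`cached_property` stores the computed value in the instance's `__dict__` on first access, so repeated `hook.conn` lookups never re-run the body.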
[GitHub] [airflow] saguziel commented on issue #7269: [AIRFLOW-6651] Add Redis Heartbeat option
saguziel commented on issue #7269: [AIRFLOW-6651] Add Redis Heartbeat option URL: https://github.com/apache/airflow/pull/7269#issuecomment-592382747

DB load. One example: roughly from 9% to 6% load, but this is very dependent on the configs used. It would reduce writes by 30-40% on our main cluster, so it would also free up I/O.
[jira] [Updated] (AIRFLOW-6952) Use property for dag default_view
[ https://issues.apache.org/jira/browse/AIRFLOW-6952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhongjiajie updated AIRFLOW-6952:
    Summary: Use property for dag default_view (was: Dag default_view should use property)

> Use property for dag default_view
>
> Key: AIRFLOW-6952
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6952
> Project: Apache Airflow
> Issue Type: Improvement
> Components: DAG
> Affects Versions: 1.10.9
> Reporter: zhongjiajie
> Assignee: zhongjiajie
> Priority: Major
>
> should use dag.default_view instead of dag._default_view

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (AIRFLOW-6952) Dag default_view should use property
[ https://issues.apache.org/jira/browse/AIRFLOW-6952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047252#comment-17047252 ]

ASF GitHub Bot commented on AIRFLOW-6952:

zhongjiajie commented on pull request #7579: [AIRFLOW-6952] Use property for dag default_view URL: https://github.com/apache/airflow/pull/7579

---

Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg)

Make sure to mark the boxes below before creating PR: [x]

- [x] Description above provides context of the change
- [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID*
- [x] Unit tests coverage for changes (not needed for documentation changes)
- [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
- [x] Relevant documentation is updated including usage instructions.
- [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).

* For document-only changes commit message can start with `[AIRFLOW-]`.

---

In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.

> Dag default_view should use property
>
> Key: AIRFLOW-6952
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6952
> Project: Apache Airflow
> Issue Type: Improvement
> Components: DAG
> Affects Versions: 1.10.9
> Reporter: zhongjiajie
> Assignee: zhongjiajie
> Priority: Major
>
> should use dag.default_view instead of dag._default_view
[GitHub] [airflow] zhongjiajie opened a new pull request #7579: [AIRFLOW-6952] Use property for dag default_view
zhongjiajie opened a new pull request #7579: [AIRFLOW-6952] Use property for dag default_view URL: https://github.com/apache/airflow/pull/7579

---

Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg)

Make sure to mark the boxes below before creating PR: [x]

- [x] Description above provides context of the change
- [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID*
- [x] Unit tests coverage for changes (not needed for documentation changes)
- [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
- [x] Relevant documentation is updated including usage instructions.
- [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).

* For document-only changes commit message can start with `[AIRFLOW-]`.

---

In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
[jira] [Created] (AIRFLOW-6952) Dag default_view should use property
zhongjiajie created AIRFLOW-6952:
    Summary: Dag default_view should use property
    Key: AIRFLOW-6952
    URL: https://issues.apache.org/jira/browse/AIRFLOW-6952
    Project: Apache Airflow
    Issue Type: Improvement
    Components: DAG
    Affects Versions: 1.10.9
    Reporter: zhongjiajie
    Assignee: zhongjiajie

should use dag.default_view instead of dag._default_view
[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client URL: https://github.com/apache/airflow/pull/7541#discussion_r385517283

## File path: tests/providers/amazon/aws/operators/test_ecs.py
@@ -48,12 +48,10 @@
 ]
 }
-
+# pylint: disable=unused-argument
+@mock.patch('airflow.providers.amazon.aws.operators.ecs.AwsBaseHook')

Review comment: Agreed, it _should_ work :) I'll try and do some more digging. There must be a good reason why it isn't working like that at the moment, and I don't want to have flaky tests.
[jira] [Created] (AIRFLOW-6951) sql_alchemy_connect_args missing from Configuration Reference in new website
Damian Shaw created AIRFLOW-6951:
    Summary: sql_alchemy_connect_args missing from Configuration Reference in new website
    Key: AIRFLOW-6951
    URL: https://issues.apache.org/jira/browse/AIRFLOW-6951
    Project: Apache Airflow
    Issue Type: Bug
    Components: documentation
    Affects Versions: 1.10.9
    Reporter: Damian Shaw

I'm not sure how the new website is being generated, but the Configuration Reference page [https://airflow.apache.org/docs/stable/configurations-ref.html] is missing the configuration item "sql_alchemy_connect_args". I can find this configuration in the readthedocs page [https://airflow.readthedocs.io/en/latest/configurations-ref.html#sql-alchemy-connect-args] and in the Airflow GitHub repo [https://github.com/apache/airflow/blob/008b4bab14222da068b737d6332db4963b994007/airflow/config_templates/config.yml#L135], but it is missing from the new website.
[GitHub] [airflow] zhongjiajie commented on issue #7544: [AIRFLOW-6922] Support python3.8
zhongjiajie commented on issue #7544: [AIRFLOW-6922] Support python3.8 URL: https://github.com/apache/airflow/pull/7544#issuecomment-592321041

It seems that the 3.8 `typing` module has some differences from 3.6 and 3.7, so maybe we cannot support it this way.
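The comment above doesn't name the specific differences, but one well-known example (an assumption about what is meant) is generic-alias introspection: Python 3.8 added `typing.get_origin()`/`typing.get_args()`, while older code often read `__origin__`/`__args__` directly, which behaves differently across 3.6 and 3.7. A version-tolerant sketch:

```python
import sys
import typing


def origin_of(tp):
    """Version-tolerant origin lookup for a typing generic alias.

    On 3.8+ use the public helper; on 3.7 fall back to __origin__
    (which itself differs again on 3.6, where it was the typing alias).
    """
    if sys.version_info >= (3, 8):
        return typing.get_origin(tp)  # added in Python 3.8
    return getattr(tp, "__origin__", None)
```

For example, `origin_of(typing.List[int])` returns the builtin `list` on 3.7+, and `origin_of(int)` returns `None`.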
[GitHub] [airflow] zhongjiajie commented on a change in pull request #7572: [AIRFLOW-6948] Remove ASCII Airflow from version command
zhongjiajie commented on a change in pull request #7572: [AIRFLOW-6948] Remove ASCII Airflow from version command URL: https://github.com/apache/airflow/pull/7572#discussion_r385510607

## File path: airflow/settings.py
@@ -173,8 +173,8 @@ def configure_orm(disable_connection_pool=False):
 # https://docs.sqlalchemy.org/en/13/core/pooling.html#disconnect-handling-pessimistic
 pool_pre_ping = conf.getboolean('core', 'SQL_ALCHEMY_POOL_PRE_PING', fallback=True)
-log.info("settings.configure_orm(): Using pool settings. pool_size={}, max_overflow={}, "
-         "pool_recycle={}, pid={}".format(pool_size, max_overflow, pool_recycle, os.getpid()))
+log.debug("settings.configure_orm(): Using pool settings. pool_size=%d, max_overflow=%d, "
+          "pool_recycle=%d, pid=%d", pool_size, max_overflow, pool_recycle, os.getpid())

Review comment: I think it's OK as a small change, unless we create a PR to change all logging calls and add a static check for it.
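Why the diff above switches from `.format()` to %-style arguments: the `logging` module interpolates the arguments lazily, only when a record is actually emitted, so a disabled `log.debug()` call never pays the formatting cost. A small self-contained illustration (logger name and values are made up):

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("airflow.settings.sketch")
log.setLevel(logging.INFO)  # DEBUG records are discarded below this level

pool_size, max_overflow, pool_recycle = 5, 10, 1800

# Eager: str.format() builds the message string even though DEBUG is disabled.
log.debug("pool_size={}, max_overflow={}".format(pool_size, max_overflow))

# Lazy: the %-args are interpolated only if the record will be emitted,
# so this disabled call skips formatting entirely.
log.debug("pool_size=%d, max_overflow=%d, pool_recycle=%d",
          pool_size, max_overflow, pool_recycle)
```

This is also why many linters flag `logging-format-interpolation`: passing args to the logger rather than pre-formatting is both cheaper and lets log aggregators group messages by template.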
[GitHub] [airflow] codecov-io commented on issue #6788: WIP: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files
codecov-io commented on issue #6788: WIP: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files URL: https://github.com/apache/airflow/pull/6788#issuecomment-592316893

# Codecov Report

> Merging #6788 into master (ee16d3059e38601dbac41cd036ed23a703def48f) will **increase** coverage by `0.25%`.
> The diff coverage is `72.72%`.

@@            Coverage Diff             @@
##           master    #6788      +/-   ##
==========================================
+ Coverage   86.57%   86.82%   +0.25%
==========================================
  Files         896      897       +1
  Lines       42622    42691      +69
==========================================
+ Hits        36900    37067     +167
+ Misses       5722     5624      -98
==========================================

| Impacted Files | Coverage Δ | |
|---|---|---|
| airflow/models/dagbag.py | `89.74% <100%> (ø)` | :arrow_up: |
| airflow/utils/operator_helpers.py | `100% <100%> (ø)` | :arrow_up: |
| airflow/jobs/scheduler_job.py | `90.51% <100%> (+0.01%)` | :arrow_up: |
| airflow/serialization/serialized_objects.py | `90.36% <100%> (+0.16%)` | :arrow_up: |
| airflow/models/__init__.py | `91.3% <100%> (+0.39%)` | :arrow_up: |
| airflow/www/views.py | `76.03% <61.53%> (-0.21%)` | :arrow_down: |
| airflow/models/templatedfields.py | `62.16% <62.16%> (ø)` | |
| airflow/models/dagrun.py | `95.02% <71.42%> (-0.73%)` | :arrow_down: |
| airflow/kubernetes/refresh_config.py | `74.5% <0%> (+23.52%)` | :arrow_up: |
| ... and 4 more | | |

[Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/6788?src=pr=continue).

> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`

> Powered by Codecov. Last update ee16d30...109cfeb. Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] [airflow] kaxil commented on issue #6788: WIP: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files
kaxil commented on issue #6788: WIP: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files URL: https://github.com/apache/airflow/pull/6788#issuecomment-592308818

**ToDo**:
- [ ] Add new column in SerializedDagTable to store unrendered template fields
- [ ] Add tests
[GitHub] [airflow] kaxil commented on a change in pull request #6788: WIP: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files
kaxil commented on a change in pull request #6788: WIP: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files URL: https://github.com/apache/airflow/pull/6788#discussion_r385502185

## File path: airflow/models/templatedfields.py
@@ -0,0 +1,97 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""Save Rendered Template Fields """
+import json
+
+from sqlalchemy import JSON, Column, String
+from sqlalchemy.orm import Session
+
+from airflow.models.base import ID_LEN, Base
+from airflow.models.taskinstance import TaskInstance
+from airflow.utils.session import provide_session
+from airflow.utils.sqlalchemy import UtcDateTime
+
+
+class RenderedTaskInstanceFields(Base):
+    """
+    Save Rendered Template Fields
+    """
+
+    __tablename__ = "rendered_task_instance_fields"
+
+    dag_id = Column(String(ID_LEN), primary_key=True)
+    task_id = Column(String(ID_LEN), primary_key=True)
+    execution_date = Column(UtcDateTime, primary_key=True)
+    rendered_fields = Column(JSON, nullable=True)

Review comment: updated
[GitHub] [airflow] kaxil commented on a change in pull request #6788: WIP: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files
kaxil commented on a change in pull request #6788: WIP: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files URL: https://github.com/apache/airflow/pull/6788#discussion_r38550

## File path: airflow/models/templatedfields.py
@@ -0,0 +1,97 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""Save Rendered Template Fields """
+import json
+
+from sqlalchemy import JSON, Column, String
+from sqlalchemy.orm import Session
+
+from airflow.models.base import ID_LEN, Base
+from airflow.models.taskinstance import TaskInstance
+from airflow.utils.session import provide_session
+from airflow.utils.sqlalchemy import UtcDateTime
+
+
+class RenderedTaskInstanceFields(Base):
+    """
+    Save Rendered Template Fields
+    """
+
+    __tablename__ = "rendered_task_instance_fields"
+
+    dag_id = Column(String(ID_LEN), primary_key=True)
+    task_id = Column(String(ID_LEN), primary_key=True)
+    execution_date = Column(UtcDateTime, primary_key=True)
+    rendered_fields = Column(JSON, nullable=True)
+
+    def __init__(self, ti: TaskInstance):
+        self.dag_id = ti.dag_id
+        self.task_id = ti.task_id
+        self.task = ti.task
+        self.execution_date = ti.execution_date
+
+        ti.render_templates()
+        self.rendered_fields = {
+            field: self.serialize_rendered_field(
+                getattr(self.task, field)
+            ) for field in self.task.template_fields
+        }
+
+    @staticmethod
+    @provide_session
+    def get_templated_fields(ti: TaskInstance, session: Session = None):
+        """
+        Get templated field for a TaskInstance from the RenderedTaskInstanceFields
+        table.
+
+        :param ti: Task Instance
+        :param session: SqlAlchemy Session
+        :return: Rendered Templated TI field
+        """
+        result = session.query(RenderedTaskInstanceFields.rendered_fields).filter(
+            RenderedTaskInstanceFields.dag_id == ti.dag_id,
+            RenderedTaskInstanceFields.task_id == ti.task_id,
+            RenderedTaskInstanceFields.execution_date == ti.execution_date
+        ).first()
+
+        if result:
+            return result.rendered_fields
+        else:
+            return None
+
+    @staticmethod
+    def serialize_rendered_field(rendered_field):

Review comment: done
[GitHub] [airflow] kaxil commented on a change in pull request #6788: WIP: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files
kaxil commented on a change in pull request #6788: WIP: [AIRFLOW-5944] Rendering templated_fields without accessing DAG files URL: https://github.com/apache/airflow/pull/6788#discussion_r385502167

## File path: airflow/models/dagrun.py ##

@@ -419,6 +420,17 @@ def verify_integrity(self, session=None):
```python
                    1, 1)
                ti = TaskInstance(task, self.execution_date)
                session.add(ti)
        session.commit()

        # ToDo: Store only Last X number (maybe 10 or 100) TIs for a task
        rtif = session.query(RenderedTaskInstanceFields).filter(
            RenderedTaskInstanceFields.dag_id == ti.dag_id,
            RenderedTaskInstanceFields.task_id == ti.task_id,
            RenderedTaskInstanceFields.execution_date == ti.execution_date,
        ).first()
```

Review comment: Updated

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] kaxil commented on issue #7577: [AIRFLOW-6950] Remove refresh_executor_config from ti.refresh_from_db
kaxil commented on issue #7577: [AIRFLOW-6950] Remove refresh_executor_config from ti.refresh_from_db URL: https://github.com/apache/airflow/pull/7577#issuecomment-592301688 CI failure is unrelated This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] kaxil edited a comment on issue #7577: [AIRFLOW-6950] Remove refresh_executor_config from ti.refresh_from_db
kaxil edited a comment on issue #7577: [AIRFLOW-6950] Remove refresh_executor_config from ti.refresh_from_db URL: https://github.com/apache/airflow/pull/7577#issuecomment-592301688 CI failure looks unrelated This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] kaxil opened a new pull request #7578: [AIRFLOW-6856] BugFix: Paused Dags still Scheduled
kaxil opened a new pull request #7578: [AIRFLOW-6856] BugFix: Paused Dags still Scheduled URL: https://github.com/apache/airflow/pull/7578

https://github.com/apache/airflow/pull/7476/ introduced a bug due to which paused DAGs were still scheduled. The bug: the following query returns a list of one-element tuples, not a list of strings:

Query:
```
paused_dag_ids = (
    session.query(DagModel.dag_id)
    .filter(DagModel.is_paused.is_(True))
    .filter(DagModel.dag_id.in_(dagbag.dag_ids))
    .all()
)
```
Result:
```
[('example_bash_operator',)]
```
Hence in `_find_dags_to_process()` (below):
```
if len(self.dag_ids) > 0:
    dags = [dag for dag in dags
            if dag.dag_id in self.dag_ids and dag.dag_id not in paused_dag_ids]
else:
    dags = [dag for dag in dags if dag.dag_id not in paused_dag_ids]
return dags
```
the following happens:
```
dags = [dag for dag in dags
        if "example_bash_operator" not in [('example_bash_operator',)]]
```
The membership test compares a plain string against a list of tuples, so it is never true and paused DAGs are never filtered out. Instead, `paused_dag_ids` should be a set `{'example_bash_operator'}` or just a plain list `['example_bash_operator']`.

---

Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg)

Make sure to mark the boxes below before creating PR: [x]
- [x] Description above provides context of the change
- [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID*
- [x] Unit tests coverage for changes (not needed for documentation changes)
- [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
- [x] Relevant documentation is updated including usage instructions.
- [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).

* For document-only changes commit message can start with `[AIRFLOW-]`.

---

In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
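The tuple-membership pitfall described in the PR above can be reproduced in a few lines; this is a standalone sketch (no SQLAlchemy session involved), using the DAG id from the report:

```python
# What session.query(DagModel.dag_id).all() yields for a single-column
# query: a list of one-element Row tuples, not a list of strings.
rows = [('example_bash_operator',)]

# The buggy check: a plain string is never "in" a list of tuples,
# so a paused DAG id slips through the filter.
assert 'example_bash_operator' not in rows

# The fix: flatten to a set (or list) of plain ids before membership tests.
paused_dag_ids = {dag_id for (dag_id,) in rows}
assert 'example_bash_operator' in paused_dag_ids
```

A set is the better choice of the two fixes suggested in the PR, since membership tests on a set are O(1) rather than O(n).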
[jira] [Commented] (AIRFLOW-6856) Bulk fetch paused_dag_ids
[ https://issues.apache.org/jira/browse/AIRFLOW-6856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047189#comment-17047189 ] ASF GitHub Bot commented on AIRFLOW-6856: - kaxil commented on pull request #7578: [AIRFLOW-6856] BugFix: Paused Dags still Scheduled URL: https://github.com/apache/airflow/pull/7578

https://github.com/apache/airflow/pull/7476/ introduced a bug due to which paused DAGs were still scheduled. The bug: the following query returns a list of one-element tuples, not a list of strings:

Query:
```
paused_dag_ids = (
    session.query(DagModel.dag_id)
    .filter(DagModel.is_paused.is_(True))
    .filter(DagModel.dag_id.in_(dagbag.dag_ids))
    .all()
)
```
Result:
```
[('example_bash_operator',)]
```
Hence in `_find_dags_to_process()` (below):
```
if len(self.dag_ids) > 0:
    dags = [dag for dag in dags
            if dag.dag_id in self.dag_ids and dag.dag_id not in paused_dag_ids]
else:
    dags = [dag for dag in dags if dag.dag_id not in paused_dag_ids]
return dags
```
the following happens:
```
dags = [dag for dag in dags
        if "example_bash_operator" not in [('example_bash_operator',)]]
```
The membership test compares a plain string against a list of tuples, so it is never true and paused DAGs are never filtered out. Instead, `paused_dag_ids` should be a set `{'example_bash_operator'}` or just a plain list `['example_bash_operator']`.

---

Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg)

Make sure to mark the boxes below before creating PR: [x]
- [x] Description above provides context of the change
- [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID*
- [x] Unit tests coverage for changes (not needed for documentation changes)
- [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
- [x] Relevant documentation is updated including usage instructions.
- [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).

* For document-only changes commit message can start with `[AIRFLOW-]`.
--- In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Bulk fetch paused_dag_ids > - > > Key: AIRFLOW-6856 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6856 > Project: Apache Airflow > Issue Type: Bug > Components: scheduler >Affects Versions: 1.10.9 >Reporter: Kamil Bregula >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (AIRFLOW-6747) UI - Show count of tasks in each dag on the main dags page
[ https://issues.apache.org/jira/browse/AIRFLOW-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nityananda Gohain reassigned AIRFLOW-6747: -- Assignee: Nityananda Gohain > UI - Show count of tasks in each dag on the main dags page > -- > > Key: AIRFLOW-6747 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6747 > Project: Apache Airflow > Issue Type: Improvement > Components: ui >Affects Versions: 1.10.7 >Reporter: t oo >Assignee: Nityananda Gohain >Priority: Minor > Labels: gsoc, gsoc2020, mentor > > Main DAGs page in UI - would benefit from showing a new column: number of > tasks for each dag id -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (AIRFLOW-4078) Allow filtering by all columns in Browse Logs view
[ https://issues.apache.org/jira/browse/AIRFLOW-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047168#comment-17047168 ] Ebrima Jallow edited comment on AIRFLOW-4078 at 2/28/20 2:57 AM: - Hello, My name is Ebrima. I am currently doing my master's program at Saarland University. I would like to work on this issue. I know that I have to send an email to the mailing list, but I can't find the link. Can you please send me the links needed to get started? Best

was (Author: maubeh1): Hello, My name is Ebrima. I am currently doing my master's program at Saarland University. I know that I have to send an email to the mailing list, but I can't find the link. Can you please send me the links needed to get started? Best

> Allow filtering by all columns in Browse Logs view > -- > > Key: AIRFLOW-4078 > URL: https://issues.apache.org/jira/browse/AIRFLOW-4078 > Project: Apache Airflow > Issue Type: Improvement > Components: logging, ui >Affects Versions: 1.10.2 >Reporter: Brylie Christopher Oxley >Priority: Minor > Labels: features, gsoc, gsoc2020, mentor > Attachments: Screenshot from 2019-03-13 11-41-20.png, Screenshot from > 2019-03-13 11-44-26.png > > > The "Browse Logs" UI currently allows filtering by "DAG ID", "Task ID", > "Execution Date", and "Extra". > !Screenshot from 2019-03-13 11-41-20.png! > For consistency and flexibility, it would be good to allow filtering by any > of the available columns, specifically "Datetime", "Event", "Execution Date", > and "Owner". > !Screenshot from 2019-03-13 11-44-26.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (AIRFLOW-4078) Allow filtering by all columns in Browse Logs view
[ https://issues.apache.org/jira/browse/AIRFLOW-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047168#comment-17047168 ] Ebrima Jallow commented on AIRFLOW-4078: Hello, My name is Ebrima. I am currently doing my master's program at Saarland University. I know that I have to send an email to the mailing list, but I can't find the link. Can you please send me the links needed to get started? Best

> Allow filtering by all columns in Browse Logs view > -- > > Key: AIRFLOW-4078 > URL: https://issues.apache.org/jira/browse/AIRFLOW-4078 > Project: Apache Airflow > Issue Type: Improvement > Components: logging, ui >Affects Versions: 1.10.2 >Reporter: Brylie Christopher Oxley >Priority: Minor > Labels: features, gsoc, gsoc2020, mentor > Attachments: Screenshot from 2019-03-13 11-41-20.png, Screenshot from > 2019-03-13 11-44-26.png > > > The "Browse Logs" UI currently allows filtering by "DAG ID", "Task ID", > "Execution Date", and "Extra". > !Screenshot from 2019-03-13 11-41-20.png! > For consistency and flexibility, it would be good to allow filtering by any > of the available columns, specifically "Datetime", "Event", "Execution Date", > and "Owner". > !Screenshot from 2019-03-13 11-44-26.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [airflow] zhongjiajie commented on issue #7148: [AIRFLOW-6472] Correct short option in cli
zhongjiajie commented on issue #7148: [AIRFLOW-6472] Correct short option in cli URL: https://github.com/apache/airflow/pull/7148#issuecomment-592285746 This conflicts with the UPDATING.md file; I will rebase on master after getting some feedback from the community. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (AIRFLOW-6472) Some of the options we have are long options with single -
[ https://issues.apache.org/jira/browse/AIRFLOW-6472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047167#comment-17047167 ] zhongjiajie commented on AIRFLOW-6472: -- Related dev mailing list thread: [https://lists.apache.org/thread.html/r95c53953499a236466e0b762dfdadd5ca2ba9d6e2a3516c699a14380%40%3Cdev.airflow.apache.org%3E] > Some of the options we have are long options with single - > -- > > Key: AIRFLOW-6472 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6472 > Project: Apache Airflow > Issue Type: Improvement > Components: cli >Affects Versions: 2.0.0, 1.10.7 >Reporter: Jarek Potiuk >Assignee: zhongjiajie >Priority: Major > > We have some "short" options that are really "long" ones: namely -int and -sd > in the run task. This is against the idea of short and long options in Unix > (and argparse follows that). The main reason to have short options is that > you can combine them: > {{airflow task run -iAlm}} > When you have multi-letter "short" options this can quickly become ambiguous. > We do not have -s yet, but if we add a few more options this might > become a problem. > Also, the argparse documentation > [https://docs.python.org/2/library/argparse.html] mentions that short > options should be single characters only. > This should be fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
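The combining behaviour the issue refers to, the main point of single-character short options, can be demonstrated with plain argparse. The flag names below mirror the `-iAlm` example from the issue and are illustrative only, not Airflow's actual CLI:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-i', action='store_true')  # illustrative flags,
parser.add_argument('-A', action='store_true')  # not Airflow's real options
parser.add_argument('-l', action='store_true')
parser.add_argument('-m', action='store_true')

# Single-character options can be combined into one token:
args = parser.parse_args(['-iAlm'])
assert args.i and args.A and args.l and args.m
```

A multi-character "short" option such as `-int` cannot participate in this kind of combining, which is why the issue argues it should become a proper `--long` option instead.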
[GitHub] [airflow] zhongjiajie commented on a change in pull request #7575: [AIRFLOW-6949] Respect explicit `spark.kubernetes.namespace` conf to SparkSubmitOperator
zhongjiajie commented on a change in pull request #7575: [AIRFLOW-6949] Respect explicit `spark.kubernetes.namespace` conf to SparkSubmitOperator URL: https://github.com/apache/airflow/pull/7575#discussion_r385480964 ## File path: airflow/providers/apache/spark/hooks/spark_submit.py ## @@ -208,6 +208,9 @@ def _resolve_connection(self): self._conn_id, conn_data['master'] ) +if 'spark.kubernetes.namespace' in self._conf: +conn_data['namespace'] = self._conf['spark.kubernetes.namespace'] Review comment: Maybe it would be better to change line 204 instead, WDYT?
```py
conn_data['namespace'] = self._conf['spark.kubernetes.namespace'] if 'spark.kubernetes.namespace' in self._conf else extra.get('namespace')
```
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
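The precedence the review suggests, explicit Spark conf first, connection extra as the fallback, amounts to an ordinary dict lookup with a default. A minimal standalone sketch (the dict contents here are made-up values, not from the PR):

```python
# Hypothetical inputs: the operator's spark conf and the
# Airflow connection's "extra" field, both plain dicts here.
conf = {'spark.kubernetes.namespace': 'spark-jobs'}
extra = {'namespace': 'default'}

# Explicit conf key wins when present...
namespace = conf.get('spark.kubernetes.namespace', extra.get('namespace'))
assert namespace == 'spark-jobs'

# ...otherwise the connection extra provides the value.
namespace = {}.get('spark.kubernetes.namespace', extra.get('namespace'))
assert namespace == 'default'
```

`dict.get` with a default reads more compactly than the conditional expression in the review, with the same behaviour for these inputs.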
[jira] [Commented] (AIRFLOW-6950) Remove refresh_executor_config from refresh_from_db
[ https://issues.apache.org/jira/browse/AIRFLOW-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047138#comment-17047138 ] ASF GitHub Bot commented on AIRFLOW-6950: - kaxil commented on pull request #7577: [AIRFLOW-6950] Remove refresh_executor_config from ti.refresh_from_db URL: https://github.com/apache/airflow/pull/7577 https://github.com/apache/airflow/pull/5926 added "refresh_executor_config" argument to "refresh_from_db" which is never used. So we should remove it We always use the latest executor_config from the task --- Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg) Make sure to mark the boxes below before creating PR: [x] - [x] Description above provides context of the change - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID* - [x] Unit tests coverage for changes (not needed for documentation changes) - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)" - [x] Relevant documentation is updated including usage instructions. - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example). * For document-only changes commit message can start with `[AIRFLOW-]`. --- In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information. This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Remove refresh_executor_config from refresh_from_db > --- > > Key: AIRFLOW-6950 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6950 > Project: Apache Airflow > Issue Type: Improvement > Components: core >Affects Versions: 1.10.9 >Reporter: Kaxil Naik >Assignee: Kaxil Naik >Priority: Minor > Fix For: 1.10.10 > > > https://github.com/apache/airflow/pull/5926 added "refresh_executor_config" > argument to "refresh_from_db" which is never used. So we should remove it -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [airflow] kaxil opened a new pull request #7577: [AIRFLOW-6950] Remove refresh_executor_config from ti.refresh_from_db
kaxil opened a new pull request #7577: [AIRFLOW-6950] Remove refresh_executor_config from ti.refresh_from_db URL: https://github.com/apache/airflow/pull/7577 https://github.com/apache/airflow/pull/5926 added "refresh_executor_config" argument to "refresh_from_db" which is never used. So we should remove it We always use the latest executor_config from the task --- Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg) Make sure to mark the boxes below before creating PR: [x] - [x] Description above provides context of the change - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID* - [x] Unit tests coverage for changes (not needed for documentation changes) - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)" - [x] Relevant documentation is updated including usage instructions. - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example). * For document-only changes commit message can start with `[AIRFLOW-]`. --- In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (AIRFLOW-6950) Remove refresh_executor_config from refresh_from_db
Kaxil Naik created AIRFLOW-6950: --- Summary: Remove refresh_executor_config from refresh_from_db Key: AIRFLOW-6950 URL: https://issues.apache.org/jira/browse/AIRFLOW-6950 Project: Apache Airflow Issue Type: Improvement Components: core Affects Versions: 1.10.9 Reporter: Kaxil Naik Assignee: Kaxil Naik Fix For: 1.10.10 https://github.com/apache/airflow/pull/5926 added "refresh_executor_config" argument to "refresh_from_db" which is never used. So we should remove it -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (AIRFLOW-6730) is_alive uses seconds and not total_seconds
[ https://issues.apache.org/jira/browse/AIRFLOW-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047135#comment-17047135 ] ASF subversion and git services commented on AIRFLOW-6730: -- Commit 008b4bab14222da068b737d6332db4963b994007 in airflow's branch refs/heads/master from Alex Guziel [ https://gitbox.apache.org/repos/asf?p=airflow.git;h=008b4ba ] [AIRFLOW-6730] Use total_seconds instead of seconds (#7363) * [AIRFLOW-6730] Use total_seconds instead of seconds * adds tests and fixes types issue * fix test > is_alive uses seconds and not total_seconds > --- > > Key: AIRFLOW-6730 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6730 > Project: Apache Airflow > Issue Type: Bug > Components: core >Affects Versions: 1.10.4 >Reporter: Alex Guziel >Assignee: Alex Guziel >Priority: Major > Fix For: 2.0.0, 1.10.10 > > > Example: > timedelta(days=1).seconds = 0 > timedelta(days=1).total_seconds() = 86400 -- This message was sent by Atlassian Jira (v8.3.4#803005)
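The difference the fix addresses can be checked directly in Python; `.seconds` is only the sub-day remainder of the interval, while `.total_seconds()` covers the whole interval:

```python
from datetime import timedelta

# .seconds ignores the days component entirely...
assert timedelta(days=1).seconds == 0
# ...while .total_seconds() accounts for it.
assert timedelta(days=1).total_seconds() == 86400.0

# With a mixed interval the bug is easy to miss in testing:
delta = timedelta(days=1, seconds=30)
assert delta.seconds == 30
assert delta.total_seconds() == 86430.0
```

For an `is_alive` heartbeat check, using `.seconds` would make a scheduler that had been dead for a day and a few seconds look freshly alive, which is why the fix switches to `.total_seconds()`.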
[GitHub] [airflow] saguziel commented on issue #7363: [AIRFLOW-6730] Use total_seconds instead of seconds
saguziel commented on issue #7363: [AIRFLOW-6730] Use total_seconds instead of seconds URL: https://github.com/apache/airflow/pull/7363#issuecomment-592269856 This passed on my travis-ci, so I will merge. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] saguziel merged pull request #7363: [AIRFLOW-6730] Use total_seconds instead of seconds
saguziel merged pull request #7363: [AIRFLOW-6730] Use total_seconds instead of seconds URL: https://github.com/apache/airflow/pull/7363 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (AIRFLOW-6730) is_alive uses seconds and not total_seconds
[ https://issues.apache.org/jira/browse/AIRFLOW-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047132#comment-17047132 ] ASF GitHub Bot commented on AIRFLOW-6730: - saguziel commented on pull request #7363: [AIRFLOW-6730] Use total_seconds instead of seconds URL: https://github.com/apache/airflow/pull/7363 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > is_alive uses seconds and not total_seconds > --- > > Key: AIRFLOW-6730 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6730 > Project: Apache Airflow > Issue Type: Bug > Components: core >Affects Versions: 1.10.4 >Reporter: Alex Guziel >Assignee: Alex Guziel >Priority: Major > Fix For: 2.0.0, 1.10.10 > > > Example: > timedelta(days=1).seconds = 0 > timedelta(days=1).total_seconds() = 86400 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [airflow] zhongjiajie commented on issue #7573: [AIRFLOW-6719] Add pyupgrade
zhongjiajie commented on issue #7573: [AIRFLOW-6719] Add pyupgrade URL: https://github.com/apache/airflow/pull/7573#issuecomment-592267628 Related PR https://github.com/apache/airflow/pull/7343 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (AIRFLOW-6933) Pass in env vars for all operators
[ https://issues.apache.org/jira/browse/AIRFLOW-6933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047055#comment-17047055 ] ASF GitHub Bot commented on AIRFLOW-6933: - saguziel commented on pull request #7576: [AIRFLOW-6933] Pass in env vars for all operators URL: https://github.com/apache/airflow/pull/7576 Pass in the airflow context env vars for all operators by doing it in ti._run_raw. Remove it from PythonOperator since it is unnecessary. Leave it in other places since they do special insertions. --- Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg) Make sure to mark the boxes below before creating PR: [x] - [x] Description above provides context of the change - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID* - [x] Unit tests coverage for changes (not needed for documentation changes) - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)" - [x] Relevant documentation is updated including usage instructions. - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example). * For document-only changes commit message can start with `[AIRFLOW-]`. --- In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information. This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Pass in env vars for all operators > -- > > Key: AIRFLOW-6933 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6933 > Project: Apache Airflow > Issue Type: Improvement > Components: core >Affects Versions: 1.10.9 >Reporter: Alex Guziel >Assignee: Alex Guziel >Priority: Major > > Right now, certain operators pass in certain env vars like task_id, dag_id, > etc. This should be expanded to include all operators -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [airflow] saguziel opened a new pull request #7576: [AIRFLOW-6933] Pass in env vars for all operators
saguziel opened a new pull request #7576: [AIRFLOW-6933] Pass in env vars for all operators URL: https://github.com/apache/airflow/pull/7576 Pass in the airflow context env vars for all operators by doing it in ti._run_raw. Remove it from PythonOperator since it is unnecessary. Leave it in other places since they do special insertions. --- Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg) Make sure to mark the boxes below before creating PR: [x] - [x] Description above provides context of the change - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID* - [x] Unit tests coverage for changes (not needed for documentation changes) - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)" - [x] Relevant documentation is updated including usage instructions. - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example). * For document-only changes commit message can start with `[AIRFLOW-]`. --- In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] abdulbasitds commented on a change in pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration
abdulbasitds commented on a change in pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration URL: https://github.com/apache/airflow/pull/6007#discussion_r385377395 ## File path: docs/integration.rst ## @@ -17,7 +17,6 @@ Integration === - Review comment: Hi, I am sorry, but resetting the commit and removing the file doesn't make sense to me (reverting to some commit will still keep the things you wanted me to remove) and would eventually make things complicated. Instead, I checked out this file from apache/master and committed that as follows; it no longer appears in the changed files. Can you please tell me if that is okay?
```
git checkout apache/master -- docs/integration.rst
git commit -m "removing a file from PR"
git push origin master
```
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] stale[bot] commented on issue #6565: [AIRFLOW-5909] Enable mapping to BYTEs type to sql_to_gcs operator
stale[bot] commented on issue #6565: [AIRFLOW-5909] Enable mapping to BYTEs type to sql_to_gcs operator URL: https://github.com/apache/airflow/pull/6565#issuecomment-592221601 This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] ashb merged pull request #7553: [AIRFLOW-XXXX] Update LICENSE versions and remove old licenses
ashb merged pull request #7553: [AIRFLOW-XXXX] Update LICENSE versions and remove old licenses URL: https://github.com/apache/airflow/pull/7553 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] ashb commented on issue #6075: [AIRFLOW-5266] Allow aws_athena_hook to get all query results
ashb commented on issue #6075: [AIRFLOW-5266] Allow aws_athena_hook to get all query results URL: https://github.com/apache/airflow/pull/6075#issuecomment-592219938 I'll take a look at this tomorrow This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] codecov-io commented on issue #7575: [AIRFLOW-6949] Respect explicit `spark.kubernetes.namespace` conf to SparkSubmitOperator
codecov-io commented on issue #7575: [AIRFLOW-6949] Respect explicit `spark.kubernetes.namespace` conf to SparkSubmitOperator URL: https://github.com/apache/airflow/pull/7575#issuecomment-592215039

# [Codecov](https://codecov.io/gh/apache/airflow/pull/7575?src=pr=h1) Report

> Merging [#7575](https://codecov.io/gh/apache/airflow/pull/7575?src=pr=desc) into [master](https://codecov.io/gh/apache/airflow/commit/d031f844517a8d12e7d90af0c472ca00c64b8963?src=pr=desc) will **increase** coverage by `<.01%`.
> The diff coverage is `100%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/airflow/pull/7575/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/7575?src=pr=tree)

```diff
@@            Coverage Diff            @@
##           master    #7575     +/-  ##
========================================
+ Coverage   86.56%   86.56%   +<.01%
========================================
  Files         896      896
  Lines       42622    42623       +1
========================================
+ Hits        36896    36897       +1
  Misses       5726     5726
```

| [Impacted Files](https://codecov.io/gh/apache/airflow/pull/7575?src=pr=tree) | Coverage Δ | |
|---|---|---|
| [...rflow/providers/apache/spark/hooks/spark\_submit.py](https://codecov.io/gh/apache/airflow/pull/7575/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvYXBhY2hlL3NwYXJrL2hvb2tzL3NwYXJrX3N1Ym1pdC5weQ==) | `84.67% <100%> (+0.05%)` | :arrow_up: |

--

[Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/7575?src=pr=continue).

> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/7575?src=pr=footer). Last update [d031f84...c2a16db](https://codecov.io/gh/apache/airflow/pull/7575?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] saguziel commented on a change in pull request #7363: [AIRFLOW-6730] Use total_seconds instead of seconds
saguziel commented on a change in pull request #7363: [AIRFLOW-6730] Use total_seconds instead of seconds URL: https://github.com/apache/airflow/pull/7363#discussion_r385388599 ## File path: tests/providers/google/cloud/operators/test_dataproc.py ## @@ -104,7 +104,7 @@ "autoscaling_config": {"policy_uri": "autoscaling_policy"}, "config_bucket": "storage_bucket", "initialization_actions": [ -{"executable_file": "init_actions_uris", "execution_timeout": "600s"} +{"executable_file": "init_actions_uris", "execution_timeout": "600.0s"} Review comment: the Python typing indicates float, but I think that is just within the Airflow project. I will just cast it to int, since that is guaranteed not to break anything. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
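The cast discussed above can be sketched in one line (function name is illustrative; the point is only that formatting through `int` keeps the duration string in the `600s` form rather than `600.0s`):

```python
def format_execution_timeout(seconds):
    # Cast to int so a float timeout renders as "600s", not "600.0s",
    # matching the duration-string form the Dataproc test above expects.
    return f"{int(seconds)}s"

print(format_execution_timeout(600.0))
```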
[GitHub] [airflow] ashb commented on issue #7573: [AIRFLOW-6719] Add pyupgrade
ashb commented on issue #7573: [AIRFLOW-6719] Add pyupgrade URL: https://github.com/apache/airflow/pull/7573#issuecomment-592186638 Should we hold off on this PR until we are done with 1.10.x backports entirely? As it stands, it makes any future backports all but impossible. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] ashb opened a new pull request #7575: [AIRFLOW-6949] Respect explicit `spark.kubernetes.namespace` conf to SparkSubmitOperator
ashb opened a new pull request #7575: [AIRFLOW-6949] Respect explicit `spark.kubernetes.namespace` conf to SparkSubmitOperator URL: https://github.com/apache/airflow/pull/7575 This means the value from the Operator/dag file takes precedence over the connection The previous behaviour was to emit one line from the conf arg, but then a later one from the connection: ``` --conf spark.kubernetes.namespace=airflow \ --conf spark.kubernetes.namespace=default \ ``` --- Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg) Make sure to mark the boxes below before creating PR: [x] - [x] Description above provides context of the change - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID* - [x] Unit tests coverage for changes (not needed for documentation changes) - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)" - [x] Relevant documentation is updated including usage instructions. - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example). * For document-only changes commit message can start with `[AIRFLOW-]`. --- In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information. This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
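The precedence rule described in the PR above — an explicit key in the operator's `conf` overrides the connection's default, so spark-submit receives a single `--conf` line per key — can be sketched as follows (function name and argument shapes are assumptions for illustration, not the hook's actual code):

```python
def merge_spark_conf(connection_conf, operator_conf):
    """Merge connection-level defaults with the operator's explicit conf.

    Illustrative sketch: the operator/DAG-file value takes precedence
    over the connection on key collisions, so only one --conf line is
    emitted per key.
    """
    merged = dict(connection_conf)
    merged.update(operator_conf)  # operator values win on collisions
    return [f"--conf {k}={v}" for k, v in sorted(merged.items())]

args = merge_spark_conf(
    {"spark.kubernetes.namespace": "default"},
    {"spark.kubernetes.namespace": "airflow"},
)
```

With the old behaviour both lines were emitted and spark-submit's last-one-wins handling silently discarded the explicit value; merging before emitting removes the ambiguity.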
[jira] [Commented] (AIRFLOW-6949) SparkSubitOperator ignores explicit spark.kubernetes.namespace config option
[ https://issues.apache.org/jira/browse/AIRFLOW-6949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046981#comment-17046981 ] ASF GitHub Bot commented on AIRFLOW-6949: - ashb commented on pull request #7575: [AIRFLOW-6949] Respect explicit `spark.kubernetes.namespace` conf to SparkSubmitOperator URL: https://github.com/apache/airflow/pull/7575 This means the value from the Operator/dag file takes precedence over the connection The previous behaviour was to emit one line from the conf arg, but then a later one from the connection: ``` --conf spark.kubernetes.namespace=airflow \ --conf spark.kubernetes.namespace=default \ ``` --- Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg) Make sure to mark the boxes below before creating PR: [x] - [x] Description above provides context of the change - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID* - [x] Unit tests coverage for changes (not needed for documentation changes) - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)" - [x] Relevant documentation is updated including usage instructions. - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example). * For document-only changes commit message can start with `[AIRFLOW-]`. --- In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). 
Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > SparkSubitOperator ignores explicit spark.kubernetes.namespace config option > > > Key: AIRFLOW-6949 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6949 > Project: Apache Airflow > Issue Type: Bug > Components: hooks >Affects Versions: 1.10.9 >Reporter: Ash Berlin-Taylor >Assignee: Ash Berlin-Taylor >Priority: Minor > Fix For: 1.10.10 > > > If a user explicitly passes {{spark.kubernetes.namespace}} in the config > attribute, we should respect that over what is given in the connection. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (AIRFLOW-6949) SparkSubitOperator ignores explicit spark.kubernetes.namespace config option
Ash Berlin-Taylor created AIRFLOW-6949: -- Summary: SparkSubitOperator ignores explicit spark.kubernetes.namespace config option Key: AIRFLOW-6949 URL: https://issues.apache.org/jira/browse/AIRFLOW-6949 Project: Apache Airflow Issue Type: Bug Components: hooks Affects Versions: 1.10.9 Reporter: Ash Berlin-Taylor Assignee: Ash Berlin-Taylor Fix For: 1.10.10 If a user explicitly passes {{spark.kubernetes.namespace}} in the config attribute, we should respect that over what is given in the connection. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [airflow] abdulbasitds commented on a change in pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration
abdulbasitds commented on a change in pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration URL: https://github.com/apache/airflow/pull/6007#discussion_r385377395 ## File path: docs/integration.rst ## @@ -17,7 +17,6 @@ Integration === - Review comment: Hi, I am sorry, but resetting the commit and removing the file doesn't make sense to me (reverting to some commit will still contain the things that you wanted me to remove) and would eventually make things complicated. Can I check out this file from apache/master and commit that?

```
git checkout apache/master -- docs/integration.rst
git commit -m "removing a file from PR"
git push origin master
```

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] anitakar commented on a change in pull request #7217: [AIRFLOW-5946] Store source code in db
anitakar commented on a change in pull request #7217: [AIRFLOW-5946] Store source code in db URL: https://github.com/apache/airflow/pull/7217#discussion_r385374472 ## File path: docs/dag-serialization.rst ## @@ -63,6 +63,8 @@ Add the following settings in ``airflow.cfg``: If set to True, Webserver reads from DB instead of parsing DAG files * ``min_serialized_dag_update_interval``: This flag sets the minimum interval (in seconds) after which the serialized DAG in DB should be updated. This helps in reducing database write rate. +* ``store_code``: This flag decides whether to persist DAG files code in DB. +If set to True, Webserver reads file contents from DB instead of trying to access files in a DAG folder. If you are updating Airflow from <1.10.7, please do not forget to run ``airflow db upgrade``. Review comment: I am afraid I do not understand This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
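The settings quoted in the review above might appear together in `airflow.cfg` roughly as below (a hedged sketch: `store_serialized_dags` is assumed from Airflow's dag-serialization docs and is not quoted in this thread; verify option names against your Airflow version). Parsed here with `configparser` to show the shape:

```python
from configparser import ConfigParser

cfg = ConfigParser()
cfg.read_string("""
[core]
store_serialized_dags = True
min_serialized_dag_update_interval = 30
store_code = True
""")

# With store_code enabled, the webserver's Code View reads DAG source
# from the database instead of the DAG folder.
print(cfg.getboolean("core", "store_code"))
```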
[GitHub] [airflow] stale[bot] commented on issue #5388: [AIRFLOW-4490] DagRun.conf should return empty dictionary by default
stale[bot] commented on issue #5388: [AIRFLOW-4490] DagRun.conf should return empty dictionary by default URL: https://github.com/apache/airflow/pull/5388#issuecomment-592150338 This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Resolved] (AIRFLOW-6856) Bulk fetch paused_dag_ids
[ https://issues.apache.org/jira/browse/AIRFLOW-6856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Bregula resolved AIRFLOW-6856. Fix Version/s: 2.0.0 Resolution: Fixed > Bulk fetch paused_dag_ids > - > > Key: AIRFLOW-6856 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6856 > Project: Apache Airflow > Issue Type: Bug > Components: scheduler >Affects Versions: 1.10.9 >Reporter: Kamil Bregula >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (AIRFLOW-6856) Bulk fetch paused_dag_ids
[ https://issues.apache.org/jira/browse/AIRFLOW-6856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046898#comment-17046898 ] ASF subversion and git services commented on AIRFLOW-6856: -- Commit d031f844517a8d12e7d90af0c472ca00c64b8963 in airflow's branch refs/heads/master from Kamil Breguła [ https://gitbox.apache.org/repos/asf?p=airflow.git;h=d031f84 ] [AIRFLOW-6856] Bulk fetch paused_dag_ids (#7476) > Bulk fetch paused_dag_ids > - > > Key: AIRFLOW-6856 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6856 > Project: Apache Airflow > Issue Type: Bug > Components: scheduler >Affects Versions: 1.10.9 >Reporter: Kamil Bregula >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (AIRFLOW-6856) Bulk fetch paused_dag_ids
[ https://issues.apache.org/jira/browse/AIRFLOW-6856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046897#comment-17046897 ] ASF GitHub Bot commented on AIRFLOW-6856: - mik-laj commented on pull request #7476: [AIRFLOW-6856] Bulk fetch paused_dag_ids URL: https://github.com/apache/airflow/pull/7476 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Bulk fetch paused_dag_ids > - > > Key: AIRFLOW-6856 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6856 > Project: Apache Airflow > Issue Type: Bug > Components: scheduler >Affects Versions: 1.10.9 >Reporter: Kamil Bregula >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [airflow] mik-laj merged pull request #7476: [AIRFLOW-6856] Bulk fetch paused_dag_ids
mik-laj merged pull request #7476: [AIRFLOW-6856] Bulk fetch paused_dag_ids URL: https://github.com/apache/airflow/pull/7476 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Assigned] (AIRFLOW-5946) Store & Read code from DB for Code View
[ https://issues.apache.org/jira/browse/AIRFLOW-5946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Bregula reassigned AIRFLOW-5946: -- Assignee: Anita Fronczak (was: Kaxil Naik) > Store & Read code from DB for Code View > --- > > Key: AIRFLOW-5946 > URL: https://issues.apache.org/jira/browse/AIRFLOW-5946 > Project: Apache Airflow > Issue Type: Improvement > Components: webserver >Affects Versions: 2.0.0, 1.10.7 >Reporter: Kaxil Naik >Assignee: Anita Fronczak >Priority: Major > Labels: dag-serialization > > To make Webserver not need DAG Files we need to find a way to get Code to > display in *Code View*. > - Store in lazy-loaded column in SerializedDag table > - Save in a new table with DAG_id and store versions as well. Add a limit of > last 10 versions. This is just needed by Code View so not a problem if we > store in New table > OR - Just keep as reading from file? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (AIRFLOW-6931) One migration failed during "airflow initdb" in mssql server 2017
[ https://issues.apache.org/jira/browse/AIRFLOW-6931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046882#comment-17046882 ] ASF GitHub Bot commented on AIRFLOW-6931: - BaoshanGu commented on pull request #7574: [AIRFLOW-6931] Fixed migrations to find all dependencies for mssql URL: https://github.com/apache/airflow/pull/7574 [AIRFLOW-6931] Migration file 74effc47d867_change_datetime_to_datetime2_6_on_mssql_.py failed during "airflow initdb" in mssql server 2017. The error message is: _mssql.MSSQLDatabaseException: (5074, b"The object 'UQ__dag_run__F78A9899295C1915' is dependent on column 'execution_date'.DB-Lib error message 20018, severity 16:\nGeneral SQL Server error: Check messages from the SQL Server\nDB-Lib error message 20018, severity 16:\nGeneral SQL Server error: Check messages from the SQL Server\n") --- Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg) Make sure to mark the boxes below before creating PR: [x] - [x] Description above provides context of the change - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID* - [ ] Unit tests coverage for changes (not needed for documentation changes) - [ ] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)" - [ ] Relevant documentation is updated including usage instructions. - [ ] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example). * For document-only changes commit message can start with `[AIRFLOW-]`. --- In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). 
In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > One migration failed during "airflow initdb" in mssql server 2017 > - > > Key: AIRFLOW-6931 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6931 > Project: Apache Airflow > Issue Type: Bug > Components: database >Affects Versions: 1.10.9 > Environment: microsoft sqlserver 2017 >Reporter: Baoshan Gu >Priority: Major > > Running "airflw initdb" got error: > {code:java} > _mssql.MSSQLDatabaseException: (5074, b"The object > 'UQ__dag_run__F78A9899295C1915' is dependent on column > 'execution_date'.DB-Lib error message 20018, severity 16:\nGeneral SQL Server > error: Check messages from the SQL Server\nDB-Lib error message 20018, > severity 16:\nGeneral SQL Server error: Check messages from the SQL Server\n") > {code} > The issue is migration file > [74effc47d867_change_datetime_to_datetime2_6_on_mssql_.py|https://github.com/apache/airflow/blob/master/airflow/migrations/versions/74effc47d867_change_datetime_to_datetime2_6_on_mssql_.py#L235] > does not find all constraints. > Confirmed that changing it to case-insensitive selection works: > {code}(tc.CONSTRAINT_TYPE = 'PRIMARY KEY' or LOWER(tc.CONSTRAINT_TYPE) = > 'unique'){code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [airflow] BaoshanGu opened a new pull request #7574: [AIRFLOW-6931] Fixed migrations to find all dependencies for mssql
BaoshanGu opened a new pull request #7574: [AIRFLOW-6931] Fixed migrations to find all dependencies for mssql URL: https://github.com/apache/airflow/pull/7574 [AIRFLOW-6931] Migration file 74effc47d867_change_datetime_to_datetime2_6_on_mssql_.py failed during "airflow initdb" in mssql server 2017. The error message is: _mssql.MSSQLDatabaseException: (5074, b"The object 'UQ__dag_run__F78A9899295C1915' is dependent on column 'execution_date'.DB-Lib error message 20018, severity 16:\nGeneral SQL Server error: Check messages from the SQL Server\nDB-Lib error message 20018, severity 16:\nGeneral SQL Server error: Check messages from the SQL Server\n") --- Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg) Make sure to mark the boxes below before creating PR: [x] - [x] Description above provides context of the change - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID* - [ ] Unit tests coverage for changes (not needed for documentation changes) - [ ] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)" - [ ] Relevant documentation is updated including usage instructions. - [ ] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example). * For document-only changes commit message can start with `[AIRFLOW-]`. --- In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). 
Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
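The fix described in this PR (and spelled out in the linked JIRA issue) amounts to matching constraint types case-insensitively, so that a `Unique` constraint is found and dropped before the datetime columns are altered. A small sketch of that predicate (illustrative only, not the migration's actual code):

```python
def is_target_constraint(constraint_type):
    # Normalise case before comparing, mirroring the
    # LOWER(tc.CONSTRAINT_TYPE) = 'unique' fix quoted in the JIRA issue,
    # so constraints reported as 'Unique' are not missed.
    return constraint_type.upper() in ("PRIMARY KEY", "UNIQUE")

for ct in ("PRIMARY KEY", "UNIQUE", "Unique", "FOREIGN KEY"):
    print(ct, is_target_constraint(ct))
```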
[GitHub] [airflow] boring-cyborg[bot] commented on issue #7574: [AIRFLOW-6931] Fixed migrations to find all dependencies for mssql
boring-cyborg[bot] commented on issue #7574: [AIRFLOW-6931] Fixed migrations to find all dependencies for mssql URL: https://github.com/apache/airflow/pull/7574#issuecomment-592114910 Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst). Here are some useful points:
- Pay attention to the quality of your code (flake8, pylint and type annotations). Our [pre-commits](https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
- In case of a new feature, add useful documentation (in docstrings or in the `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/master/docs/howto/custom-operator.rst). Consider adding an example DAG that shows how users should use it.
- Consider using the [Breeze environment](https://github.com/apache/airflow/blob/master/BREEZE.rst) for testing locally; it is a heavy Docker setup, but it ships with a working Airflow and a lot of integrations.
- Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
- Be sure to read the [Airflow Coding style](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#coding-style-and-best-practices).
Apache Airflow is a community-driven project and together we are making it better. In case of doubts contact the developers at: Mailing List: d...@airflow.apache.org Slack: https://apache-airflow-slack.herokuapp.com/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (AIRFLOW-6361) Run LocalTaskJob directly in Celery task
[ https://issues.apache.org/jira/browse/AIRFLOW-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046846#comment-17046846 ] ASF GitHub Bot commented on AIRFLOW-6361: - mik-laj commented on pull request #6905: [AIRFLOW-6361] Run LocalTaskJob directly in Celery task URL: https://github.com/apache/airflow/pull/6905 Hello, The executor runs multiple processes to perform one task. Many processes have a very short life cycle, so starting them is a significant overhead. Firstly, the Celery executor triggers Celery tasks - app.task. This task runs the CLI command (first process), which contains LocalTaskJob. LocalTaskJob runs a separate command (second process) that executes user code. This level of isolation is redundant because LocalTaskJob doesn't execute unsafe code. The first command is run by new process creation, not by a fork, so this is an expensive operation. I suggest running the code from the first process as part of the Celery task to reduce the need to create new processes. The code currently uses CLIFactory to run the LocalTaskJob. It is better to do this without unnecessary dependence on the CLI, but that is a big change and I plan to do it in a separate PR. WIP PR: https://github.com/mik-laj/incubator-airflow/pull/10 (Travis green :-D ) Performance benchmark: === Example DAG from Airflow with unneeded sleep instructions deleted. 
```python
"""Example DAG demonstrating the usage of the BashOperator."""
from datetime import timedelta

import airflow
from airflow.models import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.operators.dummy_operator import DummyOperator

args = {
    'owner': 'airflow',
    'start_date': airflow.utils.dates.days_ago(2),
}

dag = DAG(
    dag_id='example_bash_operator',
    default_args=args,
    schedule_interval='0 0 * * *',
    dagrun_timeout=timedelta(minutes=60),
)

run_this_last = DummyOperator(
    task_id='run_this_last',
    dag=dag,
)

# [START howto_operator_bash]
run_this = BashOperator(
    task_id='run_after_loop',
    bash_command='echo 1',
    dag=dag,
)
# [END howto_operator_bash]

run_this >> run_this_last

for i in range(3):
    task = BashOperator(
        task_id='runme_' + str(i),
        bash_command='echo "{{ task_instance_key_str }}"',
        dag=dag,
    )
    task >> run_this

# [START howto_operator_bash_template]
also_run_this = BashOperator(
    task_id='also_run_this',
    bash_command='echo "run_id={{ run_id }} | dag_run={{ dag_run }}"',
    dag=dag,
)
# [END howto_operator_bash_template]

also_run_this >> run_this_last

if __name__ == "__main__":
    dag.cli()
```

```python
import airflow
from airflow import DAG
from airflow.models import DagBag

dagbag = airflow.models.DagBag()
dag: DAG = dagbag.get_dag("example_bash_operator")
dag.clear()
dag.run()
```

Environment: Breeze

```
unset AIRFLOW__CORE__DAGS_FOLDER
unset AIRFLOW__CORE__UNIT_TEST_MODE
chmod -R 777 /root
sudo -E su airflow
export AIRFLOW__CORE__EXECUTOR="CeleryExecutor"
export AIRFLOW__CELERY__BROKER_URL="redis://redis:6379/0"
export AIRFLOW__CELERY__WORKER_CONCURRENCY=8
seq 1 10 | xargs -n 1 -I {} bash -c "time python /files/benchmark_speed.py > /dev/null 2>&1" | grep '^(real\|user\|sys)';
```

Result:

|Fn. | Before | After | Change|
|----|--------|-------|-------|
|AVERAGE | 56.48 | 38.32 | -32% |
|VAR | 23.60 | 0.04 | -98% |
|MAX | 68.29 | 38.68 | -43% |
|MIN | 53.26 | 38.08 | -28% |
|STDEV | 4.86 | 0.19 | -96% |

Raw data After:

```
real 0m38.394s
user 0m4.340s
sys 0m1.600s
real 0m38.355s
user 0m4.700s
sys 0m1.340s
real 0m38.675s
user 0m4.760s
sys 0m1.530s
real 0m38.488s
user 0m4.770s
sys 0m1.280s
real 0m38.434s
user 0m4.600s
sys 0m1.390s
real 0m38.378s
user 0m4.500s
sys 0m1.270s
real 0m38.106s
user 0m4.200s
sys 0m1.100s
real 0m38.082s
user 0m4.170s
sys 0m1.030s
real 0m38.173s
user 0m4.290s
sys 0m1.340s
real 0m38.161s
user 0m4.460s
sys 0m1.370s
```

Before:

```
real 0m53.488s
user 0m5.140s
sys 0m1.700s
real 1m8.288s
user 0m6.430s
sys 0m2.200s
real 0m53.371s
user 0m5.330s
sys 0m1.630s
real 0m58.939s
user 0m6.470s
sys 0m1.730s
real 0m53.255s
user 0m4.950s
sys 0m1.640s
real 0m58.802s
user 0m5.970s
sys 0m1.790s
real 0m58.449s
user 0m5.380s
sys 0m1.580s
real 0m53.308s
user 0m5.120s
sys 0m1.430s
real 0m53.485s
user 0m5.220s
sys 0m1.290s
```
[GitHub] [airflow] mik-laj opened a new pull request #6905: [AIRFLOW-6361] Run LocalTaskJob directly in Celery task
mik-laj opened a new pull request #6905: [AIRFLOW-6361] Run LocalTaskJob directly in Celery task URL: https://github.com/apache/airflow/pull/6905 Hello, The executor runs multiple processes to perform one task. Many of these processes have a very short life cycle, so starting them is significant overhead. First, the Celery executor triggers a Celery task (app.task). This task runs a CLI command (first process), which contains LocalTaskJob. LocalTaskJob then runs a separate command (second process) that executes the user code. This level of isolation is redundant because LocalTaskJob doesn't execute unsafe code. The first command is started by creating a new process rather than by forking, so it is an expensive operation. I suggest running the first process's code as part of the Celery task itself, to reduce the need to create new processes. The code currently uses CLIFactory to run the LocalTaskJob. It would be better to do this without the unnecessary dependence on the CLI, but that is a big change and I plan to do it in a separate PR. WIP PR: https://github.com/mik-laj/incubator-airflow/pull/10 (Travis green :-D ) Performance benchmark: === Example DAG from Airflow with unneeded sleep instructions deleted.
```python
"""Example DAG demonstrating the usage of the BashOperator."""
from datetime import timedelta

import airflow
from airflow.models import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.operators.dummy_operator import DummyOperator

args = {
    'owner': 'airflow',
    'start_date': airflow.utils.dates.days_ago(2),
}

dag = DAG(
    dag_id='example_bash_operator',
    default_args=args,
    schedule_interval='0 0 * * *',
    dagrun_timeout=timedelta(minutes=60),
)

run_this_last = DummyOperator(
    task_id='run_this_last',
    dag=dag,
)

# [START howto_operator_bash]
run_this = BashOperator(
    task_id='run_after_loop',
    bash_command='echo 1',
    dag=dag,
)
# [END howto_operator_bash]

run_this >> run_this_last

for i in range(3):
    task = BashOperator(
        task_id='runme_' + str(i),
        bash_command='echo "{{ task_instance_key_str }}"',
        dag=dag,
    )
    task >> run_this

# [START howto_operator_bash_template]
also_run_this = BashOperator(
    task_id='also_run_this',
    bash_command='echo "run_id={{ run_id }} | dag_run={{ dag_run }}"',
    dag=dag,
)
# [END howto_operator_bash_template]

also_run_this >> run_this_last

if __name__ == "__main__":
    dag.cli()
```

```python
import airflow
from airflow import DAG
from airflow.models import DagBag

dagbag = airflow.models.DagBag()
dag: DAG = dagbag.get_dag("example_bash_operator")
dag.clear()
dag.run()
```

Environment: Breeze

```
unset AIRFLOW__CORE__DAGS_FOLDER
unset AIRFLOW__CORE__UNIT_TEST_MODE
chmod -R 777 /root
sudo -E su airflow
export AIRFLOW__CORE__EXECUTOR="CeleryExecutor"
export AIRFLOW__CELERY__BROKER_URL="redis://redis:6379/0"
export AIRFLOW__CELERY__WORKER_CONCURRENCY=8
seq 1 10 | xargs -n 1 -I {} bash -c "time python /files/benchmark_speed.py > /dev/null 2>&1" | grep '^(real\|user\|sys)';
```

Result:

| Fn. | Before | After | Change |
|---|---|---|---|
| AVERAGE | 56.48 | 38.32 | -32% |
| VAR | 23.60 | 0.04 | -98% |
| MAX | 68.29 | 38.68 | -43% |
| MIN | 53.26 | 38.08 | -28% |
| STDEV | 4.86 | 0.19 | -96% |
Raw data

After:
```
real 0m38.394s  user 0m4.340s  sys 0m1.600s
real 0m38.355s  user 0m4.700s  sys 0m1.340s
real 0m38.675s  user 0m4.760s  sys 0m1.530s
real 0m38.488s  user 0m4.770s  sys 0m1.280s
real 0m38.434s  user 0m4.600s  sys 0m1.390s
real 0m38.378s  user 0m4.500s  sys 0m1.270s
real 0m38.106s  user 0m4.200s  sys 0m1.100s
real 0m38.082s  user 0m4.170s  sys 0m1.030s
real 0m38.173s  user 0m4.290s  sys 0m1.340s
real 0m38.161s  user 0m4.460s  sys 0m1.370s
```

Before:
```
real 0m53.488s  user 0m5.140s  sys 0m1.700s
real 1m8.288s   user 0m6.430s  sys 0m2.200s
real 0m53.371s  user 0m5.330s  sys 0m1.630s
real 0m58.939s  user 0m6.470s  sys 0m1.730s
real 0m53.255s  user 0m4.950s  sys 0m1.640s
real 0m58.802s  user 0m5.970s  sys 0m1.790s
real 0m58.449s  user 0m5.380s  sys 0m1.580s
real 0m53.308s  user 0m5.120s  sys 0m1.430s
real 0m53.485s  user 0m5.220s  sys 0m1.290s
real 0m53.387s  user 0m5.020s  sys 0m1.590s
```

---

Link to JIRA issue: https://issues.apache.org/jira/browse/AIRFLOW-6361

- [x] Description above provides context of the change
- [x] Commit message starts with
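The PR's argument hinges on process-creation cost: launching a fresh interpreter per task is far slower than running the same code in-process. Below is a minimal, self-contained illustration of that overhead — not Airflow code, and `run_in_subprocess`/`run_in_process` are invented names for the sketch:

```python
import subprocess
import sys
import time

def run_in_subprocess(n):
    # One fresh Python interpreter per "task" -- analogous to the extra
    # CLI process the PR removes from the Celery worker.
    for _ in range(n):
        subprocess.run([sys.executable, "-c", "pass"], check=True)

def run_in_process(n):
    # The same (trivial) work executed directly in the current process.
    for _ in range(n):
        pass

start = time.perf_counter()
run_in_subprocess(5)
spawn_time = time.perf_counter() - start

start = time.perf_counter()
run_in_process(5)
inproc_time = time.perf_counter() - start

# Interpreter startup dominates: the subprocess variant is orders of
# magnitude slower even though neither does any real work.
print(f"subprocess: {spawn_time:.3f}s  in-process: {inproc_time:.6f}s")
```

The absolute numbers depend on the machine, but the gap mirrors the benchmark above: removing one process launch per task removes a fixed startup cost from every task.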
[GitHub] [airflow] mik-laj commented on issue #7343: [AIRFLOW-6719] Introduce pyupgrade to enforce latest syntax
mik-laj commented on issue #7343: [AIRFLOW-6719] Introduce pyupgrade to enforce latest syntax URL: https://github.com/apache/airflow/pull/7343#issuecomment-592090216 @mik-laj please create a new PR to apache/airflow but reuse the JIRA number. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] zhongjiajie opened a new pull request #7573: Add pyupgrade
zhongjiajie opened a new pull request #7573: Add pyupgrade URL: https://github.com/apache/airflow/pull/7573 --- Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg) Make sure to mark the boxes below before creating PR: [x] - [x] Description above provides context of the change - [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID* - [x] Unit tests coverage for changes (not needed for documentation changes) - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)" - [x] Relevant documentation is updated including usage instructions. - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example). * For document-only changes commit message can start with `[AIRFLOW-]`. --- In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed. In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md). Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] kaxil commented on a change in pull request #7572: [AIRFLOW-6948] Remove ASCII Airflow from version command
kaxil commented on a change in pull request #7572: [AIRFLOW-6948] Remove ASCII Airflow from version command URL: https://github.com/apache/airflow/pull/7572#discussion_r385258199 ## File path: airflow/cli/commands/version_command.py ## @@ -16,9 +16,8 @@ # under the License. """Version command""" import airflow -from airflow import settings def version(args): """Displays Airflow version at the command line""" -print(settings.HEADER + " v" + airflow.__version__) +print(f"v{airflow.__version__}") Review comment: We should probably remove `v` too. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] dhegberg commented on a change in pull request #7437: [AIRFLOW-2325] Add CloudwatchTaskHandler option for remote task loggi…
dhegberg commented on a change in pull request #7437: [AIRFLOW-2325] Add CloudwatchTaskHandler option for remote task loggi… URL: https://github.com/apache/airflow/pull/7437#discussion_r385254375 ## File path: docs/howto/write-logs.rst ## @@ -115,6 +115,29 @@ To configure it, you must additionally set the endpoint url to point to your loc You can do this via the Connection Extra ``host`` field. For example, ``{"host": "http://localstack:4572"}`` +.. _write-logs-amazon-cloudwatch: + +Writing Logs to Amazon Cloudwatch +- + + +Enabling remote logging +''' + +To enable this feature, ``airflow.cfg`` must be configured as follows: + +.. code-block:: ini + +[logging] +# Airflow can store logs remotely in AWS Cloudwatch. Users must supply a log group +# ARN (starting with 'cloudwatch://...') and an Airflow connection +# id that provides write and read access to the log location. +remote_logging = True +remote_base_log_folder = cloudwatch://arn:aws:logs:::log-group::* Review comment: When you display the ARNs in the CloudWatch console or via a describe-log-groups call, they all have a '*' suffix. The docs however describe the ARN without a suffix: https://docs.aws.amazon.com/IAM/latest/UserGuide/list_amazoncloudwatchlogs.html#amazoncloudwatchlogs-resources-for-iam-policies I'll go with the ARN format that is consistent with the docs, and if users include the * when copying the ARN it will work anyway. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
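The config above packs a log-group ARN into a `cloudwatch://` URL, so the handler has to recover the ARN from the setting. A hypothetical sketch of that split — `parse_remote_base_log_folder` is an invented name, and the actual CloudwatchTaskHandler may do this differently (note a plain URL parser would mangle the colons inside an ARN, hence the simple prefix split):

```python
def parse_remote_base_log_folder(url: str) -> str:
    # Strip the 'cloudwatch://' scheme to recover the raw log-group ARN
    # configured in airflow.cfg. ARNs contain ':' characters, so we split
    # on the scheme separator only, rather than using urllib.parse.
    scheme, sep, rest = url.partition("://")
    if not sep or scheme != "cloudwatch":
        raise ValueError(f"expected a cloudwatch:// URL, got {url!r}")
    return rest

# Example with a fully specified (made-up) region/account:
arn = parse_remote_base_log_folder(
    "cloudwatch://arn:aws:logs:us-east-1:123456789012:log-group:airflow-logs"
)
print(arn)
```

As the review notes, a trailing `*` copied from the CloudWatch console would simply ride along in the returned ARN, which is why either form works.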
[GitHub] [airflow] kaxil commented on a change in pull request #7572: [AIRFLOW-6948] Remove ASCII Airflow from version command
kaxil commented on a change in pull request #7572: [AIRFLOW-6948] Remove ASCII Airflow from version command URL: https://github.com/apache/airflow/pull/7572#discussion_r385258385 ## File path: airflow/cli/commands/version_command.py ## @@ -16,9 +16,8 @@ # under the License. """Version command""" import airflow -from airflow import settings def version(args): """Displays Airflow version at the command line""" -print(settings.HEADER + " v" + airflow.__version__) +print(f"v{airflow.__version__}") Review comment: ```suggestion print(f"{airflow.__version__}") ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Resolved] (AIRFLOW-6862) Do not check the freshness of fresh DAG
[ https://issues.apache.org/jira/browse/AIRFLOW-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Bregula resolved AIRFLOW-6862. Fix Version/s: 2.0.0 Resolution: Fixed > Do not check the freshness of fresh DAG > --- > > Key: AIRFLOW-6862 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6862 > Project: Apache Airflow > Issue Type: Bug > Components: core >Affects Versions: 1.10.9 >Reporter: Kamil Bregula >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (AIRFLOW-6862) Do not check the freshness of fresh DAG
[ https://issues.apache.org/jira/browse/AIRFLOW-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046817#comment-17046817 ] ASF subversion and git services commented on AIRFLOW-6862: -- Commit 3837dce02627cea12a51d5e38956758dcfa6c121 in airflow's branch refs/heads/master from Kamil Breguła [ https://gitbox.apache.org/repos/asf?p=airflow.git;h=3837dce ] [AIRFLOW-6862] Do not check the freshness of fresh DAG (#7481) > Do not check the freshness of fresh DAG > --- > > Key: AIRFLOW-6862 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6862 > Project: Apache Airflow > Issue Type: Bug > Components: core >Affects Versions: 1.10.9 >Reporter: Kamil Bregula >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (AIRFLOW-6862) Do not check the freshness of fresh DAG
[ https://issues.apache.org/jira/browse/AIRFLOW-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046816#comment-17046816 ] ASF GitHub Bot commented on AIRFLOW-6862: - mik-laj commented on pull request #7481: [AIRFLOW-6862] Do not check the freshness of fresh DAG URL: https://github.com/apache/airflow/pull/7481 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Do not check the freshness of fresh DAG > --- > > Key: AIRFLOW-6862 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6862 > Project: Apache Airflow > Issue Type: Bug > Components: core >Affects Versions: 1.10.9 >Reporter: Kamil Bregula >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [airflow] mik-laj merged pull request #7481: [AIRFLOW-6862] Do not check the freshness of fresh DAG
mik-laj merged pull request #7481: [AIRFLOW-6862] Do not check the freshness of fresh DAG URL: https://github.com/apache/airflow/pull/7481 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Resolved] (AIRFLOW-6650) Google Cloud Platform Connection key json documentation or code is wrong
[ https://issues.apache.org/jira/browse/AIRFLOW-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Bregula resolved AIRFLOW-6650. Fix Version/s: 2.0.0 Resolution: Invalid > Google Cloud Platform Connection key json documentation or code is wrong > > > Key: AIRFLOW-6650 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6650 > Project: Apache Airflow > Issue Type: Bug > Components: gcp >Affects Versions: 1.10.4, 1.10.5, 1.10.6, 1.10.7 >Reporter: Evgeniy Sokolov >Priority: Minor > Fix For: 2.0.0 > > > According to the documentation: > [https://airflow.readthedocs.io/en/stable/howto/connection/gcp.html] > The name of the external configuration for Keyfile JSON is: > * {{extra__google_cloud_platform__key_dict}} - Keyfile JSON > Excluding the prefix ({{extra__google_cloud_platform__}}), the name of the > variable is *key_dict*. > However, '*keyfile_dict*' is expected in the source code: > [https://github.com/apache/airflow/blob/master/airflow/gcp/hooks/base.py] > {code:java} > 146: keyfile_dict = self._get_field('keyfile_dict', None) # type: > Optional[str]{code} > [https://github.com/apache/airflow/blob/v1-10-stable/airflow/contrib/hooks/gcp_api_base_hook.py] > {code:java} > 138: keyfile_dict = self._get_field('keyfile_dict', False){code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
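The mismatch reported above is between the documented extra field name (`key_dict`) and the one the hook actually looks up (`keyfile_dict`). Hooks resolve a short field name by prepending a connection-type prefix; here is a simplified sketch of that lookup, modeled on the `_get_field` pattern the issue quotes but not copied from the hook code:

```python
PREFIX = "extra__google_cloud_platform__"

def get_field(extras: dict, field_name: str, default=None):
    # Look up a short field name under its prefixed key, falling back to
    # the bare name; return the default when neither key is present.
    prefixed = PREFIX + field_name
    if prefixed in extras:
        return extras[prefixed]
    return extras.get(field_name, default)

# What the connection actually stores:
extras = {"extra__google_cloud_platform__keyfile_dict": '{"type": "service_account"}'}

# The hook asks for 'keyfile_dict' and finds it; a user who followed the
# docs and looked for 'key_dict' would get the default back instead.
print(get_field(extras, "keyfile_dict"))
print(get_field(extras, "key_dict"))
```

This is why the issue was resolved as a documentation problem: the code path only ever consults `keyfile_dict`.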
[jira] [Resolved] (AIRFLOW-6682) Move GCP classes to providers package
[ https://issues.apache.org/jira/browse/AIRFLOW-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Bregula resolved AIRFLOW-6682. Fix Version/s: 2.0.0 Resolution: Fixed > Move GCP classes to providers package > - > > Key: AIRFLOW-6682 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6682 > Project: Apache Airflow > Issue Type: Improvement > Components: gcp, hooks, operators >Affects Versions: 1.10.7 >Reporter: Kamil Bregula >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [airflow] nuclearpinguin commented on a change in pull request #7572: [AIRFLOW-6948] Remove ASCII Airflow from version command
nuclearpinguin commented on a change in pull request #7572: [AIRFLOW-6948] Remove ASCII Airflow from version command URL: https://github.com/apache/airflow/pull/7572#discussion_r385223627 ## File path: airflow/settings.py ## @@ -173,8 +173,8 @@ def configure_orm(disable_connection_pool=False): # https://docs.sqlalchemy.org/en/13/core/pooling.html#disconnect-handling-pessimistic pool_pre_ping = conf.getboolean('core', 'SQL_ALCHEMY_POOL_PRE_PING', fallback=True) -log.info("settings.configure_orm(): Using pool settings. pool_size={}, max_overflow={}, " - "pool_recycle={}, pid={}".format(pool_size, max_overflow, pool_recycle, os.getpid())) +log.debug("settings.configure_orm(): Using pool settings. pool_size=%d, max_overflow=%d, " + "pool_recycle=%d, pid=%d", pool_size, max_overflow, pool_recycle, os.getpid()) Review comment: You are right, however, I am not sure if this log information is so crucial that it deserves a separate PR? @feluelle @mik-laj This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] feluelle commented on a change in pull request #7572: [AIRFLOW-6948] Remove ASCII Airflow from version command
feluelle commented on a change in pull request #7572: [AIRFLOW-6948] Remove ASCII Airflow from version command URL: https://github.com/apache/airflow/pull/7572#discussion_r385220207 ## File path: airflow/settings.py ## @@ -173,8 +173,8 @@ def configure_orm(disable_connection_pool=False): # https://docs.sqlalchemy.org/en/13/core/pooling.html#disconnect-handling-pessimistic pool_pre_ping = conf.getboolean('core', 'SQL_ALCHEMY_POOL_PRE_PING', fallback=True) -log.info("settings.configure_orm(): Using pool settings. pool_size={}, max_overflow={}, " - "pool_recycle={}, pid={}".format(pool_size, max_overflow, pool_recycle, os.getpid())) +log.debug("settings.configure_orm(): Using pool settings. pool_size=%d, max_overflow=%d, " + "pool_recycle=%d, pid=%d", pool_size, max_overflow, pool_recycle, os.getpid()) Review comment: I think this should be a separate PR. This log will not only be displayed when printing airflow version. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
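Besides lowering the level, the diff under review swaps eager `str.format` for logging's lazy %-style arguments: with lazy arguments, the formatting cost is only paid when the record actually passes the level filter. A small standard-library illustration (names invented for the demo):

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("example")

class Expensive:
    """Counts how many times it is rendered to a string."""
    def __init__(self):
        self.calls = 0
    def __str__(self):
        self.calls += 1
        return "expensive"

obj = Expensive()

# Eager: the message string is built (and __str__ called) even though
# DEBUG records are filtered out.
log.debug("value={}".format(obj))

# Lazy: logging checks the level first and never formats the arguments.
log.debug("value=%s", obj)

print(obj.calls)
```

Only the eager call renders `obj`, so the counter ends at 1 — which is why `log.debug("...%d...", args)` is the idiomatic form.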
[GitHub] [airflow] feluelle commented on a change in pull request #6576: [AIRFLOW-5922] Add option to specify the mysql client library used in MySqlHook
feluelle commented on a change in pull request #6576: [AIRFLOW-5922] Add option to specify the mysql client library used in MySqlHook URL: https://github.com/apache/airflow/pull/6576#discussion_r385217048 ## File path: airflow/providers/mysql/hooks/mysql.py ## @@ -113,8 +107,44 @@ def get_conn(self): conn_config['unix_socket'] = conn.extra_dejson['unix_socket'] if local_infile: conn_config["local_infile"] = 1 -conn = MySQLdb.connect(**conn_config) -return conn +return conn_config + +def _get_conn_config_mysql_connector_python(self, conn): +conn_config = { +'user': conn.login, +'password': conn.password or '', +'host': conn.host or 'localhost', +'database': self.schema or conn.schema or '', +'port': int(conn.port) if conn.port else 3306 +} + +if conn.extra_dejson.get('allow_local_infile', False): +conn_config["allow_local_infile"] = True + +return conn_config + +def get_conn(self): +""" +Establishes a connection to a mysql database +by extracting the connection configuration from the Airflow connection. + +.. note:: By default it connects to the database via the mysqlclient library. +But you can also choose the mysql-connector-python library which lets you connect through ssl +without any further ssl parameters required. + +:return: a mysql connection object +""" +conn = self.connection or self.get_connection(self.mysql_conn_id) # pylint: disable=no-member Review comment: https://github.com/apache/airflow/pull/6576#discussion_r362945137 :D This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
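The diff above introduces a second config builder so the hook can target either mysqlclient or mysql-connector-python, which expect different keyword names (`passwd`/`db` vs `password`/`database`). A standalone sketch of that dispatch pattern — function names and the plain-dict "connection" are invented for the example, not the hook's actual code:

```python
def build_mysqlclient_config(conn):
    # Keyword names expected by MySQLdb.connect (mysqlclient).
    return {
        "user": conn["login"],
        "passwd": conn["password"] or "",
        "host": conn["host"] or "localhost",
        "db": conn["schema"] or "",
    }

def build_connector_python_config(conn):
    # Keyword names expected by mysql.connector.connect
    # (mysql-connector-python), including an explicit port default.
    return {
        "user": conn["login"],
        "password": conn["password"] or "",
        "host": conn["host"] or "localhost",
        "database": conn["schema"] or "",
        "port": int(conn["port"]) if conn["port"] else 3306,
    }

def get_conn_config(conn, client_name="mysqlclient"):
    # Dispatch on the requested client library, as the PR's get_conn does.
    builders = {
        "mysqlclient": build_mysqlclient_config,
        "mysql-connector-python": build_connector_python_config,
    }
    builder = builders.get(client_name)
    if builder is None:
        raise ValueError(f"unsupported client library: {client_name!r}")
    return builder(conn)

conn = {"login": "airflow", "password": None, "host": None, "schema": "mydb", "port": None}
print(get_conn_config(conn, "mysql-connector-python")["port"])
```

Keeping one builder per library keeps the key-name differences in one place instead of scattering `if client == ...` checks through `get_conn`.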
[GitHub] [airflow] feluelle edited a comment on issue #6576: [AIRFLOW-5922] Add option to specify the mysql client library used in MySqlHook
feluelle edited a comment on issue #6576: [AIRFLOW-5922] Add option to specify the mysql client library used in MySqlHook URL: https://github.com/apache/airflow/pull/6576#issuecomment-591519540 The issue is the new version of `mysql-connector-python`, 8.0.19; I don't know exactly what caused it, but 8.0.18 works. Failed tests: https://travis-ci.org/apache/airflow/jobs/654452458?utm_medium=notification_source=github_status This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (AIRFLOW-6904) Airflow 1.10.7 suppresses Operator logs
[ https://issues.apache.org/jira/browse/AIRFLOW-6904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] RAHUL JAIN updated AIRFLOW-6904: Summary: Airflow 1.10.7 suppresses Operator logs (was: Airflow 1.10.9 suppresses Operator logs) > Airflow 1.10.7 suppresses Operator logs > --- > > Key: AIRFLOW-6904 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6904 > Project: Apache Airflow > Issue Type: Bug > Components: core >Affects Versions: 1.10.7, 1.10.8, 1.10.9 >Reporter: RAHUL JAIN >Priority: Critical > Attachments: 1.10.2.png, 1.10.9.png > > > After upgrading from 1.10.2 to 1.10.9, we noticed that the Operator logs are > no longer printed. See the attachments for comparison. There is also a Slack > channel discussion pointing to a recent change that may have broken this - > [https://apache-airflow.slack.com/archives/CCQ7EGB1P/p1582548602014200] > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [airflow] feluelle commented on a change in pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration
feluelle commented on a change in pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration URL: https://github.com/apache/airflow/pull/6007#discussion_r385213897 ## File path: docs/integration.rst ## @@ -17,7 +17,6 @@ Integration === - Review comment: And please check what you commit before you push :) `git status` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (AIRFLOW-6948) Remove ASCII Airflow from version command
[ https://issues.apache.org/jira/browse/AIRFLOW-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046766#comment-17046766 ] ASF GitHub Bot commented on AIRFLOW-6948: - nuclearpinguin commented on pull request #7572: [AIRFLOW-6948] Remove ASCII Airflow from version command URL: https://github.com/apache/airflow/pull/7572 > Remove ASCII Airflow from version command > - > > Key: AIRFLOW-6948 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6948 > Project: Apache Airflow > Issue Type: Improvement > Components: cli >Affects Versions: 2.0.0 >Reporter: Tomasz Urbaszek >Priority: Minor > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [airflow] nuclearpinguin opened a new pull request #7572: [AIRFLOW-6948] Remove ASCII Airflow from version command
nuclearpinguin opened a new pull request #7572: [AIRFLOW-6948] Remove ASCII Airflow from version command
URL: https://github.com/apache/airflow/pull/7572

Before:

```
root@3f957e731266:/opt/airflow# airflow version
[2020-02-27 16:09:09,092] {settings.py:177} INFO - settings.configure_orm(): Using pool settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=902
[multi-line ASCII Airflow logo, garbled in extraction]
v2.0.0.dev0
```

After:

```
root@3f957e731266:/opt/airflow# airflow version
v2.0.0.dev0
```

---

Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg)

Make sure to mark the boxes below before creating PR:

- [x] Description above provides context of the change
- [x] Commit message/PR title starts with `[AIRFLOW-]`. AIRFLOW- = JIRA ID*
- [x] Unit tests coverage for changes (not needed for documentation changes)
- [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
- [x] Relevant documentation is updated including usage instructions.
- [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).

\* For document-only changes commit message can start with `[AIRFLOW-]`.

---

In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
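The change the PR describes can be sketched as follows. This is a minimal, hypothetical illustration (not Airflow's actual CLI code; the banner text and helper names are invented) of a `version` subcommand that prints only the version string instead of a banner followed by the version:

```python
# Hedged sketch of the behaviour change: `airflow version` printing just the
# version string, instead of a multi-line ASCII banner plus the version.
# All names here are placeholders, not Airflow's real implementation.
import argparse

AIRFLOW_VERSION = "2.0.0.dev0"          # placeholder version string
ASCII_BANNER = "<multi-line ASCII logo elided>"


def version_output(show_banner: bool = False) -> str:
    """Build the text the `version` subcommand prints."""
    if show_banner:
        # Old behaviour: banner first, then the version.
        return f"{ASCII_BANNER}\n{AIRFLOW_VERSION}"
    # New behaviour described by the PR: just the version.
    return AIRFLOW_VERSION


def main(argv=None) -> None:
    parser = argparse.ArgumentParser(prog="airflow")
    subparsers = parser.add_subparsers(dest="command")
    subparsers.add_parser("version", help="Show the version")
    args = parser.parse_args(argv)
    if args.command == "version":
        print(version_output())


if __name__ == "__main__":
    main(["version"])
```

Run as written, the script prints only the bare version string, matching the "After" output quoted above.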
[GitHub] [airflow] feluelle commented on a change in pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration
feluelle commented on a change in pull request #6007: [AIRFLOW-2310] Enable AWS Glue Job Integration
URL: https://github.com/apache/airflow/pull/6007#discussion_r385209674

## File path: docs/integration.rst

```diff
@@ -17,7 +17,6 @@
 Integration
 ===
-
```

Review comment: Reset your commit and remove this file from being committed. https://stackoverflow.com/a/15321456
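The reviewer's suggestion (soft-reset the last commit, unstage the unwanted file, commit again) can be demonstrated like this. The repository, file names, and commit messages below are hypothetical stand-ins built in a throwaway directory; only the three `git reset`/`git commit` steps are the actual technique:

```shell
# Demo of removing an accidentally committed file from the last commit.
set -e
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.email you@example.com && git config user.name you

# A base commit, then a commit that accidentally includes docs/integration.rst.
echo base > code.py && git add code.py && git commit -qm "base"
mkdir docs && echo leftover > docs/integration.rst
echo fix >> code.py
git add -A && git commit -qm "AIRFLOW-2310 change"

git reset --soft HEAD~1               # undo the commit, keep changes staged
git reset HEAD docs/integration.rst   # unstage only the unwanted file
git commit -qm "AIRFLOW-2310 change"  # re-commit without it

# The unwanted file is no longer part of the latest commit:
git show --name-only --pretty=format: HEAD
```

`git reset --soft` moves HEAD back without touching the index or working tree, which is why the remaining staged changes can be re-committed immediately.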
[jira] [Commented] (AIRFLOW-6361) Run LocalTaskJob directly in Celery task
[ https://issues.apache.org/jira/browse/AIRFLOW-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046757#comment-17046757 ]

ASF GitHub Bot commented on AIRFLOW-6361:
-----------------------------------------

stale[bot] commented on pull request #6905: [AIRFLOW-6361] Run LocalTaskJob directly in Celery task
URL: https://github.com/apache/airflow/pull/6905

> Run LocalTaskJob directly in Celery task
> ----------------------------------------
>
>                 Key: AIRFLOW-6361
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6361
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: executors
>    Affects Versions: 1.10.6
>            Reporter: Kamil Bregula
>            Priority: Major
>              Labels: performance
>
> Hello,
> Celery first runs a CLI command, which contains LocalTaskJob. LocalTaskJob is responsible for starting the next user-code process. This level of isolation is redundant because LocalTaskJob doesn't execute unsafe code. The CLI command is run by creating a new process, not by forking, which is an expensive operation.
> According to preliminary measurements, this change results in a performance increase of close to 30%.
> I will provide more information in the PR.
> Best regards
> Kamil Bregula
>
> After:
> ```
> real 0m38.394s  user 0m4.340s  sys 0m1.600s
> real 0m38.355s  user 0m4.700s  sys 0m1.340s
> real 0m38.675s  user 0m4.760s  sys 0m1.530s
> real 0m38.488s  user 0m4.770s  sys 0m1.280s
> real 0m38.434s  user 0m4.600s  sys 0m1.390s
> real 0m38.378s  user 0m4.500s  sys 0m1.270s
> real 0m38.106s  user 0m4.200s  sys 0m1.100s
> real 0m38.082s  user 0m4.170s  sys 0m1.030s
> real 0m38.173s  user 0m4.290s  sys 0m1.340s
> real 0m38.161s  user 0m4.460s  sys 0m1.370s
> ```
> Before:
> ```
> real 0m53.488s  user 0m5.140s  sys 0m1.700s
> real 1m8.288s   user 0m6.430s  sys 0m2.200s
> real 0m53.371s  user 0m5.330s  sys 0m1.630s
> real 0m58.939s  user 0m6.470s  sys 0m1.730s
> real 0m53.255s  user 0m4.950s  sys 0m1.640s
> real 0m58.802s  user 0m5.970s  sys 0m1.790s
> real 0m58.449s  user 0m5.380s  sys 0m1.580s
> real 0m53.308s  user 0m5.120s  sys 0m1.430s
> real 0m53.485s  user 0m5.220s  sys 0m1.290s
> real 0m53.387s  user 0m5.020s  sys 0m1.590s
> ```
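The overhead the issue describes (each task paying for a fresh process launch instead of an in-process call) can be illustrated with a toy benchmark. This is a hedged, self-contained sketch, not Airflow's code: the two functions are invented stand-ins for "run LocalTaskJob in the worker" versus "spawn a new CLI process per task".

```python
# Toy comparison of in-process execution vs. spawning a fresh interpreter
# per unit of work. Interpreter start-up dominates the spawned variant,
# which is the cost AIRFLOW-6361 aims to remove.
import subprocess
import sys
import time


def run_task_directly() -> None:
    """Stand-in for executing the job in the worker process itself."""
    sum(range(1000))


def run_task_via_new_process() -> None:
    """Stand-in for launching a new CLI process for the same work."""
    subprocess.run([sys.executable, "-c", "sum(range(1000))"], check=True)


def timed(fn, n: int = 5) -> float:
    """Total wall-clock time for n runs of fn."""
    start = time.perf_counter()
    for _ in range(n):
        fn()
    return time.perf_counter() - start


if __name__ == "__main__":
    direct = timed(run_task_directly)
    spawned = timed(run_task_via_new_process)
    print(f"direct: {direct:.4f}s  spawned: {spawned:.4f}s")
```

The absolute numbers depend on the machine, but the spawned variant is reliably slower, consistent in direction (though not magnitude) with the ~30% end-to-end gain quoted in the issue.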
[GitHub] [airflow] stale[bot] closed pull request #6905: [AIRFLOW-6361] Run LocalTaskJob directly in Celery task
stale[bot] closed pull request #6905: [AIRFLOW-6361] Run LocalTaskJob directly in Celery task
URL: https://github.com/apache/airflow/pull/6905
[GitHub] [airflow] codecov-io commented on issue #7570: [AIRFLOW-6946] [WIP] Switch to MySQL 5.7 in 2.0 as base
codecov-io commented on issue #7570: [AIRFLOW-6946] [WIP] Switch to MySQL 5.7 in 2.0 as base
URL: https://github.com/apache/airflow/pull/7570#issuecomment-592035504

# [Codecov](https://codecov.io/gh/apache/airflow/pull/7570?src=pr=h1) Report
> Merging [#7570](https://codecov.io/gh/apache/airflow/pull/7570?src=pr=desc) into [master](https://codecov.io/gh/apache/airflow/commit/6c266fce9d4c8ba3737fdedab0d09473c841d657?src=pr=desc) will **decrease** coverage by `0.31%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/airflow/pull/7570/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/airflow/pull/7570?src=pr=tree)

```diff
@@            Coverage Diff             @@
##           master    #7570      +/-   ##
==========================================
- Coverage   86.81%   86.49%   -0.32%
==========================================
  Files         896      896
  Lines       42626    42628       +2
==========================================
- Hits        37005    36873     -132
- Misses       5621     5755     +134
```

| [Impacted Files](https://codecov.io/gh/apache/airflow/pull/7570?src=pr=tree) | Coverage Δ | |
|---|---|---|
| [...w/providers/apache/hive/operators/mysql\_to\_hive.py](https://codecov.io/gh/apache/airflow/pull/7570/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvYXBhY2hlL2hpdmUvb3BlcmF0b3JzL215c3FsX3RvX2hpdmUucHk=) | `35.84% <0%> (-64.16%)` | :arrow_down: |
| [airflow/security/kerberos.py](https://codecov.io/gh/apache/airflow/pull/7570/diff?src=pr=tree#diff-YWlyZmxvdy9zZWN1cml0eS9rZXJiZXJvcy5weQ==) | `30.43% <0%> (-45.66%)` | :arrow_down: |
| [airflow/providers/mysql/operators/mysql.py](https://codecov.io/gh/apache/airflow/pull/7570/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvbXlzcWwvb3BlcmF0b3JzL215c3FsLnB5) | `55% <0%> (-45%)` | :arrow_down: |
| [airflow/executors/celery\_executor.py](https://codecov.io/gh/apache/airflow/pull/7570/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvY2VsZXJ5X2V4ZWN1dG9yLnB5) | `50.67% <0%> (-37.84%)` | :arrow_down: |
| [airflow/utils/sqlalchemy.py](https://codecov.io/gh/apache/airflow/pull/7570/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zcWxhbGNoZW15LnB5) | `79.45% <0%> (-5.48%)` | :arrow_down: |
| [airflow/executors/base\_executor.py](https://codecov.io/gh/apache/airflow/pull/7570/diff?src=pr=tree#diff-YWlyZmxvdy9leGVjdXRvcnMvYmFzZV9leGVjdXRvci5weQ==) | `93.67% <0%> (-2.54%)` | :arrow_down: |
| [airflow/providers/apache/hive/hooks/hive.py](https://codecov.io/gh/apache/airflow/pull/7570/diff?src=pr=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvYXBhY2hlL2hpdmUvaG9va3MvaGl2ZS5weQ==) | `76.02% <0%> (-1.54%)` | :arrow_down: |
| [airflow/hooks/dbapi\_hook.py](https://codecov.io/gh/apache/airflow/pull/7570/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9kYmFwaV9ob29rLnB5) | `90.9% <0%> (-0.83%)` | :arrow_down: |
| [airflow/jobs/scheduler\_job.py](https://codecov.io/gh/apache/airflow/pull/7570/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzL3NjaGVkdWxlcl9qb2IucHk=) | `89.95% <0%> (-0.15%)` | :arrow_down: |
| [airflow/kubernetes/pod\_generator.py](https://codecov.io/gh/apache/airflow/pull/7570/diff?src=pr=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9nZW5lcmF0b3IucHk=) | `96.51% <0%> (ø)` | :arrow_up: |
| ... and [1 more](https://codecov.io/gh/apache/airflow/pull/7570/diff?src=pr=tree-more) | | |

------

[Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/7570?src=pr=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/7570?src=pr=footer). Last update [6c266fc...73a3518](https://codecov.io/gh/apache/airflow/pull/7570?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).