[jira] [Commented] (AIRFLOW-2957) Remove old Sensor references
[ https://issues.apache.org/jira/browse/AIRFLOW-2957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592500#comment-16592500 ] ASF GitHub Bot commented on AIRFLOW-2957: - Fokko opened a new pull request #3808: [AIRFLOW-2957] Remove obselete sensor references URL: https://github.com/apache/incubator-airflow/pull/3808 Make sure you have checked _all_ steps below. ### Jira - [ ] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR" - https://issues.apache.org/jira/browse/AIRFLOW-XXX - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a Jira issue. ### Description - [ ] Here are some details about my PR, including screenshots of any UI changes: ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [ ] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 1. Subject is limited to 50 characters (not including Jira issue reference) 1. Subject does not end with a period 1. Subject uses the imperative mood ("add", not "adding") 1. Body wraps at 72 characters 1. Body explains "what" and "why", not "how" ### Documentation - [ ] In case of new functionality, my PR adds documentation that describes how to use it. - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added. ### Code Quality - [ ] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff` This is an automated message from the Apache Git Service. 
To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Remove old Sensor references > > > Key: AIRFLOW-2957 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2957 > Project: Apache Airflow > Issue Type: Bug >Reporter: Fokko Driesprong >Priority: Major > > Remove old sensor references: > https://github.com/apache/incubator-airflow/blob/master/airflow/operators/__init__.py#L57-L71 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] Fokko commented on issue #3760: [AIRFLOW-2909] Deprecate airflow.operators.sensors module
Fokko commented on issue #3760: [AIRFLOW-2909] Deprecate airflow.operators.sensors module URL: https://github.com/apache/incubator-airflow/pull/3760#issuecomment-415944811 @tedmiston We still missed this one: https://github.com/apache/incubator-airflow/pull/3808/files
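For context, deprecating an import path while keeping it working (as the sensors-module move above does) is usually done with a shim that re-exports the relocated class and emits a DeprecationWarning. A minimal sketch, with illustrative stand-in names rather than Airflow's actual classes:

```python
# Hypothetical sketch of a module-deprecation shim -- BaseSensorOperator and
# deprecated_import are illustrative names, not Airflow's actual code.
import warnings


class BaseSensorOperator:
    """Stand-in for the class at its new module location."""


def deprecated_import(new_path, obj):
    """Re-export obj from the old module path, warning callers to migrate."""
    warnings.warn(
        "This import path is deprecated; import from %s instead." % new_path,
        DeprecationWarning,
        stacklevel=2,
    )
    return obj
```

The old module would call such a helper at import time so existing DAGs keep working for one release cycle while surfacing the new location.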
[GitHub] Fokko opened a new pull request #3808: [AIRFLOW-2957] Remove obselete sensor references
Fokko opened a new pull request #3808: [AIRFLOW-2957] Remove obselete sensor references URL: https://github.com/apache/incubator-airflow/pull/3808
[jira] [Created] (AIRFLOW-2957) Remove old Sensor references
Fokko Driesprong created AIRFLOW-2957: - Summary: Remove old Sensor references Key: AIRFLOW-2957 URL: https://issues.apache.org/jira/browse/AIRFLOW-2957 Project: Apache Airflow Issue Type: Bug Reporter: Fokko Driesprong Remove old sensor references: https://github.com/apache/incubator-airflow/blob/master/airflow/operators/__init__.py#L57-L71
[GitHub] ksaagariconic opened a new pull request #3807: Adding THE ICONIC to the list of orgs using Airflow
ksaagariconic opened a new pull request #3807: Adding THE ICONIC to the list of orgs using Airflow URL: https://github.com/apache/incubator-airflow/pull/3807
[GitHub] XD-DENG edited a comment on issue #3793: [AIRFLOW-2948] Arg check & better doc - SSHOperator & SFTPOperator
XD-DENG edited a comment on issue #3793: [AIRFLOW-2948] Arg check & better doc - SSHOperator & SFTPOperator URL: https://github.com/apache/incubator-airflow/pull/3793#issuecomment-415934314 Hi @feng-tao , I have - Added instance checking for the argument `ssh_hook`. If the `ssh_hook` given by the user is not an instance of `SSHHook`, it is now handled properly. - Updated the tests for both operators accordingly (covering different combinations of the arguments `ssh_hook` and `ssh_conn_id`). Both pass. The CI is failing due to: - test_poke (tests.contrib.sensors.test_aws_redshift_cluster_sensor.`TestAwsRedshiftClusterSensor`) ... ERROR - test_poke_cluster_not_found (tests.contrib.sensors.test_aws_redshift_cluster_sensor.`TestAwsRedshiftClusterSensor`) ... ERROR - test_poke_false (tests.contrib.sensors.test_aws_redshift_cluster_sensor.`TestAwsRedshiftClusterSensor`) ... ERROR None of these test cases were touched in this PR. The failures appear to be exceptions like `Unable to locate credentials`; not sure what is happening. A similar issue is affecting a few other PRs.
[GitHub] XD-DENG commented on issue #3793: [AIRFLOW-2948] Arg check & better doc - SSHOperator & SFTPOperator
XD-DENG commented on issue #3793: [AIRFLOW-2948] Arg check & better doc - SSHOperator & SFTPOperator URL: https://github.com/apache/incubator-airflow/pull/3793#issuecomment-415934314 Hi @feng-tao , I have - Added instance checking for the argument `ssh_hook`. If the `ssh_hook` given by the user is not an instance of `SSHHook`, it is now handled properly. - Updated the tests for both operators accordingly and both pass. The CI is failing due to: - test_poke (tests.contrib.sensors.test_aws_redshift_cluster_sensor.TestAwsRedshiftClusterSensor) ... ERROR - test_poke_cluster_not_found (tests.contrib.sensors.test_aws_redshift_cluster_sensor.TestAwsRedshiftClusterSensor) ... ERROR - test_poke_false (tests.contrib.sensors.test_aws_redshift_cluster_sensor.TestAwsRedshiftClusterSensor) ... ERROR None of these test cases were touched in this PR. The failures appear to be exceptions like `Unable to locate credentials`; not sure what is happening.
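The instance check described above can be sketched as follows. This is a hypothetical illustration, not the PR's actual code: `SSHHook` here is a stand-in class (not `airflow.contrib.hooks.ssh_hook.SSHHook`), the helper name is invented, and the PR may handle an invalid hook differently (e.g. log and fall back rather than raise):

```python
# Hypothetical sketch of validating an ssh_hook argument with a fallback to
# ssh_conn_id; all names are illustrative, not Airflow's API.


class SSHHook:
    """Stand-in for the real SSH hook class."""

    def __init__(self, ssh_conn_id=None):
        self.ssh_conn_id = ssh_conn_id


def resolve_ssh_hook(ssh_hook=None, ssh_conn_id=None):
    """Return a usable SSHHook, rejecting objects of the wrong type."""
    if ssh_hook is not None:
        if not isinstance(ssh_hook, SSHHook):
            raise TypeError("ssh_hook must be an instance of SSHHook")
        return ssh_hook
    if ssh_conn_id is not None:
        return SSHHook(ssh_conn_id=ssh_conn_id)
    raise ValueError("either ssh_hook or ssh_conn_id must be provided")
```

Checking the type up front turns a confusing downstream failure into an immediate, descriptive error at operator construction time.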
[GitHub] justinholmes opened a new pull request #3806: [AIRFLOW-2956] added kubernetes tolerations
justinholmes opened a new pull request #3806: [AIRFLOW-2956] added kubernetes tolerations URL: https://github.com/apache/incubator-airflow/pull/3806
[jira] [Created] (AIRFLOW-2956) Kubernetes tolerations for pod operator
Justin Holmes created AIRFLOW-2956: -- Summary: Kubernetes tolerations for pod operator Key: AIRFLOW-2956 URL: https://issues.apache.org/jira/browse/AIRFLOW-2956 Project: Apache Airflow Issue Type: Improvement Reporter: Justin Holmes Allowing users to specify Kubernetes tolerations would be nice.
[jira] [Commented] (AIRFLOW-2956) Kubernetes tolerations for pod operator
[ https://issues.apache.org/jira/browse/AIRFLOW-2956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592359#comment-16592359 ] ASF GitHub Bot commented on AIRFLOW-2956: - justinholmes opened a new pull request #3806: [AIRFLOW-2956] added kubernetes tolerations URL: https://github.com/apache/incubator-airflow/pull/3806 > Kubernetes tolerations for pod operator > --- > > Key: AIRFLOW-2956 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2956 > Project: Apache Airflow > Issue Type: Improvement > Reporter: Justin Holmes > Priority: Minor > > Allowing users to specify Kubernetes tolerations would be nice.
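For context, a Kubernetes toleration is a small spec on a pod that lets it schedule onto nodes carrying matching taints. The field names below follow the standard Kubernetes PodSpec; how the pod operator would accept them is exactly what this PR adds, so treat the surrounding parameter shape as an assumption:

```python
# Tolerations expressed as the kind of dicts an operator parameter could
# carry; only the Kubernetes field names (key/operator/value/effect) are
# standard -- the rest is illustrative.
tolerations = [
    {
        # schedule onto nodes tainted dedicated=airflow:NoSchedule
        "key": "dedicated",
        "operator": "Equal",
        "value": "airflow",
        "effect": "NoSchedule",
    },
    {
        # tolerate any taint with this key, regardless of its value
        "key": "experimental",
        "operator": "Exists",
    },
]
```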
[GitHub] r39132 closed pull request #3801: [AIRFLOW-XXX] Add 8fit to list of users
r39132 closed pull request #3801: [AIRFLOW-XXX] Add 8fit to list of users URL: https://github.com/apache/incubator-airflow/pull/3801 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: diff --git a/README.md b/README.md index 606f2da745..1f932d98cb 100644 --- a/README.md +++ b/README.md @@ -93,6 +93,7 @@ if you may. Currently **officially** using Airflow: 1. [6play](https://www.6play.fr) [[@lemourA](https://github.com/lemoura), [@achaussende](https://github.com/achaussende), [@d-nguyen](https://github.com/d-nguyen), [@julien-gm](https://github.com/julien-gm)] +1. [8fit](https://8fit.com/) [[@nicor88](https://github.com/nicor88), [@frnzska](https://github.com/frnzska)] 1. [AdBOOST](https://www.adboost.sk) [[AdBOOST](https://github.com/AdBOOST)] 1. [Agari](https://github.com/agaridata) [[@r39132](https://github.com/r39132)] 1. [Airbnb](http://airbnb.io/) [[@mistercrunch](https://github.com/mistercrunch), [@artwr](https://github.com/artwr)]
[GitHub] r39132 commented on issue #3801: [AIRFLOW-XXX] Add 8fit to list of users
r39132 commented on issue #3801: [AIRFLOW-XXX] Add 8fit to list of users URL: https://github.com/apache/incubator-airflow/pull/3801#issuecomment-415914550 +1 ... welcome to the Airflow community!
[GitHub] r39132 commented on issue #3802: Adding THE ICONIC to the list of orgs with AirFlow
r39132 commented on issue #3802: Adding THE ICONIC to the list of orgs with AirFlow URL: https://github.com/apache/incubator-airflow/pull/3802#issuecomment-415914413 @ksaagariconic Please observe alphabetic ordering -- so, I expect this entry to show up after `Tails.com` and before `Thinking Machines`.
[GitHub] jakahn opened a new pull request #3805: [AIRFLOW-2062] Add per-connection KMS encryption.
jakahn opened a new pull request #3805: [AIRFLOW-2062] Add per-connection KMS encryption. URL: https://github.com/apache/incubator-airflow/pull/3805 ### Jira - [X] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. - https://issues.apache.org/jira/browse/AIRFLOW-2062 ### Description - [X] Here are some details about my PR, including screenshots of any UI changes: - For an in-depth explanation of these changes, please see this [design doc](https://docs.google.com/document/d/1qaucGw52aoR96swHQqIYN9nQXn2QKSOWvw6-KyI6Wwc/edit?usp=sharing). - Connections can now be encrypted using a local key that is encrypted using a KMS before being stored, instead of the global fernet key. Enabling this behavior requires creating a connection through the CLI (web UI in a coming PR) with the new KMS fields specified. - `Connection` and `BaseHook` were also refactored to support loading connections from the `Connection` class (instead of `BaseHook`). ### Tests - [X] My PR adds the following unit tests: - `tests/contrib/hooks/test_gcp_kms_hook.py` - `test_encrypt_conn` - `test_encrypt_conn` - `tests/models.py` - `FernetTest` - `test_connection_kms_encryption` - `test_connection_from_uri_kms_encryption` - `test_get_kms_hook_missing` - `test_update_kms` ### Commits - [X] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 1. Subject is limited to 50 characters (not including Jira issue reference) 1. Subject does not end with a period 1. Subject uses the imperative mood ("add", not "adding") 1. Body wraps at 72 characters 1. 
Body explains "what" and "why", not "how" ### Documentation - [X] In case of new functionality, my PR adds documentation that describes how to use it. ### Code Quality - [X] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
[jira] [Commented] (AIRFLOW-2062) Support fine-grained Connection encryption
[ https://issues.apache.org/jira/browse/AIRFLOW-2062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592266#comment-16592266 ] ASF GitHub Bot commented on AIRFLOW-2062: - jakahn opened a new pull request #3805: [AIRFLOW-2062] Add per-connection KMS encryption. URL: https://github.com/apache/incubator-airflow/pull/3805 > Support fine-grained Connection encryption > -- > > Key: AIRFLOW-2062 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2062 > Project: Apache Airflow > Issue Type: Improvement > Components: contrib > Reporter: Wilson Lian > Priority: Minor > > This effort targets containerized tasks (e.g., those launched by KubernetesExecutor). Under that paradigm, each task could potentially operate under different credentials, and fine-grained Connection encryption will enable an administrator to restrict which connections can be accessed by which tasks.
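The per-connection scheme described in the PR is an instance of envelope encryption: each connection gets its own data key, the data key is wrapped (encrypted) by a KMS master key, and only the wrapped key plus the ciphertext are stored. A dependency-free toy sketch of the idea, where XOR stands in for real ciphers purely to keep the example self-contained (do not use this for real secrets; none of these names are Airflow's API):

```python
# Toy envelope encryption: a fake KMS wraps per-connection data keys.
# XOR is a placeholder for a real cipher -- illustration only.
import os


def _xor(data: bytes, key: bytes) -> bytes:
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


class ToyKMS:
    """Stand-in for a cloud KMS: wraps and unwraps data keys."""

    def __init__(self):
        self._master = os.urandom(32)  # never leaves the KMS

    def wrap(self, data_key: bytes) -> bytes:
        return _xor(data_key, self._master)

    def unwrap(self, wrapped: bytes) -> bytes:
        return _xor(wrapped, self._master)


def encrypt_conn(kms: ToyKMS, secret: bytes):
    """Encrypt with a fresh data key; store only the wrapped key + ciphertext."""
    data_key = os.urandom(32)
    return kms.wrap(data_key), _xor(secret, data_key)


def decrypt_conn(kms: ToyKMS, wrapped_key: bytes, ciphertext: bytes) -> bytes:
    return _xor(ciphertext, kms.unwrap(wrapped_key))
```

The point of the design is that the global fernet key is no longer a single point of compromise: access to an individual connection can be gated on KMS permissions for that connection's wrapped key.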
[jira] [Commented] (AIRFLOW-1701) CSRF Error on Dag Runs Page
[ https://issues.apache.org/jira/browse/AIRFLOW-1701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592241#comment-16592241 ] Gabriel Silk commented on AIRFLOW-1701: --- This should be fixed in the RBAC UI when my PR is merged: https://github.com/apache/incubator-airflow/pull/3804 > CSRF Error on Dag Runs Page > --- > > Key: AIRFLOW-1701 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1701 > Project: Apache Airflow > Issue Type: Bug > Components: DagRun > Affects Versions: Airflow 1.8 > Environment: Ubuntu 16.04 LTS; Google Chrome 61.0.3163.100 (Official Build) (64-bit) > Reporter: James Crowley > Priority: Minor > Attachments: CSRF_Error.png > > > When attempting to modify the state of a Dag Run on /admin/dagrun, I receive the following error message: > > 400 Bad Request > Bad Request > CSRF token missing or incorrect. > I am able to perform AJAX requests on other pages without issue. The missing CSRF token appears to be isolated to the AJAX call to the /admin/dagrun/ajax/update endpoint.
[jira] [Issue Comment Deleted] (AIRFLOW-2866) Missing CSRF Token Error on Web UI Create/Update Operations
[ https://issues.apache.org/jira/browse/AIRFLOW-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriel Silk updated AIRFLOW-2866: -- Comment: was deleted (was: Here's a PR: https://github.com/apache/incubator-airflow/pull/3804) > Missing CSRF Token Error on Web UI Create/Update Operations > --- > > Key: AIRFLOW-2866 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2866 > Project: Apache Airflow > Issue Type: Bug > Components: webapp > Reporter: Jasper Kahn > Priority: Major > > Attempting to modify or delete many resources (such as Connections or Users) results in a 400 from the webserver: > {quote}{{Bad Request}} > {{The CSRF session token is missing.}}{quote} > Logs report: > {quote}{{[2018-08-07 18:45:15,771] \{csrf.py:251} INFO - The CSRF session token is missing.}} > {{192.168.9.1 - - [07/Aug/2018:18:45:15 +] "POST /admin/connection/delete/ HTTP/1.1" 400 150 "http://localhost:8081/admin/connection/"; "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.84 Safari/537.36"}}{quote} > Chrome dev tools show the CSRF token is present in the request payload.
[jira] [Commented] (AIRFLOW-2866) Missing CSRF Token Error on Web UI Create/Update Operations
[ https://issues.apache.org/jira/browse/AIRFLOW-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592239#comment-16592239 ] Gabriel Silk commented on AIRFLOW-2866: --- Here's a PR: https://github.com/apache/incubator-airflow/pull/3804 > Missing CSRF Token Error on Web UI Create/Update Operations > --- > > Key: AIRFLOW-2866 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2866
[jira] [Commented] (AIRFLOW-2866) Missing CSRF Token Error on Web UI Create/Update Operations
[ https://issues.apache.org/jira/browse/AIRFLOW-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592237#comment-16592237 ] ASF GitHub Bot commented on AIRFLOW-2866: - gsilk opened a new pull request #3804: [AIRFLOW-2866] Fix missing CSRF token head when using RBAC UI URL: https://github.com/apache/incubator-airflow/pull/3804 This fixes the missing CSRF header when using the RBAC UI, which breaks the pause/un-pause button for DAGs. > Missing CSRF Token Error on Web UI Create/Update Operations > --- > > Key: AIRFLOW-2866 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2866
[GitHub] gsilk opened a new pull request #3804: [AIRFLOW-2866] Fix missing CSRF token head when using RBAC UI
gsilk opened a new pull request #3804: [AIRFLOW-2866] Fix missing CSRF token head when using RBAC UI URL: https://github.com/apache/incubator-airflow/pull/3804 This fixes the missing CSRF header when using the RBAC UI, which breaks the pause/un-pause button for DAGs.
[jira] [Commented] (AIRFLOW-2866) Missing CSRF Token Error on Web UI Create/Update Operations
[ https://issues.apache.org/jira/browse/AIRFLOW-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592233#comment-16592233 ] Gabriel Silk commented on AIRFLOW-2866: --- This doesn't resolve the issue when using the RBAC UI. I'll submit a patch for that. > Missing CSRF Token Error on Web UI Create/Update Operations > --- > > Key: AIRFLOW-2866 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2866
[GitHub] mtp401 edited a comment on issue #3803: [AIRFLOW-2779] Restore Copyright notice of GHE auth backend
mtp401 edited a comment on issue #3803: [AIRFLOW-2779] Restore Copyright notice of GHE auth backend URL: https://github.com/apache/incubator-airflow/pull/3803#issuecomment-415884283 @bolkedebruin I can say with certainty that this contribution was made while Airflow was still under Airbnb. The pull request has since been migrated, as the repository was transferred to the ASF (https://github.com/apache/incubator-airflow/pull/802). I'm totally fine with @ashb's changes in this PR, and this certainly seems to fall squarely under the ASF's policies regarding attributions: "[Use the NOTICE file to collect copyright notices and required attributions.](https://www.apache.org/legal/src-headers.html#overview)". This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] mtp401 commented on issue #3803: [AIRFLOW-2779] Restore Copyright notice of GHE auth backend
mtp401 commented on issue #3803: [AIRFLOW-2779] Restore Copyright notice of GHE auth backend URL: https://github.com/apache/incubator-airflow/pull/3803#issuecomment-415884283 @bolkedebruin I can say with certainty that this contribution was made while Airflow was still under Airbnb. The pull request has since been migrated, as the repository was transferred to the ASF (https://github.com/apache/incubator-airflow/pull/802). I'm totally fine with @ashb's changes in this PR, and this certainly seems to fall squarely under the ASF's policies regarding attributions: [Use the NOTICE file to collect copyright notices and required attributions.](https://www.apache.org/legal/src-headers.html#overview). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] bolkedebruin edited a comment on issue #3803: [AIRFLOW-2779] Restore Copyright notice of GHE auth backend
bolkedebruin edited a comment on issue #3803: [AIRFLOW-2779] Restore Copyright notice of GHE auth backend URL: https://github.com/apache/incubator-airflow/pull/3803#issuecomment-415868711 I'm sorry, this is disallowed by Apache policy: http://www.apache.org/dev/apply-license.html#contributor-copyright @mtp401 Your contribution was made after the transfer to the foundation and as such is subject to the policy. You do retain copyright and you are not transferring it to the foundation. What we could do (I think) is add a contributor file and list your name if you require it. I do think I should have notified you. My apologies. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] bolkedebruin commented on issue #3803: [AIRFLOW-2779] Restore Copyright notice of GHE auth backend
bolkedebruin commented on issue #3803: [AIRFLOW-2779] Restore Copyright notice of GHE auth backend URL: https://github.com/apache/incubator-airflow/pull/3803#issuecomment-415868711 I'm sorry, this is disallowed by Apache policy: http://www.apache.org/dev/apply-license.html#contributor-copyright @mtp401 Your contribution was made after the transfer to the foundation and as such is subject to the policy. You do retain copyright and you are not transferring it to the foundation. What we could do (I think) is add a contributor file and list your name if you require it. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Assigned] (AIRFLOW-1998) Implement Databricks Operator for jobs/run-now endpoint
[ https://issues.apache.org/jira/browse/AIRFLOW-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Israel Knight reassigned AIRFLOW-1998: -- Assignee: Israel Knight > Implement Databricks Operator for jobs/run-now endpoint > --- > > Key: AIRFLOW-1998 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1998 > Project: Apache Airflow > Issue Type: Improvement > Components: hooks, operators >Affects Versions: 1.9.0 >Reporter: Diego Rabatone Oliveira >Assignee: Israel Knight >Priority: Major > > Implement an Operator to deal with the Databricks '2.0/jobs/run-now' API endpoint. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2779) Verify and correct licenses
[ https://issues.apache.org/jira/browse/AIRFLOW-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591978#comment-16591978 ] Matt Pelland commented on AIRFLOW-2779: --- Just wanted to cross-post from GitHub for posterity: https://github.com/apache/incubator-airflow/pull/3803#issuecomment-415831639 Thanks again [~ashb] for the quick resolution. > Verify and correct licenses > --- > > Key: AIRFLOW-2779 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2779 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Bolke de Bruin >Priority: Major > Labels: licenses > Fix For: 1.10.0 > > > 1. /airflow/security/utils.py > 2. ./airflow/security/kerberos.py > 3. ./airflow/www_rbac/static/jqClock.min.js > 4. ./airflow/www/static/bootstrap3-typeahead.min.js > 5. ./apache-airflow-1.10.0rc2+incubating/scripts/ci/flake8_diff.sh > 6. [https://www.apache.org/legal/resolved.html#optional] > 7. ./docs/license.rst > 8. airflow/contrib/auth/backends/google_auth.py > 9. /airflow/contrib/auth/backends/github_enterprise_auth.py > 10. /airflow/contrib/hooks/ssh_hook.py > 11. /airflow/minihivecluster.py > These files [1][2] seem to be 3rd-party ALv2-licensed files that refer to a NOTICE file; the information in that NOTICE file (at the very least the copyright info) should be in your NOTICE file. This should also be noted in LICENSE. > > LICENSE is: > - missing jQuery clock [3] and typeahead [4]; as they are ALv2 it's not required to list them, but it's a good idea to do so. > - missing the license for this [5] > - this file [7] oddly has © 2016 GitHub, Inc. at the bottom of it > > * Year in NOTICE is not correct: "2016 and onwards" isn't valid, as copyright has an expiry date > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] mtp401 commented on issue #3803: [AIRFLOW-2779] Restore Copyright notice of GHE auth backend
mtp401 commented on issue #3803: [AIRFLOW-2779] Restore Copyright notice of GHE auth backend URL: https://github.com/apache/incubator-airflow/pull/3803#issuecomment-415831639 @ashb thanks for the quick response! This is exactly what I was looking for and if that works for the ASF it works for me. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (AIRFLOW-2955) Kubernetes pod operator: Unable to set requests/limits on task pods
[ https://issues.apache.org/jira/browse/AIRFLOW-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jon Davies updated AIRFLOW-2955: Description: When I try and set a resource limit/request on a DAG task with the KubernetesPodOperator as follows: {code:java} resources={"limit_cpu": 1, "request_cpu": 1}, {code} ...I get: {code:java} [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task Traceback (most recent call last): [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/bin/airflow", line 32, in [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task args.func(args) [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/utils/cli.py", line 74, in wrapper [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task return f(*args, **kwargs) [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/bin/cli.py", line 498, in run [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task _run(args, dag, ti) [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/bin/cli.py", line 402, in _run [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task pool=args.pool, [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/utils/db.py", line 74, in wrapper [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task return func(*args, **kwargs) [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/models.py", line 1633, in _run_raw_task [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask 
task result = task_copy.execute(context=context) [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/contrib/operators/kubernetes_pod_operator.py", line 115, in execute [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task get_logs=self.get_logs) [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/contrib/kubernetes/pod_launcher.py", line 71, in run_pod [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task resp = self.run_pod_async(pod) [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/contrib/kubernetes/pod_launcher.py", line 52, in run_pod_async [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task req = self.kube_req_factory.create(pod) [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/contrib/kubernetes/kubernetes_request_factory/pod_request_factory.py", line 56, in create [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task self.extract_resources(pod, req) [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/contrib/kubernetes/kubernetes_request_factory/kubernetes_request_factory.py", line 160, in extract_resources [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task if not pod.resources or pod.resources.is_empty_resource_request(): [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task AttributeError: 'dict' object has no attribute 'is_empty_resource_request' {code} ...setting 
https://github.com/apache/incubator-airflow/blob/fc10f7e0a04145a0b2f31f8d0990bbe900b4e8a2/airflow/example_dags/example_kubernetes_executor.py#L66 works, however that only adjusts the metadata for the worker pod and not the pod ultimately used for the task. was: When I try and set a resource limit/request on a DAG task with the KubernetesPodOperator, I get: {code:java} [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task Traceback (most recent call last): [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/bin/airflow", line 32, in [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task args.func(args) [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/utils/cli.py", line 74, in wrapper [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task return f(
[jira] [Updated] (AIRFLOW-2955) Kubernetes pod operator: Unable to set requests/limits on task pods
[ https://issues.apache.org/jira/browse/AIRFLOW-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jon Davies updated AIRFLOW-2955: Description: When I try and set a resource limit/request on a DAG task with the KubernetesPodOperator, I get: {code:java} [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task Traceback (most recent call last): [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/bin/airflow", line 32, in [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task args.func(args) [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/utils/cli.py", line 74, in wrapper [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task return f(*args, **kwargs) [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/bin/cli.py", line 498, in run [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task _run(args, dag, ti) [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/bin/cli.py", line 402, in _run [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task pool=args.pool, [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/utils/db.py", line 74, in wrapper [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task return func(*args, **kwargs) [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/models.py", line 1633, in _run_raw_task [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task result = task_copy.execute(context=context) [2018-08-24 15:51:27,796] 
{base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/contrib/operators/kubernetes_pod_operator.py", line 115, in execute [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task get_logs=self.get_logs) [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/contrib/kubernetes/pod_launcher.py", line 71, in run_pod [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task resp = self.run_pod_async(pod) [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/contrib/kubernetes/pod_launcher.py", line 52, in run_pod_async [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task req = self.kube_req_factory.create(pod) [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/contrib/kubernetes/kubernetes_request_factory/pod_request_factory.py", line 56, in create [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task self.extract_resources(pod, req) [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/contrib/kubernetes/kubernetes_request_factory/kubernetes_request_factory.py", line 160, in extract_resources [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task if not pod.resources or pod.resources.is_empty_resource_request(): [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task AttributeError: 'dict' object has no attribute 'is_empty_resource_request' {code} ...setting https://github.com/apache/incubator-airflow/blob/fc10f7e0a04145a0b2f31f8d0990bbe900b4e8a2/airflow/example_dags/example_kubernetes_executor.py#L66 works, however that only adjusts the metadata for the worker pod and 
not the pod ultimately used for the task. was: When I try and set a resource limit/request on a DAG task with the KubernetesPodOperator, I get: {code:java} [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task Traceback (most recent call last): [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/bin/airflow", line 32, in [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task args.func(args) [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/utils/cli.py", line 74, in wrapper [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task return f(*args, **kwargs) [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2
[jira] [Created] (AIRFLOW-2955) Kubernetes pod operator: Unable to set requests/limits on task pods
Jon Davies created AIRFLOW-2955: --- Summary: Kubernetes pod operator: Unable to set requests/limits on task pods Key: AIRFLOW-2955 URL: https://issues.apache.org/jira/browse/AIRFLOW-2955 Project: Apache Airflow Issue Type: Bug Reporter: Jon Davies When I try and set a resource limit/request on a DAG task with the KubernetesPodOperator, I get: {code:java} [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task Traceback (most recent call last): [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/bin/airflow", line 32, in [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task args.func(args) [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/utils/cli.py", line 74, in wrapper [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task return f(*args, **kwargs) [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/bin/cli.py", line 498, in run [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task _run(args, dag, ti) [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/bin/cli.py", line 402, in _run [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task pool=args.pool, [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/utils/db.py", line 74, in wrapper [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task return func(*args, **kwargs) [2018-08-24 15:51:27,795] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/models.py", line 1633, in _run_raw_task [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask 
task result = task_copy.execute(context=context) [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/contrib/operators/kubernetes_pod_operator.py", line 115, in execute [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task get_logs=self.get_logs) [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/contrib/kubernetes/pod_launcher.py", line 71, in run_pod [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task resp = self.run_pod_async(pod) [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/contrib/kubernetes/pod_launcher.py", line 52, in run_pod_async [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task req = self.kube_req_factory.create(pod) [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/contrib/kubernetes/kubernetes_request_factory/pod_request_factory.py", line 56, in create [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task self.extract_resources(pod, req) [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task File "/usr/local/lib/python3.7/site-packages/airflow/contrib/kubernetes/kubernetes_request_factory/kubernetes_request_factory.py", line 160, in extract_resources [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task if not pod.resources or pod.resources.is_empty_resource_request(): [2018-08-24 15:51:27,796] {base_task_runner.py:107} INFO - Job 2: Subtask task AttributeError: 'dict' object has no attribute 'is_empty_resource_request' {code} ...setting 
https://github.com/apache/incubator-airflow/blob/fc10f7e0a04145a0b2f31f8d0990bbe900b4e8a2/airflow/example_dags/example_kubernetes_executor.py#L66 works, however that only adjusts the metadata for the worker pod. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
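The traceback in the reports above ends in the contrib request factory calling `pod.resources.is_empty_resource_request()`, a method a plain dict does not have. The stand-alone sketch below uses simplified stand-ins (the `Resources` class and `extract_resources` function here are toy versions, not the real Airflow classes) to show why passing `resources={"limit_cpu": 1, "request_cpu": 1}` raises the `AttributeError`, while an object exposing the expected interface passes through the same guard:

```python
# Toy reproduction of the bug above -- NOT the real Airflow classes.
# The request factory's guard calls resources.is_empty_resource_request(),
# so a plain dict blows up with AttributeError.

class Resources:
    """Simplified resources object exposing the method the factory expects."""

    def __init__(self, request_cpu=None, limit_cpu=None,
                 request_memory=None, limit_memory=None):
        self.request_cpu = request_cpu
        self.limit_cpu = limit_cpu
        self.request_memory = request_memory
        self.limit_memory = limit_memory

    def is_empty_resource_request(self):
        return all(v is None for v in (self.request_cpu, self.limit_cpu,
                                       self.request_memory, self.limit_memory))


def extract_resources(resources):
    """Mimics the guard at kubernetes_request_factory.py line 160."""
    if not resources or resources.is_empty_resource_request():
        return {}
    return {"requests": {"cpu": resources.request_cpu},
            "limits": {"cpu": resources.limit_cpu}}


# A dict has no is_empty_resource_request(), reproducing the AttributeError:
try:
    extract_resources({"limit_cpu": 1, "request_cpu": 1})
except AttributeError as exc:
    print(exc)

# An object with the expected interface is accepted by the same guard:
print(extract_resources(Resources(request_cpu=1, limit_cpu=1)))
```

This suggests the operator either needs to accept a dict and convert it, or to document that `resources` must be an object implementing `is_empty_resource_request()`.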
[GitHub] feng-tao closed pull request #3800: [AIRFLOW-XXX] Adding King.com to the list of companies.
feng-tao closed pull request #3800: [AIRFLOW-XXX] Adding King.com to the list of companies. URL: https://github.com/apache/incubator-airflow/pull/3800 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: diff --git a/README.md b/README.md index 606f2da745..c4ae6c08fd 100644 --- a/README.md +++ b/README.md @@ -186,6 +186,7 @@ Currently **officially** using Airflow: 1. [JobTeaser](https://www.jobteaser.com) [[@stefani75](https://github.com/stefani75) & [@knil-sama](https://github.com/knil-sama)] 1. [Kalibrr](https://www.kalibrr.com/) [[@charlesverdad](https://github.com/charlesverdad)] 1. [Karmic](https://karmiclabs.com) [[@hyw](https://github.com/hyw)] +1. [King](https://king.com) [[@nathadfield](https://github.com/nathadfield)] 1. [Kiwi.com](https://kiwi.com/) [[@underyx](https://github.com/underyx)] 1. [Kogan.com](https://github.com/kogan) [[@geeknam](https://github.com/geeknam)] 1. [KPN B.V.](https://www.kpn.com/) [[@biyanisuraj](https://github.com/biyanisuraj) & [@gmic](https://github.com/gmic)] This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] feng-tao commented on issue #3800: [AIRFLOW-XXX] Adding King.com to the list of companies.
feng-tao commented on issue #3800: [AIRFLOW-XXX] Adding King.com to the list of companies. URL: https://github.com/apache/incubator-airflow/pull/3800#issuecomment-415814089 lgtm This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (AIRFLOW-2950) Running Airflow behind a proxy
[ https://issues.apache.org/jira/browse/AIRFLOW-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivakumar Gopalakrishnan updated AIRFLOW-2950: --- Description: We are setting up Airflow in a private VPC and we do NOT have a transparent proxy. The Airflow configuration uses Celery executors and SQS; we have changed the Celery default configuration to include the region we are using. However, the Celery executor cannot reach AWS unless we specify the proxy configuration (http_proxy, https_proxy & no_proxy) via os.environ. I need to edit the airflow executable script for this; I was looking to have these variables defined under core and loaded if they are defined. was: Airflow with a Celery+SQS configuration does not work behind a proxy. It would be nice to add the variables http_proxy, https_proxy & no_proxy as part of the core configuration so that they can be used internally. > Running Airflow behind a proxy > -- > > Key: AIRFLOW-2950 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2950 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Shivakumar Gopalakrishnan >Assignee: Shivakumar Gopalakrishnan >Priority: Minor > > > We are setting up Airflow in a private VPC and we do NOT have a transparent > proxy. > The Airflow configuration uses Celery executors and SQS; we have changed > the Celery default configuration to include the region we are using. > However, the Celery executor cannot reach AWS unless we specify the proxy > configuration (http_proxy, https_proxy & no_proxy) via os.environ. > I need to edit the airflow executable script for this; I was looking to have > these variables defined under core and loaded if they are defined > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
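The improvement requested above could be sketched as a small helper that reads proxy settings from configuration and exports them into the process environment before the executor starts, so botocore/Celery pick them up. Note this is a hypothetical illustration: no such `[core]` options exist in Airflow, and the plain-dict config stands in for whatever config API would actually be used:

```python
# Hypothetical sketch of the requested feature -- these config keys do not
# exist in Airflow. Proxy variables defined under a "core" section are
# exported into the environment so HTTP clients honor them.
import os


def export_proxy_settings(config, environ=os.environ):
    """Copy any configured proxy variables into the process environment."""
    for key in ("http_proxy", "https_proxy", "no_proxy"):
        value = config.get("core", {}).get(key)
        if value:
            # Export the uppercase form; many tools also read the lowercase
            # spelling, so a real implementation might set both.
            environ[key.upper()] = value


env = {}  # stand-in for os.environ so the example has no side effects
export_proxy_settings(
    {"core": {"http_proxy": "http://proxy.internal:3128",
              "https_proxy": "http://proxy.internal:3128",
              "no_proxy": "169.254.169.254,localhost"}},
    environ=env,
)
print(sorted(env))  # ['HTTPS_PROXY', 'HTTP_PROXY', 'NO_PROXY']
```

Calling something like this during executor startup would remove the need to patch the airflow entry-point script by hand, which is what the reporter is currently doing.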
[jira] [Comment Edited] (AIRFLOW-2779) Verify and correct licenses
[ https://issues.apache.org/jira/browse/AIRFLOW-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591787#comment-16591787 ] Ash Berlin-Taylor edited comment on AIRFLOW-2779 at 8/24/18 3:53 PM: - Sorry [~mtp401]! -Someone did an over-zealous find-and-replace-. ASF Policy doesn't want us to have individual copyright notices in each file, instead preferring them in a NOTICE file. I'll open a PR to restore the correct attribution. was (Author: ashb): Sorry [~mtp401]! Someone did an over-zealous find-and-replace. I'll open a PR to restore the correct attribution. ASF Policy doesn't want us to have individual copyright notices in each file, instead preferring them in a NOTICE file. Are you happy with this: {code} airflow.contrib.auth.backends.github_enterprise_auth * Copyright 2015 Matthew Pelland (m...@pelland.io) {code} > Verify and correct licenses > --- > > Key: AIRFLOW-2779 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2779 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Bolke de Bruin >Priority: Major > Labels: licenses > Fix For: 1.10.0 > > > 1. /airflow/security/utils.py > 2. ./airflow/security/kerberos.py > 3. ./airflow/www_rbac/static/jqClock.min.js > 4. ./airflow/www/static/bootstrap3-typeahead.min.js > 5. ./apache-airflow-1.10.0rc2+incubating/scripts/ci/flake8_diff.sh > 6. [https://www.apache.org/legal/resolved.html#optional] > 7. ./docs/license.rst > 8. airflow/contrib/auth/backends/google_auth.py > 9. /airflow/contrib/auth/backends/github_enterprise_auth.py > 10. /airflow/contrib/hooks/ssh_hook.py > 11. /airflow/minihivecluster.py > These files [1][2] seem to be 3rd-party ALv2-licensed files that refer to a NOTICE file; the information in that NOTICE file (at the very least the copyright info) should be in your NOTICE file. This should also be noted in LICENSE. > > LICENSE is: > - missing jQuery clock [3] and typeahead [4]; as they are ALv2 it's not required to list them, but it's a good idea to do so. > - missing the license for this [5] > - this file [7] oddly has © 2016 GitHub, Inc. at the bottom of it > > * Year in NOTICE is not correct: "2016 and onwards" isn't valid, as copyright has an expiry date > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2779) Verify and correct licenses
[ https://issues.apache.org/jira/browse/AIRFLOW-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591839#comment-16591839 ] ASF GitHub Bot commented on AIRFLOW-2779: - ashb opened a new pull request #3803: [AIRFLOW-2779] Restore Copyright notice of GHE auth backend URL: https://github.com/apache/incubator-airflow/pull/3803 Make sure you have checked _all_ steps below. ### Jira - [x] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR" - https://issues.apache.org/jira/browse/AIRFLOW-2779 ### Description - [x] Here are some details about my PR, including screenshots of any UI changes: At request of original author. Placed in NOTICES file to comply with https://www.apache.org/legal/src-headers.html#headers. ### Tests - [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [x] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 1. Subject is limited to 50 characters (not including Jira issue reference) 1. Subject does not end with a period 1. Subject uses the imperative mood ("add", not "adding") 1. Body wraps at 72 characters 1. Body explains "what" and "why", not "how" ### Documentation - [x] In case of new functionality, my PR adds documentation that describes how to use it. - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added. ### Code Quality - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff` This is an automated message from the Apache Git Service. 
To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Verify and correct licenses > --- > > Key: AIRFLOW-2779 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2779 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Bolke de Bruin >Priority: Major > Labels: licenses > Fix For: 1.10.0 > > > 1. /airflow/security/utils.py > 2. ./airflow/security/kerberos.py > 3. ./airflow/www_rbac/static/jqClock.min.js > 4. ./airflow/www/static/bootstrap3-typeahead.min.js > 5. ./apache-airflow-1.10.0rc2+incubating/scripts/ci/flake8_diff.sh > 6. [https://www.apache.org/legal/resolved.html#optional] > 7. ./docs/license.rst > 8. airflow/contrib/auth/backends/google_auth.py > 9. /airflow/contrib/auth/backends/github_enterprise_auth.py > 10. /airflow/contrib/hooks/ssh_hook.py > 11. /airflow/minihivecluster.py > These files [1][2] seem to be 3rd-party ALv2-licensed files that refer to a NOTICE file; the information in that NOTICE file (at the very least the copyright info) should be in your NOTICE file. This should also be noted in LICENSE. > > LICENSE is: > - missing jQuery clock [3] and typeahead [4]; as they are ALv2 it's not required to list them, but it's a good idea to do so. > - missing the license for this [5] > - this file [7] oddly has © 2016 GitHub, Inc. at the bottom of it > > * Year in NOTICE is not correct: "2016 and onwards" isn't valid, as copyright has an expiry date > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (AIRFLOW-2779) Verify and correct licenses
[ https://issues.apache.org/jira/browse/AIRFLOW-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591787#comment-16591787 ] Ash Berlin-Taylor edited comment on AIRFLOW-2779 at 8/24/18 3:28 PM:
-
Sorry [~mtp401]! Someone did an over-zealous find-and-replace. I'll open a PR to restore the correct attribution.
ASF Policy doesn't want us to have individual copyright notices in each file, instead preferring them in a NOTICE file. Are you happy with this:
{code}
airflow.contrib.auth.backends.github_enterprise_auth
* Copyright 2015 Matthew Pelland (m...@pelland.io)
{code}
was (Author: ashb): Sorry [~mtp401]! Someone did an over-zealous find-and-replace. I'll open a PR to restore the correct attribution.
[jira] [Commented] (AIRFLOW-2806) test_mark_success_no_kill test breaks intermittently on CI
[ https://issues.apache.org/jira/browse/AIRFLOW-2806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591797#comment-16591797 ] Taylor Edmiston commented on AIRFLOW-2806: -- [~gcuriel] I believe this issue still exists on the CI but admittedly I've been very focused on another lately (AIRFLOW-2803) and haven't turned my head back to this one yet. Do you know if it's been resolved already? > test_mark_success_no_kill test breaks intermittently on CI > -- > > Key: AIRFLOW-2806 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2806 > Project: Apache Airflow > Issue Type: Bug >Reporter: Taylor Edmiston >Assignee: Taylor Edmiston >Priority: Minor > > The test_mark_success_no_kill test is breaking intermittently on the CI for > some versions of Python and some databases, particularly Python 3.5 for both > PostgreSQL and MySQL. > A traceback of the error is > ([link|https://travis-ci.org/apache/incubator-airflow/jobs/407522994#L5668-L5701]): > {code:java} > 10) ERROR: test_mark_success_no_kill (tests.transplant_class..C) > -- > Traceback (most recent call last): > tests/jobs.py line 1116 in test_mark_success_no_kill > ti.refresh_from_db() > airflow/utils/db.py line 74 in wrapper > return func(*args, **kwargs) > /opt/python/3.5.5/lib/python3.5/contextlib.py line 66 in __exit__ > next(self.gen) > airflow/utils/db.py line 45 in create_session > session.commit() > > .tox/py35-backend_postgres/lib/python3.5/site-packages/sqlalchemy/orm/session.py > line 927 in commit > self.transaction.commit() > > .tox/py35-backend_postgres/lib/python3.5/site-packages/sqlalchemy/orm/session.py > line 471 in commit > t[1].commit() > > .tox/py35-backend_postgres/lib/python3.5/site-packages/sqlalchemy/engine/base.py > line 1632 in commit > self._do_commit() > > .tox/py35-backend_postgres/lib/python3.5/site-packages/sqlalchemy/engine/base.py > line 1663 in _do_commit > self.connection._commit_impl() > > 
.tox/py35-backend_postgres/lib/python3.5/site-packages/sqlalchemy/engine/base.py > line 723 in _commit_impl > self._handle_dbapi_exception(e, None, None, None, None) > > .tox/py35-backend_postgres/lib/python3.5/site-packages/sqlalchemy/engine/base.py > line 1402 in _handle_dbapi_exception > exc_info > > .tox/py35-backend_postgres/lib/python3.5/site-packages/sqlalchemy/util/compat.py > line 203 in raise_from_cause > reraise(type(exception), exception, tb=exc_tb, cause=cause) > > .tox/py35-backend_postgres/lib/python3.5/site-packages/sqlalchemy/util/compat.py > line 186 in reraise > raise value.with_traceback(tb) > > .tox/py35-backend_postgres/lib/python3.5/site-packages/sqlalchemy/engine/base.py > line 721 in _commit_impl > self.engine.dialect.do_commit(self.connection) > > .tox/py35-backend_postgres/lib/python3.5/site-packages/sqlalchemy/engine/default.py > line 443 in do_commit > dbapi_connection.commit() > OperationalError: (psycopg2.OperationalError) server closed the connection > unexpectedly > This probably means the server terminated abnormally{code} > It seems to be erroring out on trying to > [commit|http://initd.org/psycopg/docs/connection.html#connection.commit] the > pending transaction to the database, possibly because the connection has been > closed. What's weird is that this line is already in a try-except block > catching all exceptions, but I think it's somehow not entering the except > clause. > [https://github.com/apache/incubator-airflow/blob/f3b6b60c4809afdde916e8982a300f942f26109b/airflow/utils/db.py#L36-L50] > Note: This is a follow up to AIRFLOW-2801 ([PR > #3642|https://github.com/apache/incubator-airflow/pull/3642]) which provided > a short-term solution by skipping the flaky test. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
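The `create_session` helper referenced above follows the standard commit-or-rollback contextmanager pattern. A minimal stand-alone sketch (the stub below stands in for a SQLAlchemy session; this is not Airflow's actual implementation) shows that an error raised by `commit()` does reach the except clause of such a helper, yet is still re-raised to the caller, so the test would fail rather than swallow the error:

```python
from contextlib import contextmanager

class StubSession:
    """Stand-in for a SQLAlchemy session whose commit() fails,
    mimicking 'server closed the connection unexpectedly'."""
    def __init__(self, fail_commit=False):
        self.fail_commit = fail_commit
        self.rolled_back = False
        self.closed = False

    def commit(self):
        if self.fail_commit:
            raise RuntimeError("server closed the connection unexpectedly")

    def rollback(self):
        self.rolled_back = True

    def close(self):
        self.closed = True

@contextmanager
def create_session(session):
    # Common commit-or-rollback shape for a session-providing helper.
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise  # the except clause runs, but the error still propagates
    finally:
        session.close()

session = StubSession(fail_commit=True)
propagated = False
try:
    with create_session(session):
        pass  # the body succeeds; the failure happens at commit time
except RuntimeError:
    propagated = True

print(propagated, session.rolled_back, session.closed)  # True True True
```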
[GitHub] thesquelched commented on issue #2135: [AIRFLOW-843] Store exceptions on task_instance
thesquelched commented on issue #2135: [AIRFLOW-843] Store exceptions on task_instance URL: https://github.com/apache/incubator-airflow/pull/2135#issuecomment-415792501 I've just been lazy; I'll look at this again.
[jira] [Commented] (AIRFLOW-2779) Verify and correct licenses
[ https://issues.apache.org/jira/browse/AIRFLOW-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591787#comment-16591787 ] Ash Berlin-Taylor commented on AIRFLOW-2779:
Sorry [~mtp401]! Someone did an over-zealous find-and-replace. I'll open a PR to restore the correct attribution.
[GitHub] ashb commented on a change in pull request #2135: [AIRFLOW-843] Store exceptions on task_instance
ashb commented on a change in pull request #2135: [AIRFLOW-843] Store exceptions on task_instance URL: https://github.com/apache/incubator-airflow/pull/2135#discussion_r212655832

## File path: airflow/models.py

@@ -1574,6 +1574,8 @@ def dry_run(self):

     def handle_failure(self, error, test_mode=False, context=None, session=None):
         self.log.exception(error)
         task = self.task
+        session = settings.Session()
+        self.exception = error

Review comment: Adding a property to the TaskInstance object feels like the wrong place for this - the context object feels like a more appropriate place.
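A hypothetical sketch of the reviewer's suggestion — surfacing the failure through the context dict handed to callbacks rather than persisting it as a new TaskInstance attribute. The names and shapes below are illustrative only, not Airflow's actual `handle_failure` API:

```python
# Illustrative only: a failure handler that exposes the exception via the
# context dict, so downstream callbacks can inspect it without a new
# TaskInstance column.
def handle_failure(error, context):
    context["exception"] = error
    for callback in context.get("on_failure_callbacks", []):
        callback(context)

seen = {}
ctx = {"on_failure_callbacks": [lambda c: seen.update(error=c["exception"])]}
handle_failure(ValueError("boom"), ctx)
print(type(seen["error"]).__name__)  # ValueError
```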
[GitHub] ashb commented on a change in pull request #2135: [AIRFLOW-843] Store exceptions on task_instance
ashb commented on a change in pull request #2135: [AIRFLOW-843] Store exceptions on task_instance URL: https://github.com/apache/incubator-airflow/pull/2135#discussion_r212653771

## File path: airflow/models.py

@@ -1574,6 +1574,8 @@ def dry_run(self):

     def handle_failure(self, error, test_mode=False, context=None, session=None):
         self.log.exception(error)
         task = self.task
+        session = settings.Session()

Review comment: What is this line for? It seems like it does nothing?
[GitHub] jpds commented on a change in pull request #3782: [AIRFLOW-2936] Use official Python images as base image for Docker
jpds commented on a change in pull request #3782: [AIRFLOW-2936] Use official Python images as base image for Docker URL: https://github.com/apache/incubator-airflow/pull/3782#discussion_r212654329

## File path: scripts/ci/kubernetes/docker/airflow-init.sh

@@ -17,9 +17,10 @@
 # specific language governing permissions and limitations
 # under the License.

-cd /usr/local/lib/python2.7/dist-packages/airflow && \
-cp -R example_dags/* /root/airflow/dags/ && \
+set -e
+
+cd /usr/local/lib/python3.7/site-packages/airflow/ && \
+cp -R example_dags/* /home/airflow/dags/ && \
 airflow initdb && \
 alembic upgrade heads && \
-(airflow create_user -u airflow -l airflow -f jon -e airf...@apache.org -r Admin -p airflow || true) && \
-echo "retrieved from mount" > /root/test_volume/test.txt

Review comment: I haven't found this necessary in our deployments (but we're using an Airflow image with our DAGs baked in).
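The `set -e` added at the top of the script makes the shell abort on the first failing command — the behaviour the `&& \` chaining otherwise has to provide by hand. A quick stand-alone check of that shell semantic (plain POSIX sh, nothing Airflow-specific):

```python
import subprocess

# Without errexit, the shell keeps going after `false` and prints "reached".
without = subprocess.run(["sh", "-c", "false; echo reached"],
                         capture_output=True, text=True)
# With `set -e`, the shell exits at `false`, so the echo never runs.
with_e = subprocess.run(["sh", "-c", "set -e; false; echo reached"],
                        capture_output=True, text=True)

print(repr(without.stdout.strip()))  # 'reached'
print(repr(with_e.stdout.strip()))   # ''
print(with_e.returncode != 0)        # True
```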
[GitHub] betabandido edited a comment on issue #3570: [AIRFLOW-2709] Improve error handling in Databricks hook
betabandido edited a comment on issue #3570: [AIRFLOW-2709] Improve error handling in Databricks hook URL: https://github.com/apache/incubator-airflow/pull/3570#issuecomment-415782113 @Fokko @andrewmchen I just changed the code so that we test for the HTTP status code instead of relying on parsing the response content. The hook's code is significantly simpler now too. I refactored the tests a little bit too. Now it is possible to test that no retry occurs for codes < 500. We can also test that once a retried request succeeds, the retry method simply returns the response.
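The retry rule described here — fail fast for client errors, retry for connection errors, timeouts, and 5xx responses — can be checked with a small stdlib-only predicate. The exception classes below are stand-ins for `requests.exceptions`, and `is_retryable` is a simplified analogue of the PR's `_retryable_error` helper, not the hook itself:

```python
from types import SimpleNamespace

# Stand-ins for requests.exceptions types (the real hook uses requests).
class ConnectionError_(Exception):
    response = None

class Timeout_(Exception):
    response = None

class HTTPError_(Exception):
    def __init__(self, status_code):
        self.response = SimpleNamespace(status_code=status_code)

def is_retryable(exc):
    """Retry only on transient failures: connection errors, timeouts,
    and HTTP responses with a status code >= 500."""
    if isinstance(exc, (ConnectionError_, Timeout_)):
        return True
    response = getattr(exc, "response", None)
    return response is not None and response.status_code >= 500

print(is_retryable(HTTPError_(400)))  # False: client error, don't retry
print(is_retryable(HTTPError_(503)))  # True: transient server error
print(is_retryable(Timeout_()))       # True
```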
[jira] [Commented] (AIRFLOW-2709) Improve error handling in Databricks hook
[ https://issues.apache.org/jira/browse/AIRFLOW-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591693#comment-16591693 ] ASF GitHub Bot commented on AIRFLOW-2709: - betabandido closed pull request #3570: [AIRFLOW-2709] Improve error handling in Databricks hook URL: https://github.com/apache/incubator-airflow/pull/3570 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/hooks/databricks_hook.py b/airflow/contrib/hooks/databricks_hook.py
index 54f00e0090..bb89113491 100644
--- a/airflow/contrib/hooks/databricks_hook.py
+++ b/airflow/contrib/hooks/databricks_hook.py
@@ -24,6 +24,7 @@
 from airflow.hooks.base_hook import BaseHook
 from requests import exceptions as requests_exceptions
 from requests.auth import AuthBase
+from time import sleep

 from airflow.utils.log.logging_mixin import LoggingMixin

@@ -47,7 +48,8 @@ def __init__(
             self,
             databricks_conn_id='databricks_default',
             timeout_seconds=180,
-            retry_limit=3):
+            retry_limit=3,
+            retry_delay=1.0):
         """
         :param databricks_conn_id: The name of the databricks connection to use.
         :type databricks_conn_id: string
@@ -57,6 +59,9 @@
         :param retry_limit: The number of times to retry the connection in case of
             service outages.
         :type retry_limit: int
+        :param retry_delay: The number of seconds to wait between retries (it
+            might be a floating point number).
+        :type retry_delay: float
         """
         self.databricks_conn_id = databricks_conn_id
         self.databricks_conn = self.get_connection(databricks_conn_id)
@@ -64,6 +69,7 @@
         if retry_limit < 1:
             raise ValueError('Retry limit must be greater than equal to 1')
         self.retry_limit = retry_limit
+        self.retry_delay = retry_delay

     @staticmethod
     def _parse_host(host):
@@ -119,7 +125,8 @@ def _do_api_call(self, endpoint_info, json):
         else:
             raise AirflowException('Unexpected HTTP Method: ' + method)

-        for attempt_num in range(1, self.retry_limit + 1):
+        attempt_num = 1
+        while True:
             try:
                 response = request_func(
                     url,
@@ -127,21 +134,29 @@
                     auth=auth,
                     headers=USER_AGENT_HEADER,
                     timeout=self.timeout_seconds)
-                if response.status_code == requests.codes.ok:
-                    return response.json()
-                else:
+                response.raise_for_status()
+                return response.json()
+            except requests_exceptions.RequestException as e:
+                if not _retryable_error(e):
                     # In this case, the user probably made a mistake.
                     # Don't retry.
                     raise AirflowException('Response: {0}, Status Code: {1}'.format(
-                        response.content, response.status_code))
-            except (requests_exceptions.ConnectionError,
-                    requests_exceptions.Timeout) as e:
-                self.log.error(
-                    'Attempt %s API Request to Databricks failed with reason: %s',
-                    attempt_num, e
-                )
-        raise AirflowException(('API requests to Databricks failed {} times. ' +
-                                'Giving up.').format(self.retry_limit))
+                        e.response.content, e.response.status_code))
+
+                self._log_request_error(attempt_num, e)
+
+                if attempt_num == self.retry_limit:
+                    raise AirflowException(('API requests to Databricks failed {} times. ' +
+                                            'Giving up.').format(self.retry_limit))
+
+                attempt_num += 1
+                sleep(self.retry_delay)
+
+    def _log_request_error(self, attempt_num, error):
+        self.log.error(
+            'Attempt %s API Request to Databricks failed with reason: %s',
+            attempt_num, error
+        )

     def submit_run(self, json):
         """
@@ -175,6 +190,12 @@ def cancel_run(self, run_id):
         self._do_api_call(CANCEL_RUN_ENDPOINT, json)


+def _retryable_error(exception):
+    return type(exception) == requests_exceptions.ConnectionError \
+        or type(exception) == requests_exceptions.Timeout \
+        or exception.response is not None and exception.response.status_code >= 500
+
+
 RUN_LIFE_CYCLE_STATES = [
     'PENDING',
     'RUNNING',
diff --git a/airflow/contrib/operators/databricks_operator.p
[jira] [Commented] (AIRFLOW-2709) Improve error handling in Databricks hook
[ https://issues.apache.org/jira/browse/AIRFLOW-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591694#comment-16591694 ] ASF GitHub Bot commented on AIRFLOW-2709: - betabandido opened a new pull request #3570: [AIRFLOW-2709] Improve error handling in Databricks hook URL: https://github.com/apache/incubator-airflow/pull/3570 ### JIRA - [x] My PR addresses the following [Airflow JIRA](https://issues.apache.org/jira/browse/AIRFLOW-2709) issues and references them in the PR title. ### Description - [x] This PR enhances the error handling in the Databricks hook (and in its corresponding operator). The PR adds the capability to wait between requests, and it adds support to handle a TEMPORARILY_UNAVAILABLE error. This error is neither a connection nor timeout error, and thus it is not correctly handled by the current hook implementation. ### Tests - [x] My PR adds a unit test to specifically test the new functionality in the hook, and modifies existing tests in the operator to ensure the hook is initialized correctly. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Improve error handling in Databricks hook > - > > Key: AIRFLOW-2709 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2709 > Project: Apache Airflow > Issue Type: Improvement > Components: contrib, hooks >Reporter: Victor Jimenez >Priority: Major > > The Databricks hook handles both connection and timeout errors. However, > Databricks sometimes returns a temporarily unavailable error. That error is > neither a connection nor timeout one. It is just an HTTPError containing the > following text in the response: TEMPORARILY_UNAVAILABLE. The current error > handling in the hook should be enhanced to support this error. > Also, the Databricks hook contains retry logic. 
Yet, it does not support > sleeping for some time between requests. This creates a problem in handling > errors such as the TEMPORARILY_UNAVAILABLE one. This error typically resolves > after a few seconds. Adding support for sleeping between retry attempts would > really help to enhance the reliability of Databricks operations. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
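The two behaviours the issue asks for — a bounded number of attempts and a pause between them — combine into the usual retry loop. This is a generic sketch under illustrative names (`call_with_retries`, `TransientError`), not the Databricks hook's API:

```python
import time

class TransientError(Exception):
    """Stand-in for a retryable failure such as TEMPORARILY_UNAVAILABLE."""

def call_with_retries(func, retry_limit=3, retry_delay=0.01):
    """Call func(), retrying up to retry_limit times in total and
    sleeping retry_delay seconds between attempts."""
    for attempt_num in range(1, retry_limit + 1):
        try:
            return func()
        except TransientError:
            if attempt_num == retry_limit:
                raise  # give up after the final attempt
            time.sleep(retry_delay)

attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise TransientError("TEMPORARILY_UNAVAILABLE")
    return "ok"

result = call_with_retries(flaky)
print(result, len(attempts))  # ok 3
```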
[GitHub] ashb commented on issue #3747: [AIRFLOW-2895] Prevent scheduler from spamming heartbeats/logs
ashb commented on issue #3747: [AIRFLOW-2895] Prevent scheduler from spamming heartbeats/logs URL: https://github.com/apache/incubator-airflow/pull/3747#issuecomment-415758097 Thanks, I'm starting to make a note of what PRs we'd want to get into a 1.10.1 release.
[GitHub] bolkedebruin commented on issue #3747: [AIRFLOW-2895] Prevent scheduler from spamming heartbeats/logs
bolkedebruin commented on issue #3747: [AIRFLOW-2895] Prevent scheduler from spamming heartbeats/logs URL: https://github.com/apache/incubator-airflow/pull/3747#issuecomment-415757341 1.10.1
[jira] [Commented] (AIRFLOW-2709) Improve error handling in Databricks hook
[ https://issues.apache.org/jira/browse/AIRFLOW-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591613#comment-16591613 ]

ASF GitHub Bot commented on AIRFLOW-2709:
-----------------------------------------

betabandido closed pull request #3570: [AIRFLOW-2709] Improve error handling in Databricks hook
URL: https://github.com/apache/incubator-airflow/pull/3570

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

diff --git a/airflow/contrib/hooks/databricks_hook.py b/airflow/contrib/hooks/databricks_hook.py
index 54f00e0090..bb89113491 100644
--- a/airflow/contrib/hooks/databricks_hook.py
+++ b/airflow/contrib/hooks/databricks_hook.py
@@ -24,6 +24,7 @@
 from airflow.hooks.base_hook import BaseHook
 from requests import exceptions as requests_exceptions
 from requests.auth import AuthBase
+from time import sleep

 from airflow.utils.log.logging_mixin import LoggingMixin
@@ -47,7 +48,8 @@ def __init__(
             self,
             databricks_conn_id='databricks_default',
             timeout_seconds=180,
-            retry_limit=3):
+            retry_limit=3,
+            retry_delay=1.0):
         """
         :param databricks_conn_id: The name of the databricks connection to use.
         :type databricks_conn_id: string
@@ -57,6 +59,9 @@
         :param retry_limit: The number of times to retry the connection in case of
             service outages.
         :type retry_limit: int
+        :param retry_delay: The number of seconds to wait between retries (it
+            might be a floating point number).
+        :type retry_delay: float
         """
         self.databricks_conn_id = databricks_conn_id
         self.databricks_conn = self.get_connection(databricks_conn_id)
@@ -64,6 +69,7 @@
         if retry_limit < 1:
             raise ValueError('Retry limit must be greater than equal to 1')
         self.retry_limit = retry_limit
+        self.retry_delay = retry_delay

     @staticmethod
     def _parse_host(host):
@@ -119,7 +125,8 @@ def _do_api_call(self, endpoint_info, json):
         else:
             raise AirflowException('Unexpected HTTP Method: ' + method)

-        for attempt_num in range(1, self.retry_limit + 1):
+        attempt_num = 1
+        while True:
             try:
                 response = request_func(
                     url,
@@ -127,21 +134,29 @@
                     auth=auth,
                     headers=USER_AGENT_HEADER,
                     timeout=self.timeout_seconds)
-                if response.status_code == requests.codes.ok:
-                    return response.json()
-                else:
+                response.raise_for_status()
+                return response.json()
+            except requests_exceptions.RequestException as e:
+                if not _retryable_error(e):
                     # In this case, the user probably made a mistake.
                     # Don't retry.
                     raise AirflowException('Response: {0}, Status Code: {1}'.format(
-                        response.content, response.status_code))
-            except (requests_exceptions.ConnectionError,
-                    requests_exceptions.Timeout) as e:
-                self.log.error(
-                    'Attempt %s API Request to Databricks failed with reason: %s',
-                    attempt_num, e
-                )
-        raise AirflowException(('API requests to Databricks failed {} times. ' +
-                                'Giving up.').format(self.retry_limit))
+                        e.response.content, e.response.status_code))
+
+                self._log_request_error(attempt_num, e)
+
+            if attempt_num == self.retry_limit:
+                raise AirflowException(('API requests to Databricks failed {} times. ' +
+                                        'Giving up.').format(self.retry_limit))
+
+            attempt_num += 1
+            sleep(self.retry_delay)
+
+    def _log_request_error(self, attempt_num, error):
+        self.log.error(
+            'Attempt %s API Request to Databricks failed with reason: %s',
+            attempt_num, error
+        )

     def submit_run(self, json):
         """
@@ -175,6 +190,12 @@ def cancel_run(self, run_id):
         self._do_api_call(CANCEL_RUN_ENDPOINT, json)

+def _retryable_error(exception):
+    return type(exception) == requests_exceptions.ConnectionError \
+        or type(exception) == requests_exceptions.Timeout \
+        or exception.response is not None and exception.response.status_code >= 500
+
+
 RUN_LIFE_CYCLE_STATES = [
     'PENDING',
     'RUNNING',

diff --git a/airflow/contrib/operators/databricks_operator.py b/airflow/contrib/operators/databricks_operator.py
[jira] [Commented] (AIRFLOW-2709) Improve error handling in Databricks hook
[ https://issues.apache.org/jira/browse/AIRFLOW-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591614#comment-16591614 ]

ASF GitHub Bot commented on AIRFLOW-2709:
-----------------------------------------

betabandido opened a new pull request #3570: [AIRFLOW-2709] Improve error handling in Databricks hook
URL: https://github.com/apache/incubator-airflow/pull/3570

### JIRA
- [x] My PR addresses the following [Airflow JIRA](https://issues.apache.org/jira/browse/AIRFLOW-2709) issues and references them in the PR title.

### Description
- [x] This PR enhances the error handling in the Databricks hook (and in its corresponding operator). The PR adds the capability to wait between requests, and it adds support for handling a TEMPORARILY_UNAVAILABLE error. This error is neither a connection nor a timeout error, and thus it is not correctly handled by the current hook implementation.

### Tests
- [x] My PR adds a unit test to specifically test the new functionality in the hook, and modifies existing tests in the operator to ensure the hook is initialized correctly.

> Improve error handling in Databricks hook
> -----------------------------------------
>
>                 Key: AIRFLOW-2709
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2709
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: contrib, hooks
>            Reporter: Victor Jimenez
>            Priority: Major
>
> The Databricks hook handles both connection and timeout errors. However,
> Databricks sometimes returns a temporarily unavailable error. That error is
> neither a connection nor a timeout one. It is just an HTTPError containing the
> following text in the response: TEMPORARILY_UNAVAILABLE. The current error
> handling in the hook should be enhanced to support this error.
> Also, the Databricks hook contains retry logic, yet it does not support
> sleeping for some time between requests. This creates a problem in handling
> errors such as TEMPORARILY_UNAVAILABLE, which typically resolves
> after a few seconds. Adding support for sleeping between retry attempts would
> really help to enhance the reliability of Databricks operations.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
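The retry behaviour described above (a delay between attempts, plus a pluggable test for which errors are retryable) can be sketched in plain Python. This is a minimal, hypothetical stand-in for illustration, not the hook's actual API; `call_with_retries` and its parameters are invented names that mirror the PR's `retry_limit`/`retry_delay` arguments:

```python
import time

def call_with_retries(fn, retry_limit=3, retry_delay=1.0,
                      is_retryable=lambda exc: True):
    """Call fn(); on a retryable error, wait retry_delay seconds and retry,
    giving up after retry_limit attempts."""
    attempt_num = 1
    while True:
        try:
            return fn()
        except Exception as exc:
            if not is_retryable(exc):
                raise  # e.g. an HTTP 4xx: the caller made a mistake, don't retry
            if attempt_num == retry_limit:
                raise RuntimeError(
                    'Request failed {} times. Giving up.'.format(retry_limit))
            attempt_num += 1
            time.sleep(retry_delay)

# A service that is TEMPORARILY_UNAVAILABLE for two calls, then recovers.
calls = {'n': 0}

def flaky_request():
    calls['n'] += 1
    if calls['n'] < 3:
        raise IOError('TEMPORARILY_UNAVAILABLE')
    return {'state': 'ok'}

result = call_with_retries(flaky_request, retry_limit=3, retry_delay=0.0)
```

A non-retryable error is re-raised immediately, while retryable ones are attempted `retry_limit` times with `retry_delay` seconds in between, which is exactly the property the issue asks for.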
[GitHub] betabandido closed pull request #3570: [AIRFLOW-2709] Improve error handling in Databricks hook
betabandido closed pull request #3570: [AIRFLOW-2709] Improve error handling in Databricks hook
URL: https://github.com/apache/incubator-airflow/pull/3570

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed in the AIRFLOW-2709 comment above for the sake of provenance.
[GitHub] betabandido opened a new pull request #3570: [AIRFLOW-2709] Improve error handling in Databricks hook
betabandido opened a new pull request #3570: [AIRFLOW-2709] Improve error handling in Databricks hook
URL: https://github.com/apache/incubator-airflow/pull/3570
[GitHub] ksaagariconic opened a new pull request #3802: Adding THE ICONIC to the list of orgs with AirFlow
ksaagariconic opened a new pull request #3802: Adding THE ICONIC to the list of orgs with AirFlow
URL: https://github.com/apache/incubator-airflow/pull/3802
[GitHub] nicor88 opened a new pull request #3801: [AIRFLOW-XXX] Add 8fit to list of users
nicor88 opened a new pull request #3801: [AIRFLOW-XXX] Add 8fit to list of users
URL: https://github.com/apache/incubator-airflow/pull/3801
[jira] [Closed] (AIRFLOW-2157) Builds in TravisCI are so unstable now
[ https://issues.apache.org/jira/browse/AIRFLOW-2157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kaxil Naik closed AIRFLOW-2157.
-------------------------------
    Resolution: Fixed

> Builds in TravisCI are so unstable now
> --------------------------------------
>
>                 Key: AIRFLOW-2157
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2157
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: ci, travis
>            Reporter: Sergio Herrera
>            Assignee: Gerardo Curiel
>            Priority: Major
>              Labels: CI, test
>
> At the time I write this, I have a PR that builds and passes the tests
> correctly. The problem is that sometimes, after rebasing with changes in the master
> branch, TravisCI builds fail because of a bad environment, but after some
> commit recreations, they pass the tests.
> After studying some of those builds, I think the problem is that installing
> some things from scratch has a performance impact, along with other issues such as
> unavailable services or bad package installations.
> A possible great solution is creating a base image that contains some of the
> software preinstalled (e.g., databases or message queues), as the environment
> for testing is the same for every build.
> This can be related to an old task (AIRFLOW-87) about creating a development
> environment.
[jira] [Updated] (AIRFLOW-404) Travis cache can get borked make sure to retry downloading if unpacking fails
[ https://issues.apache.org/jira/browse/AIRFLOW-404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kaxil Naik updated AIRFLOW-404:
-------------------------------
    Affects Version/s: 2.0.0

> Travis cache can get borked make sure to retry downloading if unpacking fails
> ------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-404
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-404
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: travis
>    Affects Versions: 2.0.0
>            Reporter: Bolke de Bruin
>            Assignee: Gerardo Curiel
>            Priority: Major
[jira] [Resolved] (AIRFLOW-404) Travis cache can get borked make sure to retry downloading if unpacking fails
[ https://issues.apache.org/jira/browse/AIRFLOW-404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kaxil Naik resolved AIRFLOW-404.
--------------------------------
    Resolution: Fixed

Resolved by https://github.com/apache/incubator-airflow/pull/3393

> Travis cache can get borked make sure to retry downloading if unpacking fails
> ------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-404
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-404
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: travis
>    Affects Versions: 2.0.0
>            Reporter: Bolke de Bruin
>            Assignee: Gerardo Curiel
>            Priority: Major
[jira] [Closed] (AIRFLOW-602) Unit Test Cases Doesn't run in Master Branch
[ https://issues.apache.org/jira/browse/AIRFLOW-602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kaxil Naik closed AIRFLOW-602.
------------------------------
    Resolution: Fixed

> Unit Test Cases Doesn't run in Master Branch
> --------------------------------------------
>
>                 Key: AIRFLOW-602
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-602
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: tests
>         Environment: Mac
>            Reporter: Siddharth
>            Assignee: Gerardo Curiel
>            Priority: Major
>
> Trying to run test cases in the master branch.
> I am trying to run the Airflow unit tests on my Mac, but I get this error:
> ERROR: Failure: OperationalError ((sqlite3.OperationalError) no such table:
> task_instance [SQL: u'DELETE FROM task_instance WHERE task_instance.dag_id =
> ?'] [parameters: ('unit_tests',)])
> I basically checked out the master version of the repo and ran run_unit_tests.sh.
> Any idea what's going on? It looks like the table task_instance is not created
> while running the test case file.
[jira] [Updated] (AIRFLOW-87) Setup a development environment
[ https://issues.apache.org/jira/browse/AIRFLOW-87?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kaxil Naik updated AIRFLOW-87:
------------------------------
    Description:
1. Add developer guide, for example: how to build, create & deploy an artifact.
2. Add a vagrant file.
3. Run unit tests

  was:
1. Add developer guide, for example: how to build, create & deploy an artifact.
2. Add a vagrant file.

> Setup a development environment
> -------------------------------
>
>                 Key: AIRFLOW-87
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-87
>             Project: Apache Airflow
>          Issue Type: Task
>          Components: docs
>            Reporter: Amikam Snir
>            Assignee: Gerardo Curiel
>            Priority: Minor
>              Labels: documentation
>
> 1. Add developer guide, for example: how to build, create & deploy an artifact.
> 2. Add a vagrant file.
> 3. Run unit tests
[jira] [Closed] (AIRFLOW-1042) Easy Unit Testing with Docker
[ https://issues.apache.org/jira/browse/AIRFLOW-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kaxil Naik closed AIRFLOW-1042.
-------------------------------
    Resolution: Duplicate

> Easy Unit Testing with Docker
> -----------------------------
>
>                 Key: AIRFLOW-1042
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1042
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: tests
>            Reporter: Joe Schmid
>            Priority: Major
>
> Running Airflow unit tests should be as easy as possible, especially for new
> contributors. This is challenging due to the various external components
> (MySQL, Postgres, etc.) that unit tests depend on.
> This improvement uses Docker and Docker Compose to give contributors a very
> simple way to create a ready-made environment for unit testing. Running unit
> tests will be a two-step process:
> > docker-compose up -d
> > ./scripts/docker/unittest/run.sh
> This use of Docker could also be expanded in the future to run all of the tox
> tests like Travis does.
[jira] [Resolved] (AIRFLOW-507) Use Travis' trusty environment for CI
[ https://issues.apache.org/jira/browse/AIRFLOW-507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kaxil Naik resolved AIRFLOW-507.
--------------------------------
    Resolution: Fixed

> Use Travis' trusty environment for CI
> --------------------------------------
>
>                 Key: AIRFLOW-507
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-507
>             Project: Apache Airflow
>          Issue Type: Improvement
>            Reporter: Bolke de Bruin
>            Priority: Major
>
> Travis' trusty build environment is more up to date and more flexible in the
> way we do integration testing. It will allow for kerberos and celery testing.
[jira] [Assigned] (AIRFLOW-507) Use Travis' trusty environment for CI
[ https://issues.apache.org/jira/browse/AIRFLOW-507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kaxil Naik reassigned AIRFLOW-507:
----------------------------------
    Assignee: (was: Gerardo Curiel)

> Use Travis' trusty environment for CI
> --------------------------------------
>
>                 Key: AIRFLOW-507
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-507
>             Project: Apache Airflow
>          Issue Type: Improvement
>            Reporter: Bolke de Bruin
>            Priority: Major
>
> Travis' trusty build environment is more up to date and more flexible in the
> way we do integration testing. It will allow for kerberos and celery testing.
[jira] [Assigned] (AIRFLOW-1042) Easy Unit Testing with Docker
[ https://issues.apache.org/jira/browse/AIRFLOW-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kaxil Naik reassigned AIRFLOW-1042:
-----------------------------------
    Assignee: (was: Gerardo Curiel)

> Easy Unit Testing with Docker
> -----------------------------
>
>                 Key: AIRFLOW-1042
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1042
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: tests
>            Reporter: Joe Schmid
>            Priority: Major
>
> Running Airflow unit tests should be as easy as possible, especially for new
> contributors. This is challenging due to the various external components
> (MySQL, Postgres, etc.) that unit tests depend on.
> This improvement uses Docker and Docker Compose to give contributors a very
> simple way to create a ready-made environment for unit testing. Running unit
> tests will be a two-step process:
> > docker-compose up -d
> > ./scripts/docker/unittest/run.sh
> This use of Docker could also be expanded in the future to run all of the tox
> tests like Travis does.
[jira] [Resolved] (AIRFLOW-13) Migrate Travis CI to Apache repo
[ https://issues.apache.org/jira/browse/AIRFLOW-13?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kaxil Naik resolved AIRFLOW-13.
-------------------------------
    Resolution: Fixed

We have moved to GitBox.

> Migrate Travis CI to Apache repo
> --------------------------------
>
>                 Key: AIRFLOW-13
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-13
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: project-management
>            Reporter: Chris Riccomini
>            Assignee: Jeremiah Lowin
>            Priority: Major
[jira] [Commented] (AIRFLOW-2950) Running Airflow behind a proxy
[ https://issues.apache.org/jira/browse/AIRFLOW-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591487#comment-16591487 ]

Ash Berlin-Taylor commented on AIRFLOW-2950:
--------------------------------------------

Setting environment variables is part of how you run Airflow, not an aspect of Airflow's own configuration. I.e. if you are using systemd, this goes in the {{.service}} file. I don't think we need special code for this, but documentation to make it clear would be appreciated.

> Running Airflow behind a proxy
> ------------------------------
>
>                 Key: AIRFLOW-2950
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2950
>             Project: Apache Airflow
>          Issue Type: Improvement
>            Reporter: Shivakumar Gopalakrishnan
>            Assignee: Shivakumar Gopalakrishnan
>            Priority: Minor
>
> Airflow with a Celery+SQS configuration does not work behind a proxy.
> It would be nice to add the variables http_proxy, https_proxy & no_proxy as part
> of the core configuration so that they can be used internally.
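Ash's suggestion can be illustrated with a systemd unit fragment. This is a hypothetical sketch: the file path, proxy host, and `ExecStart` command are placeholder values, not part of any Airflow release or its documentation:

```ini
# /etc/systemd/system/airflow-scheduler.service -- illustrative fragment only
[Service]
# Proxy settings belong to the process environment, not to airflow.cfg.
Environment="http_proxy=http://proxy.example.com:3128"
Environment="https_proxy=http://proxy.example.com:3128"
Environment="no_proxy=localhost,127.0.0.1"
ExecStart=/usr/local/bin/airflow scheduler
```

Client libraries inside the scheduler process that honor the standard proxy variables would then pick these up without any Airflow-specific configuration code.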
[GitHub] nathadfield opened a new pull request #3800: Adding King.com to the list of companies.
nathadfield opened a new pull request #3800: Adding King.com to the list of companies.
URL: https://github.com/apache/incubator-airflow/pull/3800
[jira] [Assigned] (AIRFLOW-2950) Running Airflow behind a proxy
[ https://issues.apache.org/jira/browse/AIRFLOW-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shivakumar Gopalakrishnan reassigned AIRFLOW-2950:
--------------------------------------------------
    Assignee: Shivakumar Gopalakrishnan

> Running Airflow behind a proxy
> ------------------------------
>
>                 Key: AIRFLOW-2950
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2950
>             Project: Apache Airflow
>          Issue Type: Improvement
>            Reporter: Shivakumar Gopalakrishnan
>            Assignee: Shivakumar Gopalakrishnan
>            Priority: Minor
>
> Airflow with a Celery+SQS configuration does not work behind a proxy.
> It would be nice to add the variables http_proxy, https_proxy & no_proxy as part
> of the core configuration so that they can be used internally.
[GitHub] nathadfield closed pull request #3791: Adding King.com to the list of companies.
nathadfield closed pull request #3791: Adding King.com to the list of companies.
URL: https://github.com/apache/incubator-airflow/pull/3791

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

diff --git a/README.md b/README.md
index 11e25a7bdf..ad88820ee8 100644
--- a/README.md
+++ b/README.md
@@ -184,6 +184,7 @@ Currently **officially** using Airflow:
 1. [JobTeaser](https://www.jobteaser.com) [[@stefani75](https://github.com/stefani75) & [@knil-sama](https://github.com/knil-sama)]
 1. [Kalibrr](https://www.kalibrr.com/) [[@charlesverdad](https://github.com/charlesverdad)]
 1. [Karmic](https://karmiclabs.com) [[@hyw](https://github.com/hyw)]
+1. [King.com](https://www.king.com) [[@nathadfield](https://github.com/nathadfield)]
 1. [Kiwi.com](https://kiwi.com/) [[@underyx](https://github.com/underyx)]
 1. [Kogan.com](https://github.com/kogan) [[@geeknam](https://github.com/geeknam)]
 1. [KPN B.V.](https://www.kpn.com/) [[@biyanisuraj](https://github.com/biyanisuraj) & [@gmic](https://github.com/gmic)]
[jira] [Updated] (AIRFLOW-2950) Running Airflow behind a proxy
[ https://issues.apache.org/jira/browse/AIRFLOW-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shivakumar Gopalakrishnan updated AIRFLOW-2950:
-----------------------------------------------
    Issue Type: Improvement  (was: Bug)

> Running Airflow behind a proxy
> ------------------------------
>
>                 Key: AIRFLOW-2950
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2950
>             Project: Apache Airflow
>          Issue Type: Improvement
>            Reporter: Shivakumar Gopalakrishnan
>            Priority: Minor
>
> Airflow with a Celery+SQS configuration does not work behind a proxy.
> It would be nice to add the variables http_proxy, https_proxy & no_proxy as part
> of the core configuration so that they can be used internally.
[GitHub] cinhil commented on issue #3747: [AIRFLOW-2895] Prevent scheduler from spamming heartbeats/logs
cinhil commented on issue #3747: [AIRFLOW-2895] Prevent scheduler from spamming heartbeats/logs
URL: https://github.com/apache/incubator-airflow/pull/3747#issuecomment-415715873

Hello, do you intend to merge (backport) this onto the v1-10-stable branch, perhaps aiming for a 1.10.1 later? This issue blocks us from pushing 1.10 to production.
[GitHub] msumit commented on issue #3791: Adding King.com to the list of companies.
msumit commented on issue #3791: Adding King.com to the list of companies.
URL: https://github.com/apache/incubator-airflow/pull/3791#issuecomment-415711720

@nathadfield, please squash-merge your commits into one.
[GitHub] Fokko commented on issue #3596: [AIRFLOW-2747] Explicit re-schedule of sensors
Fokko commented on issue #3596: [AIRFLOW-2747] Explicit re-schedule of sensors
URL: https://github.com/apache/incubator-airflow/pull/3596#issuecomment-415710523

Maybe we could even make rescheduling the default behaviour for Airflow 2.0 and get rid of the blocking tasks. That would also simplify the code/logic.
[GitHub] Fokko commented on issue #3760: [AIRFLOW-2909] Deprecate airflow.operators.sensors module
Fokko commented on issue #3760: [AIRFLOW-2909] Deprecate airflow.operators.sensors module
URL: https://github.com/apache/incubator-airflow/pull/3760#issuecomment-415710635

Makes sense, @tedmiston. Thanks!
[GitHub] Fokko commented on issue #3596: [AIRFLOW-2747] Explicit re-schedule of sensors
Fokko commented on issue #3596: [AIRFLOW-2747] Explicit re-schedule of sensors URL: https://github.com/apache/incubator-airflow/pull/3596#issuecomment-415709674 @seelmann Can you rebase onto master?
[GitHub] Fokko commented on issue #3783: Fix URI scheme selection
Fokko commented on issue #3783: Fix URI scheme selection URL: https://github.com/apache/incubator-airflow/pull/3783#issuecomment-415709019 Thanks @matt2000. Can you also add a test to cover this?
[GitHub] Fokko commented on issue #3777: [AIRFLOW-XXX] Add Flipp to list of Airflow users
Fokko commented on issue #3777: [AIRFLOW-XXX] Add Flipp to list of Airflow users URL: https://github.com/apache/incubator-airflow/pull/3777#issuecomment-415709097 Welcome @sethwilsonwishabi
[GitHub] Fokko commented on a change in pull request #3697: [AIRFLOW-2854] kubernetes_pod_operator add more configuration items
Fokko commented on a change in pull request #3697: [AIRFLOW-2854] kubernetes_pod_operator add more configuration items URL: https://github.com/apache/incubator-airflow/pull/3697#discussion_r212574450
## File path: tests/contrib/minikube/test_kubernetes_pod_operator.py
@@ -199,6 +213,24 @@ def test_faulty_image(self):
         print("exception: {}".format(cm))

+    def test_faulty_service_account(self):
+        bad_service_account_name = "foobar"
+        k = KubernetesPodOperator(
+            namespace='default',
+            image="ubuntu:16.04",
+            cmds=["bash", "-cx"],
+            arguments=["echo 10"],
+            labels={"foo": "bar"},
+            name="test",
+            task_id="task",
+            startup_timeout_seconds=5,
+            service_account_name=bad_service_account_name
+        )
+        with self.assertRaises(ApiException) as cm:
+            k.execute(None),
+
+        print("exception: {}".format(cm))
Review comment: Can you remove the print?
[GitHub] Fokko commented on a change in pull request #3697: [AIRFLOW-2854] kubernetes_pod_operator add more configuration items
Fokko commented on a change in pull request #3697: [AIRFLOW-2854] kubernetes_pod_operator add more configuration items URL: https://github.com/apache/incubator-airflow/pull/3697#discussion_r212574502
## File path: tests/contrib/minikube/test_kubernetes_pod_operator.py
@@ -199,6 +213,24 @@ def test_faulty_image(self):
         print("exception: {}".format(cm))

+    def test_faulty_service_account(self):
+        bad_service_account_name = "foobar"
+        k = KubernetesPodOperator(
+            namespace='default',
+            image="ubuntu:16.04",
+            cmds=["bash", "-cx"],
+            arguments=["echo 10"],
+            labels={"foo": "bar"},
+            name="test",
+            task_id="task",
+            startup_timeout_seconds=5,
+            service_account_name=bad_service_account_name
+        )
+        with self.assertRaises(ApiException) as cm:
Review comment: Please remove the `as cm` since it is not used.
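Both review comments amount to the same unittest idiom: when the exception object is not inspected, `assertRaises` can be used as a bare context manager, with no `as cm` binding and no print. A minimal self-contained sketch of that idiom (the `FailingOperator` class is hypothetical, standing in for a `KubernetesPodOperator` with a bad service account):

```python
import unittest


class FailingOperator:
    """Hypothetical stand-in for an operator whose execute() always raises,
    e.g. because the configured service account does not exist."""

    def execute(self, context):
        raise RuntimeError("service account 'foobar' not found")


class TestFaultyServiceAccount(unittest.TestCase):
    def test_faulty_service_account(self):
        k = FailingOperator()
        # No `as cm` binding and no print(): we only assert that it raises.
        with self.assertRaises(RuntimeError):
            k.execute(None)
```
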
[GitHub] barrachri commented on issue #3799: [AIRFLOW-2665] Use shlex to split args and remove shell true
barrachri commented on issue #3799: [AIRFLOW-2665] Use shlex to split args and remove shell true URL: https://github.com/apache/incubator-airflow/pull/3799#issuecomment-415707951 I have a small doubt: what is the proper format of a `command`?
```
def execute_work(self, key, command):
    """
    Executes command received and stores result state in queue.
    :param key: the key to identify the TI
    :type key: Tuple(dag_id, task_id, execution_date)
    :param command: the command to execute
    :type command: string
    """
```
Looking at the code, it seems it should be a string... but that looks like a small lie, because the tests were passing a `list` (this is why I added the `isinstance` check).
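The ambiguity described above, a `command` documented as a string but sometimes passed as a list, is exactly what `shlex.split` plus an `isinstance` check resolves. A minimal sketch of the idea (the helper name `normalize_command` is made up for illustration, not taken from the PR):

```python
import shlex


def normalize_command(command):
    """Return the command as an argv list suitable for subprocess without
    shell=True. Accepts either a command string or an already-split list."""
    if isinstance(command, str):
        # shlex.split tokenizes like a POSIX shell, respecting quoting.
        return shlex.split(command)
    return list(command)


# A string is tokenized, with quoted arguments kept intact:
print(normalize_command("airflow run DAG_1 my_task '2018-06-22T21:05:19'"))
# A list is passed through unchanged:
print(normalize_command(["airflow", "run", "DAG_1"]))
```
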
[GitHub] Fokko commented on issue #3794: [AIRFLOW-XXX] Added G Adventures to Users
Fokko commented on issue #3794: [AIRFLOW-XXX] Added G Adventures to Users URL: https://github.com/apache/incubator-airflow/pull/3794#issuecomment-415705921 Welcome aboard @samuelmullin
[GitHub] barrachri opened a new pull request #3799: [AIRFLOW-2665] Use shlex to split args and remove shell true
barrachri opened a new pull request #3799: [AIRFLOW-2665] Use shlex to split args and remove shell true URL: https://github.com/apache/incubator-airflow/pull/3799 Make sure you have checked _all_ steps below. ### Jira - [ ] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR" - https://issues.apache.org/jira/browse/AIRFLOW-XXX - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a Jira issue. ### Description - [ ] Here are some details about my PR, including screenshots of any UI changes: ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [ ] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 1. Subject is limited to 50 characters (not including Jira issue reference) 1. Subject does not end with a period 1. Subject uses the imperative mood ("add", not "adding") 1. Body wraps at 72 characters 1. Body explains "what" and "why", not "how" ### Documentation - [ ] In case of new functionality, my PR adds documentation that describes how to use it. - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added. ### Code Quality - [ ] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
[jira] [Commented] (AIRFLOW-2665) No BASH will cause the dag to fail
[ https://issues.apache.org/jira/browse/AIRFLOW-2665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591387#comment-16591387 ] ASF GitHub Bot commented on AIRFLOW-2665: - barrachri opened a new pull request #3799: [AIRFLOW-2665] Use shlex to split args and remove shell true URL: https://github.com/apache/incubator-airflow/pull/3799
> No BASH will cause the dag to fail
> Key: AIRFLOW-2665
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2665
> Project: Apache Airflow
> Issue Type: Bug
> Reporter: Christian Barra
> Priority: Major
> Labels: easy-fix
>
> If you are running Airflow in a system where bash is not available, the dags will fail, with no logs inside the UI (you have to scroll through the local logs).
> How to replicate this? Just use an alpine-based image:
> ```
> [2018-06-22 21:05:20,659] {jobs.py:1386} INFO - Processing DAG_1
> [2018-06-22 21:05:20,667] {local_executor.py:43} INFO - LocalWorker running airflow run DAG_1 stackoverflow 2018-06-22T21:05:19.384402 --local -sd /usr/local/airflow/dags/my_dag.py
> /bin/sh: exec: line 1: bash: not found
> [2018-06-22 21:05:20,671] {local_executor.py:50} ERROR - Failed to execute task Command 'exec bash -c 'airflow run DAG_1 my_task 2018-06-22T21:05:19.384402 --local -sd /usr/local/airflow/dags/my_dag.py'' returned non-zero exit status 127.
> /bin/sh: exec: line 1: bash: not found
> ```
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
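The failure in the log above comes from wrapping every task command in `exec bash -c '...'`: on an image without bash, `/bin/sh` cannot find it and exits with status 127. Running the split argument list directly, without any shell, removes the bash dependency entirely. A hedged sketch of that approach (not the actual LocalExecutor code; `run_task` is an illustrative name):

```python
import shlex
import subprocess
import sys


def run_task(command):
    """Run a task command without invoking bash (or any shell).

    `command` may be a plain string or a list; a string is tokenized with
    shlex.split into an argv list so subprocess can exec it directly, which
    works even on shell-less or bash-less images such as alpine.
    """
    args = shlex.split(command) if isinstance(command, str) else list(command)
    return subprocess.run(args).returncode


# Portable demo: invoke the current Python interpreter instead of bash.
rc = run_task(shlex.quote(sys.executable) + ' -c "print(123)"')
```
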
[GitHub] bolkedebruin commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + docker-compose
bolkedebruin commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + docker-compose URL: https://github.com/apache/incubator-airflow/pull/3797#issuecomment-415702880 Travis sometimes has this issue. Just make "sudo rm -rf " part of the script for one time. Also, don't rely on "Travis" as a user; use the right env var instead.
[GitHub] dimberman commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + docker-compose
dimberman commented on issue #3797: [AIRFLOW-2952] Splits CI into k8s + docker-compose URL: https://github.com/apache/incubator-airflow/pull/3797#issuecomment-415696995 @gerardo No luck with that. Any other potential culprits?
[GitHub] betabandido commented on issue #3570: [AIRFLOW-2709] Improve error handling in Databricks hook
betabandido commented on issue #3570: [AIRFLOW-2709] Improve error handling in Databricks hook URL: https://github.com/apache/incubator-airflow/pull/3570#issuecomment-415693599 In the end I managed to reproduce the error. The HTTP status code is 503. See log:
```
> GET /api/2.0/jobs/list HTTP/1.1
> Host: northeurope.azuredatabricks.net
... (omitted output)
HTTP/1.1 503 Service Temporarily Unavailable
... (omitted output)
{"error_code":"TEMPORARILY_UNAVAILABLE","message":""}
```
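A 503 with `TEMPORARILY_UNAVAILABLE` is a classic candidate for retrying with backoff rather than failing the task outright. A minimal, dependency-free sketch of such a retry loop (the function and parameter names are illustrative, not the Databricks hook's actual API):

```python
import time


class TemporarilyUnavailable(Exception):
    """Raised when the service still returns 503 after all retries."""


def call_with_retries(do_request, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call `do_request` (which returns a (status, body) pair), retrying on
    HTTP 503 with exponential backoff; raise after the final failed attempt."""
    for attempt in range(retries):
        status, body = do_request()
        if status != 503:
            return status, body
        if attempt < retries - 1:
            # Back off 1s, 2s, 4s, ... before the next attempt.
            sleep(base_delay * (2 ** attempt))
    raise TemporarilyUnavailable(body)


# Demo with a fake requester: two 503s, then success (sleep is stubbed out).
responses = iter([(503, "busy"), (503, "busy"), (200, "ok")])
status, body = call_with_retries(lambda: next(responses), sleep=lambda s: None)
```

Injecting `sleep` as a parameter keeps the backoff testable without real delays.
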
[GitHub] nathadfield commented on issue #3791: Adding King.com to the list of companies.
nathadfield commented on issue #3791: Adding King.com to the list of companies. URL: https://github.com/apache/incubator-airflow/pull/3791#issuecomment-41561 Sure. All done.
[jira] [Created] (AIRFLOW-2954) Change states to a set
Fokko Driesprong created AIRFLOW-2954: - Summary: Change states to a set Key: AIRFLOW-2954 URL: https://issues.apache.org/jira/browse/AIRFLOW-2954 Project: Apache Airflow Issue Type: Bug Reporter: Fokko Driesprong For performance reasons and for general convenience, we should model the states as a set instead of an array, since they are always unique.
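The motivation is straightforward: membership tests on a Python `set` are O(1) on average versus O(n) for a list, and a set makes the uniqueness of states explicit. A small standalone illustration (the state names mirror Airflow's, but this is not the project's actual code):

```python
# States as a list: duplicates are possible and membership is a linear scan.
states_list = ["success", "running", "failed", "success"]

# States as a set: uniqueness is enforced and membership is a hash lookup.
states_set = {"success", "running", "failed"}

assert "running" in states_set      # O(1) average-case membership test
assert len(states_set) == 3         # duplicates collapse automatically

# Set algebra also reads naturally for scheduler-style checks:
finished = {"success", "failed"}
assert finished <= states_set       # all finished states are known states
```
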