[jira] [Updated] (AIRFLOW-3169) Indicate in the main UI if the scheduler is NOT working.
[ https://issues.apache.org/jira/browse/AIRFLOW-3169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jack updated AIRFLOW-3169: -- Labels: (was: easy-fix) > Indicate in the main UI if the scheduler is NOT working. > > > Key: AIRFLOW-3169 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3169 > Project: Apache Airflow > Issue Type: Improvement > Affects Versions: 1.10.0 > Reporter: jack > Priority: Major > > I came to work today and took a look at the Airflow UI. > Everything was green (success) - it took me a while to notice that the dates > of the tasks were from Thursday. The scheduler had been offline the whole weekend. > Tasks only began to run once I restarted the scheduler. I don't know why > the scheduler stopped, but it would be great if the UI indicated on the > main screen when the scheduler is offline. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
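A minimal sketch of the check such an indicator would need: compare the newest scheduler heartbeat (Airflow records one on the SchedulerJob row in the metadata DB) against a staleness threshold. The 60-second threshold and the helper function are illustrative assumptions, not Airflow's actual code:

```python
from datetime import datetime, timedelta

# Illustrative threshold, not an Airflow 1.10.0 setting.
STALE_AFTER = timedelta(seconds=60)

def scheduler_is_healthy(latest_heartbeat, now=None):
    """True if the newest scheduler heartbeat is fresh enough to call 'alive'.

    latest_heartbeat: datetime of the most recent SchedulerJob heartbeat,
    or None if no scheduler has ever run.
    """
    if latest_heartbeat is None:
        return False
    now = now or datetime.utcnow()
    return (now - latest_heartbeat) <= STALE_AFTER

# A heartbeat from last Thursday is flagged on Monday morning:
assert not scheduler_is_healthy(datetime(2018, 10, 11, 18, 0),
                                now=datetime(2018, 10, 15, 9, 0))
```

The UI would only need to render a banner whenever this returns False, which covers exactly the all-green-but-stale situation described above.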
[jira] [Updated] (AIRFLOW-3169) Indicate in the main UI if the scheduler is NOT working.
[ https://issues.apache.org/jira/browse/AIRFLOW-3169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jack updated AIRFLOW-3169: -- Affects Version/s: 1.10.0 Labels: easy-fix (was: ) > Indicate in the main UI if the scheduler is NOT working. > > > Key: AIRFLOW-3169 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3169 > Project: Apache Airflow > Issue Type: Improvement > Affects Versions: 1.10.0 > Reporter: jack > Priority: Major > > I came to work today and took a look at the Airflow UI. > Everything was green (success) - it took me a while to notice that the dates > of the tasks were from Thursday. The scheduler had been offline the whole weekend. > Tasks only began to run once I restarted the scheduler. I don't know why > the scheduler stopped, but it would be great if the UI indicated on the > main screen when the scheduler is offline.
[jira] [Updated] (AIRFLOW-3158) Improve error message for Broken DAG by adding function name
[ https://issues.apache.org/jira/browse/AIRFLOW-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jack updated AIRFLOW-3158: -- Affects Version/s: (was: 1.9.0) 1.10.0 Labels: easy-fix (was: ) > Improve error message for Broken DAG by adding function name > > > Key: AIRFLOW-3158 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3158 > Project: Apache Airflow > Issue Type: Task > Affects Versions: 1.10.0 > Reporter: jack > Priority: Trivial > Labels: easy-fix > > The following message appears: > {color:#a94442}Broken DAG: [/home/ubuntu/airflow/dags/a_dag.py] Relationships > can only be set between Operators; received function{color} > when writing > {code:java} > A >> B{code} > > where A is an operator and B is a function. > The error message could be improved to: > {color:#a94442}Broken DAG: [/home/ubuntu/airflow/dags/a_dag.py] Relationships > can only be set between Operators; received function B{color} > This is a small change that makes the error user-friendly and specifies > exactly where the issue is.
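A self-contained sketch of the proposed change (this is not Airflow's actual models.py code): pull the offending object's `__name__` into the message so the user sees which function broke the DAG.

```python
# Minimal stand-in for an Airflow operator, showing only how `A >> B`
# could report the name of a non-operator right-hand side.

class Operator:
    def __init__(self, task_id):
        self.task_id = task_id

    def __rshift__(self, other):  # implements A >> B
        if not isinstance(other, Operator):
            # Include the object's name (functions carry __name__).
            name = getattr(other, "__name__", repr(other))
            raise TypeError(
                "Relationships can only be set between Operators; "
                "received {} {}".format(type(other).__name__, name)
            )
        return other

def B():
    pass

A = Operator("a")
try:
    A >> B
except TypeError as exc:
    print(exc)  # → Relationships can only be set between Operators; received function B
```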
[jira] [Created] (AIRFLOW-3214) [Kubernetes] No logs, hard to debug
Zhi Lin created AIRFLOW-3214: Summary: [Kubernetes] No logs, hard to debug Key: AIRFLOW-3214 URL: https://issues.apache.org/jira/browse/AIRFLOW-3214 Project: Apache Airflow Issue Type: Bug Components: kubernetes Affects Versions: 1.10.0 Reporter: Zhi Lin I have a fresh install of Airflow on Kubernetes, basically running the scripts in scripts/ci/kubernetes, and I pinned the flask-appbuilder version so the webserver runs fine. But when I try to run the example-kubernetes-executor DAG, it fails and shows nothing useful in the log. I then tried to pull the airflow/ci image manually, which also fails: it says there is no such image, or that a login is needed. I tried changing the image in the DAG to one I already have locally, but it still fails, and this time there is no log at all... I also wonder: if I want to use KubernetesPodOperator, can I not run Airflow in a k8s pod? It tries to find the k8s config files... So what should I do? What I want is to run a machine learning task in a specified image. Sometimes I need to run a Python script in that spawned pod, so I guess I need to put that file in the DAG folder? Any suggestion appreciated.
[GitHub] codecov-io edited a comment on issue #3656: [AIRFLOW-2803] Fix all ESLint issues
codecov-io edited a comment on issue #3656: [AIRFLOW-2803] Fix all ESLint issues URL: https://github.com/apache/incubator-airflow/pull/3656#issuecomment-408503531 # [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr=h1) Report > Merging [#3656](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/b8be322d3badfeadfa8f08e0bf92a12a6cd26418?src=pr=desc) will **increase** coverage by `1.71%`. > The diff coverage is `n/a`. [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/3656/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr=tree)

```diff
@@            Coverage Diff             @@
##           master    #3656     +/-   ##
=========================================
+ Coverage   75.79%   77.51%   +1.71%
=========================================
  Files         199      205       +6
  Lines       15946    15751     -195
=========================================
+ Hits        12086    12209     +123
+ Misses       3860     3542     -318
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr=tree) | Coverage Δ | |
|---|---|---|
| [airflow/sensors/s3\_key\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3MzX2tleV9zZW5zb3IucHk=) | `31.03% <0%> (-68.97%)` | :arrow_down: |
| [airflow/sensors/s3\_prefix\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3MzX3ByZWZpeF9zZW5zb3IucHk=) | `41.17% <0%> (-58.83%)` | :arrow_down: |
| [airflow/utils/helpers.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9oZWxwZXJzLnB5) | `67.07% <0%> (-17.31%)` | :arrow_down: |
| [airflow/hooks/mysql\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9teXNxbF9ob29rLnB5) | `78% <0%> (-12%)` | :arrow_down: |
| [airflow/sensors/sql\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3NxbF9zZW5zb3IucHk=) | `90.47% <0%> (-9.53%)` | :arrow_down: |
| [airflow/utils/sqlalchemy.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zcWxhbGNoZW15LnB5) | `73.91% <0%> (-7.52%)` | :arrow_down: |
| [airflow/configuration.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree#diff-YWlyZmxvdy9jb25maWd1cmF0aW9uLnB5) | `83.95% <0%> (-5.31%)` | :arrow_down: |
| [airflow/utils/state.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zdGF0ZS5weQ==) | `93.33% <0%> (-3.34%)` | :arrow_down: |
| [airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=) | `88.54% <0%> (-3.18%)` | :arrow_down: |
| [airflow/www\_rbac/utils.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree#diff-YWlyZmxvdy93d3dfcmJhYy91dGlscy5weQ==) | `66.21% <0%> (-2.73%)` | :arrow_down: |
| ... and [74 more](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree-more) | | |

-- [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr=continue). > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta) > `Δ = absolute (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr=footer). Last update [b8be322...65f3b96](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (AIRFLOW-3213) Create ADLS to GCS operator
Brandon Kvarda created AIRFLOW-3213: --- Summary: Create ADLS to GCS operator Key: AIRFLOW-3213 URL: https://issues.apache.org/jira/browse/AIRFLOW-3213 Project: Apache Airflow Issue Type: Improvement Components: gcp, operators Reporter: Brandon Kvarda Assignee: Brandon Kvarda Create ADLS to GCS operator that supports copying of files from ADLS to GCS
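The core of such an operator is a list/download/upload loop. The hooks below are stand-ins so the sketch runs anywhere; in Airflow the real classes would be the contrib AzureDataLakeHook and GoogleCloudStorageHook, and the function body is an assumption about the design, not the actual PR:

```python
import os
import tempfile

def copy_adls_to_gcs(adls_hook, gcs_hook, adls_path, gcs_bucket, gcs_prefix):
    """Download every file matching adls_path, then upload it under gcs_prefix."""
    copied = []
    for remote_file in adls_hook.list(adls_path):
        tmp = tempfile.NamedTemporaryFile(delete=False)
        tmp.close()
        try:
            adls_hook.download_file(tmp.name, remote_file)
            dest = gcs_prefix + "/" + os.path.basename(remote_file)
            gcs_hook.upload(gcs_bucket, dest, tmp.name)
            copied.append(dest)
        finally:
            os.unlink(tmp.name)  # never leave temp files behind
    return copied

# Stub hooks so the sketch runs without Azure or GCP credentials:
class FakeADLS:
    def list(self, path):
        return ["in/data/a.csv", "in/data/b.csv"]
    def download_file(self, local_path, remote_path):
        with open(local_path, "w") as f:
            f.write("stub")

class FakeGCS:
    def __init__(self):
        self.uploaded = []
    def upload(self, bucket, obj, filename):
        self.uploaded.append((bucket, obj))

gcs = FakeGCS()
assert copy_adls_to_gcs(FakeADLS(), gcs, "in/data/*", "my-bucket", "raw") == \
    ["raw/a.csv", "raw/b.csv"]
```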
[jira] [Updated] (AIRFLOW-3211) Airflow losing track of running GCP Dataproc jobs upon Airflow restart
[ https://issues.apache.org/jira/browse/AIRFLOW-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julie Chien updated AIRFLOW-3211: - Description: If Airflow restarts (say, due to deployments, system updates, or regular machine restarts such as the weekly restarts in GCP App Engine) while it's running a job on GCP Dataproc, it'll lose track of that job, mark the task as failed, and eventually retry. However, the jobs may still be running on Dataproc and maybe even finish successfully. So when Airflow retries and reruns the job, the same job will run twice. This can result in issues like delayed workflows, increased costs, and duplicate data. To reproduce: # Install Airflow and set up a GCP project that has Dataproc enabled. Create a bucket in the GCP project. # Install this DAG in the Airflow instance: [https://github.com/GoogleCloudPlatform/python-docs-samples/blob/b80895ed88ba86fce223df27a48bf481007ca708/composer/workflows/quickstart.py|https://github.com/GoogleCloudPlatform/python-docs-samples/blob/b80895ed88ba86fce223df27a48bf481007ca708/composer/workflows/quickstart.py.] Set up the Airflow variables as instructed in the comments at the top of the file. # Start the Airflow scheduler and webserver. Kick off a run of the above DAG through the Airflow UI. Wait for the cluster to spin up and the job to start running on Dataproc. # Kill the scheduler and webserver, and then start them back up. # Wait for Airflow to retry the task. Click on the cluster in Dataproc to observe that the job will have been resubmitted, even though the first job is still running without error. At Etsy, we've customized the Dataproc operators to allow for the new Airflow task to pick up where the old one left off upon Airflow restarts, and have been happily using our solution for the past 6 months. I'd like to submit a PR to merge this change upstream. 
was: If Airflow restarts (say, due to deployments, system updates, or regular machine restarts such as the weekly restarts in GCP App Engine) while it's running a job on GCP Dataproc, it'll lose track of that job, mark the task as failed, and eventually retry. However, the jobs may still be running on Dataproc and maybe even finish successfully. So when Airflow retries and reruns the job, the same job will run twice. This can result in issues like delayed workflows, increased costs, and duplicate data. To reproduce: # Install Airflow and set up a GCP project that has Dataproc enabled. Create a bucket in the GCP project. # Install this DAG in the Airflow instance: [https://github.com/GoogleCloudPlatform/python-docs-samples/blob/b80895ed88ba86fce223df27a48bf481007ca708/composer/workflows/quickstart.py.] Set up the Airflow variables as instructed in the comments at the top of the file. # Start the Airflow scheduler and webserver. Kick off a run of the above DAG through the Airflow UI. Wait for the cluster to spin up and the job to start running on Dataproc. # Kill the scheduler and webserver, and then start them back up. # Wait for Airflow to retry the task. Click on the cluster in Dataproc to observe that the job will have been resubmitted, even though the first job is still running without error. At Etsy, we've customized the Dataproc operators to allow for the new Airflow task to pick up where the old one left off upon Airflow restarts, and have been happily using our solution for the past 6 months. I'd like to submit a PR to merge this change upstream. 
> Airflow losing track of running GCP Dataproc jobs upon Airflow restart > -- > > Key: AIRFLOW-3211 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3211 > Project: Apache Airflow > Issue Type: Improvement > Components: gcp >Affects Versions: 1.9.0, 1.10.0 >Reporter: Julie Chien >Assignee: Julie Chien >Priority: Minor > Labels: pull-request-available > Fix For: 1.9.0, 1.10.0 > > > If Airflow restarts (say, due to deployments, system updates, or regular > machine restarts such as the weekly restarts in GCP App Engine) while it's > running a job on GCP Dataproc, it'll lose track of that job, mark the task as > failed, and eventually retry. However, the jobs may still be running on > Dataproc and maybe even finish successfully. So when Airflow retries and > reruns the job, the same job will run twice. This can result in issues like > delayed workflows, increased costs, and duplicate data. > > To reproduce: > # Install Airflow and set up a GCP project that has Dataproc enabled. Create > a bucket in the GCP project. > # Install this DAG in the Airflow instance: >
[jira] [Updated] (AIRFLOW-3211) Airflow losing track of running GCP Dataproc jobs upon Airflow restart
[ https://issues.apache.org/jira/browse/AIRFLOW-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julie Chien updated AIRFLOW-3211: - Description: If Airflow restarts (say, due to deployments, system updates, or regular machine restarts such as the weekly restarts in GCP App Engine) while it's running a job on GCP Dataproc, it'll lose track of that job, mark the task as failed, and eventually retry. However, the jobs may still be running on Dataproc and maybe even finish successfully. So when Airflow retries and reruns the job, the same job will run twice. This can result in issues like delayed workflows, increased costs, and duplicate data. To reproduce: # Install Airflow and set up a GCP project that has Dataproc enabled. Create a bucket in the GCP project. # Install this DAG in the Airflow instance: [https://github.com/GoogleCloudPlatform/python-docs-samples/blob/b80895ed88ba86fce223df27a48bf481007ca708/composer/workflows/quickstart.py.] Set up the Airflow variables as instructed in the comments at the top of the file. # Start the Airflow scheduler and webserver. Kick off a run of the above DAG through the Airflow UI. Wait for the cluster to spin up and the job to start running on Dataproc. # Kill the scheduler and webserver, and then start them back up. # Wait for Airflow to retry the task. Click on the cluster in Dataproc to observe that the job will have been resubmitted, even though the first job is still running without error. At Etsy, we've customized the Dataproc operators to allow for the new Airflow task to pick up where the old one left off upon Airflow restarts, and have been happily using our solution for the past 6 months. I'd like to submit a PR to merge this change upstream. 
was: If Airflow restarts (say, due to deployments, system updates, or regular machine restarts such as the weekly restarts in GCP App Engine) while it's running a job on GCP Dataproc, it'll lose track of that job, mark the task as failed, and eventually retry. However, the jobs may still be running on Dataproc and maybe even finish successfully. So when Airflow retries and reruns the job, the same job will run twice. This can result in issues like delayed workflows, increased costs, and duplicate data. To reproduce: 1. Create a DAG in Airflow that runs a Dataproc job that sleeps for 10 minutes. Wait for the cluster to spin up and the job to start running on Dataproc. 2. SSH into the machine that hosts Airflow and run the following commands to simulate restarting Airflow: {{supervisorctl stop airflow-webserver supervisorctl stop airflow-scheduler ps aux | grep python | grep airflow | awk '\{print $2}' | xargs -r kill -9 supervisorctl restart airflow-scheduler supervisorctl restart airflow-webserver}} 3. Wait for Airflow to retry the task after Supervisor respawns the Airflow processes. Click on the cluster in Dataproc to observe that the job will have been resubmitted, even though the first job is still running without error. At Etsy, we've customized the Dataproc operators to allow for the new Airflow task to pick up where the old one left off upon Airflow restarts, and have been happily using our solution for the past 6 months. I'd like to submit a PR to merge this change upstream. 
> Airflow losing track of running GCP Dataproc jobs upon Airflow restart > -- > > Key: AIRFLOW-3211 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3211 > Project: Apache Airflow > Issue Type: Improvement > Components: gcp >Affects Versions: 1.9.0, 1.10.0 >Reporter: Julie Chien >Assignee: Julie Chien >Priority: Minor > Labels: pull-request-available > Fix For: 1.9.0, 1.10.0 > > > If Airflow restarts (say, due to deployments, system updates, or regular > machine restarts such as the weekly restarts in GCP App Engine) while it's > running a job on GCP Dataproc, it'll lose track of that job, mark the task as > failed, and eventually retry. However, the jobs may still be running on > Dataproc and maybe even finish successfully. So when Airflow retries and > reruns the job, the same job will run twice. This can result in issues like > delayed workflows, increased costs, and duplicate data. > > To reproduce: > # Install Airflow and set up a GCP project that has Dataproc enabled. Create > a bucket in the GCP project. > # Install this DAG in the Airflow instance: > [https://github.com/GoogleCloudPlatform/python-docs-samples/blob/b80895ed88ba86fce223df27a48bf481007ca708/composer/workflows/quickstart.py.] > Set up the Airflow variables as instructed in the comments at the top of the > file. > # Start the Airflow scheduler and webserver. Kick off a run of the above DAG > through the Airflow UI.
[GitHub] tswast commented on issue #4003: [AIRFLOW-3163] add operator to enable setting table description in BigQuery table
tswast commented on issue #4003: [AIRFLOW-3163] add operator to enable setting table description in BigQuery table URL: https://github.com/apache/incubator-airflow/pull/4003#issuecomment-430036691 It's also possible to * add columns to a table and make updates to schema descriptions. https://cloud.google.com/bigquery/docs/managing-table-schemas * update the encryption key if the table was already using an encryption key. https://cloud.google.com/bigquery/docs/customer-managed-encryption#change_key Having a separate parameter for each property seems the most consistent with other operators, but I'm not opposed to having a single "table resource" parameter. This is an automated message from the Apache Git Service.
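Both designs tswast weighs converge on the same tables.patch request body. A minimal sketch of the per-parameter approach (the helper itself is hypothetical; the field names follow the BigQuery REST `Table` resource):

```python
# Build a partial Table resource for a tables.patch call: only the
# properties the caller supplies are included, so untouched fields
# on the table are left alone.

def build_table_patch(description=None, schema_fields=None,
                      encryption_kms_key=None):
    body = {}
    if description is not None:
        body["description"] = description
    if schema_fields is not None:
        body["schema"] = {"fields": schema_fields}
    if encryption_kms_key is not None:
        body["encryptionConfiguration"] = {"kmsKeyName": encryption_kms_key}
    return body

patch = build_table_patch(description="Daily sales fact table")
assert patch == {"description": "Daily sales fact table"}
```

The single "table resource" alternative would simply accept the `body` dict directly; the per-parameter form trades that flexibility for signatures that look like the other BigQuery operators.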
[GitHub] codecov-io edited a comment on issue #2824: [AIRFLOW-1867] Fix sendgrid py3k bug; add sandbox mode
codecov-io edited a comment on issue #2824: [AIRFLOW-1867] Fix sendgrid py3k bug; add sandbox mode URL: https://github.com/apache/incubator-airflow/pull/2824#issuecomment-348023606 # [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/2824?src=pr=h1) Report > Merging [#2824](https://codecov.io/gh/apache/incubator-airflow/pull/2824?src=pr=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/a581cbab1f79827ab645d21a9a221f1616cf8984?src=pr=desc) will **decrease** coverage by `0.01%`. > The diff coverage is `n/a`. [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/2824/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/2824?src=pr=tree)

```diff
@@            Coverage Diff             @@
##           master    #2824     +/-   ##
=========================================
- Coverage   75.92%   75.91%   -0.02%
=========================================
  Files         199      199
  Lines       15954    15957       +3
=========================================
  Hits        12113    12113
- Misses       3841     3844       +3
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/2824?src=pr=tree) | Coverage Δ | |
|---|---|---|
| [airflow/bin/cli.py](https://codecov.io/gh/apache/incubator-airflow/pull/2824/diff?src=pr=tree#diff-YWlyZmxvdy9iaW4vY2xpLnB5) | `64.82% <0%> (-0.24%)` | :arrow_down: |

-- [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/2824?src=pr=continue). > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta) > `Δ = absolute (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/2824?src=pr=footer). Last update [a581cba...c1a3c9c](https://codecov.io/gh/apache/incubator-airflow/pull/2824?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[jira] [Created] (AIRFLOW-3212) Add AWS Glue Catalog sensor that behaves like HivePartitionSensor
Michael Mole created AIRFLOW-3212: - Summary: Add AWS Glue Catalog sensor that behaves like HivePartitionSensor Key: AIRFLOW-3212 URL: https://issues.apache.org/jira/browse/AIRFLOW-3212 Project: Apache Airflow Issue Type: New Feature Components: aws Reporter: Michael Mole Please add an AWS Glue Catalog sensor that behaves like the HivePartitionSensor.
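For reference, the poke logic of such a sensor could look like the sketch below. The client is injected as a stub so this runs standalone; with boto3 it would be `boto3.client("glue")`, and real code would catch its specific `EntityNotFoundException` rather than a broad `Exception`:

```python
# Check whether a partition is registered in the Glue Catalog, which is
# the Glue analogue of HivePartitionSensor's metastore check.

def partition_exists(glue_client, database, table, partition_values):
    try:
        glue_client.get_partition(
            DatabaseName=database,
            TableName=table,
            PartitionValues=partition_values,
        )
        return True
    except Exception:  # boto3 raises glue_client.exceptions.EntityNotFoundException
        return False

# Stub client so the sketch runs without AWS credentials:
class FakeGlue:
    def __init__(self, partitions):
        self.partitions = partitions
    def get_partition(self, DatabaseName, TableName, PartitionValues):
        if tuple(PartitionValues) not in self.partitions:
            raise KeyError("EntityNotFoundException")

client = FakeGlue({("2018-10-15",)})
assert partition_exists(client, "sales", "events", ["2018-10-15"])
assert not partition_exists(client, "sales", "events", ["2018-10-16"])
```

A sensor's `poke()` would simply return this boolean once per poke interval.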
[jira] [Updated] (AIRFLOW-3211) Airflow losing track of running GCP Dataproc jobs upon Airflow restart
[ https://issues.apache.org/jira/browse/AIRFLOW-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julie Chien updated AIRFLOW-3211: - Description: If Airflow restarts (say, due to deployments, system updates, or regular machine restarts such as the weekly restarts in GCP App Engine) while it's running a job on GCP Dataproc, it'll lose track of that job, mark the task as failed, and eventually retry. However, the jobs may still be running on Dataproc and maybe even finish successfully. So when Airflow retries and reruns the job, the same job will run twice. This can result in issues like delayed workflows, increased costs, and duplicate data. To reproduce: 1. Create a DAG in Airflow that runs a Dataproc job that sleeps for 10 minutes. Wait for the cluster to spin up and the job to start running on Dataproc. 2. SSH into the machine that hosts Airflow and run the following commands to simulate restarting Airflow: {{supervisorctl stop airflow-webserver supervisorctl stop airflow-scheduler ps aux | grep python | grep airflow | awk '\{print $2}' | xargs -r kill -9 supervisorctl restart airflow-scheduler supervisorctl restart airflow-webserver}} 3. Wait for Airflow to retry the task after Supervisor respawns the Airflow processes. Click on the cluster in Dataproc to observe that the job will have been resubmitted, even though the first job is still running without error. At Etsy, we've customized the Dataproc operators to allow for the new Airflow task to pick up where the old one left off upon Airflow restarts, and have been happily using our solution for the past 6 months. I'd like to submit a PR to merge this change upstream. was: If Airflow restarts (say, due to deployments, system updates, or regular machine restarts such as the weekly restarts in GCP App Engine) while it's running a job on GCP Dataproc, it'll lose track of that job, mark the task as failed, and eventually retry. 
However, the jobs may still be running on Dataproc and maybe even finish successfully. So when Airflow retries and reruns the job, the same job will run twice. To reproduce: 1. Create a DAG in Airflow that runs a Dataproc job that sleeps for 10 minutes. Wait for the cluster to spin up and the job to start running on Dataproc. 2. SSH into the machine that hosts Airflow and run the following commands to simulate restarting Airflow: {{supervisorctl stop airflow-webserver supervisorctl stop airflow-scheduler ps aux | grep python | grep airflow | awk '\{print $2}' | xargs -r kill -9 supervisorctl restart airflow-scheduler supervisorctl restart airflow-webserver}} 3. Wait for Airflow to retry the task after Supervisor respawns the Airflow processes. Click on the cluster in Dataproc to observe that the job will have been resubmitted, even though the first job is still running without error. At Etsy, we've customized the Dataproc operators to allow for the new Airflow task to pick up where the old one left off upon Airflow restarts, and have been happily using our solution for the past 6 months. I'd like to submit a PR to merge this change upstream. > Airflow losing track of running GCP Dataproc jobs upon Airflow restart > -- > > Key: AIRFLOW-3211 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3211 > Project: Apache Airflow > Issue Type: Improvement > Components: gcp >Affects Versions: 1.9.0, 1.10.0 >Reporter: Julie Chien >Assignee: Julie Chien >Priority: Minor > Labels: pull-request-available > Fix For: 1.9.0, 1.10.0 > > > If Airflow restarts (say, due to deployments, system updates, or regular > machine restarts such as the weekly restarts in GCP App Engine) while it's > running a job on GCP Dataproc, it'll lose track of that job, mark the task as > failed, and eventually retry. However, the jobs may still be running on > Dataproc and maybe even finish successfully. So when Airflow retries and > reruns the job, the same job will run twice. 
This can result in issues like > delayed workflows, increased costs, and duplicate data. > > To reproduce: > 1. Create a DAG in Airflow that runs a Dataproc job that sleeps for 10 > minutes. Wait for the cluster to spin up and the job to start running on > Dataproc. > 2. SSH into the machine that hosts Airflow and run the following commands to > simulate restarting Airflow: > {{supervisorctl stop airflow-webserver supervisorctl stop airflow-scheduler > ps aux | grep python | grep airflow | awk '\{print $2}' | xargs -r kill -9 > supervisorctl restart airflow-scheduler supervisorctl restart > airflow-webserver}} > 3. Wait for Airflow to retry the task after Supervisor respawns the Airflow > processes. Click on the cluster in Dataproc to observe that the job
[jira] [Created] (AIRFLOW-3211) Airflow losing track of running GCP Dataproc jobs upon Airflow restart
Julie Chien created AIRFLOW-3211: Summary: Airflow losing track of running GCP Dataproc jobs upon Airflow restart Key: AIRFLOW-3211 URL: https://issues.apache.org/jira/browse/AIRFLOW-3211 Project: Apache Airflow Issue Type: Improvement Components: gcp Affects Versions: 1.10.0, 1.9.0 Reporter: Julie Chien Assignee: Julie Chien Fix For: 1.10.0, 1.9.0 If Airflow restarts (say, due to deployments, system updates, or regular machine restarts such as the weekly restarts in GCP App Engine) while it's running a job on GCP Dataproc, it'll lose track of that job, mark the task as failed, and eventually retry. However, the jobs may still be running on Dataproc and maybe even finish successfully. So when Airflow retries and reruns the job, the same job will run twice. To reproduce: 1. Create a DAG in Airflow that runs a Dataproc job that sleeps for 10 minutes. Wait for the cluster to spin up and the job to start running on Dataproc. 2. SSH into the machine that hosts Airflow and run the following commands to simulate restarting Airflow:
{{supervisorctl stop airflow-webserver}}
{{supervisorctl stop airflow-scheduler}}
{{ps aux | grep python | grep airflow | awk '\{print $2}' | xargs -r kill -9}}
{{supervisorctl restart airflow-scheduler}}
{{supervisorctl restart airflow-webserver}}
3. Wait for Airflow to retry the task after Supervisor respawns the Airflow processes. Click on the cluster in Dataproc to observe that the job will have been resubmitted, even though the first job is still running without error. At Etsy, we've customized the Dataproc operators to allow for the new Airflow task to pick up where the old one left off upon Airflow restarts, and have been happily using our solution for the past 6 months. I'd like to submit a PR to merge this change upstream.
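The reattach behavior described (a retried task adopting the job its predecessor left running) can be sketched as follows. The client is a stand-in for the Dataproc jobs API; the method names are illustrative assumptions, not Etsy's actual patch, though the state strings mirror Dataproc's `JobStatus.State` values:

```python
# Before submitting, look for a still-active job with the same
# deterministic job id and adopt it instead of resubmitting.

ACTIVE_STATES = {"PENDING", "SETUP_DONE", "RUNNING"}

def submit_or_reattach(client, job_id, job_spec):
    existing = client.get_job(job_id)        # None if never submitted
    if existing is not None and existing["state"] in ACTIVE_STATES:
        return job_id                        # adopt the surviving job
    return client.submit(job_id, job_spec)   # otherwise submit fresh

# Stub client so the sketch runs without GCP:
class FakeDataproc:
    def __init__(self):
        self.jobs = {}
    def get_job(self, job_id):
        return self.jobs.get(job_id)
    def submit(self, job_id, spec):
        self.jobs[job_id] = {"state": "RUNNING", "spec": spec}
        return job_id

client = FakeDataproc()
client.jobs["etl-2018-10-15"] = {"state": "RUNNING", "spec": {}}
# A retry after an Airflow restart reattaches instead of resubmitting:
assert submit_or_reattach(client, "etl-2018-10-15", {}) == "etl-2018-10-15"
assert len(client.jobs) == 1
```

The key design point is a job id derived deterministically from the task and execution date, so the retry can find the original submission at all.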
[GitHub] aoen commented on issue #4037: [AIRFLOW-3191] Fix not being able to specify execution_date when creating dagrun
aoen commented on issue #4037: [AIRFLOW-3191] Fix not being able to specify execution_date when creating dagrun URL: https://github.com/apache/incubator-airflow/pull/4037#issuecomment-429951432 The dagrun got created with the correct timezone; it follows the same pattern as the other date elements in the forms.
[GitHub] thesquelched commented on issue #2135: [AIRFLOW-843] Store exceptions on task_instance
thesquelched commented on issue #2135: [AIRFLOW-843] Store exceptions on task_instance URL: https://github.com/apache/incubator-airflow/pull/2135#issuecomment-429951532 @ashb looks good now?
[GitHub] codecov-io commented on issue #4056: [AIRFLOW-3207] option to stop task pushing result to xcom
codecov-io commented on issue #4056: [AIRFLOW-3207] option to stop task pushing result to xcom URL: https://github.com/apache/incubator-airflow/pull/4056#issuecomment-429951253 # [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4056?src=pr=h1) Report > Merging [#4056](https://codecov.io/gh/apache/incubator-airflow/pull/4056?src=pr=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/6097f829ac5a4442180018ed56fa1b695badb131?src=pr=desc) will **increase** coverage by `<.01%`. > The diff coverage is `100%`. [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/4056/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4056?src=pr=tree)

```diff
@@            Coverage Diff             @@
##           master    #4056      +/-   ##
==========================================
+ Coverage   75.89%    75.9%    +<.01%
  Files         199      199
  Lines       15957    15958        +1
==========================================
+ Hits        12111    12113        +2
+ Misses       3846     3845        -1
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/4056?src=pr=tree) | Coverage Δ | |
|---|---|---|
| [airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/4056/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=) | `91.9% <100%> (ø)` | :arrow_up: |
| [airflow/configuration.py](https://codecov.io/gh/apache/incubator-airflow/pull/4056/diff?src=pr=tree#diff-YWlyZmxvdy9jb25maWd1cmF0aW9uLnB5) | `88.84% <0%> (+0.35%)` | :arrow_up: |

-- [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4056?src=pr=continue). > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta) > `Δ = absolute (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4056?src=pr=footer). Last update [6097f82...1b46dd9](https://codecov.io/gh/apache/incubator-airflow/pull/4056?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[jira] [Created] (AIRFLOW-3210) Changing default types in BigQuery Hook breaks BigQuery operator
Siarhei Hushchyn created AIRFLOW-3210: - Summary: Changing default types in BigQuery Hook breaks BigQuery operator Key: AIRFLOW-3210 URL: https://issues.apache.org/jira/browse/AIRFLOW-3210 Project: Apache Airflow Issue Type: Bug Components: contrib, gcp Reporter: Siarhei Hushchyn Changes in the BigQuery Hook break the BigQuery operator's run_query() and all DAGs that rely on the current types (Boolean sentinel or value). The [BigQuery operator sets|https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/operators/bigquery_operator.py#L115-L121]: destination_dataset_table=False, udf_config=False, while the [new BigQuery hook expects|https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/bigquery_hook.py#L645-L650]: (udf_config, 'userDefinedFunctionResources', None, list), (destination_dataset_table, 'destinationTable', None, dict),
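The mismatch reduces to a type check. The helper below is an illustrative reconstruction, not the real hook code, of why a Boolean {{False}} default now raises where the hook expects a {{list}} or {{dict}}, plus one possible backward-compatible shim:

```python
# Illustrative reconstruction of the hook-side validation that the
# operator's legacy defaults (False meaning "not set") now trip over.
# Function names are invented for the example.

def validate_param(value, expected_type):
    """Accept None or an instance of expected_type; reject anything else."""
    if value is not None and not isinstance(value, expected_type):
        raise TypeError(
            "expected %s or None, got %r" % (expected_type.__name__, value))
    return value


def coerce_legacy_default(value):
    """One possible shim: translate the old False sentinel to None."""
    return None if value is False else value
```

Here `validate_param(False, list)` raises TypeError, while `validate_param(coerce_legacy_default(False), list)` passes, which is roughly the shape of a fix that keeps old DAGs working.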
[jira] [Created] (AIRFLOW-3209) return job id on bq operators
Ben Marengo created AIRFLOW-3209: Summary: return job id on bq operators Key: AIRFLOW-3209 URL: https://issues.apache.org/jira/browse/AIRFLOW-3209 Project: Apache Airflow Issue Type: Improvement Components: operators Reporter: Ben Marengo Assignee: Ben Marengo I would like to be able to access the job_id in post_execute().
[jira] [Work started] (AIRFLOW-3207) option to stop task pushing result to xcom
[ https://issues.apache.org/jira/browse/AIRFLOW-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on AIRFLOW-3207 started by Ben Marengo. > option to stop task pushing result to xcom > -- > > Key: AIRFLOW-3207 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3207 > Project: Apache Airflow > Issue Type: Improvement > Components: models, operators >Reporter: Ben Marengo >Assignee: Ben Marengo >Priority: Major > > follows the completion of AIRFLOW-886, and closure (incomplete) of AIRFLOW-888 > i would actually like functionality similar to this, but i dont think it > necessitates the global config flag. > - BaseOperator should have an option to stop a task pushing the return value > of execute() to xcom. > - the default should be to push (preserves backward compat)
[jira] [Commented] (AIRFLOW-3207) option to stop task pushing result to xcom
[ https://issues.apache.org/jira/browse/AIRFLOW-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650504#comment-16650504 ] ASF GitHub Bot commented on AIRFLOW-3207: - marengaz opened a new pull request #4056: AIRFLOW-3207 option to stop task pushing result to xcom URL: https://github.com/apache/incubator-airflow/pull/4056 Make sure you have checked _all_ steps below. ### Jira - [ ] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-3207\] My Airflow PR" - https://issues.apache.org/jira/browse/AIRFLOW-3207 - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-3207\], code changes always need a Jira issue. ### Description - [ ] Here are some details about my PR, including screenshots of any UI changes: ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [ ] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 1. Subject is limited to 50 characters (not including Jira issue reference) 1. Subject does not end with a period 1. Subject uses the imperative mood ("add", not "adding") 1. Body wraps at 72 characters 1. Body explains "what" and "why", not "how" ### Documentation - [ ] In case of new functionality, my PR adds documentation that describes how to use it. - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added. ### Code Quality - [ ] Passes `flake8`
> option to stop task pushing result to xcom > -- > > Key: AIRFLOW-3207 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3207 > Project: Apache Airflow > Issue Type: Improvement > Components: models, operators >Reporter: Ben Marengo >Assignee: Ben Marengo >Priority: Major > > follows the completion of AIRFLOW-886, and closure (incomplete) of AIRFLOW-888 > i would actually like functionality similar to this, but i dont think it > necessitates the global config flag. > - BaseOperator should have an option to stop a task pushing the return value > of execute() to xcom. > - the default should be to push (preserves backward compat)
[GitHub] marengaz opened a new pull request #4056: AIRFLOW-3207 option to stop task pushing result to xcom
marengaz opened a new pull request #4056: AIRFLOW-3207 option to stop task pushing result to xcom URL: https://github.com/apache/incubator-airflow/pull/4056 Make sure you have checked _all_ steps below. ### Jira - [ ] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-3207\] My Airflow PR" - https://issues.apache.org/jira/browse/AIRFLOW-3207 - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-3207\], code changes always need a Jira issue. ### Description - [ ] Here are some details about my PR, including screenshots of any UI changes: ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [ ] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 1. Subject is limited to 50 characters (not including Jira issue reference) 1. Subject does not end with a period 1. Subject uses the imperative mood ("add", not "adding") 1. Body wraps at 72 characters 1. Body explains "what" and "why", not "how" ### Documentation - [ ] In case of new functionality, my PR adds documentation that describes how to use it. - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added. ### Code Quality - [ ] Passes `flake8`
[jira] [Commented] (AIRFLOW-1945) Pass --autoscale to celery workers
[ https://issues.apache.org/jira/browse/AIRFLOW-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650483#comment-16650483 ] ASF GitHub Bot commented on AIRFLOW-1945: - msumit closed pull request #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added URL: https://github.com/apache/incubator-airflow/pull/3989 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

diff --git a/airflow/bin/cli.py b/airflow/bin/cli.py
index 675a88a63c..cfc6c6b8d6 100644
--- a/airflow/bin/cli.py
+++ b/airflow/bin/cli.py
@@ -1038,12 +1038,16 @@ def worker(args):
     from airflow.executors.celery_executor import app as celery_app
     from celery.bin import worker
+    autoscale = args.autoscale
+    if autoscale is None and conf.has_option("celery", "worker_autoscale"):
+        autoscale = conf.get("celery", "worker_autoscale")
     worker = worker.worker(app=celery_app)
     options = {
         'optimization': 'fair',
         'O': 'fair',
         'queues': args.queues,
         'concurrency': args.concurrency,
+        'autoscale': autoscale,
         'hostname': args.celery_hostname,
         'loglevel': conf.get('core', 'LOGGING_LEVEL'),
     }
@@ -1916,6 +1920,9 @@ class CLIFactory(object):
             ('-d', '--delete'),
             help='Delete a user',
             action='store_true'),
+        'autoscale': Arg(
+            ('-a', '--autoscale'),
+            help="Minimum and Maximum number of worker to autoscale"),
     }
     subparsers = (
@@ -2058,7 +2065,7 @@ class CLIFactory(object):
         'func': worker,
         'help': "Start a Celery worker node",
         'args': ('do_pickle', 'queues', 'concurrency', 'celery_hostname',
-                 'pid', 'daemon', 'stdout', 'stderr', 'log_file'),
+                 'pid', 'daemon', 'stdout', 'stderr', 'log_file', 'autoscale'),
     }, {
         'func': flower,
         'help': "Start a Celery Flower",
diff --git a/airflow/config_templates/default_airflow.cfg b/airflow/config_templates/default_airflow.cfg
index b572dbb2f7..12e5a16f21 100644
--- a/airflow/config_templates/default_airflow.cfg
+++ b/airflow/config_templates/default_airflow.cfg
@@ -349,6 +349,13 @@ celery_app_name = airflow.executors.celery_executor
 # your worker box and the nature of your tasks
 worker_concurrency = 16
+# The minimum and maximum concurrency that will be used when starting workers with the
+# "airflow worker" command. Pick these numbers based on resources on
+# worker box and the nature of the task. If autoscale option is available worker_concurrency
+# will be ignored.
+# http://docs.celeryproject.org/en/latest/reference/celery.bin.worker.html#cmdoption-celery-worker-autoscale
+# worker_autoscale = 12,16
+
 # When you start an airflow worker, airflow starts a tiny web server
 # subprocess to serve the workers local log files to the airflow main
 # web server, who then builds pages and sends them to users. This defines

> Pass --autoscale to celery workers
> --
>
> Key: AIRFLOW-1945
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1945
> Project: Apache Airflow
> Issue Type: Improvement
> Components: celery, cli
> Reporter: Michael O.
> Assignee: Sai Phanindhra
> Priority: Trivial
> Labels: easyfix
> Original Estimate: 0.5h
> Remaining Estimate: 0.5h
>
> Celery supports autoscaling of the worker pool size (number of tasks that can parallelize within one worker node). I'd like to propose to support passing the --autoscale parameter to {{airflow worker}}. Since this is a trivial change, I am not sure if there's any reason for it not being supported already. For example, {{airflow worker --concurrency=4}} will set a fixed pool size of 4.
> With minimal changes in
> [https://github.com/apache/incubator-airflow/blob/4ce4faaeae7a76d97defcf9a9d3304ac9d78b9bd/airflow/bin/cli.py#L855]
> it could support {{airflow worker --autoscale=2,10}} to set an autoscaled pool size of 2 to 10.
> Some references:
> * http://docs.celeryproject.org/en/latest/internals/reference/celery.worker.autoscale.html
> * https://github.com/apache/incubator-airflow/blob/4ce4faaeae7a76d97defcf9a9d3304ac9d78b9bd/airflow/bin/cli.py#L855
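The precedence encoded in the merged diff (an explicit CLI flag wins, then the {{[celery] worker_autoscale}} config option, else unset) can be sketched as below; the plain dict stands in for Airflow's config object and the helper name is invented for the example:

```python
# Sketch of the --autoscale resolution order from the diff above:
# an explicit CLI value takes precedence over the config fallback.

def resolve_autoscale(cli_value, celery_conf):
    """Return the 'min,max' autoscale string to hand to the Celery worker.

    cli_value: the --autoscale argument, or None when not given.
    celery_conf: mapping of options under the [celery] section.
    """
    if cli_value is not None:
        return cli_value
    return celery_conf.get("worker_autoscale")  # None when unset
```

With this precedence, `airflow worker --autoscale=2,10` starts a pool that scales between 2 and 10 processes regardless of what the config file says.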
[GitHub] bolkedebruin commented on issue #4037: [AIRFLOW-3191] Fix not being able to specify execution_date when creating dagrun
bolkedebruin commented on issue #4037: [AIRFLOW-3191] Fix not being able to specify execution_date when creating dagrun URL: https://github.com/apache/incubator-airflow/pull/4037#issuecomment-429933914 Did you verify Tz information? (Didn't look at the code)
[GitHub] msumit closed pull request #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added
msumit closed pull request #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added URL: https://github.com/apache/incubator-airflow/pull/3989 This is a PR merged from a forked repository; the full diff is reproduced in the ASF GitHub Bot comment on AIRFLOW-1945 above.
[jira] [Updated] (AIRFLOW-3208) Apache airflow 1.8.0 integration with LDAP anonymously
[ https://issues.apache.org/jira/browse/AIRFLOW-3208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna ADDEPALLI LN updated AIRFLOW-3208: --- Description: Hello. We want airflow integration with LDAP anonymously; the LDAP is based on either "openldap" or "389 Directory Server". Below is the detail added in airflow.cfg:
{noformat}
[webserver]
authenticate = True
auth_backend = airflow.contrib.auth.backends.ldap_auth
{noformat}
{noformat}
[ldap]
uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389
user_filter =
user_name_attr = uid
group_member_attr = groupMembership=ou=groups,dc=odc,dc=im
superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im
data_profiler_filter =
bind_user = ou=people,dc=odc,dc=im
bind_password =
basedn = ou=people,dc=odc,dc=im
cacert = /opt/orchestration/airflow/ldap_ca.crt
search_scope = SUBTREE
{noformat}
However, when trying to validate, it failed with the exception below; please advise what to correct given the LDAP details above. We only use "basedn=ou=people,dc=odc,dc=im" with the provided LDAP host and were able to access it anonymously using the jxplorer workbench. We are able to use LDAP anonymously on kibana/elasticsearch/jenkins; however, with airflow it fails, so please advise a solution.
{noformat}
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1988, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1641, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1544, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.6/site-packages/flask/_compat.py", line 33, in reraise
    raise value
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1639, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1625, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/local/lib/python3.6/site-packages/flask_admin/base.py", line 69, in inner
    return self._run_view(f, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/flask_admin/base.py", line 368, in _run_view
    return fn(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/airflow/www/views.py", line 650, in login
    return airflow.login.login(self, request)
  File "/usr/local/lib/python3.6/site-packages/airflow/contrib/auth/backends/ldap_auth.py", line 268, in login
    LdapUser.try_login(username, password)
  File "/usr/local/lib/python3.6/site-packages/airflow/contrib/auth/backends/ldap_auth.py", line 180, in try_login
    search_scope=native(search_scope))
  File "/usr/local/lib/python3.6/site-packages/ldap3/core/connection.py", line 779, in search
    check_names=self.check_names)
  File "/usr/local/lib/python3.6/site-packages/ldap3/operation/search.py", line 372, in search_operation
    request['filter'] = compile_filter(parse_filter(search_filter, schema, auto_escape, auto_encode, validator, check_names).elements[0])  # parse the searchFilter string and compile it starting from the root node
  File "/usr/local/lib/python3.6/site-packages/ldap3/operation/search.py", line 206, in parse_filter
    current_node.append(evaluate_match(search_filter[start_pos:end_pos], schema, auto_escape, auto_encode, validator, check_names))
  File "/usr/local/lib/python3.6/site-packages/ldap3/operation/search.py", line 89, in evaluate_match
    raise LDAPInvalidFilterError('invalid matching assertion')
ldap3.core.exceptions.LDAPInvalidFilterError: invalid matching assertion
{noformat}
[jira] [Created] (AIRFLOW-3208) Apache airflow 1.8.0 integration with LDAP anonymously
Hari Krishna ADDEPALLI LN created AIRFLOW-3208: -- Summary: Apache airflow 1.8.0 integration with LDAP anonymously Key: AIRFLOW-3208 URL: https://issues.apache.org/jira/browse/AIRFLOW-3208 Project: Apache Airflow Issue Type: Bug Components: authentication Affects Versions: 1.8.2, 1.8.0 Reporter: Hari Krishna ADDEPALLI LN Hello. We want airflow integration with LDAP anonymously; the LDAP is based on either "openldap" or "389 Directory Server". Below is the detail added in airflow.cfg:
{noformat}
[webserver]
authenticate = True
auth_backend = airflow.contrib.auth.backends.ldap_auth
{noformat}
{noformat}
[ldap]
uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389
user_filter =
user_name_attr = uid
group_member_attr = groupMembership=ou=groups,dc=odc,dc=im
superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im
data_profiler_filter =
bind_user = ou=people,dc=odc,dc=im
bind_password =
basedn = ou=people,dc=odc,dc=im
cacert = /opt/orchestration/airflow/ldap_ca.crt
search_scope = SUBTREE
{noformat}
However, when trying to validate, it failed with the exception below; please advise what to correct given the LDAP details above. We only use "basedn=ou=people,dc=odc,dc=im" with the provided LDAP host and were able to access it anonymously using the jxplorer workbench. We have done the same anonymously on kibana/elasticsearch/jenkins; however, with airflow it fails, so please advise a solution.
{noformat}
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1988, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1641, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1544, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.6/site-packages/flask/_compat.py", line 33, in reraise
    raise value
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1639, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1625, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/local/lib/python3.6/site-packages/flask_admin/base.py", line 69, in inner
    return self._run_view(f, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/flask_admin/base.py", line 368, in _run_view
    return fn(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/airflow/www/views.py", line 650, in login
    return airflow.login.login(self, request)
  File "/usr/local/lib/python3.6/site-packages/airflow/contrib/auth/backends/ldap_auth.py", line 268, in login
    LdapUser.try_login(username, password)
  File "/usr/local/lib/python3.6/site-packages/airflow/contrib/auth/backends/ldap_auth.py", line 180, in try_login
    search_scope=native(search_scope))
  File "/usr/local/lib/python3.6/site-packages/ldap3/core/connection.py", line 779, in search
    check_names=self.check_names)
  File "/usr/local/lib/python3.6/site-packages/ldap3/operation/search.py", line 372, in search_operation
    request['filter'] = compile_filter(parse_filter(search_filter, schema, auto_escape, auto_encode, validator, check_names).elements[0])  # parse the searchFilter string and compile it starting from the root node
  File "/usr/local/lib/python3.6/site-packages/ldap3/operation/search.py", line 206, in parse_filter
    current_node.append(evaluate_match(search_filter[start_pos:end_pos], schema, auto_escape, auto_encode, validator, check_names))
  File "/usr/local/lib/python3.6/site-packages/ldap3/operation/search.py", line 89, in evaluate_match
    raise LDAPInvalidFilterError('invalid matching assertion')
ldap3.core.exceptions.LDAPInvalidFilterError: invalid matching assertion
{noformat}
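A plausible trigger for the {{LDAPInvalidFilterError}} above (an assumption drawn from the config shown, not a confirmed diagnosis): {{user_filter}} is left blank, and an empty string is not valid LDAP filter syntax, so {{parse_filter}} rejects the assertion when the backend builds its search. A commonly used match-everything filter worth trying:

```
[ldap]
# hypothetical fix: give the auth backend a syntactically valid filter
# instead of an empty value; "objectClass=*" matches every entry
user_filter = objectClass=*
```

The blank {{data_profiler_filter}} and {{superuser_filter}} values may deserve the same check.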
[jira] [Updated] (AIRFLOW-3208) Apache airflow 1.8.0 integration with LDAP anonymously
[ https://issues.apache.org/jira/browse/AIRFLOW-3208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Krishna ADDEPALLI LN updated AIRFLOW-3208: --- Description: Hello. We want airflow integration with LDAP anonymously; the LDAP is based on either "openldap" or "389 Directory Server". Below is the detail added in airflow.cfg:
{noformat}
[webserver]
authenticate = True
auth_backend = airflow.contrib.auth.backends.ldap_auth
{noformat}
{noformat}
[ldap]
uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389
user_filter =
user_name_attr = uid
group_member_attr = groupMembership=ou=groups,dc=odc,dc=im
superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im
data_profiler_filter =
bind_user = ou=people,dc=odc,dc=im
bind_password =
basedn = ou=people,dc=odc,dc=im
cacert = /opt/orchestration/airflow/ldap_ca.crt
search_scope = SUBTREE
{noformat}
However, when trying to validate, it failed with the exception below; please advise what to correct given the LDAP details above. We only use "basedn=ou=people,dc=odc,dc=im" with the provided LDAP host and were able to access it anonymously using the jxplorer workbench. We have done the same anonymously on kibana/elasticsearch/jenkins; however, with airflow it fails, so please advise a solution.
{noformat}
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1988, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1641, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1544, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.6/site-packages/flask/_compat.py", line 33, in reraise
    raise value
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1639, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1625, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/local/lib/python3.6/site-packages/flask_admin/base.py", line 69, in inner
    return self._run_view(f, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/flask_admin/base.py", line 368, in _run_view
    return fn(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/airflow/www/views.py", line 650, in login
    return airflow.login.login(self, request)
  File "/usr/local/lib/python3.6/site-packages/airflow/contrib/auth/backends/ldap_auth.py", line 268, in login
    LdapUser.try_login(username, password)
  File "/usr/local/lib/python3.6/site-packages/airflow/contrib/auth/backends/ldap_auth.py", line 180, in try_login
    search_scope=native(search_scope))
  File "/usr/local/lib/python3.6/site-packages/ldap3/core/connection.py", line 779, in search
    check_names=self.check_names)
  File "/usr/local/lib/python3.6/site-packages/ldap3/operation/search.py", line 372, in search_operation
    request['filter'] = compile_filter(parse_filter(search_filter, schema, auto_escape, auto_encode, validator, check_names).elements[0])  # parse the searchFilter string and compile it starting from the root node
  File "/usr/local/lib/python3.6/site-packages/ldap3/operation/search.py", line 206, in parse_filter
    current_node.append(evaluate_match(search_filter[start_pos:end_pos], schema, auto_escape, auto_encode, validator, check_names))
  File "/usr/local/lib/python3.6/site-packages/ldap3/operation/search.py", line 89, in evaluate_match
    raise LDAPInvalidFilterError('invalid matching assertion')
ldap3.core.exceptions.LDAPInvalidFilterError: invalid matching assertion
{noformat}
[jira] [Created] (AIRFLOW-3207) option to stop task pushing result to xcom
Ben Marengo created AIRFLOW-3207: Summary: option to stop task pushing result to xcom Key: AIRFLOW-3207 URL: https://issues.apache.org/jira/browse/AIRFLOW-3207 Project: Apache Airflow Issue Type: Improvement Components: models, operators Reporter: Ben Marengo Assignee: Ben Marengo follows the completion of AIRFLOW-886, and closure (incomplete) of AIRFLOW-888 i would actually like functionality similar to this, but i dont think it necessitates the global config flag. - BaseOperator should have an option to stop a task pushing the return value of execute() to xcom. - the default should be to push (preserves backward compat)
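The proposal amounts to a per-operator flag with push-by-default. The toy classes below sketch that behaviour; the flag name and classes are hypothetical illustrations, not the merged Airflow API:

```python
# Toy sketch of an operator-level opt-out from pushing execute()'s
# return value to XCom; the default of True preserves backward compat.

class FakeTaskInstance:
    """Stand-in for a task instance that records XCom pushes."""

    def __init__(self):
        self.xcom = {}

    def xcom_push(self, key, value):
        self.xcom[key] = value


class FakeOperator:
    def __init__(self, do_xcom_push=True):  # hypothetical flag name
        self.do_xcom_push = do_xcom_push

    def execute(self):
        return "some result"

    def run(self, ti):
        result = self.execute()
        # Only push when the operator has not opted out.
        if self.do_xcom_push and result is not None:
            ti.xcom_push("return_value", result)
        return result
```

With the default, `FakeOperator().run(ti)` records a `return_value` entry, while `FakeOperator(do_xcom_push=False)` leaves XCom untouched, which is exactly the backward-compatible shape the issue asks for.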
[GitHub] oliviersm199 commented on issue #4036: [AIRFLOW-2744] Allow RBAC to accept plugins for views and links.
oliviersm199 commented on issue #4036: [AIRFLOW-2744] Allow RBAC to accept plugins for views and links. URL: https://github.com/apache/incubator-airflow/pull/4036#issuecomment-429928013 Hello have the core committers had time to look at this? I know @Fokko requested input from @jgao54, let me know if you need anything from me. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] codecov-io edited a comment on issue #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added
codecov-io edited a comment on issue #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added URL: https://github.com/apache/incubator-airflow/pull/3989#issuecomment-426543786 # [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3989?src=pr=h1) Report > Merging [#3989](https://codecov.io/gh/apache/incubator-airflow/pull/3989?src=pr=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/a581cbab1f79827ab645d21a9a221f1616cf8984?src=pr=desc) will **decrease** coverage by `3.04%`. > The diff coverage is `0%`. [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/3989/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3989?src=pr=tree) ```diff @@Coverage Diff @@ ## master#3989 +/- ## == - Coverage 75.92% 72.87% -3.05% == Files 199 199 Lines 1595417003+1049 == + Hits1211312391 +278 - Misses 3841 4612 +771 ``` | [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/3989?src=pr=tree) | Coverage Δ | | |---|---|---| | [airflow/bin/cli.py](https://codecov.io/gh/apache/incubator-airflow/pull/3989/diff?src=pr=tree#diff-YWlyZmxvdy9iaW4vY2xpLnB5) | `58.76% <0%> (-6.3%)` | :arrow_down: | | [airflow/hooks/druid\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/3989/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9kcnVpZF9ob29rLnB5) | `67.36% <0%> (-20.64%)` | :arrow_down: | | [airflow/task/task\_runner/base\_task\_runner.py](https://codecov.io/gh/apache/incubator-airflow/pull/3989/diff?src=pr=tree#diff-YWlyZmxvdy90YXNrL3Rhc2tfcnVubmVyL2Jhc2VfdGFza19ydW5uZXIucHk=) | `60.97% <0%> (-18.34%)` | :arrow_down: | | [airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3989/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=) | `75.55% <0%> (-16.4%)` | :arrow_down: | | 
[airflow/example\_dags/example\_python\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/3989/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9weXRob25fb3BlcmF0b3IucHk=) | `78.94% <0%> (-15.79%)` | :arrow_down: | | [airflow/utils/configuration.py](https://codecov.io/gh/apache/incubator-airflow/pull/3989/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9jb25maWd1cmF0aW9uLnB5) | `85.71% <0%> (-14.29%)` | :arrow_down: | | [airflow/configuration.py](https://codecov.io/gh/apache/incubator-airflow/pull/3989/diff?src=pr=tree#diff-YWlyZmxvdy9jb25maWd1cmF0aW9uLnB5) | `85.07% <0%> (-3.78%)` | :arrow_down: | | [airflow/operators/s3\_file\_transform\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/3989/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvczNfZmlsZV90cmFuc2Zvcm1fb3BlcmF0b3IucHk=) | `93.87% <0%> (-2.35%)` | :arrow_down: | | [airflow/www\_rbac/views.py](https://codecov.io/gh/apache/incubator-airflow/pull/3989/diff?src=pr=tree#diff-YWlyZmxvdy93d3dfcmJhYy92aWV3cy5weQ==) | `72.18% <0%> (-0.36%)` | :arrow_down: | | ... and [13 more](https://codecov.io/gh/apache/incubator-airflow/pull/3989/diff?src=pr=tree-more) | | -- [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3989?src=pr=continue). > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta) > `Δ = absolute (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3989?src=pr=footer). Last update [a581cba...959ca5d](https://codecov.io/gh/apache/incubator-airflow/pull/3989?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] codecov-io edited a comment on issue #2135: [AIRFLOW-843] Store exceptions on task_instance
codecov-io edited a comment on issue #2135: [AIRFLOW-843] Store exceptions on task_instance URL: https://github.com/apache/incubator-airflow/pull/2135#issuecomment-348005174

# [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/2135?src=pr=h1) Report
> Merging [#2135](https://codecov.io/gh/apache/incubator-airflow/pull/2135?src=pr=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/a581cbab1f79827ab645d21a9a221f1616cf8984?src=pr=desc) will **increase** coverage by `<.01%`.
> The diff coverage is `100%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/2135/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/2135?src=pr=tree)

```diff
@@            Coverage Diff             @@
##           master    #2135      +/-   ##
==========================================
+ Coverage   75.92%   75.93%    +<.01%
==========================================
  Files         199      199
  Lines       15954    15956        +2
==========================================
+ Hits        12113    12116        +3
+ Misses       3841     3840        -1
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/2135?src=pr=tree) | Coverage Δ | |
|---|---|---|
| [airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/2135/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=) | `91.99% <100%> (+0.04%)` | :arrow_up: |

[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/2135?src=pr=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/2135?src=pr=footer). Last update [a581cba...e584190](https://codecov.io/gh/apache/incubator-airflow/pull/2135?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] thesquelched commented on issue #2135: [AIRFLOW-843] Store exceptions on task_instance
thesquelched commented on issue #2135: [AIRFLOW-843] Store exceptions on task_instance URL: https://github.com/apache/incubator-airflow/pull/2135#issuecomment-429901059 @xnuinside sorry for the long wait; rebased and addressed PR comments This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] thesquelched commented on a change in pull request #2135: [AIRFLOW-843] Store exceptions on task_instance
thesquelched commented on a change in pull request #2135: [AIRFLOW-843] Store exceptions on task_instance URL: https://github.com/apache/incubator-airflow/pull/2135#discussion_r225210703

## File path: airflow/models.py ##
@@ -1574,6 +1574,8 @@ def dry_run(self):

     def handle_failure(self, error, test_mode=False, context=None, session=None):
         self.log.exception(error)
         task = self.task
+        session = settings.Session()

Review comment: Cruft from testing, I suppose; I'll remove

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
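The stray session line flagged in the review above is redundant because handle_failure already accepts a session argument, which Airflow can supply via its provide_session decorator. A toy version of that pattern (not Airflow's actual implementation):

```python
import functools

def provide_session(func):
    """Toy session injector: supply a session only when the caller didn't."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if kwargs.get("session") is None:
            # stands in for settings.Session(); a real implementation
            # would also commit/close the session it created
            kwargs["session"] = {"injected": True}
        return func(*args, **kwargs)
    return wrapper

@provide_session
def handle_failure(error, session=None):
    # with injection in place, creating another Session() here would be cruft
    return session

assert handle_failure("boom") == {"injected": True}
assert handle_failure("boom", session="mine") == "mine"
```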
[GitHub] aoen edited a comment on issue #4037: [AIRFLOW-3191] Fix not being able to specify execution_date when creating dagrun
aoen edited a comment on issue #4037: [AIRFLOW-3191] Fix not being able to specify execution_date when creating dagrun URL: https://github.com/apache/incubator-airflow/pull/4037#issuecomment-429896591 Certainly! Here is the creation page before my change: https://user-images.githubusercontent.com/1592778/46960432-87e86580-d06c-11e8-8f51-b265be009cf5.png Here is the screenshot where you can see the execution_date parameter is now available: https://user-images.githubusercontent.com/1592778/46960215-0264b580-d06c-11e8-9424-7e8f3780f694.png This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] aoen commented on issue #4037: [AIRFLOW-3191] Fix not being able to specify execution_date when creating dagrun
aoen commented on issue #4037: [AIRFLOW-3191] Fix not being able to specify execution_date when creating dagrun URL: https://github.com/apache/incubator-airflow/pull/4037#issuecomment-429896591 Certainly! Here is the screenshot where you can see the execution_date parameter is now available: https://user-images.githubusercontent.com/1592778/46960215-0264b580-d06c-11e8-9424-7e8f3780f694.png This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] aoen commented on issue #4005: [AIRFLOW-3160] Load latest_dagruns asynchronously, speed up front page load time
aoen commented on issue #4005: [AIRFLOW-3160] Load latest_dagruns asynchronously, speed up front page load time URL: https://github.com/apache/incubator-airflow/pull/4005#issuecomment-429895674 Ready to merge FYI. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] codecov-io edited a comment on issue #4038: [AIRFLOW-1970] Let empty Fernet key or special `no encryption` phrase.
codecov-io edited a comment on issue #4038: [AIRFLOW-1970] Let empty Fernet key or special `no encryption` phrase. URL: https://github.com/apache/incubator-airflow/pull/4038#issuecomment-429203577

# [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4038?src=pr=h1) Report
> Merging [#4038](https://codecov.io/gh/apache/incubator-airflow/pull/4038?src=pr=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/a581cbab1f79827ab645d21a9a221f1616cf8984?src=pr=desc) will **increase** coverage by `<.01%`.
> The diff coverage is `63.63%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/4038/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4038?src=pr=tree)

```diff
@@            Coverage Diff             @@
##           master    #4038      +/-   ##
==========================================
+ Coverage   75.92%   75.92%    +<.01%
==========================================
  Files         199      199
  Lines       15954    15955        +1
==========================================
+ Hits        12113    12114        +1
  Misses       3841     3841
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/4038?src=pr=tree) | Coverage Δ | |
|---|---|---|
| [airflow/configuration.py](https://codecov.io/gh/apache/incubator-airflow/pull/4038/diff?src=pr=tree#diff-YWlyZmxvdy9jb25maWd1cmF0aW9uLnB5) | `89.41% <50%> (+0.56%)` | :arrow_up: |
| [airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/4038/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=) | `91.88% <66.66%> (-0.07%)` | :arrow_down: |

[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4038?src=pr=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4038?src=pr=footer). Last update [a581cba...211cde1](https://codecov.io/gh/apache/incubator-airflow/pull/4038?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (AIRFLOW-1945) Pass --autoscale to celery workers
[ https://issues.apache.org/jira/browse/AIRFLOW-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650256#comment-16650256 ] ASF GitHub Bot commented on AIRFLOW-1945: - phani8996 opened a new pull request #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added URL: https://github.com/apache/incubator-airflow/pull/3989

Dear Airflow Maintainers, This will add a provision to autoscale celery workers, instead of running a fixed number of worker processes irrespective of the number of running tasks. Please accept this PR that addresses the following issue: https://issues.apache.org/jira/browse/AIRFLOW-1945 Testing Done: Manually tested by passing arguments in the CLI.

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Pass --autoscale to celery workers
> --
>
> Key: AIRFLOW-1945
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1945
> Project: Apache Airflow
> Issue Type: Improvement
> Components: celery, cli
> Reporter: Michael O.
> Assignee: Sai Phanindhra
> Priority: Trivial
> Labels: easyfix
> Original Estimate: 0.5h
> Remaining Estimate: 0.5h
>
> Celery supports autoscaling of the worker pool size (number of tasks that can
> parallelize within one worker node). I'd like to propose to support passing
> the --autoscale parameter to {{airflow worker}}.
> Since this is a trivial change, I am not sure if there's any reason for not
> being supported already.(?)
> For example
> {{airflow worker --concurrency=4}} will set a fixed pool size of 4.
> With minimal changes in > [https://github.com/apache/incubator-airflow/blob/4ce4faaeae7a76d97defcf9a9d3304ac9d78b9bd/airflow/bin/cli.py#L855] > it could support > {{airflow worker --autoscale=2,10}} to set an autoscaled pool size of 2 to 10 > Some references: > * > http://docs.celeryproject.org/en/latest/internals/reference/celery.worker.autoscale.html > * > https://github.com/apache/incubator-airflow/blob/4ce4faaeae7a76d97defcf9a9d3304ac9d78b9bd/airflow/bin/cli.py#L855 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
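A possible precedence for the proposed option — an explicit --autoscale on the command line first, falling back to a worker_autoscale entry in airflow.cfg, else None (fixed concurrency) — can be sketched as plain Python. The function name and the dict-based config stand-in are illustrative:

```python
def resolve_autoscale(cli_value, celery_config):
    """Pick the autoscale setting: CLI flag wins, then config, else None.

    Illustrative sketch only; 'celery_config' is a plain dict standing in
    for the [celery] section of airflow.cfg.
    """
    if cli_value is not None:
        return cli_value
    return celery_config.get("worker_autoscale")  # None when unset

assert resolve_autoscale("2,10", {"worker_autoscale": "12,16"}) == "2,10"
assert resolve_autoscale(None, {"worker_autoscale": "12,16"}) == "12,16"
assert resolve_autoscale(None, {}) is None
```

When neither source is set, the worker would keep using the fixed concurrency setting, preserving current behavior.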
[GitHub] phani8996 commented on issue #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added
phani8996 commented on issue #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added URL: https://github.com/apache/incubator-airflow/pull/3989#issuecomment-429866997 Mistakenly closed the PR. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] phani8996 opened a new pull request #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added
phani8996 opened a new pull request #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added URL: https://github.com/apache/incubator-airflow/pull/3989 Dear Airflow Maintainers, This will add a provision to autoscale celery workers, instead of running a fixed number of worker processes irrespective of the number of running tasks. Please accept this PR that addresses the following issue: https://issues.apache.org/jira/browse/AIRFLOW-1945 Testing Done: Manually tested by passing arguments in the CLI. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (AIRFLOW-1945) Pass --autoscale to celery workers
[ https://issues.apache.org/jira/browse/AIRFLOW-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650252#comment-16650252 ] ASF GitHub Bot commented on AIRFLOW-1945: - phani8996 closed pull request #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added URL: https://github.com/apache/incubator-airflow/pull/3989

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

diff --git a/airflow/bin/cli.py b/airflow/bin/cli.py
index 09bd0c1806..19ff220d9f 100644
--- a/airflow/bin/cli.py
+++ b/airflow/bin/cli.py
@@ -1055,12 +1055,16 @@ def worker(args):
     from airflow.executors.celery_executor import app as celery_app
     from celery.bin import worker

+    autoscale = args.autoscale
+    if autoscale is None and conf.has_option("celery", "worker_autoscale"):
+        autoscale = conf.get("celery", "worker_autoscale")
     worker = worker.worker(app=celery_app)
     options = {
         'optimization': 'fair',
         'O': 'fair',
         'queues': args.queues,
         'concurrency': args.concurrency,
+        'autoscale': autoscale,
         'hostname': args.celery_hostname,
     }
@@ -1932,6 +1936,9 @@ class CLIFactory(object):
             ('-d', '--delete'),
             help='Delete a user',
             action='store_true'),
+        'autoscale': Arg(
+            ('-a', '--autoscale'),
+            help="Minimum and Maximum number of worker to autoscale"),
     }

     subparsers = (
@@ -2074,7 +2081,7 @@ class CLIFactory(object):
         'func': worker,
         'help': "Start a Celery worker node",
         'args': ('do_pickle', 'queues', 'concurrency', 'celery_hostname',
-                 'pid', 'daemon', 'stdout', 'stderr', 'log_file'),
+                 'pid', 'daemon', 'stdout', 'stderr', 'log_file', 'autoscale'),
     }, {
         'func': flower,
         'help': "Start a Celery Flower",
diff --git a/airflow/config_templates/default_airflow.cfg b/airflow/config_templates/default_airflow.cfg
index bb4ab208d7..a1806a5dee 100644
--- a/airflow/config_templates/default_airflow.cfg
+++ b/airflow/config_templates/default_airflow.cfg
@@ -349,6 +349,13 @@ celery_app_name = airflow.executors.celery_executor
 # your worker box and the nature of your tasks
 worker_concurrency = 16

+# The minimum and maximum concurrency that will be used when starting workers with the
+# "airflow worker" command. Pick these numbers based on resources on
+# worker box and the nature of the task. If autoscale option is available worker_concurrency
+# will be ignored.
+# http://docs.celeryproject.org/en/latest/reference/celery.bin.worker.html#cmdoption-celery-worker-autoscale
+# worker_autoscale = 12,16
+
 # When you start an airflow worker, airflow starts a tiny web server
 # subprocess to serve the workers local log files to the airflow main
 # web server, who then builds pages and sends them to users. This defines

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Pass --autoscale to celery workers
> --
>
> Key: AIRFLOW-1945
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1945
> Project: Apache Airflow
> Issue Type: Improvement
> Components: celery, cli
> Reporter: Michael O.
> Assignee: Sai Phanindhra
> Priority: Trivial
> Labels: easyfix
> Original Estimate: 0.5h
> Remaining Estimate: 0.5h
>
> Celery supports autoscaling of the worker pool size (number of tasks that can
> parallelize within one worker node). I'd like to propose to support passing
> the --autoscale parameter to {{airflow worker}}.
> Since this is a trivial change, I am not sure if there's any reason for not
> being supported already.(?)
> For example
> {{airflow worker --concurrency=4}} will set a fixed pool size of 4.
> With minimal changes in > [https://github.com/apache/incubator-airflow/blob/4ce4faaeae7a76d97defcf9a9d3304ac9d78b9bd/airflow/bin/cli.py#L855] > it could support > {{airflow worker --autoscale=2,10}} to set an autoscaled pool size of 2 to 10 > Some references: > * > http://docs.celeryproject.org/en/latest/internals/reference/celery.worker.autoscale.html > * > https://github.com/apache/incubator-airflow/blob/4ce4faaeae7a76d97defcf9a9d3304ac9d78b9bd/airflow/bin/cli.py#L855 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] phani8996 closed pull request #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added
phani8996 closed pull request #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added URL: https://github.com/apache/incubator-airflow/pull/3989

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

diff --git a/airflow/bin/cli.py b/airflow/bin/cli.py
index 09bd0c1806..19ff220d9f 100644
--- a/airflow/bin/cli.py
+++ b/airflow/bin/cli.py
@@ -1055,12 +1055,16 @@ def worker(args):
     from airflow.executors.celery_executor import app as celery_app
     from celery.bin import worker

+    autoscale = args.autoscale
+    if autoscale is None and conf.has_option("celery", "worker_autoscale"):
+        autoscale = conf.get("celery", "worker_autoscale")
     worker = worker.worker(app=celery_app)
     options = {
         'optimization': 'fair',
         'O': 'fair',
         'queues': args.queues,
         'concurrency': args.concurrency,
+        'autoscale': autoscale,
         'hostname': args.celery_hostname,
     }
@@ -1932,6 +1936,9 @@ class CLIFactory(object):
             ('-d', '--delete'),
             help='Delete a user',
             action='store_true'),
+        'autoscale': Arg(
+            ('-a', '--autoscale'),
+            help="Minimum and Maximum number of worker to autoscale"),
     }

     subparsers = (
@@ -2074,7 +2081,7 @@ class CLIFactory(object):
         'func': worker,
         'help': "Start a Celery worker node",
         'args': ('do_pickle', 'queues', 'concurrency', 'celery_hostname',
-                 'pid', 'daemon', 'stdout', 'stderr', 'log_file'),
+                 'pid', 'daemon', 'stdout', 'stderr', 'log_file', 'autoscale'),
     }, {
         'func': flower,
         'help': "Start a Celery Flower",
diff --git a/airflow/config_templates/default_airflow.cfg b/airflow/config_templates/default_airflow.cfg
index bb4ab208d7..a1806a5dee 100644
--- a/airflow/config_templates/default_airflow.cfg
+++ b/airflow/config_templates/default_airflow.cfg
@@ -349,6 +349,13 @@ celery_app_name = airflow.executors.celery_executor
 # your worker box and the nature of your tasks
 worker_concurrency = 16

+# The minimum and maximum concurrency that will be used when starting workers with the
+# "airflow worker" command. Pick these numbers based on resources on
+# worker box and the nature of the task. If autoscale option is available worker_concurrency
+# will be ignored.
+# http://docs.celeryproject.org/en/latest/reference/celery.bin.worker.html#cmdoption-celery-worker-autoscale
+# worker_autoscale = 12,16
+
 # When you start an airflow worker, airflow starts a tiny web server
 # subprocess to serve the workers local log files to the airflow main
 # web server, who then builds pages and sends them to users. This defines

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] phani8996 commented on issue #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added
phani8996 commented on issue #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added URL: https://github.com/apache/incubator-airflow/pull/3989#issuecomment-429864622 > @phani8996 plz rebase your commits into a single commit. Also, the commit message could be like this `[AIRFLOW-1945] Add Autoscale config for Celery workers` @msumit commits have been rebased and commit message updated with proper message. Please check. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] codecov-io edited a comment on issue #4049: [AIRFLOW-3203] Fix DockerOperator & some operator test
codecov-io edited a comment on issue #4049: [AIRFLOW-3203] Fix DockerOperator & some operator test URL: https://github.com/apache/incubator-airflow/pull/4049#issuecomment-429554020 # [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4049?src=pr=h1) Report > Merging [#4049](https://codecov.io/gh/apache/incubator-airflow/pull/4049?src=pr=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/719e0b16b909baedbc4679568548a4b123e6476a?src=pr=desc) will **increase** coverage by `1.77%`. > The diff coverage is `100%`. [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/4049/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4049?src=pr=tree) ```diff @@Coverage Diff @@ ## master#4049 +/- ## == + Coverage 75.91% 77.69% +1.77% == Files 199 199 Lines 1594815944 -4 == + Hits1210712387 +280 + Misses 3841 3557 -284 ``` | [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/4049?src=pr=tree) | Coverage Δ | | |---|---|---| | [airflow/operators/docker\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/4049/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvZG9ja2VyX29wZXJhdG9yLnB5) | `97.61% <100%> (+97.61%)` | :arrow_up: | | [airflow/jobs.py](https://codecov.io/gh/apache/incubator-airflow/pull/4049/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzLnB5) | `82.48% <0%> (+0.35%)` | :arrow_up: | | [airflow/hooks/hive\_hooks.py](https://codecov.io/gh/apache/incubator-airflow/pull/4049/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9oaXZlX2hvb2tzLnB5) | `73.42% <0%> (+0.52%)` | :arrow_up: | | [airflow/operators/hive\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/4049/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvaGl2ZV9vcGVyYXRvci5weQ==) | `86.53% <0%> (+5.76%)` | :arrow_up: | | [airflow/utils/file.py](https://codecov.io/gh/apache/incubator-airflow/pull/4049/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9maWxlLnB5) | `84% <0%> (+8%)` | :arrow_up: | | 
[airflow/operators/python\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/4049/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvcHl0aG9uX29wZXJhdG9yLnB5) | `95.03% <0%> (+13.04%)` | :arrow_up: | | [airflow/operators/subdag\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/4049/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvc3ViZGFnX29wZXJhdG9yLnB5) | `90.32% <0%> (+19.35%)` | :arrow_up: | | [airflow/operators/latest\_only\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/4049/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvbGF0ZXN0X29ubHlfb3BlcmF0b3IucHk=) | `90% <0%> (+65%)` | :arrow_up: | | [airflow/operators/s3\_to\_hive\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/4049/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvczNfdG9faGl2ZV9vcGVyYXRvci5weQ==) | `94.01% <0%> (+94.01%)` | :arrow_up: | | ... and [1 more](https://codecov.io/gh/apache/incubator-airflow/pull/4049/diff?src=pr=tree-more) | | -- [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4049?src=pr=continue). > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta) > `Δ = absolute (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4049?src=pr=footer). Last update [719e0b1...111a803](https://codecov.io/gh/apache/incubator-airflow/pull/4049?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] XD-DENG edited a comment on issue #4049: [AIRFLOW-3203] Fix DockerOperator & some operator test
XD-DENG edited a comment on issue #4049: [AIRFLOW-3203] Fix DockerOperator & some operator test URL: https://github.com/apache/incubator-airflow/pull/4049#issuecomment-429848698 In addition, I would suggest including this commit in `1.10.1`, which is intended to fix bugs. Re-enabling these tests should be useful for catching potential bugs.
[GitHub] msumit commented on issue #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added
msumit commented on issue #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added URL: https://github.com/apache/incubator-airflow/pull/3989#issuecomment-429849799 @phani8996 please rebase your commits into a single commit. Also, the commit message could be something like `[AIRFLOW-1945] Add Autoscale config for Celery workers`.
[GitHub] msumit commented on a change in pull request #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added
msumit commented on a change in pull request #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added URL: https://github.com/apache/incubator-airflow/pull/3989#discussion_r225161038

## File path: airflow/config_templates/default_airflow.cfg

```diff
@@ -349,6 +349,12 @@ celery_app_name = airflow.executors.celery_executor
 # your worker box and the nature of your tasks
 worker_concurrency = 16
+# The minimum and maximum concurrency that will be used when starting workers with the
+# "airflow worker" command. Pick these numbers based on resources on
+# worker box and the nature of the task. If autoscale option is available worker_concurrency
+# will be ignored
+#worker_autoscale = 12,16
```

Review comment: @phani8996 a space after `#` would be better, i.e. `# worker_autoscale = 12,16`
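As an aside on the setting itself: a `worker_autoscale` value such as `12,16` is just a comma-separated pair of integers. A minimal parsing sketch (`parse_worker_autoscale` is a hypothetical helper, not part of Airflow; the meaning of each bound follows the Airflow/Celery docs, this only covers the parsing):

```python
def parse_worker_autoscale(value):
    """Parse a comma-separated autoscale setting such as "12,16".

    Returns a tuple of two ints. Raises ValueError if the value is
    not exactly two comma-separated integers.
    """
    parts = [p.strip() for p in value.split(",")]
    if len(parts) != 2:
        raise ValueError("worker_autoscale must be two comma-separated integers")
    return int(parts[0]), int(parts[1])


print(parse_worker_autoscale("12,16"))  # -> (12, 16)
```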
[GitHub] XD-DENG commented on issue #4049: [AIRFLOW-3203] Fix DockerOperator & some operator test
XD-DENG commented on issue #4049: [AIRFLOW-3203] Fix DockerOperator & some operator test URL: https://github.com/apache/incubator-airflow/pull/4049#issuecomment-429848209 Hi @Fokko , I have renamed some of the operator test scripts (prepending `test_`), after making sure they don't raise any exceptions and work as designed, including:
- `tests/operators/docker_operator.py` (with some code change)
- `tests/operators/hive_operator.py`
- `tests/operators/latest_only_operator.py`
- `tests/operators/python_operator.py`
- `tests/operators/s3_to_hive_operator.py`
- `tests/operators/slack_operator.py`
- `tests/operators/subdag_operator.py`

There are another two,
- `tests/operators/bash_operator.py`
- `tests/operators/operator.py`

which need more work. I may fix them later (if I get time and nobody else picks them up).
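The renaming matters because test runners only collect files whose names match their discovery pattern, so a file named `docker_operator.py` is silently skipped. A quick stdlib illustration using unittest's default `discover()` pattern, `test*.py` (the nose runner Airflow used at the time applies a similar, regex-based match):

```python
import fnmatch

# unittest.TestLoader.discover() defaults to pattern="test*.py"
PATTERN = "test*.py"

for name in ["docker_operator.py", "test_docker_operator.py"]:
    matched = fnmatch.fnmatch(name, PATTERN)
    print("{}: {}".format(name, "collected" if matched else "skipped"))
# -> docker_operator.py: skipped
# -> test_docker_operator.py: collected
```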
[GitHub] msumit closed pull request #3984: [AIRFLOW-3141] Handle duration for missing dag.
msumit closed pull request #3984: [AIRFLOW-3141] Handle duration for missing dag. URL: https://github.com/apache/incubator-airflow/pull/3984 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/airflow/www/views.py b/airflow/www/views.py index 0aef2281e7..f2414b680d 100644 --- a/airflow/www/views.py +++ b/airflow/www/views.py @@ -1612,6 +1612,10 @@ def duration(self, session=None): num_runs = request.args.get('num_runs') num_runs = int(num_runs) if num_runs else default_dag_run +if dag is None: +flash('DAG "{0}" seems to be missing.'.format(dag_id), "error") +return redirect('/admin/') + if base_date: base_date = pendulum.parse(base_date) else: diff --git a/airflow/www_rbac/views.py b/airflow/www_rbac/views.py index e6e505c41a..7658c5c3f9 100644 --- a/airflow/www_rbac/views.py +++ b/airflow/www_rbac/views.py @@ -1352,6 +1352,10 @@ def duration(self, session=None): num_runs = request.args.get('num_runs') num_runs = int(num_runs) if num_runs else default_dag_run +if dag is None: +flash('DAG "{0}" seems to be missing.'.format(dag_id), "error") +return redirect('/') + if base_date: base_date = pendulum.parse(base_date) else: diff --git a/tests/core.py b/tests/core.py index 918e9b4d49..91062f6e58 100644 --- a/tests/core.py +++ b/tests/core.py @@ -1877,6 +1877,10 @@ def test_dag_views(self): response = self.app.get( '/admin/airflow/duration?days=30_id=example_bash_operator') self.assertIn("example_bash_operator", response.data.decode('utf-8')) +response = self.app.get( +'/admin/airflow/duration?days=30_id=missing_dag', +follow_redirects=True) +self.assertIn("seems to be missing", response.data.decode('utf-8')) response = self.app.get( '/admin/airflow/tries?days=30_id=example_bash_operator') self.assertIn("example_bash_operator", 
response.data.decode('utf-8')) diff --git a/tests/www_rbac/test_views.py b/tests/www_rbac/test_views.py index a952b9874c..e79cfb6db8 100644 --- a/tests/www_rbac/test_views.py +++ b/tests/www_rbac/test_views.py @@ -381,6 +381,11 @@ def test_duration(self): resp = self.client.get(url, follow_redirects=True) self.check_content_in_response('example_bash_operator', resp) +def test_duration_missing(self): +url = 'duration?days=30_id=missing_dag' +resp = self.client.get(url, follow_redirects=True) +self.check_content_in_response('seems to be missing', resp) + def test_tries(self): url = 'tries?days=30_id=example_bash_operator' resp = self.client.get(url, follow_redirects=True)
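Stripped of Flask specifics, the guard added to both `duration` views is a plain "flash an error and redirect" pattern. A framework-free sketch under illustrative names (this is not Airflow's actual view signature):

```python
def duration_view(dag, dag_id, home_url="/admin/"):
    """Return (messages, redirect_target_or_None) instead of crashing.

    Mirrors the patch: if the DAG cannot be found, record an error
    message and redirect to the home page rather than raising a 500.
    """
    messages = []
    if dag is None:
        messages.append(('error', 'DAG "{0}" seems to be missing.'.format(dag_id)))
        return messages, home_url
    # ... normal duration-chart rendering would continue here ...
    return messages, None


msgs, target = duration_view(None, "missing_dag")
print(target)  # -> /admin/
```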
[jira] [Commented] (AIRFLOW-3141) Fix 500 on duration view when dag doesn't exist
[ https://issues.apache.org/jira/browse/AIRFLOW-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650184#comment-16650184 ] ASF GitHub Bot commented on AIRFLOW-3141: - msumit closed pull request #3984: [AIRFLOW-3141] Handle duration for missing dag. URL: https://github.com/apache/incubator-airflow/pull/3984 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/airflow/www/views.py b/airflow/www/views.py index 0aef2281e7..f2414b680d 100644 --- a/airflow/www/views.py +++ b/airflow/www/views.py @@ -1612,6 +1612,10 @@ def duration(self, session=None): num_runs = request.args.get('num_runs') num_runs = int(num_runs) if num_runs else default_dag_run +if dag is None: +flash('DAG "{0}" seems to be missing.'.format(dag_id), "error") +return redirect('/admin/') + if base_date: base_date = pendulum.parse(base_date) else: diff --git a/airflow/www_rbac/views.py b/airflow/www_rbac/views.py index e6e505c41a..7658c5c3f9 100644 --- a/airflow/www_rbac/views.py +++ b/airflow/www_rbac/views.py @@ -1352,6 +1352,10 @@ def duration(self, session=None): num_runs = request.args.get('num_runs') num_runs = int(num_runs) if num_runs else default_dag_run +if dag is None: +flash('DAG "{0}" seems to be missing.'.format(dag_id), "error") +return redirect('/') + if base_date: base_date = pendulum.parse(base_date) else: diff --git a/tests/core.py b/tests/core.py index 918e9b4d49..91062f6e58 100644 --- a/tests/core.py +++ b/tests/core.py @@ -1877,6 +1877,10 @@ def test_dag_views(self): response = self.app.get( '/admin/airflow/duration?days=30_id=example_bash_operator') self.assertIn("example_bash_operator", response.data.decode('utf-8')) +response = self.app.get( +'/admin/airflow/duration?days=30_id=missing_dag', +follow_redirects=True) 
+self.assertIn("seems to be missing", response.data.decode('utf-8')) response = self.app.get( '/admin/airflow/tries?days=30_id=example_bash_operator') self.assertIn("example_bash_operator", response.data.decode('utf-8')) diff --git a/tests/www_rbac/test_views.py b/tests/www_rbac/test_views.py index a952b9874c..e79cfb6db8 100644 --- a/tests/www_rbac/test_views.py +++ b/tests/www_rbac/test_views.py @@ -381,6 +381,11 @@ def test_duration(self): resp = self.client.get(url, follow_redirects=True) self.check_content_in_response('example_bash_operator', resp) +def test_duration_missing(self): +url = 'duration?days=30_id=missing_dag' +resp = self.client.get(url, follow_redirects=True) +self.check_content_in_response('seems to be missing', resp) + def test_tries(self): url = 'tries?days=30_id=example_bash_operator' resp = self.client.get(url, follow_redirects=True) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Fix 500 on duration view when dag doesn't exist > --- > > Key: AIRFLOW-3141 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3141 > Project: Apache Airflow > Issue Type: Bug >Reporter: Josh Carp >Assignee: Josh Carp >Priority: Trivial > > Loading the duration view for a dag that doesn't exist throws a 500. Based on > the behavior of other dag views, this should redirect to the admin view and > flash an error message instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (AIRFLOW-3203) Bugs in DockerOperator & Some operator test scripts were named incorrectly
[ https://issues.apache.org/jira/browse/AIRFLOW-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaodong DENG updated AIRFLOW-3203: --- Summary: Bugs in DockerOperator & Some operator test scripts were named incorrectly (was: Bugs in DockerOperator & some operator tests) > Bugs in DockerOperator & Some operator test scripts were named incorrectly > -- > > Key: AIRFLOW-3203 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3203 > Project: Apache Airflow > Issue Type: Bug > Components: operators, tests >Affects Versions: 1.10.0 >Reporter: Xiaodong DENG >Assignee: Xiaodong DENG >Priority: Critical > > Usage of `cpu_shares` and `cpu_shares` is incorrect in DockerOperator, based > on documentation of Python package "docker". > In addition, its test is not really working due to incorrect file name. This > also happens for some other test scripts for Operators.
[jira] [Updated] (AIRFLOW-3203) Bugs in DockerOperator & Some operator test scripts were named incorrectly
[ https://issues.apache.org/jira/browse/AIRFLOW-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaodong DENG updated AIRFLOW-3203: --- Description: Usage of `cpu_shares` and `cpu_shares` is incorrect in DockerOperator, based on documentation of Python package "docker". In addition, its test is not really working due to incorrect file name. This also happens for some other test scripts for Operators. This results in test discovery failure. was: Usage of `cpu_shares` and `cpu_shares` is incorrect in DockerOperator, based on documentation of Python package "docker". In addition, its test is not really working due to incorrect file name. This also happens for some other test scripts for Operators. > Bugs in DockerOperator & Some operator test scripts were named incorrectly > -- > > Key: AIRFLOW-3203 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3203 > Project: Apache Airflow > Issue Type: Bug > Components: operators, tests >Affects Versions: 1.10.0 >Reporter: Xiaodong DENG >Assignee: Xiaodong DENG >Priority: Critical > > Usage of `cpu_shares` and `cpu_shares` is incorrect in DockerOperator, based > on documentation of Python package "docker". > In addition, its test is not really working due to incorrect file name. This > also happens for some other test scripts for Operators. This results in test > discovery failure.
[jira] [Updated] (AIRFLOW-3203) Bugs in DockerOperator & some operator tests
[ https://issues.apache.org/jira/browse/AIRFLOW-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaodong DENG updated AIRFLOW-3203: --- Description: Usage of `cpu_shares` and `cpu_shares` is incorrect in DockerOperator, based on documentation of Python package "docker". In addition, its test is not really working due to incorrect file name. This also happens for some other test scripts for Operators. was: Usage of `cpu_shares` and `cpu_shares` is incorrect in DockerOperator, based on documentation of Python package "docker". In addition, its test is not really working due to incorrect file name. Summary: Bugs in DockerOperator & some operator tests (was: Bugs in DockerOperator & its test) > Bugs in DockerOperator & some operator tests > > > Key: AIRFLOW-3203 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3203 > Project: Apache Airflow > Issue Type: Bug > Components: operators, tests >Affects Versions: 1.10.0 >Reporter: Xiaodong DENG >Assignee: Xiaodong DENG >Priority: Critical > > Usage of `cpu_shares` and `cpu_shares` is incorrect in DockerOperator, based > on documentation of Python package "docker". > In addition, its test is not really working due to incorrect file name. This > also happens for some other test scripts for Operators.
[GitHub] msumit commented on issue #3984: [AIRFLOW-3141] Handle duration for missing dag.
msumit commented on issue #3984: [AIRFLOW-3141] Handle duration for missing dag. URL: https://github.com/apache/incubator-airflow/pull/3984#issuecomment-429815073 much better than getting a 500 page.
[GitHub] codecov-io edited a comment on issue #4055: [AIRFLOW-3206] neutral and clear GPL dependency notice
codecov-io edited a comment on issue #4055: [AIRFLOW-3206] neutral and clear GPL dependency notice URL: https://github.com/apache/incubator-airflow/pull/4055#issuecomment-429739296

# [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=h1) Report
> Merging [#4055](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/fac5a8e623a5c702adece7234547861b1cb2d1d8?src=pr=desc) will **not change** coverage.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/4055/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=tree)

```diff
@@           Coverage Diff           @@
##           master    #4055   +/-   ##
=======================================
  Coverage   75.91%   75.91%
=======================================
  Files         199      199
  Lines       15948    15948
=======================================
  Hits        12107    12107
  Misses       3841     3841
```

[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=footer). Last update [fac5a8e...9406ef3](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] itscaro commented on a change in pull request #3981: [AIRFLOW-3133] Implement xcom_push flag for contrib's operators
itscaro commented on a change in pull request #3981: [AIRFLOW-3133] Implement xcom_push flag for contrib's operators URL: https://github.com/apache/incubator-airflow/pull/3981#discussion_r225115466 ## File path: airflow/contrib/operators/mlengine_operator.py ## @@ -151,6 +151,9 @@ class MLEngineBatchPredictionOperator(BaseOperator): have doamin-wide delegation enabled. :type delegate_to: str Review comment: OK, but I tried to keep the same blank line as the other parameters; I did not want to trim all these existing blank lines. Should I?
[GitHub] itscaro commented on a change in pull request #3981: [AIRFLOW-3133] Implement xcom_push flag for contrib's operators
itscaro commented on a change in pull request #3981: [AIRFLOW-3133] Implement xcom_push flag for contrib's operators URL: https://github.com/apache/incubator-airflow/pull/3981#discussion_r225115675 ## File path: airflow/contrib/operators/mlengine_operator.py ## @@ -387,6 +400,9 @@ class MLEngineVersionOperator(BaseOperator): For this to work, the service account making the request must have domain-wide delegation enabled. :type delegate_to: str + Review comment: OK, but I tried to keep the same blank line as the other parameters; I did not want to trim all these existing blank lines. Should I?
[GitHub] itscaro commented on a change in pull request #3981: [AIRFLOW-3133] Implement xcom_push flag for contrib's operators
itscaro commented on a change in pull request #3981: [AIRFLOW-3133] Implement xcom_push flag for contrib's operators URL: https://github.com/apache/incubator-airflow/pull/3981#discussion_r225114997 ## File path: airflow/contrib/operators/gcs_to_bq.py ## @@ -248,7 +252,7 @@ def execute(self, context): time_partitioning=self.time_partitioning, cluster_fields=self.cluster_fields) -if self.max_id_key: +if self.do_xcom_push and self.max_id_key: Review comment: I wondered too, but I decided to add the `xcom_push` flag so that `max_id_key` is not an exceptional flag that enables the xcom feature. The best option might be renaming `max_id_key` to the `xcom_push` flag and adding documentation to explain what is pushed to xcom when the feature is enabled.
[GitHub] itscaro commented on a change in pull request #3981: [AIRFLOW-3133] Implement xcom_push flag for contrib's operators
itscaro commented on a change in pull request #3981: [AIRFLOW-3133] Implement xcom_push flag for contrib's operators URL: https://github.com/apache/incubator-airflow/pull/3981#discussion_r225113656 ## File path: airflow/contrib/operators/bigquery_get_data.py ## @@ -78,6 +80,7 @@ def __init__(self, selected_fields=None, bigquery_conn_id='bigquery_default', delegate_to=None, + do_xcom_push=True, Review comment: The choice of default value `True` was intended to preserve backward compatibility, as suggested by @ashb. But I agree that `False` is more consistent with the core operators: https://github.com/apache/incubator-airflow/pull/3981#issuecomment-425855530
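The trade-off discussed in this review can be seen in a minimal sketch (illustrative classes, not Airflow's `BaseOperator` or BigQuery hook): defaulting `do_xcom_push=True` keeps existing DAGs pushing results unchanged, while `False` would match the core operators but silently change behaviour for current users.

```python
class FakeTaskInstance:
    """Stand-in for an Airflow task instance's XCom interface."""

    def __init__(self):
        self.pushed = {}

    def xcom_push(self, key, value):
        self.pushed[key] = value


class GetDataOperator:
    """Illustrative operator with a do_xcom_push flag."""

    def __init__(self, do_xcom_push=True):
        self.do_xcom_push = do_xcom_push

    def execute(self, ti):
        rows = [["a", 1], ["b", 2]]  # stand-in for a query result
        if self.do_xcom_push:
            ti.xcom_push(key="return_value", value=rows)
        return rows


ti = FakeTaskInstance()
GetDataOperator().execute(ti)                     # default: pushes
print("return_value" in ti.pushed)                # -> True

ti2 = FakeTaskInstance()
GetDataOperator(do_xcom_push=False).execute(ti2)  # opt-out: no push
print("return_value" in ti2.pushed)               # -> False
```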
[jira] [Commented] (AIRFLOW-3155) Add ability to filter by a last modified time in GoogleCloudStorageToGoogleCloudStorageOperator
[ https://issues.apache.org/jira/browse/AIRFLOW-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650023#comment-16650023 ] ASF GitHub Bot commented on AIRFLOW-3155: - kaxil closed pull request #4008: [AIRFLOW-3155] Add ability to filter by a last modified time in GoogleCloudStorageToGoogleCloudStorageOperator URL: https://github.com/apache/incubator-airflow/pull/4008 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/airflow/contrib/operators/gcs_to_gcs.py b/airflow/contrib/operators/gcs_to_gcs.py index 12fbff5276..0e1087e4d2 100644 --- a/airflow/contrib/operators/gcs_to_gcs.py +++ b/airflow/contrib/operators/gcs_to_gcs.py @@ -62,6 +62,10 @@ class GoogleCloudStorageToGoogleCloudStorageOperator(BaseOperator): For this to work, the service account making the request must have domain-wide delegation enabled. :type delegate_to: str +:param last_modified_time: When specified, if the object(s) were +modified after last_modified_time, they will be copied/moved. +If tzinfo has not been set, UTC will be assumed. 
+:type last_modified_time: datetime **Examples**: The following Operator would copy a single file named @@ -114,6 +118,7 @@ def __init__(self, move_object=False, google_cloud_storage_conn_id='google_cloud_default', delegate_to=None, + last_modified_time=None, *args, **kwargs): super(GoogleCloudStorageToGoogleCloudStorageOperator, @@ -125,6 +130,7 @@ def __init__(self, self.move_object = move_object self.google_cloud_storage_conn_id = google_cloud_storage_conn_id self.delegate_to = delegate_to +self.last_modified_time = last_modified_time self.wildcard = '*' def execute(self, context): @@ -140,6 +146,13 @@ def execute(self, context): objects = hook.list(self.source_bucket, prefix=prefix, delimiter=delimiter) for source_object in objects: +if self.last_modified_time is not None: +# Check to see if object was modified after last_modified_time +if hook.is_updated_after(self.source_bucket, source_object, + self.last_modified_time): +pass +else: +continue if self.destination_object is None: destination_object = source_object else: @@ -156,6 +169,14 @@ def execute(self, context): hook.delete(self.source_bucket, source_object) else: +if self.last_modified_time is not None: +if hook.is_updated_after(self.source_bucket, + self.source_object, + self.last_modified_time): +pass +else: +return + self.log.info( log_message.format(self.source_bucket, self.source_object, self.destination_bucket or self.source_bucket, diff --git a/tests/contrib/operators/test_gcs_to_gcs_operator.py b/tests/contrib/operators/test_gcs_to_gcs_operator.py index 6b866d11e1..dd16e2f2df 100644 --- a/tests/contrib/operators/test_gcs_to_gcs_operator.py +++ b/tests/contrib/operators/test_gcs_to_gcs_operator.py @@ -18,6 +18,7 @@ # under the License. 
import unittest +from datetime import datetime from airflow.contrib.operators.gcs_to_gcs import \ GoogleCloudStorageToGoogleCloudStorageOperator @@ -38,6 +39,7 @@ SOURCE_OBJECT_2 = 'test_object*' SOURCE_OBJECT_3 = 'test*object' SOURCE_OBJECT_4 = 'test_object*.txt' +SOURCE_OBJECT_5 = 'test_object.txt' DESTINATION_BUCKET = 'archive' DESTINATION_OBJECT_PREFIX = 'foo/bar' SOURCE_FILES_LIST = [ @@ -45,6 +47,7 @@ 'test_object/file2.txt', 'test_object/file3.json', ] +MOD_TIME_1 = datetime(2016, 1, 1) class GoogleCloudStorageToCloudStorageOperatorTest(unittest.TestCase): @@ -167,3 +170,113 @@ def test_execute_wildcard_empty_destination_object(self, mock_hook): DESTINATION_BUCKET, '/file2.txt'), ] mock_hook.return_value.rewrite.assert_has_calls(mock_calls_empty) + +@mock.patch('airflow.contrib.operators.gcs_to_gcs.GoogleCloudStorageHook') +def test_execute_last_modified_time(self, mock_hook): +mock_hook.return_value.list.return_value = SOURCE_FILES_LIST +operator = GoogleCloudStorageToGoogleCloudStorageOperator( +task_id=TASK_ID, source_bucket=TEST_BUCKET, +
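As a readability note, the `if ...: pass / else: continue` shape in the patch above can be folded into a single negated condition without changing behaviour. A sketch with a stub standing in for `GoogleCloudStorageHook` (the helper names here are illustrative, not Airflow's API):

```python
def filter_objects(hook, bucket, objects, last_modified_time):
    """Yield only objects that pass the last_modified_time check.

    Equivalent to the patch's loop body: when last_modified_time is
    set, skip objects not updated after it; when unset, keep all.
    """
    for obj in objects:
        if last_modified_time is not None and not hook.is_updated_after(
                bucket, obj, last_modified_time):
            continue
        yield obj


class StubHook:
    """Minimal stub: knows which object names were "updated after"."""

    def __init__(self, updated):
        self.updated = updated

    def is_updated_after(self, bucket, obj, ts):
        return obj in self.updated


hook = StubHook(updated={"file2.txt"})
print(list(filter_objects(hook, "data", ["file1.txt", "file2.txt"], "2016-01-01")))
# -> ['file2.txt']
```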
[jira] [Resolved] (AIRFLOW-3155) Add ability to filter by a last modified time in GoogleCloudStorageToGoogleCloudStorageOperator
[ https://issues.apache.org/jira/browse/AIRFLOW-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaxil Naik resolved AIRFLOW-3155. - Resolution: Fixed Fix Version/s: 2.0.0 Resolved by https://github.com/apache/incubator-airflow/pull/4008 > Add ability to filter by a last modified time in > GoogleCloudStorageToGoogleCloudStorageOperator > --- > > Key: AIRFLOW-3155 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3155 > Project: Apache Airflow > Issue Type: Improvement > Components: gcp >Affects Versions: 2.0.0 >Reporter: Brandon Kvarda >Assignee: Brandon Kvarda >Priority: Minor > Fix For: 2.0.0 > > > Currently the GoogleCloudStorageToGoogleCloudStorageOperator doesn't support > filtering objects based on a last modified time/date. This would add the > ability to further filter source object(s) to copy/move based on a last > modified time threshold (for example, if the objects were updated after the > last run at 10:00 yesterday, then copy/move them; otherwise, do not.)
[GitHub] kaxil closed pull request #4008: [AIRFLOW-3155] Add ability to filter by a last modified time in GoogleCloudStorageToGoogleCloudStorageOperator
kaxil closed pull request #4008: [AIRFLOW-3155] Add ability to filter by a last modified time in GoogleCloudStorageToGoogleCloudStorageOperator
URL: https://github.com/apache/incubator-airflow/pull/4008

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

diff --git a/airflow/contrib/operators/gcs_to_gcs.py b/airflow/contrib/operators/gcs_to_gcs.py
index 12fbff5276..0e1087e4d2 100644
--- a/airflow/contrib/operators/gcs_to_gcs.py
+++ b/airflow/contrib/operators/gcs_to_gcs.py
@@ -62,6 +62,10 @@ class GoogleCloudStorageToGoogleCloudStorageOperator(BaseOperator):
         For this to work, the service account making the request must have
         domain-wide delegation enabled.
     :type delegate_to: str
+    :param last_modified_time: When specified, if the object(s) were
+        modified after last_modified_time, they will be copied/moved.
+        If tzinfo has not been set, UTC will be assumed.
+    :type last_modified_time: datetime

     **Examples**:
         The following Operator would copy a single file named
@@ -114,6 +118,7 @@ def __init__(self,
                  move_object=False,
                  google_cloud_storage_conn_id='google_cloud_default',
                  delegate_to=None,
+                 last_modified_time=None,
                  *args, **kwargs):
         super(GoogleCloudStorageToGoogleCloudStorageOperator,
@@ -125,6 +130,7 @@ def __init__(self,
         self.move_object = move_object
         self.google_cloud_storage_conn_id = google_cloud_storage_conn_id
         self.delegate_to = delegate_to
+        self.last_modified_time = last_modified_time
         self.wildcard = '*'

     def execute(self, context):
@@ -140,6 +146,13 @@ def execute(self, context):
             objects = hook.list(self.source_bucket, prefix=prefix, delimiter=delimiter)

             for source_object in objects:
+                if self.last_modified_time is not None:
+                    # Check to see if object was modified after last_modified_time
+                    if hook.is_updated_after(self.source_bucket, source_object,
+                                             self.last_modified_time):
+                        pass
+                    else:
+                        continue
                 if self.destination_object is None:
                     destination_object = source_object
                 else:
@@ -156,6 +169,14 @@ def execute(self, context):
                 hook.delete(self.source_bucket, source_object)
         else:
+            if self.last_modified_time is not None:
+                if hook.is_updated_after(self.source_bucket,
+                                         self.source_object,
+                                         self.last_modified_time):
+                    pass
+                else:
+                    return
+
             self.log.info(
                 log_message.format(self.source_bucket, self.source_object,
                                    self.destination_bucket or self.source_bucket,
diff --git a/tests/contrib/operators/test_gcs_to_gcs_operator.py b/tests/contrib/operators/test_gcs_to_gcs_operator.py
index 6b866d11e1..dd16e2f2df 100644
--- a/tests/contrib/operators/test_gcs_to_gcs_operator.py
+++ b/tests/contrib/operators/test_gcs_to_gcs_operator.py
@@ -18,6 +18,7 @@
 # under the License.

 import unittest
+from datetime import datetime

 from airflow.contrib.operators.gcs_to_gcs import \
     GoogleCloudStorageToGoogleCloudStorageOperator
@@ -38,6 +39,7 @@
 SOURCE_OBJECT_2 = 'test_object*'
 SOURCE_OBJECT_3 = 'test*object'
 SOURCE_OBJECT_4 = 'test_object*.txt'
+SOURCE_OBJECT_5 = 'test_object.txt'
 DESTINATION_BUCKET = 'archive'
 DESTINATION_OBJECT_PREFIX = 'foo/bar'
 SOURCE_FILES_LIST = [
@@ -45,6 +47,7 @@
     'test_object/file2.txt',
     'test_object/file3.json',
 ]
+MOD_TIME_1 = datetime(2016, 1, 1)


 class GoogleCloudStorageToCloudStorageOperatorTest(unittest.TestCase):
@@ -167,3 +170,113 @@ def test_execute_wildcard_empty_destination_object(self, mock_hook):
             DESTINATION_BUCKET, '/file2.txt'),
         ]
         mock_hook.return_value.rewrite.assert_has_calls(mock_calls_empty)
+
+    @mock.patch('airflow.contrib.operators.gcs_to_gcs.GoogleCloudStorageHook')
+    def test_execute_last_modified_time(self, mock_hook):
+        mock_hook.return_value.list.return_value = SOURCE_FILES_LIST
+        operator = GoogleCloudStorageToGoogleCloudStorageOperator(
+            task_id=TASK_ID, source_bucket=TEST_BUCKET,
+            source_object=SOURCE_OBJECT_4,
+            destination_bucket=DESTINATION_BUCKET,
+            last_modified_time=None)
+
+        operator.execute(None)
+        mock_calls_none = [
+            mock.call(TEST_BUCKET,
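The filtering rule introduced by the diff above (copy or move an object only when it was modified after `last_modified_time`, treating naive datetimes as UTC) can be sketched in plain Python. The `should_copy` helper below is an illustrative assumption, not the operator's actual code, which delegates the check to `hook.is_updated_after`:

```python
from datetime import datetime, timezone

def should_copy(object_updated_at, last_modified_time):
    # Hypothetical helper mirroring the operator's check: with no
    # threshold, everything is copied; otherwise only objects modified
    # after the threshold pass.
    if last_modified_time is None:
        return True
    # Per the docstring in the diff, a naive threshold is assumed to be UTC.
    if last_modified_time.tzinfo is None:
        last_modified_time = last_modified_time.replace(tzinfo=timezone.utc)
    if object_updated_at.tzinfo is None:
        object_updated_at = object_updated_at.replace(tzinfo=timezone.utc)
    return object_updated_at > last_modified_time

# An object touched in 2018 passes a 2016-01-01 threshold; an older one is skipped.
print(should_copy(datetime(2018, 10, 15), datetime(2016, 1, 1)))  # True
print(should_copy(datetime(2015, 6, 1), datetime(2016, 1, 1)))    # False
```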
[GitHub] johnhofman commented on issue #3960: [AIRFLOW-2966] Catch ApiException in the Kubernetes Executor
johnhofman commented on issue #3960: [AIRFLOW-2966] Catch ApiException in the Kubernetes Executor
URL: https://github.com/apache/incubator-airflow/pull/3960#issuecomment-429789504
@Fokko I have rebased. Is this failing test something I need to look into?
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
With regards, Apache Git Services
[GitHub] kaxil commented on issue #4054: [AIRFLOW-XXX] Fix GCS Operator docstrings
kaxil commented on issue #4054: [AIRFLOW-XXX] Fix GCS Operator docstrings
URL: https://github.com/apache/incubator-airflow/pull/4054#issuecomment-429788296
@Fokko Yes, I agree, that should be the way to go. However, I haven't really been able to find a decent linting tool for that yet.
[GitHub] kaxil closed pull request #4054: [AIRFLOW-XXX] Fix GCS Operator docstrings
kaxil closed pull request #4054: [AIRFLOW-XXX] Fix GCS Operator docstrings
URL: https://github.com/apache/incubator-airflow/pull/4054
[jira] [Commented] (AIRFLOW-3205) GCS: Support multipart upload
[ https://issues.apache.org/jira/browse/AIRFLOW-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649995#comment-16649995 ] Gordon Ball commented on AIRFLOW-3205:
The behaviour of the MySQL->GCS operator is to split the output into multiple files, whereas this issue is about uploading a single logical file in multiple HTTP requests to avoid a size limit. The former behaviour is useful by itself (e.g. for import to BigQuery, the multiple uploaded files can be imported in parallel instead of a slow serial import of a single file), but is orthogonal to this case.
> GCS: Support multipart upload
>
> Key: AIRFLOW-3205
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3205
> Project: Apache Airflow
> Issue Type: Improvement
> Components: gcp
> Reporter: Gordon Ball
> Priority: Minor
>
> GoogleCloudStorageHook currently only provides support for uploading files in
> a single HTTP request. This means that uploads fail with SSL errors for files
> larger than 2GiB (presumably an int32 overflow; it might depend on which SSL
> library is being used). Multipart uploads should be supported to allow large
> uploads, and possibly increase reliability for smaller uploads.
[jira] [Commented] (AIRFLOW-3205) GCS: Support multipart upload
[ https://issues.apache.org/jira/browse/AIRFLOW-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649979#comment-16649979 ] jack commented on AIRFLOW-3205:
Some operators, like MySqlToGoogleCloudStorageOperator, do support this behaviour. You can specify the max file size, and you can also set the filename param to name{}.json, which will create name0.json, name1.json, and so on, as many as needed until all records are imported.
https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/operators/mysql_to_gcs.py
I haven't checked whether this behaviour comes from the operator or from the hook, but I agree that it would be nice to have it in all operators that interact with Google Cloud Storage.
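The name{}.json templating jack describes can be illustrated with plain string formatting. This is a sketch of the naming behaviour only, not the operator's internals; the chunk list is a stand-in for exported row batches:

```python
# 'name{}.json' expands to name0.json, name1.json, ... as the export
# is split into files no larger than the configured max file size.
filename_template = 'name{}.json'
chunks = ['batch-a', 'batch-b', 'batch-c']  # hypothetical exported row batches
files = [filename_template.format(i) for i, _ in enumerate(chunks)]
print(files)  # ['name0.json', 'name1.json', 'name2.json']
```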
[GitHub] codecov-io edited a comment on issue #4055: [AIRFLOW-3206] neutral and clear GPL dependency notice
codecov-io edited a comment on issue #4055: [AIRFLOW-3206] neutral and clear GPL dependency notice
URL: https://github.com/apache/incubator-airflow/pull/4055#issuecomment-429739296
# [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=h1) Report
> Merging [#4055](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/fac5a8e623a5c702adece7234547861b1cb2d1d8?src=pr=desc) will **not change** coverage.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/4055/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=tree)

```diff
@@           Coverage Diff           @@
##           master   #4055   +/-   ##
======================================
  Coverage   75.91%   75.91%
======================================
  Files         199      199
  Lines       15948    15948
======================================
  Hits        12107    12107
  Misses       3841     3841
```

[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=footer). Last update [fac5a8e...8244820](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] brylie opened a new pull request #4055: [AIRFLOW-3206] neutral and clear GPL dependency notice
brylie opened a new pull request #4055: [AIRFLOW-3206] neutral and clear GPL dependency notice
URL: https://github.com/apache/incubator-airflow/pull/4055
Clarify the GPL dependency notice. Add line breaks for readability. Use neutral terms 'disallow' and 'allow'.
[jira] [Commented] (AIRFLOW-3206) More neutral language regarding Copyleft in installation instructions
[ https://issues.apache.org/jira/browse/AIRFLOW-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649822#comment-16649822 ] ASF GitHub Bot commented on AIRFLOW-3206:
brylie opened a new pull request #4055: [AIRFLOW-3206] neutral and clear GPL dependency notice
URL: https://github.com/apache/incubator-airflow/pull/4055
Clarify the GPL dependency notice. Add line breaks for readability. Use neutral terms 'disallow' and 'allow'.
> More neutral language regarding Copyleft in installation instructions
>
> Key: AIRFLOW-3206
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3206
> Project: Apache Airflow
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 1.10.0
> Reporter: Brylie Christopher Oxley
> Assignee: Brylie Christopher Oxley
> Priority: Trivial
> Labels: newbie
> Original Estimate: 0.5h
> Remaining Estimate: 0.5h
>
> When installing Airflow, the user must set an environment variable to
> explicitly _allow_ or _disallow_ the installation of a GPL dependency. The
> text of the error message is somewhat difficult to read, and seems biased
> against the GPL dependency.
> h2. Task
> * add proper line breaks to GPL dependency notice, for improved readability
> * use neutral language _allow_ and _disallow_ (as opposed to 'force')
[jira] [Created] (AIRFLOW-3206) More neutral language regarding Copyleft in installation instructions
Brylie Christopher Oxley created AIRFLOW-3206:
Summary: More neutral language regarding Copyleft in installation instructions
Key: AIRFLOW-3206
URL: https://issues.apache.org/jira/browse/AIRFLOW-3206
Project: Apache Airflow
Issue Type: Improvement
Components: Documentation
Affects Versions: 1.10.0
Reporter: Brylie Christopher Oxley
Assignee: Brylie Christopher Oxley
When installing Airflow, the user must set an environment variable to explicitly _allow_ or _disallow_ the installation of a GPL dependency. The text of the error message is somewhat difficult to read, and seems biased against the GPL dependency.
h2. Task
* add proper line breaks to GPL dependency notice, for improved readability
* use neutral language _allow_ and _disallow_ (as opposed to 'force')
[jira] [Updated] (AIRFLOW-3206) More neutral language regarding Copyleft in installation instructions
[ https://issues.apache.org/jira/browse/AIRFLOW-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brylie Christopher Oxley updated AIRFLOW-3206:
Description: When installing Airflow, the user must set an environment variable to explicitly _allow_ or _disallow_ the installation of a GPL dependency. The text of the error message is somewhat difficult to read, and seems biased against the GPL dependency.
h2. Task
* add proper line breaks to GPL dependency notice, for improved readability
* use neutral language _allow_ and _disallow_ (as opposed to 'force')
was: When installing Airflow, the user must set an environment variable to explicitly _allow_ or _disallow_ the installation of a GPL dependency. The text of the error message is somewhat difficult to read, and seems biased against the GPL dependency.
h2. Task
* add proper line breaks to GPL dependency notice, for improved readability
* use neutral language _allow_ and _disallow_ (as opposed to 'force')
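For context, the notice this issue rewords is triggered at install time by an environment variable check. A sketch of the two settings involved (variable names as used by Airflow 1.10; worth verifying against the installation docs for your version):

```shell
# Option 1: explicitly allow the GPL 'unidecode' dependency.
export AIRFLOW_GPL_UNIDECODE=yes

# Option 2: disallow it and use the non-GPL 'text-unidecode' fallback instead.
# export SLUGIFY_USES_TEXT_UNIDECODE=yes

# With either variable set, the install proceeds without the notice, e.g.:
# pip install apache-airflow==1.10.0   (shown for context; not run here)
echo "AIRFLOW_GPL_UNIDECODE=$AIRFLOW_GPL_UNIDECODE"
```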
[GitHub] codecov-io edited a comment on issue #3741: [AIRFLOW-1368] Add auto_remove for DockerOperator
codecov-io edited a comment on issue #3741: [AIRFLOW-1368] Add auto_remove for DockerOperator
URL: https://github.com/apache/incubator-airflow/pull/3741#issuecomment-412492251
# [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3741?src=pr=h1) Report
> Merging [#3741](https://codecov.io/gh/apache/incubator-airflow/pull/3741?src=pr=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/b8be322d3badfeadfa8f08e0bf92a12a6cd26418?src=pr=desc) will **increase** coverage by `1.85%`.
> The diff coverage is `100%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/3741/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3741?src=pr=tree)

```diff
@@            Coverage Diff             @@
##           master    #3741      +/-   ##
==========================================
+ Coverage   75.79%   77.65%   +1.85%
==========================================
  Files         199      204       +5
  Lines       15946    15850      -96
==========================================
+ Hits        12086    12308     +222
+ Misses       3860     3542     -318
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/3741?src=pr=tree) | Coverage Δ | |
|---|---|---|
| [airflow/operators/docker\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvZG9ja2VyX29wZXJhdG9yLnB5) | `97.7% <100%> (+97.7%)` | :arrow_up: |
| [airflow/sensors/s3\_key\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3MzX2tleV9zZW5zb3IucHk=) | `31.03% <0%> (-68.97%)` | :arrow_down: |
| [airflow/sensors/s3\_prefix\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3MzX3ByZWZpeF9zZW5zb3IucHk=) | `41.17% <0%> (-58.83%)` | :arrow_down: |
| [airflow/utils/helpers.py](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9oZWxwZXJzLnB5) | `71.34% <0%> (-13.04%)` | :arrow_down: |
| [airflow/hooks/mysql\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9teXNxbF9ob29rLnB5) | `78% <0%> (-12%)` | :arrow_down: |
| [airflow/sensors/sql\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3NxbF9zZW5zb3IucHk=) | `90.47% <0%> (-9.53%)` | :arrow_down: |
| [airflow/configuration.py](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree#diff-YWlyZmxvdy9jb25maWd1cmF0aW9uLnB5) | `83.95% <0%> (-5.31%)` | :arrow_down: |
| [airflow/utils/state.py](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zdGF0ZS5weQ==) | `93.33% <0%> (-3.34%)` | :arrow_down: |
| [airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=) | `88.82% <0%> (-2.89%)` | :arrow_down: |
| [airflow/utils/email.py](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9lbWFpbC5weQ==) | `97.4% <0%> (-2.6%)` | :arrow_down: |
| ... and [66 more](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree-more) | |

[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3741?src=pr=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3741?src=pr=footer). Last update [b8be322...c9247b4](https://codecov.io/gh/apache/incubator-airflow/pull/3741?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[jira] [Created] (AIRFLOW-3205) GCS: Support multipart upload
Gordon Ball created AIRFLOW-3205:
Summary: GCS: Support multipart upload
Key: AIRFLOW-3205
URL: https://issues.apache.org/jira/browse/AIRFLOW-3205
Project: Apache Airflow
Issue Type: Improvement
Components: gcp
Reporter: Gordon Ball
GoogleCloudStorageHook currently only provides support for uploading files in a single HTTP request. This means that uploads fail with SSL errors for files larger than 2GiB (presumably an int32 overflow; it might depend on which SSL library is being used). Multipart uploads should be supported to allow large uploads, and possibly increase reliability for smaller uploads.
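A multipart (resumable) upload along the lines proposed here splits the payload into bounded-size HTTP requests, each carrying a byte range. A minimal sketch of the generic chunking involved — illustrative only, not `GoogleCloudStorageHook`'s actual API:

```python
def iter_chunks(data, chunk_size):
    # Yield (start, end, chunk) byte ranges, as a resumable upload would
    # send them in successive requests instead of one oversized request
    # that can overflow a 32-bit length somewhere in the SSL stack.
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        yield start, start + len(chunk) - 1, chunk

payload = b'x' * 10
ranges = [(start, end) for start, end, _ in iter_chunks(payload, 4)]
print(ranges)  # [(0, 3), (4, 7), (8, 9)]
```

Reassembling the chunks in order reproduces the original payload, which is the invariant a server-side multipart API relies on.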