[jira] [Updated] (AIRFLOW-3169) Indicate in the main UI if the scheduler is NOT working.

2018-10-15 Thread jack (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jack updated AIRFLOW-3169:
--
Labels:   (was: easy-fix)

> Indicate in the main UI if the scheduler is NOT working.
> 
>
> Key: AIRFLOW-3169
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3169
> Project: Apache Airflow
>  Issue Type: Improvement
>Affects Versions: 1.10.0
>Reporter: jack
>Priority: Major
>
> I came to work today and took a look at the Airflow UI.
> Everything was green (success) - it took me a while to notice that the dates 
> of the tasks were from Thursday. The scheduler had been offline the whole weekend.
> Only when I restarted the scheduler did tasks begin to run. I don't know why 
> the scheduler stopped, but it would be great if the UI indicated on the main 
> screen when the scheduler is offline.
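The request above can be sketched as a heartbeat-staleness check. This is a hedged illustration, not Airflow's actual implementation: the function name and the five-minute threshold are assumptions (Airflow does record scheduler heartbeats in its metadata database as SchedulerJob rows, which is where `latest_heartbeat` would come from).

```python
from datetime import datetime, timedelta

# Assumed threshold; a real deployment would derive this from
# scheduler_heartbeat_sec in airflow.cfg.
HEARTBEAT_STALE_AFTER = timedelta(minutes=5)

def scheduler_is_alive(latest_heartbeat, now=None):
    """Return True if the scheduler's last heartbeat is recent enough to
    call it 'online'. latest_heartbeat would be read from the most recent
    SchedulerJob row; here it is a plain datetime for illustration."""
    if latest_heartbeat is None:
        return False  # no scheduler has ever heartbeated
    now = now or datetime.utcnow()
    return now - latest_heartbeat < HEARTBEAT_STALE_AFTER
```

The main DAGs page could then render a warning banner whenever this returns False.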



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3169) Indicate in the main UI if the scheduler is NOT working.

2018-10-15 Thread jack (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jack updated AIRFLOW-3169:
--
Affects Version/s: 1.10.0
   Labels: easy-fix  (was: )

> Indicate in the main UI if the scheduler is NOT working.
> 
>
> Key: AIRFLOW-3169
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3169
> Project: Apache Airflow
>  Issue Type: Improvement
>Affects Versions: 1.10.0
>Reporter: jack
>Priority: Major
>
> I came to work today and took a look at the Airflow UI.
> Everything was green (success) - it took me a while to notice that the dates 
> of the tasks were from Thursday. The scheduler had been offline the whole weekend.
> Only when I restarted the scheduler did tasks begin to run. I don't know why 
> the scheduler stopped, but it would be great if the UI indicated on the main 
> screen when the scheduler is offline.





[jira] [Updated] (AIRFLOW-3158) Improve error message for Broken DAG by adding function name

2018-10-15 Thread jack (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jack updated AIRFLOW-3158:
--
Affects Version/s: (was: 1.9.0)
   1.10.0
   Labels: easy-fix  (was: )

> Improve error message for Broken DAG by adding function name
> 
>
> Key: AIRFLOW-3158
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3158
> Project: Apache Airflow
>  Issue Type: Task
>Affects Versions: 1.10.0
>Reporter: jack
>Priority: Trivial
>  Labels: easy-fix
>
> The following message appears:
> {color:#a94442}Broken DAG: [/home/ubuntu/airflow/dags/a_dag.py] Relationships 
> can only be set between Operators; received function{color}
> when evaluating
> {code:java}
> A >> B{code}
>  
> in the case where A is an operator and B is a function.
> The error message could be improved to:
> {color:#a94442}Broken DAG: [/home/ubuntu/airflow/dags/a_dag.py] Relationships 
> can only be set between Operators; received function B{color}
> This is a small change that makes the error user-friendly and specifies 
> exactly where the issue is.
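One hedged way to sketch the suggested improvement (the helper name is illustrative, not Airflow's actual internals): when building the error, pull the offending object's `__name__` so the message points at the exact symbol.

```python
def relationship_type_error(received):
    """Build the 'Relationships can only be set between Operators'
    message, naming the offending object when it exposes __name__
    (plain functions do). Illustrative helper, not Airflow's code."""
    kind = type(received).__name__          # e.g. 'function'
    name = getattr(received, "__name__", "")
    message = "Relationships can only be set between Operators; received {}".format(kind)
    if name:
        message += " {}".format(name)
    return message

def B():
    pass

# An operator's __rshift__ could raise TypeError(relationship_type_error(B))
# instead of the current nameless message.
```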





[jira] [Created] (AIRFLOW-3214) [Kubernetes] No logs, hard to debug

2018-10-15 Thread Zhi Lin (JIRA)
Zhi Lin created AIRFLOW-3214:


 Summary: [Kubernetes] No logs, hard to debug
 Key: AIRFLOW-3214
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3214
 Project: Apache Airflow
  Issue Type: Bug
  Components: kubernetes
Affects Versions: 1.10.0
Reporter: Zhi Lin


I have a fresh install of Airflow on Kubernetes, basically running the scripts in 
scripts/ci/kubernetes, and I fixed the flask-appbuilder version so the webserver 
runs fine.

But when I try to run the example-kubernetes-executor DAG, it fails and shows 
nothing useful in the log. I then tried to pull the airflow/ci image manually, 
which also fails: it says there is no such image, or that login is needed. I 
tried changing the image in the DAG to one I already have locally, but it still 
fails, and this time there is no log at all...

I also wonder: if I want to use KubernetesPodOperator, can I not run Airflow in 
a k8s pod? It tries to find the k8s config files...

So what should I do? What I want is to run some machine learning tasks in a 
specified image. Sometimes I need to run a Python script in that spawned pod, so 
I guess I need to put that file in the DAG folder? Any suggestions are 
appreciated.
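Not part of the report, but a typical way to dig for the missing logs when executor pods fail before Airflow can collect their output (pod name and namespace are placeholders):

```shell
# List the worker pods the KubernetesExecutor spawned
kubectl get pods --all-namespaces | grep airflow

# Pull a worker pod's stdout/stderr directly
kubectl logs <worker-pod-name> -n <namespace>

# 'describe' surfaces image-pull failures (e.g. ErrImagePull /
# ImagePullBackOff) that never reach the Airflow task log
kubectl describe pod <worker-pod-name> -n <namespace>
```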





[GitHub] codecov-io edited a comment on issue #3656: [AIRFLOW-2803] Fix all ESLint issues

2018-10-15 Thread GitBox
codecov-io edited a comment on issue #3656: [AIRFLOW-2803] Fix all ESLint issues
URL: 
https://github.com/apache/incubator-airflow/pull/3656#issuecomment-408503531
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr=h1)
 Report
   > Merging 
[#3656](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/b8be322d3badfeadfa8f08e0bf92a12a6cd26418?src=pr=desc)
 will **increase** coverage by `1.71%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3656/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr=tree)
   
    ```diff
    @@            Coverage Diff             @@
    ##           master    #3656      +/-   ##
    ==========================================
    + Coverage   75.79%   77.51%   +1.71%
    ==========================================
      Files         199      205       +6
      Lines       15946    15751     -195
    ==========================================
    + Hits        12086    12209     +123
    + Misses       3860     3542     -318
    ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/sensors/s3\_key\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3MzX2tleV9zZW5zb3IucHk=)
 | `31.03% <0%> (-68.97%)` | :arrow_down: |
   | 
[airflow/sensors/s3\_prefix\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3MzX3ByZWZpeF9zZW5zb3IucHk=)
 | `41.17% <0%> (-58.83%)` | :arrow_down: |
   | 
[airflow/utils/helpers.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9oZWxwZXJzLnB5)
 | `67.07% <0%> (-17.31%)` | :arrow_down: |
   | 
[airflow/hooks/mysql\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9teXNxbF9ob29rLnB5)
 | `78% <0%> (-12%)` | :arrow_down: |
   | 
[airflow/sensors/sql\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3NxbF9zZW5zb3IucHk=)
 | `90.47% <0%> (-9.53%)` | :arrow_down: |
   | 
[airflow/utils/sqlalchemy.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zcWxhbGNoZW15LnB5)
 | `73.91% <0%> (-7.52%)` | :arrow_down: |
   | 
[airflow/configuration.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree#diff-YWlyZmxvdy9jb25maWd1cmF0aW9uLnB5)
 | `83.95% <0%> (-5.31%)` | :arrow_down: |
   | 
[airflow/utils/state.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zdGF0ZS5weQ==)
 | `93.33% <0%> (-3.34%)` | :arrow_down: |
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `88.54% <0%> (-3.18%)` | :arrow_down: |
   | 
[airflow/www\_rbac/utils.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree#diff-YWlyZmxvdy93d3dfcmJhYy91dGlscy5weQ==)
 | `66.21% <0%> (-2.73%)` | :arrow_down: |
   | ... and [74 
more](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr=footer).
 Last update 
[b8be322...65f3b96](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-3213) Create ADLS to GCS operator

2018-10-15 Thread Brandon Kvarda (JIRA)
Brandon Kvarda created AIRFLOW-3213:
---

 Summary: Create ADLS to GCS operator 
 Key: AIRFLOW-3213
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3213
 Project: Apache Airflow
  Issue Type: Improvement
  Components: gcp, operators
Reporter: Brandon Kvarda
Assignee: Brandon Kvarda


Create ADLS to GCS operator that supports copying of files from ADLS to GCS
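A hedged sketch of one core piece such an operator would need (hypothetical names; the eventual PR may differ): mapping each listed ADLS file path to its destination GCS object, which a copy loop could then feed to the respective Azure and GCS hooks.

```python
def build_copy_pairs(adls_paths, dest_gcs_prefix):
    """Map ADLS file paths to (source, destination) pairs for upload.
    dest_gcs_prefix is e.g. 'gs://my-bucket/backups/' (hypothetical).
    The file's relative path is preserved under the destination prefix."""
    pairs = []
    for path in adls_paths:
        dest = dest_gcs_prefix.rstrip("/") + "/" + path.lstrip("/")
        pairs.append((path, dest))
    return pairs
```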





[jira] [Updated] (AIRFLOW-3211) Airflow losing track of running GCP Dataproc jobs upon Airflow restart

2018-10-15 Thread Julie Chien (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julie Chien updated AIRFLOW-3211:
-
Description: 
If Airflow restarts (say, due to deployments, system updates, or regular 
machine restarts such as the weekly restarts in GCP App Engine) while it's 
running a job on GCP Dataproc, it'll lose track of that job, mark the task as 
failed, and eventually retry. However, the jobs may still be running on 
Dataproc and maybe even finish successfully. So when Airflow retries and reruns 
the job, the same job will run twice. This can result in issues like delayed 
workflows, increased costs, and duplicate data. 
  
 To reproduce:
 # Install Airflow and set up a GCP project that has Dataproc enabled. Create a 
bucket in the GCP project.
 # Install this DAG in the Airflow instance: 
[https://github.com/GoogleCloudPlatform/python-docs-samples/blob/b80895ed88ba86fce223df27a48bf481007ca708/composer/workflows/quickstart.py]
  Set up the Airflow variables as instructed in the comments at the top of the 
file.
 # Start the Airflow scheduler and webserver. Kick off a run of the above DAG 
through the Airflow UI. Wait for the cluster to spin up and the job to start 
running on Dataproc.
 # Kill the scheduler and webserver, and then start them back up.
 # Wait for Airflow to retry the task. Click on the cluster in Dataproc to 
observe that the job will have been resubmitted, even though the first job is 
still running without error.
  
 At Etsy, we've customized the Dataproc operators to allow for the new Airflow 
task to pick up where the old one left off upon Airflow restarts, and have been 
happily using our solution for the past 6 months. I'd like to submit a PR to 
merge this change upstream.
  

  was:
If Airflow restarts (say, due to deployments, system updates, or regular 
machine restarts such as the weekly restarts in GCP App Engine) while it's 
running a job on GCP Dataproc, it'll lose track of that job, mark the task as 
failed, and eventually retry. However, the jobs may still be running on 
Dataproc and maybe even finish successfully. So when Airflow retries and reruns 
the job, the same job will run twice. This can result in issues like delayed 
workflows, increased costs, and duplicate data. 
  
 To reproduce:
 # Install Airflow and set up a GCP project that has Dataproc enabled. Create a 
bucket in the GCP project.
 # Install this DAG in the Airflow instance: 
 [https://github.com/GoogleCloudPlatform/python-docs-samples/blob/b80895ed88ba86fce223df27a48bf481007ca708/composer/workflows/quickstart.py].
 Set up the Airflow variables as instructed in the comments at the top of the 
file.
 # Start the Airflow scheduler and webserver. Kick off a run of the above DAG 
through the Airflow UI. Wait for the cluster to spin up and the job to start 
running on Dataproc.
 # Kill the scheduler and webserver, and then start them back up.
 # Wait for Airflow to retry the task. Click on the cluster in Dataproc to 
observe that the job will have been resubmitted, even though the first job is 
still running without error.
  
 At Etsy, we've customized the Dataproc operators to allow for the new Airflow 
task to pick up where the old one left off upon Airflow restarts, and have been 
happily using our solution for the past 6 months. I'd like to submit a PR to 
merge this change upstream.
  


> Airflow losing track of running GCP Dataproc jobs upon Airflow restart
> --
>
> Key: AIRFLOW-3211
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3211
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 1.9.0, 1.10.0
>Reporter: Julie Chien
>Assignee: Julie Chien
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.9.0, 1.10.0
>
>
> If Airflow restarts (say, due to deployments, system updates, or regular 
> machine restarts such as the weekly restarts in GCP App Engine) while it's 
> running a job on GCP Dataproc, it'll lose track of that job, mark the task as 
> failed, and eventually retry. However, the jobs may still be running on 
> Dataproc and maybe even finish successfully. So when Airflow retries and 
> reruns the job, the same job will run twice. This can result in issues like 
> delayed workflows, increased costs, and duplicate data. 
>   
>  To reproduce:
>  # Install Airflow and set up a GCP project that has Dataproc enabled. Create 
> a bucket in the GCP project.
>  # Install this DAG in the Airflow instance: 
> 

[jira] [Updated] (AIRFLOW-3211) Airflow losing track of running GCP Dataproc jobs upon Airflow restart

2018-10-15 Thread Julie Chien (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julie Chien updated AIRFLOW-3211:
-
Description: 
If Airflow restarts (say, due to deployments, system updates, or regular 
machine restarts such as the weekly restarts in GCP App Engine) while it's 
running a job on GCP Dataproc, it'll lose track of that job, mark the task as 
failed, and eventually retry. However, the jobs may still be running on 
Dataproc and maybe even finish successfully. So when Airflow retries and reruns 
the job, the same job will run twice. This can result in issues like delayed 
workflows, increased costs, and duplicate data. 
  
 To reproduce:
 # Install Airflow and set up a GCP project that has Dataproc enabled. Create a 
bucket in the GCP project.
 # Install this DAG in the Airflow instance: 
[https://github.com/GoogleCloudPlatform/python-docs-samples/blob/b80895ed88ba86fce223df27a48bf481007ca708/composer/workflows/quickstart.py].
 Set up the Airflow variables as instructed in the comments at the top of the 
file.
 # Start the Airflow scheduler and webserver. Kick off a run of the above DAG 
through the Airflow UI. Wait for the cluster to spin up and the job to start 
running on Dataproc.
 # Kill the scheduler and webserver, and then start them back up.
 # Wait for Airflow to retry the task. Click on the cluster in Dataproc to 
observe that the job will have been resubmitted, even though the first job is 
still running without error.
  
 At Etsy, we've customized the Dataproc operators to allow for the new Airflow 
task to pick up where the old one left off upon Airflow restarts, and have been 
happily using our solution for the past 6 months. I'd like to submit a PR to 
merge this change upstream.
  

  was:
If Airflow restarts (say, due to deployments, system updates, or regular 
machine restarts such as the weekly restarts in GCP App Engine) while it's 
running a job on GCP Dataproc, it'll lose track of that job, mark the task as 
failed, and eventually retry. However, the jobs may still be running on 
Dataproc and maybe even finish successfully. So when Airflow retries and reruns 
the job, the same job will run twice. This can result in issues like delayed 
workflows, increased costs, and duplicate data. 
  
 To reproduce:
 1. Create a DAG in Airflow that runs a Dataproc job that sleeps for 10 
minutes. Wait for the cluster to spin up and the job to start running on 
Dataproc.
 2. SSH into the machine that hosts Airflow and run the following commands to 
simulate restarting Airflow:
 {{supervisorctl stop airflow-webserver}}
 {{supervisorctl stop airflow-scheduler}}
 {{ps aux | grep python | grep airflow | awk '\{print $2}' | xargs -r kill -9}}
 {{supervisorctl restart airflow-scheduler}}
 {{supervisorctl restart airflow-webserver}}
 3. Wait for Airflow to retry the task after Supervisor respawns the Airflow 
processes. Click on the cluster in Dataproc to observe that the job will have 
been resubmitted, even though the first job is still running without error.
  
 At Etsy, we've customized the Dataproc operators to allow for the new Airflow 
task to pick up where the old one left off upon Airflow restarts, and have been 
happily using our solution for the past 6 months. I'd like to submit a PR to 
merge this change upstream.
  


> Airflow losing track of running GCP Dataproc jobs upon Airflow restart
> --
>
> Key: AIRFLOW-3211
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3211
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 1.9.0, 1.10.0
>Reporter: Julie Chien
>Assignee: Julie Chien
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.9.0, 1.10.0
>
>
> If Airflow restarts (say, due to deployments, system updates, or regular 
> machine restarts such as the weekly restarts in GCP App Engine) while it's 
> running a job on GCP Dataproc, it'll lose track of that job, mark the task as 
> failed, and eventually retry. However, the jobs may still be running on 
> Dataproc and maybe even finish successfully. So when Airflow retries and 
> reruns the job, the same job will run twice. This can result in issues like 
> delayed workflows, increased costs, and duplicate data. 
>   
>  To reproduce:
>  # Install Airflow and set up a GCP project that has Dataproc enabled. Create 
> a bucket in the GCP project.
>  # Install this DAG in the Airflow instance: 
> [https://github.com/GoogleCloudPlatform/python-docs-samples/blob/b80895ed88ba86fce223df27a48bf481007ca708/composer/workflows/quickstart.py].
>  Set up the Airflow variables as instructed in the comments at the top of the 
> file.
>  # Start the Airflow scheduler and webserver. Kick off a run of the above DAG 
> through the Airflow UI. 

[GitHub] tswast commented on issue #4003: [AIRFLOW-3163] add operator to enable setting table description in BigQuery table

2018-10-15 Thread GitBox
tswast commented on issue #4003: [AIRFLOW-3163] add operator to enable setting 
table description in BigQuery table
URL: 
https://github.com/apache/incubator-airflow/pull/4003#issuecomment-430036691
 
 
   It's also possible to
   * add columns to a table and make updates to schema descriptions. 
https://cloud.google.com/bigquery/docs/managing-table-schemas
   * update the encryption key if the table was already using an encryption 
key. 
https://cloud.google.com/bigquery/docs/customer-managed-encryption#change_key
   
   Having a separate parameter for each property seems the most consistent with 
other operators, but I'm not opposed to having a single "table resource" 
parameter.
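The per-property-parameter design the comment leans toward could be sketched like this (a hypothetical helper, not the PR's code): each optional operator argument maps to one entry in the list of fields to patch on the table.

```python
def build_update_fields(description=None, schema=None,
                        encryption_configuration=None):
    """Translate per-property operator parameters into the list of
    table fields to update; None means 'leave unchanged'. The field
    names mirror BigQuery table resource properties."""
    candidates = [
        ("description", description),
        ("schema", schema),
        ("encryption_configuration", encryption_configuration),
    ]
    return [name for name, value in candidates if value is not None]
```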




[GitHub] codecov-io edited a comment on issue #2824: [AIRFLOW-1867] Fix sendgrid py3k bug; add sandbox mode

2018-10-15 Thread GitBox
codecov-io edited a comment on issue #2824: [AIRFLOW-1867] Fix sendgrid py3k 
bug; add sandbox mode
URL: 
https://github.com/apache/incubator-airflow/pull/2824#issuecomment-348023606
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/2824?src=pr=h1)
 Report
   > Merging 
[#2824](https://codecov.io/gh/apache/incubator-airflow/pull/2824?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/a581cbab1f79827ab645d21a9a221f1616cf8984?src=pr=desc)
 will **decrease** coverage by `0.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/2824/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/2824?src=pr=tree)
   
    ```diff
    @@            Coverage Diff             @@
    ##           master    #2824      +/-   ##
    ==========================================
    - Coverage   75.92%   75.91%   -0.02%
    ==========================================
      Files         199      199
      Lines       15954    15957       +3
    ==========================================
      Hits        12113    12113
    - Misses       3841     3844       +3
    ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/2824?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/bin/cli.py](https://codecov.io/gh/apache/incubator-airflow/pull/2824/diff?src=pr=tree#diff-YWlyZmxvdy9iaW4vY2xpLnB5)
 | `64.82% <0%> (-0.24%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/2824?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/2824?src=pr=footer).
 Last update 
[a581cba...c1a3c9c](https://codecov.io/gh/apache/incubator-airflow/pull/2824?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   






[jira] [Created] (AIRFLOW-3212) Add AWS Glue Catalog sensor that behaves like HivePartitionSensor

2018-10-15 Thread Michael Mole (JIRA)
Michael Mole created AIRFLOW-3212:
-

 Summary: Add AWS Glue Catalog sensor that behaves like 
HivePartitionSensor 
 Key: AIRFLOW-3212
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3212
 Project: Apache Airflow
  Issue Type: New Feature
  Components: aws
Reporter: Michael Mole


Please add an AWS Glue Catalog sensor that behaves like the HivePartitionSensor.
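A hedged sketch of the poke logic such a sensor could use, mirroring HivePartitionSensor. The boto3 Glue client's `get_partition` call and its `EntityNotFoundException` are real; the surrounding helper is illustrative.

```python
def poke_glue_partition(glue_client, database_name, table_name, partition_values):
    """Return True once the partition is registered in the Glue Catalog.
    partition_values is the ordered list of partition key values,
    e.g. ['2018-10-15'] for a table partitioned by ds."""
    try:
        glue_client.get_partition(
            DatabaseName=database_name,
            TableName=table_name,
            PartitionValues=partition_values,
        )
        return True
    except glue_client.exceptions.EntityNotFoundException:
        return False
```

A sensor subclass would call this from its `poke()` with a `boto3.client("glue")` obtained from an AWS hook.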





[jira] [Updated] (AIRFLOW-3211) Airflow losing track of running GCP Dataproc jobs upon Airflow restart

2018-10-15 Thread Julie Chien (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julie Chien updated AIRFLOW-3211:
-
Description: 
If Airflow restarts (say, due to deployments, system updates, or regular 
machine restarts such as the weekly restarts in GCP App Engine) while it's 
running a job on GCP Dataproc, it'll lose track of that job, mark the task as 
failed, and eventually retry. However, the jobs may still be running on 
Dataproc and maybe even finish successfully. So when Airflow retries and reruns 
the job, the same job will run twice. This can result in issues like delayed 
workflows, increased costs, and duplicate data. 
  
 To reproduce:
 1. Create a DAG in Airflow that runs a Dataproc job that sleeps for 10 
minutes. Wait for the cluster to spin up and the job to start running on 
Dataproc.
 2. SSH into the machine that hosts Airflow and run the following commands to 
simulate restarting Airflow:
 {{supervisorctl stop airflow-webserver}}
 {{supervisorctl stop airflow-scheduler}}
 {{ps aux | grep python | grep airflow | awk '\{print $2}' | xargs -r kill -9}}
 {{supervisorctl restart airflow-scheduler}}
 {{supervisorctl restart airflow-webserver}}
 3. Wait for Airflow to retry the task after Supervisor respawns the Airflow 
processes. Click on the cluster in Dataproc to observe that the job will have 
been resubmitted, even though the first job is still running without error.
  
 At Etsy, we've customized the Dataproc operators to allow for the new Airflow 
task to pick up where the old one left off upon Airflow restarts, and have been 
happily using our solution for the past 6 months. I'd like to submit a PR to 
merge this change upstream.
  

  was:
If Airflow restarts (say, due to deployments, system updates, or regular 
machine restarts such as the weekly restarts in GCP App Engine) while it's 
running a job on GCP Dataproc, it'll lose track of that job, mark the task as 
failed, and eventually retry. However, the jobs may still be running on 
Dataproc and maybe even finish successfully. So when Airflow retries and reruns 
the job, the same job will run twice.
 
To reproduce:
1. Create a DAG in Airflow that runs a Dataproc job that sleeps for 10 minutes. 
Wait for the cluster to spin up and the job to start running on Dataproc.
2. SSH into the machine that hosts Airflow and run the following commands to 
simulate restarting Airflow:
{{supervisorctl stop airflow-webserver}}
{{supervisorctl stop airflow-scheduler}}
{{ps aux | grep python | grep airflow | awk '\{print $2}' | xargs -r kill -9}}
{{supervisorctl restart airflow-scheduler}}
{{supervisorctl restart airflow-webserver}}
3. Wait for Airflow to retry the task after Supervisor respawns the Airflow 
processes. Click on the cluster in Dataproc to observe that the job will have 
been resubmitted, even though the first job is still running without error.
 
At Etsy, we've customized the Dataproc operators to allow for the new Airflow 
task to pick up where the old one left off upon Airflow restarts, and have been 
happily using our solution for the past 6 months. I'd like to submit a PR to 
merge this change upstream.
 


> Airflow losing track of running GCP Dataproc jobs upon Airflow restart
> --
>
> Key: AIRFLOW-3211
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3211
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 1.9.0, 1.10.0
>Reporter: Julie Chien
>Assignee: Julie Chien
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.9.0, 1.10.0
>
>
> If Airflow restarts (say, due to deployments, system updates, or regular 
> machine restarts such as the weekly restarts in GCP App Engine) while it's 
> running a job on GCP Dataproc, it'll lose track of that job, mark the task as 
> failed, and eventually retry. However, the jobs may still be running on 
> Dataproc and maybe even finish successfully. So when Airflow retries and 
> reruns the job, the same job will run twice. This can result in issues like 
> delayed workflows, increased costs, and duplicate data. 
>   
>  To reproduce:
>  1. Create a DAG in Airflow that runs a Dataproc job that sleeps for 10 
> minutes. Wait for the cluster to spin up and the job to start running on 
> Dataproc.
>  2. SSH into the machine that hosts Airflow and run the following commands to 
> simulate restarting Airflow:
>  {{supervisorctl stop airflow-webserver}}
>  {{supervisorctl stop airflow-scheduler}}
>  {{ps aux | grep python | grep airflow | awk '\{print $2}' | xargs -r kill -9}}
>  {{supervisorctl restart airflow-scheduler}}
>  {{supervisorctl restart airflow-webserver}}
>  3. Wait for Airflow to retry the task after Supervisor respawns the Airflow 
> processes. Click on the cluster in Dataproc to observe that the job 

[jira] [Created] (AIRFLOW-3211) Airflow losing track of running GCP Dataproc jobs upon Airflow restart

2018-10-15 Thread Julie Chien (JIRA)
Julie Chien created AIRFLOW-3211:


 Summary: Airflow losing track of running GCP Dataproc jobs upon 
Airflow restart
 Key: AIRFLOW-3211
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3211
 Project: Apache Airflow
  Issue Type: Improvement
  Components: gcp
Affects Versions: 1.10.0, 1.9.0
Reporter: Julie Chien
Assignee: Julie Chien
 Fix For: 1.10.0, 1.9.0


If Airflow restarts (say, due to deployments, system updates, or regular 
machine restarts such as the weekly restarts in GCP App Engine) while it's 
running a job on GCP Dataproc, it'll lose track of that job, mark the task as 
failed, and eventually retry. However, the jobs may still be running on 
Dataproc and maybe even finish successfully. So when Airflow retries and reruns 
the job, the same job will run twice.
 
To reproduce:
1. Create a DAG in Airflow that runs a Dataproc job that sleeps for 10 minutes. 
Wait for the cluster to spin up and the job to start running on Dataproc.
2. SSH into the machine that hosts Airflow and run the following commands to 
simulate restarting Airflow:
{{supervisorctl stop airflow-webserver}}
{{supervisorctl stop airflow-scheduler}}
{{ps aux | grep python | grep airflow | awk '\{print $2}' | xargs -r kill -9}}
{{supervisorctl restart airflow-scheduler}}
{{supervisorctl restart airflow-webserver}}
3. Wait for Airflow to retry the task after Supervisor respawns the Airflow 
processes. Click on the cluster in Dataproc to observe that the job will have 
been resubmitted, even though the first job is still running without error.
 
At Etsy, we've customized the Dataproc operators to allow the new Airflow 
task to pick up where the old one left off after an Airflow restart, and we have 
been happily using our solution for the past 6 months. I'd like to submit a PR 
to merge this change upstream.
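At a high level, the idea can be sketched as an idempotent submit that derives a deterministic job id from the task and attaches to a still-running job instead of resubmitting. This is a hypothetical sketch, not the real Dataproc hook/operator API: the `FakeDataprocClient`, its method names, and the job-id scheme are all illustrative assumptions.

```python
# Hypothetical sketch of "pick up where the old task left off"; the real
# Dataproc API differs, and the deterministic job id is an assumption.
class FakeDataprocClient:
    """In-memory stand-in so the sketch runs without GCP."""
    def __init__(self):
        self.jobs = {}

    def get_job(self, job_id):
        return self.jobs.get(job_id)

    def submit_job(self, job_id, job_spec):
        job = {"id": job_id, "spec": job_spec, "status": "RUNNING"}
        self.jobs[job_id] = job
        return job


def submit_or_attach(client, job_id, job_spec):
    """Attach to an already-running job instead of resubmitting it."""
    existing = client.get_job(job_id)
    if existing is not None and existing["status"] in ("PENDING", "RUNNING"):
        return existing  # the retry adopts the job the old task started
    return client.submit_job(job_id, job_spec)


client = FakeDataprocClient()
first = submit_or_attach(client, "mydag-mytask-20181015", {"sleep_minutes": 10})
second = submit_or_attach(client, "mydag-mytask-20181015", {"sleep_minutes": 10})
assert second is first  # no duplicate submission after a restart/retry
```

The key design choice is that the job id is derived from the DAG/task/execution date rather than generated fresh, so a retried task instance can find the job its predecessor started.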
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] aoen commented on issue #4037: [AIRFLOW-3191] Fix not being able to specify execution_date when creating dagrun

2018-10-15 Thread GitBox
aoen commented on issue #4037: [AIRFLOW-3191] Fix not being able to specify 
execution_date when creating dagrun
URL: 
https://github.com/apache/incubator-airflow/pull/4037#issuecomment-429951432
 
 
   The dagrun got created with the correct timezone; it follows the same 
pattern as the other date elements in the forms.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] thesquelched commented on issue #2135: [AIRFLOW-843] Store exceptions on task_instance

2018-10-15 Thread GitBox
thesquelched commented on issue #2135: [AIRFLOW-843] Store exceptions on 
task_instance
URL: 
https://github.com/apache/incubator-airflow/pull/2135#issuecomment-429951532
 
 
   @ashb looks good now?




[GitHub] codecov-io commented on issue #4056: [AIRFLOW-3207] option to stop task pushing result to xcom

2018-10-15 Thread GitBox
codecov-io commented on issue #4056: [AIRFLOW-3207] option to stop task pushing 
result to xcom
URL: 
https://github.com/apache/incubator-airflow/pull/4056#issuecomment-429951253
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4056?src=pr=h1)
 Report
   > Merging 
[#4056](https://codecov.io/gh/apache/incubator-airflow/pull/4056?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/6097f829ac5a4442180018ed56fa1b695badb131?src=pr=desc)
 will **increase** coverage by `<.01%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4056/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4056?src=pr=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##           master    #4056      +/-   ##
   ==========================================
   + Coverage   75.89%    75.9%    +<.01%
   ==========================================
     Files         199      199
     Lines       15957    15958       +1
   ==========================================
   + Hits        12111    12113       +2
   + Misses       3846     3845       -1
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/4056?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/4056/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `91.9% <100%> (ø)` | :arrow_up: |
   | 
[airflow/configuration.py](https://codecov.io/gh/apache/incubator-airflow/pull/4056/diff?src=pr=tree#diff-YWlyZmxvdy9jb25maWd1cmF0aW9uLnB5)
 | `88.84% <0%> (+0.35%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4056?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4056?src=pr=footer).
 Last update 
[6097f82...1b46dd9](https://codecov.io/gh/apache/incubator-airflow/pull/4056?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[jira] [Created] (AIRFLOW-3210) Changing defaults types in BigQuery Hook break BigQuery operator

2018-10-15 Thread Siarhei Hushchyn (JIRA)
Siarhei Hushchyn created AIRFLOW-3210:
-

 Summary: Changing defaults types in BigQuery Hook break BigQuery 
operator
 Key: AIRFLOW-3210
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3210
 Project: Apache Airflow
  Issue Type: Bug
  Components: contrib, gcp
Reporter: Siarhei Hushchyn


Changes in the BigQuery hook break the BigQuery operator's run_query() and any 
DAGs that rely on the current default types (a Boolean instead of the expected value):

[BigQuery operator 
set|https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/operators/bigquery_operator.py#L115-L121]:
destination_dataset_table=False,
udf_config=False,

[New BigQuery hook 
expects|https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/bigquery_hook.py#L645-L650]:
(udf_config, 'userDefinedFunctionResources', None, list),
(destination_dataset_table, 'destinationTable', None, dict),
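A minimal recreation of the mismatch (the validator below is a hypothetical stand-in for the hook-side type check, mirroring the `(value, api_name, default, expected_type)` tuples above): passing the operator's `False` default where the hook now expects `list`/`dict` or `None` raises a type error.

```python
# Hypothetical stand-in for the hook-side type check; the operator's old
# `False` defaults fail it, while None or a proper list/dict passes.
def validate_param(value, name, expected_type):
    if value is not None and not isinstance(value, expected_type):
        raise TypeError("%s must be a %s or None, got %s"
                        % (name, expected_type.__name__, type(value).__name__))

validate_param(None, "udf_config", list)                   # unset: OK
validate_param([{"resourceUri": "gs://bucket/udf.js"}],
               "udf_config", list)                         # proper list: OK
try:
    validate_param(False, "udf_config", list)              # operator default
except TypeError as exc:
    print(exc)  # udf_config must be a list or None, got bool
```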





[jira] [Created] (AIRFLOW-3209) return job id on bq operators

2018-10-15 Thread Ben Marengo (JIRA)
Ben Marengo created AIRFLOW-3209:


 Summary: return job id on bq operators
 Key: AIRFLOW-3209
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3209
 Project: Apache Airflow
  Issue Type: Improvement
  Components: operators
Reporter: Ben Marengo
Assignee: Ben Marengo


I would like to be able to access the job_id in post_execute().
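A sketch of the request (hypothetical subclass, not the real operator): have `execute()` return the BigQuery job id, so a `post_execute(context, result)` hook can read it, on the assumption that `result` receives `execute()`'s return value.

```python
# Hypothetical sketch: have the BigQuery operator return the job id from
# execute() so a post_execute() hook can read it.
class BigQueryJobIdOperator:
    def execute(self, context):
        job_id = "job_20181015_abc123"  # would come from the BigQuery API
        return job_id

    def post_execute(self, context, result=None):
        # `result` is whatever execute() returned
        print("finished BigQuery job: %s" % result)

op = BigQueryJobIdOperator()
job_id = op.execute({})
op.post_execute({}, result=job_id)
```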





[jira] [Work started] (AIRFLOW-3207) option to stop task pushing result to xcom

2018-10-15 Thread Ben Marengo (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AIRFLOW-3207 started by Ben Marengo.

> option to stop task pushing result to xcom
> --
>
> Key: AIRFLOW-3207
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3207
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: models, operators
>Reporter: Ben Marengo
>Assignee: Ben Marengo
>Priority: Major
>
> Follows the completion of AIRFLOW-886 and the (incomplete) closure of AIRFLOW-888.
> I would actually like functionality similar to this, but I don't think it 
> necessitates the global config flag.
> - BaseOperator should have an option to stop a task pushing the return value 
> of execute() to xcom.
> - the default should be to push (preserves backward compat)





[jira] [Commented] (AIRFLOW-3207) option to stop task pushing result to xcom

2018-10-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650504#comment-16650504
 ] 

ASF GitHub Bot commented on AIRFLOW-3207:
-

marengaz opened a new pull request #4056: AIRFLOW-3207 option to stop task 
pushing result to xcom
URL: https://github.com/apache/incubator-airflow/pull/4056
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-3207\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-3207
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-3207\], code changes always need a Jira issue.
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   




> option to stop task pushing result to xcom
> --
>
> Key: AIRFLOW-3207
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3207
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: models, operators
>Reporter: Ben Marengo
>Assignee: Ben Marengo
>Priority: Major
>
> Follows the completion of AIRFLOW-886 and the (incomplete) closure of AIRFLOW-888.
> I would actually like functionality similar to this, but I don't think it 
> necessitates the global config flag.
> - BaseOperator should have an option to stop a task pushing the return value 
> of execute() to xcom.
> - the default should be to push (preserves backward compat)





[GitHub] marengaz opened a new pull request #4056: AIRFLOW-3207 option to stop task pushing result to xcom

2018-10-15 Thread GitBox
marengaz opened a new pull request #4056: AIRFLOW-3207 option to stop task 
pushing result to xcom
URL: https://github.com/apache/incubator-airflow/pull/4056
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-3207\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-3207
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-3207\], code changes always need a Jira issue.
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   




[jira] [Commented] (AIRFLOW-1945) Pass --autoscale to celery workers

2018-10-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650483#comment-16650483
 ] 

ASF GitHub Bot commented on AIRFLOW-1945:
-

msumit closed pull request #3989: [AIRFLOW-1945] Autoscale celery workers for 
airflow added
URL: https://github.com/apache/incubator-airflow/pull/3989
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/bin/cli.py b/airflow/bin/cli.py
index 675a88a63c..cfc6c6b8d6 100644
--- a/airflow/bin/cli.py
+++ b/airflow/bin/cli.py
@@ -1038,12 +1038,16 @@ def worker(args):
     from airflow.executors.celery_executor import app as celery_app
     from celery.bin import worker
 
+    autoscale = args.autoscale
+    if autoscale is None and conf.has_option("celery", "worker_autoscale"):
+        autoscale = conf.get("celery", "worker_autoscale")
     worker = worker.worker(app=celery_app)
     options = {
         'optimization': 'fair',
         'O': 'fair',
         'queues': args.queues,
         'concurrency': args.concurrency,
+        'autoscale': autoscale,
         'hostname': args.celery_hostname,
         'loglevel': conf.get('core', 'LOGGING_LEVEL'),
     }
@@ -1916,6 +1920,9 @@ class CLIFactory(object):
             ('-d', '--delete'),
             help='Delete a user',
             action='store_true'),
+        'autoscale': Arg(
+            ('-a', '--autoscale'),
+            help="Minimum and Maximum number of worker to autoscale"),
     }
     subparsers = (
@@ -2058,7 +2065,7 @@ class CLIFactory(object):
         'func': worker,
         'help': "Start a Celery worker node",
         'args': ('do_pickle', 'queues', 'concurrency', 'celery_hostname',
-                 'pid', 'daemon', 'stdout', 'stderr', 'log_file'),
+                 'pid', 'daemon', 'stdout', 'stderr', 'log_file', 'autoscale'),
     }, {
         'func': flower,
         'help': "Start a Celery Flower",
diff --git a/airflow/config_templates/default_airflow.cfg b/airflow/config_templates/default_airflow.cfg
index b572dbb2f7..12e5a16f21 100644
--- a/airflow/config_templates/default_airflow.cfg
+++ b/airflow/config_templates/default_airflow.cfg
@@ -349,6 +349,13 @@ celery_app_name = airflow.executors.celery_executor
 # your worker box and the nature of your tasks
 worker_concurrency = 16
 
+# The minimum and maximum concurrency that will be used when starting workers with the
+# "airflow worker" command. Pick these numbers based on resources on
+# worker box and the nature of the task. If autoscale option is available worker_concurrency
+# will be ignored.
+# http://docs.celeryproject.org/en/latest/reference/celery.bin.worker.html#cmdoption-celery-worker-autoscale
+# worker_autoscale = 12,16
+
 # When you start an airflow worker, airflow starts a tiny web server
 # subprocess to serve the workers local log files to the airflow main
 # web server, who then builds pages and sends them to users. This defines


 




> Pass --autoscale to celery workers
> --
>
> Key: AIRFLOW-1945
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1945
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: celery, cli
>Reporter: Michael O.
>Assignee: Sai Phanindhra
>Priority: Trivial
>  Labels: easyfix
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Celery supports autoscaling of the worker pool size (number of tasks that can 
> parallelize within one worker node).  I'd like to propose to support passing 
> the --autoscale parameter to {{airflow worker}}.
> Since this is a trivial change, I am not sure if there is any reason it is not 
> already supported.
> For example
> {{airflow worker --concurrency=4}} will set a fixed pool size of 4.
> With minimal changes in 
> [https://github.com/apache/incubator-airflow/blob/4ce4faaeae7a76d97defcf9a9d3304ac9d78b9bd/airflow/bin/cli.py#L855]
>  it could support
> {{airflow worker --autoscale=2,10}} to set an autoscaled pool size of 2 to 10
> Some references:
> * 
> http://docs.celeryproject.org/en/latest/internals/reference/celery.worker.autoscale.html
> * 
> https://github.com/apache/incubator-airflow/blob/4ce4faaeae7a76d97defcf9a9d3304ac9d78b9bd/airflow/bin/cli.py#L855




[GitHub] bolkedebruin commented on issue #4037: [AIRFLOW-3191] Fix not being able to specify execution_date when creating dagrun

2018-10-15 Thread GitBox
bolkedebruin commented on issue #4037: [AIRFLOW-3191] Fix not being able to 
specify execution_date when creating dagrun
URL: 
https://github.com/apache/incubator-airflow/pull/4037#issuecomment-429933914
 
 
   Did you verify Tz information? (Didn’t look at the code)




[GitHub] msumit closed pull request #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added

2018-10-15 Thread GitBox
msumit closed pull request #3989: [AIRFLOW-1945] Autoscale celery workers for 
airflow added
URL: https://github.com/apache/incubator-airflow/pull/3989
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/bin/cli.py b/airflow/bin/cli.py
index 675a88a63c..cfc6c6b8d6 100644
--- a/airflow/bin/cli.py
+++ b/airflow/bin/cli.py
@@ -1038,12 +1038,16 @@ def worker(args):
     from airflow.executors.celery_executor import app as celery_app
     from celery.bin import worker
 
+    autoscale = args.autoscale
+    if autoscale is None and conf.has_option("celery", "worker_autoscale"):
+        autoscale = conf.get("celery", "worker_autoscale")
     worker = worker.worker(app=celery_app)
     options = {
         'optimization': 'fair',
         'O': 'fair',
         'queues': args.queues,
         'concurrency': args.concurrency,
+        'autoscale': autoscale,
         'hostname': args.celery_hostname,
         'loglevel': conf.get('core', 'LOGGING_LEVEL'),
     }
@@ -1916,6 +1920,9 @@ class CLIFactory(object):
             ('-d', '--delete'),
             help='Delete a user',
             action='store_true'),
+        'autoscale': Arg(
+            ('-a', '--autoscale'),
+            help="Minimum and Maximum number of worker to autoscale"),
     }
     subparsers = (
@@ -2058,7 +2065,7 @@ class CLIFactory(object):
         'func': worker,
         'help': "Start a Celery worker node",
         'args': ('do_pickle', 'queues', 'concurrency', 'celery_hostname',
-                 'pid', 'daemon', 'stdout', 'stderr', 'log_file'),
+                 'pid', 'daemon', 'stdout', 'stderr', 'log_file', 'autoscale'),
     }, {
         'func': flower,
         'help': "Start a Celery Flower",
diff --git a/airflow/config_templates/default_airflow.cfg b/airflow/config_templates/default_airflow.cfg
index b572dbb2f7..12e5a16f21 100644
--- a/airflow/config_templates/default_airflow.cfg
+++ b/airflow/config_templates/default_airflow.cfg
@@ -349,6 +349,13 @@ celery_app_name = airflow.executors.celery_executor
 # your worker box and the nature of your tasks
 worker_concurrency = 16
 
+# The minimum and maximum concurrency that will be used when starting workers with the
+# "airflow worker" command. Pick these numbers based on resources on
+# worker box and the nature of the task. If autoscale option is available worker_concurrency
+# will be ignored.
+# http://docs.celeryproject.org/en/latest/reference/celery.bin.worker.html#cmdoption-celery-worker-autoscale
+# worker_autoscale = 12,16
+
 # When you start an airflow worker, airflow starts a tiny web server
 # subprocess to serve the workers local log files to the airflow main
 # web server, who then builds pages and sends them to users. This defines


 




[jira] [Updated] (AIRFLOW-3208) Apache airflow 1.8.0 integration with LDAP anonymously

2018-10-15 Thread Hari Krishna ADDEPALLI LN (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Krishna ADDEPALLI LN updated AIRFLOW-3208:
---
Description: 
Hello,

We want Airflow to integrate with LDAP anonymously; the LDAP server is either 
OpenLDAP or 389 Directory Server. Below are the details added to airflow.cfg: 
{noformat}
[webserver] 
authenticate = True 
auth_backend = airflow.contrib.auth.backends.ldap_auth  {noformat}
  
{noformat}
[ldap] 
uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389 
user_filter =  
user_name_attr = uid 
group_member_attr = groupMembership=ou=groups,dc=odc,dc=im 
superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im 
data_profiler_filter = 
bind_user = ou=people,dc=odc,dc=im 
bind_password = 
basedn = ou=people,dc=odc,dc=im 
cacert = /opt/orchestration/airflow/ldap_ca.crt 
search_scope = SUBTREE{noformat}
However, when validating, it failed with the exception below. Please advise 
what to correct, given the LDAP details above. We only use 
"basedn=ou=people,dc=odc,dc=im" with the provided LDAP host and were able to 
access it anonymously using the jXplorer workbench. We are able to use LDAP 
anonymously from Kibana, Elasticsearch, and Jenkins; only Airflow fails, so 
please advise a solution.

 
{noformat}
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1988, in 
wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1641, in 
full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1544, in 
handle_user_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python3.6/site-packages/flask/_compat.py", line 33, in 
reraise
raise value
File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1639, in 
full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1625, in 
dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/usr/local/lib/python3.6/site-packages/flask_admin/base.py", line 69, in 
inner
return self._run_view(f, *args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/flask_admin/base.py", line 368, in 
_run_view
return fn(self, *args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/airflow/www/views.py", line 650, 
in login
return airflow.login.login(self, request)
File 
"/usr/local/lib/python3.6/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 268, in login
LdapUser.try_login(username, password)
File 
"/usr/local/lib/python3.6/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 180, in try_login
search_scope=native(search_scope))
File "/usr/local/lib/python3.6/site-packages/ldap3/core/connection.py", line 
779, in search
check_names=self.check_names)
File "/usr/local/lib/python3.6/site-packages/ldap3/operation/search.py", line 
372, in search_operation
request['filter'] = compile_filter(parse_filter(search_filter, schema, 
auto_escape, auto_encode, validator, check_names).elements[0]) # parse the 
searchFilter string and compile it starting from the root node
File "/usr/local/lib/python3.6/site-packages/ldap3/operation/search.py", line 
206, in parse_filter
current_node.append(evaluate_match(search_filter[start_pos:end_pos], schema, 
auto_escape, auto_encode, validator, check_names))
File "/usr/local/lib/python3.6/site-packages/ldap3/operation/search.py", line 
89, in evaluate_match
raise LDAPInvalidFilterError('invalid matching assertion')
ldap3.core.exceptions.LDAPInvalidFilterError: invalid matching assertion

{noformat}
 

 

  was:
Hello,

We want Airflow to integrate with LDAP anonymously; the LDAP server is either 
OpenLDAP or 389 Directory Server. Below are the details added to airflow.cfg: 
{noformat}
[webserver] 
authenticate = True 
auth_backend = airflow.contrib.auth.backends.ldap_auth  {noformat}
  
{noformat}
[ldap] 
uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389 
user_filter =  
user_name_attr = uid 
group_member_attr = groupMembership=ou=groups,dc=odc,dc=im 
superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im 
data_profiler_filter = 
bind_user = ou=people,dc=odc,dc=im 
bind_password = 
basedn = ou=people,dc=odc,dc=im 
cacert = /opt/orchestration/airflow/ldap_ca.crt 
search_scope = SUBTREE{noformat}
However, when validating, it failed with the exception below. Please advise 
what to correct, given the LDAP details above. We only use 
"basedn=ou=people,dc=odc,dc=im" with the provided LDAP host and were able to 
access it anonymously using the jXplorer workbench. We tried Kibana, 
Elasticsearch, and Jenkins anonymously as well; only Airflow fails, so please 
advise a solution.

 
{noformat}
Traceback (most recent call 

[jira] [Created] (AIRFLOW-3208) Apache airflow 1.8.0 integration with LDAP anonymously

2018-10-15 Thread Hari Krishna ADDEPALLI LN (JIRA)
Hari Krishna ADDEPALLI LN created AIRFLOW-3208:
--

 Summary: Apache airflow 1.8.0 integration with LDAP anonymously
 Key: AIRFLOW-3208
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3208
 Project: Apache Airflow
  Issue Type: Bug
  Components: authentication
Affects Versions: 1.8.2, 1.8.0
Reporter: Hari Krishna ADDEPALLI LN


Hello,

We want Airflow to integrate with LDAP anonymously; the LDAP server is either 
OpenLDAP or 389 Directory Server. Below are the details added to airflow.cfg:

 

 
{noformat}
[webserver] 
authenticate = True 
auth_backend = airflow.contrib.auth.backends.ldap_auth  
{noformat}
 

 

 
{noformat}
[ldap] 
uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389 
user_filter =  
user_name_attr = uid 
group_member_attr = groupMembership=ou=groups,dc=odc,dc=im 
superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im 
data_profiler_filter = 
bind_user = ou=people,dc=odc,dc=im 
bind_password = 
basedn = ou=people,dc=odc,dc=im 
cacert = /opt/orchestration/airflow/ldap_ca.crt 
search_scope = SUBTREE{noformat}
 

 

However, when validating, it failed with the exception below. Please advise 
what to correct, given the LDAP details above. We only use 
"basedn=ou=people,dc=odc,dc=im" with the provided LDAP host and were able to 
access it anonymously using the jXplorer workbench. We tried Kibana, 
Elasticsearch, and Jenkins anonymously as well; only Airflow fails, so please 
advise a solution.

 
{noformat}
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1988, in 
wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1641, in 
full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1544, in 
handle_user_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python3.6/site-packages/flask/_compat.py", line 33, in 
reraise
raise value
File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1639, in 
full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1625, in 
dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/usr/local/lib/python3.6/site-packages/flask_admin/base.py", line 69, in 
inner
return self._run_view(f, *args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/flask_admin/base.py", line 368, in 
_run_view
return fn(self, *args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/airflow/www/views.py", line 650, 
in login
return airflow.login.login(self, request)
File 
"/usr/local/lib/python3.6/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 268, in login
LdapUser.try_login(username, password)
File 
"/usr/local/lib/python3.6/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 180, in try_login
search_scope=native(search_scope))
File "/usr/local/lib/python3.6/site-packages/ldap3/core/connection.py", line 
779, in search
check_names=self.check_names)
File "/usr/local/lib/python3.6/site-packages/ldap3/operation/search.py", line 
372, in search_operation
request['filter'] = compile_filter(parse_filter(search_filter, schema, 
auto_escape, auto_encode, validator, check_names).elements[0]) # parse the 
searchFilter string and compile it starting from the root node
File "/usr/local/lib/python3.6/site-packages/ldap3/operation/search.py", line 
206, in parse_filter
current_node.append(evaluate_match(search_filter[start_pos:end_pos], schema, 
auto_escape, auto_encode, validator, check_names))
File "/usr/local/lib/python3.6/site-packages/ldap3/operation/search.py", line 
89, in evaluate_match
raise LDAPInvalidFilterError('invalid matching assertion')
ldap3.core.exceptions.LDAPInvalidFilterError: invalid matching assertion

{noformat}
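For what it's worth, `LDAPInvalidFilterError` usually means the composed search filter is malformed. With `user_filter` left empty, a filter of the shape `(&(<user_filter>)(<user_name_attr>=<login>))` degenerates to `(&()(uid=...))`, which ldap3 rejects. The toy recreation below is hypothetical (the composition function is illustrative, and the suggestion to set `user_filter = objectClass=*` to avoid the empty clause is an assumption worth verifying against the ldap_auth backend):

```python
# Hypothetical recreation of how the search filter is composed; an empty
# user_filter yields an empty "()" clause, which is an invalid LDAP filter.
def build_search_filter(user_filter, user_name_attr, username):
    return "(&({0})({1}={2}))".format(user_filter, user_name_attr, username)

broken = build_search_filter("", "uid", "jdoe")
assert broken == "(&()(uid=jdoe))"   # empty clause -> LDAPInvalidFilterError

fixed = build_search_filter("objectClass=*", "uid", "jdoe")
assert fixed == "(&(objectClass=*)(uid=jdoe))"  # well-formed filter
```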
 

 





[jira] [Updated] (AIRFLOW-3208) Apache airflow 1.8.0 integration with LDAP anonymously

2018-10-15 Thread Hari Krishna ADDEPALLI LN (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Krishna ADDEPALLI LN updated AIRFLOW-3208:
---
Description: 
Hello,

We want Airflow to integrate with LDAP anonymously; the LDAP server is either 
OpenLDAP or 389 Directory Server. Below are the details added to airflow.cfg: 
{noformat}
[webserver] 
authenticate = True 
auth_backend = airflow.contrib.auth.backends.ldap_auth  {noformat}
  
{noformat}
[ldap] 
uri = ldap://nsp-daf178e8.ad1.prd.us-phx.odc.im:389 
user_filter =  
user_name_attr = uid 
group_member_attr = groupMembership=ou=groups,dc=odc,dc=im 
superuser_filter = memberOf=cn=rvd-sudo_all-prd_usphx,ou=groups,dc=odc,dc=im 
data_profiler_filter = 
bind_user = ou=people,dc=odc,dc=im 
bind_password = 
basedn = ou=people,dc=odc,dc=im 
cacert = /opt/orchestration/airflow/ldap_ca.crt 
search_scope = SUBTREE{noformat}
However, when validating, it failed with the exception below. Please advise 
what to correct, given the LDAP details above. We only use 
"basedn=ou=people,dc=odc,dc=im" with the provided LDAP host and were able to 
access it anonymously using the jXplorer workbench. We tried Kibana, 
Elasticsearch, and Jenkins anonymously as well; only Airflow fails, so please 
advise a solution.

 
{noformat}
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1988, in 
wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1641, in 
full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1544, in 
handle_user_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python3.6/site-packages/flask/_compat.py", line 33, in 
reraise
raise value
File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1639, in 
full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 1625, in 
dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/usr/local/lib/python3.6/site-packages/flask_admin/base.py", line 69, in 
inner
return self._run_view(f, *args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/flask_admin/base.py", line 368, in 
_run_view
return fn(self, *args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/airflow/www/views.py", line 650, 
in login
return airflow.login.login(self, request)
File 
"/usr/local/lib/python3.6/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 268, in login
LdapUser.try_login(username, password)
File 
"/usr/local/lib/python3.6/site-packages/airflow/contrib/auth/backends/ldap_auth.py",
 line 180, in try_login
search_scope=native(search_scope))
File "/usr/local/lib/python3.6/site-packages/ldap3/core/connection.py", line 
779, in search
check_names=self.check_names)
File "/usr/local/lib/python3.6/site-packages/ldap3/operation/search.py", line 
372, in search_operation
request['filter'] = compile_filter(parse_filter(search_filter, schema, 
auto_escape, auto_encode, validator, check_names).elements[0]) # parse the 
searchFilter string and compile it starting from the root node
File "/usr/local/lib/python3.6/site-packages/ldap3/operation/search.py", line 
206, in parse_filter
current_node.append(evaluate_match(search_filter[start_pos:end_pos], schema, 
auto_escape, auto_encode, validator, check_names))
File "/usr/local/lib/python3.6/site-packages/ldap3/operation/search.py", line 
89, in evaluate_match
raise LDAPInvalidFilterError('invalid matching assertion')
ldap3.core.exceptions.LDAPInvalidFilterError: invalid matching assertion

{noformat}
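In the traceback above, ldap3 raises LDAPInvalidFilterError while compiling the search filter, and the [ldap] section shown leaves user_filter empty. Below is a minimal sketch of how the contrib ldap_auth backend composes its login search filter; the template is my reading of the 1.10.x source and should be treated as an assumption, not something verified against this exact install:

```python
# Hypothetical mirror of the filter construction in
# airflow/contrib/auth/backends/ldap_auth.py (assumption, see above).
def build_search_filter(user_filter, user_name_attr, username):
    # The backend AND-combines the configured user_filter with the
    # username match; an empty user_filter yields an empty assertion "()",
    # which ldap3 rejects as an "invalid matching assertion".
    return "(&({0})({1}={2}))".format(user_filter, user_name_attr, username)

# With user_filter left blank, the filter contains an empty assertion:
print(build_search_filter("", "uid", "jdoe"))               # (&()(uid=jdoe))

# A match-all user_filter produces a well-formed filter:
print(build_search_filter("objectClass=*", "uid", "jdoe"))  # (&(objectClass=*)(uid=jdoe))
```

If that is indeed the cause, setting `user_filter = objectClass=*` in the [ldap] section should get past this particular exception.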
 

 


[jira] [Created] (AIRFLOW-3207) option to stop task pushing result to xcom

2018-10-15 Thread Ben Marengo (JIRA)
Ben Marengo created AIRFLOW-3207:


 Summary: option to stop task pushing result to xcom
 Key: AIRFLOW-3207
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3207
 Project: Apache Airflow
  Issue Type: Improvement
  Components: models, operators
Reporter: Ben Marengo
Assignee: Ben Marengo


This follows the completion of AIRFLOW-886 and the (incomplete) closure of AIRFLOW-888.

I would like similar functionality, but I don't think it necessitates the global 
config flag:

- BaseOperator should have an option to stop a task pushing the return value of 
execute() to XCom.
- The default should be to push (preserves backward compatibility).
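A minimal sketch of what such an opt-out could look like; the flag name `do_xcom_push`, the `FakeTaskInstance` helper, and the `run()` wiring here are illustrative assumptions, not the proposed API:

```python
class FakeTaskInstance:
    """Stand-in for Airflow's TaskInstance, just enough to observe XCom pushes."""
    def __init__(self):
        self.pushed = {}

    def xcom_push(self, key, value):
        self.pushed[key] = value


class SketchOperator:
    """Simplified BaseOperator sketch (illustrative, not the real class)."""
    def __init__(self, task_id, do_xcom_push=True):
        self.task_id = task_id
        # Defaulting to True preserves backward compatibility.
        self.do_xcom_push = do_xcom_push

    def execute(self, context):
        return "hello"

    def run(self, context):
        result = self.execute(context)
        # Only push the return value of execute() to XCom when enabled.
        if self.do_xcom_push and result is not None:
            context["ti"].xcom_push(key="return_value", value=result)
        return result


ti = FakeTaskInstance()
SketchOperator("push_task").run({"ti": ti})
print(ti.pushed)   # {'return_value': 'hello'}

ti2 = FakeTaskInstance()
SketchOperator("quiet_task", do_xcom_push=False).run({"ti": ti2})
print(ti2.pushed)  # {}
```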




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] oliviersm199 commented on issue #4036: [AIRFLOW-2744] Allow RBAC to accept plugins for views and links.

2018-10-15 Thread GitBox
oliviersm199 commented on issue #4036: [AIRFLOW-2744] Allow RBAC to accept 
plugins for views and links.
URL: 
https://github.com/apache/incubator-airflow/pull/4036#issuecomment-429928013
 
 
   Hello, have the core committers had time to look at this? I know @Fokko 
requested input from @jgao54; let me know if you need anything from me.  


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added

2018-10-15 Thread GitBox
codecov-io edited a comment on issue #3989: [AIRFLOW-1945] Autoscale celery 
workers for airflow added
URL: 
https://github.com/apache/incubator-airflow/pull/3989#issuecomment-426543786
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3989?src=pr=h1)
 Report
   > Merging 
[#3989](https://codecov.io/gh/apache/incubator-airflow/pull/3989?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/a581cbab1f79827ab645d21a9a221f1616cf8984?src=pr=desc)
 will **decrease** coverage by `3.04%`.
   > The diff coverage is `0%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3989/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3989?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #3989      +/-   ##
   ===========================================
   - Coverage   75.92%   72.87%     -3.05%
   ===========================================
     Files         199      199
     Lines       15954    17003    +1049
   ===========================================
   + Hits        12113    12391     +278
   - Misses       3841     4612     +771
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3989?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/bin/cli.py](https://codecov.io/gh/apache/incubator-airflow/pull/3989/diff?src=pr=tree#diff-YWlyZmxvdy9iaW4vY2xpLnB5)
 | `58.76% <0%> (-6.3%)` | :arrow_down: |
   | 
[airflow/hooks/druid\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/3989/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9kcnVpZF9ob29rLnB5)
 | `67.36% <0%> (-20.64%)` | :arrow_down: |
   | 
[airflow/task/task\_runner/base\_task\_runner.py](https://codecov.io/gh/apache/incubator-airflow/pull/3989/diff?src=pr=tree#diff-YWlyZmxvdy90YXNrL3Rhc2tfcnVubmVyL2Jhc2VfdGFza19ydW5uZXIucHk=)
 | `60.97% <0%> (-18.34%)` | :arrow_down: |
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3989/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `75.55% <0%> (-16.4%)` | :arrow_down: |
   | 
[airflow/example\_dags/example\_python\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/3989/diff?src=pr=tree#diff-YWlyZmxvdy9leGFtcGxlX2RhZ3MvZXhhbXBsZV9weXRob25fb3BlcmF0b3IucHk=)
 | `78.94% <0%> (-15.79%)` | :arrow_down: |
   | 
[airflow/utils/configuration.py](https://codecov.io/gh/apache/incubator-airflow/pull/3989/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9jb25maWd1cmF0aW9uLnB5)
 | `85.71% <0%> (-14.29%)` | :arrow_down: |
   | 
[airflow/configuration.py](https://codecov.io/gh/apache/incubator-airflow/pull/3989/diff?src=pr=tree#diff-YWlyZmxvdy9jb25maWd1cmF0aW9uLnB5)
 | `85.07% <0%> (-3.78%)` | :arrow_down: |
   | 
[airflow/operators/s3\_file\_transform\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/3989/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvczNfZmlsZV90cmFuc2Zvcm1fb3BlcmF0b3IucHk=)
 | `93.87% <0%> (-2.35%)` | :arrow_down: |
   | 
[airflow/www\_rbac/views.py](https://codecov.io/gh/apache/incubator-airflow/pull/3989/diff?src=pr=tree#diff-YWlyZmxvdy93d3dfcmJhYy92aWV3cy5weQ==)
 | `72.18% <0%> (-0.36%)` | :arrow_down: |
   | ... and [13 
more](https://codecov.io/gh/apache/incubator-airflow/pull/3989/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3989?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3989?src=pr=footer).
 Last update 
[a581cba...959ca5d](https://codecov.io/gh/apache/incubator-airflow/pull/3989?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #2135: [AIRFLOW-843] Store exceptions on task_instance

2018-10-15 Thread GitBox
codecov-io edited a comment on issue #2135: [AIRFLOW-843] Store exceptions on 
task_instance
URL: 
https://github.com/apache/incubator-airflow/pull/2135#issuecomment-348005174
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/2135?src=pr=h1)
 Report
   > Merging 
[#2135](https://codecov.io/gh/apache/incubator-airflow/pull/2135?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/a581cbab1f79827ab645d21a9a221f1616cf8984?src=pr=desc)
 will **increase** coverage by `<.01%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/2135/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/2135?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #2135      +/-   ##
   ==========================================
   + Coverage   75.92%   75.93%    +<.01%
   ==========================================
     Files         199      199
     Lines       15954    15956       +2
   ==========================================
   + Hits        12113    12116       +3
   + Misses       3841     3840       -1
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/2135?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/2135/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `91.99% <100%> (+0.04%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/2135?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/2135?src=pr=footer).
 Last update 
[a581cba...e584190](https://codecov.io/gh/apache/incubator-airflow/pull/2135?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] thesquelched commented on issue #2135: [AIRFLOW-843] Store exceptions on task_instance

2018-10-15 Thread GitBox
thesquelched commented on issue #2135: [AIRFLOW-843] Store exceptions on 
task_instance
URL: 
https://github.com/apache/incubator-airflow/pull/2135#issuecomment-429901059
 
 
   @xnuinside sorry for the long wait; rebased and addressed PR comments


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] thesquelched commented on a change in pull request #2135: [AIRFLOW-843] Store exceptions on task_instance

2018-10-15 Thread GitBox
thesquelched commented on a change in pull request #2135: [AIRFLOW-843] Store 
exceptions on task_instance
URL: https://github.com/apache/incubator-airflow/pull/2135#discussion_r225210703
 
 

 ##
 File path: airflow/models.py
 ##
 @@ -1574,6 +1574,8 @@ def dry_run(self):
 def handle_failure(self, error, test_mode=False, context=None, 
session=None):
 self.log.exception(error)
 task = self.task
+session = settings.Session()
 
 Review comment:
   Cruft from testing, I suppose; I'll remove


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] aoen edited a comment on issue #4037: [AIRFLOW-3191] Fix not being able to specify execution_date when creating dagrun

2018-10-15 Thread GitBox
aoen edited a comment on issue #4037: [AIRFLOW-3191] Fix not being able to 
specify execution_date when creating dagrun
URL: 
https://github.com/apache/incubator-airflow/pull/4037#issuecomment-429896591
 
 
   Certainly!
   Here is the creation page before my change:
   https://user-images.githubusercontent.com/1592778/46960432-87e86580-d06c-11e8-8f51-b265be009cf5.png
   
   Here is the screenshot where you can see the execution_date parameter is now 
available:
   https://user-images.githubusercontent.com/1592778/46960215-0264b580-d06c-11e8-9424-7e8f3780f694.png
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] aoen commented on issue #4037: [AIRFLOW-3191] Fix not being able to specify execution_date when creating dagrun

2018-10-15 Thread GitBox
aoen commented on issue #4037: [AIRFLOW-3191] Fix not being able to specify 
execution_date when creating dagrun
URL: 
https://github.com/apache/incubator-airflow/pull/4037#issuecomment-429896591
 
 
   Certainly!
   
   Here is the screenshot where you can see the execution_date parameter is now 
available:
   https://user-images.githubusercontent.com/1592778/46960215-0264b580-d06c-11e8-9424-7e8f3780f694.png
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] aoen commented on issue #4005: [AIRFLOW-3160] Load latest_dagruns asynchronously, speed up front page load time

2018-10-15 Thread GitBox
aoen commented on issue #4005: [AIRFLOW-3160] Load latest_dagruns 
asynchronously, speed up front page load time
URL: 
https://github.com/apache/incubator-airflow/pull/4005#issuecomment-429895674
 
 
   Ready to merge FYI.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #4038: [AIRFLOW-1970] Let empty Fernet key or special `no encryption` phrase.

2018-10-15 Thread GitBox
codecov-io edited a comment on issue #4038: [AIRFLOW-1970] Let empty Fernet key 
or special `no encryption` phrase.
URL: 
https://github.com/apache/incubator-airflow/pull/4038#issuecomment-429203577
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4038?src=pr=h1)
 Report
   > Merging 
[#4038](https://codecov.io/gh/apache/incubator-airflow/pull/4038?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/a581cbab1f79827ab645d21a9a221f1616cf8984?src=pr=desc)
 will **increase** coverage by `<.01%`.
   > The diff coverage is `63.63%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4038/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4038?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #4038      +/-   ##
   ==========================================
   + Coverage   75.92%   75.92%    +<.01%
   ==========================================
     Files         199      199
     Lines       15954    15955       +1
   ==========================================
   + Hits        12113    12114       +1
     Misses       3841     3841
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/4038?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/configuration.py](https://codecov.io/gh/apache/incubator-airflow/pull/4038/diff?src=pr=tree#diff-YWlyZmxvdy9jb25maWd1cmF0aW9uLnB5)
 | `89.41% <50%> (+0.56%)` | :arrow_up: |
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/4038/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `91.88% <66.66%> (-0.07%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4038?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4038?src=pr=footer).
 Last update 
[a581cba...211cde1](https://codecov.io/gh/apache/incubator-airflow/pull/4038?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services




[jira] [Commented] (AIRFLOW-1945) Pass --autoscale to celery workers

2018-10-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650256#comment-16650256
 ] 

ASF GitHub Bot commented on AIRFLOW-1945:
-

phani8996 opened a new pull request #3989: [AIRFLOW-1945] Autoscale celery 
workers for airflow added
URL: https://github.com/apache/incubator-airflow/pull/3989
 
 
   Dear Airflow Maintainers,
   
   This adds a provision to autoscale Celery workers, rather than running the same 
number of worker processes irrespective of the number of running tasks.
   
   Please accept this PR that addresses the following issues:
   https://issues.apache.org/jira/browse/AIRFLOW-1945
   
   Testing Done:
   
   Manually tested by passing arguments on the CLI


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Pass --autoscale to celery workers
> --
>
> Key: AIRFLOW-1945
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1945
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: celery, cli
>Reporter: Michael O.
>Assignee: Sai Phanindhra
>Priority: Trivial
>  Labels: easyfix
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Celery supports autoscaling of the worker pool size (number of tasks that can 
> parallelize within one worker node).  I'd like to propose to support passing 
> the --autoscale parameter to {{airflow worker}}.
> Since this is a trivial change, I am not sure if there's any reason it isn't 
> supported already.
> For example
> {{airflow worker --concurrency=4}} will set a fixed pool size of 4.
> With minimal changes in 
> [https://github.com/apache/incubator-airflow/blob/4ce4faaeae7a76d97defcf9a9d3304ac9d78b9bd/airflow/bin/cli.py#L855]
>  it could support
> {{airflow worker --autoscale=2,10}} to set an autoscaled pool size of 2 to 10
> Some references:
> * 
> http://docs.celeryproject.org/en/latest/internals/reference/celery.worker.autoscale.html
> * 
> https://github.com/apache/incubator-airflow/blob/4ce4faaeae7a76d97defcf9a9d3304ac9d78b9bd/airflow/bin/cli.py#L855
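As a side note, Celery's own documentation reads the `--autoscale` value as `max,min` (e.g. `10,3` keeps at least 3 processes and grows to at most 10). A small sketch of parsing such a value; the helper name is illustrative, not Celery or Airflow API:

```python
def parse_autoscale(value):
    """Parse a Celery-style --autoscale setting.

    Celery interprets the value as "max,min": "10,3" keeps at least 3
    worker processes and grows to at most 10 under load.
    """
    max_workers, min_workers = (int(part) for part in value.split(","))
    if min_workers > max_workers:
        raise ValueError("autoscale minimum exceeds maximum: %s" % value)
    return max_workers, min_workers

print(parse_autoscale("10,3"))  # (10, 3)
```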



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] phani8996 commented on issue #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added

2018-10-15 Thread GitBox
phani8996 commented on issue #3989: [AIRFLOW-1945] Autoscale celery workers for 
airflow added
URL: 
https://github.com/apache/incubator-airflow/pull/3989#issuecomment-429866997
 
 
   Mistakenly closed the PR


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] phani8996 opened a new pull request #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added

2018-10-15 Thread GitBox
phani8996 opened a new pull request #3989: [AIRFLOW-1945] Autoscale celery 
workers for airflow added
URL: https://github.com/apache/incubator-airflow/pull/3989
 
 
   Dear Airflow Maintainers,
   
   This adds a provision to autoscale Celery workers, rather than running the same 
number of worker processes irrespective of the number of running tasks.
   
   Please accept this PR that addresses the following issues:
   https://issues.apache.org/jira/browse/AIRFLOW-1945
   
   Testing Done:
   
   Manually tested by passing arguments on the CLI


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-1945) Pass --autoscale to celery workers

2018-10-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650252#comment-16650252
 ] 

ASF GitHub Bot commented on AIRFLOW-1945:
-

phani8996 closed pull request #3989: [AIRFLOW-1945] Autoscale celery workers 
for airflow added
URL: https://github.com/apache/incubator-airflow/pull/3989
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/bin/cli.py b/airflow/bin/cli.py
index 09bd0c1806..19ff220d9f 100644
--- a/airflow/bin/cli.py
+++ b/airflow/bin/cli.py
@@ -1055,12 +1055,16 @@ def worker(args):
     from airflow.executors.celery_executor import app as celery_app
     from celery.bin import worker
 
+    autoscale = args.autoscale
+    if autoscale is None and conf.has_option("celery", "worker_autoscale"):
+        autoscale = conf.get("celery", "worker_autoscale")
     worker = worker.worker(app=celery_app)
     options = {
         'optimization': 'fair',
         'O': 'fair',
         'queues': args.queues,
         'concurrency': args.concurrency,
+        'autoscale': autoscale,
         'hostname': args.celery_hostname,
     }
 
@@ -1932,6 +1936,9 @@ class CLIFactory(object):
             ('-d', '--delete'),
             help='Delete a user',
             action='store_true'),
+        'autoscale': Arg(
+            ('-a', '--autoscale'),
+            help="Minimum and Maximum number of worker to autoscale"),
 
     }
     subparsers = (
@@ -2074,7 +2081,7 @@ class CLIFactory(object):
         'func': worker,
         'help': "Start a Celery worker node",
         'args': ('do_pickle', 'queues', 'concurrency', 'celery_hostname',
-                 'pid', 'daemon', 'stdout', 'stderr', 'log_file'),
+                 'pid', 'daemon', 'stdout', 'stderr', 'log_file', 'autoscale'),
     }, {
         'func': flower,
         'help': "Start a Celery Flower",
diff --git a/airflow/config_templates/default_airflow.cfg b/airflow/config_templates/default_airflow.cfg
index bb4ab208d7..a1806a5dee 100644
--- a/airflow/config_templates/default_airflow.cfg
+++ b/airflow/config_templates/default_airflow.cfg
@@ -349,6 +349,13 @@ celery_app_name = airflow.executors.celery_executor
 # your worker box and the nature of your tasks
 worker_concurrency = 16
 
+# The minimum and maximum concurrency that will be used when starting workers with the
+# "airflow worker" command. Pick these numbers based on resources on
+# worker box and the nature of the task. If autoscale option is available worker_concurrency
+# will be ignored.
+# http://docs.celeryproject.org/en/latest/reference/celery.bin.worker.html#cmdoption-celery-worker-autoscale
+# worker_autoscale = 12,16
+
 # When you start an airflow worker, airflow starts a tiny web server
 # subprocess to serve the workers local log files to the airflow main
 # web server, who then builds pages and sends them to users. This defines


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Pass --autoscale to celery workers
> --
>
> Key: AIRFLOW-1945
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1945
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: celery, cli
>Reporter: Michael O.
>Assignee: Sai Phanindhra
>Priority: Trivial
>  Labels: easyfix
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Celery supports autoscaling of the worker pool size (number of tasks that can 
> parallelize within one worker node).  I'd like to propose to support passing 
> the --autoscale parameter to {{airflow worker}}.
> Since this is a trivial change, I am not sure if there's any reason it isn't 
> supported already.
> For example
> {{airflow worker --concurrency=4}} will set a fixed pool size of 4.
> With minimal changes in 
> [https://github.com/apache/incubator-airflow/blob/4ce4faaeae7a76d97defcf9a9d3304ac9d78b9bd/airflow/bin/cli.py#L855]
>  it could support
> {{airflow worker --autoscale=2,10}} to set an autoscaled pool size of 2 to 10
> Some references:
> * 
> http://docs.celeryproject.org/en/latest/internals/reference/celery.worker.autoscale.html
> * 
> https://github.com/apache/incubator-airflow/blob/4ce4faaeae7a76d97defcf9a9d3304ac9d78b9bd/airflow/bin/cli.py#L855



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] phani8996 closed pull request #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added

2018-10-15 Thread GitBox
phani8996 closed pull request #3989: [AIRFLOW-1945] Autoscale celery workers 
for airflow added
URL: https://github.com/apache/incubator-airflow/pull/3989
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/bin/cli.py b/airflow/bin/cli.py
index 09bd0c1806..19ff220d9f 100644
--- a/airflow/bin/cli.py
+++ b/airflow/bin/cli.py
@@ -1055,12 +1055,16 @@ def worker(args):
     from airflow.executors.celery_executor import app as celery_app
     from celery.bin import worker
 
+    autoscale = args.autoscale
+    if autoscale is None and conf.has_option("celery", "worker_autoscale"):
+        autoscale = conf.get("celery", "worker_autoscale")
     worker = worker.worker(app=celery_app)
     options = {
         'optimization': 'fair',
         'O': 'fair',
         'queues': args.queues,
         'concurrency': args.concurrency,
+        'autoscale': autoscale,
         'hostname': args.celery_hostname,
     }
 
@@ -1932,6 +1936,9 @@ class CLIFactory(object):
             ('-d', '--delete'),
             help='Delete a user',
             action='store_true'),
+        'autoscale': Arg(
+            ('-a', '--autoscale'),
+            help="Minimum and Maximum number of worker to autoscale"),
 
     }
     subparsers = (
@@ -2074,7 +2081,7 @@ class CLIFactory(object):
         'func': worker,
         'help': "Start a Celery worker node",
         'args': ('do_pickle', 'queues', 'concurrency', 'celery_hostname',
-                 'pid', 'daemon', 'stdout', 'stderr', 'log_file'),
+                 'pid', 'daemon', 'stdout', 'stderr', 'log_file', 'autoscale'),
     }, {
         'func': flower,
         'help': "Start a Celery Flower",
diff --git a/airflow/config_templates/default_airflow.cfg b/airflow/config_templates/default_airflow.cfg
index bb4ab208d7..a1806a5dee 100644
--- a/airflow/config_templates/default_airflow.cfg
+++ b/airflow/config_templates/default_airflow.cfg
@@ -349,6 +349,13 @@ celery_app_name = airflow.executors.celery_executor
 # your worker box and the nature of your tasks
 worker_concurrency = 16
 
+# The minimum and maximum concurrency that will be used when starting workers with the
+# "airflow worker" command. Pick these numbers based on resources on
+# worker box and the nature of the task. If autoscale option is available worker_concurrency
+# will be ignored.
+# http://docs.celeryproject.org/en/latest/reference/celery.bin.worker.html#cmdoption-celery-worker-autoscale
+# worker_autoscale = 12,16
+
 # When you start an airflow worker, airflow starts a tiny web server
 # subprocess to serve the workers local log files to the airflow main
 # web server, who then builds pages and sends them to users. This defines


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] phani8996 commented on issue #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added

2018-10-15 Thread GitBox
phani8996 commented on issue #3989: [AIRFLOW-1945] Autoscale celery workers for 
airflow added
URL: 
https://github.com/apache/incubator-airflow/pull/3989#issuecomment-429864622
 
 
   > @phani8996 plz rebase your commits into a single commit. Also, the commit 
message could be like this `[AIRFLOW-1945] Add Autoscale config for Celery 
workers`
   
   @msumit commits have been rebased and commit message updated with proper 
message. Please check.  




[GitHub] codecov-io edited a comment on issue #4049: [AIRFLOW-3203] Fix DockerOperator & some operator test

2018-10-15 Thread GitBox
codecov-io edited a comment on issue #4049: [AIRFLOW-3203] Fix DockerOperator & 
some operator test
URL: 
https://github.com/apache/incubator-airflow/pull/4049#issuecomment-429554020
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4049?src=pr=h1)
 Report
   > Merging 
[#4049](https://codecov.io/gh/apache/incubator-airflow/pull/4049?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/719e0b16b909baedbc4679568548a4b123e6476a?src=pr=desc)
 will **increase** coverage by `1.77%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4049/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4049?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #4049      +/-   ##
   ==========================================
   + Coverage   75.91%   77.69%    +1.77%
   ==========================================
     Files         199      199
     Lines       15948    15944       -4
   ==========================================
   + Hits        12107    12387     +280
   + Misses       3841     3557     -284
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/4049?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/operators/docker\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/4049/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvZG9ja2VyX29wZXJhdG9yLnB5)
 | `97.61% <100%> (+97.61%)` | :arrow_up: |
   | 
[airflow/jobs.py](https://codecov.io/gh/apache/incubator-airflow/pull/4049/diff?src=pr=tree#diff-YWlyZmxvdy9qb2JzLnB5)
 | `82.48% <0%> (+0.35%)` | :arrow_up: |
   | 
[airflow/hooks/hive\_hooks.py](https://codecov.io/gh/apache/incubator-airflow/pull/4049/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9oaXZlX2hvb2tzLnB5)
 | `73.42% <0%> (+0.52%)` | :arrow_up: |
   | 
[airflow/operators/hive\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/4049/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvaGl2ZV9vcGVyYXRvci5weQ==)
 | `86.53% <0%> (+5.76%)` | :arrow_up: |
   | 
[airflow/utils/file.py](https://codecov.io/gh/apache/incubator-airflow/pull/4049/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9maWxlLnB5)
 | `84% <0%> (+8%)` | :arrow_up: |
   | 
[airflow/operators/python\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/4049/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvcHl0aG9uX29wZXJhdG9yLnB5)
 | `95.03% <0%> (+13.04%)` | :arrow_up: |
   | 
[airflow/operators/subdag\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/4049/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvc3ViZGFnX29wZXJhdG9yLnB5)
 | `90.32% <0%> (+19.35%)` | :arrow_up: |
   | 
[airflow/operators/latest\_only\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/4049/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvbGF0ZXN0X29ubHlfb3BlcmF0b3IucHk=)
 | `90% <0%> (+65%)` | :arrow_up: |
   | 
[airflow/operators/s3\_to\_hive\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/4049/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvczNfdG9faGl2ZV9vcGVyYXRvci5weQ==)
 | `94.01% <0%> (+94.01%)` | :arrow_up: |
   | ... and [1 
more](https://codecov.io/gh/apache/incubator-airflow/pull/4049/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4049?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4049?src=pr=footer).
 Last update 
[719e0b1...111a803](https://codecov.io/gh/apache/incubator-airflow/pull/4049?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] XD-DENG edited a comment on issue #4049: [AIRFLOW-3203] Fix DockerOperator & some operator test

2018-10-15 Thread GitBox
XD-DENG edited a comment on issue #4049: [AIRFLOW-3203] Fix DockerOperator & 
some operator test
URL: 
https://github.com/apache/incubator-airflow/pull/4049#issuecomment-429848698
 
 
   In addition, I would suggest including this commit in `1.10.1`, which is 
intended to fix bugs. Re-enabling these tests should help reduce potential bugs.




[GitHub] msumit commented on issue #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added

2018-10-15 Thread GitBox
msumit commented on issue #3989: [AIRFLOW-1945] Autoscale celery workers for 
airflow added
URL: 
https://github.com/apache/incubator-airflow/pull/3989#issuecomment-429849799
 
 
   @phani8996 plz rebase your commits into a single commit. Also, the commit 
message could be like this `[AIRFLOW-1945] Add Autoscale config for Celery 
workers`




[GitHub] XD-DENG commented on issue #4049: [AIRFLOW-3203] Fix DockerOperator & some operator test

2018-10-15 Thread GitBox
XD-DENG commented on issue #4049: [AIRFLOW-3203] Fix DockerOperator & some 
operator test
URL: 
https://github.com/apache/incubator-airflow/pull/4049#issuecomment-429848698
 
 
   In addition, I would suggest including this commit in `1.10.1`, which is 
intended to fix bugs. Re-enabling these should help reduce potential bugs.




[GitHub] msumit commented on a change in pull request #3989: [AIRFLOW-1945] Autoscale celery workers for airflow added

2018-10-15 Thread GitBox
msumit commented on a change in pull request #3989: [AIRFLOW-1945] Autoscale 
celery workers for airflow added
URL: https://github.com/apache/incubator-airflow/pull/3989#discussion_r225161038
 
 

 ##
 File path: airflow/config_templates/default_airflow.cfg
 ##
 @@ -349,6 +349,12 @@ celery_app_name = airflow.executors.celery_executor
 # your worker box and the nature of your tasks
 worker_concurrency = 16
 
+# The minimum and maximum concurrency that will be used when starting workers 
with the
+# "airflow worker" command. Pick these numbers based on resources on
+# worker box and the nature of the task. If autoscale option is available 
worker_concurrency
+# will be ignored
+#worker_autoscale = 12,16
 
 Review comment:
   @phani8996 a space after `#` would be better. i.e `# worker_autoscale = 
12,16`




[GitHub] XD-DENG commented on issue #4049: [AIRFLOW-3203] Fix DockerOperator & some operator test

2018-10-15 Thread GitBox
XD-DENG commented on issue #4049: [AIRFLOW-3203] Fix DockerOperator & some 
operator test
URL: 
https://github.com/apache/incubator-airflow/pull/4049#issuecomment-429848209
 
 
   Hi @Fokko ,
   
   I have changed the names of some of the Operator test scripts (prepending 
`"test_"`), after making sure they don't raise any exceptions and work as 
designed, including
   - `tests/operators/docker_operator.py` (with some code change)
   - `tests/operators/hive_operator.py`
   - `tests/operators/latest_only_operator.py`
   - `tests/operators/python_operator.py`
   - `tests/operators/s3_to_hive_operator.py`
   - `tests/operators/slack_operator.py`
   - `tests/operators/subdag_operator.py`
   
   There are another two, 
   - `tests/operators/bash_operator.py`
   - `tests/operators/operator.py`
   which need more work. I may fix them later (if I get time and nobody else 
picks them up).
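The renames above matter because test runners only collect files whose names match their discovery pattern. As a sketch of why `docker_operator.py` was silently skipped while `test_docker_operator.py` runs (using unittest's default `discover()` pattern; nose, which Airflow used at the time, applies a similar name-based match):

```python
import fnmatch

# unittest's discover() defaults to the "test*.py" file pattern; any test
# module not matching it is silently skipped during collection.
TEST_FILE_PATTERN = "test*.py"

def is_discovered(filename):
    """True if a file name matches the default unittest discovery pattern."""
    return fnmatch.fnmatch(filename, TEST_FILE_PATTERN)
```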




[GitHub] msumit closed pull request #3984: [AIRFLOW-3141] Handle duration for missing dag.

2018-10-15 Thread GitBox
msumit closed pull request #3984: [AIRFLOW-3141] Handle duration for missing 
dag.
URL: https://github.com/apache/incubator-airflow/pull/3984
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/www/views.py b/airflow/www/views.py
index 0aef2281e7..f2414b680d 100644
--- a/airflow/www/views.py
+++ b/airflow/www/views.py
@@ -1612,6 +1612,10 @@ def duration(self, session=None):
         num_runs = request.args.get('num_runs')
         num_runs = int(num_runs) if num_runs else default_dag_run
 
+        if dag is None:
+            flash('DAG "{0}" seems to be missing.'.format(dag_id), "error")
+            return redirect('/admin/')
+
         if base_date:
             base_date = pendulum.parse(base_date)
         else:
diff --git a/airflow/www_rbac/views.py b/airflow/www_rbac/views.py
index e6e505c41a..7658c5c3f9 100644
--- a/airflow/www_rbac/views.py
+++ b/airflow/www_rbac/views.py
@@ -1352,6 +1352,10 @@ def duration(self, session=None):
         num_runs = request.args.get('num_runs')
         num_runs = int(num_runs) if num_runs else default_dag_run
 
+        if dag is None:
+            flash('DAG "{0}" seems to be missing.'.format(dag_id), "error")
+            return redirect('/')
+
         if base_date:
             base_date = pendulum.parse(base_date)
         else:
diff --git a/tests/core.py b/tests/core.py
index 918e9b4d49..91062f6e58 100644
--- a/tests/core.py
+++ b/tests/core.py
@@ -1877,6 +1877,10 @@ def test_dag_views(self):
         response = self.app.get(
             '/admin/airflow/duration?days=30&dag_id=example_bash_operator')
         self.assertIn("example_bash_operator", response.data.decode('utf-8'))
+        response = self.app.get(
+            '/admin/airflow/duration?days=30&dag_id=missing_dag',
+            follow_redirects=True)
+        self.assertIn("seems to be missing", response.data.decode('utf-8'))
         response = self.app.get(
             '/admin/airflow/tries?days=30&dag_id=example_bash_operator')
         self.assertIn("example_bash_operator", response.data.decode('utf-8'))
diff --git a/tests/www_rbac/test_views.py b/tests/www_rbac/test_views.py
index a952b9874c..e79cfb6db8 100644
--- a/tests/www_rbac/test_views.py
+++ b/tests/www_rbac/test_views.py
@@ -381,6 +381,11 @@ def test_duration(self):
         resp = self.client.get(url, follow_redirects=True)
         self.check_content_in_response('example_bash_operator', resp)
 
+    def test_duration_missing(self):
+        url = 'duration?days=30&dag_id=missing_dag'
+        resp = self.client.get(url, follow_redirects=True)
+        self.check_content_in_response('seems to be missing', resp)
+
     def test_tries(self):
         url = 'tries?days=30&dag_id=example_bash_operator'
         resp = self.client.get(url, follow_redirects=True)
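The guard added in this PR follows a common web-view pattern: when the requested resource can't be found, record a user-facing message and redirect instead of rendering (and crashing with a 500). A framework-free sketch of that control flow, where `flash` and `redirect` are stand-ins for Flask's helpers:

```python
# Sketch of the missing-DAG guard: flash an error and redirect rather
# than attempt to render a chart for a DAG that doesn't exist.
def duration_view(dag_id, dags, flash, redirect):
    """Render the duration chart, or flash an error and redirect if missing."""
    dag = dags.get(dag_id)
    if dag is None:
        flash('DAG "{0}" seems to be missing.'.format(dag_id), "error")
        return redirect("/admin/")
    return "duration chart for {0}".format(dag_id)
```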


 




[jira] [Commented] (AIRFLOW-3141) Fix 500 on duration view when dag doesn't exist

2018-10-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650184#comment-16650184
 ] 

ASF GitHub Bot commented on AIRFLOW-3141:
-

msumit closed pull request #3984: [AIRFLOW-3141] Handle duration for missing 
dag.
URL: https://github.com/apache/incubator-airflow/pull/3984
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/www/views.py b/airflow/www/views.py
index 0aef2281e7..f2414b680d 100644
--- a/airflow/www/views.py
+++ b/airflow/www/views.py
@@ -1612,6 +1612,10 @@ def duration(self, session=None):
         num_runs = request.args.get('num_runs')
         num_runs = int(num_runs) if num_runs else default_dag_run
 
+        if dag is None:
+            flash('DAG "{0}" seems to be missing.'.format(dag_id), "error")
+            return redirect('/admin/')
+
         if base_date:
             base_date = pendulum.parse(base_date)
         else:
diff --git a/airflow/www_rbac/views.py b/airflow/www_rbac/views.py
index e6e505c41a..7658c5c3f9 100644
--- a/airflow/www_rbac/views.py
+++ b/airflow/www_rbac/views.py
@@ -1352,6 +1352,10 @@ def duration(self, session=None):
         num_runs = request.args.get('num_runs')
         num_runs = int(num_runs) if num_runs else default_dag_run
 
+        if dag is None:
+            flash('DAG "{0}" seems to be missing.'.format(dag_id), "error")
+            return redirect('/')
+
         if base_date:
             base_date = pendulum.parse(base_date)
         else:
diff --git a/tests/core.py b/tests/core.py
index 918e9b4d49..91062f6e58 100644
--- a/tests/core.py
+++ b/tests/core.py
@@ -1877,6 +1877,10 @@ def test_dag_views(self):
         response = self.app.get(
             '/admin/airflow/duration?days=30&dag_id=example_bash_operator')
         self.assertIn("example_bash_operator", response.data.decode('utf-8'))
+        response = self.app.get(
+            '/admin/airflow/duration?days=30&dag_id=missing_dag',
+            follow_redirects=True)
+        self.assertIn("seems to be missing", response.data.decode('utf-8'))
         response = self.app.get(
             '/admin/airflow/tries?days=30&dag_id=example_bash_operator')
         self.assertIn("example_bash_operator", response.data.decode('utf-8'))
diff --git a/tests/www_rbac/test_views.py b/tests/www_rbac/test_views.py
index a952b9874c..e79cfb6db8 100644
--- a/tests/www_rbac/test_views.py
+++ b/tests/www_rbac/test_views.py
@@ -381,6 +381,11 @@ def test_duration(self):
         resp = self.client.get(url, follow_redirects=True)
         self.check_content_in_response('example_bash_operator', resp)
 
+    def test_duration_missing(self):
+        url = 'duration?days=30&dag_id=missing_dag'
+        resp = self.client.get(url, follow_redirects=True)
+        self.check_content_in_response('seems to be missing', resp)
+
     def test_tries(self):
         url = 'tries?days=30&dag_id=example_bash_operator'
         resp = self.client.get(url, follow_redirects=True)


 




> Fix 500 on duration view when dag doesn't exist
> ---
>
> Key: AIRFLOW-3141
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3141
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Josh Carp
>Assignee: Josh Carp
>Priority: Trivial
>
> Loading the duration view for a dag that doesn't exist throws a 500. Based on 
> the behavior of other dag views, this should redirect to the admin view and 
> flash an error message instead.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3203) Bugs in DockerOperator & Some operator test scripts were named incorrectly

2018-10-15 Thread Xiaodong DENG (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaodong DENG updated AIRFLOW-3203:
---
Summary: Bugs in DockerOperator & Some operator test scripts were named 
incorrectly  (was: Bugs in DockerOperator & some operator tests)

> Bugs in DockerOperator & Some operator test scripts were named incorrectly
> --
>
> Key: AIRFLOW-3203
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3203
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators, tests
>Affects Versions: 1.10.0
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Critical
>
> Usage of `cpu_shares` and `cpu_shares` is incorrect in DockerOperator, based 
> on documentation of Python package "docker".
> In addition, its test is not really working due to incorrect file name. This 
> also happens for some other test scripts for Operators.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3203) Bugs in DockerOperator & Some operator test scripts were named incorrectly

2018-10-15 Thread Xiaodong DENG (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaodong DENG updated AIRFLOW-3203:
---
Description: 
Usage of `cpu_shares` and `cpu_shares` is incorrect in DockerOperator, based on 
documentation of Python package "docker".

In addition, its test is not really working due to incorrect file name. This 
also happens for some other test scripts for Operators. This results in test 
discovery failure.

  was:
Usage of `cpu_shares` and `cpu_shares` is incorrect in DockerOperator, based on 
documentation of Python package "docker".

In addition, its test is not really working due to incorrect file name. This 
also happens for some other test scripts for Operators.


> Bugs in DockerOperator & Some operator test scripts were named incorrectly
> --
>
> Key: AIRFLOW-3203
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3203
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators, tests
>Affects Versions: 1.10.0
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Critical
>
> Usage of `cpu_shares` and `cpu_shares` is incorrect in DockerOperator, based 
> on documentation of Python package "docker".
> In addition, its test is not really working due to incorrect file name. This 
> also happens for some other test scripts for Operators. This results in test 
> discovery failure.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3203) Bugs in DockerOperator & some operator tests

2018-10-15 Thread Xiaodong DENG (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaodong DENG updated AIRFLOW-3203:
---
Description: 
Usage of `cpu_shares` and `cpu_shares` is incorrect in DockerOperator, based on 
documentation of Python package "docker".

In addition, its test is not really working due to incorrect file name. This 
also happens for some other test scripts for Operators.

  was:
Usage of `cpu_shares` and `cpu_shares` is incorrect in DockerOperator, based on 
documentation of Python package "docker".

In addition, its test is not really working due to incorrect file name.

Summary: Bugs in DockerOperator & some operator tests  (was: Bugs in 
DockerOperator & its test)

> Bugs in DockerOperator & some operator tests
> 
>
> Key: AIRFLOW-3203
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3203
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators, tests
>Affects Versions: 1.10.0
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Critical
>
> Usage of `cpu_shares` and `cpu_shares` is incorrect in DockerOperator, based 
> on documentation of Python package "docker".
> In addition, its test is not really working due to incorrect file name. This 
> also happens for some other test scripts for Operators.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] msumit commented on issue #3984: [AIRFLOW-3141] Handle duration for missing dag.

2018-10-15 Thread GitBox
msumit commented on issue #3984: [AIRFLOW-3141] Handle duration for missing dag.
URL: 
https://github.com/apache/incubator-airflow/pull/3984#issuecomment-429815073
 
 
   Much better than getting a 500 page.




[GitHub] codecov-io edited a comment on issue #4055: [AIRFLOW-3206] neutral and clear GPL dependency notice

2018-10-15 Thread GitBox
codecov-io edited a comment on issue #4055: [AIRFLOW-3206] neutral and clear 
GPL dependency notice
URL: 
https://github.com/apache/incubator-airflow/pull/4055#issuecomment-429739296
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=h1)
 Report
   > Merging 
[#4055](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/fac5a8e623a5c702adece7234547861b1cb2d1d8?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4055/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=tree)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #4055   +/-   ##
   =======================================
     Coverage   75.91%   75.91%
   =======================================
     Files         199      199
     Lines       15948    15948
   =======================================
     Hits        12107    12107
     Misses       3841     3841
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=footer).
 Last update 
[fac5a8e...9406ef3](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   






[GitHub] itscaro commented on a change in pull request #3981: [AIRFLOW-3133] Implement xcom_push flag for contrib's operators

2018-10-15 Thread GitBox
itscaro commented on a change in pull request #3981: [AIRFLOW-3133] Implement 
xcom_push flag for contrib's operators
URL: https://github.com/apache/incubator-airflow/pull/3981#discussion_r225115466
 
 

 ##
 File path: airflow/contrib/operators/mlengine_operator.py
 ##
 @@ -151,6 +151,9 @@ class MLEngineBatchPredictionOperator(BaseOperator):
 have doamin-wide delegation enabled.
 :type delegate_to: str
 
 
 Review comment:
   OK, but I tried to keep the same blank line as in the other parameters; I did 
not want to trim all these existing blank lines. Should I?




[GitHub] itscaro commented on a change in pull request #3981: [AIRFLOW-3133] Implement xcom_push flag for contrib's operators

2018-10-15 Thread GitBox
itscaro commented on a change in pull request #3981: [AIRFLOW-3133] Implement 
xcom_push flag for contrib's operators
URL: https://github.com/apache/incubator-airflow/pull/3981#discussion_r225115675
 
 

 ##
 File path: airflow/contrib/operators/mlengine_operator.py
 ##
 @@ -387,6 +400,9 @@ class MLEngineVersionOperator(BaseOperator):
 For this to work, the service account making the request must have
 domain-wide delegation enabled.
 :type delegate_to: str
+
 
 Review comment:
   OK, but I tried to keep the same blank line as in the other parameters; I did 
not want to trim all these existing blank lines. Should I?




[GitHub] itscaro commented on a change in pull request #3981: [AIRFLOW-3133] Implement xcom_push flag for contrib's operators

2018-10-15 Thread GitBox
itscaro commented on a change in pull request #3981: [AIRFLOW-3133] Implement 
xcom_push flag for contrib's operators
URL: https://github.com/apache/incubator-airflow/pull/3981#discussion_r225115466
 
 

 ##
 File path: airflow/contrib/operators/mlengine_operator.py
 ##
 @@ -151,6 +151,9 @@ class MLEngineBatchPredictionOperator(BaseOperator):
 have doamin-wide delegation enabled.
 :type delegate_to: str
 
 
 Review comment:
   OK, but I tried to keep the same blank line as in the other parameters




[GitHub] itscaro commented on a change in pull request #3981: [AIRFLOW-3133] Implement xcom_push flag for contrib's operators

2018-10-15 Thread GitBox
itscaro commented on a change in pull request #3981: [AIRFLOW-3133] Implement 
xcom_push flag for contrib's operators
URL: https://github.com/apache/incubator-airflow/pull/3981#discussion_r225114997
 
 

 ##
 File path: airflow/contrib/operators/gcs_to_bq.py
 ##
 @@ -248,7 +252,7 @@ def execute(self, context):
 time_partitioning=self.time_partitioning,
 cluster_fields=self.cluster_fields)
 
-if self.max_id_key:
+if self.do_xcom_push and self.max_id_key:
 
 Review comment:
   I wondered too, but I decided to add the `xcom_push` flag so that `max_id_key` is not an exceptional flag that enables the XCom feature. The best option might be renaming `max_id_key` to an `xcom_push` flag and adding documentation to explain what is pushed to XCom when the feature is enabled.




[GitHub] itscaro commented on a change in pull request #3981: [AIRFLOW-3133] Implement xcom_push flag for contrib's operators

2018-10-15 Thread GitBox
itscaro commented on a change in pull request #3981: [AIRFLOW-3133] Implement 
xcom_push flag for contrib's operators
URL: https://github.com/apache/incubator-airflow/pull/3981#discussion_r225113656
 
 

 ##
 File path: airflow/contrib/operators/bigquery_get_data.py
 ##
 @@ -78,6 +80,7 @@ def __init__(self,
  selected_fields=None,
  bigquery_conn_id='bigquery_default',
  delegate_to=None,
+ do_xcom_push=True,
 
 Review comment:
   The choice of `True` as the default value was intended to avoid breaking backward compatibility, as suggested by @ashb. But I agree that `False` is more consistent with the core operators.
   
   https://github.com/apache/incubator-airflow/pull/3981#issuecomment-425855530
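
For readers skimming the thread, the trade-off under discussion can be sketched as follows (illustrative names only, not Airflow's actual classes): an operator-level `do_xcom_push` flag gates whether the result of `execute()` is pushed to XCom, and the flag's default decides compatibility.

```python
class FakeTaskInstance:
    """Stand-in for Airflow's TaskInstance XCom storage (hypothetical)."""
    def __init__(self):
        self.xcoms = {}

    def xcom_push(self, key, value):
        self.xcoms[key] = value


class SketchOperator:
    """Illustrative operator, not a real Airflow class."""
    def __init__(self, do_xcom_push=True):
        # True preserves behaviour for DAGs that already rely on the push;
        # False matches the default of the core operators.
        self.do_xcom_push = do_xcom_push

    def execute(self, task_instance):
        result = {'rows': 3}  # whatever the operator computed
        if self.do_xcom_push:
            task_instance.xcom_push('return_value', result)
        return result


ti = FakeTaskInstance()
SketchOperator(do_xcom_push=False).execute(ti)
assert ti.xcoms == {}  # nothing pushed when the flag is off
```

Defaulting to `True` keeps existing DAGs working; defaulting to `False` matches the core operators — which is exactly the trade-off debated above.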




[jira] [Commented] (AIRFLOW-3155) Add ability to filter by a last modified time in GoogleCloudStorageToGoogleCloudStorageOperator

2018-10-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650023#comment-16650023
 ] 

ASF GitHub Bot commented on AIRFLOW-3155:
-

kaxil closed pull request #4008: [AIRFLOW-3155] Add ability to filter by a last 
modified time in GoogleCloudStorageToGoogleCloudStorageOperator
URL: https://github.com/apache/incubator-airflow/pull/4008
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/operators/gcs_to_gcs.py 
b/airflow/contrib/operators/gcs_to_gcs.py
index 12fbff5276..0e1087e4d2 100644
--- a/airflow/contrib/operators/gcs_to_gcs.py
+++ b/airflow/contrib/operators/gcs_to_gcs.py
@@ -62,6 +62,10 @@ class 
GoogleCloudStorageToGoogleCloudStorageOperator(BaseOperator):
 For this to work, the service account making the request must have
 domain-wide delegation enabled.
 :type delegate_to: str
+:param last_modified_time: When specified, if the object(s) were
+modified after last_modified_time, they will be copied/moved.
+If tzinfo has not been set, UTC will be assumed.
+:type last_modified_time: datetime
 
 **Examples**:
 The following Operator would copy a single file named
@@ -114,6 +118,7 @@ def __init__(self,
  move_object=False,
  google_cloud_storage_conn_id='google_cloud_default',
  delegate_to=None,
+ last_modified_time=None,
  *args,
  **kwargs):
 super(GoogleCloudStorageToGoogleCloudStorageOperator,
@@ -125,6 +130,7 @@ def __init__(self,
 self.move_object = move_object
 self.google_cloud_storage_conn_id = google_cloud_storage_conn_id
 self.delegate_to = delegate_to
+self.last_modified_time = last_modified_time
 self.wildcard = '*'
 
 def execute(self, context):
@@ -140,6 +146,13 @@ def execute(self, context):
             objects = hook.list(self.source_bucket, prefix=prefix,
                                 delimiter=delimiter)
 
             for source_object in objects:
+                if self.last_modified_time is not None:
+                    # Check to see if object was modified after last_modified_time
+                    if hook.is_updated_after(self.source_bucket, source_object,
+                                             self.last_modified_time):
+                        pass
+                    else:
+                        continue
                 if self.destination_object is None:
                     destination_object = source_object
                 else:
@@ -156,6 +169,14 @@ def execute(self, context):
                 hook.delete(self.source_bucket, source_object)
 
         else:
+            if self.last_modified_time is not None:
+                if hook.is_updated_after(self.source_bucket,
+                                         self.source_object,
+                                         self.last_modified_time):
+                    pass
+                else:
+                    return
+
             self.log.info(
                 log_message.format(self.source_bucket, self.source_object,
                                    self.destination_bucket or
                                    self.source_bucket,
diff --git a/tests/contrib/operators/test_gcs_to_gcs_operator.py 
b/tests/contrib/operators/test_gcs_to_gcs_operator.py
index 6b866d11e1..dd16e2f2df 100644
--- a/tests/contrib/operators/test_gcs_to_gcs_operator.py
+++ b/tests/contrib/operators/test_gcs_to_gcs_operator.py
@@ -18,6 +18,7 @@
 # under the License.
 
 import unittest
+from datetime import datetime
 
 from airflow.contrib.operators.gcs_to_gcs import \
 GoogleCloudStorageToGoogleCloudStorageOperator
@@ -38,6 +39,7 @@
 SOURCE_OBJECT_2 = 'test_object*'
 SOURCE_OBJECT_3 = 'test*object'
 SOURCE_OBJECT_4 = 'test_object*.txt'
+SOURCE_OBJECT_5 = 'test_object.txt'
 DESTINATION_BUCKET = 'archive'
 DESTINATION_OBJECT_PREFIX = 'foo/bar'
 SOURCE_FILES_LIST = [
@@ -45,6 +47,7 @@
 'test_object/file2.txt',
 'test_object/file3.json',
 ]
+MOD_TIME_1 = datetime(2016, 1, 1)
 
 
 class GoogleCloudStorageToCloudStorageOperatorTest(unittest.TestCase):
@@ -167,3 +170,113 @@ def test_execute_wildcard_empty_destination_object(self, 
mock_hook):
   DESTINATION_BUCKET, '/file2.txt'),
 ]
 mock_hook.return_value.rewrite.assert_has_calls(mock_calls_empty)
+
+@mock.patch('airflow.contrib.operators.gcs_to_gcs.GoogleCloudStorageHook')
+def test_execute_last_modified_time(self, mock_hook):
+mock_hook.return_value.list.return_value = SOURCE_FILES_LIST
+operator = GoogleCloudStorageToGoogleCloudStorageOperator(
+task_id=TASK_ID, source_bucket=TEST_BUCKET,
+  

[jira] [Resolved] (AIRFLOW-3155) Add ability to filter by a last modified time in GoogleCloudStorageToGoogleCloudStorageOperator

2018-10-15 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik resolved AIRFLOW-3155.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

Resolved by https://github.com/apache/incubator-airflow/pull/4008

> Add ability to filter by a last modified time in 
> GoogleCloudStorageToGoogleCloudStorageOperator
> ---
>
> Key: AIRFLOW-3155
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3155
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 2.0.0
>Reporter: Brandon Kvarda
>Assignee: Brandon Kvarda
>Priority: Minor
> Fix For: 2.0.0
>
>
> Currently the GoogleCloudStorageToGoogleCloudStorageOperator doesn't support 
> filtering objects based on a last modified time/date. This would add the 
> ability to further filter source object(s) to copy/move based on a last 
> modified time threshold (for example, if the objects were updated after the 
> last run at 10:00 yesterday, then copy/move them; otherwise, do not.) 
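
The gate the merged diff adds can be sketched in isolation like this — a hypothetical helper mirroring `hook.is_updated_after`, not the real GoogleCloudStorageHook method. Per the new docstring, a naive threshold is treated as UTC:

```python
from datetime import datetime, timezone

def is_updated_after(object_mtime, threshold):
    # If the caller passed a naive datetime, assume UTC (as the new
    # last_modified_time docstring states).
    if threshold.tzinfo is None:
        threshold = threshold.replace(tzinfo=timezone.utc)
    return object_mtime > threshold

mtimes = {
    'fresh.csv': datetime(2018, 10, 14, tzinfo=timezone.utc),
    'stale.csv': datetime(2015, 6, 1, tzinfo=timezone.utc),
}
cutoff = datetime(2016, 1, 1)  # naive -> interpreted as UTC
to_copy = [name for name, mtime in mtimes.items()
           if is_updated_after(mtime, cutoff)]
assert to_copy == ['fresh.csv']  # the stale object is skipped, mirroring `continue`
```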



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] kaxil closed pull request #4008: [AIRFLOW-3155] Add ability to filter by a last modified time in GoogleCloudStorageToGoogleCloudStorageOperator

2018-10-15 Thread GitBox
kaxil closed pull request #4008: [AIRFLOW-3155] Add ability to filter by a last 
modified time in GoogleCloudStorageToGoogleCloudStorageOperator
URL: https://github.com/apache/incubator-airflow/pull/4008
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/operators/gcs_to_gcs.py 
b/airflow/contrib/operators/gcs_to_gcs.py
index 12fbff5276..0e1087e4d2 100644
--- a/airflow/contrib/operators/gcs_to_gcs.py
+++ b/airflow/contrib/operators/gcs_to_gcs.py
@@ -62,6 +62,10 @@ class 
GoogleCloudStorageToGoogleCloudStorageOperator(BaseOperator):
 For this to work, the service account making the request must have
 domain-wide delegation enabled.
 :type delegate_to: str
+:param last_modified_time: When specified, if the object(s) were
+modified after last_modified_time, they will be copied/moved.
+If tzinfo has not been set, UTC will be assumed.
+:type last_modified_time: datetime
 
 **Examples**:
 The following Operator would copy a single file named
@@ -114,6 +118,7 @@ def __init__(self,
  move_object=False,
  google_cloud_storage_conn_id='google_cloud_default',
  delegate_to=None,
+ last_modified_time=None,
  *args,
  **kwargs):
 super(GoogleCloudStorageToGoogleCloudStorageOperator,
@@ -125,6 +130,7 @@ def __init__(self,
 self.move_object = move_object
 self.google_cloud_storage_conn_id = google_cloud_storage_conn_id
 self.delegate_to = delegate_to
+self.last_modified_time = last_modified_time
 self.wildcard = '*'
 
 def execute(self, context):
@@ -140,6 +146,13 @@ def execute(self, context):
             objects = hook.list(self.source_bucket, prefix=prefix,
                                 delimiter=delimiter)
 
             for source_object in objects:
+                if self.last_modified_time is not None:
+                    # Check to see if object was modified after last_modified_time
+                    if hook.is_updated_after(self.source_bucket, source_object,
+                                             self.last_modified_time):
+                        pass
+                    else:
+                        continue
                 if self.destination_object is None:
                     destination_object = source_object
                 else:
@@ -156,6 +169,14 @@ def execute(self, context):
                 hook.delete(self.source_bucket, source_object)
 
         else:
+            if self.last_modified_time is not None:
+                if hook.is_updated_after(self.source_bucket,
+                                         self.source_object,
+                                         self.last_modified_time):
+                    pass
+                else:
+                    return
+
             self.log.info(
                 log_message.format(self.source_bucket, self.source_object,
                                    self.destination_bucket or
                                    self.source_bucket,
diff --git a/tests/contrib/operators/test_gcs_to_gcs_operator.py 
b/tests/contrib/operators/test_gcs_to_gcs_operator.py
index 6b866d11e1..dd16e2f2df 100644
--- a/tests/contrib/operators/test_gcs_to_gcs_operator.py
+++ b/tests/contrib/operators/test_gcs_to_gcs_operator.py
@@ -18,6 +18,7 @@
 # under the License.
 
 import unittest
+from datetime import datetime
 
 from airflow.contrib.operators.gcs_to_gcs import \
 GoogleCloudStorageToGoogleCloudStorageOperator
@@ -38,6 +39,7 @@
 SOURCE_OBJECT_2 = 'test_object*'
 SOURCE_OBJECT_3 = 'test*object'
 SOURCE_OBJECT_4 = 'test_object*.txt'
+SOURCE_OBJECT_5 = 'test_object.txt'
 DESTINATION_BUCKET = 'archive'
 DESTINATION_OBJECT_PREFIX = 'foo/bar'
 SOURCE_FILES_LIST = [
@@ -45,6 +47,7 @@
 'test_object/file2.txt',
 'test_object/file3.json',
 ]
+MOD_TIME_1 = datetime(2016, 1, 1)
 
 
 class GoogleCloudStorageToCloudStorageOperatorTest(unittest.TestCase):
@@ -167,3 +170,113 @@ def test_execute_wildcard_empty_destination_object(self, 
mock_hook):
   DESTINATION_BUCKET, '/file2.txt'),
 ]
 mock_hook.return_value.rewrite.assert_has_calls(mock_calls_empty)
+
+@mock.patch('airflow.contrib.operators.gcs_to_gcs.GoogleCloudStorageHook')
+def test_execute_last_modified_time(self, mock_hook):
+mock_hook.return_value.list.return_value = SOURCE_FILES_LIST
+operator = GoogleCloudStorageToGoogleCloudStorageOperator(
+task_id=TASK_ID, source_bucket=TEST_BUCKET,
+source_object=SOURCE_OBJECT_4,
+destination_bucket=DESTINATION_BUCKET,
+last_modified_time=None)
+
+operator.execute(None)
+mock_calls_none = [
+mock.call(TEST_BUCKET, 

[GitHub] johnhofman commented on issue #3960: [AIRFLOW-2966] Catch ApiException in the Kubernetes Executor

2018-10-15 Thread GitBox
johnhofman commented on issue #3960: [AIRFLOW-2966] Catch ApiException in the 
Kubernetes Executor
URL: 
https://github.com/apache/incubator-airflow/pull/3960#issuecomment-429789504
 
 
   @Fokko I have rebased. Is this failing test something I need to look into?




[GitHub] kaxil commented on issue #4054: [AIRFLOW-XXX] Fix GCS Operator docstrings

2018-10-15 Thread GitBox
kaxil commented on issue #4054: [AIRFLOW-XXX] Fix GCS Operator docstrings
URL: 
https://github.com/apache/incubator-airflow/pull/4054#issuecomment-429788296
 
 
   @Fokko Yes, I agree, that should be the way to go. However, I haven't been able to find a decent linting tool for that yet.




[GitHub] kaxil closed pull request #4054: [AIRFLOW-XXX] Fix GCS Operator docstrings

2018-10-15 Thread GitBox
kaxil closed pull request #4054: [AIRFLOW-XXX] Fix GCS Operator docstrings
URL: https://github.com/apache/incubator-airflow/pull/4054
 
 
   




[jira] [Commented] (AIRFLOW-3205) GCS: Support multipart upload

2018-10-15 Thread Gordon Ball (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649995#comment-16649995
 ] 

Gordon Ball commented on AIRFLOW-3205:
--

The behaviour of the MySQL->GCS operator is to split the output into multiple 
files, whereas this is about uploading a single logical file in multiple HTTP 
requests, avoiding a size limit.

 

The former behaviour is useful by itself (e.g., for import to BigQuery, the multiple uploaded files can be imported in parallel instead of as a slow serial import of a single file), but it is orthogonal to this case.

> GCS: Support multipart upload
> -
>
> Key: AIRFLOW-3205
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3205
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Reporter: Gordon Ball
>Priority: Minor
>
> GoogleCloudStorageHook currently only provides support for uploading files in 
> a single HTTP request. This means that loads fail with SSL errors for files 
> larger than 2GiB (presumably an int32 overflow; it might depend on which SSL 
> library is being used). Multipart uploads should be supported to allow large 
> uploads, and possibly increase reliability for smaller uploads.





[jira] [Commented] (AIRFLOW-3205) GCS: Support multipart upload

2018-10-15 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649979#comment-16649979
 ] 

jack commented on AIRFLOW-3205:
---

Some operators, such as MySqlToGoogleCloudStorageOperator, do support this behavior.

You can specify the max file size, and you can also define the param `filename` with a template like `name{}.json`, which will create name0.json, name1.json, and so on — as many files as needed until all records have been exported.

[https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/operators/mysql_to_gcs.py]

I haven't checked whether this behavior comes from the operator or from the hook, but I agree that it would be nice to have it in all operators that interact with Google Cloud Storage.
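
The counter-based naming jack describes can be sketched as follows (a hypothetical helper, not the real operator code — the actual chunking in mysql_to_gcs.py is driven by a max-file-size parameter and should be checked against the source):

```python
def export_filenames(template, num_chunks):
    # 'name{}.json' -> name0.json, name1.json, ...: one object name per
    # size-limited chunk of the export.
    return [template.format(i) for i in range(num_chunks)]

assert export_filenames('name{}.json', 3) == ['name0.json', 'name1.json', 'name2.json']
```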

> GCS: Support multipart upload
> -
>
> Key: AIRFLOW-3205
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3205
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Reporter: Gordon Ball
>Priority: Minor
>
> GoogleCloudStorageHook currently only provides support for uploading files in 
> a single HTTP request. This means that loads fail with SSL errors for files 
> larger than 2GiB (presumably an int32 overflow; it might depend on which SSL 
> library is being used). Multipart uploads should be supported to allow large 
> uploads, and possibly increase reliability for smaller uploads.





[GitHub] codecov-io edited a comment on issue #4055: [AIRFLOW-3206] neutral and clear GPL dependency notice

2018-10-15 Thread GitBox
codecov-io edited a comment on issue #4055: [AIRFLOW-3206] neutral and clear 
GPL dependency notice
URL: 
https://github.com/apache/incubator-airflow/pull/4055#issuecomment-429739296
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=h1)
 Report
   > Merging 
[#4055](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/fac5a8e623a5c702adece7234547861b1cb2d1d8?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4055/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=tree)
   
    ```diff
    @@           Coverage Diff           @@
    ##           master    #4055   +/-   ##
    =======================================
      Coverage   75.91%   75.91%           
    =======================================
      Files         199      199           
      Lines       15948    15948           
    =======================================
      Hits        12107    12107           
      Misses       3841     3841
    ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
    > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=footer).
 Last update 
[fac5a8e...8244820](https://codecov.io/gh/apache/incubator-airflow/pull/4055?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   






[GitHub] brylie opened a new pull request #4055: [AIRFLOW-3206] neutral and clear GPL dependency notice

2018-10-15 Thread GitBox
brylie opened a new pull request #4055: [AIRFLOW-3206] neutral and clear GPL 
dependency notice
URL: https://github.com/apache/incubator-airflow/pull/4055
 
 
   Clarify the GPL dependency notice. Add line breaks for readability. Use 
neutral terms 'disallow' and 'allow'.




[jira] [Commented] (AIRFLOW-3206) More neutral language regarding Copyleft in installation instructions

2018-10-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649822#comment-16649822
 ] 

ASF GitHub Bot commented on AIRFLOW-3206:
-

brylie opened a new pull request #4055: [AIRFLOW-3206] neutral and clear GPL 
dependency notice
URL: https://github.com/apache/incubator-airflow/pull/4055
 
 
   Clarify the GPL dependency notice. Add line breaks for readability. Use 
neutral terms 'disallow' and 'allow'.




> More neutral language regarding Copyleft in installation instructions
> -
>
> Key: AIRFLOW-3206
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3206
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.10.0
>Reporter: Brylie Christopher Oxley
>Assignee: Brylie Christopher Oxley
>Priority: Trivial
>  Labels: newbie
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> When installing Airflow, the user must set an environment variable to 
> explicitly _allow_ or _disallow_ the installation of a GPL dependency. The 
> text of the error message is somewhat difficult to read, and seems biased 
> against the GPL dependency.
> h2. Task
>  * add proper line breaks to GPL dependency notice, for improved readability
>  * use neutral language _allow_ and _disallow_ (as opposed to 'force')





[jira] [Created] (AIRFLOW-3206) More neutral language regarding Copyleft in installation instructions

2018-10-15 Thread Brylie Christopher Oxley (JIRA)
Brylie Christopher Oxley created AIRFLOW-3206:
-

 Summary: More neutral language regarding Copyleft in installation 
instructions
 Key: AIRFLOW-3206
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3206
 Project: Apache Airflow
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.10.0
Reporter: Brylie Christopher Oxley
Assignee: Brylie Christopher Oxley


When installing Airflow, the user must set an environment variable to 
explicitly _allow_ or _disallow_ the installation of a GPL dependency. The text 
of the error message is somewhat difficult to read, and seems biased against 
the GPL dependency.

 
h2. Task
 * add proper line breaks to GPL dependency notice, for improved readability
 * use neutral language _allow_ and _disallow_ (as opposed to 'force')





[jira] [Updated] (AIRFLOW-3206) More neutral language regarding Copyleft in installation instructions

2018-10-15 Thread Brylie Christopher Oxley (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brylie Christopher Oxley updated AIRFLOW-3206:
--
Description: 
When installing Airflow, the user must set an environment variable to 
explicitly _allow_ or _disallow_ the installation of a GPL dependency. The text 
of the error message is somewhat difficult to read, and seems biased against 
the GPL dependency.
h2. Task
 * add proper line breaks to GPL dependency notice, for improved readability
 * use neutral language _allow_ and _disallow_ (as opposed to 'force')

  was:
When installing Airflow, the user must set an environment variable to 
explicitly _allow_ or _disallow_ the installation of a GPL dependency. The text 
of the error message is somewhat difficult to read, and seems biased against 
the GPL dependency.

 
h2. Task
 * add proper line breaks to GPL dependency notice, for improved readability
 * use neutral language _allow_ and _disallow_ (as opposed to 'force')


> More neutral language regarding Copyleft in installation instructions
> -
>
> Key: AIRFLOW-3206
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3206
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.10.0
>Reporter: Brylie Christopher Oxley
>Assignee: Brylie Christopher Oxley
>Priority: Trivial
>  Labels: newbie
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> When installing Airflow, the user must set an environment variable to 
> explicitly _allow_ or _disallow_ the installation of a GPL dependency. The 
> text of the error message is somewhat difficult to read, and seems biased 
> against the GPL dependency.
> h2. Task
>  * add proper line breaks to GPL dependency notice, for improved readability
>  * use neutral language _allow_ and _disallow_ (as opposed to 'force')





[GitHub] codecov-io edited a comment on issue #3741: [AIRFLOW-1368] Add auto_remove for DockerOperator

2018-10-15 Thread GitBox
codecov-io edited a comment on issue #3741: [AIRFLOW-1368] Add auto_remove for 
DockerOperator
URL: 
https://github.com/apache/incubator-airflow/pull/3741#issuecomment-412492251
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3741?src=pr=h1)
 Report
   > Merging 
[#3741](https://codecov.io/gh/apache/incubator-airflow/pull/3741?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/b8be322d3badfeadfa8f08e0bf92a12a6cd26418?src=pr=desc)
 will **increase** coverage by `1.85%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3741/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3741?src=pr=tree)
   
    ```diff
    @@            Coverage Diff             @@
    ##           master    #3741      +/-   ##
    ==========================================
    + Coverage   75.79%   77.65%   +1.85%     
    ==========================================
      Files         199      204       +5     
      Lines       15946    15850      -96     
    ==========================================
    + Hits        12086    12308     +222     
    + Misses       3860     3542     -318
    ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/3741?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/operators/docker\_operator.py](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree#diff-YWlyZmxvdy9vcGVyYXRvcnMvZG9ja2VyX29wZXJhdG9yLnB5)
 | `97.7% <100%> (+97.7%)` | :arrow_up: |
   | 
[airflow/sensors/s3\_key\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3MzX2tleV9zZW5zb3IucHk=)
 | `31.03% <0%> (-68.97%)` | :arrow_down: |
   | 
[airflow/sensors/s3\_prefix\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3MzX3ByZWZpeF9zZW5zb3IucHk=)
 | `41.17% <0%> (-58.83%)` | :arrow_down: |
   | 
[airflow/utils/helpers.py](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9oZWxwZXJzLnB5)
 | `71.34% <0%> (-13.04%)` | :arrow_down: |
   | 
[airflow/hooks/mysql\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree#diff-YWlyZmxvdy9ob29rcy9teXNxbF9ob29rLnB5)
 | `78% <0%> (-12%)` | :arrow_down: |
   | 
[airflow/sensors/sql\_sensor.py](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree#diff-YWlyZmxvdy9zZW5zb3JzL3NxbF9zZW5zb3IucHk=)
 | `90.47% <0%> (-9.53%)` | :arrow_down: |
   | 
[airflow/configuration.py](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree#diff-YWlyZmxvdy9jb25maWd1cmF0aW9uLnB5)
 | `83.95% <0%> (-5.31%)` | :arrow_down: |
   | 
[airflow/utils/state.py](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9zdGF0ZS5weQ==)
 | `93.33% <0%> (-3.34%)` | :arrow_down: |
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `88.82% <0%> (-2.89%)` | :arrow_down: |
   | 
[airflow/utils/email.py](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree#diff-YWlyZmxvdy91dGlscy9lbWFpbC5weQ==)
 | `97.4% <0%> (-2.6%)` | :arrow_down: |
   | ... and [66 
more](https://codecov.io/gh/apache/incubator-airflow/pull/3741/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3741?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
    > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3741?src=pr=footer).
 Last update 
[b8be322...c9247b4](https://codecov.io/gh/apache/incubator-airflow/pull/3741?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[jira] [Created] (AIRFLOW-3205) GCS: Support multipart upload

2018-10-15 Thread Gordon Ball (JIRA)
Gordon Ball created AIRFLOW-3205:


 Summary: GCS: Support multipart upload
 Key: AIRFLOW-3205
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3205
 Project: Apache Airflow
  Issue Type: Improvement
  Components: gcp
Reporter: Gordon Ball


GoogleCloudStorageHook currently only provides support for uploading files in a 
single HTTP request. This means that loads fail with SSL errors for files 
larger than 2GiB (presumably an int32 overflow; it might depend on which SSL 
library is being used). Multipart uploads should be supported to allow large 
uploads, and possibly increase reliability for smaller uploads.


