[GitHub] ckljohn commented on issue #3227: [AIRFLOW-2299] Add S3 Select functionarity to S3FileTransformOperator

2018-10-22 Thread GitBox
ckljohn commented on issue #3227: [AIRFLOW-2299] Add S3 Select functionarity to 
S3FileTransformOperator
URL: 
https://github.com/apache/incubator-airflow/pull/3227#issuecomment-432093892
 
 
   @sekikn  if the file stores an encoded string, the `Payload` returned is bytes.
   
   At 
https://github.com/sekikn/incubator-airflow/blob/288fca445ffcad718d39f413eddd8712a18dbf85/airflow/hooks/S3_hook.py#L248,
 `''.join()` will raise an exception.
   
   ```
 File "/usr/local/lib/python3.6/site-packages/airflow/hooks/S3_hook.py", 
line 249, in select_key
   for event in response['Payload']
   TypeError: sequence item 0: expected str instance, bytes found
   ```
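   
   A minimal sketch of one possible workaround, assuming the boto3 `select_object_content` 
response layout that `select_key` consumes (the client, bucket, key, expression and 
serialization variables below are illustrative placeholders): collect the byte chunks 
and decode once at the end.
   
   ```python
   # Hypothetical sketch, not the actual S3_hook.py change: S3 Select streams
   # 'Records' events whose Payload is bytes, so join as bytes and decode once.
   response = s3_client.select_object_content(
       Bucket=bucket_name,
       Key=key,
       ExpressionType='SQL',
       Expression=expression,
       InputSerialization=input_serialization,
       OutputSerialization=output_serialization)

   chunks = [
       event['Records']['Payload']        # each chunk is bytes, not str
       for event in response['Payload']
       if 'Records' in event
   ]
   result = b''.join(chunks).decode('utf-8')
   ```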
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-3243) UI task and dag clear feature cannot pick up dag parameters

2018-10-22 Thread chengningzhang (JIRA)
chengningzhang created AIRFLOW-3243:
---

 Summary: UI task and dag clear feature cannot pick up dag 
parameters
 Key: AIRFLOW-3243
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3243
 Project: Apache Airflow
  Issue Type: Improvement
Reporter: chengningzhang


Hi, 

    I am running into an issue with the Airflow UI "clear" feature for DAGs and tasks. 
When I clear tasks from the UI, the DAG parameters are not picked up by the 
cleared tasks.

    For example, I have "max_active_runs=1" in my DAG parameters, but when I 
manually clear the tasks, this parameter is not picked up: the same 
cleared tasks with different schedule times run in parallel. 

   Is there a way to improve this? We may want to backfill some data by 
just clearing the past tasks from the Airflow UI. 

 

Thanks,

Chengning



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] msumit closed pull request #4081: add Neoway to companies list

2018-10-22 Thread GitBox
msumit closed pull request #4081: add Neoway to companies list
URL: https://github.com/apache/incubator-airflow/pull/4081
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/README.md b/README.md
index 266e30f677..271d6a74f7 100644
--- a/README.md
+++ b/README.md
@@ -224,6 +224,7 @@ Currently **officially** using Airflow:
 1. [New Relic](https://www.newrelic.com) 
[[@marcweil](https://github.com/marcweil)]
 1. [Newzoo](https://www.newzoo.com) 
[[@newzoo-nexus](https://github.com/newzoo-nexus)]
 1. [Nextdoor](https://nextdoor.com) 
[[@SivaPandeti](https://github.com/SivaPandeti), 
[@zshapiro](https://github.com/zshapiro) & 
[@jthomas123](https://github.com/jthomas123)]
+1. [Neoway](https://www.neoway.com.br/) 
[[@neowaylabs](https://github.com/orgs/NeowayLabs/people)]
 1. [OdysseyPrime](https://www.goprime.io/) 
[[@davideberdin](https://github.com/davideberdin)]
 1. [OfferUp](https://offerupnow.com)
 1. [OneFineStay](https://www.onefinestay.com) 
[[@slangwald](https://github.com/slangwald)]


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #4083: Airflow 3211

2018-10-22 Thread GitBox
codecov-io edited a comment on issue #4083: Airflow 3211
URL: 
https://github.com/apache/incubator-airflow/pull/4083#issuecomment-432075536
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4083?src=pr=h1)
 Report
   > Merging 
[#4083](https://codecov.io/gh/apache/incubator-airflow/pull/4083?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/829086c8718920b350728e2a126da5db08dea541?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4083/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4083?src=pr=tree)
   
   ```diff
   @@           Coverage Diff            @@
   ##           master     #4083   +/-   ##
   ========================================
     Coverage   77.91%    77.91%           
   ========================================
     Files         199       199           
     Lines       15958     15958           
   ========================================
     Hits        12433     12433           
     Misses       3525      3525
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4083?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4083?src=pr=footer).
 Last update 
[829086c...0b34f56](https://codecov.io/gh/apache/incubator-airflow/pull/4083?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io commented on issue #4083: Airflow 3211

2018-10-22 Thread GitBox
codecov-io commented on issue #4083: Airflow 3211
URL: 
https://github.com/apache/incubator-airflow/pull/4083#issuecomment-432075536
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4083?src=pr=h1)
 Report
   > Merging 
[#4083](https://codecov.io/gh/apache/incubator-airflow/pull/4083?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/829086c8718920b350728e2a126da5db08dea541?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4083/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4083?src=pr=tree)
   
   ```diff
   @@           Coverage Diff            @@
   ##           master     #4083   +/-   ##
   ========================================
     Coverage   77.91%    77.91%           
   ========================================
     Files         199       199           
     Lines       15958     15958           
   ========================================
     Hits        12433     12433           
     Misses       3525      3525
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4083?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4083?src=pr=footer).
 Last update 
[829086c...0b34f56](https://codecov.io/gh/apache/incubator-airflow/pull/4083?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] jj-ian opened a new pull request #4083: Airflow 3211

2018-10-22 Thread GitBox
jj-ian opened a new pull request #4083: Airflow 3211
URL: https://github.com/apache/incubator-airflow/pull/4083
 
 
   This change allows Airflow to reattach to existing Dataproc jobs upon 
scheduler restart. Previously, if the Airflow scheduler restarted while it was 
running a job on GCP Dataproc, it would lose track of that job, mark the task as 
failed, and eventually retry. However, the job may still be running on 
Dataproc and may even finish successfully, so when Airflow retries and reruns 
the job, the same job runs twice. This can result in issues like delayed 
workflows, increased costs, and duplicate data. 
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. 
 - https://issues.apache.org/jira/browse/AIRFLOW-3211
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   My change has Airflow query the Dataproc API before submitting a job to see 
if the job is already running on the cluster. If a job with a matching task ID 
is already running on the cluster AND is in a recoverable state (like RUNNING 
or DONE), then Airflow will reattach itself to the existing job on Dataproc 
instead of resubmitting a new job to the cluster. If the job on the cluster is 
in an irrecoverable state like ERROR, Airflow will resubmit the job.
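   
   A minimal sketch of that decision flow, under stated assumptions: `list_cluster_jobs`, 
`wait_for_job`, and `submit_new_job` are hypothetical helpers standing in for the 
Dataproc API calls, and each job dict is assumed to carry its task ID and state.
   
   ```python
   # Hypothetical sketch of the reattach decision described above, not the PR's code.
   RECOVERABLE_STATES = {'PENDING', 'RUNNING', 'DONE'}

   def submit_or_reattach(task_id, job_spec):
       # Prefer reattaching to a matching job that is in a recoverable state.
       for job in list_cluster_jobs():                     # hypothetical helper
           if job['task_id'] == task_id and job['state'] in RECOVERABLE_STATES:
               print("Reattaching to previously-started DataProc job %s (in state %s)"
                     % (job['name'], job['state']))
               return wait_for_job(job['id'])              # hypothetical helper
       # No recoverable match (none at all, or only ERROR-state jobs): submit anew.
       return wait_for_job(submit_new_job(job_spec))       # hypothetical helper
   ```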
   
   To see this change in action:
   
   Setup:
   1. Set up a GCP Project with the Dataproc API enabled
   2. Install Airflow.
   3. In the box that's running Airflow, `pip install google-api-python-client 
oauth2client`
   4. Start the Airflow webserver. In the Airflow UI, Go to Admin->Connections, 
edit the `google_cloud_default` connection, and fill in the Project Id field 
with your project ID.
   
   To reproduce:
   
   1. Install this DAG in the Airflow instance: 
https://github.com/GoogleCloudPlatform/python-docs-samples/blob/b80895ed88ba86fce223df27a48bf481007ca708/composer/workflows/quickstart.py
 Set up the Airflow variables as instructed at the top of the file.
   2. Start the Airflow scheduler and webserver if they're not running already. 
Kick off a run of the above DAG through the Airflow UI. Wait for the cluster to 
spin up and the job to start running on Dataproc.
   3. While the job's running, kill the scheduler. Wait 5 seconds or so, and 
then start it back up.
   4. Airflow will retry the task and reattach to the existing task already on 
Dataproc. Look at the Airflow logs to observe "Reattaching to 
previously-started DataProc job [JOB NAME HERE] (in state RUNNING)." Click on 
the cluster in Dataproc to observe that only the single job is running; a 
duplicate job has not been submitted.
   5. Observe that, when the job finishes, Airflow detects the completion 
successfully and runs the downstream cluster delete operation.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   Added the following tests to `tests/contrib/hooks/test_gcp_dataproc_hook.py`:
   
   When submitting a new job to Dataproc:
   - If a job with the same task ID is already running on the cluster, don't 
resubmit the job.
   - If the first matching job found on the cluster is in an irrecoverable 
state, keep looking for a job in a recoverable state to reattach to on the 
cluster. This ensures that Airflow will prioritize recoverable jobs when 
looking for jobs to reattach to on the cluster.
   - If there are jobs running on the cluster, but none of them have the same 
task ID as the job we're about to submit, then submit the new job.
   - If there are no other jobs already running on the cluster, then submit the 
job.
   - If a job with the same task ID finished with error on the cluster, then 
resubmit the job for retry.
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, 

[GitHub] codecov-io commented on issue #4082: [AIRFLOW-2865] Call success_callback before updating task state

2018-10-22 Thread GitBox
codecov-io commented on issue #4082: [AIRFLOW-2865] Call success_callback 
before updating task state
URL: 
https://github.com/apache/incubator-airflow/pull/4082#issuecomment-432029716
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4082?src=pr=h1)
 Report
   > Merging 
[#4082](https://codecov.io/gh/apache/incubator-airflow/pull/4082?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/829086c8718920b350728e2a126da5db08dea541?src=pr=desc)
 will **increase** coverage by `<.01%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4082/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4082?src=pr=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master     #4082      +/-   ##
   ==========================================
   + Coverage   77.91%    77.91%   +<.01%     
   ==========================================
     Files         199       199             
     Lines       15958     15957       -1     
   ==========================================
     Hits        12433     12433             
   + Misses       3525      3524       -1
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-airflow/pull/4082?src=pr=tree) 
| Coverage Δ | |
   |---|---|---|
   | 
[airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/4082/diff?src=pr=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=)
 | `92.2% <100%> (+0.03%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4082?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4082?src=pr=footer).
 Last update 
[829086c...8a41998](https://codecov.io/gh/apache/incubator-airflow/pull/4082?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io commented on issue #4081: add Neoway to companies list

2018-10-22 Thread GitBox
codecov-io commented on issue #4081: add Neoway to companies list
URL: 
https://github.com/apache/incubator-airflow/pull/4081#issuecomment-432028987
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4081?src=pr=h1)
 Report
   > Merging 
[#4081](https://codecov.io/gh/apache/incubator-airflow/pull/4081?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/829086c8718920b350728e2a126da5db08dea541?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4081/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4081?src=pr=tree)
   
   ```diff
   @@           Coverage Diff            @@
   ##           master     #4081   +/-   ##
   ========================================
     Coverage   77.91%    77.91%           
   ========================================
     Files         199       199           
     Lines       15958     15958           
   ========================================
     Hits        12433     12433           
     Misses       3525      3525
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4081?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4081?src=pr=footer).
 Last update 
[829086c...00dafa7](https://codecov.io/gh/apache/incubator-airflow/pull/4081?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] codecov-io edited a comment on issue #4081: add Neoway to companies list

2018-10-22 Thread GitBox
codecov-io edited a comment on issue #4081: add Neoway to companies list
URL: 
https://github.com/apache/incubator-airflow/pull/4081#issuecomment-432028987
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4081?src=pr=h1)
 Report
   > Merging 
[#4081](https://codecov.io/gh/apache/incubator-airflow/pull/4081?src=pr=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/829086c8718920b350728e2a126da5db08dea541?src=pr=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/4081/graphs/tree.svg?width=650=WdLKlKHOAU=150=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4081?src=pr=tree)
   
   ```diff
   @@           Coverage Diff            @@
   ##           master     #4081   +/-   ##
   ========================================
     Coverage   77.91%    77.91%           
   ========================================
     Files         199       199           
     Lines       15958     15958           
   ========================================
     Hits        12433     12433           
     Misses       3525      3525
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4081?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4081?src=pr=footer).
 Last update 
[829086c...00dafa7](https://codecov.io/gh/apache/incubator-airflow/pull/4081?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (AIRFLOW-3242) execution_date for TriggerDagRunOperator should be based from Triggering dag

2018-10-22 Thread Feng Zhou (JIRA)
Feng Zhou created AIRFLOW-3242:
--

 Summary: execution_date for TriggerDagRunOperator should be based 
from Triggering dag
 Key: AIRFLOW-3242
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3242
 Project: Apache Airflow
  Issue Type: Bug
  Components: DagRun
Affects Versions: 1.10.0, 1.9.0, 1.8.2
 Environment: any linux / mac os
Reporter: Feng Zhou


TriggerDagRunOperator should pick up execution_date from the context instead of just 
defaulting to today. This breaks backfilling logic when TriggerDagRunOperator is 
used.

Adding one line would address this issue; see the marked line below:

    def execute(self, context):
        ...
        dr = trigger_dag.create_dagrun(
            run_id=dro.run_id,
            state=State.RUNNING,
            execution_date=context['execution_date'],  # suggested addition, around line 70
            conf=dro.payload,
            external_trigger=True)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] ultrabug commented on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-10-22 Thread GitBox
ultrabug commented on issue #2460: [AIRFLOW-1424] make the next execution date 
of DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-431993101
 
 
   Well @ron819 @ashb, thanks for your updates, but I've never heard back from 
upstream about any interest in this, so I will gladly rebase on the CLI 
implementation (which looks very close at a quick glance) if there's a chance 
for it to go in...


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2865) Race condition between on_success_callback and LocalTaskJob's cleanup

2018-10-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659699#comment-16659699
 ] 

ASF GitHub Bot commented on AIRFLOW-2865:
-

evizitei opened a new pull request #4082: [AIRFLOW-2865] Call success_callback 
before updating task state
URL: https://github.com/apache/incubator-airflow/pull/4082
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW-2865/) issues and 
references them in the PR title.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   In cases where the success callback takes variable
   time, it's possible for it to be interrupted by the heartbeat process.
   This is because the heartbeat process looks for tasks that are no
   longer in the "running" state but are still executing and reaps them.
   
   This commit reverses the order of callback invocation and state
   updating so that the "SUCCESS" state for the task isn't committed
   to the database until after the success callback has finished.
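   
   A minimal before/after sketch of the reordering, with simplified stand-ins for 
what `run_raw_task` does (the real method does far more; `ti`, `session` and 
`context` are illustrative):
   
   ```python
   # Illustrative sketch only, not the actual models.py diff.
   from airflow.utils.state import State

   def run_raw_task_old(ti, session, context):
       ti.state = State.SUCCESS
       session.merge(ti)
       session.commit()                        # SUCCESS now visible to the heartbeat,
       ti.task.on_success_callback(context)    # which may reap the process mid-callback

   def run_raw_task_new(ti, session, context):
       ti.task.on_success_callback(context)    # run the callback first
       ti.state = State.SUCCESS
       session.merge(ti)
       session.commit()                        # only now does the heartbeat see SUCCESS
   ```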
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   TaskInstanceTest.test_success_callbak_no_race_condition
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Race condition between on_success_callback and LocalTaskJob's cleanup
> -
>
> Key: AIRFLOW-2865
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2865
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Marcin Mejran
>Priority: Minor
>
> The TaskInstance's run_raw_task method first records SUCCESS for the task 
> instance and then runs the on_success_callback function.
> The LocalTaskJob's heartbeat_callback checks for any TI's with a SUCCESS 
> state and terminates their processes.
> As such it's possible for the TI process to be terminated before the 
> on_success_callback function finishes running.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] evizitei opened a new pull request #4082: [AIRFLOW-2865] Call success_callback before updating task state

2018-10-22 Thread GitBox
evizitei opened a new pull request #4082: [AIRFLOW-2865] Call success_callback 
before updating task state
URL: https://github.com/apache/incubator-airflow/pull/4082
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW-2865/) issues and 
references them in the PR title.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   In cases where the success callback takes variable
   time, it's possible for it to be interrupted by the heartbeat process.
   This is because the heartbeat process looks for tasks that are no
   longer in the "running" state but are still executing and reaps them.
   
   This commit reverses the order of callback invocation and state
   updating so that the "SUCCESS" state for the task isn't committed
   to the database until after the success callback has finished.
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   TaskInstanceTest.test_success_callbak_no_race_condition
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] gseva opened a new pull request #4080: [AIRFLOW-XXX] Make hmsclient import optional

2018-10-22 Thread GitBox
gseva opened a new pull request #4080: [AIRFLOW-XXX] Make hmsclient import 
optional
URL: https://github.com/apache/incubator-airflow/pull/4080
 
 
   
   ### Jira
   
   - No jira issue
   
   ### Description
   
   Currently, to use anything from hive_hooks.py you must have hmsclient 
installed, which is inconsistent: the thrift imports are made inside the 
`get_metastore_client` method, while hmsclient is imported at module level (and 
it does thrift imports internally).
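   
   A minimal sketch of the lazy-import pattern being described (not necessarily this 
PR's exact diff; the class and the `metastore_host`/`metastore_port` attributes 
below are illustrative stand-ins):
   
   ```python
   # Sketch: defer the hmsclient import until a metastore client is actually
   # requested, so merely importing the hooks module does not require hmsclient.
   class MetastoreHookSketch(object):
       def __init__(self, metastore_host, metastore_port):
           self.metastore_host = metastore_host
           self.metastore_port = metastore_port

       def get_metastore_client(self):
           import hmsclient                                 # lazy import
           from thrift.protocol import TBinaryProtocol
           from thrift.transport import TSocket, TTransport

           socket = TSocket.TSocket(self.metastore_host, self.metastore_port)
           transport = TTransport.TBufferedTransport(socket)
           protocol = TBinaryProtocol.TBinaryProtocol(transport)
           return hmsclient.HMSClient(iprot=protocol)
   ```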
   
   ### Tests
   
   Not sure if tests are needed for this change.
   
   ### Documentation
   
   ### Code Quality
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] jghoman commented on issue #4079: [AIRFLOW-XXX] Add Surfline to companies list

2018-10-22 Thread GitBox
jghoman commented on issue #4079: [AIRFLOW-XXX] Add Surfline to companies list
URL: 
https://github.com/apache/incubator-airflow/pull/4079#issuecomment-431966893
 
 
   +1.  Looks good.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] jghoman closed pull request #4079: [AIRFLOW-XXX] Add Surfline to companies list

2018-10-22 Thread GitBox
jghoman closed pull request #4079: [AIRFLOW-XXX] Add Surfline to companies list
URL: https://github.com/apache/incubator-airflow/pull/4079
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/README.md b/README.md
index 4dad5f8327..266e30f677 100644
--- a/README.md
+++ b/README.md
@@ -262,6 +262,7 @@ Currently **officially** using Airflow:
 1. [Stripe](https://stripe.com) [[@jbalogh](https://github.com/jbalogh)]
 1. [Strongmind](https://www.strongmind.com) 
[[@tomchapin](https://github.com/tomchapin) & 
[@wongstein](https://github.com/wongstein)]
 1. [Square](https://squareup.com/)
+1. [Surfline](https://www.surfline.com/) 
[[@jawang35](https://github.com/jawang35)]
 1. [Tails.com](https://tails.com/) 
[[@alanmcruickshank](https://github.com/alanmcruickshank)]
 1. [Tesla](https://www.tesla.com/) 
[[@thoralf-gutierrez](https://github.com/thoralf-gutierrez)]
 1. [The Home 
Depot](https://www.homedepot.com/)[[@apekshithr](https://github.com/apekshithr)]


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] jason-udacity commented on issue #4073: [AIRFLOW-3238] Fix models.DAG.deactivate_unknown_dags

2018-10-22 Thread GitBox
jason-udacity commented on issue #4073: [AIRFLOW-3238] Fix 
models.DAG.deactivate_unknown_dags
URL: 
https://github.com/apache/incubator-airflow/pull/4073#issuecomment-431965073
 
 
   @ashb `upgradedb` does not invoke `deactivate_unknown_dags` as far as I'm 
aware.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] jawang35 opened a new pull request #4079: [AIRFLOW-XXX] Add Surfline to companies list

2018-10-22 Thread GitBox
jawang35 opened a new pull request #4079: [AIRFLOW-XXX] Add Surfline to 
companies list
URL: https://github.com/apache/incubator-airflow/pull/4079
 
 
   ### Jira
   
   - No Jira issue. Add Surfline to companies list.
   
   ### Description
   
   - This PR adds Surfline to the companies list in the `README.md`.
   
   ### Tests
   
   - No tests required. No code changes.
   
   ### Documentation
   
   - No code changes.
   
   ### Code Quality
   
   - No code changes.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] BasPH commented on issue #4071: [AIRFLOW-3237] Refactor example DAGs

2018-10-22 Thread GitBox
BasPH commented on issue #4071: [AIRFLOW-3237] Refactor example DAGs
URL: 
https://github.com/apache/incubator-airflow/pull/4071#issuecomment-431914227
 
 
   I processed your changes except for @KimchaC's suggestion to use the DAG 
context manager, because I'm unsure what the Airflow team thinks about it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Eronarn commented on issue #3584: [AIRFLOW-249] Refactor the SLA mechanism

2018-10-22 Thread GitBox
Eronarn commented on issue #3584: [AIRFLOW-249] Refactor the SLA mechanism
URL: 
https://github.com/apache/incubator-airflow/pull/3584#issuecomment-431896371
 
 
   What's the recommended way to proceed at this point? I'm glad to write more 
test code, but I'd love to have a clear end goal for where this needs to be in 
order to be mergeable.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] oelesinsc24 commented on a change in pull request #4068: [AIRFLOW-2310]: Add AWS Glue Job Compatibility to Airflow

2018-10-22 Thread GitBox
oelesinsc24 commented on a change in pull request #4068: [AIRFLOW-2310]: Add 
AWS Glue Job Compatibility to Airflow
URL: https://github.com/apache/incubator-airflow/pull/4068#discussion_r227038736
 
 

 ##
 File path: airflow/contrib/hooks/aws_glue_job_hook.py
 ##
 @@ -0,0 +1,130 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+
+from airflow.exceptions import AirflowException
+from airflow.contrib.hooks.aws_hook import AwsHook
+import time
+
+
+class AwsGlueJobHook(AwsHook):
+    """
+    Interact with AWS Glue - create job, trigger, crawler
+
+    :param job_name: unique job name per AWS account
+    :type str
+    :param desc: job description
+    :type str
+    :param region_name: aws region name (example: us-east-1)
+    :type region_name: str
 
 Review comment:
   Can you explain what you mean, please? Other AWS service implementations 
already have it like this:
   airflow/contrib/operators/awsbatch_operator.py#L65-71


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] oelesinsc24 commented on a change in pull request #4068: [AIRFLOW-2310]: Add AWS Glue Job Compatibility to Airflow

2018-10-22 Thread GitBox
oelesinsc24 commented on a change in pull request #4068: [AIRFLOW-2310]: Add 
AWS Glue Job Compatibility to Airflow
URL: https://github.com/apache/incubator-airflow/pull/4068#discussion_r227036055
 
 

 ##
 File path: airflow/contrib/hooks/aws_glue_job_hook.py
 ##
 @@ -0,0 +1,130 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+
+from airflow.exceptions import AirflowException
+from airflow.contrib.hooks.aws_hook import AwsHook
+import time
+
+
+class AwsGlueJobHook(AwsHook):
+    """
+    Interact with AWS Glue - create job, trigger, crawler
+
+    :param job_name: unique job name per AWS account
+    :type str
+    :param desc: job description
+    :type str
+    :param region_name: aws region name (example: us-east-1)
+    :type region_name: str
+    """
+
+    def __init__(self,
+                 job_name=None,
+                 desc=None,
+                 aws_conn_id='aws_default',
+                 region_name=None, *args, **kwargs):
+        self.job_name = job_name
+        self.desc = desc
+        self.aws_conn_id = aws_conn_id
+        self.region_name = region_name
+        super(AwsGlueJobHook, self).__init__(*args, **kwargs)
+
+    def get_conn(self):
+        conn = self.get_client_type('glue', self.region_name)
+        return conn
+
+    def list_jobs(self):
+        conn = self.get_conn()
+        return conn.get_jobs()
+
+    def initialize_job(self, script_arguments=None):
+        """
+        Initializes connection with AWS Glue
+        to run job
+        :return:
+        """
+        glue_client = self.get_conn()
+
+        try:
+            job_response = self.get_glue_job()
+            job_name = job_response['Name']
+            job_run = glue_client.start_job_run(
+                JobName=job_name,
+                Arguments=script_arguments
+            )
+            return self.job_completion(job_name, job_run['JobRunId'])
+        except Exception as general_error:
+            raise AirflowException(
+                'Failed to run aws glue job, error: {error}'.format(
+                    error=str(general_error)
+                )
+            )
+
+    def job_completion(self, job_name=None, run_id=None):
+        """
+        :param job_name:
+        :param run_id:
+        :return:
+        """
+        glue_client = self.get_conn()
+        job_status = glue_client.get_job_run(
+            JobName=job_name,
+            RunId=run_id,
+            PredecessorsIncluded=True
+        )
+        job_run_state = job_status['JobRun']['JobRunState']
+        failed = job_run_state == 'FAILED'
+        stopped = job_run_state == 'STOPPED'
+        completed = job_run_state == 'SUCCEEDED'
+
+        while True:
+            if failed or stopped or completed:
+                self.log.info("Exiting Job {} Run State: {}"
+                              .format(run_id, job_run_state))
+                return {'JobRunState': job_run_state, 'JobRunId': run_id}
+            else:
+                self.log.info("Polling for AWS Glue Job {} current run state"
+                              .format(job_name))
+                time.sleep(6)
 
 Review comment:
   @ashb, run job and poll job for completion are two separate methods already.
   
   @Fokko, 6 seconds was an arbitrary number, chosen after looking at the sleep 
times in some existing implementations, e.g.: 
   
   1. BigQuery `5 seconds`
   2. Google Dataproc `5 seconds` 
(airflow/contrib/hooks/gcp_dataproc_hook.py#L70)
   3. Google Dataproc `10 seconds` (another section: 
airflow/contrib/hooks/gcp_dataproc_hook.py#L164)
   
   However, this can be changed. In addition, from my experience with AWS Glue, 
6 seconds is enough to poll for job status


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] oelesinsc24 commented on a change in pull request #4068: [AIRFLOW-2310]: Add AWS Glue Job Compatibility to Airflow

2018-10-22 Thread GitBox
oelesinsc24 commented on a change in pull request #4068: [AIRFLOW-2310]: Add 
AWS Glue Job Compatibility to Airflow
URL: https://github.com/apache/incubator-airflow/pull/4068#discussion_r227033014
 
 

 ##
 File path: tests/contrib/hooks/test_aws_glue_job_hook.py
 ##
 @@ -0,0 +1,84 @@
+# -*- coding: utf-8 -*-
 
 Review comment:
   Currently, moto does not support Glue jobs. I opened a feature request 
(https://github.com/spulec/moto/issues/1561) for this.
   
   I would have contributed, but I'm not sure I have much time right now.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-3234) Enable alerting for dagbag import errors

2018-10-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659408#comment-16659408
 ] 

ASF GitHub Bot commented on AIRFLOW-3234:
-

ajbosco opened a new pull request #4078: [AIRFLOW-3234] add 
dagbag_import_failure_handler
URL: https://github.com/apache/incubator-airflow/pull/4078
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Enable alerting for dagbag import errors
> 
>
> Key: AIRFLOW-3234
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3234
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Adam Boscarino
>Assignee: Adam Boscarino
>Priority: Minor
>
> If a task fails due to being unable to load the DagBag it is set to `failed` 
> without being able to use callbacks or retries. This creates the possibility 
> for "silent" failures. We should have the ability to handle these failures 
> with some other functionality (similar to the SLA callbacks).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] ron819 commented on issue #3563: [AIRFLOW-2698] Simplify Kerberos code

2018-10-22 Thread GitBox
ron819 commented on issue #3563: [AIRFLOW-2698] Simplify Kerberos code
URL: 
https://github.com/apache/incubator-airflow/pull/3563#issuecomment-431841517
 
 
   @Fokko @gglanzani 
   any updates on this? 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ajbosco opened a new pull request #4078: [AIRFLOW-3234] add dagbag_import_failure_handler

2018-10-22 Thread GitBox
ajbosco opened a new pull request #4078: [AIRFLOW-3234] add 
dagbag_import_failure_handler
URL: https://github.com/apache/incubator-airflow/pull/4078
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-XXX
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [ ] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] KimchaC commented on issue #4071: [AIRFLOW-3237] Refactor example DAGs

2018-10-22 Thread GitBox
KimchaC commented on issue #4071: [AIRFLOW-3237] Refactor example DAGs
URL: 
https://github.com/apache/incubator-airflow/pull/4071#issuecomment-431833674
 
 
   Sorry, github is wonky for me today. First it wouldn't let me post comments 
and then it ended up with many duplicates and when I tried to clean them up it 
deleted all of them.
   
   I previously wrote:
   > Consider also changing the code to use the with context manager so that 
you don't have to repeat the dag=dag parameter on each task:
   > ```
   > dag = DAG(
   >     'my_dag',
   >     start_date=datetime(2016, 1, 1))
   > with dag:
   >     op = DummyOperator('op')
   >
   > op.dag is dag # True
   > ```
   
   To which @BasPH replied...
   > @KimchaC I generally see the 50/50 usage of passing dag object vs using 
dag context manager in Airflow code. All example DAGs pass the dag object to 
the operators. Is there a preference for either by the Airflow community?
   
   I think the airflow team should decide on a preference for the community. 
One of the reasons that it is not used by everyone is probably because not 
everyone is aware of this feature. One of the reasons for that is that the 
examples are not using it :)
   
   Personally I think the with statement is awesome, makes the DAG code _much_ 
cleaner and reduces repetition.
   
   I'd suggest adding a comment to the examples like...
   ```
   # The with statement allows you to omit the dag parameter when initializing tasks.
   with dag:
       ...
   ```
   
   I also think the code is clearer when the DAG is instantiated separately and 
not inside the with statement.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] marengaz commented on issue #4056: [AIRFLOW-3207] option to stop task pushing result to xcom

2018-10-22 Thread GitBox
marengaz commented on issue #4056: [AIRFLOW-3207] option to stop task pushing 
result to xcom
URL: 
https://github.com/apache/incubator-airflow/pull/4056#issuecomment-431832931
 
 
   @ashb - we can't use a flag `BaseOperator.xcom_push` because it conflicts 
with the method `BaseOperator.xcom_push()`. I'll change the flag to 
`BaseOperator.do_xcom_push`; it's already like this in some operators.
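   
   A tiny illustration of the name collision, using a toy class as a hypothetical 
stand-in for `BaseOperator`:
   
   ```python
   # Toy stand-in, not the real BaseOperator: the instance attribute set in
   # __init__ shadows the method of the same name on attribute lookup.
   class ToyOperator(object):
       def __init__(self, xcom_push=True):
           self.xcom_push = xcom_push          # attribute named like the method

       def xcom_push(self, key, value):
           print("pushing %s=%r to XCom" % (key, value))

   op = ToyOperator()
   op.xcom_push("k", 1)   # TypeError: 'bool' object is not callable
   ```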


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] BasPH commented on a change in pull request #4071: [AIRFLOW-3237] Refactor example DAGs

2018-10-22 Thread GitBox
BasPH commented on a change in pull request #4071: [AIRFLOW-3237] Refactor 
example DAGs
URL: https://github.com/apache/incubator-airflow/pull/4071#discussion_r226978756
 
 

 ##
 File path: airflow/example_dags/example_bash_operator.py
 ##
 @@ -17,48 +17,54 @@
 # specific language governing permissions and limitations
 # under the License.
 
-import airflow
 from builtins import range
-from airflow.operators.bash_operator import BashOperator
-from airflow.operators.dummy_operator import DummyOperator
-from airflow.models import DAG
 from datetime import timedelta
 
+import airflow
+from airflow.models import DAG
+from airflow.operators.bash_operator import BashOperator
+from airflow.operators.dummy_operator import DummyOperator
 
-args = {
-    'owner': 'airflow',
-    'start_date': airflow.utils.dates.days_ago(2)
-}
+args = {'owner': 'airflow', 'start_date': airflow.utils.dates.days_ago(2)}
 
 Review comment:
   @kaxil will revert to each pair on a separate line.
   @KimchaC I didn't refactor the start_date in any of the example DAGs. I 
imagine a dynamic start_date was used so the example DAGs always start from 
a recent date; a fixed date would result in a large number of DAG runs 
when loading the examples.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] BasPH commented on a change in pull request #4071: [AIRFLOW-3237] Refactor example DAGs

2018-10-22 Thread GitBox
BasPH commented on a change in pull request #4071: [AIRFLOW-3237] Refactor 
example DAGs
URL: https://github.com/apache/incubator-airflow/pull/4071#discussion_r226977084
 
 

 ##
 File path: airflow/example_dags/example_trigger_target_dag.py
 ##
 @@ -62,12 +59,13 @@ def run_this_func(ds, **kwargs):
 task_id='run_this',
 provide_context=True,
 python_callable=run_this_func,
-dag=dag)
-
+dag=dag,
+)
 
 # You can also access the DagRun object in templates
 bash_task = BashOperator(
 task_id="bash_task",
 bash_command='echo "Here is the message: '
- '{{ dag_run.conf["message"] if dag_run else "" }}" ',
-dag=dag)
+'{{ dag_run.conf["message"] if dag_run else "" }}" ',
 
 Review comment:
   My auto formatter (Black) did it that way. Will correct.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] BasPH commented on a change in pull request #4071: [AIRFLOW-3237] Refactor example DAGs

2018-10-22 Thread GitBox
BasPH commented on a change in pull request #4071: [AIRFLOW-3237] Refactor 
example DAGs
URL: https://github.com/apache/incubator-airflow/pull/4071#discussion_r226976404
 
 

 ##
 File path: airflow/example_dags/example_branch_python_dop_operator_3.py
 ##
 @@ -46,17 +48,9 @@ def should_run(ds, **kwargs):
 
 
 cond = BranchPythonOperator(
-    task_id='condition',
-    provide_context=True,
-    python_callable=should_run,
-    dag=dag)
-
-oper_1 = DummyOperator(
-    task_id='oper_1',
-    dag=dag)
-oper_1.set_upstream(cond)
-
-oper_2 = DummyOperator(
-    task_id='oper_2',
-    dag=dag)
-oper_2.set_upstream(cond)
+    task_id='condition', provide_context=True, python_callable=should_run, dag=dag
+)
+
+oper_1 = DummyOperator(task_id='oper_1', dag=dag)
 
 Review comment:
   I didn't refactor the task names but agree it makes more sense. Will fix.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] BasPH commented on a change in pull request #4071: [AIRFLOW-3237] Refactor example DAGs

2018-10-22 Thread GitBox
BasPH commented on a change in pull request #4071: [AIRFLOW-3237] Refactor 
example DAGs
URL: https://github.com/apache/incubator-airflow/pull/4071#discussion_r226975344
 
 

 ##
 File path: airflow/example_dags/example_skip_dag.py
 ##
 @@ -37,23 +33,15 @@ def execute(self, context):
         raise AirflowSkipException
 
 
-dag = DAG(dag_id='example_skip_dag', default_args=args)
-
-
 def create_test_pipeline(suffix, trigger_rule, dag):
-
     skip_operator = DummySkipOperator(task_id='skip_operator_{}'.format(suffix), dag=dag)
-
     always_true = DummyOperator(task_id='always_true_{}'.format(suffix), dag=dag)
-
     join = DummyOperator(task_id=trigger_rule, dag=dag, trigger_rule=trigger_rule)
-
-    join.set_upstream(skip_operator)
-    join.set_upstream(always_true)
-
     final = DummyOperator(task_id='final_{}'.format(suffix), dag=dag)
-    final.set_upstream(join)
+
+    [skip_operator, always_true] >> join >> final
 
 Review comment:
   Whoops, my bad. I checked `BaseOperator.__rshift__` and it accepts a 
single task or a list of tasks as the `other` argument. But obviously calling 
rshift on a Python list doesn't work :-) Will fix.
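   
   For reference, a small sketch of the fan-in written from the operator side, 
where `BaseOperator.__lshift__`/`set_upstream` accept a list (the DAG and task 
names below are illustrative):
   
   ```python
   from datetime import datetime

   from airflow.models import DAG
   from airflow.operators.dummy_operator import DummyOperator

   dag = DAG('lshift_example', start_date=datetime(2016, 1, 1), schedule_interval=None)

   skip_operator = DummyOperator(task_id='skip_operator', dag=dag)
   always_true = DummyOperator(task_id='always_true', dag=dag)
   join = DummyOperator(task_id='join', dag=dag)
   final = DummyOperator(task_id='final', dag=dag)

   # Express the fan-in from the operator, which does accept a list:
   join << [skip_operator, always_true]     # same as join.set_upstream([...])
   join >> final
   ```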


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-10-22 Thread GitBox
ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution 
date of DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-431765827
 
 
   There is a CLI command merged to master by @XD-DENG that shows the next 
execution date
   https://github.com/apache/incubator-airflow/pull/3834
   Maybe this PR can use the logic already merged to master to populate the 
values to the UI.
   
   Also duplicate Jira ticket for this:
   https://issues.apache.org/jira/browse/AIRFLOW-2618


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-10-22 Thread GitBox
ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution 
date of DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-431763957
 
 
   There is a CLI command merged to master by @XD-DENG that shows the next 
execution date
   https://github.com/apache/incubator-airflow/pull/3834
   Maybe this PR can use the logic already merged to master to populate the 
values to the UI.
   
   Also duplicate Jira ticket for this:
   https://issues.apache.org/jira/browse/AIRFLOW-2618
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-10-22 Thread GitBox
ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution 
date of DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-431765691
 
 
   There is a CLI command merged to master by @XD-DENG that shows the next 
execution date
   https://github.com/apache/incubator-airflow/pull/3834
   Maybe this PR can use the logic already merged to master to populate the 
values to the UI.
   
   Also duplicate Jira ticket for this:
   https://issues.apache.org/jira/browse/AIRFLOW-2618
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-10-22 Thread GitBox
ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution 
date of DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-431778262
 
 
   There is a CLI command merged to master by @XD-DENG that shows the next 
execution date
   https://github.com/apache/incubator-airflow/pull/3834
   Maybe this PR can use the logic already merged to master to populate the 
values to the UI.
   
   Also duplicate Jira ticket for this:
   https://issues.apache.org/jira/browse/AIRFLOW-2618


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-10-22 Thread GitBox
ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution 
date of DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-431766806
 
 
   There is a CLI command merged to master by @XD-DENG that shows the next 
execution date
   https://github.com/apache/incubator-airflow/pull/3834
   Maybe this PR can use the logic already merged to master to populate the 
values to the UI.
   
   Also duplicate Jira ticket for this:
   https://issues.apache.org/jira/browse/AIRFLOW-2618


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-10-22 Thread GitBox
ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution 
date of DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-431763613
 
 
   There is a CLI command merged to master by @XD-DENG that shows the next 
execution date
   https://github.com/apache/incubator-airflow/pull/3834
   Maybe this PR can use the logic already merged to master to populate the 
values to the UI.
   
   Also duplicate Jira ticket for this:
   https://issues.apache.org/jira/browse/AIRFLOW-2618
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-10-22 Thread GitBox
ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution 
date of DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-431764312
 
 
   There is a CLI command merged to master by @XD-DENG that shows the next 
execution date
   https://github.com/apache/incubator-airflow/pull/3834
   Maybe this PR can use the logic already merged to master to populate the 
values to the UI.
   
   Also duplicate Jira ticket for this:
   https://issues.apache.org/jira/browse/AIRFLOW-2618
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on a change in pull request #4071: [AIRFLOW-3237] Refactor example DAGs

2018-10-22 Thread GitBox
Fokko commented on a change in pull request #4071: [AIRFLOW-3237] Refactor 
example DAGs
URL: https://github.com/apache/incubator-airflow/pull/4071#discussion_r226970148
 
 

 ##
 File path: airflow/example_dags/example_skip_dag.py
 ##
 @@ -37,23 +33,15 @@ def execute(self, context):
 raise AirflowSkipException
 
 
-dag = DAG(dag_id='example_skip_dag', default_args=args)
-
-
 def create_test_pipeline(suffix, trigger_rule, dag):
-
 skip_operator = 
DummySkipOperator(task_id='skip_operator_{}'.format(suffix), dag=dag)
-
 always_true = DummyOperator(task_id='always_true_{}'.format(suffix), 
dag=dag)
-
 join = DummyOperator(task_id=trigger_rule, dag=dag, 
trigger_rule=trigger_rule)
-
-join.set_upstream(skip_operator)
-join.set_upstream(always_true)
-
 final = DummyOperator(task_id='final_{}'.format(suffix), dag=dag)
-final.set_upstream(join)
+
+[skip_operator, always_true] >> join >> final
 
 Review comment:
   Is this allowed?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on a change in pull request #4071: [AIRFLOW-3237] Refactor example DAGs

2018-10-22 Thread GitBox
Fokko commented on a change in pull request #4071: [AIRFLOW-3237] Refactor 
example DAGs
URL: https://github.com/apache/incubator-airflow/pull/4071#discussion_r226969880
 
 

 ##
 File path: airflow/example_dags/example_branch_python_dop_operator_3.py
 ##
 @@ -46,17 +48,9 @@ def should_run(ds, **kwargs):
 
 
 cond = BranchPythonOperator(
-task_id='condition',
-provide_context=True,
-python_callable=should_run,
-dag=dag)
-
-oper_1 = DummyOperator(
-task_id='oper_1',
-dag=dag)
-oper_1.set_upstream(cond)
-
-oper_2 = DummyOperator(
-task_id='oper_2',
-dag=dag)
-oper_2.set_upstream(cond)
+task_id='condition', provide_context=True, python_callable=should_run, 
dag=dag
+)
+
+oper_1 = DummyOperator(task_id='oper_1', dag=dag)
 
 Review comment:
   Maybe change `oper_1` to `dummy_task_1`? This makes more sense when it shows 
up in the UI.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] KimchaC removed a comment on issue #4071: [AIRFLOW-3237] Refactor example DAGs

2018-10-22 Thread GitBox
KimchaC removed a comment on issue #4071: [AIRFLOW-3237] Refactor example DAGs
URL: 
https://github.com/apache/incubator-airflow/pull/4071#issuecomment-431792918
 
 
   Consider also changing the code to use the with context manager so that you 
don't have to repeat the dag=dag parameter on each task:
   ```
   from datetime import datetime

   from airflow import DAG
   from airflow.operators.dummy_operator import DummyOperator

   dag = DAG(
       'my_dag',
       start_date=datetime(2016, 1, 1))

   with dag:
       op = DummyOperator(task_id='op')

   op.dag is dag  # True
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Fokko commented on a change in pull request #4071: [AIRFLOW-3237] Refactor example DAGs

2018-10-22 Thread GitBox
Fokko commented on a change in pull request #4071: [AIRFLOW-3237] Refactor 
example DAGs
URL: https://github.com/apache/incubator-airflow/pull/4071#discussion_r226970342
 
 

 ##
 File path: airflow/example_dags/example_trigger_target_dag.py
 ##
 @@ -62,12 +59,13 @@ def run_this_func(ds, **kwargs):
 task_id='run_this',
 provide_context=True,
 python_callable=run_this_func,
-dag=dag)
-
+dag=dag,
+)
 
 # You can also access the DagRun object in templates
 bash_task = BashOperator(
 task_id="bash_task",
 bash_command='echo "Here is the message: '
- '{{ dag_run.conf["message"] if dag_run else "" }}" ',
-dag=dag)
+'{{ dag_run.conf["message"] if dag_run else "" }}" ',
 
 Review comment:
   I prefer the original indentation.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] KimchaC removed a comment on issue #4071: [AIRFLOW-3237] Refactor example DAGs

2018-10-22 Thread GitBox
KimchaC removed a comment on issue #4071: [AIRFLOW-3237] Refactor example DAGs
URL: 
https://github.com/apache/incubator-airflow/pull/4071#issuecomment-431793561
 
 
   Consider also changing the code to use the with context manager so that you 
don't have to repeat the dag=dag parameter on each task:
   ```
   from datetime import datetime

   from airflow import DAG
   from airflow.operators.dummy_operator import DummyOperator

   dag = DAG(
       'my_dag',
       start_date=datetime(2016, 1, 1))

   with dag:
       op = DummyOperator(task_id='op')

   op.dag is dag  # True
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] KimchaC removed a comment on issue #4071: [AIRFLOW-3237] Refactor example DAGs

2018-10-22 Thread GitBox
KimchaC removed a comment on issue #4071: [AIRFLOW-3237] Refactor example DAGs
URL: 
https://github.com/apache/incubator-airflow/pull/4071#issuecomment-431795378
 
 
   Consider also changing the code to use the with context manager so that you 
don't have to repeat the dag=dag parameter on each task:
   ```
   from datetime import datetime

   from airflow import DAG
   from airflow.operators.dummy_operator import DummyOperator

   dag = DAG(
       'my_dag',
       start_date=datetime(2016, 1, 1))

   with dag:
       op = DummyOperator(task_id='op')

   op.dag is dag  # True
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] KimchaC removed a comment on issue #4071: [AIRFLOW-3237] Refactor example DAGs

2018-10-22 Thread GitBox
KimchaC removed a comment on issue #4071: [AIRFLOW-3237] Refactor example DAGs
URL: 
https://github.com/apache/incubator-airflow/pull/4071#issuecomment-431793039
 
 
   Consider also changing the code to use the with context manager so that you 
don't have to repeat the dag=dag parameter on each task:
   ```
   from datetime import datetime

   from airflow import DAG
   from airflow.operators.dummy_operator import DummyOperator

   dag = DAG(
       'my_dag',
       start_date=datetime(2016, 1, 1))

   with dag:
       op = DummyOperator(task_id='op')

   op.dag is dag  # True
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] KimchaC commented on a change in pull request #4071: [AIRFLOW-3237] Refactor example DAGs

2018-10-22 Thread GitBox
KimchaC commented on a change in pull request #4071: [AIRFLOW-3237] Refactor 
example DAGs
URL: https://github.com/apache/incubator-airflow/pull/4071#discussion_r226949341
 
 

 ##
 File path: airflow/example_dags/example_bash_operator.py
 ##
 @@ -17,48 +17,54 @@
 # specific language governing permissions and limitations
 # under the License.
 
-import airflow
 from builtins import range
-from airflow.operators.bash_operator import BashOperator
-from airflow.operators.dummy_operator import DummyOperator
-from airflow.models import DAG
 from datetime import timedelta
 
+import airflow
+from airflow.models import DAG
+from airflow.operators.bash_operator import BashOperator
+from airflow.operators.dummy_operator import DummyOperator
 
-args = {
-'owner': 'airflow',
-'start_date': airflow.utils.dates.days_ago(2)
-}
+args = {'owner': 'airflow', 'start_date': airflow.utils.dates.days_ago(2)}
 
 Review comment:
   Also, shouldn't the start_date be a fixed date?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] KimchaC commented on a change in pull request #4071: [AIRFLOW-3237] Refactor example DAGs

2018-10-22 Thread GitBox
KimchaC commented on a change in pull request #4071: [AIRFLOW-3237] Refactor 
example DAGs
URL: https://github.com/apache/incubator-airflow/pull/4071#discussion_r226954622
 
 

 ##
 File path: airflow/example_dags/example_bash_operator.py
 ##
 @@ -17,48 +17,54 @@
 # specific language governing permissions and limitations
 # under the License.
 
-import airflow
 from builtins import range
-from airflow.operators.bash_operator import BashOperator
-from airflow.operators.dummy_operator import DummyOperator
-from airflow.models import DAG
 from datetime import timedelta
 
+import airflow
+from airflow.models import DAG
+from airflow.operators.bash_operator import BashOperator
+from airflow.operators.dummy_operator import DummyOperator
 
-args = {
-'owner': 'airflow',
-'start_date': airflow.utils.dates.days_ago(2)
-}
+args = {'owner': 'airflow', 'start_date': airflow.utils.dates.days_ago(2)}
 
 Review comment:
   Also, shouldn't the start_date be a fixed date?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] KimchaC commented on a change in pull request #4071: [AIRFLOW-3237] Refactor example DAGs

2018-10-22 Thread GitBox
KimchaC commented on a change in pull request #4071: [AIRFLOW-3237] Refactor 
example DAGs
URL: https://github.com/apache/incubator-airflow/pull/4071#discussion_r226949567
 
 

 ##
 File path: airflow/example_dags/example_bash_operator.py
 ##
 @@ -17,48 +17,54 @@
 # specific language governing permissions and limitations
 # under the License.
 
-import airflow
 from builtins import range
-from airflow.operators.bash_operator import BashOperator
-from airflow.operators.dummy_operator import DummyOperator
-from airflow.models import DAG
 from datetime import timedelta
 
+import airflow
+from airflow.models import DAG
+from airflow.operators.bash_operator import BashOperator
+from airflow.operators.dummy_operator import DummyOperator
 
-args = {
-'owner': 'airflow',
-'start_date': airflow.utils.dates.days_ago(2)
-}
+args = {'owner': 'airflow', 'start_date': airflow.utils.dates.days_ago(2)}
 
 Review comment:
   Also, shouldn't the start_date be a fixed date?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] KimchaC commented on a change in pull request #4071: [AIRFLOW-3237] Refactor example DAGs

2018-10-22 Thread GitBox
KimchaC commented on a change in pull request #4071: [AIRFLOW-3237] Refactor 
example DAGs
URL: https://github.com/apache/incubator-airflow/pull/4071#discussion_r226955684
 
 

 ##
 File path: airflow/example_dags/example_bash_operator.py
 ##
 @@ -17,48 +17,54 @@
 # specific language governing permissions and limitations
 # under the License.
 
-import airflow
 from builtins import range
-from airflow.operators.bash_operator import BashOperator
-from airflow.operators.dummy_operator import DummyOperator
-from airflow.models import DAG
 from datetime import timedelta
 
+import airflow
+from airflow.models import DAG
+from airflow.operators.bash_operator import BashOperator
+from airflow.operators.dummy_operator import DummyOperator
 
-args = {
-'owner': 'airflow',
-'start_date': airflow.utils.dates.days_ago(2)
-}
+args = {'owner': 'airflow', 'start_date': airflow.utils.dates.days_ago(2)}
 
 Review comment:
   Also, shouldn't the start_date be a fixed date?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] KimchaC removed a comment on issue #4071: [AIRFLOW-3237] Refactor example DAGs

2018-10-22 Thread GitBox
KimchaC removed a comment on issue #4071: [AIRFLOW-3237] Refactor example DAGs
URL: 
https://github.com/apache/incubator-airflow/pull/4071#issuecomment-431792968
 
 
   Consider also changing the code to use the with context manager so that you 
don't have to repeat the dag=dag parameter on each task:
   ```
   from datetime import datetime

   from airflow import DAG
   from airflow.operators.dummy_operator import DummyOperator

   dag = DAG(
       'my_dag',
       start_date=datetime(2016, 1, 1))

   with dag:
       op = DummyOperator(task_id='op')

   op.dag is dag  # True
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] BasPH commented on issue #4071: [AIRFLOW-3237] Refactor example DAGs

2018-10-22 Thread GitBox
BasPH commented on issue #4071: [AIRFLOW-3237] Refactor example DAGs
URL: 
https://github.com/apache/incubator-airflow/pull/4071#issuecomment-431818293
 
 
   @KimchaC I generally see roughly 50/50 usage of passing the dag object vs 
using the dag context manager in the Airflow code base. All example DAGs pass 
the dag object to the operators. Is there a preference for either in the 
Airflow community?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on issue #4077: [AIRFLOW-461] Restore parameter position for BQ run_load method

2018-10-22 Thread GitBox
kaxil commented on issue #4077: [AIRFLOW-461] Restore parameter position for BQ 
run_load method
URL: 
https://github.com/apache/incubator-airflow/pull/4077#issuecomment-431817142
 
 
   Well, of course.. It was stupid of me.. :D Reverted this
   
   ```
   non-default argument follows default argument
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on issue #4077: [AIRFLOW-461] Restore parameter position for BQ run_load method

2018-10-22 Thread GitBox
ashb commented on issue #4077: [AIRFLOW-461] Restore parameter position for BQ 
run_load method
URL: 
https://github.com/apache/incubator-airflow/pull/4077#issuecomment-431810456
 
 
   This failed flake8 tests with a syntax error.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] felipegasparini edited a comment on issue #4036: [AIRFLOW-2744] Allow RBAC to accept plugins for views and links.

2018-10-22 Thread GitBox
felipegasparini edited a comment on issue #4036: [AIRFLOW-2744] Allow RBAC to 
accept plugins for views and links.
URL: 
https://github.com/apache/incubator-airflow/pull/4036#issuecomment-431781833
 
 
   hey guys,
   
   is there any ETA for this PR?
   
   Btw, could you also update the documentation and release notes for 1.10 to 
clarify that plugins are not supported at the moment in the new RBAC UI? 
   I put some effort into the migration to 1.10 only to roll back because of 
the lack of plugin support.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-2397) Support affinity policies for Kubernetes executor/operator

2018-10-22 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-2397:
---
Fix Version/s: (was: 1.10.0)
   1.10.1

> Support affinity policies for Kubernetes executor/operator
> --
>
> Key: AIRFLOW-2397
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2397
> Project: Apache Airflow
>  Issue Type: Sub-task
>Reporter: Sergio B
>Assignee: roc chan
>Priority: Major
> Fix For: 2.0.0, 1.10.1
>
>
> In order to have fine-grained control over workload distribution, implementing 
> the ability to set affinity policies in Kubernetes would solve complex problems: 
> https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.10/#affinity-v1-core
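
A minimal sketch of what passing such an affinity policy through the pod operator could look like; the label key, values, image and dag are placeholders, and the exact parameter surface is whatever the operator ends up exposing:

```
from airflow import DAG
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator
from airflow.utils.dates import days_ago

dag = DAG('k8s_affinity_example', start_date=days_ago(1), schedule_interval=None)

# Raw Kubernetes affinity spec, shaped as in the API reference linked above.
affinity = {
    'nodeAffinity': {
        'requiredDuringSchedulingIgnoredDuringExecution': {
            'nodeSelectorTerms': [{
                'matchExpressions': [{
                    'key': 'disktype',     # placeholder node label
                    'operator': 'In',
                    'values': ['ssd'],
                }],
            }],
        },
    },
}

task = KubernetesPodOperator(
    task_id='affinity_example',
    name='affinity-example',
    namespace='default',
    image='python:3.6-alpine',
    cmds=['python', '-c', 'print("hello")'],
    affinity=affinity,   # passed through to the pod spec
    dag=dag,
)
```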



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2854) kubernetes_pod_operator add more configuration items

2018-10-22 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-2854:
---
Fix Version/s: 1.10.1

> kubernetes_pod_operator add more configuration items
> 
>
> Key: AIRFLOW-2854
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2854
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Affects Versions: 2.0.0
>Reporter: pengchen
>Assignee: pengchen
>Priority: Minor
> Fix For: 2.0.0, 1.10.1
>
>
> kubernetes_pod_operator is missing several Kubernetes pod-related configuration 
> items, as follows:
> 1. image_pull_secrets
> _Pull secrets_ are used to pull private container images from registries; in 
> this case we need to be able to configure image_pull_secrets in the pod spec.
> 2. service_account_name
> When Kubernetes runs with RBAC authorization, a job that needs to operate on 
> Kubernetes resources must be given a service account.
> 3. is_delete_operator_pod
> This option lets the user decide whether to delete the pod created by the 
> operator, which is currently not handled.
> 4. hostnetwork
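
A rough sketch of how these options might look once exposed on the operator; the image, secret, and service account names are placeholders, not values from the ticket:

```
from airflow import DAG
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator
from airflow.utils.dates import days_ago

dag = DAG('k8s_pod_config_example', start_date=days_ago(1), schedule_interval=None)

pod_task = KubernetesPodOperator(
    task_id='private_image_example',
    name='private-image-example',
    namespace='default',
    image='registry.example.com/team/app:latest',  # placeholder private image
    image_pull_secrets='regcred',                  # 1. secret used to pull the image
    service_account_name='airflow-worker',         # 2. service account for RBAC clusters
    is_delete_operator_pod=True,                   # 3. clean up the pod when the task finishes
    hostnetwork=False,                             # 4. whether the pod joins the host network
    dag=dag,
)
```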



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2662) support affinity & nodeSelector policies for kubernetes executor/operator

2018-10-22 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-2662:
---
Fix Version/s: 1.10.1

> support affinity & nodeSelector policies for kubernetes executor/operator
> -
>
> Key: AIRFLOW-2662
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2662
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Affects Versions: 2.0.0
>Reporter: pengchen
>Assignee: pengchen
>Priority: Minor
> Fix For: 2.0.0, 1.10.1
>
>
> In issue https://issues.apache.org/jira/browse/AIRFLOW-2397, only the affinity 
> feature for the Kubernetes operator pod was provided; affinity for the 
> Kubernetes executor pod is not supported. The full affinity and nodeSelector 
> features are provided here.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-1753) Can't install on windows 10

2018-10-22 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659062#comment-16659062
 ] 

jack commented on AIRFLOW-1753:
---

[~ashb] I think it's worth mentioning in the docs (at least in the FAQ section) 
that Airflow currently can't be installed on Windows. I've seen this question in 
many places.

> Can't install on windows 10
> ---
>
> Key: AIRFLOW-1753
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1753
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.8.0
>Reporter: Lakshman Udayakantha
>Priority: Major
>
> When I installed airflow using the "pip install airflow" command, two errors 
> popped up:
> 1. link.exe failed with exit status 1158
> 2. \x86_amd64\\cl.exe failed with exit status 2
> The first issue can be solved by referring to 
> https://stackoverflow.com/questions/43858836/python-installing-clarifai-vs14-0-link-exe-failed-with-exit-status-1158/44563421#44563421.
> But the second issue is still there, and googling turned up no solution. 
> How can I prevent that issue and install Airflow on Windows 10 x64?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2141) Cannot create airflow variables when there is a list of dictionary as a value

2018-10-22 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658904#comment-16658904
 ] 

jack commented on AIRFLOW-2141:
---

This is related to a similar ticket I opened: 
https://issues.apache.org/jira/browse/AIRFLOW-3157

> Cannot create airflow variables when there is a list of dictionary as a value
> -
>
> Key: AIRFLOW-2141
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2141
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: aws
>Affects Versions: 1.8.0
>Reporter: Soundar
>Priority: Major
>  Labels: beginner, newbie
> Attachments: airflow_cli.png, airflow_cli2_crop.png
>
>
> I'm trying to create Airflow variables using a JSON file. When I import the 
> variables through the UI (webserver) and upload the JSON file, I get the error 
> "Missing file or syntax error", and when I try to upload using the airflow CLI, 
> not all the variables get uploaded properly. The catch is that I have a list of 
> dictionaries in my JSON file, e.g.:
>  {
>  "demo_archivedir": "/home/ubuntu/folders/archive",
>  "demo_filepattern": [
> { "id": "reference", "pattern": "Sample Data.xlsx" }
> ,
> { "id": "sale", "pattern": "Sales.xlsx" }
> ],
>  "demo_sourcepath": "/home/ubuntu/folders/input",
>  "demo_workdir": "/home/ubuntu/folders/working"
>  }
> I've attached two images
> img1. Using the "airflow variables" CLI command I was able to create only some 
> of the variables from my JSON file (airflow_cli.png).
> img2. After inserting logs in the "airflow/bin/cli.py" file, I got the error 
> below (airflow_cli2_crop.png).
> The thing is, I entered these values one by one through the Admin UI and it 
> worked. Then I exported those same variables using the "airflow variables" CLI 
> command and tried importing them; it still failed with the above-mentioned 
> error.
> Note:
>    I am using Python 3.5 with Airflow version 1.8
> The stack trace is as follows
> .compute-1.amazonaws.com:22] out: 0 of 4 variables successfully updated.
> .compute-1.amazonaws.com:22] out: Traceback (most recent call last):
> .compute-1.amazonaws.com:22] out:   File "/home/ubuntu/Env/bin/airflow", line 
> 28, in 
> .compute-1.amazonaws.com:22] out: args.func(args)
> .compute-1.amazonaws.com:22] out:   File 
> "/home/ubuntu/Env/lib/python3.5/site-packages/airflow/bin/cli.py", line 242, 
> in variables
> .compute-1.amazonaws.com:22] out: import_helper(imp)
> .compute-1.amazonaws.com:22] out:   File 
> "/home/ubuntu/Env/lib/python3.5/site-packages/airflow/bin/cli.py", line 273, 
> in import_helper
> .compute-1.amazonaws.com:22] out: Variable.set(k, v)
> .compute-1.amazonaws.com:22] out:   File 
> "/home/ubuntu/Env/lib/python3.5/site-packages/airflow/utils/db.py", line 53, 
> in wrapper
> .compute-1.amazonaws.com:22] out: result = func(*args, **kwargs)
> .compute-1.amazonaws.com:22] out:   File 
> "/home/ubuntu/Env/lib/python3.5/site-packages/airflow/models.py", line 3615, 
> in set
> .compute-1.amazonaws.com:22] out: session.add(Variable(key=key, 
> val=stored_value))
> .compute-1.amazonaws.com:22] out:   File "", line 4, in __init__
> .compute-1.amazonaws.com:22] out:   File 
> "/home/ubuntu/Env/lib/python3.5/site-packages/sqlalchemy/orm/state.py", line 
> 417, in _initialize_instance
> .compute-1.amazonaws.com:22] out: manager.dispatch.init_failure(self, 
> args, kwargs)
> .compute-1.amazonaws.com:22] out:   File 
> "/home/ubuntu/Env/lib/python3.5/site-packages/sqlalchemy/util/langhelpers.py",
>  line 66, in __exit__
> .compute-1.amazonaws.com:22] out: compat.reraise(exc_type, exc_value, 
> exc_tb)
> .compute-1.amazonaws.com:22] out:   File 
> "/home/ubuntu/Env/lib/python3.5/site-packages/sqlalchemy/util/compat.py", 
> line 187, in reraise
> .compute-1.amazonaws.com:22] out: raise value
> .compute-1.amazonaws.com:22] out:   File 
> "/home/ubuntu/Env/lib/python3.5/site-packages/sqlalchemy/orm/state.py", line 
> 414, in _initialize_instance
> .compute-1.amazonaws.com:22] out: return 
> manager.original_init(*mixed[1:], **kwargs)
> .compute-1.amazonaws.com:22] out:   File 
> "/home/ubuntu/Env/lib/python3.5/site-packages/sqlalchemy/ext/declarative/base.py",
>  line 700, in _declarative_constructor
> .compute-1.amazonaws.com:22] out: setattr(self, k, kwargs[k])
> compute-1.amazonaws.com:22] out:   File "", line 1, in __set__
> .compute-1.amazonaws.com:22] out:   File 
> "/home/ubuntu/Env/lib/python3.5/site-packages/airflow/models.py", line 3550, 
> in set_val
> .compute-1.amazonaws.com:22] out: self._val = FERNET.encrypt(bytes(value, 
> 'utf-8')).decode()
> .compute-1.amazonaws.com:22] out: TypeError: encoding without a string 
> argument
> .compute-1.amazonaws.com:22] out:
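
For what it's worth, the failure at the bottom of that trace can be reproduced without Airflow, and serialising the value first is one possible workaround; a sketch, assuming the value is the list shown above:

```
import json

value = [
    {"id": "reference", "pattern": "Sample Data.xlsx"},
    {"id": "sale", "pattern": "Sales.xlsx"},
]

# This is the call that fails inside Variable.set when the value is not a string:
try:
    bytes(value, 'utf-8')
except TypeError as err:
    print(err)  # TypeError: encoding without a string argument

# One workaround is to store the value as a JSON string and decode it when reading:
#   Variable.set("demo_filepattern", json.dumps(value))
#   patterns = Variable.get("demo_filepattern", deserialize_json=True)
```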



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2323) Should we replace the librabbitmq with other library in setup.py for Apache Airflow 1.9+?

2018-10-22 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658901#comment-16658901
 ] 

jack commented on AIRFLOW-2323:
---

It doesn't seem like the librabbitmq lib is going to fix the problems. It's 
barely maintained.

> Should we replace the librabbitmq with other library in setup.py for Apache 
> Airflow 1.9+?
> -
>
> Key: AIRFLOW-2323
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2323
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: A.Quasimodo
>Priority: Major
>
> As we know, the latest librabbitmq still can't support Python 3, so when I 
> executed the command *pip install apache-airflow[rabbitmq]*, some errors 
> happened.
> Should we replace librabbitmq with other libraries like amqplib, py-amqp, etc.?
> Thank you



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3239) Test discovery partial fails due to incorrect name of the test files

2018-10-22 Thread Xiaodong DENG (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658889#comment-16658889
 ] 

Xiaodong DENG commented on AIRFLOW-3239:


Hi [~ashb], thanks for checking on this as well.

For

- tests/operators/bash_operator.py
- tests/operators/operator.py

I'm aware of them, but got CI failures when I tried to simply rename them 
(prepend with "test_"), so haven't fixed them yet in my two earlier PRs. May 
continue on them later. If you have got the solution to fix them, kindly 
proceed.

 

Cheers

> Test discovery partial fails due to incorrect name of the test files
> 
>
> Key: AIRFLOW-3239
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3239
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: tests
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Major
>
> In PR [https://github.com/apache/incubator-airflow/pull/4049,] I have fixed 
> the incorrect name of some test files (resulting in partial failure in test 
> discovery).
> There are some other scripts with this issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (AIRFLOW-3239) Test discovery partial fails due to incorrect name of the test files

2018-10-22 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor reopened AIRFLOW-3239:


> Test discovery partial fails due to incorrect name of the test files
> 
>
> Key: AIRFLOW-3239
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3239
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: tests
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Major
>
> In PR [https://github.com/apache/incubator-airflow/pull/4049,] I have fixed 
> the incorrect name of some test files (resulting in partial failure in test 
> discovery).
> There are some other scripts with this issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3239) Test discovery partial fails due to incorrect name of the test files

2018-10-22 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658880#comment-16658880
 ] 

Ash Berlin-Taylor commented on AIRFLOW-3239:


A couple more files:

tests/api/common/experimental/mark_tasks.py
tests/api/common/experimental/trigger_dag_tests.py
tests/impersonation.py
tests/jobs.py
tests/models.py
tests/plugins_manager.py
tests/utils.py
tests/operators/bash_operator.py
tests/operators/operator.py

I think the ones in models.py are being loaded from tests/\_\_init\_\_.py so 
are being run. But we should remove the need for imports in 
tests/\_\_init\_\_.py et al and name the rest of the files properly

> Test discovery partial fails due to incorrect name of the test files
> 
>
> Key: AIRFLOW-3239
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3239
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: tests
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Major
>
> In PR [https://github.com/apache/incubator-airflow/pull/4049,] I have fixed 
> the incorrect name of some test files (resulting in partial failure in test 
> discovery).
> There are some other scripts with this issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2925) gcp dataflow hook doesn't show traceback

2018-10-22 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658863#comment-16658863
 ] 

jack commented on AIRFLOW-2925:
---

[~xnuinside] where does the log show the exception message "DataFlow failed 
with return code..."?

> gcp dataflow hook doesn't show traceback
> 
>
> Key: AIRFLOW-2925
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2925
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: jack
>Priority: Major
>  Labels: easyfix
>
> The gcp_dataflow_hook.py has:
>  
> {code:python}
> if self._proc.returncode is not 0:
>     raise Exception("DataFlow failed with return code {}".format(
>         self._proc.returncode))
> {code}
>  
> This does not show the full trace of the error which makes it harder to 
> understand the problem.
> [https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/gcp_dataflow_hook.py#L171]
>  
>  
> reported on gitter by Oscar Carlsson
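
A rough sketch of the kind of change being asked for, capturing the process output and attaching it to the exception; the function and attribute names are illustrative, not the hook's actual internals:

```
import subprocess

def _run_dataflow(cmd):
    # Capture stderr so the real failure reason is surfaced,
    # instead of reporting only the return code.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = proc.communicate()
    if proc.returncode != 0:
        raise Exception(
            "DataFlow failed with return code {}: {}".format(
                proc.returncode, stderr.decode("utf-8", errors="replace")))
    return stdout
```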



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-461) BigQuery: Support autodetection of schemas

2018-10-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658861#comment-16658861
 ] 

ASF GitHub Bot commented on AIRFLOW-461:


kaxil closed pull request #4077: [AIRFLOW-461] Restore parameter position for 
BQ run_load method
URL: https://github.com/apache/incubator-airflow/pull/4077
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> BigQuery: Support autodetection of schemas
> --
>
> Key: AIRFLOW-461
> URL: https://issues.apache.org/jira/browse/AIRFLOW-461
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, gcp
>Reporter: Jeremiah Lowin
>Assignee: Iuliia Volkova
>Priority: Major
> Fix For: 2.0.0
>
>
> Add support for autodetecting schemas. Autodetect behavior is described in 
> the documentation for federated data sources here: 
> https://cloud.google.com/bigquery/federated-data-sources#auto-detect but is 
> actually available when loading any CSV or JSON data (not just for federated 
> tables). See the API: 
> https://cloud.google.com/bigquery/docs/reference/v2/jobs#configuration.load.autodetect



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (AIRFLOW-3240) Airflow dags are not working (not starting t1)

2018-10-22 Thread Anonymous (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anonymous reassigned AIRFLOW-3240:
--

Assignee: (was: Ivan Vitoria)

> Airflow dags are not working (not starting t1)
> --
>
> Key: AIRFLOW-3240
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3240
> Project: Apache Airflow
>  Issue Type: Task
>  Components: DAG, DagRun
>Affects Versions: 1.8.0
>Reporter: Pandu
>Priority: Critical
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] kaxil closed pull request #4077: [AIRFLOW-461] Restore parameter position for BQ run_load method

2018-10-22 Thread GitBox
kaxil closed pull request #4077: [AIRFLOW-461] Restore parameter position for 
BQ run_load method
URL: https://github.com/apache/incubator-airflow/pull/4077
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ashb commented on a change in pull request #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-10-22 Thread GitBox
ashb commented on a change in pull request #2460: [AIRFLOW-1424] make the next 
execution date of DAGs visible
URL: https://github.com/apache/incubator-airflow/pull/2460#discussion_r226942255
 
 

 ##
 File path: airflow/models.py
 ##
 @@ -3055,6 +3055,37 @@ def latest_execution_date(self):
 session.close()
 return execution_date
 
+@property
+def next_run_date(self):
+"""
+Returns the next run date for which the dag will be scheduled
+"""
+next_run_date = None
+if not self.latest_execution_date:
+# First run
+task_start_dates = [t.start_date for t in self.tasks]
+if task_start_dates:
+next_run_date = self.normalize_schedule(min(task_start_dates))
+else:
+next_run_date = self.following_schedule(self.latest_execution_date)
+return next_run_date
+
+@property
+def next_execution_date(self):
+"""
+Returns the next execution date at which the dag will be scheduled by
 
 Review comment:
   It's not clear to me how these two methods differ. What's the intent behind 
these two methods?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-2722) ECSOperator requires network configuration parameter when FARGATE launch type is used

2018-10-22 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658857#comment-16658857
 ] 

jack commented on AIRFLOW-2722:
---

[~ThomasVdBerge] this refers to your PR

> ECSOperator requires network configuration parameter when FARGATE launch type 
> is used
> -
>
> Key: AIRFLOW-2722
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2722
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: aws
>Affects Versions: 2.0.0
>Reporter: Craig Forster
>Priority: Major
>
> The 'FARGATE' launch type was added in AIRFLOW-2435, however when using that 
> launch mode the following error is returned:
> {noformat}
> Network Configuration must be provided when networkMode 'awsvpc' is specified.
> {noformat}
> Fargate-launched tasks use the "awsvpc" networking type, and as per the 
> [boto3 
> documentation|http://boto3.readthedocs.io/en/latest/reference/services/ecs.html#ECS.Client.run_task]
>  for run_task:
> {quote}This parameter is required for task definitions that use the awsvpc 
> network mode to receive their own Elastic Network Interface, and it is not 
> supported for other network modes.
> {quote}
> As it's currently implemented, the Fargate launch type is unusable.
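
For context, this is roughly the shape of the boto3 run_task call that the awsvpc network mode requires; the cluster, task definition, subnet, and security group ids are placeholders:

```
import boto3

client = boto3.client('ecs')

response = client.run_task(
    cluster='my-cluster',                 # placeholder cluster name
    taskDefinition='my-task-def',         # placeholder task definition
    launchType='FARGATE',
    networkConfiguration={                # required when networkMode is 'awsvpc'
        'awsvpcConfiguration': {
            'subnets': ['subnet-0123456789abcdef0'],
            'securityGroups': ['sg-0123456789abcdef0'],
            'assignPublicIp': 'ENABLED',
        }
    },
)
```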



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] felipegasparini commented on issue #4036: [AIRFLOW-2744] Allow RBAC to accept plugins for views and links.

2018-10-22 Thread GitBox
felipegasparini commented on issue #4036: [AIRFLOW-2744] Allow RBAC to accept 
plugins for views and links.
URL: 
https://github.com/apache/incubator-airflow/pull/4036#issuecomment-431781833
 
 
   hey guys,
   
   is there any ETA for this PR?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil opened a new pull request #4077: [AIRFLOW-461] Restore parameter position for BQ run_load method

2018-10-22 Thread GitBox
kaxil opened a new pull request #4077: [AIRFLOW-461] Restore parameter position 
for BQ run_load method
URL: https://github.com/apache/incubator-airflow/pull/4077
 
 
   This is just a follow-up PR to 
https://github.com/apache/incubator-airflow/pull/3880, as I didn't have write 
permission to the forked repo. This PR just restores the parameter position of 
the `schema_fields` param.
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-461
   
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-461) BigQuery: Support autodetection of schemas

2018-10-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658845#comment-16658845
 ] 

ASF GitHub Bot commented on AIRFLOW-461:


kaxil opened a new pull request #4077: [AIRFLOW-461] Restore parameter position 
for BQ run_load method
URL: https://github.com/apache/incubator-airflow/pull/4077
 
 
   This is just a follow-up PR to 
https://github.com/apache/incubator-airflow/pull/3880, as I didn't have write 
permission to the forked repo. This PR just restores the parameter position of 
the `schema_fields` param.
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-461
   
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `flake8`
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> BigQuery: Support autodetection of schemas
> --
>
> Key: AIRFLOW-461
> URL: https://issues.apache.org/jira/browse/AIRFLOW-461
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, gcp
>Reporter: Jeremiah Lowin
>Assignee: Iuliia Volkova
>Priority: Major
> Fix For: 2.0.0
>
>
> Add support for autodetecting schemas. Autodetect behavior is described in 
> the documentation for federated data sources here: 
> https://cloud.google.com/bigquery/federated-data-sources#auto-detect but is 
> actually available when loading any CSV or JSON data (not just for federated 
> tables). See the API: 
> https://cloud.google.com/bigquery/docs/reference/v2/jobs#configuration.load.autodetect



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] ashb commented on issue #4073: [AIRFLOW-3238] Fix models.DAG.deactivate_unknown_dags

2018-10-22 Thread GitBox
ashb commented on issue #4073: [AIRFLOW-3238] Fix 
models.DAG.deactivate_unknown_dags
URL: 
https://github.com/apache/incubator-airflow/pull/4073#issuecomment-431780472
 
 
   Does this also apply to `upgradedb`? I counsel people against running initdb 
in production (because it creates all the sample connections, which is often not 
what people want).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil commented on issue #3880: [AIRFLOW-461] Support autodetected schemas in BigQuery run_load

2018-10-22 Thread GitBox
kaxil commented on issue #3880: [AIRFLOW-461]  Support autodetected schemas in 
BigQuery run_load
URL: 
https://github.com/apache/incubator-airflow/pull/3880#issuecomment-431779597
 
 
   @xnuinside There are many people facing this issue, so I think we should get 
this in. Thanks @xnuinside :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kaxil closed pull request #3880: [AIRFLOW-461] Support autodetected schemas in BigQuery run_load

2018-10-22 Thread GitBox
kaxil closed pull request #3880: [AIRFLOW-461]  Support autodetected schemas in 
BigQuery run_load
URL: https://github.com/apache/incubator-airflow/pull/3880
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/hooks/bigquery_hook.py 
b/airflow/contrib/hooks/bigquery_hook.py
index 7a1631b53a..a4d91769c6 100644
--- a/airflow/contrib/hooks/bigquery_hook.py
+++ b/airflow/contrib/hooks/bigquery_hook.py
@@ -851,8 +851,8 @@ def run_copy(self,
 
 def run_load(self,
  destination_project_dataset_table,
- schema_fields,
  source_uris,
+ schema_fields=None,
  source_format='CSV',
  create_disposition='CREATE_IF_NEEDED',
  skip_leading_rows=0,
@@ -866,7 +866,8 @@ def run_load(self,
  schema_update_options=(),
  src_fmt_configs=None,
  time_partitioning=None,
- cluster_fields=None):
+ cluster_fields=None,
+ autodetect=False):
 """
 Executes a BigQuery load command to load data from Google Cloud Storage
 to BigQuery. See here:
@@ -884,7 +885,11 @@ def run_load(self,
 :type destination_project_dataset_table: str
 :param schema_fields: The schema field list as defined here:
 
https://cloud.google.com/bigquery/docs/reference/v2/jobs#configuration.load
+Required if autodetect=False; optional if autodetect=True.
 :type schema_fields: list
+:param autodetect: Attempt to autodetect the schema for CSV and JSON
+source files.
+:type autodetect: bool
 :param source_uris: The source Google Cloud
 Storage URI (e.g. gs://some-bucket/some-file.txt). A single wild
 per-object name can be used.
@@ -941,6 +946,11 @@ def run_load(self,
 # if it's not, we raise a ValueError
 # Refer to this link for more details:
 #   
https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.query.tableDefinitions.(key).sourceFormat
+
+if schema_fields is None and not autodetect:
+raise ValueError(
+'You must either pass a schema or autodetect=True.')
+
 if src_fmt_configs is None:
 src_fmt_configs = {}
 
@@ -975,6 +985,7 @@ def run_load(self,
 
 configuration = {
 'load': {
+'autodetect': autodetect,
 'createDisposition': create_disposition,
 'destinationTable': {
 'projectId': destination_project,
@@ -1592,7 +1603,7 @@ def _split_tablename(table_input, default_project_id, 
var_name=None):
 
 if '.' not in table_input:
 raise ValueError(
-'Expected deletion_dataset_table name in the format of '
+'Expected target table name in the format of '
 '.. Got: {}'.format(table_input))
 
 if not default_project_id:
diff --git a/airflow/contrib/operators/bigquery_operator.py 
b/airflow/contrib/operators/bigquery_operator.py
index fec877db05..caed3befed 100644
--- a/airflow/contrib/operators/bigquery_operator.py
+++ b/airflow/contrib/operators/bigquery_operator.py
@@ -308,7 +308,7 @@ def __init__(self,
  project_id=None,
  schema_fields=None,
  gcs_schema_object=None,
- time_partitioning={},
+ time_partitioning=None,
  bigquery_conn_id='bigquery_default',
  google_cloud_storage_conn_id='google_cloud_default',
  delegate_to=None,
@@ -325,7 +325,7 @@ def __init__(self,
 self.bigquery_conn_id = bigquery_conn_id
 self.google_cloud_storage_conn_id = google_cloud_storage_conn_id
 self.delegate_to = delegate_to
-self.time_partitioning = time_partitioning
+self.time_partitioning = {} if time_partitioning is None else 
time_partitioning
 self.labels = labels
 
 def execute(self, context):
diff --git a/airflow/contrib/operators/gcs_to_bq.py 
b/airflow/contrib/operators/gcs_to_bq.py
index 39dff21606..a98e15a8d6 100644
--- a/airflow/contrib/operators/gcs_to_bq.py
+++ b/airflow/contrib/operators/gcs_to_bq.py
@@ -152,6 +152,7 @@ def __init__(self,
  external_table=False,
  time_partitioning=None,
  cluster_fields=None,
+ autodetect=False,
  *args, **kwargs):
 
 super(GoogleCloudStorageToBigQueryOperator, self).__init__(*args, 
**kwargs)
@@ -190,20 +191,24 @@ def __init__(self,
 self.src_fmt_configs = src_fmt_configs
 self.time_partitioning = time_partitioning
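
A hedged usage sketch of the new flag with the GCS-to-BigQuery operator; the bucket, object pattern, table name, and dag are placeholders, not values from the PR:

```
from airflow import DAG
from airflow.contrib.operators.gcs_to_bq import GoogleCloudStorageToBigQueryOperator
from airflow.utils.dates import days_ago

dag = DAG('gcs_to_bq_autodetect_example', start_date=days_ago(1), schedule_interval=None)

load_events = GoogleCloudStorageToBigQueryOperator(
    task_id='gcs_to_bq_autodetect',
    bucket='my-bucket',                         # placeholder bucket
    source_objects=['exports/events/*.json'],   # placeholder objects
    destination_project_dataset_table='my_dataset.events_table',
    source_format='NEWLINE_DELIMITED_JSON',
    autodetect=True,   # no schema_fields needed when autodetect is enabled
    dag=dag,
)
```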
 

[jira] [Commented] (AIRFLOW-461) BigQuery: Support autodetection of schemas

2018-10-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658823#comment-16658823
 ] 

ASF GitHub Bot commented on AIRFLOW-461:


kaxil closed pull request #3880: [AIRFLOW-461]  Support autodetected schemas in 
BigQuery run_load
URL: https://github.com/apache/incubator-airflow/pull/3880
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/hooks/bigquery_hook.py 
b/airflow/contrib/hooks/bigquery_hook.py
index 7a1631b53a..a4d91769c6 100644
--- a/airflow/contrib/hooks/bigquery_hook.py
+++ b/airflow/contrib/hooks/bigquery_hook.py
@@ -851,8 +851,8 @@ def run_copy(self,
 
 def run_load(self,
  destination_project_dataset_table,
- schema_fields,
  source_uris,
+ schema_fields=None,
  source_format='CSV',
  create_disposition='CREATE_IF_NEEDED',
  skip_leading_rows=0,
@@ -866,7 +866,8 @@ def run_load(self,
  schema_update_options=(),
  src_fmt_configs=None,
  time_partitioning=None,
- cluster_fields=None):
+ cluster_fields=None,
+ autodetect=False):
 """
 Executes a BigQuery load command to load data from Google Cloud Storage
 to BigQuery. See here:
@@ -884,7 +885,11 @@ def run_load(self,
 :type destination_project_dataset_table: str
 :param schema_fields: The schema field list as defined here:
 
https://cloud.google.com/bigquery/docs/reference/v2/jobs#configuration.load
+Required if autodetect=False; optional if autodetect=True.
 :type schema_fields: list
+:param autodetect: Attempt to autodetect the schema for CSV and JSON
+source files.
+:type autodetect: bool
 :param source_uris: The source Google Cloud
 Storage URI (e.g. gs://some-bucket/some-file.txt). A single wild
 per-object name can be used.
@@ -941,6 +946,11 @@ def run_load(self,
 # if it's not, we raise a ValueError
 # Refer to this link for more details:
 #   
https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.query.tableDefinitions.(key).sourceFormat
+
+if schema_fields is None and not autodetect:
+raise ValueError(
+'You must either pass a schema or autodetect=True.')
+
 if src_fmt_configs is None:
 src_fmt_configs = {}
 
@@ -975,6 +985,7 @@ def run_load(self,
 
 configuration = {
 'load': {
+'autodetect': autodetect,
 'createDisposition': create_disposition,
 'destinationTable': {
 'projectId': destination_project,
@@ -1592,7 +1603,7 @@ def _split_tablename(table_input, default_project_id, 
var_name=None):
 
 if '.' not in table_input:
 raise ValueError(
-'Expected deletion_dataset_table name in the format of '
+'Expected target table name in the format of '
 '.. Got: {}'.format(table_input))
 
 if not default_project_id:
diff --git a/airflow/contrib/operators/bigquery_operator.py 
b/airflow/contrib/operators/bigquery_operator.py
index fec877db05..caed3befed 100644
--- a/airflow/contrib/operators/bigquery_operator.py
+++ b/airflow/contrib/operators/bigquery_operator.py
@@ -308,7 +308,7 @@ def __init__(self,
  project_id=None,
  schema_fields=None,
  gcs_schema_object=None,
- time_partitioning={},
+ time_partitioning=None,
  bigquery_conn_id='bigquery_default',
  google_cloud_storage_conn_id='google_cloud_default',
  delegate_to=None,
@@ -325,7 +325,7 @@ def __init__(self,
 self.bigquery_conn_id = bigquery_conn_id
 self.google_cloud_storage_conn_id = google_cloud_storage_conn_id
 self.delegate_to = delegate_to
-self.time_partitioning = time_partitioning
+self.time_partitioning = {} if time_partitioning is None else 
time_partitioning
 self.labels = labels
 
 def execute(self, context):
diff --git a/airflow/contrib/operators/gcs_to_bq.py 
b/airflow/contrib/operators/gcs_to_bq.py
index 39dff21606..a98e15a8d6 100644
--- a/airflow/contrib/operators/gcs_to_bq.py
+++ b/airflow/contrib/operators/gcs_to_bq.py
@@ -152,6 +152,7 @@ def __init__(self,
  external_table=False,
  time_partitioning=None,
  cluster_fields=None,
+ autodetect=False,
  *args, **kwargs):
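
   For readers following along, here is a minimal, hedged sketch of how the new
   flag might be called once this change ships; the connection id, dataset,
   table and GCS paths are illustrative assumptions, not taken from the PR:

   ```
   # Hedged sketch, not part of the PR: calling run_load with autodetect=True
   # instead of an explicit schema. Connection id, table and paths are made up.
   from airflow.contrib.hooks.bigquery_hook import BigQueryHook

   hook = BigQueryHook(bigquery_conn_id='bigquery_default')
   cursor = hook.get_conn().cursor()
   cursor.run_load(
       destination_project_dataset_table='my_dataset.events',
       source_uris=['gs://my-bucket/events/*.json'],
       source_format='NEWLINE_DELIMITED_JSON',
       autodetect=True,                     # let BigQuery infer the schema
       write_disposition='WRITE_TRUNCATE',
   )
   ```

   Passing neither schema_fields nor autodetect=True would raise the new
   ValueError shown in the diff above.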
 
 

[GitHub] ron819 commented on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-10-22 Thread GitBox
ron819 commented on issue #2460: [AIRFLOW-1424] make the next execution date of 
DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-431778262
 
 
   There is a CLI command merged to master by @XD-DENG that shows the next 
execution date
   https://github.com/apache/incubator-airflow/pull/3834
   Maybe this PR can use the logic already merged to master to populate the 
values to the UI.
   
   Also duplicate Jira ticket for this:
   https://issues.apache.org/jira/browse/AIRFLOW-2618


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-10-22 Thread GitBox
ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution 
date of DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-431765946
 
 
   There is a CLI command merged to master by @XD-DENG that shows the next 
execution date
   https://github.com/apache/incubator-airflow/pull/3834
   Maybe this PR can use the logic already merged to master to populate the 
values to the UI.
   
   Also duplicate Jira ticket for this:
   https://issues.apache.org/jira/browse/AIRFLOW-2618


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] xnuinside opened a new pull request #3880: [AIRFLOW-461] Support autodetected schemas in BigQuery run_load

2018-10-22 Thread GitBox
xnuinside opened a new pull request #3880: [AIRFLOW-461]  Support autodetected 
schemas in BigQuery run_load
URL: https://github.com/apache/incubator-airflow/pull/3880
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [x] My PR addresses the following [Airflow 
Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
 - https://issues.apache.org/jira/browse/AIRFLOW-461
 - In case you are fixing a typo in the documentation you can prepend your 
commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   added autodetect to run_load in BigQuery hook and gcs_to_bq Operator
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   ### Code Quality
   
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
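
   For the gcs_to_bq part of the description above, a hedged usage sketch
   (bucket, object and table names are assumptions, not taken from the PR):

   ```
   # Hedged sketch: GoogleCloudStorageToBigQueryOperator with the new autodetect
   # flag, so no schema_fields/schema_object is needed. Names are illustrative.
   from airflow.contrib.operators.gcs_to_bq import GoogleCloudStorageToBigQueryOperator

   load_events = GoogleCloudStorageToBigQueryOperator(
       task_id='gcs_to_bq_autodetect',
       bucket='my-bucket',
       source_objects=['events/2018-10-22/*.json'],
       destination_project_dataset_table='analytics.events',
       source_format='NEWLINE_DELIMITED_JSON',
       autodetect=True,                    # schema inferred by BigQuery
       write_disposition='WRITE_APPEND',
   )
   ```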
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (AIRFLOW-3025) Allow to specify dns and dns-search parameters for DockerOperator

2018-10-22 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3025:
---
Fix Version/s: 1.10.1

> Allow to specify dns and dns-search parameters for DockerOperator
> -
>
> Key: AIRFLOW-3025
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3025
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Konrad Gołuchowski
>Assignee: Konrad Gołuchowski
>Priority: Minor
> Fix For: 2.0.0, 1.10.1
>
>
> Docker allows specifying dns and dns-search options when starting a 
> container. It would be useful to let DockerOperator use these two 
> options.
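
A hedged sketch of what the requested interface could look like; the parameter
names dns and dns_search mirror the docker CLI flags and are assumptions, not
confirmed by this ticket:

```
# Hedged sketch only: parameter names assumed from the docker CLI flags
# (--dns, --dns-search); image, command and addresses are illustrative.
from airflow.operators.docker_operator import DockerOperator

lookup = DockerOperator(
    task_id='lookup_internal_service',
    image='alpine:3.8',
    command='nslookup internal-service',
    dns=['10.0.0.2'],                  # custom resolver inside the container
    dns_search=['corp.example.com'],   # search domain for bare hostnames
)
```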



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (AIRFLOW-3240) Airflow dags are not working (not starting t1)

2018-10-22 Thread Anonymous (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anonymous reassigned AIRFLOW-3240:
--

Assignee: Ivan Vitoria

> Airflow dags are not working (not starting t1)
> --
>
> Key: AIRFLOW-3240
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3240
> Project: Apache Airflow
>  Issue Type: Task
>  Components: DAG, DagRun
>Affects Versions: 1.8.0
>Reporter: Pandu
>Assignee: Ivan Vitoria
>Priority: Critical
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (AIRFLOW-2574) initdb fails when mysql password contains percent sign

2018-10-22 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor resolved AIRFLOW-2574.

Resolution: Fixed

> initdb fails when mysql password contains percent sign
> --
>
> Key: AIRFLOW-2574
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2574
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: db
>Affects Versions: 1.9.0, 1.10.0
>Reporter: Zihao Zhang
>Priority: Minor
> Fix For: 1.10.1
>
>
> [db.py|https://github.com/apache/incubator-airflow/blob/3358551c8e73d9019900f7a85f18ebfd88591450/airflow/utils/db.py#L345]
>  uses 
> [config.set_main_option|http://alembic.zzzcomputing.com/en/latest/api/config.html#alembic.config.Config.set_main_option]
>  which says "A raw percent sign not part of an interpolation symbol must 
> therefore be escaped"
> When there is a percent sign in database connection string, this will crash 
> due to bad interpolation.
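
For reference, a hedged sketch of the usual fix: double any raw percent sign
before handing the URL to Alembic's ConfigParser-backed config. The connection
string below is illustrative only.

```
# Hedged sketch: escape '%' so ConfigParser interpolation does not choke.
from alembic.config import Config

sql_alchemy_conn = 'mysql://airflow:p%ss@localhost/airflow'  # made-up password
config = Config()
config.set_main_option('sqlalchemy.url', sql_alchemy_conn.replace('%', '%%'))
```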



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2574) initdb fails when mysql password contains percent sign

2018-10-22 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658796#comment-16658796
 ] 

jack commented on AIRFLOW-2574:
---

This was fixed and merged.

Can be closed?

> initdb fails when mysql password contains percent sign
> --
>
> Key: AIRFLOW-2574
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2574
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: db
>Affects Versions: 1.9.0, 1.10.0
>Reporter: Zihao Zhang
>Priority: Minor
> Fix For: 1.10.1
>
>
> [db.py|https://github.com/apache/incubator-airflow/blob/3358551c8e73d9019900f7a85f18ebfd88591450/airflow/utils/db.py#L345]
>  uses 
> [config.set_main_option|http://alembic.zzzcomputing.com/en/latest/api/config.html#alembic.config.Config.set_main_option]
>  which says "A raw percent sign not part of an interpolation symbol must 
> therefore be escaped"
> When there is a percent sign in database connection string, this will crash 
> due to bad interpolation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Issue Comment Deleted] (AIRFLOW-2574) initdb fails when mysql password contains percent sign

2018-10-22 Thread jack (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jack updated AIRFLOW-2574:
--
Comment: was deleted

(was: This was fixed and merged.

Can be closed?)

> initdb fails when mysql password contains percent sign
> --
>
> Key: AIRFLOW-2574
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2574
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: db
>Affects Versions: 1.9.0, 1.10.0
>Reporter: Zihao Zhang
>Priority: Minor
> Fix For: 1.10.1
>
>
> [db.py|https://github.com/apache/incubator-airflow/blob/3358551c8e73d9019900f7a85f18ebfd88591450/airflow/utils/db.py#L345]
>  uses 
> [config.set_main_option|http://alembic.zzzcomputing.com/en/latest/api/config.html#alembic.config.Config.set_main_option]
>  which says "A raw percent sign not part of an interpolation symbol must 
> therefore be escaped"
> When there is a percent sign in database connection string, this will crash 
> due to bad interpolation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-2421) HTTPHook and SimpleHTTPOperator do not verify certificates by default

2018-10-22 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-2421:
---
Fix Version/s: 1.10.1
  Component/s: security

> HTTPHook and SimpleHTTPOperator do not verify certificates by default
> -
>
> Key: AIRFLOW-2421
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2421
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks, security
>Affects Versions: 1.8.0
>Reporter: David Adrian
>Priority: Major
> Fix For: 1.10.1
>
>
> To verify HTTPS certificates when using anything built with an HTTP hook, you 
> have to explicitly pass the undocumented {{extra_options = {"verify": True}}}. 
> The offending line is at 
> https://github.com/apache/incubator-airflow/blob/master/airflow/hooks/http_hook.py#L103.
> {code}
> response = session.send(
> 
> verify=extra_options.get("verify", False),
> 
> )
> {code}
> Not only is this the opposite default of what is expected, the necessary 
> requirements to verify certificates (e.g certifi), are already installed as 
> part of Airflow. I haven't dug through all of the code yet, but I'm concerned 
> that any other connections, operators or hooks built using HTTP hook don't 
> pass this option in.
> Instead, the HTTP hook should default to {{verify=True}}
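
A hedged sketch of the workaround described above, i.e. forcing verification on
each call until the default changes; the endpoint and connection id are
illustrative only:

```
# Hedged sketch: explicitly opt in to certificate verification per call.
from airflow.hooks.http_hook import HttpHook

hook = HttpHook(method='GET', http_conn_id='http_default')
response = hook.run(
    endpoint='api/v1/health',
    extra_options={'verify': True},  # must currently be passed explicitly
)
```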



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2421) HTTPHook and SimpleHTTPOperator do not verify certificates by default

2018-10-22 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658789#comment-16658789
 ] 

Ash Berlin-Taylor commented on AIRFLOW-2421:


I think we should change the default to verify=True - not verifying 
certificates is the wrong default value.

Additionally, I think the "default" values for the extra options should come 
from the connection's extra field, with any settings from the per-function 
dict merged in on top.
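
A hedged sketch of that merge order (not actual hook code; the helper name and
structure are assumptions):

```
# Hedged sketch: connection-level extras act as defaults, per-call
# extra_options win; verification defaults to True.
def merged_extra_options(connection, extra_options=None):
    merged = {'verify': True}                     # proposed safer default
    merged.update(connection.extra_dejson or {})  # defaults from the connection
    merged.update(extra_options or {})            # per-call overrides
    return merged
```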



> HTTPHook and SimpleHTTPOperator do not verify certificates by default
> -
>
> Key: AIRFLOW-2421
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2421
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks, security
>Affects Versions: 1.8.0
>Reporter: David Adrian
>Priority: Major
> Fix For: 1.10.1
>
>
> To verify HTTPS certificates when using anything built with an HTTP hook, you 
> have to explicitly pass the undocumented {{extra_options = {"verify": True}}}. 
> The offending line is at 
> https://github.com/apache/incubator-airflow/blob/master/airflow/hooks/http_hook.py#L103.
> {code}
> response = session.send(
> 
> verify=extra_options.get("verify", False),
> 
> )
> {code}
> Not only is this the opposite default of what is expected, the necessary 
> requirements to verify certificates (e.g certifi), are already installed as 
> part of Airflow. I haven't dug through all of the code yet, but I'm concerned 
> that any other connections, operators or hooks built using HTTP hook don't 
> pass this option in.
> Instead, the HTTP hook should default to {{verify=True}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-10-22 Thread GitBox
ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution 
date of DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-431764042
 
 
   There is a CLI command merged to master by @XD-DENG that shows the next 
execution date
   https://github.com/apache/incubator-airflow/pull/3834
   Maybe this PR can use the logic already merged to master to populate the 
values to the UI.
   
   Also duplicate Jira ticket for this:
   https://issues.apache.org/jira/browse/AIRFLOW-2618
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-10-22 Thread GitBox
ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution 
date of DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-431762443
 
 
   There is a CLI command merged to master by @XD-DENG that shows the next 
execution date
   https://github.com/apache/incubator-airflow/pull/3834
   Maybe this PR only needs to be modified to show the result of this CLI 
command on the UI.
   
   Also duplicate Jira ticket for this:
   https://issues.apache.org/jira/browse/AIRFLOW-2618
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-10-22 Thread GitBox
ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution 
date of DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-431766046
 
 
   There is a CLI command merged to master by @XD-DENG that shows the next 
execution date
   https://github.com/apache/incubator-airflow/pull/3834
   Maybe this PR can use the logic already merged to master to populate the 
values to the UI.
   
   Also duplicate Jira ticket for this:
   https://issues.apache.org/jira/browse/AIRFLOW-2618


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-10-22 Thread GitBox
ron819 removed a comment on issue #2460: [AIRFLOW-1424] make the next execution 
date of DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-431762576
 
 
   There is a CLI command merged to master by @XD-DENG that shows the next 
execution date
   https://github.com/apache/incubator-airflow/pull/3834
   Maybe this PR can use the logic already merged to master.
   
   Also duplicate Jira ticket for this:
   https://issues.apache.org/jira/browse/AIRFLOW-2618
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ron819 commented on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-10-22 Thread GitBox
ron819 commented on issue #2460: [AIRFLOW-1424] make the next execution date of 
DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-431765827
 
 
   There is a CLI command merged to master by @XD-DENG that shows the next 
execution date
   https://github.com/apache/incubator-airflow/pull/3834
   Maybe this PR can use the logic already merged to master to populate the 
values to the UI.
   
   Also duplicate Jira ticket for this:
   https://issues.apache.org/jira/browse/AIRFLOW-2618


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ron819 commented on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-10-22 Thread GitBox
ron819 commented on issue #2460: [AIRFLOW-1424] make the next execution date of 
DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-431764312
 
 
   There is a CLI command merged to master by @XD-DENG that shows the next 
execution date
   https://github.com/apache/incubator-airflow/pull/3834
   Maybe this PR can use the logic already merged to master to populate the 
values to the UI.
   
   Also duplicate Jira ticket for this:
   https://issues.apache.org/jira/browse/AIRFLOW-2618
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Closed] (AIRFLOW-2618) Improve UI by add "Next Run" column

2018-10-22 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor closed AIRFLOW-2618.
--
Resolution: Duplicate

> Improve UI by add "Next Run" column
> ---
>
> Key: AIRFLOW-2618
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2618
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ui
>Reporter: jack
>Priority: Minor
>
> Please also add a column in the UI for "Next Run". Ideally, hovering the mouse 
> over it would also show the next 5 scheduled runs.
> This can be very helpful.
> If for some reason you think this is "overhead", why not add it as a 
> "personalized UI" feature where the user can choose whether this column 
> appears or not. Letting users personalize their own UI columns would be a 
> very good feature.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] ron819 commented on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible

2018-10-22 Thread GitBox
ron819 commented on issue #2460: [AIRFLOW-1424] make the next execution date of 
DAGs visible
URL: 
https://github.com/apache/incubator-airflow/pull/2460#issuecomment-431766806
 
 
   There is a CLI command merged to master by @XD-DENG that shows the next 
execution date
   https://github.com/apache/incubator-airflow/pull/3834
   Maybe this PR can use the logic already merged to master to populate the 
values to the UI.
   
   Also duplicate Jira ticket for this:
   https://issues.apache.org/jira/browse/AIRFLOW-2618


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (AIRFLOW-63) Dangling Running Jobs

2018-10-22 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-63?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658765#comment-16658765
 ] 

Ash Berlin-Taylor commented on AIRFLOW-63:
--

Possibly, though if the scheduler process is killed hard (oom, segfault etc) 
there still may be cases where the job remains running. So I think I'd say "not 
quite yet" and this is still possibly an issue (at least not fixed by my PR)

> Dangling Running Jobs
> -
>
> Key: AIRFLOW-63
> URL: https://issues.apache.org/jira/browse/AIRFLOW-63
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.7.0
> Environment: mac os X with local executor
>Reporter: Giacomo Tagliabe
>Priority: Minor
>
> It seems that if the scheduler is killed unexpectedly, the SchedulerJob 
> remains marked as running. The same applies to LocalTaskJob: if a job is 
> running when the scheduler dies, the job remains marked as running forever. 
> I'd expect `kill_zombies` to mark the job with an old heartbeat as not 
> running, but it seems it only marks the related task instances. This seems 
> like a bug to me; I also fail to see the piece of code that is supposed to 
> do that, which leads me to think that this is not handled at all. I don't 
> think there is anything really critical about having stale jobs marked as 
> running, but it is definitely confusing to see.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-1523) Clicking on Graph View should display related DAG run

2018-10-22 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658764#comment-16658764
 ] 

jack commented on AIRFLOW-1523:
---

This is actually quite annoying. I noticed this too.

 

> Clicking on Graph View should display related DAG run
> -
>
> Key: AIRFLOW-1523
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1523
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: webapp
>Reporter: Gediminas Rapolavicius
>Priority: Minor
> Attachments: Screen Shot 2017-08-20 at 10.09.16 PM.png
>
>
> When you are looking at the logs of a task instance (and you got there from 
> the tree view, etc., see the screenshot), clicking on Graph View will take 
> you to the Graph View of the latest DAG run.
> It's very hard to navigate from task instance logs to the related Graph View, 
> where you could see logs of other tasks in the same run, etc.
> I am proposing to change the Graph View link so that it takes you to the 
> Graph View of the same run as the task instance, not the latest.
> I could try to implement this, if the maintainers think it would be useful 
> and could be merged.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-1471) DAGs not deleted from scheduler after DAG file is removed

2018-10-22 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658762#comment-16658762
 ] 

jack commented on AIRFLOW-1471:
---

This is not a bug.

Deleting a DAG file manually doesn't delete it from the DB.

In Airflow 1.10 a delete option was added to the UI, so upgrading your Airflow 
version should give you the ability you are looking for.

> DAGs not deleted from scheduler after DAG file is removed
> -
>
> Key: AIRFLOW-1471
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1471
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler, ui, webserver
>Affects Versions: 1.8.0
> Environment: Centos7, python 3.5.2
>Reporter: Daniel Ochoa
>Priority: Minor
> Attachments: airflow_bug.PNG
>
>
> After I delete a DAG (e.g. by setting load_examples = False or renaming a dag 
> in the dag folder), the DAG no longer shows up in "airflow list_dags", but it 
> shows up greyed out and unclickable in the airflow UI. I tried "airflow 
> resetdb" and the problem persists.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3238) Dags, removed from the filesystem, are not deactivated on initdb

2018-10-22 Thread Ash Berlin-Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ash Berlin-Taylor updated AIRFLOW-3238:
---
Fix Version/s: 1.10.1

> Dags, removed from the filesystem, are not deactivated on initdb
> 
>
> Key: AIRFLOW-3238
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3238
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG
>Reporter: Jason Shao
>Assignee: Jason Shao
>Priority: Major
> Fix For: 1.10.1
>
>
> Removed dags continue to show up in the airflow UI. This can only be 
> remedied, currently, by either deleting the dag or modifying the internal 
> meta db. Fix models.DAG.deactivate_unknown_dags so that removed dags are 
> automatically deactivated (hidden from the UI) on restart.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-461) BigQuery: Support autodetection of schemas

2018-10-22 Thread Kaxil Naik (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658755#comment-16658755
 ] 

Kaxil Naik commented on AIRFLOW-461:


Resolved by https://github.com/apache/incubator-airflow/pull/3880

> BigQuery: Support autodetection of schemas
> --
>
> Key: AIRFLOW-461
> URL: https://issues.apache.org/jira/browse/AIRFLOW-461
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, gcp
>Reporter: Jeremiah Lowin
>Assignee: Iuliia Volkova
>Priority: Major
> Fix For: 2.0.0
>
>
> Add support for autodetecting schemas. Autodetect behavior is described in 
> the documentation for federated data sources here: 
> https://cloud.google.com/bigquery/federated-data-sources#auto-detect but is 
> actually available when loading any CSV or JSON data (not just for federated 
> tables). See the API: 
> https://cloud.google.com/bigquery/docs/reference/v2/jobs#configuration.load.autodetect



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

