incubator-airflow git commit: closes apache/incubator-airflow#3225 *Closed for inactivity*

2018-04-19 Thread sanand
Repository: incubator-airflow
Updated Branches:
  refs/heads/master f1e65c489 -> e6145784e


closes apache/incubator-airflow#3225 *Closed for inactivity*


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/e6145784
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/e6145784
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/e6145784

Branch: refs/heads/master
Commit: e6145784e64c6160e302417fed4474fd580e9a8b
Parents: f1e65c4
Author: r39132 
Authored: Thu Apr 19 18:37:34 2018 -0700
Committer: r39132 
Committed: Thu Apr 19 18:37:34 2018 -0700





[jira] [Assigned] (AIRFLOW-2300) Add S3 Select functionality to S3ToHiveTransfer

2018-04-19 Thread Kengo Seki (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kengo Seki reassigned AIRFLOW-2300:
---

Assignee: Kengo Seki

> Add S3 Select functionality to S3ToHiveTransfer
> ---
>
> Key: AIRFLOW-2300
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2300
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: aws, operators
>Reporter: Kengo Seki
>Assignee: Kengo Seki
>Priority: Major
>
> For the same reason as AIRFLOW-2299, S3ToHiveTransfer should leverage S3 
> Select.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (AIRFLOW-2347) Add Banco de Formaturas new officially using Airflow

2018-04-19 Thread Siddharth Anand (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Anand closed AIRFLOW-2347.

Resolution: Fixed

> Add Banco de Formaturas new officially using Airflow 
> -
>
> Key: AIRFLOW-2347
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2347
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Siddharth Anand
>Assignee: Siddharth Anand
>Priority: Trivial
>






incubator-airflow git commit: [AIRFLOW-2347] Add Banco de Formaturas to Readme

2018-04-19 Thread sanand
Repository: incubator-airflow
Updated Branches:
  refs/heads/master c208a5668 -> f1e65c489


[AIRFLOW-2347] Add Banco de Formaturas to Readme

Make sure you have checked _all_ steps below.

### JIRA
- [x] My PR addresses the following [Airflow JIRA](https://issues.apache.org/jira/browse/AIRFLOW/)
issues and references them in the PR title. For
example, "\[AIRFLOW-XXX\] My Airflow PR"
- https://issues.apache.org/jira/browse/AIRFLOW-2347
- In case you are fixing a typo in the
documentation you can prepend your commit with
\[AIRFLOW-XXX\], code changes always need a JIRA
issue.

### Description
- [x] Here are some details about my PR, including
screenshots of any UI changes:
Add a company to the README

### Tests
- [x] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason: N/A -- documentation update only

### Commits
- [x] My commits all reference JIRA issues in
their subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not
"adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"

### Documentation
- [x] In case of new functionality, my PR adds
documentation that describes how to use it.
- When adding new operators/hooks/sensors, the
autoclass documentation generation needs to be
added.

### Code Quality
- [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`

Closes #3242 from
r39132/Add_banco_Formaturas_to_readme


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/f1e65c48
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/f1e65c48
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/f1e65c48

Branch: refs/heads/master
Commit: f1e65c4897535aa9b97f2ce1ae628eddc6a4a6e5
Parents: c208a56
Author: Sid Anand 
Authored: Thu Apr 19 18:33:33 2018 -0700
Committer: r39132 
Committed: Thu Apr 19 18:33:33 2018 -0700

--
 README.md | 1 +
 1 file changed, 1 insertion(+)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/f1e65c48/README.md
--
diff --git a/README.md b/README.md
index 943919b..5e5193c 100644
--- a/README.md
+++ b/README.md
@@ -93,6 +93,7 @@ Currently **officially** using Airflow:
 1. [Auth0](https://auth0.com) [[@sicarul](https://github.com/sicarul)]
 1. [Away](https://awaytravel.com) [[@trunsky](https://github.com/trunsky)]
 1. [BalanceHero](http://truebalance.io/) [[@swalloow](https://github.com/swalloow)]
+1. [Banco de Formaturas](https://www.bancodeformaturas.com.br) [[@guiligan](https://github.com/guiligan)]
 1. [Azri Solutions](http://www.azrisolutions.com/) [[@userimack](https://github.com/userimack)]
 1. [BandwidthX](http://www.bandwidthx.com) [[@dineshdsharma](https://github.com/dineshdsharma)]
 1. [Bellhops](https://github.com/bellhops)



[jira] [Commented] (AIRFLOW-2347) Add Banco de Formaturas new officially using Airflow

2018-04-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16445091#comment-16445091
 ] 

ASF subversion and git services commented on AIRFLOW-2347:
--

Commit f1e65c4897535aa9b97f2ce1ae628eddc6a4a6e5 in incubator-airflow's branch 
refs/heads/master from Sid Anand
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=f1e65c4 ]

[AIRFLOW-2347] Add Banco de Formaturas to Readme

Closes #3242 from
r39132/Add_banco_Formaturas_to_readme


> Add Banco de Formaturas new officially using Airflow 
> -
>
> Key: AIRFLOW-2347
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2347
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Siddharth Anand
>Assignee: Siddharth Anand
>Priority: Trivial
>






[jira] [Updated] (AIRFLOW-2347) Add Banco de Formaturas new officially using Airflow

2018-04-19 Thread Siddharth Anand (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Anand updated AIRFLOW-2347:
-
External issue URL: https://github.com/apache/incubator-airflow/pull/3242

> Add Banco de Formaturas new officially using Airflow 
> -
>
> Key: AIRFLOW-2347
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2347
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Siddharth Anand
>Assignee: Siddharth Anand
>Priority: Trivial
>






[jira] [Created] (AIRFLOW-2347) Add Banco de Formaturas new officially using Airflow

2018-04-19 Thread Siddharth Anand (JIRA)
Siddharth Anand created AIRFLOW-2347:


 Summary: Add Banco de Formaturas new officially using Airflow 
 Key: AIRFLOW-2347
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2347
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Siddharth Anand
Assignee: Siddharth Anand








[jira] [Closed] (AIRFLOW-2346) Add Investorise as official user of Airflow

2018-04-19 Thread Siddharth Anand (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Anand closed AIRFLOW-2346.

Resolution: Fixed

> Add Investorise as official user of Airflow
> ---
>
> Key: AIRFLOW-2346
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2346
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Siddharth Anand
>Assignee: Siddharth Anand
>Priority: Trivial
>






[jira] [Commented] (AIRFLOW-2346) Add Investorise as official user of Airflow

2018-04-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16445079#comment-16445079
 ] 

ASF subversion and git services commented on AIRFLOW-2346:
--

Commit c208a5668285a4cbd5e1073535f30774e942eac1 in incubator-airflow's branch 
refs/heads/master from Sven Varkel
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=c208a56 ]

[AIRFLOW-2346] Add Investorise as official user of Airflow

Closes #3238 from svenvarkel/master


> Add Investorise as official user of Airflow
> ---
>
> Key: AIRFLOW-2346
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2346
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Siddharth Anand
>Assignee: Siddharth Anand
>Priority: Trivial
>






incubator-airflow git commit: [AIRFLOW-2346] Add Investorise as official user of Airflow

2018-04-19 Thread sanand
Repository: incubator-airflow
Updated Branches:
  refs/heads/master 17d3d1d9d -> c208a5668


[AIRFLOW-2346] Add Investorise as official user of Airflow

Closes #3238 from svenvarkel/master


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/c208a566
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/c208a566
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/c208a566

Branch: refs/heads/master
Commit: c208a5668285a4cbd5e1073535f30774e942eac1
Parents: 17d3d1d
Author: Sven Varkel 
Authored: Thu Apr 19 18:20:31 2018 -0700
Committer: r39132 
Committed: Thu Apr 19 18:20:37 2018 -0700

--
 README.md | 1 +
 1 file changed, 1 insertion(+)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/c208a566/README.md
--
diff --git a/README.md b/README.md
index 6911e32..943919b 100644
--- a/README.md
+++ b/README.md
@@ -155,6 +155,7 @@ Currently **officially** using Airflow:
 1. [imgix](https://www.imgix.com/) [[@dclubb](https://github.com/dclubb)]
 1. [ING](http://www.ing.com/)
 1. [Intercom](http://www.intercom.com/) [[@fox](https://github.com/fox) & [@paulvic](https://github.com/paulvic)]
+1. [Investorise](https://investorise.com/) [[@svenvarkel](https://github.com/svenvarkel)]
 1. [Jampp](https://github.com/jampp)
 1. [JobTeaser](https://www.jobteaser.com) [[@stefani75](https://github.com/stefani75) &  [@knil-sama](https://github.com/knil-sama)]
 1. [Kalibrr](https://www.kalibrr.com/) [[@charlesverdad](https://github.com/charlesverdad)]



[jira] [Updated] (AIRFLOW-2346) Add Investorise as official user of Airflow

2018-04-19 Thread Siddharth Anand (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Anand updated AIRFLOW-2346:
-
External issue URL: https://github.com/apache/incubator-airflow/pull/3238

> Add Investorise as official user of Airflow
> ---
>
> Key: AIRFLOW-2346
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2346
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Siddharth Anand
>Assignee: Siddharth Anand
>Priority: Trivial
>






[jira] [Created] (AIRFLOW-2346) Add Investorise as official user of Airflow

2018-04-19 Thread Siddharth Anand (JIRA)
Siddharth Anand created AIRFLOW-2346:


 Summary: Add Investorise as official user of Airflow
 Key: AIRFLOW-2346
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2346
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Siddharth Anand
Assignee: Siddharth Anand








[jira] [Assigned] (AIRFLOW-2342) DAG in running state but tasks not running

2018-04-19 Thread chidrup jhanjhari (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chidrup jhanjhari reassigned AIRFLOW-2342:
--

Assignee: (was: chidrup jhanjhari)

> DAG in running state but tasks not running
> --
>
> Key: AIRFLOW-2342
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2342
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DagRun
>Affects Versions: Airflow 1.8
> Environment: Redhat
>Reporter: chidrup jhanjhari
>Priority: Major
> Attachments: job1.py.log
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Hi, we are on Airflow 1.8.0. Our Airflow production environment has been 
> running well for 8 months, with no change to the configuration etc. For the 
> past 2 days, DAGs have been showing in the running state but the tasks are not 
> getting triggered. After the default start task, the DAG run does not move to 
> the next task. Attached is the scheduler log, which shows the following error:
> [2018-04-19 01:32:22,586] {jobs.py:354} DagFileProcessor17 ERROR - Got an 
> exception! Propagating...
> Traceback (most recent call last):
>   Any help will be greatly appreciated.





[jira] [Assigned] (AIRFLOW-2342) DAG in running state but tasks not running

2018-04-19 Thread chidrup jhanjhari (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chidrup jhanjhari reassigned AIRFLOW-2342:
--

Assignee: chidrup jhanjhari

> DAG in running state but tasks not running
> --
>
> Key: AIRFLOW-2342
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2342
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DagRun
>Affects Versions: Airflow 1.8
> Environment: Redhat
>Reporter: chidrup jhanjhari
>Assignee: chidrup jhanjhari
>Priority: Major
> Attachments: job1.py.log
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Hi, we are on Airflow 1.8.0. Our Airflow production environment has been 
> running well for 8 months, with no change to the configuration etc. For the 
> past 2 days, DAGs have been showing in the running state but the tasks are not 
> getting triggered. After the default start task, the DAG run does not move to 
> the next task. Attached is the scheduler log, which shows the following error:
> [2018-04-19 01:32:22,586] {jobs.py:354} DagFileProcessor17 ERROR - Got an 
> exception! Propagating...
> Traceback (most recent call last):
>   Any help will be greatly appreciated.





[jira] [Commented] (AIRFLOW-2319) Table "dag_run" has (bad) second index on (dag_id, execution_date)

2018-04-19 Thread JIRA

[ 
https://issues.apache.org/jira/browse/AIRFLOW-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444778#comment-16444778
 ] 

Andreas Költringer commented on AIRFLOW-2319:
-

 * If the index were needed for performance, creating it with a unique 
constraint would not be necessary; it could be an index without uniqueness.
 * Regarding entropy: there is an "id" column, which is the primary key.
 * My understanding is that [the DagRun table was first created with (dag_id, 
execution_date) as primary 
key|https://github.com/apache/incubator-airflow/commit/58519878bba9cf39f9abaf9a2cb016aa1b8f683e]
 and was later refactored. This makes me think that the uniqueness constraint 
on (dag_id, execution_date) is there by accident.
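
The first point can be illustrated with a minimal sketch, assuming an in-memory SQLite database and a reduced dag_run table. The index name and the sample rows are made up for illustration and are not taken from Airflow's migrations. A plain index on (dag_id, execution_date) stays available for lookups while still accepting two runs with the same pair:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced, hypothetical version of dag_run without UNIQUE (dag_id, execution_date).
conn.execute(
    """
    CREATE TABLE dag_run (
        id INTEGER PRIMARY KEY,
        dag_id VARCHAR(250),
        execution_date DATETIME,
        run_id VARCHAR(250),
        UNIQUE (dag_id, run_id)
    )
    """
)

# Plain (non-unique) index: usable for lookups, but does not reject duplicates.
conn.execute(
    "CREATE INDEX idx_dag_run_dag_id_execution_date "
    "ON dag_run (dag_id, execution_date)"
)

rows = [
    ("example_dag", "2018-04-19 00:00:00", "manual_1"),
    ("example_dag", "2018-04-19 00:00:00", "manual_2"),
]
conn.executemany(
    "INSERT INTO dag_run (dag_id, execution_date, run_id) VALUES (?, ?, ?)",
    rows,
)

# Both rows share (dag_id, execution_date) and both were accepted.
duplicate_count = conn.execute(
    "SELECT COUNT(*) FROM dag_run WHERE dag_id = ? AND execution_date = ?",
    ("example_dag", "2018-04-19 00:00:00"),
).fetchone()[0]
print(duplicate_count)
```

Whether the scheduler actually benefits from such an index is a separate question that this sketch does not measure.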

> Table "dag_run" has (bad) second index on (dag_id, execution_date)
> --
>
> Key: AIRFLOW-2319
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2319
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DagRun
>Affects Versions: 1.9.0
>Reporter: Andreas Költringer
>Priority: Major
>
> Inserting DagRuns via {{airflow.api.common.experimental.trigger_dag}} 
> (multiple rows with the same {{(dag_id, execution_date)}}) raised the 
> following error:
> {code:java}
> {models.py:1644} ERROR - No row was found for one(){code}
> This is weird, as {{session.add()}} and {{session.commit()}} are called right 
> before {{run.refresh_from_db()}} in {{models.DAG.create_dagrun()}}.
> Manually inspecting the database revealed that there is an extra index with 
> {{unique}} constraint on the columns {{(dag_id, execution_date)}}:
> {code:java}
> sqlite> .schema dag_run
> CREATE TABLE dag_run (
>     id INTEGER NOT NULL, 
>     dag_id VARCHAR(250), 
>     execution_date DATETIME, 
>     state VARCHAR(50), 
>     run_id VARCHAR(250), 
>     external_trigger BOOLEAN, conf BLOB, end_date DATETIME, start_date 
> DATETIME, 
>     PRIMARY KEY (id), 
>     UNIQUE (dag_id, execution_date), 
>     UNIQUE (dag_id, run_id), 
>     CHECK (external_trigger IN (0, 1))
> );
> CREATE INDEX dag_id_state ON dag_run (dag_id, state);{code}
> (On SQLite it's a unique constraint; on MariaDB it's also an index.)
> The {{DagRun}} class in {{models.py}} does not reflect this; however, it is in 
> [migrations/versions/1b38cef5b76e_add_dagrun.py|https://github.com/apache/incubator-airflow/blob/master/airflow/migrations/versions/1b38cef5b76e_add_dagrun.py#L42].
> I looked for other migrations correcting this, but could not find any. As this 
> is not reflected in the model, I guess this is a bug?
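
The database-level behaviour described in the issue can be reproduced with a minimal sketch, assuming an in-memory SQLite database and a shortened copy of the schema quoted above. The sample dag_id and run_id values are made up. It only shows that the constraint rejects a second row with the same (dag_id, execution_date); it does not reproduce how Airflow surfaces this as the "No row was found for one()" error:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Shortened copy of the quoted schema, keeping both UNIQUE constraints.
conn.execute(
    """
    CREATE TABLE dag_run (
        id INTEGER NOT NULL,
        dag_id VARCHAR(250),
        execution_date DATETIME,
        state VARCHAR(50),
        run_id VARCHAR(250),
        PRIMARY KEY (id),
        UNIQUE (dag_id, execution_date),
        UNIQUE (dag_id, run_id)
    )
    """
)

insert = "INSERT INTO dag_run (dag_id, execution_date, run_id) VALUES (?, ?, ?)"
conn.execute(insert, ("example_dag", "2018-04-19 00:00:00", "manual_1"))

# Same (dag_id, execution_date), different run_id: the first constraint fails.
try:
    conn.execute(insert, ("example_dag", "2018-04-19 00:00:00", "manual_2"))
    second_insert_error = None
except sqlite3.IntegrityError as exc:
    second_insert_error = str(exc)

row_count = conn.execute("SELECT COUNT(*) FROM dag_run").fetchone()[0]
print(second_insert_error, row_count)
```

Only the first row is stored, so any later lookup for the second run would find nothing.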





[jira] [Comment Edited] (AIRFLOW-2319) Table "dag_run" has (bad) second index on (dag_id, execution_date)

2018-04-19 Thread John Arnold (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444711#comment-16444711
 ] 

John Arnold edited comment on AIRFLOW-2319 at 4/19/18 8:14 PM:
---

Also, if you truly need "duplicate" dag runs with the same dag_id and execution 
date, some additional entropy will be needed for uniqueness – e.g. a UUID or 
table id.


was (Author: johnarnold):
Also, if you truly need "duplicate" dag runs with the same dag_id and execution 
date, some additional entropy will be needed for uniqueness – e.g. a dag run id.
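
One way to get that extra entropy is a random UUID in the run identifier. The following is a minimal sketch; make_run_id and its "manual" prefix are hypothetical names used for illustration and are not Airflow's own naming scheme:

```python
import uuid


def make_run_id(prefix: str = "manual") -> str:
    """Hypothetical helper: build a run id with a random UUID suffix."""
    return "{}__{}".format(prefix, uuid.uuid4())


# Three ids generated for the same dag_id and execution date still differ.
run_ids = [make_run_id() for _ in range(3)]
print(run_ids)
```

This keeps (dag_id, run_id) distinct for runs that share an execution date, but it does nothing about a unique constraint on (dag_id, execution_date) itself.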

> Table "dag_run" has (bad) second index on (dag_id, execution_date)
> --
>
> Key: AIRFLOW-2319
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2319
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DagRun
>Affects Versions: 1.9.0
>Reporter: Andreas Költringer
>Priority: Major
>
> Inserting DagRuns via {{airflow.api.common.experimental.trigger_dag}} 
> (multiple rows with the same {{(dag_id, execution_date)}}) raised the 
> following error:
> {code:java}
> {models.py:1644} ERROR - No row was found for one(){code}
> This is weird, as {{session.add()}} and {{session.commit()}} are called right 
> before {{run.refresh_from_db()}} in {{models.DAG.create_dagrun()}}.
> Manually inspecting the database revealed that there is an extra index with 
> {{unique}} constraint on the columns {{(dag_id, execution_date)}}:
> {code:java}
> sqlite> .schema dag_run
> CREATE TABLE dag_run (
>     id INTEGER NOT NULL, 
>     dag_id VARCHAR(250), 
>     execution_date DATETIME, 
>     state VARCHAR(50), 
>     run_id VARCHAR(250), 
>     external_trigger BOOLEAN, conf BLOB, end_date DATETIME, start_date 
> DATETIME, 
>     PRIMARY KEY (id), 
>     UNIQUE (dag_id, execution_date), 
>     UNIQUE (dag_id, run_id), 
>     CHECK (external_trigger IN (0, 1))
> );
> CREATE INDEX dag_id_state ON dag_run (dag_id, state);{code}
> (On SQLite it's a unique constraint; on MariaDB it's also an index.)
> The {{DagRun}} class in {{models.py}} does not reflect this; however, it is in 
> [migrations/versions/1b38cef5b76e_add_dagrun.py|https://github.com/apache/incubator-airflow/blob/master/airflow/migrations/versions/1b38cef5b76e_add_dagrun.py#L42].
> I looked for other migrations correcting this, but could not find any. As this 
> is not reflected in the model, I guess this is a bug?





[jira] [Comment Edited] (AIRFLOW-2319) Table "dag_run" has (bad) second index on (dag_id, execution_date)

2018-04-19 Thread John Arnold (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444711#comment-16444711
 ] 

John Arnold edited comment on AIRFLOW-2319 at 4/19/18 8:14 PM:
---

Also, if you truly need "duplicate" dag runs with the same dag_id and execution 
date, some additional entropy will be needed for uniqueness – e.g. a dag run id.


was (Author: johnarnold):
Also, if you truly need "duplicate" dag runs with the same dag_id and execution 
date, some additional entropy will be needed for uniqueness – e.g. a dag run 
uuid.

> Table "dag_run" has (bad) second index on (dag_id, execution_date)
> --
>
> Key: AIRFLOW-2319
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2319
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DagRun
>Affects Versions: 1.9.0
>Reporter: Andreas Költringer
>Priority: Major
>
> Inserting DagRuns via {{airflow.api.common.experimental.trigger_dag}} 
> (multiple rows with the same {{(dag_id, execution_date)}}) raised the 
> following error:
> {code:java}
> {models.py:1644} ERROR - No row was found for one(){code}
> This is weird, as {{session.add()}} and {{session.commit()}} are called right 
> before {{run.refresh_from_db()}} in {{models.DAG.create_dagrun()}}.
> Manually inspecting the database revealed that there is an extra index with 
> {{unique}} constraint on the columns {{(dag_id, execution_date)}}:
> {code:java}
> sqlite> .schema dag_run
> CREATE TABLE dag_run (
>     id INTEGER NOT NULL, 
>     dag_id VARCHAR(250), 
>     execution_date DATETIME, 
>     state VARCHAR(50), 
>     run_id VARCHAR(250), 
>     external_trigger BOOLEAN, conf BLOB, end_date DATETIME, start_date 
> DATETIME, 
>     PRIMARY KEY (id), 
>     UNIQUE (dag_id, execution_date), 
>     UNIQUE (dag_id, run_id), 
>     CHECK (external_trigger IN (0, 1))
> );
> CREATE INDEX dag_id_state ON dag_run (dag_id, state);{code}
> (On SQLite it's a unique constraint; on MariaDB it's also an index.)
> The {{DagRun}} class in {{models.py}} does not reflect this; however, it is in 
> [migrations/versions/1b38cef5b76e_add_dagrun.py|https://github.com/apache/incubator-airflow/blob/master/airflow/migrations/versions/1b38cef5b76e_add_dagrun.py#L42].
> I looked for other migrations correcting this, but could not find any. As this 
> is not reflected in the model, I guess this is a bug?





[jira] [Commented] (AIRFLOW-2319) Table "dag_run" has (bad) second index on (dag_id, execution_date)

2018-04-19 Thread John Arnold (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444711#comment-16444711
 ] 

John Arnold commented on AIRFLOW-2319:
--

Also, if you truly need "duplicate" dag runs with the same dag_id and execution 
date, some additional entropy will be needed for uniqueness – e.g. a dag run 
uuid.

> Table "dag_run" has (bad) second index on (dag_id, execution_date)
> --
>
> Key: AIRFLOW-2319
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2319
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DagRun
>Affects Versions: 1.9.0
>Reporter: Andreas Költringer
>Priority: Major
>
> Inserting DagRuns via {{airflow.api.common.experimental.trigger_dag}} 
> (multiple rows with the same {{(dag_id, execution_date)}}) raised the 
> following error:
> {code:java}
> {models.py:1644} ERROR - No row was found for one(){code}
> This is weird, as {{session.add()}} and {{session.commit()}} are called right 
> before {{run.refresh_from_db()}} in {{models.DAG.create_dagrun()}}.
> Manually inspecting the database revealed that there is an extra index with 
> {{unique}} constraint on the columns {{(dag_id, execution_date)}}:
> {code:java}
> sqlite> .schema dag_run
> CREATE TABLE dag_run (
>     id INTEGER NOT NULL, 
>     dag_id VARCHAR(250), 
>     execution_date DATETIME, 
>     state VARCHAR(50), 
>     run_id VARCHAR(250), 
>     external_trigger BOOLEAN, conf BLOB, end_date DATETIME, start_date 
> DATETIME, 
>     PRIMARY KEY (id), 
>     UNIQUE (dag_id, execution_date), 
>     UNIQUE (dag_id, run_id), 
>     CHECK (external_trigger IN (0, 1))
> );
> CREATE INDEX dag_id_state ON dag_run (dag_id, state);{code}
> (On SQLite it's a unique constraint; on MariaDB it's also an index.)
> The {{DagRun}} class in {{models.py}} does not reflect this; however, it is in 
> [migrations/versions/1b38cef5b76e_add_dagrun.py|https://github.com/apache/incubator-airflow/blob/master/airflow/migrations/versions/1b38cef5b76e_add_dagrun.py#L42].
> I looked for other migrations correcting this, but could not find any. As this 
> is not reflected in the model, I guess this is a bug?





[jira] [Commented] (AIRFLOW-2319) Table "dag_run" has (bad) second index on (dag_id, execution_date)

2018-04-19 Thread John Arnold (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444653#comment-16444653
 ] 

John Arnold commented on AIRFLOW-2319:
--

IMO, the index is probably needed for performance, as those are the most common 
lookup fields etc.  I would add the index to the model as a bugfix.

> Table "dag_run" has (bad) second index on (dag_id, execution_date)
> --
>
> Key: AIRFLOW-2319
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2319
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DagRun
>Affects Versions: 1.9.0
>Reporter: Andreas Költringer
>Priority: Major
>
> Inserting DagRuns via {{airflow.api.common.experimental.trigger_dag}} 
> (multiple rows with the same {{(dag_id, execution_date)}}) raised the 
> following error:
> {code:java}
> {models.py:1644} ERROR - No row was found for one(){code}
> This is weird, as {{session.add()}} and {{session.commit()}} are called right 
> before {{run.refresh_from_db()}} in {{models.DAG.create_dagrun()}}.
> Manually inspecting the database revealed that there is an extra index with 
> {{unique}} constraint on the columns {{(dag_id, execution_date)}}:
> {code:java}
> sqlite> .schema dag_run
> CREATE TABLE dag_run (
>     id INTEGER NOT NULL, 
>     dag_id VARCHAR(250), 
>     execution_date DATETIME, 
>     state VARCHAR(50), 
>     run_id VARCHAR(250), 
>     external_trigger BOOLEAN, conf BLOB, end_date DATETIME, start_date 
> DATETIME, 
>     PRIMARY KEY (id), 
>     UNIQUE (dag_id, execution_date), 
>     UNIQUE (dag_id, run_id), 
>     CHECK (external_trigger IN (0, 1))
> );
> CREATE INDEX dag_id_state ON dag_run (dag_id, state);{code}
> (On SQLite it's a unique constraint; on MariaDB it's also an index.)
> The {{DagRun}} class in {{models.py}} does not reflect this; however, it is in 
> [migrations/versions/1b38cef5b76e_add_dagrun.py|https://github.com/apache/incubator-airflow/blob/master/airflow/migrations/versions/1b38cef5b76e_add_dagrun.py#L42].
> I looked for other migrations correcting this, but could not find any. As this 
> is not reflected in the model, I guess this is a bug?





[jira] [Commented] (AIRFLOW-2345) pip import is unused in setup.py

2018-04-19 Thread Sam Garrett (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1603#comment-1603
 ] 

Sam Garrett commented on AIRFLOW-2345:
--

I have created a PR for this here: 
https://github.com/apache/incubator-airflow/pull/3241

> pip import is unused in setup.py
> 
>
> Key: AIRFLOW-2345
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2345
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Sam Garrett
>Assignee: Sam Garrett
>Priority: Minor
>
> pip is unnecessarily imported here in the current master branch of airflow: 
> [https://github.com/apache/incubator-airflow/blob/master/setup.py#L26]
>  
> It should be removed in this case.





[jira] [Work started] (AIRFLOW-2345) pip import is unused in setup.py

2018-04-19 Thread Sam Garrett (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AIRFLOW-2345 started by Sam Garrett.

> pip import is unused in setup.py
> 
>
> Key: AIRFLOW-2345
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2345
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Sam Garrett
>Assignee: Sam Garrett
>Priority: Minor
>
> pip is unnecessarily imported here in the current master branch of airflow: 
> [https://github.com/apache/incubator-airflow/blob/master/setup.py#L26]
>  
> It should be removed in this case.





[jira] [Created] (AIRFLOW-2345) pip import is unused in setup.py

2018-04-19 Thread Sam Garrett (JIRA)
Sam Garrett created AIRFLOW-2345:


 Summary: pip import is unused in setup.py
 Key: AIRFLOW-2345
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2345
 Project: Apache Airflow
  Issue Type: Improvement
Reporter: Sam Garrett
Assignee: Sam Garrett


pip is unnecessarily imported here in the current master branch of airflow: 
[https://github.com/apache/incubator-airflow/blob/master/setup.py#L26]

 

It should be removed in this case.





[jira] [Commented] (AIRFLOW-630) Airflow worker is not working with Celery 4.0.0

2018-04-19 Thread Luke Bodeen (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444204#comment-16444204
 ] 

Luke Bodeen commented on AIRFLOW-630:
-

This should be closed, as many people now run Celery 4 with Airflow 1.9.

> Airflow worker is not working with Celery 4.0.0
> ---
>
> Key: AIRFLOW-630
> URL: https://issues.apache.org/jira/browse/AIRFLOW-630
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: celery
>Affects Versions: Airflow 1.7.1.2, Airflow 1.7.1.3
>Reporter: Hafiz Badrie Lubis
>Priority: Major
>
> As soon as Celery is upgraded to 4.0.0, the Airflow worker stops working 
> because the loglevel value is None. The error log is shown in this 
> image: http://imgur.com/JHedHeN. 
> The loglevel value assignment should be made more flexible.





[jira] [Commented] (AIRFLOW-1268) Celery bug can cause tasks to be delayed indefinitely

2018-04-19 Thread Luke Bodeen (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444196#comment-16444196
 ] 

Luke Bodeen commented on AIRFLOW-1268:
--

That Celery issue now shows as fixed in 4.2.

> Celery bug can cause tasks to be delayed indefinitely
> -
>
> Key: AIRFLOW-1268
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1268
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: celery
> Environment: With celery_executor with redis
>Reporter: Alex Guziel
>Priority: Critical
>
> With Celery, tasks can be delayed indefinitely (or for 1 hour by default) due to a 
> Celery bug; see https://github.com/celery/celery/issues/3765





[jira] [Resolved] (AIRFLOW-2330) GoogleCloudStorageToGoogleCloudStorageOperator on wildcard appends destination_object even when not given

2018-04-19 Thread Fokko Driesprong (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong resolved AIRFLOW-2330.
---
   Resolution: Fixed
Fix Version/s: 2.0.0

Issue resolved by pull request #3233
[https://github.com/apache/incubator-airflow/pull/3233]

> GoogleCloudStorageToGoogleCloudStorageOperator on wildcard appends 
> destination_object even when not given
> -
>
> Key: AIRFLOW-2330
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2330
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Berislav Lopac
>Assignee: Berislav Lopac
>Priority: Major
> Fix For: 2.0.0
>
>
> Currently, the operator builds the destination like this:
> {code}
> hook.copy(self.source_bucket, source_object,
>   self.destination_bucket, "{}/{}".format(self.destination_object,
>   source_object))
> {code}
> If destination is {{None}} (the default) the file will land in 
> {{None/\{source_object\}}}, and if it's an empty string it goes to 
> {{/\{source_object\}}}. Basically, it should not prepend 
> {{destination_object}} if it's empty.
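
The behaviour described above can be reproduced with plain string formatting. The following is a minimal sketch, assuming a hypothetical helper named `build_destination`; it mirrors the condition the issue asks for and is not the operator's actual code.

```python
def build_destination(destination_object, source_object):
    """Return the destination path for one wildcard match (hypothetical helper)."""
    # Prepend the prefix only when one was actually given.
    if destination_object:
        return "{}/{}".format(destination_object, source_object)
    return source_object


# Unconditional formatting, as in the snippet quoted in the issue:
buggy_none = "{}/{}".format(None, "data/a.csv")   # prefix left at its default
buggy_empty = "{}/{}".format("", "data/a.csv")    # prefix set to an empty string
```

Here `buggy_none` is `None/data/a.csv` and `buggy_empty` is `/data/a.csv`, while `build_destination` returns `data/a.csv` in both cases and `backup/data/a.csv` when the prefix is `backup`.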





[jira] [Commented] (AIRFLOW-2330) GoogleCloudStorageToGoogleCloudStorageOperator on wildcard appends destination_object even when not given

2018-04-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/AIRFLOW-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16443725#comment-16443725
 ] 

ASF subversion and git services commented on AIRFLOW-2330:
--

Commit 17d3d1d9dc87c0bbb03de049607c2ad76a4fd747 in incubator-airflow's branch 
refs/heads/master from [~b11c]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=17d3d1d ]

[AIRFLOW-2330] Do not append destination prefix if not given

Closes #3233 from berislavlopac/AIRFLOW-2330


> GoogleCloudStorageToGoogleCloudStorageOperator on wildcard appends 
> destination_object even when not given
> -
>
> Key: AIRFLOW-2330
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2330
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Berislav Lopac
>Assignee: Berislav Lopac
>Priority: Major
> Fix For: 2.0.0
>
>
> Currently, the operator builds the destination like this:
> {code}
> hook.copy(self.source_bucket, source_object,
>   self.destination_bucket, "{}/{}".format(self.destination_object,
>   source_object))
> {code}
> If destination is {{None}} (the default) the file will land in 
> {{None/\{source_object\}}}, and if it's an empty string it goes to 
> {{/\{source_object\}}}. Basically, it should not prepend 
> {{destination_object}} if it's empty.





incubator-airflow git commit: [AIRFLOW-2330] Do not append destination prefix if not given

2018-04-19 Thread fokko
Repository: incubator-airflow
Updated Branches:
  refs/heads/master e95a1251b -> 17d3d1d9d


[AIRFLOW-2330] Do not append destination prefix if not given

Closes #3233 from berislavlopac/AIRFLOW-2330


Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/17d3d1d9
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/17d3d1d9
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/17d3d1d9

Branch: refs/heads/master
Commit: 17d3d1d9dc87c0bbb03de049607c2ad76a4fd747
Parents: e95a125
Author: Berislav Lopac 
Authored: Thu Apr 19 10:26:23 2018 +0200
Committer: Fokko Driesprong 
Committed: Thu Apr 19 10:26:23 2018 +0200

--
 airflow/contrib/operators/gcs_to_gcs.py | 41 --
 .../operators/test_gcs_to_gcs_operator.py   | 58 +++-
 2 files changed, 81 insertions(+), 18 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/17d3d1d9/airflow/contrib/operators/gcs_to_gcs.py
--
diff --git a/airflow/contrib/operators/gcs_to_gcs.py b/airflow/contrib/operators/gcs_to_gcs.py
index dc67ddc..6acc517 100644
--- a/airflow/contrib/operators/gcs_to_gcs.py
+++ b/airflow/contrib/operators/gcs_to_gcs.py
@@ -7,9 +7,9 @@
 # to you under the Apache License, Version 2.0 (the
 # "License"); you may not use this file except in compliance
 # with the License.  You may obtain a copy of the License at
-# 
+#
 #   http://www.apache.org/licenses/LICENSE-2.0
-# 
+#
 # Unless required by applicable law or agreed to in writing,
 # software distributed under the License is distributed on an
 # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
@@ -24,7 +24,7 @@ from airflow.utils.decorators import apply_defaults
 
 class GoogleCloudStorageToGoogleCloudStorageOperator(BaseOperator):
     """
-    Copies an object from a bucket to another, with renaming if requested.
+    Copies objects from a bucket to another, with renaming if requested.
 
     :param source_bucket: The source Google cloud storage bucket where the object is.
     :type source_bucket: string
@@ -43,8 +43,7 @@ class GoogleCloudStorageToGoogleCloudStorageOperator(BaseOperator):
         destination Google cloud
         storage bucket.
         If a wildcard is supplied in the source_object argument, this is the
-        folder that the files will be
-        copied to in the destination bucket.
+        prefix that will be prepended to the final destination objects' paths.
     :type destination_object: string
     :param move_object: When move object is True, the object is moved instead
         of copied to the new location.
@@ -96,24 +95,34 @@ class GoogleCloudStorageToGoogleCloudStorageOperator(BaseOperator):
             objects = hook.list(self.source_bucket,
                                 prefix=self.source_object[:wildcard_position],
                                 delimiter=self.source_object[wildcard_position + 1:])
+
             for source_object in objects:
+                if self.destination_object:
+                    destination_object = "{}/{}".format(self.destination_object,
+                                                        source_object)
+                else:
+                    destination_object = source_object
                 self.log.info('Executing copy of gs://{0}/{1} to '
-                              'gs://{2}/{3}/{1}'.format(self.source_bucket,
-                                                        source_object,
-                                                        self.destination_bucket,
-                                                        self.destination_object,
-                                                        source_object))
+                              'gs://{2}/{3}'.format(self.source_bucket,
+                                                    source_object,
+                                                    self.destination_bucket,
+                                                    destination_object))
+
                 hook.copy(self.source_bucket, source_object,
-                          self.destination_bucket, "{}/{}".format(self.destination_object,
-                                                                  source_object))
+                          self.destination_bucket, destination_object)
                 if self.move_object:
                     hook.delete(self.source_bucket, source_object)
 
         else:
-            self.log.info('Executing copy: %s, %s, %s, %s', self.source_bucket,
-                          self.source_object,
-                          self.destination_bucket or self.source_bucket,
-  

[jira] [Created] (AIRFLOW-2344) Fix `airflow connections -l` to work with pipe and redirect

2018-04-19 Thread Kengo Seki (JIRA)
Kengo Seki created AIRFLOW-2344:
---

 Summary: Fix `airflow connections -l` to work with pipe and redirect
 Key: AIRFLOW-2344
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2344
 Project: Apache Airflow
  Issue Type: Bug
  Components: cli
Affects Versions: 1.9.0
Reporter: Kengo Seki


{{airflow connections -l}} fails when its output is piped or redirected, e.g.:

{code}
$ airflow connections -l > foo
Traceback (most recent call last):
  File "/home/sekikn/.virtualenvs/a/bin/airflow", line 6, in <module>
    exec(compile(open(__file__).read(), __file__, 'exec'))
  File "/home/sekikn/dev/incubator-airflow/airflow/bin/airflow", line 32, in <module>
    args.func(args)
  File "/home/sekikn/dev/incubator-airflow/airflow/utils/cli.py", line 77, in wrapper
    raise e
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-141: ordinal not in range(128)
{code}
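
The traceback is consistent with Python 2 falling back to the ASCII codec when stdout is not a terminal. The sketch below imitates that situation with an in-memory stream. It assumes the command's output contains non-ASCII characters such as table borders; `render_to_stream` and `border` are hypothetical names, and this is not the fix applied in Airflow.

```python
import io


def render_to_stream(text, encoding):
    """Write text through a text stream with the given encoding; return the bytes."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding)
    stream.write(text)
    stream.flush()
    return raw.getvalue()


# Hypothetical table border made of box-drawing characters.
border = u"\u2552" + u"\u2550" * 10 + u"\u2555"

# An ASCII-only stream, like a redirected stdout under Python 2, rejects the text.
try:
    render_to_stream(border, "ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# The same text goes through once the stream encodes as UTF-8.
utf8_bytes = render_to_stream(border, "utf-8")
```

Choosing an explicit encoding for the output stream, rather than relying on what the interpreter detects, is one way to make the output independent of whether it goes to a terminal, a pipe, or a file.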





[jira] [Assigned] (AIRFLOW-273) Vectorized Logos

2018-04-19 Thread Anonymous (JIRA)

 [ 
https://issues.apache.org/jira/browse/AIRFLOW-273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anonymous reassigned AIRFLOW-273:
-

Assignee: Ivan Vitoria  (was: George Leslie-Waksman)

> Vectorized Logos
> 
>
> Key: AIRFLOW-273
> URL: https://issues.apache.org/jira/browse/AIRFLOW-273
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: George Leslie-Waksman
>Assignee: Ivan Vitoria
>Priority: Trivial
>
> There has been interest on the mailing list in a SVG version of the logo.


