[jira] [Assigned] (AIRFLOW-5761) Check and document that docker-compose >= 1.20 is needed to run breeze

2021-02-22 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reassigned AIRFLOW-5761:
-

Assignee: Abhik Chakraborty  (was: Abhishek Kandoi)

> Check and document that docker-compose >= 1.20 is needed to run breeze
> --
>
> Key: AIRFLOW-5761
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5761
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Affects Versions: 2.0.0, 1.10.5, 1.10.6
>Reporter: Jarek Potiuk
>Assignee: Abhik Chakraborty
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
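The version gate described in the issue title could be sketched as a small parser over the output of `docker-compose --version`. This is only an illustration of the check; the helper names (`parse_compose_version`, `is_compose_supported`) are assumptions, not Breeze's actual implementation.

```python
import re

# Minimum docker-compose version the issue says Breeze needs.
MIN_VERSION = (1, 20)


def parse_compose_version(version_output: str) -> tuple:
    """Extract (major, minor) from `docker-compose --version` output."""
    match = re.search(r"(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError(f"Cannot parse version from: {version_output!r}")
    return (int(match.group(1)), int(match.group(2)))


def is_compose_supported(version_output: str) -> bool:
    """Compare as tuples so 1.9 does not sort above 1.20."""
    return parse_compose_version(version_output) >= MIN_VERSION


# Example outputs in the format docker-compose prints them:
print(is_compose_supported("docker-compose version 1.25.0, build b42d419"))  # True
print(is_compose_supported("docker-compose version 1.17.1, build unknown"))  # False
```

Comparing tuples rather than the raw string avoids the classic lexicographic trap where "1.9" would compare greater than "1.20".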


[jira] [Commented] (AIRFLOW-5761) Check and document that docker-compose >= 1.20 is needed to run breeze

2021-02-22 Thread Jarek Potiuk (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17288633#comment-17288633
 ] 

Jarek Potiuk commented on AIRFLOW-5761:
---

Feel free. We are not part of GSoC in 2021 though (and we switched to GitHub 
Issues, so this one is kinda obsolete), but you can make a PR directly to the 
airflow repo without creating an issue.

> Check and document that docker-compose >= 1.20 is needed to run breeze
> --
>
> Key: AIRFLOW-5761
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5761
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Affects Versions: 2.0.0, 1.10.5, 1.10.6
>Reporter: Jarek Potiuk
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
> Fix For: 2.0.0
>
>






[jira] [Assigned] (AIRFLOW-5761) Check and document that docker-compose >= 1.20 is needed to run breeze

2021-02-22 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reassigned AIRFLOW-5761:
-

Assignee: Abhishek Kandoi

> Check and document that docker-compose >= 1.20 is needed to run breeze
> --
>
> Key: AIRFLOW-5761
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5761
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Affects Versions: 2.0.0, 1.10.5, 1.10.6
>Reporter: Jarek Potiuk
>Assignee: Abhishek Kandoi
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
> Fix For: 2.0.0
>
>






[jira] [Updated] (AIRFLOW-3644) AIP-8 Split Hooks/Operators out of core package and repository

2020-09-13 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-3644:
--
Description: 
Based on discussion at 
[http://mail-archives.apache.org/mod_mbox/airflow-dev/201809.mbox/%3c308670db-bd2a-4738-81b1-3f6fb312c...@apache.org%3E]
 I believe separating hooks/operators into separate packages can benefit 
long-term maintainability of Apache Airflow by distributing maintenance and 
reducing the surface area of the core Airflow package.

AIP-8 draft: 
[https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=100827303]

 

AIP-8 has been updated and the decision has been made to split Airflow 2.0 into 
separate packages. Voting started at 
[https://lists.apache.org/thread.html/rcd63bbe62a618c4547bd00b1c1d14dc329cfe1c09e4795571be28cb3%40%3Cdev.airflow.apache.org%3E]

  was:
Based on discussion at 
http://mail-archives.apache.org/mod_mbox/airflow-dev/201809.mbox/%3c308670db-bd2a-4738-81b1-3f6fb312c...@apache.org%3E
 I believe separating hooks/operators into separate packages can benefit 
long-term maintainability of Apache Airflow by distributing maintenance and 
reducing the surface area of the core Airflow package.

AIP-8 draft: 
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=100827303


> AIP-8 Split Hooks/Operators out of core package and repository
> --
>
> Key: AIRFLOW-3644
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3644
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, core
>Reporter: Tim Swast
>Assignee: Jarek Potiuk
>Priority: Major
>
> Based on discussion at 
> [http://mail-archives.apache.org/mod_mbox/airflow-dev/201809.mbox/%3c308670db-bd2a-4738-81b1-3f6fb312c...@apache.org%3E]
>  I believe separating hooks/operators into separate packages can benefit 
> long-term maintainability of Apache Airflow by distributing maintenance and 
> reducing the surface area of the core Airflow package.
> AIP-8 draft: 
> [https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=100827303]
>  
> AIP-8 has been updated and the decision has been made to split Airflow 2.0 into 
> separate packages. Voting started at 
> [https://lists.apache.org/thread.html/rcd63bbe62a618c4547bd00b1c1d14dc329cfe1c09e4795571be28cb3%40%3Cdev.airflow.apache.org%3E]





[jira] [Assigned] (AIRFLOW-3644) AIP-8 Split Hooks/Operators out of core package and repository

2020-09-13 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reassigned AIRFLOW-3644:
-

Assignee: Jarek Potiuk

> AIP-8 Split Hooks/Operators out of core package and repository
> --
>
> Key: AIRFLOW-3644
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3644
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, core
>Reporter: Tim Swast
>Assignee: Jarek Potiuk
>Priority: Major
>
> Based on discussion at 
> http://mail-archives.apache.org/mod_mbox/airflow-dev/201809.mbox/%3c308670db-bd2a-4738-81b1-3f6fb312c...@apache.org%3E
>  I believe separating hooks/operators into separate packages can benefit 
> long-term maintainability of Apache Airflow by distributing maintenance and 
> reducing the surface area of the core Airflow package.
> AIP-8 draft: 
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=100827303





[jira] [Updated] (AIRFLOW-5831) Add production image support

2020-08-01 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-5831:
--
Description: 
Progress kept here:

 

[https://github.com/apache/airflow/projects/3]

  was:Production image should be built automatically


> Add production image support
> 
>
> Key: AIRFLOW-5831
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5831
> Project: Apache Airflow
>  Issue Type: Sub-task
>  Components: breeze
>Affects Versions: 2.0.0, 1.10.6
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Major
>
> Progress kept here:
>  
> [https://github.com/apache/airflow/projects/3]





[jira] [Work started] (AIRFLOW-5831) Add production image support

2020-08-01 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AIRFLOW-5831 started by Jarek Potiuk.
-
> Add production image support
> 
>
> Key: AIRFLOW-5831
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5831
> Project: Apache Airflow
>  Issue Type: Sub-task
>  Components: breeze
>Affects Versions: 2.0.0, 1.10.6
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Major
>
> Production image should be built automatically





[jira] [Updated] (AIRFLOW-3081) Support automated integration tests in the CI

2020-07-09 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-3081:
--
Summary: Support automated integration tests in the CI  (was: Support 
automated integration tests in Travis CI)

> Support automated integration tests in the CI
> -
>
> Key: AIRFLOW-3081
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3081
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: ci
>Reporter: Jarek Potiuk
>Priority: Minor
> Attachments: Screen Shot 2018-09-19 at 16.39.46.png
>
>
> I think it would be great to have a way to run integration tests 
> automatically for some of the operators. We've started to work on some GCP 
> operators (Cloud Functions is the first one). We have a proposal on how Cloud 
> Functions (and later other GCP operators) could have integration tests that 
> could run on GCP infrastructure. Here is the link to the proposal Doc 
> [https://docs.google.com/document/d/1-763cYrOs37Sj77RzSQP5hy1GSvZ7I7MPOOG2Q86Osc/edit|https://docs.google.com/document/d/1-763cYrOs37Sj77RzSQP5hy1GSvZ7I7MPOOG2Q86Osc/edit?usp=sharing]
> Maybe it's a good time to start discussion on that :).





[jira] [Closed] (AIRFLOW-5029) Migrate out of Travis CI

2020-07-09 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk closed AIRFLOW-5029.
-
Resolution: Fixed

> Migrate out of Travis CI
> 
>
> Key: AIRFLOW-5029
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5029
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 1.10.4, 2.0.0
>Reporter: Jarek Potiuk
>Assignee: Tomasz Urbaszek
>Priority: Major
>
> Travis CI is getting terrible. We need to migrate out of it.





[jira] [Commented] (AIRFLOW-5071) Thousands of Executor reports task instance X finished (success) although the task says its queued. Was the task killed externally?

2020-07-08 Thread Jarek Potiuk (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153252#comment-17153252
 ] 

Jarek Potiuk commented on AIRFLOW-5071:
---

[~kaxilnaik] [~ash] [~turbaszek] [~kamil.bregula] -> We might want to take a 
very deep look at this. It seems there is something deep inside the DAG 
scheduling process that deserves a close look!

> Thousands of Executor reports task instance X finished (success) although the 
> task says its queued. Was the task killed externally?
> --
>
> Key: AIRFLOW-5071
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5071
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DAG, scheduler
>Affects Versions: 1.10.3
>Reporter: msempere
>Priority: Critical
> Attachments: image-2020-01-27-18-10-29-124.png, 
> image-2020-07-08-07-58-42-972.png
>
>
> I'm opening this issue because since I updated to 1.10.3 I'm seeing thousands 
> of daily messages like the following in the logs:
>  
> ```
>  {{__init__.py:1580}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> {{jobs.py:1484}} ERROR - Executor reports task instance  2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> ```
> -And it looks like this is also triggering thousands of daily emails because the 
> flag to send email in case of failure is set to True.-
> I have Airflow setup to use Celery and Redis as a backend queue service.





[jira] [Resolved] (AIRFLOW-1536) DaemonContext uses default umask 0

2020-05-17 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-1536.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> DaemonContext uses default umask 0
> --
>
> Key: AIRFLOW-1536
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1536
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: cli, security
>Reporter: Timothy O'Keefe
>Assignee: Deepak Aggarwal
>Priority: Major
> Fix For: 2.0.0
>
>
> All DaemonContext instances used for worker, scheduler, webserver, flower, 
> etc. do not supply a umask argument. See here for example:
> https://github.com/apache/incubator-airflow/blob/b0669b532a7be9aa34a4390951deaa25897c62e6/airflow/bin/cli.py#L869
> As a result, the DaemonContext will use the default umask=0 which leaves user 
> data exposed. A BashOperator for example that writes any files would have 
> permissions rw-rw-rw- as would any airflow logs.
> I believe the umask should either be configurable, or inherited from the 
> parent shell, or both.
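The exposure the reporter describes can be reproduced directly with `os.umask`: with umask 0, newly created files come out world-writable. The sketch below is illustrative only (the actual fix would pass a `umask` argument to `daemon.DaemonContext`); `create_with_umask` is a hypothetical helper.

```python
import os
import stat
import tempfile


def create_with_umask(umask_value: int) -> int:
    """Create a file under the given umask and return its permission bits."""
    old = os.umask(umask_value)
    try:
        path = os.path.join(tempfile.mkdtemp(), "airflow.log")
        with open(path, "w") as f:
            f.write("log line\n")
        return stat.S_IMODE(os.stat(path).st_mode)
    finally:
        os.umask(old)  # always restore the previous umask


print(oct(create_with_umask(0)))      # 0o666 -> rw-rw-rw-, the reported exposure
print(oct(create_with_umask(0o022)))  # 0o644 -> rw-r--r--
```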





[jira] [Commented] (AIRFLOW-6157) Separate out executor protocol

2020-04-02 Thread Jarek Potiuk (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073457#comment-17073457
 ] 

Jarek Potiuk commented on AIRFLOW-6157:
---

Not for now.

It would technically be possible to have multiple executors (I had a working 
POC for that). But we have no plans for that.

> Separate out executor protocol
> --
>
> Key: AIRFLOW-6157
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6157
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Major
>
> Some of the fields of executors are accessed directly in the main core. The 
> protocol for executor can be extracted and used in all places where executors 
> are used. 
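One way to express such an extracted protocol is with `typing.Protocol`, so the core depends on a declared interface rather than on concrete executor attributes. The names and method set below are assumptions for illustration, not Airflow's real executor interface.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ExecutorProtocol(Protocol):
    """Hypothetical protocol for the executor surface the core touches."""

    def start(self) -> None: ...
    def queue_command(self, key: str, command: list) -> None: ...
    def heartbeat(self) -> None: ...
    def end(self) -> None: ...


class ToyExecutor:
    """Structural subtype: satisfies the protocol without inheriting from it."""

    def __init__(self) -> None:
        self.queued = []

    def start(self) -> None:
        pass

    def queue_command(self, key: str, command: list) -> None:
        self.queued.append((key, command))

    def heartbeat(self) -> None:
        pass

    def end(self) -> None:
        pass


# runtime_checkable lets isinstance() verify the methods are present.
print(isinstance(ToyExecutor(), ExecutorProtocol))  # True
```

Because the check is structural, existing executor classes would satisfy the protocol without any inheritance changes.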





[jira] [Resolved] (AIRFLOW-7115) display the dag owner in the dag view

2020-03-30 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7115.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> display the dag owner in the dag view
> -
>
> Key: AIRFLOW-7115
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7115
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ui
>Affects Versions: 1.10.9
>Reporter: ohad
>Assignee: ohad
>Priority: Minor
> Fix For: 2.0.0
>
>
> It would be nice to display the owner in the dag view, not only in the homepage 
> list.





[jira] [Resolved] (AIRFLOW-7114) bug with copy the dag name from the dag view

2020-03-30 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7114.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> bug with copy the dag name from the dag view
> 
>
> Key: AIRFLOW-7114
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7114
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 1.10.9
>Reporter: ohad
>Assignee: ohad
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: 111.jpg
>
>
> Hi
> When I copy the dag name from the title, garbage text is also copied to the 
> clipboard.
>  I copy this title many times in my daily usage, so it's kind of 
> annoying.
> +Example+:
> when I copy the big title "*Task_16790*" and paste,
>  the content of the clipboard will be: "*Task_16790schedule*"
> !111.jpg!





[jira] [Assigned] (AIRFLOW-7039) Specific DAG Schedule & DST Results in Skipped DAG Run

2020-03-29 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reassigned AIRFLOW-7039:
-

Assignee: (was: Jarek Potiuk)

> Specific DAG Schedule & DST Results in Skipped DAG Run
> --
>
> Key: AIRFLOW-7039
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7039
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10.7
> Environment: Amazon Linux 2 AMI
>Reporter: Peter Kim
>Priority: Critical
>  Labels: timezone
>
> *Scenario:* 
> EC2 running airflow is in Eastern Time (America/New_York), 
> airflow.cfg>[core]>default_timezone=America/New_York (automatically changes 
> correctly)
> Monday morning after Daylight Savings Time applied a handful of DAG runs were 
> not executed as expected.  The strange part is that these DAGs were the only 
> jobs that did not behave as expected, all other DAGs ran normally.  
> Additionally, only the first expected run after DST was skipped, subsequent 
> runs later that day were scheduled successfully.
> Here is the pattern observed:
> DAG Schedule which skipped first run:  (0 , * * 1,2,3,4,5)
> e.g. Schedules M-F, with two distinct runs per day.
> DAGs that run at one time, M-F & DAGs that run at two times, not M-F did not 
> experience this issue.  
>  
> Based on the logs, it appears as if the expected run that was missed was not 
> seen by the scheduler whatsoever (see below):
>  
>  
> 2020 03 06 6:30 AM ET (BEFORE DST, EXPECTED BEHAVIOR):
> [2020-03-06 06:31:01,220] \{logging_mixin.py:112} INFO - [2020-03-06 
> 06:31:01,220] \{settings.py:254} INFO - settings.configure_orm(): Using pool 
> settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=697
> [2020-03-06 06:31:01,222] \{scheduler_job.py:153} INFO - Started process 
> (PID=697) to work on /home/ec2-user/airflow/s3fuse/dags/.py
> [2020-03-06 06:31:01,228] \{scheduler_job.py:1539} INFO - Processing file 
> /home/ec2-user/airflow/s3fuse/dags/.py for tasks to queue
> [2020-03-06 06:31:01,228] \{logging_mixin.py:112} INFO - [2020-03-06 
> 06:31:01,228] \{dagbag.py:403} INFO - Filling up the DagBag from 
> /home/ec2-user/airflow/s3fuse/dags/.py
> [2020-03-06 06:31:01,238] \{scheduler_job.py:1551} INFO - DAG(s) 
> dict_keys(['']) retrieved from 
> /home/ec2-user/airflow/s3fuse/dags/.py
> [2020-03-06 06:31:01,305] \{scheduler_job.py:1262} INFO - Processing 
> 
> [2020-03-06 06:31:01,348] \{logging_mixin.py:112} WARNING - 
> /home/ec2-user/venv/lib64/python3.7/site-packages/pymysql/cursors.py:170: 
> Warning: (1300, "Invalid utf8mb4 character string: '80047D'") result = 
> self._query(query)
> [2020-03-06 06:31:01,362] \{scheduler_job.py:1272} INFO - Created   @ 2020-03-05T15:30:00+00:00: scheduled__2020-03-05T15:30:00+00:00, 
> externally triggered: False>
> [2020-03-06 06:31:01,366] \{scheduler_job.py:740} INFO - Examining DAG run 
>  @ 2020-03-05 15:30:00+00:00: 
> scheduled__2020-03-05T15:30:00+00:00, externally triggered: False>
> [2020-03-06 06:31:01,389] \{scheduler_job.py:440} INFO - Skipping SLA check 
> for > because no tasks in DAG have SLAs
> [2020-03-06 06:31:01,395] \{scheduler_job.py:1613} INFO - Creating / updating 
> . 2020-03-05 15:30:00+00:00 [scheduled]> 
> in ORM
> [2020-03-06 06:31:01,414] \{scheduler_job.py:161} INFO - Processing 
> /home/ec2-user/airflow/s3fuse/dags/.py took 0.192 seconds
> 20200306 10 AM ET (BEFORE DST, EXPECTED BEHAVIOR):
> [2020-03-06 10:30:00,083] \{logging_mixin.py:112} INFO - [2020-03-06 
> 10:30:00,082] \{settings.py:254} INFO - settings.configure_orm(): Using pool 
> settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=16194
> [2020-03-06 10:30:00,085] \{scheduler_job.py:153} INFO - Started process 
> (PID=16194) to work on /home/ec2-user/airflow/s3fuse/dags/.py
> [2020-03-06 10:30:00,090] \{scheduler_job.py:1539} INFO - Processing file 
> /home/ec2-user/airflow/s3fuse/dags/.py for tasks to queue
> [2020-03-06 10:30:00,090] \{logging_mixin.py:112} INFO - [2020-03-06 
> 10:30:00,090] \{dagbag.py:403} INFO - Filling up the DagBag from 
> /home/ec2-user/airflow/s3fuse/dags/.py
> [2020-03-06 10:30:00,099] \{scheduler_job.py:1551} INFO - DAG(s) 
> dict_keys(['']) retrieved from 
> /home/ec2-user/airflow/s3fuse/dags/.py
> [2020-03-06 10:30:00,159] \{scheduler_job.py:1262} INFO - Processing 
> 
> [2020-03-06 10:30:00,193] \{logging_mixin.py:112} WARNING - 
> /home/ec2-user/venv/lib64/python3.7/site-packages/pymysql/cursors.py:170: 
> Warning: (1300, "Invalid utf8mb4 character string: '80047D'")
>   result = self._query(query)
> [2020-03-06 10:30:00,207] \{scheduler_job.py:1272} INFO - Created   @ 2020-03-06T11:30:00+00:00: scheduled__2020-03-06T11:30:00+00:00, 
> externally triggered: False>
> [2020-03-06 10:30:00,212] \{scheduler_job.py:740} 

[jira] [Commented] (AIRFLOW-7039) Specific DAG Schedule & DST Results in Skipped DAG Run

2020-03-29 Thread Jarek Potiuk (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17070547#comment-17070547
 ] 

Jarek Potiuk commented on AIRFLOW-7039:
---

No progress - and the next DST change is in half a year, so we still have time. Feel 
free to take it over if you want.
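The underlying hazard can be shown with just the standard library: the same local wall-clock time in America/New_York maps to different UTC instants on either side of the 2020-03-08 DST transition, which is exactly the kind of shift that can make a UTC-based scheduler miss one expected run. A minimal sketch, not Airflow's scheduling code:

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # Python 3.9+

ny = ZoneInfo("America/New_York")

# 06:30 local time on a date before and a date after the DST switch:
before = datetime(2020, 3, 6, 6, 30, tzinfo=ny)   # EST, UTC-5
after = datetime(2020, 3, 9, 6, 30, tzinfo=ny)    # EDT, UTC-4

print(before.astimezone(ZoneInfo("UTC")))  # 2020-03-06 11:30:00+00:00
print(after.astimezone(ZoneInfo("UTC")))   # 2020-03-09 10:30:00+00:00
```

A scheduler that computes the "next run" in UTC from the previous run's UTC timestamp will see the expected instant jump by an hour across the transition, consistent with only the first post-DST run being skipped.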

> Specific DAG Schedule & DST Results in Skipped DAG Run
> --
>
> Key: AIRFLOW-7039
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7039
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10.7
> Environment: Amazon Linux 2 AMI
>Reporter: Peter Kim
>Assignee: Jarek Potiuk
>Priority: Critical
>  Labels: timezone
>
> *Scenario:* 
> EC2 running airflow is in Eastern Time (America/New_York), 
> airflow.cfg>[core]>default_timezone=America/New_York (automatically changes 
> correctly)
> Monday morning after Daylight Savings Time applied a handful of DAG runs were 
> not executed as expected.  The strange part is that these DAGs were the only 
> jobs that did not behave as expected, all other DAGs ran normally.  
> Additionally, only the first expected run after DST was skipped, subsequent 
> runs later that day were scheduled successfully.
> Here is the pattern observed:
> DAG Schedule which skipped first run:  (0 , * * 1,2,3,4,5)
> e.g. Schedules M-F, with two distinct runs per day.
> DAGs that run at one time, M-F & DAGs that run at two times, not M-F did not 
> experience this issue.  

[jira] [Commented] (AIRFLOW-3674) Adding documentation on official docker images

2020-03-27 Thread Jarek Potiuk (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17069244#comment-17069244
 ] 

Jarek Potiuk commented on AIRFLOW-3674:
---

I think that one can be closed. We already have quite a lot of documentation, and 
more of it is coming with the production image.

> Adding documentation on official docker images
> --
>
> Key: AIRFLOW-3674
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3674
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Reporter: Peter van 't Hof
>Assignee: Jarek Potiuk
>Priority: Major
>  Labels: docker, dockerfile
>






[jira] [Closed] (AIRFLOW-3674) Adding documentation on official docker images

2020-03-27 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk closed AIRFLOW-3674.
-
Resolution: Done

> Adding documentation on official docker images
> --
>
> Key: AIRFLOW-3674
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3674
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Reporter: Peter van 't Hof
>Assignee: Jarek Potiuk
>Priority: Major
>  Labels: docker, dockerfile
>






[jira] [Commented] (AIRFLOW-265) Custom parameters for DockerOperator

2020-03-27 Thread Jarek Potiuk (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17069243#comment-17069243
 ] 

Jarek Potiuk commented on AIRFLOW-265:
--

Feel free [~Becky]

> Custom parameters for DockerOperator
> 
>
> Key: AIRFLOW-265
> URL: https://issues.apache.org/jira/browse/AIRFLOW-265
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: operators
>Reporter: Alexandr Nikitin
>Assignee: Ngwe Becky
>Priority: Major
>  Labels: docker, gsoc, gsoc2020, mentor
>
> Add ability to specify custom parameters to docker cli. E.g. 
> "--volume-driver=""" or --net="bridge" or any other
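The feature request amounts to composing arbitrary extra flags into the docker invocation. A sketch of that composition; `build_docker_args` is a hypothetical helper, not part of DockerOperator:

```python
def build_docker_args(image: str, extra_params: dict) -> list:
    """Compose a `docker run` argv with arbitrary user-supplied flags."""
    args = ["docker", "run"]
    for flag, value in extra_params.items():
        if value is True:
            # Boolean switch such as --rm: emit the flag alone.
            args.append(flag)
        else:
            args.append(f"{flag}={value}")
    args.append(image)
    return args


print(build_docker_args("ubuntu:18.04", {"--net": "bridge", "--volume-driver": "local"}))
# ['docker', 'run', '--net=bridge', '--volume-driver=local', 'ubuntu:18.04']
```

Passing a dict of flags keeps the operator's signature stable while letting users reach any docker CLI option.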





[jira] [Assigned] (AIRFLOW-265) Custom parameters for DockerOperator

2020-03-27 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reassigned AIRFLOW-265:


Assignee: Ngwe Becky

> Custom parameters for DockerOperator
> 
>
> Key: AIRFLOW-265
> URL: https://issues.apache.org/jira/browse/AIRFLOW-265
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: operators
>Reporter: Alexandr Nikitin
>Assignee: Ngwe Becky
>Priority: Major
>  Labels: docker, gsoc, gsoc2020, mentor
>
> Add ability to specify custom parameters to docker cli. E.g. 
> "--volume-driver=""" or --net="bridge" or any other





[jira] [Updated] (AIRFLOW-6562) mushroom cloud error when clicking 'mark failed/success' from graph view of dag that has never been run yet

2020-03-24 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-6562:
--
Fix Version/s: (was: 2.0.0)
   1.10.10

> mushroom cloud error when clicking 'mark failed/success' from graph view of 
> dag that has never been run yet
> ---
>
> Key: AIRFLOW-6562
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6562
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 1.10.6
> Environment: localexec, mysql metastore, 1.10.6
>Reporter: t oo
>Assignee: t oo
>Priority: Major
> Fix For: 1.10.10
>
>
> # create a new dag
>  # go to graph view
>  # click on one of the tasks (it should have a white border)
>  # click on 'past/future' on either 2nd last row (mark failed) or last row 
> (mark success)
>  # then click either (mark failed) or (mark success)
> below error appears
> Traceback (most recent call last):
>  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 2446, in 
> wsgi_app
>  response = self.full_dispatch_request()
>  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1951, in 
> full_dispatch_request
>  rv = self.handle_user_exception(e)
>  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1820, in 
> handle_user_exception
>  reraise(exc_type, exc_value, tb)
>  File "/usr/local/lib/python3.7/site-packages/flask/_compat.py", line 39, in 
> reraise
>  raise value
>  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1949, in 
> full_dispatch_request
>  rv = self.dispatch_request()
>  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1935, in 
> dispatch_request
>  return self.view_functions[rule.endpoint](**req.view_args)
>  File "/usr/local/lib/python3.7/site-packages/flask_admin/base.py", line 69, 
> in inner
>  return self._run_view(f, *args, **kwargs)
>  File "/usr/local/lib/python3.7/site-packages/flask_admin/base.py", line 368, 
> in _run_view
>  return fn(self, *args, **kwargs)
>  File "/usr/local/lib/python3.7/site-packages/flask_login/utils.py", line 
> 258, in decorated_view
>  return func(*args, **kwargs)
>  File "/usr/local/lib/python3.7/site-packages/airflow/www/utils.py", line 
> 290, in wrapper
>  return f(*args, **kwargs)
>  File "/usr/local/lib/python3.7/site-packages/airflow/www/utils.py", line 
> 337, in wrapper
>  return f(*args, **kwargs)
>  File "/usr/local/lib/python3.7/site-packages/airflow/www/views.py", line 
> 1449, in failed
>  future, past, State.FAILED)
>  File "/usr/local/lib/python3.7/site-packages/airflow/www/views.py", line 
> 1420, in _mark_task_instance_state
>  commit=False)
>  File "/usr/local/lib/python3.7/site-packages/airflow/utils/db.py", line 74, 
> in wrapper
>  return func(*args, **kwargs)
>  File 
> "/usr/local/lib/python3.7/site-packages/airflow/api/common/experimental/mark_tasks.py",
>  line 105, in set_state
>  dates = get_execution_dates(dag, execution_date, future, past)
>  File 
> "/usr/local/lib/python3.7/site-packages/airflow/api/common/experimental/mark_tasks.py",
>  line 246, in get_execution_dates
>  raise ValueError("Received non-localized date {}".format(execution_date))
> ValueError: Received non-localized date 2020-01-14T21:58:44.855743+00:00
>  
>  
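The ValueError in the traceback above comes from a guard that rejects naive (non-timezone-aware) execution dates. A minimal sketch of the idea, with an illustrative helper (not Airflow's actual code) that attaches UTC when tzinfo is missing:

```python
from datetime import datetime, timezone

def ensure_localized(dt):
    # Illustrative helper (not Airflow's API): attach UTC to a naive
    # datetime so a downstream "non-localized date" check would pass.
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt

naive = datetime(2020, 1, 14, 21, 58, 44)
aware = ensure_localized(naive)
print(aware.tzinfo)  # UTC
```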



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (AIRFLOW-6562) mushroom cloud error when clicking 'mark failed/success' from graph view of dag that has never been run yet

2020-03-24 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6562.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> mushroom cloud error when clicking 'mark failed/success' from graph view of 
> dag that has never been run yet
> ---
>
> Key: AIRFLOW-6562
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6562
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ui
>Affects Versions: 1.10.6
> Environment: localexec, mysql metastore, 1.10.6
>Reporter: t oo
>Assignee: t oo
>Priority: Major
> Fix For: 2.0.0
>
>
> # create a new dag
>  # go to graph view
>  # click on one of the tasks (it should have a white border)
>  # click on 'past/future' on either 2nd last row (mark failed) or last row 
> (mark success)
>  # then click either (mark failed) or (mark success)
> The error below appears:
> Traceback (most recent call last):
>  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 2446, in 
> wsgi_app
>  response = self.full_dispatch_request()
>  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1951, in 
> full_dispatch_request
>  rv = self.handle_user_exception(e)
>  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1820, in 
> handle_user_exception
>  reraise(exc_type, exc_value, tb)
>  File "/usr/local/lib/python3.7/site-packages/flask/_compat.py", line 39, in 
> reraise
>  raise value
>  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1949, in 
> full_dispatch_request
>  rv = self.dispatch_request()
>  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1935, in 
> dispatch_request
>  return self.view_functions[rule.endpoint](**req.view_args)
>  File "/usr/local/lib/python3.7/site-packages/flask_admin/base.py", line 69, 
> in inner
>  return self._run_view(f, *args, **kwargs)
>  File "/usr/local/lib/python3.7/site-packages/flask_admin/base.py", line 368, 
> in _run_view
>  return fn(self, *args, **kwargs)
>  File "/usr/local/lib/python3.7/site-packages/flask_login/utils.py", line 
> 258, in decorated_view
>  return func(*args, **kwargs)
>  File "/usr/local/lib/python3.7/site-packages/airflow/www/utils.py", line 
> 290, in wrapper
>  return f(*args, **kwargs)
>  File "/usr/local/lib/python3.7/site-packages/airflow/www/utils.py", line 
> 337, in wrapper
>  return f(*args, **kwargs)
>  File "/usr/local/lib/python3.7/site-packages/airflow/www/views.py", line 
> 1449, in failed
>  future, past, State.FAILED)
>  File "/usr/local/lib/python3.7/site-packages/airflow/www/views.py", line 
> 1420, in _mark_task_instance_state
>  commit=False)
>  File "/usr/local/lib/python3.7/site-packages/airflow/utils/db.py", line 74, 
> in wrapper
>  return func(*args, **kwargs)
>  File 
> "/usr/local/lib/python3.7/site-packages/airflow/api/common/experimental/mark_tasks.py",
>  line 105, in set_state
>  dates = get_execution_dates(dag, execution_date, future, past)
>  File 
> "/usr/local/lib/python3.7/site-packages/airflow/api/common/experimental/mark_tasks.py",
>  line 246, in get_execution_dates
>  raise ValueError("Received non-localized date {}".format(execution_date))
> ValueError: Received non-localized date 2020-01-14T21:58:44.855743+00:00
>  
>  





[jira] [Resolved] (AIRFLOW-7067) Add apache-airflow-pinned version

2020-03-22 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7067.
---
Fix Version/s: 1.10.10
   Resolution: Fixed

> Add apache-airflow-pinned version
> -
>
> Key: AIRFLOW-7067
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7067
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0, 1.10.9
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.10
>
>
> For the official Docker image we need a fixed set of requirements so that 
> rebuilding the image is reproducible.
> We need a -pinned version of apache-airflow for that.





[jira] [Resolved] (AIRFLOW-7097) Install gcloud beta components in CI image

2020-03-21 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7097.
---
Fix Version/s: 1.10.10
   Resolution: Fixed

> Install gcloud beta components in CI image
> --
>
> Key: AIRFLOW-7097
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7097
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ci
>Affects Versions: 2.0.0, 1.10.9
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.10
>
>






[jira] [Resolved] (AIRFLOW-6752) Add GoogleAnalyticsRetrieveAdsLinksListOperator

2020-03-20 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6752.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Add GoogleAnalyticsRetrieveAdsLinksListOperator
> ---
>
> Key: AIRFLOW-6752
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6752
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: api
>Affects Versions: 1.10.8
>Reporter: Michał Słowikowski
>Assignee: Michał Słowikowski
>Priority: Minor
> Fix For: 2.0.0
>
>
>  
>  





[jira] [Resolved] (AIRFLOW-7098) Simple-salesforce release 1.0.0 breaks salesforce extra

2020-03-20 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7098.
---
Fix Version/s: 1.10.10
   Resolution: Fixed

> Simple-salesforce release 1.0.0 breaks salesforce extra
> --
>
> Key: AIRFLOW-7098
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7098
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ci
>Affects Versions: 2.0.0, 1.10.9
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.10
>
>
> The simple-salesforce PyPI dependency breaks the salesforce extra.





[jira] [Created] (AIRFLOW-7098) Simple-salesforce release 1.0.0 breaks salesforce extra

2020-03-20 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-7098:
-

 Summary: Simple-salesforce release 1.0.0 breaks salesforce extra
 Key: AIRFLOW-7098
 URL: https://issues.apache.org/jira/browse/AIRFLOW-7098
 Project: Apache Airflow
  Issue Type: Bug
  Components: ci
Affects Versions: 1.10.9, 2.0.0
Reporter: Jarek Potiuk


The simple-salesforce PyPI dependency breaks the salesforce extra.





[jira] [Created] (AIRFLOW-7097) Install gcloud beta components in CI image

2020-03-20 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-7097:
-

 Summary: Install gcloud beta components in CI image
 Key: AIRFLOW-7097
 URL: https://issues.apache.org/jira/browse/AIRFLOW-7097
 Project: Apache Airflow
  Issue Type: Bug
  Components: ci
Affects Versions: 1.10.9, 2.0.0
Reporter: Jarek Potiuk








[jira] [Resolved] (AIRFLOW-7069) Fix GCP Cloud SQL system tests

2020-03-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7069.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Fix GCP Cloud SQL system tests
> --
>
> Key: AIRFLOW-7069
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7069
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: gcp
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0
>
>






[jira] [Resolved] (AIRFLOW-7080) API Endpoint to query a dag's paused status

2020-03-18 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7080.
---
Fix Version/s: 1.10.10
   Resolution: Fixed

> API Endpoint to query a dag's paused status
> ---
>
> Key: AIRFLOW-7080
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7080
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: api
>Affects Versions: 2.0.0, 1.10.10
>Reporter: Philipp Großelfinger
>Assignee: Philipp Großelfinger
>Priority: Minor
> Fix For: 1.10.10
>
>
> So far it is possible to set the paused state of a DAG via the experimental 
> API. It would be nice to also be able to query the current paused state of a 
> DAG.
> The endpoint to set the paused state looks like this: 
> /api/experimental/dags/<dag_id>/paused/<string:paused>
> The new endpoint could look like: 
> GET /api/experimental/dags/<dag_id>/paused
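A minimal sketch of what the proposed read-only endpoint could return, using a plain function and an in-memory stand-in for the metadata database (the names here are illustrative, not actual Airflow code):

```python
# Stand-in for DagModel.is_paused in the metadata DB.
PAUSED_STATE = {"example_dag": True}

def get_paused(dag_id):
    # Hypothetical handler body for GET /api/experimental/dags/<dag_id>/paused
    return {"is_paused": bool(PAUSED_STATE.get(dag_id, False))}

print(get_paused("example_dag"))  # {'is_paused': True}
```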





[jira] [Reopened] (AIRFLOW-1585) Documentation (e.g. doc_md) for tasks should be displayed more prominently

2020-03-18 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reopened AIRFLOW-1585:
---

> Documentation (e.g. doc_md) for tasks should be displayed more prominently 
> ---
>
> Key: AIRFLOW-1585
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1585
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ui
>Affects Versions: 1.8.0
>Reporter: Max Grender-Jones
>Priority: Major
>
> We have a web of complex tasks. I love that when I set `doc_md` on an entire 
> DAG it is visible very prominently on the Graph View. However, the same 
> cannot be said of `doc_md` set on individual tasks (until now I've never even 
> had a use for the `Task Details` tab, and even there it's not very prominently 
> presented).
> My wishlist of where it would be displayed:
>  - In the mouseover (although if there's a long description, perhaps that 
> would be too much for here)
>  - In the popup that you see when you click on a task (My favourite)
>  - At the top of *all* the tabs for that task, displayed in a prominent 
> break-out fashion (as with the DAG), as opposed to buried in the 'attributes' 
> section





[jira] [Resolved] (AIRFLOW-6014) Kubernetes executor - handle preempted deleted pods - queued tasks

2020-03-17 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6014.
---
Fix Version/s: 1.10.10
   Resolution: Fixed

> Kubernetes executor - handle preempted deleted pods - queued tasks
> --
>
> Key: AIRFLOW-6014
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6014
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: executor-kubernetes
>Affects Versions: 1.10.6
>Reporter: afusr
>Assignee: Daniel Imberman
>Priority: Minor
> Fix For: 1.10.10
>
>
> We have encountered an issue whereby, when using the Kubernetes executor 
> with autoscaling, Airflow pods are preempted and Airflow never attempts to 
> rerun them.
> This is partly a result of having the following set on the pod spec:
> restartPolicy: Never
> This makes sense: if a pod fails while running a task, we don't want 
> Kubernetes to retry it, as retries should be controlled by Airflow.
> What we believe happens is that when autoscaling adds a new node, Kubernetes 
> schedules a number of Airflow pods onto the new node, as well as any pods 
> required by k8s/daemon sets. As these are higher priority, the Airflow pods 
> are preempted and deleted. You see messages such as:
>  
> Preempted by kube-system/ip-masq-agent-xz77q on node 
> gke-some--airflow--node-1ltl
>  
> Within the Kubernetes executor, these pods end up in a Pending status, and a 
> DELETED event is received but not handled.
> The end result is that tasks remain in a queued state forever.
>  
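One way the unhandled case could be treated, sketched in plain Python (this event handling is an assumption for illustration, not the executor's actual code): a DELETED event for a pod that never left Pending is mapped to a requeue so the task can be rescheduled.

```python
def process_pod_event(event_type, pod_phase, task_key, requeue):
    # Hypothetical sketch: a preempted pod is DELETED while still Pending;
    # treating that as a failure lets the task be rescheduled instead of
    # staying queued forever.
    if event_type == "DELETED" and pod_phase == "Pending":
        requeue(task_key)
        return "requeued"
    return "ignored"

requeued = []
print(process_pod_event("DELETED", "Pending", "mytask", requeued.append))  # requeued
```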





[jira] [Resolved] (AIRFLOW-5610) Add ability to specify multiple objects to copy to GoogleCloudStorageToGoogleCloudStorageOperator

2020-03-17 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-5610.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Add ability to specify multiple objects to copy to 
> GoogleCloudStorageToGoogleCloudStorageOperator
> -
>
> Key: AIRFLOW-5610
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5610
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 1.10.5
>Reporter: Joel Croteau
>Assignee: Ephraim E Anierobi
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
> Fix For: 2.0.0
>
>
> The restriction in GoogleCloudStorageToGoogleCloudStorageOperator that I am 
> only allowed to specify a single object to list is rather arbitrary. If I 
> specify a wildcard, all it does is split at the wildcard and use the result 
> as a prefix and delimiter. Why not just let me do this search myself and 
> return a list of objects?
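For context, the wildcard handling described above amounts to splitting the object name at the first `*`; a rough sketch of that split (not the operator's exact code):

```python
def split_wildcard(source_object):
    # Split "data/2020/*.csv" into a prefix and delimiter, roughly as a
    # wildcard copy would derive its list() arguments.
    prefix, _, delimiter = source_object.partition("*")
    return prefix, delimiter

print(split_wildcard("data/2020/*.csv"))  # ('data/2020/', '.csv')
```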





[jira] [Closed] (AIRFLOW-3381) KubernetesPodOperator: Use secretKeyRef or configMapKeyRef in env_vars

2020-03-17 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk closed AIRFLOW-3381.
-
Resolution: Duplicate

> KubernetesPodOperator: Use secretKeyRef or configMapKeyRef in env_vars
> --
>
> Key: AIRFLOW-3381
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3381
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: operators
>Affects Versions: 1.10.0
>Reporter: Arthur Brenaut
>Priority: Major
>  Labels: gsoc, gsoc2020, kubernetes, mentor
>
> The env_vars attribute of the KubernetesPodOperator allows to pass 
> environment variables as string but it doesn't allows to pass a value from a 
> configmap or a secret.
> I'd like to be able to do
> {code:java}
> modeling = KubernetesPodOperator(
>     ...
>     env_vars={
>         'MY_ENV_VAR': {
>             'valueFrom': {
>                 'secretKeyRef': {
>                     'name': 'an-already-existing-secret',
>                     'key': 'key',
>                 }
>             }
>         }
>     },
>     ...
> )
> {code}
> Right now if I do that, Airflow generates the following config
> {code:java}
> - name: MY_ENV_VAR
>   value:
>valueFrom:
> configMapKeyRef:
>  name: an-already-existing-secret
>  key: key
> {code}
> instead of 
> {code:java}
> - name: MY_ENV_VAR
>   valueFrom:
>configMapKeyRef:
> name: an-already-existing-secret
> key: key
> {code}
> The _extract_env_and_secrets_ method of the _KubernetesRequestFactory_ could 
> check if the value is a dictionary and use it directly.
>  
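The suggested check could look roughly like this (a sketch only, not the actual KubernetesRequestFactory code): dict values are spliced into the env entry directly, while strings keep the current behaviour.

```python
def build_env_entry(name, value):
    # Hypothetical: a dict such as {'valueFrom': {...}} becomes part of the
    # entry itself instead of being nested under 'value'.
    if isinstance(value, dict):
        return {"name": name, **value}
    return {"name": name, "value": str(value)}

entry = build_env_entry("MY_ENV_VAR", {
    "valueFrom": {"secretKeyRef": {"name": "an-already-existing-secret", "key": "key"}}
})
print(entry["valueFrom"]["secretKeyRef"]["key"])  # key
```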





[jira] [Resolved] (AIRFLOW-4175) S3Hook load_file should support ACL policy parameter

2020-03-16 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-4175.
---
Fix Version/s: 1.10.10
   Resolution: Fixed

> S3Hook load_file should support ACL policy parameter
> 
>
> Key: AIRFLOW-4175
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4175
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: hooks
>Affects Versions: 1.10.2
>Reporter: Keith O'Brien
>Assignee: Omair Khan
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
> Fix For: 1.10.10
>
>
> We have a use case where we are uploading files to an S3 bucket in a 
> different AWS account from the one Airflow is running in. AWS S3 supports 
> this situation using a pre-canned ACL policy, specifically 
> {{bucket-owner-full-control}}.
> However, the current implementations of the {{S3Hook.load_*}}() and 
> {{S3Hook.copy_object}}() methods do not allow us to supply any ACL policy for 
> the file being uploaded/copied to S3.
> It would be good to add another optional parameter to the {{S3Hook}} methods 
> called {{acl_policy}}, which would then be passed into the boto3 client 
> method calls like so:
>  
> {code}
> # load_file
> ...
> if encrypt:
>     extra_args['ServerSideEncryption'] = "AES256"
> if acl_policy:
>     extra_args['ACL'] = acl_policy
> client.upload_file(filename, bucket_name, key, ExtraArgs=extra_args){code}
>  
> {code}
> # load_bytes
> ...
> if encrypt:
>     extra_args['ServerSideEncryption'] = "AES256"
> if acl_policy:
>     extra_args['ACL'] = acl_policy
> client.upload_fileobj(file_obj, bucket_name, key, ExtraArgs=extra_args){code}
> {code}
> # copy_object
> self.get_conn().copy_object(Bucket=dest_bucket_name,
>                             Key=dest_bucket_key,
>                             CopySource=CopySource,
>                             ACL=acl_policy)
> {code}
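The shared pattern in the snippets above can be factored into one helper; a sketch under the assumption that both encryption and the proposed acl_policy parameter feed the same ExtraArgs dict:

```python
def build_extra_args(encrypt=False, acl_policy=None):
    # Mirrors the proposed snippets: each option contributes one ExtraArgs key.
    extra_args = {}
    if encrypt:
        extra_args["ServerSideEncryption"] = "AES256"
    if acl_policy:
        extra_args["ACL"] = acl_policy
    return extra_args

print(build_extra_args(encrypt=True, acl_policy="bucket-owner-full-control"))
# {'ServerSideEncryption': 'AES256', 'ACL': 'bucket-owner-full-control'}
```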





[jira] [Resolved] (AIRFLOW-6987) Avoid creating default connections

2020-03-16 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6987.
---
Resolution: Fixed

> Avoid creating default connections
> --
>
> Key: AIRFLOW-6987
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6987
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: configuration
>Affects Versions: 1.10.9
>Reporter: Noël BARDELOT
>Priority: Minor
> Fix For: 1.10.10
>
>
> Add a new load_default_connections option to the core configuration of 
> Airflow, similar to load_examples but aimed at avoiding the creation of 
> default connections in airflow/utils/db.py.
> By default it should be backward-compatible:
> - the default behaviour should be the old behaviour (True)
> - the old behaviour should also be kept in the CI tests and other places 
> where the configuration is used
> The config option should be documented as new.
> See also in the Helm charts project, concerning the stable/airflow chart:
> > [stable/airflow] allow default connections to be removed
> https://github.com/helm/charts/issues/20568 (by @javamonkey79)
> https://github.com/helm/charts/pull/21018 (comment) (by @javamonkey79)
> > [stable/airflow] Add a feature to run extra init scripts in the scheduler 
> > after initdb
> https://github.com/helm/charts/pull/21047
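If added, the option would presumably be toggled from airflow.cfg; the section placement and default shown here are assumptions based on the description above, not a merged configuration:

```
[core]
# True keeps the old behaviour; False skips creating default connections.
load_default_connections = False
```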





[jira] [Commented] (AIRFLOW-6518) Task did not retry when there was temporary metastore db connectivity loss

2020-03-16 Thread Jarek Potiuk (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060059#comment-17060059
 ] 

Jarek Potiuk commented on AIRFLOW-6518:
---

Then 
[https://stackoverflow.com/questions/53287215/retry-failed-sqlalchemy-queries/53300049#53300049]
 seems like a good idea.
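The linked answer's idea, reduced to a hedged sketch: re-run the query callable a few times before giving up. Retrying sqlalchemy's OperationalError is the real-world target; a generic Exception is caught here only to keep the example self-contained.

```python
import time

def run_with_retry(fn, retries=3, delay=0.0):
    # Re-execute the query callable on transient failures, as the linked
    # StackOverflow approach does for temporary DB connectivity loss.
    for attempt in range(retries):
        try:
            return fn()
        except Exception:  # in practice: sqlalchemy.exc.OperationalError
            if attempt == retries - 1:
                raise
            time.sleep(delay)

calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("db gone")
    return "ok"

print(run_with_retry(flaky))  # ok
```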

> Task did not retry when there was temporary metastore db connectivity loss
> --
>
> Key: AIRFLOW-6518
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6518
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: database, scheduler
>Affects Versions: 1.10.6
>Reporter: t oo
>Priority: Major
>
> My DAG has retries configured at the task level. I started a dagrun; then, 
> while a task was running, the metastore db crashed and the task failed, but 
> the dagrun did not attempt to retry the task (even though task retries are 
> configured!). The db recovered 3 seconds after the task failed; instead of 
> retrying, the dagrun went to FAILED state.
> *Last part of log of TaskInstance:*
> [2020-01-08 17:34:46,301] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk Traceback (most recent call last):
> [2020-01-08 17:34:46,301] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File "/home/ec2-user/venv/bin/airflow", line 37, in <module>
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk args.func(args)
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/utils/cli.py", 
> line 74, in wrapper
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk return f(*args, **kwargs)
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/bin/cli.py", 
> line 551, in run
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk _run(args, dag, ti)
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/bin/cli.py", 
> line 469, in _run
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk pool=args.pool,
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/utils/db.py", 
> line 74, in wrapper
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk return func(*args, **kwargs)
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/models/taskinstance.py",
>  line 962, in _run_raw_task
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk self.refresh_from_db()
> [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/utils/db.py", 
> line 74, in wrapper
> [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk return func(*args, **kwargs)
> [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/models/taskinstance.py",
>  line 461, in refresh_from_db
> [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk ti = qry.first()
> [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File 
> "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/orm/query.py",
>  line 3265, in first
> [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk ret = list(self[0:1])
> [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File 
> "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/orm/query.py",
>  line 3043, in __getitem__
> [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk return list(res)
> [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File 
> "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/orm/query.py",
>  line 3367, in __iter__
> [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk return self._execute_and_instances(context)
> [2020-01-08 17:34:46,304] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File 
> "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/orm/query.py",
>  

[jira] [Assigned] (AIRFLOW-7069) Fix GCP Cloud SQL system tests

2020-03-16 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reassigned AIRFLOW-7069:
-

Assignee: Jarek Potiuk

> Fix GCP Cloud SQL system tests
> --
>
> Key: AIRFLOW-7069
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7069
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: gcp
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Major
>






[jira] [Created] (AIRFLOW-7069) Fix GCP Cloud SQL system tests

2020-03-16 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-7069:
-

 Summary: Fix GCP Cloud SQL system tests
 Key: AIRFLOW-7069
 URL: https://issues.apache.org/jira/browse/AIRFLOW-7069
 Project: Apache Airflow
  Issue Type: Bug
  Components: gcp
Affects Versions: 2.0.0
Reporter: Jarek Potiuk








[jira] [Created] (AIRFLOW-7067) Add apache-airflow-pinned version

2020-03-15 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-7067:
-

 Summary: Add apache-airflow-pinned version
 Key: AIRFLOW-7067
 URL: https://issues.apache.org/jira/browse/AIRFLOW-7067
 Project: Apache Airflow
  Issue Type: Improvement
  Components: ci
Affects Versions: 1.10.9, 2.0.0
Reporter: Jarek Potiuk


For the official Docker image we need a fixed set of requirements so that 
rebuilding the image is reproducible.

We need a -pinned version of apache-airflow for that.
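A pinned set is just a file of exact versions that the image build installs against; an illustrative fragment (the version numbers are examples, not a published constraint set):

```
# requirements-pinned.txt (illustrative)
apache-airflow==1.10.10
flask==1.1.1
sqlalchemy==1.3.15
```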





[jira] [Updated] (AIRFLOW-6588) json_format and write_stdout are boolean

2020-03-15 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-6588:
--
Affects Version/s: (was: 1.10.10)
   1.10.9

> json_format and write_stdout are boolean
> 
>
> Key: AIRFLOW-6588
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6588
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: logging
>Affects Versions: 1.10.9
>Reporter: Ping Zhang
>Assignee: Ping Zhang
>Priority: Minor
> Fix For: 1.10.10
>
>






[jira] [Resolved] (AIRFLOW-6588) json_format and write_stdout are boolean

2020-03-15 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6588.
---
Fix Version/s: 1.10.10
   Resolution: Fixed

> json_format and write_stdout are boolean
> 
>
> Key: AIRFLOW-6588
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6588
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: logging
>Affects Versions: 1.10.10
>Reporter: Ping Zhang
>Assignee: Ping Zhang
>Priority: Minor
> Fix For: 1.10.10
>
>






[jira] [Resolved] (AIRFLOW-6946) Switch to MySQL 5.7 in 2.0 as base

2020-03-14 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6946.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Switch to MySQL 5.7 in 2.0 as base
> --
>
> Key: AIRFLOW-6946
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6946
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0
>
>
> Switch to MySQL 5.7 in tests. 
> Also test utf8mb4 encoding





[jira] [Commented] (AIRFLOW-6518) Task did not retry when there was temporary metastore db connectivity loss

2020-03-14 Thread Jarek Potiuk (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17059433#comment-17059433
 ] 

Jarek Potiuk commented on AIRFLOW-6518:
---

I wonder if this is a real problem that we should take care of? [~ash]? I think 
in most cases Airflow runs in a corporate environment where DB connectivity is 
pretty much guaranteed. In our case the central database is the central point of 
synchronization and we rely a lot on its availability - all components rely on 
that central point, and losing connectivity even for a while is not the best 
idea.

Of course it is not a 100% approach, but maybe it is "good enough"?

> Task did not retry when there was temporary metastore db connectivity loss
> --
>
> Key: AIRFLOW-6518
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6518
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: database, scheduler
>Affects Versions: 1.10.6
>Reporter: t oo
>Priority: Major
>
> My DAG has retries configured at the task level. I started a dagrun; then, 
> while a task was running, the metastore db crashed and the task failed, but 
> the dagrun did not attempt to retry the task (even though task retries are 
> configured!). The db recovered 3 seconds after the task failed; instead of 
> retrying, the dagrun went to FAILED state.
> *Last part of log of TaskInstance:*
> [2020-01-08 17:34:46,301] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk Traceback (most recent call last):
> [2020-01-08 17:34:46,301] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File "/home/ec2-user/venv/bin/airflow", line 37, in <module>
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk args.func(args)
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/utils/cli.py", 
> line 74, in wrapper
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk return f(*args, **kwargs)
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/bin/cli.py", 
> line 551, in run
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk _run(args, dag, ti)
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/bin/cli.py", 
> line 469, in _run
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk pool=args.pool,
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/utils/db.py", 
> line 74, in wrapper
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk return func(*args, **kwargs)
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/models/taskinstance.py",
>  line 962, in _run_raw_task
> [2020-01-08 17:34:46,302] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk self.refresh_from_db()
> [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/utils/db.py", 
> line 74, in wrapper
> [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk return func(*args, **kwargs)
> [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/models/taskinstance.py",
>  line 461, in refresh_from_db
> [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk ti = qry.first()
> [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File 
> "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/orm/query.py",
>  line 3265, in first
> [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk ret = list(self[0:1])
> [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File 
> "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/orm/query.py",
>  line 3043, in __getitem__
> [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk return list(res)
> [2020-01-08 17:34:46,303] {base_task_runner.py:115} INFO - Job 34662: Subtask 
> mytsk   File 
> "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/orm/query.py",
>  

[jira] [Resolved] (AIRFLOW-7063) dag.clear() slowness caused by multiple UNION statements and tis.count()

2020-03-14 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7063.
---
Fix Version/s: 1.10.10
   Resolution: Fixed

> dag.clear() slowness caused by multiple UNION statements and tis.count()
> 
>
> Key: AIRFLOW-7063
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7063
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: webserver
>Affects Versions: 1.10.9
>Reporter: Qian Yu
>Assignee: Qian Yu
>Priority: Major
> Fix For: 1.10.10
>
>
> When multiple {{ExternalTaskMarker}} are used, {{dag.clear()}} becomes very 
> slow when clearing all the {{ExternalTaskMarker}} together. The slowness 
> turns out to come from this line of code in {{dag.clear()}}:
> {code:python}
> if dry_run:
> tis = tis.all()
> session.expunge_all()
> return tis
> count = tis.count()   <--- This line is the culprit
> do_it = True
> if count == 0:
> return 0
> {code}
> This is the SQL generated by {{tis.count()}} when three {{ExternalTaskMarker}}
> tasks are cleared together. Note that there is nothing wrong with the SQL
> itself; it is reasonably efficient when executed on Postgres even with many
> more UNION statements (e.g. 30 UNION statements take about 13 ms in the Docker
> container I started with Breeze).
>  But it takes more than three minutes for SQLAlchemy to construct this count
> query before it even reaches the database.
> The fix is simple: get rid of the count() and query all the entries from the
> db instead. The function becomes ten times faster when {{tis.count()}} is
> removed.
>  There are multiple places people are complaining about similar problems with 
> sqlalchemy count() being slower than the query itself. It does not look like 
> sqlalchemy is going to fix this issue:
>  
> [https://stackoverflow.com/questions/14754994/why-is-sqlalchemy-count-much-slower-than-the-raw-query]
>  [https://gist.github.com/hest/8798884]
>  
> {code:sql}
> [2020-03-14 09:42:50,264] {base.py:1203} INFO - SELECT count(*) AS count_1
> FROM (SELECT anon_2.anon_3_anon_4_task_instance_try_number AS 
> anon_2_anon_3_anon_4_task_instance_try_number, 
> anon_2.anon_3_anon_4_task_instance_task_id AS 
> anon_2_anon_3_anon_4_task_instance_task_id, 
> anon_2.anon_3_anon_4_task_instance_dag_id AS 
> anon_2_anon_3_anon_4_task_instance_dag_id, 
> anon_2.anon_3_anon_4_task_instance_execution_date AS 
> anon_2_anon_3_anon_4_task_instance_execution_date, 
> anon_2.anon_3_anon_4_task_instance_start_date AS 
> anon_2_anon_3_anon_4_task_instance_start_date, 
> anon_2.anon_3_anon_4_task_instance_end_date AS 
> anon_2_anon_3_anon_4_task_instance_end_date, 
> anon_2.anon_3_anon_4_task_instance_duration AS 
> anon_2_anon_3_anon_4_task_instance_duration, 
> anon_2.anon_3_anon_4_task_instance_state AS 
> anon_2_anon_3_anon_4_task_instance_state, 
> anon_2.anon_3_anon_4_task_instance_max_tries AS 
> anon_2_anon_3_anon_4_task_instance_max_tries, 
> anon_2.anon_3_anon_4_task_instance_hostname AS 
> anon_2_anon_3_anon_4_task_instance_hostname, 
> anon_2.anon_3_anon_4_task_instance_unixname AS 
> anon_2_anon_3_anon_4_task_instance_unixname, 
> anon_2.anon_3_anon_4_task_instance_job_id AS 
> anon_2_anon_3_anon_4_task_instance_job_id, 
> anon_2.anon_3_anon_4_task_instance_pool AS 
> anon_2_anon_3_anon_4_task_instance_pool, 
> anon_2.anon_3_anon_4_task_instance_pool_slots AS 
> anon_2_anon_3_anon_4_task_instance_pool_slots, 
> anon_2.anon_3_anon_4_task_instance_queue AS 
> anon_2_anon_3_anon_4_task_instance_queue, 
> anon_2.anon_3_anon_4_task_instance_priority_weight AS 
> anon_2_anon_3_anon_4_task_instance_priority_weight, 
> anon_2.anon_3_anon_4_task_instance_operator AS 
> anon_2_anon_3_anon_4_task_instance_operator, 
> anon_2.anon_3_anon_4_task_instance_queued_dttm AS 
> anon_2_anon_3_anon_4_task_instance_queued_dttm, 
> anon_2.anon_3_anon_4_task_instance_pid AS 
> anon_2_anon_3_anon_4_task_instance_pid, 
> anon_2.anon_3_anon_4_task_instance_executor_config AS 
> anon_2_anon_3_anon_4_task_instance_executor_config
> FROM (SELECT anon_3.anon_4_task_instance_try_number AS 
> anon_3_anon_4_task_instance_try_number, anon_3.anon_4_task_instance_task_id 
> AS anon_3_anon_4_task_instance_task_id, anon_3.anon_4_task_instance_dag_id AS 
> anon_3_anon_4_task_instance_dag_id, 
> anon_3.anon_4_task_instance_execution_date AS 
> anon_3_anon_4_task_instance_execution_date, 
> anon_3.anon_4_task_instance_start_date AS 
> anon_3_anon_4_task_instance_start_date, anon_3.anon_4_task_instance_end_date 
> AS anon_3_anon_4_task_instance_end_date, anon_3.anon_4_task_instance_duration 
> AS anon_3_anon_4_task_instance_duration, 
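
The fix described above (dropping the separate count() query and counting the
fetched rows instead) can be sketched against a toy SQLAlchemy model. The
TaskInstance class and in-memory SQLite database below are illustrative
stand-ins, not Airflow's real model:

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


# Illustrative stand-in for Airflow's TaskInstance model, not the real one.
class TaskInstance(Base):
    __tablename__ = "task_instance"
    id = Column(Integer, primary_key=True)
    state = Column(String)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add_all([TaskInstance(state="success"), TaskInstance(state="failed")])
session.commit()

tis = session.query(TaskInstance)

# Slow pattern: tis.count() wraps the full SELECT (in the bug report, a huge
# chain of UNIONs) inside a "SELECT count(*) FROM (...)" subquery, which
# SQLAlchemy must render in Python before anything reaches the database.
# count = tis.count()

# Pattern from the fix: run the query once and count the fetched rows.
results = tis.all()
count = len(results)
```

The point is not the database time but the Python-side rendering time: count()
forces SQLAlchemy to re-render the entire statement inside a count(*) subquery,
while all() renders it exactly once and reuses the fetched rows.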

[jira] [Commented] (AIRFLOW-7063) dag.clear() slowness caused by multiple UNION statements and tis.count()

2020-03-14 Thread Jarek Potiuk (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17059326#comment-17059326
 ] 

Jarek Potiuk commented on AIRFLOW-7063:
---

Interesting threads at SO!

> dag.clear() slowness caused by multiple UNION statements and tis.count()
> 
>
> Key: AIRFLOW-7063
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7063
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: webserver
>Affects Versions: 1.10.9
>Reporter: Qian Yu
>Assignee: Qian Yu
>Priority: Major
>

[jira] [Commented] (AIRFLOW-7063) dag.clear() slowness caused by multiple UNION statements and tis.count()

2020-03-14 Thread Jarek Potiuk (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17059325#comment-17059325
 ] 

Jarek Potiuk commented on AIRFLOW-7063:
---

Whoa. What a surprise :)

> dag.clear() slowness caused by multiple UNION statements and tis.count()
> 
>
> Key: AIRFLOW-7063
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7063
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: webserver
>Affects Versions: 1.10.9
>Reporter: Qian Yu
>Assignee: Qian Yu
>Priority: Major
>

[jira] [Resolved] (AIRFLOW-5705) add option for alternative creds backend

2020-03-14 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-5705.
---
Fix Version/s: 1.10.10
   Resolution: Fixed

> add option for alternative creds backend
> 
>
> Key: AIRFLOW-5705
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5705
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.10.5
>Reporter: Daniel Standish
>Assignee: Daniel Standish
>Priority: Major
> Fix For: 1.10.10
>
>
> The idea here is to create a generic credentials backend that could support
> other credential stores, such as AWS SSM Parameter Store.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (AIRFLOW-7061) Rename openfass to openfaas

2020-03-14 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7061.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Rename openfass to openfaas
> ---
>
> Key: AIRFLOW-7061
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7061
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 2.0.0, 1.10.10
>Reporter: Bas Harenslak
>Priority: Minor
> Fix For: 2.0.0
>
>






[jira] [Updated] (AIRFLOW-7029) Move license check to dedicated image

2020-03-14 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-7029:
--
Fix Version/s: (was: 2.0.0)
   1.10.9

> Move license check to dedicated image
> -
>
> Key: AIRFLOW-7029
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7029
> Project: Apache Airflow
>  Issue Type: Sub-task
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Ash Berlin-Taylor
>Assignee: Ash Berlin-Taylor
>Priority: Minor
> Fix For: 1.10.9
>
>
> Apache Rat is baked into the CI image -- instead we should make it use a
> (tiny) standalone image.





[jira] [Updated] (AIRFLOW-4762) Test against Python 3.8

2020-03-14 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-4762:
--
Affects Version/s: (was: 1.10.3)
   2.0.0

> Test against Python 3.8
> ---
>
> Key: AIRFLOW-4762
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4762
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Philippe Gagnon
>Assignee: Jarek Potiuk
>Priority: Major
>  Labels: ci, test
>
> Airflow is currently tested against Python 3.5 only. This may be insufficient,
> as users may wish to run Airflow on newer Python versions, which may introduce
> breaking changes.





[jira] [Resolved] (AIRFLOW-7062) New release of pydruid (0.5.9) breaks Airflow installation

2020-03-14 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7062.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> New release of pydruid (0.5.9) breaks Airflow installation 
> ---
>
> Key: AIRFLOW-7062
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7062
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ci
>Affects Versions: 2.0.0, 1.10.9
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0
>
>






[jira] [Updated] (AIRFLOW-7062) New release of pydruid (0.5.9) breaks Airflow installation

2020-03-14 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-7062:
--
Issue Type: Bug  (was: Improvement)

> New release of pydruid (0.5.9) breaks Airflow installation 
> ---
>
> Key: AIRFLOW-7062
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7062
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ci
>Affects Versions: 2.0.0, 1.10.9
>Reporter: Jarek Potiuk
>Priority: Major
>






[jira] [Created] (AIRFLOW-7062) New release of pydruid (0.5.9) breaks Airflow installation

2020-03-14 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-7062:
-

 Summary: New release of pydruid (0.5.9) breaks Airflow 
installation 
 Key: AIRFLOW-7062
 URL: https://issues.apache.org/jira/browse/AIRFLOW-7062
 Project: Apache Airflow
  Issue Type: Improvement
  Components: ci
Affects Versions: 1.10.9, 2.0.0
Reporter: Jarek Potiuk








[jira] [Resolved] (AIRFLOW-7055) Add verbose/HTTP logging option in GCP

2020-03-14 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7055.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Add verbose/HTTP logging option in GCP 
> ---
>
> Key: AIRFLOW-7055
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7055
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0
>
>






[jira] [Created] (AIRFLOW-7058) Add support for DB versions in CI

2020-03-13 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-7058:
-

 Summary: Add support for DB versions in CI
 Key: AIRFLOW-7058
 URL: https://issues.apache.org/jira/browse/AIRFLOW-7058
 Project: Apache Airflow
  Issue Type: Improvement
  Components: ci
Affects Versions: 1.10.9, 2.0.0
Reporter: Jarek Potiuk








[jira] [Resolved] (AIRFLOW-6724) Add Google Analytics 360 Accounts Retrieve Operator

2020-03-13 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6724.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Add Google Analytics 360 Accounts Retrieve Operator
> ---
>
> Key: AIRFLOW-6724
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6724
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: api
>Affects Versions: 1.10.8
>Reporter: Michał Słowikowski
>Assignee: Michał Słowikowski
>Priority: Minor
> Fix For: 2.0.0
>
>
> Add new operator related to Google Analytics 360 – Accounts Retrieve Operator.
>  





[jira] [Resolved] (AIRFLOW-7056) Backport packages might be prepared selectively

2020-03-13 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7056.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Backport packages might be prepared selectively
> ---
>
> Key: AIRFLOW-7056
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7056
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: backport-packages
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0
>
>






[jira] [Created] (AIRFLOW-7056) Backport packages might be prepared selectively

2020-03-13 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-7056:
-

 Summary: Backport packages might be prepared selectively
 Key: AIRFLOW-7056
 URL: https://issues.apache.org/jira/browse/AIRFLOW-7056
 Project: Apache Airflow
  Issue Type: Improvement
  Components: backport-packages
Affects Versions: 2.0.0
Reporter: Jarek Potiuk








[jira] [Created] (AIRFLOW-7055) Add verbose/HTTP logging option in GCP

2020-03-13 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-7055:
-

 Summary: Add verbose/HTTP logging option in GCP 
 Key: AIRFLOW-7055
 URL: https://issues.apache.org/jira/browse/AIRFLOW-7055
 Project: Apache Airflow
  Issue Type: Improvement
  Components: gcp
Affects Versions: 2.0.0
Reporter: Jarek Potiuk








[jira] [Updated] (AIRFLOW-7054) Add --reset-db option to Breeze

2020-03-13 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-7054:
--
Component/s: (was: ci)
 breeze

> Add --reset-db option to Breeze
> ---
>
> Key: AIRFLOW-7054
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7054
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Affects Versions: 2.0.0, 1.10.9
>Reporter: Jarek Potiuk
>Priority: Major
>






[jira] [Created] (AIRFLOW-7054) Add --reset-db option to Breeze

2020-03-13 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-7054:
-

 Summary: Add --reset-db option to Breeze
 Key: AIRFLOW-7054
 URL: https://issues.apache.org/jira/browse/AIRFLOW-7054
 Project: Apache Airflow
  Issue Type: Improvement
  Components: ci
Affects Versions: 1.10.9, 2.0.0
Reporter: Jarek Potiuk








[jira] [Updated] (AIRFLOW-7047) build-providers-dependencies pre-commit fails on mac

2020-03-12 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-7047:
--
Fix Version/s: (was: 1.10.10)
   2.0.0

> build-providers-dependencies pre-commit fails on mac
> 
>
> Key: AIRFLOW-7047
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7047
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: pre-commit
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0
>
>






[jira] [Resolved] (AIRFLOW-7047) build-providers-dependencies pre-commit fails on mac

2020-03-12 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7047.
---
Fix Version/s: 1.10.10
   Resolution: Fixed

> build-providers-dependencies pre-commit fails on mac
> 
>
> Key: AIRFLOW-7047
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7047
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: pre-commit
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.10
>
>






[jira] [Created] (AIRFLOW-7047) build-providers-dependencies pre-commit fails on mac

2020-03-12 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-7047:
-

 Summary: build-providers-dependencies pre-commit fails on mac
 Key: AIRFLOW-7047
 URL: https://issues.apache.org/jira/browse/AIRFLOW-7047
 Project: Apache Airflow
  Issue Type: Bug
  Components: pre-commit
Affects Versions: 2.0.0
Reporter: Jarek Potiuk








[jira] [Resolved] (AIRFLOW-6481) SalesforceHook attempts to use .str accessor on object dtype

2020-03-12 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6481.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> SalesforceHook attempts to use .str accessor on object dtype
> 
>
> Key: AIRFLOW-6481
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6481
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks
>Affects Versions: 1.10.7
>Reporter: Teddy Hartanto
>Assignee: Teddy Hartanto
>Priority: Minor
> Fix For: 2.0.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> I've searched through Airflow's issues and couldn't find any report regarding 
> this. I wonder if I'm the only one who's facing this? 
> {noformat}
> Pandas version: 0.24.2{noformat}
> *Bug description*
> I'm using the SalesforceHook to fetch data from Salesforce, and I encountered
> this exception:
> {code:java}
> AttributeError: ('Can only use .str accessor with string values, which use 
> np.object_ dtype in pandas', ...)
> {code}
> The root of the problem is that some objects in Salesforce have columns with a
> compound data type. E.g. a User's address is a Python dict:
> {code:java}
> : {'city': None, 'country': 'my', 'geocodeAccuracy': None, 
> 'latitude': None, 'longitude': None, 'postalCode': None, 'state': None, 
> 'street': None}{code}
> The problematic code is here:
> {code:java}
> if fmt == "csv":
> # there are also a ton of newline objects
> # that mess up our ability to write to csv
> # we remove these newlines so that the output is a valid CSV format
> self.log.info("Cleaning data and writing to CSV")
> possible_strings = df.columns[df.dtypes == "object"]
> df[possible_strings] = df[possible_strings].apply(
> lambda x: x.str.replace("\r\n", "")
> )
> df[possible_strings] = df[possible_strings].apply(
> lambda x: x.str.replace("\n", "")
> )
> # write the dataframe
> df.to_csv(filename, index=False)
> {code}
> Because a Series containing Python dicts is also of dtype object, it is
> assumed to be one of the "possible_strings". Then, when .str is called on
> that Series, the exception is thrown.
> To fix it, we could explicitly cast the object type to string as such: 
> {code:java}
> if fmt == "csv":
> ...
> df[possible_strings] = df[possible_strings].astype(str).apply(
> lambda x: x.str.replace("\r\n", "")
> )
> df[possible_strings] = df[possible_strings].astype(str).apply(
> lambda x: x.str.replace("\n", "")
> )
> {code}
> I've tested this and it works for me. Could somebody help me verify that the 
> type conversion is indeed needed? If yes, I'm keen to submit a PR to fix this 
> with the unit test included.
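
A minimal sketch of the failure mode and the proposed astype(str) workaround,
using a toy DataFrame rather than real Salesforce data (the column names here
are made up for illustration):

```python
import pandas as pd

# Toy frame: one genuine string column and one "compound" column of dicts.
# Both have dtype "object", so both get picked up as possible_strings.
df = pd.DataFrame({
    "name": ["line1\r\nline2"],
    "address": [{"city": None, "country": "my"}],
})

possible_strings = df.columns[df.dtypes == "object"]

# The original hook code calls .str.replace directly, which raises
# AttributeError on the dict-valued column. Casting to str first, as the
# proposed fix does, makes the .str accessor safe on every object column.
cleaned = df[possible_strings].astype(str).apply(
    lambda x: x.str.replace("\r\n", "", regex=False)
)
```

The trade-off of the workaround is that dict values are serialized via their
str() representation, which is what ends up in the CSV output.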





[jira] [Resolved] (AIRFLOW-7041) Bowler required to be installed in pre-commit

2020-03-11 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7041.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Bowler required to be installed in pre-commit
> -
>
> Key: AIRFLOW-7041
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7041
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: pre-commit
>Affects Versions: 1.10.9
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0
>
>






[jira] [Resolved] (AIRFLOW-7040) Move tests/contrib/utils files to test/utils

2020-03-11 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7040.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Move tests/contrib/utils files to test/utils
> 
>
> Key: AIRFLOW-7040
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7040
> Project: Apache Airflow
>  Issue Type: Sub-task
>  Components: contrib
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0
>
>






[jira] [Updated] (AIRFLOW-7040) Move tests/contrib/utils files to test/utils

2020-03-11 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-7040:
--
Parent: AIRFLOW-4733
Issue Type: Sub-task  (was: Improvement)

> Move tests/contrib/utils files to test/utils
> 
>
> Key: AIRFLOW-7040
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7040
> Project: Apache Airflow
>  Issue Type: Sub-task
>  Components: contrib
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
>






[jira] [Created] (AIRFLOW-7042) Example causes migration scripts to show errors

2020-03-11 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-7042:
-

 Summary: Example causes migration scripts to show errors
 Key: AIRFLOW-7042
 URL: https://issues.apache.org/jira/browse/AIRFLOW-7042
 Project: Apache Airflow
  Issue Type: Bug
  Components: database
Affects Versions: 1.10.9
Reporter: Jarek Potiuk


When trying to initialize a fresh db, you get this (harmless) error by default
(with load_defaults=True).

This should be fixed, as it is misleading to users.

 

INFO [alembic.runtime.migration] Running upgrade 127d2bf2dfa7 -> cc1e65623dc7, 
add max tries column to task instance
ERROR [airflow.models.dagbag.DagBag] Failed to import: 
/usr/local/lib/python3.6/site-packages/airflow/example_dags/example_subdag_operator.py
Traceback (most recent call last):
 File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 
1246, in _execute_context
 cursor, statement, parameters, context
 File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/default.py", 
line 588, in do_execute
 cursor.execute(statement, parameters)
psycopg2.errors.UndefinedTable: relation "slot_pool" does not exist
LINE 2: FROM slot_pool 
 ^


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
 File "/usr/local/lib/python3.6/site-packages/airflow/models/dagbag.py", line 
204, in process_file
 m = imp.load_source(mod_name, filepath)
 File "/usr/local/lib/python3.6/imp.py", line 172, in load_source
 module = _load(spec)
 File "<frozen importlib._bootstrap>", line 684, in _load
 File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
 File "<frozen importlib._bootstrap_external>", line 678, in exec_module
 File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
 File 
"/usr/local/lib/python3.6/site-packages/airflow/example_dags/example_subdag_operator.py",
 line 47, in <module>
 dag=dag,
 File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 74, in 
wrapper
 return func(*args, **kwargs)
 File "/usr/local/lib/python3.6/site-packages/airflow/utils/decorators.py", 
line 98, in wrapper
 result = func(*args, **kwargs)
 File 
"/usr/local/lib/python3.6/site-packages/airflow/operators/subdag_operator.py", 
line 77, in __init__
 .filter(Pool.pool == self.pool)
 File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 
3287, in first
 ret = list(self[0:1])
 File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 
3065, in __getitem__
 return list(res)
 File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 
3389, in __iter__
 return self._execute_and_instances(context)
 File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 
3414, in _execute_and_instances
 result = conn.execute(querycontext.statement, self._params)
 File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 
982, in execute
 return meth(self, multiparams, params)
 File "/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/elements.py", line 
293, in _execute_on_connection
 return connection._execute_clauseelement(self, multiparams, params)
 File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 
1101, in _execute_clauseelement
 distilled_params,
 File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 
1250, in _execute_context
 e, statement, parameters, cursor, context
 File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 
1476, in _handle_dbapi_exception
 util.raise_from_cause(sqlalchemy_exception, exc_info)
 File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 
398, in raise_from_cause
 reraise(type(exception), exception, tb=exc_tb, cause=cause)
 File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 
152, in reraise
 raise value.with_traceback(tb)
 File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 
1246, in _execute_context
 cursor, statement, parameters, context
 File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/default.py", 
line 588, in do_execute
 cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedTable) relation 
"slot_pool" does not exist
LINE 2: FROM slot_pool 
 ^

[SQL: SELECT slot_pool.id AS slot_pool_id, slot_pool.pool AS slot_pool_pool, 
slot_pool.slots AS slot_pool_slots, slot_pool.description AS 
slot_pool_description 
FROM slot_pool 
WHERE slot_pool.slots = %(slots_1)s AND slot_pool.pool = %(pool_1)s 
 LIMIT %(param_1)s]
[parameters: {'slots_1': 1, 'pool_1': 'default_pool', 'param_1': 1}]
(Background on this error at: http://sqlalche.me/e/f405)
INFO [alembic.runtime.migration] Running upgrade cc1e65623dc7 -> bdaa763e6c56, 
Make xcom value column a large binary
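The error happens because example_subdag_operator.py queries the slot_pool table while the DagBag is filled mid-migration, i.e. before the migration that creates that table has run. A minimal sketch of the failure mode and a catalog-based guard, using stdlib sqlite3 as a stand-in for the Postgres metadata database (the helper below is hypothetical, not Airflow code):

```python
import sqlite3

def table_exists(conn, name):
    """Check the schema catalog instead of querying the table directly."""
    row = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' AND name=?",
        (name,),
    ).fetchone()
    return row is not None

conn = sqlite3.connect(":memory:")  # fresh DB: mid-migration state
# Selecting from slot_pool now would raise, the sqlite analogue of
# psycopg2's UndefinedTable; the catalog check does not.
assert not table_exists(conn, "slot_pool")

# Once the relevant migration creates the table, the lookup succeeds.
conn.execute("CREATE TABLE slot_pool (id INTEGER PRIMARY KEY, pool TEXT)")
assert table_exists(conn, "slot_pool")
```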



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (AIRFLOW-7042) Example causes migration scripts to show errors

2020-03-11 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reassigned AIRFLOW-7042:
-

Assignee: Jarek Potiuk

> Example causes migration scripts to show errors
> ---
>
> Key: AIRFLOW-7042
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7042
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: database
>Affects Versions: 1.10.9
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Major
>
> When trying to install a fresh db you get this (harmless) error by default 
> (with load_defaults=True).
>  
> This should be fixed as it is misleading to users.
>  
> INFO [alembic.runtime.migration] Running upgrade 127d2bf2dfa7 -> 
> cc1e65623dc7, add max tries column to task instance
> ERROR [airflow.models.dagbag.DagBag] Failed to import: 
> /usr/local/lib/python3.6/site-packages/airflow/example_dags/example_subdag_operator.py
> Traceback (most recent call last):
>  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", 
> line 1246, in _execute_context
>  cursor, statement, parameters, context
>  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/default.py", 
> line 588, in do_execute
>  cursor.execute(statement, parameters)
> psycopg2.errors.UndefinedTable: relation "slot_pool" does not exist
> LINE 2: FROM slot_pool 
>  ^
> The above exception was the direct cause of the following exception:
> Traceback (most recent call last):
>  File "/usr/local/lib/python3.6/site-packages/airflow/models/dagbag.py", line 
> 204, in process_file
>  m = imp.load_source(mod_name, filepath)
>  File "/usr/local/lib/python3.6/imp.py", line 172, in load_source
>  module = _load(spec)
>  File "<frozen importlib._bootstrap>", line 684, in _load
>  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
>  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
>  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
>  File 
> "/usr/local/lib/python3.6/site-packages/airflow/example_dags/example_subdag_operator.py",
>  line 47, in <module>
>  dag=dag,
>  File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 74, 
> in wrapper
>  return func(*args, **kwargs)
>  File "/usr/local/lib/python3.6/site-packages/airflow/utils/decorators.py", 
> line 98, in wrapper
>  result = func(*args, **kwargs)
>  File 
> "/usr/local/lib/python3.6/site-packages/airflow/operators/subdag_operator.py",
>  line 77, in __init__
>  .filter(Pool.pool == self.pool)
>  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 
> 3287, in first
>  ret = list(self[0:1])
>  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 
> 3065, in __getitem__
>  return list(res)
>  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 
> 3389, in __iter__
>  return self._execute_and_instances(context)
>  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 
> 3414, in _execute_and_instances
>  result = conn.execute(querycontext.statement, self._params)
>  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", 
> line 982, in execute
>  return meth(self, multiparams, params)
>  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/elements.py", 
> line 293, in _execute_on_connection
>  return connection._execute_clauseelement(self, multiparams, params)
>  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", 
> line 1101, in _execute_clauseelement
>  distilled_params,
>  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", 
> line 1250, in _execute_context
>  e, statement, parameters, cursor, context
>  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", 
> line 1476, in _handle_dbapi_exception
>  util.raise_from_cause(sqlalchemy_exception, exc_info)
>  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py", 
> line 398, in raise_from_cause
>  reraise(type(exception), exception, tb=exc_tb, cause=cause)
>  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py", 
> line 152, in reraise
>  raise value.with_traceback(tb)
>  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", 
> line 1246, in _execute_context
>  cursor, statement, parameters, context
>  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/default.py", 
> line 588, in do_execute
>  cursor.execute(statement, parameters)
> sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedTable) relation 
> "slot_pool" does not exist
> LINE 2: FROM slot_pool 
>  ^
> [SQL: SELECT slot_pool.id AS slot_pool_id, slot_pool.pool AS slot_pool_pool, 
> slot_pool.slots AS slot_pool_slots, slot_pool.description AS 
> slot_pool_description 
> FROM slot_pool 
> WHERE slot_pool.slots = %(slots_1)s AND slot_pool.pool = %(pool_1)s 
>  LIMIT %(param_1)s]
> 

[jira] [Updated] (AIRFLOW-7041) Bowler required to be installed in pre-commit

2020-03-11 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-7041:
--
Affects Version/s: (was: 2.0.0)

> Bowler required to be installed in pre-commit
> -
>
> Key: AIRFLOW-7041
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7041
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: pre-commit
>Affects Versions: 1.10.9
>Reporter: Jarek Potiuk
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (AIRFLOW-7041) Bowler required to be installed in pre-commit

2020-03-11 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-7041:
-

 Summary: Bowler required to be installed in pre-commit
 Key: AIRFLOW-7041
 URL: https://issues.apache.org/jira/browse/AIRFLOW-7041
 Project: Apache Airflow
  Issue Type: Bug
  Components: pre-commit
Affects Versions: 1.10.9, 2.0.0
Reporter: Jarek Potiuk






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (AIRFLOW-7040) Move tests/contrib/utils files to test/utils

2020-03-11 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-7040:
-

 Summary: Move tests/contrib/utils files to test/utils
 Key: AIRFLOW-7040
 URL: https://issues.apache.org/jira/browse/AIRFLOW-7040
 Project: Apache Airflow
  Issue Type: Improvement
  Components: contrib
Affects Versions: 2.0.0
Reporter: Jarek Potiuk






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (AIRFLOW-7039) Specific DAG Schedule & DST Results in Skipped DAG Run

2020-03-11 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reassigned AIRFLOW-7039:
-

Assignee: Jarek Potiuk

> Specific DAG Schedule & DST Results in Skipped DAG Run
> --
>
> Key: AIRFLOW-7039
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7039
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.10.7
> Environment: Amazon Linux 2 AMI
>Reporter: Peter Kim
>Assignee: Jarek Potiuk
>Priority: Critical
>  Labels: timezone
>
> *Scenario:* 
> EC2 running airflow is in Eastern Time (America/New_York), 
> airflow.cfg>[core]>default_timezone=America/New_York (automatically changes 
> correctly)
> Monday morning after Daylight Saving Time was applied, a handful of DAG runs 
> were not executed as expected. The strange part is that these DAGs were the 
> only jobs that did not behave as expected; all other DAGs ran normally. 
> Additionally, only the first expected run after DST was skipped; subsequent 
> runs later that day were scheduled successfully.
> Here is the pattern observed:
> DAG schedule which skipped its first run:  (0 , * * 1,2,3,4,5)
> e.g. schedules M-F, with two distinct runs per day.
> DAGs that run at one time M-F, and DAGs that run at two times but not M-F, 
> did not experience this issue.
>  
> Based on the logs, it appears as if the expected run that was missed was not 
> seen by the scheduler whatsoever (see below):
>  
>  
> 2020-03-06 6:30 AM ET (BEFORE DST, EXPECTED BEHAVIOR):
> [2020-03-06 06:31:01,220] {logging_mixin.py:112} INFO - [2020-03-06 
> 06:31:01,220] {settings.py:254} INFO - settings.configure_orm(): Using pool 
> settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=697
> [2020-03-06 06:31:01,222] {scheduler_job.py:153} INFO - Started process 
> (PID=697) to work on /home/ec2-user/airflow/s3fuse/dags/.py
> [2020-03-06 06:31:01,228] {scheduler_job.py:1539} INFO - Processing file 
> /home/ec2-user/airflow/s3fuse/dags/.py for tasks to queue
> [2020-03-06 06:31:01,228] {logging_mixin.py:112} INFO - [2020-03-06 
> 06:31:01,228] {dagbag.py:403} INFO - Filling up the DagBag from 
> /home/ec2-user/airflow/s3fuse/dags/.py
> [2020-03-06 06:31:01,238] {scheduler_job.py:1551} INFO - DAG(s) 
> dict_keys(['']) retrieved from 
> /home/ec2-user/airflow/s3fuse/dags/.py
> [2020-03-06 06:31:01,305] {scheduler_job.py:1262} INFO - Processing 
> 
> [2020-03-06 06:31:01,348] {logging_mixin.py:112} WARNING - 
> /home/ec2-user/venv/lib64/python3.7/site-packages/pymysql/cursors.py:170: 
> Warning: (1300, "Invalid utf8mb4 character string: '80047D'") result = 
> self._query(query)
> [2020-03-06 06:31:01,362] {scheduler_job.py:1272} INFO - Created   @ 2020-03-05T15:30:00+00:00: scheduled__2020-03-05T15:30:00+00:00, 
> externally triggered: False>
> [2020-03-06 06:31:01,366] {scheduler_job.py:740} INFO - Examining DAG run 
>  @ 2020-03-05 15:30:00+00:00: 
> scheduled__2020-03-05T15:30:00+00:00, externally triggered: False>
> [2020-03-06 06:31:01,389] {scheduler_job.py:440} INFO - Skipping SLA check 
> for > because no tasks in DAG have SLAs
> [2020-03-06 06:31:01,395] {scheduler_job.py:1613} INFO - Creating / updating 
> . 2020-03-05 15:30:00+00:00 [scheduled]> 
> in ORM
> [2020-03-06 06:31:01,414] {scheduler_job.py:161} INFO - Processing 
> /home/ec2-user/airflow/s3fuse/dags/.py took 0.192 seconds
> 2020-03-06 10 AM ET (BEFORE DST, EXPECTED BEHAVIOR):
> [2020-03-06 10:30:00,083] {logging_mixin.py:112} INFO - [2020-03-06 
> 10:30:00,082] {settings.py:254} INFO - settings.configure_orm(): Using pool 
> settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=16194
> [2020-03-06 10:30:00,085] {scheduler_job.py:153} INFO - Started process 
> (PID=16194) to work on /home/ec2-user/airflow/s3fuse/dags/.py
> [2020-03-06 10:30:00,090] {scheduler_job.py:1539} INFO - Processing file 
> /home/ec2-user/airflow/s3fuse/dags/.py for tasks to queue
> [2020-03-06 10:30:00,090] {logging_mixin.py:112} INFO - [2020-03-06 
> 10:30:00,090] {dagbag.py:403} INFO - Filling up the DagBag from 
> /home/ec2-user/airflow/s3fuse/dags/.py
> [2020-03-06 10:30:00,099] {scheduler_job.py:1551} INFO - DAG(s) 
> dict_keys(['']) retrieved from 
> /home/ec2-user/airflow/s3fuse/dags/.py
> [2020-03-06 10:30:00,159] {scheduler_job.py:1262} INFO - Processing 
> 
> [2020-03-06 10:30:00,193] {logging_mixin.py:112} WARNING - 
> /home/ec2-user/venv/lib64/python3.7/site-packages/pymysql/cursors.py:170: 
> Warning: (1300, "Invalid utf8mb4 character string: '80047D'")
>   result = self._query(query)
> [2020-03-06 10:30:00,207] {scheduler_job.py:1272} INFO - Created   @ 2020-03-06T11:30:00+00:00: scheduled__2020-03-06T11:30:00+00:00, 
> externally triggered: False>
> [2020-03-06 10:30:00,212] 
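The DST mechanics behind the skipped run can be sketched with the stdlib zoneinfo module (Python 3.9+; this shows only the UTC-offset shift, not Airflow's actual pendulum-based scheduling logic):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

NY = ZoneInfo("America/New_York")

# DST began on 2020-03-08, so the same 06:30 local wall time maps to
# different UTC instants before and after the Monday in question.
before_dst = datetime(2020, 3, 6, 6, 30, tzinfo=NY)  # Friday, EST (UTC-5)
after_dst = datetime(2020, 3, 9, 6, 30, tzinfo=NY)   # Monday, EDT (UTC-4)

assert before_dst.utcoffset().total_seconds() == -5 * 3600
assert after_dst.utcoffset().total_seconds() == -4 * 3600
```

A scheduler comparing fixed UTC instants across that boundary can conclude the next local-time run has not arrived yet and skip it once.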

[jira] [Resolved] (AIRFLOW-7037) Fix Incorrect Type Annotation for Multiprocessing Connection

2020-03-10 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7037.
---
Resolution: Fixed

> Fix Incorrect Type Annotation for Multiprocessing Connection
> 
>
> Key: AIRFLOW-7037
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7037
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 2.0.0
>Reporter: Kaxil Naik
>Assignee: Kaxil Naik
>Priority: Minor
> Fix For: 2.0.0
>
>
> The current annotation for *DagFileProcessorManager.signal_conn* is incorrect.
> Currently, it shows airflow.models.connection instead of 
> *multiprocessing.connection.Connection*
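A minimal sketch of the corrected annotation, assuming only the attribute name from the ticket (not the full DagFileProcessorManager class):

```python
from multiprocessing.connection import Connection  # the pipe-end type

class DagFileProcessorManager:
    # Annotate with multiprocessing's Connection,
    # not airflow.models.connection.Connection.
    signal_conn: Connection

# The class-level annotation now resolves to the multiprocessing type.
assert DagFileProcessorManager.__annotations__["signal_conn"] is Connection
```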



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (AIRFLOW-7015) Detect Dockerhub repo/user when building on Dockerhub

2020-03-10 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7015.
---
Fix Version/s: 1.10.10
   Resolution: Fixed

> Detect Dockerhub repo/user when building on Dockerhub
> -
>
> Key: AIRFLOW-7015
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7015
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0, 1.10.9
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.10
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (AIRFLOW-7023) Remove duplicated package definitions in setup.py

2020-03-10 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7023.
---
Fix Version/s: 1.10.10
   Resolution: Fixed

> Remove duplicated package definitions in setup.py
> -
>
> Key: AIRFLOW-7023
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7023
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: setup
>Affects Versions: 1.10.9
>Reporter: Kengo Seki
>Assignee: Kengo Seki
>Priority: Trivial
> Fix For: 1.10.10
>
>
> For now, the {{devel_all}} subpackage is defined in setup.py:
> {code}
> devel_all = (all_dbs + atlas + aws + azure + celery + cgroups + datadog + 
> devel + doc + docker + druid +
>  elasticsearch + gcp + grpc + jdbc + jenkins + kerberos + 
> kubernetes + ldap + odbc + oracle +
>  pagerduty + papermill + password + pinot + redis + salesforce + 
> samba + segment + sendgrid +
>  sentry + singularity + slack + snowflake + ssh + statsd + 
> tableau + virtualenv + webhdfs +
>  yandexcloud + zendesk)
> {code}
> But {{druid}} and {{pinot}} are also included in {{all_dbs}}, so they can be 
> removed from {{devel_all}}.
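A sketch of why the removal is a no-op for the resolved extras (placeholder lists below, not the real setup.py contents):

```python
# Hypothetical, simplified extras lists standing in for setup.py's real ones.
all_dbs = ["mysql", "postgres", "druid", "pinot"]

# Before: druid and pinot listed again even though all_dbs already has them.
devel_all_before = all_dbs + ["docker", "druid", "pinot", "slack"]
# After: the duplicates dropped, as the ticket proposes.
devel_all_after = all_dbs + ["docker", "slack"]

# The resolved dependency set is identical either way.
assert set(devel_all_before) == set(devel_all_after)
```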



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (AIRFLOW-5013) Add GCP Data Catalog Hook and Operators

2020-03-09 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-5013.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Add GCP Data Catalog Hook and Operators
> ---
>
> Key: AIRFLOW-5013
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5013
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: gcp
>Affects Versions: 1.10.3
>Reporter: Ryan Yuan
>Assignee: Kamil Bregula
>Priority: Major
> Fix For: 2.0.0
>
>
> Add GCP Data Catalog services to Airflow
> [https://cloud.google.com/data-catalog/docs/reference]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (AIRFLOW-7018) Travis escaping of job name caused the "Build documentation" to fail

2020-03-09 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7018.
---
Fix Version/s: 1.10.10
   Resolution: Fixed

> Travis escaping of job name caused the "Build documentation" to fail
> 
>
> Key: AIRFLOW-7018
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7018
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0, 1.10.9
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.10
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (AIRFLOW-7018) Travis escaping of job name caused the "Build documentation" to fail

2020-03-09 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-7018:
-

 Summary: Travis escaping of job name caused the "Build 
documentation" to fail
 Key: AIRFLOW-7018
 URL: https://issues.apache.org/jira/browse/AIRFLOW-7018
 Project: Apache Airflow
  Issue Type: Improvement
  Components: ci
Affects Versions: 1.10.9, 2.0.0
Reporter: Jarek Potiuk






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (AIRFLOW-7015) Detect Dockerhub repo/user when building on Dockerhub

2020-03-09 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-7015:
--
Summary: Detect Dockerhub repo/user when building on Dockerhub  (was: 
Detect Dockerhub User/Password when building on Dockerhub)

> Detect Dockerhub repo/user when building on Dockerhub
> -
>
> Key: AIRFLOW-7015
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7015
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0, 1.10.9
>Reporter: Jarek Potiuk
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (AIRFLOW-7015) Detect Dockerhub User/Password when building on Dockerhub

2020-03-09 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-7015:
-

 Summary: Detect Dockerhub User/Password when building on Dockerhub
 Key: AIRFLOW-7015
 URL: https://issues.apache.org/jira/browse/AIRFLOW-7015
 Project: Apache Airflow
  Issue Type: Improvement
  Components: ci
Affects Versions: 1.10.9, 2.0.0
Reporter: Jarek Potiuk






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (AIRFLOW-7010) Dockerhub builds are run inside a container and fail

2020-03-09 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7010.
---
Fix Version/s: 1.10.10
   Resolution: Fixed

> Dockerhub builds are run inside a container and fail
> 
>
> Key: AIRFLOW-7010
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7010
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ci
>Affects Versions: 2.0.0, 1.10.9
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.10
>
>
> Currently all Dockerhub builds fail  :(.  Not a big problem - I manually 
> pushed the images. But it would be great to fix it ..



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (AIRFLOW-7010) Dockerhub builds are run inside a container and fail

2020-03-08 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-7010:
--
Description: Currently all Dockerhub builds fail  :(.  Not a big problem - 
I manually pushed the images. But it would be great to fix it ..  (was: 
Currently all Dockerhub builds fail  :(.  Not a big problem - I manually pushed 
the images. But it would be great to update it ..)

> Dockerhub builds are run inside a container and fail
> 
>
> Key: AIRFLOW-7010
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7010
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ci
>Affects Versions: 2.0.0, 1.10.9
>Reporter: Jarek Potiuk
>Priority: Major
>
> Currently all Dockerhub builds fail  :(.  Not a big problem - I manually 
> pushed the images. But it would be great to fix it ..



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (AIRFLOW-7010) Dockerhub builds are run inside a container and fail

2020-03-08 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-7010:
--
Description: Currently all Dockerhub builds fail  :(.  Not a big problem - 
I manually pushed the images. But it would be great to update it ..

> Dockerhub builds are run inside a container and fail
> 
>
> Key: AIRFLOW-7010
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7010
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ci
>Affects Versions: 2.0.0, 1.10.9
>Reporter: Jarek Potiuk
>Priority: Major
>
> Currently all Dockerhub builds fail  :(.  Not a big problem - I manually 
> pushed the images. But it would be great to update it ..



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (AIRFLOW-7012) Travis CI + MySQL or Postgres + Python 3.7 fail most of the times

2020-03-08 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7012.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Travis CI + MySQL or Postgres + Python 3.7 fail most of the times
> -
>
> Key: AIRFLOW-7012
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7012
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0
>
>
> Seems that on Travis where we have MySQL + Postgres + Python 3.7 we have 
> often (but not always) failures resulting with "Lost connection to MySQL 
> server during query". Generally it seems to be connected to DNS lookup.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (AIRFLOW-7013) Breeze should check automatically if the image should be pulled

2020-03-08 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-7013:
-

 Summary: Breeze should check automatically if the image should be 
pulled
 Key: AIRFLOW-7013
 URL: https://issues.apache.org/jira/browse/AIRFLOW-7013
 Project: Apache Airflow
  Issue Type: Bug
  Components: breeze, ci
Affects Versions: 1.10.9, 2.0.0
Reporter: Jarek Potiuk






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (AIRFLOW-7012) Travis CI + MySQL or Postgres + Python 3.7 fail most of the times

2020-03-07 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-7012:
--
Description: Seems that on Travis where we have MySQL + Postgres + Python 
3.7 we have often (but not always) failures resulting with "Lost connection to 
MySQL server during query". Generally it seems to be connected to DNS lookup.  
(was: Seems that on Travis where we have MySQL + Postgres + Python 3.7 we have 
often (but not always) failures resulting with "Lost connection to MySQL server 
during query". )

> Travis CI + MySQL or Postgres + Python 3.7 fail most of the times
> -
>
> Key: AIRFLOW-7012
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7012
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Major
>
> Seems that on Travis where we have MySQL + Postgres + Python 3.7 we have 
> often (but not always) failures resulting with "Lost connection to MySQL 
> server during query". Generally it seems to be connected to DNS lookup.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (AIRFLOW-7012) Travis CI + MySQL or Postgres + Python 3.7 fail most of the times

2020-03-07 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-7012:
--
Description: Seems that on Travis where we have MySQL + Postgres + Python 
3.7 we have often (but not always) failures resulting with "Lost connection to 
MySQL server during query".   (was: Seems that on Travis where we have MySQL 
5.7 + Python 3.7 we have often (but not always) failures resulting with "Lost 
connection to MySQL server during query". While investigating it swapping 
Postgres and MySql in 3.6 and 3.7 tests should help
 )

> Travis CI + MySQL or Postgres + Python 3.7 fail most of the times
> -
>
> Key: AIRFLOW-7012
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7012
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Major
>
> Seems that on Travis where we have MySQL + Postgres + Python 3.7 we have 
> often (but not always) failures resulting with "Lost connection to MySQL 
> server during query". 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (AIRFLOW-7012) Travis CI + MySQL or Postgres + Python 3.7 fail most of the times

2020-03-07 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-7012:
--
Summary: Travis CI + MySQL or Postgres + Python 3.7 fail most of the times  
(was: Travis CI + MySQL 5.7 + Python 3.7 fail most of the times)

> Travis CI + MySQL or Postgres + Python 3.7 fail most of the times
> -
>
> Key: AIRFLOW-7012
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7012
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Major
>
> Seems that on Travis where we have MySQL 5.7 + Python 3.7 we have often (but 
> not always) failures resulting with "Lost connection to MySQL server during 
> query". While investigating it swapping Postgres and MySql in 3.6 and 3.7 
> tests should help
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (AIRFLOW-7012) Travis CI + MySQL 5.7 + Python 3.7 fail most of the times

2020-03-07 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-7012:
--
Description: 
Seems that on Travis where we have MySQL 5.7 + Python 3.7 we have often (but 
not always) failures resulting with "Lost connection to MySQL server during 
query". While investigating it swapping Postgres and MySql in 3.6 and 3.7 tests 
should help
 

  was:Seems that on Travis where we have MySQL 5.7 + Python 3.7 we have often 
(but not always) failures resulting with "Lost connection to MySQL server 
during query". While investigating it I am going to swap 3.7 tests to 
Postgres for now.


> Travis CI + MySQL 5.7 + Python 3.7 fail most of the times
> -
>
> Key: AIRFLOW-7012
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7012
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Major
>
> Seems that on Travis where we have MySQL 5.7 + Python 3.7 we have often (but 
> not always) failures resulting with "Lost connection to MySQL server during 
> query". While investigating it swapping Postgres and MySql in 3.6 and 3.7 
> tests should help
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (AIRFLOW-7012) Travis CI + MySQL 5.7 + Python 3.7 fail most of the times

2020-03-07 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reassigned AIRFLOW-7012:
-

Assignee: Jarek Potiuk

> Travis CI + MySQL 5.7 + Python 3.7 fail most of the times
> -
>
> Key: AIRFLOW-7012
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7012
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Major
>
> Seems that on Travis, with MySQL 5.7 + Python 3.7, we often (but not always) 
> see failures resulting in "Lost connection to MySQL server during query". 
> While investigating, I am going to swap the 3.7 tests to Postgres for now.





[jira] [Created] (AIRFLOW-7012) Travis CI + MySQL 5.7 + Python 3.7 fail most of the times

2020-03-07 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-7012:
-

 Summary: Travis CI + MySQL 5.7 + Python 3.7 fail most of the times
 Key: AIRFLOW-7012
 URL: https://issues.apache.org/jira/browse/AIRFLOW-7012
 Project: Apache Airflow
  Issue Type: Bug
  Components: ci
Affects Versions: 2.0.0
Reporter: Jarek Potiuk


Seems that on Travis, with MySQL 5.7 + Python 3.7, we often (but not always) 
see failures resulting in "Lost connection to MySQL server during query". 
While investigating, I am going to swap the 3.7 tests to Postgres for now.





[jira] [Resolved] (AIRFLOW-7011) JPype1 release 0.7.2 from 29 Feb 2020 breaks [jdbc] installation for Python 2.7

2020-03-07 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-7011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-7011.
---
Fix Version/s: 1.10.10
   Resolution: Fixed

> JPype1 release 0.7.2 from 29 Feb 2020 breaks [jdbc] installation for Python 
> 2.7
> ---
>
> Key: AIRFLOW-7011
> URL: https://issues.apache.org/jira/browse/AIRFLOW-7011
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: ci, dependencies
>Affects Versions: 1.10.9
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.10
>
>
> The Python 2.7 image cannot be built without pinning JPype1 to 0.7.1. This 
> is not a problem for Python 3, so 2.0 is not affected.





[jira] [Created] (AIRFLOW-7011) JPype1 release 0.7.2 from 29 Feb 2020 breaks [jdbc] installation for Python 2.7

2020-03-07 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-7011:
-

 Summary: JPype1 release 0.7.2 from 29 Feb 2020 breaks [jdbc] 
installation for Python 2.7
 Key: AIRFLOW-7011
 URL: https://issues.apache.org/jira/browse/AIRFLOW-7011
 Project: Apache Airflow
  Issue Type: Bug
  Components: ci, dependencies
Affects Versions: 1.10.9
Reporter: Jarek Potiuk


The Python 2.7 image cannot be built without pinning JPype1 to 0.7.1. This is 
not a problem for Python 3, so 2.0 is not affected.
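A fix along those lines could use a PEP 508 environment marker so the pin only 
applies on Python 2.7. The fragment below is an illustrative sketch, not 
Airflow's actual setup.py; the extra's exact contents are assumptions.

```python
# Hypothetical fragment of a setup.py extras definition. JPype1 0.7.2
# (29 Feb 2020) no longer installs on Python 2.7, so cap it there while
# letting Python 3 pick up newer releases.
jdbc_requires = [
    "jaydebeapi>=1.1.1",
    'JPype1==0.7.1; python_version < "3.0"',   # last release usable on 2.7
    'JPype1>=0.7.1; python_version >= "3.0"',  # Python 3 is unaffected
]

# Passed to setuptools.setup(extras_require=...) so that
# `pip install 'apache-airflow[jdbc]'` resolves the right JPype1.
extras_require = {"jdbc": jdbc_requires}
```

pip evaluates the environment marker at install time, so a single sdist serves 
both interpreter lines without separate releases.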





[jira] [Created] (AIRFLOW-7010) Dockerhub builds are run inside a container and fail

2020-03-07 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-7010:
-

 Summary: Dockerhub builds are run inside a container and fail
 Key: AIRFLOW-7010
 URL: https://issues.apache.org/jira/browse/AIRFLOW-7010
 Project: Apache Airflow
  Issue Type: Bug
  Components: ci
Affects Versions: 1.10.9, 2.0.0
Reporter: Jarek Potiuk








[jira] [Updated] (AIRFLOW-6877) Add backport packages cross-dependency mechanism

2020-03-07 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-6877:
--
Summary: Add backport packages cross-dependency mechanism  (was: Add 
backport packages dependency warnings)

> Add backport packages cross-dependency mechanism
> 
>
> Key: AIRFLOW-6877
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6877
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: backport-packages
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
>
> Some of the backport packages depend on others, but this is mostly the case 
> for transfer operators; other operators should work fine. We should allow 
> those packages to be imported with a warning that you need to install the 
> other package to get full functionality.
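The mechanism described above could be sketched as an import-time guard that 
degrades gracefully. All names below are illustrative, not Airflow's actual 
implementation.

```python
import warnings


def try_import_cross_dependency(module_name, feature):
    """Import an optional sibling package; warn instead of failing.

    Returns the imported module if available, otherwise emits a
    UserWarning naming the missing package and returns None.
    """
    try:
        return __import__(module_name)
    except ImportError:
        warnings.warn(
            "Package '%s' is not installed; %s will be unavailable "
            "until it is installed." % (module_name, feature),
            UserWarning,
        )
        return None


# A transfer-operator module could call this at import time, e.g.:
# boto3 = try_import_cross_dependency("boto3", "S3 transfer operators")
```

With this pattern the package imports cleanly on its own, and only the 
cross-package features fail (with a clear message) when actually used.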





[jira] [Resolved] (AIRFLOW-6877) Add backport packages cross-dependency mechanism

2020-03-07 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6877.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Add backport packages cross-dependency mechanism
> 
>
> Key: AIRFLOW-6877
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6877
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: backport-packages
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0
>
>
> Some of the backport packages depend on others, but this is mostly the case 
> for transfer operators; other operators should work fine. We should allow 
> those packages to be imported with a warning that you need to install the 
> other package to get full functionality.





[jira] [Resolved] (AIRFLOW-5842) Switch to Buster base images

2020-03-07 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-5842.
---
Fix Version/s: 1.10.10
   Resolution: Fixed

> Switch to Buster base images
> 
>
> Key: AIRFLOW-5842
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5842
> Project: Apache Airflow
>  Issue Type: Sub-task
>  Components: breeze
>Affects Versions: 2.0.0, 1.10.6
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.10
>
>





