[jira] [Updated] (AIRFLOW-4968) Add Cloud AutoML NL Sentiment integration

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-4968:
--
Labels: gsoc gsoc2020 mentor  (was: )

> Add Cloud AutoML NL Sentiment integration
> -
>
> Key: AIRFLOW-4968
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4968
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 1.10.3
>Reporter: Kamil Bregula
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
>
> Hi
> This project lacks integration with the Cloud AutoML NL Sentiment service. I 
> would be happy if Airflow had proper operators and hooks that integrate with 
> this service.
> Product Documentation:  
> https://cloud.google.com/natural-language/automl/sentiment/docs/
> API Documentation: 
> https://googleapis.github.io/google-cloud-python/latest/automl/index.html
> Lots of love



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (AIRFLOW-4972) Add Google Search Ads 360 integration

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-4972:
--
Labels: gsoc gsoc2020 mentor  (was: )

> Add Google Search Ads 360 integration
> -
>
> Key: AIRFLOW-4972
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4972
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 1.10.3
>Reporter: Kamil Bregula
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
>
> Hi
> This project lacks integration with the Google Search Ads 360 service. I 
> would be happy if Airflow had proper operators and hooks that integrate with 
> this service.
> Product Documentation: https://developers.google.com/search-ads/
> API Documentation: 
> https://developers.google.com/resources/api-libraries/documentation/dfareporting/v3.3/python/latest/
> Lots of love





[jira] [Updated] (AIRFLOW-6087) Snowflake Connector cannot run more than one sql from a sql file

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-6087:
--
Labels: gsoc gsoc2020 mentor  (was: )

> Snowflake Connector cannot run more than one sql from a sql file
> 
>
> Key: AIRFLOW-6087
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6087
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, operators
>Affects Versions: 1.10.6
>Reporter: Saad
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
>
> I get an error when passing a SQL file with multiple SQL statements to the 
> Snowflake operator:
> {code:java}
> snowflake.connector.errors.ProgrammingError: 06 (0A000): 
> 01908236-01a3-b2c4--f36100052686: Multiple SQL statements in a single API 
> call are not supported; use one API call per statement instead.
> {code}
> It fails only when the operator is given a file with multiple statements. A 
> file with just one statement, or a list of statements passed to the 
> operator, works fine.
> Looking at the current Snowflake operator implementation, a list of SQL 
> statements works because it is executed one statement at a time, whereas 
> multiple statements in a SQL file fail because all of them are read as one 
> continuous string.
>  
> h4. _*How can we fix this:*_
> The Snowflake Python connector has an API call that supports multiple SQL 
> statements:
> [https://docs.snowflake.net/manuals/user-guide/python-connector-api.html#execute_string]
> This can be fixed by overriding the run function in the Snowflake hook to 
> support multiple SQL statements in a file.
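
A minimal sketch of the one-call-per-statement idea described in this issue. It is not the actual Snowflake hook: split_statements, run_file_contents and RecordingCursor are hypothetical names, and the splitter only understands single-quoted strings (no comments, no dollar-quoted blocks). With the real connector, the execute_string call linked in the issue may make the manual split unnecessary.

```python
from typing import Any, Iterator, List


def split_statements(sql_text: str) -> Iterator[str]:
    """Yield statements separated by ';', ignoring ';' inside single quotes."""
    current: List[str] = []
    in_quote = False
    for char in sql_text:
        if char == "'":
            # An escaped quote ('') toggles twice, so the state is preserved.
            in_quote = not in_quote
        if char == ";" and not in_quote:
            statement = "".join(current).strip()
            if statement:
                yield statement
            current = []
        else:
            current.append(char)
    tail = "".join(current).strip()
    if tail:
        yield tail


def run_file_contents(cursor: Any, sql_text: str) -> int:
    """Issue one cursor.execute() call per statement; return how many ran."""
    count = 0
    for statement in split_statements(sql_text):
        cursor.execute(statement)
        count += 1
    return count


class RecordingCursor:
    """Stand-in for a database cursor (assumption); it only records calls."""

    def __init__(self) -> None:
        self.seen: List[str] = []

    def execute(self, sql: str) -> None:
        self.seen.append(sql)
```

Any object with an execute(sql) method can be passed as the cursor, so the same helper can be exercised in tests with RecordingCursor instead of a live connection.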





[jira] [Updated] (AIRFLOW-4966) Add Cloud AutoML NL Classification integration

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-4966:
--
Labels: gsoc gsoc2020 mentor  (was: )

> Add Cloud AutoML NL Classification integration
> --
>
> Key: AIRFLOW-4966
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4966
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 1.10.3
>Reporter: Kamil Bregula
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
>
> Hi
> This project lacks integration with the Cloud AutoML NL Classification 
> service. I would be happy if Airflow had proper operators and hooks that 
> integrate with this service.
> Product Documentation: https://cloud.google.com/natural-language/automl/docs/
> API Documentation: 
> https://googleapis.github.io/google-cloud-python/latest/automl/index.html
> Love





[jira] [Updated] (AIRFLOW-4967) Add Cloud AutoML NL Entity Extraction integration

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-4967:
--
Labels: gsoc gsoc2020 mentor  (was: )

> Add Cloud AutoML NL Entity Extraction integration
> -
>
> Key: AIRFLOW-4967
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4967
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 1.10.3
>Reporter: Kamil Bregula
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
>
> Hi
> This project lacks integration with the Cloud AutoML NL Entity Extraction 
> service. I would be happy if Airflow had proper operators and hooks that 
> integrate with this service.
> Product Documentation: 
> https://cloud.google.com/natural-language/automl/entity-analysis/docs/
> API Documentation: 
> https://googleapis.github.io/google-cloud-python/latest/automl/index.html
> Love





[jira] [Closed] (AIRFLOW-5433) Add script to check external links in docs

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk closed AIRFLOW-5433.
-
Resolution: Won't Fix

> Add script to check external links in docs
> --
>
> Key: AIRFLOW-5433
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5433
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 1.10.5
>Reporter: Kamil Bregula
>Priority: Major
>






[jira] [Assigned] (AIRFLOW-6157) Separate out executor protocol

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reassigned AIRFLOW-6157:
-

Assignee: Jarek Potiuk

> Separate out executor protocol
> --
>
> Key: AIRFLOW-6157
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6157
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Major
>
> Some of the fields of executors are accessed directly in the main core. The 
> protocol for executor can be extracted and used in all places where executors 
> are used. 
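
A rough sketch of what an extracted executor protocol could look like, assuming Python 3.8+ for typing.Protocol. The method names queue_work and collect_results, and the InMemoryExecutor class, are invented for illustration and are not Airflow's executor API.

```python
from typing import Dict, List, Protocol, Tuple


class ExecutorProtocol(Protocol):
    """Hypothetical interface: the only surface core code would rely on."""

    def queue_work(self, key: str, command: List[str]) -> None:
        ...

    def collect_results(self) -> Dict[str, str]:
        ...


class InMemoryExecutor:
    """Toy executor; it satisfies the protocol structurally, without inheriting."""

    def __init__(self) -> None:
        self._queued: List[Tuple[str, List[str]]] = []

    def queue_work(self, key: str, command: List[str]) -> None:
        self._queued.append((key, command))

    def collect_results(self) -> Dict[str, str]:
        # Pretend every queued item succeeded, then reset the queue.
        results = {key: "success" for key, _ in self._queued}
        self._queued.clear()
        return results


def schedule(executor: ExecutorProtocol, keys: List[str]) -> Dict[str, str]:
    """Core-side code touches only protocol methods, never executor fields."""
    for key in keys:
        executor.queue_work(key, ["run", key])
    return executor.collect_results()
```

Because the protocol is structural, existing executors would not need a new base class; a type checker would flag any place where core code reaches into a field that is not part of the protocol.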





[jira] [Assigned] (AIRFLOW-6198) Add types to Core classes

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reassigned AIRFLOW-6198:
-

Assignee: Jarek Potiuk

> Add types to Core classes
> -
>
> Key: AIRFLOW-6198
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6198
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Major
>
> We need to add types to the main/core classes. They are all interconnected:
>  
>  * airflow/jobs/base_job.py modified
>  * airflow/jobs/scheduler_job.py
>  * airflow/models/baseoperator.py
>  * airflow/models/dag.py
>  * airflow/models/dagbag.py
>  * airflow/models/dagrun.py
>  * airflow/operators/subdag_operator.py
>  * airflow/utils/dag_processing.py
>  * airflow/utils/dates.py





[jira] [Assigned] (AIRFLOW-4973) Add Cloud Data Fusion Pipeline integration

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reassigned AIRFLOW-4973:
-

Assignee: Kamil Bregula

> Add Cloud Data Fusion Pipeline integration
> --
>
> Key: AIRFLOW-4973
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4973
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 1.10.3
>Reporter: Kamil Bregula
>Assignee: Kamil Bregula
>Priority: Major
>
> Hi
> This project lacks integration with the Cloud Data Fusion Pipeline service. I 
> would be happy if Airflow had proper operators and hooks that integrate with 
> this service.
> Product Documentation: https://cloud.google.com/data-fusion/docs/
> API Documentation: 
> https://developers.google.com/apis-explorer/#search/data%20fusion/datafusion/v1beta1/
> Love





[jira] [Assigned] (AIRFLOW-6571) Rewrite BigQueryExecuteQueryOperator to use python client

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reassigned AIRFLOW-6571:
-

Assignee: Tomasz Urbaszek

> Rewrite BigQueryExecuteQueryOperator to use python client
> -
>
> Key: AIRFLOW-6571
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6571
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp, operators
>Affects Versions: 2.0.0
>Reporter: Tomasz Urbaszek
>Assignee: Tomasz Urbaszek
>Priority: Major
>






[jira] [Updated] (AIRFLOW-5454) security - hide all password/secret/credentials/tokens from log

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-5454:
--
Labels: gsoc gsoc2020 mentor  (was: )

> security - hide all password/secret/credentials/tokens from log
> ---
>
> Key: AIRFLOW-5454
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5454
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: logging, security
>Affects Versions: 1.10.5
>Reporter: t oo
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
>
> I propose a new config flag that enforces a generic override in all Airflow 
> logging, suppressing any line that contains a case-insensitive match for 
> any of: password|secret|credential|token
>  
> If you do a
> {code:java}
> grep -iE 'password|secret|credential|token' -R {code}
> you may be surprised by what you find :O
>  
> Ideally only the sensitive value would be replaced, but it appears in 
> various formats, such as:  
> {code:java}
> key=value, key'=value, key value, key"=value, key = value, key"="value, 
> key:value{code}
> ..etc
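
A minimal sketch of the line-suppression idea from this proposal, using only the standard logging module. The class names, the demo logger name and the sample messages are made up for illustration; this is not an existing Airflow setting.

```python
import logging
import re

# The pattern from the proposal, matched case-insensitively.
SENSITIVE = re.compile(r"password|secret|credential|token", re.IGNORECASE)


class SuppressSensitiveLines(logging.Filter):
    """Drop any record whose rendered message matches SENSITIVE."""

    def filter(self, record: logging.LogRecord) -> bool:
        return SENSITIVE.search(record.getMessage()) is None


class ListHandler(logging.Handler):
    """Collects emitted messages so the effect is easy to inspect."""

    def __init__(self) -> None:
        super().__init__()
        self.messages = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


def demo() -> list:
    logger = logging.getLogger("sensitive_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = ListHandler()
    handler.addFilter(SuppressSensitiveLines())
    logger.addHandler(handler)
    try:
        logger.info("connecting to db host=%s", "example")
        logger.info("using Password=%s", "hunter2")
        logger.info("API_TOKEN: abc123")
    finally:
        logger.removeHandler(handler)
    return handler.messages
```

Dropping the whole line avoids having to parse the many key/value formats listed above, at the cost of also hiding harmless lines that merely mention one of the words.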





[jira] [Reopened] (AIRFLOW-6386) Fix cassandra mocking in v1.10

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reopened AIRFLOW-6386:
---

> Fix cassandra mocking in v1.10
> --
>
> Key: AIRFLOW-6386
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6386
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: tests
>Affects Versions: 1.10.8
>Reporter: Tomasz Urbaszek
>Priority: Major
>






[jira] [Closed] (AIRFLOW-6386) Fix cassandra mocking in v1.10

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk closed AIRFLOW-6386.
-
Resolution: Fixed

> Fix cassandra mocking in v1.10
> --
>
> Key: AIRFLOW-6386
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6386
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: tests
>Affects Versions: 1.10.8
>Reporter: Tomasz Urbaszek
>Priority: Major
>






[jira] [Resolved] (AIRFLOW-6386) Fix cassandra mocking in v1.10

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6386.
---
Resolution: Duplicate

> Fix cassandra mocking in v1.10
> --
>
> Key: AIRFLOW-6386
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6386
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: tests
>Affects Versions: 1.10.8
>Reporter: Tomasz Urbaszek
>Priority: Major
>






[jira] [Resolved] (AIRFLOW-6318) Change python3 as Dataflow Hooks/Operators default interpreter

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6318.
---
Fix Version/s: 2.0.0
   Resolution: Duplicate

> Change python3 as Dataflow Hooks/Operators default interpreter
> --
>
> Key: AIRFLOW-6318
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6318
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp, hooks, operators
>Affects Versions: 1.10.6
>Reporter: Xinbin Huang
>Assignee: Xinbin Huang
>Priority: Major
> Fix For: 2.0.0
>
>
> Change the default Python interpreter for Dataflow to python3, given that 
> support is mostly finished and they are also about to sunset Python 2 in 
> 2020 (though the timeline has not been set yet).
>  * Dataflow python3 support roadmap: 
> https://issues.apache.org/jira/browse/BEAM-1251#comment-16979381
>  * Python2 sunset schedule: https://issues.apache.org/jira/browse/BEAM-8371
>  * Apache Beam has signed (without a detailed timeline) on 
> [https://python3statement.org/]





[jira] [Resolved] (AIRFLOW-6378) Providers package should not be used in v1-10-test

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6378.
---
Fix Version/s: 1.10.8
   Resolution: Fixed

> Providers package should not be used in v1-10-test
> --
>
> Key: AIRFLOW-6378
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6378
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 1.10.7
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.8
>
>
> When cherry-picking from master we should remove all references to 
> 'airflow.providers' package as this package was only introduced in 2.0.*





[jira] [Assigned] (AIRFLOW-6406) Operator Resources have wrong type hint derived

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reassigned AIRFLOW-6406:
-

Assignee: Jarek Potiuk

> Operator Resources have wrong type hint derived
> 
>
> Key: AIRFLOW-6406
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6406
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Major
>
> The Operator Resources classes 
> ([https://github.com/apache/airflow/blob/master/airflow/utils/operator_resources.py])
>  have the wrong type hints defined. Their __init__ takes parameters with the 
> same names as the attributes but different types, which fools IntelliJ's 
> type hinting.
>  
> {code:java}
> def __init__(self,
>              cpus=conf.getint('operators', 'default_cpus'),
>              ram=conf.getint('operators', 'default_ram'),
>              disk=conf.getint('operators', 'default_disk'),
>              gpus=conf.getint('operators', 'default_gpus')
>              ):
>     self.cpus = CpuResource(cpus)
>     self.ram = RamResource(ram)
>     self.disk = DiskResource(disk)
>     self.gpus = GpuResource(gpus){code}
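
One possible way to make the types explicit, sketched under assumptions: the parameters are annotated as int and the attributes are declared with their resource types at class level. The classes below are simplified stand-ins rather than the real operator_resources.py code, and the defaults are hard-coded instead of being read with conf.getint().

```python
class Resource:
    """Stand-in for the resource wrapper classes (simplified assumption)."""

    def __init__(self, name: str, units: str, qty: int) -> None:
        if qty < 0:
            raise ValueError(f"{name} quantity must be non-negative, got {qty}")
        self.name = name
        self.units = units
        self.qty = qty


class CpuResource(Resource):
    def __init__(self, qty: int) -> None:
        super().__init__("CPU", "core(s)", qty)


class RamResource(Resource):
    def __init__(self, qty: int) -> None:
        super().__init__("RAM", "MB", qty)


class Resources:
    # Class-level annotations tell the type checker what the attributes hold,
    # independently of the same-named int parameters below.
    cpus: CpuResource
    ram: RamResource

    def __init__(self, cpus: int = 1, ram: int = 512) -> None:
        # Hard-coded defaults; the original reads them from configuration.
        self.cpus = CpuResource(cpus)
        self.ram = RamResource(ram)
```

With the annotations in place, an IDE should resolve Resources().cpus to CpuResource rather than inferring int from the parameter of the same name.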





[jira] [Assigned] (AIRFLOW-6204) Add GCP system tests helper

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reassigned AIRFLOW-6204:
-

Assignee: Tomasz Urbaszek

> Add GCP system tests helper
> ---
>
> Key: AIRFLOW-6204
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6204
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp, tests
>Affects Versions: 2.0.0
>Reporter: Tomasz Urbaszek
>Assignee: Tomasz Urbaszek
>Priority: Major
>






[jira] [Updated] (AIRFLOW-6429) Add SalesForce connection to UI

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-6429:
--
Labels: gsoc gsoc2020 mentor  (was: )

> Add SalesForce connection to UI
> ---
>
> Key: AIRFLOW-6429
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6429
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Affects Versions: 1.10.7
>Reporter: Elad
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
>
> Airflow has a SalesForceHook, but it doesn't have a distinct connection type.
> To create a Connection, one must expose its secret token as plain text:
> [https://stackoverflow.com/questions/53510980/salesforce-connection-using-apache-airflow-ui]
> Also, it's not very intuitive that the +Conn Type should remain blank+.
> It would be easier and more user-friendly if the UI had a Salesforce 
> connection with an encrypted security_token field.





[jira] [Assigned] (AIRFLOW-4762) Test against Python 3.8

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reassigned AIRFLOW-4762:
-

Assignee: Jarek Potiuk  (was: Philippe Gagnon)

> Test against Python 3.8
> ---
>
> Key: AIRFLOW-4762
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4762
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 1.10.3
>Reporter: Philippe Gagnon
>Assignee: Jarek Potiuk
>Priority: Major
>  Labels: ci, test
>
> Airflow is currently tested against python 3.5 only. This may be insufficient 
> as users may wish to run airflow on newer versions, which may introduce 
> breaking changes.





[jira] [Updated] (AIRFLOW-4762) Test against Python 3.8

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-4762:
--
Summary: Test against Python 3.8  (was: Test against Python 3.6, 3.7 and 
3.8b1)

> Test against Python 3.8
> ---
>
> Key: AIRFLOW-4762
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4762
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 1.10.3
>Reporter: Philippe Gagnon
>Assignee: Philippe Gagnon
>Priority: Major
>  Labels: ci, test
>
> Airflow is currently tested against python 3.5 only. This may be insufficient 
> as users may wish to run airflow on newer versions, which may introduce 
> breaking changes.





[jira] [Updated] (AIRFLOW-6463) Mock Cassandra in tests

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-6463:
--
Labels: gsoc gsoc2020 mentor  (was: )

> Mock Cassandra in tests
> ---
>
> Key: AIRFLOW-6463
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6463
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 1.10.7
>Reporter: Kamil Bregula
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
>
> Cassandra consumes 1.173GiB of memory. Travis does not have very powerful 
> machines, so we should limit system/integration tests of components that do 
> not require much attention, e.g. those that are not changed often. Cassandra 
> is a good candidate for this. This will free up machine resources for more 
> important work.
> {code:java}
> CONTAINER ID   NAME                                  CPU %   MEM USAGE / LIMIT     MEM %    NET I/O          BLOCK I/O         PIDS
> 8aa37ca50f7c   ci_airflow-testing_run_1f3aeb6d1052   0.00%   5.715MiB / 3.855GiB   0.14%    1.14kB / 0B      2.36MB / 0B       2
> f2b3be15558f   ci_cassandra_1                        0.69%   1.173GiB / 3.855GiB   30.42%   2.39kB / 0B      75.3MB / 9.95MB   50
> ef1de3981ca6   ci_krb5-kdc-server_1                  0.02%   12.15MiB / 3.855GiB   0.31%    2.46kB / 0B      18.9MB / 184kB    4
> be808233eb91   ci_mongo_1                            0.31%   36.71MiB / 3.855GiB   0.93%    2.39kB / 0B      43.2MB / 19.1MB   24
> 667e047be097   ci_rabbitmq_1                         0.77%   69.95MiB / 3.855GiB   1.77%    2.39kB / 0B      43.2MB / 508kB    92
> 2453dd6e7cca   ci_postgres_1                         0.00%   7.547MiB / 3.855GiB   0.19%    1.05MB / 889kB   35.4MB / 145MB    6
> 78050c5c61cc   ci_redis_1                            0.29%   1.695MiB / 3.855GiB   0.04%    2.46kB / 0B      6.94MB / 0B       4
> c117eb0a0d43   ci_mysql_1                            0.13%   452MiB / 3.855GiB     11.45%   2.21kB / 0B      33.9MB / 548MB    21
> 131427b19282   ci_openldap_1                         0.00%   45.68MiB / 3.855GiB   1.16%    2.64kB / 0B      32.8MB / 16.1MB   4
> 8c2549c010b1   ci_docker_1                           0.59%   22.06MiB / 3.855GiB   0.56%    2.39kB / 0B      95.9MB / 291kB    30
> {code}
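
A small sketch of how a test could replace a live Cassandra session with a mock, so that no Cassandra container is needed. CassandraRecordChecker is a hypothetical component written for this example; only unittest.mock from the standard library is used, and the mocked execute()/one() calls mirror the shape of the driver's session API as an assumption.

```python
from unittest import mock


class CassandraRecordChecker:
    """Hypothetical component under test; `session` would be a Cassandra session."""

    def __init__(self, session) -> None:
        self.session = session

    def record_exists(self, table: str, key: str) -> bool:
        # The table name is interpolated for brevity; real code should validate it.
        rows = self.session.execute(f"SELECT * FROM {table} WHERE id = %s", (key,))
        return rows.one() is not None


def check_with_mocked_session() -> bool:
    """Run the checker against a Mock instead of a real Cassandra session."""
    session = mock.Mock()
    # session.execute(...) returns a mock whose .one() yields a fake row.
    session.execute.return_value.one.return_value = {"id": "42"}
    checker = CassandraRecordChecker(session)
    found = checker.record_exists("users", "42")
    session.execute.assert_called_once_with(
        "SELECT * FROM users WHERE id = %s", ("42",)
    )
    return found
```

The test still verifies the query text and parameters, which is usually the part of such components that changes, while the container and its memory footprint drop out of the CI run.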





[jira] [Resolved] (AIRFLOW-6471) Show failures in pytest immediately when they happen

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6471.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Show failures in pytest immediately when they happen
> 
>
> Key: AIRFLOW-6471
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6471
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0
>
> Attachments: Screenshot 2020-01-05 at 16.59.45.png
>
>
> Currently, if a test fails in CI, we do not see the failure immediately, 
> only once the run finishes. When tests hang, the details of the failed tests 
> are sometimes not shown immediately either. We tried to increase verbosity, 
> but that was not very helpful. 
> The pytest-instafail plugin solves the problem without increasing verbosity.
> !Screenshot 2020-01-05 at 16.59.45.png|width=1263,height=391!





[jira] [Updated] (AIRFLOW-6369) clear cli command needs a 'conf' option

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-6369:
--
Labels: gsoc gsoc2020 mentor  (was: )

> clear cli command needs a 'conf' option
> ---
>
> Key: AIRFLOW-6369
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6369
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: cli, core, DagRun
>Affects Versions: 1.10.6
>Reporter: t oo
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
>
> Key-value pairs of conf can be passed to the trigger_dag command, e.g.
> --conf '{"ric":"amzn"}'
> The clear command needs this feature too, e.g. when exec_date is important 
> and the 1st DagRun failed halfway because bad conf was sent with the 
> trigger_dag command, and you want to run the same exec_date again with new 
> conf in a 2nd DagRun.
> An alternative solution would be a new delete_dag_run CLI command, so that 
> there is never a need to 'clear' and a 2nd DagRun can be created for the 
> same exec_date.
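
A small illustration of how a JSON --conf option could be parsed on a clear-style command, using only argparse and json from the standard library. The parser below is a hypothetical stand-in, not Airflow's CLI code, and the program name clear-sketch is invented.

```python
import argparse
import json


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser with a dag_id argument and a JSON --conf option."""
    parser = argparse.ArgumentParser(prog="clear-sketch")
    parser.add_argument("dag_id")
    parser.add_argument(
        "-c",
        "--conf",
        type=json.loads,  # invalid JSON is reported as a normal argparse error
        default=None,
        help="JSON string that replaces the conf of the re-run DagRun",
    )
    return parser
```

Parsing with type=json.loads keeps the option symmetrical with the trigger_dag example in the issue: the same quoted JSON string is accepted and arrives as a dict.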





[jira] [Updated] (AIRFLOW-6419) dag_processor_manager/webserver/scheduler logs should be created under date folder

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-6419:
--
Labels: gsoc gsoc2020 mentor  (was: )

> dag_processor_manager/webserver/scheduler logs should be created under date 
> folder
> --
>
> Key: AIRFLOW-6419
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6419
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: logging
>Affects Versions: 1.10.7
>Reporter: t oo
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
>
> DAG-level logs are written under separate date folders. This is great because 
> the folders for old dates are not modified or accessed, so they can easily be 
> purged by utilities like tmpwatch.
> This JIRA is about making other logs (such as 
> dag_processor_manager/webserver/scheduler, etc.) go under separate date 
> folders to allow easy purging. The log from redirecting 'airflow scheduler' 
> to stdout grows by over 100 MB a day in my environment.
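
A tiny sketch of the requested layout, assuming one folder per component and per day. The function name dated_log_path and the directory scheme are illustrative choices, not existing Airflow behaviour.

```python
from datetime import date
from pathlib import PurePosixPath


def dated_log_path(base_dir: str, component: str, day: date) -> PurePosixPath:
    """Return <base_dir>/<component>/<YYYY-MM-DD>/<component>.log."""
    return PurePosixPath(base_dir) / component / day.isoformat() / f"{component}.log"
```

With a layout like this, folders for past dates are never touched again, so an age-based cleaner such as tmpwatch can remove them safely.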





[jira] [Assigned] (AIRFLOW-6466) Remove yarn cache in Dockerfile

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reassigned AIRFLOW-6466:
-

Assignee: Jarek Potiuk

> Remove yarn cache in Dockerfile
> ---
>
> Key: AIRFLOW-6466
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6466
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Affects Versions: 1.10.7
>Reporter: Kamil Bregula
>Assignee: Jarek Potiuk
>Priority: Major
>






[jira] [Assigned] (AIRFLOW-3674) Adding documentation on official docker images

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reassigned AIRFLOW-3674:
-

Assignee: Jarek Potiuk  (was: Peter van 't Hof)

> Adding documentation on official docker images
> --
>
> Key: AIRFLOW-3674
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3674
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Reporter: Peter van 't Hof
>Assignee: Jarek Potiuk
>Priority: Major
>  Labels: docker, dockerfile
>






[jira] [Commented] (AIRFLOW-5850) Make DockerSwarmOperator capture task logs

2020-01-19 Thread Jarek Potiuk (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019049#comment-17019049
 ] 

Jarek Potiuk commented on AIRFLOW-5850:
---

Ah yeah, indeed. I think it's best if you rebase it now (it has conflicts), 
and maybe [~dimberman] can finish the review :) ? I do not want to step on 
his toes :)

 

> Make DockerSwarmOperator capture task logs
> --
>
> Key: AIRFLOW-5850
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5850
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: contrib, operators
>Affects Versions: 1.10.6
> Environment: Reproducible everywhere since missing feature - Tested 
> via puckel/airflow:latest
>Reporter: Henning Redestig
>Assignee: Akshesh Doshi
>Priority: Major
>  Labels: docker, orchestration, swarm
> Fix For: 2.0.0
>
>
> contrib.operators.DockerSwarmOperator gives output on task starting and 
> exiting but does not capture the logs from the task.  E.g.
> [2019-11-04 16:27:30,337] \{{docker_swarm_operator.py:125}} INFO - Service 
> started: \{'ID': 'mxjn03sm32kfm8bcczs4qw6tu'}
> \## the task outputs logs here but they do not show up in the webapp..
> [2019-11-04 16:27:32,909] \{{docker_swarm_operator.py:136}} INFO - Service 
> status before exiting: complete
> [2019-11-04 16:27:37,298] \{{logging_mixin.py:95}} INFO - [2019-11-04 
> 16:27:37,298] \{{local_task_job.py:105}} INFO - Task exited with return code 0
>  Also discussed in https://github.com/apache/airflow/pull/5489





[jira] [Commented] (AIRFLOW-5850) Make DockerSwarmOperator capture task logs

2020-01-19 Thread Jarek Potiuk (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019050#comment-17019050
 ] 

Jarek Potiuk commented on AIRFLOW-5850:
---

But after rebase if [~dimberman] has no time I can help with it.

> Make DockerSwarmOperator capture task logs
> --
>
> Key: AIRFLOW-5850
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5850
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: contrib, operators
>Affects Versions: 1.10.6
> Environment: Reproducible everywhere since missing feature - Tested 
> via puckel/airflow:latest
>Reporter: Henning Redestig
>Assignee: Akshesh Doshi
>Priority: Major
>  Labels: docker, orchestration, swarm
> Fix For: 2.0.0
>
>
> contrib.operators.DockerSwarmOperator gives output on task starting and 
> exiting but does not capture the logs from the task.  E.g.
> [2019-11-04 16:27:30,337] \{{docker_swarm_operator.py:125}} INFO - Service 
> started: \{'ID': 'mxjn03sm32kfm8bcczs4qw6tu'}
> \## the task outputs logs here but they do not show up in the webapp..
> [2019-11-04 16:27:32,909] \{{docker_swarm_operator.py:136}} INFO - Service 
> status before exiting: complete
> [2019-11-04 16:27:37,298] \{{logging_mixin.py:95}} INFO - [2019-11-04 
> 16:27:37,298] \{{local_task_job.py:105}} INFO - Task exited with return code 0
>  Also discussed in https://github.com/apache/airflow/pull/5489





[jira] [Updated] (AIRFLOW-5850) Make DockerSwarmOperator capture task logs

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-5850:
--
Labels: docker orchestration swarm  (was: docker gsoc gsoc2020 mentor 
orchestration swarm)

> Make DockerSwarmOperator capture task logs
> --
>
> Key: AIRFLOW-5850
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5850
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: contrib, operators
>Affects Versions: 1.10.6
> Environment: Reproducible everywhere since missing feature - Tested 
> via puckel/airflow:latest
>Reporter: Henning Redestig
>Assignee: Akshesh Doshi
>Priority: Major
>  Labels: docker, orchestration, swarm
> Fix For: 2.0.0
>
>
> contrib.operators.DockerSwarmOperator gives output on task starting and 
> exiting but does not capture the logs from the task.  E.g.
> [2019-11-04 16:27:30,337] \{{docker_swarm_operator.py:125}} INFO - Service 
> started: \{'ID': 'mxjn03sm32kfm8bcczs4qw6tu'}
> \## the task outputs logs here but they do not show up in the webapp..
> [2019-11-04 16:27:32,909] \{{docker_swarm_operator.py:136}} INFO - Service 
> status before exiting: complete
> [2019-11-04 16:27:37,298] \{{logging_mixin.py:95}} INFO - [2019-11-04 
> 16:27:37,298] \{{local_task_job.py:105}} INFO - Task exited with return code 0
>  Also discussed in https://github.com/apache/airflow/pull/5489
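
The missing behaviour described above can be sketched as follows. This is a minimal illustration, assuming the service output is read through a `service_logs(...)` call shaped like the one on the Docker SDK for Python's low-level `APIClient`; `forward_service_logs` and `FakeClient` are hypothetical names used only here, not the operator's actual code.

```python
import logging
from typing import Iterable, List

log = logging.getLogger("docker_swarm_sketch")


def forward_service_logs(client, service_id: str) -> List[str]:
    """Copy a Swarm service's stdout/stderr into the Python logger.

    `client` is assumed to expose `service_logs(...)` and to yield raw
    byte chunks, as the Docker SDK's low-level client does.
    """
    forwarded = []
    stream: Iterable[bytes] = client.service_logs(
        service_id, stdout=True, stderr=True, follow=True
    )
    for chunk in stream:
        # A chunk may hold several lines; log each one separately.
        for line in chunk.decode("utf-8", errors="replace").splitlines():
            log.info(line)
            forwarded.append(line)
    return forwarded


class FakeClient:
    """Stand-in for a Docker client so the sketch runs without a daemon."""

    def service_logs(self, service_id, stdout=False, stderr=False, follow=False):
        yield b"step 1 done\n"
        yield b"step 2 done\n"


lines = forward_service_logs(FakeClient(), "mxjn03sm32kfm8bcczs4qw6tu")
```

With a real client the loop would have to run alongside the status polling, since following a live log stream blocks until the service stops writing.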





[jira] [Resolved] (AIRFLOW-6464) Add cloud providers CLI tools in Breeze

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6464.
---
Fix Version/s: 1.10.8
   2.0.0
   Resolution: Fixed

> Add cloud providers CLI tools in Breeze
> ---
>
> Key: AIRFLOW-6464
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6464
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Affects Versions: 1.10.7
>Reporter: Kamil Bregula
>Priority: Major
> Fix For: 2.0.0, 1.10.8
>
>






[jira] [Updated] (AIRFLOW-5015) Make AWS Operators Pylint compatible

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-5015:
--
Labels: gsoc gsoc2020 mentor  (was: )

> Make AWS Operators Pylint compatible
> 
>
> Key: AIRFLOW-5015
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5015
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: pylint
>Affects Versions: 2.0.0
>Reporter: Ishan Rastogi
>Assignee: Ishan Rastogi
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
> Fix For: 2.0.0
>
>
> Make AWS Operators Pylint compatible.





[jira] [Updated] (AIRFLOW-5058) Add support for dmypy (Mypy daemon) to Breeze environment

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-5058:
--
Labels: gsoc gsoc2020 mentor  (was: )

> Add support for dmypy (Mypy daemon) to Breeze environment
> -
>
> Key: AIRFLOW-5058
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5058
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Affects Versions: 1.10.4, 2.0.0
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
>
> Per discussion in [https://github.com/apache/airflow/pull/5664] we might use 
> dmypy for local development speedups.





[jira] [Resolved] (AIRFLOW-4891) Extend list of pylint good-names

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-4891.
---
Resolution: Won't Fix

I am closing this one. I think we do not need any more good names.

 

> Extend list of pylint good-names 
> -
>
> Key: AIRFLOW-4891
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4891
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: pylint
>Affects Versions: 2.0.0
>Reporter: Tomasz Urbaszek
>Priority: Major
> Fix For: 2.0.0
>
>






[jira] [Resolved] (AIRFLOW-6465) Add airflow bash autocomplete in Breeze

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6465.
---
Fix Version/s: 1.10.8
   2.0.0
   Resolution: Fixed

> Add airflow bash autocomplete in Breeze
> ---
>
> Key: AIRFLOW-6465
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6465
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Affects Versions: 1.10.7
>Reporter: Kamil Bregula
>Priority: Major
> Fix For: 2.0.0, 1.10.8
>
>






[jira] [Resolved] (AIRFLOW-5761) Check and document that docker-compose >= 1.20 is needed to run breeze

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-5761.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Check and document that docker-compose >= 1.20 is needed to run breeze
> --
>
> Key: AIRFLOW-5761
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5761
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Affects Versions: 2.0.0, 1.10.5, 1.10.6
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0
>
>
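
A version check of this kind could look like the sketch below. It assumes the tool prints a line such as `docker-compose version 1.25.0, build ...` and compares versions as integer tuples; the function names are hypothetical and this is not Breeze's actual check.

```python
import re
from typing import Tuple

MINIMUM = (1, 20)


def parse_compose_version(output: str) -> Tuple[int, ...]:
    """Extract the numeric version from `docker-compose --version` output."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        raise ValueError(f"cannot find a version in: {output!r}")
    return tuple(int(part) for part in match.groups() if part is not None)


def is_supported(output: str, minimum: Tuple[int, ...] = MINIMUM) -> bool:
    """Compare as integer tuples, so that 1.9 sorts below 1.20."""
    return parse_compose_version(output) >= minimum
```

Comparing integer tuples rather than strings matters here: as plain text, "1.9" would sort after "1.20".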






[jira] [Resolved] (AIRFLOW-5026) Group static code checks in one job

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-5026.
---
Resolution: Invalid

> Group static code checks in one job
> --
>
> Key: AIRFLOW-5026
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5026
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Affects Versions: 1.10.4, 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
>
> Initializing the docker image takes some time for static checks. We can group 
> several checks together to save a few minutes.





[jira] [Resolved] (AIRFLOW-6475) Remove duplicated volume mounts in Breeze

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6475.
---
Fix Version/s: 1.10.8
   2.0.0
   Resolution: Fixed

> Remove duplicated volume mounts in Breeze
> -
>
> Key: AIRFLOW-6475
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6475
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Affects Versions: 2.0.0, 1.10.7
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0, 1.10.8
>
>
> We had two sets of duplicated local volume mounts - one for ./breeze 
> interactive runs (with docker_compose) and the other with scripts that are 
> used to run static checks.
> The volumes from the .yaml for docker compose should be the "source of truth" 
> and the bash script should parse the yaml and use volume mounts from there.
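
The "source of truth" idea can be illustrated with a short sketch. The issue speaks of a bash script; Python is used here only for illustration. The sketch assumes the compose file has already been parsed into a dictionary by some YAML loader and that volumes use the short `host:container[:mode]` string syntax. The service name and paths are made up.

```python
from typing import Dict, List


def volume_flags(compose: Dict, service: str) -> List[str]:
    """Turn one service's `volumes:` entries into `docker run -v` arguments."""
    flags: List[str] = []
    for entry in compose["services"][service].get("volumes", []):
        flags.extend(["-v", entry])
    return flags


# Already-parsed compose content; in practice it would come from a YAML loader.
compose = {
    "services": {
        "airflow-testing": {  # hypothetical service name
            "volumes": [
                "../../airflow:/opt/airflow/airflow:cached",
                "../../tests:/opt/airflow/tests:cached",
            ]
        }
    }
}

flags = volume_flags(compose, "airflow-testing")
```

Because the flags are derived from the compose data, adding a mount in the .yaml file would automatically reach the static-check scripts as well.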





[jira] [Resolved] (AIRFLOW-6461) Remove silent flags in dockerfile

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6461.
---
Fix Version/s: 1.10.8
   2.0.0
   Resolution: Fixed

> Remove silent flags in dockerfile
> -
>
> Key: AIRFLOW-6461
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6461
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Affects Versions: 1.10.7
>Reporter: Kamil Bregula
>Priority: Major
> Fix For: 2.0.0, 1.10.8
>
>






[jira] [Resolved] (AIRFLOW-6496) Choosing integration at Breeze start

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6496.
---
Fix Version/s: 1.10.8
   2.0.0
   Resolution: Fixed

> Choosing integration at Breeze start
> 
>
> Key: AIRFLOW-6496
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6496
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Affects Versions: 2.0.0, 1.10.7
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0, 1.10.8
>
>
> You should be able to choose which integration you want to start when you run 
> Breeze.
> At a later stage we will add markers to tests to run them only when the given 
> environment is available.
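
One way such markers might work is sketched below with the standard `unittest` module. The `ENABLED_INTEGRATIONS` environment variable and the `requires_integration` decorator are hypothetical names chosen for this illustration, not an existing Airflow or Breeze interface.

```python
import os
import unittest


def integration_enabled(name: str) -> bool:
    """True when `name` is listed in the (hypothetical) ENABLED_INTEGRATIONS variable."""
    enabled = os.environ.get("ENABLED_INTEGRATIONS", "")
    return name in enabled.split()


def requires_integration(name: str):
    """Decorator that skips a test unless the named integration was started."""
    return unittest.skipUnless(
        integration_enabled(name), f"integration '{name}' is not enabled"
    )
```

A test decorated with `@requires_integration("cassandra")` would then be reported as skipped, rather than failed, when the environment was started without that integration.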





[jira] [Resolved] (AIRFLOW-6470) Avoid pipe file when do curl

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6470.
---
Fix Version/s: 1.10.8
   2.0.0
   Resolution: Fixed

> Avoid pipe file when do curl
> 
>
> Key: AIRFLOW-6470
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6470
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Affects Versions: 1.10.7
>Reporter: Kamil Bregula
>Priority: Major
> Fix For: 2.0.0, 1.10.8
>
>






[jira] [Reopened] (AIRFLOW-5761) Check and document that docker-compose >= 1.20 is needed to run breeze

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reopened AIRFLOW-5761:
---

> Check and document that docker-compose >= 1.20 is needed to run breeze
> --
>
> Key: AIRFLOW-5761
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5761
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Affects Versions: 2.0.0, 1.10.5, 1.10.6
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0
>
>






[jira] [Resolved] (AIRFLOW-5765) Kerberos does not work if ENV == kubernetes in breeze

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-5765.
---
Resolution: Invalid

> Kerberos does not work if ENV == kubernetes in breeze
> -
>
> Key: AIRFLOW-5765
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5765
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Affects Versions: 2.0.0, 1.10.5
>Reporter: Jarek Potiuk
>Priority: Major
>
> When you switch to the "kubernetes" env in Breeze, the kerberos keytab is not 
> initialized.
> cc: [~dimberman]





[jira] [Resolved] (AIRFLOW-4372) Document using local development environments

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-4372.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Document using local development environments
> -
>
> Key: AIRFLOW-4372
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4372
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze, documentation
>Reporter: Aizhamal Nurmamat kyzy
>Priority: Major
>  Labels: gsod2019
> Fix For: 2.0.0
>
>
> h4. Expected deliverables
>  * On-boarding documentation chapter/page that will be easily discoverable 
> for new developers joining Apache Airflow community or someone who wants to 
> start working on Apache Airflow development on a new PC. Ideally that could 
> be a step-by-step guide or some kind of video guide - generally something 
> easy to follow. Specifically it should be clear that there are different 
> local development environments depending on your needs and experience - from 
> local virtualenv through docker image to full-blown replica of CI integration 
> testing environment. Maybe some kind of interactive tutorial would be good as 
> well.
> h4. Related resources
> [1] 
> [https://github.com/PolideaInternal/airflow/blob/simplified-development-workflow/CONTRIBUTING.md]
> [2][Airflow 
> Breeze|https://github.com/PolideaInternal/airflow/blob/simplified-development-workflow/BREEZE.rst]





[jira] [Resolved] (AIRFLOW-4370) Testing: document “best practices” on dealing with Apache Airflow DAGs and operators

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-4370.
---
Fix Version/s: 1.10.7
   2.0.0
   Resolution: Fixed

> Testing: document “best practices” on dealing with Apache Airflow DAGs and 
> operators
> 
>
> Key: AIRFLOW-4370
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4370
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze, documentation
>Reporter: Aizhamal Nurmamat kyzy
>Priority: Major
>  Labels: gsod2019
> Fix For: 2.0.0, 1.10.7
>
>
> h4. Expected deliverables
>  * Introduction to testing workflows.
>  * A page for Designing and Testing DAGs
>  * Tips and working examples on “good practices” for designing DAGs
>  * Descriptions on how to perform dry-runs of DAGs
>  * Descriptions on how to write unit tests for DAGs
>  * Snippets with working examples of DAGs and tests for them. Use 
> [PlantUML|https://github.com/plantuml/plantuml] diagrams to complement all 
> the new documentation
>  * How to develop operators that are testable?
> h4. Related resources
> [1] [https://github.com/jghoman/awesome-apache-airflow]
> [2] [https://airflow.apache.org/scheduler.html]
> [3] 
> [https://github.com/PolideaInternal/airflow/blob/simplified-development-workflow/CONTRIBUTING.md]
> [4] [Airflow 
> Breeze|https://github.com/PolideaInternal/airflow/blob/simplified-development-workflow/BREEZE.rst]
> [5] [https://github.com/godatadriven/whirl]





[jira] [Updated] (AIRFLOW-5761) Check and document that docker-compose >= 1.20 is needed to run breeze

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-5761:
--
Labels: gsoc gsoc2020 mentor  (was: )

> Check and document that docker-compose >= 1.20 is needed to run breeze
> --
>
> Key: AIRFLOW-5761
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5761
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Affects Versions: 2.0.0, 1.10.5, 1.10.6
>Reporter: Jarek Potiuk
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
> Fix For: 2.0.0
>
>






[jira] [Resolved] (AIRFLOW-5366) SLIM IMAGE version overrides python version to be always 3.5

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-5366.
---
Resolution: Invalid

> SLIM IMAGE version overrides python version to be always 3.5 
> -
>
> Key: AIRFLOW-5366
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5366
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze
>Affects Versions: 1.10.4, 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
>






[jira] [Resolved] (AIRFLOW-6201) How can I manually trigger a future dag?

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6201.
---
Resolution: Duplicate

> How can I manually trigger a future dag?
> 
>
> Key: AIRFLOW-6201
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6201
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: DAG, DagRun, scheduler
>Affects Versions: 1.10.4
>Reporter: xifeng
>Priority: Major
>
> For example, I set the dag to run every 10 days. Its latest {{start_time}} is 
> {{2019-11-30}}, so the next {{start_time}} will be {{2019-12-10}}.
> However, I occasionally want it to run in advance. As today is 
> {{2019-12-09}}, how can I trigger the future dag in advance? Is there any 
> way to run the future dag manually?
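
The dates in the question can be checked with a few lines of arithmetic. This only reproduces the reporter's numbers with the standard `datetime` module; the variable names are illustrative and nothing here calls Airflow.

```python
from datetime import date, timedelta

interval = timedelta(days=10)
latest_run = date(2019, 11, 30)
today = date(2019, 12, 9)

# The run the scheduler would create next, one interval after the latest one.
next_run = latest_run + interval
# How many days ahead of schedule a manual run made today would be.
days_early = (next_run - today).days
```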





[jira] [Updated] (AIRFLOW-6328) Improve multiline output in admin gui

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-6328:
--
Labels: gsoc gsoc2020  (was: )

> Improve multiline output in admin gui
> -
>
> Key: AIRFLOW-6328
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6328
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ui
>Affects Versions: 1.10.6
>Reporter: Paul Rhodes
>Assignee: Daniel Imberman
>Priority: Major
>  Labels: gsoc, gsoc2020
>
> Multiline attributes, rendered templates, or Xcom variables are not well 
> supported in the admin GUI at present. Any values are treated as native HTML 
> text() blocks and as such all formatting is lost. When passing structured 
> data such as YAML in these variables, it makes a real mess of them.
> Ideally, these values should keep their line-breaks and indentation.
> This should only require wrapping these code blocks in a <pre> block or 
> setting `white-space: pre` on the class for the block.
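
The suggested fix can be sketched in a few lines. This is an illustration using only the standard `html` module; `render_multiline` is a hypothetical helper, not a function from the Airflow web UI.

```python
from html import escape


def render_multiline(value: str) -> str:
    """Escape the value and wrap it so line breaks and indentation survive."""
    return "<pre>" + escape(value) + "</pre>"


rendered = render_multiline("key:\n  - a\n  - <b>")
```

Escaping first keeps markup inside the value from being interpreted, while the preformatted wrapper preserves the YAML layout.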





[jira] [Updated] (AIRFLOW-6538) prevent autocomplete of username in login UI

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-6538:
--
Labels: gsoc gsoc2020 mentor  (was: )

> prevent autocomplete of username in login UI
> 
>
> Key: AIRFLOW-6538
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6538
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: security, ui
>Affects Versions: 1.10.7
>Reporter: t oo
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
>
> The login page of the UI has autocomplete for the username field. This should be 
> disabled for security.





[jira] [Resolved] (AIRFLOW-6545) Validate all commit messages in PR for AIRFLOW-XXXX

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6545.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Validate all commit messages in PR for AIRFLOW-XXXX
> ---
>
> Key: AIRFLOW-6545
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6545
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0
>
>
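
A check of this kind might look like the sketch below. It assumes the convention is a subject line starting with `[AIRFLOW-<number>]` or the literal `[AIRFLOW-XXXX]`; the exact pattern and the function name are assumptions, not the project's actual CI script.

```python
import re
from typing import List

# Assumed convention: "[AIRFLOW-<digits>] ..." or the literal "[AIRFLOW-XXXX] ...".
SUBJECT_PATTERN = re.compile(r"^\[AIRFLOW-(\d+|XXXX)\] \S")


def invalid_subjects(subjects: List[str]) -> List[str]:
    """Return every commit subject in the PR that does not match the pattern."""
    return [s for s in subjects if not SUBJECT_PATTERN.match(s)]
```

Validating every subject in the PR, rather than only the first, is what catches a stray fix-up commit without an issue reference.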






[jira] [Updated] (AIRFLOW-6545) Validate all commit messages in PR for AIRFLOW-XXXX

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-6545:
--
Summary: Validate all commit messages in PR for AIRFLOW-XXXX  (was: 
Validate all commit messages in PR for AIRFLOW-XXX)

> Validate all commit messages in PR for AIRFLOW-XXXX
> ---
>
> Key: AIRFLOW-6545
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6545
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
>






[jira] [Resolved] (AIRFLOW-6570) Add dag tag for all example dag

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6570.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Add dag tag for all example dag
> ---
>
> Key: AIRFLOW-6570
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6570
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: examples
>Affects Versions: 1.10.7
>Reporter: zhongjiajie
>Assignee: zhongjiajie
>Priority: Major
> Fix For: 2.0.0
>
>
> We missed adding some dag tags in [https://github.com/apache/airflow/pull/6489]. 
> This patch adds them to all the other example dags.





[jira] [Updated] (AIRFLOW-6546) add GDriveToGcsOperator

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-6546:
--
Labels: gsoc2020 mentor  (was: )

> add GDriveToGcsOperator
> ---
>
> Key: AIRFLOW-6546
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6546
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 1.10.6
>Reporter: lovk korm
>Priority: Major
>  Labels: gsoc2020, mentor
>
> There is GcsToGDriveOperator but there isn't the equivalent in the other 
> direction
>  
>  





[jira] [Updated] (AIRFLOW-6474) list_dag_runs cli command should allow exec_date between start/end range and print start/end times

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-6474:
--
Labels: gsoc gsoc2020 mentor  (was: )

> list_dag_runs cli command should allow exec_date between start/end range and 
> print start/end times
> --
>
> Key: AIRFLOW-6474
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6474
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: cli
>Affects Versions: 1.10.7
>Reporter: t oo
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
>
> 1. accept arguments exec_date_from and exec_date_to to filter the execution_dates 
> returned, i.e. show dag runs with exec_date between 20190901 and 20190930
> 2. separately, print the start_date and end_date of each dagrun in the 
> output (i.e. execdate for 20190907 had start_date 2019090804:23 and end_date 
> 2019090804:38)
> 3. dag_id arg should be optional
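
The three requests can be sketched independently of the CLI. `DagRunRow`, `filter_runs` and `format_run` below are hypothetical names operating on plain records; they illustrate the requested filtering and output, not the existing `list_dag_runs` implementation.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class DagRunRow:
    """Hypothetical, simplified stand-in for a DAG run record."""
    dag_id: str
    execution_date: datetime
    start_date: datetime
    end_date: Optional[datetime]


def filter_runs(runs: List[DagRunRow],
                exec_date_from: Optional[datetime] = None,
                exec_date_to: Optional[datetime] = None,
                dag_id: Optional[str] = None) -> List[DagRunRow]:
    """Keep runs inside the inclusive execution-date range; every filter is optional."""
    selected = []
    for run in runs:
        if dag_id is not None and run.dag_id != dag_id:
            continue
        if exec_date_from is not None and run.execution_date < exec_date_from:
            continue
        if exec_date_to is not None and run.execution_date > exec_date_to:
            continue
        selected.append(run)
    return selected


def format_run(run: DagRunRow) -> str:
    """One output line that includes the start and end times of the run."""
    start = run.start_date.isoformat(" ")
    end = run.end_date.isoformat(" ") if run.end_date else "-"
    return f"{run.dag_id} | {run.execution_date.date()} | {start} | {end}"
```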





[jira] [Updated] (AIRFLOW-6579) One of Celery executor tests is flaky

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-6579:
--
Labels: gsoc gsoc2020 mentor  (was: )

> One of Celery executor tests is flaky 
> --
>
> Key: AIRFLOW-6579
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6579
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: celery, ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
> Attachments: celery_eecutor_failure.log
>
>
> tests/executors/test_celery_executor.py::TestCeleryExecutor::test_celery_integration_0_amqp_guest_guest_rabbitmq_5672
>  
> Log attached.





[jira] [Updated] (AIRFLOW-6579) One of Celery executor tests is flaky

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-6579:
--
Summary: One of Celery executor tests is flaky   (was: One of Celery 
executir tests is flaky )

> One of Celery executor tests is flaky 
> --
>
> Key: AIRFLOW-6579
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6579
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: celery, ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
> Attachments: celery_eecutor_failure.log
>
>
> tests/executors/test_celery_executor.py::TestCeleryExecutor::test_celery_integration_0_amqp_guest_guest_rabbitmq_5672
>  
> Log attached.





[jira] [Resolved] (AIRFLOW-6186) Switch pymssql to pyodbc for connecting to Microsoft SQL Server

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6186.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Switch pymssql to pyodbc for connecting to Microsoft SQL Server
> 
>
> Key: AIRFLOW-6186
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6186
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: dependencies
>Affects Versions: 1.10.6
>Reporter: Xinbin Huang
>Assignee: Mario Mendes
>Priority: Major
> Fix For: 2.0.0
>
>
> The author of the `pymssql` project has announced that support for it will be 
> discontinued. It is better to switch over to another package (i.e. pyodbc) 
> for better stability.
> Note: the current workaround is to pin the version to <3.0 ( 
> -AIRFLOW-5942- )
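
For orientation, pyodbc connects through an ODBC connection string rather than separate host and port arguments. The sketch below only builds such a string and never opens a connection; the helper name, the driver name and the example host are assumptions, and the driver must match one actually installed.

```python
def mssql_odbc_connection_string(host, port, database, user, password,
                                 driver="ODBC Driver 17 for SQL Server"):
    """Build an ODBC connection string for SQL Server.

    The default driver name is an assumption; it has to match an installed driver.
    Values containing ';' or braces would need extra quoting, which is omitted here.
    """
    return (
        f"DRIVER={{{driver}}};"
        f"SERVER={host},{port};"
        f"DATABASE={database};"
        f"UID={user};PWD={password}"
    )


conn_str = mssql_odbc_connection_string("db.example.com", 1433, "airflow", "user", "secret")
# With pyodbc installed, this string would be passed to pyodbc.connect(conn_str).
```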



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (AIRFLOW-2289) Add additional quick start to INSTALL

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-2289:
--
Labels: gsoc gsoc2020 mentor  (was: )

> Add additional quick start to INSTALL
> -
>
> Key: AIRFLOW-2289
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2289
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Bolke de Bruin
>Priority: Blocker
>  Labels: gsoc, gsoc2020, mentor
> Fix For: 1.10.0
>
>






[jira] [Updated] (AIRFLOW-2697) Drop snakebite in favour of pyarrow

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-2697:
--
Labels: gsoc gsoc2020 mentor  (was: )

> Drop snakebite in favour of pyarrow
> ---
>
> Key: AIRFLOW-2697
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2697
> Project: Apache Airflow
>  Issue Type: Improvement
>Affects Versions: 1.9.0
>Reporter: Julian de Ruiter
>Assignee: Julian de Ruiter
>Priority: Blocker
>  Labels: gsoc, gsoc2020, mentor
> Fix For: 2.0.0
>
>
> The current HdfsHook relies on the snakebite library, which is unfortunately 
> not compatible with Python 3. To add Python 3 support for the HdfsHook 
> requires switching to a different library for interacting with HDFS. The 
> hdfs3 library is an attractive alternative, as it supports Python 3 and seems 
> to be stable and relatively well supported.
> Update: hdfs3 doesn't get any updates anymore. The best library right now 
> seems to be pyarrow: https://arrow.apache.org/docs/python/filesystems.html
> Therefore I would like to upgrade to pyarrow instead of hdfs3.





[jira] [Updated] (AIRFLOW-5850) Make DockerSwarmOperator capture task logs

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-5850:
--
Labels: docker gsoc gsoc2020 mentor orchestration swarm  (was: docker 
orchestration swarm)

> Make DockerSwarmOperator capture task logs
> --
>
> Key: AIRFLOW-5850
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5850
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: contrib, operators
>Affects Versions: 1.10.6
> Environment: Reproducible everywhere since missing feature - Tested 
> via puckel/airflow:latest
>Reporter: Henning Redestig
>Assignee: Akshesh Doshi
>Priority: Major
>  Labels: docker, gsoc, gsoc2020, mentor, orchestration, swarm
> Fix For: 2.0.0
>
>
> contrib.operators.DockerSwarmOperator gives output on task starting and 
> exiting but does not capture the logs from the task.  E.g.
> [2019-11-04 16:27:30,337] {docker_swarm_operator.py:125} INFO - Service started: {'ID': 'mxjn03sm32kfm8bcczs4qw6tu'}
> ## the task outputs logs here but they do not show up in the webapp..
> [2019-11-04 16:27:32,909] {docker_swarm_operator.py:136} INFO - Service status before exiting: complete
> [2019-11-04 16:27:37,298] {logging_mixin.py:95} INFO - [2019-11-04 16:27:37,298] {local_task_job.py:105} INFO - Task exited with return code 0
>  Also discussed in https://github.com/apache/airflow/pull/5489





[jira] [Updated] (AIRFLOW-6401) Request for OktopostToGoogleStorageOperator

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-6401:
--
Labels: gsoc gsoc2020 mentor  (was: )

> Request for OktopostToGoogleStorageOperator
> ---
>
> Key: AIRFLOW-6401
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6401
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: gcp, operators
>Affects Versions: 1.10.5
>Reporter: HaloKu
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
>
> [https://www.oktopost.com/]
>  
>  





[jira] [Updated] (AIRFLOW-4425) Add FacebookAdsHook

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-4425:
--
Labels: gsoc gsoc2020 mentor  (was: )

> Add FacebookAdsHook
> ---
>
> Key: AIRFLOW-4425
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4425
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: hooks
>Reporter: jack
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
> Fix For: 2.0.0
>
>
> Add hook to interact with FacebookAds





[jira] [Updated] (AIRFLOW-5154) Add docs how to integrate with grafana and prometheus

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-5154:
--
Labels: gsoc gsoc2020 mentor  (was: )

> Add docs how to integrate with grafana and prometheus
> -
>
> Key: AIRFLOW-5154
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5154
> Project: Apache Airflow
>  Issue Type: Wish
>  Components: documentation
>Affects Versions: 1.10.4
>Reporter: lovk korm
>Priority: Major
>  Labels: gsoc, gsoc2020, mentor
>
> I'm not sure how this is doable, but one of the key components missing 
> in airflow is the ability to notify about detected anomalies, using something like 
> Grafana [https://grafana.com/].
> It would be great if airflow could add support for such tools.
>  
> I'm talking here about +*airflow itself*+. For example: if a DAG run normally 
> takes 5 minutes but now, for any reason, it has been running for over 30 minutes, then we 
> want an alert to be sent with a graph that shows that anomaly.
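
The example in the request can be expressed as a simple rule. The sketch below flags a run whose duration exceeds a multiple of the median of earlier runs; the factor of 3 and the function name are arbitrary assumptions, and a real setup would more likely export durations as metrics and alert from the monitoring tool.

```python
from statistics import median
from typing import Sequence


def is_duration_anomaly(history_minutes: Sequence[float], current_minutes: float,
                        factor: float = 3.0) -> bool:
    """Flag a run that takes more than `factor` times the median of past runs."""
    if not history_minutes:
        return False  # nothing to compare against yet
    return current_minutes > factor * median(history_minutes)
```

With a history of runs around 5 minutes, a 30-minute run exceeds the 15-minute threshold and would be flagged, while a 7-minute run would not.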





[jira] [Resolved] (AIRFLOW-5520) DataflowPythonOperator dependency management requires side effects

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-5520.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> DataflowPythonOperator dependency management requires side effects
> --
>
> Key: AIRFLOW-5520
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5520
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 1.10.2
>Reporter: Jacob Ferriero
>Priority: Major
> Fix For: 2.0.0
>
>
> When using DataflowPythonOperator it is difficult to manage the Apache Beam 
> version (and other Python dependencies) without affecting your entire 
> Airflow environment. It seems the Dataflow hook just submits a subprocess and 
> python 
> The operator / hook should be improved to isolate Python dependencies for 
> running py_file.
> Perhaps this could be achieved in a virtual environment (similar to 
> PythonVirtualenvOperator).
> For Beam it's often customary to specify a --requirements_file or 
> --setup_file to manage Python dependencies; we could run one of these in the 
> venv to get it set up. 
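
A rough sketch of the virtual-environment idea, assuming only the standard
library. The helper name and its arguments are hypothetical, and this is not
how the Dataflow hook is implemented; it only builds the commands that would
create a venv, install a requirements file into it, and run py_file with the
venv's interpreter.

```python
import os
import sys


def build_isolated_commands(venv_dir, py_file, requirements_file=None):
    """Return the commands that create a venv, install deps, and run py_file."""
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    venv_python = os.path.join(venv_dir, bin_dir, "python")

    # Step 1: create the virtual environment with the current interpreter.
    commands = [[sys.executable, "-m", "venv", venv_dir]]
    # Step 2 (optional): install the pipeline's dependencies into the venv only.
    if requirements_file:
        commands.append(
            [venv_python, "-m", "pip", "install", "-r", requirements_file]
        )
    # Step 3: run the pipeline file with the venv's interpreter.
    commands.append([venv_python, py_file])
    return commands


for command in build_isolated_commands("/tmp/beam_venv", "pipeline.py", "requirements.txt"):
    print(" ".join(command))
```

Each command could then be passed to subprocess.run(command, check=True), so
that the Beam version lives in the venv and not in the Airflow environment.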





[jira] [Resolved] (AIRFLOW-6589) BATS tests are not executed when bash scripts change in pre-commit

2020-01-19 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6589.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> BATS tests are not executed when bash scripts change in pre-commit
> --
>
> Key: AIRFLOW-6589
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6589
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: pre-commit
>Affects Versions: 2.0.0, 1.10.7
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0
>
>






[jira] [Resolved] (AIRFLOW-6594) Raise an exception when the GCP connection is misconfigured

2020-01-18 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6594.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Raise an exception when the GCP connection is misconfigured
> ---
>
> Key: AIRFLOW-6594
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6594
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: gcp
>Affects Versions: 1.10.7
>Reporter: Kamil Bregula
>Priority: Major
> Fix For: 2.0.0
>
>
> We cannot pass two service account keys to the connection configuration at 
> the same time





[jira] [Resolved] (AIRFLOW-6583) (BigQuery) Add query_params to templated_fields

2020-01-18 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6583.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> (BigQuery) Add query_params to templated_fields
> ---
>
> Key: AIRFLOW-6583
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6583
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: gcp
>Affects Versions: 1.10.7
>Reporter: Jithin Sukumar
>Assignee: Jithin Sukumar
>Priority: Minor
> Fix For: 2.0.0
>
>
> To query time-partitioned tables, I am passing query_params like this:
> yesterday = Variable.get('yesterday', '{{yesterday_ds}}')
> today = Variable.get('today', '{{ds}}')
> ...
> query_params=[{'name': 'yesterday', 'parameterType': {'type': 'STRING'},
>                'parameterValue': {'value': yesterday}},
>               {'name': 'today', 'parameterType': {'type': 'STRING'},
>                'parameterValue': {'value': today}}]
> query_params needs to be a template_field
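
To illustrate why this matters: once a field is templated, placeholders nested
inside lists and dicts get substituted before the query runs. The sketch below
is a simplified stand-in, not Airflow's renderer (Airflow uses Jinja); the
render function, the regular expression, and the context values are
assumptions made for the example.

```python
import re

# Matches simple placeholders such as {{ds}} or {{ yesterday_ds }}.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(value, context):
    """Recursively substitute placeholders in strings, lists, and dicts."""
    if isinstance(value, str):
        return _PLACEHOLDER.sub(lambda m: str(context[m.group(1)]), value)
    if isinstance(value, list):
        return [render(item, context) for item in value]
    if isinstance(value, dict):
        return {key: render(item, context) for key, item in value.items()}
    return value  # numbers, None, etc. are passed through unchanged


query_params = [
    {"name": "yesterday", "parameterType": {"type": "STRING"},
     "parameterValue": {"value": "{{yesterday_ds}}"}},
    {"name": "today", "parameterType": {"type": "STRING"},
     "parameterValue": {"value": "{{ds}}"}},
]
context = {"ds": "2020-01-18", "yesterday_ds": "2020-01-17"}  # example values
print(render(query_params, context))
```

Without query_params among the templated fields, the placeholder strings would
reach BigQuery unrendered.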





[jira] [Created] (AIRFLOW-6592) Doc tests can be run in parallel to tests in CI

2020-01-18 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-6592:
-

 Summary: Doc tests can be run in parallel to tests in CI
 Key: AIRFLOW-6592
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6592
 Project: Apache Airflow
  Issue Type: Improvement
  Components: ci
Affects Versions: 1.10.7, 2.0.0
Reporter: Jarek Potiuk








[jira] [Updated] (AIRFLOW-6592) Doc build can be run in parallel to tests in CI

2020-01-18 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-6592:
--
Summary: Doc build can be run in parallel to tests in CI  (was: Doc tests 
can be run in parallel to tests in CI)

> Doc build can be run in parallel to tests in CI
> ---
>
> Key: AIRFLOW-6592
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6592
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0, 1.10.7
>Reporter: Jarek Potiuk
>Priority: Major
>






[jira] [Updated] (AIRFLOW-6589) BATS tests are not executed when bash scripts change in pre-commit

2020-01-18 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-6589:
--
Affects Version/s: 1.10.7

> BATS tests are not executed when bash scripts change in pre-commit
> --
>
> Key: AIRFLOW-6589
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6589
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: pre-commit
>Affects Versions: 2.0.0, 1.10.7
>Reporter: Jarek Potiuk
>Priority: Major
>






[jira] [Created] (AIRFLOW-6589) BATS tests are not executed when bash scripts change in pre-commit

2020-01-18 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-6589:
-

 Summary: BATS tests are not executed when bash scripts change in 
pre-commit
 Key: AIRFLOW-6589
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6589
 Project: Apache Airflow
  Issue Type: Improvement
  Components: pre-commit
Affects Versions: 2.0.0
Reporter: Jarek Potiuk








[jira] [Resolved] (AIRFLOW-6557) Warn if new fields are added to BaseOperator (for serialization)

2020-01-17 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6557.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Warn if new fields are added to BaseOperator (for serialization)
> 
>
> Key: AIRFLOW-6557
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6557
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: serialization, tests
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0
>
>
> There should be an automated test that tells people what to do in case they 
> add a field to BaseOperator. The test should fail and explain how serialisation 
> should be updated.
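
One way such a guard test could look, as a sketch only. FakeBaseOperator and
EXPECTED_FIELDS are hypothetical stand-ins; the real test would inspect
BaseOperator and the actual list of serialized fields.

```python
class FakeBaseOperator:
    """Hypothetical stand-in for BaseOperator, used only for this sketch."""

    def __init__(self, task_id, owner="airflow", retries=0):
        self.task_id = task_id
        self.owner = owner
        self.retries = retries


# Fields the (hypothetical) serialization code is known to handle.
EXPECTED_FIELDS = {"task_id", "owner", "retries"}


def check_fields(operator_cls, expected):
    """Fail with instructions when the operator's fields drift from `expected`."""
    actual = set(vars(operator_cls(task_id="probe")))
    added = actual - expected
    removed = expected - actual
    if added or removed:
        raise AssertionError(
            f"Operator fields changed: added={sorted(added)}, "
            f"removed={sorted(removed)}. Update the serialization code to "
            "handle them, then update EXPECTED_FIELDS."
        )


check_fields(FakeBaseOperator, EXPECTED_FIELDS)  # passes while nothing changed
```

The point of the failure message is that whoever adds a field is told exactly
what to update, instead of discovering a serialization bug later.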





[jira] [Resolved] (AIRFLOW-6584) Pin cassandra driver

2020-01-17 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6584.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Pin cassandra driver
> 
>
> Key: AIRFLOW-6584
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6584
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0
>
>
> 3.21.0 release of Cassandra driver 
> ([https://pypi.org/project/cassandra-driver/3.21.0/]) broke backwards 
> compatibility. We need to pin it to 3.20.2





[jira] [Created] (AIRFLOW-6584) Pin cassandra driver

2020-01-16 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-6584:
-

 Summary: Pin cassandra driver
 Key: AIRFLOW-6584
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6584
 Project: Apache Airflow
  Issue Type: Improvement
  Components: ci
Affects Versions: 2.0.0
Reporter: Jarek Potiuk


3.21.0 release of Cassandra driver 
([https://pypi.org/project/cassandra-driver/3.21.0/]) broke backwards 
compatibility. We need to pin it to 3.20.2





[jira] [Commented] (AIRFLOW-6556) Improving unclear and incomplete documentation

2020-01-16 Thread Jarek Potiuk (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17017504#comment-17017504
 ] 

Jarek Potiuk commented on AIRFLOW-6556:
---

[~uncletoxa] (y)

> Improving unclear and incomplete documentation
> --
>
> Key: AIRFLOW-6556
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6556
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: master
>Reporter: Jacob Ward
>Assignee: Jarek Potiuk
>Priority: Trivial
>
> To help improve documentation it was discussed in the mailing list that users 
> of Airflow should have somewhere to report missing, incomplete or unclear 
> documentation. Any users who find this should comment on this ticket.





[jira] [Commented] (AIRFLOW-6556) Improving unclear and incomplete documentation

2020-01-16 Thread Jarek Potiuk (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016788#comment-17016788
 ] 

Jarek Potiuk commented on AIRFLOW-6556:
---

Cool. I don't promise anything super-fast, but I will look for opportunities 
to engage more people. Having this issue here, provided by users, and users' 
support for review is crucial to get people involved!
 

> Improving unclear and incomplete documentation
> --
>
> Key: AIRFLOW-6556
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6556
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: master
>Reporter: Jacob Ward
>Assignee: Jarek Potiuk
>Priority: Trivial
>
> To help improve documentation it was discussed in the mailing list that users 
> of Airflow should have somewhere to report missing, incomplete or unclear 
> documentation. Any users who find this should comment on this ticket.





[jira] [Created] (AIRFLOW-6579) One of Celery executor tests is flaky

2020-01-16 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-6579:
-

 Summary: One of Celery executor tests is flaky 
 Key: AIRFLOW-6579
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6579
 Project: Apache Airflow
  Issue Type: Improvement
  Components: celery, ci
Affects Versions: 2.0.0
Reporter: Jarek Potiuk
 Attachments: celery_eecutor_failure.log

tests/executors/test_celery_executor.py::TestCeleryExecutor::test_celery_integration_0_amqp_guest_guest_rabbitmq_5672

 

Log attached.





[jira] [Resolved] (AIRFLOW-6575) Change entropy source in all ci/breeze containers to urandom (unblocking)

2020-01-15 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6575.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Change entropy source in all ci/breeze containers to urandom (unblocking)
> -
>
> Key: AIRFLOW-6575
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6575
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: breeze, ci
>Affects Versions: 2.0.0, 1.10.7
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0
>
>






[jira] [Created] (AIRFLOW-6575) Change entropy source in all ci/breeze containers to urandom (unblocking)

2020-01-15 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-6575:
-

 Summary: Change entropy source in all ci/breeze containers to 
urandom (unblocking)
 Key: AIRFLOW-6575
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6575
 Project: Apache Airflow
  Issue Type: Improvement
  Components: breeze, ci
Affects Versions: 1.10.7, 2.0.0
Reporter: Jarek Potiuk








[jira] [Resolved] (AIRFLOW-6564) Display extra diagnostics if initial environment check fails

2020-01-15 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6564.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Display extra diagnostics if initial environment check fails
> 
>
> Key: AIRFLOW-6564
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6564
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: ci
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0
>
>






[jira] [Commented] (AIRFLOW-6556) Improving unclear and incomplete documentation

2020-01-15 Thread Jarek Potiuk (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-6556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016323#comment-17016323
 ] 

Jarek Potiuk commented on AIRFLOW-6556:
---

Thanks [~jward]! Some of these we already have on our list. It's good to have 
the list here. I will discuss it with others involved in the docs and we'll see 
what we can do. I hope, once we have some drafts we can count on you and your 
team to review and give feedback on what we propose.

> Improving unclear and incomplete documentation
> --
>
> Key: AIRFLOW-6556
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6556
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: master
>Reporter: Jacob Ward
>Assignee: Jarek Potiuk
>Priority: Trivial
>
> To help improve documentation it was discussed in the mailing list that users 
> of Airflow should have somewhere to report missing, incomplete or unclear 
> documentation. Any users who find this should comment on this ticket.





[jira] [Resolved] (AIRFLOW-6552) Move Azure classes to providers.microsoft package

2020-01-15 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6552.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Move Azure classes to providers.microsoft package
> -
>
> Key: AIRFLOW-6552
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6552
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: hooks, operators
>Affects Versions: 1.10.7
>Reporter: Kamil Bregula
>Priority: Major
> Fix For: 2.0.0
>
>
> More information: 
> [https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-21%3A+Changes+in+import+paths]





[jira] [Created] (AIRFLOW-6564) Display extra diagnostics if initial environment check fails

2020-01-14 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-6564:
-

 Summary: Display extra diagnostics if initial environment check 
fails
 Key: AIRFLOW-6564
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6564
 Project: Apache Airflow
  Issue Type: Improvement
  Components: ci
Affects Versions: 2.0.0
Reporter: Jarek Potiuk








[jira] [Updated] (AIRFLOW-2516) Deadlock found when trying to update task_instance table

2020-01-14 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk updated AIRFLOW-2516:
--
Attachment: (was: jobs_fixed_deadlock_possibly_1.9.py)

> Deadlock found when trying to update task_instance table
> 
>
> Key: AIRFLOW-2516
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2516
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DagRun
>Affects Versions: 1.8.0, 1.9.0, 1.10.0, 1.10.1, 1.10.2, 1.10.3, 1.10.4, 
> 1.10.5, 1.10.6, 1.10.7
>Reporter: Jeff Liu
>Assignee: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.8
>
> Attachments: Screenshot 2019-12-30 at 10.42.52.png, 
> image-2019-12-30-10-48-41-313.png, image-2019-12-30-10-58-02-610.png, 
> jobs.py, jobs_fixed_deadlock_possibly_1.9.py, 
> scheduler_job_fixed_deadlock_possibly_1.10.6.py
>
>
>  
>  
> {code:java}
> [2018-05-23 17:59:57,218] {base_task_runner.py:98} INFO - Subtask: 
> [2018-05-23 17:59:57,217] {base_executor.py:49} INFO - Adding to queue: 
> airflow run production_wipeout_wipe_manager.Carat Carat_20180227 
> 2018-05-23T17:41:18.815809 --local -sd DAGS_FOLDER/wipeout/wipeout.py
> [2018-05-23 17:59:57,231] {base_task_runner.py:98} INFO - Subtask: Traceback 
> (most recent call last):
> [2018-05-23 17:59:57,232] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/bin/airflow", line 27, in 
> [2018-05-23 17:59:57,232] {base_task_runner.py:98} INFO - Subtask: 
> args.func(args)
> [2018-05-23 17:59:57,232] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/airflow/bin/cli.py", line 392, in run
> [2018-05-23 17:59:57,232] {base_task_runner.py:98} INFO - Subtask: 
> pool=args.pool,
> [2018-05-23 17:59:57,233] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/airflow/utils/db.py", line 50, in 
> wrapper
> [2018-05-23 17:59:57,233] {base_task_runner.py:98} INFO - Subtask: result = 
> func(*args, **kwargs)
> [2018-05-23 17:59:57,233] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/airflow/models.py", line 1532, in 
> _run_raw_task
> [2018-05-23 17:59:57,234] {base_task_runner.py:98} INFO - Subtask: 
> self.handle_failure(e, test_mode, context)
> [2018-05-23 17:59:57,234] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/airflow/models.py", line 1641, in 
> handle_failure
> [2018-05-23 17:59:57,234] {base_task_runner.py:98} INFO - Subtask: 
> session.merge(self)
> [2018-05-23 17:59:57,235] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 
> 1920, in merge
> [2018-05-23 17:59:57,235] {base_task_runner.py:98} INFO - Subtask: 
> _resolve_conflict_map=_resolve_conflict_map)
> [2018-05-23 17:59:57,235] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 
> 1974, in _merge
> [2018-05-23 17:59:57,236] {base_task_runner.py:98} INFO - Subtask: merged = 
> self.query(mapper.class_).get(key[1])
> [2018-05-23 17:59:57,236] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 882, 
> in get
> [2018-05-23 17:59:57,236] {base_task_runner.py:98} INFO - Subtask: ident, 
> loading.load_on_pk_identity)
> [2018-05-23 17:59:57,236] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 952, 
> in _get_impl
> [2018-05-23 17:59:57,237] {base_task_runner.py:98} INFO - Subtask: return 
> db_load_fn(self, primary_key_identity)
> [2018-05-23 17:59:57,237] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/loading.py", line 247, 
> in load_on_pk_identity
> [2018-05-23 17:59:57,237] {base_task_runner.py:98} INFO - Subtask: return 
> q.one()
> [2018-05-23 17:59:57,238] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2884, 
> in one
> [2018-05-23 17:59:57,238] {base_task_runner.py:98} INFO - Subtask: ret = 
> self.one_or_none()
> [2018-05-23 17:59:57,238] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2854, 
> in one_or_none
> [2018-05-23 17:59:57,238] {base_task_runner.py:98} INFO - Subtask: ret = 
> list(self)
> [2018-05-23 17:59:57,239] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2925, 
> in __iter__
> [2018-05-23 17:59:57,239] {base_task_runner.py:98} INFO - Subtask: return 
> self._execute_and_instances(context)
> [2018-05-23 17:59:57,239] {base_task_runner.py:98} INFO 

[jira] [Created] (AIRFLOW-6557) Warn if new fields are added to BaseOperator (for serialization)

2020-01-14 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-6557:
-

 Summary: Warn if new fields are added to BaseOperator (for 
serialization)
 Key: AIRFLOW-6557
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6557
 Project: Apache Airflow
  Issue Type: Improvement
  Components: serialization, tests
Affects Versions: 2.0.0
Reporter: Jarek Potiuk


There should be an automated test that tells people what to do in case they 
add a field to BaseOperator. The test should fail and explain how serialisation 
should be updated.





[jira] [Reopened] (AIRFLOW-1467) allow tasks to use more than one pool slot

2020-01-13 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk reopened AIRFLOW-1467:
---
  Assignee: Lokesh Lal

Reopening as there were accidentally two merge heads created and we needed to 
revert the merged request

> allow tasks to use more than one pool slot
> --
>
> Key: AIRFLOW-1467
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1467
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: scheduler
>Reporter: Adrian Bridgett
>Assignee: Lokesh Lal
>Priority: Trivial
>  Labels: pool
> Fix For: 1.10.8
>
>
> It would be useful to have tasks use more than a single pool slot. 
> Our use case is actually to limit how many tasks run on a head node (due to 
> memory constraints); currently we have to set a pool limit on the number of 
> tasks.
> Ideally we could set the pool size to e.g. the amount of memory and then set 
> those tasks' pool_usage (or whatever the option would be called) to the amount 
> of memory we think they'll use.  This way the pool would let lots of small 
> tasks run or just a few large tasks.
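
The behaviour asked for can be sketched as a simple admission loop. This is an
illustration under assumptions, not the scheduler's implementation: the
function name, the (task_id, slots) tuples, and the greedy "skip what does not
fit" policy are all made up for the example.

```python
def admit_tasks(pool_size, queued):
    """Pick tasks that fit into the pool, in queue order.

    queued: list of (task_id, slots) pairs, highest priority first.
    Returns the ids of the tasks that may start now.
    """
    free = pool_size
    admitted = []
    for task_id, slots in queued:
        if slots <= free:  # the task fits into the remaining capacity
            admitted.append(task_id)
            free -= slots
        # otherwise the task stays queued until slots are released
    return admitted


# Pool sized as 16 units of memory; each task declares how much it needs.
queue = [("big", 12), ("small_1", 2), ("small_2", 2), ("medium", 6)]
print(admit_tasks(16, queue))
```

With every task using one slot this reduces to the existing behaviour, where
the pool size simply caps the number of running tasks.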





[jira] [Resolved] (AIRFLOW-5505) task_instance table errors in metastore db with localexecutor/mysql

2020-01-13 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-5505.
---
Fix Version/s: 1.10.8
   Resolution: Duplicate

> task_instance table errors in metastore db with localexecutor/mysql
> ---
>
> Key: AIRFLOW-5505
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5505
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: database, scheduler
>Affects Versions: 1.10.3
>Reporter: t oo
>Priority: Major
> Fix For: 1.10.8
>
>
> using v1.10.3, localexecutor, mysql backend. MySQL CPU usage is around 50%, 
> peaking at 70%. I externally trigger 30 DAGs in parallel (different execution 
> dates but same dagid). I repeat that same pattern for 20 different DAGids.
> ie dagidA - run execdate 1-30sep in parallel
> let those 30 runs finish then:
> dagidB - run execdate 1-30sep in parallel
> let those 30 runs finish then:
> dagidC - run execdate 1-30sep in parallel
> ..etc
>  
> I face these errors approx 50 times a day.
>  
> Facing several error messages, all related to task_instance table.
>  
> 1.
> [2019-09-15 22:09:14,475] \{__init__.py:305} INFO - Filling up the DagBag 
> from /home/ec2-user/airflow/dags
> Traceback (most recent call last):
>  File "/home/ec2-user/venv/bin/airflow", line 32, in 
>  args.func(args)
>  File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/utils/cli.py", 
> line 74, in wrapper
>  return f(*args, **kwargs)
>  File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/bin/cli.py", 
> line 233, in trigger_dag
>  execution_date=args.exec_date)
>  File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/api/client/local_client.py",
>  line 33, in trigger_dag
>  execution_date=execution_date)
>  File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/api/common/experimental/trigger_dag.py",
>  line 101, in trigger_dag
>  replace_microseconds=replace_microseconds,
>  File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/api/common/experimental/trigger_dag.py",
>  line 77, in _trigger_dag
>  external_trigger=True,
>  File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/utils/db.py", 
> line 73, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/models/__init__.py",
>  line 4095, in create_dagrun
>  run.verify_integrity(session=session)
>  File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/utils/db.py", 
> line 69, in wrapper
>  return func(*args, **kwargs)
>  File 
> "/home/ec2-user/venv/local/lib/python2.7/site-packages/airflow/models/__init__.py",
>  line 4934, in verify_integrity
>  session.commit()
>  File 
> "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/orm/session.py",
>  line 1023, in commit
>  self.transaction.commit()
>  File 
> "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/orm/session.py",
>  line 487, in commit
>  self._prepare_impl()
>  File 
> "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/orm/session.py",
>  line 466, in _prepare_impl
>  self.session.flush()
>  File 
> "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/orm/session.py",
>  line 2446, in flush
>  self._flush(objects)
>  File 
> "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/orm/session.py",
>  line 2584, in _flush
>  transaction.rollback(_capture_exception=True)
>  File 
> "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py",
>  line 67, in __exit__
>  compat.reraise(exc_type, exc_value, exc_tb)
>  File 
> "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/orm/session.py",
>  line 2544, in _flush
>  flush_context.execute()
>  File 
> "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/orm/unitofwork.py",
>  line 416, in execute
>  rec.execute(self)
>  File 
> "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/orm/unitofwork.py",
>  line 583, in execute
>  uow,
>  File 
> "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py",
>  line 245, in save_obj
>  insert,
>  File 
> "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py",
>  line 1063, in _emit_insert_statements
>  c = cached_connections[connection].execute(statement, multiparams)
>  File 
> "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/engine/base.py",
>  line 980, in execute
>  return meth(self, multiparams, params)
>  File 
> "/home/ec2-user/venv/local/lib64/python2.7/site-packages/sqlalchemy/sql/elements.py",
>  line 273, in _execute_on_connection
>  return connection._execute_clauseelement(self, 

[jira] [Resolved] (AIRFLOW-2511) Subdag failed by scheduler deadlock

2020-01-13 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-2511.
---
Fix Version/s: 1.10.8
   Resolution: Duplicate

> Subdag failed by scheduler deadlock
> ---
>
> Key: AIRFLOW-2511
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2511
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: 1.9.0
>Reporter: Yohei Shimomae
>Assignee: lufeng
>Priority: Major
> Fix For: 1.10.8
>
>
> I am using subdag and sometimes main dag marked failed because of the 
> following error. In this case, tasks in the subdag stopped.
> {code:java}
> hourly_dag = DAG(
>   hourly_dag_name,
>   default_args=dag_default_args,
>   params=dag_custom_params,
>   schedule_interval=config_values.hourly_job_interval,
>   max_active_runs=2)
> hourly_subdag = SubDagOperator(
>   task_id='s3_to_hive',
>   subdag=LoadFromS3ToHive(
>   hourly_dag,
>   's3_to_hive'),
>   dag=hourly_dag)
> {code}
> I got this error in main dag. bug in scheduler?
> {code:java}
> [2018-05-22 21:52:19,683] {models.py:1595} ERROR - This Session's transaction 
> has been rolled back due to a previous exception during flush. To begin a new 
> transaction with this Session, first issue Session.rollback(). Original 
> exception was: (_mysql_exceptions.OperationalError) (1213, 'Deadlock found 
> when trying to get lock; try restarting transaction') [SQL: 'UPDATE 
> task_instance SET state=%s WHERE task_instance.task_id = %s AND 
> task_instance.dag_id = %s AND task_instance.execution_date = %s'] 
> [parameters: ('queued', 'transfer_from_tmp_table_into_cleaned_table', 
> 'rfid_warehouse_carton_wh_g_dl_dwh_csv_uqjp_1h.s3_to_hive', 
> datetime.datetime(2018, 5, 7, 5, 2))] (Background on this error at: 
> http://sqlalche.me/e/e3q8)
> Traceback (most recent call last):
> sqlalchemy.exc.InvalidRequestError: This Session's transaction has been 
> rolled back due to a previous exception during flush. To begin a new 
> transaction with this Session, first issue Session.rollback(). Original 
> exception was: (_mysql_exceptions.OperationalError) (1213, 'Deadlock found 
> when trying to get lock; try restarting transaction') [SQL: 'UPDATE 
> task_instance SET state=%s WHERE task_instance.task_id = %s AND 
> task_instance.dag_id = %s AND task_instance.execution_date = %s'] 
> [parameters: ('queued', 'transfer_from_tmp_table_into_cleaned_table', 
> 'rfid_warehouse_carton_wh_g_dl_dwh_csv_uqjp_1h.s3_to_hive', 
> datetime.datetime(2018, 5, 7, 5, 2))] (Background on this error at: 
> http://sqlalche.me/e/e3q8)
> [2018-05-22 21:52:19,687] {models.py:1624} INFO - Marking task as FAILED.
> [2018-05-22 21:52:19,688] {base_task_runner.py:98} INFO - Subtask: 
> [2018-05-22 21:52:19,688] {slack_hook.py:143} INFO - Message is prepared: 
> [2018-05-22 21:52:19,688] {base_task_runner.py:98} INFO - Subtask: 
> {"attachments": [{"color": "danger", "text": "", "fields": [{"title": "DAG", 
> "value": 
> "",
>  "short": true}, {"title": "Owner", "value": "airflow", "short": true}, 
> {"title": "Task", "value": "s3_to_hive", "short": false}, {"title": "Status", 
> "value": "FAILED", "short": false}, {"title": "Execution Time", "value": 
> "2018-05-07T05:02:00", "short": true}, {"title": "Duration", "value": 
> "826.305929", "short": true}, {"value": 
> "  Task Log>", "short": false}]}]}
> [2018-05-22 21:52:19,688] {models.py:1638} ERROR - Failed at executing 
> callback
> [2018-05-22 21:52:19,688] {models.py:1639} ERROR - This Session's transaction 
> has been rolled back due to a previous exception during flush. To begin a new 
> transaction with this Session, first issue Session.rollback(). Original 
> exception was: (_mysql_exceptions.OperationalError) (1213, 'Deadlock found 
> when trying to get lock; try restarting transaction') [SQL: 'UPDATE 
> task_instance SET state=%s WHERE task_instance.task_id = %s AND 
> task_instance.dag_id = %s AND task_instance.execution_date = %s'] 
> [parameters: ('queued', 'transfer_from_tmp_table_into_cleaned_table', 
> 'rfid_warehouse_carton_wh_g_dl_dwh_csv_uqjp_1h.s3_to_hive', 
> datetime.datetime(2018, 5, 7, 5, 2))] (Background on this error at: 
> http://sqlalche.me/e/e3q8)
> {code}
>  





[jira] [Resolved] (AIRFLOW-4498) sql metastore/scheduler deadlock

2020-01-13 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-4498.
---
Fix Version/s: 1.10.8
   Resolution: Duplicate

> sql metastore/scheduler deadlock
> 
>
> Key: AIRFLOW-4498
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4498
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: database
>Affects Versions: 1.10.3
>Reporter: t oo
>Priority: Major
> Fix For: 1.10.8
>
> Attachments: jobs_fixed_deadlock_possibly_1.10.3.py
>
>
> see late comments in AIRFLOW-2511, issue still occurring in 1.10.3 release
> 17/Apr/19 14:44
> We're still seeing deadlocking issues within 1.10.3 with this change applied, 
> it appears that for this specific condition (subdagoperator) it has no effect:
> [2019-04-17 14:01:33,423] {{__init__.py:1580}} ERROR - 
> (_mysql_exceptions.OperationalError) (1213, 'Deadlock found when trying to 
> get lock; try restarting transaction') [SQL: 'UPDATE task_instance SET 
> state=%s, queued_dttm=%s WHERE task_instance.task_id = %s AND 
> task_instance.dag_id = %s AND task_instance.execution_date = %s'] 
> [parameters: ('queued', datetime.datetime(2019, 4, 17, 14, 1, 18, 249580, 
> tzinfo=), 'subdag_task_instance', 'subdag_dag.move_data', 
> datetime.datetime(2019, 4, 17, 11, 35, 56, 625793, tzinfo=))] 
> (Background on this error at: http://sqlalche.me/e/e3q8)
> Traceback (most recent call last):
>   File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 
> 1236, in _execute_context
> cursor, statement, parameters, context
>   File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/default.py", 
> line 536, in do_execute
> cursor.execute(statement, parameters)
>   File "/usr/lib64/python3.6/site-packages/MySQLdb/cursors.py", line 255, in 
> execute
> self.errorhandler(self, exc, value)
>   File "/usr/lib64/python3.6/site-packages/MySQLdb/connections.py", line 50, 
> in defaulterrorhandler
> raise errorvalue
>   File "/usr/lib64/python3.6/site-packages/MySQLdb/cursors.py", line 252, in 
> execute
> res = self._query(query)
>   File "/usr/lib64/python3.6/site-packages/MySQLdb/cursors.py", line 378, in 
> _query
> db.query(q)
>   File "/usr/lib64/python3.6/site-packages/MySQLdb/connections.py", line 280, 
> in query
> _mysql.connection.query(self, query)
> _mysql_exceptions.OperationalError: (1213, 'Deadlock found when trying to get 
> lock; try restarting transaction')
> The above exception was the direct cause of the following exception:
> Traceback (most recent call last):
>   File "/usr/lib/python3.6/site-packages/airflow/models/__init__.py", line 
> 1441, in _run_raw_task
> result = task_copy.execute(context=context)
>   File 
> "/usr/lib/python3.6/site-packages/airflow/operators/subdag_operator.py", line 
> 102, in execute
> executor=self.executor)
>   File "/usr/lib/python3.6/site-packages/airflow/models/__init__.py", line 
> 4030, in run
> job.run()
>   File "/usr/lib/python3.6/site-packages/airflow/jobs.py", line 209, in run
> self._execute()
>   File "/usr/lib/python3.6/site-packages/airflow/utils/db.py", line 73, in 
> wrapper
> return func(*args, **kwargs)
>   File "/usr/lib/python3.6/site-packages/airflow/jobs.py", line 2475, in 
> _execute
> session=session)
>   File "/usr/lib/python3.6/site-packages/airflow/utils/db.py", line 69, in 
> wrapper
> return func(*args, **kwargs)
>   File "/usr/lib/python3.6/site-packages/airflow/jobs.py", line 2421, in 
> _execute_for_run_dates
> session=session)
>   File "/usr/lib/python3.6/site-packages/airflow/utils/db.py", line 69, in 
> wrapper
> return func(*args, **kwargs)
>   File "/usr/lib/python3.6/site-packages/airflow/jobs.py", line 2310, in 
> _process_backfill_task_instances
> _per_task_process(task, key, ti)
>   File "/usr/lib/python3.6/site-packages/airflow/utils/db.py", line 73, in 
> wrapper
> return func(*args, **kwargs)
>   File "/usr/lib/python3.6/site-packages/airflow/jobs.py", line 2273, in 
> _per_task_process
> session.commit()
>   File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/session.py", line 
> 1023, in commit
> self.transaction.commit()
>   File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/session.py", line 
> 487, in commit
> self._prepare_impl()
>   File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/session.py", line 
> 466, in _prepare_impl
> self.session.flush()
>   File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/session.py", line 
> 2446, in flush
> self._flush(objects)
>   File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/session.py", line 
> 2584, in _flush
> transaction.rollback(_capture_exception=True)
>   File "/usr/lib64/python3.6/site-packages/sqlalchemy/util/langhelpers.py", 

[jira] [Resolved] (AIRFLOW-2516) Deadlock found when trying to update task_instance table

2020-01-13 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-2516.
---
Fix Version/s: 1.10.8
   Resolution: Fixed

> Deadlock found when trying to update task_instance table
> 
>
> Key: AIRFLOW-2516
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2516
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DagRun
>Affects Versions: 1.8.0, 1.9.0, 1.10.0, 1.10.1, 1.10.2, 1.10.3, 1.10.4, 
> 1.10.5, 1.10.6, 1.10.7
>Reporter: Jeff Liu
>Assignee: Jarek Potiuk
>Priority: Major
> Fix For: 1.10.8
>
> Attachments: Screenshot 2019-12-30 at 10.42.52.png, 
> image-2019-12-30-10-48-41-313.png, image-2019-12-30-10-58-02-610.png, 
> jobs.py, jobs_fixed_deadlock_possibly_1.9.py, 
> jobs_fixed_deadlock_possibly_1.9.py, 
> scheduler_job_fixed_deadlock_possibly_1.10.6.py
>
>
>  
>  
> {code:java}
> [2018-05-23 17:59:57,218] {base_task_runner.py:98} INFO - Subtask: 
> [2018-05-23 17:59:57,217] {base_executor.py:49} INFO - Adding to queue: 
> airflow run production_wipeout_wipe_manager.Carat Carat_20180227 
> 2018-05-23T17:41:18.815809 --local -sd DAGS_FOLDER/wipeout/wipeout.py
> [2018-05-23 17:59:57,231] {base_task_runner.py:98} INFO - Subtask: Traceback 
> (most recent call last):
> [2018-05-23 17:59:57,232] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/bin/airflow", line 27, in <module>
> [2018-05-23 17:59:57,232] {base_task_runner.py:98} INFO - Subtask: 
> args.func(args)
> [2018-05-23 17:59:57,232] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/airflow/bin/cli.py", line 392, in run
> [2018-05-23 17:59:57,232] {base_task_runner.py:98} INFO - Subtask: 
> pool=args.pool,
> [2018-05-23 17:59:57,233] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/airflow/utils/db.py", line 50, in 
> wrapper
> [2018-05-23 17:59:57,233] {base_task_runner.py:98} INFO - Subtask: result = 
> func(*args, **kwargs)
> [2018-05-23 17:59:57,233] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/airflow/models.py", line 1532, in 
> _run_raw_task
> [2018-05-23 17:59:57,234] {base_task_runner.py:98} INFO - Subtask: 
> self.handle_failure(e, test_mode, context)
> [2018-05-23 17:59:57,234] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/airflow/models.py", line 1641, in 
> handle_failure
> [2018-05-23 17:59:57,234] {base_task_runner.py:98} INFO - Subtask: 
> session.merge(self)
> [2018-05-23 17:59:57,235] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 
> 1920, in merge
> [2018-05-23 17:59:57,235] {base_task_runner.py:98} INFO - Subtask: 
> _resolve_conflict_map=_resolve_conflict_map)
> [2018-05-23 17:59:57,235] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 
> 1974, in _merge
> [2018-05-23 17:59:57,236] {base_task_runner.py:98} INFO - Subtask: merged = 
> self.query(mapper.class_).get(key[1])
> [2018-05-23 17:59:57,236] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 882, 
> in get
> [2018-05-23 17:59:57,236] {base_task_runner.py:98} INFO - Subtask: ident, 
> loading.load_on_pk_identity)
> [2018-05-23 17:59:57,236] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 952, 
> in _get_impl
> [2018-05-23 17:59:57,237] {base_task_runner.py:98} INFO - Subtask: return 
> db_load_fn(self, primary_key_identity)
> [2018-05-23 17:59:57,237] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/loading.py", line 247, 
> in load_on_pk_identity
> [2018-05-23 17:59:57,237] {base_task_runner.py:98} INFO - Subtask: return 
> q.one()
> [2018-05-23 17:59:57,238] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2884, 
> in one
> [2018-05-23 17:59:57,238] {base_task_runner.py:98} INFO - Subtask: ret = 
> self.one_or_none()
> [2018-05-23 17:59:57,238] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2854, 
> in one_or_none
> [2018-05-23 17:59:57,238] {base_task_runner.py:98} INFO - Subtask: ret = 
> list(self)
> [2018-05-23 17:59:57,239] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2925, 
> in __iter__
> [2018-05-23 17:59:57,239] {base_task_runner.py:98} INFO - Subtask: return 
> self._execute_and_instances(context)
> [2018-05-23 17:59:57,239] 

[jira] [Created] (AIRFLOW-6545) Validate all commit messages in PR for AIRFLOW-XXX

2020-01-12 Thread Jarek Potiuk (Jira)
Jarek Potiuk created AIRFLOW-6545:
-

 Summary: Validate all commit messages in PR for AIRFLOW-XXX
 Key: AIRFLOW-6545
 URL: https://issues.apache.org/jira/browse/AIRFLOW-6545
 Project: Apache Airflow
  Issue Type: Improvement
  Components: ci
Affects Versions: 2.0.0
Reporter: Jarek Potiuk








[jira] [Commented] (AIRFLOW-2516) Deadlock found when trying to update task_instance table

2020-01-12 Thread Jarek Potiuk (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013886#comment-17013886
 ] 

Jarek Potiuk commented on AIRFLOW-2516:
---

Hey [~0x4ec7] - I guess no news is good news?

> Deadlock found when trying to update task_instance table
> 
>
> Key: AIRFLOW-2516
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2516
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: DagRun
>Affects Versions: 1.8.0, 1.9.0, 1.10.0, 1.10.1, 1.10.2, 1.10.3, 1.10.4, 
> 1.10.5, 1.10.6, 1.10.7
>Reporter: Jeff Liu
>Assignee: Jarek Potiuk
>Priority: Major
> Attachments: Screenshot 2019-12-30 at 10.42.52.png, 
> image-2019-12-30-10-48-41-313.png, image-2019-12-30-10-58-02-610.png, 
> jobs.py, jobs_fixed_deadlock_possibly_1.9.py, 
> jobs_fixed_deadlock_possibly_1.9.py, 
> scheduler_job_fixed_deadlock_possibly_1.10.6.py
>
>
>  
>  
> {code:java}
> [2018-05-23 17:59:57,218] {base_task_runner.py:98} INFO - Subtask: 
> [2018-05-23 17:59:57,217] {base_executor.py:49} INFO - Adding to queue: 
> airflow run production_wipeout_wipe_manager.Carat Carat_20180227 
> 2018-05-23T17:41:18.815809 --local -sd DAGS_FOLDER/wipeout/wipeout.py
> [2018-05-23 17:59:57,231] {base_task_runner.py:98} INFO - Subtask: Traceback 
> (most recent call last):
> [2018-05-23 17:59:57,232] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/bin/airflow", line 27, in <module>
> [2018-05-23 17:59:57,232] {base_task_runner.py:98} INFO - Subtask: 
> args.func(args)
> [2018-05-23 17:59:57,232] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/airflow/bin/cli.py", line 392, in run
> [2018-05-23 17:59:57,232] {base_task_runner.py:98} INFO - Subtask: 
> pool=args.pool,
> [2018-05-23 17:59:57,233] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/airflow/utils/db.py", line 50, in 
> wrapper
> [2018-05-23 17:59:57,233] {base_task_runner.py:98} INFO - Subtask: result = 
> func(*args, **kwargs)
> [2018-05-23 17:59:57,233] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/airflow/models.py", line 1532, in 
> _run_raw_task
> [2018-05-23 17:59:57,234] {base_task_runner.py:98} INFO - Subtask: 
> self.handle_failure(e, test_mode, context)
> [2018-05-23 17:59:57,234] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/airflow/models.py", line 1641, in 
> handle_failure
> [2018-05-23 17:59:57,234] {base_task_runner.py:98} INFO - Subtask: 
> session.merge(self)
> [2018-05-23 17:59:57,235] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 
> 1920, in merge
> [2018-05-23 17:59:57,235] {base_task_runner.py:98} INFO - Subtask: 
> _resolve_conflict_map=_resolve_conflict_map)
> [2018-05-23 17:59:57,235] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 
> 1974, in _merge
> [2018-05-23 17:59:57,236] {base_task_runner.py:98} INFO - Subtask: merged = 
> self.query(mapper.class_).get(key[1])
> [2018-05-23 17:59:57,236] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 882, 
> in get
> [2018-05-23 17:59:57,236] {base_task_runner.py:98} INFO - Subtask: ident, 
> loading.load_on_pk_identity)
> [2018-05-23 17:59:57,236] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 952, 
> in _get_impl
> [2018-05-23 17:59:57,237] {base_task_runner.py:98} INFO - Subtask: return 
> db_load_fn(self, primary_key_identity)
> [2018-05-23 17:59:57,237] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/loading.py", line 247, 
> in load_on_pk_identity
> [2018-05-23 17:59:57,237] {base_task_runner.py:98} INFO - Subtask: return 
> q.one()
> [2018-05-23 17:59:57,238] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2884, 
> in one
> [2018-05-23 17:59:57,238] {base_task_runner.py:98} INFO - Subtask: ret = 
> self.one_or_none()
> [2018-05-23 17:59:57,238] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2854, 
> in one_or_none
> [2018-05-23 17:59:57,238] {base_task_runner.py:98} INFO - Subtask: ret = 
> list(self)
> [2018-05-23 17:59:57,239] {base_task_runner.py:98} INFO - Subtask: File 
> "/usr/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2925, 
> in __iter__
> [2018-05-23 17:59:57,239] {base_task_runner.py:98} INFO - Subtask: return 
> self._execute_and_instances(context)
> [2018-05-23 

[jira] [Resolved] (AIRFLOW-6540) [DockerOperator] Some logs are lost

2020-01-12 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6540.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> [DockerOperator] Some logs are lost
> ---
>
> Key: AIRFLOW-6540
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6540
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 1.10.6
>Reporter: Emmanuel NALEPA
>Assignee: Emmanuel NALEPA
>Priority: Minor
> Fix For: 2.0.0
>
>
> +_*Observed:*_+
> Using DockerOperator, if the Docker container emits logs very early after 
> container creation, then these logs are lost.
> +_*Expected:*_+
> Using DockerOperator, no logs should be lost at all.





[jira] [Resolved] (AIRFLOW-6536) Make "job_id" parameter of the DatabricksRunNowOperator optional

2020-01-11 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6536.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Make "job_id" parameter of the DatabricksRunNowOperator optional
> 
>
> Key: AIRFLOW-6536
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6536
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Affects Versions: 1.10.7
>Reporter: Mustafa Gök
>Assignee: Mustafa Gök
>Priority: Minor
> Fix For: 2.0.0
>
>
> The "job_id" parameter should be optional because it can be passed in the 
> json (dict) parameter.
> Line 317 (example in docstring, but it raises an error):
> {code:python}
> notebook_run = DatabricksRunNowOperator(task_id='notebook_run', json=json)
> {code}
> Line 458:
> {code:python}
> if job_id is not None:
>     self.json['job_id'] = job_id
> {code}
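The behaviour requested above can be sketched as follows. This is an illustration only, not the code of DatabricksRunNowOperator: the class name `RunNowRequest` is hypothetical, and the final check that raises ValueError when no job id is supplied at all is an assumption added for the example.

```python
import copy


class RunNowRequest:
    """Hypothetical stand-in showing an optional job_id merged into json."""

    def __init__(self, job_id=None, json=None):
        # Copy so the caller's dict is never mutated.
        self.json = copy.deepcopy(json) if json else {}
        # Same guard as quoted from line 458: only override when given.
        if job_id is not None:
            self.json['job_id'] = job_id
        # Assumption for this sketch: a job id must come from somewhere.
        if 'job_id' not in self.json:
            raise ValueError("job_id must be passed directly or inside json")


# job_id supplied only through the json dict, as in the docstring example
request = RunNowRequest(json={'job_id': 42, 'notebook_params': {'dry-run': 'true'}})
print(request.json['job_id'])
```

An explicit `job_id` argument takes precedence over the value in `json`, which matches the guard quoted from line 458.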





[jira] [Resolved] (AIRFLOW-6537) Fix backticks in rst file

2020-01-11 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6537.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Fix backticks in rst file
> -
>
> Key: AIRFLOW-6537
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6537
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 1.10.7
>Reporter: Kamil Bregula
>Priority: Major
> Fix For: 2.0.0
>
>






[jira] [Resolved] (AIRFLOW-5704) Docker scripts for kind kubernetes tests can be improved

2020-01-11 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-5704.
---
Resolution: Fixed

> Docker scripts for kind kubernetes tests can be improved
> 
>
> Key: AIRFLOW-5704
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5704
> Project: Apache Airflow
>  Issue Type: Sub-task
>  Components: breeze, ci, dependencies
>Affects Versions: 2.0.0
>Reporter: Jarek Potiuk
>Assignee: Jarek Potiuk
>Priority: Major
> Fix For: 2.0.0
>
>
> The docker CI image for kind tests can be improved
>  
>  * Kubernetes Version and all the installation of docker + kubectl + kind can 
> be added back
>  * Running kubernetes scripts should be possible from within breeze without 
> special "kubernetes" environment
>  * --env breeze switch should be removed
>  * "bare" environment should be replaced by --no-deps switch
>  * ENV variable should disappear
>  
>  





[jira] [Resolved] (AIRFLOW-6528) disable W503 flake8 check (line break before binary operator)

2020-01-10 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6528.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> disable W503 flake8 check (line break before binary operator)
> -
>
> Key: AIRFLOW-6528
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6528
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: pre-commit
>Affects Versions: 1.10.7
>Reporter: Daniel Standish
>Priority: Trivial
> Fix For: 2.0.0
>
>
> Flake8's W503 rule says there should be no line break before a binary operator.
> This rule is incompatible with the black formatter and is also, in my opinion, 
> bad style.
> Status quo example with W503 check enabled:
> {code}
> @property
> def sqlalchemy_scheme(self):
>     """
>     Database provided in init if exists; otherwise, ``schema`` from 
>     ``Connection`` object.
>     """
>     return (
>         self._sqlalchemy_scheme or
>         self.connection_extra_lower.get('sqlalchemy_scheme') or
>         self.DEFAULT_SQLALCHEMY_SCHEME
>     )
> {code}
> As required by black (W503 disabled):
> {code}
> @property
> def sqlalchemy_scheme(self):
>     """
>     Database provided in init if exists; otherwise, ``schema`` from 
>     ``Connection`` object.
>     """
>     return (
>         self._sqlalchemy_scheme
>         or self.connection_extra_lower.get('sqlalchemy_scheme')
>         or self.DEFAULT_SQLALCHEMY_SCHEME
>     )
> {code}





[jira] [Resolved] (AIRFLOW-6326) Sort cli commands and arg

2020-01-09 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6326.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> Sort cli commands and arg
> -
>
> Key: AIRFLOW-6326
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6326
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: cli
>Affects Versions: 1.10.6
>Reporter: zhongjiajie
>Assignee: zhongjiajie
>Priority: Major
> Fix For: 2.0.0
>
>
> Sort cli commands and arg





[jira] [Resolved] (AIRFLOW-6451) self._print_stat() in dag_processing.py should be skippable by config option

2020-01-09 Thread Jarek Potiuk (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Potiuk resolved AIRFLOW-6451.
---
Fix Version/s: 2.0.0
   Resolution: Fixed

> self._print_stat() in dag_processing.py should be skippable by config option
> 
>
> Key: AIRFLOW-6451
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6451
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 1.10.7
>Reporter: t oo
>Assignee: t oo
>Priority: Minor
> Fix For: 2.0.0
>
>
> perf benefit
> clean up extra poll, logs, typos
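One way to make a periodic stats printout skippable by a config value is sketched below. It is a hypothetical illustration, not the change made in dag_processing.py: the class `StatsPrinter`, the `print_stats_interval` setting, and the convention that a value of zero or less disables printing are all assumptions of this example.

```python
import time


class StatsPrinter:
    """Hypothetical helper that gates a periodic stats printout."""

    def __init__(self, print_stats_interval, clock=time.monotonic):
        # Assumption: an interval <= 0 means "never print stats".
        self.print_stats_interval = print_stats_interval
        self._clock = clock
        self._last_print = clock()
        self.printed = 0

    def maybe_print(self):
        """Print stats only when enabled and the interval has elapsed."""
        if self.print_stats_interval <= 0:
            return False  # skipped entirely, so no formatting cost is paid
        now = self._clock()
        if now - self._last_print < self.print_stats_interval:
            return False
        self._last_print = now
        self.printed += 1  # a real implementation would log the stats here
        return True
```

Checking the flag before doing any work is what provides the performance benefit mentioned in the issue: the stats are never collected or formatted when the option is off.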




