[jira] [Created] (AIRFLOW-5640) BaseOperator email parameter is wrongly typed and not documented

2019-10-11 Thread Cedrik Neumann (Jira)
Cedrik Neumann created AIRFLOW-5640:
---

 Summary: BaseOperator email parameter is wrongly typed and not 
documented
 Key: AIRFLOW-5640
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5640
 Project: Apache Airflow
  Issue Type: Bug
  Components: operators
Affects Versions: 1.10.5
Reporter: Cedrik Neumann


The {{email}} parameter is not documented in BaseOperator and, furthermore, its 
type annotation {{str}} is wrong: 
[here|https://github.com/apache/airflow/blob/master/airflow/models/baseoperator.py#L273].

The method {{get_email_address_list}} clearly accepts lists of strings as well 
as comma- and semicolon-delimited strings: 
[here|https://github.com/apache/airflow/blob/88989200a66291580088188f06a6db503ac823e2/airflow/utils/email.py#L123]
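
A corrected annotation might look like the following sketch (the exact union 
is an assumption based on what {{get_email_address_list}} accepts):

{code:python}
# Sketch of a corrected annotation: accept a single (possibly comma- or
# semicolon-delimited) string, or an iterable of strings.
from typing import Iterable, Optional, Union

email: Optional[Union[str, Iterable[str]]] = None
{code}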



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (AIRFLOW-5463) Variable set is not atomic

2019-09-12 Thread Cedrik Neumann (Jira)
Cedrik Neumann created AIRFLOW-5463:
---

 Summary: Variable set is not atomic
 Key: AIRFLOW-5463
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5463
 Project: Apache Airflow
  Issue Type: Bug
  Components: core, models
Affects Versions: 1.10.5
Reporter: Cedrik Neumann


The function {{Variable.set}} deletes the variable first

[https://github.com/apache/airflow/blob/1.10.5/airflow/models/variable.py#L137]

but it doesn't pass the DB session along as an argument, so the delete and the 
subsequent add don't run in a single atomic transaction.
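
A fix could run both statements in the same session. A minimal sketch, 
assuming Airflow's {{provide_session}} decorator and the existing model fields:

{code:python}
# Method sketch, meant to live on the Variable model.
import json

from airflow.utils.db import provide_session

@classmethod
@provide_session
def set(cls, key, value, serialize_json=False, session=None):
    stored_value = json.dumps(value) if serialize_json else str(value)
    # delete and add use the same session, hence run in one transaction
    session.query(cls).filter(cls.key == key).delete()
    session.add(cls(key=key, val=stored_value))
    session.flush()
{code}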



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (AIRFLOW-5179) Top level __init__.py breaks imports

2019-08-12 Thread Cedrik Neumann (JIRA)
Cedrik Neumann created AIRFLOW-5179:
---

 Summary: Top level __init__.py breaks imports
 Key: AIRFLOW-5179
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5179
 Project: Apache Airflow
  Issue Type: Bug
  Components: build
Affects Versions: 2.0.0
Reporter: Cedrik Neumann


The recent commit 
[3724c2aaf4cfee4a60f6c7231777bfb256090c7c|https://github.com/apache/airflow/commit/3724c2aaf4cfee4a60f6c7231777bfb256090c7c]
 to master introduced a {{__init__.py}} file in the project root folder, which 
breaks all imports in local development ({{pip install -e .}}) because it 
turns the project root itself into a package.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (AIRFLOW-1784) SKIPPED status is being cascaded wrongly

2019-08-10 Thread Cedrik Neumann (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cedrik Neumann reassigned AIRFLOW-1784:
---

Assignee: Cedrik Neumann

> SKIPPED status is being cascaded wrongly
> -
>
> Key: AIRFLOW-1784
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1784
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 1.8.2
> Environment: Ubuntu 16.04.3 LTS 
> Python 2.7.12 
> CeleryExecutor: 2-nodes cluster
>Reporter: Dmytro Kulyk
>Assignee: Cedrik Neumann
>Priority: Critical
>  Labels: documentation, latestonly, operators
> Attachments: Capture_graph.JPG, Capture_tree2.JPG, cube_update.py
>
>
> After the implementation of AIRFLOW-1296 in 1.8.2 there is a wrong behavior 
> of LatestOnlyOperator, which forces the SKIPPED status to cascade despite 
> TriggerRule='all_done' being set.
> This is the opposite of the behavior documented 
> [here|https://airflow.incubator.apache.org/concepts.html#latest-run-only].
> *Expected Behavior:*
> The dummy task and all its downstream tasks (update_*) should not be skipped.
> Full listings are attached.
> 1.8.1 did not have this issue.
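
A minimal sketch of the reported setup (DAG and task names are hypothetical, 
based on the description above):

{code:python}
# Sketch: a LatestOnlyOperator upstream of a dummy task with
# trigger_rule=all_done; on catch-up runs the dummy should still execute.
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.latest_only_operator import LatestOnlyOperator
from airflow.utils.trigger_rule import TriggerRule

dag = DAG('example', start_date=datetime(2017, 1, 1),
          schedule_interval='@daily')

latest_only = LatestOnlyOperator(task_id='latest_only', dag=dag)
dummy = DummyOperator(task_id='dummy',
                      trigger_rule=TriggerRule.ALL_DONE,
                      dag=dag)
latest_only >> dummy
{code}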



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (AIRFLOW-2923) LatestOnlyOperator cascade skip through all_done and dummy

2019-08-10 Thread Cedrik Neumann (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cedrik Neumann reassigned AIRFLOW-2923:
---

Assignee: Cedrik Neumann

> LatestOnlyOperator cascade skip through all_done and dummy
> --
>
> Key: AIRFLOW-2923
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2923
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 1.9.0
> Environment: CeleryExecutor, 2-nodes cluster, RMQ, PostgreSQL
>Reporter: Dmytro Kulyk
>Assignee: Cedrik Neumann
>Priority: Major
>  Labels: cascade, latestonly, skip
> Attachments: cube_update.py, screenshot-1.png
>
>
> DAG with a consolidating point (calc_ready: dummy).
> As per [https://airflow.apache.org/concepts.html#latest-run-only] the given 
> task should run even when catching up on DagRuns for previous periods.
>  However, the skip from LatestOnlyOperator cascades through calc_ready even 
> though it is a dummy and trigger_rule=all_done is set.
>  The same behavior occurs with trigger_rule=all_success.
> {code}
> t_ready = DummyOperator(
>     task_id='calc_ready',
>     trigger_rule=TriggerRule.ALL_DONE,
>     dag=dag)
> {code}
> !screenshot-1.png!



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Comment Edited] (AIRFLOW-5046) Allow GoogleCloudStorageToBigQueryOperator to accept source_objects as a string or otherwise take input from XCom

2019-08-03 Thread Cedrik Neumann (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899492#comment-16899492
 ] 

Cedrik Neumann edited comment on AIRFLOW-5046 at 8/3/19 5:15 PM:
-

It should reflect the xcom_pull interface as closely as possible:
{code}
def xcom_pull(
self,
task_ids=None,
dag_id=None,
key=XCOM_RETURN_KEY,
include_prior_dates=False):
{code}


was (Author: m1racoli):
It should reflect the xcom_pull interface as closely as possible:

{code:python}
def xcom_pull(
self,
task_ids=None,
dag_id=None,
key=XCOM_RETURN_KEY,
include_prior_dates=False):
{code}


> Allow GoogleCloudStorageToBigQueryOperator to accept source_objects as a 
> string or otherwise take input from XCom
> -
>
> Key: AIRFLOW-5046
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5046
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, gcp
>Affects Versions: 1.10.2
>Reporter: Joel Croteau
>Priority: Minor
>
> `GoogleCloudStorageToBigQueryOperator` should be able to have its 
> `source_objects` dynamically determined by the results of a previous 
> workflow. This is hard to do with it expecting a list, as any template 
> expansion will render as a string. This could be implemented either as a 
> check for whether `source_objects` is a string, and trying to parse it as a 
> list if it is, or a separate argument for a string encoded as a list.
> My particular use case for this is as follows:
>  # A daily DAG scans a GCS bucket for all objects created in the last day and 
> loads them into BigQuery.
>  # To find these objects, a `PythonOperator` scans the bucket and returns a 
> list of object names.
>  # A `GoogleCloudStorageToBigQueryOperator` is used to load these objects 
> into BigQuery.
> The operator should be able to have its list of objects provided by XCom, but 
> there is no functionality to do this, and trying to do a template expansion 
> along the lines of `source_objects='{{ task_instance.xcom_pull(key="KEY") 
> }}'` doesn't work because this is rendered as a string, which 
> `GoogleCloudStorageToBigQueryOperator` will try to treat as a list, with each 
> character being a single item.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (AIRFLOW-5046) Allow GoogleCloudStorageToBigQueryOperator to accept source_objects as a string or otherwise take input from XCom

2019-08-03 Thread Cedrik Neumann (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899492#comment-16899492
 ] 

Cedrik Neumann commented on AIRFLOW-5046:
-

It should reflect the xcom_pull interface as closely as possible:

{code:python}
def xcom_pull(
self,
task_ids=None,
dag_id=None,
key=XCOM_RETURN_KEY,
include_prior_dates=False):
{code}


> Allow GoogleCloudStorageToBigQueryOperator to accept source_objects as a 
> string or otherwise take input from XCom
> -
>
> Key: AIRFLOW-5046
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5046
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, gcp
>Affects Versions: 1.10.2
>Reporter: Joel Croteau
>Priority: Minor
>
> `GoogleCloudStorageToBigQueryOperator` should be able to have its 
> `source_objects` dynamically determined by the results of a previous 
> workflow. This is hard to do with it expecting a list, as any template 
> expansion will render as a string. This could be implemented either as a 
> check for whether `source_objects` is a string, and trying to parse it as a 
> list if it is, or a separate argument for a string encoded as a list.
> My particular use case for this is as follows:
>  # A daily DAG scans a GCS bucket for all objects created in the last day and 
> loads them into BigQuery.
>  # To find these objects, a `PythonOperator` scans the bucket and returns a 
> list of object names.
>  # A `GoogleCloudStorageToBigQueryOperator` is used to load these objects 
> into BigQuery.
> The operator should be able to have its list of objects provided by XCom, but 
> there is no functionality to do this, and trying to do a template expansion 
> along the lines of `source_objects='{{ task_instance.xcom_pull(key="KEY") 
> }}'` doesn't work because this is rendered as a string, which 
> `GoogleCloudStorageToBigQueryOperator` will try to treat as a list, with each 
> character being a single item.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Comment Edited] (AIRFLOW-5046) Allow GoogleCloudStorageToBigQueryOperator to accept source_objects as a string or otherwise take input from XCom

2019-08-03 Thread Cedrik Neumann (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899491#comment-16899491
 ] 

Cedrik Neumann edited comment on AIRFLOW-5046 at 8/3/19 5:13 PM:
-

Yeah, maybe it's a second mechanism after Jinja, operating on the resulting 
string. I was thinking of something like this: 
If a string after Jinja rendering defines an xcom address like 
{noformat}
xcom://2019-07-30?key=return_value&task_ids=mytask
{noformat}
then the string will be replaced with the value/values of the xcom address.
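
A hypothetical resolver for such an address could look like this (none of 
these names exist in Airflow; this only illustrates the idea):

{code:python}
from urllib.parse import parse_qs, urlparse

def resolve_xcom_address(rendered, context):
    # Resolve strings like xcom://2019-07-30?key=return_value&task_ids=mytask
    # after Jinja rendering; anything else is returned untouched.
    parsed = urlparse(rendered)
    if parsed.scheme != 'xcom':
        return rendered
    params = {k: v[0] for k, v in parse_qs(parsed.query).items()}
    # parsed.netloc carries the execution date in the example above
    return context['ti'].xcom_pull(
        task_ids=params.get('task_ids'),
        key=params.get('key', 'return_value'))
{code}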


was (Author: m1racoli):
Yeah, maybe it's a second mechanism after Jinja, operating on the resulting 
string. I was thinking of something like this: 
If a string after Jinja rendering defines an xcom address like 
{noformat}
xcom://2019-07-30?key=return_value&task_id=mytask
{noformat}
then the string will be replaced with the value/values of the xcom address.

> Allow GoogleCloudStorageToBigQueryOperator to accept source_objects as a 
> string or otherwise take input from XCom
> -
>
> Key: AIRFLOW-5046
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5046
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, gcp
>Affects Versions: 1.10.2
>Reporter: Joel Croteau
>Priority: Minor
>
> `GoogleCloudStorageToBigQueryOperator` should be able to have its 
> `source_objects` dynamically determined by the results of a previous 
> workflow. This is hard to do with it expecting a list, as any template 
> expansion will render as a string. This could be implemented either as a 
> check for whether `source_objects` is a string, and trying to parse it as a 
> list if it is, or a separate argument for a string encoded as a list.
> My particular use case for this is as follows:
>  # A daily DAG scans a GCS bucket for all objects created in the last day and 
> loads them into BigQuery.
>  # To find these objects, a `PythonOperator` scans the bucket and returns a 
> list of object names.
>  # A `GoogleCloudStorageToBigQueryOperator` is used to load these objects 
> into BigQuery.
> The operator should be able to have its list of objects provided by XCom, but 
> there is no functionality to do this, and trying to do a template expansion 
> along the lines of `source_objects='{{ task_instance.xcom_pull(key="KEY") 
> }}'` doesn't work because this is rendered as a string, which 
> `GoogleCloudStorageToBigQueryOperator` will try to treat as a list, with each 
> character being a single item.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (AIRFLOW-5046) Allow GoogleCloudStorageToBigQueryOperator to accept source_objects as a string or otherwise take input from XCom

2019-08-03 Thread Cedrik Neumann (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899491#comment-16899491
 ] 

Cedrik Neumann commented on AIRFLOW-5046:
-

Yeah, maybe it's a second mechanism after Jinja, operating on the resulting 
string. I was thinking of something like this: 
If a string after Jinja rendering defines an xcom address like 
{noformat}
xcom://2019-07-30?key=return_value&task_id=mytask
{noformat}
then the string will be replaced with the value/values of the xcom address.

> Allow GoogleCloudStorageToBigQueryOperator to accept source_objects as a 
> string or otherwise take input from XCom
> -
>
> Key: AIRFLOW-5046
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5046
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, gcp
>Affects Versions: 1.10.2
>Reporter: Joel Croteau
>Priority: Minor
>
> `GoogleCloudStorageToBigQueryOperator` should be able to have its 
> `source_objects` dynamically determined by the results of a previous 
> workflow. This is hard to do with it expecting a list, as any template 
> expansion will render as a string. This could be implemented either as a 
> check for whether `source_objects` is a string, and trying to parse it as a 
> list if it is, or a separate argument for a string encoded as a list.
> My particular use case for this is as follows:
>  # A daily DAG scans a GCS bucket for all objects created in the last day and 
> loads them into BigQuery.
>  # To find these objects, a `PythonOperator` scans the bucket and returns a 
> list of object names.
>  # A `GoogleCloudStorageToBigQueryOperator` is used to load these objects 
> into BigQuery.
> The operator should be able to have its list of objects provided by XCom, but 
> there is no functionality to do this, and trying to do a template expansion 
> along the lines of `source_objects='{{ task_instance.xcom_pull(key="KEY") 
> }}'` doesn't work because this is rendered as a string, which 
> `GoogleCloudStorageToBigQueryOperator` will try to treat as a list, with each 
> character being a single item.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Comment Edited] (AIRFLOW-5046) Allow GoogleCloudStorageToBigQueryOperator to accept source_objects as a string or otherwise take input from XCom

2019-07-28 Thread Cedrik Neumann (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894652#comment-16894652
 ] 

Cedrik Neumann edited comment on AIRFLOW-5046 at 7/28/19 8:35 AM:
--

Yeah, this problem - lists from XCom not being usable in arguments that accept 
lists - arises in various use cases. It's worth debating:
 # whether operators that accept list parameters should allow comma-delimited 
lists in strings (see the sketch below),
 # whether operators should have a separate argument for string-encoded lists,
 # whether operators should recognise xcom keys in arguments (i.e. prefixed 
with "xcom:" => "xcom:KEY").

The first might suit most use cases, but it has its limitations, as it doesn't 
apply to all operators (SQL ones, for example). The second might blow up 
operator interfaces and is probably the least generic solution. The third 
could be implemented as an Airflow-wide feature, which would enable this 
functionality for all operators, potentially limited to templated fields.
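
For the first option, a minimal helper could look like this (a hypothetical 
sketch, not an existing Airflow function):

{code:python}
# Hypothetical helper: accept either a list or a comma-delimited string
# for a list-typed operator argument.
def coerce_to_list(value):
    if isinstance(value, str):
        return [item.strip() for item in value.split(',') if item.strip()]
    return list(value)

coerce_to_list('a.csv, b.csv')  # -> ['a.csv', 'b.csv']
coerce_to_list(['a.csv'])       # -> ['a.csv']
{code}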


was (Author: m1racoli):
Yeah, this problem - lists from xcom not usable in arguments accepting lists - 
arises in various use cases. It's worth debating,
 # if some operators which accept parameters as lists should allow comma 
delimited list in strings.
 # if operators should have a separate argument for string encoded lists
 # if operators should recognise xcom keys in arguments (i.e. prefixed with 
"xcom:KEY")

First might suit most use cases, but has its limitations as it doesn't apply to 
all operators (SQL ones for example). Second might blow up operator interfaces 
and is probably the least generic solution. Third could be implemented as an 
airflow wide feature, which would enable this functionality to all operators, 
potentially limited to templated fields.

> Allow GoogleCloudStorageToBigQueryOperator to accept source_objects as a 
> string or otherwise take input from XCom
> -
>
> Key: AIRFLOW-5046
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5046
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, gcp
>Affects Versions: 1.10.2
>Reporter: Joel Croteau
>Priority: Minor
>
> `GoogleCloudStorageToBigQueryOperator` should be able to have its 
> `source_objects` dynamically determined by the results of a previous 
> workflow. This is hard to do with it expecting a list, as any template 
> expansion will render as a string. This could be implemented either as a 
> check for whether `source_objects` is a string, and trying to parse it as a 
> list if it is, or a separate argument for a string encoded as a list.
> My particular use case for this is as follows:
>  # A daily DAG scans a GCS bucket for all objects created in the last day and 
> loads them into BigQuery.
>  # To find these objects, a `PythonOperator` scans the bucket and returns a 
> list of object names.
>  # A `GoogleCloudStorageToBigQueryOperator` is used to load these objects 
> into BigQuery.
> The operator should be able to have its list of objects provided by XCom, but 
> there is no functionality to do this, and trying to do a template expansion 
> along the lines of `source_objects='{{ task_instance.xcom_pull(key="KEY") 
> }}'` doesn't work because this is rendered as a string, which 
> `GoogleCloudStorageToBigQueryOperator` will try to treat as a list, with each 
> character being a single item.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (AIRFLOW-5046) Allow GoogleCloudStorageToBigQueryOperator to accept source_objects as a string or otherwise take input from XCom

2019-07-28 Thread Cedrik Neumann (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894652#comment-16894652
 ] 

Cedrik Neumann commented on AIRFLOW-5046:
-

Yeah, this problem - lists from XCom not being usable in arguments that accept 
lists - arises in various use cases. It's worth debating:
 # whether operators that accept list parameters should allow comma-delimited 
lists in strings,
 # whether operators should have a separate argument for string-encoded lists,
 # whether operators should recognise xcom keys in arguments (i.e. prefixed 
with "xcom:KEY").

The first might suit most use cases, but it has its limitations, as it doesn't 
apply to all operators (SQL ones, for example). The second might blow up 
operator interfaces and is probably the least generic solution. The third 
could be implemented as an Airflow-wide feature, which would enable this 
functionality for all operators, potentially limited to templated fields.

> Allow GoogleCloudStorageToBigQueryOperator to accept source_objects as a 
> string or otherwise take input from XCom
> -
>
> Key: AIRFLOW-5046
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5046
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, gcp
>Affects Versions: 1.10.2
>Reporter: Joel Croteau
>Priority: Minor
>
> `GoogleCloudStorageToBigQueryOperator` should be able to have its 
> `source_objects` dynamically determined by the results of a previous 
> workflow. This is hard to do with it expecting a list, as any template 
> expansion will render as a string. This could be implemented either as a 
> check for whether `source_objects` is a string, and trying to parse it as a 
> list if it is, or a separate argument for a string encoded as a list.
> My particular use case for this is as follows:
>  # A daily DAG scans a GCS bucket for all objects created in the last day and 
> loads them into BigQuery.
>  # To find these objects, a `PythonOperator` scans the bucket and returns a 
> list of object names.
>  # A `GoogleCloudStorageToBigQueryOperator` is used to load these objects 
> into BigQuery.
> The operator should be able to have its list of objects provided by XCom, but 
> there is no functionality to do this, and trying to do a template expansion 
> along the lines of `source_objects='{{ task_instance.xcom_pull(key="KEY") 
> }}'` doesn't work because this is rendered as a string, which 
> `GoogleCloudStorageToBigQueryOperator` will try to treat as a list, with each 
> character being a single item.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Closed] (AIRFLOW-4996) Add multipart upload support to FileToGoogleCloudStorageOperator

2019-07-27 Thread Cedrik Neumann (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cedrik Neumann closed AIRFLOW-4996.
---
Resolution: Duplicate

> Add multipart upload support to FileToGoogleCloudStorageOperator
> 
>
> Key: AIRFLOW-4996
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4996
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, operators
>Affects Versions: 1.10.3
>Reporter: Cedrik Neumann
>Priority: Minor
>
> GoogleCloudStorageHook's upload function has an option for multipart 
> uploads, which is beneficial when uploading large files.
>  
> Extend FileToGoogleCloudStorageOperator to make use of this feature.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (AIRFLOW-4998) Run multiple queries in BigQueryOperator

2019-07-20 Thread Cedrik Neumann (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cedrik Neumann reassigned AIRFLOW-4998:
---

Assignee: (was: Cedrik Neumann)

> Run multiple queries in BigQueryOperator
> 
>
> Key: AIRFLOW-4998
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4998
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, gcp, operators
>Affects Versions: 1.10.3
>Reporter: Cedrik Neumann
>Priority: Minor
>
> Contrary to its documentation, BigQueryOperator doesn't support lists for 
> the {{sql}} argument.
> Add support for running multiple queries provided in a list. This brings it 
> in line with other SQL operators.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (AIRFLOW-4998) Run multiple queries in BigQueryOperator

2019-07-20 Thread Cedrik Neumann (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cedrik Neumann reassigned AIRFLOW-4998:
---

Assignee: Cedrik Neumann

> Run multiple queries in BigQueryOperator
> 
>
> Key: AIRFLOW-4998
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4998
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib, gcp, operators
>Affects Versions: 1.10.3
>Reporter: Cedrik Neumann
>Assignee: Cedrik Neumann
>Priority: Minor
>
> Contrary to its documentation, BigQueryOperator doesn't support lists for 
> the {{sql}} argument.
> Add support for running multiple queries provided in a list. This brings it 
> in line with other SQL operators.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (AIRFLOW-4998) Run multiple queries in BigQueryOperator

2019-07-20 Thread Cedrik Neumann (JIRA)
Cedrik Neumann created AIRFLOW-4998:
---

 Summary: Run multiple queries in BigQueryOperator
 Key: AIRFLOW-4998
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4998
 Project: Apache Airflow
  Issue Type: Improvement
  Components: contrib, gcp, operators
Affects Versions: 1.10.3
Reporter: Cedrik Neumann


Contrary to its documentation, BigQueryOperator doesn't support lists for the 
{{sql}} argument.

Add support for running multiple queries provided in a list. This brings it in 
line with other SQL operators.
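
A minimal sketch of the proposed list handling (the cursor's run_query call is 
assumed to mirror what BigQueryOperator.execute already does):

{code:python}
def run_all(bq_cursor, sql):
    # Accept a single query string or a list of queries (sketch only).
    queries = sql if isinstance(sql, list) else [sql]
    for query in queries:
        bq_cursor.run_query(query)
{code}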



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (AIRFLOW-4996) Add multipart upload support to FileToGoogleCloudStorageOperator

2019-07-20 Thread Cedrik Neumann (JIRA)
Cedrik Neumann created AIRFLOW-4996:
---

 Summary: Add multipart upload support to 
FileToGoogleCloudStorageOperator
 Key: AIRFLOW-4996
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4996
 Project: Apache Airflow
  Issue Type: Improvement
  Components: contrib, operators
Affects Versions: 1.10.3
Reporter: Cedrik Neumann


GoogleCloudStorageHook's upload function has an option for multipart uploads, 
which is beneficial when uploading large files.
 
Extend FileToGoogleCloudStorageOperator to make use of this feature.
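
A minimal sketch of the pass-through (the hook's multipart parameter name is 
an assumption based on the description above):

{code:python}
# Sketch: forward a multipart flag from the operator to the hook's upload.
from airflow.contrib.hooks.gcs_hook import GoogleCloudStorageHook

def upload_file(bucket, dst, src, conn_id='google_cloud_default'):
    hook = GoogleCloudStorageHook(google_cloud_storage_conn_id=conn_id)
    # multipart=True assumed from the hook option described above
    hook.upload(bucket, dst, src, multipart=True)
{code}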



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (AIRFLOW-1479) BashOperator does not open pipe for STDIN

2019-04-19 Thread Cedrik Neumann (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16821858#comment-16821858
 ] 

Cedrik Neumann commented on AIRFLOW-1479:
-

Hello, I'm not working at Wooga anymore. As far as I know, the FreeBSD-specific 
branch of our Airflow version was still using `FNULL`:

[https://github.com/wooga/airflow/blob/1.9-fbsd-master/airflow/operators/bash_operator.py#L85]

Since the team was working on a migration to Kubernetes before my departure in 
December 2017, I don't think this is still relevant, unfortunately.

> BashOperator does not open pipe for STDIN
> -
>
> Key: AIRFLOW-1479
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1479
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 1.8.1
> Environment: FreeBSD, JRuby 9.x, Rake, Python 2.7, Airflow 1.8 with 
> CeleryExecutor
>Reporter: Cedrik Neumann
>Priority: Trivial
>  Labels: easyfix
>
> From JRuby 9 onwards we experienced issues when executing Rake tasks via the 
> BashOperator with the error message:
> {noformat}
> Errno::EBADF: Bad file descriptor - 0
> {noformat}
> We figured out that the issue is due to a missing pipe for STDIN when the 
> BashOperator calls `Popen`.
> The quick fix for the issue would be to add a pipe for STDIN as well:
> {code:python}
> from subprocess import Popen, PIPE, STDOUT
>
> sp = Popen(
>     ['bash', fname],
>     stdout=PIPE, stderr=STDOUT, stdin=PIPE,
>     cwd=tmp_dir, env=self.env)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)