[jira] [Commented] (AIRFLOW-2842) GCS rsync operator

2018-12-23 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16727842#comment-16727842
 ] 

jack commented on AIRFLOW-2842:
---

[~dlamblin] This can be achieved with BashOperator, but you could say that 
about everything.

In any case, having an operator for this can make life easier (you don't need 
to manage separate connection files with credentials, etc.).

> GCS rsync operator
> --
>
> Key: AIRFLOW-2842
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2842
> Project: Apache Airflow
> Issue Type: Improvement
> Reporter: Vikram Oberoi
> Priority: Major
>
> The GoogleCloudStorageToGoogleCloudStorageOperator supports copying objects 
> from one bucket to another using a wildcard.
> As long as you don't delete anything in the source bucket, the destination 
> bucket will end up synchronized on every run.
> However, each object gets copied over even if it exists at the destination, 
> which makes this operation inefficient, time-consuming, and potentially 
> costly.
> I'd love an operator that behaves like `gsutil rsync` for when I need to 
> synchronize two buckets, supporting `gsutil rsync -d` behavior as well.
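
To make the requested behavior concrete, here is a minimal sketch of the 
decision an rsync-like operator would make on each run. It is not the 
`gsutil rsync` algorithm: it assumes each bucket listing is available as a 
plain dict of object name to checksum, and the names `plan_sync` and 
`delete_extra` are hypothetical.

```python
def plan_sync(source, destination, delete_extra=False):
    """Decide which objects to copy and which to delete.

    `source` and `destination` map object names to checksums
    (an assumed, simplified stand-in for real bucket listings).
    """
    # Copy only objects that are missing at the destination or whose
    # checksum differs, instead of copying everything on every run.
    to_copy = sorted(
        name for name, checksum in source.items()
        if destination.get(name) != checksum
    )
    # With delete_extra (analogous to `gsutil rsync -d`), also remove
    # destination objects that no longer exist in the source.
    to_delete = (
        sorted(name for name in destination if name not in source)
        if delete_extra
        else []
    )
    return to_copy, to_delete


if __name__ == "__main__":
    src = {"a.csv": "111", "b.csv": "222", "c.csv": "333"}
    dst = {"a.csv": "111", "b.csv": "stale", "old.csv": "444"}
    print(plan_sync(src, dst, delete_extra=True))
```

Unchanged objects such as `a.csv` are skipped, which is where the saving 
over a wildcard copy would come from.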



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2842) GCS rsync operator

2018-11-06 Thread Daniel Lamblin (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676582#comment-16676582
 ] 

Daniel Lamblin commented on AIRFLOW-2842:
-

Do you think this would not be possible with a simple BashOperator call to 
the `gsutil` utility?
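
For illustration, a rough sketch of what such a call could look like. The 
helper name `build_gsutil_rsync_command` and the bucket names are 
hypothetical, and it assumes `gsutil` is installed and already authenticated 
on the worker that runs the task.

```python
import shlex


def build_gsutil_rsync_command(source_bucket, dest_bucket, delete_extra=False):
    """Build a shell command string that runs `gsutil rsync` between two buckets."""
    # -m runs operations in parallel, -r recurses into "subdirectories".
    args = ["gsutil", "-m", "rsync", "-r"]
    if delete_extra:
        # -d deletes destination objects that are absent from the source.
        args.append("-d")
    args += ["gs://{}".format(source_bucket), "gs://{}".format(dest_bucket)]
    # Quote each argument so unusual bucket names cannot break the shell command.
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    command = build_gsutil_rsync_command("src-bucket", "dst-bucket", delete_extra=True)
    print(command)
    # In a DAG file, the string would be passed to a BashOperator, e.g.:
    #   BashOperator(task_id="gcs_rsync", bash_command=command, dag=dag)
```

This covers the copy itself, but credentials for `gsutil` still have to be 
managed outside Airflow's connection settings.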
