o-nikolas commented on code in PR #59042:
URL: https://github.com/apache/airflow/pull/59042#discussion_r2632979110
##########
providers/amazon/src/airflow/providers/amazon/aws/operators/s3.py:
##########
@@ -366,6 +366,159 @@ def get_openlineage_facets_on_start(self):
)
+class S3CopyPrefixOperator(AwsBaseOperator[S3Hook]):
+ """
+ Creates a copy of all objects under a prefix already stored in S3.
+
+ Note: the S3 connection used here needs to have access to both
+ source and destination bucket/prefix.
+
+ .. seealso::
+ For more information on how to use this operator, take a look at the guide:
+ :ref:`howto/operator:S3CopyPrefixOperator`
+
+ :param source_bucket_prefix: The prefix in the source bucket. (templated)
+ It can be either full s3:// style url or relative path from root level.
+ When it's specified as a full s3:// url, please omit source_bucket_name.
+ :param dest_bucket_prefix: The prefix in the destination to copy to. (templated)
+ The convention to specify `dest_bucket_prefix` is the same as `source_bucket_prefix`.
+ :param source_bucket_name: Name of the S3 bucket where the source objects are in. (templated)
+ It should be omitted when `source_bucket_prefix` is provided as a full s3:// url.
+ :param dest_bucket_name: Name of the S3 bucket to where the objects are copied. (templated)
+ It should be omitted when `dest_bucket_prefix` is provided as a full s3:// url.
+ :param page_size: Number of objects to list per page when paginating through S3 objects.
+ Low values result in more API calls, high values increase memory usage.
+ Between 1 and 1000, setting it to 0 results in no objects copied. Default is 1000.
Review Comment:
The idea of disabling a task is interesting, but it's not something that is
very common in Airflow, so I'm not sure an operator would think of it in the
rare case where it would be useful. Plus, it would also need a deploy to
update the DAG code. I think overall it's worth simplifying: just catch that
case and don't allow it.
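
Something along these lines is what I have in mind. This is only a rough
sketch, not the PR's code: `validate_page_size` is a hypothetical helper name,
the 1 to 1000 bounds are taken from the docstring above, and where the check
lives (e.g. the operator's `__init__` or `execute`) is left open.

```python
def validate_page_size(page_size: int) -> int:
    """Return page_size unchanged if it is an int in [1, 1000], else raise ValueError.

    Hypothetical helper for illustration; bounds follow the docstring under review.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(page_size, bool) or not isinstance(page_size, int):
        raise ValueError(f"page_size must be an integer, got {page_size!r}")
    # Reject 0 (and anything else out of range) instead of silently copying nothing.
    if not 1 <= page_size <= 1000:
        raise ValueError(f"page_size must be between 1 and 1000, got {page_size}")
    return page_size
```

With a check like this, `page_size=0` fails fast with a clear error rather than
producing a task that succeeds without copying any objects.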