vincbeck commented on code in PR #30501:
URL: https://github.com/apache/airflow/pull/30501#discussion_r1162925499
##########
airflow/providers/amazon/aws/transfers/dynamodb_to_s3.py:
##########
@@ -118,11 +123,51 @@ def __init__(
self.dynamodb_scan_kwargs = dynamodb_scan_kwargs
self.s3_bucket_name = s3_bucket_name
self.s3_key_prefix = s3_key_prefix
+ self.export_time = export_time
+ self.export_format = export_format
def execute(self, context: Context) -> None:
hook = DynamoDBHook(aws_conn_id=self.source_aws_conn_id)
- table = hook.get_conn().Table(self.dynamodb_table_name)
+ if self.export_time:
+ self._export_table_to_point_in_time(hook=hook)
+ else:
+ self._export_entire_data(hook=hook)
+
+ def _export_table_to_point_in_time(self, hook: DynamoDBHook):
+ """
+ Export data from start of epoc till `export_time`. Table export will be a snapshot of the table’s
+ state at this point in time.
+ """
+ terminal_status = ['COMPLETED', 'FAILED']
+ sleep_time = 30 # unit: seconds
+ client = hook.conn.meta.client
+ while True:
+ response = client.export_table_to_point_in_time(
+ TableArn=self.dynamodb_table_name,
+ ExportTime=self.export_time,
+ ClientToken='string',
Review Comment:
My understanding from the
[doc](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dynamodb/client/export_table_to_point_in_time.html)
is that if you make multiple calls with the same `ClientToken`, the result will
be the same (apart from some corner cases; see the doc for details). If you
don't provide it, you might get a different result when the table content has
changed, which I think is what we want.
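
Something along these lines might work, as a rough sketch only. The helper name `build_export_kwargs` and its arguments are made up for illustration and are not part of this PR; the keyword names (`TableArn`, `S3Bucket`, `S3Prefix`, `ExportTime`, `ExportFormat`, `ClientToken`) are the ones listed in the boto3 doc linked above. The idea is to leave `ClientToken` out unless the caller explicitly asks for an idempotent call:

```python
from __future__ import annotations

from datetime import datetime
from typing import Any


def build_export_kwargs(
    table_arn: str,
    s3_bucket: str,
    export_time: datetime | None = None,
    export_format: str = "DYNAMODB_JSON",
    s3_prefix: str | None = None,
    client_token: str | None = None,
) -> dict[str, Any]:
    """Hypothetical helper: assemble kwargs for export_table_to_point_in_time.

    ClientToken is only included when the caller passes one, so a hard-coded
    placeholder never ties separate exports together.
    """
    kwargs: dict[str, Any] = {
        "TableArn": table_arn,
        "S3Bucket": s3_bucket,
        "ExportFormat": export_format,
    }
    if export_time is not None:
        kwargs["ExportTime"] = export_time
    if s3_prefix is not None:
        kwargs["S3Prefix"] = s3_prefix
    if client_token is not None:
        # Opt-in only: repeated calls with the same token are treated as the same request.
        kwargs["ClientToken"] = client_token
    return kwargs
```

The operator could then call `client.export_table_to_point_in_time(**kwargs)` with the resulting dict, assuming `client` is the boto3 DynamoDB client already obtained from the hook.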