uranusjr commented on code in PR #32322:
URL: https://github.com/apache/airflow/pull/32322#discussion_r1250187244
##########
tests/providers/amazon/aws/transfers/test_gcs_to_s3.py:
##########
@@ -128,7 +128,65 @@ def test_execute_without_replace(self, mock_hook):
assert [] == uploaded_files
assert sorted(MOCK_FILES) == sorted(hook.list_keys("bucket", delimiter="/"))
- # Test3: There are no files in destination bucket
+ # Test3: All the files (within some folders) are already in origin and destination without replace
Review Comment:
Let’s remove the TestX prefixes. They provide no value and are messy to
maintain, as this PR demonstrates.
##########
tests/providers/amazon/aws/transfers/test_gcs_to_s3.py:
##########
@@ -128,7 +128,65 @@ def test_execute_without_replace(self, mock_hook):
assert [] == uploaded_files
assert sorted(MOCK_FILES) == sorted(hook.list_keys("bucket", delimiter="/"))
- # Test3: There are no files in destination bucket
+ # Test3: All the files (within some folders) are already in origin and destination without replace
+ @mock.patch("airflow.providers.amazon.aws.transfers.gcs_to_s3.GCSHook")
+ def test_execute_without_replace_with_folder_structure(self, mock_hook):
+ mock_files_gcs = [f"test{idx}/{mock_file}" for idx, mock_file in enumerate(MOCK_FILES)]
+ mock_files_s3 = [f"test/test{idx}/{mock_file}" for idx, mock_file in enumerate(MOCK_FILES)]
+ mock_hook.return_value.list.return_value = mock_files_gcs
+
+ hook, bucket = _create_test_bucket()
+ for mock_file_s3 in mock_files_s3:
+ bucket.put_object(Key=mock_file_s3, Body=b"testing")
+
+ with NamedTemporaryFile() as f:
+ gcs_provide_file = mock_hook.return_value.provide_file
+ gcs_provide_file.return_value.__enter__.return_value.name = f.name
+
+ with pytest.deprecated_call():
Review Comment:
What is this masking? Can this be more specific than ignoring all
deprecations?
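
For instance, a narrower check could look like the sketch below. It assumes the
deprecation being masked comes from one specific argument; `make_operator` and
the warning text are hypothetical stand-ins, not the operator's real API or
message.

```python
import warnings

import pytest


def make_operator(delimiter=None):
    """Hypothetical stand-in for building an operator with a deprecated argument."""
    if delimiter is not None:
        # Assumed message text, used only for illustration.
        warnings.warn(
            "Usage of 'delimiter' is deprecated, please use 'match_glob' instead",
            DeprecationWarning,
            stacklevel=2,
        )
    return {"delimiter": delimiter}


def test_delimiter_deprecation_is_reported():
    # match= is a regular expression searched against the warning message,
    # so an unrelated DeprecationWarning does not satisfy this check.
    with pytest.warns(DeprecationWarning, match="'delimiter' is deprecated"):
        operator = make_operator(delimiter="/")
    assert operator["delimiter"] == "/"
```

Unlike a bare `pytest.deprecated_call()`, this only passes when the expected
warning is raised, so a new, unrelated deprecation would not be silently
absorbed.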
##########
airflow/providers/amazon/aws/transfers/gcs_to_s3.py:
##########
@@ -161,6 +161,9 @@ def execute(self, context: Context) -> list[str]:
# and only keep those files which are present in
# Google Cloud Storage and not in S3
bucket_name, prefix = S3Hook.parse_s3_url(self.dest_s3_key)
+ # only if prefix is not empty
Review Comment:
This comment is redundant: `if prefix` already makes it obvious that the
empty-prefix case is being filtered out, so the comment only repeats the code.
If a comment is desired, it should instead explain _why_ the empty prefix case
needs to be filtered out.
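
As a sketch of what a "why" comment might look like, assuming the guarded code
appends a trailing "/" to the prefix (the PR's actual reasoning is not shown in
this hunk, and `normalize_prefix` and `keys_under` are hypothetical helpers):

```python
def normalize_prefix(prefix: str) -> str:
    """Hypothetical helper: make a non-empty key prefix end with "/"."""
    # An empty prefix means "list the whole bucket". Appending "/" to it would
    # turn it into the prefix "/", which only matches keys that literally
    # start with a slash, so the lookup of existing keys would come back empty.
    if prefix:
        prefix = prefix if prefix.endswith("/") else f"{prefix}/"
    return prefix


def keys_under(keys: list[str], prefix: str) -> list[str]:
    """Mimic prefix-based key listing over an in-memory list of keys."""
    normalized = normalize_prefix(prefix)
    return [key for key in keys if key.startswith(normalized)]
```

With keys `["a.csv", "test/b.csv", "testing/c.csv"]`, an empty prefix still
returns every key, while `"test"` returns only `"test/b.csv"`.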
##########
tests/providers/amazon/aws/transfers/test_gcs_to_s3.py:
##########
@@ -128,7 +128,65 @@ def test_execute_without_replace(self, mock_hook):
assert [] == uploaded_files
assert sorted(MOCK_FILES) == sorted(hook.list_keys("bucket", delimiter="/"))
- # Test3: There are no files in destination bucket
+ # Test3: All the files (within some folders) are already in origin and destination without replace
+ @mock.patch("airflow.providers.amazon.aws.transfers.gcs_to_s3.GCSHook")
+ def test_execute_without_replace_with_folder_structure(self, mock_hook):
+ mock_files_gcs = [f"test{idx}/{mock_file}" for idx, mock_file in enumerate(MOCK_FILES)]
+ mock_files_s3 = [f"test/test{idx}/{mock_file}" for idx, mock_file in enumerate(MOCK_FILES)]
+ mock_hook.return_value.list.return_value = mock_files_gcs
+
+ hook, bucket = _create_test_bucket()
+ for mock_file_s3 in mock_files_s3:
+ bucket.put_object(Key=mock_file_s3, Body=b"testing")
+
+ with NamedTemporaryFile() as f:
+ gcs_provide_file = mock_hook.return_value.provide_file
+ gcs_provide_file.return_value.__enter__.return_value.name = f.name
+
+ with pytest.deprecated_call():
+ operator = GCSToS3Operator(
+ task_id=TASK_ID,
+ bucket=GCS_BUCKET,
+ # prefix value for gcs bucket does not matter
+ prefix=PREFIX,
+ delimiter=DELIMITER,
+ dest_aws_conn_id="aws_default",
+ # endswith "/"
+ dest_s3_key=f"{S3_BUCKET}/test/",
+ replace=False,
+ )
+
+ # we expect nothing to be uploaded
+ # and all the MOCK_FILES to be present at the S3 bucket
+ uploaded_files = operator.execute(None)
+
+ assert [] == uploaded_files
+ assert sorted(mock_files_s3) == sorted(hook.list_keys("bucket", prefix="test/"))
+
+ with NamedTemporaryFile() as f:
Review Comment:
The repeated blocks should be collapsed with `pytest.mark.parametrize`.
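
A rough sketch of the shape this could take, assuming the repeated blocks
differ only in one input, such as whether the destination key ends with "/".
`join_key` is a hypothetical stand-in for the code under test, not part of the
operator.

```python
import pytest


def join_key(dest_prefix: str, name: str) -> str:
    """Hypothetical stand-in: build a destination key under a prefix."""
    return f"{dest_prefix.rstrip('/')}/{name}"


@pytest.mark.parametrize(
    "dest_prefix",
    [
        pytest.param("test/", id="with-trailing-slash"),
        pytest.param("test", id="without-trailing-slash"),
    ],
)
def test_join_key(dest_prefix):
    # One test body runs once per parameter instead of being copied per case.
    assert join_key(dest_prefix, "file.csv") == "test/file.csv"
```

Each case is then reported under its own id, so a failure points directly at
the input that broke.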