o-nikolas commented on code in PR #60164:
URL: https://github.com/apache/airflow/pull/60164#discussion_r2677983630
##########
providers/amazon/src/airflow/providers/amazon/aws/hooks/s3.py:
##########
@@ -1755,6 +1779,24 @@ def _sync_to_local_dir_if_changed(self, s3_bucket, s3_object, local_target_path:
download_msg = (
f"S3 object size ({s3_object.size}) and local file size ({local_stats.st_size}) differ."
)
+ else:
+ s3_etag = s3_object.e_tag
+ if s3_etag and "-" not in s3_etag:
+ local_md5 = self._compute_local_file_md5(local_target_path)
+ if local_md5 is None:
+ should_download, download_msg = self._check_needs_download_by_timestamp(
+ s3_object, local_stats.st_mtime
+ )
+ elif local_md5 != s3_etag:
Review Comment:
Is the S3 ETag an MD5 in _all_ cases? I didn't think that was the case. If
it isn't always an MD5, then `local_md5 != s3_etag` would be true on every
sync and you could get stuck re-downloading in an infinite loop here, no?
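
To make the concern concrete, here is a minimal sketch of a more defensive comparison. It is not the PR's code: `local_md5` and `etag_matches_local_file` are hypothetical helper names, and it assumes the ETag arrives as a string that may still be wrapped in double quotes, as S3 returns it. It only answers "match" or "mismatch" when the ETag at least has the shape of an MD5 digest, and reports "inconclusive" otherwise so the caller can fall back to the timestamp check.

```python
import hashlib
import re
from pathlib import Path
from typing import Optional

# Only a 32-character hex string can possibly be a plain MD5 digest.
# Multipart ETags carry a "-<part count>" suffix and will not match this.
_MD5_HEX = re.compile(r"[0-9a-f]{32}")


def local_md5(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash the file in chunks so large files are not read into memory at once."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def etag_matches_local_file(etag: Optional[str], path: Path) -> Optional[bool]:
    """Hypothetical helper: True/False if the ETag is MD5-shaped, None if inconclusive."""
    if not etag:
        return None
    # Strip the surrounding double quotes before comparing.
    normalized = etag.strip('"').lower()
    if not _MD5_HEX.fullmatch(normalized):
        return None
    return normalized == local_md5(path)
```

Even an MD5-shaped ETag is not guaranteed to be the MD5 of the content (objects encrypted with SSE-KMS or SSE-C are one documented exception), so a `False` result here may still need a tie-breaker such as size plus last-modified time to avoid downloading the same object on every run.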