dxichen commented on code in PR #1684:
URL: https://github.com/apache/samza/pull/1684#discussion_r1302325007

##########
samza-core/src/main/java/org/apache/samza/storage/blobstore/util/DirDiffUtil.java:
##########
@@ -168,11 +171,17 @@ public static BiPredicate<File, FileIndex> areSameFile(boolean compareLargeFileC
           // Don't compare file timestamps. The ctime of a local file just restored will be different than the
           // remote file, and will cause the file to be uploaded again during the first commit after restore.
-          areSameFiles = localFileAttrs.size() == remoteFileMetadata.getSize() &&
-              groupCache.get(String.valueOf(Files.getAttribute(localFile.toPath(), "unix:gid")),
-                () -> localFileAttrs.group().getName()).equals(remoteFileMetadata.getGroup()) &&
-              ownerCache.get(String.valueOf(Files.getAttribute(localFile.toPath(), "unix:uid")),
-                () -> localFileAttrs.owner().getName()).equals(remoteFileMetadata.getOwner());
+          areSameFiles = localFileAttrs.size() == remoteFileMetadata.getSize();

Review Comment:
   On a side note is this sufficient? what if the files are the same size but with different values?


##########
samza-core/src/main/java/org/apache/samza/config/BlobStoreConfig.java:
##########
@@ -47,6 +47,10 @@ public class BlobStoreConfig extends MapConfig {
   public static final String RETRY_POLICY_JITTER_FACTOR = RETRY_POLICY_PREFIX + "jitter.factor";
   // random retry delay between -0.1*retry-delay to 0.1*retry-delay
   public static final double DEFAULT_RETRY_POLICY_JITTER_FACTOR = 0.1;
+  // Set wether to compare file owners after restoring blobs from remote store. Useful when the job is started on a new

Review Comment:
   typo: whether


##########
samza-core/src/main/java/org/apache/samza/config/BlobStoreConfig.java:
##########
@@ -47,6 +47,10 @@ public class BlobStoreConfig extends MapConfig {
   public static final String RETRY_POLICY_JITTER_FACTOR = RETRY_POLICY_PREFIX + "jitter.factor";
   // random retry delay between -0.1*retry-delay to 0.1*retry-delay
   public static final double DEFAULT_RETRY_POLICY_JITTER_FACTOR = 0.1;
+  // Set wether to compare file owners after restoring blobs from remote store. Useful when the job is started on a new
+  // machine with new gid/uid or if gid/uid changes for some reason

Review Comment:
   changes due to host migration


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org