khandelwal-prateek commented on code in PR #4115: URL: https://github.com/apache/gobblin/pull/4115#discussion_r2071172287
########## gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/ManifestBasedDataset.java: ########## @@ -143,12 +146,28 @@ public Iterator<FileSet<CopyEntity>> getFileSetIterator(FileSystem targetFs, Cop copyableFile.setFsDatasets(srcFs, targetFs); copyEntities.add(copyableFile); + // In case of directory with 000 permission, the permission is changed to 100 due to HadoopUtils::addExecutePermissionToOwner + // getting called from CopyDataPublisher::preserveFileAttrInPublisher -> FileAwareInputStreamDataWriter::setPathPermission -> + // FileAwareInputStreamDataWriter::setOwnerExecuteBitIfDirectory -> HadoopUtils::addExecutePermissionToOwner + // We need to revert this extra permission change in setPermissionStep + if (srcFile.isDirectory() && !srcFile.getPermission().getUserAction().implies(FsAction.EXECUTE) + && !ancestorOwnerAndPermissionsForSetPermissionStep.containsKey(PathUtils.getPathWithoutSchemeAndAuthority(fileToCopy).toString()) + && !targetFs.exists(fileToCopy)) { + List<OwnerAndPermission> ancestorsOwnerAndPermission = new ArrayList<>(copyableFile.getAncestorsOwnerAndPermission()); + OwnerAndPermission srcFileOwnerPermission = new OwnerAndPermission(srcFile.getOwner(), srcFile.getGroup(), srcFile.getPermission()); + ancestorsOwnerAndPermission.add(0, srcFileOwnerPermission); + ancestorOwnerAndPermissionsForSetPermissionStep.put(fileToCopy.toString(), ancestorsOwnerAndPermission); + } + // Always grab the parent since the above permission setting should be setting the permission for a folder itself // {@link CopyDataPublisher#preserveFileAttrInPublisher} will be setting the permission for the empty child dir Path fromPath = fileToCopy.getParent(); // Avoid duplicate calculation for the same ancestor if (fromPath != null && !ancestorOwnerAndPermissions.containsKey(PathUtils.getPathWithoutSchemeAndAuthority(fromPath).toString()) && !targetFs.exists(fromPath)) { ancestorOwnerAndPermissions.put(fromPath.toString(), copyableFile.getAncestorsOwnerAndPermission()); + if (!ancestorOwnerAndPermissionsForSetPermissionStep.containsKey(PathUtils.getPathWithoutSchemeAndAuthority(fromPath).toString())) { + ancestorOwnerAndPermissionsForSetPermissionStep.put(fromPath.toString(), copyableFile.getAncestorsOwnerAndPermission()); Review Comment: for `containsKey` we are checking `PathUtils.getPathWithoutSchemeAndAuthority(fromPath).toString()`, but in `put` `fromPath.toString()` is used directly. There can be mismatch due to this, it would be better to use `PathUtils.getPathWithoutSchemeAndAuthority(fromPath).toString()` in both places -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@gobblin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org