khandelwal-prateek commented on code in PR #4115:
URL: https://github.com/apache/gobblin/pull/4115#discussion_r2071172287


##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/ManifestBasedDataset.java:
##########
@@ -143,12 +146,28 @@ public Iterator<FileSet<CopyEntity>> 
getFileSetIterator(FileSystem targetFs, Cop
             copyableFile.setFsDatasets(srcFs, targetFs);
             copyEntities.add(copyableFile);
 
+            // In case of directory with 000 permission, the permission is 
changed to 100 due to HadoopUtils::addExecutePermissionToOwner
+            // getting called from 
CopyDataPublisher::preserveFileAttrInPublisher -> 
FileAwareInputStreamDataWriter::setPathPermission ->
+            // FileAwareInputStreamDataWriter::setOwnerExecuteBitIfDirectory 
-> HadoopUtils::addExecutePermissionToOwner
+            // We need to revert this extra permission change in 
setPermissionStep
+            if (srcFile.isDirectory() && 
!srcFile.getPermission().getUserAction().implies(FsAction.EXECUTE)
+                && 
!ancestorOwnerAndPermissionsForSetPermissionStep.containsKey(PathUtils.getPathWithoutSchemeAndAuthority(fileToCopy).toString())
+                && !targetFs.exists(fileToCopy)) {
+              List<OwnerAndPermission> ancestorsOwnerAndPermission = new 
ArrayList<>(copyableFile.getAncestorsOwnerAndPermission());
+              OwnerAndPermission srcFileOwnerPermission = new 
OwnerAndPermission(srcFile.getOwner(), srcFile.getGroup(), 
srcFile.getPermission());
+              ancestorsOwnerAndPermission.add(0, srcFileOwnerPermission);
+              
ancestorOwnerAndPermissionsForSetPermissionStep.put(fileToCopy.toString(), 
ancestorsOwnerAndPermission);
+            }
+
             // Always grab the parent since the above permission setting 
should be setting the permission for a folder itself
             // {@link CopyDataPublisher#preserveFileAttrInPublisher} will be 
setting the permission for the empty child dir
             Path fromPath = fileToCopy.getParent();
             // Avoid duplicate calculation for the same ancestor
             if (fromPath != null && 
!ancestorOwnerAndPermissions.containsKey(PathUtils.getPathWithoutSchemeAndAuthority(fromPath).toString())
 && !targetFs.exists(fromPath)) {
               ancestorOwnerAndPermissions.put(fromPath.toString(), 
copyableFile.getAncestorsOwnerAndPermission());
+              if 
(!ancestorOwnerAndPermissionsForSetPermissionStep.containsKey(PathUtils.getPathWithoutSchemeAndAuthority(fromPath).toString()))
 {
+                
ancestorOwnerAndPermissionsForSetPermissionStep.put(fromPath.toString(), 
copyableFile.getAncestorsOwnerAndPermission());

Review Comment:
   for `containsKey` we are checking 
`PathUtils.getPathWithoutSchemeAndAuthority(fromPath).toString()`, but in `put` 
`fromPath.toString()` is used directly. There can be mismatch due to this, it 
would be better to use 
`PathUtils.getPathWithoutSchemeAndAuthority(fromPath).toString()` in both 
places  



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to