[ https://issues.apache.org/jira/browse/GOBBLIN-2206?focusedWorklogId=968347&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-968347 ]
ASF GitHub Bot logged work on GOBBLIN-2206: ------------------------------------------- Author: ASF GitHub Bot Created on: 02/May/25 09:35 Start Date: 02/May/25 09:35 Worklog Time Spent: 10m Work Description: khandelwal-prateek commented on code in PR #4115: URL: https://github.com/apache/gobblin/pull/4115#discussion_r2071172287 ########## gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/ManifestBasedDataset.java: ########## @@ -143,12 +146,28 @@ public Iterator<FileSet<CopyEntity>> getFileSetIterator(FileSystem targetFs, Cop copyableFile.setFsDatasets(srcFs, targetFs); copyEntities.add(copyableFile); + // In case of directory with 000 permission, the permission is changed to 100 due to HadoopUtils::addExecutePermissionToOwner + // getting called from CopyDataPublisher::preserveFileAttrInPublisher -> FileAwareInputStreamDataWriter::setPathPermission -> + // FileAwareInputStreamDataWriter::setOwnerExecuteBitIfDirectory -> HadoopUtils::addExecutePermissionToOwner + // We need to revert this extra permission change in setPermissionStep + if (srcFile.isDirectory() && !srcFile.getPermission().getUserAction().implies(FsAction.EXECUTE) + && !ancestorOwnerAndPermissionsForSetPermissionStep.containsKey(PathUtils.getPathWithoutSchemeAndAuthority(fileToCopy).toString()) + && !targetFs.exists(fileToCopy)) { + List<OwnerAndPermission> ancestorsOwnerAndPermission = new ArrayList<>(copyableFile.getAncestorsOwnerAndPermission()); + OwnerAndPermission srcFileOwnerPermission = new OwnerAndPermission(srcFile.getOwner(), srcFile.getGroup(), srcFile.getPermission()); + ancestorsOwnerAndPermission.add(0, srcFileOwnerPermission); + ancestorOwnerAndPermissionsForSetPermissionStep.put(fileToCopy.toString(), ancestorsOwnerAndPermission); + } + // Always grab the parent since the above permission setting should be setting the permission for a folder itself // {@link CopyDataPublisher#preserveFileAttrInPublisher} will be setting the permission for the empty child dir Path fromPath = fileToCopy.getParent(); // Avoid duplicate calculation for the same ancestor if (fromPath != null && !ancestorOwnerAndPermissions.containsKey(PathUtils.getPathWithoutSchemeAndAuthority(fromPath).toString()) && !targetFs.exists(fromPath)) { ancestorOwnerAndPermissions.put(fromPath.toString(), copyableFile.getAncestorsOwnerAndPermission()); + if (!ancestorOwnerAndPermissionsForSetPermissionStep.containsKey(PathUtils.getPathWithoutSchemeAndAuthority(fromPath).toString())) { + ancestorOwnerAndPermissionsForSetPermissionStep.put(fromPath.toString(), copyableFile.getAncestorsOwnerAndPermission()); Review Comment: for `containsKey` we are checking `PathUtils.getPathWithoutSchemeAndAuthority(fromPath).toString()`, but in `put` `fromPath.toString()` is used directly. There can be mismatch due to this, it would be better to use `PathUtils.getPathWithoutSchemeAndAuthority(fromPath).toString()` in both places Issue Time Tracking ------------------- Worklog Id: (was: 968347) Time Spent: 0.5h (was: 20m) > Fix extra execute bit getting set in ManifestDistcp > --------------------------------------------------- > > Key: GOBBLIN-2206 > URL: https://issues.apache.org/jira/browse/GOBBLIN-2206 > Project: Apache Gobblin > Issue Type: Improvement > Reporter: Vivek Rai > Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > In Manifest Distcp extra execute bit is set to owner permission even if owner > permission doesn't have execute bit set -- This message was sent by Atlassian Jira (v8.20.10#820010)