ArafatKhan2198 opened a new pull request, #8109: URL: https://github.com/apache/ozone/pull/8109
## What changes were proposed in this pull request?

**Problem:**
In FSO buckets, files with the same name uploaded into different directories were merged into a single key record. Recon’s container key mapping used only the volume, bucket, and file name as the unique identifier, which ignored the directory path.

**Reproducing the Issue:**
Create a nested directory structure and upload two files (`testfile1` and `testfile2`) at each directory depth. For example, run the following commands:
```
ozone fs -mkdir -p ofs://om/volume1/fso-bucket/dir1/dir2/dir3
ozone fs -put -f testfile1 ofs://om/volume1/fso-bucket/dir1/
ozone fs -put -f testfile2 ofs://om/volume1/fso-bucket/dir1/
ozone fs -put -f testfile1 ofs://om/volume1/fso-bucket/dir1/dir2/
ozone fs -put -f testfile2 ofs://om/volume1/fso-bucket/dir1/dir2/
ozone fs -put -f testfile1 ofs://om/volume1/fso-bucket/dir1/dir2/dir3/
ozone fs -put -f testfile2 ofs://om/volume1/fso-bucket/dir1/dir2/dir3/
```
In this scenario, the same two file names (`testfile1` and `testfile2`) exist in three different directory hierarchies (`dir1`, `dir1/dir2`, and `dir1/dir2/dir3`), for six files in total.

**Root Cause:**
The Recon container key mapping computed a unique key based only on the volume, bucket, and file name. For FSO buckets, the directory structure is encoded as part of the raw key prefix (using negative object IDs), but this information was omitted from the computed key. As a result, files with identical names in different directories were incorrectly merged.

**Fix:**
The fix updates the container key mapping logic to use the raw key prefix from the container key table as the unique identifier.
Since the raw key prefix includes the complete directory structure (with the object IDs representing the directories, volume, and bucket), this change ensures that keys with the same file name but in different directories (as in the above scenario) are recognized as distinct records by Recon.

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-12589

## How was this patch tested?

- I manually verified the fix by executing the above commands, which created duplicate files (`testfile1` and `testfile2`) under different directory hierarchies, and confirmed that the container endpoint returned separate records for each file.
- Additionally, I wrote unit tests for both the `ContainerKeyMapperTask` and the container endpoint to simulate duplicate FSO key names under different directories, ensuring that the raw key prefix is correctly used to differentiate the keys.
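
The root cause and fix described above can be illustrated with a small, self-contained sketch. It is not Recon's code: the class `FsoKeyCollisionSketch`, the `FsoFile` holder, the two key-building helpers, the key string layout, and all object ID values are hypothetical and exist only for this illustration. It models the six files from the reproduction steps and counts how many distinct identifiers each scheme produces.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class FsoKeyCollisionSketch {

  /** Hypothetical stand-in for one FSO file entry; not a real Ozone class. */
  static final class FsoFile {
    final String volume;
    final String bucket;
    final long volumeId;
    final long bucketId;
    final long parentId;   // object ID of the directory holding the file
    final String fileName;

    FsoFile(String volume, String bucket, long volumeId, long bucketId,
            long parentId, String fileName) {
      this.volume = volume;
      this.bucket = bucket;
      this.volumeId = volumeId;
      this.bucketId = bucketId;
      this.parentId = parentId;
      this.fileName = fileName;
    }
  }

  /** Old behaviour (simplified): the directory is not part of the identifier. */
  static String nameOnlyKey(FsoFile f) {
    return "/" + f.volume + "/" + f.bucket + "/" + f.fileName;
  }

  /** Fixed behaviour (simplified): the prefix carries the parent directory's ID. */
  static String prefixKey(FsoFile f) {
    return "/" + f.volumeId + "/" + f.bucketId + "/" + f.parentId + "/" + f.fileName;
  }

  /** The six files from the reproduction steps, with made-up object IDs. */
  static List<FsoFile> sampleFiles() {
    long[] dirIds = {-3L, -4L, -5L}; // assumed IDs for dir1, dir2, dir3
    List<FsoFile> files = new ArrayList<>();
    for (long dirId : dirIds) {
      files.add(new FsoFile("volume1", "fso-bucket", -1L, -2L, dirId, "testfile1"));
      files.add(new FsoFile("volume1", "fso-bucket", -1L, -2L, dirId, "testfile2"));
    }
    return files;
  }

  /** Counts the distinct identifiers produced for the sample files. */
  static int distinctRecords(boolean usePrefix) {
    Set<String> ids = new HashSet<>();
    for (FsoFile f : sampleFiles()) {
      ids.add(usePrefix ? prefixKey(f) : nameOnlyKey(f));
    }
    return ids.size();
  }

  public static void main(String[] args) {
    // Six files collapse to two records when the directory is ignored.
    System.out.println("name-only identifiers: " + distinctRecords(false));
    // All six stay distinct once the parent directory ID is in the key.
    System.out.println("prefix-based identifiers: " + distinctRecords(true));
  }
}
```

With the name-only scheme the six files collapse into two identifiers, which matches the merged records seen before the fix; with the prefix-based scheme all six remain distinct.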
