ArafatKhan2198 opened a new pull request, #8797: URL: https://github.com/apache/ozone/pull/8797
## What changes were proposed in this pull request?

This pull request introduces a materialized fields optimization for disk usage calculation in Recon by adding two new fields, `totalSize` and `totalFiles`, to the NSSummary proto. These fields store precomputed aggregate sizes and file counts for directories and buckets, which are incrementally updated during file system write operations. This eliminates the need for expensive recursive RocksDB reads during disk usage queries, drastically improving read performance.

The change modifies the write path logic to update these totals along the parent directory chain atomically during file creation, deletion, and renaming events. The read path now performs a simple single-key lookup to get total size or file count instead of recursively traversing millions of directories.

### Why were these changes proposed?

Recon’s existing method for disk usage calculation recursively traverses every directory in a bucket’s tree, issuing a RocksDB Get for each node. For large buckets with millions of directories, this results in millions of disk reads and latency of up to 54 seconds for a single query, severely impacting user experience and system responsiveness.

By materializing totals in each NSSummary entry and updating them incrementally on writes, the read path becomes an O(1) operation, returning total sizes instantly without recursive lookups. Although this adds some overhead to write operations, it reduces the overall query latency by several orders of magnitude, making the tradeoff worthwhile.

### Approach and Implementation Details:

- Added `totalSize` and `totalFiles` fields to the NSSummary protobuf schema.
- On each file-related write operation (create, delete, rename), calculate the size/file count delta.
- Traverse upwards from the file’s parent directory through all ancestors using `parentId` links.
- Update the `totalSize` and `totalFiles` fields atomically for all ancestors within a single RocksDB WriteBatch.
- Adjust the `getTotalSize(objectId)` method to return the `totalSize` field directly, avoiding recursive calls.
- Unit tests simulate different file and directory deletion scenarios to ensure correctness and consistency of incremental updates.
- Performance tests demonstrate an `O(1)` disk usage query time at the cost of approximately 10.9% slower write performance.

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-13432

## How was this patch tested?

Manually verified the changes; unit tests will be added soon.
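
### Illustrative sketch

The write-path and read-path steps listed under "Approach and Implementation Details" can be sketched roughly as follows. This is a minimal in-memory illustration and not the code in this patch: a `HashMap` stands in for the RocksDB-backed NSSummary table, the `Summary` class is a simplified stand-in for NSSummary, and names such as `MaterializedTotalsSketch`, `addDirectory`, `createFile`, `deleteFile`, `moveFile`, and `NO_PARENT` are hypothetical. It also assumes that a `parentId` of 0 marks the top of the ancestor chain and models a rename as a move between two parent directories.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: an in-memory map stands in for the NSSummary table in RocksDB. */
final class MaterializedTotalsSketch {

  /** Assumption for this sketch: parentId 0 means "no parent". */
  static final long NO_PARENT = 0L;

  /** Simplified, hypothetical stand-in for an NSSummary entry. */
  static final class Summary {
    final long parentId;
    long totalSize;   // materialized size of everything below this node
    long totalFiles;  // materialized file count below this node

    Summary(long parentId) {
      this.parentId = parentId;
    }
  }

  private final Map<Long, Summary> table = new HashMap<>();

  void addDirectory(long objectId, long parentId) {
    table.put(objectId, new Summary(parentId));
  }

  void createFile(long parentId, long size) {
    apply(ancestorChain(parentId), size, 1);
  }

  void deleteFile(long parentId, long size) {
    apply(ancestorChain(parentId), -size, -1);
  }

  /** Rename modeled as a move: subtract from the old chain, add to the new one. */
  void moveFile(long oldParentId, long newParentId, long size) {
    // Resolve both chains before changing anything, so a missing entry
    // fails the whole operation (a rough analogue of a single WriteBatch).
    List<Summary> oldChain = ancestorChain(oldParentId);
    List<Summary> newChain = ancestorChain(newParentId);
    apply(oldChain, -size, -1);
    apply(newChain, size, 1);
  }

  /** Walks parentId links from the given directory up to the top of the tree. */
  private List<Summary> ancestorChain(long parentId) {
    List<Summary> chain = new ArrayList<>();
    long current = parentId;
    while (current != NO_PARENT) {
      Summary s = table.get(current);
      if (s == null) {
        throw new IllegalStateException("Missing summary for objectId " + current);
      }
      chain.add(s);
      current = s.parentId;
    }
    return chain;
  }

  private static void apply(List<Summary> chain, long sizeDelta, long fileDelta) {
    for (Summary s : chain) {
      s.totalSize += sizeDelta;
      s.totalFiles += fileDelta;
    }
  }

  /** Read path: one lookup, no recursion over children. */
  long getTotalSize(long objectId) {
    return lookup(objectId).totalSize;
  }

  long getTotalFiles(long objectId) {
    return lookup(objectId).totalFiles;
  }

  private Summary lookup(long objectId) {
    Summary s = table.get(objectId);
    if (s == null) {
      throw new IllegalArgumentException("Unknown objectId " + objectId);
    }
    return s;
  }

  public static void main(String[] args) {
    MaterializedTotalsSketch recon = new MaterializedTotalsSketch();
    recon.addDirectory(1L, NO_PARENT); // bucket
    recon.addDirectory(2L, 1L);        // bucket/dir
    recon.createFile(2L, 100L);
    System.out.println("bucket totalSize = " + recon.getTotalSize(1L));
  }
}
```

The cost of a write in this sketch grows with the depth of the directory chain, while a read touches exactly one entry, which mirrors the tradeoff described above.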
