ArafatKhan2198 opened a new pull request, #8797:
URL: https://github.com/apache/ozone/pull/8797

   ## What changes were proposed in this pull request?
   This pull request introduces a materialized fields optimization for disk 
usage calculation in Recon by adding two new fields, totalSize and totalFiles, 
to the NSSummary proto. These fields store precomputed aggregate sizes and file 
counts for directories and buckets, which are incrementally updated during file 
system write operations. This eliminates the need for expensive recursive 
RocksDB reads during disk usage queries, drastically improving read performance.
   
   The change modifies the write path logic to update these totals along the 
parent directory chain atomically during file creation, deletion, and renaming 
events. The read path now performs a simple single-key lookup to get total size 
or file count instead of recursively traversing millions of directories.
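
   As a rough illustration of the read-path difference described above, the
following Java sketch contrasts the two approaches. It is not Recon's actual
code: the `DiskUsageReadSketch` class, its `Entry` type, the field names other
than `totalSize`, and the in-memory `HashMap` standing in for the
RocksDB-backed NSSummary table are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for the NSSummary table; not Recon's real classes. */
class DiskUsageReadSketch {

  /** Simplified summary entry for one directory or bucket. */
  static final class Entry {
    final long sizeOfDirectFiles;                       // bytes of files directly in this directory
    final List<Long> childDirIds = new ArrayList<>();   // IDs of immediate subdirectories
    long totalSize;                                     // materialized total for the whole subtree

    Entry(long sizeOfDirectFiles) {
      this.sizeOfDirectFiles = sizeOfDirectFiles;
    }
  }

  final Map<Long, Entry> table = new HashMap<>();
  int lookups; // counts simulated RocksDB Get calls

  private Entry get(long objectId) {
    lookups++;
    return table.get(objectId);
  }

  /** Old read path: one lookup per directory in the subtree. */
  long recursiveTotalSize(long objectId) {
    Entry e = get(objectId);
    if (e == null) {
      return 0L;
    }
    long total = e.sizeOfDirectFiles;
    for (long child : e.childDirIds) {
      total += recursiveTotalSize(child);
    }
    return total;
  }

  /** New read path: a single lookup returns the precomputed total. */
  long materializedTotalSize(long objectId) {
    Entry e = get(objectId);
    return e == null ? 0L : e.totalSize;
  }
}
```

   In this sketch the recursive method performs one lookup per directory in
the subtree, while the materialized method always performs exactly one,
provided the write path has kept `totalSize` up to date.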
   
   ### Why were these changes proposed?
   
   Recon’s existing method for disk usage calculation recursively traverses 
every directory in a bucket’s tree, issuing a RocksDB Get for each node. For 
large buckets with millions of directories, this results in millions of disk 
reads and latency of up to 54 seconds for a single query, severely impacting 
user experience and system responsiveness.
   
   By materializing totals in each NSSummary entry and updating them 
incrementally on writes, the read path becomes an O(1) operation with respect 
to the number of directories: a single lookup returns the total without any 
recursive traversal. Although this adds overhead to write operations, since 
each write must also update every ancestor's entry, it reduces query latency 
by several orders of magnitude, making the tradeoff worthwhile.
   
   ### Approach and Implementation Details:
   
   - Added `totalSize` and `totalFiles` fields to the NSSummary protobuf schema.
   - On each file-related write operation (create, delete, rename), calculate 
the size/file count delta.
   - Traverse upwards from the file’s parent directory through all ancestors 
using `parentId` links.
   - Update the `totalSize` and `totalFiles` fields atomically for all 
ancestors within a single RocksDB WriteBatch.
   - Adjust the `getTotalSize(objectId)` method to return the `totalSize` 
field directly, avoiding recursive calls.
   - Unit tests simulate different file and directory deletion scenarios to 
ensure correctness and consistency of incremental updates.
   - Performance tests demonstrate an `O(1)` disk usage query time at the cost 
of approximately 10.9% slower write performance.
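
   The write-path steps above can be sketched roughly as follows. This is a
minimal illustration under stated assumptions, not the code in this pull
request: `NSSummaryDeltaSketch`, `Summary`, `NO_PARENT`, and the method names
are hypothetical, a `HashMap` stands in for the RocksDB table, and a local
staging map stands in for the WriteBatch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of delta propagation along parentId links. */
class NSSummaryDeltaSketch {
  static final long NO_PARENT = -1L; // assumed sentinel for "no parent"

  /** Immutable simplified summary holding only the fields relevant here. */
  static final class Summary {
    final long parentId;
    final long totalSize;
    final long totalFiles;

    Summary(long parentId, long totalSize, long totalFiles) {
      this.parentId = parentId;
      this.totalSize = totalSize;
      this.totalFiles = totalFiles;
    }
  }

  final Map<Long, Summary> table = new HashMap<>();

  void addDirectory(long objectId, long parentId) {
    table.put(objectId, new Summary(parentId, 0L, 0L));
  }

  /** Stage updated totals for dirId and every ancestor into the batch. */
  private void stage(Map<Long, Summary> batch, long dirId, long sizeDelta, long fileDelta) {
    long current = dirId;
    while (current != NO_PARENT) {
      // Prefer a value already staged in this batch so deltas accumulate.
      Summary s = batch.getOrDefault(current, table.get(current));
      if (s == null) {
        break; // unknown ancestor: stop walking
      }
      batch.put(current,
          new Summary(s.parentId, s.totalSize + sizeDelta, s.totalFiles + fileDelta));
      current = s.parentId;
    }
  }

  /** File create (positive deltas) or delete (negative deltas) under parentDirId. */
  void applyDelta(long parentDirId, long sizeDelta, long fileDelta) {
    Map<Long, Summary> batch = new HashMap<>();
    stage(batch, parentDirId, sizeDelta, fileDelta);
    table.putAll(batch); // stand-in for committing the batch in one step
  }

  /** Rename across directories: subtract from the old chain, add to the new one. */
  void renameFile(long oldParentId, long newParentId, long fileSize) {
    Map<Long, Summary> batch = new HashMap<>();
    stage(batch, oldParentId, -fileSize, -1L);
    stage(batch, newParentId, fileSize, 1L);
    table.putAll(batch);
  }
}
```

   Staging both halves of a rename in the same batch means that ancestors
shared by the old and new locations receive a net delta of zero, and no reader
observes the file counted twice or not at all. The cost of each write grows
with the depth of the affected directory.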
   
   ## What is the link to the Apache JIRA?
   https://issues.apache.org/jira/browse/HDDS-13432
   ## How was this patch tested?
   
   Manually verified the changes; unit tests will be added soon.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

