[PR] HDDS-13003. [POC] Snapshot Defragmentation to reduce storage footprint [ozone]

via GitHub Mon, 18 Aug 2025 21:35:32 -0700


smengcl opened a new pull request, #8954:
URL: https://github.com/apache/ozone/pull/8954


   WARNING: DO NOT MERGE. This PR is for visibility and comments only.
   
   ## What changes were proposed in this pull request?
   
   Implement Snapshot Defrag service and manual trigger CLI.
   
   This working POC contains an extremely crude implementation. Major refactoring 
and optimizations are expected. The point is to prove that Snapshot Defrag can 
save space.
   
   This also includes dev commits whose messages begin with `[dev]`; these 
likely won't end up in the code base. Commits marked `[split]` are the ones that 
can be put in separate PRs.
   
   Design doc: https://github.com/apache/ozone/pull/8514
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-13003
   
   ## How was this patch tested?
   
   - Use integration test `TestSnapshotDefragService2` for faster debugging 
iteration.
   - Use Docker dev to manually test the defrag service (where key generation 
is an order of magnitude faster).
   
   ## Results
   I've tested it with a "1 million keys overwrite" scenario locally in Docker 
dev:
   1. write 1m keys, take snapshot 1
   2. overwrite those 1m keys, take snapshot 2
   3. overwrite those 1m keys again, take snapshot 3
   
   After snapshot defrag processed all 3 snapshot checkpoint DBs, space 
usage went from 169 MB to 92 MB (46% space saved). This is actual disk usage 
(hard links are counted only once).
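
To illustrate what "hard links are counted only once" means, here is a minimal Python sketch (not part of this PR) that sums file sizes under a directory while deduplicating by inode. The function name `disk_usage` is hypothetical, and the sketch sums apparent file sizes (`st_size`), whereas `du` reports allocated blocks, so the absolute numbers can differ slightly from the `du` output.

```python
import os
import stat


def disk_usage(root: str) -> int:
    """Sum sizes of regular files under root, counting each inode once."""
    seen = set()   # (device, inode) pairs already counted
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, etc.
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue  # another hard link to a file already counted
            seen.add(key)
            total += st.st_size
    return total
```

Calling `disk_usage("./checkpointState")` on a directory whose snapshot DBs share SST files through hard links would count each shared file a single time, which is the same accounting the totals above rely on.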
   
   Note: RocksDB WAL is disabled for more accurate disk usage during testing; 
WALs can be huge when writing millions of keys.
   
   Before Defrag, all 3 snapshot DBs:
   ```bash
   ▶ du -h -d1 ./checkpointState
    43M ./checkpointState/om.db-6639d124-6615-4ced-9af6-3dabd680727b
    63M ./checkpointState/om.db-d39279ce-cab6-44e0-839a-2baecb8c283a
    62M ./checkpointState/om.db-77b75627-5534-4db4-88e5-1661aceae92f
   169M ./checkpointState
   ```
   
   After Defrag, all 3 snapshot DBs:
   ```bash
   ▶ du -h -d1 ./checkpointStateDefragged
    83M ./checkpointStateDefragged/om.db-6639d124-6615-4ced-9af6-3dabd680727b
   4.1M ./checkpointStateDefragged/om.db-d39279ce-cab6-44e0-839a-2baecb8c283a
   4.1M ./checkpointStateDefragged/om.db-77b75627-5534-4db4-88e5-1661aceae92f
    92M ./checkpointStateDefragged
   ```
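
As a quick check on the arithmetic, the sketch below recomputes the overall saving from the two `du` totals and the per-DB change from the listings. The helper name `space_saved_pct` is hypothetical; the sizes are copied from the output above, in MB as reported by `du -h`.

```python
def space_saved_pct(before_mb: float, after_mb: float) -> float:
    """Percentage of space saved going from before_mb to after_mb."""
    return 100.0 * (before_mb - after_mb) / before_mb


# Directory totals reported by du before and after defrag.
overall = space_saved_pct(169, 92)

# Per-DB sizes (MB), keyed by the first UUID segment of each om.db directory.
before = {"6639d124": 43, "d39279ce": 63, "77b75627": 62}
after = {"6639d124": 83, "d39279ce": 4.1, "77b75627": 4.1}

# Negative values mean the DB grew after defrag.
per_db = {name: space_saved_pct(before[name], after[name]) for name in before}

print(f"overall: {overall:.1f}% saved")
for name, pct in per_db.items():
    print(f"om.db-{name}: {pct:+.1f}%")
```

The overall figure rounds to the 46% quoted above. Per DB, one checkpoint grows (43 MB to 83 MB) while the other two shrink to 4.1 MB each, so the saving comes from the two smaller DBs rather than from all three shrinking uniformly.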
   
   There are more scenarios to try out. Even the above scenario can be 
improved by triggering DB compaction after EACH snapshot is taken. (DELETED_TABLE 
and DELETED_DIR_TABLE should also be copied over in the impl.)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

