smengcl opened a new pull request, #8954: URL: https://github.com/apache/ozone/pull/8954
WARNING: DO NOT MERGE. This PR is for visibility and comments only.

## What changes were proposed in this pull request?

Implement the Snapshot Defrag service and a manual trigger CLI.

This working POC contains an extremely crude implementation. Major refactoring and optimizations are expected. The point is to prove that Snapshot Defrag can save space.

This PR also includes dev commits whose commit messages begin with `[dev]`; these likely won't end up in the code base. Commits marked `[split]` can be put in separate PRs.

Design doc: https://github.com/apache/ozone/pull/8514

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-13003

## How was this patch tested?

- Use integration test `TestSnapshotDefragService2` for faster debugging iteration.
- Use Docker dev to manually test the defrag service (which generates keys an order of magnitude faster).

## Results

I've tested it with a "1 million keys overwrite" scenario locally in Docker dev:

1. write 1m keys, take snapshot 1
2. overwrite those 1m keys, take snapshot 2
3. overwrite those 1m keys again, take snapshot 3

After snapshot defrag processed all 3 snapshot checkpoint DBs, the space usage went from 169 MB to 92 MB (46% space saved). That is actual disk usage (hard links are counted only once).

Note that RocksDB WAL is disabled for more accurate disk usage during testing; the WALs can be huge when writing millions of keys.
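As a rough cross-check of how such figures can be measured, the following Python sketch sums the allocated bytes under a directory while counting each hard-linked file only once, which mirrors the default behavior of `du`, and then computes the percentage saved. The names `disk_usage_bytes` and `space_saved_percent` are hypothetical helpers, not part of Ozone or this PR. The sketch assumes a POSIX filesystem where `st_blocks` is reported in 512-byte units, and it ignores the space taken by directories themselves, so totals may differ slightly from `du`.

```python
import os
import stat


def disk_usage_bytes(root, apparent=False):
    """Sum the size of regular files under root, counting each inode once.

    With apparent=False (default) the allocated size is used (st_blocks * 512),
    which is close to what `du` reports. With apparent=True the logical file
    length (st_size) is used instead.
    """
    seen = set()  # (device, inode) pairs already counted
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, etc.
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue  # another hard link to a file already counted
            seen.add(key)
            total += st.st_size if apparent else st.st_blocks * 512
    return total


def space_saved_percent(before, after):
    """Percentage of space saved when usage drops from before to after."""
    return (before - after) / before * 100
```

With the totals reported in this PR, `round(space_saved_percent(169, 92))` evaluates to 46.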
Before Defrag, all 3 snapshot DBs:

```bash
▶ du -h -d1 ./checkpointState
43M ./checkpointState/om.db-6639d124-6615-4ced-9af6-3dabd680727b
63M ./checkpointState/om.db-d39279ce-cab6-44e0-839a-2baecb8c283a
62M ./checkpointState/om.db-77b75627-5534-4db4-88e5-1661aceae92f
169M ./checkpointState
```

After Defrag, all 3 snapshot DBs:

```bash
▶ du -h -d1 ./checkpointStateDefragged
83M ./checkpointStateDefragged/om.db-6639d124-6615-4ced-9af6-3dabd680727b
4.1M ./checkpointStateDefragged/om.db-d39279ce-cab6-44e0-839a-2baecb8c283a
4.1M ./checkpointStateDefragged/om.db-77b75627-5534-4db4-88e5-1661aceae92f
92M ./checkpointStateDefragged
```

More scenarios remain to be tried. Even the scenario above could be improved by triggering DB compaction after EACH snapshot is taken. (DELETED_TABLE / DELETED_DIR_TABLE should also be copied over in the implementation.)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
