[
https://issues.apache.org/jira/browse/HDDS-12518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wei-Chiu Chuang reassigned HDDS-12518:
--------------------------------------
Assignee: Wei-Chiu Chuang
> Compact RocksDB after consecutive deletions
> -------------------------------------------
>
> Key: HDDS-12518
> URL: https://issues.apache.org/jira/browse/HDDS-12518
> Project: Apache Ozone
> Issue Type: Improvement
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Priority: Major
> Attachments: Screenshot 2025-03-07 at 6.15.03 PM.png, Screenshot
> 2025-03-07 at 6.15.10 PM.png, Screenshot 2025-03-07 at 6.15.49 PM.png
>
>
> We can manually trigger compaction for RocksDB column families after some
> deletions to improve seek performance and thus improve overall OM performance.
> This is similar to RocksDB's built-in CompactOnDeletionCollector, but that
> requires a newer version of RocksDB, so we can't use it yet.
> I added a background service, CompactionService, that periodically (every 5
> minutes by default) checks whether a column family has accumulated 10,000
> deletes, and schedules a full compaction if it has.
> Using this heuristic, I was able to improve the deletion of a large directory
> from 20k directories per hour to 400k per hour on my machine, while the average
> RocksDB seek time stayed below 2 ms. (Without compactions, the average db seek
> time would grow towards 20 ms after several hours.)
> I also did a proof of concept that simply compacted over and over again, in
> which case throughput increased to 1 million deletions per hour. But that would
> be too intense if there is foreground traffic at the same time, so the less
> frequent approach was chosen.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]